Got a problem with your OCZ product....?
Have a look over here
Tony AKA BigToe
Tuning PCs for speed... Run what's fast, not what you think is fast
I think any test showing no gain from tri-channel is going to be wrong.
Intel isn't magic, but they are far from dumb!
Are you absolutely sure? From various sources I've heard (and seen) that the difference in performance between dual and triple channel is close to nothing. Actually, I'm not quite sure yet, because the extra channel should be adding much more bandwidth than the 500 MB/s difference I've seen (DDR3 @ 933 MHz, exact same system settings). Either Lavalys Everest is screwing up, which I'm not inclined to believe, or something in the BIOS/motherboard design of the Intel reference board is making triple channel run as dual channel, which would also surprise me.
What are the possible underlying causes that make people report a lack of performance gain going from dual channel to triple channel? Software/hardware?
Where courage, motivation and ignorance meet, a persistent idiot awakens.
I have always said that Core 2 Quad does not need much memory bandwidth, so in real applications where memory is not the bottleneck you will see very little benefit from extra memory bandwidth.
On the other hand, in applications like SETI, Rosetta and Folding@home you have to feed 8 threads, so the bandwidth will come in handy. If you do H.264 at profile level 4, you will be looking at 16 frames in both directions, so it will come in handy there too.
Just remember: to keep 8 threads happy you will need the bandwidth, and it is just a matter of time before software uses it 100%.
People used to tell me that nobody would use MMX because it was so hard to use... Today, you can't boot most operating systems without it.
And yes, I am very sure that 3 DIMMs go faster than 2 DIMMs (in a memory test). If they do not, the prototype is broken.
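To put rough numbers on the "feeding 8 threads" point, here is a minimal back-of-envelope sketch. The per-thread streaming rate is a made-up illustrative figure and the channel peaks assume DDR3-1066, so treat this as an order-of-magnitude argument, not a measurement.

```python
# Back-of-envelope sketch: how quickly 8 threads can outgrow dual channel.
# PER_THREAD_GBPS is a hypothetical figure chosen for illustration only.

PER_THREAD_GBPS = 2.5               # assumed streaming demand per thread, GB/s
THREADS = 8                         # Core i7: 4 cores x 2 with Hyper-Threading

CHANNEL_GBPS = 8.533                # one 64-bit DDR3-1066 channel: 8 B * 1066.67 MT/s
dual_peak = 2 * CHANNEL_GBPS
tri_peak = 3 * CHANNEL_GBPS

demand = PER_THREAD_GBPS * THREADS
print(f"aggregate demand of {THREADS} threads : {demand:.1f} GB/s")
print(f"dual-channel theoretical peak   : {dual_peak:.1f} GB/s")
print(f"tri-channel theoretical peak    : {tri_peak:.1f} GB/s")
```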
Last edited by Drwho?; 10-31-2008 at 07:36 PM.
DrWho, The last of the time lords, setting up the Clock.
One hundred years from now
It won't matter
What kind of car I drove
What kind of house I lived in
How much money I had in the bank
Nor what my clothes looked like....
But the world may be a little better
Because I was important
In the life of a child.
-- from "Within My Power" by Forest Witcraft
Who knows -- Sandra VII did not work right with AMD's split memory controller in Phenom, and memory scores were overall incorrect. It took Sandra VIII to actually report correct numbers for Phenom....
The client market has never had a 192-bit memory bus to the CPU, at least not one that was mainstream/popular. It could be that today's memory-bandwidth utilities only do single-cycle 128-bit reads and writes... in which case it would not matter whether the third stick is there; you would always see roughly the performance of two sticks -- conjecture on my part. However, if I am right, who is to say Windows or any other app makes good use of the 192-bit channel anyway... it may take some software revisions across the broader market to access the added BW of a 192-bit memory bus.
Or it could just be that 3 channels are not really needed at all, and the BW measurements are saturated by the speed of the CPU. We will need CPUs in hand to test this thoroughly, and I do not trust 90% of the HW review sites out there to get it right anyway.
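For what it's worth, here is a minimal single-threaded sketch (Python/NumPy) of the kind of test a simple utility runs: one big copy through memory. Real benchmarks like Everest use hand-tuned assembly, so this is only an analogy, but it illustrates the point that a test issuing one serial read/write stream can report much the same figure whether or not a third channel is populated.

```python
# Crude memory-bandwidth probe: allocate a large buffer and time bulk
# copies through it. A single serial copy stream may not stress the
# memory subsystem hard enough to show extra channels.
import time
import numpy as np

SIZE_MB = 512
src = np.ones(SIZE_MB * 1024 * 1024 // 8, dtype=np.float64)   # 512 MB buffer
dst = np.empty_like(src)

best = float("inf")
for _ in range(5):
    t0 = time.perf_counter()
    np.copyto(dst, src)            # one read stream + one write stream
    best = min(best, time.perf_counter() - t0)

moved_gb = 2 * SIZE_MB / 1024      # count both the read and the write traffic
print(f"~{moved_gb / best:.1f} GB/s from a single-threaded copy")
```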
Thanks for the reply.
If I understand correctly, we should all notice a difference between dual and triple channel, but the difference is very likely to be small in applications that are not multithreaded. The bandwidth added by the extra channel is there to fully feed the 8 threads, but it is 'overkill' in single- or dual-threaded applications.
Now, that only leaves the Everest bandwidth issue. As far as I know, Lavalys Everest is quite accurate when it comes to measuring memory bandwidth and latency, but in the tests I've seen the difference is still only 500 MB/s.
Maybe this is the problem:
Lavalys Everest 4.60 new features & improvements:
- Asus EPU and Gigabyte DES support
- Enhanced hardware monitoring capabilities
- Optimized benchmarks for Intel Atom and VIA Nano
- Preliminary support for Intel Core i7 and X58
- Support for the latest chipset and graphics technologies
Only two tests actually show the difference between dual and triple channel, which probably is the correct performance scaling.
I don't see how it's bad, because tri-channel is slightly faster than dual-channel i7 in every graph...
Bring... bring the amber lamps.
That's why. Single channel to dual channel made a huge difference when it was introduced on the NF2 platform (1500 MB/s -> 1900 MB/s, a ~27% increase), and it still does in memory-hungry applications such as Lavalys Everest. The fact that the difference between the two configurations is so small makes many of us wonder why it's that low (contrary to those who are only out to bash the technology).
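To put the two jumps side by side, here is a quick sketch; the i7 dual-channel baseline below is an assumed placeholder (only the 500 MB/s delta comes from this thread), so the exact percentage is illustrative.

```python
# Scaling comparison: the NF2 single->dual jump quoted above vs. the
# dual->tri delta people are reporting on i7. The i7 dual-channel
# baseline is ASSUMED for illustration, not a measured figure.

def gain(before, after):
    return (after / before - 1) * 100

nf2_single, nf2_dual = 1500, 1900            # MB/s, figures from the post above
print(f"NF2 single->dual : +{gain(nf2_single, nf2_dual):.1f}%")

i7_dual_baseline = 12000                     # MB/s, assumed placeholder
reported_delta = 500                         # MB/s, delta reported in this thread
print(f"i7 dual->tri     : +{gain(i7_dual_baseline, i7_dual_baseline + reported_delta):.1f}%")
print("theoretical bus-width gain dual->tri: +50.0%")
```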
That's one possibility, although I don't see why anyone should have to tweak the triple-channel functionality, as it's one of the key features of the new platform. Surely you'd expect the Intel X58 motherboard to show the correct performance difference between dual and triple channel, no?
There are some newer BIOSes available on the Intel restricted-area website, but no changelogs, so I have no idea whether the BIOS updates would help. I hope so, though.
Another weird/funny effect I've seen people reporting is that raising the BCLK frequency has little to no effect on performance, which means that running 133x25 or 166x20 will not make any difference. On top of that, people report that underclocking the QPI frequency (lowering the QPI multiplier) has no effect on the BCLK wall, so decreasing the QPI multiplier will not allow people to go higher in BCLK frequency. The wall itself would apparently be a CPU issue, not a motherboard problem... kind of like the FSB wall we know now.
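A quick sanity check on the 133x25 vs 166x20 observation: the core clocks come out virtually identical, so any difference would have to come from the BCLK-derived clocks (memory, uncore, QPI), assuming the ratios are left alone. The 8x memory ratio below is just an example, not what any particular board sets.

```python
# Hypothetical comparison of two BCLK/multiplier combinations that target
# the same core clock; the memory ratio is an illustrative example.

def derived_clocks(bclk_mhz, cpu_mult, mem_ratio=8):
    core = bclk_mhz * cpu_mult          # CPU core clock, MHz
    ddr3 = bclk_mhz * mem_ratio         # DDR3 data rate, MT/s, at the same ratio
    return core, ddr3

for bclk, mult in [(133, 25), (166, 20)]:
    core, ddr3 = derived_clocks(bclk, mult)
    print(f"BCLK {bclk} x {mult}: core ~{core} MHz, DDR3 ~{ddr3} MT/s (8x ratio)")
```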
I'm re-reading every PDF on the Intel page at the moment, maybe I missed something.
//EDIT: No, I didn't.
Last edited by massman; 11-01-2008 at 05:17 AM.
But if the general public has to invest serious time to try to gain a bit more performance, would it really count for anything?
Most of these systems are going to non-tweakers and overclockers alike.
Triple-channel RAM configuration will also add a new hurdle in overclocking. We've already seen in the recent Greek magazine review posted here that triple channel was giving the guy headaches, so he went down to two sticks to get the OCs/figures he was after.
...having said that, I know both of us look forward to the challenge... we love the smell of new silicon and all the head scratching until things start to kick some serious arse.
Triple channel should eventually be able to give us what we need for SuperPi. Heheh, I know some would call that lame, but I am pragmatic... if it ain't working for SuperPi, I call it broken..... I can't wait to squeeze CAS, tRCD & tRFC for the first time, feel my first heart race and see the magic unravel.
eeeeek
Single channel: 64-bit.
Dual channel: 128-bit. 100 % increase from previous.
Triple channel: 192-bit. 50 % increase from previous.
So going from dual to triple channel yields only half the relative increase that going from single to dual channel did, and even that only shows up when memory bandwidth demand actually exceeds what dual channel can supply. No wonder it yields no big differences.
Still, triple channel with tight timings should yield better results than chasing maximum bandwidth alone.
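As a quick check of what those widths translate to, here is a sketch of the theoretical peaks; it assumes the "DDR3 @ 933 MHz" figure from earlier in the thread means a 933 MHz I/O clock (1866 MT/s). Halve the results if 933 was actually the data rate.

```python
# Theoretical peak bandwidth by channel count. Each DDR3 channel is
# 64 bits (8 bytes) wide; the data rate below is an assumption based
# on "DDR3 @ 933 MHz" read as a 933 MHz I/O clock.

BYTES_PER_TRANSFER = 8           # 64-bit channel
data_rate_mts = 933 * 2          # MT/s

for channels in (1, 2, 3):
    peak_mb_s = channels * BYTES_PER_TRANSFER * data_rate_mts
    print(f"{channels} channel(s): {peak_mb_s / 1000:.1f} GB/s theoretical peak")
```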
Last edited by Calmatory; 11-01-2008 at 05:30 AM.
You got it, Dino; if it doesn't play well with SPi and 2K1, then...
Well, NF2 was back in 2002, six years ago. Memory bandwidth demand has increased quite a lot since those days, so the NF2 comparison is pretty much worthless IMO. Besides, I only saw sub-15% improvements, though the RAM was cheap Kingston and the FSB was below 166 MHz the whole time.
What it boils down to is that most of today's client applications do not produce a demand that exceeds even modest memory bandwidths, helped by a strong cache structure. Increasing BW, either by clocking up the bus or raising memory clocks, gives minor improvements in most cases -- some exceptions are WinRAR's internal benchmark, which does little more than read/write random data to memory while running its compression engine... it shows significant sensitivity to BW. I have also seen notable sensitivity with MainConcept's H.264 encoder.
So, as Dr. Who? is saying, 12+ GB/s of memory bandwidth is not really going to impact what you observe in real life -- not because the BW is not real, but because the applications used on the desktop rarely demand throughput that exceeds those capabilities.
You will see BW play an important role in 2S servers, where the applications are more throughput-oriented, as opposed to the client side, which is really just task-based.
There is nothing weird or funny about this; it's pretty much expected. QPI in a single-socket desktop is now simply servicing the chipset, which acts as nothing more than a multiprotocol network switch shuttling between different I/O standards: USB, SATA, etc. No part of this is critical to overall computational performance -- the QPI link services higher BW than any peripheral will yield, hence 25.6 GB/sec or ~12 GB/sec (lower QPI BW) will have little to no overall effect, considering SATA can do 300 MB/sec... QPI will neither speed up nor slow down a HD transfer, for example... this is just intuitive.
All this means is that Intel has done a good job making QPI very robust, such that it can scale to much higher clocks if/when needed. Where and what creates the BCLK wall is irrelevant; there will always be a wall at some component, and the weakest link in the chain will be the limiter -- this is not surprising... perhaps it is the CPU NB frequency, who knows... or it could be QPI, and the limit simply sits before the multiplier... who knows. What we do know is that there is a whole new set of fun knobs to turn.
USB, SATA, etc. are connected to the ICH10. The ICH10 is connected to the X58 via a DMI link. So whatever you plug into any USB or SATA port will be limited by the DMI link, which is much slower than QPI. Raising the QPI speed won't make any difference for peripherals. You don't need more for that anyway.
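To put rough numbers on that chain, here is a small sketch using the usual nominal peaks quoted for the X58/ICH10 era; the figures are theoretical maximums (aggregate of both directions where applicable), not measurements.

```python
# Rough comparison of the links in the peripheral path described above.
# Figures are nominal/theoretical peaks, GB/s.

LINKS_GB_S = {
    "SATA II device":       0.3,    # 3 Gb/s link -> ~300 MB/s
    "DMI (X58 <-> ICH10)":  2.0,    # ~1 GB/s each direction
    "QPI at 4.8 GT/s":     19.2,    # lower QPI speed grade
    "QPI at 6.4 GT/s":     25.6,    # full-speed QPI
}

bottleneck = min(LINKS_GB_S, key=LINKS_GB_S.get)
for name, bw in LINKS_GB_S.items():
    print(f"{name:22s}: {bw:5.1f} GB/s")
print(f"Peripheral traffic is capped well before QPI, at: {bottleneck}")
```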
Friends shouldn't let friends use Windows 7 until Microsoft fixes Windows Explorer (link)