Intel's First Nehalem Cpu-Z Pic.

**JumpingJack** · 02-03-2008, 01:02 AM

Originally Posted by Shintai

Kinda yes, but it would be a long time till we could know.

Triple-channel memory... 2GB?

Yeah, I agree... for such a radically different Intel CPU, CPUID is giving way too much information, and it does not make sense based on the info we know about Nehalem. Ironic (and incorrect) coincidence that the system clock yields a quad-pumped CSI speed, just like the old FSB.

**jas420221** · 02-03-2008, 01:45 AM

Regarding its authenticity I was thinking...

Since this is the first chip from Intel with an Integrated Memory Controller, the cache sizes they currently have aren't needed (just look at AMD's current lineup and its lack of 6+ MB of cache).

**kl0012** · 02-03-2008, 02:22 AM

Originally Posted by ChaosMinionX

http://www.beyond3d.com/content/news/540

There was another article as well done by Xbit, and I believe one on Anand... I will dig them up.

Socket 1160 & Socket 715 (if it will exist) are for single-socket solutions. The old multi-CPU scheme (where all CPUs are connected through the chipset) doesn't make sense. But the system in the screenshot is dual-socket. So it definitely uses QPI as the system bus.

**savantu** · 02-03-2008, 02:35 AM

Originally Posted by jas420221

Regarding its authenticity I was thinking...

Since this is the first chip from Intel with an Integrated Memory Controller, the cache sizes they currently have aren't needed (just look at AMD's current lineup and its lack of 6+ MB of cache).

That's not true.

An IMC doesn't eliminate the need for a good cache subsystem. The winning combination is to have both: good caches and an IMC. Good caches are needed for scalability. That's why server chips feature large amounts of cache.

AMD did not use large caches because of die size and engineering constraints. Intel, OTOH, had excellent caches, and their density allowed them to be implemented in large variants.

If the cache size is true, it means core complexity increased to such an extent in Nehalem that it was impossible to field a large L2 and still keep a reasonable die size (< 300mm^2).

**jas420221** · 02-03-2008, 02:53 AM

Originally Posted by savantu

That's not true.

An IMC doesn't eliminate the need for a good cache subsystem. The winning combination is to have both: good caches and an IMC. Good caches are needed for scalability. That's why server chips feature large amounts of cache.

AMD did not use large caches because of die size and engineering constraints. Intel, OTOH, had excellent caches, and their density allowed them to be implemented in large variants.

If the cache size is true, it means core complexity increased to such an extent in Nehalem that it was impossible to field a large L2 and still keep a reasonable die size (< 300mm^2).

Did I say eliminate? Nope. It still needs it, just NOT AS MUCH OF IT b/c of the significant decreases in latency with the IMC. Other factors, I'm sure, played into it though.

**kiwi** · 02-03-2008, 03:21 AM

Originally Posted by Spawne32

What happens when we are done with the nanometers?

Quantum computing

**Mad1723** · 02-03-2008, 06:22 AM

or optics one, with their laser project

**Monkeywoman** · 02-03-2008, 06:41 AM

Here's a roadmap for Intel CPUs.

Source: chiphell

**Shintai** · 02-03-2008, 08:44 AM

First slide is about dynamic speeds for mobile 45nm Penryn CPUs.

Second slide is..well. Old and from another site and more or less guesswork.

Third is from a Japanese site that basically just tries to paste forum/site news into a roadmap.

Nothing new actually...

**Polizei** · 02-03-2008, 08:49 AM

With that die layout, I have a feeling cooling is going to be tough on these. Forget the 10C differences with the bad IHS of the Q6x00's; I think we'll be looking at more like 15C with Nehalem. Why not just arrange it in a square instead of a rectangle in a line like that?

**chispy** · 02-03-2008, 09:02 AM

Hmmmm, I shall wait for proper benchmarks and reviews to know exactly where Nehalem stands.

**largon** · 02-03-2008, 10:06 AM

"Core VID"

->
OBVIOUS FAKE

Originally Posted by jas420221

Since this is the first chip from Intel with an Integrated Memory Controller, the cache sizes they currently have aren't needed (just look at AMD's current lineup and its lack of 6+ MB of cache).

In fact, AMD's upcoming 45nm K10 derivative (dubbed Shanghai) will bump the green team's L2 to 6MB...

**~~terrace215~~** · 02-03-2008, 10:13 AM

Originally Posted by savantu

Well, both have an IMC; one will use an NB (QPI enabled) and one won't. See my previous post.

I'd like to comment a little on the low L2. If the picture is true, then Holy Jesus! That's utterly different from what I expected (64KB L1s and an 8MB shared L2).

I'd wager that the small 256KB L2 isn't a typical L2 as we know it (instruction + data). I'd say that if the photo is true, the L2 is an L2I, or instruction cache (it holds only instructions, not data; that approach was used on Montecito Itanium, where the L2 is split into a 1MB L2I and a 256KB L2D).
The L3 should be extremely fast, like on Itanium (14-20 cycles, that's L2 territory), and data should go directly to the L1. The L1s need to be extremely fast too.

What are the implications if all of the above is true? Low frequency scalability. As for performance, it's strange. I doubt this cache structure is superior to that of Core; in fact, I'd say it is inferior.

Um, Intel is *on the record* with the claim that Nehalem even has better single-thread IPC than Penryn. Then there's the claim that Nehalem has 1.6 times the SpecInt_rate performance of Clovertown, and 2.6 times the SpecFP_rate performance. (That's in dual-socket QC systems @ 3GHz.)

So, I'd say you need to adjust your "implications" accordingly.

Whatever Intel has done with Nehalem's cache, Nehalem is blazingly fast.

**Qkjhfhaiguihfma** · 02-03-2008, 10:25 AM

60% increase on integer and 160% increase in floating would be evil.
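For anyone checking the arithmetic: a speedup ratio of r works out to an (r - 1) x 100% increase. A tiny sketch of that conversion (the function name is just for illustration; the two ratios are the ones quoted above, not measurements):

```python
def percent_increase(ratio: float) -> float:
    """Convert a speedup ratio (e.g. 1.6x) into a percentage increase."""
    return (ratio - 1.0) * 100.0


# Ratios quoted in the thread for Nehalem vs. Clovertown
# (dual-socket quad-core systems @ 3GHz). Claims, not measured data.
claimed = {"SpecInt_rate": 1.6, "SpecFP_rate": 2.6}

for name, ratio in claimed.items():
    print(f"{name}: {ratio}x = {percent_increase(ratio):.0f}% increase")
```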

**BrowncoatGR** · 02-03-2008, 10:25 AM

Originally Posted by largon

->
OBVIOUS FAKE

In fact, AMD's upcoming 45nm K10 derivative (dubbed Shanghai) will bump the green team's L2 to 6MB...

6MB L3 actually. I believe L2 will stay at 512KB per core. 256KB is a bit low for an L2 cache. If it's true, savantu might very well be right on this one (instruction cache only).

Originally Posted by terrace215

Um, Intel is *on the record* with the claim that Nehalem even has better single-thread IPC than Penryn. Then there's the claim that Nehalem has 1.6 times the SpecInt_rate performance of Clovertown, and 2.6 times the SpecFP_rate performance. (That's in dual-socket QC systems @ 3GHz.)

So, I'd say you need to adjust your "implications" accordingly.

Whatever Intel has done with Nehalem's cache, Nehalem is blazingly fast.

I thought SpecFP/Int_rate are bandwidth benchmarks

Needless to say just having an IMC and QPI would boost "rate" benchmarks significantly and does not imply better single thread performance.

**Swatrecon_** · 02-03-2008, 10:27 AM

Originally Posted by Spawne32

What happens when we are done with the nanometers?

Someone said picometers, but I rather fancy attometers.

**xlink** · 02-03-2008, 10:35 AM

want lower latency cache... please say 256kb means ultra-low latency...

**cpuz** · 02-03-2008, 11:02 AM

Hey guys,

This screenshot is not fake. Now, this does not mean that cpuz is reporting everything correctly, of course (Core VID is mentioned when the core voltage could not be obtained from the sensor chip; cpuz then displays the CPU VID).

Concerning caches on Nehalem: the L3 is now shared between 4 physical cores, meaning that it offers 4 access ports. The more access ports a cache has, the slower it is. Consequently, it is not surprising that Intel added four small, fast and dedicated (and unified) L2s between the L1s and the L3. These caches keep using an inclusive relationship, so of course this means that the useful size of these L2s is only 128KB. However, those caches are not designed for high hit rates but for speed.
CPU-Z is wrong on the L1 data size, however: it should be 4x32KB and not 4x16KB. And I don't know about the FSB.
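To put the "speed rather than hit rate" point in numbers, here is a rough sketch using the textbook average-memory-access-time model. Every latency and hit rate below is a made-up placeholder chosen for illustration, not a Nehalem figure, and the function name is hypothetical.

```python
def average_access_time(levels, memory_latency):
    """Average access time in cycles for a simple cache hierarchy.

    levels: list of (hit_latency_cycles, local_hit_rate), L1 first.
            A local hit rate is the fraction of accesses that reach
            this level and hit in it.
    memory_latency: cycles paid by accesses that miss every level.
    """
    total = 0.0
    reach = 1.0  # fraction of all accesses that get as far as this level
    for latency, hit_rate in levels:
        total += reach * latency      # everyone reaching this level pays its latency
        reach *= (1.0 - hit_rate)     # only the misses continue downward
    return total + reach * memory_latency


# ASSUMED numbers, for illustration only.
MEMORY = 200
l1 = (4, 0.90)
small_fast_l2 = (10, 0.50)   # modest hit rate, but low latency
l3_behind_l2 = (40, 0.80)    # sees only what the L2 missed
l3_alone = (40, 0.90)        # same L3, same overall traffic to memory

with_l2 = average_access_time([l1, small_fast_l2, l3_behind_l2], MEMORY)
without_l2 = average_access_time([l1, l3_alone], MEMORY)
print(f"with small L2: {with_l2:.2f} cycles, without: {without_l2:.2f} cycles")
```

Under these assumed numbers, an L2 that catches only half of the L1 misses still lowers the average, because those hits skip the slower multi-ported L3. With different latencies the result could easily go the other way.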

**savantu** · 02-03-2008, 11:10 AM

Originally Posted by cpuz

Hey guys,

This screenshot is not fake. Now, this does not mean that cpuz is reporting everything correctly, of course (Core VID is mentioned when the core voltage could not be obtained from the sensor chip; cpuz then displays the CPU VID).

Concerning caches on Nehalem: the L3 is now shared between 4 physical cores, meaning that it offers 4 access ports. The more access ports a cache has, the slower it is. Consequently, it is not surprising that Intel added four small, fast and dedicated (and unified) L2s between the L1s and the L3. These caches keep using an inclusive relationship, so of course this means that the useful size of these L2s is only 128KB. However, those caches are not designed for high hit rates but for speed.
CPU-Z is wrong on the L1 data size, however: it should be 4x32KB and not 4x16KB. And I don't know about the FSB.

A shared L3 between 4 cores means a highly complex arbitration mechanism (which equals increased latency). Look at AMD's K10 L3. Nothing to brag about either. It is slow, very slow.

With such small L2s, how can they feed a highly complex core with 2 threads? Multiple threads mean cache thrashing, a small size amplifies that, and you have a slow L3 behind it.
If that's a good cache subsystem, I'm stupefied. It goes against everything Intel has done lately (large, shared, extremely fast L2s which thrashed IMC-equipped CPUs).

What about the FSB? DP Nehalem uses QPI, which works at speeds of up to 4.8GT/s. How does CPU-Z read that?
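For scale, the 4.8GT/s figure can be turned into a rough bandwidth number and set beside a quad-pumped FSB. This is only a back-of-the-envelope sketch: the 2 bytes of payload per QPI transfer per direction and the 333MHz example FSB are assumptions added here, not values read from the screenshot.

```python
def bandwidth_gb_per_s(giga_transfers_per_s: float, bytes_per_transfer: int) -> float:
    """Peak bandwidth in GB/s = transfer rate (GT/s) x bytes moved per transfer."""
    return giga_transfers_per_s * bytes_per_transfer


# QPI: 4.8 GT/s is the figure from the post above.
QPI_GTS = 4.8
QPI_PAYLOAD_BYTES = 2        # ASSUMPTION: 16 data bits per transfer, per direction

# Quad-pumped FSB, for comparison.
FSB_BASE_CLOCK_GHZ = 0.333   # ASSUMPTION: example 333MHz base clock
FSB_PUMP_FACTOR = 4          # "quad pumped": 4 transfers per base clock
FSB_WIDTH_BYTES = 8          # 64-bit data bus

qpi_one_way = bandwidth_gb_per_s(QPI_GTS, QPI_PAYLOAD_BYTES)
fsb_effective_gts = FSB_BASE_CLOCK_GHZ * FSB_PUMP_FACTOR
fsb_total = bandwidth_gb_per_s(fsb_effective_gts, FSB_WIDTH_BYTES)

print(f"QPI @ {QPI_GTS} GT/s: {qpi_one_way:.1f} GB/s per direction")
print(f"FSB @ {fsb_effective_gts * 1000:.0f} MT/s: {fsb_total:.1f} GB/s shared")
```

The point of the comparison is only that a base clock multiplied by four gives an FSB-style "rated" speed, which is not how a QPI link speed is defined.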

**cpuz** · 02-03-2008, 11:25 AM

Originally Posted by savantu

A shared L3 between 4 cores means a highly complex arbitration mechanism (which equals increased latency). Look at AMD's K10 L3. Nothing to brag about either. It is slow, very slow.

It's tough to compare with K10, since the constraints are different. K10 uses an exclusive cache hierarchy, and snooping consumes a lot of the bandwidth (inclusive caches act as snoop filters). And AFAIK the Phenom's L3 does not offer 4 access ports.
To summarize, Phenom's cache hierarchy is that of the A64, plus a shared L3. Nehalem extends the C2D cache hierarchy to 4-core sharing, plus four small L2s in between. These are two different approaches, even if they seem very close.

Originally Posted by savantu

What about the FSB? DP Nehalem uses QPI, which works at speeds of up to 4.8GT/s. How does CPU-Z read that?

cpuz uses the exact same method on Nehalem and C2D

**BrowncoatGR** · 02-03-2008, 11:43 AM

Originally Posted by savantu

A shared L3 between 4 cores means a highly complex arbitration mechanism (which equals increased latency). Look at AMD's K10 L3. Nothing to brag about either. It is slow, very slow.

With such small L2s, how can they feed a highly complex core with 2 threads? Multiple threads mean cache thrashing, a small size amplifies that, and you have a slow L3 behind it.
If that's a good cache subsystem, I'm stupefied. It goes against everything Intel has done lately (large, shared, extremely fast L2s which thrashed IMC-equipped CPUs).

What about the FSB? DP Nehalem uses QPI, which works at speeds of up to 4.8GT/s. How does CPU-Z read that?

Yeah, it's weird. It's less cache than AMD will have with Shanghai, although latency might be better. I didn't expect Nehalem to have a unified L2 though. It looks like this will be a great server chip, but I'm not sure it will have a significant impact on the desktop.

**Jamesrt2004** · 02-03-2008, 12:41 PM

Does this have no northbridge then, only a southbridge?

Would that change OC-ing at all???

**Yoxxy** · 02-03-2008, 01:12 PM

Originally Posted by cpuz

Hey guys,

This screenshot is not fake. Now, this does not mean that cpuz is reporting everything correctly, of course (Core VID is mentioned when the core voltage could not be obtained from the sensor chip; cpuz then displays the CPU VID).

Concerning caches on Nehalem: the L3 is now shared between 4 physical cores, meaning that it offers 4 access ports. The more access ports a cache has, the slower it is. Consequently, it is not surprising that Intel added four small, fast and dedicated (and unified) L2s between the L1s and the L3. These caches keep using an inclusive relationship, so of course this means that the useful size of these L2s is only 128KB. However, those caches are not designed for high hit rates but for speed.
CPU-Z is wrong on the L1 data size, however: it should be 4x32KB and not 4x16KB. And I don't know about the FSB.

I think Shintai owes 100E to a lot of people on the board...

Thanks for the insightful post and explaining how the new chip will work!

**Shintai** · 02-03-2008, 01:22 PM

Originally Posted by Jamesrt2004

Does this have no northbridge then, only a southbridge?

Would that change OC-ing at all???

It will change OC a bit. More components to be affected.

The server and extreme version will have a northbridge for PCIe only and a southbridge for the usual.

The performance/mainstream will have a southbridge only. With PCIe on the CPU.

The value end will have both IGP and PCIe on the CPU. But still the same southbridge as performance/mainstream.

Server/extreme will have a different southbridge than performance/mainstream/value.

**Shintai** · 02-03-2008, 01:23 PM

Originally Posted by Yoxxy

I think Shintai owes 100E to a lot of people on the board...

Thanks for the insightful post and explaining how the new chip will work!

Not yet, as he even says himself it reads some parts wrong.

I still don't believe in an L3. It simply makes no sense when looking at the size and past history. Itanium only got an L3 due to the massive sizes of up to 24MB and soon 30MB. And I don't think anyone here on the board has access to a Nehalem system, nor will have it for the next 3-6 months.

L3 is a step backwards for mainstream, not upwards.
