Possible Theory for Penryn's 3 MB/6 MB Cache?

**Fuji** · 01-16-2007, 04:54 PM

I don't claim to be an expert at this although I kinda thought of this.

It's kind of a known fact that Penryn will "only" have a 3-6 MB L2 cache.

My theory is that a) there's not that much need for more cache (if you look at the 2 MB vs 4 MB benchmarks, there's not always that big a difference) and b) since they've been in the CPU wars with AMD, they need profit margins to go up a lot.

I'm thinking more of b than a.

From what I read, it's been said that with every die shrink, you basically cut the size in half (from increased density and the actual shrink).

Core 2 (4 MB L2 cache) has a die size of 143 mm^2.

They say that the cache takes up about half the size of the CPU or around 72 mm^2

So if Penryn's die (minus the cache) stays the same (highly unlikely), and they only increase the cache by 50%, you have 72 mm^2 + 36 mm^2 = 108 mm^2, which is about 25% smaller than Conroe. That's a pretty big drop and would increase profit margins.
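How big the drop is depends heavily on which parts of the die you assume actually shrink. Here is a rough sketch that works through a few scenarios. It assumes the figures from this post (143 mm^2 die, cache at about half of it) plus an ideal 50% area reduction per shrink; the function name and the scale factors are made up for illustration, not Intel numbers.

```python
# Back-of-the-envelope die-area estimate. Every input is an assumption
# taken from the post, not a measured Intel figure.
CONROE_DIE_MM2 = 143.0   # Core 2 with 4 MB L2, per the post
CACHE_FRACTION = 0.5     # "cache takes up about half" of the die


def penryn_area(logic_scale: float, cache_scale: float) -> float:
    """Estimated die area in mm^2 after scaling the logic and cache halves."""
    cache_mm2 = CONROE_DIE_MM2 * CACHE_FRACTION
    logic_mm2 = CONROE_DIE_MM2 - cache_mm2
    return logic_mm2 * logic_scale + cache_mm2 * cache_scale


def percent_smaller(area_mm2: float) -> float:
    """How much smaller than Conroe the estimate is, in percent."""
    return 100.0 * (1.0 - area_mm2 / CONROE_DIE_MM2)


# A 6 MB cache at ideal 45 nm density is 1.5 * 0.5 = 0.75 of the 4 MB area.
scenarios = {
    "one half unchanged, other half halved (72 + 36)": (1.0, 0.5),
    "logic unchanged, 6 MB cache shrunk": (1.0, 0.75),
    "logic and 6 MB cache both shrunk": (0.5, 0.75),
}

for label, (logic_scale, cache_scale) in scenarios.items():
    area = penryn_area(logic_scale, cache_scale)
    print(f"{label}: {area:.1f} mm^2, {percent_smaller(area):.1f}% smaller")
```

Only the last scenario, where the core logic shrinks along with the cache, gets anywhere near a 40% reduction.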

Or...

Tom's Hardware posted an article, and for those of you who, like me, are lazy, I am going to quote it:

"During the Q4 earnings call, Otellini also mentioned that Intel has working samples of the 45 nm Penryn processor, scheduled for delivery in the second half of this year. Penryn will be a die-shrink of the current Core 2 Duo processors and is expected to deliver lower consumption and higher performance - especially in floating point scenarios. Otellini's remarks that Penryn is booting on Windows XP, Windows Vista, Mac OS as well as Linux suggest that processor's validation process is already in full swing."

(http://www.tgdaily.com/2007/01/16/intel_quad_core/)

So what if theories a and b are wrong and they are tweaking the core enough for performance that they need the actual die space?

And being that this is XS, when can we expect to see Penryn benchmarks?

**Phosphate** · 01-16-2007, 04:58 PM

I know 2MB vs 4MB L2 cache is not a big deal at less than 3 GHz speeds (ie, stock) with current Conroes, but how does L2 cache figure in when speeds are scaled up?

I'm saying this because, according to Anandtech, an OC'd E4300 can beat an X6800, but it seems to need disproportionately more speed to do this. For this reason I think L2 cache may matter more as frequency increases.

Can anyone clear this up?

**nn_step** · 01-16-2007, 05:02 PM

As CPU frequencies increase, the number of clock cycles it takes to access the RAM increases.
In theory, if you work out of the cache, performance scales linearly with clock speed. However, if you don't work out of the cache, you have to deal with larger and larger penalties for cache misses.
For extremely well written code, it doesn't matter. But for normal code, it is a real factor.
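A toy model makes the scaling argument concrete. This is only a sketch: the 40 ns DRAM latency, the core CPI of 1.0, and the miss rates are illustrative assumptions, not measurements of any real CPU.

```python
# Toy model: time per instruction = core time (scales with the clock)
# plus DRAM stall time (fixed in nanoseconds regardless of the clock).
# All default values are illustrative assumptions.

def ns_per_instruction(freq_ghz: float, misses_per_instr: float,
                       core_cpi: float = 1.0,
                       dram_latency_ns: float = 40.0) -> float:
    core_ns = core_cpi / freq_ghz                    # shrinks as the clock rises
    stall_ns = misses_per_instr * dram_latency_ns    # does not shrink
    return core_ns + stall_ns


def speedup(f_low_ghz: float, f_high_ghz: float, misses_per_instr: float) -> float:
    """Speedup from raising the clock at a given cache-miss rate."""
    return (ns_per_instruction(f_low_ghz, misses_per_instr)
            / ns_per_instruction(f_high_ghz, misses_per_instr))


for misses in (0.0, 0.001, 0.01):
    print(f"{misses:.3f} misses/instr: 2 -> 4 GHz gives "
          f"{speedup(2.0, 4.0, misses):.2f}x")
```

With zero misses, doubling the clock doubles performance. As the miss rate goes up, the fixed stall time takes a bigger share of each instruction and the speedup falls away from 2x.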

**Fuji** · 01-16-2007, 05:03 PM

Originally Posted by nn_step

As CPU frequencies increase, the number of clock cycles it takes to access the RAM increases.
In theory, if you work out of the cache, performance scales linearly with clock speed. However, if you don't work out of the cache, you have to deal with larger and larger penalties for cache misses.
For extremely well written code, it doesn't matter. But for normal code, it is a real factor.

That's true, but by the time Penryn (I'm using Penryn as kind of a universal term for Intel's 45 nm processors) is out, we will have DDR3, which should decrease the amount of time the CPU spends waiting around for the RAM to hurry up.

**nn_step** · 01-16-2007, 05:07 PM

Originally Posted by Fuji

That's true, but by the time Penryn (I'm using Penryn as kind of a universal term for Intel's 45 nm processors) is out, we will have DDR3, which should decrease the amount of time the CPU spends waiting around for the RAM to hurry up.

We are talking about a 15% decrease in latency and a lot more bandwidth. That isn't going to help much. A 40 ns access doesn't cost much on a 2 GHz processor, but it costs twice as many clock cycles on a 4 GHz processor.
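The conversion is just nanoseconds times cycles per nanosecond. A quick sketch, using the 40 ns and 15% figures from this post (the function name is made up for illustration):

```python
# Convert a fixed DRAM latency into CPU clock cycles.
# 1 GHz = 1 cycle per nanosecond, so cycles = ns * GHz.

def stall_cycles(latency_ns: float, freq_ghz: float) -> float:
    """Clock cycles that pass while waiting latency_ns for memory."""
    return latency_ns * freq_ghz


BASE_LATENCY_NS = 40.0   # figure quoted in the post
LATENCY_CUT = 0.15       # the "15% decrease" attributed to DDR3 in the post

for freq_ghz in (2.0, 4.0):
    before = stall_cycles(BASE_LATENCY_NS, freq_ghz)
    after = stall_cycles(BASE_LATENCY_NS * (1.0 - LATENCY_CUT), freq_ghz)
    print(f"{freq_ghz:.0f} GHz: {before:.0f} cycles at 40 ns, "
          f"{after:.0f} cycles with 15% lower latency")
```

Even with the 15% cut, the 4 GHz part still waits far more cycles per access than the 2 GHz part does today.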

**Phosphate** · 01-16-2007, 05:12 PM

DDR3 is coming out the end of 2007?

**Fuji** · 01-16-2007, 05:12 PM

Originally Posted by nn_step

We are talking about a 15% decrease in latency and a lot more bandwidth. That isn't going to help much. A 40 ns access doesn't cost much on a 2 GHz processor, but it costs twice as many clock cycles on a 4 GHz processor.

I see. The 4 GHz CPU has to wait the same 40 ns as the 2 GHz one, and that hampers performance more because you have a more powerful processor idling for the same amount of time.

So, as we increase the clock speed, the need for faster memory becomes more of a must. Rambus anyone?

Originally Posted by Phosphate

DDR3 is coming out the end of 2007?

Along with PCI-E v.2

**nn_step** · 01-16-2007, 05:18 PM

Originally Posted by Fuji

I see. The 4 GHz CPU has to wait the same 40 ns as the 2 GHz one, and that hampers performance more because you have a more powerful processor idling for the same amount of time.

So, as we increase the clock speed, the need for faster memory becomes more of a must. Rambus anyone?

Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
Right now there are only a handful of things that can deliver better latency than DDR3:
1) SRAM (usually called cache)
2) MRAM (which is cheaper than cache but nowhere near mainstream)
3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)

Now the really interesting thing about Z-RAM is that you need an SOI process to use it; however, Intel is never going to use SOI (at least according to their roadmaps). And since it is 1/6 the size of SRAM, it is going to be far more effective for L2/L3 cache.

**Fuji** · 01-16-2007, 05:23 PM

Although their move to the IMC in about a year will bring performance increases, as the CPU will spend less time waiting on the RAM, plus the bandwidth increase.

You said that the move to DDR3 brings a decrease in latency. From the way it looks, the switch to DDR3 will actually increase latency. Any explanation?

**nn_step** · 01-16-2007, 05:28 PM

Originally Posted by Fuji

Although their move to the IMC in about a year will bring performance increases, as the CPU will spend less time waiting on the RAM, plus the bandwidth increase.

You said that the move to DDR3 brings a decrease in latency. From the way it looks, the switch to DDR3 will actually increase latency. Any explanation?

Well, the IMC will decrease Intel's access latency greatly (cut it almost in half).
However, for some reason I don't seem to think 12 ns is a greater latency than 15 ns.

one little Note but DDR3 is mostly an AMD design for memory

**Fuji** · 01-16-2007, 08:09 PM

wait hold up. If DDR2 has the same latency as DDR why does performance go down with AMD with AM2? Cause on paper, they have the same latency, but in reality it's different.

And 12 ns is the actual time your CPU waits for RAM, right? Not the total latency it takes to make its way to the CPU, yes? Cause I heard that Opteron had a total latency of like 80 ns.

**nn_step** · 01-16-2007, 08:20 PM

Originally Posted by Fuji

wait hold up. If DDR2 has the same latency as DDR why does performance go down with AMD with AM2? Cause on paper, they have the same latency, but in reality it's different.

And 12 ns is the actual time your CPU waits for RAM, right? Not the total latency it takes to make its way to the CPU, yes? Cause I heard that Opteron had a total latency of like 80 ns.

There are many ways of measuring latency.
With DDR2-800 @ 4-4-4-10 versus DDR-400 @ 2-2-2-5 on an AMD system, performance goes up with the move from DDR to DDR2.
The reason you heard about AMD losing performance on DDR2 is that they were using memory with worse latency. The latency of DDR2-800 @ 4-4-4-10 is about equal to that of DDR-400 @ 2-2-2-5.
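The "about equal" part is easy to check, because CAS latency is counted in memory clock cycles and the memory clock runs at half the DDR data rate. A small sketch (the function name is made up, and the DDR2-800 CL5 entry is a hypothetical example of slower memory, not a module named in this thread):

```python
# CAS latency in nanoseconds from the DDR data rate and the CL timing.
# The memory clock is half the data rate (two transfers per clock), so
# one clock period in ns is 2000 / data_rate_mts.

def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """First timing number (CL) converted from clock cycles to ns."""
    clock_period_ns = 2000.0 / data_rate_mts
    return cas_cycles * clock_period_ns


modules = [
    ("DDR-400 @ 2-2-2-5", 400, 2),
    ("DDR2-800 @ 4-4-4-10", 800, 4),
    ("DDR2-800 @ CL5 (hypothetical slower kit)", 800, 5),
]

for name, data_rate, cl in modules:
    print(f"{name}: {cas_latency_ns(data_rate, cl):.1f} ns CAS latency")
```

This only covers the CAS number. Total access latency as seen by the CPU also includes the other timings and the memory controller, which is why measured figures are much larger.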

**babalouj** · 01-16-2007, 09:06 PM

Originally Posted by nn_step

Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
Right now there are only a handful of things that can deliver better latency than DDR3:
1) SRAM (usually called cache)
2) MRAM (which is cheaper than cache but nowhere near mainstream)
3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)

MRAM has a long way to go before its bandwidth is there (and before it becomes mainstream). Some of my professors at Georgia Tech were doing research on it and I worked with them some. About 1.5 years ago the absolute highest bandwidth they were getting was equivalent to DDR-333 with bad latencies, and the results were not repeatable across batches. Changing the spin of the electron takes some time, but the benefit of retaining data when powered off is huge. Sorry to bust in, just thought I would throw that out there.

**Jacky** · 01-18-2007, 04:07 AM

i thought penryn is going to have 2*6mb of cache?
otherwise it would be a bad move from intel, because less cache prevents the cpu from scaling well, doesn't it?
and if they want good yields they can still make a 2*3mb version for the low end.
source

**Thorburn** · 01-18-2007, 05:36 AM

Penryn is believed to be the dual core replacement for Conroe, so that would be 6MB shared L2 cache as opposed to 4MB at present.
The interesting bit will be whether Kentsfield's replacement (Yorkfield?) will be a single die or a dual-die implementation as at present, and, if it's a single die, whether it's 2 x 6MB with each 6MB shared between two cores, or 12MB shared between all 4.

**~~Turtle 1~~** · 01-30-2007, 01:30 PM

Originally Posted by nn_step

Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
Right now there are only a handful of things that can deliver better latency than DDR3:
1) SRAM (usually called cache)
2) MRAM (which is cheaper than cache but nowhere near mainstream)
3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)

Now the really interesting thing about Z-RAM is that you need an SOI process to use it; however, Intel is never going to use SOI (at least according to their roadmaps). And since it is 1/6 the size of SRAM, it is going to be far more effective for L2/L3 cache.

I find this really interesting, as IBM's high-k and metal gates use strained silicon.

**nn_step** · 01-30-2007, 01:34 PM

Originally Posted by Turtle 1

I find this really interesting, as IBM's high-k and metal gates use strained silicon.

yep and so does AMD but not Intel. So interesting isn't it.

**Frank M** · 01-30-2007, 03:50 PM

Originally Posted by Phosphate

I know 2MB vs 4MB L2 cache is not a big deal at less than 3 GHz speeds (ie, stock) with current Conroes

Well...
When the first C2Ds came out, there was a comparison on Anandtech with an E6300 and an X6800 at 7×266, that is, same clock, just different cache sizes.
The performance difference ranged from 0 to 10.0%, with an average of 3.5%.
Their conclusion was that the Intel prefetch was optimised for 4 MB.
http://www.anandtech.com/cpuchipsets...spx?i=2795&p=4

**~~Turtle 1~~** · 01-31-2007, 05:47 AM

Originally Posted by nn_step

yep and so does AMD but not Intel. So interesting isn't it.

I didn't read that Intel doesn't use strained silicon with the 45 nm process. You got a link for that? Intel has been using SS since 90 nm.

Also, the reason I mentioned strained silicon was that you said Z-RAM only works on SOI.

We're all still waiting for a link that K8L uses Z-RAM.

This is IBM/AMD dec06 info release

http://www.dailytech.com/article.aspx?newsid=5329

After Intel announces high-k, IBM/AMD announce high-k. As I said before, until AMD shows something they can't be trusted: is it ultra low-k or is it high-k? IBM has always said high-k and metal gates at 32 nm. If they do use high-k, I really don't see how it can be ready in '08, as they and AMD have been working on ultra low-k.
