Thread: Possible Theory for Penryn's 3 MB/6 MB Cache?

  1. #1
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Virginia
    Posts
    179

    Possible Theory for Penryn's 3 MB/6 MB Cache?

    I don't claim to be an expert on this, but here's something I thought of.

    It's kind of a known fact that Penryn will "only" have a 3-6 MB L2 cache.

    My theory is that a) there's not that much need for more cache (if you look at the 2 MB vs 4 MB benchmarks, there's not always that big a difference) and b) since they've been in the CPU wars with AMD, they need profit margins to go up a lot.

    I'm thinking more of b than a.

    From what I read, it's been said that with every die shrink, you basically cut the size in half (from increased density and the actual shrink).

    Core 2 (4 MB L2 cache) has a die size of 143 mm^2.

    They say that the cache takes up about half the die area of the CPU, or around 72 mm^2.

    So if Penryn's die (minus the cache) stays the same (highly unlikely), and they only increase the cache by 50%, you have 72 mm^2 + 36 mm^2 = 108 mm^2, which is roughly 25% smaller than Conroe. That's a pretty big drop and would increase profit margins.
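
    A rough sketch of this estimate in Python. The 143 mm^2 die, the half-cache split, and the area halving per shrink are the figures from this post; the helper names and the three scenarios are only illustrative assumptions, not Intel data. The result swings a lot depending on what is assumed to shrink:

```python
# Back-of-the-envelope die-area estimate after one process shrink.
# Inputs are the rough figures from the post or labeled assumptions,
# not measured Penryn data.

CONROE_DIE_MM2 = 143.0   # Core 2 with 4 MB L2, as quoted in the post
CACHE_FRACTION = 0.5     # "cache takes up about half" of the die
SHRINK_FACTOR = 0.5      # "every die shrink cuts the size in half"
BASE_CACHE_MB = 4.0      # L2 size that the 143 mm^2 figure refers to


def shrunk_die_area(cache_mb, logic_shrinks):
    """Estimated die area in mm^2 (hypothetical helper)."""
    old_cache = CONROE_DIE_MM2 * CACHE_FRACTION
    old_logic = CONROE_DIE_MM2 - old_cache
    # Cache area scales with capacity and always gets the shrink.
    new_cache = old_cache * (cache_mb / BASE_CACHE_MB) * SHRINK_FACTOR
    # Logic either shrinks too, or is assumed to stay the same size.
    new_logic = old_logic * (SHRINK_FACTOR if logic_shrinks else 1.0)
    return new_logic + new_cache


def percent_smaller(area_mm2):
    """How much smaller than the original die, in percent."""
    return 100.0 * (1.0 - area_mm2 / CONROE_DIE_MM2)


scenarios = [
    ("4 MB, logic unshrunk", 4.0, False),
    ("6 MB, logic unshrunk", 6.0, False),
    ("6 MB, everything shrinks", 6.0, True),
]
for label, cache_mb, logic_shrinks in scenarios:
    area = shrunk_die_area(cache_mb, logic_shrinks)
    print(f"{label}: {area:.1f} mm^2, {percent_smaller(area):.1f}% smaller")
```

    The 72 + 36 sum above appears to match the first scenario (logic left at its 65 nm size, 4 MB of cache shrunk), while a full shrink with 6 MB lands a little under 40% smaller.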

    Or...

    Tom's Hardware posted an article, and for those of you who, like me, are lazy, I'm going to quote it:

    "During the Q4 earnings call, Otellini also mentioned that Intel has working samples of the 45 nm Penryn processor, scheduled for delivery in the second half of this year. Penryn will be a die-shrink of the current Core 2 Duo processors and is expected to deliver lower consumption and higher performance - especially in floating point scenarios. Otellini's remarks that Penryn is booting on Windows XP, Windows Vista, Mac OS as well as Linux suggest that processor's validation process is already in full swing."


    (http://www.tgdaily.com/2007/01/16/intel_quad_core/)

    So what if theories a and b are wrong and they are tweaking the core enough for performance that they need the actual die space?

    And being that this is XS, when can we expect to see Penryn benchmarks?

  2. #2
    Xtreme Enthusiast
    Join Date
    Jul 2006
    Posts
    650
    I know 2MB vs 4MB L2 cache is not a big deal at less than 3 GHz speeds (ie, stock) with current Conroes, but how does L2 cache figure in when speeds are scaled up?

    I'm saying this because, according to Anandtech, an OC'd E4300 can beat an X6800, but it seems to need disproportionately more clock speed to do this. For this reason I think L2 cache may matter more as frequency increases.

    Can anyone clear this up?

  3. #3
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    As CPU frequencies increase, the number of clock cycles it takes to access the RAM increases.
    In theory, if you work out of the cache, performance scales linearly with clock speed. However, if you don't work out of the cache, you have to deal with larger and larger penalties (in cycles) for cache misses.
    For extremely well-written code, it doesn't matter. But for normal code, it is a real factor.
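
    A toy model of that scaling, as a sketch only: it assumes run time is core work (which scales with clock) plus a fixed wall-clock cost per cache miss, and it ignores out-of-order execution and prefetching, which hide some of the latency. The 40 ns miss cost, the workload sizes, and the function names are assumptions for illustration.

```python
def run_time_ns(core_cycles, cache_misses, clock_ghz, miss_latency_ns=40.0):
    """Total run time: core work scales with clock, miss cost does not."""
    core_time_ns = core_cycles / clock_ghz        # 1 GHz = 1 cycle per ns
    stall_time_ns = cache_misses * miss_latency_ns
    return core_time_ns + stall_time_ns


def speedup(core_cycles, cache_misses, slow_ghz, fast_ghz, miss_latency_ns=40.0):
    """Speedup from raising the clock, with the same memory latency."""
    slow = run_time_ns(core_cycles, cache_misses, slow_ghz, miss_latency_ns)
    fast = run_time_ns(core_cycles, cache_misses, fast_ghz, miss_latency_ns)
    return slow / fast


# Hypothetical workload: one million cycles of core work.
for misses in (0, 1_000, 5_000, 20_000):
    gain = speedup(1_000_000, misses, slow_ghz=2.0, fast_ghz=4.0)
    print(f"{misses:>6} misses: 2 GHz -> 4 GHz gives {gain:.2f}x")
```

    With no misses the doubling of clock speed gives the full 2x; the more misses there are, the further the gain falls below that.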
    Fast computers breed slow, lazy programmers
    The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.
    http://www.lighterra.com/papers/modernmicroprocessors/
    Modern Ram, makes an old overclocker miss BH-5 and the fun it was

  4. #4
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Virginia
    Posts
    179
    Quote Originally Posted by nn_step
    As CPU frequencies increase, the number of clock cycles it takes to access the RAM increases.
    In theory, if you work out of the cache, performance scales linearly with clock speed. However, if you don't work out of the cache, you have to deal with larger and larger penalties (in cycles) for cache misses.
    For extremely well-written code, it doesn't matter. But for normal code, it is a real factor.
    That's true, but by the time Penryn (I'm using Penryn as kind of a universal term for Intel's 45nm processors) is out, we will have DDR3, which should decrease the amount of time the CPU spends waiting around for the RAM to hurry up.

  5. #5
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Fuji
    That's true, but by the time Penryn (I'm using Penryn as kind of a universal term for Intel's 45nm processors) is out, we will have DDR3, which should decrease the amount of time the CPU spends waiting around for the RAM to hurry up.
    We are talking about a 15% decrease in latency and a lot more bandwidth. That isn't going to help much. A 40 ns wait isn't much for a 2 GHz processor (80 cycles), but it costs double that for a 4 GHz processor (160 cycles).
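
    To put numbers on that, here is a minimal sketch that converts a fixed memory latency into stalled clock cycles. The 40 ns figure and the 15% improvement are the ones mentioned in this post; the function name is just for illustration.

```python
def stall_cycles(latency_ns, clock_ghz):
    """Clock cycles that pass while waiting latency_ns (1 GHz = 1 cycle/ns)."""
    return latency_ns * clock_ghz


BASE_LATENCY_NS = 40.0                        # figure used in the post
IMPROVED_LATENCY_NS = BASE_LATENCY_NS * 0.85  # the "15% decrease"

for ghz in (2.0, 3.0, 4.0):
    before = stall_cycles(BASE_LATENCY_NS, ghz)
    after = stall_cycles(IMPROVED_LATENCY_NS, ghz)
    print(f"{ghz:.0f} GHz: {before:.0f} cycles at 40 ns, {after:.0f} cycles at 34 ns")
```

    Even with the 15% improvement, the 4 GHz part in this sketch still waits more cycles per miss than the 2 GHz part does at the original 40 ns.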

  6. #6
    Xtreme Enthusiast
    Join Date
    Jul 2006
    Posts
    650
    DDR3 is coming out at the end of 2007?

  7. #7
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Virginia
    Posts
    179
    Quote Originally Posted by nn_step
    We are talking about a 15% decrease in latency and a lot more bandwidth. That isn't going to help much. A 40 ns wait isn't much for a 2 GHz processor (80 cycles), but it costs double that for a 4 GHz processor (160 cycles).
    I see. The 4 GHz CPU has to wait the same 40 ns as the 2 GHz one, and that hampers performance more because a more powerful processor sits idle for the same amount of time.

    So, as we increase the clock speed, the need for faster memory becomes more of a must. Rambus anyone?

    Quote Originally Posted by Phosphate
    DDR3 is coming out at the end of 2007?
    Along with PCI-E v.2

  8. #8
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Fuji
    I see. The 4 GHz CPU has to wait the same 40 ns as the 2 GHz one, and that hampers performance more because a more powerful processor sits idle for the same amount of time.

    So, as we increase the clock speed, the need for faster memory becomes more of a must. Rambus anyone?
    Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
    Right now there are only a handful of things that can deliver better latency than DDR3:
    1) SRAM (usually used as cache)
    2) MRAM (which is cheaper than cache but nowhere near mainstream)
    3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)

    Now the really interesting thing about Z-RAM is that it requires an SOI process; however, Intel is never going to use SOI (at least according to their roadmaps). And since it is 1/6 the size of SRAM, it is going to be far more effective for L2/L3 cache.
    Last edited by nn_step; 01-16-2007 at 05:54 PM.

  9. #9
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Virginia
    Posts
    179
    Although their move to an IMC in about a year should bring performance increases, since the CPU will spend less time waiting on the RAM, plus there's the bandwidth increase.

    You said that the move to DDR3 brings a decrease in latency. From the way it looks, the switch to DDR3 will actually increase latency. Any explanation?

  10. #10
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Fuji
    Although their move to an IMC in about a year should bring performance increases, since the CPU will spend less time waiting on the RAM, plus there's the bandwidth increase.

    You said that the move to DDR3 brings a decrease in latency. From the way it looks, the switch to DDR3 will actually increase latency. Any explanation?
    Well, the IMC will decrease Intel's access latency greatly (cut it almost in half).
    However, for some reason I don't think 12 ns is a greater latency than 15 ns.

    One little note: DDR3 is mostly an AMD design for memory.

  11. #11
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Virginia
    Posts
    179
    Wait, hold up. If DDR2 has the same latency as DDR, why does performance go down on AMD with AM2? On paper they have the same latency, but in reality it's different.

    And 12 ns is the actual time your CPU waits for the RAM, right? Not the total latency it takes to make its way to the CPU, yes? Because I heard that Opteron had a total latency of like 80 ns.

  12. #12
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Fuji
    Wait, hold up. If DDR2 has the same latency as DDR, why does performance go down on AMD with AM2? On paper they have the same latency, but in reality it's different.

    And 12 ns is the actual time your CPU waits for the RAM, right? Not the total latency it takes to make its way to the CPU, yes? Because I heard that Opteron had a total latency of like 80 ns.
    There are many ways of measuring latency.
    Comparing DDR2-800 @ 4-4-4-10 with DDR-400 @ 2-2-2-5 on an AMD system, performance goes up with the move from DDR to DDR2.
    The reason you heard about AMD losing performance on DDR2 is that those tests used memory with worse latency. The latency of DDR2-800 @ 4-4-4-10 is about equal to that of DDR-400 @ 2-2-2-5.
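
    One common way to compare them is the CAS latency in nanoseconds. A minimal sketch, assuming only the first timing number (CL) matters and that the I/O clock runs at half the DDR data rate; it says nothing about total latency as seen by the CPU, and the function name is hypothetical:

```python
def cas_latency_ns(cas_cycles, data_rate_mts):
    """CAS latency in ns for DDR-style memory.

    data_rate_mts is the marketing number (e.g. 400 for DDR-400).
    DDR transfers twice per clock, so the clock is half that in MHz.
    """
    clock_mhz = data_rate_mts / 2.0
    cycle_time_ns = 1000.0 / clock_mhz
    return cas_cycles * cycle_time_ns


modules = [
    ("DDR-400  CL2", 2, 400),
    ("DDR2-800 CL4", 4, 800),
    ("DDR2-800 CL5", 5, 800),   # looser timings for comparison
]
for name, cl, rate in modules:
    print(f"{name}: {cas_latency_ns(cl, rate):.1f} ns")
```

    By this measure DDR-400 at CL2 and DDR2-800 at CL4 come out the same, while DDR2-800 at looser CL5 timings is slower to the first word.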

  13. #13
    Xtreme Member
    Join Date
    Nov 2004
    Location
    Boston, MA
    Posts
    488
    Quote Originally Posted by nn_step
    Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
    Right now there are only a handful of things that can deliver better latency than DDR3:
    1) SRAM (usually used as cache)
    2) MRAM (which is cheaper than cache but nowhere near mainstream)
    3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)
    MRAM has a long way to go before its bandwidth is there (and before it becomes mainstream). Some of my professors at Georgia Tech were doing research on it, and I worked with them some. About 1.5 years ago, the absolute highest bandwidth they were getting was equivalent to DDR-333 with bad latencies, and the results were not repeatable across batches. Changing the spin of the electron takes some time, but the benefit of retaining data when powered off is huge. Sorry to bust in, just thought I would throw that out there.
    Last edited by babalouj; 01-16-2007 at 09:13 PM.
    **Georgia Tech Grad, I am an Electrical Engineer with a specialization in RF IC design and Analog circuits.**

    Intel I7 3770K Delidded
    Gigabyte Z77X-UD5H
    2x4gb Gskill 7-8-7-16
    EVGA GTX680 Signature OC
    Crucial M4 256gb
    Seasonic X-750
    Watercooling Loop: Raystorm Acetal, EK GTX580 Full Cover, MCR420, MCR320, MCP35X2 & 7 x AP-15 Gentle Typhoons

    Heatware: gte460z

  14. #14
    Xtreme Enthusiast
    Join Date
    May 2006
    Location
    Austria
    Posts
    532
    I thought Penryn was going to have 2*6 MB of cache?
    Otherwise it would be a bad move from Intel, because less cache prevents the CPU from scaling well, doesn't it?
    And if they want good yields, they can still make a 2*3 MB version for the low end.
    source
    Quote Originally Posted by freecableguy
    the idiots out number us 10,000:1

  15. #15
    Xtreme Member
    Join Date
    Feb 2004
    Posts
    279
    Penryn is believed to be the dual-core replacement for Conroe, so that would be 6 MB of shared L2 cache as opposed to 4 MB at present.
    The interesting bit will be whether Kentsfield's replacement (Yorkfield?) will be a single-die or a dual-die implementation as at present, and, if it's a single die, whether it's 2 x 6 MB with each 6 MB shared between two cores, or 12 MB shared between all four.

  16. #16
    Banned
    Join Date
    Oct 2005
    Posts
    1,533
    Quote Originally Posted by nn_step
    Actually, Rambus wouldn't be a good idea, since their current memory solution is XDR2, which has huge bandwidth but doesn't actually improve latency.
    Right now there are only a handful of things that can deliver better latency than DDR3:
    1) SRAM (usually used as cache)
    2) MRAM (which is cheaper than cache but nowhere near mainstream)
    3) Z-RAM (which AMD is going to use on K8L and is faster than SRAM in capacities over 4 MB)

    Now the really interesting thing about Z-RAM is that it requires an SOI process; however, Intel is never going to use SOI (at least according to their roadmaps). And since it is 1/6 the size of SRAM, it is going to be far more effective for L2/L3 cache.
    I find this really interesting, as IBM's high-k and metal gate process uses strained silicon.

  17. #17
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Turtle 1
    I find this really interesting, as IBM's high-k and metal gate process uses strained silicon.
    Yep, and so does AMD, but not Intel. Interesting, isn't it?

  18. #18
    Xtreme Addict
    Join Date
    Aug 2006
    Location
    eu/hungary/budapest.tmp
    Posts
    1,591
    Quote Originally Posted by Phosphate
    I know 2MB vs 4MB L2 cache is not a big deal at less than 3 GHz speeds (ie, stock) with current Conroes
    Well...
    When the first C2Ds came out, Anandtech compared an
    E6300 and an X6800 at 7×266 - that is, same clock, just different cache sizes.
    The performance difference ranged from 0 to 10.0%, with an average of 3.5%.
    Their conclusion was that Intel's prefetching was optimised for 4 MB.
    http://www.anandtech.com/cpuchipsets...spx?i=2795&p=4
    Usual suspects: i5-750 & H212+ | Biostar T5XE CFX-SLI | 4GB RAndoM | 4850 + AC S1 + 120@5V + modded stock for VRAM/VRM | Seasonic S12-600 | 7200.12 | P180 | U2311H & S2253BW | MX518
    mITX media & to-be-server machine: A330ION | Seasonic SFX | WD600BEVS boot & WD15EARS data
    Laptops: Lifebook T4215 tablet, Vaio TX3XP
    Bike: ZX6R

  19. #19
    Banned
    Join Date
    Oct 2005
    Posts
    1,533
    Quote Originally Posted by nn_step
    Yep, and so does AMD, but not Intel. Interesting, isn't it?
    I haven't read that Intel doesn't use strained silicon on the 45nm process. Do you have a link for that? Intel has been using SS since 90nm.


    Also, the reason I mentioned strained silicon was that you said Z-RAM only works on SOI.

    We're all still waiting for a link showing that K8L uses Z-RAM.

    This is the IBM/AMD Dec. '06 info release:

    http://www.dailytech.com/article.aspx?newsid=5329

    After Intel announces high-k, IBM/AMD announce high-k. As I said before, until AMD shows something, they can't be trusted: is it ultra-low-k or is it high-k? IBM has always said high-k and metal gates at 32nm. If they do use high-k, I really don't see how it can be ready in '08, as they and AMD have been working on ultra-low-k.
    Last edited by Turtle 1; 01-31-2007 at 06:44 AM.
