
Thread: Intel Details Nehalem uArch Improvements - 256KB L2, 8MB L3 Confirmed

  1. #201
    Xtreme Mentor
    Join Date
    Aug 2006
    Location
    HD0
    Posts
    2,646
    Quote Originally Posted by gojirasan View Post
    I don't think any of us here really know what is going on in the minds of Intel corporate folks. However, I would guess that Intel would actually prefer to stop or at least hinder overclocking if they could, at least in the lower bins.
    if they did that they'd kill this market segment and I'd be off to AMD land more likely than not.

    besides fried CPU + fried board with intel parts = more profit.

    if anything they should encourage us to run our G0s and even B3s at 4Ghz day to day for the sake of their profits.

  2. #202
    Coat It with GOOOO
    Join Date
    Aug 2006
    Location
    Portland, OR
    Posts
    1,608
    Intel is by no means killing overclocking. The current platforms just might limit its usefulness to only the high end desktop platforms based on Bloomfield. Only time will tell how Lynnfield and Havendale will operate, and there's a chance there is some saleability left in the platform.
    Main-- i7-980x @ 4.5GHZ | Asus P6X58D-E | HD5850 @ 950core 1250mem | 2x160GB intel x25-m G2's |
    Wife-- i7-860 @ 3.5GHz | Gigabyte P55M-UD4 | HD5770 | 80GB Intel x25-m |
    HTPC1-- Q9450 | Asus P5E-VM | HD3450 | 1TB storage
    HTPC2-- QX9750 | Asus P5E-VM | 1TB storage |
    Car-- T7400 | Kontron mini-ITX board | 80GB Intel x25-m | Azunetech X-meridian for sound |


  3. #203
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Quote Originally Posted by Blauhung View Post
    Intel is by no means killing overclocking. The current platforms just might limit its usefulness to only the high end desktop platforms based on Bloomfield. Only time will tell how Lynnfield and Havendale will operate, and there's a chance there is some saleability left in the platform.
    Hey,hey,hey..Tell them to leave a little for us dual socket guys.
    We like to push the systems also.
    Crunch with us, the XS WCG team
    The XS WCG team needs your support.
    A good project with good goals.
    Come join us,get that warm fuzzy feeling that you've done something good for mankind.

    Quote Originally Posted by Frisch View Post
    If you have lost faith in humanity, then hold a newborn in your hands.

  4. #204
    Coat It with GOOOO
    Join Date
    Aug 2006
    Location
    Portland, OR
    Posts
    1,608
    Quote Originally Posted by Movieman View Post
    Hey,hey,hey..Tell them to leave a little for us dual socket guys.
    We like to push the systems also.
    since the dual socket systems use the exact same silicon that the Smackover platform uses (Nehalem CPU, Tylersburg bridge, each with 1 QPI disabled in UP desktop), I see no possible reason that motherboards won't come out with all the same tweaking options. I don't know if there's a Skull Trail type Intel designed board in the works, but all the groundwork is there to make one.


  5. #205
    Xtreme Enthusiast
    Join Date
    Jan 2005
    Location
    Frederick, MD
    Posts
    513
    Quote Originally Posted by xlink View Post
    if they did that they'd kill this market segment and I'd be off to AMD land more likely than not.

    besides fried CPU + fried board with intel parts = more profit.
    Yea, since AMD overclocking is so hot right now

    BTW, fried cpu = rma != profit
    Core i5 750 3.8ghz, TRUE 120 w/Panaflo M1A 7v
    ASRock P55 Deluxe
    XFX 5870
    2x2GB GSkill Ripjaw DDR3-1600
    Samsung 2233RZ - Pioneer PDP-5020FD - Hyundai L90D+
    Raptor WD1500ADFD - WD Caviar Green 1.5TB
    X-FI XtremeMusic w/ LN4962
    Seasonic S12-500
    Antec P182

  6. #206
    Xtreme Member
    Join Date
    Oct 2007
    Posts
    407
    Quote Originally Posted by Donnie27
    Intel did complain about Shady VAR's selling overclocked systems.
    Did I ever say they didn't? They complained, but no one in the overclocking community believed them. Did you? I am sure they were crying a river that they made overclocking so much more difficult. I admit that it's possible that since that time they have seen that the ability to overclock does not really eat into their profits. In fact maybe they will ship all future CPUs with unlocked multipliers instead of just the extreme editions. That would surely build some good will in this community. Do you think they will?

    It would be VERY SILLY for Intel to sponsor Fugger's Demo and then do as you suggest
    How do you figure?

    This small market can't influence Intel or AMD's bottom line=P
    On this we agree. At least not significantly. But maybe they are not as sure of that as we are. If not then how do you explain that both companies use multiplier locking?

    It would be a waste of time for Intel or Anyone else to worry about legit overclockers as compared to some Jerk selling a 2.4GHz as a 3GHz. There were plenty of Bogus Companies selling Counterfeit everything from fake MS mice, re-badged RAM, overclocked processor, Windows all the way back to 3.11 and even DOS LOL!
    I agree that it would be a waste of time unless it is very easy and cheap to do. If it is expensive then it is clearly not worth it. If we are lucky it will take some extra and very costly modifications to prevent overclocking Nehalem. As far as there having been 'plenty' of counterfeit CPUs, IIRC wasn't most of the remarking done in Europe? And I thought it was pretty limited even there. Of course pre-overclocked systems are a different story. I have no idea how prevalent that was.


    Contradicting your own statements uh?
    Yup. I'm human. I realized I was wrong. But now I'm right .

    But you're dead wrong about Good-Will. Intel spent too much time and money gaining that BACK from AMD. Even as A64 was barely better and X2 was CLEARLY better, Intel kept Good-will right up until Prescott. Many folks loved their Northwood C's.
    IIRC, only the true fanbois liked Northwood. I absolutely refused to buy any Pentium 4 product. In fact I am typing this on a Pentium 3. They made a lot of bad decisions in those days. Rambus and Netburst. My god. And now I have bought a share of Intel stock, not to make money but just because they have shown themselves to be so seriously badass. For once an American company I can be proud of. They have obviously learned the error of their ways. Now if Nvidia could only do the same.

    I don't think 'good will' means a whole lot to most of us. And loyalty is seriously overrated. Enthusiasts are about the least loyal customers they could have. We'll jump ship over a few extra FPS in Crysis or 30 seconds less render time in 3DStudioMax or a price $10 lower. The bleeding edge is the bleeding edge regardless of whose logo is on the box. I don't think Intel is unaware of that either.

    Then you're unaware of how much higher the higher Multiplier Processors can go They hit the wall much later than the Cheaper models.
    Indeed I am. Since I scored my E8400 I haven't been paying much attention to the overclocking records. Does the E8500 really clock that much higher? That's surprising since the stock clock is not much higher. I guess I'll have to go take a look at the numbers.

    The problem with what you're saying here is that Nehalem *should start out faster clock for clock. Meaning it doesn't have to overclocked as hard.
    Maybe it won't have to be overclocked 'as hard', but if it can't be overclocked at all we may find lots of enthusiasts sticking with Penryn until the next process shrink. IIRC, my lil E8400 with a stock speed of 3GHz has made it up to 4.7 Ghz on air and 5+ on phase. That is a lot more than 30% faster. So for an overclocker that may become relevant in a Penryn vs. Nehalem comparison.
    Last but not least, as was proven at IDF, Intel and most of the folks there are VERY AWARE of overclocking and this site.
    So what if they are aware? That doesn't mean they are going to unlock all their multipliers, provide a warranty for overclocked chips, and welcome the overclocking community with open arms. Talk is cheap. Let's see some action if they support us so much.

  7. #207
    Xtremely Hot Sauce
    Join Date
    Sep 2007
    Location
    New York
    Posts
    3,586
    Quote Originally Posted by Blauhung View Post
    since the dual socket systems use the exact same silicon that the Smackover platform uses (Nehalem CPU, Tylersburg bridge, each with 1 QPI disabled in UP desktop), I see no possible reason that motherboards won't come out with all the same tweaking options. I don't know if there's a Skull Trail type Intel designed board in the works, but all the groundwork is there to make one.
    MM has it right--DP is where many of us lusting after the highest performance are at. I'd personally like to see a lot more options for overclocking on DP systems. They're the same silicon and they'll likely clock better due to higher binning. I'd like to see all the same options on desktop as on workstation/servers.

    BTW, Intel won't unlock everything and replace chips we kill. Replace those few CPUs that are really dead, not those overclocked to death.

    My toys:
    Asus Sabertooth X58 | Core i7-950 (D0) | CM Hyper 212+ | G.Skill Sniper LV 12GB DDR3-1600 CL9 | GeForce GTX 670-2048MB | OCZ Agility 4 512GB, WD Raptor 150GB x 3 (RAID0), WD Black 1TB x 2 (RAID0) | XFX 650W CAH9 | Lian-Li PC-9F | Win 7 Pro x86-64
    Gigabyte EX58-UD3R | Core i7-920 (D0) | Stock HSF | G.Skill Sniper LV 4GB DDR3-1600 CL9 | Radeon HD 2600 Pro 512MB | WD Caviar 80GB IDE, 4TB x 2 (RAID5) | Corsair TX750 | XClio 188AF | Win 7 Pro x86-64
    Dell Dimension 8400 | Pentium 4 530 HT (E0) | Stock HSF | 1.5GB DDR2-400 CL3 | GeForce 8800 GT 256MB | WD Caviar 160GB SATA | Stock PSU | (Broken) Stock Case | Win Vista HP x86
    Little Dot DAC_I | Little Dot MK IV | Beyerdynamic DT-880 Premium (600 Ω) | TEAC AG-H300 MkIII | Polk Audio Monitor 5 Series 2's

  8. #208
    I am Xtreme
    Join Date
    Jul 2004
    Location
    Little Rock
    Posts
    7,204
    Quote Originally Posted by gojirasan View Post
    Did I ever say they didn't? They complained, but no one in the overclocking community believed them. Did you?
    I did, and no, I wasn't the only one. Intel didn't shut down or punish folks for shipping enthusiast motherboards that allowed overclocking. In fact, one of these folks came up with the current scheme of storing the processor's name and original speed in a ROM. That's why you now see something like E6600 at 3200MHz in the current BIOS, so unscrupulous dealers can't sell it as a 3.2GHz part for more money. Problem solved.

    Quote Originally Posted by gojirasan View Post
    I am sure they were crying a river that they made overclocking so much more difficult. I admit that it's possible that since that time they have seen that the ability to overclock does not really eat into their profits. In fact maybe they will ship all future CPUs with unlocked multipliers instead of just the extreme editions. That would surely build some good will in this community. Do you think they will?
    Sorry, that isn't even close to how any of this works. These folks are in business to make money, not so you can skate around their speed binning efforts.
    Intel always leaves some headroom, that's about all any overclocker should depend on=P

    It would be pretty unprofitable for Intel, AMD, IBM or anyone else to sell you an unlocked CPU that would EAT their own profits. AMD only did this when most folks knew their processors were still slower after being overclocked. Yet they're still speed binned and still priced higher for faster models.

    Just because you don't believe Intel Created Goodwill even on this forum, doesn't mean everyone else feels the same way you do.

    http://www.xtremesystems.org/forums/...ght=Fugger+IDF

    Goodwill
    How do you figure?
    It would be silly for Intel to sponsor Fugger, give props to this site, have a round table of geeks, implement some of their ideas and then turn off the tap. Sorry, that makes no sense whatsoever to me.

    On this we agree. At least not significantly. But maybe they are not as sure of that as we are. If not then how do you explain that both companies use multiplier locking?
    Yep, I did several times already. I started building computers in 1994. I remember folks getting sold 300MHz Celerons that had been overclocked and passed off as 450MHz parts. Most folks who dealt with computers back in those days will tell you the same. Overclockers were the only ones loving those overclocking-friendly Celerons.

    I agree that it would be a waste of time unless it is very easy and cheap to do. If it is expensive then it is clearly not worth it. If we are lucky it will take some extra and very costly modifications to prevent overclocking Nehalem. As far as there having been 'plenty' of counterfeit CPUs, IIRC wasn't most of the remarking done in Europe? And I thought it was pretty limited even there. Of course pre-overclocked systems are a different story. I have no idea how prevalent that was.
    Remarking is done all over the world. One place even tried to sell broken CPUs AMD had thrown away. Oh, we're not innocent either. I remember the SNDS BD where Intel warned folks not to exceed 1.7 v-core. Folks using 1.9v were wondering why their procs were dying. This only affected something like 40% of all processors, so the other 60% said it must have been something unrelated, sheesh!

    I do remember those days well. Last year a friend of mine sent my old Bootlegged MS Mouse to Microsoft LOL! I have Rebadged GSkill RAM LOL!

    Yup. I'm human. I realized I was wrong. But now I'm right .
    I've been wrong as well, and will be wrong again. All I really hope to do is learn something when I'm wrong.

    IIRC, only the true fanbois liked Northwood. I absolutely refused to buy any Pentium 4 product. In fact I am typing this on a Pentium 3. They made a lot of bad decisions in those days. Rambus and Netburst. My god. And now I have bought a share of Intel stock, not to make money but just because they have shown themselves to be so seriously badass. For once an American company I can be proud of. They have obviously learned the error of their ways. Now if Nvidia could only do the same.
    Nope not at all. Northwood hung well with its competition, still had better motherboards and overclocked easy as hell. Northwood didn't need RAMBUS, Intel had switched to DDR-400 by then. The cool thing was that Intel announced DDR-400, NW 800MHz FSB and 865/875 long before it shipped.

    I don't think 'good will' means a whole lot to most of us. And loyalty is seriously overrated. Enthusiasts are about the least loyal customers they could have. We'll jump ship over a few extra FPS in Crysis or 30 seconds less render time in 3DStudioMax or a price $10 lower. The bleeding edge is the bleeding edge regardless of whose logo is on the box. I don't think Intel is unaware of that either.
    So ask the owner of this site?

    It doesn't mean locked multipliers are a sign that Intel or AMD is at war with us. I bought NW; it was an upgrade of my old AthlonXP that replaced my Thunderbird. A 3500+ replaced that NW and my current Conroe replaced that 3500+. If Phenom had kicked ass, I would have jumped on that bandwagon.

    Indeed I am. Since I scored my E8400 I haven't been paying much attention to the overclocking records. Does the E8500 really clock that much higher? That's surprising since the stock clock is not much higher. I guess I'll have to go take a look at the numbers.
    I honestly don't know for sure. I did say I believe.

    Maybe it won't have to be overclocked 'as hard', but if it can't be overclocked at all we may find lots of enthusiasts sticking with Penryn until the next process shrink. IIRC, my lil E8400 with a stock speed of 3GHz has made it up to 4.7 Ghz on air and 5+ on phase. That is a lot more than 30% faster. So for an overclocker that may become relevant in a Penryn vs. Nehalem comparison.
    I said we'll have to wait and see. But that's just my opinion, not a fact. I pointed out that I could easily be wrong.

    Quote Originally Posted by gojirasan View Post
    So what if they are aware? That doesn't mean they are going to unlock all their multipliers, provide a warranty for overclocked chips, and welcome the overclocking community with open arms. Talk is cheap. Let's see some action if they support us so much.
    IMHO that's unrealistic=P I don't even think overclockers expect something like that. If I were you, I wouldn't use the term "us" with that claim.

    I openly complained to folks trashing the FSB on the desktop. I said that FSB was more flexible and would be easier to overclock, and I got jumped for my efforts. I wonder if folks still think FSB is so terrible now?

    My fingers are still crossed for that legacy Nehalem chip/s for the old fashioned stuff.
    Quote Originally Posted by Movieman
    With the two approaches to "how" to design a processor WE are the lucky ones as we get to choose what is important to us as individuals.
    For that we should thank BOTH (AMD and Intel) companies!


    Posted by duploxxx
    I am sure JF is relaxed and smiling these days with there intended launch schedule. SNB Xeon servers on the other hand....
    Posted by gallag
    there yo go bringing intel into a amd thread again lol, if that was someone droping a dig at amd you would be crying like a girl.
    qft!

  9. #209
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    Jack, I really don't want to get into details since it's too lengthy for the time I have set aside to post on a rumor thread, but for the sake of you and other level-headed enthusiasts, here goes
    Quote Originally Posted by JumpingJack View Post
    Quote Originally Posted by KTE
    K10h is a 12 stage pipeline, 65nm, 283mm², 463M transistor, 23.x FO4 delays design. It is not made for high clocks in any way; as presented at one of the global IEEE 2006 conferences, AMD intended Barcelona to reach 2-2.8GHz at its rated supply Vdd. Intel Core 2 is a 21 FO4 depth design AFAIK and Penryn is at FO4 ~18; it is supposed to have been reduced substantially since HKMG integration.

    The IBM Power6 is not the least nor the only architecture with 13 FO4 inversion delay, it just happens to be very well tuned for absolute speed and performance. P3 had FO4 15 depth, Willamette P4 FO4 8-10, Alpha 21264 has 15 FO4, and so on. Neither of those could achieve what IBM did.
    If you have references, I would really love to see those.
    They are details I've known from Intel and AMD themselves directly, thus more authoritative than the online quotes I provide, apart from Penryn Core 2 Extreme, where I was only told that FO4 delay is reduced from Core 2 65nm (drop them an email or ask David/s at RW; they should know, as the ISSCC and IEEE 2006/7 conferences did make brief mention of them). Neither MFG has liked releasing such vital data online since the P4 days, so we're not going to find much on it: for competitive reasons, the little documentation that does exist does not cover either architecture's engineering in such depth until it's old. You only ever hear tidbits through journals and studies now, which isn't the important detailed engineering, and very few daily journalistic sources can catch or even understand the real details which matter in engineering (they're not exactly educated to). The only thing they tend to do is feed extravagant hype 12-18 months early to suit the intended extremists, who do a good enough payroll job each time to propel things in favor of their obsession, as the MFG intended to begin with. Then you have unintelligent corner-lurking individuals who react like their mother is being held hostage by one of the MFGs, so they have to throw childish tantrums at anyone who speaks ever so slightly admonishingly or not-so-perfectly of that particular MFG, regardless of accuracy or their knowledge limitations, be it on Intel or AMD. A sad case I wish had never existed online since '98, since we're only interested in the architectures when discussing, and I know I don't favour any MFG in any product but whatever is cheap and okay for my intended tasks in the end, as most of the sane will. They just want our money. Thus it just spoils forums and usefulness in discussion.
    To search for those 2 figure FO4 online will require much time I'm not able to spend right now with an intermittent broken network connection for about 7 days now, even if it does exist, however I will try and get you some mentions of those FO4 depths no doubt, specifically for Core 2 and K10h later.

    Ok, it wasn't that bad actually, just scanned to approximate how hard it may be to find it and took less than 20 seconds for K10h:
    K10h inverter delay mentioned [end of page 2 and start of page 3]: http://www.hypertransport.org/docs/n...a_05-16-07.pdf
    This document mentions a few FO4 depths including that of Core 2 (only one I've found so far online): http://www.springerlink.com/index/q88838k207r37554.pdf
    This document mentions them of many more CPUs: http://www.realworldtech.com/page.cf...1502231107&p=2
    These comparative graphs also look accurate to me, judging from all the lower FO4s I know to be correct: http://www-vlsi.stanford.edu/group/chart/cycleFO4.pdf, http://www-vlsi.stanford.edu/group/c...kFrequency.pdf, http://www-vlsi.stanford.edu/group/c...werDensity.pdf

    I'll try and get some word from Intel on Penryn FO4 for you specifically and let you know the full reply by PM (you can then post it if you want, since I don't have any need to post in this thread after my first post and this to answer your enthusiastically put genuine request).
    I have not been able to find that type of data readily. Now, true... P3 nor Willamette got to these clocks (your FO4s are probably correct, I have not seen the data myself), but they were also not built on a 65 nm process with a 1.09 nm gate.
    Yep, exactly. My point wasn't to compare clocks between any of them at all; you and I both know there are major variables which would make that inaccurate. It was that FO4 alone doesn't dictate Frequency@TDP: a wide variety of features, the whole architectural design and the material choice can limit and affect this greatly. If Intel CPUs can clock well, I would never say it is because of one circuitry factor alone. It has been like this since NetBurst, which went from 16+ to 8 FO4 depths (not sure of the maximum, but it was above 16 for sure, and some PEs wager 6 is the lowest FO4 they had), and Core 2 has a fairly reserved FO4 above 20 to begin with yet can still clock high; although with high TDPs at 65nm, it still is very good.

    I'll quickly explain a little for the benefit of genuine and sane-minded knowledge seekers. In any modern microprocessor, the slowest pipeline stage is what determines your maximum operable frequency. In VHDL, the critical path delays are where the major problem for clocking arises, as the delays add up here. The biggest factors affecting a CPU's maximum clock frequency at a constant FO4 delay are:
    a) Microarchitecture
    b) Process Variation and Accessibility
    c) Logic Styles
    d) Timing Overheads
    e) Cell designs
    f) Wiring Size
    g) Floorplan and Placement
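
    To put the "slowest stage sets the clock" point in concrete terms, here is a minimal Python sketch. The stage delays are made-up numbers for illustration only (they do not describe any real CPU), and `max_clock_hz` is a hypothetical helper name.

```python
def max_clock_hz(stage_delays_ps):
    """Highest clock (Hz) at which every pipeline stage still fits in one cycle.

    stage_delays_ps: per-stage critical-path delays in picoseconds
    (logic plus latch/skew overhead), all hypothetical values.
    """
    if not stage_delays_ps:
        raise ValueError("need at least one pipeline stage")
    slowest_ps = max(stage_delays_ps)   # the critical stage sets the cycle time
    return 1e12 / slowest_ps            # 1e12 ps per second


# Illustrative delays only: three quick stages and one slow one.
stages_ps = [250.0, 300.0, 400.0, 280.0]
print(f"{max_clock_hz(stages_ps) / 1e9:.2f} GHz")  # limited by the 400 ps stage
```

    Speeding up any stage other than the slowest one changes nothing here, which is why the factors listed above matter mostly where they touch the critical path.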

    Now, even more so than these are the FO4latch (incl. clock skew and jitter delays), FO4logic and subsequently FO4pipeline delays, which designate the depth of the critical path through logic in one pipeline stage. They greatly affect a CPU's clock frequency within desktop TDPs, as well as how much of the CPU surface can be covered in one processor cycle. The most important of those parameters affect the critical path lengths and critical path delays (e.g. register propagation delay). Even the subthreshold leakage, gate direct tunneling leakage, junction leakage and gate-induced drain leakage greatly affect any CPU's clocking at a given transistor Vdd and Tox. Array power, latch and clock are the primary components of power dissipation in CPUs too. Modern CPUs have a power given by the formula P = Pdynamic + Pleakage, and leakage for SiO2 is supposed to be as much as 40% of the used power, especially as the fabrication node shrinks: decreasing the threshold voltage of any transistor increases the leakage current exponentially (e.g. decreasing the threshold voltage by 100mV increases the leakage current by a factor of 10), and decreasing the length of transistors increases the leakage current as well. This again poses huge clock frequency barriers to CPUs in real life, as opposed to theoretical simulations, when you shift process size [more on it here].
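
    The leakage rule of thumb and the power split above can be sketched numerically. This is only an illustration: the 100 mV-per-decade slope is taken from the example in the post, the wattages are invented, and the function names are hypothetical.

```python
def leakage_multiplier(delta_vth_mv, mv_per_decade=100.0):
    """Factor by which leakage current changes when Vth shifts by delta_vth_mv.

    A negative delta (lower threshold voltage) gives a multiplier above 1.
    mv_per_decade=100 encodes the "100 mV lower -> 10x leakage" example.
    """
    return 10.0 ** (-delta_vth_mv / mv_per_decade)


def leakage_fraction(p_dynamic_w, p_leakage_w):
    """Share of total power P = Pdynamic + Pleakage that is leakage."""
    return p_leakage_w / (p_dynamic_w + p_leakage_w)


# Invented operating point: 60 W dynamic + 40 W leakage (the 40% case).
p_dyn, p_leak = 60.0, 40.0
print(f"leakage share: {leakage_fraction(p_dyn, p_leak):.0%}")

# Lower Vth by 100 mV with everything else held constant (an assumption).
p_leak_new = p_leak * leakage_multiplier(-100.0)
print(f"new leakage share: {leakage_fraction(p_dyn, p_leak_new):.0%}")
```

    Holding dynamic power fixed while Vth moves is a simplification; in practice the supply voltage and clock would be retuned as well.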
    The fan-out-of-four (FO4) inverter delay becomes an ideal metric to compare and estimate clocking that is based on technology scaling alone, i.e. if you keep the same architecture but just change FO4, it is bound to clock better if design/TDP does not restrict this. The ratio of a CPU's FO4 delay to the minimal signal delay for any CMOS is node independent, and you can calculate it by the formula Fmax ≈ 1/(π·Trise), where Trise = τFO4 (one FO4 delay). As such, for a given technology node, FO4 13 at 65nm for a CMOS has a maximum theoretical limit of ~7.5GHz, while at 18nm it has a maximum theoretical frequency of ~11.5GHz. Now, this is where FO4 delay alone becomes paramount for CMOS, all other things kept constant. If you decrease the FO4 to one, as is commonly done in engineering to find the maximum theoretical frequency of the circuitry, then the maximum clock frequency possible at 65nm is ~90GHz whilst at 18nm it's ~225GHz. The industry standard is to compare energy efficiency between FO4s in different CPU designs in power-performance space (some do this as Energy*Delay^2). In this respect, an electrical assessment, you will see Power6 outperform NetBurst, K8, Core 2 and K10h for efficiency.
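
    For anyone who wants to play with the Fmax ≈ 1/(π·Trise) relation, a small Python sketch follows. Two things in it are assumptions rather than anything stated above: the τFO4 value is back-derived from the ~90GHz one-FO4 figure quoted for 65nm (it is not a measured delay), and scaling the cycle linearly with the number of FO4 delays per stage is a simplification, so its output need not match the quoted 13-FO4 figures.

```python
import math


def fmax_hz(tau_fo4_s, fo4_per_cycle=1.0):
    """Fmax ~ 1 / (pi * Trise), with Trise taken as fo4_per_cycle * tau_FO4.

    Treating Trise as a simple multiple of tau_FO4 is an assumption made
    for illustration.
    """
    return 1.0 / (math.pi * tau_fo4_s * fo4_per_cycle)


def tau_from_one_fo4_fmax(fmax_one_fo4_hz):
    """Invert the same relation to get the implied single-FO4 delay."""
    return 1.0 / (math.pi * fmax_one_fo4_hz)


# Back-derive tau_FO4 from the ~90 GHz one-FO4 figure quoted for 65nm.
tau_65nm = tau_from_one_fo4_fmax(90e9)
print(f"implied tau_FO4: {tau_65nm * 1e12:.2f} ps")

# Same relation with 13 FO4 delays per cycle, under the linear assumption.
print(f"13 FO4 per cycle: {fmax_hz(tau_65nm, 13) / 1e9:.2f} GHz")
```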

    As I mentioned, Power6 is not a desktop CPU nor does it compare to desktop CPUs in the low load desktop workloads nor in applications it is designed for and they are made to drive, but for 65nm CPUs, it sure is electrically much better engineering than K8, P4 or Core 2 is for absolute Performance/MHz/TDP. It doesn't produce the most Gigaflops and throughput between them for no reason, and all the meanwhile it stays sub 60C on air cooling while it is a circuit designed for heavy temperatures (plus 100C was the burn-in testing).

    Some excellent and authoritative sources for such knowledge are: Proceedings of the Advanced Metallization Conference -2007, IEEE Transactions on Computers (i.e. Integrated Analysis of Power and Performance for Pipelined Microprocessors), IEEE Transactions on Electron Devices, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Standby and Active Leakage Current Control and Minimization in CMOS VLSI Circuits, Inductance Calculations Working Formula and Tables- Research Triangle Park, Inductance Calculations in a Complex Circuit Environment - IBM J. Res. Develop. and so on.

  10. #210
    Xtreme Enthusiast
    Join Date
    Nov 2005
    Location
    Sweden, Örebro
    Posts
    818
    I love your long replies, but I think most people could use a summary at the end

    //Andreas

  11. #211
    Xtreme Addict
    Join Date
    Nov 2007
    Location
    Vancouver
    Posts
    1,073

    Talking

    Quote Originally Posted by KTE View Post
    Jack, I really don't want to get into details since it's too lengthy for the time I have set aside to post on a rumor thread, but for the sake of you and other level-headed enthusiasts, here goes
    They are details I've known from Intel and AMD themselves directly, thus more authoritative than the online quotes I provide, apart from Penryn Core 2 Extreme, where I was only told that FO4 delay is reduced from Core 2 65nm (drop them an email or ask David/s at RW; they should know, as the ISSCC and IEEE 2006/7 conferences did make brief mention of them). Neither MFG has liked releasing such vital data online since the P4 days, so we're not going to find much on it: for competitive reasons, the little documentation that does exist does not cover either architecture's engineering in such depth until it's old. You only ever hear tidbits through journals and studies now, which isn't the important detailed engineering, and very few daily journalistic sources can catch or even understand the real details which matter in engineering (they're not exactly educated to). The only thing they tend to do is feed extravagant hype 12-18 months early to suit the intended extremists, who do a good enough payroll job each time to propel things in favor of their obsession, as the MFG intended to begin with. Then you have unintelligent corner-lurking individuals who react like their mother is being held hostage by one of the MFGs, so they have to throw childish tantrums at anyone who speaks ever so slightly admonishingly or not-so-perfectly of that particular MFG, regardless of accuracy or their knowledge limitations, be it on Intel or AMD. A sad case I wish had never existed online since '98, since we're only interested in the architectures when discussing, and I know I don't favour any MFG in any product but whatever is cheap and okay for my intended tasks in the end, as most of the sane will. They just want our money. Thus it just spoils forums and usefulness in discussion.
    To search for those 2 figure FO4 online will require much time I'm not able to spend right now with an intermittent broken network connection for about 7 days now, even if it does exist, however I will try and get you some mentions of those FO4 depths no doubt, specifically for Core 2 and K10h later.

    Ok, it wasn't that bad actually, just scanned to approximate how hard it may be to find it and took less than 20 seconds for K10h:
    K10h inverter delay mentioned [end of page 2 and start of page 3]: http://www.hypertransport.org/docs/n...a_05-16-07.pdf
    This document mentions a few FO4 depths including that of Core 2 (only one I've found so far online): http://www.springerlink.com/index/q88838k207r37554.pdf
    This document mentions them of many more CPUs: http://www.realworldtech.com/page.cf...1502231107&p=2
    These comparative graphs also look accurate to me, judging from all the lower FO4s I know to be correct: http://www-vlsi.stanford.edu/group/chart/cycleFO4.pdf, http://www-vlsi.stanford.edu/group/c...kFrequency.pdf, http://www-vlsi.stanford.edu/group/c...werDensity.pdf

    I'll try and get some word from Intel on Penryn FO4 for you specifically and let you know the full reply by PM (you can then post it if you want, since I don't have any need to post in this thread after my first post and this to answer your enthusiastically put genuine request).
    Yep, exactly. My point wasn't to compare clocks between any of them at all; you and I both know there are major variables which would make that inaccurate. It was that FO4 alone doesn't dictate Frequency@TDP: a wide variety of features, the whole architectural design and the material choice can limit and affect this greatly. If Intel CPUs can clock well, I would never say it is because of one circuitry factor alone. It has been like this since NetBurst, which went from 16+ to 8 FO4 depths (not sure of the maximum, but it was above 16 for sure, and some PEs wager 6 is the lowest FO4 they had), and Core 2 has a fairly reserved FO4 above 20 to begin with yet can still clock high; although with high TDPs at 65nm, it still is very good.

    I'll quickly explain a little for the benefit of genuine and sane-minded knowledge seekers. In any modern microprocessor, the slowest pipeline stage is what, more than anything else, determines your maximum operating frequency. In VHDL, the critical path is where the major problem for clocking arises, as the delays add up along it. The biggest factors affecting a CPU's maximum clock frequency at a constant FO4 delay are:
    a) Microarchitecture
    b) Process Variation and Accessibility
    c) Logic Styles
    d) Timing Overheads
    e) Cell designs
    f) Wiring Size
    g) Floorplan and Placement
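
    As a rough illustration of the "slowest pipeline stage sets the clock" point above, here is a minimal Python sketch. The stage names and delays are made-up numbers, not measurements of any real CPU, and the function name is hypothetical.

```python
def max_clock_ghz(stage_delays_ps):
    """Upper bound on clock frequency: one cycle must fit the slowest stage."""
    delays = list(stage_delays_ps)
    if not delays:
        raise ValueError("need at least one stage delay")
    worst_ps = max(delays)       # the critical (slowest) stage
    return 1000.0 / worst_ps     # 1/ps is THz, so multiply by 1000 for GHz

# Hypothetical per-stage delays in picoseconds (logic + latch overhead).
stages = {"fetch": 250.0, "decode": 310.0, "execute": 400.0, "writeback": 280.0}
print(max_clock_ghz(stages.values()))
```

    Speeding up any stage other than the slowest one leaves the result unchanged, which is why the critical path gets all the attention.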

    Now, even more so than these are the FO4latch (incl. clock skew and jitter delays), FO4logic and subsequently FO4pipeline delays, which designate the depth of the critical path through logic in one pipeline stage. They greatly affect a CPU's clock frequency within desktop TDPs, as well as how much of the CPU surface can be covered in one processor cycle. The most important of those parameters affect the critical path lengths and critical path delays (i.e. register propagation delay). Even the subthreshold leakage, gate direct tunneling leakage, junction leakage and gate induced drain leakage greatly affect any CPU's clocking at a given transistor Vdd and Tox. Array power, latch and clock are the primary components of power dissipation in CPUs too. Modern CPUs have a power given by the formula P = Pdynamic + Pleakage, and leakage for SiO is supposed to be as much as 40% of the used power, especially as the fabrication node shrinks. Decreasing the threshold voltage of any transistor increases the leakage current exponentially (e.g. decreasing the threshold voltage by 100mV increases the leakage current by a factor of 10), and decreasing the length of transistors increases the leakage current as well. This again poses huge clock frequency barriers to CPUs in real life, as opposed to theoretical simulations, when you shift process size [more on it here].
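
    To make the leakage rule of thumb above concrete, here is a small Python sketch. It assumes the 100mV-per-decade figure from the post as a fixed constant (real devices vary), and the function names and wattages are hypothetical, chosen only so that leakage starts at 40% of the total.

```python
def leakage_scale(delta_vth_mv, swing_mv_per_decade=100.0):
    """Relative subthreshold leakage after changing Vth by delta_vth_mv.

    Negative delta means a lower threshold voltage, hence more leakage.
    Assumes leakage changes 10x per swing_mv_per_decade, per the post.
    """
    return 10.0 ** (-delta_vth_mv / swing_mv_per_decade)

def total_power(p_dynamic_w, p_leakage_w):
    """P = Pdynamic + Pleakage."""
    return p_dynamic_w + p_leakage_w

# Hypothetical baseline: 60 W dynamic, 40 W leakage.
p_dyn, p_leak = 60.0, 40.0
for delta in (0.0, -25.0, -50.0, -100.0):
    leak = p_leak * leakage_scale(delta)
    print(delta, round(leak, 1), round(total_power(p_dyn, leak), 1))
```

    Dynamic power is held constant here to isolate the leakage term; in practice a lower threshold voltage is usually traded against supply voltage and frequency, so it would move as well.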
    The fan-out-of-four (FO4) inverter metric becomes an ideal metric to compare and estimate clocking based purely on technology scaling, i.e. if you keep the same architecture but just change FO4, it is bound to clock better if design/TDP does not restrict this. The ratio of a CPU's FO4 delay to the minimal signal delay for any CMOS is node independent, and you can calculate it by the formula Fmax ≈ 1/(π·Trise), where Trise = τFO4 (one FO4 delay). So, for a given technology node, FO4 13 at 65nm for a CMOS has a maximum theoretical limit of ~7.5GHz, while at 18nm it has a maximum theoretical frequency of ~11.5GHz. Now, this is where FO4 delay alone becomes paramount for CMOS, all other things kept constant. If you decrease the FO4 to one FO4, as is commonly done in engineering to find the maximum theoretical frequency of the circuitry, then the maximum clock frequency possible at 65nm is ~90GHz, whilst at 18nm it's ~225GHz. The industry standard is to measure energy efficiency between FO4s in different CPU designs as power-performance space (some do this as Energy*Delay^2). In this respect, as an electrical assessment, you will see Power6 outperform NetBurst, K8, Core 2 and K10h for efficiency.
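
    For anyone who wants to play with the formula above, here is a minimal Python sketch, reading it as Fmax ≈ 1/(π·Trise). Two things are assumptions and not from the post: that Trise grows linearly with FO4 depth (Trise = depth × τFO4), and the 5 ps FO4 delay, which is a placeholder rather than a figure for any real process. It is not meant to reproduce the GHz numbers quoted above.

```python
import math

def fmax_ghz(tau_fo4_ps, fo4_depth=1.0):
    """Fmax ~ 1 / (pi * Trise), with Trise assumed to be fo4_depth * tau_fo4."""
    if tau_fo4_ps <= 0 or fo4_depth <= 0:
        raise ValueError("delay and depth must be positive")
    t_rise_ps = fo4_depth * tau_fo4_ps
    return 1000.0 / (math.pi * t_rise_ps)   # 1/ps is THz, so x1000 gives GHz

tau_ps = 5.0  # hypothetical FO4 inverter delay for some process node
for depth in (1, 8, 13, 20):
    print(depth, round(fmax_ghz(tau_ps, depth), 2))
```

    Under the linear assumption, halving either the FO4 depth or the per-inverter delay doubles the estimate, which is the "all things kept constant" scaling described above.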

    As I mentioned, Power6 is not a desktop CPU, nor does it compare to desktop CPUs in low-load desktop workloads or in the applications each is designed to drive. But among 65nm CPUs, it sure is electrically much better engineered than K8, P4 or Core 2 for absolute Performance/MHz/TDP. It doesn't produce the most Gigaflops and throughput among them for no reason, and all the while it stays below 60C on air cooling, even though it is a circuit designed for heavy temperatures (100C+ was the burn-in testing).

    Some excellent and authoritative sources for such knowledge are: Proceedings of the Advanced Metallization Conference -2007, IEEE Transactions on Computers (i.e. Integrated Analysis of Power and Performance for Pipelined Microprocessors), IEEE Transactions on Electron Devices, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Standby and Active Leakage Current Control and Minimization in CMOS VLSI Circuits, Inductance Calculations Working Formula and Tables- Research Triangle Park, Inductance Calculations in a Complex Circuit Environment - IBM J. Res. Develop. and so on.


    To be serious, you've filled in some gaps in knowledge I had regarding the actual effect and limitation FO4 and process size have on theoretical max speed.
    " Business is Binary, your either a 1 or a 0, alive or dead." - Gary Winston ^^



    Asus rampage III formula,i7 980xm, H70, Silverstone Ft02, Gigabyte Windforce 580 GTX SLI, Corsair AX1200, intel x-25m 160gb, 2 x OCZ vertex 2 180gb, hp zr30w, 12gb corsair vengeance

    Rig 2
    i7 980x ,h70, Antec Lanboy Air, Samsung md230x3 ,Saphhire 6970 Xfired, Antec ax1200w, x-25m 160gb, 2 x OCZ vertex 2 180gb,12gb Corsair Vengence MSI Big Bang Xpower

  12. #212
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    KTE -- Thanks for the link, and great post.

    Look how shallow the P4s were; the long pipeline did its job there

    Another interesting paper on Clock Scaling vs IPC (projection written about 2000 or so): http://www.cs.utexas.edu/ftp/pub/dbu...ers/ISCA00.pdf (table 2)

    Jack
    Last edited by JumpingJack; 03-21-2008 at 05:05 PM.
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft
