Actually, Intel does a separate SRAM cell design for their L3 caches that's much denser. AMD simply re-uses the SRAM cells from its L2 design for the L3.
Guys, 2B never made sense in the first place once you did the rough sums. 1.2B sounds closer, but too little IMO.
These figures may be slightly out, but they're close enough to give an idea of how wrong 2B sounds.
4 Core Deneb:
6MB L3 cache: 458M
2MB L2: 152M
4 cores: 140M
CPU-NB + misc: ~8M
Total: 758M
6 Core Thuban:
6MB L3 cache: 458M
3MB L2: 228M
6 cores: 210M
CPU-NB + misc: ~8M
Total: 904M
4 Module Bulldozer:
Module transistor count based on AMD's pre-release slide stating 268M transistors for 1 module including 2MB cache
8MB L3 cache: ~610M
8MB L2 cache: ~610M
4 modules: ~240M (at ~60M each)
CPU-NB + misc: ~8M
Total: ~1.47B
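For anyone who wants to redo the rough sums, here is a minimal sketch that just adds up the per-block estimates listed above. The numbers are the poster's guesses in millions of transistors, not official figures, and the dictionary layout and names are only for illustration.

```python
# Rough transistor budgets in millions, copied from the estimates above.
# These are forum guesses, not official AMD numbers.
estimates = {
    "Deneb (4 cores)": {"L3": 458, "L2": 152, "cores": 4 * 35, "NB+misc": 8},
    "Thuban (6 cores)": {"L3": 458, "L2": 228, "cores": 6 * 35, "NB+misc": 8},
    "Bulldozer (4 modules)": {"L3": 610, "L2": 610, "cores": 4 * 60, "NB+misc": 8},
}

# Sum every block for each chip.
totals = {name: sum(parts.values()) for name, parts in estimates.items()}

for name, total in totals.items():
    print(f"{name}: ~{total}M transistors ({total / 1000:.2f}B)")
```

Even with generous per-block guesses, the Bulldozer sum lands well under 2B, which is the point of the post.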
Never ever use performance slides from the manufacturer in a review... mostly that will backfire on you !!
I wouldn't call that a catastrophe, just horrible performance. AMD needs to abandon this architecture, and fast.
What a rubbish article... The guy acknowledges that it's faster than the 12C MC and the Xeon, BUT he then says it's "not fast enough" since it has 33% more cores and scores a bit lower than that: "only" 27%/32% faster in SPEC JBB2005/SAP. What happened to Ars Technica? Don't bother with the 3rd page of the "article".
Well, everyone still clings to the "33% more cores, 50% more performance" claim that was touted all over the internet for months like gospel, and he has a point. How would a K10 with 2 more cores on 32nm have done? Personally, I think not much worse.
This article (Ars Technica) is not bad at all, but it just said:
Did Anandtech ever say this?
Quote:
AMD faces an uphill struggle just to compete with its own old chips—let alone with Intel.
Quote:
So if performance/watt is your first priority, we think the current Xeons are your best option.
from heise.de or English
Quote:
If performance/dollar is your first priority, we think the Opteron 6276 is an attractive alternative.
In LINPACK, Opteron 6276 vs Xeon 5680: 205~239 GFLOPS vs 144 GFLOPS.
With AMD's Open64 compiler vs Intel Composer 2011 SP1: 454 to 349 in integer and 337 to 246 in floating point.
Also 502 MFLOPS/watt (6276) compared with 311 MFLOPS/watt (5680).
The comparison simply shows how FMA can double your FP throughput. FYI, Intel claims AVX enabled 8 core SB Xeons will get 2.1x improvement in Linpack over current high end Xeons. That would mean 300 GFLOPs, completely changing the situation.
Never mind the fact that the 6282SE will not be the top model forever. Whenever Intel launches the new 8C SB-E that scores 300 GFLOPS in Linpack, AMD will be refreshing their lineup by that time. We can expect a 2.8GHz stock model, so it's roughly 2.8/2.3=1.21 or 21% faster than what the 6276 gets in Linpack (or around 289 GFLOPS). This is just a tad (~3%) behind Intel's projected performance with AVX enabled on their highest(?) end model. The price difference between the two chips will be huge though.
You assume the process will improve significantly in 2-3 months. The 6282SE is a 140W chip; pumping the stock frequency another 200MHz could be an issue without a new stepping.
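The projection above is easy to rerun. This is only a sketch under the post's own assumptions: Linpack scales linearly with clock, the 6276 baseline is the upper 239 GFLOPS figure, the 2.8GHz part is hypothetical, and Intel's 2.1x claim is applied to the 144 GFLOPS Xeon 5680 result. Variable names are made up for illustration.

```python
# Figures quoted in the thread (GFLOPS and GHz).
opteron_6276_gflops = 239.0   # upper Linpack figure quoted for the Opteron 6276
opteron_6276_clock = 2.3      # stock clock of the 6276
speculated_clock = 2.8        # hypothetical future stock model, not a real SKU here

# Assumption: Linpack throughput scales linearly with core clock.
clock_scale = speculated_clock / opteron_6276_clock
projected_opteron = opteron_6276_gflops * clock_scale

# Intel's claimed 2.1x Linpack gain applied to the Xeon 5680 result.
xeon_5680_gflops = 144.0
projected_sb_xeon = xeon_5680_gflops * 2.1

# Relative gap between the two projections.
gap = 1 - projected_opteron / projected_sb_xeon

print(f"clock scale: {clock_scale:.3f}")
print(f"projected Opteron: {projected_opteron:.0f} GFLOPS")
print(f"projected SB Xeon: {projected_sb_xeon:.0f} GFLOPS")
print(f"Opteron behind by: {gap:.1%}")
```

Without rounding the ratio down to 1.21, the Opteron projection comes out a couple of GFLOPS higher than the 289 in the post, and the gap stays in the low single digits either way.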
Intel was never top dog in Linpack. MC pushed a lot more GFLOPS at a significantly lower $/GFLOPS. Looking at the HPC wins, I'd say price is less of a factor than assumed, otherwise Xeon wouldn't dominate. It would be interesting to see how 16 (assuming 2P nodes) really fat SB cores will do compared with 32 skinnier BD cores in HPC codes (except Linpack, which is the best case for both).
Quote:
We can expect 2.8Ghz stock model so it's roughly around 2.8/2.3=1.21 or 21% faster than what 6276 gets in linpack (or around 289Gflops). This is just a tad(~3%) behind projected intel's performance with AVX enabled on their highest(?) end model. Price difference will be huge between two chips though.
Well, the guy who knows about GloFo stuff (rich_wargo @ SA forum) hints at an improved process node in Q1. So maybe they will fix the yield and clock/power issues that obviously plague both Llano and Bulldozer. They managed to launch a 16C/8M 2.6GHz chip within the max TDP bracket on G34, on this crappy process. So I expect another speed bump in Q1. 100MHz is too low for a speed bump, so the next step is 2.8GHz. This chip would put AMD in a good position in SPEC rate tests (both integer and FP throughput). It would be a good duel to watch in HPC workloads: 4P 8C SB-EP @ 3GHz @ 150W vs 2.8GHz 8M/16C Opteron @ 140W.
C'mon, rich knows nada. And I doubt the process is solely to blame. BD is massive and its high-speed nature could mean it's just like Prescott reloaded: no matter how good the process is/was, it can't make BD/Prescott shine. Intel's 90nm was outstanding by any metric and Dothan fully showed that. However, that couldn't save Prescott's bacon. I have the impression something similar is going on here: the process is reasonably OK, yields are poorer than planned due to intrinsic things like gate-first, BUT BD and Llano aren't first-class engineering jobs.
And with the relationship getting really sour, GF probably doesn't give a damn about AMD's issues with 32nm and is simply waiting for the pay-only-good-die deal to end. GF is taking huge losses and part of the blame is the design, which they have no influence upon.
And their other customers care more about 28nm bulk than 32nm SOI HKMG. Last yield figures put 28nm at 1-2 good dies per wafer. They must be dancing in the aisles at GF.
Edit: just found something to reinforce my point that the process is acceptable:
http://www.eetimes.com/electronics-n...benefits-TSMC-
Quote:
Meanwhile, Globalfoundries said it would not comment on its customer's foundry selection process or on their products unless they did so first. The spokesman also said problems with Llano had been specific to that product and that yields for AMD's 32/28nm Bulldozer products were on target and not affecting AMD's ability to meet customer commitments.
“We are still the only foundry producing HKMG products that can be purchased in stores now,” the Globalfoundries spokesman said, noting that the fab expected to ship “far more” HKMG volume in 2011 than all other foundries combined.
There are some serious issues with the process. Anand mentioned it along with a few other informed people. Yes, maybe the engineering has some issues as well since they have tried something very different so hopefully it will be fixed in later revisions. But the process definitely is not reasonably ok.
The Netburst based Prescott design used a lot of high speed dynamic logic, which is not only faster (as required for an aggressive 8 FO4 (IIRC) frequency goal) but uses much more power and more transistors. BD is a static CMOS design using faster logic styles for single speed paths.
A look into the BD/Llano ISSCC papers (incl. the L3 shmoo plot) should indicate how they expected the designs to behave using the 32nm process.
There are many areas on the die which seem to be empty and might just contain wires and repeaters.
And as already said, there are different types of transistors with different specs and sizes.
IIRC Llano contains 1B T.
AMD also works with macro blocks containing specific logic circuits. These might cause slightly less efficient placement while being size-optimized in themselves.
It's official now. AMD contacted AT:
http://www.anandtech.com/show/5176/a...unt-12b-not-2b
Quote:
This is a bit unusual. I got an email from AMD PR this week asking me to correct the Bulldozer transistor count in our Sandy Bridge E review. The incorrect number, provided to me (and other reviewers) by AMD PR around 3 months ago was 2 billion transistors. The actual transistor count for Bulldozer is apparently 1.2 billion transistors...
Well it kinda was but the website that claimed they were contacted by AMD never really posted what AMD said. Apparently AMD contacted several websites and AT was the only one to post anything substantial.
The funny thing is we still didn't get the explanation about the 2B figure...