Thread: AMD Bulldozer Thread

  1. #501
    Xtreme Member
    Join Date
    Jun 2005
    Location
    Bulgaria, Varna
    Posts
    447
    Actually, Intel does a separate SRAM cell design for their L3 caches that's much denser. AMD simply re-uses the SRAM cells from its L2 design for the L3.

  2. #502
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,264
    Guys, 2B never made sense in the first place when you did the rough sums. 1.2B sounds closer, but too little IMO:

    These figures may be slightly out, but they're close enough to give an idea of how wrong 2B sounds.

    4-core Deneb:
    6MB L3 cache: 458M
    2MB L2: 152M
    4 cores: 140M
    CPU-NB + misc: ~8M

    Total: 758M


    6-core Thuban:
    6MB L3 cache: 458M
    3MB L2: 228M
    6 cores: 210M
    CPU-NB + misc: ~8M

    Total: 904M

    4-module Bulldozer:

    Module transistor count based on AMD's pre-release slide stating 268M transistors for 1 module including 2MB cache.

    8MB L3 cache: ~610M
    8MB L2 cache: ~610M
    4 modules: ~240M (at ~60M each)
    CPU-NB + misc: ~8M

    Total: ~1.46B
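
As a sanity check on the sums above, here is a minimal Python sketch that simply adds up the per-block estimates from this post. The dictionary keys and the helper name `total_millions` are made up for illustration, and the numbers are the poster's rough guesses, not official figures.

```python
# Rough per-block transistor estimates in millions, copied from the post above.
# These are forum guesses, not official AMD numbers.
ESTIMATES = {
    "Deneb": {"L3 cache": 458, "L2 cache": 152, "cores": 140, "NB + misc": 8},
    "Thuban": {"L3 cache": 458, "L2 cache": 228, "cores": 210, "NB + misc": 8},
    "Bulldozer": {"L3 cache": 610, "L2 cache": 610, "modules": 240, "NB + misc": 8},
}


def total_millions(blocks):
    """Sum the per-block estimates (hypothetical helper, values in millions)."""
    return sum(blocks.values())


TOTALS = {chip: total_millions(blocks) for chip, blocks in ESTIMATES.items()}

if __name__ == "__main__":
    for chip, total in TOTALS.items():
        print(f"{chip}: {total}M transistors (estimate)")
```

The Bulldozer sum comes out at 1,468M, in line with the ~1.46B total in the post and well short of 2B.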

  3. #503
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Italy
    Posts
    246
    Quote Originally Posted by Gambit_2K View Post
    How is that a review? It's an analysis of the architecture; they don't even have any form of performance test in the article...
    Don't worry, the second part is coming.
    http://www.xtremehardware.com/

    The Best Scene of Hardware In Italy

    Follow Us And Add Me http://www.facebook.com/?ref=tn_tnmn...00003658778509

  4. #504
    I am Xtreme
    Join Date
    Sep 2006
    Posts
    10,374
    Never ever use performance slides from the manufacturer in a review... that will mostly backfire on you!
    Question : Why do some overclockers switch into d*ckmode when money is involved

    Remark : They call me Pro Asus Saaya yupp, I agree

  5. #505

  6. #506
    Xtreme Mentor
    Join Date
    Nov 2006
    Location
    Spain, EU
    Posts
    2,949
    I wouldn't call that a catastrophe, just horrible performance. AMD needs to abandon this architecture, and fast.
    Friends shouldn't let friends use Windows 7 until Microsoft fixes Windows Explorer (link)


    Quote Originally Posted by PerryR, on John Fruehe (JF-AMD) View Post
    Pretty much. Plus, he's here voluntarily.

  7. #507
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    What a rubbish article... The guy acknowledges that it's faster than the 12C MC and Xeon, BUT... he then says it's "not fast enough" since it has 33% more cores and scores a bit lower than that: "only" 27/32% faster in SPECjbb2005/SAP. What happened to Ars Technica? Don't bother with the 3rd page of the "article".
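
To put numbers on that complaint, here is a minimal sketch. It assumes the comparison is a 16-core chip against the 12-core MC (which is where "33% more cores" comes from) and takes the 27% and 32% gains quoted above at face value. `per_core_ratio` is a made-up helper name.

```python
def per_core_ratio(old_cores, new_cores, speedup):
    """Per-core throughput of the new chip relative to the old one.

    speedup is the whole-chip gain as a fraction (0.27 means 27% faster).
    """
    core_ratio = new_cores / old_cores
    return (1.0 + speedup) / core_ratio


# Gains quoted in the post: 27% and 32% (SPECjbb2005 / SAP).
QUOTED_GAINS = (0.27, 0.32)
RATIOS = [per_core_ratio(12, 16, gain) for gain in QUOTED_GAINS]

if __name__ == "__main__":
    for gain, ratio in zip(QUOTED_GAINS, RATIOS):
        print(f"{gain:.0%} faster with 33% more cores -> {ratio:.3f}x per core")
```

On these numbers each new core delivers roughly 95% to 99% of an old core's throughput, so scaling falls only slightly short of the core count.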

  8. #508
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Well, everyone still clings to the "33% more cores, 50% more performance" claim... that was touted all over the internet for months like gospel... and he has a point... How would a K10 with 2 more cores on 32nm have done? Personally, I think not much worse.

  9. #509
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by informal View Post
    What a rubbish article... The guy acknowledges that it's faster than the 12C MC and Xeon, BUT... he then says it's "not fast enough" since it has 33% more cores and scores a bit lower than that: "only" 27/32% faster in SPECjbb2005/SAP. What happened to Ars Technica? Don't bother with the 3rd page of the "article".
    Well, it has much worse performance per dollar and performance per watt than both MC and Xeon; how is that not bad? It's only faster than Xeon when compared with relatively cheap and slow Xeons. Performance per dollar is still worse.

  10. #510
    Xtreme Member
    Join Date
    Oct 2009
    Posts
    146
    This article (Ars Technica) is not bad at all, but it just said:

    AMD faces an uphill struggle just to compete with its own old chips—let alone with Intel.
    Did AnandTech ever say this?

    So if performance/watt is your first priority, we think the current Xeons are your best option.
    If performance/dollar is your first priority, we think the Opteron 6276 is an attractive alternative.
    From heise.de or English:

    In LINPACK, Opteron 6276 vs Xeon 5680: 205~239 GFLOPS vs 144 GFLOPS

    With AMD's Open64 compiler vs Intel Composer 2011 SP1: 454 vs 349 in integer and 337 vs 246 in floating point

    Also 502 MFLOPS/watt (6276) compared with 311 MFLOPS/watt (5680)
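
The ratios behind those figures, as a minimal sketch. The numbers are the ones quoted above for the Opteron 6276 and the Xeon 5680; the variable names and the `advantage` helper are made up for illustration.

```python
def advantage(a, b):
    """How far a is ahead of b, as a fraction (0.5 means 50% ahead)."""
    return a / b - 1.0


# LINPACK GFLOPS as quoted: the Opteron 6276 range vs the Xeon 5680.
OPTERON_GFLOPS = (205.0, 239.0)
XEON_GFLOPS = 144.0

# Efficiency as quoted, in MFLOPS per watt.
OPTERON_MFLOPS_PER_WATT = 502.0
XEON_MFLOPS_PER_WATT = 311.0

LINPACK_LEAD = [advantage(g, XEON_GFLOPS) for g in OPTERON_GFLOPS]
EFFICIENCY_LEAD = advantage(OPTERON_MFLOPS_PER_WATT, XEON_MFLOPS_PER_WATT)

if __name__ == "__main__":
    low, high = LINPACK_LEAD
    print(f"LINPACK lead: {low:.0%} to {high:.0%}")
    print(f"MFLOPS/watt lead: {EFFICIENCY_LEAD:.0%}")
```

That works out to roughly a 42% to 66% lead in LINPACK and about a 61% lead in MFLOPS per watt for the 6276 in this particular test.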
    Last edited by behrouz; 11-22-2011 at 08:39 AM.
    CPU : Athlon X2 7850,Clock:3000 at 1.20 | Mobo : Biostar TA790GX A2+ Rev 5.1 | PSU : Green GP535A | VGA : Sapphire 5770 Clock:910,Memory:1300 | Memory : Patriot 2x2 GB DDR2 800 CL 5-5-5-15 | LCD : AOC 931Sw

  11. #511
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    The comparison simply shows how FMA can double your FP throughput. FYI, Intel claims AVX-enabled 8-core SB Xeons will get a 2.1x improvement in Linpack over current high-end Xeons. That would mean ~300 GFLOPS, completely changing the situation.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  12. #512
    Xtreme Member
    Join Date
    Sep 2010
    Posts
    139
    Quote Originally Posted by savantu View Post
    The comparison simply shows how FMA can double your FP throughput. FYI, Intel claims AVX enabled 8 core SB Xeons will get 2.1x improvement in Linpack over current high end Xeons. That would mean 300 GFLOPs, completely changing the situation.
    And if you take AMD's top model, Intel will have a ~30 GFLOPS advantage in Linpack when both use optimized compilers. That indeed changes the situation from 100 GFLOPS slower to 30 GFLOPS faster.

  13. #513
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by flyck View Post
    And if you take AMD's top model, Intel will have a ~30 GFLOPS advantage in Linpack when both use optimized compilers. That indeed changes the situation from 100 GFLOPS slower to 30 GFLOPS faster.
    Never mind the fact that the 6282SE will not be the top model forever. By the time Intel launches the new 8C SB-E that scores 300 GFLOPS in Linpack, AMD will be refreshing their lineup. We can expect a 2.8GHz stock model, so it's roughly 2.8/2.3 = 1.21, or 21% faster than what the 6276 gets in Linpack (around 289 GFLOPS). This is just a tad (~3%) behind Intel's projected performance with AVX enabled on their highest(?)-end model. The price difference between the two chips will be huge, though.
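
The back-of-the-envelope projection can be sketched like this. It assumes Linpack scales linearly with clock (optimistic) and takes the thread's numbers as given: 239 GFLOPS for the 6276 at 2.3GHz, 144 GFLOPS for the Xeon 5680, and Intel's claimed 2.1x. The 2.6GHz clock used for the 6282SE and every name in the code are assumptions for illustration.

```python
def scale_by_clock(gflops, base_ghz, target_ghz):
    """Project Linpack GFLOPS to another clock, assuming linear scaling."""
    return gflops * target_ghz / base_ghz


OPTERON_6276_GFLOPS = 239.0   # best figure quoted earlier in the thread
OPTERON_6276_GHZ = 2.3
XEON_5680_GFLOPS = 144.0
INTEL_CLAIMED_SPEEDUP = 2.1   # Intel's claim for AVX-enabled 8-core SB Xeons

SB_XEON = XEON_5680_GFLOPS * INTEL_CLAIMED_SPEEDUP
OPTERON_2_6 = scale_by_clock(OPTERON_6276_GFLOPS, OPTERON_6276_GHZ, 2.6)
OPTERON_2_8 = scale_by_clock(OPTERON_6276_GFLOPS, OPTERON_6276_GHZ, 2.8)

# Fraction by which the hypothetical 2.8GHz Opteron trails the projected Xeon.
GAP_AT_2_8 = 1.0 - OPTERON_2_8 / SB_XEON

if __name__ == "__main__":
    print(f"Projected SB Xeon:      {SB_XEON:.1f} GFLOPS")
    print(f"Opteron at 2.6GHz:      {OPTERON_2_6:.1f} GFLOPS")
    print(f"Opteron at 2.8GHz:      {OPTERON_2_8:.1f} GFLOPS")
    print(f"2.8GHz part trails by:  {GAP_AT_2_8:.1%}")
```

With the unrounded 2.8/2.3 ratio the 2.8GHz projection lands near 291 GFLOPS instead of 289, a little under 4% behind the projected Xeon, while the 2.6GHz part trails by about 32 GFLOPS, close to the ~30 GFLOPS gap mentioned in the quoted post.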

  14. #514
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by informal View Post
    Never mind the fact that the 6282SE will not be the top model forever. By the time Intel launches the new 8C SB-E that scores 300 GFLOPS in Linpack, AMD will be refreshing their lineup.
    You assume the process will improve significantly in 2-3 months. The 6282SE is a 140W chip; pumping the stock frequency up another 200MHz could be an issue without a new stepping.
    We can expect a 2.8GHz stock model, so it's roughly 2.8/2.3 = 1.21, or 21% faster than what the 6276 gets in Linpack (around 289 GFLOPS). This is just a tad (~3%) behind Intel's projected performance with AVX enabled on their highest(?)-end model. The price difference between the two chips will be huge, though.
    Intel was never top dog in Linpack. MC pushed a lot more GFLOPS at a significantly lower cost in $/GFLOPS. Looking at the HPC wins, I'd say price is less of a factor than assumed, otherwise Xeon wouldn't dominate. It would be interesting to see how 16 (assuming 2P nodes) really fat SB cores will do compared with 32 skinnier BD cores in HPC codes (except Linpack, which is the best case for both).
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  15. #515
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by savantu View Post
    You assume the process will improve significantly in 2-3 months. The 6282SE is a 140W chip; pumping the stock frequency up another 200MHz could be an issue without a new stepping.


    Intel was never top dog in Linpack. MC pushed a lot more GFLOPS at a significantly lower cost in $/GFLOPS. Looking at the HPC wins, I'd say price is less of a factor than assumed, otherwise Xeon wouldn't dominate. It would be interesting to see how 16 (assuming 2P nodes) really fat SB cores will do compared with 32 skinnier BD cores in HPC codes (except Linpack, which is the best case for both).
    Well, the guy who knows about GloFo stuff (rich_wargo @ SA forum) hints at an improved process node in Q1. So maybe they will fix the yield and clock/power issues that obviously plague both Llano and Bulldozer. They managed to launch a 16C/8M 2.6GHz chip within the max TDP bracket on G34, on this crappy process. So I expect another speed bump in Q1. 100MHz is too small for a speed bump, so the next step is 2.8GHz. This chip would put AMD in a good position in SPEC rate tests (both integer and FP throughput). It would be a good duel to watch in HPC workloads: 4P 8C SB-EP @ 3GHz @ 150W vs 2.8GHz 8M/16C Opteron @ 140W.

  16. #516
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by informal View Post
    Well, the guy who knows about GloFo stuff (rich_wargo @ SA forum) hints at an improved process node in Q1. So maybe they will fix the yield and clock/power issues that obviously plague both Llano and Bulldozer. They managed to launch a 16C/8M 2.6GHz chip within the max TDP bracket on G34, on this crappy process. So I expect another speed bump in Q1. 100MHz is too small for a speed bump, so the next step is 2.8GHz. This chip would put AMD in a good position in SPEC rate tests (both integer and FP throughput). It would be a good duel to watch in HPC workloads: 4P 8C SB-EP @ 3GHz @ 150W vs 2.8GHz 8M/16C Opteron @ 140W.
    C'mon, rich knows nada. And I doubt the process is solely to blame. BD is massive, and its high-speed nature could mean it's just like Prescott reloaded: no matter how good the process is/was, it can't make BD/Prescott shine. Intel's 90nm was outstanding by any metric and Dothan fully showed that. However, that couldn't save Prescott's bacon. I have the impression something similar is going on here: the process is reasonably OK, yields are poorer than planned due to intrinsic things like gate-first, BUT BD and Llano aren't first-class engineering jobs.

    And with the relationship getting really sour, GF probably doesn't give a damn about AMD's issues with 32nm and is simply waiting for the pay-only-for-good-die deal to end. GF is taking huge losses and part of the blame lies with the design, which they have no influence over.

    And their other customers care more about 28nm bulk than 32nm SOI HKMG. The last yield figures put 28nm at 1-2 good dies per wafer. They must be dancing in the aisles at GF.

    Edit : just found something to reinforce my point that the process is acceptable :

    Meanwhile, Globalfoundries said it would not comment on its customer's foundry selection process or on their products unless they did so first. The spokesman also said problems with Llano had been specific to that product and that yields for AMD's 32/28nm Bulldozer products were on target and not affecting AMD's ability to meet customer commitments.

    “We are still the only foundry producing HKMG products that can be purchased in stores now,” the Globalfoundries spokesman said, noting that the fab expected to ship “far more” HKMG volume in 2011 than all other foundries combined.
    http://www.eetimes.com/electronics-n...benefits-TSMC-
    Last edited by savantu; 11-24-2011 at 06:23 AM.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  17. #517
    Xtreme Member
    Join Date
    Nov 2006
    Posts
    324
    Quote Originally Posted by savantu View Post
    [.... and that yields for AMD's 32/28nm Bulldozer products were on target and not affecting AMD's ability to meet customer commitments]
    That is about yield only, the % of good chips.
    Consider this: chips with different power draw can be delivered at the same yield.
    Windows 8.1
    Asus M4A87TD EVO + Phenom II X6 1055T @ 3900MHz + HD3850
    APUs

  18. #518
    Xtreme Enthusiast
    Join Date
    Jul 2004
    Location
    London
    Posts
    577
    Quote Originally Posted by savantu View Post
    I have the impression something similar is going on here: the process is reasonably OK, yields are poorer than planned due to intrinsic things like gate-first, BUT BD and Llano aren't first-class engineering jobs.
    There are some serious issues with the process. Anand mentioned it, along with a few other informed people. Yes, maybe the engineering has some issues as well, since they have tried something very different, so hopefully it will be fixed in later revisions. But the process definitely is not reasonably OK.
    i7 920@4.34 | Rampage II GENE | 6GB OCZ Reaper 1866 | 8800GT (zzz) | Corsair AX750 | Xonar Essence ST w/ 3x LME49720 | HiFiMAN EF2 Amplifier | Shure SRH840 | EK Supreme HF | Thermochill PA 120.3 | MCP355 | XSPC Reservoir | 3/8" ID Tubing

    Phenom 9950BE @ 3400/2000 (CPU/NB) | Gigabyte MA790GP-DS4H | HD4850 | 4GB Corsair DHX @850 | Corsair TX650W | T.R.U.E Push-Pull

    E2160 @3.06 | ASUS P5K-Pro | BFG 8800GT | 4GB G.Skill @ 1040 | 600W Tt PP

    A64 3000+ @2.87 | DFI-NF4 | 7800 GTX | Patriot 1GB DDR @610 | 550W FSP

  19. #519
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by mAJORD View Post
    Guys, 2B never made sense in the first place when you did the rough sums. 1.2B sounds closer, but too little IMO:

    These figures may be slightly out, but they're close enough to give an idea of how wrong 2B sounds.

    4-core Deneb:
    6MB L3 cache: 458M
    2MB L2: 152M
    4 cores: 140M
    CPU-NB + misc: ~8M

    Total: 758M


    6-core Thuban:
    6MB L3 cache: 458M
    3MB L2: 228M
    6 cores: 210M
    CPU-NB + misc: ~8M

    Total: 904M

    4-module Bulldozer:

    Module transistor count based on AMD's pre-release slide stating 268M transistors for 1 module including 2MB cache.

    8MB L3 cache: ~610M
    8MB L2 cache: ~610M
    4 modules: ~240M (at ~60M each)
    CPU-NB + misc: ~8M

    Total: ~1.46B
    Each module with 2MB L2 has 213M transistors according to AMD's ISSCC papers.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  20. #520
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    Quote Originally Posted by savantu View Post
    C'mon, rich knows nada. And I doubt the process is solely to blame. BD is massive, and its high-speed nature could mean it's just like Prescott reloaded: no matter how good the process is/was, it can't make BD/Prescott shine. Intel's 90nm was outstanding by any metric and Dothan fully showed that. However, that couldn't save Prescott's bacon. I have the impression something similar is going on here: the process is reasonably OK, yields are poorer than planned due to intrinsic things like gate-first, BUT BD and Llano aren't first-class engineering jobs.

    And with the relationship getting really sour, GF probably doesn't give a damn about AMD's issues with 32nm and is simply waiting for the pay-only-for-good-die deal to end. GF is taking huge losses and part of the blame lies with the design, which they have no influence over.

    And their other customers care more about 28nm bulk than 32nm SOI HKMG. The last yield figures put 28nm at 1-2 good dies per wafer. They must be dancing in the aisles at GF.

    Edit : just found something to reinforce my point that the process is acceptable :



    http://www.eetimes.com/electronics-n...benefits-TSMC-
    Jesus Christ, the world is coming to an end when I agree with Savantu.
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  21. #521
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    The NetBurst-based Prescott design used a lot of high-speed dynamic logic, which is not only faster (as required for an aggressive 8 FO4 (IIRC) frequency goal) but also uses much more power and more transistors. BD is a static CMOS design using faster logic styles for single speed paths.

    A look into the BD/Llano ISSCC papers (incl. the L3 shmoo plot) should indicate how they expected the designs to behave using the 32nm process.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  22. #522
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by STaRGaZeR View Post
    I know you like to suppose a lot, but the official figures are ~2B transistors for the die and this is pure BS. That or AMD's PR department just hit another level of mediocrity. 1,2B on a process that is known to be more dense than the competition?
    There are many areas on the die which seem to be empty and might just contain wires and repeaters.

    And as already said, there are different types of transistors with different specs and sizes.

    IIRC Llano contains 1B T.

    AMD also works with macro blocks containing specific logic circuits. These might cause slightly less efficient placement while being size-optimized in themselves.


    Sent from my GT-I9000 using Tapatalk
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  23. #523
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Dresdenboy View Post
    There are many areas on the die which seem to be empty and might just contain wires and repeaters.

    And as already said, there are different types of transistors with different specs and sizes.

    IIRC Llano contains 1B T.

    AMD also works with macro blocks containing specific logic circuits. These might cause slightly less efficient placement while being size-optimized in themselves.


    Sent from my GT-I9000 using Tapatalk
    It's official now. AMD contacted AT:
    http://www.anandtech.com/show/5176/a...unt-12b-not-2b
    This is a bit unusual. I got an email from AMD PR this week asking me to correct the Bulldozer transistor count in our Sandy Bridge E review. The incorrect number, provided to me (and other reviewers) by AMD PR around 3 months ago was 2 billion transistors. The actual transistor count for Bulldozer is apparently 1.2 billion transistors...

  24. #524
    Xtreme Mentor
    Join Date
    Nov 2006
    Location
    Spain, EU
    Posts
    2,949
    Quote Originally Posted by informal View Post
    It's official now. AMD contacted AT:
    http://www.anandtech.com/show/5176/a...unt-12b-not-2b
    Wasn't it official 2 weeks ago?

    Quote Originally Posted by informal View Post
    Official number has been corrected now; it's 1.2B and the die size is 315mm^2.
    Friends shouldn't let friends use Windows 7 until Microsoft fixes Windows Explorer (link)


    Quote Originally Posted by PerryR, on John Fruehe (JF-AMD) View Post
    Pretty much. Plus, he's here voluntarily.

  25. #525
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by STaRGaZeR View Post
    Wasn't it official 2 weeks ago?
    Well, it kinda was, but the website that claimed it was contacted by AMD never really posted what AMD said. Apparently AMD contacted several websites, and AT was the only one to post anything substantial.

    The funny thing is we still haven't gotten an explanation for the 2B figure...
