Thread: AMD Bulldozer Thread

  1. #526
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Quote Originally Posted by informal View Post
    Well it kinda was but the website that claimed they were contacted by AMD never really posted what AMD said. Apparently AMD contacted several websites and AT was the only one to post anything substantial.

    The funny thing is we still didn't get the explanation about the 2B figure...
    As a joke:

    "We designed 2B CPU but had to disable 800M of them due to bugs/thermal issues. When we fix all problems, possibly in BDv2 or BDv3 chips size will stay almost the same but transistor count will return to 2B "
    RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W

    RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU

    SmartPhone Samsung Galaxy S7 EDGE
    XBONE paired with 55'' Samsung LED 3D TV

  2. #527
    Xtreme Addict
    Join Date
    Aug 2004
    Location
    Sweden
    Posts
    2,084
    Quote Originally Posted by Lightman View Post
    As a joke:

    "We designed 2B CPU but had to disable 800M of them due to bugs/thermal issues. When we fix all problems, possibly in BDv2 or BDv3 chips size will stay almost the same but transistor count will return to 2B "
    Exactly. Or another way to look at it: AMD will enable the extra powah if/when needed, sometime in the future..

  3. #528
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    @ Lightman

    That was funny indeed

  4. #529
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Quote Originally Posted by Mats View Post
    Exactly. Or another way to look at it: AMD will enable the extra powah if/when needed, sometime in the future..
    Sounds kind of like Fermi

    All along the watchtower the watchmen watch the eternal return.

  5. #530
    Xtreme Member
    Join Date
    Mar 2005
    Posts
    447
    Quote Originally Posted by Mats View Post
    Exactly. Or another way to look at it: AMD will enable the extra powah if/when needed, sometime in the future..
    if/when needed?


    uh how bout now/2 years ago?
    Iron Lung 3.0 | Intel Core i7 6800k @ 4ghz | 32gb G.SKILL RIPJAW V DDR4-3200 @16-16-16-36 | ASUS ROG STRIX X99 GAMING + ASUS ROG GeForce GTX 1070 STRIX GAMING | Samsung 960 Pro 512GB + Samsung 840 EVO + 4TB HDD | 55" Samsung KS8000 + 30" Dell u3011 via Displayport - @ 6400x2160

  6. #531
    Xtreme Member
    Join Date
    Nov 2006
    Location
    CroLand
    Posts
    379
    This is a bit unusual. I got an email from AMD PR this week asking me to correct the Bulldozer transistor count in our Sandy Bridge E review. The incorrect number, provided to me (and other reviewers) by AMD PR around 3 months ago was 2 billion transistors. The actual transistor count for Bulldozer is apparently 1.2 billion transistors. I don't have an explanation as to why the original number was wrong, just that the new number has been triple checked by my contact and is indeed right. The total die area for a 4-module/8-core Bulldozer remains correct at 315mm2.
    http://www.anandtech.com/show/5176/a...unt-12b-not-2b
    Phenom II x6 1055T | ASRock 880G Ex.3 | 560Ti FrozrII 1GB| Corsair Vengeance 1600 2x4GB | Win7 64 | M4 128GB

    VR Box - i5 6600 | MSI Mortar | Gigabyte G1 GTX 1060 | Viper 16GB DDR4 2400 | 256 SSD | Oculus Rift CV1 + Touch

  7. #532
    Xtreme Enthusiast
    Join Date
    Mar 2007
    Location
    Portsmouth, UK
    Posts
    963
    The only thing with the number of transistors that bugs me is why it took them so long to realise they'd given out bad information. I mean seriously, is their QA so bad that even their counting/marketing is faulty?

  8. #533
    Banned Movieman...
    Join Date
    May 2009
    Location
    illinois
    Posts
    1,809
    Quote Originally Posted by DeathReborn View Post
    The only thing with the number of transistors that bugs me is why it took them so long to realise they'd given out bad information. I mean seriously, is their QA so bad that even their counting/marketing is faulty?
    Are you just now figuring out that their marketing sucks ass?

  9. #534
    Xtreme Addict
    Join Date
    Aug 2004
    Location
    Sweden
    Posts
    2,084
    Quote Originally Posted by Tenknics View Post
    if/when needed?


    uh how bout now/2 years ago?
    Irony.

  10. #535
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    The people in a big company might do very good work but this might get lost/reduced in quality due to processes and low quality decisions further up in the hierarchy.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  11. #536
    Registered User
    Join Date
    Jan 2007
    Posts
    24
    I saw a review where the reviewer had an FX (retail) that had 24M of cache rather than 16M. Could another 8M of L3 be those missing transistors? I wouldn't know so just asking the question.

    Could it also be that the server variants have the full complement of cache, as the speeds are slower and so fit the TDP, but the retail parts cannot fit inside the TDP with all that cache, so some of it has been disabled?

    The transistors might still be there but just not used in retail parts.

    The marketing dept have then failed to make the rationalisation between server transistor count (used) and retail transistor count (used) which then compounded the 'so many transistors for so little performance' argument? If so, no wonder they got culled.

    Just some musings from me here.

  12. #537
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    BSN* is reporting that AMD's CFO Thomas Seifert was allegedly let go yesterday... If true, then it's one of those management decisions that make zero sense (like Killebrew, Moorhead, David Hoff, Rick Bergman, Dirk Meyer etc.). AMD's debt right now is around 1.8B, while back in 2009, when he was appointed to his position, it was ~7B.

  13. #538
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by Augustus View Post
    I saw a review where the reviewer had an FX (retail) that had 24M of cache rather than 16M. Could another 8M of L3 be those missing transistors? I wouldn't know so just asking the question.

    Could it also be that the server variants have the full complement of cache, as the speeds are slower and so fit the TDP, but the retail parts cannot fit inside the TDP with all that cache, so some of it has been disabled?

    The transistors might still be there but just not used in retail parts.

    The marketing dept have then failed to make the rationalisation between server transistor count (used) and retail transistor count (used) which then compounded the 'so many transistors for so little performance' argument? If so, no wonder they got culled.

    Just some musings from me here.
    Since Interlagos uses two of the same dies as the desktop variant and the die shot has been shown to the public, there would have been comments if there were more than the known 8MB of L3.

    BTW, the transistor density of the modules (213M on 30.9 sqmm) is rather high (6.89M/sqmm) and the 1.2B number also seems to be a bit too low.
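
    The density argument above can be checked with a few lines of arithmetic. This is only a sketch: the module figures (213M on 30.9 sqmm), the 315 sqmm die area, the 4-module layout and the corrected 1.2B count are taken from this thread, while the idea of subtracting the four modules to look at "the rest of the die" is an assumption added here for illustration.

```python
# Figures quoted in the thread; everything below is derived arithmetic.
MODULE_TRANSISTORS_M = 213.0   # million transistors per module
MODULE_AREA_MM2 = 30.9         # mm^2 per module
DIE_AREA_MM2 = 315.0           # mm^2 for the 4-module/8-core die
DIE_TRANSISTORS_M = 1200.0     # corrected total (1.2B), in millions
MODULES = 4


def density(transistors_m, area_mm2):
    """Return transistor density in millions of transistors per mm^2."""
    return transistors_m / area_mm2


module_density = density(MODULE_TRANSISTORS_M, MODULE_AREA_MM2)
die_density = density(DIE_TRANSISTORS_M, DIE_AREA_MM2)

# Assumption for illustration: everything outside the four modules
# (L3, northbridge, I/O, ...) is lumped together as "the rest".
rest_transistors_m = DIE_TRANSISTORS_M - MODULES * MODULE_TRANSISTORS_M
rest_area_mm2 = DIE_AREA_MM2 - MODULES * MODULE_AREA_MM2
rest_density = density(rest_transistors_m, rest_area_mm2)

print(f"module:        {module_density:.2f} M/mm^2")
print(f"whole die:     {die_density:.2f} M/mm^2")
print(f"rest of die:   {rest_density:.2f} M/mm^2 "
      f"({rest_transistors_m:.0f}M on {rest_area_mm2:.1f} mm^2)")
```

    If the 1.2B total is right, the non-module part of the die would have to be far less dense than the modules, which is one way to read the "seems a bit too low" remark.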
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  14. #539
    Xtreme Addict
    Join Date
    Jan 2009
    Posts
    1,445
    Quote Originally Posted by informal View Post
    BSN* is reporting that AMD's CFO Thomas Seifert was allegedly let go yesterday... If true, then it's one of those management decisions that make zero sense (like Killebrew, Moorhead, David Hoff, Rick Bergman, Dirk Meyer etc.). AMD's debt right now is around 1.8B, while back in 2009, when he was appointed to his position, it was ~7B.
    link; or it didn't happen.
    [MOBO] Asus CrossHair Formula 5 AM3+
    [GPU] ATI 6970 x2 Crossfire 2Gb
    [RAM] G.SKILL Ripjaws X Series 16GB (4 x 4GB) 240-Pin DDR3 1600
    [CPU] AMD FX-8120 @ 4.8 ghz
    [COOLER] XSPC Rasa 750 RS360 WaterCooling
    [OS] Windows 8 x64 Enterprise
    [HDD] OCZ Vertex 3 120GB SSD
    [AUDIO] Logitech S-220 17 Watts 2.1

  15. #540
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by god_43 View Post
    link; or it didn't happen.
    It was up for a short while then taken down.
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  16. #541
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by JumpingJack View Post
    It was up for a short while then taken down.
    The internet is full of people (besides google cache) creating quick copies:
    http://investorvillage.com/smbd.asp?...g&mid=11217202
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  17. #542
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Dresdenboy View Post
    The internet is full of people (besides google cache) creating quick copies:
    http://investorvillage.com/smbd.asp?...g&mid=11217202
    +1. Yeah, I just happened upon it, and literally an hour or two later it was gone; even trying to get it via my history was unsuccessful.

    AMD is in massive flux right now. The direction and outcome is highly uncertain.
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  18. #543
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Since AMD did not want to submit Spec_INT/FP scores, Intel did it instead, just to rub salt on the wounds :

    SPECint_base2006/SPECfp_base2006 (autoparallel=yes)

    i7-2700k (3.5/3.9 GHz) 45.5 / 56.1
    FX-8150 (3.6/4.2 GHz) 20.8 / 25.7
    X6-1100T (3.3/3.7 GHz) 25.0 / 32.2

    http://www.spec.org/cpu2006/results/res2011q4/

    In the most widely used and accepted industry standard benchmark, it clearly shows how BD has significantly lower IPC than K10.5 despite a 300-500MHz advantage.
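
    A rough way to look at the per-clock claim is to divide each score by the clock speed. This is a sketch only: it uses the base clocks listed above, which is an assumption, since the actual frequency during the runs (with turbo) is not given here. Score per GHz is a crude proxy and not a measured IPC.

```python
# (base GHz, SPECint_base2006, SPECfp_base2006) as listed in the post.
results = {
    "i7-2700K": (3.5, 45.5, 56.1),
    "FX-8150": (3.6, 20.8, 25.7),
    "X6-1100T": (3.3, 25.0, 32.2),
}

# Assumption: normalise by base clock, because the real running clock
# (turbo behaviour) is not stated in the thread.
per_ghz = {
    name: (spec_int / ghz, spec_fp / ghz)
    for name, (ghz, spec_int, spec_fp) in results.items()
}

for name, (int_per_ghz, fp_per_ghz) in per_ghz.items():
    print(f"{name:9s} int/GHz = {int_per_ghz:5.2f}   fp/GHz = {fp_per_ghz:5.2f}")

# Ratio of FX-8150 to X6-1100T per-clock integer score.
fx_vs_x6_int = per_ghz["FX-8150"][0] / per_ghz["X6-1100T"][0]
print(f"FX-8150 / X6-1100T (int, per GHz): {fx_vs_x6_int:.2f}")
```

    The ratio says nothing about why the scores differ; compiler flags and threading settings affect it as much as the core design does.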
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  19. #544
    Xtreme Enthusiast
    Join Date
    Sep 2004
    Posts
    650
    wowa.. I love bulldozers and this cpu is giving them a bad name, also making it harder for me to find awesome bulldozer media on the internet :p
    TJ07BW | i7 980x | Asus RIII | 12Gb Corsair Dominator | 2xSapphire 7950 vapor-x | WD640Gb / SG1.5TB | Corsair HX1000W | 360mm TFC Rad + Swiftech GTZ + MCP655 | Dell U2711

  20. #545
    Registered User
    Join Date
    Jul 2008
    Posts
    73
    Quote Originally Posted by savantu View Post
    Since AMD did not want to submit Spec_INT/FP scores, Intel did it instead, just to rub salt on the wounds :

    SPECint_base2006/SPECfp_base2006 (autoparallel=yes)

    i7-2700k (3.5/3.9 GHz) 45.5 / 56.1
    FX-8150 (3.6/4.2 GHz) 20.8 / 25.7
    X6-1100T (3.3/3.7 GHz) 25.0 / 32.2

    http://www.spec.org/cpu2006/results/res2011q4/

    In the most widely used and accepted industry standard benchmark, it clearly shows how BD has significantly lower IPC than K10.5 despite a 300-500MHz advantage.
    Autoparallel FAIL for ICC, actually. It doesn't seem to work well for more than 4 cores (the i7-3960X loses significantly to the i7-2700K in lots of tests). They definitely should have run it with OMP_NUM_THREADS=4 for the FX8150 and set core affinity to the first core in each module. But it is Intel, and I don't think they have any intention of making an AMD processor look good.
    And comparing SSE3 code for AMD and AVX code for Intel is totally irrelevant.
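
    The suggested run setup (4 threads, one per module) could be prepared roughly like this. It is a sketch under assumptions: it assumes the OS numbers the two cores of a module consecutively (0,1 = module 0, and so on), which should be checked on the actual machine, and it uses the Intel OpenMP runtime's KMP_AFFINITY variable with an explicit proclist. The helper names are made up for illustration.

```python
def first_core_of_each_module(modules, cores_per_module=2):
    """Return the first logical core id of every module.

    Assumption: sibling cores of a module have consecutive ids.
    """
    return [m * cores_per_module for m in range(modules)]


def build_openmp_env(modules=4, cores_per_module=2):
    """Build environment variables pinning one thread to each module."""
    cores = first_core_of_each_module(modules, cores_per_module)
    proclist = ",".join(str(c) for c in cores)
    return {
        # One thread per module instead of one per core.
        "OMP_NUM_THREADS": str(len(cores)),
        # Intel OpenMP runtime: bind threads to an explicit core list.
        "KMP_AFFINITY": f"granularity=fine,proclist=[{proclist}],explicit",
    }


env = build_openmp_env()
for key, value in env.items():
    print(f"{key}={value}")
```

    The resulting dictionary can be merged into os.environ and passed to subprocess.run(..., env=...) when launching the benchmark binary.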

  21. #546
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by sergiojr View Post
    Autoparallel FAIL for ICC, actually. It doesn't seem to work well for more than 4 cores (the i7-3960X loses significantly to the i7-2700K in lots of tests). They definitely should have run it with OMP_NUM_THREADS=4 for the FX8150 and set core affinity to the first core in each module. But it is Intel, and I don't think they have any intention of making an AMD processor look good.
    It is Spec_INT, more or less single threaded. Autoparallel cannot offer significant speedups..
    And comparing SSE3 code for AMD and AVX code for Intel is totally irrelevant.
    Why ? AVX and SSE have the same throughput for BD since it did not spend any transistors to optimize for AVX. The only question is whether having used FMA would have made a difference.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  22. #547
    Registered User
    Join Date
    Jul 2008
    Posts
    73
    Quote Originally Posted by savantu View Post
    It is Spec_INT, more or less single threaded. Autoparallel cannot offer significant speedups..
    It offers both speedups (e.g. libquantum) and slowdowns(e.g. h264ref) depending on the subtest and hardware. To limit slowdowns Intel set number of threads to number of cores. For example slowdown of 3960X compared to 2700K in h264ref perfectly correlates with the fact that FX8150 loses the most in this subtest compared to 1100T.
    Quote Originally Posted by savantu View Post
    Why ? AVX and SSE have the same throughput for BD since it did not spend any transistors to optimize for AVX. The only question is whether having used FMA would have made a difference.
    What? You know there is a reason for 3-operand instructions. If you don't understand it from a developer's viewpoint, at least you can rely on tests comparing SSE and AVX versions of x264 or other software.
    And besides this, ICC doesn't allow SSE instructions above SSE3 on AMD hardware, so it is AVX (which is a full superset of the SSE instructions) for Intel vs. SSE w/o SSSE3, SSE4.1 and SSE4.2 for AMD.

  23. #548
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by sergiojr View Post
    It offers both speedups (e.g. libquantum) and slowdowns(e.g. h264ref) depending on the subtest and hardware. To limit slowdowns Intel set number of threads to number of cores. For example slowdown of 3960X compared to 2700K in h264ref perfectly correlates with the fact that FX8150 loses the most in this subtest compared to 1100T.
    That's mostly down to the compilers cracking libquantum (known for a long time). And what's the problem with setting the number of threads to the number of cores? It's the fairest approach.
    What? You know there is a reason for 3-operand instructions. If you don't understand it from a developer's viewpoint, at least you can rely on tests comparing SSE and AVX versions of x264 or other software.
    And besides this, ICC doesn't allow SSE instructions above SSE3 on AMD hardware, so it is AVX (which is a full superset of the SSE instructions) for Intel vs. SSE w/o SSSE3, SSE4.1 and SSE4.2 for AMD.
    The difference between 128-bit SSE and 128-bit AVX is extremely limited even on SB. Constantly referring to speedups of AVX (256-bit) vs. SSE on SB and extrapolating them to BD is faulty: SB executes 256-bit instructions in a single cycle, while BD breaks them into 2x128-bit ones. I'll restate my point : BD AVX speedups are limited because it wasn't designed to perform due to time pressure, just to get it compatible. The difference between 128-bit SSE and 128/256-bit AVX on BD is going to be in the noise region (256-bit AVX is actually discouraged, since it incurs penalties in the breaking-up and recombining phase).
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  24. #549
    Registered User
    Join Date
    Jul 2008
    Posts
    73
    Quote Originally Posted by savantu View Post
    And what's the problem with setting the number of threads to the number of cores? It's the fairest approach.
    There is a risk of slowing an application down if autoparallelization is done on hardware that shares a lot of resources between cores, because of the overhead the additional threads add. And for the record, I do not blame Intel for running the test that way, as they are not supposed to know how to get the best performance from AMD's processor. I am just saying that a better result on the 8150 could be achieved if autoparallelization were limited to 4 threads, with thread affinity distributed across modules.

    Quote Originally Posted by savantu View Post
    I'll restate my point : BD AVX speedups are limited because it wasn't designed to perform due to time pressure, just to get it compatible.
    You are wrong. The 3-operand SSE5 instructions, which later appeared as 3-operand AVX versions of SSE instructions, reduce the number of registers needed in code, reduce latencies by removing unnecessary MOVs, and reduce code size. All of this allows higher utilization of the FPU's functional units and higher performance without any need to increase their theoretical throughput. x264 is a pretty good example of such increased performance: vector integer throughput is the same on SB, and yet there is quite a substantial boost in performance on both SB and FX8150.
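
    The point about unnecessary MOVs can be illustrated with a toy model. This is not a real compiler or assembler, just a sketch: it takes a short list of three-operand operations (dst = src1 op src2) and lowers them to destructive two-operand form (dst = dst op src), inserting a copy whenever the destination differs from the first source. The operation and register names are made up for illustration.

```python
def lower_to_two_operand(ops):
    """Lower (op, dst, src1, src2) tuples to destructive 2-operand form.

    Toy model: a "mov dst, src1" is emitted whenever dst != src1, which
    is what a two-operand ISA needs to keep src1 alive.
    """
    lowered = []
    for op, dst, src1, src2 in ops:
        if dst != src1:
            if dst == src2:
                # The copy would clobber src2; not handled in this toy.
                raise ValueError("dst aliases src2; needs a scratch register")
            lowered.append(("mov", dst, src1))   # dst = src1
        lowered.append((op, dst, src2))          # dst = dst op src2
    return lowered


# t0 = a + b; t1 = a * c; t0 = t0 + t1   (a stays live throughout)
three_operand = [
    ("add", "t0", "a", "b"),
    ("mul", "t1", "a", "c"),
    ("add", "t0", "t0", "t1"),
]

two_operand = lower_to_two_operand(three_operand)
extra_moves = sum(1 for instr in two_operand if instr[0] == "mov")

print(f"3-operand instructions: {len(three_operand)}")
print(f"2-operand instructions: {len(two_operand)} ({extra_moves} extra movs)")
for instr in two_operand:
    print("   ", *instr)
```

    Real code generators avoid some of these copies through register allocation, so the saving in practice depends on the code; the sketch only shows where the extra instructions come from.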
    And as I said before, ICC not only doesn't allow building AVX code for BD but also doesn't allow using a large part of the SSEx instruction set.

    If you want to compare single-threaded performance between FX8150, i7 2700 and PhIIX6 using ICC and SPECInt then you should turn off autoparallelization, build for SSE3 target and then compare resulting performance. Everything else is just marketing.

  25. #550
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    The Open64 compiler produces up to 25% faster code than Intel's latest version 12 compilers, even though the intentionally crippled results submitted by Intel were run on a Bulldozer clocked 40% higher.....


    Open64 4.2.5.2 Compiler suite: (SPEC results submitted by Dell)
    2.6 GHz Bulldozer: SPEC_int_rate 134, SPEC_FP_rate 100

    Intel Studio XE 12.0.3.176 compilers: (SPEC results submitted by Intel)
    3.6 GHz Bulldozer: SPEC_int_rate 115, SPEC_FP_rate 79.8

    http://www.spec.org/cpu2006/results/res2011q4/
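
    The percentages can be reproduced from the four figures above. A sketch only: it takes the two submissions at face value and assumes nothing about whether the systems are otherwise comparable (core counts, memory and so on are not given in this post).

```python
# Figures as quoted above: clock in GHz, SPEC_int_rate, SPEC_FP_rate.
open64 = {"ghz": 2.6, "int_rate": 134.0, "fp_rate": 100.0}   # submitted by Dell
icc = {"ghz": 3.6, "int_rate": 115.0, "fp_rate": 79.8}       # submitted by Intel


def percent_higher(a, b):
    """Return how much higher a is than b, in percent."""
    return (a / b - 1.0) * 100.0


int_gain = percent_higher(open64["int_rate"], icc["int_rate"])
fp_gain = percent_higher(open64["fp_rate"], icc["fp_rate"])
clock_gap = percent_higher(icc["ghz"], open64["ghz"])

print(f"Open64 int_rate advantage: {int_gain:.1f}%")
print(f"Open64 FP_rate advantage:  {fp_gain:.1f}%")
print(f"ICC system clock advantage: {clock_gap:.1f}%")
```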

    Hans
    Last edited by Hans de Vries; 12-08-2011 at 05:49 AM.
