Page 1 of 11
Results 1 to 25 of 267

Thread: AMD FX "Bulldozer" Review - (4) !exclusive! Excuse for 1-Threaded Perf.

  1. #1
    Registered User DGLee's Avatar
    Join Date
    Oct 2010
    Location
    South Korea
    Posts
    82

    AMD FX "Bulldozer" Review - (4) !exclusive! Excuse for 1-Threaded Perf.

    What I'm about to do here is compare 2CU/4C and 4CU/4C Bulldozer configurations.
    (CU stands for Compute Unit, or equivalently 'Module'.)
    This may be an excuse for Bulldozer's poor initial single-thread performance:
    benchmark tools are simply under-optimized for this kind of architecture, and the performance is not really 'poor' once some optimization is done.

    Here's the test scheme. I'm sure everybody understands what these pictures mean.

    11_bulldozer.png
    (4M/8C)

    bulldozer_die.jpg
    (real die: do they look alike?)

    2c4t_1.png
    (2M/4C: traditional manner for AMD)

    4c4t_1.png
    (4M/4C: conceptual diagram)
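
    To make the two layouts concrete, here is a minimal sketch of which cores end up sharing a module in each configuration. It assumes cores 2m and 2m+1 belong to module m, which is only an illustrative numbering and is not confirmed by the pictures; all names below are hypothetical.

```python
# Hypothetical model of a 4-module, 8-core chip.
# ASSUMPTION: cores 2*m and 2*m+1 sit in module m (real numbering may differ).
CORES_PER_MODULE = 2


def module_of(core: int) -> int:
    """Return the module index a core belongs to under the assumed numbering."""
    return core // CORES_PER_MODULE


def shared_modules(enabled_cores):
    """Return the modules in which more than one enabled core shares resources."""
    counts = {}
    for core in enabled_cores:
        m = module_of(core)
        counts[m] = counts.get(m, 0) + 1
    return sorted(m for m, n in counts.items() if n > 1)


CONFIG_2M_4C = [0, 1, 2, 3]  # two modules, both cores active in each
CONFIG_4M_4C = [0, 2, 4, 6]  # four modules, one core active in each

print(shared_modules(CONFIG_2M_4C))  # [0, 1]
print(shared_modules(CONFIG_4M_4C))  # []
```

    In this model the 2M/4C layout has two modules with shared front-end and FPU resources in use, while 4M/4C has none.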

    But there's a flaw: the current version of the CH5F BIOS doesn't support turning cores on/off one by one. Am I finished then?

    original.jpg

    ...Not really. The newest BIOS version supports this function!

    2c4t.jpg
    4c4t.jpg

    Here's the result:

    ctc_01_fritz.png
    ctc_02_wprime.png
    ctc_03_winrar.png
    ctc_04_3d06.png
    ctc_05_3dv.png

    ▲ (There are six more results. I'll attach them in a reply to this thread.)
    My Blog: http://udteam.tistory.com

    CPU: AMD FX-8150P
    Cooler: Antec KÜHLER H2O 920
    M/B: ASUS CROSSHAIR V FORMULA
    RAM: Samsung DDR3 PC3-10600 4GB x 2
    VGA: HIS & Sapphire Radeon HD 6990 4GB x 2
    Storage: Intel SSD 510 Series 120GB + Seagate Barracuda Green 2TB
    PSU: Antec True Power Quattro 1200
    Case: Lian Li PC-X500FX
    O/S: Microsoft Windows 7 Enterprise 64-bit

  2. #2
    Registered User DGLee's Avatar
    Join Date
    Oct 2010
    Location
    South Korea
    Posts
    82

  3. #3
    I am Xtreme Manicdan's Avatar
    Join Date
    Dec 2007
    Posts
    7,747
    So for anyone wondering either what a 4M/4C chip would be like or how much scaling is lost by sharing resources, you've answered both.
    Seriously, thank you for what is so far the best review.
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  4. #4
    Xtreme Mentor Particle's Avatar
    Join Date
    Apr 2008
    Location
    Kansas
    Posts
    3,101
    I appreciate your efforts. However, it's likely that you're largely seeing the performance hit of the cache thrashing issue recently discovered since disabling a core in each module would prevent that contention.
    Particle's First Rule of Online Technical Discussion:
    As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.

    Rule 1A:
    Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.

    Rule 2:
    When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.

    Rule 2A:
    When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.

    Rule 3:
    When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.

    Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!

    Random Tip o' the Whatever
    You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.

  5. #5
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Argentina
    Posts
    382
    Excellent review, DGLee. Could you try 4CU/4C vs 4CU/8C with 2- and 4-threaded applications? I would like to see whether having the cores active impacts performance in those scenarios, to find out: a) whether the Windows scheduler is at fault and can be optimized (you could force the threads to run on a specific CPU) and b) whether having the cores "online" but inactive impacts performance.
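
    A rough sketch of the affinity idea in (a): building a CPU affinity mask that keeps one thread per module, next to a "packed" mask that fills modules completely. It assumes logical CPUs 2m and 2m+1 belong to module m, so check the real numbering before relying on it. The function names are made up for illustration.

```python
def one_core_per_module_mask(modules: int, cores_per_module: int = 2) -> int:
    """Bitmask selecting the first core of every module.

    ASSUMPTION: logical CPU m * cores_per_module is the first core of module m.
    """
    mask = 0
    for m in range(modules):
        mask |= 1 << (m * cores_per_module)
    return mask


def packed_mask(cores: int) -> int:
    """Bitmask selecting the first `cores` logical CPUs (modules filled up)."""
    return (1 << cores) - 1


spread = one_core_per_module_mask(4)
packed = packed_mask(4)
print(f"{spread:X}")  # 55
print(f"{packed:X}")  # F
```

    On Windows, the hex value could be passed to the built-in `start /affinity` command, e.g. `start /affinity 55 benchmark.exe`, where `benchmark.exe` is a placeholder for the program under test.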
    Main: Windows 8.1 PRO Intel Core i7 4820K @ 4700Mhz, Corsair H100i, 16GB DDR3-2400, HD7950 3GB @ Corsair H60, Crucial M4 128GB OS, Crucial C300 64GB Cache + Samsung 1TB F3, WD 1.5TB Green, Gigabyte GA-X79-UD3
    HTPC: Windows 8.1 PRO AMD Athlon 5150, 6GB DDR3-1600, Radeon R3 iGPU, 500GB, Gigabyte GA-AM1M-S2H
    ESXi Server 5.5 AMD Athlon X3 455, 8GB DDR3-1600, 80GB, Biostar A880GU3, 2xIntel PRO1000
    ESXi Server 5.5 AMD Fusion A6-3500, 8GB DDR3-1600, 80GB, Biostar TA75M, 2xIntel PRO1000
    ESXi Server 5.1 Intel Core i5 650, 8GB DDR3-1600, 160GB, Biostar TH55B HD, 2xIntel PRO1000
    FreeNAS 9.2.1.6 x64 Intel Pentium G2030, 8GB DDR3-1333, Intel HD Graphics, 4x1TB WD Blue RAID-Z1, Asrock H61GM-VG3, Bitfenix Prodigy

  6. #6
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,258
    I questioned this.. a long time ago.

    The answer from AMD was that it was better off in the '2cu 4c' arrangement, turboing higher.. however your results seem to disagree with that quite a bit.

    any chance you can test single thread without affinity set between the two configs? I'd like to see the implications of threads bouncing between modules vs cores
    Last edited by mAJORD; 10-11-2011 at 11:40 PM.

  7. #7
    Xtreme Addict duploxxx's Avatar
    Join Date
    May 2005
    Posts
    1,336
    Quote Originally Posted by mAJORD View Post
    I questioned this.. a long time ago.

    The answer from AMD was that it was better off in the '2cu 4c' arrangement, turboing higher.. however your results seem to disagree with that quite a bit.

    any chance you can test single thread without affinity set between the two configs? I'd like to see the implications of threads bouncing between modules vs cores
    very interesting results.

    can you also pls try to kick that NB to higher regions and see how it reacts?
    Quote Originally Posted by Movieman View Post
    Fanboyitis..
    Comes in two variations and both deadly.
    There's the green strain and the blue strain on CPU.. There's the red strain and the green strain on GPU..

  8. #8
    Xtreme Addict
    Join Date
    Mar 2009
    Posts
    1,116
    I understand you are talking about disabling alternating cores so each core unit only has one core activated. but your graphs confuse me. you will have to explain what your "4cu/8c" notation means, what each tested configuration means, and what your conclusion is based on this chart.

  9. #9
    Xtreme Member Spectrobozo's Avatar
    Join Date
    Jan 2004
    Posts
    392
    basically he is talking about using a module as a single core, single thread....
    4 modules with a total of 4 threads is obviously faster than 2 modules with a total of 4 threads, since the 4 threads on 2 modules will be sharing resources while in the other option they don't...

    when using less than 4 threads the optimal choice would be to use it like this (1t per module), but probably windows cannot make the distinction?
    and that's what is going to be fixed for Windows 8?

  10. #10
    Registered User
    Join Date
    Sep 2010
    Posts
    9
    Just for giggles, can you run a single-threaded benchmark like Pi?

  11. #11
    Xtreme 3D Team BeepBeep2's Avatar
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,398
    Quote Originally Posted by bamtan2 View Post
    I understand you are talking about disabling alternating cores so each core unit only has one core activated. but your graphs confuse me. you will have to explain what your "4cu/8c" notation means, what each tested configuration means, and what your conclusion is based on this chart.
    CU = Computational Unit

    AMD's internal technical term for what PR calls a "Module"
    Intel all dey erry dey
    AMD all nite erry nite party party party

  12. #12
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    I'm a bit puzzled about what the news is here. You've basically proven an axiom: in any CPU where you have resource sharing between 2 threads, running only one (thus giving it all the resources) will run better.
    If you take a Core 2 and disable 1 core, rest assured, the performance of that one thread will be better than running the same thread with both cores active.

    The whole point of AMD's approach is to avoid exactly that: don't make a fat core (what you're suggesting), but skinny ones and lots of them. On desktop, as BD has proven, this is a failure.

    AMD could have created BD as a 4-core with each module transformed into a fat core, but that's SB reloaded.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  13. #13
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,258
    Quote Originally Posted by savantu View Post
    I'm a bit puzzled about what the news is here. You've basically proven an axiom: in any CPU where you have resource sharing between 2 threads, running only one (thus giving it all the resources) will run better.
    If you take a Core 2 and disable 1 core, rest assured, the performance of that one thread will be better than running the same thread with both cores active.

    The whole point of AMD's approach is to avoid exactly that: don't make a fat core (what you're suggesting), but skinny ones and lots of them. On desktop, as BD has proven, this is a failure.

    AMD could have created BD as a 4-core with each module transformed into a fat core, but that's SB reloaded.
    Thanks, but I think we all know that :p . This is not the news section, so no one's claiming it's news.

    Some people are interested in the effects of the shared resources, and DGLee has put the effort in to show the results. That's all there is to it.

  14. #14
    mclarenfung
    Guest
    I wonder whether the next ASUS BIOS will bring more improvements.

  15. #15
    Registered User
    Join Date
    Jan 2009
    Posts
    13
    What you actually proved is the opposite (for now): that AMD gains 33-59% from CMT.

    Chess 11800/8813=1.3389 ?
    Wprime 13.814/9.531=1.4494
    Winrar 4467/3027=1.4757
    3d06 5803/4134=1.4037
    3dvantage 19215/12102=1.5878
    3d11 6340/4289=1.4782
    CB R10 20552/15033=1.3671
    CB R11.5 6/3.8=1.5789
    Blender 9.76/7.16=1.3631
    X264 37.23/25.18=1.4786
    Transcode (222+210)/(185+135)=1.35

    Well, unless you run a single-threaded program like SuperPi (comparing a single-'core' module with a dual-'core' module); then you can see whether there is an impact on single-threaded stuff.
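
    The ratios above can be re-derived with a few lines of code. This is only a sketch that reuses the figures exactly as posted (larger figure divided by smaller figure, as the poster did); which hardware configuration each figure belongs to is not stated in the post, so the tuple labels are an assumption.

```python
# (higher figure, lower figure) exactly as posted; the mapping of each figure
# to a specific configuration is NOT stated in the post.
results = {
    "Chess": (11800, 8813),
    "Wprime": (13.814, 9.531),
    "Winrar": (4467, 3027),
    "3d06": (5803, 4134),
    "3dvantage": (19215, 12102),
    "3d11": (6340, 4289),
    "CB R10": (20552, 15033),
    "CB R11.5": (6, 3.8),
    "Blender": (9.76, 7.16),
    "X264": (37.23, 25.18),
    "Transcode": (222 + 210, 185 + 135),
}


def ratios(table):
    """Divide the higher figure by the lower one for every benchmark."""
    return {name: high / low for name, (high, low) in table.items()}


def spread(table):
    """Return the smallest and largest ratio in the table."""
    r = ratios(table)
    return min(r.values()), max(r.values())


for name, r in sorted(ratios(results).items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {r:.4f}  (+{(r - 1) * 100:.1f}%)")

low, high = spread(results)
print(f"range: +{(low - 1) * 100:.1f}% to +{(high - 1) * 100:.1f}%")
```

    The smallest ratio comes from the chess figures and the largest from 3DMark Vantage, which is where the quoted range comes from.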

  16. #16
    Xtreme Addict duploxxx's Avatar
    Join Date
    May 2005
    Posts
    1,336
    Time to fine-tune the Win7 OS and create an application that sends newly started applications to different modules/cores, avoiding the second core of each module as much as possible until one needs 5 cores.

    BTW, a patch for Win7 shouldn't be that hard; VMware already has it in their vSphere ESXi scheduler.
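
    A minimal sketch of that placement policy: hand out one core per module first, and only start using the second core of a module with the fifth thread. It assumes logical CPUs 2m and 2m+1 belong to module m and is not how any real scheduler (Windows or ESXi) is implemented; the function names are hypothetical.

```python
def module_first_order(modules: int, cores_per_module: int = 2):
    """Order logical CPUs so every module gets one thread before any gets two.

    ASSUMPTION: logical CPU m * cores_per_module + slot is core `slot` of module m.
    """
    order = []
    for slot in range(cores_per_module):
        for m in range(modules):
            order.append(m * cores_per_module + slot)
    return order


def place_threads(n_threads: int, modules: int = 4, cores_per_module: int = 2):
    """Pick logical CPUs for n_threads, spreading across modules first."""
    return module_first_order(modules, cores_per_module)[:n_threads]


print(place_threads(4))  # [0, 2, 4, 6]
print(place_threads(5))  # [0, 2, 4, 6, 1]
```

    With four or fewer threads no module is shared in this model; sharing only begins at the fifth thread.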

  17. #17
    Xtreme Mentor demonkevy666's Avatar
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,813
    Quote Originally Posted by Particle View Post
    I appreciate your efforts. However, it's likely that you're largely seeing the performance hit of the cache thrashing issue recently discovered since disabling a core in each module would prevent that contention.
    Question: wouldn't the trace cache prevent that?
    It's 4 KB per core, 8-way associative.
    HAVE NO FEAR!
    "AMD fallen angel"
    "Behold the gaseous stench of Skeletor's breakfast burrito!"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  18. #18
    Xtreme 3D Team BeepBeep2's Avatar
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,398
    Quote Originally Posted by ThePointer View Post
    What you actually proved is the opposite (for now): that AMD gains 33-59% from CMT.

    Chess 11800/8813=1.3389 ?
    Wprime 13.814/9.531=1.4494
    Winrar 4467/3027=1.4757
    3d06 5803/4134=1.4037
    3dvantage 19215/12102=1.5878
    3d11 6340/4289=1.4782
    CB R10 20552/15033=1.3671
    CB R11.5 6/3.8=1.5789
    Blender 9.76/7.16=1.3631
    X264 37.23/25.18=1.4786
    Transcode (222+210)/(185+135)=1.35

    Well, unless you run a single-threaded program like SuperPi (comparing a single-'core' module with a dual-'core' module); then you can see whether there is an impact on single-threaded stuff.
    I think IPC would be an improvement over Phenom II if that is the case...
    35% of 20s would be 7 seconds, which would put stock 4.2 GHz at around 13 seconds :O
    It currently takes 6.3 GHz to do 13 seconds from what I see.
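
    The arithmetic can be spelled out as a small sketch. The 20 s baseline and the 35% figure come from the post; reading the 35% as a throughput gain instead of a direct cut in run time is an alternative interpretation added here, not something stated in the thread.

```python
def time_after_reduction(seconds: float, fraction: float) -> float:
    """Run time if `fraction` of the time is simply removed (the post's reading)."""
    return seconds * (1.0 - fraction)


def time_after_speedup(seconds: float, gain: float) -> float:
    """Run time if throughput rises by `gain` (alternative reading, an assumption)."""
    return seconds / (1.0 + gain)


baseline = 20.0  # seconds, from the post
gain = 0.35      # 35%, from the post

print(round(time_after_reduction(baseline, gain), 1))  # 13.0
print(round(time_after_speedup(baseline, gain), 1))    # 14.8
```

    Either way the estimate lands well below 20 seconds; the two readings differ by a bit under 2 seconds.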

  19. #19
    25 to life - Eminem chew*'s Avatar
    Join Date
    Jan 2005
    Location
    Hell on Earth
    Posts
    10,130
    Although it's nice to see someone tested it, this is really nothing more than what I stated in the BD info thread.

    Killing resource sharing improves performance.

    There are other things that can be done as well.

    The next logical step is this.

    Run max 24/7 clocks for deneb, thuban, and BD then compare performance.

    That should answer the question for faithful AMD users of whether BD is an upgrade for them.
    Last edited by chew*; 10-12-2011 at 05:08 PM.
    heatware chew*
    For every beginning there must be an end
    Will be around to help with AMD specific hardware till shortly after BD launches, after that I'm Ghost

  20. #20
    Xtreme 3D Team BeepBeep2's Avatar
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,398
    Quote Originally Posted by chew* View Post
    Although it's nice to see someone tested it, this is really nothing more than what I stated in the BD info thread.

    Killing resource sharing improves performance.

    There are other things that can be done as well.

    The next logical step is this.

    Run max 24/7 clocks for deneb, thuban, and BD then compare performance.

    That should answer the question for faithful AMD users of whether BD is an upgrade for them.
    So if I wanted less heat output, lower power consumption, higher theoretical clocks and much better single thread performance I'd kill half the cores in bios and overclock the CPU to hell and back, right?

    In my opinion the CPU wouldn't look half as bad if ST performance gained 20% from the non-resource sharing up to 4 threads...
    Last edited by BeepBeep2; 10-12-2011 at 05:19 PM.

  21. #21
    25 to life - Eminem chew*'s Avatar
    Join Date
    Jan 2005
    Location
    Hell on Earth
    Posts
    10,130
    Quote Originally Posted by BeepBeep2 View Post
    So if I wanted less heat output, lower power consumption, higher theoretical clocks and much better single thread performance I'd kill half the cores in bios and overclock the CPU to hell and back, right?
    Cores?

    You can not kill cores, get that marketing out of your head.

    You can disable clusters inside cores.....

    There is no mention of the term module in the patents, so let's bury the damn term......................

    Me personally I am going to run 1 in a daily rig with all 4 cores but clusters B disabled in all cores.

    Run it at 4.8-5.0 24/7 with high ram and call it a day.
    Last edited by chew*; 10-12-2011 at 05:22 PM.

  22. #22
    Xtreme 3D Team BeepBeep2's Avatar
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,398
    Quote Originally Posted by chew* View Post
    Cores?

    You can not kill cores, get that marketing out of your head.

    You can disable clusters inside cores.....

    There is no mention of the term module in the patents, so let's bury the damn term......................
    Sorry, the bios says cores.
    Sorry, "clusters inside the computational units"

    Why the hell couldn't AMD figure out a way to disable this resource sharing when 4 or less threads are executed?

  23. #23
    25 to life - Eminem chew*'s Avatar
    Join Date
    Jan 2005
    Location
    Hell on Earth
    Posts
    10,130
    Quote Originally Posted by BeepBeep2 View Post
    Why the hell couldn't AMD figure out a way to disable this resource sharing when 4 or less threads are executed?
    good question, one I don't have an answer for at this point in time.

  24. #24
    Xtreme Member Baam's Avatar
    Join Date
    Aug 2008
    Location
    Freedom PA
    Posts
    143
    So if I disable 4 clusters, will it run games better?

  25. #25
    Xtreme Addict gOtVoltage's Avatar
    Join Date
    Jul 2007
    Posts
    2,103
    Quote Originally Posted by chew* View Post
    Cores?

    You can not kill cores, get that marketing out of your head.

    You can disable clusters inside cores.....

    There is no mention of the term module in the patents, so let's bury the damn term......................

    Me personally I am going to run 1 in a daily rig with all 4 cores but clusters B disabled in all cores.

    Run it at 4.8-5.0 24/7 with high ram and call it a day.
    @Chew: Exactly, +1.

    It will be a fun one to play with regardless...its just the beginning my friends.
    "Phenom,...Like the perfect Storm,...Everything needs to be just right"
    X555x4SuperCore@4450mhz@1.64v..........

    8120FX@5000mhz 990FXA UD5@1.63vcore
    FX6100@ 5000mhz
    655pump/Dual Feser320's/FlowModded D-Tek.Idle.@22c.Bad@$$
    Gigabyte WindForce 6950 = 6970 Shader and clock unlocked.
    ASUS EAH 5750'sAir)Xfired@ 850//5600 T2C Hynix
    GTX260/216 my backup
