
Thread: AMD FX-8150 Bulldozer finally tested

  1. #651
    I am Xtreme
    Join Date
    Dec 2008
    Location
    France
    Posts
    9,060
    You see, matt... A lot of applications are starting to use 2 threads now, some use up to 4... The thing is, we have had 8 threads available (to consumers) since 2008... And there are still next to no applications around that can utilise them all, with 2012 almost here (we are talking about an average Joe, not a hardcore cruncher or professional 3D artist who already sits there with a 12 core machine). A lot of tasks are extremely difficult to multi-thread. So for the vast majority of applications single threaded performance will stay extremely important, as long as you have the necessary number of threads available (and let's face it, 4 thread CPUs are dirt cheap these days).
    Last edited by zalbard; 10-15-2011 at 01:44 PM.
    Quote Originally Posted by jayhall0315 View Post
    If you are really extreme, you never let informed facts or the scientific method hold you back from your journey to the wrong answer.

  2. #652
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    As zalbard said... hell, iTunes, as much as I hate it, is a very popular app... it's still single threaded... the only thing that uses 4+ threads efficiently is video encoding; audio encoding is also mostly single or dual threaded (LAME) etc. etc....

  3. #653
    Xtreme Enthusiast
    Join Date
    Aug 2008
    Posts
    577
    What I don't understand is this: clearly BD is good for multi-threaded performance, such as possibly in the server environment. Why release the current chips with this poor leakage and these clocks? Why not just release the server part first (which is apparently what the chip was designed for), then, after another respin, launch the desktop Zambezi later with better power/clocks? If you miss the desired clocks by 30%, why release it early and have everyone talk about it really badly? Why not just be straightforward and say that the B2 silicon has issues and that they are going to do a respin?
    --Intel i5 3570k 4.4ghz (stock volts) - Corsair H100 - 6970 UL XFX 2GB - - Asrock Z77 Professional - 16GB Gskill 1866mhz - 2x90GB Agility 3 - WD640GB - 2xWD320GB - 2TB Samsung Spinpoint F4 - Audigy-- --NZXT Phantom - Samsung SATA DVD--(old systems Intel E8400 Wolfdale/Asus P45, AMD965BEC3 790X, Antec 180, Sapphire 4870 X2 (dead twice))

  4. #654
    Xtreme Addict
    Join Date
    Nov 2007
    Location
    Vancouver
    Posts
    1,073
    http://crazyworldofchips.blogspot.com/ A solid write-up on the state of Bulldozer and the issues at hand.
    " Business is Binary, your either a 1 or a 0, alive or dead." - Gary Winston ^^



    Asus rampage III formula,i7 980xm, H70, Silverstone Ft02, Gigabyte Windforce 580 GTX SLI, Corsair AX1200, intel x-25m 160gb, 2 x OCZ vertex 2 180gb, hp zr30w, 12gb corsair vengeance

    Rig 2
    i7 980x ,h70, Antec Lanboy Air, Samsung md230x3 ,Saphhire 6970 Xfired, Antec ax1200w, x-25m 160gb, 2 x OCZ vertex 2 180gb,12gb Corsair Vengence MSI Big Bang Xpower

  5. #655
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Quote Originally Posted by Stukov View Post
    What I don't understand is this: clearly BD is good for multi-threaded performance, such as possibly in the server environment. Why release the current chips with this poor leakage and these clocks? Why not just release the server part first (which is apparently what the chip was designed for), then, after another respin, launch the desktop Zambezi later with better power/clocks? If you miss the desired clocks by 30%, why release it early and have everyone talk about it really badly? Why not just be straightforward and say that the B2 silicon has issues and that they are going to do a respin?
    I asked myself the same. It looks strange, especially when there is a B3 revision coming out shortly, too.
    The only explanation I can see is money. Maybe AMD needs the cash flow, and/or they wanted to launch before SandyE, because otherwise the reviews would have been even worse, which would have forced them to cut launch prices even more.

    Anybody with another idea?

  6. #656
    Xtreme Addict
    Join Date
    Apr 2011
    Location
    North Queensland Australia
    Posts
    1,445
    Quote Originally Posted by Stukov View Post
    What I don't understand is this: clearly BD is good for multi-threaded performance, such as possibly in the server environment. Why release the current chips with this poor leakage and these clocks? Why not just release the server part first (which is apparently what the chip was designed for), then, after another respin, launch the desktop Zambezi later with better power/clocks? If you miss the desired clocks by 30%, why release it early and have everyone talk about it really badly? Why not just be straightforward and say that the B2 silicon has issues and that they are going to do a respin?
    Well technically they didn't release BD early, they released it late. Very late.

    -PB
    -Project Sakura-
    Intel i7 860 @ 4.0Ghz, Asus Maximus III Formula, 8GB G-Skill Ripjaws X F3 (@ 1600Mhz), 2x GTX 295 Quad SLI
    2x 120GB OCZ Vertex 2 RAID 0, OCZ ZX 1000W, NZXT Phantom (Pink), Dell SX2210T Touch Screen, Windows 8.1 Pro

    Koolance RP-401X2 1.1 (w/ Swiftech MCP35X), XSPC EX420, XSPC X-Flow 240, DT Sniper, EK-FC 295s (w/ RAM Blocks), Enzotech M3F Mosfet+NB/SB

  7. #657
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,264
    Quote Originally Posted by Drwho? View Post
    I propose you take a simple linear algorithm and run it on one thread, then count the number of instructions retired, and then divide by the number of clock ticks ... You'll be surprised ;-)
    (make sure your code is totally compute, with 1 to 2 instructions of dependency ...)

    PowerPoint slides are one thing, but measuring and checking yourself is much better ... Otherwise, at 4.2GHz, how could you explain the poor performance of BD on SuperPI? Low IPC ... Then ask yourself, if you measure the IPC for each thread, why it never goes above 2 on a single thread ... Please experiment before trying to correct me. I did my homework ;-)


    Then, for your Intel diagram, you forgot to count code fusion ... SandyB is 4 wide + fusion ... That gives you up to 5!

    We saw a lot of PowerPoint slides, but the measurements don't match what is shown in the ppt. Sorry, you assume the marketing slides are correct; this is where the gap is. I looked everywhere, and I could not find anywhere that clearly says it will decode more than 2 per thread and matches that with ASM code doing more than 2 IPC. Did you try?


    Hehe ...

    Francois

    Sorry, I'm a bit lost here. Why are you focusing on one thread when the front end of Bulldozer is responsible for two threads, just like Sandy Bridge?

    I know it falls behind clock for clock, but don't you think that has more to do with other bottlenecks? Including, for integer code, the much debated ALU resources on a single thread? What about the longer pipeline? Are you taking into account that there may still be a deficiency in branch prediction next to Intel? Pipeline bubbles (floating point) that get filled by a 2nd thread?

    What would be more interesting, I think, is comparing code that's exclusively floating point with a single thread, then two threads, both on the one module. This would remove the integer clusters from the equation completely. (I don't know if this is practical... programming knowledge is my deficiency, so help me out here!)

  8. #658
    Xtreme Enthusiast
    Join Date
    Oct 2007
    Location
    Singapore
    Posts
    970
    Quote Originally Posted by mAJORD View Post
    Sorry, I'm a bit lost here. Why are you focusing on one thread when the front end of Bulldozer is responsible for two threads, just like Sandy Bridge?

    I know it falls behind clock for clock, but don't you think that has more to do with other bottlenecks? Including, for integer code, the much debated ALU resources on a single thread? What about the longer pipeline? Are you taking into account that there may still be a deficiency in branch prediction next to Intel? Pipeline bubbles (floating point) that get filled by a 2nd thread?

    What would be more interesting, I think, is comparing code that's exclusively floating point with a single thread, then two threads, both on the one module. This would remove the integer clusters from the equation completely. (I don't know if this is practical... programming knowledge is my deficiency, so help me out here!)
    IMO, the BD front end is two separate computational units or integer cores. If there is only 1 thread going in, I think the two cores can't break the single thread into two components and then fuse it back together at the back end.

  9. #659
    Xtreme Addict
    Join Date
    Jul 2008
    Location
    US
    Posts
    1,379
    Quote Originally Posted by Hornet331 View Post
    As zalbard said... hell, iTunes, as much as I hate it, is a very popular app... it's still single threaded... the only thing that uses 4+ threads efficiently is video encoding; audio encoding is also mostly single or dual threaded (LAME) etc. etc....
    That's all good and well, but stating that overall performance is determined by ST perf is just plain bonkers. Sure there are plenty of applications out there that are still single threaded, and many tasks cannot easily take advantage of threading, but multi-threaded apps are clearly the direction in which things are headed for the future and it is evident today.

    You've got to be kidding me on video encoding being the only thing that uses 4 threads efficiently. There are games that can take advantage of and benefit from 8 cores today. Mass audio encoding can use as many cores as you have available (dbpoweramp). Windows, with its various components and services, can also easily utilize many cores when a lot is going on. Sharing files, copying files, live transcoding for DLNA media sharing, recording multiple TV shows, and watching a movie at the same time can certainly utilize more than a quad core on its own every day of the week in a home environment. I'm assuming you've got multiple applications open now, likely more than one active. Provided you are, you have successfully taken advantage of more than one core. You're obviously not getting 4x the perf all the time from 4 cores (or more), but when the going gets tough and a lot is going on it can certainly help. The cpu manufacturers aren't adding cores and threads to processors for their health, they're doing it because software is able to take advantage of them and the user experience benefits from their presence.

    I'm definitely not going out on a limb and calling BD good or great in ST apps (I'll settle for a decent first try at a new arch), but stating that multi-threaded performance is somehow irrelevant and single-threaded performance is the only (or even primary) meaningful yardstick is way off base.

    --Matt
    Last edited by mattkosem; 10-15-2011 at 08:26 PM.
    My Rig :
    Core i5 4570S - ASUS Z87I-DELUXE - 16GB Mushkin Blackline DDR3-2400 - 256GB Plextor M5 Pro Xtreme

  10. #660
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,264
    Quote Originally Posted by haylui View Post
    IMO, the BD front end is two separate computational units or integer cores. If there is only 1 thread going in, I think the two cores can't break the single thread into two components and then fuse it back together at the back end.
    The separate integer cores are the execution units / schedulers. The front end is unified, 1 per module.

  11. #661
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Well, they released the consumer chips first because those don't require the months and months of testing that server chips need. BD is very server focused, but so is SNB-E; it will be very interesting to see them in the server environment, where multi-threaded applications dominate. I have seen a 6 core SNB-E; it was hot and heavy "in cooling req hehe". I don't think it will have a 2B transistor count though :P "4B for 16 core BD, my god that's huge"
    Coming Soon

  12. #662
    Xtreme Member
    Join Date
    Apr 2008
    Posts
    239


    Are you from the future? AMD's "Excavator" is planned for 2014

  13. #663
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by AKM View Post

    Are you from the future? AMD's "Excavator" is planned for 2014
    Oops, I was half asleep when I edited & saved it as my sig; have to store the image for later taunting....
    Coming Soon

  14. #664
    I am Xtreme
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    Quote Originally Posted by Hornet331 View Post
    More cores don't yield you more ST performance; the same goes for ISA extensions, they don't yield you more performance in current apps. It was the same issue that plagued all P4s... SSE2 performance wasn't that bad. But hardly any apps used it; it took years till people adopted it. Today it's still the case: look how many apps use SSE4.x and how many can make use of it... The only thing that would yield more performance if IPC goes down is clock speed. And the more IPC you lose, the more clock you need. Let's say Haswell loses 15% IPC compared to IB; now it needs at least 15% more clock just to reach the speed of IB, and then you also want a performance increase of ~10%... so you need 25% more clock... Considering that IB will probably be close to the 4GHz mark, you need a 5GHz Haswell to beat a 4GHz IB... nope, same situation as we see now with BD... power consumption will be through the roof compared to its predecessor for only a marginal increase in performance.

    People have been telling me for nearly a decade now that the consumer market is changing... yet performance is still determined by ST... it's the same thing with GPU computing (just not for that long). It's a whole other picture in the professional market, but that's not what we are discussing right now.

    I think it must be balanced a bit: single-thread performance and multi. Yes, we are more and more in a multitasking age, but we still need "average" single-thread performance. If Piledriver comes with Phenom II single-thread performance clock for clock or better, it will be a good product.
    To the others: BD is far away from the Pentium 4 design and pipelines....
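    As a rough check on the clock-versus-IPC arithmetic in the quote above, here is a minimal sketch. It assumes the simplest possible model, performance = IPC x clock, which ignores memory, turbo and everything else; the function name and the 4.0 GHz baseline are illustrative, not measured figures. Under that model the percentages do not simply add up, so the required clock comes out a bit higher than the 25% quoted.

```python
def required_clock_ratio(ipc_ratio, perf_target=1.0):
    """Clock multiplier needed so that IPC * clock reaches perf_target
    times the baseline, given the new IPC as a fraction of the old one.
    Assumes performance = IPC * clock and nothing else."""
    if ipc_ratio <= 0:
        raise ValueError("ipc_ratio must be positive")
    return perf_target / ipc_ratio


base_clock_ghz = 4.0  # illustrative baseline taken from the quote

parity = required_clock_ratio(0.85)        # just match the old chip
plus10 = required_clock_ratio(0.85, 1.10)  # match it and add 10%

print(f"clock needed for parity: +{(parity - 1) * 100:.1f}%")  # +17.6%
print(f"clock needed for +10%:   +{(plus10 - 1) * 100:.1f}%")  # +29.4%
print(f"resulting clock: {base_clock_ghz * plus10:.2f} GHz")   # 5.18 GHz
```

    So with these assumed numbers a 15% IPC loss needs roughly 5.2 GHz rather than 5 GHz to come out 10% ahead, which only strengthens the point about power consumption.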
    ROG Power PCs - Intel and AMD
    CPUs: i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350, 1x AMD FX-8320, 1x AMD FX-8300, 2x AMD FX-6300, 2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K, AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K, 2x i7-4770K, Intel i7-3930K
    AMD Cinebench R10 challenge, AMD Cinebench R15 thread, Intel Cinebench R15 thread

  15. #665
    Xtreme Enthusiast
    Join Date
    Jan 2010
    Posts
    533
    Quote Originally Posted by FlanK3r View Post
    If Piledriver comes with Phenom II single-thread performance clock for clock or better, it will be a good product.
    lol, no it won't.
    Bulldozer was supposed to be better than Phenom II, if Piledriver only catches up to Phenom II it will be a failure again. They need to get closer to the IPC levels of Sandy Bridge, not match their several years old architecture...if they somehow beat PII, then it might be something different, but there are still issues with Windows core management and I don't see anyone from AMD talking about a possible fix in the works.

  16. #666
    I am Xtreme
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    No, this is unrealistic. If IPC were near SB, Piledriver would totally destroy the whole CPU segment. And with respect, that is not possible now. For example: FX now scores 6.02 points in R11.5, is near the 2500K in Photoshop, etc. If single-thread performance were near SB with Piledriver, Ivy Bridge would have no chance in multi-thread. And that is not realistic from my point of view.
    If Piledriver has clock-for-clock PII single-thread performance, then with Piledriver clocks of about 3700 MHz at stock plus turbo it will be better than Denebs at stock (maybe like a Core i7 930 or 950 at stock). In multi-thread it could score about 7.5 points in R11.5, which is similar to Ivy Bridge-DT (I expect IB to score about 7.2 with the 3600K).

  17. #667
    I am Xtreme
    Join Date
    Dec 2008
    Location
    France
    Posts
    9,060
    Quote Originally Posted by FlanK3r View Post
    No, this is unrealistic. If IPC were near SB, Piledriver would totally destroy the whole CPU segment. And with respect, that is not possible now. For example: FX now scores 6.02 points in R11.5, is near the 2500K in Photoshop, etc. If single-thread performance were near SB with Piledriver, Ivy Bridge would have no chance in multi-thread.
    You have to remember that SB already clocks higher than BD using ambient cooling. IB will be even further ahead clock rate wise. So one module of PD matching one HT core of SB sounds perfectly reasonable to me.
    Quote Originally Posted by jayhall0315 View Post
    If you are really extreme, you never let informed facts or the scientific method hold you back from your journey to the wrong answer.

  18. #668
    Xtreme Addict
    Join Date
    Jan 2009
    Location
    SF
    Posts
    1,070
    IB won't really clock higher; it should theoretically just consume less power.

  19. #669
    Xtreme Addict
    Join Date
    Jan 2007
    Location
    Brisbane, Australia
    Posts
    1,264
    Quote Originally Posted by zalbard View Post
    You have to remember that SB already clocks higher than BD using ambient cooling. IB will be even further ahead clock rate wise. So one module of PD matching one HT core of SB sounds perfectly reasonable to me.
    No it doesn't... That's the thing. The Bulldozer uArch clearly scales to higher frequencies. For it to achieve this in the real world, though, it will need a much improved process.

  20. #670
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    mAJORD is correct. Bulldozer should be able to clock higher, but it needs a mature 32nm process. The power draw/frequency they achieved with Zambezi B2 is entirely GloFo's fault.

  21. #671
    Xtreme Member
    Join Date
    Dec 2006
    Posts
    247
    Well, AMD surely aimed for 5+GHz clocks on paper. It would have been a fairly reasonable CPU if the FX-8150 were 5000+MHz by default and still inside its TDP.

  22. #672
    I am Xtreme
    Join Date
    Sep 2006
    Posts
    10,374
    I already wonder how many can do a 5GHz Prime95 run for hours; my heat output is already too high at 4.7, reaching over 90°C in a matter of minutes... (and no, my coolers are mounted fine :p )
    Question : Why do some overclockers switch into d*ckmode when money is involved

    Remark : They call me Pro Asus Saaya yupp, I agree

  23. #673
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by Kristoferr View Post
    Well, AMD surely aimed for 5+GHz clocks on paper. It would have been a fairly reasonable CPU if the FX-8150 were 5000+MHz by default and still inside its TDP.
    5+GHz is too much even for the top end, even on paper; it's more likely they aimed for 4.2-4.5GHz, which seems more feasible for the top end.
    Coming Soon

  24. #674
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Quote Originally Posted by Drwho? View Post
    Ps: forgot, if you want to verify yourself, count how many retired instructions are done on intel and AMD and compare the numbers, they land very very very close to each other on cinebench
    I'd rather search for "GenuineIntel" in the exe file. For example at these positions (for the 64b binary):

    006F6595, 006F65A4 & 006F65AE.

    There are these commands:

    cmp eax,0756E6547
    cmp eax,049656E69
    cmp eax,06C65746E

    The hex numbers translated, from bottom to top:
    "letn Ieni uneG"

    Now read from right to left ... ;-)

    What's the purpose of this?

    I have to admit, however, that there is not much performance difference on an AMD K10. I just wonder what it is doing there ...
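    For anyone who wants to check the translation of those three constants without reading hex backwards, here is a minimal sketch. It only decodes the values listed above as little-endian ASCII; the helper name is made up for illustration, and nothing here inspects the actual binary or says what the program does with the result.

```python
import struct


def decode_le_dword(value):
    """Interpret a 32-bit constant as four ASCII bytes in little-endian
    order, i.e. the order in which x86 stores them in memory."""
    return struct.pack("<I", value).decode("ascii")


# The three immediates from the cmp instructions, top to bottom.
constants = [0x756E6547, 0x49656E69, 0x6C65746E]

parts = [decode_le_dword(c) for c in constants]
vendor = "".join(parts)

print(parts)   # ['Genu', 'ineI', 'ntel']
print(vendor)  # GenuineIntel
```

    The three pieces match the CPUID vendor string, which the CPU returns split across EBX, EDX and ECX, so the comparisons look like a vendor check. What the code does after that check is not something this sketch can tell you.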
    Last edited by Opteron146; 10-16-2011 at 09:59 AM.

  25. #675
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    Quote Originally Posted by mattkosem View Post
    Windows, with its various components and services, can also easily utilize many cores when a lot is going on. Sharing files, copying files, live transcoding for DLNA media sharing, recording multiple TV shows, and watching a movie at the same time can certainly utilize more than a quad core on its own every day of the week in a home environment. I'm assuming you've got multiple applications open now, likely more than one active. Provided you are, you have successfully taken advantage of more than one core. You're obviously not getting 4x the perf all the time from 4 cores (or more), but when the going gets tough and a lot is going on it can certainly help.
    You're missing the point that much or all of that activity is likely subject to bottlenecks elsewhere in the system. Most threads doing meaningful computational work on significant amounts of data will be waiting for disk (or Internet) IO requests most of the time, not actually computing, since even with SSDs disk IO is orders of magnitude slower than memory access. Unless you are running only truly multi-threaded apps like encoders, once you've gone past a number of cores (4? 6?) the difference is not likely to be user-noticeable since the core scheduler will be allowing busy threads to use the time the held-up threads don't need while waiting for the IO subsystem.

    I love the idea that the problem with ST versus MT is all due to lazy programmers who haven't multi-threaded their software. How, exactly, would multi-threading my email software help? Will it get my email off the remote server faster? Will it display the single email I am looking at any faster? Better ST performance might, though...
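    To put a rough number on "past a certain number of cores it stops being noticeable", here is a small sketch using Amdahl's law. This is a swapped-in textbook model, not a measurement of any real workload: it treats everything that cannot run in parallel (including time spent waiting on IO) as a fixed serial fraction, and the 50% figure is purely an assumption for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only parallel_fraction of the
    work can be spread across the given number of cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: half of the time is serial or waiting on IO.
for cores in (1, 2, 4, 6, 8):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
# 1 1.0
# 2 1.33
# 4 1.6
# 6 1.71
# 8 1.78
```

    With that assumed split, going from 4 to 8 cores buys about 11% more, and no core count can ever reach 2x. A workload that really is almost fully parallel, like an encoder, behaves very differently, which is the exception already noted above.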
    Quote Originally Posted by Particle View Post
    Software patents are mostly fail wrapped in fail sprinkled with fail and sautéed in a light fail sauce.
