Page 88 of 181
Results 2,176 to 2,200 of 4519

Thread: AMD Zambezi news, info, fans !

  1. #2176
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by xdan View Post
    And who says and who can prove that BD will have single thread performance equal to SB or better?
    Or even better than Thuban?
    I don't think that you understood what I meant.
    An improved Phenom II architecture should have interesting performance: reduced L2 & L3 cache, better IMC, larger L3 cache (8MB). Just some tweaks, like Nehalem -> SB.
    And it would probably have better yields, because it is something well known and so on.
    improved PII is Llano (minus L3)
    i don't know if it will be better than SB in single threaded tests (sorry if i implied that), i do EXPECT it to be better than thuban due to clocks and previous details provided.
    if 6 BD threads = 6 thuban threads, and 2 cores get 195% scaling while 2 BD threads on one core/module get 180%, then a single core/thread of BD is ahead by +15% already, because we KNOW the scaling of BD is not the same as true independent cores. so if performance with all cores in use is equal, then with fewer cores in use BD must be stronger.
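    The scaling argument above can be worked through numerically. This is only a sketch under the post's own premise (two Thuban cores and two threads on one BD module deliver the same throughput); the function and constant names are made up for illustration. Taken literally, the premise implies a single-thread advantage of roughly 8%, somewhat less than the 15-point gap between the two scaling figures.

```python
# Hypothetical model of the scaling argument; names are illustrative only.
THUBAN_PAIR_SCALING = 1.95  # two independent cores vs. one core (figure from the post)
BD_MODULE_SCALING = 1.80    # two threads on one BD module vs. one thread (figure from the post)


def implied_single_thread_ratio(pair_scaling: float, module_scaling: float) -> float:
    """Return B/T, the BD-to-Thuban single-thread ratio.

    Premise (assumed, not measured): module_scaling * B == pair_scaling * T,
    i.e. both chips deliver the same throughput with two threads.
    """
    return pair_scaling / module_scaling


ratio = implied_single_thread_ratio(THUBAN_PAIR_SCALING, BD_MODULE_SCALING)
print(f"Implied single-thread advantage: {100 * (ratio - 1):.1f}%")
```

    The result depends entirely on the "equal when fully loaded" premise, which nobody in the thread has confirmed.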
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  2. #2177
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    My thinking about all performance questions:
    1) Could an older X6 Thuban with a 370+ mm² die be better than an FX 8150 with a 315 mm² die?
    2) Could an older X6 Thuban at 3.3 GHz/3.7 GHz (on half of the threads) be better than an FX 8150 at 3.6 GHz/4.2 GHz?
    3) If yes, why is there no 32nm Phenom II X8 with a die size similar to FX? It would ultimately be better (and at 32nm Thuban could maybe have higher stock clocks)!

    No, it makes no sense..
    ROG Power PCs - Intel and AMD
    CPUs:i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350,1x AMD FX-8320,1x AMD FX-8300, 2x AMD FX-6300,2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K,AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K,2x i7-4770K, Intel i7-3930KAMD Cinebench R10 challenge AMD Cinebench R15 thread Intel Cinebench R15 thread

  3. #2178
    Brilliant Idiot
    Join Date
    Jan 2005
    Location
    Hell on Earth
    Posts
    11,015
    Quote Originally Posted by FlanK3r View Post
    3) at 32nm Thuban could maybe have higher stock clocks!

    No, it makes no sense..
    Said it before and will say it again.

    TDP.........

    If you shrunk thuban to 32nm you could do 1 of 2 things, add 2 cores or increase clocks......not both, there's no room to grow.

    Apparently according to AMD's roadmap however with bulldozer there is room to grow and stay within TDP limitations......
    Last edited by chew*; 09-09-2011 at 10:57 AM.
    heatware chew*
    I've got no strings to hold me down.
    To make me fret, or make me frown.
    I had strings but now I'm free.
    There are no strings on me

  4. #2179
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    Maybe you're right, but an X8 PII at 32nm would still be a multithread beast with the same clocks as today.
    ROG Power PCs - Intel and AMD
    CPUs:i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350,1x AMD FX-8320,1x AMD FX-8300, 2x AMD FX-6300,2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K,AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K,2x i7-4770K, Intel i7-3930KAMD Cinebench R10 challenge AMD Cinebench R15 thread Intel Cinebench R15 thread

  5. #2180
    Brilliant Idiot
    Join Date
    Jan 2005
    Location
    Hell on Earth
    Posts
    11,015
    Quote Originally Posted by FlanK3r View Post
    Maybe you're right, but an X8 PII at 32nm would still be a multithread beast with the same clocks as today.
    Maybe, but you missed the other part I mentioned.........Where do you go from there?

    You have no room to grow, so you're back to square 1 and need a new architecture to build on.
    heatware chew*
    I've got no strings to hold me down.
    To make me fret, or make me frown.
    I had strings but now I'm free.
    There are no strings on me

  6. #2181
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    ^i like the new avatar
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  7. #2182
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    pseudo cores....
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  8. #2183
    Xtreme Member
    Join Date
    Aug 2011
    Posts
    180
    Maybe BD is a Pentium 4 AMD style.

    Pentium 4 introduced 2 threads per core, was slower than PIII clock for clock, but excelled in certain tasks (video encoding)

  9. #2184
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by ice_chill View Post
    Maybe BD is a Pentium 4 AMD style.

    Pentium 4 introduced 2 threads per core, was slower than PIII clock for clock, but excelled in certain tasks (video encoding)
    I doubt that.

  10. #2185
    Xtreme Addict
    Join Date
    Mar 2005
    Location
    Rotterdam
    Posts
    1,553
    People really should get over the fact that BD will NOT be slower than Thuban.. Jesus what's wrong with yall...
    Gigabyte Z77X-UD5H
    G-Skill Ripjaws X 16Gb - 2133Mhz
    Thermalright Ultra-120 eXtreme
    i7 2600k @ 4.4Ghz
    Sapphire 7970 OC 1.2Ghz
    Mushkin Chronos Deluxe 128Gb

  11. #2186
    Xtreme Member
    Join Date
    Aug 2011
    Posts
    180
    Each module might be faster than each core in Thuban, but each mini core in BD might not match each full core in Thuban. AMD themselves even once referred to a BD module as 1.5 cores and not 2 cores.

  12. #2187
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by ice_chill View Post
    Each module might be faster than each core in Thuban, but each mini core in BD might not match each full core in Thuban. AMD themselves even once referred to a BD module as 1.5 cores and not 2 cores.
    not that i can recall. they have never compared thuban to BD, the only thing they have said is 180% scaling with CMT vs 195% scaling of old independent cores vs 115% scaling of SMT
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  13. #2188
    Xtreme Addict
    Join Date
    Mar 2005
    Location
    Rotterdam
    Posts
    1,553
    I have just one beef with comparing 1 Thuban core vs 1 Bulldozer "core". While 1 Thuban core could (ignoring design differences) technically be faster, being a complete core, people forget that 1 thread on a BD core will run with the full 2MB L2 + 8MB L3 cache and the full FP scheduler, which together with the faster clock speed should really help single threaded performance.

    The design is just really smart; it has the potential to do very well, especially once its drawbacks are ironed out in future revisions.
    Gigabyte Z77X-UD5H
    G-Skill Ripjaws X 16Gb - 2133Mhz
    Thermalright Ultra-120 eXtreme
    i7 2600k @ 4.4Ghz
    Sapphire 7970 OC 1.2Ghz
    Mushkin Chronos Deluxe 128Gb

  14. #2189
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by ice_chill View Post
    Each module might be faster than each core in Thuban, but each mini core in BD might not match each full core in Thuban. AMD themselves even once referred to a BD module as 1.5 cores and not 2 cores.
    Depends on workload. For example, FFT (Lucas-Lehmer Fast Fourier Transforms) uses almost all FPU resources. 95% of instructions are SSE2 packed float and IPC is 1.9; the FPU executes 1.8 FP instructions per cycle. There are two arithmetic pipes, FADD and FMUL, in Phenom and Sandy Bridge cores, and because of the algorithm it uses both the FADD and FMUL pipes effectively. In such a situation a Bulldozer core has half the resources to execute with. There are some exceptions if the algorithm uses the FMA pipe and the integer SIMD ALU pipe. In that case a BD module has double the resources of a 10h core.
    Linpack also uses a lot of FP resources. It uses all three pipes in 10h: with an IPC of 2.6, 98% of the instructions are FP. But that is a synthetic benchmark which uses the FADD, FMUL and FSTORE pipes. In BD, for example, FlexFP is 4-issue, and ALU SIMD instructions like ANDNPD can execute on FPU pipes 2 and 3 independently of other ADD or MUL instructions. On 10h they can execute only on the FADD or FMUL pipe. Theoretically, for synthetic benchmarks, per core, FlexFP has only half of the resources of the K10 FPU.
    If the instruction mix has some bitwise SIMD (ALU) and SIMD FP instructions, it has the same resources as K10.
    Also, if it uses FMA instructions, a module has double the FP resources of 10h, and the same when compared core to core.
    In normal workloads, e.g. rendering tasks, there is a mix of FP and integer code. That could run much faster on an 8-core BD than on a 6-core Thuban. Also, in single-threaded or lightly threaded code BD could be much faster.

    The conclusion is that BD will be much faster in multithreaded code wherever an Intel Core i7 with hyperthreading does well, that is, in every case where execution resources are underutilised. For example, Linpack runs slower with hyperthreading than without it, because of shared resources and the shared L1D cache. Linpack with 4 threads on BD will not run faster than with 8 threads, but with 8 threads it will not run much faster than with 4 threads.
    But Cinebench, whose FPU utilisation is 25% (0.55 FP IPC and 0.65 IPC on a Thuban core), can run much faster with 8 threads than with 4 threads. Also, with hyperthreading, CB scales very well.
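    The FFT and Linpack figures in this post can be cross-checked with a little arithmetic. The sketch below multiplies total IPC by the FP share to get FP instructions per cycle, then divides by the number of FP pipes in use. Treating "utilisation" as FP IPC divided by pipe count (one instruction per pipe per cycle) is a simplifying assumption of this example, not something stated in the post, and the function names are made up.

```python
def fp_instructions_per_cycle(total_ipc: float, fp_fraction: float) -> float:
    """FP instructions retired per cycle, given total IPC and the FP share of the mix."""
    return total_ipc * fp_fraction


def pipe_utilisation(fp_ipc: float, pipes: int) -> float:
    """Fraction of FP issue slots used, ASSUMING one instruction per pipe per cycle."""
    return fp_ipc / pipes


# Figures quoted in the post for a 10h (K10) core.
fft_fp_ipc = fp_instructions_per_cycle(total_ipc=1.9, fp_fraction=0.95)
linpack_fp_ipc = fp_instructions_per_cycle(total_ipc=2.6, fp_fraction=0.98)

# FFT is described as using FADD + FMUL (2 pipes), Linpack as using all 3 pipes.
fft_util = pipe_utilisation(fft_fp_ipc, pipes=2)
linpack_util = pipe_utilisation(linpack_fp_ipc, pipes=3)

print(f"FFT:     {fft_fp_ipc:.2f} FP instr/cycle, {fft_util:.0%} of 2 pipes")
print(f"Linpack: {linpack_fp_ipc:.2f} FP instr/cycle, {linpack_util:.0%} of 3 pipes")
```

    The FFT result (about 1.8 FP instructions per cycle) matches the number given in the post, which suggests the quoted IPC and instruction-mix figures are at least internally consistent.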
    Last edited by drfedja; 09-09-2011 at 03:46 PM.
    "That which does not kill you only makes you stronger." ---Friedrich Nietzsche
    PCAXE

  15. #2190
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241
    Quote Originally Posted by repman View Post
    This maybe be a dumb question but, is it possible that the OS (Windows) would need an update because of the BD design in order to properly support it?
    Quote Originally Posted by Mats View Post
    Yeah, JF talks about "Final drivers"- what's that about?
    I posted about this a long time ago, where JF talks about it in the AMD blog (would have to dig through my early posts on here, thankfully there aren't many). While JF can only speak on matters pertaining to Server products, I think it's fair to assume at least some of what he said will still hold true for the Desktop versions as well. From what I recall it was not only OS coding that would help (I guess drivers in this case), but application coding as well, which will better utilize how the cache works in BD (sharing). He seemed to indicate that Windows 8 (he did not outright say 'Windows', or '8') would support it, which led me to wonder how possible it would be to get a patch out for 7 so that it fully supports BD as intended. I figured that if it could be done w/o an arse load of work, it might still require quite a hefty patch size, so I'm hoping the drivers mentioned are able to do most/all of what is needed.

  16. #2191
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    Quote Originally Posted by Formula350 View Post
    I posted about this a long time ago, where JF talks about it in the AMD blog (would have to dig through my early posts on here, thankfully there aren't many). While JF can only speak on matters pertaining to Server products, I think it's fair to assume at least some of what he said will still hold true for the Desktop versions as well. From what I recall it was not only OS coding that would help (I guess drivers in this case), but application coding as well, which will better utilize how the cache works in BD (sharing). He seemed to indicate that Windows 8 (he did not outright say 'Windows', or '8') would support it, which led me to wonder how possible it would be to get a patch out for 7 so that it fully supports BD as intended. I figured that if it could be done w/o an arse load of work, it might still require quite a hefty patch size, so I'm hoping the drivers mentioned are able to do most/all of what is needed.
    Let's all hope you're right!
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  17. #2192
    Xtreme 3D Team
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,499
    Quote Originally Posted by informal View Post
    You mean this is shipping performance? B2 stepping, even if it's 6-7% slower or even 15% slower, it sucks badly since it is slower than or barely equal to 1100T. Rumored price from AMD themselves: $300. Rumored price from one dude having them listed on his own site: $260. Today's 1100T price: $190 (will go down after Zambezi launches). If, as you say, price reflects performance, then you will have 1100T performance (+-10/15%) with a 30+% higher price. Is this logical?

    In the link above (VR-Zone), just one glance at the C10 64bit single core test tells you something is off. You have a single Bulldozer core using the 256bit FPU for itself and running at 4GHz. It gets 3769 pts with some of the features turned off in BIOS (best result they managed). Now, take a look at a single Thuban core, running at 3.7GHz in the same benchmark. It scores 4103 pts. That is 17% faster than what Zambezi would get at 3.7GHz and still faster (8%) than what Zambezi gets at 4GHz. This is the brand new, double sized, improved, SMT capable FlexFP, with free reg-reg moves (no cost instruction according to AMD), and a million other improvements versus K10? Yeah, call me crazy but I don't think so.
    The only positive thing I see in there is the 4.7 to 5x multiprocessor speedup. Performance as a whole sucks ass (something is wrong with these results, because Opterons would not have been handed to Cray for upgrades had the CPUs been underperforming); however, it seems my hypothesis about single to multithread performance after what chew* hinted at was logical.
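    The single-core percentages in the quoted post can be reproduced from the two scores. The sketch below uses only the figures quoted there (3769 pts at 4 GHz for Zambezi, 4103 pts at 3.7 GHz for Thuban) and assumes the benchmark score scales linearly with clock, which is an approximation and not something the leaked results establish.

```python
# Scores and clocks as quoted in the post (leaked pre-launch results, unverified).
BD_SCORE, BD_CLOCK_GHZ = 3769, 4.0
THUBAN_SCORE, THUBAN_CLOCK_GHZ = 4103, 3.7


def scale_score(score: float, from_ghz: float, to_ghz: float) -> float:
    """Rescale a score to another clock, ASSUMING perfectly linear clock scaling."""
    return score * to_ghz / from_ghz


# Thuban's lead over Zambezi as tested (3.7 GHz vs 4.0 GHz).
lead_as_tested = THUBAN_SCORE / BD_SCORE - 1

# Thuban's lead if Zambezi were clocked down to 3.7 GHz.
bd_at_thuban_clock = scale_score(BD_SCORE, BD_CLOCK_GHZ, THUBAN_CLOCK_GHZ)
lead_same_clock = THUBAN_SCORE / bd_at_thuban_clock - 1

print(f"Thuban lead as tested:  {lead_as_tested:.1%}")
print(f"Thuban lead at 3.7 GHz: {lead_same_clock:.1%}")
```

    Both results land slightly above the 8% and 17% quoted in the post, which is consistent with those figures having been rounded down.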
    Last edited by BeepBeep2; 09-09-2011 at 06:53 PM.
    Smile

  18. #2193
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    The 4.7 to 5x multiprocessor speedup is simple to explain. In the single-thread test you have one core using one 256bit FPU. In MT you have 8 cores using 4 256bit FPUs. From one to 4 FPUs you have a 4-5x speedup (more than four due to the SMT mode in the FlexFP, which adds 20% on top of 4x).
    The problem is that one 256bit double-sized FlexFP is slower than an old Thuban core. At least in these benchmarks/conditions. That's what's illogical.
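    As a quick check of the explanation above, the expected multithreaded speedup can be modelled as the number of shared FPUs times the SMT gain per FPU. This is a rough sketch: the 20% SMT figure comes from the post, while the purely multiplicative model and the function name are assumptions of this example.

```python
def expected_mt_speedup(fpu_count: int, smt_gain: float) -> float:
    """Speedup over one thread on one FPU, ASSUMING FP-bound code and a
    multiplicative SMT gain per shared FPU."""
    return fpu_count * (1 + smt_gain)


# 4 FlexFP units, each running two threads with a ~20% SMT gain (figure from the post).
speedup = expected_mt_speedup(fpu_count=4, smt_gain=0.20)
print(f"Expected MT speedup: {speedup:.1f}x")
```

    The model gives about 4.8x, inside the 4.7 to 5x range reported for the leaked results.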

  19. #2194
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by informal View Post
    The 4.7 to 5x multiprocessor speedup is simple to explain. In the single-thread test you have one core using one 256bit FPU. In MT you have 8 cores using 4 256bit FPUs. From one to 4 FPUs you have a 4-5x speedup (more than four due to the SMT mode in the FlexFP, which adds 20% on top of 4x).
    The problem is that one 256bit double-sized FlexFP is slower than an old Thuban core. At least in these benchmarks/conditions. That's what's illogical.
    I don't quite follow, but I am likely not thinking about it right. First question, has CB been recompiled for AVX? Second question, if not, then CB will take advantage of 8 128 bit FPUs will it not?
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  20. #2195
    Xtreme Addict
    Join Date
    Dec 2007
    Location
    Earth
    Posts
    1,787
    Quote Originally Posted by JumpingJack View Post
    I don't quite follow, but I am likely not thinking about it right. First question, has CB been recompiled for AVX? Second question, if not, then CB will take advantage of 8 128 bit FPUs will it not?
    I don't think Cinebench has AVX optimizations to date. Therefore the benchmarks we have seen for BD thus far have not actually made use of the 256-bit floating point mode.

    On another note, I'll go out on a limb here and say: I do not think AMD has anything spectacular to offer as of yet in regards to BD. Obviously this is why it has not been released. And I do think at this point BD is nearly as flawed as P4. A core is NOT a core unless it can stand on its own in my book, with the exception of shared cache, which is good since many cores have access to the same cached data.
    Sandy Bridge 2500k @ 4.5ghz 1.28v | MSI p67a-gd65 B3 Mobo | Samsung ddr3 8gb |
    Swiftech apogee drive II | Coolgate 120| GTX660ti w/heat killer gpu x| Seasonic x650 PSU

    QX9650 @ 4ghz | P5K-E/WIFI-AP Mobo | Hyperx ddr2 1066 4gb | EVGA GTX560ti 448 core FTW @ 900mhz | OCZ 700w Modular PSU |
    DD MC-TDX CPU block | DD Maze5 GPU block | Black Ice Xtreme II 240 Rad | Laing D5 Pump

  21. #2196
    Xtreme 3D Team
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,499
    Quote Originally Posted by informal View Post
    The 4.7 to 5x multiprocessor speedup is simple to explain. In the single-thread test you have one core using one 256bit FPU. In MT you have 8 cores using 4 256bit FPUs. From one to 4 FPUs you have a 4-5x speedup (more than four due to the SMT mode in the FlexFP, which adds 20% on top of 4x).
    The problem is that one 256bit double-sized FlexFP is slower than an old Thuban core. At least in these benchmarks/conditions. That's what's illogical.
    Obviously it's all terrace215's fault. (lol, joking)
    We will see when the CPUs launch; a CPU performing at that level is probably only worth about $120.

    CrazyNutz -
    Why would Cray buy a downgrade, even if it did have 4 more cores? I think all these samples are crippled in some way and that is the reason why John stated that nothing from anyone other than AMD with proper testing procedures will be correct until after launch time.
    Last edited by BeepBeep2; 09-09-2011 at 09:18 PM.
    Smile

  22. #2197
    D.F.I Pimp Daddy
    Join Date
    Jan 2007
    Location
    Still Lost At The Dead Show Parking Lot
    Posts
    5,182
    Quote Originally Posted by Dimitriman View Post
    ok that's it. I am definitively settling for an i7 because all these latest benchmarks prove for sure that Bulldozer will be slower than 1100T!

    OK....I have just built a workstation with an i7 950, and recently 2 different server/workstations with Westmere Xeons, one of them being my own in my sig, and quite frankly all of them are just marginally better than what I was running prior to this build, which was a 3+ year old Phenom II 940 system. So as I have been saying for years now, most of this exaggerated stuff with higher numbers/benchmarks is just theoretical and e-Penis stroking. Real world, it really is not that much faster from apples to oranges, and that's a fact!
    Last edited by Brother Esau; 09-09-2011 at 09:31 PM.
    SuperMicro X8SAX
    Xeon 5620
    12GB - Crucial ECC DDR3 1333
    Intel 520 180GB Cherryville
    Areca 1231ML ~ 2~ 250GB Seagate ES.2 ~ Raid 0 ~ 4~ Hitachi 5K3000 2TB ~ Raid 6 ~

  23. #2198
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by BeepBeep2 View Post
    Obviously it's all terrace215's fault. (lol, joking)
    We will see when the CPUs launch; a CPU performing at that level is probably only worth about $120.
    He posted so much IPC actually did go down (j/k)
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  24. #2199
    D.F.I Pimp Daddy
    Join Date
    Jan 2007
    Location
    Still Lost At The Dead Show Parking Lot
    Posts
    5,182
    OH, AND YAY FOR BULLDOZER!

    Too bad I am broke now and moving to Charlotte soon. DAMN you AMD! GOD HOW I LOATHE INTEL! Must admit though, my current build is quite nice and very stable, but my heart and preference lie with the Green Team.......tsk..tsk.....dammit!
    SuperMicro X8SAX
    Xeon 5620
    12GB - Crucial ECC DDR3 1333
    Intel 520 180GB Cherryville
    Areca 1231ML ~ 2~ 250GB Seagate ES.2 ~ Raid 0 ~ 4~ Hitachi 5K3000 2TB ~ Raid 6 ~

  25. #2200
    Xtreme Addict
    Join Date
    Dec 2007
    Location
    Earth
    Posts
    1,787
    Quote Originally Posted by BeepBeep2 View Post
    CrazyNutz -
    Why would Cray buy a downgrade, even if it did have 4 more cores? I think all these samples are crippled in some way and that is the reason why John stated that nothing from anyone other than AMD with proper testing procedures will be correct until after launch time.
    I did take that into consideration. However, the upgrade path to a 256bit FPU should be a very good upgrade, since we know most organizations using a Cray will most likely benefit from that for their scientific research.
    Also, we've only seen AMD handing over 1 box of CPUs to Cray; this may only be for a single Cray customer. How many others did not choose this upgrade?
    Sandy Bridge 2500k @ 4.5ghz 1.28v | MSI p67a-gd65 B3 Mobo | Samsung ddr3 8gb |
    Swiftech apogee drive II | Coolgate 120| GTX660ti w/heat killer gpu x| Seasonic x650 PSU

    QX9650 @ 4ghz | P5K-E/WIFI-AP Mobo | Hyperx ddr2 1066 4gb | EVGA GTX560ti 448 core FTW @ 900mhz | OCZ 700w Modular PSU |
    DD MC-TDX CPU block | DD Maze5 GPU block | Black Ice Xtreme II 240 Rad | Laing D5 Pump
