Page 2 of 4
Results 26 to 50 of 89

Thread: [TPU] AMD Readies 16-core Processors with Full Uncore

  1. #26
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879

    Excuse me sir, I'd like to point out a few things.

    Quote Originally Posted by [XC] Lead Head View Post
    Please stop trying to defend Kaveri. Steamroller sucks. AMD needs a 30-40% if not more increase in IPC across the board to be competitive with Intel. Kaveri is maybe 5-10% faster on average compared to Piledriver and it STILL has significantly less IPC compared to Thuban. AMD was absolutely idiotic going to an integer focused architecture when almost everything these days is using floating point math. Until HSA becomes a thing, Kaveri/Streamroller is a joke, and I suspect Excavator will be as well.
    Last time I looked, Piledriver at 5.0 GHz scored only 1.36 in Cinebench R11.5 and Haswell is somewhere around 2.2+, which is a far greater gap than 40%. I'm just wondering where you got 30-40%?

    it's not that I don't agree with most of that, it's just that AMD's IPC to FPS isn't the exact same as intels IPC to FPS
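For what it's worth, the gap implied by those two scores is easy to check. This is only a rough sketch: the scores are the ones quoted above, the function name is made up for illustration, and it compares raw scores without correcting for the clock speed difference between the two chips.

```python
def percent_faster(fast: float, slow: float) -> float:
    """Return how much faster `fast` is than `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


piledriver_score = 1.36  # Cinebench R11.5 score at 5.0 GHz, as quoted in the post
haswell_score = 2.2      # "somewhere around 2.2+", as quoted in the post

gap = percent_faster(haswell_score, piledriver_score)
print(f"Haswell leads by roughly {gap:.0f}%")
```

Since the Piledriver number is at 5.0 GHz, the per-clock gap would be larger than this raw figure if Haswell was running at a lower clock.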

    Quote Originally Posted by tajoh111 View Post
    If this is based on Steamroller, I would be worried about what these things are clocked at.

    All these chips with crazy core counts always take a pretty big frequency hit. For most general purpose tasks, Bulldozer needs a really high frequency to be somewhat competitive with Intel. Steamroller is pretty similar in this regard. It did indeed improve IPC a bit, but the problem is that Steamroller appears to clock worse than Piledriver. Thus, adding this many modules plus Steamroller's lower clocks makes this seem like a product so niche that it really doesn't have much of a market, particularly when you add that AMD has lost pretty much all of the server and workstation market.
    I believe there are a few reasons for this: a second decoder is going to add power usage and heat, just as the higher transistor density will, along with the drop in the pipeline from 4 to 3.
    You can't just add a second decoder with zero added power draw. Heck, even going from 3 to 4 increases power a lot, even with a die shrink, simply because going from a 3-way decoder to a 4-way decoder makes the decoder 4 times bigger in surface area than before.

    Quote Originally Posted by stuffme View Post
    I hear you.

    Once again AMD has ed us in the ass with it's BS. I should have known that the talks about 30% better IPC are best case scenario with nothing to do with the real world. If only one application can get the 30% boost, but the rest will get only 5-10% boost, I call it a super fail. Same goes for intel, haswell was fail even though the high anticipation. It seems amd lost it when jerry sanders left the building. Hector was complete joke without no idea how to run promising company and now on CPU side AMD is complete joke to the industry.

    I used to love AMD products and the ability to seriously fight against the goliath and I have owned many amd system. But now I just cant buy anything they make, they don't have anything interesting in any segment. Like now I would like to have AMD based NUC, but noooo, not gonna happen. If going to happen, it will be too late again since i probably own intel nuc by then. Performance wise all their cpus are complete joke, but price wise attractive... but then you see the power consumption figures and that's the last straw. I just cannot swallow that load of failness in whole product line and I just cannot have something so badly engineered on my system. I probably would have bought 8350 processor if it would have been energy efficient, but like we all know how that turned out to be.
    stagnant for everything currently and pretty boring.


    Quote Originally Posted by AliG View Post
    Yup my thoughts exactly. I'm not sure why they reduced the execution resources, the original schematics suggested 4 ALUs per core for a total of 8 per module. I suppose part of that was due to thermal constraints (not to mention bulldozer was already a huge die at 45nm), but whoever was the head of the project really screwed up.

    The idea of shared resources actually is pretty solid if you start to consider very large scale expansions. The only problem is it doesn't matter how many threads you can handle if your individual cores are considerably weaker than the competition.

    Yeah pretty much. Again, the idea of shared resources actually makes a lot of sense if you have extremely efficient cores - alas AMD does not.
    On shared resources: people seem to have forgotten about the speedup you get when both cores are running similar code that sits in the instruction cache shared between the two cores.

    I'm not sure if you can find any games that show this, but it would be a game that gets lower FPS on 4 modules / 4 cores than on 2 modules / 4 cores.

    for the Admins of this forum I'm going to post about a banned person.
    I was back reading the bobcat bulldozer thread.
    There is actually really good information in that thread.

    Now I would like to point out that I went back and found that the ORIGINAL POSTER of Bulldozer's IPC decrease was not terrace215, but Hornet331.

    Quote Originally Posted by Hornet331 View Post
    Well, they also revealed that each int core only has 2x ALUs/AGUs; that's also new.

    Hate me for what I am saying, but with this is pretty much given that SB will rape bulldozer in singelthreaded apps. Sure bulldozer will use turbo to offset this handycap, but so does SB.
    Multithreaded load, on the other hand, will be very interesting; chances are good that AMD can win a lot of server share back with this.

    What I also wonder is how this design will deal with apps that only have limited thread counts; LAME MP3 encoding comes to mind, or many games. To me it seems that as long as the thread count is below the physical number of cores, SB will be the winner and even PII will do better. But if we go higher, Bulldozer will really show what it is capable of.

    So it seems AMD fully aimed at the server space with this design.
    hhhhmmmm

    Quote Originally Posted by Hornet331 View Post
    Yes, I already read the article...


    The important part is other enhancements.

    A 4-way wide frontend is wide, but on its own it's useless if you can't process the ops fast enough, so the solution is higher clock speeds, aka turbo.

    Again, as I said, it's highly likely that Bulldozer will have higher single-threaded performance, but IPC will go down compared to a single Deneb core.
    ^ This is still a problem, even more so now with 2 decoders, because I don't think the 3-way 96 KB cache is good enough to feed even one decoder, let alone two. Also see below.

    Now, the interesting part in there was Chumbucket843:

    Quote Originally Posted by Chumbucket843 View Post
    sure a low voltage chip can do that. at 3GHz? unlikely.

    no, i will not read that article. i dont need someone to tell me the sky is blue either.

    the only way to increase alu utilization efficiently is through software. using hardware to improve this is generally inefficient. gpu's are a good example of this. it is harder to exploit data parallelism but the majority of transistors in gpu's are sram in register files/caches, and alu's. this is about as lean as you can get.
    Software being HSA-enabled is most likely where AMD was headed.

    Quote Originally Posted by terrace215 View Post
    16KB L1D (down from 64KB in K10.5), only 2 ALUs per integer core (down from 3 in K10.5), longer pipeline... compensated by some op fusion, better prefetching & branch prediction, but: pretty clear now that the single- (and therefore low-) threaded integer IPC is NOT going to make a huge jump from K10.5. I'm not sure it will match Nehalem, even- might be close. SB should be well in the clear.
    In his post he was trying to get the problems across (yes, I know he posted about it so much that it looked like trolling).

    Those posts are in order, ironically.

    I didn't much like the small data caches myself. They are pretty efficient for being 1/4 the size of K10.5's. They just aren't ever going to be fast enough to beat K10. As AliG mentioned, they aren't nearly as good as K10.5 was.

    I also thought it was odd to keep 2-way 64 KB instruction caches for a 4-way decoder.
    If it's not working well for K10.5, why would it work better with an even bigger decoder?

    I don't know if this is possible, but why not just use both decoders for a single thread?
    Last edited by demonkevy666; 01-21-2014 at 10:08 AM.
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  2. #27
    Xtreme Cruncher
    Join Date
    Nov 2005
    Location
    Rhode Island
    Posts
    2,740
    Quote Originally Posted by demonkevy666 View Post
    Last time I looked, Piledriver at 5.0 GHz scored only 1.36 in Cinebench R11.5 and Haswell is somewhere around 2.2+, which is a far greater gap than 40%. I'm just wondering where you got 30-40%?

    it's not that I don't agree with most of that, it's just that AMD's IPC to FPS isn't the exact same as intels IPC to FPS
    It was just a random guess based on some benchmarks I've seen. Looks like they might need an 80-100% increase in IPC then?
    Quote Originally Posted by FlanK3r View Post
    Do you have any practical experience with it? I do, with both platforms (some details are in my rig list). I'm not saying AMD is better than Intel, but it's not true that the difference is as big as you suggest.
    And the Stars cores are not better. With today's software, an overclocked Phenom II X4 performs about the same as an overclocked Athlon APU X4, and the Athlon is more efficient (less power).
    The first Bulldozer was slightly better than the X6 1100T on average. Vishera is much better because of its higher clocks and the fixed bugs in cache and prediction.
    Yes I do. I have an FX-6300 Vishera (3.5GHz nominal, usually turbos to around 3.8 GHz) and an i7-2630QM (2.0GHz, ~2.3GHz turbo on 4-thread loads, yes it's a laptop). I've done various benchmarks, and the i7 at 2.3GHz is 1-5% faster than the FX-6300 while using half the power, at 40% less clock speed.
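To put those numbers on a per-clock basis, here is a minimal sketch. It assumes the clocks and the 1-5% lead quoted above, treats each chip as a whole (it says nothing about per-core or per-thread performance), and the helper name is invented for illustration.

```python
def per_clock_ratio(relative_perf: float, clock_ghz: float, ref_clock_ghz: float) -> float:
    """Performance per GHz relative to a reference chip whose performance is 1.0."""
    return relative_perf * ref_clock_ghz / clock_ghz


fx_clock = 3.8  # FX-6300 typical turbo clock in GHz, from the post
i7_clock = 2.3  # i7-2630QM 4-thread turbo clock in GHz, from the post

# How much lower the i7's clock is, in percent.
clock_deficit = (1 - i7_clock / fx_clock) * 100

# The i7 is quoted as 1-5% faster overall, i.e. 1.01x to 1.05x the FX-6300.
low = per_clock_ratio(1.01, i7_clock, fx_clock)
high = per_clock_ratio(1.05, i7_clock, fx_clock)

print(f"Clock deficit: {clock_deficit:.1f}%")
print(f"Per-clock advantage: {low:.2f}x to {high:.2f}x")
```

So under these assumptions the "40% less clock speed" figure holds up, and the whole-chip per-clock advantage works out to somewhere around 1.7x.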

    Compare a benchmark between an FX-6300 and a X6 1100T. The X6 will humiliate the FX-6300 in every possible way. The stars-cores are FAR better than Vishera and Steamroller clock-for-clock:


    Last edited by [XC] Lead Head; 01-21-2014 at 01:06 PM.
    Fold for XS!
    You know you want to

  3. #28
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    but.. you talking about apples and pears. Bulldozer architecture 15h is totaly different from classic cores as example Stars CPUs. There is many option as shared cache L1+L2, FP etc. Main building unit for Stars is simply core. But for Bulldozers is it one module. Module should be an alternative to Intels hyperthreading with better % effectivity (8 logical cores vs 8 logical cores). Effectivity in multiple performance is better than hyperthreading (2CU/4C vs 2C/4T, both 4 logical cores). But as we can see its not so easy. Because die size is with big caches much bigger and of course power consumption (80W vs 125W and performance 100% vs around 80%).

    example of effectivity 1CU/2C (so simply one module)


    effectivity with 2CU/2C (so 2 FP active, the same number of logical threads)


    clock to clock, one core - stars is a bit better, but the limit of architecture with the same power consumption is very different. OC of Thuban ending at 4200 MHz, Vishera begins at 4000/4200 turbo as stock. Kaveri is clock to clock with Piledriver much better:

    one example (one of betters): Superpi 32M, single thread:

    Kaveri at 4544 MHz with single-sided memory (single-sided is not as good for SuperPi as double-sided DIMMs)


    Richland with double sided memory, 5 GHz + Stilt x87 fix
    Last edited by FlanK3r; 01-21-2014 at 02:12 PM.
    ROG Power PCs - Intel and AMD
    CPUs: i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350, 1x AMD FX-8320, 1x AMD FX-8300, 2x AMD FX-6300, 2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K, AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K, 2x i7-4770K, Intel i7-3930K
    AMD Cinebench R10 challenge, AMD Cinebench R15 thread, Intel Cinebench R15 thread

  4. #29
    Xtreme Guru
    Join Date
    May 2007
    Location
    Ace Deuce, Michigan
    Posts
    3,955
    Quote Originally Posted by FlanK3r View Post
    Effectivity in multiple performance is better than hyperthreading (2CU/4C vs 2C/4T, both 4 logical cores). But as we can see its not so easy. Because die size is with big caches much bigger and of course power consumption (80W vs 125W and performance 100% vs around 80%)
    Right, which has been my point the whole time. Their module design would make a lot of sense if they could actually design power-efficient cores that could keep up on a single-threaded level. But the fact of the matter is that Intel's processes have much higher transistor density even at the same node, and their cache latency is way lower to boot. Considering AMD has to jack up the frequency to keep the playing field close, there's just no way they can currently compete at the server level without having to slash prices.
    Quote Originally Posted by Hans de Vries View Post

    JF-AMD posting: IPC increases!!!!!!! How many times did I tell you!!!

    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    .....}
    until (interrupt by Movieman)


    Regards, Hans

  5. #30
    Xtreme Guru
    Join Date
    May 2007
    Location
    Ace Deuce, Michigan
    Posts
    3,955
    Also on that note, if I'm AMD I would stop adding modules and beef up the underlying cores. I bet you a 3 Module chip with 4 ALUs per core would blow out of the water a 4 Module chip with the current 2 ALUs per core at the same power consumption (think about it, total ALUs = 24 vs 16).
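The ALU totals in that comparison can be spelled out in a couple of lines. This is just a sketch of the counting: it assumes two integer cores per module, as in the current Bulldozer-family design, and the 3-module / 4-ALU chip is hypothetical.

```python
CORES_PER_MODULE = 2  # each Bulldozer-family module has two integer cores


def total_alus(modules: int, alus_per_core: int) -> int:
    """Total integer ALUs on a chip built from `modules` modules."""
    return modules * CORES_PER_MODULE * alus_per_core


hypothetical_chip = total_alus(modules=3, alus_per_core=4)  # the proposed design
current_chip = total_alus(modules=4, alus_per_core=2)       # today's 4-module chip

print(hypothetical_chip, "vs", current_chip)
```

Raw ALU count is of course only part of the picture; whether the front end and scheduler could keep four ALUs per core busy is a separate question.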

  6. #31
    Xtreme Cruncher
    Join Date
    Nov 2005
    Location
    Rhode Island
    Posts
    2,740
    Quote Originally Posted by FlanK3r View Post
    but.. you talking about apples and pears. Bulldozer architecture 15h is totaly different from classic cores as example Stars CPUs. There is many option as shared cache L1+L2, FP etc. Main building unit for Stars is simply core. But for Bulldozers is it one module.
    Yes I know.
    Module should be an alternative to Intels hyperthreading with better % effectivity (8 logical cores vs 8 logical cores). Effectivity in multiple performance is better than hyperthreading (2CU/4C vs 2C/4T, both 4 logical cores). But as we can see its not so easy. Because die size is with big caches much bigger and of course power consumption (80W vs 125W and performance 100% vs around 80%).
    Yes, it *should* be an alternative to hyperthreading, and it does scale well, the problem is the performance is still bad. The modules do not have enough execution hardware, don't have enough cache and don't have a beefy enough scheduler to keep what little resources they do have filled.
    clock to clock, one core - stars is a bit better, but the limit of architecture with the same power consumption is very different. OC of Thuban ending at 4200 MHz, Vishera begins at 4000/4200 turbo as stock. Kaveri is clock to clock with Piledriver much better
    Clock to clock, Thuban is A LOT faster than Vishera, and is still faster than Kaveri. If AMD had beefed up Thuban's execution resources (More ALUs, two 256bit FPU, better cache, scheduler, etc..) they would have a CPU that would be competitive with Haswell. Build an improved Thuban on 28nm, and your heat and clock issues get addressed as well.

    Quote Originally Posted by AliG View Post
    Also on that note, if I'm AMD I would stop adding modules and beef up the underlying cores. I bet you a 3 Module chip with 4 ALUs per core would blow out of the water a 4 Module chip with the current 2 ALUs per core at the same power consumption (think about it, total ALUs = 24 vs 16).
    AMD has a big FPU performance problem as well. Sandy Bridge has two 256bit AVX FPUs per core. BD/PD/SR core can execute two 128bit SSE FPU instructions per cycle or they can best case execute 1 256bit AVX instruction. Worst case would take two cycles (if one FPU was occupied with an SSE instruction). So 256bit FPU performance could be anywhere from 1/2 to 1/4 the speed per cycle vs. SB/IB/Haswell. That's a huge penalty, that should hopefully be fixed with excavator.
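The 1/2 to 1/4 range follows directly from the per-cycle figures given above. A small sketch, taking the throughput numbers from this post at face value (they are the post's figures, not measured values):

```python
from fractions import Fraction

# 256-bit FP operations issued per cycle, as described in the post.
sandy_bridge = Fraction(2)  # two 256-bit AVX FPUs per core
bd_best = Fraction(1)       # both 128-bit pipes pair up for one 256-bit op per cycle
bd_worst = Fraction(1, 2)   # one pipe busy with SSE, so a 256-bit op takes two cycles

best_ratio = bd_best / sandy_bridge
worst_ratio = bd_worst / sandy_bridge

print(f"Best case: {best_ratio} of SB throughput, worst case: {worst_ratio}")
```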
    Fold for XS!
    You know you want to

  7. #32
    Xtreme Guru
    Join Date
    May 2007
    Location
    Ace Deuce, Michigan
    Posts
    3,955
    Quote Originally Posted by [XC] Lead Head View Post
    AMD has a big FPU performance problem as well. Sandy Bridge has two 256bit AVX FPUs per core. BD/PD/SR core can execute two 128bit SSE FPU instructions per cycle or they can best case execute 1 256bit AVX instruction. Worst case would take two cycles (if one FPU was occupied with an SSE instruction). So 256bit FPU performance could be anywhere from 1/2 to 1/4 the speed per cycle vs. SB/IB/Haswell. That's a huge penalty, that should hopefully be fixed with excavator.
    The irony is that AMD used to have a HUGE lead in float performance. K8 absolutely destroyed everything Intel had until Core 2 Duo, and the Stars cores were actually still competitive against Nehalem. This is why I consider my 2500K one of the greatest CPU investments ever (same goes for whoever got a 2600K). The only other CPU I can think of that held its own for such a long time at a low cost would be the Opteron 180, where people would overclock it to FX-60 performance levels on the cheap.

  8. #33
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Location
    Hawaii
    Posts
    611
    There's more to a bulldozer module than just the int and fp bits. The way the cores talk to each other is more modular with this arch and that allows kaveri to exist. Going forward you're going to see more shared resources and more integrated components. Best get used to it.
    Xeon E3-1245 @ Stock | Gigabyte H87N-Wifi | 16GB Crucial Ballistix LP @ 1600Mhz | R7 260x | Much and varied storage

  9. #34
    Xtreme Enthusiast Shocker003's Avatar
    Join Date
    Jul 2007
    Location
    Germany
    Posts
    725
    I don't care if they bring out a 16 core CPU that can't match a quadcore Intel CPU performance wise. I will buy one if it makes it to the stores.


    MAIN RIG--:
    ASUS ROG Strix XG32VQ---:AMD Ryzen 7 5800X--Aquacomputer Cuplex Kryos NEXT--:ASUS Crosshair VIII HERO---
    32GB G-Skill AEGIS F4-3000C16S-8GISB --:MSI RADEON RX 6900 XT---:X-Fi Titanium HD modded
    Inter-Tech Coba Nitrox Nobility CN-800 NS 800W 80+ Silver--:Cyborg RAT 8--:Creative Sound BlasterX Vanguard K08

  10. #35
    Xtreme Addict
    Join Date
    Mar 2005
    Location
    Rotterdam
    Posts
    1,553
    Kaveri and all Bulldozer-based CPUs are stepping stones to a completely homogeneous architecture (GPU+CPU completely integrated into the same logic).
    It is a necessary evil...
    Gigabyte Z77X-UD5H
    G-Skill Ripjaws X 16Gb - 2133Mhz
    Thermalright Ultra-120 eXtreme
    i7 2600k @ 4.4Ghz
    Sapphire 7970 OC 1.2Ghz
    Mushkin Chronos Deluxe 128Gb

  11. #36
    Xtreme Member
    Join Date
    Mar 2009
    Location
    Miltown, Wisconsin
    Posts
    353
    AMD Debuts New 12- and 16-Core Opteron 6300 Series Processors

    AMD today announced the immediate availability of its new 12- and 16-core AMD Opteron 6300 Series server processors, code named "Warsaw." Designed for enterprise workloads, the new AMD Opteron 6300 Series processors feature the "Piledriver" core and are fully socket and software compatible with the existing AMD Opteron 6300 Series. The power efficiency and cost effectiveness of the new products are ideal for the AMD Open 3.0 Open Compute Platform - the industry's most cost effective Open Compute platform.

    Source at TPU

    starting price of $377 and $598
    Quote Originally Posted by ***Deimos*** View Post
    WARNING GTX480 - may cause dizziness, blurred vision, dry mouth, dehydration, shortness of breath, headaches, naussea, explosive diahrea


    Foxconn Bloodrage P11 ( 2.1 SLIC MOD )
    Corei7 980 (3118B583) 4.2ghz 24-7 with stock vcore
    2x8GB PNY 1600c9 @ 1600mhz 9-9-9-24-1T
    nVidia GTX 770
    256gb OCZ Vertex4 FW 1.5
    2TB Green Barracuda
    Antec HCG-620w PSU
    Corsair H50 ( Sucks Hairry Balls IMHO )
    Coolermaster Storm Sniper Black Custom Sleeved
    3 x Dell U2410 H-IPS 1920x1200 Surround
    Windows 7 x64 Ultimate




  12. #37
    Xtreme Addict Evantaur's Avatar
    Join Date
    Jul 2011
    Location
    Finland
    Posts
    1,043
    Quote Originally Posted by To(V)bo Co(V)bo View Post
    starting price of $377 and $598
    not bad :O, could be good for crunching, that's if the power consumption is low enough

    I like large posteriors and I cannot prevaricate

  13. #38
    Administrator
    Join Date
    Nov 2007
    Location
    Stockton, CA
    Posts
    3,568
    Quote Originally Posted by Dimitriman View Post
    Kaveri and all Bulldozer-based CPUs are stepping stones to a completely homogeneous architecture (GPU+CPU completely integrated into the same logic).
    It is a necessary evil...
    I think you are right on that. Seems that's the way AMD is going. With the way AMD GPU's handle crunching, bitcoins etc they could really have something here in HPC setups.

  14. #39
    Xtreme Member
    Join Date
    Mar 2009
    Location
    Miltown, Wisconsin
    Posts
    353
    Quote Originally Posted by Evantaur View Post
    not bad :O, could be good for crunching, that's if the power consumption is low enough
    Pictures say it has a TDP of only 99 W.




  15. #40
    Xtreme Mentor
    Join Date
    Feb 2007
    Location
    West hartford, CT
    Posts
    2,804
    FX-8350(1249PGT) @ 4.7ghz 1.452v, Swiftech H220x
    Asus Crosshair Formula 5 Am3+ bios v1703
    G.skill Trident X (2x4gb) ~1200mhz @ 10-12-12-31-46-2T @ 1.66v
    MSI 7950 TwinFrozr *1100/1500* Cat.14.9
    OCZ ZX 850w psu
    Lian-Li Lancool K62
    Samsung 830 128g
    2 x 1TB Samsung SpinpointF3, 2T Samsung
    Win7 Home 64bit
    My Rig

  16. #41
    Xtremely High Voltage Sparky's Avatar
    Join Date
    Mar 2006
    Location
    Ohio, USA
    Posts
    16,040
    Quote Originally Posted by Buckeye View Post
    I think you are right on that. Seems that's the way AMD is going. With the way AMD GPU's handle crunching, bitcoins etc they could really have something here in HPC setups.
    The AMD fan in me keeps hoping for all this seeming nonsense to be the stepping stones needed to get to something pretty awesome. The cynic in me thinks they are just blundering along sometimes... Reality is probably somewhere in the middle

    I'm happy with my Sandy Bridge in my desktop, and my AMD Richland APU in my laptop. They both do their jobs quite well.
    The Cardboard Master
    Crunch with us, the XS WCG team
    Intel Core i7 2600k @ 4.5GHz, 16GB DDR3-1600, Radeon 7950 @ 1000/1250, Win 10 Pro x64

  17. #42
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    Quote Originally Posted by [XC] Lead Head View Post
    Clock to clock, Thuban is A LOT faster than Vishera, and is still faster than Kaveri. If AMD had beefed up Thuban's execution resources (More ALUs, two 256bit FPU, better cache, scheduler, etc..) they would have a CPU that would be competitive with Haswell. Build an improved Thuban on 28nm, and your heat and clock issues get addressed as well.
    Just to add to this, 2 cores of thuban is smaller than 1 module of vishera when both are on 32nm, aaand the IPC of thuban is still higher than vishera and steamroller. What I'm not sure of is whether 2 cores of thuban is smaller than 1 module of steamroller.

    That said, it didn't really stop me from supporting AMD (my laptops are AMD, and so is my main rig), but man, I can't recommend AMD to my friends. It wouldn't do them justice.

    In the back of my mind, I also think Bulldozer-based processors are all a necessary evil for future HSA, and that's where AMD is heading, but couldn't they have done it on Thuban instead? How hard is it to apply what they've learned from making all these APUs and Bulldozer-based processors back to the Stars architecture?
    Last edited by blindbox; 01-22-2014 at 07:30 PM.

  18. #43
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Location
    Hawaii
    Posts
    611
    AMD is making some sense. Screw TDP, 16 cores at $598 is fantastic. That's cheaper than 70% of LGA 2011 chips.
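As a quick per-core sanity check, here is a sketch using the two starting prices posted earlier in the thread. Pairing $377 with the 12-core part is an assumption on my side (the post only lists the two prices), and the function name is made up.

```python
def price_per_core(price_usd: float, cores: int) -> float:
    """Price divided by core count, in US dollars."""
    return price_usd / cores


sixteen_core = price_per_core(598, 16)  # "16 cores at $598", from the post
twelve_core = price_per_core(377, 12)   # assumes $377 is the 12-core model

print(f"16-core: ${sixteen_core:.2f} per core")
print(f"12-core: ${twelve_core:.2f} per core")
```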
    Xeon E3-1245 @ Stock | Gigabyte H87N-Wifi | 16GB Crucial Ballistix LP @ 1600Mhz | R7 260x | Much and varied storage

  19. #44
    I am Xtreme
    Join Date
    Dec 2008
    Location
    France
    Posts
    9,060
    Quote Originally Posted by Darakian View Post
    AMD is making some sense. Screw TDP, 16 cores at $598 is fantastic. That's cheaper than 70% of LGA 2011 chips.
    You have to consider the frequency, though...
    Donate to XS forums
    Quote Originally Posted by jayhall0315 View Post
    If you are really extreme, you never let informed facts or the scientific method hold you back from your journey to the wrong answer.

  20. #45
    Xtreme Addict
    Join Date
    Sep 2010
    Location
    Australia / Europe
    Posts
    1,310
    ^^ correct! I frequently want as many cores as I can

    that's my consideration regarding frequency
    In this world, is there some kind of transcendent "law" that governs human destiny... a "hand of God"? At the very least, man cannot freely control even his own will

  21. #46
    Xtreme Enthusiast
    Join Date
    Oct 2007
    Location
    Hong Kong
    Posts
    526
    Quote Originally Posted by blindbox View Post
    Just to add to this, 2 cores of thuban is smaller than 1 module of vishera when both are on 32nm, aaand the IPC of thuban is still higher than vishera and steamroller. What I'm not sure of is whether 2 cores of thuban is smaller than 1 module of steamroller.

    That said, it didn't really stop me from supporting AMD (my laptops are AMD, and so is my main rig), but man, I can't recommend AMD to my friends. It wouldn't do them justice.

    In the back of my mind, I also think Bulldozer-based processors are all a necessary evil for future HSA, and that's where AMD is heading, but couldn't they have done it on Thuban instead? How hard is it to apply what they've learned from making all these APUs and Bulldozer-based processors back to the Stars architecture?
    The large transistor count for a module is mainly due to the large 2MB L2 cache per module.

  22. #47
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    Quote Originally Posted by qcmadness View Post
    The large transistor count for a module is mainly due to the large 2MB L2 cache per module.
    Time to start the calculations again, shall we? As far as I remember, what I said does not include the L2 Cache so let's see.
    Bulldozer dieshot from here. http://www.anandtech.com/show/4955/t...x8150-tested/2
    Thuban = 346mm^2 at 45nm = 174.9649 mm^2 at 32nm assuming perfect shrinkage.
    Bulldozer = 315mm^2 at 32nm.

    I took the bulldozer dieshot, measured core size without the cache using Gimp by setting Printer size to scale with bulldozer diesize. Simple mathematics of scaling up and scaling down (with a square and square root in the mix), I get one BD module is 4.2mm by 4.4mm = 18.48 mm^2

    Then, I do the same for Thuban from this dieshot. http://www.anandtech.com/show/3674/a...1055t-reviewed

    One Thuban Core is 2.4mm by 3.1mm = 7.44 mm^2 . Two Thuban Core is 7.44*2 = 14.8mm^2. Compare this with BD module's 18.48 mm^2

    I've just shown you that two thuban cores are smaller than one BD modules, and that's without cache. With cache, the difference would be even bigger. Thuban gets higher IPC with less cache, and with smaller area.
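For anyone who wants to redo the arithmetic, here is a minimal sketch. It assumes an ideal shrink where area scales with the square of the node ratio, takes the measured dimensions above at face value, and uses variable names invented for illustration.

```python
def ideal_shrink(area_mm2: float, old_node_nm: float, new_node_nm: float) -> float:
    """Scale a die area between nodes, assuming area goes with the node ratio squared."""
    return area_mm2 * (new_node_nm / old_node_nm) ** 2


thuban_at_32nm = ideal_shrink(346, old_node_nm=45, new_node_nm=32)

bd_module = 4.2 * 4.4      # one Bulldozer module without L2, in mm^2
thuban_core = 2.4 * 3.1    # one Thuban core without L2, in mm^2
two_thuban_cores = 2 * thuban_core

print(f"Thuban die at 32nm (ideal): {thuban_at_32nm:.4f} mm^2")
print(f"BD module: {bd_module:.2f} mm^2, two Thuban cores: {two_thuban_cores:.2f} mm^2")
print(f"Difference: {bd_module - two_thuban_cores:.2f} mm^2")
```

Note that 2 x 7.44 is 14.88 exactly; the 14.8 figure above is rounded down. Real shrinks are never perfect, so the scaled number is a lower bound at best.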

    Now, back to the question, why didn't they continue with Thuban and abandon Bulldozer?
    Last edited by blindbox; 01-23-2014 at 07:04 AM. Reason: 45mm changed to 45nm

  23. #48
    Xtreme Enthusiast
    Join Date
    Oct 2007
    Location
    Hong Kong
    Posts
    526
    Quote Originally Posted by blindbox View Post
    Time to start the calculations again, shall we? As far as I remember, what I said does not include the L2 Cache so let's see.
    Bulldozer dieshot from here. http://www.anandtech.com/show/4955/t...x8150-tested/2
    Thuban = 346mm^2 at 45nm = 174.9649 mm^2 at 32nm assuming perfect shrinkage.
    Bulldozer = 315mm^2 at 32nm.

    I took the bulldozer dieshot, measured core size without the cache using Gimp by setting Printer size to scale with bulldozer diesize. Simple mathematics of scaling up and scaling down (with a square and square root in the mix), I get one BD module is 4.2mm by 4.4mm = 18.48 mm^2

    Then, I do the same for Thuban from this dieshot. http://www.anandtech.com/show/3674/a...1055t-reviewed

    One Thuban Core is 2.4mm by 3.1mm = 7.44 mm^2 . Two Thuban Core is 7.44*2 = 14.8mm^2. Compare this with BD module's 18.48 mm^2

    I've just shown you that two thuban cores are smaller than one BD modules, and that's without cache. With cache, the difference would be even bigger. Thuban gets higher IPC with less cache, and with smaller area.

    Now, back to the question, why didn't they continue with Thuban and abandon Bulldozer?
    1. Try using Llano as comparison
    http://www.anandtech.com/show/4444/a...apu-a8-3500m/2

    2. K10.5 does not scale well beyond 3GHz, as we can see at Llano, which is expected to reach 3.5GHz but can only reach 3.1GHz.

    3. Bulldozer does support AVX, FMA and SSE4.x, which are absent in K10.5

  24. #49
    Xtreme Member
    Join Date
    Oct 2009
    Posts
    146
    Quote Originally Posted by blindbox View Post
    Time to start the calculations again, shall we? As far as I remember, what I said does not include the L2 Cache so let's see.
    Bulldozer dieshot from here. http://www.anandtech.com/show/4955/t...x8150-tested/2
    Thuban = 346mm^2 at 45nm = 174.9649 mm^2 at 32nm assuming perfect shrinkage.
    Bulldozer = 315mm^2 at 32nm.

    I took the bulldozer dieshot, measured core size without the cache using Gimp by setting Printer size to scale with bulldozer diesize. Simple mathematics of scaling up and scaling down (with a square and square root in the mix), I get one BD module is 4.2mm by 4.4mm = 18.48 mm^2

    Then, I do the same for Thuban from this dieshot. http://www.anandtech.com/show/3674/a...1055t-reviewed

    One Thuban Core is 2.4mm by 3.1mm = 7.44 mm^2 . Two Thuban Core is 7.44*2 = 14.8mm^2. Compare this with BD module's 18.48 mm^2

    I've just shown you that two thuban cores are smaller than one BD modules, and that's without cache. With cache, the difference would be even bigger. Thuban gets higher IPC with less cache, and with smaller area.

    Now, back to the question, why didn't they continue with Thuban and abandon Bulldozer?
    18.48 - 14.8 = 3.68 mm^2, and that goes to AVX and SSE4.
    Bulldozer can reach 5.1 GHz(??) but Thuban is around 3.9 GHz(??)
    An overclocked Bulldozer gets a higher score than an overclocked Thuban.
    Last edited by behrouz; 01-23-2014 at 08:54 AM.
    CPU : Athlon X2 7850,Clock:3000 at 1.20 | Mobo : Biostar TA790GX A2+ Rev 5.1 | PSU : Green GP535A | VGA : Sapphire 5770 Clock:910,Memory:1300 | Memory : Patriot 2x2 GB DDR2 800 CL 5-5-5-15 | LCD : AOC 931Sw

  25. #50
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    1. 2.7mm x 3.6mm. 9.72mm^2 per core, and that's 19.44mm^2 for two cores. I guess you're quite right.

    2. K10.5 scaled to 4.1~4.2 GHz on Thuban. Llano has a graphics core attached to it. It's also the first iteration of APUs. And they're also AMD's first 32nm processor.

    3. Nothing stopping them from improving K10.5 to have said instruction sets.

    Also, with regards to 1, is saving 4mm^2 of space worth the IPC decrease? What we still don't know is whether the direction AMD is going would have been possible if Stars had been improved over and over.

    @behrouz: What sort of cooling for 5 GHz?
    Last edited by blindbox; 01-23-2014 at 08:58 AM.
