AMD will charge based on position. They may still undercut Intel, but if they can charge more for a chip, they will.
If an 8 core part is available at launch, that's what I'm buying. :)
Heh, me too, but I want the 4c and 6c Zambezi as well :D
Ohhh. That makes sense. It was before my time using computers though, so I had no idea. :-/
Soooo - Has anyone else noticed that the Crosshair IV boards have one extra pin over the other AM3 boards?
Here's secretly hoping it will be 'Bulldozer compatible'! ;)
extra pin where?
There's an extra pinhole on the Crosshair IV motherboard's CPU socket. I compared it to my M4A89GTDPro. (The socket that the actual CPU sits in)
I was trying to figure out how to install the CPU when I switched boards, as something wasn't right. I can't remember now, sorry guys. It might have been the other way round: that the X6 CPU had one pin missing which wasn't used on the board... I'll be sorting my PC out again soon, so I will report back then.
Here's an image of the AM3+ socket from MSI.
The extra pin is in the top left group of missing pins. All 4 groups of missing pins on AM3 have 2 pins missing; this one has just one.
http://www.techreaction.net/wp-conte...1/IMG_1007.jpg
It would be nice if it was currently 100% AM3+ compatible.
But they might just be designing and building the board so that they can make minor changes, update the chipset, and call it a Crosshair V.
No, it has been said 100x: Zambezi is AM3+ only!
:shrug:
At the time, Intel only had 4 cores with 8 threads. What's wrong with 6 cores that come near the performance of Intel's 8 threads to keep up? Sounds like a great middle solution to me.
If Bulldozer is as fast as LGA1156 (i7 8xx) single threaded CPC, and will overclock past 4.5-4.6 Ghz air/water with all 8 cores, I see it being a competitor.
I mean, how the heck is Ivy Bridge going to compete with Bulldozer? Forget the fact that Bulldozer will be a whole manufacturing process behind; Ivy Bridge would still need 5.5-6 GHz on air to beat it multi-threaded. That's 8 cores vs 4...
Haswell won't be here until 2013 :ROTF: so I believe AMD is looking at keeping the performance crown for a solid year and a half :confused:
I have read on a Chinese website that AMD is going to change the nomenclature of its upcoming CPUs.
Zacate is E-series, Ontario is C-series
Llano is A-series & Zambezi is FX-series
if you ask for the source: http://www.chiphell.com/thread-161958-1-1.html
It might mean that the FX name is not an indication of performance, just a simple renaming.
Right... it will be something with a number: C50, E350, A550 etc...
Well, it's not only a naming scheme, but it's also the best performing processors in the line-up. Coincidence? I think not. ;)
http://ht4u.net/news/23365_neuer_ben..._core_i7_980x/
Fake?
I believe that BD can be, in the best case, 10-20% better than the 980X.
But 65%....??? would be unimaginable...
Suspicious sources = fake by default. At least in BD's case.
February 13th, 2011...
still no performance info on Bulldozer :brick:
We have told everyone: benchmarks at launch.
Sooo I read on expreview bulldozer isn't coming out until 2012 now? Whaaa?
Intel can do it, why can't AMD?
http://www.intel.com/p/en_US/product...specifications
I thought BD was on 32nm...
Quote:
The Expreview blurb is about AMD transitioning all products to a 32nm process by Q2 2012, not that Bulldozer isn't coming out until 2012.
What are you guys talking about? Both Llano and Bulldozer will launch at 32nm. I assume that the Phenom/Athlon II brands won't immediately go EOL at that point, and will still be manufactured for a while. Also, the Brazos platform will remain at 40nm until its refresh in 2012 (22/28/32nm, I have no clue).
I'm about as tight-lipped as it gets and the furthest from breaching NDA, not to mention God knows how many NDA forms I've signed, and I can honestly say I have no samples.
What I can say is that the few samples out there are not indicative of the final product, so if someone does leak info it's likely useless anyway.
AMD is playing this one close to the chest.
Well i guess this is the big challenge,if AMD fails to deliver some very good performance on the scene,then the end is not so far away...I srsly hope they will make a good generation...it's about time :)
Maybe there are some pre-production samples by now (end of February, though it's only mid-February :-D), hm?
http://www.marketwire.com/press-rele...MD-1394828.htm
little bit off topic,but ... 12 cores/2.5GHz/105W ACP@45nm ... :up:
Guys, do you think that the best BD will be better than a 980X-990X?
I mean, can it be better than any Gulftown?
I think in multithreading, sometimes yes. I believe in rendering it will be the new monster for some time :)
chiphell has released some more info:
http://translate.googleusercontent.c...N9h-t6gQU5Yv0A
Puts a 3.5GHz Zambezi on par with a 2600K @ 4GHz! Here's hoping this is a sign of things to come :)
Since when was being in the x86 CPU biz and having Intel as the main competitor not been a challenge?
I think people have been saying that about AMD since they started out but they are still here! As long as AMD keeps delivering bang for the buck I don't see any end in sight for them.
Quote:
if AMD fails to deliver some very good performance on the scene,then the end is not so far away...
AMD has had many great processors over their lifetime and at fair prices which I expect BD to be, it doesn't need to be the fastest and I doubt it will be.
Quote:
I srsly hope they will make a good generation...it's about time :)
AMD can't vanish because this would mean Intel's monopoly... Even if they dragged their feet (and they don't), they would still be around.
If a stock 3.5 GHz Zambezi performs like a 4 GHz SB 2600K, it will be great!
OK guys, now that we have almost all the details of the design, with rumored clock speeds, IPC jumps, front-end penalties and Turbo clock information, I'm pretty sure the Donanimhaber slide is genuine :). I have no idea if it was AMD who actually made the slide, but the speedups match almost exactly the numbers derived from various other sources on the net (rumorpedia's, scarletwh*re, server speed-up slides for MT workloads).
Summary by me, purely hypothetical of course (for a hypothetical 3.5GHz Zambezi with a 4.2GHz integer Turbo mode :) ):
Single thread pure integer desktop workloads:
- around 12 to 15% IPC improvement (too complicated to explain how I got to this particular range :) ),
- around 13% effective clock improvement: 4.2GHz BD Turbo vs 3.7GHz Thuban Turbo,
- cumulative speedup of ~30% versus Thuban in this type of workload, on average (can be higher or lower, depending on the application and whether it's ALU bound or memory latency sensitive).
Single thread FP workloads (this covers, in my opinion, integer SSE too):
- around 40%-50% better per-clock performance than one Thuban core; this is one 128-bit FMAC/IMAC vs the FP part of one Thuban core,
- around 13% effective clock improvement: 4.2GHz BD Turbo vs 3.7GHz Thuban Turbo <- Turbo should kick in when a single-threaded workload is detected, even if it is floating point (power draw intensive),
- cumulative speedup of ~60% versus Thuban in this type of workload, on average.
Multi-threaded integer workloads (SSE in MMX/IMAC pipelines):
- 50-80% faster than Thuban 1100T
Multi-threaded floating point workloads (FP ops in FMAC pipelines):
- 50-80% faster than Thuban 1100T
All of this equates to a total of around 40-50% faster than Thuban 1100T @ 3.3GHz, or around 15-20% faster on average than a 3.33GHz Westmere 6C in client workloads.
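For anyone who wants to redo the single-thread arithmetic from the summary, here is a minimal Python sketch. It assumes, as the post does, that a per-clock gain and a clock gain simply multiply. The helper name `combined_speedup` is made up for illustration, and every input is a rumored or guessed figure from the post, not a measurement.

```python
# Hypothetical helper: combine a per-clock (IPC) gain with a clock-speed gain.
def combined_speedup(ipc_gain, new_clock_ghz, old_clock_ghz):
    """Return the total speedup as a multiplier (1.30 means 30% faster)."""
    clock_ratio = new_clock_ghz / old_clock_ghz
    return (1.0 + ipc_gain) * clock_ratio

# Guessed Turbo clocks from the post above, not confirmed specs.
BD_TURBO_GHZ = 4.2
THUBAN_TURBO_GHZ = 3.7

cases = [
    ("integer, low IPC guess", 0.12),
    ("integer, high IPC guess", 0.15),
    ("fp, low per-clock guess", 0.40),
    ("fp, high per-clock guess", 0.50),
]

for label, ipc_gain in cases:
    speedup = combined_speedup(ipc_gain, BD_TURBO_GHZ, THUBAN_TURBO_GHZ)
    print(f"{label}: {speedup:.2f}x vs Thuban")
```

With these inputs the integer case lands at roughly 1.27x to 1.31x and the fp case at roughly 1.59x to 1.70x, so the ~30% and ~60% in the summary sit at or near the low end of each range.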
I also think that the leaked AMD Terminator slide (FX, "I will be back") is not without merit :).
You're mentioning 15% IPC improvement. That's very different from 50% I quoted, LOL. 15-25% is not unreasonable to expect, IMO. 50% is just a crazy number.
It may look that way, but Interlagos with a 10% higher clock should have an 82% higher fp rate score than MC. This equates to ~25% better per core performance (1.82/1.33/1.1), or 1.25x, in a multithreaded workload where we have a penalty due to the shared front end (which can be as high as 1.25x, or 25%, given the "80% of CMP design" claim => 100/80=1.25 ;) ). Now if we apply that 25% penalty on top of the previous 1.25x over MC we get 1.25x1.25=1.56x, or 56% better in a single-threaded fp workload. Then you have Turbo mode kicking in; my guess was 4-4.2GHz, which gets you another 13% or 1.13x: 1.56x1.13=1.76x, or 76% faster :).
All of this is speculation on my part, but I'm almost 100% sure this is how it will end up performing. Even if I have a decent margin of error in my estimation, it will still trounce Thuban and be on par with the highest end Westmere.
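The chain of ratios in that post is easier to check step by step in code. This is only a sketch of the poster's own reasoning: the function name is hypothetical, the 0.80 factor is the "80% of CMP" claim read as the worst-case sharing penalty, and the 1.13 Turbo ratio is the poster's guess.

```python
def single_thread_fp_estimate(rate_ratio, core_ratio, clock_ratio,
                              shared_penalty, turbo_ratio):
    """Return (MT per-core gain, ST gain without penalty, ST gain with Turbo)."""
    # Strip the extra cores and the extra clock out of the aggregate score.
    mt_per_core = rate_ratio / core_ratio / clock_ratio
    # Assume the shared-front-end penalty disappears when one thread runs alone.
    st_no_penalty = mt_per_core / shared_penalty
    # Apply the guessed Turbo uplift on top.
    st_with_turbo = st_no_penalty * turbo_ratio
    return mt_per_core, st_no_penalty, st_with_turbo

# Inputs taken from the post: 82% higher fp rate, 16 vs 12 cores, 10% more clock.
mt, st, st_turbo = single_thread_fp_estimate(
    rate_ratio=1.82, core_ratio=16 / 12, clock_ratio=1.10,
    shared_penalty=0.80, turbo_ratio=1.13)

print(f"MT per core: {mt:.2f}x, ST: {st:.2f}x, ST with Turbo: {st_turbo:.2f}x")
```

Without rounding between steps this comes out at about 1.24x, 1.55x and 1.75x, a touch under the figures you get when each step is rounded up. The whole result hinges on the 0.80 penalty being the real worst case, which is the weakest assumption in the chain.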
Or if you want an aggregate estimate of Zambezi 3.5Ghz model,all workloads included you can look at this chart:
http://www.hardware.fr/articles/815-...dy-bridge.html
Thuban has 164.1 points. 164.1 x 1.5 = ~246pts. An SB hexacore @ 3.4GHz/3.8GHz should have ~240-245pts and an SB 8C @ 3.33/3.8GHz should have 262pts :D. I derived the potential SB 6C score via an IPC+clock improvement of 11% over the 3.33GHz Westmere 6C, while the 8C score came from a 10% added aggregate improvement from 33% more cores vs the 6C SB score @ 3.4GHz, with a correction for the 3.33GHz vs 3.4GHz clock speed :D. The 10% aggregate improvement was derived from the improvement the 6C Westmere sees in client apps vs the 4C Nehalem at the same clock, based on the hothardware review of Thuban on the same page for average results (using the ratio equation 50% more cores : 16% improvement for Westmere = 33% more cores : x improvement, while the IPC improvement was already accounted for in the SB 6C 246pts score :) ).
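The ratio equation at the end of that paragraph can be written out as a small Python sketch. The helper `scale_gain_by_cores` is a made-up name, and the linear proportion is the poster's simplification, not an established scaling law. The 164.1 score and the 240-245 range are the figures quoted in the post.

```python
def scale_gain_by_cores(known_core_gain, known_score_gain, new_core_gain):
    """Solve known_core_gain : known_score_gain = new_core_gain : x for x."""
    return known_score_gain * new_core_gain / known_core_gain

THUBAN_SCORE = 164.1  # aggregate score read from the linked chart

# Zambezi guess: Thuban plus the assumed 50% aggregate improvement.
zambezi_estimate = THUBAN_SCORE * 1.5

# 4C -> 6C (50% more cores) gave about 16%; estimate 6C -> 8C (33% more cores).
gain_8c = scale_gain_by_cores(0.50, 0.16, 1 / 3)

# Hypothetical 8C SB score from the 6C guess, corrected for 3.33 vs 3.4 GHz.
for sb_6c in (240, 245):
    sb_8c = sb_6c * (1 + gain_8c) * 3.33 / 3.4
    print(f"6C SB {sb_6c} pts -> 8C SB about {sb_8c:.0f} pts")

print(f"Zambezi estimate: {zambezi_estimate:.0f} pts, 8C gain: {gain_8c:.1%}")
```

The proportion gives a bit under 11% for the 8C step, which the post rounds to 10%. Real client applications rarely scale linearly with core count, so treat the 8C figure as an upper-ish guess.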
Zambezi @ 3.5GHz, in that chart, will slide right in between the 8C 3.4GHz SB and the 6C 3.4GHz SB. Whether Intel will manage to make an 8C 3.4GHz SB desktop chip is another question (I think they can).
Summary of the potential hypothetical Zambezi X8 @ 3.5GHz: a tiny bit faster than 6C SB @ 3.4GHz and 7% slower than 8C SB @ 3.33GHz.
Let's see how it turns out. You have my post here :D.
Wait, what are you saying? IPC is Instructions Per Clock (Cycle). It has nothing to do with CPU frequency whatsoever. Or did you come up with "Instructions Per Core"? :confused:
I'll read up your post again, give me a few mins.
Read it again, both posts. I tried to factor in IPC, clock, the penalty that hits MT workloads and the improvement when there is no penalty (single thread workload).
IPC is instructions per cycle, or the core logic level improvement vs the older core.
OK. In regards to my comment:
I posted this after you mentioned "~25% better per core performance" (when it should be per clock; per core would be 10% higher) and pulled Turbo mode in.
Your logic is sound, I can give you that. :up:
A few issues, though. The biggest one is that you're taking the numbers from the slide with some relative rating which will not necessarily be absolutely true. The second one is that you're going for the maximum possible penalty, and that might not be a very good indication (especially when it comes to a marketing slide and we have no idea what benchmark was used).
Let's hope you're correct, though! :)
Regarding the IPC improvement and why I doubt it's going to be actually true...
Single threaded Cinebench is a good indication of FP performance, right?
http://images.anandtech.com/graphs/s...2107/24403.png
If your 56% calculation is correct then BD @ 3.2GHz would score 3951*1.56=6164, which is [(6164*31/32)-5405]/5405 = 10% faster than SB clock per clock (single threaded FP workload). It wouldn't be just on par with Westmere, it would blow away SB by 10% (which is more than the difference between two generations of Core i7). Which is pretty crazy if you ask me.
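That sanity check can be reproduced with a short Python sketch. The function name is hypothetical. The two scores are the ones quoted from the linked chart, and the 3.1 GHz clock for the Sandy Bridge part is an assumption read off the 31/32 factor in the post.

```python
def clock_normalized_advantage(base_score, assumed_gain, est_clock_ghz,
                               rival_score, rival_clock_ghz):
    """Fractional lead of the estimated chip over the rival at the rival's clock."""
    # Score the estimated chip would reach at its own clock.
    est_score = base_score * (1.0 + assumed_gain)
    # Scale linearly to the rival's clock to compare clock for clock.
    est_at_rival_clock = est_score * rival_clock_ghz / est_clock_ghz
    return est_at_rival_clock / rival_score - 1.0

advantage = clock_normalized_advantage(
    base_score=3951,      # 3.2 GHz AMD baseline score quoted in the post
    assumed_gain=0.56,    # the disputed 56% single-threaded fp gain
    est_clock_ghz=3.2,
    rival_score=5405,     # Sandy Bridge score quoted in the post
    rival_clock_ghz=3.1,  # assumption, implied by the 31/32 factor
)
print(f"Estimated BD lead over SB, clock for clock: {advantage:.1%}")
```

This lands at about 10.5%, matching the roughly 10% in the post. Linear scaling with clock is itself an approximation, though a reasonable one for a single-threaded Cinebench run.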
Thanks for the feedback :). I agree that I used the worst penalty possible, which in turn gives the best ST performance. This might be very wrong in the end :). As for the slide, it represents estimated performance with unknown compiler settings. This may skew the results somewhat, but not by much IMO. We already know that the whole FP unit (2xFMAC) should be more powerful, by a good delta, than 2x K10 FP units. Add in 33% more cores and a higher clock of 10% or so for server parts (without Turbo, since an fp workload should not permit the clock jump) and you end up with a very big jump (~80%) in aggregate score for a whole 16C Interlagos chip vs a 12C MC chip. This aligns pretty well with Donanimhaber's slide, which shows Zambezi X8 at an unknown clock scoring almost 88% higher than Thuban 3.3GHz in the CB11.5 benchmark *. 3.5GHz for the top end is my estimate, judging by known info from ISSCC. So we have a massive uplift in fp score for the 8 core Zambezi in a commercial workload that does not see AVX or FMA extensions at all ;). Just pure SSE/SIMD.
To conclude, I think the numbers will fall in this ballpark. I may be off by a huge mark, who knows. But everything I have seen so far points in the direction of a 50% better "aggregate" score, which takes into account all of the possible workload scenarios (the worst and the best).
* It seems like server parts get the same core count advantage and a similar clock advantage over MC as Zambezi will get over Thuban: 33% more cores and ~6-10% more clock, not counting Turbo.
Yeah, it may look crazy high, but the thing you're missing is that one BD core will have access to the full 2xFMAC unit, each of which is 128b wide and each of which can do either add or mul, meaning consecutive adds or muls, a feature no x86 core can do today. If the single-threaded app is add or mul limited, then you have 2x the resources of a SB/Westmere/Thuban core in theory. Since Westmere/SB perform better than Thuban per clock, the difference comes either from ISA extension support or better scheduling. Bulldozer should, as I mentioned, have 2x the flexibility and 2x the load/store BW of a Thuban core when you run a single-threaded fp workload. It will, on top of all that, have a clock speed advantage. Does "56% higher than Thuban in ST fp workload" sound like a lot now, from this perspective ;) ?
56% single threaded IPC (FP or not) improvement looks scary in any case. This is a huge jump, especially for single threaded performance (which is very difficult to scale).
The improvements you mentioned above will no doubt help single threaded performance. To what degree, I don't know.
I think they're more targeted at multi-threaded workloads, though, being some sort of "hardware HT" implementation.
I really don't want to speculate. I'll just wait for some benchmarks. :)
I somewhat agree, but it's sooo fun to speculate :D. Like I said, BD's improvements might look impossible from today's perspective, but the core is a radical departure from everything we have known thus far. It's like Sun's Niagara on crack, featuring OoO cores instead of in-order ones and a super powerful FP unit (with a ratio of 1:2 instead of 1:8 in Niagara's case). Add in very aggressive prefetch and you have an 8 core design that can kind of "morph" according to workload :D. It uses many different techniques: a shared front end to fill in bubbles (SMT's advantage!), full cores instead of shared cores (adding more performance over the traditional SMT approach), a huge L2 benefiting both hardware threads or even a single thread, a "fat" 256b FMAC FP unit that can "morph" into one or two units according to workload (ST or MT via SMT!), aggressive power gating features and a flip-flop design which in turn allow much more aggressive clock throttling and clock uplift (according to workload), overall improved integer cores that have a unified scheduler for mem/ALU ops instead of separate schedulers and shared pipelines, complete ISA support etc.
All in all, the design is a radical departure from anything we have seen thus far. I personally think it's a winner. Whether it is or not, we have to wait a few more months I guess :).
thanks for the breakdown informal, great info
http://www.xtremesystems.org/forums/...&postcount=210
Quote:
I truly believe 10h will spank Conroe/Penryn across the board in 1P configs (including int besides the fp domination) and all that while consuming less energy and working at lower clocks.
:rolleyes:
Nice to see you follow me Olivon :). Actually, Barcelona was a fail only in the clock domain. On server workloads it did better, even on 1P, than FSB based Conroe MCM chips. Intel did crank up the clocks on Penryn and in the end made up the difference. If you look at Shanghai vs Penryn, both on 45nm, there is no contest actually. Shanghai is a clear cut winner in int and fp MT workloads.
Opteron 2389( 2.9Ghz,2P,45nm)
spec int rate 141
spec fp rate 121
Intel Xeon X5450( 3.00 GHz,2P,45nm)
spec int rate 130
spec fp rate 74.1
2.9Ghz Shanghai is 8.4% faster than 3Ghz Penryn in spec int rate 2006 (12% faster per clock)
2.9Ghz Shanghai is 63% faster than 3Ghz Penryn in spec fp rate 2006 (68% faster per clock)
I rest my case :)
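The raw and per-clock percentages above are simple ratios, so they are easy to double-check. A minimal Python sketch using only the scores and clocks quoted in the post (the function name `advantage` is made up for illustration):

```python
def advantage(score_a, clock_a_ghz, score_b, clock_b_ghz):
    """Return (raw lead, per-clock lead) of chip A over chip B as fractions."""
    raw = score_a / score_b - 1.0
    # Divide each score by its clock before comparing.
    per_clock = (score_a / clock_a_ghz) / (score_b / clock_b_ghz) - 1.0
    return raw, per_clock

# 2P scores quoted above: Opteron 2389 @ 2.9 GHz vs Xeon X5450 @ 3.0 GHz.
int_raw, int_per_clock = advantage(141, 2.9, 130, 3.0)
fp_raw, fp_per_clock = advantage(121, 2.9, 74.1, 3.0)

print(f"int rate: {int_raw:.1%} faster, {int_per_clock:.1%} faster per clock")
print(f"fp rate:  {fp_raw:.1%} faster, {fp_per_clock:.1%} faster per clock")
```

This gives roughly 8.5% and 12% for int rate and roughly 63% and 69% for fp rate, in line with the quoted figures. Keep in mind that SPEC rate is a throughput metric, so dividing by clock is only a rough per-clock comparison.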
Interesting read.
To all the naysayers: look at what AMD did to Intel back in the P4 days.
Now I know that it was Intel's decision to go down the NetBurst path, but I can see AMD doing the same thing again.
In case Olivon prefers 1P scores I can provide those too :D :
Opteron 2389( 2.9Ghz,1P,45nm)
spec int rate 72.5
spec fp rate 60.4
Intel Xeon X5450( 3.00 GHz,1P,45nm)
spec int rate 70.2
spec fp rate 41.2
2.9Ghz Shanghai is 3.3% faster than 3Ghz Penryn in spec int rate 2006 (6.8% faster per clock)
2.9Ghz Shanghai is 47% faster than 3Ghz Penryn in spec fp rate 2006 (51% faster per clock)
Intel Xeon X5450 release date : Q4'07
Opteron 2389 release date : Feb 23, 2009
Intel Xeon X5550 (Gainestown) release date : Q1'09, Mar
AMD does a great job in the server segment, but Intel seems really too fast in execution.
And don't get me wrong, I really hope AMD will deliver too ;)
That was just a reminiscence from the past :D
I said 10h Vs Conroe/Penryn and I provided the results. I didn't say 10h VS Nehalem,did I ? ;)
Well, been playing around with pictures and numbers
Now I'm out of :coffee:
Hopefully to scale! (Bulldozer is interesting to scale given the quoted size. I don't think it is supposed to include the wasted space (and small amount of logic) around the L2 cache, which is what I've run with. Not doing so leaves a large discrepancy in L2 cache density when you look at it next to Llano.)
http://img543.imageshack.us/img543/6...isoncrp7cr.png
Thanks for the picture mAJORD!
If it's really true, it will be more than awesome! I hope so too :). I'm really sure about a Cinebench R11.5 score of about 9.5-10 points. But I have no idea about single-threaded performance.
That 213M figure was including L2
module logic transistor count could be + - 10M!
Interlagos 2P in video http://www.youtube.com/watch?v=4v07kzah91A
that guy installs and swaps processors for a living, pretty sure he knows what he is doing
C'mon guys.... The dude was just doing a quick swap to show the number of cores running. :shakes:
It's not like he was setting up a permanent system he was intending to OC.
There was still paste on the HSF and the chips. I'm sure it was sufficient for a quick test.
OT: Pretty sweet to at least see some 16 core action! :up:
Yes, the video was about quickly swapping CPUs, not doing it with love the way we do :)
That guy probably swaps a dozen CPUs in a day. This was just a demo showing how easy it is to upgrade a G34 system with Interlagos.
I've switched CPUs many times in my own system without replacing the TIM :shrug: For example, when my short patience runs out because I can't find my tube, I just don't give a damn. Sure, I gain like 3-4 degrees C when I skip the new TIM, but it's not the end of the world.
Fixed :ROTF:
No seriously they do run that cool.......
Got ya:p:
:D
That's some sweet stuff right there. Also, those processors are huge! I mean, there is not really a good way to make 16 cores "small," but damn! Nice chips for sure though. I wonder, from a many-chip computing stand point, what kind of performance increase you see going from 2x12 to 2x16.
My 4184s idle at around 22C at 2800MHz
Edit: Using Thermalright 120s
It's actually not even 2 more cores. Since the module is used because it's 90% as good as a core in less space, in reality if Thuban is 6 cores, then BD is just 7.2 more complex cores.
If it wasn't for all that "blank" space between cores, BD should have been able to fit in like 280mm2, but I think it's coming out closer to 340mm2.