http://www.hpcwire.com/hpcwire/2011-...ig_supers.html
Fiery, the developer of the popular AIDA64, said that the whole BD microarchitecture is crappy and that the 8-core Zambezi can't even compete with the 6-core Thuban in some cases.
He wouldn't be surprised if Zambezi is never launched. It looks like they have an Interlagos sample.
link
From OBR's test results, which I've always believed were genuine, Bulldozer doesn't perform that badly; it should beat Thuban in most cases.
Oliverda, he is comparing it to SB-E and not SB. Did you think some pseudo 8-core (more like 4-core + HT) could compete with a $500+ 6-core chip (12 threads)? I wouldn't mind, but it was unrealistic from the start.
here is something interesting:
http://forums.anandtech.com/showpost...25&postcount=2
You can see the difference, so if the AIDA author has another Interlagos, as you said, it may perform like this one from September rather than the model tested in July, so the reality may differ quite a bit.
By the way: I didn't read everything in the original. I was too lazy, and it's quite a hard language to read, but it didn't look like he knows any final numbers; he is just saying BD won't be a messiah destroying Intel's lineup with sheer power.
That's what I thought and said at #2133. The 8120 was scheduled for 2012 Q1, so why is it suddenly ahead of schedule and with lots of problems? They're accelerating the respins and changing the specs more frequently than before. I'm afraid AMD is playing a renaming game with us, and the final naming scheme and specs will differ. Please, everyone, don't focus only on benchmarks.
http://www.bug.hr/_cache/bcf715ee2a6...8f8ed6aa70.jpg
This vr-zone test is "sh1t"... No real performance, because in wPrime it is slower than an Athlon X4 with much lower clocks, and in R11.5 it is slower than an X6 1055T.
Someone deleted my thread in VRFORUMS (English version), saying that they need to clarify with AMD what's wrong with the result. Maybe some L3 prefetching and CnQ microcode issue in the BIOS?
see the screen shot.
http://i54.tinypic.com/23keoap.jpg
The issue with previous stepping seems related to power and only power, not internal µarch.
Anyway, I just got my hands on a final, retail CPU with a retail box. The box art is similar to the ones leaked some months ago, except they changed the colors and replaced the LGA CPU with a PGA one on the front. About performance, nothing changed. So, if this CPU isn't at "shipping performance", the fix will come post-launch. And that would suck a lot.
Jack, it's rather simple. If you run a single thread on a module that is SIMD-heavy, all FPU resources (2x FMAC) will be dedicated to one core. Then, if you run the same test but MT, across all 4 modules, all 8 cores will share 4 FlexFPs (in total 4 256-bit units). You will have scaling from ST to MT akin to scaling from a single core to a QC, only this time your single-thread results SHOULD be very high, as you are using one double-sized, very powerful FPU (the whole FlexFP within a module). Scaling is better than a pure 4x, since now SMT works within the FlexFP as 2 cores per module share 2 128-bit FMACs, and this improves performance additionally.
The problem, however, is that one FlexFP is somehow slower than one Thuban core.
@xsecret
Well, vrzone just ran the same chip on 2 different motherboards, xsecret. Guess what: on different boards, the same chip performed with an 80% delta... So it seems it's a firmware problem.
Well, all those "performance is not final! not shipping performance! don't trust leaks!" claims now seem bullshat and a way for AMD to prevent leaks. I have had the same performance with the B2 chip for weeks and, more importantly, the benchmarks published under NDA by AMD itself are really close to that.
That smells really really bad ... :shakes:
Ignore all those "tests", it's just pure bull:banana::banana::banana::banana:. Compared to Thuban, the single-thread performance is higher, and so is the multithread. Of course it's not slower.
The FPU is doubled up from the K10 generation. Intel also doubled the FPU with Sandy Bridge, but with the same core count. Also, I would call Bulldozer a 4-core design with 8-thread execution capability, so the 4 256-bit FPUs match the CPU well.
AVX will only use 256-bit mode, so in all other situations, the flex-fp should shine, and of course with a lot of optimization.
Bulldozer will rock; wait for the final release. Those tests are all fake or done with an earlier ES CPU and are not representative.
Well I don't know what AMD is showing in their NDA docs but if this is the final performance then it's pretty bad indeed. They barely match their previous six core product.
All I know is that the chip vrzone ran on 2 different boards performed very differently. This could be due to different power delivery to the CPU, which in turn just limits it clock-wise and forces it to throttle down. The "better" of the two results posted (the one in their forums and not the one on the homepage) shows the 3.1Ghz 8C to be roughly 25% slower than Thuban in wPrime. That still makes no sense if you ask me. The latest SiSoft leak shows Interlagos having at least 30% better results per clock and per core than MC in SIMD-heavy workloads. I don't know whether they optimize for FMA, but if they did just AVX, then the results don't change much where Bulldozer is concerned (roughly 10% higher versus SSE2/3).
@JPQY
Link. The homepage results, which were even worse with the same CPU (on a different motherboard), are now gone.
FlexFP can do 2x128-bit FMUL or 2x128-bit FADD or 1x256-bit FADD or FMUL per cycle. Simultaneously it can execute up to 2x128-bit FP SIMD + 2 int/ALU SIMD instructions per cycle.
The throughput of 128-bit SSE or AVX FADD or FMUL instructions is 2x higher for a module than for an SB core. Combined 128-bit FADD + FMUL has the same throughput as an SB core, and combined 256-bit AVX FADD and FMUL has half the throughput of an SB core.
With FMA4 and XOP, using one 256-bit instruction, the two 128-bit FlexFP units can execute a 256-bit FADD and a 256-bit FMUL per cycle - same as an SB core with AVX.
Per core, with FMA4, one 128-bit FlexFP unit has the same throughput as a K10 core's FPU. The throughput of FADD or FMUL type instructions is also the same, but in combination a K10 core can execute up to two 128-bit SSEx instructions. In various combinations the K10 FPU is slower than one 128-bit FlexFP unit, but in extreme conditions it is faster.
The same code could run faster with FMA4 than with AVX because one instruction is issued for two different arithmetic operations, which are executed in a single unit. Rounding accuracy is also better with FMA4, and there is less register pressure because FMA4 is a four-operand instruction set. On BD, FMA4-optimised code could be much faster than AVX.
It can, if the other unit is not busy, because the FlexFP is shared. It can also do two additional bitwise ALU FP operations with the two other units (pipe 2 and pipe 3). But if the units are fully utilised, one core can use only one FMAC and one MMX unit.
FlexFP should shine in 256-bit mode with XOP and FMA instructions.
Quote:
AVX will only use 256-bit mode, so in all other situations, the flex-fp should shine, and of course with a lot of optimization.
:D
Quote:
Bulldozer will rock; wait for the final release. Those tests are all fake or done with an earlier ES CPU and are not representative.
AMD is trapped by its own bullshat. If you compare an FX-6100 (6-core @ 3.3 GHz according to AMD marketing terminology) with a 1100T (6-core @ 3.3 GHz), the 1100T will, of course, be faster. Why? Because the FX-6100 is a 3-core CPU w/ CMT and µarch tweaking. It just can't compete with a real 6-core CPU, even one with an old arch. You can translate the case to Intel to understand how stupid it is. If you compare a Core i5 750 (4 cores, 4 threads, 2.66 GHz, 1st gen arch) with a Core i3 2120T (2 cores with SMT, 4 threads, so "4-cores" by AMD terminology, 2.60 GHz, 2nd gen arch), the i5 750 will, of course, be much faster in a multithreaded environment, despite the presence of 4 threads in both cases.
The only problem with the SMT analogy is that in Intel's case you have the same number of execution units being shared by 2x the thread count. In AMD's case you have a dedicated hardware part in the silicon that takes care of each thread. So your "3C 6T CMT" chip in reality does have 6 integer cores and 6 FMAC units. None of those are shared between 2 threads (yes, I said 6 FMAC units, not 3 FlexFP units; there is a difference). So to sum it up: in AMD's case each thread has dedicated hardware (int and fp) in the chip which is equal to a core (more or less, who cares if the scaling is 1.8x), while in Intel's case each core is shared between the 2 (weak) threads.
Now that the difference is clearly defined, it really doesn't matter much for AMD if they have dedicated HW that runs the thread IF it does so at subpar performance compared to their previous design (Thuban). It will be regarded as subpar even to K10, let alone SB or Nehalem.
informal
First of all, the integer cluster and FPU are not the only things that make a core what it is; without the rest you can't do anything.
And as for your dedicated hardware parts:
HT uses the integer units because their ALUs aren't utilized at max.
BD shrank the number of ALUs but made a dedicated integer cluster in a core because it doesn't use HT.
If you want to use AVX, then you can't say it has a dedicated FMAC per core, only per module, so your point about dedicated hardware is flawed.
If you want a real 8-core and not just a 4-core + CMT (cluster-based multithreading), get an 8-module Interlagos and deactivate the second integer cluster in every module. That would be a true 8-core and not this hybrid.
freeloader, basically you are right. If it has the performance, then I wouldn't care even if they named it based on the number of ALUs.
You just can't compare the hypothetical maximum bandwidth of a compute unit with a real, complete core. The term "core" as used by AMD is nonsense. What's a core? A core includes dispatch, prefetch, decoding and execution units. A core is not a cluster of integer ALUs/AGUs or an ALU alone. You have only 4 FP schedulers for an "8-core" Bulldozer, not 8. You have 4 I-caches, not 8. 4 blocks of L2 cache, not 8. The hypothetical maximum throughput of a unit is great information, but it is not representative of actual, real performance. If you don't feed the engine with a good decoder/dispatcher, the FP units will suck. They tried to copy the CMT architecture built by Alpha (Alpha never called that a "core"), but they failed to implement it correctly due to .... well, I should shut up now :) This said, a big problem in BD µarch will come from the Integer unit.
Even in AVX you have dedicated HW per thread. In 128-bit AVX mode you have one FMAC doing one 128-bit AVX instruction. In 256-bit mode, when both cores have scheduled 256-bit FP instructions on the FP co-processor (the so-called FlexFP), you have one 256-bit instruction being done in 2 cycles for each core, that's all. So my logic is not flawed. Threads in Bulldozer have dedicated HW doing work. But, like I said before, it won't matter much if the performance is not there. It will be regarded as a slow 8-core chip, something like 8 K7-level cores.
Is the X6 1100T faster than the i7 2600K? The 2600K has only 4 real cores, but the X6 with 6 real cores isn't faster than the 2600K. Your comparison with SMT isn't correct.
On average FX6100 should be significantly faster than 1100T, but may be slower in Prime95 or SSE128 optimised Linpack. However, with proper FMA optimisation FX6100 can do 24 DP FLOPS per cycle, same as Thuban with SSE128, but more flexible.
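The "24 DP FLOPS per cycle" figure can be reproduced with back-of-envelope arithmetic. This is only a sketch of theoretical peaks, assuming two 128-bit FMAC units per Bulldozer module (as described earlier in the thread) and one 128-bit FADD pipe plus one 128-bit FMUL pipe per K10 core; the function names are made up for illustration.

```python
# Theoretical peak double-precision FLOPs per cycle (not real-world performance).
DP_LANES_128 = 128 // 64  # two 64-bit doubles fit in one 128-bit register


def bulldozer_peak_dp_flops(modules, fmacs_per_module=2):
    # Assumption: every FMAC issues one FMA per cycle; an FMA counts as
    # 2 FLOPs (multiply + add) per lane.
    return modules * fmacs_per_module * DP_LANES_128 * 2


def k10_peak_dp_flops(cores):
    # Assumption: one FADD pipe and one FMUL pipe per core, each doing
    # 1 FLOP per lane per cycle.
    return cores * 2 * DP_LANES_128 * 1


fx6100 = bulldozer_peak_dp_flops(modules=3)  # FX-6100: 3 modules / "6 cores"
thuban = k10_peak_dp_flops(cores=6)          # X6 1100T: 6 cores
print(fx6100, thuban)
```

Both numbers come out at 24 under these assumptions. Note that the Bulldozer figure only holds for FMA-optimised code, and the K10 figure only for an even mix of adds and multiplies, so neither says much about the leaked benchmark results.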
I can tell you one thing. Even if we consider that pure integer speed maybe goes down a bit versus K10, those FP/SIMD numbers from the FX8120 really do seem odd and out of place. That's why I think something is going on with the platform itself. According to Interlagos' SiSoft Multimedia numbers, we are left with around 32% better FP performance than MC, per core and per clock. This has not been seen in other Zambezi leaks. On the contrary, we have seen 30%+ lower performance per core than Deneb/Thuban. I think that those Interlagos numbers are with AVX in mind, but again, AVX brings very little to Bulldozer performance, since you have a fixed number of FP resources which is practically the same in legacy and AVX workloads (peak flops are the same). I also doubt that SiSoft uses FMA for the Multimedia benchmark where Interlagos is concerned (just "plain" AVX).
If FX is better core for core (or clock for clock), then the FX-61xx must perform better than the X6 1100T. For example, the efficiency of a 6C (or 3C/6T) FX could be similar to that of a 4C/8T Intel chip with Hyper-Threading.
Right now I think it would be best to moderate your expectations and be patient. You guys are getting really twisted up trying to justify your expectations. I don't think that level of emotional pressure on an unknown future is healthy. If you are right it fills you with a sense of self-righteousness that reality legitimized your expectations. And if you are wrong it can be depressing or even turn you into the dreaded anti-fan. There are examples of the latter here in our very own forums. Instead practice patience and try to see reality as it is, free from our preconceptions.
I want bulldozer to succeed as much as any of you guys. Partly because of nostalgia and partly because it would be the better outcome for the computer market. Those with a long memory will remember that I speculated favorably about bulldozer since the beginning, against some of the worst Intel trolls. But my speculations, hopes, and expectations can't change the reality of making a chip. The reality is that when the theory of making a chip turns in to the actual implementation of physically creating that chip things can go wrong at many stages of that process.
Quite a while ago (just after we learned BD has 2 ALU) I made a rough estimate. I figured that if early 8 core BD could attain 4.25GHz base clock then AMD and Intel would be near parity for regular workloads and AMD would have a serious advantage in high thread count workloads. Details about the pipeline, cache hierarchy, memory subsystem seemed to bear this theory out - AMD is aiming for high clocks to make up for any IPC deficit compared to SB/IB.
But then rumors of the release clocks started rolling in. 3.6GHz, WTF? Of course my estimates could have been way off, but then why the long pipeline, etc for clocks not any faster than current products? Something doesn't smell right. Then there was the rumor of FX-8170 at 4.2, then 3.9. And the rumor about the 4.2GHz 4 core part. The reality is that to make a high frequency product everything has to go right, the architecture and the process. What do you do as a company if your foundry (for example) can't meet your original design clocks? You release what you can and try to fix the process or tread water until a new better one can be implemented. No amount of me or you wishing that it can be fixed with a respin will actually fix it if the problem lies elsewhere (if there is a problem at all). So here I am closer to release and I am letting go of my emotional attachment to all the speculations and hopes - speculation is for fun and I'm not going to let it wreck my emotional state if reality doesn't want to cooperate.
As for all this arguing over what defines a core, it's garbage. You guys are arguing over a fuzzy term that is more useful to the marketing team than the engineering team. What matters is how the processor as a whole performs with various workloads, not the number of cores. Cores is just a term to plaster in big letters on the box to sell to people who are too clueless to do actual research before impulse buying it. Just like GHz was in the P4 days.
SiSoft Sandra 2011 does have AVX/FMA support, but we don't know anything about implementation of FMA.
http://www.sisoftware.net/?d=news&f=...elease&l=en&a=
Quote:
AVX/FMA instruction set support for new CPUs
Processor Multi-Media, Processor Cryptography, Memory Bandwidth, Cache and Memory Benchmarks
Well, I basically agree with you, Solus Corvus. I'm now quite pessimistic when it comes to Zambezi and the desktop market. Interlagos can still do pretty well in the server market; it has relatively good clock speed and number of threads per chip.
I disagree with this statement. No matter how much raw power a CPU can deliver, the power is nothing if software can't use it in real-world applications (and without too much work for devs). Threading is a mess and a highly time-consuming task for developers, so single-thread performance and core count have a major impact on real-world performance.
This said, I wonder what you guys will think if the benchmarks published after the NDA are similar to the current ones? Reviewers suck? An AMD issue with the TLB or anything else, like for the first Phenom? A crappy µarch?
I don't think being pessimistic is good either. It's reasonable to have concerns, but there is plenty to look forward to as well. I think it's best to replace all those positive and negative emotions we are placing on this product with simple curiosity. Even if it isn't a fire breathing monster it is still a new architecture to play with. We have no idea what overclocking will be like, how it will perform with our specific applications, etc. Reality is more interesting than all of our emotional baggage combined anyway.
I'm sorry if I didn't communicate my point well, you stated almost exactly what I meant. I have said it for a while now: core counts don't matter. What matters is how it performs in the applications that you use. That's what I meant by workload - real world apps, not some theoretical measure of power.
The strange real-world performance of BD seems to come from major changes in instruction throughput. Some instructions are much (much) faster than on Thuban, some others are much (much) slower. So depending on which instructions are used by the benchmark/software, you will get various results. Use FDIV and you'll see a major gain; use FCOS and it will suck more than a Prescott. Something is wrong in the µops decoding of many instructions.
Some depressing info here, but I will stay optimistic. This new µarch from AMD might have some bugs, but it can and will be improved upon. It's better than them just shrinking the old one; they now have somewhere to go, and it will be good after it matures :D
@ xsecret
Do you think this can be microcode's fault? Can AMD forcefully impair Bulldozer samples with additional overhead via firmware? I know they can turn off prefetch completely .
I see it like this then, if AMD cannot release a product on a new architecture it intends to use for 5+ years to come, at least moderately faster than its previous top product (nevermind slower), then it will:
1- Say goodbye to the desktop market and become a server only provider
2- Hold on to a 1-5% share of the desktop market where price is the only element of choice such as in the majority of 3rd world countries and China
3- Go bankrupt from lack of funds and cash flow to invest in developing a new replacement project / be bought out by Nvidia or another company
Don't tell me AMD does not need Bulldozer to survive because everyone knows very well that this company won't survive on the graphics department alone...
Makes sense.. damn, this will be a public relations nightmare for AMD, even if the design matures somewhat. The worst case will be if it depends on special compilers and therefore on the software/hardware market. Then it might only be good in the server department with specialized solutions... And even then it will be hard..
Even if Bulldozer rocked with specialized software, I doubt AMD would get the deserved support.. It may be 10 years ahead of its time, and even if the future proves it right, like with x64 and the on-die IMC, if AMD sacrificed too much everyday performance, it will be a serious problem for Bulldozer's success...
It should be 10-15% better than Thuban and reach higher clocks/draw less power; anything else would be a great fail..
xsecret: I don't know now what is true. One man with an FX B2 says here that single-thread performance is better than Phenom II, and you said "it's worse" ... :-/
Maybe that's the problem. People expect bulldozer to be much faster than a chip that's considered very good (SB 2600K) when in reality that would be a huge step up for AMD in desktop.
I find it difficult to understand how AMD can possibly repair the damage caused by another Barcelona while Intel will bring a world of pain in just 6 months time with Ivy Bridge + S2011 which will make the performance gap between AMD and Intel even greater than it is today with Phenom II vs. i7.
AMD sunk terribly with the Barcelona fiasco, another fiasco would confine them to a new VIA in the desktop market.
A better prefetcher, especially Bulldozer's double prefetcher, will have a lot of work to do to win back some of the cycles wasted by the higher instruction latency that the longer pipeline causes. If the prefetcher is down, that higher latency will cause a performance disaster, because you have to spend more cycles fetching code from cache/memory.
In that case I will calmly try to find my Kindle I put somewhere after I bought it and finally start reading books.
Joking aside, you give the impression that you already know the final performance when I look at some of your posts:
"Anyway, I just got my hands on a final, retail CPU with a retail box. About performance, nothing changed. So, if this CPU isn't at "shipping performance", the fix will come post-launch. And that would suck a lot."
"I have had the same performance with the B2 chip for weeks and, more importantly, the benchmarks published under NDA by AMD itself are really close to that."
"Bulldozer will be able to compete with Intel mainstream product in the 150-250$ range."
On the other hand, there is also a lot of confusion you are bringing to this discussion. For example, you said that "The issue with previous stepping seems related to power and only power, not internal µarch.", which can be read as saying there wasn't any problem with the BD microarchitecture whatsoever. But then you said "This said, a big problem in BD µarch will come from the Integer unit.", which contradicts your previous opinion.
Depends on what "bad performance" is. If it's "not able to compete with Intel products at the same price", then all steppings before B2 are bad due to that bug. If it's "awesome roxxing performance able to destroy a Core i7 990X" (many guys expect that), well, even the current final CPU is "bad", but that's µarch related.
Looks like i will need a bigger PSU:rofl::rofl::ROTF::ROTF:
http://i55.tinypic.com/f0uf49.jpg
I'm just gonna put this little bit out there.
There is more than 1 way to skin a cat.
:confused:
..maybe you're saying that there is more than one way to evaluate "performance"? :rolleyes:
or maybe you're saying that this architecture will work in a different way, so it is useless to compare it using the way "old" CPU architectures function? :rolleyes:
sorry for my bad English and sorry if I'm saying stupid things, but I just want to know more about your "ambiguous" expression :)
also your sig is ambiguous :)
in my country we say that somebody should "retire" when he is at the highest level.. so I think (and hope) you will retire after setting the world record LN2 overclock on an FX! :)
Minimum, so to be safe I suppose it's better to go for a 13 to 15MW PSU:rofl::rofl::rofl::rofl:
in southern china, some eat cats like chicken. they skin cats differently. and they eat the flesh. skinning cats have different ways.
Assuming the cat is half the solution ...
Nice way to derail thread which already was kind of derailed. Double Derail FTW :)
Haven't had that much fun reading forums for weeks now, keep it up!
What that little and very old phrase means: let's go to a car analogy (forgive me).
I have a plain Jane 2001 Neon SE 5spd MATX.
Now in its stock form the car is a rat; for instance, 1st gear is a very narrow band and sluggish, as is second........ the car wakes up around 3rd and 4th gear.
Now if I wanted to improve the performance of the car, the right thing to do would be to work on the areas where it is weak. Installing long tube headers, a CAI and a cam would be counterproductive, as those 3 things would boost the mid range, which is already pretty strong.
Cost for above mods would be roughly $600 with quality parts.
The logical thing would be to install an underdrive pulley for faster revs, an aluminum flywheel, and dump the stock 2" exhaust from the cat back, which would all boost the bottom-end performance around first and second gear.
Cost for above mods would be roughly $600 with quality parts.
2 totally different upgrade paths are pointed out above, and tbh would you feel the difference? Not really, but it would be measurable, and the last one would net better measured performance.
Make sense now?
You can tune deficiencies away, whether it be a CPU or an entire system build; it all just depends on your end game or goals and choosing correctly. It's always better to tune for some nice balance in the end.
so bulldozer pulls more from the low end xD nice analogy :) my x6 does this too i believe and its overall smoother :D
You guys want to know more about Bulldozer "final" Performance & Pricing ? I just found an (old) interesting screenshot from AMD.
Easy as hell : everything you want to know is written down here.
Attachment 119972
^funny though both brazos and lynx are more expensive than that chart shows. i think they over-performed.
hehe only that amd is late :)
whatever msi is also late with bd-ready bios update for my board :)
I thanked you for that post, simply because of the car analogy usage lol
You could also do the "blasphemous" move and swap out the MT for an AT, then equip it with a 3500 Stall so it launches immediately in the power band :D A lot of the problem could easily be the fact that the Neon (sans SRT-4) was not intended for any sort of performance, so an overhaul of the computer's programming could do a world of difference (remap spark curves, AF:R table, tune the knock-retard table). Factory tune is sort of like the whole Intel Compiler use with AMD fiasco, where if you just run it as is you're not getting all the performance you could be heh
As for the cat: Schrödinger's Skun Cat Theory - The cat has neither been skinned, nor does it have skin! :eek:
(Definition for our foreign friends, -- ^^^^^ -- since it's not a commonly used term)
In the US I don't see it being that much for a system build, even for an A8-3850 (I think the 3550 in that chart might be representative of what the 3850 is).
A8-3850: ~$130
Mobo: $65-140
RAM: 2x4GB - $30-40 (DDR3-1600)
PSU: $40-50 (for a quality 600W with MIR, which is overkill)
Case: $40-90
HDD: $40-55
Optical: $20
-------------------
Total: $525 (at most)
Monitor, keyboard, mouse, general purpose speakers: ~$155 (monitor would be name brand ~19-21 inch, but relatively inexpensive mouse/keyboard/speakers)
-------------------
Complete System: $680
Use the cheaper motherboard and case, and they knock it down to $555! That's lower than the Lynx with E2 APU prediction :p:
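The totals in that build list are easy to double-check. A minimal sketch, with the (low, high) price ranges copied from the list above; the dictionary layout and variable names are just for illustration.

```python
# Component price ranges in USD as (low, high), copied from the post above.
parts = {
    "A8-3850": (130, 130),
    "Mobo": (65, 140),
    "RAM": (30, 40),
    "PSU": (40, 50),
    "Case": (40, 90),
    "HDD": (40, 55),
    "Optical": (20, 20),
}
peripherals = 155  # monitor, keyboard, mouse, speakers

# Worst case: the most expensive option for every component.
tower_max = sum(high for _, high in parts.values())
complete_max = tower_max + peripherals

# Swap only the motherboard and case for their cheapest options.
savings = sum(parts[name][1] - parts[name][0] for name in ("Mobo", "Case"))
budget_build = complete_max - savings

print(tower_max, complete_max, budget_build)
```

This gives 525 for the tower, 680 for the complete system and 555 with the cheaper board and case, matching the figures in the post.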
I think everyone will be late with the max-performance BIOS since AMD did some chip revising.
Quote:
whatever msi is also late with bd-ready bios update for my board
AMD will try to keep the discussion away from performances as long as possible. You'll see next week what I'm talking about :D
Funny my patience doesnt grow much if people that know more say things like "you will see next week" xD
First post, Anxiously awaiting BD.
BLT has the 8150, 8120 and 6100 models on their site, but
8150 - 266.28
Quote:
This product is not in stock, and is not yet on order with the manufacturer but should be shortly
http://www.shopblt.com/cgi-bin/shop/...r_id=590972573
8120 - 221.73
http://www.shopblt.com/cgi-bin/shop/...r_id=590972573
6100 - 188.32
http://www.shopblt.com/cgi-bin/shop/...r_id=590972573
The conclusion I draw from those prices is as follows..
2700K>8150>2500K>8120
would make complete sense..
and tbh it's not too bad.. I don't want to pay more than 300$ for my new high-end AMD, so it doesn't have to be faster than the 2600K :)
I am actually building one of these for a customer soon.
A8-3850 - $130
Mobo with DVI+VGA+HDMI, USB3 = $100
DDR3 (2x2GB DDR3-1600) $30
PSU - Corsair Builder Series 430w = $42, $29 after MIR
CASE - Lian Li Lancool PC-K58 = $60
1TB Seagate HDD = $50
Optical w/ Lightscribe = $22
OS that you forgot - Win7 HP = $100
Keep current monitor as they bought one within the last few years, keep KB/Mouse as they have one = $0
Price: $534
I completely forgot about an OS, but even still, as you pointed out a person can save if they can reuse any old hardware. If not... there's always Linux for free lol
I think those specs will make a really good all-around system, more-so than the G/GX chipset motherboards with a similarly clocked Athlon II X4. There's quite a bit of upgrade potential with these, which I hope AMD can stick with that socket for a few years so people can upgrade bits and pieces here and there :D
Thanks for posting this. Looking at the price of the 8150, I wonder what will happen to the 1100T after FX launches. If the results leaked so far are accurate, then the 8150 will lose quite a few benchmarks to the 1100T. How will they justify the 266$ price when you can buy an older generation of their own products that can perform the same or better while costing about 21% less (AMD lists 210$ for the 1100T on their website as the price for distributors)? If the 1100T goes down in price after FX launches (which always happens with new product launches), then the picture is even worse for FX. Then you will have an even more affordable Thuban X6 (<200$?) at 3.3Ghz that OCs easily to 3.8Ghz+ on air. The price/perf. ratio of FX will then be even worse IMO.
Note this is all based on what we know via the leaks and what people in the know (such as xsecret and chew*) hinted so far.
Nothing makes sense to me.
PS And yes, I know 32nm means higher OC for FX, but who cares about that when you can OC Thuban as well. With the IPC deficiency, you will end up at the same spot as Thuban @4Ghz even if you push the FX 81xx to 4.5Ghz+ ...
Here's a 2x Opteron 6220 8C at 2.75GHz from SiSoft, although CPU World has them listed as 3.0GHz CPUs. It's from Sept 7th 2011.
Don't know if this is useful or not
2x8c opteron 6220@2.75Ghz
Scores
Processor Arithmetic Benchmark
score-138.538 GOPS
Processor Multi-Media Benchmark
score-315.004 Mpix/s
Part of the problem we have with clear reasoning about all this is the totally new architecture. BD has 4 cores that are each stronger than one Thuban core, but to be really strong they need at least two threads. So in some scenarios they outperform Thuban, whilst in others maybe Thuban holds some ground. BD is designed to cure its weak single-threaded performance by means of clock speed, which at the moment might not be enough due to problems in the process.
Whatever; where Thuban is strong, like in multithreaded workloads, I expect BD also to be quite strong. All in all, I guess it's a clear upgrade over Thuban, but it has problems competing with Intel's Sandy Bridges with their strong single-thread IPC...
Anyway, I can see now that BD will in most cases be a good processor. Only the 1-core/multicore benchmark logic will not fit it very well..
Well it can't hurt. Thanks for posting this radaja! Weird thing is the clock Sisoft reads. The model should be 8C 3Ghz base clock and 3.5Ghz Turbo,at least this is the spec according to Gateway server configuration menu (now pulled).
For 8C zambezi with Turbo for integer cores (500Mhz presumably?) that ran at 2.8Ghz/3.3Ghz:
Scores
Processor Arithmetic Benchmark
score-56.64 GOPS
Processor Multi-Media Benchmark
score-147 Mpix/s
Base clock difference between Zambezi and that Opteron is 7% (3.0/2.8=1.07). The same goes if you count the all-core Turbo clocks: 3.5/3.3=1.06, or 6%. So let's take 6.5% as the mean value.
So a 3Ghz Zambezi with 3.5Ghz Turbo should score:
Processor Arithmetic Benchmark
score-60 GOPS
Processor Multi-Media Benchmark
score-157 Mpix/s
Scaling of Interlagos (since it's a 2P system in question) is probably very good, close to perfect. Assume 1.95x, which takes a few % off perfect scaling due to software/hardware limitations.
For a 1P 3GHz Opteron we should then have:
Processor Arithmetic Benchmark
score-138.538/1.95=71 GOPS
Processor Multi-Media Benchmark
score-315.004/1.95=161 Mpix/s
Now, compared to the Zambezi platform @ 3GHz, this Opteron is:
71/60 = 1.18, i.e. 18% faster in the Processor Arithmetic Benchmark
161/157 = 1.03, i.e. about 3% faster in the Processor Multi-Media Benchmark
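For anyone who wants to redo the normalization above with other numbers, here is a minimal Python sketch of the same steps. The scores are the ones quoted in the posts; the 6.5% clock adjustment and the 1.95x 2P scaling are the assumptions made above, not measured values, and the function names are just illustrative.

```python
# Scores quoted in the posts above (SiSoft Sandra).
ZAMBEZI_ARITH_GOPS = 56.64        # 8C Zambezi at 2.8/3.3GHz
ZAMBEZI_MM_MPIX = 147.0
OPTERON_2P_ARITH_GOPS = 138.538   # 2x Opteron 6220
OPTERON_2P_MM_MPIX = 315.004

# Assumptions taken from the post, not measurements.
CLOCK_SCALE = 1.065   # mean of 3.0/2.8 and 3.5/3.3
SCALING_2P = 1.95     # assumed 1P -> 2P scaling


def scale_to_3ghz(score):
    """Scale a 2.8/3.3GHz Zambezi score linearly up to 3.0/3.5GHz."""
    return score * CLOCK_SCALE


def per_socket(score_2p):
    """Estimate a 1P score from a 2P score under the assumed scaling."""
    return score_2p / SCALING_2P


def percent_faster(a, b):
    """How much faster a is than b, in percent."""
    return (a / b - 1.0) * 100.0


zambezi_arith = scale_to_3ghz(ZAMBEZI_ARITH_GOPS)
zambezi_mm = scale_to_3ghz(ZAMBEZI_MM_MPIX)
opteron_arith = per_socket(OPTERON_2P_ARITH_GOPS)
opteron_mm = per_socket(OPTERON_2P_MM_MPIX)

arith_gap = percent_faster(opteron_arith, zambezi_arith)
mm_gap = percent_faster(opteron_mm, zambezi_mm)

print(f"Zambezi @ 3GHz: {zambezi_arith:.1f} GOPS, {zambezi_mm:.1f} Mpix/s")
print(f"1P Opteron:     {opteron_arith:.1f} GOPS, {opteron_mm:.1f} Mpix/s")
print(f"Opteron lead:   {arith_gap:.1f}% arithmetic, {mm_gap:.1f}% multi-media")
```

With unrounded intermediate values the gaps come out at roughly 18% and 3%. Changing CLOCK_SCALE or SCALING_2P shows how sensitive the comparison is to those two guesses.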
In practice, integer performance is somehow off in Zambezi's case (3GHz + 500MHz Turbo), while SIMD is almost the same as with the 1P 3GHz Opteron (3GHz + 500MHz Turbo).
This leads us to the conclusion that there might at least be something wrong with integer performance. SIMD is not promising though, since the Opteron scores the same.
The only problem with the above "math" is that Sisoft sees the Opteron system @ 2.75GHz instead of 3/3.5GHz. This may be an error in detection or the samples may actually run at 2.75GHz, we don't know. But the Opteron's performance should be final, as the chips are shipping to partners.
I'm going to search for 2P Lisbon 6140 (2.6Ghz) sisoft results and compare them to this 2P Valencia system.
Is this processor really an 8 core, or more of a 4 core with 8 threads? I've heard both sides and I don't know what to believe. I also don't expect the IPC to be a hell of a lot better than Thuban's, but I do expect it to hit over 4.5GHz with the 8150. I have a feeling the 8120's overclock ceiling is going to be very low.
@informal
IMO those Interlagos Sisoft results have many traps in them; be careful about trying to reach any conclusion.
@informal, yeah, that's what I thought too about the clock speed. Everywhere I have checked says 3.0GHz, yet that Sisoft result shows 2.75GHz?
Well now we have 2 of them:
Details for Device 2x AMD Opteron 6220
2x AMD Opteron 6282 SE
To save you the boring math and the long story: the results kind of match, with only one thing standing out. The Multi-Media Benchmark performance of the 6220 (per core!) is rather low compared to the 6282 SE (per core). Now, I'm pretty sure I know at what clock the 6282 SE runs: it's either 2.5GHz base or 2.6GHz base. I assume it's 2.5GHz. So the Multi-Media Benchmark for a 1P 6282 SE would be roughly 585/1.95 = 300 Mpix/s, which is only 5% lower than what the supposedly 3GHz 2P Valencia gets. If we take it that a 2.5GHz 1P 6282 SE scores 300, then a 3GHz 1P 6282 SE would get ~360, and this would imply 8C performance at 3GHz of around 180 Mpix/s. Compare that to Zambezi's (8C 3GHz) 157 Mpix/s and the 6220's (1P 8C 3GHz) 161 Mpix/s and you see the difference: it's better by roughly 15%/12%.
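The 6282 SE extrapolation above can be sketched the same way. This is only a rough check under the post's own assumptions (2.5GHz clock, 1.95x 2P scaling, linear scaling with clock and core count); the 16-to-8 core halving is what the 360 to 180 step implies, and the variable names are made up for illustration.

```python
# Figures from the post above; everything marked "assumed" is a guess made there.
SCORE_2P_6282SE_MPIX = 585.0   # 2x Opteron 6282 SE, Multi-Media Benchmark
SCALING_2P = 1.95              # assumed 1P -> 2P scaling
ASSUMED_CLOCK_GHZ = 2.5        # assumed 6282 SE clock
TARGET_CLOCK_GHZ = 3.0
CORES_6282SE = 16              # implied by halving 360 -> 180 for 8C
TARGET_CORES = 8

ZAMBEZI_8C_3GHZ_MPIX = 157.0   # Zambezi estimate from the earlier post
OPTERON_6220_1P_MPIX = 161.0   # 1P 6220 estimate from the earlier post

# 2P -> 1P, then linear scaling by clock, then by core count.
one_socket = SCORE_2P_6282SE_MPIX / SCALING_2P
at_3ghz = one_socket * TARGET_CLOCK_GHZ / ASSUMED_CLOCK_GHZ
eight_core = at_3ghz * TARGET_CORES / CORES_6282SE

vs_zambezi = (eight_core / ZAMBEZI_8C_3GHZ_MPIX - 1.0) * 100.0
vs_6220 = (eight_core / OPTERON_6220_1P_MPIX - 1.0) * 100.0

print(f"1P 6282 SE:       {one_socket:.0f} Mpix/s")
print(f"1P at 3GHz:       {at_3ghz:.0f} Mpix/s")
print(f"8C at 3GHz:       {eight_core:.0f} Mpix/s")
print(f"vs Zambezi: +{vs_zambezi:.0f}%, vs 6220: +{vs_6220:.0f}%")
```

If the 6282 SE actually runs at 2.6GHz, setting ASSUMED_CLOCK_GHZ to 2.6 shrinks both gaps, so the 15%/12% figures lean heavily on the clock guess.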
So to conclude: Zambezi's results deviate from the Opterons', per core and per clock, by 15-18% in the integer benchmark and 3-15% in the Multimedia (AVX) benchmark (depending on the Opterons used for comparison). One small caveat though: I have noticed that the Lisbon platform scores around 30% better at the same clock versus Thuban! I think this may be due to NUMA (2P Lisbons). So if this carries over to Interlagos too, then none of the above is relevant and Zambezi's original (bad) Sisoft results may even be genuine...
edit: oh, and one more thing I forgot to mention. Interlagos at the same core count and roughly the same clock is noticeably slower in the integer benchmark than the Lisbon/MC generation. Interlagos can clock higher, but the IPC difference is very high (think more than 15%). And yes, it is noticeably faster in the Multimedia one, but integer represents a big chunk of both server and desktop workloads, so it is kind of a letdown IMO. All that talk about integer cores being faster etc. and then we get this... I guess it still may not be the final platform/BIOS, but it doesn't look promising in integer workloads.
On the other hand...
System with Phenom II X4 925 overclockable to ~3.5 @ 250 HTT + ASUS AM3+ mobo with UEFI, USB3, Bulldozer ready + 8GB DDR3-1600 CL9 @ 1666 250 HTT + HD6670 (would perform better GPU wise) is $580.
I just realized this as newegg tends to run a lot of combo deals...that system has an upgrade path. I know the person I am building for quite well, who knows...maybe in a year or so they will want to upgrade to Bulldozer?
Seems like we have a total stinker headed our way. I will be thoroughly peeved at AMD if they've had us in the dark this long for what amounts to garbage.
Well, it may not be garbage in the server segment. On the contrary, it may be pretty good, even versus SB-E.
On the desktop though, if nothing changes, many websites will label it a letdown in their reviews.
IMO this doesn't make sense. We are now in 2011. AMD knew what kind of cores it would have to fight this year and in the years to come. The Core generation already had a pretty big advantage in integer workloads (some due to compiler advantages, some due to a better microarchitecture). Now I'm supposed to think that even though AMD knew the gap would be around 35-40% at similar clocks, they would go with a 15% slower design (IPC) versus their own family 10h and hope to make up all of this by clocking it sky high? They would need 5.5-6GHz to match a QC SB with SMT, which will never happen. Bulldozer can barely touch 4GHz with Turbo on, and that is not on all cores. Something is not right here...
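A back-of-the-envelope way to see where a clock figure like that could come from, as a sketch only. The 35-40% gap and the 15% IPC deficit are the numbers from the post; the 3.4GHz reference clock for a quad-core Sandy Bridge is my own assumption, and the model ignores Turbo and SMT entirely.

```python
# Inputs from the post above.
SB_ADVANTAGE_LOW = 0.35    # SB lead over family 10h at similar clocks
SB_ADVANTAGE_HIGH = 0.40
BD_IPC_DEFICIT = 0.15      # BD IPC relative to family 10h

# Assumption, not from the post: reference clock of the Sandy Bridge part.
SB_CLOCK_GHZ = 3.4


def clock_needed(sb_advantage):
    """Clock at which BD would match SB if performance = IPC * clock."""
    sb_perf = (1.0 + sb_advantage) * SB_CLOCK_GHZ   # family 10h IPC taken as 1.0
    bd_ipc = 1.0 - BD_IPC_DEFICIT
    return sb_perf / bd_ipc


needed_low = clock_needed(SB_ADVANTAGE_LOW)
needed_high = clock_needed(SB_ADVANTAGE_HIGH)

print(f"BD would need roughly {needed_low:.1f}-{needed_high:.1f}GHz")
```

Under these assumptions the result lands in the mid-5GHz range, in the neighbourhood of the figure quoted above; a higher reference clock (for example counting SB's Turbo) pushes it up further.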
i was looking at things like laptops where the price is already summed up based on all the specs and the market kinda dictates prices much more than individual component competition. and keep in mind that the top end it mentions dual gpu, so thats another 50-70$, then beep pointed out you missed the os for 100$ right there. bringing the total to $800+
things seem to be about 20% over those prices (which are system prices, not user needs upgrade by pc knowledgeable friend prices)
You're right; if it is true, then :-(... AMD must know the power of SB, and of IB in Q1/Q2 next year. They need roughly Nehalem-level single-thread performance. That would be OK: still slower in ST, but very impressive in multithread (MT) thanks to the module architecture. What is the logic of a new-generation architecture for the coming years having lower single-thread performance than the 2.5-year-old Phenom II?
In my opinion, a :banana::banana::banana::banana:ing bad choice of wanting to look strong on multithreaded server workloads and not particularly caring about single threaded performance, much less on desktop segment.
BD is a good server chip which will scale wonderfully in multithreaded work, but if it has 15% less IPC than Thuban it will simply be garbage on the desktop, save for a few occasions.