Bulldozer's first screens

05-19-2011, 01:16 PM
zoomee

Good point! I automatically presumed it was 3.8, as per the other 'leaked' benchies ;)
05-19-2011, 01:20 PM
Apokalipse

Quote:

Originally Posted by chew*

You don't even know what speed cine was run at..........

Or the number of cores used.
Which means it's also pretty useless in terms of comparing BD to SB even if it is real.
05-19-2011, 01:23 PM
FlanK3r

Look at R10: it seems very, very good here. 28,000 points is similar to a 980X, I think... The R11.5 result is good too, better than a 2600K. But:
1) we don't know whether this is fake or not
2) a BD module is not the same as two of today's cores; it is a better alternative to Hyper-Threading
05-19-2011, 01:41 PM
jimbo75

Do those scores even match up properly? If it's 28k in R10 surely it would be higher than 7.37 in 11.5?

Also : OBR.
05-19-2011, 01:51 PM
DarthShader

Quote:

Originally Posted by jimbo75

Do those scores even match up properly? If it's 28k in R10 surely it would be higher than 7.37 in 11.5?

Also : OBR.

Ditto on both things.

The R10 score is indeed slightly higher than a 980X (e.g. http://www.hardwarecanucks.com/forum...review-11.html), which is also almost exactly the promised 50% with 33% more cores (e.g. http://www.overclockers.com/wp-conte...cb10-graph.jpg). The 11.5 score, meanwhile, is almost exactly a 25% improvement over Thuban... does it really differ so much from R10? If that was a 6-core BD, on the other hand....
05-19-2011, 01:55 PM
muziqaz

Quote:

Originally Posted by jimbo75

Do those scores even match up properly? If it's 28k in R10 surely it would be higher than 7.37 in 11.5?

Also : OBR.

What's OBR? :)

By the way, in Cinebench 10 I get 21964 points on a Thuban @ 3.9GHz,
and in Cinebench 11.5 I get 6.85.
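
One way to frame jimbo75's question is to compare how many R10 points each chip earns per R11.5 point. The sketch below is only an illustration: it uses muziqaz's Thuban figures and the leaked sample's scores as quoted in this thread, with the leaked R10 result rounded to the "28k" mentioned above. The variable and function names are made up for this example.

```python
# Scores as quoted in the thread; the leaked R10 figure is the rounded "28k".
scores = {
    "Thuban @ 3.9GHz": {"r10": 21964, "r11_5": 6.85},
    "leaked BD sample": {"r10": 28000, "r11_5": 7.37},
}


def r10_per_r11_5(entry):
    """R10 points earned per single R11.5 point for one chip."""
    return entry["r10"] / entry["r11_5"]


ratios = {name: r10_per_r11_5(entry) for name, entry in scores.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} R10 points per R11.5 point")

# R11.5 score the leaked chip would need in order to match Thuban's ratio.
expected_r11_5 = scores["leaked BD sample"]["r10"] / ratios["Thuban @ 3.9GHz"]
print(f"R11.5 score at Thuban's ratio: {expected_r11_5:.2f}")
```

If the two benchmarks tracked each other the same way on both chips, the ratios would be close; a clearly higher ratio for the leaked sample is what makes the pair of scores look odd.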
05-19-2011, 01:57 PM
w0mbat

OBR is a person.
05-19-2011, 04:26 PM
informal

If I had to guess, I would say a 3GHz 6C Zambezi. But it may be a lower-clocked 8C too. Without the CPU-Z it's pointless.

Oh, and if it is a 6C 3GHz Zambezi, then the leaked Donanimhaber slide fits perfectly into the picture: C11.5 -> 7.37 x 1.33 = 9.8pts for an 8C at around 3GHz, or close to 10.5 for a 3.2GHz version.
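
The projection above is plain linear scaling by core count and clock. A minimal sketch, assuming (as the post does) that the leaked 7.37 belongs to a 6-core part at 3GHz and that C11.5 scales perfectly with both cores and clock; `scale_score` is a hypothetical helper, not part of any tool.

```python
def scale_score(score, core_ratio=1.0, clock_ratio=1.0):
    """Scale a benchmark score linearly by core-count and clock ratios."""
    return score * core_ratio * clock_ratio


leaked_c115 = 7.37  # leaked Cinebench 11.5 score

# 8 cores vs 6, using the post's rounded factor of 1.33.
eight_core_3ghz = scale_score(leaked_c115, core_ratio=1.33)

# The same 8-core part clocked at 3.2GHz instead of 3.0GHz.
eight_core_3_2ghz = scale_score(eight_core_3ghz, clock_ratio=3.2 / 3.0)

print(f"8C @ 3.0GHz: {eight_core_3ghz:.2f}")
print(f"8C @ 3.2GHz: {eight_core_3_2ghz:.2f}")
```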
05-20-2011, 01:04 AM
w0mbat

Quote:

Originally Posted by informal

If I had to guess, I would say a 3GHz 6C Zambezi. But it may be a lower-clocked 8C too. Without the CPU-Z it's pointless.

Oh, and if it is a 6C 3GHz Zambezi, then the leaked Donanimhaber slide fits perfectly into the picture: C11.5 -> 7.37 x 1.33 = 9.8pts for an 8C at around 3GHz, or close to 10.5 for a 3.2GHz version.

http://obrovsky.blogspot.com/2011/05...to-laught.html

lol, that provoked someone with bad English skills :p:
05-20-2011, 01:11 AM
FlanK3r

Maybe R11.5 is harder on the FP unit?....
05-20-2011, 01:21 AM
Lightman

If OBR's scores are somewhat representative of the final Zambezi 8C then it's not bad! I like the fact that it's faster than SB and in some situations even exceeds Intel's most expensive desktop processors. This is a proper step up compared to what AMD can offer at the moment.
Having Zambezi dominate all benchmarks would be nice, but at the same time our wallets would feel the pain!

On another note, OBR, stop being so rude! You've been banned on most forums already; soon someone will need to ban you from using the internet at all :yepp:
05-20-2011, 01:51 AM
Sn0wm@n

Can't wait till Bulldozer comes out.
05-20-2011, 03:30 AM
informal

Quote:

Originally Posted by w0mbat

http://obrovsky.blogspot.com/2011/05...to-laught.html

lol, that provoked someone with bad English skills :p:

Haha, now he has spilled all of the beans :D. Ahh, it doesn't really matter, results are coming out sooner or later.
Even if it is an 8C model it's not bad. But nice to see him reading XS ;).

PS: So now Turbo works in heavy FP workloads?? Hmm, right :).
05-20-2011, 05:28 AM
informal

After looking at the results from "older" CPUs in both C10 and C11.5, I could see that C11.5 scales drastically better with more cores. For example (Hardware Canucks' latest Phenom X4 review): in the C11.5 64-bit test the 1100T has exactly a 50% better result than a 3.3GHz Phenom X4 (normalized to this clock speed from the result of another X4 Phenom), while in the C10 64-bit test the 1100T has exactly a 37% better result than the same 3.3GHz X4. As can be seen from this, C11.5 scales much better with cores, so Zambezi's result relative to the 1100T should go up compared to C10, not down. But we see the opposite: instead of going up, the score goes down and scaling is somehow very poor in this test, negating any floating-point IPC boost FMAC can give.

Cinebench 10 summary (leaked fishy results from that "blogger"):
The 1100T gets 19164pts; Zambezi X8 @ not 3GHz (I assume it's more than 3GHz then, say 3.2GHz) gets 28074. This test doesn't scale THAT well with more cores, and the scaling penalty is 9% (relative to perfect scaling with more cores: the 1100T is 37% faster than the 3.3GHz X4 instead of 50% faster, and 1.50/1.37 = ~1.09). Start from the X6 score, apply 33% more cores and the 9% scaling penalty, and normalize for a 3.2GHz clock: 19164 x 1.33 / 1.09 x 3.2 / 3.3 = ~22675pts. The difference between this score and what he got is the IPC improvement, plus maybe some limited Turbo effect which I won't count since this is a heavy FP workload: 28074 / 22675 = ~1.23x, or a 23% IPC improvement per core (one 128-bit FMAC vs one Thuban core). Pretty good so far.

Now the Cinebench 11.5 results summary:
The 1100T gets 5.91pts; Zambezi X8 @ ~3.2GHz supposedly gets 7.37pts. Scaling in this test is perfect, as can be seen from the Hardware Canucks link. Start from the X6 score, apply 33% more cores, no scaling penalty, and normalize for a 3.2GHz clock: 5.91 x 1.33 x 3.2 / 3.3 = ~7.62pts. This is the hypothetical score of a Zambezi X8 that shows ZERO IPC improvement in C11.5 vs Thuban. Now compare with the "blogger's" result of 7.37pts: 7.37 / 7.62 = ~0.97x, or a 3-4% IPC decrease per core vs Thuban. Hmm, fishy indeed :). If the CPU showed similar performance gains vs the older generation (Thuban) as in the previous C10 benchmark, the result should have been roughly 7.62 x 1.23 = 9.37pts. This is for a clock of around 3.2GHz, since he said it is not 3GHz and I assume the worst-case scenario for Zambezi (the best case would be lower than 3GHz). A 3.5GHz X8 should then score around 10.25, in line with the DH slide, which showed a projected score because specs were not yet finalized in late 2010.

But no, the "blogger's" sample somehow sucks in C11.5 :D.
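
The arithmetic in this post can be re-run as a short script. This is a sketch under the post's own assumptions (a 3.2GHz Zambezi clock, a 1.33 core factor, a 9% C10 scaling penalty and perfect C11.5 scaling), using only the scores quoted above; the names are made up for illustration.

```python
# Figures quoted in the post; the 3.2GHz Zambezi clock is the poster's guess.
X6_CLOCK_GHZ = 3.3
ASSUMED_BD_CLOCK_GHZ = 3.2
CORE_RATIO = 1.33  # 8 cores vs 6, rounded as in the post


def zero_gain_score(x6_score, scaling_penalty=1.0):
    """Score an 8-core chip would get with no per-core gain over the X6."""
    return (x6_score * CORE_RATIO / scaling_penalty
            * ASSUMED_BD_CLOCK_GHZ / X6_CLOCK_GHZ)


# Cinebench 10: 9% scaling penalty (1.50 / 1.37, rounded to 1.09).
c10_baseline = zero_gain_score(19164, scaling_penalty=1.09)
c10_gain = 28074 / c10_baseline

# Cinebench 11.5: scaling treated as perfect, so no penalty.
c115_baseline = zero_gain_score(5.91)
c115_gain = 7.37 / c115_baseline

# What C11.5 would show if it mirrored the per-core gain seen in C10.
c115_expected = c115_baseline * c10_gain
c115_expected_3_5ghz = c115_expected * 3.5 / ASSUMED_BD_CLOCK_GHZ

print(f"C10 baseline {c10_baseline:.0f}, per-core gain {c10_gain:.3f}x")
print(f"C11.5 baseline {c115_baseline:.2f}, per-core gain {c115_gain:.3f}x")
print(f"C11.5 expected {c115_expected:.2f}, at 3.5GHz {c115_expected_3_5ghz:.2f}")
```

Carrying the unrounded C10 gain through gives slightly higher projections than the post's 9.37 and 10.25, which use the factor rounded down to 1.23.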
05-20-2011, 06:09 AM
Dresdenboy

Quote:

Originally Posted by informal

Start from the X6 score, apply 33% more cores, no scaling penalty, and normalize for a 3.2GHz clock: 5.91 x 1.33 x 3.2 / 3.3 = ~7.62pts. This is the hypothetical score of a Zambezi X8 that shows ZERO IPC improvement in C11.5 vs Thuban. Now compare with the "blogger's" result of 7.37pts: 7.37 / 7.62 = ~0.97x, or a 3-4% IPC decrease per core vs Thuban. Hmm, fishy indeed :).

This is simple:
The theoretical max. 128-bit FMUL+FADD throughput of Zambezi without using FMA is the same per clock as that of an X4. So based on this alone it should perform lower. But CB is not a synthetic benchmark (an FMUL+FADD loop) and depends on a lot of other components. And as is known, it isn't that dependent on memory throughput, due to data locality. So Zambezi's IMC shouldn't have much influence here.
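
The per-clock comparison can be illustrated by counting 128-bit FP pipes. The pipe counts below are assumptions for this sketch, not figures from the thread: a 10h core is taken to have one FADD and one FMUL pipe, and a Bulldozer module is taken to have two FMAC pipes shared by its two cores, each able to issue one FMUL or FADD per clock when FMA is not used.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    fp_units: int        # cores for 10h, modules for Bulldozer (assumed)
    pipes_per_unit: int  # 128-bit FMUL/FADD-capable pipes per unit (assumed)

    def ops_per_clock(self):
        """Peak 128-bit FMUL+FADD operations issued per clock."""
        return self.fp_units * self.pipes_per_unit


phenom_x4 = Chip("Phenom II X4", fp_units=4, pipes_per_unit=2)
thuban_x6 = Chip("Phenom II X6", fp_units=6, pipes_per_unit=2)
zambezi_8c = Chip("Zambezi 8C (4 modules)", fp_units=4, pipes_per_unit=2)

for chip in (phenom_x4, thuban_x6, zambezi_8c):
    print(f"{chip.name}: {chip.ops_per_clock()} ops per clock")
```

Under these assumptions the 8-core Zambezi lands level with an X4 and below an X6 on this one metric, which is the point being made; real Cinebench results depend on much more than peak issue width.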
05-20-2011, 06:12 AM
Mechromancer

I haven't seen JF-AMD around lately. He has been conspicuously absent. Maybe they'll tap him to be the next AMD CEO.

/speculation
05-20-2011, 06:19 AM
informal

Quote:

Originally Posted by Dresdenboy

This is simple:
The theoretical max. 128-bit FMUL+FADD throughput of Zambezi without using FMA is the same per clock as that of an X4. So based on this alone it should perform lower. But CB is not a synthetic benchmark (an FMUL+FADD loop) and depends on a lot of other components. And as is known, it isn't that dependent on memory throughput, due to data locality. So Zambezi's IMC shouldn't have much influence here.

Thanks for the input. If this is the case, why then is the C10 version behaving differently? In that test we see a massive gain. And I doubt that the Maxon guys completely rewrote the benchmark code. If you take a look at the link I posted (HW Canucks), you can see that any perf. difference between, say, the 2600K and the i7-875 carries over from C10 to C11.5 to the digit (25%). I would expect similar behavior on Bulldozer too.
But who knows, maybe C11.5 is hitting some limitation in Bulldozer, which would explain such behavior in that test.
05-20-2011, 06:24 AM
muziqaz

Quote:

Originally Posted by Mechromancer

I haven't seen JF-AMD around lately. He has been conspicuously absent. Maybe they'll tap him to be the next AMD CEO.

/speculation

He is a bit busy. Loads of stuff to do before the BD launch. I mean, when we met for a pint in London, he was traveling non-stop. Two days in one country, two days in another. He mentioned his travel plans, but I lost track of them as he mentioned so many cities.
05-20-2011, 06:42 AM
Dresdenboy

Quote:

Originally Posted by informal

Thanks for the input. If this is the case, why then is the C10 version behaving differently? In that test we see a massive gain. And I doubt that the Maxon guys completely rewrote the benchmark code. If you take a look at the link I posted (HW Canucks), you can see that any perf. difference between, say, the 2600K and the i7-875 carries over from C10 to C11.5 to the digit (25%). I would expect similar behavior on Bulldozer too.
But who knows, maybe C11.5 is hitting some limitation in Bulldozer, which would explain such behavior in that test.

I've seen a discussion about the compilers used for the different CB versions but didn't dig deeply into it. This might explain at least a bit, though. Remember that for SB the cache subsystem architecture didn't change that much, while from 10h to BD it changed significantly.

One could use CodeAnalyst or VTune to check some basic metrics of CB's code, e.g. percentage of SSE instructions etc.
05-20-2011, 07:02 AM
Lightman

New snippet from OBR:

Quote:

PS .. 7.6 (LL), 11.6 (BD) ... + Richard Huddy to Leave AMD ...

Take it as you want.
05-20-2011, 07:03 AM
Postmodum

LL = Llano?
05-20-2011, 07:22 AM
zalbard

Yep, it is. But once again, I wouldn't completely trust the source. He's been wrong before.
05-20-2011, 07:40 AM
FlanK3r

I don't know, 7 points in R11.5 is impossible for Llano (yes, it's better optimized than Athlon II, but that is still too much; it's at the level of an OC'd X6 Thuban)
05-20-2011, 07:41 AM
Manicdan

Quote:

Originally Posted by informal

After looking at the results from "older" CPUs in both C10 and C11.5, I could see that C11.5 scales drastically better with more cores. For example (Hardware Canucks' latest Phenom X4 review): in the C11.5 64-bit test the 1100T has exactly a 50% better result than a 3.3GHz Phenom X4 (normalized to this clock speed from the result of another X4 Phenom), while in the C10 64-bit test the 1100T has exactly a 37% better result than the same 3.3GHz X4. As can be seen from this, C11.5 scales much better with cores, so Zambezi's result relative to the 1100T should go up compared to C10, not down. But we see the opposite: instead of going up, the score goes down and scaling is somehow very poor in this test, negating any floating-point IPC boost FMAC can give.

Cinebench 10 summary (leaked fishy results from that "blogger"):
The 1100T gets 19164pts; Zambezi X8 @ not 3GHz (I assume it's more than 3GHz then, say 3.2GHz) gets 28074. This test doesn't scale THAT well with more cores, and the scaling penalty is 9% (relative to perfect scaling with more cores: the 1100T is 37% faster than the 3.3GHz X4 instead of 50% faster, and 1.50/1.37 = ~1.09). Start from the X6 score, apply 33% more cores and the 9% scaling penalty, and normalize for a 3.2GHz clock: 19164 x 1.33 / 1.09 x 3.2 / 3.3 = ~22675pts. The difference between this score and what he got is the IPC improvement, plus maybe some limited Turbo effect which I won't count since this is a heavy FP workload: 28074 / 22675 = ~1.23x, or a 23% IPC improvement per core (one 128-bit FMAC vs one Thuban core). Pretty good so far.

Now the Cinebench 11.5 results summary:
The 1100T gets 5.91pts; Zambezi X8 @ ~3.2GHz supposedly gets 7.37pts. Scaling in this test is perfect, as can be seen from the Hardware Canucks link. Start from the X6 score, apply 33% more cores, no scaling penalty, and normalize for a 3.2GHz clock: 5.91 x 1.33 x 3.2 / 3.3 = ~7.62pts. This is the hypothetical score of a Zambezi X8 that shows ZERO IPC improvement in C11.5 vs Thuban. Now compare with the "blogger's" result of 7.37pts: 7.37 / 7.62 = ~0.97x, or a 3-4% IPC decrease per core vs Thuban. Hmm, fishy indeed :). If the CPU showed similar performance gains vs the older generation (Thuban) as in the previous C10 benchmark, the result should have been roughly 7.62 x 1.23 = 9.37pts. This is for a clock of around 3.2GHz, since he said it is not 3GHz and I assume the worst-case scenario for Zambezi (the best case would be lower than 3GHz). A 3.5GHz X8 should then score around 10.25, in line with the DH slide, which showed a projected score because specs were not yet finalized in late 2010.

But no, the "blogger's" sample somehow sucks in C11.5 :D.

I'm also still puzzled by the results. The CB11.5 ones are very reliable as you change cores and speeds, but the architecture change is the real mystery here. I also don't understand what Dres means when he says it's the same as an X4, since I thought there are 8 pipelines for everything and only 4 when using AVX, which the older generation can't even do.

As for CB10, I think it's bad to compare with an X6 because of how weirdly it scales with cores. Just watching the video, you can see how many times one thread is done and has nowhere to go. I think someone who has a 2P system, or at least 8+ cores, could test what happens to the score as they increase the thread count from 1 to max. Going from 1 to 2 would nearly double it, but 2 to 3 would be like 40%, and then 2 to 4 like 90%, just depending on which threads finish early or fast and whether they have a proper place to go afterwards.
05-20-2011, 07:46 AM
informal

Quote:

Originally Posted by FlanK3r

I don't know, 7 points in R11.5 is impossible for Llano (yes, it's better optimized than Athlon II, but that is still too much; it's at the level of an OC'd X6 Thuban)

I think those are not scores but dates, e.g. the 7th of June is the Llano launch date, etc.
But we already knew this, so he is just reposting known stuff on his blog.
