Originally posted by informal:
Summary of FX4110 @ 4.2GHz vs Phenom II X4 980 @ 3.7GHz:
super pi :meh I don't want to bother finding results in this "benchmark"
fritz chess: FX4110 gets 7332pts, X4 980 gets 9067pts. FX is 24% slower at stock (vs stock) and 40% slower per "core" at the same clock. Multithreaded benchmark.
3d mark vantage CPU test: FX4110 gets 10660pts, X4 980 gets 12780pts. FX is 20% slower at stock and 36% slower per "core" at the same clock. Multithreaded benchmark.
wprime 32m: FX4110 gets 17.9s, X4 980 gets 11.45s. FX is 56% slower at stock and 77% slower per "core" at the same clock. Multithreaded benchmark.
c11.5: FX4110 gets 3.42pts, X4 980 gets 4.34pts. FX is 27% slower at stock and 44% slower per "core" at the same clock. Multithreaded benchmark.

I have skipped over 7zip since I can't find comparable benchmarks. AIDA cache and memory shows somewhat better memory read/write and L2/L3 cache BW for reads. The rest of cache performance is on par with or slower than Deneb.

Conclusion: overall FX4110 is 32% slower than Deneb X4 @ 3.7GHz stock vs stock and 49% slower when both are at 4.2GHz. Either all these results are down to platform bugs (or something else), or Bulldozer is much slower than Deneb with the same "thread" count. All of the above tests utilize the "world's first 256bit FPU" and it fails hard versus the "old 128bit" Deneb FPU, even in single thread mode... Imagine the OC you have to reach just to match Deneb; it has to be sky high (think 5.5-6GHz on air to match a 4GHz Deneb). How AMD is going to charge $140 for this chip is beyond me.
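
For anyone who wants to check the arithmetic in the quote, here is a rough sketch that reproduces the percentages from the scores listed. It assumes (as the quoted post appears to) that each figure is Deneb's lead over the FX expressed as a ratio, and that performance scales linearly with clock, which is a simplification. The function and variable names are mine, not from any tool.

```python
# Scores as quoted: (FX4110, X4 980, higher_is_better).
# wPrime is a time in seconds, so lower is better there.
FX_CLOCK_GHZ = 4.2      # FX4110 stock clock, per the quoted post
DENEB_CLOCK_GHZ = 3.7   # Phenom II X4 980 stock clock, per the quoted post

RESULTS = {
    "fritz chess": (7332, 9067, True),
    "3dmark vantage cpu": (10660, 12780, True),
    "wprime 32m": (17.9, 11.45, False),
    "cinebench 11.5": (3.42, 4.34, True),
}


def deneb_lead(fx, deneb, higher_is_better):
    """Deneb's advantage over FX as a ratio (1.24 means 24% ahead)."""
    return deneb / fx if higher_is_better else fx / deneb


def summarize(results=RESULTS):
    """Return {benchmark: (lead at stock, lead at equal clock)}.

    The equal-clock figure assumes linear scaling with frequency.
    """
    clock_ratio = FX_CLOCK_GHZ / DENEB_CLOCK_GHZ
    rows = {}
    for name, (fx, deneb, higher_is_better) in results.items():
        stock = deneb_lead(fx, deneb, higher_is_better)
        rows[name] = (stock, stock * clock_ratio)
    return rows


def pct(ratio):
    """Convert a ratio such as 1.24 into a whole-number percentage (24)."""
    return round((ratio - 1) * 100)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    rows = summarize()
    for name, (stock, same_clock) in rows.items():
        print(f"{name}: {pct(stock)}% at stock, {pct(same_clock)}% clock for clock")
    avg_same_clock = mean(r[1] for r in rows.values())
    # FX clock needed to match a 4GHz Deneb, using the average per-clock lead
    print(f"FX clock to match 4GHz Deneb: {4.0 * avg_same_clock:.2f}GHz")
```

Note that the percentages are ratios of Deneb over FX, so "24% slower" really reads as "Deneb is 24% ahead". The average per-clock lead is also heavily pulled up by wPrime.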
For the record, and as another reference point, going back to my Bobcat clock-for-clock comparisons.
Per core, K10 vs Bobcat is:

Super Pi: K10 5% faster
Fritz chess: K10 20% faster
Cinebench 11.5: K10 49% faster

at the same clock speed.

Which means that, somehow, Bulldozer is about the same performance as Bobcat per clock :S , even on some of these SSE/FPU-heavy benches.
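
To put rough numbers on that, here is a small sketch that chains the two sets of per-clock ratios together. It assumes the K10 figures from my Bobcat comparison and the Deneb figures in the quote are directly comparable (same benchmark versions and settings, and ignoring that the K10 parts may differ), which is a big assumption. The names are made up for illustration.

```python
# Per-clock lead of K10 over Bobcat (from my earlier comparison above).
K10_OVER_BOBCAT = {
    "fritz chess": 1.20,
    "cinebench 11.5": 1.49,
}

# Per-clock lead of K10 (Deneb) over Bulldozer, taken from the quoted post.
K10_OVER_BULLDOZER = {
    "fritz chess": 1.40,
    "cinebench 11.5": 1.44,
}


def bulldozer_vs_bobcat():
    """Estimate Bulldozer's per-clock performance relative to Bobcat.

    BD/Bobcat = (K10/Bobcat) / (K10/BD); a value of 1.0 means parity.
    Only benchmarks present in both tables are compared.
    """
    shared = K10_OVER_BOBCAT.keys() & K10_OVER_BULLDOZER.keys()
    return {
        name: K10_OVER_BOBCAT[name] / K10_OVER_BULLDOZER[name]
        for name in shared
    }


if __name__ == "__main__":
    for name, ratio in sorted(bulldozer_vs_bobcat().items()):
        print(f"{name}: Bulldozer at {ratio:.2f}x Bobcat per clock")
```

Anything near 1.0 means the two are level clock for clock, which is where both of these benchmarks roughly land.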

Has anyone determined the pipeline length details for Bulldozer? I believe Bobcat is 15 stages vs 12 for K10.

It is likely a bit longer to enable these high clock speeds, but I'm still finding some of the results out of line. I know you didn't look up Super Pi, but according to these results it, for one, is slower than Bobcat (as I've mentioned before). Given that the architectures seem to be similar, I find this quite odd.

Even if the pipeline is longer than Bobcat's, with the massive amounts of cache, much larger buffers, a much wider and more capable performance-oriented FPU (Bobcat has a very trimmed-down FPU due to its target market), and presumably more aggressive prefetching, it certainly doesn't make much sense at this stage.

I can understand similar IPC to Thuban given the higher frequency headroom and the trade-offs made to achieve high performance/watt (all valid design decisions), but these outlier results like Cinebench, wPrime, and Fritz are quite baffling.