Quote: Originally Posted by ajaidev
BD is a new architecture; Sandy Bridge is not, it's an evolution. Yes, this is a shot in the dark, but it's hardly likely that a single BD module can overtake a dual-core K10.5 in multithreaded tasks, thus the 80-90% "CMT vs true dual core" argument.

You can't use both cores of a module simultaneously for a single thread. For two threads the whole module can be used, and the two threads will share the common FP unit. The scheduler will take care of dividing up multithreaded and single-threaded work.

BD will provide (4x 64-bit FP MUL and ADD)/clock and Nehalem gives (4x 64-bit FP MUL or ADD)/clock
I disagree, your logic is missing one very key fact. Integer performance is far more important than FP for most applications; if it weren't, everyone would be using K10.5s instead of Core 2 Quads (Nehalem is a different beast).

Each Bulldozer module has 2 integer "cores" that are 4-issue. K10.5 is 3-issue integer. That makes a world of difference, so Bulldozer should have a sizeable increase in single-threaded performance; I believe the latest number was around 35%. If you take that increase into account, then the 80% scaling from the second core in the module will surpass the throughput of two physical K10.5 cores (0.8*1.35 + 1.35 = 2.43, against 2.0). Even if you take a 20% increase, which is probably more realistic to be honest, you still get 0.8*1.2 + 1.2 = 2.16. So on average one Bulldozer module will not only be 20% faster than a K10.5 core for single-threaded tasks, but also 8% faster than two K10.5 cores for multithreaded tasks. If you expand that to 2 modules, both sides double (4.32 against 4.0), so you still see about 8% over a K10.5 quad core for multithreaded tasks, which quite frankly is a nice increase.
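For anyone who wants to plug in their own guesses, here is a quick sketch of the same arithmetic in Python. The function name `module_throughput` and the constant `K10_DUAL` are just made up for this post, and the 35%, 20% and 80% figures are the estimates from above, not measured numbers.

```python
def module_throughput(single_thread_gain, second_core_scaling=0.8):
    """Estimated throughput of one two-core module, in units of one K10.5 core.

    single_thread_gain: assumed per-core speedup over K10.5 (0.20 means 20%).
    second_core_scaling: assumed fraction of a full core that the second
    core in the module adds (0.8 is the 80% CMT figure from the thread).
    """
    per_core = 1.0 + single_thread_gain
    return per_core + second_core_scaling * per_core


# Baseline: two physical K10.5 cores, assuming perfect scaling.
K10_DUAL = 2.0

for gain in (0.35, 0.20):
    module = module_throughput(gain)
    advantage = module / K10_DUAL - 1.0
    print(f"per-core gain {gain:.0%}: module = {module:.2f} K10.5 cores, "
          f"vs dual K10.5 = {advantage:+.1%}")
```

Change `second_core_scaling` to 0.9 if you want the optimistic end of the 80-90% range.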

It's simple math really. This is the same reason Core 2 Duo was still faster than K10, even though K10 spanked it in FP: K10 had only 3-issue integer logic to Core 2's 4.