Quote: Originally posted by savantu
That's actually incredibly good. Why? Because, IMO, Penryn has a better cache subsystem:

Penryn - 32KB L1 (3 cycles), 6MB L2 (15 cycles)
Nehalem - 32KB L1 (4 cycles), 256KB L2 (11 cycles), 8MB L3 (39 cycles)

Nehalem has a 4-cycle L1, which is a lot, but it's probably hidden by SMT and other techniques. The small L2 has very good latency, but at only 256KB it's really tiny. The L3 is very large, but also very slow.

Basically, it's far from optimal for single-threaded apps (Core/Penryn are best there).
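
One rough way to put numbers on this trade-off is average memory access time (AMAT). This is a minimal sketch: the cycle counts are the ones listed above, but the hit rates and the 200-cycle memory latency are made-up assumptions for illustration only (real values depend heavily on the workload, and Nehalem's memory latency would not be the same as Penryn's). The `amat` helper is hypothetical, not taken from any tool.

```python
def amat(levels, memory_latency):
    """Average memory access time in cycles.

    levels: list of (hit_latency_cycles, hit_rate) ordered from L1 outward,
            where hit_latency is the total latency of a hit at that level.
    memory_latency: cycles for an access that misses every cache level.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency, hit_rate in levels:
        total += reach * hit_rate * latency
        reach *= 1.0 - hit_rate
    return total + reach * memory_latency


# ASSUMPTION: 200-cycle DRAM latency for both chips (not from the post).
MEMORY_LATENCY = 200

# ASSUMPTION: all hit rates below are invented for illustration.
penryn = amat([(3, 0.95), (15, 0.90)], MEMORY_LATENCY)
nehalem = amat([(4, 0.95), (11, 0.60), (39, 0.80)], MEMORY_LATENCY)

print(f"Penryn  AMAT: {penryn:.3f} cycles")
print(f"Nehalem AMAT: {nehalem:.3f} cycles")
```

Try different hit rates: a working set that fits in 256KB makes the fast L2 pay off, while one that spills into the L3 makes the 39 cycles hurt.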

To be honest, I expected Nehalem to be slower than Penryn in single-threaded apps that aren't bandwidth-dependent. This might still be the case, but it looks like Intel did its job.
Hell, even maintaining Penryn's single-thread performance coupled with K8/K10 scalability (in fact even better) makes Nehalem an excellent all-around monster.
K10 actually went forward in "all-roundedness" (mostly multi-threaded, but single-threaded perf went UP at least), and all it got was bashing from basically the same people rejoicing now. I'm gonna have to call you (and the others) out on this.

K8 scaled like C2Duo BTW. About the same underpinnings. It's just C2Q that started to show signs of fatigue with 4 cores.