Originally Posted by
savantu
That's actually incredibly good. Why? Because, IMO, Penryn has a better cache subsystem:
Penryn - 32KB L1 3 cycles, 6MB L2 15 cycles
Nehalem - 32KB L1 4 cycles, 256KB L2 11 cycles, 8MB L3 39 cycles
Nehalem has a 4-cycle L1, which is a lot, but it's probably hidden with SMT and other techniques. The small L2 has very good latency, but at only 256KB it's really tiny. The L3 is very large, but also very slow.
Basically, it's far from optimal for single-threaded apps (Core/Penryn are best there).
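To put rough numbers on that, here is a back-of-the-envelope sketch of average load latency for the two hierarchies. The cache latencies are the ones quoted above, treated as total load-to-use cycles for a hit at that level. Everything else is my assumption for illustration only: the hit rates are made up (picked so both chips miss to memory equally often), the 200-cycle memory latency is a placeholder used for both chips, and the model ignores SMT, prefetching and out-of-order overlap entirely.

```python
def amat(levels, memory_latency):
    """Expected load latency in cycles for a simple cache hierarchy.

    levels: list of (hit_latency_cycles, local_hit_rate), L1 first.
    memory_latency: cycles for a load that misses every cache level.
    """
    expected = 0.0
    reach = 1.0  # fraction of loads that get as far as this level
    for latency, hit_rate in levels:
        expected += reach * hit_rate * latency
        reach *= 1.0 - hit_rate
    # Whatever is left missed every level and goes to memory.
    return expected + reach * memory_latency


# Latencies from the post; hit rates are ASSUMED, not measured.
PENRYN = [(3, 0.95), (15, 0.90)]
NEHALEM = [(4, 0.95), (11, 0.60), (39, 0.75)]

# ASSUMED placeholder, same for both chips.
MEMORY_LATENCY = 200

if __name__ == "__main__":
    print("Penryn :", amat(PENRYN, MEMORY_LATENCY))
    print("Nehalem:", amat(NEHALEM, MEMORY_LATENCY))
```

With these made-up hit rates Penryn comes out ahead on average latency, which is the point I was making about single-threaded code. Change the hit rates or the memory latency and the gap moves, so treat it as a way to play with the numbers, not as a measurement.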
To be honest, I expected Nehalem to be slower than Penryn in single-threaded apps that aren't BW dependent. This might still be the case, but it looks like Intel did its job.
Hell, even maintaining Penryn's single-thread performance coupled with K8/K10 scalability (in fact even better) makes Nehalem an excellent all-around monster.