Quote: Originally Posted by JumpingJack
..
Nehalem has simply improved clock-for-clock performance. The sites running single GPUs and comparing against a QX9770 are showing gaming situations that are already pinned against the GPU's performance limit, so no matter what one does, the overall result will make the CPUs 'look the same'. The fact is, i7 is showing similar gains in gaming code execution as it is in 3D rendering or video encoding ... the current crop of reviews/GPUs hides it because the GPU is capping the results.

People focus on the QPI and IMC as the major changes to Nehalem, but these were not the only major changes -- Intel also deepened the execution window and improved branch prediction (both good for gaming code). Still, the tri-SLI results from Guru3D and Tom's (recently posted) actually surprised me -- I was expecting modest gaming improvements, but some are just huge...
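The GPU-cap argument above can be sketched as a toy model: the frame rate you actually see is set by whichever of the CPU or GPU is slower. All the numbers and names below (cpu_a, cpu_b, the fps values) are made up for illustration and are not taken from any review.

```python
def observed_fps(cpu_fps, gpu_fps):
    # The slower stage sets the frame rate the benchmark reports.
    return min(cpu_fps, gpu_fps)

# Hypothetical frame rates each part could sustain on its own.
cpu_a = 100.0   # older CPU (assumed value)
cpu_b = 125.0   # newer CPU, 25% faster in game code (assumed value)
gpu_setups = {"single GPU": 60.0, "tri-SLI": 160.0}  # assumed values

def visible_cpu_gap(gpu_fps):
    # Percent difference between the two CPUs as a review would measure it.
    slow = observed_fps(cpu_a, gpu_fps)
    fast = observed_fps(cpu_b, gpu_fps)
    return (fast - slow) / slow * 100.0

for name, gpu_fps in gpu_setups.items():
    print(f"{name}: visible CPU gap {visible_cpu_gap(gpu_fps):.0f}%")
```

With the single GPU both CPUs are capped at the same figure, so the gap shows up as zero; once the GPU ceiling is raised above both CPUs, the full CPU difference becomes visible.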
Nehalem isn't just Core on steroids (IMC + QPI). They've improved every single part of the core; everything was tinkered with.
Some changes are more radical, like macro-op fusion now working in 64-bit mode and covering a larger variety of instruction pairs, SMT, a second-level TLB and so on.

Where Nehalem stumbles is the slower L1 and small L2. People forget that Penryn set an incredibly high standard to start with (look at AMD being unable to offer an answer, probably not until 2010) and had a nice, massive, very fast 6MB L2. As long as an app is cache friendly, Penryn will rock.