AS WE NOTED here, the chaps at Xtremesystems have been showing off a sample of Intel’s Kentsfield Quad Core processor, presumably made available early to manufacturers in order to validate their motherboards.
At its default speed, the 2.4GHz part impressively outperforms the Conroe in multithreaded apps such as Cinebench, even a Conroe running at 4GHz, and it overclocks to 3.2GHz without problems – one hell of an achievement for an early sample, especially since all four CPU cores have to be capable of running at this speed.
However, looking at the benchmarks reveals evidence of the dreaded bus saturation – AMD’s justification for direct connect. With two cores, the two products are fed with similar amounts of data, but moving up to four cores, AMD’s solution scales better.
Even with the logistical nightmare of AMD’s architecture (you need at least four DIMMs to deliver the best performance), it is a big advantage.
Compare Kentsfield to AMD's 4x4: on a 1066MHz bus the Intel socket has 8.5GB/s available. This isn’t a problem for dual core chips, but stick two Conroes onto a module, then bandwidth drops to a miserly 2.1GB/s per core – the equivalent of just one stick of DDR266, and with higher latency!
So here’s the killer – each 4x4 core will receive over three times the bandwidth of its Kentsfield equivalent.
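The arithmetic behind those figures can be checked with a rough sketch. The Intel numbers come from the article; the 4x4 side is our assumption (dual-channel DDR2-800 per socket, shared by two cores), and `peak_gbs` is just an illustrative helper, so treat the ratio as a ballpark rather than a measured result.

```python
# Peak theoretical bandwidth in GB/s for a 64-bit (8-byte) wide bus.
def peak_gbs(mega_transfers_per_s, bytes_per_transfer=8, channels=1):
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000

# Intel side (figures from the article).
fsb = peak_gbs(1066)               # 1066MHz bus, about 8.5GB/s
kentsfield_per_core = fsb / 4      # four cores share the one bus
ddr266_stick = peak_gbs(266)       # one stick of DDR266, for comparison

# AMD side. ASSUMPTION, not from the article: each 4x4 socket has
# dual-channel DDR2-800 feeding its two cores.
amd_socket = peak_gbs(800, channels=2)
amd_per_core = amd_socket / 2

print(f"Kentsfield per core: {kentsfield_per_core:.2f} GB/s")
print(f"One DDR266 stick:    {ddr266_stick:.2f} GB/s")
print(f"4x4 per core:        {amd_per_core:.2f} GB/s")
print(f"Ratio:               {amd_per_core / kentsfield_per_core:.2f}x")
```

Under those assumptions the ratio lands at roughly three to one; faster or slower memory on the AMD side would shift it accordingly.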
The evidence of this is in the performance of SuperPi 1M, generally seen as a benchmark that largely fits in 2MB of cache and doesn’t depend heavily on bandwidth. Run four copies, though, and each takes 20 per cent longer than when two copies are run on a Conroe of the same speed: 25.3 seconds compared to 21 seconds.
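For anyone wanting to check the sums, the slowdown follows directly from the two timings quoted above. This is a quick sketch; the variable names are ours.

```python
# SuperPi 1M timings quoted in the article, in seconds per copy.
quad_time_s = 25.3   # four copies at once on Kentsfield
dual_time_s = 21.0   # two copies at once on a Conroe of the same speed

# Fractional increase in run time per copy.
slowdown = quad_time_s / dual_time_s - 1

print(f"Each copy takes {slowdown:.1%} longer")
```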
So what’s causing this problem? Our theory is that a strength of the core, its intelligent prefetchers, is working against it: the prefetchers are getting in each other’s way, saturating the bus, and increasing the latency to RAM. Move to a more bandwidth-hungry application and the problem will surely just get worse.
This is a problem not shared by a Conroe, as there is only one bus interface (including the prefetcher), shared between the two cores. Likewise, a dual Woodcrest doesn’t suffer, because of its two independent buses.
But in context, this is still very impressive stuff. AMD’s 4x4 platform will certainly offer an advantage in this area, though it may not be enough to hand the ultimate quad core performance crown to the boys in green until rev H hits. If many other apps are 20 per cent slower per thread on Kentsfield than on Conroe, however, it will make things a lot closer.