Could this article, "Analyzing Efficiency of Shared and Dedicated L2 Cache in Modern Dual-Core Processors", be why?

The tested systems are not particularly modern: one is a Socket 939 AMD X2 3800+ (2.0 GHz, E6 revision, 1 MB total L2), and the other is an Intel X6800 (2.93 GHz, B1 Conroe).

Regardless, when the two systems are tested with working set sizes of 512 KB + 512 KB (AMD) and 2 MB + 2 MB (Intel) respectively, the AMD system wins.
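
To make the choice of working set sizes concrete, here is a minimal sketch of the capacity arithmetic. It assumes the AMD part's 1 MB total L2 is split into two dedicated 512 KB caches and that the Conroe has a single 4 MB L2 shared by both cores; the 4 MB figure is not stated above, and the helper `fits_in_l2` is a hypothetical name used only for illustration.

```python
KB = 1024
MB = 1024 * KB

def fits_in_l2(working_sets, l2_sizes, shared):
    """Return True if the per-core working sets fit in L2.

    working_sets: bytes touched by each core.
    l2_sizes: a single size for a shared cache, or one size per core
              for dedicated caches.
    """
    if shared:
        # All cores compete for the same cache, so the sum must fit.
        return sum(working_sets) <= l2_sizes[0]
    # Each core only uses its own cache.
    return all(ws <= size for ws, size in zip(working_sets, l2_sizes))

# AMD X2 3800+: 1 MB total L2, assumed to be 2 x 512 KB dedicated.
amd_fits = fits_in_l2([512 * KB, 512 * KB], [512 * KB, 512 * KB], shared=False)

# Intel X6800 (Conroe): assumed to be one 4 MB L2 shared by both cores.
intel_fits = fits_in_l2([2 * MB, 2 * MB], [4 * MB], shared=True)

print(amd_fits, intel_fits)
```

Under those assumptions both working sets exactly fill the available L2, so the two tests are comparable in terms of capacity. The sketch only checks capacity; it says nothing about latency or contention, which is where a difference between shared and dedicated designs would show up.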