NUMA is only relevant with AMD's 2 socket or higher platforms. A single-socket AMD system is still UMA.
And as evidenced by the QuadFX, NUMA tends to reduce performance in desktop type applications.
Running a single desktop application on Intel will normally be faster than on AMD; Intel processors are designed to run a single thread as fast as possible. Programming multithreaded applications is difficult. Applications that need this type of functionality have developers who know how to take advantage of the processor's capabilities. They know that it is a bad solution to have small threads that talk to each other; Intel processors don't like that at all. The only x86 processor today that handles applications using small stateless threads well (e.g. small stateless functions that could easily be run in their own thread) is AMD's K10. But this type of design isn't done because the market is too small. When Intel releases its next-generation Nehalem, I think this will change.
Applications that use threads today are divided into big parts, avoiding shared memory etc.
The problem arises when you run more than one application, because this is a situation that the developer isn't going to bother with.
Here is a test of memory performance and scaling between Opteron and Xeon.
http://connexitor.com/blog/pivot/entry.php?id=191
http://www.informationweek.com/blog/...adcore_ba.html
Perhaps a multi-threaded compression application is a good example for scaling:
http://www.techspot.com/review/93-am...ion/page7.html
Q6600 goes from 735 to 1253 (from 1 to 4 cores) - only 70% increase
Phenom 9850 goes from 684 to 1402 scales better - 105% increase
My 9850 at stock speed (2.5 GHz) with DDR2-1066 does 692 / 1951 - 182% increase
(Not sure why mine differs so much from the review site...)
Memory is not the reason - Intel has the better single-core score.
Multi-threading is handled a lot better on Phenom: a 2.82x increase from 4 cores on my 9850, compared to 1.7x from 4 cores on the Intel Q6600.
This may explain the better responsiveness and "smoothness" on Phenom - it handles multiple threads better.
P.S. Funny explanation from the reviewer for the big difference (ignoring single-thread results or platform differences) - "perhaps using the 100MHz clock advantage"... btw, 2.4 to 2.5 is only a 4% difference
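For anyone who wants to check the scaling figures quoted above, here is a small Python sketch. It only redoes the arithmetic on the scores already posted; the function names and the "efficiency" figure (speedup divided by core count) are my own additions for illustration, not something from the review.

```python
# Scores quoted in this thread: (single-thread, four-thread).
# The labels are just for printing; the numbers come from the posts above.
SCORES = {
    "Q6600 (review)": (735, 1253),
    "Phenom 9850 (review)": (684, 1402),
    "Phenom 9850 (my run)": (692, 1951),
}
CORES = 4


def speedup(single, multi):
    """How many times faster the multi-threaded run is."""
    return multi / single


def percent_increase(single, multi):
    """The same ratio expressed as a percentage gain over one thread."""
    return (multi / single - 1.0) * 100.0


def efficiency(single, multi, cores=CORES):
    """Fraction of ideal linear scaling achieved (1.0 would be perfect)."""
    return speedup(single, multi) / cores


if __name__ == "__main__":
    for name, (single, multi) in SCORES.items():
        print(
            f"{name}: {speedup(single, multi):.2f}x, "
            f"+{percent_increase(single, multi):.0f}%, "
            f"efficiency {efficiency(single, multi):.2f}"
        )
```

Note that this says nothing about why the scaling differs; it just makes it easy to plug in your own single-thread and multi-thread scores and compare.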
That matches up with Cinebench as well. The multiprocessor speedup was notably higher on my 9850 than it was on my brother's Q6600.
http://en.wikipedia.org/wiki/Front_side_bus
In complex image, audio, video, gaming and scientific applications where the data set is large, FSB speed becomes a major performance issue. A slow FSB will cause the CPU to spend significant amounts of time waiting for data to arrive from system memory.
http://upload.wikimedia.org/wikipedi...rd_diagram.png
That diagram doesn't really depict how a K8 or K10 system works. It's more of a classical K7 and under/Intel model.
It is meant to show how Intel works today, and the bottlenecks with that design. AMD is better.
Here is some more about how the FSB slows down performance on Intel:
The need for an IMC and why the FSB is dead
AMD scales better than Intel when using more than 4 to 8 cores. I've heard this from some people who were using both systems at a university and in some applications. Intel is sticking with its FSB limitation, while AMD has its own HT offering way more bandwidth.
AMD also created a true dual and quad core; Intel instead stuck a few dies together. It works, though, but what do you prefer: something true or something hacked up?
I also switched from a 5600+ X2 to a Q6600 @ 3.2 GHz, and I have the same feeling. The AMD setup just felt smoother. I don't know what it is. Maybe I should play on my brother's Phenom 9850 for a while and compare it to my Q9450 at 3.5 GHz.
Here is one test where a much slower AMD wins over Intel at high resolution. The reason could be that AMD has more I/O power and its I/O doesn't need to compete with memory traffic.
http://www.overclockersclub.com/reviews/intel_q9450/
Not if you know why: the FSB on Intel needs to handle I/O in addition to memory, and that slows it down. What was shown in that review was that a much faster Intel (3.2 GHz with 12 MB L2 cache) loses to a slower AMD (2.3 GHz with 2 MB L2 cache). Games love cache.