Even better: If someone has machines with various brands of CPU they can make a type of benchmark that CAN be measured and repeated. If they ADD other benchmark(s) at the SAME TIME then they can create some interesting Xtreme conditions. They could try to find something that slows down one machine but not the other AND is repeatable.
For example: What happens on various CPUs if you run Prime95 WHILE playing a game? Does one system bog down while the other doesn't even notice? Does Prime95 itself run slower?
If you run 4 threads of Prime95 (perhaps 8?) plus a game benchmark, and everything works without problems... then add something else like a virus scanner. You can keep adding different things that use up various resources, one at a time, until one or both machines slow down. Then back off a bit and see if you can determine which machine had problems first. If this process is repeatable then you have found a method to determine "smoothness".
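A rough sketch of that add-one-load-at-a-time loop, in Python. Nothing here launches Prime95 or a game: the `measure` callback, the load names, the 10% tolerance and the `fake_machine` stand-in are all made up for illustration. On a real setup `measure` would have to start the listed background loads, run the game benchmark, and return its score.

```python
from typing import Callable, Optional, Sequence


def find_breaking_point(
    measure: Callable[[Sequence[str]], float],
    loads: Sequence[str],
    tolerance: float = 0.9,
) -> Optional[int]:
    """Return how many background loads were running when the benchmark
    score first fell below tolerance * baseline, or None if it never did."""
    baseline = measure([])  # score with nothing else running
    for count in range(1, len(loads) + 1):
        score = measure(loads[:count])  # add one more load each step
        if score < tolerance * baseline:
            return count
    return None


def is_repeatable(measure, loads, runs: int = 3, tolerance: float = 0.9) -> bool:
    """True if every run reports the same breaking point."""
    results = {find_breaking_point(measure, loads, tolerance) for _ in range(runs)}
    return len(results) == 1


def fake_machine(cost_per_load: float) -> Callable[[Sequence[str]], float]:
    """Hypothetical stand-in for a real machine: each active load costs a
    fixed number of points off a score of 100."""
    def measure(active: Sequence[str]) -> float:
        return 100.0 - cost_per_load * len(active)
    return measure


# Assumed load list, in the order they get added.
loads = ["prime95 x4", "virus scan", "video encode", "file copy"]
machine_a = fake_machine(2.0)  # barely notices extra load
machine_b = fake_machine(6.0)  # bogs down quickly

print("A breaks at:", find_breaking_point(machine_a, loads))
print("B breaks at:", find_breaking_point(machine_b, loads))
print("B repeatable:", is_repeatable(machine_b, loads))
```

The machine that reports the smaller breaking point is the one that "had problems first", and the result only means something if `is_repeatable` holds over several runs on real hardware.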
Of course if one brand works better than the other when this process is done... you can expect some people to claim: "I never run all of that stuff at the same time so it doesn't matter to me." However if this kind of thing can be measured AND proven then it is definitely something to consider.
The problem is that this type of "Xtreme" benchmark is not easy to find. In fact various people have already tried the 4xPrime95 + Game test. We didn't really see any differences between brands. So the hypothetical "slowdown" would take a lot more work to actually find.
EDIT: BTW: I do actually expect this process to happen... but it is very probable that it will be done by somebody comparing the new Intel i7 with an older Intel chip of comparable speed.