Given the IPC of those applications and their FPU instruction ratio, I don't see a big problem. Low IPC points to a lot of dependent ops or lots of bubbles. So while it is FPU-heavy code, it isn't really intensive for the FPU itself. Longer latency would hurt that thread's own throughput, but another thread would not interfere with its performance (which is also an advantage). The biggest problem is that the code uses an instruction set that is marked obsolete for x86-64. With SSE this would, or should, run pretty decently on BD single-threaded and very well if it is multithreaded... but given the x87 dependency I'm not sure how BD will perform there. It might be pretty bad if they really disregarded those legacy instructions.
This code would run great on K8 architectures (if they can predict well and feed enough work to their execution resources).
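To make the "dependent ops" point concrete, here is a minimal C sketch. It is not taken from the applications being discussed, and the function names `sum_dependent` and `sum_independent` are made up for illustration. The first loop is one long dependency chain, so it tends to be bound by FP add latency and shows low IPC even though almost every instruction is an FPU op. The second loop does the same work with four independent chains that a pipelined FPU can overlap.

```c
#include <stddef.h>

/* One accumulator: every add needs the result of the previous add,
   so the loop tends to be limited by FP add latency, not throughput. */
double sum_dependent(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Four accumulators: the four adds in one iteration do not depend on
   each other, so a pipelined FPU can have several in flight at once. */
double sum_independent(const double *x, size_t n) {
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)  /* leftover elements when n is not a multiple of 4 */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

Code shaped like the first function leaves most FPU issue slots empty, which is why a second thread could use them without slowing the first one down much. Whether the compiler emits x87 or SSE for either loop depends on the target and flags, so treat this as an illustration of the dependency pattern only.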