Actually, the new, "demanding" apps are running SSE code, not x87 FP code.
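And don't tell me they are the same, because they are very different. x87 FP code has been left in essentially to support old applications that rely on it. It is notoriously hard to optimize in assembly language because its eight registers are organized as a stack: most instructions work on the top of the stack, so you end up juggling operands with FXCH just to get them into place. (Instruction length is not the difference here; x87 and SSE instructions are both variable length, like everything else on x86.)
This is one of the main reasons SSE/2/3/4 were developed. They use a flat set of XMM registers that can be addressed directly, and a single packed instruction can perform the same mathematical calculation on several values at once. This is why it is so much easier to optimize SSE code for speed at the assembly level - no stack shuffling, and every register is directly addressable.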
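To make the difference concrete, here is a rough sketch in C using the standard SSE intrinsics from `<xmmintrin.h>`. The function names `add_scalar` and `add_sse` are made up for illustration, and it assumes an x86 compiler with SSE enabled. The scalar loop does one add per iteration (a 32-bit compiler without SSE flags would typically turn it into x87 code), while the SSE loop adds four floats per instruction.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical helper: plain scalar loop, one float add per iteration. */
void add_scalar(const float *a, const float *b, float *out, int n) {
    for (int i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical helper: packed SSE loop, four float adds per instruction. */
void add_sse(const float *a, const float *b, float *out, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* load 4 floats, no alignment required */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    /* Handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions produce the same results; the SSE one just gets there in roughly a quarter of the add instructions, and the operands live in named XMM registers instead of on a stack.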
And since there already is a very large installed base of SSE code, this is where K10 is going to see the biggest improvement over K8.