Quote Originally Posted by kl0012


It seems they counted only arithmetic instructions (IPC < 1 does not make sense). Also, ALU consumption may vary greatly between different parts of the code.
They counted both address and math instructions:
Figure 2.2(a) and Figure 2.2(b) represent the instruction profile of CPU2006 and CPU2000 respectively. It is evident from the figure that a very high percentage of instructions retired consist of loads and stores. CPU2006 benchmarks like h264ref, hmmer, bwaves, leslie3d and GemsFDTD have comparatively high percentage of loads while astar, bzip2, gcc, gobmk, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk and gamess have high percentage of branch instructions. On the contrary, CPU2000 benchmarks like gap, parser, vortex, applu, equake, fma3d, mgrid and swim have comparatively high percentage of loads while almost all integer programs have high percentage of branch instructions.
You can see that it was never higher than 1.8x across the whole range of SPEC suite applications. The point is that loads and stores make up a big part of the instruction mix.
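To put rough numbers on that argument, here is a minimal sketch in Python. The mix fractions and the overall IPC are made-up placeholders of mine, not values from the figures quoted above, and `per_class_ipc` is just a hypothetical helper name. It only shows how an overall IPC above 1 can still leave the arithmetic-only rate below 1 once loads, stores and branches take their share.

```python
# Illustrative only: the mix fractions and the overall IPC are made-up
# placeholders, not values taken from the SPEC figures quoted above.

def per_class_ipc(total_ipc, mix):
    """Split an overall IPC into per-class rates using the instruction mix."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix fractions must sum to 1")
    return {name: total_ipc * frac for name, frac in mix.items()}

# Hypothetical mix: fraction of retired instructions in each class.
example_mix = {"load": 0.30, "store": 0.10, "branch": 0.15, "alu": 0.45}

# Hypothetical overall IPC of 1.5 for the whole program.
rates = per_class_ipc(1.5, example_mix)

for name, rate in rates.items():
    print(f"{name:>6}: {rate:.3f} per cycle")
```

With these assumed inputs the ALU share comes out below one instruction per cycle even though the program as a whole retires 1.5, which is why an "IPC < 1" figure can simply mean that only one class of instructions was counted.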

Quote Originally Posted by terrace215
You might also consider AMD's very own slide showing the relative improvement of client / server / hpc.

Client is only 1/2 of server, and 1/3 of HPC. They've been quite open about BD's emphasis. Only when people start asking about single-threaded performance, etc., do they get defensive and start claiming that's going to be just wonderful, too.
What slides? The ones from 2007 that talked about the BD version that got delayed in order to be reworked and improved?