It's more interesting that they slimmed down the integer cores to 2 ALUs/2 AGUs.

This is a big change from the 3 ALUs/3 AGUs that AMD has used for K7/K8/K10.
Combine this with the four-wide front end/decoder and it is clear that BD is a
throughput design optimized for server-type workloads.

It looks like AMD concluded that it can't match Intel in single-core performance
and would always have to throw 2 cores at every Intel core. It sat down and
looked carefully at how much sharing and trimming it could do to make 2 of its
cores closer in area and power to 1 Intel Nehalem/SB core. The biggest moves
are sharing the front end and FPU between cores and slimming down the integer
cores. This should keep AMD in the game in servers, where it can put two 8-core
devices into an MCM and sell it against 8-core SB and 10-core Westmere-EX.
The loss of the third ALU/AGU in BD's integer cores likely won't make a lot
of difference to commercial server workloads, which have low ILP and high MLP.
The sharing of the FPUs probably won't hurt HPC performance much because,
with 8 cores per die and 16 cores per package, memory system limitations will
tend to dominate.
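
To put the "memory system limitations will dominate" point in rough numbers, here is a minimal roofline-style sketch. This is a standard simplification, not anything AMD has published: attainable throughput is the smaller of the compute peak and memory bandwidth times arithmetic intensity. The function name and every figure in it are hypothetical, picked only to show the shape of the argument.

```python
def package_throughput(cores, gflops_per_core, mem_bandwidth_gbs, flops_per_byte):
    """Roofline-style estimate of attainable GFLOP/s for one package.

    All inputs are illustrative assumptions, not measured BD figures.
    """
    compute_peak = cores * gflops_per_core               # GFLOP/s if never stalled
    memory_ceiling = mem_bandwidth_gbs * flops_per_byte  # GFLOP/s memory can feed
    return min(compute_peak, memory_ceiling)


# Hypothetical 16-core package on a bandwidth-hungry HPC kernel.
dedicated_fpu = package_throughput(16, 8.0, 40.0, 0.5)
shared_fpu = package_throughput(16, 4.0, 40.0, 0.5)  # per-core FP peak halved
print(dedicated_fpu, shared_fpu)
```

With these made-up inputs both cases hit the same memory ceiling, so halving the per-core FP peak changes nothing. A kernel with high arithmetic intensity would behave differently.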

IMO where it becomes problematic is building client devices based on the BD
architecture. AMD will really have to ratchet up its CPU clock rates to more
than make up for the loss in IPC from the slimmed down integer cores.
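
As a back-of-the-envelope check on the clock-rate point: if single-thread performance is approximated as IPC times clock frequency (a simplification that ignores memory stalls not scaling with clock), the break-even clock increase follows directly. The function name and the IPC-loss values below are hypothetical. The actual IPC loss for BD is not known here.

```python
def breakeven_clock_ratio(ipc_loss_fraction):
    """Clock multiplier needed to offset a fractional IPC loss,
    assuming performance ~ IPC * clock (a simplifying assumption)."""
    if not 0.0 <= ipc_loss_fraction < 1.0:
        raise ValueError("ipc_loss_fraction must be in [0, 1)")
    return 1.0 / (1.0 - ipc_loss_fraction)


# Illustrative values only, not measurements.
for loss in (0.05, 0.10, 0.20):
    ratio = breakeven_clock_ratio(loss)
    print(f"{loss:.0%} IPC loss needs {ratio:.3f}x clock to break even")
```

Under this model a 20% IPC loss needs a 25% higher clock just to match the old core, and more than that to actually pull ahead.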


http://aceshardware.freeforums.org/a...iew-t1042.html