Please calm down. These numbers are just multiple hypotheses. One is based on passages like this:
Within each cluster 150, execution units 154 may support the concurrent execution of various different types of operations. For example, in one embodiment execution units 154 may support two concurrent load/store address generation (AGU) operations and two concurrent arithmetic/logic (ALU) operations, for a total of four concurrent integer operations per cluster.
(contained in several Bulldozer-related patents)

Some of my related posts:
2 ALU/2 AGU hypothesis:
http://citavia.blog.de/2009/11/13/ho...-have-7366681/
4 ALU/4 AGU hypothesis:
http://citavia.blog.de/2009/11/16/bu...ought-7383623/
2 ALU/2 AGU hypothesis by Hiroshige Goto:
http://citavia.blog.de/2009/12/19/bu...ussed-7605288/
2 ALU/2 AGU possible confirmation reported by Yusuke Ohara:
http://citavia.blog.de/2010/01/11/an...japan-7737558/

But it could also go in another direction:
In one embodiment, the ALU 220 and the AGU 222 are implemented as the same unit.
(also from some patents related to a BD-like architecture, maybe a successor)

And to round it off, there is my hypothesis that there are just 2 ALUs and 2 AGUs, but running at a significantly higher clock than today (maybe even as a double-pumped design).

What counts is the resulting performance of the chip, not whether it has 4 ALUs/4 AGUs at only 2GHz. The goal is not to achieve the highest possible raw power per clock, but the highest performance for different workloads inside a given TDP or ACP "envelope".
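
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only multiplies unit count by clock to get a theoretical peak, so it ignores everything that decides real performance (decode width, scheduling, cache misses, power). The function name, the 4GHz figure and the pump factor of 2 are my own assumptions for illustration, not anything from AMD or the patents; only the "4 ALUs/4 AGUs at 2GHz" case is taken from the discussion above.

```python
def peak_int_ops_per_ns(alus, agus, clock_ghz, pump_factor=1):
    """Theoretical peak integer ops per nanosecond for one cluster.

    Hypothetical helper: assumes every ALU and AGU issues one operation
    per (possibly multiplied) clock cycle. This is an upper bound only.
    """
    return (alus + agus) * clock_ghz * pump_factor


# Wide but slow: the "4 ALUs/4 AGUs at only 2GHz" case.
wide_slow = peak_int_ops_per_ns(alus=4, agus=4, clock_ghz=2.0)

# Narrow but fast: 2 ALUs/2 AGUs at an assumed (made-up) 4GHz.
narrow_fast = peak_int_ops_per_ns(alus=2, agus=2, clock_ghz=4.0)

# Narrow and double pumped: 2 ALUs/2 AGUs on a 2GHz core clock,
# with the execution units assumed to run at twice that rate.
narrow_double_pumped = peak_int_ops_per_ns(
    alus=2, agus=2, clock_ghz=2.0, pump_factor=2
)

print(wide_slow, narrow_fast, narrow_double_pumped)
```

Under these made-up numbers all three configurations land on the same theoretical peak, which is the point: the unit count alone says little until you know the clock and what fits into the power envelope.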