Originally posted by Opteron146:
Hmm, no ...
If you mean "shared" in the sense that Bulldozer's decoder is shared between two INT clusters, then your statement is wrong.

These pipelines have been there since the good old AMD K7 days. Each one consists of a pair: one AGU for address generation and one ALU for all other, normal INT calculations.

Nothing is shared between those two pipelines. Some may count the scheduler, but that one is shared to an even higher degree in Bulldozer, too.

However, sharing in that context is a good thing: the unified scheduler is more flexible in assigning ALU and AGU ops to the ALU/AGU pipelines. Nevertheless, the theoretical maximum throughput of Bulldozer is only 2 AGU + 2 ALU ops per clock, whereas a K10 could issue 3+3 in the best case.

Because of the better efficiency, Bulldozer might still have the higher throughput in reality, but the 3 against 4 pipeline picture is still wrong from a technical point of view.
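
To put numbers on that argument, here is a rough Python sketch. It uses only the peak figures quoted in this post (3 ALU + 3 AGU for K10, 2 ALU + 2 AGU for Bulldozer); the class and field names are made up for illustration, and it says nothing about real-world throughput or scheduler efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntCore:
    """Hypothetical model: only counts execution units, nothing else."""
    name: str
    alus: int  # units for normal INT calculations
    agus: int  # units for address generation

    def peak_ops_per_clock(self) -> int:
        # Best case: every ALU and every AGU issues one op in the same clock.
        return self.alus + self.agus


# Unit counts as stated in the post, not measured values.
K10 = IntCore("K10", alus=3, agus=3)
BULLDOZER = IntCore("Bulldozer", alus=2, agus=2)

for core in (K10, BULLDOZER):
    print(f"{core.name}: {core.alus} ALU + {core.agus} AGU "
          f"= {core.peak_ops_per_clock()} ops/clock (theoretical peak)")
```

Counted this way, the comparison is 6 against 4 units at peak, which is why counting K10 as "3" and Bulldozer as "4" compares different things.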

See also here:
http://www.chip-architect.org/news/O...teger_Core.jpg
http://chip-architect.com/news/2003_...it_Core.html#1

(ALUs 0, 1, 2 and AGUs 0, 1, 2 are clearly visible; they are not one big shared entity)
I believe the AMD engineers have weighed all the options and chosen the best configuration.
The ALUs and AGUs in Bulldozer are new designs, so they should be far superior to and more efficient than the old designs.