Originally posted by drfedja:
2 ALUs isn't much of a problem because the L1 data cache is 2-ported. There are only two memory operations per cycle, on 10h or even BD. There are two AGUs, which are used for memory address calculations. Most integer operations are movs (reg-mem, mem-reg, etc.). The BD integer core will be much stronger than 10h because of much better memory-level parallelism.
I didn't say it was a problem. Sometimes you really can do more with less. If they had kept the extra ALU, or even added more to satisfy certain high-ILP code (i.e., not average code), they would just be wasting power and transistors that could have been, and were, spent elsewhere (more cores).
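
To put rough numbers on that argument, here is a toy back-of-the-envelope bound. It is only a sketch: the function name `throughput_bound` and the instruction-mix fractions are made up for illustration, and it ignores decode width, dependencies, and cache misses. It simply assumes every instruction needs either one ALU slot or one memory port slot per cycle.

```python
def throughput_bound(alu_units, mem_ports, mem_fraction):
    """Upper bound on instructions per cycle for a hypothetical mix.

    mem_fraction is the (assumed) share of instructions that need a
    memory port; the rest are assumed to need an ALU. Real cores have
    many other limits, so treat this as an illustration only.
    """
    if not 0.0 <= mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be between 0 and 1")
    alu_fraction = 1.0 - mem_fraction
    # Each resource caps throughput at (units available) / (share of mix using it).
    alu_limit = alu_units / alu_fraction if alu_fraction > 0 else float("inf")
    mem_limit = mem_ports / mem_fraction if mem_fraction > 0 else float("inf")
    return min(alu_limit, mem_limit)


if __name__ == "__main__":
    # Hypothetical mixes: memory-heavy "average" code vs. ALU-heavy high-ILP code.
    for label, mem_fraction in (("memory-heavy", 0.5), ("ALU-heavy", 0.25)):
        two = throughput_bound(2, 2, mem_fraction)
        three = throughput_bound(3, 2, mem_fraction)
        print(f"{label}: 2 ALUs -> {two:.2f}, 3 ALUs -> {three:.2f}")
```

Under these assumptions, with half the mix being memory operations, the two memory ports set the limit and a third ALU changes nothing. Only when the mix is ALU-heavy does the extra ALU raise the bound, which is the "high ILP, not average" case.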