Quote: Originally Posted by Apokalipse
It's not just my definition.
Also, each core in BD can do its own fetch and retire with its own thread.
Yes, it uses shared hardware to do it. But it still works that way.
It's AMD's definition; you have simply accepted it. Try to find me a computer science source that agrees with that definition. You won't, because until now instruction fetch and retire were traditionally considered part of the core.

Yes it is.
Sharing the FP scheduler (capable of handling two threads of FP micro-ops simultaneously) is just more flexible than two independent schedulers, AND uses fewer transistors.
Yes it can.
It just means it has to take two cycles to process each 128-bit micro-op.
I agree that it is more flexible and I like that design. But being flexible doesn't mean that it is really two separate FP units. Take away one half and it wouldn't be able to process 256-bit instructions. The two halves are designed to be ganged together to process them. One half on its own can't work through a 256-bit instruction in two 128-bit steps; that would require a redesign of the FP unit.
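
To make the two positions concrete, here is a toy Python model of the cycle counts being argued about. It is only a sketch: the function name, the constant, and the assumption that an op is cracked into 128-bit micro-ops with one micro-op per pipe per cycle are all mine, not AMD's documentation. Whether the real FP unit can actually run in the one-pipe mode is exactly the point in dispute, so treat the one-pipe result as the quoted claim, not as established fact.

```python
from math import ceil

# Assumed width of one FP pipe, taken from the 128-bit figure in the thread.
PIPE_WIDTH_BITS = 128


def cycles_for_op(op_width_bits: int, pipes_available: int) -> int:
    """Issue cycles for one FP op in this toy model (hypothetical helper).

    Assumes the op is split into 128-bit micro-ops and each available
    pipe handles one micro-op per cycle. Latency, scheduling, and any
    real hardware restrictions are ignored.
    """
    if pipes_available < 1:
        raise ValueError("need at least one pipe")
    micro_ops = ceil(op_width_bits / PIPE_WIDTH_BITS)
    return ceil(micro_ops / pipes_available)


if __name__ == "__main__":
    # Both halves ganged together on one 256-bit op.
    print(cycles_for_op(256, 2))
    # The disputed case: one half working through a 256-bit op alone.
    print(cycles_for_op(256, 1))
```

In this model the ganged case takes one cycle and the single-pipe case takes two, which is the behaviour the quoted post describes. The counter-argument above is that the hardware does not support that second case without a redesign.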

From John Fruehe's FlexFP article:
The beauty of the Flex FP is that it is a single 256-bit FPU that is shared by two integer cores.