Here is everything I got (read: understood) from the 20 Questions that John answered on his AMD blog.
-A BD module = 2 cores. Call them what you want, they are two cores and will always be treated by the OS as two cores.
-Each core's FPU can be looked at in one of two ways, but the end result is essentially the same. Either A) there are two individual 128-bit FPUs, one per core, that only pair up to process 256-bit AVX instructions, or B) each core gets half (128 bits) of a shared 256-bit FPU, which only acts as a full 256-bit unit for AVX instructions.
So I'm able to conclude:
a. Each core's FPU can process its own information in a single cycle, but there is still only a single scheduler. So I assume one core will receive its orders ever so slightly after the other core.
b. BD has one 256-bit-capable FPU (for AVX) per module. Thus, an FX 4000 series will have 2 modules, for 4 cores, with 2 256-bit FPUs; an FX 6000 series will have 3 working modules (disregarding any potentially 'disabled' ones), for 6 cores, with 3 256-bit FPUs; an FX 8000 series will have 4 modules, for 8 cores, with 4 256-bit FPUs.
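The arithmetic in point b can be written out as a quick Python sketch. The names (`CORES_PER_MODULE`, `FPUS_PER_MODULE`, `ACTIVE_MODULES`, `module_summary`) are made up for illustration, and the numbers are only the ones listed above, not an official AMD spec.

```python
# Assumptions taken from the list above (not from an official spec):
# each BD module has 2 cores and 1 FPU that is 256-bit capable for AVX.
CORES_PER_MODULE = 2
FPUS_PER_MODULE = 1

# Working module counts per FX series, as stated in point b.
ACTIVE_MODULES = {"FX 4000": 2, "FX 6000": 3, "FX 8000": 4}


def module_summary(modules):
    """Return (cores, 256-bit FPUs) for a chip with `modules` working modules."""
    return modules * CORES_PER_MODULE, modules * FPUS_PER_MODULE


for series, modules in ACTIVE_MODULES.items():
    cores, fpus = module_summary(modules)
    print(f"{series}: {modules} modules, {cores} cores, {fpus} x 256-bit FPU")
```

If either per-module assumption turns out to be wrong, only the two constants at the top need changing.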
That about the gist of it?