Again, it's not a logical vs. physical debate. If your workload is pure integer, you can execute 8 threads on 8 real integer cores at the same time. If your workload is pure 128-bit floating point, you can again execute 8 threads at the same time. In this case the FPU is a little more akin to HT, in that one unit is serving multiple threads, but ultimately it behaves more like two real cores: it splits in half and executes each thread independently. If your workload is pure 256-bit floating point, you can only execute 4 threads at the same time, which is not like HT at all. To make it even more interesting, in a given cycle the CPU could look like five, six, or seven "cores" depending on the mix of instructions. Using the "core" nomenclature for something that isn't built on the traditional model isn't very precise, but if you want to call it anything, you should go by the integer cores, since the vast majority of arithmetic in computer programs is integer. Eight real "cores".
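
To put numbers on that, here is a rough Python sketch of the counting argument. It is a toy model, not how the real scheduler works. It assumes 4 modules, each with 2 integer cores and one shared FPU that acts as two 128-bit halves, and it assumes a 256-bit op needs both halves in the same cycle. The names (`threads_issuing`, `effective_cores`, the `"int"`/`"fp128"`/`"fp256"` labels) are made up for illustration.

```python
# Toy model of the counting argument (assumptions, not real scheduler behavior):
# - 4 modules, each running up to 2 threads
# - each module has 2 integer cores and 1 shared FPU with two 128-bit halves
# - an integer op needs no FPU half, a 128-bit FP op needs one,
#   and a 256-bit FP op needs both halves in the same cycle
FPU_HALVES_PER_MODULE = 2
HALVES_NEEDED = {"int": 0, "fp128": 1, "fp256": 2}


def threads_issuing(module):
    """Count how many of a module's threads can execute in one cycle."""
    free_halves = FPU_HALVES_PER_MODULE
    running = 0
    for kind in module:
        need = HALVES_NEEDED[kind]
        if need <= free_halves:
            free_halves -= need
            running += 1
    return running


def effective_cores(modules):
    """Sum the per-module counts to get the chip-wide 'core' count."""
    return sum(threads_issuing(m) for m in modules)


pure_int = [("int", "int")] * 4
pure_fp128 = [("fp128", "fp128")] * 4
pure_fp256 = [("fp256", "fp256")] * 4
# One hypothetical mix: only the module with two 256-bit threads loses a slot.
mixed = [("int", "int"), ("int", "fp256"), ("fp256", "fp256"), ("fp128", "fp128")]

for name, load in [("int", pure_int), ("fp128", pure_fp128),
                   ("fp256", pure_fp256), ("mixed", mixed)]:
    print(name, effective_cores(load))
```

Under those assumptions, the pure cases come out to 8, 8, and 4, and the mixed example lands at 7. Swap more modules over to two 256-bit threads and you get 6 or 5.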