Originally Posted by JF-AMD
Each core can execute a single thread at a time. An 8-core die can execute 8 threads per cycle; a 16-core processor can execute 16 threads per cycle.
As I understand SMT, a 4-core die has 4 pipelines. If one thread stalls, another can take over that core's pipeline and continue. So, while you technically have 8 threads active, only 4 are running in any given cycle. SMT takes advantage of thread stalls to fill the pipelines with the "on deck" thread. This is why the throughput increase is 15-20% (for servers).
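To put rough numbers on that, here is a back-of-the-envelope sketch of the stall-filling picture described above. It is not how any real core is measured or modeled: it assumes each thread is stalled for a fixed fraction of cycles, that stalls in the two threads are independent, and that a pipeline does useful work in any cycle where at least one of its threads is ready. The name smt_gain and its parameters are made up for illustration.

```python
def smt_gain(stall_fraction: float, threads: int = 2) -> float:
    """Relative throughput gain of one pipeline shared by `threads` threads.

    Toy model (assumption): each thread is independently stalled
    `stall_fraction` of the time, and the pipeline is busy whenever
    at least one thread is not stalled.
    """
    # One thread alone: pipeline is busy only when that thread is ready.
    single = 1.0 - stall_fraction
    # Shared pipeline: idle only when every thread is stalled at once.
    shared = 1.0 - stall_fraction ** threads
    return shared / single - 1.0


if __name__ == "__main__":
    for p in (0.15, 0.20):
        print(f"stall fraction {p:.2f}: gain {smt_gain(p):.1%}")
```

Under these assumptions the two-thread gain works out to exactly the stall fraction, since (1 - p^2) / (1 - p) = 1 + p. So a 15-20% uplift would correspond to a thread that is stalled roughly 15-20% of the time in this simplified view.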