Each module has two integer cores and two 128-bit FPUs.
The two 128-bit FPUs are shared between the two cores and are fed by a single FP scheduler.
That uses fewer transistors, and it also lets one 256-bit AVX instruction (decoded into two 128-bit micro-ops) run on both 128-bit FP pipelines simultaneously.
Thus a 256-bit AVX instruction can issue in a single cycle rather than taking two passes, while each core's FP pipeline stays 128-bit.
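To picture the split described above, here is a toy sketch in Python. It is only a model of the idea, not how the hardware is actually wired: the function names (`split_256`, `pipe_add_128`, `avx256_add`) are made up for illustration, and a 256-bit register is assumed to hold eight 32-bit floats.

```python
# Toy model: one 256-bit AVX add handled as two 128-bit micro-ops.
# All names here are hypothetical; this illustrates the concept only.

def split_256(vec8):
    """Split an 8-lane (256-bit) vector into two 4-lane (128-bit) halves."""
    if len(vec8) != 8:
        raise ValueError("expected 8 lanes of 32 bits each")
    return vec8[:4], vec8[4:]

def pipe_add_128(x, y):
    """Stand-in for one 128-bit FP pipeline doing a 4-lane add."""
    return [p + q for p, q in zip(x, y)]

def avx256_add(a, b):
    """Model one 256-bit add as two 128-bit micro-ops, one per pipeline."""
    a_lo, a_hi = split_256(a)
    b_lo, b_hi = split_256(b)
    lo = pipe_add_128(a_lo, b_lo)  # micro-op 1 on FP pipeline 0
    hi = pipe_add_128(a_hi, b_hi)  # micro-op 2 on FP pipeline 1, same cycle
    return lo + hi                 # halves joined back into 8 lanes

result = avx256_add([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], [10.0] * 8)
```

In this model both halves are independent, which is why the two micro-ops can go to the two pipelines at the same time when the other core is not using the FPU.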
Having a 256-bit FP pipeline per core would be faster than using modules if there were enough 256-bit instructions in enough threads. However, it would be a massive increase in die area for a tiny performance gain.
One thing Bulldozer also has is a combined (fused) multiply-add instruction, which does the work of a separate multiply and add in one instruction, roughly halving the time for that pair. It also has higher precision, because the result is rounded once instead of twice.
SB (Sandy Bridge) has 256-bit FP pipelines per core, but lacks a combined multiply-add instruction.
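On the precision point for the combined multiply-add: a rough way to see it in Python is to compare the usual two-step `a * b + c` (rounded after the multiply and again after the add) with a result that is computed exactly and rounded only once. The sketch below emulates the single rounding with `fractions.Fraction`; it does not call any hardware FMA instruction, and the function names and input values are just chosen for illustration.

```python
from fractions import Fraction

def separate_mul_add(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c

def fused_mul_add(a, b, c):
    """Emulate a fused multiply-add: exact a*b+c, rounded to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Inputs picked so that a*b = 1 - 2**-54 exactly, which does not fit in a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

two_step = separate_mul_add(a, b, c)  # the product rounds to 1.0, so the tiny term is lost
one_step = fused_mul_add(a, b, c)     # keeps the -2**-54 term

print(two_step, one_step)
```

With these inputs the two-step version loses the small term entirely, while the single-rounding version keeps it.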
Last edited by Apokalipse; 05-19-2011 at 11:23 AM.