AMD cuts to the core with 'Bulldozer' Opterons

**Hans de Vries** · 08-11-2010, 02:55 AM

Originally Posted by Dresdenboy

Hyperpipelining is less painful than it sounds. Simply clocking an existing unit twice as fast (without dividing it into more pipeline stages) would be painful instead.

It's no coincidence that the architects behind the very long SIMD words
(256-bit, 512-bit and longer) are Doug Carmean and Eric Sprangle, who joined
Intel from Ross Technology.

They are exactly the hyperpipelining specialists at Intel:

(1) They co-authored the original hyperpipelining paper:
Increasing Processor Performance by Implementing Deeper Pipelines

(2) They led the original ~60-stage hyperpipelined Nehalem project.
http://www.theinquirer.net/inquirer/...em-slated-2005

(3) They initiated the Larrabee project. One of the main ideas behind
Larrabee is to achieve a theoretical maximum number of FLOPs on a
certain die with a limited number of transistors. A fourfold hyperpipelined
128-bit unit running at 4.8 GHz can produce 512-bit results at 1.2 GHz
using only 25% (plus a bit) of the transistors of a non-hyperpipelined unit.
ftp://download.intel.com/technology/...abee_paper.pdf
http://www.drdobbs.com/high-performa...ting/216402188
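
To make the arithmetic in (3) concrete, here is a back-of-the-envelope sketch in Python. The function name `peak_bits_per_us` is made up for this post, and the model assumes one result per clock at full pipeline occupancy, which is a simplification and not a claim about any real design.

```python
def peak_bits_per_us(width_bits: int, clock_mhz: int) -> int:
    """Peak result bits per microsecond, assuming one result per clock.

    A clock in MHz is the same as cycles per microsecond, so integer
    arithmetic is enough and no floating-point rounding is involved.
    """
    return width_bits * clock_mhz


# Narrow unit clocked four times as fast (128 bit at 4.8 GHz).
narrow_fast = peak_bits_per_us(128, 4800)

# Wide unit at the base clock (512 bit at 1.2 GHz).
wide_slow = peak_bits_per_us(512, 1200)

print(narrow_fast, wide_slow)
```

Both configurations come out at the same peak rate, so the narrow unit trades datapath width (and the transistors that go with it) for clock frequency.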

The SIMD units are the easiest (of all units) to hyperpipeline. All instructions
which could cause problems for hyperpipelining have been systematically
left out of the AVX and LNI specifications (for instance, data shuffles
crossing 128-bit boundaries).
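
A toy model in Python of why lane-local operations suit this scheme. It assumes a 512-bit vector of sixteen 32-bit elements processed as four sequential 128-bit passes; all names (`add_chunked`, `shuffle_is_lane_local`, and so on) are hypothetical, and real hardware is certainly more involved than this.

```python
LANE_BITS = 128
ELEM_BITS = 32
ELEMS_PER_LANE = LANE_BITS // ELEM_BITS  # 4 elements per 128-bit lane
MASK = 0xFFFFFFFF  # wrap results to 32 bits


def add_full_width(a, b):
    """Reference: element-wise add over the whole vector at once."""
    return [(x + y) & MASK for x, y in zip(a, b)]


def add_chunked(a, b):
    """Same add, done as sequential passes over one 128-bit lane at a time.

    Each pass only reads the elements of its own lane, so the passes
    can be fed back to back through one narrow unit.
    """
    out = []
    for start in range(0, len(a), ELEMS_PER_LANE):
        lane_a = a[start:start + ELEMS_PER_LANE]
        lane_b = b[start:start + ELEMS_PER_LANE]
        out.extend((x + y) & MASK for x, y in zip(lane_a, lane_b))
    return out


def shuffle_is_lane_local(perm):
    """perm[dst] is the source element index for destination dst.

    Returns True if every destination reads from its own 128-bit lane.
    """
    return all(src // ELEMS_PER_LANE == dst // ELEMS_PER_LANE
               for dst, src in enumerate(perm))


a = list(range(16))
b = [MASK] * 16
same = add_chunked(a, b) == add_full_width(a, b)

# Reverse the elements inside each lane: stays within 128-bit boundaries.
in_lane = [lane * 4 + (3 - i) for lane in range(4) for i in range(4)]
# Reverse the whole vector: every element crosses a lane boundary.
cross_lane = list(reversed(range(16)))

print(same, shuffle_is_lane_local(in_lane), shuffle_is_lane_local(cross_lane))
```

In this model a cross-lane shuffle needs data from a lane that belongs to a different pass, which is the kind of dependency that gets in the way of running the passes back to back.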

Regards, Hans
