Vector processing on Nehalem?

**~~Turtle 1~~** · 10-21-2006, 12:09 PM

Originally Posted by Carfax

I guess they mean Nehalem could get a REAL vector unit. Right now, Intel uses SSEn, which is basically an FPU that can do vector instructions as well.

If Intel takes technology from the Alpha EV8 for Vector, we could see a dedicated vector unit on the die which would be much more powerful than SSEn.

The benefit would be that anything which could be vectorized would see a massive speedup.

Faster encoding/decoding, frames per second, :banana::banana::banana::banana: surfing

So I'm not understanding. You're saying that in Nehalem Intel could add a vector unit but would also keep the SSEn units? As we are all aware, Intel has added the SSE4 instruction set: 30 instructions in Penryn and another 20 in Nehalem, for a total of 50 instructions. Would these still work with the vector unit?

**nn_step** · 10-21-2006, 12:25 PM

Let me explain it this way.
SSE, SSE2, SSE3...SSEn
http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
do floating point and SIMD math (aka vector) in the same unit.
http://en.wikipedia.org/wiki/SIMD
By separating the floating point math unit from the vector unit, they can massively improve performance for BOTH,
since the floating point unit can specialize in floating point math (and not have to worry about vector math)
and the vector unit only has to deal with vectors.
Now AltiVec/VMX (the name depends on who you ask, Motorola or IBM)
basically does exactly that.
What I am hoping for is that they follow the AltiVec design, which is VASTLY superior to ANY Intel/AMD Streaming SIMD Extension.
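
For anyone unsure what "SIMD math" means in practice, here is a minimal sketch in C. The function names `add_scalar` and `add_simd` are made up for illustration; the intrinsics (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`) are the standard SSE ones from `<xmmintrin.h>`. It assumes a compiler that defines `__SSE__` when SSE is available and falls back to the plain loop otherwise.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Scalar version: one float addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* SIMD version: four float additions per instruction where SSE is available. */
void add_simd(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* load 4 floats, no alignment required */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  /* 4 additions at once */
    }
#endif
    /* Leftover elements, or every element on a target without SSE. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results; the difference is only in how many elements each instruction handles.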

**~~Turtle 1~~** · 10-21-2006, 01:56 PM

Nice links, nn. Now, if I understand this correctly, vector units need their own registers to operate efficiently. True or false? Is it possible that the Russian company Intel bought a while back will aid Intel with a much better compiler that could overcome the FPU and vector units trying to use the registers at the same time? Anyone?

**nn_step** · 10-21-2006, 02:07 PM

Ideally speaking, you would want 32 registers JUST for the vector unit, 128 or 256 bits wide apiece.
That will double the space needed for floating point/vector math, but you will get up to (in theory) 4 times the processing power, which SHOULD make it a floating point/vector monster.
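
To put rough numbers on that, here is a small sketch of the arithmetic. The helper names `register_file_bytes` and `lanes` are hypothetical, and reading the "4 times" figure as "four 32-bit floats per 128-bit register" is an assumption, not something stated above.

```c
#include <assert.h>

/* Total storage, in bytes, of `count` registers that are `width_bits` wide each. */
unsigned register_file_bytes(unsigned count, unsigned width_bits)
{
    assert(width_bits % 8 == 0);
    return count * (width_bits / 8);
}

/* How many elements of `elem_bits` fit side by side in one register. */
unsigned lanes(unsigned width_bits, unsigned elem_bits)
{
    assert(elem_bits != 0 && width_bits % elem_bits == 0);
    return width_bits / elem_bits;
}
```

With these, `register_file_bytes(32, 128)` is 512 bytes and `register_file_bytes(32, 256)` is 1024 bytes, while `lanes(128, 32)` is 4, which would match a theoretical 4x over one float at a time.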

**Carfax** · 10-21-2006, 11:33 PM

Originally Posted by nn_step

Now what I am hoping for is that they follow the Altivec design, which is VASTLY superior to ANY Intel/AMD Streaming SIMD Extension

Actually, with Core 2 Duo, AltiVec is no longer superior to SSE2, as Core 2's throughput is similar to AltiVec's when working on 32-bit single-precision floats. SSE2 is actually superior to AltiVec in many ways; for one, it can do 64-bit double-precision math, while AltiVec cannot.
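
As a rough illustration of the double-precision point, SSE2 packs two 64-bit doubles into one 128-bit register. The sketch below uses the standard SSE2 intrinsics from `<emmintrin.h>`; the function name `add_doubles` is made up, and the code assumes the compiler defines `__SSE2__` when SSE2 is enabled, falling back to a scalar loop otherwise.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_loadu_pd, _mm_add_pd, _mm_storeu_pd */
#endif

/* Adds two arrays of doubles, two elements per instruction where SSE2 is available. */
void add_doubles(const double *a, const double *b, double *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);  /* load 2 doubles, no alignment required */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));  /* 2 additions at once */
    }
#endif
    /* Leftover element, or every element on a target without SSE2. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Note the trade-off: the same 128-bit register that holds four single-precision floats holds only two doubles.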

If Intel decides to improve on the vector capabilities of future processors, there's plenty that they can do.

They'll need to increase bandwidth though, as bandwidth is a severe limitation for vector performance.
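
A back-of-the-envelope sketch of why bandwidth becomes the limit: a streaming add `out[i] = a[i] + b[i]` moves three elements through memory for every addition. The helper names are hypothetical, and the model assumes nothing is served from cache, which is the worst case.

```c
#include <assert.h>

/* Memory traffic per element for out[i] = a[i] + b[i]: two loads plus one store. */
unsigned stream_add_bytes_per_element(unsigned elem_bytes)
{
    return 3 * elem_bytes;
}

/* Bytes per second needed to sustain `adds_per_second` with no help from cache. */
double required_bandwidth(double adds_per_second, unsigned elem_bytes)
{
    assert(adds_per_second >= 0.0);
    return adds_per_second * stream_add_bytes_per_element(elem_bytes);
}
```

For example, with a purely illustrative rate of 8e9 single-precision adds per second (one 4-wide add per cycle at a hypothetical 2 GHz), `required_bandwidth(8e9, 4)` comes to 96e9 bytes per second, so the memory system would need to keep up or the vector unit would sit idle.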
