AMD's Bobcat and Bulldozer

**Chumbucket843** · 08-25-2010, 11:22 AM

Originally Posted by Hornet331

well x86 quite sucks at ipc.. 1.5 is a good value.

IPC isn't comparable across different architectures. for example, a single SSE instruction (mulps) can do 4 multiplies on 32-bit floating point numbers, while fmul can do only one. yes, SSE is explicitly data parallel, but that is part of the weakness of IPC measurements: the same IPC can mean very different amounts of work.
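to put rough numbers on that, here is a toy sketch in Python. it only counts instructions. the rate of one instruction per cycle is an assumption made purely for illustration, not a measurement of any real CPU, and `instructions_needed` is a made-up helper name.

```python
import math

def instructions_needed(n_floats, lanes):
    """Instructions needed to do n_floats multiplies at `lanes` multiplies per instruction."""
    return math.ceil(n_floats / lanes)

N = 1024            # number of 32-bit float multiplies to perform
ASSUMED_IPC = 1.0   # assumption: both instruction streams retire 1 instruction per cycle

scalar_instr = instructions_needed(N, lanes=1)   # fmul-style: one multiply per instruction
packed_instr = instructions_needed(N, lanes=4)   # mulps-style: four multiplies per instruction

scalar_cycles = scalar_instr / ASSUMED_IPC
packed_cycles = packed_instr / ASSUMED_IPC

print(scalar_instr, packed_instr)       # 1024 256
print(scalar_cycles / packed_cycles)    # 4.0
```

both streams report the same IPC under this assumption, yet the packed one finishes the same work in a quarter of the cycles.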

a better example would be a sine function. you can use the Taylor series to get a good estimate. modern x86 CPUs take ~40-100 cycles to execute the fsin instruction.

Taylor series approximation:
sin(x) ≈ x - (x^3)/3! + (x^5)/5! - (x^7)/7!

2 subtractions
30 multiplies
3 divides
1 add

36 arithmetic operations on a RISC processor are equal to 1 (very slow) instruction on x86. this is a select case, though. normally RISC code uses about 30% more code space.
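the tally can be checked with a small Python sketch that evaluates the four-term series the naive way, recomputing every power and factorial from scratch, and counts each operation. `sin_taylor_naive` is a made-up name. note that this version skips the trivial multiplies by 1, so it counts 24 multiplies. if every product is started from 1 instead, the count is 3+5+7 for the powers plus 3+5+7 for the factorials, which gives the 30 quoted above.

```python
import math

def sin_taylor_naive(x):
    """Four-term Taylor sine, recomputing each power and factorial from scratch."""
    ops = {"mul": 0, "div": 0, "add": 0, "sub": 0}
    result = x
    sign = -1                          # series alternates: -, +, -
    for n in (3, 5, 7):
        power = x
        for _ in range(n - 1):         # x^n takes n - 1 multiplies
            power *= x
            ops["mul"] += 1
        fact = 1
        for k in range(2, n + 1):      # n! takes n - 1 multiplies (skipping the *1)
            fact *= k
            ops["mul"] += 1
        term = power / fact
        ops["div"] += 1
        if sign < 0:
            result -= term
            ops["sub"] += 1
        else:
            result += term
            ops["add"] += 1
        sign = -sign
    return result, ops

value, ops = sin_taylor_naive(0.5)
print(ops)                             # {'mul': 24, 'div': 3, 'add': 1, 'sub': 2}
print(abs(value - math.sin(0.5)))      # small truncation error
```

either way you count it, it is dozens of simple operations standing in for a single fsin.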

this algorithm has room for improvement, actually. we can keep the powers of x we have already computed and save many redundant multiplications, and the factorials are constants that can come from a look up table. i.e. compute x^3, then multiply by x^2 to get x^5 (the exponents add), and so on. eventually algebra will give you a nice shortcut.
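one shortcut the algebra leads to is factoring the polynomial Horner-style. that is an assumption about what the shortcut would be, since it isn't named here. a minimal Python sketch, with the reciprocal factorials precomputed as constants to play the role of the look up table (`sin_taylor_horner` and the `C3`/`C5`/`C7` names are made up for illustration):

```python
import math

# reciprocals of the factorials, computed once up front (the "look up table")
C3 = -1.0 / math.factorial(3)
C5 = 1.0 / math.factorial(5)
C7 = -1.0 / math.factorial(7)

def sin_taylor_horner(x):
    """Same four-term series, factored so x^2 is computed once and reused."""
    x2 = x * x                                           # 1 multiply
    # x * (1 + x2*(C3 + x2*(C5 + x2*C7))): 4 more multiplies, 3 adds, no divides
    return x * (1.0 + x2 * (C3 + x2 * (C5 + x2 * C7)))

print(abs(sin_taylor_horner(0.5) - math.sin(0.5)))       # small truncation error
```

that is 5 multiplies and 3 adds per call, with no divides at run time, for the same approximation as the naive version.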
