Quote: Originally posted by Astennu

So if instructions can be grouped together, you can get quite a performance boost. The downside is that if you can't group instructions, you only get 1/4 or 1/5 of the performance.

I don't know if it's true, but I have heard the AMD compiler does quite a good job, grouping 3-4 of them most of the time.

But if you have a heavily nVidia-optimised game, you can get lower values and bad shader performance (you might hit 1-3 then).
It depends on what it is doing. You can see here that ATI is the Intel of synthetic benchmarks: http://www.bit-tech.net/hardware/gra...ture-review/10. The wider you make a vector, the harder it is to keep it under full load.
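To put rough numbers on the grouping argument, here is a minimal Python sketch. It assumes a 5-wide VLIW unit and a 2.72 TFLOPS single-precision peak for the HD5870; neither figure is stated in the posts above, and the function names are made up for illustration. It only models slot packing, nothing else that affects real shader performance.

```python
# Assumptions (not from the posts): 5-wide VLIW, 2.72 TFLOPS peak.
VLIW_WIDTH = 5
PEAK_TFLOPS = 2.72


def utilization(ops_per_bundle, width=VLIW_WIDTH):
    """Fraction of VLIW slots doing useful work per issued bundle."""
    if not 0 <= ops_per_bundle <= width:
        raise ValueError("ops_per_bundle must be between 0 and the VLIW width")
    return ops_per_bundle / width


def effective_tflops(ops_per_bundle, peak=PEAK_TFLOPS, width=VLIW_WIDTH):
    """Peak throughput scaled by how well the compiler packs the bundles."""
    return peak * utilization(ops_per_bundle, width)


if __name__ == "__main__":
    # 1 = nothing can be grouped, 5 = every slot filled.
    for packed in (1, 3, 4, 5):
        print(f"{packed} ops/bundle: {utilization(packed):.0%} of peak, "
              f"{effective_tflops(packed):.2f} TFLOPS")
```

With one instruction per bundle the unit runs at 1/5 of peak, which is the worst case described in the quote; the "3-4 most of the time" figure would land between 60% and 80% under these assumptions.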


Then about the performance of the HD5870: it does not seem memory-bandwidth limited to me. But I also think there is more performance inside this core than we see now. It might be driver related. It won't surprise me if we get up to 20% higher performance in the future.
On average, each FLOP/s of compute needs about 1 byte/s of memory bandwidth. That translates to roughly 2 terabytes per second of required bandwidth for RV870, so every GPU made is bottlenecked by this. The only way out is to further reduce the ratio of memory operations to calculations. The calculation-to-memory ratio is already 100:1, but it must go higher.
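The bandwidth arithmetic can be sketched the same way. The 1 byte per FLOP rule of thumb is taken from the post; the 2.72 TFLOPS peak and 153.6 GB/s memory bandwidth are assumed spec-sheet figures for the HD5870 that the thread itself does not state, and the function names are hypothetical.

```python
# Assumed spec-sheet figures (not from the posts).
PEAK_FLOPS = 2.72e12       # single-precision FLOP/s
MEM_BANDWIDTH = 153.6e9    # bytes/s
BYTES_PER_FLOP = 1.0       # rule of thumb from the post


def required_bandwidth(peak_flops, bytes_per_flop):
    """Bytes/s needed if every FLOP pulled bytes_per_flop from memory."""
    return peak_flops * bytes_per_flop


def flops_per_byte_needed(peak_flops, bandwidth):
    """FLOPs that must be done per byte fetched to keep the ALUs busy."""
    return peak_flops / bandwidth


if __name__ == "__main__":
    need = required_bandwidth(PEAK_FLOPS, BYTES_PER_FLOP)
    ratio = flops_per_byte_needed(PEAK_FLOPS, MEM_BANDWIDTH)
    print(f"required: {need / 1e12:.2f} TB/s, "
          f"available: {MEM_BANDWIDTH / 1e9:.1f} GB/s")
    print(f"FLOPs per byte needed at peak: {ratio:.1f}")
```

Under these assumptions the required figure comes out in the same ballpark as the "roughly 2 terabytes per second" above, far beyond what the memory bus delivers, so a shader only gets near peak if it does many calculations for each byte it fetches.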