Quote Originally Posted by ***Deimos***
Please correct me if I'm wrong..

Amdahl's law states that if P is the proportion of a program that can be made parallel (i.e. benefit from parallelization), and (1 − P) is the proportion that cannot be parallelized (remains serial), then the maximum speedup that can be achieved by using N processors is

Speedup(N) = 1 / ( (1 - P) + P/N )

Therefore, it doesn't matter whether you have Fermi, G200, or G92: if "only" 90% of the code can be parallelized, the speedup can never exceed 10x, so adding much beyond 10 SPs or CPU cores buys you almost nothing.
Thus with 512 "cores", even 99.8% parallelization gets you only about a 250x speedup, roughly half of what the hardware could theoretically deliver.

Seeing how the vast majority of programs, and even games, are barely optimized for 2 cores, you can imagine how much ingenuity this requires on the GPU side.
Not much, actually: GPU applications like graphics are embarrassingly parallel, and the parallelism grows with problem size, i.e. if you double the pixel count you double the available parallelism. The law sounds very grim, but in practice it isn't; GPUs are already running thousands of threads to hide latency.
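
If anyone wants to plug numbers into the formula above, here's a quick Python sketch; the P and N values below are just illustrative, not tied to any particular chip:

def amdahl_speedup(p, n):
    """Amdahl's law: maximum speedup with parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)

# Try the two cases discussed above: 90% parallel and 99.8% parallel.
for p in (0.90, 0.998):
    for n in (10, 100, 512):
        s = amdahl_speedup(p, n)
        print(f"P = {p:.3f}, N = {n:4d}: speedup = {s:6.1f}x, efficiency = {s / n:5.1%}")

With P = 0.90 the speedup saturates just under 10x no matter how many processors you throw at it, and even at P = 0.998 you get only about 253x out of 512 cores (roughly 50% efficiency). That is exactly the grim picture for a fixed-size problem; the saving grace for graphics is that the parallel fraction grows with the workload.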