That is where you are wrong, hopefully...
There is a LARGE increase in ROP performance even with the ROP count staying the same. They should be tweaking them so that they perform roughly twice as fast at the same clock speed.
Plus, if the independent clocks rumor is indeed false like I think it is, they are at a 1050 MHz core clock, which is a ~35% increase.
This is why I doubt they will be going with separate clock domains. It would hurt TMU and ROP performance in exchange for a slight increase in shader performance, which isn't the bottleneck in the first place.
If the above is correct, we should see a theoretical max increase of ~170% for the ROPs.
The same ~170% increase for the TMUs.
A ~103% increase in shader performance.
Plus an 80% increase in memory bandwidth, going by the 4 GHz GDDR5.
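
For anyone who wants to check the math, here is a rough sketch of how those percentages stack up. It assumes the per-clock speedup and the clock speedup simply multiply, which is a theoretical best case and not a benchmark. The function name is made up for illustration, and the ~1.5x per-clock shader factor is not a stated spec, it is only what the ~103% figure works out to at a ~35% higher clock.

```python
def combined_increase_pct(per_clock_factor, clock_factor):
    """Percent increase when a per-clock speedup and a clock speedup multiply."""
    return (per_clock_factor * clock_factor - 1.0) * 100.0


# ~35% core clock increase at 1050 MHz, as quoted in the post
CLOCK_FACTOR = 1.35

# ROPs tweaked to roughly twice the per-clock throughput (figure from the post)
rop_pct = combined_increase_pct(2.0, CLOCK_FACTOR)

# Back out the per-clock shader factor implied by the ~103% figure.
# This is an inference from the post's numbers, not a known spec.
implied_shader_factor = (1.0 + 103 / 100.0) / CLOCK_FACTOR

print(round(rop_pct))                   # 170
print(round(implied_shader_factor, 2))  # 1.5
```

So 2 x 1.35 = 2.7x, which is the ~170% for the ROPs, and the shader number lines up with about 1.5x the per-clock shader throughput at the same 1.35x clock.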
16 ROPs is just fine if they do indeed have the 1050 MHz clock and are tweaked to double the Z performance.