Quote Originally Posted by DarthShader View Post
Unfortunately, there is only wishful thinking in your post.

You are basically saying that, compared to big Fermi, the mid-range Kepler GK104 drops die size by about a third (ca. 510mm^2 is 50% more than ca. 340mm^2), drops TDP by around a third (ca. 270W real peak TDP is 50% more than ca. 180W real peak TDP), increases the transistor count only a little (the rumours mention 3.5 billion transistors, maybe up to 4 billion), and yet performance goes up by 50% too??? I can only see three possibilities for how this could happen (the ratios are double-checked in the snippet after this list):
1) Fermi is the least efficient chip ever. Hot, broken, unfixable. Kepler is a heavily reworked architecture, picking up some low-hanging fruit.
2) nV engineers put physics and TSMC's engineers to shame by doubling the gains TSMC would ever admit are possible from the move to the 28nm process.
3) What you say is wrong.
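For the numbers people, here is a quick compile-and-run check of those ratios. The die sizes and TDPs are the rumoured figures from this thread, not confirmed specs:

[CODE]
// Sanity check of the Fermi-vs-GK104 ratios quoted above.
// All inputs are rumoured figures from this thread, not confirmed specs.
#include <cstdio>

int main()
{
    const double fermi_die = 510.0, gk104_die = 340.0; // mm^2 (rumoured)
    const double fermi_tdp = 270.0, gk104_tdp = 180.0; // W, real peak (rumoured)

    // "50% more" and "drops by a third" are the same ratio seen from each side:
    printf("Fermi die is %.0f%% bigger\n",  100.0 * (fermi_die / gk104_die - 1.0)); // ~50%
    printf("GK104 die is %.0f%% smaller\n", 100.0 * (1.0 - gk104_die / fermi_die)); // ~33%
    printf("Fermi TDP is %.0f%% higher\n",  100.0 * (fermi_tdp / gk104_tdp - 1.0)); // ~50%
    return 0;
}
[/CODE]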

So what is going on if there are indeed 3x as many CUDA cores? Where does that 50% more performance come from? Dropping the hot clock looks almost certain by now, but there still seems to be a lot more raw GFLOPS available. Will they translate into more performance?

It depends. My educated guess is ... there will be no SFUs anymore in an SM. That's where the space to fit all the extra CCs will come from. Special functions will be done on the CCs in multiple clock cycles, just like... in GCN! There are a few advantages to this approach. First, you can reduce the data movement inside an SM; registers can be kept closer to the SIMDs. Moving data is expensive, so you save power by avoiding it. Furthermore, SFUs do nothing for Linpack numbers; they don't increase the FLOP count. And nVidia promised to deliver 3 times more GFLOPS per watt with Kepler, and HPC is a very important market for them. So if you exchange "useless" SFUs for shaders, saving some power along the way, this goal becomes possible to achieve!
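If you want to see what "special functions on the CCs in multiple clock cycles" looks like in practice, CUDA already exposes both paths today: __sinf() compiles to a single SFU instruction, while plain sinf() is a multi-instruction software routine on the ordinary cores. A minimal sketch (kernel and variable names are mine, purely for illustration):

[CODE]
// __sinf(): one SFU instruction, fast but reduced precision.
// sinf():   argument reduction + polynomial on the regular CUDA cores
//           (without -use_fast_math) - many FMA/ALU cycles, full precision.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void special_funcs(const float* in, float* out_hw, float* out_sw, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out_hw[i] = __sinf(in[i]); // hardware SFU path
        out_sw[i] = sinf(in[i]);   // software path on the regular CUDA cores
    }
}

int main()
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    float h_in[1024], h_hw[1024], h_sw[1024];
    for (int i = 0; i < n; ++i) h_in[i] = i * 0.01f;

    float *d_in, *d_hw, *d_sw;
    cudaMalloc(&d_in, bytes); cudaMalloc(&d_hw, bytes); cudaMalloc(&d_sw, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    special_funcs<<<(n + 255) / 256, 256>>>(d_in, d_hw, d_sw, n);
    cudaMemcpy(h_hw, d_hw, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_sw, d_sw, bytes, cudaMemcpyDeviceToHost);
    printf("sin(%.2f): SFU path %.7f, software path %.7f\n", h_in[100], h_hw[100], h_sw[100]);
    cudaFree(d_in); cudaFree(d_hw); cudaFree(d_sw);
    return 0;
}
[/CODE]

Drop the SFUs and every call effectively takes the second path: slower per special function, but the die area and operand shuffling of a separate unit go away.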

So if you look at artificial, "canned" benchmarks that rely on raw GFLOPS... yes, there is going to be 50% more performance. But if you look at others, ones requiring special functions... performance will start to tank, likely below that of a GTX580. Games require a mix of both, so who knows how it will balance out. Overall faster than a GTX580 is very likely IMO, but not by much.
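To put that "mix of both" into numbers, here is a toy Amdahl-style blend. The factors (+50% on FLOP-bound code, -30% on special-function-bound code) are made up to match the guesses above, not measurements:

[CODE]
// Amdahl-style estimate of overall speedup vs a GTX580 when a workload
// splits its time between raw-FLOP work and special-function work.
// The factors below are hypothetical, chosen to match the guesses above.
#include <cstdio>

double blended_speedup(double flop_frac, double s_flop, double s_sfu)
{
    // Weighted harmonic mean: new time = t_flop/s_flop + t_sfu/s_sfu.
    return 1.0 / (flop_frac / s_flop + (1.0 - flop_frac) / s_sfu);
}

int main()
{
    for (double f = 0.5; f <= 1.001; f += 0.1)
        printf("%.0f%% FLOP-bound -> %.2fx overall\n",
               100.0 * f, blended_speedup(f, 1.5, 0.7));
    return 0;
}
[/CODE]

Even a 90% FLOP-bound workload only lands around 1.35x in this toy model, which is why "faster than a GTX580, but not by much" seems plausible.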
Little correction for you... Both the HD 2900XT and the FX5800 were FAR less efficient than Fermi. The HD 2900XT consumed more power than even the 8800GTX while performing far below it, for example. Fermi at least had the performance lead; neither of my examples could say that.

Quote Originally Posted by jam2k View Post
AMD just delivered an entire top-to-bottom fresh lineup, while the competition has nothing and is 6 months late

Yeah, AMD is soo scared

But seriously, I think Tenerife is planned to fight Big Kepler (GK110) in a couple of months
Show me an AMD chip from the 7 series that you could buy 6 months ago.

Quote Originally Posted by Dimitriman View Post
That would've happened had TSMC not cancelled 32nm. It was that cancellation that pushed back both Southern Islands and Kepler.