Update

We now have a good performance model of the application. The most we should have expected from Fermi is a 60% reduction in run time compared to a GTX 275. We now think this will actually be achievable in the future. The problem is that CUDA 3 is slower, even on the GTX 200 series, by almost 15%. Indeed, we are 60% faster when both applications are compiled for CUDA 3, but only 35% faster compared to the CUDA 2.2 build.
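As a rough sanity check on how these percentages combine, here is a minimal sketch. It assumes "60% faster" means a 1.6x speedup and "almost 15% slower" means a 15% longer run time under CUDA 3; both readings are assumptions, and the function name is made up for illustration.

```python
def relative_speedup(speedup_same_toolkit, toolkit_slowdown):
    """Speedup of the new card over the old card built with the older toolkit.

    speedup_same_toolkit: speedup when both builds use the newer toolkit.
    toolkit_slowdown: fractional run-time increase caused by the newer toolkit.
    """
    return speedup_same_toolkit / (1.0 + toolkit_slowdown)


speedup_vs_cuda3 = 1.60  # assumption: "60% faster" read as a 1.6x speedup
cuda3_slowdown = 0.15    # assumption: "almost 15%" read as 15% longer run time

speedup_vs_cuda22 = relative_speedup(speedup_vs_cuda3, cuda3_slowdown)
print(f"{(speedup_vs_cuda22 - 1.0) * 100:.0f}% faster than the CUDA 2.2 build")
```

Under these assumptions the result is about 39%, which is in the same range as the 35% we measured.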

On our machine the running temperature is 91 degrees, equivalent to a GTX 275.
The only real problem is the price. Later I will report the timing and running temperature of the GTX 470.

Gdf