In Anand's test the difference between idle and load power for SB is ~60 W, which means we get about 3.3 GFLOPS/W. That is not far from the 5450, and the OOO execution, branch prediction and prefetchers, which use most of the power, let it utilize those FLOPS more effectively. BTW, Nvidia's GT310/320 has an even lower GFLOPS/W ratio.
~12x slower, you mean? But comparing FLOPS to FLOPS as a performance metric is probably about as inaccurate as you can get. We have already seen the 5870 achieve 2.2 TFLOPS in SGEMM. An ideal case for Sandy Bridge is 20x slower.
http://software.intel.com/en-us/arti...mkl-v103-beta/
* AVX DGEMM (M, N, K=8Kx4Kx128) performs 1.8x over NHM. AVX DGEMM/SGEMM achieves 88-90% machine peak.
* The AVX/NHM speedup is 1.8x for radix-2 1D CFFTs with N=1024
* The Intel® Optimized LINPACK benchmark, using Intel AVX optimizations, performs over 1.86x (or over 80% overall efficiency) on 4 cores with N=20000.
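
For anyone who wants to check the arithmetic behind the numbers being thrown around, here is a rough sketch. It only uses the figures quoted in the posts above (~60 W, 3.3 GFLOPS/W, 2.2 TFLOPS SGEMM, 12x and 20x); the variable and function names are made up for illustration, and none of this is a measurement.

```python
# Figures quoted in the posts above; nothing here is measured.
SB_LOAD_DELTA_W = 60.0        # idle-to-load power difference for Sandy Bridge (~60 W)
SB_GFLOPS_PER_W = 3.3         # efficiency claimed in the first post
HD5870_SGEMM_GFLOPS = 2200.0  # 2.2 TFLOPS SGEMM figure from the reply


def implied_gflops(gflops_per_watt, watts):
    """Throughput implied by an efficiency figure and a power draw."""
    return gflops_per_watt * watts


def throughput_at_slowdown(reference_gflops, slowdown):
    """Throughput of a part that is `slowdown` times slower than the reference."""
    return reference_gflops / slowdown


# Throughput that 3.3 GFLOPS/W at ~60 W corresponds to.
sb_implied = implied_gflops(SB_GFLOPS_PER_W, SB_LOAD_DELTA_W)

# Throughput Sandy Bridge would have at each claimed slowdown vs. the 5870.
sb_at_12x = throughput_at_slowdown(HD5870_SGEMM_GFLOPS, 12)
sb_at_20x = throughput_at_slowdown(HD5870_SGEMM_GFLOPS, 20)

# Ratio between the 5870 SGEMM figure and the implied Sandy Bridge throughput.
ratio = HD5870_SGEMM_GFLOPS / sb_implied

print(f"SB implied by 3.3 GFLOPS/W at 60 W: {sb_implied:.0f} GFLOPS")
print(f"SB if 12x slower than the 5870:     {sb_at_12x:.0f} GFLOPS")
print(f"SB if 20x slower than the 5870:     {sb_at_20x:.0f} GFLOPS")
print(f"5870 SGEMM / SB implied:            {ratio:.1f}x")
```

With these inputs the 3.3 GFLOPS/W figure implies roughly 198 GFLOPS, which lines up with the ~12x gap rather than the 20x one; the 20x case would correspond to about 110 GFLOPS.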



