You can only count the r6xx as having 64 stream processors when you are using 64-bit double-precision floating point. Since there are no 64-bit ALUs, the r6xx has to emulate them, and it does so by using the whole 5-ALU cell for a single operation. You then get 1/5 of the performance, about 100 GFLOPS. That's why you have to separate single-precision (SP) FLOPS from double-precision (DP) FLOPS.
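
To make the arithmetic concrete, here is a rough sketch in Python. The numbers are assumptions not stated in the post: 64 cells of 5 ALUs each (320 ALUs total), a 742 MHz core clock (the figure usually quoted for the HD 2900 XT), and 2 FLOPs per ALU per clock for a multiply-add. The function name `peak_gflops` is made up for illustration.

```python
# Back-of-the-envelope peak throughput; all hardware figures are assumptions.
CELLS = 64            # assumed number of 5-ALU cells
ALUS_PER_CELL = 5     # the "whole 5 ALU cell" from the post
FLOPS_PER_CLOCK = 2   # assumed: one multiply-add counts as 2 FLOPs
CLOCK_GHZ = 0.742     # assumed core clock (HD 2900 XT)


def peak_gflops(units: int, flops_per_clock: int, clock_ghz: float) -> float:
    """Theoretical peak in GFLOPS for `units` independent execution units."""
    return units * flops_per_clock * clock_ghz


# Single precision: every ALU works on its own operation.
sp_gflops = peak_gflops(CELLS * ALUS_PER_CELL, FLOPS_PER_CLOCK, CLOCK_GHZ)

# Double precision: a whole cell is tied up per operation, so only
# CELLS units are effectively available (1/5 of the SP rate).
dp_gflops = peak_gflops(CELLS, FLOPS_PER_CLOCK, CLOCK_GHZ)

if __name__ == "__main__":
    print(f"SP peak: {sp_gflops:.1f} GFLOPS")
    print(f"DP peak: {dp_gflops:.1f} GFLOPS")
```

Under those assumptions the DP figure comes out just under 100 GFLOPS, which lines up with the estimate above. These are theoretical peaks, not measured results.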