Thread: Bulldozers first screens

  1. #11
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by kl0012
    In fact, a Bulldozer module's maximum floating-point instruction throughput on current software (without AVX and FMA) is equal to that of one Phenom core: two 128-bit FP ops per cycle. The Bulldozer module is more flexible: it can start any combination of ops per cycle (such as MUL+MUL or ADD+ADD), while a Phenom core is tied to MUL+ADD. On the other hand, Bulldozer has higher latencies for FP ops, and various FP pack/blend/copy ops are executed on one of the two FP pipes, while Phenom has a special unit (fp-misc) for that type of instruction. So it is possible that a 6-core Bulldozer will perform the same as a 3-core Phenom at the same frequency in apps with a lot of FPU code.
    My prediction is that one FMAC will have around 20-30% higher performance than one Thuban core in non-recompiled MT software. Single-thread FP performance should be a lot higher than that (2x FMAC in this case). In FMA-optimized code there should be a substantial jump, maybe up to 50%.
    You can see from the leaked donanimhaber slide that the 8-core model (probably <3.5 GHz) has approx. 1.88x the performance of the 1100T in Cinebench 11.5. That's a non-recompiled legacy FP workload in which you have 8 128-bit FMACs working versus 6 Thuban cores (each of which is MUL+ADD). This roughly corresponds to 1.3x the FP power of a Thuban core, at roughly the same clock.
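For anyone who wants to check the arithmetic behind that estimate, here is a rough sketch. It assumes Cinebench scales linearly with both the number of FP units and the clock, which is only an approximation. The 3.5 GHz figure is just the upper bound guessed above, 3.3 GHz is taken as the 1100T's stock clock, and the function name `per_unit_ratio` is made up for illustration.

```python
def per_unit_ratio(total_speedup, new_units, old_units,
                   new_clock_ghz=1.0, old_clock_ghz=1.0):
    """Per-unit, per-clock throughput ratio implied by an overall speedup.

    Assumes (hypothetically) perfect linear scaling with unit count and clock.
    """
    unit_scaling = new_units / old_units          # 8 FMACs vs 6 Thuban cores
    clock_scaling = new_clock_ghz / old_clock_ghz
    return total_speedup / (unit_scaling * clock_scaling)


# 1.88x overall in Cinebench 11.5, clocks treated as identical
same_clock = per_unit_ratio(1.88, new_units=8, old_units=6)

# Same figure, but assuming the 8-core part runs at 3.5 GHz vs 3.3 GHz
with_clock = per_unit_ratio(1.88, 8, 6, new_clock_ghz=3.5, old_clock_ghz=3.3)

print(f"per FMAC, equal clocks:     {same_clock:.2f}x")
print(f"per FMAC, 3.5 vs 3.3 GHz:   {with_clock:.2f}x")
```

At exactly equal clocks the per-unit figure comes out nearer 1.4x; the ~1.3x number falls out once a small clock advantage for the 8-core part is assumed.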

    edit: Someone asked about the stepping or revision of the BD ES in question. It has W8K44 at the end, so it is a B0 for sure. Since Charlie wrote that pre-B1 was useless for benchmarking, you can now see why the scores are the way they are.
    Last edited by informal; 04-27-2011 at 03:10 AM.
