Quote:
Conroe's score is amazing.
Keep in mind that he tested both the Pentium-M optimized binary and the Pentium 4 optimized binary. Of the two, the Pentium 4 binary was faster. This is compiled with Intel's latest *publicly* available compiler.
Conroe is more similar to the Pentium-M than to the Pentium 4, but the binary doesn't utilize Conroe's wider resources. In *scalar* code, the Pentium 4 optimized binary generates x87 code because the Pentium 4 can't pipeline scalar SSE adds but can pipeline them in x87 mode. (I had thought this was not the case, but Intel's optimization guide says it is. I'm not sure I 100% believe that, but x87 code was generated even with the most aggressive flags possible, with SSE/SSE2 code used only sparingly, mixed in with some of the x87 code.)
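To make the pipelining point concrete, here is a minimal sketch (not taken from the benchmark) of two scalar summation routines. The function names are made up for illustration. The single-accumulator version is a dependent chain of adds, so it is limited by add latency either way. The four-accumulator version only pulls ahead if the adder can overlap independent adds, which is the property in question. One way to compare the two code paths, assuming GCC rather than Intel's compiler, is to build once with `-m32 -mfpmath=387` and once with `-m32 -msse2 -mfpmath=sse` and time both functions on each build.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: every add depends on the previous result, so the loop
   runs at roughly one add per add-latency, pipelined adder or not. */
double sum_chain(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Four independent accumulators: the adds within one iteration do not
   depend on each other, so a pipelined adder can overlap them. */
double sum_unrolled4(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

If the unrolled version gains noticeably more under one build than the other on a Pentium 4, that would be consistent with what the optimization guide says. It is only a rough probe, since the compiler may vectorize or reorder the loops at high optimization levels.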
I imagine that if you ran the 64-bit binary Conroe's lead would widen even more.
What I am perplexed about is why Conroe bombs on the encryption code -- the entire instruction mix consists of BSWAP, XOR, and various MOV instructions, none of which are micro-coded on other processors, and no jump instructions. With Conroe's wide integer resources, I don't see why this is the case. I'm guessing there is an address generation limitation, but since I don't have a Conroe I can't really test that theory. It's a blind guess.
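For anyone with a Conroe who wants to poke at the address generation guess, here is a hypothetical straight-line kernel with roughly that instruction mix: loads, a byte swap, an XOR, and stores, with no loop and no branches once asserts are compiled out with `-DNDEBUG`. This is a sketch and not the benchmark's actual code. The names `bswap32` and `xor_swap_4` are invented, and whether the byte swap is emitted as a single BSWAP depends on the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 32-bit byte swap. Compilers commonly recognize this pattern
   and emit a single BSWAP, but that is not guaranteed. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical 4-word block: each statement is two loads, a swap, an XOR
   and a store, so every statement needs several address generations. */
void xor_swap_4(uint32_t dst[4], const uint32_t src[4], const uint32_t key[4])
{
    assert(dst != NULL && src != NULL && key != NULL);
    dst[0] = bswap32(src[0]) ^ key[0];
    dst[1] = bswap32(src[1]) ^ key[1];
    dst[2] = bswap32(src[2]) ^ key[2];
    dst[3] = bswap32(src[3]) ^ key[3];
}
```

Timing this against a variant that keeps the operands in registers across many rounds would give a hint as to whether the memory operands, rather than the integer units, are the limit.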
In any case, I see this as a *strong* showing for Conroe, not a weak one as the blogger claims. In the benchmarks that matter (BLAS, MolDyn, Primordia), Conroe at least equals, if not exceeds, an Athlon at the same clock speed. The fact that in 32-bit mode an Athlon64 clocked 400 MHz higher cannot exceed Conroe's performance is telling.
I can't wait for a 64-bit run....
A good explanation, and depending on what stepping Victor is using, it may explain the other anomalies.