Comparative Data


Axlegear
09-28-2006, 10:28 PM
(Forgive me if this is the wrong forum. I'm new to this forum! It's also 12:30am and I haven't slept..)

Okay, so here's the deal. We have all these new-age CPUs, and all this overclocking.
How much of an effect does it REALLY have?
Is a 1.8 GHz Venice really ten times faster than a 600 MHz Celeron?
I've read that a 3.2 GHz P4 is only 30% faster than a 486 at scanning the Encyclopedia Britannica in and out of itself, though I'm skeptical of that.
So let's put this to the test.

I have several reasons for wanting this information.
1) So I can identify which CPUs are superior in price:performance for very specific applications (namely F@H)
2) So I can work out the best way to build a Yatta rig for Price:Performance:Power Consumption. (i.e., would a bunch of dirt-cheap, low-power VIAs work best, or a ton of dual cores, or lots of junker P2s and P3s, etc.)
3) Personal curiosity.

So, here's what I want gathered. Using a tool which measures raw GFlops, I plan to harvest results from all the processors I can. Everyone's help is much appreciated!
Of all the tools I've used, the only accurate one I know of that gives a raw read of FLOPS and MIPS is SiSoft's SANDRA. If anyone has a better suggestion, I'm open to hearing it.

Here's what I want:
MFLOPS
MIPS
CPU (plus Family, Stepping, Revision, etc. and also especially the Cores, L1 and L2 cache, and instruction sets. Basically, screenshot CPU-Z. =) )
Stock Core/FSB frequencies
Current Core/FSB frequencies
Power Consumption (if you know it)
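
For anyone who wants to collect submissions in a consistent shape, here is a minimal sketch of a record type for the fields listed above. The class name `CpuSubmission`, its field names, and the `mflops_per_watt` helper are all made up for illustration; nothing here comes from SANDRA or CPU-Z.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CpuSubmission:
    """One benchmark submission (hypothetical layout, not a SANDRA format)."""
    cpu: str                  # name plus family/stepping/revision, cores, caches
    mflops: float             # reported floating-point score
    mips: float               # reported integer score
    stock_core_mhz: float
    stock_fsb_mhz: float
    current_core_mhz: float
    current_fsb_mhz: float
    power_watts: Optional[float] = None  # optional: "if you know it"

    def mflops_per_watt(self) -> Optional[float]:
        """Performance per watt, or None when power draw was not reported."""
        if self.power_watts is None or self.power_watts <= 0:
            return None
        return self.mflops / self.power_watts
```

Keeping power consumption optional means a submission without it can still be compared on raw performance, and only the per-watt ranking skips it.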



I promise to only use this information for good! (If F@H isn't good, what is?)
Mostly, I'm just curious whether, say, a 10% boost in Core/FSB actually translates to a 10% performance boost, or even 5%. I've never noticed much of a performance boost from anything but video card OCing, even at extreme levels (like taking my old XP3200's 333 FSB to 355)
Note1: We know the core frequency's effect is really quite subjective and situational. For example, VIA's microprocessors are up to 1.2 GHz (maybe more, that was some time ago) but perform on par with a 533 MHz Celeron. So, yeah.
Note2: I plan to actually bench a few processors and systems of my own shortly too, not just for MFLOPS but also for F@H, and monitor them closely. I plan to compare the MFLOPS reported by SANDRA or similar to real-world F@H performance. I plan to be quite precise and give the most accurate results I can produce to achieve a real answer.
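
The "does a 10% clock boost give a 10% performance boost" question can be put as a single ratio. This is only a sketch: the function name `scaling_efficiency` is made up, and the benchmark scores in the example are placeholders, not measurements. Only the 333 to 355 FSB figures come from the post above.

```python
def scaling_efficiency(stock_mhz, oc_mhz, stock_score, oc_score):
    """Fraction of the clock gain that shows up as a performance gain.

    1.0 means performance scaled perfectly with clock, 0.5 means half of
    the clock increase turned into extra performance.
    """
    clock_gain = oc_mhz / stock_mhz - 1.0
    perf_gain = oc_score / stock_score - 1.0
    if clock_gain == 0:
        raise ValueError("stock and overclocked frequencies are identical")
    return perf_gain / clock_gain


if __name__ == "__main__":
    # FSB figures from the post; the two scores are made-up placeholders.
    fsb_gain = 355 / 333 - 1.0
    print(f"FSB gain: {fsb_gain:.1%}")
    print(f"Efficiency: {scaling_efficiency(333, 355, 1000.0, 1033.0):.2f}")
```

A value well below 1.0 on F@H while the synthetic score sits near 1.0 would suggest the workload is limited by something other than core clock.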

Axlegear
10-04-2006, 10:03 PM
Yanno, I should REALLY get off my butt and post my own results?

spdycpu
10-04-2006, 10:52 PM
I just finished a project I've been working on for the past few days; it may help you a little. All benchmark routines were coded in assembly and should be quite fast. It is all floating-point calculations so far; I have yet to add any integer (+ integer MMX) tests, and I'll probably add SSE3 tests later as well. Also, you should be able to run it on any CPU that supports the cpuid instruction (anything greater than a 486). All other instruction sets (3DNow!, SSE, SSE2, etc) are detected at run time and only the instruction sets supported by the CPU are run.
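
The run-time selection idea can be illustrated in a few lines. To be clear, this is not how cputest does it (that is assembly using cpuid directly); it is a rough Python sketch that assumes a Linux machine and reads the feature flags the kernel lists in /proc/cpuinfo. The function names and the test-to-flag table are made up for illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Which kernel flag each test needs (illustrative mapping only).
TEST_REQUIREMENTS = {
    "x87 FPU": "fpu",
    "3DNow! vector": "3dnow",
    "SSE vector": "sse",
    "SSE scalar": "sse",
    "SSE2 vector": "sse2",
    "SSE2 scalar": "sse2",
}


def select_tests(flags):
    """Keep only the tests whose required flag the CPU reports."""
    return [name for name, flag in TEST_REQUIREMENTS.items() if flag in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable: nothing is selected
    print(select_tests(parse_cpu_flags(text)))
```

On a chip without 3DNow!, that test simply drops out of the list, which matches the "-" entries people post for instruction sets their CPU lacks.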

Here is the link to download it. Results from some of my computers as well as the instructions/info are in the readme.txt, the source code is in the source directory.
http://chess.homelinux.com/cputest.zip

If anyone has a Core 2 Duo that can run this for me, please do, I'd love to see the results. Supposedly the C2D SSE units can do more per cycle than previous processors, so that should be interesting. The scalar SSE/SSE2 tests are important as well: while slower than packed/vector SSE/SSE2, scalar SSE can replace the normal FPU entirely, whereas vector SSE only helps in some special situations. The upcoming K8L that AMD is planning to release should be similar in the way of SSE speedups, and it should have 2x the regular FPU performance as well, from what I hear.

I'll be running more tests from time to time on various CPUs and updating cputest.zip with the results. I love CPUs and collect all I can, so I'll likely be testing a whole box full of Socket-7 stuff here pretty soon. :)

Axlegear
10-05-2006, 03:41 AM
You, sir, are my new God.

Axlegear
10-05-2006, 05:45 AM
Intel Pentium III EB 1000 MHz [Family 6, Model 8, Stepping A, Revision cD0. Coppermine 2, Socket 370 FCPGA, 0.18um, 0.016v]
(999.6 MHz, 133.3 MHz FSB, x7.5 multiplier, 16+16 KB L1i+L1d, 256 KB L2)
x87 FPU: 0.7126 GFlops
SSE Vector: 1.7017 GFlops
SSE Scalar: 0.7499 GFlops

I dunno whether to trust CPU-Z on the voltage. 0.016v!? It's CPU-Z v1.37..

Axlegear
10-05-2006, 08:03 AM
AMD Athlon 64 3000+
(Venice core 4, socket 939, 0.09u, 1.392v)
(AMD Athlon 64 Processor 3000+ family F-F model F-2F Stepping F revision DH-E3)
Instructions: MMX+, 3DNow!+, SSE, SSE2, SSE3, x86-64
1800.1 MHz, 200 MHz FSB, x9 multiplier (64/64 L1i/L1d, 512 KB L2)

x87 FPU: 1.9778 GFlops
3DNOW! vector: 3.7342 GFlops
SSE vector: 7.2683 GFlops
SSE scalar: 1.9504 GFlops
SSE2 vector: 3.7310 GFlops
SSE2 scalar: 1.9717 GFlops

Axlegear
10-05-2006, 09:40 AM
Intel Pentium 4 Northwood 9
Family F-F Model 2-2 Stepping 9 Revision D1
2661.7 MHz, 133.1 MHz (Rated 532.3 MHz), x20 multiplier
8KB L1d, 12 Kuops L1 Trace, 512 KB L2

x87 FPU: 1.4659
SSE vector: 6.8607
SSE scalar: 1.7179
SSE2 vector: 3.4340
SSE2 scalar: 1.7171

Axlegear
10-05-2006, 09:51 AM
Intel Pentium D 820
Family F-F Model 2-2 Stepping 7 Revision B0
2793.2 MHz, 199.5 MHz (Rated 798.1 MHz), x14 multiplier
2x16KB L1d, 2x12 Kuops L1 Trace, 2x1024 KB L2

x87 FPU: 1.3636
SSE vector: 6.2727
SSE scalar: 1.5766
SSE2 vector: 3.1532
SSE2 scalar: 1.5778

Theli
10-05-2006, 10:05 AM
Core 2 Duo E6400 (Allendale)
Family 6 - Model F - Stepping 6 - Revision B2
Instructions: MMX, SSE, SSE2, SSE3, SSE4, EM64T
3200 MHz, 400 MHz FSB, x8 multiplier
Stock: 2130 MHz, 266 MHz FSB, x8 multiplier


L1 Data Cache
Size: 32 KBytes x2
Descriptor: 8-way set associative, 64-byte line size

L1 Instruction Cache
Size: 32 KBytes x2
Descriptor: 8-way set associative, 64-byte line size

L2 Cache
Size: 2048 KBytes x1
Descriptor: 8-way set associative, 64-byte line size



x87 FPU: 2.5539 GFlops
3DNOW! vector: -
SSE vector: 12.7604 GFlops
SSE scalar: 3.1949 GFlops
SSE2 vector: 5.1135 GFlops
SSE2 scalar: 2.5537 GFlops

Axlegear
10-05-2006, 10:14 AM
Thanks, Theli.

I'll be grabbing more benches from more systems as time goes on. The Linux box will be especially difficult. The Mediabox is down. Plus my two roommates' PCs will need to be benched whenever they agree.

Keep 'em coming guys! I'm starting to see some interesting trends.
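
One rough way to make the trends visible is to divide each score by the clock speed, which gives floating-point operations per cycle. The sketch below uses only the x87 and SSE vector numbers and core clocks posted above; the helper names are made up, and the per-cycle figure is just the benchmark's GFlops divided by MHz, not an official throughput number.

```python
# (CPU, core MHz, x87 GFlops, SSE vector GFlops), copied from the posts above.
RESULTS = [
    ("Pentium III EB 1000", 999.6, 0.7126, 1.7017),
    ("Athlon 64 3000+", 1800.1, 1.9778, 7.2683),
    ("Pentium 4 Northwood", 2661.7, 1.4659, 6.8607),
    ("Pentium D 820", 2793.2, 1.3636, 6.2727),
    ("Core 2 Duo E6400 @ 3200", 3200.0, 2.5539, 12.7604),
]


def flops_per_cycle(gflops, mhz):
    """Convert a GFlops score into operations per clock cycle."""
    return gflops * 1000.0 / mhz


def per_clock_table(results):
    """Return (cpu, x87 per cycle, SSE vector per cycle), best SSE first."""
    rows = [
        (cpu, flops_per_cycle(x87, mhz), flops_per_cycle(sse_vec, mhz))
        for cpu, mhz, x87, sse_vec in results
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for cpu, x87_pc, sse_pc in per_clock_table(RESULTS):
        print(f"{cpu:<26} x87 {x87_pc:.2f}/cycle   SSE vector {sse_pc:.2f}/cycle")
```

Sorting by the per-cycle figure instead of the raw score separates "faster because of clock" from "faster because of architecture", which is the comparison the thread is after.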

Axlegear
10-05-2006, 10:41 AM
AMD Athlon 64 3200+ Newcastle 4
Family F-F, Model C-C, Stepping 0, Revision DH7-CG
2206.6 MHz, 200.6 MHz (802.4 MHz rated)
64KBi+64KBd L1, 512 KB L2

x87 FPU: 2.1333 GFlops
3DNOW! vector: 4.0798 GFlops
SSE vector: 8.6411 GFlops
SSE scalar: 2.1491 GFlops
SSE2 vector: 4.3115 GFlops
SSE2 scalar: 2.1558 GFlops

Axlegear
10-05-2006, 10:51 AM
Athlon XP Barton
Family 6-7, Model A-A, Stepping 0
2207.7 MHz, 200.7 MHz FSB (rated 401.4 MHz)
64+64 KB L1, 512 KB L2

x87: 1.8319
3DNow: 3.7067
SSE vector: 6.1136
SSE scalar: 1.8028

Axlegear
10-05-2006, 11:03 AM
Pentium M Banias 22
Family 6-6, Model 8-8, stepping A, revision cD0
599.9 MHz, 100 MHz FSB (400 rated)
16+16 KB L1, 1024 KB L2

x87 FPU: 1.0817
SSE vec: 2.7034
SSE sca: 1.3303
SSE2 sca: 1.3682
SSE2 vec: 1.0981

spdycpu
10-09-2006, 09:55 PM
Here is a new version with x86 integer tests, integer MMX and SSE2 tests, as well as a faster FPU routine for the x87 test. The integer testing is addition only and should represent the absolute maximum peak the processor can realistically attain. The same goes for the FPU really, but it uses add/muls for closer to "real world" type testing.

Download: http://chess.homelinux.com/cputest102.zip

Another thing I found interesting is the relative speed of SSE and 3DNow! on various chips. The Athlon 1.4GHz only does about 2 operations per clock cycle on paired add/mul instructions. The original peak stated by AMD when 3DNow! was first introduced was 4 operations per clock cycle (2 adds with pfadd and 2 muls with pfmul at the same time). I ran this on a K6-2/366MHz and saw it getting ~1.37 GFlops, which is close to 4 operations per cycle. All Athlons, however, have a 4-cycle latency for the pfadd and pfmul instructions, which drops the 3DNow! speed to 2 operations/cycle.

SSE on the Pentium 3 chips is also interesting. When SSE was first introduced with the Pentium 3, Intel stated it could do 4 operations/cycle; however, both what I've read about its performance and actual testing show it doing only 2 ops/cycle max. The Pentium 4 later fixed this and can get over 2 ops/cycle (theoretical is 4).
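
The arithmetic behind "close to 4 operations per cycle" can be checked with a couple of lines. This sketch uses the K6-2 figures from this post and the Pentium III result posted earlier in the thread (from the first version of the test); the function names are made up for illustration.

```python
def theoretical_peak_gflops(mhz, ops_per_cycle):
    """Peak GFlops if the chip sustained ops_per_cycle every clock."""
    return mhz * ops_per_cycle / 1000.0


def fraction_of_peak(measured_gflops, mhz, ops_per_cycle):
    """How much of the stated peak a measured score actually reaches."""
    return measured_gflops / theoretical_peak_gflops(mhz, ops_per_cycle)


if __name__ == "__main__":
    # K6-2/366MHz, 3DNow!: ~1.37 GFlops against a stated 4 ops/cycle.
    print(f"K6-2 3DNow!: {fraction_of_peak(1.37, 366.0, 4):.0%} of stated peak")
    # Pentium III 999.6 MHz, SSE vector: 1.7017 GFlops against a stated 4 ops/cycle.
    print(f"P3 SSE:      {fraction_of_peak(1.7017, 999.6, 4):.0%} of stated peak")
```

The K6-2 lands near its stated peak while the Pentium III reaches well under half of the 4 ops/cycle figure, which is consistent with the 2 ops/cycle ceiling described above.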

Axlegear
10-10-2006, 12:24 AM
Thanks! That can provide some insight on the theoretical max.
I think the raw GFlops output is a more accurate indicator of its F@H performance, though.

Still, more data = more I have to work with = more accurate.