Wow. Lots of results for me to update. I'll try to get them in on Saturday before I head off to Hong Kong.

Quote Originally Posted by skier View Post
how important is storage for this app? ie, is it constantly writing to disk as it computes?
Quote Originally Posted by amrgb View Post
When you don't have enough memory, it depends a lot on the storage speed. See my 25B computation below. The average CPU utilization was 170% instead of close to 800%. This was more of a hard drive test than anything else.

25B, i7 950 @ 4.3GHz, 24GB RAM @ 1790MHz, 22GB in RAM, rest on a single 1TB Samsung F3 (after seeing the 10B run, I wasn't going to wear out the SSD on this one)

Edit: Dammit, I only noticed now that it counts the time elapsed since the last checkpoint. I had to stop it for a few minutes to use the computer. That was at 22:45, so the total time was around 14.7 hours. Anyway, it may not count for the top, but I did it.

Yes, unless you're running multiple drives in parallel, it's basically gonna be a hard drive benchmark.
Currently, the break-even point (50% CPU-limited, 50% disk-limited) is about 100MB/s per Nehalem Core i7 core @ ~3.0 - 3.5 GHz.
Which means you'll need 400MB/s of disk bandwidth to get any good large runs on the original Bloomfield Core i7s. No single HD/SSD can do that - and it's beyond the limit of a single SATA II channel too. So you'll need an array of drives.
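
To make the arithmetic concrete, here's a quick sketch in Python. The 100MB/s-per-core break-even figure is the one above; the single-drive and SATA II numbers are typical specs, not measurements:

Code:
# Rough back-of-the-envelope check of the bandwidth claim above. The
# per-core break-even figure is from this post; the single-drive and
# SATA II numbers are typical specs of the era, not measurements.

BREAK_EVEN_PER_CORE_MBS = 100   # MB/s per Nehalem core @ ~3.0 - 3.5 GHz
CORES = 4                       # original Bloomfield Core i7 (quad core)
SINGLE_HDD_MBS = 120            # assumed: sequential MB/s of a fast 7200 RPM drive
SATA2_LIMIT_MBS = 300           # SATA II channel ceiling (3 Gbit/s ~ 300 MB/s)

needed = BREAK_EVEN_PER_CORE_MBS * CORES    # 400 MB/s total
drives = -(-needed // SINGLE_HDD_MBS)       # ceiling division -> 4 drives

print(f"Bandwidth needed : {needed} MB/s")
print(f"Drives to match  : {drives} x 7200 RPM HDDs")
print(f"Over SATA II cap : {needed > SATA2_LIMIT_MBS}")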


Both of my test rigs have an array of identical drives.

My Core i7 rig: 4 x 1 TB Seagate Barracuda 7200 RPM
My Harpertown rig: 8 x 2 TB Hitachi Deskstar 7200 RPM

And then there's Shigeru Kondo's rig:
16 x 2TB Seagate 7200RPM Enterprise drives
We used this HD setup to set the 5 trillion digit world record.

None of these setups use any form of RAID. The program usually seems to do a better job managing the drives manually.
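
For anyone wondering what managing the drives manually might look like, here's a minimal software-striping sketch in Python. This is purely illustrative - it is not y-cruncher's actual code - and the mount paths are hypothetical:

Code:
import threading

# Hypothetical mount points, one per physical drive.
DRIVES = ["/mnt/disk0", "/mnt/disk1", "/mnt/disk2", "/mnt/disk3"]

def _write_shard(path, shard):
    # Each drive gets its own plain file - no RAID layer in between.
    with open(f"{path}/swap_shard.bin", "wb") as f:
        f.write(shard)

def striped_write(data):
    """Split `data` across the drives and write the shards in parallel,
    one thread per drive, so every spindle streams at once."""
    shard_size = -(-len(data) // len(DRIVES))   # ceiling division
    threads = [
        threading.Thread(target=_write_shard,
                         args=(d, data[i * shard_size:(i + 1) * shard_size]))
        for i, d in enumerate(DRIVES)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Presumably the reason doing this in the application can beat RAID is that the program knows its own access patterns, which a generic RAID layer has to guess at.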


In any case, I wouldn't recommend using an SSD for this, since the program's write volume will probably kill it in a short amount of time.
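
For a rough sense of scale (all numbers here are made-up placeholders, not real figures):

Code:
# SSD wear, back of the envelope. ALL numbers are assumed placeholders.
RUN_WRITES_TB = 30       # assumed: total swap writes for one large run
SSD_ENDURANCE_TB = 40    # assumed: rated write endurance of the drive

print(f"Runs before hitting rated endurance: ~{SSD_ENDURANCE_TB / RUN_WRITES_TB:.1f}")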



A few months ago I had an e-mail chat with Massman over the possibility of adding y-cruncher to HWbot. The long-term goal is to replace SuperPi and introduce a multi-threaded benchmark that isn't 100% synthetic like Wprime.

But the main problem right now is the huge memory requirement. Since the swap modes are so slow, you're basically forced to run maxed-out RAM configurations to get anything competitive. (And it's pretty hard to get anything with 24+GB of RAM to hit 6GHz...)
This means scores will be heavily biased towards multi-socket workstations that can run huge amounts of memory - to the point that non-overclocked(able) machines that can fit more memory may beat out LN2-cooled machines that need to use a swap mode.

The other problem is speed-consistency.
Speed-consistency is very important for HWbot. But being able to keep up with new instruction sets (AVX/FMA, ...) is also important. (Just check out AMDZone for all their SuperPi bashing because it only uses the x87 FPU.)
Speed-consistency and up-to-date instruction sets are inherently conflicting... Neither Massman nor I know how to deal with this yet.

y-cruncher itself will never be speed-consistent, but I can easily branch the program and maintain an HWbot version which I will never update or optimize (except for bug fixes).


For now, we're both too busy to take this on seriously. HWbot will need to build an interface with a GUI and score submission on top of the program, and I'll have to make some changes to help facilitate that.