
Thread: New Multi-Threaded Pi Program - Faster than SuperPi and PiFast

  1. #1
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705

    y-cruncher: Multi-Threaded Pi-Benchmark for Multi-Core Systems...
    (official thread)

    Latest Version: v0.6.5.9442
    Now with support for XOP and AVX2. Backward-compatibility is maintained for older processors.

    The benchmark is also available for 64-bit Linux.



    Download:
    http://www.numberworld.org/y-cruncher/


    Pi Benchmark Results: Can you do any better?

    -----------------------------------------------------------------------------------------------------

    What is y-cruncher?

    y-cruncher is (yet another) program that computes Pi to millions of digits.

    But unlike SuperPi and PiFast:
    • y-cruncher is multi-threaded and is capable of using multiple cores.
    • y-cruncher is able to use modern instruction sets such as SSE and AVX.

    And unlike HyperPi, y-cruncher uses multiple threads to make a single computation faster (as opposed to running the same computation multiple times).

    Because y-cruncher is multi-threaded and uses up-to-date processor features, it can reach speeds and sizes that are unobtainable with older apps:
    • A stock AMD FX-8350 can do 32M digits in 9.737 seconds using y-cruncher. SuperPi requires upwards of 20 minutes.
    • A Core i7 4770K @ 4.0 GHz can do 1,000,000,000 digits in 206 seconds.
    • The same Core i7 equipped with 32GB ram and 8 hard drives can do 100 billion digits in 23 hours.
    • A dual Xeon E5-2690W @ 3.5 GHz with 256GB ram and 22 hard drives can do 1 trillion digits in 83 hours.


    Likewise, y-cruncher is capable of setting world records for the most digits ever computed:
    Last edited by poke349; 05-25-2014 at 12:39 PM.
    Main Machine:
    AMD FX8350 @ stock --- 16 GB DDR3 @ 1333 MHz --- Asus M5A99FX Pro R2.0 --- 2.0 TB Seagate

    Miscellaneous Workstations for Code-Testing:
    Intel Core i7 4770K @ 4.0 GHz --- 32 GB DDR3 @ 1866 MHz --- Asus Z87-Plus --- 1.5 TB (boot) --- 4 x 1 TB + 4 x 2 TB (swap)

  2. #2
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    2.4 GHz Kentsfield

    32m

    SuperPi: 1289.29 seconds
    PiFast 4.3: 122.41 seconds
    QuickPi 4.5 (x86): 58.84 seconds
    y-cruncher (x64): 30.357 seconds

    64-bit QuickPi surprisingly ran slower than 32-bit.

    Last edited by poke349; 05-31-2010 at 10:24 AM. Reason: Fixed the link on the image.

  3. #3
    Registered User
    Join Date
    Mar 2009
    Posts
    9
    Hi guys, I ran a comparison too, computing Pi to 32M with a Core i7 920 @ 3.2 GHz:

    SuperPi: 660.614s
    PiFast 4.3: 77.50s
    QuickPi x64: 32.95s
    y-cruncher x64: 16.692s

    Not shown in the screenshot: 32-bit QuickPi took 34.35s.



    y-cruncher = full of win
    Last edited by Serotoninn; 03-23-2009 at 10:31 AM.

  4. #4
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    wth... XS reverted...

    Anyways... New version is out! This one has benchmark validation and anti-cheat protection!!!


  5. #5
    Xtreme Cruncher
    Join Date
    Oct 2006
    Location
    Boston, Massachusetts
    Posts
    2,224
    Cool


    Checking it out now

  6. #6
    Registered User
    Join Date
    Nov 2005
    Location
    Plano, TX
    Posts
    82
    Quote Originally Posted by poke349 View Post
    wth... XS reverted...

    Anyways... New version is out! This one has benchmark validation and anti-cheat protection!!!
    Wish I had saved all the results posted in this thread.

    I'm curious again about a few things:
    Do you think you'll maintain a result list?
    In the Validation.txt output could you include the time?
    Could you bring back the ability to change the number of running threads?
    Maybe add an efficiency number (cycles per digit for example)?

    For example my work system is an E7200 (Penryn dual core 2.53GHz), here are the results with the x86 SSE3 binary:
    Code:
    Processor(s):   Intel(R) Core(TM)2 Duo CPU E7200 @ 2.53GHz
    CPU Frequency:  2,527,040,841 Hz
    Thread(s):      2
    Digits:         25,000,000
    Total Time:     56.4712 seconds
    Checksum:       46d5028e60d7bba608077b159b3d8a4b
    If we do...
    ((CPU Frequency * number of actual cores) * seconds) / digits calculated
    we get 11416.4 cycles per digit. Do you think that would be worth adding? The CPD figure would likely only be comparable for a fixed number of digits, but if people like to test one size (100,000,000 for example), then it could be useful for comparison and tweaking.
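As a minimal sketch of the proposed metric, the calculation can be written out in Python. The function name `cycles_per_digit` and its parameter names are illustrative only and are not part of y-cruncher; the inputs are the values from the validation output quoted in this post.

```python
def cycles_per_digit(cpu_hz: float, cores: int, seconds: float, digits: int) -> float:
    """Total core-cycles spent per computed digit (hypothetical helper)."""
    return (cpu_hz * cores * seconds) / digits


# E7200 run from this post: 2 cores, 25,000,000 digits in 56.4712 seconds.
cpd = cycles_per_digit(2_527_040_841, 2, 56.4712, 25_000_000)
print(f"{cpd:.1f} cycles per digit")  # prints "11416.4 cycles per digit"
```

Because the figure multiplies by the core count, it charges every core for the full run time, so it reflects threading overhead as well as raw per-core efficiency.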

    What do you think?
    Core i7 920 @ 4.4GHz, EVGA Classified E760, 3x1GB OCZ Platinum DDR3-1600 @ 1680 7-8-7-24, SLI eVGA 8800GT @ 756/1890/2200, Heatkiller 3.0 CU waterblock, WD Caviar Black 1TB, Hitachi E7K500 500gb, Seasonic S12 SS-650HT psu

  7. #7
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    About half the old posts are still in the Google cache. Should we grab them and re-post?


    Result List:
    I'll start by listing them all on this thread (on my first post.)
    I'll keep track of the top few times on each size. But I won't split them into different binaries (x86 vs. x64) because that would be too much. (Yes, let 64-bit rule!!!)

    Change Threads:
    Yes I can bring that back. But it can only have 2 modes: 1 thread, or threads >= cores (first power of 2 that's >= # of cores).
    This is because of the way the program uses more threads than selected, so "extra cores" will skew results. There's no way I can enforce CPU affinity.
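The multi-threaded mode described here (the first power of 2 that is >= the number of cores) can be sketched as follows. This only illustrates the rounding rule and is not y-cruncher's actual code; the function name `thread_count_for` is an assumption made for this example.

```python
def thread_count_for(cores: int) -> int:
    """Smallest power of two that is >= cores (illustrative only)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    threads = 1
    while threads < cores:
        threads *= 2
    return threads


for cores in (1, 2, 3, 4, 6, 8, 12):
    print(cores, "cores ->", thread_count_for(cores), "threads")
```

Under this rule a 6-core machine would run 8 threads and a 12-core machine would run 16.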

    Efficiency Number:
    Yes I can do that. But keep in mind that it won't scale across different sizes and # of cores.
    Computing Pi isn't linear. It's roughly O( n*log(n)^3 ). So larger computations will have lower efficiency.
    Also, the program doesn't quite scale linearly with the # of cores. So multi-threaded benchmarks will have lower efficiency.
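To illustrate why a cycles-per-digit figure will not carry across sizes: if total work grows roughly as n*log(n)^3, then the work per digit grows as log(n)^3. The sketch below compares that per-digit factor for two sizes. It takes the rough estimate at face value and ignores constant factors, memory effects, and threading overhead; the function name is made up for this example.

```python
import math


def relative_cost_per_digit(n_small: int, n_large: int) -> float:
    """Ratio of per-digit work at n_large vs n_small, assuming total
    work ~ n * log(n)^3 (so per-digit work ~ log(n)^3)."""
    return (math.log(n_large) / math.log(n_small)) ** 3


# Compare 25 million digits against 1 billion digits.
ratio = relative_cost_per_digit(25_000_000, 1_000_000_000)
print(f"per-digit cost grows by about {ratio:.2f}x")
```

Under that assumption, going from 25 million to 1 billion digits raises the per-digit cost by a bit less than 2x, so efficiency numbers are only comparable at the same size.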



    Aside from that:

    The next version will use less memory. 1 billion digits will only require 5.25GB, as opposed to 5.98GB of ram in the current release, which means that it'll be possible for the "standard" Core i7 with 6GB of ram to bench 1 billion digits.

    For the "very large" end, 10 billion digits will require 52.9 GB of ram (in the new version). By the looks of it, I "might" be able to squeeze that down to 46GB - which would make it possible for a fully-loaded 12-slot Gainestown server to do. And THAT would be interesting...

    Anyways... Post your benchmarks everyone!!!

  8. #8
    Xtreme Member
    Join Date
    Jun 2008
    Location
    London, UK
    Posts
    256
    Quote Originally Posted by poke349 View Post
    Anyways... Post your benchmarks everyone!!!
    Nice to see a new Pi software finally made


  9. #9
    Xtreme Cruncher
    Join Date
    Jul 2007
    Location
    @ the computer
    Posts
    2,510
    how do you cheat in benching with superpi?

  10. #10
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Updated the list of benchmarks with old results (from before the XS revert) and with results from other forums.


    Stuff like tampering with the system clock, output tampering... is standard for cheating.

    I've made sure none of that will work on y-cruncher.



    Also, if everyone could copy and paste the validation text into their posts too, it would make it easier for me to validate.

  11. #11
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Cool program!

    Here's a quick test on my new rig.



    It's a Core i7 @ 4.0 GHz with 1600 MHz DDR3 (QPI is only stock; OC'ed for 24/7 use with the multiplier, since it's a 965).
    EVGA z68 FTW
    i7 2600k @ 4.8
    8gb DDR3 1600
    3x GTX 580 3gb HydroCopper2
    Silverstone Strider 1500W
    Areca 1880i w/ 6x intel x25m
    On water

  12. #12
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Interesting... Your benchmark is valid (the checksum is good). But the CPU frequency is wrong... And by the very impressive timing, I can tell it really IS at 4GHz.

    I swear there's something wrong with the rdtsc instruction... Because it doesn't detect turbo boost from 3.2 to 3.4 on my friend's computer. (it reports it at 3.2 even with turbo kicked on)

    Maybe I'm not using it right.

    Problem is: I actually don't know how to fix it. Anyone, by any random chance, know how to do this properly?

  13. #13
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    I'm awful at programming, so not much help here

    Is there any way to get this data from the BIOS directly?

    I noticed that a lot of apps misread my CPU speed; the only thing that seems to get it right is CPU-Z or the EVGA version.
    Last edited by tet5uo; 04-21-2009 at 04:55 PM.

  14. #14
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Okay, then I guess I'm not alone...

    Would you mind trying some of the smaller benchmarks?

    At 4 GHz, you might be able to beat my dual-Harpertown rig at 25 and 50 mil.

    Someone with a 920 OC'ed to 4.1 had a faster 25m and 50m time than my dual-Harpertown rig. But it's gone from the revert and it isn't in the Google cache, so I can't recover it.

  15. #15
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Sure, I'll run a couple more tests and edit them into this for you.

    Thanks for making this program. I'm sure we all love to play with new benches.

  16. #16
    Registered User
    Join Date
    Nov 2005
    Location
    Plano, TX
    Posts
    82
    I did some runs at 4.1 (20x205) but I'm messing around now at 21x200 w/ turbo mode. As mentioned before it doesn't seem to detect that I'm running at 4.2GHz. I tried 21x205 for 4.3GHz but unfortunately my 2lbs of ice magically disappeared after about 2 seconds. I only managed a single run of 25m:

    4.3GHz @ 21x205
    Code:
    Processor(s):   Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
    CPU Frequency:  4,105,000,554 Hz
    Thread(s):      8
    Digits:         25,000,000
    Total Time:     9.56572 seconds
    Checksum:       3c277379f4ec9d5316e5b430710d0cbf
    Interestingly enough, 4.2GHz is stable in Prime95 and LinX; 4.3 just gives me fits, however. 4.1GHz seems to run slightly faster in y-cruncher than 4.2GHz, I'd imagine due to the faster L3 and memory. More ice (MUCH more) is in the freezer; I'll try some 4.3GHz runs at 50, 100, 250 tomorrow.
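As a quick sanity check on the numbers in this post, the reported CPU Frequency can be compared against multiplier x base clock for both settings. This sketch assumes the base clock was about 205 MHz for this run, as given in the "21x205" setting; the variable names are illustrative.

```python
reported_hz = 4_105_000_554   # "CPU Frequency" from the validation output
bclk_mhz = 205                # base clock from the "21x205" setting

# Candidate multipliers: 20 (non-turbo) and 21 (with turbo mode).
candidates = {mult: mult * bclk_mhz * 1_000_000 for mult in (20, 21)}
closest = min(candidates, key=lambda m: abs(candidates[m] - reported_hz))

for mult, hz in candidates.items():
    print(f"{mult} x {bclk_mhz} MHz = {hz / 1e9:.3f} GHz")
print("reported value is closest to multiplier", closest)
```

The reported value sits near 20 x 205 MHz rather than 21 x 205 MHz, which fits the observation that the extra turbo multiplier is not being detected.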

    Quote Originally Posted by poke349 View Post
    Okay, then I guess I'm not alone...

    Would you mind trying some of the smaller benchmarks?

    At 4 GHz, you might be able to beat my dual-Harpertown rig at 25 and 50 mil.

    Someone with a 920 OC'ed to 4.1 had a faster 25m and 50m time than my dual-Harpertown rig. But it's gone from the revert and it isn't in the Google cache, so I can't recover it.

  17. #17
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Awesomeness... I'll be looking forward to those...

    Finally I can kick myself off the top spot for one of the sizes.

    I expect to hold the top spot only for the larger sizes... simply because no (overclockable) machine has the memory for it...

    I'd love to see someone with dual W5580s and 72GB of ram show up and clean sweep these benchmarks...

    Yes, I'll need to put the single-thread mode back in. Otherwise, y-cruncher is going to turn into a who-has-the-most-money server competition... (especially so with the ram requirement)
    I don't exactly want to see the top spots being dominated by quad-socket Dunningtons of some sort or whatever the new one coming out is... (As are the wPrime benchmarks on HWBOT...)
    Last edited by poke349; 04-21-2009 at 06:39 PM.

  18. #18
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Quote Originally Posted by poke349 View Post
    Awesomeness... I'll be looking forward to those...

    Finally I can kick myself off the top spot for one of the sizes.

    I expect to hold the top spot only for the larger sizes... simply because no (overclockable) machine has the memory for it...

    I'd love to see someone with dual W5580s and 72GB of ram show up and clean sweep these benchmarks...

    Yes, I'll need to put the single-thread mode back in. Otherwise, y-cruncher is going to turn into a who-has-the-most-money server competition... (especially so with the ram requirement)
    I don't exactly want to see the top spots being dominated by quad-socket Dunningtons of some sort or whatever the new one coming out is... (As are the wPrime benchmarks on HWBOT...)
    Will dual X5570's do for now?
    Crunch with us, the XS WCG team
    The XS WCG team needs your support.
    A good project with good goals.
    Come join us, get that warm fuzzy feeling that you've done something good for mankind.

    Quote Originally Posted by Frisch View Post
    If you have lost faith in humanity, then hold a newborn in your hands.

  19. #19
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Are you SERIOUS?!?!?!


    Let it roll!!!

    You should be able to boot me off of everything you have enough ram for...

  20. #20
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Quote Originally Posted by poke349 View Post
    Are you SERIOUS?!?!?!


    Let it roll!!!

    You should be able to boot me off of everything you have enough ram for...
    Just 6 gig..YGPM

  21. #21
    HARD CRUNCHER!!
    Join Date
    Nov 2006
    Location
    Chicago
    Posts
    1,787
    Quote Originally Posted by Movieman View Post
    Just 6 gig..YGPM
    aww come on Dave, let's see it.
    Quote Originally Posted by mike047 View Post
    CRUNCH HARD, it may not help me and you, but it might help the Kids.

  22. #22
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    This any good?

  23. #23
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Oh wait, use the newest version, v0.3.1.

    It has a benchmark mode fully equipped with my "lame" attempt at validation.

    Also see if you guys can cheat the times too... I'm no expert in cheating benchmarks so I wouldn't know how to write a decent anti-cheat protection.

  24. #24
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Here's 25M and 50M with the latest version of the app:



  25. #25
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Updated. Your 25m re-run is probably slower because of normal variation. +/- 0.1 seconds is very typical for so many threads. (I see it myself on my Harpertowns...)

    You should try some of the larger sizes... The program doesn't scale well with multi-threading for small computations.

    You'll need to super-size it to at least 250 million before it can bring out the power of those 16 virtual cores.
