
Thread: New Multi-Threaded Pi Program - Faster than SuperPi and PiFast

  1. #701
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Been a while since I've posted... So I thought I'd bump this thread.

    There hasn't been much news regarding the program lately.
    As I've mentioned before, it's in the middle of a large rewrite that will touch > 80% of all the code. So there isn't gonna be much in the way of updates until the new code is working.
    I've also been experimenting with new things (related and unrelated to y-cruncher).

    Here are some things that will likely make it into the next major release of y-cruncher.


    Hybrid RAID 0/3:
    Two-level RAID 0 + 0/3: RAID 0 on top of either a RAID 0 or a RAID 3 array.
    Hard drive failures are a huge problem that's plaguing Shigeru Kondo's 10 trillion digit attempt.
    With RAID 3, the program will be able to handle a hard drive failure in each RAID group.

    Assuming hard drive plug-and-play works out, it will be possible to hot-swap dead drives with new ones without closing the program (and without needing to revert to a checkpoint).

    For pure RAID 0 setups, it will be more efficient than the current y-cruncher code. (The new code is better optimized.)
    For hybrid RAID 0/3 setups, some overhead will be incurred for the error-correction math. But it's not significant.
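For anyone wondering what the "error-correction math" looks like, here is a minimal sketch of generic RAID 3-style XOR parity in Python. It is not y-cruncher's code (the post doesn't show any), and the helper names `xor_blocks`, `make_parity`, and `rebuild_missing` are made up for illustration. The idea is that one extra drive per group stores the XOR of the data drives, so any single lost drive can be rebuilt from the survivors.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block stored on the dedicated parity drive (hypothetical helper)."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three "drives" holding one stripe each (toy data, not real swap-file contents).
data = [b"3141", b"5926", b"5358"]
parity = make_parity(data)

# Pretend drive 1 died: rebuild its block from drives 0 and 2 plus parity.
recovered = rebuild_missing([data[0], data[2]], parity)
```

The overhead is one XOR pass over the data per write, which fits the claim that the cost is small next to disk I/O.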

    *btw, this is gonna be a pain in the @$$ to test. I'm gonna be physically unplugging drives (from the motherboard) while the program is running.


    A new Multiplication Algorithm:
    I mentioned this a few times before, but it's finally complete enough to be benchmarked.
    Unfortunately, it's slower than what y-cruncher uses right now. However, it is SIMD-scalable and NUMA-friendly.

    The "baseline" performance sucks (and I knew it before I started implementing it), but... look at these numbers:

    4 x Opteron 8356 @ 2.31 GHz (the one that skycrane sent me):

    Code:
    Integer Square: 1.6 x 1.6 billion digits
    Memory Needed : 4 GB
    Build: x64 SSE3
    
                 Current      New
    1 thread     72.0697    197.023
    2 threads    37.0519    92.7221
    4 threads    19.6629    43.2403
    8 threads    11.9583    20.9102
    16 threads   9.95651    11.184
    
    Times are in seconds.
    The current algorithm doesn't scale well beyond 2 sockets.
    But the new algorithm has super-linear scaling? I thought it was just normal variation, but nope. It is consistent.
    I don't know what the cause is, but it probably has to do with NUMA, since I don't see this awesome behavior on my other machines.
    It's almost as fast as the current algorithm at 16 threads. Will 8 sockets (32 cores) be the crossover point?
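To put numbers on the scaling, here is a small Python sketch that turns the timings in the table into speedup and parallel efficiency (speedup divided by thread count). The timings are copied from the table; the function names are just for illustration.

```python
# Timings in seconds, copied from the table above (4 x Opteron 8356, x64 SSE3).
THREADS = [1, 2, 4, 8, 16]
CURRENT = [72.0697, 37.0519, 19.6629, 11.9583, 9.95651]
NEW = [197.023, 92.7221, 43.2403, 20.9102, 11.184]


def speedups(times):
    """Speedup of each run relative to the 1-thread run."""
    return [times[0] / t for t in times]


def efficiencies(times, threads):
    """Speedup divided by thread count; above 1.0 means super-linear."""
    return [s / n for s, n in zip(speedups(times), threads)]


if __name__ == "__main__":
    for name, times in (("Current", CURRENT), ("New", NEW)):
        rows = zip(THREADS, speedups(times), efficiencies(times, THREADS))
        for n, s, e in rows:
            print(f"{name:8s} {n:2d} threads: speedup {s:6.2f}x, efficiency {e:5.2f}")
```

With these numbers the new algorithm's efficiency stays above 1.0 at every thread count (about 1.10 at 16 threads), while the current algorithm drops to about 0.45 at 16 threads.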

    In any case, the new algorithm really needs SSE4.1 and AVX to be efficient.

    On my Sandy Bridge rig:
    SSE4.1 makes it 30% faster than SSE3.
    AVX makes it 66% faster than SSE4.1, and 116% faster than SSE3. (2.16x faster)
    *No SSE at all sucks so bad that it isn't worth mentioning.
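The percentages above compound multiplicatively, which is easy to check. A quick Python sketch using only the two ratios quoted in the post (the variable names are mine):

```python
# Relative speeds quoted above for the Sandy Bridge rig.
sse41_over_sse3 = 1.30   # SSE4.1 is 30% faster than SSE3
avx_over_sse41 = 1.66    # AVX is 66% faster than SSE4.1

# Speedups compound by multiplication, not by adding percentages.
avx_over_sse3 = sse41_over_sse3 * avx_over_sse41
percent_faster = (avx_over_sse3 - 1.0) * 100.0

print(f"AVX vs SSE3: {avx_over_sse3:.2f}x ({percent_faster:.0f}% faster)")
```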

    With SSE4.1, the new algorithm already beats GMP. (1 thread only, since GMP isn't multi-threaded)
    With AVX, it is almost as fast as the current algorithm at a billion digits.

    Perhaps Bulldozer will make things more interesting?
    The sharing of the 256-bit execution unit will be a huge drawback, but the new algorithm will benefit greatly from FMA and XOP. (The old algorithm will only benefit a tiny bit from FMA.)


    Although this new algorithm sucks for small products, it destroys the current y-cruncher algorithm for sizes above 100 billion digits - with or without AVX. (That's why I decided to implement it in the first place.)
    So I never intended it to be useable at a "mere" 1.6 billion digits - at least not until we get 512/1024 bit SIMD...
    Main Machine:
    AMD FX8350 @ stock --- 16 GB DDR3 @ 1333 MHz --- Asus M5A99FX Pro R2.0 --- 2.0 TB Seagate

    Miscellaneous Workstations for Code-Testing:
    Intel Core i7 4770K @ 4.0 GHz --- 32 GB DDR3 @ 1866 MHz --- Asus Z87-Plus --- 1.5 TB (boot) --- 4 x 1 TB + 4 x 2 TB (swap)

  2. #702
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Another bump and an update.
    It looks like some Bulldozer benchmarks have been leaked. This confirms that the AVX binary works on Bulldozer.

    http://hardforum.com/showthread.php?t=1630624 (dead link, the benchmarks were deleted)
    EDIT: http://www.chiphell.com/thread-250461-1-1.html (new link)

    The results don't exactly look good for Bulldozer. But we know that the binary works.
    I will admit that I tuned that AVX binary specifically for my own 2600K rig, but I highly doubt that's enough to make up the difference we're seeing right now.

    EDIT: As far as I can tell, the benchmarks look legitimate, but of course those Bulldozer ES chips might not be properly tuned. We'll know when they retail next month.
    Last edited by poke349; 08-21-2011 at 10:58 PM.

  3. #703

  4. #704
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Oh wow... 4.7 GHz! 1.6v...

    I think that's the best single-socket non-sandy result so far.
    I'll update the lists later.

  5. #705
    Xtreme Guru
    Join Date
    Dec 2002
    Posts
    4,046
    thanks.. cant wait till you release the version that takes advantage of storage setup

    then gpu.. you got to.. the whole system: cpu/ram/gpu/storage

    if that dont bring down my all liquid cooled system then nothing will

  6. #706
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by NapalmV5 View Post
    thanks.. cant wait till you release the version that takes advantage of storage setup

    then gpu.. you got to.. the whole system: cpu/ram/gpu/storage

    if that dont bring down my all liquid cooled system then nothing will
    I'm not sure exactly what you mean by "storage setup". If you were referring to the raid that I mentioned a few posts back, the current version already has raid-0 support. (You just have to find it.)
    If you go into Custom Compute (option 3), change the number of digits (option 3) to something large (> 100,000,000 digits).
    Then select Computation Mode (option 8), and select one of the swap modes.
    Once you're in a swap mode, a new option will appear: "Swap Disks" (option 9)
    You can now specify how many drives you wish to use as well as the paths. The program will use all the paths you enter as raid-0. So the combined disk performance will be the number of drives you enter times the speed of the slowest drive.
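The rule of thumb in the last sentence (drive count times the slowest drive) can be written down as a tiny Python helper. This is only a sketch of that rule, not anything taken from y-cruncher, and the drive speeds in the example are hypothetical.

```python
def raid0_throughput(drive_speeds_mb_s):
    """Estimated striped throughput: every drive is held to the slowest one."""
    if not drive_speeds_mb_s:
        raise ValueError("need at least one drive")
    return len(drive_speeds_mb_s) * min(drive_speeds_mb_s)


# Hypothetical sequential speeds in MB/s for three mismatched drives.
example_speeds = [100, 120, 150]
estimate = raid0_throughput(example_speeds)
print(f"{len(example_speeds)} drives -> about {estimate} MB/s combined")
```

So mixing one slow drive into the set drags the whole array down to its pace; three drives at 100, 120, and 150 MB/s behave like three 100 MB/s drives.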

    The hybrid raid0+3 that I mentioned in the earlier post will be in v0.6.1. It's basically done and it works - even through "simulated" hard drive failures.
    The only thing missing right now is the ability to rebuild a dead drive. Since I haven't started writing this feature yet, I'm probably going to put it off to v0.6.2.


    As for GPU, it's unlikely that it will help much (if at all). Computing Pi does not fit the GPU programming model very well. There isn't much parallelism that is exploitable with a GPU.
    The other major problem is the GPU <-> CPU memory bottleneck. It doesn't matter how powerful your GPU is if you can't get data to and from memory. This is already a problem right now - on a CPU.




    Anyways, more v0.6.1 updates:
    Again, I still can't provide an ETA since the program is in the middle of a partial rewrite and there is still a lot of code that needs to be updated.

    • Hybrid Raid-0+3: Nested raid 0+3 to provide fault-tolerance.
    • New Stress-Tester: I'm replacing the current stress-tester with four "component testers". Each of the 4 testers corresponds to one of the 4 major algorithms used by y-cruncher for large multiplication.
      Each of these algorithms varies in what it stresses the most. On Sandy Bridge (with AVX enabled), one of these tests runs 5-10C hotter than the current stress-tester.
    • Detailed Status Output: The progress indicator will show more than just a %. It will show more of the sub-steps of the computation. (though I don't expect everyone to easily figure out what it means)
      This feature has always been in the program, but it was always disabled in public releases. I'll be enabling it starting from v0.6.1.
    • FMA4 + XOP: This is also done and tested via emulation*. However, I don't expect the speedup to be significant except for large computations above 50 billion digits when there is enough disk bandwidth to become CPU-bound.
    • Native 64-bit Arithmetic: The small arithmetic library that I use has been completely rewritten with native support for 64-bit arithmetic. The speedup isn't very noticeable, but I had to do this to get rid of the old library which was incompatible with most of the new code.
    • New Base Conversion Algorithm: (50% faster) - I mentioned this a few posts back. I still need to implement it for disk.
    • New Multiplication Algorithm: This will be used for computations larger than 50 billion digits. It is heavily vectorized and uses AVX, FMA4, and XOP.
      This new algorithm is the one that will run 5-10C hotter than the current version of y-cruncher.
    • Various other speedups. Computing e will be slightly faster. I may or may not get around to optimizing some of the other constants.


    *I don't plan on getting a Bulldozer machine unless it's a lot better than Sandy Bridge. (and the leaked ES benchmarks are showing the opposite...)
    So I'm gonna need help from someone to do some (real) tests before I can add it to the v0.6.1 release.


    Things I'll be removing:
    • The current stress-tester. I may keep it as a sub-option to the new stress-tester.
    • Basic Swap Mode. This mode was useful before I added Advanced Swap Mode in v0.5.2. Now it's useless code that's bloating the program. So I'm getting rid of it completely.


    Possible Features: (probably not for v0.6.1)
    • Swap computation in benchmark mode (0). This was suggested by Massman for HWBOT.
    • More detailed output with timestamps. Also suggested by Massman. If I implement this, all timestamps will also be printed into the validation file for easier verification.
    • Denser swap-mode checkpoints. (more checkpoints to reduce the time between them)

  7. #707
    Xtreme Guru
    Join Date
    Dec 2002
    Posts
    4,046
    ahh swap disk got it.. i meant adding gpu threads storage thread/s

  8. #708
    Xtreme Enthusiast
    Join Date
    Jan 2004
    Posts
    603
    Is my 10 000 000 000 run OK

    Attachment 119862
    Last edited by Sheik; 09-09-2011 at 12:53 AM.

  9. #709
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by Sheik View Post
    Is my 10 000 000 000 run OK
    That's pretty. I think we have a winner here!

    Charts updated!

  10. #710
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    I guess this took long enough...

    We've been running this since December and it has finally finished... (after about 10 hard drive failures...)

    10 Trillion Digits of Pi!!!



    Code:
    Start : 12:27 AM (JST) December 12, 2010
    Finish:  3:14 PM (JST) October 16, 2011
    Code:
    Total Computation Time: 	191 days
    Time Lost to HW Failures: 	117 days
    Total Real Time: 	        308 days
    I'll have more details later once Shigeru Kondo okays my announcement page.

    All I can say right now is that 10 trillion digits is the limit of y-cruncher, and Shigeru Kondo has hit that limit.
    y-cruncher can't go higher until at least v0.6.1, which is still a work in progress...
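The totals above are easy to cross-check against the start and finish timestamps. A short Python sketch, using only the figures from this post (both times are JST, so no time-zone conversion is needed):

```python
from datetime import datetime

# Start and finish times as posted above (both in JST).
start = datetime(2010, 12, 12, 0, 27)    # 12:27 AM, December 12, 2010
finish = datetime(2011, 10, 16, 15, 14)  # 3:14 PM, October 16, 2011

elapsed = finish - start

# Figures from the summary table above.
computation_days = 191
lost_days = 117

lost_fraction = lost_days / (computation_days + lost_days)
print(f"Elapsed: {elapsed.days} days, lost to failures: {lost_fraction:.0%}")
```

The wall-clock difference comes out to 308 whole days, matching 191 + 117, so roughly 38% of the run was lost to hardware failures.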
    Last edited by poke349; 10-16-2011 at 09:31 AM.

  11. #711
    Xtreme Enthusiast
    Join Date
    Dec 2005
    Posts
    746
    PI: 50 million, AVX enabled (identical to previous run)

    Validation Version: 1.1

    Program: y-cruncher - Gamma to the eXtReMe!!! ( www.numberworld.org )
    Copyright 2008-2011 Alexander J. Yee ( a-yee@u.northwestern.edu )


    User: None Specified - You can edit this in "Username.txt".


    Processor(s): AMD Opteron(TM) Processor 6276
    Logical Cores: 16
    Physical Memory: 25,751,158,784 bytes ( 24.0 GB )
    CPU Frequency: 2,300,036,383 Hz

    Program Version: 0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant: Pi
    Algorithm: Chudnovsky Formula
    Decimal Digits: 50,000,000
    Hexadecimal Digits: Disabled
    Threading Mode: 16 threads
    Computation Mode: Ram Only
    Swap Disks: 0
    Working Memory: 318 MB

    Start Date: Sun Oct 16 08:27:31 2011
    End Date: Sun Oct 16 08:27:49 2011

    Computation Time: 16.621 seconds
    Total Time: 18.659 seconds

    CPU Utilization: 1172.71 %
    Multi-core Efficiency: 73.29 %

    Last Digits:
    4127897300 0153683630 8346732220 0943329365 1632962502 : 49,999,950
    5130045796 0464561703 2424263071 4554183801 7945652654 : 50,000,000

    Timer Sanity Check: Passed
    Frequency Sanity Check: Passed
    ECC Recovered Errors: 0
    Checkpoint From: None

    ----

    Checksum: 8a8193eb0bc1abdb02cfb279c54f5821336514681896bd720564dfa27037f252
    and non-AVX enabled

    PI: 50 million, non-AVX

    Validation Version: 1.1

    Program: y-cruncher - Gamma to the eXtReMe!!! ( www.numberworld.org )
    Copyright 2008-2011 Alexander J. Yee ( a-yee@u.northwestern.edu )


    User: None Specified - You can edit this in "Username.txt".


    Processor(s): AMD Opteron(TM) Processor 6276
    Logical Cores: 16
    Physical Memory: 17,161,224,192 bytes ( 16.0 GB )
    CPU Frequency: 2,300,105,407 Hz

    Program Version: 0.5.5 Build 9180 (fix 2) (x64 SSE4.1 - Windows ~ Ushio)
    Constant: Pi
    Algorithm: Chudnovsky Formula
    Decimal Digits: 50,000,000
    Hexadecimal Digits: 41,524,102
    Threading Mode: 16 threads
    Computation Mode: Ram Only
    Swap Disks: 0
    Working Memory: 318 MB

    Start Date: Fri Oct 14 20:19:14 2011
    End Date: Fri Oct 14 20:19:40 2011

    Computation Time: 23.059 seconds
    Total Time: 25.630 seconds

    CPU Utilization: 1063.15 %
    Multi-core Efficiency: 66.44 %

    Last Digits:
    4127897300 0153683630 8346732220 0943329365 1632962502 : 49,999,950
    5130045796 0464561703 2424263071 4554183801 7945652654 : 50,000,000

    Timer Sanity Check: Passed
    Frequency Sanity Check: Passed
    ECC Recovered Errors: 0
    Checkpoint From: None

    ----

    Checksum: d56ac3152f855c5e0d51595321f581e6df28dc1a547a511f15ba2fd79c18af9b
    Heat: 50 - 0 - 0 under "Argus333"

  12. #712
    Registered User
    Join Date
    May 2010
    Location
    Speedway, In
    Posts
    3
    5.4GHz = 3.895secs


  13. #713
    Registered User
    Join Date
    May 2010
    Location
    Speedway, In
    Posts
    3
    dbl post

  14. #714
    Xtreme Member
    Join Date
    May 2009
    Location
    Hull, England
    Posts
    467
    Slowest 100b!

    Code:
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Jacka
    
    
    Processor(s):          AMD Athlon(tm) II Neo N36L Dual-Core Processor
    Logical Cores:         2
    Physical Memory:       8,052,207,616 bytes  ( 7.50 GB )
    CPU Frequency:         1,297,864,495 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        100,000,000,000
    Hexadecimal Digits:    83,048,202,373
    Threading Mode:        2 threads
    Computation Mode:      Advanced Swap
    Swap Disks:            1
    Working Memory:        4.00 GB
    
    Start Date:            Mon Oct 10 00:53:57 2011
    End Date:              Mon Oct 31 02:30:51 2011
    
    Computation Time:      1,766,599.652 seconds
    Total Time:            1,823,860.391 seconds
    
    CPU Utilization:           105.70 %
    Multi-core Efficiency:     52.85 %
    
    Last Digits:
    8614936178 2910791153 4443607291 9665696203 7329712945  :  99,999,999,950
    9536515199 6948432428 3185077669 0674614692 0191295669  :  100,000,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   24a76a2d2914c4970bc9cfde907cf08666e4b6c939238ee40c77e9c548a6b298

  15. #715
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Wow!!!

    Is that 100b on a laptop?!? If so, that's a first! This needs a special mention when I put it in the charts!

    20 days, that must have been an agonizing wait...

  16. #716
    Xtreme Member
    Join Date
    May 2009
    Location
    Hull, England
    Posts
    467
    It's actually a HP Microserver.
    Currently have 2x2TB and 2x1TB HDDs in it, but using 3130GB of the ~5500GB.

    Just started this:
    Code:
    Constant :  Pi
    Algorithm:  Chudnovsky Formula
    
    Decimal Digits    :   250,000,000,000
    Hexadecimal Digits:   207,620,505,931
    
    Threads:    2
    Mode   :    Advanced Swap  ( Disks = 1 )
    
    Start Time: Wed Nov 09 16:55:49 2011
    
    
    Allocating and Reserving Memory...      4.00 GB
    Constructing FFT lookup tables...
    Setting Permissions for File Allocation...
    
    
    Begin Computation:
    
    Summing Series:  17,628,417,340 terms
    Summing...  0%

  17. #717
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    Bulldozer 4.3Ghz


    Code:
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  None Specified - You can edit this in "Username.txt".
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       8,539,910,144 bytes  ( 8.00 GB )
    CPU Frequency:         4,314,592,223 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        1,000,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        4.75 GB
    
    Start Date:            Wed Nov 09 12:56:06 2011
    End Date:              Wed Nov 09 13:03:58 2011
    
    Computation Time:      453.098 seconds
    Total Time:            471.971 seconds
    
    CPU Utilization:           768.46 %
    Multi-core Efficiency:     96.05 %
    
    Last Digits:
    6434543524 2766553567 4357021939 6394581990 5483278746  :  999,999,950
    7139868209 3196353628 2046127557 1517139511 5275045519  :  1,000,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   b649ebc604ab20d70692927fb0a4ea2f70e81b9112b5ed310d3fd570c7c156f2
    Last edited by freeloader; 11-09-2011 at 10:16 AM.
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  18. #718
    Registered User
    Join Date
    Nov 2007
    Location
    Morpeth, UK
    Posts
    16
    I'm wondering what's the absolute bottom rung processor that you could use? And what program version would be best?

    Is it technically feasible for a 486DX2 to run? How about a pentium 75?

    Great work by the way

  19. #719
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by Jacka View Post
    It's actually a HP Microserver.
    Currently have 2x2TB and 2x1TB HDDs in it, but using 3130GB of the ~5500GB.

    Just started this:
    Code:
    Constant :  Pi
    Algorithm:  Chudnovsky Formula
    
    Decimal Digits    :   250,000,000,000
    Hexadecimal Digits:   207,620,505,931
    
    Threads:    2
    Mode   :    Advanced Swap  ( Disks = 1 )
    
    Start Time: Wed Nov 09 16:55:49 2011
    
    
    Allocating and Reserving Memory...      4.00 GB
    Constructing FFT lookup tables...
    Setting Permissions for File Allocation...
    
    
    Begin Computation:
    
    Summing Series:  17,628,417,340 terms
    Summing...  0%
    Woah... Can't imagine how long that will take. How are the 4 hard drives combined? Some sort of raid0 or spanning?

    Quote Originally Posted by freeloader View Post
    Bulldozer 4.3Ghz


    Code:
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  None Specified - You can edit this in "Username.txt".
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       8,539,910,144 bytes  ( 8.00 GB )
    CPU Frequency:         4,314,592,223 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        1,000,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        4.75 GB
    
    Start Date:            Wed Nov 09 12:56:06 2011
    End Date:              Wed Nov 09 13:03:58 2011
    
    Computation Time:      453.098 seconds
    Total Time:            471.971 seconds
    
    CPU Utilization:           768.46 %
    Multi-core Efficiency:     96.05 %
    
    Last Digits:
    6434543524 2766553567 4357021939 6394581990 5483278746  :  999,999,950
    7139868209 3196353628 2046127557 1517139511 5275045519  :  1,000,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   b649ebc604ab20d70692927fb0a4ea2f70e81b9112b5ed310d3fd570c7c156f2
    Nice! I think that's the first non ES Bulldozer result I've seen so far.

    Quote Originally Posted by Krazeyivan View Post
    I'm wondering what's the absolute bottom rung processor that you could use? And what program version would be best?

    Is it technically feasible for a 486DX2 to run? How about a pentium 75?

    Great work by the way
    The oldest machine I've seen run this is a dual-socket Pentium Pro. I think the minimum OS requirement is Windows XP. So if a 486 can't run Windows XP, it probably won't be able to run the program. You'll also need at least 100 MB of RAM to do anything meaningful.

    The next version of y-cruncher will have SSE3 as a minimum requirement, so that will push the minimum requirement to the later Pentium 4's.

    It wasn't an easy decision to make: The 32-bit binaries rely on about 20,000 lines of older code that the 64-bit binaries don't use. However, this older code is incompatible with all the new code in v0.6.x. When I tried force compiling the 64-bit code on x86, the x86 SSE3 binary turned out to be the same speed, but the x86 no-SSE binary ended up being 20% slower (because the x87 FPU stack sucks*).

    So I decided to just drop x86 no-SSE completely for v0.6.x. (which also allows me to cut out those 20,000 lines)

    *It takes a very special kind of hand-tuning to make the x87 FPU efficient. These tricks tend to clash with the tricks that are used to make SSE efficient. So no more x87 FPU. I'm getting rid of those 20,000 lines and making SSE3 the minimum requirement.

  20. #720
    Xtreme Member
    Join Date
    Mar 2010
    Location
    Germany
    Posts
    118
    Quote Originally Posted by poke349 View Post
    ...
    *It takes a very special kind of hand-tuning to make the x87 FPU efficient. These tricks tend to clash with the tricks that are used to make SSE efficient. So no more x87 FPU. I'm getting rid of those 20,000 lines and making SSE3 the minimum requirement.
    Good, go for it.

    Backwards compatibility is for lame companies who prevent technology from advancing.

  21. #721
    Xtreme Member
    Join Date
    May 2009
    Location
    Hull, England
    Posts
    467
    Quote Originally Posted by poke349 View Post
    Woah... Can't imagine how long that will take. How are the 4 hard drives combined? Some sort of raid0 or spanning?
    It's on 8% after 3 days, so approximate total time is 37.5 days. :o

    Drives aren't in any sort of RAID. If I had space to shuffle files around, I would try it with two of the discs in RAID0, but unfortunately I can't do that right now.

  22. #722
    Xtreme Member
    Join Date
    May 2009
    Location
    Hull, England
    Posts
    467
    19% after 8 days 9 hours. Computation time at this rate is 42 days.

  23. #723
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by Jacka View Post
    19% after 8 days 9 hours. Computation time at this rate is 42 days.
    I'd also add that the % number only applies to the series summation. There are still several more steps at the end.
    Another thing is that it isn't quite linear when you're using swap. As you get further into a computation, the amount of disk access relative to the amount of computation increases.
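For reference, the kind of estimate being discussed is a straight-line extrapolation from the progress percentage. A minimal Python sketch (the function name is made up), which for the reasons above should be read as a lower bound on the series summation alone:

```python
def estimate_total_days(elapsed_days, percent_done):
    """Naive linear extrapolation: assumes every percent takes equally long."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_days * 100.0 / percent_done


# Jacka's earlier data point: 8% of the series summation after 3 days.
naive_estimate = estimate_total_days(3, 8)
print(f"Naive estimate: {naive_estimate} days")
```

That reproduces the 37.5-day figure quoted earlier. Because disk access grows relative to computation later in a swap run, and the final steps are not covered by the percentage, the real total will be longer.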



    Here are some results I've received on the new Sandies:



    *Note that I had to link to these. For some reason, it wouldn't let me attach them...

  24. #724
    Registered User
    Join Date
    Jan 2008
    Posts
    37
    Got a few Bulldozer results to add. I've noticed the AVX version is slower than any of the other versions. The Kasumi version gives me the fastest results.

    Non AVX build
    Code:
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,194,223 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        25,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        208 MB
    
    Start Date:            Sat Dec 17 18:36:33 2011
    End Date:              Sat Dec 17 18:36:40 2011
    
    Computation Time:      6.573 seconds
    Total Time:            7.462 seconds
    
    CPU Utilization:           606.39 %
    Multi-core Efficiency:     75.79 %
    
    Last Digits:
    3803750790 9491563108 2381689226 7224175329 0045253446  :  24,999,950
    0786411592 4597806944 2455112852 2554677483 6191884322  :  25,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   5bed7b5a9e0192881f15c2045125f42b5ae9ed72e56bc3861710262549318c9a
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,196,015 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        50,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        318 MB
    
    Start Date:            Sat Dec 17 18:32:56 2011
    End Date:              Sat Dec 17 18:33:12 2011
    
    Computation Time:      14.159 seconds
    Total Time:            15.638 seconds
    
    CPU Utilization:           649.04 %
    Multi-core Efficiency:     81.13 %
    
    Last Digits:
    4127897300 0153683630 8346732220 0943329365 1632962502  :  49,999,950
    5130045796 0464561703 2424263071 4554183801 7945652654  :  50,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   51cbf3f210a5a72f11c5675eab992b63fd0bd148af3f753109ce8a1c8c16102f
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,187,312 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        100,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        537 MB
    
    Start Date:            Sat Dec 17 18:33:12 2011
    End Date:              Sat Dec 17 18:33:44 2011
    
    Computation Time:      30.043 seconds
    Total Time:            32.425 seconds
    
    CPU Utilization:           692.62 %
    Multi-core Efficiency:     86.57 %
    
    Last Digits:
    9948682556 3967530560 3352869667 7734610718 4471868529  :  99,999,950
    7572203175 2074898161 1683139375 1497058112 0187751592  :  100,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   eabc1b9f269f1ac70b31281db6e8a002fa0687ec2ebcf83e0a258d52217343d9
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,179,152 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        250,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        1.26 GB
    
    Start Date:            Sat Dec 17 18:33:45 2011
    End Date:              Sat Dec 17 18:35:13 2011
    
    Computation Time:      82.719 seconds
    Total Time:            88.326 seconds
    
    CPU Utilization:           715.83 %
    Multi-core Efficiency:     89.47 %
    
    Last Digits:
    3673748634 2742427296 0219667627 3141599893 4569474921  :  249,999,950
    9958866734 1705167068 8515785208 0067520395 3452027780  :  250,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   9aacf4b263085aa40314586e6cada66b58c71834de2862e3047f1eb9d9708780
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,192,623 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        500,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        2.42 GB
    
    Start Date:            Sat Dec 17 18:38:01 2011
    End Date:              Sat Dec 17 18:41:13 2011
    
    Computation Time:      181.588 seconds
    Total Time:            192.369 seconds
    
    CPU Utilization:           733.02 %
    Multi-core Efficiency:     91.62 %
    
    Last Digits:
    3896531789 0364496761 5664275325 5483742003 7847987772  :  499,999,950
    5002477883 0364214864 5906800532 7052368734 3293261427  :  500,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   14a1a7895ac26b0908c0805cc03387f68d5d30eec1ba1ef702408aef2913e5e4
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,185,920 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        1,000,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        4.75 GB
    
    Start Date:            Sat Dec 17 18:41:35 2011
    End Date:              Sat Dec 17 18:48:44 2011
    
    Computation Time:      407.531 seconds
    Total Time:            429.508 seconds
    
    CPU Utilization:           734.00 %
    Multi-core Efficiency:     91.75 %
    
    Last Digits:
    6434543524 2766553567 4357021939 6394581990 5483278746  :  999,999,950
    7139868209 3196353628 2046127557 1517139511 5275045519  :  1,000,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   bd5edc28901dc0491325c720ca5eda5ef6033775284b492f18401969e5e9c60f
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  "Username.txt" Not found.
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,110,671 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 SSE3 - Windows ~ Kasumi)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        2,500,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        11.2 GB
    
    Start Date:            Fri Dec 16 12:48:04 2011
    End Date:              Fri Dec 16 13:07:48 2011
    
    Computation Time:      1,110.708 seconds
    Total Time:            1,183.704 seconds
    
    CPU Utilization:           748.67 %
    Multi-core Efficiency:     93.58 %
    
    Last Digits:
    0917027898 3554136437 7123165188 3528593128 0032489094  :  2,499,999,950
    9228502005 4677489552 2459688725 5242233502 7255998083  :  2,500,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   37b61cb11d26c4580dbb2df46871d470e6a07f0e54e967ab14e2686e42552af7
    AVX build
    Code:
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Invalid Username
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,180,848 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        25,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        207 MB
    
    Start Date:            Sat Dec 17 18:24:45 2011
    End Date:              Sat Dec 17 18:24:53 2011
    
    Computation Time:      7.416 seconds
    Total Time:            8.379 seconds
    
    CPU Utilization:           618.17 %
    Multi-core Efficiency:     77.27 %
    
    Last Digits:
    3803750790 9491563108 2381689226 7224175329 0045253446  :  24,999,950
    0786411592 4597806944 2455112852 2554677483 6191884322  :  25,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   1da6d2b94e01a45b0b9ae9fee03fc2a03588f28acb3a6c2803f3278c82f874bb
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Invalid Username
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,194,239 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        50,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        317 MB
    
    Start Date:            Sat Dec 17 18:24:53 2011
    End Date:              Sat Dec 17 18:25:10 2011
    
    Computation Time:      15.498 seconds
    Total Time:            16.950 seconds
    
    CPU Utilization:           652.14 %
    Multi-core Efficiency:     81.51 %
    
    Last Digits:
    4127897300 0153683630 8346732220 0943329365 1632962502  :  49,999,950
    5130045796 0464561703 2424263071 4554183801 7945652654  :  50,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   92e6bfc610e8cb2e1a43e1fe3951b0da4cd1587227d009131bba1644dc695e41
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Invalid Username
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,189,711 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        100,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        536 MB
    
    Start Date:            Sat Dec 17 18:25:11 2011
    End Date:              Sat Dec 17 18:25:46 2011
    
    Computation Time:      33.040 seconds
    Total Time:            35.420 seconds
    
    CPU Utilization:           691.03 %
    Multi-core Efficiency:     86.37 %
    
    Last Digits:
    9948682556 3967530560 3352869667 7734610718 4471868529  :  99,999,950
    7572203175 2074898161 1683139375 1497058112 0187751592  :  100,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   506461dede0c44651003193e4571d6cc6a5f916c5a9ff3a110e1313238487035
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Invalid Username
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,480,193,119 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        250,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        1.26 GB
    
    Start Date:            Sat Dec 17 18:25:46 2011
    End Date:              Sat Dec 17 18:27:24 2011
    
    Computation Time:      92.413 seconds
    Total Time:            98.254 seconds
    
    CPU Utilization:           719.32 %
    Multi-core Efficiency:     89.91 %
    
    Last Digits:
    3673748634 2742427296 0219667627 3141599893 4569474921  :  249,999,950
    9958866734 1705167068 8515785208 0067520395 3452027780  :  250,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      0
    Checkpoint From:           None
    
    ----
    
    Checksum:   2dc8f82107829f6d0d04a39895d47f7ef3f2c363d6d8d3aa1ebaab686a8458bd
    
    
    Validation Version:    1.1
    
    Program:               y-cruncher - Gamma to the eXtReMe!!!     ( www.numberworld.org )
                           Copyright 2008-2011 Alexander J. Yee    ( a-yee@u.northwestern.edu )
    
    
    User:                  Invalid Username
    
    
    Processor(s):          AMD FX(tm)-8120 Eight-Core Processor 
    Logical Cores:         8
    Physical Memory:       17,139,204,096 bytes  ( 16.0 GB )
    CPU Frequency:         4,575,473,247 Hz
    
    Program Version:       0.5.5 Build 9180 (fix 2) (x64 AVX - Windows ~ Hina)
    Constant:              Pi
    Algorithm:             Chudnovsky Formula
    Decimal Digits:        500,000,000
    Hexadecimal Digits:    Disabled
    Threading Mode:        8 threads
    Computation Mode:      Ram Only
    Swap Disks:            0
    Working Memory:        2.42 GB
    
    Start Date:            Tue Nov 01 18:01:26 2011
    End Date:              Tue Nov 01 18:04:54 2011
    
    Computation Time:      196.326 seconds
    Total Time:            208.336 seconds
    
    CPU Utilization:           756.13 %
    Multi-core Efficiency:     94.51 %
    
    Last Digits:
    3896531789 0364496761 5664275325 5483742003 7847987772  :  499,999,950
    5002477883 0364214864 5906800532 7052368734 3293261427  :  500,000,000
    
    Timer Sanity Check:        Passed
    Frequency Sanity Check:    Passed
    ECC Recovered Errors:      2
    Checkpoint From:           None
    
    ----
    
    Checksum:   af756f9dd2cd9e330ddeace0d23b39a279ba8341c9bbd02115ee352b0b4f6844

  25. #725
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    That's interesting. I remember seeing a similar result on a different forum. Have you tried the larger sizes, like 1 billion digits?

    Another thing is that the AVX binary is indeed tuned for Sandy Bridge, so it's very likely to be sub-optimal on Bulldozer.
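
    To put a rough number on the gap, here is a minimal sketch that compares the "Computation Time" values from the SSE3 and AVX validation logs posted above. The helper name `avx_slowdown` is made up for illustration and is not part of y-cruncher. Only the sizes that appear in both sets of logs are included, so the 1 billion and 2.5 billion digit runs are left out.

```python
# Computation times in seconds, copied from the validation logs above.
# All runs: AMD FX-8120, 8 threads, y-cruncher 0.5.5 Build 9180 (fix 2).
sse3 = {
    50_000_000: 14.159,
    100_000_000: 30.043,
    250_000_000: 82.719,
    500_000_000: 181.588,
}
avx = {
    50_000_000: 15.498,
    100_000_000: 33.040,
    250_000_000: 92.413,
    # Caveat: this AVX run is dated Nov 01 at about 4.575 GHz (the SSE3 runs
    # are at about 4.480 GHz) and logged 2 recovered ECC errors, so its ratio
    # is not an apples-to-apples comparison.
    500_000_000: 196.326,
}


def avx_slowdown(sse3_times, avx_times):
    """Return {digits: avx_time / sse3_time} for sizes present in both logs.

    Hypothetical helper for illustration only.
    """
    return {
        digits: avx_times[digits] / sse3_times[digits]
        for digits in sorted(sse3_times)
        if digits in avx_times
    }


if __name__ == "__main__":
    for digits, ratio in avx_slowdown(sse3, avx).items():
        print(f"{digits:>13,} digits: AVX takes {ratio:.3f}x the SSE3 time")
```

    On these numbers the AVX build comes out roughly 8 to 12% slower than the SSE3 build on this chip. These are single runs, so the exact figures should be taken loosely.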
    Main Machine:
    AMD FX8350 @ stock --- 16 GB DDR3 @ 1333 MHz --- Asus M5A99FX Pro R2.0 --- 2.0 TB Seagate

    Miscellaneous Workstations for Code-Testing:
    Intel Core i7 4770K @ 4.0 GHz --- 32 GB DDR3 @ 1866 MHz --- Asus Z87-Plus --- 1.5 TB (boot) --- 4 x 1 TB + 4 x 2 TB (swap)
