Thread: CUDA Factorial Benchmark from TOC.ru

  1. #1
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556

    CUDA Factorial Benchmark from Total-OC.ru

    http://total-oc.ru/download.php?id=101



    Show me what you've got. I'm interested in high-end NVIDIA cards. Remember: shader and on-card memory clocks are essential.
    Last edited by [XC] riptide; 09-25-2009 at 04:26 PM.

  2. #2
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    You might want the problem size to be bigger, just a thought. I know you've already seen my results; I'm just trying to get this thread going. I wonder if anyone on XS has access to a Tesla system. Memory is at 1200 and shaders are at 1500.

  3. #3
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    It's fine. You can do any problem size you want. Your GFX card is a 260? Is that the one with 216 shaders?

  4. #4
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    I wonder what algorithm it uses...

    750000! can be done well under a second with Mathematica 6.0 single-threaded on any i7.

    Yet it takes 1+ minutes using CUDA?



    *Sorry for being off topic.
    Main Machine:
    AMD FX8350 @ stock --- 16 GB DDR3 @ 1333 MHz --- Asus M5A99FX Pro R2.0 --- 2.0 TB Seagate

    Miscellaneous Workstations for Code-Testing:
    Intel Core i7 4770K @ 4.0 GHz --- 32 GB DDR3 @ 1866 MHz --- Asus Z87-Plus --- 1.5 TB (boot) --- 4 x 1 TB + 4 x 2 TB (swap)

  5. #5
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    Quote Originally Posted by poke349 View Post
    I wonder what algorithm it uses...

    750000! can be done well under a second with Mathematica 6.0 single-threaded on any i7.

    Yet it takes 1+ minutes using CUDA?



    *Sorry for being off topic.
    Have you got Mathematica? Can you show us the result? We already know there are insanely fast algorithms and approximations that can be used (http://www.luschny.de/math/factorial/Benchmark.html), but we also know there are many algorithms for calculating Pi to 1 million places, all with different times.
    Last edited by [XC] riptide; 08-30-2009 at 03:53 PM.

  6. #6
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by [XC] riptide View Post
    Have you got Mathematica? Can you show us the result? We already know there are insanely fast algorithms and approximations that can be used (http://www.luschny.de/math/factorial/Benchmark.html), but we also know there are many algorithms for calculating Pi to 1 million places, all with different times.
    Code:
    Timing[750000!;]
    {0.405, Null}
    i7 920 @ 3.5 GHz

    Mathematica can't multi-thread high-precision arithmetic, so there's no way to see how much faster it could go.


    On the other hand... I can mod the program in my siggy to do multi-threaded factorials (using a sub-optimal algorithm)... And I'm certain I can beat 0.405 seconds.

    But again... off topic.
    Last edited by poke349; 08-30-2009 at 04:01 PM.

  7. #7
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    Quote Originally Posted by poke349 View Post
    Code:
    Timing[750000!;]
    {0.405, Null}
    i7 920 @ 3.5 GHz
    Can you export all the digits? I mean, does it deliver all the digits? There should be 4080579 digits.
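
The digit count can be cross-checked without computing the factorial at all. The sketch below is not part of the benchmark; it assumes Python and uses the standard-library `math.lgamma` to obtain log10(n!), and the helper name `factorial_digit_count` is purely illustrative. Because it relies on floating point, it could be off by one if log10(n!) landed extremely close to an integer, which is not the case for the values used here.

```python
import math

def factorial_digit_count(n: int) -> int:
    """Estimate the number of decimal digits of n! without computing n!."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # lgamma(n + 1) is ln(n!); dividing by ln(10) gives log10(n!).
    log10_factorial = math.lgamma(n + 1) / math.log(10)
    # A positive integer x has floor(log10(x)) + 1 decimal digits.
    return math.floor(log10_factorial) + 1

# Sanity check against exact values for small n.
for n in (10, 100, 1000):
    assert factorial_digit_count(n) == len(str(math.factorial(n)))

print(factorial_digit_count(750_000))
```

For n = 750000 this gives 4,080,579 digits, which is consistent with the Mathematica output later in the thread (4079973 digits elided, plus the few hundred shown at each end).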

  8. #8
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by [XC] riptide View Post
    Can you export all the digits? I mean, does it deliver all the digits? There should be 4080579 digits.
    That run of 0.405 seconds did only binary digits. Printing it out in decimal requires an expensive conversion.


    Code:
    750000!
    \!\(\*
    TagBox[
    RowBox[{"2646896442810456334473283390526976189442958803731348335812907\
    9334567747113504796887022327350144664381155203676817108918748679291696\
    6443372148573575453227479621798163102781469763477812875007762400556456\
    3838296982600913849826449820515029294880777450379489322119687361868491\
    51503071358153700424169800424565", 
    RowBox[{"<<", "4079973", ">>"}], 
        "00000000000000000000000000000000000000000000000000000000000000000\
    0000000000000000000000000000000000000000000000000000000000000000000000\
    0000000000000000000000000000000000000000000000000000000000000000000000\
    0000000000000000000000000000000000000000000000000000000000000000000000\
    0000000000000000000000000000"}],
    Short[#, 5]& ]\)

    Took roughly 6 seconds including conversion.
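
The split between computing the factorial and printing it shows up in other big-integer implementations too. The following is only a rough sketch in Python, not what Mathematica or the benchmark does; `time_factorial_and_conversion` is a made-up helper, and the absolute timings depend entirely on the machine and the Python version. Python integers are stored in binary, so `str()` has to do the binary-to-decimal conversion as a separate step.

```python
import math
import sys
import time

# Recent Python versions cap int -> str conversion length by default;
# lift the cap when the setting exists.
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)

def time_factorial_and_conversion(n: int):
    """Return (compute_seconds, convert_seconds, digit_count) for n!."""
    start = time.perf_counter()
    value = math.factorial(n)       # result is held in binary
    computed = time.perf_counter()
    decimal_text = str(value)       # binary -> decimal conversion
    converted = time.perf_counter()
    return computed - start, converted - computed, len(decimal_text)

compute_s, convert_s, digits = time_factorial_and_conversion(20_000)
print(f"compute: {compute_s:.4f}s  convert: {convert_s:.4f}s  digits: {digits}")
```

The problem size of 20,000 is deliberately small so the example finishes quickly; raising it makes the conversion share of the total time easier to see.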

  9. #9
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    (Thanks for the Benchmark link.)

    I have a question. When I ran the CUDA Factorial Benchmark program, I set it up for 900,000 and 4 threads @ 3.81GHz on my Q6600.
    My GPU is listed as GTX 295, and is running 2.5 times faster than my CPU. Just for the record, that GPU test is only calculating on 1/2 of my 295, correct?
    (Basically, running on the equivalent of a single GTX 275.)


    CPU takes 3 minutes, 20.828s
    GPU takes 1 minute, 21.580s
    2.5 x faster...


    Note that if I set the benchmark for 999,000 my GPU moves up to about 3.0 X as fast as my CPU. The bigger the order, the more the GPU appears to gain.
    CPU takes 4 minutes, 23.231s
    GPU takes 1 minute, 27.306s
    Checksum: 578712543720173939
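
For reference, the ratios can be recomputed from the posted times. This is just a small Python sketch of the arithmetic; `to_seconds` is an illustrative helper, and the numbers are the ones quoted in the post.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a 'M minutes, S seconds' reading to seconds."""
    return 60 * minutes + seconds

# Times as posted: Q6600 @ 3.81 GHz (4 threads) vs. one half of a GTX 295.
runs = {
    900_000: {"cpu": to_seconds(3, 20.828), "gpu": to_seconds(1, 21.580)},
    999_000: {"cpu": to_seconds(4, 23.231), "gpu": to_seconds(1, 27.306)},
}

# Speedup = CPU time divided by GPU time for the same problem size.
speedups = {n: t["cpu"] / t["gpu"] for n, t in runs.items()}

for n, s in speedups.items():
    print(f"{n}!: GPU is {s:.2f}x as fast as the CPU")
```

From these times the ratios come out to roughly 2.46 for 900,000 and 3.02 for 999,000.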

    I would love to find an app that checks whether you have more than one GPU in your system and then uses them all.
    Three instances of Folding can load up three GPUs (but that is three separate programs running).
    A game with PhysX can run graphics in SLI and PhysX on another card (still, that's graphics on two or more and PhysX on the other).
    But there is still not one benchmark program that uses all the GPUs in your system.

    To be fair, I think CUDA apps couldn't grab cards in SLI until a recent release... I believe?
    I would love to see this CUDA Factorial benchmark grab all available GPUs in your system in a later update.

    If 1/2 of my 295 can be 3 X as fast as my Q6600, I have to wonder how many times faster it would be with both my 295 and 280 in on the deal...

    I imagine DX11 will also use just one GPU for video transcoding, not all the GPUs in your system?
    Last edited by Talonman; 09-06-2009 at 01:21 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  10. #10
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    ^^ you realize we came to the conclusion cpu was faster. just read the posts above.

  11. #11
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Is that using Mathematica, and a different equation?

    I don't know if we can do a valid comparison that way...

    http://www.hearne.com.au/products/mathematica/pricing/
    Looks pricey!!

  12. #12
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    CPUs and GPUs use different algorithms to get the best performance. It would be an invalid comparison if we used the same algorithm. Both of these processors have advantages and drawbacks.

  13. #13
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote Originally Posted by Talonman View Post
    Is that using Mathematica, and a different equation?

    I don't know if we can do a valid comparison that way...

    http://www.hearne.com.au/products/mathematica/pricing/
    Looks pricey!!

    Mathematica uses GMP, which is open source and free, so price doesn't matter.
    You'd get the same fast timings using GMP directly as through Mathematica.

    Obviously this isn't a fair comparison at all.
    GMP uses state-of-the-art algorithms which are much faster, but probably not as easily parallelized as whatever this CUDA benchmark uses.
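
To give a feel for how much the choice of algorithm matters, here is a minimal Python sketch comparing a plain left-to-right product with a balanced "product tree". It is not claimed to be what GMP, Mathematica, or the CUDA benchmark actually do; the function names are invented for illustration. The idea is that multiplying operands of similar size lets the big-integer library use its faster large-multiplication routines, whereas the naive loop always multiplies one huge number by one small number.

```python
import math

def factorial_naive(n: int) -> int:
    """n! as a running product: huge accumulator times a small factor."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

def product_range(lo: int, hi: int) -> int:
    """Product of the integers lo..hi (inclusive) by recursive halving."""
    if lo > hi:
        return 1
    if lo == hi:
        return lo
    mid = (lo + hi) // 2
    # Both halves have a similar number of digits, so the final
    # multiplications are between balanced operands.
    return product_range(lo, mid) * product_range(mid + 1, hi)

def factorial_split(n: int) -> int:
    """n! computed with the balanced product tree above."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return product_range(1, n)

# Both variants must agree with the standard library.
for n in (0, 1, 2, 10, 1000):
    assert factorial_naive(n) == factorial_split(n) == math.factorial(n)
```

The two halves of each split are independent of each other, so this layout is also a natural starting point for spreading the work across threads or devices, although the last few multiplications remain serial.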

  14. #14
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Thanks for the info guys..

  15. #15
    Registered User
    Join Date
    Sep 2009
    Location
    Rostov-on-Don, Russia
    Posts
    28
    Hi ALL.

    I am from www.total-oc.ru and can answer any questions you have about this benchmark.

    Right now it uses only one GPU, but we are working to make it universal, so that the test can use all GPUs and CPU cores simultaneously or in any combination.

    As for calculation speed: it is slower because the same algorithm is used on both the CPU and the GPU, so that the results are comparable. The aim was a universal algorithm, not the fastest possible one.

    My world record on the GPU in this test is 4.331.
    The CPU world record in this test, 6.888, was set by the XtremeLabs.org community.

    There is a discussion on our forum in English, with some results, here.

    I am ready to answer any questions about the test.

  16. #16
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    Quote Originally Posted by OverFoxtrot View Post
    Hi ALL.

    I am from www.total-oc.ru and can answer any questions you have about this benchmark.

    Right now it uses only one GPU, but we are working to make it universal, so that the test can use all GPUs and CPU cores simultaneously or in any combination.

    As for calculation speed: it is slower because the same algorithm is used on both the CPU and the GPU, so that the results are comparable. The aim was a universal algorithm, not the fastest possible one.

    My world record on the GPU in this test is 4.331.
    The CPU world record in this test, 6.888, was set by the XtremeLabs.org community.

    There is a discussion on our forum in English, with some results, here.

    I am ready to answer any questions about the test.
    Welcome.

    I haz a question, however. There seems to be a discrepancy between the CPU result and the GPU result once you go above n of roughly 3000. Is there an approximation that differs between the CPU and the GPU? A different algorithm?
    Last edited by [XC] riptide; 09-25-2009 at 04:07 AM.

  17. #17
    Registered User
    Join Date
    Oct 2005
    Posts
    41
    My Core 2 Duo E6400 + XFX 8800GTX

    System:

    CPU: Core 2 Duo E6400 (2.13GHz) @ 3.20GHz
    Video: XFX 8800GTX @ stock
    Ram: 4x512 OCZ Platinum @ 400MHz

    Results:

    CPU: 5m 46.113s
    GPU: 3m 4.773s



    As you can see, my E6400 @ 3.20GHz is a little too weak against the 8800GTX, so I need some quad power.
    CPU: Intel Core 2 Duo E6400 @ 3.2Ghz Cooled By TT BT
    Mobo: GigaByte GA-965P-S3
    Video: nVidia XFX GeForce 8800GTX 768Mb GDDR3
    Ram: 4x512Mb DDR2 OCZ Platinum Edition PC2 6400
    HDD: WD 250GB 16Mb Sata II
    PSU: FSP Epsilon 700W

    Fan(s): 2x120mm

  18. #18
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    Quote Originally Posted by HeUeR View Post
    My Core 2 Duo E6400 + XFX 8800GTX

    System:

    CPU: Core 2 Duo E6400 (2.13GHz) @ 3.20GHz
    Video: XFX 8800GTX @ stock
    Ram: 4x512 OCZ Platinum @ 400MHz

    Results:

    CPU: 5m 46.113s
    GPU: 3m 4.773s



    As you can see, my E6400 @ 3.20GHz is a little too weak against the 8800GTX, so I need some quad power.
    Dude... there's plenty of mileage left in that GTX.... Push it!

  19. #19
    Registered User
    Join Date
    Oct 2005
    Posts
    41
    [XC] riptide, maybe you can help and tell me a little about overclocking the 8800GTX and such things? PM me with more info; I'm interested in overclocking my card to see how far it will go.
    Last edited by HeUeR; 09-25-2009 at 08:04 AM.

  20. #20
    Registered User
    Join Date
    Sep 2009
    Location
    Rostov-on-Don, Russia
    Posts
    28
    [XC] riptide
    This comes down to nuances of how the algorithm works on the GPU and the CPU. If you calculate <=3000!, the GPU is not used, but the CPU is.
    I improved my personal record on the CPU: 7.188

    Check file
    generated with the help of AntiCheat TOC 0.9.8.3
    Last edited by OverFoxtrot; 09-25-2009 at 02:35 PM.

  21. #21
    Registered User
    Join Date
    Oct 2005
    Posts
    41
    Hi again,

    got new score in gpu test with my XFX 8800GTX

    OLD Result:

    GPU: 3m 4.773s

    NEW Result:

    GPU: 2m 54.065s (~10s faster)


  22. #22
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Quote Originally Posted by OverFoxtrot View Post
    Hi ALL.

    I am from www.total-oc.ru and can answer any questions you have about this benchmark.

    Right now it uses only one GPU, but we are working to make it universal, so that the test can use all GPUs and CPU cores simultaneously or in any combination.
    Outstanding!! Thanks for the post. GPU's rule.

  23. #23
    Registered User
    Join Date
    Sep 2009
    Location
    Rostov-on-Don, Russia
    Posts
    28
    HeUeR
    Please calculate 250000! on your video card. We are collecting statistics on all kinds of hardware.
    Talonman
    Our project is developing more test packages for comparing the speed of video cards and processors, as well as useful utilities. I will shortly post a thread here so that we can test speed with them too.

  24. #24
    Registered User
    Join Date
    Sep 2009
    Location
    Rostov-on-Don, Russia
    Posts
    28
    [XC] riptide
    Please change the thread title from TOC.ru to Total-OC.ru

  25. #25
    Registered User
    Join Date
    Oct 2005
    Posts
    41
    Hi,

    OverFoxtrot, I got some scores for you


    750000 bench:

    GPU: 2m 45.140s




    250000 bench:

    GPU: 27.903s



    Both results are with the same 8800GTX clocks and the same CPU clock; I just forgot to take a GPU-Z screenshot for the 750000 bench.
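
As a rough side calculation, the two GPU times above hint at how the run time grows with problem size. The Python sketch below assumes a simple power law, time proportional to n**p, which is only a modelling assumption; `scaling_exponent` is an illustrative helper, not something taken from the benchmark.

```python
import math

def scaling_exponent(n_small: int, t_small: float,
                     n_large: int, t_large: float) -> float:
    """Exponent p in t ~ n**p implied by two (n, time) measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)

# GPU times posted for the 8800GTX, converted to seconds.
gpu_250k = 27.903             # 250000!
gpu_750k = 2 * 60 + 45.140    # 750000!

p = scaling_exponent(250_000, gpu_250k, 750_000, gpu_750k)
print(f"time ratio: {gpu_750k / gpu_250k:.2f}, implied exponent: {p:.2f}")
```

That is about 5.9 times the run time for 3 times the problem size, or an exponent of about 1.6. With only two data points, and any fixed startup overhead included in both, this is a rough indication and not a measurement of the algorithm's complexity.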
