Page 1 of 7
Results 1 to 25 of 173

Thread: ratGPU OpenCL raytracing benchmark

  1. #1
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282

    ratGPU OpenCL raytracing benchmark

    Hi!

    We've created a free program that renders images using raytracing on the GPU and benchmarks OpenCL performance. It's called ratGPU and you can get it at www.ratgpu.com ( which temporarily redirects to www.xnormal.net/ratGPU ).

    There is a small benchmark mode with web verification. If you could test it a bit, it would be really appreciated.

    Usage:

    1. Download and install the program ( www.ratgpu.com ). You should reboot after installing it ( see limitation #3 to know why ).

    2. Run StandAloneRenderer.exe ( easiest from the "Start" menu: S.Org->ratGPU->StandAloneRenderer ). Note that on a 64-bit OS like Windows 7 x64 you should run the "Standalone renderer (64bits)" entry: due to a current limitation of the Forceware drivers, the 32-bit version currently fails on a 64-bit OS. Use the x86 version on a 32-bit OS and the x64 version on a 64-bit OS.

    3. Press the "benchmark" button on the top of the toolbar and wait until it finishes.



    You can watch the progress of the current test, or cancel it with the button.



    4. Copy the verification URL link and save it if you need to validate the result later ( for example, for HW-bot ).




    Limitations/problems:
    0. As it uses the GPU, your monitor or the Windows UI can become very unresponsive while it runs.

    1. Old drivers ( Forceware 197 ) might have problems. The FW 25X are problematic too. You should use the FW 260 Beta atm.

    2. If you use an ATI card, you must download and install their OpenCL SDK runtime ( because Catalyst drivers don't include it... don't ask me why... )
    Download it at http://developer.amd.com/gpu/atistre...s/default.aspx
    Note: Only Radeon 5XXX are fully supported, 4XXX series lack the required functionality.

    3. If you use Windows XP, a thing called the "GPU watchdog" could kick in, resetting the graphics driver and aborting the program. That's a thing Microsoft added in the 90s to reset the driver in case a program stressed the GPU for more than 5 seconds. Of course, that was before GPGPU existed; modern OSs like Vista/Windows 7 include a mechanism to avoid it ( the installer does that automatically, but on WinXP it simply cannot be disabled ).
    To solve that problem I suggest you get a second GPU and disable the primary card ( the one connected to the monitor ): select it in the "config" tab and then uncheck the "use this device" checkbox.
    Again, neither Vista nor 7 requires that because the watchdog is disabled automatically.
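
    For anyone who wants to check the watchdog state on Vista/7, here is a minimal Python sketch. It assumes the watchdog in question is the Windows Timeout Detection and Recovery (TDR) feature, configured through the TdrLevel and TdrDelay registry values. The post does not say which mechanism the ratGPU installer really uses, so treat the key choice as an assumption; read_tdr_settings is just an illustrative name. It only reads the values and changes nothing.

```python
# Sketch: read the Windows TDR ("GPU watchdog") settings on Vista/7 or later.
# Assumption: the watchdog meant in the post is configured through the
# TdrLevel / TdrDelay values below; the post does not confirm this.
import sys

TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def read_tdr_settings():
    """Return {'TdrLevel': ..., 'TdrDelay': ...}, or None when not on Windows."""
    if not sys.platform.startswith("win"):
        return None
    import winreg  # only available on Windows

    settings = {"TdrLevel": None, "TdrDelay": None}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            for name in settings:
                try:
                    value, _value_type = winreg.QueryValueEx(key, name)
                    settings[name] = value
                except FileNotFoundError:
                    pass  # value not set: the OS default applies
    except OSError:
        pass  # key missing or not readable: leave both entries as None
    return settings


if __name__ == "__main__":
    print(read_tdr_settings())
```

    A TdrLevel of 0 means timeout detection is off; TdrDelay is the timeout in seconds. None means the value is not set and the OS default applies.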


    My score with a reference Gainward GTX460 (192bits), W7 x64, FW260b and an i7 920 is 598s-601s.
    Here is a screenshot showing several things:



    Btw, we're porting it to Ubuntu 10.10 too.

    Thanks!
    Last edited by jogshy; 10-18-2010 at 09:49 PM.

  2. #2
    Xtreme Addict
    Join Date
    Apr 2007
    Location
    canada
    Posts
    1,886
    no ati 4290 love ????
    WILL CUDDLE FOR FOOD

    Quote Originally Posted by JF-AMD View Post
    Dual proc client systems are like sex in high school. Everyone talks about it but nobody is really doing it.

  3. #3
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by Sn0wm@n View Post
    no ati 4290 love ????
    I'm afraid not. ATI said the Radeon 4XXX series have too many OpenCL limitations:

    http://www.xbitlabs.com/news/video/d...rformance.html

    Only the 5XXX ones are supported because the 4XXX ones cannot use more than one hardware buffer and I'm using more than ten.
    For NVIDIAs, any card equal to or newer than a G80 should work, but the G80's performance under OpenCL is seriously limited. I recommend using a GT200 or Fermi, or the test will take forever.

    Btw, you can also use the CPU in case no GPU could be used... but it will be much much much slower. Just select your CPU in the "device" combobox.
    Also, you can load one of the example scenes located in the [installdir]\StandAloneRenderer\scenes folder, go to the "renderer" tab, select "Path-tracing" and press render. In that way only that scene will be rendered and you can also make the image smaller if you need a faster test.

    ps: edited/updated "limitations"
    Last edited by jogshy; 10-17-2010 at 07:35 PM.

  4. #4
    Xtreme Addict
    Join Date
    Apr 2007
    Location
    canada
    Posts
    1,886
    thanks for the info man

  5. #5
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Gave it a go and couldnt quite get it to finish..

    CPU renderer crashed out
    GPU renderer got to the 4th or 5th scene (?) then a pop-up came up saying "ratGPU cannot render the tile" or something to that effect.

    system is:

    2x E5450 @ 3.5ghz (LGA771, 4cores per CPU)
    8gb ram
    GTX480 with 260.63 drivers, 797/1599/1998 clocks
    WinXP-64

    All along the watchtower the watchmen watch the eternal return.

  6. #6
    Registered User
    Join Date
    May 2006
    Posts
    67
    Mine-SLi GTX460 @ 860/2100.
    Attached Images

  7. #7
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by STEvil View Post
    GPU renderer got to the 4th or 5th scene (?) then a pop-up came up saying "ratGPU cannot render the tile" or something to that effect.

    system is:
    ....
    WinXP-64
    For simple scenes like the ambient occlusion one or the balls the watchdog probably didn't pop, but the last scene is very complex, so if a pass took more than 5 secs to execute then the watchdog probably popped, causing the error.
    For WinXP you might need to use a second card and disable the primary adapter, or the GPGPU watchdog will pop. I'm afraid WinXP is not really prepared to perform GPGPU with only one card 8( With Vista/7 it should work better.


    You mention the CPU device crashes... have you tried it without OC, pls? The CPU device test stresses all the cores so it might be due to OC, but I'll review the code for bugs just in case.
    Btw, are you using only the CPU device or in combination with the GPU ones?

    Quote Originally Posted by Arctucas View Post
    Mine-SLi GTX460 @ 860/2100.
    Nice SLI scaling! ( > 2x my score! ). Are you using 192bits cards or 256bits ones?


    Btw, you can use BOTH cpu+gpu hybrid rendering ( play a bit with the "use this device" option ), but sometimes the GPU has to wait for the CPU so the result might be worse than using only the GPU. If you disable the GPU device and use only the CPU you'll notice the GPU is much much much faster. I recommend checking only the GPU devices to render fast, and only the CPU to perform a CPU OC stress test.
    For instance, compare the 600s of the GPU-only renderer vs the 2000s of the CPU-only:



    CPU+GPU together did 721s, worse than GPU-only because my CPU is much slower than a Fermi card so it only slowed down the GPU ( a chain is only as strong as its weakest link ).
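
    A toy model shows how adding a slow device can hurt. This is not ratGPU's actual scheduler, only an illustration using the 600s and 2000s figures from this post. It assumes each device gets a fixed share of the work and that the render finishes when the slowest device is done. The function names are made up for the example.

```python
def static_split_time(solo_times, shares=None):
    """Finish time when each device gets a fixed share of the work.

    solo_times: seconds each device needs to render the whole job alone.
    shares: fraction of the work given to each device (equal split by default).
    """
    n = len(solo_times)
    if shares is None:
        shares = [1.0 / n] * n
    # The job is done only when the slowest device finishes its share.
    return max(t * s for t, s in zip(solo_times, shares))


def ideal_split_time(solo_times):
    """Best case: work divided exactly in proportion to device speed."""
    return 1.0 / sum(1.0 / t for t in solo_times)


if __name__ == "__main__":
    gpu, cpu = 600.0, 2000.0  # GPU-only and CPU-only times from the post
    print(static_split_time([gpu, cpu]))  # equal split: the CPU's half dominates
    print(ideal_split_time([gpu, cpu]))   # perfectly balanced split
```

    In this model an equal split takes 1000s, worse than the GPU alone, while a perfectly balanced split would take about 462s. The measured 721s falls between the two.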
    Last edited by jogshy; 10-18-2010 at 08:46 PM.

  8. #8
    Registered User
    Join Date
    May 2006
    Posts
    67
    @jogshy,

    Thanks.
    The cards are eVGA SuperClocked External Exhaust 1GB versions-01G-P3-1373-AR, so, 256-bit.

    Here is a run @900/2100:
    Attached Images
    Last edited by Arctucas; 10-20-2010 at 08:16 AM.

  9. #9
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Quote Originally Posted by jogshy View Post
    For simple scenes like the ambient occlusion one or the balls the watchdog probably didn't pop, but the last scene is very complex, so if a pass took more than 5 secs to execute then the watchdog probably popped, causing the error.
    For WinXP you might need to use a second card and disable the primary adapter, or the GPGPU watchdog will pop. I'm afraid WinXP is not really prepared to perform GPGPU with only one card 8( With Vista/7 it should work better.


    You mention the CPU device crashes... have you tried it without OC, pls? The CPU device test stresses all the cores so it might be due to OC, but I'll review the code for bugs just in case.
    Btw, are you using only the CPU device or in combination with the GPU ones?


    Nice SLI scaling! ( > 2x my score! ). Are you using 192bits cards or 256bits ones?


    Btw, you can use BOTH cpu+gpu hybrid rendering ( play a bit with the "use this device" option ), but sometimes the GPU has to wait for the CPU so the result might be worse than using only the GPU. If you disable the GPU device and use only the CPU you'll notice the GPU is much much much faster. I recommend checking only the GPU devices to render fast, and only the CPU to perform a CPU OC stress test.
    For instance, compare the 600s of the GPU-only renderer vs the 2000s of the CPU-only:



    CPU+GPU together did 721s, worse than GPU-only because my CPU is much slower than a Fermi card so it only slowed down the GPU ( a chain is only as strong as its weakest link ).
    Ah, the hybrid rendering is what made it crash. I didnt realize you could have multiple devices selected..

    As to the GPU tests, they for sure took more than 5 seconds and didnt cause the error until that ~4th test (final test?).

    Giving these a try to see if they help:

    http://msdn.microsoft.com/en-us/library/ff553890.aspx
    http://msdn.microsoft.com/en-us/library/ff553893.aspx


    edit..

    oh wow.. that was fast lol. 282.2

    http://www.ratgpu.com/verify.aspx?k=...EB1CE716751B0C
    Attached Thumbnails: ratgpu-800-2000.jpg (163.0 KB)
    Last edited by STEvil; 10-20-2010 at 11:59 PM.


  10. #10
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by STEvil View Post
    Interesting, thx! I'll try tomorrow.

    Quote Originally Posted by STEvil View Post
    edit..
    oh wow.. that was fast lol. 282.2
    Now you own the WR

    Btw, I've just uploaded the 0.4.3 with some minor bugs corrected and Ubuntu 10(.04/.10) support:
    Last edited by jogshy; 10-21-2010 at 01:44 AM.

  11. #11
    Registered User
    Join Date
    May 2006
    Posts
    67
    Using the 0.4.3 version, @925/2200:
    Attached Images

  12. #12
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Ran it on 3 480's and a 450

    Clocks on the 480's were 867/1734/1988 and the 450 was 930/1860/1900
    i7 980x at 4.4gHz.

    EVGA z68 FTW
    i7 2600k @ 4.8
    8gb DDR3 1600
    3x GTX 580 3gb HydroCopper2
    Silverstone Strider 1500W
    Areca 1880i w/ 6x intel x25m
    On water

  13. #13
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by tet5uo View Post
    Ran it on 3 480's and a 450
    Wow, nice!

    Btw, be careful when mixing in a card much slower than the rest. 3x 480 alone could perhaps be faster because, with a small image like the one used for the benchmark (512x512), "a chain is only as strong as its weakest link".
    Last edited by jogshy; 10-21-2010 at 01:30 PM.

  14. #14
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Quote Originally Posted by jogshy View Post
    Wow, nice!

    Btw, caution if you mix a card much slower than the rest. Perhaps 3X 480 could be faster because, with a small image like the one used for the benchmark (512x512) , the "chain is as strong as its weak link"

    Cool, yeah I noticed that 4th thread working alot slower than the rest

    I'll give it a shot with just the 480's now....

    Yep faster


  15. #15
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by tet5uo View Post
    I'll give it a shot with just the 480's now....
    Yep faster
    Wooow! I wonder what 4x 480s could do!

    Btw, can anybody test an ATI card, pls? In all my experiments ATI cards were much slower, even after I spent a lot of time optimizing for them. I think the incoherent ray tracing's secondary rays are flushing the Radeon's VLIW SIMD registers a lot, and the lack of DMA support is slowing down the PCI Express transfers and kernel calls.
    Btw, remember that to use ratGPU with an ATI card you must:

    1. Get a Radeon 5XXX or one of those new 6870s . Radeon 4XXX are not supported ( because they lack the required OpenCL functionality ).
    2. Download and install the ATI OpenCL SDK v2.2 at http://developer.amd.com/gpu/atistre...s/default.aspx
    3. Use Catalyst 10.9 or above.
    Last edited by jogshy; 10-21-2010 at 12:44 PM.

  16. #16
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    does this work with only CPUs too? and if so, about how long would that take on faster quad/hexies?
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  17. #17
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Quote Originally Posted by jogshy View Post
    Wooow! I wonder what 4x 480s could do!

    .

    They could make my power bills even more obscene

    I've already seen this system draw 1.38 kW from the wall with max GPU load in furmark

  18. #18
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by Manicdan View Post
    does this work with only CPUs too? and if so, about how long would that take on faster quad/hexies?
    Yep, ratGPU can perform CPU-only renders ( also GPU-only or GPU+CPU mixed-mode renders, aka "hybrid raytracing" ).

    To use only the CPU just select your GPUs with the Config->Device combo box and disable them with the "Use this device" check box. Once all your GPU devices have the "Use this device" UNchecked just select the CPU device and check in its "Use this device".

    If you render with only the CPU device you'll notice the GPU is really good at ray tracing. My i7 920 OCd to 4Ghz takes 2000s vs the 600s of my GTX460, so it's about 3.3X slower than my GPU.
    Last edited by jogshy; 10-21-2010 at 08:17 PM.

  19. #19
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Assuming 100% scaling, dual westmere's @ 4ghz should take 667s.
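
    For reference, the 667s figure can be reproduced with a quick sketch. It assumes the estimate starts from the 2000s run on the 4-core i7 920 @ 4GHz and scales linearly to 2 x 6 Westmere cores at the same clock, ignoring hyperthreading and memory effects. scaled_time and the efficiency parameter are illustrative, not anything from ratGPU.

```python
def scaled_time(base_seconds, base_cores, target_cores, efficiency=1.0):
    """Estimate render time on target_cores from a run on base_cores.

    efficiency=1.0 means perfect ("100%") scaling; lower values model
    the losses you would expect on real hardware.
    """
    return base_seconds * base_cores / (target_cores * efficiency)


if __name__ == "__main__":
    # 2000s on 4 cores -> 12 cores with perfect scaling
    print(round(scaled_time(2000, 4, 12)))  # 667
```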

    Will have to test on my SR-2 when its up in a week or two hopefully, if someone doesnt beat me to it.


  20. #20
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Quote Originally Posted by Manicdan View Post
    does this work with only CPUs too? and if so, about how long would that take on faster quad/hexies?
    Here's a CPU only run on my 980x @ 4.45 w/ 12gb of ddr3 2000mHz cas10


  21. #21
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by tet5uo View Post
    Here's a CPU only run on my 980x @ 4.45 w/ 12gb of ddr3 2000mHz cas10
    Ooh nice! So a dual socket will take about 600s, yep. Funny to see 2x 1000$ CPUs + the SR-2 MB catching up with a 160$ card


    Quote Originally Posted by STEvil View Post
    By the way, did you use that to make WinXP pass test #4, pls?
    For Vista/7 it's easy and I do it in the .exe installer, but a graphics driver engineer told me it was virtually impossible to disable the watchdog for WinXP ...
    Last edited by jogshy; 10-21-2010 at 08:44 PM.

  22. #22
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Hehe yeah, here's my 130$ after instant rebate card kickin my 1k$ cpu's butt...


  23. #23
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Oops, just realized I had steam open downloading a game while doing that CPU run.. might have shaved a few secs off without that running hehe

  24. #24
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    I'd test an ATI card for ya, but the only one I have here is in the closet, an old AGP x800XT :P

  25. #25
    Xtreme Addict
    Join Date
    Jul 2007
    Location
    Germany
    Posts
    1,592
    Is this reasonable for a HD5850 @ 900/1253?!

    Catalyst 10.8:


    Catalyst 10.9:


    Not much of an improvement between both drivers IMO.
    Last edited by p2501; 10-22-2010 at 01:45 AM.
    The XS Folding@Home team needs your help! Join us and help fight diseases with your CPU and GPU!!

