Page 2 of 7
Results 26 to 50 of 173

Thread: ratGPU OpenCL raytracing benchmark

  1. #26
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by p2501 View Post
    Is this reasonable for a HD5850 @ 900/1253?!
    Yep, 773s sounds good for that ATI.

    Although we optimized for ATI cards, they perform a bit slower than NVIDIA cards with the current implementation, and I'm not completely sure why.
    Looking at the ATI Stream profiler, I see a lot of SIMD registers being flushed for secondary rays (which are very incoherent). That, combined with the lack of DMA support in the ATI OpenCL SDK v2.2 and my inability to run the SKA profiling tool (my kernels just crash it), probably makes them a bit slower than NVIDIA cards.

    Has anybody tested a 68XX, pls? I've heard they use 4D VLIW instead of 5D VLIW, so fewer registers would be flushed and performance could be a bit better.

    Btw, I think for the next version we're gonna lower the benchmark's default settings a bit so the test completes slightly faster (and we'll also try to implement several optimizations we have in mind).
    Last edited by jogshy; 10-22-2010 at 04:57 PM.

  2. #27
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Quote Originally Posted by jogshy View Post
    Ooh nice! So a dual socket will take about 600s, yep. Funny to see 2x 1000$ CPUs + the SR-2 MB catching a 160$ card

    By the way, did you use that to make WinXP pass test #4, pls?
    For Vista/7 it's easy and I do it in the .exe installer, but a graphics driver engineer told me it was virtually impossible to disable the watchdog on WinXP ...
    Yeah, I used those registry changes to pass the #4 test. I set the GPU watchdog timeout to 200.
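
For readers wondering what the Vista/7 side of this looks like: the sketch below builds the text of a .reg file that raises the GPU watchdog (TDR) delay through the `TdrDelay` value under the `GraphicsDrivers` key. It is only an illustration. The thread does not show what the ratGPU installer actually writes, nor which WinXP registry values were changed above, and the helper name `tdr_reg_fragment` is made up for this example. The value 200 simply echoes the number mentioned in the post; on Vista/7 `TdrDelay` is measured in seconds.

```python
# Sketch only: generates .reg text for the Vista/7 TDR delay.
# Assumption: this is NOT taken from the ratGPU installer, whose exact
# registry changes are not shown in the thread.

TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_fragment(delay_seconds: int) -> str:
    """Return .reg file text that sets TdrDelay (a REG_DWORD, in seconds)."""
    if not 0 <= delay_seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit in an unsigned 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{TDR_KEY}]\n"
        f'"TdrDelay"=dword:{delay_seconds:08x}\n'
    )


# Example: print a fragment that raises the delay to 200 seconds.
print(tdr_reg_fragment(200))
```

A change like this normally needs a reboot before the driver picks it up.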

    All along the watchtower the watchmen watch the eternal return.

  3. #28
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by STEvil View Post
    Yeah, I used those registry changes to pass the #4 test. I set the GPU watchdog timeout to 200.
    Nice, I'll tweak the installer for the next version based on that. Thanks !

  4. #29
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    I've posted version 0.4.4 with several optimizations and some minor bug fixes.
    Now it takes around 88s for a GTX460 instead of 600s.

  5. #30
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    http://www.ratgpu.com/verify.aspx?k=...C4360D167520C5

    43.547 on the GTX480 @ 800/1600/2000 WinXP-64 with 0.4.4

  6. #31
    Registered User
    Join Date
    May 2006
    Posts
    67
    4.4@875/2125
    Attached Images

  7. #32
    Registered User
    Join Date
    May 2006
    Posts
    67
    4.4@925/2200
    Attached Images

  8. #33
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    Wow this is faster now, lol.

    EVGA z68 FTW
    i7 2600k @ 4.8
    8gb DDR3 1600
    3x GTX 580 3gb HydroCopper2
    Silverstone Strider 1500W
    Areca 1880i w/ 6x intel x25m
    On water

  9. #34
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Nice results !

    By the way, be careful with Catalyst 10.10 if you use an ATI card. It seems to be bugged. You need to use the older Catalyst 10.9 with the OpenCL SDK v2.2.

    I'm currently investigating other optimizations and trying to port it to OpenSUSE 11.3 and Fedora.
    Last edited by jogshy; 10-25-2010 at 03:04 PM.

  10. #35
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by jogshy View Post
    Yep, 773s sounds good for that ATI.

    Although we optimized for ATI cards, they perform a bit slower than NVIDIA cards with the current implementation, and I'm not completely sure why.
    Looking at the ATI Stream profiler, I see a lot of SIMD registers being flushed for secondary rays (which are very incoherent). That, combined with the lack of DMA support in the ATI OpenCL SDK v2.2 and my inability to run the SKA profiling tool (my kernels just crash it), probably makes them a bit slower than NVIDIA cards.
    a few questions:

    it looks like you are using path tracing, judging by the noise and GI quality (ambient occlusion seems very accurate), but i see familiar pixel aliasing, which leads me to believe this is a forwards ray tracer. which is it?

    have you tried to improve coherency through sorting? you really need to do that with GPUs because it works well with data parallelism, whereas a mostly incoherent ray tracer works well with task parallelism.

    if you are using forwards ray tracing, are you using ray casting + primary rays or rasterizing like a hybrid engine would?

  11. #36
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by Chumbucket843 View Post
    it looks like you are using path tracing, judging by the noise and GI quality (ambient occlusion seems very accurate), but i see familiar pixel aliasing, which leads me to believe this is a forwards ray tracer. which is it?
    It's a path tracer but I reuse the primary rays' data to optimise a bit. I still need to implement the AA, yep.

    have you tried to improve coherency through sorting?
    Yep, I actually group the rays but some secondary rays are still very incoherent so the packets are flushed almost empty.
    Last edited by jogshy; 10-25-2010 at 07:55 PM.
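
As a rough illustration of the ray grouping discussed in this exchange, the sketch below bins rays by the sign octant of their direction and compares how many different octants end up in each packet before and after sorting. This is not ratGPU's actual scheme, which the thread does not describe. The packet width of 8, the octant key, and all function names are assumptions chosen for the example.

```python
import random

PACKET_SIZE = 8  # hypothetical packet width, not a ratGPU value


def octant(direction):
    """3-bit key built from the signs of the x, y and z components."""
    x, y, z = direction
    return int(x < 0) | (int(y < 0) << 1) | (int(z < 0) << 2)


def packets(rays, size=PACKET_SIZE):
    """Split the ray list into consecutive packets of `size` rays."""
    return [rays[i:i + size] for i in range(0, len(rays), size)]


def mean_octants_per_packet(rays):
    """Average number of distinct direction octants per packet (1.0 is ideal)."""
    groups = packets(rays)
    return sum(len({octant(d) for d in g}) for g in groups) / len(groups)


# Incoherent "secondary rays": uniformly random, unnormalised directions.
# Only the component signs matter for the octant key.
rng = random.Random(0)
rays = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(1024)]

unsorted_mean = mean_octants_per_packet(rays)
sorted_mean = mean_octants_per_packet(sorted(rays, key=octant))

print("distinct octants per packet, unsorted:", unsorted_mean)
print("distinct octants per packet, sorted:  ", sorted_mean)
```

Sorting by direction alone says nothing about where the rays start, so packets that look coherent by this measure can still touch very different parts of the scene, which fits the remark that some packets stay almost empty.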

  12. #37
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by jogshy View Post
    It's a path tracer but I reuse the primary rays' data to optimise a bit. I still need to implement the AA, yep.
    i don't have much experience with path tracing, but for the AA you could do something interesting such as using a gaussian filter or perhaps MLAA.
    Yep, I actually group the rays but some secondary rays are still very incoherent so the packets are flushed almost empty.
    damn. i have never optimized a ray tracer that much, mostly just simple things. i am sure you can find some good papers on improving coherency with GPUs, although i don't know if they would be relevant to path tracing.
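
To make the gaussian filter suggestion concrete, here is a minimal sketch of filtering one pixel from several jittered samples, each weighted by a Gaussian of its distance from the pixel centre. It assumes the renderer can give sample offsets in pixel units. The sigma of 0.5 and the function names are illustrative choices, not anything taken from ratGPU.

```python
import math


def gaussian_weight(dx, dy, sigma=0.5):
    """Unnormalised Gaussian weight for a sample offset (in pixel units)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def filter_pixel(samples, sigma=0.5):
    """Weighted average of (dx, dy, value) samples around one pixel centre."""
    if not samples:
        raise ValueError("need at least one sample")
    total_weight = 0.0
    total_value = 0.0
    for dx, dy, value in samples:
        w = gaussian_weight(dx, dy, sigma)
        total_weight += w
        total_value += w * value
    return total_value / total_weight


# Example: a bright sample at the pixel centre and a dark one a pixel away.
# The centre sample dominates the filtered result.
print(filter_pixel([(0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]))
```

Because the weights are normalised per pixel, a region of constant colour stays unchanged, and only edges get smoothed.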

  13. #38
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    I've uploaded the OpenSUSE 11.3 port. Here's a screenshot:

    Fedora and FreeBSD will be my next victims
    I would also like to target Mac OS X Leopard but I lack a Mac
    Last edited by jogshy; 10-26-2010 at 10:08 PM.

  14. #39
    Xtreme Enthusiast
    Join Date
    Mar 2007
    Posts
    761
    Hello sir,

    This is a very nice app. Are you planning a Blender exporter? What about real-time scene navigation like in Octane? CPU+GPU rendering like Lux? And finally, will it remain free or are you planning to charge eventually?

  15. #40
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by karbonkid View Post
    Are you planning a Blender exporter?
    Currently I'm working only on the Maya and XSI exporters but, after that, it's possible, yep.

    What about real-time scene navigation like in Octane?
    Well, I have to do a lot of things before that, but maybe it's possible, yep.

    CPU+GPU rendering like Lux?
    You can actually enable hybrid rendering. However, the CPU is much slower than the GPU, so "the chain is only as strong as its weakest link", at least for small images. For large images you can get a benefit, but for small ones you'll get worse results because the last thread usually has to wait for the CPU to complete the image.

    And finally, will it remain free or are you planning to charge eventually?
    Well, I currently do this as a hobby.
    Last edited by jogshy; 10-27-2010 at 11:09 AM.
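
The hybrid rendering remark above can be illustrated with a toy scheduling model: tiles are handed out from a shared queue to whichever device is free, and the render finishes when the last tile is done. The per-tile costs (GPU 1s, CPU 20s) and the tile counts are made-up numbers, and ratGPU's real work distribution is not described in the thread.

```python
import heapq


def makespan(num_tiles, tile_costs):
    """Finish time when tiles go to whichever worker becomes free first.

    tile_costs holds the seconds each worker needs per tile.
    """
    # Heap entries: (time the worker becomes free, its per-tile cost).
    workers = [(0.0, cost) for cost in tile_costs]
    heapq.heapify(workers)
    finish = 0.0
    for _ in range(num_tiles):
        free_at, cost = heapq.heappop(workers)
        done = free_at + cost
        finish = max(finish, done)
        heapq.heappush(workers, (done, cost))
    return finish


GPU_COST = 1    # hypothetical seconds per tile on the GPU
CPU_COST = 20   # hypothetical seconds per tile on the CPU

small_gpu_only = makespan(4, [GPU_COST])
small_hybrid = makespan(4, [GPU_COST, CPU_COST])
large_gpu_only = makespan(400, [GPU_COST])
large_hybrid = makespan(400, [GPU_COST, CPU_COST])

print("small image:", small_gpu_only, "s GPU only vs", small_hybrid, "s hybrid")
print("large image:", large_gpu_only, "s GPU only vs", large_hybrid, "s hybrid")
```

In this model the small image gets slower in hybrid mode because the GPU runs out of tiles while the CPU is still busy with its single tile, whereas the large image gains a little because the CPU takes some tiles off the GPU's plate.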

  16. #41
    Xtreme Member
    Join Date
    Jan 2004
    Posts
    393
    very interesting software

    here is the result of the benchmark on a 9600gt under windows xp

  17. #42
    Xtreme Enthusiast
    Join Date
    Mar 2007
    Posts
    761
    Quote Originally Posted by jogshy View Post
    Well, I currently do this as a hobby.
    Thanks for the answers sir.

    Have you considered open-sourcing it?

  18. #43
    Xtreme Guru
    Join Date
    Dec 2002
    Posts
    4,046
    Quote Originally Posted by Arctucas View Post
    4.4@925/2200
    900/2200

  19. #44
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by NapalmV5 View Post
    900/2200
    Btw, FYI, I think the graphics drivers reserve some MHz/VRAM MB for the Windows GUI, which would explain the discrepancy between GPU-Z and the values ratGPU reports.

  20. #45
    Registered User
    Join Date
    May 2006
    Posts
    67
    Quote Originally Posted by NapalmV5 View Post
    900/2200
    900/2200
    Attached Images

  21. #46
    Registered User
    Join Date
    May 2006
    Posts
    67
    930/2200
    Attached Images

  22. #47
    Registered User
    Join Date
    May 2006
    Posts
    67
    950/2200
    Attached Images

  23. #48
    Xtreme Member
    Join Date
    Aug 2007
    Posts
    282
    Quote Originally Posted by Arctucas View Post
    950/2200
    Wow! Only 50MHz short of 1GHz!

    Btw, I haven't tested it, but there's a trick you could use to get a better score in SLI/Crossfire:

    1. Use another card as the primary adapter (plug the monitor into it... for instance, a GT240). Uncheck "Use this device" for it so it isn't used for the benchmark. This card will be dedicated exclusively to painting the Windows GUI.

    2. Check "Use this device" for the two GTX460s only.

    That way the GTX460s won't spend time painting the Windows GUI; they will be 100% dedicated to the benchmark... so the score should be slightly better.

  24. #49
    sleepin is overrated
    Join Date
    Feb 2005
    Location
    Ireland
    Posts
    1,308
    62.154 sec with stock 470 gtx

  25. #50
    Xtreme Member
    Join Date
    Sep 2007
    Location
    Alberta, Canada
    Posts
    360
    I ran it on the rig I just put together with 3-way SLI GTX 285s. 39 seconds.

    Trying to find where I saved the screen :P
