
Thread: SuperPi on GPU, we're going CUDA

  1. #76
    Xtreme Member
    Join Date
    Apr 2006
    Location
    Belgrade, Serbia
    Posts
    187


    Quote Originally Posted by BenchZowner View Post
    Guys, we're just grabbing an opportunity to run a pure number-crunching benchmark on the GPU.
    Fine, just remember that such a benchmark won't tell you how good your video card will be for running Crysis.

    In my opinion, benching just the number-crunching part of a GPU is just as insane as benching just the FPU in a CPU.

    Simply put, a CPU has other units that can improve its performance in other uses (for example SSE, as used in the DivX encoder), and a GPU has other units that can limit its performance in other uses (the number of ROPs, TMUs, etc., which determine actual game performance).

    What I am trying to say is that I am not sure we really need yet another benchmark with relative instead of absolute performance numbers.

    What we also don't need is the CUDA MP3 encoder (and Linux-only, mind you), yet NVIDIA still organized a contest around it.

    MP3 encoding is already ridiculously fast on a CPU; disk I/O is the bottleneck. It would be much better if they organized a contest for an x264 encoder on the GPU, or better yet wrote one themselves and open-sourced it.

  2. #77
    Xtreme Addict
    Join Date
    Dec 2007
    Posts
    1,030
    Quote Originally Posted by audiofreak View Post
    It would be much better if they organized a contest for an x264 encoder on the GPU, or better yet wrote one themselves and open-sourced it.
    Don't know about the open-source part, but an x264 encoder via CUDA seems to be in the works.

    http://www.youtube.com/watch?v=8C_Pj1Ep4nw
    Are we there yet?

  3. #78
    One-Eyed Killing Machine
    Join Date
    Sep 2006
    Location
    Inside a pot
    Posts
    6,340
    Quote Originally Posted by audiofreak View Post
    Fine, just remember that such a benchmark won't tell you how good your video card will be for running Crysis.
    That's not my purpose.
    If I wanted to test a graphics card's gaming performance, I know what to run and how to run it.

    Notice the word BENCHING.

    Quote Originally Posted by audiofreak View Post
    In my opinion, benching just the number-crunching part of a GPU is just as insane as benching just the FPU in a CPU.
    Once again, BENCHING.
    We're talking about programs that we (overclockers) use to measure the performance of our overclocked systems in specific applications and workloads.

    Quote Originally Posted by audiofreak View Post
    What I am trying to say is that I am not sure we really need yet another benchmark with relative instead of absolute performance numbers.
    I repeat: benching.
    We (overclockers) want more, and we like it.

    -- We need applications to take advantage of our GPUs for normal usage, but this is NOT the thread to talk about CUDA & "real-life" usage.
    Coding 24/7... Limited forums/PMs time.

    -Justice isn't blind, Justice is ashamed.

    Many thanks to: Sue Wu, Yiwen Lin, Steven Kuo, Crystal Chen, Vivian Lien, Joe Chan, Sascha Krohn, Joe James, Dan Snyder, Amy Deng, Jack Peterson, Hank Peng, Mafalda Cogliani, Olivia Lee, Marta Piccoli, Mike Clements, Alex Ruedinger, Oliver Baltuch, Korinna Dieck, Steffen Eisentein, Francois Piednoel, Tanja Markovic, Cyril Pelupessy (R.I.P. ), Juan J. Guerrero

  4. #79
    Xtreme Member
    Join Date
    Feb 2008
    Location
    Portugal
    Posts
    324
    How much of an impact can CUDA have on general performance?

    What is the biggest advantage? How will this influence the market and tech evolution in the near future?

    Cheers and thanks


    SILVERSTONE TJ07 . ASUS RAMPAGE EXTREME . INTEL C2D E8600@ Q822A435 . 6GB CELLSHOCK PC3 15000 . EVGA GTX 285 . WD VELOCIRAPTOR 300HLFS . WD AAKS 640GB ''RAID0 . CORSAIR HX 1000W . X-Fi FATAL1TY TITANIUM . LOGITECH WAVE . G9 LASER . Z5500 . DELL ULTRASHARP 2047WFP
    Aquaero VFD . Enzotech revA . Laing DDC 12v . Black Ice GTS-Lite 360 . Swiftech Mcres Micro . 3/8"
    By MrHydes®

    sales
    feedback Techzone

  5. #80
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Kilkenny, Ireland
    Posts
    259
    Well, you can see it in the tech demo posted above ^

    GPUs are animals at video encoding and the like

  6. #81
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    gpus are only good at workloads that can be parallelized (hundreds of parallel threads). if there is a sequential execution flow, gpus can't show their performance and will be slower than cpus

    quick example .. imagine a large excel sheet with a number of rows (the money you spent on drinking, partying and getting laid) that you want to sum up.

    one way is to go through the rows one by one and add each row to the previous result -> sequential, like it would run on any CPU today; this will take you N steps for N rows (actually N-1, but let's keep it simple).

    on a GPU you could parallelize this and launch a large number of threads that add up groups of two rows each first, like (1+2=a, 3+4=b, 5+6=c..), all of those additions are done at the same time in parallel on the gpu in a single step. once that is done you sum up a+b=a1, c+d=a2 etc... repeat until you have only two numbers left to add together and you get the final result. in total this will take you log2(N) steps. (e.g. for 256 rows -> log2(256) = 8 steps only)

    for a small number of rows there won't be much difference and the higher clock speed of the CPU will still outweigh small gains. but once you increase the number of rows you can clearly see what a huge difference this makes. (yes this is simplified, you do not have an infinite amount of execution units)

    however, note how much more complex the second example is. most programmers today have coded for their whole life like example 1. now they are supposed to switch to example 2...
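
    as a rough illustration only (a toy sketch, not code from any real benchmark), the pairwise idea in cuda could look something like this, assuming a single thread block and a power-of-two number of values:

    Code:
    #include <cstdio>
    #include <cuda_runtime.h>

    // pairwise tree sum: each step halves the number of active values,
    // so N values are summed in about log2(N) steps instead of N-1.
    // only valid for a single block and a power-of-two n (toy example).
    __global__ void sumPairs(float *data, int n)
    {
        int i = threadIdx.x;
        for (int stride = n / 2; stride > 0; stride /= 2) {
            if (i < stride)
                data[i] += data[i + stride];   // fold the upper half onto the lower half
            __syncthreads();                   // wait for the whole step before halving again
        }
    }

    int main()
    {
        const int N = 256;                     // 256 "rows" -> 8 reduction steps
        float host[N];
        for (int i = 0; i < N; ++i) host[i] = 1.0f;

        float *dev;
        cudaMalloc((void **)&dev, N * sizeof(float));
        cudaMemcpy(dev, host, N * sizeof(float), cudaMemcpyHostToDevice);

        sumPairs<<<1, N>>>(dev, N);            // one block of N threads

        float sum;
        cudaMemcpy(&sum, dev, sizeof(float), cudaMemcpyDeviceToHost);
        printf("sum = %.0f (expected %d)\n", sum, N);

        cudaFree(dev);
        return 0;
    }

    a real implementation would use shared memory and many blocks, but the step count is the point here.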
    Last edited by W1zzard; 05-27-2008 at 03:18 PM.

  7. #82
    Xtreme Addict
    Join Date
    Dec 2007
    Posts
    1,030
    Quote Originally Posted by W1zzard View Post
    however, note how much more complex the second example is. most programmers today have coded for their whole life like example 1. now they are seduced to switch to example 2...
    Fixed

    Loved the post, W1zzard, really enlightening.
    Last edited by Luka_Aveiro; 05-27-2008 at 03:40 PM.
    Are we there yet?

  8. #83
    Xtreme Member
    Join Date
    May 2007
    Posts
    341
    The benchmark side of this is not something that really impresses me (just another benchmark). However, the potential calculation speed difference is something that does impress me. Can CUDA be used to factor numbers, with the promise of faster performance than on a CPU?

  9. #84
    Xtreme Owner Charles Wirth
    Join Date
    Jun 2002
    Location
    Las Vegas
    Posts
    11,653
    Michael, though I am not up to speed on the tweaks needed to compile it correctly, there are two people working on getting me the assistance to get it done.

    As to the name, SuperPi 1.6 CUDA GPU

    Is there a way to make a universal binary for both manufacturers? Larrabee should be a GPGPU as well.
    Intel 9990XE @ 5.1Ghz
    ASUS Rampage VI Extreme Omega
    GTX 2080 ti Galax Hall of Fame
    64GB Galax Hall of Fame
    Intel Optane
    Platimax 1245W

    Intel 3175X
    Asus Dominus Extreme
    GTX 1080ti Galax Hall of Fame
    96GB Patriot Steel
    Intel Optane 900P RAID

  10. #85
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by FUGGER View Post
    there are two people working on getting me the assistance to get it done.
    if those are somehow affiliated with nvidia, maybe call it a "tech demo"?

  11. #86
    Moderator
    Join Date
    Mar 2006
    Posts
    8,556
    If anyone is more interested in the general use of GPUs, you can follow the feeds here: http://www.gpgpu.org/

  12. #87
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by W1zzard View Post
    yep, even opengl gpgpu is quite easy to do. but ctm/cuda, especially ctm give you much more options to improve performance and flexibility
    hmmm how much of a boost, and at what expense though? coding for ctm and cuda is a lot more complex than coding for gpgpu via directx or ogl, right?

    Quote Originally Posted by W1zzard View Post
    imagine a large excel sheet with a number of rows (the money you spent on drinking, partying and getting laid) that you want to sum up.
    you keep a record of all that? heheheh

    thanks for the example, very interesting
    so basically anything that is coded to run on a server cluster should work well on a gpu, right? so every application that has to do with audio/video processing, filtering, compression etc. should work well on gpus then, right? i'm curious when we will see a gpu divx codec

  13. #88
    Tyler Durden
    Join Date
    Oct 2003
    Location
    Massachusetts, USA
    Posts
    5,623
    Quote Originally Posted by saaya View Post
    you keep a record of all that? heheheh
    Don't we all?
    Formerly XIP, now just P.

  14. #89
    Xtreme n00berclocker
    Join Date
    Mar 2006
    Location
    San Jose, CA
    Posts
    1,445
    Quote Originally Posted by EnJoY View Post
    Don't we all?
    It's called previous orders from Newegg, lol. Too bad that's only a quarter of all the hardware I've spent money on.
    Quote Originally Posted by 3oh6
    damn you guys...am i in a three way and didn't know it again
    Quote Originally Posted by Brian y.
    Im exclusively benching ECS from this point forward

  15. #90
    Xtreme Owner Charles Wirth
    Join Date
    Jun 2002
    Location
    Las Vegas
    Posts
    11,653
    My guys (the ones doing the work) are not with Nvidia, but I do have developer assistance from Nvidia.

    Sascha, one could assume so, but the gains vary; for the examples given they usually range from 8x to beyond 100x.
    Intel 9990XE @ 5.1Ghz
    ASUS Rampage VI Extreme Omega
    GTX 2080 ti Galax Hall of Fame
    64GB Galax Hall of Fame
    Intel Optane
    Platimax 1245W

    Intel 3175X
    Asus Dominus Extreme
    GTX 1080ti Galax Hall of Fame
    96GB Patriot Steel
    Intel Optane 900P RAID

  16. #91
    Xtreme Member
    Join Date
    Aug 2006
    Location
    Warsaw, Poland
    Posts
    148
    Quote Originally Posted by saaya View Post
    so basically anything that is coded to run on a server cluster should work well on a gpu, right?
    The first requirement is - of course - multithreading; and yes, MULTI, not the 2 or 4 threads we are happy with when playing with CPUs.
    The second requirement is to recode the program to run on a GPU without losing that ~30x "possible" performance boost in the process.
    I'm far from being accurate or anything; that's just a simple explanation as I see it.

  17. #92
    Xtreme Member
    Join Date
    Apr 2007
    Posts
    386
    Quote Originally Posted by audiofreak View Post

    What we also don't need is the CUDA MP3 encoder (and Linux-only, mind you), yet NVIDIA still organized a contest around it.

    MP3 encoding is already ridiculously fast on a CPU; disk I/O is the bottleneck. It would be much better if they organized a contest for an x264 encoder on the GPU, or better yet wrote one themselves and open-sourced it.

    I can't understand that statement. How can you say we don't need this?

    I would love a program I could throw all my MP3s at to have them recoded and normalised, especially if I didn't have to dedicate a PC for a day to do it.

    Anything they can do to improve PC usage is great (there is already a CUDA-based H.264 encoder).

    This is the most exciting thing to come onto Xtreme news in a long time. Have fun, Fugger (and yes, open source would be great!!)
    Gaming Box:: q6600 @3.0 :: 9800gtx :: Abit IP35 :: 4gb :: 1.4TB :: akasa eclipse :: Win7
    Development:: PhenomII 955BE @3.2 :: 4200 :: asus M4A785 M Evo :: 1.25TB ::Win7
    Media Centre :: q6600 @3.0 :: x1950pro :: asus p35 epu :: 8gb :: 320 GB :: Lc17B :: Win7
    server:: I7 860 :: p55 gd65 :: 3450 :: 8 TB :: 8gb :: Rebel 12 :: server 2008 R2

  18. #93
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    well, he's right, we don't need this, but it's still interesting.
    if it turns out to be a pointless benchmark that doesn't really scale realistically, then it will most likely be forgotten pretty soon...
    but not necessarily... it might be fun; things don't need to make sense to be fun

  19. #94
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    The problem I have found is that the algorithm SuperPi uses (Gauss-Legendre) cannot be multithreaded very well.
    The way it works, each "iteration" produces more and more digits, so each result depends on the previous result.

    There is probably an algorithm out there that is very good for multithreading.

    I think something like wPrime being ported to GPGPU code would be good, because the workload can be distributed to many threads.
    If you want to calculate 100 prime numbers, have each thread calculate 1 prime number. (assuming there are 100 threads)
    If you want to do 1000, have each thread calculate 10 prime numbers. (assuming there are 100 threads)
    The workload can be distributed.
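
    As a rough sketch only (purely illustrative, not wPrime's actual code or method), distributing the work across GPU threads in CUDA could look something like this; each thread checks one candidate on its own, so no result has to wait for a previous one:

    Code:
    #include <cstdio>
    #include <cuda_runtime.h>

    // trial-division check; fine for an illustration, not for record hunting
    __device__ bool isPrime(unsigned int n)
    {
        if (n < 2) return false;
        for (unsigned int d = 2; d * d <= n; ++d)
            if (n % d == 0) return false;
        return true;
    }

    // thread i tests candidate start + i; no thread depends on another thread's result
    __global__ void checkPrimes(unsigned int start, int count, int *flags)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < count)
            flags[i] = isPrime(start + i) ? 1 : 0;
    }

    int main()
    {
        const int COUNT = 100000;              // candidates 2 .. 100001, one per thread
        int *devFlags;
        cudaMalloc((void **)&devFlags, COUNT * sizeof(int));

        int threads = 256;
        int blocks  = (COUNT + threads - 1) / threads;
        checkPrimes<<<blocks, threads>>>(2, COUNT, devFlags);

        // copy the flags back and tally on the CPU; the GPU part was embarrassingly parallel
        int *flags = new int[COUNT];
        cudaMemcpy(flags, devFlags, COUNT * sizeof(int), cudaMemcpyDeviceToHost);
        int primes = 0;
        for (int i = 0; i < COUNT; ++i) primes += flags[i];
        printf("%d primes between 2 and %d\n", primes, COUNT + 1);

        delete[] flags;
        cudaFree(devFlags);
        return 0;
    }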

    This might just be a tech demo for nVIDIA.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  20. #95
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Toon
    Posts
    1,570
    Nice one. Shame I haven't got my G92 any more. Any chance that this time round it can have a continuous mode for stress testing?
    Intel i7 920 C0 @ 3.67GHz
    ASUS 6T Deluxe
    Powercolor 7970 @ 1050/1475
    12GB GSkill Ripjaws
    Antec 850W TruePower Quattro
    50" Full HD PDP
    Red Cosmos 1000

  21. #96
    Xtreme Addict
    Join Date
    Aug 2005
    Location
    Germany
    Posts
    2,247
    Quote Originally Posted by GoThr3k View Post
    [...]

    And ATI has something like CUDA, called CTM (Close To Metal); too bad you have to program in assembler with CTM, while in CUDA you can program in C & C++.
    but then, ati did something wrong, as i never heard anything about CTM. i know that there's a folding@home client for ati gpus, but ati never caused a sensation with this.
    and now nvidia teases customers with marketing regarding their CUDA environment.

    seems like ati somehow missed the train to advertise their feature properly?
    1. Asus P5Q-E / Intel Core 2 Quad Q9550 @~3612 MHz (8,5x425) / 2x2GB OCZ Platinum XTC (PC2-8000U, CL5) / EVGA GeForce GTX 570 / Crucial M4 128GB, WD Caviar Blue 640GB, WD Caviar SE16 320GB, WD Caviar SE 160GB / be quiet! Dark Power Pro P7 550W / Thermaltake Tsunami VA3000BWA / LG L227WT / Teufel Concept E Magnum 5.1 // SysProfile


    2. Asus A8N-SLI / AMD Athlon 64 4000+ @~2640 MHz (12x220) / 1024 MB Corsair CMX TwinX 3200C2, 2.5-3-3-6 1T / Club3D GeForce 7800GT @463/1120 MHz / Crucial M4 64GB, Hitachi Deskstar 40GB / be quiet! Blackline P5 470W

  22. #97
    Xtreme Enthusiast
    Join Date
    Jun 2005
    Posts
    525
    Will there be a way to test an individual core on a GPU?

    On a side note, it would be nice if there was a common compiler for all GPUs (ATI's, NVIDIA's and Intel's up-and-coming one), but that would take these guys working together...

  23. #98
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Toon
    Posts
    1,570
    If there is a point to this, it is to let the GPU do the maths that the GPU does best (massively parallel work: DSP, video, audio, CAD, etc.). Offload whatever can be offloaded to the GPU while letting the CPU do what it does best.
    Intel i7 920 C0 @ 3.67GHz
    ASUS 6T Deluxe
    Powercolor 7970 @ 1050/1475
    12GB GSkill Ripjaws
    Antec 850W TruePower Quattro
    50" Full HD PDP
    Red Cosmos 1000

  24. #99
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by RaZz! View Post
    but then, ati did something wrong, as i never heard anything about CTM. i know that there's a folding@home client for ati gpus, but ati never caused a sensation with this.
    and now nvidia teases customers with marketing regarding their CUDA environment.

    seems like ati somehow missed the train to advertise their feature properly?
    This is true.
    I already got an e-mail from eVGA saying that you should join their Folding@Home team.
    I base this on absolutely nothing, but didn't ATI "overly" advertise R600?
    Last edited by MuffinFlavored; 05-31-2008 at 06:02 PM.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  25. #100
    I am Xtreme
    Join Date
    Feb 2005
    Location
    SiliCORN Valley
    Posts
    5,543
    I just got an e-mail already from eVGA, Folding@Home on nVIDIA.
    and what did that email say?
    "These are the rules. Everybody fights, nobody quits. If you don't do your job I'll kill you myself.
    Welcome to the Roughnecks"

    "Anytime you think I'm being too rough, anytime you think I'm being too tough, anytime you miss-your-mommy, QUIT!
    You sign your 1248, you get your gear, and you take a stroll down washout lane. Do you get me?"

    Heat Ebay Feedback

