
Thread: CUDA Toolkit and SDK 2.3 released

  1. #1
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977

    CUDA Toolkit and SDK 2.3 released

    http://forums.nvidia.com/index.php?showtopic=102548

    IMPORTANT NOTE

    Because of the new support for cross-compilation, the library locations on Linux have changed. 32-bit libraries are located by default at /usr/local/cuda/lib and 64-bit libraries are located by default at /usr/local/cuda/lib64. This is a change from 2.2 and will necessitate changing /etc/ld.so.conf or LD_LIBRARY_PATH.

    The CUDA Toolkit and SDK v2.3 are now released and available to all developers.

    A brief overview of the features (there are a lot):

    * The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well. See the CUDA Toolkit release notes for details.
    * The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros. (see below)
    * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.
    * The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. Please note that the installation location of the libraries has changed, so developers on 64-bit Linux must update their LD_LIBRARY_PATH to contain either /usr/local/cuda/lib or /usr/local/cuda/lib64.
    * New support for fp16 <-> fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32. Use of fp16 format is ideal for applications that require higher numerical range than 16-bit integer but less precision than fp32 and reduces memory space and bandwidth consumption.
    * The CUDA SDK has been updated to include:
    o A new pitchLinearTexture code sample that shows how to efficiently texture from pitch linear memory.
    o A new PTXJIT code sample illustrating how to use cuModuleLoadDataEx() to load PTX source from memory instead of loading a file.
    o Two new code samples for Windows, showing how to use the NVCUVID library to decode MPEG-2, VC-1, and H.264 content and pass frames to OpenGL or Direct3D for display.
    o Updated code samples showing how to properly align CUDA kernel function parameters so the same code works on both x32 and x64 systems.
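    The parameter-alignment samples mentioned above boil down to one idea: when building a kernel parameter buffer by hand (driver API style), round each parameter's offset up to its natural alignment so the layout is identical on 32-bit and 64-bit hosts. A host-only sketch of that rounding macro, following the convention used in the SDK samples:

    ```c
    #include <stdio.h>
    #include <stddef.h>

    /* Round `offset` up to the next multiple of `alignment` (a power of two).
       This is the trick the driver-API samples use before copying each kernel
       parameter into the parameter buffer. Sketch for illustration only. */
    #define ALIGN_UP(offset, alignment) \
        (offset) = ((offset) + (alignment) - 1) & ~((alignment) - 1)

    int main(void)
    {
        size_t offset = 0;

        /* first parameter: a 4-byte int at offset 0 */
        ALIGN_UP(offset, 4);
        printf("int   at offset %zu\n", offset);
        offset += sizeof(int);              /* offset is now 4 */

        /* second parameter: an 8-byte device pointer; without ALIGN_UP it
           would land at offset 4 on a 32-bit host and misalign the load */
        ALIGN_UP(offset, 8);
        printf("ptr   at offset %zu\n", offset);   /* 8, not 4 */
        offset += 8;

        /* third parameter: a 4-byte float */
        ALIGN_UP(offset, 4);
        printf("float at offset %zu\n", offset);   /* 16 */
        offset += sizeof(float);

        return 0;
    }
    ```

    Because every offset is rounded the same way regardless of host pointer size, the same source builds correctly as both x32 and x64.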

    * The Visual Profiler includes several enhancements:
    o All memory transfer API calls are now reported
    o Support for profiling multiple contexts per GPU
    o Synchronized clocks for requested start time on the CPU and start/end times on the GPU for all kernel launches and memory transfers
    o Global memory load and store efficiency metrics for GPUs with compute capability 1.2 and higher

    * The CUDA Driver for MacOS is now packaged separately from the CUDA Toolkit.
    * Support for major Linux distros, MacOS X, and Windows:
    o MacOS X 10.5.6 and later (32-bit)
    o Windows XP/Vista/7 with Visual Studio 8 (VC2005 SP1) and 9 (VC2008)
    o Fedora 10, RHEL 4.7 & 5.3, SLED 10.2 & 11.0, OpenSUSE 11.1, and Ubuntu 8.10 & 9.04
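    As a CPU-side illustration of the fp16 <-> fp32 storage idea mentioned above: store values as 16-bit halves (1 sign, 5 exponent, 10 mantissa bits) and expand them to fp32 for computation. This is a simplified sketch, not NVIDIA's implementation; denormal halves are flushed to zero and NaN payloads are not preserved.

    ```c
    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* fp32 -> fp16 with round-to-nearest-even (simplified: no denormals). */
    static uint16_t float_to_half(float f)
    {
        uint32_t x;
        memcpy(&x, &f, sizeof x);                 /* raw fp32 bits */

        uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
        int32_t  exp  = (int32_t)((x >> 23) & 0xFFu) - 127 + 15;  /* rebias */
        uint32_t mant = x & 0x7FFFFFu;

        if (exp >= 31)                            /* overflow, inf, or NaN */
            return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));
        if (exp <= 0)                             /* underflow: flush to zero */
            return sign;

        /* narrow the mantissa from 23 to 10 bits, round-to-nearest-even */
        uint32_t half = (uint32_t)(exp << 10) | (mant >> 13);
        uint32_t rem  = mant & 0x1FFFu;
        if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
            half++;                               /* carry into exponent is OK */
        return (uint16_t)(sign | half);
    }

    /* fp16 -> fp32 (exact for normals; denormals were flushed above). */
    static float half_to_float(uint16_t h)
    {
        uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
        int32_t  exp  = (h >> 10) & 0x1F;
        uint32_t mant = h & 0x3FFu;
        uint32_t x;
        float f;

        if (exp == 0)
            x = sign;                             /* zero */
        else if (exp == 31)
            x = sign | 0x7F800000u | (mant << 13);/* inf/NaN */
        else
            x = sign | (uint32_t)((exp - 15 + 127) << 23) | (mant << 13);
        memcpy(&f, &x, sizeof f);
        return f;
    }

    int main(void)
    {
        float vals[] = { 1.0f, -2.5f, 0.333333f, 65504.0f };
        for (int i = 0; i < 4; i++) {
            uint16_t h = float_to_half(vals[i]);
            printf("%f -> 0x%04X -> %f\n", vals[i], h, half_to_float(h));
        }
        return 0;
    }
    ```

    The round trip through fp16 halves memory footprint and bandwidth at the cost of precision (10 mantissa bits, max finite value 65504), which is the trade-off the intrinsics are designed for.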

    Notes for MacOS developers

    * The cudadriver_2.3.1_macos.pkg driver is for use with Quadro FX 4800 and GeForce GTX 285.
    * The cudadriver_2.3.1_macos.pkg driver may also be used with any NVIDIA GPU on Snow Leopard.
    * Use the cudadriver_2.3.0_macos.pkg driver for MacOS X 10.5.6 and later (pre-Snow Leopard) with all other GPUs.


    More progress.

    Question about this feature: * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.

    Does that mean PhysX or Folding will support SLI now, or just that GPUs in SLI can be individually accessed by CUDA applications?
    Last edited by Talonman; 07-22-2009 at 05:14 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  2. #2
    Xtreme Enthusiast
    Join Date
    Nov 2006
    Location
    SoCal
    Posts
    632
    Can someone explain CUDA to me?? wtf is it?

    Thanks, lol
    himynameisfrank
    Flickr

    i7 2600K @ 4.4Ghz
    Corsair Hydro H80
    MSI Z68A-GD80
    EVGA GTX580 x 2 SLi
    Corsair XMS3 1600 12Gb
    Corsair HX850w
    OCZ Agility 3 SSD x 3 RAID 0
    Creative X-Fi Titanium / Astro A40 Audio System
    Corsair Graphite 600T SE White
    DELL U2410 24" IPS Panel

  3. #3
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    I would be happy to.

    http://en.wikipedia.org/wiki/CUDA

    "CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. CUDA is the computing engine in NVIDIA graphics processing units or GPUs that is accessible to software developers through industry standard programming languages. Programmers use 'C for CUDA' (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler,[1] to code algorithms for execution on the GPU. CUDA architecture supports a range of computational interfaces including OpenCL[2] and DirectX Compute[3]. Third party wrappers are also available for Python, Fortran, Java and Matlab.

    The latest drivers all contain the necessary CUDA components. CUDA works with all NVIDIA GPUs from the G8X series onwards, including GeForce, Quadro and the Tesla line. NVIDIA states that programs developed for the GeForce 8 series will also work without modification on all future Nvidia video cards, due to binary compatibility. CUDA gives developers access to the native instruction set and memory of the parallel computational elements in CUDA GPUs. Using CUDA, the latest NVIDIA GPUs effectively become open architectures like CPUs. Unlike CPUs however, GPUs have a parallel "many-core" architecture, each core capable of running thousands of threads simultaneously - if an application is suited to this kind of an architecture, the GPU can offer large performance benefits.

    In the computer gaming industry, in addition to graphics rendering, graphics cards are used in game physics calculations (physical effects like debris, smoke, fire, fluids), examples include PhysX and Bullet. CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more.[4][5][6][7] An example of this is the BOINC distributed computing client.[8]

    CUDA provides both a low level API and a higher level API. The initial CUDA SDK was made public on 15 February 2007, for Microsoft Windows and Linux. Mac OS X support was later added in version 2.0[9], which supersedes the beta released February 14, 2008.[10]"



    In my own words, the low-level API could be considered the OS of the GPU.
    The high-level CUDA API (the "C for CUDA" language) is based on C/C++, but adds CUDA-specific extensions that allow work to be processed on your GPU instead of your CPU.
    This includes both single- and double-precision calculations.
    Last edited by Talonman; 07-22-2009 at 05:15 PM.

  4. #4
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Toronto ON
    Posts
    566
    Here we go again.
    Core i7-4930K LGA 2011 Six-Core - Cooler Master Seidon 120XL Push-Pull Liquid Water
    ASUS Rampage IV Black Edition LGA2011 - G.SKILL Trident X Series 32GB (4 x 8GB) DDR3 1866
    Sapphire R9 290X 4GB TRI-X OC in CrossFire - ATI TV Wonder 650 PCIe
    Intel X25-M 160GB G2 SSD - WD Black 2TB 7200 RPM 64MB Cache SATA 6
    Corsair HX1000W PSU - Pioneer Blu-ray Burner 6X BD-R
    Westinghouse LVM-37w3, 37-inch 1080p - Windows 7 64-bit Pro
    Sennheiser RS 180 - Cooler Master Cosmos S Case

  5. #5
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Flamebait.

    No thanks.

  6. #6
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    Does that mean PhysX or Folding will support SLI now, or just that GPUs in SLI can be individually accessed by CUDA applications?
    The latter. Before this update, if you had SLI enabled, CUDA would not see all the GPUs. CUDA apps like F@H will now be able to utilize all GPUs even if SLI is enabled. But it's not "SLI" like it is for games; the host still controls each GPU independently.

  7. #7
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Thank you sir!

    As long as CUDA has full access to my 295 running in SLI mode, and my 280 for PhysX, I am happy.

  8. #8
    Registered User
    Join Date
    Jan 2007
    Posts
    94
    Quote Originally Posted by trinibwoy View Post
    The latter. Before this update, if you had SLI enabled, CUDA would not see all the GPUs. CUDA apps like F@H will now be able to utilize all GPUs even if SLI is enabled. But it's not "SLI" like it is for games; the host still controls each GPU independently.

    To add to this, most of the CUDA apps I've tried would flat-out fail when SLI was enabled; the above case of seeing just one GPU was the best-case outcome.

    Basically, this change allows CUDA to "ignore" SLI.
    i7 920 @ 4.2Ghz
    Asus P6T6 Revolution
    3x GTX260
    6x2GB Corsair DDR3-1600
    G.Skill Falcon 128GB SSD
    G.SKill Titan 128GB SSD
    Seagate 7200.11 1.5TB
    Vista 64 Ultimate

  9. #9
    Xtreme Member
    Join Date
    Apr 2006
    Posts
    393
    CUDA just needs to die.

  10. #10
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Quote Originally Posted by Clairvoyant129 View Post
    CUDA just needs to die.
    thread crappers need to be banned

    All along the watchtower the watchmen watch the eternal return.

  11. #11
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by Talonman View Post
    Thank you sir!

    As long as CUDA has full access to my 295 running in SLI mode, and my 280 for PhysX, I am happy.
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...

  12. #12
    Xtreme Addict
    Join Date
    Aug 2008
    Location
    Hollywierd, CA
    Posts
    1,284
    Quote Originally Posted by saaya View Post
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...
    He is looking at the potential benefits; we should all be so positive about the future.
    [SIGPIC][/SIGPIC]

    I am an artist (EDM producer/DJ), pls check out mah stuff.

  13. #13
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    +1

    Quote Originally Posted by saaya View Post
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...
    Advertising it? That's a rather negative word. I would rather say that I like discussing it, and I believe GPU acceleration is far more interesting than CPUs. I assure you, it's no effort at all. I simply enjoy it.

    Why do you spend so much time talking about CPUs? Do you regret the time and effort you spent doing interviews and posting videos on the net? Probably not.

    It's the same for me with GPUs.

    I have 720 stream processors in my system, and I rather like the idea of having them all harnessed by CUDA/OpenCL/DirectX. I think it will give me more speed in GPU-accelerated apps than upgrading to an i7 system would.

    I don't have the $$ for that right now anyway, but I already own my GPU's.

    Be happy for the guys that are into it.

    I wonder if, when folding, we'll soon no longer have to disable SLI to run two instances on a 295.

    Not having to switch SLI back and forth between gaming and folding will be nice. I already like that leaving my 280 set as a dedicated PhysX processor doesn't bother F@H at all. It still runs just fine.

    If SLI support follows, it will be a real value-add in my book.
    Last edited by Talonman; 07-24-2009 at 01:22 AM.
