
Thread: CUDA Toolkit and SDK 2.3 released

  1. #1
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977

    CUDA Toolkit and SDK 2.3 released

    http://forums.nvidia.com/index.php?showtopic=102548

    IMPORTANT NOTE

    Because of the new support for cross-compilation, the library locations on Linux have changed. 32-bit libraries are located by default at /usr/local/cuda/lib and 64-bit libraries are located by default at /usr/local/cuda/lib64. This is a change from 2.2 and will necessitate changing /etc/ld.so.conf or LD_LIBRARY_PATH.

    The CUDA Toolkit and SDK v2.3 are now released and available to all developers.

    A brief overview of the features (there are a lot):

    * The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well. See the CUDA Toolkit release notes for details.
    * The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros. (see below)
    * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.
    * The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. Please note that the installation location of the libraries has changed, so developers on 64-bit Linux must update their LD_LIBRARY_PATH to contain either /usr/local/cuda/lib or /usr/local/cuda/lib64.
    * New support for fp16 <-> fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32. Use of fp16 format is ideal for applications that require higher numerical range than 16-bit integer but less precision than fp32 and reduces memory space and bandwidth consumption.
    * The CUDA SDK has been updated to include:
    o A new pitchLinearTexture code sample that shows how to efficiently texture from pitch linear memory.
    o A new PTXJIT code sample illustrating how to use cuModuleLoadDataEx() to load PTX source from memory instead of loading a file.
    o Two new code samples for Windows, showing how to use the NVCUVID library to decode MPEG-2, VC-1, and H.264 content and pass frames to OpenGL or Direct3D for display.
    o Updated code samples showing how to properly align CUDA kernel function parameters so the same code works on both x32 and x64 systems.
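    The parameter-alignment samples mentioned above boil down to one idea: when building a kernel parameter buffer by hand (driver API style), round each parameter's offset up to its natural alignment so the layout is identical on 32-bit and 64-bit hosts. A host-only sketch of that rounding macro, following the convention used in the SDK samples:

    ```c
    #include <stdio.h>
    #include <stddef.h>

    /* Round `offset` up to the next multiple of `alignment` (a power of two).
       This is the trick the driver-API samples use before copying each kernel
       parameter into the parameter buffer. Sketch for illustration only. */
    #define ALIGN_UP(offset, alignment) \
        (offset) = ((offset) + (alignment) - 1) & ~((alignment) - 1)

    int main(void)
    {
        size_t offset = 0;

        /* first parameter: a 4-byte int at offset 0 */
        ALIGN_UP(offset, 4);
        printf("int   at offset %zu\n", offset);
        offset += sizeof(int);              /* offset is now 4 */

        /* second parameter: an 8-byte device pointer; without ALIGN_UP it
           would land at offset 4 on a 32-bit host and misalign the load */
        ALIGN_UP(offset, 8);
        printf("ptr   at offset %zu\n", offset);   /* 8, not 4 */
        offset += 8;

        /* third parameter: a 4-byte float */
        ALIGN_UP(offset, 4);
        printf("float at offset %zu\n", offset);   /* 16 */
        offset += sizeof(float);

        return 0;
    }
    ```

    Because every offset is rounded the same way regardless of host pointer size, the same source builds correctly as both x32 and x64.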

    * The Visual Profiler includes several enhancements:
    o All memory transfer API calls are now reported
    o Support for profiling multiple contexts per GPU
    o Synchronized clocks for requested start time on the CPU and start/end times on the GPU for all kernel launches and memory transfers
    o Global memory load and store efficiency metrics for GPUs with compute capability 1.2 and higher

    * The CUDA Driver for MacOS is now packaged separately from the CUDA Toolkit.
    * Support for major Linux distros, MacOS X, and Windows:
    o MacOS X 10.5.6 and later (32-bit)
    o Windows XP/Vista/7 with Visual Studio 8 (VC2005 SP1) and 9 (VC2008)
    o Fedora 10, RHEL 4.7 & 5.3, SLED 10.2 & 11.0, OpenSUSE 11.1, and Ubuntu 8.10 & 9.04
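    As a CPU-side illustration of the fp16 <-> fp32 storage idea mentioned above: store values as 16-bit halves (1 sign, 5 exponent, 10 mantissa bits) and expand them to fp32 for computation. This is a simplified sketch, not NVIDIA's implementation; denormal halves are flushed to zero and NaN payloads are not preserved.

    ```c
    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* fp32 -> fp16 with round-to-nearest-even (simplified: no denormals). */
    static uint16_t float_to_half(float f)
    {
        uint32_t x;
        memcpy(&x, &f, sizeof x);                 /* raw fp32 bits */

        uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
        int32_t  exp  = (int32_t)((x >> 23) & 0xFFu) - 127 + 15;  /* rebias */
        uint32_t mant = x & 0x7FFFFFu;

        if (exp >= 31)                            /* overflow, inf, or NaN */
            return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));
        if (exp <= 0)                             /* underflow: flush to zero */
            return sign;

        /* narrow the mantissa from 23 to 10 bits, round-to-nearest-even */
        uint32_t half = (uint32_t)(exp << 10) | (mant >> 13);
        uint32_t rem  = mant & 0x1FFFu;
        if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
            half++;                               /* carry into exponent is OK */
        return (uint16_t)(sign | half);
    }

    /* fp16 -> fp32 (exact for normals; denormals were flushed above). */
    static float half_to_float(uint16_t h)
    {
        uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
        int32_t  exp  = (h >> 10) & 0x1F;
        uint32_t mant = h & 0x3FFu;
        uint32_t x;
        float f;

        if (exp == 0)
            x = sign;                             /* zero */
        else if (exp == 31)
            x = sign | 0x7F800000u | (mant << 13);/* inf/NaN */
        else
            x = sign | (uint32_t)((exp - 15 + 127) << 23) | (mant << 13);
        memcpy(&f, &x, sizeof f);
        return f;
    }

    int main(void)
    {
        float vals[] = { 1.0f, -2.5f, 0.333333f, 65504.0f };
        for (int i = 0; i < 4; i++) {
            uint16_t h = float_to_half(vals[i]);
            printf("%f -> 0x%04X -> %f\n", vals[i], h, half_to_float(h));
        }
        return 0;
    }
    ```

    The round trip through fp16 halves memory footprint and bandwidth at the cost of precision (10 mantissa bits, max finite value 65504), which is the trade-off the intrinsics are designed for.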

    Notes for MacOS developers

    * The cudadriver_2.3.1_macos.pkg driver is for use with Quadro FX 4800 and GeForce GTX 285.
    * The cudadriver_2.3.1_macos.pkg driver may also be used with any NVIDIA GPU on Snow Leopard.
    * Use the cudadriver_2.3.0_macos.pkg driver for MacOS X 10.5.6 and later (pre-Snow Leopard) with all other GPUs.


    More progress.

    Question about this feature: * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.

    Does that mean PhysX or Folding will support SLI now, or just that GPUs in SLI can be individually accessed by CUDA applications?
    Last edited by Talonman; 07-22-2009 at 05:14 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  2. #2
    Xtreme Enthusiast
    Join Date
    Nov 2006
    Location
    SoCal
    Posts
    632
    Can someone explain CUDA to me?? wtf is it?

    Thanks, lol
    himynameisfrank
    Flickr

    i7 2600K @ 4.4Ghz
    Corsair Hydro H80
    MSI Z68A-GD80
    EVGA GTX580 x 2 SLi
    Corsair XMS3 1600 12Gb
    Corsair HX850w
    OCZ Agility 3 SSD x 3 RAID 0
    Creative X-Fi Titanium / Astro A40 Audio System
    Corsair Graphite 600T SE White
    DELL U2410 24" IPS Panel

  3. #3
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    I would be happy to.

    http://en.wikipedia.org/wiki/CUDA

    "CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. CUDA is the computing engine in NVIDIA graphics processing units or GPUs that is accessible to software developers through industry standard programming languages. Programmers use 'C for CUDA' (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler,[1] to code algorithms for execution on the GPU. CUDA architecture supports a range of computational interfaces including OpenCL[2] and DirectX Compute[3]. Third party wrappers are also available for Python, Fortran, Java and Matlab.

    The latest drivers all contain the necessary CUDA components. CUDA works with all NVIDIA GPUs from the G8X series onwards, including GeForce, Quadro and the Tesla line. NVIDIA states that programs developed for the GeForce 8 series will also work without modification on all future Nvidia video cards, due to binary compatibility. CUDA gives developers access to the native instruction set and memory of the parallel computational elements in CUDA GPUs. Using CUDA, the latest NVIDIA GPUs effectively become open architectures like CPUs. Unlike CPUs however, GPUs have a parallel "many-core" architecture, each core capable of running thousands of threads simultaneously - if an application is suited to this kind of an architecture, the GPU can offer large performance benefits.

    In the computer gaming industry, in addition to graphics rendering, graphics cards are used in game physics calculations (physical effects like debris, smoke, fire, fluids), examples include PhysX and Bullet. CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more.[4][5][6][7] An example of this is the BOINC distributed computing client.[8]

    CUDA provides both a low level API and a higher level API. The initial CUDA SDK was made public on 15 February 2007, for Microsoft Windows and Linux. Mac OS X support was later added in version 2.0[9], which supersedes the beta released February 14, 2008.[10]"



    In my own words, the low-level API could be considered the OS of the GPU.
    The high-level CUDA API (the "C for CUDA" language) is based on C/C++, but adds CUDA-specific extensions that allow work to be processed on your GPU instead of your CPU.
    This includes both single- and double-precision calculations.
    Last edited by Talonman; 07-22-2009 at 05:15 PM.

  4. #4
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Toronto ON
    Posts
    566
    Here we go again.
    Core i7-4930K LGA 2011 Six-Core - Cooler Master Seidon 120XL Push-Pull Liquid Water
    ASUS Rampage IV Black Edition LGA2011 - G.SKILL Trident X Series 32GB (4 x 8GB) DDR3 1866
    Sapphire R9 290X 4GB TRI-X OC in CrossFire - ATI TV Wonder 650 PCIe
    Intel X25-M 160GB G2 SSD - WD Black 2TB 7200 RPM 64MB Cache SATA 6
    Corsair HX1000W PSU - Pioneer Blu-ray Burner 6X BD-R
    Westinghouse LVM-37w3, 37-inch 1080p - Windows 7 64-bit Pro
    Sennheiser RS 180 - Cooler Master Cosmos S Case

  5. #5
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Flamebait.

    No thanks.

  6. #6
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    Does that mean PhysX or Folding will support SLI now, or just that GPUs in SLI can be individually accessed by CUDA applications?
    The latter. Before this update, if you had SLI enabled, CUDA would not see all the GPUs. CUDA apps like F@H will now be able to utilize all GPUs even if SLI is enabled. But it's not "SLI" like it is for games; the host still controls each GPU independently.

  7. #7
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Thank you sir!

    As long as CUDA has full access to my 295 running in SLI mode, and my 280 for PhysX, I am happy.

  8. #8
    Registered User
    Join Date
    Jan 2007
    Posts
    94
    Quote Originally Posted by trinibwoy View Post
    The latter. Before this update, if you had SLI enabled, CUDA would not see all the GPUs. CUDA apps like F@H will now be able to utilize all GPUs even if SLI is enabled. But it's not "SLI" like it is for games; the host still controls each GPU independently.

    To add to this, most of the CUDA apps I've tried would flat-out fail when SLI was enabled; the above case of seeing just one GPU was the best-case outcome.

    Basically, this change allows CUDA to "ignore" SLI.
    i7 920 @ 4.2Ghz
    Asus P6T6 Revolution
    3x GTX260
    6x2GB Corsair DDR3-1600
    G.Skill Falcon 128GB SSD
    G.SKill Titan 128GB SSD
    Seagate 7200.11 1.5TB
    Vista 64 Ultimate

  9. #9
    Xtreme Member
    Join Date
    Apr 2006
    Posts
    393
    CUDA just needs to die.

  10. #10
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Quote Originally Posted by Clairvoyant129 View Post
    CUDA just needs to die.
    thread crappers need to be banned

    All along the watchtower the watchmen watch the eternal return.

  11. #11
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by Talonman View Post
    Thank you sir!

    As long as CUDA has full access to my 295 running in SLI mode, and my 280 for PhysX, I am happy.
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...

  12. #12
    Xtreme Addict
    Join Date
    Aug 2008
    Location
    Hollywierd, CA
    Posts
    1,284
    Quote Originally Posted by saaya View Post
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...
    He is looking at the potential benefits; we should all be so positive about the future.
    [SIGPIC][/SIGPIC]

    I am an artist (EDM producer/DJ), pls check out mah stuff.

  13. #13
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    +1

    Quote Originally Posted by saaya View Post
    Why?

    Seriously, what benefit do you get out of this?

    I'm really surprised by your enthusiasm for CUDA and all the time and effort you spend advertising it...
    Advertising it? That's a rather negative word. I would rather say that I like discussing it, and I believe GPU acceleration is far more interesting than CPUs. I assure you, it's no effort at all. I simply enjoy it.

    Why do you spend so much time talking about CPUs? Do you regret the time and effort you spent doing interviews and posting videos on the net? Probably not.

    It's the same for me with GPUs.

    I have 720 stream processors in my system, and I rather like the idea of having them all harnessed by CUDA/OpenCL/DirectX. I think it will give me more speed in GPU-accelerated apps than upgrading to an i7 system would.

    I don't have the $$ for that right now anyway, but I already own my GPU's.

    Be happy for the guys that are into it.

    I wonder if, when folding, we'll soon no longer have to disable SLI to run two instances on a 295.

    Not having to switch SLI back and forth between gaming and folding will be nice. I already like that leaving my 280 set as a dedicated PhysX processor doesn't bother F@H at all. It still runs just fine.

    If SLI support follows, it will be a real value-add in my book.
    Last edited by Talonman; 07-24-2009 at 01:22 AM.
