Thread: CUDA Toolkit and SDK 2.3 released

  1. #1
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977

    CUDA Toolkit and SDK 2.3 released

    http://forums.nvidia.com/index.php?showtopic=102548

    IMPORTANT NOTE

    Because of the new support for cross-compilation, the library locations on Linux have changed. 32-bit libraries are located by default at /usr/local/cuda/lib and 64-bit libraries are located by default at /usr/local/cuda/lib64. This is a change from 2.2 and will necessitate changing /etc/ld.so.conf or LD_LIBRARY_PATH.

    The CUDA Toolkit and SDK v2.3 are now released and available to all developers.

    A brief overview of features--there are a lot:

    * The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well. See the CUDA Toolkit release notes for details.
    * The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros. (see below)
    * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.
    * The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. Please note that the installation location of the libraries has changed, so developers on 64-bit Linux must update their LD_LIBRARY_PATH to contain either /usr/local/cuda/lib or /usr/local/cuda/lib64.
    * New support for fp16 <-> fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32. The fp16 format suits applications that need a wider numerical range than 16-bit integers but less precision than fp32, and it reduces memory footprint and bandwidth consumption.
    * The CUDA SDK has been updated to include:
      o A new pitchLinearTexture code sample that shows how to efficiently texture from pitch linear memory.
      o A new PTXJIT code sample illustrating how to use cuModuleLoadDataEx() to load PTX source from memory instead of loading a file.
      o Two new code samples for Windows, showing how to use the NVCUVID library to decode MPEG-2, VC-1, and H.264 content and pass frames to OpenGL or Direct3D for display.
      o Updated code samples showing how to properly align CUDA kernel function parameters so the same code works on both 32-bit and 64-bit systems.
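
The parameter-alignment point above can be sketched on the host side. The post does not show the SDK code, so this is only an illustration of the offset arithmetic, under the assumption that each kernel parameter must start at a multiple of its own alignment (as when packing arguments with driver API calls such as cuParamSetv()). The names `ParamSpec`, `align_up`, and `layout_params` are hypothetical and are not part of the CUDA SDK.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one kernel parameter: its size and its
// required alignment in bytes (alignment must be a power of two).
struct ParamSpec {
    std::size_t size;
    std::size_t align;
};

// Round `offset` up to the next multiple of `alignment` (a power of two).
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Compute the byte offset of every parameter in order and report the
// total size of the packed parameter block through `total_size`.
inline std::vector<std::size_t> layout_params(const std::vector<ParamSpec>& params,
                                              std::size_t& total_size) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for (const ParamSpec& p : params) {
        offset = align_up(offset, p.align);  // pad so the parameter is aligned
        offsets.push_back(offset);
        offset += p.size;                    // advance past the parameter
    }
    total_size = offset;
    return offsets;
}
```

For a kernel taking (pointer, int, double), a 64-bit host with 8-byte pointers gets offsets 0, 8, 16, while a 32-bit host with 4-byte pointers gets 0, 4, 8. Deriving the offsets from `sizeof` and `alignof` rather than hard-coding them is what lets one source file work on both.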

    * The Visual Profiler includes several enhancements:
      o All memory transfer API calls are now reported
      o Support for profiling multiple contexts per GPU
      o Synchronized clocks for requested start time on the CPU and start/end times on the GPU for all kernel launches and memory transfers
      o Global memory load and store efficiency metrics for GPUs with compute capability 1.2 and higher

    * The CUDA Driver for MacOS is now packaged separately from the CUDA Toolkit.
    * Support for major Linux distros, MacOS X, and Windows:
      o MacOS X 10.5.6 and later (32-bit)
      o Windows XP/Vista/7 with Visual Studio 8 (VC2005 SP1) and 9 (VC2008)
      o Fedora 10, RHEL 4.7 & 5.3, SLED 10.2 & 11.0, OpenSUSE 11.1, and Ubuntu 8.10 & 9.04
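
The fp16 storage / fp32 compute pattern from the feature list can be illustrated without a GPU. The release notes refer to device intrinsics; the sketch below is not those intrinsics but a plain C++ software conversion, assuming the IEEE 754 half-precision layout (1 sign bit, 5 exponent bits, 10 mantissa bits) and round-to-nearest-even. The function names `float_to_half_bits` and `half_bits_to_float` are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Convert an fp32 value to fp16 bits (round to nearest, ties to even).
inline std::uint16_t float_to_half_bits(float f) {
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::uint32_t exp = (x >> 23) & 0xFFu;
    std::uint32_t man = x & 0x7FFFFFu;

    if (exp == 0xFFu) {  // infinity or NaN
        return static_cast<std::uint16_t>(sign | 0x7C00u | (man ? 0x200u : 0u));
    }
    const int e = static_cast<int>(exp) - 127 + 15;  // re-biased exponent
    if (e >= 31) {  // too large for fp16: becomes infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    }
    if (e <= 0) {  // result is an fp16 subnormal or zero
        if (e < -10) return static_cast<std::uint16_t>(sign);
        man |= 0x800000u;  // restore the implicit leading 1
        const int shift = 14 - e;
        std::uint32_t half_man = man >> shift;
        const std::uint32_t rem = man & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1);
        if (rem > halfway || (rem == halfway && (half_man & 1u))) ++half_man;
        return static_cast<std::uint16_t>(sign | half_man);
    }
    std::uint32_t half = (static_cast<std::uint32_t>(e) << 10) | (man >> 13);
    const std::uint32_t rem = man & 0x1FFFu;
    // A rounding carry may spill into the exponent, which is the desired result.
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) ++half;
    return static_cast<std::uint16_t>(sign | half);
}

// Expand fp16 bits back to fp32 so arithmetic can run in single precision.
inline float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    const std::uint32_t man = h & 0x3FFu;
    if (exp == 0) {  // zero or subnormal: value is man * 2^-24
        const float v = std::ldexp(static_cast<float>(man), -24);
        return (h & 0x8000u) ? -v : v;
    }
    const std::uint32_t x = (exp == 31u)
        ? (sign | 0x7F800000u | (man << 13))        // infinity or NaN
        : (sign | ((exp + 112u) << 23) | (man << 13));
    float f;
    std::memcpy(&f, &x, sizeof f);
    return f;
}
```

Data stored this way takes 2 bytes per element instead of 4, and the largest finite value is 65504 compared with 32767 for a signed 16-bit integer. The cost is precision: only about three significant decimal digits survive the round trip.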

    Notes for MacOS developers

    * The cudadriver_2.3.1_macos.pkg driver is for use with Quadro FX 4800 and GeForce GTX 285.
    * The cudadriver_2.3.1_macos.pkg driver may also be used with any NVIDIA GPU on Snow Leopard.
    * Use the cudadriver_2.3.0_macos.pkg driver for MacOS X 10.5.6 and later (pre-Snow Leopard) with all other GPUs.


    More progress.

    Question about this feature: * Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.

    Does that mean PhysX or Folding will support SLI now, or just that GPUs in SLI can be individually accessed by CUDA applications?
    Last edited by Talonman; 07-22-2009 at 05:14 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)
