interesting.

http://www.tgdaily.com/content/view/34799/118/

Santa Clara (CA) - Today, AMD released the first of what will be two major upgrades to their AMD Core Math Library (ACML) in the coming year. This first upgrade adds support for Barcelona's native 128-bit SIMD engine. It works with Windows, Linux and Solaris, and doubles the floating point operations per clock on Barcelona from 4 to 8. The major upgrade next year will include direct support for heterogeneous processing using their recently announced FireStream 64-bit stream processor.
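To put the "4 to 8 operations per clock" figure in perspective, here is a rough back-of-the-envelope sketch of theoretical peak throughput. It assumes the figure is per core (the article does not say), and the core count and clock speed below are purely illustrative placeholders, not quoted specifications. The function name `peak_gflops` is made up for this example.

```python
def peak_gflops(cores, clock_ghz, flops_per_clock):
    """Theoretical peak in GFLOPS: cores x clock (GHz) x flops per clock per core.

    This is an upper bound only; real code rarely sustains it.
    """
    return cores * clock_ghz * flops_per_clock


# Illustrative values only (assumed, not taken from the article):
# a 4-core part running at 2.0 GHz.
old = peak_gflops(cores=4, clock_ghz=2.0, flops_per_clock=4)
new = peak_gflops(cores=4, clock_ghz=2.0, flops_per_clock=8)

print(f"peak before: {old} GFLOPS, peak after: {new} GFLOPS, ratio: {new / old}")
```

Whatever values are plugged in for cores and clock, doubling the per-clock figure doubles the theoretical peak; whether a given program sees anything close to that depends on how much of its time is spent in vectorizable math.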


The refresh brings ACML up to version 4.0. The library is available for free download and includes enhancements to their base math algorithms. These include Level 1, 2 and 3 Basic Linear Algebra Subprograms (BLAS), Linear Algebra Package (LAPACK) routines, Fast Fourier Transforms (FFTs) in single, double, single-complex and double-complex data types, and scalar, vector, array and transcendental math functions. A pseudo-random number generator with both single- and double-precision generation is also included.
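For anyone unfamiliar with the three BLAS levels mentioned above, the sketch below shows the kind of operation each level covers: Level 1 is vector-vector, Level 2 is matrix-vector, Level 3 is matrix-matrix. The names `axpy`, `gemv` and `gemm` follow the standard BLAS routine names, but this is plain, unoptimized Python written only to illustrate the math. It is not ACML's interface or implementation, and it leaves out the transpose flags and leading-dimension arguments that real BLAS routines take.

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y.

    A is a list of rows.
    """
    return [
        alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
        for row, yi in zip(a, y)
    ]


def gemm(alpha, a, b, beta, c):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C.

    A, B and C are lists of rows.
    """
    b_cols = list(zip(*b))  # columns of B, so each entry is a row-by-column dot product
    return [
        [
            alpha * sum(aik * bkj for aik, bkj in zip(row, col)) + beta * cij
            for col, cij in zip(b_cols, c_row)
        ]
        for row, c_row in zip(a, c)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

print(axpy(2.0, [1.0, 2.0], [3.0, 4.0]))
print(gemv(1.0, a, [1.0, 1.0], 0.0, [0.0, 0.0]))
print(gemm(1.0, a, b, 0.0, zeros))
```

Level 3 routines do the most arithmetic per element loaded from memory, which is why they tend to benefit the most from wider SIMD units in a tuned library.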

AMD has worked with PathScale, PGI and Sun, as well as Microsoft and the Linux community at large, to include special optimizations and native library support in their products. AMD's Margaret Lewis, Commercial Solutions director, told us that while AMD has no plans to release their own compiler, they are working very closely with the various compiler communities. Lewis told us that Microsoft Visual Studio alone, for example, reaches 9 million developers. Visual Studio Express doubles that, reaching 18 million. The non-Windows platforms target GNU compilers in both FORTRAN and C++, as these account for 43% of the overall compiler market.

AMD also has a CodeAnalyst tool, which is similar in function to Intel's VTune, though with a less comprehensive feature set. It is also available for free download. AMD first released their ACML in June 2003. Version 4.0 is completely backward compatible and will offer drop-in enhancement following a recompile for any software using ACML. The changes present in ACML 4.0 will also benefit other x86 processors with compatible SIMD engines, though the greatest performance gains will be seen in AMD64 systems as the optimizations were specifically tailored for that platform. New function versions are available with different naming conventions, such as foo() being the original function and foo2() being the new one. These new functions contain algorithm enhancements unique to Barcelona: changes that either would not work with previous AMD64 architectural versions, or would not have been efficient with the older 64-bit engine.