Nukada FFT library
Nukada FFT library is a high-performance FFT library for NVIDIA CUDA GPUs.
This library employs new FFT algorithms which are efficient for GPUs.
The library uses an auto-tuning mechanism to support varying transform sizes: it generates and evaluates many kinds of FFT kernels and selects the optimal one.
NVIDIA also provides CUFFT, an FFT library for CUDA GPUs.
CUFFT 3.1 is very fast on Fermi GPUs for single-precision transforms whose sizes are powers of two.
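The auto-tuning idea mentioned above (generate many candidate kernels, evaluate them, keep the best) can be illustrated with a minimal Python sketch. It is not the library's actual tuner: the names `measure_seconds` and `autotune` are hypothetical, and plain Python callables stand in for generated CUDA kernels.

```python
import time


def measure_seconds(kernel, data, repeats=5):
    """Return the best wall-clock time of `kernel(data)` over several runs.

    Taking the minimum reduces the influence of one-off system noise.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(data)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(candidates, data, measure=measure_seconds):
    """Evaluate every candidate and return (best_name, all_costs).

    `candidates` maps a descriptive name to a callable (a stand-in for a
    generated kernel). `measure` returns a cost where lower is better;
    it is a parameter so a different metric can be swapped in.
    """
    if not candidates:
        raise ValueError("no candidate kernels to evaluate")
    costs = {name: measure(kernel, data) for name, kernel in candidates.items()}
    best_name = min(costs, key=costs.get)
    return best_name, costs


if __name__ == "__main__":
    # Two toy "kernels" that compute the same sum in different ways.
    candidates = {
        "builtin_sum": lambda xs: sum(xs),
        "manual_loop": lambda xs: [acc := 0] and [acc := acc + x for x in xs][-1],
    }
    name, costs = autotune(candidates, list(range(10000)))
    print("selected:", name)
```

In a real tuner the candidates would differ in properties such as radix combination or thread layout, and the result would typically be cached per transform size so the search runs only once.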
Features
This library supports the following transform types and data types.
- Transform Types
- Batched 1-D FFTs (NX * batch)
- Batched 2-D FFTs (NX * NY * batch)
- Single 3-D FFT (NX * NY * NZ)
- Batched 1-D FFTs along dimension Y of a 3-D array
- Data Types
- complex-to-complex, single precision
- complex-to-complex, double precision
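To make the last transform type in the list above (1-D FFTs over dimension Y of a 3-D array) more concrete, the sketch below lists which array elements belong to one such transform. It assumes a layout in which X is the fastest-varying dimension, so that element (x, y, z) sits at flat offset x + NX * (y + NY * z). Both the layout and the function names are assumptions made for illustration; this page does not specify them.

```python
def y_line_indices(nx, ny, nz, x, z):
    """Flat offsets of the NY elements forming one 1-D transform along Y.

    Assumed layout: offset(x, y, z) = x + nx * (y + ny * z),
    i.e. X varies fastest, then Y, then Z.
    """
    if not (0 <= x < nx and 0 <= z < nz):
        raise IndexError("x or z is outside the array")
    return [x + nx * (y + ny * z) for y in range(ny)]


def count_y_lines(nx, nz):
    """Number of independent Y-direction transforms in the 3-D array."""
    return nx * nz


if __name__ == "__main__":
    # A 4 x 3 x 2 array: each Y line has 3 elements, spaced nx = 4 apart.
    print(y_line_indices(4, 3, 2, x=1, z=1))  # [13, 17, 21]
    print(count_y_lines(4, 2))                # 8
```

Under this assumed layout, consecutive elements of a Y line are NX elements apart in memory, which is why a strided transform of this kind is a distinct case from the contiguous batched 1-D FFT along X.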
Limitations
The library only implements FFT kernels of up to radix-32.
This means that NX, NY, and NZ must each be factorizable into factors no larger than 32.
Moreover, NX is subject to a stronger restriction. The 1-D FFTs along dimension X are
performed by a single CUDA kernel; therefore the transform must fit within the
computational resources of one streaming multiprocessor of the GPU.
It is difficult to state the exact conditions because the final resource
usage of a CUDA kernel depends on the compiler, driver, and GPU model.
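The radix limit described above can be checked for a given size with a short Python sketch. It assumes the limit means that every factor of the size must be at most 32. Whether the library implements every radix up to 32 (for example, prime radices such as 29 or 31) is not stated here, and the greedy grouping in `radix_plan` is only an illustration, not the decomposition the library actually chooses. All function names are hypothetical. The sketch does not model the additional, hardware-dependent restriction on NX.

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity) in ascending order."""
    if n < 1:
        raise ValueError("size must be a positive integer")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def radix_plan(n, max_radix=32):
    """Group the prime factors of n into radices no larger than max_radix.

    Returns a list of radices whose product is n, or None when n has a
    prime factor above max_radix and so cannot be decomposed this way.
    """
    primes = prime_factors(n)
    if any(p > max_radix for p in primes):
        return None
    plan, current = [], 1
    for p in sorted(primes, reverse=True):
        if current * p > max_radix:
            plan.append(current)  # close the current radix, start a new one
            current = p
        else:
            current *= p
    if current > 1:
        plan.append(current)
    return plan


if __name__ == "__main__":
    print(radix_plan(1024))  # [32, 32]
    print(radix_plan(360))   # [15, 24]
    print(radix_plan(74))    # None, because 74 = 2 * 37 and 37 > 32
```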
Requirements
- Existing CUDA-capable GPUs
- CUDA 3.2 or later (the latest version of the library does not work with CUDA 3.1)
- Compatible device drivers
- Linux (32-bit or 64-bit)
- Windows (32-bit or 64-bit)
- (Compatibility with Mac OS is unknown.)
Downloads
Sorry for the wait. The download service has resumed.
For registered beta users...
Documents
FAQ
Forum
Discussion takes place in a thread on the NVIDIA Forum.
NukadaFFT thread in NVIDIA Forum
Benchmark Results
Publications
- Akira Nukada, Yasuhiko Ogata, Toshio Endo and Satoshi Matsuoka. "Bandwidth Intensive 3-D FFT kernel for GPUs using CUDA". In Proceedings of the ACM/IEEE conference on Supercomputing (SC'08), 11 pages, IEEE Press, Austin, November 2008.
- Akira Nukada and Satoshi Matsuoka. "Auto-Tuning 3-D FFT Library for CUDA GPUs". In Proceedings of the ACM/IEEE conference on Supercomputing (SC'09), 10 pages, ACM Press, Portland, November 2009.
- Akira Nukada and Satoshi Matsuoka. "NukadaFFT: An Auto-Tuning FFT Library for CUDA GPUs". NVIDIA GPU Technology Conference 2010, Research Summit Poster, San Jose, September 2010.
Acknowledgments
This work is supported by the following projects.
- JST-CREST "ULP-HPC: Ultra Low-Power, High Performance Computing via Modeling and Optimization of Next Generation HPC Technologies".
- Microsoft Technical Computing Initiative "HPC-GPGPU: Large-Scale Commodity Accelerated Clusters and its Application to Advanced Structural Proteomics".
- NVIDIA CUDA Center of Excellence Program (Tokyo Tech)
- MEXT Grant-in-Aid for Young Scientists (A) 22680002.
Akira Nukada, Tokyo Institute of Technology, Japan.