NukadaFFT Library User's Manual


A CUDA application may use either the CUDA runtime API or the CUDA driver API. Many applications use the runtime API because it is simpler. The NukadaFFT library provides API functions for both kinds of application.

Runtime API functions

int nufftInit(void);
This function initializes the library. Before calling this function, the application must initialize the CUDA context by allocating CUDA resources.
int nufftPlan1d(nufft_plan *plan, int nx, int batch, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 1-D FFTs. `nx' is the transform size and `batch' is the number of batched FFTs. `in' is the input buffer and `out' is the output buffer. For 1-D FFTs, the working spaces `work1' and `work2' are ignored. To overwrite the input with the result, specify the same buffer for the input and the output. `mode' is the memory type of the input and output buffers, as listed in Table 1.
int nufftPlan2d(nufft_plan *plan, int nx, int ny, int batch, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 2-D FFTs. `nx' x `ny' is the transform size and `batch' is the number of batched FFTs. `in' is the input buffer and `out' is the output buffer. `work1' and `work2' are working spaces. `in' and `work1' may be the same buffer, but `work2' must be different from both `in' and `work1'. If `in' and `out' are device-mapped pinned memory, specify device memory for `work1' and `work2' for better performance. `mode' is the memory type of the input and output buffers, as listed in Table 1. This function corresponds to an API function of the CUFFT library as follows:
int n[2] = {ny, nx};
cufftPlanMany(plan, 2, n, NULL, 1, 0, NULL, 1, 0, CUFFT_C2C or CUFFT_Z2Z, batch);
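The aliasing rules for `in', `work1' and `work2' given above can be checked before a plan is created. The helper below is a minimal sketch and not part of the library: the name work_buffers_ok() is hypothetical, and it only compares the pointer values, so it does not detect buffers that partially overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of NukadaFFT.
 * Returns 1 if the buffers satisfy the documented rules, 0 otherwise:
 *   - `in' and `work1' may be the same buffer;
 *   - `work2' must differ from both `in' and `work1'.
 * Only base addresses are compared; partial overlap is not detected. */
int work_buffers_ok(const void *in, const void *work1, const void *work2)
{
    assert(in != NULL && work1 != NULL && work2 != NULL);
    return work2 != in && work2 != work1;
}
```

The same rules are stated for nufftPlan3d() and nufftPlan0d(), so the check applies to them as well.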
int nufftPlan3d(nufft_plan *plan, int nx, int ny, int nz, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for a single 3-D FFT. `nx' x `ny' x `nz' is the transform size. `in' is the input buffer and `out' is the output buffer. `work1' and `work2' are working spaces. `in' and `work1' may be the same buffer, but `work2' must be different from both `in' and `work1'. If `in' and `out' are device-mapped pinned memory, specify device memory for `work1' and `work2' for better performance. `mode' is the memory type of the input and output buffers, as listed in Table 1. This function corresponds to an API function of the CUFFT library as follows:
cufftPlan3d(plan, nz, ny, nx, CUFFT_C2C or CUFFT_Z2Z);
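The argument order in the CUFFT call implies that `nx' is the fastest-varying dimension, because CUFFT treats the last size argument of cufftPlan3d() as the one that is contiguous in memory. Assuming NukadaFFT uses the same contiguous layout without padding (an assumption; this manual does not state the layout), the linear index of element (x, y, z) can be sketched as follows. The helper name index_3d() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of NukadaFFT.
 * Linear index of element (x, y, z) in an nx * ny * nz array,
 * ASSUMING x is the fastest-varying dimension and there is no padding. */
size_t index_3d(int nx, int ny, int nz, int x, int y, int z)
{
    assert(0 <= x && x < nx);
    assert(0 <= y && y < ny);
    assert(0 <= z && z < nz);
    return (size_t)x + (size_t)nx * ((size_t)y + (size_t)ny * (size_t)z);
}
```

Under this assumption, neighbouring elements along X are adjacent in memory, elements along Y are `nx' elements apart, and elements along Z are nx*ny elements apart.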
int nufftPlan0d(nufft_plan *plan, int nx, int ny, int nz, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 1-D FFTs along dimension Y. `nx' x `ny' x `nz' is the array size, and `ny' is the transform size. `in' is the input buffer and `out' is the output buffer. `work1' and `work2' are working spaces. `in' and `work1' may be the same buffer, but `work2' must be different from both `in' and `work1'. If `in' and `out' are device-mapped pinned memory, specify device memory for `work1' and `work2' for better performance. `mode' is the memory type of the input and output buffers, as listed in Table 1.
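To make the batching along Y concrete: if the array is stored contiguously with `nx' as the fastest-varying dimension (an assumption; this manual does not state the layout), an `nx' x `ny' x `nz' array contains nx*nz independent lines along Y, each holding `ny' elements that are `nx' elements apart. The helpers below are a sketch with hypothetical names and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of NukadaFFT.
 * ASSUMED layout: x fastest, then y, then z, with no padding. */

/* Number of independent length-ny transforms in the array. */
size_t y_line_count(int nx, int nz)
{
    assert(nx > 0 && nz > 0);
    return (size_t)nx * (size_t)nz;
}

/* Index of the first element of the Y line at column x of slab z. */
size_t y_line_start(int nx, int ny, int x, int z)
{
    assert(nx > 0 && ny > 0 && 0 <= x && x < nx && z >= 0);
    return (size_t)x + (size_t)nx * (size_t)ny * (size_t)z;
}

/* Index of element j of that line; consecutive j are nx elements apart. */
size_t y_line_element(int nx, int ny, int x, int z, int j)
{
    assert(0 <= j && j < ny);
    return y_line_start(nx, ny, x, z) + (size_t)nx * (size_t)j;
}
```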
int nufftExec(nufft_plan plan, void *in, void *out, void *work1, void *work2, int direction);
This function computes the transforms defined in `plan'. The buffers can be different from those specified when the plan was created. However, their memory types should be the same. `direction' is either NUFFT_FORWARD or NUFFT_BACKWARD. The kernels are launched on stream zero, and this function is non-blocking.
int nufftExecAsync(nufft_plan plan, void *in, void *out, void *work1, void *work2, int direction, CUstream stream);
This function computes the transforms defined in `plan'. The buffers can be different from those specified when the plan was created. However, their memory types should be the same. `direction' is either NUFFT_FORWARD or NUFFT_BACKWARD. In addition to the parameters of nufftExec(), you can specify a CUDA stream for the FFT kernels. CUstream is the data type for a CUDA stream in the CUDA driver API. In CUDA 3.1 or later, it is equivalent to cudaStream_t in the CUDA runtime API.
int nufftDestroy(nufft_plan plan);
This function destroys a plan and all resources allocated for the plan.
int nufftPrecision(int precision);
This function sets the precision for transforms. It affects all plans created after this call. Specify 1 for single precision or 2 for double precision.
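The precision setting also determines how large each buffer has to be. The sketch below assumes interleaved complex elements, that is, two floats in single precision and two doubles in double precision, by analogy with CUFFT_C2C and CUFFT_Z2Z. This manual does not state the element layout, so treat that as an assumption; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of NukadaFFT.
 * ASSUMPTION: one element is an interleaved complex number
 * (real part followed by imaginary part). */

/* precision uses the values passed to nufftPrecision():
 * 1 = single, 2 = double. Returns 0 for any other value. */
size_t complex_element_bytes(int precision)
{
    switch (precision) {
    case 1:  return 2 * sizeof(float);
    case 2:  return 2 * sizeof(double);
    default: return 0;
    }
}

/* Bytes needed for `batch' arrays of nx * ny * nz complex elements. */
size_t buffer_bytes(int nx, int ny, int nz, int batch, int precision)
{
    assert(nx > 0 && ny > 0 && nz > 0 && batch > 0);
    return (size_t)nx * (size_t)ny * (size_t)nz * (size_t)batch
           * complex_element_bytes(precision);
}
```

For a batched 1-D plan, pass 1 for `ny' and `nz'; for a single 3-D plan, pass 1 for `batch'.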
int nufftWisdomMode(int mode);
This function sets the wisdom mode. Wisdom is a database of auto-tuning results. By default, the user wisdom file is stored in $HOME/.nufft/. The available modes are listed in Table 2.

Driver API functions

The driver API functions are basically the same as the corresponding runtime API functions.

Table 1. Memory types of input and output buffers.

Input    Output   Mode
device   device   NUFFT_D2D
pinned   device   NUFFT_H2D
device   pinned   NUFFT_D2H
pinned   pinned   NUFFT_H2H
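The mapping in Table 1 can be written as a small lookup. Because this manual does not give the numeric values of the mode constants, the sketch below returns the name of the constant as a string rather than the constant itself. The function name mode_for() is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of NukadaFFT.
 * Takes the memory type of each buffer ("device" or "pinned") and
 * returns the NAME of the matching mode constant from Table 1. */
const char *mode_for(const char *input, const char *output)
{
    int in_pinned  = strcmp(input, "pinned") == 0;
    int out_pinned = strcmp(output, "pinned") == 0;

    /* Only the two memory types listed in Table 1 are accepted. */
    assert(in_pinned  || strcmp(input, "device") == 0);
    assert(out_pinned || strcmp(output, "device") == 0);

    if (in_pinned)
        return out_pinned ? "NUFFT_H2H" : "NUFFT_H2D";
    return out_pinned ? "NUFFT_D2H" : "NUFFT_D2D";
}
```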

Table 2. Wisdom modes.

Flag                             Meaning                          Default
NUFFT_WISDOM_MODE_SYSTEM_READ    Read the system wisdom file.     Enabled
NUFFT_WISDOM_MODE_SYSTEM_WRITE   Update the system wisdom file.   Disabled
NUFFT_WISDOM_MODE_USER_READ      Read the user wisdom file.       Enabled
NUFFT_WISDOM_MODE_USER_WRITE     Update the user wisdom file.     Enabled

By default, no system-level wisdom path is defined. You can define one with the environment variable `NUFFT_SYSTEM_WISDOM'.
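Besides exporting the variable in the shell, a program can set it for its own process. The sketch below uses the POSIX setenv() call. It assumes that the library reads `NUFFT_SYSTEM_WISDOM' no earlier than nufftInit(); this manual does not say when the variable is read, so the call is best made before any NukadaFFT function. The helper name is hypothetical, and whether the value should name a file or a directory is not specified here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of NukadaFFT.
 * Sets NUFFT_SYSTEM_WISDOM for the current process, replacing any
 * existing value. Returns 0 on success, -1 on failure (as setenv does). */
int set_system_wisdom_path(const char *path)
{
    assert(path != NULL && path[0] != '\0');
    return setenv("NUFFT_SYSTEM_WISDOM", path, 1);
}
```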