NukadaFFT Library User's Manual
CUDA applications may use either the CUDA runtime API or the CUDA driver API.
Many applications use the runtime API because of its simplicity.
The NukadaFFT library provides API functions for both kinds of application.
Runtime API functions
This function initializes the library. Before calling this function,
the application must initialize the CUDA context by allocating
CUDA resources.
int nufftPlan1d(nufft_plan *plan, int nx, int batch, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 1-D FFTs.
`nx' is the transform size and
`batch' is the number of batched FFTs.
`in' is the input buffer, and `out' is the output buffer.
For 1-D FFTs, the working spaces `work1' and `work2' are ignored.
To overwrite the input with the result, specify the same buffer
for the output.
`mode' is the memory type of the input and output buffers,
as listed in Table 1.
int nufftPlan2d(nufft_plan *plan, int nx, int ny, int batch, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 2-D FFTs.
`nx' x `ny' is the transform size and
`batch' is the number of batched FFTs.
`in' is the input buffer, and `out' is the output buffer.
`work1' and `work2' are working spaces.
`in' and `work1' may be the same buffer.
`work2' must be different from both `in' and `work1'.
If `in' and `out' are device-mapped pinned memory,
specify device memory for `work1' and `work2'
for better performance.
`mode' is the memory type of the input and output buffers,
as listed in Table 1.
This function corresponds to the following CUFFT library call:
int n[2] = {ny, nx};
cufftPlanMany(plan, 2, n, CUFFT_C2C or CUFFT_Z2Z, batch);
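The dimension order `{ny, nx}' in the CUFFT call above suggests that `nx' is the fastest-varying dimension. The following is a minimal sketch of the resulting element indexing, assuming the batches are stored contiguously one after another. The helper `batched2d_index' is hypothetical and is not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the library.
 * Assumes a contiguous layout in which x varies fastest, then y,
 * then the batch index b, as suggested by n[2] = {ny, nx}. */
static size_t batched2d_index(int nx, int ny, int x, int y, int b)
{
    assert(nx > 0 && ny > 0);
    assert(x >= 0 && x < nx && y >= 0 && y < ny && b >= 0);
    return (size_t)x + (size_t)nx * ((size_t)y + (size_t)ny * (size_t)b);
}
```

Under this assumption, consecutive batches start `nx' x `ny' elements apart.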
int nufftPlan3d(nufft_plan *plan, int nx, int ny, int nz, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for a single 3-D FFT.
`nx' x `ny' x `nz' is the transform size.
`in' is the input buffer, and `out' is the output buffer.
`work1' and `work2' are working spaces.
`in' and `work1' may be the same buffer.
`work2' must be different from both `in' and `work1'.
If `in' and `out' are device-mapped pinned memory,
specify device memory for `work1' and `work2'
for better performance.
`mode' is the memory type of the input and output buffers,
as listed in Table 1.
This function corresponds to the following CUFFT library call:
cufftPlan3d(plan, nz, ny, nx, CUFFT_C2C or CUFFT_Z2Z);
int nufftPlan0d(nufft_plan *plan, int nx, int ny, int nz, void *in, void *out, void *work1, void *work2, int mode);
This function creates a plan for batched 1-D FFTs along dimension Y.
`nx' x `ny' x `nz' is the array size, and `ny' is the transform size.
`in' is the input buffer, and `out' is the output buffer.
`work1' and `work2' are working spaces.
`in' and `work1' may be the same buffer.
`work2' must be different from both `in' and `work1'.
If `in' and `out' are device-mapped pinned memory,
specify device memory for `work1' and `work2'
for better performance.
`mode' is the memory type of the input and output buffers,
as listed in Table 1.
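To illustrate what a transform along dimension Y touches, here is a small sketch. It assumes the array is stored contiguously with X varying fastest, then Y, then Z, which is the layout implied by the CUFFT correspondences given for the 2-D and 3-D plans. The type `y_lines' and both helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the 1-D lines along Y; not a library type. */
struct y_lines {
    size_t stride;  /* distance, in elements, between consecutive y values */
    size_t count;   /* number of independent 1-D transforms of size ny */
};

/* Assumes x varies fastest, then y, then z. */
static struct y_lines y_line_layout(int nx, int ny, int nz)
{
    struct y_lines l;
    assert(nx > 0 && ny > 0 && nz > 0);
    l.stride = (size_t)nx;               /* stepping y by one skips a row of x */
    l.count = (size_t)nx * (size_t)nz;   /* one line per (x, z) pair */
    return l;
}

/* Index of element y on the line identified by (x, z). */
static size_t y_line_element(int nx, int ny, int x, int y, int z)
{
    return (size_t)x + (size_t)nx * ((size_t)y + (size_t)ny * (size_t)z);
}
```

With this layout, each of the `nx' x `nz' lines holds `ny' elements spaced `nx' elements apart.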
int nufftExec(nufft_plan plan, void *in, void *out, void *work1, void *work2, int direction);
This function computes the transforms defined in the `plan'.
The buffers can be different from those specified when
creating the plan. However, the memory types should be
the same.
`direction' is either NUFFT_FORWARD or NUFFT_BACKWARD.
The kernels are launched on stream zero, and this function is non-blocking.
int nufftExecAsync(nufft_plan plan, void *in, void *out, void *work1, void *work2, int direction, CUstream stream);
This function computes the transforms defined in the `plan'.
The buffers can be different from those specified when
creating the plan. However, the memory types should be
the same.
`direction' is either NUFFT_FORWARD or NUFFT_BACKWARD.
In addition to the parameters of nufftExec(), you can specify
a CUDA stream for the FFT kernels. CUstream is the data type for a CUDA stream
in the CUDA driver API. It is equivalent to cudaStream_t of the CUDA runtime API
in CUDA 3.1 or later.
int nufftDestroy(nufft_plan plan);
This function destroys a plan and all resources allocated
for the plan.
int nufftPrecision(int precision);
This function sets the precision for transforms.
It affects all plans created after this call.
Specify 1 for single precision and
2 for double precision.
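The precision also determines how large each buffer must be. The sketch below assumes interleaved complex elements (two floats or two doubles), as suggested by the CUFFT_C2C and CUFFT_Z2Z correspondences in this manual. The helper names are hypothetical and are not library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per complex element for the documented precision values:
 * 1 = single precision (two floats), 2 = double precision (two doubles).
 * Returns 0 for any other value. Hypothetical helper. */
static size_t complex_element_bytes(int precision)
{
    switch (precision) {
    case 1: return 2 * sizeof(float);
    case 2: return 2 * sizeof(double);
    default: return 0;
    }
}

/* Bytes needed for one buffer holding `elements' complex values. */
static size_t buffer_bytes(size_t elements, int precision)
{
    size_t per = complex_element_bytes(precision);
    assert(per != 0);  /* only 1 and 2 are documented */
    return elements * per;
}
```

For example, for a batched 1-D plan the element count would be `nx' x `batch' under this assumption.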
int nufftWisdomMode(int mode);
This function sets the wisdom mode.
Wisdom is a database of auto-tuning results.
By default, the user wisdom file is stored in $HOME/.nufft/.
The available mode flags are listed in Table 2.
Driver API functions
The driver API is essentially the same as the corresponding runtime API.
Table 1. Memory types of input and output buffers.
Input | Output | Mode |
device | device | NUFFT_D2D |
pinned | device | NUFFT_H2D |
device | pinned | NUFFT_D2H |
pinned | pinned | NUFFT_H2H |
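Table 1 can be read as a lookup on the memory kinds of the two buffers. The sketch below encodes it that way. The `mem_kind' enum and the `table1_mode_name' helper are stand-ins invented for this illustration. Real code should pass the library's own NUFFT_* constants as `mode'.

```c
#include <assert.h>

/* Stand-in type for illustration only; not defined by the library. */
enum mem_kind { MEM_DEVICE = 0, MEM_PINNED = 1 };

/* Returns the name of the Table 1 mode for a given pair of buffers. */
static const char *table1_mode_name(enum mem_kind in, enum mem_kind out)
{
    static const char *const names[2][2] = {
        /* out: device  out: pinned */
        { "NUFFT_D2D", "NUFFT_D2H" },   /* in: device */
        { "NUFFT_H2D", "NUFFT_H2H" },   /* in: pinned */
    };
    assert(in == MEM_DEVICE || in == MEM_PINNED);
    assert(out == MEM_DEVICE || out == MEM_PINNED);
    return names[in][out];
}
```

In each mode name, the letter before the `2' describes the input buffer and the letter after it describes the output buffer.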
Table 2. Wisdom modes.
Flag | Meaning | Default |
NUFFT_WISDOM_MODE_SYSTEM_READ | Read the system wisdom file. | Enabled |
NUFFT_WISDOM_MODE_SYSTEM_WRITE | Update the system wisdom file. | Disabled |
NUFFT_WISDOM_MODE_USER_READ | Read the user wisdom file. | Enabled |
NUFFT_WISDOM_MODE_USER_WRITE | Update the user wisdom file. | Enabled |
By default, no system-level wisdom path is defined. You can define one using the environment variable `NUFFT_SYSTEM_WISDOM'.
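An application that wants to report which wisdom locations are in effect could inspect the same settings itself. The following is a rough sketch using only the C standard library. It does not reproduce the library's internal lookup, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Writes "<home>/.nufft/" into buf, the default user wisdom directory
 * described in this manual. Returns 0 on success, or -1 if the result
 * does not fit in buf. Hypothetical helper. */
static int user_wisdom_dir(const char *home, char *buf, size_t size)
{
    int n;
    assert(home != NULL && buf != NULL);
    n = snprintf(buf, size, "%s/.nufft/", home);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Returns the value of NUFFT_SYSTEM_WISDOM, or NULL when the variable
 * is not set (the default, meaning no system-level wisdom path). */
static const char *system_wisdom_setting(void)
{
    return getenv("NUFFT_SYSTEM_WISDOM");
}
```

A caller would typically pass getenv("HOME") as `home' after checking that it is not NULL.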