Is there an FFT method that will run inside a CUDA kernel?

Question

I am currently converting a C++ program into CUDA code, and part of my program runs a fast Fourier transform. Originally I used FFTW, but I saw that I couldn't call it from inside a kernel, so I rewrote that part using cuFFT, and it gives me the same error!

Is there any FFT implementation that will run inside a CUDA kernel?

Can I just add __device__ to the FFTW library functions?

I would like to avoid having to initialize or call the FFT on the host. I want a function that runs entirely on the GPU, if one exists.

Answer 1:

Are you sure you need to avoid a launch from the host? NVIDIA's cuFFT library is pretty good these days. Porting FFTW seems like a pretty hard task. You might have an easier time porting kissfft, but it is still not going to be easy.

Answer 2:

If you want to incorporate the FFT into a kernel, it looks like you are trying to perform several FFTs at once. I would look into the batch processing features in cuFFT. What is your application? cufftPlanMany() handles batched FFTs in many different memory layouts.
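As a rough illustration of the batched approach, here is a minimal sketch that runs many 1D complex FFTs with a single plan. The helper name `batched_fft_inplace` and the contiguous layout (signal `b` starts at `d_data + b * n`) are assumptions made for this example, not something from the question. The data stays in device memory the whole time; only the plan creation and the launch happen on the host.

```cuda
#include <cufft.h>
#include <cuda_runtime.h>

// Hypothetical helper: runs `batch` independent forward C2C FFTs of length n,
// in place, on data that already lives in device memory.
// Assumes the signals are stored back to back (signal b at d_data + b * n).
// Returns true on success.
bool batched_fft_inplace(cufftComplex* d_data, int n, int batch)
{
    cufftHandle plan;
    int dims[1] = { n };  // rank-1 transform of length n

    // Passing NULL for inembed/onembed selects the default contiguous layout;
    // stride 1 between elements, distance n between consecutive signals.
    if (cufftPlanMany(&plan, 1, dims,
                      NULL, 1, n,
                      NULL, 1, n,
                      CUFFT_C2C, batch) != CUFFT_SUCCESS) {
        return false;
    }

    // Launched from the host, but input and output never leave the GPU.
    cufftResult r = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    cudaError_t e = cudaDeviceSynchronize();

    cufftDestroy(plan);
    return r == CUFFT_SUCCESS && e == cudaSuccess;
}
```

You would call this between your own kernels, passing the same device pointer they use, and link with `-lcufft`. In real code you would probably create the plan once and reuse it rather than rebuilding it on every call.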

Answer 3:

There is NO way to call the cuFFT APIs from a GPU kernel. You must call them from the host. If you want to run an FFT without going DEVICE -> HOST -> DEVICE before continuing your processing, I think the only solution is to write a kernel that performs the FFT in a device function. I'm actually doing this myself, because I need to run several FFTs in parallel without passing the data back to the HOST. If you find or have another solution, let me know.
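To make the "FFT in a device function" idea concrete, here is a minimal sketch, not the code this answer's author actually wrote. It assumes power-of-two lengths and assigns one whole signal to each thread, which keeps the code short but is far from optimal. The names `fft_radix2` and `fft_many` are made up for this example.

```cuda
#include <cuda_runtime.h>
#include <math_constants.h>  // CUDART_PI_F

// In-place iterative radix-2 forward FFT, executed entirely by one thread.
// Assumption: n is a power of two.
__device__ void fft_radix2(float2* x, int n)
{
    // Bit-reversal permutation so the butterflies can run in place.
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) { float2 t = x[i]; x[i] = x[j]; x[j] = t; }
    }

    // log2(n) butterfly stages; len is the current sub-transform size.
    for (int len = 2; len <= n; len <<= 1) {
        float ang = -2.0f * CUDART_PI_F / len;  // negative sign: forward transform
        for (int start = 0; start < n; start += len) {
            for (int k = 0; k < len / 2; ++k) {
                float s, c;
                sincosf(ang * k, &s, &c);  // twiddle factor c + i*s
                float2 u = x[start + k];
                float2 v = x[start + k + len / 2];
                // w = v * twiddle (complex multiplication)
                float2 w = make_float2(v.x * c - v.y * s, v.x * s + v.y * c);
                x[start + k]           = make_float2(u.x + w.x, u.y + w.y);
                x[start + k + len / 2] = make_float2(u.x - w.x, u.y - w.y);
            }
        }
    }
}

// One thread per signal; signal b is assumed to start at data + b * n.
__global__ void fft_many(float2* data, int n, int batch)
{
    int b = blockIdx.x * blockDim.x + threadIdx.x;
    if (b < batch) fft_radix2(data + (size_t)b * n, n);
}
```

A launch such as `fft_many<<<(batch + 127) / 128, 128>>>(d_data, n, batch);` transforms every signal, and because `fft_radix2` is a `__device__` function you can also call it from the middle of your own kernel. Since each thread walks its own signal, memory accesses are not coalesced, so expect this to be slower than batched cuFFT for anything but small transforms.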

Source: https://stackoverflow.com/questions/11587160/is-there-a-method-of-fft-that-will-run-inside-cuda-kernel

Tags

cuda

fft
