cufft

Batched FFTs using cufftPlanMany

久未见 submitted on 2019-12-03 21:45:37
I want to perform 441 2D FFTs of size 32-by-32 using the batched method provided by the cuFFT library. The parameters of the transform are the following:
    int n[2] = {32,32};
    int inembed[] = {32,32};
    int onembed[] = {32,32/2+1};
    cufftPlanMany(&plan,2,n,inembed,1,32*32,onembed,1,32*(32/2+1),CUFFT_D2Z,441);
    cufftPlanMany(&inverse_plan,2,n,onembed,1,32*32,inembed,1,32*32,CUFFT_Z2D,441);
After running the forward and inverse FFTs with the above plans, I could not get the original data back. Can anyone advise me how to set the parameters correctly for cufftPlanMany? Many thanks in advance. By the way, is it
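As a rough illustration of the batch distances involved, the sketch below computes how many elements one 32-by-32 transform occupies on the real side and on the complex side. `BatchLayout` and `layout_2d_r2c` are hypothetical helper names, not part of cuFFT. In the excerpt the inverse plan passes 32*32 as its input distance, whereas the complex layout holds 32*(32/2+1) elements per transform; that mismatch is a likely, though unconfirmed, cause of the problem described.

```cpp
// Hypothetical helper type (not part of cuFFT): element counts occupied by
// one transform in a batched 2D real <-> complex layout.
struct BatchLayout {
    int real_dist;     // real elements per transform
    int complex_dist;  // complex elements per transform
};

// For an nx-by-ny real-to-complex transform, the last dimension of the
// complex side holds ny/2 + 1 elements because of Hermitian symmetry.
constexpr BatchLayout layout_2d_r2c(int nx, int ny) {
    return BatchLayout{nx * ny, nx * (ny / 2 + 1)};
}

// Intended use (assumption):
//   forward D2Z plan: idist = real_dist,    odist = complex_dist
//   inverse Z2D plan: idist = complex_dist, odist = real_dist
```

For 32-by-32 transforms this gives 1024 real elements and 544 complex elements per batch entry.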

Not the same image after cuda FFT and iFFT

浪子不回头ぞ submitted on 2019-12-02 17:55:08
Question: I'm trying to perform an FFT -> ramp filtering -> iFFT on a 2D image with CUDA. First, as a test, I tried to do the FFT and iFFT without any filters. After the FFT and the iFFT the image looks the same, but before the operation the pixel values were between 0 and 255, and afterwards the image contains values on the order of 10^7. The test image contains float numbers, and the dimensions are 512 x 360. I perform the FFT with my "cuffSinogram" function, and the iFFT with the "cuInversefftSinogram"
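cuFFT transforms are unnormalized, so a forward transform followed by an inverse one returns the input multiplied by the transform size. The sketch below shows one way to undo that on the host. `normalize_round_trip` is a hypothetical name, and the right value of `transform_size` is an assumption that depends on the plan: 512 * 360 for a full 2D transform, or the row length for a batch of 1D transforms.

```cpp
#include <cstddef>
#include <vector>

// Divide every sample by the transform size to undo the scaling introduced
// by an unnormalized forward + inverse FFT pair.
// `transform_size` is an assumption: nx * ny for a full 2D transform, or
// the line length when the plan is a batch of 1D transforms.
inline void normalize_round_trip(std::vector<float>& data,
                                 std::size_t transform_size) {
    const float scale = 1.0f / static_cast<float>(transform_size);
    for (float& v : data) {
        v *= scale;
    }
}
```

The same scaling could instead be applied on the device, or folded into the filter step between the two transforms.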

CUFFT : How to calculate the fft when the input is a pitched array

◇◆丶佛笑我妖孽 submitted on 2019-11-30 16:20:48
I'm trying to find the FFT of a dynamically allocated array. The input array is copied from host to device using cudaMemcpy2D. Then the FFT is taken (cufftExecR2C) and the results are copied back from device to host. So my initial problem was how to use the pitch information in the FFT. Then I found an answer here: "CUFFT: How to calculate fft of pitched pointer?" Unfortunately it doesn't work; the results I get are garbage values. Given below is my code.
    #define NRANK 2
    #define BATCH 10
    #include "cuda_runtime.h"
    #include "device_launch_parameters.h"
    #include <cufft.h>
    #include <stdio.h>
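One common approach, sketched here under assumptions, is to describe the padded rows to cufftPlanMany through the innermost `inembed` entry, expressed in elements rather than bytes. The helper below only does the byte-to-element conversion. `pitch_to_elements` is a hypothetical name, and whether this matches the linked answer is not confirmed by the excerpt.

```cpp
#include <cstddef>

// Convert a row pitch in bytes (as returned by cudaMallocPitch) into a row
// length in elements. Returns 0 when the pitch is not a whole number of
// elements, in which case the padded rows cannot be described this way.
inline std::size_t pitch_to_elements(std::size_t pitch_bytes,
                                     std::size_t element_size) {
    if (element_size == 0 || pitch_bytes % element_size != 0) {
        return 0;
    }
    return pitch_bytes / element_size;
}
```

For example, a 512-byte pitch with 4-byte floats corresponds to 128 elements per padded row, which would then be the candidate value for the innermost input dimension of the embedding.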

CUFFT : How to calculate the fft when the input is a pitched array

有些话、适合烂在心里 submitted on 2019-11-29 23:34:17
Question: I'm trying to find the FFT of a dynamically allocated array. The input array is copied from host to device using cudaMemcpy2D. Then the FFT is taken (cufftExecR2C) and the results are copied back from device to host. So my initial problem was how to use the pitch information in the FFT. Then I found an answer here: "CUFFT: How to calculate fft of pitched pointer?" Unfortunately it doesn't work; the results I get are garbage values. Given below is my code.
    #define NRANK 2
    #define BATCH 10

Calculating performance of CUFFT

*爱你&永不变心* submitted on 2019-11-28 12:58:41
I am running CUFFT on chunks (N*N/p) divided across multiple GPUs, and I have a question regarding calculating the performance. First, a bit about how I am doing it:
    1. Send N*N/p chunks to each GPU.
    2. Run a batched 1-D FFT for each row on the p GPUs.
    3. Get the N*N/p chunks back to the host and perform a transpose on the entire dataset.
    4. Repeat step 1.
    5. Repeat step 2.
Gflops = ( 1e-9 * 5 * N * N * lg(N*N) ) / execution time, where the execution time is calculated as: execution time = Sum(memcpyHtoD + kernel + memcpyDtoH times for row and col FFT for each GPU). Is this the correct way to evaluate CUFFT performance on multiple GPUs? Is there any
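The formula quoted in the question can be written out as a small helper, which makes the units explicit. This is only a transcription of that formula, assuming `lg` means the base-2 logarithm and the time is in seconds. `fft2d_gflops` is a hypothetical name, and the sketch does not settle whether summing per-GPU times is the right measurement.

```cpp
#include <cmath>

// Nominal GFLOP/s for an N-by-N FFT using the 5 * M * log2(M) operation
// count with M = N * N, as quoted in the question.
// `seconds` is the measured execution time; how it is measured (summed
// per-GPU times versus wall-clock time) is left to the caller.
inline double fft2d_gflops(double n, double seconds) {
    const double points = n * n;
    return 1e-9 * 5.0 * points * std::log2(points) / seconds;
}
```

With N = 1024 and one second of execution time this evaluates to about 0.105 GFLOP/s.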

1D FFTs of columns and rows of a 3D matrix in CUDA

旧城冷巷雨未停 submitted on 2019-11-27 15:18:51
I'm trying to compute batch 1D FFTs using cufftPlanMany. The data set comes from a 3D field, stored in a 1D array, where I want to compute 1D FFTs in the x and y directions. The data is stored as shown in the figure below: contiguous in x, then y, then z. Doing batch FFTs in the x-direction is (I believe) straightforward; with input stride=1, distance=nx and batch=ny * nz, it computes the FFTs over elements {0,1,2,3}, {4,5,6,7}, ..., {28,29,30,31}. However, I can't think of a way to achieve the same for the FFTs in the y-direction. A batch for each xy plane is again straightforward
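To make the y-direction access pattern concrete, the sketch below lists the flat indices that one y-line touches, assuming x-fastest storage with idx = x + nx * (y + ny * z). `y_line_indices` is a hypothetical helper and only illustrates the addressing; it does not call cuFFT.

```cpp
#include <cstddef>
#include <vector>

// Flat indices visited by the 1D transform along y for a fixed (x, z),
// assuming x-fastest storage: idx = x + nx * (y + ny * z).
inline std::vector<std::size_t> y_line_indices(std::size_t nx, std::size_t ny,
                                               std::size_t x, std::size_t z) {
    std::vector<std::size_t> idx(ny);
    const std::size_t start = x + nx * ny * z;  // first element of the line
    for (std::size_t y = 0; y < ny; ++y) {
        idx[y] = start + y * nx;  // consecutive y samples are nx apart
    }
    return idx;
}
```

Assuming nx = 4 and ny = 4, the line at x = 0, z = 0 covers {0, 4, 8, 12}. Within one xy plane the lines start one element apart and have stride nx, which matches the per-plane batch the question mentions; the start of each plane then advances by nx * ny.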

CUFFT error handling

荒凉一梦 submitted on 2019-11-26 17:10:00
Question: I'm using the following macro for CUFFT error handling:
    #define cufftSafeCall(err) __cufftSafeCall(err, __FILE__, __LINE__)
    inline void __cufftSafeCall(cufftResult err, const char *file, const int line)
    {
        if (CUFFT_SUCCESS != err) {
            fprintf(stderr, "cufftSafeCall() CUFFT error in file <%s>, line %i.\n", file, line);
            getch();
            exit(-1);
        }
    }
This macro does not return the message string from an error code. The book "CUDA Programming: a developer's guide to parallel computing with GPUs" suggests

1D FFTs of columns and rows of a 3D matrix in CUDA

百般思念 submitted on 2019-11-26 17:06:20
Question: I'm trying to compute batch 1D FFTs using cufftPlanMany. The data set comes from a 3D field, stored in a 1D array, where I want to compute 1D FFTs in the x and y directions. The data is stored as shown in the figure below: contiguous in x, then y, then z. Doing batch FFTs in the x-direction is (I believe) straightforward; with input stride=1, distance=nx and batch=ny * nz, it computes the FFTs over elements {0,1,2,3}, {4,5,6,7}, ..., {28,29,30,31}. However, I can't think of a way to