问题
I wrote a simple test program in order to implement the FFTW with MPI in a 2d domain (with Fortran). The domain is 'Ny x Nx' wide and partitioned in the second ('x') index.
After proper (I believe?) declaration and allocation of variables and plans, I call the fftw_mpi r2c_2d function and next I transform back its output with the fftw_mpi c2r_2d, in order to check if I get the original input. The r2c_2d part seems to work fine. However, I don't get the original input after transforming back the output (apart normalization) with the c2r_2d function: the resulting vector displays 'zeros' at the indices (:,j) with j corresponding to multiples of 'Ny/2'. What am I doing wrong? Thanks!
Here is the extract from the code:
Program TEST
use, intrinsic :: iso_c_binding
Implicit none
include 'mpif.h'
include 'fftw3-mpi.f03'
Integer*8,parameter :: nx=16, ny=16
!MPI
integer*8 :: ipe,npe
integer*8 ::mpi_realtype,icomm=mpi_comm_world,istat(mpi_status_size),ierr
! FFTW VARIABLES DECLARATION
type(C_PTR) :: p1, p2, cdatar, cdatac
integer(C_INTPTR_T) :: alloc_local, local_L, local_L_offset, local_M, local_M_offset
real(C_DOUBLE), pointer :: faux(:,:) ! real input 2d function
complex(C_DOUBLE), pointer :: gaux(:,:) ! complex output of 2d FFTW (transposed)
! MPI initialization
call mpi_init(ierr)
call mpi_comm_rank(icomm,ipe,ierr)
call mpi_comm_size(icomm,npe,ierr)
! FFTW ALLOCATIONS AND PLANS
call fftw_mpi_init()
alloc_local = fftw_mpi_local_size_2d(ny/2+1,nx &
,MPI_COMM_WORLD, local_L, local_L_offset)
cdatac = fftw_alloc_complex(alloc_local)
call c_f_pointer(cdatac, gaux, [nx,local_L]) !transposed
alloc_local = fftw_mpi_local_size_2d(nx,ny/2+1, MPI_COMM_WORLD, &
local_M, local_M_offset)
cdatar = fftw_alloc_real(2*alloc_local)
call c_f_pointer(cdatar, faux, [ny,local_M])
! Create plans
p1 = fftw_mpi_plan_dft_r2c_2d(nx,ny,faux,gaux, MPI_COMM_WORLD, &
ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT))
p2 = fftw_mpi_plan_dft_c2r_2d(nx,ny,gaux,faux, MPI_COMM_WORLD, &
ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_IN))
! EXECUTE FFTW
call random_number(faux)
print *, "real input:", real(faux(1,:))
call fftw_mpi_execute_dft_r2c(p1,faux,gaux)
call fftw_mpi_execute_dft_c2r(p2, gaux, faux)
print *, "real output:", real(faux(1,:))/(nx*ny)
call fftw_destroy_plan(p1)
call fftw_destroy_plan(p2)
call mpi_finalize(ierr)
End Program TEST
回答1:
The issue is due to the padding needed by fftw:
Although the real data is conceptually n0 × n1 × n2 × … × nd-1 , it is physically stored as an n0 × n1 × n2 × … × [2 (nd-1/2 + 1)] array, where the last dimension has been padded to make it the same size as the complex output. This is much like the in-place serial r2c/c2r interface (see Multi-Dimensional DFTs of Real Data), except that in MPI the padding is required even for out-of-place data.
Hence, the input array for a 16x16 transform is therefore a 16x18 array. The value of the extra two numbers at the end of each rows are meaningless in the real space. Yet, these extra numbers must not be forgotten as the c pointer is cast to a fortran 2D array:
call c_f_pointer(cdatar, faux, [2*(ny/2+1),local_M])
The extra numbers are still printed at the end of each row. The array can be sliced to avoid printing these worthless values:
print *, "real input:", real(faux(1:ny,:))
...
print *, "real output:", real(faux(1:ny,:))/(nx*ny)
Here is the complete code, based on yours and the one of How to do a fftw3 MPI "transposed" 2D transform if possible at all? It can be compiled by mpif90 main.f90 -o main -I/usr/include -L/usr/lib -lfftw3_mpi -lfftw3 -lm
and ran by mpirun -np 2 main
.
Program TEST
use, intrinsic :: iso_c_binding
Implicit none
include 'mpif.h'
include 'fftw3-mpi.f03'
Integer*8,parameter :: nx=4, ny=8
!MPI
integer*8 :: ipe,npe
integer*8 ::mpi_realtype,icomm=mpi_comm_world,istat(mpi_status_size),ierr
! FFTW VARIABLES DECLARATION
type(C_PTR) :: p1, p2, cdatar, cdatac
integer(C_INTPTR_T) :: alloc_local, local_L, local_L_offset, local_M, local_M_offset
real(C_DOUBLE), pointer :: faux(:,:) ! real input 2d function
complex(C_DOUBLE), pointer :: gaux(:,:) ! complex output of 2d FFTW (transposed)
! MPI initialization
call mpi_init(ierr)
call mpi_comm_rank(icomm,ipe,ierr)
call mpi_comm_size(icomm,npe,ierr)
! FFTW ALLOCATIONS AND PLANS
call fftw_mpi_init()
alloc_local = fftw_mpi_local_size_2d(ny/2+1,nx &
,MPI_COMM_WORLD, local_L, local_L_offset)
cdatac = fftw_alloc_complex(alloc_local)
call c_f_pointer(cdatac, gaux, [nx,local_L]) !transposed
alloc_local = fftw_mpi_local_size_2d(nx,ny/2+1, MPI_COMM_WORLD, &
local_M, local_M_offset)
cdatar = fftw_alloc_real(2*alloc_local)
call c_f_pointer(cdatar, faux, [2*(ny/2+1),local_M])
! Create plans
p1 = fftw_mpi_plan_dft_r2c_2d(nx,ny,faux,gaux, MPI_COMM_WORLD, &
ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT))
p2 = fftw_mpi_plan_dft_c2r_2d(nx,ny,gaux,faux, MPI_COMM_WORLD, &
ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_IN))
! EXECUTE FFTW
call random_number(faux)
print *, "real input:", real(faux(1:ny,:))
call fftw_mpi_execute_dft_r2c(p1,faux,gaux)
call fftw_mpi_execute_dft_c2r(p2, gaux, faux)
print *, "real output:", real(faux(1:ny,:))/(nx*ny)
call fftw_destroy_plan(p1)
call fftw_destroy_plan(p2)
call mpi_finalize(ierr)
End Program TEST
来源:https://stackoverflow.com/questions/41730864/incorrect-output-via-fftw-mpi-r2c-2d-and-fftw-mpi-c2r-2d