How to do an fftw3 MPI “transposed” 2D transform, if it is possible at all?

悲&欢浪女 2020-12-21 03:30

Consider a 2D transform of size L x M (column-major layout), from a complex array src to a real array tgt. Or, in Fortranese,

complex(C         


        
1 answer
  • 2020-12-21 04:09

    And the answer is:

    For MPI real-data transforms, only two combinations of transposition flag and direction are allowed:

    • real to complex transform and FFTW_MPI_TRANSPOSED_OUT
    • complex to real transform and FFTW_MPI_TRANSPOSED_IN

    I found this while digging through the fftw3 ver. 3.3.4 source: see the comment on line 120 of the file "rdft2-problem.c".
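
    The example further down covers only the second combination (complex to real with FFTW_MPI_TRANSPOSED_IN). For the first one, a mirror-image sketch could look like the code below. It has not been run here; the program name and the variable names (rin, cplx, cin, cout, local_j_offset, local_i_offset) are made up for illustration, and it assumes fftw_mpi_local_size_2d_transposed to get the sizes of both layouts in one call.

    ```fortran
    program r2c_transposed_out
      use, intrinsic :: iso_c_binding
      use MPI

      implicit none
      include 'fftw3-mpi.f03'

      integer(C_INTPTR_T), parameter :: L = 256
      integer(C_INTPTR_T), parameter :: M = 256

      type(C_PTR) :: plan, cin, cout

      real(C_DOUBLE), pointer :: rin(:,:)
      complex(C_DOUBLE_COMPLEX), pointer :: cplx(:,:)

      integer(C_INTPTR_T) :: alloc_local, local_M, local_j_offset, &
                             local_L, local_i_offset
      integer :: ierr

      call mpi_init(ierr)
      call fftw_mpi_init()

      ! One call returns both layouts: the real input is split over M,
      ! the transposed complex output is split over L/2+1.
      alloc_local = fftw_mpi_local_size_2d_transposed(M, L/2+1, MPI_COMM_WORLD, &
           local_M, local_j_offset, local_L, local_i_offset)

      ! Two reals per complex, so the real array gets 2*alloc_local entries
      cin  = fftw_alloc_real(2*alloc_local)
      cout = fftw_alloc_complex(alloc_local)
      call c_f_pointer(cin,  rin,  [2*(L/2+1), local_M])  ! padded first index
      call c_f_pointer(cout, cplx, [M, local_L])          ! transposed output

      plan = fftw_mpi_plan_dft_r2c_2d(M, L, rin, cplx, MPI_COMM_WORLD, &
           ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT))

      ! Fill the input only after planning: FFTW_MEASURE may overwrite the arrays
      rin = 0
      call fftw_mpi_execute_dft_r2c(plan, rin, cplx)

      call fftw_destroy_plan(plan)
      call fftw_free(cin)
      call fftw_free(cout)
      call mpi_finalize(ierr)
    end program r2c_transposed_out
    ```

    The point of the sketch is only the pairing of the r2c planner with FFTW_MPI_TRANSPOSED_OUT; the array shapes follow the same padding rule as in the c2r example.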

    EDIT:

    MINIMAL COMPILABLE AND WORKING EXAMPLE:

    program trashingfftw
      use, intrinsic :: iso_c_binding
      use MPI
    
      implicit none
      include 'fftw3-mpi.f03'
    
      integer(C_INTPTR_T), parameter :: L = 256
      integer(C_INTPTR_T), parameter :: M = 256
    
      type(C_PTR) :: plan, ctgt, csrc
    
      complex(C_DOUBLE_COMPLEX), pointer :: src(:,:)
      real(C_DOUBLE), pointer :: tgt(:,:)
    
      integer(C_INTPTR_T) :: alloc_local, local_M, &
                             & local_L,local_offset1,local_offset2
    
      integer :: ierr,id
    
    
      call mpi_init(ierr)
    
      call mpi_comm_rank(MPI_COMM_WORLD,id,ierr)
    
      call fftw_mpi_init()
    
    
      alloc_local = fftw_mpi_local_size_2d(L/2+1,M, MPI_COMM_WORLD, &
           local_l, local_offset1)
    
      print *, id, "alloc complex=",alloc_local, local_l
    
      csrc = fftw_alloc_complex(alloc_local)
      call c_f_pointer(csrc, src, [M,local_l])
    
      ! Caveat: the real storage must be partitioned according to the complex layout,
      ! which is why I use M and L/2+1 instead of M, 2*(L/2+1) as in the original post
      alloc_local = fftw_mpi_local_size_2d(M,L/2+1, MPI_COMM_WORLD, &
          &                               local_M, local_offset2)
    
      print *, id, "alloc real=",alloc_local, local_m
      ! Two reals per complex
      ctgt = fftw_alloc_real(2*alloc_local)
      ! Only the first L entries of each column are relevant, the rest is padding (see fftw3 docs)
      ! Caveat: since the padding is in the first index, the 2D data is laid out non-contiguously
      ! (L sensible reals, padding, padding, L sensible reals, padding, padding, ...)
      call c_f_pointer(ctgt, tgt, [2*(L/2+1),local_m])
    
    
      plan =  fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & 
           ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_IN))
    
      ! Should print T, i.e. the plan is non-null (printing a C_PTR directly is not portable)
      print *, 'plan created:', c_associated(plan)
    
      ! Fill the input only after planning: FFTW_MEASURE may overwrite the arrays
      src = 0
      src(3,2) = (1.,0)
      call fftw_mpi_execute_dft_c2r(plan, src, tgt) 
    
      call mpi_finalize(ierr)
    end program trashingfftw
    