MPI Fortran code: how to share data on node via openMP?

醉梦人生  2020-12-09 19:50

I am working on a Fortran code that already uses MPI.

Now I am facing a situation where a set of data grows very large but is the same for every process, so I would prefer to store it in memory only once per node and have all processes on a node access the same copy.

3 Answers
  • 2020-12-09 20:40

    In the spirit of adding Fortran shared memory MPI examples, I'd like to extend ftiaronsem's code to incorporate a loop so that the behavior of MPI_Win_fence and MPI_Barrier is clearer (at least it is for me now, anyway).

    Specifically, try running the code with either or both of the MPI_Win_fence and MPI_Barrier calls in the loop commented out to see the effect, or reverse their order.

    Removing the MPI_Win_fence allows the write statement to display memory that has not been updated yet.

    Removing the MPI_Barrier allows other processes to race ahead to the next iteration and change the memory before a process has had the chance to write it out.

    The previous answers really helped me implement the shared memory paradigm in my MPI code. Thanks.

    program sharedmemtest
      USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
      use mpi
      implicit none
      integer, parameter :: dp = selected_real_kind(14,200)
      integer :: win,hostcomm,hostrank
      INTEGER(KIND=MPI_ADDRESS_KIND) :: windowsize
      INTEGER :: disp_unit,my_rank,ierr,total, i
      TYPE(C_PTR) :: baseptr
      real(dp), POINTER :: matrix_elementsy(:,:,:,:)
      integer,allocatable :: arrayshape(:)
    
      call MPI_INIT( ierr )
    
      call MPI_COMM_RANK(MPI_COMM_WORLD,my_rank, ierr)  !GET THE RANK OF ONE PROCESS
      call MPI_COMM_SIZE(MPI_COMM_WORLD,total,ierr)  !GET THE TOTAL PROCESSES OF THE COMM
      CALL MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, hostcomm,ierr)
      CALL MPI_Comm_rank(hostcomm, hostrank,ierr)
    
      ! Gratefully based on: http://stackoverflow.com/questions/24797298/mpi-fortran-code-how-to-share-data-on-node-via-openmp
      ! and https://gcc.gnu.org/onlinedocs/gfortran/C_005fF_005fPOINTER.html
      ! We only want one process per host to allocate memory
      ! Set size to 0 in all processes but one
      allocate(arrayshape(4))
      arrayshape=(/ 10,10,10,10 /)
      if (hostrank == 0) then
         windowsize = int(10**4,MPI_ADDRESS_KIND)*8_MPI_ADDRESS_KIND !*8 for double ! Put the actual data size here
      else
         windowsize = 0_MPI_ADDRESS_KIND
      end if
      disp_unit = 1
      CALL MPI_Win_allocate_shared(windowsize, disp_unit, MPI_INFO_NULL, hostcomm, baseptr, win, ierr)
    
      ! Obtain the location of the memory segment
      if (hostrank /= 0) then
         CALL MPI_Win_shared_query(win, 0, windowsize, disp_unit, baseptr, ierr)
      end if
    
      ! baseptr can now be associated with a Fortran pointer
      ! and thus used to access the shared data
      CALL C_F_POINTER(baseptr, matrix_elementsy,arrayshape)
    
      !!! your code here!
      !!! sample below
      if (hostrank == 0) then
         matrix_elementsy=0.0_dp
      endif
      call MPI_WIN_FENCE(0, win, ierr)
      do i=1, 15
         if (hostrank == 0) then
            matrix_elementsy(1,2,3,4)=i * 1.0_dp
            matrix_elementsy(1,2,2,4)=i * 2.0_dp
         elseif ((hostrank > 5) .and. (hostrank < 11)) then  ! code for some non-root ranks to do something different
            matrix_elementsy(1,2,hostrank,4) = hostrank * 1.0_dp * i
         endif
         call MPI_WIN_FENCE(0, win, ierr)
         write(*,'(A, I4, I4, 10F7.1)') "my_rank=",my_rank, i, matrix_elementsy(1,2,:,4)
         call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      enddo
      !!! end sample code
    
      call MPI_WIN_FENCE(0, win, ierr)
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      call MPI_Win_free(win,ierr)
      call MPI_FINALIZE(IERR)
    
    end program sharedmemtest
    
  • 2020-12-09 20:49

    With this answer I want to add a complete running code example (for ifort 15 and mvapich 2.1). The MPI shared memory concept is still pretty new and in particular for Fortran there aren't many code examples out there. It is based on the answer from Hristo and a very useful email on the mvapich mailing list (http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2014-June/005003.html).

    The code example is based on the problems I ran into and adds to Hristo's answer in the following ways:

    • uses mpi instead of mpi_f08 (some libraries do not provide a full Fortran 2008 interface yet)
    • has ierr added to the respective MPI calls
    • calculates the window size explicitly as elements * element size
    • shows how to use C_F_POINTER to map the shared memory onto a multi-dimensional array
    • reminds you to call MPI_WIN_FENCE after modifying the shared memory
    • adds the extra MPI_BARRIER that Intel MPI (5.0.1.035) needs after the MPI_WIN_FENCE, since the fence only guarantees that "between two MPI_Win_fence calls, all RMA operations are completed" (https://software.intel.com/en-us/blogs/2014/08/06/one-sided-communication); see the snippet right after this list
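
    A minimal sketch of that synchronization pattern, matching the end of the program below (whether the extra barrier is strictly required depends on your MPI implementation):

      call MPI_WIN_FENCE(0, win, ierr)        ! make stores to the shared window visible to the other ranks
      call MPI_BARRIER(MPI_COMM_WORLD, ierr)  ! extra sync some implementations (e.g. Intel MPI) need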

    Kudos go to Hristo and Michael Rachner.

    program sharedmemtest
      USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
      use mpi
      implicit none
      integer, parameter :: dp = selected_real_kind(14,200)
      integer :: win,hostcomm,hostrank
      INTEGER(KIND=MPI_ADDRESS_KIND) :: windowsize
      INTEGER :: disp_unit,my_rank,ierr,total
      TYPE(C_PTR) :: baseptr
      real(dp), POINTER :: matrix_elementsy(:,:,:,:)
      integer,allocatable :: arrayshape(:)
    
      call MPI_INIT( ierr )
    
      call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)  ! get the rank of this process
      call MPI_COMM_SIZE(MPI_COMM_WORLD, total, ierr)    ! get the total number of processes
      CALL MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, hostcomm, ierr)
      CALL MPI_Comm_rank(hostcomm, hostrank, ierr)
    
      ! Gratefully based on: http://stackoverflow.com/questions/24797298/mpi-fortran-code-how-to-share-data-on-node-via-openmp
      ! and https://gcc.gnu.org/onlinedocs/gfortran/C_005fF_005fPOINTER.html
      ! We only want one process per host to allocate memory
      ! Set size to 0 in all processes but one
      allocate(arrayshape(4))
      arrayshape=(/ 10,10,10,10 /)
      if (hostrank == 0) then
         windowsize = int(10**4,MPI_ADDRESS_KIND)*8_MPI_ADDRESS_KIND ! *8 for double precision; put the actual data size here
      else
         windowsize = 0_MPI_ADDRESS_KIND
      end if
      disp_unit = 1
      CALL MPI_Win_allocate_shared(windowsize, disp_unit, MPI_INFO_NULL, hostcomm, baseptr, win, ierr)
    
      ! Obtain the location of the memory segment
      if (hostrank /= 0) then
         CALL MPI_Win_shared_query(win, 0, windowsize, disp_unit, baseptr, ierr)
      end if
    
      ! baseptr can now be associated with a Fortran pointer
      ! and thus used to access the shared data
      CALL C_F_POINTER(baseptr, matrix_elementsy, arrayshape)
    
      !!! your code here!
      !!! sample below
    
      if (hostrank == 0) then
         matrix_elementsy=0.0_dp
         matrix_elementsy(1,2,3,4)=1.0_dp
      end if
      CALL MPI_WIN_FENCE(0, win, ierr)
    
      print *,"my_rank=",my_rank,matrix_elementsy(1,2,3,4),matrix_elementsy(1,2,3,5)
    
      !!! end sample code
    
      call MPI_WIN_FENCE(0, win, ierr)
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      call MPI_Win_free(win,ierr)
      call MPI_FINALIZE(ierr)
    
    end program sharedmemtest
    
  • 2020-12-09 20:52

    You don't need to implement a hybrid MPI+OpenMP code if it is only for sharing a chunk of data. What you actually have to do is:

    1) Split the world communicator into groups that span the same host/node. That is really easy if your MPI library implements MPI-3.0 - all you need to do is call MPI_COMM_SPLIT_TYPE with split_type set to MPI_COMM_TYPE_SHARED:

    USE mpi_f08
    
    TYPE(MPI_Comm) :: hostcomm
    
    CALL MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
                             MPI_INFO_NULL, hostcomm)
    

    MPI-2.2 or earlier does not provide the MPI_COMM_SPLIT_TYPE operation and one has to get somewhat creative. You could for example use my simple split-by-host implementation that can be found on GitHub here; a rough fallback is also sketched right below.
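
    As an illustration, here is a minimal sketch of such a fallback (my own sketch, not the linked implementation): it derives a color from MPI_GET_PROCESSOR_NAME with a simple hash and passes it to MPI_COMM_SPLIT. It uses the old mpi module, since mpi_f08 did not exist in MPI-2.2. Caveat: two distinct host names may hash to the same color, which this sketch does not detect.

    ! Fallback for MPI-2.2: group ranks by host via a hash of the host name
    subroutine split_by_host(comm, hostcomm, ierr)
      use mpi
      implicit none
      integer, intent(in)  :: comm
      integer, intent(out) :: hostcomm, ierr
      character(len=MPI_MAX_PROCESSOR_NAME) :: hostname
      integer :: namelen, color, i

      call MPI_Get_processor_name(hostname, namelen, ierr)

      ! Fold the host name into a non-negative integer color
      color = 0
      do i = 1, namelen
         color = mod(color * 31 + ichar(hostname(i:i)), 2**20)
      end do

      ! Ranks with the same color (i.e. the same host) end up in the same communicator
      call MPI_Comm_split(comm, color, 0, hostcomm, ierr)
    end subroutine split_by_host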

    2) Now that processes that reside on the same node are part of the same communicator hostcomm, they can create a block of shared memory and use it to exchange data. Again, MPI-3.0 provides a (relatively) easy and portable way to do that:

    USE mpi_f08
    USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
    
    INTEGER :: hostrank
    
    INTEGER(KIND=MPI_ADDRESS_KIND) :: size
    INTEGER :: disp_unit
    TYPE(C_PTR) :: baseptr
    TYPE(MPI_Win) :: win
    
    TYPE(MY_DATA_TYPE), POINTER :: shared_data
    
    ! We only want one process per host to allocate memory
    ! Set size to 0 in all processes but one
    CALL MPI_Comm_rank(hostcomm, hostrank)
    if (hostrank == 0) then
       size = 10000000 ! Put the actual data size here
    else
       size = 0
    end if
    disp_unit = 1
    CALL MPI_Win_allocate_shared(size, disp_unit, MPI_INFO_NULL, &
                                 hostcomm, baseptr, win)
    
    ! Obtain the location of the memory segment
    if (hostrank /= 0) then
       CALL MPI_Win_shared_query(win, 0, size, disp_unit, baseptr)
    end if
    
    ! baseptr can now be associated with a Fortran pointer
    ! and thus used to access the shared data
    CALL C_F_POINTER(baseptr, shared_data)
    
    ! Use shared_data as if it was ALLOCATE'd
    ! ...
    
    ! Destroy the shared memory window
    CALL MPI_Win_free(win)
    

    The way that code works is that it uses the MPI-3.0 functionality for allocating shared memory windows. MPI_WIN_ALLOCATE_SHARED allocates a chunk of shared memory in each process. Since you want to share one block of data, it only makes sense to allocate it in a single process and not have it spread across the processes, therefore size is set to 0 for all ranks but one while making the call. MPI_WIN_SHARED_QUERY is used to find out the address at which that shared memory block is mapped in the virtual address space of the calling process. Once the address is known, the C pointer can be associated with a Fortran pointer using the C_F_POINTER() subroutine, and the latter can be used to access the shared memory. Once done, the shared memory has to be freed by destroying the shared memory window with MPI_WIN_FREE.

    MPI-2.2 or earlier does not provide shared memory windows. In that case one has to use OS-dependent APIs to create shared memory blocks, e.g. the standard POSIX sequence shm_open() / ftruncate() / mmap(). A utility C function callable from Fortran has to be written in order to perform those operations. See that code for some inspiration; a sketch of the Fortran side of such a binding follows below. The void * returned by mmap() can be passed directly to the Fortran code in a C_PTR variable, which can then be associated with a Fortran pointer.
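
    As an illustration only, this is what the Fortran side of such a binding could look like. The helper name and signature (void *map_shared_block(const char *name, size_t size)) are invented for this sketch, and the C implementation built on shm_open()/ftruncate()/mmap() is not shown:

    module shm_binding
      use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_char
      implicit none
      interface
         ! Hypothetical C helper wrapping shm_open()/ftruncate()/mmap();
         ! it returns the void* from mmap() as a TYPE(C_PTR)
         function map_shared_block(name, size) bind(C, name="map_shared_block")
           import :: c_ptr, c_size_t, c_char
           character(kind=c_char), dimension(*), intent(in) :: name
           integer(c_size_t), value :: size
           type(c_ptr) :: map_shared_block
         end function map_shared_block
      end interface
    end module shm_binding

    The returned C_PTR can then be associated with a Fortran pointer via C_F_POINTER, exactly as in the MPI-3.0 examples above.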
