Question
I have to send and receive (MPI) a chunk of a multi-dimensional array in FORTRAN 90. The line
MPI_Send(x(2:5,6:8,1),12,MPI_Real,....)
is not supposed to be used, as per the book "Using MPI..." by Gropp, Lusk, and Skjellum. What is the best way to do this? Do I have to create a temporary array and send it, or use MPI_Type_Create_Subarray or something like that?
Answer 1:
The reason not to use array sections with MPI_SEND is that with some MPI implementations the compiler has to create a temporary copy. This is due to the fact that Fortran can only properly pass array sections to subroutines with explicit interfaces and has to generate temporary "flattened" copies in all other cases, usually on the stack of the calling subroutine. Unfortunately, before the TR 29113 extension to Fortran 2008 there is no way in Fortran to declare subroutines that take arguments of variable type, so MPI implementations usually resort to language hacks, e.g. MPI_Send is entirely implemented in C and relies on Fortran always passing the data as a pointer.
Some MPI libraries work around this issue by generating a huge number of overloads for MPI_SEND:
- one that takes a single INTEGER
- one that takes a 1-d array of INTEGER
- one that takes a 2-d array of INTEGER
- and so on
The same is then repeated for CHARACTER, LOGICAL, DOUBLE PRECISION, etc. This is still a hack as it does not cover cases where one passes a user-defined type. Furthermore, it greatly complicates the C implementation, which now has to understand the Fortran array descriptors, which are very compiler-specific.
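A rough, hypothetical sketch of what such a set of overloads can look like (the specific procedure names are made up for illustration and do not belong to any real MPI library):
INTERFACE MPI_SEND
  ! one specific procedure per type/rank combination
  SUBROUTINE mpi_send_int_scalar(buf, count, datatype, dest, tag, comm, ierror)
    INTEGER, INTENT(IN)  :: buf
    INTEGER, INTENT(IN)  :: count, datatype, dest, tag, comm
    INTEGER, INTENT(OUT) :: ierror
  END SUBROUTINE
  SUBROUTINE mpi_send_int_1d(buf, count, datatype, dest, tag, comm, ierror)
    INTEGER, INTENT(IN)  :: buf(*)
    INTEGER, INTENT(IN)  :: count, datatype, dest, tag, comm
    INTEGER, INTENT(OUT) :: ierror
  END SUBROUTINE
  ! ... repeated for 2-d, 3-d, ... and for CHARACTER, LOGICAL, DOUBLE PRECISION, etc.
END INTERFACE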
Fortunately, times are changing. The TR 29113 extension to Fortran 2008 includes two new features:
- assumed-type arguments: TYPE(*)
- assumed-rank arguments: DIMENSION(..)
The combination of both, i.e. TYPE(*), DIMENSION(..), INTENT(IN) :: buf, describes an argument that can both be of varying type and have any rank. This is already being taken advantage of in the new mpi_f08 interface in MPI-3.
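A minimal sketch of a declaration that uses both features (the subroutine name here is made up, not an actual MPI binding):
INTERFACE
  SUBROUTINE generic_send(buf, count, datatype)
    TYPE(*), DIMENSION(..), INTENT(IN) :: buf   ! any type, any rank
    INTEGER, INTENT(IN) :: count, datatype
  END SUBROUTINE generic_send
END INTERFACE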
Non-blocking calls present bigger problems in Fortran that go beyond what Alexander Vogt has described. The reason is that Fortran does not have the concept of suppressing compiler optimisations (i.e. there is no volatile keyword in Fortran). The following code might not run as expected:
USE mpi
INTEGER :: data, req, ierr
data = 10
CALL MPI_IRECV(data, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, req, ierr)
! data is not used here
! ...
CALL MPI_WAIT(req, MPI_STATUS_IGNORE, ierr)
! data is used here
One might expect that after the call to MPI_WAIT, data would contain the value received from rank 0, but this might very well not be the case. The reason is that the compiler cannot know that data might change asynchronously after MPI_IRECV returns, and it might therefore keep the value in a register instead of reloading it from memory. That's why non-blocking MPI calls are generally considered dangerous in Fortran.
TR 29113 has a solution for that second problem too, with the ASYNCHRONOUS attribute. If you take a look at the mpi_f08 definition of MPI_IRECV, its buf argument is declared as:
TYPE(*), DIMENSION(..), INTENT(OUT), ASYNCHRONOUS :: buf
Even if buf is a scalar argument, i.e. no temporary copy is created, a TR 29113 compliant compiler would not resort to register optimisations for the buffer argument.
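On the calling side the same attribute can be given to the user's buffer. A minimal sketch, assuming the mpi_f08 module of MPI-3 is available:
USE mpi_f08
INTEGER, ASYNCHRONOUS :: data   ! may change outside the normal flow of control
TYPE(MPI_Request) :: req
data = 10
CALL MPI_Irecv(data, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, req)
! data must not be touched here
CALL MPI_Wait(req, MPI_STATUS_IGNORE)
! with ASYNCHRONOUS the compiler may not keep data in a register across the wait,
! so data is reloaded from memory and holds the received value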
Answer 2:
EDIT: As Hristo Iliev pointed out, MPI_Send is always blocking, but the implementation might choose to send the data asynchronously. From here:
MPI_Send will not return until you can use the send buffer.
Non-blocking communications (like MPI_Isend) might pose a problem with Fortran when non-contiguous arrays are involved. In that case, the compiler creates a temporary array for the dummy variable and passes it to the subroutine. Once the subroutine has finished, the compiler is at liberty to free the memory of that copy.
That's fine as long as you use blocking communication (MPI_Send), because by the time the subroutine returns the send buffer may safely be reused. For non-blocking communication (MPI_Isend), however, the temporary array is the send buffer, and the subroutine returns before the data has actually been sent.
So it might happen that MPI sends data from a memory location that no longer holds valid data.
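A hypothetical illustration of that hazard, using the array section from the question (dest, tag, req and ierr are assumed to be declared elsewhere):
CALL MPI_Isend(x(2:5,6:8,1), 12, MPI_REAL, dest, tag, MPI_COMM_WORLD, req, ierr)
! the compiler may have passed a temporary copy of x(2:5,6:8,1) and is free
! to deallocate it here, before the transfer has completed
CALL MPI_Wait(req, MPI_STATUS_IGNORE, ierr)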
So, either you create a copy yourself (so that your send buffer is contiguous in memory), or you create a sub-array datatype (i.e. you tell MPI the addresses in memory of the elements you want to send). There are further alternatives out there, like MPI_Pack, but I have no experience with them.
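For the section from the question, the sub-array route could look roughly like this (a sketch only: the full extents of x are assumed to be 10x10x10 here, and dest and tag are placeholders):
INTEGER :: subtype, ierr
INTEGER, DIMENSION(3) :: sizes, subsizes, starts
sizes    = (/ 10, 10, 10 /)   ! full extents of x (assumed)
subsizes = (/  4,  3,  1 /)   ! shape of x(2:5,6:8,1)
starts   = (/  1,  5,  0 /)   ! zero-based start offsets
CALL MPI_Type_create_subarray(3, sizes, subsizes, starts, MPI_ORDER_FORTRAN, &
                              MPI_REAL, subtype, ierr)
CALL MPI_Type_commit(subtype, ierr)
CALL MPI_Send(x, 1, subtype, dest, tag, MPI_COMM_WORLD, ierr)
CALL MPI_Type_free(subtype, ierr)
MPI_Send is then given the base address of x together with the derived datatype, so no Fortran-level copy of the section is needed.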
Which way is faster? Well, that depends:
- On the actual implementation of your MPI library
- On the data and its distribution
- On your compiler
- On your hardware
See here for a detailed explanation and further options.
Source: https://stackoverflow.com/questions/19455051/using-mpi-send-recv-to-handle-chunk-of-multi-dim-array-in-fortran-90