There is a dynamically allocated struct:
TYPE Struct
INTEGER :: N
REAL*8 :: A
REAL*8,ALLOCATABLE :: B(:)
END TYPE Struct
To send one element of that struct, simply proceed as usual by creating a structured datatype using MPI_TYPE_CREATE_STRUCT. Depending on how the heap and the stack of the program are located with respect to each other, the offset of B(1) relative to N could end up being a huge positive or negative number, but that is of no issue on most Unix platforms and the number should fit in the range of INTEGER(KIND=MPI_ADDRESS_KIND).
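Here is a minimal sketch for a single record, assuming USE mpi is in effect and using illustrative names such as S, dest and tag:

TYPE(Struct) :: S
INTEGER :: Blocks(3), Types(3), Elem_Type, IError
INTEGER(KIND=MPI_ADDRESS_KIND) :: POS_(3), Offsets(3)

ALLOCATE(S%B(10))

CALL MPI_GET_ADDRESS(S%N,    POS_(1), IError)
CALL MPI_GET_ADDRESS(S%A,    POS_(2), IError)
CALL MPI_GET_ADDRESS(S%B(1), POS_(3), IError)
Offsets = POS_ - POS_(1)

Types  = (/ MPI_INTEGER, MPI_REAL8, MPI_REAL8 /)
Blocks = (/ 1, 1, SIZE(S%B) /)

CALL MPI_TYPE_CREATE_STRUCT(3, Blocks, Offsets, Types, Elem_Type, IError)
CALL MPI_TYPE_COMMIT(Elem_Type, IError)

! The offsets are relative to S%N, so S%N is passed as the send buffer
CALL MPI_SEND(S%N, 1, Elem_Type, dest, tag, MPI_COMM_WORLD, IError)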
Important: Separate instances of the structure will most likely have different offsets of B relative to N, therefore the MPI datatype can only be used for sending the specific record that was used for obtaining the offsets during the construction of the datatype.
When I try to use MPI_TYPE_CREATE_STRUCT to create a derived datatype for such a Struct, it happens that different CPUs create inconsistent derived datatypes. This is because Struct%B(:) might be located in different memory locations relative to the first member Struct%N on different CPUs.
This is a non-issue. The MPI datatypes on both sides of the communication operation must only be congruent, which means that they should consist of the same basic datatypes in the same sequence; the offset of each element is irrelevant. In other words, as long as both the sender and the receiver specify the same types and number of data elements in the MPI_TYPE_CREATE_STRUCT call, the program will function correctly.
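As a quick illustration (assuming the single-record construction sketched above, with each rank building Elem_Type from its own local record), the send and the receive match even if the locally computed offsets differ:

CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, IError)
IF (rank == 0) THEN
   ! Elem_Type here was built from rank 0's own record
   CALL MPI_SEND(S%N, 1, Elem_Type, 1, 0, MPI_COMM_WORLD, IError)
ELSE IF (rank == 1) THEN
   ! Elem_Type here was built from rank 1's own record; its offsets may differ from rank 0's
   CALL MPI_RECV(S%N, 1, Elem_Type, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE, IError)
END IF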
To send more than one element, things get a bit more complicated. There are two solutions:
Use MPI_PACK to serialise the data on the sender side and MPI_UNPACK to deserialise it on the receiver side (a minimal sketch of this approach is given at the end of this answer). As packing and unpacking require additional buffer space, this doubles the memory requirements of the program.
or
Create a separate MPI structure datatype for each record and then create a structure datatype that combines all records. Here is an example of how to send an array of two such structures, one with 10 elements in B and one with 20:
! Assumes USE mpi (for MPI_ADDRESS_KIND, MPI_INTEGER, MPI_REAL8, etc.)
TYPE(Struct) :: Structs(2)
INTEGER :: i, IError
INTEGER :: Blocks(3), Types(3), Elem_Type(2), TwoElem_Type
INTEGER(KIND=MPI_ADDRESS_KIND) :: POS_(3), Offsets(3)

ALLOCATE(Structs(1)%B(10))
ALLOCATE(Structs(2)%B(20))
! (1) Create a separate structure datatype for each record
DO i=1,2
   CALL MPI_GET_ADDRESS(Structs(i)%N,    POS_(1), IError)
   CALL MPI_GET_ADDRESS(Structs(i)%A,    POS_(2), IError)
   CALL MPI_GET_ADDRESS(Structs(i)%B(1), POS_(3), IError)
   Offsets = POS_ - POS_(1)

   Types(1) = MPI_INTEGER
   Types(2) = MPI_REAL8
   Types(3) = MPI_REAL8

   Blocks(1) = 1
   Blocks(2) = 1
   Blocks(3) = i * 10   ! record i has i*10 elements in B

   CALL MPI_TYPE_CREATE_STRUCT(3, Blocks, Offsets, Types, Elem_Type(i), IError)
END DO
! (2) Create a structure of structures that describes the whole array
CALL MPI_GET_ADDRESS(Structs(1)%N, POS_(1), IError)
CALL MPI_GET_ADDRESS(Structs(2)%N, POS_(2), IError)
Offsets = POS_ - POS_(1)
Types(1) = Elem_Type(1)
Types(2) = Elem_Type(2)
Blocks(1) = 1
Blocks(2) = 1
CALL MPI_TYPE_CREATE_STRUCT(2, Blocks, Offsets, Types, TwoElem_Type, IError)
CALL MPI_TYPE_COMMIT(TwoElem_Type, IError)
! (2.1) Free the intermediate datatypes
DO i=1,2
   CALL MPI_TYPE_FREE(Elem_Type(i), IError)
END DO
! (3) Send the array
CALL MPI_SEND(Structs(1)%N, 1, TwoElem_Type, ...)
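Once all communication that uses the combined datatype has completed, it should be freed as well, in the same way as the intermediate datatypes above:

CALL MPI_TYPE_FREE(TwoElem_Type, IError)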
Note that although constructing MPI datatypes is a relatively cheap operation, you should not use the procedure described above to send e.g. 1000000 instances of the structured type. Also, MPI datatype descriptors live in memory managed by the library, so timely deallocation of datatypes that are no longer needed is important.
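For reference, here is a minimal sketch of the MPI_PACK / MPI_UNPACK approach from the first option. Names such as Buffer, BufSize, dest, source and tag are illustrative; the receiver is assumed to know BufSize and to have allocated Buffer accordingly (e.g. via a prior message or MPI_PROBE with MPI_GET_COUNT), and N is assumed to hold the length of B:

! Sender side: pack N, A and B into one contiguous buffer
INTEGER :: Pos, SzInt, SzReal, BufSize, IError
CHARACTER, ALLOCATABLE :: Buffer(:)

CALL MPI_PACK_SIZE(1, MPI_INTEGER, MPI_COMM_WORLD, SzInt, IError)
CALL MPI_PACK_SIZE(1 + SIZE(S%B), MPI_REAL8, MPI_COMM_WORLD, SzReal, IError)
BufSize = SzInt + SzReal
ALLOCATE(Buffer(BufSize))

Pos = 0
CALL MPI_PACK(S%N, 1, MPI_INTEGER, Buffer, BufSize, Pos, MPI_COMM_WORLD, IError)
CALL MPI_PACK(S%A, 1, MPI_REAL8, Buffer, BufSize, Pos, MPI_COMM_WORLD, IError)
CALL MPI_PACK(S%B, SIZE(S%B), MPI_REAL8, Buffer, BufSize, Pos, MPI_COMM_WORLD, IError)
CALL MPI_SEND(Buffer, Pos, MPI_PACKED, dest, tag, MPI_COMM_WORLD, IError)

! Receiver side: unpack in the same order the data was packed
CALL MPI_RECV(Buffer, BufSize, MPI_PACKED, source, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE, IError)
Pos = 0
CALL MPI_UNPACK(Buffer, BufSize, Pos, S%N, 1, MPI_INTEGER, MPI_COMM_WORLD, IError)
CALL MPI_UNPACK(Buffer, BufSize, Pos, S%A, 1, MPI_REAL8, MPI_COMM_WORLD, IError)
ALLOCATE(S%B(S%N))   ! assumes N holds the number of elements in B
CALL MPI_UNPACK(Buffer, BufSize, Pos, S%B, S%N, MPI_REAL8, MPI_COMM_WORLD, IError)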