Say 2 processes are participating. Process 0 (rank 0) has
A = { a d
      b e
      c f }
and process 1 (rank 1) has
A
So first off - and this comes up with MPI and C arrays all the time - you can't really do the standard C two-dimensional array thing. Let's look at this:
A = (char **)calloc(3, sizeof(char *));
for (i=0; i<3; ++i)
{
    A[i] = (char *)calloc(2, sizeof(char));
}
This will definitely allocate a 3x2 array of characters, but you have no idea how the resulting data is laid out in memory. In particular, there's no guarantee at all that A[1][0] immediately follows A[0][1]. That makes it very difficult to create MPI datatypes which span the data structure! You need to allocate 3x2 contiguous bytes, and then make the array point into it:
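char **charalloc2d(int n, int m) {
    /* one contiguous block holding all n*m chars */
    char *data = (char *)calloc(n*m, sizeof(char));
    /* row pointers that index into that block */
    char **array = (char **)calloc(n, sizeof(char *));
    for (int i=0; i<n; i++)
        array[i] = &(data[i*m]);
    return array;
}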
Now we know something about the layout of the array, and can depend on that to build datatypes.
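If you want to convince yourself of the layout, here is a small sketch in plain C (no MPI needed). The names charalloc2d_checked, rows_are_contiguous and charfree2d are made up for this illustration; they are not part of the original code. It allocates the array the same way and checks that each row starts exactly m chars after the previous one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an n x m char array whose rows sit
 * back to back in one block. Returns NULL if allocation fails. */
char **charalloc2d_checked(int n, int m) {
    assert(n > 0 && m > 0);
    char *data = calloc((size_t)n * m, sizeof(char));
    char **array = calloc((size_t)n, sizeof(char *));
    if (data == NULL || array == NULL) {
        free(data);
        free(array);
        return NULL;
    }
    for (int i = 0; i < n; i++)
        array[i] = &data[i * m];   /* row i starts i*m chars into the block */
    return array;
}

/* Hypothetical helper: returns 1 if every row starts exactly m chars
 * after the previous row, 0 otherwise. */
int rows_are_contiguous(char **array, int n, int m) {
    for (int i = 1; i < n; i++)
        if (array[i] != array[i - 1] + m)
            return 0;
    return 1;
}

/* Hypothetical helper: free the data block (reachable through row 0)
 * and then the row pointers. */
void charfree2d(char **array) {
    if (array != NULL) {
        free(array[0]);
        free(array);
    }
}
```

With this layout the address of A[1][0] is exactly one past the address of A[0][m-1], which is the property the datatypes below rely on.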
You're on the right track with the data types --
MPI_Datatype b_col_type;
MPI_Type_vector(3, 1, 1, MPI_CHAR, &b_col_type);
MPI_Type_commit(&b_col_type);
The signature of MPI_Type_vector is (count, blocklen, stride, old_type, *newtype).
We want nrows characters that come in blocks of 1, but they're spaced ncols apart, so that's the stride.
Note that this is really the column type of the A array, rather than B; the type will depend on the number of columns in the array. So each process is using a different sendtype, which is fine.
MPI_Datatype a_col_type;
MPI_Type_vector(nrows, 1, ncols, MPI_CHAR, &a_col_type);
MPI_Type_commit(&a_col_type);
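To see what (count, blocklen, stride) actually selects, here is a rough plain-C emulation. It is only a sketch and does not call MPI; pack_vector is a made-up name, and it just copies the chars that a vector layout with those parameters would cover, assuming a row-major contiguous array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the chars selected by a (count, blocklen,
 * stride) vector layout starting at base into out. Returns the number
 * of chars copied. Illustration only; MPI does this packing itself. */
size_t pack_vector(const char *base, int count, int blocklen, int stride,
                   char *out) {
    assert(count >= 0 && blocklen >= 0);
    size_t k = 0;
    for (int b = 0; b < count; b++)          /* one block per row */
        for (int j = 0; j < blocklen; j++)   /* chars within a block */
            out[k++] = base[(ptrdiff_t)b * stride + j];
    return k;
}
```

For the 3x2 array A stored as a d b e c f, calling pack_vector with count 3, blocklen 1 and stride 2 from the first char picks a, b, c, which is the first column; starting one char later picks d, e, f.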
The final step is the MPI_Gatherv, and here you have to be a little cute. The trick is that we want to send (and receive) several of these columns at a time, that is, several consecutive ones. But we need the next column to start just one char away, not nearly a whole array away, which is where the type's default extent would put it. Luckily, we can do that by setting the upper bound of the datatype to be just one character away from the lower bound, so that the next element starts in the right place. This is allowed by the standard, and in fact one of its examples in section 4.1.4 hinges on it.
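Here is a plain-C sketch of why the extent matters when you send more than one column. It does not use MPI; pack_columns and its extent parameter are made up for illustration. Element k of the send starts k*extent chars after the start of the buffer, and each element picks nrows chars spaced ncols apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack `count` consecutive elements of a column-like
 * type. Each element selects nrows chars spaced ncols apart; element e
 * starts e*extent chars after base. Returns the number of chars packed. */
size_t pack_columns(const char *base, int nrows, int ncols,
                    int extent, int count, char *out) {
    assert(nrows >= 0 && ncols > 0 && count >= 0);
    size_t k = 0;
    for (int e = 0; e < count; e++) {
        const char *start = base + (ptrdiff_t)e * extent;
        for (int r = 0; r < nrows; r++)
            out[k++] = start[(ptrdiff_t)r * ncols];
    }
    return k;
}
```

With an extent of 1 and a count of 2 on the 3x2 array a d b e c f, this packs a b c and then d e f, that is, column 0 followed by column 1. With the column type's default extent, the second element would start past the first column's last char instead of at the top of the next column.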
To do that, we create a resized type that ends just one byte after it starts:
MPI_Type_create_resized(a_col_type, 0, 1*sizeof(char), &new_a_col_type);
MPI_Type_commit(&new_a_col_type);
and similarly for B; and now we can send and receive multiples of these as one would expect. So the following works for me:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
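char **charalloc2d(int n, int m) {
    /* one contiguous block holding all n*m chars */
    char *data = (char *)calloc(n*m, sizeof(char));
    /* row pointers that index into that block */
    char **array = (char **)calloc(n, sizeof(char *));
    for (int i=0; i<n; i++)
        array[i] = &(data[i*m]);
    return array;
}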
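For the receiving side, here is a rough plain-C emulation of where gathered columns land on the root. It is a sketch under stated assumptions, not MPI code: unpack_columns is a made-up name, the root is assumed to hold a 3x3 result array, and rank 1's column values (g, h, i) are invented for the example. As with MPI_Gatherv, the displacement is counted in units of the receive type's extent, so with an extent of 1 char it is simply a column index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: place `count` packed columns from `in` into a
 * row-major array at base. Element e lands (displ + e)*extent chars
 * after base; each element writes nrows chars spaced ncols apart.
 * Returns the number of chars consumed from `in`. */
size_t unpack_columns(char *base, int nrows, int ncols, int extent,
                      int displ, int count, const char *in) {
    assert(nrows >= 0 && ncols > 0 && count >= 0 && displ >= 0);
    size_t k = 0;
    for (int e = 0; e < count; e++) {
        char *start = base + (ptrdiff_t)(displ + e) * extent;
        for (int r = 0; r < nrows; r++)
            start[(ptrdiff_t)r * ncols] = in[k++];
    }
    return k;
}
```

Under those assumptions, rank 0 contributes 2 columns at displacement 0 and rank 1 contributes 1 column at displacement 2, so the receive counts would be {2, 1} and the displacements {0, 2}. Unpacking a b c d e f and then g h i this way fills the 3x3 array with rows a d g, b e h, c f i.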