What is the main difference between the MPI_Allgather and MPI_Alltoall functions in MPI?
I mean, can someone give me examples where MPI_Allgather will be helpful a
While these two methods are indeed very similar, there seems to be one crucial difference between the two of them.
MPI_Allgather ends with each process having the exact same data in its receive buffer, and each process contributes a single value to the overall array. For example, if each of a set of processes needed to share some single value about their state with everyone else, each would provide their single value. These values would then be sent to everyone, so everyone would have a copy of the same structure.
MPI_Alltoall does not send the same values to each other process. Instead of providing a single value that should be shared with each other process, each process specifies one value to give to each other process. In other words, with n processes, each must specify n values to share. Then, for each process j, its k'th value will be sent to the j'th index of process k's receive buffer. This is useful if each process has a single, unique message for each other process.
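To make the index mapping concrete, here is a minimal sketch in plain C that only simulates the data movement of the two collectives; it does not call MPI at all. The function names `sim_allgather` and `sim_alltoall` are made up for illustration, and row r of each flat array stands in for rank r's buffer. The MPI calls named in the comments are what each loop is meant to mirror, with `val` and `comm` as placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Simulation only, no MPI library involved.
 * send_vals[j] is the single value contributed by rank j.
 * recv holds n rows of n chars; row k is rank k's receive buffer.
 * Mirrors MPI_Allgather(&val, 1, MPI_CHAR, recv, 1, MPI_CHAR, comm):
 * every rank ends up with the same array. */
void sim_allgather(const char *send_vals, char *recv, size_t n)
{
    assert(send_vals != NULL && recv != NULL);
    for (size_t k = 0; k < n; ++k)        /* receiving rank */
        for (size_t j = 0; j < n; ++j)    /* sending rank */
            recv[k * n + j] = send_vals[j];
}

/* send holds n rows of n chars; send[j * n + k] is the value that
 * rank j addresses to rank k.
 * Mirrors MPI_Alltoall(send, 1, MPI_CHAR, recv, 1, MPI_CHAR, comm):
 * rank j's k'th value lands at index j of rank k's receive buffer. */
void sim_alltoall(const char *send, char *recv, size_t n)
{
    assert(send != NULL && recv != NULL && send != recv);
    for (size_t k = 0; k < n; ++k)        /* receiving rank */
        for (size_t j = 0; j < n; ++j)    /* sending rank */
            recv[k * n + j] = send[j * n + k];
}
```

The only difference between the two loops is the source index: allgather reads the same `send_vals[j]` for every receiver k, while alltoall reads a different element `send[j * n + k]` for each receiver.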
As a final note, the results of running MPI_Allgather and MPI_Alltoall would be the same in the case where each process filled its MPI_Alltoall send buffer with n copies of the same value. The only difference would be that MPI_Allgather would likely be much more efficient.
These two screenshots give a quick explanation:
[Image: MPI_Allgatherv]
[Image: MPI_Alltoallv]
Although this is a comparison between MPI_Allgatherv and MPI_Alltoallv, it also explains how MPI_Allgather differs from MPI_Alltoall.
A picture says more than a thousand words, so here are several ASCII art pictures:
rank    send buf                        recv buf
----    --------                        --------
 0      a,b,c         MPI_Allgather     a,b,c,A,B,C,#,@,%
 1      A,B,C        ---------------->  a,b,c,A,B,C,#,@,%
 2      #,@,%                           a,b,c,A,B,C,#,@,%
This is just the regular MPI_Gather, only in this case all processes receive the data chunks, i.e. the operation is root-less.
rank    send buf                        recv buf
----    --------                        --------
 0      a,b,c          MPI_Alltoall     a,A,#
 1      A,B,C        ---------------->  b,B,@
 2      #,@,%                           c,C,%
(a more elaborate case with two elements sent to each process)
rank    send buf                        recv buf
----    --------                        --------
 0      a,b,c,d,e,f    MPI_Alltoall     a,b,A,B,#,@
 1      A,B,C,D,E,F  ---------------->  c,d,C,D,%,$
 2      #,@,%,$,&,*                     e,f,E,F,&,*
(looks better if each element is coloured by the rank that sends it but...)
MPI_Alltoall works as a combined MPI_Scatter and MPI_Gather: the send buffer in each process is split as in MPI_Scatter, and then each column of chunks is gathered by the process whose rank matches the number of the chunk column. MPI_Alltoall can also be seen as a global transposition operation, acting on chunks of data.
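The transposition view can be sketched in plain C as a block transpose of an n-by-n grid of chunks. This is only a simulation of the data movement under the assumption that row r of each flat array represents rank r's buffer; the function name `sim_alltoall_chunks` is hypothetical, and no MPI call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulation only, no MPI library involved.
 * send and recv each hold n rows of n * chunk chars; row r stands for
 * rank r's buffer. Chunk (src, dst) of the send grid is copied to chunk
 * (dst, src) of the receive grid, i.e. the grid of chunks is transposed.
 * Mirrors MPI_Alltoall(send, chunk, MPI_CHAR, recv, chunk, MPI_CHAR, comm). */
void sim_alltoall_chunks(const char *send, char *recv, size_t n, size_t chunk)
{
    assert(send != NULL && recv != NULL && send != recv);
    const size_t row = n * chunk;         /* elements per rank buffer */
    for (size_t src = 0; src < n; ++src)
        for (size_t dst = 0; dst < n; ++dst)
            memcpy(recv + dst * row + src * chunk,   /* slot src of rank dst */
                   send + src * row + dst * chunk,   /* chunk dst of rank src */
                   chunk);
}
```

Called with n = 3 and chunk = 2 on the rows `abcdef`, `ABCDEF` and `#@%$&*`, it reproduces the receive buffers of the two-element picture above. Note that the elements inside each chunk keep their order; only the chunks change places.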
Is there a case when the two operations are interchangeable? To properly answer this question, one has to simply analyse the sizes of the data in the send buffer and of the data in the receive buffer:
operation      send buf size      recv buf size
---------      -------------      -------------
MPI_Allgather  sendcnt            n_procs * sendcnt
MPI_Alltoall   n_procs * sendcnt  n_procs * sendcnt
The receive buffer size is actually n_procs * recvcnt, but MPI mandates that the number of basic elements sent should be equal to the number of basic elements received, hence if the same MPI datatype is used in both send and receive parts of MPI_All..., then recvcnt must be equal to sendcnt.
It is immediately obvious that for the same size of the received data, the amount of data sent by each process is different. For the two operations to be equal, one necessary condition is that the sizes of the send buffers in both cases are equal, i.e. n_procs * sendcnt == sendcnt, which is only possible if n_procs == 1, i.e. if there is only one process, or if sendcnt == 0, i.e. no data is being sent at all. Hence there is no practically viable case where both operations are really interchangeable. But one can simulate MPI_Allgather with MPI_Alltoall by repeating the same data n_procs times in the send buffer (as already noted by Tyler Gill). Here is the action of MPI_Allgather with one-element send buffers:
rank    send buf                        recv buf
----    --------                        --------
 0      a             MPI_Allgather     a,A,#
 1      A            ---------------->  a,A,#
 2      #                               a,A,#
And here is the same implemented with MPI_Alltoall:
rank    send buf                        recv buf
----    --------                        --------
 0      a,a,a          MPI_Alltoall     a,A,#
 1      A,A,A        ---------------->  a,A,#
 2      #,#,#                           a,A,#
The reverse is not possible: one cannot simulate the action of MPI_Alltoall with MPI_Allgather in the general case.
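The replication trick for getting MPI_Allgather behaviour out of MPI_Alltoall can be sketched as follows. Again this is a plain-C simulation rather than real MPI code: `sim_alltoall` and `sim_allgather_via_alltoall` are hypothetical helper names, and row r of each flat array is assumed to represent rank r's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simulated alltoall (no MPI): chunk dst of rank src's send row goes to
 * slot src of rank dst's receive row. */
static void sim_alltoall(const char *send, char *recv, size_t n, size_t cnt)
{
    const size_t row = n * cnt;
    for (size_t src = 0; src < n; ++src)
        for (size_t dst = 0; dst < n; ++dst)
            memcpy(recv + dst * row + src * cnt,
                   send + src * row + dst * cnt, cnt);
}

/* contrib holds n chunks of cnt chars, chunk r being rank r's contribution.
 * Each rank's alltoall send row is filled with n copies of its own chunk,
 * so after the exchange every receive row equals contrib.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int sim_allgather_via_alltoall(const char *contrib, char *recv,
                               size_t n, size_t cnt)
{
    assert(contrib != NULL && recv != NULL);
    if (n == 0 || cnt == 0)
        return 0;                          /* nothing to exchange */
    const size_t row = n * cnt;
    char *send = malloc(n * row);          /* n times larger than contrib */
    if (send == NULL)
        return -1;
    for (size_t rank = 0; rank < n; ++rank)
        for (size_t dst = 0; dst < n; ++dst)
            memcpy(send + rank * row + dst * cnt, contrib + rank * cnt, cnt);
    sim_alltoall(send, recv, n, cnt);
    free(send);
    return 0;
}
```

The temporary send buffer is n times the size of the original contributions, which reflects the n_procs * sendcnt versus sendcnt difference in the size table and is one reason the emulation is wasteful compared with calling MPI_Allgather directly.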