Question
Say I have 8 processes. When I do the following, the MPI_COMM_WORLD communicator will be split into two communicators: the processes with even ids will belong to one communicator and the processes with odd ids will belong to the other.
color=myid % 2;
MPI_Comm_split(MPI_COMM_WORLD,color,myid,&NEW_COMM);
MPI_Comm_rank( NEW_COMM, &new_id);
My question is: where are the handles for these two communicators? After the split, the processes whose ids were 0 1 2 3 4 5 6 7 are grouped as 0 2 4 6 | 1 3 5 7.
Now, suppose I want to send and receive within a particular communicator, say the one holding the even ids. If I send a message from 0 to 2 using the wrong communicator, could the message end up in the other communicator? Thank you in advance for any clarification!
if(new_id < 2){
MPI_Send(&my_num, 1, MPI_INT, 2 + new_id, 0, NEW_COMM);
MPI_Recv(&my_received, 1, MPI_INT, 2 + new_id, 0, NEW_COMM, MPI_STATUS_IGNORE);
}
else
{
MPI_Recv(&my_received, 1, MPI_INT, new_id - 2, 0, NEW_COMM, MPI_STATUS_IGNORE);
MPI_Send(&my_num, 1, MPI_INT, new_id - 2 , 0, NEW_COMM);
}
Full code
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>
int main(int argc, char *argv[])
{
int myid, numprocs;
int color,Zero_one,new_id,new_nodes;
MPI_Comm NEW_COMM;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
int my_num, my_received;
int old_id;
/* The original 8-case switch just copied myid into both variables */
my_num = myid;
old_id = myid;
color=myid % 2;
MPI_Comm_split(MPI_COMM_WORLD,color,myid,&NEW_COMM);
MPI_Comm_rank( NEW_COMM, &new_id);
MPI_Comm_size( NEW_COMM, &new_nodes);
// Ranks in MPI_COMM_WORLD before the split: 0 1 2 3 4 5 6 7
// Expected result of the exchange below:    2 3 0 1 6 7 4 5
// (each pair of processes swaps with the next pair in its own communicator)
if(new_id < 2){
MPI_Send(&my_num, 1, MPI_INT, 2 + new_id, 0, NEW_COMM);
MPI_Recv(&my_received, 1, MPI_INT, 2 + new_id, 0, NEW_COMM, MPI_STATUS_IGNORE);
}
else
{
MPI_Recv(&my_received, 1, MPI_INT, new_id - 2, 0, NEW_COMM, MPI_STATUS_IGNORE);
MPI_Send(&my_num, 1, MPI_INT, new_id - 2 , 0, NEW_COMM);
}
printf("old_id= %d received num= %d\n", old_id, my_received);
MPI_Finalize();
return 0;
}
Answer 1:
I have edited your question to make it clearer, and I fixed the ids related to the two new communicators created by the call to MPI_Comm_split.
First question. After the call to MPI_Comm_split, a single process gets at most ONE handle to a newly created communicator ("at most" because the returned communicator is MPI_COMM_NULL for any process that passes MPI_UNDEFINED as the color parameter). This is where beginners typically misunderstand the semantics of this call: MPI_Comm_split is a collective call, so it must be made by all of the processes in the original communicator. Every process calls it exactly once, but the call creates k communicators, where k is the number of distinct color values supplied across all of the processes, partitioning the processes into k groups; each process receives only the handle of the communicator that its own color placed it in. If you are NOT comfortable with this powerful mechanism and want to create just one communicator, supply MPI_UNDEFINED as the color in every process that must not belong to the new communicator, or use one of the other communicator-creation functions (such as MPI_Comm_create) instead of MPI_Comm_split.
Second question. Once the semantics is clear, you will immediately recognize that a process using a communicator returned by MPI_Comm_split for point-to-point or collective communication can NEVER exchange data with processes that are part of another communicator returned by the same call. Each communicator defines its own communication universe: it has its own group of processes, and ranks, tags, and messages are scoped to that communicator alone.
Now, about your code snippet. With 8 processes, each new communicator contains 4 processes with ranks 0 to 3. Ranks 0 and 1 send to ranks 2 and 3 and then post their receives; ranks 2 and 3 take the else branch, receive and then send back. So with exactly 8 processes the exchange actually completes. The pattern is fragile, though, because it silently assumes each new communicator has exactly 4 members. Run the same code with 16 processes and each new communicator has ranks 0 to 7: ranks 4 to 7 all execute the else branch, and rank 4 blocks forever waiting for a message from rank 2, which only ever sends back to rank 0 (ranks 5, 6 and 7 hang the same way). The same happens symmetrically in the odd-ids communicator. Note also that what a process receives is the my_num of its partner's original id: rank 0 of the even communicator receives from its rank 2, which is world id 4, so old_id 0 prints 4, not the 2 your comment predicts.
The code is easily fixed if you only want to exchange data between the processes with ranks 0 and 2 in each new communicator. Simply use the explicit ranks 0 and 2 and rewrite the tests as follows:
if(!new_id){
MPI_Send(&my_num, 1, MPI_INT, 2, 0, NEW_COMM);
MPI_Recv(&my_received, 1, MPI_INT, 2, 0, NEW_COMM, MPI_STATUS_IGNORE);
}
if(new_id == 2){
MPI_Recv(&my_received, 1, MPI_INT, 0, 0, NEW_COMM, MPI_STATUS_IGNORE);
MPI_Send(&my_num, 1, MPI_INT, 0, 0, NEW_COMM);
}
Source: https://stackoverflow.com/questions/22737842/how-are-handles-distributed-after-mpi-comm-split