MPI_Comm_spawn and MPI_Reduce

滥情空心 2021-01-06 04:32

I have two programs: the "master", which spawns "workers" that perform some calculations. I want the master to get the results from the workers and store the sum. I a

1 answer
  • 2021-01-06 05:02

    MPI_Comm_get_parent returns the parent intercommunicator, which encompasses the original process and all the spawned ones. Calling MPI_Comm_rank(parent, &parent_id) on it does not return the rank of the parent; it returns the rank of the calling process in the local group of the intercommunicator, which here is the group of workers:

    I'm child process rank 0 and we are 3
    The parent process **rank 0** and we are 1
    I'm child process rank 1 and we are 3
    The parent process **rank 1** and we are 1
    I'm child process rank 2 and we are 3
    The parent process **rank 2** and we are 1
    

    (observe how the highlighted values differ - one would expect that the rank of the parent process should be the same, shouldn't it?)

    That is why the MPI_Reduce() call does not succeed: each worker process specifies a different value for the root rank. Since there is only one master process, its rank in the remote group of parent is 0, and hence all workers should specify 0 as the root to MPI_Reduce:

    //
    // Worker code
    //
    rc = MPI_Reduce(&send, &recv, 1, MPI_INT, MPI_SUM, 0, parent);
    

    This is only half of the problem. The other half is that rooted collective operations (e.g. MPI_REDUCE) operate a bit differently with intercommunicators. One first has to decide which of the two groups hosts the root. Once the root group is identified, the root process has to pass MPI_ROOT as the value of root in MPI_REDUCE, and all other processes in the root group must pass MPI_PROC_NULL. In other words, the processes in the root group contribute no data to the rooted collective operation; only the processes in the remote group do, and they pass the rank of the root within the root group. Since the master code is written so that there can be only one process in the master's group, it suffices to change the call to MPI_Reduce in the master code to:

    //
    // Master code
    //
    rc = MPI_Reduce(&send, &recv, 1, MPI_INT, MPI_SUM, MPI_ROOT, everyone);
    

    Note that the master does not participate in the reduction operation itself, i.e. the value of sendbuf (&send in this case) is irrelevant, since the root does not send any data to be reduced. It merely collects the result of the reduction performed over the values from the processes in the remote group.
