MPI - Printing in an order

臣服心动 2020-12-03 20:12

I'm trying to write a function in C where every processor prints its own data. Here is what I have:

void print_mesh(int p,int myid,int** U0,int X,int Y){
         


        
5 answers
  • 2020-12-03 20:34

    There is no way to guarantee that messages from many different processes will arrive in the "correct" order when they arrive at another process. This is essentially what is happening here.

    Even though you aren't explicitly sending messages, when you print something to the screen, it has to be sent to the process on your local system (mpiexec or mpirun) where it can be printed to the screen. There is no way for MPI to know what the correct order for these messages is so it just prints them as they arrive.

    If you require that your messages are printed in a specific order, you must send them all to one rank which can print them in whatever order you like. As long as one rank does all of the printing, all of the messages will be ordered correctly.

    You will probably find answers out there that say you can put a newline at the end of your string or call fflush(stdout) to ensure that the buffers are flushed, but that won't guarantee ordering on the remote end for the reasons mentioned above.

  • 2020-12-03 20:47

    For debugging and development purposes, you can run each process in a separate terminal, so they print in their own terminal:

    mpirun -np n xterm -hold -e ./output
    

    n: number of processes
    -hold: keeps xterm on after the program is done.
    output: name of MPI executable

  • 2020-12-03 20:51

    So, you can do something like this:

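    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int size, rank;
        int message = 999; /* token; only its arrival matters */

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            /* Rank 0 prints first, then starts the chain. */
            printf("1 SIZE = %d RANK = %d MESSAGE = %d \n", size, rank, message);
            fflush(stdout);
            if (size > 1) {
                MPI_Send(&message, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            }
        } else {
            int count;
            MPI_Status status;
            /* Block until the token from the previous rank is available. */
            MPI_Probe(rank - 1, 0, MPI_COMM_WORLD, &status);
            MPI_Get_count(&status, MPI_INT, &count);
            if (count == 1) {
                /* Receive before printing so the printed value is the token. */
                MPI_Recv(&message, count, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
                printf("2 SIZE = %d RANK = %d MESSAGE = %d \n", size, rank, message);
                fflush(stdout);
                if (rank + 1 != size) {
                    MPI_Send(&message, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
                }
            }
        }
        MPI_Finalize();
        return 0;
    }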
    

    After execution:

    $ mpirun -n 5 ./a.out 
    1 SIZE = 5 RANK = 0 MESSAGE = 999 
    2 SIZE = 5 RANK = 1 MESSAGE = 999 
    2 SIZE = 5 RANK = 2 MESSAGE = 999 
    2 SIZE = 5 RANK = 3 MESSAGE = 999 
    2 SIZE = 5 RANK = 4 MESSAGE = 999 
    
  • 2020-12-03 20:51

    I was inspired by Святослав Павленко's answer, which uses blocking MPI communication to enforce serial-in-time output. Wesley Bland has a point, though, that MPI is not built for serial output. So if we want to output data, it makes sense to have each processor write its own (non-colliding) data. Alternatively, if the order of the data is important (and the data is not too big), the recommended approach is to send it all to one CPU (say rank 0), which then formats the data correctly.

    To me, this seems like overkill, especially when the data consists of variable-length strings, which is all too often what std::cout << "a=" << some_variable << " b=" << some_other_variable produces. So if we want some quick-and-dirty in-order printing, we can exploit Святослав Павленко's answer to build a serial output stream. This solution works fine, but its performance scales badly with many CPUs, so don't use it for data output!

    #include <iostream>
    #include <sstream>
    #include <mpi.h>
    

    MPI House-keeping:

    int mpi_size;
    int mpi_rank;
    
    void init_mpi(int argc, char * argv[]) {
        MPI_Init(& argc, & argv);
        MPI_Comm_size(MPI_COMM_WORLD, & mpi_size);
        MPI_Comm_rank(MPI_COMM_WORLD, & mpi_rank);
    }
    
    void finalize_mpi() {
        MPI_Finalize();
    }
    

    General-purpose class which enables MPI message-chaining

    template<class T, MPI_Datatype MPI_T> class MPIChain{
        // Uses a chained MPI message (T) to coordinate serial execution of code (the content of the message is irrelevant).
        private:
            T message_out; // The messages aren't really used here
            T message_in;
            int size;
            int rank;
    
        public:
            void next(){
                // Send message to next core (if there is one)
                if(rank + 1 < size) {
                    // MPI_Send - Performs a standard-mode blocking send.
                    MPI_Send(& message_out, 1, MPI_T, rank + 1, 0, MPI_COMM_WORLD);
                }
            }
    
            void wait(int & msg_count) {
                // Waits for message to arrive. Message is well-formed if msg_count = 1
                MPI_Status status;
    
                // MPI_Probe - Blocking test for a message.
                MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
                // MPI_Get_count - Gets the number of top level elements.
                MPI_Get_count(& status, MPI_T, & msg_count);
    
                if(msg_count == 1) {
                    // MPI_Recv - Performs a standard-mode blocking receive.
                    MPI_Recv(& message_in, msg_count, MPI_T, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
                }
            }
    
            MPIChain(T message_init, int c_rank, int c_size): message_out(message_init), size(c_size), rank(c_rank) {}
    
            int get_rank() const { return rank;}
            int get_size() const { return size;}
    };
    

    We can now use our MPIChain class to create the class that manages the output stream:

    class ChainStream : public MPIChain<int, MPI_INT> {
        // Uses the MPIChain class to implement a ostream with a serial operator<< implementation.
        private:
            std::ostream & s_out;
    
        public:
            ChainStream(std::ostream & os, int c_rank, int c_size)
                : MPIChain<int, MPI_INT>(0, c_rank, c_size), s_out(os) {};
    
            ChainStream & operator<<(const std::string & os){
                if(this->get_rank() == 0) {
                    this->s_out << os;
                    // Initiate chain of MPI messages
                    this->next();
                } else {
                    int msg_count;
                    // Wait until a message arrives (MPIChain::wait uses a blocking test)
                    this->wait(msg_count);
                    if(msg_count == 1) {
                        // If the message is well-formed (i.e. only one message is received): output string
                        this->s_out << os;
                        // Pass onto the next member of the chain (if there is one)
                        this->next();
                    }
                }
    
                // Ensure that the chain is resolved before returning the stream
                MPI_Barrier(MPI_COMM_WORLD);
    
            // Return the ChainStream, not the underlying ostream! That would break the serial-in-time execution.
                return *this;
           };
    };
    

    Note the MPI_Barrier at the end of operator<<. It prevents the code from starting a second output chain before the current one has finished. Even though this could be moved outside operator<<, I figured that I would put it here, since this is supposed to be serial output anyway.

    Putting it all together:

    int main(int argc, char * argv[]) {
        init_mpi(argc, argv);
    
        ChainStream cs(std::cout, mpi_rank, mpi_size);
    
        std::stringstream str_1, str_2, str_3;
        str_1 << "FIRST:  " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
        str_2 << "SECOND: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
        str_3 << "THIRD:  " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
    
        cs << str_1.str() << str_2.str() << str_3.str();
        // Equivalent to:
        //cs << str_1.str();
        //cs << str_2.str();
        //cs << str_3.str();
    
        finalize_mpi();
    }
    

    Note that we build the strings str_1, str_2, str_3 completely before we send them to the ChainStream instance. Normally one would do something like:

    std::cout << "a" << "b" << "c" << std::endl;
    

    but this applies operator<< from left to right, and we want each string to be ready for output before sequentially running through each process. Compiling and running:

    g++-7 -O3 -lmpi serial_io_obj.cpp -o serial_io_obj
    mpirun -n 10 ./serial_io_obj
    

    Outputs:

    FIRST:  MPI_SIZE = 10 RANK = 0
    FIRST:  MPI_SIZE = 10 RANK = 1
    FIRST:  MPI_SIZE = 10 RANK = 2
    FIRST:  MPI_SIZE = 10 RANK = 3
    FIRST:  MPI_SIZE = 10 RANK = 4
    FIRST:  MPI_SIZE = 10 RANK = 5
    FIRST:  MPI_SIZE = 10 RANK = 6
    FIRST:  MPI_SIZE = 10 RANK = 7
    FIRST:  MPI_SIZE = 10 RANK = 8
    FIRST:  MPI_SIZE = 10 RANK = 9
    SECOND: MPI_SIZE = 10 RANK = 0
    SECOND: MPI_SIZE = 10 RANK = 1
    SECOND: MPI_SIZE = 10 RANK = 2
    SECOND: MPI_SIZE = 10 RANK = 3
    SECOND: MPI_SIZE = 10 RANK = 4
    SECOND: MPI_SIZE = 10 RANK = 5
    SECOND: MPI_SIZE = 10 RANK = 6
    SECOND: MPI_SIZE = 10 RANK = 7
    SECOND: MPI_SIZE = 10 RANK = 8
    SECOND: MPI_SIZE = 10 RANK = 9
    THIRD:  MPI_SIZE = 10 RANK = 0
    THIRD:  MPI_SIZE = 10 RANK = 1
    THIRD:  MPI_SIZE = 10 RANK = 2
    THIRD:  MPI_SIZE = 10 RANK = 3
    THIRD:  MPI_SIZE = 10 RANK = 4
    THIRD:  MPI_SIZE = 10 RANK = 5
    THIRD:  MPI_SIZE = 10 RANK = 6
    THIRD:  MPI_SIZE = 10 RANK = 7
    THIRD:  MPI_SIZE = 10 RANK = 8
    THIRD:  MPI_SIZE = 10 RANK = 9
    
  • 2020-12-03 20:54

    The MPI standard doesn't specify how stdout from different nodes should be collected and fflush doesn't help.

    If you need to print big outputs in order, gathering them all and printing at once is probably not the best solution, because it generates traffic over the network. A better solution is to create something similar to a virtual ring, where each process waits for a token from the previous process, prints, and sends the token to the next one. Of course, the first process doesn't have to wait; it prints and sends the token to the next one.

    In any case, for really big output, where printing to the screen probably makes no sense, you should use MPI-IO, as suggested by Jonathan Dursi.
