Dynamic Memory Allocation in MPI

后端 未结 2 1695
醉酒成梦
醉酒成梦 2020-12-06 15:49

I am new to MPI. I wrote a simple code to display a matrix using multiple process. Say if I have a matrix of 8x8 and launching the MPI program with 4 processes, the 1st 2 ro

相关标签:
2条回答
  • 2020-12-06 16:17

    This code looks awfully suspect.

    a = (int **)malloc(rows*sizeof(int));
    for(i=0; i<rows; i++)
        a[i] =  (int *)malloc(width*sizeof(int));
    MPI_Recv(&a, rows*width, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
    

    Your creating an array of int** and allocating correctly but then you don't pass the individual pointers. MPI_Recv expects int* as an argument, right?

    Note that when you do a int[][], the memory allocated will be contiguous. When you do malloc, you should expect non-contiguous blocks of memory.

    An easy solution may be to just do a = (int**) malloc ( big ), and then index against that large memory allocation.

    0 讨论(0)
  • 2020-12-06 16:40

    There are at least two ways to dynamically allocate a 2D array.

    The first one is the one of @HRoid : each row is allocated one at a time. Look here for getting an scheme.

    The second one is suggested by @Claris, and it will ensure that the data is contiguous in memory. This is required by many MPI operations...it is also required by libraries like FFTW (2D fast fourier transform) or Lapack (dense matrices for linear algebra). Your program may fail at

    MPI_Send(&a[offset][0], rows*S, MPI_INT,i,1, MPI_COMM_WORLD);
    

    if S>1, this program will try to send items that are after the end of the line n°offset...That may trigger a segmentation fault or undefined behavior.

    You may allocate your array this way :

    a = malloc(rows * sizeof(int *));
    if(a==NULL){fprintf(stderr,"out of memory...i will fail\n");}
    int *t = malloc(rows * width * sizeof(int));
    if(t==NULL){fprintf(stderr,"out of memory...i will fail\n");}
    for(i = 0; i < rows; ++i)
      a[i] = &t[i * width];
    

    Watch out : malloc does not initialize memory to 0 !

    It seems that you want to spread a 2D array over many process. Look at MPI_Scatterv() here. Look at this question too.

    If you want to know more about 2D arrays and MPI, look here.

    You may find a basic example of MPI_Scatterv here.

    I changed #define S 8 for #define SQUARE_SIZE 42. It's always better to give descriptive names.

    And here is a working code using MPI_Scatterv() !

    #include <mpi.h>
    #include <iostream> 
    #include <cstdlib>
    
    using namespace std;
    
    #define SQUARE_SIZE 42
    
    MPI_Status status;
    
    int main(int argc, char *argv[])
    {
        int numtasks, taskid;
        int i, j, k = 0;
    
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
        MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    
        int rows, offset, remainPart, orginalRows, height, width;
        int **a;
    
        height = width = SQUARE_SIZE;
    
        //on rank 0, let's build a big mat of int
        if(taskid == 0){ 
            a=new int*[height]; 
            int *t =new int[height * width];
            for(i = 0; i < height; ++i)
                a[i] = &t[i * width];
            for(i=0; i<height; i++)
                for(j=0; j<width; j++)
                    a[i][j] = ++k;
        }
    
        //for everyone, lets compute numbers of rows, numbers of int and displacements for everyone. Only 0 will use these arrays, but it's a practical way to get `rows` 
        int nbrows[numtasks];
        int sendcounts[numtasks];
        int displs[numtasks];
        displs[0]=0;
        for(i=0;i<numtasks;i++){
            nbrows[i]=height/numtasks;
            if(i<height%numtasks){
                nbrows[i]=nbrows[i]+1;
            }
            sendcounts[i]=nbrows[i]*width;
            if(i>0){
                displs[i]=displs[i-1]+sendcounts[i-1];
            }
        }
        rows=nbrows[taskid];
    
        //scattering operation. 
        //The case of the root is particular, since the communication is not to be done...Hence, the flag MPI_IN_PLACE is used.
        if(taskid==0){
            MPI_Scatterv(&a[0][0],sendcounts,displs,MPI_INT,MPI_IN_PLACE,0,MPI_INT,0,MPI_COMM_WORLD);
        }else{
            //allocation of memory for the piece of mat on the other nodes.
            a=new int*[rows];
            int *t =new int[rows * width];
            for(i = 0; i < rows; ++i)
                a[i] = &t[i * width];
    
            MPI_Scatterv(NULL,sendcounts,displs,MPI_INT,&a[0][0],rows*width,MPI_INT,0,MPI_COMM_WORLD);
        }
        //printing, one proc at a time
        if(taskid>0){
            MPI_Status status;
            MPI_Recv(NULL,0,MPI_INT,taskid-1,0,MPI_COMM_WORLD,&status);
        }
        cout<<"rank"<< taskid<<" Rows : "<<rows<<" Width : "<<width<<endl;
    
        for(i=0; i<rows; i++)
        {
            for(j=0; j<width; j++)
                cout<<a[i][j]<<"\t";
            cout<<endl;
        }
        if(taskid<numtasks-1){
            MPI_Send(NULL,0,MPI_INT,taskid+1,0,MPI_COMM_WORLD);
        }
    
        //freeing the memory !
    
        delete[] a[0];
        delete[] a;
    
        MPI_Finalize();
    
        return 0;
    }
    

    To compile : mpiCC main.cpp -o main

    To run : mpiexec -np 3 main

    0 讨论(0)
提交回复
热议问题