Sending a typedef struct containing void* by creating an MPI derived datatype

Question

From studying the MPI specification, I understand that an MPI send primitive refers to a memory location (the send buffer), takes the data stored at that location, and passes it as a message to another process.

Although a virtual address from one process is meaningless in another process's address space, it should be fine to send data that a pointer such as a void pointer points to, since MPI passes the data itself as the message.

For example the following works correctly:

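    // Sender side.
    int x = 100;
    void* snd;   // must point to the data to send, e.g. &x
    MPI_Send(snd, 4, MPI_BYTE, 1, 0, MPI_COMM_WORLD);

    // Receiver side.
    void* rcv;   // must point to a buffer of at least 4 bytes
    MPI_Recv(rcv, 4, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);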

But when I put void* snd in a struct and try to send the struct, it does not succeed.

I don't understand why the previous example works correctly but the following does not.

Here I have defined a typedef struct and then created an MPI_Datatype from it. By the same reasoning as above, the following should also succeed, but unfortunately it does not work.

Here is the code:

    #include "mpi.h"
    #include<stdio.h>

    int main(int args, char *argv[])
    {
        int rank, source =0, tag=1, dest=1;
        int bloackCount[2];

        MPI_Init(&args, &argv);

        typedef struct {
            void* data;
            int tag; 
        } data;

        data myData;    

        MPI_Datatype structType, oldType[2];
        MPI_Status stat;

        /* MPI_Aint type used to identify the byte displacement of each block (array) */
        MPI_Aint offsets[2], extent;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);


        offsets[0] = 0;
        oldType[0] = MPI_BYTE;
        bloackCount[0] = 1;

        MPI_Type_extent(MPI_INT, &extent);

        offsets[1] = 4 * extent;  /* let's say the MPI_BYTE will contain an integer: size of int * extent */
        oldType[1] = MPI_INT;
        bloackCount[1] = 1;

        MPI_Type_create_struct(2, bloackCount,offsets,oldType, &structType);
        MPI_Type_commit(&structType);


        if (rank == 0) {
            int x = 100;
            myData.data = &x;
            myData.tag = 99;
            MPI_Send(&myData, 1, structType, dest, tag, MPI_COMM_WORLD);
        }
        if (rank == 1) {
            MPI_Recv(&myData, 1, structType, source, tag, MPI_COMM_WORLD, &stat);
            // without the next line, the printf() below properly prints the value 99 for
            // myData.tag
            int x = *(int *) myData.data;
            printf(" \n Process %d, Received : %d , %d \n\n", rank, myData.tag, x);
        }
        MPI_Type_free(&structType);
        MPI_Finalize();
    }

Error message when running the code (it looks like I am trying to access an invalid memory address in the second process):

    [ubuntu:04123] *** Process received signal ***
    [ubuntu:04123] Signal: Segmentation fault (11)
    [ubuntu:04123] Signal code: Address not mapped (1)
    [ubuntu:04123] Failing at address: 0xbfe008bc
    [ubuntu:04123] [ 0] [0xb778240c]
    [ubuntu:04123] [ 1] GenericstructType(main+0x161) [0x8048935]
    [ubuntu:04123] [ 2] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)         [0xb750f4d3]
    [ubuntu:04123] [ 3] GenericstructType() [0x8048741]
    [ubuntu:04123] *** End of error message ***

Can someone please explain why it is not working? Any advice would also be appreciated.

Thanks,

Answer 1:

    // Sender side.
    int x = 100;
    void* snd;
    MPI_Send(snd, 4, MPI_BYTE, 1, 0, MPI_COMM_WORLD);

    // Receiver side.
    void* rcv;
    MPI_Recv(rcv, 4, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

"I don't understand why the previous example works correctly but the following does not."

It works (of course, snd and rcv have to be assigned meaningful memory locations as values), because MPI_Send and MPI_Recv take the address of the data location and both snd and rcv are pointers, i.e. their values are such addresses. For example, the MPI_Send line is not sending the value of the pointer itself but rather 4 bytes starting from the location that snd is pointing to. The same is true about the call to MPI_Recv and the usage of rcv. In order to send the value of the pointer rather than the value it is pointing to, you would have to use:

    MPI_Send(&snd, sizeof(void *), MPI_BYTE, ...);

This would send sizeof(void *) bytes, starting from the address where the value of the pointer is stored. That makes very little sense except in some very special cases.
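
The distinction can be sketched in plain C, without MPI. The helper read_buffer below is hypothetical: it only mimics what a send with MPI_BYTE does with its first two arguments, namely copying count bytes starting at the address buf. The other two function names are made up for the illustration as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the read side of
   MPI_Send(buf, count, MPI_BYTE, ...): copy `count` bytes
   starting at address `buf` into `message`. */
void read_buffer(const void *buf, size_t count, unsigned char *message)
{
    assert(buf != NULL && message != NULL);
    memcpy(message, buf, count);
}

/* Passing the pointer itself: the message holds the int that snd points to. */
int message_from_pointee(void *snd)
{
    unsigned char message[sizeof(int)];
    int value;
    read_buffer(snd, sizeof(int), message);
    memcpy(&value, message, sizeof value);
    return value;
}

/* Passing the address of the pointer variable: the message holds
   the address stored in snd, not the data behind it. */
void *message_from_pointer_variable(void *snd)
{
    unsigned char message[sizeof(void *)];
    void *address;
    read_buffer(&snd, sizeof(void *), message);
    memcpy(&address, message, sizeof address);
    return address;
}
```

With int x = 100, message_from_pointee(&x) yields the value 100, while message_from_pointer_variable(&x) yields the address of x, which is only meaningful inside the sending process.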

Why doesn't your second example work? MPI is not a magician: it cannot recognise that part of the memory contains a pointer to another memory block and follow that pointer. That is, when you construct a structured datatype, there is no way to tell MPI that the first element of the structure is actually a pointer and make it read the data that this pointer points to. In other words, you must perform explicit data marshalling: construct an intermediate buffer that contains a copy of the memory region pointed to by the data member. Besides, your data structure contains no information on the length of the memory region that data points to.

Please note something very important. All MPI datatypes have something called a type map. A type map is a list of tuples, where each tuple has the form (basic_type, offset): basic_type is a primitive language type, e.g. char, int, double, etc., and offset is an offset relative to the beginning of the buffer. (The sequence of basic types alone, without the offsets, is called the type signature.) One peculiar feature of MPI is that offsets can also be negative, which means that the argument to MPI_Send (or to MPI_Recv, or to any other communication function) might actually point to the middle of the memory area that serves as the data source. When sending data, MPI traverses the type map and takes one element of type basic_type from the corresponding offset, relative to the supplied data buffer address. The built-in MPI datatypes have type maps of only one entry with an offset of 0, e.g.:

    MPI_INT      -> (int, 0)
    MPI_FLOAT    -> (float, 0)
    MPI_DOUBLE   -> (double, 0)

No datatype exists in MPI that can make it dereference a pointer and take the value that it points to instead of the pointer value itself.
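
The traversal can be modelled in plain C as a rough sketch. The names typemap_entry and pack_by_typemap are invented for this illustration and are not MPI API; each entry stores the byte size of a basic type instead of the type itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The struct from the question. */
typedef struct {
    void *data;
    int   tag;
} data;

/* One type map entry: size of the basic type and its byte offset
   relative to the buffer address (hypothetical model, not MPI API). */
typedef struct {
    size_t    size;
    ptrdiff_t offset;
} typemap_entry;

/* Walk the type map and gather the named bytes into a contiguous
   message. Returns the number of bytes written to `message`. */
size_t pack_by_typemap(const void *buf, const typemap_entry *map,
                       size_t entries, unsigned char *message)
{
    size_t written = 0;
    assert(buf != NULL && map != NULL && message != NULL);
    for (size_t i = 0; i < entries; i++) {
        /* Bytes are copied exactly as they sit at buf + offset;
           a pointer stored there is copied, never followed. */
        memcpy(message + written,
               (const unsigned char *)buf + map[i].offset,
               map[i].size);
        written += map[i].size;
    }
    return written;
}
```

Packing a data struct with the entries (sizeof(void *), offsetof(data, data)) and (sizeof(int), offsetof(data, tag)) produces a message that starts with the raw pointer value, followed by the tag.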

    offsets[0] = 0;
    oldType[0] = MPI_BYTE;
    blockCount[0] = 1;

    MPI_Type_extent(MPI_INT, &extent);

    offsets[1] = 4 * extent;
    oldType[1] = MPI_INT;
    blockCount[1] = 1;

    MPI_Type_create_struct(2, blockCount, offsets, oldType, &structType);

This code creates an MPI datatype that has the following type map (assuming int is 4 bytes):

{(byte, 0), (int, 16)}

When supplied as the type argument to MPI_Send, it would instruct the MPI library to take one byte from the beginning of the data buffer and then to take the integer value, located at 16 bytes past the beginning of the data buffer. In total the message would be 5 bytes long, although the span of the buffer area would be 20 bytes.
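
The two numbers can be checked with a little arithmetic, sketched here in plain C. The names typemap_entry, typemap_size and typemap_span are hypothetical, and the span computed this way ignores any alignment padding that the extent reported by a real MPI library may add.

```c
#include <assert.h>
#include <stddef.h>

/* One type map entry: size of the basic type and its byte offset
   (hypothetical model, not MPI API). */
typedef struct {
    size_t    size;
    ptrdiff_t offset;
} typemap_entry;

/* Message length: the sum of the sizes of all entries. */
size_t typemap_size(const typemap_entry *map, size_t entries)
{
    size_t total = 0;
    assert(map != NULL);
    for (size_t i = 0; i < entries; i++)
        total += map[i].size;
    return total;
}

/* Span of the buffer area: distance from the lowest offset
   to the end of the entry that reaches furthest. */
size_t typemap_span(const typemap_entry *map, size_t entries)
{
    assert(map != NULL && entries > 0);
    ptrdiff_t lowest  = map[0].offset;
    ptrdiff_t highest = map[0].offset + (ptrdiff_t)map[0].size;
    for (size_t i = 1; i < entries; i++) {
        ptrdiff_t end = map[i].offset + (ptrdiff_t)map[i].size;
        if (map[i].offset < lowest)
            lowest = map[i].offset;
        if (end > highest)
            highest = end;
    }
    return (size_t)(highest - lowest);
}
```

For the type map {(byte, 0), (int, 16)} with a 4-byte int, the entries are {1, 0} and {4, 16}: the size is 1 + 4 = 5 bytes and the span is 16 + 4 - 0 = 20 bytes.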

    offsets[0] = offsetof(data, data);
    oldType[0] = MPI_CHAR;
    blockCount[0] = sizeof(void *);

    offsets[1] = offsetof(data, tag);
    oldType[1] = MPI_INT;
    blockCount[1] = 1;

    MPI_Type_create_struct(2, blockCount, offsets, oldType, &structType);

This code, taken from the answer of Greg Inozemtsev, creates a datatype with the following type map (assuming a 32-bit machine with 32-bit wide pointers and no padding):

    {(char, 0), (char, 1), (char, 2), (char, 3), (int, 4)}

The number of (char, x) entries is equal to sizeof(void *) (4 by assumption). If used as a datatype, this would take 4 bytes from the beginning of the buffer (i.e. the value of the pointer, the address, not the actual int it is pointing to!) and then it would take an integer from 4 bytes after the beginning, i.e. the value of the tag field in the structure. Once again, you would be sending the address stored in the pointer and not the data that this pointer points to.

The difference between MPI_CHAR and MPI_BYTE is that no type conversion is applied to data of type MPI_BYTE. This is only relevant when running MPI codes in heterogeneous environments. With MPI_CHAR the library might perform data conversion, e.g. convert each character from the ASCII to the EBCDIC character set and vice versa. Using MPI_CHAR in this case is erroneous, but sending pointers in a heterogeneous environment is even more erroneous, so no worries ;)

In the light of all this, if I were you, I would consider the solution that suszterpatt has proposed.

For the explicit data marshalling, there are two possible scenarios:

Scenario 1. Each data item, pointed to by data.data is of constant size. In this case you can construct a structure datatype in the following way:

    typedef struct {
        int tag;
        char data[];
    } data_flat;

    // data_size is a placeholder: the number of payload bytes,
    // identical for every data item

    // Put the tag at the beginning
    offsets[0] = offsetof(data_flat, tag);
    oldType[0] = MPI_INT;
    blockCount[0] = 1;

    offsets[1] = offsetof(data_flat, data);
    oldType[1] = MPI_BYTE;
    blockCount[1] = (int)data_size;

    MPI_Type_create_struct(2, blockCount, offsets, oldType, &structType);
    MPI_Type_commit(&structType);

Then use it like this:

    // --- Sender ---

    // Make a temporary buffer to hold the data
    size_t total_size = offsetof(data_flat, data) + data_size;
    data_flat *temp = malloc(total_size);

    // Copy the data structure content into the temporary flat structure
    // (myData is the variable of type data from the question)
    temp->tag = myData.tag;
    memcpy(temp->data, myData.data, data_size);

    // Send the temporary structure
    MPI_Send(temp, 1, structType, ...);

    // Free the temporary structure
    free(temp);

You could also keep the temporary storage instead of freeing it and reuse it for other instances of the data structure (since by assumption they all point to data of the same size). The receiver would be:

    // --- Receiver ---

    // Make a temporary buffer to hold the data
    size_t total_size = offsetof(data_flat, data) + data_size;
    data_flat *temp = malloc(total_size);

    // Receive into the temporary structure
    MPI_Recv(temp, 1, structType, ...);

    // Copy the temporary flat structure into a data structure
    myData.tag = temp->tag;
    myData.data = temp->data;
    // Do not free the temporary structure as it contains the actual data
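
The copy steps on both sides can be isolated from the MPI calls and tried on their own. This is only a sketch: the function names flatten and unflatten are made up here, and the payload size is passed in explicitly because the struct itself does not record it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The struct from the question and the flat layout used for sending. */
typedef struct { void *data; int tag; } data;
typedef struct { int tag; char data[]; } data_flat;

/* Sender side: copy tag and payload into one contiguous block.
   Returns NULL if allocation fails; the caller frees the result. */
data_flat *flatten(const data *src, size_t data_size)
{
    assert(src != NULL && src->data != NULL);
    data_flat *flat = malloc(offsetof(data_flat, data) + data_size);
    if (flat == NULL)
        return NULL;
    flat->tag = src->tag;
    memcpy(flat->data, src->data, data_size);
    return flat;
}

/* Receiver side: point the struct at the payload stored inside the
   flat block. The block must stay allocated while dst->data is used. */
void unflatten(data_flat *flat, data *dst)
{
    assert(flat != NULL && dst != NULL);
    dst->tag = flat->tag;
    dst->data = flat->data;
}
```

After a round trip through flatten and unflatten, the receiving struct carries the same tag and a pointer to its own copy of the payload rather than the sender's address.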

Scenario 2. Each data item might be of different size. This one is much more involved and hard to do in a portable way. If speed is not your greatest concern, then you might send the data in two distinct messages for maximum portability. MPI guarantees that order is preserved for messages sent with the same envelope (source, destination, tag, communicator).

You could also implement what suszterpatt proposed in the following way (given that your tags fit into the allowed range):

    // --- Send a structure ---
    // data_size is a placeholder: the number of payload bytes of this item
    MPI_Send(myData.data, (int)data_size, MPI_BYTE, dest, myData.tag, MPI_COMM_WORLD);

    // --- Receive a structure ---
    MPI_Status status;
    int msg_size;
    // Peek for a message, allocate a big enough buffer
    MPI_Probe(source, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    MPI_Get_count(&status, MPI_BYTE, &msg_size);
    uint8_t *buffer = malloc(msg_size);
    // Receive the message
    MPI_Recv(buffer, msg_size, MPI_BYTE, source, status.MPI_TAG,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // Fill in a data structure
    myData.tag = status.MPI_TAG;
    myData.data = buffer;

Answer 2:

Assuming that you define this struct because you want to pair different pieces of data with different tags, your solution is conceptually wrong. Consider the following example:

    data foo, bar;
    int x = 100;
    foo.data = bar.data = &x;
    foo.tag = bar.tag = 99;

In this case, foo and bar will each have their own copy of tag in memory, but they point to the same piece of data. Therefore, it is impossible to define a single MPI datatype that could be used to send both elements, since the displacement between their respective data and tag elements is different. The same will hold true for different data pointers in all but the luckiest of cases.

If you wish to pair data and tags, you can still use your data struct, though for the reason mentioned above, you do not need a custom MPI datatype:

    MPI_Send(myData.data, extent, MPI_BYTE, dest, myData.tag, MPI_COMM_WORLD);

with a matching receive:

    MPI_Recv(myData.data, extent, MPI_BYTE, source, myData.tag, MPI_COMM_WORLD, &stat);

Answer 3:

The offset of tag is wrong in the MPI datatype. In general you can't assume that a void* is the same size as an int. Besides, there might be padding introduced into the struct as more fields are added. There's a way around this problem though - just use offsetof:

    offsets[0] = offsetof(data, data);
    oldType[0] = MPI_BYTE;
    blockCount[0] = sizeof(void *);

    offsets[1] = offsetof(data, tag);
    oldType[1] = MPI_INT;
    blockCount[1] = 1;

    MPI_Type_create_struct(2, blockCount, offsets, oldType, &structType);

And one more thing: since the pointer is meaningless at the destination anyway, you can skip it in the MPI datatype.
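
To see why the hard-coded offset is a problem, the layout can be inspected without MPI. This is a sketch with hypothetical helper names; it assumes that the extent of MPI_INT equals sizeof(int), which is the usual case but is not checked here.

```c
#include <assert.h>
#include <stddef.h>

/* The struct from the question. */
typedef struct { void *data; int tag; } data;

/* Offset of `tag` as the compiler actually laid it out. */
size_t checked_tag_offset(void)
{
    size_t offset = offsetof(data, tag);
    assert(offset >= sizeof(void *));             /* tag follows the pointer */
    assert(offset + sizeof(int) <= sizeof(data)); /* and lies inside the struct */
    return offset;
}

/* Nonzero if an int placed at `offset` still lies inside the struct. */
int int_fits_at(size_t offset)
{
    return offset + sizeof(int) <= sizeof(data);
}
```

On common 32-bit and 64-bit ABIs, int_fits_at(4 * sizeof(int)) is 0: the offset used in the question points past the end of the struct, so the tag would be read from unrelated memory.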

Source: https://stackoverflow.com/questions/13039283/sending-typedef-struct-containing-void-by-creating-mpi-drived-datatype

Tags

mpi