MPI, python, Scatterv, and overlapping data


Question


The MPI standard (version 3.0) says about MPI_Scatterv:

"The specification of counts, types, and displacements should not cause any location on the root to be read more than once."

However, my testing of mpi4py in Python with the code below does not indicate any problem with reading data from the root more than once:

import numpy as np
from mpi4py import MPI
from sharethewealth import sharethewealth

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
rank = comm.Get_rank()

counts = [16, 17, 16, 16, 16, 16, 15]
displs = [0, 14, 29, 43, 57, 71, 85]

if rank == 0:
    bigx = np.arange(100, dtype=np.float64)
else:
    bigx = None

my_size = counts[rank]
x = np.zeros(my_size)

comm.Scatterv([bigx, counts, displs, MPI.DOUBLE], x, root = 0)

print x

Command

> mpirun -np 7 python mycode.py

produces

[ 57.  58.  59.  60.  61.  62.  63.  64.  65.  66.  67.  68.  69.  70.  71. 72.]

[ 85.  86.  87.  88.  89.  90.  91.  92.  93.  94.  95.  96.  97.  98.  99.]

[  0.   1.   2.   3.   4.   5.   6.   7.   8.   9.  10.  11.  12.  13.  14. 15.]

[ 29.  30.  31.  32.  33.  34.  35.  36.  37.  38.  39.  40.  41.  42.  43. 44.]

[ 43.  44.  45.  46.  47.  48.  49.  50.  51.  52.  53.  54.  55.  56.  57. 58.]

[ 71.  72.  73.  74.  75.  76.  77.  78.  79.  80.  81.  82.  83.  84.  85. 86.]

[ 14.  15.  16.  17.  18.  19.  20.  21.  22.  23.  24.  25.  26.  27.  28. 29.  30.]

The output is clearly correct, and the data on the root (process 0) has clearly been read more than once at each of the boundary points. Am I misunderstanding the MPI standard? Or is this fortuitous behavior that cannot be relied on in general?
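To make the overlap concrete, here is a small check (a sketch added for illustration, not part of the original code) that compares each segment's end against the next segment's start, using the counts and displs from above:

counts = [16, 17, 16, 16, 16, 16, 15]
displs = [0, 14, 29, 43, 57, 71, 85]

# Rank i reads bigx[displs[i] : displs[i] + counts[i]] on the root.
for i in range(len(counts) - 1):
    end = displs[i] + counts[i]        # one past the last index read by rank i
    start_next = displs[i + 1]         # first index read by rank i + 1
    if end > start_next:
        print("ranks %d and %d both read indices %d..%d" % (i, i + 1, start_next, end - 1))

Run on its own, this reports that indices 14..15, 29..30, 43..44, 57..58, 71..72 and 85..86 are each read by two ranks, matching the duplicated values visible in the output.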

FWIW, I'm running Python 2.7 on OS X.


Answer 1:


You cannot rely on this.

This restriction comes directly from the MPI standard, and since mpi4py's upper-case functions are just a thin layer on top of the corresponding MPI calls, the standard is what applies here. The standard also states:

Rationale. Though not needed, the last restriction is imposed so as to achieve symmetry with MPI_GATHER, where the corresponding restriction (a multiple-write restriction) is necessary. (End of rationale.)

Since this restriction is part of the standard, an MPI implementation is free to:

  • Ignore violations
  • Issue a warning when violated
  • Fail when violated
  • Use this assumption for any kind of optimization that could lead to undefined behavior when it is violated

The last point is the scariest, as it can introduce subtle bugs. Given the read-only nature of the send buffer, it is difficult to imagine such an optimization, but that doesn't mean one does not or will not exist; as an idea, consider strict-aliasing optimizations. Also note that MPI implementations are very complex; their behavior may change in seemingly erratic ways between versions, configurations, data sizes, or other environmental changes.

There is also the infamous example with memcpy: the C standard forbids overlapping inputs, and at some point the glibc implementation exploited that for a small, disputed optimization. Code that did not satisfy the requirement started to fail, users began hearing strange sounds on websites playing MP3s through Flash, and a heated debate involving Linus Torvalds and Ulrich Drepper followed.

The moral of the story: follow the requirements imposed by the standard, even if your code works right now and the requirement doesn't make sense to you. Also be glad that there is such a detailed standard.
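For illustration only (this is not from the original answer), a minimal standard-conforming sketch builds disjoint counts and displacements, so that no location on the root is read more than once:

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
rank = comm.Get_rank()

N = 100                                              # total number of elements, as in the question
base, extra = divmod(N, nprocs)
counts = [base + (1 if r < extra else 0) for r in range(nprocs)]
displs = [sum(counts[:r]) for r in range(nprocs)]    # displs[r+1] == displs[r] + counts[r]: disjoint

bigx = np.arange(N, dtype=np.float64) if rank == 0 else None
x = np.zeros(counts[rank], dtype=np.float64)
comm.Scatterv([bigx, counts, displs, MPI.DOUBLE], x, root=0)

If neighboring ranks also need the few values just across their segment boundary, those can be exchanged explicitly after the scatter (for example with comm.Sendrecv), which keeps the collective itself within the standard.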




Answer 2:


The MPI standard includes many requirements that are often not strictly checked by the implementations, mainly for performance reasons. The rationale is that any program that is correct according to the standard will also be correct given a set of relaxed constraints. Relying on such implementation-specific behaviour results in non-portable code and goes against the standard.

There are many valid reasons to require disjoint send segments. The immediately visible one is the symmetry with MPI_Gatherv. For the latter the segments must be disjoint, otherwise the content of the memory after the gather will depend on the order of the underlying receive operations. Since in a typical MPI program scatters are usually mirrored by gathers, the computed offset and count arrays can be reused if the same constraints apply to both the gather and the scatter. A less obvious reason is that on some architectures the network equipment might not allow simultaneous reads from overlapping memory regions.
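As a sketch of that mirroring (an illustrative example, not part of the answer), the same disjoint counts and displs arrays can drive both the Scatterv and the matching Gatherv:

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
rank = comm.Get_rank()

N = 100
base, extra = divmod(N, nprocs)
counts = [base + (1 if r < extra else 0) for r in range(nprocs)]   # disjoint segments
displs = [sum(counts[:r]) for r in range(nprocs)]

bigx = np.arange(N, dtype=np.float64) if rank == 0 else None
x = np.empty(counts[rank], dtype=np.float64)

comm.Scatterv([bigx, counts, displs, MPI.DOUBLE], x, root=0)
x *= 2.0                                                           # some local work on each segment

result = np.empty(N, dtype=np.float64) if rank == 0 else None
recvbuf = [result, counts, displs, MPI.DOUBLE] if rank == 0 else None
comm.Gatherv(x, recvbuf, root=0)                                   # same counts/displs reused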

As it is very easy for non-standard MPI behaviour to creep into the program code during development, one might want to use tools like our MUST to check the correctness of the program.



Source: https://stackoverflow.com/questions/36582696/mpi-python-scatterv-and-overlapping-data
