When starting an MPI job with mpirun or mpiexec, I can understand how one might go about starting each individual process. However, without any compile
Details on how individual processes establish the MPI universe are implementation-specific. You should look into the source code of the specific library to understand how it works. There are two almost universal approaches, though: command-line arguments and environment variables. The first is why MPI_Init() is called with argc and argv in C: the library can then access the command line and extract all arguments that are meant for it. Open MPI, for example, uses the second: it sets environment variables and also writes some universe state to a disk location known to all processes that run on the same node. You can easily see the special variables that its run-time component ORTE (Open Run-Time Environment) uses by executing a command like mpirun -np 1 printenv:
$ mpiexec -np 1 printenv | grep OMPI
... <many more> ...
OMPI_MCA_orte_hnp_uri=1660944384.0;tcp://x.y.z.t:43276;tcp://p.q.r.f:43276
OMPI_MCA_orte_local_daemon_uri=1660944384.1;tcp://x.y.z.t:36541
... <many more> ...
(IPs changed for security reasons)
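To make the output above a bit more concrete, here is a minimal Python sketch that picks the OMPI_-prefixed variables out of an environment and splits a contact URI into its parts. The helper names ompi_variables and parse_contact_uri are made up for illustration, and the "jobid.vpid;endpoint;endpoint" layout is only inferred from the sample output, not taken from any Open MPI specification, so treat the parsing as an assumption.

```python
import os


def ompi_variables(environ=None):
    """Return the OMPI_-prefixed variables from an environment mapping."""
    env = os.environ if environ is None else environ
    return {key: value for key, value in env.items() if key.startswith("OMPI_")}


def parse_contact_uri(value):
    """Split 'jobid.vpid;tcp://host:port;...' into its parts.

    Assumption: the layout is inferred from the sample output only.
    """
    name, *endpoints = value.split(";")
    jobid, _, vpid = name.partition(".")
    return {"jobid": int(jobid), "vpid": int(vpid), "endpoints": endpoints}


if __name__ == "__main__":
    # Print whatever the launcher placed in this process's environment.
    for key, value in sorted(ompi_variables().items()):
        print(f"{key}={value}")
    uri = os.environ.get("OMPI_MCA_orte_hnp_uri")
    if uri:
        print(parse_contact_uri(uri))
```

Started through the launcher in place of printenv, the script should list the same variables; started directly from a shell, it will most likely print nothing, because no launcher has populated the environment.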
Once a child process is launched remotely and MPI_Init() or MPI_Init_thread() is called, ORTE kicks in and reads those environment variables. It then connects back to the "home" mpirun/mpiexec process at the specified network address, and that process coordinates all spawned processes into establishing the MPI universe.
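The connect-back idea can be illustrated with a toy launcher in plain Python. This is not how ORTE is implemented; it is only a sketch of the general pattern under simplifying assumptions: everything runs on one machine, and the variable names TOY_HNP_URI and TOY_RANK are invented for the example. The parent opens a listening socket, advertises its address through the environment, starts the children, and waits for each of them to call home.

```python
import os
import socket
import subprocess
import sys

# Code run by each child: read the (hypothetical) variables set by the
# launcher, connect back to it, and report the assigned rank.
CHILD = r"""
import os, socket
host, port = os.environ["TOY_HNP_URI"].rsplit(":", 1)
with socket.create_connection((host, int(port))) as conn:
    conn.sendall(os.environ["TOY_RANK"].encode())
"""


def launch(nprocs):
    """Start nprocs children and return the sorted ranks that connected back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(nprocs)
        server.settimeout(10)  # fail instead of hanging if a child never connects
        host, port = server.getsockname()

        children = []
        for rank in range(nprocs):
            # Each child inherits the parent's environment plus two extra variables.
            env = dict(os.environ, TOY_HNP_URI=f"{host}:{port}", TOY_RANK=str(rank))
            children.append(subprocess.Popen([sys.executable, "-c", CHILD], env=env))

        ranks = []
        for _ in range(nprocs):
            conn, _addr = server.accept()
            with conn:
                chunks = []
                while True:
                    chunk = conn.recv(64)
                    if not chunk:  # the child closed its end of the connection
                        break
                    chunks.append(chunk)
                ranks.append(int(b"".join(chunks).decode()))

        for child in children:
            child.wait()
    return sorted(ranks)


if __name__ == "__main__":
    print(launch(4))
```

A real launcher also has to start processes on remote nodes and exchange far more state, but the environment-then-connect handshake follows the same shape.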
Other MPI implementations work in a similar fashion.