Question
My question is how to create a dynamic pool of workers with MPI.
There is a large 1D array/vector (NNN = 10^6 to 10^7 elements). I have to perform some calculation on each cell. The problem is embarrassingly parallel.
The current approach (it works fine): each MPI process reads a common .dat file, stores the values in a large vector of size NNN local to its rank, and performs the computation on its own part of the array. The length of this part is NNN/nprocs, where nprocs is the number of MPI processes.
The problem: some parts of the array (of length NNN/nprocs) finish very quickly, so some CPUs sit idle while they wait for the others to finish.
Question 1: How can I implement a dynamic schedule, so that CPUs that have finished their tasks can pick up a new task and keep working?
Question 2: Is there a built-in MPI procedure that automatically schedules workers and tasks?
Here is my code (static schedule):
{
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Offset offset;
    MPI_File file;
    MPI_Status status;
    // Half-open range [Pstart, Pend) of cells owned by this rank;
    // the first NNN % nprocs ranks get one extra cell.
    int Pstart = (NNN / nprocs) * rank + ((NNN % nprocs) < rank ? (NNN % nprocs) : rank);
    int Pend = Pstart + (NNN / nprocs) + ((NNN % nprocs) > rank);
    // Use the actual range length so the remainder cells are not skipped.
    int count = Pend - Pstart;
    offset = sizeof(double) * Pstart;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat", MPI_MODE_CREATE|MPI_MODE_WRONLY, MPI_INFO_NULL, &file);
    double * local_array;
    local_array = new double [count];
    for (int i = 0; i < count; i++)
    {
        /* compute the integral for one cell of this rank's part of the array */
        adapt_integrate(1, Integrand, par, 2, a, b, MaxEval, tol, tol, &val, &err);
        // store the result of the integration in the local array
        local_array[i] = val;
    }
    // here, all local arrays are written to one shared file "shared.dat"
    MPI_File_seek(file, offset, MPI_SEEK_SET);
    MPI_File_write(file, local_array, count, MPI_DOUBLE, &status);
    MPI_File_close(&file);
    delete [] local_array;
}
Answer 1:
A related question covers a similar problem; to recap: designate a master process that issues chunks of work to the others. Each worker does a blocking receive for a work item, performs its calculation, does a blocking send of the result back to the master, and repeats. The master can manage work items either by posting a nonblocking receive for each worker and polling to see whether any of them has completed, or by posting a single blocking receive with MPI_ANY_SOURCE as the source.
Source: https://stackoverflow.com/questions/16015291/dynamic-pool-of-workers-with-mpi-for-large-array-c