Using mpi4py to parallelize a 'for' loop on a compute cluster

时光怂恿深爱的人放手 提交于 2020-07-05 04:35:52

问题


I haven't worked with distributed computing before, but I'm trying to integrate mpi4py into a program in order to parallelize a for loop on a compute cluster.

This is a pseudocode of what I want to do:

for file in directory: Initialize a class Run class methods Conglomerate results

I've looked all over stack overflow and I can't find any solution to this. Is there any way to do this simply with mpi4py, or is there another tool that can do it with easy installation and setup?


回答1:


In order to achieve parallelism of a for loop with MPI4Py check the code example below. Its just a for loop to add some numbers. The for loop will execute in every node. Every node will get a different chunk of data to work with (range in for loop). In the end Node with rank zero will add the results from all the nodes.

#!/usr/bin/python

import numpy
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

a = 1
b = 1000000

perrank = b//size
summ = numpy.zeros(1)

comm.Barrier()
start_time = time.time()

temp = 0
for i in range(a + rank*perrank, a + (rank+1)*perrank):
    temp = temp + i

summ[0] = temp

if rank == 0:
    total = numpy.zeros(1)
else:
    total = None

comm.Barrier()
#collect the partial results and add to the total sum
comm.Reduce(summ, total, op=MPI.SUM, root=0)

stop_time = time.time()

if rank == 0:
    #add the rest numbers to 1 000 000
    for i in range(a + (size)*perrank, b+1):
        total[0] = total[0] + i
    print ("The sum of numbers from 1 to 1 000 000: ", int(total[0]))
    print ("time spent with ", size, " threads in milliseconds")
    print ("-----", int((time.time()-start_time)*1000), "-----")

In order to execute the code above you should run it like this:

$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=16:ompthreads=1 -I # Salomon: ncpus=24:mpiprocs=24
$ ml Python
$ ml OpenMPI
$ mpiexec -bycore -bind-to-core python hello_world.py

In this example, we run MPI4Py enabled code on 4 nodes, 16 cores per node (total of 64 processes), each python process is bound to a different core.

Sources that may help you:
Submit job with python code (mpi4py) on HPC cluster
https://github.com/JordiCorbilla/mpi4py-examples/tree/master/src/examples/matrix%20multiplication



来源:https://stackoverflow.com/questions/50979373/using-mpi4py-to-parallelize-a-for-loop-on-a-compute-cluster

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!