How to use multiprocessing in Python

星月不相逢 2021-01-18 21:12

New to Python and I want to do parallel programming in the following code, and want to use multiprocessing in Python to do it. So how do I modify the code? I've been searchin…

1 Answer
  •  深忆病人
    2021-01-18 21:46

    It's hard to parallelize code that needs to mutate the same shared data from different tasks. So, I'm going to assume that setinner and setouter are non-mutating functions; if that's not true, things will be more complicated.

    The first step is to decide what you want to do in parallel.


    One obvious thing is to do the setinner and setouter at the same time. They're completely independent of each other, and always need to both get done. So, that's what I'll do. Instead of doing this:

    inneropt,partition,x = setinner(Q,G,n)
    outeropt = setouter(Q,G,n)
    

    … we want to submit the two functions as tasks to the pool, then wait for both to be done, then get the results of both.

    The concurrent.futures module (which requires a third-party backport in Python 2.x) makes it easier to do things like "wait for both to be done" than the multiprocessing module (which is in the stdlib in 2.6+), but in this case, we don't need anything fancy; if one of them finishes early, we don't have anything to do until the other finishes anyway. So, let's stick with multiprocessing.apply_async:

    pool = multiprocessing.Pool(2) # we never have more than 2 tasks to run
    while i < 1000:
        # start both tasks in parallel
        inner_result = pool.apply_async(setinner, (Q, G, n))
        outer_result = pool.apply_async(setouter, (Q, G, n))
    
        # sequentially wait for both tasks to finish and get their results
        inneropt,partition,x = inner_result.get()
        outeropt = outer_result.get()
    
        # the rest of your loop is unchanged
    

    You may want to move the pool outside the function so it lives forever and can be used by other parts of your code. If not, you almost certainly want to shut the pool down at the end of the function. (Later versions of multiprocessing let you just use the pool in a with statement, but that requires Python 3.3+, so in 2.x you have to do it explicitly.)
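
    For example, here's a minimal sketch of the explicit shutdown; solve is just a made-up stand-in for your function:

    import multiprocessing

    def solve(Q, G, n):
        pool = multiprocessing.Pool(2)
        try:
            pass  # submit tasks with pool.apply_async and .get() the results, as above
        finally:
            pool.close()  # tell the workers that no more tasks are coming
            pool.join()   # wait for them to finish and exit

    # on Python 3.3+ the pool is a context manager, and leaving the with
    # block calls terminate() for you:
    # with multiprocessing.Pool(2) as pool:
    #     ...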


    What if you want to do more work in parallel? Well, there's nothing else obvious to do here without restructuring the loop. You can't do updateGraph until you get the results back from setinner and setouter, and nothing else is slow here.

    But if you could reorganize things so that each iteration's setinner were independent of everything that came before (which may or may not be possible with your algorithm—without knowing what you're doing, I can't guess), you could push all 2000 tasks onto the queue up front, then loop by just grabbing results as needed. For example:

    pool = multiprocessing.Pool() # let it default to the number of cores
    inner_results = []
    outer_results = []
    for i in range(1000):
        inner_results.append(pool.apply_async(setinner, (Q, G, n, i)))
        outer_results.append(pool.apply_async(setouter, (Q, G, n, i)))
    i = 0
    while i < 1000:
        inneropt,partition,x = inner_results.pop(0).get()
        outeropt = outer_results.pop(0).get()
        # the rest of your loop is unchanged
    

    Of course you can make this fancier.

    For example, let's say you rarely need more than a couple hundred iterations, so it's wasteful to always compute 1000 of them. You can push the first N at startup and then push one more every time through the loop (or N more every N times), so you never do more than N wasted iterations. You can't get an ideal tradeoff between perfect parallelism and minimal waste, but you can usually tune it pretty nicely.
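
    Here's a rough sketch of that idea; N is a number I've picked arbitrarily, the i += 1 stands in for however your real loop advances, and the task signatures are the same setinner(Q, G, n, i) / setouter(Q, G, n, i) as above:

    import multiprocessing

    N = 200  # how far ahead to speculate; tune to taste
    pool = multiprocessing.Pool()
    inner_results = []
    outer_results = []
    # prime the queue with the first N iterations
    for i in range(N):
        inner_results.append(pool.apply_async(setinner, (Q, G, n, i)))
        outer_results.append(pool.apply_async(setouter, (Q, G, n, i)))
    i = 0
    while i < 1000:
        # keep roughly N tasks in flight by pushing one more each time through
        if i + N < 1000:
            inner_results.append(pool.apply_async(setinner, (Q, G, n, i + N)))
            outer_results.append(pool.apply_async(setouter, (Q, G, n, i + N)))
        inneropt,partition,x = inner_results.pop(0).get()
        outeropt = outer_results.pop(0).get()
        # the rest of your loop is unchanged, including whatever lets you stop early
        i += 1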

    Also, if the tasks don't actually take that long, but you have a lot of them, you may want to batch them up. One really easy way to do this is to use one of the map variants instead of apply_async; this can make your fetching code a tiny bit more complicated, but it makes the queuing and batching code completely trivial (e.g., to map each func over a list of 100 parameters with a chunksize of 10 is just two simple lines of code).
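
    For instance, here's a rough sketch of the batched version; inner_task and outer_task are hypothetical one-argument wrappers you'd write yourself, since the map variants pass exactly one argument to the function:

    import multiprocessing

    # hypothetical module-level wrappers so map can pass just the iteration index;
    # this relies on Q, G, n being visible to the worker processes (e.g. module
    # globals on a fork-based platform)
    def inner_task(i):
        return setinner(Q, G, n, i)

    def outer_task(i):
        return setouter(Q, G, n, i)

    pool = multiprocessing.Pool()
    # queue up 100 iterations of each, handed to the workers in chunks of 10
    inner_results = pool.map_async(inner_task, range(100), chunksize=10)
    outer_results = pool.map_async(outer_task, range(100), chunksize=10)

    for (inneropt, partition, x), outeropt in zip(inner_results.get(),
                                                  outer_results.get()):
        pass  # the rest of your loop is unchanged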
