How do I parallelize a simple Python loop?

北荒 2020-11-22 11:54

This is probably a trivial question, but how do I parallelize the following loop in Python?

# setup output lists
output1 = list()
output2 = list()
output3 = list()

for j in range(0, 10):
    # calc individual parameter value
    parameter = j * offset
    # call the calculation
    out1, out2, out3 = calc_stuff(parameter=parameter)

    # put results into correct output list
    output1.append(out1)
    output2.append(out2)
    output3.append(out3)

13 answers
  • 2020-11-22 12:15

    This could be useful when implementing multiprocessing and parallel/distributed computing in Python.

    YouTube tutorial on using the techila package

    Techila is a distributed computing middleware, which integrates directly with Python using the techila package. The peach function in the package can be useful for parallelizing loop structures. (The following code snippet is adapted from the Techila Community Forums.)

    import techila

    techila.peach(funcname = 'theheavyalgorithm', # Function that will be called on the compute nodes/Workers
        files = 'theheavyalgorithm.py',           # Python file that will be sourced on Workers
        jobs = jobcount                           # Number of Jobs in the Project
        )
    
  • 2020-11-22 12:20

    Why don't you use threads, and one mutex to protect one global list?

    import threading

    class thread_it(threading.Thread):
        def __init__(self, param):
            threading.Thread.__init__(self)
            self.param = param
        def run(self):
            # do the heavy work outside the lock so threads can overlap
            result = calc_stuff(self.param)
            with mutex:
                # only the append to the shared list is serialized
                output.append(result)


    threads = []
    output = []
    mutex = threading.Lock()

    for j in range(0, 10):
        current = thread_it(j * offset)
        threads.append(current)
        current.start()

    for t in threads:
        t.join()

    # here you have the output list filled with data
    

    Keep in mind, you will only be as fast as your slowest thread. Also note that in CPython the GIL prevents threads from running Python bytecode in parallel, so this approach pays off when calc_stuff is I/O-bound or spends its time in C extensions that release the GIL.
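
    Since each thread produces exactly one result, you can also drop the mutex by giving every thread its own slot in a pre-sized list. A minimal sketch (calc_stuff and offset are assumed from the question):

    import threading

    output = [None] * 10                 # one slot per thread, no lock needed

    def worker(i):
        # each thread writes only to its own index
        output[i] = calc_stuff(i * offset)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()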

  • 2020-11-22 12:21

    What's the easiest way to parallelize this code?

    I really like concurrent.futures for this, available in Python 3 since version 3.2, and via a backport to 2.6 and 2.7 on PyPI.

    You can use threads or processes and use the exact same interface.

    Multiprocessing

    Put this in a file - futuretest.py:

    import concurrent.futures
    import time, random               # add some random sleep time
    
    offset = 2                        # you don't supply these so
    def calc_stuff(parameter=None):   # these are examples.
        sleep_time = random.choice([0, 1, 2, 3, 4, 5])
        time.sleep(sleep_time)
        return parameter / 2, sleep_time, parameter * parameter
    
    def procedure(j):                 # just factoring out the
        parameter = j * offset        # procedure
        # call the calculation
        return calc_stuff(parameter=parameter)
    
    def main():
        output1 = list()
        output2 = list()
        output3 = list()
        start = time.time()           # let's see how long this takes
    
        # we can swap out ProcessPoolExecutor for ThreadPoolExecutor
        with concurrent.futures.ProcessPoolExecutor() as executor:
            for out1, out2, out3 in executor.map(procedure, range(0, 10)):
                # put results into correct output list
                output1.append(out1)
                output2.append(out2)
                output3.append(out3)
        finish = time.time()
        # these kinds of format strings are only available on Python 3.6:
        # time to upgrade!
        print(f'original inputs: {repr(output1)}')
        print(f'total time to execute {sum(output2)} = sum({repr(output2)})')
        print(f'time saved by parallelizing: {sum(output2) - (finish-start)}')
        print(f'returned in order given: {repr(output3)}')
    
    if __name__ == '__main__':
        main()
    

    And here's the output:

    $ python3 -m futuretest
    original inputs: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    total time to execute 33 = sum([0, 3, 3, 4, 3, 5, 1, 5, 5, 4])
    time saved by parallelizing: 27.68999981880188
    returned in order given: [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
    

    Multithreading

    Now change ProcessPoolExecutor to ThreadPoolExecutor, and run the module again:

    $ python3 -m futuretest
    original inputs: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    total time to execute 19 = sum([0, 2, 3, 5, 2, 0, 0, 3, 3, 1])
    time saved by parallelizing: 13.992000102996826
    returned in order given: [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
    

    Now you have done both multithreading and multiprocessing!

    Note on performance and using both together.

    The sample here is far too small to compare the results rigorously.

    However, I suspect that multithreading will be faster than multiprocessing for a workload like this one, especially on Windows, since Windows doesn't support forking, so each new process has to take time to launch. Note that the work in this example is time.sleep, which releases the GIL; for CPU-bound Python code, processes usually win because threads are serialized by the GIL. On Linux or Mac the two will probably be closer.

    You can nest multiple threads inside multiple processes, but it's recommended not to use multiple threads to spin off multiple processes.
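
    executor.map returns results in input order. If you would rather handle each result as soon as it completes, the same executors also support submit plus as_completed; a minimal sketch, reusing procedure from the example above:

    import concurrent.futures

    def main_unordered():
        with concurrent.futures.ProcessPoolExecutor() as executor:
            # submit returns a Future immediately; as_completed yields
            # futures in completion order, not submission order
            futures = {executor.submit(procedure, j): j for j in range(0, 10)}
            for future in concurrent.futures.as_completed(futures):
                out1, out2, out3 = future.result()
                print(f'job {futures[future]} finished: {out3}')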

  • 2020-11-22 12:21

    This is the easiest way to do it!

    You can use asyncio. (Documentation can be found here.) It is used as a foundation for multiple Python asynchronous frameworks that provide high-performance network and web servers, database connection libraries, distributed task queues, etc. Plus it has both high-level and low-level APIs to accommodate any kind of problem.

    import asyncio
    import functools

    def background(f):
        def wrapped(*args, **kwargs):
            loop = asyncio.get_event_loop()
            # run_in_executor does not forward keyword arguments,
            # so bind them with functools.partial
            return loop.run_in_executor(None, functools.partial(f, *args, **kwargs))

        return wrapped
    
    @background
    def your_function(argument):
        #code
    

    Now this function will run in parallel whenever it is called, without putting the main program into a wait state. You can use it to parallelize a for loop as well: the loop itself stays sequential, but every iteration is kicked off in parallel to the main program as soon as the interpreter reaches it. For instance:

    import time

    @background
    def your_function(argument):
        time.sleep(5)
        print('function finished for ' + str(argument))
    
    
    for i in range(10):
        your_function(i)
    
    
    print('loop finished')
    

    This produces the following output (the completion order will vary between runs):

    loop finished
    function finished for 4
    function finished for 8
    function finished for 0
    function finished for 3
    function finished for 6
    function finished for 2
    function finished for 5
    function finished for 7
    function finished for 9
    function finished for 1
    
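    The decorator above fires and forgets. If you also want the return values, you can keep the futures it returns and wait on them; a minimal sketch, assuming the pre-3.10 asyncio.get_event_loop() behaviour that the decorator itself relies on:

    import asyncio

    loop = asyncio.get_event_loop()
    # each call returns a future because of the decorator
    futures = [your_function(i) for i in range(10)]
    # block until every background call has finished
    results = loop.run_until_complete(asyncio.gather(*futures))
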
  • 2020-11-22 12:24

    To parallelize a simple for loop, joblib brings a lot of value over raw use of multiprocessing: not only the short syntax, but also things like transparent batching of iterations when they are very fast (to remove the per-call overhead) and capturing of the traceback of the child process, for better error reporting. Applied to the question's loop, it looks like the sketch below.

    Disclaimer: I am the original author of joblib.
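
    A minimal sketch (calc_stuff and offset are assumed from the question; batch_size='auto' is the default that performs the transparent batching mentioned above):

    from joblib import Parallel, delayed

    # run the iterations on all cores; each call returns (out1, out2, out3)
    results = Parallel(n_jobs=-1, batch_size='auto')(
        delayed(calc_stuff)(parameter=j * offset) for j in range(0, 10)
    )

    # unzip the list of 3-tuples into the question's three output lists
    output1, output2, output3 = (list(t) for t in zip(*results))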

  • 2020-11-22 12:24
    from joblib import Parallel, delayed
    import multiprocessing
    
    inputs = range(10) 
    def processInput(i):
        return i * i
    
    num_cores = multiprocessing.cpu_count()
    
    results = Parallel(n_jobs=num_cores)(delayed(processInput)(i) for i in inputs)
    print(results)
    

    The above works beautifully on my machine (Ubuntu; the joblib package was pre-installed, but it can be installed via pip install joblib).

    Taken from https://blog.dominodatalab.com/simple-parallelization/
