How can I use threading in Python?

前端 未结 19 2687
迷失自我
迷失自我 2020-11-21 04:54

I am trying to understand threading in Python. I\'ve looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I\'m having trou

19条回答
  •  天命终不由人
    2020-11-21 05:06

    Since this question was asked in 2010, there has been real simplification in how to do simple multithreading with Python with map and pool.

    The code below comes from an article/blog post that you should definitely check out (no affiliation) - Parallelism in one line: A Better Model for Day to Day Threading Tasks. I'll summarize below - it ends up being just a few lines of code:

    from multiprocessing.dummy import Pool as ThreadPool
    pool = ThreadPool(4)
    results = pool.map(my_function, my_array)
    

    Which is the multithreaded version of:

    results = []
    for item in my_array:
        results.append(my_function(item))
    

    Description

    Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.

    Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.

    Enter image description here


    Implementation

    Parallel versions of the map function are provided by two libraries:multiprocessing, and also its little known, but equally fantastic step child:multiprocessing.dummy.

    multiprocessing.dummy is exactly the same as multiprocessing module, but uses threads instead (an important distinction - use multiple processes for CPU-intensive tasks; threads for (and during) I/O):

    multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.

    import urllib2
    from multiprocessing.dummy import Pool as ThreadPool
    
    urls = [
      'http://www.python.org',
      'http://www.python.org/about/',
      'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
      'http://www.python.org/doc/',
      'http://www.python.org/download/',
      'http://www.python.org/getit/',
      'http://www.python.org/community/',
      'https://wiki.python.org/moin/',
    ]
    
    # Make the Pool of workers
    pool = ThreadPool(4)
    
    # Open the URLs in their own threads
    # and return the results
    results = pool.map(urllib2.urlopen, urls)
    
    # Close the pool and wait for the work to finish
    pool.close()
    pool.join()
    

    And the timing results:

    Single thread:   14.4 seconds
           4 Pool:   3.1 seconds
           8 Pool:   1.4 seconds
          13 Pool:   1.3 seconds
    

    Passing multiple arguments (works like this only in Python 3.3 and later):

    To pass multiple arrays:

    results = pool.starmap(function, zip(list_a, list_b))
    

    Or to pass a constant and an array:

    results = pool.starmap(function, zip(itertools.repeat(constant), list_a))
    

    If you are using an earlier version of Python, you can pass multiple arguments via this workaround).

    (Thanks to user136036 for the helpful comment.)

提交回复
热议问题