How can I use threading in Python?

迷失自我 2020-11-21 04:54

I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trou

19 Answers
  • 2020-11-21 05:11

    Given a function, f, thread it like this:

    import threading
    threading.Thread(target=f).start()
    

    To pass arguments to f:

    threading.Thread(target=f, args=(a,b,c)).start()
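
    Note that start() returns immediately and a Thread discards the return value of its target. A minimal sketch of waiting with join() and collecting results in a shared list follows; the names f, run_all, and results are hypothetical and chosen only for illustration:

```python
import threading


def f(a, b, results, index):
    # A Thread discards its target's return value, so store the
    # result in a slot that only this thread writes to.
    results[index] = a + b


def run_all(pairs):
    results = [None] * len(pairs)
    threads = [
        threading.Thread(target=f, args=(a, b, results, i))
        for i, (a, b) in enumerate(pairs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # Wait until each thread has finished
    return results


if __name__ == "__main__":
    print(run_all([(1, 2), (3, 4), (5, 6)]))
```

    Each thread writes to its own index, so no lock is needed in this particular sketch.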
    
  • 2020-11-21 05:12

    Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and how each is used.

    Python 3 has a built-in library for concurrency and parallelism: concurrent.futures

    So I'll demonstrate with an experiment that runs four tasks (here, calls to sleep()) on a thread pool:

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from time import sleep, time
    
    def concurrent(max_worker):
        futures = []
        tic = time()
        with ThreadPoolExecutor(max_workers=max_worker) as executor:
            futures.append(executor.submit(sleep, 2))  # Two seconds sleep
            futures.append(executor.submit(sleep, 1))
            futures.append(executor.submit(sleep, 7))
            futures.append(executor.submit(sleep, 3))
            for future in as_completed(futures):
                if future.result() is not None:
                    print(future.result())
        print(f'Total elapsed time by {max_worker} workers:', time()-tic)
    
    concurrent(5)
    concurrent(4)
    concurrent(3)
    concurrent(2)
    concurrent(1)
    

    Output:

    Total elapsed time by 5 workers: 7.007831811904907
    Total elapsed time by 4 workers: 7.007944107055664
    Total elapsed time by 3 workers: 7.003149509429932
    Total elapsed time by 2 workers: 8.004627466201782
    Total elapsed time by 1 workers: 13.013478994369507
    

    [NOTE]:

    • As the results above show, three workers were already enough for these four tasks: the total can never drop below the longest task (7 seconds), so a fourth or fifth worker gains nothing.
    • If your tasks are CPU-bound rather than I/O-bound or blocking (multiprocessing vs. threading), you can change ThreadPoolExecutor to ProcessPoolExecutor.
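
    The timings above can be reproduced on paper. The sketch below is only a simulation, not a measurement: it assumes that tasks start in submission order and that each one goes to whichever worker becomes free first. The function name simulated_elapsed is hypothetical.

```python
import heapq


def simulated_elapsed(durations, max_workers):
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * max_workers
    heapq.heapify(free_at)
    finish = 0.0
    for duration in durations:
        start = heapq.heappop(free_at)  # Earliest available worker
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    for workers in (5, 4, 3, 2, 1):
        print(workers, simulated_elapsed([2, 1, 7, 3], workers))
```

    For the four sleeps used above, this gives 7 seconds for three or more workers, 8 seconds for two, and 13 seconds for one, which matches the measured output up to a few milliseconds of overhead.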
  • 2020-11-21 05:14

    Using the blazing new concurrent.futures module:

    def sqr(val):
        import time
        time.sleep(0.1)
        return val * val
    
    def process_result(result):
        print(result)
    
    def process_these_asap(tasks):
        import concurrent.futures
    
        with concurrent.futures.ProcessPoolExecutor() as executor:
            futures = []
            for task in tasks:
                futures.append(executor.submit(sqr, task))
    
            for future in concurrent.futures.as_completed(futures):
                process_result(future.result())
            # Or instead of all this just do:
            # results = executor.map(sqr, tasks)
            # list(map(process_result, results))
    
    def main():
        tasks = list(range(10))
        print('Processing {} tasks'.format(len(tasks)))
        process_these_asap(tasks)
        print('Done')
        return 0
    
    if __name__ == '__main__':
        import sys
        sys.exit(main())
    

    The executor approach might seem familiar to all those who have gotten their hands dirty with Java before.

    Also, on a side note: to keep the universe sane, don't forget to shut down your pools/executors if you don't use a with block (which is so awesome that it does it for you).
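
    A minimal sketch of what shutting down by hand might look like when a with block is not an option. It uses ThreadPoolExecutor for brevity; the helper names square and squares_without_with are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor


def square(val):
    return val * val


def squares_without_with(values):
    executor = ThreadPoolExecutor(max_workers=4)
    try:
        # list() consumes the lazy iterator returned by map(),
        # so all results are collected before shutdown.
        return list(executor.map(square, values))
    finally:
        # Runs even if a task raised; wait=True blocks until the
        # worker threads have finished and been released.
        executor.shutdown(wait=True)


if __name__ == "__main__":
    print(squares_without_with(range(5)))
```

    The try/finally does roughly what the with statement does on exit, which is why the with form is usually preferable.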

  • 2020-11-21 05:19

    NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).

    However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:

    import threading
    
    class SummingThread(threading.Thread):
         def __init__(self,low,high):
             super(SummingThread, self).__init__()
             self.low=low
             self.high=high
             self.total=0
    
         def run(self):
             for i in range(self.low,self.high):
                 self.total+=i
    
    
    thread1 = SummingThread(0,500000)
    thread2 = SummingThread(500000,1000000)
    thread1.start() # This actually causes the thread to run
    thread2.start()
    thread1.join()  # This waits until the thread has completed
    thread2.join()
    # At this point, both threads have completed
    result = thread1.total + thread2.total
    print(result)
    

    Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.
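
    For contrast, here is a sketch of a case where threads do help: blocking waits. It assumes time.sleep is an acceptable stand-in for a real I/O call (both release the global interpreter lock while waiting); fake_io and timed_threads are hypothetical names.

```python
import threading
import time


def fake_io(delay):
    # time.sleep stands in for a blocking I/O call; the GIL is
    # released while the thread waits.
    time.sleep(delay)


def timed_threads(delays):
    threads = [threading.Thread(target=fake_io, args=(d,)) for d in delays]
    tic = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - tic


if __name__ == "__main__":
    elapsed = timed_threads([0.2] * 4)
    print(f"Four 0.2 s waits took {elapsed:.2f} s in total")
```

    Because the waits overlap, the total should come out close to the longest single wait rather than the sum of all four.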

  • 2020-11-21 05:19

    I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):

    import queue  # Named Queue in Python 2
    import threading
    import multiprocessing
    import subprocess
    
    q = queue.Queue()
    for i in range(30): # Put 30 tasks in the queue
        q.put(i)
    
    def worker():
        while True:
            item = q.get()
            # Execute a task: call a shell program and wait until it completes
            subprocess.call("echo " + str(item), shell=True)
            q.task_done()
    
    cpus = multiprocessing.cpu_count() # Detect number of cores
    print("Creating %d threads" % cpus)
    for i in range(cpus):
         t = threading.Thread(target=worker)
         t.daemon = True
         t.start()
    
    q.join() # Block until all tasks are done
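
    The workers above loop forever and rely on being daemon threads, so they are simply killed when the program exits. A possible variant, sketched here with hypothetical names (worker, run_pool) and a trivial squaring task in place of the shell call, uses one None sentinel per thread so that each worker exits cleanly and can be joined:

```python
import queue
import threading


def worker(q, results):
    while True:
        item = q.get()
        if item is None:  # Sentinel: no more work for this thread
            q.task_done()
            break
        # Appending to a list from several threads is safe in CPython.
        results.append(item * item)
        q.task_done()


def run_pool(items, num_threads=4):
    q = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(q, results))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    for _ in threads:
        q.put(None)  # One sentinel per worker
    for t in threads:
        t.join()  # Every worker has seen its sentinel and returned
    return results


if __name__ == "__main__":
    print(sorted(run_pool(range(10))))
```

    The results arrive in whatever order the threads finish, hence the sorted() call in the usage example.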
    
  • 2020-11-21 05:19

    Here is a very simple example of CSV import using threading. (The imports you need may differ depending on your project.)

    Helper Functions:

    from threading import Thread
    from project import app
    import csv
    
    
    def import_handler(csv_file_name):
        thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
        thr.start()
    
    def dump_async_csv_data(csv_file_name):
        with app.app_context():
            with open(csv_file_name, newline='') as csv_file:
                reader = csv.DictReader(csv_file)
                for row in reader:
                    # DB operation/query goes here
                    pass
    

    Driver Function:

    import_handler(csv_file_name)
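
    To try the same pattern without a web application, here is a self-contained sketch. It drops the app context, appends each row to a list as a stand-in for the DB operation, and returns the thread so the caller can wait for it. The names load_rows and import_in_background and the sample data are hypothetical.

```python
import csv
import os
import tempfile
import threading


def load_rows(csv_file_name, rows):
    # Runs in the background thread: parse the file row by row.
    with open(csv_file_name, newline="") as csv_file:
        for row in csv.DictReader(csv_file):
            rows.append(row)  # Stand-in for the DB operation


def import_in_background(csv_file_name, rows):
    thr = threading.Thread(target=load_rows, args=(csv_file_name, rows))
    thr.start()
    return thr  # Returned so the caller can join() if it needs to wait


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", newline="", delete=False
    ) as tmp:
        tmp.write("name,qty\napple,3\npear,5\n")
    rows = []
    import_in_background(tmp.name, rows).join()
    os.remove(tmp.name)
    print(rows)
```

    Without the join(), the caller continues immediately and the rows list fills in while the rest of the program runs.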
    