How can I use threading in Python?


I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trou

19 answers

陌清茗

2020-11-21 05:11
Given a function, f, thread it like this:
```
import threading
threading.Thread(target=f).start()
```
To pass arguments to f:
```
threading.Thread(target=f, args=(a,b,c)).start()
```
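Here is a minimal sketch of the same pattern with a concrete function, assuming you also want to wait for the thread to finish. The names `greet` and `results` are hypothetical and only there for illustration. A `Thread` does not hand back the target's return value, so the function stores its output in a shared list instead.

```python
import threading

results = []  # shared list the worker appends to (hypothetical, for illustration)

def greet(name, punctuation="!"):
    # A single list.append is atomic in CPython, so no lock is used here
    results.append(f"Hello, {name}{punctuation}")

# args is a tuple of positional arguments, kwargs a dict of keyword arguments
t = threading.Thread(target=greet, args=("world",), kwargs={"punctuation": "?"})
t.start()  # begin running greet in a new thread
t.join()   # block until the thread has finished

print(results)
```

Note the trailing comma in `args=("world",)`: a one-element tuple needs it, otherwise the string itself would be unpacked character by character.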

别跟我提以往

2020-11-21 05:12

Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and how each is used.

Python 3 (since 3.2) has a built-in library for concurrency and parallelism: concurrent.futures

So I'll demonstrate with an experiment that runs four tasks (each a sleep() call) on a thread pool:

```
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time

def concurrent(max_worker):
    futures = []
    tic = time()
    with ThreadPoolExecutor(max_workers=max_worker) as executor:
        futures.append(executor.submit(sleep, 2))  # Two seconds sleep
        futures.append(executor.submit(sleep, 1))
        futures.append(executor.submit(sleep, 7))
        futures.append(executor.submit(sleep, 3))
        for future in as_completed(futures):
            if future.result() is not None:
                print(future.result())
    print(f'Total elapsed time by {max_worker} workers:', time()-tic)

concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)
```

Output:

Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507

[NOTE]:

As the results show, three workers were enough for these four tasks: the total time was already bounded by the longest task (7 seconds), so a fourth or fifth worker gained nothing.
If your task is CPU-bound rather than I/O-bound or blocking (multiprocessing vs. threading), you can swap ThreadPoolExecutor for ProcessPoolExecutor.
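The timings can be reproduced on paper with a small simulation. This is only a model, assuming the pool hands tasks to the first free worker in submission order and ignoring thread start-up overhead; `simulated_makespan` is a hypothetical helper, not part of concurrent.futures.

```python
import heapq

def simulated_makespan(durations, workers):
    """Estimate total time when each queued task goes to the first free worker."""
    free_at = [0.0] * workers          # time at which each worker becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker takes the task
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

tasks = [2, 1, 7, 3]  # sleep durations in submission order, as in the experiment
for n in (5, 4, 3, 2, 1):
    print(n, "workers:", simulated_makespan(tasks, n))
```

Under those assumptions the model gives 7, 7, 7, 8, and 13 seconds, which lines up with the measured output: with two workers the 7-second task has to wait one second for a free worker, and with one worker everything runs back to back.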


野性不改

2020-11-21 05:14

Using the blazing new concurrent.futures module:

```
import concurrent.futures
import sys
import time

def sqr(val):
    time.sleep(0.1)
    return val * val

def process_result(result):
    print(result)

def process_these_asap(tasks):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = []
        for task in tasks:
            futures.append(executor.submit(sqr, task))

        for future in concurrent.futures.as_completed(futures):
            process_result(future.result())
        # Or instead of all this just do:
        # results = executor.map(sqr, tasks)
        # list(map(process_result, results))

def main():
    tasks = list(range(10))
    print('Processing {} tasks'.format(len(tasks)))
    process_these_asap(tasks)
    print('Done')
    return 0

if __name__ == '__main__':
    sys.exit(main())
```

The executor approach might seem familiar to anyone who has gotten their hands dirty with Java.

Also, on a side note: to keep the universe sane, don't forget to shut down your pools/executors if you don't use a with block (which is so awesome that it does it for you).
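For the case without a with block, a minimal sketch of explicit cleanup could look like the following. It uses ThreadPoolExecutor rather than ProcessPoolExecutor purely to keep the snippet small; both expose the same `shutdown()` method. `square` is a hypothetical stand-in for the real work.

```python
from concurrent.futures import ThreadPoolExecutor

def square(val):
    return val * val

executor = ThreadPoolExecutor(max_workers=4)
try:
    # map returns results in input order
    squares = list(executor.map(square, range(5)))
finally:
    # wait=True blocks until pending work finishes, then releases the workers
    executor.shutdown(wait=True)

print(squares)
```

The try/finally is what the with block does for you: the executor is shut down even if one of the tasks raises.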


梦如初夏

2020-11-21 05:19
NOTE: For actual parallelization in Python, you should use the multiprocessing module to start multiple processes that execute in parallel. Because of the global interpreter lock, threads in CPython are interleaved but in fact execute serially, not in parallel, so they are only useful when interleaving I/O operations.

However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:
```
import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        super().__init__()
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        for i in range(self.low, self.high):
            self.total += i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print(result)
```
Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.
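For contrast, here is a rough sketch of a case where threads do help: blocking waits. `time.sleep` stands in for a blocking network or disk call (it releases the global interpreter lock while waiting), and `fake_io` is a hypothetical name used only for this illustration.

```python
import threading
import time

def fake_io(delay, out, index):
    time.sleep(delay)        # stands in for a blocking network or disk call
    out[index] = delay       # each thread writes to its own slot

delays = [0.2, 0.2, 0.2, 0.2]
out = [None] * len(delays)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(d, out, i))
           for i, d in enumerate(delays)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The waits overlap, so elapsed should be near 0.2 s rather than the 0.8 s sum
print(f"elapsed: {elapsed:.2f}s")
```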

独厮守ぢ

2020-11-21 05:19

I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):

```
import queue  # this module is named Queue in Python 2
import threading
import multiprocessing
import subprocess

q = queue.Queue()
for i in range(30): # Put 30 tasks in the queue
    q.put(i)

def worker():
    while True:
        item = q.get()
        # Execute a task: call a shell program and wait until it completes
        subprocess.call("echo " + str(item), shell=True)
        q.task_done()

cpus = multiprocessing.cpu_count() # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
    t = threading.Thread(target=worker)
    t.daemon = True  # Daemon threads do not keep the program alive at exit
    t.start()

q.join() # Block until all tasks are done
```
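The workers above loop forever and rely on being daemon threads to disappear at exit. A possible variant, sketched here as an alternative rather than as what this answer does, stops each worker with a sentinel value so the threads can be joined cleanly. `STOP`, `done`, and the doubling step are hypothetical placeholders for the real work.

```python
import queue
import threading

STOP = object()          # sentinel telling a worker to exit (hypothetical name)
tasks = queue.Queue()
done = []                # results collected by the workers

def worker():
    while True:
        item = tasks.get()
        if item is STOP:
            tasks.task_done()
            break
        done.append(item * 2)   # placeholder for the real work
        tasks.task_done()

num_workers = 4
threads = [threading.Thread(target=worker) for _ in range(num_workers)]
for t in threads:
    t.start()

for i in range(30):
    tasks.put(i)
for _ in threads:
    tasks.put(STOP)      # one sentinel per worker

tasks.join()             # every item, sentinels included, has been processed
for t in threads:
    t.join()             # workers have returned, so no daemon flag is needed

print(sorted(done))
```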


旧时难觅i

2020-11-21 05:19

Here is a very simple example of a CSV import using threading. (The imports you need may differ depending on your purpose.)

Helper Functions:

```
from threading import Thread
from project import app  # the application object from your own project
import csv


def import_handler(csv_file_name):
    # Start the import in a background thread and return immediately
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()

def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as csv_file:
            reader = csv.DictReader(csv_file)
            for row in reader:
                # DB operation/query
                pass
```

Driver Function:

```
import_handler(csv_file_name)
```
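To try the idea without an application context, here is a self-contained sketch under a few assumptions: the DB write is replaced by appending to a list, the sample file is generated on the fly, and the handler returns the thread so the caller can wait for it. `load_csv` and `rows_seen` are hypothetical names, not part of the code above.

```python
import csv
import os
import tempfile
from threading import Thread

rows_seen = []  # filled by the background thread (stand-in for DB writes)

def load_csv(csv_file_name):
    with open(csv_file_name, newline="") as csv_file:
        for row in csv.DictReader(csv_file):
            rows_seen.append(row)   # a real handler would insert into the DB here

def import_handler(csv_file_name):
    thr = Thread(target=load_csv, args=[csv_file_name])
    thr.start()
    return thr          # returned so the caller can join() if it needs to

# Write a small sample file so the sketch runs on its own
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("name,qty\napple,3\npear,5\n")

thread = import_handler(tmp.name)
thread.join()           # wait for the import before inspecting the results
os.remove(tmp.name)
print(rows_seen)
```

In a web handler you would normally skip the `join()` so the request returns right away; it is only here so the result can be inspected.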
