Improving Python execution speed with parallel threads

问题

Let's say I have this sample code:

x = foo1(something1)
y = foo2(something2)

z = max(x, y)

I want to improve the execution time of this code by using threads (hope it helps isn't it?). I'd like to keep things as simple as possible so basically what I'd like to do is creating two threads working at the same time which compute respectively foo1 and foo2.

I'm reading something about threads but I found it a little tricky and I can't lose too much time in it just for doing such a simple thing.

回答1:

Assuming foo1 or foo2 is CPU-bound, threading doesn't improve the execution time... in fact, it normally makes it worse... for more information, see David Beazley's PyCon2010 presentation on the Global Interpreter Lock / Pycon2010 GIL slides. This presentation is very informative, I highly recommend it to anyone trying to distribute load across CPU cores.

The best way to improve performance is with the multiprocessing module

Assuming there is no shared state required between foo1() and foo2(), do this to improve execution performance...

from multiprocessing import Process, Queue
import time

def foo1(queue, arg1):
    # Measure execution time and return the total time in the queue
    print "Got arg1=%s" % arg1
    start = time.time()
    while (arg1 > 0):
        arg1 = arg1 - 1
        time.sleep(0.01)
    # return the output of the call through the Queue
    queue.put(time.time() - start)

def foo2(queue, arg1):
    foo1(queue, 2*arg1)

_start = time.time()
my_q1 = Queue()
my_q2 = Queue()

# The equivalent of x = foo1(50) in OP's code
p1 = Process(target=foo1, args=[my_q1, 50])
# The equivalent of y = foo2(50) in OP's code
p2 = Process(target=foo2, args=[my_q2, 50])

p1.start(); p2.start()
p1.join(); p2.join()
# Get return values from each Queue
x = my_q1.get()
y = my_q2.get()

print "RESULT", x, y
print "TOTAL EXECUTION TIME", (time.time() - _start)

From my machine, this results in:

mpenning@mpenning-T61:~$ python test.py 
Got arg1=100
Got arg1=50
RESULT 0.50578212738 1.01011300087
TOTAL EXECUTION TIME 1.02570295334
mpenning@mpenning-T61:~$

回答2:

You can use python thread module or the new Threading module, however using thread module is simpler in syntax,

#!/usr/bin/python

import thread
import time

# Define a function for the thread
def print_time( threadName, delay):
   count = 0
   while count < 5:
      time.sleep(delay)
      count += 1
      print "%s: %s" % ( threadName, time.ctime(time.time()) )

# Create two threads as follows
try:
   thread.start_new_thread( print_time, ("Thread-1", 2, ) )
   thread.start_new_thread( print_time, ("Thread-2", 4, ) )
except:
   print "Error: unable to start thread"

while 1:
   pass

will output,

Thread-1: Thu Jan 22 15:42:17 2009
Thread-1: Thu Jan 22 15:42:19 2009
Thread-2: Thu Jan 22 15:42:19 2009
Thread-1: Thu Jan 22 15:42:21 2009
Thread-2: Thu Jan 22 15:42:23 2009
Thread-1: Thu Jan 22 15:42:23 2009
Thread-1: Thu Jan 22 15:42:25 2009
Thread-2: Thu Jan 22 15:42:27 2009
Thread-2: Thu Jan 22 15:42:31 2009
Thread-2: Thu Jan 22 15:42:35 2009

Read more from here, Python - Multithreaded Programming

回答3:

Read the documentation here first to understand the code below:

import threading

def foo1(x=0):
    return pow(x, 2)

def foo2(y=0):
    return pow(y, 3)

thread1 = threading.Thread(target=foo1, args=(3))
thread2 = threading.Thread(target=foo2, args=(2))
thread1.start()
thread2.start()
thread1.join()
thread2.join()

回答4:

It won't help. Read the Python FAQ.

来源：https://stackoverflow.com/questions/8430899/improving-python-execution-speed-with-parallel-threads

标签

python

multithreading

performance

multiprocessing

gil