问题
I want to use multiprocessing.Pool, but multiprocessing.Pool can't abort a task after a timeout. I found solution and some modify it.
from multiprocessing import util, Pool, TimeoutError
from multiprocessing.dummy import Pool as ThreadPool
import threading
import sys
from functools import partial
import time
def worker(y):
print("worker sleep {} sec, thread: {}".format(y, threading.current_thread()))
start = time.time()
while True:
if time.time() - start >= y:
break
time.sleep(0.5)
# show work progress
print(y)
return y
def collect_my_result(result):
print("Got result {}".format(result))
def abortable_worker(func, *args, **kwargs):
timeout = kwargs.get('timeout', None)
p = ThreadPool(1)
res = p.apply_async(func, args=args)
try:
# Wait timeout seconds for func to complete.
out = res.get(timeout)
except TimeoutError:
print("Aborting due to timeout {}".format(args[1]))
# kill worker itself when get TimeoutError
sys.exit(1)
else:
return out
def empty_func():
pass
if __name__ == "__main__":
TIMEOUT = 4
util.log_to_stderr(util.DEBUG)
pool = Pool(processes=4)
# k - time to job sleep
featureClass = [(k,) for k in range(20, 0, -1)] # list of arguments
for f in featureClass:
# check available worker
pool.apply(empty_func)
# run job with timeout
abortable_func = partial(abortable_worker, worker, timeout=TIMEOUT)
pool.apply_async(abortable_func, args=f, callback=collect_my_result)
time.sleep(TIMEOUT)
pool.terminate()
print("exit")
main modification - worker process exit with sys.exit(1). It's kill worker process and kill job thread, but i'm not sure that this solution is good. What potential problems can i get, when process terminate itself with running job?
回答1:
There is no implicit risk in stopping a running job, the OS will take care of correctly terminating the process.
If your job is writing on files, you might end up with lots of truncated files on your disk.
Some small issue might also occur if you write on DBs or if you are connected with some remote process.
Nevertheless, Python standard Pool does not support timeouts and terminating processes abruptly might lead to weird behaviour within your applications.
Pebble processing Pool does support timing-out tasks.
from pebble import process, TimeoutError
with process.Pool() as pool:
task = pool.schedule(function, args=[1,2], timeout=5)
try:
result = task.get()
except TimeoutError:
print "Task: %s took more than 5 seconds to complete" % task
来源:https://stackoverflow.com/questions/38711840/python-multiprocessing-pool-timeout