Question
Python gurus, I need your help. I've run into quite strange behavior: an empty Python Process hangs on join. It looks like the fork inherits some locked resource.
Env:
- Python version: 3.5.3
- OS: Ubuntu 16.04.2 LTS
- Kernel: 4.4.0-75-generic
Problem description:
1) I have a logger with a background thread that handles messages and a queue feeding that thread. Logger source code (a little bit simplified; a sketch of it is given after the test script below).
2) And I have a simple script which uses my logger (just enough code to show the problem):
import os
from multiprocessing import Process
from my_logging import get_logger
def func():
    pass

if __name__ == '__main__':
    logger = get_logger(__name__)
    logger.start()

    for _ in range(2):
        logger.info('message')

    proc = Process(target=func)
    proc.start()
    proc.join(timeout=3)
    print('TEST PROCESS JOINED: is_alive={0}'.format(proc.is_alive()))

    logger.stop()
    print('EXIT')
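
A minimal sketch of what such a my_logging module might look like, assuming a QueueHandler that only enqueues records and a QueueListener whose _monitor thread writes them out (the class and method names follow Answer 2 below; everything else is an assumption, not the author's actual code):

import logging
import threading
from multiprocessing import Queue


class QueueHandler(logging.Handler):
    # Only enqueues the record; the listener thread does the real output.
    def __init__(self, queue):
        super().__init__()
        self.queue = queue

    def emit(self, record):
        self.queue.put_nowait(record)


class QueueListener(object):
    _sentinel = None  # pushed onto the queue to tell the monitor thread to exit

    def __init__(self, queue, handler):
        self.queue = queue
        self.handler = handler
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._monitor, daemon=True)
        self._thread.start()

    def _monitor(self):
        # Background thread: drain the queue and hand records to the real
        # handler (a StreamHandler writing to sys.stderr by default).
        while True:
            record = self.queue.get()
            if record is self._sentinel:
                break
            self.handler.handle(record)

    def stop(self):
        self.queue.put_nowait(self._sentinel)
        self._thread.join()


def get_logger(name):
    queue = Queue()
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(QueueHandler(queue))
    listener = QueueListener(queue, logging.StreamHandler())
    # Expose start/stop on the logger object so the test script can call them.
    logger.start = listener.start
    logger.stop = listener.stop
    return logger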
Sometimes this test script hangs. The script hangs while joining process "proc" (as the script finishes execution); the test process "proc" stays alive.
To reproduce this problem you can run the script in a loop:
$ for i in {1..100} ; do /opt/python3.5.3/bin/python3.5 test.py ; done
Investigation:
strace shows the following:
strace: Process 25273 attached
futex(0x2275550, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff
And I figured out the place where the process hangs: in the multiprocessing module, file process.py, line 269 (Python 3.5.3), on flushing STDERR:
...
267 util.info('process exiting with exitcode %d' % exitcode)
268 sys.stdout.flush()
269 sys.stderr.flush()
...
If line 269 is commented out, the script always completes successfully.
My thoughts:
By default logging.StreamHandler uses sys.stderr as its stream.
If the process is forked while the logger thread is flushing data to STDERR, the child inherits a lock that is already held (and no thread to release it), so it later hangs when it flushes STDERR itself.
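
This theory can be demonstrated with a small standalone sketch (my own illustration, not code from the original post): fork while another thread holds a lock, and the child deadlocks trying to acquire it, because the lock is copied in its locked state but the owning thread does not exist in the child.

import os
import threading
import time

lock = threading.Lock()  # stands in for the internal lock of a stream/handler


def hold_lock():
    with lock:
        time.sleep(1)    # keep the lock held while the fork happens


threading.Thread(target=hold_lock).start()
time.sleep(0.1)          # give the worker thread time to acquire the lock

pid = os.fork()          # POSIX only, like the fork done by multiprocessing on Linux
if pid == 0:
    # Child: the lock was inherited in the "held" state, but the thread that
    # held it was not copied, so this acquire blocks forever.
    lock.acquire()
    os._exit(0)
else:
    os.waitpid(pid, 0)   # the parent waits here for the deadlocked child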
Some workarounds which solve the problem:
- Use Python 2.7. I can't reproduce it with Python 2.7; maybe the different timing just prevents me from reproducing it.
- Use a process instead of a thread to handle messages in the logger.
Do you have any ideas on this behavior? Where is the problem? Am I doing something wrong?
Answer 1:
It looks like this behaviour is related to this issue: http://bugs.python.org/issue6721
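
(As an aside, not part of the original answer: that issue is about locks being inherited across fork(). One way to sidestep it, if it applies here, is to create the worker process with the 'spawn' start method, so the child starts from a fresh interpreter instead of a forked copy of the parent; this is a suggestion, not something the thread verified.)

import multiprocessing as mp


def func():
    pass


if __name__ == '__main__':
    ctx = mp.get_context('spawn')  # child is a fresh interpreter, no inherited locks
    proc = ctx.Process(target=func)
    proc.start()
    proc.join(timeout=3)
    print('joined, is_alive={0}'.format(proc.is_alive()))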
Answer 2:
Question: Sometimes ... the test process "proc" stays alive.

I could only reproduce your

TEST PROCESS:0 JOINED: is_alive=True

by adding a time.sleep(5) to def func().
You use proc.join(timeout=3), so that is the expected behavior: the join gives up after 3 seconds while the child is still running.

Conclusion:
Overloading your system (in my environment this starts with 30 processes running) triggers your proc.join(timeout=3). You may want to rethink your test case to reproduce your problem. One approach, I think, is fine-tuning your Process/Thread with some time.sleep(0.05) calls to give up a timeslice.
You are using from multiprocessing import Queue; use from queue import Queue instead, since the queue is only shared between threads of one process. From the documentation:

class multiprocessing.Queue
A queue class for use in a multi-processing (rather than multi-threading) context.
In class QueueHandler(logging.Handler), prevent self.queue.put_nowait(record) from being called after class QueueListener(object): ... def stop(self): ... has run. Implement, for instance:

class QueueHandler(logging.Handler):
    def __init__(self):
        self.stop = Event()
        ...
In def _monitor(self), use only ONE while ... loop.
Wait until self._thread has stopped:

class QueueListener(object):
    ...
    def stop(self):
        self.handler.stop.set()
        while not self.queue.empty():
            time.sleep(0.5)
        # Don't use double flags
        #self._stop.set()
        self.queue.put_nowait(self._sentinel)
        self._thread.join()
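
A possible way to wire the handler side to that stop event (the names follow the answer's snippets above; the emit logic itself is an assumption, not code from the answer):

import logging
from threading import Event


class QueueHandler(logging.Handler):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue
        self.stop = Event()      # set by QueueListener.stop()

    def emit(self, record):
        if self.stop.is_set():
            # The listener is shutting down; drop the record instead of
            # putting it on the queue after the sentinel.
            return
        self.queue.put_nowait(record)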
Source: https://stackoverflow.com/questions/44069717/empty-python-process-hangs-on-join-sys-stderr-flush