python multiprocessing logging: QueueHandler with RotatingFileHandler “file being used by another process” error

Submitted by 余生长醉 on 2020-01-14 04:22:31

Question


I'm converting a program to multiprocessing and need to be able to log to a single rotating log file from the main process as well as the subprocesses. I'm trying to use the second example in the Python Logging Cookbook, "Logging to a single file from multiple processes", which starts a logger_thread running as part of the main process, picking log messages off a queue that the subprocesses feed via a QueueHandler. The example works well as is, and also works if I switch to a RotatingFileHandler.
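
For context, the queue-draining parts of that cookbook example look roughly like this (paraphrased from the linked docs; the config dict d and the main block are elided):

import logging
import logging.handlers

def worker_process(q):
    # each subprocess routes all of its records into the shared queue
    qh = logging.handlers.QueueHandler(q)  # backported on 2.7, see below
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(qh)
    logging.getLogger('worker').info('doing some work')

def logger_thread(q):
    # runs inside the main process, draining the queue until a None sentinel
    while True:
        record = q.get()
        if record is None:
            break
        logger = logging.getLogger(record.name)
        logger.handle(record)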

However, if I change it to start logger_thread before the subprocesses (so that I can log from the main process as well), then as soon as the log rotates, all subsequent logging generates a traceback with WindowsError: [Error 32] The process cannot access the file because it is being used by another process.

In other words, I change this code from the second example:

workers = []
for i in range(5):
    wp = Process(target=worker_process, name='worker %d' % (i + 1), args=(q,))
    workers.append(wp)
    wp.start()
logging.config.dictConfig(d)
lp = threading.Thread(target=logger_thread, args=(q,))
lp.start()

to this:

logging.config.dictConfig(d)
lp = threading.Thread(target=logger_thread, args=(q,))
lp.start()
workers = []
for i in range(5):
    wp = Process(target=worker_process, name='worker %d' % (i + 1), args=(q,))
    workers.append(wp)
    wp.start()

and swap out logging.FileHandler for logging.handlers.RotatingFileHandler (with a very small maxBytes for testing), and then I hit this error.
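
For concreteness, the handler entry in my config dict d now looks something like this (a minimal stand-in for the cookbook's config; the file name and sizes are just test values):

d = {
    'version': 1,
    'handlers': {
        'file': {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'mplog.log',
            'maxBytes': 1024,     # deliberately tiny so rotation triggers quickly
            'backupCount': 3,
        },
    },
    'root': {
        'level': 'DEBUG',
        'handlers': ['file'],
    },
}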

I'm using Windows and Python 2.7. QueueHandler is not part of the stdlib until Python 3.2, but I've copied its source code from a Gist, which says that's safe to do.
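
For reference, the backport is small; it is essentially the 3.x stdlib code:

class QueueHandler(logging.Handler):
    """Sends log records to a queue, e.g. a multiprocessing.Queue."""
    def __init__(self, queue):
        logging.Handler.__init__(self)
        self.queue = queue

    def prepare(self, record):
        # Format now so the record can be pickled safely across processes
        # (args and exc_info may not be picklable).
        self.format(record)
        record.msg = record.message
        record.args = None
        record.exc_info = None
        return record

    def emit(self, record):
        try:
            self.queue.put_nowait(self.prepare(record))
        except Exception:
            self.handleError(record)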

I don't understand why starting the listener first would make any difference, nor do I understand why any process other than the main one would be attempting to access the file.


Answer 1:


You should never start any threads before subprocesses. When Python forks, the threads and IPC state will not always be copied properly.

There are several resources on this; just search for "fork and threads". Some people claim they can manage it, but it's not clear to me that it can ever work properly.

Just start all your processes first.

Some additional reading:

Status of mixing multiprocessing and threading in Python

https://stackoverflow.com/a/6079669/4279

In your case, it might be that the copied open file handle is the problem, but you should still start your subprocesses before your threads (and before opening any files that you will later want to delete).

Some rules of thumb, summarized by fantabolous from the comments:

  • Subprocesses must always be started before any threads created by the same process.

  • multiprocessing.Pool creates both subprocesses AND threads, so one mustn't create additional Processes or Pools after the first one.

  • Files should not already be open at the time a Process or Pool is created. (This is OK in some cases, but not, e.g. if a file will be deleted later.)

  • Subprocesses can create their own threads and processes, with the same rules above applying.

  • Starting all processes first is the easiest way to do this (a short sketch follows).
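
A minimal sketch of that ordering, applied to the code in the question (q, d, worker_process and logger_thread as defined there):

workers = []
for i in range(5):
    wp = Process(target=worker_process, name='worker %d' % (i + 1), args=(q,))
    workers.append(wp)
    wp.start()

# Only now configure logging and start the listener thread; the main
# process is free to log from this point on.
logging.config.dictConfig(d)
lp = threading.Thread(target=logger_thread, args=(q,))
lp.start()
logging.getLogger('main').info('main process logging works here')

for wp in workers:
    wp.join()
q.put(None)  # sentinel tells logger_thread to exit
lp.join()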




Answer 2:


So, you can simply make your own file log handler. I have yet to see logs getting garbled by multiprocessing, so file log rotation seems to be the big issue. Just do the following in your main module, and you don't have to change the rest of your logging setup:

import logging
import logging.handlers
import os
from multiprocessing import RLock

class MultiprocessRotatingFileHandler(logging.handlers.RotatingFileHandler):
    def __init__(self, *args, **kwargs):
        super(MultiprocessRotatingFileHandler, self).__init__(*args, **kwargs)
        # Replace the handler's threading lock with a multiprocessing RLock,
        # so emit() and rollover checks are serialized across processes too
        # (logging.Handler uses self.lock in acquire()/release()).
        self.lock = RLock()

    def shouldRollover(self, record):
        with self.lock:
            # return the result so callers actually see whether a rollover is due
            return super(MultiprocessRotatingFileHandler, self).shouldRollover(record)

file_log_path = os.path.join('var', 'log', os.path.basename(__file__) + '.log')
file_log = MultiprocessRotatingFileHandler(file_log_path,
                                           maxBytes=8 * 1000 * 1024,
                                           backupCount=5,
                                           delay=True)

logging.basicConfig(level=logging.DEBUG)
logging.getLogger().addHandler(file_log)  # attach to the root logger

I'm willing to guess that taking the lock on every rollover check probably slows logging down, but this is a case where we need to sacrifice performance for correctness.
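
If you want to sanity-check the handler, something like this exercises it from several processes (a hypothetical demo; it assumes a fork start method, as on Linux/macOS, where children inherit the configured root logger, whereas on Windows each child would need to run the setup code itself):

from multiprocessing import Process

def chatter(n):
    # each process writes enough to force several rollovers
    for i in range(1000):
        logging.getLogger('worker%d' % n).debug('message %d from worker %d', i, n)

if __name__ == '__main__':
    procs = [Process(target=chatter, args=(n,)) for n in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()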



Source: https://stackoverflow.com/questions/32099378/python-multiprocessing-logging-queuehandler-with-rotatingfilehandler-file-bein
