What happens if I log into the same file from multiple different processes in python?

2021-02-15 11:36

I spent hours digging into this behavior, starting with these questions:

  • Atomicity of `write(2)` to a local filesystem
  • How can I synchronize -- make atomic -- writes to one file from multiple processes?
3 Answers
  • 2021-02-15 11:56

    I tried similar code like this (in Python 3):

    import threading

    # function_to_call_logger is the logging helper being tested (definition omitted)
    for i in range(100000):
        t1 = threading.Thread(target=function_to_call_logger, args=(i,))
        t1.start()
    

    This worked completely fine for me; a similar issue is addressed here.

    This took a lot of CPU time but not much memory.

    EDIT:
    "Fine" means that everything requested was logged, but the order was not preserved. Hence the race condition is still not fixed.

  • 2021-02-15 12:05

    I would not rely on tests here. Weird things only happen in race conditions, and exhibiting a race condition by testing makes little sense because the race is unlikely to occur. So it can work nicely for 1000 test runs and then randomly break later in production... The page you cite says:

    logging to a single file from multiple processes is not supported, because there is no standard way to serialize access to a single file across multiple processes in Python

    That does not mean that it will break... it could even be safe in a particular implementation on a particular file system. It just means that it can break, without any hope of a fix, on any other version of Python or on any other filesystem.

    If you really want to be sure, you will have to dive into the Python source code (for your version) to check how logging is actually implemented, and whether it is safe on your file system. And you will always be threatened by the possibility that a later optimization in the logging module breaks your assumptions.

    IMHO that is the reason for the warning in the Logging Cookbook, and for the existence of a special module that allows concurrent logging to the same file. That module does not rely on anything unspecified; it just uses explicit locking.
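
    To make the "explicit locking" idea concrete, here is a minimal, Unix-only sketch. It is my own illustration, not the implementation of any particular module; the handler name and file names are invented for the example. It takes an advisory flock(2) on the log file around each record:

    import fcntl
    import logging

    class LockedFileHandler(logging.FileHandler):
        # Hypothetical handler: serialize emits across processes with flock(2).
        def emit(self, record):
            # Take an exclusive advisory lock on the log file so only one
            # process at a time formats, writes, and flushes a record.
            with open(self.baseFilename, "a") as lock_fp:
                fcntl.flock(lock_fp, fcntl.LOCK_EX)
                try:
                    super().emit(record)
                    self.flush()
                finally:
                    fcntl.flock(lock_fp, fcntl.LOCK_UN)

    logger = logging.getLogger("demo")
    logger.addHandler(LockedFileHandler("shared.log"))
    logger.setLevel(logging.INFO)
    logger.info("safe to call from several processes")

    Each process pays for an extra open() and a lock per record, which is exactly the kind of explicit serialization the standard handlers do not promise.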

  • 2021-02-15 12:12

    I dug deeper and deeper. Now I think these facts are clear:

    1. With O_APPEND, parallel write(2) calls from multiple processes are OK. The order of lines is undetermined, but lines don't interleave or overwrite each other. And this holds for data of any size, according to Niall Douglas's answer to Understanding concurrent file writes from multiple processes. I have tested this "any size" claim on Linux and have not found an upper limit, so I guess it's right. (See the sketch right after this list.)

    2. Without O_APPEND, it will be a mess. Here is what POSIX says: "This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control."

    3. Now we come to Python. The test I did in EDIT3, the 8K boundary: I found its origin. Python's write() in fact goes through fwrite(3), and my Python sets a BUFF_SIZE here of 8192, according to an answer from abarnert in Default buffer size for a file on Linux. This 8192 has a long story. (A small check is sketched at the end of this answer.)
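
    To see point 1 in action, here is a small sketch (the file name and counts are arbitrary): append with os.write() on an O_APPEND descriptor, so each line is one write(2) and lines from concurrent processes cannot interleave within a line, only reorder.

    import os

    fd = os.open("append_test.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    pid = os.getpid()
    for i in range(1000):
        # One write(2) per line; O_APPEND makes the seek-to-end and the write atomic.
        os.write(fd, f"pid={pid} line={i}\n".encode())
    os.close(fd)

    Run the script from several shells at once; every line in the result should come out intact, with only the order mixed.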

    However, more information is welcome.
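
    As for the 8192 in point 3: it matches io.DEFAULT_BUFFER_SIZE on the CPython builds this question concerns, and you can sidestep the buffer splitting large payloads at that boundary by opening the file unbuffered in binary mode, so each .write() goes straight to one write(2) call. A sketch (the file name is my own):

    import io

    print(io.DEFAULT_BUFFER_SIZE)    # typically 8192

    payload = b"x" * 20000 + b"\n"   # bigger than the default buffer
    # buffering=0 gives a raw FileIO object: .write() issues a single write(2)
    # instead of letting BufferedWriter flush in buffer-sized chunks.
    with open("raw_append.log", "ab", buffering=0) as f:
        f.write(payload)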
