Difference in behavior between os.fork and multiprocessing.Process

小蘑菇 2021-02-09 02:43

I have this code:

import os

pid = os.fork()

if pid == 0:
    os.environ['HOME'] = "rep1"
    external_function()
else:
    os.environ['HOME'] = "rep2"


        
2 Answers
  • 2021-02-09 03:31

    To answer your question directly: external_function must have some side effect that makes running the two calls in series produce different results from running them at the same time. The difference comes from how your code is set up, not from os.fork versus multiprocessing.Process, which behave essentially the same on systems where os.fork is supported.


    The only real differences between os.fork and multiprocessing.Process are portability and library overhead: os.fork is not supported on Windows, and multiprocessing.Process pulls in the rest of the multiprocessing framework. On Unix, when the fork start method is in use, multiprocessing.Process calls os.fork itself, as this answer backs up.

    The important distinction, then, is that os.fork copies everything in the current process using Unix forking, so at the moment of the fork both processes are identical apart from their PIDs. On Windows, this is emulated by starting a new interpreter that re-imports the main module, which reruns all top-level code not guarded by if __name__ == '__main__':. That is roughly the same as creating a subprocess with the subprocess library.
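
    As a minimal sketch of that copy semantics (Unix only): the helper name home_in_child_and_parent is hypothetical, and the pipe exists only so the child can report what it saw. Each process changes its own copy of os.environ, and neither change reaches the other.

```python
import os


def home_in_child_and_parent():
    """Fork once and return (HOME seen by the child, HOME seen by the parent)."""
    saved_home = os.environ.get("HOME")
    read_fd, write_fd = os.pipe()  # only used to report the child's value back
    try:
        pid = os.fork()  # Unix only: os.fork does not exist on Windows
        if pid == 0:
            # Child: works on its own copy of the environment.
            os.close(read_fd)
            os.environ["HOME"] = "rep1"
            os.write(write_fd, os.environ["HOME"].encode())
            os._exit(0)  # exit at once, skipping the parent's cleanup code

        # Parent: this assignment is invisible to the child, and vice versa.
        os.close(write_fd)
        os.environ["HOME"] = "rep2"
        with os.fdopen(read_fd) as reader:
            child_home = reader.read()  # returns once the child has exited
        os.waitpid(pid, 0)
        return child_home, os.environ["HOME"]
    finally:
        # Restore the caller's HOME so the demo leaves no trace.
        if saved_home is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = saved_home


if __name__ == "__main__":
    print(home_in_child_and_parent())
```

    The child sees "rep1" and the parent sees "rep2", regardless of which process runs first.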

    The code snippets you provide do fairly different things, because in the second snippet you call external_function in the main process before you start the new process, so the two calls run in series, though in different processes. The pipe is also unnecessary, since nothing in the first snippet corresponds to it.

    On Unix, the code snippets:

    import os
    
    pid = os.fork()
    
    if pid == 0:
        os.environ['HOME'] = "rep1"
        external_function()
    else:
        os.environ['HOME'] = "rep2"
        external_function()
    

    and:

    import os
    from multiprocessing import Process
    
    def f():
        os.environ['HOME'] = "rep1"
        external_function()
    
    if __name__ == '__main__':
        p = Process(target=f)
        p.start()
        os.environ['HOME'] = "rep2"
        external_function()
        p.join()
    

    should do exactly the same thing, with a little extra overhead from importing the multiprocessing library.
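
    To check this on your own machine, here is a runnable sketch of the multiprocessing variant. It makes two assumptions that are not in the question: external_function is replaced by a stand-in that simply returns the HOME it sees, and the fork start method is requested explicitly through get_context so the behavior does not depend on the platform default. The names child and run_with_process are likewise only illustrative.

```python
import multiprocessing as mp
import os


def external_function():
    """Hypothetical stand-in for the asker's function: report the HOME it sees."""
    return os.environ["HOME"]


def child(queue):
    # Runs in the new process, on that process's own copy of os.environ.
    os.environ["HOME"] = "rep1"
    queue.put(external_function())


def run_with_process():
    """Return (value seen in the child, value seen in the parent)."""
    ctx = mp.get_context("fork")  # assumption: Unix, fork start method
    queue = ctx.SimpleQueue()
    saved_home = os.environ.get("HOME")
    try:
        p = ctx.Process(target=child, args=(queue,))
        p.start()
        os.environ["HOME"] = "rep2"  # set after the fork, parent only
        parent_value = external_function()
        child_value = queue.get()  # blocks until the child has reported
        p.join()
        return child_value, parent_value
    finally:
        # Restore the caller's HOME so the demo leaves no trace.
        if saved_home is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = saved_home


if __name__ == "__main__":
    print(run_with_process())
```

    With this stand-in, the child reports "rep1" and the parent "rep2", matching what the os.fork version produces. If your real external_function gives different results, the cause is inside that function rather than in the choice between os.fork and Process.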


    Without further information, we can't figure out what the issue is. If you can provide code that demonstrates the issue, that would help us help you.

  • 2021-02-09 03:37

    The answer you are looking for is addressed in detail here, along with an explanation of how the behavior differs between operating systems.

    One big issue is that the fork system call does not exist on Windows, so you cannot use os.fork there. multiprocessing is a higher-level interface for executing part of the currently running program in another process. Where fork is available, it can create a copy of your process's current state, just as forking does; that is to say, it takes care of forking your program for you.

    So, where it is available, you can think of os.fork() as the lower-level interface to forking a program and the multiprocessing library as the higher-level one.
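
    Whether multiprocessing actually forks depends on its start method, which varies by platform and Python version. A small sketch for inspecting this on your own system follows; the helper name describe_start_methods is hypothetical, while get_start_method and get_all_start_methods are part of the standard library.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return the default start method and every method this platform supports."""
    return {
        # Note: calling get_start_method() fixes the default for this process.
        "default": mp.get_start_method(),
        "available": mp.get_all_start_methods(),
    }


if __name__ == "__main__":
    info = describe_start_methods()
    print("default start method:", info["default"])
    print("available start methods:", info["available"])
```

    If "fork" is not the default, mp.get_context("fork") selects it explicitly on platforms that support it. On Windows only "spawn" is available.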
