Why does importing module in '__main__' not allow multiprocessig to use module?

帅比萌擦擦* 提交于 2019-12-05 19:45:55

The situation is different in unix-like systems and Windows. On the unixy systems, multiprocessing uses fork to create child processes that share a copy-on-write view of the parent memory space. The child sees the imports from the parent, including anything the parent imported under if __name__ == "__main__":.

On windows, there is no fork, a new process has to be executed. But simply rerunning the parent process doesn't work - it would run the whole program again. Instead, multiprocessing runs its own python program that imports the parent main script and then pickles/unpickles a view of the parent object space that is, hopefully, sufficient for the child process.

That program is the __main__ for the child process and the __main__ of the parent script doesn't run. The main script was just imported like any other module. The reason is simple: running the parent __main__ would just run the full parent program again, which mp must avoid.

Here is a test to show what is going on. A main module called testmp.py and a second module test2.py that is imported by the first.

testmp.py

import os
import multiprocessing as mp

print("importing test2")
import test2

def worker():
    print('worker pid: {}, module name: {}, file name: {}'.format(os.getpid(), 
        __name__, __file__))

if __name__ == "__main__":
    print('main pid: {}, module name: {}, file name: {}'.format(os.getpid(), 
        __name__, __file__))
    print("running process")
    proc = mp.Process(target=worker)
    proc.start()
    proc.join()

test2.py

import os

print('test2 pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))

When run on Linux, test2 is imported once and the worker runs in the main module.

importing test2
test2 pid: 17840, module name: test2, file name: /media/td/USB20FD/tmp/test2.py
main pid: 17840, module name: __main__, file name: testmp.py
running process
worker pid: 17841, module name: __main__, file name: testmp.py

Under windows, notice that "importing test2" is printed twice - testmp.py was run two times. But "main pid" was only printed once - its __main__ wasn't run. That's because multiprocessing changed the module name to __mp_main__ during import.

E:\tmp>py testmp.py
importing test2
test2 pid: 7536, module name: test2, file name: E:\tmp\test2.py
main pid: 7536, module name: __main__, file name: testmp.py
running process
importing test2
test2 pid: 7544, module name: test2, file name: E:\tmp\test2.py
worker pid: 7544, module name: __mp_main__, file name: E:\tmp\testmp.py
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!