Does Python's os.fork use the same Python interpreter?

死守一世寂寞 2021-02-05 17:17

I understand that threads in Python use the same instance of the Python interpreter. My question: is it the same with a process created by os.fork? Or does each process cr

3 Answers
  • 2021-02-05 17:56

    os.fork() is equivalent to the fork() syscall on many Unix systems. So yes, your subprocess(es) will be separate from the parent and have a different interpreter (as such).

    man fork:

    FORK(2)

    NAME
        fork - create a child process

    SYNOPSIS
        #include <unistd.h>

        pid_t fork(void);

    DESCRIPTION
        fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points:

    pydoc os.fork():

    os.fork() Fork a child process. Return 0 in the child and the child’s process id in the parent. If an error occurs OSError is raised.

    Note that some platforms including FreeBSD <= 6.3, Cygwin and OS/2 EMX have known issues when using fork() from a thread.
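    The return-value convention quoted above can be sketched as follows. This is only an illustration, assuming a POSIX platform where os.fork is available; the helper name fork_and_report and the pipe used to report back are my own choices, not part of the documentation. The child changes its copy of a list, and the parent's copy stays untouched, which is what "a different interpreter" means in practice.

```python
import os


def fork_and_report():
    """Fork once and return what the parent and the child each observed.

    Hypothetical helper for illustration; POSIX only.
    """
    counter = [0]                      # created before the fork
    read_fd, write_fd = os.pipe()      # lets the child report back
    parent_pid = os.getpid()

    pid = os.fork()
    if pid == 0:
        # Child: os.fork() returned 0. It mutates its own copy of `counter`.
        try:
            os.close(read_fd)
            counter[0] += 1
            os.write(write_fd, f"{os.getpid()} {counter[0]}".encode())
            os.close(write_fd)
        finally:
            os._exit(0)                # never fall through into parent code

    # Parent: os.fork() returned the child's process id.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()           # returns once the child closes its end
    os.waitpid(pid, 0)                 # reap the child
    reported_pid, child_counter = (int(x) for x in data.split())
    return parent_pid, pid, reported_pid, child_counter, counter[0]
```

    Calling fork_and_report() in the parent returns the parent's pid, the pid that os.fork() handed back, the pid the child saw for itself, and the two counter values.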

    See also: Martin Konecny's response as to the why's and advantages of "forking" :)

    Briefly, other approaches to concurrency that don't involve a separate process, and therefore don't involve a separate Python interpreter, include:

    • Green or lightweight threads, à la greenlet
    • Coroutines, à la Python generators and yield from (Python 3.3+)
    • Async I/O, à la asyncio, Twisted, circuits, etc.
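    As a rough sketch of the last item, here are several asyncio tasks running concurrently inside one process and one interpreter. The names worker, main and run_demo are made up for this example; only asyncio and os come from the standard library.

```python
import asyncio
import os


async def worker(name, results):
    await asyncio.sleep(0)             # hand control back to the event loop
    results.append((name, os.getpid()))


async def main():
    results = []                       # plain list shared by every task
    await asyncio.gather(*(worker(i, results) for i in range(3)))
    return results


def run_demo():
    return asyncio.run(main())
```

    Every tuple returned by run_demo() carries the same pid as the caller, because no new process (and no new interpreter) is ever created.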
  • 2021-02-05 18:00

    While fork does indeed create a copy of the current Python interpreter rather than running with the same one, it usually isn't what you want, at least not on its own. Among other problems:

    • There can be problems forking multi-threaded processes on some platforms. And some libraries (most famously Apple's Cocoa/CoreFoundation) may start threads for you in the background, or use thread-local APIs even though you've only got one thread, etc., without your knowledge.
    • Some libraries assume that every process will be initialized properly, but if you fork after initialization that isn't true. Most infamously, if you let ssl seed its PRNG in the main process, then fork, you now have potentially predictable random numbers, which is a big hole in your security.
    • Open file descriptors are inherited (as dups) by the children, with details that vary in annoying ways between platforms.
    • For a multi-threaded parent, POSIX only guarantees that a very specific set of (async-signal-safe) functions work between a fork and an exec. If you never call exec, you are limited to those functions, which basically means you can't do much portably.
    • Anything to do with signals is especially annoying and nonportable after fork.

    See POSIX fork or your platform's manpage for details on these issues.
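    The "initialized before fork" problem can be illustrated with a deliberately simple stand-in for the ssl case: a private random.Random instance created before the fork. This is a sketch under assumptions, not a reproduction of the ssl issue. It assumes POSIX, and the function name next_values_after_fork is hypothetical. A private instance is used because, as far as I know, recent CPython versions reseed the module-level generator in the child, while an instance you create yourself is just copied state.

```python
import os
import random


def next_values_after_fork():
    """Return the next random value drawn by the parent and by the child."""
    rng = random.Random()              # seeded from OS entropy, before the fork
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: draws from its copy of the generator state.
        try:
            os.close(read_fd)
            os.write(write_fd, repr(rng.random()).encode())
            os.close(write_fd)
        finally:
            os._exit(0)

    # Parent: draws from the original generator state.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_value = float(reader.read().decode())
    os.waitpid(pid, 0)
    parent_value = rng.random()
    return parent_value, child_value
```

    Both values come out identical, since the child inherited an exact copy of the generator's internal state.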

    The right answer is almost always to use multiprocessing, or concurrent.futures (which wraps up multiprocessing), or a similar third-party library.

    With Python 3.4+, you can even specify a start method. The fork method basically just calls fork. The forkserver method runs a single "clean" process (no threads, signal handlers, SSL initialization, etc.) and forks off new children from that. The spawn method calls fork then exec, or an equivalent like posix_spawn, to get you a brand-new interpreter instead of a copy. So you can start off with fork, but then if there are any problems, switch to forkserver or spawn and nothing else in your code has to change. Which is pretty nice.
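    A minimal sketch of picking a start method, using multiprocessing.get_context. The function name worker_pid is made up for the example. It runs os.getpid in a single pool worker so you can see that the work happened in another process.

```python
import multiprocessing
import os


def worker_pid(start_method):
    """Run os.getpid() in one worker created with the given start method."""
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(processes=1) as pool:
        return pool.apply(os.getpid)


if __name__ == "__main__":
    # "fork" is POSIX-only; change the string to "forkserver" or "spawn"
    # to switch methods without touching the rest of the code.
    print(worker_pid("fork") != os.getpid())
```

    The `if __name__ == "__main__":` guard matters once you move to spawn or forkserver, because those methods import the main module again in the child.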

  • 2021-02-05 18:04

    Whenever you fork, the entire Python process is duplicated in memory (including the Python interpreter, your code and any libraries, current stack etc.) to create a second process - one reason why forking a process is much more expensive than creating a thread.

    This creates a new copy of the python interpreter.

    One advantage of having two Python interpreters running is that you now have two GILs (Global Interpreter Locks), and therefore can have true parallelism on a multi-core system.

    Threads in one process share the same GIL, meaning only one runs at a given moment, giving only the illusion of parallelism.
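    For contrast with forking, here is a small sketch of what "threads share the same interpreter" looks like. The helper name thread_view is hypothetical. The worker thread appends to a list owned by the calling thread, and it reports the same pid.

```python
import os
import threading


def thread_view():
    """Start one thread and return what it recorded plus the caller's ids."""
    shared = []                        # one object, visible to every thread

    def work():
        shared.append((os.getpid(), threading.get_ident()))

    t = threading.Thread(target=work)
    t.start()
    t.join()
    return shared, os.getpid(), threading.get_ident()
```

    The entry written by the thread shows up in the caller's list with the caller's pid but a different thread id: same process, same interpreter, same GIL.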
