Question
I understand that threads in Python share the same instance of the Python interpreter. Is the same true of a process created by os.fork? Or does each process created by os.fork have its own interpreter?
Answer 1:
Whenever you fork, the entire Python process is duplicated in memory (including the Python interpreter, your code and any libraries, current stack etc.) to create a second process - one reason why forking a process is much more expensive than creating a thread.
This creates a new copy of the Python interpreter.
One advantage of having two Python interpreters running is that you now have two GILs (Global Interpreter Locks), and can therefore get true parallelism on a multi-core system.
Threads in one process share the same GIL, so in CPython only one of them executes Python bytecode at any given moment, which gives only the illusion of parallelism for CPU-bound work.
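A minimal sketch of the duplication described above, assuming a POSIX platform where os.fork is available. The function name fork_and_mutate and the dictionary it uses are made up for illustration. The child changes its own copy of the data and reports back over a pipe, while the parent's copy stays untouched.

```python
import os


def fork_and_mutate():
    """Fork once; the child mutates its copy of `data` and reports via a pipe."""
    data = {"owner": "parent"}
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # returns 0 in the child, the child's pid in the parent
    if pid == 0:
        try:
            os.close(read_fd)
            data["owner"] = "child"  # changes only the child's copy
            os.write(write_fd, f"{os.getpid()} {data['owner']}".encode())
        finally:
            os._exit(0)  # leave the child without running the parent's code
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pid, child_owner = reader.read().split()
    os.waitpid(pid, 0)  # reap the child
    return pid, int(child_pid), child_owner, data["owner"]


if __name__ == "__main__":
    print(fork_and_mutate())
```

The parent still sees "parent" in its dictionary because each process has its own interpreter state after the fork.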
Answer 2:
While fork does indeed create a copy of the current Python interpreter rather than running with the same one, it usually isn't what you want, at least not on its own. Among other problems:
- There can be problems forking multi-threaded processes on some platforms. And some libraries (most famously Apple's Cocoa/CoreFoundation) may start threads for you in the background, or use thread-local APIs even though you've only got one thread, etc., without your knowledge.
- Some libraries assume that every process will be initialized properly, but if you fork after initialization that isn't true. Most infamously, if you let ssl seed its PRNG in the main process, then fork, you now have potentially predictable random numbers, which is a big hole in your security.
- Open file descriptors are inherited (as dups) by the children, with details that vary in annoying ways between platforms.
- POSIX only requires platforms to implement a very specific set of syscalls between a fork and an exec. If you never call exec, you can only use those syscalls. Which basically means you can't do anything portably.
- Anything to do with signals is especially annoying and nonportable after fork.
See POSIX fork or your platform's manpage for details on these issues.
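As a rough illustration of the "state initialized before fork" problem from the list above, the sketch below uses a private random.Random instance as a stand-in for a library's internal PRNG. This is an assumption for demonstration only: it is not how ssl works internally, and the helper name draw_in_parent_and_child is hypothetical. Because the generator's state is copied by fork, parent and child draw the same "random" value.

```python
import os
import random


def draw_in_parent_and_child(seed=1234):
    """Seed a private PRNG, fork, then draw one number in each process."""
    rng = random.Random(seed)  # state lives in this process's memory
    read_fd, write_fd = os.pipe()  # both ends are inherited by the child
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            # The child holds an identical copy of the generator state.
            os.write(write_fd, repr(rng.random()).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    parent_value = rng.random()
    with os.fdopen(read_fd) as reader:
        child_value = float(reader.read())
    os.waitpid(pid, 0)
    return parent_value, child_value


if __name__ == "__main__":
    parent_value, child_value = draw_in_parent_and_child()
    print(parent_value == child_value)
```

The pipe in this sketch also shows descriptor inheritance: the child can write to write_fd only because it received its own copy of that descriptor.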
The right answer is almost always to use multiprocessing, or concurrent.futures (which wraps up multiprocessing), or a similar third-party library.
With Python 3.4+, you can even specify a start method. The fork method basically just calls fork. The forkserver method runs a single "clean" process (no threads, signal handlers, SSL initialization, etc.) and forks off new children from that. The spawn method calls fork then exec, or an equivalent like posix_spawn, to get you a brand-new interpreter instead of a copy. So you can start off with fork, but then if there are any problems, switch to forkserver or spawn and nothing else in your code has to change. Which is pretty nice.
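Switching start methods might look like the sketch below, which uses multiprocessing.get_context. The helper name sqrt_in_workers is hypothetical, and math.sqrt is used as the worker function only because a standard-library function can be pickled no matter how the script is launched. The "fork" method is assumed to be available, which is not the case on Windows.

```python
import math
import multiprocessing as mp
import os


def sqrt_in_workers(values, method="fork"):
    """Compute square roots in a 2-worker pool created with `method`."""
    ctx = mp.get_context(method)  # "fork", "forkserver", or "spawn"
    with ctx.Pool(processes=2) as pool:
        worker_pid = pool.apply(os.getpid)  # runs in a worker process
        roots = pool.map(math.sqrt, values)
    return worker_pid, roots


if __name__ == "__main__":
    # The guard matters for "spawn" and "forkserver", whose children
    # may re-import this module while starting up.
    print("available:", mp.get_all_start_methods())
    pid, roots = sqrt_in_workers([1.0, 4.0, 9.0], method="fork")
    print("parent", os.getpid(), "worker", pid, roots)
```

Changing the method string is the only edit needed to try "forkserver" or "spawn" on platforms that support them.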
Answer 3:
os.fork() is equivalent to the fork() syscall in many UNIX(es). So yes, your sub-process(es) will be separate from the parent and have a different interpreter (as such).
man fork:
FORK(2)
NAME
    fork - create a child process
SYNOPSIS
    #include <unistd.h>
    pid_t fork(void);
DESCRIPTION
    fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points:
pydoc os.fork():
os.fork()
Fork a child process. Return 0 in the child and the child’s process id in the parent. If an error occurs OSError is raised. Note that some platforms including FreeBSD <= 6.3, Cygwin and OS/2 EMX have known issues when using fork() from a thread.
See also: Martin Konecny's response as to the why's and advantages of "forking" :)
For completeness, other approaches to concurrency that don't involve a separate process, and therefore a separate Python interpreter, include:
- Green or lightweight threads, à la greenlet
- Coroutines, à la Python generators and the Python 3.3+ yield from
- Async I/O, à la asyncio, Twisted, circuits, etc.
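As a small sketch of the single-process approaches listed above, the asyncio example below runs two coroutines concurrently and records the process id each one sees. The names report and run_demo are made up for illustration. Both coroutines report the same pid, since no new process or interpreter is created.

```python
import asyncio
import os


async def report(name, delay):
    """Wait without blocking the event loop, then report the current pid."""
    await asyncio.sleep(delay)
    return name, os.getpid()


async def _gather_reports():
    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(report("a", 0.02), report("b", 0.01))


def run_demo():
    return asyncio.run(_gather_reports())


if __name__ == "__main__":
    print(run_demo())
```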
Source: https://stackoverflow.com/questions/30157895/does-python-os-fork-uses-the-same-python-interpreter