Question
Calling tzset() after forking appears to be very slow. I only see the slowness if I first call tzset() in the parent process before forking. My TZ environment variable is not set. I dtruss'd my test program and it revealed the child process reads /etc/localtime for every tzset() invocation, while the parent process only reads it once. This file access seems to be the source of the slowness, but I wasn't able to determine why it's accessing it every time in the child process.
Here is my test program foo.c:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>

void check(char *msg);

int main(int argc, char **argv) {
    check("before");

    pid_t c = fork();
    if (c == 0) {
        check("fork");
        exit(0);
    }

    wait(NULL);
    check("after");
}

/* Time 10000 tzset() calls and print the elapsed wall-clock time. */
void check(char *msg) {
    struct timeval tv;

    gettimeofday(&tv, NULL);
    time_t start = tv.tv_sec;
    suseconds_t mstart = tv.tv_usec;

    for (int i = 0; i < 10000; i++) {
        tzset();
    }

    gettimeofday(&tv, NULL);
    double delta = (double)(tv.tv_sec - start);
    delta += (double)(tv.tv_usec - mstart) / 1000000.0;

    printf("%s took: %fs\n", msg, delta);
}
I compiled and executed foo.c like this:
[muir@muir-work-mb scratch]$ clang -o foo foo.c
[muir@muir-work-mb scratch]$ env -i ./foo
before took: 0.002135s
fork took: 1.122254s
after took: 0.001120s
I'm running Mac OS X 10.10.1 (also reproduced on 10.9.5).
I originally noticed the slowness via Ruby (Time#localtime slow in child process).
Answer 1:
Ken Thomases's response may be correct, but I was curious about a more specific answer because I still find the slowness unexpected for a single-threaded program performing such a simple, common operation after forking. After examining http://opensource.apple.com/source/Libc/Libc-997.1.1/stdtime/FreeBSD/localtime.c (not 100% sure this is the correct source), I think I have an answer.
The code uses passive notifications to determine whether the time zone has changed (as opposed to calling stat() on /etc/localtime every time). It appears that the registered notification token becomes invalid in the child process after forking. Furthermore, the code treats the error from using an invalid token as a positive notification that the time zone has changed, and proceeds to read /etc/localtime every time. I guess this is the kind of undefined behavior you can get after forking? It would be nice if the library noticed the error and re-registered for the notification, though.
Here is the snippet of code from localtime.c that mixes the error value with the status value:
nstat = notify_check(p->token, &ncheck);
if (nstat || ncheck) {
I demonstrated that the registration token becomes invalid after fork using this program:
#include <notify.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork() */

void bail(char *msg) {
    printf("Error: %s\n", msg);
    exit(1);
}

int main(int argc, char **argv) {
    int token, something_changed, ret;

    notify_register_check("com.apple.system.timezone", &token);

    /* The first check after registration always reports a change. */
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #1 failed");
    if (!something_changed)
        bail("expected change on first call");

    /* The second check in the parent should report no change. */
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #2 failed");
    if (something_changed)
        bail("expected no change");

    pid_t c = fork();
    if (c == 0) {
        /* Same token, now used from the child process. */
        ret = notify_check(token, &something_changed);
        if (ret) {
            if (ret == NOTIFY_STATUS_INVALID_TOKEN)
                printf("ret is invalid token\n");
            if (!notify_is_valid_token(token))
                printf("token is not valid\n");
            bail("notify_check in fork failed");
        }
        if (something_changed)
            bail("expected not changed");
        exit(0);
    }

    wait(NULL);
}
And ran it like this:
muir-mb:projects muir$ clang -o notify_test notify_test.c
muir-mb:projects muir$ ./notify_test
ret is invalid token
token is not valid
Error: notify_check in fork failed
Answer 2:
You're lucky you didn't experience nasal demons!
POSIX states that only async-signal-safe functions are legal to call in the child process after the fork() and before a call to an exec*() function. From the standard (emphasis added):
… the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. …
There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.
There are lists of async-signal-safe functions here and here. For any other function, if it's not specifically documented that the implementations on the platforms to which you're deploying add a non-standard safety guarantee, then you must consider it unsafe and its behavior on the child side of a fork() to be undefined.
Source: https://stackoverflow.com/questions/27932330/why-is-tzset-a-lot-slower-after-forking-on-mac-os-x