Why is tzset() a lot slower after forking on Mac OS X?

问题

Calling tzset() after forking appears to be very slow. I only see the slowness if I first call tzset() in the parent process before forking. My TZ environment variable is not set. I dtruss'd my test program and it revealed the child process reads /etc/localtime for every tzset() invocation, while the parent process only reads it once. This file access seems to be the source of the slowness, but I wasn't able to determine why it's accessing it every time in the child process.

Here is my test program foo.c:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

void check(char *msg);

int main(int argc, char **argv) {
  check("before");

  pid_t c = fork();
  if (c == 0) {
    check("fork");
    exit(0);
  }

  wait(NULL);

  check("after");
}

void check(char *msg) {
  struct timeval tv;

  gettimeofday(&tv, NULL);
  time_t start = tv.tv_sec;
  suseconds_t mstart = tv.tv_usec;

  for (int i = 0; i < 10000; i++) {
    tzset();
  }

  gettimeofday(&tv, NULL);
  double delta = (double)(tv.tv_sec - start);
  delta += (double)(tv.tv_usec - mstart)/1000000.0;

  printf("%s took: %fs\n", msg, delta);
}

I compiled and executed foo.c like this:

[muir@muir-work-mb scratch]$ clang -o foo foo.c
[muir@muir-work-mb scratch]$ env -i ./foo
before took: 0.002135s
fork took: 1.122254s
after took: 0.001120s

I'm running Mac OS X 10.10.1 (also reproduced on 10.9.5).

I originally noticed the slowness via ruby (Time#localtime slow in child process).

回答1:

Ken Thomases's response may be correct, but I was curious about a more specific answer because I still find the slowness unexpected behavior for a single-threaded program performing such a simple/common operation after forking. After examining http://opensource.apple.com/source/Libc/Libc-997.1.1/stdtime/FreeBSD/localtime.c (not 100% sure this is the correct source), I think I have an answer.

The code uses passive notifications to determine if the time zone has changed (as opposed to stating /etc/localtime every time). It appears that the registered notification token becomes invalid in the child process after forking. Furthermore, the code treats the error from using an invalid token as a positive notification that the timezone has changed, and proceeds to read /etc/localtime every time. I guess this is the kind of undefined behavior you can get after forking? It would be nice if the library noticed the error and re-registered for the notification, though.

Here is the snippet of code from localtime.c that mixes the error value with the status value:

nstat = notify_check(p->token, &ncheck);
if (nstat || ncheck) {

I demonstrated that the registration token becomes invalid after fork using this program:

#include <notify.h>
#include <stdio.h>
#include <stdlib.h>

void bail(char *msg) {
  printf("Error: %s\n", msg);
  exit(1);
}

int main(int argc, char **argv) {
  int token, something_changed, ret;

  notify_register_check("com.apple.system.timezone", &token);

  ret = notify_check(token, &something_changed);
  if (ret)
    bail("notify_check #1 failed");
  if (!something_changed)
    bail("expected change on first call");

  ret = notify_check(token, &something_changed);
  if (ret)
    bail("notify_check #2 failed");
  if (something_changed)
    bail("expected no change");

  pid_t c = fork();
  if (c == 0) {
    ret = notify_check(token, &something_changed);
    if (ret) {
      if (ret == NOTIFY_STATUS_INVALID_TOKEN)
        printf("ret is invalid token\n");

      if (!notify_is_valid_token(token))
        printf("token is not valid\n");

      bail("notify_check in fork failed");
    }

    if (something_changed)
      bail("expected not changed");

    exit(0);
  }

  wait(NULL);
}

And ran it like this:

muir-mb:projects muir$ clang -o notify_test notify_test.c 
muir-mb:projects muir$ ./notify_test 
ret is invalid token
token is not valid
Error: notify_check in fork failed

回答2:

You're lucky you didn't experience nasal demons!

POSIX states that only async-signal-safe functions are legal to call in the child process after the fork() and before a call to an exec*() function. From the standard (emphasis added):

… the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

…

There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.

The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.

When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.

There are lists of async-signal-safe functions here and here. For any other function, if it's not specifically documented that the implementations on the platforms to which you're deploying add a non-standard safety guarantee, then you must consider it unsafe and its behavior on the child side of a fork() to be undefined.

来源：https://stackoverflow.com/questions/27932330/why-is-tzset-a-lot-slower-after-forking-on-mac-os-x

标签

macos

fork

libc