Sharing heap memory with fork()

半阙折子戏 2021-02-15 11:53

I am working on implementing a database server in C that will handle requests from multiple clients. In order to do so I am using fork() to handle connections for individual cli

5 Answers
  • 2021-02-15 12:21

    Many popular HTTP servers use fork() to take advantage of multiple processors; Nginx is one of them.

    Threading brings with it an entire set of headaches that I personally like to avoid unless absolutely necessary. In my experience with other people's threading code, a program is never entirely free of crashes caused by multithreading bugs.

    Multiprocessing lets you use all the processors on your machine without implicitly sharing memory between threads of execution, so by default it avoids the typical, seemingly endless multithreading bugs.

    I like to sleep at night without getting those 2am calls, knowing my web-facing, high-throughput servers aren't going to crash on me because I failed to see one of dozens of multithreading pitfalls that day.

    There are many cases where shared memory is pain-free, for example when the data in shared memory is read-only. Then you don't have to worry about locks at all.

  • 2021-02-15 12:26

    Would it suffice to share the root pointer of the database or do I have to make all allocated memory as shared?

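    No, because each process has its own private address space. Copy-on-write is a kernel-space optimization that is transparent to user space: as soon as either process writes to a page, it gets its own private copy, so the write is never visible to the other process.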
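
To make this concrete, here is a minimal sketch (the function name `value_seen_by_parent` is made up for illustration): the child modifies a heap value inherited across fork(), and the parent still reads the old value afterwards.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the value the parent reads after the child
 * has modified its own copy of a malloc'd int, or -1 on error. */
int value_seen_by_parent(void)
{
    int *counter = malloc(sizeof *counter);
    if (counter == NULL)
        return -1;
    *counter = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(counter);
        return -1;
    }
    if (pid == 0) {
        /* Child: same pointer value, but the write lands in the child's
         * private copy of the page. */
        *counter = 2;
        _exit(0);
    }

    /* Parent: wait for the child, then read the parent's own copy. */
    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *counter;
    free(counter);
    return seen;
}
```

Calling `value_seen_by_parent()` should return 1, not 2, which is why sharing only the root pointer of the database is not enough.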

    As others have said, SHM or mmap'd files are the only way to share memory between separate processes.

  • 2021-02-15 12:37

    Sorry for answering a month later, but I don't think the existing answers gave what the OP asked for.

    I think you are basically looking to do what is done by Redis (and probably others). They describe it in http://redis.io/topics/persistence (search for "copy-on-write").

    • threads defeat the purpose
    • classic shared memory (shm, mapped memory) also defeats the purpose

    The primary benefit to using this method is avoidance of locking, which can be a pain to get right.

    As far as I understand it the idea of using COW is to:

    • fork when you want to write, not in advance
    • the child (re)writes the data to disk, then immediately exits
    • the parent keeps on doing its work and detects (via SIGCHLD) when the child has exited. If the parent changes the hash while the child is still running, the kernel copies the affected memory pages, so the child keeps seeing the data as it was at fork time.
      A "dirty flag" is used to track if a new fork is needed to execute a new write.

    Things to watch out for:

    • Make sure there is only one outstanding child at a time.
    • Transactional safety: write to a temp file first, then move it over so that you always have a complete copy, maybe keeping the previous around if the move is not atomic.
    • Test whether you will have issues with other resources that get duplicated (file descriptors, global destructors in C++).
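
The steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not how Redis actually implements it: the `dataset` buffer stands in for the in-memory database, the function names are made up, and the parent uses a blocking waitpid() instead of a SIGCHLD handler and a dirty flag to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() under strict -std modes */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the in-memory database (an assumption for this sketch). */
static char dataset[64] = "version-1";

/* Child side: write the data to "<path>.tmp", then rename it over <path>,
 * so that <path> always holds a complete snapshot. Returns 0 on success. */
static int write_snapshot(const char *path)
{
    char tmp[256];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    FILE *f = fopen(tmp, "w");
    if (f == NULL)
        return -1;

    size_t len = strlen(dataset);
    int ok = fwrite(dataset, 1, len, f) == len;
    ok = (fflush(f) == 0) && ok;
    ok = (fsync(fileno(f)) == 0) && ok;   /* push the data to disk */
    ok = (fclose(f) == 0) && ok;
    if (!ok) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path);             /* atomic replacement on POSIX */
}

/* Parent side: fork a snapshot child, keep modifying the data, reap the
 * child. Returns 0 if the child wrote the snapshot successfully. */
int snapshot_while_mutating(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(write_snapshot(path) == 0 ? 0 : 1);

    /* The parent's change only touches the parent's copy of the page;
     * the child still sees "version-1". */
    strcpy(dataset, "version-2");

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After `snapshot_while_mutating("/tmp/db.snapshot")` returns 0, the file should contain `version-1` even though the parent's `dataset` already holds `version-2`. No lock is needed because the child works on a frozen view of the data.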

    You may want to take a gander at the Redis code as well.

  • 2021-02-15 12:43

    If you must use fork, shared memory seems to be the 'only' choice.

    Actually, I think threads are more suitable for your scenario.

    If you don't want to go multi-threaded, here is another choice: use a single process with a single thread, like Redis.

    With this model you don't need to worry about locks, and if you want to scale, just design a routing policy, for example routing each request by the hash value of its key.
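
As an illustration of such a routing policy, here is a minimal sketch that maps a key to one of several single-threaded server instances. The function names and the choice of the 32-bit FNV-1a hash are assumptions made for this example; any stable hash would do.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated key. */
uint32_t fnv1a_hash(const char *key)
{
    uint32_t hash = 2166136261u;              /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; p++) {
        hash ^= *p;
        hash *= 16777619u;                    /* FNV prime */
    }
    return hash;
}

/* Hypothetical router: picks the instance (shard) responsible for a key. */
unsigned shard_for_key(const char *key, unsigned num_shards)
{
    assert(num_shards > 0);
    return fnv1a_hash(key) % num_shards;
}
```

Every client computes `shard_for_key(key, n)` and sends the request to that instance, so a given key is always handled by the same process and no cross-process locking is needed. Note that plain modulo routing remaps most keys when `n` changes.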

  • 2021-02-15 12:46

    First of all, fork is completely inappropriate for what you're trying to achieve. Even if you can make it work, it's a horrible hack. In general, fork only works for very simplistic programs anyway, and I would go so far as to say that fork should never be used except when followed quickly by exec, but that's beside the point here. You really should be using threads.

    With that said, the only way to have memory that's shared between the parent and child after fork, and where the same pointers are valid in both, is to mmap (or shmat, but that's a lot fuglier) a file or anonymous map with MAP_SHARED prior to the fork. You cannot create new shared memory like this after fork because there's no guarantee that it will get mapped at the same address range in both.
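
A minimal sketch of that approach, assuming a platform that supports MAP_ANONYMOUS (the function name `value_written_by_child` is made up for illustration): the mapping is created before fork(), so the same pointer refers to the same memory in both processes.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the value the parent reads after the child
 * has written to a MAP_SHARED region, or -1 on error. */
int value_written_by_child(void)
{
    /* Create the shared mapping BEFORE fork so that both processes
     * inherit it at the same address. */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = 42;   /* visible to the parent: no copy-on-write here */
        _exit(0);
    }

    /* Waiting for the child doubles as synchronization in this toy case. */
    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

Here `value_written_by_child()` should return 42. In a real server, concurrent access to the region would need explicit synchronization, and any pointers stored inside the region must themselves point into the region, which rules out ordinary malloc() for the shared data.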

    Just don't use fork. It's not the right tool for the job.
