Functions and variable space with threading using clone

问题

I currently intend to implement threading using clone() and a question is, if I have all threads using the same memory space, with each function I call in a given thread, will each thread using a different part of memory when the same function is called, or do I do have todo something to ensure this happens?

回答1:

Each thread will be using the same memory map overall but a different, separate thread-local stack for function calls. When different threads have called the same function (which itself lives at the same executable memory location), any local variables will not be shared, because they are allocated on the stack upon entry into the function / as needed, and each thread has its own stack by default.

References to any static/global memory (i.e., anything not allocated on the thread-local stack, such as globals or references to the heap or mmap memory regions passed to / visible to a thread after calling clone, and in general, the full memory map and process context itself) will, of course, be shared and subject to the usual host of multithreading issues (i.e., synchronization of shared state).

Note that you have to setup this thread-local stack space yourself before calling clone. From the manpage:

The child_stack argument specifies the location of the stack used by the child process. Since the child and calling process may share memory, it is not possible for the child process to execute in the same stack as the calling process. The calling process must therefore set up memory space for the child stack and pass a pointer to this space to clone(). Stacks grow downward on all processors that run Linux (except the HP PA processors), so child_stack usually points to the topmost address of the memory space set up for the child stack.

The child_stack parameter is the second argument to clone. It's the responsibility of the caller (i.e., the parent thread) to ensure that each child thread created via clone receives a separate and non-overlapping chunk of memory for its stack.

Note that allocating and setting-up this thread-local stack memory region is not at all simple. Ensure that your allocations are page-aligned (start address is on a 4K boundary), a multiple of the page size (4K), amply-sized (if you only have a few threads, 2MB is safe), and ideally contains a "guard" section following the usable space. The stack guard is some number of pages with no access privileges-- no reading, no writing-- following the main stack region to guard the rest of the virtual memory address space should a thread dynamically exceed its stack size (e.g., with a bunch of recursion or functions with very large temporary buffers as local variables) and try to continue to grow into the stack guard region, which will fail-early as the thread will be served a SIGSEGV right away rather than insidiously corrupting. The stack guard is technically optional. You should probably be using mmap to allocate your stacks, although posix_memalign would do as well.

All that said, I've got to ask: why try to implement threading with clone to start? There are some very challenging problems here, and the POSIX threading library has solved them (in a portable way as well). If it's the fine-grained control of clone you want, then checkout the pthread_attr_* functions; they pretty much cover every non-obscure use case (such as allowing you to allocate your own stack if you like-- from the previous discussion, I'd think you wouldn't). The very performant, general Linux implementation, amongst other things, fully wraps clone and a large variety of other heinous system calls relevant to threading-- many of which do not even have C library wrappers and must be called via syscall. It all depends upon what you want to do.

来源：https://stackoverflow.com/questions/13503244/functions-and-variable-space-with-threading-using-clone

标签

multithreading

memory

gcc

clone