Who creates and owns the call stack, and how does the call stack work with multiple threads?

Question

I know that each thread usually has one call stack, which is just a chunk of memory controlled through esp and ebp.

1. How are these call stacks created, and who is responsible for doing this? My guess is the runtime, for example the Swift runtime for an iOS application. And does the thread talk to its own call stack directly through esp and ebp, or through the runtime?

2. Each call stack has to work with the esp and ebp CPU registers. If I have a CPU with 2 cores and 4 threads, let's say it effectively has 4 logical cores (4 sets of registers). Does that mean each call stack works with the registers of one specific core only?

Answer 1:

The XNU kernel does it. Swift threads are POSIX pthreads, aka Mach threads. During program startup the XNU kernel parses the Mach-O executable format and handles either the modern LC_MAIN or the legacy LC_UNIXTHREAD load command (among others). This is handled in the kernel functions:

static
load_return_t
load_main(
        struct entry_point_command  *epc,
        thread_t        thread,
        int64_t             slide,
        load_result_t       *result
    )

static
load_return_t
load_unixthread(
    struct thread_command   *tcp,
    thread_t        thread,
    int64_t             slide,
    load_result_t       *result
)

which do happen to be open source

LC_MAIN initialises the stack through thread_userstackdefault

LC_UNIXTHREAD through load_threadstack.

As @PeterCordes mentions in the comments, the kernel only creates the main thread. The started process itself can then spawn child threads from its own main thread, either through an API like GCD or directly through a syscall (bsdthread_create, not sure if there are any others). The syscall happens to take user_addr_t stack as its 3rd argument (i.e. rdx in the x86-64 System V kernel ABI used by macOS). Reference for macOS syscalls
I haven't thoroughly investigated the details of this particular stack argument, but I would imagine it's similar to the thread_userstackdefault / load_threadstack approach.
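
To make the "process spawns its own threads" part concrete, here is a minimal sketch using Foundation's Thread, which lets the process request a stack size before the thread starts. The names Report, report, finished and worker are made up for this illustration, and the sketch does not show how the request reaches bsdthread_create; that path is assumed, not verified.

```swift
import Foundation

// Hypothetical helper used only to carry results out of the worker thread.
final class Report: @unchecked Sendable {
    var ranOnMainThread = true
    var stackSize = 0
}

let report = Report()
let finished = DispatchSemaphore(value: 0)

// The process, not the kernel's Mach-O loader, creates this secondary thread.
let worker = Thread {
    report.ranOnMainThread = Thread.isMainThread
    report.stackSize = Thread.current.stackSize
    finished.signal()
}
worker.stackSize = 1 << 20   // ask for a 1 MiB stack (a multiple of the page size)
worker.start()
finished.wait()              // wait until the worker has filled in the report

print("ran on main thread:", report.ranOnMainThread)
print("stack size reported by worker:", report.stackSize)
```

The stack size has to be set before start(); after that the stack already exists and the property only reports it.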

I believe your doubt about the Swift runtime's responsibility may arise from frequent mentions of data structures (like a Swift struct, no pun intended) being stored on the stack (which is, by the way, an implementation detail and not a guaranteed feature of the runtime).

Update:
Here's an example main.swift command line program illustrating the idea.

import Foundation

struct testStruct {
    var a: Int
}

class testClass {
}

func testLocalVariables() {
    print("main thread function with local varablies")
    var struct1 = testStruct(a: 5)
    withUnsafeBytes(of: &struct1) { print($0) }
    var classInstance = testClass()
    print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
}
testLocalVariables()

print("Main thread", Thread.isMainThread)
var struct1 = testStruct(a: 5)
var struct1Copy = struct1

withUnsafeBytes(of: &struct1) { print($0) }
withUnsafeBytes(of: &struct1Copy) { print($0) }

var string = "testString"
var stringCopy = string

withUnsafeBytes(of: &string) { print($0) }
withUnsafeBytes(of: &stringCopy) { print($0) }

var classInstance = testClass()
var classInstanceAssignment = classInstance
var classInstance2 = testClass()

print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
print(NSString(format: "%p", unsafeBitCast(classInstanceAssignment, to: Int.self)))
print(NSString(format: "%p", unsafeBitCast(classInstance2, to: Int.self)))

DispatchQueue.global(qos: .background).async {
    print("Child thread", Thread.isMainThread)
    withUnsafeBytes(of: &struct1) { print($0) }
    withUnsafeBytes(of: &struct1Copy) { print($0) }
    withUnsafeBytes(of: &string) { print($0) }
    withUnsafeBytes(of: &stringCopy) { print($0) }
    print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
    print(NSString(format: "%p", unsafeBitCast(classInstanceAssignment, to: Int.self)))
    print(NSString(format: "%p", unsafeBitCast(classInstance2, to: Int.self)))
}

//Keep main thread alive indefinitely so that process doesn't exit
CFRunLoopRun()

My output looks like this:

main thread function with local varablies
UnsafeRawBufferPointer(start: 0x00007ffeefbfeff8, count: 8)
0x7fcd0940cd30
Main thread true
UnsafeRawBufferPointer(start: 0x000000010058a6f0, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a6f8, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a700, count: 16)
UnsafeRawBufferPointer(start: 0x000000010058a710, count: 16)
0x7fcd0940cd40
0x7fcd0940cd40
0x7fcd0940c900
Child thread false
UnsafeRawBufferPointer(start: 0x000000010058a6f0, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a6f8, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a700, count: 16)
UnsafeRawBufferPointer(start: 0x000000010058a710, count: 16)
0x7fcd0940cd40
0x7fcd0940cd40
0x7fcd0940c900

Now we can observe a couple of interesting things:

1. Class instances clearly occupy a different part of memory than structs.
2. Assigning a struct to a new variable makes a copy at a new memory address.
3. Assigning a class instance just copies the pointer.
4. Both the main thread and the child thread, when referring to global structs, point to exactly the same memory.
5. Strings do have a struct container.
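
The copy-versus-pointer observations can also be checked without looking at addresses, by mutating one side and reading the other. This is a small sketch with made-up types (Point, Counter), not part of the program above:

```swift
struct Point { var x: Int }
final class Counter { var value = 0 }

let a = Point(x: 5)
var b = a            // struct assignment: b is an independent copy
b.x = 99             // does not affect a

let c1 = Counter()
let c2 = c1          // class assignment: c2 refers to the same instance
c2.value = 7         // visible through c1 as well

print(a.x, b.x)              // 5 99
print(c1.value, c1 === c2)   // 7 true
```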

Update 2 - proof of point 4: we can actually inspect the memory underneath:

x 0x10058a6f0 -c 8
0x10058a6f0: 05 00 00 00 00 00 00 00                          ........
x 0x10058a6f8 -c 8
0x10058a6f8: 05 00 00 00 00 00 00 00                          ........

So this definitely is the actual struct raw data i.e. the struct itself.

Update 3

I added a testLocalVariables() function, to distinguish between Swift Struct defined as global and local variables. In this case

x 0x00007ffeefbfeff8 -c 8
0x7ffeefbfeff8: 05 00 00 00 00 00 00 00                          ........

it clearly lives on the thread stack.

Last but not least when in lldb I do:

re read rsp
rsp = 0x00007ffeefbfefc0  from main thread
re read rsp
rsp = 0x000070000291ea40  from child thread

it yields a different value for each thread, so the thread stacks are clearly distinct.
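
The same thing can be approximated from inside the program, without lldb, by taking the address of a local variable on each thread. This is only a sketch: approximateStackAddress and AddressBox are hypothetical names, and it assumes the compiler keeps the local on the thread stack, which is usual but not guaranteed.

```swift
import Foundation

// Returns the address of a local variable, which is normally somewhere
// inside the calling thread's stack (an assumption, not a guarantee).
@inline(never)
func approximateStackAddress() -> UInt {
    var local = 0
    return withUnsafeMutablePointer(to: &local) { UInt(bitPattern: $0) }
}

// Hypothetical box for passing the child's result back to the main thread.
final class AddressBox: @unchecked Sendable { var value: UInt = 0 }

let mainAddress = approximateStackAddress()
let childBox = AddressBox()
let childDone = DispatchSemaphore(value: 0)

Thread.detachNewThread {
    childBox.value = approximateStackAddress()
    childDone.signal()
}
childDone.wait()

print("main  thread local at 0x" + String(mainAddress, radix: 16))
print("child thread local at 0x" + String(childBox.value, radix: 16))
```

On a typical run the two addresses are far apart, matching the two rsp values read in lldb.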

Digging further
There's a handy memory region lldb command which sheds some light on what's going on.

memory region 0x000000010058a6f0
[0x000000010053d000-0x000000010058b000) rw- __DATA

So global structs sit in a preallocated, writable __DATA memory page of the executable (the same one where your global variables live). The same command for the class address 0x7fcd0940cd40 isn't as spectacular (I reckon because that's dynamically allocated heap memory). The same goes for the thread stack address 0x7ffeefbfefc0, which clearly isn't one of the executable's own segments.

Fortunately there is one last tool to go further down the rabbit hole: vmmap -v -purge pid. It confirms that class instances sit in the MALLOC'ed heap, and likewise a thread stack (at least for the main thread) can be cross-referenced to a Stack region.

Somewhat related question is also here.

HTH

Answer 2:

(I'm assuming Swift threading is just like threads in other languages. There really aren't many good options, either normal OS-level threads or user-space "green threads", or a mix of both. The difference is only where context switches happen; main concepts are still the same)

Each thread has its own stack, allocated in the process's address space by mmap or something by the parent thread, or maybe by the same system call that creates the thread. IDK iOS system calls. In Linux you have to pass a void *child_stack to the Linux-specific clone(2) system call that actually creates a new thread. It's very rare to use low-level OS-specific system calls directly; a language runtime would probably do threading on top of pthreads functions like pthread_create, and that pthreads library would handle the OS-specific details.

And yes, each software thread has its own architectural state, including RSP on x86-64, or sp on AArch64. (Or ESP if you write obsolete 32-bit x86 code.) I assume frame pointers are optional for Swift.

And yes each logical core has its own architectural state (registers including a stack pointer); a software thread runs on a logical core, and context switches between software threads save/restore registers. Related, maybe a duplicate of What resources are shared between threads?.

Software threads share the same page tables (virtual address space), but not registers.
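
A minimal sketch of the shared address space, assuming Foundation's Thread: the main thread allocates one Int, hands only the numeric address to a second thread, and the second thread writes through it. The names shared, address, written and finalValue are made up for the example.

```swift
import Foundation

// One heap-allocated Int, owned by the main thread for this demo.
let shared = UnsafeMutablePointer<Int>.allocate(capacity: 1)
shared.initialize(to: 5)

// Pass the plain numeric address; it is valid in every thread of the
// process because all threads use the same page tables.
let address = UInt(bitPattern: shared)
let written = DispatchSemaphore(value: 0)

Thread.detachNewThread {
    let pointer = UnsafeMutablePointer<Int>(bitPattern: address)!
    pointer.pointee += 1      // write from the child thread
    written.signal()
}
written.wait()                // the child's write is complete past this point

let finalValue = shared.pointee
print(finalValue)             // 6

shared.deinitialize(count: 1)
shared.deallocate()
```

The stack pointer, by contrast, is per-thread register state, so each thread's locals land in its own stack even though both stacks live in the same address space.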

Source: https://stackoverflow.com/questions/58405568/who-creates-and-owns-the-call-stack-and-how-does-call-stack-works-in-multithread

Tags

ios

swift

multithreading

assembly

runtime