Allocating a Thread's Stack on a specific NUMA memory

我只是一个虾纸丫 提交于 2019-12-08 21:09:26

Here's the code I use for this (slightly adapted to remove some constants defined elsewhere). Note that I first create the thread normally, and then call the SetAffinityAndRelocateStack() below from within the thread. I think this is much better than trying to create your own stack, since stacks have special support for growing in case the bottom is reached.

The code can also be adapted to operate on the newly created thread from outside, but this could give rise to race conditions (e.g. if the thread performs I/O into its stack), so I wouldn't recommend it.

void* PreFaultStack()
{
    const size_t NUM_PAGES_TO_PRE_FAULT = 50;
    const size_t size = NUM_PAGES_TO_PRE_FAULT * numa_pagesize();
    void *allocaBase = alloca(size);
    memset(allocaBase, 0, size);
    return allocaBase;
}

void SetAffinityAndRelocateStack(int cpuNum)
{
    assert(-1 != cpuNum);
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpuNum, &cpuset);
    const int rc = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
    assert(0 == rc);

    pthread_attr_t attr;
    void *stackAddr = nullptr;
    size_t stackSize = 0;
    if ((0 != pthread_getattr_np(pthread_self(), &attr)) || (0 != pthread_attr_getstack(&attr, &stackAddr, &stackSize))) {
        assert(false);
    }

    const unsigned long nodeMask = 1UL << numa_node_of_cpu(cpuNum);
    const auto bindRc = mbind(stackAddr, stackSize, MPOL_BIND, &nodeMask, sizeof(nodeMask), MPOL_MF_MOVE | MPOL_MF_STRICT);
    assert(0 == bindRc);

    PreFaultStack();
    // TODO: Also lock the stack with mlock() to guarantee it stays resident in RAM
    return;
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!