Question
I am building a user-mode NUMA-aware memory allocator for Linux. During initialization, the allocator grabs one large chunk of memory per NUMA node. After that, user requests for memory pages are served from these chunks.
If the user asks for n pages, it is easy to hand out n pages from a particular chunk. But now I want to implement an interleaved allocation policy, where the user gets one page from each chunk in round-robin order, up to n pages. The problem is that the virtual addresses of these pages are no longer contiguous.
Q1: Is there a way to return virtually contiguous memory? The only solution I can think of is a "smart" pointer that knows how to jump from one page to the next.
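To make the "smart" pointer idea concrete, here is a minimal sketch of the address translation it would have to perform. The names interleaved_view and iv_addr are hypothetical, and it assumes logical page k lives in chunk k mod N at page index k / N within that chunk.

```c
#include <stddef.h>

/* Hypothetical descriptor for the interleaved layout: one base pointer
   per NUMA-node chunk. None of these names come from a real library. */
typedef struct {
    char  **chunk_base;  /* chunk_base[i] = start of node i's chunk */
    size_t  nodes;       /* number of chunks, N */
    size_t  page_size;   /* bytes per page */
} interleaved_view;

/* Translate a byte offset in the logical (interleaved) buffer into a
   real address: logical page k is page k / N of chunk k % N. */
void *iv_addr(const interleaved_view *v, size_t offset)
{
    size_t logical_page = offset / v->page_size;
    size_t node = logical_page % v->nodes;   /* round-robin chunk */
    size_t slot = logical_page / v->nodes;   /* page index inside that chunk */
    return v->chunk_base[node] + slot * v->page_size + offset % v->page_size;
}
```

Every access pays for a division and a modulo, and no object may straddle a page boundary, which is why a region that is contiguous in virtual address space is preferable if it can be arranged.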
One of the reasons I am going down this path is that I am not happy with Linux's MPOL_INTERLEAVE memory allocation policy, whose round-robin placement is not strict (deterministic).
Q2: Is there an inexpensive way of knowing which page and NUMA node a given virtual address range is mapped to? More precisely, I do not know how to get fine-grained page-level information from reading /proc/<pid>/numa_maps.
Thank you for your answers.
Answer 1:
A1. Virtually contiguous memory does not imply that the physical memory is contiguous. In Linux, physical pages are not bound to virtual pages at malloc time, but at the first page fault, i.e. when the page is first touched.
If you really want to, you should be able to pre-fault the pages to bind each one to a particular NUMA node. Under the default allocation policy a page is placed on the node of the CPU that first touches it, so touching the pages from the right nodes gives you strict interleaving.
e.g.
N     = number of NUMA nodes
PAGES = number of pages in the allocation
for (i = 0; i < N; i++):
    pin the current thread to node i
    for (p = i; p < PAGES; p += N):
        touch page p
Once that is set up, you can hand out pages from the pre-interleaved, virtually contiguous region.
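The loop above could be sketched in C roughly as follows. This is an illustration under stated assumptions, not a tested allocator: the helper names parse_cpulist, pin_to_node and prefault_interleaved are made up for this example, and the thread is pinned by reading each node's CPU list from /sys/devices/system/node/nodeN/cpulist and calling sched_setaffinity, rather than through libnuma. It also assumes base points to freshly mapped anonymous memory that has not been touched yet.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse a sysfs CPU list such as "0-3,8-11" into a cpu_set_t.
   Returns the number of CPUs added, or -1 on malformed input. */
int parse_cpulist(const char *s, cpu_set_t *set)
{
    int count = 0;
    CPU_ZERO(set);
    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0) return -1;
        long hi = lo;
        if (*end == '-') {                      /* range "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s || hi < lo) return -1;
        }
        for (long c = lo; c <= hi && c < CPU_SETSIZE; c++) {
            CPU_SET((int)c, set);
            count++;
        }
        if (*end == ',') s = end + 1;
        else if (*end == '\0' || *end == '\n') s = end;
        else return -1;
    }
    return count;
}

/* Pin the calling thread to the CPUs of one NUMA node.
   Fails for nodes that have no CPUs (memory-only nodes). */
int pin_to_node(int node)
{
    char path[64], buf[4096];
    cpu_set_t set;
    snprintf(path, sizeof path,
             "/sys/devices/system/node/node%d/cpulist", node);
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    char *line = fgets(buf, sizeof buf, f);
    fclose(f);
    if (!line || parse_cpulist(buf, &set) <= 0) return -1;
    return sched_setaffinity(0, sizeof set, &set);
}

/* Touch pages i, i+N, i+2N, ... while pinned to node i. */
int prefault_interleaved(char *base, size_t pages, int nodes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    for (int i = 0; i < nodes; i++) {
        if (pin_to_node(i) != 0) return -1;
        for (size_t p = (size_t)i; p < pages; p += (size_t)nodes)
            base[p * page] = 0;   /* a write, so a real page is allocated */
    }
    return 0;
}
```

A real allocator would probably save the thread's original affinity with sched_getaffinity and restore it afterwards. If transparent huge pages are enabled, a single fault may populate more than one small page at once, so page-level placement is worth verifying after the fact.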
A2. You can determine the NUMA node of a virtual address by calling move_pages (declared in <numaif.h>) and passing NULL as the array of target nodes. No pages are moved in that case; the node each page currently resides on is written to the status array.
e.g.
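void *pages[1] = { ptr_to_check };  /* should be page-aligned; needs <numaif.h>, link with -lnuma */
int status[1];
if (move_pages(0, 1, pages, NULL, status, 0) == 0 && status[0] >= 0)
    printf("node %d\n", status[0]);  /* a negative status is -errno, e.g. -ENOENT if not yet faulted in */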
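For the page-level view of a whole range that the question asks about, the same call can be batched. The sketch below is only an illustration: page_floor and query_page_nodes are hypothetical helper names, and it invokes the move_pages system call directly through syscall(2) so that it does not need to link against libnuma.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Round an address down to the start of its page.
   page_size must be a power of two. */
void *page_floor(const void *addr, size_t page_size)
{
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* Fill status[0..count-1] with the node of each of the count pages
   starting at the page that contains addr. Returns 0 on success and
   -1 on failure. A negative status[i] is -errno for that page,
   e.g. -ENOENT when the page has not been faulted in yet. */
int query_page_nodes(void *addr, size_t count, int *status)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = page_floor(addr, page);
    void **pages = malloc(count * sizeof *pages);
    if (!pages) return -1;
    for (size_t i = 0; i < count; i++)
        pages[i] = base + i * page;
    /* nodes == NULL: nothing is moved, only the current node is reported */
    long rc = syscall(SYS_move_pages, 0, (unsigned long)count,
                      pages, NULL, status, 0);
    free(pages);
    return rc == 0 ? 0 : -1;
}
```

After pre-faulting an interleaved region of N nodes, one would expect status[p] == p % N for every page p, which gives a cheap way to check that the interleaving really is strict. In sandboxed environments the call may be refused, in which case the function returns -1.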
Source: https://stackoverflow.com/questions/8590330/how-to-implement-interleaved-page-allocation-in-a-user-mode-numa-aware-memory-al