Consider a program which uses a large number of roughly page-sized memory regions (say 64 kB or so), each of which is rather short-lived. (In my particular case, these are alter
I did a lot of research into this topic (for a different use) at some point. In my case I needed a large, very sparsely populated hashmap, plus the ability to zero it every now and then.
mmap solution:
The easiest solution to zero a mapping like this (and a portable one, since madvise(MADV_DONTNEED) is Linux-specific) is to mmap a new mapping over it.
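#include <sys/mman.h>
void *mapping = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// use the mapping
// zero certain pages by mapping fresh anonymous memory over them
mmap((char *)mapping + page_aligned_offset, length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);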
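Performance-wise, the last call is equivalent to a munmap followed by an mmap with MAP_FIXED, but it is thread-safe: the range is replaced in a single call, so no other thread can claim the address range in between.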
Performance-wise, the problem with this solution is that the pages have to be faulted in again on a subsequent write access, which triggers a page fault and a context switch into the kernel. This is only efficient if very few pages were faulted in in the first place.
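To make the remapping approach concrete, here is a minimal sketch in C. The helper names region_create and region_zero_by_remap are made up for illustration; the sketch assumes a private anonymous read/write mapping and page-aligned offsets and lengths.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a private anonymous read/write mapping.
 * Returns NULL on failure. */
static void *region_create(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: zero [offset, offset + length) by mapping fresh
 * anonymous pages over it. Both values must be multiples of the page size.
 * Returns 0 on success, -1 on failure. */
static int region_zero_by_remap(void *base, size_t offset, size_t length) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert(length > 0 && offset % page == 0 && length % page == 0);
    void *p = mmap((char *)base + offset, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

After region_zero_by_remap returns, the affected pages read as zero, and the first write to each of them takes a page fault again, which is exactly the cost described above.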
memset solution:
After seeing such crap performance when most of the mapping has to be remapped, I decided to zero the memory manually with memset. If roughly over 70% of the pages are already faulted in (and if they are not, they will be after the first round of memset), then this is faster than remapping those pages.
mincore solution:
My next idea was to memset only those pages that had actually been faulted in. This solution is NOT thread-safe. Calling mincore to determine which pages are faulted in and then selectively zeroing them with memset was a significant performance improvement until over 50% of the mapping was faulted in, at which point memsetting the entire mapping became simpler (mincore is a system call and requires a context switch).
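A rough sketch of the mincore variant, assuming the Linux signature of mincore (the vector is unsigned char *) and a page-aligned anonymous mapping whose length is a multiple of the page size. The name zero_resident_pages is hypothetical. Note one caveat this sketch does not handle: a page that has been swapped out may be reported as non-resident even though it still holds data.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: memset only the pages that mincore reports as
 * resident. Not thread-safe: a page faulted in by another thread between
 * the mincore call and the loop is missed.
 * Returns the number of pages zeroed, or -1 on error. */
static long zero_resident_pages(void *base, size_t length) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)base % page == 0 && length % page == 0);

    size_t npages = length / page;
    unsigned char *vec = malloc(npages);   /* one status byte per page */
    if (vec == NULL)
        return -1;
    if (mincore(base, length, vec) != 0) {
        free(vec);
        return -1;
    }

    long zeroed = 0;
    for (size_t i = 0; i < npages; i++) {
        if (vec[i] & 1) {                  /* lowest bit set: page is resident */
            memset((char *)base + i * page, 0, page);
            zeroed++;
        }
    }
    free(vec);
    return zeroed;
}
```

The single mincore call covers the whole mapping, so the system call overhead is paid once per wipe rather than once per page.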
In-core table solution:
The final approach, and the one I ended up using, was to keep my own in-core table (one bit per page) that records whether a page has been used since the last wipe. Each round then zeroes only the pages that were actually used. It is obviously not thread-safe either, and it requires you to track in user space which pages have been written to, but if you need this performance then it is by far the most efficient approach.
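The table idea can be sketched roughly as follows. All names here (struct tracked_region, tracked_init, tracked_mark, tracked_wipe) are invented for illustration, and the sketch assumes the caller marks a page every time it writes to it and that the region length is a multiple of the page size.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical bookkeeping for one region: one dirty bit per page. */
struct tracked_region {
    unsigned char *base;
    size_t page_size;
    size_t npages;
    size_t nwords;
    unsigned long *dirty;
};

/* Returns 0 on success, -1 if the bitmap cannot be allocated. */
static int tracked_init(struct tracked_region *r, void *base,
                        size_t length, size_t page_size) {
    r->base = base;
    r->page_size = page_size;
    r->npages = length / page_size;
    r->nwords = (r->npages + BITS_PER_WORD - 1) / BITS_PER_WORD;
    r->dirty = calloc(r->nwords ? r->nwords : 1, sizeof *r->dirty);
    return r->dirty ? 0 : -1;
}

/* Record that the page containing `offset` has been written to. */
static void tracked_mark(struct tracked_region *r, size_t offset) {
    size_t pg = offset / r->page_size;
    assert(pg < r->npages);
    r->dirty[pg / BITS_PER_WORD] |= 1UL << (pg % BITS_PER_WORD);
}

/* Zero every page marked since the last wipe and clear the table.
 * Returns the number of pages zeroed. Not thread-safe. */
static size_t tracked_wipe(struct tracked_region *r) {
    size_t zeroed = 0;
    for (size_t w = 0; w < r->nwords; w++) {
        unsigned long bits = r->dirty[w];
        if (bits == 0)
            continue;                      /* skip BITS_PER_WORD clean pages at once */
        for (size_t b = 0; b < BITS_PER_WORD; b++) {
            if (bits & (1UL << b)) {
                size_t pg = w * BITS_PER_WORD + b;
                memset(r->base + pg * r->page_size, 0, r->page_size);
                zeroed++;
            }
        }
        r->dirty[w] = 0;
    }
    return zeroed;
}
```

Checking a whole word at a time lets the wipe skip long runs of untouched pages cheaply, which matters for a sparsely populated table; no system call is involved at any point.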