I\'ve been experimenting with mremap(). I\'d like to be able to move virtual memory pages around at high speeds. At least higher speeds than copying them. I have some ideas for
It appears that there is no faster user-land mechanism to re-order memory pages than memcpy(). mremap() is far slower and therefore only useful for re-sizing an area of memory previously assigned using mmap().
But page tables must be extremely fast I hear you saying! And it's possible for user-land to call kernel functions millions of times per second! The following references help explain why mremap() is so slow:
"An Introduction to Intel Memory Management" is a nice introduction to the theory of memory page mapping.
"Key concepts of Intel virtual memory" shows how it all works in more detail, in case you plan on writing your own OS :-)
"Sharing Page Tables in the Linux Kernel" shows some of the difficult Linux memory page mapping architectural decisions and their effect on performance.
Looking at all three references together then we can see that there has been little effort so far from kernel architects to expose memory page mapping to user-land in an efficient way. Even in the kernel, manipulation of the page table must be done by using up to three locks which will be slow.
Going forwards, since the page table itself is made up of 4k pages, it may be possible to change the kernel so that particular page table pages are unique to a particular thread and can be assumed to have lock-less access for the duration of the process. This would facilitate very efficient manipulation of that particular page table page via user-land. But this moves outside the scope of the original question.