Setup:
I'm trying to write code which sho
I don't think there's any standard POSIX API for this.
Parsing /proc/self/maps is your best bet. (There may be a library to help with this, but I don't know of one.)
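As a rough sketch of what that parsing could look like on Linux: the struct map_range type and the find_mapping() helper below are names made up for this example, not an existing API. It only reads the address range and permission fields at the start of each line and ignores the rest.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the first two fields of one /proc/self/maps line. */
struct map_range {
    uintptr_t start;   /* inclusive */
    uintptr_t end;     /* exclusive */
    char perms[5];     /* e.g. "r-xp" */
};

/* Look up the mapping that contains addr.
 * Returns 1 if found (and fills *out), 0 if addr is unmapped,
 * -1 if /proc/self/maps could not be opened. */
int find_mapping(const void *addr, struct map_range *out)
{
    assert(out != NULL);
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uintptr_t a = (uintptr_t)addr;
    char line[4352];   /* assumed large enough for the fixed fields plus a long path */
    int found = 0;
    while (!found && fgets(line, sizeof line, f)) {
        uintptr_t start, end;
        char perms[5];
        /* Each line begins with "start-end perms ..." in hex. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &start, &end, perms) != 3)
            continue;
        if (a >= start && a < end) {
            out->start = start;
            out->end = end;
            memcpy(out->perms, perms, sizeof perms);
            found = 1;
        }
    }
    fclose(f);
    return found;
}
```

Calling find_mapping(&some_local, &r) should report the stack mapping; the file is re-read on every call, so cache the parsed ranges if you query often.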
You tagged this ASLR, though. If you just want to know where the text / data / bss segments are, you can put labels at the start/end of them so those addresses are available in C. e.g. extern const char bss_end[]; would be a good way to reference a label you put at the end of the BSS using a linker script and maybe some hand-written asm. The compiler-generated asm will use a RIP-relative LEA instruction to get the address into a register relative to the current instruction address (which the CPU knows because it's executing the code mapped there).
Or maybe just a linker script and declaring dummy C variables in custom sections.
I'm not sure if you can do that for the stack mapping. With a large environment and/or argv, the initial stack on entry to main() or even _start might not be in the same page as the highest address in the stack mapping.
To scan, you either need to catch SIGSEGV or scan with system calls instead of user-space loads or stores.
mmap and mprotect can't query the old setting, so they're not very useful for non-destructive stuff. mmap with a hint but without MAP_FIXED could map a page, and then you could munmap it. If the actual chosen address != hint, then you could assume the address was in use.
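A minimal sketch of that hint-based probe, assuming Linux-style mmap flags; range_looks_in_use() is a made-up name. Keep in mind the kernel is free to ignore a hint for other reasons, so a "yes" here is only a suggestion that something is mapped in the range.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: ask for a PROT_NONE mapping at hint without MAP_FIXED.
 * Returns 1 if the kernel placed it somewhere else (range probably in use),
 * 0 if the hint was honored (range was free), -1 if mmap itself failed.
 * hint must be page-aligned, otherwise the kernel rounds it and the
 * comparison below is meaningless. */
int range_looks_in_use(void *hint, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert((uintptr_t)hint % (uintptr_t)page == 0);

    void *tmp = mmap(hint, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (tmp == MAP_FAILED)
        return -1;
    munmap(tmp, len);      /* undo the probe wherever it landed */
    return tmp != hint;
}
```

This briefly changes the address space, so it is not safe if another thread might be calling mmap at the same time.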
Maybe a better option would be to scan with madvise(MADV_NORMAL) and check for ENOMEM, but only one page at a time.
You could even do this portably with posix_madvise(page, 4096, POSIX_MADV_NORMAL). Unlike madvise, it returns the error number directly instead of setting errno, so check the return value for ENOMEM: "Addresses in the specified range are partially or completely outside the caller's address space."
On Linux with madvise(2) you could use MADV_DOFORK or something that's even less likely to be at a non-default setting for each page.
But on Linux, an even better choice for read-only querying of the process memory map is mincore(2). It also uses the error code ENOMEM for invalid addresses in the queried range: "addr to addr + length contained unmapped memory". (EFAULT is for the result vector pointing to unmapped memory, not addr.)
Only the errno result is useful; the vec result shows you whether pages are hot in RAM or not. (I'm not sure whether it shows you which pages are wired into the HW page tables, or whether it would also count a page that's resident in the pagecache for a memory-mapped file but not wired, so that an access would trigger a soft page fault.)
You can binary-search for the end of a large mapping by calling mincore with larger lengths. (vec needs room for one byte per page in the queried range.)
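The binary search could look something like this sketch. mapped_pages_from() and range_is_mapped() are hypothetical names; the search relies on the fact that if the first n pages are all mapped, so is every shorter prefix. It finds the end of the contiguous mapped run, which may span several adjacent mappings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* 1 if every page in [addr, addr+len) is mapped, 0 on ENOMEM, -1 otherwise. */
static int range_is_mapped(void *addr, size_t len, unsigned char *vec)
{
    if (mincore(addr, len, vec) == 0)
        return 1;
    return errno == ENOMEM ? 0 : -1;
}

/* Hypothetical helper: number of contiguous mapped pages starting at the
 * page-aligned address start, capped at max_pages.
 * Returns (size_t)-1 on allocation failure or unexpected mincore errors. */
size_t mapped_pages_from(void *start, size_t max_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)start % page == 0);

    /* mincore writes one byte per page; we only care about errno. */
    unsigned char *vec = malloc(max_pages ? max_pages : 1);
    if (!vec)
        return (size_t)-1;

    size_t lo = 0, hi = max_pages;   /* first lo pages are mapped; answer <= hi */
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;   /* round up so lo always advances */
        int r = range_is_mapped(start, mid * page, vec);
        if (r < 0) {
            free(vec);
            return (size_t)-1;
        }
        if (r)
            lo = mid;
        else
            hi = mid - 1;
    }
    free(vec);
    return lo;
}
```

Each probe re-checks the whole prefix for simplicity; probing only the pages past lo would do less work per call.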
But unfortunately I don't see any equivalent for finding the next mapping after an unmapped page, which would be much more useful because most of the address-space will be unmapped. Especially in x86-64 with 64-bit addresses!
For sparse files there's lseek(SEEK_DATA). I wonder if that works on Linux's /proc/self/mem? Probably not.
So maybe large (like 256MB) mmap calls with a hint, checking whether (tmp = mmap(page, ...)) == page, would be a good way to scan through unmapped regions looking for mapped pages. Either way you simply munmap(tmp), whether mmap used your hint address or not.
Parsing /proc/self/maps is almost certainly more efficient.
But the most efficient thing would be putting labels where you want them for static addresses, and tracking dynamic allocations so you already know where your memory is. This works if you have no memory leaks. (glibc malloc might have an API to walk the mappings, but I'm not sure.)
Note that any system call will produce an errno=EFAULT if you pass it an unmapped address for a parameter that's supposed to point to something.
One possible candidate is access(2), which takes a filename and returns an integer. It has zero effect on the state of anything else, success or fail, but the downside is filesystem access if the pointed-to memory is a valid path string. And it's looking for an implicit-length C string, so it could also be slow if passed a pointer to memory with no 0 byte anywhere soon. I guess ENAMETOOLONG would kick in, but it still definitely reads every accessible page you use it on, faulting it in even if it was paged out.
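For illustration, an access(2)-based probe might be sketched like this; path_ptr_readable() is a made-up name. Note that it really tests whether the kernel can read a C string starting at p, so it reports 0 for PROT_NONE pages and also for a string that runs off the end of a mapping before hitting a 0 byte.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if access(2) failed with EFAULT while
 * reading a path string at p, 1 for any other outcome (success, ENOENT,
 * ENAMETOOLONG, ...). Preserves the caller's errno. */
int path_ptr_readable(const void *p)
{
    int saved = errno;
    int r = access((const char *)p, F_OK);
    int fault = (r == -1 && errno == EFAULT);
    errno = saved;
    return !fault;
}
```

If the bytes at p happen to spell a real path, this does a filesystem lookup, which is the downside mentioned above.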
If you open a file descriptor on /dev/null, you could make write() system calls with that. Or even use writev(2): writev(devnull_fd, io_vec, count) passes the kernel a vector of pointers in one system call, and you get an EFAULT if any of them are bad (with lengths of 1 byte each). But (unless the /dev/null driver skips reads early enough) this does actually read from pages that are valid, faulting them in, unlike mincore(). Depending on how it's implemented internally, the /dev/null driver might see the request early enough for its "return true"-without-doing-anything implementation to avoid actually touching pages after checking for EFAULT. Would be interesting to check.