How does mmap work?

南笙
南笙 2020-12-28 19:13

I am working on a Linux program that needs to mmap a file from the hard drive, and I have a question: what can make it fail? For example, if all the memory is fragmented, which has only

3 answers
  • 2020-12-28 19:37

    mmap() uses addresses outside your program's heap area, so heap fragmentation isn't a problem, except to the extent that it can make the heap take up more space and reduce the space available for mappings.

    If you have lots of mapped files, you could potentially run into problems with fragmentation on a 32-bit system where the address space is relatively constrained. On a 64-bit system, fragmentation is unlikely to be a problem because even if you have only small regions available between existing mappings, there's still lots and lots of available contiguous address space, adjacent to the existing mappings.

    The more common problem on a 32-bit system is that the address space is just too small to map large files at all. Of the 4GB address space, typically only 2GB to 3GB is available to userspace (3GB with the default split on 32-bit x86 Linux), with the rest reserved by the kernel. Of that available space, your mappings have to share room with the program's code and stacks (typically small) and heap (potentially large).

    In short, mmap() can often fail on 32-bit systems if the file is too large, but you're unlikely to ever have a file large enough to cause that problem on a 64-bit system.

    If you're creating a private, writable (copy-on-write) mapping, it can also fail due to lack of memory and swap space. When the kernel is not allowed to overcommit memory (on Linux this is governed by the vm.overcommit_memory setting), it has to ensure that the sum of available RAM and swap is large enough to hold the size of your mapping, in case you modify all the pages so that the kernel is forced to make private copies of them all. A shared mapping shouldn't have this problem, since changes can be flushed to the file on disk, and then the pages can be discarded if memory is scarce and reloaded from disk later.
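
    To make the private versus shared distinction concrete, here is a minimal sketch in C. The helper name file_byte_after_mapped_write and the temp-file path are made up for illustration. It writes one byte through a mapping and then reads the file back: with MAP_PRIVATE the write should land in a private copy and leave the file untouched, while with MAP_SHARED it should reach the file.

    ```c
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Hypothetical helper: write 'X' through a mapping created with `flags`
     * (MAP_PRIVATE or MAP_SHARED), then read the first byte back from the
     * file itself. Returns that byte, or -1 on error. */
    static int file_byte_after_mapped_write(int flags)
    {
        char path[] = "/tmp/mmap-cow-XXXXXX";   /* assumed scratch location */
        int fd = mkstemp(path);
        if (fd < 0)
            return -1;
        unlink(path);                           /* file disappears when fd is closed */
        if (write(fd, "a", 1) != 1) {
            close(fd);
            return -1;
        }

        char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        p[0] = 'X';             /* private: triggers a copy; shared: dirties the file page */
        msync(p, 1, MS_SYNC);   /* pushes shared changes to the file; a private copy is never written back */
        munmap(p, 1);

        char c;
        ssize_t n = pread(fd, &c, 1, 0);
        close(fd);
        return n == 1 ? c : -1;
    }

    int main(void)
    {
        printf("file byte after MAP_PRIVATE write: %c\n",
               file_byte_after_mapped_write(MAP_PRIVATE));
        printf("file byte after MAP_SHARED write:  %c\n",
               file_byte_after_mapped_write(MAP_SHARED));
        return 0;
    }
    ```

    The private copy is what the kernel may have to back with RAM or swap, which is why only that kind of mapping is exposed to the swap-space failure described above.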

    Of course, a mapping can also fail if you don't have permission to access the file, or if it's not a type of file that can be mapped (such as a directory or a socket).
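
    Whatever the cause, mmap() reports failure by returning MAP_FAILED (not NULL) and setting errno. A minimal sketch of checking for that, assuming a made-up helper named map_file_readonly; by default it tries to map the directory "/" so that the failure path is exercised:

    ```c
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Hypothetical helper: map a whole file read-only.
     * Returns the mapping, or NULL with errno set on failure. */
    static void *map_file_readonly(const char *path, size_t *len_out)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return NULL;                    /* e.g. no permission, no such file */

        struct stat st;
        if (fstat(fd, &st) < 0) {
            int saved = errno;
            close(fd);
            errno = saved;
            return NULL;
        }
        if (st.st_size == 0) {              /* zero-length mappings are rejected */
            close(fd);
            errno = EINVAL;
            return NULL;
        }

        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        int saved = errno;
        close(fd);                          /* the mapping stays valid after close */
        if (p == MAP_FAILED) {
            errno = saved;
            return NULL;
        }
        *len_out = (size_t)st.st_size;
        return p;
    }

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "/";   /* a directory by default */
        size_t len = 0;
        void *p = map_file_readonly(path, &len);
        if (p == NULL) {
            printf("mapping %s failed: %s\n", path, strerror(errno));
            return 0;
        }
        printf("mapped %zu bytes of %s\n", len, path);
        munmap(p, len);
        return 0;
    }
    ```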

    It's not clear what you mean about recollecting memory. Remember that the scarce resource that mmap() consumes isn't memory, it's address space. You can potentially map a 1GB file even if the machine actually only has 128MB of RAM, but on a 32-bit system you can't map a 4GB file even if the machine has 16GB of RAM.
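
    The address-space point can be tried out with a sparse file. The sketch below is only an illustration: the helper name map_sparse_file and the /tmp path are assumptions, and it relies on the filesystem supporting sparse files. It maps a 1GB file that occupies almost no disk space and touches a single page, so very little RAM should be needed. On a 32-bit system the same call may fail for lack of contiguous address space.

    ```c
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Hypothetical helper: create a sparse temp file of `len` bytes, map all
     * of it, and read only its last byte. Returns 0 on success, -1 on failure. */
    static int map_sparse_file(size_t len)
    {
        char path[] = "/tmp/mmap-sparse-XXXXXX";   /* assumed scratch location */
        int fd = mkstemp(path);
        if (fd < 0)
            return -1;
        unlink(path);
        if (ftruncate(fd, (off_t)len) < 0) {       /* sets the size without writing data */
            close(fd);
            return -1;
        }

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);
        if (p == MAP_FAILED)
            return -1;                             /* e.g. not enough address space */

        unsigned char last = p[len - 1];           /* touches one page only */
        munmap(p, len);
        return last == 0 ? 0 : -1;                 /* holes in a sparse file read as zero */
    }

    int main(void)
    {
        size_t len = (size_t)1 << 30;              /* 1GB of address space */
        if (map_sparse_file(len) == 0)
            printf("mapped %zu bytes and touched a single page\n", len);
        else
            perror("map_sparse_file");
        return 0;
    }
    ```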

    The concept of virtual memory is essential to understanding what mmap() does, so read about that if you're not familiar with it already.

  • 2020-12-28 19:37

    mmap works by manipulating your process's page table, a data structure your CPU uses to map address spaces. The CPU will translate "virtual" addresses to "physical" ones, and does so according to the page table set up by your kernel.

    When you access the mapped memory for the first time, your CPU generates a page fault. The OS kernel can then jump in, "fix up" the invalid memory access by allocating memory and doing file I/O in that newly allocated buffer, then continue your program's execution as if nothing happened.
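
    One rough way to observe this lazy behaviour is to count the process's page faults around the first access. The sketch below makes some assumptions: it uses an anonymous mapping rather than a file so that it needs no input, the helper name touch_and_count_faults is invented, and the exact count depends on the kernel and its configuration.

    ```c
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>
    #include <unistd.h>

    /* Hypothetical helper: map `npages` fresh pages, touch each one once, and
     * return how many minor page faults the process took while doing so
     * (or -1 if the mapping could not be created). */
    static long touch_and_count_faults(size_t npages)
    {
        size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
        size_t len = npages * pagesz;

        /* mmap() only sets up the address range; no memory is populated yet. */
        unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;

        struct rusage before, after;
        getrusage(RUSAGE_SELF, &before);
        for (size_t i = 0; i < npages; i++)
            p[i * pagesz] = 1;      /* first touch of each page faults it in */
        getrusage(RUSAGE_SELF, &after);

        munmap(p, len);
        return after.ru_minflt - before.ru_minflt;
    }

    int main(void)
    {
        long faults = touch_and_count_faults(64);
        printf("minor faults while touching 64 pages: %ld\n", faults);
        return 0;
    }
    ```

    On a typical Linux system the reported number should be close to the number of pages touched, since each first access is fixed up by the kernel separately.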

    mmap can fail if your process is out of address space. This is something to watch out for in 32-bit code, where large data sets can use up all of the usable address space fairly quickly. It can also fail for any of the reasons listed in the "Errors" section of the man page.

    Accessing memory inside a mapped region can also fail if the kernel has issues allocating memory or doing I/O. In that case your process will get a SIGBUS signal.
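
    A real I/O error is hard to provoke on demand, so the sketch below uses a different, reproducible trigger for the same signal: reading a mapped page that lies entirely past the end of the file. The helper name access_past_eof_raises_sigbus and the /tmp path are assumptions made for the example.

    ```c
    #define _DEFAULT_SOURCE
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static sigjmp_buf jump_env;

    static void on_sigbus(int sig)
    {
        (void)sig;
        siglongjmp(jump_env, 1);    /* abandon the faulting access */
    }

    /* Hypothetical helper: map two pages of a one-page file and read the
     * second page. Returns 1 if SIGBUS was delivered, 0 if the read
     * succeeded, -1 on setup failure. */
    static int access_past_eof_raises_sigbus(void)
    {
        long pagesz = sysconf(_SC_PAGESIZE);
        char path[] = "/tmp/mmap-sigbus-XXXXXX";   /* assumed scratch location */
        int fd = mkstemp(path);
        if (fd < 0)
            return -1;
        unlink(path);
        if (ftruncate(fd, pagesz) < 0) {           /* file is exactly one page long */
            close(fd);
            return -1;
        }

        size_t len = 2 * (size_t)pagesz;           /* mapping is two pages long */
        volatile unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);
        if (p == MAP_FAILED)
            return -1;

        struct sigaction sa, old;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigbus;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGBUS, &sa, &old);

        int result;
        if (sigsetjmp(jump_env, 1) == 0) {
            unsigned char c = p[pagesz];           /* no file data behind this page */
            (void)c;
            result = 0;
        } else {
            result = 1;                            /* arrived here from the handler */
        }

        sigaction(SIGBUS, &old, NULL);             /* restore the previous handler */
        munmap((void *)p, len);
        return result;
    }

    int main(void)
    {
        printf("SIGBUS on access past end of file: %d\n",
               access_past_eof_raises_sigbus());
        return 0;
    }
    ```

    Without a handler the default action for SIGBUS terminates the process, so code that maps files which may shrink or sit on unreliable storage has to plan for it.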

  • 2020-12-28 19:37

    The short answer is: it depends.

    Depending on the amount of memory you have, the environment you're working in, or the way the mapping is accessed, there are a number of ways mmap can fail. Read the manual page for mmap for more details.
