mmap problem, allocates huge amounts of memory

野趣味 2020-12-23 18:17

I have some huge files I need to parse, and people have been recommending mmap because it should avoid having to read the entire file into memory.

But looking at top, the process appears to be allocating huge amounts of memory.

8 answers
  • 2020-12-23 18:51

    top has many memory-related columns. Most of them are based on the size of the memory space mapped into the process, including any shared libraries, swapped-out RAM, and mmap'ped space.

    Check the RES column; it reflects the physical RAM currently in use. I think (but am not sure) it also includes the RAM used to 'cache' the mmap'ped file, as the sketch below illustrates.
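
    As a rough illustration (a minimal sketch, assuming Linux and a hypothetical file named "huge.dat"), the program below prints VmRSS from /proc/self/status before and after touching every page of an mmap'ped file; RES grows only as the pages are actually faulted in, not when mmap() is called:

        /* Sketch only: assumes Linux and a hypothetical file named "huge.dat".
         * Shows that RES (VmRSS) grows as mmap'ped pages are actually touched,
         * not at the moment mmap() is called. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>

        static void print_vmrss(const char *label)
        {
            FILE *f = fopen("/proc/self/status", "r");
            char line[256];
            while (f && fgets(line, sizeof line, f))
                if (strncmp(line, "VmRSS:", 6) == 0)
                    printf("%s %s", label, line);   /* line already ends in '\n' */
            if (f)
                fclose(f);
        }

        int main(void)
        {
            int fd = open("huge.dat", O_RDONLY);            /* hypothetical file */
            struct stat st;
            if (fd < 0 || fstat(fd, &st) < 0)
                return 1;

            char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED)
                return 1;

            print_vmrss("after mmap: ");                    /* RES barely changes */

            volatile char sum = 0;
            for (off_t i = 0; i < st.st_size; i += 4096)    /* touch every page */
                sum += p[i];

            print_vmrss("after touch:");                    /* RES now includes the file's pages */

            munmap(p, st.st_size);
            close(fd);
            return 0;
        }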

  • 2020-12-23 18:55

    You may have been offered the wrong advice.

    Memory-mapped files (mmap) will use more and more memory as you parse through them. When physical memory becomes low, the kernel will unmap sections of the file from physical memory based on its LRU (least recently used) algorithm. But the LRU is global: it may also force other processes to swap pages out to disk and shrink the disk cache. This can have a severely negative effect on the performance of other processes and of the system as a whole.

    If you are linearly reading through files, like counting the number of lines, mmap is a bad choice, as it will fill physical memory before releasing memory back to the system. It would be better to use traditional I/O methods which stream, or read a block at a time (see the sketch after this paragraph). That way memory can be released immediately afterwards.
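
    A minimal sketch of the block-at-a-time approach (again assuming a hypothetical file named "huge.dat"), counting lines with plain read() so that only one small buffer is ever held by the process:

        /* Sketch only: counts lines in a hypothetical "huge.dat" by reading
         * fixed-size blocks, so only one buffer's worth of memory is in use
         * at any time and file pages are not kept resident by the process. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("huge.dat", O_RDONLY);    /* hypothetical file */
            if (fd < 0)
                return 1;

            char buf[1 << 16];                      /* 64 KiB block */
            long lines = 0;
            ssize_t n;
            while ((n = read(fd, buf, sizeof buf)) > 0)
                for (ssize_t i = 0; i < n; i++)
                    if (buf[i] == '\n')
                        lines++;

            printf("%ld lines\n", lines);
            close(fd);
            return 0;
        }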

    If you are randomly accessing a file, mmap is an okay choice. It is not optimal, since you would still be relying on the kernel's general-purpose LRU algorithm, but it is faster than writing your own caching mechanism. A sketch follows.
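
    A minimal sketch of random access through mmap (hypothetical file name and record layout; assumes posix_madvise is available), using POSIX_MADV_RANDOM to hint that the kernel should not bother reading ahead:

        /* Sketch only: random access through an mmap'ped file. The record size
         * and offsets are made up for illustration; POSIX_MADV_RANDOM hints
         * that the access pattern is not sequential, so read-ahead is wasted. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>

        #define REC_SIZE 128    /* hypothetical fixed record size */

        int main(void)
        {
            int fd = open("huge.dat", O_RDONLY);    /* hypothetical file */
            struct stat st;
            if (fd < 0 || fstat(fd, &st) < 0)
                return 1;

            char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (base == MAP_FAILED)
                return 1;
            posix_madvise(base, st.st_size, POSIX_MADV_RANDOM);

            /* Jump straight to arbitrary records without reading the rest. */
            off_t wanted[] = { 7, 1000000, 42 };
            for (size_t i = 0; i < sizeof wanted / sizeof wanted[0]; i++) {
                off_t off = wanted[i] * REC_SIZE;
                if (off + REC_SIZE <= st.st_size)
                    printf("record %lld starts with byte 0x%02x\n",
                           (long long)wanted[i], (unsigned char)base[off]);
            }

            munmap(base, st.st_size);
            close(fd);
            return 0;
        }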

    In general, I would never recommend anyone use mmap, except for some extreme performance edge cases, like accessing the file from multiple processes or threads at the same time (sketched below), or when the file is small relative to the amount of free available memory.
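
    For the multi-process case, a minimal sketch (hypothetical file name): a parent and child scan different halves of the same mapping, and the physical pages come from the shared page cache rather than being duplicated in each process:

        /* Sketch only: two processes reading the same file through one mapping.
         * The physical pages come from the shared page cache, so the file data
         * is held in RAM once, not once per process. File name is hypothetical. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("huge.dat", O_RDONLY);    /* hypothetical file */
            struct stat st;
            if (fd < 0 || fstat(fd, &st) < 0)
                return 1;

            const char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED)
                return 1;

            pid_t pid = fork();
            if (pid == 0) {                         /* child: scan the second half */
                long hits = 0;
                for (off_t i = st.st_size / 2; i < st.st_size; i++)
                    if (p[i] == '\n')
                        hits++;
                printf("child:  %ld newlines\n", hits);
                _exit(0);
            }

            long hits = 0;                          /* parent: scan the first half */
            for (off_t i = 0; i < st.st_size / 2; i++)
                if (p[i] == '\n')
                    hits++;
            printf("parent: %ld newlines\n", hits);
            wait(NULL);
            return 0;
        }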
