Typical implementations of malloc use brk/sbrk as the primary means of claiming memory from the OS. However, they also use mmap.
Calling mmap(2) once per memory allocation is not a viable approach for a general-purpose memory allocator, because the allocation granularity (the smallest individual unit which may be allocated at a time) for mmap(2) is PAGESIZE (usually 4096 bytes), and because it requires a slow and complicated syscall. The allocator fast path for small allocations with low fragmentation should require no syscalls.
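To make the granularity point concrete, here is a minimal sketch of what "one mmap(2) per allocation" would look like. The helper names round_up_to_page, page_alloc and page_free are hypothetical, and the sketch assumes a POSIX system that provides MAP_ANONYMOUS (Linux, the BSDs). It is an illustration only, not how any real allocator works.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: round a request up to a whole number of pages,
   which is the amount of address space a mapping actually occupies. */
static size_t round_up_to_page(size_t nbytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (nbytes + page - 1) / page * page;
}

/* Hypothetical helper: serve one allocation with one anonymous mapping.
   Every call is a syscall; returns NULL on failure (including nbytes == 0). */
static void *page_alloc(size_t nbytes)
{
    void *p = mmap(NULL, round_up_to_page(nbytes), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a block obtained from page_alloc().
   The caller must pass the same size that was requested. */
static void page_free(void *p, size_t nbytes)
{
    munmap(p, round_up_to_page(nbytes));
}
```

With a 4096-byte page, page_alloc(16) occupies a full page for 16 bytes of payload and costs two syscalls over its lifetime, which is exactly the overhead a small-allocation fast path needs to avoid.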
So regardless of which strategy you use, you still need to support several of what glibc calls memory arenas. As the GNU manual puts it: "The presence of multiple arenas allows multiple threads to allocate memory simultaneously in separate arenas, thus improving performance."
The jemalloc manpage (http://jemalloc.net/jemalloc.3.html) has this to say:
Traditionally, allocators have used sbrk(2) to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If sbrk(2) is supported by the operating system, this allocator uses both mmap(2) and sbrk(2), in that order of preference; otherwise only mmap(2) is used.
I don't see how any of these apply to the modern use of sbrk(2), as I understand it. Race conditions are handled by threading primitives. Fragmentation is handled just as would be done with memory arenas allocated by mmap(2). The maximum usable memory is irrelevant, because mmap(2) should be used for any large allocation to reduce fragmentation and to release memory back to the operating system immediately on free(3).
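As a rough sketch of the large-allocation path described above, the following gives each large block its own anonymous mapping and unmaps it on free, so the memory goes back to the operating system right away. The names large_hdr, large_alloc and large_free are hypothetical, and a real allocator would only take this path above some size threshold. MAP_ANONYMOUS is assumed to be available.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical header placed at the start of each large mapping so that
   large_free() knows how many bytes to unmap. The max_align_t member
   pads the header so the payload after it stays suitably aligned. */
typedef union {
    size_t map_len;
    max_align_t align_;
} large_hdr;

/* Hypothetical helper: give one large allocation its own mapping. */
static void *large_alloc(size_t nbytes)
{
    if (nbytes > SIZE_MAX - sizeof(large_hdr))
        return NULL;                       /* size computation would overflow */
    size_t len = nbytes + sizeof(large_hdr);
    large_hdr *h = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (h == MAP_FAILED)
        return NULL;
    h->map_len = len;
    return h + 1;                          /* payload starts after the header */
}

/* Hypothetical helper: unmap the whole block, returning its pages to the
   OS immediately. Returns 0 on success, -1 on failure, like munmap(2). */
static int large_free(void *p)
{
    large_hdr *h = (large_hdr *)p - 1;
    return munmap(h, h->map_len);
}
```

Because each block is an independent mapping, freeing it leaves no hole inside the sbrk-managed heap, which is the fragmentation benefit mentioned above.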
The use of both the application heap (claimed with sbrk) and mmap introduces some additional complexity that might be unnecessary:
Allocated Arena - the main arena uses the application's heap. Other arenas use mmap'd heaps. To map a chunk to a heap, you need to know which case applies. If this bit is 0, the chunk comes from the main arena and the main heap. If this bit is 1, the chunk comes from mmap'd memory and the location of the heap can be computed from the chunk's address.
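The bookkeeping in that quote can be sketched roughly as follows. This is an illustration under assumptions, not glibc's actual code: the struct layout, the names, the flag value and the 64 MiB heap alignment are all hypothetical choices. The only idea taken from the quote is that one flag bit in the chunk selects the case, and that an mmap'd heap can be found by masking the chunk's address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define FLAG_NON_MAIN_ARENA ((size_t)0x4)      /* the "allocated arena" bit   */
#define FLAG_MASK           ((size_t)0x7)      /* low bits reserved for flags */
#define HEAP_ALIGN          ((uintptr_t)1 << 26) /* mmap'd heaps aligned to 64 MiB */

/* Hypothetical chunk header: size and flag bits share one word, which
   works because chunk sizes are multiples of 8. */
struct chunk {
    size_t size_and_flags;
};

/* Nonzero if the chunk lives in an mmap'd (non-main) heap. */
static int chunk_is_non_main(const struct chunk *c)
{
    return (c->size_and_flags & FLAG_NON_MAIN_ARENA) != 0;
}

/* Chunk size with the flag bits stripped. */
static size_t chunk_size(const struct chunk *c)
{
    return c->size_and_flags & ~FLAG_MASK;
}

/* For a non-main chunk, the owning heap starts at the chunk address
   rounded down to the heap alignment. */
static uintptr_t heap_base_for(uintptr_t chunk_addr)
{
    return chunk_addr & ~(HEAP_ALIGN - 1);
}
```

The main arena needs the separate case because the sbrk-grown heap has no such alignment guarantee, so its location cannot be derived from a chunk address by masking.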
So the question now is, if we're already using mmap(2), why not just allocate an arena at process start with mmap(2) instead of using sbrk(2)? Especially so if, as quoted, it is necessary to track which allocation type was used. There are several reasons:
- mmap(2) may not be supported.
- sbrk(2) is already initialized for a process, whereas mmap(2) would introduce additional requirements.
- A mapping created with mmap(2) cannot be extended as easily. Linux has mremap(2), but its use limits the allocator to kernels which support it. Premapping many pages with PROT_NONE access uses too much virtual memory. Using MAP_FIXED silently unmaps any mapping which may have been there before. sbrk(2) has none of these problems, and is explicitly designed to allow for extending its memory safely.
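To show what "extending safely" means in practice, here is a minimal sketch of growing the heap with sbrk(2). The helper name heap_grow is hypothetical. The sketch assumes a system where sbrk is available (it is declared in unistd.h on glibc under _DEFAULT_SOURCE) and that nothing else in the process moves the program break between the two calls.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by nbytes.
   sbrk() returns the previous program break, which is the start of the
   newly added region, or (void *)-1 on failure. */
static void *heap_grow(intptr_t nbytes)
{
    void *old_break = sbrk(nbytes);
    return old_break == (void *)-1 ? NULL : old_break;
}
```

Two consecutive calls such as heap_grow(4096) followed by heap_grow(4096) yield adjacent regions, so the arena grows in place and no existing mapping can be clobbered. Getting the same effect from mmap(2) requires mremap(2), a large up-front reservation, or MAP_FIXED with its overwrite hazard.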