What are the alignment requirements for sys_brk?

名媛妹妹 2021-01-21 14:59

I'm using the sys_brk syscall to dynamically allocate memory on the heap. I noticed that when acquiring the current break location I usually get a value similar to this:



        
1 answer
  •  旧时难觅i
    2021-01-21 15:18

    The kernel does track the break with byte granularity. But don't use it directly for small allocations if you care at all about performance.


    There was some discussion in the comments about the kernel rounding the break to a page boundary, but that's not the case. The implementation of sys_brk uses this (with my comments added so it makes sense out of context):

    newbrk = PAGE_ALIGN(brk);     // the syscall arg
    oldbrk = PAGE_ALIGN(mm->brk); // the current break
    if (oldbrk == newbrk)
        goto set_brk;      // no need to map / unmap any pages, just update mm->brk
    

    This checks if the break moved to a different page, but eventually mm->brk = brk; sets the current break to the exact arg passed to the system call (if it's valid). If the current break was always page aligned, the kernel wouldn't need PAGE_ALIGN() on it.
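
    One way to observe the byte granularity from user space is to move the break by an odd number of bytes and read it back. This is a minimal sketch, assuming Linux with glibc and using the libc sbrk() wrapper instead of the raw system call; the helper name break_delta_after_move is hypothetical.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in glibc under strict -std= modes */
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: grow the break by n bytes (n > 0), report how far
 * it actually moved, then shrink it back. Returns -1 if sbrk() fails. */
long break_delta_after_move(intptr_t n)
{
    char *before = sbrk(0);                 /* current break */
    if (before == (char *)-1 || sbrk(n) == (void *)-1)
        return -1;
    char *after = sbrk(0);                  /* break after the move */
    sbrk(-n);                               /* restore the original break */
    return (long)((uintptr_t)after - (uintptr_t)before);
}
```

    If the kernel rounded the break to a page boundary, the returned delta would be 0 or a multiple of the page size; with byte granularity it should equal n. The sketch assumes nothing else moves the break between the calls (single-threaded, no allocation in between).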


    Of course, memory protection has at least page granularity (and maybe hugepage, if the kernel chooses to use anonymous hugepages for this mapping). So you can access memory out to the end of the page containing the break without faulting. This is why the kernel code is just checking if the break moved to a different page to skip the map / unmap logic, but still updates the actual brk.
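
    The arithmetic behind the page check, and the amount of accessible slack above the break, can be sketched in user space as follows. The function names are hypothetical stand-ins (page_align_up mirrors what PAGE_ALIGN() does), and the sketch assumes page_size is a power of two and ignores hugepages.

```c
#include <stdint.h>

/* Round addr up to the next multiple of page_size (a power of two).
 * User-space stand-in for the kernel's PAGE_ALIGN(). */
uintptr_t page_align_up(uintptr_t addr, uintptr_t page_size)
{
    return (addr + page_size - 1) & ~(page_size - 1);
}

/* Nonzero when moving the break from old_brk to new_brk changes the
 * page-aligned value, i.e. when pages would have to be mapped or unmapped. */
int break_change_needs_mapping(uintptr_t old_brk, uintptr_t new_brk,
                               uintptr_t page_size)
{
    return page_align_up(old_brk, page_size) != page_align_up(new_brk, page_size);
}

/* Bytes between the break and the end of the last page mapped for it. */
uintptr_t slack_after_break(uintptr_t brk, uintptr_t page_size)
{
    return page_align_up(brk, page_size) - brk;
}
```

    With 4096-byte pages, a break at 0x1001 leaves 4095 bytes that are mapped but above the break, while a break at exactly 0x2000 leaves none. The real page size can be obtained with sysconf(_SC_PAGESIZE).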

    AFAIK, nothing will ever use that mapped memory above the break as scratch space, so it's not like memory below the stack pointer that can be clobbered asynchronously.

    brk is just a simple memory-management system built into the kernel. System calls are expensive, so if you care about performance you should keep track of things in user space and only make a system call when you need a new page. Using sys_brk directly for tiny allocations is terrible for performance, especially on kernels with Meltdown + Spectre mitigations enabled. Those make system calls much more expensive: tens of thousands of clock cycles plus TLB and branch-prediction invalidation, instead of hundreds of clock cycles.
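
    Keeping track of things in user space could look like the following bump allocator, which only moves the break when its current chunk runs out. This is a sketch under stated assumptions, not a replacement for malloc: the names (bump_alloc, arena_grow, arena_sbrk_calls), the 16-byte alignment, and the 64 KiB chunk size are all illustrative choices, it uses the libc sbrk() wrapper, it never frees memory, and it is not thread-safe.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in glibc under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define ALIGNMENT  ((size_t)16)          /* assumed allocation alignment */
#define CHUNK_SIZE ((size_t)64 * 1024)   /* assumed minimum growth per sbrk() */

static char  *arena_cur;         /* next free byte in the current chunk */
static char  *arena_end;         /* break value after our last sbrk() */
static size_t arena_left;        /* free bytes starting at arena_cur */
unsigned long arena_sbrk_calls;  /* how many times the break was moved */

/* Grow the arena so that at least `need` aligned bytes are available. */
static int arena_grow(size_t need)
{
    size_t total = (need > CHUNK_SIZE ? need : CHUNK_SIZE) + ALIGNMENT;
    if (total > (size_t)INTPTR_MAX)
        return -1;
    char *old = sbrk((intptr_t)total);
    if (old == (char *)-1)
        return -1;
    arena_sbrk_calls++;
    if (old == arena_end) {
        /* New memory is contiguous with ours: just extend the arena. */
        arena_left += total;
    } else {
        /* First call, or someone else (e.g. malloc) moved the break:
         * start a fresh arena at the next aligned address. */
        size_t pad = (size_t)(-(uintptr_t)old & (ALIGNMENT - 1));
        arena_cur = old + pad;
        arena_left = total - pad;
    }
    arena_end = old + total;
    return 0;
}

/* Hand out `size` bytes, calling sbrk() only when the chunk is exhausted. */
void *bump_alloc(size_t size)
{
    if (size == 0 || size > (size_t)INTPTR_MAX / 2)
        return NULL;
    size = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);   /* round up */
    if (arena_left < size && arena_grow(size) != 0)
        return NULL;
    void *p = arena_cur;
    arena_cur += size;
    arena_left -= size;
    return p;
}
```

    With this layout, thousands of small allocations cost a pointer bump each, and a system call happens only about once per 64 KiB handed out.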
