Does minimizing the number of malloc() calls improve performance?

悲&欢浪女 2020-12-09 08:36

Consider two applications: one (num. 1) that invokes malloc() many times, and the other (num. 2) that invokes malloc() few times. Both applications allocate the same

8 answers
  • 2020-12-09 09:12

    A typical malloc implementation has to walk a linked list of free blocks to find one to allocate, which takes time. So #1 will usually be slower:

    • The more often you call malloc, the more time it will take - so reducing the number of calls will give you a speed improvement (though whether it is significant will depend on your exact circumstances).

    • In addition, if you malloc many small blocks, then as you free those blocks, you will fragment the heap much more than if you only allocate and free a few large blocks. So you are likely to end up with many small free blocks on your heap rather than a few big blocks, and therefore your mallocs may have to search further through the free-space lists to find a suitable block to allocate, which again will make them slower.
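To make the search cost concrete, here is a minimal sketch of a first-fit walk over a free list. The `free_block` struct and the `first_fit` function are hypothetical names invented for illustration; real allocators use more elaborate structures (size-class bins, trees, and so on), so treat this only as a model of the argument above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node: one entry per free block on the heap. */
typedef struct free_block {
    size_t size;               /* usable bytes in this free block */
    struct free_block *next;   /* next free block, or NULL at the end */
} free_block;

/*
 * First-fit search: return the first block with at least `want` bytes,
 * or NULL if none fits. `*steps` receives the number of nodes visited.
 */
free_block *first_fit(free_block *head, size_t want, size_t *steps)
{
    assert(steps != NULL);
    *steps = 0;
    for (free_block *b = head; b != NULL; b = b->next) {
        ++*steps;
        if (b->size >= want)
            return b;
    }
    return NULL;
}
```

With a list made of three 16-byte fragments followed by one 4096-byte block, a 1024-byte request visits four nodes; with only the large block on the list it visits one.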

  • 2020-12-09 09:13

    You don't define the relative difference between "many" and "few" but I suspect most mallocs would function almost identically in both scenarios. The question implies that each call to malloc has as much overhead as a system call and page table updates. When you do a malloc call, e.g. malloc(14), in a non-brain-dead environment, malloc will actually allocate more memory than you ask for, often a multiple of the system MMU page size. You get your 14 bytes and malloc keeps track of the newly allocated area so that later calls can just return a chunk of the already allocated memory, until more memory needs to be requested from the OS.

    In other words, if I call malloc(14) 100 times or malloc(1400) once, the overhead will be about the same. I'll just have to manage the bigger allocated memory chunk myself.

  • 2020-12-09 09:19

    Of course this completely depends on the malloc implementation, but in this case, with no calls to free, most malloc implementations will probably give you the same algorithmic speed.

    As another answer commented, usually there will be a list of free blocks, but if you have not called free, there will just be one, so it should be O(1) in both cases.

    This assumes that the memory allocated for the heap is big enough in both cases. In case #1 you will have allocated more total memory, because each allocation carries memory overhead for storing metadata. As a result, you may need to call sbrk() or an equivalent to grow the heap in case #1, which would add further overhead.
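The metadata overhead can be estimated with a bit of arithmetic. The sketch below assumes a fixed per-block header and a fixed alignment; the function names and the 16-byte figures used in the note afterwards are assumptions for illustration, not the layout of any particular malloc.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/*
 * Heap bytes consumed by `count` allocations of `size` bytes each,
 * assuming every block carries `header` bytes of bookkeeping and is
 * padded to a multiple of `align`.
 */
size_t heap_footprint(size_t count, size_t size, size_t header, size_t align)
{
    return count * round_up(size + header, align);
}
```

Under the assumed 16-byte header and 16-byte alignment, `heap_footprint(100, 14, 16, 16)` gives 3200 bytes, while `heap_footprint(1, 1400, 16, 16)` gives 1424 bytes for the same 1400 bytes of payload.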

    They will probably be different due to cache and other second order effects, since the memory alignments for the new allocation won't be the same.

    If you have been freeing some of the memory blocks, then it is likely that #2 will be faster due to less fragmentation, and so a smaller list of free blocks to search.

    If you have freed all the memory blocks, it should end up being exactly the same, since any sane free implementation will have coalesced the blocks back into a single arena of memory.

  • 2020-12-09 09:21

    You asked 2 questions:

    • for which application the next malloc() call will be faster, #1 or #2?
    • In other words: Does malloc() have an index of allocated locations in memory?

    You've implied that they are the same question, but they are not. The answer to the latter question is YES.

    As for which will be faster, it is impossible to say. It depends on the allocator algorithm, the machine state, the fragmentation in the current process, and so on.

    Your idea is sound, though: you should think about how malloc usage will affect performance. There was once an app I wrote that used lots of little blobs of memory, each allocated with malloc(). It worked correctly but was slow. I replaced the many calls to malloc with just one, and then sliced up that large block within my app. It was much much faster.

    I don't recommend this approach; it's just an illustration of the point that malloc usage can materially affect performance.
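For readers wondering what "one malloc, sliced up by the app" can look like, here is a minimal bump-arena sketch. It is not the code from the app described above; the `arena` type, the function names, and the 16-byte alignment are all assumptions made for illustration. Individual slices cannot be freed, which is one reason the approach is not a general replacement for malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ARENA_ALIGN = 16 };  /* assumed alignment for every slice */

/* Hypothetical bump arena: one malloc() up front, then pointer arithmetic. */
typedef struct {
    unsigned char *base;  /* start of the single malloc'd block */
    size_t cap;           /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} arena;

/* Returns 1 on success, 0 if the underlying malloc() failed. */
int arena_init(arena *a, size_t cap)
{
    assert(a != NULL);
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base != NULL;
}

/* Carve `size` bytes out of the arena, or return NULL when it is full. */
void *arena_alloc(arena *a, size_t size)
{
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (a->base == NULL || start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Release every slice at once with a single free(). */
void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

A caller would run `arena_init` once, call `arena_alloc` for each small object, and call `arena_destroy` when the whole batch is no longer needed.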

    My advice is to measure it.
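A rough way to measure is to time the allocation loop directly. The helper below is a sketch with a hypothetical name (`time_malloc_calls`); it uses the standard `clock()` function, which reports CPU time at coarse resolution, so run it with large counts and several repetitions before drawing conclusions.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Time `count` calls to malloc(size) and return the CPU seconds spent
 * in the allocation loop. The blocks are freed afterwards, outside the
 * timed region.
 */
double time_malloc_calls(size_t count, size_t size)
{
    assert(count > 0 && size > 0);
    void **ptrs = malloc(count * sizeof *ptrs);
    assert(ptrs != NULL);

    clock_t t0 = clock();
    for (size_t i = 0; i < count; i++) {
        ptrs[i] = malloc(size);
        if (ptrs[i] != NULL)
            *(volatile char *)ptrs[i] = 1;  /* touch the block so the call is kept */
    }
    clock_t t1 = clock();

    for (size_t i = 0; i < count; i++)
        free(ptrs[i]);
    free(ptrs);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Comparing, say, `time_malloc_calls(1000000, 14)` against `time_malloc_calls(1, 14000000)` on the target machine gives a first impression; the numbers depend entirely on the allocator and system in use.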

  • Allocating one block of memory is faster than allocating many blocks. Each call has its own overhead (including a system call whenever the allocator must ask the OS for more memory), plus the cost of searching for an available block. In programming, reducing the number of operations usually reduces execution time.

    Memory allocators may have to search to find a block of memory that is the correct size. This adds to the overhead of the execution time.

    However, there may be better chances of success when allocating small blocks of memory versus one large block. Is your program allocating one small block and releasing it, or does it need to allocate (and preserve) many small blocks? When memory becomes fragmented, there are fewer big chunks available, so the memory allocator may have to coalesce free blocks to form a block big enough for the allocation.

    If your program is allocating and destroying many small blocks of memory, you may want to consider allocating a static array and using that for your memory.
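One way to use a static array as suggested is a fixed-size block pool. The sketch below is an assumption about what such a pool could look like; the names (`pool_alloc`, `pool_release`) and the sizes (64 blocks of 32 bytes) are made up for illustration, and the pool is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>

enum { POOL_BLOCKS = 64, POOL_BLOCK_SIZE = 32 };  /* hypothetical sizes */

/* Each slot holds user data or, while free, a link to the next free slot. */
typedef union pool_slot {
    union pool_slot *next_free;
    unsigned char data[POOL_BLOCK_SIZE];
} pool_slot;

static pool_slot pool[POOL_BLOCKS];  /* the static array backing the pool */
static pool_slot *pool_free_head;    /* first free slot, or NULL when full */
static int pool_ready;               /* set once the free chain is built */

/* Chain every slot into the free list. */
static void pool_init(void)
{
    for (int i = 0; i < POOL_BLOCKS - 1; i++)
        pool[i].next_free = &pool[i + 1];
    pool[POOL_BLOCKS - 1].next_free = NULL;
    pool_free_head = &pool[0];
    pool_ready = 1;
}

/* Pop one block off the free list in O(1); NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    if (!pool_ready)
        pool_init();
    pool_slot *s = pool_free_head;
    if (s == NULL)
        return NULL;
    pool_free_head = s->next_free;
    return s->data;
}

/* Push a block back onto the free list in O(1). */
void pool_release(void *p)
{
    pool_slot *s = p;
    assert(s >= pool && s < pool + POOL_BLOCKS);
    s->next_free = pool_free_head;
    pool_free_head = s;
}
```

Because every block has the same size, allocation and release never search and never fragment the pool; the trade-off is a hard upper limit on the number of live blocks.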

  • 2020-12-09 09:24

    These are of course implementation details, but typically free() will insert the memory into a list of free blocks. malloc() will then look at this list for a free block that is the right size, or larger. Typically, only if this fails does malloc() ask the kernel for more memory.

    There are also other considerations, such as when to coalesce multiple adjacent blocks into a single, larger block.
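Coalescing can be sketched as follows. This is a simplified model, assuming the free regions are kept in an array sorted by offset; the `free_region` type and `coalesce` function are hypothetical, and real allocators usually merge neighbours through headers stored next to the blocks themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for a free region [offset, offset + size) in the heap. */
typedef struct {
    size_t offset;
    size_t size;
} free_region;

/*
 * Merge touching neighbours in an array of free regions sorted by offset.
 * Works in place and returns the new number of regions.
 */
size_t coalesce(free_region *r, size_t n)
{
    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;
    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (r[out].offset + r[out].size == r[i].offset)
            r[out].size += r[i].size;  /* adjacent: grow the current region */
        else
            r[++out] = r[i];           /* gap: start a new region */
    }
    return out + 1;
}
```

Given regions at offsets 0, 16, and 32 that touch each other plus two touching regions at 128 and 144, the call leaves two larger regions: one of 64 bytes at offset 0 and one of 32 bytes at offset 128.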

    Another reason that malloc() is expensive: if malloc() is called from multiple threads, there must be some kind of synchronization (i.e. locks) on these global structures. There are malloc() implementations with different optimization schemes that make it work better for multiple threads, but generally, keeping it thread-safe adds to the cost, as multiple threads will contend for those locks and block each other's progress.
