Efficiently allocating many short-lived small objects

有刺的猬 2021-02-08 11:27

I've got a small class (16 bytes on a 32-bit system) which I need to dynamically allocate. In most cases the life-time of any given instance is very short. Some instances may also be passed across thread boundaries.

3 Answers
  • 2021-02-08 11:51

    You might make sure that you are using the low-fragmentation heap (LFH). It is on by default in Vista and later, but I do not think that is the case on earlier versions of Windows. That can make a big difference in allocation speed for small objects.
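
    If you have to support XP or Server 2003, you can opt a heap into the LFH yourself with HeapSetInformation. A minimal sketch, assuming the allocations in question come from the default process heap:

        #include <windows.h>
        #include <cstdio>

        int main() {
            // HeapCompatibilityInformation with the value 2 requests the
            // Low Fragmentation Heap for the given heap.
            ULONG lfh = 2;
            BOOL ok = HeapSetInformation(GetProcessHeap(), HeapCompatibilityInformation,
                                         &lfh, sizeof(lfh));
            std::printf("LFH %s\n", ok ? "enabled" : "not enabled");
            // The call can fail if the heap was created with HEAP_NO_SERIALIZE or if
            // the process is running under a debugger with the debug heap active.
            return 0;
        }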

  • 2021-02-08 11:58

    "Some instances may also be passed across thread boundaries"

    Only "some"? So perhaps you can afford to pay extra for these, if it makes the ones that don't get passed to other threads cheaper. There are various ways I can think of to get to one allocator per thread and avoid the need to lock when allocating or freeing in the thread to which the allocator belongs. I don't know which might be possible in your program:

    • Copy things across the thread boundary, instead of passing them.

    • Arrange that if they're passed to another thread for any reason, then they're passed back to the original thread to free. This doesn't necessarily have to happen very often: you could queue up a few in the receiving thread and pass them all back in a message later. This assumes, of course, that the thread which owns the allocator is going to stick around.

    • Have two free lists per allocator, one synchronised (to which objects are added when they're freed from another thread) and one unsynchronised. Only if the unsynchronised list is empty, and you're allocating (and hence in the thread which owns the allocator), do you need to lock the synchronised free list and move all of its current contents to the unsynchronised list. If objects are only rarely passed to other threads, this basically eliminates contention on the mutex and massively reduces the number of times it's taken at all (see the sketch after this list).

    • If all the above fails, having one allocator per thread might still allow you to get rid of the mutex and use a lock-free queue for the free list (multiple writers freeing, single reader allocating), which could improve performance a bit. Implementing a lock-free queue is platform-specific.
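
    Here is a minimal sketch of the two-free-list idea from the third bullet, assuming C++11 and the 16-byte cells from the question. The names ThreadCache, alloc, free_local and free_remote are made up for illustration, and a real version would also need a policy for returning memory to the system at thread exit:

        #include <cstddef>
        #include <cstdlib>
        #include <mutex>

        // One of these lives in each thread, e.g. as a thread_local.
        class ThreadCache {
            struct Node { Node* next; };          // a freed cell is reused as a list link
            Node* local_free_ = nullptr;          // touched only by the owning thread
            Node* remote_free_ = nullptr;         // cells freed from other threads
            std::mutex remote_mutex_;             // guards remote_free_ only
            static constexpr std::size_t kCellSize = 16;  // object size from the question

        public:
            void* alloc() {
                if (!local_free_) {               // local list empty: drain the remote list once
                    std::lock_guard<std::mutex> lock(remote_mutex_);
                    local_free_ = remote_free_;
                    remote_free_ = nullptr;
                }
                if (local_free_) {                // pop a recycled cell, no locking needed
                    Node* n = local_free_;
                    local_free_ = n->next;
                    return n;
                }
                return std::malloc(kCellSize);    // slow path: fresh memory from the system
            }

            // Free from the thread that owns this cache: no lock.
            void free_local(void* p) {
                Node* n = static_cast<Node*>(p);
                n->next = local_free_;
                local_free_ = n;
            }

            // Free from any other thread: push onto the synchronised list.
            void free_remote(void* p) {
                Node* n = static_cast<Node*>(p);
                std::lock_guard<std::mutex> lock(remote_mutex_);
                n->next = remote_free_;
                remote_free_ = n;
            }
        };

    The mutex is only touched on the rare cross-thread free and on the occasional drain, so the common allocate/free path takes no lock at all.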

    Taking a step further back, does your app frequently hit a state in which you know that all cells allocated after a certain point (perhaps a little in the past), are no longer in use? If so, and assuming the destructor of your small objects doesn't do anything terribly urgent, then don't bother freeing cells at all - at the "certain point" create a new allocator and mark the old one as no longer in use for new allocations. When you "hit the state", free the whole allocator and its underlying buffer. If the "certain point" and the "state" are simultaneous, all the easier - just reset your allocator.
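
    And a minimal sketch of the "reset everything at once" idea, assuming the small objects have trivial (or skippable) destructors. Arena, alloc and reset are illustrative names and the block size is arbitrary:

        #include <cstddef>
        #include <cstdlib>
        #include <vector>

        // Bump allocator: hand out cells from big blocks, never free them one by one,
        // and release (or recycle) everything at once when all objects are known dead.
        class Arena {
            std::vector<char*> blocks_;
            std::size_t used_ = 0;                         // bytes used in the newest block
            static constexpr std::size_t kBlockSize = 64 * 1024;

        public:
            ~Arena() { for (char* b : blocks_) std::free(b); }

            // Assumes n is small and a multiple of the required alignment
            // (true for the 16-byte cells in the question).
            void* alloc(std::size_t n) {
                if (blocks_.empty() || used_ + n > kBlockSize) {
                    blocks_.push_back(static_cast<char*>(std::malloc(kBlockSize)));
                    used_ = 0;
                }
                void* p = blocks_.back() + used_;
                used_ += n;
                return p;
            }

            // The "certain point": everything handed out so far is dead, so recycle the
            // blocks in one step instead of freeing objects individually.
            void reset() {
                while (blocks_.size() > 1) {               // keep one block for reuse
                    std::free(blocks_.back());
                    blocks_.pop_back();
                }
                used_ = 0;
            }
        };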

  • 2021-02-08 12:00

    Maybe try using Google's tcmalloc? It is optimized for fast allocation/deallocation in a threaded program, and has low overhead for small objects.
