I'm writing a C++14 JSON library as an exercise and to use it in my personal projects.
By using callgrind I've discovered that the current bottleneck
Custom allocators can help because most malloc()/new implementations are designed for maximum flexibility, thread safety, and robustness. For instance, they must gracefully handle the case where one thread keeps allocating memory and passes the pointers to another thread that deallocates them. Cases like this are difficult to handle efficiently, and they drive up the cost of malloc() calls.
However, if you know that some things cannot happen in your application (like one thread deallocating stuff another thread allocated, etc.), you can optimize your allocator further than the standard implementation. This can yield significant speedups, especially when you don't need thread safety.
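To make this concrete, here is a minimal sketch of a single-threaded bump ("arena") allocator in C++14. The class name `Arena` and its interface are hypothetical and not taken from any library. It assumes that all objects allocated from it can be released together and that only one thread ever touches it, which is exactly the kind of knowledge a general-purpose malloc() cannot rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical single-threaded bump allocator (illustration only).
// Assumptions: no thread safety, no per-object deallocation.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), offset_(0) {}

    // Returns `size` bytes aligned to `alignment` (must be a power of two),
    // or nullptr if the arena does not have enough room left.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.get());
        const std::uintptr_t current = base + offset_;
        // Round the current address up to the next multiple of `alignment`.
        const std::uintptr_t aligned =
            (current + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t remaining = capacity_ - offset_;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;  // out of space
        }
        offset_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Releases every allocation at once by rewinding the offset.
    void reset() { offset_ = 0; }

    // Number of bytes consumed so far, including alignment padding.
    std::size_t used() const { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_;
};
```

An allocation here is a few arithmetic operations and one bounds check, with no locks and no free-list bookkeeping. The price is that memory only comes back when `reset()` is called or the arena is destroyed, and that objects with non-trivial destructors must be destroyed by the caller first. For a JSON parser this could fit a document whose nodes all die together, but that is an assumption about the use case.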
Also, the standard implementation is not necessarily well optimized: implementing void* operator new(size_t size) and void operator delete(void* pointer) by simply calling through to malloc() and free() gives an average performance gain of 100 CPU cycles on my machine, which shows that the default implementation there is suboptimal.
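A minimal sketch of such pass-through replacements might look like the following. It is simplified: a fully conforming operator new would also call the installed new-handler in a loop before throwing, and this sketch assumes the default alignment of malloc() is sufficient. Whether it is actually faster depends on the compiler and standard library, so it is worth measuring before adopting it.

```cpp
#include <cstdlib>
#include <new>

// Replacement global allocation function that forwards straight to malloc().
// Simplification: the new-handler is not consulted on failure.
void* operator new(std::size_t size) {
    if (size == 0) {
        size = 1;  // malloc(0) may return nullptr, but operator new must not
    }
    if (void* pointer = std::malloc(size)) {
        return pointer;
    }
    throw std::bad_alloc();
}

// Matching replacement deallocation function; free(nullptr) is a no-op.
void operator delete(void* pointer) noexcept {
    std::free(pointer);
}
```

Defining these two functions at global scope in exactly one translation unit is enough to replace them program-wide. The array forms and the C++14 sized delete forward to these versions by default, so `new[]` and `delete[]` pick up the replacement as well.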