What does the tbb::scalable_allocator in Intel Threading Building Blocks actually do under the hood?
It can certainly be effective. I've just used it
There is a good paper on the allocator: "The Foundations for Scalable Multi-core Software in Intel Threading Building Blocks".
My limited experience: I replaced the global new/delete with the tbb::scalable_allocator in my AI application, but there was little change in the time profile. I didn't compare memory usage, though.
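For anyone who wants to try the same experiment, here is a minimal sketch of replacing the global new/delete. It is not the code from my application. The `backend` namespace, `g_new_calls`, and `replacement_is_active` are hypothetical names made up for illustration, and std::malloc/std::free stand in for the backend so that the snippet builds without TBB. With TBB installed, the two backend functions would call scalable_malloc and scalable_free from tbb/scalable_allocator.h instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical backend hooks. std::malloc/std::free are stand-ins; with TBB
// these would forward to scalable_malloc / scalable_free.
namespace backend {
inline void* allocate(std::size_t n) { return std::malloc(n); }
inline void deallocate(void* p) { std::free(p); }
}  // namespace backend

// Illustration only: counts calls so we can check the replacement is in use.
std::atomic<std::size_t> g_new_calls{0};

// Replaces the global allocation function. The default array, nothrow and
// sized forms forward to these two, so they are covered as well. The C++17
// aligned overloads are NOT covered by this sketch.
void* operator new(std::size_t n) {
    if (n == 0) n = 1;  // operator new must return a unique non-null pointer
    if (void* p = backend::allocate(n)) {
        ++g_new_calls;
        return p;
    }
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { backend::deallocate(p); }

// Returns true if a direct call to ::operator new reached the replacement.
bool replacement_is_active() {
    const std::size_t before = g_new_calls.load();
    void* p = ::operator new(16);
    assert(p != nullptr);
    const bool hit = g_new_calls.load() > before;
    ::operator delete(p);
    return hit;
}
```

Whether this helps depends on the allocation pattern. If the program is single-threaded or allocates rarely, the time profile may barely move, which matches what I saw.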
The solution you mentioned is optimized for Intel CPUs. It incorporates specific CPU mechanisms to improve performance.
Some time ago I found another very useful solution: Fast C++11 allocator for STL containers. It speeds up STL containers on VS2017 (~5x) as well as on GCC (~7x). It uses a memory pool for element allocation, which makes it effective on all platforms.
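To show the general idea of a memory pool behind an STL allocator, here is a minimal sketch. It is not the implementation of the allocator mentioned above. `Pool`, `PoolAllocator`, and `sum_with_pool` are hypothetical names, and the sketch assumes single-threaded use, types that are not over-aligned, and containers that are destroyed before program exit.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Fixed-size slot pool: carves big chunks into slots and recycles freed
// slots through an intrusive free list, so most allocations are a pointer pop.
template <typename T, std::size_t SlotsPerChunk = 1024>
class Pool {
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while it holds a T
    };
    std::vector<Slot*> chunks_;
    Slot* free_ = nullptr;

    void grow() {
        Slot* chunk = static_cast<Slot*>(::operator new(sizeof(Slot) * SlotsPerChunk));
        chunks_.push_back(chunk);
        for (std::size_t i = 0; i < SlotsPerChunk; ++i) {
            chunk[i].next = free_;  // thread every new slot onto the free list
            free_ = &chunk[i];
        }
    }

public:
    Pool() = default;
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;
    ~Pool() { for (Slot* c : chunks_) ::operator delete(c); }

    void* take() {
        if (!free_) grow();
        assert(free_ != nullptr);
        Slot* s = free_;
        free_ = s->next;
        return s;
    }
    void give(void* p) noexcept {
        Slot* s = static_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }
};

// Minimal C++11 allocator. Single-object requests (list and map nodes) go to
// a per-type pool; array requests fall back to ::operator new.
template <typename T>
struct PoolAllocator {
    using value_type = T;
    PoolAllocator() noexcept = default;
    template <typename U>
    PoolAllocator(const PoolAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n == 1) return static_cast<T*>(pool().take());
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        if (n == 1) pool().give(p);
        else ::operator delete(p);
    }
    static Pool<T>& pool() {
        static Pool<T> instance;  // shared by all allocators of this type
        return instance;
    }
};
template <typename T, typename U>
bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) noexcept { return false; }

// Usage: a std::list whose nodes come from the pool. Returns 1 + 2 + ... + n.
long sum_with_pool(int n) {
    std::list<int, PoolAllocator<int>> values;
    for (int i = 1; i <= n; ++i) values.push_back(i);
    long total = 0;
    for (int v : values) total += v;
    return total;
}
```

The gain from this kind of design comes mostly from node-based containers (list, map, set), where every element is a separate allocation. A vector asks for arrays, so in this sketch it simply falls through to the regular heap.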