Both should run in O(n log n), but in general sort is faster than stable_sort. How big is the performance gap in practice? Do you have some experience about that?
I want
How big is the performance gap in practice? Do you have some experience about that?
Yes, but it didn't go the way you would expect.
I took a C implementation of the Burrows-Wheeler Transform and C++-ified it. Turned out a lot slower than the C code (though the code was cleaner). So I put timing instrumentation in there and it appeared that the qsort was performing faster than the std::sort. This was running in VC6. It was then recompiled with stable_sort and the tests ran faster than the C version. Other optimisations managed to push the C++ version ~25% quicker than the C version. I think it was possible to eke more speed but the clarity of the code was disappearing.