Performance problems in parallel mergesort in C++
Question: I have tried to write a parallel implementation of mergesort using threads and templates. The relevant code is listed below. I compared its performance with std::sort from the C++ STL. With no threads spawned, my code is 6 times slower than std::sort. By tuning the variable maxthreads (and/or FACTOR), I was only able to double the performance, so in the best case I am still 3 times slower than std::sort. I ran the code on a 16-core multiprocessor machine. htop shows that the cores