Update: I now think that the root of my problem is not "threading", because I observe the slowdown at every point of my program. I think somehow when using 2 processors my
Just because you have a system that can handle many more threads does not mean that all of them can actually run in parallel.
When I upgraded from a quad-core CPU to an i7 (8 virtual cores), I noticed that a setup using more threads than cores resulted in the threads blocking each other for some time, which led to an overall slowdown of the system.
The problem was that my algorithms were already capable of using the full processing time of the core their thread was running on, while the waiting threads only got about 5 to 10%. As a result, the main threads finished, but a few single threads still had to do all of their work afterwards, taking the same amount of time again.
The thread pool will only continue once all workers have finished, so the time spent waiting for the last ones is unused processor time for the other threads.
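To put rough numbers on this effect, here is a small back-of-the-envelope model in Python. It is only a sketch under simplifying assumptions of my own: all tasks are CPU-bound and take the same time, and surplus threads effectively wait until a core is free. The function names `makespan` and `utilization` are made up for illustration.

```python
import math


def makespan(num_threads, num_cores, task_seconds):
    """Estimated wall-clock time until the last worker finishes.

    Simplifying assumption: equal CPU-bound tasks run in "waves" of at
    most num_cores threads, and surplus threads wait for the next wave.
    """
    if num_threads <= 0 or num_cores <= 0:
        raise ValueError("num_threads and num_cores must be positive")
    waves = math.ceil(num_threads / num_cores)
    return waves * task_seconds


def utilization(num_threads, num_cores, task_seconds):
    """Fraction of the available core time that does useful work."""
    busy = num_threads * task_seconds
    available = num_cores * makespan(num_threads, num_cores, task_seconds)
    return busy / available
```

In this model, 8 threads on 8 cores with 10-second tasks finish in 10 seconds at full utilization, while 9 threads need 20 seconds and leave the cores idle for almost half of that time. A real scheduler time-slices rather than running strict waves, so treat the numbers as an approximation of the behaviour described above.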
Maybe you just need to find the optimal number of threads.
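One way to find that number is simply to measure it. The following is a minimal sketch in Python, not a definitive benchmark: `find_best_worker_count` and `default_candidates` are hypothetical helper names, and `run_workload` stands for whatever function starts your real job with a given number of workers.

```python
import os
import time


def default_candidates():
    """Worker counts to try: 1, half the cores, all cores, twice the cores."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return sorted({1, max(1, cores // 2), cores, cores * 2})


def find_best_worker_count(run_workload, candidates=None, repeats=3):
    """Time run_workload(workers) for each candidate and return the fastest.

    run_workload is a callable you supply (an assumption of this sketch);
    it must run the complete job using the given number of workers.
    Returns (fastest_worker_count, {worker_count: best_seconds}).
    """
    if candidates is None:
        candidates = default_candidates()
    timings = {}
    for workers in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_workload(workers)
            best = min(best, time.perf_counter() - start)
        timings[workers] = best  # keep the best of several runs to reduce noise
    fastest = min(timings, key=timings.get)
    return fastest, timings
```

You would pass in a function that builds a pool of the given size and runs your actual computation, then compare the timings. If the result is close to the number of cores, that matches what I saw on my machine.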