I'm benchmarking software which executes 4x faster on an Intel 2670QM than my serial version, using all 8 of my 'logical' threads. I would like some community feedback on my
The key thing to understand here is the difference between physical cores and logical threads.
If you have 4 physical cores on your CPU, that means you have the physical resources to execute 4 distinct threads in parallel. So, if your threads do not contend for data, you can normally measure a x4 performance increase compared to the speed of a single thread.
I'm also assuming that the OS (or you :)) sets the thread affinity correctly, so that each thread runs on its own physical core.
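If you want to set the affinity yourself rather than rely on the OS, here is a minimal sketch of how it could look in Python. It assumes Linux, where `os.sched_setaffinity` is available; the helper name `pin_to_cpu` is my own, purely for illustration. Note that it picks a logical CPU index, and which logical CPUs share a physical core depends on your system (on Linux you can check `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`).

```python
import os


def pin_to_cpu(cpu_index):
    """Pin the caller to a single logical CPU (hypothetical helper, Linux only).

    Returns the resulting affinity set, or None on platforms where
    os.sched_setaffinity does not exist (e.g. macOS, Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    # Only choose among CPUs we are currently allowed to run on.
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[cpu_index % len(allowed)]

    # pid 0 means "the caller".
    os.sched_setaffinity(0, {target})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("Now restricted to:", pin_to_cpu(0))
```

In a real benchmark you would pin each worker to a different physical core, using the sibling list to avoid putting two workers on the same core.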
When you enable HT (Hyper-Threading) on your CPU, the core frequency is not modified. :)
What happens is that part of the hardware in each core is duplicated (mainly the architectural state, such as the register set), while the rest (execution units, caches, etc.) is still shared between the two logical threads.
That's the reason why you do not measure a x8 performance increase. In my experience, with all logical cores enabled you can get a x1.5 - x1.7 performance improvement per physical core, depending on the code you are executing, cache usage (remember that the L1 cache is shared between the two logical cores of one physical core, for instance), thread affinity, and so on.
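To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes perfect scaling across the physical cores and simply multiplies by the per-core figures I quoted from my own experience, so treat the result as an optimistic upper bound rather than a prediction. The function name is made up for the example.

```python
def estimated_speedup(physical_cores, ht_gain_per_core):
    """Upper-bound speedup over one thread, assuming ideal scaling across cores.

    ht_gain_per_core is the speedup one physical core gets from running
    two logical threads (an empirical figure, not a hardware constant).
    """
    return physical_cores * ht_gain_per_core


low = estimated_speedup(4, 1.5)
high = estimated_speedup(4, 1.7)
print(f"Expected at best: x{low:.1f} to x{high:.1f}, versus x8 if logical cores were real cores")
```

Real code with data contention, memory-bound loops or poor affinity will land below that range.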
Hope this helps.