Why does hyper-threading benefit my algorithm?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-12 00:29:20

Question


I have a dual-core machine with 4 logical processors thanks to hyper-threading. I am running a SHA1 pre-image brute-force test in C#. Each thread runs a for loop that computes a SHA1 hash and compares it to the hash I am looking for. I made sure that all threads execute in complete separation, with no memory shared between them. The only exception is one variable, long count, which I increment in each thread using:

System.Threading.Interlocked.Increment(ref count);

I get about 1 million SHA1 hashes/s with 2 threads and 1.3 million SHA1 hashes/s with 4 threads. I fail to see why I get a 30% bonus from HT in this case. Both cores should already be busy with 2 threads, so increasing the number of threads beyond 2 should not give me any benefit. Can anyone explain why?
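For reference, the setup described above might look roughly like the following. This is a minimal sketch, not the asker's actual code: the class name Sha1Bench, the 8-byte candidate layout, the split of the search space by first byte, and the time-boxed loop are all assumptions made for illustration. The target hash is read by every thread but never written.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Threading;

static class Sha1Bench
{
    static long count; // the only mutable state shared between threads, as in the question

    // Runs `threads` workers for roughly `millis` milliseconds and returns
    // how many non-matching SHA-1 hashes they computed in total.
    public static long Run(int threads, int millis, byte[] target)
    {
        Interlocked.Exchange(ref count, 0);
        long end = Stopwatch.GetTimestamp() + millis * Stopwatch.Frequency / 1000;
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            byte id = (byte)t; // hypothetical split of the search space by first byte
            workers[t] = new Thread(() => Search(id, end, target));
            workers[t].Start();
        }
        foreach (Thread w in workers) w.Join();
        return Interlocked.Read(ref count);
    }

    static void Search(byte id, long end, byte[] target)
    {
        using (SHA1 sha1 = SHA1.Create()) // one hasher per thread, never shared
        {
            var candidate = new byte[8];
            candidate[0] = id;
            while (Stopwatch.GetTimestamp() < end)
            {
                byte[] hash = sha1.ComputeHash(candidate);
                if (Matches(hash, target)) return; // pre-image found, this worker stops
                Interlocked.Increment(ref count);
                // Advance bytes 1..7 as a little-endian counter (byte overflow wraps to 0).
                for (int i = 1; i < candidate.Length && ++candidate[i] == 0; i++) { }
            }
        }
    }

    // Byte-by-byte comparison of two hashes.
    public static bool Matches(byte[] a, byte[] b)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) return false;
        return true;
    }
}
```

Calling Sha1Bench.Run(2, 5000, target) and then Sha1Bench.Run(4, 5000, target) and dividing each result by the elapsed seconds would give the two throughput figures being compared; the actual numbers depend on the machine.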


Answer 1:


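Hyper-threading effectively gives you more logical cores: it allows instructions from two threads to run in parallel on a single physical core whenever that core has execution units to spare. Presumably the SHA-1 code is primarily integer operations (adds, rotates and bitwise logic), and a single thread rarely keeps every integer unit busy, hence the speed-up. As far as I'm aware, it helps much less when both threads compete for the same scarce unit, such as the floating-point hardware.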

It's not as good as having 4 real physical cores, of course - but it does allow for a bit more parallelism.




Answer 2:


Disable HT in the BIOS and run the test again with 2 threads. HT gives a speedup mainly when the two logical cores on a physical core need different execution resources, for example when one runs ordinary integer instructions while the other runs instructions that use the FPU registers.
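A software-only variation on this experiment, for cases where changing the BIOS is inconvenient, is to restrict the process to chosen logical processors before running the 2-thread test. The sketch below uses the Process.ProcessorAffinity property, which is supported on Windows and Linux. The class and method names are made up for illustration, and the idea that logical processors 0 and 1 are siblings on one physical core (and 2 and 3 on the other) is an assumption: numbering varies by OS and CPU, so check the machine's topology first.

```csharp
using System;
using System.Diagnostics;

static class AffinityDemo
{
    // Builds a bit mask with one bit set per listed logical processor index.
    public static long MaskFor(params int[] logicalProcessors)
    {
        long mask = 0;
        foreach (int cpu in logicalProcessors) mask |= 1L << cpu;
        return mask;
    }

    // Restricts every thread of the current process to the given logical processors.
    public static void PinProcess(params int[] logicalProcessors)
    {
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)MaskFor(logicalProcessors);
    }
}
```

Under the numbering assumption above, AffinityDemo.PinProcess(0, 2) would place the two threads on separate physical cores, while AffinityDemo.PinProcess(0, 1) would force them to share one core. Comparing the two throughputs shows how much the second hardware thread on a core contributes.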




Answer 3:


SMT/hyper-threading allows multiple threads (usually two) to execute on the same physical core. When one thread encounters a stall, the core's execution resources go to the other thread instead of sitting idle.

Stalls happen -- mostly with cache misses. Even if the threads are not traversing the same memory, there's no guarantee that the memory a thread touches will already be in the cache (so accessing it induces a stall), or that it will not map to the same cache line that another thread's memory maps to.

Thus, two threads will almost always benefit from SMT/hyper-threading, unless the data they traverse is already present in the cache. That's actually an unusual scenario -- to achieve it, an algorithm typically needs to prefetch its data, use no more than the cache can hold, and avoid evicting memory that other threads are trying to keep cached, which requires knowledge of the other threads on the core. That's not usually possible, because it's abstracted away by the OS.

Most algorithms are not tuned to that extent, particularly since it's usually only console-exclusive games, or other hardware-exclusive applications, that can guarantee a certain minimum spec for the cache and, more importantly, have intimate knowledge of the other threads running concurrently on the same core. This is also one of the major reasons larger caches benefit modern CPU performance.
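The cache-miss stalls this answer describes can be made visible with a pointer-chasing loop, where each load depends on the previous one. The following is a rough sketch, with hypothetical names throughout: it builds either a sequential cycle (cache-friendly) or a random single cycle using Sattolo's algorithm (cache-hostile once the array is larger than the cache), then follows the links for a fixed number of steps.

```csharp
using System;
using System.Diagnostics;

static class StallDemo
{
    // next[i] = i + 1 (wrapping), so the chase walks memory in order.
    public static int[] BuildSequentialCycle(int n)
    {
        var next = new int[n];
        for (int i = 0; i < n; i++) next[i] = (i + 1) % n;
        return next;
    }

    // Sattolo's algorithm: a random permutation that forms one single cycle,
    // so the chase jumps unpredictably and visits every slot before repeating.
    public static int[] BuildRandomCycle(int n, int seed)
    {
        var next = new int[n];
        for (int i = 0; i < n; i++) next[i] = i;
        var rng = new Random(seed);
        for (int i = n - 1; i > 0; i--)
        {
            int j = rng.Next(i); // 0 <= j < i
            int tmp = next[i];
            next[i] = next[j];
            next[j] = tmp;
        }
        return next;
    }

    // Follows the links `steps` times; every load depends on the previous one.
    public static int Chase(int[] next, int steps)
    {
        int p = 0;
        for (int s = 0; s < steps; s++) p = next[p];
        return p;
    }

    // Times one chase; the result is kept alive so the loop is not optimized away.
    public static long TimeChaseMs(int[] next, int steps)
    {
        var sw = Stopwatch.StartNew();
        int last = Chase(next, steps);
        sw.Stop();
        GC.KeepAlive(last);
        return sw.ElapsedMilliseconds;
    }
}
```

With an array much larger than the last-level cache (say 1 << 25 entries, an arbitrary choice), timing TimeChaseMs on the random cycle against the sequential one gives a feel for how long a core sits stalled on misses, which is the idle time a second hardware thread can fill.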



Source: https://stackoverflow.com/questions/19420137/why-does-hyper-threading-benefit-my-algorithm
