My application contains several latency-critical threads that "spin", i.e. never block. Such a thread is expected to take 100% of one CPU core. However, it seems modern operating systems…
I came across this question because I'm dealing with exactly the same design problem. I'm building HFT systems where every nanosecond counts. After reading all the answers, I decided to implement and benchmark four different approaches.
The undisputed winner was "busy wait with affinity set". No doubt about it.
Now, as many have pointed out, make sure to leave a couple of cores free so the OS can keep running smoothly.
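For reference, here is a minimal sketch of that approach on Linux using `pthread_setaffinity_np`. The core numbers and `poll_work()` are placeholders for illustration, not part of my actual system; adapt them to your CPU topology and workload:

```cpp
// Minimal "busy wait with affinity set" sketch (Linux + pthreads).
#include <atomic>
#include <chrono>
#include <pthread.h>
#include <sched.h>
#include <thread>

std::atomic<bool> stop{false};

// Placeholder for the latency-critical work (hypothetical):
// in a real system this would poll a lock-free queue or NIC ring buffer.
inline void poll_work() {}

// Pin the calling thread to one core so the scheduler never migrates it;
// migration would lose cache warmth and add latency jitter.
void pin_to_core(int core_id) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

void spin_loop(int core_id) {
    pin_to_core(core_id);
    // Never block: this loop keeps the core at 100%.
    while (!stop.load(std::memory_order_relaxed)) {
        poll_work();
    }
}

int main() {
    // Pin the spinning threads to cores 2 and 3, leaving cores 0-1
    // free for the OS and housekeeping threads.
    std::thread t1(spin_loop, 2);
    std::thread t2(spin_loop, 3);
    std::this_thread::sleep_for(std::chrono::seconds(5));
    stop.store(true);
    t1.join();
    t2.join();
}
```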
My only concern at this point is whether there is any physical harm to cores that run at 100% for hours.