I've been using volatile bool for years for thread execution control and it worked fine.
// in my class declaration
volatile bool stop_;
-----------------
// I
My guess is that this is a hardware question. When you write volatile, you tell the compiler not to assume anything about the variable, but as I understand it, the hardware still treats it as a normal variable. This means the variable stays in the cache the whole time. When you use atomic, you use special hardware instructions, which probably means the variable is fetched from main memory each time it is used. The difference in timing is consistent with this explanation.
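For comparison, here is a minimal sketch of what the atomic version of the stop flag might look like. Only stop_ comes from the question. The Worker class, its start()/stop() methods, and the iteration counter are hypothetical names made up for this example. It uses explicit acquire/release ordering instead of the default sequentially consistent ordering. That is an assumption about what a simple stop flag needs, and whether it changes the timing will depend on the compiler and the hardware.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical worker class: everything except the name stop_ is an
// assumption made for this sketch, not code from the original post.
class Worker {
public:
    Worker() : stop_(false), iterations_(0) {}

    // Stop and join the thread if the caller forgot to.
    ~Worker() { stop(); }

    // Launch the background loop.
    void start() {
        thread_ = std::thread([this] {
            // Spin until another thread requests a stop.
            while (!stop_.load(std::memory_order_acquire)) {
                iterations_.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Request the loop to exit, then wait for the thread to finish.
    void stop() {
        stop_.store(true, std::memory_order_release);
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    // Number of loop iterations completed so far.
    std::uint64_t iterations() const {
        return iterations_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<bool> stop_;                 // replaces: volatile bool stop_;
    std::atomic<std::uint64_t> iterations_;  // only here to give the loop work
    std::thread thread_;
};
```

Usage would be something like: construct a Worker, call start(), and later call stop() from the controlling thread. Unlike volatile, std::atomic is specified by the C++ memory model for access from multiple threads, so the store in stop() is guaranteed to become visible to the loop.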