Assume:
A. C++ under WIN32.
B. A properly aligned volatile integer incremented and decremented using InterlockedIncrement()
and Interlocke
Your way is good:
LONG Cur = InterlockedCompareExchange(&_ServerState, 0, 0);
I'm using similar solution:
LONG Cur = InterlockedExchangeAdd(&_ServerState, 0);
Read is fine. A 32-bit value is always read as a whole as long as it's not split on a cache line. Your align 8 guarantees that it's always within a cache line so you'll be fine.
Forget about instructions reordering and all that non-sense. Results are always retired in-order. It would be a processor recall otherwise!!!
Even for a dual CPU machine (i.e. shared via the slowest FSBs), you'll still be fine as the CPUs guarantee cache coherency via MESI Protocol. The only thing you're not guaranteed is the value you read may not be the absolute latest. BUT, what is the latest anyway? That's something you likely won't need to know in most situations if you're not writing back to the location based on the value of that read. Otherwise, you'd have used interlocked ops to handle it in the first place.
In short, you gain nothing by using Interlocked ops on a read (except perhaps reminding the next person maintaining your code to tread carefully - then again, that person may not be qualified to maintain your code to begin with).
EDIT: In response to a comment left by Adrian McCarthy.
You're overlooking the effect of compiler optimizations. If the compiler thinks it has the value already in a register, then it's going to re-use that value instead of re-reading it from memory. Also, the compiler may do instruction re-ordering for optimization if it believes there are no observable side effects.
I did not say reading from a non-volatile variable is fine. All the question was asking was if interlocked was required. In fact, the variable in question was clearly declared with volatile
. Or were you overlooking the effect of the keyword volatile
?
Interlocked instructions provide atomicity and inter-processor synchronization. Both writes and reads must be synchronized, so yes, you should be using interlocked instructions to read a value that is shared between threads and not protected by a lock. Lock-free programming (and that's what you're doing) is a very tricky area, so you might consider using locks instead. Unless this is really one of your program's bottlenecks that must be optimized?
32-bit read operations are already atomic on some 32-bit systems (Intel spec says these operations are atomic, but there's no guarantee that this will be true on other x86-compatible platforms). So you shouldn't use this for threads synchronization.
If you need a flag some sort you should consider using Event object and WaitForSingleObject function for that purpose.