Exit critical region

Submitted by 僤鯓⒐⒋嵵緔 on 2019-12-10 20:28:53

Question


Consider several threads executing concurrently the following code:

long gf = 0;// global variable or class member

//...

if (InterlockedCompareExchange(&gf, 1, 0)==0) // lock cmpxchg
{
    // some exclusive code - must not run concurrently
    gf = 0; // is this OK, or do we need:
    //InterlockedExchange(&gf, 0); // [lock] xchg
}

Treat the code above as C-like pseudo-code, which will be translated more-or-less directly into assembly without the usual concessions to compiler optimizations such as reordering and store elimination.

So after some thread exclusively acquires the flag gf, is it enough to write a zero (as in gf = 0) to exit the critical region, or does the write need to be interlocked, as in InterlockedExchange(&gf, 0)?

If both are OK, which is better for performance, assuming that with high probability several cores call InterlockedCompareExchange(&gf, 1, 0) concurrently?

Several threads periodically execute this code (from several places, when some events fire), and it is important that the next thread enters the critical region as soon as possible after it is freed.


Answer 1:


Related: Spinlock with XCHG explains why you don't need xchg to release a lock in x86 asm, just a store instruction.

But in C++, you need something stronger than a plain gf = 0; on a plain long gf variable. The C / C++ memory model (for normal variables) is very weakly ordered, even when compiling for strongly-ordered x86, because that's essential for optimizations.

You need a release-store to correctly release a lock; otherwise operations in the critical section can leak out of it by reordering, at compile time or at runtime, with the gf=0 store. See http://preshing.com/20120913/acquire-and-release-semantics/
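A minimal sketch of the question's try-once pattern rewritten with std::atomic, assuming C++11 or later. The names shared_counter, try_exclusive_work, and run_try_enter_demo are hypothetical and exist only for illustration; the acquire compare-exchange stands in for InterlockedCompareExchange, and the release store replaces the plain gf = 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<long> gf{0};   // 0 = free, 1 = held (same role as gf in the question)
long shared_counter = 0;   // hypothetical data protected by gf

// Try to enter once; returns true if this call ran the exclusive code.
bool try_exclusive_work() {
    long expected = 0;
    // Acquire on success: the critical section cannot be reordered above this.
    if (!gf.compare_exchange_strong(expected, 1,
                                    std::memory_order_acquire,
                                    std::memory_order_relaxed)) {
        return false;      // another thread currently holds the flag
    }
    ++shared_counter;      // exclusive code
    // Release store: the critical section cannot be reordered below this.
    gf.store(0, std::memory_order_release);
    return true;
}

// Hypothetical driver: each thread retries until it has entered per_thread times.
long run_try_enter_demo(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    gf.store(0);
    shared_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([per_thread] {
            for (int done = 0; done < per_thread; ) {
                if (try_exclusive_work()) ++done;
            }
        });
    }
    for (auto& th : pool) th.join();
    return shared_counter;  // safe to read: join() synchronizes with each thread
}
```

Calling run_try_enter_demo(4, 10000) should return 40000 if no increment is lost. On x86 the release store is expected to compile to an ordinary mov, so the cost of exiting stays the same as a plain store while the compiler is barred from moving the critical section past it.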

Since you're using long gf, not volatile long gf, and you aren't using a compiler memory barrier, nothing in your code would prevent compile-time reordering. (x86 asm stores have release semantics, so it's only compile-time reordering we need to worry about.) http://preshing.com/20120625/memory-ordering-at-compile-time/


We get everything we need as cheaply as possible by using std::atomic<long> gf; and gf.store(0, std::memory_order_release); to release the lock. atomic<long> is lock-free on every platform that supports InterlockedExchange, AFAIK, so you should be OK to mix and match. (Or just use gf.exchange() to take the lock.) If rolling your own locks, keep in mind that you should loop on a read-only operation + _mm_pause() while waiting for the lock; don't hammer away with xchg or lock cmpxchg and potentially delay the unlock. See Locks around memory manipulation via inline assembly.

This is one of the cases where the warning in Why is integer assignment on a naturally aligned variable atomic on x86? applies: you need atomic<> to make sure the compiler actually does the store where and when you need it.
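The waiting advice can be sketched as a small spinlock, assuming C++11 or later. SpinLock, cpu_relax, and run_spin_demo are hypothetical names introduced here; _mm_pause() is used only when compiling for x86-64, with std::this_thread::yield() as an assumed fallback elsewhere.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
static inline void cpu_relax() { _mm_pause(); }
#else
static inline void cpu_relax() { std::this_thread::yield(); }  // assumed fallback
#endif

class SpinLock {
    std::atomic<long> flag{0};  // 0 = free, 1 = held
public:
    void lock() {
        for (;;) {
            // One atomic read-modify-write attempt (xchg on x86).
            if (flag.exchange(1, std::memory_order_acquire) == 0) return;
            // While the lock is held, wait with read-only loads so the
            // cache line is not repeatedly claimed for writing.
            while (flag.load(std::memory_order_relaxed) != 0) cpu_relax();
        }
    }
    void unlock() {
        // A release store is enough to exit; no atomic RMW is needed.
        flag.store(0, std::memory_order_release);
    }
};

// Hypothetical driver: every thread increments a shared counter under the lock.
long run_spin_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iters] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;      // exclusive code
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling run_spin_demo(4, 20000) should return 80000. Waiters only retry the exchange after they have observed the flag as free, which is what keeps them from delaying the owner's unlocking store.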




Answer 2:


gf = 0 is sufficient. There’s no need to use a locked operation since no other thread can be changing its value.

By the way, I’d use bts instead of cmpxchg to acquire the lock. I’m not sure if it makes any difference in performance, but it’s simpler.



Source: https://stackoverflow.com/questions/50221295/exit-critical-region
