Do I need a memory barrier for a change notification flag between threads?

Question

I need a very fast (in the sense "low cost for reader", not "low latency") change notification mechanism between threads in order to update a read cache:

The situation

Thread W (Writer) updates a data structure (S) (in my case a setting in a map) only once in a while.

Thread R (Reader) maintains a cache of S and reads it very frequently. When Thread W updates S, Thread R needs to be notified of the update within a reasonable time (10-100ms).

Architecture is ARM, x86 and x86_64. I need to support C++03 with gcc 4.6 and higher.

Code

is something like this:

// variables shared between threads
bool updateAvailable;
SomeMutex dataMutex;
std::string myData;

// variables used only in Thread R
std::string myDataCache;

// Thread W
dataMutex.Lock();
myData = "newData";
updateAvailable = true;
dataMutex.Unlock();

// Thread R

if(updateAvailable)
{
    dataMutex.Lock();
    myDataCache = myData;
    updateAvailable = false;
    dataMutex.Unlock();
}

doSomethingWith(myDataCache);

My Question

In Thread R no locking or barriers occur in the "fast path" (no update available). Is this an error? What are the consequences of this design?

Do I need to qualify updateAvailable as volatile?

Will R get the update eventually?

My understanding so far

Is it safe regarding data consistency?

This looks a bit like "Double Checked Locking". According to http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html a memory barrier can be used to fix it in C++.

However the major difference here is that the shared resource is never touched/read in the Reader fast path. When updating the cache, the consistency is guaranteed by the mutex.

Will R get the update?

Here is where it gets tricky. As I understand it, the CPU running Thread R could cache updateAvailable indefinitely, effectively moving the Read way way before the actual if statement.

So the update could take until the next cache flush, for example when another thread or process is scheduled.

Answer 1:

Do I need to qualify updateAvailable as volatile?

Since volatile has nothing to do with the threading model in C++, you should use atomics to make your program strictly standard-conformant:

In C++11 or newer, the preferred way is to use std::atomic<bool> with memory_order_relaxed store/load:

#include <atomic>

std::atomic<bool> updateAvailable(false);

//Writer
....
updateAvailable.store(true, std::memory_order_relaxed); //set (under mutex locked)

// Reader

if(updateAvailable.load(std::memory_order_relaxed)) // check
{
    ...
    updateAvailable.store(false, std::memory_order_relaxed); // clear (under mutex locked)
    ....
}

gcc 4.7 and later support similar functionality through the __atomic builtins.
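
As a rough sketch of what that could look like, the relaxed set/check/clear can be written with `__atomic_store_n` and `__atomic_load_n`. The helper names below are made up for illustration, the flag is an `int` rather than a `bool`, and the mutex that should surround the set and the clear is left out:

```cpp
// Sketch using gcc's __atomic builtins (gcc >= 4.7); compiles in C++03 mode.
// The helper names are hypothetical, not taken from the question.
static int updateAvailable = 0;

// Writer: call while holding the data mutex.
inline void setUpdateAvailable()
{
    __atomic_store_n(&updateAvailable, 1, __ATOMIC_RELAXED);
}

// Reader fast path: one relaxed load, no lock and no fence.
inline bool isUpdateAvailable()
{
    return __atomic_load_n(&updateAvailable, __ATOMIC_RELAXED) != 0;
}

// Reader: call while holding the data mutex, after copying the data.
inline void clearUpdateAvailable()
{
    __atomic_store_n(&updateAvailable, 0, __ATOMIC_RELAXED);
}
```

The reader would call `isUpdateAvailable()` on every pass and only take the mutex when it returns true.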

With gcc 4.6, there seems to be no strictly conformant way to avoid fences when accessing updateAvailable. In practice, a memory fence costs far less than the 10-100ms notification window, so you can use gcc's legacy __sync atomic builtins, which imply a full barrier:

int updateAvailable = 0;

//Writer
...
__sync_fetch_and_or(&updateAvailable, 1); // set to non-zero
....

//Reader
if(__sync_fetch_and_and(&updateAvailable, 1)) // check: AND with 1 returns the old value and leaves a 0/1 flag unchanged
{
    ...
    __sync_fetch_and_and(&updateAvailable, 0); // clear
    ...
}

Is it safe regarding data consistency?

Yes, it is safe. Your reasoning is absolutely correct here:

the shared resource is never touched/read in the Reader fast path.

This is NOT double-checked locking!

This is stated explicitly in the question itself.

When updateAvailable is false, the Reader thread uses myDataCache, which is local to that thread (no other thread uses it). With double-checked locking, all threads use the shared object directly.

Why memory fences/barriers are NOT NEEDED here

The only variable accessed concurrently is updateAvailable. myData is accessed under mutex protection, which provides all the needed fences. myDataCache is local to the Reader thread.

When the Reader thread sees updateAvailable as false, it uses myDataCache, which only that thread modifies. Program order guarantees correct visibility of changes in that case.

As for visibility guarantees for updateAvailable, the C++11 standard provides them for atomic variables even without fences. 29.3 p13 says:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

Jonathan Wakely has confirmed in chat that this paragraph applies even to memory_order_relaxed accesses.
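
Putting the pieces of this answer together, a minimal C++11 sketch of the whole scheme might look like the following. The class name `CachedSetting` and its member functions are hypothetical, and `std::mutex` stands in for the question's `SomeMutex`:

```cpp
#include <atomic>
#include <mutex>
#include <string>

// Hypothetical wrapper around the question's variables.
class CachedSetting
{
public:
    // Thread W: store the new value and raise the flag, both under the mutex.
    void write(const std::string& value)
    {
        std::lock_guard<std::mutex> lock(dataMutex_);
        myData_ = value;
        updateAvailable_.store(true, std::memory_order_relaxed);
    }

    // Thread R only: return the cached value, refreshing it when flagged.
    const std::string& read()
    {
        // Fast path: a single relaxed load, no lock and no fence.
        if (updateAvailable_.load(std::memory_order_relaxed))
        {
            // Slow path: the mutex orders the copy against the writer.
            std::lock_guard<std::mutex> lock(dataMutex_);
            myDataCache_ = myData_;
            updateAvailable_.store(false, std::memory_order_relaxed);
        }
        return myDataCache_;
    }

private:
    std::atomic<bool> updateAvailable_{false};
    std::mutex dataMutex_;
    std::string myData_;       // shared, guarded by dataMutex_
    std::string myDataCache_;  // touched by Thread R only
};
```

Relaxed ordering is enough for the flag here because myData_ is only ever read or written with the mutex held; the flag merely tells the reader when taking the lock is worthwhile.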

Answer 2:

Use C++ atomics and make updateAvailable a std::atomic<bool>. The reason is that it is not just the CPU that can see an old version of the variable; the compiler in particular does not see the side effects of another thread, so it may never bother to refetch the variable, and the thread never sees the updated value. Additionally, this gives you a guaranteed atomic read, which a plain read of the value does not.

Other than that, you could potentially get rid of the lock. If, for example, the producer only ever produces data when updateAvailable is false, you can drop the mutex, because std::atomic<> (with its default sequentially consistent ordering) enforces proper ordering of the reads and writes. If that is not the case, you still need the lock.
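
A sketch of that lock-free variant, assuming exactly one writer and one reader, could look like this. The class name `Handoff` and the `tryPublish` function are invented for illustration. Note that relaxed ordering is no longer sufficient once the mutex is gone, so the sketch uses acquire/release on the flag:

```cpp
#include <atomic>
#include <string>

// Hypothetical single-slot handoff: one writer thread, one reader thread.
class Handoff
{
public:
    // Thread W: returns false and writes nothing while an update is pending.
    bool tryPublish(const std::string& value)
    {
        if (updateAvailable_.load(std::memory_order_acquire))
            return false;   // reader has not consumed the previous value yet
        myData_ = value;    // reader does not touch myData_ while the flag is false
        updateAvailable_.store(true, std::memory_order_release);
        return true;
    }

    // Thread R only: return the cached value, refreshing it when flagged.
    const std::string& read()
    {
        if (updateAvailable_.load(std::memory_order_acquire))
        {
            myDataCache_ = myData_;  // writer does not touch myData_ while the flag is true
            updateAvailable_.store(false, std::memory_order_release);
        }
        return myDataCache_;
    }

private:
    std::atomic<bool> updateAvailable_{false};
    std::string myData_;       // owned by whichever side the flag designates
    std::string myDataCache_;  // touched by Thread R only
};
```

The flag acts as an ownership token: the writer may touch myData_ only while it is false, the reader only while it is true. The cost of this design is that the writer has to retry or drop an update when `tryPublish` returns false.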

Answer 3:

You do have to use a memory fence here. Without the fence, there is no guarantee that updates will ever be seen on the other thread. In C++03 you can either use platform-specific ASM code (mfence on Intel, no idea about ARM) or OS-provided atomic set/get functions.
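
If an explicit fence is wanted without hand-written ASM, gcc's `__sync_synchronize()` builtin issues a full memory barrier on both x86 and ARM. The following is only a sketch of that pre-C++11 idiom: the helper names are hypothetical, and it relies on assumptions that the C++03 standard does not guarantee (see the comments):

```cpp
// Sketch for gcc in C++03 mode. Helper names are hypothetical.
// Assumptions: an aligned int is read and written in one piece on ARM and
// x86, and volatile keeps the compiler from caching the flag in a register.
// C++03 has no threading model, so neither is promised by the standard.
static volatile int updateAvailable = 0;

// Writer: raise the flag, then issue a full barrier.
inline void raiseFlag()
{
    updateAvailable = 1;
    __sync_synchronize();
}

// Reader: full barrier, then read the flag.
inline bool flagRaised()
{
    __sync_synchronize();
    return updateAvailable != 0;
}

// Reader: clear the flag (in the question's design, while holding the mutex).
inline void lowerFlag()
{
    updateAvailable = 0;
    __sync_synchronize();
}
```

Unlike the relaxed atomic load, this puts a full barrier on the reader's fast path, so it costs more per check.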

Source: https://stackoverflow.com/questions/33956205/do-i-need-a-memory-barrier-for-a-change-notification-flag-between-threads

Tags

c++

multithreading

c++03

lock-free