问题
I am trying to find my feet in lock-free programming. Having read different explanations for memory ordering semantics, I would like to clear up what possible reordering may happen. As far as I understood, instructions may be reordered by the compiler (due to optimization when the program is compiled) and CPU (at runtime?).
For the relaxed semantics cpp reference provides the following example:
// Thread 1:
r1 = y.load(memory_order_relaxed); // A
x.store(r1, memory_order_relaxed); // B
// Thread 2:
r2 = x.load(memory_order_relaxed); // C
y.store(42, memory_order_relaxed); // D
It is said that with x and y initially zero the code is allowed to produce r1 == r2 == 42 because, although A is sequenced-before B within thread 1 and C is sequenced before D within thread 2, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x. How could that happen? Does it imply that C and D get reordered, so the execution order would be DABC? Is it allowed to reorder A and B?
For the acquire-release semantics there is the following sample code:
std::atomic<std::string*> ptr;
int data;
void producer()
{
std::string* p = new std::string("Hello");
data = 42;
ptr.store(p, std::memory_order_release);
}
void consumer()
{
std::string* p2;
while (!(p2 = ptr.load(std::memory_order_acquire)))
;
assert(*p2 == "Hello"); // never fires
assert(data == 42); // never fires
}
I'm wondering what if we used relaxed memory order instead of acquire? I guess, the value of data
could be read before p2 = ptr.load(std::memory_order_relaxed)
, but what about p2
?
Finally, why it is fine to use relaxed memory order in this case?
template<typename T>
class stack
{
std::atomic<node<T>*> head;
public:
void push(const T& data)
{
node<T>* new_node = new node<T>(data);
// put the current value of head into new_node->next
new_node->next = head.load(std::memory_order_relaxed);
// now make new_node the new head, but if the head
// is no longer what's stored in new_node->next
// (some other thread must have inserted a node just now)
// then put that new head into new_node->next and try again
while(!head.compare_exchange_weak(new_node->next, new_node,
std::memory_order_release,
std::memory_order_relaxed))
; // the body of the loop is empty
}
};
I mean both head.load(std::memory_order_relaxed)
and head.compare_exchange_weak(new_node->next, new_node, std::memory_order_release, std::memory_order_relaxed)
.
To summarize all the above, my question is essentially when do I have to care about potential reordering and when I don't?
回答1:
For #1, compiler may issue the store to y before the load from x (there are no dependencies), and even if it doesn't, the load from x can be delayed at cpu/memory level.
For #2, p2 would be nonzero, but neither *p2 nor data would necessarily have a meaningful value.
For #3 there is only one act of publishing non-atomic stores made by this thread, and it is a release
You should always care about reordering, or, better, not assume any order: neither C++ nor hardware executes code top to bottom, they only respect dependencies.
来源:https://stackoverflow.com/questions/40127280/lock-free-programming-reordering-and-memory-order-semantics