Example of misuse of std::memory_order::relaxed in C++ Standard [algorithms.parallel.exec/5 in n4713]

Question

The C++ Standard gives the following example of misusing std::memory_order::relaxed:

std::atomic<int> x{0};
int a[] = {1,2};
std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
    x.fetch_add(1, std::memory_order::relaxed);
    // spin wait for another iteration to change the value of x
    while (x.load(std::memory_order::relaxed) == 1) { } // incorrect: assumes execution order
});

It then says:

The above example depends on the order of execution of the iterations, and will not terminate if both iterations are executed sequentially on the same thread of execution.

Questions:

1. The comment says, "incorrect: assumes execution order". What is the "assumed execution order"? I don't see it.
2. What does "iterations" refer to in "The above example depends on the order of execution of the iterations"? Does it mean the iterations of the while loop, or the iterations of std::for_each?
3. If the iterations of std::for_each are executed in parallel by different threads, isn't it still true that one of the iterations/threads won't exit? x.fetch_add(1, std::memory_order::relaxed) is atomic, so one thread will make x 1 and the other will make x 2, and it is impossible for both threads to have x == 1. No?

Answer 1:

"incorrect: assumes execution order". What's the "assumed execution order"?

It assumes that the body of the lambda is executed by multiple threads rather than by one. The standard only says that the invocations may execute in parallel, not that they will.
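To make the single-thread case concrete, here is a minimal sketch that models both iterations running one after the other on the same thread. The names iteration_body, run_sequentially and SequentialResult, as well as the max_spins bound, are assumptions added for illustration; the bound exists only so that the demonstration terminates, whereas the loop in the standard's example is unbounded. std::memory_order_relaxed is used so the sketch also compiles before C++20.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper: the lambda body from the example, except that the
// spin wait gives up after max_spins polls so that this demo terminates.
// Returns true if the wait observed a value of x other than 1.
bool iteration_body(std::atomic<int>& x, std::size_t max_spins) {
    x.fetch_add(1, std::memory_order_relaxed);
    for (std::size_t i = 0; i < max_spins; ++i) {
        if (x.load(std::memory_order_relaxed) != 1) {
            return true;
        }
    }
    return false; // x is still 1: the unbounded loop would keep spinning
}

struct SequentialResult {
    bool first_exited;
    bool second_exited;
};

// Models both iterations executing sequentially on one thread.
SequentialResult run_sequentially(std::size_t max_spins) {
    std::atomic<int> x{0};
    // The second iteration cannot start until the first one returns, so
    // nothing can change x while the first iteration is waiting.
    bool first = iteration_body(x, max_spins);
    bool second = iteration_body(x, max_spins);
    return SequentialResult{first, second};
}

// Usage: the first iteration never sees x change; the second exits at once.
void check_sequential_model() {
    SequentialResult r = run_sequentially(1000000);
    assert(!r.first_exited);
    assert(r.second_exited);
}
```

In this model the first iteration would never leave the real, unbounded loop, so the second iteration would never get the chance to run. That is the "assumed execution order": the example only works if the other iteration runs concurrently.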

What does the "iterations" refer to in "The above example depends on the order of execution of the iterations"?

It probably refers to the execution of the lambda by another thread. But the standard doesn't guarantee that there is another thread. See execution_policy_tag_t:

parallel_policy: The execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be parallelized. The invocations of element access functions in parallel algorithms invoked with this policy (usually specified as std::execution::par) are permitted to execute in either the invoking thread or in a thread implicitly created by the library to support parallel algorithm execution. Any such invocations executing in the same thread are indeterminately sequenced with respect to each other.
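Since the invocations may all run in the invoking thread, the lambda should not wait for another invocation to make progress. As a rough sketch of an order-independent variant (the function name count_without_waiting is an assumption, not something from the standard), each invocation can simply increment the counter and return. The sequential overload of std::for_each is used here so that the sketch needs no parallel backend; this is the worst case that std::execution::par permits.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <iterator>

// Hypothetical order-independent variant: no invocation waits on another,
// so it terminates whether the invocations run concurrently or one by one.
int count_without_waiting() {
    std::atomic<int> x{0};
    int a[] = {1, 2};
    // Sequential overload: both invocations run on the calling thread.
    std::for_each(std::begin(a), std::end(a), [&](int) {
        x.fetch_add(1, std::memory_order_relaxed);
    });
    return x.load(std::memory_order_relaxed);
}

// Usage: every element is counted exactly once.
void check_count() {
    assert(count_without_waiting() == 2);
}
```

The result does not depend on which invocation runs first, or on whether a second thread exists at all.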

Source: https://stackoverflow.com/questions/58287969/example-of-misuse-of-stdmemory-orderrelaxed-in-c-standard-algorithms-para

Tags

c++

multithreading

parallel-processing

relaxed-atomics