I understand that volatile
informs the compiler that the value may be changed, but in order to accomplish this functionality, does the compiler need to introduce a memory fence?
I think the confusion around volatile and instruction reordering stems from there being two notions of reordering: reordering performed by the compiler and reordering performed by the CPU:
Volatile affects how a compiler generates the code assuming single threaded execution (this includes interrupts). It doesn't imply anything about memory barrier instructions, but it rather precludes a compiler from performing certain kinds of optimizations related to memory accesses.
A typical example is re-fetching a value from memory, instead of using one cached in a register.
CPUs can execute instructions out-of-order/speculatively provided that the end result could have happened in the original code. CPUs can perform transformations that are disallowed in compilers because compilers can only perform transformations which are correct in all circumstances. In contrast, CPUs can check the validity of these optimizations and back out of them if they turn out to be incorrect.
The end result of a sequence of instructions, the effective order, must agree with the semantics of the code generated by the compiler. However, the actual execution order chosen by the CPU can be different.
The effective order as seen in other CPUs (every CPU can have a different view) can be constrained by memory barriers.
I'm not sure how much effective and actual order can differ because I don't know to what extent memory barriers can preclude CPUs from performing out-of-order execution.
The keyword volatile
essentially means that reads and writes to an object should be performed exactly as written by the program, and not optimized in any way. Binary code should follow the C or C++ code: a load where there is a read, a store where there is a write.
It also means that no read should be expected to result in a predictable value: the compiler shouldn't assume anything about a read even immediately following a write to the same volatile object:
volatile int i;
i = 1;
int j = i;
if (j == 1) // not assumed to be true
volatile may be the most important tool in the "C is a high level assembly language" toolbox.
Whether declaring an object volatile is sufficient for ensuring the behavior of code that deals with asynchronous changes depends on the platform: different CPUs give different levels of guaranteed synchronization for normal memory reads and writes. You probably shouldn't try to write such low-level multithreading code unless you are an expert in the area.
Atomic primitives provide a nice higher-level view of objects for multithreading that makes it easy to reason about code. Almost all programmers should use either atomic primitives or primitives that provide mutual exclusion, like mutexes, read-write locks, semaphores, or other blocking primitives.
Does the C++ volatile keyword introduce a memory fence?
A C++ compiler which conforms to the specification is not required to introduce a memory fence. Your particular compiler might; direct your question to the authors of your compiler.
The function of "volatile" in C++ has nothing to do with threading. Remember, the purpose of "volatile" is to disable compiler optimizations so that reading from a register that is changing due to exogenous conditions is not optimized away. Is a memory address that is being written to by a different thread on a different CPU a register that is changing due to exogenous conditions? No. Again, if some compiler authors have chosen to treat memory addresses being written to by different threads on different CPUs as though they were registers changing due to exogenous conditions, that's their business; they are not required to do so. Nor are they required -- even if it does introduce a memory fence -- to, for instance, ensure that every thread sees a consistent ordering of volatile reads and writes.
In fact, volatile is pretty much useless for threading in C/C++. Best practice is to avoid it.
Moreover: memory fences are an implementation detail of particular processor architectures. In C#, where volatile explicitly is designed for multithreading, the specification does not say that half fences will be introduced, because the program might be running on an architecture that doesn't have fences in the first place. Rather, again, the specification makes certain (extremely weak) guarantees about what optimizations will be eschewed by the compiler, runtime and CPU to put certain (extremely weak) constraints on how some side effects will be ordered. In practice these optimizations are eliminated by use of half fences, but that's an implementation detail subject to change in the future.
The fact that you care about the semantics of volatile in any language as they pertain to multithreading indicates that you're thinking about sharing memory across threads. Consider simply not doing that. It makes your program far harder to understand and far more likely to contain subtle, impossible-to-reproduce bugs.
First of all, the C++ standards do not guarantee the memory barriers needed for properly ordering reads and writes that are non-atomic. volatile variables are recommended for use with MMIO, signal handling, etc. On most implementations volatile is not useful for multi-threading and it's not generally recommended.
Regarding the implementation of volatile accesses, this is the compiler's choice.
This article, describing gcc behavior shows that you cannot use a volatile object as a memory barrier to order a sequence of writes to volatile memory.
Regarding icc behavior I found this source telling also that volatile does not guarantee ordering memory accesses.
Microsoft VS2013 compiler has a different behavior. This documentation explains how volatile enforces Release / Acquire semantics and enables volatile objects to be used in locks / releases on multi-threaded applications.
Another aspect that needs to be taken into consideration is that the same compiler may behave differently with respect to volatile depending on the targeted hardware architecture. This post regarding the MSVS 2013 compiler clearly states the specifics of compiling with volatile for ARM platforms.
So my answer to:
Does the C++ volatile keyword introduce a memory fence?
would be: Not guaranteed. Probably not, but some compilers might do it. You should not rely on the fact that it does.
I always use volatile in interrupt service routines, e.g. the ISR (often assembly code) modifies some memory location and the higher level code that runs outside of the interrupt context accesses the memory location through a pointer to volatile.
I do this for RAM as well as memory-mapped IO.
Based on the discussion here it seems this is still a valid use of volatile, but doesn't have anything to do with multiple threads or CPUs. If the compiler for a microcontroller "knows" that there can't be any other accesses (e.g. everything is on-chip, no cache and there's only one core), I would think that a memory fence isn't implied at all; the compiler just needs to prevent certain optimisations.
As we pile more stuff into the "system" that executes the object code almost all bets are off, at least that's how I read this discussion. How could a compiler ever cover all bases?
Rather than explaining what volatile does, allow me to explain when you should use volatile:

- Inside a signal handler: writing to a volatile variable is pretty much the only thing the standard allows you to do from within a signal handler. Since C++11 you can use std::atomic for that purpose, but only if the atomic is lock-free.
- When dealing with setjmp, according to Intel.
- When dealing directly with hardware and you don't want the compiler to optimize your reads and writes away.

For example:
volatile int *foo = some_memory_mapped_device;
while (*foo)
; // wait until *foo turns false
Without the volatile specifier, the compiler is allowed to completely optimize the loop away. The volatile specifier tells the compiler that it may not assume that two subsequent reads return the same value.
Note that volatile has nothing to do with threads. The above example does not work if there is a different thread writing to *foo, because there is no acquire operation involved.
In all other cases, usage of volatile should be considered non-portable and should no longer pass code review, except when dealing with pre-C++11 compilers and compiler extensions (such as msvc's /volatile:ms switch, which is enabled by default under X86/I64).