As i know, volatile
is usually used to prevent unexpected compile optimization during some hardware operations. But which scenes volatile
should be dec
A good explanation is given here: Understanding “volatile” qualifier in C
The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler.
Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time. The system always reads the current value of a volatile object from the memory location rather than keeping its value in temporary register at the point it is requested, even if a previous instruction asked for a value from the same object. So the simple question is, how can value of a variable change in such a way that compiler cannot predict. Consider the following cases for answer to this question.
1) Global variables modified by an interrupt service routine outside the scope: For example, a global variable can represent a data port (usually global pointer referred as memory mapped IO) which will be updated dynamically. The code reading data port must be declared as volatile in order to fetch latest data available at the port. Failing to declare variable as volatile, the compiler will optimize the code in such a way that it will read the port only once and keeps using the same value in a temporary register to speed up the program (speed optimization). In general, an ISR used to update these data port when there is an interrupt due to availability of new data
2) Global variables within a multi-threaded application: There are multiple ways for threads communication, viz, message passing, shared memory, mail boxes, etc. A global variable is weak form of shared memory. When two threads sharing information via global variable, they need to be qualified with volatile. Since threads run asynchronously, any update of global variable due to one thread should be fetched freshly by another consumer thread. Compiler can read the global variable and can place them in temporary variable of current thread context. To nullify the effect of compiler optimizations, such global variables to be qualified as volatile
If we do not use volatile qualifier, the following problems may arise
1) Code may not work as expected when optimization is turned on.
2) Code may not work as expected when interrupts are enabled and used.
volatile
comes from C. Type "C language volatile" into your favourite search engine (some of the results will probably come from SO), or read a book on C programming. There are plenty of examples out there.
A compiler assumes that the only way a variable can change its value is through code that changes it.
int a = 24;
Now the compiler assumes that a
is 24
until it sees any statement that changes the value of a
. If you write code somewhere below above statement that says
int b = a + 3;
the compiler will say "I know what a
is, it's 24
! So b
is 27
. I don't have to write code to perform that calculation, I know that it will always be 27
". The compiler may just optimize the whole calculation away.
But the compiler would be wrong in case a
has changed between the assignment and the calculation. However, why would a
do that? Why would a
suddenly have a different value? It won't.
If a
is a stack variable, it cannot change value, unless you pass a reference to it, e.g.
doSomething(&a);
The function doSomething
has a pointer to a
, which means it can change the value of a
and after that line of code, a
may not be 24
any longer. So if you write
int a = 24;
doSomething(&a);
int b = a + 3;
the compiler will not optimize the calculation away. Who knows what value a
will have after doSomething
? The compiler for sure doesn't.
Things get more tricky with global variables or instance variables of objects. These variables are not on stack, they are on heap and that means that different threads can have access to them.
// Global Scope
int a = 0;
void function ( ) {
a = 24;
b = a + 3;
}
Will b
be 27
? Most likely the answer is yes, but there is a tiny chance that some other thread has changed the value of a
between these two lines of code and then it won't be 27
. Does the compiler care? No. Why? Because C doesn't know anything about threads - at least it didn't used to (the latest C standard finally knows native threads, but all thread functionality before that was only API provided by the operating system and not native to C). So a C compiler will still assume that b
is 27
and optimize the calculation away, which may lead to incorrect results.
And that's what volatile
is good for. If you tag a variable volatile like that
volatile int a = 0;
you are basically telling the compiler: "The value of a
may change at any time. No seriously, it may change out of the blue. You don't see it coming and *bang*, it has a different value!". For the compiler that means it must not assume that a
has a certain value just because it used to have that value 1 pico-second ago and there was no code that seemed to have changed it. Doesn't matter. When accessing a
, always read its current value.
Overuse of volatile prevents a lot of compiler optimizations, may slow down calculation code dramatically and very often people use volatile in situations where it isn't even necessary. For example, the compiler never makes value assumptions across memory barriers. What exactly a memory barrier is? Well, that's a bit far beyond the scope of my reply. You just need to know that typical synchronization constructs are memory barriers, e.g. locks, mutexes or semaphores, etc. Consider this code:
// Global Scope
int a = 0;
void function ( ) {
a = 24;
pthread_mutex_lock(m);
b = a + 3;
pthread_mutex_unlock(m);
}
pthread_mutex_lock
is a memory barrier (pthread_mutex_unlock
as well, by the way) and thus it's not necessary to declare a
as volatile
, the compiler will not make an assumption of the value of a
across a memory barrier, never.
Objective-C is pretty much like C in all these aspects, after all it's just a C with extensions and a runtime. One thing to note is that atomic
properties in Obj-C are memory barriers, so you don't need to declare properties volatile
. If you access the property from multiple threads, declare it atomic
, which is even default by the way (if you don't mark it nonatomic
, it will be atomic
). If you never access it from multiple thread, tagging it nonatomic
will make access to that property a lot faster, but that only pays off if you access the property really a lot (a lot doesn't mean ten times a minute, it's rather several thousand times a second).
So you want Obj-C code, that requires volatile?
@implementation SomeObject {
volatile bool done;
}
- (void)someMethod {
done = false;
// Start some background task that performes an action
// and when it is done with that action, it sets `done` to true.
// ...
// Wait till the background task is done
while (!done) {
// Run the runloop for 10 ms, then check again
[[NSRunLoop currentRunLoop]
runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.01]
];
}
}
@end
Without volatile
, the compiler may be dumb enough to assume, that done
will never change here and replace !done
simply with true
. And while (true)
is an endless loop that will never terminate.
I haven't tested that with modern compilers. Maybe the current version of clang
is more intelligent than that. It may also depend on how you start the background task. If you dispatch a block, the compiler can actually easily see whether it changes done
or not. If you pass a reference to done
somewhere, the compiler knows that the receiver may the value of done
and will not make any assumptions. But I tested exactly that code a long time ago when Apple was still using GCC 2.x and there not using volatile
really caused an endless loop that never terminated (yet only in release builds with optimizations enabled, not in debug builds). So I would not rely on the compiler being clever enough to do it right.
Just some more fun facts about memory barriers:
If you ever had a look at the atomic operations that Apple offers in <libkern/OSAtomic.h>
, then you might have wondered why every operation exists twice: Once as x
and once as xBarrier
(e.g. OSAtomicAdd32
and OSAtomicAdd32Barrier
). Well, now you finally know it. The one with "Barrier" in its name is a memory barrier, the other one isn't.
Memory barriers are not just for compilers, they are also for CPUs (there exists CPU instructions, that are considered memory barriers while normal instructions are not). The CPU needs to know these barriers because CPUs like to reorder instructions to perform operations out of order. E.g. if you do
a = x + 3 // (1)
b = y * 5 // (2)
c = a + b // (3)
and the pipeline for additions is busy, but the pipeline for multiplication is not, the CPU may perform instruction (2)
before (1)
, after all the order won't matter in the end. This prevents a pipeline stall. Also the CPU is clever enough to know that it cannot perform (3)
before either (1)
or (2)
because the result of (3)
depends on the results of the other two calculations.
Yet, certain kinds of order changes will break the code, or the intention of the programmer. Consider this example:
x = y + z // (1)
a = 1 // (2)
The addition pipe might be busy, so why not just perform (2)
before (1)
? They don't depend on each other, the order shouldn't matter, right? Well, it depends. Consider another thread monitors a
for changes and as soon as a
becomes 1
, it reads the value of x
, which should now be y+z
if the instructions were performed in order. Yet if the CPU reordered them, then x
will have whatever value it used to have before getting to this code and this makes a difference as the other thread will now work with a different value, not the value the programmer would have expected.
So in this case the order will matter and that's why barriers are needed also for CPUs: CPUs don't order instructions across such barriers and thus instruction (2)
would need to be a barrier instruction (or there needs to be such an instruction between (1)
and (2)
; that depends on the CPU). However, reordering instructions is only performed by modern CPUs, a much older problem are delayed memory writes. If a CPU delays memory writes (very common for some CPUs, as memory access is horribly slow for a CPU), it will make sure that all delayed writes are performed and have completed before a memory barrier is crossed, so all memory is in a correct state in case another thread might now access it (and now you also know where the name "memory barrier" actually comes from).
You are probably working a lot more with memory barriers than you are even aware of (GCD - Grand Central Dispatch is full of these and NSOperation
/NSOperationQueue
bases on GCD), that's why your really need to use volatile
only in very rare, exceptional cases. You might get away writing 100 apps and never have to use it even once. However, if you write a lot low level, multi-threading code that aims to achieve maximum performance possible, you will sooner or later run into a situation where only volatile
can grantee you correct behavior; not using it in such a situation will lead to strange bugs where loops don't seem to terminate or variables simply seem to have incorrect values and you find no explanation for that. If you run into bugs like these, especially if you only see them in release builds, you might miss a volatile
or a memory barrier somewhere in your code.