I recently came across the following post on the ReSharper website. It was a discussion of double-checked locking and had the following code:
public class Foo
{
    private static volatile Foo instance;
    private static readonly object padlock = new object();

    public static Foo GetValue()
    {
        if (instance == null)
        {
            lock (padlock)
            {
                if (instance == null)
                {
                    instance = new Foo();
                    instance.Init();
                }
            }
        }
        return instance;
    }

    private void Init() { /* ... */ }
}
The big problem with the example is that the first null check happens outside the lock, so instance may already be non-null before Init has been called. This can lead to threads using instance before it has been initialized.
The correct version should therefore be:
public static Foo GetValue()
{
    if (instance == null)
    {
        lock (padlock)
        {
            if (instance == null)
            {
                var foo = new Foo();
                foo.Init();
                instance = foo;
            }
        }
    }
    return instance;
}
If I read the code correctly, the issue is:
Caller 1 begins the method, finds instance == null to be true, enters the lock, finds instance to STILL be null, and creates instance.
Before Init() is called, caller 1's thread is suspended and caller 2 enters the method. Caller 2 finds instance to NOT be null, and goes on to use it before caller 1 can initialize it.
It was my understanding that the .NET memory model (at least since 2.0) has not required that instance be declared as volatile, since lock provides a full memory fence. Is that not the case, or was I misinformed?
It is required. The reason is that you are accessing instance outside of the lock. Let us assume that you omit volatile and have already fixed the initialization problem like this:
public class Foo
{
    private static Foo instance;
    private static readonly object padlock = new object();

    public static Foo GetValue()
    {
        if (instance == null)
        {
            lock (padlock)
            {
                if (instance == null)
                {
                    var temp = new Foo();
                    temp.Init();
                    instance = temp;
                }
            }
        }
        return instance;
    }

    private void Init() { /* ... */ }
}
At some level the C# compiler, the JIT compiler, or the hardware could emit an instruction sequence that optimizes away the temp variable and causes the instance variable to be assigned before Init runs. In fact, it could assign instance even before the constructor runs. The Init method makes the issue much easier to spot, but the issue is still there for the constructor as well.
This is a valid optimization since instructions are free to be reordered within the lock. A lock does emit memory barriers, but only at the Monitor.Enter and Monitor.Exit calls.
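To see where those two barriers sit, it helps to recall what a lock statement expands to. The following is a sketch of the standard Monitor pattern the compiler emits (the C# 4+ form, with a placeholder body):

```csharp
using System.Threading;

object padlock = new object();
bool lockTaken = false;
try
{
    // Acquire barrier happens here, at Monitor.Enter.
    Monitor.Enter(padlock, ref lockTaken);

    // ...body of the lock: instructions in here are still free
    // to be reordered relative to one another...
}
finally
{
    if (lockTaken)
        Monitor.Exit(padlock);  // Release barrier happens here, at Monitor.Exit.
}
```

Nothing between Enter and Exit carries any ordering guarantee of its own; only the entry and exit points do.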
Now, if you do omit volatile the code will probably still work with most combinations of hardware and CLI implementations. That is because x86 hardware has a fairly strong memory model, and Microsoft's implementation of the CLR is also fairly strict. However, the ECMA specification on this subject is relatively loose, which means that another implementation of the CLI is free to make optimizations that Microsoft currently chooses to forgo. You have to code for the weaker model, which may be a CLI JIT compiler rather than the hardware that most people tend to focus on. This is why volatile is still required.
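To make the fix concrete, here is the same class with the volatile modifier added, a minimal sketch of the version the answer is arguing for (the Init body is a placeholder):

```csharp
public class Foo
{
    // volatile makes the assignment below a volatile (release) write,
    // so the constructor and Init() cannot be reordered past it.
    private static volatile Foo instance;
    private static readonly object padlock = new object();

    public static Foo GetValue()
    {
        if (instance == null)          // volatile read outside the lock
        {
            lock (padlock)
            {
                if (instance == null)
                {
                    var temp = new Foo();
                    temp.Init();
                    instance = temp;   // volatile write publishes the object
                }
            }
        }
        return instance;
    }

    private void Init() { /* placeholder */ }
}
```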
Isn't read/write reordering only observable with respect to multiple threads?
Yes. Instruction reordering only comes into play when more than one thread is accessing the same memory location. Even the weakest software and hardware memory models will not allow any optimization that changes the behavior the developer intended as observed from within the executing thread itself; otherwise, no program would execute correctly. The issue is with how other threads observe what is going on in that thread: they may perceive a different sequence of events than the executing thread does, but the executing thread always perceives the correct behavior.
It was my understanding that on a single thread, the side effects would be in a consistent order, and that the lock in place would prevent any other thread from observing something to be amiss. Am I off-base here as well?
No, a lock all by itself will not prevent other threads from perceiving a different sequence of events. The reason is that the executing thread could be executing the instructions inside the lock in a different order than the developer intended. It is only at the entry and exit points of the lock that the memory barriers are created. So in your example the reference to the new object could be assigned to instance even before the constructor runs, even though you have wrapped those instructions in a lock.
Using volatile, on the other hand, has a bigger impact on how the code inside the lock behaves than on the initial check of instance at the beginning of the method, despite common wisdom. A lot of people think the major issue is that instance might be stale without a volatile read. That may be the case, but the bigger issue is that without a volatile write inside the lock, another thread might see instance referring to an object whose constructor has not run yet. A volatile write solves this problem because it prevents the compiler from moving the constructor code after the write to instance. That is the big reason why volatile is still required.
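For completeness, the same acquire/release semantics can also be obtained without the volatile keyword by using the System.Threading.Volatile class (.NET 4.5+). This is a sketch of an alternative, not part of the original answer:

```csharp
using System.Threading;

public class Foo
{
    private static Foo instance;   // note: no volatile modifier
    private static readonly object padlock = new object();

    public static Foo GetValue()
    {
        // Volatile.Read gives acquire semantics for the unlocked check.
        if (Volatile.Read(ref instance) == null)
        {
            lock (padlock)
            {
                if (instance == null)   // reads inside the lock are safe
                {
                    var temp = new Foo();
                    temp.Init();
                    // Volatile.Write has release semantics: the constructor
                    // and Init() cannot be moved after this store.
                    Volatile.Write(ref instance, temp);
                }
            }
        }
        return instance;
    }

    private void Init() { /* placeholder */ }
}
```

The advantage of this form is that the cost of the volatile semantics is paid only at the two points that need it, rather than on every access to the field.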
On one hand a lock creates a "full fence", BUT what the quote below refers to is what goes on inside that fence in the double-checked locking case. See http://blogs.msdn.com/b/cbrumme/archive/2003/05/17/51445.aspx for an explanation.
It states:
However, we have to assume that a series of stores have taken place during construction of ‘a’. Those stores can be arbitrarily reordered, including the possibility of delaying them until after the publishing store which assigns the new object to ‘a’. At that point, there is a small window before the store.release implied by leaving the lock. Inside that window, other CPUs can navigate through the reference ‘a’ and see a partially constructed instance.
Replace a in the sentence above with instance from your example. Additionally, check out http://blogs.msdn.com/b/brada/archive/2004/05/12/130935.aspx - it explains what volatile achieves in your scenario.
For a nice explanation of fences and volatile, and how volatile has different effects depending on the processor you run the code on, see http://www.albahari.com/threading/part4.aspx. For even more (and better) information, see http://csharpindepth.com/Articles/General/Singleton.aspx.