Is there a way to implement a singleton object in C++ that is:
read on weak memory model. It can break double-checked locks and spinlocks. Intel is strong memory model (yet), so on Intel it's easier
carefully use "volatile" to avoid caching of parts the object in registers, otherwise you'll have initialized the object pointer, but not the object itself, and the other thread will crash
the order of static variables initialization versus shared code loading is sometimes not trivial. I've seen cases when the code to destruct an object was already unloaded, so the program crashed on exit
such objects are hard to destroy properly
In general singletons are hard to do right and hard to debug. It's better to avoid them altogether.
For gcc, this is rather easy:
LazyType* GetMyLazyGlobal() {
static const LazyType* instance = new LazyType();
return instance;
}
GCC will make sure that the initialization is atomic. For VC++, this is not the case. :-(
One major issue with this mechanism is the lack of testability: if you need to reset the LazyType to a new one between tests, or want to change the LazyType* to a MockLazyType*, you won't be able to. Given this, it's usually best to use a static mutex + static pointer.
Also, possibly an aside: It's best to always avoid static non-POD types. (Pointers to PODs are OK.) The reasons for this are many: as you mention, initialization order isn't defined -- neither is the order in which destructors are called though. Because of this, programs will end up crashing when they try to exit; often not a big deal, but sometimes a showstopper when the profiler you are trying to use requires a clean exit.
While this question has already been answered, I think there are some other points to mention:
But as already stated, you can't guarantee threadsafe lazy-initialisation without using at least one synchronisation primitive.
Edit: Yup Derek, you're right. My bad. :)
Unfortunately, Matt's answer features what's called double-checked locking which isn't supported by the C/C++ memory model. (It is supported by the Java 1.5 and later — and I think .NET — memory model.) This means that between the time when the pObj == NULL
check takes place and when the lock (mutex) is acquired, pObj
may have already been assigned on another thread. Thread switching happens whenever the OS wants it to, not between "lines" of a program (which have no meaning post-compilation in most languages).
Furthermore, as Matt acknowledges, he uses an int
as a lock rather than an OS primitive. Don't do that. Proper locks require the use of memory barrier instructions, potentially cache-line flushes, and so on; use your operating system's primitives for locking. This is especially important because the primitives used can change between the individual CPU lines that your operating system runs on; what works on a CPU Foo might not work on CPU Foo2. Most operating systems either natively support POSIX threads (pthreads) or offer them as a wrapper for the OS threading package, so it's often best to illustrate examples using them.
If your operating system offers appropriate primitives, and if you absolutely need it for performance, instead of doing this type of locking/initialization you can use an atomic compare and swap operation to initialize a shared global variable. Essentially, what you write will look like this:
MySingleton *MySingleton::GetSingleton() {
if (pObj == NULL) {
// create a temporary instance of the singleton
MySingleton *temp = new MySingleton();
if (OSAtomicCompareAndSwapPtrBarrier(NULL, temp, &pObj) == false) {
// if the swap didn't take place, delete the temporary instance
delete temp;
}
}
return pObj;
}
This only works if it's safe to create multiple instances of your singleton (one per thread that happens to invoke GetSingleton() simultaneously), and then throw extras away. The OSAtomicCompareAndSwapPtrBarrier
function provided on Mac OS X — most operating systems provide a similar primitive — checks whether pObj
is NULL
and only actually sets it to temp
to it if it is. This uses hardware support to really, literally only perform the swap once and tell whether it happened.
Another facility to leverage if your OS offers it that's in between these two extremes is pthread_once
. This lets you set up a function that's run only once - basically by doing all of the locking/barrier/etc. trickery for you - no matter how many times it's invoked or on how many threads it's invoked.
Basically, you're asking for synchronized creation of a singleton, without using any synchronization (previously-constructed variables). In general, no, this is not possible. You need something available for synchronization.
As for your other question, yes, static variables which can be statically initialized (i.e. no runtime code necessary) are guaranteed to be initialized before other code is executed. This makes it possible to use a statically-initialized mutex to synchronize creation of the singleton.
From the 2003 revision of the C++ standard:
Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (3.9) with static storage duration initialized with constant expressions (5.19) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
If you know that you will be using this singleton during the initialization of other static objects, I think you'll find that synchronization is a non-issue. To the best of my knowledge, all major compilers initialize static objects in a single thread, so thread-safety during static initialization. You can declare your singleton pointer to be NULL, and then check to see if it's been initialized before you use it.
However, this assumes that you know that you'll use this singleton during static initialization. This is also not guaranteed by the standard, so if you want to be completely safe, use a statically-initialized mutex.
Edit: Chris's suggestion to use an atomic compare-and-swap would certainly work. If portability is not an issue (and creating additional temporary singletons is not a problem), then it is a slightly lower overhead solution.
Here's a very simple lazily constructed singleton getter:
Singleton *Singleton::self() {
static Singleton instance;
return &instance;
}
This is lazy, and the next C++ standard (C++0x) requires it to be thread safe. In fact, I believe that at least g++ implements this in a thread safe manner. So if that's your target compiler or if you use a compiler which also implements this in a thread safe manner (maybe newer Visual Studio compilers do? I don't know), then this might be all you need.
Also see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2513.html on this topic.