Since C++11, local static variables are known to be initialized in a thread-safe manner (unless -fno-threadsafe-statics is given), as specified by the standard.
Generally, Objective-C++, which allows mixing Objective-C objects and code with C++ objects and code, is a different language than "pure" C++11. Therefore, I don't think that everything guaranteed for C++11 is automatically guaranteed in Objective-C++'s mixed world. I have spent some time searching Apple's documentation for specific guarantees on static local variables, or even block variables, in Objective-C++.
As I did not find a statement on this, I tried to provoke a race condition on the creation of an object in three ways: one with the proposed "new style", i.e. using a static local variable; one with the "old style" using dispatch_once; and one "real" race condition, "notOnlyOnce", that ignores any synchronization (just to be sure that the code actually introduces a race condition).
The tests show that both "new style" and "old style" seem to be thread safe, whereas "notOnlyOnce" clearly is not. Unfortunately, such a test could only have proven that "new style" produces a race condition; it cannot prove that there will never be one. But since "new style" and "old style" behave the same, while "notOnlyOnce" exhibits a race condition in the same setting, we can at least assume that static local variables work as you proposed.
See the following code and the respective outputs.
#import <UIKit/UIKit.h>
#import "AppDelegate.h"
#include <iostream>

@interface SingletonClass : NSObject
- (instancetype)init;
@end

@implementation SingletonClass
- (instancetype)init {
    self = [super init];
    std::cout << "Created a singleton object" << std::endl;
    // Busy loop to widen the window in which a race can occur.
    for (int i=0; i<1000000; i++) { i++; }
    return self;
}
@end

@interface TestClassObjCPP : NSObject
@property (nonatomic) SingletonClass *sc;
+ (SingletonClass *)onlyOnceNewStyle;
+ (SingletonClass *)onlyOnceOldStyle: (TestClassObjCPP*)caller;
+ (SingletonClass *)notOnlyOnce: (TestClassObjCPP*)caller;
@end

@implementation TestClassObjCPP
// "New style": relies on the static local variable's initialization.
+ (SingletonClass *)onlyOnceNewStyle {
    static SingletonClass *object = [[SingletonClass alloc] init];
    return object;
}

// "Old style": explicit synchronization via dispatch_once.
+ (SingletonClass *)onlyOnceOldStyle: (TestClassObjCPP*)caller {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        caller.sc = [[SingletonClass alloc] init];
    });
    return caller.sc;
}

// No synchronization at all: deliberately racy.
+ (SingletonClass *)notOnlyOnce: (TestClassObjCPP*)caller {
    if (caller.sc == nil)
        caller.sc = [[SingletonClass alloc] init];
    return caller.sc;
}
@end

int main(int argc, char * argv[]) {
    @autoreleasepool {
        std::cout << "Before loop requesting singleton." << std::endl;
        TestClassObjCPP *caller = [[TestClassObjCPP alloc] init];
        caller.sc = nil;
        for (int i=0; i<10000; i++) {
            dispatch_async(dispatch_get_global_queue( DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
                [TestClassObjCPP onlyOnceNewStyle]; // (1)
                // [TestClassObjCPP onlyOnceOldStyle:caller]; // (2)
                // [TestClassObjCPP notOnlyOnce:caller]; // (3)
            });
        }
        std::cout << "After loop requesting singleton." << std::endl;
        return UIApplicationMain(argc, argv, nil, NSStringFromClass([AppDelegate class]));
    }
}
Output for onlyOnceNewStyle (1):
Before loop requesting singleton.
Created a singleton object
After loop requesting singleton.
Output for onlyOnceOldStyle (2):
Before loop requesting singleton.
Created a singleton object
After loop requesting singleton.
Output for notOnlyOnce (3):
Before loop requesting singleton.
Created a singleton object
Created a singleton object
Created a singleton object
After loop requesting singleton.
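The "new style" part of the experiment can also be tried in plain C++, outside of Objective-C++. The following is only a minimal sketch under assumptions: it uses std::thread instead of GCD, and the names Singleton, g_constructions and RaceOnNewStyle are hypothetical helpers that do not appear in the code above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many times the "singleton" constructor actually ran.
static std::atomic<int> g_constructions{0};

struct Singleton {
    Singleton() { g_constructions.fetch_add(1); }
};

// "New style": C++11 requires the initializer to run exactly once,
// even when several threads reach the declaration concurrently.
static Singleton *OnlyOnceNewStyle() {
    static Singleton *object = new Singleton();
    return object;
}

// Hypothetical helper: calls OnlyOnceNewStyle() from `threadCount`
// threads and returns how many constructions were observed in total.
static int RaceOnNewStyle(int threadCount) {
    assert(threadCount > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([] { OnlyOnceNewStyle(); });
    }
    for (auto &t : threads) {
        t.join();
    }
    return g_constructions.load();
}
```

Calling RaceOnNewStyle(16) should report a single construction. As with the Objective-C++ test, a passing run only fails to show a race; it does not prove the absence of one.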
So not a clear yes or no, but I hope it helps in some way.
TL;DR - it seems that it's possible to use C++11 static variable initialization in a thread-safe manner, with the same performance characteristics as dispatch_once.
Following Stephan Lechner's answer, I wrote the simplest code that tests the C++ static initialization flow:
class Object {
};

static Object *GetObjectCppStatic() {
    static Object *object = new Object();
    return object;
}

int main() {
    GetObjectCppStatic();
}
Compiling this to assembly via clang++ test.cpp -O0 -fno-exceptions -S (-O0 to avoid inlining, although the same general code is produced for -Os; -fno-exceptions to simplify the generated code) shows that GetObjectCppStatic compiles to:
__ZL18GetObjectCppStaticv:              ## @_ZL18GetObjectCppStaticv
    .cfi_startproc
## BB#0:
    pushq   %rbp
Lcfi6:
    .cfi_def_cfa_offset 16
Lcfi7:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Lcfi8:
    .cfi_def_cfa_register %rbp
    cmpb    $0, __ZGVZL18GetObjectCppStaticvE6object(%rip)
    jne     LBB2_3
## BB#1:
    leaq    __ZGVZL18GetObjectCppStaticvE6object(%rip), %rdi
    callq   ___cxa_guard_acquire
    cmpl    $0, %eax
    je      LBB2_3
## BB#2:
    movl    $1, %eax
    movl    %eax, %edi
    callq   __Znwm
    leaq    __ZGVZL18GetObjectCppStaticvE6object(%rip), %rdi
    movq    %rax, __ZZL18GetObjectCppStaticvE6object(%rip)
    callq   ___cxa_guard_release
LBB2_3:
    movq    __ZZL18GetObjectCppStaticvE6object(%rip), %rax
    popq    %rbp
    retq
    .cfi_endproc
We can definitely see the calls to ___cxa_guard_acquire and ___cxa_guard_release, which are implemented by the libc++ ABI. Note that we didn't even have to tell clang that we use C++11, as apparently this was supported by default even prior to that.
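To make the shape of that assembly easier to follow, here is a hand-written approximation of the same flow in C++. It is only a sketch under assumptions: the guard byte is modeled with a std::atomic<bool> and the locking done inside the guard functions with a std::mutex; the real libc++abi implementation differs in detail, and all names below (g_initialized, g_guardMutex, g_initCount, GetObjectManualGuard) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class Object {};

// Hand-written stand-ins for the compiler-generated guard variable and
// for the lock taken inside the guard functions (both are assumptions).
static std::atomic<bool> g_initialized{false};
static std::mutex g_guardMutex;
static Object *g_object = nullptr;
static int g_initCount = 0;  // only written while holding g_guardMutex

static Object *GetObjectManualGuard() {
    // Fast path: corresponds to the `cmpb $0, <guard>` check that
    // skips the guard calls once initialization has completed.
    if (!g_initialized.load(std::memory_order_acquire)) {
        // Slow path: roughly the region bracketed by
        // ___cxa_guard_acquire and ___cxa_guard_release.
        std::lock_guard<std::mutex> lock(g_guardMutex);
        if (!g_initialized.load(std::memory_order_relaxed)) {
            g_object = new Object();
            ++g_initCount;
            g_initialized.store(true, std::memory_order_release);
        }
    }
    assert(g_object != nullptr);
    return g_object;
}
```

After the first call, every later call takes only the fast path: one atomic load and a branch, with no lock involved.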
So we know that both forms ensure thread-safe initialization of local statics. But what about performance? The following test code checks both methods with no contention (single threaded) and with heavy contention (multi threaded):
#include <cstdio>
#include <dispatch/dispatch.h>
#include <mach/mach_time.h>

class Object {
};

static double Measure(int times, void(^executionBlock)(), void(^finallyBlock)()) {
    struct mach_timebase_info timebaseInfo;
    mach_timebase_info(&timebaseInfo);

    uint64_t start = mach_absolute_time();
    for (int i = 0; i < times; ++i) {
        executionBlock();
    }
    finallyBlock();
    uint64_t end = mach_absolute_time();

    uint64_t timeTook = end - start;
    return ((double)timeTook * timebaseInfo.numer / timebaseInfo.denom) /
        NSEC_PER_SEC;
}

static Object *GetObjectDispatchOnce() {
    static Object *object;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        object = new Object();
    });
    return object;
}

static Object *GetObjectCppStatic() {
    static Object *object = new Object();
    return object;
}

int main() {
    printf("Single thread statistics:\n");
    printf("DispatchOnce took %g\n", Measure(10000000, ^{
        GetObjectDispatchOnce();
    }, ^{}));
    printf("CppStatic took %g\n", Measure(10000000, ^{
        GetObjectCppStatic();
    }, ^{}));
    printf("\n");

    dispatch_queue_t queue = dispatch_queue_create("queue",
        DISPATCH_QUEUE_CONCURRENT);
    dispatch_group_t group = dispatch_group_create();

    printf("Multi thread statistics:\n");
    printf("DispatchOnce took %g\n", Measure(1000000, ^{
        dispatch_group_async(group, queue, ^{
            GetObjectDispatchOnce();
        });
    }, ^{
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    }));
    printf("CppStatic took %g\n", Measure(1000000, ^{
        dispatch_group_async(group, queue, ^{
            GetObjectCppStatic();
        });
    }, ^{
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    }));
}
This yields the following results on x64 (times in seconds):
Single thread statistics:
DispatchOnce took 0.025486
CppStatic took 0.0232348
Multi thread statistics:
DispatchOnce took 0.285058
CppStatic took 0.32596
So, up to measurement error, it seems that the performance characteristics of both methods are similar, mostly due to the double-checked locking that both of them perform. For dispatch_once, this happens in the _dispatch_once function:
void
_dispatch_once(dispatch_once_t *predicate,
        DISPATCH_NOESCAPE dispatch_block_t block)
{
    if (DISPATCH_EXPECT(*predicate, ~0l) != ~0l) {
        // ...
    } else {
        // ...
    }
}
In the C++ static initialization flow, the same check happens right before the call to ___cxa_guard_acquire.
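For readers who want to repeat the single-threaded comparison on a platform without libdispatch, a rough portable variant might look like the sketch below. Note the substitutions, which are assumptions and not part of the measurements above: std::call_once stands in for dispatch_once, std::chrono::steady_clock replaces mach_absolute_time, and GetObjectCallOnce and MeasureSeconds are hypothetical names. Numbers from this variant are not comparable to the dispatch_once results.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

class Object {};

// Stand-in for the dispatch_once variant; std::call_once is used here
// only because libdispatch is not available everywhere (assumption).
static Object *GetObjectCallOnce() {
    static Object *object = nullptr;
    static std::once_flag onceFlag;
    std::call_once(onceFlag, [] { object = new Object(); });
    return object;
}

// Same function as in the benchmark above.
static Object *GetObjectCppStatic() {
    static Object *object = new Object();
    return object;
}

// Runs `fn` `times` times and returns the elapsed wall time in seconds.
template <typename Fn>
static double MeasureSeconds(int times, Fn fn) {
    assert(times >= 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < times; ++i) {
        fn();
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

A typical use would pass a lambda that calls one of the two getters, for example MeasureSeconds(10000000, [] { GetObjectCppStatic(); }). With optimizations enabled the compiler may remove part of the loop body, so results should be read with care.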