Every modern OS today provides some atomic operations, for example the Interlocked* API on Windows.
On Debian/Ubuntu I recommend:
sudo apt-get install libatomic-ops-dev
Examples: http://www.hpl.hp.com/research/linux/atomic_ops/example.php4
GCC & ICC compatible.
Compared to Intel Threading Building Blocks (TBB) using atomic<T>, libatomic-ops-dev is over twice as fast (with the Intel compiler).
Tested on Ubuntu on an i7: producer-consumer threads piping 10 million ints down a ring buffer connection took 0.5 s, as opposed to 1.2 s for TBB.
And it is easy to use, e.g.:
volatile AO_t head;
AO_fetch_and_add1(&head);
I recently implemented such a thing and was confronted with the same difficulties as you are. My solution was basically the following:
cmpxch with __asm__ for the other architectures (ARM is a bit more complicated than that). Just do that for one possible size, e.g. sizeof(int).
inline functions
Boost, which has a non-intrusive license, and other frameworks already offer portable atomic counters -- as long as they are supported on the target platform.
Third-party libraries are good for us. And if, for some strange reason, your company forbids you from using them, you can still look at how they do it (as long as the licence permits it for your use) to implement what you are looking for.
Darn. I was going to suggest the GCC primitives, then you said they were off limits. :-)
In that case, I would do an #ifdef for each architecture/compiler combination you care about and code up the inline asm. And maybe check for __GNUC__ or some similar macro and use the GCC primitives if they are available, because it feels so much more right to use those. :-)
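To make the #ifdef idea concrete, here is a minimal sketch, assuming a GCC-compatible compiler for the only branch that is filled in. The names my_cas_int and my_fetch_add_int are made up for illustration; the #else branch is where the per-architecture inline asm would go.

```c
/* Hypothetical portability layer: the my_* names are illustrative only. */
#if defined(__GNUC__)
/* GCC-compatible compilers provide the __sync builtins. */
static inline int my_cas_int(volatile int *p, int expected, int desired)
{
    /* Nonzero if *p equalled expected and was replaced by desired. */
    return __sync_bool_compare_and_swap(p, expected, desired);
}
#else
/* One branch of inline asm per architecture/compiler would go here. */
#error "no atomic compare-and-swap implemented for this compiler"
#endif

/* Built only on the primitive above: atomically add delta to *p
 * and return the value *p held before the addition. */
static inline int my_fetch_add_int(volatile int *p, int delta)
{
    int old;
    do {
        old = *p;                               /* snapshot current value */
    } while (!my_cas_int(p, old, old + delta)); /* retry if it changed    */
    return old;
}
```

Only the compare-and-swap has to be ported per platform; everything layered on top of it stays the same, which keeps the duplication down a little.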
You are going to have a lot of duplication and it might be difficult to verify correctness, but this seems to be the way a lot of projects do this, and I've had good results with it.
Some gotchas that have bitten me in the past: when using GCC, don't forget "asm volatile" and clobbers for "memory" and "cc", etc.
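For what it's worth, this is roughly what that looks like for an atomic add on x86 with GCC. It is a sketch, not a vetted implementation: my_xadd_int is a made-up name, and the fallback branch assumes a GCC-compatible compiler with the __sync builtins.

```c
/* Hypothetical helper; the name my_xadd_int is illustrative only. */
static inline int my_xadd_int(volatile int *p, int delta)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* lock xadd writes *p + delta to *p and leaves the old *p in delta.
     * "volatile" keeps the compiler from dropping or moving the asm;
     * "memory" stops it from caching memory values across the asm;
     * "cc" tells it that the flags register is modified. */
    __asm__ __volatile__("lock; xaddl %0, %1"
                         : "+r"(delta), "+m"(*p)
                         :
                         : "memory", "cc");
    return delta;
#else
    /* Assumption: on other targets, fall back to the GCC builtin. */
    return __sync_fetch_and_add(p, delta);
#endif
}
```

Calling my_xadd_int(&counter, 1) returns the value the counter held before the increment.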
There is a patch for GCC here to support ARM atomic operations. It will not help you on Intel, but you could examine the code: there is recent kernel support for older ARM architectures, and newer ones have the instructions built in, so you should be able to build something that works.
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00050.html
__sync* certainly is (and has been) supported by the Intel compiler, because GCC adopted these built-ins from there. Read the first paragraph on this page. Also see "Intel® C++ Compiler for Linux* Intrinsics Reference", page 198. It's from 2006 and describes exactly those built-ins.
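As a rough illustration of those built-ins, here is a small counter written directly against __sync_*. The my_counter type and function names are invented for this sketch; it assumes a compiler that provides the built-ins (GCC, or ICC as described above).

```c
/* Hypothetical counter type and helpers, for illustration only. */
typedef struct { volatile long value; } my_counter;

static inline long my_counter_inc(my_counter *c)
{
    /* Atomically adds 1; returns the value held before the increment. */
    return __sync_fetch_and_add(&c->value, 1L);
}

static inline long my_counter_dec(my_counter *c)
{
    /* Atomically subtracts 1; returns the value after the decrement. */
    return __sync_sub_and_fetch(&c->value, 1L);
}

static inline long my_counter_read(my_counter *c)
{
    __sync_synchronize();   /* full memory barrier before the load */
    return c->value;
}
```

Note the two naming families: __sync_fetch_and_* returns the old value, while __sync_*_and_fetch returns the new one.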
Regarding ARM support for older ARM CPUs: it cannot be done entirely in userspace, but it can be done in kernelspace (by disabling interrupts during the operation), and I think I read somewhere that this has been supported for quite a while now.
According to this PHP bug, dated 2011-10-08, __sync_* will only fail on
So with GCC > 4.3 (and 4.7 is the current one), you shouldn't have a problem with ARMv6 and newer. You shouldn't have any problem with ARMv5 either, as long as you are compiling for Linux.