Every modern OS today provides some atomic operations, for example the Interlocked* API on Windows.
Some projects use the libatomic-ops library for this:
http://packages.debian.org/source/sid/libatomic-ops
If you only want simple operations such as CAS, can't you just use the arch-specific implementations from the kernel, and do the arch checks in user space with autotools/cmake? As far as licensing goes, although the kernel is GPL, I think it's arguable that the inline assembly for these operations is provided by Intel/AMD, and that the kernel does not hold a license on it. It just happens to be in an easily accessible form in the kernel source.
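As a rough sketch of the arch-check idea above (not code taken from the kernel), a CAS wrapper could pick an x86 `lock cmpxchg` sequence when the compiler reports an x86 target and otherwise fall back to a compiler builtin. The name `cas_int` is hypothetical, the architecture is detected with predefined compiler macros rather than autotools/cmake, and the fallback assumes GCC or Clang, which provide `__sync_bool_compare_and_swap`.

```c
/* Hypothetical helper: returns 1 if *ptr was equal to oldval and has been
 * replaced by newval, 0 otherwise. */
static inline int cas_int(volatile int *ptr, int oldval, int newval)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char ok;
    /* cmpxchg compares EAX (oldval) with *ptr; if they are equal it stores
     * newval in *ptr and sets ZF, which sete copies into ok. */
    __asm__ __volatile__(
        "lock; cmpxchgl %3, %1\n\t"
        "sete %0"
        : "=q"(ok), "+m"(*ptr), "+a"(oldval)
        : "r"(newval)
        : "memory", "cc");
    return ok;
#else
    /* Assumed fallback for other architectures: the GCC/Clang builtin. */
    return __sync_bool_compare_and_swap(ptr, oldval, newval);
#endif
}
```

A caller would typically retry in a loop: read the current value, compute the new one, and call `cas_int` until it returns 1.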
The recent C and C++ standards (from 2011) specify atomic operations: <stdatomic.h> in C11 and <atomic> in C++11.
Regardless, your platform or compiler may not support these newer headers & features.
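Where a C11 compiler and library are available, an atomic add built on compare-and-swap might look like the sketch below. It uses only the standard `<stdatomic.h>` interface; the function name `atomic_add_c11` is made up for illustration.

```c
#include <stdatomic.h>

/* Atomically add val to *ptr with a CAS loop and return the new value. */
static int atomic_add_c11(atomic_int *ptr, int val)
{
    int old = atomic_load(ptr);
    /* On failure, atomic_compare_exchange_weak writes the value it found
     * into old, so the next attempt uses a fresh snapshot. */
    while (!atomic_compare_exchange_weak(ptr, &old, old + val)) {
        /* retry */
    }
    return old + val;
}
```

In practice `atomic_fetch_add` does the same job in a single call; the loop is spelled out only to show the CAS pattern.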
On ARM Linux, see kernel_user_helpers.txt or entry-armv.S and look for __kuser_cmpxchg. As described in the comments of other ARM Linux versions:
Location:       0xffff0fc0

Reference prototype:

  int __kuser_cmpxchg(int32_t oldval, int32_t newval, volatile int32_t *ptr);

Input:

  r0 = oldval
  r1 = newval
  r2 = ptr
  lr = return address

Output:

  r0 = success code (zero or non-zero)
  C flag = set if r0 == 0, clear if r0 != 0

Clobbered registers:

  r3, ip, flags

Definition:

  Atomically store newval in *ptr only if *ptr is equal to oldval.
  Return zero if *ptr was changed or non-zero if no exchange happened.
  The C flag is also set if *ptr was changed to allow for assembly
  optimization in the calling code.

Usage example:
typedef int (__kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
#define __kuser_cmpxchg (*(__kuser_cmpxchg_t *)0xffff0fc0)

int atomic_add(volatile int *ptr, int val)
{
        int old, new;

        do {
                old = *ptr;
                new = old + val;
        } while(__kuser_cmpxchg(old, new, ptr));

        return new;
}
Notes:
This is for use on Linux with ARMv3 and the swp primitive; only a very ancient ARM would not support it. Only a data abort or an interrupt can cause the spinning to fail, so the kernel watches for a user-space PC around 0xffff0fc0 and performs a PC fix-up when either a data abort or an interrupt occurs. All user-space libraries that support ARMv5 and lower use this facility.
For instance, QtConcurrent uses this.
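To illustrate how a library might select this facility at compile time, here is a minimal sketch; it is not taken from Qt or any other library. It calls the kernel helper at 0xffff0fc0 on 32-bit ARM Linux and otherwise assumes the GCC/Clang `__sync_bool_compare_and_swap` builtin. The name `cmpxchg_zero_on_success` is hypothetical, and the fallback is inverted so that it keeps the helper's "zero means the exchange happened" convention.

```c
#if defined(__linux__) && defined(__arm__)
/* 32-bit ARM Linux: call the kernel-provided helper at its fixed address. */
typedef int (kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
#define cmpxchg_zero_on_success (*(kuser_cmpxchg_t *)0xffff0fc0)
#else
/* Assumed fallback elsewhere: GCC/Clang builtin, inverted to return zero
 * when the exchange happened, like the kernel helper. */
static int cmpxchg_zero_on_success(int oldval, int newval, volatile int *ptr)
{
    return !__sync_bool_compare_and_swap(ptr, oldval, newval);
}
#endif

/* Same loop as the kernel's usage example: retry until the exchange succeeds. */
static int atomic_add(volatile int *ptr, int val)
{
    int old, sum;

    do {
        old = *ptr;
        sum = old + val;
    } while (cmpxchg_zero_on_success(old, sum, ptr));

    return sum;
}
```

Real code on ARM would also want to confirm that the kernel actually exposes the helper before relying on it; that check is left out here.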