Are some implementations better than others for specific applications? Is there anything to earn by rolling out your own?
Building on Adamski's test-and-set
suggestion, you should also look at the concept of "fast user-space mutexes" or futexes.
Futexes have the desirable property that they do not require a kernel system call in the common cases of locking or unlocking an uncontended mutex. In these cases, the user-mode code successfully uses an atomic compare and swap (CAS) operation to lock or unlock the mutex.
If CAS fails, the mutex is contended and a kernel system call -- sys_futex
under Linux -- must be used either to wait for the mutex (in the lock case) or to wake other threads (in the unlock case).
If you're serious about implementing this yourself, make sure you also read Ulrich Drepper's paper.
Check out the description of the Test-and-set machine instruction on Wikipedia, which alludes to how atomic operations are achieved at the machine level. I can imagine most language-level mutex implementations rely on machine-level support such as Test-and-set.
A mutex preferably runs in the kernel of the operating system while keeping the amount of code around it as short as possible, so it can avoid being cut-off while task-switching to another process. The exact implementation is therefore a bit of a secret. It's not complex though. It's basically an object that has a boolean field, which it gets and sets.
Around the basic mutex logic there are wrappers to wrap it in an object.. Then more wrapper objects to make it available outside the kernel. And then another wrapper to make it available in .NET. And then several programmers will write their own wrapper code around this all for their own logical needs. The wrappers around wrappers really make them a murky territory.
Now, with this basic knowledge about the internals of mutexes, all I hope is that you're going to use one implementation that relies on the kernel and the hardware underneath. These would be the most reliable. (If the hardware supports these.) If the mutex that you're using doesn't work at this kernel/hardware level then it can still be reliable but I would advise to not use it, unless there's no alternative.
As far as I know, Windows, Linux and .NET will all use mutexes at kernel/hardware level.
The Wikipedia page that I've linked to explains more about the internal logic and possible implementations. Preferably, a mutex is controlled by the hardware, thus making the whole getting/setting of the mutex an indivisible step. (Just to make sure the system doesn't switch tasks in-between.)
A bit of assembly to demonstrate locking atomically:
; BL is the mutex id
; shared_val, a memory address
CMP [shared_val],BL ; Perhaps it is locked to us anyway
JZ .OutLoop2
.Loop1:
CMP [shared_val],0xFF ; Free
JZ .OutLoop1 ; Yes
pause ; equal to rep nop.
JMP .Loop1 ; Else, retry
.OutLoop1:
; Lock is free, grab it
MOV AL,0xFF
LOCK CMPXCHG [shared_val],BL
JNZ .Loop1 ; Write failed
.OutLoop2: ; Lock Acquired
Interlocked.CompareExchange is enough to implement spinlocks. It's pretty difficult to do right though. See for Joe Duffy's blog for an example of the subtleties involved.
I used Reflector.NET to decompile the source for System.Threading.ReaderWriterLockSlim
, which was added to a recent version of the .NET framework.
It mostly uses Interlocked.CompareExchange
, Thread.SpinWait
and Thread.Sleep
to achieve synchronisation. There are a few EventWaitHandle
(kernel object) instances that are used under some circumstances.
There's also some complexity added to support reentrancy on a single thread.
If you're interested in this area and working in .NET (or at least, can read it) then you might find it quite interesting to check this class out.