Concurrency problems are incredibly difficult to debug. As a preventative measure, you can make it impossible to access shared objects without holding a mutex, and do it in a way that makes the rule easy for programmers to follow. I have seen this done by writing wrappers around the OS-provided mutexes, semaphores, and so on.
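As a rough illustration of the wrapper idea, here is a minimal sketch in standard C++ rather than any particular OS API. The names Guarded, with_lock, and count_with_four_threads are made up for this example. The shared value is private, so the only route to it is a call that holds the mutex for its whole duration.

```cpp
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: the protected value is private, so the only way to
// reach it is through with_lock(), which holds the mutex for the whole call.
template <typename T>
class Guarded {
public:
    explicit Guarded(T initial) : value_(std::move(initial)) {}

    // Runs fn(value) while holding the mutex and returns whatever fn returns.
    template <typename Fn>
    auto with_lock(Fn&& fn) -> decltype(fn(std::declval<T&>())) {
        std::lock_guard<std::mutex> hold(mutex_);
        return fn(value_);
    }

private:
    std::mutex mutex_;
    T value_;
};

// Usage: four threads each add 1000 to a shared counter.
int count_with_four_threads() {
    Guarded<int> counter(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&counter] {
            for (int i = 0; i < 1000; ++i) {
                counter.with_lock([](int& n) { ++n; });
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.with_lock([](int& n) { return n; });  // 4000
}
```

This is not airtight: a callback can still smuggle out a pointer or reference to the value. It does make the locked path the easy one, which is most of the point.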
Here are some confusing examples from my past:
I used to develop printer drivers for Windows. In order to prevent multiple threads from writing to the printer at the same time, our port monitor used a construction like this:
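// Pseudocode, because I can't remember the API
BOOL OpenPort()  { GrabCriticalSection();    return TRUE; }  // lock taken by whichever thread opens the port
BOOL ClosePort() { ReleaseCriticalSection(); return TRUE; }  // lock released by whichever thread closes it
BOOL WritePort() { writestuff();             return TRUE; }  // no locking of its own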
Unfortunately, each call to WritePort was from a different thread in the spooler's thread pool. We eventually got into a situation where OpenPort and ClosePort were called by different threads, causing a deadlock. (A Windows critical section is owned by the thread that entered it, so another thread cannot safely release it.) The solution for this is left as an exercise, because I can't remember what I did.
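One possible approach, which is not necessarily the fix that was actually used, is to guard the port with a lock that has no owning thread, so that it does not matter which thread closes the port. The sketch below uses standard C++ instead of the Win32 API. PortLock, g_port_lock, and open_and_close_on_different_threads are hypothetical names, and plain bool stands in for BOOL.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical lock with no thread ownership: any thread may release it.
class PortLock {
public:
    // Blocks until the port is free, then marks it busy.
    void acquire() {
        std::unique_lock<std::mutex> hold(mutex_);
        freed_.wait(hold, [this] { return !busy_; });
        busy_ = true;
    }

    // Marks the port free and wakes one waiter; callable from any thread.
    void release() {
        {
            std::lock_guard<std::mutex> hold(mutex_);
            busy_ = false;
        }
        freed_.notify_one();
    }

private:
    std::mutex mutex_;               // protects busy_ only, never held across calls
    std::condition_variable freed_;
    bool busy_ = false;
};

PortLock g_port_lock;  // hypothetical global, one per port

bool OpenPort()  { g_port_lock.acquire(); return true; }
bool ClosePort() { g_port_lock.release(); return true; }

// Open on one thread, close on another, then open again without hanging.
bool open_and_close_on_different_threads() {
    std::thread opener([] { OpenPort(); });
    opener.join();
    std::thread closer([] { ClosePort(); });
    closer.join();
    bool reopened = OpenPort();
    ClosePort();
    return reopened;
}
```

The internal std::mutex is always locked and unlocked within a single call, so it is never released by a thread that did not lock it. Only the busy flag outlives the call. This still relies on every OpenPort being paired with a ClosePort, and on writes happening only between the two.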
I also used to work on printer firmware. The printer in this case used an RTOS called uCOS (pronounced 'mucus'), so each function had its own task (print head motor, serial port, parallel port, network stack, etc.). One version of this printer had an internal option that plugged into a serial port on the printer motherboard. At some point, it was discovered that the printer would read the same result twice from this peripheral, and every value thereafter would be out of sequence. (E.g., the peripheral read the sequence 1, 7, 3, 56, 9, 230 but we would see 1, 7, 3, 3, 56, 9, 230. These values were reported to the computer and put into a database, so having a pile of documents with wrong ID numbers was Very Bad.) The root cause was a failure to respect the mutex that protected the device's read buffer. (Hence my advice at the beginning of this response.)