I recently stumbled upon this Wikipedia article. From my experience with multi-threading I am aware of the multitude of issues caused by the program being able to switch threads at any point.
It's not the compiler, it's the CPU. (Well, both actually, but the CPU is the harder one to control.) Regardless of how your code gets compiled, the CPU will look ahead in the instruction stream and execute things out of order. A typical example: it will start a read early, since memory is much slower than the CPU (i.e. start it early and hope the result has arrived before you actually need it).
Both the CPU and the compiler optimize based on the same rule: reorder anything, as long as it doesn't affect the results of the program *assuming a single-threaded, single-processor environment*.
So there's the problem: it optimizes for single-threadedness, even when the program isn't single-threaded. Why? Because otherwise everything would be 100x slower. Really. And most of your code effectively is single-threaded (i.e. has only single-threaded interactions); only small parts need to interact across threads.
The best/easiest/safest way to control this is with locks - mutexes, semaphores, events, etc.
Only if you really, really need to optimize (based on careful measurement) should you look into memory barriers and atomic operations. These are the underlying primitives used to build mutexes etc., and when used correctly they limit out-of-order execution.
But before doing that kind of optimization, check that the algorithms and code flow are correct, and see whether you can further minimize the multi-threaded interactions.