Can DMB instructions be safely omitted in ARM Cortex M4?

I am going through the assembly generated by GCC for an ARM Cortex M4, and noticed that atomic_compare_exchange_weak gets two DMB instructions inserted around the condition (compiled with GCC 4.9 using -std=gnu11 -O2):

// if (atomic_compare_exchange_weak(&address, &x, y))
dmb      sy
ldrex    r0, [r3]
cmp      r0, r2
itt      eq
strexeq  lr, r1, [r3]
cmpeq.w  lr, #0
dmb      sy
bne.n    ...

The programming guide to barrier instructions for ARM Cortex M4 states that:

Omitting the DMB or DSB instruction in the examples in Figure 41 and Figure 42 would not cause any error because the Cortex-M processors:

do not re-order memory transfers

do not permit two write transfers to be overlapped.

Is there any reason why these instructions couldn't be removed when targeting Cortex M?

I don't know whether Cortex M4 can be used in a multi-CPU/multi-core configuration, but in general:

Memory barriers are never necessary (can always be omitted) in single-core systems.
Memory barriers are always necessary (can never be omitted) in multi-core systems where threads/processes operating on the same memory may be running on different cores.

Presence or lack of reordering memory writes at the hardware level is irrelevant.
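What still matters on a single core is reordering by the compiler. As a rough sketch (not something from the question), C11's atomic_signal_fence constrains the compiler without requesting a hardware barrier, so it would not normally be expected to produce a DMB. The names publish, consume, payload and ready are hypothetical and only serve the illustration, which assumes the two functions run on the same core, for example main code and an interrupt handler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical data shared between main code and an interrupt handler
   running on the same core. */
static int payload;
static atomic_bool ready;

/* Writer: store the data, then raise the flag. */
void publish(int value)
{
    payload = value;
    /* Compiler-only fence: keeps the payload store ahead of the flag
       store in the generated code. */
    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: returns true and copies the payload once the flag is set. */
bool consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    /* Compiler-only fence: keeps the payload load after the flag load. */
    atomic_signal_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

This is only valid under the single-core assumption stated above; code that may run on several cores needs the ordinary atomic orderings instead.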

Of course I would expect the DMB instruction to be essentially free on chips that don't support SMP, so I'm not sure why you'd want to try to hack it out.

Please note that, because the question refers to the code the compiler produces for atomic intrinsics, I'm assuming the context is synchronization of atomics so that they match the high-level specification, not other uses such as IO barriers for MMIO. The "never" above should not be read as applying to that (unrelated) use, though I suspect, for the reasons you already cited, that such barriers aren't needed on Cortex M4 either.
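If the goal is to avoid the barriers without patching the generated assembly, one option is to ask for a weaker ordering in the source. The sketch below contrasts the default, sequentially consistent form used in the question with the explicit relaxed form. The function names cas_seq_cst and cas_relaxed are made up for this example, and the assumption that GCC leaves out the DMB pair for the relaxed version should be checked against the disassembly for your compiler version and flags.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Default form: sequentially consistent ordering, as in the question. */
bool cas_seq_cst(atomic_int *addr, int *expected, int desired)
{
    return atomic_compare_exchange_weak(addr, expected, desired);
}

/* Relaxed form: still an atomic read-modify-write, but with no ordering
   guarantees relative to other memory accesses. */
bool cas_relaxed(atomic_int *addr, int *expected, int desired)
{
    return atomic_compare_exchange_weak_explicit(addr, expected, desired,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed);
}
```

Both are weak compare-exchanges, so either may fail spuriously and is normally called in a retry loop. The relaxed form is only appropriate when nothing else depends on the ordering of surrounding accesses.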

Source: https://stackoverflow.com/questions/50800118/can-dmb-instructions-be-safely-omitted-in-arm-cortex-m4

Tags

c11

cortex-m

memory-barriers