Question
I have been using C++ for a long time, and now I'm starting to learn assembly and how processors work (not just for fun; I have to as part of a test program). While learning assembly, I started hearing some of the terms that come up when people discuss multithreading, and I do a lot of multithreading in scientific computing. I'm struggling to get the full picture, and I'd appreciate help widening it.
I learned that a bus, in its simplest form, is something like a multiplexer followed by a demultiplexer. Each of the ends takes an address as input, in order to connect the two ends with some external component. The two ends can, based on the address, point to memory, graphics card, RAM, CPU registers, or anything else.
Now getting to my question: I keep hearing people argue about whether to use a mutex or an atomic for thread safety (I know there's no ultimate answer, and that's not what I'm asking; my question is about the comparison). Here, for example, the claim was made that atomics are so bad that they prevent a processor from doing a decent job because of bus locking.
Could someone please explain, in a little detail, what bus locking is and how it differs from mutexes, given that, AFAIK, mutexes need at least two atomic operations to lock and unlock.
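To make that comparison concrete, here is a minimal sketch (not from the original post; it assumes C++11 <atomic>) of a spinlock-style lock, where locking boils down to one atomic read-modify-write and unlocking to one atomic store:

```cpp
#include <atomic>

// Minimal spinlock sketch (for illustration only, not a real std::mutex):
// lock() is a single atomic read-modify-write (exchange), unlock() is a
// single atomic store -- the "at least two atomic operations" per
// lock/unlock pair mentioned above.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        // Atomically set the flag to true; keep spinning while it was
        // already true (i.e. someone else holds the lock).
        while (locked.exchange(true, std::memory_order_acquire)) {
            // busy-wait; a real mutex would eventually block in the kernel
        }
    }
    void unlock() {
        // One atomic store releases the lock.
        locked.store(false, std::memory_order_release);
    }
};
```

A real std::mutex does more than this (under contention it eventually blocks in the kernel instead of spinning), but its uncontended entry and exit paths are still built on atomic operations like these.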
Answer 1:
"I learned that a bus, in its simplest form, is something like a multiplexer followed by a demultiplexer. Each of the ends"
Well, that's not correct. In its simplest form there's nothing to multiplex or demultiplex: it's just two things talking directly to each other. In the not-so-simple case, a bus may have three or more devices connected. In that case you start needing bus addresses, because you can no longer talk about "the other end".
Now if you've got multiple devices on a single bus, they generally can't all talk at once, so there must be some mechanism to prevent that. Yet for all devices to be able to share the bus, they must take turns in who is talking to whom. Bus locking, as a broad term, is a deviation from that usual sharing pattern: two devices reserve the bus for their mutual conversation, so nothing else can use it in the meantime.
In the particular context of the x86 memory bus, this means keeping the bus locked for the duration of a read-modify-write cycle (as Kerrek SB pointed out in the comments). That may sound like a simple bus with two devices (CPU and memory), but DMA and multi-core chips make it not so simple.
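To see what such a read-modify-write looks like from C++ (a sketch of mine, not the answerer's): an atomic increment loads the old value, adds one, and stores the result as one indivisible step, and on x86 compilers typically emit it as a single lock-prefixed instruction, which is exactly where this bus/cache locking comes into play:

```cpp
#include <atomic>

std::atomic<int> counter{0};

void hit() {
    // Read-modify-write: load, add 1, store, all as one indivisible
    // operation. On x86 this typically compiles to a single
    // `lock xadd` (or `lock add`) instruction.
    counter.fetch_add(1, std::memory_order_relaxed);
}
```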
Answer 2:
From Intel® 64 and IA-32 Architectures Software Developer’s Manual:
Beginning with the P6 family processors, when the LOCK prefix is prefixed to an instruction and the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted. Instead, only the processor's cache is locked. Here, the processor's cache coherency mechanism ensures that the operation is carried out atomically with regards to memory.
There are special non-temporal store instructions to bypass the cache. All other loads and stores normally go through the cache, unless the memory page is marked as non-cacheable (like GPU or PCIe device memory).
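As an illustration (my own sketch, assuming x86 with SSE2; not part of the quoted manual), a non-temporal store can be requested explicitly with an intrinsic, while an ordinary store goes through the cache:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86/x86-64 only)

void write_values(int* dst, int ordinary, int streamed) {
    dst[0] = ordinary;                  // normal store: goes through the cache
    _mm_stream_si32(&dst[1], streamed); // non-temporal store (movnti):
                                        // bypasses the cache
    _mm_sfence();                       // fence so the streaming store is
                                        // ordered before later stores
}
```

The _mm_sfence() call is the usual way to order a streaming store with respect to later stores, since non-temporal stores are weakly ordered.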
Source: https://stackoverflow.com/questions/43365382/what-is-bus-locking-in-the-context-of-atomic-variables