memory-fences

atomic<bool> vs bool protected by mutex

Submitted by 本秂侑毒 on 2020-08-23 04:55:28
Question: Let's assume we have a memory area to which some thread is writing data. It then turns its attention elsewhere and allows arbitrary other threads to read the data. However, at some point it wants to reuse that memory area and will write to it again. The writer thread supplies a boolean flag (valid), which indicates that the memory is still valid to read from (i.e. it is not being reused yet). At some point it will set this flag to false and never set it to true again (it just flips

In OpenCL, what does mem_fence() do, as opposed to barrier()?

Submitted by 為{幸葍}努か on 2020-01-02 01:04:08
Question: Unlike barrier() (which I think I understand), mem_fence() does not affect all items in the work group. The OpenCL spec says (section 6.11.10) of mem_fence(): "Orders loads and stores of a work-item executing a kernel." (so it applies to a single work-item). But at the same time, in section 3.3.1, it says: "Within a work-item memory has load / store consistency." So within a work-item the memory is already consistent. So what kind of thing is mem_fence() useful for? It doesn't work across items,

Are volatile reads and writes atomic on Windows+VisualC?

Submitted by 南笙酒味 on 2019-12-29 03:39:31
Question: There are a couple of questions on this site asking whether using a volatile variable for atomic/multithreaded access is possible: see here, here, or here for example. Now, the C(++) standard-conformant answer is obviously no. However, with the Visual C++ compiler on Windows, the situation seems not so clear. I have recently answered and cited the official MSDN docs on volatile: "Microsoft Specific: Objects declared as volatile are (...) A write to a volatile object (volatile write) has Release

Where to place fences/memory barriers to guarantee a fresh read/committed writes?

Submitted by 那年仲夏 on 2019-12-28 05:46:54
Question: Like many other people, I've always been confused by volatile reads/writes and fences. So now I'm trying to fully understand what these do. A volatile read is supposed to (1) exhibit acquire semantics and (2) guarantee that the value read is fresh, i.e. not a cached value. Let's focus on (2). Now, I've read that if you want to perform a volatile read, you should introduce an acquire fence (or a full fence) after the read, like this: int local = shared; Thread.MemoryBarrier(); How

Is atomic decrementing more expensive than incrementing?

Submitted by 帅比萌擦擦* on 2019-12-20 17:35:09
Question: In his blog, Herb Sutter writes: "[...] because incrementing the smart pointer reference count can usually be optimized to be the same as an ordinary increment in an optimized shared_ptr implementation — just an ordinary increment instruction, and no fences, in the generated code. However, the decrement must be an atomic decrement or equivalent, which generates special processor memory instructions that are more expensive in themselves, and that on top of that induce memory fence restrictions on

Out of Order Execution and Memory Fences

Submitted by 主宰稳场 on 2019-12-20 09:41:17
Question: I know that modern CPUs can execute out of order; however, they always retire the results in order, as described by Wikipedia: "Out of order processors fill these 'slots' in time with other instructions that are ready, then re-order the results at the end to make it appear that the instructions were processed as normal." Now memory fences are said to be required on multicore platforms because, owing to out-of-order execution, the wrong value of x can be printed here. Processor #1: while f

Do memory barriers guarantee a fresh read in C#?

Submitted by 淺唱寂寞╮ on 2019-12-18 06:56:42
Question: If we have the following code in C#: int a = 0; int b = 0; void A() // runs in thread A { a = 1; Thread.MemoryBarrier(); Console.WriteLine(b); } void B() // runs in thread B { b = 1; Thread.MemoryBarrier(); Console.WriteLine(a); } The MemoryBarriers make sure that the write instruction takes place before the read. However, is it guaranteed that the write of one thread is seen by the read on the other thread? In other words, is it guaranteed that at least one thread prints 1, or that both thread

jni/java: thread safe publishing/sharing of effectively immutable native object

Submitted by 倖福魔咒の on 2019-12-13 19:18:22
Question: 1) I have a native Java function which passes several params; its implementation is a native C++ constructor that creates an object and returns a long which is cast from the pointer to the object. This object's constructed members are effectively immutable. The C++ object can then do work based on its constructed state. 2) Java code that gets the result of the function call safely publishes the longified version of the pointer somewhere (without a mutex) and changes a volatile variable to hopefully

Intel 64 and IA-32 | Atomic operations including acquire / release semantic

Submitted by 风格不统一 on 2019-12-12 07:10:08
Question: According to the Intel 64 and IA-32 Architectures Software Developer's Manual, the LOCK signal prefix "ensures that the processor has exclusive use of any shared memory while the signal is asserted". That can be in the form of a bus or cache lock. But - and that's the reason I'm asking this question - it isn't clear to me whether this prefix also provides a memory barrier. I'm developing with NASM in a multi-processor environment and need to implement atomic operations with optional acquire

Possible to use C11 fences to reason about writes from other threads?

Submitted by 无人久伴 on 2019-12-10 11:19:28
Question: Adve and Gharachorloo's report, in Figure 4b, provides the following example of a program that exhibits unexpected behavior in the absence of sequential consistency. My question is whether it is possible, using only C11 fences and memory_order_relaxed loads and stores, to ensure that register1, if written, will be written with the value 1. The reason this might be hard to guarantee in the abstract is that P1, P2, and P3 could be at different points in a pathological NUMA network with the