stdatomic

C++11 on modern Intel: am I crazy or are non-atomic aligned 64-bit load/store actually atomic?

Submitted by 倖福魔咒の on 2020-05-16 08:04:21
Question: Can I base a mission-critical application on the results of this test, in which 100 threads reading a pointer set a billion times by a main thread never see a tear? Are there any other potential problems with doing this besides tearing? Here's a stand-alone demo that compiles with g++ -g tear.cxx -o tear -pthread: #include <atomic> #include <thread> #include <vector> using namespace std; void* pvTearTest; atomic<int> iTears( 0 ); void TearTest( void ) { while (1) { void* pv = (void*) pvTearTest; intptr_t i = …

C11 Standalone memory barriers LoadLoad StoreStore LoadStore StoreLoad

Submitted by 左心房为你撑大大i on 2020-05-15 08:23:24
Question: I want to use standalone memory barriers between atomic and non-atomic operations (I think it shouldn't matter either way). I think I understand what a store barrier and a load barrier mean, and also the four possible types of memory reordering: LoadLoad, StoreStore, LoadStore, StoreLoad. However, I always find the acquire/release concepts confusing, because when reading the documentation, acquire doesn't only speak about loads but also stores, and release doesn't only speak about stores …

A readers/writer lock… without having a lock for the readers?

Submitted by 我怕爱的太早我们不能终老 on 2020-05-15 04:56:26
Question: I get the feeling this may be a very general and common situation for which a well-known lock-free solution exists. In a nutshell, I'm hoping there's an approach like a readers/writer lock, but one that doesn't require the readers to acquire a lock and can thus achieve better average performance. Instead there'd be some atomic operations (128-bit CAS) for a reader, and a mutex for a writer. I'd have two copies of the data structure: a read-only one for the normally-successful queries, and an identical copy …

Can atomic operations on a non-atomic<> pointer be safe and faster than atomic<>?

Submitted by 心不动则不痛 on 2020-04-30 06:29:13
Question: I have a dozen threads reading a pointer, and one thread that may change that pointer maybe once an hour or so. The readers are super, super, super time-sensitive. I hear that atomic<char**> or the like runs at the speed of going to main memory, which I want to avoid. On modern (say, 2012 and later) server and high-end desktop Intel hardware, can an 8-byte-aligned regular pointer be guaranteed not to tear if read and written normally? A test of mine runs an hour without seeing a tear. Otherwise, would it …

C11 Atomic Acquire/Release and x86_64 lack of load/store coherence?

Submitted by 时光怂恿深爱的人放手 on 2020-03-17 10:58:59
Question: I am struggling with Section 5.1.2.4 of the C11 Standard, in particular the semantics of Release/Acquire. I note that https://preshing.com/20120913/acquire-and-release-semantics/ (amongst others) states that: ... Release semantics prevent memory reordering of the write-release with any read or write operation that precedes it in program order. So, for the following: typedef struct test_struct { _Atomic(bool) ready ; int v1 ; int v2 ; } test_struct_t ; extern void test_init(test_struct_t* ts, …

std::memory_order and instruction order, clarification

Submitted by 做~自己de王妃 on 2020-02-02 12:34:07
Question: This is a follow-up question to this one. I want to figure out exactly what instruction ordering means and how it is affected by std::memory_order_acquire, std::memory_order_release, etc. The question I linked already provides some detail, but I felt the answer there wasn't really about the ordering itself (which is more what I was looking for) but rather about motivating why this is necessary. I'll quote the same example, which I'll use as a reference: #include <thread> …

When should you not use [[carries_dependency]]?

Submitted by 好久不见. on 2020-01-13 05:34:07
Question: I've found questions (like this one) asking what [[carries_dependency]] does, and that's not what I'm asking here. I want to know when you shouldn't use it, because the answers I've read all make it sound like you can plaster this attribute everywhere and magically get equal or faster code. One comment said the code can be equal or slower, but the poster didn't elaborate. I imagine the appropriate places to use it are on any function return or parameter that is a pointer or reference and that …

Loads and stores reordering on ARM

Submitted by 試著忘記壹切 on 2020-01-13 04:23:06
Question: I'm not an ARM expert, but won't those stores and loads be subject to reordering on at least some ARM architectures? atomic<int> atomic_var; int nonAtomic_var; int nonAtomic_var2; void foo() { atomic_var.store(111, memory_order_relaxed); atomic_var.store(222, memory_order_relaxed); } void bar() { nonAtomic_var = atomic_var.load(memory_order_relaxed); nonAtomic_var2 = atomic_var.load(memory_order_relaxed); } I've had no success in making the compiler put memory barriers between them. I've …