What are the correct memory orders to use when inserting a node at the beginning of a lock free singly linked list?

混江龙づ霸主 提交于 2019-12-11 01:40:23

问题


I have a simple linked list. There is no danger of the ABA problem, I'm happy with Blocking category and I don't care if my list is FIFO, LIFO or randomized. At long as the inserting succeeds without making others fails.

The code for that looks something like this:

class Class {
  std::atomic<Node*> m_list;
  ...
};

void Class::add(Node* node)
{
  node->next = m_list.load(std::memory_order_acquire);
  while (!m_list.compare_exchange_weak(node->next, node, std::memory_order_acq_rel, std::memory_order_acquire));
}

where I more or less randomly filled in the used memory_order's. What are the right memory orders to use here?

I've seen people use std::memory_order_relaxed in all places, one guy on SO used that too, but then std::memory_order_release for the success case of compare_exchange_weak -- and the genmc project uses memory_order_acquire / twice memory_order_acq_rel in a comparable situation, but I can't get genmc to work for a test case :(.


回答1:


Using the excellent tool from Michalis Kokologiannakis genmc, I was able to verify the required memory orders with the following test code. Unfortunately, genmc currently requires C code, but that doesn't matter for figuring out what the memory orders need to be of course.

// Install https://github.com/MPI-SWS/genmc
//
// Then test with:
//
// genmc -unroll 5 -- genmc_sll_test.c

// These header files are replaced by genmc (see /usr/local/include/genmc):
#include <pthread.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define PRODUCER_THREADS 3
#define CONSUMER_THREADS 2

struct Node
{
  struct Node* next;
};

struct Node* const deleted = (struct Node*)0xd31373d;

_Atomic(struct Node*) list;

void* producer_thread(void* node_)
{
  struct Node* node = (struct Node*)node_;

  // Insert node at beginning of the list.
  node->next = atomic_load_explicit(&list, memory_order_relaxed);
  while (!atomic_compare_exchange_weak_explicit(&list, &node->next,
             node, memory_order_release, memory_order_relaxed))
    ;

  return NULL;
}

void* consumer_thread(void* param)
{
  // Replace the whole list with an empty list.
  struct Node* head = atomic_exchange_explicit(&list, NULL, memory_order_acquire);
  // Delete each node that was in the list.
  while (head)
  {
    struct Node* orphan = head;
    head = orphan->next;
    // Mark the node as deleted.
    assert(orphan->next != deleted);
    orphan->next = deleted;
  }

  return NULL;
}

pthread_t t[PRODUCER_THREADS + CONSUMER_THREADS];
struct Node n[PRODUCER_THREADS]; // Initially filled with zeroes -->
                                 // none of the Node's is marked as deleted.

int main()
{
  // Start PRODUCER_THREADS threads that each append one node to the queue.
  for (int i = 0; i < PRODUCER_THREADS; ++i)
    if (pthread_create(&t[i], NULL, producer_thread, &n[i]))
      abort();

  // Start CONSUMER_THREAD threads that each delete all nodes that were added so far.
  for (int i = 0; i < CONSUMER_THREADS; ++i)
    if (pthread_create(&t[PRODUCER_THREADS + i], NULL, consumer_thread, NULL))
      abort();

  // Wait till all threads finished.
  for (int i = 0; i < PRODUCER_THREADS + CONSUMER_THREADS; ++i)
    if (pthread_join(t[i], NULL))
      abort();

  // Count number of elements still in the list.
  struct Node* l = list;
  int count = 0;
  while (l)
  {
    ++count;
    l = l->next;
  }

  // Count the number of deleted elements.
  int del_count = 0;
  for (int i = 0; i < PRODUCER_THREADS; ++i)
    if (n[i].next == deleted)
      ++del_count;

  assert(count + del_count == PRODUCER_THREADS);
  //printf("count = %d; deleted = %d\n", count, del_count);

  return 0;
}

The output of which is

$ genmc -unroll 5 -- genmc_sll_test.c
Number of complete executions explored: 6384
Total wall-clock time: 1.26s

Replacing either the memory_order_release or memory_order_acquire with memory_order_relaxed causes an assertion.

In fact, it can be checked that using exclusive memory_order_relaxed when just inserting nodes is sufficient to get them all cleanly in the list (although in a 'random' order - there is nothing sequential consistent, so the order in which they are added is not necessarily the same as that the threads try to add them, if such correlation exists for other reasons).

However, the memory_order_release is required so that when head is read with memory_order_acquire we can be certain that all non-atomic next pointers are visible in the "consumer" thread.

Note there is no ABA problem here because values used for head and next cannot be "reused" before they are deleted by the 'consumer_thread' function, which is the only place where these node are allowed to be deleted (therefore), implying that there can only be one consumer thread (this test code does NOT check for the ABA problem, so it also works using 2 CONSUMER_THREADS).

The actual code is a garbage collection mechanism where multiple "producer" threads add pointers to a singly linked list when those can be deleted, but where it is only safe to actually do so in one specific thread (in that case there is only one "consumer" thread thus, which performs this garbage collection at a well-known place in a main-loop).



来源:https://stackoverflow.com/questions/57029910/what-are-the-correct-memory-orders-to-use-when-inserting-a-node-at-the-beginning

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!