C++ Multithreading: Atomic Operations with std::atomic

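1. Atomic operations

1.1 Example

An atomic operation is an indivisible operation. No thread in the system can ever observe an atomic operation half done; it has either happened or it has not, and those are the only two possibilities.

Without atomic operations: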

#include <iostream>
#include <thread>
#include <atomic>
using namespace std;

long num = 0;

void addnum()
{
    for(int i=0; i<100000; i++)
        num++; // the global variable is accessed without mutual exclusion
}

int main()
{
    const int nthreads = 2; // const so that t below is a standard array, not a variable-length array
    thread t[nthreads];
    for(int i=0; i<nthreads; i++)
        t[i] = thread(addnum);

    for(auto& th : t)
        th.join();
    cout << num << endl;
    return 0;
}

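Output:
In one run the final result was 109515, which is less than 200000. This shows that the two threads' writes to the global variable interleaved. num++ is a read-modify-write sequence, so both threads can read the same old value, each add one to it, and each write back the same result.

Two increments were performed, but because access was not mutually exclusive, the stored value ends up smaller than expected. (Formally this is a data race, and its behavior is undefined in C++.)
Atomic operations avoid this. With an atomic type, the increment of the global variable num becomes indivisible.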

#include <iostream>
#include <thread>
#include <atomic>
using namespace std;

atomic<long> num(0);

void addnum()
{
    for(int i=0; i<100000; i++)
        num++;
}

int main()
{
    const int nthreads = 2; // const so that t below is a standard array, not a variable-length array
    thread t[nthreads];
    for(int i=0; i<nthreads; i++)
        t[i] = thread(addnum);

    for(auto& th : t)
        th.join();
    cout << num << endl;
    return 0;
}

Output:
The program now prints 200000 on every run.

1.2 Correspondence between atomic types and built-in types

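Each built-in integral type, along with bool, has a corresponding named atomic type, for example atomic_int for atomic<int> and atomic_bool for atomic<bool>.

On construction: an atomic object can be initialized from a value of its underlying type, but atomic types cannot be copy-constructed or move-constructed.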

1.3 The store, load, and exchange operations

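store (write a value)

store takes two parameters: the value T val and a memory order sync. It writes val atomically. Three memory orders are valid for sync: memory_order_relaxed, memory_order_release, and memory_order_seq_cst (the default).

load (read the value)

load returns an object of type T. The supported memory orders are memory_order_relaxed, memory_order_consume, memory_order_acquire, and memory_order_seq_cst (the default).

exchange (replace the stored object with another)

exchange takes two parameters: the first is the new value and the second is the memory order.

Its return value is the value held immediately before the call to exchange.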

Example using store and load:

#include <iostream>
#include <atomic>
#include <thread>
using namespace std;

atomic<int> foo(0);

void set_foo(int x)
{
    foo.store(x, memory_order_relaxed);
}

void print_foo()
{
    int x;
    do
    {
        x = foo.load(memory_order_relaxed);
    } while (x == 0);
    cout << "foo: " << x << endl;
}

int main()
{
    thread t1(set_foo, 1);
    thread t2(print_foo);

    t1.join();
    t2.join();
    return 0;
}

Example using exchange:

#include <iostream>
#include <atomic>
#include <thread>
#include <vector>

std::atomic<bool> ready(false);
std::atomic<bool> winner(false);

void count1m(int n)
{
    while(!ready) {} //wait for ready

    for(int i=0; i<1000000; i++){} //count to 1M

    if(!winner.exchange(true)) { // only the first thread to call exchange gets false back and prints; all others get true and skip the if
        std::cout << "thread #" << n << " won!\n";
    }
}

int main()
{
    std::vector<std::thread> threads;
    for(int i=0; i<10; i++){
        threads.push_back(std::thread(count1m, i+1)); // create 10 threads
    }
    ready.store(true); // start the race
    for(auto& th : threads)
        th.join();
    return 0;
}

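This example starts 10 threads and has each of them run the for loop. Whichever thread finishes first makes the first call to exchange and enters the if statement to print its message. None of the others enter the if, because the first thread has already set winner to true, so exchange returns true for every later caller.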

2. atomic_flag

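atomic_flag differs from the other atomic types in that it is guaranteed to be lock-free: threads can access it without taking a lock, whereas the other atomic types are not necessarily lock-free.
atomic<T> cannot guarantee that operations on type T are lock-free, and because processors on different platforms implement atomic operations differently, lock-freedom cannot be assumed. For this reason the other atomic types all provide an is_lock_free() member function to check whether they are lock-free.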
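In C++11, atomic_flag supports only two member functions: test_and_set() and clear().

test_and_set() atomically sets the std::atomic_flag and returns its previous state: it returns true if the flag was already set, and false if it was not (in which case this call is the one that sets it).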

clear() clears the std::atomic_flag so that the next call to std::atomic_flag::test_and_set() returns false. Together, the member functions test_and_set() and clear() can be used to implement a spin lock:

/*
** A spin lock implemented with atomic_flag
*/
#include <iostream>
#include <atomic>
#include <thread>
#include <unistd.h>
using namespace std;

atomic_flag lock = ATOMIC_FLAG_INIT; // initialized to the clear (false) state

void fun1(int n)
{
    while(lock.test_and_set(memory_order_acquire)){ // main has already set lock, so this loop spins until thread 2 clears it
        cout << "waiting for thread " << n << endl;
    }
    cout << "thread " << n << " starts working!" << endl;
}

void fun2(int n)
{
    cout << "thread " << n << " is going to start\n";
    lock.clear();
    cout << "thread " << n << " starts working!" << endl;
}

int main()
{
    lock.test_and_set();//set lock
    thread t1(fun1, 1);
    thread t2(fun2, 2);

    t1.join();
    usleep(1000);
    t2.join();
    return 0;
}

Output:
The output shows that thread 1 keeps spinning in the while loop until thread 2 clears the lock.

Another example of a spin lock implemented with atomic_flag:

#include <iostream>
#include <atomic>
#include <thread>
#include <vector>
using namespace std;

atomic_flag lock = ATOMIC_FLAG_INIT;

void appendnum(int x)
{
    while (lock.test_and_set()){ // acquire the lock
    }
    cout << "thread #" << x << endl;
    lock.clear(); // release the lock
}

int main()
{
    vector<thread> v;
    for(int i=0; i<10; i++){
        v.push_back(thread(appendnum, i+1));
    }
    for(auto& th : v){
        th.join();
    }
    return 0;
}

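Output:
Each of the 10 threads prints its own complete "thread #N" line; the order of the lines can vary from run to run.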

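3. Memory model

Note that the store, load, test_and_set, and clear functions above all take a parameter of type memory_order. What does this parameter mean?

Let's start with an example: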

#include <iostream>
#include <atomic>
#include <thread>
using namespace std;

atomic<int> a(0);
atomic<int> b(0);

void valueSet()
{
    int t = 1;
    a = t;
    b = 2;
}

void observer()
{
    cout << "a = " << a << "," << "b = " << b << endl;
}

int main()
{
    thread t1(valueSet);
    thread t2(observer);
    
    t1.join();
    t2.join();

    cout << "in main: " << "a = " << a << ", " << "b = " << b << endl;
    return 0;
}

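The observer in this example may print (0,0), (1,0), or (1,2), and it may also print (0,2). With the code as written, (0,2) needs no reordering at all: observer reads a before b, so it can read a before valueSet stores to it and read b after both stores have completed. The deeper question is whether a thread could ever see the store to b take effect before the store to a, contrary to the order written in valueSet. With the default memory_order_seq_cst used here it cannot; with weaker orderings on weakly ordered hardware it can. This is what the memory model governs.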

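A memory model is usually a hardware-level concept: it describes the order in which the processor executes machine instructions. Modern processors do not necessarily process machine instructions strictly one at a time, in program order: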

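1: Load reg3, 1;    // put the immediate value 1 into register reg3
2: Move reg4, reg3; // copy the contents of reg3 into reg4
3: Store reg4, a;   // store the contents of reg4 to memory address a
4: Load reg5, 2;    // put the immediate value 2 into register reg5
5: Store reg5, b;   // store the contents of reg5 to memory address b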

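This pseudo-assembly corresponds to t = 1; a = t; b = 2. If instructions always execute in the order 1 to 5, the memory model is called strongly ordered. Notice, however, that the relative order of instructions (1, 2, 3) and instructions (4, 5) does not affect the result, so some processors may reorder them, for example executing them as 1-4-2-5-3. Such a memory model is called weakly ordered. Under a weakly ordered memory model, instruction 5 (the assignment to b) may well complete before instruction 3 (the assignment to a).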

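In practice, x86_64 and SPARC (in TSO mode) are platforms with a strongly ordered memory model. In a multithreaded program, strong ordering means that all threads see instructions take effect in a consistent order: from the processor's point of view, data in memory changes in the same order as the machine instructions specify. Conversely, weak ordering means that the order in which threads see memory change may differ from the order stated by the machine instructions. Since a weakly ordered memory model can cause program errors, why do platforms such as Alpha, PowerPC, Itanium, and ARMv7 use it? Put simply, it gives the processor more parallelism and makes instruction execution more efficient. To enforce an instruction order on such platforms, a memory barrier instruction usually has to be inserted into the assembly code, at some cost to processor performance. PowerPC, for example, has a memory barrier instruction named sync, which makes the processor complete all instructions already in the pipeline before it executes any instruction after the sync.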

C++11 lets the programmer specify a so-called memory order for atomic operations. This is done through memory_order, which is an enumeration type:

typedef enum memory_order {
    memory_order_relaxed,   // relaxed
    memory_order_consume,   // consume
    memory_order_acquire,   // acquire
    memory_order_release,   // release
    memory_order_acq_rel,   // acquire/release
    memory_order_seq_cst    // sequentially consistent
} memory_order;