  • C11 memory_order

    Concepts:

    Excerpted from: http://preshing.com/20120913/acquire-and-release-semantics/

     

    Acquire semantics is a property which can only apply to operations which read from shared memory, whether they are read-modify-write operations or plain loads. The operation is then considered a read-acquire. Acquire semantics prevent memory reordering of the read-acquire with any read or write operation which follows it in program order.

    Release semantics is a property which can only apply to operations which write to shared memory, whether they are read-modify-write operations or plain stores. The operation is then considered a write-release. Release semantics prevent memory reordering of the write-release with any read or write operation which precedes it in program order.
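The two definitions above can be illustrated with a small message-passing sketch. It is only an illustration, not code from the quoted article: the names `payload`, `ready`, and `run_message_passing` are made up for this example. The producer does a write-release on a flag after writing plain data, and the consumer does a read-acquire on the same flag before reading that data.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the payload value seen by the consumer thread.
int run_message_passing() {
    int payload = 0;                 // plain, non-atomic shared data
    std::atomic<bool> ready{false};  // guard flag shared by both threads
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary write
        ready.store(true, std::memory_order_release);  // write-release: the write
                                                       // above cannot move below it
    });
    std::thread consumer([&] {
        // read-acquire: the read of payload below cannot move above this load
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed));  // flag was set by the producer
    return observed;
}
```

Once the consumer's acquire load reads the value written by the release store, the two operations synchronize, so `run_message_passing()` returns 42.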

    Acquire and Release Fences

    First things first: Acquire and release fences are considered low-level lock-free operations. If you stick with higher-level, sequentially consistent atomic types, such as volatile variables in Java 5+, or default atomics in C++11, you don’t need acquire and release fences. The tradeoff is that sequentially consistent types are slightly less scalable or performant for some algorithms.

    On the other hand, if you’ve developed for multicore devices in the days before C++11, you might feel an affinity for acquire and release fences. Perhaps, like me, you remember struggling with the placement of some lwsync intrinsics while synchronizing threads on Xbox 360. What’s cool is that once you understand acquire and release fences, you actually see what we were trying to accomplish using those platform-specific fences all along.

    Acquire and release fences, as you might imagine, are standalone memory fences, which means that they aren’t coupled with any particular memory operation. So, how do they work?

    An acquire fence prevents the memory reordering of any read which precedes it in program order with any read or write which follows it in program order.

    A release fence prevents the memory reordering of any read or write which precedes it in program order with any write which follows it in program order.

    In other words, in terms of the barrier types explained here, an acquire fence serves as both a #LoadLoad + #LoadStore barrier, while a release fence functions as both a #LoadStore + #StoreStore barrier. That’s all they purport to do.

     

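    LoadLoad ensures that the loads before and after the barrier are not reordered with each other; StoreStore does the same for stores. On PowerPC these are provided by lwsync (lightweight sync).
    StoreLoad is the most expensive barrier. It ensures that every store before the barrier is visible to other CPUs before any load after the barrier executes, which in practice means waiting for the store buffer to drain (the caches themselves are kept consistent by the coherence protocol). On PowerPC it is provided by sync.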

    Programming interface:

    C++11 usage:

    #include <atomic>
    std::atomic_thread_fence(std::memory_order_acquire);
    std::atomic_thread_fence(std::memory_order_release);

    C11 usage:

    #include <stdatomic.h>
    atomic_thread_fence(memory_order_acquire);
    atomic_thread_fence(memory_order_release);
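As a rough illustration of how the fence calls above are typically paired (using the C++11 spelling), the sketch below passes a value between two threads using only relaxed atomic accesses plus standalone fences. The names `data`, `flag`, and `run_fence_message_passing` are assumptions made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value of `data` seen by the reader thread.
int run_fence_message_passing() {
    int data = 0;                   // plain shared data
    std::atomic<bool> flag{false};  // accessed only with relaxed operations
    int observed = -1;

    std::thread writer([&] {
        data = 7;
        // Release fence: the write to data above cannot be reordered
        // with the store to flag below.
        std::atomic_thread_fence(std::memory_order_release);
        flag.store(true, std::memory_order_relaxed);
    });
    std::thread reader([&] {
        while (!flag.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the load of flag above cannot be reordered
        // with the read of data below.
        std::atomic_thread_fence(std::memory_order_acquire);
        observed = data;
    });

    writer.join();
    reader.join();
    assert(flag.load(std::memory_order_relaxed));  // writer has published
    return observed;
}
```

Here the fences, not the atomic operations themselves, carry the ordering; the relaxed store and load only provide the value that links the two fences together.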

    Taking C11 as the example, here is what each value of the memory_order enum defined in <stdatomic.h> means:

    enum memory_order {
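        memory_order_relaxed,  /* guarantees only the atomicity of the read/write itself; no ordering with other accesses, so it is meaningful only for atomic variables */
        memory_order_consume,  /* data-dependency ordering; in hardware only DEC Alpha needs a barrier for it */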
        memory_order_acquire,
        memory_order_release,
        memory_order_acq_rel,
        memory_order_seq_cst
    };
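To make the weakest value concrete, here is a minimal sketch of a shared counter that uses `memory_order_relaxed` throughout. The function name `count_with_relaxed` and its parameters are hypothetical. Only atomicity is needed, because no other data is published through the counter; the final read is safe because `join()` already orders it after every worker.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: each of num_threads workers adds 1 to a shared
// counter increments_per_thread times, using relaxed ordering only.
long count_with_relaxed(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write; no ordering with other memory.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // joining synchronizes with the end of each worker
    }
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost, so `count_with_relaxed(4, 10000)` yields 40000.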

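    Differences between the C11 compare-and-exchange variants:

    weak and strong

    Inside a retry loop, the weak version can perform better on some platforms. For a one-shot (non-loop) operation, use the strong version, because weak may fail spuriously, returning false even when the compared values are equal.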
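The weak/strong guidance above can be sketched as follows. Both helpers, `fetch_max` and `try_claim`, are hypothetical names invented for this illustration, and the convention that 0 means "unowned" is likewise an assumption.

```cpp
#include <atomic>
#include <cassert>

// Loop case: a spurious failure just costs one more iteration, so weak is fine.
// Raises target to at least candidate and returns the value seen before that.
int fetch_max(std::atomic<int>& target, int candidate) {
    int current = target.load();
    while (current < candidate &&
           !target.compare_exchange_weak(current, candidate)) {
        // On any failure, current has been refreshed with the latest value.
    }
    return current;
}

// One-shot case: there is no retry, so a spurious failure would be a wrong
// answer. Use strong. Assumes 0 means "unowned" and my_id is nonzero.
bool try_claim(std::atomic<int>& owner, int my_id) {
    assert(my_id != 0);
    int expected = 0;
    return owner.compare_exchange_strong(expected, my_id);
}
```

With weak in `try_claim`, a caller could be told the claim failed even though `owner` was still 0, which is exactly the case the note above warns about.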

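    implicit and explicit

    The implicit versions (those without the _explicit suffix) default to the strongest ordering, memory_order_seq_cst.

    The explicit versions take two extra parameters, succ and fail. succ specifies the memory ordering applied when the comparison succeeds (a read-modify-write); fail specifies the ordering applied when the comparison fails (a plain load), so fail may not be memory_order_release or memory_order_acq_rel.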
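One place where separate succ and fail orderings are natural is a try-lock. The sketch below is an assumption-laden illustration (the `SpinLock` class and `locked_increments` helper are invented here): a successful exchange needs acquire so that the critical section sees the previous holder's writes, while a failed attempt learns nothing and can be relaxed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
public:
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(
            expected, true,
            std::memory_order_acquire,   // succ: we now own the lock
            std::memory_order_relaxed);  // fail: nothing was acquired
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: workers increment a plain int under the lock.
int locked_increments(int num_threads, int per_thread) {
    assert(num_threads >= 0 && per_thread >= 0);
    SpinLock lock;
    int total = 0;  // non-atomic; protected only by the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &total, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                while (!lock.try_lock()) {
                    std::this_thread::yield();
                }
                ++total;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return total;
}
```

The release store in `unlock()` pairs with the acquire on the next successful `try_lock()`, which is what makes the plain `++total` safe.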


    The English explanation of each value is rather convoluted; a short note follows each one:

    memory_order_relaxed
        Explanation: Relaxed ordering: there are no constraints on reordering of memory accesses around the atomic variable.
        Note: guarantees only the atomicity of the operation.

    memory_order_consume
        Explanation: Consume operation: no reads in the current thread dependent on the value currently loaded can be reordered before this load. This ensures that writes to dependent variables in other threads that release the same atomic variable are visible in the current thread. On most platforms, this affects compiler optimization only.
        Note: in short, a data dependency barrier, weaker than acquire. Most CPUs preserve data-dependency order automatically (Alpha is the exception).

    memory_order_acquire
        Explanation: Acquire operation: no reads in the current thread can be reordered before this load. This ensures that all writes in other threads that release the same atomic variable are visible in the current thread.
        Note: everything the other thread wrote before its release becomes visible.

    memory_order_release
        Explanation: Release operation: no writes in the current thread can be reordered after this store. This ensures that all writes in the current thread are visible in other threads that acquire the same atomic variable.
        Note: all memory written before this release is visible to another thread after its acquire; only the data-dependent part of it is visible to another thread after a consume.

    memory_order_acq_rel
        Explanation: Acquire-release operation: no reads in the current thread can be reordered before this load as well as no writes in the current thread can be reordered after this store. The operation is a read-modify-write operation. It is ensured that all writes in other threads that release the same atomic variable are visible before the modification and the modification is visible in other threads that acquire the same atomic variable.
        Note: acquire and release combined. The read half acts as an acquire and the write half acts as a release.

    memory_order_seq_cst
        Explanation: Sequential ordering. The operation has the same semantics as an acquire-release operation, and additionally has sequentially-consistent operation ordering.
        Note: typically implemented with a full memory fence. It goes one step further than acquire-release: all seq_cst operations appear in a single total order that every thread agrees on. Frequent use may become a performance bottleneck.
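What sequential consistency adds over acquire-release is easiest to see in the classic store-buffering pattern. The sketch below is an illustration with invented names (`both_saw_zero_once`, `count_forbidden_outcomes`): each thread stores to one variable and then loads the other. With `memory_order_seq_cst` all four operations fall into one total order, so at least one thread must see the other's store. With only release stores and acquire loads, both loads returning 0 would be a permitted outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering pattern once; returns true if both threads read 0.
bool both_saw_zero_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Hypothetical helper: repeats the experiment and counts forbidden outcomes.
int count_forbidden_outcomes(int trials) {
    assert(trials >= 0);
    int forbidden = 0;
    for (int i = 0; i < trials; ++i) {
        if (both_saw_zero_once()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Under the C++ memory model the count stays at zero for any number of trials; the price is the StoreLoad ordering that seq_cst has to enforce between each store and the following load.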

    Key point: when is memory_order_consume (a data dependency barrier) needed?

    1)
    A=
    <data dependency barrier>
    B=*A   
     
    2)
    A=
    <data dependency barrier>
    C=B[A]
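Case 1) above (load a pointer, then dereference it) corresponds to the usual pointer-publication idiom. The following sketch is illustrative only; `Config`, `published`, and `read_published_value` are names assumed for the example. Note that the C++ standard has discouraged `memory_order_consume` since C++17, and mainstream compilers are generally understood to treat it as `memory_order_acquire`, so this shows the intent rather than a guaranteed cheaper barrier.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Config {
    int value;
};

// Hypothetical helper: returns the field read through the published pointer.
int read_published_value() {
    Config config{0};
    std::atomic<Config*> published{nullptr};  // plays the role of A
    int observed = -1;

    std::thread writer([&] {
        config.value = 99;                                    // initialize first
        published.store(&config, std::memory_order_release);  // then publish
    });
    std::thread reader([&] {
        Config* p = nullptr;
        // A = <load>: consume orders only accesses that depend on p
        while ((p = published.load(std::memory_order_consume)) == nullptr) {
            std::this_thread::yield();
        }
        observed = p->value;  // B = *A: data-dependent on the loaded pointer
    });

    writer.join();
    reader.join();
    assert(published.load(std::memory_order_relaxed) == &config);
    return observed;
}
```

Only reads reached through `p` are covered; a read of some unrelated variable after the load would need acquire instead.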

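    Question: now that atomic variables come with memory ordering built in, is atomic_thread_fence still useful?

    Yes. In the example below, the loads start out relaxed, which guarantees only atomicity. Only when a load returns the expected value is an acquire fence issued, ensuring that do_work() is ordered after the read of mailbox[i].

    Example from http://en.cppreference.com/w/cpp/atomic/atomic_thread_fence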

    const int num_mailboxes = 32;
    std::atomic<int> mailbox[num_mailboxes];

    // The writer threads update non-atomic shared data and then update mailbox[i] as follows
    // (receiver_id is the id of the reader the data is meant for)
    std::atomic_store_explicit(&mailbox[i], receiver_id, std::memory_order_release);

    // Reader thread needs to check all mailbox[i], but only needs to sync with one
    for (int i = 0; i < num_mailboxes; ++i) {
        if (std::atomic_load_explicit(&mailbox[i], std::memory_order_relaxed) == my_id) {
            std::atomic_thread_fence(std::memory_order_acquire); // synchronize with just one writer
            do_work(i); // guaranteed to observe everything done in the writer thread before
                        // the atomic_store_explicit()
        }
    }
  • Original article: https://www.cnblogs.com/JesseFang/p/3494558.html