  • Read-Copy Update (RCU)

    This is from ULK (Understanding the Linux Kernel). I found the idea very clever, so I have not translated it and keep the original wording.

Read-copy update (RCU) is yet another synchronization technique designed to protect data structures that are mostly accessed for reading by several CPUs. RCU allows many readers and many writers to proceed concurrently (an improvement over seqlocks, which allow only one writer to proceed). Moreover, RCU is lock-free, that is, it uses no lock or counter shared by all CPUs; this is a great advantage over read/write spin locks and seqlocks, which have a high overhead due to cache line-snooping and invalidation.

How does RCU obtain the surprising result of synchronizing several CPUs without shared data structures? The key idea consists of limiting the scope of RCU as follows:

1. Only data structures that are dynamically allocated and referenced by means of pointers can be protected by RCU.
2. No kernel control path can sleep inside a critical region protected by RCU.

When a kernel control path wants to read an RCU-protected data structure, it executes the rcu_read_lock() macro, which is equivalent to preempt_disable(). Next, the reader dereferences the pointer to the data structure and starts reading it. As stated above, the reader cannot sleep until it finishes reading the data structure; the end of the critical region is marked by the rcu_read_unlock() macro, which is equivalent to preempt_enable().

Because the reader does very little to prevent race conditions, we could expect that the writer has to work a bit more. In fact, when a writer wants to update the data structure, it dereferences the pointer and makes a copy of the whole data structure. Next, the writer modifies the copy. Once finished, the writer changes the pointer to the data structure so as to make it point to the updated copy. Because changing the value of the pointer is an atomic operation, each reader or writer sees either the old copy or the new one: no corruption in the data structure may occur.
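The read side and the copy-then-publish write side described above can be sketched in user space with C11 atomics. This is only an illustration, not kernel code: `struct config`, `init_config`, `read_mtu`, and `update_mtu` are hypothetical names, the sketch assumes one writer at a time, and it deliberately leaves the old copy unfreed, handing it back to the caller instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical RCU-protected structure; the names are illustrative only. */
struct config {
    int mtu;
    int ttl;
};

/* The shared pointer through which readers reach the current copy. */
static _Atomic(struct config *) current_config;

/* Allocate and publish the first copy. Returns 0 on success, -1 on failure. */
int init_config(int mtu, int ttl)
{
    struct config *c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->mtu = mtu;
    c->ttl = ttl;
    atomic_store_explicit(&current_config, c, memory_order_release);
    return 0;
}

/* Reader: load the pointer once, then read only through that snapshot. */
int read_mtu(void)
{
    /* In the kernel, rcu_read_lock() would mark the start of the region. */
    struct config *c = atomic_load_explicit(&current_config,
                                            memory_order_acquire);
    int mtu = c->mtu;
    /* In the kernel, rcu_read_unlock() would mark the end of the region. */
    return mtu;
}

/*
 * Writer: copy the structure, modify the copy, then publish it with a
 * single atomic pointer store. Assumes writers are serialized by the
 * caller. Returns the old copy, which must not be freed while a reader
 * may still be using it; returns NULL if allocation fails.
 */
struct config *update_mtu(int new_mtu)
{
    struct config *old = atomic_load_explicit(&current_config,
                                              memory_order_relaxed);
    assert(old != NULL); /* init_config() must have succeeded first */

    struct config *copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;
    memcpy(copy, old, sizeof(*copy));
    copy->mtu = new_mtu;

    /* Release ordering: the copy's contents become visible before the pointer. */
    atomic_store_explicit(&current_config, copy, memory_order_release);
    return old;
}
```

A reader that loaded the pointer before the store keeps seeing a complete old copy, and one that loads it afterwards sees a complete new copy; no reader observes a half-updated structure.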
However, a memory barrier is required to ensure that the updated pointer is seen by the other CPUs only after the data structure has been modified. Such a memory barrier is implicitly introduced if a spin lock is coupled with RCU to forbid the concurrent execution of writers.

The real problem with the RCU technique, however, is that the old copy of the data structure cannot be freed right away when the writer updates the pointer. In fact, the readers that were accessing the data structure when the writer started its update could still be reading the old copy. The old copy can be freed only after all (potential) readers on the CPUs have executed the rcu_read_unlock() macro. The kernel requires every potential reader to execute that macro before any of the following happens: the CPU performs a process switch (see restriction 2 earlier); the CPU starts executing in User Mode; or the CPU executes the idle loop (see the section "Kernel Threads" in Chapter 3). In each of these cases, we say that the CPU has gone through a quiescent state.

The call_rcu() function is invoked by the writer to get rid of the old copy of the data structure. It receives as its parameters the address of an rcu_head descriptor (usually embedded inside the data structure to be freed) and the address of a callback function to be invoked when all CPUs have gone through a quiescent state. Once executed, the callback function usually frees the old copy of the data structure.

The call_rcu() function stores in the rcu_head descriptor the address of the callback and its parameter, then inserts the descriptor in a per-CPU list of callbacks. Periodically, once every tick (see the section "Updating Local CPU Statistics" in Chapter 6), the kernel checks whether the local CPU has gone through a quiescent state. When all CPUs have gone through a quiescent state, a local tasklet, whose descriptor is stored in the rcu_tasklet per-CPU variable, executes all callbacks in the list.
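The deferred-free mechanism can be modelled with a small single-threaded simulation. This is a rough sketch under several assumptions, not the kernel implementation: the names `rcu_head_sketch`, `call_rcu_sketch`, `note_quiescent_state`, and `rcu_tick_sketch` are made up for illustration, the number of CPUs is fixed, one global callback list stands in for the per-CPU lists, and every queued callback restarts the wait for quiescent states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4 /* assumption: fixed number of simulated CPUs */

/* Simplified stand-in for the kernel's rcu_head descriptor. */
struct rcu_head_sketch {
    struct rcu_head_sketch *next;
    void (*func)(struct rcu_head_sketch *head);
};

/* One global list here; the kernel keeps a list per CPU. */
static struct rcu_head_sketch *pending;
static bool quiescent[NR_CPUS];

/* Queue a callback and start waiting for every CPU to become quiescent. */
void call_rcu_sketch(struct rcu_head_sketch *head,
                     void (*func)(struct rcu_head_sketch *head))
{
    head->func = func;
    head->next = pending;
    pending = head;
    for (int i = 0; i < NR_CPUS; i++)
        quiescent[i] = false;
}

/* Record a process switch, a return to User Mode, or an idle loop on cpu. */
void note_quiescent_state(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    quiescent[cpu] = true;
}

/*
 * Simulated periodic check: once all CPUs have gone through a quiescent
 * state, run every queued callback. Returns the number of callbacks run.
 */
int rcu_tick_sketch(void)
{
    for (int i = 0; i < NR_CPUS; i++)
        if (!quiescent[i])
            return 0;

    int count = 0;
    struct rcu_head_sketch *list = pending;
    pending = NULL;
    while (list != NULL) {
        /* Save the link first: the callback may free the descriptor. */
        struct rcu_head_sketch *next = list->next;
        list->func(list);
        list = next;
        count++;
    }
    return count;
}

/* Hypothetical protected object with the descriptor embedded in it. */
struct item {
    int value;
    struct rcu_head_sketch rcu;
};

int freed_items;

/* Callback: recover the enclosing item from its descriptor and free it. */
void free_item(struct rcu_head_sketch *head)
{
    struct item *it =
        (struct item *)((char *)head - offsetof(struct item, rcu));
    free(it);
    freed_items++;
}
```

In this model a writer would call `call_rcu_sketch(&old->rcu, free_item)` right after publishing the new copy. The old copy is then released only by the first `rcu_tick_sketch()` call made after every simulated CPU has reported a quiescent state.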
RCU is a new addition in Linux 2.6; it is used in the networking layer and in the Virtual Filesystem.
  • Original article: https://www.cnblogs.com/yangce/p/2910095.html