  • InnoDB mutexes (1): os_event

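          InnoDB implements two widely used synchronization primitives: mutex_t (exclusive) and rw_lock_t (shared for reads, exclusive for writes). Both are InnoDB's own implementations, tuned for the performance a database needs. Because concurrency is one of InnoDB's main selling points, these two primitives play a central role throughout the code base (mutex_t in particular runs through almost the entire system). Before looking at either of them, we need to cover the basic module they build on: os_event. It implements a basic event signaling mechanism, and both mutex_t and rw_lock_t rely on os_event to notify waiters.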

      Note: InnoDB likes to name modules that wrap system calls os_xxxxx.

    First, the os_event signaling flow:

    thread A calls os_event_reset(event_1) [start listening for the event]
    thread B calls os_event_set(event_1)   [signal the event]
    thread A calls os_event_wait(event_1)  [wait for the event]
    thread A finishes waiting

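    1. Once thread A has called os_event_reset(), it is already listening for event_1; it does not start receiving only when it calls wait. In other words, A also sees a signal that is sent between reset and wait (the implementation code further down shows how).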

    2. os_event_set notifies in thundering-herd fashion: it calls pthread_cond_broadcast and wakes every waiter. This certainly costs extra CPU, but it satisfies what rw_lock_t needs. The pthread manual explains:

           The pthread_cond_broadcast() function is used whenever the
           shared-variable state has been changed in a way that more than one
           thread can proceed with its task. Consider a single producer/multiple
           consumer problem, where the producer can insert multiple items on a
           list that is accessed one item at a time by the consumers. By calling
           the pthread_cond_broadcast() function, the producer would notify all
           consumers that might be waiting, and thereby the application would
           receive more throughput on a multi-processor. In addition,
           pthread_cond_broadcast() makes it easier to implement a read-write
           lock. The pthread_cond_broadcast() function is needed in order to
           wake up all waiting readers when a writer releases its lock. Finally,
           the two-phase commit algorithm can use this broadcast function to
           notify all clients of an impending transaction commit.
    

     3. os_event_wait is the usual combination of a pthread mutex and a pthread condition variable, which is widely documented online.
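    To make items 2 and 3 concrete, here is a minimal sketch of the mutex plus condition variable pattern with a broadcast wake-up. It is not InnoDB code: struct demo_gate, demo_waiter and demo_wake_all are hypothetical names made up for this illustration, and the sketch assumes a POSIX threads environment.

```c
#include <assert.h>
#include <pthread.h>

#define DEMO_MAX_WAITERS 16

/* Hypothetical illustration type, not part of InnoDB. */
struct demo_gate {
    pthread_mutex_t mutex;   /* protects every field below */
    pthread_cond_t  cond;    /* waiters block here until go != 0 */
    pthread_cond_t  ready;   /* tells the signaller that a waiter arrived */
    int             go;      /* the predicate the waiters check */
    int             waiting; /* number of waiters that have arrived */
    int             woken;   /* number of waiters that got through */
};

static void *demo_waiter(void *arg)
{
    struct demo_gate *g = arg;

    pthread_mutex_lock(&g->mutex);
    g->waiting++;
    pthread_cond_signal(&g->ready);

    /* Always re-check the predicate: spurious wakeups are allowed. */
    while (!g->go) {
        pthread_cond_wait(&g->cond, &g->mutex);
    }
    g->woken++;
    pthread_mutex_unlock(&g->mutex);
    return NULL;
}

/* Starts n_waiters threads, waits until all of them are blocked, then
   wakes them with one broadcast. Returns how many waiters got through. */
int demo_wake_all(int n_waiters)
{
    struct demo_gate g;
    pthread_t threads[DEMO_MAX_WAITERS];
    int i;

    assert(n_waiters >= 0 && n_waiters <= DEMO_MAX_WAITERS);

    pthread_mutex_init(&g.mutex, NULL);
    pthread_cond_init(&g.cond, NULL);
    pthread_cond_init(&g.ready, NULL);
    g.go = 0;
    g.waiting = 0;
    g.woken = 0;

    for (i = 0; i < n_waiters; i++) {
        int rc = pthread_create(&threads[i], NULL, demo_waiter, &g);
        assert(rc == 0);
        (void) rc;
    }

    pthread_mutex_lock(&g.mutex);
    while (g.waiting < n_waiters) {
        pthread_cond_wait(&g.ready, &g.mutex);
    }
    g.go = 1;
    pthread_cond_broadcast(&g.cond); /* one call releases every waiter */
    pthread_mutex_unlock(&g.mutex);

    for (i = 0; i < n_waiters; i++) {
        pthread_join(threads[i], NULL);
    }

    pthread_cond_destroy(&g.ready);
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.mutex);
    return g.woken;
}
```

    Calling demo_wake_all(4) should report that all four waiters got through after a single broadcast. The trade-off is the one described above: every waiter is scheduled again and has to re-check its predicate, whether or not it can actually make progress.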

    Let's look at how os_event is implemented.

    Here is the event structure:

    struct os_event_struct {
    
    
        os_fast_mutex_t    os_mutex;    /*!< this mutex protects the next
                        fields */
        ibool        is_set;        /*!< this is TRUE when the event is
                        in the signaled state, i.e., a thread
                        does not stop if it tries to wait for
                        this event */
        ib_int64_t    signal_count;    /*!< this is incremented each time
                        the event becomes signaled */
        os_cond_t    cond_var;    /*!< condition variable is used in
                        waiting for the event */
        UT_LIST_NODE_T(os_event_struct_t) os_event_list;
                        /*!< list of all created events */
    };

    1)  is_set and signal_count together encode the state of an event.

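    When a thread signals the event (event_set), is_set is set to TRUE and signal_count is incremented (signal_count only ever increases). If the event is already set, the call does nothing: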

        os_fast_mutex_lock(&(event->os_mutex));
    
        if (event->is_set) {
            /* Do nothing */
        } else {
            event->is_set = TRUE;
            event->signal_count += 1;
            os_cond_broadcast(&(event->cond_var));
        }
    
        os_fast_mutex_unlock(&(event->os_mutex));

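    When a thread starts listening for the event (event_reset), is_set is set to FALSE and the current signal_count is returned (assume the calling thread keeps the return value in old_signal_count):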

        os_fast_mutex_lock(&(event->os_mutex));
    
        if (!event->is_set) {
            /* Do nothing */
        } else {
            event->is_set = FALSE;
        }
        ret = event->signal_count;
    
        os_fast_mutex_unlock(&(event->os_mutex));

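    The expression (old_signal_count == signal_count && is_set == false) decides whether an event has arrived between reset and wait: if it is true, no event has arrived and the thread has to wait. (There is also a timed wait variant, but its implementation is much the same.)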

    os_fast_mutex_lock(&event->os_mutex);

    /* signal_count starts at 1 when the event is created, because
    os_event_wait_low uses a reset_sig_count of 0 to mean that the
    caller does not track signals sent between reset and wait.
    Passing 0 therefore means the caller only starts listening for
    the event at os_cond_wait. */
    if (!reset_sig_count) {
        reset_sig_count = event->signal_count;
    }

    while (!event->is_set && event->signal_count == reset_sig_count) {
        os_cond_wait(&(event->cond_var), &(event->os_mutex));

        /* Solaris manual said that spurious wakeups may occur: we
        have to check if the event really has been signaled after
        we came here to wait */
    }

    os_fast_mutex_unlock(&event->os_mutex);
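    The timed wait variant mentioned above can be sketched along the same lines. This is an illustration rather than the InnoDB implementation: struct demo_event and the demo_event_* functions are hypothetical names, and the sketch assumes plain pthreads with an absolute CLOCK_REALTIME deadline, which is what pthread_cond_timedwait expects by default.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for os_event_struct, reduced to the fields
   the wait logic needs. */
struct demo_event {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             is_set;
    int64_t         signal_count;
};

void demo_event_init(struct demo_event *ev)
{
    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->is_set = 0;
    ev->signal_count = 1; /* 0 is reserved to mean "no recorded count" */
}

void demo_event_set(struct demo_event *ev)
{
    pthread_mutex_lock(&ev->mutex);
    if (!ev->is_set) {
        ev->is_set = 1;
        ev->signal_count += 1;
        pthread_cond_broadcast(&ev->cond);
    }
    pthread_mutex_unlock(&ev->mutex);
}

int64_t demo_event_reset(struct demo_event *ev)
{
    int64_t ret;

    pthread_mutex_lock(&ev->mutex);
    ev->is_set = 0;
    ret = ev->signal_count;
    pthread_mutex_unlock(&ev->mutex);
    return ret;
}

/* Returns 0 if the event was signaled, ETIMEDOUT otherwise. */
int demo_event_wait_ms(struct demo_event *ev, int64_t reset_sig_count,
                       long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    int signaled;

    assert(timeout_ms >= 0);

    /* pthread_cond_timedwait takes an absolute deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ev->mutex);
    if (!reset_sig_count) {
        reset_sig_count = ev->signal_count;
    }

    /* Same predicate as the untimed wait, plus an exit on error/timeout. */
    while (!ev->is_set && ev->signal_count == reset_sig_count && rc == 0) {
        rc = pthread_cond_timedwait(&ev->cond, &ev->mutex, &deadline);
    }

    signaled = ev->is_set || ev->signal_count != reset_sig_count;
    pthread_mutex_unlock(&ev->mutex);

    return signaled ? 0 : ETIMEDOUT;
}
```

    The result is derived from the event state after the loop rather than from the return code alone, so a signal that arrives right at the deadline is still reported as a success.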


    The benefits of this design are as follows.

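    event_reset sets is_set to FALSE, which masks every notification sent before the reset, so a stale is_set left behind by an earlier event_set cannot let the waiter through. That alone is not enough, though. Consider the following sequence: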

    A: event_reset

    B: event_set

    C: event_reset

    A: event_wait

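    Here C accidentally wipes out B's notification, so A would miss it and keep waiting. This is why the signal_count check is needed as well: if A recorded the old value of signal_count at reset time, then even though C has set is_set back to FALSE, (old_signal_count == signal_count && is_set == false) still evaluates to false, and A's wait returns.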
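    The A/B/C sequence can be replayed in a single thread to compare the two predicates. The sketch below models only the two state fields and leaves out all locking; struct sim_event and the sim_* functions are hypothetical names used for this illustration, not InnoDB identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lock-free model of the two state fields, for illustration only. */
struct sim_event {
    int     is_set;
    int64_t signal_count;
};

void sim_init(struct sim_event *ev)
{
    ev->is_set = 0;
    ev->signal_count = 1;
}

void sim_set(struct sim_event *ev)
{
    if (!ev->is_set) {
        ev->is_set = 1;
        ev->signal_count += 1;
    }
}

int64_t sim_reset(struct sim_event *ev)
{
    ev->is_set = 0;
    return ev->signal_count;
}

/* Naive predicate: the waiter looks at is_set only. */
int sim_blocks_flag_only(const struct sim_event *ev)
{
    return !ev->is_set;
}

/* Predicate from the wait loop: is_set plus the recorded count. */
int sim_blocks(const struct sim_event *ev, int64_t old_signal_count)
{
    return !ev->is_set && ev->signal_count == old_signal_count;
}

/* Replays A: reset, B: set, C: reset, then asks whether A's wait
   would block under each predicate (1 = blocks, 0 = passes). */
void sim_run_scenario(int *flag_only_blocks, int *with_count_blocks)
{
    struct sim_event ev;
    int64_t old_a;

    assert(flag_only_blocks != NULL && with_count_blocks != NULL);

    sim_init(&ev);
    old_a = sim_reset(&ev);  /* A: event_reset, remembers the count */
    sim_set(&ev);            /* B: event_set, count goes up by one */
    (void) sim_reset(&ev);   /* C: event_reset, clears is_set again */

    *flag_only_blocks = sim_blocks_flag_only(&ev);
    *with_count_blocks = sim_blocks(&ev, old_a);
}
```

    With the flag-only predicate A would block, because C cleared is_set. With the recorded count A passes, because signal_count has moved on since A's reset.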

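    2)  os_mutex keeps modifications to the os_event members consistent under concurrency, and it is also used together with cond_var to wait for the event. (os_fast_mutex_t and os_cond_t are thin wrappers around pthread_mutex and pthread_cond.)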

    3)  Every os_event is added to a global doubly linked list, and the os_event_list member is the node that links the event back into that list.

    /* The os_sync_mutex can be NULL because during startup an event
        can be created [ because it's embedded in the mutex/rwlock ] before
        this module has been initialized */
        if (os_sync_mutex != NULL) {
            os_mutex_enter(os_sync_mutex);
        }
    
        /* Put to the list of events */
        UT_LIST_ADD_FIRST(os_event_list, os_event_list, event);
    
        os_event_count++;
    
        if (os_sync_mutex != NULL) {
            os_mutex_exit(os_sync_mutex);
        }
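    For readers unfamiliar with this kind of list, the following sketch shows what adding to the head of an intrusive doubly linked list involves. It is a simplified stand-in written for this article, not the UT_LIST_ADD_FIRST macro itself: struct demo_ev, struct demo_ev_list and the demo_list_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* The links live inside the element itself (an "intrusive" list),
   which is the role the os_event_list member plays in os_event_struct. */
struct demo_ev {
    int             id;
    struct demo_ev *prev;
    struct demo_ev *next;
};

/* List base: element count plus pointers to both ends. */
struct demo_ev_list {
    size_t          count;
    struct demo_ev *first;
    struct demo_ev *last;
};

void demo_list_init(struct demo_ev_list *list)
{
    list->count = 0;
    list->first = NULL;
    list->last = NULL;
}

/* Inserts ev at the head of the list. */
void demo_list_add_first(struct demo_ev_list *list, struct demo_ev *ev)
{
    assert(list != NULL && ev != NULL);

    ev->prev = NULL;
    ev->next = list->first;

    if (list->first != NULL) {
        list->first->prev = ev;
    } else {
        list->last = ev; /* first element is also the last one */
    }

    list->first = ev;
    list->count++;
}
```

    The sketch does no locking of its own. As in the snippet above, a caller would have to hold a mutex around the insertion whenever other threads can touch the list.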

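     That is all there is to os_event. The implementation is fairly simple: the whole event behavior comes down to updating and checking is_set and signal_count. The rw_lock and mutex covered later are somewhat more complex.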

  • Original article: https://www.cnblogs.com/hdflzh/p/2542921.html