  • [Huawei Cloud Tech Share] A Brief Analysis of MySQL Seconds_Behind_Master

    Seconds_Behind_Master

    For a MySQL master/slave pair, seconds_behind_master is an important measure of the delay between the master and the slave. Its value can be obtained by running "show slave status;" on the slave.

    Original implementation

    Definition: The number of seconds that the slave SQL thread is behind processing the master binary log.

    Type: time_t (long)

    It is computed as follows:

    rpl_slave.cc::show_slave_status_send_data()
    if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
        (!strcmp(mi->get_master_log_name(),
                 mi->rli->get_group_master_log_name()))) {
      if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
        protocol->store(0LL);
      else
        protocol->store_null();
    } else {
      long time_diff = ((long)(time(0) - mi->rli->last_master_timestamp) -
                        mi->clock_diff_with_master);
      protocol->store(
          (longlong)(mi->rli->last_master_timestamp ? max(0L, time_diff) : 0));
    }

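    There are two main cases:

    • The SQL thread has applied everything the I/O thread has fetched and is waiting for the I/O thread to get more binlog from the master. If the I/O thread is connected, seconds_behind_master is 0, meaning the slave shows no delay relative to the master; otherwise it is NULL.

    • The SQL thread is still processing the relay log. seconds_behind_master is then computed as (long)(time(0) - mi->rli->last_master_timestamp) - mi->clock_diff_with_master, with negative results reported as 0.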
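
    The two cases above can be condensed into a small standalone function. This is a minimal sketch rather than the server code: `ReplicaState` and `seconds_behind_master()` are hypothetical names, the struct fields only mirror the values read in `show_slave_status_send_data()`, and `std::optional` (C++17) stands in for the NULL result.

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>
#include <optional>
#include <string>

// Hypothetical snapshot of the values read by show_slave_status_send_data().
struct ReplicaState {
  std::string io_log_name;          // master binlog file read by the I/O thread
  unsigned long long io_log_pos;    // position read by the I/O thread
  std::string sql_log_name;         // master binlog file applied by the SQL thread
  unsigned long long sql_log_pos;   // position applied by the SQL thread
  bool io_thread_connected;         // slave_running == MYSQL_SLAVE_RUN_CONNECT
  std::time_t last_master_timestamp;
  long clock_diff_with_master;
};

// Returns std::nullopt where the server would report NULL.
std::optional<long long> seconds_behind_master(const ReplicaState &s,
                                               std::time_t now) {
  const bool caught_up =
      s.io_log_pos == s.sql_log_pos && s.io_log_name == s.sql_log_name;
  if (caught_up) {
    // Case 1: the SQL thread is waiting for the I/O thread.
    if (s.io_thread_connected) return 0LL;
    return std::nullopt;
  }
  // Case 2: the SQL thread is still working through the relay log.
  assert(s.last_master_timestamp >= 0);  // same invariant the server asserts
  if (s.last_master_timestamp == 0) return 0LL;
  const long time_diff = static_cast<long>(now - s.last_master_timestamp) -
                         s.clock_diff_with_master;
  return static_cast<long long>(std::max(0L, time_diff));
}
```

    For example, with now = 1000, last_master_timestamp = 900 and clock_diff_with_master = 10, the function returns 90; if last_master_timestamp is ahead of now, the negative difference is reported as 0.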

    last_master_timestamp

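    Definition:

    • The timestamp of an event in the master's binlog.

    • type: time_t (long)

    How it is computed:

    last_master_timestamp is computed differently depending on whether the slave uses parallel replication.

    Non-parallel replication: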

    rpl_slave.cc::exec_relay_log_event()
    if ((!rli->is_parallel_exec() || rli->last_master_timestamp == 0) &&
        !(ev->is_artificial_event() || ev->is_relay_log_event() ||
          (ev->common_header->when.tv_sec == 0) ||
          ev->get_type_code() == binary_log::FORMAT_DESCRIPTION_EVENT ||
          ev->server_id == 0))
    {
      rli->last_master_timestamp= ev->common_header->when.tv_sec +
                                  (time_t) ev->exec_time;
      DBUG_ASSERT(rli->last_master_timestamp >= 0);
    }

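    In this mode, last_master_timestamp is the end time of each event: when.tv_sec is the time the event started, and exec_time is how long it took to execute. The value is assigned before apply_event is called, so last_master_timestamp has already been updated by the time the event starts executing on the slave. Because exec_time is only present in Query_log_event, last_master_timestamp changes as the slave works through the different events of a single transaction. Take a transaction containing two insert statements as an example; the trace below prints each event's type, timestamp, and execution time at the point where this code runs: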

    create table t1(a int PRIMARY KEY AUTO_INCREMENT ,b longblob) engine=innodb;
    begin;
    insert into t1(b) select repeat('a',104857600);
    insert into t1(b) select repeat('a',104857600);
    commit;
    2020-02-10T06:41:32.628554Z 11 [Note] [MY-000000] [Repl] event_type: 33 GTID_LOG_EVENT
    2020-02-10T06:41:32.628601Z 11 [Note] [MY-000000] [Repl] event_time: 1581316890
    2020-02-10T06:41:32.628614Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
    2020-02-10T06:41:32.628692Z 11 [Note] [MY-000000] [Repl] event_type: 2   QUERY_EVENT
    2020-02-10T06:41:32.628704Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
    2020-02-10T06:41:32.628713Z 11 [Note] [MY-000000] [Repl] event_exec_time: 35
    2020-02-10T06:41:32.629037Z 11 [Note] [MY-000000] [Repl] event_type: 19   TABLE_MAP_EVENT
    2020-02-10T06:41:32.629057Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
    2020-02-10T06:41:32.629063Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
    2020-02-10T06:41:33.644111Z 11 [Note] [MY-000000] [Repl] event_type: 30    WRITE_ROWS_EVENT
    2020-02-10T06:41:33.644149Z 11 [Note] [MY-000000] [Repl] event_time: 1581316823
    2020-02-10T06:41:33.644156Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
    2020-02-10T06:41:43.520272Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 9185ms to flush 3 and evict 0 pages
    2020-02-10T06:42:05.982458Z 11 [Note] [MY-000000] [Repl] event_type: 19   TABLE_MAP_EVENT
    2020-02-10T06:42:05.982488Z 11 [Note] [MY-000000] [Repl] event_time: 1581316858
    2020-02-10T06:42:05.982495Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
    2020-02-10T06:42:06.569345Z 11 [Note] [MY-000000] [Repl] event_type: 30    WRITE_ROWS_EVENT
    2020-02-10T06:42:06.569376Z 11 [Note] [MY-000000] [Repl] event_time: 1581316858
    2020-02-10T06:42:06.569384Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
    2020-02-10T06:42:16.506176Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 9352ms to flush 8 and evict 0 pages
    2020-02-10T06:42:37.202507Z 11 [Note] [MY-000000] [Repl] event_type: 16    XID_EVENT
    2020-02-10T06:42:37.202539Z 11 [Note] [MY-000000] [Repl] event_time: 1581316890
    2020-02-10T06:42:37.202546Z 11 [Note] [MY-000000] [Repl] event_exec_time: 0
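
    To make the trace easier to follow, the sketch below replays its seven events through the non-parallel formula `when.tv_sec + exec_time`. `TraceEvent`, `replay_last_master_timestamp()` and `trace_events()` are hypothetical names introduced for illustration, and the sketch assumes that none of these events is filtered out by the condition in `exec_relay_log_event()`.

```cpp
#include <cassert>
#include <ctime>
#include <vector>

// Hypothetical record of one event as printed in the trace.
struct TraceEvent {
  const char *type;
  std::time_t when_sec;     // event_time in the trace
  unsigned long exec_time;  // event_exec_time in the trace
};

// Returns the value last_master_timestamp takes after each event is seen.
std::vector<std::time_t> replay_last_master_timestamp(
    const std::vector<TraceEvent> &events) {
  std::vector<std::time_t> out;
  for (const TraceEvent &ev : events) {
    const std::time_t ts = ev.when_sec + static_cast<std::time_t>(ev.exec_time);
    assert(ts >= 0);  // mirrors the DBUG_ASSERT in the server code
    out.push_back(ts);
  }
  return out;
}

// The events from the trace, in the order they were applied.
std::vector<TraceEvent> trace_events() {
  return {
      {"GTID_LOG_EVENT", 1581316890, 0},
      {"QUERY_EVENT", 1581316823, 35},
      {"TABLE_MAP_EVENT", 1581316823, 0},
      {"WRITE_ROWS_EVENT", 1581316823, 0},
      {"TABLE_MAP_EVENT", 1581316858, 0},
      {"WRITE_ROWS_EVENT", 1581316858, 0},
      {"XID_EVENT", 1581316890, 0},
  };
}
```

    Replaying the trace yields 1581316890, 1581316858, 1581316823, 1581316823, 1581316858, 1581316858, 1581316890. Within this one transaction last_master_timestamp moves by up to 67 seconds, so the reported seconds_behind_master shifts by the same amount even though the slave is applying a single transaction.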

    Parallel replication:

    rpl_slave.cc::mts_checkpoint_routine()
    ts = rli->gaq->empty()
             ? 0
             : reinterpret_cast<Slave_job_group *>(rli->gaq->head_queue())->ts;
    rli->reset_notified_checkpoint(cnt, ts, true);
    /* end-of "Coordinator::"commit_positions" */

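    In this mode the slave keeps a dispatch queue, the gaq. If the gaq is empty, last_master_timestamp is set to 0. If the gaq is not empty, a checkpoint called the low-water mark (lwm) is maintained: every transaction before the lwm has finished executing on the slave, and last_master_timestamp is updated to the end time of the transaction at the lwm. This value is of type time_t.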

    ptr_group->ts = common_header->when.tv_sec +
                       (time_t)exec_time;  // Seconds_behind_master related
    rli->rli_checkpoint_seqno++;
    if (update_timestamp) {
     mysql_mutex_lock(&data_lock);
     last_master_timestamp = new_ts;
     mysql_mutex_unlock(&data_lock);
    }

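    With parallel replication, last_master_timestamp is updated only after events have finished executing, so seconds_behind_master behaves differently under non-parallel and parallel replication.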
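
    The checkpoint behaviour can be modelled with a toy queue. This is a heavily simplified sketch under stated assumptions, not the MTS implementation: `JobGroup` and `Coordinator` are hypothetical types, and dropping the finished groups from the head of the queue is my simplification of how the low-water mark advances. Only the final step, taking the head's ts or 0 when the queue is empty, follows the `mts_checkpoint_routine` snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <deque>

// Hypothetical, simplified stand-in for a Slave_job_group entry in the gaq.
struct JobGroup {
  std::time_t ts;  // when.tv_sec + exec_time of the group
  bool done;       // set once a worker has finished applying the group
};

struct Coordinator {
  std::deque<JobGroup> gaq;  // dispatched groups, oldest first
  std::time_t last_master_timestamp = 0;

  // Enqueue a group that has been handed to a worker.
  void dispatch(std::time_t ts) {
    assert(ts >= 0);
    gaq.push_back({ts, false});
  }

  // Mark the group at the given position in the current queue as applied.
  void mark_done(std::size_t index) {
    assert(index < gaq.size());
    gaq[index].done = true;
  }

  // Simplified checkpoint: drop the finished prefix, then take the timestamp
  // of the new head, or 0 if nothing is left in the queue.
  void checkpoint() {
    while (!gaq.empty() && gaq.front().done) gaq.pop_front();
    last_master_timestamp = gaq.empty() ? 0 : gaq.front().ts;
  }
};
```

    In this model a group that finishes out of order does not move last_master_timestamp; the value only advances when the oldest outstanding group completes and the next checkpoint runs, which is one reason the reported delay differs from the non-parallel case.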

    clock_diff_with_master

    Definition:

    • The difference in seconds between the clock of the master and the clock of the slave (second - first). It must be signed as it may be <0 or >0. clock_diff_with_master is computed when the I/O thread starts; for this the I/O thread does a SELECT UNIX_TIMESTAMP() on the master.

    • type: long

    rpl_slave.cc::get_master_version_and_clock()
    if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
         (master_res= mysql_store_result(mysql)) &&
         (master_row= mysql_fetch_row(master_res)))
     {
       mysql_mutex_lock(&mi->data_lock);
       mi->clock_diff_with_master=
         (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
       DBUG_EXECUTE_IF("dbug.mts.force_clock_diff_eq_0",
         mi->clock_diff_with_master= 0;);
       mysql_mutex_unlock(&mi->data_lock);
     }

    This difference is computed only once, when the slave establishes its connection to the master.
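
    A small sketch of the arithmetic may help with the sign convention. The helper names `clock_diff_with_master()` and `corrected_delay()` are hypothetical; the first mirrors the subtraction in `get_master_version_and_clock()`, the second the correction applied when seconds_behind_master is reported.

```cpp
#include <cassert>
#include <cstdlib>
#include <ctime>

// Slave clock minus master clock, where the master's clock arrives as the
// text result of SELECT UNIX_TIMESTAMP().
long clock_diff_with_master(std::time_t slave_now,
                            const char *master_unix_timestamp) {
  assert(master_unix_timestamp != nullptr);
  const unsigned long master_now =
      std::strtoul(master_unix_timestamp, nullptr, 10);
  return static_cast<long>(slave_now - static_cast<std::time_t>(master_now));
}

// Delay measured on the slave's clock, corrected for the clock difference.
long corrected_delay(std::time_t slave_now, std::time_t last_master_timestamp,
                     long clock_diff) {
  return static_cast<long>(slave_now - last_master_timestamp) - clock_diff;
}
```

    If the slave's clock reads 1000 when the master's reads 990, the difference is 10. An event stamped 960 by the master then gives a raw difference of 40 on the slave's clock and a corrected delay of 30. Because the difference is sampled only once, clock drift after the connection is established is not compensated.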

    Miscellaneous

    exec_time

    Definition:

    • The difference between the statement's original start timestamp and the time at which it completed executing.

    • type: unsigned long

    struct timeval end_time;
    ulonglong micro_end_time = my_micro_time();
    my_micro_time_to_timeval(micro_end_time, &end_time);
    exec_time = end_time.tv_sec - thd_arg->query_start_in_secs();
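
    The same calculation can be written as a standalone function to show the effect of working in whole seconds. `exec_time_seconds()` is a hypothetical helper; it assumes the end time is given in microseconds since the epoch and the start time in seconds, as in the snippet above.

```cpp
#include <cassert>
#include <ctime>

// Whole-second execution time from a microsecond end time and a start time
// in seconds. The sub-second part of the end time is discarded.
unsigned long exec_time_seconds(unsigned long long micro_end_time,
                                std::time_t query_start_secs) {
  const std::time_t end_secs =
      static_cast<std::time_t>(micro_end_time / 1000000ULL);
  assert(end_secs >= query_start_secs);  // the result is unsigned
  return static_cast<unsigned long>(end_secs - query_start_secs);
}
```

    A statement that starts at second 1581316823 and ends at 1581316858.999999 is recorded with exec_time 35, while one that ends at 1581316823.999999 is recorded with exec_time 0, so up to a second of execution time is lost per statement.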

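    Time functions

    (1) time_t time(time_t *timer): time_t is of type long, and the returned value is only accurate to the second;

    (2) int gettimeofday(struct timeval *tv, struct timezone *tz): returns the current time with microsecond precision;

    (3) The timeval structure: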

    #include <sys/time.h>
    struct timeval {
       time_t tv_sec; /*seconds*/
       suseconds_t tv_usec; /*microseconds*/
    };
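
    The difference in resolution between the two calls can be seen with a short sketch. It assumes a POSIX system where `gettimeofday()` is available; `now_micros()` and `now_seconds()` are hypothetical helper names.

```cpp
#include <cassert>
#include <ctime>
#include <sys/time.h>

// Current time in microseconds since the epoch, via gettimeofday().
long long now_micros() {
  struct timeval tv;
  const int rc = gettimeofday(&tv, nullptr);
  assert(rc == 0);
  (void)rc;  // avoid an unused-variable warning when NDEBUG is defined
  return static_cast<long long>(tv.tv_sec) * 1000000LL + tv.tv_usec;
}

// Current time in whole seconds since the epoch, via time().
std::time_t now_seconds() { return std::time(nullptr); }
```

    Dividing `now_micros()` by 1000000 gives the same second that `now_seconds()` reports, give or take a second boundary between the two calls; the microsecond remainder is the part that seconds_behind_master never sees.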

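    Summary

    seconds_behind_master measures master-slave delay only to one-second precision, and in some scenarios it does not accurately reflect the actual delay between the master and the slave. When replication misbehaves, the seconds_behind_master source code can be used to analyze the specific case.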

  • Original article: https://www.cnblogs.com/2020-zhy-jzoj/p/13164676.html