  • InnoDB engine: long semaphore waits

    The previous post described a case where too many child tables caused InnoDB to crash; the direct cause of that crash was long semaphore waits.
    So what exactly are long semaphore waits?

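    Background: InnoDB uses mutex and rw_lock to protect in-memory data structures; synchronization is either mutual exclusion or reader/writer blocking.
    InnoDB assumes that a mutex or rw_lock is only held for a very short time, so if a thread waits on a mutex or rw_lock for too long,
    it most likely indicates a bug in the program, and InnoDB deliberately crashes itself.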

    For more on InnoDB locks, see the earlier posts on this blog.

    1. How does InnoDB define "too long"?

    /* The following is the maximum allowed duration of a lock wait. */
    UNIV_INTERN ulint    srv_fatal_semaphore_wait_threshold = 600;

    InnoDB considers 600 s long enough.

    2. How does InnoDB detect it?

        /* Create the thread which warns of long semaphore waits */
        os_thread_create(&srv_error_monitor_thread, NULL,
                 thread_ids + 3 + SRV_MAX_N_IO_THREADS);

    At startup InnoDB creates a background thread dedicated to monitoring the wait array:

       if (sync_array_print_long_waits(&waiter, &sema)
            && sema == old_sema && os_thread_eq(waiter, old_waiter)) {
            fatal_cnt++;
            if (fatal_cnt > 10) {
                fprintf(stderr,
                    "InnoDB: Error: semaphore wait has lasted"
                    " > %lu seconds\n"
                    "InnoDB: We intentionally crash the server,"
                    " because it appears to be hung.\n",
                    (ulong) srv_fatal_semaphore_wait_threshold);
    
                ut_error;
            }

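    The check is repeated: once the same thread has been seen waiting on the same sema for more than 10 checks in a row (fatal_cnt > 10), InnoDB crashes itself.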
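
    The counting logic above can be sketched in isolation. This is a minimal standalone illustration, not InnoDB code: monitor_state and monitor_check are hypothetical names, thread ids are simplified to plain integers, and the reset branch (start counting again when the waiter or sema changes) is an assumption, since the excerpt stops before the end of the if statement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state the monitor keeps between two checks. */
typedef struct {
    const void   *old_sema;   /* sema reported by the previous check */
    unsigned long old_waiter; /* thread id reported by the previous check */
    unsigned      fatal_cnt;  /* consecutive fatal checks on the same pair */
} monitor_state;

/*
 * Feed the result of one check into the state.
 * fatal  : whether the scan reported a wait above the fatal threshold
 * sema   : the semaphore with the longest wait
 * waiter : the (simplified) id of the thread waiting on it
 * Returns true when the real server would crash itself.
 */
bool monitor_check(monitor_state *st, bool fatal,
                   const void *sema, unsigned long waiter)
{
    assert(st != NULL);

    if (fatal && sema == st->old_sema && waiter == st->old_waiter) {
        st->fatal_cnt++;
        return st->fatal_cnt > 10;
    }

    /* Assumed reset branch: a different waiter or sema restarts the count. */
    st->fatal_cnt = 0;
    st->old_sema = sema;
    st->old_waiter = waiter;
    return false;
}
```

    Under these assumptions, a state that keeps seeing the same waiter and sema returns true on the 12th call: the first call only records the pair, and the counter must then exceed 10. A single check that reports a different waiter or sema resets the counter to 0.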

        for (i = 0; i < sync_primary_wait_array->n_cells; i++) {
            double    diff;
            void*    wait_object;
            cell = sync_array_get_nth_cell(sync_primary_wait_array, i);
            wait_object = cell->wait_object;
            if (wait_object == NULL || !cell->waiting) {
                continue;
            }
            diff = difftime(time(NULL), cell->reservation_time);
            if (diff > SYNC_ARRAY_TIMEOUT) {
                fputs("InnoDB: Warning: a long semaphore wait:\n",
                      stderr);
                sync_array_cell_print(stderr, cell);
                noticed = TRUE;
            }
            if (diff > fatal_timeout) {
                fatal = TRUE;
            }
            if (diff > longest_diff) {
                longest_diff = diff;
                *sema = wait_object;
                *waiter = cell->thread;

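    sync_primary_wait_array is an array in which every thread waiting on a sema occupies a cell. The scan records the cell that has waited longest, and sets fatal=TRUE when a wait exceeds the fatal timeout (600 s).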
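
    The scan above can be reproduced as a small standalone sketch. It is an illustration under stated assumptions rather than the InnoDB implementation: wait_cell, scan_result and scan_long_waits are hypothetical names, the waited_seconds field stands in for difftime(time(NULL), cell->reservation_time), and the two timeouts are passed as parameters instead of being read from SYNC_ARRAY_TIMEOUT and srv_fatal_semaphore_wait_threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one cell of the wait array. */
typedef struct {
    const void   *wait_object;    /* NULL when the cell is unused */
    bool          waiting;        /* true while a thread is blocked here */
    double        waited_seconds; /* replaces the difftime() computation */
    unsigned long thread;         /* simplified thread id */
} wait_cell;

/* Everything one pass over the array reports. */
typedef struct {
    bool          noticed; /* some wait exceeded the warning timeout */
    bool          fatal;   /* some wait exceeded the fatal timeout */
    const void   *sema;    /* object of the longest wait, or NULL */
    unsigned long waiter;  /* thread of the longest wait */
    double        longest; /* length of the longest wait in seconds */
} scan_result;

scan_result scan_long_waits(const wait_cell *cells, size_t n_cells,
                            double warn_timeout, double fatal_timeout)
{
    scan_result r = { false, false, NULL, 0UL, 0.0 };
    size_t i;

    assert(cells != NULL || n_cells == 0);

    for (i = 0; i < n_cells; i++) {
        const wait_cell *cell = &cells[i];

        /* Skip cells that are free or not currently waiting. */
        if (cell->wait_object == NULL || !cell->waiting) {
            continue;
        }
        if (cell->waited_seconds > warn_timeout) {
            r.noticed = true;
        }
        if (cell->waited_seconds > fatal_timeout) {
            r.fatal = true;
        }
        /* Remember the longest waiter so the caller can compare it
           with the one seen on the previous pass. */
        if (cell->waited_seconds > r.longest) {
            r.longest = cell->waited_seconds;
            r.sema = cell->wait_object;
            r.waiter = cell->thread;
        }
    }
    return r;
}
```

    The returned sema and waiter are what the monitor thread compares against the values from its previous pass, which is why the scan tracks the longest wait even when no timeout has been exceeded.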

  • Original article: https://www.cnblogs.com/xpchild/p/3901649.html