zoukankan      html  css  js  c++  java
  • MySQL-负载很高排查思路

    工欲善其事必先利其器,我说一下思路

    思路:
    1、确定高负载的类型 htop,dstat命令看负载高是CPU还是IO
    2、监控具体的sql语句,是insert update 还是 delete导致高负载
    3、检查mysql日志
    4、检查硬件问题

    dstat

    可以看到具体是哪个用户哪个进程占用了相关系统资源,当前CPU、内存谁在使用

    [root@cc ~]# dstat -l -m -r -c  --top-io --top-mem --top-cpu
    
    --io/total- ------memory-usage----- --most-expensive- ----most-expensive---- -most-expensive-
     read  writ| used  buff  cach  free|  memory process |     i/o process      |  cpu process   
    1.90   267 |3399M  178M 3892M  400M|php-fpm: poo 372M|init       1682k  647k|flush-202:0  0.1
       0  72.0 |3399M  178M 3892M  400M|php-fpm: poo 372M|php-fpm: po  10k  143k|php-fpm: pool2.0
       0  8.00 |3399M  178M 3892M  399M|php-fpm: poo 372M|nginx: work 228k  229k|php-fpm: pool0.5
       0  88.0 |3399M  178M 3892M  399M|php-fpm: poo 372M|nginx: work 102k  166k|php-fpm: pool 11
       0  38.0 |3399M  178M 3892M  399M|php-fpm: poo 372M|php-fpm: po 787k  650B|php-fpm: pool4.8
       0     0 |3399M  178M 3892M  399M|php-fpm: poo 372M|php-fpm: po 788k  723B|php-fpm: pool1.8
       0   140 |3400M  178M 3892M  399M|php-fpm: poo 372M|nginx: work  38k  154k|php-fpm: pool1.2
       0  12.0 |3400M  178M 3892M  399M|php-fpm: poo 372M|nginx: work 178k  364k|php-fpm: pool1.5
       0     0 |3400M  178M 3892M  399M|php-fpm: poo 372M|nginx: work 758k  639k|php-fpm: pool1.5
       0  12.0 |3400M  178M 3892M  399M|php-fpm: poo 372M|nginx: work 773k  616k|php-fpm: pool2.0
    6.00     0 |3401M  178M 3892M  398M|php-fpm: poo 372M|nginx: work 994k  688k|nginx: worker1.5
       0   272 |3401M  178M 3892M  398M|php-fpm: poo 372M|nginx: work 388k  422k|php-fpm: pool1.5
       0     0 |3400M  178M 3893M  398M|php-fpm: poo 372M|nginx: work 483k  548k|php-fpm: pool1.8
       0  4.00 |3400M  178M 3893M  398M|php-fpm: poo 372M|php-fpm: po 787k  650B|php-fpm: pool1.5
       0  12.0 |3400M  178M 3893M  398M|php-fpm: poo 372M|nginx: work 223k  323k|php-fpm: pool1.5
       0     0 |3400M  178M 3893M  398M|php-fpm: poo 372M|nginx: work 371k  474k|php-fpm: pool7.8

    htop

    htop是top的增强版,更直观

    [root@cc ~]# htop
    
      1  [|||||||||||                                                                    12.4%]
      2  [|||||||||                                                                       9.5%]
      3  [|                                                                               1.0%]
      4  [||                                                                              1.9%]
      Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||3394/7869MB]
      Swp[||||||||||||||                                                              75/478MB]
            
    Tasks: 71, 12 thr; 2 running
    Load average: 0.39 0.39 0.31 
    Uptime: 526 days(!), 17:36:38
    
      PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
        1 root       20   0 19232   396   248 S  0.0  0.0  0:01.86 /sbin/init
    30752 root       20   0 52532    72    56 S  0.0  0.0  0:00.16 ├─ /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
    24301 root       20   0  193M  3268  1600 S  0.0  0.0  1:41.43 ├─ /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
    21361 root       20   0  902M  6500  1308 S  0.0  0.1  0:07.16 ├─ php-fpm: master process (/etc/php-fpm.conf)
    28627 www        20   0  962M  202M  138M S  0.0  2.6  0:34.46 │  ├─ php-fpm: pool www-c
    27537 www        20   0  965M  236M  171M R  1.4  3.0  1:19.64 │  ├─ php-fpm: pool www-c
    27449 www        20   0  961M  251M  189M S  0.0  3.2  1:35.54 │  ├─ php-fpm: pool www-a
    26442 www        20   0  962M  280M  217M S  0.0  3.6  2:29.71 │  ├─ php-fpm: pool www-a
    26310 www        20   0  917M  251M  234M S  1.9  3.2  2:46.45 │  ├─ php-fpm: pool www-a
    26162 www        20   0  962M  297M  233M S  0.0  3.8  2:37.50 │  ├─ php-fpm: pool www-b
    26147 www        20   0  924M  258M  233M S  0.0  3.3  2:38.37 │  ├─ php-fpm: pool www-c
    25717 www        20   0  965M  302M  238M S  0.0  3.8  2:54.50 │  ├─ php-fpm: pool www-c
    24585 www        20   0  964M  324M  260M S  0.0  4.1  4:15.20 │  ├─ php-fpm: pool www-b
    

    tcpdump

    抓取mysql包分析,一般抓3306端口的数据

    [root@cc ~]# tcpdump -i eth0 -A -s 3000 port 3306 > ~/sql.log

    然后使用awk,sort,wc 等命令进行分析

    或者

    [root@cc ~]# tcpdump -i eth0 -s 0 -l -w - dst port 3306 | strings | perl -e '
    while(<>) { chomp; next if /^[^ ]+[ ]*$/;
    if(/^(SELECT|UPDATE|DELETE|INSERT|SET|COMMIT|ROLLBACK|CREATE|DROP|ALTER)/i) {
        if (defined $q) { print "$qn"; }
        $q=$_;
        } else {
        $_ =~ s/^[ t]+//; $q.=" $_";
        }
    }'

    就可以看出最繁忙的sql语句了

    strace

    查看系统调用是否有问题,进程是否堵塞,是否有Broken pipe

    [root@cc ~]# strace -p 26578
    

    pt-query-digest

    分析mysql慢日志,查看哪些sql语句最耗时

    [root@cc ~]#  pt-query-digest slow.logs
    
    # 390ms USER TIME, 10ms system TIME, 15.67M rss, 105.84M vsz  
    # CURRENT DATE: Thu DEC 29 13:22:42 2014  
    # Hostname: test   
    # Files: slow.log  
    # Overall: 776 total, 11 UNIQUE, 0.00 QPS, 0.00x concurrency _____________  
    # TIME range: 2011-09-10 04:03:19 TO 2011-12-29 05:02:51  
    # Attribute          total     MIN     MAX     avg     95%  stddev  median  
    # ============     ======= ======= ======= ======= ======= ======= =======  
    # EXEC TIME          5657s      2s     33s      7s     23s      6s      5s  
    # LOCK TIME            33s       0     19s    43ms    98us   715ms    38us  
    # ROWS sent        323.38k       0 107.36k  426.73    0.99   6.35k       0  
    # ROWS examine     323.39k       0 107.36k  426.74       0   6.35k       0  
    # Query SIZE       217.95k      38     562  287.61  420.77   81.78  284.79  

    show processlist

    查看系统到底在干什么

    
    mysql> show full processlist;
    +-----------+---------------+---------------------+---------------------+---------+------+---------------+---------------------------+
    | Id        | User          | Host                | db                  | Command | Time | State         | Info                      |
    +-----------+---------------+---------------------+---------------------+---------+------+---------------+---------------------------+
    | 184498848 | testdb_rr1356 | 10.11.211.120:61343 | testdb_rr1356_db121 | Sleep   | 1384 |               | NULL                      |
    | 184508740 | testdb_rr1356 | 10.11.211.120:11809 | testdb_rr1356_db121 | Sleep   |   87 |               | NULL                      |
    | 184509415 | testdb_rr1356 | 10.11.211.120:12760 | testdb_rr1356_db121 | Query   |    0 | NULL          | show full processlist     |
    | 184509451 | testdb_rr1356 | 10.11.211.120:12804 | testdb_rr1356_db121 | Sleep   |   10 |               | NULL                      |
    | 184509528 | testdb_rr1356 | 10.11.211.120:12919 | testdb_rr1356_db121 | Query   |    0 | freeing items | DESCRIBE test_channel     |

    检查mysql配置参数是否有问题,引起大量的IO或者高CPU操作

    innodb_flush_log_at_trx_commit 、innodb_buffer_pool_size 、key_buffer_size 等重要参数

    mysql> show variables like '%innodb%';
    +---------------------------------+----------------------------+
    | Variable_name                   | Value                      |
    +---------------------------------+----------------------------+
    | have_innodb                     | YES                        |
    | ignore_builtin_innodb           | ON                         |
    | innodb_adaptive_flushing        | ON                         |
    | innodb_adaptive_hash_index      | ON                         |
    | innodb_additional_mem_pool_size | 2097152                    |
    | innodb_autoextend_increment     | 8                          |
    | innodb_autoinc_lock_mode        | 1                          |
    | innodb_buffer_pool_size         | 2013265920                 |
    | innodb_change_buffering         | inserts                    |
    | innodb_checksums                | ON                         |

    通过show engine innodb status查看当前事务,内存使用

    mysql> show engine innodb status  G
    
    LATEST DETECTED DEADLOCK
    ------------------------
    150731 10:36:50
    *** (1) TRANSACTION:
    TRANSACTION EBFBBEC, ACTIVE 0 sec, process no 20691, OS thread id 47345217033984 inserting
    mysql tables in use 1, locked 1
    LOCK WAIT 5 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 2
    MySQL thread id 143249904, query id 1286731854 10.135.21.120 tybuser2014 update
    
    #此处具体sql省略
    
    ----------------------
    BUFFER POOL AND MEMORY
    ----------------------
    Total memory allocated 2058485760; in additional pool allocated 0
    Dictionary memory allocated 819282
    Buffer pool size   122879
    Free buffers       97599
    Database pages     24313
    Old database pages 8954
    Modified db pages  7
    Pending reads 0
    Pending writes: LRU 0, flush list 0, single page 0
    Pages made young 6, not young 0
    0.00 youngs/s, 0.00 non-youngs/s
    Pages read 1049, created 41853, written 30401604
    0.00 reads/s, 0.00 creates/s, 1.75 writes/s
    Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
    Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
    LRU len: 24313, unzip_LRU len: 0
    I/O sum[45]:cur[0], unzip sum[0]:cur[0]

    最后通过zabbix或者cacti等监控来查看IO、CPU、MEMORY、磁盘等是否有异常

    这样基本上就可以把问题找出来了

    仅供参考,当然,不同的DBA肯定还有其他更好的方式

  • 相关阅读:
    实现图片加载从模糊到清晰显示的方法
    审批流程设计方案-介绍(一)
    SpringBoot+JPA实现DDD(一)
    RabbitMQ使用入门
    SpringBoot+JPA实现DDD(六)
    SpringBoot+JPA实现DDD(五)
    Spring Boot+JPA实现DDD(四)
    Spring Boot+JPA实现DDD(三)
    Spring Boot+JPA实现DDD(二)
    DDD入门之解决了什么问题(二)
  • 原文地址:https://www.cnblogs.com/JohnABC/p/5948615.html
Copyright © 2011-2022 走看看