--2019.4.18 mysql技术大会分享--叶金荣
mysql为什么会慢
性能瓶颈分析、排查思路
先确认一下真的是mysql响应慢了吗
导致mysql慢可能会有那些因素呢?
--资源稀缺 ---cpu,memory,io,network
--系统层面 ---numa,io调度等
--mysql配置不科学 ---innodb_buffer_pool_size,innodb_io_capacity等
--垃圾sql ---没有使用索引,大量使用了排序,临时表等
先查看系统层面的负载情况
top、free、vmstat/sar、mpstat、iotop、perf
--top显示了系统总体的cpu和内存使用情况,以及各个进程的资源使用情况 [root@localhost ~]# top top - 01:44:40 up 3 days, 16:08, 4 users, load average: 0.01, 0.02, 0.05 Tasks: 170 total, 1 running, 169 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.4 us, 0.1 sy, 0.0 ni, 99.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem : 18331708 total, 11389252 free, 3680040 used, 3262416 buff/cache KiB Swap: 2097148 total, 2097148 free, 0 used. 14285316 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2828 rabbitmq 20 0 2330112 80920 3504 S 6.2 0.4 65:17.53 beam.smp 11947 root 20 0 157712 2228 1508 R 6.2 0.0 0:00.01 top 1 root 20 0 190892 3820 2400 S 0.0 0.0 0:03.80 systemd 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root 20 0 0 0 0 S 0.0 0.0 0:00.06 ksoftirqd/0 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H 7 root rt 0 0 0 0 S 0.0 0.0 0:00.32 migration/0 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh 9 root 20 0 0 0 0 S 0.0 0.0 1:25.63 rcu_sched 10 root rt 0 0 0 0 S 0.0 0.0 0:01.14 watchdog/0 11 root rt 0 0 0 0 S 0.0 0.0 0:01.03 watchdog/1 12 root rt 0 0 0 0 S 0.0 0.0 0:00.36 migration/1 13 root 20 0 0 0 0 S 0.0 0.0 0:00.06 ksoftirqd/1 15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H 16 root rt 0 0 0 0 S 0.0 0.0 0:01.02 watchdog/2 17 root rt 0 0 0 0 S 0.0 0.0 0:00.19 migration/2 18 root 20 0 0 0 0 S 0.0 0.0 0:00.07 ksoftirqd/2 19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0 20 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H 21 root rt 0 0 0 0 S 0.0 0.0 0:01.03 watchdog/3 22 root rt 0 0 0 0 S 0.0 0.0 0:00.21 migration/3 23 root 20 0 0 0 0 S 0.0 0.0 0:00.24 ksoftirqd/3 27 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 khelper 28 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs 29 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns 30 root 20 0 0 0 0 S 0.0 0.0 0:00.19 khungtaskd 31 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback 32 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd 33 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset 34 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd 35 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 md 36 root 20 0 0 0 0 S 0.0 0.0 0:21.68 kworker/2:1 41 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0 42 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd 43 root 39 19 0 0 0 S 0.0 0.0 0:01.71 khugepaged 44 root 20 0 0 0 0 S 0.0 0.0 0:00.00 fsnotify_mark 45 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto 53 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kthrotld
load average: 0.01, 0.02, 0.05 ,依次是过去1分钟、5分钟、15分钟的平均负载。
--平均负载,单位时间内,系统处于可运行状态和不可中断状态的平均进程数,也就是平均活跃进程数,它和cpu使用率并没有直接关系。
平均负载最理想的情况是等于cpu数,所以在判断平均负载时,首先知道cpu,但是如果load高于5可能就会有问题
[root@localhost ~]# grep 'model name' /proc/cpuinfo | wc -l
4
%Cpu(s): 0.4 us, 0.1 sy, 0.0 ni, 99.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
us 用户CPU时间
sy 系统CPU时间,如果太高,表示系统调用时间长,例如是IO操作频繁
id 空闲 CPU时间,一般来说,id + us + sy = 100
wa 等待io时间
--如果us%值占比很高,很可能是sql语句没有使用到索引或使用不当
--如果是sy%占比很高,要关注mysql的等待时间和锁信息
--如果wa%太高,要关注系统io,mysql使用磁盘临时表或刷盘等操作,可以后续用iostat来查看磁盘
[root@localhost ~]# free -g total used free shared buff/cache available Mem: 17 3 10 0 3 13 Swap: 1 0 1
--free查看系统内存及使用情况
一般mysql服务都不建议使用Swap,所以对应used项为0,如果使用了swap,就查看numa(numactl),看是否有多numa不均衡,建议关闭
如果available(远远小于)<< total - free的值,说明多半可能出现了内存泄漏,
出现了内存泄漏解决办法
--重启mysql服务
--升级到最新的小版本mysql
[root@localhost ~]# vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 0 11385476 1040 3261992 0 0 1 5 1 28 0 0 99 0 0 0 0 0 11385336 1040 3262024 0 0 0 0 508 838 0 0 100 0 0 0 0 0 11385336 1040 3262024 0 0 0 0 475 786 0 0 99 0 0 0 0 0 11385336 1040 3262028 0 0 0 4 486 799 1 0 99 0 0 0 0 0 11382916 1040 3262028 0 0 0 0 504 854 1 0 99 0 0 0 0 0 11382668 1040 3262028 0 0 0 0 705 1076 1 1 99 0 0 0 0 0 11382792 1040 3262036 0 0 0 0 559 902 1 0 99 0 0 0 0 0 11383660 1040 3262036 0 0 0 0 487 796 1 0 99 0 0 0 0 0 11383660 1040 3262048 0 0 0 0 533 874 1 0 99 0 0 0 0 0 11383660 1040 3262048 0 0 0 0 478 833 0 0 99 0 0
--vmstat 观察内存、swap、io、cpu等详细情况
--r:就绪队列的长度,表示运行队列(就是说多少个进程真的分配到CPU),当这个值超过了CPU数目,就会出现CPU瓶颈了
--b:表示阻塞的进程
--us(user)和sy(system)列,如果sy很高,说明cpu主要被内核占用
--in:中断次数,如果值很大,说明中断处理也是个潜在的问题
[root@localhost ~]# mpstat -P ALL 5 Linux 3.10.0-514.ky3.kb5.x86_64 (localhost.localdomain) 04/19/2019 _x86_64_ (4 CPU) 02:09:11 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 02:09:16 AM all 0.50 0.00 0.20 0.00 0.00 0.00 0.00 0.00 0.00 99.30 02:09:16 AM 0 1.00 0.00 0.20 0.00 0.00 0.00 0.00 0.00 0.00 98.80 02:09:16 AM 1 0.40 0.00 0.40 0.00 0.00 0.20 0.00 0.00 0.00 99.00 02:09:16 AM 2 0.60 0.00 0.20 0.20 0.00 0.00 0.00 0.00 0.00 99.00 02:09:16 AM 3 0.00 0.00 0.20 0.00 0.00 0.00 0.00 0.00 0.00 99.80
--mpstat 观察各个逻辑cpu是否负载均衡(可能会出现中断不均衡导致性能问题)
sys usr iowait
perf,是linux2.6.31以后内置的性能分析工具,它以性能事件采样为基础,不仅可以分析系统的各种事件和内核性能,还可以用来分析指定应用程序的性能问题。
[root@mysqlhq ~]# yum install perf -y [root@mysqlhq ~]# ps -ef|grep mysql [root@mysqlhq ~]# perf top -p 4122 Samples: 746 of event 'cpu-clock', Event count (approx.): 75362059 Overhead Shared Object Symbol 43.43% [kernel] [k] find_get_pages 10.55% [kernel] [k] radix_tree_next_chunk 1.87% mysqld [.] lfind 1.75% libc-2.17.so [.] __memcpy_ssse3_back 1.67% mysqld [.] MYSQLparse 1.27% [kernel] [k] mpt_put_msg_frame 0.97% mysqld [.] my_hash_sort_utf8 0.77% [kernel] [k] finish_task_switch 0.74% mysqld [.] Protocol::net_store_data 0.74% [kernel] [k] _raw_spin_unlock_irqrestore 0.67% mysqld [.] mtr_commit 0.62% mysqld [.] alloc_root 0.62% mysqld [.] close_thread_table 0.61% [kernel] [k] system_call_after_swapgs 0.60% mysqld [.] Protocol::store_string_aux 0.59% mysqld [.] page_cur_search_with_match 0.55% mysqld [.] buf_page_get_gen 0.48% [kernel] [k] fget_light 0.46% mysqld [.] my_strcasecmp_utf8 0.45% [kernel] [k] xlog_space_left 0.44% mysqld [.] btr_cur_search_to_nth_level 0.43% mysqld [.] show_status_array 0.42% mysqld [.] sync_array_print_long_waits 0.42% [kernel] [k] kmem_cache_alloc 0.40% libc-2.17.so [.] __memcmp_sse4_1 0.40% mysqld [.] THD::enter_stage 0.37% mysqld [.] Item_cond::fix_fields 0.35% [kernel] [k] do_sys_poll 0.34% libc-2.17.so [.] malloc 0.34% mysqld [.] btr_search_guess_on_hash [root@mysqlhq ~]# perf top Samples: 1K of event 'cpu-clock', Event count (approx.): 189946503 Overhead Shared Object Symbol 9.06% [kernel] [k] find_get_pages 3.46% [kernel] [k] __do_softirq 3.27% [kernel] [k] _raw_spin_unlock_irqrestore 2.83% [kernel] [k] finish_task_switch 2.83% [kernel] [k] radix_tree_next_chunk 1.74% perf [.] __symbols__insert 1.64% [kernel] [k] clear_page 1.49% [kernel] [k] kallsyms_expand_symbol.constprop.1 1.16% libc-2.17.so [.] __memcpy_ssse3_back 1.16% libslang.so.2.2.4 [.] SLsmg_write_ --Overhead,是该符号的性能事件所在所有采样中的比例,用百分比来表示 --Shared,是该函数或指令所在的动态共享对象(Dynamic Shared Object),如内核、进程名、动态链接库名、内核模块名等。 --Object,是动态共享对象的类型,比如[.]表示用户空间的可执行程序、或者动态链接库,而[k]则表示内核空间 --Symbol 是符号名,也就是函数名,当函数名未知时,用十六进制的地址来表示 第一行包含3个数据,分别是采样数据Samples,事件类型event和事件总数 Event count,比如这个例子中,perf总共采集了476个cpu时钟事件,而总事件数是75362059 [root@mysqlhq ~]# perf record #ctrl+c终止 $ perf report --sort comm,dso,symbol | head -10 PerfTop: 7 irqs/sec kernel:42.9% exact: 0.0% [1000Hz cpu-clock], (target_pid: 2974) --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- samples pcnt function DSO _______ _____ _____________________________________________________________________________________________________________________ ___________________________ 176.00 14.8% _Z24btr_search_guess_on_hashP12dict_index_tP12btr_search_tPK8dtuple_tmmP9btr_cur_tmP5mtr_t /usr/local/mysql/bin/mysqld 147.00 12.3% _Z20rec_get_offsets_funcPKhPK12dict_index_tPmmPP16mem_block_info_t /usr/local/mysql/bin/mysqld 108.00 9.1% _Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb /usr/local/mysql/bin/mysqld 85.00 7.1% my_strnncollsp_utf8 /usr/local/mysql/bin/mysqld 74.00 6.2% _Z29cmp_dtuple_rec_with_match_lowPK8dtuple_tPKhPKmmPm /usr/local/mysql/bin/mysqld 70.00 5.9% _Z23ha_insert_for_fold_funcP12hash_table_tmPKh /usr/local/mysql/bin/mysqld 67.00 5.6% _Z26page_cur_search_with_matchPK11buf_block_tPK12dict_index_tPK8dtuple_t15page_cur_mode_tPmS9_P10page_cur_tP8rtr_info /usr/local/mysql/bin/mysqld 65.00 5.5% _raw_spin_unlock_irqrestore [kernel.kallsyms] 49.00 4.1% __GI_memcpy /lib64/libc-2.12.so 36.00 3.0% _Z8ut_delaym /usr/local/mysql/bin/mysqld 31.00 2.6% _Z27ha_remove_all_nodes_to_pageP12hash_table_tmPKh /usr/local/mysql/bin/mysqld 28.00 2.3% _Z19ha_delete_hash_nodeP12hash_table_tP9ha_node_t /usr/local/mysql/bin/mysqld 27.00 2.3% finish_task_switch [kernel.kallsyms] 23.00 1.9% _ZN11PolicyMutexI14TTASEventMutexI16BlockMutexPolicyEE5enterEjjPKcj /usr/local/mysql/bin/mysqld 17.00 1.4% _ZL15cmp_whole_fieldmmPKhjS0_j /usr/local/mysql/bin/mysqld 12.00 1.0% _Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm /usr/local/mysql/bin/mysqld 12.00 1.0% _ZN5mtr_t7Command11release_allEv /usr/local/mysql/bin/mysqld 11.00 0.9% _Z40row_sel_field_store_in_mysql_format_funcPhPK17mysql_row_templ_tPKhm /usr/local/mysql/bin/mysqld 10.00 0.8% _Z41btr_cur_search_to_nth_level_with_no_latchP12dict_index_tmPK8dtuple_t15page_cur_mode_tP9btr_cur_tPKcmP5mtr_tb /usr/local/mysql/bin/mysqld 10.00 0.8% get_charset /usr/local/mysql/bin/mysqld 10.00 0.8% _Z18get_datetime_valueP3THDPPP4ItemS3_S2_Pb /usr/local/mysql/bin/mysqld 10.00 0.8% _ZL30row_sel_store_mysql_field_funcPhP14row_prebuilt_tPKhPKmmPK17mysql_row_templ_tm /usr/local/mysql/bin/mysqld 9.00 0.8% _Z11ut_crc32_hwPKhm /usr/local/mysql/bin/mysqld 8.00 0.7% _Z27btr_cur_search_to_nth_levelP12dict_index_tmPK8dtuple_t15page_cur_mode_tmP9btr_cur_tmPKcmP5mtr_t /usr/local/mysql/bin/mysqld 7.00 0.6% _ZL18rw_lock_s_lock_lowP9rw_lock_tmPKcm /usr/local/mysql/bin/mysqld 7.00 0.6% _ZL23row_sel_store_mysql_recPhP14row_prebuilt_tPKhPK8dtuple_tmPK12dict_index_tPKmb /usr/local/mysql/bin/mysqld 6.00 0.5% _Z18row_search_no_mvccPh15page_cur_mode_tP14row_prebuilt_tmm /usr/local/mysql/bin/mysqld 6.00 0.5% _ZN11PolicyMutexI14TTASEventMutexI16BlockMutexPolicyEE4exitEv /usr/local/mysql/bin/mysqld 6.00 0.5% _Z38row_mysql_store_col_in_innobase_formatP8dfield_tPhmPKhmm /usr/local/mysql/bin/mysqld 6.00 0.5% _Z21dict_index_copy_typesP8dtuple_tPK12dict_index_tm /usr/local/mysql/bin/mysqld 5.00 0.4% _Z18buf_block_from_ahiPKh /usr/local/mysql/bin/mysqld 5.00 0.4% system_call_after_swapgs [kernel.kallsyms] 5.00 0.4% pfs_start_rwlock_wait_v1 /usr/local/mysql/bin/mysqld
[root@mysqlhq ~]# perf record #ctrl+c终止 [root@mysqlhq ~]# perf record ^C[ perf record: Woken up 11 times to write data ] [ perf record: Captured and wrote 3.328 MB perf.data (53866 samples) ] [root@mysqlhq ~]# perf report --sort comm,dso,symbol | head -100 no symbols found in /usr/bin/cut, maybe install a debug package? no symbols found in /usr/sbin/rngd, maybe install a debug package? Failed to open /tmp/perf-2276.map, continuing without symbols # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 53K of event 'cpu-clock' # Event count (approx.): 13466500000 # # Overhead Command Shared Object Symbol # ........ ............. ............................. .............................................. # 98.21% swapper [kernel.kallsyms] [k] native_safe_halt 0.25% mysqld [kernel.kallsyms] [k] find_get_pages 0.06% mysqld [kernel.kallsyms] [k] radix_tree_next_chunk 0.05% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 0.04% swapper [kernel.kallsyms] [k] __do_softirq 0.02% mysql libc-2.17.so [.] __strncmp_sse42 0.02% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 0.02% swapper [kernel.kallsyms] [k] rcu_process_callbacks 0.02% swapper [kernel.kallsyms] [k] run_timer_softirq 0.02% sh [kernel.kallsyms] [k] __do_page_fault 0.02% rcu_sched [kernel.kallsyms] [k] finish_task_switch
iotop – simple top-like I/O monitor
iotop是一个用来监视磁盘I/O使用状况的 top 类工具,可监测到哪一个程序使用的磁盘IO的信息(requires 2.6.20 or later)
yum -y install iotop
--version #显示版本号
-h, --help #显示帮助信息
-o, --only #显示进程或者线程实际上正在做的I/O,而不是全部的,可以随时切换按o
-b, --batch #运行在非交互式的模式
-n NUM, --iter=NUM #在非交互式模式下,设置显示的次数,
-d SEC, --delay=SEC #设置显示的间隔秒数,支持非整数值
-p PID, --pid=PID #只显示指定PID的信息
-u USER, --user=USER #显示指定的用户的进程的信息
-P, --processes #只显示进程,一般为显示所有的线程
-a, --accumulated #显示从iotop启动后每个线程完成了的IO总数
-k, --kilobytes #以千字节显示
-t, --time #在每一行前添加一个当前的时间
-q, --quiet #suppress some lines of header (implies --batch). This option can be specified up to three times to remove header lines.
-q column names are only printed on the first iteration,
-qq column names are never printed,
-qqq the I/O summary is never printed.
# iotop -oP Total DISK READ : 0.00 B/s | Total DISK WRITE : 15.49 K/s Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 16.46 K/s PID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND 680 be/3 root 0.00 B/s 7.75 K/s 0.00 % 0.29 % auditd -n 36 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.01 % [kworker/2:1] 989 be/4 root 0.00 B/s 7.75 K/s 0.00 % 0.00 % rsyslogd -n
如果出现jbd2/vda1-8 占用大量磁盘io,建议磁盘分区的文件格式XFS,不要使用ext4格式的磁盘
[root@localhost ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/kylin-root xfs 496G 14G 483G 3% /
/dev/sda1 xfs 1014M 181M 834M 18% /boot
如果是虚拟机或者云主机怎么办
云主机就只能联系云厂商去协商解决
必要时,也要检查硬件层的情况
在查看硬件层、系统层有无异常事件
/var/log/messages*
/var/log/dmesg
/var/log/kern*
在查看mysql的整体情况
sys schema(mysql 5.7+)
--select * from sys.host_summary order by statements desc limit 10;
--select statement,total,total_latency,rows_sent,rows_examined,rows_affected,
full_scans from sys.host_summary_by_statement_type where host='localhost' order by total desc limit 5;
pt-summary ##收集和显示系统概况
pt-mysql-summary ##对mysql配置和status进行汇总
pt-variable-advisor ##分析参数,并提出建议
pt-mext ##并行查看status样本信息
sys schema解读
查看MySQL当前的线程状态
(system@127.0.0.1:3306) [information_schema]> show full processlist;
其他需要关注的状态有: coping to tmp table、sending data、 creating sort index、 sorting result、 query end、 Waiting ... lock等等
查看InnoDB关键状态指标
(system@127.0.0.1:3306) [information_schema]> show engine innodb statusG;
---TRANSACTION 0, ACTIVE 2130 sec ## 事务的时间,长事务,有问题
12360 lock struct(s),heap size 15384566,51242 row lock(s), undo log entires 64423 ##大量的行锁和大量的undo 日志
除此外,也要关注“A long semaphore wait”告警
查看MySQL关键状态指标
(system@127.0.0.1:3306) [information_schema]> show global status like 'handler_read%';
+-----------------------+------------+
| Variable_name | Value |
+-----------------------+------------+
| Handler_read_first | 1463098 |
| Handler_read_key | 47240483 |
| Handler_read_last | 2 |
| Handler_read_next | 172676638 |
| Handler_read_prev | 44 |
| Handler_read_rnd | 8799996 | ##查询直接操作了数据文件,很多时候表现为没有使用索引或者文件排序
| Handler_read_rnd_next | 1374741574 | ##通常说明你的表索引不正确或写入的查询没有利用索引。
+-----------------------+------------+
其他指标: Created_tmp_tables、 Created_tmp_disk_tables、Select_scan、 Select_full_join、 Sort_merge_passes等等
干掉垃圾sql
explain、 desc
optimizer_trace
profiling
session status
pt-query-digest
explain
type:显示关联类型。重点关注ALL(全表扫描)、 index(全索引扫描)
key_len:使用到索引的长度。通常该值大于30就要注意被选中的索引是否字符串类型,可否进一步优化
rows:预估扫描的行数。通常该值大于1万就要注意可否选择更合适的索引减少扫描的行数
extra:显示额外信息。重点关注Using temporary,Using filesort,尽量通过添加或调整来消除
optimizer_trace mysql> set optimizer_trace='enabled=on'; mysql> set optimizer_trace_max_mem_size=1000000; mysql> explain ... mysql> select * from information_schema.optimizer_traceG "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "ind_c2", "ranges": [ "b <= c2 <= b" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": false, "rows": 4, --扫描行数 "cost": 5.81, --代价 "chosen": false, --未被选中 "cause": "cost" --未被选中原因是"成本太高" }, { "index": "ind_c4", "ranges": [ "20 <= c4 <= 20" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": false, "rows": 2, --扫描行数 "cost": 3.41, --代价 "chosen": true --被选中 }, { "index": "ind_c2_c4", "ranges": [ "b <= c2 <= b AND 20 <= c4 <= 20" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": false, "rows": 2, "cost": 3.41, "chosen": false, "cause": "cost" } ],
profiling
mysql> set profiling=1; mysql> select * from t order by non_key_col; mysql> show profiles; mysql> show profile for query 1; session status mysql> pager cat - > /dev/null; mysql> flush status; mysql> select * from t order by non_key_col; mysql> nopager; mysql> show status like ‘handler_read_%’;
pt-query-digest
[root@DSI log]# pt-query-digest mysql-slow.log > pt_slow.log [root@DSI log]# cat pt_slow.log # A software update is available: # 140ms user time, 40ms system time, 25.21M rss, 214.06M vsz # Current date: Fri Apr 19 16:34:42 2019 # Hostname: DSI # Files: mysql-slow.log # Overall: 70 total, 20 unique, 0.00 QPS, 0.00x concurrency ______________ # Time range: 2019-04-17T14:30:16 to 2019-04-18T09:40:49 # Attribute total min max avg 95% stddev median # ============ ======= ======= ======= ======= ======= ======= ======= # Exec time 51s 355us 7s 724ms 4s 1s 7ms # Lock time 118ms 0 14ms 2ms 8ms 3ms 131us # Rows sent 11.21M 0 2.71M 164.03k 419.40k 545.43k 14.52 # Rows examine 106.53M 120 13.56M 1.52M 13.08M 3.64M 363.48 # Query size 9.87k 47 554 144.34 234.30 98.56 97.36 # Profile # Rank Query ID Response time Calls R/Call V/M I # ==== ================================ ============= ===== ====== ===== = # 1 0xAF2BD8B9E02829BB4A867D8C3EE... 14.4647 28.5% 2 7.2324 0.02 SELECT employees salaries # 2 0xE3C753C2F267B2D767A347A2812... 12.9679 25.6% 12 1.0807 0.76 SELECT salaries # 3 0x8266D3C2CA3BBCEDB3A67A18A29... 9.8776 19.5% 3 3.2925 0.00 SELECT employees salaries # 4 0xCEE7829C0836E31C7E6778A1B13... 7.1704 14.1% 2 3.5852 0.00 SELECT employees salaries # 5 0xD41AD7ECDD2C0E2E8210300E878... 3.6396 7.2% 1 3.6396 0.00 SELECT employees salaries # 6 0xBF4B43B9689E5DF79B0A74D7D42... 1.2942 2.6% 1 1.2942 0.00 SELECT salaries cte # MISC 0xMISC 1.2712 2.5% 49 0.0259 0.0 <14 ITEMS> # Query 1: 0.08 QPS, 0.58x concurrency, ID 0xAF2BD8B9E02829BB4A867D8C3EE5F2BB at byte 10609 # Scores: V/M = 0.02 # Time range: 2019-04-17T16:01:32 to 2019-04-17T16:01:57 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 2 2 # Exec time 28 14s 7s 7s 7s 7s 365ms 7s # Lock time 7 8ms 162us 8ms 4ms 8ms 6ms 4ms # Rows sent 0 20 10 10 10 10 0 10 # Rows examine 25 27.12M 13.56M 13.56M 13.56M 13.56M 0 13.56M # Query size 6 688 341 347 344 347 4.24 344 # String: # Databases employees # Hosts 10.15.7.126 (1/50%), localhost (1/50%) # Users root (1/50%), system (1/50%) # Query_time distribution # 1us # 10us # 100us # 1ms # 10ms # 100ms # 1s ################################################################ # 10s+ # Tables # SHOW TABLE STATUS FROM `employees` LIKE 'employees'G # SHOW CREATE TABLE `employees`.`employees`G # SHOW TABLE STATUS FROM `employees` LIKE 'salaries'G # SHOW CREATE TABLE `employees`.`salaries`G # EXPLAIN /*!50100 PARTITIONS*/ select year(hire_date) hire_date_year,salary,rank()over w as 'rank', first_value(salary) over w as 'first', nth_value(salary,3) over w as 'third', last_value(salary) over w as 'last' from employees join salaries on salaries.emp_no=employees.emp_no window w as (partition by year(hire_date) order by salary desc) order by salary desc limit 10G
几个窍门
mysqld进程消耗cpu长时间超过90%的话,99.9%是因为没用好索引
cpu的%sys很高的话,大概率是swap或中断不均衡导致,也有可能是有多个索引且超高并发写入(更新),或者有很严重的锁等待事件
最大的瓶颈通常是在磁盘I/O上,因此尽量用高速磁盘设备
如果物理磁盘无法再升级,则通过增加内存提升性能容量
遇到无法诊断的问题时,试试用perf top来观测跟踪
sql执行慢,有时未必是效率低,也可能是因为锁等待导致,甚至是磁盘满了
这次分享,叶老师的还是很给力。感谢分享。