pstack
获取堆栈信息
问题线程的定位
负载较低
mysql_pid=4522
pstack $mysql_pid>pstack.info
pt-pmp对堆栈信息排序
pt-pmp pstack.info | less
也可以直接执行pt-pmp
pt-pmp --pid 4522
如
10 __io_getevents_0_4(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2502),LinuxAIOHandler::poll(os0file.cc:2648),os_aio_linux_handler(os0file.cc:2704),os_aio_handler(os0file.cc:2704),fil_aio_wait(fil0fil.cc:5835),io_handler_thread(srv0start.cc:311),start_thread(libpthread.so.0),clone(libc.so.6)
3 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),srv_worker_thread(srv0srv.cc:2520),start_thread(libpthread.so.0),clone(libc.so.6)
2 pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),Per_thread_connection_handler::block_until_new_connection(thr_cond.h:140),handle_connection(connection_handler_per_thread.cc:329),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
gdb
堆栈跟踪
等待分析
侵入性大
mysqld_pid 4522
gdb -p $mysqld_pid
(gdb) info thread 显示运行的所有线程
(gdb) thread 4 切换到某个线程
bt 显示调用栈
如
SELECT B.THREAD_OS_ID AS mysqld_os_thread_id,A.ID as processlist_id,A.USER,A.HOST,A.DB,A.COMMAND,A.TIME,A.STATE,LEFT(A.INFO,150) AS statement FROM information_schema.PROCESSLIST A INNER JOIN performance_schema.threads B on A.ID=B.PROCESSLIST_ID where A.id !=connection_id();
+---------------------+----------------+-------+-----------+------+---------+------+---------------------------------+----------------------------------------+
| mysqld_os_thread_id | processlist_id | USER | HOST | DB | COMMAND | TIME | STATE | statement |
+---------------------+----------------+-------+-----------+------+---------+------+---------------------------------+----------------------------------------+
| 27015 | 4 | admin | localhost | NULL | Sleep | 2597 | | NULL |
| 28252 | 6 | admin | localhost | test | Query | 1166 | Waiting for table metadata lock | alter table test.test_1 drop index idx |
+---------------------+----------------+-------+-----------+------+---------+------+---------------------------------+----------------------------------------+
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f8d107f8700 (LWP 28252))]
#0 0x00007f8d3b147d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0bt
........
#4 MDL_wait::timed_wait (this=0x7f8ce4000958, owner=0x7f8ce40008c0, abs_timeout=0x7f8d107f37e0, set_status_on_timeout=false,
wait_state_name=) at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/mdl.cc:1861
- processlist_id 为6( os中 mysqld的线程id为28252)正在等待MDL 元数据锁,通过gdb堆栈跟踪发现,该连接正在执行的内部函数为 pthread_cond_timedwait
注意
- pstack 和gdb都可以获取堆栈信息,区别是gdb是交互式的(需要手动退出),pstack是快照形式(执行3s自动退出)
- 对mysqld使用gdb后,已经建立的连接无法再执行语句,新连接无法建立
strace
查看进程的系统调用信息
负载较高
查看系统调用
strace -cp $mysqld_pid
查看执行语句
mysqld_pid=4522
strace -f -F -ff -o mysqld-strace -s 1024 -p $mysqld_pid
find ./ -name "mysqld-strace" -type f -print |xargs grep -n "SELECT.FROM"
查看读写的文件
strace -o /tmp/strace_output.txt -T -tt -f -e trace=read,open -p “mysqld_pid”
perf
进程内部函数调用情况
负载较低
mysqld内部函数整体消耗情况
perf top -p 4522
如
Samples: 66 of event 'cpu-clock', Event count (approx.): 6520074
Overhead Shared Object Symbol
18.42% mysqld [.] sync_array_print_long_waits_low
10.83% [kernel] [k] _raw_spin_unlock_irqrestore
7.88% libaio.so.1.0.1 [.] 0x0000000000000645
7.30% [kernel] [k] finish_task_switch
6.84% [kernel] [k] system_call_after_swapgs
6.22% [kernel] [k] aio_read_events
6.02% libc-2.17.so [.] __memset_sse2
记录所有内部函数调用
该命令会输出记录到perf.data中
perf record -p 4522
解析并查看perf.data内容
perf script -i perf.data>perf.log && less perf.log
参考
MySQL所有操作hang住问题的故障排查
linux strace追踪mysql执行语句 (mysqld --debug)
linux 调试利器gdb, strace, pstack, pstree, lsof
Debugging a MySQL Server
What to Do If MySQL Keeps Crashing