这篇文章主要介绍了MySQL DBA怎么利用strace/pstack/gdb来定位问题,具有一定借鉴价值,感兴趣的朋友可以参考下,希望大家阅读完这篇文章之后大有收获,下面让小编带着大家一起了解一下。
strace简介
strace是Linux环境下的一款程序调试工具,用来监察一个应用程序所使用的系统调用。
Strace是一个简单的跟踪系统调用执行的工具。在其最简单的形式中,它可以从开始到结束跟踪二进制的执行,并在进程的生命周期中输出一行具有系统调用名称,每个系统调用的参数和返回值的文本行。
常用选项
-
-T:strace输出显示时间
-
-t:可以在每行的输出之前添加时间戳,-tt可以精确到微妙级别,-ttt也可以精确到微妙级,但是它并不是打印当前时间,而是显示自从epoch以来的所经过的秒数
-
-s:指定输出的字符串的最大长度,默认为32,如果输出到文件的话会全部输出
-
-o:指定将strace输出到文件
-
-ff:如果提供-o filename,则所有进程的跟踪结果输出到相应的filename.pid中,pid是各进程的进程号.
-
-e:指定跟踪某个行为,例如-e trace=open指定只跟踪open行为
-
-y:将文件句柄用文件路径代替显示
正确姿势
简单使用
strace -T -tt -o /tmp/strace.log CMD strace -T -tt CMD 2>&1 |tee /tmp/strace.log strace -T -tt -s 100 -o /tmp/strace.log CMD strace -T -tt -s 100 -ff -o /tmp/strace.log CMD strace -T -tt -s 100 -e trace=XXXX -o /tmp/strace.log CMD
使用案例
利用strace观察客户端client执行SQL
#通过sys.processlist表中pid可以知道客户端连接pid MySQL [sys]> select thd_id,conn_id,user,pid,program_name,command,current_statement from processlist where conn_id>0 and pid>0; +--------+---------+------------------+------+--------------+---------+-------------------+ | thd_id | conn_id | user | pid | program_name | command | current_statement | +--------+---------+------------------+------+--------------+---------+-------------------+ | 78 | 22 | xucl@172.17.0.11 | 8656 | mysql | Sleep | select * from t1 | +--------+---------+------------------+------+--------------+---------+-------------------+ 1 row in set (0.05 sec) #session1 [root@VM_0_11_centos ~]# strace -T -tt -s 100 -o /tmp/strace.log -p 5841 strace: Process 5841 attached ^Cstrace: Process 5841 detached #session进行数据查询 MySQL [(none)]> use xucl Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed MySQL [xucl]> select * from t1; +----+----+ | id | c1 | +----+----+ | 1 | | +----+----+ 1 row in set (0.00 sec)
需要提醒一下的是从客户端连接到MySQL的时候需要注意客户端的版本,例如我用mariadb的客户端连接官方版本8.0就无法获取到pid
查看strace文件
.... 20:54:16.901980 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000006> 20:54:16.902001 write(3, "/21/0/0/0/3select * from t1", 21) = 21 <0.000011> 20:54:16.902026 read(3, "/1/0/0/1/2/"/0/0/2/3def/4xucl/2t1/2t1/2id/2id/f?/0/v/0/0/0/3/3P/0/0/0/"/0/0/3/3def/4xucl/2t1/2t1/2c1/2c1/f!/0/36/0/0/0/375/t@/0/0/0/5/0/0/4/376/0/0/2/0/3/0/0/5/0011/0/5/0/0"..., 16384) = 106 <0.000355> 20:54:16.902405 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429630696 <0.000005> 20:54:16.902441 write(1, "+----+----+/n", 12) = 12 <0.000026> 20:54:16.902482 write(1, "| id | c1 |/n", 12) = 12 <0.000018> 20:54:16.902514 write(1, "+----+----+/n", 12) = 12 <0.000018> 20:54:16.902549 write(1, "| 1 | |/n", 12) = 12 <0.000019> 20:54:16.902582 write(1, "+----+----+/n", 12) = 12 <0.000019> 20:54:16.902631 write(1, "/33(B/33[0;1m1 row in set (0.00 sec)/n", 33) = 33 <0.000019> 20:54:16.902666 write(1, "/33(B/33[m/33(B/33[0;1m/n", 16) = 16 <0.000019> 20:54:16.902702 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=528, ...}) = 0 <0.000006> 20:54:16.902752 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0 <0.000004> 20:54:16.902771 ioctl(0, TIOCGWINSZ, {ws_row=42, ws_col=163, ws_xpixel=0, ws_ypixel=0}) = 0 <0.000005> 20:54:16.902789 ioctl(0, TIOCSWINSZ, {ws_row=42, ws_col=163, ws_xpixel=0, ws_ypixel=0}) = 0 <0.000004> 20:54:16.902806 ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000005> 20:54:16.902824 ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig -icanon -echo ...}) = 0 <0.000006> 20:54:16.902841 write(1, "/33(B/33[m", 6) = 6 <0.000022> 20:54:16.902877 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000005> 20:54:16.902895 rt_sigprocmask(SIG_BLOCK, [INT QUIT ALRM TSTP TTIN TTOU], [], 8) = 0 <0.000005> 20:54:16.902912 rt_sigaction(SIGINT, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {0x55e19fee21d0, [INT], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0 <0.000005> 20:54:16.902931 rt_sigaction(SIGTERM, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0 <0.000005> 20:54:16.902948 rt_sigaction(SIGQUIT, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {0x55e19fee1fa0, [QUIT], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0 <0.000005> 20:54:16.902966 rt_sigaction(SIGALRM, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0 <0.000005> 20:54:16.902983 rt_sigaction(SIGTSTP, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0 <0.000005> 20:54:16.903001 rt_sigaction(SIGTTOU, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0 <0.000004> 20:54:16.903018 rt_sigaction(SIGTTIN, {0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0 <0.000006> 20:54:16.903037 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000005> 20:54:16.903055 rt_sigaction(SIGWINCH, {0x55e19ff481a0, [], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, {0x55e19fee0760, [WINCH], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0 <0.000006> 20:54:16.903088 write(1, "MySQL [xucl]> ", 14) = 14 <0.000032> 20:54:16.903139 read(0, <detached ...>
利用strace观察server端执行
#命令 strace -o /tmp/strace.log -T -tt -f -p `pidof mysqld` #假设你想看跟IO相关的行为,可以用-e trace来指定 # mysql相关的IO操作: # mysql: read, write, open # innodb: pread64, pwrite64 strace -o /tmp/strace.log -T -tt -f -e trace=read,open,write,pwrite64,pread64 -p `pidof mysqld`
示例:
strace -o /tmp/strace.log -T -tt -ff -p `pidof mysqld` #然后在本地通过socket登录执行命令 MySQL [performance_schema]> use xucl; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed MySQL [xucl]> select * from t1; +------+ | id | +------+ | 1 | | 2 | | 4 | | 3 | | 5 | | 6 | +------+ 6 rows in set (0.01 sec) #查看本线程对应的LWP MySQL [xucl]> show processlist; +----+------+-----------+------+---------+------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------+------+---------+------+----------+------------------+ | 28 | root | localhost | xucl | Query | 0 | starting | show processlist | +----+------+-----------+------+---------+------+----------+------------------+ 1 row in set (0.00 sec) MySQL [xucl]> select * from performance_schema.threads where processlist_id=28/G *************************** 1. row *************************** THREAD_ID: 62 NAME: thread/sql/one_connection TYPE: FOREGROUND PROCESSLIST_ID: 28 PROCESSLIST_USER: root PROCESSLIST_HOST: localhost PROCESSLIST_DB: xucl PROCESSLIST_COMMAND: Query PROCESSLIST_TIME: 0 PROCESSLIST_STATE: Sending data PROCESSLIST_INFO: select * from performance_schema.threads where processlist_id=28 PARENT_THREAD_ID: NULL ROLE: NULL INSTRUMENTED: YES HISTORY: YES CONNECTION_TYPE: Socket THREAD_OS_ID: 8208 1 row in set (0.00 sec)
strace会在/tmp文件夹下生成strace.log.xxx,xxx对应的就是THREAD_OS_ID,我们直接看strace.log.8208文件
t@VM_0_9_centos tmp]# cat strace.log.8208 16:44:14.447709 restart_syscall(<... resuming interrupted restart_syscall ...>) = 1 <3.268331> 16:44:17.716076 recvfrom(134<socket:[12555437]>, "/22/0/0/0", 4, MSG_DONTWAIT, NULL, NULL) = 4 <0.000015> 16:44:17.716179 recvfrom(134<socket:[12555437]>, "/3SELECT DATABASE()", 18, MSG_DONTWAIT, NULL, NULL) = 18 <0.000017> 16:44:17.716318 sendto(134<socket:[12555437]>, "/1/0/0/1/1 /0/0/2/3def/0/0/0/nDATABASE()/0/f!/0f"..., 68, MSG_DONTWAIT, NULL, 0) = 68 <0.000033> 16:44:17.716420 recvfrom(134<socket:[12555437]>, "/5/0/0/0", 4, MSG_DONTWAIT, NULL, NULL) = 4 <0.000015> 16:44:17.716476 recvfrom(134<socket:[12555437]>, "/2xucl", 5, MSG_DONTWAIT, NULL, NULL) = 5 <0.000017> 16:44:17.716538 access("./xucl", F_OK) = 0 <0.000016> 16:44:17.716604 sendto(134<socket:[12555437]>, "/7/0/0/1/0/0/0/2/0/0/0", 11, MSG_DONTWAIT, NULL, 0) = 11 <0.000090> 16:44:17.716738 recvfrom(134<socket:[12555437]>, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000014> 16:44:17.716793 poll([{fd=134<socket:[12555437]>, events=POLLIN|POLLPRI}], 1, 600000) = 1 ([{fd=134, revents=POLLIN}]) <1.326139> 16:44:19.042985 recvfrom(134<socket:[12555437]>, "/21/0/0/0", 4, MSG_DONTWAIT, NULL, NULL) = 4 <0.000014> 16:44:19.043048 recvfrom(134<socket:[12555437]>, "/3select * from t1", 17, MSG_DONTWAIT, NULL, NULL) = 17 <0.000017> 16:44:19.043256 sendto(134<socket:[12555437]>, "/1/0/0/1/1/"/0/0/2/3def/4xucl/2t1/2t1/2id/2id/f?"..., 97, MSG_DONTWAIT, NULL, 0) = 97 <0.000137> 16:44:19.043448 recvfrom(134<socket:[12555437]>, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000014> 16:44:19.043501 poll([{fd=134<socket:[12555437]>, events=POLLIN|POLLPRI}], 1, 600000) = 1 ([{fd=134, revents=POLLIN}]) <11.279983> 16:44:30.323556 recvfrom(134<socket:[12555437]>, "/21/0/0/0", 4, MSG_DONTWAIT, NULL, NULL) = 4 <0.000050> 16:44:30.323682 recvfrom(134<socket:[12555437]>, "/3show processlist", 17, MSG_DONTWAIT, NULL, NULL) = 17 <0.000017> 16:44:30.323788 sendto(134<socket:[12555437]>, "/1/0/0/1/10/30/0/0/2/3def/0/0/0/2Id/0/f?/0/25/0/0/0/10/201/0/0/0"..., 324, MSG_DONTWAIT, NULL, 0) = 324 <0.000221> 16:44:30.324066 recvfrom(134<socket:[12555437]>, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000014> 16:44:30.324120 poll([{fd=134<socket:[12555437]>, events=POLLIN|POLLPRI}], 1, 600000) = 1 ([{fd=134, revents=POLLIN}]) <2.943989> 16:44:33.268173 recvfrom(134<socket:[12555437]>, "A/0/0/0", 4, MSG_DONTWAIT, NULL, NULL) = 4 <0.000055> 16:44:33.268309 recvfrom(134<socket:[12555437]>, "/3select * from performance_schem"..., 65, MSG_DONTWAIT, NULL, NULL) = 65 <0.000017> 16:44:33.268592 sendto(134<socket:[12555437]>, "/1/0/0/1/21H/0/0/2/3def/22performance_schema"..., 1606, MSG_DONTWAIT, NULL, 0) = 1606 <0.000142> 16:44:33.268790 recvfrom(134<socket:[12555437]>, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000018> 16:44:33.268853 poll([{fd=134<socket:[12555437]>, events=POLLIN|POLLPRI}], 1, 600000 <detached ...>
可以清楚地看到有对应的SQL输出
pstack简介
pstack用来跟踪进程栈,这个命令在排查进程问题时非常有用,比如我们发现一个服务一直处于work状态(如假死状态,好似死循环),使用这个命令就能轻松定位问题所在;
可以在一段时间内,多执行几次pstack,若发现代码栈总是停在同一个位置,那个位置就需要重点关注,很可能就是出问题的地方;
正确使用姿势
pstack使用起来非常简单,直接pstack -pid即可
[root@VM_0_9_centos mysql3306]# netstat -lntp|grep 3306 tcp6 0 0 :::33060 :::* LISTEN 13499/mysqld tcp6 0 0 :::3306 :::* LISTEN 13499/mysqld [root@VM_0_9_centos mysql3306]# ps -Lf 13499 UID PID PPID LWP C NLWP STIME TTY STAT TIME CMD mysql 13499 11316 13499 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13506 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13507 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13508 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13509 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13510 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql mysql 13499 11316 13511 0 53 20:31 pts/0 Sl 0:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --user=mysql ....
mysqld的pid为13499,fork了几十个线程,有的是用户连接,有的是MySQL自己的线程,例如read_thread、write_thread等等,这里的LWP对应的就是performance_schema.threads表的THREAD_OS_ID。
全部抓取
假如发生了mysqld hang住的情况,可以直接pstack mysqldpid来抓取所有的线程堆栈,这也是通常使用的方式,例如
pstack 13499 >> /tmp/pstack.log
这里简单随便看下抓取到的pstack
Thread 27 (Thread 0x7ff1bc29f700 (LWP 13535)): #0 0x00007ff1f6368da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00000000020ace25 in os_event::timed_wait(timespec const*) () at ../../../mysql-8.0.19/storage/innobase/include/sync0types.h:540 #2 0x00000000020ad9e1 in os_event::wait_time_low (this=0x7ff1dc022458, time_in_usec=<optimized out>, reset_sig_count=1) at ../../../mysql-8.0.19/storage/innobase/os/os0event.cc:495 #3 0x000000000215fe0c in srv_error_monitor_thread() () at ../../../mysql-8.0.19/storage/innobase/srv/srv0srv.cc:1835 #4 0x0000000002076da5 in __invoke_impl<void, void (*&)()> (__f=<synthetic pointer>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:89 #5 __invoke<void (*&)()> (__fn=<synthetic pointer>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:95 #6 __call<void> (__args=..., this=<synthetic pointer>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/functional:400 #7 operator()<> (this=<synthetic pointer>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/functional:484 #8 operator()<void (*)()> (f=@0x7ff1dc4aec18: 0x215fd20 <srv_error_monitor_thread()>, this=0x7ff1dc4aec20) at ../../../mysql-8.0.19/storage/innobase/include/os0thread-create.h:101 #9 __invoke_impl<void, Runnable, void (*)()> (__f=...) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:60 #10 __invoke<Runnable, void (*)()> (__fn=...) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:95 #11 _M_invoke<0, 1> (this=0x7ff1dc4aec18) at /opt/rh/devtoolset-8/root/usr/include/c++/8/thread:244 #12 operator() (this=0x7ff1dc4aec18) at /opt/rh/devtoolset-8/root/usr/include/c++/8/thread:253 #13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run (this=0x7ff1dc4aec10) at /opt/rh/devtoolset-8/root/usr/include/c++/8/thread:196 #14 0x000000000265d64f in execute_native_thread_routine () #15 0x00007ff1f6364e65 in start_thread () from /lib64/libpthread.so.0 #16 0x00007ff1f44a188d in clone () from /lib64/libc.so.6
从堆栈可以看出这其实就是MySQL的error monitor线程
单个抓取
假设你运气好,你现在可以通过performance_schema.threads可以查到对应的线程的thread_os_id,你可以指定单个线程的lwp进行抓取
MySQL [performance_schema]> show processlist; +----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+ | 4 | event_scheduler | localhost | NULL | Daemon | 4319 | Waiting on empty queue | NULL | | 23 | root | localhost | performance_schema | Query | 0 | starting | show processlist | | 24 | root | localhost | xucl | Query | 22 | User sleep | select sleep(600) from t1 | +----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+ 3 rows in set (0.00 sec) MySQL [performance_schema]> select THREAD_OS_ID from threads where processlist_id=24/G *************************** 1. row *************************** THREAD_OS_ID: 15020 1 row in set (0.00 sec)
你可以这样操作
pstack 15020 > /tmp/single_pstack.log
查看pstack结果,只抓取了15020线程的堆栈
Thread 1 (process 15020): #0 0x00007ff1f6368da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x0000000001189b53 in native_cond_timedwait (abstime=0x7ff1b810d7f0, mutex=0x3b96760 <LOCK_item_func_sleep>, cond=0x7ff1b810d8e0) at ../../mysql-8.0.19/include/thr_cond.h:149 #2 my_cond_timedwait (abstime=0x7ff1b810d7f0, mp=0x3b96760 <LOCK_item_func_sleep>, cond=0x7ff1b810d8e0) at ../../mysql-8.0.19/include/thr_cond.h:149 #3 inline_mysql_cond_timedwait (src_file=0x2de5538 "../../mysql-8.0.19/sql/item_func.cc", src_line=4786, abstime=0x7ff1b810d7f0, mutex=0x3b96760 <LOCK_item_func_sleep>, that=0x7ff1b810d8e0) at ../../mysql-8.0.19/include/mysql/psi/mysql_cond.h:217 #4 Interruptible_wait::wait (thisthis=this@entry=0x7ff1b810d890, condcond=cond@entry=0x7ff1b810d8e0, mutexmutex=mutex@entry=0x3b96760 <LOCK_item_func_sleep>) at ../../mysql-8.0.19/sql/item_func.cc:4786 #5 0x0000000001189e26 in Item_func_sleep::val_int (this=<optimized out>) at ../../mysql-8.0.19/sql/item_func.cc:5435 #6 0x00000000010ff567 in Item::send (this=0x7ff1d45d9380, protocol=0x7ff1d4392240, buffer=<optimized out>) at ../../mysql-8.0.19/sql/item.cc:6638 #7 0x0000000000e3d36b in THD::send_result_set_row(List<Item>*) () at ../../mysql-8.0.19/sql/sql_class.cc:2530 #8 0x000000000127e8f1 in Query_result_send::send_data (this=<optimized out>, thd=0x7ff1d457b7b0, items=...) at ../../mysql-8.0.19/sql/query_result.cc:93 #9 0x0000000000f5d29a in SELECT_LEX_UNIT::ExecuteIteratorQuery(THD*) () at ../../mysql-8.0.19/sql/sql_union.cc:1530 #10 0x0000000000f5f3f9 in SELECT_LEX_UNIT::execute(THD*) () at ../../mysql-8.0.19/sql/sql_union.cc:1588 #11 0x0000000000eedf8b in Sql_cmd_dml::execute_inner(THD*) () at ../../mysql-8.0.19/sql/sql_select.cc:910 #12 0x0000000000ef7418 in Sql_cmd_dml::execute (this=0x7ff1d45d9a38, thd=0x7ff1d457b7b0) at ../../mysql-8.0.19/sql/sql_select.cc:715 #13 0x0000000000eab3a9 in mysql_execute_command(THD*, bool) () at ../../mysql-8.0.19/sql/sql_parse.cc:4478 #14 0x0000000000ead0cc in mysql_parse (thdthd=thd@entry=0x7ff1d457b7b0, parser_stateparser_state=parser_state@entry=0x7ff1b810f610) at ../../mysql-8.0.19/sql/sql_parse.cc:5288 #15 0x0000000000eaeb6b in dispatch_command(THD*, COM_DATA const*, enum_server_command) () at ../../mysql-8.0.19/sql/sql_parse.cc:1777 #16 0x0000000000eb0104 in do_command (thdthd=thd@entry=0x7ff1d457b7b0) at ../../mysql-8.0.19/sql/sql_parse.cc:1275 #17 0x0000000000fc1a08 in handle_connection (argarg=arg@entry=0x43cae90) at ../../mysql-8.0.19/sql/conn_handler/connection_handler_per_thread.cc:302 #18 0x00000000023ffdec in pfs_spawn_thread (arg=0x44c2a30) at ../../../mysql-8.0.19/storage/perfschema/pfs.cc:2854 #19 0x00007ff1f6364e65 in start_thread () from /lib64/libpthread.so.0 #20 0x00007ff1f44a188d in clone () from /lib64/libc.so.6
gdb简介
GDB 是 linux 环境下的一般功能强大的调试器,用来调试 C 或 C++ 写的程序。它可以做这些事情
-
Start your program, specifying anything that might affect its behavior.
-
Make your program stop on specified conditions.
-
Examine what has happened, when your program has stopped.
-
Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
常用命令
-
info threads:查看全部线程
-
thread n:指定某个线程
-
b:在某处打断点
-
c:继续往下走
-
s:执行一行代码,如果代码函数调用,则进入函数
-
n:执行一行代码,函数调用不进入
-
p:打印某个变量值
-
list:打印代码的文本信息
-
bt:查看某个线程的栈帧
-
info b:查看当前所有断点信息
实际案例
进入gdb界面
[root@VM_0_9_centos ~]# gdb /usr/local/mysql5.7/bin/mysqld
进入调试
run --defaults-file=/data/mysql/mysql3308/my3308.cnf --user=mysql --gdb
通过info thread查看所有线程,这个时候我从另外一个会话连接到了MySQL,我从pfs.threads查寻到这个会话的LWP值为513,那么我可以通过thread 41切换到这个线程
(gdb) info thread Id Target Id Frame 41 Thread 0x7fff819cf700 (LWP 513) "mysqld" 0x00007ffff61abbed in poll () from /lib64/libc.so.6 40 Thread 0x7fff5affd700 (LWP 416) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 39 Thread 0x7fff81a51700 (LWP 415) "mysqld" 0x00007ffff7bce381 in sigwait () from /lib64/libpthread.so.0 38 Thread 0x7fff5b7fe700 (LWP 414) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 37 Thread 0x7fff5bfff700 (LWP 413) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 36 Thread 0x7fff6cff9700 (LWP 412) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 35 Thread 0x7fff6d7fa700 (LWP 411) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 34 Thread 0x7fff6dffb700 (LWP 410) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 33 Thread 0x7fff6e7fc700 (LWP 409) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 32 Thread 0x7fff6effd700 (LWP 408) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 31 Thread 0x7fff6f7fe700 (LWP 407) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 30 Thread 0x7fff6ffff700 (LWP 406) "mysqld" 0x00007ffff7bcde5d in nanosleep () from /lib64/libpthread.so.0 29 Thread 0x7fff8084e700 (LWP 405) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 28 Thread 0x7fff8104f700 (LWP 404) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 27 Thread 0x7fff81850700 (LWP 403) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 24 Thread 0x7fff74c24700 (LWP 398) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 23 Thread 0x7fff75425700 (LWP 397) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 22 Thread 0x7fff75c26700 (LWP 396) "mysqld" 0x00007ffff7bca9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 21 Thread 0x7fff76427700 (LWP 395) "mysqld" 0x00007ffff7bcada2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 20 Thread 0x7fff76c28700 (LWP 394) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 19 Thread 0x7fff77429700 (LWP 393) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 18 Thread 0x7fff77c2a700 (LWP 392) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 17 Thread 0x7fff7842b700 (LWP 391) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 16 Thread 0x7fff78c2c700 (LWP 390) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 15 Thread 0x7fff7942d700 (LWP 389) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 14 Thread 0x7fff79c2e700 (LWP 388) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 13 Thread 0x7fff7a42f700 (LWP 387) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 12 Thread 0x7fff7ac30700 (LWP 386) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 11 Thread 0x7fff7b431700 (LWP 385) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 10 Thread 0x7fff7bc32700 (LWP 384) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 9 Thread 0x7fff7c433700 (LWP 383) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 8 Thread 0x7fff7cc34700 (LWP 382) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 7 Thread 0x7fff7d435700 (LWP 381) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 6 Thread 0x7fff7dc36700 (LWP 380) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 5 Thread 0x7fff7e437700 (LWP 379) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 4 Thread 0x7fff7ec38700 (LWP 378) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 3 Thread 0x7fff7f439700 (LWP 377) "mysqld" 0x00007ffff79bd644 in __io_getevents_0_4 () from /lib64/libaio.so.1 2 Thread 0x7fffec8c4700 (LWP 376) "mysqld" 0x00007ffff60ef53a in sigwaitinfo () from /lib64/libc.so.6 * 1 Thread 0x7ffff7fe4880 (LWP 371) "mysqld" 0x00007ffff61abbed in poll () from /lib64/libc.so.6 (gdb) thread 41 [Switching to thread 41 (Thread 0x7fff819cf700 (LWP 513))] #0 0x00007ffff61abbed in poll () from /lib64/libc.so.6
演示一下打断点:
(gdb) b do_command Breakpoint 1 at 0xc8d060: file /usr/local/src/mysql-5.7.30/sql/sql_parse.cc, line 903. (gdb) c Continuing. (gdb) b do_command Breakpoint 1 at 0xc8d060: file /usr/local/src/mysql-5.7.30/sql/sql_parse.cc, line 903. (gdb) n Single stepping until exit from function poll, which has no line number information. vio_io_wait (viovio=vio@entry=0x54fd330, eventevent=event@entry=VIO_IO_EVENT_READ, timeout=<optimized out>) at /usr/local/src/mysql-5.7.30/vio/viosocket.c:815 815 MYSQL_END_SOCKET_WAIT(locker, 0); (gdb) n 817 } (gdb) n vio_socket_io_wait (viovio=vio@entry=0x54fd330, eventevent=event@entry=VIO_IO_EVENT_READ) at /usr/local/src/mysql-5.7.30/vio/viosocket.c:92 92 ret= -1; (gdb) n 105 } (gdb) n vio_read (vio=0x54fd330, buf=0x54e3260 "/001", size=4) at /usr/local/src/mysql-5.7.30/vio/viosocket.c:134 134 while ((ret= mysql_socket_recv(vio->mysql_socket, (SOCKBUF_T *)buf, size, flags)) == -1) (gdb) n 148 } (gdb) n net_read_raw_loop (netnet=net@entry=0x54841e8, countcount=count@entry=4) at /usr/local/src/mysql-5.7.30/sql/net_serv.cc:697 697 if (recvcnt == VIO_SOCKET_ERROR) (gdb) n 706 else if (!recvcnt) (gdb) n 715 thd_increment_bytes_received(recvcnt); (gdb) n 712 count-= recvcnt; (gdb) n 713 buf+= recvcnt; (gdb) n 715 thd_increment_bytes_received(recvcnt); (gdb) n 692 while (count) (gdb) n 737 } (gdb) n 736 return MY_TEST(count); (gdb) n 737 } (gdb) n net_read_packet_header (net=0x54841e8) at /usr/local/src/mysql-5.7.30/sql/net_serv.cc:779 779 server_extension->m_after_header(net, user_data, count, rc); (gdb) n 778 rc= net_read_raw_loop(net, count); (gdb) n 779 server_extension->m_after_header(net, user_data, count, rc); (gdb) n 787 if (rc) (gdb) n 792 pkt_nr= net->buff[net->where_b + 3]; (gdb) n 798 if (pkt_nr != (uchar) net->pkt_nr) (gdb) n 818 net->pkt_nr++; (gdb) n net_read_packet (net=0x54841e8, complen=0x7fff819cec00) at /usr/local/src/mysql-5.7.30/sql/net_serv.cc:850 850 if (net->compress) (gdb) n 844 if (net_read_packet_header(net)) (gdb) n 847 net->compress_pkt_nr= net->pkt_nr; (gdb) n 850 if (net->compress) (gdb) n 868 pkt_len= uint3korr(net->buff+net->where_b); (gdb) n 871 if (!pkt_len) (gdb) n 868 pkt_len= uint3korr(net->buff+net->where_b); (gdb) n 871 if (!pkt_len) (gdb) p pkt_len $1 = 22 (gdb) n 874 pkt_data_len = max(pkt_len, *complen) + net->where_b; (gdb) n 877 if ((pkt_data_len >= net->max_packet) && net_realloc(net, pkt_data_len)) (gdb) n 881 if (net_read_raw_loop(net, pkt_len)) (gdb) n 868 pkt_len= uint3korr(net->buff+net->where_b); (gdb) n 886 return pkt_len; (gdb) n 891 } (gdb) n my_net_read (net=0x54841e8) at /usr/local/src/mysql-5.7.30/sql/net_serv.cc:922 922 if (len == MAX_PACKET_LENGTH) (gdb) n 925 ulong save_pos = net->where_b; (gdb) n 922 if (len == MAX_PACKET_LENGTH) (gdb) n 937 net->read_pos = net->buff + net->where_b; (gdb) n 938 if (len != packet_error) (gdb) n 937 net->read_pos = net->buff + net->where_b; (gdb) n 938 if (len != packet_error) (gdb) n 939 net->read_pos[len]=0; /* Safeguard for mysql_use_result */ (gdb) n 1056 MYSQL_NET_READ_DONE(0, len); (gdb) n 1058 } (gdb) n Protocol_classic::read_packet (this=0x5483a88) at /usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:820 820 raw_packet= m_thd->net.read_pos; (gdb) n 819 bad_packet= false; (gdb) n 820 raw_packet= m_thd->net.read_pos; (gdb) n 829 } (gdb) n 820 raw_packet= m_thd->net.read_pos; (gdb) n 829 } (gdb) n Protocol_classic::get_command (this=0x5483a88, com_data=0x7fff819cecc0, cmd=0x7fff819cecb0) at /usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:983 983 if (packet_length == 0) /* safety */ (gdb) n 990 raw_packet[packet_length]= '/0'; /* safety */ (gdb) n 1002 return parse_packet(com_data, *cmd); (gdb) n 990 raw_packet[packet_length]= '/0'; /* safety */ (gdb) n 992 *cmd= (enum enum_server_command) raw_packet[0]; (gdb) n 994 if (*cmd >= COM_END) (gdb) n 992 *cmd= (enum enum_server_command) raw_packet[0]; (gdb) n 1002 return parse_packet(com_data, *cmd); (gdb) n 999 packet_length--; (gdb) n 1000 raw_packet++; (gdb) n 1002 return parse_packet(com_data, *cmd); (gdb) n 1003 } (gdb) n do_command (thdthd=thd@entry=0x5482a30) at /usr/local/src/mysql-5.7.30/sql/sql_parse.cc:974 974 if (rc) (gdb) s 971 rc= thd->get_protocol()->get_command(&com_data, &command); (gdb) bt #0 do_command (thdthd=thd@entry=0x5482a30) at /usr/local/src/mysql-5.7.30/sql/sql_parse.cc:971 #1 0x0000000000d4abf8 in handle_connection (argarg=arg@entry=0x5418ff0) at /usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313 #2 0x000000000120a451 in pfs_spawn_thread (arg=0x5452410) at /usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197 #3 0x00007ffff7bc6e65 in start_thread () from /lib64/libpthread.so.0 #4 0x00007ffff61b688d in clone () from /lib64/libc.so.6 (gdb) s 972 thd->m_server_idle= false; (gdb) s 974 if (rc) (gdb) s 1027 thd->variables.net_buffer_length); (gdb) s Protocol_classic::get_packet (this=0x5483a88) at /usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:807 807 { (gdb) bt #0 Protocol_classic::get_packet (this=0x5483a88) at /usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:807 #1 0x0000000000c8d1b3 in do_command (thdthd=thd@entry=0x5482a30) at /usr/local/src/mysql-5.7.30/sql/sql_parse.cc:1027 #2 0x0000000000d4abf8 in handle_connection (argarg=arg@entry=0x5418ff0) at /usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313 #3 0x000000000120a451 in pfs_spawn_thread (arg=0x5452410) at /usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197 #4 0x00007ffff7bc6e65 in start_thread () from /lib64/libpthread.so.0 #5 0x00007ffff61b688d in clone () from /lib64/libc.so.6
我们可以在调试过程中随时打印某个变量的值,例如:
(gdb) p pkt_len $1 = 22
甚至gdb可以在紧急情况下救你一命,例如,当MySQL数据库连接打满又没有后台线程可以连接到MySQL的时候,你可以通过gdb来修改MySQL的连接数,例如
gdb -p $(pidof mysqld) -ex "set max_connections=1500" -batch
总结一下:
-
本文内容比较多,给大家介绍了几个MySQL DBA(可能其他岗位同样适用)应该掌握的必备工具
-
strace可以用来跟踪某个线程的调用情况,例如可以适用strace跟踪客户端SQL执行情况,如果开发说有很多慢SQL,而MySQL却没有任何记录,那么就可以用strace来跟踪把锅甩给开发(案例来自有赞王航威)
-
pstack是平时用的比较多的工具,尤其是诊断MySQL hang住的情况,例如主从延迟特别高等
-
gdb可能用的不是太多,但是我认为是MySQL DBA进阶必会的技能之一,尤其是抽丝剥茧某些疑难case的时候非常有用
感谢你能够认真阅读完这篇文章,希望小编分享的“MySQL DBA怎么利用strace/pstack/gdb来定位问题”这篇文章对大家有帮助,同时也希望大家多多支持亿速云,关注亿速云行业资讯频道,更多相关知识等着你来学习!
原创文章,作者:Maggie-Hunter,如若转载,请注明出处:https://blog.ytso.com/202295.html