This article looks at how semi-synchronous replication in MySQL can crash the master instance.
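Scenario:
Database version at the time of the crash: MySQL-5.7.12. The bug is officially marked as fixed in 5.7.17;
In a master-slave setup with semi-synchronous replication enabled on the slave, starting or restarting the slave threads crashes the master instance;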
Conclusion:
This is a MySQL bug. Bug report: https://bugs.mysql.com/bug.php?id=79865
Problem description (quoted from the bug report):
Description: From 5.7,semi-sync add Ack_receiver thread for listening slave ack,which use select(). But select() can only listen socket fd between 1 and __FD_SET_SIZE(my os is 1024), when socket fd is bigger than __FD_SET_SIZE, select() has no effect, and can never get ack from slave,then semi-sync can't run normally.even more,select() use array store fds, when use FD_SET store fd which is bigger than __FD_SET_SIZE, array will overflow,so mysqld may crash.
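The root of the problem is the select() call used on the TCP connections. The macro FD_SETSIZE fixes the size of an fd_set, so within a process select() can only handle file descriptors whose value is below FD_SETSIZE, and on most systems FD_SETSIZE is only 1024;
In practice, poll or epoll is the more common choice, and select seems to be rarely used these days;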
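The limit is easy to observe outside MySQL. The following is a minimal sketch, not the MySQL code: it assumes a Linux-like system where FD_SETSIZE is 1024, and the helper names `can_use_high_fd` and `probe_high_fd` are made up for this illustration. Python's `select.select` checks the descriptor range and raises ValueError, whereas the C macro FD_SET has no such check of its own; the `__fortify_fail` frames in the crash log are consistent with glibc's fortified build catching the out-of-range descriptor and aborting.

```python
import fcntl
import os
import resource
import select
import socket

FD_SETSIZE = 1024  # assumption: typical Linux/glibc value; Python does not expose it


def can_use_high_fd(min_fd):
    """Make sure this process may own descriptor number `min_fd` (raises the soft limit if needed)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = min_fd + 1
    if soft == resource.RLIM_INFINITY or soft >= needed:
        return True
    if hard != resource.RLIM_INFINITY and hard < needed:
        return False
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (needed, hard))
    except (ValueError, OSError):
        return False
    return True


def probe_high_fd():
    """Wait on a readable socket whose descriptor is >= FD_SETSIZE, with select and with poll."""
    a, b = socket.socketpair()
    # Duplicate one end onto the lowest free descriptor >= FD_SETSIZE.
    high_fd = fcntl.fcntl(a.fileno(), fcntl.F_DUPFD, FD_SETSIZE)
    try:
        b.send(b"ack")  # make the other end readable
        try:
            select.select([high_fd], [], [], 1.0)
            select_result = "ok"
        except ValueError:
            # Python refuses descriptors that do not fit into an fd_set.
            select_result = "rejected"
        poller = select.poll()
        poller.register(high_fd, select.POLLIN)
        events = poller.poll(1000)  # timeout in milliseconds
        ready = any(fd == high_fd and ev & select.POLLIN for fd, ev in events)
        poll_result = "readable" if ready else "not ready"
        return high_fd, select_result, poll_result
    finally:
        os.close(high_fd)
        a.close()
        b.close()


if __name__ == "__main__":
    if can_use_high_fd(FD_SETSIZE):
        print(probe_high_fd())
    else:
        print("cannot obtain a descriptor >= FD_SETSIZE in this environment")
```

poll takes descriptors as plain integers rather than as bits in a fixed-size set, which is why it is not affected by this limit.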
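Reproducing the problem:
Prepare a MySQL-5.7.12 master-slave setup and enable semi-synchronous replication:
To open a large number of file descriptors as simply as possible, we use a "feature" of MyISAM partitioned tables;
Then run a SELECT statement on the master several times in a row (more than 5);
Now check the number of file descriptors held by the master;
Next, restart the slave threads on the slave that has semi-sync enabled, while tailing the master's log;
A few seconds after the restart, the master crashes;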
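For the descriptor-count step above, a small helper like the following can be used. It is a sketch under assumptions: it relies on the Linux /proc filesystem, the function names are hypothetical, and reading another process's /proc/<pid>/fd normally requires running as the same user or as root. Since the kernel hands out the lowest unused descriptor number, the interesting value is the lowest free number: once it reaches FD_SETSIZE (assumed to be 1024 here), the next socket the process opens, for example for a reconnecting slave, will not fit into an fd_set.

```python
import os
import sys

FD_SETSIZE = 1024  # assumption: typical Linux/glibc value


def open_fds(pid):
    """Return the set of descriptor numbers currently open in process `pid` (Linux only)."""
    return {int(name) for name in os.listdir(f"/proc/{pid}/fd")}


def lowest_free_fd(fds):
    """Return the smallest non-negative integer not in `fds`, i.e. the next descriptor number."""
    candidate = 0
    while candidate in fds:
        candidate += 1
    return candidate


if __name__ == "__main__":
    # Pass the mysqld pid as the first argument; defaults to this script's own pid.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    fds = open_fds(pid)
    nxt = lowest_free_fd(fds)
    print(f"open descriptors: {len(fds)}, next descriptor number: {nxt}")
    if nxt >= FD_SETSIZE:
        print("the next socket would not fit into an fd_set")
```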
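PS: During testing, after running the SELECT statement several times and confirming that the master's semi-sync status was ON, quickly restarting the slave reproduced the crash almost every time;
PPS: A MyISAM table opens all of its partition files when it is opened, which makes it easy to simulate a process holding a large number of file descriptors;
(MyISAM partitioned tables: http://blog.itpub.net/29510932/viewspace-2134679/)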
PPPPPPPS: _(:з」∠)_
Attached are the test script and the crash information
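-- ENGINE set to MyISAM to match the table name and the text; the reproduction relies on MyISAM opening every partition file
CREATE TABLE `myisam_t` (
  `id` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4
/*!50100 PARTITION BY HASH (id)
PARTITIONS 2000 */;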
2017-04-28T22:10:00.731611+08:00 5092 [Note] Start binlog_dump to master_thread_id(5092) slave_server(13043), pos(, 4)
2017-04-28T22:10:01.648365+08:00 5092 [Note] Start semi-sync binlog_dump to slave (server_id: 13043), pos(, 4)
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
Backtrace:
/lib/x86_64-linux-gnu/libc.so.6(+0x731af)[0x7fcdfc7981af]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7fcdfc81dcf7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf6f10)[0x7fcdfc81bf10]
/lib/x86_64-linux-gnu/libc.so.6(+0xf8c67)[0x7fcdfc81dc67]
/usr/lib/mysql/plugin/semisync_master.so(_ZN12Ack_receiver17get_slave_socketsEP6fd_set+0x83)[0x7fcc73d4a493]
/usr/lib/mysql/plugin/semisync_master.so(_ZN12Ack_receiver3runEv+0x603)[0x7fcc73d4aaf3]
/usr/lib/mysql/plugin/semisync_master.so(ack_receive_handler+0x19)[0x7fcc73d4aba9]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0xe90784]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fcdfdf650a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fcdfc80d87d]
14:10:01 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=5
max_threads=9999
thread_count=8
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 21899362 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0xe77fec]
/usr/sbin/mysqld(handle_fatal_signal+0x459)[0x7a7019]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fcdfdf6c8d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fcdfc75a067]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fcdfc75b448]
/lib/x86_64-linux-gnu/libc.so.6(+0x731b4)[0x7fcdfc7981b4]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7fcdfc81dcf7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf6f10)[0x7fcdfc81bf10]
/lib/x86_64-linux-gnu/libc.so.6(+0xf8c67)[0x7fcdfc81dc67]
/usr/lib/mysql/plugin/semisync_master.so(_ZN12Ack_receiver17get_slave_socketsEP6fd_set+0x83)[0x7fcc73d4a493]
/usr/lib/mysql/plugin/semisync_master.so(_ZN12Ack_receiver3runEv+0x603)[0x7fcc73d4aaf3]
/usr/lib/mysql/plugin/semisync_master.so(ack_receive_handler+0x19)[0x7fcc73d4aba9]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0xe90784]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fcdfdf650a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fcdfc80d87d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/tech/database/204325.html