Linux: Redis Sentinel Failover Simulation


Environment

centos7    10.0.0.17    redis-5.0.7    master node (redis-master)

centos7    10.0.0.27    redis-5.0.7    replica 1 (redis-slave1)

centos7    10.0.0.37    redis-5.0.7    replica 2 (redis-slave2)

 

Key settings in redis.conf on all master and replica nodes

[root@centos7-liyj ~]#vim /apps/redis/etc/redis.conf
bind 0.0.0.0
masterauth "123456"
requirepass "123456"

Or apply the same changes in bulk with sed
[root@centos7-liyj ~]#sed -i -e 's/bind 127.0.0.1/bind 0.0.0.0/' -e 's/^# masterauth.*/masterauth 123456/' -e 's/^# requirepass .*/requirepass 123456/' /apps/redis/etc/redis.conf
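
Before editing the real file in place, it can help to preview what the sed expressions do. The sketch below runs them without -i against a scratch file; the three sample lines are assumed to match the defaults shipped with redis-5.0.7, and the variable names are made up for this example.

```shell
# Preview the sed expressions on a scratch file instead of the real redis.conf.
# The sample lines below are assumed to mirror the shipped defaults.
tmp_conf=$(mktemp)
cat > "$tmp_conf" <<'EOF'
bind 127.0.0.1
# masterauth <master-password>
# requirepass foobared
EOF

# Without -i, sed writes the edited text to stdout and leaves the file untouched.
result=$(sed -e 's/bind 127.0.0.1/bind 0.0.0.0/' \
             -e 's/^# masterauth.*/masterauth 123456/' \
             -e 's/^# requirepass .*/requirepass 123456/' "$tmp_conf")
echo "$result"
rm -f "$tmp_conf"
```

If the three printed lines read bind 0.0.0.0, masterauth 123456 and requirepass 123456, the same expressions can be applied to the real file with -i.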

1. Configure master-replica replication

Configuration on all replica nodes

slave1

[root@redis-slave1 ~]#redis-cli -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> replicaof 10.0.0.17 6379
OK
127.0.0.1:6379> config set masterauth "123456"
OK
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:10.0.0.17
master_port:6379
master_link_status:up
master_last_io_seconds_ago:6
master_sync_in_progress:0
slave_repl_offset:1484
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:2eedc0a9be1a9842c028454c86c0b173daf9d17a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1484
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1484

slave2

[root@redis-slave2 ~]#redis-cli -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> replicaof 10.0.0.17 6379
OK
127.0.0.1:6379> config set masterauth "123456"
OK
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:10.0.0.17
master_port:6379
master_link_status:up
master_last_io_seconds_ago:3
master_sync_in_progress:0
slave_repl_offset:364
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:2eedc0a9be1a9842c028454c86c0b173daf9d17a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:364
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:351
repl_backlog_histlen:14

Replication status on the master node

[root@redis-master ~]#redis-cli -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.27,port=6379,state=online,offset=1750,lag=0
slave1:ip=10.0.0.37,port=6379,state=online,offset=1750,lag=0
master_replid:2eedc0a9be1a9842c028454c86c0b173daf9d17a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1750
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1750
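
To check this from a script rather than by eye, the slaveN: lines can be counted. This is a minimal sketch; the function name count_online_replicas is hypothetical, and it only assumes the output format shown above.

```shell
# count_online_replicas: read "info replication" output on stdin and print
# how many slaveN: lines report state=online.
count_online_replicas() {
    # redis-cli output may end lines with CR when piped, so strip it first.
    tr -d '\r' | grep -c '^slave[0-9][0-9]*:.*state=online'
}

# Sample taken from the master output above.
sample='role:master
connected_slaves:2
slave0:ip=10.0.0.27,port=6379,state=online,offset=1750,lag=0
slave1:ip=10.0.0.37,port=6379,state=online,offset=1750,lag=0'

online=$(printf '%s\n' "$sample" | count_online_replicas)
echo "$online"   # prints 2
```

Against a live master the input would come from redis-cli -a 123456 info replication instead of the sample text.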

To make the replication relationship permanent, write it to the configuration file

#Run on all replica nodes

[root@redis-slave1 ~]#echo "replicaof 10.0.0.17 6379" >> /etc/redis.conf
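
Running the echo above more than once appends the directive again each time. A small guard avoids that. The helper below is only a sketch: persist_replicaof is a hypothetical name, and the demo writes to a temporary file rather than the real configuration.

```shell
# persist_replicaof FILE HOST PORT
# Append "replicaof HOST PORT" to FILE unless an identical line already exists,
# so repeated runs do not leave duplicate directives.
persist_replicaof() {
    file=$1 host=$2 port=$3
    line="replicaof $host $port"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

demo_conf=$(mktemp)
persist_replicaof "$demo_conf" 10.0.0.17 6379
persist_replicaof "$demo_conf" 10.0.0.17 6379   # second call is a no-op
replicaof_count=$(grep -c '^replicaof ' "$demo_conf")
echo "$replicaof_count"   # prints 1
rm -f "$demo_conf"
```

On a replica, FILE would be whichever redis.conf the server is actually started with.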

 

2. Sentinel configuration

When redis is built from source, sentinel.conf is in the source directory; copy it to the installation directory

[root@redis-master ~]#cd /usr/local/src/redis-5.0.7
[root@redis-master /usr/local/src/redis-5.0.7]#ls
00-RELEASENOTES  CONTRIBUTING  deps     Makefile   README.md   runtest          runtest-moduleapi  sentinel.conf  tests
BUGS             COPYING       INSTALL  MANIFESTO  redis.conf  runtest-cluster  runtest-sentinel   src            utils
[root@redis-master /usr/local/src/redis-5.0.7]#cp sentinel.conf /apps/redis/etc/

Edit the sentinel configuration file on all master and replica nodes

[root@redis-master ~]#grep -v "^#" /apps/redis/etc/sentinel.conf 

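bind 0.0.0.0
#Listen address
port 26379
#Listen port
daemonize yes
#Run in the background; the default is no (run in the foreground)
pidfile /apps/redis/run/redis-sentinel.pid
#Path of the sentinel pid file, usually kept in the same directory as the redis pid file
logfile "/apps/redis/log/sentinel_26379.log"
#Sentinel log file, kept in the same directory as the redis log file

dir /tmp

sentinel monitor mymaster 10.0.0.17  6379 2
#mymaster is the name of the monitored group; this line gives the address and port of its current master
#2 is the quorum: the number of sentinels that must consider the master down before a failover is carried out.
#It is usually set to a majority of the sentinels (the total is normally an odd number >= 3, such as 3, 5 or 7).
#For example, with 3 sentinels, 3/2 = 1.5, rounded up to 2. It is the basis for marking the master ODOWN (objectively down)
sentinel auth-pass mymaster 123456
#Password of the master in the mymaster group; this line must come after the sentinel monitor line
sentinel down-after-milliseconds mymaster 3000
#(SDOWN) Time after which a node in the mymaster group is considered subjectively down, in milliseconds; 3000 is suggested
sentinel parallel-syncs mymaster 1
#Number of replicas that may sync from the new master at the same time after a failover; a smaller value lengthens the total sync time but reduces the load on the new master
sentinel failover-timeout mymaster 180000
#Timeout for pointing all replicas at the new master, in milliseconds
sentinel deny-scripts-reconfig yes
#Do not allow the sentinel scripts to be changed at runtime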
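
The quorum rule described in the comments (more than half of the sentinels) can be worked out with integer arithmetic. A minimal sketch, with majority as a made-up helper name:

```shell
# majority N: print the smallest integer strictly greater than N/2,
# i.e. the usual quorum for N sentinels.
majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 3 5 7; do
    echo "sentinels=$n quorum=$(majority "$n")"
done
# prints:
# sentinels=3 quorum=2
# sentinels=5 quorum=3
# sentinels=7 quorum=4
```

With the three sentinels used here this gives 2, which matches the last argument of the sentinel monitor line.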

After editing the sentinel configuration file, copy it to the other sentinel nodes

[root@redis-master ~]#scp /apps/redis/etc/sentinel.conf  10.0.0.27:/apps/redis/etc/
root@10.0.0.27's password: 
sentinel.conf                                                                          100% 9801    14.7MB/s   00:00    
[root@redis-master ~]#scp /apps/redis/etc/sentinel.conf  10.0.0.37:/apps/redis/etc/
root@10.0.0.37's password: 
sentinel.conf                                                                          100% 9801    13.6MB/s   00:00    
[root@redis-master ~]# chown -R redis:redis /apps/redis/    #Give the redis user ownership of sentinel.conf; the other sentinel nodes need the same ownership change

  

Note:

A redis built from source does not include a redis-sentinel.service unit file

The contents of the service unit file are as follows

[root@redis-master ~]#cat /lib/systemd/system/redis-sentinel.service 
[Unit]
Description=Redis Sentinel
After=network.target

[Service]
ExecStart=/apps/redis/bin/redis-sentinel /apps/redis/etc/sentinel.conf --supervised systemd
ExecStop=/bin/kill -s QUIT $MAINPID
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target

Reload the unit files and start the service

[root@redis-master ~]#systemctl daemon-reload
[root@redis-master ~]#systemctl start redis-sentinel.service
[root@redis-master ~]#systemctl status redis-sentinel.service
● redis-sentinel.service - Redis Sentinel
   Loaded: loaded (/usr/lib/systemd/system/redis-sentinel.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2022-07-09 18:23:19 CST; 6s ago
 Main PID: 1787 (redis-sentinel)
   CGroup: /system.slice/redis-sentinel.service
           └─1787 /apps/redis/bin/redis-sentinel 0.0.0.0:26379 [sentinel]

Jul 09 18:23:19 redis-master systemd[1]: Starting Redis Sentinel...
Jul 09 18:23:19 redis-master systemd[1]: Started Redis Sentinel.

Copy redis-sentinel.service to the other sentinel nodes and start it there

[root@redis-master ~]#scp /lib/systemd/system/redis-sentinel.service  10.0.0.27:/lib/systemd/system/
root@10.0.0.27's password: 
redis-sentinel.service                                                                 100%  317   459.3KB/s   00:00    
[root@redis-master ~]#scp /lib/systemd/system/redis-sentinel.service  10.0.0.37:/lib/systemd/system/
root@10.0.0.37's password: 
redis-sentinel.service                                                                 100%  317   360.4KB/s   00:00    
On each of the other nodes, change the ownership of sentinel.conf
chown -R redis:redis /apps/redis/

Reload the sentinel service unit file: systemctl daemon-reload
Start the service: systemctl start redis-sentinel.service

Listening ports

redis listens on port 6379 and sentinel listens on port 26379

[root@redis-master ~]#ss -tnl
State       Recv-Q Send-Q               Local Address:Port                              Peer Address:Port              
LISTEN      0      128                              *:22                                           *:*                  
LISTEN      0      100                      127.0.0.1:25                                           *:*                  
LISTEN      0      128                      127.0.0.1:6010                                         *:*                  
LISTEN      0      128                      127.0.0.1:6011                                         *:*                  
LISTEN      0      128                      127.0.0.1:6012                                         *:*                  
LISTEN      0      511                              *:26379                                        *:*                  
LISTEN      0      511                              *:6379                                         *:*                  
LISTEN      0      128                           [::]:22                                        [::]:*                  
LISTEN      0      100                          [::1]:25                                        [::]:*                  
LISTEN      0      128                          [::1]:6010                                      [::]:*                  
LISTEN      0      128                          [::1]:6011                                      [::]:*                  
LISTEN      0      128                          [::1]:6012                                      [::]:*                  

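Current sentinel status

In the sentinel status, pay particular attention to the last line: the master IP, the number of slaves and the number of sentinels must all match the actual servers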

[root@redis-master ~]#redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.17:6379,slaves=2,sentinels=3
#Two slaves and three sentinels. If the sentinels value does not match, check whether the myid values conflict
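
The same check can be scripted by pulling individual fields out of the masterN: line. The helper below is a sketch only: sentinel_master_field is a hypothetical name, and it relies on nothing beyond the line format shown above.

```shell
# sentinel_master_field NAME FIELD
# Read "info sentinel" output on stdin and print FIELD (status, address,
# slaves, sentinels, ...) from the masterN: line whose name is NAME.
sentinel_master_field() {
    tr -d '\r' | awk -v name="$1" -v field="$2" '
        /^master[0-9]+:/ {
            sub(/^master[0-9]+:/, "")        # drop the "master0:" prefix
            n = split($0, pair, ",")         # key=value pairs
            found = ""; value = ""
            for (i = 1; i <= n; i++) {
                eq  = index(pair[i], "=")
                key = substr(pair[i], 1, eq - 1)
                val = substr(pair[i], eq + 1)
                if (key == "name" && val == name) found = 1
                if (key == field) value = val
            }
            if (found) print value
        }'
}

sample='master0:name=mymaster,status=ok,address=10.0.0.17:6379,slaves=2,sentinels=3'
sentinel_count=$(printf '%s\n' "$sample" | sentinel_master_field mymaster sentinels)
echo "$sentinel_count"   # prints 3
```

Against a running sentinel the input would come from redis-cli -p 26379 info sentinel, and the result can be compared with the expected number of sentinel nodes.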

3. Failure simulation

Stop the redis process on the master node

[root@redis-master ~]#killall redis-server
[root@redis-master ~]#redis-cli -a 123456 -p 26379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.37:6379,slaves=2,sentinels=3    #sentinel has automatically switched the master to 10.0.0.37

Log files

[root@redis-master ~]#tail -f /apps/redis/log/sentinel_26379.log 
1787:X 10 Jul 2022 09:43:20.848 # +sdown master mymaster 10.0.0.17 6379
1787:X 10 Jul 2022 09:43:20.941 # +new-epoch 1
1787:X 10 Jul 2022 09:43:20.942 # +vote-for-leader fea07631cec40925c196cef392ff27ea41e66af8 1
1787:X 10 Jul 2022 09:43:21.962 # +odown master mymaster 10.0.0.17 6379 #quorum 3/2
1787:X 10 Jul 2022 09:43:21.962 # Next failover delay: I will not start a failover before Sun Jul 10 09:49:21 2022
1787:X 10 Jul 2022 09:43:22.045 # +config-update-from sentinel fea07631cec40925c196cef392ff27ea41e66af8 10.0.0.27 26379 @ mymaster 10.0.0.17 6379
1787:X 10 Jul 2022 09:43:22.045 # +switch-master mymaster 10.0.0.17 6379 10.0.0.37 6379
1787:X 10 Jul 2022 09:43:22.045 * +slave slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.37 6379
1787:X 10 Jul 2022 09:43:22.045 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379
1787:X 10 Jul 2022 09:43:25.091 # +sdown slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379

The log above is from the sentinel on 10.0.0.17.

 

[root@redis-slave1 ~]#tail -f /apps/redis/log/sentinel_26379.log
1631:X 10 Jul 2022 09:43:20.879 # +sdown master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:20.932 # +odown master mymaster 10.0.0.17 6379 #quorum 3/2
1631:X 10 Jul 2022 09:43:20.932 # +new-epoch 1
1631:X 10 Jul 2022 09:43:20.932 # +try-failover master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:20.939 # +vote-for-leader fea07631cec40925c196cef392ff27ea41e66af8 1
1631:X 10 Jul 2022 09:43:20.941 # 01df2a6c3895fa6c3586bc1988a1dae6562aae23 voted for fea07631cec40925c196cef392ff27ea41e66af8 1
1631:X 10 Jul 2022 09:43:20.942 # b84ef9c78a46b0d25eb04dedf39fcd4a63e1d92f voted for fea07631cec40925c196cef392ff27ea41e66af8 1
1631:X 10 Jul 2022 09:43:21.001 # +elected-leader master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.001 # +failover-state-select-slave master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.065 # +selected-slave slave 10.0.0.37:6379 10.0.0.37 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.065 * +failover-state-send-slaveof-noone slave 10.0.0.37:6379 10.0.0.37 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.129 * +failover-state-wait-promotion slave 10.0.0.37:6379 10.0.0.37 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.983 # +promoted-slave slave 10.0.0.37:6379 10.0.0.37 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:21.983 # +failover-state-reconf-slaves master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:22.044 * +slave-reconf-sent slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:23.002 * +slave-reconf-inprog slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:23.003 * +slave-reconf-done slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:23.073 # -odown master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:23.074 # +failover-end master mymaster 10.0.0.17 6379
1631:X 10 Jul 2022 09:43:23.074 # +switch-master mymaster 10.0.0.17 6379 10.0.0.37 6379
1631:X 10 Jul 2022 09:43:23.074 * +slave slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.37 6379
1631:X 10 Jul 2022 09:43:23.074 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379
1631:X 10 Jul 2022 09:43:26.117 # +sdown slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379
 

The log above is from the sentinel on 10.0.0.27, which was elected leader and carried out the failover.

 

[root@redis-slave2 ~]#tail -f /apps/redis/log/sentinel_26379.log
1694:X 10 Jul 2022 09:43:20.876 # +sdown master mymaster 10.0.0.17 6379
1694:X 10 Jul 2022 09:43:20.941 # +new-epoch 1
1694:X 10 Jul 2022 09:43:20.942 # +vote-for-leader fea07631cec40925c196cef392ff27ea41e66af8 1
1694:X 10 Jul 2022 09:43:20.967 # +odown master mymaster 10.0.0.17 6379 #quorum 2/2
1694:X 10 Jul 2022 09:43:20.967 # Next failover delay: I will not start a failover before Sun Jul 10 09:49:21 2022
1694:X 10 Jul 2022 09:43:22.045 # +config-update-from sentinel fea07631cec40925c196cef392ff27ea41e66af8 10.0.0.27 26379 @ mymaster 10.0.0.17 6379
1694:X 10 Jul 2022 09:43:22.045 # +switch-master mymaster 10.0.0.17 6379 10.0.0.37 6379
1694:X 10 Jul 2022 09:43:22.045 * +slave slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.37 6379
1694:X 10 Jul 2022 09:43:22.045 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379
1694:X 10 Jul 2022 09:43:25.097 # +sdown slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.37 6379

The log above is from the sentinel on 10.0.0.37.
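
In all three logs the line that records the actual change of master is +switch-master. The sketch below filters a sentinel log for those events; switch_events is a hypothetical helper name, and the field positions assume the log line layout shown above.

```shell
# switch_events: read a sentinel log on stdin and print one line per
# +switch-master event as "TIME GROUP OLD_IP:PORT -> NEW_IP:PORT".
switch_events() {
    awk '$7 == "+switch-master" {
        printf "%s %s %s:%s -> %s:%s\n", $5, $8, $9, $10, $11, $12
    }'
}

# Two lines copied from the 10.0.0.37 log above.
events=$(switch_events <<'EOF'
1694:X 10 Jul 2022 09:43:20.876 # +sdown master mymaster 10.0.0.17 6379
1694:X 10 Jul 2022 09:43:22.045 # +switch-master mymaster 10.0.0.17 6379 10.0.0.37 6379
EOF
)
echo "$events"   # prints 09:43:22.045 mymaster 10.0.0.17:6379 -> 10.0.0.37:6379
```

On a node the input would be the log file itself, for example switch_events < /apps/redis/log/sentinel_26379.log.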

 

State of the 10.0.0.17 node after its redis service is restored

[root@redis-master ~]#redis-cli -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:10.0.0.37
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:4802248
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:2f3d70f43253c8a836babe3d818c170a6064d6d8
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:4802248
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:4796878
repl_backlog_histlen:5371

10.0.0.37 is now the master node

127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.27,port=6379,state=online,offset=4804124,lag=1
slave1:ip=10.0.0.17,port=6379,state=online,offset=4804257,lag=1
master_replid:2f3d70f43253c8a836babe3d818c170a6064d6d8
master_replid2:b1f1e26fbaedff9e01bd7e5e93795a26e14e11ce
master_repl_offset:4804257
second_repl_offset:4633167
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:4051024
repl_backlog_histlen:753234

 

After the failover, the redis configuration file is modified automatically

[root@redis-slave2 ~]#grep ^replicaof /etc/redis.conf
replicaof 10.0.0.37 6379

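#When setting up replication, write the replicaof directive to the configuration file so that it is permanent;
a relationship set only with the replicaof command is temporary. During a failover, sentinel follows the role change with CONFIG REWRITE, so the new relationship is written to the configuration file the instance was started with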

The IP in the sentinel monitor line of the sentinel configuration file is updated as well

[root@redis-slave2 ~]#grep "^[a-z]" /apps/redis/etc/sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
pidfile "/apps/redis/run/redis-sentinel.pid"
logfile "/apps/redis/log/sentinel_26379.log"
dir "/tmp"
sentinel myid 01df2a6c3895fa6c3586bc1988a1dae6562aae23
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.0.0.37 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 1
maxclients 4064
protected-mode no
supervised systemd
sentinel leader-epoch mymaster 1
sentinel known-replica mymaster 10.0.0.17 6379
sentinel known-replica mymaster 10.0.0.27 6379
sentinel known-sentinel mymaster 10.0.0.27 26379 fea07631cec40925c196cef392ff27ea41e66af8
sentinel known-sentinel mymaster 10.0.0.17 26379 b84ef9c78a46b0d25eb04dedf39fcd4a63e1d92f
sentinel current-epoch 1
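
To read the currently monitored master out of a rewritten sentinel.conf without scanning the whole file, the sentinel monitor line can be extracted with awk. This is only a sketch: monitored_master is a made-up name and the demo uses a temporary file.

```shell
# monitored_master FILE NAME
# Print "IP:PORT" from the "sentinel monitor NAME IP PORT QUORUM" line in FILE.
monitored_master() {
    awk -v name="$2" \
        '$1 == "sentinel" && $2 == "monitor" && $3 == name { print $4 ":" $5 }' "$1"
}

demo=$(mktemp)
printf '%s\n' 'port 26379' 'sentinel monitor mymaster 10.0.0.37 6379 2' > "$demo"
current_master=$(monitored_master "$demo" mymaster)
echo "$current_master"   # prints 10.0.0.37:6379
rm -f "$demo"
```

A running sentinel can also be asked directly with redis-cli -p 26379 sentinel get-master-addr-by-name mymaster, which returns the address it currently considers the master.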

 

Original article by bd101bd101. If you repost it, please credit the source: https://blog.ytso.com/tech/database/273376.html
