Deployment Overview
2. The Redis cluster provides the actual service to clients.
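How Sentinel Works
Gossip protocol: each Sentinel uses the ping command to check whether the servers it monitors are alive. A Sentinel that gets no valid reply in time marks the server as subjectively down (sdown). Once enough Sentinels (the quorum) agree that the same server has stopped responding, the server is marked objectively down (odown) and is treated as out of service.
Voting protocol: this is essentially an election. Following a fixed set of rules, the Sentinel cluster picks one of the Redis servers as the new master, reconfigures the other servers as slaves of the new master, and updates its own configuration files.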
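The two thresholds behind these protocols can be sketched as follows. This is only an illustration, not Sentinel's own code: the function names `is_odown` and `votes_needed` are made up for this example, and the majority rule reflects how the Redis Sentinel documentation describes failover authorization (a majority of all Sentinels, and never fewer than the quorum).

```shell
#!/bin/sh
# Sketch of the two thresholds involved in a Sentinel failover.
# Function names are hypothetical; they are not part of Redis.

# is_odown SDOWN_REPORTS QUORUM
# Succeeds when at least QUORUM Sentinels report the master as
# subjectively down, i.e. the master counts as objectively down.
is_odown() {
    [ "$1" -ge "$2" ]
}

# votes_needed TOTAL_SENTINELS QUORUM
# Prints how many votes a Sentinel needs to lead the failover:
# a majority of all Sentinels, and never fewer than the quorum.
votes_needed() {
    majority=$(( $1 / 2 + 1 ))
    if [ "$majority" -ge "$2" ]; then
        echo "$majority"
    else
        echo "$2"
    fi
}

# This deployment: 4 Sentinels, quorum 2.
if is_odown 2 2; then echo "master is objectively down"; fi
echo "votes needed to start failover: $(votes_needed 4 2)"
```

With four Sentinels split evenly across two hosts, losing one whole host would leave only two Sentinels, which is enough to reach the quorum of 2 but, under the majority rule assumed here, not enough to authorize a failover.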
Server Deployment Plan
Server environment
CentOS 6.6 x86_64
Master server 10.0.0.3/24
Redis-Master 10.0.0.3:6379
Redis-Slave1 10.0.0.3:63791
Redis-Slave2 10.0.0.3:63792
Sentinel services
s 10.0.0.3:26379
s1 10.0.0.3:26378
Slave server 10.0.0.4/24
Redis-Slave3 10.0.0.4:63793
Redis-Slave4 10.0.0.4:63794
Sentinel services
s2 10.0.0.4:26379
s3 10.0.0.4:26378
Logical diagram before and after failover
Configuring Redis Sentinel
Install Redis
mkdir -p /usr/local/redis/data
cd /usr/local/src
wget http://download.redis.io/releases/redis-2.8.9.tar.gz
tar zxf redis-2.8.9.tar.gz
cd redis-2.8.9
make && make install
Copy the configuration files
cp redis.conf /usr/local/bin/
cd /usr/local/bin
cp redis.conf redis-slave1
cp redis.conf redis-slave2
Edit the configuration files
[root@master bin]# vi redis.conf
# run in the background as a daemon
daemonize yes
pidfile /var/run/redis.pid
bind 10.0.0.3
dbfilename dump.rdb
dir /usr/local/redis/data
port 6379

[root@master bin]# vi redis-slave1
daemonize yes
pidfile /var/run/redis-slave1.pid
port 63791
bind 10.0.0.3
dbfilename dump-slave1.rdb
dir /usr/local/redis/data
slaveof 10.0.0.3 6379
slave-read-only yes

[root@master bin]# vi redis-slave2
daemonize yes
pidfile /var/run/redis-slave2.pid
port 63792
bind 10.0.0.3
dbfilename dump-slave2.rdb
dir /usr/local/redis/data
slaveof 10.0.0.3 6379
Configure the redis-sentinel service
mkdir -p /var/log/redis
cp /usr/local/src/redis-2.8.9/src/redis-sentinel /usr/bin/
cp /usr/local/src/redis-2.8.9/sentinel.conf /usr/local/bin/
cd /usr/local/bin
cp sentinel.conf sentinel-s1.conf
Edit the configuration files
[root@master bin]# egrep -v "^#|^$" sentinel.conf
port 26379
daemonize yes
logfile /var/log/redis/sentinel.log
sentinel monitor mymaster 10.0.0.3 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000

[root@master bin]# egrep -v "^#|^$" sentinel-s1.conf
port 26378
daemonize yes
logfile /var/log/redis/sentinel-s1.log
sentinel monitor mymaster 10.0.0.3 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
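Repeat the same configuration steps on the slave server (10.0.0.4) to create redis-slave3, redis-slave4, sentinel-s2.conf and sentinel-s3.conf.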
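Since the slave-server files differ from the master-server ones only in name, port and bind address, they could also be generated rather than edited by hand. The sketch below is one possible way to do that, not the procedure used in this article: `slave_conf`, `sentinel_conf` and `OUT_DIR` are hypothetical names, the generated Redis files are minimal (every setting not listed keeps its default instead of coming from a copied redis.conf), and the log file name sentinel-s2.log is assumed by analogy with sentinel-s3.log.

```shell
#!/bin/sh
# Sketch: generate the config files for the slave server (10.0.0.4).
# Helper names and OUT_DIR are assumptions made for this example.
MASTER_IP=10.0.0.3
MASTER_PORT=6379
BIND_IP=10.0.0.4

# slave_conf NAME PORT: print a minimal Redis slave configuration.
slave_conf() {
    cat <<EOF
daemonize yes
pidfile /var/run/redis-$1.pid
port $2
bind $BIND_IP
dbfilename dump-$1.rdb
dir /usr/local/redis/data
slaveof $MASTER_IP $MASTER_PORT
slave-read-only yes
EOF
}

# sentinel_conf NAME PORT: print a Sentinel configuration that
# monitors the same master with the same quorum (2) as above.
sentinel_conf() {
    cat <<EOF
port $2
daemonize yes
logfile /var/log/redis/sentinel-$1.log
sentinel monitor mymaster $MASTER_IP $MASTER_PORT 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
EOF
}

# Write into a scratch directory by default; review the files and
# copy them to /usr/local/bin on the slave server afterwards.
OUT_DIR=${OUT_DIR:-$(mktemp -d)}
slave_conf slave3 63793 > "$OUT_DIR/redis-slave3"
slave_conf slave4 63794 > "$OUT_DIR/redis-slave4"
sentinel_conf s2 26379 > "$OUT_DIR/sentinel-s2.conf"
sentinel_conf s3 26378 > "$OUT_DIR/sentinel-s3.conf"
echo "wrote configs to $OUT_DIR"
```

Writing to a scratch directory first keeps the script harmless to run and leaves the decision to install the files to the operator.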
Start the Redis services
[root@master bin]# redis-server redis.conf
[root@master bin]# redis-server redis-slave1
[root@master bin]# redis-server redis-slave2
[root@master bin]# ps -ef|grep redis
root 2579 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:6379
root 2585 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:63792
root 2590 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:63791
root 2597 2479 0 23:56 pts/0 00:00:00 grep --color=auto redis

[root@slave bin]# redis-server redis-slave3
[root@slave bin]# redis-server redis-slave4
[root@slave bin]# ps -ef|grep redis
root 2576 1 0 23:56 ? 00:00:00 redis-server 10.0.0.4:63793
root 2580 1 0 23:56 ? 00:00:00 redis-server 10.0.0.4:63794
root 2584 2502 0 23:56 00:00:00 grep --color=auto redis
Start the redis-sentinel services
[root@master bin]# redis-sentinel sentinel.conf
[root@master bin]# redis-sentinel sentinel-s1.conf
[root@master bin]# ps -ef|grep redis-sentinel
root 2638 1 0 01:05 ? 00:00:04 redis-sentinel *:26379
root 2646 1 0 01:13 ? 00:00:00 redis-sentinel *:26378
root 2650 2479 0 01:13 00:00:00 grep --color=auto redis

[root@slave bin]# redis-sentinel sentinel-s2.conf
[root@slave bin]# redis-sentinel sentinel-s3.conf
[root@slave bin]# ps -ef|grep redis-sentinel
root 2644 1 1 01:14 ? 00:00:00 redis-sentinel *:26378
root 2649 1 0 01:14 ? 00:00:00 redis-sentinel *:26379
root 2653 2502 0 01:15 00:00:00 grep --color=auto redis-sentinel
Check the log to follow the startup
[root@master bin]# tail -f /var/log/redis/sentinel.log
`-.__.-'
[2664] 12 May 01:20:11.036 # Sentinel runid is c327be464ef36e670566a0d76c9dc85bac7f33b1
[2664] 12 May 01:20:11.036 # +monitor master mymaster 10.0.0.3 6379 quorum 2
[2664] 12 May 01:20:11.123 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.3:26378 or fb1fbe73b51a0a6e71a8ceae57d34ef773d086e3
[2664] 12 May 01:20:11.123 * +sentinel sentinel 10.0.0.3:26378 10.0.0.3 26378 @ mymaster 10.0.0.3 6379
[2664] 12 May 01:20:21.410 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.4:26379 or 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19
[2664] 12 May 01:20:21.410 * +sentinel sentinel 10.0.0.4:26379 10.0.0.4 26379 @ mymaster 10.0.0.3 6379
[2664] 12 May 01:20:25.103 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.4:26378 or 6d134d9a3e53c0cb70de842281de8aaf17a84c00
[2664] 12 May 01:20:25.103 * +sentinel sentinel 10.0.0.4:26378 10.0.0.4 26378 @ mymaster 10.0.0.3 6379
The +sentinel entries show the other three Sentinels being discovered and joining the cluster.
Check whether the configuration file has changed
[root@master bin]# egrep -v "^#|^$" sentinel-s1.conf
port 26378
daemonize yes
logfile "/var/log/redis/sentinel-s1.log"
sentinel monitor mymaster 10.0.0.3 6379 2
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 10.0.0.3 63792
dir "/usr/local/bin"
sentinel known-slave mymaster 10.0.0.4 63793
sentinel known-slave mymaster 10.0.0.4 63794
sentinel known-slave mymaster 10.0.0.3 63791
sentinel known-sentinel mymaster 10.0.0.3 26379 c327be464ef36e670566a0d76c9dc85bac7f33b1
sentinel known-sentinel mymaster 10.0.0.4 26379 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19
sentinel known-sentinel mymaster 10.0.0.4 26378 6d134d9a3e53c0cb70de842281de8aaf17a84c00
sentinel current-epoch 0
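Besides reading the rewritten configuration file, a Sentinel can be asked directly which master it currently tracks with `sentinel get-master-addr-by-name`. When redis-cli output is piped, the reply arrives as two lines (IP, then port); the small helper below joins them into one address. The helper name `format_master_addr` is made up for this example, and the `printf` line only stands in for a live reply so the snippet runs without a Redis server.

```shell
#!/bin/sh
# format_master_addr: read a two-line reply (IP, then port) on stdin
# and print it as IP:PORT. Hypothetical helper, not part of Redis.
format_master_addr() {
    tr -d '\r' | paste -sd: -
}

# Against a running Sentinel (not executed here):
#   redis-cli -h 10.0.0.3 -p 26379 sentinel get-master-addr-by-name mymaster | format_master_addr

# Offline demonstration with a hand-written reply:
printf '10.0.0.3\n6379\n' | format_master_addr   # prints 10.0.0.3:6379
```

Running the same query before and after a failover is a quick way to confirm that all four Sentinels agree on the master address.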
Watch the failover in the logs
[root@master bin]# redis-cli -h 10.0.0.3 -p 6379 shutdown
[root@master bin]# ps -ef|grep redis
root 2585 1 0 May11 ? 00:00:07 redis-server 10.0.0.3:63792
root 2590 1 0 May11 ? 00:00:07 redis-server 10.0.0.3:63791
root 2660 1 0 01:20 ? 00:00:02 redis-sentinel *:26378
root 2664 1 0 01:20 ? 00:00:02 redis-sentinel *:26379
root 2676 2479 0 01:30 00:00:00 grep --color=auto redis
The master process (10.0.0.3:6379) is no longer running, which simulates a master failure.
Clear the old log and watch the failover
[root@slave bin]# > /var/log/redis/sentinel-s3.log
[root@slave bin]# tail -f /var/log/redis/sentinel-s3.log
[2669] 12 May 01:30:55.203 # +sdown master mymaster 10.0.0.3 6379
[2669] 12 May 01:30:55.276 # +new-epoch 1
[2669] 12 May 01:30:55.280 # +vote-for-leader c327be464ef36e670566a0d76c9dc85bac7f33b1 1
[2669] 12 May 01:30:56.329 # +odown master mymaster 10.0.0.3 6379 #quorum 4/2
[2669] 12 May 01:30:57.547 # +switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792
[2669] 12 May 01:30:57.548 * +slave slave 10.0.0.4:63794 10.0.0.4 63794 @ mymaster 10.0.0.3 63792
[2669] 12 May 01:30:57.553 * +slave slave 10.0.0.4:63793 10.0.0.4 63793 @ mymaster 10.0.0.3 63792
[2669] 12 May 01:30:57.556 * +slave slave 10.0.0.3:63791 10.0.0.3 63791 @ mymaster 10.0.0.3 63792
[2669] 12 May 01:30:57.561 * +slave slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792
[2669] 12 May 01:31:27.620 # +sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792
The log shows the master being marked subjectively down (+sdown) and then objectively down (+odown, quorum 4/2). The Sentinels then promote 10.0.0.3:63792 to master (+switch-master), the remaining slaves are automatically repointed to it with slaveof, and the failover succeeds.
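The +switch-master entry carries the old and new master addresses as its last four fields, so the new master can be pulled out of a Sentinel log with a short filter. The following is a sketch only; the function name `new_master_from_log` is hypothetical, and it assumes the log line layout shown above.

```shell
#!/bin/sh
# new_master_from_log: read Sentinel log lines on stdin and print the
# address (IP:PORT) announced by the most recent +switch-master event.
# Prints nothing if no such event is found. Hypothetical helper.
new_master_from_log() {
    awk '/\+switch-master/ { addr = $(NF-1) ":" $NF }
         END { if (addr != "") print addr }'
}

# Against the real log (not executed here):
#   new_master_from_log < /var/log/redis/sentinel-s3.log

# Offline demonstration with the line from the log above:
echo '[2669] 12 May 01:30:57.547 # +switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792' \
    | new_master_from_log   # prints 10.0.0.3:63792
```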
Bring the original master back
[root@master bin]# redis-server redis.conf
[root@master bin]# ps -ef|grep redis
root 2585 1 0 May11 ? 00:00:08 redis-server 10.0.0.3:63792
root 2590 1 0 May11 ? 00:00:08 redis-server 10.0.0.3:63791
root 2660 1 0 01:20 ? 00:00:05 redis-sentinel *:26378
root 2664 1 0 01:20 ? 00:00:05 redis-sentinel *:26379
root 2683 1 0 01:36 ? 00:00:00 redis-server 10.0.0.3:6379
root 2689 2479 0 01:36 00:00:00 grep --color=auto redis

[root@slave bin]# tail -f /var/log/redis/sentinel-s3.log
[2673] 12 May 01:36:21.925 # -sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792
Once the original master recovers, it automatically rejoins the cluster as a slave and does not take the master role back.
Test read/write splitting
[root@master bin]# redis-cli -h 10.0.0.3 -p 63792
10.0.0.3:63792> get key
"test"
10.0.0.3:63792> set key file
OK
10.0.0.3:63792> get key
"file"

[root@master bin]# redis-cli -h 10.0.0.3 -p 6379
10.0.0.3:6379> get key
"file"
10.0.0.3:6379> set key file1
(error) READONLY You can't write against a read only slave.
This confirms that the new master was promoted successfully and that the recovered original master is now a read-only slave, so the write-to-master, read-from-slaves arrangement is preserved.
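The same check can be made without attempting a write, by reading the `role` field of `info replication` on each instance. The helper below is a minimal sketch: the name `role_of` is made up for this example, and the `printf` input is a hand-written sample in the INFO format (key:value lines with CRLF endings), not captured output.

```shell
#!/bin/sh
# role_of: read "info replication" output on stdin and print the
# value of the role field (master or slave). Hypothetical helper.
role_of() {
    tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Against running instances (not executed here):
#   redis-cli -h 10.0.0.3 -p 63792 info replication | role_of
#   redis-cli -h 10.0.0.3 -p 6379  info replication | role_of

# Offline demonstration with a hand-written sample:
printf '# Replication\r\nrole:slave\r\nmaster_host:10.0.0.3\r\nmaster_port:63792\r\n' | role_of   # prints slave
```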
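This completes the deployment, which provides cluster monitoring, automatic failover, and read/write splitting.
Original article by Maggie-Hunter. If you repost it, please credit the source: https://blog.ytso.com/54973.html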