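In a Redis master-slave setup, if the master fails you have to promote a slave to master by hand before service can continue. Manual switching is error-prone and can lead to data loss. Does Redis offer a mechanism that monitors the master and the slaves and automatically promotes a slave when the master fails? It does: Sentinel.
What Sentinel does:
1. Monitors the state of the Redis processes, both master and slaves
2. Automatically promotes a slave to master when the master goes down
Next we configure Sentinel to monitor the Redis processes. This assumes the master and slaves are already set up; for the detailed configuration see http://www.cnblogs.com/liuyansheng/p/6530851.html
1. Create a configuration file named sentinel.conf with the following content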
sentinel monitor myMaster 192.168.137.101 6379 1
# password used to authenticate with the master
sentinel auth-pass myMaster master123
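myMaster: the name of the master service. Any name will do, but the same name must be used in the other sentinel directives for this master, such as auth-pass
192.168.137.101 6379: the IP address and port of the master
1: the quorum. Once at least 1 Sentinel instance considers the master down, the master is marked as odown (objectively down; its counterpart is sdown, subjectively down, which is a single Sentinel's own view)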
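To make the quorum rule concrete, here is a minimal Python sketch of the step from sdown to odown. It is an illustrative model only, not Sentinel's actual code; the function name and the Sentinel labels are made up for this example.

```python
def is_objectively_down(sdown_reports, quorum):
    """Return True when enough Sentinels report the master as subjectively down.

    sdown_reports maps a Sentinel label (hypothetical names) to True if that
    Sentinel currently sees the master as unreachable (sdown).
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    # Count how many Sentinels currently hold the sdown view.
    votes = sum(1 for is_down in sdown_reports.values() if is_down)
    return votes >= quorum


# With a quorum of 1, a single sdown report is enough to reach odown.
print(is_objectively_down({"sentinel-a": True}, quorum=1))  # True

# With three Sentinels and a quorum of 2, one report alone is not enough.
reports = {"sentinel-a": True, "sentinel-b": False, "sentinel-c": False}
print(is_objectively_down(reports, quorum=2))  # False
```

A quorum of 1 is convenient for a single-Sentinel test like this one, but it also means one Sentinel with a bad network link can declare the master down on its own.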
For the full set of sentinel.conf options see http://www.cnblogs.com/liuyansheng/p/6531321.html
2. Start Sentinel by running the following command from the Redis src directory
./redis-sentinel ../../conf/sentinel.conf
The output shows that Sentinel is now monitoring both the master and the slaves.
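Besides reading the console output, you can ask Sentinel which address it currently treats as the master with the SENTINEL get-master-addr-by-name command. The sketch below speaks the Redis protocol over a plain socket. It assumes Sentinel listens on its default port 26379 and has no password of its own; the article does not state the Sentinel host or port, so the address in the usage comment is a placeholder. All helper names are hypothetical.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_addr_reply(reply):
    """Parse the two-element array reply into (ip, port).

    Returns None for a null reply (unknown master name) or an error reply.
    """
    lines = reply.split(b"\r\n")
    if len(lines) < 5 or lines[0] != b"*2":
        return None
    # Layout: *2, $<len>, <ip>, $<len>, <port>
    return lines[2].decode("utf-8"), int(lines[4])


def get_master_addr(host, port, master_name, timeout=2.0):
    """Ask one Sentinel for the current master address.

    Simplification: assumes the short reply arrives in a single read.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(
            encode_command("SENTINEL", "get-master-addr-by-name", master_name)
        )
        reply = conn.recv(4096)
    return parse_addr_reply(reply)


# Usage (needs a running Sentinel; host and port here are assumptions):
#   get_master_addr("192.168.137.101", 26379, "myMaster")
```

After a failover, the same call should return the address of the promoted slave, which is how clients are expected to find the new master.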
3. Test Sentinel: shut down the slave on 103, and Sentinel picks up the state change.
6594:X 14 Jun 01:01:26.647 # +sdown slave 192.168.137.102:6379 192.168.137.103 6379 @ myMaster 192.168.137.101 6379
Then restart the service on 103 and check the Sentinel console
6594:X 14 Jun 20:13:22.716 * +reboot slave 192.168.137.102:6379 192.168.137.103 6379 @ myMaster 192.168.137.101 6379
The slave has recovered.
4. Kill the master and check what Sentinel prints
6594:X 14 Jun 20:13:50.300 # +sdown master myMaster 192.168.137.101 6379 # the master is down
6594:X 14 Jun 20:13:50.300 # +odown master myMaster 192.168.137.101 6379 #quorum 1/1
6594:X 14 Jun 20:13:50.300 # +new-epoch 1
6594:X 14 Jun 20:13:50.300 # +try-failover master myMaster 192.168.137.101 6379 # failover starts
6594:X 14 Jun 20:13:50.304 # +vote-for-leader a496627d72d98fde98b8270e297ab32b330ebac7 1 # vote for the Sentinel leader; with only one Sentinel, it votes for itself
6594:X 14 Jun 20:13:50.304 # +elected-leader master myMaster 192.168.137.101 6379 # leader elected
6594:X 14 Jun 20:13:50.304 # +failover-state-select-slave master myMaster 192.168.137.101 6379 # select a slave to promote
6594:X 14 Jun 20:13:50.357 # +selected-slave slave 192.168.137.102:6379 192.168.137.102 6379 @ myMaster 192.168.137.101 6379 # 192.168.137.102 6379 chosen for promotion
6594:X 14 Jun 20:13:50.357 * +failover-state-send-slaveof-noone slave 192.168.137.102:6379 192.168.137.102 6379 @ myMaster 192.168.137.101 6379 # send the slaveof no one command
6594:X 14 Jun 20:13:50.420 * +failover-state-wait-promotion slave 192.168.137.102:6379 192.168.137.102 6379 @ myMaster 192.168.137.101 6379 # wait for the promotion to master
6594:X 14 Jun 20:13:50.515 # +promoted-slave slave 192.168.137.102:6379 192.168.137.102 6379 @ myMaster 192.168.137.101 6379 # 192.168.137.102 6379 promoted to master
6594:X 14 Jun 20:13:50.515 # +failover-state-reconf-slaves master myMaster 192.168.137.101 6379
6594:X 14 Jun 20:13:50.566 * +slave-reconf-sent slave 192.168.137.103:6379 192.168.137.103 6379 @ myMaster 192.168.137.101 6379
6594:X 14 Jun 20:13:51.333 * +slave-reconf-inprog slave 192.168.137.103:6379 192.168.137.103 6379 @ myMaster 192.168.137.101 6379
6594:X 14 Jun 20:13:52.382 * +slave-reconf-done slave 192.168.137.103:6379 192.168.137.103 6379 @ myMaster 192.168.137.101 6379
6594:X 14 Jun 20:13:52.438 # +failover-end master myMaster 192.168.137.101 6379 # failover complete
6594:X 14 Jun 20:13:52.438 # +switch-master myMaster 192.168.137.101 6379 192.168.137.102 6379 # master switched
6594:X 14 Jun 20:13:52.438 * +slave slave 192.168.137.103:6379 192.168.137.103 6379 @ myMaster 192.168.137.102 6379 # registered as a slave of the new master
6594:X 14 Jun 20:13:52.438 * +slave slave 192.168.137.101:6379 192.168.137.101 6379 @ myMaster 192.168.137.102 6379 # registered as a slave of the new master
6594:X 14 Jun 20:17:22.463 # +sdown slave 192.168.137.101:6379 192.168.137.101 6379 @ myMaster 192.168.137.102 6379 # 192.168.137.101 6379 detected as down
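If you want a script to react to a failover, the log output above is regular enough to parse. The following Python sketch pulls the old and new master addresses out of a +switch-master line. It assumes the log format matches the lines shown in this article (without trailing annotations on the fields it reads); the regular expression and function names are illustrative, not part of Redis.

```python
import re

# pid:role, timestamp, level marker, event name, then event-specific details.
LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<timestamp>\d+ \w+ [\d:.]+) "
    r"(?P<level>[#*.\-]) "
    r"(?P<event>[+-][\w-]+) "
    r"(?P<details>.*)$"
)


def parse_event(line):
    """Split one Sentinel log line into named fields, or return None."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def switch_master_addrs(line):
    """Return ((old_ip, old_port), (new_ip, new_port)) for +switch-master."""
    event = parse_event(line)
    if event is None or event["event"] != "+switch-master":
        return None
    fields = event["details"].split()
    if len(fields) < 5:
        return None
    # fields: master name, old ip, old port, new ip, new port
    _name, old_ip, old_port, new_ip, new_port = fields[:5]
    return (old_ip, int(old_port)), (new_ip, int(new_port))


line = ("6594:X 14 Jun 20:13:52.438 # +switch-master myMaster "
        "192.168.137.101 6379 192.168.137.102 6379")
print(switch_master_addrs(line))
# (('192.168.137.101', 6379), ('192.168.137.102', 6379))
```

Parsing logs is fragile compared with asking Sentinel directly, so treat this as a debugging aid rather than a way to drive client reconnection.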
5. Configure multiple Sentinels by modifying the configuration file
sentinel monitor myMaster1 192.168.137.101 6379 2
sentinel monitor myMaster2 192.168.137.101 6379 2
sentinel monitor myMaster3 192.168.137.101 6379 2
sentinel auth-pass myMaster1 master123
sentinel auth-pass myMaster2 master123
sentinel auth-pass myMaster3 master123
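With more than one Sentinel, reaching the quorum only marks the master as odown. To actually run the failover, one Sentinel must also be elected leader, which to my understanding requires votes from a majority of the known Sentinels and at least as many as the quorum. The Python sketch below models that rule for a three-Sentinel deployment with a quorum of 2; it is a simplified illustration with hypothetical function names, not Sentinel's implementation.

```python
def votes_needed_for_failover(total_sentinels, quorum):
    """Votes a Sentinel needs to be elected failover leader.

    Modelled as the larger of a strict majority and the configured quorum.
    """
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def can_fail_over(total_sentinels, reachable_sentinels, quorum):
    """True if enough Sentinels are reachable to authorize a failover."""
    needed = votes_needed_for_failover(total_sentinels, quorum)
    return reachable_sentinels >= needed


# Three Sentinels, quorum 2: two votes are needed.
print(votes_needed_for_failover(3, 2))  # 2
# One Sentinel may be lost and failover still works.
print(can_fail_over(3, 2, 2))  # True
# With only one Sentinel left, no failover is authorized.
print(can_fail_over(3, 1, 2))  # False
```

This is why three Sentinels is the usual minimum: with two, losing either one leaves the survivor without a majority.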
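This is meant to set up three Sentinels with a quorum of 2. Note that Sentinels only pool their votes for a master they monitor under the same name, so in a real three-Sentinel deployment each Sentinel process would normally get its own sentinel.conf that monitors 192.168.137.101 6379 under one shared name. The architecture is as follows: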