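Sentinel mode is a special mode in which Sentinel runs as an independent process. It works by sending commands to the Redis servers and waiting for their responses, which lets it monitor multiple Redis instances.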
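1. Configure the Sentinel configuration file sentinel.conf
[root@redis01 redis]# cat sentinel.conf    # the monitor line below is the key setting
sentinel monitor myredis01 192.168.100.208 6379 1
The official configuration file explains it as follows: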
# sentinel monitor <master-name> <ip> <redis-port> <quorum>
# Tells Sentinel to monitor this master, and to consider it in O_DOWN
# (Objectively Down) state only if at least <quorum> sentinels agree.
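Once at least <quorum> Sentinels agree that the master is unreachable, the master is considered objectively down (O_DOWN).
The Sentinels then vote to elect a leader Sentinel, which picks one of the slaves and promotes it to be the new master.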
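The counting rule behind O_DOWN can be sketched as a small shell function. This is only an illustration of the rule; `is_objectively_down` and its arguments are hypothetical names, not part of Redis.

```shell
# Hypothetical helper illustrating the O_DOWN rule; not a Redis command.
# $1 = number of Sentinels that currently report the master as down
# $2 = <quorum> from the "sentinel monitor" line
is_objectively_down() {
    [ "$1" -ge "$2" ]
}

# With a quorum of 1, a single report is enough.
if is_objectively_down 1 1; then
    echo "O_DOWN reached"
fi
```

With a quorum of 2, the same single report would leave the master only subjectively down on that one Sentinel.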
Sentinel is also usually configured to run in the background and to write to a log file:
daemonize yes
logfile "sentinel01.log"
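# How long the master must be unreachable before this Sentinel subjectively considers it down (S_DOWN); the default is 30000 ms (30 s)
sentinel down-after-milliseconds myredis01 30000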
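Putting the settings from this step together, a minimal configuration file could be generated as sketched below. The output path and the `CONF_PATH` variable are assumptions made for this example, and `port 26379` is simply the Sentinel default stated explicitly.

```shell
# Assumed output path for this sketch; the article itself uses /redis/sentinel.conf.
CONF_PATH="${CONF_PATH:-${TMPDIR:-/tmp}/sentinel-example.conf}"

cat > "$CONF_PATH" <<'EOF'
# Port Sentinel listens on (26379 is the default)
port 26379
daemonize yes
logfile "sentinel01.log"
# sentinel monitor <master-name> <ip> <redis-port> <quorum>
sentinel monitor myredis01 192.168.100.208 6379 1
# The master name must match the one used in "sentinel monitor"
sentinel down-after-milliseconds myredis01 30000
EOF

echo "wrote $CONF_PATH"
```

Sentinel rewrites this file at runtime to persist what it learns (for example the discovered slaves), so the file has to be writable by the Sentinel process.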
2. Start Sentinel
[root@redis01 redis]# redis-sentinel /redis/sentinel.conf
13625:X 22 Oct 2020 22:34:39.014 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
13625:X 22 Oct 2020 22:34:39.014 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=13625, just started
13625:X 22 Oct 2020 22:34:39.014 # Configuration loaded
13625:X 22 Oct 2020 22:34:39.017 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 5.0.5 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 13625
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
13625:X 22 Oct 2020 22:34:39.019 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
13625:X 22 Oct 2020 22:34:39.020 # Sentinel ID is c75fccd6ca92e1f1ea71977d8e2aaa705147b708
13625:X 22 Oct 2020 22:34:39.020 # +monitor master myredis01 192.168.100.208 6379 quorum 1
13625:X 22 Oct 2020 22:34:39.021 * +slave slave 192.168.100.210:6379 192.168.100.210 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 22:34:39.022 * +slave slave 192.168.100.209:6379 192.168.100.209 6379 @ myredis01 192.168.100.208 6379
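Besides reading the startup log, the running Sentinel can be queried with redis-cli. The following is a minimal sketch, assuming Sentinel is reachable on 127.0.0.1:26379 (the port shown in the banner). `sentinel_cmd` and the `REDIS_CLI`, `SENTINEL_HOST` and `SENTINEL_PORT` variables are hypothetical names introduced for this example.

```shell
# Assumed connection settings for this sketch; override them as needed.
REDIS_CLI="${REDIS_CLI:-redis-cli}"
SENTINEL_HOST="${SENTINEL_HOST:-127.0.0.1}"
SENTINEL_PORT="${SENTINEL_PORT:-26379}"

# Hypothetical wrapper: send a SENTINEL subcommand to the Sentinel process.
sentinel_cmd() {
    "$REDIS_CLI" -h "$SENTINEL_HOST" -p "$SENTINEL_PORT" sentinel "$@"
}

# Usage (needs a running Sentinel):
#   sentinel_cmd get-master-addr-by-name myredis01   # current master ip and port
#   sentinel_cmd master myredis01                    # state of the monitored master
#   sentinel_cmd slaves myredis01                    # slaves known for that master
```

The slave list returned by the last command should match the two `+slave` lines in the log.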
3. Manually shut down the master host 192.168.100.208. Sentinel promotes 192.168.100.210 to be the new master:
13625:X 22 Oct 2020 23:34:19.523 # +sdown master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.523 # +odown master myredis01 192.168.100.208 6379 #quorum 1/1
13625:X 22 Oct 2020 23:34:19.523 # +new-epoch 1
13625:X 22 Oct 2020 23:34:19.523 # +try-failover master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.525 # +vote-for-leader c75fccd6ca92e1f1ea71977d8e2aaa705147b708 1
13625:X 22 Oct 2020 23:34:19.525 # +elected-leader master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.525 # +failover-state-select-slave master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.626 # +selected-slave slave 192.168.100.210:6379 192.168.100.210 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.626 * +failover-state-send-slaveof-noone slave 192.168.100.210:6379 192.168.100.210 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:19.678 * +failover-state-wait-promotion slave 192.168.100.210:6379 192.168.100.210 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:20.380 # +promoted-slave slave 192.168.100.210:6379 192.168.100.210 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:20.380 # +failover-state-reconf-slaves master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:20.432 * +slave-reconf-sent slave 192.168.100.209:6379 192.168.100.209 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:21.411 * +slave-reconf-inprog slave 192.168.100.209:6379 192.168.100.209 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:21.411 * +slave-reconf-done slave 192.168.100.209:6379 192.168.100.209 6379 @ myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:21.473 # +failover-end master myredis01 192.168.100.208 6379
13625:X 22 Oct 2020 23:34:21.473 # +switch-master myredis01 192.168.100.208 6379 192.168.100.210 6379
13625:X 22 Oct 2020 23:34:21.473 * +slave slave 192.168.100.209:6379 192.168.100.209 6379 @ myredis01 192.168.100.210 6379
13625:X 22 Oct 2020 23:34:21.473 * +slave slave 192.168.100.208:6379 192.168.100.208 6379 @ myredis01 192.168.100.210 6379
13625:X 22 Oct 2020 23:34:51.494 # +sdown slave 192.168.100.208:6379 192.168.100.208 6379 @ myredis01 192.168.100.210 6379
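The failover log is long, and the line that matters most to clients is `+switch-master`, which carries the old and the new master address. A small awk filter can pull those events out of a Sentinel log. This is a sketch that assumes the log line layout shown above; `show_master_switches` is a hypothetical helper name.

```shell
# Hypothetical helper: print every master switch recorded in a Sentinel log.
# Reads the file given as $1, or standard input when no file is given.
# Assumed field layout: pid:role day month year time mark event args...
show_master_switches() {
    awk '$7 == "+switch-master" {
        printf "%s: %s:%s -> %s:%s\n", $8, $9, $10, $11, $12
    }' "$@"
}

# Example using the event line from the log above:
echo '13625:X 22 Oct 2020 23:34:21.473 # +switch-master myredis01 192.168.100.208 6379 192.168.100.210 6379' |
    show_master_switches
# prints: myredis01: 192.168.100.208:6379 -> 192.168.100.210:6379
```

When Sentinel runs in the background, the same function can be pointed at the configured log file, for example `show_master_switches sentinel01.log`.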
4. Restore the Redis service on the original master host 192.168.100.208. Its role changes to slave, and Sentinel logs the following:
13625:X 22 Oct 2020 23:37:41.849 # -sdown slave 192.168.100.208:6379 192.168.100.208 6379 @ myredis01 192.168.100.210 6379
13625:X 22 Oct 2020 23:37:51.753 * +convert-to-slave slave 192.168.100.208:6379 192.168.100.208 6379 @ myredis01 192.168.100.210 6379
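The role change can also be confirmed on the recovered node itself through `INFO replication`. The helper below is a sketch; `info_field` is a hypothetical name, and the usage lines assume the node is reachable from where the command runs.

```shell
# Hypothetical helper: extract one field from "INFO replication" output.
# INFO lines look like "role:slave" and end in CRLF, so strip the \r first.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Usage (needs the recovered node to be reachable):
#   redis-cli -h 192.168.100.208 -p 6379 info replication | info_field role
#   redis-cli -h 192.168.100.208 -p 6379 info replication | info_field master_host
```

If the conversion succeeded, the first command should report slave and the second should report 192.168.100.210.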
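With only one Sentinel, the Sentinel process itself becomes a single point of failure: if it goes down, nothing monitors the master and no automatic failover can take place.
For production, run multiple Sentinels (the Redis documentation recommends at least three). The Sentinels also monitor each other; this is multi-Sentinel mode.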
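When sizing a multi-Sentinel deployment, note that the quorum only decides when the master is flagged as objectively down; the failover itself must additionally be authorized by a majority of all Sentinels. The sketch below computes that majority for a few deployment sizes. `majority` is a hypothetical helper name.

```shell
# Hypothetical helper: smallest number of Sentinels that forms a majority.
majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 1 3 5; do
    echo "sentinels=$n majority=$(majority "$n")"
done
```

A common choice is to set the quorum to the majority, for example a quorum of 2 with three Sentinels, so that losing one Sentinel does not block failover.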