  • Redis Sentinel

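    Redis Sentinel is the high-availability (HA) solution officially recommended by Redis. In a plain master-slave setup, neither Redis itself nor many of its clients switch over automatically when the master goes down. Redis Sentinel runs as a separate process, can monitor multiple master-slave groups, and performs the failover automatically once it detects that a master is down.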

    Environment:

    ip            hostname  role
    192.168.20.3  node2003  redis-master,sentinel
    192.168.20.4  node2004  redis-slave,sentinel
    192.168.20.5  node2005  redis-slave,sentinel

    Redis is installed with yum here.


    node2003:

    Redis configuration file:

    ~]# grep "^[^#]" /etc/redis.conf
    bind 192.168.20.3
    protected-mode yes
    port 6379
    tcp-backlog 511
    timeout 0
    tcp-keepalive 300
    daemonize yes
    supervised systemd
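    slave-priority 100
    //Promotion priority. When the master fails, Sentinel chooses the new master from node2004 and node2005 by this value: the slave with the lowest priority value wins, and a value of 0 means the slave is never promoted. If priorities are equal, the slave with the larger replication offset (the one that received more data from the master) is preferred. If the offsets are also equal, the slave with the lexicographically smaller run ID is chosen. Before ranking, Sentinel also skips slaves that have been disconnected from the master for too long.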
    masterauth foo   
    requirepass foo  
    //With Sentinel, a master may become a slave and a slave may become a master, so both `masterauth` and `requirepass` must be set on every node.
    
    ...
    

    All other settings keep their default values; only the settings relevant to failover are shown here.
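
    The selection order described in the `slave-priority` comment can be sketched in a few lines of Python. This is only an illustration of the ranking rules, not Sentinel's actual code: the `Slave` class, the `pick_new_master` function and the run IDs are hypothetical, and the offsets are borrowed from the INFO output shown later in this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slave:
    name: str         # hypothetical label, e.g. "node2004"
    priority: int     # slave-priority from redis.conf
    repl_offset: int  # replication offset reported by the slave
    run_id: str       # run ID of the Redis instance (made up here)


def pick_new_master(slaves: List[Slave]) -> Optional[Slave]:
    """Return the preferred promotion candidate, or None if there is none."""
    # A priority of 0 marks a slave as never promotable.
    candidates = [s for s in slaves if s.priority > 0]
    if not candidates:
        return None
    # Lower priority value first, then larger offset, then smaller run ID.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


slaves = [
    Slave("node2004", 100, 391855, "b" * 40),
    Slave("node2005", 100, 391988, "a" * 40),
]
print(pick_new_master(slaves).name)  # node2005: same priority, larger offset
```

    The sketch leaves out the earlier filtering step in which slaves that are down or were disconnected from the master for too long are dropped before ranking.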

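    Sentinel configuration file:
    Sentinel automatically discovers the other Sentinels that monitor the same master (they announce themselves through the master's `__sentinel__:hello` Pub/Sub channel) and forms a group with them. It also obtains the list of slaves from the master.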

    ~]# grep "^[^#]" /etc/redis-sentinel.conf 
    bind node2003
    port 26379
    dir /tmp
    sentinel monitor R1 node2003 6379 2
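    //The monitored master is named R1 and lives at node2003:6379. The trailing 2 is the quorum: at least 2 of the 3 Sentinels must agree that the master is unreachable before it is considered objectively down. Starting the failover additionally requires authorization from a majority of the Sentinels.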
    
    sentinel auth-pass R1 foo
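    //Password used to connect to the master and its slaves. Sentinel cannot use different passwords for the master and the slaves, so all of them must share the same password.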
    
    sentinel down-after-milliseconds R1 30000
    //How long, in milliseconds, the master must be unreachable before this Sentinel considers it subjectively down (SDOWN).
    
    sentinel parallel-syncs R1 1
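    //Maximum number of slaves that may resynchronize with the new master at the same time after a failover. A value lower than the number of slaves makes the resynchronization take longer overall; a value equal to the number of slaves can leave all slaves unavailable at once while they load data from the new master.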
    
    sentinel failover-timeout R1 25000
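    //failover-timeout (in milliseconds) is used in several ways:
    1. A Sentinel waits twice this timeout before retrying a failover against the same master.
    2. It is the time after which a slave replicating from the wrong master is forced to replicate from the right one, counted from the moment a Sentinel detects the misconfiguration.
    3. It is the time after which a failover that is in progress but has not yet produced any configuration change is cancelled.
    4. It is the maximum time a failover waits for all slaves to be reconfigured to the new master. After this timeout the slaves are still reconfigured to point at the new master, but no longer following the parallel-syncs rule.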
    
    logfile /var/log/redis/sentinel.log
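
    The difference between the quorum in `sentinel monitor` and the majority needed to actually run a failover is easy to mix up, so here is a minimal sketch of the two checks. The function names are hypothetical; the rule they encode (ODOWN needs at least `quorum` agreeing Sentinels, while a failover leader needs votes from both a majority and at least `quorum` Sentinels) follows the Sentinel documentation.

```python
def is_objectively_down(sdown_votes: int, quorum: int) -> bool:
    """ODOWN is reached once at least `quorum` Sentinels report the master down."""
    return sdown_votes >= quorum


def votes_needed_for_leader(total_sentinels: int, quorum: int) -> int:
    """A Sentinel may lead the failover only with a majority AND at least quorum votes."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


# Values from this setup: 3 Sentinels, quorum 2.
print(is_objectively_down(sdown_votes=2, quorum=2))          # True
print(votes_needed_for_leader(total_sentinels=3, quorum=2))  # 2
```

    With 3 Sentinels and a quorum of 2, the cluster tolerates the loss of one Sentinel: the remaining two can still mark the master as down and elect a leader.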
    

    node2004 and node2005:

    Redis configuration file:

    ~]# grep "^[^#]" /etc/redis.conf
    bind node2004      //the bind address is the only difference between node2004 and node2005; everything else is identical
    protected-mode yes
    port 6379
    tcp-backlog 511
    timeout 0
    tcp-keepalive 300
    daemonize yes
    supervised systemd
    slaveof 192.168.20.3 6379
    masterauth foo
    requirepass foo
    slave-priority 100
    ...
    

    Sentinel configuration file:

    ~]# grep "^[^#]" /etc/redis-sentinel.conf 
    bind node2004
    port 26379
    dir /tmp
    sentinel monitor R1 node2003 6379 2
    sentinel auth-pass R1 foo
    sentinel down-after-milliseconds R1 30000
    sentinel parallel-syncs R1 1
    sentinel failover-timeout R1 20000
    logfile /var/log/redis/sentinel.log
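
    As a rough way to reason about `sentinel parallel-syncs`, the number of resynchronization rounds after a failover can be estimated as below. This is a simplified model with a hypothetical helper name: it assumes slaves are reconfigured in fixed-size batches, which is only an approximation of what Sentinel does.

```python
import math


def resync_rounds(slave_count: int, parallel_syncs: int) -> int:
    """Estimate how many batches are needed to repoint all slaves at the new master."""
    if parallel_syncs < 1:
        raise ValueError("parallel_syncs must be at least 1")
    return math.ceil(slave_count / parallel_syncs)


# After a failover in this setup the new master has up to 2 slaves.
print(resync_rounds(slave_count=2, parallel_syncs=1))  # 2
print(resync_rounds(slave_count=2, parallel_syncs=2))  # 1
```

    With `parallel-syncs 1` the slaves resynchronize one after another, so one slave keeps serving reads while the other is busy.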
    

    Testing:

    Start redis and redis-sentinel on node2003, node2004 and node2005:

    ~]# systemctl start redis
    ~]# systemctl start redis-sentinel
    

    Check the logs:

    node2003:

    ~]# tail -f /var/log/redis/sentinel.log
    6719:X 28 Dec 10:45:44.973 * supervised by systemd, will signal readiness
    ... (ASCII-art startup banner: Redis 3.2.12 (00000000/0) 64 bit, Running in sentinel mode, Port: 26379, PID: 6719, http://redis.io)

    6719:X 28 Dec 10:45:44.975 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128. //backlog warning: the kernel value is too small; raise net.core.somaxconn manually
    6719:X 28 Dec 10:45:44.975 # Sentinel ID is e698fe03128cb9460cea882d5144387c142a45f3
    6719:X 28 Dec 10:45:44.975 # +monitor master R1 192.168.20.3 6379 quorum 2
    6719:X 28 Dec 10:45:44.976 * +slave slave 192.168.20.4:6379 192.168.20.4 6379 @ R1 192.168.20.3 6379
    6719:X 28 Dec 10:45:44.976 * +slave slave 192.168.20.5:6379 192.168.20.5 6379 @ R1 192.168.20.3 6379 //both slaves, node2004 and node2005, have been discovered
    6719:X 28 Dec 10:46:05.142 * +fix-slave-config slave 192.168.20.5:6379 192.168.20.5 6379 @ R1 192.168.20.3 6379

    
    node2004:
    

    ~]# tail -f /var/log/redis/sentinel.log
    ...
    20779:X 28 Dec 10:48:29.663 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    20779:X 28 Dec 10:48:29.664 # Sentinel ID is d1705f00f601d82871bde7a2719cb39e8e880984
    20779:X 28 Dec 10:48:29.664 # +monitor master R1 192.168.20.3 6379 quorum 2
    20779:X 28 Dec 10:48:30.501 * +sentinel sentinel e698fe03128cb9460cea882d5144387c142a45f3 192.168.20.3 26379 @ R1 192.168.20.3 6379 //the Sentinel on node2003 has been discovered
    ...

    
    node2005:
    

    ~]# tail -f /var/log/redis/sentinel.log
    ...
    8475:X 28 Dec 10:49:44.089 # Sentinel ID is e21ab29980073fbed07e0b1719a6c6f270ebc10a
    8475:X 28 Dec 10:49:44.089 # +monitor master R1 192.168.20.3 6379 quorum 2
    8475:X 28 Dec 10:49:44.575 * +sentinel sentinel e698fe03128cb9460cea882d5144387c142a45f3 192.168.20.3 26379 @ R1 192.168.20.3 6379
    8475:X 28 Dec 10:49:46.133 * +sentinel sentinel d1705f00f601d82871bde7a2719cb39e8e880984 192.168.20.4 26379 @ R1 192.168.20.3 6379 //the Sentinels on node2003 and node2004 have both been discovered
    ...

    Redis and Sentinel are now running on all three nodes. Next, test the failover.
    
    
    Check the master-slave information:
    

    ~]# redis-cli -h node2003 -p 6379 -a foo
    node2003:6379> INFO replication

    # Replication

    role:master
    connected_slaves:2
    slave0:ip=192.168.20.4,port=6379,state=online,offset=391855,lag=1
    slave1:ip=192.168.20.5,port=6379,state=online,offset=391988,lag=0 //replication info for both slaves
    master_repl_offset:392254
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:392253
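
    When this check has to be automated, the `INFO replication` text is simple enough to parse by hand. The following is a minimal sketch, assuming the output format shown above; `parse_info` is a hypothetical helper, not part of any Redis client library.

```python
def parse_info(text: str) -> dict:
    """Parse INFO output into a dict; slaveN lines become nested dicts."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, section headers ("# Replication") and anything without a key.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if key.startswith("slave") and "=" in value:
            # e.g. slave0:ip=192.168.20.4,port=6379,state=online,offset=391855,lag=1
            info[key] = dict(item.split("=", 1) for item in value.split(","))
        else:
            info[key] = value
    return info


sample = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.20.4,port=6379,state=online,offset=391855,lag=1
slave1:ip=192.168.20.5,port=6379,state=online,offset=391988,lag=0
master_repl_offset:392254
"""

info = parse_info(sample)
print(info["role"], info["connected_slaves"])  # master 2
print(info["slave0"]["ip"], info["slave0"]["state"])  # 192.168.20.4 online
```

    All values are kept as strings; convert offsets with `int()` before comparing them.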

    
    
    Stop redis on node2003 and watch the state change:
    

    ~]# systemctl stop redis

    //check the Sentinel log on node2005
    ~]# tail -f /var/log/redis/sentinel.log
    20200:X 28 Dec 14:28:01.640 # Sentinel ID is e21ab29980073fbed07e0b1719a6c6f270ebc10a
    20200:X 28 Dec 14:28:01.640 # +monitor master R1 192.168.20.3 6379 quorum 2
    20200:X 28 Dec 14:28:01.641 * +slave slave 192.168.20.4:6379 192.168.20.4 6379 @ R1 192.168.20.3 6379
    20200:X 28 Dec 14:28:01.641 * +slave slave 192.168.20.5:6379 192.168.20.5 6379 @ R1 192.168.20.3 6379
    20200:X 28 Dec 14:28:09.782 * +sentinel sentinel d1705f00f601d82871bde7a2719cb39e8e880984 192.168.20.4 26379 @ R1 192.168.20.3 6379
    20200:X 28 Dec 14:28:11.778 * +sentinel sentinel e698fe03128cb9460cea882d5144387c142a45f3 192.168.20.3 26379 @ R1 192.168.20.3 6379 //normal startup log with node2003, node2004 and node2005 all online
    20200:X 28 Dec 14:28:11.779 # +new-epoch 8
    20200:X 28 Dec 14:29:43.387 # +sdown master R1 192.168.20.3 6379 //node2003 is detected as failing: subjectively down (SDOWN)
    20200:X 28 Dec 14:29:43.399 # +new-epoch 9
    20200:X 28 Dec 14:29:43.400 # +vote-for-leader d1705f00f601d82871bde7a2719cb39e8e880984 9
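    20200:X 28 Dec 14:29:43.470 # +odown master R1 192.168.20.3 6379 #quorum 3/2 //quorum reached: objectively down (ODOWN)
    20200:X 28 Dec 14:29:43.470 # Next failover delay: I will not start a failover before Fri Dec 28 14:30:24 2018 //this Sentinel voted for another leader, so it will not start a failover of its own before this time (about twice the failover-timeout)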
    20200:X 28 Dec 14:29:44.140 # +config-update-from sentinel d1705f00f601d82871bde7a2719cb39e8e880984 192.168.20.4 26379 @ R1 192.168.20.3 6379
    20200:X 28 Dec 14:29:44.140 # +switch-master R1 192.168.20.3 6379 192.168.20.5 6379 //the master is switched to node2005
    20200:X 28 Dec 14:29:44.141 * +slave slave 192.168.20.4:6379 192.168.20.4 6379 @ R1 192.168.20.5 6379
    20200:X 28 Dec 14:29:44.141 * +slave slave 192.168.20.3:6379 192.168.20.3 6379 @ R1 192.168.20.5 6379
    20200:X 28 Dec 14:30:14.146 # +sdown slave 192.168.20.3:6379 192.168.20.3 6379 @ R1 192.168.20.5 6379
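
    The key line in this log is `+switch-master`, which carries the master name followed by the old and new address. A small sketch for pulling these events out of a Sentinel log could look like this; the regular expression and the `find_master_switches` helper are assumptions based on the log lines shown above, not an official log format specification.

```python
import re

# Matches lines such as:
# 20200:X 28 Dec 14:29:44.140 # +switch-master R1 192.168.20.3 6379 192.168.20.5 6379
LINE_RE = re.compile(
    r"^(?P<pid>\d+):X (?P<ts>\d+ \w+ [\d:.]+) [#*] "
    r"(?P<event>[+-][\w-]+) (?P<details>.*)$"
)


def find_master_switches(lines):
    """Return (master_name, old_addr, new_addr) for every +switch-master event."""
    switches = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("event") != "+switch-master":
            continue
        name, old_ip, old_port, new_ip, new_port = match.group("details").split()[:5]
        switches.append((name, f"{old_ip}:{old_port}", f"{new_ip}:{new_port}"))
    return switches


log = [
    "20200:X 28 Dec 14:29:43.387 # +sdown master R1 192.168.20.3 6379",
    "20200:X 28 Dec 14:29:43.470 # +odown master R1 192.168.20.3 6379 #quorum 3/2",
    "20200:X 28 Dec 14:29:44.140 # +switch-master R1 192.168.20.3 6379 192.168.20.5 6379",
]
print(find_master_switches(log))
# [('R1', '192.168.20.3:6379', '192.168.20.5:6379')]
```

    The `//` annotations in the listings of this post are not part of the real log, so the sample lines here omit them.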

    
    Check the replication information:
    

    node2004:
    node2004:6379> info replication

    # Replication

    role:slave
    master_host:192.168.20.5 //the master has switched to node2005
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:0
    master_sync_in_progress:0
    slave_repl_offset:128128
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
    node2004:6379>

    node2005:
    node2005:6379> INFO replication

    # Replication

    role:master
    connected_slaves:1
    slave0:ip=192.168.20.4,port=6379,state=online,offset=19183,lag=1
    master_repl_offset:19183
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:19182
    node2005:6379>

    
    Restore redis on node2003:
    

    node2005:6379> INFO replication

    # Replication

    role:master
    connected_slaves:2
    slave0:ip=192.168.20.4,port=6379,state=online,offset=146099,lag=1
    slave1:ip=192.168.20.3,port=6379,state=online,offset=146232,lag=1
    master_repl_offset:146232
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:146231
    node2005:6379>

    node2003 has automatically been added as a slave of node2005.
    
    
    Finally, look at the configuration files on node2003:
    

    ~]# vim /etc/redis-sentinel.conf
    ...
    sentinel monitor R1 192.168.20.5 6379 2

    ~]# vim /etc/redis.conf
    ...
    slaveof 192.168.20.5 6379

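    These settings were not in node2003's configuration files originally. Sentinel rewrites the configuration by itself, so the relevant state is not lost when Sentinel restarts.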
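
    Because the master address changes after a failover, a client should ask a Sentinel for the current master instead of hard-coding it. Sentinel offers the `SENTINEL get-master-addr-by-name <master-name>` command for this. The sketch below speaks the Redis wire protocol directly with the standard library only; the helper names are hypothetical, and it assumes the whole reply arrives in a single `recv` call, which is good enough for a demonstration but not for production use.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_addr_reply(reply: bytes):
    """Parse a two-element RESP array (ip, port); return None for any other reply."""
    parts = reply.split(b"\r\n")
    if parts[0] != b"*2":
        return None  # e.g. a nil reply for an unknown master name
    # parts looks like: [b"*2", b"$12", b"192.168.20.5", b"$4", b"6379", b""]
    return parts[2].decode(), int(parts[4])


def get_master_addr(sentinel_host: str, sentinel_port: int, master_name: str,
                    timeout: float = 1.0):
    """Ask one Sentinel for the current address of the named master."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        return parse_addr_reply(sock.recv(4096))
```

    In this setup, `get_master_addr("node2003", 26379, "R1")` would be expected to return the address of node2005 after the failover shown above. Real applications would normally rely on the Sentinel support built into their Redis client library rather than on hand-written protocol code.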
    
    
    
    For how Sentinel works internally, see: https://segmentfault.com/a/1190000002680804
  • Original article: https://www.cnblogs.com/dance-walter/p/10190528.html