  • Redis Sentinel mode cluster configuration (version 5.0.3)

    I. Preparation

    1. System environment: CentOS 6.4

    2. Six servers (1 master, 5 slaves):

    192.168.1.161 (master)

    192.168.1.162 (slave)

    192.168.1.163 (slave)

    192.168.1.141 (slave)

    192.168.1.142 (slave)

    192.168.1.143 (slave)

    3. Redis version: 5.0.3

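    4. Installation:

    Change to the directory: cd /usr/local

    Download Redis: wget http://download.redis.io/releases/redis-5.0.3.tar.gz

    Extract the archive once the download finishes: tar zxvf redis-5.0.3.tar.gz

    Rename the folder to redis (purely personal preference): mv redis-5.0.3 redis

    Enter the redis folder: cd redis

    Compile and install: make && make install

    Note: the official instructions I followed only show make (compile), not make install (install)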

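    II. Configuration

    1. Master configuration: copy the master's redis.conf and name the copy redis_master.conf

    Settings (redis.conf only accepts comments on their own lines starting with #, so the annotations are written that way):

    ### NETWORK settings:
    # bind selects the network interface IP to listen on; it is left commented out so that any IP can connect
    # bind 127.0.0.1
    # Protected mode off; access is controlled by password instead
    protected-mode no
    # Port. The default port is fine for testing; I changed it here, and a custom port is recommended in production
    port 17000
    # Seconds a client may stay idle before its connection is closed
    timeout 30

    ### GENERAL settings:
    # Run in the background
    daemonize yes
    # PID file name
    pidfile /var/run/redis_17000.pid
    # Log file location
    logfile /usr/local/redis/logs/redis.log

    ### SNAPSHOTTING settings:
    # Directory for snapshot files
    dir /usr/local/redis/datas

    ### APPEND ONLY MODE settings:
    # Defaults to no, meaning only full RDB snapshots are used for persistence. Setting it to yes enables incremental AOF persistence.
    appendonly yes
    # fsync after every write: the safest option, and the slowest
    appendfsync always

    ### SECURITY settings: always use a strong password in production
    requirepass 123456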

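    2. Slave configuration: I copied the master's redis.conf and named the copy redis_slave.conf

    ### NETWORK settings:
    # bind is left commented out so that any IP can connect
    # bind 127.0.0.1
    # Port
    port 17000
    # Protected mode off; access is controlled by password instead
    protected-mode no
    # Seconds a client may stay idle before its connection is closed
    timeout 30

    ### GENERAL settings:
    # Run in the background
    daemonize yes
    pidfile /var/run/redis_17000.pid
    # Log file location
    logfile /usr/local/redis/logs/redis.log

    ### SNAPSHOTTING settings:
    # Directory for snapshot files
    dir /usr/local/redis/datas

    ### REPLICATION settings:
    # IP address and port of the master
    replicaof 192.168.1.161 17000
    # If the slave cannot sync with the master, it stops serving data (most commands return an error), which makes the problem easy for monitoring scripts to spot
    replica-serve-stale-data no
    # Password of the master
    masterauth 123456

    ### APPEND ONLY MODE settings:
    # Defaults to no, meaning only full RDB snapshots are used for persistence. Setting it to yes enables incremental AOF persistence.
    appendonly yes
    appendfsync always

    ### SECURITY settings: always use a strong password in production
    requirepass 123456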
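
    A slave can only sync when its masterauth equals the master's requirepass and its replicaof line points at the master's port. As a rough illustration, the Python sketch below parses redis.conf-style text and checks those two points. The names parse_conf and check_replica are made up for this example, and the parser is deliberately simplified (no quoted values, last occurrence of a directive wins).

```python
def parse_conf(text):
    """Parse redis.conf-style text into {directive: [args]}.

    Simplified: skips blank lines and lines starting with '#',
    splits on whitespace, and keeps only the last occurrence of a directive.
    """
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        conf[name.lower()] = args
    return conf


def check_replica(master_conf, replica_conf):
    """Return a list of problems; an empty list means the two configs agree."""
    problems = []
    if replica_conf.get("masterauth") != master_conf.get("requirepass"):
        problems.append("masterauth does not match the master's requirepass")
    replicaof = replica_conf.get("replicaof", [])
    if len(replicaof) != 2 or replicaof[1:] != master_conf.get("port"):
        problems.append("replicaof does not point at the master's port")
    return problems


# Trimmed-down versions of the two files from this article
MASTER = """
port 17000
requirepass 123456
"""

REPLICA = """
port 17000
replicaof 192.168.1.161 17000
masterauth 123456
requirepass 123456
"""
```

    Calling check_replica(parse_conf(MASTER), parse_conf(REPLICA)) returns an empty list for the values used in this article.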

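    3. Sentinel configuration; the configuration file is sentinel.conf

    # Sentinel port
    port 16000
    # Turn off protected mode
    protected-mode no
    # Run as a daemon
    daemonize yes
    # Working directory of the sentinel process
    dir /usr/local/redis/sentinel/

    # Monitor a master named mymaster at 192.168.1.161, port 17000. The trailing 1 is the quorum: at least 1 Sentinel must agree that the master is unreachable before it is judged to have failed. While fewer Sentinels than that agree, automatic failover is not started.
    sentinel monitor mymaster 192.168.1.161 17000 1
    # Password for the master; in 5.0.3 this line has to come after the sentinel monitor line that defines mymaster
    sentinel auth-pass mymaster 123456
    # The master is considered down once it has been unreachable for 5000 ms (this is a threshold, not the check interval)
    sentinel down-after-milliseconds mymaster 5000

    # Maximum number of slaves that may resync with the new master at the same time during a failover. With many slaves, a smaller value means syncing takes longer and so does the whole failover.
    sentinel parallel-syncs mymaster 2

    # If the failover does not complete within this time (ms) it is considered failed; tune this to the data volume in production
    sentinel failover-timeout mymaster 300000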
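
    To make the effect of parallel-syncs concrete, here is a back-of-the-envelope Python sketch. It assumes every slave takes about the same time to resync and that slaves are handled in groups of at most parallel-syncs, which is a simplification of what Sentinel actually does. The function name reconfig_rounds is hypothetical.

```python
import math


def reconfig_rounds(slaves_to_reconfigure, parallel_syncs):
    """Estimate how many sync 'rounds' a failover needs.

    Simplified model: slaves are repointed in groups of at most
    parallel_syncs, and each group takes one round.
    """
    if parallel_syncs < 1:
        raise ValueError("parallel_syncs must be at least 1")
    return math.ceil(slaves_to_reconfigure / parallel_syncs)
```

    With the 5 slaves in this setup, one is promoted and 4 remain to be repointed, so reconfig_rounds(4, 2) gives 2 rounds, while parallel-syncs 1 would give 4.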

    III. Starting the Redis service

    1. On each server, first create the directories that the configuration files refer to

    mkdir -p /usr/local/redis/datas
    mkdir -p /usr/local/redis/logs
    mkdir -p /usr/local/redis/sentinel

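    2. Start the services

    Start the master first, that is, on server 192.168.1.161

    Enter the redis folder: cd /usr/local/redis

    ### Start the master:
    src/redis-server redis_master.conf

    Note: the configuration file redis_master.conf is passed explicitly here

    Use redis-cli to connect to the server and check its status. Because the port was changed, redis-cli must be told the port with -p:

    src/redis-cli -p 17000

    The prompt 127.0.0.1:17000> means the connection to the server succeeded

    Running info replication to view the replication status fails at this point with: NOAUTH Authentication required.

    The reason is that both master and slave have a password set, so you must authenticate before querying the replication status:

    AUTH 123456

    The password is 123456, as set in the configuration file. OK means authentication succeeded; running info replication again then shows the replication status: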

    [root@mongodb-161 redis]# src/redis-server redis_master.conf
    [root@mongodb-161 redis]# src/redis-cli -p 17000
    127.0.0.1:17000> info replication
    NOAUTH Authentication required.
    127.0.0.1:17000> AUTH 123456
    OK
    127.0.0.1:17000> info replication
    # Replication
    role:master
    connected_slaves:0
    master_replid:96942e0268bff14dba3b6d1e22a9b55e9e77343b
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:0
    second_repl_offset:-1
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
    127.0.0.1:17000> 

    Only the master has been started so far, hence connected_slaves:0
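
    The INFO output is plain key:value text, so it is easy to check from a script. A minimal Python sketch follows; parse_info is a name made up for this example, and the sample is a shortened copy of the output above.

```python
def parse_info(text):
    """Parse INFO output into a dict of strings, skipping '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


# Shortened copy of the master's output right after startup
SAMPLE = """# Replication
role:master
connected_slaves:0
master_repl_offset:0
"""

info = parse_info(SAMPLE)
print(info["role"], info["connected_slaves"])
```

    A monitoring script could compare int(info["connected_slaves"]) with the number of slaves it expects.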

    Next, start Redis on server 192.168.1.162

    src/redis-server redis_slave.conf

    Note: the configuration file redis_slave.conf is passed explicitly here

    After startup, check the status with redis-cli -p 17000

    [root@mongodb-162 redis]# src/redis-server redis_slave.conf 
    [root@mongodb-162 redis]# src/redis-cli -p 17000
    127.0.0.1:17000> AUTH 123456
    OK
    127.0.0.1:17000> info replication
    # Replication
    role:slave
    master_host:192.168.1.161
    master_port:17000
    master_link_status:up
    master_last_io_seconds_ago:9
    master_sync_in_progress:0
    slave_repl_offset:28
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:28
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:28
    127.0.0.1:17000> 

    The current role is slave and the master it is connected to is 192.168.1.161. Now check the status on the master, 192.168.1.161

    127.0.0.1:17000> info replication
    # Replication
    role:master
    connected_slaves:1
    slave0:ip=192.168.1.162,port=17000,state=online,offset=14,lag=1
    master_replid:b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:14
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:14
    127.0.0.1:17000> 

    The master now has one slave connected. Start Redis on the remaining servers as well, then check the master's status again

    127.0.0.1:17000> info replication
    # Replication
    role:master
    connected_slaves:5
    slave0:ip=192.168.1.162,port=17000,state=online,offset=308,lag=1
    slave1:ip=192.168.1.163,port=17000,state=online,offset=308,lag=0
    slave2:ip=192.168.1.141,port=17000,state=online,offset=308,lag=1
    slave3:ip=192.168.1.142,port=17000,state=online,offset=308,lag=1
    slave4:ip=192.168.1.143,port=17000,state=online,offset=308,lag=0
    master_replid:b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:308
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:308
    127.0.0.1:17000> 

    All 5 slaves are now connected to the master.
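
    The slaveN lines carry the per-slave state and offset, which is what you would look at to confirm that every slave is online and caught up. The Python sketch below extracts them; parse_replicas is a hypothetical helper and the sample is a shortened copy of the output above.

```python
def parse_replicas(info_text):
    """Return one dict per slaveN line of INFO replication output."""
    replicas = []
    for line in info_text.splitlines():
        key, _, value = line.strip().partition(":")
        # Match slave0, slave1, ... but not slave_priority and similar keys
        if key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            fields["offset"] = int(fields["offset"])
            fields["lag"] = int(fields["lag"])
            replicas.append(fields)
    return replicas


SAMPLE = """role:master
connected_slaves:2
slave0:ip=192.168.1.162,port=17000,state=online,offset=308,lag=1
slave1:ip=192.168.1.163,port=17000,state=online,offset=308,lag=0
master_repl_offset:308
"""

replicas = parse_replicas(SAMPLE)
all_online = all(r["state"] == "online" for r in replicas)
# Bytes each slave is behind the master's replication offset
behind = {r["ip"]: 308 - r["offset"] for r in replicas}
```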

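    IV. Starting the Sentinel processes

    The number of Sentinel processes does not have to match the number of Redis instances, and Sentinels do not have to run on the Redis servers. A Sentinel monitors all the Redis instances and communicates with the other Sentinels. If a Sentinel runs on a separate server, Redis still has to be installed there, because Sentinel is just one of the programs in the Redis package.

    Here one Sentinel process runs on each server

    Start command: src/redis-sentinel sentinel.conf

    Check the log after startup (shown here for the sentinel on 192.168.1.141; the path sentinel/sentinel.log implies a logfile directive in sentinel.conf that the listing above does not show):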

    [root@redis redis]# src/redis-sentinel sentinel.conf
    [root@redis redis]# cat sentinel/sentinel.log 
    2747:X 15 Mar 2019 16:18:52.042 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    2747:X 15 Mar 2019 16:18:52.042 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=2747, just started
    2747:X 15 Mar 2019 16:18:52.042 # Configuration loaded
    2748:X 15 Mar 2019 16:18:52.059 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    2748:X 15 Mar 2019 16:18:52.061 * Running mode=sentinel, port=16000.
    2748:X 15 Mar 2019 16:18:52.061 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    2748:X 15 Mar 2019 16:18:52.071 # Sentinel ID is 53e37f3442336b7b229fa42c655bdca9eaa6578c
    2748:X 15 Mar 2019 16:18:52.072 # +monitor master mymaster 192.168.1.161 17000 quorum 1
    2748:X 15 Mar 2019 16:18:52.073 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.075 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.076 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.078 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.079 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.515 * +sentinel sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:54.057 * +sentinel sentinel 93da0f13c96f22e0263e4288c68390cce5e11cea 192.168.1.161 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:54.078 * +sentinel sentinel 605bf637a218de833308aa590cb3346de0eaacaf 192.168.1.162 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:55.520 * +sentinel sentinel 3607a3f65b224ee03e49a825604b40d1cc1fbeaf 192.168.1.142 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:58.040 * +sentinel sentinel d8e333f5daec7809605b8f276b1f2b44c1711459 192.168.1.143 16000 @ mymaster 192.168.1.161 17000

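    The warning appears because the TCP backlog is set to 511 while the system currently allows only 128 (/proc/sys/net/core/somaxconn); it can be ignored for now.

    At this point all master and slave services are running normally.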
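
    Each +slave and +sentinel line in the log records an instance this sentinel has discovered. As an illustration, the Python sketch below collects them from log text so the result can be compared with the expected server list. The regular expression is written against the line format shown in this article only, and discovered_peers is a made-up name.

```python
import re

# Matches e.g. "+slave slave <name> <ip> <port> @ <master> ..."
PEER_RE = re.compile(r"\+(slave|sentinel) \1 \S+ (\S+) (\d+) @ (\S+)")


def discovered_peers(log_text):
    """Collect the ip:port of every slave and sentinel announced in the log."""
    peers = {"slave": set(), "sentinel": set()}
    for line in log_text.splitlines():
        m = PEER_RE.search(line)
        if m:
            kind, ip, port, _master = m.groups()
            peers[kind].add(f"{ip}:{port}")
    return peers


# Three lines copied from the log above
SAMPLE = """2748:X 15 Mar 2019 16:18:52.073 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
2748:X 15 Mar 2019 16:18:52.075 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
2748:X 15 Mar 2019 16:18:52.515 * +sentinel sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
"""

peers = discovered_peers(SAMPLE)
```

    Run against the full log above, the same function should list five slaves and five other sentinels, since a sentinel does not announce itself.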

    V. Failover

    Simulate a master crash by killing the master process directly

    Find the redis processes:

    [root@mongodb-161 redis]# ps -e | grep redis
     2730 ?        00:00:01 redis-server
     2748 ?        00:00:02 redis-sentinel

    The redis-server process ID is 2730

    Kill it by process ID: kill -9 2730

    [root@mongodb-161 redis]# kill -9 2730
    [root@mongodb-161 redis]# ps -e | grep redis
     2748 ?        00:00:02 redis-sentinel
    [root@mongodb-161 redis]# 

    After the kill, the process list no longer contains redis-server.

    Then check the sentinel log on 192.168.1.161

    [root@mongodb-161 redis]# cat sentinel/sentinel.log 
    3129:X 15 Mar 2019 16:18:42.113 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    3129:X 15 Mar 2019 16:18:42.113 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=3129, just started
    3129:X 15 Mar 2019 16:18:42.113 # Configuration loaded
    3130:X 15 Mar 2019 16:18:42.127 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    3130:X 15 Mar 2019 16:18:42.129 * Running mode=sentinel, port=16000.
    3130:X 15 Mar 2019 16:18:42.129 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    3130:X 15 Mar 2019 16:18:42.139 # Sentinel ID is 93da0f13c96f22e0263e4288c68390cce5e11cea
    3130:X 15 Mar 2019 16:18:42.139 # +monitor master mymaster 192.168.1.161 17000 quorum 1
    3130:X 15 Mar 2019 16:18:42.141 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:42.143 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:42.144 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:42.146 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:42.147 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:50.335 * +sentinel sentinel 605bf637a218de833308aa590cb3346de0eaacaf 192.168.1.162 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:52.836 * +sentinel sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:54.374 * +sentinel sentinel 53e37f3442336b7b229fa42c655bdca9eaa6578c 192.168.1.141 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:55.841 * +sentinel sentinel 3607a3f65b224ee03e49a825604b40d1cc1fbeaf 192.168.1.142 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:18:58.361 * +sentinel sentinel d8e333f5daec7809605b8f276b1f2b44c1711459 192.168.1.143 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:25:46.300 # +new-epoch 1
    3130:X 15 Mar 2019 16:25:46.301 # +vote-for-leader 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3130:X 15 Mar 2019 16:25:46.336 # +sdown master mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:25:46.336 # +odown master mymaster 192.168.1.161 17000 #quorum 1/1
    3130:X 15 Mar 2019 16:25:46.336 # Next failover delay: I will not start a failover before Fri Mar 15 16:35:47 2019
    3130:X 15 Mar 2019 16:25:46.635 # +config-update-from sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    3130:X 15 Mar 2019 16:25:46.635 # +switch-master mymaster 192.168.1.161 17000 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:46.636 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:46.636 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:46.636 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:46.636 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:46.636 * +slave slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000
    3130:X 15 Mar 2019 16:25:51.638 # +sdown slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000

    The log shows that after the master went down at 16:25:46 a failover took place, and the new master is 192.168.1.162. Now check the sentinel log on 192.168.1.162:

    [root@mongodb-162 redis]# cat sentinel/sentinel.log 
    3103:X 15 Mar 2019 16:18:48.245 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    3103:X 15 Mar 2019 16:18:48.245 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=3103, just started
    3103:X 15 Mar 2019 16:18:48.245 # Configuration loaded
    3104:X 15 Mar 2019 16:18:48.250 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    3104:X 15 Mar 2019 16:18:48.251 * Running mode=sentinel, port=16000.
    3104:X 15 Mar 2019 16:18:48.252 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    3104:X 15 Mar 2019 16:18:48.254 # Sentinel ID is 605bf637a218de833308aa590cb3346de0eaacaf
    3104:X 15 Mar 2019 16:18:48.254 # +monitor master mymaster 192.168.1.161 17000 quorum 1
    3104:X 15 Mar 2019 16:18:48.256 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:48.257 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:48.259 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:48.261 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:48.262 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:48.270 * +sentinel sentinel 93da0f13c96f22e0263e4288c68390cce5e11cea 192.168.1.161 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:52.815 * +sentinel sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:54.355 * +sentinel sentinel 53e37f3442336b7b229fa42c655bdca9eaa6578c 192.168.1.141 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:55.820 * +sentinel sentinel 3607a3f65b224ee03e49a825604b40d1cc1fbeaf 192.168.1.142 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:18:58.341 * +sentinel sentinel d8e333f5daec7809605b8f276b1f2b44c1711459 192.168.1.143 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:25:46.277 # +new-epoch 1
    3104:X 15 Mar 2019 16:25:46.279 # +vote-for-leader 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3104:X 15 Mar 2019 16:25:46.309 # +sdown master mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:25:46.309 # +odown master mymaster 192.168.1.161 17000 #quorum 1/1
    3104:X 15 Mar 2019 16:25:46.309 # Next failover delay: I will not start a failover before Fri Mar 15 16:35:46 2019
    3104:X 15 Mar 2019 16:25:46.612 # +config-update-from sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    3104:X 15 Mar 2019 16:25:46.612 # +switch-master mymaster 192.168.1.161 17000 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:46.614 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:46.614 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:46.614 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:46.614 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:46.614 * +slave slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000
    3104:X 15 Mar 2019 16:25:51.622 # +sdown slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000

    This is almost identical to the log on 161. So how was 162 chosen as the new master? Continue with the log on 163:

    [root@mongodb-163 redis]# cat sentinel/sentinel.log 
    3067:X 15 Mar 2019 16:18:50.731 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    3067:X 15 Mar 2019 16:18:50.731 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=3067, just started
    3067:X 15 Mar 2019 16:18:50.731 # Configuration loaded
    3068:X 15 Mar 2019 16:18:50.747 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    3068:X 15 Mar 2019 16:18:50.749 * Running mode=sentinel, port=16000.
    3068:X 15 Mar 2019 16:18:50.749 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    3068:X 15 Mar 2019 16:18:50.751 # Sentinel ID is 4a32c88aa8bff64295e45f8bb9c25154f4533d2a
    3068:X 15 Mar 2019 16:18:50.751 # +monitor master mymaster 192.168.1.161 17000 quorum 1
    3068:X 15 Mar 2019 16:18:50.754 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:50.756 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:50.757 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:50.759 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:50.760 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:52.329 * +sentinel sentinel 93da0f13c96f22e0263e4288c68390cce5e11cea 192.168.1.161 16000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:52.331 * +sentinel sentinel 605bf637a218de833308aa590cb3346de0eaacaf 192.168.1.162 16000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:54.330 * +sentinel sentinel 53e37f3442336b7b229fa42c655bdca9eaa6578c 192.168.1.141 16000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:55.796 * +sentinel sentinel 3607a3f65b224ee03e49a825604b40d1cc1fbeaf 192.168.1.142 16000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:18:58.316 * +sentinel sentinel d8e333f5daec7809605b8f276b1f2b44c1711459 192.168.1.143 16000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.247 # +sdown master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.248 # +odown master mymaster 192.168.1.161 17000 #quorum 1/1
    3068:X 15 Mar 2019 16:25:46.248 # +new-epoch 1
    3068:X 15 Mar 2019 16:25:46.248 # +try-failover master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.250 # +vote-for-leader 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.253 # 53e37f3442336b7b229fa42c655bdca9eaa6578c voted for 53e37f3442336b7b229fa42c655bdca9eaa6578c 1
    3068:X 15 Mar 2019 16:25:46.254 # 93da0f13c96f22e0263e4288c68390cce5e11cea voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # 605bf637a218de833308aa590cb3346de0eaacaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # 3607a3f65b224ee03e49a825604b40d1cc1fbeaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # d8e333f5daec7809605b8f276b1f2b44c1711459 voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.316 # +elected-leader master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.316 # +failover-state-select-slave master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.372 # +selected-slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.372 * +failover-state-send-slaveof-noone slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.424 * +failover-state-wait-promotion slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.528 # +promoted-slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.528 # +failover-state-reconf-slaves master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.584 * +slave-reconf-sent slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.584 * +slave-reconf-sent slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-inprog slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-done slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-inprog slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-done slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.602 * +slave-reconf-sent slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.602 * +slave-reconf-sent slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-inprog slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-done slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-inprog slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:49.661 * +slave-reconf-done slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:49.732 # +failover-end master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:49.732 # +switch-master mymaster 192.168.1.161 17000 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:54.743 # +sdown slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000

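    At 16:25:46.247 this sentinel marked 161 as subjectively down (sdown: this sentinel itself considers it unreachable), immediately followed by objectively down (odown: enough sentinels agree that it is unreachable). The reason odown follows sdown right away is this configuration line: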

    sentinel monitor mymaster 192.168.1.161 17000 1

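    The trailing 1 means that as soon as 1 or more sentinels report sdown, the master can be marked odown. That is why the log contains the line +odown master mymaster 192.168.1.161 17000 #quorum 1/1, and why odown comes right after sdown.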
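
    Note that the quorum only decides when the master is marked odown. According to the Redis Sentinel documentation, actually carrying out the failover additionally requires the leader to be authorized by a majority of the known sentinels (or by quorum many, if the quorum is larger than the majority). The Python sketch below just restates that rule as arithmetic; the function names are made up for this example.

```python
def can_mark_odown(sdown_reports, quorum):
    """odown is reached once at least `quorum` sentinels report sdown."""
    return sdown_reports >= quorum


def votes_needed_for_failover(total_sentinels, quorum):
    """Votes a sentinel needs to become failover leader.

    It needs a majority of all known sentinels, and at least `quorum`
    votes when the quorum is set higher than the majority.
    """
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)
```

    For this setup, can_mark_odown(1, 1) is true, and votes_needed_for_failover(6, 1) is 4: a single sentinel can declare odown, but the leader still needs 4 of the 6 sentinels to vote for it.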

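    In the log of the sentinel on 163, the line Sentinel ID is 4a32c88aa8bff64295e45f8bb9c25154f4533d2a gives the ID of that sentinel; it is needed below.

    Once the sentinels have marked 161 as odown (objectively down), they hold a vote. Note that the election is not among the redis-server instances but among the sentinels: a leader sentinel is elected first, and it then decides which slave to promote to master.

    Check the sentinel log on server 141: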

    [root@redis redis]# cat sentinel/sentinel.log 
    2747:X 15 Mar 2019 16:18:52.042 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    2747:X 15 Mar 2019 16:18:52.042 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=2747, just started
    2747:X 15 Mar 2019 16:18:52.042 # Configuration loaded
    2748:X 15 Mar 2019 16:18:52.059 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    2748:X 15 Mar 2019 16:18:52.061 * Running mode=sentinel, port=16000.
    2748:X 15 Mar 2019 16:18:52.061 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    2748:X 15 Mar 2019 16:18:52.071 # Sentinel ID is 53e37f3442336b7b229fa42c655bdca9eaa6578c
    2748:X 15 Mar 2019 16:18:52.072 # +monitor master mymaster 192.168.1.161 17000 quorum 1
    2748:X 15 Mar 2019 16:18:52.073 * +slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.075 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.076 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.078 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.079 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:52.515 * +sentinel sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:54.057 * +sentinel sentinel 93da0f13c96f22e0263e4288c68390cce5e11cea 192.168.1.161 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:54.078 * +sentinel sentinel 605bf637a218de833308aa590cb3346de0eaacaf 192.168.1.162 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:55.520 * +sentinel sentinel 3607a3f65b224ee03e49a825604b40d1cc1fbeaf 192.168.1.142 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:18:58.040 * +sentinel sentinel d8e333f5daec7809605b8f276b1f2b44c1711459 192.168.1.143 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:25:45.962 # +sdown master mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:25:45.963 # +odown master mymaster 192.168.1.161 17000 #quorum 1/1
    2748:X 15 Mar 2019 16:25:45.963 # +new-epoch 1
    2748:X 15 Mar 2019 16:25:45.963 # +try-failover master mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:25:45.966 # +vote-for-leader 53e37f3442336b7b229fa42c655bdca9eaa6578c 1
    2748:X 15 Mar 2019 16:25:45.966 # 4a32c88aa8bff64295e45f8bb9c25154f4533d2a voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    2748:X 15 Mar 2019 16:25:45.967 # 93da0f13c96f22e0263e4288c68390cce5e11cea voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    2748:X 15 Mar 2019 16:25:45.968 # 605bf637a218de833308aa590cb3346de0eaacaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    2748:X 15 Mar 2019 16:25:45.968 # 3607a3f65b224ee03e49a825604b40d1cc1fbeaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    2748:X 15 Mar 2019 16:25:45.968 # d8e333f5daec7809605b8f276b1f2b44c1711459 voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    2748:X 15 Mar 2019 16:25:46.301 # +config-update-from sentinel 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 192.168.1.163 16000 @ mymaster 192.168.1.161 17000
    2748:X 15 Mar 2019 16:25:46.301 # +switch-master mymaster 192.168.1.161 17000 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:46.302 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:46.302 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:46.303 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:46.303 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:46.303 * +slave slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000
    2748:X 15 Mar 2019 16:25:51.320 # +sdown slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000

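    The 141 log shows that after the election the sentinel on 141 did not carry out the failover itself; it only received the new configuration from 163 (+config-update-from). This confirms that what gets elected is a sentinel.

    Log walkthrough:

    This part of the 163 log is the election: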

    3068:X 15 Mar 2019 16:25:46.248 # +new-epoch 1
    3068:X 15 Mar 2019 16:25:46.248 # +try-failover master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.250 # +vote-for-leader 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.253 # 53e37f3442336b7b229fa42c655bdca9eaa6578c voted for 53e37f3442336b7b229fa42c655bdca9eaa6578c 1
    3068:X 15 Mar 2019 16:25:46.254 # 93da0f13c96f22e0263e4288c68390cce5e11cea voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # 605bf637a218de833308aa590cb3346de0eaacaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # 3607a3f65b224ee03e49a825604b40d1cc1fbeaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
    3068:X 15 Mar 2019 16:25:46.255 # d8e333f5daec7809605b8f276b1f2b44c1711459 voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
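
    Counting votes by eye is error-prone with 40-character IDs, so here is a small Python sketch that tallies them from the log text. It assumes the two line formats shown above (+vote-for-leader for the local sentinel's own vote, "X voted for Y" for votes reported by peers); tally_votes is a made-up name.

```python
import re
from collections import Counter

# The local sentinel's own vote
OWN_VOTE = re.compile(r"# \+vote-for-leader ([0-9a-f]+) \d+")
# A vote reported by another sentinel
PEER_VOTE = re.compile(r"# [0-9a-f]+ voted for ([0-9a-f]+) \d+")


def tally_votes(log_text):
    """Count votes per candidate sentinel ID."""
    votes = Counter()
    for line in log_text.splitlines():
        m = OWN_VOTE.search(line) or PEER_VOTE.search(line)
        if m:
            votes[m.group(1)] += 1
    return votes


# The six vote lines from the 163 log above
SAMPLE = """3068:X 15 Mar 2019 16:25:46.250 # +vote-for-leader 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
3068:X 15 Mar 2019 16:25:46.253 # 53e37f3442336b7b229fa42c655bdca9eaa6578c voted for 53e37f3442336b7b229fa42c655bdca9eaa6578c 1
3068:X 15 Mar 2019 16:25:46.254 # 93da0f13c96f22e0263e4288c68390cce5e11cea voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
3068:X 15 Mar 2019 16:25:46.255 # 605bf637a218de833308aa590cb3346de0eaacaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
3068:X 15 Mar 2019 16:25:46.255 # 3607a3f65b224ee03e49a825604b40d1cc1fbeaf voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
3068:X 15 Mar 2019 16:25:46.255 # d8e333f5daec7809605b8f276b1f2b44c1711459 voted for 4a32c88aa8bff64295e45f8bb9c25154f4533d2a 1
"""

votes = tally_votes(SAMPLE)
```

    For these six lines the tally is 5 votes for 4a32c88a... (its own +vote-for-leader line plus four "voted for" lines) and 1 vote for 53e37f34..., which voted for itself.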

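    The sentinel with ID 4a32c88aa8bff64295e45f8bb9c25154f4533d2a (the one on 163) received 5 of the 6 votes: its own plus four from other sentinels, while the sentinel on 141 voted for itself. As the elected leader, it decides which slave is promoted to master: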

    3068:X 15 Mar 2019 16:25:46.316 # +elected-leader master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.316 # +failover-state-select-slave master mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.372 # +selected-slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.372 * +failover-state-send-slaveof-noone slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.424 * +failover-state-wait-promotion slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.528 # +promoted-slave slave 192.168.1.162:17000 192.168.1.162 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.528 # +failover-state-reconf-slaves master mymaster 192.168.1.161 17000

    In the end the sentinel on 163 selected 162 as the new master and carried out the failover

    The remaining servers then had their configuration updated:

    3068:X 15 Mar 2019 16:25:46.584 * +slave-reconf-sent slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:46.584 * +slave-reconf-sent slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-inprog slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-done slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-inprog slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-done slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.602 * +slave-reconf-sent slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:47.602 * +slave-reconf-sent slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-inprog slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-done slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:48.588 * +slave-reconf-inprog slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
    3068:X 15 Mar 2019 16:25:49.661 * +slave-reconf-done slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.161 17000
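
    The timestamps on the slave-reconf-sent and slave-reconf-done lines show how long each slave took to switch over. The Python sketch below computes that per slave; it relies on the log format shown above and on all events falling on the same day, and reconf_durations is a hypothetical helper.

```python
import re
from datetime import datetime

EVENT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) \* \+slave-reconf-(sent|done) slave (\S+)"
)


def reconf_durations(log_text):
    """Seconds between slave-reconf-sent and slave-reconf-done, per slave."""
    sent, durations = {}, {}
    for line in log_text.splitlines():
        m = EVENT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        event, slave = m.group(2), m.group(3)
        if event == "sent":
            sent[slave] = ts
        elif slave in sent:
            durations[slave] = (ts - sent[slave]).total_seconds()
    return durations


# The three lines for 192.168.1.163 from the log above
SAMPLE = """3068:X 15 Mar 2019 16:25:46.584 * +slave-reconf-sent slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-inprog slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
3068:X 15 Mar 2019 16:25:47.546 * +slave-reconf-done slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.161 17000
"""

durations = reconf_durations(SAMPLE)
```

    In the full log the slaves are handled two at a time (163 and 142 first, then 143 and 141), which matches sentinel parallel-syncs mymaster 2.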

    Finally, the new master-slave group is formed:

    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.163:17000 192.168.1.163 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.142:17000 192.168.1.142 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.143:17000 192.168.1.143 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.141:17000 192.168.1.141 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:49.733 * +slave slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000
    3068:X 15 Mar 2019 16:25:54.743 # +sdown slave 192.168.1.161:17000 192.168.1.161 17000 @ mymaster 192.168.1.162 17000

    Check the replication info on 162:

    [root@mongodb-162 redis]# src/redis-cli -p 17000
    127.0.0.1:17000> AUTH 123456
    OK
    127.0.0.1:17000> info replication
    # Replication
    role:master
    connected_slaves:4
    slave0:ip=192.168.1.163,port=17000,state=online,offset=947470,lag=1
    slave1:ip=192.168.1.142,port=17000,state=online,offset=947470,lag=1
    slave2:ip=192.168.1.141,port=17000,state=online,offset=947754,lag=1
    slave3:ip=192.168.1.143,port=17000,state=online,offset=947470,lag=1
    master_replid:b6c12c4d48aa953cb473480198359da669f171cc
    master_replid2:b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4
    master_repl_offset:948180
    second_repl_offset:172534
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:948180
    127.0.0.1:17000> 

    Then check the replication info on 163:

    [root@mongodb-163 redis]# src/redis-cli -p 17000
    127.0.0.1:17000> AUTH 123456
    OK
    127.0.0.1:17000> info replication
    # Replication
    role:slave
    master_host:192.168.1.162
    master_port:17000
    master_link_status:up
    master_last_io_seconds_ago:1
    master_sync_in_progress:0
    slave_repl_offset:955166
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:b6c12c4d48aa953cb473480198359da669f171cc
    master_replid2:b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4
    master_repl_offset:955166
    second_repl_offset:172534
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:253
    repl_backlog_histlen:954914
    127.0.0.1:17000> 

    In conclusion, after the master went down, the master-slave switchover completed automatically.
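
    One detail in the output above is worth a second look: master_replid2 on the new master (and on 163) equals the master_replid that the old master 161 reported earlier. Since Redis 4.0 a promoted slave keeps the previous replication ID in this field so that the other slaves can continue with a partial resync. A minimal Python check, with values copied from the outputs in this article and a made-up function name:

```python
def kept_old_history(old_master_replid, info):
    """True if `info` still references the old master's replication ID.

    `info` is a dict of INFO replication fields (strings).
    second_repl_offset is -1 when there is no secondary ID.
    """
    return (
        info.get("master_replid2") == old_master_replid
        and int(info.get("second_repl_offset", "-1")) != -1
    )


OLD_MASTER_REPLID = "b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4"

# Fields copied from the info replication output on 162 after failover
NEW_MASTER_INFO = {
    "role": "master",
    "master_replid": "b6c12c4d48aa953cb473480198359da669f171cc",
    "master_replid2": "b3f15ec093bb8ab2b1e908b41740f4bdabbfeba4",
    "second_repl_offset": "172534",
}

ok = kept_old_history(OLD_MASTER_REPLID, NEW_MASTER_INFO)
```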

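    VI. Pitfalls encountered

    1. A slave could not connect to the master because the firewall was on. The fix is to turn the firewall off, or to open the relevant ports in it

    2. After the master went down, no new master was elected automatically, and the log showed the following error: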

    2230:X 15 Mar 2019 09:11:16.006 # -failover-abort-not-elected master mymaster 192.168.1.161 17000
    2230:X 15 Mar 2019 09:11:16.007 # Next failover delay: I will not start a failover before Fri Mar 15 09:17:05 2019

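    The cause: protected mode was originally on (protected-mode yes) and the bind parameter was not set correctly. Strictly speaking, redis_master.conf, redis_slave.conf and sentinel.conf should each bind the correct network interface IP. The configuration files were prepared once and uploaded to the servers, and I did not want to edit them on every server, so I turned protected mode off instead: protected-mode no.

    Protected mode is on by default in the stock configuration, so watch out for this. The documentation also suggests that it can be turned off if an access password is set.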
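
    Both pitfalls come down to one instance not being able to reach another on ports 17000 or 16000. A quick way to narrow this down is to test TCP reachability from each server. The Python sketch below uses only the standard library; port_open is a made-up name. It only tells you whether a TCP connection can be opened, so it detects firewall and bind problems but cannot detect a protected-mode rejection, which happens after the connection is accepted.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

    For example, running port_open("192.168.1.161", 17000) and port_open("192.168.1.161", 16000) on each of the other servers shows whether the master and its sentinel are reachable from there.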

  • Original article: https://www.cnblogs.com/chensuqian/p/10538365.html