  • Redis Cluster: Deployment, Management, and Testing

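    Background:

          Redis 3.0 and later support Cluster, which greatly improves Redis's ability to scale horizontally. Redis Cluster is the official Redis clustering solution. Third-party solutions such as Twemproxy and Codis existed before it; unlike them, Redis Cluster does not reach the cluster nodes through a proxy but builds the cluster without any central node. Before Cluster appeared, only Sentinel provided high availability for Redis.

          Redis Cluster shares data across multiple nodes and keeps serving requests even when some nodes fail or cannot communicate. If every master has a slave, then when a master goes offline or cannot reach the majority of the cluster, its slave is promoted to master and takes over, so the cluster keeps running. Redis Cluster shards data by hash slot: every key belongs to exactly one of 16384 hash slots (0~16383), and each node serves a subset of the slots.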
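    The key-to-slot mapping described above can be sketched in a few lines. Redis Cluster hashes a key with CRC16 and takes the result modulo 16384; if the key contains a non-empty {hash tag}, only the tag is hashed. The sketch below assumes Python's binascii.crc_hqx (CRC-CCITT with initial value 0) as the CRC16 variant, and the function names are illustrative, not part of any Redis client.

```python
import binascii

SLOT_COUNT = 16384  # slots are numbered 0..16383


def hashed_part(key: bytes) -> bytes:
    """Return the bytes that get hashed, honoring a {hash tag} if present."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; for "{}" the whole key is hashed.
        if end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """CRC16 of the key (or of its hash tag), modulo the number of slots."""
    return binascii.crc_hqx(hashed_part(key.encode("utf-8")), 0) % SLOT_COUNT


if __name__ == "__main__":
    for name in ("foo", "hello", "{user1000}.following", "user1000"):
        print(name, key_slot(name))
```

    Keys that share a hash tag, such as {user1000}.following and {user1000}.followers, land in the same slot and therefore on the same node.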

    Environment:

    Ubuntu 14.04
    Redis 3.2.8
    Masters: 192.168.100.134/135/136:17021
    Slaves: 192.168.100.134/135/136:17022

    Master-to-slave mapping:

    Master      Slave
    134:17021   135:17022
    135:17021   136:17022
    136:17021   134:17022

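    Manual deployment:

    ①: Installation
    Install Redis as described in the article on Redis Sentinel high-availability installation and deployment. Only the Cluster-related configuration parameters need to be changed.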
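    The author's configuration listing is not reproduced here. As a rough sketch only, the cluster-related directives in redis.conf usually look like the lines printed by the helper below. The directive names are standard Redis settings; the per-port file name and the 15000 ms timeout are assumptions, not the author's values.

```python
from typing import List


def cluster_config_lines(port: int, node_timeout_ms: int = 15000) -> List[str]:
    """Render the cluster-related redis.conf lines for one instance (assumed values)."""
    return [
        f"port {port}",
        "cluster-enabled yes",                     # run the instance in cluster mode
        f"cluster-config-file nodes-{port}.conf",  # node state file written by Redis itself
        f"cluster-node-timeout {node_timeout_ms}", # milliseconds before a node is considered failing
    ]


if __name__ == "__main__":
    for port in (17021, 17022):
        print("\n".join(cluster_config_lines(port)))
        print()
```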

    After installation, start Redis. Every instance runs in cluster mode:

    root@redis-cluster1:~# ps -ef | grep redis
    redis      4292      1  0 00:33 ?        00:00:03 /usr/local/bin/redis-server 192.168.100.134:17021 [cluster]
    redis      4327      1  0 01:58 ?        00:00:00 /usr/local/bin/redis-server 192.168.100.134:17022 [cluster]

    ②: Configure the master nodes

    Add nodes: cluster meet ip port

    Connect to any instance on port 17021; redis-cli needs the -c option to run in cluster mode:
    ~# redis-cli -h 192.168.100.134 -p 17021 -c
    192.168.100.134:17021> cluster meet 192.168.100.135 17021
    OK
    192.168.100.134:17021> cluster meet 192.168.100.136 17021
    OK
    The nodes were added successfully.

    Check the cluster state: cluster info

    192.168.100.134:17021> cluster info
    cluster_state:fail                        # cluster state
    cluster_slots_assigned:0                  # number of assigned slots
    cluster_slots_ok:0                        # number of slots in the OK state
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:3                     # 3 nodes currently known
    cluster_size:0
    cluster_current_epoch:2                  
    cluster_my_epoch:1
    cluster_stats_messages_sent:83
    cluster_stats_messages_received:83

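    The cluster state above is fail because no slots have been assigned; all 16384 slots must be assigned before the cluster becomes usable. Next, log in to each node and assign its slots, for example:
    node1: 0~5461
    node2: 5462~10922
    node3: 10923~16383

    Assign slots: cluster addslots <slot>. A slot can be assigned to only one node, all 16384 slots must be assigned, and the assignments on different nodes must not conflict.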

    192.168.100.134:17021> cluster addslots 0
    OK
    192.168.100.135:17021> cluster addslots 0   # conflict
    (error) ERR Slot 0 is already busy

    CLUSTER ADDSLOTS does not yet accept a slot range, so assigning all 16384 slots takes a batch script (addslots.sh):

    node1:
    #!/bin/bash
    n=0
    for ((i=n;i<=5461;i++))
    do
       /usr/local/bin/redis-cli -h 192.168.100.134 -p 17021 -a dxy CLUSTER ADDSLOTS $i
    done
    
    node2:
    #!/bin/bash
    n=5462
    for ((i=n;i<=10922;i++))
    do
       /usr/local/bin/redis-cli -h 192.168.100.135 -p 17021 -a dxy CLUSTER ADDSLOTS $i
    done
    
    node3:
    #!/bin/bash
    n=10923
    for ((i=n;i<=16383;i++))
    do
       /usr/local/bin/redis-cli -h 192.168.100.136 -p 17021 -a dxy CLUSTER ADDSLOTS $i
    done
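
    The three ranges used above can also be computed rather than typed by hand. The sketch below splits the 16384 slots into contiguous ranges, one per master, and groups slot numbers into CLUSTER ADDSLOTS argument lists, since the command accepts several slots per call. The function names and the batch size of 1000 are illustrative assumptions; sending the commands to Redis is left out.

```python
from typing import List, Tuple

SLOT_COUNT = 16384


def split_slots(master_count: int) -> List[Tuple[int, int]]:
    """Split slots 0..16383 into contiguous (first, last) ranges, one per master."""
    base, extra = divmod(SLOT_COUNT, master_count)
    ranges = []
    first = 0
    for index in range(master_count):
        # The first `extra` masters take one additional slot each.
        size = base + (1 if index < extra else 0)
        ranges.append((first, first + size - 1))
        first += size
    return ranges


def addslots_commands(first: int, last: int, batch: int = 1000) -> List[List[str]]:
    """Build CLUSTER ADDSLOTS argument lists, each carrying up to `batch` slots."""
    commands = []
    for start in range(first, last + 1, batch):
        stop = min(start + batch, last + 1)
        commands.append(["CLUSTER", "ADDSLOTS"] + [str(s) for s in range(start, stop)])
    return commands


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(first, last, len(addslots_commands(first, last)), "commands")
```

    Batching this way replaces thousands of redis-cli invocations per node with a handful of calls.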

    Run bash addslots.sh against each of the 3 nodes. Once all slots are assigned, check the cluster state again:

    192.168.100.134:17021> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:3
    cluster_size:3
    cluster_current_epoch:2
    cluster_my_epoch:1
    cluster_stats_messages_sent:4193
    cluster_stats_messages_received:4193

    The cluster is now up. Remove one slot to see how the cluster reacts: cluster delslots <slot>

    192.168.100.134:17021> cluster delslots 0
    OK
    192.168.100.134:17021> cluster info
    cluster_state:fail
    cluster_slots_assigned:16383
    cluster_slots_ok:16383
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:3
    cluster_size:3
    cluster_current_epoch:2
    cluster_my_epoch:1
    cluster_stats_messages_sent:4482
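    cluster_stats_messages_received:4482

    This shows that the cluster does not come up unless all 16384 slots are assigned. At this point a simple Redis Cluster is built, but every node is a single point of failure: if one node becomes unavailable, the whole cluster becomes unavailable. To keep each node highly available, add a slave for each master.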
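
    A small check along these lines can be scripted against the cluster info output. The sketch below parses the text into a dictionary and reports how many slots are still unassigned; the function names are illustrative, and the sample text is taken from the output shown above.

```python
from typing import Dict

SLOT_COUNT = 16384


def parse_cluster_info(text: str) -> Dict[str, str]:
    """Parse `cluster info` output into a name -> value dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing annotations
        if ":" in line:
            name, value = line.split(":", 1)
            info[name] = value.strip()
    return info


def unassigned_slots(info: Dict[str, str]) -> int:
    """Number of slots that no node has claimed yet."""
    return SLOT_COUNT - int(info["cluster_slots_assigned"])


if __name__ == "__main__":
    sample = "cluster_state:fail\ncluster_slots_assigned:16383\ncluster_slots_ok:16383"
    info = parse_cluster_info(sample)
    print(info["cluster_state"], unassigned_slots(info))
```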

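    Add slave nodes (cluster replication): replication works the same way as in standalone Redis. The difference is that a slave in a cluster must also run in cluster mode and must be added to the cluster before replication is set up.

    ①: Add the slave nodes to the cluster

    192.168.100.134:17021> cluster meet 192.168.100.134 17022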
    OK
    192.168.100.134:17021> cluster meet 192.168.100.135 17022
    OK
    192.168.100.134:17021> cluster meet 192.168.100.136 17022
    OK
    192.168.100.134:17021> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6     # all nodes in the cluster, masters and slaves
    cluster_size:3            # nodes that have slots assigned, i.e. masters
    cluster_current_epoch:5
    cluster_my_epoch:1
    cluster_stats_messages_sent:13438
    cluster_stats_messages_received:13438

    ②: Create the slaves with cluster replicate node_id. Get the node_id from cluster nodes, and run the command on the Redis instance (17022) that is to become the slave.

    192.168.100.134:17022> cluster nodes # show node information
    7438368ca8f8a27fdf2da52940bb50098a78c6fc 192.168.100.136:17022 master - 0 1488255023528 5 connected
    e1b78bb74970d0353832b2913e9b35eba74a2a1a 192.168.100.134:17022 myself,master - 0 0 0 connected
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488255022526 2 connected 10923-16383
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488255026533 3 connected 5462-10922
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 master - 0 1488255025531 1 connected 0-5461
    2b8b518324de0990ca587b47f6316e5f07b1df59 192.168.100.135:17022 master - 0 1488255024530 4 connected

    # become a slave of 135:17021
    192.168.100.134:17022> cluster replicate b461a30fde28409c38ee6c32db1cd267a6cfd125
    OK

    Do the same for the other 2 nodes:

    # become a slave of 136:17021
    192.168.100.135:17022> cluster replicate 05e72d06edec6a920dd91b050c7a315937fddb66
    OK
    # become a slave of 134:17021
    192.168.100.136:17022> cluster replicate 11f9169577352c33d85ad0d1ca5f5bf0deba3209
    OK

    Check the node state: cluster nodes

    2b8b518324de0990ca587b47f6316e5f07b1df59 192.168.100.135:17022 slave 05e72d06edec6a920dd91b050c7a315937fddb66 0 1488255859347 4 connected
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 1 connected 0-5461
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488255860348 2 connected 10923-16383
    e1b78bb74970d0353832b2913e9b35eba74a2a1a 192.168.100.134:17022 slave b461a30fde28409c38ee6c32db1cd267a6cfd125 0 1488255858344 3 connected
    7438368ca8f8a27fdf2da52940bb50098a78c6fc 192.168.100.136:17022 slave 11f9169577352c33d85ad0d1ca5f5bf0deba3209 0 1488255856341 5 connected
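    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488255857343 3 connected 5462-10922

    Each slave line carries the node_id of its master, which identifies the master it follows. If any of the steps above fail, check the logs under /var/log/redis/. Sharding and high availability for the Redis Cluster are now deployed; the next part covers the cluster management commands.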
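
    Matching slaves to masters by eye gets tedious with long node ids. The sketch below parses cluster nodes output, assuming the field order shown above (id, address, flags, master id, ping, pong, epoch, link state, then slots), and returns a slave-address to master-address mapping. The helper names are illustrative.

```python
from typing import Dict, List, Optional


def parse_nodes(text: str) -> List[dict]:
    """Parse `cluster nodes` output into one dictionary per node."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines such as "..."
        master_id: Optional[str] = None if fields[3] == "-" else fields[3]
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "master_id": master_id,
            "slots": fields[8:],
        })
    return nodes


def slave_to_master(nodes: List[dict]) -> Dict[str, str]:
    """Map each slave address to the address of the master it replicates."""
    addr_by_id = {node["id"]: node["addr"] for node in nodes}
    return {
        node["addr"]: addr_by_id[node["master_id"]]
        for node in nodes
        if "slave" in node["flags"] and node["master_id"] in addr_by_id
    }
```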

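    Management: cluster xxx

    Some of the Cluster commands were introduced above; all of them are described below.

    CLUSTER info: prints information about the cluster.
    CLUSTER nodes: lists information about all nodes currently known to the cluster.
    CLUSTER meet <ip> <port>: adds the node at ip and port to the cluster.
    CLUSTER addslots <slot> [slot ...]: assigns one or more slots to the current node.
    CLUSTER delslots <slot> [slot ...]: removes the assignment of one or more slots from the current node.
    CLUSTER slots: lists the slot ranges and the nodes serving them.
    CLUSTER slaves <node_id>: lists the slaves of the given node.
    CLUSTER replicate <node_id>: makes the current node a slave of the given node.
    CLUSTER saveconfig: saves the cluster configuration file manually; by default the cluster saves it automatically whenever the configuration changes.
    CLUSTER keyslot <key>: shows which slot the key maps to.
    CLUSTER flushslots: removes all slots assigned to the current node, leaving it with no slots.
    CLUSTER countkeysinslot <slot>: returns the number of keys currently in the slot.
    CLUSTER getkeysinslot <slot> <count>: returns up to count keys from the slot.
    
    CLUSTER setslot <slot> node <node_id>: assigns the slot to the given node; if the slot is already assigned to another node, that node drops the slot first and the assignment is made afterwards.
    CLUSTER setslot <slot> migrating <node_id>: marks the slot on this node as migrating to the given node.
    CLUSTER setslot <slot> importing <node_id>: marks the slot as being imported into this node from the node given by node_id.
    CLUSTER setslot <slot> stable: cancels the import or migration of the slot.
    
    CLUSTER failover: performs a manual failover.
    CLUSTER forget <node_id>: removes the given node from the cluster and blocks the handshake with it; the block expires after 60s, after which the two nodes complete the handshake again.
    CLUSTER reset [HARD|SOFT]: resets the cluster information. soft clears the information about other nodes but keeps the node's own id; hard also changes the node's own id. Without an argument, soft is used.
    
    CLUSTER count-failure-reports <node_id>: returns the number of failure reports for the given node.
    CLUSTER SET-CONFIG-EPOCH: sets the node's epoch; this is only possible before the node joins the cluster.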

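    To demonstrate the commands above, first load some data into the new cluster with a script.

    The management commands not yet demonstrated are shown below:

    ①: cluster slots: lists the slot ranges and the nodes serving them

    192.168.100.134:17021> cluster slots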
    1) 1) (integer) 0
       2) (integer) 5461
       3) 1) "192.168.100.134"
          2) (integer) 17021
          3) "11f9169577352c33d85ad0d1ca5f5bf0deba3209"
       4) 1) "192.168.100.136"
          2) (integer) 17022
          3) "7438368ca8f8a27fdf2da52940bb50098a78c6fc"
    2) 1) (integer) 10923
       2) (integer) 16383
       3) 1) "192.168.100.136"
          2) (integer) 17021
          3) "05e72d06edec6a920dd91b050c7a315937fddb66"
       4) 1) "192.168.100.135"
          2) (integer) 17022
          3) "2b8b518324de0990ca587b47f6316e5f07b1df59"
    3) 1) (integer) 5462
       2) (integer) 10922
       3) 1) "192.168.100.135"
          2) (integer) 17021
          3) "b461a30fde28409c38ee6c32db1cd267a6cfd125"
       4) 1) "192.168.100.134"
          2) (integer) 17022
          3) "e1b78bb74970d0353832b2913e9b35eba74a2a1a"

    ②: cluster slaves: lists the slaves of the given node

    192.168.100.134:17021> cluster slaves 11f9169577352c33d85ad0d1ca5f5bf0deba3209
    1) "7438368ca8f8a27fdf2da52940bb50098a78c6fc 192.168.100.136:17022 slave 11f9169577352c33d85ad0d1ca5f5bf0deba3209 0 1488274385311 5 connected"

    ③: cluster keyslot: shows which slot a key maps to

    192.168.100.134:17021> cluster keyslot 9223372036854742675
    (integer) 10310

    ④: cluster countkeysinslot: returns the number of keys in the given slot

    192.168.100.134:17021> cluster countkeysinslot 1
    (integer) 19

    ⑤: cluster getkeysinslot: returns up to the given number of keys from the given slot

    192.168.100.134:17021> cluster getkeysinslot 1 3
    1) "9223372036854493093"
    2) "9223372036854511387"
    3) "9223372036854522344"

    ⑥: cluster setslot ...: manually migrate slot 0 from 192.168.100.134:17021 to 192.168.100.135:17021

    1: First check the slots on each node
    192.168.100.134:17021> cluster nodes
    2b8b518324de0990ca587b47f6316e5f07b1df59 192.168.100.135:17022 slave 05e72d06edec6a920dd91b050c7a315937fddb66 0 1488295105089 4 connected
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 7 connected 0-5461
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488295107092 2 connected 10923-16383
    e1b78bb74970d0353832b2913e9b35eba74a2a1a 192.168.100.134:17022 slave b461a30fde28409c38ee6c32db1cd267a6cfd125 0 1488295106090 6 connected
    7438368ca8f8a27fdf2da52940bb50098a78c6fc 192.168.100.136:17022 slave 11f9169577352c33d85ad0d1ca5f5bf0deba3209 0 1488295104086 7 connected
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488295094073 6 connected 5462-10922
    
    2: List the keys in the slot to be migrated
    192.168.100.134:17021> cluster getkeysinslot 0 100
    1) "9223372012094975807"
    2) "9223372031034975807"
    
    3: On the target node, mark the slot as importing
    192.168.100.135:17021> cluster setslot 0 importing 11f9169577352c33d85ad0d1ca5f5bf0deba3209
    OK
    192.168.100.135:17021> cluster nodes
    ...
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 myself,master - 0 0 6 connected 5462-10922 [0-<-11f9169577352c33d85ad0d1ca5f5bf0deba3209]
    ...
    
    4: On the source node, mark the slot as migrating
    192.168.100.134:17021> cluster setslot 0 migrating b461a30fde28409c38ee6c32db1cd267a6cfd125
    OK
    192.168.100.134:17021> cluster nodes
    ...
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 7 connected 0-5461 [0->-b461a30fde28409c38ee6c32db1cd267a6cfd125]
    ...
    
    5: On the source node, migrate the keys in the slot to the target node: MIGRATE host port key destination-db timeout [COPY] [REPLACE]
    192.168.100.134:17021> migrate 192.168.100.135 17021 9223372031034975807 0 5000 replace
    OK
    192.168.100.134:17021> migrate 192.168.100.135 17021 9223372012094975807 0 5000 replace
    OK
    192.168.100.134:17021> cluster getkeysinslot 0 100     # continue only after all keys have been migrated
    (empty list or set)
    
    6: Finally, assign the slot to the target node; the command is broadcast to the other cluster nodes to announce that the slot has moved to the target node
    192.168.100.135:17021> cluster setslot 0 node b461a30fde28409c38ee6c32db1cd267a6cfd125
    OK
    192.168.100.134:17021> cluster setslot 0 node b461a30fde28409c38ee6c32db1cd267a6cfd125
    OK
    
    7: Verify that the migration succeeded:
    192.168.100.134:17021> cluster nodes
    ...
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 9 connected 1-5461 # changed
    ...
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488300965322 10 connected 0 5462-10922
    
    Check the slot information:
    192.168.100.134:17021> cluster slots
    1) 1) (integer) 10923
       2) (integer) 16383
       3) 1) "192.168.100.136"
          2) (integer) 17021
          3) "05e72d06edec6a920dd91b050c7a315937fddb66"
    2) 1) (integer) 1
       2) (integer) 5461
       3) 1) "192.168.100.134"
          2) (integer) 17021
          3) "11f9169577352c33d85ad0d1ca5f5bf0deba3209"
    3) 1) (integer) 0
       2) (integer) 0
       3) 1) "192.168.100.135"
          2) (integer) 17021
          3) "b461a30fde28409c38ee6c32db1cd267a6cfd125"
    4) 1) (integer) 5462
       2) (integer) 10922
       3) 1) "192.168.100.135"
          2) (integer) 17021
          3) "b461a30fde28409c38ee6c32db1cd267a6cfd125"
    
    Check that the data was migrated:
    192.168.100.134:17021> cluster getkeysinslot 0 100
    (empty list or set)
    192.168.100.135:17021> cluster getkeysinslot 0 100
    1) "9223372012094975807"
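    2) "9223372031034975807"

    When many slots have to be migrated and the slots hold many keys, write a script that follows the steps above, or use the tool described in the scripted deployment section.

    The slot migration steps are roughly as follows:

    1. On the target node, declare that the slot will be imported from the source node: CLUSTER SETSLOT <slot> IMPORTING <source_node_id>
    2. On the source node, declare that the slot will be migrated to the target node: CLUSTER SETSLOT <slot> MIGRATING <target_node_id>
    3. Fetch keys from the source node in batches: CLUSTER GETKEYSINSLOT <slot> <count>
    4. Migrate the fetched keys to the target node: MIGRATE <target_ip> <target_port> <key_name> 0 <timeout>
    Repeat steps 3 and 4 until all data has been migrated. MIGRATE transfers each specified key to the target with RESTORE key ttl serialized-value REPLACE.
    5. Send CLUSTER SETSLOT <slot> NODE <target_node_id> to both nodes. The command is broadcast to the other cluster nodes and clears the importing and migrating states.
    6. Wait for the cluster state to become OK: cluster_state:ok in CLUSTER INFO.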
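
    The ordering of these steps matters, so it can help to generate the command sequence before running anything. The sketch below only builds the ordered list of (node address, command) pairs for one slot. The Node type, the function name, and the idea of passing the key list in are assumptions for illustration; in practice the keys come from repeated CLUSTER GETKEYSINSLOT calls.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    node_id: str
    host: str
    port: int

    @property
    def addr(self) -> str:
        return f"{self.host}:{self.port}"


def migration_plan(slot: int, source: Node, target: Node,
                   keys: List[str], timeout_ms: int = 5000) -> List[Tuple[str, str]]:
    """Return (node address, command) pairs in the order they should be sent."""
    plan = [
        # Mark the slot as importing on the target first, then as migrating on the source.
        (target.addr, f"CLUSTER SETSLOT {slot} IMPORTING {source.node_id}"),
        (source.addr, f"CLUSTER SETSLOT {slot} MIGRATING {target.node_id}"),
    ]
    for key in keys:
        # Each key is moved from the source to the target, database 0.
        plan.append((source.addr, f"MIGRATE {target.host} {target.port} {key} 0 {timeout_ms}"))
    for node in (target, source):
        # Both sides are told the new owner, which clears the importing/migrating state.
        plan.append((node.addr, f"CLUSTER SETSLOT {slot} NODE {target.node_id}"))
    return plan


if __name__ == "__main__":
    src = Node("11f9169577352c33d85ad0d1ca5f5bf0deba3209", "192.168.100.134", 17021)
    dst = Node("b461a30fde28409c38ee6c32db1cd267a6cfd125", "192.168.100.135", 17021)
    for addr, command in migration_plan(0, src, dst, ["9223372012094975807", "9223372031034975807"]):
        print(addr, ">", command)
```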

    Note: if the nodes require authentication, MIGRATE fails with:

    (error) ERR Target instance replied with error: NOAUTH Authentication required.

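    To carry out the migration anyway, this article commented out masterauth and requirepass on all nodes first and re-enabled authentication once the migration had finished.

    ⑦: cluster forget: removes the given node from the cluster and blocks the handshake with it; the block expires after 60s, after which the two nodes complete the handshake again.

    192.168.100.134:17021> cluster nodes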
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488302330582 2 connected 10923-16383
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 9 connected 1-5461
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488302328576 10 connected 0 5462-10922
    ...
    
    192.168.100.134:17021> cluster forget 05e72d06edec6a920dd91b050c7a315937fddb66
    OK
    192.168.100.134:17021> cluster nodes
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 9 connected 1-5461
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488302376718 10 connected 0 5462-10922
    ...
    
    One minute later:
    192.168.100.134:17021> cluster nodes
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488302490107 2 connected 10923-16383
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 9 connected 1-5461
    b461a30fde28409c38ee6c32db1cd267a6cfd125 192.168.100.135:17021 master - 0 1488302492115 10 connected 0 5462-10922

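    ⑧: cluster failover: performs a manual failover, covered in detail in the failover section. It must be run on a slave of the master that is to be failed over; otherwise it returns an error:

    (error) ERR You should send CLUSTER FAILOVER to a slave

    ⑨: cluster flushslots: must be run on a node that holds no keys. It removes all slots assigned to the current node, leaving it with no slots; all data on that node is lost.

    192.168.100.136:17022> cluster nodes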
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 master - 0 1488255398859 2 connected 10923-16383
    ...
    
    192.168.100.136:17021> cluster flushslots
    OK
    
    192.168.100.136:17021> cluster nodes
    05e72d06edec6a920dd91b050c7a315937fddb66 192.168.100.136:17021 myself,master - 0 0 2 connected
    ...

    ⑩: cluster reset: must be run on a node that holds no keys; resets the cluster information.

    192.168.100.134:17021> cluster reset
    OK
    192.168.100.134:17021> cluster nodes
    11f9169577352c33d85ad0d1ca5f5bf0deba3209 192.168.100.134:17021 myself,master - 0 0 9 connected

    Scripted deployment (redis-trib.rb)

    Redis Cluster ships with a management script that can create a cluster, migrate nodes, add and remove slots, and so on. It lives in the source package and is written in Ruby. The following uses the script to deploy the cluster.

    ①: Create the Redis instances as required: 6 instances (3 masters, 3 slaves).

    ②: Install the required Ruby modules:

    apt-get install ruby
    gem install redis

    ③: The redis-trib.rb script (/usr/local/src/redis-3.2.8/src)

    ./redis-trib.rb help
    Usage: redis-trib <command> <options> <arguments ...>
    
    # create a cluster
    create          host1:port1 ... hostN:portN  
                      --replicas <arg> # number of slaves for each master
    # check the cluster
    check           host:port
    # show cluster information
    info            host:port
    # fix the cluster
    fix             host:port
                      --timeout <arg>
    # migrate slots online
    reshard         host:port       # required; any node, used as the entry point for fetching the whole cluster's information
                      --from <arg>  # source nodes to migrate slots from, given as comma-separated node ids; --from all uses every node in the cluster as a source; if omitted, the script prompts for it
                      --to <arg>    # node id of the destination node; only one may be given; if omitted, the script prompts for it
                      --slots <arg> # number of slots to migrate; if omitted, the script prompts for it
                      --yes         # skip the yes/no confirmation that is otherwise shown with the reshard plan
                      --timeout <arg>  # timeout for the migrate command
                      --pipeline <arg> # number of keys fetched per cluster getkeysinslot call; defaults to 10
    # balance the number of slots across the nodes
    rebalance       host:port
                      --weight <arg>
                      --auto-weights
                      --use-empty-masters
                      --timeout <arg>
                      --simulate
                      --pipeline <arg>
                      --threshold <arg>
    # add a new node to the cluster
    add-node        new_host:new_port existing_host:existing_port
                      --slave
                      --master-id <arg>
    # remove a node from the cluster
    del-node        host:port node_id
    # set the heartbeat timeout between cluster nodes
    set-timeout     host:port milliseconds
    # run a command on every node in the cluster
    call            host:port command arg arg .. arg
    # import data from an external Redis instance into the cluster
    import          host:port
                      --from <arg>
                      --copy
                      --replace
    # help
    help            (show this help)
    
    For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

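    1) Create a cluster with create: 6 nodes, one slave per master. One limitation is that create cannot specify which slave belongs to which master. As a workaround, create the 3 masters first and then attach each slave to a specific master with add-node.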

    ./redis-trib.rb create --replicas 1 192.168.100.134:17021 192.168.100.135:17021 192.168.100.136:17021 192.168.100.134:17022 192.168.100.135:17022 192.168.100.136:17022
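
    With --replicas 1 and 6 addresses, the script ends up with 3 masters and 3 slaves. The sketch below shows that arithmetic. It assumes the tool divides the node count by (replicas + 1) and refuses to create a cluster with fewer than 3 masters; treat both the rule and the function name as assumptions rather than the script's exact logic.

```python
def planned_masters(node_count: int, replicas: int) -> int:
    """Number of masters implied by a node list and a replicas-per-master setting."""
    masters = node_count // (replicas + 1)
    if masters < 3:
        # Assumed rule: a cluster needs at least 3 masters.
        raise ValueError(
            f"{node_count} nodes with {replicas} replica(s) each gives only {masters} master(s)"
        )
    return masters


if __name__ == "__main__":
    print(planned_masters(6, 1))
```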

    2) Check the cluster with check ip:port: verifies that all slots have been assigned

    ./redis-trib.rb check 192.168.100.134:17021

    3) Show cluster information with info ip:port: includes the distribution of slots, slaves, and keys

    ./redis-trib.rb info 192.168.100.134:17021

    4) Balance slots with rebalance ip:port: evens out the number of slots on each node

    ./redis-trib.rb rebalance 192.168.100.134:17021

    5) Remove a node with del-node ip:port <node_id>: only a node with no slots assigned can be removed; once it is removed from the cluster, the instance is shut down

    ./redis-trib.rb del-node 192.168.100.135:17022 77d02fef656265c9c421fef425527c510e4cfcb8

    6) Add a node with add-node: the new node joins the cluster either as a master or as a slave of an existing master.

    Add a master: 134:17022 joins the cluster of 134:17021

    ./redis-trib.rb add-node 192.168.100.134:17022 192.168.100.134:17021

    Add a slave: 135:17022 joins the cluster of 134:17021 as a slave of the master given by <node_id>

    ./redis-trib.rb add-node --slave --master-id 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.135:17022 192.168.100.134:17021

    The resulting cluster:

    192.168.100.134:17021> cluster nodes
    77d02fef656265c9c421fef425527c510e4cfcb8 192.168.100.135:17022 slave 7fa64d250b595d8ac21a42477af5ac8c07c35d83 0 1488346523944 5 connected
    5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022 master - 0 1488346525949 4 connected
    7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021 myself,master - 0 0 1 connected 0-5460
    51bf103f7cf6b5ede6e009ce489fdeec14961be8 192.168.100.135:17021 master - 0 1488346522942 2 connected 5461-10922
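    0191a8b52646fb5c45323ab0c1a1a79dc8f3aea2 192.168.100.136:17021 master - 0 1488346524948 3 connected 10923-16383

    7) Migrate slots online with reshard: moves some slots from the nodes that currently hold them to a new node while the cluster is online, which allows the cluster to be scaled out or in without downtime.

    Interactive mode: reshard the cluster of 134:17021

    ./redis-trib.rb reshard 192.168.100.134:17021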
    >>> Performing Cluster Check (using node 192.168.100.134:17021)
    M: 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021
       slots:0-5460 (5461 slots) master
       1 additional replica(s)
    S: 77d02fef656265c9c421fef425527c510e4cfcb8 192.168.100.135:17022
       slots: (0 slots) slave
       replicates 7fa64d250b595d8ac21a42477af5ac8c07c35d83
    M: 5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022
       slots: (0 slots) master
       0 additional replica(s)
    M: 51bf103f7cf6b5ede6e009ce489fdeec14961be8 192.168.100.135:17021
       slots:5461-10922 (5462 slots) master
       0 additional replica(s)
    M: 0191a8b52646fb5c45323ab0c1a1a79dc8f3aea2 192.168.100.136:17021
       slots:10923-16383 (5461 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    # how many slots should be moved?
    How many slots do you want to move (from 1 to 16384)? 1 
    # which node_id receives them?
    What is the receiving node ID? 5476787f31fa375fda6bb32676a969c8b8adfbc2
    # which node_ids are the sources?
    Please enter all the source node IDs.
    # enter all to use every node in the cluster
      Type 'all' to use all the nodes as source nodes for the hash slots.
    # enter the source node ids, then enter done to start the migration
      Type 'done' once you entered all the source nodes IDs.
    Source node #1:7fa64d250b595d8ac21a42477af5ac8c07c35d83
    Source node #2:done
    
    Ready to move 1 slots.
      Source nodes:
        M: 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021
       slots:0-5460 (5461 slots) master
       1 additional replica(s)
      Destination node:
        M: 5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022
       slots: (0 slots) master
       0 additional replica(s)
      Resharding plan:
        Moving slot 0 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
    # proceed with the migration plan?
    Do you want to proceed with the proposed reshard plan (yes/no)? yes 
    Moving slot 0 from 192.168.100.134:17021 to 192.168.100.134:17022: ..........

    With arguments: move 10 slots from the node given by --from to the node given by --to

    ./redis-trib.rb reshard --from 7fa64d250b595d8ac21a42477af5ac8c07c35d83 --to 5476787f31fa375fda6bb32676a969c8b8adfbc2 --slots 10 192.168.100.134:17021
    >>> Performing Cluster Check (using node 192.168.100.134:17021)
    M: 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021
       slots:2-5460 (5459 slots) master
       1 additional replica(s)
    S: 77d02fef656265c9c421fef425527c510e4cfcb8 192.168.100.135:17022
       slots: (0 slots) slave
       replicates 7fa64d250b595d8ac21a42477af5ac8c07c35d83
    M: 5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022
       slots:0-1 (2 slots) master
       0 additional replica(s)
    M: 51bf103f7cf6b5ede6e009ce489fdeec14961be8 192.168.100.135:17021
       slots:5461-10922 (5462 slots) master
       0 additional replica(s)
    M: 0191a8b52646fb5c45323ab0c1a1a79dc8f3aea2 192.168.100.136:17021
       slots:10923-16383 (5461 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    
    Ready to move 10 slots.
      Source nodes:
        M: 7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021
       slots:2-5460 (5459 slots) master
       1 additional replica(s)
      Destination node:
        M: 5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022
       slots:0-1 (2 slots) master
       0 additional replica(s)
      Resharding plan:
        Moving slot 2 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 3 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 4 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 5 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 6 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 7 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 8 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 9 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 10 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
        Moving slot 11 from 7fa64d250b595d8ac21a42477af5ac8c07c35d83
    Do you want to proceed with the proposed reshard plan (yes/no)? yes
    Moving slot 2 from 192.168.100.134:17021 to 192.168.100.134:17022: ....................
    Moving slot 3 from 192.168.100.134:17021 to 192.168.100.134:17022: ..........
    Moving slot 4 from 192.168.100.134:17021 to 192.168.100.134:17022: ..................
    Moving slot 5 from 192.168.100.134:17021 to 192.168.100.134:17022: ..
    Moving slot 6 from 192.168.100.134:17021 to 192.168.100.134:17022: ..
    Moving slot 7 from 192.168.100.134:17021 to 192.168.100.134:17022: ...............................
    Moving slot 8 from 192.168.100.134:17021 to 192.168.100.134:17022: ..........
    Moving slot 9 from 192.168.100.134:17021 to 192.168.100.134:17022: ..........................
    Moving slot 10 from 192.168.100.134:17021 to 192.168.100.134:17022: ........................................
    Moving slot 11 from 192.168.100.134:17021 to 192.168.100.134:17022: ..........

    Slot distribution after the migration:

    192.168.100.135:17021> cluster nodes
    5476787f31fa375fda6bb32676a969c8b8adfbc2 192.168.100.134:17022 master - 0 1488349695628 7 connected 0-11
    7fa64d250b595d8ac21a42477af5ac8c07c35d83 192.168.100.134:17021 master - 0 1488349698634 1 connected 12-5460
    51bf103f7cf6b5ede6e009ce489fdeec14961be8 192.168.100.135:17021 myself,master - 0 0 2 connected 5461-10922
    77d02fef656265c9c421fef425527c510e4cfcb8 192.168.100.135:17022 slave 7fa64d250b595d8ac21a42477af5ac8c07c35d83 0 1488349697631 1 connected
    0191a8b52646fb5c45323ab0c1a1a79dc8f3aea2 192.168.100.136:17021 master - 0 1488349696631 3 connected 10923-16383

    The slots are unevenly distributed after the new node was added; the rebalance command described above can even them out.
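
    The imbalance is easy to quantify from the cluster nodes output. The sketch below counts the slots owned by each master, treating each trailing field as either a single slot or a first-last range and skipping the bracketed markers that appear while a slot is migrating. The function names are illustrative.

```python
from typing import Dict, Iterable


def count_slots(tokens: Iterable[str]) -> int:
    """Count slots in tokens such as "0", "0-11", or "5462-10922"."""
    total = 0
    for token in tokens:
        if token.startswith("["):
            continue  # importing/migrating marker, not an owned range
        first, _, last = token.partition("-")
        total += int(last or first) - int(first) + 1
    return total


def slots_per_master(cluster_nodes_text: str) -> Dict[str, int]:
    """Map each master's address to the number of slots it owns."""
    result = {}
    for line in cluster_nodes_text.splitlines():
        fields = line.split()
        if len(fields) >= 8 and "master" in fields[2].split(","):
            result[fields[1]] = count_slots(fields[8:])
    return result
```

    For the output above this gives 12 slots on 134:17022 against more than 5000 on each of the other masters.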

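    Note: if the Redis servers are configured with authentication and require a password, this script cannot run, because it expects every server to accept connections without a password. If the servers do require a password, authentication can be disabled temporarily: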

    192.168.100.134:17022> config set masterauth ""  
    OK
    192.168.100.134:17022> config set requirepass ""
    OK
    # normally there is no permission to write the config file.
    #192.168.100.134:17022> config rewrite  

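    Set the passwords back once the work is done. This concludes the scripted deployment. With both the manual and the scripted approach, the servers cannot have a password set while data is being migrated, or authentication fails. Keep this in mind when operating on servers that have authentication enabled.

    Failure detection and failover

    The failover command was introduced in the management section. It can be used to simulate failure detection and failover; stopping a Redis server is another way to simulate it. The node that runs failover must be a slave. First look at the nodes in the cluster and their slaves:

    192.168.100.134:17021> cluster nodes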
    93a030d6f1d1248c1182114c7044b204aa0ee022 192.168.100.136:17021 master - 0 1488378411940 4 connected 10923-16383
    b836dc49206ac8895be7a0c4b8ba571dffa1e1c4 192.168.100.135:17022 slave 23c2bb6fc906b55fb59a051d1f9528f5b4bc40d4 0 1488378410938 1 connected
    5980546e3b19ff5210057612656681b505723da4 192.168.100.134:17022 slave 93a030d6f1d1248c1182114c7044b204aa0ee022 0 1488378408935 4 connected
    23c2bb6fc906b55fb59a051d1f9528f5b4bc40d4 192.168.100.134:17021 myself,master - 0 0 1 connected 0-5461
    526d99b679229c8003b0504e27ae7aee4e9c9c3a 192.168.100.135:17021 master - 0 1488378412941 2 connected 5462-10922
    39bf42b321a588dcd93efc4b4cc9cb3b496cacb6 192.168.100.136:17022 slave 526d99b679229c8003b0504e27ae7aee4e9c9c3a 0 1488378413942 5 connected
    192.168.100.134:17021> cluster slaves 23c2bb6fc906b55fb59a051d1f9528f5b4bc40d4
    1) "b836dc49206ac8895be7a0c4b8ba571dffa1e1c4 192.168.100.135:17022 slave 23c2bb6fc906b55fb59a051d1f9528f5b4bc40d4 0 1488378414945 1 connected"

    To simulate a failure of 134:17021, run failover on its slave 135:17022 and follow the failover in the logs:

    192.168.100.135:17022> cluster failover
    OK
    192.168.100.135:17022> cluster nodes
    39bf42b321a588dcd93efc4b4cc9cb3b496cacb6 192.168.100.136:17022 slave 526d99b679229c8003b0504e27ae7aee4e9c9c3a 0 1488378807681 5 connected
    23c2bb6fc906b55fb59a051d1f9528f5b4bc40d4 192.168.100.134:17021 slave b836dc49206ac8895be7a0c4b8ba571dffa1e1c4 0 1488378804675 6 connected
    526d99b679229c8003b0504e27ae7aee4e9c9c3a 192.168.100.135:17021 master - 0 1488378806679 2 connected 5462-10922
    5980546e3b19ff5210057612656681b505723da4 192.168.100.134:17022 slave 93a030d6f1d1248c1182114c7044b204aa0ee022 0 1488378808682 4 connected
    b836dc49206ac8895be7a0c4b8ba571dffa1e1c4 192.168.100.135:17022 myself,master - 0 0 6 connected 0-5461
    93a030d6f1d1248c1182114c7044b204aa0ee022 192.168.100.136:17021 master - 0 1488378809684 4 connected 10923-16383

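    The output shows that the slave has been promoted to master and that the old master has become a slave. The logs also show the two nodes synchronizing. Stopping the server is another way to exercise the same process.

    This completes the deployment, management, and testing of the cluster. The test scripts used to generate data are:

    ①: Write to the cluster (cluster_write_test.py)

    ②: Write to the cluster with a pipeline (cluster_write_pipe_test.py)

    ③: Write to a single instance (single_write_test.py)

    ④: Write to a single instance with a pipeline (single_write_pipe_test.py)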

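    Summary:

          Redis Cluster has no central node and needs no proxy. Clients connect directly to every node of the cluster, compute the slot for a key with the same hash algorithm, and run the command directly on the Redis instance that serves that slot. In terms of the CAP theorem, Cluster provides AP (Availability and Partition tolerance), which turns Redis from a plain in-memory NoSQL database into a distributed NoSQL database.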
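
    The client-side lookup from slot to node can be sketched as a sorted range table searched with bisect. The SlotTable class below is hypothetical and not taken from any Redis client; the ranges mirror the cluster slots output shown after slot 0 was moved. Real clients additionally refresh the table when a node answers with a MOVED redirection.

```python
import bisect
from typing import List, Optional, Tuple


class SlotTable:
    """Maps a slot number to the address of the master that serves it."""

    def __init__(self, ranges: List[Tuple[int, int, str]]) -> None:
        # Each entry is (first_slot, last_slot, "host:port"), sorted by first_slot.
        self._ranges = sorted(ranges)
        self._starts = [first for first, _, _ in self._ranges]

    def node_for(self, slot: int) -> Optional[str]:
        # Find the last range that starts at or before the slot.
        index = bisect.bisect_right(self._starts, slot) - 1
        if index >= 0:
            first, last, addr = self._ranges[index]
            if first <= slot <= last:
                return addr
        return None  # slot not covered by any known range


if __name__ == "__main__":
    table = SlotTable([
        (10923, 16383, "192.168.100.136:17021"),
        (1, 5461, "192.168.100.134:17021"),
        (0, 0, "192.168.100.135:17021"),
        (5462, 10922, "192.168.100.135:17021"),
    ])
    print(table.node_for(10310))
```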


  • Original article: https://www.cnblogs.com/ExMan/p/11039133.html