  • Handling a Redis master-slave replication interruption

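    A production alert reported that master-slave replication had been interrupted. The replication info on the slave looked like this: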

    # Replication
    role:slave
    master_host:master_host
    master_port:6379
    master_link_status:down
    master_last_io_seconds_ago:-1
    master_sync_in_progress:1
    slave_repl_offset:1
    master_sync_left_bytes:713983940
    master_sync_last_io_seconds_ago:0
    master_link_down_since_seconds:248
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
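
A check like the one above can be automated. The following is a minimal Python sketch, assuming the INFO text has already been fetched (for example with redis-cli) and is available as a string; the helper names `parse_info` and `replication_problem` are hypothetical and not part of any Redis client library.

```python
def parse_info(text):
    """Parse `key:value` lines from Redis INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replication_problem(fields):
    """Return a short description of the problem, or None if nothing looks wrong."""
    if fields.get("role") != "slave":
        return None
    if fields.get("master_link_status") == "up":
        return None
    down_for = fields.get("master_link_down_since_seconds", "?")
    if fields.get("master_sync_in_progress") == "1":
        left = fields.get("master_sync_left_bytes", "?")
        return f"link down for {down_for}s, full sync in progress ({left} bytes left)"
    return f"link down for {down_for}s, no sync in progress"


# A subset of the fields shown in the output above.
SAMPLE = """\
# Replication
role:slave
master_link_status:down
master_sync_in_progress:1
master_sync_left_bytes:713983940
master_link_down_since_seconds:248
"""

print(replication_problem(parse_info(SAMPLE)))
```

For the sample above, the function reports that the link has been down for 248 seconds while a full sync is still in progress, which matches what the raw fields say.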
    

    The link status is down, so replication has failed. Check the relevant logs on the master node:

    [374] 15 Oct 16:41:28.146 # Connection with slave 10.72.26.55:6379 lost.
    [374] 15 Oct 16:41:28.999 * Slave asks for synchronization
    [374] 15 Oct 16:41:28.999 * Unable to partial resync with the slave for lack of backlog (Slave request was: 152340118946214).
    [374] 15 Oct 16:41:28.999 * Starting BGSAVE for SYNC
    [374] 15 Oct 16:41:29.447 * Background saving started by pid 11357
    [11357] 15 Oct 16:41:57.325 * DB saved on disk
    [11357] 15 Oct 16:41:57.555 * RDB: 231 MB of memory used by copy-on-write
    [374] 15 Oct 16:41:57.980 * Background saving terminated with success
    [374] 15 Oct 16:42:31.739 * Synchronization with slave succeeded
    [374] 15 Oct 16:43:01.021 # Client id=6082455 addr=slave_host:55308 fd=329 name= age=93 idle=1 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=10657 omem=2504780296 events=rw cmd=replconf scheduled to be closed ASAP for overcoming of output buffer limits.
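
The last log line carries the numbers that matter. As a rough sketch (the helper `client_fields` and the regular expression are illustrative assumptions, not a Redis API), the `key=value` pairs can be pulled out and the `omem` value compared with the default 256mb hard limit for slave clients:

```python
import re

# Default hard limit for the slave class in redis.conf (256mb), in bytes.
HARD_LIMIT_BYTES = 256 * 1024 * 1024


def client_fields(log_line):
    """Extract the key=value pairs from a client description in a Redis log line."""
    return dict(re.findall(r"(\w[\w-]*)=(\S*)", log_line))


# The client description from the master log above (trailing text shortened).
LINE = (
    "# Client id=6082455 addr=slave_host:55308 fd=329 name= age=93 idle=1 "
    "flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=10657 "
    "omem=2504780296 events=rw cmd=replconf scheduled to be closed ASAP"
)

fields = client_fields(LINE)
omem = int(fields["omem"])
print(f"omem = {omem} bytes, {omem / HARD_LIMIT_BYTES:.1f}x the hard limit")
print("over hard limit:", omem > HARD_LIMIT_BYTES)
```

Here `omem` (the memory used by the client's output buffer) is about 2.5 GB after a connection age of 93 seconds, far beyond the 256mb hard limit, which is why the master closed the connection.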
    

    Check the slave node's log:

    [372] 15 Oct 16:43:01.141 # Connection with master lost.
    [372] 15 Oct 16:43:01.141 * Caching the disconnected master state.
    [372] 15 Oct 16:43:01.213 * Connecting to MASTER masterhost:6379
    [372] 15 Oct 16:43:01.213 * MASTER <-> SLAVE sync started
    [372] 15 Oct 16:43:01.213 * Non blocking connect for SYNC fired the event.
    [372] 15 Oct 16:43:01.572 * Master replied to PING, replication can continue...
    [372] 15 Oct 16:43:01.599 * Trying a partial resynchronization (request cbc213a279fde141211f65d436595e4ed64198fa:152342150944513).
    [372] 15 Oct 16:43:01.602 * Full resync from master: cbc213a279fde141211f65d436595e4ed64198fa:152344338348685
    [372] 15 Oct 16:43:01.602 * Discarding previously cached master state.
    [372] 15 Oct 16:43:30.326 * MASTER <-> SLAVE sync: receiving 1308737462 bytes from master
    [372] 15 Oct 16:43:59.846 * MASTER <-> SLAVE sync: Flushing old data
    [372] 15 Oct 16:44:01.534 * MASTER <-> SLAVE sync: Loading DB in memory
    [372] 15 Oct 16:44:22.590 * MASTER <-> SLAVE sync: Finished with success
    [372] 15 Oct 16:44:22.600 # Connection with master lost.
    [372] 15 Oct 16:44:22.600 * Caching the disconnected master state.
    

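    The master's log shows that the slave's connection was forcibly closed because it exceeded the configured output buffer limits. Here is how the self-documented configuration file of Redis 2.8 describes the setting: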

    # client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>
    #
    # A client is immediately disconnected once the hard limit is reached, or if
    # the soft limit is reached and remains reached for the specified number of
    # seconds (continuously).
    # So for instance if the hard limit is 32 megabytes and the soft limit is
    # 16 megabytes / 10 seconds, the client will get disconnected immediately
    # if the size of the output buffers reach 32 megabytes, but will also get
    # disconnected if the client reaches 16 megabytes and continuously overcomes
    # the limit for 10 seconds.
    #
    # By default normal clients are not limited because they don't receive data
    # without asking (in a push way), but just after a request, so only
    # asynchronous clients may create a scenario where data is requested faster
    # than it can read.
    #
    # Instead there is a default limit for pubsub and slave clients, since
    # subscribers and slaves receive data in a push fashion.
    #
    # Both the hard or the soft limit can be disabled by setting them to zero.
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    

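    The limit that matters here is the one for slave:

    256mb is the hard limit: as soon as the output buffer reaches 256mb, the connection is closed.
    64mb 60 is the soft limit: if the output buffer stays above 64mb for 60 seconds continuously, the connection is closed.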
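
The two rules can be modelled in a few lines. This is a simplified sketch of the behaviour described in the configuration comments, not the Redis source; the class name `OutputBufferLimit` and the exact comparison at the 60-second boundary are assumptions.

```python
MB = 1024 * 1024


class OutputBufferLimit:
    """Simplified model of `client-output-buffer-limit <hard> <soft> <seconds>`."""

    def __init__(self, hard_bytes, soft_bytes, soft_seconds):
        self.hard = hard_bytes
        self.soft = soft_bytes
        self.soft_seconds = soft_seconds
        self.soft_since = None  # time at which the soft limit was first exceeded

    def should_close(self, buffer_bytes, now):
        """Return True if a client with this buffer size should be disconnected."""
        # Hard limit: disconnect immediately (a value of 0 disables the check).
        if self.hard and buffer_bytes >= self.hard:
            return True
        # Soft limit: disconnect only if exceeded continuously for soft_seconds.
        if self.soft and buffer_bytes >= self.soft:
            if self.soft_since is None:
                self.soft_since = now
            return now - self.soft_since > self.soft_seconds
        # Buffer dropped below the soft limit: reset the timer.
        self.soft_since = None
        return False


slave_limit = OutputBufferLimit(256 * MB, 64 * MB, 60)
print(slave_limit.should_close(100 * MB, now=0))   # False: soft limit just reached
print(slave_limit.should_close(100 * MB, now=30))  # False: only 30 s above 64mb
print(slave_limit.should_close(100 * MB, now=61))  # True: above 64mb for over 60 s
print(slave_limit.should_close(300 * MB, now=62))  # True: hard limit exceeded
```

Note that the soft timer resets whenever the buffer falls back below 64mb, so only a sustained backlog triggers a disconnect.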
    

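    When the number of connections surges and the data volume is large, the default parameters can no longer sustain master-slave synchronization. The slave keeps requesting a sync from the master, and the master keeps running BGSAVE and sending the file to the slave, which creates an endless loop. The value of client-output-buffer-limit slave must be adjusted according to how the slave is used. Once it is adjusted, synchronization works normally again.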
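
One rough way to pick a new value is to size the limit from the buffer growth seen in the master log (omem=2504780296 at age=93). The sketch below is a heuristic of my own rather than the original author's method: the function name `suggest_slave_limit`, the 120-second sync duration, the 2x headroom factor, and the choice of half the hard limit as the soft limit are all assumptions to adapt to the actual workload.

```python
def suggest_slave_limit(observed_omem, observed_age_s, sync_seconds, headroom=2.0):
    """Return (hard_bytes, soft_bytes) sized from an observed buffer growth rate.

    observed_omem / observed_age_s approximates how fast the output buffer grew
    while the slave was syncing; the hard limit must cover a whole sync.
    """
    rate = observed_omem / observed_age_s           # bytes buffered per second
    hard = int(rate * sync_seconds * headroom)      # assumed safety margin
    soft = hard // 2                                # assumed soft limit
    return hard, soft


# Values taken from the master log line above; 120 s is an assumed sync duration.
hard, soft = suggest_slave_limit(2504780296, 93, sync_seconds=120)
print(f"client-output-buffer-limit slave {hard} {soft} 120")
```

The printed line has the same `<class> <hard limit> <soft limit> <soft seconds>` shape as the redis.conf entries shown earlier, with the limits expressed in bytes. Treat the numbers as a starting point, since a larger buffer also means more memory used on the master during a sync.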

  • Original article: https://www.cnblogs.com/shengdimaya/p/11679393.html