  • MongoDB cluster: config server query timeouts cause writes through mongos to fail

    Environment

    OS: CentOS 7.x
    DB: MongoDB 3.6.12
    Cluster layout: mongod-shard1 *3 + mongod-shard2 *3 + mongod-conf-shard *3 + mongos *3

    Application error log

    caused by :: NetworkInterfaceExceededTimeLimit: Operation time out on server ****:27018
    ....
    at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:107)
    

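    Reproducing the fault

    An insert into one existing collection fails with NetworkInterfaceExceededTimeLimit: Operation time out,
    while the same insert into a different collection that does not exist yet succeeds.

    This points to a problem when the config server is queried for sharding metadata.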

    Troubleshooting

    2020-07-07T09:55:36.605+0800 D REPL     [conn52850] Required snapshot optime: { ts: Timestamp(1594086936, 7), t: 19 } is not yet part of the current 'committed' snapshot: { ts: Timestamp(1594086936, 3), t: 19 }
    2020-07-07T09:55:36.605+0800 D REPL     [conn35081] Required snapshot optime: { ts: Timestamp(1594086936, 7), t: 19 } is not yet part of the current 'committed' snapshot: { ts: Timestamp(1594086936, 3), t: 19 }
    2020-07-07T09:55:37.084+0800 D REPL     [conn72545] waitUntilOpTime: waiting for optime:{ ts: Timestamp(1594086683, 2), t: 20 } to be in a snapshot -- current snapshot: { ts: Timestamp(1594086936, 7), t: 19 }
    2020-07-07T09:55:37.187+0800 I COMMAND  [conn72537] Command on database config timed out waiting for read concern to be satisfied. Command: { find: "shards", readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1594086804, 1), t: 20 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1594086903, 1), signature: { hash: BinData(0, CD6262BF59D2AAC318183C6109F3B31DEE2E1837), keyId: 6807014219125358676 } }, $configServerState: { opTime: { ts: Timestamp(1594086804, 1), t: 20 } }, $db: "config" }
    2020-07-07T09:55:37.187+0800 I COMMAND  [conn72537] command config.$cmd command: find { find: "shards", readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1594086804, 1), t: 20 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1594086903, 1), signature: { hash: BinData(0, CD6262BF59D2AAC318183C6109F3B31DEE2E1837), keyId: 6807014219125358676 } }, $configServerState: { opTime: { ts: Timestamp(1594086804, 1), t: 20 } }, $db: "config" } numYields:0 reslen:517 locks:{} protocol:op_msg 30009ms
    2020-07-07T09:55:37.187+0800 I NETWORK  [conn72537] end connection *.*.*.*:45296 (34 connections now open)
    2020-07-07T09:55:40.425+0800 D REPL     [conn72539] Required snapshot optime: { ts: Timestamp(1594086940, 1), t: 19 } is not yet part of the current 'committed' snapshot: { ts: Timestamp(1594086936, 7), t: 19 }
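One detail visible in the excerpt above: the optimes being waited for carry t: 20, while this node's committed snapshot is still at t: 19. The following is a reading aid only, not a diagnosis from the original post. It parses the two optimes out of the waitUntilOpTime line and compares them under the assumption that optimes are ordered by term first and by timestamp second. The helper names `parse_optimes` and `is_satisfied` are made up for this sketch.

```python
import re

# Matches optime documents as they are printed in the log, e.g.
# { ts: Timestamp(1594086936, 7), t: 19 }
OPTIME_RE = re.compile(r"\{ ts: Timestamp\((\d+), (\d+)\), t: (\d+) \}")


def parse_optimes(line):
    """Return every optime in a log line as a (term, seconds, increment) tuple."""
    return [(int(t), int(secs), int(inc)) for secs, inc, t in OPTIME_RE.findall(line)]


def is_satisfied(required, current):
    """True if `current` has reached `required`.

    ASSUMPTION: optimes are ordered by term first, then by timestamp.
    Tuples of (term, seconds, increment) compare in exactly that order.
    """
    return current >= required


# The waitUntilOpTime line from the config server log, copied verbatim.
line = (
    "2020-07-07T09:55:37.084+0800 D REPL     [conn72545] waitUntilOpTime: "
    "waiting for optime:{ ts: Timestamp(1594086683, 2), t: 20 } to be in a "
    "snapshot -- current snapshot: { ts: Timestamp(1594086936, 7), t: 19 }"
)

required, current = parse_optimes(line)
print("required :", required)
print("current  :", current)
print("satisfied:", is_satisfied(required, current))
```

Under that ordering assumption the wait cannot complete while the node stays in term 19, even though the requested timestamp is older than the current snapshot's timestamp. Whether this is what actually happened on the cluster is not established by the logs alone.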
    

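    The config server log contains the line: Command on database config timed out waiting for read concern to be satisfied.
    The root cause is unknown, but the log shows a find on the config server running into its time limit (30009ms against maxTimeMS: 30000), which matches the error in the application log.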
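To find such entries in a larger log file, a small scanner can help. This is a minimal sketch that assumes the 3.6-style log line format shown in the excerpt above; the function name `read_concern_timeouts` and the field names of the returned dicts are chosen for illustration only.

```python
import re

MARKER = "timed out waiting for read concern to be satisfied"
# Timestamp, connection id and database from the start of the line.
HEADER_RE = re.compile(r"^(\S+) \w\s+\w+\s+\[(\w+)\] Command on database (\S+) ")
# Command name and its target collection, e.g. find: "shards".
COMMAND_RE = re.compile(r'Command: \{ (\w+): "([^"]+)"')
MAX_TIME_RE = re.compile(r"maxTimeMS: (\d+)")


def read_concern_timeouts(lines):
    """Return one dict per log line that reports a read concern timeout."""
    hits = []
    for raw in lines:
        line = raw.strip()
        if MARKER not in line:
            continue
        header = HEADER_RE.search(line)
        command = COMMAND_RE.search(line)
        max_time = MAX_TIME_RE.search(line)
        hits.append({
            "time": header.group(1) if header else None,
            "conn": header.group(2) if header else None,
            "db": header.group(3) if header else None,
            "command": command.group(1) if command else None,
            "target": command.group(2) if command else None,
            "max_time_ms": int(max_time.group(1)) if max_time else None,
        })
    return hits


# Two lines from the config server log. The first is abridged: the fields
# after $readPreference are dropped to keep the example short.
sample = [
    '2020-07-07T09:55:37.187+0800 I COMMAND  [conn72537] Command on database '
    'config timed out waiting for read concern to be satisfied. Command: '
    '{ find: "shards", readConcern: { level: "majority", afterOpTime: '
    '{ ts: Timestamp(1594086804, 1), t: 20 } }, maxTimeMS: 30000, '
    '$readPreference: { mode: "nearest" } }',
    '2020-07-07T09:55:37.187+0800 I NETWORK  [conn72537] end connection '
    '*.*.*.*:45296 (34 connections now open)',
]

hits = read_concern_timeouts(sample)
for hit in hits:
    print(hit)
```

For a real log, pass an open file object instead of `sample`; the function only iterates over lines.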

    Restarting the PRIMARY node of the config server replica set forced the SECONDARY nodes to elect a new primary.
    After that, the fault cleared.
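After a restart like this it may be worth confirming that the config server replica set has settled. The sketch below works on a dict shaped like the output of the replSetGetStatus command (for example, obtained with pymongo via `client.admin.command("replSetGetStatus")`). The health rule it applies, exactly one PRIMARY and all members reporting the same term, is a heuristic chosen for this example and not something the original post checked. The host names and term values in the sample are hypothetical.

```python
def summarize_replica_set(status):
    """Return (primary_names, term_by_member) from a replSetGetStatus-shaped dict."""
    members = status["members"]
    primaries = [m["name"] for m in members if m["stateStr"] == "PRIMARY"]
    # Each member's last applied optime is a document with `ts` and `t` (term).
    terms = {m["name"]: m["optime"]["t"] for m in members if "optime" in m}
    return primaries, terms


def looks_settled(status):
    """Heuristic: exactly one PRIMARY and every member reports the same term."""
    primaries, terms = summarize_replica_set(status)
    return len(primaries) == 1 and len(set(terms.values())) == 1


# Hypothetical status document; only the fields used above are included.
status_after_restart = {
    "set": "conf-shard",
    "members": [
        {"name": "conf1:27018", "stateStr": "SECONDARY", "optime": {"t": 21}},
        {"name": "conf2:27018", "stateStr": "PRIMARY", "optime": {"t": 21}},
        {"name": "conf3:27018", "stateStr": "SECONDARY", "optime": {"t": 21}},
    ],
}

primaries, terms = summarize_replica_set(status_after_restart)
print("primary:", primaries)
print("terms  :", terms)
print("settled:", looks_settled(status_after_restart))
```

A member still reporting an older term than the others would make `looks_settled` return False, which is a prompt to look closer rather than proof of a problem.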

  • Original article: https://www.cnblogs.com/TopGear/p/13259952.html