  • MongoDB Replica Set: Recovering a Secondary Node Instance

    Scenario

    One Secondary node in a MongoDB replica set is stuck in the RECOVERING state.

    Its status is as follows:

    
    
        arps:RECOVERING> rs.status()
        {
                "set" : "arps",
                "date" : ISODate("2017-12-22T02:31:58.803Z"),
                "myState" : 3,
                "members" : [
                        {
                                "_id" : 0,
                                "name" : "172.17.4.37:27017",
                                "health" : 1,
                                "state" : 2,
                                "stateStr" : "SECONDARY",
                                "uptime" : 7579839,
                                "optime" : Timestamp(1513909913, 3),
                                "optimeDate" : ISODate("2017-12-22T02:31:53Z"),
                                "lastHeartbeat" : ISODate("2017-12-22T02:31:58.019Z"),
                                "lastHeartbeatRecv" : ISODate("2017-12-22T02:31:57.750Z"),
                                "pingMs" : 0,
                                "syncingTo" : "172.17.4.38:27017",
                                "configVersion" : 1
                        },
                        {
                                "_id" : 1,
                                "name" : "172.17.4.38:27017",
                                "health" : 1,
                                "state" : 1,
                                "stateStr" : "PRIMARY",
                                "uptime" : 7579913,
                                "optime" : Timestamp(1513909913, 3),
                                "optimeDate" : ISODate("2017-12-22T02:31:53Z"),
                                "lastHeartbeat" : ISODate("2017-12-22T02:31:58.051Z"),
                                "lastHeartbeatRecv" : ISODate("2017-12-22T02:31:58.018Z"),
                                "pingMs" : 0,
                                "electionTime" : Timestamp(1506330005, 1),
                                "electionDate" : ISODate("2017-09-25T09:00:05Z"),
                                "configVersion" : 1
                        },
                        {
                                "_id" : 2,
                                "name" : "172.17.4.39:27017",
                                "health" : 1,
                                "state" : 3,
                                "stateStr" : "RECOVERING",// RECOVERING state: the third node has a problem.
                                "uptime" : 7580364,
                                "optime" : Timestamp(1473614444, 2),
                                "optimeDate" : ISODate("2016-09-11T17:20:44Z"),
                                "configVersion" : 1,
                                "self" : true
                        }
                ],
                "ok" : 1
        }
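
The `optimeDate` values in the output above show how stale the failed node is. As a minimal sketch (the function name `optimeLagSeconds` is hypothetical, and the input is assumed to have the same shape as the `rs.status()` result), the lag of each member behind the primary could be computed like this:

```javascript
// Sketch: report how far each member's optime trails the primary's.
// `status` is assumed to look like the rs.status() output shown above.
function optimeLagSeconds(status) {
  var primary = null;
  for (var i = 0; i < status.members.length; i++) {
    if (status.members[i].stateStr === "PRIMARY") {
      primary = status.members[i];
    }
  }
  if (primary === null) {
    return null; // no primary visible, so lag is undefined
  }
  var result = {};
  for (var j = 0; j < status.members.length; j++) {
    var m = status.members[j];
    // optimeDate is a Date (ISODate in the mongo shell); getTime() is in milliseconds
    result[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return result;
}

// Values copied from the status output above
var sample = {
  members: [
    { name: "172.17.4.37:27017", stateStr: "SECONDARY", optimeDate: new Date("2017-12-22T02:31:53Z") },
    { name: "172.17.4.38:27017", stateStr: "PRIMARY", optimeDate: new Date("2017-12-22T02:31:53Z") },
    { name: "172.17.4.39:27017", stateStr: "RECOVERING", optimeDate: new Date("2016-09-11T17:20:44Z") }
  ]
};
var lag = optimeLagSeconds(sample);
```

With the sample values, the healthy secondary has zero lag, while 172.17.4.39 is more than a year behind the primary. In the mongo shell the same function could be called as `optimeLagSeconds(rs.status())`.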
    
    
    

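    Recovery approaches:
    1. Shut down the database service on the failed MongoDB node, remove its data directory, then start the MongoDB service again. The node performs an automatic initial sync and returns as a secondary.
    2. Take a snapshot from another secondary data node with writes stopped, so that the data does not change and the backup is consistent. Copy the snapshot to the failed node, start the MongoDB service, and let it apply the oplog to catch up and return as a secondary.

    Because the data volume in this environment is small, the first approach is used.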

    1. Shut down the MongoDB database service

    arps:RECOVERING> use admin
    switched to db admin
    arps:RECOVERING> db.shutdownServer()
    

    2. Delete or move away the data directory

    [root@mongodb data]# mv /opt/data/mongodb /opt/data/mongodb20171222
    [root@mongodb data]# mkdir /opt/data/mongodb
    [root@mongodb data]# mkdir /opt/data/mongodb/log
    

    3. Start the database service and check the status

    [root@mongodb data]# /opt/software/mongodb-linux-x86_64-3.0.1/bin/mongod -f /opt/software/mongodb-linux-x86_64-3.0.1/bin/mongodb.conf
    
    arps:STARTUP2> rs.status()
    {
            "set" : "arps",
            "date" : ISODate("2017-12-22T02:46:52.288Z"),
            "myState" : 5,
            "syncingTo" : "172.17.4.38:27017",
            "members" : [
                    {
                            "_id" : 0,
                            "name" : "172.17.4.37:27017",
                            "health" : 1,
                            "state" : 2,
                            "stateStr" : "SECONDARY",
                            "uptime" : 25,
                            "optime" : Timestamp(1513910813, 3),
                            "optimeDate" : ISODate("2017-12-22T02:46:53Z"),
                            "lastHeartbeat" : ISODate("2017-12-22T02:46:51.122Z"),
                            "lastHeartbeatRecv" : ISODate("2017-12-22T02:46:51.114Z"),
                            "pingMs" : 0,
                            "syncingTo" : "172.17.4.38:27017",
                            "configVersion" : 1
                    },
                    {
                            "_id" : 1,
                            "name" : "172.17.4.38:27017",
                            "health" : 1,
                            "state" : 1,
                            "stateStr" : "PRIMARY",
                            "uptime" : 25,
                            "optime" : Timestamp(1513910813, 3),
                            "optimeDate" : ISODate("2017-12-22T02:46:53Z"),
                            "lastHeartbeat" : ISODate("2017-12-22T02:46:51.127Z"),
                            "lastHeartbeatRecv" : ISODate("2017-12-22T02:46:51.303Z"),
                            "pingMs" : 0,
                            "electionTime" : Timestamp(1506330005, 1),
                            "electionDate" : ISODate("2017-09-25T09:00:05Z"),
                            "configVersion" : 1
                    },
                    {
                            "_id" : 2,
                            "name" : "172.17.4.39:27017",
                            "health" : 1,
                            "state" : 5,
                            "stateStr" : "STARTUP2",// STARTUP2 state: a newly added node is running its initial data sync
                            "uptime" : 27,
                            "optime" : Timestamp(0, 0),
                            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                            "syncingTo" : "172.17.4.38:27017",
                            "configVersion" : 1,
                            "self" : true
                    }
            ],
            "ok" : 1
    } 
    

    For the replica set member states, see: https://docs.mongodb.com/v3.0/reference/replica-states/index.html
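
The numeric `state` and `myState` fields seen in the `rs.status()` output map to the state names in that reference. The lookup below is only an illustrative sketch: the helper name `stateName` is hypothetical, and the table lists the codes documented for MongoDB 3.0.

```javascript
// Numeric replica set member states as documented for MongoDB 3.0
var REPLICA_STATES = {
  0: "STARTUP",
  1: "PRIMARY",
  2: "SECONDARY",
  3: "RECOVERING",
  5: "STARTUP2",
  6: "UNKNOWN",
  7: "ARBITER",
  8: "DOWN",
  9: "ROLLBACK",
  10: "REMOVED"
};

// Hypothetical helper: translate a state / myState code into its name
function stateName(code) {
  if (REPLICA_STATES.hasOwnProperty(code)) {
    return REPLICA_STATES[code];
  }
  return "UNRECOGNIZED(" + code + ")";
}
```

In this walkthrough the failed node moves from state 3 (RECOVERING) to state 5 (STARTUP2) after the restart, and is expected to end in state 2 (SECONDARY).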

    About half an hour later the data recovery finished. The log shows:

    
    
        .....................
        2017-12-22T11:27:02.474+0800 I INDEX [rsSync] building index using bulk method
        2017-12-22T11:27:02.475+0800 I INDEX [rsSync] build index done. scanned 75 total records. 0 secs
        2017-12-22T11:27:02.477+0800 I REPL [rsSync] initial sync data copy, starting syncup
        2017-12-22T11:27:02.798+0800 I REPL [rsSync] oplog sync 1 of 3
        2017-12-22T11:27:03.145+0800 I REPL [ReplicationExecutor] syncing from: 172.17.4.38:27017
        2017-12-22T11:27:03.288+0800 I REPL [rsSync] oplog sync 2 of 3
        2017-12-22T11:27:03.289+0800 I REPL [rsSync] initial sync building indexes
        2017-12-22T11:27:03.289+0800 I REPL [rsSync] initial sync cloning indexes for : demo
        2017-12-22T11:27:03.300+0800 I REPL [SyncSourceFeedback] replset setting syncSourceFeedback to 172.17.4.38:27017
        2017-12-22T11:27:03.390+0800 I STORAGE [rsSync] copying indexes for: { name: "ACT_AUTH_LOG", options: {} }
        2017-12-22T11:27:03.391+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_DATA_LOG", options: {} }
        2017-12-22T11:27:03.392+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_ERROR_LOG", options: {} }
        2017-12-22T11:27:03.392+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_EXTERNAL_PACKET", options: {} }
        2017-12-22T11:27:03.393+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_EXTERNAL_PACKET_LOG", options: {} }
        2017-12-22T11:27:03.393+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_JPUSH_LOG", options: {} }
        2017-12-22T11:27:03.394+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_MESSAGE_LOG", options: {} }
        2017-12-22T11:27:03.395+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_REQUEST_LOG", options: {} }
        2017-12-22T11:27:03.395+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_RETRY_MESSAGE", options: {} }
        2017-12-22T11:27:03.395+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_RUN_LOG", options: { capped: true, size: 536870912 } }
        2017-12-22T11:27:03.396+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_SMSEMAIL_LOG", options: {} }
        2017-12-22T11:27:03.396+0800 I STORAGE [rsSync] copying indexes for: { name: "SYSTEM_TIMEOUT_LOG", options: {} }
        2017-12-22T11:27:03.397+0800 I REPL [rsSync] oplog sync 3 of 3
        2017-12-22T11:27:03.406+0800 I REPL [rsSync] initial sync finishing up
        2017-12-22T11:27:03.406+0800 I REPL [rsSync] replSet set minValid=5a3c7b93:3
        2017-12-22T11:27:03.429+0800 I REPL [rsSync] initial sync done
        2017-12-22T11:27:03.474+0800 I REPL [ReplicationExecutor] transition to RECOVERING
        2017-12-22T11:27:03.476+0800 I REPL [ReplicationExecutor] transition to SECONDARY
        .................
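
The final two REPL lines are the ones that confirm recovery. As a rough sketch (the function name `stateTransitions` is hypothetical, and the regular expression assumes the 3.0-style log line layout shown above), the state transitions could be pulled out of a log excerpt like this:

```javascript
// Sketch: extract "transition to <STATE>" events from mongod log lines.
// Assumes lines of the form: <timestamp> <severity> REPL [<thread>] transition to <STATE>
function stateTransitions(logLines) {
  var pattern = /^(\S+)\s+\S+\s+REPL\s+\[[^\]]+\]\s+transition to (\w+)/;
  var found = [];
  for (var i = 0; i < logLines.length; i++) {
    var match = pattern.exec(logLines[i].trim());
    if (match) {
      found.push({ time: match[1], state: match[2] });
    }
  }
  return found;
}

// Lines copied from the log excerpt above
var excerpt = [
  "2017-12-22T11:27:03.429+0800 I REPL [rsSync] initial sync done",
  "2017-12-22T11:27:03.474+0800 I REPL [ReplicationExecutor] transition to RECOVERING",
  "2017-12-22T11:27:03.476+0800 I REPL [ReplicationExecutor] transition to SECONDARY"
];
var transitions = stateTransitions(excerpt);
```

For this excerpt the result contains two entries, RECOVERING followed by SECONDARY. The brief RECOVERING step right after the initial sync is visible in the log itself.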
    

    The status of the recovered node is as follows:

    
    
        arps:SECONDARY> rs.status()
        ...............
                        {
                                "_id" : 2,
                                "name" : "172.17.4.39:27017",
                                "health" : 1,
                                "state" : 2,
                                "stateStr" : "SECONDARY",// recovery complete
                                "uptime" : 2500,
                                "optime" : Timestamp(1513913295, 3),
                                "optimeDate" : ISODate("2017-12-22T03:28:15Z"),
                                "syncingTo" : "172.17.4.38:27017",
                                "configVersion" : 1,
                                "self" : true
                        }
        .................
    
    
  • Original article: https://www.cnblogs.com/zhangshengdong/p/11732214.html