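MongoDB lives up to its reputation as a fairly full-featured NoSQL database, and its high-availability support is noticeably well developed.
Master-slave replication is simple to set up; the key is the --master, --slave, and --source options. The commands to start the master and the slave are as follows: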
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --master --rest --nojournal
all output going to: /var/log/mongodb/mongodb.log
log file [/var/log/mongodb/mongodb.log] exists; copied to temporary file [/var/log/mongodb/mongodb.log.2014-03-15T02-46-11]
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --master --rest --nojournal --fork
forked process: 5969
all output going to: /var/log/mongodb/mongodb.log
log file [/var/log/mongodb/mongodb.log] exists; copied to temporary file [/var/log/mongodb/mongodb.log.2014-03-15T02-46-58]
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb_slave --logpath /var/log/mongodb/mongodb_slave.log --port 10001 --slave --source localhost:10000 --rest --nojournal --fork
forked process: 5987
all output going to: /var/log/mongodb/mongodb_slave.log
log file [/var/log/mongodb/mongodb_slave.log] exists; copied to temporary file [/var/log/mongodb/mongodb_slave.log.2014-03-15T02-47-12]
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ ps -ef | grep mongod
root 5969 1 0 10:46 ? 00:00:00 mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --master --rest --nojournal --fork
root 5987 1 0 10:47 ? 00:00:00 mongod --dbpath /var/lib/mongodb_slave --logpath /var/log/mongodb/mongodb_slave.log --port 10001 --slave --source localhost:10000 --rest --nojournal --fork
XXXXX 6050 2732 0 10:48 pts/0 00:00:00 grep --color=auto mongod
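As shown, the master process was started with --master and listens on port 10000. The slave process was started afterwards with --slave --source localhost:10000, so it replicates from the master, and it listens on port 10001. With master-slave replication configured, changes made on the master are synced to the slave almost immediately, as shown below: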
XXXXX@XXXXX-asus:~$ mongo localhost:10000
MongoDB shell version: 2.2.4
connecting to: localhost:10000/test
> use test
switched to db test
> db.master_slave.insert({"abc":123})
> db.master_slave.find()
{ "_id" : ObjectId("5323c1594678819d9c4323a2"), "abc" : 123 }
> exit
bye
XXXXX@XXXXX-asus:~$ mongo localhost:10001
MongoDB shell version: 2.2.4
connecting to: localhost:10001/test
> use test
switched to db test
> db.master_slave.find()
{ "_id" : ObjectId("5323c1594678819d9c4323a2"), "abc" : 123 }
> exit
bye
XXXXX@XXXXX-asus:~$ clear
I first logged in to the master and inserted a document; the change was visible on the slave right away.
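A MongoDB replica set consists of one primary and several replicas (secondaries) that continuously sync data from the primary. By default only the primary serves reads and writes; the secondaries do not serve client requests. If the primary goes offline unexpectedly, the remaining members promptly elect a new primary to take over from the old one.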
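To make the election idea concrete, here is a minimal sketch of the majority rule behind it: a member can only become primary if it can gather votes from more than half of the set. The function name canElectPrimary is hypothetical, and the sketch assumes every member holds exactly one vote; it is an illustration, not MongoDB's actual election code.

```javascript
// Sketch: does a group of reachable members hold a voting majority?
// Assumption: each member of the replica set has exactly one vote.
function canElectPrimary(totalMembers, reachableMembers) {
  if (reachableMembers < 0 || reachableMembers > totalMembers) {
    throw new RangeError("reachableMembers must be between 0 and totalMembers");
  }
  // A strict majority is more than half of all members.
  var majority = Math.floor(totalMembers / 2) + 1;
  return reachableMembers >= majority;
}

console.log(canElectPrimary(3, 2)); // true: 2 of 3 is a majority
console.log(canElectPrimary(3, 1)); // false: 1 of 3 is not
```

This is why a three-member set, like the one built below, can lose one member and still elect a new primary.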
Configuring a replica set is slightly more involved. First, start several mongod processes as members of the set, as follows:
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --replSet testrep --nojournal --fork
forked process: 6862
all output going to: /var/log/mongodb/mongodb.log
log file [/var/log/mongodb/mongodb.log] exists; copied to temporary file [/var/log/mongodb/mongodb.log.2014-03-15T03-07-53]
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb1 --logpath /var/log/mongodb/mongodb1.log --port 10001 --replSet testrep --nojournal --fork
forked process: 6915
all output going to: /var/log/mongodb/mongodb1.log
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb2 --logpath /var/log/mongodb/mongodb2.log --port 10002 --replSet testrep --nojournal --fork
forked process: 6964
all output going to: /var/log/mongodb/mongodb2.log
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$
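The steps above start three mongod processes listening on ports 10000, 10001, and 10002, which together form the replica set testrep. The key here is the --replSet option.
The configuration is not complete yet, though: you still need to log in to MongoDB and initialize the replica set, as follows: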
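Since the three commands differ only in port, data directory, and log file, they can be generated from a small table. The sketch below does that with a hypothetical helper, replSetMemberCommand; it only assembles the command strings used above and does not start any process.

```javascript
// Sketch: assemble the mongod command line for one replica set member.
// replSetMemberCommand is a hypothetical helper; the flags are the ones used above.
function replSetMemberCommand(setName, port, dbpath, logpath) {
  return [
    "mongod",
    "--dbpath", dbpath,
    "--logpath", logpath,
    "--port", String(port),
    "--replSet", setName,
    "--nojournal",
    "--fork"
  ].join(" ");
}

// The three members started above.
var members = [
  { port: 10000, dbpath: "/var/lib/mongodb", logpath: "/var/log/mongodb/mongodb.log" },
  { port: 10001, dbpath: "/var/lib/mongodb1", logpath: "/var/log/mongodb/mongodb1.log" },
  { port: 10002, dbpath: "/var/lib/mongodb2", logpath: "/var/log/mongodb/mongodb2.log" }
];

members.forEach(function (m) {
  console.log(replSetMemberCommand("testrep", m.port, m.dbpath, m.logpath));
});
```

Every member must be given the same set name, while the port and data directory must be unique per process when they all run on one machine.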
XXXXX@XXXXX-asus:~$ mongo localhost:10000
MongoDB shell version: 2.2.4
connecting to: localhost:10000/test
> rs.initiate({"_id":"testrep","members":[
... {"_id":1, "host":"localhost:10000"},
... {"_id":2, "host":"localhost:10001"},
... {"_id":3, "host":"localhost:10002"}
... ]})
{
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}
>
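The document passed to rs.initiate above has a regular shape, so it can be built from the set name and a list of hosts. The sketch below uses a hypothetical helper, buildReplSetConfig, and assumes member _id values simply count up from 1 as in the session above.

```javascript
// Sketch: build the configuration document passed to rs.initiate().
// buildReplSetConfig is a hypothetical helper, not part of the mongo shell.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName, // must equal the --replSet name given to every mongod
    members: hosts.map(function (host, index) {
      // Assumption: member ids count up from 1, as in the session above.
      return { _id: index + 1, host: host };
    })
  };
}

var config = buildReplSetConfig("testrep", [
  "localhost:10000",
  "localhost:10001",
  "localhost:10002"
]);
console.log(JSON.stringify(config, null, 2));
```

In the mongo shell the resulting object could then be passed on as rs.initiate(config).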
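rs.initiate initializes the replica set; its arguments must match the options used when starting the mongod processes. Initialization takes a little while. Once it finishes, the mongod listening on port 10000 becomes the primary and the other two become secondaries. The state transitions are visible in the logs. The primary's log looks like this: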
Sat Mar 15 11:15:43 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Sat Mar 15 11:15:48 [initandlisten] connection accepted from 127.0.0.1:59140 #3 (1 connection now open)
Sat Mar 15 11:15:52 [conn3] replSet replSetInitiate admin command received from client
Sat Mar 15 11:15:52 [conn3] replSet replSetInitiate config object parses ok, 3 members specified
Sat Mar 15 11:15:52 [conn3] replSet replSetInitiate all members seem up
Sat Mar 15 11:15:52 [conn3] ******
Sat Mar 15 11:15:52 [conn3] creating replication oplog of size: 1810MB...
Sat Mar 15 11:15:52 [FileAllocator] allocating new datafile /var/lib/mongodb/local.1, filling with zeroes...
Sat Mar 15 11:15:52 [FileAllocator] creating directory /var/lib/mongodb/_tmp
Sat Mar 15 11:16:37 [FileAllocator] done allocating datafile /var/lib/mongodb/local.1, size: 2047MB, took 44.99 secs
Sat Mar 15 11:16:41 [conn3] ******
Sat Mar 15 11:16:41 [conn3] replSet info saving a newer config version to local.system.replset
Sat Mar 15 11:16:41 [conn3] replSet saveConfigLocally done
Sat Mar 15 11:16:41 [conn3] replSet replSetInitiate config now saved locally. Should come online in about a minute.
Sat Mar 15 11:16:41 [conn3] command admin.$cmd command: { replSetInitiate: { _id: "testrep", members: [ { _id: 1.0, host: "localhost:10000" }, { _id: 2.0, host: "localhost:10001" }, { _id: 3.0, host: "localhost:10002" } ] } } ntoreturn:1 keyUpdates:0 locks(micros) W:49581556 reslen:112 49587ms
Sat Mar 15 11:16:41 [rsStart] replSet I am localhost:10000
Sat Mar 15 11:16:41 [rsStart] replSet STARTUP2
Sat Mar 15 11:16:41 [rsHealthPoll] replSet member localhost:10001 is up
Sat Mar 15 11:16:42 [rsSync] replSet SECONDARY
Sat Mar 15 11:16:43 [rsHealthPoll] replSet member localhost:10002 is up
Sat Mar 15 11:16:43 [rsMgr] replSet info electSelf 1
Sat Mar 15 11:16:43 [rsMgr] replSet couldn't elect self, only received 1 votes
Sat Mar 15 11:16:45 [initandlisten] connection accepted from 127.0.0.1:59173 #4 (2 connections now open)
Sat Mar 15 11:16:47 [rsHealthPoll] replSet member localhost:10002 is now in state STARTUP2
Sat Mar 15 11:16:47 [rsMgr] not electing self, localhost:10002 would veto
Sat Mar 15 11:16:49 [initandlisten] connection accepted from 127.0.0.1:59177 #5 (3 connections now open)
Sat Mar 15 11:17:01 [conn4] end connection 127.0.0.1:59173 (2 connections now open)
Sat Mar 15 11:17:01 [initandlisten] connection accepted from 127.0.0.1:59185 #6 (3 connections now open)
Sat Mar 15 11:17:01 [rsHealthPoll] DBClientCursor::init call() failed
Sat Mar 15 11:17:02 [rsHealthPoll] replSet info localhost:10001 is down (or slow to respond): DBClientBase::findN: transport error: localhost:10001 ns: admin.$cmd query: { replSetHeartbeat: "testrep", v: 1, pv: 1, checkEmpty: false, from: "localhost:10000", $auth: {} }
Sat Mar 15 11:17:02 [rsHealthPoll] replSet member localhost:10001 is now in state DOWN
The log on a secondary looks like this:
Sat Mar 15 11:16:39 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Sat Mar 15 11:16:47 [initandlisten] connection accepted from 127.0.0.1:59521 #2 (2 connections now open)
Sat Mar 15 11:16:49 [rsStart] trying to contact localhost:10000
Sat Mar 15 11:16:52 [rsStart] trying to contact localhost:10002
Sat Mar 15 11:17:02 [initandlisten] connection accepted from 127.0.0.1:59532 #3 (3 connections now open)
Sat Mar 15 11:17:03 [rsStart] DBClientCursor::init call() failed
Sat Mar 15 11:17:03 [conn2] command admin.$cmd command: { replSetHeartbeat: "testrep", v: 1, pv: 1, checkEmpty: false, from: "localhost:10002", $auth: {} } ntoreturn:1 keyUpdates:0 reslen:72 11356ms
Sat Mar 15 11:17:03 [conn1] command admin.$cmd command: { replSetHeartbeat: "testrep", v: 1, pv: 1, checkEmpty: false, from: "localhost:10000", $auth: {} } ntoreturn:1 keyUpdates:0 reslen:72 11177ms
Sat Mar 15 11:17:03 [conn1] end connection 127.0.0.1:59489 (2 connections now open)
Sat Mar 15 11:17:03 [rsStart] replSet I am localhost:10001
Sat Mar 15 11:17:03 [rsStart] replSet got config version 1 from a remote, saving locally
Sat Mar 15 11:17:03 [rsStart] replSet info saving a newer config version to local.system.replset
Sat Mar 15 11:17:03 [FileAllocator] allocating new datafile /var/lib/mongodb1/local.ns, filling with zeroes...
Sat Mar 15 11:17:03 [FileAllocator] creating directory /var/lib/mongodb1/_tmp
Sat Mar 15 11:17:04 [conn2] end connection 127.0.0.1:59521 (1 connection now open)
Sat Mar 15 11:17:04 [initandlisten] connection accepted from 127.0.0.1:59535 #4 (2 connections now open)
Sat Mar 15 11:17:06 [conn3] end connection 127.0.0.1:59532 (1 connection now open)
Sat Mar 15 11:17:06 [initandlisten] connection accepted from 127.0.0.1:59537 #5 (3 connections now open)
Sat Mar 15 11:17:07 [FileAllocator] done allocating datafile /var/lib/mongodb1/local.ns, size: 16MB, took 1.438 secs
Sat Mar 15 11:17:07 [FileAllocator] allocating new datafile /var/lib/mongodb1/local.0, filling with zeroes...
Sat Mar 15 11:17:13 [FileAllocator] done allocating datafile /var/lib/mongodb1/local.0, size: 64MB, took 5.174 secs
Sat Mar 15 11:17:13 [FileAllocator] allocating new datafile /var/lib/mongodb1/local.1, filling with zeroes...
Sat Mar 15 11:17:15 [rsStart] replSet saveConfigLocally done
Sat Mar 15 11:17:16 [rsStart] replSet STARTUP2
Sat Mar 15 11:17:16 [rsSync] ******
Sat Mar 15 11:17:16 [rsSync] creating replication oplog of size: 1631MB...
Sat Mar 15 11:17:17 [rsHealthPoll] replSet member localhost:10000 is up
Sat Mar 15 11:17:17 [rsHealthPoll] replSet member localhost:10000 is now in state SECONDARY
Sat Mar 15 11:17:17 [rsHealthPoll] replSet member localhost:10002 is up
Sat Mar 15 11:17:17 [rsHealthPoll] replSet member localhost:10002 is now in state STARTUP2
Sat Mar 15 11:17:21 [FileAllocator] done allocating datafile /var/lib/mongodb1/local.1, size: 128MB, took 7.663 secs
Sat Mar 15 11:17:21 [FileAllocator] allocating new datafile /var/lib/mongodb1/local.2, filling with zeroes...
Sat Mar 15 11:17:26 [conn4] end connection 127.0.0.1:59535 (1 connection now open)
You can also log in to the primary and check the state directly with rs.status():
XXXXX@XXXXX-asus:~$ mongo localhost:10000
MongoDB shell version: 2.2.4
connecting to: localhost:10000/test
testrep:PRIMARY> rs.status()
{
    "set" : "testrep",
    "date" : ISODate("2014-03-15T05:18:36Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "localhost:10000",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 7843,
            "optime" : Timestamp(1394853401000, 1),
            "optimeDate" : ISODate("2014-03-15T03:16:41Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "localhost:10001",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 7292,
            "optime" : Timestamp(1394853401000, 1),
            "optimeDate" : ISODate("2014-03-15T03:16:41Z"),
            "lastHeartbeat" : ISODate("2014-03-15T05:18:35Z"),
            "pingMs" : 0
        },
        {
            "_id" : 3,
            "name" : "localhost:10002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 7313,
            "optime" : Timestamp(1394853401000, 1),
            "optimeDate" : ISODate("2014-03-15T03:16:41Z"),
            "lastHeartbeat" : ISODate("2014-03-15T05:18:34Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}
testrep:PRIMARY>
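Notice that on logging in to the primary again, the prompt has changed to testrep:PRIMARY>, which indicates that the current connection is to the replica set's primary node.
At this point both reads and writes work on the primary, but by default the secondaries accept neither, as shown below: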
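When the output gets long, it can help to pull out just the fields of interest. The sketch below works on a plain object shaped like the rs.status() result above; findPrimary and summarizeMembers are hypothetical helpers, and the sample object is an abbreviated copy of that output with only the name, health, state, and stateStr fields.

```javascript
// Sketch: extract the primary and a short summary from an rs.status()-like object.
// Both helpers are hypothetical; state 1 is PRIMARY in the output shown above.
function findPrimary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].state === 1) {
      return members[i].name;
    }
  }
  return null; // no primary at the moment, e.g. during an election
}

function summarizeMembers(status) {
  return (status.members || []).map(function (m) {
    return m.name + " " + m.stateStr;
  });
}

// Abbreviated copy of the rs.status() output above.
var sampleStatus = {
  set: "testrep",
  members: [
    { name: "localhost:10000", health: 1, state: 1, stateStr: "PRIMARY" },
    { name: "localhost:10001", health: 1, state: 2, stateStr: "SECONDARY" },
    { name: "localhost:10002", health: 1, state: 2, stateStr: "SECONDARY" }
  ]
};

console.log(findPrimary(sampleStatus)); // localhost:10000
console.log(summarizeMembers(sampleStatus).join("\n"));
```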
XXXXX@XXXXX-asus:~$ mongo localhost:10000
MongoDB shell version: 2.2.4
connecting to: localhost:10000/test
testrep:PRIMARY> use test
switched to db test
testrep:PRIMARY> db.testrep.insert({"abcd":1234})
testrep:PRIMARY> db.testrep.find()
{ "_id" : ObjectId("5323e46faeb9bd3d8d02e2b6"), "abcd" : 1234 }
testrep:PRIMARY> exit
bye
XXXXX@XXXXX-asus:~$ mongo localhost:10001
MongoDB shell version: 2.2.4
connecting to: localhost:10001/test
testrep:SECONDARY> use test
switched to db test
testrep:SECONDARY> db.testrep.find()
error: { "$err" : "not master and slaveOk=false", "code" : 13435 }
testrep:SECONDARY> db.testrep.insert({"efgh":5678})
not master
testrep:SECONDARY> exit
bye
XXXXX@XXXXX-asus:~$
If the primary now goes offline, one of the two secondaries is elected as the new primary. Here kill -9 is used to simulate the primary going offline:
XXXXX@XXXXX-asus:~$ ps -ef | grep mongo
root 6862 1 1 11:07 ? 00:01:30 mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --replSet testrep --nojournal --fork
root 6915 1 0 11:08 ? 00:01:26 mongod --dbpath /var/lib/mongodb1 --logpath /var/log/mongodb/mongodb1.log --port 10001 --replSet testrep --nojournal --fork
root 6964 1 1 11:08 ? 00:01:27 mongod --dbpath /var/lib/mongodb2 --logpath /var/log/mongodb/mongodb2.log --port 10002 --replSet testrep --nojournal --fork
XXXXX 13047 5380 0 13:33 pts/2 00:00:00 grep --color=auto mongo
XXXXX@XXXXX-asus:~$ sudo kill -9 6862
XXXXX@XXXXX-asus:~$ ps -ef | grep mongo
root 6915 1 0 11:08 ? 00:01:26 mongod --dbpath /var/lib/mongodb1 --logpath /var/log/mongodb/mongodb1.log --port 10001 --replSet testrep --nojournal --fork
root 6964 1 1 11:08 ? 00:01:27 mongod --dbpath /var/lib/mongodb2 --logpath /var/log/mongodb/mongodb2.log --port 10002 --replSet testrep --nojournal --fork
XXXXX 13058 5380 0 13:34 pts/2 00:00:00 grep --color=auto mongo
XXXXX@XXXXX-asus:~$
The secondary's log now shows that it has replaced the original primary. The new primary is the mongod process that listens on port 10001:
Sat Mar 15 13:33:59 [initandlisten] connection accepted from 127.0.0.1:35927 #556 (2 connections now open)
Sat Mar 15 13:34:00 [conn555] end connection 127.0.0.1:35925 (1 connection now open)
Sat Mar 15 13:34:00 [rsBackgroundSync] replSet db exception in producer: 10278 dbclient error communicating with server: localhost:10000
Sat Mar 15 13:34:01 [rsHealthPoll] DBClientCursor::init call() failed
Sat Mar 15 13:34:01 [rsHealthPoll] replSet info localhost:10000 is down (or slow to respond): DBClientBase::findN: transport error: localhost:10000 ns: admin.$cmd query: { replSetHeartbeat: "testrep", v: 1, pv: 1, checkEmpty: false, from: "localhost:10001", $auth: {} }
Sat Mar 15 13:34:01 [rsHealthPoll] replSet member localhost:10000 is now in state DOWN
Sat Mar 15 13:34:02 [rsMgr] replSet info electSelf 2
Sat Mar 15 13:34:02 [rsMgr] replSet PRIMARY
Sat Mar 15 13:34:03 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:03 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:05 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:07 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:09 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:10 [initandlisten] connection accepted from 127.0.0.1:35945 #557 (2 connections now open)
Sat Mar 15 13:34:11 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
Sat Mar 15 13:34:13 [rsHealthPoll] couldn't connect to localhost:10000: couldn't connect to server localhost:10000
You can also log in to the new primary and check the status of the replica set directly:
XXXXX@XXXXX-asus:~$ mongo localhost:10001
MongoDB shell version: 2.2.4
connecting to: localhost:10001/test
testrep:PRIMARY> rs.status()
{
    "set" : "testrep",
    "date" : ISODate("2014-03-15T05:39:24Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "localhost:10000",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1394861167000, 1),
            "optimeDate" : ISODate("2014-03-15T05:26:07Z"),
            "lastHeartbeat" : ISODate("2014-03-15T05:33:59Z"),
            "pingMs" : 0,
            "errmsg" : "socket exception [CONNECT_ERROR] for localhost:10000"
        },
        {
            "_id" : 2,
            "name" : "localhost:10001",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 9065,
            "optime" : Timestamp(1394861167000, 1),
            "optimeDate" : ISODate("2014-03-15T05:26:07Z"),
            "self" : true
        },
        {
            "_id" : 3,
            "name" : "localhost:10002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 8527,
            "optime" : Timestamp(1394861167000, 1),
            "optimeDate" : ISODate("2014-03-15T05:26:07Z"),
            "lastHeartbeat" : ISODate("2014-03-15T05:39:23Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}
testrep:PRIMARY>
As you can see, the member that was listening on port 10000 now has the state "(not reachable/healthy)", while the new primary can serve queries and inserts:
XXXXX@XXXXX-asus:~$ mongo localhost:10001
MongoDB shell version: 2.2.4
connecting to: localhost:10001/test
testrep:PRIMARY> use test
switched to db test
testrep:PRIMARY> db.testrep.find()
{ "_id" : ObjectId("5323e46faeb9bd3d8d02e2b6"), "abcd" : 1234 }
testrep:PRIMARY> db.testrep.insert({"efgh":5678})
testrep:PRIMARY> db.testrep.find()
{ "_id" : ObjectId("5323e46faeb9bd3d8d02e2b6"), "abcd" : 1234 }
{ "_id" : ObjectId("5323e8d10a61cc7b258aac5a"), "efgh" : 5678 }
testrep:PRIMARY>
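For monitoring, the post-failover rs.status() output shown earlier can be reduced to the list of members that need attention. The sketch below uses a hypothetical helper, unreachableMembers, on an abbreviated copy of that output; it assumes, as that output suggests, that a health value of 0 marks a member the current node cannot reach.

```javascript
// Sketch: list the members that rs.status() reports as unreachable.
// unreachableMembers is a hypothetical helper; health 0 is taken to mean "down".
function unreachableMembers(status) {
  return (status.members || [])
    .filter(function (m) { return m.health === 0; })
    .map(function (m) {
      return { name: m.name, reason: m.errmsg || m.stateStr };
    });
}

// Abbreviated copy of the rs.status() output taken after the failover.
var afterFailover = {
  set: "testrep",
  members: [
    { name: "localhost:10000", health: 0, state: 8,
      stateStr: "(not reachable/healthy)",
      errmsg: "socket exception [CONNECT_ERROR] for localhost:10000" },
    { name: "localhost:10001", health: 1, state: 1, stateStr: "PRIMARY" },
    { name: "localhost:10002", health: 1, state: 2, stateStr: "SECONDARY" }
  ]
};

unreachableMembers(afterFailover).forEach(function (m) {
  console.log(m.name + ": " + m.reason);
});
```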
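If you now try to restart the process that was listening on port 10000, it will most likely fail to start: the database exited abnormally, so its lock was never released. Run a repair on the database first, and then start it: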
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --replSet testrep --nojournal --repair --fork
forked process: 16902
all output going to: /var/log/mongodb/mongodb.log
log file [/var/log/mongodb/mongodb.log] exists; copied to temporary file [/var/log/mongodb/mongodb.log.2014-03-15T06-00-10]
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --replSet testrep --nojournal --fork
forked process: 17059
all output going to: /var/log/mongodb/mongodb.log
log file [/var/log/mongodb/mongodb.log] exists; copied to temporary file [/var/log/mongodb/mongodb.log.2014-03-15T06-02-04]
child process started successfully, parent exiting
XXXXX@XXXXX-asus:~$ ps -ef | grep mongo
root 6915 1 0 11:08 ? 00:01:44 mongod --dbpath /var/lib/mongodb1 --logpath /var/log/mongodb/mongodb1.log --port 10001 --replSet testrep --nojournal --fork
root 6964 1 1 11:08 ? 00:01:44 mongod --dbpath /var/lib/mongodb2 --logpath /var/log/mongodb/mongodb2.log --port 10002 --replSet testrep --nojournal --fork
XXXXX 13123 5712 0 13:34 pts/3 00:00:00 vim /var/log/mongodb/mongodb1.log
XXXXX 15728 5086 0 13:51 pts/1 00:00:00 mongo localhost:10001
root 17059 1 3 14:02 ? 00:00:00 mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb/mongodb.log --port 10000 --replSet testrep --nojournal --fork
XXXXX 17131 5380 0 14:02 pts/2 00:00:00 grep --color=auto mongo
XXXXX@XXXXX-asus:~$
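Running rs.status() again now shows output almost identical to the initial state. The only difference is that the process on port 10000 is now a secondary and the process on port 10001 is the primary.
In addition, MongoDB now officially recommends replica sets rather than master-slave replication for building high availability. Other commonly used replica set commands include rs.add("localhost:10000") and rs.remove("localhost:10000"), which add and remove a member of the replica set respectively; they will be covered in a later article.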
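As a rough mental model until then, adding or removing a member can be thought of as producing a new configuration document with a changed member list and a higher version number. The sketch below models only that idea on plain objects; addMember and removeMember are hypothetical helpers, and the id and version handling is an assumption made for illustration, not a description of what rs.add and rs.remove do internally.

```javascript
// Sketch: model membership changes as new configuration documents.
// addMember/removeMember are hypothetical; they never modify their input.
function addMember(config, host) {
  var maxId = config.members.reduce(function (max, m) {
    return Math.max(max, m._id);
  }, 0);
  return {
    _id: config._id,
    version: (config.version || 1) + 1, // assumption: each change bumps the version
    members: config.members.concat([{ _id: maxId + 1, host: host }])
  };
}

function removeMember(config, host) {
  var remaining = config.members.filter(function (m) { return m.host !== host; });
  if (remaining.length === config.members.length) {
    throw new Error("no such member: " + host);
  }
  return { _id: config._id, version: (config.version || 1) + 1, members: remaining };
}

var initial = {
  _id: "testrep",
  version: 1,
  members: [
    { _id: 1, host: "localhost:10000" },
    { _id: 2, host: "localhost:10001" },
    { _id: 3, host: "localhost:10002" }
  ]
};

var withoutFirst = removeMember(initial, "localhost:10000");
var readded = addMember(withoutFirst, "localhost:10000");
console.log(JSON.stringify(readded, null, 2));
```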