I. Introduction to replica sets

Replica sets exist to make MongoDB highly available.

MongoDB(M) denotes the primary node, MongoDB(S) the secondary node, and MongoDB(A) the arbiter node. The primary and secondary nodes store data; the arbiter does not. Clients connect to the primary and secondary nodes, but not to the arbiter.

An arbiter is a special node that stores no data. Its main job is to vote on which secondary node is promoted to primary after the primary fails, so clients never need to connect to it.

In a MongoDB replica set, the primary node handles client read and write requests, and the secondary nodes mirror the primary's data.

A secondary node works roughly as follows: it periodically polls the primary for data operations and then applies those operations to its own copy of the data, which keeps it in sync with the primary. Every operation that changes database state on the primary is recorded in a dedicated system collection, and the secondary updates its own data from those records.

The record of these state-changing operations is called the oplog (operation log). It is stored in the oplog.rs collection of the local database. Secondary nodes in the replica set copy the oplog asynchronously from the primary and replay the recorded operations to stay in sync.
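The replay idea can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not MongoDB's implementation: the entry shape ({ ts, op, doc, id, set }) is simplified for illustration, and applyOp and syncFromOplog are hypothetical helper names. Only the op codes "i", "u", and "d" mirror the insert, update, and delete codes that real oplog entries use.

```javascript
// Hypothetical in-memory model of oplog replay (simplified entry shape).

// Apply one oplog-style entry to a collection modeled as a Map keyed by _id.
function applyOp(collection, entry) {
  switch (entry.op) {
    case "i": // insert the full document
      collection.set(entry.doc._id, { ...entry.doc });
      break;
    case "u": // update: merge the changed fields into the existing document
      collection.set(entry.id, { ...collection.get(entry.id), ...entry.set });
      break;
    case "d": // delete by _id
      collection.delete(entry.id);
      break;
    default:
      throw new Error("unknown op: " + entry.op);
  }
}

// Replay every entry newer than the secondary's last applied timestamp.
function syncFromOplog(oplog, secondary) {
  for (const entry of oplog) {
    if (entry.ts > secondary.lastApplied) {
      applyOp(secondary.data, entry);
      secondary.lastApplied = entry.ts;
    }
  }
  return secondary;
}

// Operations recorded on the (modeled) primary, in order.
const sampleOplog = [
  { ts: 1, op: "i", doc: { _id: 1, name: "Joe", index: 0 } },
  { ts: 2, op: "u", id: 1, set: { index: 5 } },
  { ts: 3, op: "i", doc: { _id: 2, name: "Joe", index: 1 } },
  { ts: 4, op: "d", id: 2 },
];

const modeledSecondary = syncFromOplog(sampleOplog, { lastApplied: 0, data: new Map() });
console.log([...modeledSecondary.data.values()]);
```

Because the secondary remembers the timestamp of the last entry it applied, calling syncFromOplog again with the same log changes nothing; only newer entries are replayed.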

Notes on the oplog:

  • The oplog has a fixed size. Once the collection is full, newly inserted entries overwrite the oldest ones.
  • The oplog drives data synchronization.

Initial sync: This happens when a new database is created in the replica set, when a node has just recovered from an outage, or when a new member is added to the replica set. By default, a node copies the oplog from its nearest member to synchronize data. The nearest member can be the primary or a secondary that holds an up-to-date copy of the oplog.

II. Set up a replica set with an arbiter node

1. Go to /usr/java and create the mongodbRepliSet folder. The three nodes live inside this folder; start by creating the first one: mkdir node1.

2. Go to node1 and create the data and log folders (mkdir data log), then go to data and run mkdir db.

3. Copy the mongodb configuration file to node1.

cp /usr/java/mongoNode/mongodb.conf /usr/java/mongodbRepliSet/node1/mongodb.conf

4. Edit the configuration file: vim mongodb.conf

dbpath=/usr/java/mongodbRepliSet/node1/data/db
logpath=/usr/java/mongodbRepliSet/node1/log/mongodb.log
logappend=true
fork=true
bind_ip=192.168.80.128
port=27017
# replSet must be the same on all 3 nodes; it marks them as members of one replica set
replSet=JoeSet

5. Copy the entire node1 folder to node2 and node3:

cp -r node1 node2
cp -r node1 node3

6. Edit mongodb.conf in the node2 and node3 folders: point dbpath and logpath at each node's own folder, and set port to 27018 and 27019 respectively.
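Since the three files differ only in their paths and port, the per-node values can be generated rather than edited by hand. The following is a minimal sketch, assuming the directory layout and key=value config format used in the steps above; renderMongodConf is a hypothetical helper, not part of MongoDB.

```javascript
// Hypothetical helper: renders the key=value mongod config for one node.
function renderMongodConf(baseDir, nodeName, port, replSet, bindIp) {
  const nodeDir = `${baseDir}/${nodeName}`;
  return [
    `dbpath=${nodeDir}/data/db`,
    `logpath=${nodeDir}/log/mongodb.log`,
    "logappend=true",
    "fork=true",
    `bind_ip=${bindIp}`,
    `port=${port}`,
    `replSet=${replSet}`, // identical on every node of the set
  ].join("\n");
}

const baseDir = "/usr/java/mongodbRepliSet";
const nodePorts = { node1: 27017, node2: 27018, node3: 27019 };

// Build one config text per node; only the paths and the port differ.
const nodeConfs = {};
for (const [nodeName, port] of Object.entries(nodePorts)) {
  nodeConfs[nodeName] = renderMongodConf(baseDir, nodeName, port, "JoeSet", "192.168.80.128");
}

console.log(nodeConfs.node2);
```

Each generated text would then be written to the corresponding node folder as mongodb.conf.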

7. Run the export PATH=/usr/java/mongodb/bin:$PATH command to configure the temporary environment variable

8. Run echo $PATH to check whether the temporary environment variable is configured successfully

9. Start the three nodes by running mongod --config mongodb.conf inside node1, node2, and node3 respectively.

10. Connect to node1: mongo --host 192.168.80.128 --port 27017

11. The configuration above does not specify which node is the primary, the secondary, or the arbiter, so the replica set must be initialized:

rs.initiate({"_id":"JoeSet",members:[{"_id":1,"host":"192.168.80.128:27017",priority:3},{"_id":2,"host":"192.168.80.128:27018",priority:9},{"_id":3,"host":"192.168.80.128:27019",arbiterOnly:true}]})

  1. "_id": the name of the replica set
  2. "members": the list of servers in the replica set
  3. "_id" (inside a member): the unique ID of that server
  4. "host": the server's host and port
  5. "priority": the election priority. The default is 1. A member with priority 0 is passive and can never become primary. Among members with a nonzero priority, the one with the highest priority is preferred as primary.
  6. "arbiterOnly": marks an arbiter, which only votes in elections, receives no data, and can never become primary.
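The fields above can be checked before the document is handed to rs.initiate(). The sketch below is illustrative only: validateReplSetConfig and electableByPriority are hypothetical helpers that cover a small subset of the rules (a non-empty set name, unique member _id values, unique hosts), and the ordering simply follows the priority rule described in the list.

```javascript
// Priority defaults to 1 when a member does not set it.
function priorityOf(member) {
  return member.priority === undefined ? 1 : member.priority;
}

// Hypothetical pre-flight check; returns a list of problems (empty if none found).
function validateReplSetConfig(config) {
  const errors = [];
  if (typeof config._id !== "string" || config._id.length === 0) {
    errors.push("_id must be a non-empty replica set name");
  }
  const seenIds = new Set();
  const seenHosts = new Set();
  for (const member of config.members || []) {
    if (seenIds.has(member._id)) errors.push("duplicate member _id: " + member._id);
    if (seenHosts.has(member.host)) errors.push("duplicate host: " + member.host);
    seenIds.add(member._id);
    seenHosts.add(member.host);
  }
  return errors;
}

// Hosts that may become primary, highest priority first.
// Arbiters and priority-0 members are excluded.
function electableByPriority(config) {
  return config.members
    .filter((m) => !m.arbiterOnly && priorityOf(m) > 0)
    .sort((a, b) => priorityOf(b) - priorityOf(a))
    .map((m) => m.host);
}

// The same document that is passed to rs.initiate() in step 11.
const joeSetConfig = {
  _id: "JoeSet",
  members: [
    { _id: 1, host: "192.168.80.128:27017", priority: 3 },
    { _id: 2, host: "192.168.80.128:27018", priority: 9 },
    { _id: 3, host: "192.168.80.128:27019", arbiterOnly: true },
  ],
};

console.log(validateReplSetConfig(joeSetConfig));
console.log(electableByPriority(joeSetConfig));
```

With this configuration the member on port 27018 sorts first, which matches the tutorial's outcome of node2 becoming the primary.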

After the initialization command is executed, the following information is displayed.

Beware of possible errors:

> rs.initiate({"_id":"JoeSet",members:[{"_id":1,"host":"192.168.80.128:27017",priority:3},{"_id":2,"host":"192.168.80.128:27018",priority:9},{"_id":4,"host":"192.168.80.128:27019",arbiterOnly:true}]})
{
    "ok" : 0,
    "errmsg" : "This node, 192.168.80.128:27019, with _id 4 is not electable under the new configuration version 1 for replica set JoeSet",
    "code" : 93
}

Cause: the client was connected to 27019, which is the node being configured as the arbiter. Solution: run the initialization from a client connected to one of the other nodes, not to the node that is set as the arbiter.

12. Run rs.status() to check the status.

JoeSet:OTHER> rs.status()
{
    "set" : "JoeSet",
    "date" : ISODate("2017-07-26T22:09:44.940Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",               # SECONDARY marks a secondary node
            "uptime" : 38,
            "optime" : Timestamp(1501106974, 1),
            "optimeDate" : ISODate("2017-07-26T22:09:34Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27018",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",               # PRIMARY marks the primary node
            "uptime" : 10,
            "optime" : Timestamp(1501106974, 1),
            "optimeDate" : ISODate("2017-07-26T22:09:34Z"),
            "lastHeartbeat" : ISODate("2017-07-26T22:09:44.560Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-26T22:09:44.579Z"),
            "pingMs" : 1,
            "electionTime" : Timestamp(1501106981, 1),
            "electionDate" : ISODate("2017-07-26T22:09:41Z"),
            "configVersion" : 1
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27019",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",               # ARBITER marks the arbiter node
            "uptime" : 10,
            "lastHeartbeat" : ISODate("2017-07-26T22:09:44.559Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-26T22:09:44.585Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

13. To verify that the replica set works, insert a few documents on node2 (the primary) and then read them on node1 (the secondary). The configuration above makes node2 the primary, node1 the secondary, and node3 the arbiter.

  1. On node2:
JoeSet:PRIMARY> show dbs
local  0.078GB
JoeSet:PRIMARY> use testdb
switched to db testdb
JoeSet:PRIMARY> db.createCollection("testCon")
{ "ok" : 1 }
JoeSet:PRIMARY> show collections
system.indexes
testCon
JoeSet:PRIMARY> for(var i=0; i<3; i++) db.testCon.insert({name:"Joe",index:i})
WriteResult({ "nInserted" : 1 })
JoeSet:PRIMARY> db.testCon.find()
{ "_id" : ObjectId("597a553d7db09ad9f77f1353"), "name" : "Joe", "index" : 0 }
{ "_id" : ObjectId("597a553d7db09ad9f77f1354"), "name" : "Joe", "index" : 1 }
{ "_id" : ObjectId("597a553d7db09ad9f77f1355"), "name" : "Joe", "index" : 2 }
  2. On node1:
JoeSet:SECONDARY> show dbs
2017-07-27T14:06:14.672-0700 E QUERY    Error: listDatabases failed:{ "note" : "from execCommand", "ok" : 0, "errmsg" : "not master" }
    at Error (<anonymous>)
    at Mongo.getDBs (src/mongo/shell/mongo.js:47:15)
    at shellHelper.show (src/mongo/shell/utils.js:630:33)
    at shellHelper (src/mongo/shell/utils.js:524:36)
    at (shellhelp2):1:1 at src/mongo/shell/mongo.js:47
JoeSet:SECONDARY> rs.slaveOk()
JoeSet:SECONDARY> show dbs
local   0.078GB
testdb  0.078GB
JoeSet:SECONDARY> use testdb
switched to db testdb
JoeSet:SECONDARY> show collections
system.indexes
testCon
JoeSet:SECONDARY> db.testCon.find()
{ "_id" : ObjectId("597a553d7db09ad9f77f1353"), "name" : "Joe", "index" : 0 }
{ "_id" : ObjectId("597a553d7db09ad9f77f1354"), "name" : "Joe", "index" : 1 }
{ "_id" : ObjectId("597a553d7db09ad9f77f1355"), "name" : "Joe", "index" : 2 }

If the data inserted on node2 (the primary) shows up on node1 (the secondary), the replica set is working.

Note: Write operations are not allowed on a secondary node.

III. Simulate primary node failure and recovery

  1. Simulate a failure of node2 (the primary) by killing the mongod process on node2.

  2. Connect a client to node1 (the secondary): mongo --host 192.168.80.128 --port 27017

JoeSet:PRIMARY> rs.status()
{
    "set" : "JoeSet",
    "date" : ISODate("2017-07-27T21:[…].629Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 4209,
            "optime" : Timestamp(1501189437, 10),
            "optimeDate" : ISODate("2017-07-27T21:03:57Z"),
            "electionTime" : Timestamp(1501190187, 1),
            "electionDate" : ISODate("2017-07-27T21:16:27Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27018",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2017-07-27T21:[…].773Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T21:16:24.292Z"),
            "pingMs" : 1,
            "lastHeartbeatMessage" : "Failed attempt to connect to 192.168.80.128:27018; couldn't connect to server 192.168.80.128:27018 (192.168.80.128), connection attempt failed",
            "configVersion" : 1
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27019",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 4191,
            "lastHeartbeat" : ISODate("2017-07-27T21:[…].475Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T21:[…].876Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

The original secondary (node1) has become the primary, and the original primary (node2) is shown as not reachable/healthy. The replica set as a whole remains usable.

  3. Restart node2, the original primary. If node2 fails to start, delete the mongod.lock file in its db folder.

  4. Connect to node1 again with mongo --host 192.168.80.128 --port 27017. After node2 restarts, node1 becomes a secondary again and node2 becomes the primary again, because node2 has the higher priority and the arbiter takes part in the election.

JoeSet:SECONDARY> rs.status()
{
    "set" : "JoeSet",
    "date" : ISODate("2017-07-27T21:31:06.197Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 4830,
            "optime" : Timestamp(1501189437, 10),
            "optimeDate" : ISODate("2017-07-27T21:03:57Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27018",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 87,
            "optime" : Timestamp(1501189437, 10),
            "optimeDate" : ISODate("2017-07-27T21:03:57Z"),
            "lastHeartbeat" : ISODate("2017-07-27T21:31:05.209Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T21:31:05.506Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1501190981, 1),
            "electionDate" : ISODate("2017-07-27T21:29:41Z"),
            "configVersion" : 1
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27019",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 4811,
            "lastHeartbeat" : ISODate("2017-07-27T21:31:06.085Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T21:31:05.633Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

If the arbiter is down, high availability is lost: when the primary then fails, the remaining secondary holds only one of the three votes, cannot win a majority, and is not promoted to primary. For this reason, a replica set without an arbiter is recommended, as follows.
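The failover condition comes down to vote counting. The sketch below is a simplified model rather than MongoDB's election code: it assumes every member holds exactly one vote, and majorityNeeded and canElectPrimary are hypothetical function names.

```javascript
// Votes required for a strict majority of all configured voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A primary can be elected only if a majority of members is reachable
// and at least one reachable member is a data-bearing, nonzero-priority node.
function canElectPrimary(members, downHosts) {
  const reachable = members.filter((m) => !downHosts.includes(m.host));
  const hasCandidate = reachable.some(
    (m) => !m.arbiterOnly && (m.priority === undefined ? 1 : m.priority) > 0
  );
  return hasCandidate && reachable.length >= majorityNeeded(members.length);
}

// Two data nodes plus one arbiter, as in the JoeSet example.
const withArbiter = [
  { host: "192.168.80.128:27017", priority: 3 },
  { host: "192.168.80.128:27018", priority: 9 },
  { host: "192.168.80.128:27019", arbiterOnly: true },
];

// Only the primary (27018) is down: 27017 plus the arbiter form a majority.
console.log(canElectPrimary(withArbiter, ["192.168.80.128:27018"]));

// The arbiter and the primary are both down: one vote out of three is not enough.
console.log(canElectPrimary(withArbiter, ["192.168.80.128:27018", "192.168.80.128:27019"]));
```

Under this model a three-member set needs two reachable votes in either layout. The difference in the arbiter-free layout is that the third member also holds a full copy of the data and can itself become primary.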

IV. Build a replica set without an arbiter node (recommended)

1. Copy node1 to node6, node7, node8:

cp -r node1 node6
cp -r node1 node7
cp -r node1 node8

2. Clear the db and log folders in node6, node7, and node8:

rm -rf data/db/*
rm -rf log/*

3. Update dbpath, logpath, port, and replSet for node6, node7, and node8.

Use port 27021 for node6, 27022 for node7, and 27023 for node8, and set replSet to XbqSet on all three.

4. Start the three nodes by running this command from each node's folder:

mongod --config mongodb.conf

5. Connect to node6 (27021): mongo --host 192.168.80.128 --port 27021

6. Run the initialization command:

rs.initiate({"_id":"XbqSet",members:[{"_id":1,"host":"192.168.80.128:27021"},{"_id":2,"host":"192.168.80.128:27022"},{"_id":3,"host":"192.168.80.128:27023"}]})

7. Check the status: there is one primary node and two secondary nodes.

XbqSet:PRIMARY> rs.status()
{
    "set" : "XbqSet",
    "date" : ISODate("2017-07-27T22:02:28.188Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27021",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 483,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "electionTime" : Timestamp(1501192876, 1),
            "electionDate" : ISODate("2017-07-27T22:01:16Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27022",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 75,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:02:26.973Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:02:26.973Z"),
            "pingMs" : 0,
            "configVersion" : 1
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27023",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 75,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:02:26.973Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:02:26.973Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

8. Kill node6, that is, port 27021

9. Connect to node7: mongo --host 192.168.80.128 --port 27022. The status shows that node7 is still a secondary and node8 has become the primary.

XbqSet:SECONDARY> rs.status()
{
    "set" : "XbqSet",
    "date" : ISODate("2017-07-27T22:08:24.894Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27021",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:08:23.748Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:07:43.464Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "Failed attempt to connect to 192.168.80.128:27021; couldn't connect to server 192.168.80.128:27021 (192.168.80.128), connection attempt failed",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27022",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 833,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27023",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 431,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:08:23.409Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:08:23.521Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1501193266, 1),
            "electionDate" : ISODate("2017-07-27T22:07:46Z"),
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

10. Restart node6: mongod --config mongodb.conf

11. Connect to node7 again: mongo --host 192.168.80.128 --port 27022

XbqSet:SECONDARY> rs.status()
{
    "set" : "XbqSet",
    "date" : ISODate("2017-07-27T22:[…].275Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27021",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 54,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:[…].382Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:[…].213Z"),
            "pingMs" : 0,
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27022",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1090,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27023",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 688,
            "optime" : Timestamp(1501192872, 1),
            "optimeDate" : ISODate("2017-07-27T22:01:12Z"),
            "lastHeartbeat" : ISODate("2017-07-27T22:[…].673Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-27T22:[…].820Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1501193266, 1),
            "electionDate" : ISODate("2017-07-27T22:07:46Z"),
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

V. Add or remove nodes

1. Run cp -r node8 node10, update dbpath and logpath for node10, change its port to 27024, and start node10.

2. Connect to the primary node: mongo --host 192.168.80.128 --port 27023

3. Add a node: rs.add("192.168.80.128:27024")

XbqSet:PRIMARY> rs.add("192.168.80.128:27024")
{ "ok" : 1 }

4. Check the status with rs.status(); the newly added node appears as a SECONDARY.

XbqSet:PRIMARY> rs.status()
{
    "set" : "XbqSet",
    "date" : ISODate("2017-07-29T14:31:44.872Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "192.168.80.128:27021",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 630,
            "optime" : Timestamp(1501338697, 1),
            "optimeDate" : ISODate("2017-07-29T14:31:37Z"),
            "lastHeartbeat" : ISODate("2017-07-29T14:31:43.555Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-29T14:31:43.457Z"),
            "pingMs" : 0,
            "configVersion" : 2
        },
        {
            "_id" : 2,
            "name" : "192.168.80.128:27022",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1854,
            "optime" : Timestamp(1501338697, 1),
            "optimeDate" : ISODate("2017-07-29T14:31:37Z"),
            "lastHeartbeat" : ISODate("2017-07-29T14:31:43.555Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-29T14:31:43.186Z"),
            "pingMs" : 0,
            "configVersion" : 2
        },
        {
            "_id" : 3,
            "name" : "192.168.80.128:27023",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1953,
            "optime" : Timestamp(1501338697, 1),
            "optimeDate" : ISODate("2017-07-29T14:31:37Z"),
            "electionTime" : Timestamp(1501338022, 1),
            "electionDate" : ISODate("2017-07-29T14:20:22Z"),
            "configVersion" : 2,
            "self" : true
        },
        {
            "_id" : 4,
            "name" : "192.168.80.128:27024",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 7,
            "optime" : Timestamp(1501338697, 1),
            "optimeDate" : ISODate("2017-07-29T14:31:37Z"),
            "lastHeartbeat" : ISODate("2017-07-29T14:31:43.566Z"),
            "lastHeartbeatRecv" : ISODate("2017-07-29T14:31:43.579Z"),
            "pingMs" : 6,
            "configVersion" : 2
        }
    ],
    "ok" : 1
}

5. Remove a node: rs.remove(hostportstr)

XbqSet:PRIMARY> rs.remove("192.168.80.128:27024")
{ "ok" : 1 }

Check the status again with rs.status() to confirm that the node on port 27024 is no longer a member.
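What rs.add() and rs.remove() change can be pictured as edits to the replica set's configuration document. The model below is a hedged sketch, not the shell's implementation: addMember and removeMember are hypothetical helpers, and the behaviour they imitate (the next free member _id is assigned and the config version goes up by one) is inferred from the rs.status() output above, where the new node received _id 4 and configVersion moved from 1 to 2.

```javascript
// Hypothetical model of adding a member: returns a new config, input is not mutated.
function addMember(config, host) {
  if (config.members.some((m) => m.host === host)) {
    throw new Error("host is already a member: " + host);
  }
  const nextId = Math.max(...config.members.map((m) => m._id)) + 1;
  return {
    ...config,
    version: config.version + 1,
    members: [...config.members, { _id: nextId, host }],
  };
}

// Hypothetical model of removing a member by its "host:port" string.
function removeMember(config, host) {
  const members = config.members.filter((m) => m.host !== host);
  if (members.length === config.members.length) {
    throw new Error("host is not a member: " + host);
  }
  return { ...config, version: config.version + 1, members };
}

// Configuration of XbqSet after initialization.
const xbqSetV1 = {
  _id: "XbqSet",
  version: 1,
  members: [
    { _id: 1, host: "192.168.80.128:27021" },
    { _id: 2, host: "192.168.80.128:27022" },
    { _id: 3, host: "192.168.80.128:27023" },
  ],
};

const xbqSetV2 = addMember(xbqSetV1, "192.168.80.128:27024");
const xbqSetV3 = removeMember(xbqSetV2, "192.168.80.128:27024");

console.log(xbqSetV2.version, xbqSetV2.members.map((m) => m.host));
console.log(xbqSetV3.version, xbqSetV3.members.map((m) => m.host));
```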