The previous article introduced Redis master-slave replication. Replication lets us add nodes that hold copies of the data, and supports read/write separation, data backup, and similar functions depending on the business scenario. However, when the Master node becomes abnormal, replication alone cannot automatically switch nodes or fail over. The sentinel mechanism introduced in this article is a node monitoring and management mechanism built on top of Redis master-slave replication. It switches nodes and performs failover when such problems occur, and is one implementation of a Redis high-availability scheme.
Structure topology
- Master (master node): the primary Redis service database, responsible for receiving business data writes. Generally there is one. (Horizontal scaling to multiple masters after sharding in a distributed architecture is not covered here; only the general Redis Sentinel topology is discussed.)
- Slave (slave node): a Redis replica database that replicates the data of the Master node.
- Sentinel Node (sentinel node): monitors the status of service data nodes such as the Master and Slaves. A Sentinel deployment usually consists of multiple Sentinel nodes, which is what makes the Sentinel mechanism itself highly available.
Component | Role | Function | Number |
---|---|---|---|
Master | Primary node for service data | Receives client requests; readable and writable | 1 |
Slave | Replica node for service data | Replicates Master data for disaster recovery; readable (read/write separation) | >= 1 |
Sentinel Node | Sentinel node | Monitors Master and Slave service data nodes and performs failover when faults occur | >= 1 |
Operation mechanism
The main task of Redis Sentinel is to monitor the status of all Redis nodes at all times and, once an exception occurs, handle the fault according to a preset mechanism so that Redis remains highly available. At its core, it discovers and monitors each node through three scheduled monitoring tasks.
Scheduled task | Trigger interval | Function |
---|---|---|
Periodic info | 10s | The Sentinel node obtains the latest Redis node information and topology |
Periodic publish/subscribe | 2s | Sentinel nodes exchange information with each other through a channel on the Master |
Periodic ping | 1s | The Sentinel node checks network reachability with all Redis nodes and the other Sentinel nodes |
- [Worker-1] Every 10 seconds, each Sentinel node sends the `info` command to the master and the slaves to obtain the latest topology. The purpose of this scheduled task is that, when a fault occurs or a new node is added, the topology of the current Redis nodes is fetched and updated periodically.
After you run `info replication` on the master node, information like the following can be viewed:

    # Replication
    role:master
    connected_slaves:2
    slave0:ip=127.0.0.1,port=6380,state=online,offset=4917,lag=1
    slave1:ip=127.0.0.1,port=6381,state=online,offset=4917,lag=1
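As a rough illustration of how such a reply could be turned into topology information, here is a minimal Python sketch. The function name `parse_replication_info` and the sample text are assumptions made for this example; this is not how Sentinel itself is implemented.

```python
def parse_replication_info(text):
    """Parse the text of an `info replication` reply into a dict (illustrative only)."""
    info = {"slaves": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the section header
        key, _, value = line.partition(":")
        if key.startswith("slave") and key[5:].isdigit():
            # slaveN lines hold comma-separated key=value pairs
            fields = dict(pair.split("=", 1) for pair in value.split(","))
            for numeric in ("port", "offset", "lag"):
                fields[numeric] = int(fields[numeric])
            info["slaves"].append(fields)
        else:
            info[key] = value
    return info


# Sample reply, modelled on the output shown in this article.
SAMPLE = """# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=4917,lag=1
slave1:ip=127.0.0.1,port=6381,state=online,offset=4917,lag=1
"""

topology = parse_replication_info(SAMPLE)
print(topology["role"], [(s["ip"], s["port"]) for s in topology["slaves"]])
```

With the slave addresses extracted this way, a monitor would know which nodes to start pinging next.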
- [Worker-2] Every 2 seconds, each Sentinel node publishes to the `__sentinel__:hello` channel of the Redis data nodes its judgment of the master node together with its own information. Each Sentinel node also subscribes to this channel to learn about the other Sentinel nodes and their judgment of the master.
All Sentinel nodes communicate and exchange information by publishing and subscribing to `__sentinel__:hello` on the master node, which provides the basis for the objective offline decision and for leader election.
- [Worker-3] Every second, each Sentinel node sends a `ping` command to the master, the slaves, and the other Sentinel nodes as a heartbeat check, to verify that these nodes are currently reachable.
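The three scheduled tasks can be pictured as timers with different periods. The following Python sketch only models which task would fire on each one-second tick; the names `TASK_INTERVALS` and `due_tasks` are hypothetical, and real Sentinel scheduling is more involved than this.

```python
# Intervals in seconds, as described for the three scheduled tasks.
TASK_INTERVALS = {"info": 10, "publish_hello": 2, "ping": 1}


def due_tasks(elapsed_seconds):
    """Return the names of the tasks that fire at this whole-second tick."""
    return [name for name, interval in TASK_INTERVALS.items()
            if elapsed_seconds % interval == 0]


for tick in range(1, 11):
    print(tick, due_tasks(tick))
```

In this model the ping runs on every tick, the hello publication on every second tick, and the info command only on the tenth.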
Failover
- [step-1] When a sentinel node detects that the master node is faulty, the slave nodes can no longer replicate data from the master.
- [step-2] After a sentinel node finds that the master node is abnormal, the Sentinel cluster elects a leader through an internal vote. The leader carries out the failover of the master and slave service data nodes and notifies the client.
Timeout detection uses the `down-after-milliseconds` parameter: if a node gives no response within this timeout, it is considered faulty. Because Sentinel nodes run as a cluster, a Sentinel node that detects an abnormal master asks the other Sentinel nodes to vote before taking the next step, which greatly reduces the chance of a single node misjudging a failure.
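A minimal sketch of the timeout check, assuming a hypothetical `MonitoredNode` class and an example timeout of 30000 ms (the value is only illustrative). Timestamps are passed in explicitly so the logic is easy to follow.

```python
class MonitoredNode:
    """Tracks the last valid reply from a node (hypothetical helper, not Redis code)."""

    def __init__(self, name, down_after_ms, now_ms):
        self.name = name
        self.down_after_ms = down_after_ms
        self.last_valid_reply_ms = now_ms  # treat creation time as the first reply

    def on_ping_reply(self, now_ms):
        self.last_valid_reply_ms = now_ms

    def is_subjectively_down(self, now_ms):
        # Down only if no valid reply arrived within the configured timeout.
        return now_ms - self.last_valid_reply_ms > self.down_after_ms


master = MonitoredNode("master", down_after_ms=30000, now_ms=0)
master.on_ping_reply(now_ms=1000)
print(master.is_subjectively_down(now_ms=20000))  # 19 s since the last reply
print(master.is_subjectively_down(now_ms=40000))  # 39 s since the last reply
```

A positive result here is only one Sentinel node's opinion; the vote among Sentinel nodes decides what happens next.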
- [step-3] After the new master is produced, the slave nodes replicate from the new master, but the Sentinel nodes continue to monitor the old master node.
- [step-4] After the old master node recovers from the fault, the Sentinel cluster, which has been monitoring it all along, brings it back under cluster management as a slave of the new master. The recovered node becomes a slave and starts replicating from the new master, so the node is reused after its failure.
The steps above summarize, as a sequence of interactions, the failover process under the Redis Sentinel architecture.
Cluster election
Sentinel node election
Because Sentinel is deployed as a cluster to ensure high availability, one Sentinel node must be elected as Leader to carry out the failover. Every Sentinel node can become the Leader. Election process:
- When a Sentinel node determines that the master node of the Redis cluster is offline, it requests the other Sentinel nodes to elect it as Leader.
- A Sentinel node that receives such a request grants it if it has not already agreed to another Sentinel node's request, which gives the requester one more vote; otherwise it refuses.
- When the number of votes a Sentinel node has obtained reaches the minimum required to become Leader (number of Sentinel nodes / 2 + 1), that Sentinel node is elected Leader; otherwise a new election is held.
Sentinel cluster election uses a leader election approach based on the Raft algorithm. If you are interested, you can explore the internals of that algorithm further.
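The voting rule described above can be sketched in a few lines of Python. This is a simplified single-round model under stated assumptions: the `Sentinel` class and `run_election` function are hypothetical, and epochs, timeouts, and network failures are ignored.

```python
class Sentinel:
    """Simplified voter: grants at most one vote per election round."""

    def __init__(self, name):
        self.name = name
        self.voted_for = None

    def request_vote(self, candidate):
        # Grant the vote only if this node has not agreed to anyone else.
        if self.voted_for is None:
            self.voted_for = candidate
        return self.voted_for == candidate


def run_election(sentinels, candidate_name):
    """Return True if the candidate collects at least n / 2 + 1 votes."""
    votes = sum(1 for s in sentinels if s.request_vote(candidate_name))
    needed = len(sentinels) // 2 + 1
    return votes >= needed


cluster = [Sentinel("s1"), Sentinel("s2"), Sentinel("s3")]
print(run_election(cluster, "s1"))  # s1 asks first and gathers the votes
print(run_election(cluster, "s2"))  # everyone has already voted for s1
```

Because each node votes only once per round, at most one candidate can reach a majority, which is why a failed round has to be followed by a new election.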
Subjective offline & objective offline:
- Subjective offline
Each Sentinel node in the Sentinel cluster periodically sends heartbeat packets to all nodes in the Redis cluster to check whether they are healthy. If a node does not reply to a Sentinel node's heartbeat within `down-after-milliseconds`, that Sentinel node marks the Redis node as subjectively offline. Subjective offline is the judgment of a single Sentinel node: it may be that only this Sentinel node cannot communicate with the master, rather than the master failing to interact with all nodes, so multiple Sentinel nodes need to confirm.
- Objective offline
When a node is marked subjectively offline by one Sentinel node, it does not mean the node is definitely faulty. It is treated as objectively offline only after enough other Sentinel nodes in the Sentinel cluster also judge it to be subjectively offline.
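The step from subjective to objective offline is essentially a count against a threshold. In Redis this threshold is the quorum given in the `sentinel monitor` configuration line; the function name and the sample votes below are assumptions made for this sketch.

```python
def is_objectively_down(subjective_votes, quorum):
    """subjective_votes maps a sentinel name to True if it sees the node as down."""
    agreeing = sum(1 for is_down in subjective_votes.values() if is_down)
    return agreeing >= quorum


# Two of three Sentinel nodes consider the master down.
print(is_objectively_down({"s1": True, "s2": True, "s3": False}, quorum=2))
# Only one does, so the node stays merely subjectively offline.
print(is_objectively_down({"s1": True, "s2": False, "s3": False}, quorum=2))
```

A single Sentinel node with a bad network link therefore cannot trigger a failover on its own when the quorum is greater than one.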
Redis node election
After the Sentinel cluster has elected a Sentinel leader, the leader selects one slave to become the new master.
Election process:
- Filter out faulty nodes.
- Choose the slave with the highest priority according to `slave-priority` as the master (in Redis a lower non-zero value means a higher priority). If no single such slave exists, continue.
- Choose the slave with the largest replication offset as the master. The replication offset is the amount of data written, in bytes; the master synchronizes its offset to the slaves, and when the offsets are equal the data is fully synchronized. If no single such slave exists, continue.
- Choose the slave with the smallest `runid` as the master. Redis generates a random runid as its identifier each time it starts, so this is effectively a random choice and serves as the last-resort fallback.
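The selection order can be expressed as a single sort key. The sketch below assumes a hypothetical `Slave` record; note that in Redis a lower `slave-priority` value means a higher priority, and a value of 0 excludes the slave from promotion.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    runid: str
    priority: int        # slave-priority; 0 means "never promote"
    repl_offset: int     # replication offset in bytes
    online: bool = True


def select_new_master(slaves):
    """Pick a slave by priority, then replication offset, then runid."""
    candidates = [s for s in slaves if s.online and s.priority != 0]
    if not candidates:
        return None
    # Lower priority value first, then larger offset, then smaller runid.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.runid))


slaves = [
    Slave("c", priority=100, repl_offset=4917),
    Slave("a", priority=100, repl_offset=5000),
    Slave("b", priority=100, repl_offset=5000),
    Slave("d", priority=10, repl_offset=100, online=False),  # filtered out
]
print(select_new_master(slaves).runid)
```

Here all healthy slaves share the same priority, two of them share the largest offset, and the runid comparison breaks the tie.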
References
Redis Design and Implementation
Redis Development and Operation
www.cnblogs.com/albert32/p/… Sentinel node election