“This is the fifth day of my participation in the First Challenge 2022. For details: First Challenge 2022.”

1. Preface

The previous article introduced the principle of Redis master-slave replication, which solves the problem of backing up Redis data. However, when the master node fails, replication alone cannot elect a new master automatically. A slave has to be promoted to master by hand, which is inefficient and does not provide automatic failover. For this, Redis officially offers Sentinel, a high-availability solution.

2. What is Redis Sentinel?

Redis Sentinel is the official high-availability solution for Redis. In practice, this means that with Sentinel you can build a Redis deployment that resists certain kinds of failures and fails over automatically, without human intervention.

What can Sentinel do?

1. Monitoring: Sentinel continuously checks the health of the Redis nodes (master and replicas) and of the other Sentinel nodes.

2. Automatic failover: If the master fails, Sentinel promotes a replica to master and notifies clients to connect to the new master.

3. Notification: Through an API, Sentinel can notify administrators and developers that a monitored Redis instance has failed.

4. Configuration provider: Clients connect to Sentinel, which keeps track of the current master and returns its address to the client.

3. Starting Sentinel

1. redis-sentinel /path/to/sentinel.conf

2. redis-server /path/to/sentinel.conf --sentinel

The settings in sentinel.conf are described below:

# Configure the master to monitor: sentinel monitor <master-name> <ip> <port> <quorum>
# The quorum is the minimum number of Sentinels that must agree that the master
# is unreachable before it is marked objectively down.
# Example: with 5 Sentinel instances and a quorum of 2, a failover is attempted as
# soon as 2 Sentinels agree that the master is unreachable, but it is only
# authorized and actually started if at least 3 Sentinels are reachable.
sentinel monitor mymaster 127.0.0.1 6379 2
# Sentinel discovers the master's replicas automatically, so they are not listed here.
# If the master does not answer PING within this many milliseconds, or answers
# with an error, this Sentinel considers it down.
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
# How many replicas may synchronize with the new master at the same time during a
# failover. The lower the number, the longer the failover takes.
# These settings can be modified at runtime.
sentinel parallel-syncs mymaster 1

# A second monitored master
sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5
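To make the layout of these directives concrete, here is a minimal Python sketch that reads them into a small structure, for example to sanity-check quorum values before deploying. The names MonitoredMaster and parse_sentinel_conf are hypothetical helpers written for this article, not part of Redis or any client library, and the parser only covers the directives shown above.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoredMaster:
    """Settings collected for one master name (hypothetical helper type)."""
    host: str
    port: int
    quorum: int
    options: dict = field(default_factory=dict)


def parse_sentinel_conf(text: str) -> dict:
    """Parse 'sentinel ...' directives into {master_name: MonitoredMaster}."""
    masters = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        parts = line.split()
        if len(parts) < 4 or parts[0] != "sentinel":
            continue
        directive, name = parts[1], parts[2]
        if directive == "monitor":
            # sentinel monitor <name> <ip> <port> <quorum>
            if len(parts) >= 6:
                masters[name] = MonitoredMaster(parts[3], int(parts[4]), int(parts[5]))
        elif name in masters:
            # e.g. sentinel down-after-milliseconds <name> <ms>
            value = parts[3]
            masters[name].options[directive] = int(value) if value.isdigit() else value
    return masters


EXAMPLE_CONF = """
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
"""

for name, master in parse_sentinel_conf(EXAMPLE_CONF).items():
    print(name, master.host, master.port, master.quorum, master.options)
```

Options that appear before the matching sentinel monitor line are ignored in this sketch, which mirrors the order used in the listing above.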

Note:

Sentinel listens on port 26379 by default. This port must be open so that the Sentinels can reach one another.

4. Sentinel workflow

1. First, the Sentinels discover one another and track changes in the deployment dynamically through Redis's Pub/Sub (publish/subscribe) mechanism.

2. How does Sentinel detect that a master has failed?

There are two states: the master is subjectively down, or the master is objectively down.

Subjectively down: Each Sentinel sends a PING to the master every second. If the master does not give a valid reply within down-after-milliseconds, that Sentinel considers the master subjectively down.

Objectively down:

When the subjectively down node is the master, the Sentinel asks the other Sentinels for their judgment of the master through the SENTINEL is-master-down-by-addr command. When the number of Sentinels that consider the master down reaches the quorum (the quorum set in the Sentinel configuration), the Sentinel concludes that the master really has a problem and marks it objectively down. In other words, objectively down means that enough Sentinels agree on the offline judgment.

Note that the objectively down state applies only to the master node, and it is what triggers failover.
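The two checks above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Sentinel's actual implementation: the function names and the peer_reports list (the answers other Sentinels gave when asked) are hypothetical, and time is passed in as plain millisecond integers.

```python
def is_subjectively_down(now_ms: int, last_valid_reply_ms: int,
                         down_after_ms: int) -> bool:
    """One Sentinel's local view: no valid PING reply for down_after_ms."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(local_sdown: bool, peer_reports: list,
                        quorum: int) -> bool:
    """True once at least `quorum` Sentinels, this one included, agree.

    peer_reports holds one boolean per other Sentinel that was asked;
    True means "I also consider the master down".
    """
    if not local_sdown:
        return False
    agreeing = 1 + sum(1 for report in peer_reports if report)
    return agreeing >= quorum


# Last valid reply was 70 s ago, down-after-milliseconds is 60 s.
sdown = is_subjectively_down(now_ms=100_000, last_valid_reply_ms=30_000,
                             down_after_ms=60_000)
print(sdown)  # True
# One of four peers agrees, so 2 Sentinels in total, which meets quorum 2.
print(is_objectively_down(sdown, [True, False, False, False], quorum=2))  # True
```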

3. The master is down and failover is required

There are two steps here. A leader must first be elected among the Sentinels, and that leader Sentinel then carries out the Redis failover.

First, the Sentinels elect a leader, using an election procedure based on the Raft consensus algorithm.

Every Sentinel node is eligible to become the leader. When a Sentinel confirms that the Redis master is objectively down, it asks the other Sentinels to elect it as leader. A Sentinel that receives such a request grants its vote if it has not already agreed to another Sentinel's request in the current election round (the requester's vote count increases by 1); otherwise it refuses.

If the number of votes a Sentinel collects reaches the minimum required to become leader, which is max(quorum, number of Sentinels / 2 + 1), that Sentinel is elected leader. Otherwise, a new election is held.

The core idea borrowed from Raft: first come, first served, and the minority yields to the majority.
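The voting rule described above can be illustrated with a small Python simulation. It is a sketch only: SentinelVoter, run_election and the epoch argument (standing in for one election round) are hypothetical names, and real Sentinels exchange votes over the network rather than through method calls.

```python
def votes_needed(quorum: int, total_sentinels: int) -> int:
    """Minimum votes to win: max(quorum, total_sentinels // 2 + 1)."""
    return max(quorum, total_sentinels // 2 + 1)


class SentinelVoter:
    """A Sentinel that gives its single vote per round to the first asker."""

    def __init__(self, name: str):
        self.name = name
        self.voted_for = {}  # epoch -> name of the candidate voted for

    def request_vote(self, epoch: int, candidate: str) -> str:
        # First come, first served: keep the first candidate seen this epoch.
        return self.voted_for.setdefault(epoch, candidate)


def run_election(candidate: str, reachable: list, total_sentinels: int,
                 epoch: int, quorum: int) -> bool:
    """Return True if `candidate` collects enough votes in this epoch."""
    votes = sum(1 for voter in reachable
                if voter.request_vote(epoch, candidate) == candidate)
    return votes >= votes_needed(quorum, total_sentinels)


sentinels = [SentinelVoter(f"s{i}") for i in range(1, 6)]
# s1 asks first in epoch 1 and receives all 5 votes (3 are needed).
print(run_election("s1", sentinels, 5, epoch=1, quorum=2))      # True
# s2 asks later in the same epoch: every vote has already gone to s1.
print(run_election("s2", sentinels, 5, epoch=1, quorum=2))      # False
# In a new epoch s2 can only reach 2 of the 5 Sentinels: 2 votes < 3.
print(run_election("s2", sentinels[:2], 5, epoch=2, quorum=2))  # False
```

With 5 Sentinels and a quorum of 2, the threshold is 3 votes, which matches the earlier configuration comment that failover is only authorized when at least three Sentinels are reachable.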

After the Sentinel leader has been elected, it must choose a new master for the Redis deployment and rebuild the replication relationships.

The new Redis master is chosen according to the following criteria:

1. Disconnection time from the master. Replicas that have been disconnected from the master for too long, measured against the configured master timeout down-after-milliseconds, are filtered out.

2. Replica priority. Replicas with a lower replica-priority value are preferred.

3. If the priorities are equal, compare the processed replication offset. The larger the better, because that replica has received more of the old master's data.

4. If the offsets are equal, compare the run IDs and choose the smaller one.
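These rules amount to a filter followed by a multi-key comparison, which can be sketched in Python as follows. The Replica type, the pick_new_master function and the max_disconnected_ms threshold are hypothetical names for illustration. The sketch also skips replicas whose priority is 0, which the Redis documentation describes as never eligible for promotion; that detail is an addition not covered in the list above.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str
    priority: int         # replica-priority; lower is preferred
    repl_offset: int      # processed replication offset; higher is preferred
    disconnected_ms: int  # how long the link to the old master has been down


def pick_new_master(replicas, max_disconnected_ms):
    """Return the best candidate, or None if no replica qualifies."""
    candidates = [
        r for r in replicas
        if r.disconnected_ms <= max_disconnected_ms  # rule 1: filter stale replicas
        and r.priority != 0                          # priority 0: never promote
    ]
    if not candidates:
        return None
    # Rules 2-4: lowest priority, then highest offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("bbb", priority=100, repl_offset=500, disconnected_ms=0),
    Replica("aaa", priority=100, repl_offset=500, disconnected_ms=0),
    Replica("ccc", priority=100, repl_offset=900, disconnected_ms=0),
    Replica("ddd", priority=50, repl_offset=10, disconnected_ms=10**9),
]
print(pick_new_master(replicas, max_disconnected_ms=600_000).run_id)  # ccc
```

Here ddd has the best priority but is filtered out for being disconnected too long, so ccc wins on replication offset.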

After the new master has been selected, the leader Sentinel rebuilds the replication relationships:

1. The Sentinel sends the SLAVEOF NO ONE command to the new master, turning it into an independent node.

2. The Sentinel sends SLAVEOF <ip> <port> to the other nodes so that they follow the new master.
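As a rough illustration of the order of these two steps, the following Python sketch only builds the list of commands a leader would issue; it does not connect to any server. The function name failover_commands and the (host, port) tuples are assumptions made for the example. Redis 5 and later also accept REPLICAOF as the newer name for SLAVEOF.

```python
def failover_commands(new_master, other_replicas):
    """Return (target, command) pairs in the order described above.

    new_master and each entry of other_replicas are (host, port) tuples.
    """
    host, port = new_master
    # Step 1: the promoted node stops replicating and becomes a master.
    plan = [(new_master, "SLAVEOF NO ONE")]
    # Step 2: every remaining replica is pointed at the new master.
    for replica in other_replicas:
        plan.append((replica, f"SLAVEOF {host} {port}"))
    return plan


for target, command in failover_commands(
        ("10.0.0.2", 6379), [("10.0.0.3", 6379), ("10.0.0.4", 6379)]):
    print(target, command)
```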

5. Summary

As the analysis above shows, Sentinel achieves automatic failover through periodic monitoring. However, Sentinel still has limitations: with only a single master node, data can still be lost, and there is no way to scale horizontally when the performance of that single master becomes the bottleneck.