In the previous article on the Redis master-slave replication architecture, we analyzed the characteristics of master-slave replication. One of its problems is that a master failure requires manual intervention: someone has to promote a slave to master, which slows recovery in production and increases labor costs. Sentinel mode automates the steps we would otherwise perform by hand in master-slave replication mode, and it reacts much faster than a human can. In this article, we show how to quickly implement automatic failover with Sentinel mode.

I. Main functions of Sentinel

Sentinel in Redis provides the following functions:

  • Monitoring: Sentinel constantly checks whether the master and slave nodes are working properly.
  • Automatic failover: When the master node stops working properly, Sentinel starts an automatic failover. It promotes one of the failed master's slave nodes to be the new master and makes the other slave nodes replicate from the new master.
  • Configuration provider: During initialization, a client connects to Sentinel to obtain the address of the current master node of the Redis service.
  • Notification: Sentinel can send the result of a failover to the client.

Monitoring and automatic failover allow Sentinel to detect a master failure and complete the switchover in time, while the configuration provider and notification functions come into play in the interaction with clients.

II. Sentinel mode architecture

In Sentinel mode, nodes fall into two types, data nodes and sentinel nodes:

  • Data nodes: The master and slave nodes of the master-slave architecture are data nodes.
  • Sentinel nodes: The sentinel system consists of one or more sentinel nodes, which are special Redis nodes that do not store data.

In the architecture used in this article, in addition to one master and two slaves, three sentinels are deployed to monitor the health of the cluster.

III. Build Sentinel mode

1. Deploy the primary and secondary nodes

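The master and slave nodes used with Sentinel are configured in the same way as ordinary master and slave nodes and need no additional configuration, so just keep the configuration from the previous article. Note that when password authentication is used, masterauth must be configured on every data node, including the current master, because after a failover the old master rejoins as a slave and has to authenticate against the new master: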

masterauth 123456

2. Deploy the sentinel node

Copy the sentinel.conf configuration file from the installation directory to the sentinel directory and rename it sentinel28000.conf, so that the configuration files of the different sentinels can be told apart.

To modify the configuration file, first modify the general configuration:

bind 0.0.0.0
protected-mode no
port 28000
daemonize yes
pidfile /var/run/redis-sentinel28000.pid
logfile "sentinel28000.log"
dir /tmp

Next, modify the Sentinel core configuration. The following is the configuration for one sentinel instance:

sentinel monitor mymaster 127.0.0.1 8000 2
sentinel auth-pass mymaster 123456

Take a look at the format described in the official comments:

sentinel monitor <master-name> <ip> <redis-port> <quorum>

  • master-name specifies the name of the master node
  • ip and redis-port specify the address of the master node
  • quorum is the threshold used to decide that the master node is objectively offline: when the number of sentinels that consider the master offline reaches quorum, the master node is marked objectively offline. The recommended value is half the number of sentinels plus one

sentinel auth-pass <master-name> <password>

When a password is enabled on a Redis instance with requirepass (for example, requirepass foobared), every client that connects to the instance must provide it. sentinel auth-pass sets the password that the sentinel uses to connect to the master and the slaves. Note that the master and the slaves must use the same authentication password.
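The "half the number of sentinels plus one" rule of thumb for quorum can be written down as a one-line calculation. The sketch below is only an illustration of that arithmetic; the class and method names (QuorumSketch, recommendedQuorum) are made up for this article and are not part of Redis or Jedis.

```java
/** Illustrative sketch of the "half the number of sentinels plus one" rule of thumb. */
final class QuorumSketch {

    /** Returns the suggested quorum for the given number of sentinels. */
    static int recommendedQuorum(int sentinelCount) {
        if (sentinelCount < 1) {
            throw new IllegalArgumentException("sentinelCount must be >= 1");
        }
        // Integer division rounds down, so 3 sentinels give 3 / 2 + 1 = 2.
        return sentinelCount / 2 + 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 5}) {
            System.out.println(n + " sentinels, suggested quorum " + recommendedQuorum(n));
        }
    }
}
```

With the three sentinels used in this article, the result is 2, which matches the quorum in the sentinel monitor line above.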

Other configuration parameters:

sentinel down-after-milliseconds mymaster 30000

This parameter is related to offline detection. The sentinel uses the ping command to check the heartbeat of other nodes. If a node does not reply within down-after-milliseconds (30000 ms, that is 30 seconds, in this example), the sentinel marks it subjectively offline. This setting applies to the subjective offline decision for master, slave, and sentinel nodes alike.
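As a rough illustration of this timeout check, the following sketch compares the time since the last valid reply with down-after-milliseconds. It is a simplified model, not the actual Sentinel implementation, and the names (SubjectiveDownSketch, isSubjectivelyDown) are hypothetical.

```java
/** Simplified model of the subjective offline check; names are hypothetical. */
final class SubjectiveDownSketch {

    /**
     * Returns true when the node has not produced a valid reply for longer
     * than downAfterMillis, measured at nowMillis.
     */
    static boolean isSubjectivelyDown(long lastValidReplyMillis, long nowMillis, long downAfterMillis) {
        return nowMillis - lastValidReplyMillis > downAfterMillis;
    }

    public static void main(String[] args) {
        long downAfter = 30_000; // same value as in the configuration above
        System.out.println(isSubjectivelyDown(0, 10_000, downAfter));
        System.out.println(isSubjectivelyDown(0, 31_000, downAfter));
    }
}
```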

sentinel parallel-syncs mymaster 1

This parameter is related to replication of slave nodes after failover: it specifies the number of slave nodes that initiate a replication operation to the new master node at a time.

For example, suppose that after the master node switch is complete, three slave nodes want to initiate replication to the new master node:

  • If parallel-syncs=1, the slave nodes are replicated one by one
  • If parallel-syncs=3, then the three slave nodes start replication together

The larger the value of parallel-syncs, the sooner all slave nodes finish replicating, but the greater the network and disk load on the master node. Set the value according to your actual situation.
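To make the effect of parallel-syncs concrete, the sketch below splits a list of slave nodes into successive groups of at most parallel-syncs nodes. This is a simplified model for illustration only: it assumes replication proceeds in strict waves, which is an assumption of this sketch rather than a description of Sentinel's internals, and the names (ParallelSyncsSketch, syncWaves) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model: slaves resynchronize in groups of at most parallelSyncs. */
final class ParallelSyncsSketch {

    /** Splits the slaves into successive groups, each holding at most parallelSyncs nodes. */
    static List<List<String>> syncWaves(List<String> slaves, int parallelSyncs) {
        if (parallelSyncs < 1) {
            throw new IllegalArgumentException("parallelSyncs must be >= 1");
        }
        List<List<String>> waves = new ArrayList<>();
        for (int start = 0; start < slaves.size(); start += parallelSyncs) {
            int end = Math.min(start + parallelSyncs, slaves.size());
            waves.add(new ArrayList<>(slaves.subList(start, end)));
        }
        return waves;
    }

    public static void main(String[] args) {
        List<String> slaves = Arrays.asList("slave-1", "slave-2", "slave-3");
        System.out.println("parallel-syncs=1: " + syncWaves(slaves, 1));
        System.out.println("parallel-syncs=3: " + syncWaves(slaves, 3));
    }
}
```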

Here we use three sentinels, so copy the configuration file to sentinel28001.conf and sentinel28002.conf, and in each copy replace every occurrence of 28000 with 28001 and 28002 respectively.

3. Start the Sentinel node

./redis-5.0.4/src/redis-sentinel sentinel28000.conf
./redis-5.0.4/src/redis-sentinel sentinel28001.conf
./redis-5.0.4/src/redis-sentinel sentinel28002.conf

Alternatively, you can start a sentinel with redis-server and the --sentinel option; the effect is the same:

./redis-5.0.4/src/redis-server sentinel28002.conf --sentinel

Check the running processes: three Redis data instances (one master and two slaves) and three sentinels should now be running.

4. Use the Jedis client for testing

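import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class RedisSentinelTest {
    public static void main(String[] args) {
        // Addresses of the three sentinels (not of the data nodes)
        Set<String> sentinels = new HashSet<>();
        sentinels.add("172.20.5.179:28000");
        sentinels.add("172.20.5.179:28001");
        sentinels.add("172.20.5.179:28002");
        // The pool asks the sentinels for the current master of "mymaster"
        JedisSentinelPool jedisSentinelPool = new JedisSentinelPool("mymaster", sentinels, "123456");
        while (true) {
            Jedis jedis = null;
            try {
                jedis = jedisSentinelPool.getResource();
                // Write a random key every second and read it back
                String s = UUID.randomUUID().toString();
                jedis.set("k" + s, "v" + s);
                System.out.println(jedis.get("k" + s));
                Thread.sleep(1000);
            } catch (Exception e) {
                // During a failover, writes fail here until a new master is elected
                e.printStackTrace();
            } finally {
                if (jedis != null) {
                    // Return the connection to the pool
                    jedis.close();
                }
            }
        }
    }
}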

If you kill the master with the kill command while the client is writing data, writes fail temporarily and exceptions are thrown. During the failover, data cannot be written and the service may be unavailable. Once the failover completes, the service recovers automatically. When the original master is restarted, it becomes a slave of the new master, because the sentinels rewrite the configuration files dynamically.
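Because writes fail only for a short time during failover, a client can ride out the gap by retrying. The following is a minimal, generic retry sketch using only the Java standard library. It is not part of Jedis, the names (FailoverRetrySketch, retry) are hypothetical, and the number of attempts and the pause are assumptions you would tune to your down-after-milliseconds setting.

```java
import java.util.function.Supplier;

/** Generic retry helper; a sketch, not a Jedis feature. */
final class FailoverRetrySketch {

    /**
     * Runs the operation up to maxAttempts times, sleeping pauseMillis after
     * each failed attempt that will be retried. Rethrows the last failure if
     * every attempt fails.
     */
    static <T> T retry(Supplier<T> operation, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(pauseMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e; // stop retrying when interrupted
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated write that fails twice, as it might during a failover
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("master unavailable");
            }
            return "OK";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the test program above, the body of the try block could be wrapped in such a helper instead of relying on the outer while loop.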

IV. How Sentinel mode works

1. The key to understanding how Sentinel works is to understand the following core concepts:

  • Subjectively offline: In the scheduled heartbeat-check task, if another node does not reply within a certain period of time, the sentinel node marks it subjectively offline. As the name suggests, subjectively offline means that a single sentinel node "subjectively" judges the node to be offline. The opposite of subjectively offline is objectively offline.
  • Objectively offline: After marking the master node subjectively offline, a sentinel node uses the sentinel is-master-down-by-addr command to ask the other sentinel nodes about the status of the master node. If the number of sentinels that consider the master node offline reaches quorum, the master node is marked objectively offline.

Note that objectively offline is a concept that applies only to the master node. If a slave node or a sentinel node fails, it is marked subjectively offline by the sentinels, but no objective offline decision or failover follows.
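The step from subjectively offline to objectively offline is essentially a vote count against quorum. The sketch below models that decision; the map of sentinel opinions stands in for the replies to sentinel is-master-down-by-addr, and the names (ObjectiveDownSketch, isObjectivelyDown) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the objective offline decision; names are hypothetical. */
final class ObjectiveDownSketch {

    /**
     * sentinelViews maps a sentinel id to whether that sentinel currently
     * considers the master offline (the asking sentinel included).
     */
    static boolean isObjectivelyDown(Map<String, Boolean> sentinelViews, int quorum) {
        long downVotes = sentinelViews.values().stream()
                .filter(Boolean::booleanValue)
                .count();
        return downVotes >= quorum;
    }

    public static void main(String[] args) {
        Map<String, Boolean> views = new HashMap<>();
        views.put("sentinel-28000", true);
        views.put("sentinel-28001", true);
        views.put("sentinel-28002", false);
        System.out.println("objectively offline: " + isObjectivelyDown(views, 2));
    }
}
```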

2. Each sentinel node maintains three scheduled tasks:

  • Every 10 seconds, send the info command to the master and slave nodes to get the latest master-slave structure: discover slave nodes and determine the master-slave relationships
  • Every 2 seconds, obtain information about the other sentinel nodes through publish/subscribe, and exchange "views" on the nodes as well as information about itself
  • Every 1 second, send the ping command to the other nodes to check their heartbeat and determine whether they are offline
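The three timers above can be pictured with a standard ScheduledExecutorService. This is only a sketch of the scheduling pattern, not how Sentinel is implemented (Sentinel is written in C). The class name and the three Runnable parameters are hypothetical placeholders for the info, publish/subscribe, and ping work.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch of three periodic tasks with the periods described above. */
final class SentinelTimersSketch {
    static final long INFO_PERIOD_MS = 10_000;  // info to master and slaves
    static final long HELLO_PERIOD_MS = 2_000;  // publish/subscribe exchange
    static final long PING_PERIOD_MS = 1_000;   // heartbeat check

    /** Schedules the three placeholder tasks at their respective periods. */
    static void start(ScheduledExecutorService scheduler,
                      Runnable infoTask, Runnable helloTask, Runnable pingTask) {
        scheduler.scheduleAtFixedRate(infoTask, 0, INFO_PERIOD_MS, TimeUnit.MILLISECONDS);
        scheduler.scheduleAtFixedRate(helloTask, 0, HELLO_PERIOD_MS, TimeUnit.MILLISECONDS);
        scheduler.scheduleAtFixedRate(pingTask, 0, PING_PERIOD_MS, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        start(scheduler,
                () -> System.out.println("info: refresh master-slave structure"),
                () -> System.out.println("hello: exchange views with other sentinels"),
                () -> System.out.println("ping: check heartbeats"));
        Thread.sleep(2_500); // let the tasks fire a few times
        scheduler.shutdownNow();
    }
}
```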

3. Leader election

Leader sentinel election: When the master node is judged to be objectively offline, the sentinel nodes negotiate to elect a leader sentinel, and the leader carries out the failover.

Every sentinel that monitors the master node may be elected leader. The election uses the Raft algorithm, whose basic idea is first come, first served: in a given round of the election, sentinel A sends sentinel B a request to become leader, and if B has not already agreed to any other sentinel in that round, it agrees to A becoming leader.
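The first come, first served voting rule can be sketched as a small vote book kept by each sentinel: one vote per election round, given to whoever asks first. The code below is an illustration of that idea only, not the Raft algorithm in full or Sentinel's implementation. The names (LeaderVoteSketch, requestVote, hasMajority) are hypothetical, and the majority check is an assumption of this sketch about how a winner is decided.

```java
import java.util.HashMap;
import java.util.Map;

/** One sentinel's vote book: at most one vote per election round. */
final class LeaderVoteSketch {
    // election round -> id of the sentinel this node voted for in that round
    private final Map<Long, String> votedFor = new HashMap<>();

    /**
     * Grants the vote for the given round to the first candidate that asks,
     * and returns the id of the sentinel that holds this node's vote.
     */
    String requestVote(long round, String candidateId) {
        return votedFor.computeIfAbsent(round, r -> candidateId);
    }

    /** Assumption of this sketch: a candidate wins with more than half of the votes. */
    static boolean hasMajority(int votes, int sentinelCount) {
        return votes >= sentinelCount / 2 + 1;
    }

    public static void main(String[] args) {
        LeaderVoteSketch sentinelB = new LeaderVoteSketch();
        System.out.println("B votes for " + sentinelB.requestVote(1, "A"));
        System.out.println("B still votes for " + sentinelB.requestVote(1, "C"));
        System.out.println("A wins with 2 of 3: " + hasMajority(2, 3));
    }
}
```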

After the leader is elected, it carries out the failover. It first selects a new master node from the slave nodes according to the following rules:

  • Filter out unhealthy slave nodes first
  • Then select the slave node with the highest priority (specified by replica-priority; a lower value means a higher priority, and 0 means never promote)
  • If the priorities are equal, select the slave node with the largest replication offset
  • If the offsets are also equal, select the slave node with the smallest runid

The leader then completes the switchover:

  • Update the master-slave status: use the slaveof no one command to make the selected slave node the master node, and use the slaveof command to make the other nodes its slave nodes
  • Keep watching the master node that went offline, and set it as a slave node of the new master node when it comes back online
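The selection rules for the new master amount to a filter followed by a three-level comparison. The sketch below expresses them with a Java Comparator. The Replica class and its fields are hypothetical stand-ins for what a sentinel knows about each slave; the sketch assumes that a lower replica-priority value means a higher priority and that a value of 0 excludes the slave from promotion.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Sketch of the rules for picking the new master among the slaves. */
final class ReplicaSelectionSketch {

    /** Hypothetical view of one slave as seen by the leader sentinel. */
    static final class Replica {
        final String runId;
        final int priority;      // replica-priority; lower is preferred, 0 = never promote
        final long replOffset;   // replication offset; larger means more up to date
        final boolean healthy;

        Replica(String runId, int priority, long replOffset, boolean healthy) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
            this.healthy = healthy;
        }
    }

    /** Applies the rules in order: health, priority, replication offset, runid. */
    static Optional<Replica> selectNewMaster(List<Replica> replicas) {
        Comparator<Replica> byPriority = Comparator.comparingInt((Replica r) -> r.priority);
        Comparator<Replica> byOffsetDesc = Comparator.comparingLong((Replica r) -> r.replOffset).reversed();
        Comparator<Replica> byRunId = Comparator.comparing((Replica r) -> r.runId);
        return replicas.stream()
                .filter(r -> r.healthy && r.priority != 0)
                .min(byPriority.thenComparing(byOffsetDesc).thenComparing(byRunId));
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
                new Replica("c3", 100, 500, true),
                new Replica("b2", 100, 900, true),
                new Replica("a1", 100, 900, false));
        selectNewMaster(replicas).ifPresent(r -> System.out.println("new master: " + r.runId));
    }
}
```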

V. Summary

In practice, more than one sentinel node should be deployed. On the one hand, this adds redundancy so that Sentinel itself does not become the bottleneck for high availability. On the other hand, it reduces the chance of wrongly judging a node to be offline. The sentinel nodes should also be deployed on different physical machines, and their number should be odd, which makes it easier for them to reach decisions on leader election and objective offline status by voting.

Building on master-slave replication, Sentinel adds automatic failover of the master node and further improves the high availability of Redis. Its shortcomings are also obvious, however: Sentinel cannot automatically fail over slave nodes. In read/write separation scenarios, a slave failure makes the read service unavailable, so we have to add our own monitoring and switchover for slave nodes. In addition, Sentinel still does not solve the problems that write operations cannot be load-balanced and that storage capacity is limited by a single machine.
