1 Significance of Redis Sentinel

  • What if the master goes down? Do we wait for operations staff to manually promote a slave and then tell every application to change its address and go back online? The service would be unavailable for a long time, which is obviously disastrous for a large system.

Therefore a high availability scheme is needed: when a fault occurs, a slave is promoted to master automatically, applications do not need to restart, and no manual operations work is required. Redis offers exactly such a solution: Redis Sentinel.

Sentinel is a very important component of the Redis cluster architecture. Its main functions are as follows:

  • Cluster monitoring

Monitors whether the Redis master and slave processes are working properly

  • Alerts

If a Redis instance fails, the sentinel is responsible for sending alarm messages to the administrator

  • Failover

If the master node goes down, a slave node is automatically promoted to take its place

  • Configuration center

If a failover occurs, the sentinel notifies clients of the new master address

Sentinels themselves are also distributed and run as a cluster:

  • During failover, determining whether a master node is down requires the agreement of most sentinels, involving distributed elections
  • Even if some sentinel nodes go down, the sentinel cluster still works

The current version is Sentinel 2, which rewrites much of the code compared to Sentinel 1, mainly to make the failover mechanism and algorithms more robust and simpler.

A Sentinel plus Redis master-slave deployment does not guarantee zero data loss; it only guarantees high availability of the Redis cluster.

Why two nodes cannot work properly

At least three sentinel instances must be deployed. Consider a deployment with only two instances and quorum = 1:

+----+         +----+
| M1 |---------| R1 |
| S1 |         | S2 |
+----+         +----+

Configuration: quorum = 1

If the master goes down, the switch can be triggered as soon as one of S1 and S2 considers the master down, and one of S1 and S2 is then elected to perform the failover. The failover itself, however, also needs a majority of the sentinels to be running, and the majority of two sentinels is 2.

Majority of 2 sentinels = 2, majority of 3 = 2, majority of 4 = 3, majority of 5 = 3.

With both sentinels running, failover is allowed.

If the entire machine hosting M1 and S1 goes down, only one sentinel is left. There is no majority to authorize the failover, so it cannot run, even though the other machine still has R1.
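
The arithmetic behind this argument can be sketched in a few lines of Java. This is only an illustration, assuming that "majority" means strictly more than half of the known sentinels; the class and method names are made up for the example and are not part of Redis.

```java
// Illustrative sketch: majority arithmetic for a sentinel group.
final class SentinelMajority {
    // A majority is strictly more than half of all known sentinels.
    static int majority(int totalSentinels) {
        return totalSentinels / 2 + 1;
    }

    // A failover can be authorized only while a majority of sentinels is still running.
    static boolean canAuthorizeFailover(int totalSentinels, int runningSentinels) {
        return runningSentinels >= majority(totalSentinels);
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 5; n++) {
            System.out.println(n + " sentinels -> majority " + majority(n));
        }
        // Two sentinels, one machine lost: no majority is left.
        System.out.println("2 deployed, 1 running: " + canAuthorizeFailover(2, 1));
        // Three sentinels, one machine lost: a majority survives.
        System.out.println("3 deployed, 2 running: " + canAuthorizeFailover(3, 2));
    }
}
```

With two sentinels the loss of a single machine already removes the majority, which is why a third sentinel is needed.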

Three-node sentinel cluster

       +----+
       | M1 |
       | S1 |
       +----+
          |
+----+    |    +----+
| R2 |----+----| R3 |
| S2 |         | S3 |
+----+         +----+

Configuration: quorum = 2, majority = 2

If the machine hosting M1 goes down, two sentinels are left. S2 and S3 can agree that the master is down and elect one of them to perform the failover.

The majority of three sentinels is 2, so with the two remaining sentinels running, the failover can be performed.

2 Redis Sentinel architecture

Redis Sentinel failover

  1. Multiple sentinels detect and confirm a problem with the master.
  2. One sentinel is elected as leader.
  3. The leader selects a slave to become the new master.
  4. The remaining slaves are told to become slaves of the new master.
  5. Clients are notified of the master/slave change.
  6. When the old master comes back, it is made a slave of the new master.

  • One sentinel group can monitor multiple master-slave sets

3 Installation and configuration

  1. Start the master and slave nodes
  2. Start the sentinels to monitor the master node (a sentinel is a special Redis process)
  3. In practice the nodes should run on multiple machines
  4. Detailed node configuration

Redis master node

[Startup]
redis-server redis-7000.conf

[Configuration]
port 7000
daemonize yes
pidfile /var/run/redis-7000.pid
logfile "7000.log"
dir "/opt/soft/redis/data/"

Redis slave node

[Startup]
redis-server redis-7001.conf
redis-server redis-7002.conf

slave-1 [Configuration]
port 7001
daemonize yes
pidfile /var/run/redis-7001.pid
logfile "7001.log"
dir "/opt/soft/redis/data/"
slaveof 127.0.0.1 7000

slave-2 [Configuration]
port 7002
daemonize yes
pidfile /var/run/redis-7002.pid
logfile "7002.log"
dir "/opt/soft/redis/data/"
slaveof 127.0.0.1 7000

Sentinel main configuration

port $(port)
dir "/opt/soft/redis/data/"
logfile "$(port).log"
sentinel monitor mymaster 127.0.0.1 7000 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
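
Since the configuration above is a template with a port placeholder, one way to produce a file per sentinel is to render it in code. The following is a minimal sketch under assumptions: the class name is hypothetical, and the ports 26379 to 26381 are example values that do not come from this article.

```java
// Illustrative sketch: fill the sentinel configuration template for one port.
final class SentinelConfigTemplate {
    static String render(int port, String masterName, String masterHost, int masterPort, int quorum) {
        return String.join("\n",
            "port " + port,
            "dir \"/opt/soft/redis/data/\"",
            "logfile \"" + port + ".log\"",
            "sentinel monitor " + masterName + " " + masterHost + " " + masterPort + " " + quorum,
            "sentinel down-after-milliseconds " + masterName + " 30000",
            "sentinel parallel-syncs " + masterName + " 1",
            "sentinel failover-timeout " + masterName + " 180000") + "\n";
    }

    public static void main(String[] args) {
        // Example ports only (assumption); one rendered file per sentinel process.
        int[] ports = {26379, 26380, 26381};
        for (int port : ports) {
            System.out.println("# sentinel on port " + port);
            System.out.println(render(port, "mymaster", "127.0.0.1", 7000, 2));
        }
    }
}
```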

Demo

  • Master node configuration
  • Redirect
  • Print and check the configuration files
  • Start

4 The client

  • Client implementation principle 1
  • Client implementation principle 2
  • Client implementation principle 3: verification
  • Client implementation principle 4: notifications (publish/subscribe)

Client access process

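  1. The set of sentinel addresses
  2. masterName
  3. This is not a proxy model

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

// masterName, sentinelSet, poolConfig and timeout are supplied by the application
JedisSentinelPool sentinelPool = new JedisSentinelPool(masterName, sentinelSet, poolConfig, timeout);

Jedis jedis = null;
try {
	// borrow a connection to the current master
	jedis = sentinelPool.getResource();
	// jedis command
} catch (Exception e) {
	logger.error(e.getMessage(), e);
} finally {
	// return the connection to the pool
	if (jedis != null)
		jedis.close();
}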
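
The discovery part of the client principles listed earlier (find a usable sentinel, ask it for the master, verify the answer) can be sketched without any Redis library. This is a simplified model, not the Jedis implementation: SentinelEndpoint and RoleChecker are hypothetical interfaces that stand in for the SENTINEL get-master-addr-by-name and ROLE commands.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative sketch of client-side master discovery through a sentinel set.
final class MasterDiscovery {
    // Hypothetical stand-in for one sentinel connection.
    interface SentinelEndpoint {
        // Models SENTINEL get-master-addr-by-name; empty when unreachable or unknown.
        Optional<String> getMasterAddrByName(String masterName);
    }

    // Hypothetical stand-in for running ROLE against the reported address.
    interface RoleChecker {
        boolean isMaster(String hostAndPort);
    }

    // Walk the sentinel set, take the first answer, and verify it really is a master.
    static Optional<String> findMaster(List<SentinelEndpoint> sentinels, String masterName, RoleChecker roles) {
        for (SentinelEndpoint sentinel : sentinels) {
            Optional<String> addr = sentinel.getMasterAddrByName(masterName);
            if (addr.isPresent() && roles.isMaster(addr.get())) {
                return addr;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        SentinelEndpoint unreachable = name -> Optional.empty();
        SentinelEndpoint healthy = name -> Optional.of("127.0.0.1:7000");
        Optional<String> master = findMaster(Arrays.asList(unreachable, healthy), "mymaster", addr -> true);
        System.out.println(master.orElse("no master found"));
    }
}
```

Because the client talks to the master directly once it knows the address, the sentinels stay out of the data path, which is what "not a proxy model" means.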

5 Scheduled Task

  1. Every 10 seconds, each sentinel runs the INFO command against the master and the replicas
  • Discovers replica nodes
  • Confirms the master/slave relationship

  2. Every 2 seconds, each sentinel exchanges information through a channel on the master node (pub/sub)
  • Interaction goes through the __sentinel__:hello channel
  • Sentinels exchange their “view” of the node along with information about themselves

  3. Every second, each sentinel pings the other sentinels and the Redis nodes
  • Heartbeat detection, the basis for failure judgment
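
To make the three periods concrete, here is a small tick-based model that reports which task is due at a given second. It is only a sketch of the timing described above, not how Sentinel schedules work internally; the class name and task labels are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the three periodic sentinel tasks.
final class SentinelTimers {
    // Returns the tasks that fire at the given second, counted from startup.
    static List<String> tasksDueAt(int second) {
        List<String> due = new ArrayList<>();
        due.add("PING");                 // every 1 s: heartbeat to masters, slaves and sentinels
        if (second % 2 == 0) {
            due.add("HELLO");            // every 2 s: publish on the master's pub/sub channel
        }
        if (second % 10 == 0) {
            due.add("INFO");             // every 10 s: INFO to master and replicas
        }
        return due;
    }

    public static void main(String[] args) {
        for (int second = 1; second <= 10; second++) {
            System.out.println("t=" + second + "s " + tasksDueAt(second));
        }
    }
}
```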

6 Subjective and objective offline

Subjective offline (Subjectively Down, SDOWN)

This is the offline judgment that a single Sentinel node makes about a server, that is, one Sentinel considers a service offline (possibly for reasons such as not receiving subscription messages or network problems).

Note that subjective offline is only each Sentinel node's own “opinion” about a Redis node failure. An objective offline mechanism is also needed.

Objective offline (Objectively Down, ODOWN)

After multiple Sentinel instances have made the SDOWN judgment on the same server and exchanged it with each other through commands, they conclude that the server is offline, and failover is started.

A server is marked as ODOWN only after a sufficient number of sentinels have marked it as subjectively offline. Failover occurs only if the Master is considered objectively offline.

Arbitration

Arbitration refers to the quorum parameter in the configuration file. A Sentinel first marks the Master node as subjectively offline and then uses the sentinel is-master-down-by-addr command to ask the other Sentinel nodes whether they also consider the Master at that address subjectively offline.

sentinel monitor <masterName> <ip> <port> <quorum>
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds <masterName> <timeout>
sentinel down-after-milliseconds mymaster 30000

Finally, when the number of sentinels that agree reaches the quorum value set above, the Master node is considered objectively offline and failover occurs. Quorum is typically set to half the number of sentinels plus one, for example 2 for three sentinels.
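
The two judgments can be expressed as two small predicates. This is a minimal sketch assuming timestamps in milliseconds and a simple count of agreeing sentinels; the class and method names are illustrative and are not Redis APIs.

```java
// Illustrative sketch of the SDOWN and ODOWN decisions.
final class DownState {
    // SDOWN: one sentinel has had no valid reply for longer than down-after-milliseconds.
    static boolean isSubjectivelyDown(long nowMs, long lastValidReplyMs, long downAfterMs) {
        return nowMs - lastValidReplyMs > downAfterMs;
    }

    // ODOWN: the number of sentinels reporting SDOWN (including this one) reaches quorum.
    static boolean isObjectivelyDown(int sdownReports, int quorum) {
        return sdownReports >= quorum;
    }

    public static void main(String[] args) {
        long downAfterMs = 30000;  // matches "down-after-milliseconds mymaster 30000"
        System.out.println(isSubjectivelyDown(45000, 10000, downAfterMs));
        System.out.println(isObjectivelyDown(1, 2));
        System.out.println(isObjectivelyDown(2, 2));
    }
}
```

With quorum = 2, a single sentinel's SDOWN is not enough; a second sentinel has to agree before the master counts as ODOWN.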

Summary

How Sentinel works

  1. Each Sentinel sends a PING once per second to the Master, Slave, and other Sentinel nodes it knows
  2. If the time since the last valid PING reply of an instance exceeds the down-after-milliseconds value in the configuration file, the instance is marked as subjectively offline by that Sentinel
  3. If a Master is marked as subjectively offline, all sentinels monitoring that Master confirm once per second whether the Master has really entered the subjectively offline state
  4. The Master is marked as objectively offline when enough sentinels (greater than or equal to the value specified in the configuration file) confirm within the specified time period that the Master has indeed gone subjectively offline
  5. If the Master is in the ODOWN state, a new Master is elected automatically, and the remaining slave nodes are pointed at the new Master to continue data replication
  6. Under normal conditions, each Sentinel sends INFO commands to all known masters and slaves every 10 seconds. When the Master is marked as objectively offline by Sentinel, the frequency at which Sentinel sends INFO to all the slaves of the offline Master changes from once every 10 seconds to once every second.
  7. If not enough sentinels agree that the Master is offline, the Master's objective offline status is removed. If the Master returns a valid reply to a Sentinel's PING command, the Master's subjective offline status is removed.

7 Leadership Election

Reason: only one Sentinel node is needed to carry out the failover. Election: a Sentinel asks to become leader through the sentinel is-master-down-by-addr command:

  1. Each Sentinel node that has marked the master subjectively offline sends the command to the other Sentinel nodes, asking to be made leader
  2. A Sentinel node that receives the command agrees to the request if it has not already agreed to a request from another Sentinel node; otherwise it rejects the request
  3. If a Sentinel node finds that its votes amount to more than half of the Sentinel set and also reach quorum, it becomes the leader
  4. If more than one Sentinel node becomes leader during this process, the election is repeated after a waiting period
  • Election example
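
The voting rules can be simulated in memory. The sketch below models a single election round with a first-come-first-served vote per sentinel, and treats a candidate as leader once its votes reach both the majority and the quorum; the class, the method names, and the sentinel names S1 to S3 are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative single-round model of sentinel leader election.
final class LeaderElection {
    // voter -> candidate the voter has granted its vote to in this round
    private final Map<String, String> votes = new HashMap<>();

    // A voter agrees to the first request it sees and rejects requests for anyone else.
    boolean requestVote(String voter, String candidate) {
        String granted = votes.putIfAbsent(voter, candidate);
        return granted == null || granted.equals(candidate);
    }

    int votesFor(String candidate) {
        int count = 0;
        for (String chosen : votes.values()) {
            if (chosen.equals(candidate)) {
                count++;
            }
        }
        return count;
    }

    // Leader needs more than half of all sentinels and at least quorum votes.
    boolean isLeader(String candidate, int totalSentinels, int quorum) {
        int needed = Math.max(totalSentinels / 2 + 1, quorum);
        return votesFor(candidate) >= needed;
    }

    public static void main(String[] args) {
        LeaderElection round = new LeaderElection();
        round.requestVote("S1", "S1");   // S1 votes for itself
        round.requestVote("S2", "S1");   // S1's request reaches S2 first
        round.requestVote("S3", "S2");   // S2's request reaches S3 first
        System.out.println("S2 grants S2? " + round.requestVote("S2", "S2"));
        System.out.println("S1 leader? " + round.isLeader("S1", 3, 2));
        System.out.println("S2 leader? " + round.isLeader("S2", 3, 2));
    }
}
```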

8 Failover

The leader Sentinel completes the failover:

  1. Select a “suitable” node from the slave nodes as the new master node
  2. Run slaveof no one on that slave node to make it the master
  3. Send commands to the remaining slave nodes to make them slaves of the new master node. How they resynchronize is governed by the parallel-syncs parameter
  4. Update the original master node's configuration to slave and keep watching it; when it recovers, command it to replicate from the new master node.

Select the appropriate slave node

  1. Select the slave node with the highest slave-priority. If one exists, return it; otherwise continue.
  2. Select the slave node with the largest replication offset (the most complete copy of the data). If one exists, return it; otherwise continue.
  3. Select the slave node with the smallest runId.
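
The three selection rules amount to a chained comparison. The sketch below is a simplified model: it assumes that a lower slave-priority number means a higher priority and that priority 0 excludes a slave from promotion (as in redis.conf), and it compares runIds as plain strings. The Slave class and the sample values are made up for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative sketch of choosing the slave to promote.
final class SlaveSelection {
    static final class Slave {
        final String runId;
        final int priority;      // slave-priority; lower number = preferred, 0 = never promote (assumption)
        final long replOffset;   // replication offset; larger = more complete data

        Slave(String runId, int priority, long replOffset) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
        }
    }

    // Rule 1: best priority, rule 2: largest offset, rule 3: smallest runId.
    static final Comparator<Slave> BEST_FIRST =
        Comparator.comparingInt((Slave s) -> s.priority)
                  .thenComparing(Comparator.comparingLong((Slave s) -> s.replOffset).reversed())
                  .thenComparing((Slave s) -> s.runId);

    static Optional<Slave> pick(List<Slave> slaves) {
        return slaves.stream()
                     .filter(s -> s.priority != 0)
                     .min(BEST_FIRST);
    }

    public static void main(String[] args) {
        List<Slave> slaves = Arrays.asList(
            new Slave("b", 100, 500),
            new Slave("c", 100, 900),
            new Slave("a", 100, 900),
            new Slave("0", 0, 9999));
        System.out.println(pick(slaves).map(s -> s.runId).orElse("none"));
    }
}
```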

Common development and operations problems

Node operations

  • Machine decommissioning: for example, the warranty has expired
  • Insufficient machine performance: for example, CPU, memory, disk, or network
  • The node itself is faulty: for example, the service is unstable

Master node: sentinel failover <masterName>

Nodes offline

  • Slave node: decide whether the removal is temporary or permanent, for example whether some cleanup is needed. Read/write separation must also be considered.
  • Sentinel node: same as above.

Node online

  • Master node: use sentinel failover to replace it
  • Slave node: run slaveof; the sentinel nodes will detect it
  • Sentinel node: start it with a configuration modeled on the other sentinel nodes.

Role of slave nodes

  1. Replica: the basis of high availability
  2. Scaling read capacity

Because Redis Sentinel only fails over the master node and merely marks slave nodes as subjectively offline, a custom client is required to monitor the corresponding events.

Three messages

  • +switch-master: the master node is switched (a slave node is promoted to master)
  • +convert-to-slave: a master node is converted to a slave node
  • +sdown: subjective offline.
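
A custom client that subscribes to these messages has to parse their payloads. For +switch-master the payload is five space-separated fields: master name, old IP, old port, new IP, new port. The sketch below parses that format; the class name and the sample addresses are illustrative.

```java
// Illustrative parser for the payload of a +switch-master message.
final class SwitchMasterEvent {
    final String masterName;
    final String oldHost;
    final int oldPort;
    final String newHost;
    final int newPort;

    private SwitchMasterEvent(String masterName, String oldHost, int oldPort, String newHost, int newPort) {
        this.masterName = masterName;
        this.oldHost = oldHost;
        this.oldPort = oldPort;
        this.newHost = newHost;
        this.newPort = newPort;
    }

    // Expected payload: "<master name> <old ip> <old port> <new ip> <new port>"
    static SwitchMasterEvent parse(String payload) {
        String[] parts = payload.trim().split("\\s+");
        if (parts.length != 5) {
            throw new IllegalArgumentException("unexpected +switch-master payload: " + payload);
        }
        return new SwitchMasterEvent(parts[0], parts[1], Integer.parseInt(parts[2]),
                                     parts[3], Integer.parseInt(parts[4]));
    }

    public static void main(String[] args) {
        SwitchMasterEvent event = parse("mymaster 127.0.0.1 7000 127.0.0.1 7001");
        System.out.println(event.masterName + " moved to " + event.newHost + ":" + event.newPort);
    }
}
```

A client would typically rebuild its connection pool against the new address when such an event arrives for the master name it uses.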

Conclusion

  • Redis Sentinel is the high availability implementation of Redis: fault discovery, failover, configuration center, and client notification.
  • Redis Sentinel has only been suitable for production since Redis 2.8
  • Deploy the nodes of Redis Sentinel on different physical machines as far as possible
  • The number of Sentinel nodes in Redis Sentinel should be ≥ 3, and preferably odd, to make elections easier
  • The data nodes in Redis Sentinel are no different from ordinary data nodes
  • On initialization the client connects to the set of Sentinel nodes, not to a specific Redis node, but Sentinel is only a configuration center, not a proxy
  • In Redis Sentinel, each Sentinel node monitors the master node, the slave nodes, and the other Sentinel nodes
  • When judging node failure, Redis Sentinel distinguishes between subjective offline and objective offline
  • Understanding the Redis Sentinel failover log is very helpful for understanding Redis Sentinel and for troubleshooting
  • To achieve high availability with read/write separation on Redis Sentinel, the client can rely on the message notifications of the Sentinel nodes to learn about state changes of the Redis data nodes
