Why is Sentinel a separate component for Redis high availability, rather than being built into the Redis data nodes themselves? And what monitoring roles do other distributed architectures rely on for high reliability?

01 Centralized mode in distributed architectures

From the perspective of distributed cluster design, centralized storage systems are generally advised to separate monitoring from business data processing and place it in an independent manager or status monitor. Examples include the NameNode, ZooKeeper, and JournalNode of the distributed file system HDFS; the arbiter in a MongoDB replica set; the NameServer of RocketMQ; and the sentinels of Redis.

The protective moat around Hadoop HA consists of ZooKeeper, JournalNode, and ZKFC (ZKFailoverController). HDFS is armed to the teeth in this way to keep its active/standby pair of NameNodes reliable.

This follows from what a centralized distributed system requires. The reliability of the master node is critical: once the master goes down, a replica node must be able to take over immediately, or the cluster collapses entirely. Whoever monitors whether the master has failed, and promotes a replica to take its place, therefore plays a key role. If that monitor also carried business data processing tasks, its operation would become more complex and the probability of service failure would increase.

If something happens to the monitor, the cluster is left unprotected. Redis sentinels play exactly this monitoring role, so the Redis Sentinel service should stay single-purpose to minimize its own risk of failure.

As shown in the figure below, nodes Redis1-4 are monitored by Sentinel. When Redis1, the master node, fails and goes offline, replication to the slave nodes Redis2-4 stops. Sentinel then promotes Redis4 to master so that it continues serving clients and resumes data replication to Redis2-3.
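The failover described above can be sketched as a toy simulation. This is only an illustration of the separation of roles, not Sentinel's actual implementation: the `Node` and `Monitor` classes, the `repl_offset` field, and the rule "promote the most caught-up live replica" are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data node; it stores data but makes no failover decisions."""
    name: str
    alive: bool = True
    repl_offset: int = 0  # hypothetical counter: how caught up this node is


@dataclass
class Monitor:
    """A single-purpose monitor: it holds cluster state, never business data."""
    master: Node
    replicas: list = field(default_factory=list)

    def check(self) -> Node:
        """Return the current master, promoting a replica if the master is down."""
        if self.master.alive:
            return self.master
        candidates = [r for r in self.replicas if r.alive]
        if not candidates:
            raise RuntimeError("no live replica to promote")
        # Assumed selection rule for this sketch: the most caught-up replica wins.
        new_master = max(candidates, key=lambda r: r.repl_offset)
        # Dead replicas are simply dropped from the list in this sketch.
        self.replicas = [r for r in candidates if r is not new_master]
        self.master = new_master
        return new_master


redis1 = Node("Redis1", repl_offset=100)
redis2 = Node("Redis2", repl_offset=90)
redis3 = Node("Redis3", repl_offset=95)
redis4 = Node("Redis4", repl_offset=100)

monitor = Monitor(master=redis1, replicas=[redis2, redis3, redis4])
redis1.alive = False        # the master fails
promoted = monitor.check()  # the monitor reacts and promotes a replica
print(promoted.name, [r.name for r in monitor.replicas])
```

With these made-up offsets the monitor promotes Redis4 and leaves Redis2-3 as its replicas, matching the scenario in the text. The point of the design is that `Monitor` contains no data path at all, so there is very little in it that can go wrong.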

02 Partition-based (sharded) master-slave distributed architecture

Among centralized distributed clusters there is also a special mode: the partition-centered (shard-centered) master-slave mode used by Kafka and Elasticsearch. Here the leader node is not required to be a dedicated central coordinator; usually whichever node is elected master takes on the coordination and management of the cluster.

For example, Kafka's leader node (called the controller in Kafka's own terminology) takes the leading role only in administrative operations such as topic creation and partition reassignment. The rest of the time it is a peer broker in the cluster, accepting client writes to and reads from topic partitions.

This reduces the operational risk and load that arise when a single primary node must accept all data writes. In the primary/secondary mode of Redis, MongoDB, and MySQL, for example, the primary node bears the entire write load plus the load of replicating data to the secondary nodes. A Kafka or Elasticsearch cluster instead applies the master/slave relationship per data partition (shard), with the partitions spread across the nodes, so the read/write load and its risk are shared among the nodes in the cluster.
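A small sketch can make the difference in write load concrete. The node names, the key format, and the toy partitioner below are all hypothetical; real systems hash a key or routing value rather than parsing a numeric suffix.

```python
from collections import Counter

nodes = ["node-1", "node-2", "node-3"]
keys = [f"key-{i}" for i in range(9)]


def single_primary(keys, nodes):
    """Every write lands on one primary (here, arbitrarily, the first node)."""
    return Counter(nodes[0] for _ in keys)


def partitioned(keys, nodes, num_partitions=3):
    """Each partition has its own leader; leaders are spread round-robin."""
    leader_of = {p: nodes[p % len(nodes)] for p in range(num_partitions)}
    # Toy partitioner: numeric suffix of the key modulo the partition count.
    return Counter(
        leader_of[int(k.split("-")[1]) % num_partitions] for k in keys
    )


print(single_primary(keys, nodes))
print(partitioned(keys, nodes))
```

In the single-primary case all nine writes hit node-1, while in the partitioned case each of the three nodes receives three writes.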

For example, when a document is written to Elasticsearch, the node holding the primary shard writes the data and then copies it to the replica shards on other nodes. Queries run in parallel across multiple nodes in the cluster: the coordinating node (the node that receives the request) decides which shard copies on which nodes take part in the query, then merges the results from the different nodes and returns them to the client.
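The write path and the scatter-gather query described above can be sketched in a few lines. This is a simplified model, not Elasticsearch code: the shard layout, the modulo routing, and the rule for choosing which shard copy to read are assumptions made for illustration.

```python
NUM_SHARDS = 3
NODES = ["node-a", "node-b", "node-c"]

# Hypothetical layout: shard s has its primary on node s and a replica on the next node.
primary_node = {s: NODES[s] for s in range(NUM_SHARDS)}
replica_node = {s: NODES[(s + 1) % len(NODES)] for s in range(NUM_SHARDS)}

# storage[node][shard] is the list of documents that node holds for that shard copy.
storage = {n: {} for n in NODES}


def index(doc_id: int, text: str) -> None:
    """Write to the primary shard, then copy to the replica on another node."""
    shard = doc_id % NUM_SHARDS  # toy routing; real systems hash a routing value
    for node in (primary_node[shard], replica_node[shard]):
        storage[node].setdefault(shard, []).append((doc_id, text))


def search(term: str) -> list:
    """Scatter the query to one copy of every shard, then gather the hits."""
    hits = []
    for shard in range(NUM_SHARDS):
        # Alternate between primary and replica copies to spread the read load.
        node = primary_node[shard] if shard % 2 == 0 else replica_node[shard]
        hits += [d for d in storage[node].get(shard, []) if term in d[1]]
    return sorted(hits)


docs = ["redis sentinel", "kafka broker", "redis cluster", "hdfs namenode"]
for i, text in enumerate(docs):
    index(i, text)

print(search("redis"))
```

Because every shard copy holds the same documents, the query can be served from either the primary or the replica, which is what lets the read load spread over the cluster.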

Using primary/secondary partitions (shards) to spread the risk of a single master gives a distributed mode that sits between centralization and decentralization. In essence it is still centrally managed, but because the data load no longer rests on one master node, there is no need for an independent role like Redis Sentinel to reinforce the protection of the master. As shown below:

The last type is the truly decentralized distributed cluster, in which each node typically handles both state maintenance and data operations. Redis Cluster is a typical decentralized cluster mode: every node has to take part in keeping the cluster state consistent. A distributed file system such as GlusterFS is likewise typical decentralized storage and needs no separate state monitor; each node looks after itself. Cluster management decisions are usually made by majority vote.
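A majority vote of this kind can be sketched as follows. The function name, the node names, and the vote format are hypothetical; the only rule taken from the text is that a decision needs more than half of the nodes.

```python
def majority_agrees(votes: dict) -> bool:
    """Return True when strictly more than half of all nodes vote yes."""
    yes = sum(1 for v in votes.values() if v)
    return yes > len(votes) // 2


# Hypothetical five-node cluster voting on whether node "n3" has failed.
# A node that cannot be reached (here n3 itself) counts as not voting yes.
votes = {"n1": True, "n2": True, "n3": False, "n4": True, "n5": False}
print(majority_agrees(votes))
```

Requiring a strict majority means that two halves of a split cluster can never both reach a decision, which is why such clusters are usually deployed with an odd number of voting nodes.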