Why you need Redis Cluster when you already have a master-slave architecture
Capacity bottlenecks for the master/slave architecture
Recall that in a master-slave architecture (one master with many slaves), the master handles writes and synchronizes the data to the slave nodes, which then serve read requests. Adding slave nodes raises the read QPS of the master-slave architecture, and Sentinel keeps the architecture highly available.
The combination of master-slave and Sentinel can therefore withstand highly concurrent reads and also achieve high availability. However, the source of all data in the master-slave architecture is the master, and there is only one master node, so the data capacity of Redis is limited to what a single node can hold. With massive amounts of data, a master-slave plus Sentinel architecture alone is not enough.
This is what we call a master-slave capacity bottleneck.
So we need a new architecture to support massive amounts of data.
Readers might reason as follows: since the data capacity of a single master node is limited, why not use multiple master nodes, each holding different data? Would that not support a larger amount of data?
Besides holding more data, the new architecture must at least stay highly available. What is the point of holding so much data if availability cannot be guaranteed? An architecture that becomes entirely unavailable when one master node fails is of little use, so the new architecture has to be highly available.
This new architecture is the Redis Cluster that we will talk about next.
Redis Cluster principle
Basic introduction
- Redis Cluster contains multiple master nodes, and each master node can have multiple slave nodes attached. Multiple masters support a larger data volume, and the cluster scales out: to hold more data, add new master and slave nodes.
- When a master node fails, Redis Cluster selects a new master from the slave nodes of the failed master to complete the failover (high availability).
- By default, slave nodes in Redis Cluster do not serve reads or writes. A slave is used only as a standby node for when its master fails.
Node communication
The communication process
Two ways to maintain meta information
A mechanism is required to maintain node metadata in distributed storage. The common maintenance modes are centralized and P2P. The Redis Cluster uses the P2P Gossip protocol.
Node meta information mainly covers which data each node is responsible for and status information such as whether a node has failed.
Let’s compare these two ways of maintaining meta information.
Centralized: Meta information is stored centrally in an external component, such as ZooKeeper.
- Benefits: Metadata updates and reads are very timely. Once metadata changes, it is written to the centralized store immediately, and other nodes see the change as soon as they read it.
- Disadvantages: All metadata update pressure lands in one place, which can put the metadata store under stress.
Gossip:
- Benefits: Metadata updates are spread out rather than concentrated in one place. Update requests reach all nodes gradually, with some delay, which reduces the pressure.
- Disadvantages: Metadata updates are delayed, which may cause some cluster operations to lag.
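The delay that gossip introduces can be illustrated with a toy simulation. The sketch below is not how Redis implements gossip; it is a minimal push-style model under simple assumptions, and the names gossip_rounds, fanout and max_rounds are hypothetical. Node 0 learns about a metadata change, and in every round each informed node forwards it to a few random peers. The update needs several rounds to reach everyone, whereas a centralized store would expose it on the next read.

```python
import random


def gossip_rounds(node_count, fanout=1, seed=0, max_rounds=1000):
    """Return how many rounds a push-style gossip needs to inform all nodes.

    Toy model only: node 0 learns about a metadata change, and in every
    round each informed node forwards it to `fanout` randomly chosen peers.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    informed = {0}                     # node 0 holds the new metadata
    rounds = 0
    while len(informed) < node_count and rounds < max_rounds:
        rounds += 1
        newly_informed = set()
        for node in informed:
            peers = [n for n in range(node_count) if n != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
    return rounds


if __name__ == "__main__":
    for size in (6, 50, 500):
        print(size, "nodes ->", gossip_rounds(size), "rounds")
```

In this model the number of informed nodes can at most double per round, so the round count grows slowly with cluster size while no single node has to absorb every update.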
Port + 10000
Each node has a port dedicated to inter-node communication: the port number it serves clients on plus 10000. For example, a node that serves clients on port 7001 uses port 17001 for inter-node communication.
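The port rule can be written down in a few lines. This is a minimal sketch of the arithmetic described above; the function name cluster_bus_port and the range check are illustrative assumptions, not part of any Redis API.

```python
CLUSTER_BUS_PORT_OFFSET = 10000  # offset described in the text


def cluster_bus_port(service_port):
    """Return the inter-node communication port for a given service port."""
    bus_port = service_port + CLUSTER_BUS_PORT_OFFSET
    # Illustrative sanity check: both ports must be valid TCP port numbers.
    if not 0 < service_port < bus_port <= 65535:
        raise ValueError(f"port {service_port} leaves no valid bus port")
    return bus_port


if __name__ == "__main__":
    print(cluster_bus_port(7001))
```

One practical consequence is that both ports of every node have to be reachable from the other nodes, for example when configuring a firewall.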
At intervals, each node sends ping messages to several other nodes, and the nodes that receive a ping reply with a pong.
Information exchanged by nodes
The exchanged information covers node failures, node additions and removals, hash slot information, and so on.
Details of the Gossip protocol
Gossip messages
The Gossip protocol uses several message types, including meet, ping, pong, and fail.
- Meet: A node sends a meet message to a new node to bring it into the cluster, and the new node then starts communicating with the other nodes.
- Ping: Every second, each node sends pings to other nodes. A ping carries the sender's own status and the cluster meta information it maintains, so nodes exchange and update metadata with each other through pings.
- Pong: The reply to ping and meet messages, containing the sender's status and other information. It can also be used to broadcast information and updates.
- Fail: After a node determines that another node has failed, it sends a fail message to the other nodes to inform them that the specified node is down.
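The four message types can be modeled with a small in-memory sketch. This is a simplified illustration, not the Redis wire protocol: the classes MessageType and Node and the handle method are hypothetical, and a message here merges the sender's whole view, while real pings carry only part of it.

```python
from enum import Enum


class MessageType(Enum):
    MEET = "meet"
    PING = "ping"
    PONG = "pong"
    FAIL = "fail"


class Node:
    """Toy cluster node that tracks which peers it knows and which failed."""

    def __init__(self, name):
        self.name = name
        self.known = {name}     # names of nodes this node is aware of
        self.failed = set()     # names of nodes reported as down

    def handle(self, msg_type, sender, payload=None):
        """Process one message and return the reply type, if any."""
        if msg_type in (MessageType.MEET, MessageType.PING):
            # Both carry the sender's view of the cluster; merge it and reply.
            self.known |= {sender.name} | sender.known
            return MessageType.PONG
        if msg_type is MessageType.PONG:
            self.known |= {sender.name} | sender.known
            return None
        if msg_type is MessageType.FAIL:
            self.failed.add(payload)   # payload: name of the failed node
            return None
        raise ValueError(f"unknown message type: {msg_type}")


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    a.handle(MessageType.PONG, b)          # a already knows b
    reply = c.handle(MessageType.MEET, a)  # a invites the new node c
    a.handle(reply, c)                     # c answers with a pong
    print(sorted(c.known), sorted(a.known))
    c.handle(MessageType.FAIL, a, payload="b")
    print(c.failed)
```

After the meet exchange, the new node has learned about every node the inviting node knew, which is how a single meet is enough to join the cluster in this model.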
A closer look at ping messages:
Pings are frequent and carry metadata, so they can put a strain on the network.
To limit this overhead, a node does not ping every peer all the time. Its cluster timer runs 10 times per second, and about once per second it samples 5 random nodes and pings the one it has gone the longest without hearing from.
In addition, if the time since a node last heard from some peer reaches half of cluster-node-timeout, it pings that peer immediately, so that data exchange is never delayed for too long.
- For example, if two nodes went 10 minutes without exchanging data, the cluster metadata would be seriously inconsistent, which causes problems. This is why cluster-node-timeout is adjustable: a larger value lowers the frequency of these pings.
Each ping carries the sending node's own information plus the information of about 1/10 of the other nodes, and is sent out for data exchange.
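The ping selection rules can be sketched as follows. This is a simplified model, not the Redis source: it assumes that a node samples a few random peers and pings the one heard from least recently, and that any peer silent for half the node timeout is always pinged. The function names, the last_pong dictionary, and the lower bound of 3 entries in gossip_entry_count are assumptions made for illustration.

```python
import random


def pick_ping_targets(last_pong, now, node_timeout, rng=random, sample_size=5):
    """Choose which peers to ping on one pass.

    last_pong maps peer name -> time (seconds) a pong was last received.
    """
    targets = set()
    if last_pong:
        # Sample a few random peers and ping the one heard from least recently.
        sample = rng.sample(sorted(last_pong), min(sample_size, len(last_pong)))
        targets.add(min(sample, key=lambda peer: last_pong[peer]))
    # Always ping any peer that has been silent for half the node timeout.
    for peer, pong_time in last_pong.items():
        if now - pong_time >= node_timeout / 2:
            targets.add(peer)
    return targets


def gossip_entry_count(total_nodes):
    """Number of other nodes described in one ping: about 1/10 of the cluster."""
    others = max(total_nodes - 2, 0)       # exclude sender and receiver
    # Assumed bounds: at least 3 entries, never more than the nodes available.
    return min(max(total_nodes // 10, 3), others)


if __name__ == "__main__":
    pongs = {"a": 100.0, "b": 99.0, "c": 50.0}
    print(pick_ping_targets(pongs, now=100.0, node_timeout=15.0))
    print(gossip_entry_count(100))
```

Keeping the per-ping payload at a fraction of the cluster is what stops the message size from growing linearly with the number of nodes.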
High availability and master-slave failover
Redis Cluster achieves high availability in much the same way as Sentinel.
1. Detecting node failure
- If one node thinks another node is down, that is PFAIL, a subjective failure.
- If multiple nodes think that another node is down, that is FAIL, an objective failure.
These correspond to Sentinel's sdown and odown.
If a node does not return a pong within cluster-node-timeout, it is considered PFAIL.
When a node considers another node PFAIL, it reports this to other nodes in its gossip ping messages. Once more than half of the master nodes agree that the node is down, it becomes FAIL (objectively down).
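The two failure states can be expressed as two small checks. This is a minimal sketch of the rules above, assuming that times are plain seconds and that the caller already knows which nodes reported the target; the function names and parameters are hypothetical.

```python
def is_pfail(last_pong, now, node_timeout):
    """Subjective failure: no pong received within the node timeout."""
    return now - last_pong > node_timeout


def is_fail(reporters, voters):
    """Objective failure: more than half of the voting nodes report PFAIL.

    reporters: names of nodes that currently flag the target as PFAIL.
    voters:    names of all nodes whose opinion counts.
    """
    needed = len(voters) // 2 + 1
    # Reports from nodes outside the voter set are ignored.
    return len(set(reporters) & set(voters)) >= needed


if __name__ == "__main__":
    voters = {"m1", "m2", "m3"}
    print(is_pfail(last_pong=0, now=16, node_timeout=15))
    print(is_fail({"m1", "m2"}, voters))
```

Requiring a majority means a single node with a bad network link cannot declare a healthy node down on its own.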
2. Slave node filtering
- For the failed master node, one of its slave nodes is selected to become the new master.
- Check how long each slave node has been disconnected from the master. If the duration exceeds cluster-node-timeout * cluster-slave-validity-factor, that slave node is not eligible to become the master.
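The eligibility rule is a single comparison. The sketch below follows the formula given above, with hypothetical names; treating a factor of 0 as "check disabled" is an assumption added for illustration.

```python
def is_eligible(disconnected_seconds, node_timeout, validity_factor):
    """Return True if the slave may take part in the election."""
    if validity_factor == 0:
        # Assumed convention: a factor of 0 disables the check.
        return True
    return disconnected_seconds <= node_timeout * validity_factor


if __name__ == "__main__":
    # With a 15 second timeout and a factor of 10, the limit is 150 seconds.
    print(is_eligible(100, node_timeout=15, validity_factor=10))
    print(is_eligible(151, node_timeout=15, validity_factor=10))
```

The filter exists because a slave that lost contact with its master long ago probably holds stale data and would lose writes if promoted.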
3. Slave node election
- Each slave node sets its election time according to its replication offset from the master. The larger the offset (the more data the slave has replicated), the earlier its election time and the higher its priority.
- If a majority of the master nodes (N/2 + 1) vote for a slave node, that slave node can be switched to master.
- The slave node then performs the failover and takes over as the master node.
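The election steps above can be sketched as three helpers. This is an illustration under stated assumptions, not the Redis implementation: the function names are hypothetical, and the base_ms and step_ms delay values are illustrative numbers that do not come from this article.

```python
def election_order(slave_offsets):
    """Rank slaves so the one with the largest replication offset goes first."""
    return sorted(slave_offsets, key=lambda name: slave_offsets[name], reverse=True)


def election_delay_ms(rank, base_ms=500, step_ms=1000):
    """Delay before a slave asks for votes; rank 0 is the most up to date.

    base_ms and step_ms are illustrative values, not taken from the article.
    """
    return base_ms + rank * step_ms


def wins_election(votes, master_count):
    """A slave is promoted once N/2 + 1 of the N masters vote for it."""
    return votes >= master_count // 2 + 1


if __name__ == "__main__":
    offsets = {"s1": 100, "s2": 250, "s3": 180}
    for rank, name in enumerate(election_order(offsets)):
        print(name, "requests votes after", election_delay_ms(rank), "ms")
    print(wins_election(votes=2, master_count=3))
```

Staggering the requests by rank gives the slave with the most data the first chance to collect a majority, so it usually wins without a contested vote.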
For comparison, Sentinel chooses the new master by slave priority, replication offset, and run ID.
Conclusion
This article started from the shortcomings of the master-slave architecture to motivate Redis Cluster, and then introduced how Redis Cluster works, focusing on two points: the gossip protocol and the implementation of high availability. The hash slot algorithm, an important data sharding algorithm in Redis Cluster, is not discussed in detail in this article.