Why does Redis need to be distributed?
High performance
The QPS of a single Redis instance is already high, but under very heavy concurrency its performance can still suffer. In that case we want more Redis services to share the load and achieve load balancing.
High availability
If there is only one Redis service, once it goes down no client can access it, which has a big impact on the business. Moreover, if the hardware is damaged, all the data on it is unrecoverable, so we need a backup.
Scalability
The third point is about storage. Because all Redis data is kept in memory, a large data set easily runs into hardware limits. For example, if one Redis instance can only hold 4 GB but there are 8 GB of data to store, the data has to be split across two machines. This is horizontal scale-out, or horizontal sharding.
Redis distributed solutions
Master-slave replication
Like Kafka, RocketMQ, MySQL, and ZooKeeper, Redis supports a cluster architecture with primary and secondary nodes. The primary node is called the master and the secondary node is called the slave. A slave automatically synchronizes data from its master through replication.
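For reference, a slave only needs to be pointed at its master, for example in redis.conf (the address is a placeholder; versions before Redis 5.0 use the slaveof directive instead):

# in redis.conf (or issued as a runtime command)
replicaof 192.168.8.203 6379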
Master-slave replication solves part of the data-backup and performance problems, but it does not solve high availability. In a one-master, multi-slave setup, if the master goes down the service becomes unavailable: someone has to manually promote a slave to master and point the remaining nodes at it as its slaves. This takes time and causes a period of unavailability.
Sentinel
Since Redis 2.8, a stable version of Sentinel has been available to solve the high-availability problem. The idea is to start an odd number of Sentinel services to monitor the Redis servers and ensure the availability of the service.
Sentinel can be started with a script; it is essentially just a Redis server running in a special mode. Sentinel uses the info command to obtain the master and slave information of the Redis instances it monitors.
./redis-sentinel ../sentinel.conf
# or
./redis-server ../sentinel.conf --sentinel
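For reference, a minimal sentinel.conf might look like the following sketch; the master name "mymaster", the address, and the quorum of 2 are placeholders to adapt:

port 26379
# Monitor the master named "mymaster"; at least 2 sentinels must agree it is down
sentinel monitor mymaster 127.0.0.1 6379 2
# Mark the master subjectively down after 30 seconds without a valid reply
sentinel down-after-milliseconds mymaster 30000
# Give up on a failover that takes longer than 3 minutes
sentinel failover-timeout mymaster 180000
# Resynchronize slaves with the new master one at a time
sentinel parallel-syncs mymaster 1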
To ensure the availability of the monitoring service itself, Sentinel is deployed as a cluster. The Sentinels monitor all Redis services and also monitor each other. Sentinel itself has no master and slave; all Sentinels are equal, and only the Redis service nodes have master and slave roles.
Sentinel uses the Raft consensus algorithm to elect a leader among the Sentinels, and the leader completes the failover. Raft is widely used; for example, the Spring Cloud registry Consul also uses it. The core idea of Raft is: first come, first served, and the majority rules. Sentinel's Raft implementation differs from the original algorithm, but the general idea is the same. Raft algorithm demo: thesecretlivesofdata.com/raft/.
Clients such as Jedis and Spring Boot (which defaults to Lettuce since 2.x) only need to be configured with all the Sentinel addresses; the Sentinels return the address of the current master node.
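For reference, a minimal Jedis sketch is shown below; the master name "mymaster" and the sentinel addresses are placeholders that must match your sentinel configuration:

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelClientTest {
    public static void main(String[] args) {
        // Sentinel addresses (placeholders); the pool asks them for the current master
        Set<String> sentinels = new HashSet<>();
        sentinels.add("192.168.8.201:26379");
        sentinels.add("192.168.8.202:26379");
        sentinels.add("192.168.8.203:26379");

        JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels, new JedisPoolConfig());
        try (Jedis jedis = pool.getResource()) {
            jedis.set("hello", "sentinel");
            System.out.println("Current master: " + pool.getCurrentHostMaster());
        }
        pool.close();
    }
}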
Drawbacks of Sentinel: data may be lost during a master/slave switchover, and because there is still only a single master accepting writes, it does not solve the problem of horizontal scaling.
Redis Cluster
Redis Cluster was officially released in Redis 3.0 to meet the need for distribution while also providing high availability. It is decentralized: clients can connect to any available node. A Redis Cluster can be viewed as a data set composed of multiple Redis instances; the client does not need to care which node a subset of the data is stored on, only about the set as a whole.
The following is a three-master and three-slave Redis Cluster architecture:
Redis Cluster defines 16,384 slots, and each node is responsible for a range of slots. For example, Node1 is responsible for 0-5460, Node2 for 5461-10922, and Node3 for 10923-16383.
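The slot ranges owned by each node can be inspected from any node; each line of the cluster nodes output lists a node together with the slots it serves (the port is a placeholder):

redis-cli -p 6379 cluster nodes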
When an object is written to Redis Cluster, the key is first hashed with CRC16 and taken modulo 16384 to get a slot number, and the data lands on the node responsible for that slot. To check which slot a key belongs to:
redis>cluster keyslot jack
The relationship between key and slot will never change. Only the relationship between slot and Redis node will change.
We know that keys are distributed to different nodes after the CRC16-and-modulo calculation. If we want many keys to land on the same node, we just need to add a {hash tag} to the key: Redis then uses only the string between the braces to calculate the slot number, as shown below (see also the sketch after this example):
user{666}base=...
user{666}fin=...
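To check the mapping from the client side, Jedis ships a CRC16 helper; a minimal sketch is below, assuming the JedisClusterCRC16 class (under redis.clients.util in older Jedis releases, redis.clients.jedis.util in newer ones), whose getSlot also honors {hash tags}:

import redis.clients.util.JedisClusterCRC16;

public class SlotTest {
    public static void main(String[] args) {
        // Plain key: slot = CRC16(key) % 16384
        System.out.println("jack -> slot " + JedisClusterCRC16.getSlot("jack"));
        // Keys sharing the same {hash tag} map to the same slot
        System.out.println("user{666}base -> slot " + JedisClusterCRC16.getSlot("user{666}base"));
        System.out.println("user{666}fin  -> slot " + JedisClusterCRC16.getSlot("user{666}fin"));
    }
}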
Master/slave switchover process:
When a slave discovers that its master is in the FAIL state, it attempts a failover in order to become the new master. Because the failed master may have multiple slaves, several slaves may compete to become the master node. The process is as follows:
- The slave discovers that its master has entered the FAIL state
- It increments the recorded cluster currentEpoch by 1 and broadcasts a FAILOVER_AUTH_REQUEST message
- When the other nodes receive this message, only the masters respond: they check the validity of the requester and send FAILOVER_AUTH_ACK. Each master sends at most one ACK per epoch
- The slave attempting the failover collects the FAILOVER_AUTH_ACKs it receives
- After receiving ACKs from more than half of the masters, it becomes the new master
- It broadcasts a Pong message to notify the other cluster nodes
For example, in a cluster composed of three master-slave groups A, B, and C, suppose A's master fails and A's two slaves both initiate an election. If B's master votes for A's slave A1 and C's master votes for A's slave A2, neither gets a majority, so a second election is initiated: the election epoch is incremented by 1 and the process above repeats. In fact, a slave does not try to start an election the moment its master enters the FAIL state; there is a certain delay, which ensures the FAIL state has time to propagate through the cluster. If the slave tried to start the election immediately, other masters might not yet be aware of the FAIL state and could refuse to vote.
Redis Cluster characteristics
- There is no central structure.
- Data is distributed to multiple nodes according to slot. Data is shared among nodes and data distribution can be dynamically adjusted.
- The system can be scaled linearly up to 1000 nodes (the official documentation recommends no more than 1000 nodes). Nodes can be added or removed dynamically.
- High availability: when some nodes are unavailable, the cluster is still usable. Adding slaves provides standby data copies and enables automatic failover. Nodes exchange status information through the Gossip protocol, and a slave is promoted to master through voting.
- Reduces O&M costs and improves system scalability and availability.
So far, three distributed Redis schemes have been introduced. Redis Cluster can not only assign master and slave roles but also switch between them automatically, so it effectively integrates Replication and Sentinel.
Redis sharding schemes
There are three kinds of solutions. The first implements the sharding logic on the client side: for example, keys are sharded by modulo or consistent hashing, and the client determines a key's route before each read or write.
The second extracts the sharding logic into a separate proxy service: clients connect to the proxy, and the proxy forwards the requests.
The third is implemented on the server side: this is the Redis Cluster described above.
Client
Let's take Jedis as an example of a client. Jedis has several connection pools, one of which, ShardedJedisPool, supports sharding. Let's do an experiment: we have two Redis nodes and set 100 keys across them via JedisShardInfo.
import java.util.Arrays;
import java.util.List;

import redis.clients.jedis.Client;
import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;
import redis.clients.jedis.ShardedJedisPool;

public class ShardingTest {
    public static void main(String[] args) {
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        // Redis servers
        JedisShardInfo shardInfo1 = new JedisShardInfo("127.0.0.1", 6379);
        JedisShardInfo shardInfo2 = new JedisShardInfo("192.168.8.205", 6379);
        // Sharded connection pool
        List<JedisShardInfo> infoList = Arrays.asList(shardInfo1, shardInfo2);
        ShardedJedisPool jedisPool = new ShardedJedisPool(poolConfig, infoList);
        ShardedJedis jedis = null;
        try {
            jedis = jedisPool.getResource();
            for (int i = 0; i < 100; i++) {
                jedis.set("k" + i, "" + i);
            }
            for (int i = 0; i < 100; i++) {
                Client client = jedis.getShard("k" + i).getClient();
                System.out.println("Get value: " + jedis.get("k" + i) + ", current key is located at: "
                        + client.getHost() + ":" + client.getPort());
            }
        } finally {
            if (jedis != null) {
                jedis.close();
            }
        }
    }
}
Source: com/xhj/jedis/shard/ShardingTest.java
Running dbsize on each server shows that one has 44 keys and the other has 56. From the results we can see the load is roughly balanced, but how is it done? A first guess is modulo hashing: hash(key) % N, with the remainder deciding which node a key maps to. This method is simple and is a static sharding rule; however, once the number of nodes changes (added or removed), the modulus N changes and the data has to be redistributed. To solve this problem we have the consistent hash algorithm, which is what ShardedJedisPool actually uses.
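A minimal sketch of the naive modulo routing just described (the node addresses and hash function are illustrative); note that changing the node count changes the modulus and remaps most keys:

import java.util.Arrays;
import java.util.List;

public class ModuloRouter {
    private final List<String> nodes;

    public ModuloRouter(List<String> nodes) {
        this.nodes = nodes;
    }

    // Route a key to a node by hash(key) % N; any change to N remaps most keys
    public String route(String key) {
        int hash = key.hashCode() & Integer.MAX_VALUE; // force a non-negative hash
        return nodes.get(hash % nodes.size());
    }

    public static void main(String[] args) {
        ModuloRouter router = new ModuloRouter(Arrays.asList("127.0.0.1:6379", "192.168.8.205:6379"));
        System.out.println("k1 -> " + router.route("k1"));
    }
}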
Consistent hashing algorithm
In the consistent hash algorithm, we organize the whole hash space into a virtual ring (the hash ring), arranged in a clockwise direction. Because it is a circular space, 0 and 2^32 - 1 meet.
Suppose we have four machines. We calculate the hash values based on the machine name or IP, and then distribute them into the hash ring (pink circle), as shown below:
Now suppose four requests come in to set or get a key. We hash each key to get its position on the hash ring (the blue circles). The first node found clockwise along the ring from that position is the node where the data is stored.
Node 5 is added, which affects only some data
Deleting node 1 affects only part of the data
The consistent hash algorithm solves the problem of having to redistribute all data when nodes are dynamically added or removed: a change only affects the adjacent node and has no effect on the others. However, the algorithm still has a drawback: the nodes may not be evenly distributed on the ring, especially when there are few of them, which puts great pressure on node 1 in the figure above. To solve this, virtual nodes are introduced.
With Node1 introducing two virtual nodes and Node2 introducing two virtual nodes, the data distribution will be fairly even.
The consistent hash algorithm is used in distributed systems, load balancing, and database sharding; like LRU, it is a fundamental algorithm. So how do we implement it in Java? What exactly is the hash ring, and what are virtual nodes?
Looking at the Jedis source, in the redis.clients.util.Sharded#initialize() method the Redis nodes are placed into a red-black tree, a TreeMap.
private void initialize(List<S> shards) {
    // Create the red-black tree (the hash ring)
    nodes = new TreeMap<Long, S>();
    // Loop over the Redis nodes
    for (int i = 0; i != shards.size(); ++i) {
        final S shardInfo = shards.get(i);
        // Create 160 virtual nodes (per weight unit) for each node and put them into the red-black tree
        if (shardInfo.getName() == null) {
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                // Hash by a generated name
                nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n), shardInfo);
            }
        } else {
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                // Hash by the configured name
                nodes.put(this.algo.hash(shardInfo.getName() + "*" + shardInfo.getWeight() + n), shardInfo);
            }
        }
        // Put the Redis node information into the resources map
        resources.put(shardInfo, shardInfo.createResource());
    }
}
When we have a key to get or set, we need to know which node it goes to.
public R getShard(String key) {
    // Look up the concrete connection for the chosen node in resources
    return resources.get(getShardInfo(key));
}

public S getShardInfo(byte[] key) {
    // Hash the key and take the sub-map of nodes whose hash is >= that value
    SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
    if (tail.isEmpty()) {
        // Nothing hashes higher than the key: wrap around to the first node on the ring
        return nodes.get(nodes.firstKey());
    }
    // Otherwise return the first node whose hash is greater than or equal to the key's hash
    return tail.get(tail.firstKey());
}
We have used the Jedis source to introduce the consistent hash algorithm; implementations in other scenarios are much the same, differing mainly in the choice of data structure:
- The simplest implementation is an ordered list: start from the 0th element and scan until the first node whose hash is larger than the data's hash is found; the data belongs to the server corresponding to that node. The time complexity is O(n).
- A binary search tree brings the time complexity down to O(log n).
We cannot simply use an ordinary binary search tree, because it may become unbalanced. Balanced binary search trees include AVL trees, red-black trees, and so on; a red-black tree is used here. There are two reasons for choosing a red-black tree:
- A red-black tree stores ordered data, which is the same idea as the first solution but far more efficient.
- The JDK already provides red-black tree implementations: TreeMap and TreeSet.
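For comparison with the Jedis source above, here is a minimal standalone ring built on TreeMap; the 160 virtual nodes per physical node mirror the Jedis convention, while the MD5-based hash is just an illustrative stand-in for the hash function Jedis uses:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node, int virtualNodes) {
        // Each physical node contributes several virtual nodes on the ring
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "-VN-" + i), node);
        }
    }

    public String route(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        // Wrap around to the first node if nothing hashes higher than the key
        Long slot = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(slot);
    }

    // Illustrative hash derived from MD5; any well-distributed hash works here
    private long hash(String s) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] d = md5.digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("127.0.0.1:6379", 160);
        ring.addNode("192.168.8.205:6379", 160);
        System.out.println("k1 -> " + ring.route("k1"));
    }
}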
Proxy
Client-side sharding with ShardedJedisPool and similar code has the advantage of simple configuration: it does not depend on other middleware, and the partitioning logic is under your own control, which is flexible. The disadvantage is that nodes cannot be added or removed dynamically, and every client has to maintain its own sharding strategy, which leads to duplicated code. So the next idea is to extract the sharding code into a shared service: all clients connect to this proxy layer, and the proxy forwards the requests.
The architecture diagram is shown below. This is the same kind of role that Mycat plays as database sharding middleware. Typical proxy partitioning schemes include Twitter's open-source Twemproxy and the Chinese open-source project Codis.
However, proxies have problems of their own: failover is not automatic, the architecture is complex, and extra components such as ZooKeeper (or etcd, or local files) are required. They are rarely used now and can be regarded as a transitional solution before Redis Cluster, so we will not go into detail here.
Redis Cluster
As described above, Redis Cluster has data sharding built in and distributes data across different instances; it is the most complete Redis distributed solution.
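On the client side, connecting is simple; below is a minimal Jedis sketch (the node address is a placeholder), where JedisCluster discovers the slot map and routes each command to the right node:

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class ClusterClientTest {
    public static void main(String[] args) throws Exception {
        // Any reachable node is enough; the client discovers the rest of the cluster
        Set<HostAndPort> nodes = new HashSet<>();
        nodes.add(new HostAndPort("192.168.10.219", 6379));

        JedisCluster cluster = new JedisCluster(nodes);
        cluster.set("jack", "1");
        System.out.println(cluster.get("jack"));
        cluster.close();
    }
}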
Because the relationship between key and slot never changes, when a new node is added some of the existing slots need to be allocated to the new node and the related data migrated.
For example, to add a new node 192.168.10.219:6378:
redis-cli --cluster add-node 192.168.10.219:6378 192.168.10.219:6379
The newly added node has no hash slot and cannot distribute data. Execute on any of the original nodes:
redis-cli --cluster reshard 192.168.10.219:6379
Then enter the number of hash slots to be allocated (for example, 500) and the source node(s) of the hash slots (enter all or a specific node ID).
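Once the reshard completes, slot coverage can be verified with the cluster check subcommand (assuming a Redis 5+ redis-cli; the address is a placeholder):

redis-cli --cluster check 192.168.10.219:6379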
Source code for this article: github.com/xuhaoj/redi… Thanks for reading.