This is the first day of my participation in the Gwen Challenge.

This article is participating in “Java Theme Month – Java Development in Action”, see the activity link for details

Preface

Hello, this is the classic chicken wing. Today we are going to talk about the application of the consistent hash algorithm to Redis. As business volume grows, so does the amount of data, and Redis caching is used more and more. At some point a single Redis instance can no longer handle the load, and multiple Redis instances are needed for distributed storage. Once storage is distributed, we have to consider how evenly the data is spread and what happens to the data when a machine goes down. This raises two questions. First, how can data be stored evenly across the Redis servers? Second, when one Redis machine goes down, how can we minimize the loss of cached data and reduce the chance of a cache avalanche? Today we'll talk about the consistent hash algorithm, which solves both of these problems.

A careful analysis of the original problem

First, let's look at some traditional load balancing algorithms: round robin, hashing, failover, and so on. Among these, the one suited to storage is the hash algorithm, which is also the most commonly used. The idea is to feed the unique identifier of the data we want to store through a well-designed hash function to produce a hash value, then take that value modulo the number of machines. The remainder is guaranteed to correspond to one of our machines, so the data can be stored there. At first glance this looks fine: the data is spread evenly across the machines. But what happens if one of the machines dies? The number of machines shrinks, so every previous modulo result changes, which means all of our existing cache entries become unreachable: we can no longer find cached values on the machines the new computation points to. All requests then hit the database at once, and in severe cases the production environment simply goes down. Losing one machine has that serious an impact; what about adding one? The same: adding a machine also triggers a full rehash and large-scale cache invalidation. This may still be a little abstract, so let me give a concrete example.

Suppose company A has three Redis servers, numbered 0, 1, and 2. A key comes in and hashes to the value 5; 5 mod 3 = 2, so the data is stored on server 2. Now suppose server 2 goes down, leaving two machines. The same key still hashes to 5, but 5 mod 2 = 1, so the lookup goes to server 1, which certainly does not hold that cached data, and we have to query the database. A small number of such requests is fine, but if a large number of requests hit the database, the production server may go down.
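The example above can be sketched in a few lines of Java. This is a minimal illustration of naive modulo sharding, assuming an illustrative `ModuloSharding` class (the names are not from any library); it counts how many keys change servers when the cluster shrinks from three machines to two.

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of naive modulo sharding. Class and method names are
// illustrative, chosen for this example only.
public class ModuloSharding {
    // Pick a server index for a key: hash it, then take the remainder.
    // floorMod keeps the result non-negative even for negative hashCodes.
    static int serverFor(String key, int serverCount) {
        return Math.floorMod(key.hashCode(), serverCount);
    }

    // Count how many keys map to a different server after the cluster
    // changes from `before` servers to `after` servers.
    static long movedKeys(List<String> keys, int before, int after) {
        return keys.stream()
                .filter(k -> serverFor(k, before) != serverFor(k, after))
                .count();
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            keys.add("user:" + i);
        }
        // With plain modulo, a large fraction of the keys remap when
        // one of three servers is removed -- those are all cache misses.
        long moved = movedKeys(keys, 3, 2);
        System.out.println(moved + " of 10000 keys moved");
    }
}
```

Running this shows that far more than a third of the keys land on a different server after the change, which is exactly the mass invalidation described above.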

So how do we solve this? With the consistent hash algorithm described below, which greatly reduces the impact of adding or removing machines.

Consistent hash algorithm

Circular hash space

An important concept introduced here is the circular hash space. We take the number space from 0 to 2^32-1 and join its two ends to form a closed ring. Hash values are then mapped onto this ring.
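Mapping a value into this 0 to 2^32-1 space can be sketched in one line of Java; the class and key names below are purely illustrative.

```java
// A minimal sketch of placing a key in the 0..2^32-1 ring space.
public class RingSpace {
    static long positionOnRing(String key) {
        // hashCode() is already 32 bits; masking with 0xffffffffL
        // reinterprets it as unsigned, so the position always falls
        // in [0, 2^32 - 1].
        return key.hashCode() & 0xffffffffL;
    }

    public static void main(String[] args) {
        System.out.println(positionOnRing("redis-server-1"));
    }
}
```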

Data is stored on the ring

Next we need to place our servers on the ring. Suppose we have four servers, Node A through Node D; hashing each server's IP address gives it a position on the ring.

After the servers' positions are determined, we use the same hash function on the key of the data to be stored and map the resulting hash value onto the ring. If it does not land exactly on a server, we search clockwise, and the data falls to the first server found.

According to this rule, data object A is assigned to Node A, B to Node B, C to Node C, and D to Node D.
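The clockwise-search rule maps naturally onto a sorted map. Below is a minimal sketch of such a ring, assuming an illustrative 32-bit FNV-1a hash and made-up node names; it is not the implementation of any particular library. `TreeMap.tailMap` finds the first node at or after the key's position, and an empty tail means we wrap around to the first node on the ring.

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;

// A minimal consistent-hash ring sketch backed by a TreeMap.
public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Illustrative 32-bit FNV-1a hash, kept in [0, 2^32 - 1].
    static long hash(String key) {
        long h = 2166136261L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h = ((h ^ (b & 0xff)) * 16777619) & 0xffffffffL;
        }
        return h;
    }

    void addNode(String node)    { ring.put(hash(node), node); }
    void removeNode(String node) { ring.remove(hash(node)); }

    // Walk clockwise: the first node at or after the key's hash owns it;
    // if we fall off the end of the ring, wrap to the first node.
    String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("NodeA");
        ring.addNode("NodeB");
        ring.addNode("NodeC");
        ring.addNode("NodeD");
        String owner = ring.nodeFor("some-key");
        System.out.println("some-key -> " + owner);
        // Removing the owner relocates only its arc of keys clockwise.
        ring.removeNode(owner);
        System.out.println("after removal -> " + ring.nodeFor("some-key"));
    }
}
```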

Resolve the problem of adding or deleting servers

Now let's assume that Node C goes down while A, B, and D remain normal; object C is then relocated to Node D. So in the consistent hash algorithm, if a server becomes unavailable, only the data between that server and the previous server on the ring is affected. The other case is adding a server, say Node X, to the system.

In this case, objects A, B, and D are unaffected; only object C needs to be relocated to the new Node X. When a server is added, only the data between the new server and the previous server on the ring is affected; nothing else moves.

Data skew of the Hash ring

The consistent hash algorithm also has a drawback. When there are too few service nodes, uneven node distribution can cause data skew: most cached objects end up concentrated on one server. For example, suppose the system has only two servers on the ring. With the clockwise search rule, a large amount of data may be concentrated on Node A while very little lands on Node B. How does the consistent hash algorithm deal with this? It introduces the virtual node mechanism: multiple hashes are computed for each service node, and a replica of the node is placed at each resulting position. These replicas are called virtual nodes, and they can be created simply by appending a number to the server's IP address or host name.

For example, three virtual nodes can be computed for each server, namely "Node A#1", "Node A#2", "Node A#3", "Node B#1", "Node B#2", and "Node B#3", forming six virtual nodes in total. The data location algorithm stays the same; only a mapping step from virtual nodes to actual nodes is added. For example, data landing on "Node A#1", "Node A#2", or "Node A#3" is all located on Node A. This solves the data skew problem when there are few service nodes. In practice, the number of virtual nodes is usually set to 32 or more, so even with very few service nodes the data can still be distributed evenly.

Adding virtual nodes largely solves this problem, though some hash skew still remains; as the number of virtual nodes increases, cache misses caused by node changes become increasingly rare. A commonly cited hit-rate formula for consistent hashing is (1 - n/(n + m)) × 100%, where n is the number of real Redis nodes and m is the number of added virtual nodes: as the number of virtual nodes grows, the number of misses shrinks.
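The virtual node mechanism described above can be sketched as follows. This is an illustrative implementation, not a prescribed one: the FNV-1a hash, the `"#"` naming scheme, and the replica count of 32 (following the article's suggested minimum) are all assumptions made for the example. Each physical node is hashed onto the ring `replicas` times under names like "NodeA#1", and every virtual position maps back to the physical node name.

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;

// A sketch of a consistent-hash ring with virtual nodes.
public class VirtualNodeRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int replicas;

    VirtualNodeRing(int replicas) { this.replicas = replicas; }

    // Illustrative 32-bit FNV-1a hash, kept in [0, 2^32 - 1].
    static long hash(String key) {
        long h = 2166136261L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h = ((h ^ (b & 0xff)) * 16777619) & 0xffffffffL;
        }
        return h;
    }

    // Place `replicas` virtual nodes for one physical node, mapping
    // each virtual position (e.g. "NodeA#1") back to the physical node.
    void addNode(String node) {
        for (int i = 1; i <= replicas; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    // Same clockwise lookup as before; the extra virtual-to-physical
    // mapping is just the value stored in the TreeMap.
    String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        VirtualNodeRing ring = new VirtualNodeRing(32);
        ring.addNode("NodeA");
        ring.addNode("NodeB");
        int a = 0, b = 0;
        for (int i = 0; i < 10_000; i++) {
            if (ring.nodeFor("key:" + i).equals("NodeA")) a++; else b++;
        }
        // With 32 virtual nodes per server, the split between the two
        // servers is far closer to even than with one position each.
        System.out.println("NodeA=" + a + " NodeB=" + b);
    }
}
```

Increasing the replica count smooths the distribution further, at the cost of a larger ring and slightly slower node addition and removal.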

Conclusion

That wraps up the application of the consistent hash algorithm to Redis.