• How we implemented consistent hashing
  • Srushtika Neelakantam
  • The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: yqian1991
  • Proofreader: Starrier

Ably’s real-time platform is distributed across more than 14 physical data centers and more than 100 nodes. To ensure that both load and data are distributed evenly and consistently across all nodes, we use a consistent hashing algorithm.

In this article, we will understand what consistent hashing is all about and why it is an important tool in scalable distributed system architectures. Then we take it a step further and introduce data structures that can be used to efficiently scale up consistent hashing algorithms. Finally, we will also show you a working example of this algorithm.

First, a word about hashing

Remember the good old naive way of hashing you learned in college? Using a hash function, we ensure that the resources a computer program needs can be stored in memory efficiently and that the in-memory data structure is filled evenly. We also ensure that this resource-storage strategy makes retrieving the information efficient, which in turn makes the program run faster.

Classical hashing uses a hash function to generate a pseudorandom number, which is then taken modulo the size of the memory space, thereby converting a random numeric identifier into a position within the available memory space. As the following function shows:

location = hash(key) mod size
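
As a concrete sketch in JavaScript (the string hash below is just a stand-in chosen for illustration, not the hash any particular system necessarily uses), classical hashing might look like this:

// Illustrative only: a tiny FNV-1a style 32-bit string hash
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // multiply by the FNV prime, wrapping at 32 bits
  }
  return h >>> 0; // force an unsigned 32-bit result
}

const size = 16;                            // number of available memory locations
const location = hash32('user:42') % size;  // index into the available space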

So why can’t we do the same for network requests?

In scenarios where different programs, computers, or users request resources from multiple servers, we need a mechanism to evenly distribute requests to available servers to ensure load balancing and consistent performance. We can think of these server nodes as places where one or more requests can be mapped.

Now let’s take a step back. In traditional hashing, we always assume:

  • The number of memory locations is known, and
  • This quantity never changes

At Ably, for example, we usually have to scale the cluster up or down throughout the day, and we also have to deal with unexpected failures. However, if we consider the scenarios mentioned earlier, we cannot guarantee that the number of servers stays constant. What if one of them fails unexpectedly? If we stick with the simplest hashing approach, we would need to recalculate the hash location for every key, because the new mapping depends entirely on the number of server nodes or memory addresses, as shown below:

Before the node changes

After the node changes

The problem with simple rehashing in a distributed system, where the node on which each key is stored changes, is that the nodes are stateful: even a very small change in cluster size can cause an enormous amount of work to reshuffle all of the data across the cluster. Rehashing is not sustainable as the cluster grows, because the amount of rehashing work grows with the size of the cluster. This is where the concept of consistent hashing comes in.
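
To get a feel for the scale of the problem, here is a small illustrative sketch (the multiplicative hash and the key counts are made up for the example) that counts how many keys change location when a single server is added:

// Illustrative only: count how many keys move when the cluster grows by one node.
const hash = key => Math.imul(key, 2654435761) >>> 0; // simple multiplicative hash

const keys = Array.from({ length: 10000 }, (_, i) => i);
const oldSize = 10, newSize = 11;

const moved = keys.filter(k => hash(k) % oldSize !== hash(k) % newSize).length;
console.log(`${moved} of ${keys.length} keys changed location`); // roughly 90% of them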

Consistent hashing – what exactly is it?

Consistent hashing can be described as follows:

  • It represents the resource requesters (hereafter simply “requests”) and the server nodes on a virtual ring structure, commonly known as a hash ring.
  • The number of storage locations is no longer fixed; instead, we treat the ring as having infinitely many points, and server nodes can be placed at random positions on it. We can still use a hash function to pick each position, but the earlier second step, dividing by the number of storage locations, is dropped, because that number is no longer finite.
  • The requests, i.e. the users, computers, or serverless programs, which are the equivalent of keys in the traditional hashing approach, are placed on the same ring using the same hash function.

So how do we decide which server will serve a given request? If we assume the ring is ordered, so that a clockwise traversal of the ring corresponds to increasing addresses, each request is served by the first node encountered while traversing clockwise; that is, the first server whose address on the ring is greater than the request’s address serves the request. If the request’s address is greater than the highest node address, it wraps around and is served by the server with the lowest address, since traversal of the ring is circular. The method is illustrated in the following figure:

In theory, each server ‘owns’ a range of the hash ring, and any request that maps into that range is served by the same server. Now, what if one of the servers fails, say node 3? The address range of the next server node on the ring expands, and any request that maps into the failed node’s range is dispatched to that server. That’s all: only the hashes within the range that belonged to the failed node need to be redistributed, while the rest of the hash ring and its request-to-server assignments remain unaffected. This is the opposite of traditional hashing, where any change in the size of the hash table affects the entire mapping. Thanks to consistent hashing, only a portion of the requests (proportional to the ring’s distribution factor) is affected by a given ring change. (Ring changes occur when nodes are added or removed, causing some of the request-to-server mappings to change.)

An efficient implementation method

Now we are familiar with what a hash ring is…

We need to implement the following for it to work:

  1. A mapping from our hash space to the nodes of the cluster, which allows us to find the node that serves a given request.
  2. A collection of the requests assigned to each node in the cluster. Later, this collection will let us find out which hashes are affected by the addition or removal of a node.

Mapping

To complete the first part above, we need the following:

  • A hash function that computes the position of a known request ID on the ring.
  • A way to find, given the hash of a request ID, the node responsible for it.

To find the node that corresponds to a particular request, we can use a simple data structure consisting of the following:

  • An array of hashes that correspond one to one to the nodes on the ring.
  • A map (hash table) for finding the server node that corresponds to a given hash.

This is essentially a primitive representation of an ordered map.

In order to find the node that serves a given hash in the above data structure, we need to:

  • Perform a modified binary search to find the first node hash in the array that is equal to or greater than (≥) the hash we want to look up.
  • Look that node hash up in the map to find the corresponding node, as sketched below.
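
A minimal sketch in JavaScript of what this lookup could look like (the node names and hash values here are made up, and a real implementation also has to handle replication and other concerns):

// Node hashes kept sorted, plus a map from hash to node.
const nodeHashes = [0x10000000, 0x70000000, 0xc0000000];
const nodeByHash = new Map([
  [0x10000000, 'node-1'],
  [0x70000000, 'node-2'],
  [0xc0000000, 'node-3'],
]);

// Modified binary search: first node hash >= requestHash, wrapping around to
// the first node if the request hashes past the last one.
function findNode(requestHash) {
  let lo = 0, hi = nodeHashes.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (nodeHashes[mid] < requestHash) lo = mid + 1;
    else hi = mid;
  }
  return nodeByHash.get(nodeHashes[lo % nodeHashes.length]);
}

console.log(findNode(0x89e04a0a)); // -> 'node-3'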

Adding or deleting nodes

As we saw at the beginning of this article, when a node is added, a portion of the range on the hash ring, along with the various requests it contains, must be assigned to the new node. Conversely, when a node is deleted, requests that were previously allocated to that node will need to be processed by other nodes.

How do we find the requests affected by a change to the hash ring?

One solution is to iterate over all of the requests assigned to a node. For each request, we determine whether it falls within the range affected by the ring change and, if so, move it to where it now belongs.

However, the amount of work required to do this grows with the number of requests on the node. To make matters worse, as the number of nodes increases, so does the number of changes that occur on the ring. In the worst case, because ring changes are often associated with local failures, the transient load associated with a ring change may also increase the likelihood of other affected nodes failing, potentially leading to cascading failures across the system.

With this in mind, we want the requested relocation to be as efficient as possible. Ideally, we could keep all requests in one data structure so that we could find requests that are affected when a hash change occurs anywhere on the ring.

Efficient lookup of affected hashes

Adding a node to or removing a node from the cluster changes the assignment of a portion of the requests on the ring, which we call the affected range. If we know the boundaries of the affected range, we can move the requests to their correct locations.

To find the boundaries of the affected range, we start at the hash H of the node that was added or removed and move around the ring from H (counterclockwise in the figure) until we find another node. Let’s call that node’s hash S (for ‘start’). Requests counterclockwise of S are assigned to that node, so they are not affected.

Note: this is only a simplified description of what actually happens; in practice the data structures and algorithms are more complicated, because we use a replication factor greater than one and specialised replication strategies under which only a subset of the nodes applies to any given request.

Requests whose hashes fall in the range between the node we found (S) and the node that was added (or removed) (H) are the ones that need to be moved.
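
Under the same made-up representation used above (a sorted array of node hashes), finding S could look something like this; we assume H is the hash of a node that is currently present in the array:

// Walk counterclockwise from H (i.e. take the previous entry, wrapping around)
// to find S, the hash of the first node "before" H on the ring.
function findStartOfAffectedRange(sortedNodeHashes, H) {
  const i = sortedNodeHashes.indexOf(H);
  const prev = (i - 1 + sortedNodeHashes.length) % sortedNodeHashes.length;
  return sortedNodeHashes[prev]; // S: requests counterclockwise of S are unaffected
}

const S = findStartOfAffectedRange([0x10000000, 0x70000000, 0xc0000000], 0x70000000);
console.log(S.toString(16)); // -> '10000000'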

Efficient lookup of requests in the affected area

One solution is to simply iterate over all requests corresponding to a node and update those requests whose hash values are mapped to that range.

In JavaScript it looks something like this:

for (const request of requests) {
  if (contains(S, H, request.hash)) {
    /* this request is affected by the ring change */
    request.relocate(); // relocate() stands in for whatever reassignment logic applies
  }
}

function contains(lowerBound, upperBound, hash) {
  const wrapsOver = upperBound < lowerBound;
  const aboveLower = hash >= lowerBound;
  const belowUpper = upperBound >= hash;
  if (wrapsOver) {
    return aboveLower || belowUpper;
  } else {
    return aboveLower && belowUpper;
  }
}

Since the hash ring is circular, it is not enough to simply look for requests where S <= r < H, because S may be greater than H (meaning the interval wraps past the top of the hash ring). The contains() function handles that wrap-around case.

Traversing all of the requests of a given node is fine as long as the number of requests is relatively small, or the addition and removal of nodes is relatively rare.

However, as the number of requests at a node grows, so does the amount of work required, and to make matters worse, ring changes tend to occur more frequently as the number of nodes grows, whether because of automated node scaling or failover, triggering simultaneous rebalancing load across the whole system.

At worst, the load associated with these changes could increase the likelihood of other nodes failing, potentially leading to system-wide cascading failures.

To mitigate this effect, we can also store requests in a separate circular data structure similar to the one discussed earlier, in which a hash is mapped directly to the request corresponding to the hash.

This allows us to locate all requests within the affected scope by following these steps:

  • Locate the first request starting from S.
  • Walk clockwise until you find a hash value outside this range.
  • Relocate the requests that fall within this range.

On average, the number of requests that need to be traversed for a given ring change is R/N, where R is the number of requests located in the range owned by the node and N is the number of hashes on the ring, assuming requests are evenly distributed.
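
Continuing with the same made-up representation, keeping the request hashes in a sorted array lets us locate just the affected requests, roughly like this:

// Request hashes kept sorted, mirroring the node structure.
const requestHashes = [0x20000000, 0x60000000, 0x90000000, 0xd0000000];

// Binary search: index of the first hash strictly greater than target.
function firstAbove(sorted, target) {
  let lo = 0, hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] <= target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Start just after S and walk clockwise (wrapping) until we leave the range (S, H].
function affectedRequests(sorted, S, H) {
  const affected = [];
  let i = firstAbove(sorted, S);
  for (let step = 0; step < sorted.length; step++) {
    const h = sorted[(i + step) % sorted.length];
    const inRange = (H < S) ? (h > S || h <= H) : (h > S && h <= H);
    if (!inRange) break;
    affected.push(h);
  }
  return affected;
}

console.log(affectedRequests(requestHashes, 0x10000000, 0x70000000)
  .map(h => h.toString(16))); // -> [ '20000000', '60000000' ]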


Let’s put this explanation into practice with a working example:

Suppose we have a cluster containing nodes A and B.

Let’s randomly generate a ‘hash assignment’ for each node (assuming a 32-bit hash), so we get

A: 0x5e6058e5

B: 0xa2d65c0f

Here we place the nodes on a virtual ring in which the values 0x0, 0x1, 0x2, … are laid out consecutively up to 0xffffffff, which in turn wraps around so that 0x0 comes right after 0xffffffff.

Since node A’s hash is 0x5e6058e5, it is responsible for any request that hashes into the range from 0xa2d65c0f + 1 to 0xffffffff and from 0x0 to 0x5e6058e5, as shown below:

Node B, on the other hand, is responsible for the range from 0x5e6058e5 + 1 to 0xa2d65c0f. Thus, the entire hash space is covered.

The mapping from nodes to their hashes is shared across the cluster, ensuring that the results of each ring calculation are always consistent. Therefore, any node can determine where a service request is placed when it is needed.

Let’s say we need to find (or create) a new request with the identifier ‘[email protected]’.

  1. We compute the hash H of this identifier, getting, for example, 0x89e04a0a.
  2. We look for the first node on the ring whose hash is greater than H. Here we find B.

So B is the node responsible for the request. If we need the request again, we will repeat the above steps and get the same node again, which will contain the state we need.
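
Plugging the example’s numbers into a lookup like the one sketched earlier (the hash 0x89e04a0a is simply taken as given here):

const nodes = [
  { name: 'A', hash: 0x5e6058e5 },
  { name: 'B', hash: 0xa2d65c0f },
];

// First node whose hash is >= H, wrapping to the lowest node if none exists.
function ownerOf(H) {
  return nodes.find(n => n.hash >= H) || nodes[0];
}

console.log(ownerOf(0x89e04a0a).name); // -> 'B'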

This example is overly simple. In practice, giving each node only one hash can result in a very uneven distribution of load. You may have noticed that in this example B is responsible for (0xa2d65c0f - 0x5e6058e5) / 2^32 = 26.7% of the ring, while A is responsible for the rest. Ideally, each node would be responsible for an equally sized portion of the ring.

One way to make the distribution more balanced and reasonable is to generate multiple random hashes for each node, like this:

In fact, we found that this was still unsatisfactory, so we split the ring into 64 equally sized fragments and made sure that each node was placed somewhere within each fragment; The details of this are not so important. Anyway, the goal is to make sure that each node is responsible for an equally large portion of the ring, so that the load is evenly distributed. (Another advantage of generating multiple hashes per node is that hashes can be gradually added and removed across the ring, thus avoiding sudden changes in load.)
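
A rough sketch of the “multiple hashes per node” idea (the per-node count and the way the extra hashes are derived here are made up; this is not the 64-segment scheme described above):

// Illustrative only: give each node several positions ("virtual nodes") on the ring.
const hash32 = str => {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
};

function ringPositions(nodeNames, positionsPerNode = 8) {
  const positions = [];
  for (const name of nodeNames) {
    for (let i = 0; i < positionsPerNode; i++) {
      positions.push({ node: name, hash: hash32(`${name}#${i}`) });
    }
  }
  return positions.sort((a, b) => a.hash - b.hash);
}

console.log(ringPositions(['A', 'B']).length); // -> 16 positions spread around the ring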

Suppose we now add a new node to the ring called C, and we generate a random hash for C.

A: 0x5e6058e5

B: 0xa2d65c0f

C: 0xe12f751c

Now the ring space between 0xa2d65c0f + 1 and 0xe12f751c (which previously belonged to A) is assigned to C. All other requests continue to hash to the same nodes as before. To deal with this change in node responsibilities, the requests in this range that were assigned to A now need to transfer all of their state to C.
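
Using the contains() helper from the earlier snippet, we can check (under the same assumptions) whether a given request hash is one of those that must move from A to C:

// Reusing contains() from earlier.
function contains(lowerBound, upperBound, hash) {
  const wrapsOver = upperBound < lowerBound;
  const aboveLower = hash >= lowerBound;
  const belowUpper = upperBound >= hash;
  if (wrapsOver) {
    return aboveLower || belowUpper;
  } else {
    return aboveLower && belowUpper;
  }
}

// The range taken over by C runs from just above B's hash up to C's hash.
console.log(contains(0xa2d65c0f + 1, 0xe12f751c, 0xc0000000)); // true  -> this request moves to C
console.log(contains(0xa2d65c0f + 1, 0xe12f751c, 0x89e04a0a)); // false -> this request stays put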

Now you understand why a hash is needed to balance the load in a distributed system. However, we need consistent hashing to ensure that we minimize the amount of work required on the cluster in the event of any ring changes.

In addition, nodes need to sit at multiple places on the ring to ensure that the load is statistically evenly distributed. Traversing the entire hash ring on every ring change is inefficient; as your distributed system scales, you need a more efficient way to determine what has changed so that you can minimize the performance impact of ring changes. New indexes and data types are needed to solve this problem.


Building distributed systems is hard. But we love it, and we love talking about it. If you need to rely on a distributed system, consider Ably. If you want to talk to us, get in touch!

A special thanks goes to John Diamond, distributed systems engineer at Ably, for his contributions to this article.


Srushtika is a software development consultant at Ably Realtime.



