Some time ago, while learning about distributed systems, I found that the Firefoxbug blog article "Application of Consistent Hashing in Distributed Systems" explained this problem clearly and accessibly. This article mainly records my own understanding and practice.

As the number of visits to an application system grows, or as the amount of data in a DB or file storage system increases, the rising load makes the system respond slowly or even go down. To solve this problem, systems are often designed around vertical scaling or horizontal scaling, and a distributed system is a practical application of the horizontal-scaling architecture.

1 Distributed system requirements

The original intention of distributed design is to relieve excessive load on a single server, so after the system is scaled horizontally, data should be distributed as evenly as possible across the server nodes (that is, no hot data nodes). Second, when capacity must be expanded, or a failed node must be removed from the cluster, the redistribution should affect the stored data as little as possible, reducing the cost and risk of data migration.

2 Solutions

Since the number of machines cannot be infinite, horizontal scaling needs an algorithm that distributes an unbounded amount of data across these machines in a balanced, orderly, and easily extensible way.

The common practice is to number the data to be stored and take that number modulo the number of machines. For example, with 10 data items (numbered 0 to 9) and 3 machines (numbered 0 to 2), taking each data number modulo 3 sends items 0, 3, 6, and 9 to machine 0; items 1, 4, and 7 to machine 1; and items 2, 5, and 8 to machine 2.
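The modulo placement described above can be sketched in a few lines of Python:

```python
def place(data_ids, machine_count):
    """Return {machine_id: [data ids stored there]} using modulo placement."""
    machines = {m: [] for m in range(machine_count)}
    for d in data_ids:
        machines[d % machine_count].append(d)
    return machines

# 10 data items (0~9) across 3 machines (0~2), as in the example above.
layout = place(range(10), 3)
print(layout)  # {0: [0, 3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
```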

The modulo algorithm is simple, but when a server node fails or a new node is added, a large amount of existing data must be migrated. The consistent hashing algorithm, introduced in the memcached distributed design, solves this problem well.
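The migration cost of the modulo scheme is easy to demonstrate: adding one machine changes the divisor, so nearly every item lands on a different machine.

```python
def owner(data_ids, machine_count):
    """Map each data id to the machine that stores it under modulo placement."""
    return {d: d % machine_count for d in data_ids}

before = owner(range(100), 3)
after = owner(range(100), 4)   # one machine added
moved = sum(1 for d in before if before[d] != after[d])
print(moved)  # 73 of the 100 items must be migrated
```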

3 Consistent hashing algorithm principle

As shown in the figure above, the main steps of the memcached distributed hash algorithm are as follows:

1. Calculate the hash value x of each memcached server node (from its IP address) and map it onto a circle covering the range 0 ~ 2^32.
2. Use the same method to compute the hash value y of each stored data key and map it onto the same circle.
3. Search clockwise from y for the first node hash x with x ≥ y; the data for y is stored on that node. If no such x exists, wrap around to the first node on the circle.
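The three steps above can be sketched with a sorted ring and a binary search (a minimal illustration; the server IPs are hypothetical placeholders):

```python
import bisect
import hashlib

def h(s):
    # Map a string onto the 0 ~ 2^32 circle via md5 (truncated).
    return int(hashlib.md5(s.encode('utf-8')).hexdigest(), 16) % (2 ** 32)

# Step 1: hash each server node onto the circle.
nodes = ['192.168.1.1', '192.168.2.2', '192.168.3.3']
ring = sorted((h(n), n) for n in nodes)

def get_node(key):
    # Step 2: hash the data key onto the same circle.
    y = h(key)
    # Step 3: first node hash >= y clockwise; wrap to the smallest if none.
    idx = bisect.bisect_left(ring, (y,))
    if idx == len(ring):
        idx = 0
    return ring[idx][1]

print(get_node('some-data-key'))
```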

4 Sample programs

The original firefoxbug article provided a Python 2 sample program, which I have ported to Python 3. Note that replicas (virtual nodes) are used for all four machines, which improves the uniformity of the data distribution.

```python
# -*- coding: utf-8 -*-
'''
FileName: consistenthashdistributed1.py
Description: distributed systems: the consistent hashing algorithm
Simple Usage: python consistenthashdistributed1.py [number of replicas]
Reference: http://www.firefoxbug.com/index.php/archives/2791/
(c) 2018.02.17 vfhky https://typecodes.com/python/consistenthashdistributed1.html
'''
import sys
import hashlib

CONTENT = """Consistent hashing is a special kind of hashing such that
when a hash table is resized and consistent hashing is used, only K/n keys
need to be remapped on average, where K is the number of keys, and n is the
number of slots. In contrast, in most traditional hash tables, a change in
the number of array slots causes nearly all keys to be remapped."""

SERVERS = [
    "192.168.1.1",
    "192.168.2.2",
    "192.168.3.3",
    "192.168.4.4"
]


class HashRing(object):
    """Manages a consistent hash ring."""

    def __init__(self, nodes=None, replicas=3):
        """`nodes` is a list of objects that have a proper __str__
        representation. `replicas` indicates how many virtual points should
        be used per node; replicas are required to improve the distribution.
        """
        self.replicas = replicas
        self.ring = dict()
        self._sorted_keys = []
        if nodes:
            for node in nodes:
                self.add_node(node)

    def add_node(self, node):
        """Adds a `node` to the hash ring (including a number of replicas)."""
        for i in range(0, self.replicas):
            key = self.gen_key('%s:%s' % (node, i))
            self.ring[key] = node
            self._sorted_keys.append(key)
        self._sorted_keys.sort()

    def remove_node(self, node):
        """Removes `node` from the hash ring (including its replicas)."""
        for i in range(0, self.replicas):
            key = self.gen_key('%s:%s' % (node, i))
            del self.ring[key]
            self._sorted_keys.remove(key)

    def get_node(self, string_key):
        """Given a string key, a corresponding node in the hash ring is
        returned. If the hash ring is empty, `None` is returned.
        """
        return self.get_node_pos(string_key)[0]

    def get_node_pos(self, string_key):
        """Given a string key, a corresponding node in the hash ring is
        returned along with its position in the ring. If the hash ring is
        empty, (`None`, `None`) is returned.
        """
        if not self.ring:
            return None, None
        key = self.gen_key(string_key)
        nodes = self._sorted_keys
        for i in range(0, len(nodes)):
            node = nodes[i]
            if key <= node:
                # First virtual node clockwise from the key.
                return self.ring[node], i
        # Wrapped past the largest key: fall back to the first node.
        return self.ring[nodes[0]], 0

    def gen_key(self, key):
        """Given a string key, return a long value representing a place on
        the hash ring. md5 is currently used because it mixes well.
        """
        m = hashlib.md5()
        m.update(key.encode('utf-8'))
        return int(m.hexdigest(), 16)


def consistent_hash(replicas):
    """Distribute the words of CONTENT across SERVERS and print the result."""
    database = {}
    for s in SERVERS:
        database[s] = []
    hr = HashRing(SERVERS, replicas)
    for w in CONTENT.split():
        database[hr.get_node(w)].append(w)
    for node in database:
        print("[%s]=[%s].\n" % (node, database[node]))


if __name__ == '__main__':
    replicas = 3
    if len(sys.argv) > 1:
        replicas = int(sys.argv[1])
    if replicas < 3 or replicas > 100000:
        print("Replicas should be between 3 and 100000.")
        sys.exit()
    consistent_hash(replicas)
```

5 Test

When searching for the target node, the program above traverses the entire hash circle, so the number of virtual nodes should not be too large, or the lookup will take too long. As shown in the figure below, testing on my single-core, 1 GB virtual machine, four nodes with 10000 virtual nodes each gave good speed and balance.
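The balance claim can be checked roughly by counting how many of a batch of keys land on each node (a sketch; the server IPs and key names are hypothetical):

```python
import hashlib

def h(s):
    # md5-based position on the 0 ~ 2^32 circle
    return int(hashlib.md5(s.encode('utf-8')).hexdigest(), 16) % (2 ** 32)

def build_ring(nodes, replicas):
    """Hash every virtual node onto the circle; return the ring and its sorted keys."""
    ring = {}
    for n in nodes:
        for i in range(replicas):
            ring[h('%s:%d' % (n, i))] = n
    return ring, sorted(ring)

def lookup(ring, keys, item):
    # Linear clockwise scan, as in the sample program above.
    y = h(item)
    for k in keys:
        if y <= k:
            return ring[k]
    return ring[keys[0]]   # wrap around the circle

nodes = ['192.168.1.1', '192.168.2.2', '192.168.3.3', '192.168.4.4']
ring, keys = build_ring(nodes, 100)
counts = {n: 0 for n in nodes}
for i in range(10000):
    counts[lookup(ring, keys, 'item-%d' % i)] += 1
print(counts)  # roughly 2500 per node once there are enough virtual nodes
```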

6 Reference article

Memcached Distributed Cache Implementation Principles.