Implement a consistent Hash algorithm yourself

preface

In Consistent Hash, we discuss the principle of Consistent Hash algorithms and say that we will write a simple algorithm of our own. Let’s write one today.

The result of a normal hash

Let’s see what we can do with ordinary hash.

First, you need to cache node objects, store objects in the cache, and a collection of cache nodes to hold valid cache nodes.

To actually store an object, it’s a simple class that just needs to get its hash value:

  static class Obj {
    String key;
    Obj(String key) {
      this.key = key;
    }
    @Override
    public int hashCode(a) {
      return key.hashCode();
    }
    @Override
    public String toString(a) {
      return "Obj{" +
          "key='" + key + '\' ' +
          '} '; }}Copy the code

The cache node object is used to store the actual object:

  static class Node {

    Map<Integer, Obj> node = new HashMap<>();
    String name;

    Node(String name) {
      this.name = name;
    }

    public void putObj(Obj obj) {
      node.put(obj.hashCode(), obj);
    }

    Obj getObj(Obj obj) {
      return node.get(obj.hashCode());
    }

    @Override
    public int hashCode(a) {
      returnname.hashCode(); }}Copy the code

Also very simple, internal use of a map to hold the node.

A collection of cache nodes used to hold valid cache nodes:

 static class NodeArray {

    Node[] nodes = new Node[1024];
    int size = 0;

    public void addNode(Node node) {
      nodes[size++] = node;
    }

    Obj get(Obj obj) {
      int index = obj.hashCode() % size;
      return nodes[index].getObj(obj);
    }

    void put(Obj obj) {
      intindex = obj.hashCode() % size; nodes[index].putObj(obj); }}Copy the code

An internal array from which data is fetched from the cache node by the number of machines to be moded.

Test: can the original data be found when adding or removing nodes:

 /** * Verify that the normal hash will not move for adding or subtracting nodes. * /
  public static void main(String[] args) {

    NodeArray nodeArray = new NodeArray();

    Node[] nodes = {
        new Node("Node--> 1"),
        new Node("Node--> 2"),
        new Node("Node--> 3")};for (Node node : nodes) {
      nodeArray.addNode(node);
    }

    Obj[] objs = {
        new Obj("1"),
        new Obj("2"),
        new Obj("3"),
        new Obj("4"),
        new Obj("5")};for (Obj obj : objs) {
      nodeArray.put(obj);
    }

    validate(nodeArray, objs);
  }
Copy the code

  private static void validate(NodeArray nodeArray, Obj[] objs) {
    for (Obj obj : objs) {
      System.out.println(nodeArray.get(obj));
    }

    nodeArray.addNode(new Node("anything1"));
    nodeArray.addNode(new Node("anything2"));

    System.out.println("========== after =============");

    for(Obj obj : objs) { System.out.println(nodeArray.get(obj)); }}Copy the code

The test steps are as follows:

Add three nodes to the collection.
toThe clusterTo add five objects, which will be hashed to different nodes based on the hash value.
printBefore more or lessThe data.
printAdd two nodesAfter the data, see if you can still access the data.

Results:

I can’t visit any of them. This is the disadvantage of ordinary mod, which is unacceptable in the case of adding or subtracting machines.

Let’s see how consistent hashing works.

Result of consistent Hash

Here’s the key.

The cache node object and the actual save object are not changed. What is changed?

We’re changing the way we save objects and the way we get objects out, so we’re not using a machine mod algorithm.

The new NodeArray object is as follows:

static class NodeArray {

/** Sort by key */
TreeMap<Integer, Node> nodes = new TreeMap<>();

void addNode(Node node) {
  nodes.put(node.hashCode(), node);
}

void put(Obj obj) {
  int objHashcode = obj.hashCode();
  Node node = nodes.get(objHashcode);
  if(node ! =null) {
    node.putObj(obj);
    return;
  }

  // Find the set larger than the given key
  SortedMap<Integer, Node> tailMap = nodes.tailMap(objHashcode);
  // Find the smallest node
  int nodeHashcode = tailMap.isEmpty() ? nodes.firstKey() : tailMap.firstKey();
  nodes.get(nodeHashcode).putObj(obj);

}

Obj get(Obj obj) {
  Node node = nodes.get(obj.hashCode());
  if(node ! =null) {
    return node.getObj(obj);
  }

  // Find the set larger than the given key
  SortedMap<Integer, Node> tailMap = nodes.tailMap(obj.hashCode());
  // Find the smallest node
  int nodeHashcode = tailMap.isEmpty() ? nodes.firstKey() : tailMap.firstKey();
  returnnodes.get(nodeHashcode).getObj(obj); }}Copy the code

This class differs from the previous class in that:

Instead of using arrays internally, an ordered Map is used.
In the put method, if the object does not fall on the cache node, it looks for the nearest and smaller node. Here we use the tailMap method of TreeMap. See the documentation for the API.
The get method does the same thing as the PUT method, otherwise the object cannot be retrieved.

The specific way to find nodes is shown in the figure below:

For the same test case, the result is as follows:

I found all the previous nodes. Fixed the normal hash problem.

conclusion

The code is relatively simple, mainly through the JDK’s own TreeMap to find nearby nodes. Of course, we’ve only tested additions here, not modifications yet, but the idea is the same. This is just a primer.

At the same time, we have not implemented the virtual node, interested friends can try.

Good luck！！！！

Implement a consistent Hash algorithm yourself

preface

The result of a normal hash

Result of consistent Hash

conclusion

Related Posts

Forced high and practical Linux command, development can master some more

What are the different techniques for Python

SpringCloud source Series (10) – HTTP client component of load balancing Ribbon