1. HashMap source code analysis

In JDK 1.8, the bottom layer of a HashMap is an array + a linked list + a red-black tree. Let’s start with a brief overview of what hashCode() does in a HashMap: A HashMap uses an array of nodes (which may represent a linked list or a red-black tree) to store key-value pairs through the perturbation function of the key’s hashCode(). Then use (n-1) &hash (n is the length of the array, n is a power of 2, so hash & n = hash & (n-1)) to determine where the element should be added to the array (index), If no node is found (null) then the node is created using the key value and added to the corresponding index position of the array, otherwise the node is overridden using the key’s equals method if equals returns true, otherwise it is added to the corresponding position of the linked list or red-black tree

JDK 1.8 perturbation function hash method

static final int hash(Object key) {
    int h;
    return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code

JDK 1.7 perturbation function hash method

final int hash(Object k) {
    int h = hashSeed;
    if (0! = h && kinstanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
Copy the code

JDK 1.7’s perturbation function was disturbed four times compared to JDK 1.8’s one time, so it’s clear that JDK 1.8’s perturbation function is more efficient

The biggest change in HashMap from JDK 1.7 to 1.8 is how hash conflicts are resolved when the index position of a key already exists; 1.7 use linked lists to solve the problem; In 1.8, linked lists and red-black trees are used to solve the problem.

1. Class attributes

	// The default capacity is 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    // Maximum capacity 2^30
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75 f;

    // An empty instance of an hash table
    static finalEntry<? ,? >[] EMPTY_TABLE = {};// The hash table can be expanded as necessary and must be a power of 2
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

    // The number of key-value mappings contained in the map
    transient int size;

    // Threshold. When the threshold is reached, capacity needs to be expanded. Threshold = capacity * loadfactor
    int threshold;

    // Load factor
    final float loadFactor;

    // Number of map changes
    transient int modCount;

	// A default threshold for a key-value pair whose key is String,
	// Alternative hashing is used when map capacity reaches this threshold
	// Alternate hashes reduce the (easier) incidence of hash collisions for String keys.
	/ / JDK. This value can be defined system attributes map. Althashing. The threshold to specify.
	// If the value is 1, it forces the standby hash to always be used; If the value is -1, the value is disabled.
    static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;
Copy the code

LoadFactor controls the sparsity of data stored in the array. The closer the loadFactor is to 1, the more entries stored in the array and the denser the data. Conversely, the more sparse the data. Too much loadFactor will lead to low query efficiency. Too small a loadFactor can lead to data fragmentation and waste a lot of space. So loadFactor has a default value of 0.75F, which is officially a good value. The initial capacity is 16, and the critical value threshold is 16 x 0.75 = 12. That is, when the number of elements reaches 12, capacity expansion needs to be performed. The process of capacity expansion involves rehash and data replication, which consumes resources. The correct thing to do is to give the specified capacity when constructing a HashMap object if you can determine roughly how much storage you need!

2, node class source code analysis

Node Node source code:

static class Node<K.V> implements Map.Entry<K.V> {
    // cache the hash values so that you don't need to call the hashCode method to evaluate the hashCode again
    final int hash;
    // key
    final K key;
    // value
    V value;
    // Subsequent nodes
    Node<K,V> next;

    // There are parameters
    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    Getters for key and value
    public final K getKey(a)        { return key; }
    public final V getValue(a)      { return value; }
    // toString, the output Node object will print key=value
    public final String toString(a) { return key + "=" + value; }

    // Implementation of hashCode for Node objects: hashCode for key and hashCode for value do either or
    public final int hashCode(a) {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    // Change the value method, return the value before the change
    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    // equals method, you can see the field used in equals method, which must also be used when overriding hashCode
    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceofMap.Entry) { Map.Entry<? ,? > e = (Map.Entry<? ,? >)o;if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false; }}Copy the code

Node nodes can be used as nodes of a linked list

Tree node source code is as follows:

static final class TreeNode<K.V> extends LinkedHashMap.Entry<K.V> {
    TreeNode<K,V> parent;  / / the parent node
    TreeNode<K,V> left;	// Left child node
    TreeNode<K,V> right; // Right child node
    TreeNode<K,V> prev;    // The precursor node
    boolean red;	// If it is red, this is a node of a red-black tree
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    // Returns the root node containing the node
    final TreeNode<K,V> root(a) {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                returnr; r = p; }}// Other methods are omitted
}
Copy the code

3. Construction method

// Specify the initial capacity and load factor
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // The initial threshold is the minimum power of 2 greater than or equal to initialCapacity
    this.threshold = tableSizeFor(initialCapacity);
}
// Specify the initial capacity, using the default load factor
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
// Default constructor, load factor to the power of 2
public HashMap(a) {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
// Construct a HashMap using another Map and call the putMapEntries method to do so
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    // The size of the collection
    int s = m.size();
    if (s > 0) {
        // Check whether the table has been initialized
        if (table == null) { // pre-size
            // If not initialized, s is the number of elements in m
            float ft = ((float)s / loadFactor) + 1.0 F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            // If the calculated t (actual capacity required) is greater than threshold, initialize threshold
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        // If the table is initialized and the number of elements in M exceeds the threshold, expand the table
        else if (s > threshold)
            resize();
        // Add all elements in m to the HashMap
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict); }}}Copy the code

As you can see, the constructors only initialize the values of loadFactor and threshold (the default threshold constructor is also not initialized), and do not initialize the table array

4, put method

PutVal (); putVal (); putVal (); putVal ();

public V put(K key, V value) {
    // The putVal method is called internally to add or update the specified key-value pair to the HashMap
    return putVal(hash(key), key, value, false.true);
}

// Perturbation function to avoid reducing hash conflicts
static final int hash(Object key) {
    int h;
    return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}

/** * Hash: key Hash value computed by the perturbation function * key: key * value: value * onlyIfAbsent: If true, the existing value will not be changed * EVict: If false, the table is in creation mode */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table is null or the length of the table is 0, the capacity expansion is performed
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Use n-1&hash to find the index that should appear in the hash table
    // If not, insert the new node into the index position
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else { // If there is one, it needs to be determined further
        Node<K,V> e; K k;
        // If the hash value of the index node is the hash of the key and the key of the index node is the same object
        // If the index key is not empty and the keyequals of the key and index return true, e points to P
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            e = p;
        // Red black tree processing
        else if (p instanceof TreeNode)
            // Add key-value to the red-black tree corresponding to the index node and return the updated or newly added tree node object to e
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // List processing
        else {
            for (int binCount = 0; ; ++binCount) {
                // Insert elements
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // If the list has reached the tree threshold, the list is converted to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        // Todo: analysis of tree methods
                        treeifyBin(tab, hash);
                    break;
                }
                // e is now the next node of the node being traversed (not empty)
                // Return true if the list has the same hash value, or if the key is the same object or equals of the key
                // jumps out of the loop
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                    break;
                // p moves to the next nodep = e; }}// If e is not null, there is a mapping for the key, and its value is updated
        if(e ! =null) { // existing mapping for key
            V oldValue = e.value;
            if(! onlyIfAbsent || oldValue ==null)
                e.value = value;
            afterNodeAccess(e);
            // Updating the map returns the previous value
            returnoldValue; }}// The hash table structure is changed
    ++modCount;
    // If the value exceeds the threshold after adding elements, expand the capacity
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    // Returning null indicates that the node is being inserted, so there is no previous value
    return null;
}
Copy the code

Now use a language to describe the execution flow of the following putVal methods:

  • Check whether the table is null or has a length of 0
    • If yes, perform resize()
    • Otherwise continue
  • Find the corresponding index through the hash calculated by the perturbation function of the key’s hashCode(), and determine whether the node at the index position exists
    • If no, insert the node directly
    • If yes, check whether it is a tree node
      • Tree node, key key-value pair added to the corresponding red-black tree, insert returns NULL, update returns the updated tree node object to e
      • If there is a same key (hash, same object, equals())) in the process, the value is updated. If no key is found, a node is inserted at the end of the list. After insertion, the critical value of tree is determined, and the list is converted to a red-black tree
      • If the length of the table array is smaller than the default minimum tree length of 64 when converting a linked list to a red-black tree, perform expansion instead of converting the linked list to a red-black tree
  • If a key exists in the linked list or red-black tree, then the node is assigned to e. At this point, it is determined whether E is null. If it is not null, e is the node object to be updated, and the value of this node is updated and the old value is returned
  • If e is null, it indicates that the node is being inserted. At this point, the node is being inserted. Check whether the value exceeds the threshold

The resize() method for capacity expansion is described later

Resize method source code: **

5. Get method

Get method source:

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // Hash the key to the index position. If the node at the index position is not empty, assign it to first
    if((tab = table) ! =null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) ! =null) {
        // Return the first node if first's hash is equal to hash or first's key and key's equals return true
        if (first.hash == hash && // always check first node((k = first.key) == key || (key ! =null && key.equals(k))))
            return first;
        // When first is not a separate node
        if((e = first.next) ! =null) {
            // If first is a tree node instance, use the red-black tree lookup method to find the corresponding node
            if (first instanceof TreeNode)
                // todo: analysis of red-black tree lookup methods
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Otherwise, search through the list
            do {
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                    return e;
            } while((e = e.next) ! =null); }}// If first is null, null is returned indicating that no search was found
    return null;
}
Copy the code

Get method idea analysis:

  • First according to pass keyhashCode()The hash, which is the result of the perturbation function, is the index to which the hash should be based on (n-1) &hash (n is the length of the array)
  • Assign the value of the index position in the array to first to determine whether first is empty
    • If yes, null is returned
    • If no, check whether first is the node to be searched
      • If yes, the first node is returned
      • If no, continue to check whether first has the next node
        • If no, null is returned
        • If yes, check whether first is a tree node
          • If yes, the node is searched in the red-black tree and returned. If no node is found, null is returned
          • If no, the node is iterated through the list and returned. If no node is found, null is returned

6. Resize method

final Node<K,V>[] resize() {
    // Assign table to oldTab
    Node<K,V>[] oldTab = table;
    // oldCap specifies the size of the original table (length, 0 if table is null).
    int oldCap = (oldTab == null)?0 : oldTab.length;
    // Assign threshold to oldThr
    int oldThr = threshold;
    // newCap and newThr represent new capacity and new threshold
    int newCap, newThr = 0;
    // If oldCap is greater than 0
    if (oldCap > 0) {
        // If oldCap is greater than or equal to the maximum capacity, 2^30, then threshold is assigned to the maximum value of int and oldTab is returned
        // That is, do not perform capacity expansion at this time, because the capacity cannot be expanded
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // newCap is assigned twice the value of oldCap if newCap is smaller than the maximum capacity and oldCap is larger than the default initial capacity
        // newThr is set to double oldThr
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    // If oldCap is 0 but oldThr is greater than 0, oldThr is assigned to newCap
    // This indicates that the underlying array has not been initialized
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    // If oldCap is 0 and oldThr is 0, then newCap is assigned the default capacity and newThr is assigned the default capacity * default load factor
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // If newThr is 0, oldThr is also 0, and newCap is used to calculate newThr(newCap * loadFactor).
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // Assign newThr to threshold, where newCap and newThr are identified
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    // Use newCap to create the newTab array of the corresponding size
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    // Assign newTable to table
    table = newTab;
    // If there is data in oldTab, migrate it to newTab
    if(oldTab ! =null) {
        // Iterate over the nodes in the number group
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            // e is the node being traversed, if e is not null
            if((e = oldTab[j]) ! =null) {
                // Empty the oldTab location for garbage collection
                oldTab[j] = null;
                // If next of e is null, indicating that e is a separate node, insert the node directly at the corresponding index position in newTab
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // The migration of red-black trees is complicated and not explained here
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { 
                    // The two nodes are defined because the elements in the list are being migrated
                    // Do you want it at the same subscript, or at the same subscript + the length of the array
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    // Iterate over the nodes in the original list
                    do {
                        next = e.next;
                        // If the high level is 0, the subscript is unchanged
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                // The first node
                                loHead = e;
                            else
                                // Insert the tail node
                                loTail.next = e;
                            // loTail redirects to the tail node
                            loTail = e;
                        }
                        // If the high level is 1, the subscript becomes the original subscript + the original array length
                        else {
                            if (hiTail == null)
                                // The first node
                                hiHead = e;
                            else
                                // Insert the tail node
                                hiTail.next = e;
                            // hiTail redirects to the tail nodehiTail = e; }}while((e = next) ! =null);
                    // Insert the low level list into the original index
                    if(loTail ! =null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // Insert the high-order list at the original index + the original array length
                    if(hiTail ! =null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    // Returns the expanded array
    return newTab;
}
Copy the code

Split () method in TreeNode TreeNode class

final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    // b refers to the current object (treenode-class object), since HashMap is only used when the resize method iterates over nodes in a number group
    // So this must be the root of a red-black tree
    TreeNode<K,V> b = this;
    // As above, the high and low levels are linked separately, with the head node representing the first node and the tail node used to insert node elements
    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    // lc represents the number of nodes in the low list, hc represents the number of nodes in the high list
    int lc = 0, hc = 0;
    // The initial value of the e tree node is b, and next is also a tree node
    for(TreeNode<K,V> e = b, next; e ! =null; e = next) {
        next = (TreeNode<K,V>)e.next;
        e.next = null;
        // If the high level is 0, insert it into the low level list
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)
                // The first insert is the header node
                loHead = e;
            else
                // Insert the tail node
                loTail.next = e;
            loTail = e;
            ++lc;
        }
        // If the high level is 1, insert it into the high level list
        else {
            if ((e.prev = hiTail) == null)
                // The first insert is the header node
                hiHead = e;
            else
                // Insert the tail nodehiTail.next = e; hiTail = e; ++hc; }}if(loHead ! =null) {
        if (lc <= UNTREEIFY_THRESHOLD)
            // If the number of elements in the list is below the de-tree threshold (6), it is converted to a list of Node objects
            tab[index] = loHead.untreeify(map);
        else {
            // Otherwise tree it
            tab[index] = loHead;
            if(hiHead ! =null) // (else is already treeified)loHead.treeify(tab); }}// High-order lists are treated the same way
    if(hiHead ! =null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if(loHead ! =null) hiHead.treeify(tab); }}}Copy the code

Now let’s summarize the following resize() method execution flow: First determine the new capacity and threshold

  • Firstly, oldCap and oldThr of the original table are determined. According to these two values, newCap and newThr of the new capacity are determined
    • If oldCap is greater than 0, it indicates that the table has been initialized. In this case, check whether oldCap is greater than or equal to the maximum length of the array
      • If yes, set threshold to integer.max_value and return the original array. It indicates that the capacity cannot be expanded.
      • Otherwise newCap becomes twice as large as oldCap, or if newCap is less than the maximum array length, newThr becomes twice as large as oldThr (don’t worry about oldCap beingIf the threshold is exceeded, the procedure in the previous step will continue to set threshold to integer.max_value
    • If oldCap is 0 (it cannot be less than 0), the table is not initialized. Check whether oldThr is greater than 0
      • If yes, the argument construct is called, specifying the initial capacity, and the value of oldThr is the size of the array to be created, which is assigned to newCap
      • If no, the default construct is invoked. In this case, newCap is the default capacity (16) and newThr is the default capacity * default load factor =12
    • If newThr is still 0, the value (newCap * loadFactor) is calculated, of course, if newCap is greater than or equal to the maximum length of the array, newThr is integer.max_value

Create an array and migrate elements:

  • Create array newTab with size newCap, table pointing to newTab
  • The nodes in oldTab are traversed, and the following is the traversal process
    • If the node is a single node, hash & (newcap-1) is inserted into the corresponding index position
    • Red-black trees and linked lists have some of the same ideas, and here’s why it’s twice as big every time, right
      • Use the top and bottom nodes of the high and low lists to represent the two lists and check if hash & newCap is 0
        • If yes, insert it into the low linked list (tailplug)
        • Otherwise insert into high linked list (tail insert)
      • In the case of a linked list, the lower linked-head node is inserted directly into the original index position and the higher linked-head node into the original index +oldCap position
      • If it is a red-black tree, special treatment is required. If the list length is less than or equal to the de-tree critical value (6), it will be converted to Node Node insertion (red-black tree nodes will be strong first, so it will be rotated back here), otherwise, the list will be converted to red-black tree and inserted

7, remove method

public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null.false.true)) = =null ?
        null : e.value;
}

final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    // The node (p) corresponding to the index position is not empty
    if((tab = table) ! =null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) ! =null) {
        Node<K,V> node = null, e; K k; V v;
        // If the key of node P is the same as key, node points to p
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            node = p;
        // e is the next node of p and e is not empty
        else if((e = p.next) ! =null) {
            // If p is a tree node, look for the node in the red-black tree
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            // Otherwise, traverse the list to find the node
            else {
                do {
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while((e = e.next) ! =null); }}// Node is not empty and removes elements from the red-black tree or linked list
        if(node ! =null&& (! matchValue || (v = node.value) == value || (value ! =null && value.equals(v)))) {
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p)
                tab[index] = node.next;
            else
                p.next = node.next;
            ++modCount;
            --size;
            afterNodeRemoval(node);
            returnnode; }}// Return null to indicate that the deleted key was not found
    return null;
}
Copy the code

There’s nothing to say about the remove method, but notice that when you remove a node from a red-black tree you might be able to turn a red-black tree into a linked list, rightDraw an extreme case:You can see from this graph why it’s a linked list if it’s less than or equal to 6