The class diagram
First, take a look at the four commonly used implementation classes of the Map family: HashMap, Hashtable, LinkedHashMap, and TreeMap.
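As a quick, minimal sketch of how three of these differ in iteration order (Hashtable additionally synchronizes every method and rejects null keys and values; the class name MapOrderDemo is just an illustration):
import java.util.*;

public class MapOrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> hashMap = new HashMap<>();             // no guaranteed iteration order
        Map<String, Integer> linkedHashMap = new LinkedHashMap<>(); // iterates in insertion order
        Map<String, Integer> treeMap = new TreeMap<>();             // iterates in ascending key order
        for (String k : new String[]{"banana", "apple", "cherry"}) {
            hashMap.put(k, 1);
            linkedHashMap.put(k, 1);
            treeMap.put(k, 1);
        }
        System.out.println(linkedHashMap.keySet()); // [banana, apple, cherry]
        System.out.println(treeMap.keySet());       // [apple, banana, cherry]
    }
}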
Below we focus on HashMap and, working from the source code, examine its storage structure, common methods, resizing mechanism, and thread-safety concerns to explain how it works internally.
Storage structure
Structurally, HashMap is an array + linked list + red-black tree (the red-black tree was added in JDK 1.8), as shown below:
When hash collisions occur, the colliding entries are appended to the end of a bucket to form a linked list. A linked list is not contiguous in memory but is logically sequential: inserting and deleting nodes only requires updating references and costs O(1), but lookup requires a linear scan and is slow. Red-black trees were introduced because their average time complexity for lookup, insertion, and deletion is O(log n), which keeps long buckets fast to search.
Internal properties
- Default initial capacity DEFAULT_INITIAL_CAPACITY = 1 << 4 (16)
- Maximum capacity MAXIMUM_CAPACITY = 1 << 30
- Default load factor DEFAULT_LOAD_FACTOR = 0.75f
- Threshold = capacity × load factor (see the worked example after this list)
- The threshold for converting to a red-black tree: TREEIFY_THRESHOLD = 8
- The threshold for converting a tree back to a linked list: UNTREEIFY_THRESHOLD = 6
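As a quick illustration of the threshold (a minimal sketch using only the defaults listed above): with the default capacity of 16 and load factor of 0.75, the threshold is 12, so the table first grows when the 13th mapping is added.
import java.util.HashMap;
import java.util.Map;

public class ThresholdDemo {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>(); // capacity 16, load factor 0.75, threshold 12
        for (int i = 1; i <= 13; i++) {
            map.put(i, "v" + i); // the 13th put pushes size past the threshold of 12,
        }                        // triggering a resize to capacity 32 (new threshold 24)
    }
}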
Constructors
// The no-argument constructor, using the default load factor of 0.75f
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
}

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}
// tableSizeFor (ignoring the case where the argument exceeds the maximum capacity) returns the smallest power of two that is greater than or equal to the given capacity.
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
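To see what this rounding produces, here is a minimal standalone sketch that copies the same bit trick (the real method is package-private inside HashMap, so it cannot be called directly; TableSizeForDemo is just an illustration):
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor, reproduced here for demonstration
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(1));  // 1
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}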
Main methods
Index calculation
The hash algorithm here is essentially a three-step process: take the key's hashCode, mix in the high-order bits, and map the result to a bucket index with a modulo-style mask.
// Perturbation function
static final int hash(Object key) {
    int h;
    // h >>> 16: unsigned right shift by 16 bits; XOR-ing the high bits into the low bits reduces collisions
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

index = (n - 1) & hash;
The index is calculated as (n - 1) & hash, i.e. a bitwise AND of (array length - 1) and the hash value computed above.
(When the length is a power of two, hash % length == hash & (length - 1).)
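Putting the two steps together, here is a minimal sketch of how a key is mapped to a bucket index (the key and table length are arbitrary illustration values):
public class IndexDemo {
    public static void main(String[] args) {
        Object key = "apple";          // arbitrary example key
        int n = 16;                    // table length (always a power of two)

        int h = key.hashCode();
        int hash = h ^ (h >>> 16);     // perturbation: fold the high 16 bits into the low 16 bits
        int index = (n - 1) & hash;    // same result as hash % n because n is a power of two

        System.out.println(index);     // some value in [0, 15]
    }
}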
Put method
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // Check whether the table has been initialized
    if ((tab = table) == null || (n = tab.length) == 0)
        // If not, call resize() to initialize it (default length 16)
        n = (tab = resize()).length;
    // (n - 1) & hash determines which bucket the element belongs to
    if ((p = tab[i = (n - 1) & hash]) == null)
        // If the bucket is empty, put the newly created node into the array
        tab[i] = newNode(hash, key, value, null);
    // An element already exists in this bucket
    else {
        Node<K,V> e; K k;
        // If the hash matches the first node's hash and the keys are equal
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            // Record the first node as e
            e = p;
        // If the first node is a red-black tree node
        else if (p instanceof TreeNode)
            // Put the entry into the tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // Otherwise the bucket holds a linked list
        else {
            // Insert the node at the end of the list: iterate through the current list
            for (int binCount = 0; ; ++binCount) {
                // If p.next is null, we are at the end of the list
                if ((e = p.next) == null) {
                    // Append the new node at the end
                    p.next = newNode(hash, key, value, null);
                    // If the number of nodes reaches the threshold of 8 (TREEIFY_THRESHOLD)
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        // Convert the list to a red-black tree
                        treeifyBin(tab, hash);
                    // Break out of the loop
                    break;
                }
                // Check whether the key of the current list node equals the key being inserted
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    // If so, break out of the loop
                    break;
                // Advance along the list; together with e = p.next above, this walks the bucket
                p = e;
            }
        }
        // e is non-null when a node with the same hash and key as the inserted element was found
        if (e != null) {
            // Record the old value
            V oldValue = e.value;
            // If onlyIfAbsent is false, or the existing value is null, replace it
            if (!onlyIfAbsent || oldValue == null)
                // Replace the old value with the new value
                e.value = value;
            // Callback after access (LinkedHashMap uses it to move the entry to the end)
            afterNodeAccess(e);
            // Return the old value, e.g.
            // map.put(1, "Orcas");
            // String oldVal = map.put(1, "Fish");
            // => Orcas
            return oldValue;
        }
    }
    // Record the number of structural modifications
    ++modCount;
    // Each put increments size; when the actual size exceeds the threshold, resize
    if (++size > threshold)
        resize();
    // Callback after insertion
    afterNodeInsertion(evict);
    return null;
}
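A short usage sketch of the behavior implemented above: put returns the previous value when a key is overwritten, and null when the key was absent.
import java.util.HashMap;
import java.util.Map;

public class PutDemo {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        System.out.println(map.put(1, "Orcas")); // null  (the bucket was empty)
        System.out.println(map.put(1, "Fish"));  // Orcas (same key: value replaced, old value returned)
        System.out.println(map.get(1));          // Fish
    }
}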
The specific process can be understood by the following figure:
Expansion mechanism
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    // If the old capacity is greater than 0, the table has already been initialized
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            // If the old capacity has reached the maximum of 2^30, set the threshold to Integer.MAX_VALUE and return
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Double the old capacity; if the result is below the maximum capacity
        // and the old capacity is at least the default initial capacity 16
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            // The new threshold is the old threshold doubled
            newThr = oldThr << 1;
    }
    // Old threshold > 0 but oldCap == 0:
    // the map was built with HashMap(int initialCapacity, float loadFactor),
    // where this.threshold = tableSizeFor(initialCapacity),
    // so the stored threshold is actually the initial capacity (a power of two)
    else if (oldThr > 0)
        // Use that value as the new capacity
        newCap = oldThr;
    else {
        // Both threshold and oldCap are 0: the map was created with the no-arg constructor,
        // and resize() initializes it with the default values
        // New capacity = default initial capacity = 16
        newCap = DEFAULT_INITIAL_CAPACITY;
        // New threshold = load factor 0.75 * capacity 16 = 12
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // If the new threshold is still 0, compute it from the load factor
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    // Create a new node array of length newCap
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    // If the old array contains data, move its entries into the new array
    if (oldTab != null) {
        // Loop over the old array and transfer every non-empty bucket
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            // The bucket holds at least one node
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                // Single node: reindex it using the cached hash and the new capacity
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // Red-black tree: split it across the new table
                // (it may be converted back to a list if it becomes too small)
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                // Linked list
                else {
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // If (e.hash & oldCap) == 0, the node keeps its old index after resizing
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                // First node of the "low" list
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        // Otherwise the node moves to the new index: old index + oldCap
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
As shown in the following figure, n is the length of the table, and resizing shifts it left by one bit (n << 1). Figure (a) shows the index positions computed for key1 and key2 before the expansion; figure (b) shows their index positions after the expansion. So when resizing a HashMap we do not need to recompute the hash as the JDK 1.7 implementation does; we only need to check whether the newly significant bit of the hash is 0 or 1. If it is 0, the node keeps its old index; if it is 1, the index becomes "old index + oldCap".
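A minimal sketch of that bit check, assuming oldCap = 16 (so the newly significant bit is bit 4) and two illustrative hash values:
public class ResizeIndexDemo {
    public static void main(String[] args) {
        int oldCap = 16;
        int hash1 = 0b0_0101; // bit 4 is 0: old index 5, stays at index 5 in the new table
        int hash2 = 0b1_0101; // bit 4 is 1: old index 5, moves to index 5 + 16 = 21

        System.out.println((hash1 & oldCap) == 0);    // true  -> keeps the old index
        System.out.println(hash1 & (2 * oldCap - 1)); // 5
        System.out.println((hash2 & oldCap) == 0);    // false -> old index + oldCap
        System.out.println(hash2 & (2 * oldCap - 1)); // 21
    }
}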