1. An overview of the

HashMap is a hash table-based Map interface implementation that allows NULL keys and values

2. Member variables

    /** * [1] The default initial capacity - MUST be a power of two. */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    
    /** * Maximum capacity */
    static final int MAXIMUM_CAPACITY = 1 << 30;
    
    /* The load factor used when none specified in constructor. */
    static final float DEFAULT_LOAD_FACTOR = 0.75 f;
    
    /** * The list to red black tree threshold */
    static final int TREEIFY_THRESHOLD = 8;
    
    /** * [2] Red-black tree list threshold */
    static final int UNTREEIFY_THRESHOLD = 6;
    
    /** * Min tree threshold * If the length of the table in the HashMap is greater than 64, then the bucket lists will be allowed to become red-black trees. * If only the bucket lists are too long, If the length of the table is less than 64 *, the table should be expanded by resize instead of being converted to a red-black tree * The minimum tree size threshold should be at least four times that of the red-black tree */
    static final int MIN_TREEIFY_CAPACITY = 64;
    
    /** * Holds cached entrySet(). Note that AbstractMap fields are used * for keySet() and values(). */
    transient Set<Map.Entry<K,V>> entrySet;
    
    /** * Load factor controls how dense the table array in the HashMap is stored * The closer the load factor is to 1, the more dense the data is stored, resulting in inefficient element search * The closer the load factor is to 0, the more sparse the data is stored. The load factor for The hash table. * *@serial* /
    final float loadFactor;
    
    /** * number of changes */
    transient int modCount;
    
    /** * The number of key-value mappings contained in this map. */
    transient int size;
    
    /** * an array of elements */
    transient Node<K,V>[] table;
    
    When {/ * * *@link HashMap#size} >= {@link* The next size value at which to resize (capacity * load factor). * *@serial* /
    int threshold;
Copy the code

[1], the capacity of HashMap must be a power of 2

The reason why HashMap capacity is a power of 2

3. Construction method

java.util.HashMap#HashMap()

    public HashMap(a) {
        // Use the default parameters
        // Default load factor, default capacity
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
Copy the code

The default constructor does not initialize the table array; it does so in the java.util.hashmap #putVal method

java.util.HashMap#HashMap(int)

    public HashMap(int initialCapacity) {
        // Call the overloaded constructor
        // Specify the initial capacity, default load factor
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }
Copy the code

java.util.HashMap#HashMap(int, float)

    public HashMap(int initialCapacity, float loadFactor) {
        // Specify initial capacity, specify load factor
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);

        if (initialCapacity > MAXIMUM_CAPACITY)
            // If the specified initial capacity is greater than the maximum capacity, the maximum capacity is used
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            // Check the load factor
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        // When manually specifying the HashMap capacity, the HashMap threshold is set regardless of the load factor
        this.threshold = tableSizeFor(initialCapacity); / / [1]
    }
Copy the code

TableSizeFor Specifies the value greater than the smallest power of 2 of a specified capacity

For example, given 15, return 16; Given 30, return 32

TableSizeFor method is a very cool method, 5 lines of code to see my face confused

java.util.HashMap#HashMap(java.util.Map<? extends K,? extends V>)

    public HashMap(Map<? extends K, ? extends V> m) {
        // Use the default load factor
        this.loadFactor = DEFAULT_LOAD_FACTOR;

        putMapEntries(m, false);
    }
    
    final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
        int s = m.size();
        if (s > 0) {
            // Check whether the table is instantiated
            if (table == null) { // pre-size
                // Calculate the capacity expansion limit of M
                float ft = ((float)s / loadFactor) + 1.0 F;
                // Check whether the capacity expansion limit is greater than the maximum capacity of HashMap
                int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                         (int)ft : MAXIMUM_CAPACITY);
                if (t > threshold) {
                    // If the capacity expansion limit of M exceeds the capacity expansion limit of the current HashMap, you need to adjust the capacity expansion limitthreshold = tableSizeFor(t); }}else if (s > threshold)
                // if the value of m.size exceeds the upper limit, run the resize method to expand the table
                resize();
            for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
                // Add all key-value pairs in m to the HashMap
                K key = e.getKey();
                V value = e.getValue();
                putVal(hash(key), key, value, false, evict); }}}Copy the code

The flow of the putMapEntries method:

  1. If the table is empty, the capacity expansion upper limit is recalculated
  2. If the HashMap capacity expansion limit is smaller than that of the specified Mapsize, then executeresizesuper-popular
  3. Passes all key values in the specified MapputValMethods are placed in a HashMap

The hash function used here is actually a perturbation function, as described below

4. It’s Very important

java.util.HashMap#tableSizeFor

    Returns a power of two for the given target capacity. */
    static final int tableSizeFor(int cap) {
        /* Static tool methods hash(), tableSizeFor() (4) - programmer - CSDN blog https://blog.csdn.net/fan2012huan/article/details/51097331 */
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0)?1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
Copy the code

The flow of the algorithm is as follows

The five lines of code are really awesome!! If you don’t understand the flowchart after reading it, read the text in the notes

java.util.HashMap#hash

    // Perturbation function
    static final int hash(Object key) {
        int h;
        // Mix the high and position of the original Hash value to reduce the probability of low level collisions
        return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
    }
Copy the code

Why does this function exist?

A HashMap determines the index of a key in a table based on its hash value

How to understand this sentence? Let me give you an example

There is a 32-bit hash value as follows

If the Hash value is the lowest 4 bits, index = 0101 = 5

If a large number of hash values with the lower 4 digits of 0101 are present, then all key-value pairs will be placed at index = 5 of the table

As a result, keys cannot be evenly distributed in the table

So HashMap solves this problem by creating this method java.util.hashmap #hash

Take the high 16 bits of a 32-bit hash value and the low 16 bits, and the low one carries the high one

To put it bluntly, even if there are a large number of keys with the same hash value, the index computed will be different after going through the hash method

Use the hash method to reduce the hash conflict probability

java.util.HashMap#resize

    /** * Initialize or expand the table **@return the table
     */
    final Node<K,V>[] resize() {
        // Take out the old table snapshot
        Node<K,V>[] oldTab = table;
        // Check the old table capacity
        int oldCap = (oldTab == null)?0 : oldTab.length;
        // Old capacity limit
        int oldThr = threshold;
        // The capacity of the new table and the upper limit of capacity expansion
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                // If the capacity of the old table is larger than the maximum capacity of the HashMap, no expansion operation is performed
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                // If the size of the HashMap does not exceed the maximum size, double the size (newCap, newThr)
                newThr = oldThr << 1; // double threshold
        }

        else if (oldThr > 0) // initial capacity was placed in threshold
            // The table is not initialized. The initial capacity is set to the maximum capacity
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // If the table is not initialized, use the default capacity (16) and capacity (12).
            / / such as Java. Util. HashMap. HashMap ()
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            // newThr = 0 if oldCap > 0, oldThr = 0
            // Recalculate the new capacity limit
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        // Update the HashMap capacity limit
        threshold = newThr;
        @SuppressWarnings({"rawtypes"."unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        // Initialize the table
        table = newTab;
        // If the old table is not empty, add the
        if(oldTab ! =null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if((e = oldTab[j]) ! =null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        // If there is only one node in the linked list, recalculate e's index in newTab based on the hash value of the node
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // The linked list has been turned into a red-black tree, split tree
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // Define two lists, lo and HI
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            // Use hash &oldCap to determine whether a node in the linked list should stay in the same position in the new table
                            if ((e.hash & oldCap) == 0) {
                                // The node remains at position j
                                // Insert lo list
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                // The node is moved to position j+oldCap
                                // Insert the hi list.
                                if (hiTail == null)
                                    hiHead = e;
                                elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
                        if(loTail ! =null) {
                            // If lo is not empty, place the entire lo list in position J of the new table
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if(hiTail ! =null) {
                            // If the hi list is not empty, place the entire hi list at position J +oldCap of the new table
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
Copy the code

The resize process is not complicated and is roughly as follows

In my opinion, the key point is to reassign the key-value pairs to the new table

Here are three things to consider:

  1. The index position in the table has only one element
  2. Table index is a red-black tree
  3. The index position in the table is a linked list.

In case 3, the index position in the table is a linked list, and redistribution will split the list into two linked lists

An LO linked list, left at index

The other HI list will be moved to index+oldCapacity

At this point, we need to determine whether the node will stay on the LO list or the HI list.

Read this article for an in-depth understanding of HashMap(4): Resize expansion through line-by-line analysis of key source code

In jdk1.7, the expansion of the table will form a loop under the concurrent execution of multiple threads. We recommend that you read this article 🎉🎉

java.util.HashMap#treeifyBin

    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            // [1] If the table length is smaller than the minimum tree threshold, run resize to expand the table
            resize();
        else if ((e = tab[index = (n - 1) & hash]) ! =null) {
            // Turn the list into a red-black tree
            // hd head node, TL tail node
            TreeNode<K,V> hd = null, tl = null;
            // the do-while loop converts a one-way list to a bidirectional list
            do {
                // Convert the node to a TreeNode
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    // Set the root node
                    hd = p;
                else {
                    // Point the first node of the tree node to the last node
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while((e = e.next) ! =null);
            // Turn the bidirectional list into a red-black tree
            if((tab[index] = hd) ! =null) hd.treeify(tab); }}Copy the code

The treeifyBin method converts a linked list at an index position in the table into a red-black tree

This method is usually called when the list length is greater than TREEIFY_THRESHOLD after adding or merging elements

[1] it can be seen that if the current table. Length is less than the minimum tree threshold (64), then the resize method will be called to expand the list, rather than tree it

java.util.HashMap#putVal

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // [1] The table is not initialized yet
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            // If the hash value of the key has no element in the corresponding position of the table, the node will be created directly
            tab[i] = newNode(hash, key, value, null);
        else {
            // A node already exists in the corresponding position of the table
            Node<K,V> e; K k;
            if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
                // The specified key is equal to the key of an existing node
                e = p;
            else if (p instanceof TreeNode)
                // The list node has become a red-black tree node
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // Iterate through the list, insert or update nodes
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // Add a new node to the end of the list
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // the reason for treeify_threshold-1 is that binCount starts at 0, and when there are 8 nodes in the list, binCount=7
                            // When the number of nodes is greater than or equal to 8, tree the list
                            / / [2]
                            treeifyBin(tab, hash);
                        break;
                    }
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                        // The node corresponding to the key is found in the list
                        break; p = e; }}if(e ! =null) { // existing mapping for key
                // If the node corresponding to the key is found, the value on the node is updated
                V oldValue = e.value;
                if(! onlyIfAbsent || oldValue ==null)
                    // Update the value of the node
                    e.value = value;
                afterNodeAccess(e);
                // Returns the old value of the node
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            / / [3]
            resize();
        afterNodeInsertion(evict);
        return null;
    }
Copy the code

The putVal method is an implementation of adding key-value pairs to the associated method

As you can see in the figure above, the putVal method is called internally in methods that add key-value pairs

As can be seen from [1], the putVal method initializes the table internally by calling the resize method

The overall logic is not complicated, but pay attention to [2] and [3]

[2], if the linked list is too long after nodes are added, the linked list should be turned into a red-black tree

[3], if the number of key-value pairs in the whole HashMap reaches the upper limit of capacity expansion after nodes are added, the table should be expanded

5. To summarize

If the source code for ArrayList is one and a half stars, I’d say HashMap is at least three stars

This article omitted some content, such as the red-black tree implementation in HashMap. The main reason for not writing it is that I do not understand the red-black tree 😂 very well. If I understand it later, I will make up another article

At the very least, this article makes the implementation of the most important methods of HashMap quite clear, or it can be 😁

6. Recommended reading

HashMap static tool methods hash(), tableSizeFor()

Deep understanding of HashMap(4): Resize expansion for line-by-line analysis of key source code

Hash function in HashMap – faint maple – blog garden

The reason why HashMap capacity is a power of 2

Reasons and solutions for loop formation of HashMap linked list