One: Summary overview

  • The underlying data structure of HashMap
  • Why HashMap capacity must be a power of two
  • The expansion mechanism of HashMap
  • Conversion of a linked list to a red-black tree when a bucket reaches 8 nodes
  • The infinite loop problem in JDK 1.7 HashMap
  • Thread safety issues with HashMap

Two: Underlying data structure of HashMap

The underlying data structure of HashMap depends on the JDK version: in JDK 1.7 it is array + linked list, and in JDK 1.8 it is array + linked list + red-black tree. How is this structure organized and maintained?

2.1 Linked list

The array is the table of hash buckets, and each bucket stores linked-list nodes, which are internal Node objects of HashMap. The list is a singly linked list; its main fields are shown below.

    final int hash;        // hash of the key
    final K key;           // the key
    V value;               // the value
    Node<K,V> next;        // next node in the singly linked list

2.2 Red-black tree

A red-black tree is a self-balancing binary search tree (similar in purpose to an AVL tree) with excellent search efficiency. When the length of the linked list in a single hash bucket reaches the default value TREEIFY_THRESHOLD=8, the list is converted to a red-black tree (provided the table capacity is at least MIN_TREEIFY_CAPACITY=64; otherwise the table is resized instead). Conversely, when the number of tree nodes drops to the default value UNTREEIFY_THRESHOLD=6, the red-black tree is converted back to a linked list.

    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;

Three: Capacity requirements

HashMap provides several constructor overloads: a no-argument constructor, a capacity constructor, and a capacity + load factor constructor.

3.1 No-argument constructor

The no-argument constructor only assigns the default load factor DEFAULT_LOAD_FACTOR=0.75, an empirical value that is not recommended to change. The table capacity uses the default value DEFAULT_INITIAL_CAPACITY = 1 << 4 = 16, and the table itself is allocated lazily when the first insert is performed.

    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; 
    }
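As a quick usage sketch (hypothetical driver code, not JDK source), the lazy allocation is easy to observe:

    Map<String, Integer> map = new HashMap<>(); // only the load factor is assigned here
    map.put("k", 1);   // the first insert calls resize(), allocating the default 16-slot table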

3.2 Capacity constructor

When the single-argument capacity constructor of HashMap is called, it delegates to the capacity + load factor constructor, again with the default load factor 0.75. The tableSizeFor function then recomputes the requested capacity as the smallest integer power of 2 greater than or equal to the argument and stores it temporarily in threshold until the table is actually allocated. This guarantees that the table size is always an integer power of 2.

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
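For reference, the rounding to a power of two is done with a short run of bit operations. The sketch below follows the JDK 8 form of tableSizeFor (later JDKs compute the same result with Integer.numberOfLeadingZeros):

    static final int tableSizeFor(int cap) {
        int n = cap - 1;      // so that an exact power of two is not doubled
        n |= n >>> 1;         // smear the highest set bit into every lower bit
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }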

3.3 Benefits of a power-of-two capacity

The default capacity is 16; even when the initial capacity passed in is not a power of two, it is rounded up to one, and each expansion doubles the capacity. Why go to all this trouble? When storing a key-value pair, the bucket index is computed as (n - 1) & hash, where n is the table length. From how & works, you can observe the following (a small demonstration follows the list):

  • For a power of two n, n - 1 has all of its low-order bits set to 1; for example, 16 - 1 = 15 is 1111 in binary
  • (n - 1) & hash therefore keeps only the low-order bits of the key's hash, which is equivalent to hash % n but cheaper
  • As a result, as long as the hash values themselves are well distributed, key-value pairs are spread evenly across the buckets
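A tiny standalone demonstration of the equivalence (hypothetical driver code, not part of the JDK):

    int n = 16;                               // table capacity, a power of two
    for (int hash : new int[]{15, 16, 17, 31, 12345}) {
        // (n - 1) & hash keeps only the low-order bits of the hash,
        // which equals hash % n whenever n is a power of two and hash is non-negative
        System.out.println(((n - 1) & hash) + " == " + (hash % n));
    }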

Four: Capacity expansion mechanism of HashMap

The expansion algorithm of HashMap is concentrated in the resize() function. The part of the function that computes the new hash bucket capacity is shown below. In short, excluding the special case where the maximum capacity has already been reached, the existing hash bucket capacity is doubled.

    // Existing hash bucket capacity is greater than 0: the table has already been initialized
    if (oldCap > 0) {
        // Existing capacity has already reached the limit: stop expanding
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Doubled capacity is still below the limit and the existing capacity
        // is at least the default 16: also double the expansion threshold
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1;
    }
    // Existing hash bucket capacity is 0 but the expansion threshold is greater than 0:
    // the initial capacity was placed in threshold by the capacity constructor
    else if (oldThr > 0)
        newCap = oldThr;
    // Both the hash bucket capacity and the expansion threshold are 0:
    // this only happens with the no-argument constructor
    else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // The new threshold is still 0: the capacity-constructor case, or the doubled
    // capacity exceeded the limit, or the old capacity was below the default 16
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
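As a worked example (hypothetical driver code, not JDK source): starting from the defaults of capacity 16 and threshold 16 * 0.75 = 12, each expansion doubles both values:

    int capacity = 16, threshold = 12;        // DEFAULT_INITIAL_CAPACITY and 16 * 0.75
    for (int i = 0; i < 3; i++) {
        capacity <<= 1;                       // newCap = oldCap << 1
        threshold <<= 1;                      // newThr = oldThr << 1
        System.out.println("capacity=" + capacity + ", threshold=" + threshold);
    }
    // prints capacity=32, threshold=24, then 64/48, then 128/96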

Five: red-black tree and linked list conversion

When the number of nodes in a single bucket reaches 8, the linked list is converted to a red-black tree, and when the number of nodes falls back to 6, the red-black tree is converted back to a linked list. Where do these defaults come from? The values 8 and 6 are maintained in HashMap by the constants TREEIFY_THRESHOLD and UNTREEIFY_THRESHOLD.

The conversion between linked list and red-black tree is really a contest between space and time. A tree node takes roughly twice the space of a list node, and when a list has only a few nodes its traversal performance is still acceptable, so sacrificing twice the space to save a little time would be unwise. The threshold of 8 was chosen because, under the default load factor, the JDK source estimates via a Poisson distribution that the probability of a single bucket reaching 8 nodes is about 0.00000006, less than one in ten million. Using 6 rather than 8 for the reverse conversion also avoids converting back and forth repeatedly when a bucket's size oscillates around a single threshold.
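The JDK source comment models bucket sizes with a Poisson distribution with parameter about 0.5 under the default 0.75 load factor. The snippet below simply recomputes those probabilities; it is an illustration, not JDK code:

    // P(k nodes in one bucket) = exp(-lambda) * lambda^k / k!
    double lambda = 0.5;
    double p = Math.exp(-lambda);             // probability for k = 0
    for (int k = 0; k <= 8; k++) {
        if (k > 0) p = p * lambda / k;        // move from P(k - 1) to P(k)
        System.out.printf("k=%d  p=%.8f%n", k, p);
    }
    // for k = 8 the probability is about 0.00000006, i.e. less than one in ten million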

Six: The infinite loop in JDK 1.7

In JDK 1.7, elements were inserted at the head of each list, and during a concurrent resize the transfer of a bucket could leave two nodes pointing at each other, forming a cycle and therefore an infinite loop on later lookups. In JDK 1.8 the insertion method was changed to tail insertion, which resolves this concurrent infinite-loop problem.
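For context, here is a simplified model of the JDK 1.7-style transfer loop, reconstructed for illustration only (the Entry class and method here are stand-ins, not the real source). Head insertion reverses each list on every resize, and when two threads resize concurrently they can link nodes back to each other, forming a cycle that makes later lookups spin forever:

    static class Entry { int hash; Entry next; }   // stand-in for the JDK 1.7 entry type

    // Move every entry from the old table into the new one by head insertion
    static void transfer(Entry[] oldTable, Entry[] newTable) {
        for (Entry e : oldTable) {
            while (e != null) {
                Entry next = e.next;                     // a concurrent resize may already have relinked next
                int i = e.hash & (newTable.length - 1);  // recompute the bucket index
                e.next = newTable[i];                    // head insertion: point at the current head
                newTable[i] = e;                         // e becomes the new head, so list order reverses
                e = next;
            }
        }
    }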

Seven: thread safety issues

With the concurrent infinite loop gone, is HashMap thread-safe in JDK 1.8? No. In putVal, when a linked list is used to resolve hash collisions, the code loops through the list to perform a tail insert. Consider the scenario described in the comments below, which results in one thread's write being silently overwritten.

    for (int binCount = 0; ; ++binCount) {
        // 1: thread A evaluates p.next == null, then loses the CPU before inserting
        // 2: thread B gets the CPU, runs the same branch and inserts its node
        // 3: thread A resumes and executes p.next = newNode(...), overwriting B's node
        if ((e = p.next) == null) {
            p.next = newNode(hash, key, value, null);
            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                treeifyBin(tab, hash);
            break;
        }
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }