Week2.jdk8 HashMap source code analysis

The underlying structure of HashMap has actually been updated quite a bit in JDK8. This article looks at the HashMap after JDK8.

If you’re familiar with hashmaps in Java, you should know that the underlying HashMap is made up of arrays and linked lists. (in JDK1.8, if the list is longer than 8, Java automatically converts the list to a red-black tree and improves efficiency with red-black trees. Red black trees also seem to be a difficult part of the interview, so try to learn about them later. The array is usually called a hash bucket, which is an array of nodes, and each bucket holds a linked list. Each node in the list is an element of the HashMap.

The constructor


    static final int MAXIMUM_CAPACITY = 1 << 30;

    static final float DEFAULT_LOAD_FACTOR = 0.75 f;


	InitialCapacity is the default capacity and loadFactor is the loadFactor
	// Another parameter in HashMap is threshold, which is the length of the hash bucket *loadFactor
    public HashMap(int initialCapacity, float loadFactor) {
		// The default capacity cannot be less than 0
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
		The default capacity cannot be greater than the maximum capacity
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
		//loadFactor cannot be less than 0
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

	// Call the last constructor with two arguments.
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }


	// Default no-parameter constructor, loadFactor is 0.75 by default
    public HashMap(a) {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }

	// Add elements from another Map to the HashMap
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }
Copy the code

The put operation

	// Enter key-value pairs to put values into the HashMap
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false.true);
    }

	// is the function called by the previous constructor to put the value of another Map into the HashMap
	// Notice the difference between putVal and onlyIfAbsent. Note the Boolean onlyIfAbsent and Boolean evict arguments
	//onlyIfAbsent If true, values with the same key value will be overwritten
	// Evict, if false, is invoked when HashMap is initialized
    final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
		//s is the number of values in another Map
        int s = m.size();
        if (s > 0) {
			// If the current table is empty
            if (table == null) { // pre-size
				// This step calculates the threshold from the number of elements in the Map
                float ft = ((float)s / loadFactor) + 1.0 F;
				// Ensure that the threshold is smaller than MAXIMUM_CAPACITY
                int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                         (int)ft : MAXIMUM_CAPACITY);
				// If the threshold is greater than the current threshold, the table size is calculated based on the threshold
                if (t > threshold)
                    threshold = tableSizeFor(t);
            }
			// If the Map has more elements than the threshold, we need to resize the current table
            else if (s > threshold)
                resize();
			// Iterate over the Map to add elements to the HashMap
            for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
                K key = e.getKey();
                V value = e.getValue();
                putVal(hash(key), key, value, false, evict); }}}// Used to add should elements to a HashMap. The latter two Boolean values have the same meaning as putMapEntries and will not be explained here
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
		// If the current HashMap is empty, expand TAB and give n the length of the expanded hashbucket
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
		// If a hash collision has occurred
		// (n-1) &hash] evaluates the position of the key in the hash bucket
		// Assign the value of the corresponding hash bucket to p after the modulus calculation, and determine whether it is empty. If it is empty, it means that no hash collision has occurred
		// Pay the bucket for creating a Node.
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
			// Enter else to indicate that a hash collision has occurred
            Node<K,V> e; K k;
			// If the hash value of p is the same as the hash value of the passed element, the key is the same, and neither is empty
			// do the overwrite operation
            if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
                e = p;
			// If p is a red-black tree, insert the element into p
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
				// If it is neither a red-black tree nor an overlay, it is a normal hash collision
				// Iterate through the list
                for (int binCount = 0; ; ++binCount) {
					// If p next is empty, it means that the entire list is traversed without the same key, and the element needs to be inserted.
					// a new node will be created on p.ext
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
						// After inserting the node, check whether the length of the linked list is >=8
						// If >=8, turn the list into a red-black tree, then break
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
					// If the hash value and key are the same and not equal to 0, the nodes need to be overwritten and the operation is performed outside the loop
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                        break; p = e; }}// If e is not empty, then overridden nodes are required
            if(e ! =null) { // existing mapping for key
				// Overwrite the node and return the value of the original node
                V oldValue = e.value;
                if(! onlyIfAbsent || oldValue ==null)
                    e.value = value;
				AfterNodeAccess () is an empty function
                afterNodeAccess(e);
                returnoldValue; }}/ / modify modCount
        ++modCount;
		// Determine whether to expand the capacity again
        if (++size > threshold)
            resize();
		// is also an empty function
        afterNodeInsertion(evict);
        return null;
    }

Copy the code

resize

Expansion function, is the key! If the current HashMap is null, a hash bucket size matching the current threshold is allocated. Otherwise double the current HashMap.

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
		// If the old hash bucket is empty, assign 0 to oldCap, otherwise make oldCap equal to the length of the old hash bucket
        int oldCap = (oldTab == null)?0 : oldTab.length;
		//oldThr equals the original threshold
        int oldThr = threshold;
		// Initialize the new threshold and capacity
        int newCap, newThr = 0;
        if (oldCap > 0) {
			// If the original threshold is greater than MAXIMUM_CAPACITY
			// Set the threshold to the maximum value of Integer and return to the original HashMap without further expansion
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // If the new length is smaller than the maximum capacity and the original length is greater than or equal to the default initial capacity, set the new threshold to twice the original threshold
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        // If both the original length and the original threshold are equal to 0, use the default parameters to assign values to the new length and the new threshold
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // If the previous if statement did not assign newThr (the new threshold), calculate the new threshold and assign it to it
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes"."unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        // Process of assigning the original HashMap to the new HashMap
        if(oldTab ! =null) {// Copy is required only if the original HashMap is not empty
            for (int j = 0; j < oldCap; ++j) {// Probe the position on each hash bucket
                //e = oldTab[j], which is a hash bucket
                Node<K,V> e;
                if((e = oldTab[j]) ! =null) {// If e is not empty, it means that the current position has a value and needs to be copied
                    // Make references in the original location empty for JavaGC to recycle
                    oldTab[j] = null;
                    // If e.next is empty, there is only one key-value pair in the bucket, just put the object into the new HashMap
                    if (e.next == null)
                        //e.hash & (newcap-1
                        newTab[e.hash & (newCap - 1)] = e;
                    // If e is a TreeNode, the bucket is a red-black tree, which is split
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // Since the HashMap is doubled each time, the elements in the original HashMap can be either in the original location or in the original location
                        // The old position +oldCap position (these two are called low and high), the next is to judge, and then insert
                        // New location.
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            // E.hash & oldCap is explained below.
                            // if e.hash & oldCap==0, the new position is low and loTail is executed
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            // The new position is high
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);//e.next==null to exit the loop
                        // Insert loTail and hiTail into the correct positions, respectively
                        if(loTail ! =null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if(hiTail ! =null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

Copy the code

E.hash & oldCap == 0 parse

In JDK1.7, when resize is used, index is rehash. JDK1.8 improves on this. We used to use (e.hash & oldCap-) to determine the index1(e.hash & oldCap), which has two results, one is0One is oldCap, such as oldCap=8The hash is3.11.19.27, (E. hash & oldCap) is0.8.0.8So that the3.19Make a new list, index is3; while11.27To form a new linked list, the newly assigned index is3+8; The overwritten hash in JDK1.7 is (e.hash & newCap-1(That's right3.11.19.27right16Mod, also3.11.3.11, the same result as above, but index is3Is in the list19.3The index for3+8Is in the list27.11That is to say1.7After resize, the order of the data becomes a flashback, while1.8I didn't change the order. How it works: We use2An extension of a power2Times), so, the position of the element is either in the original position, or in the original position and then moved2Student: To the power. N is the length of the table. Figure (a) shows the example of determining the index position of key1 and KEY2 before capacity expansion. Figure (b) shows the example of determining the index position of key1 and KEY2 after capacity expansion.Copy the code

Element after recalculating the hash, because n becomes2Times PI, so n minus PI1The mask range is much higher1Bit (red), so the new index will look like this:Copy the code

Therefore, when we extend the HashMap, we don't need to recalculate the hash as we did in JDK1.7, just look at the new bit of the original hash value1or0That's it. Yeah0If the index hasn't changed, yes1If the index is changed to "old index +oldCap", you can see the following figure16Expand to32Resize schematic diagram ofCopy the code

This design is actually very clever, not only saves the time of recalculating the hash value, but also, due to the addition of1Bit is0or1It can be considered random, so the resize process evenly dispersing the previously conflicting nodes into the new bucket. This is the new optimization point for JDK1.8. In JDK1.7, the list element will be inverted if the index position of the new table is the same as that of the old list.Copy the code

Get operation

Get is a simple operation that simply calculates the hash value of the incoming key and finds the corresponding position.

    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
Copy the code

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if((tab = table) ! =null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) ! =null) {
            if (first.hash == hash && // always check first node((k = first.key) == key || (key ! =null && key.equals(k))))
                return first;
            if((e = first.next) ! =null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                        return e;
                } while((e = e.next) ! =null); }}return null;
    }
Copy the code

Week2.jdk8 HashMap source code analysis

The constructor

The put operation

resize

Get operation

Related Posts

What’s the secret behind the scan? Why does small code look like a Daisy? | you ask a goose

The rust extension for deno is part 2 of the denO series

Windows PC mechanism -DLL global variable sharing