A few words about HashMap

preface

HashMap is an implementation of the Map interface based on a hash table. It is a key-value store, that is, it stores key-value pairs. HashMap is not synchronized, which means it is not thread-safe. Both keys and values can be null, and the mappings in a HashMap are not ordered

The hash algorithm

A hash algorithm, also known as a digest algorithm, is a function that converts given data into a fixed-length, seemingly random value

  • The output length is fixed (for the same hash function)
  • The same input always yields the same output
  • Similar inputs may yield very different outputs
  • Different inputs may yield the same output (a hash collision)
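These properties can be checked directly against Java's `String.hashCode()`. The class name below is just illustrative; the pair "Aa"/"BB" is a well-known collision under String's 31-based polynomial hash:

```java
public class HashCollisionDemo {
    public static void main(String[] args) {
        // Same input always yields the same output (a fixed-length 32-bit int)
        System.out.println("hello".hashCode() == "hello".hashCode()); // true

        // Different inputs can yield the same output: a hash collision
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
    }
}
```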

In Java, hashCode() is defined in the JDK’s Object class, which means every Java class has a hashCode() method

     The hashCode method of the Object class returns the object's identity hash, historically derived from its address in JVM memory
	public native int hashCode();

The native keyword indicates a native method: one implemented in C/C++, compiled into a platform-specific library, and called from Java

The String class overrides the hashCode() method

    public int hashCode() {
        int h = hash;	// defaults to 0
        if (h == 0 && value.length > 0) {	// private final char value[];
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
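A quick way to internalize the 31*h + c polynomial above is to recompute it by hand and compare it against the real String.hashCode(). This is a minimal sketch; the class and method names are invented for illustration:

```java
public class StringHashCheck {
    // Recompute String's polynomial hash: h = 31*h + c over each char
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "HashMap";
        // Matches the JDK's cached computation
        System.out.println(manualHash(s) == s.hashCode()); // true
    }
}
```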

Since it’s a hash algorithm, collisions can’t be completely avoided

Because the probability of collision relates to the security of the algorithm, a secure hash algorithm must satisfy:

  • Low collision probability
  • The original value cannot be recovered from the hash

Common hash algorithms

Algorithm     Output length (bytes)   Output length (bits)
MD5           16 bytes                128 bits
SHA-1         20 bytes                160 bits
RipeMD-160    20 bytes                160 bits
SHA-256       32 bytes                256 bits
SHA-512       64 bytes                512 bits

In terms of collision probability, the longer the output of the hash algorithm, the harder it is to produce a collision, and the safer the algorithm

The data structure of the HashMap

Before JDK 1.8: array + linked list

Since JDK 1.8: array + linked list + red-black tree

Differences between JDK 1.7 and JDK 1.8

Explain the process briefly

The key's hash determines which bucket (array index) an entry lands in. When two entries hash to the same bucket, equals() is used to compare the keys: an equal key replaces the old value, while an unequal key is appended to the linked list. If the list grows too long, efficiency suffers, so by default, when a list's length reaches 8 and the array length is at least 64, the list is converted to a red-black tree

Why were red-black trees introduced after 1.8?

No matter how good the hash algorithm is, it cannot avoid hash collisions. When collisions happen, entries chain into a linked list; early on, with little data, efficiency is fine

However, as collisions accumulate, the list grows long, and a lookup must traverse it, costing O(n) time.

With a red-black tree, lookup takes O(log n), which improves efficiency

When will the capacity be expanded?

    /** The load factor used when none is specified in the constructor. */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;


		// Expansion condition 1: the table is null or its length is 0
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

		// Expansion condition 2: size exceeds the threshold
        if (++size > threshold)
            resize();

The first condition is straightforward: if the table is empty, expand (initialize) it

The second condition deserves more attention

  1. size is the actual number of key-value pairs in the HashMap, not the length of the array

  2. threshold = capacity (DEFAULT_INITIAL_CAPACITY) * load factor (DEFAULT_LOAD_FACTOR)

    Therefore, the default expansion threshold is 16 * 0.75 = 12
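The default threshold can be checked with a one-liner; the class name is illustrative, and the two constants simply mirror the HashMap fields quoted above:

```java
public class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    public static void main(String[] args) {
        // threshold = capacity * load factor
        int threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        System.out.println(threshold); // 12: the put that pushes size past 12 triggers resize()
    }
}
```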

Expansion source code

newThr = oldThr << 1; // double the threshold

If an initial capacity was stashed in threshold, it becomes the new capacity: newCap = oldThr;

If the old threshold is also 0, the defaults are used

    newCap = DEFAULT_INITIAL_CAPACITY;
    newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);

Expansion parameters of common collection classes

Collection class   Initial capacity   Load factor   Expansion increment
ArrayList          10                 1             0.5x (new size = 1.5x)
Vector             10                 1             1x (new size = 2x)
HashSet            16                 0.75          1x (new size = 2x)
HashMap            16                 0.75          1x (new size = 2x)

When does a list turn into a red-black tree?

	/**
	 * The value must be greater than 2 and should be at least 8 to mesh with
	 * assumptions in tree removal about conversion back to plain bins upon shrinkage.
	 */
static final int TREEIFY_THRESHOLD = 8;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
static final int MIN_TREEIFY_CAPACITY = 64;

The linked list length threshold is 8 and the array length threshold is 64

The treeify threshold must be greater than 2 and should be at least 8, to mesh with the assumptions tree removal makes about converting back to a plain list when a tree shrinks

The array length threshold must be at least 4 times the treeify threshold, to avoid conflicts between resizing and treeification

    if (binCount >= TREEIFY_THRESHOLD - 1) // check whether the list threshold is reached
        treeifyBin(tab, hash);

    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY) // check whether the array threshold is reached
        resize();

How is the hash function implemented in HashMap? What other hash functions can be implemented?

A: Hash the key's hashCode(): shift it right by 16 bits unsigned and XOR it with itself. Alternatives include the mid-square method, pseudo-random method, and remainder method, but all three are less efficient. The unsigned 16-bit shift plus XOR is the most efficient.

What happens when the Hashcodes of two objects are equal?

A: A hash collision occurs. If the keys are equal, the old value is replaced; if not, the new entry is appended to the end of the list. If the list length exceeds the threshold of 8 (and the table is large enough), the list is converted to a red-black tree.

What are hash collisions and how do you solve them?

A: A hash collision occurs whenever the keys of two elements compute the same hash code. Before JDK 8, linked lists were used to resolve hash collisions. Since JDK 8, a linked list plus red-black tree is used.

If two keys have the same hashCode, how do you store key-value pairs?

Answer: Compare the keys with equals(). If equal: the new value overwrites the old one. If not: the new key-value pair is appended to the same bucket in the hash table.

Member of the HashMap collection class

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

Cloneable marks the class as cloneable; Serializable marks it as serializable

Both are marker interfaces: they declare no methods

public interface Serializable {}

public interface Cloneable {}

(Class hierarchy diagram: HashMap extends AbstractMap<K,V> and implements Map<K,V>)

1. The serial version UID

private static final long serialVersionUID = 362498820763181265L;

Serialization: meaning and usage scenarios

  • Serialization: writing an object to an IO stream
  • Deserialization: recovering an object from an IO stream
  • Meaning: the serialization mechanism converts Java objects into byte sequences that can be stored on disk or transferred over the network and later restored to the original object. Serialization allows objects to exist independently of program execution

2. The initial capacity of the collection

    /** The default initial capacity - MUST be a power of two. */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

The default initial capacity must be a power of 2. Why does it have to be a power of 2?

We all know that to access a HashMap efficiently, hash collisions should be minimized: data should be evenly distributed, every array slot used, and linked lists kept roughly equal in length

The indexing algorithm of HashMap is a modulus, hash % length, but a direct modulus is slower on a computer than a bit operation

hash % length == hash & (length - 1) when length is a power of 2

& : bitwise AND; a result bit is 1 if and only if both input bits are 1
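The equivalence is easy to verify for non-negative hashes; the class name below is illustrative:

```java
public class BitAndDemo {
    public static void main(String[] args) {
        int length = 16; // a power of 2
        int[] hashes = {0, 1, 15, 16, 17, 12345};
        for (int hash : hashes) {
            // For non-negative hashes and power-of-two lengths the two are identical
            System.out.println((hash % length) == (hash & (length - 1))); // true every time
        }
    }
}
```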

Capacity expansion is very performance intensive, and the default capacity may not be suitable, so it is good practice to specify a suitable size when you know how much you need to store

What if my custom capacity is not a power of 2?

HashMap rounds it up to the next power of 2: for example, 14 becomes 16, and 20 becomes 32

    /** Returns a power of two size for the given target capacity. */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
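The bit-smearing trick above can be replayed outside HashMap to confirm the rounding behavior. This is a self-contained sketch that copies the method body verbatim; the class name is invented:

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor: smear the highest set bit
    // rightward so n becomes all ones, then add 1 to get a power of two
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(14)); // 16
        System.out.println(tableSizeFor(20)); // 32
        System.out.println(tableSizeFor(16)); // 16: already a power of two, unchanged
    }
}
```

The initial `cap - 1` is what keeps an exact power of two (like 16) from being doubled to the next one.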

3. The load factor

static final float DEFAULT_LOAD_FACTOR = 0.75f;

The above has been made very clear, and will not be repeated here

4. Maximum collection capacity

static final int MAXIMUM_CAPACITY = 1 << 30;

The maximum capacity is 2^30 (1 shifted left by 30 bits)

5. Tree threshold (boundary value of red-black tree)

static final int TREEIFY_THRESHOLD = 8;
static final int MIN_TREEIFY_CAPACITY = 64;

When the list length reaches 8, it becomes a red-black tree

(The array length must also be at least 64, as mentioned above)

final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    ...
}

At 8 the list becomes a red-black tree; back at 6 the tree reverts to a linked list

Red-black trees take up far more space than linked lists, so starting with a linked list is a space-for-time trade-off

Why is the treeify threshold (8) different from the untreeify threshold (6)? To avoid frequent back-and-forth conversions

The average search length of a red-black tree is log(n). At length 8 that is log(8) = 3, while a linked list averages n/2 = 8/2 = 4, so converting to a tree is worthwhile. At length 6 or less, the list averages 6/2 = 3 versus log(6) ≈ 2.6: the list is nearly as fast, and the time spent converting the structure and generating the tree is not worth it.

6. The untreeify threshold (tree back to list)

static final int UNTREEIFY_THRESHOLD = 6;

Eight becomes a red-black tree and six becomes a linked list

The value must be greater than 2 and should be at least 8 to mesh with assumptions in tree removal about conversion back to plain bins upon shrinkage.


We can see that the contraction goes back to plain bin, proving that the HashMap shrinks from a red-black tree to a linked list

And when they become too small (due to removal or resizing) they are converted back to plain bins; that is, when tree bins shrink, they revert to ordinary buckets.

Constructors of a HashMap

1. The default constructor

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

This is the most commonly used constructor, though the defaults may not be the best choice

2. Specify the capacity

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

Since capacity expansion is very performance intensive, we can specify the capacity ourselves

This constructor delegates to the one below ↓, combining the given capacity with the default load factor

3. Specify the capacity and load factor

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

this.threshold = tableSizeFor(initialCapacity);

The specified capacity is rounded up to a power of 2 by tableSizeFor, as mentioned above. What happens when the given capacity is not a power of 2 was covered earlier and will not be repeated here

4. Specify the collection constructor

    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }

    final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
        int s = m.size();
        if (s > 0) {
            if (table == null) { // pre-size
                float ft = ((float)s / loadFactor) + 1.0F;
                int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                         (int)ft : MAXIMUM_CAPACITY);
                if (t > threshold)
                    threshold = tableSizeFor(t);
            }
            else if (s > threshold)
                resize();
            for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
                K key = e.getKey();
                V value = e.getValue();
                putVal(hash(key), key, value, false, evict);
            }
        }
    }

Why does float ft = ((float)s / loadFactor) + 1.0f; add this 1.0f?

To obtain more capacity up front and reduce expansions, improving efficiency

The result of s / loadFactor is a decimal. Adding 1.0f and casting with (int)ft effectively rounds the decimal up, securing as much capacity as possible; a larger capacity reduces the number of resize calls

So +1.0F is for more capacity

For example, if the source map holds 6 elements, 6 / 0.75 is 8, which is already a power of 2, so the new array size would be 8. Storing those elements in an array of length 8 leaves the map right at its threshold (8 * 0.75 = 6), so further additions would immediately trigger a resize and hurt performance

Adding 1 yields 9, which rounds up to 16, reducing the number of expansions
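The arithmetic can be replayed directly; the class name is illustrative, and the constants mirror the fields quoted earlier:

```java
public class PreSizeDemo {
    public static void main(String[] args) {
        float loadFactor = 0.75f;
        int s = 6; // number of elements in the source map
        float ft = ((float) s / loadFactor) + 1.0f;
        System.out.println((int) ft); // 9, instead of exactly 8
        // tableSizeFor(9) rounds up to 16, so 8 elements fit before the next
        // resize; without the +1.0f, tableSizeFor(8) would give a table of 8
        // whose threshold (8 * 0.75 = 6) is already reached
    }
}
```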

Member methods of a HashMap

1. Adding elements: put

The steps are as follows

  • First compute the key's hash to determine which bucket (array index) it belongs to
  • If there is no collision in that bucket, insert directly
  • If a collision occurs, the conflict must be handled
    • If the bucket already uses a red-black tree, insert via the red-black tree method
    • Otherwise use the traditional chaining method; if the chain reaches the threshold, convert it to a red-black tree
  • Compare keys: if an equal key already exists in the bucket, replace its value with the new value
  • If size exceeds threshold, expand the capacity
    //HashMap
	public V put(K key, V value) {
        // Compute the hash of the key and delegate to putVal
        return putVal(hash(key), key, value, false, true);
    }

    static final int hash(Object key) {
        int h;
        // If the key is null, the hash value is 0
        // Otherwise XOR the key's hashCode() with its unsigned 16-bit right shift -> the hash value
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
	
	// The index is the bitwise AND of the hash and (array length - 1)
	//(tab.length-1)&hash is the same as hash%tab.length (when length is a power of 2)
	n = tab.length;
	tab[i = (n - 1) & hash]
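The spreading step can be reproduced standalone. This sketch copies hash() verbatim into an illustrative class and shows the null case:

```java
public class HashSpreadDemo {
    // Same spreading as HashMap.hash(): mix the high 16 bits into the low 16,
    // so table indexes (which only use low bits) see the whole hashCode
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16; // table length
        System.out.println("bucket of \"HashMap\" = " + ((n - 1) & hash("HashMap")));
        System.out.println(hash(null)); // 0: a null key always goes to bucket 0
    }
}
```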

Interestingly, HashMap returns a hash of 0 when the key is null

Hashtable performs no such null handling, so a Hashtable key cannot be null

        //Hashtable put()
        if (value == null) {
            throw new NullPointerException();
        }
        int hash = key.hashCode(); // throws NullPointerException here if key is null
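The difference is easy to observe; the class name is illustrative:

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "ok"); // allowed: hash(null) is 0, so it lands in bucket 0
        hashMap.put("k", null);  // null values are allowed too
        System.out.println(hashMap.get(null)); // ok

        Map<String, String> hashtable = new Hashtable<>();
        try {
            hashtable.put(null, "boom"); // key.hashCode() is called on null
        } catch (NullPointerException e) {
            System.out.println("Hashtable rejects null keys");
        }
    }
}
```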

2. The core insertion method: putVal

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, do not change the existing value
     * @param evict if false, the table is in creation mode
     * @return previous value, or null if none
     */    
	final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict)

Creating the table array

Is the HashMap array created by new HashMap()?

No. It is created on the first putVal, when the null check triggers the expansion method resize()

        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

The array indexing algorithm

n = tab.length
tab[i = (n - 1) & hash]

As mentioned above, to repeat:

To access a HashMap efficiently, hash collisions should be minimized: data should be evenly distributed, every array slot used, and linked lists kept roughly equal in length

The indexing algorithm of HashMap is a modulus, hash % length, but a direct modulus is slower on a computer than a bit operation

hash % length == hash & (length - 1) when length is a power of 2

3. Converting to a red-black tree: treeifyBin

Replaces all linked nodes in the bin at the given hash index, unless the table is too small, in which case it resizes instead

	// Inside putVal: walk the linked list in the bucket
    for (int binCount = 0; ; ++binCount) {
        if ((e = p.next) == null) {
            p.next = newNode(hash, key, value, null);
            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                treeifyBin(tab, hash);
            break;
        }
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }

Before we get into the conversion: what are 2-3-4 trees and red-black trees?

A data structure learning site is recommended

What is a 2-3-4 tree?

A 2-3-4 tree is a 4th-order B-tree (balanced tree) and belongs to the multiway search trees. Its structure has these constraints: all leaf nodes are at the same depth, and nodes can only be 2-nodes, 3-nodes, or 4-nodes

  • 2-node: a node with 1 element and 2 children
  • 3-node: a node with 2 elements and 3 children
  • 4-node: a node with 3 elements and 4 children

All nodes must contain at least one element

The elements are always sorted, and the overall binary search tree property is maintained, where the parent is larger than the left child and smaller than the right child

And when a node has multiple elements, each element must be greater than the one to its left and the element in its left subtree

Querying a 2-3-4 tree is as simple as querying an ordinary binary search tree, but because the number of elements per node varies, it is inconvenient to implement in some programming languages, so it is usually implemented as the equivalent red-black tree

Mapping between 2-3-4 trees and red-black trees

  • 2-node – a black node
  • 3-node – a black node with one red child (leaning left or right)
  • 4-node – the middle element moves up as a black node with two red children

What is a red-black tree?

A red-black tree (RBT) is a self-balancing binary search tree in which every node obeys the following rules

  • Every node is either black or red

  • The root node is black

  • Every leaf node is black

    • A node with no children (i.e. degree 0) is called a leaf node

  • Both children of every red node must be black

    • A red node may appear to have no children because its null black (NIL) children are hidden by default
  • Every path from any node to each of its leaf nodes contains the same number of black nodes

What makes a red-black tree self-balancing? Left rotation, right rotation, and recoloring

Left rotation: the right child moves up, and parent and child swap roles

Take a node as the pivot (rotation node): its right child becomes the pivot's parent,

the right child's left child becomes the pivot's right child, and the pivot's left child stays unchanged.

Right rotation: the left child moves up, and parent and child swap roles

Take a node as the pivot (rotation node): its left child becomes the pivot's parent,

the left child's right child becomes the pivot's left child, and the pivot's right child stays unchanged.

Recoloring

A node's color flips from black to red or from red to black

Red-black tree insertion analysis

In an ordinary binary search tree, smaller values go left and larger values go right

The same is true for red-black tree inserts, except that the tree may then need rebalancing (rotation, recoloring)

Red-black tree balancing adjustments

  • 2-3-4 tree: adding a new element to a 2-node merges it directly into a 3-node
    • Red-black tree: a new red node is added under a black node: no adjustment needed
  • 2-3-4 tree: adding a new element to a 3-node merges it into a 4-node
    • Red-black tree: there are 6 cases, and the 2 where the new node falls in the middle need no adjustment
    • The left-left and right-right cases need 1 rotation
    • The left-right and right-left cases need 2 rotations
  • 2-3-4 tree: adding an element to a 4-node promotes the middle element to the parent, and the new element merges with the remaining nodes
    • Red-black tree: new node red + grandparent black, parent and uncle red
    • Adjust by recoloring the grandparent red and the parent and uncle black; if the grandparent is the root, recolor it back to black

Red-black tree deletion analysis

Let’s start with an ordinary binary tree

Predecessor node: in an in-order traversal of the binary tree, the node immediately before the current node

Successor node: in an in-order traversal of the binary tree, the node immediately after the current node

For example, the complete binary tree (1,2,3,4,5,6,7) is traversed in order as: (4,2,5,1,6,3,7)

That is, the predecessor of node 1 is 5, and its successor is 6
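The traversal order can be checked with a few lines; the class name is illustrative, and the tree is stored level by level in an array (children of index i at 2i+1 and 2i+2):

```java
import java.util.ArrayList;
import java.util.List;

public class InorderDemo {
    // In-order traversal of a binary tree stored level by level in an array
    static void inorder(int[] tree, int i, List<Integer> out) {
        if (i >= tree.length) return;
        inorder(tree, 2 * i + 1, out); // left child
        out.add(tree[i]);              // current node
        inorder(tree, 2 * i + 2, out); // right child
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        inorder(new int[]{1, 2, 3, 4, 5, 6, 7}, 0, out);
        System.out.println(out); // [4, 2, 5, 1, 6, 3, 7]
    }
}
```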

There are three cases of deletion

  1. A leaf node: delete it directly
  2. A node with one child: replace it with that child
  3. A node with two children: find its predecessor or successor node to replace it, which reduces the problem to case 1 or 2

The conversion method treeifyBin

    // Pass in the array tab and the hash value used to determine the index
	final void treeifyBin(Node<K,V>[] tab, int hash) {
        //n is the array length, index the subscript, and e the node read from the array by hash
        int n, index; Node<K,V> e;
        // Check whether the array has capacity and has reached the 64-length threshold
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        // If the bucket is not empty, loop over its nodes and convert them to tree nodes
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            //hd is the head, tl the tail
            TreeNode<K,V> hd = null, tl = null;
            do {
                // Turn an ordinary node into a tree node
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            // Put the processed list of tree nodes back into the bucket
            if ((tab[index] = hd) != null)
                // Balance the list of nodes into a red-black tree
                hd.treeify(tab);
        }
    }

Turn nodes into tree nodes

    TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
        return new TreeNode<>(p.hash, p.key, p.value, next);
    }

Create a red-black tree

final void treeify(Node<K,V>[] tab) {
    TreeNode<K,V> root = null; // the root node of the tree
    for (TreeNode<K,V> x = this, next; x != null; x = next) { // iterate the list; x is the current node, next the following one
        next = (TreeNode<K,V>)x.next; // the next node
        x.left = x.right = null; // clear the current node's left and right children
        if (root == null) { // if there is no root node yet
            x.parent = null; // the current node has no parent
            x.red = false; // color the current node black
            root = x; // the root now points to the current node
        }
        else { // the root node already exists
            K k = x.key; // key of the current list node
            int h = x.hash; // hash of the current list node
            Class<?> kc = null; // the Class the key belongs to
            for (TreeNode<K,V> p = root;;) { // start from the root; this loop can only exit from inside
	            // GOTO1
                int dir, ph; // dir is the direction (left/right), ph the hash of the current tree node
                K pk = p.key; // key of the current tree node
                if ((ph = p.hash) > h) // the tree node's hash is greater than the list node's hash
                    dir = -1; // the list node goes to the left of the tree node
                else if (ph < h)
                    dir = 1; // to the right

                /* If the current list node's key implements Comparable and the tree node's key is an
                 * instance of the same class, compare the two keys with compareTo.
                 * If they are still equal, fall back to a final tieBreakOrder comparison. */
                else if ((kc == null &&
                            (kc = comparableClassFor(k)) == null) ||
                            (dir = compareComparables(kc, k, pk)) == 0)
                    dir = tieBreakOrder(k, pk);

                TreeNode<K,V> xp = p; // save the current tree node

                /* If dir <= 0: the list node belongs somewhere to the left of the current tree node,
                 * but not necessarily as its direct left child; it may go under a deeper node.
                 * If dir > 0: it belongs somewhere to the right, symmetrically.
                 * If the current tree node is not a leaf, descend into its left or right child and
                 * continue searching from GOTO1. If it is a leaf, mount the list node as its left or
                 * right child according to dir, rebalance the tree, then move on to the next list node. */
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    x.parent = xp; // the list node becomes a child of the tree node
                    if (dir <= 0)
                        xp.left = x; // as the left child
                    else
                        xp.right = x; // as the right child
                    root = balanceInsertion(root, x); // rebalance
                    break;
                }
            }
        }
    }
    // After all the list nodes are processed, the balancing operations may have moved the root,
    // and it is uncertain which node of the list ended up as the root of the tree.
    // Lookups go through tab[n], which must hold the root object, but it currently holds the
    // first node of the list, so a fix-up is needed.
    // TreeNode is both a red-black tree node and a doubly-linked list node.
    // moveRootToFront makes sure the tree's root is also the first node of the list.
    moveRootToFront(tab, root); 
}

4. Expansion: resize

The capacity expansion mechanism is described above and will not be repeated here

The capacity is doubled

	final Node<K,V>[] resize() {
        // The table before expansion
        Node<K,V>[] oldTab = table;
        // The previous table length
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
 
 
        if (oldCap > 0) {
            // If the old capacity already reached the maximum, set threshold to Integer.MAX_VALUE
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Otherwise the new capacity is oldCap << 1, shifted left by 1, i.e. oldCap * 2
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
 
        // Reaching this else means this is the first put and the first expansion
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
 
            // threshold = load factor * capacity; DEFAULT_INITIAL_CAPACITY is 16 and DEFAULT_LOAD_FACTOR is 0.75 by default
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }

Recalculate the indexes and move the old entries over


        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
 
        // So far, if it is the first expansion, i.e. there are no elements, the following code will not run
        // If there are elements in the array, add the elements to the new array
        if(oldTab ! =null) {
            // Loop through the original list, placing the old array in the new array
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if((e = oldTab[j]) ! =null) {
                    oldTab[j] = null;
                    // If there is only one element in the list, the hash of e's key is directly recalculated with the new capacity
            
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    // If it is a red-black tree, the red-black tree is copied
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else {
/* The hash key (k1,k2) is 1101, and the hash key (k1,k2) is 1101. The default value for 11101 initialization is 16 bytes, so the binary value for 16-1 is 10000, and the binary value for 16-1 is 1111 and the key value is 1101, but the key value is 1101 and the key value is 11111(36-1), so we need to recalculate the subscript, Through & operation with oldCap, it is divided into low value and high value. Because there are only two results of oldCap & operation, one is 0 and the other is that oldCap sets the result =0 to low value and saves the result equal to oldCap to high value, and the hash value of low value must be less than oldCap. So the hash at the top of the index will be larger than oldCap, and at least one more 1 will be added to the far left of the binary (there may be other binary numbers in front of it, but the result will be the same) to calculate exactly where it was plus the new capacity */
						// Head and tail of the low-index list (index stays at j)
                        Node<K,V> loHead = null, loTail = null;
						// Head and tail of the high-index list (index becomes j + oldCap)
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            // Gets the next element of the current node
                            next = e.next;
                         
                            if ((e.hash & oldCap) == 0) {
                                // If loTail==null is the first element, assign this element to loHead
                                if (loTail == null)
                                    loHead = e;
                                else
                                    // Otherwise append it to the tail of the low-index list
                                    loTail.next = e;
                                loTail = e;
                            }
                            // Same bookkeeping for the high-index nodes
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            // The low-index list keeps the original index j
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            // The high-index list moves to j + oldCap, i.e. the original index plus the old capacity
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

How JDK 8 improves on JDK 7's expansion method

In JDK 7, every element's index is recomputed from scratch during expansion, which is inefficient.

In 1.7, the transfer() method (removed in JDK 1.8) recomputes the threshold and array length and rehashes every entry during expansion:

    // JDK 1.7
    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        // If the old capacity has already reached the maximum, stop growing
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        Entry[] newTable = new Entry[newCapacity];
        // transfer() moves the entries from the old array into the new one
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        // Point the HashMap at the new array
        table = newTable;
        // Compute the new expansion threshold
        threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while (null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                // Compute the index from the key's hash and the new array length
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }

In JDK 8, expansion does not recompute each index from scratch; it only needs to check whether the new high-order bit of the hash is 0 or 1.

e.hash & oldCap performs a bitwise AND, and the result is examined:

  • If the result is 0, the index is unchanged: newTab[j] = loHead;
  • If the result is oldCap (the new bit is 1), the index becomes old index + oldCap (original position + old capacity): newTab[j + oldCap] = hiHead;

This clever rehash not only saves the time of recomputing indices, but, since the newly significant bit of the hash is effectively random, it also spreads previously colliding nodes evenly across the two target buckets. After a resize, each bucket holds at most as many nodes as the original bucket did, so collisions do not get worse.

    // JDK 1.8

    if ((e.hash & oldCap) == 0) {
        if (loTail == null)
            loHead = e;
        else
            loTail.next = e;
        loTail = e;
    }
    else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e;
    }
	// Expanded location = original location
	if (loTail != null) {
        loTail.next = null;
        newTab[j] = loHead;
    }
	// Expanded location = original location + old capacity
    if (hiTail != null) {
        hiTail.next = null;
        newTab[j + oldCap] = hiHead;
    }
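The low/high split above can be checked with a little bit arithmetic. A minimal sketch (class name and sample hash values are mine), assuming a table growing from 16 to 32:

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;            // binary 10000
        int h1 = 0b01101;           // new bit is 0 -> index unchanged
        int h2 = 0b11101;           // new bit is 1 -> index moves by oldCap
        int j = h1 & (oldCap - 1);  // old index of both hashes: 1101 = 13

        System.out.println((h1 & oldCap) == 0);       // true: h1 stays at j = 13
        System.out.println((h2 & oldCap) == oldCap);  // true: h2 moves to j + 16
        // Recomputing with the doubled mask gives the same answer as j + oldCap
        System.out.println(h2 & (2 * oldCap - 1));    // 29
        System.out.println(j + oldCap);               // 29
    }
}
```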

From the source above, expansion falls into three cases:

First, initialization: the table is still empty, so the default capacity and threshold are used. The threshold for the first expansion is threshold = DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR = 16 * 0.75 = 12.

Second, oldCap is greater than 0, i.e. the table has already been allocated: each expansion doubles both the capacity and the threshold.

Third, an initial capacity was specified and is held in threshold: newCap takes that value, and the new threshold is recomputed from it.
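The doubling in the second case can be illustrated with plain arithmetic. This is a sketch with a hypothetical class name, not HashMap internals:

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        int cap = 16;             // DEFAULT_INITIAL_CAPACITY
        for (int i = 0; i < 3; i++) {
            // threshold = capacity * load factor; both double on each resize
            System.out.println("capacity=" + cap + " threshold=" + (int) (cap * loadFactor));
            cap <<= 1; // resize doubles the capacity
        }
    }
}
```

This prints capacity/threshold pairs 16/12, 32/24, 64/48.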

5. Deletion: remove

As with put, the public method simply delegates; the real work is done by removeNode.

The key's hash locates the bucket, and the key itself identifies the exact node to delete.

If the matching node exists, it is removed and its value is returned; otherwise null is returned.

    public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }
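A minimal usage sketch (class name is mine) showing the return-value contract:

```java
import java.util.HashMap;
import java.util.Map;

public class RemoveDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        System.out.println(map.remove("a")); // 1: the removed value is returned
        System.out.println(map.remove("a")); // null: the key is no longer present
    }
}
```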

6. The core deletion method: removeNode

/**
 * The method is final and cannot be overridden; subclasses can add their own
 * logic through the afterNodeRemoval hook.
 *
 * @param hash       hash of the key, obtained from hash(key)
 * @param key        key of the entry to delete
 * @param value      value of the entry to delete; only checked when matchValue is true
 * @param matchValue if true, delete only when value.equals(node.value); otherwise ignore the value
 * @param movable    whether other nodes may be moved after deletion; if false, do not move them
 * @return the deleted node, or null if no node was deleted
 */
final Node<K,V> removeNode(int hash, Object key, Object value,
                            boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index; // node array, current node, array length, index
    /*
     * If the table tab is not empty, its length n is greater than 0, and the bucket's
     * first node p (the root of a tree or the head of a list) is not null,
     * walk down from p to find the node that matches the key
     */
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v; // node to return, a temporary node, key and value variables

        // If the first node's key equals the given key, it is the node to delete
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;

        /*
         * Otherwise, if p has a next node, a hash collision occurred at this bucket,
         * which may hold a linked list or a red-black tree. If there is no next node,
         * nothing matches and null will be returned.
         */
        else if ((e = p.next) != null) {
            // If the current node is of type TreeNode, it is already a red-black tree. Then call getTreeNode to find the node in the tree structure
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            // If it is not a tree node, then it is a linked list
            else {
                do {
                    // If e's key equals the given key, e is the node to delete: save it in node and break the loop
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }

                    // Reaching here means e did not match
                    p = e; // Point p at e so that in the next iteration p holds e's predecessor; if the next e matches, p is node's predecessor
                } while ((e = e.next) != null); // Keep matching the next node until one matches or the whole list has been traversed
            }
        }
        /*
         * If node is not null, a node matching the key was found.
         * If values need not be compared (matchValue is false), or the values are equal,
         * the node can be removed.
         */
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode) // The node lives in a red-black tree: call removeTreeNode (analyzed separately) to delete it
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p) // node is the first node of the bucket
                tab[index] = node.next; // deleting the head: point the bucket directly at the second node
            else // otherwise p is node's predecessor: unlink node by pointing p.next at node's successor
                p.next = node.next;
            ++modCount; // the HashMap's structural modification count increases
            --size; // the number of elements in the HashMap decreases
            afterNodeRemoval(node); // a hook with no logic in HashMap; subclasses may override it if needed
            return node;
        }
    }
    return null;
}
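The matchValue flag is reachable from user code through the two-argument Map.remove(key, value), which HashMap implements by calling removeNode with matchValue = true. A small sketch (class name is mine):

```java
import java.util.HashMap;
import java.util.Map;

public class MatchValueDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        System.out.println(map.remove("a", 2));   // false: value does not match, nothing removed
        System.out.println(map.remove("a", 1));   // true: value matches, entry removed
        System.out.println(map.containsKey("a")); // false
    }
}
```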

7. Retrieval: get

Like put and remove, the public method delegates to another method, here getNode.

It returns the mapped value if one exists, or null otherwise.

The key's hash and the key itself are passed to getNode.

    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
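One caveat worth showing: since HashMap permits null values, get returning null does not always mean the key is absent; containsKey is the reliable presence test. A minimal sketch (class name is mine):

```java
import java.util.HashMap;
import java.util.Map;

public class GetDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", null);                       // null values are allowed
        System.out.println(map.get("a"));         // 1
        System.out.println(map.get("b"));         // null, but the key exists
        System.out.println(map.get("c"));         // null, the key is absent
        System.out.println(map.containsKey("b")); // true
    }
}
```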

8. The core lookup method: getNode

The bucket's first node is first = tab[(n - 1) & hash]

Check whether this first node is the one we want: (k = first.key) == key, or key != null && key.equals(k)

If the key matches, return that node's value

Otherwise, check whether the bucket is a red-black tree; if so, look the node up with getTreeNode and return it

    if (first instanceof TreeNode)
        return ((TreeNode<K,V>)first).getTreeNode(hash, key);

If it’s not a red-black tree, loop through the list and check the keys as before

    do {
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    } while ((e = e.next) != null);

The source code is as follows

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
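Both getNode and removeNode locate the bucket with (n - 1) & hash, where HashMap.hash spreads the key's hashCode by XORing its high 16 bits into the low 16. A small sketch (class name and sample key are mine):

```java
public class IndexDemo {
    // Same spreading as JDK 8's static HashMap.hash(Object key)
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16; // table length, always a power of two
        int h = hash("hello");
        int index = (n - 1) & h;
        System.out.println("bucket index = " + index);
        // For a power-of-two n, (n - 1) & h equals the unsigned remainder of h mod n
        System.out.println(index == Integer.remainderUnsigned(h, n));
    }
}
```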

Five ways to traverse a HashMap

1. Loop through entries in the entrySet provided by map

The entrySet() method of Map returns a Set<Map.Entry<K, V>>.

Looping over this set yields Map.Entry<K, V> objects.

Map.Entry provides entry.getKey() and entry.getValue() to obtain the key and the value.

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    map.put(1, 10);
    map.put(2, 20);

    Set<Map.Entry<Integer, Integer>> entries = map.entrySet();
    for (Map.Entry<Integer, Integer> entry : entries) {
        System.out.println("key ="+entry.getKey()+" value ="+entry.getValue());
    }

2. For-each over the key set and the value collection

map.keySet() and map.values() contain all the keys and all the values respectively, and each can be traversed directly.

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    map.put(1, 10);
    map.put(2, 20);

    // Iterate over the keys
    for (Integer key : map.keySet()) {
        System.out.println("Key = " + key);
    }

    // Iterate over the values
    for (Integer value : map.values()) {
        System.out.println("Value = " + value);
    }

3. Iterating entrySet with an Iterator (with generics)

The Set returned by map.entrySet() can be traversed through its iterator() method.

entries.next() returns the next entry; as in the first method, use getKey and getValue.

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    map.put(1, 10);
    map.put(2, 20);

    Iterator<Map.Entry<Integer, Integer>> entries = map.entrySet().iterator();
    while (entries.hasNext()) {
        Map.Entry<Integer, Integer> entry = entries.next();
        System.out.println("Key = " + entry.getKey() + ", Value = " + entry.getValue());
    }

4. Iterating entrySet with an Iterator (without generics)

Similar to the third method; the only difference is the absence of generics, so the key and value need explicit casts.

    Map map = new HashMap();
    map.put(1, 10);
    map.put(2, 20);

    Iterator entries = map.entrySet().iterator();
    while (entries.hasNext()) {
        Map.Entry entry = (Map.Entry) entries.next();
        Integer key = (Integer) entry.getKey();
        Integer value = (Integer) entry.getValue();
        System.out.println("Key = " + key + ", Value = " + value);
    }

5. Traversal with a Java 8 lambda expression

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    map.put(1, 10);
    map.put(2, 20);
    map.forEach((k, v) -> System.out.println("key: " + k + " value:" + v));