HashMap source code analysis

This article is migrated from my previous CSDN: blog.csdn.net/qq_43731074…

Based on the jdk11

HashMap = array + list = array + list = array + list = array + list Collisions are also resolved using chained addresses (here referred to as zippers), but the difference between HashMap and HashTable is that one is thread-safe and the other is non-thread-safe. After jdK1.8 is released, the performance of HashMap has been optimized, and the underlying data structure has been changed to array + linked list + red-black tree. Because the JUC package already has a HashMap-(ConcurrentHashMap) that supports concurrent operations.Take a look at the class diagramTwo versions of the data structure:

Jdk1.8 before:

After jdk1.8:If you’re confused by the above figure, it’s okay to look at the actual data structure inside a HashMap and see why a red-black tree has successive nodes. Right

Here is the data structure stored in HashMap:

// This is an array structure used to store the headers of each Node and its subclasses. transient Node<K,V>[] table; Static class Node<K,V> implements map. Entry<K,V> {// Final int hash; // Static class Node<K,V> implements Map. final K key; V value; // Node<K,V> next; Node(int hash, K key, V value, Node<K,V> next) { this.hash = hash; this.key = key; this.value = value; this.next = next; Static final class TreeNode<K,V> extends Linkedhashmap. Entry<K,V> {TreeNode<K,V> parent; TreeNode<K,V> left; // TreeNode<K,V> right; TreeNode<K,V> prev; // The TreeNode is both a red-black tree and a double-linked list. // Node color Boolean red; TreeNode(int hash, K key, V val, Node<K,V> next) { super(hash, key, val, next); }} // This static class is linkedhashmap.Entry, which is derived from the Node class of HashMap, i.e. TreeNode is a subclass of Node. static class Entry<K,V> extends HashMap.Node<K,V> { }Copy the code

TreeNode is a subclass of Node, which is why the tree nodes in the graph point to other nodes. Since HashMap converts a single linked list into a red black tree, then a red black tree turns into a single linked list. When the number of nodes in a red-black tree is less than 6, the process of converting to a single linked list is needed to speed up the conversion.

HashMap data structure:

Analysis of global variables and their effects

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {// Default initial capacity is 16 static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; Static final int MAXIMUM_CAPACITY = 1 << 30; Static final float DEFAULT_LOAD_FACTOR = 0.7f; static final float DEFAULT_LOAD_FACTOR = 0.75f; Static final int TREEIFY_THRESHOLD = 8; static final int TREEIFY_THRESHOLD = 8; Static final int UNTREEIFY_THRESHOLD = 6 static final int UNTREEIFY_THRESHOLD = 6; Static final int MIN_TREEIFY_CAPACITY = 64 static final int MIN_TREEIFY_CAPACITY = 64 // Array (hash table) transient Node<K,V>[] table; // Data size transient int size; // Transient int modCount; // If k-v is used for the maximum square of the number, the hash value will be recalculated. Therefore, the power of 2 will play a certain role in optimizing performance, because after the power of 2 expansion, the location of the old data will either be in the original position, or move based on the location of the principle. Load factor * capacity int threshold; // loadFactor final float loadFactor; }Copy the code

As can be seen from the above analysis, HashMap is stored in the form of array + linked list at the beginning. When the threshold exceeds 8 and the capacity of hash table reaches the minimum tree capacity of 64, the bucket (linked list) will be converted into a red-black tree structure. When the red-black tree node is lower than 6, the red-black tree will be cut into a linked list.

The relation between the calculation of hash code and position

Take a look at the source code
```
static final int hash(Object key) { int h; Return (key == null)? Return (key == null)? 0 : (h = key.hashCode()) ^ (h >>> 16); } // this is the capacity of the hash table. This is the capacity of the hash table. This is the capacity of the hash table. TAB [index = (n-1) &hash] hash [index = (n-1) &hash]Copy the code
```
The haxsh(Object key) method in HashMap is a new hash code based on (key’s hash code) and (key’s hash code high 16 bits) (xOR operation). But determining where the node should be stored in the hash table also requires (the new hash code) and (the capacity of the hash table -1).

In simple terms the element should be stored in the bucket (the linked list or red-black tree) depending on the hash code of the key and the size of the hash table

Below is the general process

3. Capacity expansion mechanism of HashMap

/** * This method is used for expansion or initialization, and the capacity is a power of two, * * @return the table */ final Node<K,V>[] resize() {// First copy a hash table into oldtab Node<K,V>[] oldTab = table; Int oldCap = (oldTab == null)? 0 : oldTab.length; int oldThr = threshold; // Old capacity expansion threshold // New capacity expansion threshold int newCap, newThr = 0; // Case 1: If (oldCap >= MAXIMUM_CAPACITY) {threshold = integer.max_value; return oldTab; } // Otherwise, the new capacity is 2 times the old capacity. If the new capacity is less than the maximum capacity and the old capacity is greater than the default initial capacity, the new capacity is 2 times the old capacity. else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold} // Else if (oldThr > 0) // Initial capacity was placed in threshold newCap = oldThr; // Case 3: Constructor with no parameters, using the default capacity as the new capacity, // Zero initial threshold Using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } // new capacity * loadFactor if (newThr == 0) {float ft = (float)newCap * loadFactor; NewThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY? (int)ft : Integer.MAX_VALUE); } // Change the threshold to the new threshold threshold = newThr; // Create a hash table with new capacity Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // Create a hash table with new capacity Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // Assign the table to table. table = newTab; If (oldTab! OldTab! OldTab! OldTab! = null) {for (int j = 0; j < oldCap; ++j) {// The temporary Node is used to store the head Node (or the first Node in each bucket). If ((e = oldTab[j])! OldTab [j] = null; oldTab[j] = null; NewTab [e.hash & (newcap-1)] = e; newTab[e.hash & (newcap-1)] = e; newTab[e.hash & (newcap-1)] = e; // If the head node is a tree, use the tree splitting method to rewrite the mapping to the new hash table. Else if ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); Else {// Case 3: linked list case, and there is not only one node // Each linked list data node in the new hash table only two cases: 1. 2. In the new hash table position: LoHead = null; loTail = null; loTail = null; loTail = null; <K,V> hiHead = null, hiTail = null; // next Node Node<K,V> next; Do {// Record a node next = e.next; If ((e.hash &oldCap) == 0) {if (loTail == null) loHead = e; if (e.hash &oldCap) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; Next = e; // Set the current node as the successor of the previous node in principle else hitail. next = e; // Set the current node to the successor of the previous node in the new location hiTail = e; } } while ((e = next) ! = null); // If (loTail! Lotail. next = null; NewTab [j] = loHead; newTab[j] = loHead; } // If (hiTail! = null) { hiTail.next = null; NewTab [j + oldCap] = hiHead; } } } } } return newTab; }Copy the code

Red and black split

/** * This method is ** @param map current hashmap * @param new hash table * @param index original position * @param bit original hash table capacity */ final void Split (HashMap<K,V> map, Node<K,V>[] TAB, int index, int bit) {// Record the first Node (or root Node) of the current bucket TreeNode<K,V> b = this; TreeNode<K,V> loHead = null, TreeNode<K,V> loHead = null, TreeNode<K,V> loHead = null loTail = null; TreeNode<K,V> hiHead = null, hiTail = null; TreeNode<K,V> hiHead = null; int lc = 0, hc = 0; For (TreeNode<K,V> e = b, next; e ! = null; e = next) { next = (TreeNode<K,V>)e.next; // Get the next node e.ext = null; If ((e.hash & bit) == 0) {if (e.hash & bit) == 0) {// If (e.hash & bit) == 0) {// If (e.hash & bit) == 0) {// If (e.hash & bit) == 0) {// If (e.hash & bit) == 0) {// If (e.hash & bit) == 0) ((e.prev = loTail) == null) loHead = e; Next = e; lotail.next = e; lotail.next = e; LoTail = e; // At the same time, record the number of nodes in the original position for determining whether to convert red to black later. ++lc; If ((e.rev = hiTail) == null) hiHead = e; if (e.rev = hiTail) hiHead = e; else hiTail.next = e; hiTail = e; ++hc; }} // If (loHead! If (lc <= UNTREEIFY_THRESHOLD) // If (lc <= UNTREEIFY_THRESHOLD) // If (lc <= UNTREEIFY_THRESHOLD) And set the head node to the current position of the node TAB [index] = lohead.untreeify (map); TAB [index] = loHead; else {// Set the current position to loHead; if (hiHead ! = null) // (else is already treeified) loHead.treeify(tab); }} // Same as above. if (hiHead ! = null) { if (hc <= UNTREEIFY_THRESHOLD) tab[index + bit] = hiHead.untreeify(map); else { tab[index + bit] = hiHead; if (loHead ! = null) hiHead.treeify(tab); }}}Copy the code

Method analysis of converting linked list to red black tree

Final void treeifyBin(Node<K,V>[] TAB, int hash) {int n, index; Node<K,V> e; // Determine whether the length of the hash table is required to be smaller than the minimum tree threshold. Expansion method already about the if (TAB = = null | | (n = TAB. Length) < MIN_TREEIFY_CAPACITY) the resize (); Else if ((e = TAB [index = (n-1) & hash])! = null) { TreeNode<K,V> hd = null, tl = null; <K,V> p = replacementTreeNode(e, null); <K,V> p = replacementTreeNode(e, null); If (tl == null) hd = p; Else {// otherwise set a node to be the successor of the node. Tl. next = p; // Set the previous node to its successor tl.next = p; } tl = p; } while ((e = e.next) ! = null); If ((TAB [index] = hd)! If (TAB [index] = hd)! = null) hd.treeify(tab); Final void treeify(Node<K,V>[] TAB) {TreeNode<K,V> root = null; For (TreeNode<K,V> x = this, next; x ! = null; x = next) { next = (TreeNode<K,V>)x.next; x.left = x.right = null; If (root == null) {// If the root node is empty, the current node is the root node, because the red-black root node must be black, and the parent node is null x.parent = null; x.red = false; root = x; } else {// otherwise add child node // record current node key, and hash value K K = x.key; int h = x.hash; Class<? > kc = null; For (TreeNode<K,V> p = root;) { int dir, ph; K pk = p.key; // Since the red-black tree is also a binary sort tree, insert by hash value, there are several cases // Case 1: The hash value of the inserted node is less than the current node hash value. If ((ph = p.dash) > h) dir = -1; If (ph < h) dir = 1; if (ph < h) dir = 1; // If the hash values are equal, call compareable or compare to see if the key implements the interface comparison. If not, call tieBreakOrder. If so, the hashCode comparison is generated on the calling object. else if ((kc == null && (kc = comparableClassFor(k)) == null) || (dir = compareComparables(kc, k, pk)) == 0) dir = tieBreakOrder(k, pk); TreeNode<K,V> xp = p; TreeNode<K,V> p; If (p = (dir <= 0)? If (p = (dir <= 0)? P.left: p.right) == null) {// insert the parent of the node to the current node. If (dir <= 0)// add x to the left subtree xp.left = x; Right = x; else// Right = x; Root = balanceInsertion(root, x); break; } } } } moveRootToFront(tab, root); } static int tieBreakOrder(Object a, Object b) { int d; If (a = = null | | b = = null | | / / is the name of the class, GetName ().compareto (b.getClass().getName()))) == 0) d = (system.identityHashCode (a) <= System.identityHashCode(b) ? 1:1); return d; }Copy the code

Red and black convert to a linked list structure

Converting a red-black tree to a linked list structure is very simple, because the data structure definition of a red-black tree itself is also a double-linked list, and you just need to traverse the conversion to a single-linked list structure and join it together
```
final Node<K,V> untreeify(HashMap<K,V> map) { Node<K,V> hd = null, tl = null; For (Node<K,V> q = this; q ! = null; Node<K,V> p = map.replacementNode(q, null); If (tl == null) hd = p; Next = p; else tl.next = p; tl = p; } // return hd; }Copy the code
```

Next, examine the PUT method for adding elements

The general process for adding elements:

1. Copy a hash table to TAB

Then divide into several situations:

If there is no initialization, call resize(expansion is the same method) to initialize, and then directly create a new node to be added to the position of the computed hash table
It’s already initialized. There are several of them
1. If the node already exists and is the head node (the first point), the temporary node is directly used to point to it, and then the subsequent related conditions are carried out to determine whether its value needs to be modified
2. Non-head node, and the node has been converted to a tree structure (red-black tree), the red-black tree insertion method is called and the temporary node points to the newly added node
3. Non-head node and linked list structure, traverse the linked list, find the existing node if it already exists and point to it with a temporary node; otherwise, find the end, create a new node, insert and add by tail insertion method, and judge whether the number of linked list exceeds the threshold after adding. If the threshold is exceeded and the structure of the hash table exceeds 64(this is just a reference to the method used to transform the tree structure), the linked list tree (converted to a red-black tree) is tree-shaped and the temporary node points to that node after the transformation
After the above operation, you can determine whether to change the value of the existing node

Public v put(k key, v value) {return putVal(hash(key), key, value, false, true); } /** * Implements Map. Put and related methods. ** @param hash key * @param key * @param value * @param onlyIfAbsen * @param evict If true create a new node add * @return */ final V putVal(int hash, K key, V value, boolean onlyIfAbsent,boolean evict) { Node<K,V>[] tab; // copy a hash table to TAB Node<K,V> p; // Temporary storage node int n,// hash table length I; / / is recorded by hash code calculation node location / / TAB to follow to the hash table and decide whether the hash table is empty if ((TAB = table) = = null | | (n = TAB. Length) = = 0) / / N = (TAB = resize()).length; // Use the hash value and the hash table capacity -1 to calculate the specific hash position in the hash table. If (p = TAB [I = (n-1) & hash]) == null) TAB [I] = newNode(hash, key, value, null); if (p = TAB [I] = (hash, key, value, null); Else {// Add Node<K,V> e if root Node (head Node) is not empty; K k; / / case 1: whether the first node with the same hash value add node and the key, if the same is directly pointing to the first node if e (p.h ash = = hash && ((k = p.k ey) = = key | | (key! = null && key.equals(k)))) e = p; // Case 2: Determine whether the node is a tree node (red-black tree), Else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).puttreeval (this, TAB, hash, key, value); For (int binCount = 0; ; ++binCount) {// If it is a tail node, If ((e = p.ext) == null) {//(end method) add newNode p.ext = newNode(hash, key, value, null); // Check whether the current number of nodes reaches the threshold. If (binCount >= treeify_threshold-1) // -1 for 1st treeifyBin(TAB, hash); break; } / / determine whether existing list if there is an end to the if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) break; p = e; } // if (e! = null) { // existing mapping for key V oldValue = e.value; // if (! onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; ++modCount; If (++size > threshold) resize(); afterNodeInsertion(evict); // LinkedHashMap return null; }Copy the code

Methods to get data in a HashMap

public V get(Object key) { Node<K,V> e; return (e = getNode(hash(key), key)) == null ? null : e.value; } final Node<K,V> getNode(int hash, Object key) { Node<K,V>[] tab; Node<K,V> first, e; int n; K k; If ((TAB = table)! If (TAB = table)! If (TAB = table)! = null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) ! = null) {/ / case 1: the first node is returned directly if (first. The hash = = hash && / / always check first node ((k = first. Key) = = key | | (key! = null && key.equals(k)))) return first; If ((e = first.next)! If (first instanceof TreeNode) return ((TreeNode<K,V>)first).getTreenode (hash, key);  Do {// otherwise iterate over the list. if (e.hash == hash && ((k = e.key) == key || (key ! = null && key.equals(k)))) return e; } while ((e = e.next) ! = null); } } return null; }Copy the code

HashMap removal method analysis

public V remove(Object key) { Node<K,V> e; return (e = removeNode(hash(key), key, null, false, true)) == null ? null : e.value; } /** * */ final Node<K,V> removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable) { Node<K,V>[] tab; Node<K,V> p; int n, index; If ((TAB = table)! If (TAB = table)! If (TAB = table)! = null && (n = tab.length) > 0 && (p = tab[index = (n - 1) & hash]) ! = null) { Node<K,V> node = null, e; K k; V v; If the assignment between the first node is a node if (p.h ash = = hash && ((k = p.k ey) = = key | | (key! = null && key.equals(k)))) node = p; Else if ((e = p.ext)! = null) {//// if (p instanceof TreeNode) node = ((TreeNode<K,V>)p).getTreenode (hash, key); The else {/ / it is list will traverse the list do {/ / if the current node is the exit if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) { node = e; break; } p = e; } while ((e = e.next) ! = null); } } if (node ! = null && (! matchValue || (v = node.value) == value || (value ! = null && value.equals(v)))) {// If (node instanceof TreeNode) ((TreeNode<K, v >)node).removeTreenode (  s, tab, movable); TAB [index] = node.next; else if (node == p)/ if (node == p); TAB [index] = node.next; P.ext = node.next; ++modCount; --size; afterNodeRemoval(node); return node; } } return null; } // Remove the tree node method. final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab, boolean movable) { int n; / / whether the hash table is null or countless according to the if (TAB = = null | | (n = TAB. Length) = = 0) return; // Otherwise calculate at index position int index = (n-1) & hash; TreeNode<K,V> first = (TreeNode<K,V>) TAB [index], root = first, rl; TreeNode<K,V> TAB [index], root = first, rl; TreeNode<K,V> succ = (TreeNode<K,V>)next, pred = prev; If (pred == null) TAB [index] = first = succ; if (pred == null) TAB [index] = first = succ; Next = succ; else make the successor of the removed node point to the successor of the removed node. // If (succ! Prev succ.prev = pred; If (first == null)// Return if first is null; if (root.parent ! = null)// If the parent node is not empty, point root to the root node root = root.root(); / / should determine whether the tree is converted to a list if (root = = null | | (movable && (root) right = = null | | (rl = root. Left) = = null | | rl. Left = = null))) {  tab[index] = first.untreeify(map); // too small return; TreeNode<K,V> p = this, pl = left, pr = right, replacement; // TreeNode<K,V> p = this, pl = left, pr = right, replacement; if (pl ! = null && pr ! = null) { TreeNode<K,V> s = pr, sl; while ((sl = s.left) ! = null) // find successor s = sl; boolean c = s.red; s.red = p.red; p.red = c; // swap colors TreeNode<K,V> sr = s.right; TreeNode<K,V> pp = p.parent; if (s == pr) { // p was s's direct parent p.parent = s; s.right = p; } else { TreeNode<K,V> sp = s.parent; if ((p.parent = sp) ! = null) { if (s == sp.left) sp.left = p; else sp.right = p; } if ((s.right = pr) ! = null) pr.parent = s; } p.left = null; if ((p.right = sr) ! = null) sr.parent = p; if ((s.left = pl) ! = null) pl.parent = s; if ((s.parent = pp) == null) root = s; else if (p == pp.left) pp.left = s; else pp.right = s; if (sr ! = null) replacement = sr; else replacement = p; } else if (pl ! = null) replacement = pl; else if (pr ! = null) replacement = pr; else replacement = p; if (replacement ! = p) { TreeNode<K,V> pp = replacement.parent = p.parent; if (pp == null) root = replacement; else if (p == pp.left) pp.left = replacement; else pp.right = replacement; p.left = p.right = p.parent = null; } TreeNode<K,V> r = p.red ? root : balanceDeletion(root, replacement); if (replacement == p) { // detach TreeNode<K,V> pp = p.parent; p.parent = null; if (pp ! = null) { if (p == pp.left) pp.left = null; else if (p == pp.right) pp.right = null; } } if (movable) moveRootToFront(tab, r); }Copy the code

Conclusion:

The underlying data structure of HashMap is array + linked list + red-black tree (red-black tree with double-linked list structure). When the number of nodes in the bucket of array (hash table) is less than 8, single-linked list structure will be used to store data. When the node tree of a linked list is larger than 8, judgment will be made once. Check whether the current number of hash tables exceeds the minimum number of nodes in tree (the default value is 64). If the number of hash tables reaches the threshold value of 8 and the number of hash tables also exceeds 64, the list is converted into a tree chain of tree structure, and then converted into a red-black tree.
If the number of linked lists reaches the threshold of 8 but the number of nodes in the hash table is less than 64, expansion is used to store the list. Besides the previous expansion will be triggered, another expansion method of hashmap is that the number of nodes in the current hash table exceeds the expansion threshold. The calculation formula of expansion threshold is as follows: Load factor * The capacity of the hash table. The default load factor is 0.75. The default initial capacity is 16.
In addition, the red black tree is converted to a single linked list when the number of nodes is less than the threshold 6.
HashMap is thread-safe and efficient, while HasTable is thread-safe and inefficient. If it is used in the first place, the efficiency is high. If it is used in the first place, the efficiency is low. The CurrentHashMap concurrent container class in the JUC package is also thread-safe.

Related Posts

A study of STD :: IO ::Error in Rust

Content type ‘application/x-www-form-urlencoded; charset=UTF-8’ not supported

Spring Boot 13 Spring Boot integrates RabbitMQ