1. HashMap source code analysis
In JDK 1.8, the bottom layer of a HashMap is an array + a linked list + a red-black tree. Let’s start with a brief overview of what hashCode() does in a HashMap: A HashMap uses an array of nodes (which may represent a linked list or a red-black tree) to store key-value pairs through the perturbation function of the key’s hashCode(). Then use (n-1) &hash (n is the length of the array, n is a power of 2, so hash & n = hash & (n-1)) to determine where the element should be added to the array (index), If no node is found (null) then the node is created using the key value and added to the corresponding index position of the array, otherwise the node is overridden using the key’s equals method if equals returns true, otherwise it is added to the corresponding position of the linked list or red-black tree
JDK 1.8 perturbation function hash method
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
JDK 1.7 perturbation function hash method
final int hash(Object k) {
int h = hashSeed;
if (0! = h && kinstanceof String) {
return sun.misc.Hashing.stringHash32((String) k);
}
h ^= k.hashCode();
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
Copy the code
JDK 1.7’s perturbation function was disturbed four times compared to JDK 1.8’s one time, so it’s clear that JDK 1.8’s perturbation function is more efficient
The biggest change in HashMap from JDK 1.7 to 1.8 is how hash conflicts are resolved when the index position of a key already exists; 1.7 use linked lists to solve the problem; In 1.8, linked lists and red-black trees are used to solve the problem.
1. Class attributes
// The default capacity is 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// Maximum capacity 2^30
static final int MAXIMUM_CAPACITY = 1 << 30;
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75 f;
// An empty instance of an hash table
static finalEntry<? ,? >[] EMPTY_TABLE = {};// The hash table can be expanded as necessary and must be a power of 2
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
// The number of key-value mappings contained in the map
transient int size;
// Threshold. When the threshold is reached, capacity needs to be expanded. Threshold = capacity * loadfactor
int threshold;
// Load factor
final float loadFactor;
// Number of map changes
transient int modCount;
// A default threshold for a key-value pair whose key is String,
// Alternative hashing is used when map capacity reaches this threshold
// Alternate hashes reduce the (easier) incidence of hash collisions for String keys.
/ / JDK. This value can be defined system attributes map. Althashing. The threshold to specify.
// If the value is 1, it forces the standby hash to always be used; If the value is -1, the value is disabled.
static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;
Copy the code
LoadFactor controls the sparsity of data stored in the array. The closer the loadFactor is to 1, the more entries stored in the array and the denser the data. Conversely, the more sparse the data. Too much loadFactor will lead to low query efficiency. Too small a loadFactor can lead to data fragmentation and waste a lot of space. So loadFactor has a default value of 0.75F, which is officially a good value. The initial capacity is 16, and the critical value threshold is 16 x 0.75 = 12. That is, when the number of elements reaches 12, capacity expansion needs to be performed. The process of capacity expansion involves rehash and data replication, which consumes resources. The correct thing to do is to give the specified capacity when constructing a HashMap object if you can determine roughly how much storage you need!
2, node class source code analysis
Node Node source code:
static class Node<K.V> implements Map.Entry<K.V> {
// cache the hash values so that you don't need to call the hashCode method to evaluate the hashCode again
final int hash;
// key
final K key;
// value
V value;
// Subsequent nodes
Node<K,V> next;
// There are parameters
Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}
Getters for key and value
public final K getKey(a) { return key; }
public final V getValue(a) { return value; }
// toString, the output Node object will print key=value
public final String toString(a) { return key + "=" + value; }
// Implementation of hashCode for Node objects: hashCode for key and hashCode for value do either or
public final int hashCode(a) {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}
// Change the value method, return the value before the change
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
// equals method, you can see the field used in equals method, which must also be used when overriding hashCode
public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceofMap.Entry) { Map.Entry<? ,? > e = (Map.Entry<? ,? >)o;if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
return false; }}Copy the code
Node nodes can be used as nodes of a linked list
Tree node source code is as follows:
static final class TreeNode<K.V> extends LinkedHashMap.Entry<K.V> {
TreeNode<K,V> parent; / / the parent node
TreeNode<K,V> left; // Left child node
TreeNode<K,V> right; // Right child node
TreeNode<K,V> prev; // The precursor node
boolean red; // If it is red, this is a node of a red-black tree
TreeNode(int hash, K key, V val, Node<K,V> next) {
super(hash, key, val, next);
}
// Returns the root node containing the node
final TreeNode<K,V> root(a) {
for (TreeNode<K,V> r = this, p;;) {
if ((p = r.parent) == null)
returnr; r = p; }}// Other methods are omitted
}
Copy the code
3. Construction method
// Specify the initial capacity and load factor
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
// The initial threshold is the minimum power of 2 greater than or equal to initialCapacity
this.threshold = tableSizeFor(initialCapacity);
}
// Specify the initial capacity, using the default load factor
public HashMap(int initialCapacity) {
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
// Default constructor, load factor to the power of 2
public HashMap(a) {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
// Construct a HashMap using another Map and call the putMapEntries method to do so
public HashMap(Map<? extends K, ? extends V> m) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
putMapEntries(m, false);
}
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
// The size of the collection
int s = m.size();
if (s > 0) {
// Check whether the table has been initialized
if (table == null) { // pre-size
// If not initialized, s is the number of elements in m
float ft = ((float)s / loadFactor) + 1.0 F;
int t = ((ft < (float)MAXIMUM_CAPACITY) ?
(int)ft : MAXIMUM_CAPACITY);
// If the calculated t (actual capacity required) is greater than threshold, initialize threshold
if (t > threshold)
threshold = tableSizeFor(t);
}
// If the table is initialized and the number of elements in M exceeds the threshold, expand the table
else if (s > threshold)
resize();
// Add all elements in m to the HashMap
for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
K key = e.getKey();
V value = e.getValue();
putVal(hash(key), key, value, false, evict); }}}Copy the code
As you can see, the constructors only initialize the values of loadFactor and threshold (the default threshold constructor is also not initialized), and do not initialize the table array
4, put method
PutVal (); putVal (); putVal (); putVal ();
public V put(K key, V value) {
// The putVal method is called internally to add or update the specified key-value pair to the HashMap
return putVal(hash(key), key, value, false.true);
}
// Perturbation function to avoid reducing hash conflicts
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
/** * Hash: key Hash value computed by the perturbation function * key: key * value: value * onlyIfAbsent: If true, the existing value will not be changed * EVict: If false, the table is in creation mode */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
// If the table is null or the length of the table is 0, the capacity expansion is performed
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
// Use n-1&hash to find the index that should appear in the hash table
// If not, insert the new node into the index position
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else { // If there is one, it needs to be determined further
Node<K,V> e; K k;
// If the hash value of the index node is the hash of the key and the key of the index node is the same object
// If the index key is not empty and the keyequals of the key and index return true, e points to P
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
e = p;
// Red black tree processing
else if (p instanceof TreeNode)
// Add key-value to the red-black tree corresponding to the index node and return the updated or newly added tree node object to e
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
// List processing
else {
for (int binCount = 0; ; ++binCount) {
// Insert elements
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
// If the list has reached the tree threshold, the list is converted to a red-black tree
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
// Todo: analysis of tree methods
treeifyBin(tab, hash);
break;
}
// e is now the next node of the node being traversed (not empty)
// Return true if the list has the same hash value, or if the key is the same object or equals of the key
// jumps out of the loop
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break;
// p moves to the next nodep = e; }}// If e is not null, there is a mapping for the key, and its value is updated
if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
// Updating the map returns the previous value
returnoldValue; }}// The hash table structure is changed
++modCount;
// If the value exceeds the threshold after adding elements, expand the capacity
if (++size > threshold)
resize();
afterNodeInsertion(evict);
// Returning null indicates that the node is being inserted, so there is no previous value
return null;
}
Copy the code
Now use a language to describe the execution flow of the following putVal methods:
- Check whether the table is null or has a length of 0
- If yes, perform resize()
- Otherwise continue
- Find the corresponding index through the hash calculated by the perturbation function of the key’s hashCode(), and determine whether the node at the index position exists
- If no, insert the node directly
- If yes, check whether it is a tree node
- Tree node, key key-value pair added to the corresponding red-black tree, insert returns NULL, update returns the updated tree node object to e
- If there is a same key (hash, same object, equals())) in the process, the value is updated. If no key is found, a node is inserted at the end of the list. After insertion, the critical value of tree is determined, and the list is converted to a red-black tree
- If the length of the table array is smaller than the default minimum tree length of 64 when converting a linked list to a red-black tree, perform expansion instead of converting the linked list to a red-black tree
- If a key exists in the linked list or red-black tree, then the node is assigned to e. At this point, it is determined whether E is null. If it is not null, e is the node object to be updated, and the value of this node is updated and the old value is returned
- If e is null, it indicates that the node is being inserted. At this point, the node is being inserted. Check whether the value exceeds the threshold
The resize() method for capacity expansion is described later
Resize method source code: **
5. Get method
Get method source:
public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
// Hash the key to the index position. If the node at the index position is not empty, assign it to first
if((tab = table) ! =null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) ! =null) {
// Return the first node if first's hash is equal to hash or first's key and key's equals return true
if (first.hash == hash && // always check first node((k = first.key) == key || (key ! =null && key.equals(k))))
return first;
// When first is not a separate node
if((e = first.next) ! =null) {
// If first is a tree node instance, use the red-black tree lookup method to find the corresponding node
if (first instanceof TreeNode)
// todo: analysis of red-black tree lookup methods
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
// Otherwise, search through the list
do {
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
return e;
} while((e = e.next) ! =null); }}// If first is null, null is returned indicating that no search was found
return null;
}
Copy the code
Get method idea analysis:
- First according to pass key
hashCode()
The hash, which is the result of the perturbation function, is the index to which the hash should be based on (n-1) &hash (n is the length of the array) - Assign the value of the index position in the array to first to determine whether first is empty
- If yes, null is returned
- If no, check whether first is the node to be searched
- If yes, the first node is returned
- If no, continue to check whether first has the next node
- If no, null is returned
- If yes, check whether first is a tree node
- If yes, the node is searched in the red-black tree and returned. If no node is found, null is returned
- If no, the node is iterated through the list and returned. If no node is found, null is returned
6. Resize method
final Node<K,V>[] resize() {
// Assign table to oldTab
Node<K,V>[] oldTab = table;
// oldCap specifies the size of the original table (length, 0 if table is null).
int oldCap = (oldTab == null)?0 : oldTab.length;
// Assign threshold to oldThr
int oldThr = threshold;
// newCap and newThr represent new capacity and new threshold
int newCap, newThr = 0;
// If oldCap is greater than 0
if (oldCap > 0) {
// If oldCap is greater than or equal to the maximum capacity, 2^30, then threshold is assigned to the maximum value of int and oldTab is returned
// That is, do not perform capacity expansion at this time, because the capacity cannot be expanded
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
// newCap is assigned twice the value of oldCap if newCap is smaller than the maximum capacity and oldCap is larger than the default initial capacity
// newThr is set to double oldThr
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
newThr = oldThr << 1; // double threshold
}
// If oldCap is 0 but oldThr is greater than 0, oldThr is assigned to newCap
// This indicates that the underlying array has not been initialized
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
// If oldCap is 0 and oldThr is 0, then newCap is assigned the default capacity and newThr is assigned the default capacity * default load factor
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
// If newThr is 0, oldThr is also 0, and newCap is used to calculate newThr(newCap * loadFactor).
if (newThr == 0) {
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
// Assign newThr to threshold, where newCap and newThr are identified
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
// Use newCap to create the newTab array of the corresponding size
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
// Assign newTable to table
table = newTab;
// If there is data in oldTab, migrate it to newTab
if(oldTab ! =null) {
// Iterate over the nodes in the number group
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
// e is the node being traversed, if e is not null
if((e = oldTab[j]) ! =null) {
// Empty the oldTab location for garbage collection
oldTab[j] = null;
// If next of e is null, indicating that e is a separate node, insert the node directly at the corresponding index position in newTab
if (e.next == null)
newTab[e.hash & (newCap - 1)] = e;
// The migration of red-black trees is complicated and not explained here
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else {
// The two nodes are defined because the elements in the list are being migrated
// Do you want it at the same subscript, or at the same subscript + the length of the array
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
// Iterate over the nodes in the original list
do {
next = e.next;
// If the high level is 0, the subscript is unchanged
if ((e.hash & oldCap) == 0) {
if (loTail == null)
// The first node
loHead = e;
else
// Insert the tail node
loTail.next = e;
// loTail redirects to the tail node
loTail = e;
}
// If the high level is 1, the subscript becomes the original subscript + the original array length
else {
if (hiTail == null)
// The first node
hiHead = e;
else
// Insert the tail node
hiTail.next = e;
// hiTail redirects to the tail nodehiTail = e; }}while((e = next) ! =null);
// Insert the low level list into the original index
if(loTail ! =null) {
loTail.next = null;
newTab[j] = loHead;
}
// Insert the high-order list at the original index + the original array length
if(hiTail ! =null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
// Returns the expanded array
return newTab;
}
Copy the code
Split () method in TreeNode TreeNode class
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
// b refers to the current object (treenode-class object), since HashMap is only used when the resize method iterates over nodes in a number group
// So this must be the root of a red-black tree
TreeNode<K,V> b = this;
// As above, the high and low levels are linked separately, with the head node representing the first node and the tail node used to insert node elements
TreeNode<K,V> loHead = null, loTail = null;
TreeNode<K,V> hiHead = null, hiTail = null;
// lc represents the number of nodes in the low list, hc represents the number of nodes in the high list
int lc = 0, hc = 0;
// The initial value of the e tree node is b, and next is also a tree node
for(TreeNode<K,V> e = b, next; e ! =null; e = next) {
next = (TreeNode<K,V>)e.next;
e.next = null;
// If the high level is 0, insert it into the low level list
if ((e.hash & bit) == 0) {
if ((e.prev = loTail) == null)
// The first insert is the header node
loHead = e;
else
// Insert the tail node
loTail.next = e;
loTail = e;
++lc;
}
// If the high level is 1, insert it into the high level list
else {
if ((e.prev = hiTail) == null)
// The first insert is the header node
hiHead = e;
else
// Insert the tail nodehiTail.next = e; hiTail = e; ++hc; }}if(loHead ! =null) {
if (lc <= UNTREEIFY_THRESHOLD)
// If the number of elements in the list is below the de-tree threshold (6), it is converted to a list of Node objects
tab[index] = loHead.untreeify(map);
else {
// Otherwise tree it
tab[index] = loHead;
if(hiHead ! =null) // (else is already treeified)loHead.treeify(tab); }}// High-order lists are treated the same way
if(hiHead ! =null) {
if (hc <= UNTREEIFY_THRESHOLD)
tab[index + bit] = hiHead.untreeify(map);
else {
tab[index + bit] = hiHead;
if(loHead ! =null) hiHead.treeify(tab); }}}Copy the code
Now let’s summarize the following resize() method execution flow: First determine the new capacity and threshold
- Firstly, oldCap and oldThr of the original table are determined. According to these two values, newCap and newThr of the new capacity are determined
- If oldCap is greater than 0, it indicates that the table has been initialized. In this case, check whether oldCap is greater than or equal to the maximum length of the array
- If yes, set threshold to integer.max_value and return the original array. It indicates that the capacity cannot be expanded.
- Otherwise newCap becomes twice as large as oldCap, or if newCap is less than the maximum array length, newThr becomes twice as large as oldThr (don’t worry about oldCap beingIf the threshold is exceeded, the procedure in the previous step will continue to set threshold to integer.max_value
- If oldCap is 0 (it cannot be less than 0), the table is not initialized. Check whether oldThr is greater than 0
- If yes, the argument construct is called, specifying the initial capacity, and the value of oldThr is the size of the array to be created, which is assigned to newCap
- If no, the default construct is invoked. In this case, newCap is the default capacity (16) and newThr is the default capacity * default load factor =12
- If newThr is still 0, the value (newCap * loadFactor) is calculated, of course, if newCap is greater than or equal to the maximum length of the array, newThr is integer.max_value
- If oldCap is greater than 0, it indicates that the table has been initialized. In this case, check whether oldCap is greater than or equal to the maximum length of the array
Create an array and migrate elements:
- Create array newTab with size newCap, table pointing to newTab
- The nodes in oldTab are traversed, and the following is the traversal process
- If the node is a single node, hash & (newcap-1) is inserted into the corresponding index position
- Red-black trees and linked lists have some of the same ideas, and here’s why it’s twice as big every time, right
- Use the top and bottom nodes of the high and low lists to represent the two lists and check if hash & newCap is 0
- If yes, insert it into the low linked list (tailplug)
- Otherwise insert into high linked list (tail insert)
- In the case of a linked list, the lower linked-head node is inserted directly into the original index position and the higher linked-head node into the original index +oldCap position
- If it is a red-black tree, special treatment is required. If the list length is less than or equal to the de-tree critical value (6), it will be converted to Node Node insertion (red-black tree nodes will be strong first, so it will be rotated back here), otherwise, the list will be converted to red-black tree and inserted
- Use the top and bottom nodes of the high and low lists to represent the two lists and check if hash & newCap is 0
7, remove method
public V remove(Object key) {
Node<K,V> e;
return (e = removeNode(hash(key), key, null.false.true)) = =null ?
null : e.value;
}
final Node<K,V> removeNode(int hash, Object key, Object value,
boolean matchValue, boolean movable) {
Node<K,V>[] tab; Node<K,V> p; int n, index;
// The node (p) corresponding to the index position is not empty
if((tab = table) ! =null && (n = tab.length) > 0 &&
(p = tab[index = (n - 1) & hash]) ! =null) {
Node<K,V> node = null, e; K k; V v;
// If the key of node P is the same as key, node points to p
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
node = p;
// e is the next node of p and e is not empty
else if((e = p.next) ! =null) {
// If p is a tree node, look for the node in the red-black tree
if (p instanceof TreeNode)
node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
// Otherwise, traverse the list to find the node
else {
do {
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k)))) {
node = e;
break;
}
p = e;
} while((e = e.next) ! =null); }}// Node is not empty and removes elements from the red-black tree or linked list
if(node ! =null&& (! matchValue || (v = node.value) == value || (value ! =null && value.equals(v)))) {
if (node instanceof TreeNode)
((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
else if (node == p)
tab[index] = node.next;
else
p.next = node.next;
++modCount;
--size;
afterNodeRemoval(node);
returnnode; }}// Return null to indicate that the deleted key was not found
return null;
}
Copy the code
There’s nothing to say about the remove method, but notice that when you remove a node from a red-black tree you might be able to turn a red-black tree into a linked list, rightDraw an extreme case:You can see from this graph why it’s a linked list if it’s less than or equal to 6