This article is based on JDK1.8. It will take about 25 minutes to read.
Abstract
HashMap is probably one of the most frequently asked interview topics, one of the questions you are almost guaranteed to get, and the most commonly used data type in Java for handling key-value pairs. Let's take a look at the JDK8 HashMap.
Main members
Key variables:
```java
/** The default initial capacity - MUST be a power of two. */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 * If this value is reached, the threshold is set to Integer.MAX_VALUE
 * and the array is no longer expanded.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The load factor used when none is specified in the constructor.
 * Hash collisions statistically follow a Poisson distribution; with a
 * load factor of 0.75, bins of 8 or more nodes are extremely rare.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * The bin count threshold for using a tree rather than a list for a
 * bin. Bins are converted to trees when adding an element to a bin
 * with at least this many nodes. The value must be greater than 2 and
 * should be at least 8 to mesh with assumptions in tree removal about
 * conversion back to plain bins upon shrinkage.
 * A list longer than this value is treeified.
 * Note: it is the individual bin (list), not the whole map, that is treeified.
 */
static final int TREEIFY_THRESHOLD = 8;

/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 * A bin smaller than this value is converted back to a list.
 */
static final int UNTREEIFY_THRESHOLD = 6;
```
The four constructors:
Constructor 1

```java
// Specify the initial capacity and the load factor
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY) // capacity is capped at 1<<30
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // tableSizeFor returns the smallest power of 2 >= initialCapacity
    this.threshold = tableSizeFor(initialCapacity);
}
```

Constructor 2

```java
// Delegates to constructor 1 above with the default load factor of 0.75
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
```

Constructor 3

```java
// No-arg constructor: everything uses default values
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
```

Constructor 4

```java
// Unlike the other three, this one actually initializes the table,
// because putMapEntries -> putVal -> resize() is invoked
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // 0.75f
    putMapEntries(m, false);
}
```
A quick summary of the four constructors:
(1) The first three constructors do not allocate the table; allocation is deferred until it is first needed (the first put).
(2) Constructor 4 puts the entries of the given map, so the table is allocated immediately.
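Beyond the lazy allocation, the capacity rounding in constructor 1 is worth a closer look. Below is a standalone sketch of the bit-twiddling that `tableSizeFor` performs in JDK8 (the method name is from the source; this self-contained copy is mine for illustration), returning the smallest power of two greater than or equal to the requested capacity:

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= cap, as computed by HashMap.tableSizeFor in JDK8.
    // The shifts smear the highest set bit of (cap - 1) into every lower bit,
    // producing a value of all ones; adding 1 then rounds up to a power of two.
    static int tableSizeFor(int cap) {
        int n = cap - 1; // -1 so that an exact power of two maps to itself
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16 (exact powers are kept)
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

So `new HashMap<>(10)` sets the threshold field to 16, which resize() later interprets as the initial capacity.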
hashmap->hash()
```java
/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower. Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.) So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    // XORing with (h >>> 16) mixes the high bits into the low bits cheaply.
    // This matters because put and get compute the bucket index as
    // (n - 1) & hash, and n is usually small, so without the spread only
    // the low bits of hashCode() would ever be used.
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    // A HashMap key may be null; a null key hashes to 0, so it always
    // lands in bucket 0 of the array.
}
```
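To make the spreading concrete, here is a small sketch (the helper names are mine) that reproduces the hash and the index computation, and shows two hash codes that differ only in high bits colliding without the spread but not with it:

```java
public class HashSpreadDemo {
    // Same spreading as HashMap.hash in JDK8
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of size n (n must be a power of two)
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int a = 0x10000; // differs from b only in bits above the mask (n - 1)
        int b = 0x20000;
        // Without spreading, both land in bucket 0 of a 16-slot table
        System.out.println(index(a, n)); // 0
        System.out.println(index(b, n)); // 0
        // With spreading, the high bits now influence the index
        System.out.println(index(a ^ (a >>> 16), n)); // 1
        System.out.println(index(b ^ (b >>> 16), n)); // 2
        // A null key always maps to bucket 0
        System.out.println(index(hash(null), n)); // 0
    }
}
```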
hashmap->put()
Here is the annotated source code:
```java
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with key, or null if there was
 *         no mapping for key. (A null return can also indicate that the
 *         map previously associated null with key.)
 */
public V put(K key, V value) {
    // Spread the key's hashCode to get the hash value
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change an existing value
 * @param evict if false, the table is in creation mode
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    // A HashMap is array + linked list + red-black tree: an array of
    // Node<K,V>, where each slot heads a list (or tree) and is commonly
    // called a bucket.
    // tab: the table  p: first node of the bucket
    // n: table length  i: array index (bucket position)
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // 1. If the table is null or empty, initialize it with resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // 2. Locate the bucket via (n - 1) & hash; if it is empty there is
    //    no collision, so just create a new node in it
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // 3. Otherwise there is a hash collision
    else {
        Node<K,V> e; K k; // temporary variables
        // 4. If the first node's key equals the incoming key, remember
        //    it in e; its value will be overwritten in step 10
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // 5. If the bucket is a red-black tree, insert via putTreeVal(),
        //    which assigns any existing node to e
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // 6. Otherwise the bucket is a linked list: traverse it
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // 7. Reached the tail: append a new node
                    p.next = newNode(hash, key, value, null);
                    // 8. If the list has reached the threshold, treeify it.
                    //    binCount starts at 0, so binCount == 7 means the
                    //    list held 8 nodes before this insertion
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // 9. As in step 4: if the key already exists in the list,
                //    break out and overwrite in step 10
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // 10. A node with the same key exists: overwrite its value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            // onlyIfAbsent == true: keep the old value; false: replace it
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // Callback used by LinkedHashMap
            afterNodeAccess(e);
            return oldValue;
        }
    }
    // modCount counts structural modifications and enables fail-fast
    // behavior: HashMap is not thread-safe, so if another thread
    // structurally modifies the map (put, remove, ...) while it is being
    // iterated, a ConcurrentModificationException is thrown
    ++modCount;
    // 11. Grow the table when size exceeds the threshold
    if (++size > threshold)
        resize();
    // Callback used by LinkedHashMap
    afterNodeInsertion(evict);
    return null;
}
```
So let's summarize the put method:
1. Check whether the current table is null or has length 0; if so, call resize() to initialize it.
2. Compute the bucket index from the key's hash. If the bucket is empty, insert a new node directly. Otherwise:
   - If the first node's key equals the given key, overwrite its value.
   - Else, if the bucket is a TreeNode, insert the key and value into the red-black tree.
   - Else, traverse the linked list: if the key is found, overwrite its value; otherwise append a new node, and treeify the bucket if the list length reaches the threshold (8).
3. Increment size; if it exceeds the threshold, call resize().
4. End.
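The overwrite behavior in step 2 is easy to observe from the public API alone: a put for an existing key returns the previous value and replaces it, and putIfAbsent goes through putVal with onlyIfAbsent set to true:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        // First put for a key: there is no previous mapping, so put returns null
        System.out.println(map.put("a", 1)); // null
        // Second put for the same key: the old value is returned and replaced
        System.out.println(map.put("a", 2)); // 1
        System.out.println(map.get("a"));    // 2
        // putIfAbsent reaches putVal with onlyIfAbsent == true,
        // so the existing non-null value is kept
        map.putIfAbsent("a", 99);
        System.out.println(map.get("a"));    // 2
    }
}
```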
hashmap->get()
```java
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

/**
 * Implements Map.get and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @return the node, or null if none
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // 1. Check that the table is non-null and non-empty, and locate the
    //    bucket via (n - 1) & hash (it may head a linked list or a
    //    red-black tree)
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // 2. If the first node of the bucket matches the key, return it
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // 3. If there is a next node, decide between tree and list
        if ((e = first.next) != null) {
            // 4. Red-black tree: search the tree
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // 5. Otherwise iterate the linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
```
So let's summarize the get() method:
1. If the table is null, its length is 0, or the bucket for the hash is empty, return null.
2. Otherwise check the first node; if the key matches, return it.
3. If there is a next node: if the bucket is a red-black tree, search the tree; otherwise traverse the linked list.
4. If nothing matches, return null.
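Since hash(null) is 0, looking up a null key is perfectly legal; it is resolved in bucket 0 like any other key. A quick check of that, and of the ambiguity of a null return:

```java
import java.util.HashMap;

public class GetDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put(null, 42); // null keys are allowed and hash to bucket 0
        map.put("x", 1);
        System.out.println(map.get(null));      // 42
        System.out.println(map.get("x"));       // 1
        System.out.println(map.get("missing")); // null: no mapping
        // Note: a null return is ambiguous when null values are allowed;
        // use containsKey to distinguish "absent" from "mapped to null"
        System.out.println(map.containsKey("missing")); // false
    }
}
```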
hashmap->resize()
```java
/**
 * Initializes or doubles table size. If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    // 1. The table already exists: grow it
    if (oldCap > 0) {
        // If the capacity already reached the maximum, raise the
        // threshold to Integer.MAX_VALUE and return without expanding
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Otherwise double both the capacity and the threshold
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    // The initialCapacity constructor was used: the old threshold
    // holds the requested capacity, so use it as the new capacity
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    // The no-arg constructor was used: fall back to the defaults
    else { // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY; // 16
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // 0.75 * 16 = 12
    }
    // 2. If the new threshold is still 0 (initialCapacity path),
    // compute it from the new capacity and the load factor
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    // Allocate the array with the new capacity
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    // 3. The real work of resizing: migrate the data.
    // (If this call was just initialization, oldTab is null and this
    // whole block is skipped; newTab is returned as-is.)
    if (oldTab != null) {
        // Copy the old data by iterating over the original array
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null; // clear the old table reference
                // A single node: copy it straight into the new table
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // A red-black tree: let the tree split itself
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    // Cleverly designed part: the list is split into a
                    // "low" list that stays at index j and a "high" list
                    // that moves to index j + oldCap.
                    // Head and tail pointers for the two lists:
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    // Walk the nodes of the original list in order
                    do {
                        next = e.next;
                        // The core test. Why does it work? oldCap is a
                        // power of two, so e.hash & oldCap isolates the one
                        // extra bit that the larger mask (newCap - 1) uses.
                        // Bit is 0: the index is unchanged, low list
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        // Bit is 1: the index shifts by oldCap, high list
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                        // In effect the original list is split in two
                    } while ((e = next) != null);
                    // The whole low list goes to position j of the new array
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // The whole high list goes to position j + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```
To summarize the resize() method:
1. Grab the original array.
2. If it exists, grow it:
   2.1 If its capacity already reached the maximum, raise the threshold and return the original array.
   2.2 Otherwise double both the capacity and the threshold.
3. Else, if the initialCapacity constructor was used, take the old threshold as the new capacity.
4. Else (no-arg constructor), use the default capacity and default threshold.
5. If the new threshold is still 0 (initialCapacity path), compute it as newCap * loadFactor.
6. Allocate the array with the new capacity; if this call was only initialization, return it now.
7. Otherwise migrate the data:
   7.1 Iterate over the original array and copy its data to the new array:
       - A single element is copied directly.
       - A red-black tree is split as a tree.
       - A linked list is split into a low list and a high list by testing e.hash & oldCap == 0: nodes where the bit is 0 go to the low list (kept at index j), the rest go to the high list (moved to index j + oldCap).
   7.2 The loop ends once the old array is fully traversed.
8. Return the new array.
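The e.hash & oldCap trick in step 7.1 can be verified with plain arithmetic (a demo of my own; the index computation matches the source above): two hashes that share a bucket in the old table end up either at the same index j or at j + oldCap, exactly as the bit test predicts.

```java
public class ResizeSplitDemo {
    // Bucket index for a power-of-two table capacity
    static int index(int hash, int cap) {
        return (cap - 1) & hash;
    }

    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        // Two hashes that both land in bucket 5 of a 16-slot table
        int h1 = 5;       // the oldCap bit (0x10) is 0
        int h2 = 5 + 16;  // the oldCap bit is 1
        System.out.println(index(h1, oldCap)); // 5
        System.out.println(index(h2, oldCap)); // 5

        // After doubling, h1 stays at j == 5, h2 moves to j + oldCap == 21,
        // which is exactly what (e.hash & oldCap) == 0 predicts
        System.out.println((h1 & oldCap) == 0); // true  -> low list
        System.out.println(index(h1, newCap));  // 5
        System.out.println((h2 & oldCap) == 0); // false -> high list
        System.out.println(index(h2, newCap));  // 21
    }
}
```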
I hope that by the end of this article you have a solid understanding of HashMap, and that it helps you in interviews and at work!
Hongchen gossip
The spot in the photo, which looks a bit cool, is called Anywaysite, in Wenyu River Park, Beijing.
Last week my girlfriend and I, tired of staying at home, went out for a walk and found this park. In autumn and winter it is still quite bleak.
It is the end of the year, and the epidemic keeps flaring up here and there. I hope it will be over soon! peaceeeee
Thank you for reading
If you find any mistakes, please point them out in the comments or by direct message, and I will correct them. Thank you for reading, and welcome to follow my WeChat official account!