1. An overview of the
HashMap is a hash table-based Map interface implementation that allows NULL keys and values
2. Member variables
/** * [1] The default initial capacity - MUST be a power of two. */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/** * Maximum capacity */
static final int MAXIMUM_CAPACITY = 1 << 30;
/* The load factor used when none specified in constructor. */
static final float DEFAULT_LOAD_FACTOR = 0.75 f;
/** * The list to red black tree threshold */
static final int TREEIFY_THRESHOLD = 8;
/** * [2] Red-black tree list threshold */
static final int UNTREEIFY_THRESHOLD = 6;
/** * Min tree threshold * If the length of the table in the HashMap is greater than 64, then the bucket lists will be allowed to become red-black trees. * If only the bucket lists are too long, If the length of the table is less than 64 *, the table should be expanded by resize instead of being converted to a red-black tree * The minimum tree size threshold should be at least four times that of the red-black tree */
static final int MIN_TREEIFY_CAPACITY = 64;
/** * Holds cached entrySet(). Note that AbstractMap fields are used * for keySet() and values(). */
transient Set<Map.Entry<K,V>> entrySet;
/** * Load factor controls how dense the table array in the HashMap is stored * The closer the load factor is to 1, the more dense the data is stored, resulting in inefficient element search * The closer the load factor is to 0, the more sparse the data is stored. The load factor for The hash table. * *@serial* /
final float loadFactor;
/** * number of changes */
transient int modCount;
/** * The number of key-value mappings contained in this map. */
transient int size;
/** * an array of elements */
transient Node<K,V>[] table;
When {/ * * *@link HashMap#size} >= {@link* The next size value at which to resize (capacity * load factor). * *@serial* /
int threshold;
Copy the code
[1], the capacity of HashMap must be a power of 2
The reason why HashMap capacity is a power of 2
3. Construction method
java.util.HashMap#HashMap()
public HashMap(a) {
// Use the default parameters
// Default load factor, default capacity
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
Copy the code
The default constructor does not initialize the table array; it does so in the java.util.hashmap #putVal method
java.util.HashMap#HashMap(int)
public HashMap(int initialCapacity) {
// Call the overloaded constructor
// Specify the initial capacity, default load factor
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
Copy the code
java.util.HashMap#HashMap(int, float)
public HashMap(int initialCapacity, float loadFactor) {
// Specify initial capacity, specify load factor
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
// If the specified initial capacity is greater than the maximum capacity, the maximum capacity is used
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
// Check the load factor
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
// When manually specifying the HashMap capacity, the HashMap threshold is set regardless of the load factor
this.threshold = tableSizeFor(initialCapacity); / / [1]
}
Copy the code
TableSizeFor Specifies the value greater than the smallest power of 2 of a specified capacity
For example, given 15, return 16; Given 30, return 32
TableSizeFor method is a very cool method, 5 lines of code to see my face confused
java.util.HashMap#HashMap(java.util.Map<? extends K,? extends V>)
public HashMap(Map<? extends K, ? extends V> m) {
// Use the default load factor
this.loadFactor = DEFAULT_LOAD_FACTOR;
putMapEntries(m, false);
}
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
int s = m.size();
if (s > 0) {
// Check whether the table is instantiated
if (table == null) { // pre-size
// Calculate the capacity expansion limit of M
float ft = ((float)s / loadFactor) + 1.0 F;
// Check whether the capacity expansion limit is greater than the maximum capacity of HashMap
int t = ((ft < (float)MAXIMUM_CAPACITY) ?
(int)ft : MAXIMUM_CAPACITY);
if (t > threshold) {
// If the capacity expansion limit of M exceeds the capacity expansion limit of the current HashMap, you need to adjust the capacity expansion limitthreshold = tableSizeFor(t); }}else if (s > threshold)
// if the value of m.size exceeds the upper limit, run the resize method to expand the table
resize();
for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
// Add all key-value pairs in m to the HashMap
K key = e.getKey();
V value = e.getValue();
putVal(hash(key), key, value, false, evict); }}}Copy the code
The flow of the putMapEntries method:
- If the table is empty, the capacity expansion upper limit is recalculated
- If the HashMap capacity expansion limit is smaller than that of the specified Map
size
, then executeresize
super-popular - Passes all key values in the specified Map
putVal
Methods are placed in a HashMap
The hash function used here is actually a perturbation function, as described below
4. It’s Very important
java.util.HashMap#tableSizeFor
Returns a power of two for the given target capacity. */
static final int tableSizeFor(int cap) {
/* Static tool methods hash(), tableSizeFor() (4) - programmer - CSDN blog https://blog.csdn.net/fan2012huan/article/details/51097331 */
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0)?1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
Copy the code
The flow of the algorithm is as follows
The five lines of code are really awesome!! If you don’t understand the flowchart after reading it, read the text in the notes
java.util.HashMap#hash
// Perturbation function
static final int hash(Object key) {
int h;
// Mix the high and position of the original Hash value to reduce the probability of low level collisions
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
Why does this function exist?
A HashMap determines the index of a key in a table based on its hash value
How to understand this sentence? Let me give you an example
There is a 32-bit hash value as follows
If the Hash value is the lowest 4 bits, index = 0101 = 5
If a large number of hash values with the lower 4 digits of 0101 are present, then all key-value pairs will be placed at index = 5 of the table
As a result, keys cannot be evenly distributed in the table
So HashMap solves this problem by creating this method java.util.hashmap #hash
Take the high 16 bits of a 32-bit hash value and the low 16 bits, and the low one carries the high one
To put it bluntly, even if there are a large number of keys with the same hash value, the index computed will be different after going through the hash method
Use the hash method to reduce the hash conflict probability
java.util.HashMap#resize
/** * Initialize or expand the table **@return the table
*/
final Node<K,V>[] resize() {
// Take out the old table snapshot
Node<K,V>[] oldTab = table;
// Check the old table capacity
int oldCap = (oldTab == null)?0 : oldTab.length;
// Old capacity limit
int oldThr = threshold;
// The capacity of the new table and the upper limit of capacity expansion
int newCap, newThr = 0;
if (oldCap > 0) {
if (oldCap >= MAXIMUM_CAPACITY) {
// If the capacity of the old table is larger than the maximum capacity of the HashMap, no expansion operation is performed
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
// If the size of the HashMap does not exceed the maximum size, double the size (newCap, newThr)
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
// The table is not initialized. The initial capacity is set to the maximum capacity
newCap = oldThr;
else { // zero initial threshold signifies using defaults
// If the table is not initialized, use the default capacity (16) and capacity (12).
/ / such as Java. Util. HashMap. HashMap ()
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
// newThr = 0 if oldCap > 0, oldThr = 0
// Recalculate the new capacity limit
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
// Update the HashMap capacity limit
threshold = newThr;
@SuppressWarnings({"rawtypes"."unchecked"})
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
// Initialize the table
table = newTab;
// If the old table is not empty, add the
if(oldTab ! =null) {
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if((e = oldTab[j]) ! =null) {
oldTab[j] = null;
if (e.next == null)
// If there is only one node in the linked list, recalculate e's index in newTab based on the hash value of the node
newTab[e.hash & (newCap - 1)] = e;
else if (e instanceof TreeNode)
// The linked list has been turned into a red-black tree, split tree
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else { // preserve order
// Define two lists, lo and HI
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
// Use hash &oldCap to determine whether a node in the linked list should stay in the same position in the new table
if ((e.hash & oldCap) == 0) {
// The node remains at position j
// Insert lo list
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
// The node is moved to position j+oldCap
// Insert the hi list.
if (hiTail == null)
hiHead = e;
elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
if(loTail ! =null) {
// If lo is not empty, place the entire lo list in position J of the new table
loTail.next = null;
newTab[j] = loHead;
}
if(hiTail ! =null) {
// If the hi list is not empty, place the entire hi list at position J +oldCap of the new table
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
}
Copy the code
The resize process is not complicated and is roughly as follows
In my opinion, the key point is to reassign the key-value pairs to the new table
Here are three things to consider:
- The index position in the table has only one element
- Table index is a red-black tree
- The index position in the table is a linked list.
In case 3, the index position in the table is a linked list, and redistribution will split the list into two linked lists
An LO linked list, left at index
The other HI list will be moved to index+oldCapacity
At this point, we need to determine whether the node will stay on the LO list or the HI list.
Read this article for an in-depth understanding of HashMap(4): Resize expansion through line-by-line analysis of key source code
In jdk1.7, the expansion of the table will form a loop under the concurrent execution of multiple threads. We recommend that you read this article 🎉🎉
java.util.HashMap#treeifyBin
final void treeifyBin(Node<K,V>[] tab, int hash) {
int n, index; Node<K,V> e;
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
// [1] If the table length is smaller than the minimum tree threshold, run resize to expand the table
resize();
else if ((e = tab[index = (n - 1) & hash]) ! =null) {
// Turn the list into a red-black tree
// hd head node, TL tail node
TreeNode<K,V> hd = null, tl = null;
// the do-while loop converts a one-way list to a bidirectional list
do {
// Convert the node to a TreeNode
TreeNode<K,V> p = replacementTreeNode(e, null);
if (tl == null)
// Set the root node
hd = p;
else {
// Point the first node of the tree node to the last node
p.prev = tl;
tl.next = p;
}
tl = p;
} while((e = e.next) ! =null);
// Turn the bidirectional list into a red-black tree
if((tab[index] = hd) ! =null) hd.treeify(tab); }}Copy the code
The treeifyBin method converts a linked list at an index position in the table into a red-black tree
This method is usually called when the list length is greater than TREEIFY_THRESHOLD after adding or merging elements
[1] it can be seen that if the current table. Length is less than the minimum tree threshold (64), then the resize method will be called to expand the list, rather than tree it
java.util.HashMap#putVal
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
// [1] The table is not initialized yet
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
// If the hash value of the key has no element in the corresponding position of the table, the node will be created directly
tab[i] = newNode(hash, key, value, null);
else {
// A node already exists in the corresponding position of the table
Node<K,V> e; K k;
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
// The specified key is equal to the key of an existing node
e = p;
else if (p instanceof TreeNode)
// The list node has become a red-black tree node
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
// Iterate through the list, insert or update nodes
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
// Add a new node to the end of the list
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
// the reason for treeify_threshold-1 is that binCount starts at 0, and when there are 8 nodes in the list, binCount=7
// When the number of nodes is greater than or equal to 8, tree the list
/ / [2]
treeifyBin(tab, hash);
break;
}
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
// The node corresponding to the key is found in the list
break; p = e; }}if(e ! =null) { // existing mapping for key
// If the node corresponding to the key is found, the value on the node is updated
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
// Update the value of the node
e.value = value;
afterNodeAccess(e);
// Returns the old value of the node
return oldValue;
}
}
++modCount;
if (++size > threshold)
/ / [3]
resize();
afterNodeInsertion(evict);
return null;
}
Copy the code
The putVal method is an implementation of adding key-value pairs to the associated method
As you can see in the figure above, the putVal method is called internally in methods that add key-value pairs
As can be seen from [1], the putVal method initializes the table internally by calling the resize method
The overall logic is not complicated, but pay attention to [2] and [3]
[2], if the linked list is too long after nodes are added, the linked list should be turned into a red-black tree
[3], if the number of key-value pairs in the whole HashMap reaches the upper limit of capacity expansion after nodes are added, the table should be expanded
5. To summarize
If the source code for ArrayList is one and a half stars, I’d say HashMap is at least three stars
This article omitted some content, such as the red-black tree implementation in HashMap. The main reason for not writing it is that I do not understand the red-black tree 😂 very well. If I understand it later, I will make up another article
At the very least, this article makes the implementation of the most important methods of HashMap quite clear, or it can be 😁
6. Recommended reading
HashMap static tool methods hash(), tableSizeFor()
Deep understanding of HashMap(4): Resize expansion for line-by-line analysis of key source code
Hash function in HashMap – faint maple – blog garden
The reason why HashMap capacity is a power of 2
Reasons and solutions for loop formation of HashMap linked list