We believe that the underlying principle of HashMap is more or less understood, here is a brief mention, jdK1.8 version after the internal HashMap mainly through array + linked list + red-black tree data structure to store elements, this article we will analyze several main methods of HashMap, look at the source code of HashMap is specifically how to achieve. And what optimizations the JDK has made to improve performance.
The Node Node
static class Node<K.V> implements Map.Entry<K.V> {
/ / hash value
final int hash;
// Element key
final K key;
// The element value
V value;
// point to the next node
Node<K,V> next;
// Omit other code here...
}
Copy the code
The Node class is an internal class of the HashMap. It represents the element to be added to the HashMap. The next field represents a pointer to the next Node
Put method
public V put(K key, V value) {
return putVal(hash(key), key, value, false.true);
}
Copy the code
The put method calls putVal directly, and you can see that the first argument is passed a hash(key), so let’s look at the hash method
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
This hash method is used to calculate the hash value of the element put into the map method. This code (h = key.hashcode ()) ^ (h >>> 16) obtains the hash value of the key by using key.hashcode (). Then xor (h >>>16) computs the final hash value. H >>>16 moves the hash value 16 bits to the right
If you are not familiar with right shift >>> and xor ^, you may be confused by this line of code. For example:
H value: 1111 1111 1111 1111 1010 0111 1100
H >>>16 value: 0000 0000 0000 1111 1111 1111 1111
H ^ (h >>> 16) Value: 1111 1111 1111 1111 0000 0101 1000 0011
The final value computed by h ^ (h >>> 16) is the hash value obtained by the hash method. In order to reduce hash collisions, we will explain why we can reduce hash collisions later when we talk about the position of the elements in the array. We only need to know that xor operation can preserve the high and low 16 bits of the original hash and reflect them in the final hash value
Now that we’re done with the hash method, let’s take a look at the putVal method, which is a bit longer and has more complicated branch logic, so let’s look at it bit by bit
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
//1. If the array is empty, call resize to generate an array
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
//2. Locate the element in the array first. If the element is empty, build a Node and put it in the current location
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
//3. An element already exists in the current location with the same key, overwriting the current element directly
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
e = p;
//4. If the current element is a tree node, attach the new element to the tree
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
//5. Attach the element to the list
else {
//6. Iterate through the list elements
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
//7. Construct a node. Here we use the next pointer to link the node
p.next = newNode(hash, key, value, null);
//8. If the list has more than 8 elements, it becomes a red-black tree
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
//9. If the current element has the same key as the new element, overwrite the node directly on the list
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break; p = e; }}if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
Copy the code
In the above code, I have added comments on each important code, which should be easy to understand. Let’s look at some important points:
1. Locate elements: p = TAB [I = (n-1) & hash]
Once you get the hash value, (n-1) &hash is used to locate the element in the array by subtracting the length of the hash and the array by 1. N is guaranteed to be a power of 2 in the code. For example, 16, 32, 64, and so on. After looking at this method, you can see why it is a power of 2. Similarly, let’s look at this operation:
The n value is assumed to be 16, so n-1 is 15, and the binary value is :1111
The hash value is 1111 1111 1111 1111 1111 0000 0101 1000 0011, and the result is 11, so the element will be at the index of 11. We use the and instead of modulo the array, again for performance reasons.
So why do xor, let’s say the first element is called a, the original hash is 1111 1111 1111 1111 1010 0111 1100, There’s another element b that has the original hash value of 1111 1111 1111 1110 1111 1010 0111 1100, and notice the difference from the previous element, only the 16th bit is different, a is 1, b is 0
If there’s no shift and xor, then it’s going to have the same (n minus 1) hash value as 16, so it’s going to clash. So let’s look at b’s hash value after the shift and xor
The h value of B is 1111 1111 1110 1111 1010 0111 1100
H >>>16 value: 0000 0000 0000 1111 1111 1111 1110
H ^ (h >>> 16) Value: 1111 1111 1111 1111 0000 0101 1000 0010
So the final hash value of B is 1111 1111 1111 1111 1111 0000 0101 1000 0010, which is 10 after (n-1) &hash, which is the position of the array with index 10, and it does not conflict with A, feel that? That’s what shift and xOR do. Reduces the probability that elements will collide in array positions.
2. Convert linked list to red-black tree: treeifyBin(TAB, hash); This method is simply to link the list into a red black tree, the internal implementation is still very complex, interested can go to see the source code.
The resize method
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
//1. Hash value & old array length equals 0
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
//2. Hash value & old array length not equal to 0
else {
if (hiTail == null)
hiHead = e;
elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
if(loTail ! =null) {
loTail.next = null;
//3. Put it in the original position of the new array
newTab[j] = loHead;
}
if(hiTail ! =null) {
hiTail.next = null;
//4. Place (old array length + old position) in a new position
newTab[j + oldCap] = hiHead;
}
Copy the code
When a HashMap is expanded, it creates a new array twice the size of the old one, and copies the old elements into the new array. The code above is taken from the resize method to determine the location of the new elements. If it is 0, put it in the old position. If it is not 0, put it in the new position. The new position is determined by the old array length + the old position.
Suppose the hash is 1111 1111 1111 1111 0000 0101 1000 0011, the array expands from 16 to 32, and the value evaluated by E.hash & oldCap is 0, so the position on both the old and new arrays is 11
The hash element 1111 1111 1111 1111 0000 0101 1001 0011 expands the array from 16 to 32, evaluates to 1 by e.hash & oldCap, so the position on the new array becomes 11 + 16 = 27 instead of 11
conclusion
This article mainly describes the following implementation of HashMap: 1. Calculate hash value (shift, xOR operation) 2. 3. Implementation of put() method (linked list and linked list to red black tree) 4. Implementation of resize method (Rehash position in a new array)
Above, I hope it will be helpful for you to understand HashMap. If you have different views or questions, please feel free to discuss with me