This article focuses on
In this article we will analyze the code
2. How the elements are redistributed when the number of elements increases. Also, for the convenience of the reader, I have taken the source code with the line number as far as possible.
The JDK version is 1.8
chart
Once again, the underlying data structure of HashMap is array + linked list + red-black tree. Put a HashMap structure diagram to get a general idea.
Anatomy of the thinking
Create a parameter constructor and add several elements to it until the capacity expansion mechanism is triggered
To facilitate the calculation of hash values, use small strings for key and value
For the use of debug key, please refer to: IDEA debug key description, which will not be repeated here
Debugging code
Public static void main(String[] args) {system.out.println ("★★★★★★ ★★★★★★"); HashMap<String, String> map = new HashMap<>(12); map.put("1", "1"); map.put("2", "2"); map.put("3", "3"); map.put("4", "4"); Map. put("4", "D"); map.put("5", "5"); map.put("6", "6"); map.put("7", "7"); map.put("8", "8"); map.put("9", "9"); map.put("10", "10"); map.put("11", "11"); map.put("12", "12"); Map. put("13", "13"); map.put("14", "14"); map.put("15", "15"); map.put("16", "16"); map.put("17", "17"); map.put("18", "18"); map.put("19", "19"); map.put("20", "20"); map.put("21", "21"); map.put("22", "22"); map.put("23", "23"); map.put("24", "24"); Map. put("25", "25"); } 123456789101112131415161718192021222324252627282930313233343536373839404142434445Copy the code
The break point is on the first line
Start in Debug mode to begin the dissection tour
Click Step Over to HashMap<String, String> map = new HashMap<>(12); Where the line
Click Force Step Into to see that the classloader is called first
This is not the focus of today, we will Step Out directly, and then click Force Step Into again to enter the source code
The two-parameter constructor DEFAULT_LOAD_FACTOR is called with 0.75 and initialCapacity is 12, continuing Force Step Into
If you continue to Force Step Into the two-parameter constructor, you will find that the parent constructor is still called
This. Threshold = tableSizeFor(initialCapacity)
TableSizeFor (int cap)
TableSizeFor (int Cap) tableSizeFor(int Cap) tableSizeFor(int Cap) Static final int MAXIMUM_CAPACITY = 1 << 30 = 2 to the power of 30
So if you pass in 12, you should get back 16,n is 15,n plus 1 is 16
TableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor, tableSizeFor
Continue to Force Step Into the two-parameter function, and assign the result of tableSizeFor to threshold: =16
Begin to put
At this point, the map initialization is complete. In fact, at this point, the table has not been established
Force Step Into map.put(“1”, “1”); Put (K key, V value)
Put (K key, V value) is putVal (K key, V value). Put (K key, V value) is putVal
Key.hashcode () calls the hashCode of String
Then return the put method to putVal and continue with Force Step Into, where the hash value is 49
The table must be null because the table was not initialized at initialization time
Force Step Into resize()
The code execution path is marked with a red box. The code execution path is marked with a red box. The code execution path is marked with a red box
Return newTab directly
A walk through the source code shows that the map created by the parameter constructor,
The initial capacity depends on line 692 of the source code. .
Int oldThr = threshold;
TableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity) tableSizeFor (int initialCapacity
Force Step Into; Force Step Into; Force Step Into
Now that we have an array, and we know the length, it’s time to think about the position of the element, as we explained in detail,
The position of the element is determined by the result of the (n-1) & hash expression
So let’s just go ahead and figure out I is equal to 1
TAB [I] = newNode(hash, key, value, null); , build a linked list of nodes into bit 1
Continue debugging, modCount increases and size increases after the element is added, and compare with the capacity expansion threshold (currently 12). If 1 is less than 12, no capacity expansion is required.
See above for modCount
So if YOU draw a diagram for yourself, it looks something like this, only position 1 has elements, everything else is null
Map. Put (“2”, “2”); The hash value of the table is 50, but the table is no longer null. The table position is calculated directly. 1 equals 2, and the position is null
Step Over to map. Put (“4”, “D”); If the key value is the same and the value is different, what can be done
When I Force Step Into putVal, I find that hash is still 52 and I is also 4. There are already elements in the position, so I walk 634 lines of source code
Take a close look at this line of code
p.hash == hash && ((k = p.key) == key || (key ! = null && key.equals(k))) 1Copy the code
P.hash == hash is true, (k = p.key) == key is also true so e = p; , and then there is no tree, so the source code 656 lines directly overwrite the new value of the old value
At this point, the overwriting problem is solved and Force Step Into is continued. If there is no repeat value, Force Step Into goes through all the way. The diagram of positions 1-9 is shown below
Put, put, put until map.put(“10”, “10”); Why do you stop at this point, because the hash value is different, and the position is probably different, so go in and see
Sure enough, the hash has now changed to 1567, much larger than the previous increment, and the position is 15. Similarly, 11 is 0
So far, 11 elements have been placed, and their locations are shown as follows
Theoretically, another one would hit our capacity expansion threshold = 16*0.75 = 12, but take a look at line 662 of the source code
Put (“12”, “12”) Force Step Into map. Put (“12”, “12”)
Things are starting to change, this hash is different, but I already have elements at the position I =1
The red box above is the code execution path. Source code 642 indicates that 12 nodes are placed next at node 1 of p, while e is still null at the time of p.ext assignment in source code 641, so the following code path is correct
So key=1 gives next, and the diagram is as follows:
Put (“13”, “13”); Put (“12”, “12”); The same
The script would have placed key=13 at position 2 and the next diagram with key=2 would have looked like this
If (++size > threshold
The incremental capacity
He’s going to do resize() again and look inside, forget about element movement, look at expansion first
Compare that to the first resize
Element migration
The capacity and threshold of the second time are two times larger than that of the first time, and oldTab is no longer null. Therefore, you need to migrate oldTab to newTab
So we’re going to go through this code, you go first
if (oldTab ! = null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) ! = null) { oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); else { // preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } if (hiTail ! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; }}}}} 123456789101112131415161718192021222324252627282930313233343536373839404142Copy the code
For (int j = 0; int j = 0; j < oldCap; ++j) is going to loop through the old table
The above code is read in three cases
- First, there is only one Node element in position 3-9, i.e. next is null
- The second type has multiple elements in one position, similar to positions 1 and 2 above, which currently have two elements
- The third type is the element of type TreeNode in this position
For the first case, the core operation is line 712 newTab[e.hash & (newcap-1)] = e;
Calculates the position of the element in the new table, e.hash (newCap-1)
So element 0 goes through E.hash (newCap-1) 1568&31, and the tool calculates 0 in the new table
And then the second element, element one, is exactly the second case, so let’s look at the diagram
709 line oldTab[1] is not null, and 711 line e.ext is not null
And e instanceof TreeNode is false, so the core flow comes to do on line 719… while
The do… Define the first and last of two lists, one to hold elements in the same position as the old list, and one to hold elements that need to be reassigned
Node<K,V> loHead = null, loTail = null; <K,V> hiHead = null, hiTail = null; <K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } if (hiTail ! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } 12345678910111213141516171819202122232425262728Copy the code
When the key=1 element is executed, line 721 e.hash & oldCap is 46 & 16! = 0, jump to source code 729
Continue the loop at position 1, where E. hash & oldCap is 1569&16 and the result is 0, so assign e to loHead, and loTail also points to E
Since there are only two elements, the loop ends here
Finally, place the loHead in newTab[1] at the same place in the new array as in the old one
The hiHead is placed in the new array newTab[1 + 16], which is the capacity of the old array plus the location of the old array
Similarly, the two elements in position 2 are placed at position 2 and position 2+16 in the new table
After the final operation, the position relation is as schematic diagram
conclusion
With this goal accomplished, summarize the main points
1. Initialization of a HashMap is done by calling resize when the first element is inserted (source line 629)
2, do not specify the capacity, default capacity is 16 (source code 694)
3, the specified capacity may not be in accordance with your value, will pass tableSizeFor converted into not less than the n power of the input value of 2, source 378
If tableSizeFor is converted to the NTH power of 2, the value of tableSizeFor is not directly assigned to Capacity. Instead, the value is temporarily stored in threshold (see source code 457), and then passed to newCap when the first element resize is put
I = (n-1) &hash (I = (n-1) &hash (I = (n-1) &hash (I = (n-1)) Direct new value overwrite old value * (source 657) 4.3 If there is and is not a key conflict, then put it in the next position of the original element (source 642) 5, only when the size is greater than the capacity expansion threshold size > threshold, will trigger expansion, source 662, before expansion, the current element has been put in place. Next ==null(next==null)711 7.2 When an element has a next node, there are two types of elements on the list. In the same position in the new table as in the old table (source 738) = 0, position for old table position + old table capacity, source 742
Outlook:
One day, and only a part of it, initialization, initial expansion, and incremental expansion, similar to tree, tree removal has not been studied
Structure tree ideas, but also from the source to find, mainly the following lines
As long as you can get different keys to have the same hash value, and different keys, you can create hash collisions, you can put multiple elements in the same place, and then tree, which we’ll break down next time.