This article will analyze the process of changing the internal data structure of HashMap through the following simple code.
public static void main(String[] args) {
Map<String, String> map = new HashMap<>();
for (int i = 0; i < 50; i++) {
map.put("key" + i, "value"+ i); }}Copy the code
1 Data structure description
The following are the fields in the HashMap used in this article:
The following describes the meanings of several fields
1) the table
// The HashMap uses this array internally to store all key-value pairs
transient Node<K,V>[] table;
Copy the code
The structure of a Node is as follows:
2) size
Records the number of key-value pairs in the HashMap
3) modCount
It records the number of structural changes to the HashMap, including operations that can change the number of key-value pairs, such as PUT and remove, and operations that can change the internal structure, such as rehash.
4) threshold
Record a critical value. If the number of stored key/value pairs exceeds the threshold, expand the storage capacity.
5) loadFactor
The loadFactor is usually used to calculate threshold. Threshold = total capacity x loadFactor.
2.new HashMap
The source code for new HashMap is as follows:
/** * The load factor used when none specified in constructor. */
static final float DEFAULT_LOAD_FACTOR = 0.75 f;
public HashMap(a) {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
Copy the code
At this point, the HashMap only initializes the load factor (using the default value of 0.75) and does not initialize the table array. In fact, a HashMap uses a lazy initialization policy. The table is initialized only when it is put for the first time (the table is null).
3. Initialize the table array
When put for the first time, the HashMap determines whether the current table is empty, and if so, calls the resize method to initialize it. The resize method initializes an array of 16 and assigns it to the table. And calculate threshold=16*0.75=12. The state of the table array is as follows:
4. Put the process
map.put("key0"."value0");
Copy the code
Hash (“key0”) = 3288451 Hash (“key0”) = 3288451 hash(“key0”) = 3288451 Index =(array size -1) &hash = 3; if (table[index] == null) create a new Node on the table;
Keep putting…
map.put("key1"."value1");
Copy the code
map.put("key1"."value1");
map.put("key2"."value2");
map.put("key3"."value3");
map.put("key4"."value4");
map.put("key5"."value5");
map.put("key6"."value6");
map.put("key8"."value7");
map.put("key9"."value9");
map.put("key10"."value10");
map.put("key11"."value11");
Copy the code
map.put("key12"."value12");
Copy the code
Hash (“key12”)=101945043 hash(“key12”)= (total size -1) &hash = (16-1) &101945043 = 3 table[3] ! = null, but the key to be put is not the same as table[3]. Key, and we have to store it.
That’s where linked lists come in, and that’s how hashMaps resolve hash conflicts. The HashMap creates a new Node and places it at the end of the table[3] list. The table status is as follows:
5. The resize expansion
At this point, there are 13 elements in the table, which exceeds threshold(12). You need to call the resize method to expand the table. HashMap creates a table with twice the size (162=32) and copies the old Node into the new table with a new threshold of 24(320.75).
The following is a brief description of the copying process (not the exact copy, the Node position may change).
First look at the source code:
for (int j = 0; j < oldCap; ++j) { // oldCap: Old table size =16
Node<K,V> e;
if((e = oldTab[j]) ! =null) { // oldTab: backup of old table
oldTab[j] = null;
// If the element in the array has no successor Node, the new index value is computed and Node is placed in the new array
if (e.next == null)
newTab[e.hash & (newCap - 1)] = e;
// Ignore the else if. In fact, if the list is longer than 8, HashMap turns the list into a tree structure with TreeNode elements
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
// There are successor nodes
else { // preserve order
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
//
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
if(loTail ! =null) {
loTail.next = null;
newTab[j] = loHead;
}
if(hiTail ! =null) {
hiTail.next = null;
//newTab[j + oldCap] = hiHead; }}}}Copy the code
[Note 1] Iterate through the linked list and calculate the position of each node in the linked list in the new table.
The position is calculated as follows:
1) If (hash & oldCap) == 0, then the node is still in its original position. Why?
Since oldCap=16, the binary representation is 0001 0000, any number &16, if equal to 0, must have the fifth binary of the number 0.
In its current state, the new capacity is 32, so the table’s maximum index is 31, and the binary representation of 31 is 00011111. Hash & (capacity -1), that is, the new index will be computed as hash & (32-1) assuming Node’s hash = 01101011, then
01101011
& 00011111
----------
00001011 = 11
Copy the code
2) Now compare (Hash & oldCap)! Theta is equal to 0
If the node’s (hash & oldCap)! = 0, then the position of this object = old index + old capacity
Assuming Node hash = 01111011, then
01111011
& 00011111
----------
00011011 = 27
Copy the code
The hash value 01101011 in the previous example differs from the hash value 01111011 in this example only in the 5th binary. The two values are in the same index in the old table, as follows:
01101011
& 00001111
----------
00001011 = 11
Copy the code
01111011
& 00001111
----------
00001011 = 11
Copy the code
Since capacity expansion is always done in a double way, i.e. : old capacity << 1, this also explains [Note 2], when (Hash & oldCap)! = 0, the Node’s new index = old index + old capacity.
After capacity expansion, the table status is as follows: