A HashMap the underlying

Prior to jdk1.8, the underlying implementation of HashMap was based on arrays and linked lists

After jdk1.8, the underlying implementation of HashMap is based on array + linked list + red-black tree

The reason is that as the amount of data continues to increase, the probability of hash collisions increases dramatically, and a long linear linked list will be formed. As a result, the data structure becomes severely linearized and the efficiency decreases. If the array size exceeds 64 and the list node number exceeds 8, the list structure will be automatically converted to a red-black tree structure.

1.1 What are arrays for? When is an array created?

Arrays are essentially stores key-value pairs, and when we first call the PUT method, we create an array with a default length of 16 and a load factor of 0.75.

1.2 What are linked lists for? When will a linked list be created?

HashMap storing data, through the hash (key) method to calculate the hash value of the key, and then lowered maximum index of hash and array operation, get an index of less than or equal to the maximum value (the secondary system bit operations can ensure that the calculation of index value must be less than or equal to the array index of the largest, ensures that there will be no index crossed). Now that you know how to get an index, look at the key hash!! As we all know, different keys may have the same hash value, and then there will be hash conflicts (the computed array index already has key-v data). In this case, you need to check whether the new key is the same as the original key (equivalent to calling hash+equal to compare keys). If the new key is not the same, a linked list will be created and the new data will be mounted to the current index as a linked list. To summarize, a linked list is created when a hash conflict occurs and the two keys do not match.

1.3 What are the functions of red-black trees? When will it be created?

As mentioned above, when data volumes are large, hash collisions increase, leading to long, linear lists. When the list has more than 8 nodes, the underlying layer will turn the list into a red-black tree.

1.4 Why is array expansion twice?

In fact, this problem is similar to the hash value and the maximum index bit operation to obtain the index value. For example, if the current array length is 16 times that of 32, then the maximum index is 32-1=31, corresponding to binary 0001 1111. Carefully observe that, regardless of the hash value, the sum of 0001 1111, the resulting value must be less than or equal to 31, so that there is no index out of bounds.

2.1 About the PUT method

2.1.1 Hash based on key to obtain hash value

Hash (key) hash(key) hash(key) hash(key) hash(key) hash(key)

2.1.2 Creating a Default array

For the first time, an array of length 16 and load factor 0.75 is created by default.

2.1.3 Hash value and array maximum index are combined

The hash value and the maximum index of the array are summed to obtain a value less than or equal to the maximum index, which is the index of the data stored in the array

2.1.4 Determine whether there is hash conflict with the calculation result

If the index calculated above already has a value, then a hash conflict has occurred. In this case, we need to determine whether the keys are really the same, by comparing keys using hash+equal. If the keys are the same, the original data will be overwritten. If the keys are not the same, the linked list will be mounted or new nodes will be added to the red-black tree

2.1.5 Hash conflict occurs. Is there a linked list or a red-black tree? If it is a linked list, determine whether it needs to be converted to a red-black tree

Whether to mount a linked list or add a red-black tree node depends on the current data structure (it is a linked list until the red-black tree condition is met). If the current mount is a linked list, check whether the current mount length exceeds 8 to determine whether to convert to a red-black tree.

2.1.6 Determine whether array expansion is required

If the current data does not have hash conflicts, then you need to consider array expansion. For example, if the current array length is 16, check whether the array length exceeds 12 after the data is saved. If so, expand the array twice.

Of course, in our daily use, may not care about some of the underlying execution logic, interested students can look at the source code. We’ll talk about get methods next time.

The above are some of the author’s views on the bottom of HashMap, there are incorrect or not precise places, I hope you can correct more.