Note: The JDK version is 1.8. This article analyzes until the method is added, and the rest of the following analysis is deleted and modified if you like

The source code for HashMap1.8 is quite different from that before 1.8

  • directory

    • Introduction to the

    • The data structure

    • Class structure

    • attribute

    • A constructor

    • increase

1. Introduction of HashMap

HashMap is implemented based on the Map interface of a hash table and exists in the form of key-value storage. (The HashMap class is roughly the same as Hashtable, except that it is not synchronized and null is allowed.) The implementation of HashMap is not synchronous, which means it is not thread-safe. Both key and value can be null. In addition, the mappings in a HashMap are not ordered. In JDK1.8, HashMap is composed of array + linked list + red-black tree. The red-black tree is added as the underlying data structure, which makes the structure more complex, but also becomes more efficient.

1.2 HashMap data structure

In JDK1.8, HashMap is composed of array + linked list + red-black tree. The red-black tree is added as the underlying data structure, which makes the structure more complex, but also becomes more efficient. When a value is stored in the Map, it is calculated based on the Key value

Hash, which determines the location of an array by hash, is stored as a linked list if a hash collision occurs, as explained in Object source analysis, but if the list becomes too long, HashMap converts the list to a red-black tree for storage.


Take a look at the storage structure of a HashMap

Why should a HashMap use a red-black tree?

I have never thought about this question, in fact, many people only care about the implementation of red black tree and ignore the question of why to use it, I also had a sudden confusion when writing this article. Refer to the example on the web, and also explain why the threshold is 8:

Since the elements of the bucket in the Map are initialized in a linked list, the lookup performance is O(n), while the tree structure improves the lookup performance to O(log(n)). When the list length is very small, even traversal is very fast, but when the list length keeps getting longer, it will certainly have a certain impact on query performance, so it is necessary to turn into a tree. As to why the threshold is 8, I think the most reliable way to find out is to go to the source code.

Please refer to https://dwz.cn/nPFXmXwJ

2. Class structure

Let’s look at the class structure

AbstractMap class AbstractMap implements the Map interface. So why should HashMap implement the Map interface? This is also the case in ArrayList LinkedList.

According to Josh Bloch, founder of the Java Collections Framework, this is a mistake. In the Java Collections framework, there are a lot of things like this, and when he started writing the Java Collections framework, he thought it might be valuable somewhere, until he realized he was wrong. Apparently, the JDK maintainers didn’t think this minor glitch was worth fixing, so it stuck.

  • Cloneable Indicates an empty interface that can be cloned

  • The Serializable serialization

  • AbstractMap provides a Map implementation interface

3. The attribute

Initialize the capacity (It has to be two to the NTH power)

Maximum set size (must be a power of two)

Load factor, default 0.75

Red black tree when the list value exceeds 8 (added in 1.8)

When the value of the list is less than 6, it goes back to the list from the red-black tree

When the number of buckets in the Map exceeds this value, the buckets in the table can be trefied. Otherwise, the bucket will be expanded if there are too many elements in the Map. To avoid selection conflicts between capacity expansion and treification, the value cannot be less than 4 x TREEIFY_THRESHOLD

Table is used to initialize (must be a power of two)

For caching

The amount stored in the HashMap

Used to record the number of changes to the HashMap

The value used to resize the next capacity is calculated as (capacity * load factor)

The load factor of the hash table

The key attributes

  • Table in JDK1.8 we learned that HashMap is composed of an array, a linked list, and a red-black tree. Table is the array in HashMap

  • Size is the real-time amount of k-v in the HashMap

  • The loadFactor is used to measure how full a HashMap is. The real-time loading factor of a HashMap is calculated as size/capacity instead of capacity. Capacity is the number of buckets, that is, the length of the table.

  • The threshold calculation formula is: capacity * loadFactor. This value is the maximum length of the array currently occupied. After this number, the HashMap is resized to double its previous capacity

4. Construction method

Let’s start with the constructor.

4.1 a HashMap ()

Construct an empty HashMap with a default initial capacity (16) and a default load factor (0.75).

4.2 a HashMap (int initialCapacity)

Construct an empty HashMap with the specified initial capacity and default load factor (0.75).

4.3 HashMap(int initialCapacity, float loadFactor)

Construct an empty HashMap with the specified initial capacity and load factor. So let’s analyze it.

TableSizeFor (tableSizeFor)

5. Increase

Now let’s look at the put() method

We can see that put calls putVal to insert data, but notice that key performs a hash() method here, to see how the hash method is implemented.

We can see from the above that HashMap supports an empty Key, and HashTable uses the Key directly to get HashCode, so an empty Key throws an exception and we’ve already explained why the length of a HashMap is a power of two because HashMap uses a very clever method, It uses hash & (table.length-1) to get the save bit of the object. As mentioned earlier, the length of the underlying array of HashMap is always 2 to the power of n, which is a speed optimization of HashMap. When length is always 2 to the n, hash & (length-1) is equivalent to modulo length, which is hash%length, but & is more efficient than %. For example, n % 32 = n & 32 -1.

Now look at the putVal() method and see what it does.

Main parameters:

  • Hash Hash value of the key

  • The key of the original key

  • Value Indicates the value to be stored

  • OnlyIfAbsent If true does not change the existing value

  • If evict is false, the table is created

Full source code analysis, put the picture would be too long, so I cut it into two.