This article has participated in the third “topic writing” track of the Denver Creators Training Camp. For details, check out: Digg Project | Creators Training Camp third is ongoing, “write” to make a personal impact.

An overview of the

HashMap in Java is a hash table implementation, which inherits AbstractMap class, implements Map interface, and stores key-value mappings. As its name suggests, a HashMap stores data using the hash value of a key, which is fast to access, and the storage of key-value pairs is unordered.

Internal data structure

Inside a HashMap, there is an array of buckets in which key-value pairs hash to determine where they will be stored in the array. The calculation is (n-1) & hash, where n represents the length of the array. If an element already exists at the corresponding location when storing a key-value pair, the hash value and key value of the existing element are the same as those of the element to be stored. If they are the same, the hash value and key value of the existing element are overwritten. Otherwise, multiple key-value pairs will be stored at this location.

Multiple key-value pairs can be stored at one location because these key-value pairs are stored as linked lists at each location in the array. When a hash conflict occurs, simply add the new key-value pair to the linked list.

After JDK1.8, if the length of the linked list exceeds a threshold (the default is 8), more operations will be done. If the size of the array is less than 64, the array is expanded. If greater than or equal to 64, the linked list is converted to a red-black tree to reduce the search time.

For details, see the putVal method in the HashMap source code.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                    break; p = e; }}if(e ! =null) { // existing mapping for key
            V oldValue = e.value;
            if(! onlyIfAbsent || oldValue ==null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
Copy the code

The source code of treeifyBin method is as follows.

final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) ! =null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while((e = e.next) ! =null);
        if((tab[index] = hd) ! =null) hd.treeify(tab); }}Copy the code

It does the work of scaling (calling the resize method) or converting the list to a red-black tree. The resize method also appeared in the putVal method before. It has two responsibilities:

  1. Class the first time you put an element into a HashMap.
  2. Expand the container if necessary.

In the logic of capacity expansion, concepts such as capacity, load factor, and gate limit are involved.

capacity

  • The capacity is the estimated number of elements specified when creating the HashMap, which defaults to 16.
  • Load factor refers to the density of data storage. The default value ranges from 0 to 1. The default value is 0.75.
  • Door limit isCapacity x load factorThe value of the. When the actual number of key-value pairs exceeds the threshold, capacity expansion is required.

Therefore, the closer the load factor is to 0, the lower the gate limit is and the more sparse the data storage is. Conversely, the closer the load factor is to 1, the more dense the data store is.

In most scenarios of daily development, it is recommended to keep the default values unchanged. If the load factor is too low, it can lead to frequent library operations, which involve rehash and data replication, which can be very performance consuming, and arrays that are too “empty” can also waste space. If the load factor is too high, too many hash conflicts will result, which will degrade the performance of the HashMap operation.

The tree,

When we introduced the HashMap structure earlier, we introduced the treification conditions:

  1. The number of elements in a bucket has reached the threshold (8 by default).
  2. The length of the array has reached the threshold (64 by default).

After the above two conditions are met, the linked list at the corresponding position will be converted into a red-black tree. Because the linked list is linear, when the number of data reaches a certain level, it will be converted into a red-black tree to improve the operation performance of the data. In addition, it can also prevent the artificial construction of hash collision attacks, resulting in a large number of CPU security problems.

conclusion

HashMap is one of the most important data structures in Java and is often asked in technical interviews. Understanding the structure, principles, and even reading the source code of HashMap is essential for Java programmers.