A HashMap overview

Before and after HashMap 1.7, the underlying data structure uses [array + linked list], and after 1.8, it uses [array + linked list/red-black tree]. Array is used to store elements because of fast lookup, linked list is used to solve hash conflicts, and red-black tree is used to solve the slow query speed of the linked list to optimize a data structure of the linked list. HashMap is thread-safe, if you need thread safety, use ConcurrentHashMap or Collections. SynchronizedMap () thread safe package HashMap to achieve.

Use two diagrams to clearly express the data structure of hashMap in JDK 1.7/1.8. [Photo from the Internet]



Why does the JDK use red-black trees to optimize linked lists in HashMap? Listen to me.

Red and black tree

Before introducing red-black tree, we first introduce binary search tree, also known as binary search tree, binary sort tree, which is generally used to optimize linked lists. Has the following features:

  • The value of all nodes in the left subtree is less than or equal to its root
  • The value of all nodes in the right subtree is greater than or equal to its root
  • The left and right subtrees are also binary sort trees respectively

Eg: To find the element 10, the search procedure is:

  • 9 < 10 (right node)
  • 13 > 10 (left node)
  • 11 > 10 (left node)
  • 10 = 10 find the element

The maximum number of times a binary sort tree can find a result is the height of the binary tree. If the elements are respectively [7,6,5,4,3,2,1], then according to the characteristics of binary sorting tree, the tree will become a straight line, which will become a linear query, time complexity is O(N) level, in order to optimize this problem, [red black tree] appeared.

A red-black tree is a self-balancing binary search tree that optimizes the problem that the linked list in a HashMap can be very long.

Red-black tree is a binary search tree with color attributes for each node, and has the following characteristics:

  • Nodes are red/black
  • The root node must be black
  • Each leaf node (NIL node, empty node) is black
  • Two children of each red node are black (there cannot be two consecutive red nodes on all paths from each leaf to the root)
  • All paths from any node to each of its leaves contain the same number of black nodes

[A typical red-black tree]

How do I insert a node into a red-black tree?

Eg: By inserting 14 node elements into the graph, the red-black tree characteristics will not be broken.

If you insert element 21, you break the red-black tree property.

How w do red black trees maintain tree self-equilibrium?

Red black trees maintain the rules of red black trees by “changing color” and “rotation”, which can be “left rotation” and “right rotation”.

Rotating sample

1) Rotate the following image left

2) Rotate the following image right

Source code analysis

The following source code is from JDK1.8

The default value

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16, default initialization capacity
static final int MAXIMUM_CAPACITY = 1 << 30;	// Maximum capacity
static final float DEFAULT_LOAD_FACTOR = 0.75 f;	// Load factor
static final int TREEIFY_THRESHOLD = 8;	// Tree the threshold
static final int UNTREEIFY_THRESHOLD = 6;	// Cancel the tree threshold
static final int MIN_TREEIFY_CAPACITY = 64;	// The minimum tree size of the array to be converted to a red-black tree is 64
Copy the code

Two factors affect the performance of a HashMap

Initial capacity

  • The capacity of the hash table when it was created

Load factor

  • The capacity that measures how full a hash table is before it is automatically added to capacity, when the amount of data in the hash table exceeds (load factor * current capacity), the hash table will be rebuilt
  • Eg: If the HashMap capacity is not specified, the initial capacity is 16. When the map data amount reaches 12, the hash table will be expanded and rebuilt.
public HashMap(int initialCapacity, float loadFactor) {}
Copy the code

put()

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table storing the elements is empty, initialize the necessary fields
    if ((tab = table) == null || (n = tab.length) == 0)
        // Get length 16
        n = (tab = resize()).length;
    // If the hash value is null, create a new node
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // If the hash and key values of the new node are the same as those of node P in the table
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            e = p;
        // If the node is a red-black tree, red-black tree insertion is performed
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                // indicates that the list has only one header node
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // Convert the list to a red-black tree when the length of the list reaches 8
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        // Build a red-black tree
                        treeifyBin(tab, hash);
                    break;
                }
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                    break; p = e; }}// If the key is repeated, the old value is overwritten and onlyIfAbsent is false
        if(e ! =null) { // existing mapping for key
            V oldValue = e.value;
            // Determine whether overwrite is allowed and whether value is null
            if(! onlyIfAbsent || oldValue ==null)
                e.value = value;
            afterNodeAccess(e);
            returnoldValue; }}// Record the number of times the internal structure of the HashMap has been modified (the number of times it has been mapped or rehashed)
    ++modCount;
    // Map size > Threshold
    if (++size > threshold)
        // Double the size of the original array and assign the value of the original array to the new array, either a linked list or a red-black tree
        resize();
    // callback to allow the LinkedHashMap post-operation
    afterNodeInsertion(evict);
    return null;
}
Copy the code

resize()

/** * Initializes or increases the size of the table. If the table is empty, allocate it according to the initial capacity. Expansion table, power 2 */
final Node<K,V>[] resize() {
    // Get various information about the old element array
    Node<K,V>[] oldTab = table;
    // The old array length
    int oldCap = (oldTab == null)?0 : oldTab.length;
    // The old array threshold
    int oldThr = threshold;
    // Define the length of the new array and the threshold for expansion
    int newCap, newThr = 0;
    // If the table is not empty
    if (oldCap > 0) {
        // If the array length reaches the maximum value, change the threshold to the maximum value
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Perform the following operations:
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        // Use the default values for the initial capacity and threshold of the new array
        newCap = DEFAULT_INITIAL_CAPACITY;	/ / 16
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); / / 16 * 0.75
    }
    // The critical value is also 0
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // Update load factor
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if(oldTab ! =null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if((e = oldTab[j]) ! =null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // Red black tree adjustment
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    // List adjustment
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
                    if(loTail ! =null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if(hiTail ! =null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
Copy the code

treeify

/** * insert each value in the list into a red-black tree */
final void treeify(Node<K,V>[] tab) {
    TreeNode<K,V> root = null;
    for (TreeNode<K,V> x = this, next; x ! =null; x = next) {
        next = (TreeNode<K,V>)x.next;
        // Initialize Root
        x.left = x.right = null;
        if (root == null) {
            x.parent = null;
            x.red = false;
            root = x;
        }
        else {
            K k = x.key;
            inth = x.hash; Class<? > kc =null;
            for (TreeNode<K,V> p = root;;) {
                int dir, ph;
                K pk = p.key;
                if ((ph = p.hash) > h)
                    dir = -1;
                else if (ph < h)
                    dir = 1;
                else if ((kc == null &&
                          (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0)
                    dir = tieBreakOrder(k, pk);
                TreeNode<K,V> xp = p;
                if ((p = (dir <= 0)? p.left : p.right) ==null) {
                    x.parent = xp;
                    if (dir <= 0)
                        xp.left = x;
                    else
                        xp.right = x;
                    // Balance adjustment
                    root = balanceInsertion(root, x);
                    break; }}}}// Make sure the given root is the root
    moveRootToFront(tab, root);
}
Copy the code

The interview questions

1. Why do HashMap choose red-black tree as data structure optimization linked list?

2. Why is the default initial capacity to the 2nd power? What if it’s not a power of two? What about HashMap perturbation functions?

3. What are the conditions for converting linked lists into red-black trees in HashMap?

4. Why is HashMap non-thread-safe? In what ways?

5. How is put implemented in Hashmap?

6. What types of elements are used as keys when using Hashmap?


conclusion

The above is xiaobian in the learning process summed up some of the content, if there is not in place to understand the place, but also hope you pointed out, very grateful.