why

Jdk8 rewrote this class. It is seldom used in ordinary business code, so I ignored it. Today, I will pick it up again and make a note. If there is any mistake, we welcome criticism and correction

Differences between JDK7 and JDK8

Unaddressed jdk7 uses ReentrantLock + segment + hashentry + unsafe JDK8 uses Synchronized + CAS + Node + NodeTree + unsafe

The key way to

Let’s start with the two most important methods, get and put

The put method is easy to read for concurrency and does not require locking because it does not involve data modification. How does ConcurrentHashMap work if you understand the put data logic

Put method

Check the value of the current subscript in the table using infinite loop logic

  1. Check whether the table is initialized. If not, initialize the table and repeat the loop
  2. If the value is null, CAS is used to set the value to the index. If the setting is successful, it ends. If the setting fails (the failure may be caused by other threads setting the value of the index), the loop is repeated
  3. To be determined
  4. If the value of the current subscript is not null, enter the synchronized code block
    1. If the key is equal, set a new value. If the key is equal, mount the linked list. Record the length of the linked list
    2. If it’s a red black tree, set it to a red black tree (red black tree is not expanded here)
    3. Determines whether to convert to a red-black tree based on the length of the list. The default threshold is 8

The picture above is clearer

Source code (key parts annotated)

final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for(Node<K,V>[] tab = table;;) { Node<K,V> f; int n, i, fh; // If the table is empty, initialize the table, as shown belowif(tab == null || (n = tab.length) == 0) tab = initTable(); // Determine the currenthashIf no value is set, cas is not blockedelse if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else{ V oldVal = null; Synchronized (f) {// check again for changesif(tabAt(TAB, I) == f) {// If this nodehashThe value is not zero, which means that when the current node is a normal node, this should be easier to understandhashValue, key equals is equal ifhashIf a conflict occurs, add a linked list, record the length of the list (binCount), and then adjust the length accordingly, whether to use a red-black tree instead of a linked listif (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash&& ((ek = e.key) == key || (ek ! = null && key.equals(ek)))) { oldVal = e.val;if(! onlyIfAbsent) e.val = value;break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break; }}} // If it is already a tree structure, it is a tree structureelse if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! = null) { oldVal = p.val;if(! onlyIfAbsent) p.val = value; }}}} // Check the threshold, the default is 8, beyond which the tree will be convertedif(binCount ! = 0) {if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if(oldVal ! = null)return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
Copy the code

Get method (comments)

The get method is much cleaner, and the main logic is handled in the PUT method

public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode());
        if((tab = table) ! = null && (n = tab.length) > 0 && (e = tabAt(tab, (n - 1) & h)) ! = null) {if ((eh = e.hash) == h) {
                if((ek = e.key) == key || (ek ! = null && key.equals(ek)))returne.val; } // if node is a red-black tree, find the corresponding nodeelse if (eh < 0)
                return(p = e.find(h, key)) ! = null ? p.val : null; // For a linked list, loop to find the corresponding nodewhile((e = e.next) ! = null) {if(e.hash == h && ((ek = e.key) == key || (ek ! = null && key.equals(ek))))returne.val; }}return null;
    }
Copy the code

Initialize map table (optional)

Two pre-requisite basic concepts to understand

  1. Unsafe

Just a quick word about this class. Java does not access the underlying operating system directly, but rather through native methods. Even so, the JVM has a backdoor. The Unsafe class in the JDK provides hardware-level atomic operations.

Although the methods in this class are public, there is no way to use them, and the JDK API documentation does not provide any explanation of the methods in this class. In general, use of the Unsafe class is restricted. Only trusted code can obtain instances of the Unsafe class, and classes in the JDK library are free to use.

  1. CAS

The java.util.concurrent package is completely built on CAS. Without CAS, there would be no such package, which shows the importance of CAS.

Most current processors support CAS, but different vendors implement it differently. CAS has three operands: the memory value V, the old expected value A, and the value to be modified B. Change the memory value to B and return true if and only if the expected value A and the memory value V are the same, otherwise do nothing and return false.

  1. The source code

When we initialize the array size, we don’t lock it because we use a sizeCtl variable. Setting this variable to -1 indicates that the table is being initialized.

private final Node<K,V>[] initTable() {
        Node<K,V>[] tab; int sc;
        while((TAB = table) = = null | | TAB. The length = = 0) {/ / sizeCtl: initialization and resize table flags, table initialization and resize the control. When it is negative, the size of the table is initialized or adjustedif((sc = sizeCtl) < 0) // If the value is -1, the CPU usage is being initialized or resized. Thread.yield(); // lost initialization race; Just spin // set SIZECTL to -1, initialization starts on success, loop continues on failure. // compareAndSwapInt Non-blocking synchronization primitives: arg0, arg1, arg2, arg3 are the object instance, the target object property, the current expected value, and the value to be set, respectivelytrue, failurefalse
            else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    if ((tab = table) == null || tab.length == 0) {
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        @SuppressWarnings("unchecked") Node<K,V>[] nt = (Node<K,V>[])new Node<? ,? >[n]; table = tab = nt; sc = n - (n >>> 2); } } finally { sizeCtl = sc; }break; }}return tab;
    }
Copy the code

conclusion

  1. Use Synchronized + CAS + Node + NodeTree to replace Segment. Lock only when hash conflict occurs or values are modified. The granularity of lock is smaller and blocking is greatly reduced

  2. When the number of linked list nodes is larger than 8, the linked list is converted to a red-black tree for storage, and the query time complexity changes from O(n) to O(logN).

I have read several other posts about ConcurrentHashMap before, and it is easy to forget after reading. Looking at the source code is like reading a fine book. If you can read it carefully and understand it, you can memorize it and write middleware