🖕 welcome to pay attention to my public number “Tong Elder brother read source code”, view more source code series articles, with Tong elder brother tour the ocean of source code.


Remember, from the beginning of this article we change the way of learning, tongge first ask questions, you try to answer these questions in your mind, then enter our source code analysis process, finally Tongge pick a few questions to answer.

The opening question

(1) Are the data structures of ConcurrentHashMap and HashMap the same?

(2) When will the concurrency security problem occur in HashMap in multi-threaded environment?

(3) How does ConcurrentHashMap solve the concurrency security problem?

(4) Which locks are used by ConcurrentHashMap?

(5) How is the expansion of ConcurrentHashMap conducted?

(6) Is ConcurrentHashMap strongly consistent?

(7) What problems cannot ConcurrentHashMap solve?

(8) What are some unusual techniques worth learning from ConcurrentHashMap?

Introduction to the

ConcurrentHashMap is a thread-safe version of HashMap and uses an array + linked list + red-black tree structure internally to store elements.

Compared to the same thread-safe HashTable, efficiency and other aspects are greatly improved.

Introduction to various locks

Here’s a brief description of the various locks to give you an idea of the concepts when we talk about them later.

(1) synchronized

Keywords in Java, internally implemented as monitor locks, are primarily indicated by object monitor fields in the object header.

Synchronized has been optimized a lot since its old release, and at runtime exists in three different ways: biased locking, lightweight locking, and heavyweight locking.

Biased locking means that a piece of synchronized code is always accessed by a thread, so the thread will automatically acquire the lock, reducing the cost of acquiring the lock.

Lightweight lock means that when a biased lock is accessed by another thread, the biased lock will be upgraded to lightweight lock. This thread will try to acquire the lock by spinning, without blocking, and improve performance.

Heavyweight lock refers to a lightweight lock. When the spinning thread spins for a certain number of times and has not acquired the lock, it will enter the blocking state. This lock is upgraded to heavyweight lock, which will block other threads and reduce the performance.

(2) the CAS

CAS, Compare And Swap, is a kind of optimistic lock. It believes that concurrent operations on the same data may not be modified. When updating data, it tries to update data, And keeps trying if it fails.

(3) Volatile

In Java, when multiple threads access the same variable and one thread modifies the value of the variable, the other threads can immediately see the modified value. (This involves the knowledge of Java memory model, interested students can check the relevant information)

Volatile only guarantees visibility, not atomicity. For example, volatile variable I does not guarantee the correct result every time for I ++, because I ++ is a two-step operation that reads first and increments by 1.

(4) spin lock

Spin locking, which means that the thread trying to acquire the lock does not block, but keeps trying in a circular manner. This has the advantage of reducing the context switch of the thread to unlock the lock, improving performance, but the disadvantage is that the loop consumes CPU.

(5) segmenting lock

Segment-based lock is a kind of lock design idea that reduces the granularity of locks. It is mainly used in ConcurrentHashMap to achieve efficient concurrent operations. When the operation does not need to update the entire array, only one item in the array is locked.

(5) already

Reentrant lock means that a thread will automatically acquire the lock if it tries to acquire the lock again. The advantage of reentrant lock is to avoid deadlock.

Synchronized is also a reentrant lock.

Source code analysis

A constructor

public ConcurrentHashMap(a) {}public ConcurrentHashMap(int initialCapacity) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException();
    int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1))? MAXIMUM_CAPACITY : tableSizeFor(initialCapacity + (initialCapacity >>>1) + 1));
    this.sizeCtl = cap;
}

public ConcurrentHashMap(Map<? extends K, ? extends V> m) {
    this.sizeCtl = DEFAULT_CAPACITY;
    putAll(m);
}

public ConcurrentHashMap(int initialCapacity, float loadFactor) {
    this(initialCapacity, loadFactor, 1);
}

public ConcurrentHashMap(int initialCapacity,
                         float loadFactor, int concurrencyLevel) {
    if(! (loadFactor >0.0 f) || initialCapacity < 0 || concurrencyLevel <= 0)
        throw new IllegalArgumentException();
    if (initialCapacity < concurrencyLevel)   // Use at least as many bins
        initialCapacity = concurrencyLevel;   // as estimated threads
    long size = (long) (1.0 + (long)initialCapacity / loadFactor);
    int cap = (size >= (long)MAXIMUM_CAPACITY) ?
            MAXIMUM_CAPACITY : tableSizeFor((int)size);
    this.sizeCtl = cap;
}
Copy the code

SizeCtl = threshold/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor/loadFactor The official explanation is as follows:

(1) -1: indicates that a thread is performing initialization operations

(2) -(1 + nThreads) : n threads are being added together

(3) 0, the default value, which is used for actual initialization later

(4) > 0, threshold of next expansion after initialization or expansion

Whether or not the official explanation is correct will be discussed later.

Add elements

public V put(K key, V value) {
    return putVal(key, value, false);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    // Neither key nor value can be null
    if (key == null || value == null) throw new NullPointerException();
    // Computes the hash value
    int hash = spread(key.hashCode());
    // The number of elements in the bucket to be inserted
    int binCount = 0;
    // An infinite loop is used with CAS (if CAS fails, the whole bucket is fetched again)
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            // If the bucket is not initialized or the number of buckets is 0, the bucket is initialized
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // If there is no element in the bucket, insert the element into the bucket
            if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
                // If you insert an element using CAS and it already exists, enter the next loop and start again
                // If the CAS element is inserted successfully, break breaks out of the loop and the process ends
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            // If the hash of the first element in the bucket to be inserted is MOVED, the current thread will help migrate the elements together
            tab = helpTransfer(tab, f);
        else {
            // If the bucket is not empty and no elements are migrated, lock the bucket (segment lock)
            // Find if the element to be inserted is in the bucket
            // onlyIfAbsent=false
            // If not, insert to end of list or insert tree
            V oldVal = null;
            synchronized (f) {
                // Check again if the first element has changed. If so, enter the next loop and start from the beginning
                if (tabAt(tab, i) == f) {
                    // If the hash value of the first element is greater than or equal to 0 (no migration, no tree)
                    // The elements in the bucket are stored in linked lists
                    if (fh >= 0) {
                        // The number of elements in the bucket is set to 1
                        binCount = 1;
                        // Iterate through the bucket, incrementing binCount by 1 each time
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if(e.hash == hash && ((ek = e.key) == key || (ek ! =null && key.equals(ek)))) {
                                // If the element is found, a new value is assigned (onlyIfAbsent=false)
                                // And exit the loop
                                oldVal = e.val;
                                if(! onlyIfAbsent) e.val = value;break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                // If no element is found at the end of the list
                                // Just insert it to the end of the list and exit the loop
                                pred.next = new Node<K,V>(hash, key,
                                        value, null);
                                break; }}}else if (f instanceof TreeBin) {
                        If the first element is a tree node
                        Node<K,V> p;
                        // The number of elements in the bucket is set to 2
                        binCount = 2;
                        // Call the red-black tree insert method to insert the element
                        // Returns null on successful insertion
                        // Otherwise return the node found
                        if((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! =null) {
                            // If the element is found, a new value is assigned (onlyIfAbsent=false)
                            // And exit the loop
                            oldVal = p.val;
                            if(! onlyIfAbsent) p.val = value; }}}}// If binCount is not 0, the element was successfully inserted or found
            if(binCount ! =0) {
                // If the number of linked list elements reaches 8, try tree
                // The value of binCount is only 2 when the element is inserted into the tree, not the number of elements in the whole tree
                // So there is no repetition of tree
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                // If the element to be inserted already exists, the old value is returned
                if(oldVal ! =null)
                    return oldVal;
                // Exit the outer loop and the process ends
                break; }}}// Insert elements successfully, the number of elements is increased by 1.
        addCount(1L, binCount);
        // Successfully inserted element returns NULL
        return null;
    }
Copy the code

The overall process is similar to the following steps:

(1) If the bucket array is not initialized, initialize it;

(2) If the bucket where the element to be inserted is empty, try to insert the element directly into the first position of the bucket;

(3) If capacity expansion is in progress, the current thread is added to the capacity expansion process.

(4) If the bucket where the element to be inserted is not empty and the element is not being migrated, the bucket will be locked (segment lock);

(5) If the element in the current bucket is stored in a linked list, find the element in the linked list or insert the element;

(6) If the element in the current bucket is stored in red-black tree mode, search for the element in the red-black tree or insert the element;

(7) Return the old value if the element exists;

(8) If the element does not exist, increase the number of elements in the whole Map by 1 and check whether expansion is needed;

The main locks used in the operation of adding elements are (spin lock + CAS + synchronized + segmenting lock).

Why use synchronized rather than ReentrantLock?

Because synchronized has been greatly optimized, it is no worse than ReentrantLock in certain situations.


To be continued ~~


Now the article can not leave a message, if there is any suggestion or opinion, welcome to leave a message to me in the background of the public account, thank you ~


Welcome to pay attention to my public number “Tong Elder brother read source code”, view more source code series articles, with Tong elder brother tour the ocean of source code.