Why this article

Recently in the ConcurrentHashMap, also referred to some related articles, mostly dabbed, did not answer the question in the process of reading.

In general, many people’s impression of ConcurrentHashMap is that it is a data structure stored as a linked list or red-black tree, with segmented locking to mitigate content-free concurrency. The former is not the core competitiveness of ConcurrentHashMap, and the latter is not enough to fully explain its advantages.

ConcurrentHashMap comes about because HashMap is not thread-safe and HashTable is thread-safe but concurrency is inefficient. ConcurrentHashMap carefully considers the difficulty of both, striking a delicate balance between fast storage and efficient concurrency.

What efforts does ConcurrentHashMap make to achieve high concurrency? The depth of the answer represents the degree of mastery of ConcurrentHashMap. Hence the article (source JDK8).

basis

HashMap briefly works as follows:

  1. In a HashMap, K/V is stored in Node granularity
  2. For each K, the hash value can be used to find the index position in the table. The table type is table[].
  3. Different hash values may be indexed to the same location, also known as a collision
  4. In the event of a collision, all nodes indexed to the table[index] are stored as a linked list
  5. When the list is too long, the search becomes inefficient, converted to a red-black tree, to speed up the search
  6. Therefore, whether the table[index] is a linked list or a red-black tree depends on the degree of collision at this location
  7. If the stored data exceeds the maximum capacity of the table, a new table is generated and the nodes of the old table are re-indexed to the new table

ConcurrentHashMap maintains the working nature of HashMap and works as expected in concurrent environments. From the point of view of HashTable, the reason why it is unable to work in high concurrency environment is that it locks all methods, which leads to excessive coverage of critical sections and fierce competition conditions. Therefore, when the amount of concurrency is high, each thread is restricted to other threads, which is no different from serial, and violates the original purpose of concurrency.

Some properties are key to understanding ConcurrentHashMap:

// Similar to the introduction of HashMap, each stored Node is indexed to the location
transient volatile Node<K,V>[] table;
// When the capacity is insufficient, create a larger capacity nextTable and reindex all nodes in the table to nextTable, then set table=nextTable
private transient volatile Node<K,V>[] nextTable;
// The loading factor determines how full the table should be. The larger the table is, the more likely collision will occur. The smaller the table is, the more space will be wasted.
// Unlike HashMap, the value is always 0.75, which is related to the optimization of ConcurrentHashMap
private static final float LOAD_FACTOR = 0.75f;

/** * is a very important and difficult to understand attribute of ConcurrentHashMap. SizeCtl is used for concurrency control and state tags: * =0, ConcurrentHashMap is in the initial state * =-1, indicating that ConcurrentHashMap is being initialized * >0, indicating that the current maximum capacity * <-1, indicating that ConcurrentHashMap is being expanded. In this state, sizeCtl = -(1 + Number of expansion threads) */
private transient volatile int sizeCtl;

/** * Because sizeCtl handles multiple states, other attributes are required to assist in */
// This parameter is used during expansion. The tag is generated using sizeCtl and the status of the tag is regressive. It indicates that the expansion is complete and the valid status is 16 bits
private static int RESIZE_STAMP_BITS = 16;
// Expand the maximum number of threads to 2^16,
private static final int MAX_RESIZERS = (1< < (32 - RESIZE_STAMP_BITS)) - 1;
// Restore the expansion flag in the sizeCtl record
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
Copy the code

Above can tolerate need not hard swallow, have an impression can, after the follow-up mentioned natural clear.

Back to concurrent programming, there are several ways to improve concurrency efficiency:

  • Reduce the region of critical sections: It is easy to understand that in concurrency, critical sections pass sequentially, that is, critical sections are serial.
  • Speed through a critical section: Make the logical code in a critical section execute as fast as possible, so that the thread stays in the critical section for less time.
  • Reduce race conditions: Reduce interaction between threads by removing unnecessary race conditions, or by competing for different critical sections.

ConcurrentHashMap for concurrency efficiency optimization, will also start from these aspects.

Section capacity

Unlike the other articles, I want to start with the expansion method of ConcurrentHashMap, namely transfer(). Because the transfer() processing method, which contains the important core content of ConcurrentHashMap, is worth explaining in detail, and after understanding, it will be much simpler to watch the semantic implementation of other methods.

As with HashMap, when the data capacity exceeds the load factor, new storage continues to be allowed without expansion. Therefore, the collision problem will be aggravated, and the region with serious collision problem will become inefficient. At this point, you need to expand to maintain the efficiency balance that HashMap maintains.

ConcurrentHashMap. Transfer () content is very long, the segments.

Determining section length

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
    int n = tab.length, stride;
    // Determine the length of each segment. NCPU represents the number of CPU cores
    if ((stride = (NCPU > 1)? (n >>>3) / NCPU : n) < MIN_TRANSFER_STRIDE)
        // The minimum length of each segment is not less than MIN_TRANSFER_STRIDE, that is, 16
        stride = MIN_TRANSFER_STRIDE; 
    if (nextTab == null) {
        // Create a new table
        try {
            @SuppressWarnings("unchecked")
            Node<K,V>[] nt = (Node<K,V>[])newNode<? ,? >[n <<1];
            nextTab = nt;
        } catch (Throwable ex) {      // try to cope with OOME
            sizeCtl = Integer.MAX_VALUE;
            return;
        }
        // Point to the new table
        nextTable = nextTab;
        // transferIndex indicates the index location that has not been processed by the old table during capacity expansion
        transferIndex = n;
    }
    // The length of the new table
    int nextn = nextTab.length;
    // Create a special Node for the Node subclass ForwardingNode
    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
    // Indicates whether to expand the capacity of the next area
    boolean advance = true;
    // Indicates whether the capacity expansion is complete
    boolean finishing = false; . }Copy the code

The more interesting variable is the stride. The Chinese explanation is “one step distance”, should be for the expansion scenario, meaning “the size of each region”. From the way of acquiring stride, CPU verification is considered, which directly shows that concurrent expansion is supported. Then, each participant in the concurrent process is responsible for a stride length region at a time, and the minimum stride is 16. We’ll see that in a second.

At ①, you will have a question: There is no lock here, so how do you ensure that the next operation will use the same nextTab? If null is passed to transfer(), only one thread will pass through ①. Leave the first question as a follow-up: How do I ensure that the nextTab is the same when expanding?

Determine which expansion area each thread is responsible for

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab){...for (int i = 0, bound = 0;;) {
        Node<K,V> f; int fh;
        while (advance) {
            
            // nextIndex: records the current transferIndex value
            // nextBound: the boundary of the area currently being processed
            int nextIndex, nextBound;
            if (--i >= bound || finishing)
                // Update the index position
                / / 1.
                advance = false;
            else if ((nextIndex = transferIndex) <= 0) {
                / / 2.
                i = -1;
                advance = false;
            }
            else if (U.compareAndSwapInt
                        (this, TRANSFERINDEX, nextIndex,
                        nextBound = (nextIndex > stride ?
                                    nextIndex - stride : 0))) {
                // Compete for the responsibility of expanding the region through CAS
                / / 3.
                // Remember the boundaries of the responsible expansion area
                bound = nextBound;
                // In this case, I is the index of the expansion region
                i = nextIndex - 1;
                advance = false; }}... }Copy the code

ConcurrentHashMap supports concurrent expansion, so each thread will have its own area to expand:

  • First, advance is false, indicating that the current thread is processing the expansion of a region. When advance is true, it indicates that the current thread is looking for the next region to expand, or is indexing forward on the region it is working on.
  • Then, the transferIndex indicates the index boundary of the region that has not been configured. When a thread receives the expansion task of a region, the transferIndex value needs to be updated. Therefore, the race condition will be generated on the transferIndex, and nextIndex is required to save the value of the transferIndex before the race condition for the race through CAS.
  • Finally, threads compete for capacity expansion tasks in a region through CAS, where bound records the boundary and I records the current location.

So, at (1) : indicates that the current thread has not completed the expansion task of this region, because it has not exceeded the bound; In (2) : there are no more regions to allocate, or all the regions are processed, or all the regions are processed by other threads, no longer need to participate in the current thread; At ③ : Remember the stride? It’s the length of each region. All incoming threads get their responsible region through CAS, update transferIndex, remember the bound and the current index position; The process is as follows:

It can be simply explained that TRANSFERINDEX is the memory address of TRANSFERINDEX. Because the update of TRANSFERINDEX has a race condition, the atomicity of the operation on TRANSFERINDEX can be guaranteed through CAS operation. The CAS operation can simply be understood as an atomic operation of the underlying guaranteed single variable, accepting the address of the object, the expected value, the value to be updated as a parameter, and returning true if the expected value is met and successfully set to the value to be updated.

Thus, by contending to update the value of the transferIndex, the thread gets the capacity expansion task for an area of the table, and nextIndex(the expected value, knowing that the transferIndex may have changed before the CAS call) assists in the CAS race.

Processing before Capacity Expansion

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab){...if (i < 0 || i >= n || i + n >= nextn) {
    int sc;
    if (finishing) {
        // The capacity expansion is complete
        nextTable = null;
        // Point to the new table
        table = nextTab;
        // Update the maximum capacity of the current table
        sizeCtl = (n << 1) - (n >>> 1);
        return;
    }
    if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
        // The current thread has completed the region expansion task, and the sizeCtl value has been updated successfully
        if (/ * * / (sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
            // The entire capacity expansion is not finished, but the capacity expansion task is completed
            return;
        // Note that the entire expansion is finished, and the loop will be entered again and then returned
        finishing = advance = true;
        i = n; // recheck before commit}}else if ((f = tabAt(tab, i)) == null)
        // f remember the node at the current location
        // Update it to ForwardingNode to signal that the location has been processed
        // Since there is no content, the attempt is directly marked as processed, and the failure will come in again
        advance = casTabAt(tab, i, null, fwd);
    else if ((fh = f.hash) == MOVED)
        // Indicate that the location has already been handled
        advance = true; . }Copy the code

Before capacity expansion, use f to record all nodes in the old table[I] location. If the current position has no content, update the Node of type ForwardingNode on the old table[I], thus marking that the position has been processed.

Expansion is concurrent, and exit logic is processed at (1). In the current phase, it can be briefly understood that resizeStamp() gets a marker value, and when (sc = sizeCtl) -2 regress to this marker value, the concurrent expansion is over. As mentioned earlier, sizeCtl expresses multiple states, which leaves a follow-up to the second question: How does sizeCtl control multiple concurrent states?

When the concurrency ended, the maximum size was updated with sizeCtl = (n << 1) – (n >>> 1). A shift of 1 to the left is 2 times, and a shift of 1 to the right is 1/2, resulting in 1.5 times n. That is, n * 1.5 = 2N * 0.75, and the maximum capacity keeps the load factor capacity of the new table length.

Expansion processing

At this point, Node — f in table[I] is obtained. Whether f forms a linked list or a red-black tree, the processing control of concurrent expansion expressed by f is consistent. So, take the linked list, for example.

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab){...// Lock this position
    synchronized (f) {
        if (tabAt(tab, i) == f) {
            // lh indicates Node whose index position is unchanged
            // hn indicates the Node whose index position is to be changed
            Node<K,V> ln, hn;
            if (fh >= 0) {
                // A hash value greater than > indicates that nodes in table[I] form a linked list
                
                / / 1.
                // N can be interpreted as a mask, which is used by runBit to quickly determine whether a Node should move to a new location after expansion
                int runBit = fh & n;
                // To speed up the processing of ln and HN
                Node<K,V> lastRun = f;
                for(Node<K,V> p = f.next; p ! =null; p = p.next) {
                    int b = p.hash & n;
                    if(b ! = runBit) {/ / 2.
                        // Update lastRun and runBit
                        // At the end of the loop, all nodes after lastRun will go to the same position.runBit = b; lastRun = p; }}/ / 3.
                if (runBit == 0) {
                    // All nodes after lastRun are in the same index position
                    ln = lastRun;
                    hn = null;
                }
                else {
                   // All nodes after lastRun will go to the new index
                    hn = lastRun;
                    ln = null;
                }
                for(Node<K,V> p = f; p ! = lastRun; p = p.next) { int ph = p.hash; K pk = p.key; V pv = p.val;/ / 4.
                    if ((ph & n) == 0)
                        ln = new Node<K,V>(ph, pk, pv, ln);
                    else
                        hn = new Node<K,V>(ph, pk, pv, hn);
                }
                // Update ln to the same index position of the new table
                setTabAt(nextTab, i, ln);
                / / 5.
                // Update ln to the new index position of the new table
                setTabAt(nextTab, i + n, hn);
                // Mark as processed
                setTabAt(tab, i, fwd);
                // Enable the current thread to process the next location in the current region
                advance = true; }... }Copy the code

To understand this, you need to know how to determine your index position on a table. We can directly determine that the capacity of the table is always a multiple of 2. The way to determine the position is to hash the Key to a value, let’s say ph, and then combine it with the length of the table, notice that the table is an array, and the index starts at 0. Suppose the table length is n, then the index position = pH & (n-1). So:

Ph determines the number of significant bits in the index position of the data, depending on the number of consecutive “1” bits starting from the lowest bit of the binary (n-1). With this in mind, the expansion process for table[I] becomes something like this:

  • In the table, all nodes located in I can be indexed to I through p & (n-1), and are already in position I.
  • In nextTable, the length of nextTable has changed, meaning (n-1) -> (2n-1). In order to determine the index value, we need to pay attention to the valid bit, and the valid bit is also changed. Therefore, Node in table[I] may be in different positions in nextTable.

NextTable [I] or nextTable[I + n].

Suppose n =16 = 1 0000, n-1 = 15 = 1111. At this point, the significant pH bit of all nodes that determine the index position is low4Bits, let's say both of them1010. When n =2n.2n-1 = 31 = 1 1111If the Node in table[I] has one more significant bit in its pH value, it is lower5position Significant bit of0 10101 0101. So the new significant bit is0or1It decides which way to go. for0Where nextTable[I], for1To nextTable[I + x]1010, the latter position1 1010 = 1010 + n = 1010 + 1 0000That is, the difference depends on the fifth position.Copy the code

So,

  • At (1) : ph & n determines the new significant bit to be 0 or 1, which determines where the Node will go, and runBit represents the result
  • At ② : Through runBit, it is determined that the value of pH & n will not change from a certain position in the linked list, that is, all nodes from lastRun will go to the same position in nextTable
  • At ③, the lastRun linked list confirmed by runBit is recorded. When runBit is 0, it is remembered by ln. 1, remembered by hn, is used to speed up processing.
  • At ④, for the Node before lastRun, create a new Node(same k,v,hash) according to the result of pH & n, and enter ln or HN respectively
  • Place ln in nextTable[I] and hn in nextTable[I + n]

The process for constructing ln and HN is as follows:

It is important to note that the purpose of the ③ and ④ expressions is how to quickly organize ln and HN to go to different nextTable locations, rather than the backward queue that some articles have suggested. An interesting point is that the new Node is created at ④ rather than fully reuse the old Node because it reallocations the race conditions and may speed up concurrency, as explained later.

Table [I] with binary tree structure expresses the same semantics, interested in can read, do not need to rigidly.

Capacity summary

The expansion process can be summarized as follows:

  1. Take the stride as the unit of length, divide the table into various regions
  2. Each thread that participates in capacity expansion updates the transferIndex through CAS competition and is assigned to the responsible region
  3. Each thread knows the boundary of the region it is responsible for and expands the table[I] in the region one by one
  4. The location of the table[I] being processed will be marked as ForwardingNode
  5. Each Node on table[I] will likely go to a different index location in nextTable, nextTable[I] or nextTable[I + n], differentiated by the newly available valid bits
  6. When the region is processed, go back up. Controlled by a higher level, whether to continue into transfer().

SizeCtr state control

Next, I’ll answer the question, how does sizeCtl control multiple concurrent states? . Understanding how the ConcurrentHashMap sizeCtr property works is a key link to understanding the semantic implementation of the ConcurrentHashMap methods.

private transient volatile int sizeCtl;
Copy the code

SizeCtl is a 32-bit int that is not serialized, visible to all threads, and has an initial value of 0.

Initialize the

public ConcurrentHashMap(int initialCapacity) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException();
    / / capacity
    int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1))? MAXIMUM_CAPACITY :// Return the nearest 2^x greater than or equal to initialCapacity
                tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
    this.sizeCtl = cap;
}
Copy the code

ConcurrentHashMap allows a minimum table length of 16, which can be passed in as much as desired during initialization, but it will find the nearest table length of 2^x greater than or equal to initialCapacity, which sizeCtl will remember for the time being.

private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        if ((sc = sizeCtl) < 0)
            // Indicates that other threads are initializing the table, freeing the time slice
            Thread.yield();
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            // Update sizeCtl to -1 through CAS lock contention, which is handled by the successful thread
            try {
                if ((tab = table) == null || tab.length == 0) {
                    // Verify that the table is not created
                    int n = (sc > 0)? sc : DEFAULT_CAPACITY; @SuppressWarnings("unchecked")
                    // Create table
                    Node<K,V>[] nt = (Node<K,V>[])newNode<? ,? >[n]; table = tab = nt;// Mark the maximum amount of data that can be held currently, i.e., table length * load factor.
                    sc = n - (n >>> 2); }}finally {
                sizeCtl = sc;
            }
            break; }}return tab;
}
Copy the code

The table will be created as a lazy load, and the creation process may occur concurrently. To determine which thread is responsible for creating the table, set sizeCtl to -1 to indicate that a process is creating the table. Other threads see this information and give the time slice to the thread that created the table. After table is created, sizeCtl = table length * load factor (n – (n >>> 2)). After expansion update, still maintain the update according to this rule.

In this phase, the following results are obtained: sizeCtl = 0, indicating the initial state sizeCtl = -1, and creating state sizeCtl = 2^x * 0.75, indicating the maximum capacity allowed by the current table

Expansion tag

During the expansion phase, sizeCtl is marked as the end of expansion, along with the number of threads being expanded.

/ / return a scale mark static final ints resizeStamp (int n) {return an Integer. NumberOfLeadingZeros (n) | (1 < < (RESIZE_STAMP_BITS - 1)); }Copy the code

When expansion is required, the current table length is facedn, and a flag is obtained through resizeStamp(). When sizeCtl returns to this flag again, the expansion is complete.

  1. Integer. The first numberOfLeadingZeros (n) semantics is: return a value, on behalf of n binary from the highest level, know one is 1, how many continuous 0.
  2. 1 << (resize_stamp_bits-1) Returns an int of 1 only in the lower 16 bits.
  3. To get a marked value.
For the int32Bits, let's say n is equal to16, the Integer numberOfLeadingZeros (n) |1 << (RESIZE_STAMP_BITS - 1)
= 
0000 0000 0000 0000 0000 0000 0001 1011
|
0000 0000 0000 0000 1000 0000 0000 0000
= rc = 
0000 0000 0000 0000 1000 0000 0001 1011
Copy the code

In the example above, the 16-bit “1000 0000 0001 1011” will be the high 16 bits of the sizeCtl and will be used as the expansion marker.

Why use resizeStamp() for processing to get the number of marks instead of marking with a specific number?

One reason is that the table can be expanded in the concurrent state if the value is specified. Assume that a specific value Z represents expansion. At T0, all participating threads record a value Z, and the length is expanded from N to 2N. At time T1, one of the expansion threads wakes up and finds that the value is Z. It thinks expansion is needed and increases the length from 2N to 4N. Of course, you can ensure consistency by locking, but this reduces concurrency efficiency. Therefore, obtain a value X through resizeStamp(), when the table length changes, obtain a value through resizeStamp() will get X1, X is different from X1, thus avoiding this problem. Take a closer look, is it like THE ABA problem of CAS?

On the other hand, sizeCtl will also record the number of threads being expanded. According to ConcurrentHashMap’s MAX_RESIZERS, a maximum of 2^16 threads are supported for expansion. The lower 16 bits of sizeCtl are sufficient.

Thread record

The basic type in Java is a signed number, that is, the highest bit of binary is a sign bit, 0 is positive, 1 is negative. Get the value from resizeStamp() as the high 16 bits of the sizeCtl, so that sizeCtl<0.

Because of the characteristic of binary addition and subtraction of signed numbers, the addition and subtraction of negative numbers, if only the lower 16 bits, is just like the addition and subtraction of positive numbers. Therefore, the number of record threads can be updated by adding or subtracting sizeCtl.

In particular, the first thread enters at +2, the subsequent threads enter at +1, exit at -1, and then check whether the -2 value matches the value obtained by resizeStamp() to determine whether the expansion is complete.

SizeCtr summary

Thus, sizeCtr works as follows: If the value is 0, the initial state is sizeCtl = -1, and the table is created. If the value is 2^x * 0.75, the upper 16 bit indicates the end of expansion, and the lower 16 bit indicates the number of discovered threads

From transfer() and other operations to sizeCtr, a lot of bit operations are used to help process logic and speed up threading through critical sections. The preceding hint for these bit operations is to make the table length n constant to a power of 2, which is self-provable if you are interested.

Segmented lock

The point of segment locking is that each region has its own lock so that threads can compete conditions to different regions, thereby reducing the interaction between threads.

ConcurrentHashMap Stores different data sets in nodes on table[I]. Each table[I] is a lock, thus disperses the contention. The transfer() method mentioned above also locks the Node in table[I] and then expands the capacity.

Now, continue with the insert method putVal().

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        // Get the hash value
        int hash = spread(key.hashCode());
        // Indicates the Node position on tabel[I]
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                // Lazy load, if there is no table, load
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                // Table [I] has no Node yet. Insert, exit
                if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                // A thread is expanding
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                / / lock table [I]
                    if (tabAt(tab, i) == f) {
                        // Check to add table[I] again
                        if (fh >= 0) {
                        // List processing. }else if (f instanceof TreeBin) {
                        // Red black tree processing. }}}if(binCount ! =0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        // If the number of nodes in table[I] reaches the threshold, convert the list to a red-black tree
                        treeifyBin(tab, i);
                    if(oldVal ! =null)
                        return oldVal;
                    break; }}}// Update the data storage
        addCount(1L, binCount);
        return null;
    }
Copy the code

As with tranfer(), the lock on table[I] must be taken before the operation continues. If table[I] is marked as MOVED(ForwardingNode) yes, it indicates that a thread is expanding the capacity.

If the number of nodes on table[I] is greater than or equal to TREEIFY_THRESHOLD, the linked list is converted to a red-black tree to speed up indexing. The probability of reaching the TREEIFY_THRESHOLD is about 0.00000006

A friendly reminder: don’t get stuck on how linked lists and red-black trees are constructed. What you should care about is how ConcurrentHashMap works.

final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
    Node<K,V>[] nextTab; int sc;
    if(tab ! =null && (f instanceofForwardingNode) && (nextTab = ((ForwardingNode<K,V>)f).nextTable) ! =null) {
        // rs indicates the capacity expansion marker value
        int rs = resizeStamp(tab.length);
        while (nextTab == nextTable && table == tab &&
                (sc = sizeCtl) < 0) {
                // sizeCtl<0 indicates that capacity is being expanded
            if((sc >>> RESIZE_STAMP_SHIFT) ! = rs || sc == rs +1 ||
                sc == rs + MAX_RESIZERS || transferIndex <= 0)
                /** * 1.sc >>> RESIZE_STAMP_SHIFT) ! * 2.sc == rs + 1 Indicates that all threads are expanded. Sc == rs + MAX_RESIZERS Indicates that the capacity expansion thread has reached the upper limit * 4. TransferIndex <= 0 indicates that all areas of the table have been allocated, and no help is needed */
                break;
            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                // Updated sizeCtl successfully through CAS
                transfer(tab, nextTab);
                break; }}return nextTab;
    }
    return table;
}
Copy the code

Compare resizeStamp() tags to see if they match expectations, and then compete to update sizeCtl through CAS to help with the expansion.

After the new data is added successfully, the current data capacity is updated by addCount()

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s; .if (check >= 0) {
    Node<K,V>[] tab, nt; int n, sc;
    while(s >= (long)(sc = sizeCtl) && (tab = table) ! =null &&
            (n = tab.length) < MAXIMUM_CAPACITY) {
          // The current capacity s >= sizeCtl
            
        // The capacity expansion marker value
        int rs = resizeStamp(n);
        if (sc < 0) {
            // Continue expansion
            if((sc >>> RESIZE_STAMP_SHIFT) ! = rs || sc == rs +1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                transferIndex <= 0)
                /** * 1.sc >>> RESIZE_STAMP_SHIFT) ! * 2.sc == rs + 1 Indicates that all threads are expanded. Sc == rs + MAX_RESIZERS Indicates that the capacity expansion thread has reached the upper limit * 4. TransferIndex <= 0 indicates that all areas of the table have been allocated, and no help is needed */
                break;
            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                // Update sizeCtl successfully through CAS. Continue expansion
                transfer(tab, nt);
            }
        else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                             (rs << RESIZE_STAMP_SHIFT) + 2))
            // This is the first time to perform capacity expansion
            transfer(tab, null);
         // Calculate the current capacitys = sumCount(); }}}Copy the code

This answers the remaining question: How do YOU ensure that nextTabs are the same when you expand?

  • After the data is inserted, the current capacity is recalculated, and if it is larger than sizeCtl, the capacity is expanded.
  • The first time we enter Transfer (), we will update to sizeCtl = sizeCtl + 2 and pass null,
  • If transfer() is subsequently entered, sizeCtl = sizeCtl + 1 is updated, nt representing nextTable is passed in, and sizeCtl = sizectl-1 is updated on exit
  • When sizeCtl -2 = resizeStamp() is checked, the capacity expansion is completed
  • Therefore, only the thread that enters Transfer () for the first time will create a nextTable, ensuring that other threads will access the same one when they expand.

Section locking summary

The corresponding deletion method, replaceNode(), follows roughly the same logic as the insert method. First, discover that you are expanding, and help expand. Then, the lock on the table[I] is obtained and the semantic logic is performed. In this way, data on table[I] can be secured concurrently. The ConcurrentHashMap() race condition is also dispersed due to segmented locking.

With a good understanding of segment scaling, sizeCtl control, and segment locking, the rest of the method semantics are easy to implement and therefore do not expand.

Synchronized to replace already

In addition to the ConcurrentHashMap concurrency optimization mentioned above, changing the lock implementation from ReentrantLock to Synchronized after JDK8 is another big optimization.

In previous versions, ConcurrentHashMap was implemented with each table[I] location as a Segment containing all data indexed at that location. In addition, each Segment is a ReentrantLock(exclusive lock), so when you operate on table[I], you must display the unlock by adding method. ReentrantLock is based on an AQS implementation, so in concurrent situations, threads that cannot acquire the lock are suspended until another thread releases the lock and is awakened to continue trying to acquire the lock. Therefore, the consumption of user mode switching and kernel mode switching caused by suspension and wake up, and the performance consumption caused by context switch are inevitable.

In JDK8, the lock mode is changed from ReentrantLock to Synchronized, which is based on the Synchronized performance, and supports the expansion process of lock escalation.

In simple terms, Synchronized goes through a bias/lightweight/heavyweight lock process.

  • First, every object in Java can be used as a lock.
  • Initially, only one thread will use the lock, which will be marked as biased. At this point, the process runs as if there were no locks.
  • Next, more threads have race conditions and race to use the lock, which is marked as a lightweight lock. At this point, the thread that did not acquire the lock will spin through the CAS, wasting CPU time, acquire the lock, and sometimes acquire it very quickly. Lightweight locking means that, despite a race condition, everyone passes the critical section quickly, wasting CPU time and consuming less performance than switching kernel state and suspending/arousing threads. This process does not require a switch to kernel mode.
  • Later, the race became intense, and each thread was wasting CPU time by using lightweight locks, which were costing performance so much that it was better to hang them. At this point, the lock is marked as a heavyweight lock, and each thread that fails to acquire the lock is suspended and reinstated, at the cost of user mode switching, kernel mode switching, and the performance cost of context switching.

Therefore, when the table[I] is biased lock and lightweight lock, the performance is better than ReentrantLock; When table[I] is a heavyweight lock, the performance is similar to that of ReentrantLock. Overall, efficiency will be better than ReentrantLock.

Remember from the subsection expansion section, when a Node in table[I] goes to a nextTable, a new Node is created instead of a reusable Node, which may improve concurrency efficiency.

Although Synchronized does not support lock degradation, we can use a new lock to reobserve how competitive conditions are and reexperience lock inflation. When you use a new Node, you use a new lock. The reason is that the competition for table[I] is not the same as for nextTable[I] and nextTable[I + n]. It is likely that there will be less competition for either or both places. The worst case is mitigated by segment locking.

And Synchronized, as conduits, has more room for the imagination.

For more information about Synchronized and ReentrantLock, see Java Locks

Current Capacity Calculation

ConcurrentHashMap will be possible in various locations concurrently, so the current capacity is more of an estimate, after all, cannot stop the world. ConcurrentHashMap is an internal CounterCell object that records the number of times each thread adds or deletes data, and then combines the results. See fullAddCount() and sumCount() for related methods, which do not expand.

The semantics of other methods related to ConcurrentHashMap are also simplified and not expanded after the above is understood.

conclusion

Now, to answer the question raised at the beginning: What does ConcurrentHashMap do to achieve high concurrency?

  • First, following the nature of HashMap, the data is distributed in different areas. Then, fragmentation locks are used for concurrency, reducing race conditions on the region.
  • Then, the capacity under the load factor is maintained so that collisions are as few as possible to avoid the threads of race conditions crowding together. When the capacity is insufficient, the table is expanded in the following manner: transferIndex Records the areas that are not allocated for expansion. TransferIndex is updated competitively through the CAS. Each thread is allocated and processes an area to complete the transfer of the table to the nextTable. Thus, speed through the critical section.
  • In order to observe the concurrency intensity of a certain area, the lock mode is changed from ReentrantLock to Synchronized. With the lock expansion feature of Synchronized, different lock schemes are dynamically used according to the different concurrency intensity. During capacity expansion, observe each area again.
  • To process the logic faster, the length of the table region is always a power of 2, so that the logic itself can be processed faster through bitwise operations.
  • Finally, maintain ConcurrentHashMap states with int sizeCtr: initial state, initialization state, expansion state, expansion state. The way sizeCtr works is the way to understand ConcurrentHashMap.
  • As for whether a region is a linked list or a red-black tree, the transformation is also done to speed up the index, and the ultimate goal is to pass the critical section faster.

Above, wrong place, not stingy give instruction.