The ConcurrentHashMap source has been widely parsed on the web. This article focuses on its traverser-based implementation and attempts to parse it carefully.

ConcurrentHashMap Traverser is used primarily for traversal of internal arrays. Traverser provides a good design implementation for traversing the output of internal arrays in a way that allows other threads to perform properly (without locking).

1. Related concepts

1.1ConcurrentHashMapSupport methodsforEach(Consumer<? super K> action),keys,elements,containsAnd so on, their internal implementations are based onTraverser;1.2ConcurrentHashMapTraversal can ensure excellent performance and concurrency security during internal array expansion. 1.3TraverserThe main methods:advance, is represented as: advance, and during expansionbooleanvariableadvanceIt has a similar meaning, both advancing indexes(index)Growth in order to continuously advance element positioning. 1.4 Understanding in classesTraverserThe names of the variables in thebaseXxxThe variable is directed againstConcurrentHashMapOld table, nobase-Prefixed for both old and new tables (current traversal context).

2.

2.1 ConcurrentHashMap unifies the basic implementation of Traverser, that is, Traverser is responsible for advancing elements and returning them; BaseIterator is responsible for judging elements and providing iterative implementations: hasNext, Next; XxxIterator is BaseIterator that provides different implementations: iteration for keys, iteration for value, iteration for entry.

2.2TraverserClass keeps the index of the iteration:index, and the pairConcurrentHashMapAn internal array of the table — either an old table or a reference to a new table

    static class Traverser<K.V> {
        // The default is the current table, which will be replaced intermittently with nextTable if it is being reorganized
        Node<K,V>[] tab;        // current table; updated if resized
        // Next is a key base variable, as the current element in the iteration, mainly implemented: next() and hasNext()
        Node<K,V> next;         // the next entry to use
        // To achieve reuse, use a TableStack to replace the two variables with each other
        TableStack<K,V> stack, spare; // To save/restore on ForwardingNodes Save or restore
        // Resize can be baseIndex+ baseIndex; // Resize can be baseIndex+ baseIndex.
        int index;              // index of bin to use next
        /** * base* is based on the old TAB. BaseSize is the size of the old TAB. BaseLimit is the range to traverse (that is, limit <= size) */
        int baseIndex;          // current index of initial table
        int baseLimit;          // index bound for initial table
        final int baseSize;     // initial table size

        Traverser(Node<K,V>[] tab, int size, int index, int limit) {
            this.tab = tab;
            this.baseSize = size;
            this.baseIndex = this.index = index;
            this.baseLimit = limit;
            this.next = null; }// ...}Copy the code

In Traverser constructor, the current Traverser references are limit = size, that is, the Traverser currently iterates through the entire internal array between 0 and limit = size.

2.3 Traverser exposed method (default), only advance Advance is responsible for advancing elements and, during the expansion phase, for element traversal by indexes into new tables.

        final Node<K,V> advance(a) {
            Node<K,V> e;
            // This is the process of the external re-entry after the external element is fetched successfully (index increment)
            // Try to assign a value of e.ext. If not null, the list is iterated or the tree is traversed without advancing to the next node
            if((e = next) ! =null)
                e = e.next;
            for (;;) {
                // t -> table, table to be traversed
                // I -> index; n -> length/number
                Node<K,V>[] t; int i, n;  // must use locals in checks
                // If the value is null, proceed to the next node
                if(e ! =null)
                    return next = e;
                /** * baseIndex >= baseLimit; /** * baseIndex >= elimit; * This also causes the value of n to bounce repeatedly, sometimes the length of old Table and sometimes the length of nextTable * but t.length (the size of old and new tables) is always greater than the offset index currently traversed. * and it is possible that :(index = I + baseSize) >= n */
                if (baseIndex >= baseLimit || (t = tab) == null ||
                    (n = t.length) <= (i = index) || i < 0)
                    return next = null;
                /** * If the hash of the node is < 0, there are several scenarios: * if the ForwardingNode(hash=-1) indicates that the old table node has been moved in the regrouping, * uses the algorithm: For nodes that have been moved (ForwardingNode), change to a new table (nextTable) with index unchanged, * continue pops out, ready to iterate to the new table * Stores information into the temporary stack * pushes the index of the node onto the stack, TabAt (t, I) * starts the next round of traversal. TabAt is already for nextTable and can no longer be ForwardingNode. Replace TAB with old table with recoverState after traversing nextTable and advance index * In recoverState, set index = index+baseSize *, that is, increase index to the position of the element index after rehash in the new table. So that the next loop starts at this location * and concludes: that is, after the old table encounters a ForwardingNode, the traversal will switch to the new table, * traversing the elements on index index and baseIndex + baseSize */
                if((e = tabAt(t, i)) ! =null && e.hash < 0) {
                    if (e instanceof ForwardingNode) {
                        tab = ((ForwardingNode<K,V>)e).nextTable;
                        // Assign null and continue to recoverState
                        e = null;
                        // Save temporary information (current index to location, old table size, old table reference)
                        pushState(t, i, n);
                        // What happens if the expansion phase is completed here?
                        continue;
                    }
                    // The hash of the tree node type is also negative: -2
                    else if (e instanceof TreeBin)
                        e = ((TreeBin<K,V>)e).first;
                    else
                        e = null;
                }
                /** * No matter what, the program will always run to this judgment. * If stack is not null, index I has been expanded and rehashed, * then the index on the new table needs to be traversed: BaseIndex and baseIndex+baseSize, * reassigning the old table, starting with ++baseIndex */
                if(stack ! =null)
                    // Assist to jump to 2x index of the new table
                    recoverState(n);
                else if ((index = i + baseSize) >= n)
                    // Over n (TAB size, revert to old table offset size and add 1)
                    // If this happens, access back to the old slot
                    index = ++baseIndex; // visit upper slots if present}}Copy the code

Simplify code to describe iterations:

A. The traverser performs ordinary traversal and obtains nodes by increasing the index (+1), which may be linked list structure, tree node, or NULL. If the index is a linked list node, the next node of the index is not pushed until null. If it is a tree node, obtain e.first and then obtain the next tree node through E. next until it is null. If null, advance index directly.

B. If the node is a ForwardingNode (other than the preceding three types), the current traversal data (or the old table) is replaced with a new array (the new table) and the traversal information is saved, including the current traversal index, the old table, and the size of the old table.

C. Based on the new table, obtain the baseIndex node in the first step on the new table and use recoverState to advance the index. The advance algorithm is as follows: index = baseIndex + baseSize. In ConcurrentHashMap, the size of the new array is 2 x the size of the old table.

D. Based on the new table, after obtaining and using the elements on baseIndex + baseSize, advance the index. At this time, the algorithm will compare whether the size of this advance exceeds the size of the new table. The propulsion algorithm is:

/** * n is always the size of the new table, that is: 2 * baseSize * Possibly pops traversal state. * * @param n length of current table */ private void recoverState(int n) { / /... If (s == null && (index += baseSize) >= n) index = ++baseIndex; if (s == null && (index += baseSize) >= n) index = ++baseIndex; }Copy the code

E. Finally, the traverser exits through boundary judgment.

2.4 Advance completes the implementation of switching traversal between old and new tables, mainly with the help of two methods: pushState and recoverState. PushState stores temporary information, while recoverState is responsible for increasing the index span when the table is in a new table. It is also responsible for reversing the index to the index position of the old table and +1 when the table is switched to the old table.

        /** * 
 * For reuse purposes, * spare is always cleaned up, called null, and an empty stack is assigned in recoverState * Stack is always in charge in order to be used in recoverState, Spare * 

* Saves traversal state upon encountering a forwarding node. */
private void pushState(Node<K,V>[] t, int i, int n) {
/ / spare. Set aside
TableStack<K,V> s = spare; // reuse if possible
if(s ! =null)
spare = s.next;
else
s = new TableStack<K,V>();
// Store old tables
s.tab = t;
// Store the old table length
s.length = n;
// Store the index of the old table traversed
s.index = i;
s.next = stack;
// Stack stores nodes
stack = s;
}

/** * n is always the size of the new table, i.e. 2 * baseSize * Possibly pops traversal state. **@param n length of current table
*/

private void recoverState(int n) {
TableStack<K,V> s; int len;
// The first time this method provides cross-increment for index, adjust index = baseIndex + baseSize
// The index is always < n, so the loop is not valid, exit to the external method to fetch the element
BaseSize = 2*baseSize + baseIndex > n
// In this case, the element on the index corresponding to the new table has already been fetched
while((s = stack) ! =null && (index += (len = s.length)) >= n) {
n = len;
index = s.index;
// Assign the old table back to the traverser, and set the stored value to NULL
tab = s.tab;
s.tab = null;
TableStack<K,V> next = s.next; // null
s.next = spare; // save for reuse
stack = next;
spare = s;
}
// Restore the index to baseIndex + 1
// Change index to the old table index and increment to the next element
// s == null as a prerequisite, indicating that the old table is returned for operation
if (s == null&& (index += baseSize) >= n) index = ++baseIndex; }}Copy the code

    pushStaterecoverStateIn addition to implementing the index transformation, actually toTableStackThe operation is also achieved by multiplexing a separate oneTableStackThe realization of the stack out of the stack, the effect of repeated use. Mainly throughTableStacknextProperties. Because in the algorithm, by judgmentTableStackWhether it isnullTo control the restoration of the index to the old table index as a prerequisite+ 1, so the only one that needs to be instantiatedTableStackConstantly associated withnextProperties for easy next use fromnextTo get the object. The assignmentTableStacknullIs in therecoverStateIs processed in. It is associated by two Pointers in a continuous loopnextProperty to achieve the effect of reuse.          

2.5 Other Issues:

  • When traversingForwardingNode, if the information is temporarily storedConcurrentHashMapWhat happens when the expansion is complete?
    • In fact,TraverserTraverser has little to do with external Traverser (handling new tables) because at the time of creation the Traverser already uses internal variables to store references to the old table, so even if the external Traverser is finished, the Traverser will be traversedConcurrentHashMapThe table reference is updated to the new table, butTraverserThe old table references are still there, and you can consistently iterate over the old table, but at this point, because the old table is already fullForwardingNodeSo it’s going to keep passingForwardingNodeFind the new table and proceedTableStackFor each element traversalbaseIndexbaseIndex + baseSizeGets node data from the index. (Weak consistency)
  • When traversal, new elements are added, if added toindexWhat happens before you index it?
    • Nothing is going to happen in theTraverserAccording to the algorithm, even if this happens, it is guaranteed consistency and has no effect on traversal. Of course if it is added toindexAfter that, it can still be traversed.

Conclusion:

  • If you can visualize traversal as two tables, one old table and one new table (the new table is twice the size of the old table), and index movement encounters elements identified asrehashedWhen, switch to the new table to do (2 different index) element fetch, that thisTraverserIt’s not hard to understand.
  • In order to preserve temporary information during traversal, and to reuse it as much as possible,TraverserTo achieve theTableStackThe structure, although it looks a bit convoluted, but multiplexed barre
  • TraverserThe traversal implementation of theConcurrentHashMapThe expansion of the new array is 2 times the expansion of the new array
  • The schematic diagram is as follows: