Source shallow in shallow out Java ConcurrentHashMap

Ali interviewers often ask about HashMap and ConcurrentHashMap. No interviewer will be able to stump you after reading this article.

Old readers please unscrupulous place praise it, wechat search [Silent King 2] concern about this programmer in luoyang, the ancient capital of nine dynasties. This article has been included in GitHub github.com/itwanger, which also contains the interview questions I carefully prepared for you.

HashMap is a very powerful data structure in Java that is used so frequently that almost all applications use it. But what about HashMap, which is not thread-safe and cannot be used in a multithreaded environment?

1) Hashtable, an old synchronous Hashtable with t in lower case, looks unprofessional:

public class Hashtable<K.V>
        extends Dictionary<K.V>
        implements Map<K.V>, Cloneable.java.io.Serializable {
    public synchronized V put(K key, V value) {}
    public synchronized int size(a) {}
    public synchronized V get(Object key) {}}Copy the code

It’s all synchronized, and that’s a lot of synchronization, right? In this case, performance is not guaranteed. Pass.

2) Collections. SynchronizedMap (new HashMap < String, the String > ()), you can wrap a HashMap as a synchronous synchronizedMap:

private static class SynchronizedMap<K.V>
        implements Map<K.V>, Serializable {
    public int size(a) {
        synchronized (mutex) {return m.size();}
    }
    public V get(Object key) {
        synchronized (mutex) {return m.get(key);}
    }
    public V put(K key, V value) {
        synchronized (mutex) {returnm.put(key, value); }}}Copy the code

SynchronizedMap is an improvement over Hashtable. Synchronized is no longer placed on methods, but within methods, as a synchronized block, but is still an object-level lock. Both reads and writes require the lock and, essentially, only one thread is allowed access. Other threads are excluded.

3) ConcurrentHashMap, the only correct answer. Concurrent means Concurrent, so ConcurrentHashMap is a HashMap that can be used in multithreaded environments.

ConcurrentHashMap is evolving, and Java 7 is very different from Java 8. The Java 7 version of ConcurrentHashMap is segmented locking, which splits the interior into different segments, each containing a HashEntry array.

Take a look at the Segment:

static final class Segment<K.V> extends ReentrantLock implements Serializable {
       transient volatile HashEntry<K,V>[] table;
       transient int count;
       transient int modCount;
       transient int threshold;
       final float loadFactor;
}
Copy the code

Take a look at HashEntry:

static final class HashEntry<K.V> {
        final K key;                       // Declare key final
        final int hash;                   // Declare the hash to be final
        volatile V value;                 // Declare value as volatile
        final HashEntry<K,V> next;      // Declare next as final

        HashEntry(K key, int hash, HashEntry<K,V> next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value = value; }}Copy the code

Much like a HashMap, the only difference is that the value is volatile, ensuring visibility on get.

The Segment inherits from ReentrantLock, so it does not need to be synchronized, either through put or get, like Hashtable. There is no effect on segments accessed by other threads.

Java 8 and later versions improved greatly on this, eliminating the piecewise locking mechanism and using CAS (Compare and Swap) and synchronized to ensure concurrency. The Segment is still defined internally, but only for compatibility when serializing.

/** * Stripped-down version of helper class used in previous version, * declared for the sake of serialization compatibility. */
static class Segment<K.V> extends ReentrantLock implements Serializable {
    final float loadFactor;
    Segment(float lf) { this.loadFactor = lf; }}Copy the code

The underlying structure is also a bit different from Java 7, closer to HashMap (array + bidirectional list + red-black tree) :

Take a look at the key fields defined by the new ConcurrentHashMap:

public class ConcurrentHashMap<K.V> extends AbstractMap<K.V>
        implements ConcurrentMap<K.V>, Serializable {
    transient volatile Node<K,V>[] table;
    private transient volatile Node<K,V>[] nextTable;
    private transient volatile int sizeCtl;
}
Copy the code

1) Table, default null, initialized when first put, default size 16, used to store nodes, expansion is always a power of 2.

Take a look at the definition of Node:

static class Node<K.V> implements Map.Entry<K.V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K,V> next;
        / /...
}
Copy the code

Hashes and keys are final, just like nodes ina HashMap, because keys do not change. Val and next are volatile, ensuring visibility in multithreaded environments.

2) nextTable, which defaults to NULL, is a new array that is twice the size of the original array.

3) sizeCtl, 0 by default, controls table initialization and expansion operations. -1 indicates that the table is being initialized. -(1+ Number of threads) : The capacity is being expanded by multiple threads.

The most important Map method is put, and ConcurrentHashMap is no exception:

public V put(K key, V value) {
    return putVal(key, value, false);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if((fh = f.hash) == MOVED) tab = helpTransfer(tab, f); . Omit some code} addCount(1L, binCount);
    return null;
}
Copy the code

1) Spread () is a hash algorithm, similar to the hash() method of HashMap:

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}
Copy the code

2) If it is the first put, initTable() is called to initialize the table.

private final ConcurrentHashMap.Node<K,V>[] initTable() {
    ConcurrentHashMap.Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSetInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0)? sc : DEFAULT_CAPACITY;@SuppressWarnings("unchecked")
                    ConcurrentHashMap.Node<K,V>[] nt = (ConcurrentHashMap.Node<K,V>[])newConcurrentHashMap.Node<? ,? >[n]; table = tab = nt; sc = n - (n >>>2); }}finally {
                sizeCtl = sc;
            }
            break; }}return tab;
}
Copy the code

The outer layer uses a while loop, and if sizeCtl is found to be less than 0, that means another thread is initializing, yield yields the CPU.

When the first put is executed, U.compareAndSetInt(this, SIZECTL, sc, -1) is set to -1, indicating that the current thread is being initialized.

private static final Unsafe U = Unsafe.getUnsafe();
private static final long SIZECTL
        = U.objectFieldOffset(ConcurrentHashMap.class, "sizeCtl");
Copy the code

U is an Unsafe object (for hardware-level atomic operations, such as retrieving the memory location of an attribute and changing the field value of an object). CompareAndSetInt () is a native method for Unsafe, It changes the sizeCtl of ConcurrentHashMap to the specified value (-1).

After initialization, the table size is 16 (DEFAULT_CAPACITY).

Instead of the first put, tabAt() is called to retrieve the value (f) at the key position ((n-1) & hash) :

static final <K,V> ConcurrentHashMap.Node<K,V> tabAt(ConcurrentHashMap.Node<K,V>[] tab, int i) {
    return (ConcurrentHashMap.Node<K,V>)U.getReferenceAcquire(tab, ((long)i << ASHIFT) + ABASE);
}
Copy the code

U.getreferenceacquire () calls the Unsafe native getReferenceVolatile() method to retrieve data in the specified memory, ensuring that the data is the latest every time.

If f is null, this is the first put element in the table. CasTabAt () is called to insert Node.

static final <K,V> boolean casTabAt(ConcurrentHashMap.Node<K,V>[] tab, int i,
                                    ConcurrentHashMap.Node<K,V> c, ConcurrentHashMap.Node<K,V> v) {
    return U.compareAndSetReference(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
Copy the code

If the CAS is successful, the Node is successfully inserted. Run the addCount() method to check whether the CAS needs to be expanded.

If it fails, another thread has inserted Node ahead of time to try the next for loop, known as spin.

If f’s hash is MOVED (-1), it means that another thread is expanding, so helpTransfer() is expanding as well.

Otherwise, nodes are inserted into the appropriate location as a linked list or red-black tree, a process achieved through synchronized blocks.

synchronized (f) {
    if (tabAt(tab, i) == f) {
        if (fh >= 0) {
            binCount = 1;
            for (Node<K,V> e = f;; ++binCount) {
                K ek;
                if(e.hash == hash && ((ek = e.key) == key || (ek ! =null && key.equals(ek)))) {
                    oldVal = e.val;
                    if(! onlyIfAbsent) e.val = value;break;
                }
                Node<K,V> pred = e;
                if ((e = e.next) == null) {
                    pred.next = new Node<K,V>(hash, key,
                                              value, null);
                    break; }}}else if (f instanceof TreeBin) {
            Node<K,V> p;
            binCount = 2;
            if((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! =null) {
                oldVal = p.val;
                if(! onlyIfAbsent) p.val = value; }}}}Copy the code

1) Before inserting, call tabAt(TAB, I) == f again to determine whether f has been modified by other threads.

2) If fh (hash value of f) >= 0, it indicates that f is the head Node of the linked list. Iterate through the linked list to find the corresponding Node, update the value, otherwise insert to the end.

3) If F is a red-black tree, insert or update nodes in a red-black tree manner.

After analyzing the put() method, let’s look at the get() method:

public V get(Object key) {
    ConcurrentHashMap.Node<K,V>[] tab; ConcurrentHashMap.Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if((tab = table) ! =null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) ! =null) {
        if ((eh = e.hash) == h) {
            if((ek = e.key) == key || (ek ! =null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            return(p = e.find(h, key)) ! =null ? p.val : null;
        while((e = e.next) ! =null) {
            if(e.hash == h && ((ek = e.key) == key || (ek ! =null && key.equals(ek))))
                returne.val; }}return null;
}
Copy the code

Is it much easier?

1) If the hash values are equal ((eh = e.hash) == h), return the elements in the table array.

2) If it is a red-black tree (eh < 0), find returns as a red-black tree.

3) If it is a linked list, iterate and get value based on key.

Finally, write a ConcurrentHashMap application instance.

/ * * *@authorSilent King ii, an interesting programmer */
public class ConcurrentHashMapDemo {
    public final static int THREAD_POOL_SIZE = 5;

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> map = new ConcurrentHashMap<>();

        long startTime = System.nanoTime();
        ExecutorService crunchifyExServer = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
        for (int j = 0; j < THREAD_POOL_SIZE; j++) {
            crunchifyExServer.execute(new Runnable() {
                @SuppressWarnings("unused")
                @Override
                public void run(a) {
                    for (int i = 0; i < 500000; i++) {
                        map.put("itwanger"+i, "Silent King II."); }}}); } crunchifyExServer.shutdown(); crunchifyExServer.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);long entTime = System.nanoTime();
        long totalTime = (entTime - startTime) / 1000000L;
        System.out.println(totalTime + "ms"); }}Copy the code

I’m going to give you an exercise, if you’re interested, to replace ConcurrentHashMap with SynchronizedMap, and compare the performance differences between the two.

I am the second Silent King, a programmer who lived in Luoyang, the ancient capital of nine dynasties. Attention can improve learning efficiency, thank you for your support, Ollie to 🌹.

Note: If there are any problems with the article, please kindly correct them.

If you feel that the article is helpful to you, welcome to wechat search “Silent King two” for the first time to read, reply to the keyword “xiaobai” can be free to get my liver 40,000 + words “Java Xiaobai from the entry to wunky” 2.0 version; This article has been included at GitHub github.com/itwanger.

Source shallow in shallow out Java ConcurrentHashMap

Related Posts

Summary of Java basic knowledge

In high concurrency scenarios, how to ensure that messages sent by producers to message middleware are not lost? [Architectural Notes of Huperia]

Install TraeFIK2 in Kubernetes