JDK7 ConcurrentHashMap source code analysis

Unsafe. Unsafe

1.1, Getting Unsafe

The JDK version of the code covered in this article is 1.7 Unsafe, which, as its name implies, offers low-level operations, such as the ability to directly manipulate out-of-heap memory, The monitorEnter and monitorExit features used by synchronized, The Unsafe object's default Constructor is private, so we need to get this object's Constructor via reflection. Constructor<Unsafe> declaredConstructor = Unsafe.class.getDeclaredConstructor(); declaredConstructor.setAccessible( true ); Unsafe unsafe = declaredConstructor.newInstance();Copy the code

1.2. Unsafe Operates common objects

class Person {
	private String username;

  /* getter / setter */
}

Person person = new Person();
person.setUsername( "zhangsan" );
System.out.println( person.getUsername() );
unsafe.putObjectVolatile( person,
      unsafe.objectFieldOffset( Person.class.getDeclaredField( "username")),"lisi"); System.out.println( person.getUsername() ); Assuming we have a Person class and create a Person object in the main method, we start with zhangsan and then call the unbroadening putObjectVolatile. The first printout is zhangsan. The unsafe.objectFieldOffset method is used to retrieve the offset in memory of an object's property, username. The unsafe.objectFieldOffset method is used to retrieve the offset in memory of an object's property, username. When an object is stored in the heap, it occupies a continuous space, and the properties of the object are placed somewhere in that space, which is the offset, and when we know the location of an object in memory, we can figure out the location of the property in memory based on the offset of the property, Which can directly control the memory the unsafe putObjectVolatile, is used to update an attribute values, such as update the attribute of person objects, update this property, will need to retrieve attributes of offset, Even though the objectFieldOffset method is unsafe, putOrderedObject has the same parameters and functionality as putObjectVolatile, which writes quickly and does not guarantee memory visibilityvolatileSemantics can guarantee the visibility of memory, so the difference between the two methods is whether the visibility of memory is guaranteedCopy the code

1.3. Unsafe Operates on arrays

String[] arr = new String[]{ "zhangsan1"."zhangsan2"."zhangsan3" };
long firstElementOffset = unsafe.arrayBaseOffset( arr.getClass() );
long perElementSize = unsafe.arrayIndexScale( arr.getClass() );
unsafe.putOrderedObject( arr, firstElementOffset + perElementSize , "lisi"); System.out.println(Arrays.toString( arr ) ); ArrayBaseOffset () is used to retrieve the size of the first element in an array. ArrayIndexScale () is used to retrieve the size of each element in an array. So when we want to get the offset of the second element in memory, we simply add the return values of the two methods to get it. The above code is a simple exampleCopy the code

ConcurrentHashMap source code analysis

2.1. Theoretical analysis

We know that HashMap is not thread-safe. In order to make HashMap operations thread-safe, we usually place these operations in a synchronized block. However, this causes HashMap operations to be parallel, which greatly affects efficiency. ConcurrentHashMap improves this situation. When we synchronize the HashMap, each step of the HashMap is placed in a synchronized block, while ConcurrentHashMap splits the operation. The internal data structure of ConcurrentHashMap, as shown in the figure below, has an array X in the outermost layer, and an array Y in each index. A linked list is maintained for each index in array Y. In fact, based on the previous analysis of HashMap source code, we can actually find that array Y is actually a HashMap data structure. Indeed, when we add the array Y, we will first calculate the index of the key-value in array X. In other words, each single array Y in array X has a separate lock. A put operation only locks the corresponding array Y. Instead of locking the entire array XCopy the code

2.2 data structure analysis

static final class HashEntry<K.V> {
    final int hash;
    final K key;
    volatile V value;
    volatileHashEntry<K,V> next; } In the previous section, we learned that array Y is actually the same structure as a HashMap, and that a HashEntry is the storage structure of each element in array Y, as you can see, the same structure as an Entry in a HashMapstatic final class Segment<K.V> extends ReentrantLock{
  volatile HashEntry<K,V>[] table;
  int modCount;
  int count;
  int threshold;
  floatloadFactor; } Each element in array X is a Segment object. As described in the previous section, each Segment object should have its own separate lock. The Segment object inherits its own separate lock from ReentrantLock. Each Segment maintains a HashEntry table, which is called array Y. ModCount, count(equivalent to size in HashMap), threshold, loadFactor, etc. All of these exist in a HashMap, so the Segment object is like a small HashMap, and the ConcurrentHashMap is like a set of hashMaps. In ConcurrentHashMap, the Segment array itself is not expanded, whereas the HashEntry array in the Segment is expanded. The length of the Segment array can be considered as a concurrency level. The longer the Segment, the fewer elements in each Segment and the higher the concurrency level. The shorter the Segment, the more elements in each Segment and the lower the concurrency levelCopy the code

2.3. Analysis of construction methods

public ConcurrentHashMap(a) {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
}

public ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {... } initialCapacity refers to the sum of all the sizes of array Y in array X, that is, the sum of all the HashEntry arrays. LoadFactor refers to the loadFactor. The product of the length of array Y and the loadFactor is the threshold of the number of elements that need to be reached when the array Y is expanded. ConcurrencyLevel is the concurrencyLevel, which is used to express the length of the Segment[] array, but because the parameter constructor ispublic, so allow developers to provide parameter values themselves, according to an article in a HashMap source code analysis, we learned about, calculate the index of an element should be in which position, is to use an element computed hash value and the length of the array and the operation, and one of the limit is the length of the array must be2So the argument constructor needs to calculate a value greater than the value passed in by the user and is2To the power of, and finally create the arraypublic ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
    if (concurrencyLevel > MAX_SEGMENTS)
        concurrencyLevel = MAX_SEGMENTS;
    // Find power-of-two sizes best matching arguments
    int sshift = 0;
    int ssize = 1;
    while (ssize < concurrencyLevel) {
        ++sshift;
        ssize <<= 1;
    }
    this.segmentShift = 32 - sshift;
    this.segmentMask = ssize - 1; . } Segment size (sszie); sszie(Segment Size)0001Make use of onewhileLoop to find the first value greater than concurrencyLevel2To determine the length of the segment array, while segmentMask is the value used to calculate which index of the segment array an element should be placed under. That is, after calculating a hash value based on an element, The index is then obtained based on hash & segmentMask, so segmentMask must be ssize -1, and the reason for this has been analyzed in detail in the analysis of the HashMap source code0And the low values are1), segmentShift is used to manipulate elements in the Segment array using the Unsafe object later, which we'll examine laterpublic ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
  if (initialCapacity > MAXIMUM_CAPACITY)
      initialCapacity = MAXIMUM_CAPACITY;
  int c = initialCapacity / ssize;
  if (c * ssize < initialCapacity)
      ++c;

  int cap = MIN_SEGMENT_TABLE_CAPACITY;
  while (cap < c)
      cap <<= 1; . } The second code first obtains the average size of the HashEntry in each Segment object via initialCapacity/ssize, if initialCapacity is17While ssize is16, the resulting value after the trigger is1So there will be a remainder, and that will add c1Operation to ensure that the total capacity is greater than or equal to initialCapacity. However, we assume that each Segment object should maintain a small HashMap, so should the capacity of the HashEntry array2To the power of, so cap is set to minimum first2Theta to the power of theta, and when it's less than c, we move it to the left, so we get the first value greater than or equal to c, and that value is theta2The value is the size of the HashEntry arraypublic ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
    // create segments and segments[0]
    Segment<K,V> s0 =
        new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                          (HashEntry<K,V>[])new HashEntry[cap]);
    Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
    UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
    this.segments = ss; } create a Segment object set to s0 by storing the Segment object in the Segment array index of s00Insert into the Segment[] array (s0, loadFactor, HashEntry); insert into the Segment[] array (s0, loadFactor, HashEntry); Finally, the segments member property is assigned, and using unbroadening. PutOrderedObject, the offset is specified as SBASE0The offset of, atstaticBlock initialization), placing s0 in the Segment array index as0The location of theCopy the code

2.4, put method

1. Overall analysis

public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
            (segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment
        s = ensureSegment(j);
    return s.put(key, hash, value, false); } Use the hash method to calculate the corresponding hash value of the key, which is similar to the hash method in HashMap, if you are interested to explore the details, (Hash >>> segmentShift) & segmentMask, Segment[] = Segment[]; Segment[] = Segment[]; Segment[]; Segment[]1The Segment array length should be greater than or equal to concurrentLevel, and the Segment array length should be greater than or equal to concurrentLevel2To the power of theta, so it involves a left shift operation, whereas segmentShift is equal to32Minus the number of left moves, let's say left moves4Bit, then the hash value is shifted to the right when calculating the index of the key (32 - 4 = 28After retrieving index J, the Segment object corresponding to index J in the Segment array is obtained using the Unsafe getObject method. If the object is empty, ensureSegment is used to create the object, and then the Segment's PUT method is called to place the key-value into the appropriate location in the HashEntry[] array. To obtain the specified index position of an array, the arrayBaseOffset method is used to obtain the offset of the first element in the array and the arrayIndexScale method is used to obtain the size of the object occupied by each position in the arraystaticClass sc = Segment[].class; SBASE = UNSAFE.arrayBaseOffset(sc); ss = UNSAFE.arrayIndexScale(sc); SSHIFT =31- Integer.numberOfLeadingZeros(ss); (j << SSHIFT) + SBASE), the same as j * unsafe.arrayIndexScale (Segment[].class) + c, So if you're interested, you can write a demo, so the second one in putifSelect Segment[] from Segment[] where Segment[] = jCopy the code

The ensureSegment method creates a Segment object

private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset. } select * from Segment[] where Segment[] is the offset of the Segment whose index position is kprivate Segment<K,V> ensureSegment(int k) {... Segment<K,V> seg;if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
        Segment<K,V> proto = ss[0]; // use segment 0 as prototype
        int cap = proto.table.length;
        float lf = proto.loadFactor;
        int threshold = (int)(cap * lf);
        HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
            == null) { // recheck
            Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
            while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                    == null) {
                if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                    break; }}}returnseg; } if the Segment object is null, then the Segment object is null0When the Segment array is initialized, it will be indexed to0Then other index locations can use the attributes reserved in this object to create the Segment object and then initialize the HashEntry array, and the threshold load factor, ConcurrentHashMap is designed to ensure the safety of multiple threads, that is, it may be called in multiple threads. Therefore, after the above work is done, we will check whether other threads have already added the Segment object to the index position k. If not, we will create the Segment object and use the Segment objectwhileLoop, and use the CAS to place the Segment object in the corresponding index bit. Exit only when the CAS update succeeds or the Segment object is fetched from index position KwhileThe loop, the last onewhileLoops plus CAS are used to ensure concurrency safety in multithreaded situationsCopy the code

3, Put method of Segment object

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);
    V oldValue;
    try{... }finally {
        unlock();
    }
    returnoldValue; } Take a look at this reduced code first, the top is a lock operation, and then usetry.finallyConcurrentHashMap uses a piecewise lock, that is, each put operation locks only one piece of data. Lock the Segment object itself. Because the Segment object inherits a ReentrantLock, it has all the methods of a ReentrantLock. TryLock first attempts to acquire the lock. If the lock is not acquired successfully (another thread is holding the lock), scanAndLockForPut is used to acquire the lock continuously. Since the first tryLock failed to obtain the lock, other things can be done to improve performance. Therefore, the scanAndLockForPut method will calculate the index of the key in the HashEntry array, and then scan the linked list under the index to see if there is a corresponding value. If there is no value, the corresponding HashEntry object will be created and return, So this node is actually created when there is no corresponding key in the HashEntry list where the key-value residestry.finallyThe authors of ConcurrentHashMap are very demanding in terms of performance, since they do add operations directly to the block without creating a HashEntryfinal V put(K key, int hash, V value, boolean onlyIfAbsent) {... HashEntry<K,V>[] tab = table;int index = (tab.length - 1) & hash;
    HashEntry<K,V> first = entryAt(tab, index);
    for (HashEntry<K,V> e = first;;) {
        if(e ! =null) {
            K k;
            if ((k = e.key) == key ||
                (e.hash == hash && key.equals(k))) {
                oldValue = e.value;
                if(! onlyIfAbsent) { e.value = value; ++modCount; }break;
            }
            e = e.next;
        }
        else {
            if(node ! =null)
                node.setNext(first);
            else
                node = new HashEntry<K,V>(hash, key, value, first);
            int c = count + 1;
            if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                rehash(node);
            else
                setEntryAt(tab, index, node);
            ++modCount;
            count = c;
            oldValue = null;
            break; }}... } this piece of code is intry.finallyAdd elements to the HashEntry array, use entryAt to retrieve the first element in the Segment[] array under the specified index, and then use entryAt to retrieve the first element in the Segment[] arrayforLoop through the list of elements. If the element is not empty, check whether it is equal to key. If so, onlyIfAbsent is equal tofalseOnlyIfAbsent is only available when we call ConcurrentHashMap's putIfAbsent methodtrue, or forfalseIf there is an element equal to key, then it is directbreakOtherwise, the next cycle continues until e is zeronullThere is no element equal to key in the listforIn the loopelseIf node is not empty, tryLock failed to get the lock, and then scan scanAndLockForPut for all elements that do not have the key, and create HashEntry. If c is greater than the threshold value and the length of HashEntry does not reach the threshold value, then the rehash method will be called for expansion. Otherwise, the setEntryAt method is used to set the node to the corresponding index of the HashEntry[] array. This completes the addition operation. The rehash method is a bit more complicated, mainly because the authors of ConcurrentHashMap wanted to improve performance. A simple implementation is to calculate the index of the elements in the new array and then move the elements one by one. In the rehash method, the HashEntry linked list under the index (such as X) of the HashEntry[] array is specified. After the HashEntry array is expanded, We need to recalculate the index position of the elements in the list. We have done an optimization in the rehash method, starting at the end of the list. If the index position of the elements in the list is the same, then the top element of the list is directly placed in the corresponding position of the new array. For example, if the following elements A -> B -> C -> D -> E are calculated to be in the same index position in the new array (for example, if the new index is set to Y, there is no value under the index), then the following operation will be performed (pseudocode): NewTable [Y] = C (newTable[Y] = C) (newTable[Y] = C) newTable[Y] = C (newTable[Y] = C) (newTable[Y] = CCopy the code

2.5. Get method

The get method does not need to be analyzed in detail. Given the basis of the PUT method, the get method looks very simple. It should be mentioned that the whole get method is unlocked, so be careful with this when using ConcurrentHashMapCopy the code

JDK7 ConcurrentHashMap source code analysis

Unsafe. Unsafe

1.1, Getting Unsafe

1.2. Unsafe Operates common objects

1.3. Unsafe Operates on arrays

ConcurrentHashMap source code analysis

2.1. Theoretical analysis

2.2 data structure analysis

2.3. Analysis of construction methods

2.4, put method

2.5. Get method

Related Posts

How to do Spring Cloud Zuul dynamic routing? Integrating the Nacos implementation is simple

The Security of Redis Distributed Lock

Read this article and never say that you are single by merit