Be serious about writing, not clickbait.
The article is available at Github.com/niumoo/Java… And program ape Alang’s blog, point attention, don’t get lost.
ConcurrentHashMap is a thread-safe HashMap that has been used a lot in recent years. So what is its storage structure and implementation principle?
1. ConcurrentHashMap 1.7
1. Storage structure
ConcurrnetHashMap consists of several segments, each of which is a hashMap-like structure. Therefore, each HashMap can be expanded internally. However, the number of segments cannot be changed once initialized. The default number of segments is 16. You can also consider ConcurrentHashMap to support up to 16 concurrent threads by default.
2. The initialization
The initialization process of ConcurrentHashMap is explored through the no-argument construction of ConcurrentHashMap.
/** * Creates a new, empty map with a default initial capacity (16), * Load factor (0.75) and concurrencyLevel (16). */
public ConcurrentHashMap(a) {
this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
}
Copy the code
The parameterless construct calls the parameterless construct, passing in the default values of the three parameters, whose values are.
/** * Default initialization capacity */
static final int DEFAULT_INITIAL_CAPACITY = 16;
/** * Default load factor */
static final float DEFAULT_LOAD_FACTOR = 0.75 f;
/** * Default concurrency level */
static final int DEFAULT_CONCURRENCY_LEVEL = 16;
Copy the code
Then look at the internal implementation logic of the parameterized constructor.
@SuppressWarnings("unchecked")
public ConcurrentHashMap(int initialCapacity,float loadFactor, int concurrencyLevel) {
// Check parameters
if(! (loadFactor >0) || initialCapacity < 0 || concurrencyLevel <= 0)
throw new IllegalArgumentException();
// Verify the concurrency level, greater than 1<<16, reset to 65536
if (concurrencyLevel > MAX_SEGMENTS)
concurrencyLevel = MAX_SEGMENTS;
// Find power-of-two sizes best matching arguments
PI to the power of 2
int sshift = 0;
int ssize = 1;
// This loop finds the nearest power of 2 above concurrencyLevel
while (ssize < concurrencyLevel) {
++sshift;
ssize <<= 1;
}
// Record segment offset
this.segmentShift = 32 - sshift;
// Record the segment mask
this.segmentMask = ssize - 1;
// Set the capacity
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
// c = capacity/ssize, default 16/16 = 1
int c = initialCapacity / ssize;
if (c * ssize < initialCapacity)
++c;
int cap = MIN_SEGMENT_TABLE_CAPACITY;
// The Segment's hashmap-like capacity is at least 2 or a multiple of 2
while (cap < c)
cap <<= 1;
// create segments and segments[0]
// Create Segment array and set segments[0]
Segment<K,V> s0 = new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
(HashEntry<K,V>[])new HashEntry[cap]);
Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
this.segments = ss;
}
Copy the code
To summarize the initialization logic for ConcurrnetHashMap in Java 7.
- Verify necessary parameters.
- ConcurrencyLevel specifies the concurrencyLevel. If the value is greater than the maximum value, the value is reset to the maximum value. The default value is 16.
- Find the nearest power of 2 above concurrencyLevel as the initial capacity size. The default is 16.
- Record the segmentShift offset, which is N in [capacity = 2 ^ N] and will be used later in the Put calculation. The default is 32-sshift = 28.
- Record the segmentMask, default is ssize-1 = 16-1 = 15.
- Initialize segments[0], default size is 2, load factor is 0.75, expansion threshold is 2*0.75=1.5, expansion is performed only when the second value is inserted.
3. put
Follow the initialization parameters above to see the source of the PUT method.
/**
* Maps the specified key to the specified value in this table.
* Neither the key nor the value can be null.
*
* <p> The value can be retrieved by calling the <tt>get</tt> method
* with a key that is equal to the original key.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>
* @throws NullPointerException if the specified key or value is null
*/
public V put(K key, V value) {
Segment<K,V> s;
if (value == null)
throw new NullPointerException();
int hash = hash(key);
// Move the hash value unsigned 28 bits to the right (obtained at initialization) and then do an and with segmentMask=15
// Add the upper 4 bits to the segmentMask (1111)
int j = (hash >>> segmentShift) & segmentMask;
if ((s = (Segment<K,V>)UNSAFE.getObject // nonvolatile; recheck
(segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment
// If the Segment is null, initialize it
s = ensureSegment(j);
return s.put(key, hash, value, false);
}
/**
* Returns the segment for the given index, creating it and
* recording in segment table (via CAS) if not already present.
*
* @param k the index
* @return the segment
*/
@SuppressWarnings("unchecked")
private Segment<K,V> ensureSegment(int k) {
final Segment<K,V>[] ss = this.segments;
long u = (k << SSHIFT) + SBASE; // raw offset
Segment<K,V> seg;
// Check whether the Segment at u is null
if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
Segment<K,V> proto = ss[0]; // use segment 0 as prototype
// Get the initial length of HashEntry
from segment 0
,v>
int cap = proto.table.length;
// get the loadFactor from the hash table in segment 0. The loadFactor is the same for all segments
float lf = proto.loadFactor;
// Calculate the expansion threshold
int threshold = (int)(cap * lf);
// Create a HashEntry array with cap capacity
HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) { // recheck
// Check if the Segment at u is null, because another thread may be manipulating it
Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
// Check whether the Segment at position u is null
while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
== null) {
// The CAS assignment will only succeed once
if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
break; }}}return seg;
}
Copy the code
The above source code analyzes the process of putting a data in ConcurrentHashMap.
-
Calculates the location of the key to put, and gets the Segment at the specified location.
-
If the Segment at the specified location is empty, initialize the Segment.
Initialize Segment flow:
- Check if the Segment of the computed location is null.
- To continue initialization for NULL, create a HashEntry array using the capacity and load factor of Segment[0].
- Check again if the Segment computed at the specified location is null.
- Initialize this Segment with the created HashEntry array.
- Spin determines whether the Segment computed at the specified location is null. CAS assigns the Segment to that location.
-
Segment. Put Insert key,value value.
The operations to get and initialize segments were explored above. The Segment put method in the last line is not checked yet.
final V put(K key, int hash, V value, boolean onlyIfAbsent) {
// Obtain the ReentrantLock exclusive lock. If the lock cannot be obtained, scanAndLockForPut can be obtained.
HashEntry<K,V> node = tryLock() ? null : scanAndLockForPut(key, hash, value);
V oldValue;
try {
HashEntry<K,V>[] tab = table;
// Calculate the data location to put
int index = (tab.length - 1) & hash;
// CAS gets the index value
HashEntry<K,V> first = entryAt(tab, index);
for (HashEntry<K,V> e = first;;) {
if(e ! =null) {
// Check if the key already exists. If so, traverse the list to find the location and replace the value
K k;
if ((k = e.key) == key ||
(e.hash == hash && key.equals(k))) {
oldValue = e.value;
if(! onlyIfAbsent) { e.value = value; ++modCount; }break;
}
e = e.next;
}
else {
// first has a value but index has a value.
if(node ! =null)
node.setNext(first);
else
node = new HashEntry<K,V>(hash, key, value, first);
int c = count + 1;
// If the capacity is greater than the threshold and smaller than the maximum capacity, expand the capacity
if (c > threshold && tab.length < MAXIMUM_CAPACITY)
rehash(node);
else
// The index position assigns node, which may be an element or the head of a linked list
setEntryAt(tab, index, node);
++modCount;
count = c;
oldValue = null;
break; }}}finally {
unlock();
}
return oldValue;
}
Copy the code
Because a Segment inherits a ReentrantLock, it is very easy to acquire locks from within the Segment. Put processes use this functionality.
-
TryLock () obtains the lock, if not, proceed with scanAndLockForPut.
-
Calculate the index position to put the put data into, and then get the HashEntry at that position.
-
Iterate over put new elements, why iterate over them? Because the HashEntry you get here could be an empty element or a linked list already exists, you treat it differently.
If the HashEntry at this location does not exist:
- If the current capacity is greater than the threshold or smaller than the maximum capacity, expand the capacity.
- Direct head insertion.
If a HashEntry exists at this location:
- Check whether the current Key and hash values of the linked list elements match the Key and hash values to be put. If yes, replace the value
- If not, obtain the next node in the list until the value is replaced, or the end of the list does not have the same.
- If the current capacity is greater than the threshold or smaller than the maximum capacity, expand the capacity.
- Direct linked list head insert method.
-
If the position to be inserted already exists, return the old value after the replacement, otherwise return NULL.
The scanAndLockForPut operation in the first step is not covered here; all this method does is spin tryLock() continuously to get the lock. Block with lock() to acquire the lock when the number of spins is greater than the specified number. Gets the HashEntry of the hash position along the table at spin.
private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
HashEntry<K,V> first = entryForHash(this, hash);
HashEntry<K,V> e = first;
HashEntry<K,V> node = null;
int retries = -1; // negative while locating node
// Spin the lock
while(! tryLock()) { HashEntry<K,V> f;// to recheck first below
if (retries < 0) {
if (e == null) {
if (node == null) // speculatively create node
node = new HashEntry<K,V>(hash, key, value, null);
retries = 0;
}
else if (key.equals(e.key))
retries = 0;
else
e = e.next;
}
else if (++retries > MAX_SCAN_RETRIES) {
// After a specified number of spins, block until the lock is acquired
lock();
break;
}
else if ((retries & 1) = =0 &&
(f = entryForHash(this, hash)) ! = first) { e = first = f;// re-traverse if entry changed
retries = -1; }}return node;
}
Copy the code
4. The expansion rehash
The expansion of ConcurrentHashMap will only be doubled. When the data in the old array is moved to the new array, the position will either remain unchanged or change to index+ oldSize. The node in the parameter will be inserted into the specified position after expansion using the linked list header method.
private void rehash(HashEntry<K,V> node) {
HashEntry<K,V>[] oldTable = table;
/ / old capacity
int oldCapacity = oldTable.length;
// New capacity, twice as large
int newCapacity = oldCapacity << 1;
// New expansion threshold
threshold = (int)(newCapacity * loadFactor);
// Create new array
HashEntry<K,V>[] newTable = (HashEntry<K,V>[]) new HashEntry[newCapacity];
// New mask, default 2 is 4 after expansion, -1 is 3, binary is 11.
int sizeMask = newCapacity - 1;
for (int i = 0; i < oldCapacity ; i++) {
// Iterate over the old array
HashEntry<K,V> e = oldTable[i];
if(e ! =null) {
HashEntry<K,V> next = e.next;
// Calculate the new location, the new location can only be inconvenience or old location + old capacity.
int idx = e.hash & sizeMask;
if (next == null) // Single node on list
// If the current position is not yet a list, but an element, assign the value directly
newTable[idx] = e;
else { // Reuse consecutive sequence at same slot
// If it is a linked list
HashEntry<K,V> lastRun = e;
int lastIdx = idx;
// The new location can only be inconvenience or old location + old capacity.
// The lastRun element positions are the same after the loop
for(HashEntry<K,V> last = next; last ! =null; last = last.next) {
int k = last.hash & sizeMask;
if (k != lastIdx) {
lastIdx = k;
lastRun = last;
}
}
// the element position after lastRun is the same, directly as the list assigned to the new position.
newTable[lastIdx] = lastRun;
// Clone remaining nodes
for(HashEntry<K,V> p = e; p ! = lastRun; p = p.next) {// Iterate over the rest of the elements, inserting the header to the specified position k.
V v = p.value;
int h = p.hash;
int k = h & sizeMask;
HashEntry<K,V> n = newTable[k];
newTable[k] = newHashEntry<K,V>(h, p.key, v, n); }}}}// Insert a new node
int nodeIndex = node.hash & sizeMask; // add the new node
node.setNext(newTable[nodeIndex]);
newTable[nodeIndex] = node;
table = newTable;
}
Copy the code
Some of you might be confused by the last two for loops, where the first for looks for a node where all the new positions of the next nodes are the same. And then assign this as a linked list to the new location. The second for loop inserts the remaining elements into the list at the specified location via header insertion. The reason for this may be based on probability statistics, students who have in-depth research can give their opinions.
5. get
It’s pretty easy from here, the GET method takes only two steps.
- Calculate the location of the key.
- Traverses the specified location to find the value of the same key.
public V get(Object key) {
Segment<K,V> s; // manually integrate access methods to reduce overhead
HashEntry<K,V>[] tab;
int h = hash(key);
long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
// Calculate the location of the key
if((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) ! =null&& (tab = s.table) ! =null) {
for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
(tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE); e ! =null; e = e.next) {
// If the list is linked, the value of the same key is found.
K k;
if ((k = e.key) == key || (e.hash == h && key.equals(k)))
returne.value; }}return null;
}
Copy the code
2. ConcurrentHashMap 1.8
1. Storage structure
The ConcurrentHashMap of Java8 has changed a lot compared to that of Java7. Instead of the Segment array + HashEntry array + linked list, the ConcurrentHashMap of Java8 is Node array + linked list/red-black tree. When the collision chain expression reaches a certain length, the linked list is transformed into a red-black tree.
2. Initialize initTable
/** * Initializes table, using the size recorded in sizeCtl. */
private final Node<K,V>[] initTable() {
Node<K,V>[] tab; int sc;
while ((tab = table) == null || tab.length == 0{/ / if sizeCtl <0", indicating that the other thread successfully executes the CAS and is initializing the CAS.if ((sc = sizeCtl) < 0)
// Allocate CPU usage
Thread.yield(); // lost initialization race; just spin
else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
try {
if ((tab = table) == null || tab.length == 0) {
int n = (sc > 0)? sc : DEFAULT_CAPACITY;@SuppressWarnings("unchecked")
Node<K,V>[] nt = (Node<K,V>[])newNode<? ,? >[n]; table = tab = nt; sc = n - (n >>>2); }}finally {
sizeCtl = sc;
}
break; }}return tab;
}
Copy the code
From the source, you can see that the initialization of ConcurrentHashMap is done through spin and CAS operations. One thing to note inside is the variable sizeCtl, whose value determines the current initialization state.
- -1 Indicates that initialization is in progress
- -n Indicates that n-1 threads are expanding capacity
- Indicates the initialization size of the table if the table is not initialized
- Indicates the capacity of the table, if the table has been initialized.
3. put
Go through the put source code directly.
public V put(K key, V value) {
return putVal(key, value, false);
}
/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
// Key and value cannot be empty
if (key == null || value == null) throw new NullPointerException();
int hash = spread(key.hashCode());
int binCount = 0;
for (Node<K,V>[] tab = table;;) {
// f = target position element
Node<K,V> f; int n, i, fh;// fh is used to store the hash value of the element at the target position
if (tab == null || (n = tab.length) == 0)
// If the bucket is empty, initialize the bucket (spin +CAS)
tab = initTable();
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
// If the bucket is empty and the CAS is not locked, then the CAS will break
if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
break; // no lock when adding to empty bin
}
else if ((fh = f.hash) == MOVED)
tab = helpTransfer(tab, f);
else {
V oldVal = null;
// Use synchronized to join a node
synchronized (f) {
if (tabAt(tab, i) == f) {
// It is a linked list
if (fh >= 0) {
binCount = 1;
// Loop to add new or overwrite nodes
for (Node<K,V> e = f;; ++binCount) {
K ek;
if(e.hash == hash && ((ek = e.key) == key || (ek ! =null && key.equals(ek)))) {
oldVal = e.val;
if(! onlyIfAbsent) e.val = value;break;
}
Node<K,V> pred = e;
if ((e = e.next) == null) {
pred.next = new Node<K,V>(hash, key,
value, null);
break; }}}else if (f instanceof TreeBin) {
/ / the red-black tree
Node<K,V> p;
binCount = 2;
if((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! =null) {
oldVal = p.val;
if(! onlyIfAbsent) p.val = value; }}}}if(binCount ! =0) {
if (binCount >= TREEIFY_THRESHOLD)
treeifyBin(tab, i);
if(oldVal ! =null)
return oldVal;
break;
}
}
}
addCount(1L, binCount);
return null;
}
Copy the code
-
Calculate the Hashcode based on the key.
-
Check whether initialization is required.
-
That is, the Node located for the current key. If it is empty, data can be written to the current position. CAS is used to attempt to write data.
-
If the current location hashCode == MOVED == -1, you need to expand the capacity.
-
If none is met, data is written using a synchronized lock.
-
If the number is greater than TREEIFY_THRESHOLD, it is converted to a red-black tree.
4. get
Get process is relatively simple, directly through the source code.
public V get(Object key) {
Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
// Hash position of the key
int h = spread(key.hashCode());
if((tab = table) ! =null && (n = tab.length) > 0 &&
(e = tabAt(tab, (n - 1) & h)) ! =null) {
// The hash value of the header is the same if the element at the specified position exists
if ((eh = e.hash) == h) {
if((ek = e.key) == key || (ek ! =null && key.equals(ek)))
// The hash value of key is equal, and the value of key is the same
return e.val;
}
else if (eh < 0)
// If the hash value of the header node is less than 0, it indicates expansion or red-black tree
return(p = e.find(h, key)) ! =null ? p.val : null;
while((e = e.next) ! =null) {
// it is a linked list
if(e.hash == h && ((ek = e.key) == key || (ek ! =null && key.equals(ek))))
returne.val; }}return null;
}
Copy the code
To summarize the GET process:
- Calculates the position based on the hash value.
- Find the specified location, if the head node is to find, return its value.
- If the hash value of the head node is less than 0, it indicates expansion or red-black tree.
- If it’s a linked list, walk through it and find it.
Conclusion:
In general, ConcruuentHashMap changes a lot in Java8 compared to Java7,
3. Summary
In Java7, ConcruuentHashMap uses a segmented lock, which means that only one thread can operate on each Segment at a time. Each Segment is a hashmap-like array structure that can be expanded and its conflicts converted to a linked list. But the number of segments cannot be changed once initialized.
Synchronized locking CAS mechanism used by ConcruuentHashMap in Java8. The structure also evolved from the Segment array + HashEntry array + linked list in Java7 to the Node array + linked list/red-black tree. Node is a HashEntry like structure. When its conflicts reach a certain size, it will be transformed into a red-black tree, and when the number of conflicts is less than a certain number, it will return to the linked list.
Some students may have doubts about the performance of Synchronized. In fact, the performance of Synchronized lock is no longer a problem after the introduction of the lock upgrade strategy. Interested students can learn about the upgrade of Synchronized by themselves.
Hello world:) I’m Aaron, a tech tool guy on the front line. The article continues to update, can pay attention to the public number “program ape A Lang” grow together. This article has been received at Github.com/niumoo/Java… And “Unread Code,” lots of knowledge and a series of articles.