Today’s sharing started, please give us more advice ~
This article summarizes the interview questions related to ConcurrentHashMap. As autumn recruitment approaches, I wish you a lot of progress as you prepare for your summer internship.
1. Describe the ConcurrentHashMap storage data structure.
-
The map structure of ConcurrentHashMap is the same as that of HashMap, which consists of array, linked list, and red-black tree.
-
ConcurrentHashMap stores data in the same unit as a HashMap, i.e., Node structure:
- The difference between ConcurrentHashMap and HashMap is that ConcurrentHashMap supports concurrent expansion and uses internal locking (spin locking + CAS + synchronized + segment locking) to ensure thread safety.
2. Can the ConcurrentHashMap load factor be specified?
The load factor of a normal HashMap can be modified, but ConcurrentHashMap cannot because its load factor is modified with the final keyword and has a fixed value of 0.75:
Load factor: indicates how full the hash table is. ~ In ConcurrentHashMap, this attribute is fixed at 0.75 and cannot be modified
Private static final float LOAD_FACTOR = 0.75f;
3. Why is the node. hash field of a Node usually >=0?
Or, how many hash values can a Node have? What are the different scenarios?
-
If node. hash = -1, the current Node is the **FWD(ForWardingNode) ** Node (the Node that has been migrated).
-
If node. hash = -2, the current Node is treed and the current Node is a TreeBin object. The TreeBin object agent operates on red-black trees.
-
If node. hash > 0, the current Node is a normal Node, possibly a linked list, or a single Node.
4. Can you briefly describe the sizeCtl fields in ConcurrentHashMap?
SizeCtr = Size Control, Size Control, sizeCtr = Size Control, Size Control, Size Control
(1) sizeCtl == 0;
IzeCtl == 0, which is the default value, means that the hash list is initialized with the default capacity of 16 when it is actually initialized for the first time.
SizeCtl == -1?
SizeCtl == -1: The hash list is being initialized. A thread is initializing the current hash table.
The hash table of ConcurrentHashMap is lazily initialized and must be initialized only once in concurrent conditions, so sizeCtl == -1 acts as an “identity lock” that prevents multiple threads from initializing the hash table.
SizeCtl is set to -1 in initTable() via CAS, indicating that a thread is initializing the hash table. Other threads cannot initialize the hash table again.
If CAS modifs the thread with the successful sizeCtl = -1 operation, the logic to initialize the hash list can be followed. Threads that fail to modify CAS, on the other hand, are constantly checked for spin until the hash list initialization is complete.
In this case, the Thread whose CAS fails will follow the following logic: the spin Thread will pass thread.yield (); Method to temporarily free up CPU resources to be used by hungrier threads. The goal is to reduce the spin drain on CPU performance.
SizeCtl > 0 after initializing hash table
If sizeCtl is greater than 0, it indicates the threshold for the next expansion after initialization or expansion.
For example, with sizeCtl =12, when the number of data in the hash table is greater than or equal to 12, the expansion operation is triggered.
④ sizeCtl < 0 && sizeCtl! What happens when lambda is equal to negative 1?
sizeCtl < 0 && sizeCtl ! If = -1, the current hash list is being expanded.
-(1 + nThreads) : n threads are being added together.
In this case, the high 16 bits of sizeCtl indicate the capacity expansion id stamp, and the low 16 bits indicate the number of threads participating in concurrent capacity expansion: 1 + nThread, that is, the number of threads participating in concurrent capacity expansion is N.
5. Would you please explain the function and calculation method of capacity expansion identification stamp?
Obtain the unique identifier of capacity expansion according to the length tab.length of the old table.
Assuming that the capacity expansion is 16 -> 32, the function of the capacity expansion identifier is that under the condition of multi-threaded concurrent capacity expansion, all threads with the capacity expansion of 16 -> 32 can participate in the concurrent capacity expansion.
sizeCtl < 0 && sizeCtl ! If = -1, the high 16 bits of sizeCtl indicate the capacity expansion identifier, and the low 16 bits indicate the number of threads participating in concurrent capacity expansion: 1 + nThread, that is, the number of threads participating in concurrent capacity expansion is N.
6. How does ConcurrentHashMap ensure the safety of writing data threads?
This question essentially asks how adding data to ConcurrentHashMap is implemented to ensure thread-safety.
ConcurrentHashMap uses internal locking (spin locking + CAS + synchronized + segmental locking) to ensure thread safety.
The process for adding data is as follows:
① First, check whether the hash list has been initialized. If not, initialize the hash list first and then write the hash list.
(2) Before writing data to the bucket, check whether the bucket is empty. If the bucket is empty, the new data node is directly written to the bucket through the CAS. If CAS fails to write, it indicates that another thread has written data to the current bucket. The current thread fails to compete, returns to spin, and spins to wait.
If the current bucket is not empty, we need to determine the type of the current bucket header:
③ If the hash value of the head node in the bucket is -1, it indicates that the current head node of the bucket is FWD, and the hash list is being expanded. The current thread needs to help with the expansion.
④ If conditions (2) and (3) are not met, the current bucket may store a linked list or a red-black tree proxy object, TreeBin. Synchronized locks the headers in the bucket to ensure thread-safe writes in the bucket.
7. Describe the Hash addressing algorithm in ConcurrentHashMap
The addressing algorithm of ConcurrentHashMap is not very different from that of HashMap:
First, the HashCode value is obtained through the Key of the Node Node, and then the HashCode value is detour through the spread(int H) method to obtain the final Hash value.
After obtaining the final hash value, use the addressing formula index = (tab.length-1) & hash to obtain the bucket index.
How does ConcurrentHashMap count the amount of data in the current hash table?
ConcurrentHashMap counts the amount of stored data through the addCount(Long x, int Check) method, essentially using the LongAdder atomic class.
ConcurrentHashMap Why not ConcurrentHashMap why not AtomicLong hash table statistics? How about the amount of hash table data?
Because the AtomicLong atomclass increment is implemented based on CAS, the problem with CAS implementation is that when a large number of threads execute CAS at the same time, only one thread succeeds, and all the other threads fail and go into spin state. The spin itself is a while(true) loop that is very resource-intensive.
So how does LongAdder maintain high performance with high concurrency?
Take a look at the operation schematic of LongAdder:
LongAdder adopts a “segmented” approach to reduce the frequency of CAS failure, typically swapping space for time:
LongAdder has a global variable volatile long Base; When the concurrency is not high, the base value is operated directly through CAS. If CAS fails, CAS operation is performed on the Cell in the Cell[] array in LongAdder to reduce the probability of failure.
For example, in the current class, base= 10, there are three threads adding 1 atomically to CAS. Thread 1 succeeds in executing base=11, and thread 2 and thread 3 fail in executing base=11, and then add 1 to Cell element in Cell[] array, which is also CAS operation. The value of each Cell in index=1 and index=2 is set to 1.
Sum = 11 + 1 + 1 = 13. The operation of LongAdder is completed. The flow chart is as follows:
9. What are the preprocessing tasks performed by the thread triggering the capacity expansion condition?
The first thing the thread that triggers the expansion condition needs to do is to change the sizeCtl field value through CAS, so that 16 bits higher is the unique identification stamp for expansion, and 16 bits lower is (number of threads participating in expansion + 1), indicating that there are threads performing expansion logic!
Note: The high 16-bit expansion unique identifier is calculated based on the current length of the hash list, and its highest bit is 1. So the sizeCtl is going to be a negative number.
The current thread that triggered the capacity expansion condition then creates a new hash list, twice the size of the old one. And assign a reference to the new hash table to the map.nexttable field.
Since ConcurrentHashMap supports concurrent expansion by multiple threads, the expansion thread needs to know the progress of the data transfer from the old hash table to the new hash table. ConcurrentHashMap provides a transferIndex field to record the total migration progress of the old hash table data! The migration progress starts from the high bucket to the bucket level with subscript 0.
10. How to mark the bucket after migration in the old loose linked list?
The bucket that has been migrated from the old hashlist needs to be represented by a ForwardingNode object. This Node is special and is used to indicate that the data in the bucket has been migrated to the bucket in the new hashlist.
What does ForwardingNode do?
-
The ForwardingNode object provides a find() method inside to query the target data in a new hash list.
-
Find () on the FWD node can be used to redirect the thread to the new hash table to find the target element.
11. What if a hash table is in storage and a new write request comes in?
There are two cases to consider here:
-
If the bucket accessed by the current thread is not a FWD node according to the addressing algorithm (that is, the data in the current bucket has not been migrated) when writing. Check whether the bucket is empty. If the bucket is empty, data is written in CAS mode. If the bucket is not empty, it could be a linked list or TreeBin, and you need to lock the head of the bucket with the synchronized keyword.
-
If the current thread is writing, the bucket accessed by the addressing algorithm is the FWD node (that is, the data in the current bucket has been migrated).
HelpTransfer (TAB, F); helpTransfer(TAB, F); Assist with capacity expansion to minimize the time spent on data migration for capacity expansion.
After the current thread joins the expansion assistance, ConcurrentHashMap assigns the current thread a migration task based on the global transferIndex field (it needs to be responsible for migrating the bucket interval of the old hash list). For example, let the current thread be responsible for migrating data from buckets [0-4] in the old hash table to the new hash table.
If the current thread does not have the task to be responsible for migration, the thread exits to assist expansion, that is, the expansion ends. At this point, the current thread can continue to execute the logic of writing data!
12. How does the expansion worker maintain the lower 16 bits of sizeCtl during capacity expansion?
Each thread that performs a capacity expansion task (including assisting capacity expansion) will update the lower 16 bits of the sizeCtl before it starts to work. If the lower 16 bits are +1, a new thread will be added to perform capacity expansion.
If the current thread’s migration task is completed and no more migration task is assigned to it, then the current thread can exit to assist in expansion. At this time, update the lower 16 bits of sizeCtl, that is, make the lower 16 bits -1. A thread exits concurrent capacity expansion.
If sizeCtl is 16 bits lower than -1 and the value is 1, the current thread is the last thread to exit concurrent capacity expansion.
13. If the bucket list is upgraded to a red-black tree and a reader thread is currently accessing the tree, what should be done if a new writer thread requests?
The writer thread will block because the red black tree is special. New data may trigger the self-balancing of the red black tree, which will cause the structure of the tree to change and affect the reading result of the reader thread!
Data read and written in the red-black tree are mutually exclusive. The detailed principle is as follows:
-
We know that the red-black tree in ConcurrentHashMap is proxied by TreeBin, which has an internal state field of type Int.
-
When reading data, the read thread uses CAS to add the state value +4 (indicating that the read lock is added). After reading data, the read thread uses CAS to add the state value -4.
-
If the writer thread writes data to the red-black tree, it checks whether the state value is 0. If the value is 0, it indicates that no reader thread is retrieving data. In this case, the writer thread can directly write data, and sets the state value to 1 (indicating that the write lock is added) through CAS.
-
If the writer thread checks that the state value is not 0, park() suspends the current thread and leaves it waiting to be woken up.
Conversely, if there is a writer thread in the red-black tree performing a write operation, what happens if a new reader thread requests?
The TreeBin object retains a linked list structure inside and is designed for this situation. The new reader thread is allowed to access the data on the linked list without going through the red-black tree.
14. After suspending a waiting writer thread, when does it wake up to continue writing?
In the previous question, we analyzed that: When reading data, the reader thread would use CAS to change the state value to +4 (indicating that the read lock is added), and then use CAS to change the state value to -4 after the data is read.
After subtracting state by 4, the reader checks to see if state is 2. If state is 2, a pending writer thread is called unsafe.unpark() to wake up the writerthread.
15. Can you describe the lastRun mechanism in ConcurrentHashMap?
Today’s share has ended, please forgive and give advice!