JUC study notes
Thread life cycle and states
A Java thread can only be in one of six states at any given time in its life (see the figure in Section 4.1.4 of The Art of Concurrent Programming in Java). A thread is not fixed in one state; it switches between states as the code executes. (Correction from issue 736: in the transition from WAITING back to the runnable state, join is actually a method of Thread, although the original figure attributes it to Object.)
As the figure shows, after a thread is created it is in the NEW state. Once start() is called it becomes READY, and a ready thread that has been granted a CPU time slice is RUNNING. The operating system distinguishes the READY and RUNNING states, but the JVM exposes only a single RUNNABLE state (see HowToDoInJava: Java Thread Life Cycle and Thread States), so in Java these two states are collectively called RUNNABLE. Why doesn't the JVM distinguish them? (From: "How can a Java thread be in a sixth state?" — Dawell's answer.) Modern time-sharing operating systems schedule threads with preemptive round-robin scheduling based on a time slice (time quantum). The slice is very small: a thread typically runs on the CPU for at most 10–20 ms at a time (while in the running state), about 0.01 seconds, before it is switched out and placed at the end of the scheduling queue to wait for the next round (i.e., it goes back to ready). Threads switch so quickly that distinguishing the two states adds little value.
After a thread calls wait(), it enters the WAITING state and must be notified by another thread before it can return to the runnable state. TIMED_WAITING (timed waiting) is the WAITING state with a timeout limit: methods such as sleep(long millis) or wait(long millis) put a thread into TIMED_WAITING, and when the timeout expires the thread returns to RUNNABLE. When a thread calls a synchronized method without acquiring the lock, it enters the BLOCKED state. After the Runnable's run() method finishes, the thread enters TERMINATED. Related reading: "Three mistakes about thread states in The Art of Concurrent Programming in Java".
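The state transitions above can be observed directly with Thread.getState(). A minimal sketch (not from the original article; the lock object and polling loop are illustrative):

```java
public class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        final Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(); // releases the lock and enters WAITING
                } catch (InterruptedException ignored) {
                }
            }
        });
        System.out.println(t.getState()); // NEW: created but not started
        t.start();
        while (t.getState() != Thread.State.WAITING) {
            Thread.sleep(10); // wait until t has actually parked in lock.wait()
        }
        System.out.println(t.getState()); // WAITING
        synchronized (lock) {
            lock.notify(); // wake t so it can finish
        }
        t.join();
        System.out.println(t.getState()); // TERMINATED
    }
}
```

Note that between start() and the wait, the thread passes through RUNNABLE (and possibly BLOCKED while competing for the lock), which is why the sketch polls instead of assuming a fixed timing.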
Parsing the ThreadLocal keyword
preface
Your first reaction to ThreadLocal might be that it is simple: a per-thread copy of a variable, isolated for each thread. Here are a few questions worth thinking about:
- The key in ThreadLocalMap is a weak reference to the ThreadLocal; after a GC, will ThreadLocal.get() see a null key?
- ThreadLocalMap data structure in ThreadLocal?
- Hash algorithm for ThreadLocalMap?
- How to resolve Hash conflicts in ThreadLocalMap?
- How does ThreadLocalMap expand capacity?
- How to clean expired keys in ThreadLocalMap? Exploratory and heuristic clean-up processes?
- How does ThreadLocalMap.set() work?
- How does ThreadLocalMap.get() work?
- How is ThreadLocal used in your project? Any pitfalls?
Note: the source code in this article is based on JDK 1.8.
ThreadLocal code demo
Let’s take a look at an example of ThreadLocal in use:
public class ThreadLocalTest {

    private List<String> messages = Lists.newArrayList();

    public static final ThreadLocal<ThreadLocalTest> holder = ThreadLocal.withInitial(ThreadLocalTest::new);

    public static void add(String message) {
        holder.get().messages.add(message);
    }

    public static List<String> clear() {
        List<String> messages = holder.get().messages;
        holder.remove();
        System.out.println("size: " + holder.get().messages.size());
        return messages;
    }

    public static void main(String[] args) {
        ThreadLocalTest.add("Is a flower romantic?");
        System.out.println(holder.get().messages);
        ThreadLocalTest.clear();
    }
}
Running this prints the message list, then "size: 0" — after holder.remove(), the next get() triggers withInitial and creates a fresh instance with an empty list. A ThreadLocal object provides thread-local variables: each thread holds its own copy of the variable, and multiple threads do not interfere with each other.
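To make that isolation claim concrete, here is a minimal sketch (not from the original article) showing that two threads reading the same ThreadLocal see independent values:

```java
import java.util.concurrent.atomic.AtomicReference;

public class IsolationDemo {
    private static final ThreadLocal<StringBuilder> LOCAL =
            ThreadLocal.withInitial(StringBuilder::new);

    public static void main(String[] args) throws InterruptedException {
        AtomicReference<String> fromThread = new AtomicReference<>();
        Thread t = new Thread(() -> {
            LOCAL.get().append("written-by-t"); // mutates t's own copy only
            fromThread.set(LOCAL.get().toString());
        });
        t.start();
        t.join();
        // The main thread's copy is untouched by t's writes.
        System.out.println("t saw:    " + fromThread.get()); // t saw:    written-by-t
        System.out.println("main saw: '" + LOCAL.get() + "'"); // main saw: ''
    }
}
```

Each thread's get() reaches into that thread's own ThreadLocalMap, which is why the main thread's StringBuilder stays empty.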
ThreadLocal data structure
The Thread class has an instance variable threadLocals of type ThreadLocal.ThreadLocalMap; that is, each Thread owns its own ThreadLocalMap. ThreadLocalMap has its own independent implementation. You can loosely think of its key as the ThreadLocal object and its value as the value put into it (in fact the key is not the ThreadLocal itself, but a weak reference to it). When a thread puts a value into a ThreadLocal, it stores the value in its own ThreadLocalMap, and reading the value uses the ThreadLocal reference as the key to find the entry in the thread's own map — this is what achieves thread isolation. ThreadLocalMap is structurally similar to HashMap, except that HashMap is implemented as an array plus linked lists, while ThreadLocalMap has no linked-list structure. Note also the Entry class: its key is ThreadLocal<?> k, and Entry extends WeakReference, so the key is held as a weak reference.
Is the key null after GC?
Back to the question at the top: ThreadLocal's key is a weak reference, so after a GC, will ThreadLocal.get() see a null key? To answer this, we need to understand the four Java reference types:
- Strong references: the ordinary references we create with new. As long as a strong reference exists, the garbage collector will never reclaim the referenced object, even when memory is low.
- Soft references: objects referenced through SoftReference. A softly referenced object is reclaimed only when memory is about to run out.
- Weak references: objects referenced through WeakReference. Whenever garbage collection occurs, if an object is reachable only through weak references, it is collected.
- Phantom references: the weakest reference type, defined in Java with PhantomReference. The only purpose of a phantom reference is to receive a notification, via a reference queue, that an object is about to be reclaimed.
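A minimal sketch (not from the original article) of the difference between a strong and a weak reference. Note that System.gc() is only a hint, so the weakly referenced object is not guaranteed to be reclaimed on every JVM:

```java
import java.lang.ref.WeakReference;

public class ReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weakToStrong = new WeakReference<>(strong);
        WeakReference<Object> weakOnly = new WeakReference<>(new Object());

        System.gc(); // a hint only; typical HotSpot configurations do collect here

        // An object that is still strongly referenced survives GC.
        System.out.println(weakToStrong.get() == strong); // true
        // An object reachable only through a weak reference is eligible for collection.
        System.out.println(weakOnly.get()); // likely null after the GC hint
    }
}
```

This is exactly the situation of a ThreadLocalMap key: as long as a strong reference to the ThreadLocal exists elsewhere, the weak key survives GC.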
Now let's look at some code. We use reflection to inspect the data in a ThreadLocalMap after a GC (the following source is adapted from blog.csdn.net/thewindkee/… ; run it locally to observe the GC collection scenario):
public class ThreadLocalDemo {

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException, InterruptedException {
        Thread t = new Thread(() -> test("abc", false));
        t.start();
        t.join();
        System.out.println("-- after gc --");
        Thread t2 = new Thread(() -> test("def", true));
        t2.start();
        t2.join();
    }

    private static void test(String s, boolean isGC) {
        try {
            new ThreadLocal<>().set(s);
            if (isGC) {
                System.gc();
            }
            Thread t = Thread.currentThread();
            Class<? extends Thread> clz = t.getClass();
            Field field = clz.getDeclaredField("threadLocals");
            field.setAccessible(true);
            Object threadLocalMap = field.get(t);
            Class<?> tlmClass = threadLocalMap.getClass();
            Field tableField = tlmClass.getDeclaredField("table");
            tableField.setAccessible(true);
            Object[] arr = (Object[]) tableField.get(threadLocalMap);
            for (Object o : arr) {
                if (o != null) {
                    Class<?> entryClass = o.getClass();
                    Field valueField = entryClass.getDeclaredField("value");
                    Field referenceField = entryClass.getSuperclass().getSuperclass().getDeclaredField("referent");
                    valueField.setAccessible(true);
                    referenceField.setAccessible(true);
                    System.out.println(String.format("Weak reference key: %s, value: %s",
                            referenceField.get(o), valueField.get(o)));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The results look like this:
Weak reference key: java.lang.ThreadLocal@433619b6, value: abc
Weak reference key: java.lang.ThreadLocal@418a15e3, value: java.lang.ref.SoftReference@bf97a12
Weak reference key: null, value: def
Because the ThreadLocal created here is never assigned to a variable, there is no strong reference to it after new ThreadLocal<>().set(s) returns, so after the GC the key has been reclaimed — in a debugger you would see referent=null. When you first look at this question, if you don't think it through, "weak reference" plus "garbage collection" makes it feel like the key must be null. But in normal use a strong reference to the ThreadLocal exists (we keep it in a field and call threadLocal.get()), so the key is not reclaimed and is not null, as shown in the figure — the strong reference still exists. If no strong reference to the ThreadLocal exists, the key is reclaimed; however, the value is not reclaimed, and with the key gone the value can never be reached again, causing a memory leak.
ThreadLocal.set()
ThreadLocal's set() method checks whether the current thread's ThreadLocalMap exists and then stores the data accordingly. The code is as follows:
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}
The core logic lives in ThreadLocalMap; we will dissect it step by step below.
ThreadLocalMap hash algorithm
ThreadLocalMap implements its own hash algorithm to resolve hash array collisions.
int i = key.threadLocalHashCode & (len-1);
ThreadLocalMap's hash algorithm is very simple: i is the key's index in the hash array. The interesting part is how threadLocalHashCode is computed; ThreadLocal has a constant HASH_INCREMENT = 0x61c88647:
public class ThreadLocal<T> {
    private final int threadLocalHashCode = nextHashCode();

    private static AtomicInteger nextHashCode = new AtomicInteger();

    private static final int HASH_INCREMENT = 0x61c88647;

    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }

    static class ThreadLocalMap {
        ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
            table = new Entry[INITIAL_CAPACITY];
            int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
            table[i] = new Entry(firstKey, firstValue);
            size = 1;
            setThreshold(INITIAL_CAPACITY);
        }
    }
}
Each time a ThreadLocal object is created, ThreadLocal.nextHashCode increases by 0x61c88647. This is a very special value related to the Fibonacci sequence, also called the golden-ratio number. Using it as the hash increment has the benefit that hash codes are distributed very evenly across the table. We can try it ourselves and see that the resulting slots are evenly spread; the underlying Fibonacci mathematics is not covered here, but interested readers can look it up.
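A minimal sketch (not from the original article) of this even distribution: simulating 16 consecutive hash codes with increment 0x61c88647 and mapping them into a table of size 16 hits every slot exactly once, because the increment's low bits (7 mod 16) are coprime with the power-of-two table size:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class GoldenRatioHashDemo {
    private static final int HASH_INCREMENT = 0x61c88647;

    public static void main(String[] args) {
        int capacity = 16; // same as ThreadLocalMap's INITIAL_CAPACITY
        Set<Integer> slots = new LinkedHashSet<>();
        int hashCode = 0;
        for (int n = 0; n < capacity; n++) {
            slots.add(hashCode & (capacity - 1)); // the same index computation ThreadLocalMap uses
            hashCode += HASH_INCREMENT;
        }
        // 16 consecutive ThreadLocals land in 16 distinct slots: a perfectly even spread.
        System.out.println(slots.size()); // 16
        System.out.println(slots);
    }
}
```

The same property holds after each doubling of the table, since the increment is odd and therefore coprime with every power of two.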
ThreadLocalMap hash conflicts
Note: in all the examples below, a green Entry block represents normal data, a gray Entry block represents an Entry whose key has been garbage collected (key is null), and a white block represents a slot where the Entry itself is null.
Although ThreadLocalMap uses the golden-ratio number as its hash factor, which greatly reduces the probability of collisions, conflicts still occur. HashMap resolves conflicts by building a linked list on each array slot, mounting conflicting entries on the list and converting it to a red-black tree once the list grows beyond a threshold. ThreadLocalMap has no linked-list structure, so it cannot resolve conflicts the way HashMap does.
As shown in the figure above, suppose we insert value=27 and the hash maps it to slot 4, which already holds Entry data. The map then probes linearly backward until it finds a slot whose Entry is null, and places the element there. Other situations can arise during the traversal — an Entry whose key equals the current key, an Entry whose key is null, and so on — each handled differently, as explained later. The figure also shows an Entry with a null key (the gray block with Entry=2): because the key is a weak reference, such stale data can exist. If the set process encounters an Entry with an expired key, it actually performs a round of probing (exploratory) cleanup; how that works is covered later.
ThreadLocalMap.set() in detail
Diagram of ThreadLocalMap.set()
Having looked at ThreadLocal's hash algorithm, let's see how set() is implemented. Setting data into a ThreadLocalMap (adding new data or updating existing data) falls into several cases; each is illustrated with a figure.
The first case: the Entry in the slot computed from the hash is empty. The data can be placed directly into that slot.
The second case: the slot is occupied, and its key equals the key computed from the current ThreadLocal's hash. The slot's data is updated in place.
The third case: the slot is occupied, and the backward traversal reaches a slot with a null Entry before encountering any Entry with an expired key. The map probes linearly backward through the hash array; if a slot with a null Entry is found, the data is put into it, and if an Entry with an equal key is found along the way, the data is updated in place.
The fourth case: the slot is occupied, and the traversal encounters an Entry with an expired key before finding a null slot — in the figure, the Entry at index=7 has a null key. A null key at index 7 means that key has been garbage collected, so the replaceStaleEntry() method runs the stale-data replacement logic, starting a probing cleanup from index=7. First slotToExpunge = staleSlot = 7 is initialized; then the loop iterates forward from staleSlot looking for other stale entries, updating slotToExpunge when one is found, until it hits a null Entry. In the figure, stale data is found ahead of staleSlot, so slotToExpunge is updated to 0: the forward probe from index=7 checks for expired Entry data, updates slotToExpunge when it finds some, and stops when it reaches a null slot.
The forward iteration above updates slotToExpunge, the starting index for the probing cleanup of expired data; it determines whether any expired element exists before the current stale slot, which matters later. The loop then iterates backward from the staleSlot position (index=7).
If an Entry with an equal key is found: starting from staleSlot, the search locates the Entry whose key matches, updates its value, swaps it into the staleSlot position (whose old occupant was the expired element), and then begins cleaning up expired entries, as shown in the figure.
If no Entry with an equal key is found during the backward traversal: the search stops at the first null Entry, which means no Entry with the same key exists in the table; a new Entry is created and placed at table[staleSlot].
After the replacement, expungeStaleEntry() and cleanSomeSlots() are the two main methods used to clean up expired elements; both are covered in more detail later.
ThreadLocalMap.set() source code
Having walked through the principle of set() with diagrams, the implementation should now be clear. Let's look at the source, java.lang.ThreadLocal.ThreadLocalMap.set():
private void set(ThreadLocal<?> key, Object value) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len - 1);

    for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    tab[i] = new Entry(key, value);
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
Here the key's corresponding position in the hash table is computed first, and then the map searches backward from that bucket for one it can use:
Entry[] tab = table;
int len = tab.length;
int i = key.threadLocalHashCode & (len-1);
Under what circumstances is a bucket usable?
- k == key: this is an update (replacement) operation, so this bucket is used
- an expired bucket is encountered: the replacement logic runs and occupies the expired bucket
- a bucket whose Entry is null is found during the search: it is used directly
During the probing, positions are advanced with nextIndex() and prevIndex(), which treat the array as a ring:
private static int nextIndex(int i, int len) {
    return ((i + 1 < len) ? i + 1 : 0);
}

private static int prevIndex(int i, int len) {
    return ((i - 1 >= 0) ? i - 1 : len - 1);
}
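A quick sketch (not from the original article) of the ring behavior of these two helpers — stepping past either end of a length-16 table wraps around:

```java
public class RingIndexDemo {
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    static int prevIndex(int i, int len) {
        return (i - 1 >= 0) ? i - 1 : len - 1;
    }

    public static void main(String[] args) {
        int len = 16;
        System.out.println(nextIndex(3, len));  // 4
        System.out.println(nextIndex(15, len)); // 0  (wraps to the front)
        System.out.println(prevIndex(4, len));  // 3
        System.out.println(prevIndex(0, len));  // 15 (wraps to the back)
    }
}
```

This ring view is why a probe that starts near the end of the array can still find an empty slot near the beginning.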
Back to the remaining logic in the for loop of set():
- If the Entry in the bucket corresponding to the key is null, there is no conflict at that position in the hash array: the loop exits and the data is written directly into that bucket.
- Otherwise, during the traversal: 2.1 if k == key, this set is an update; the value is replaced in place and the method returns. 2.2 if k == null, the Entry in the current bucket has expired; replaceStaleEntry() (the core method) is executed and the method returns.
- After the for loop ends (a null Entry was reached during the backward iteration): 3.1 a new Entry object is created in the bucket whose Entry was null; 3.2 ++size is performed.
- Finally cleanSomeSlots() performs a heuristic cleanup of entries whose keys have expired in the hash array. 4.1 If nothing was cleaned and size has reached the threshold (2/3 of the array length), rehash() runs. 4.2 rehash() first performs a round of probing cleanup of expired keys; after that, if size >= threshold - threshold / 4, the actual expansion logic executes (see the expansion section later).
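The open-addressing insert described above can be sketched in simplified form (this is not the JDK source — keys here are plain Integers, and stale-key handling is omitted):

```java
import java.util.Objects;

public class LinearProbeDemo {
    static final class Entry {
        final Integer key;
        Object value;
        Entry(Integer key, Object value) { this.key = key; this.value = value; }
    }

    final Entry[] table = new Entry[16];

    void set(Integer key, Object value) {
        int len = table.length;
        int i = key.hashCode() & (len - 1);
        // Probe forward from the hashed slot until we find our key or an empty slot.
        for (Entry e = table[i]; e != null; e = table[i = (i + 1 < len) ? i + 1 : 0]) {
            if (Objects.equals(e.key, key)) {
                e.value = value; // same key: update in place
                return;
            }
        }
        table[i] = new Entry(key, value); // first empty slot wins
    }

    public static void main(String[] args) {
        LinearProbeDemo map = new LinearProbeDemo();
        map.set(4, "a");   // hashes to slot 4
        map.set(20, "b");  // also hashes to slot 4 -> probes to slot 5
        System.out.println(map.table[4].value); // a
        System.out.println(map.table[5].value); // b
    }
}
```

The real ThreadLocalMap adds two things on top of this loop: the k == null branch that triggers replaceStaleEntry(), and the post-insert cleanSomeSlots()/rehash() bookkeeping.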
The replaceStaleEntry() method handles replacing expired data; recall the diagram of the fourth case above. The code, java.lang.ThreadLocal.ThreadLocalMap.replaceStaleEntry(), is as follows:
private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                               int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    Entry e;

    int slotToExpunge = staleSlot;
    for (int i = prevIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = prevIndex(i, len))
        if (e.get() == null)
            slotToExpunge = i;

    for (int i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;

            tab[i] = tab[staleSlot];
            tab[staleSlot] = e;

            if (slotToExpunge == staleSlot)
                slotToExpunge = i;
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
            return;
        }

        if (k == null && slotToExpunge == staleSlot)
            slotToExpunge = i;
    }

    tab[staleSlot].value = null;
    tab[staleSlot] = new Entry(key, value);

    if (slotToExpunge != staleSlot)
        cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
}
slotToExpunge marks the index where the probing cleanup of stale data will start; by default it is the current staleSlot. Starting from staleSlot, the loop iterates forward looking for stale data and only ends when the Entry is null. If expired data is found ahead of staleSlot, the starting index for cleanup is updated: slotToExpunge = i.
for (int i = prevIndex(staleSlot, len);
     (e = tab[i]) != null;
     i = prevIndex(i, len)) {
    if (e.get() == null) {
        slotToExpunge = i;
    }
}
The method then searches backward from staleSlot, stopping at a bucket whose Entry is null. If k == key is encountered during the iteration, the replacement logic runs: the new value replaces the old one, and the entry is swapped into the current staleSlot position. If slotToExpunge == staleSlot at that point, neither the initial forward pass nor the backward pass so far found any other expired Entry, so slotToExpunge is updated to the current loop index i. Finally cleanSomeSlots(expungeStaleEntry(slotToExpunge), len) is called to clean up the expired data.
if (k == key) {
e.value = value;
tab[i] = tab[staleSlot];
tab[staleSlot] = e;
if (slotToExpunge == staleSlot)
slotToExpunge = i;
cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
return;
}
The cleanSomeSlots() and expungeStaleEntry() methods, detailed later, are the two cleanup routines: the first is the heuristic cleanup of entries with expired keys (a heuristic scan), the other is the probing cleanup of entries with expired keys. If k != key, the loop moves on to the next bucket; if k == null, the Entry currently being traversed is stale data. slotToExpunge == staleSlot means the initial forward search found no expired Entry; in that case slotToExpunge is updated to the current position — the precondition being that the scan of the preceding nodes found no expired data:
if (k == null && slotToExpunge == staleSlot)
slotToExpunge = i;
If no Entry with k == key is found and a null Entry ends the backward iteration, the current iteration stops and the new data is placed in the table[staleSlot] slot:
tab[staleSlot].value = null;
tab[staleSlot] = new Entry(key, value);
Finally, if a stale slot other than staleSlot was found, the other expired data is cleaned up as well:
if (slotToExpunge != staleSlot)
    cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
Exploratory clean-up process for ThreadLocalMap expired keys
ThreadLocalMap cleans stale key data in two ways: probing (exploratory) cleanup and heuristic cleanup.
The expungeStaleEntry() method walks the hash array backward from the starting position, probing for and cleaning expired data: it sets the Entry of each expired slot to null, and rehashes each non-expired entry to its proper position in the table. If that position is already occupied, the entry is placed in the nearest bucket whose Entry is null, moving the rehashed entry closer to its correct bucket. The logic works as follows.
As the figure shows, set(27) should land in bucket index=4 after hashing; since that bucket already holds data, the probing eventually places it at index=7. Later, the key of the Entry at index=5 becomes null. If additional data is then set into the map, a probing cleanup is triggered. As the next figure shows, during the probing cleanup the element at index=7 is rehashed: its correct position is index=4, which is occupied, so the nearest bucket with a null Entry is found (index=5) and the element moves from index=7 to index=5, closer to its correct position index=4.
After one round of probing cleanup, entries with expired keys have been removed, and non-expired entries sit, after rehashing, closer to their ideal position i = key.hashCode & (tab.len - 1). This optimization improves overall hash lookup performance.
Now let's walk through the specifics of expungeStaleEntry() step by step with diagrams and source. Suppose we call expungeStaleEntry(3) on the ThreadLocalMap table shown in the figure. The first step clears the data at the current staleSlot position: the Entry at index=3 becomes null.
The probe then continues: in the second step, the element at index=4 is moved to the slot at index=3. Whenever normal (non-expired) data is encountered, its correct position is recomputed; if it is offset from its current slot, the slot is recalculated, so that normal data ends up in, or as close as possible to, its correct position. The probe terminates when it reaches an empty slot, completing one round of probing cleanup. Now let's look at the implementation source:
private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;

    tab[staleSlot].value = null;
    tab[staleSlot] = null;
    size--;

    Entry e;
    int i;
    for (i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        if (k == null) {
            e.value = null;
            tab[i] = null;
            size--;
        } else {
            int h = k.threadLocalHashCode & (len - 1);
            if (h != i) {
                tab[i] = null;
                while (tab[h] != null)
                    h = nextIndex(h, len);
                tab[h] = e;
            }
        }
    }
    return i;
}
Sticking with staleSlot=3 as the example: the slot tab[staleSlot] is first emptied and size is decremented; then the iteration proceeds from staleSlot, and whenever expired data is encountered, that slot is emptied as well and size is decremented again:
ThreadLocal<?> k = e.get();
if (k == null) {
    e.value = null;
    tab[i] = null;
    size--;
}
If the key has not expired, the method recomputes whether the key's ideal index matches the current slot. If it does not, a hash conflict occurred earlier; the entry is then moved by probing from its correct slot to the nearest available position:
int h = k.threadLocalHashCode & (len - 1);
if (h != i) {
    tab[i] = null;
    while (tab[h] != null)
        h = nextIndex(h, len);
    tab[h] = e;
}
This step tidies up data displaced by earlier hash conflicts: after the iteration, such entries sit closer to their correct positions, which makes lookups more efficient.
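A simplified simulation of this compaction effect (not the JDK source — keys here are plain ints, and null marks an empty slot): after a stale slot is cleared, a displaced entry is rehashed back toward its ideal bucket.

```java
import java.util.Arrays;

public class CompactionDemo {
    // table[i] holds a key, or null for an empty slot; a key's ideal slot is key & (len - 1)
    static void expunge(Integer[] table, int staleSlot) {
        int len = table.length;
        table[staleSlot] = null;                  // clear the stale slot
        for (int i = next(staleSlot, len); table[i] != null; i = next(i, len)) {
            int h = table[i] & (len - 1);
            if (h != i) {                          // entry was displaced by a collision
                Integer e = table[i];
                table[i] = null;
                while (table[h] != null)           // probe from its ideal slot
                    h = next(h, len);
                table[h] = e;                      // lands at or near its ideal slot
            }
        }
    }

    static int next(int i, int len) { return (i + 1 < len) ? i + 1 : 0; }

    public static void main(String[] args) {
        Integer[] table = new Integer[8];
        table[4] = 4;   // key 4 sits in its ideal slot (4 & 7 == 4)
        table[5] = 20;  // key 20 also hashes to 4; it was probed into slot 5
        table[6] = 36;  // key 36 also hashes to 4; it was probed into slot 6
        // Suppose the entry in slot 5 goes stale and we expunge it:
        expunge(table, 5);
        System.out.println(Arrays.toString(table));
        // [null, null, null, null, 4, 36, null, null] -- 36 moved up next to its ideal slot
    }
}
```

This mirrors the else branch of expungeStaleEntry(): clearing a stale slot frees a position that lets displaced entries shift closer to where a future lookup will start probing.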
ThreadLocalMap Capacity expansion mechanism
At the end of the ThreadLocalMap.set() method, if the heuristic cleanup cleaned nothing and the number of entries in the hash array has reached the expansion threshold (len * 2 / 3), the rehash() logic is executed:
if (!cleanSomeSlots(i, sz) && sz >= threshold)
    rehash();
Let’s look at the rehash() implementation:
private void rehash() {
    expungeStaleEntries();

    if (size >= threshold - threshold / 4)
        resize();
}

private void expungeStaleEntries() {
    Entry[] tab = table;
    int len = tab.length;
    for (int j = 0; j < len; j++) {
        Entry e = tab[j];
        if (e != null && e.get() == null)
            expungeStaleEntry(j);
    }
}
A probing cleanup runs first, scanning forward from the start of the table (the cleanup process was analyzed in detail above). Since that cleanup may have removed some entries with null keys, the method then checks size >= threshold - threshold / 4, i.e., size >= len * 2 / 3 * 3 / 4 = len / 2, to decide whether to actually resize. Keep the two conditions straight: entering rehash() requires size >= threshold, while actually resizing inside rehash() requires size >= threshold - threshold / 4.
Now let's look at resize() in detail, using oldTab.len = 8 as an illustration. The new table is twice the size of the old one (newLen = oldLen * 2). The old hash table is then iterated: each entry's position is recomputed and the entry is placed into the new tab array; if a hash conflict occurs, the nearest slot with a null Entry is found by probing. Once all entry data from oldTab has been placed into the new tab, the expansion threshold is recalculated. The code is as follows:
private void resize() {
    Entry[] oldTab = table;
    int oldLen = oldTab.length;
    int newLen = oldLen * 2;
    Entry[] newTab = new Entry[newLen];
    int count = 0;

    for (int j = 0; j < oldLen; ++j) {
        Entry e = oldTab[j];
        if (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                e.value = null;
            } else {
                int h = k.threadLocalHashCode & (newLen - 1);
                while (newTab[h] != null)
                    h = nextIndex(h, newLen);
                newTab[h] = e;
                count++;
            }
        }
    }

    setThreshold(newLen);
    size = count;
    table = newTab;
}
ThreadLocalMap.get() in detail
Having dissected set() above, get() is comparatively straightforward.
Diagram of ThreadLocalMap.get()
The first case: hashing the key leads to a slot whose Entry.key matches the key being looked up; the entry is returned directly.
The second case: the Entry.key in the slot does not match the key being looked up. Take get(ThreadLocal1) as an example: after hashing, the correct slot would be index=4, but that slot already holds data with a key different from ThreadLocal1, so the search probes onward. During the probe it encounters an Entry whose key is null and executes the expungeStaleEntry() method once. After expungeStaleEntry() runs, the data at index 5 and 8 is reclaimed, while the data at index 6 and 7 moves forward. The iteration then continues, and at index=6 it finds the Entry with the matching key, as shown in the figure below.
ThreadLocalMap.get() source code
java.lang.ThreadLocal.ThreadLocalMap.getEntry():
private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}
Heuristic cleanup process for ThreadLocalMap expired keys
ThreadLocalMap cleans up expired entries in two ways: probing cleanup (expungeStaleEntry()) and heuristic cleanup (cleanSomeSlots()). Probing cleanup proceeds linearly from the current Entry onward and stops when it meets a null slot. Heuristic cleanup is described by the JDK author as: "Heuristically scan some cells looking for stale entries." The code is as follows:
private boolean cleanSomeSlots(int i, int n) {
    boolean removed = false;
    Entry[] tab = table;
    int len = tab.length;
    do {
        i = nextIndex(i, len);
        Entry e = tab[i];
        if (e != null && e.get() == null) {
            n = len;
            removed = true;
            i = expungeStaleEntry(i);
        }
    } while ((n >>>= 1) != 0);
    return removed;
}
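The while ((n >>>= 1) != 0) condition means the heuristic scan visits about log2(n) slots when nothing stale is found (and n is reset to len whenever a stale entry is hit, restarting the countdown). A quick sketch (not from the original article) of that iteration count:

```java
public class ScanCountDemo {
    // Counts how many slots cleanSomeSlots() would probe if no stale entry is found.
    static int probes(int n) {
        int count = 0;
        do {
            count++;           // one slot visited per loop iteration
        } while ((n >>>= 1) != 0);
        return count;
    }

    public static void main(String[] args) {
        System.out.println(probes(16)); // 5
        System.out.println(probes(64)); // 7
        System.out.println(probes(1));  // 1
    }
}
```

This is the trade-off the JDK author chose: a scan cheap enough to run on every insert, but one that expands to a full-length countdown as soon as garbage is actually found.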
InheritableThreadLocal
When we use ThreadLocal, the thread-local copy created in a parent thread cannot be shared with child threads in an asynchronous scenario. To solve this, the JDK also provides the InheritableThreadLocal class. Let's look at an example:
public class InheritableThreadLocalDemo {
    public static void main(String[] args) {
        ThreadLocal<String> threadLocal = new ThreadLocal<>();
        ThreadLocal<String> inheritableThreadLocal = new InheritableThreadLocal<>();
        threadLocal.set("Parent data: threadLocal");
        inheritableThreadLocal.set("Parent data: inheritableThreadLocal");

        new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Child thread gets the parent's ThreadLocal data: " + threadLocal.get());
                System.out.println("Child thread gets the parent's InheritableThreadLocal data: " + inheritableThreadLocal.get());
            }
        }).start();
    }
}
Print result:
Child thread gets the parent's ThreadLocal data: null
Child thread gets the parent's InheritableThreadLocal data: Parent data: inheritableThreadLocal
This works because the child thread is created by calling new Thread() in the parent thread; the Thread constructor calls the Thread#init method, which copies the parent thread's inheritable data to the child thread:
private void init(ThreadGroup g, Runnable target, String name,
                  long stackSize, AccessControlContext acc,
                  boolean inheritThreadLocals) {
    if (name == null) {
        throw new NullPointerException("name cannot be null");
    }
    // ...
    if (inheritThreadLocals && parent.inheritableThreadLocals != null)
        this.inheritableThreadLocals =
            ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);
    // ...
    this.stackSize = stackSize;
    tid = nextThreadID();
}
But InheritableThreadLocal still has a problem: it copies the parent's data only when the thread is created, in init(). In scenarios that use a thread pool, threads are reused rather than created per task, so the parent thread's data does not propagate to the task. Of course, where there is a problem there is a solution: Alibaba has open-sourced a module, TransmittableThreadLocal, that addresses exactly this.
ThreadLocal in real projects
Usage scenarios of ThreadLocal
At present, we use ELK + Logstash to record logs in our projects and display and retrieve them in Kibana. Since distributed systems provide services jointly, the invocation relationships between projects can be correlated with a traceId. But how is the traceId passed between different projects? We use org.slf4j.MDC, which is implemented internally with ThreadLocal, as follows: when the front end sends a request to Service A, Service A generates a UUID-style traceId string, puts it in the current thread's ThreadLocal, and then calls Service B with the traceId written into the request header. When Service B receives the request, it checks whether the header contains a traceId; if so, it writes that traceId into its own thread's ThreadLocal.
In the figure, requestId is the traceId that links each system in the call chain: systems call one another, and the corresponding chain can be found through this requestId. There are some other scenarios as well, each with a corresponding solution, as shown below.
Feign remote call solution
The service sends a request:
@Component
@Slf4j
public class FeignInvokeInterceptor implements RequestInterceptor {

    @Override
    public void apply(RequestTemplate template) {
        String requestId = MDC.get("requestId");
        if (StringUtils.isNotBlank(requestId)) {
            template.header("requestId", requestId);
        }
    }
}
The service receives requests:
@Slf4j
@Component
public class LogInterceptor extends HandlerInterceptorAdapter {

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        String requestId = request.getHeader(BaseConstant.REQUEST_ID_KEY);
        if (StringUtils.isBlank(requestId)) {
            requestId = UUID.randomUUID().toString().replace("-", "");
        }
        MDC.put("requestId", requestId);
        return true;
    }

    @Override
    public void postHandle(HttpServletRequest arg0, HttpServletResponse arg1, Object arg2, ModelAndView arg3) {
    }

    @Override
    public void afterCompletion(HttpServletRequest arg0, HttpServletResponse arg1, Object arg2, Exception arg3) {
        MDC.remove("requestId");
    }
}
Thread pool asynchronous call, requestId pass
Since MDC is implemented with ThreadLocal, during asynchronous processing child threads cannot access the data the parent thread stored in its ThreadLocal, so we can customize the thread pool executor and wrap the task:
public class MyThreadPoolTaskExecutor extends ThreadPoolTaskExecutor {

    @Override
    public void execute(Runnable runnable) {
        // Capture the submitting thread's MDC context.
        Map<String, String> context = MDC.getCopyOfContextMap();
        super.execute(() -> run(runnable, context));
    }

    private void run(Runnable runnable, Map<String, String> context) {
        if (context != null) {
            MDC.setContextMap(context);  // restore on the worker thread
        }
        try {
            runnable.run();
        } finally {
            MDC.clear();                 // avoid leaking into reused pool threads
        }
    }
}
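For readers outside Spring, the same capture/restore pattern can be sketched with plain java.util.concurrent. The CONTEXT holder below is a hypothetical stand-in for slf4j's MDC, and all names are mine; the point is only the pattern: capture the map on the submitting thread, restore it on the worker, and clean up afterwards.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ContextPropagatingExecutor {
    // Stand-in for MDC: a per-thread context map (hypothetical helper, not slf4j's MDC).
    static final ThreadLocal<Map<String, String>> CONTEXT = new ThreadLocal<>();

    private final ExecutorService delegate = Executors.newFixedThreadPool(2);

    public void execute(Runnable task) {
        Map<String, String> captured = CONTEXT.get();   // capture on the submitting thread
        delegate.execute(() -> {
            CONTEXT.set(captured);                      // restore on the worker thread
            try {
                task.run();
            } finally {
                CONTEXT.remove();                       // don't leak into reused pool threads
            }
        });
    }

    static String demo() throws InterruptedException {
        ContextPropagatingExecutor ex = new ContextPropagatingExecutor();
        CONTEXT.set(Map.of("requestId", "abc-123"));
        String[] seen = new String[1];
        ex.execute(() -> seen[0] = CONTEXT.get().get("requestId"));
        ex.delegate.shutdown();
        ex.delegate.awaitTermination(2, TimeUnit.SECONDS);
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // prints abc-123
    }
}
```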
The volatile keyword
Let’s start with the CPU cache model!
2.1. CPU cache model
Why do we need a CPU cache? Much like the cache a website backend uses (such as Redis) to bridge the gap between program processing speed and the speed of a regular relational database, the CPU cache exists to bridge the gap between CPU speed and memory speed. We can even think of memory as a cache for external storage: when a program runs, data is copied from disk into memory, and since memory is far faster than disk, this improves processing speed.
In summary: the CPU cache caches memory data to solve the mismatch between CPU speed and memory speed, while memory caches disk data to solve the problem of slow disk access. (In fact, modern CPU caches are usually divided into three levels, called L1, L2, and L3 cache.)
How the CPU cache works: data is first copied into the CPU cache; when the CPU needs it, it reads directly from the cache, and when the operation completes, the data is written back to main memory. But this creates a cache inconsistency problem. For example, suppose two threads both execute i++: both read i = 1 from their CPU caches, each adds 1 and writes i = 2 back to main memory, while the correct result should be i = 3. To solve this inconsistency between memory and cache, the CPU can adopt a cache coherence protocol or other means.
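The i++ race described above is easy to reproduce in Java. This is a sketch (names and iteration counts are mine): two threads increment a plain int and an AtomicInteger the same number of times; the plain counter usually loses updates while the atomic one never does.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateDemo {
    static int plain = 0;                              // unsynchronized counter
    static final AtomicInteger atomic = new AtomicInteger();

    static void runBothCounters() throws InterruptedException {
        Runnable work = () -> {
            for (int k = 0; k < 100_000; k++) {
                plain++;                   // read-modify-write: three steps, not atomic
                atomic.incrementAndGet();  // single atomic hardware-level update
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
    }

    public static void main(String[] args) throws InterruptedException {
        runBothCounters();
        // plain is usually less than 200000 because concurrent increments overwrite
        // each other, exactly like the i++ example above; atomic is always 200000.
        System.out.println("plain = " + plain + ", atomic = " + atomic.get());
    }
}
```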
2.2. JMM(Java Memory Model)
Prior to JDK 1.2, Java's memory model always read variables from main memory (that is, shared memory), and no special care was needed. Under the current Java memory model, threads can keep variables in local memory (for example, in machine registers) rather than reading and writing them directly in main memory. This can cause one thread to change a variable's value in main memory while another thread continues to use a stale copy in its register, producing inconsistent data. To solve this, a variable can be declared volatile, which tells the JVM that the variable is shared and unstable and must be read from main memory every time it is used. So, besides preventing instruction reordering by the JVM, the volatile keyword's other important role is to guarantee variable visibility.
2.3. Three important features of concurrent programming
- Atomicity: one or more operations either all execute without being interrupted, or none execute at all. The synchronized keyword can guarantee atomicity for a code block.
- Visibility: when one thread modifies a shared variable, other threads can immediately see the updated value. The volatile keyword can guarantee the visibility of shared variables.
- Orderliness: because of compiler and runtime optimizations in Java, the order in which code executes is not necessarily the order in which it was written. The volatile keyword can prevent instruction-reordering optimizations.
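Visibility is the easiest of the three to demonstrate. A minimal sketch (class and field names are mine): a worker thread spins on a volatile flag; once the main thread flips the flag, the volatile write is published and the worker exits promptly. Without volatile, the worker may cache the flag and never observe the change.

```java
public class VolatileFlagDemo {
    // Without volatile, the worker may never see the update and spin forever.
    private static volatile boolean running = true;

    static boolean stopsPromptly() throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) { }   // busy-wait; exits once the write is visible
        });
        worker.start();
        Thread.sleep(50);
        running = false;          // volatile write: visible to the worker
        worker.join(2000);        // the worker should exit well within this
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + stopsPromptly());
    }
}
```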
2.4. Tell the difference between synchronized and volatile
The synchronized keyword and the volatile keyword are complementary, not opposed!
- Volatile is a lightweight implementation of thread synchronization, so it generally performs better than synchronized. However, the volatile keyword can only modify variables, while the synchronized keyword can modify methods and code blocks.
- The volatile keyword guarantees visibility, but not atomicity. The synchronized keyword guarantees both.
- The volatile keyword addresses the visibility of variables across multiple threads, while the synchronized keyword addresses the synchronization of access to resources across multiple threads.
CAS
Before JDK 5, the Java language relied on the synchronized keyword for synchronization, and this locking mechanism has the following problems:
1. Under multithreaded contention, acquiring and releasing locks cause many context switches and scheduling delays, which hurts performance.
2. A lock held by one thread causes all other threads that need the lock to be suspended.
3. If a high-priority thread has to wait for a low-priority thread to release a lock, priority inversion occurs, which is a performance risk.
Volatile is a good mechanism, but it cannot guarantee atomicity, so synchronization ultimately comes back to locking. An exclusive lock is a pessimistic lock; synchronized is an exclusive lock that suspends all other threads needing the lock until the holder releases it. A more effective alternative is the optimistic lock: perform the operation each time without locking, assume there is no conflict, and if it fails because of a conflict, retry until it succeeds. The mechanism optimistic locking relies on is CAS: Compare And Swap.
What is CAS
CAS is short for Compare And Swap. As everyone knows, concurrency was widespread and heavily used in the server world before the Java language existed, so hardware manufacturers long ago added a large number of primitives for concurrent operations to their chips to improve efficiency at the hardware level; Intel CPUs, for instance, provide the CMPXCHG instruction. In its early days, Java could not take advantage of these hardware facilities to improve system performance. With the development of Java, the appearance of the Java Native Interface (JNI) gave Java programs a convenient way to call native code directly, bypassing the JVM, so Java gained more tools for concurrency. In Doug Lea's concurrent package, CAS is the cornerstone of the entire implementation.
A CAS operation involves three operands: the memory location (V), the expected old value (A), and the new value (B). If the value at the memory location matches the expected old value, the processor atomically updates the location to the new value; otherwise, it does nothing. In either case, it returns the value that was at the location before the CAS instruction. (Some variants of CAS only report whether the CAS succeeded rather than returning the current value.) CAS effectively says: "I think location V should contain value A; if it does, put B there; otherwise, leave the location unchanged and just tell me its current value." The common way to synchronize with CAS is to read value A from address V, perform a multi-step computation to get the new value B, and then use CAS to change the value at V from A to B. If the value at V has not been changed in the meantime, the CAS succeeds.
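The read-compute-CAS retry loop just described can be sketched with AtomicInteger.compareAndSet (class and method names below are mine; the JDK API calls are real):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasLoopDemo {
    private static final AtomicInteger value = new AtomicInteger(0);

    // Classic optimistic-lock loop: read A, compute B, then CAS(V, A, B); retry on conflict.
    static int addTen() {
        int oldV, newV;
        do {
            oldV = value.get();       // expected old value A
            newV = oldV + 10;         // new value B, from a multi-step computation
        } while (!value.compareAndSet(oldV, newV)); // fails if another thread changed V
        return newV;
    }

    public static void main(String[] args) {
        System.out.println(addTen()); // prints 10
        System.out.println(addTen()); // prints 20
    }
}
```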
An instruction like CAS allows an algorithm to perform a read-modify-write sequence without fear of another thread modifying the variable at the same time, because if another thread does modify it, CAS detects this (and fails), and the algorithm can recompute and retry the operation.
With the help of JNI, Java uses the CPU's CAS instruction to implement its non-blocking algorithms.
Other atomic operations are implemented with similar facilities, and since the whole of j.u.c is built on CAS, its performance is greatly improved over purely synchronized blocking algorithms.
Problems with CAS
Although CAS solves atomic operations very efficiently, it still has three major problems: the ABA problem, long spin times with high overhead, and the fact that atomicity can be guaranteed for only one shared variable.
1. The ABA problem. CAS checks whether the value has changed and updates it if it has not. But if a value was A, changed to B, and then changed back to A, CAS will conclude it never changed even though it actually did. The solution to the ABA problem is to use version numbers: append a version number to the variable and increment it on every update, so A-B-A becomes 1A-2B-3A. Since Java 1.5, the JDK's atomic package has provided the AtomicStampedReference class to address ABA issues. Its compareAndSet method first checks whether the current reference equals the expected reference and whether the current stamp equals the expected stamp; only if both match does it atomically set the reference and stamp to the given new values. A reference article on ABA: blog.hesey.net/2011/09/res…
2. Long spin time, high overhead. If a spinning CAS is unsuccessful for a long period, it imposes a significant execution cost on the CPU. The JVM can improve performance if the processor supports the pause instruction, which has two effects: it delays the de-pipelining of instructions so the CPU does not consume too many execution resources, and it avoids the CPU pipeline flush caused by a memory-order violation when exiting the loop, improving execution efficiency.
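The version-number fix for the ABA problem can be sketched with AtomicStampedReference (the class name AbaDemo and the method below are mine; the JDK API is real). Small Integer values are used deliberately, because AtomicStampedReference compares references and small autoboxed Integers are cached:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // (value, stamp): the stamp is the version number described above.
    static final AtomicStampedReference<Integer> ref = new AtomicStampedReference<>(1, 0);

    static boolean casWithOldStamp() {
        int stamp = ref.getStamp();                       // observe (value 1, stamp 0)

        // Meanwhile "another thread" performs A -> B -> A, bumping the stamp each time.
        ref.compareAndSet(1, 2, ref.getStamp(), ref.getStamp() + 1);
        ref.compareAndSet(2, 1, ref.getStamp(), ref.getStamp() + 1);

        // A plain CAS would succeed here (the value is 1 again); the stamped CAS
        // fails because the stamp moved from 0 to 2, exposing the A-B-A change.
        return ref.compareAndSet(1, 3, stamp, stamp + 1);
    }

    public static void main(String[] args) {
        System.out.println("swapped = " + casWithOldStamp()); // prints swapped = false
    }
}
```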
3. Atomicity can be guaranteed for only one shared variable. When operating on a single shared variable, we can use a CAS loop to guarantee atomicity, but a CAS loop cannot guarantee atomicity across multiple shared variables. In that case we can use a lock, or there is a trick: combine the multiple shared variables into one. Say we have two shared variables i = 2 and j = a; combine them into ij = 2a and then CAS on ij. Since Java 1.5, the JDK has provided the AtomicReference class to guarantee atomicity between reference objects: you can place multiple variables inside one object and perform CAS operations on it. Because CAS in Java has the memory semantics of both a volatile read and a volatile write, Java threads can now communicate in the following four ways:
- Thread A writes the volatile variable, and thread B reads the volatile variable.
- Thread A writes the volatile variable, and thread B updates the volatile variable with CAS.
- Thread A updates A volatile variable with CAS, and thread B updates the volatile variable with CAS.
- Thread A updates A volatile variable with CAS, which thread B then reads.
Java's CAS uses the efficient machine-level atomic instructions provided by modern processors, which perform read-modify-write operations on memory atomically; this is the key to achieving synchronization on multiprocessors. (Essentially, a machine that supports atomic read-modify-write instructions is computationally equivalent to a sequential Turing machine, so any modern multiprocessor supports some atomic instruction that performs an atomic read-modify-write on memory.) Meanwhile, reads and writes of volatile variables, together with CAS, enable communication between threads. Taken together, these features form the building blocks of the concurrent package's implementation. Looking closely at the source of the concurrent package, we find a common implementation pattern:
- First, declare the shared variable volatile;
- Then, use CAS's atomic conditional update to implement synchronization between threads;
- At the same time, combine volatile reads/writes with the memory semantics of CAS to implement communication between threads.
AQS, the non-blocking data structures, and the atomic variable classes (the classes in the java.util.concurrent.atomic package) are the base classes in the concurrent package implemented with this pattern, and the higher-level classes in the concurrent package in turn depend on these base classes. Taken as a whole, that is how the concurrent package is layered.
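As a toy illustration of this pattern (a volatile shared variable updated only by CAS), here is a minimal, hypothetical spinlock sketch. AtomicBoolean keeps its flag in a volatile field and exposes CAS, and Thread.onSpinWait() (Java 9+) is the JDK hint corresponding to the x86 pause instruction mentioned earlier. This is a sketch, not how j.u.c's real locks are built; those use AQS with queued waiters.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    // Volatile-backed shared state; only CAS may flip it from false to true.
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // CAS loop: exactly one thread wins the false -> true transition.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // CPU hint while spinning (Java 9+)
        }
    }

    public void unlock() {
        locked.set(false); // volatile write: immediately visible to spinning threads
    }
}
```

Used around a plain counter, the lock makes concurrent increments add up exactly, unlike the unsynchronized counter shown earlier.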
happen-before
JMM
- When programmers write code, they want a memory model that is easy to understand and easy to program against; that is, we want to code against a strong memory model whose rules are laid down like axioms: we just write code that follows them.
- Compiler and processor implementations, on the other hand, want as few constraints as possible; every restriction limits the optimizations they can apply and therefore costs performance. So they want a weak memory model.
These two goals clearly conflict: we programmers want the JMM to give us a strong memory model, while the underlying compiler and processor need a weak memory model to improve their performance.
This kind of tradeoff appears elsewhere too: between memory and the CPU registers, multiple levels of CPU cache were introduced to solve a performance problem, but they in turn introduced problems in concurrent multi-core scenarios. Here, likewise, a balance must be struck between satisfying the programmer's needs and constraining the compiler and processor as little as possible.
Therefore, at design time, the JMM defines the following policies:
- The JMM requires that the compiler and processor must forbid reordering that changes the results of program execution.
- For reordering that does not change the result of program execution, the JMM imposes no requirements on the compiler and processor (the JMM allows such reordering).
Reordering
Compiler reordering
Just-in-time compilation is also known as runtime compilation. javac translates Java source files into class files, which are filled with Java bytecode; after loading the class files, the JVM fetches and executes the bytecodes one by one, which is interpreted execution.
Alternatively, the Java bytecode can be recompiled and optimized into machine code that the CPU can execute directly, which is more efficient. In general it is not necessary to compile every Java method to machine code: the virtual machine marks a method or block of code as "hot code" when it finds it being run particularly frequently. To speed up hot code, the virtual machine compiles it at runtime into machine code specific to the local platform, applying various levels of optimization, using a component called the Just-In-Time Compiler (JIT compiler).
For example, after compiler reordering the CPU may read the x and y values only once each, with no need to repeatedly reload registers to alternate between x and y.
Processor reordering
A write buffer that is not flushed in time can make the processor's reads and writes appear to execute out of order with respect to memory.
For example, operation A1 writes a = 1 into processor A's write buffer, so in main memory a is still 0; if processor B reads a from memory at that moment, it reads 0. Symmetrically, processor B's write b = 1 may still sit in its own buffer while processor A reads b = 0.
So it is possible that x and y are both 0.
The memory barrier
Two kinds of instructions:
Store Memory Barrier: The ability to write the latest data from the cache to main Memory and make it visible to other threads.
Load Memory Barrier: Invalidates data in the cache, forcing data to be reloaded from main Memory.
volatile
Volatile inserts a read barrier before each read and a write barrier after each write, avoiding the problems caused by CPU reordering and achieving data visibility across threads.
final
The initial write to a final field inside the constructor, and the subsequent assignment of the newly constructed object's reference to a reference variable, cannot be reordered.
The first read of a reference to an object containing a final field, and the subsequent first read of that final field, cannot be reordered. (Put plainly: the reference must be assigned and read before its final value is read.)
In summary, these rules mean that all final fields of an object must be fully written before the object can be referenced and read. This is where the memory barrier comes in:
Writes to final fields: the compiler inserts a StoreStore barrier after the write to the final field and before the constructor returns, ensuring the write to the final field is visible to other threads/CPUs and preventing reordering.
Conclusion:
Why insert barriers (also called fences)? Essentially because, for asynchronous operations such as store-buffer writes being flushed to memory, the business level cannot tolerate even a small delay: it demands a high degree of memory visibility.
What exactly is a memory barrier? A memory barrier is nothing but an abstraction, like OOP. If that does not help, think of it as a wall: the CPU cannot execute the instructions on either side of the wall out of order, and the reads and writes before the wall must complete before those after it.
The CPU puts read and write requests into a queue and executes them on a first-in, first-out basis. A read is complete when the CPU has performed the read and loaded the value into a register; a write is complete when the CPU has performed the write and the data has been flushed to memory.
Problem with lazy singleton pattern (DCL double-checked lock)
This singleton pattern is obviously not thread-safe: when multiple threads call it, they may all try to create an object at the same time, or a caller may end up with a reference to an incompletely initialized object.
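For reference, here is a minimal sketch of the kind of unsafe lazy singleton being described (the class name is hypothetical; the original code is not shown in the text):

```java
public class UnsafeLazySingleton {
    private static UnsafeLazySingleton instance;

    private UnsafeLazySingleton() { }

    // Two threads can both observe instance == null and both create an object;
    // and without synchronization there is no guarantee that a reader sees a
    // fully initialized instance.
    public static UnsafeLazySingleton getInstance() {
        if (instance == null) {
            instance = new UnsafeLazySingleton();
        }
        return instance;
    }
}
```

Single-threaded it behaves correctly; the failures only appear under concurrent calls.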
DCL double check lock
DCL essentially reduces lock granularity: if instance is already non-null at the first check, the locking and initialization below are skipped entirely, so the performance overhead of synchronized is greatly reduced. On the surface, the code above seems to have the best of both worlds: when multiple threads try to create the object at the same time, the lock ensures only one thread creates it, and once the object exists, getInstance() returns it without taking the lock. Double-checked locking may look perfect, but it is a flawed optimization!
In the code above:
instance = new DCLTestBean();
This statement can be broken down into three lines of pseudocode:
1. memory = allocate();   // allocate memory for the object
2. ctorInstance(memory);  // initialize the object
3. instance = memory;     // point instance at the allocated memory
Between lines 2 and 3 of this pseudocode, reordering is possible (and actually happens on some JIT compilers).
That is, a thread could observe instance != null and yet read a not-yet-initialized DCLTestBean.
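The standard fix, keeping the DCLTestBean name used in the text, is to declare instance volatile so the reordering between initialization and reference publication cannot be observed. A sketch:

```java
public class DCLTestBean {
    // volatile forbids reordering the initialization with the reference publication,
    // so no thread can see a non-null but half-initialized instance.
    private static volatile DCLTestBean instance;

    private DCLTestBean() { }

    public static DCLTestBean getInstance() {
        if (instance == null) {                    // first check, no lock
            synchronized (DCLTestBean.class) {
                if (instance == null) {            // second check, under the lock
                    instance = new DCLTestBean();  // volatile write: safe publication
                }
            }
        }
        return instance;
    }
}
```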
The eight principles
In a word, happens-before (HB) constrains instruction reordering in a single-threaded environment and data consistency between threads in a multithreaded environment. In the single-threaded case, serial semantics are guaranteed; in the multithreaded case, since we must declare and ensure data consistency ourselves, the JVM guarantees that consistency holds exactly as the HB principles state.
1. Single-thread happens-before rule: within the same thread, earlier operations happen-before later operations.
The first is single-threaded HB: the results of earlier operations must be visible to later operations. This does not mean the earlier operation must physically execute first; under as-if-serial semantics, for example, two instructions with no data dependence can be reordered. That still conforms to the HB principle, because neither instruction produces a result the other needs, so nothing has to be made visible to the other, and the actual execution order remains acceptable.
2. Lock rule: an unlock operation on a lock happens-before every subsequent lock operation on that same lock.
My understanding is that this emphasizes the visibility of the unlock operation in a multithreaded environment: a thread that locks later must be able to perceive the change in lock state in time, and the result of the unlock must be visible to subsequent lock operations, regardless of whether the two are in the same thread.
3. Volatile rule: a write to a volatile variable happens-before any subsequent operation on that variable.
The result of a write to a volatile variable is visible to any subsequent operation on it. On x86, volatile achieves cross-core consistency of the variable through memory barriers and the cache coherence protocol.
4. Transitivity rule: if A happens-before B and B happens-before C, then A happens-before C.
From this property, a happens-before relationship can be deduced between two operations that are not directly related.
5. Thread start rule: a call to a thread's start() method happens-before every other action of that thread.
The started thread's code may have no data dependence on what ran before start(), but this rule is clearly necessary for program correctness: side effects produced before start() must be visible to the started thread's methods.
6. Join rule: if thread A executes ThreadB.join() and it returns successfully, then every operation in ThreadB happens-before A's return from join().
7. Thread interruption rule: a call to a thread's interrupt() method happens-before the point where the interrupted thread's code detects the interruption (for example, via Thread.interrupted()).
8. Finalizer rule: the completion of an object's construction (the end of its constructor) happens-before the start of that object's finalize() method.
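The join rule (rule 6) can be demonstrated directly: after join() returns, reads of plain, non-volatile fields written by the joined thread are guaranteed to see those writes. A minimal sketch (names are mine):

```java
public class JoinHappensBeforeDemo {
    static int data = 0;   // plain field: safe to read below only because of join()

    static int computeInThread() throws InterruptedException {
        Thread worker = new Thread(() -> data = 42); // write inside ThreadB
        worker.start();
        worker.join();      // join() returning happens-before the read below
        return data;        // guaranteed to see 42, no volatile or lock needed
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(computeInThread()); // prints 42
    }
}
```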