Understand ThreadLocal in depth


What exactly is a ThreadLocal?

First, look at how it is used:
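A minimal sketch of such a usage, assuming five worker threads that each store a different value in one shared ThreadLocal. The class name ThreadLocalUsageDemo, the thread names, and the stored values are illustrative choices, not part of the original listing:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadLocalUsageDemo {
    // One shared ThreadLocal; each thread still sees only its own value.
    private static final ThreadLocal<Integer> LOCAL = ThreadLocal.withInitial(() -> 0);

    /** Starts five threads and records what each one reads back from LOCAL. */
    static Map<String, Integer> runFiveThreads() {
        Map<String, Integer> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[5];
        for (int i = 0; i < workers.length; i++) {
            final int value = i * 10;
            workers[i] = new Thread(() -> {
                LOCAL.set(value);
                Thread.yield(); // give the other threads a chance to interleave
                seen.put(Thread.currentThread().getName(), LOCAL.get());
                LOCAL.remove();
            }, "worker-" + i);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Each worker reads back exactly the value it stored itself.
        runFiveThreads().forEach((name, v) -> System.out.println(name + " -> " + v));
    }
}
```

Every thread calls set and get on the same LOCAL object, yet worker-3 only ever reads 30, because the value is looked up in the calling thread's own map.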

From the above results, we can see that the values the five threads obtain through ThreadLocal do not affect each other, which is how thread safety is achieved.

So how does it do this, and what is the underlying structure?

If you look at the ThreadLocal source code, you won't see any member fields that store data, so where does the data reside?

The no-argument get method first obtains the current Thread object, then obtains the ThreadLocalMap from that thread, and finally looks up the entry in the map using this (the current ThreadLocal object) as the key.

    public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        // No map or no entry yet: initialize and return the initial value
        return setInitialValue();
    }

Initialize ThreadLocalMap

    private T setInitialValue() {
        // Get the initial value
        T value = initialValue();
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            // No map yet: create the ThreadLocalMap and initialize it
            createMap(t, value);
        return value;
    }

Initialize the ThreadLocalMap of thread t and assign the value

    void createMap(Thread t, T firstValue) {
        t.threadLocals = new ThreadLocalMap(this, firstValue);
    }
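The lazy path through get(), setInitialValue(), and createMap() can be observed from the outside. The following is a small sketch that counts how often initialValue() runs; the class LazyInitDemo and its counter are hypothetical helpers introduced only for this illustration:

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyInitDemo {
    static final AtomicInteger INIT_CALLS = new AtomicInteger();

    static final ThreadLocal<String> NAME = new ThreadLocal<String>() {
        @Override
        protected String initialValue() {
            INIT_CALLS.incrementAndGet();
            return "default-for-" + Thread.currentThread().getName();
        }
    };

    /** Runs the body on a fresh thread and returns how many initialValue() calls it caused. */
    private static int initCallsCausedBy(Runnable body) {
        int before = INIT_CALLS.get();
        Thread t = new Thread(body);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return INIT_CALLS.get() - before;
    }

    /** Two get() calls on a fresh thread: only the first one has to initialize. */
    static int initCallsForTwoGets() {
        return initCallsCausedBy(() -> {
            NAME.get(); // no entry yet, so the initial value is computed and stored
            NAME.get(); // entry exists now, so it is returned straight from the map
        });
    }

    /** set() before get(): the value is stored directly and initialValue() is never needed. */
    static int initCallsForSetThenGet() {
        return initCallsCausedBy(() -> {
            NAME.set("explicit");
            NAME.get();
        });
    }

    public static void main(String[] args) {
        System.out.println(initCallsForTwoGets());    // prints 1
        System.out.println(initCallsForSetThenGet()); // prints 0
    }
}
```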

Let's look at how a ThreadLocalMap is stored in a thread.

    public class Thread implements Runnable {
        // Thread member variable: each thread has its own ThreadLocalMap
        ThreadLocal.ThreadLocalMap threadLocals = null;

        // Thread member variable for inheritable values, analyzed later
        ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;
    }

As the Thread class source shows, each Thread object has a ThreadLocalMap field.

ThreadLocal itself does not store data. It is a utility class that indirectly manipulates the ThreadLocalMap field of the Thread object.

Since the data lives in a ThreadLocalMap, let's examine the structure and underlying implementation of ThreadLocalMap.

Let's start with the relationship between ThreadLocalMap, Thread, and ThreadLocal

As you can see from this diagram, the Thread class holds a ThreadLocalMap reference, and a ThreadLocalMap is essentially an array of Entry objects. The key of an Entry is a ThreadLocal, and the value is an Object. That is, one ThreadLocalMap can hold the values of multiple ThreadLocal objects.

Look at the class diagram

To get things straight:

1. Thread holds a reference to a ThreadLocalMap, in a 1-to-1 relationship.

2. Entry is a static nested class of ThreadLocalMap, and ThreadLocalMap holds an array of Entry objects. That is, one ThreadLocalMap corresponds to multiple entries.

3. The relationship between ThreadLocal and ThreadLocalMap is the most difficult to describe, because ThreadLocalMap is a static nested class of ThreadLocal (not a subclass), while the keys it stores are ThreadLocal objects held through weak references.
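These relationships can be mimicked with a deliberately simplified model. This is not the JDK implementation: ModelThread, ModelThreadLocal, and ModelDemo are hypothetical names, and an ordinary HashMap with strong keys stands in for the weakly keyed Entry array. It only shows that the thread owns the map and the thread-local object is merely the key:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for Thread: the thread owns the map (1-to-1). */
class ModelThread extends Thread {
    final Map<ModelThreadLocal<?>, Object> locals = new HashMap<>();

    ModelThread(Runnable body) {
        super(body);
    }
}

/** Toy stand-in for ThreadLocal: it holds no data and only serves as the key. */
class ModelThreadLocal<T> {
    @SuppressWarnings("unchecked")
    T get() {
        return (T) currentMap().get(this);
    }

    void set(T value) {
        currentMap().put(this, value);
    }

    // Assumes the calling thread is a ModelThread; a plain Thread would fail the cast.
    private static Map<ModelThreadLocal<?>, Object> currentMap() {
        return ((ModelThread) Thread.currentThread()).locals;
    }
}

class ModelDemo {
    static String run() {
        ModelThreadLocal<String> user = new ModelThreadLocal<>();
        ModelThreadLocal<Integer> count = new ModelThreadLocal<>();
        String[] seenByB = new String[1];

        ModelThread a = new ModelThread(() -> {
            user.set("alice");
            count.set(1);
        });
        ModelThread b = new ModelThread(() -> {
            user.set("bob");
            seenByB[0] = user.get() + "/" + count.get(); // b never set count
        });
        startAndJoin(a);
        startAndJoin(b);
        // Thread a's map has two entries, thread b's map has one.
        return a.locals.size() + "," + b.locals.size() + "," + seenByB[0];
    }

    private static void startAndJoin(Thread t) {
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2,1,bob/null
    }
}
```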

Take a look at their code relationships:

    public class ThreadLocal<T> {

        static class ThreadLocalMap {

            /**
             * The key is managed directly by a WeakReference.
             */
            static class Entry extends WeakReference<ThreadLocal<?>> {
                /** The value associated with this ThreadLocal. */
                Object value;

                Entry(ThreadLocal<?> k, Object v) {
                    super(k);
                    value = v;
                }
            }
        }
    }

Why is ThreadLocalMap designed as an inner class?

ThreadLocalMap is a hash map used only to hold thread-local values, and all of its methods are private. No other class can call them, so the map is hidden from other classes. The class itself is package-private, which means that only classes in the same package can reference ThreadLocalMap. This is why Thread can reference ThreadLocalMap: both are in the same package.

Although Thread can reference ThreadLocalMap, it cannot call any of its methods. This is why we always get and set values through ThreadLocal.

What are the benefits of this design?

ThreadLocalMap is invisible to the user. We only ever work with ThreadLocal, which keeps it easy to use and well encapsulated.

Set method source analysis:

    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            // Operate on the ThreadLocalMap: store the data, with this ThreadLocal object as the key
            map.set(this, value);
        else
            createMap(t, value);
    }

    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }

    // On first use, the thread's ThreadLocalMap is initialized
    void createMap(Thread t, T firstValue) {
        t.threadLocals = new ThreadLocalMap(this, firstValue);
    }

Now notice the set method of ThreadLocalMap:

    private void set(ThreadLocal<?> key, Object value) {
        // We don't use a fast path as with get(), because using set() to
        // create a new entry is at least as common as replacing an existing
        // one, in which case a fast path would fail more often than not.
        Entry[] tab = table;
        int len = tab.length;
        // Compute the slot index from the key's hash code
        int i = key.threadLocalHashCode & (len-1);

        // Keep probing the next index until a slot holding null is found
        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            // Same key: overwrite the value
            if (k == key) {
                e.value = value;
                return;
            }

            // Stale entry: replace the old data with the new key and value
            if (k == null) {
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }

    /**
     * Increment i modulo len.
     */
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

Why does ThreadLocalMap use open addressing to resolve hash collisions?

Most hash-based classes in the JDK use the chain address method (separate chaining) to resolve hash collisions. Why does ThreadLocalMap use the open address method (open addressing) instead? First let's look at these two approaches:

1. Chain address method

The basic idea of this method is to link all elements whose hash address is i into a singly linked list and store the head pointer of that list in the i-th cell of the hash table, so search, insertion, and deletion are mainly carried out within this chain.

2. Open address method

The basic idea of this approach is to look for the next empty hash address whenever a collision occurs (this is very important: the source code relies on this property, so it must be understood before proceeding). As long as the hash table is large enough, an empty hash address will always be found and the record will be saved there.

Advantages and disadvantages of chain address method and open address method

Open address method:

1. It is prone to clustering, so it is not suitable for large-scale data storage.

2. The design of the hash function has a great influence on collisions, and several collisions may occur during a single insertion.

3. When the deleted element is one of several colliding elements, the elements after it need to be re-processed, so the implementation is more complicated.

Chain address method:

1. Collisions are simple to handle, there is no clustering, and the average search length is short.

2. Nodes in the linked list are allocated dynamically, which suits situations where the table length cannot be determined in advance.

3. Deleting a node is easy to implement: simply remove the corresponding node from the linked list.

4. Pointers need extra space, so when the node size is small, the open address method saves space.
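To make the trade-offs above concrete, in particular the deletion cost of the open address method, here is a minimal linear-probing table. It is a sketch under simplifying assumptions (String keys, int values, fixed power-of-two capacity, no resizing) and LinearProbingTable is a hypothetical class, not the ThreadLocalMap code:

```java
class LinearProbingTable {
    private final String[] keys;
    private final int[] values;
    private int size;

    LinearProbingTable(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new String[capacity];
        values = new int[capacity];
    }

    private int home(String key) {
        return key.hashCode() & (keys.length - 1);
    }

    private int next(int i) {
        return (i + 1 < keys.length) ? i + 1 : 0;
    }

    void put(String key, int value) {
        int i = home(key);
        while (keys[i] != null) {           // collision: probe the next slot
            if (keys[i].equals(key)) {
                values[i] = value;
                return;
            }
            i = next(i);
        }
        if (size + 1 == keys.length) {      // always keep one empty slot so probing terminates
            throw new IllegalStateException("table full");
        }
        keys[i] = key;
        values[i] = value;
        size++;
    }

    Integer get(String key) {
        int i = slotOf(key);
        return i < 0 ? null : values[i];
    }

    /** Returns the slot currently holding the key, or -1 if it is absent. */
    int slotOf(String key) {
        for (int i = home(key); keys[i] != null; i = next(i)) {
            if (keys[i].equals(key)) {
                return i;
            }
        }
        return -1;
    }

    void remove(String key) {
        int i = slotOf(key);
        if (i < 0) {
            return;
        }
        keys[i] = null;
        size--;
        // Re-place every entry in the rest of the run; otherwise the new gap
        // would cut off lookups for keys that had probed past this slot.
        for (i = next(i); keys[i] != null; i = next(i)) {
            String k = keys[i];
            int v = values[i];
            keys[i] = null;
            size--;
            put(k, v);
        }
    }
}
```

With a capacity of 8, the keys "a", "i", and "q" all hash to slot 1, so they occupy slots 1, 2, and 3. Removing "a" moves "i" to slot 1 and "q" to slot 2. If remove only nulled out slot 1, a later get("q") would stop at the empty slot and wrongly report the key as missing.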

ThreadLocalMap uses the open address method for these reasons:

1. The ThreadLocal class defines HASH_INCREMENT = 0x61c88647. This magic number lets the hash codes be distributed evenly in an array of size 2^N, that is, in the Entry[] table.

2. The amount of data stored through ThreadLocal is usually not very large (and because the keys are weak references, they get garbage collected, which reduces the amount of data over time). In this case, the simple structure of the open address method saves more space, and array lookups are very efficient.
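The even distribution can be checked directly. The sketch below assumes, as in the JDK source, that each new ThreadLocal takes the next multiple of HASH_INCREMENT as its hash code (starting from 0) and that the slot is hash & (len - 1); HashIncrementDemo is a hypothetical class name:

```java
import java.util.Arrays;

class HashIncrementDemo {
    static final int HASH_INCREMENT = 0x61c88647;

    /** Slots chosen by the first tableSize hash codes; tableSize must be a power of two. */
    static int[] slotSequence(int tableSize) {
        int[] slots = new int[tableSize];
        int hash = 0;
        for (int n = 0; n < tableSize; n++) {
            slots[n] = hash & (tableSize - 1);
            hash += HASH_INCREMENT; // int overflow simply wraps around
        }
        return slots;
    }

    /** Number of different slots hit by the first tableSize hash codes. */
    static int distinctSlots(int tableSize) {
        return (int) Arrays.stream(slotSequence(tableSize)).distinct().count();
    }

    public static void main(String[] args) {
        // prints [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
        System.out.println(Arrays.toString(slotSequence(16)));
        System.out.println(distinctSlots(32)); // prints 32
    }
}
```

Because the increment is odd, its first 2^N multiples are all different modulo 2^N, so the first 2^N keys land in 2^N different slots without a single collision.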

Automatic cleanup source code:

    private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                                   int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;
        Entry e;

        // Back up to check for prior stale entry in current run.
        // We clean out whole runs at a time to avoid continual
        // incremental rehashing due to garbage collector freeing
        // up refs in bunches (i.e., whenever the collector runs).
        int slotToExpunge = staleSlot;
        for (int i = prevIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = prevIndex(i, len))
            // Look for stale entries before staleSlot
            if (e.get() == null)
                slotToExpunge = i;

        for (int i = nextIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();

            if (k == key) {
                e.value = value;

                // Swap places. Otherwise the next set would store the key
                // directly at the lower index, leaving two entries with the same key.
                tab[i] = tab[staleSlot];
                tab[staleSlot] = e;

                // Start expunge at preceding stale entry if it exists
                if (slotToExpunge == staleSlot)
                    slotToExpunge = i;
                // Clean up starting from slotToExpunge
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                return;
            }

            // If we didn't find stale entry on backward scan, the
            // first stale entry seen while scanning for key is the
            // first still present in the run.
            if (k == null && slotToExpunge == staleSlot)
                slotToExpunge = i;
        }

        // If key not found, put new entry in stale slot
        tab[staleSlot].value = null;
        tab[staleSlot] = new Entry(key, value);

        // If there are any other stale entries in run, expunge them
        if (slotToExpunge != staleSlot)
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
    }

    private int expungeStaleEntry(int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;

        tab[staleSlot].value = null;
        tab[staleSlot] = null;
        size--;

        Entry e;
        int i;
        for (i = nextIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                // Set these to null so the GC can collect them
                e.value = null;
                tab[i] = null;
                size--;
            } else {
                // This is the open address method, so the removed element may be
                // one of several colliding elements and the elements after it must
                // be re-placed. If a slot were only set to null, the elements
                // behind it could never be reached again.
                int h = k.threadLocalHashCode & (len - 1);
                if (h != i) {
                    tab[i] = null;

                    while (tab[h] != null)
                        h = nextIndex(h, len);
                    tab[h] = e;
                }
            }
        }
        return i;
    }

    private boolean cleanSomeSlots(int i, int n) {
        boolean removed = false;
        Entry[] tab = table;
        int len = tab.length;
        do {
            i = nextIndex(i, len);
            Entry e = tab[i];
            if (e != null && e.get() == null) {
                // A stale entry was found: reset n to len to extend the scan
                n = len;
                removed = true;
                // Call expungeStaleEntry once to clean up (this helps garbage collection)
                i = expungeStaleEntry(i);
            }
        } while ( (n >>>= 1) != 0); // unsigned right shift by one bit, i.e. n is halved each pass
        return removed;
    }
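The loop condition (n >>>= 1) != 0 is what keeps this scan cheap. As a small sketch, the hypothetical helper below counts how many slots the loop examines when no stale entry is found, so n is never reset to len:

```java
class ScanBudgetDemo {
    /** Number of passes of do { ... } while ((n >>>= 1) != 0) for a starting n > 0. */
    static int passes(int n) {
        int count = 0;
        do {
            count++;                  // one slot is examined per pass
        } while ((n >>>= 1) != 0);    // halve n; stop once it reaches 0
        return count;
    }

    public static void main(String[] args) {
        System.out.println(passes(1));  // prints 1
        System.out.println(passes(10)); // prints 4
        System.out.println(passes(16)); // prints 5
    }
}
```

So an ordinary call examines only about log2(n) slots. Each stale entry found along the way resets n to len, which extends the scan by roughly log2(len) more slots.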

From the source code above, we know that expungeStaleEntry() is what helps garbage collection: it clears stale entries so that their values can be reclaimed. We can also see that both get and set can trigger expungeStaleEntry(), so under normal use there is no memory leak. But if get and set are never called again, stale entries are never cleaned up and we might eventually run out of memory, so make it a habit to call remove() as soon as a value is no longer needed. This speeds up garbage collection and avoids the leak.

Even if we never call get, set, or remove, once the thread terminates there is no longer a strong reference to its ThreadLocalMap, so the map and its entries are reclaimed. The danger lies with thread pools: a pooled thread does not terminate when it finishes executing a task, it is simply returned to the pool, so its ThreadLocalMap and entries are not reclaimed.

Thoughts on ThreadLocal

When ThreadLocalMap finds an entry whose key is null, it tries to clean up stale entries. It first scans backward through the run to find the lowest index that needs cleaning, then scans forward to find the first entry with the same key. If such an entry exists, it is swapped into the stale slot. This prevents the same key from ending up in two slots and moves the live data closer to its home index.

ThreadLocalMap keys are designed as weak references so that, once the ThreadLocal has been collected, the null key marks the entry as stale and reclaimable. Stale entries are reclaimed and cleaned up during linear probing.

ThreadLocal has two cleanup mechanisms, which are discussed separately because their application scenarios differ:

The remove method is the active removal mechanism.

The passive mechanism applies when the entry's WeakReference get() returns null: the ThreadLocal object has been reclaimed, but the value still exists.

When a ThreadLocal loses its last strong reference, it survives only until the next GC, after which an Entry with a null key appears in the ThreadLocalMap. If the current thread does not terminate, the value of that Entry stays strongly reachable (Thread -> ThreadLocalMap -> Entry -> value), and a memory leak occurs.
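The state of such an entry can be reproduced deterministically. In this sketch, Entry is a hypothetical class with the same shape as ThreadLocalMap.Entry (weak key, strong value), and WeakReference.clear() is called by hand to simulate what the garbage collector does once the key has no strong references left:

```java
import java.lang.ref.WeakReference;

class StaleEntryDemo {
    /** Same shape as the JDK entry: the key is weakly held, the value strongly held. */
    static final class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    /** Returns {key visible before, key gone after, value still held}. */
    static boolean[] simulate() {
        Object key = new Object();
        Entry entry = new Entry(key, new byte[1024]);

        boolean keyVisibleBefore = entry.get() == key;

        // Simulation only: a real GC clears the referent after the last
        // strong reference to the key has disappeared.
        entry.clear();

        boolean keyGoneAfter = entry.get() == null;
        boolean valueStillHeld = entry.value != null; // this is the leaked part
        return new boolean[] {keyVisibleBefore, keyGoneAfter, valueStillHeld};
    }

    public static void main(String[] args) {
        boolean[] r = simulate();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints true true true
    }
}
```

As long as the entry itself stays reachable from a live thread's map, the value cannot be collected until a cleanup pass or remove() sets it to null.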

Solution:

It is recommended to declare ThreadLocal variables as private static. After the calls to get() and set() are done, call remove() to manually clear the value that is no longer needed.
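A minimal sketch of this advice in a thread-pool setting, assuming a single-threaded executor so that both tasks are guaranteed to run on the same pooled thread. PooledThreadLocalDemo and CONTEXT are illustrative names:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadLocalDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CONTEXT.set("task-1 data");
                try {
                    // ... work that reads CONTEXT.get() would go here ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove(); // always clean up before the thread is reused
                    }
                }
            };
            Callable<String> second = CONTEXT::get;

            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByNextTask(false)); // prints task-1 data
        System.out.println(valueSeenByNextTask(true));  // prints null
    }
}
```

Without remove(), the value outlives the task and is visible to whatever runs next on the same thread. Putting remove() in a finally block ensures it runs even when the task throws.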

Understanding InheritableThreadLocal

InheritableThreadLocal is a subclass of ThreadLocal that lets a child thread inherit the values held by its parent thread. It is used in the same way as ThreadLocal. Following the template method design pattern, it overrides getMap and createMap.

    public class InheritableThreadLocal<T> extends ThreadLocal<T> {

        protected T childValue(T parentValue) {
            return parentValue;
        }

        // Return the thread's inheritableThreadLocals
        ThreadLocalMap getMap(Thread t) {
            return t.inheritableThreadLocals;
        }

        // Set the thread's inheritableThreadLocals
        void createMap(Thread t, T firstValue) {
            t.inheritableThreadLocals = new ThreadLocalMap(this, firstValue);
        }
    }

    // In Thread, where a thread is created, the inheritableThreadLocals variable is initialized
    if (inheritThreadLocals && parent.inheritableThreadLocals != null)
        this.inheritableThreadLocals =
            ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);
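The effect of this initialization can be seen in a short sketch. InheritableDemo, TRACE_ID, and PLAIN are illustrative names; the child thread is created with the ordinary Thread(Runnable) constructor, which inherits inheritable thread-local values:

```java
class InheritableDemo {
    static final InheritableThreadLocal<String> TRACE_ID = new InheritableThreadLocal<>();
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();

    /** Returns {TRACE_ID seen by child, PLAIN seen by child, TRACE_ID in parent afterwards}. */
    static String[] run() {
        TRACE_ID.set("parent-trace");
        PLAIN.set("parent-plain");
        String[] seenByChild = new String[2];

        // The parent's inheritable values are copied when the child is constructed.
        Thread child = new Thread(() -> {
            seenByChild[0] = TRACE_ID.get();
            seenByChild[1] = PLAIN.get();
            TRACE_ID.set("child-trace"); // changes only the child's own entry
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        String parentAfter = TRACE_ID.get();
        TRACE_ID.remove();
        PLAIN.remove();
        return new String[] {seenByChild[0], seenByChild[1], parentAfter};
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints parent-trace null parent-trace
    }
}
```

Note that the copy happens when the child thread is created, so threads that already exist in a pool do not pick up values the submitting thread sets later.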

The Spring framework's Web module uses both ThreadLocal and InheritableThreadLocal to store the request attributes for each thread.

    public abstract class RequestContextHolder {

        private static final boolean jsfPresent =
                ClassUtils.isPresent("javax.faces.context.FacesContext", RequestContextHolder.class.getClassLoader());

        private static final ThreadLocal<RequestAttributes> requestAttributesHolder =
                new NamedThreadLocal<>("Request attributes");

        private static final ThreadLocal<RequestAttributes> inheritableRequestAttributesHolder =
                new NamedInheritableThreadLocal<>("Request context");
    }

ThreadLocal is widely used in frameworks and may well come up in everyday work. It gives each thread its own copy of the data, which helps ensure thread safety.
