Deep parsing of ThreadLocal

ThreadLocal source code parsing

We talked about ThreadLocal in Handler’s source code. Know that it is a thread-safe way to operate. So what’s the inner workings of it? Let’s find out this time.

use

public class TheadLocalDemo {
    public static void main(String[] args) {
        ThreadLocal<String> stringLocal = new ThreadLocal<String>() {
            @Override
            protected String initialValue(a) {// Can be initialized to prevent null pointer crash caused by direct get when not set
                return "initValue"; }}; ThreadLocal<Integer> intLocal =new ThreadLocal<>();
        Random random = new Random();
        IntStream.range(0.5).forEach(value -> {
            new Thread(() -> {
                intLocal.set(random.nextInt(100)); System.out.println(stringLocal.get()); System.out.println(intLocal.get()); stringLocal.remove(); }).start(); }); }}Copy the code

The test case here uses ThreadLocal and just sets and gets. Thread-safe processing is handled directly from within.

The source code parsing

For source code parsing, we still start with the code we use.

save

To save, just call the **set()** method, so what happens internally?

//src\main\java\java\lang\ThreadLocal.java
    public void set(T value) {
        Thread t = Thread.currentThread();
		// Get the ThreadLocalMap object in the thread
        ThreadLocalMap map = getMap(t);
		// If the map exists, save it
        if(map ! =null)
            map.set(this, value);
        else
			// If it does not exist, create and save it
            createMap(t, value);
    }

    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }
Copy the code

This is done by getting the current thread and then getting the ThreadLocalMap object stored internally by thread. If it does not exist, create it and save it. If it exists, save the data.

Let’s examine the ThreadLocalMap object here. This class is a static inner class of ThreadLocal.

    static class ThreadLocalMap {
        // Save data
        static class Entry extends WeakReference<ThreadLocal<? >>{
            /** The value associated with this ThreadLocal. */Object value; Entry(ThreadLocal<? > k, Object v) {super(k); value = v; }}private static final int INITIAL_CAPACITY = 16;
         Each time the object is saved, the corresponding hashcode value will be calculated. If a hash collision occurs, the object will be moved back one position and so on until an empty position is found.
        private Entry[] table;
Copy the code

You can see that ThreadLocalMap uses arrays to store data. Data is saved by Entry objects for keys and values. And keys are handled by soft references.

The creation of ThreadLocalMap

        // Create a ThreadLocalMap and save the first key and value values
        void createMap(Thread t, T firstValue) {
            // Create a ThreadLocalMap object and assign it to t.htreadlocals
            t.threadLocals = new ThreadLocalMap(this, firstValue); } ThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize the information, create an array, save the first data
            table = new Entry[INITIAL_CAPACITY];
			// Get the save location
            int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
            table[i] = new Entry(firstKey, firstValue);
            size = 1;
            // Set the expansion boundary point
            setThreshold(INITIAL_CAPACITY);
        }
Copy the code

If the current thread threadLocals is empty, the ThreadLocalMap object is created and the first node value is saved. When creating this object, three things are done:

Sets the initial array size
Find and save the location information of the saved node.
Set expansion boundary points

The save location is obtained by modulating the threadLocalHashCode of the key value. We’ll focus on how this value is generated later, but skip over it for the moment.

Data preservation

When the threadLocals object does not exist, the data is saved through the instance constructor. However, when the threadLocals object already exists, the set method is used to save the data

        private void set(ThreadLocal
        key, Object value) {
            Entry[] tab = table;
            int len = tab.length;
			// The hash value is used to obtain the array location corresponding to the key
            int i = key.threadLocalHashCode & (len-1);
			// Find the location where the data is not saved. This has to do with the saving mechanism of ThreadLocalMap.
            for(Entry e = tab[i]; e ! =null; e = tab[i = nextIndex(i, len)]) {// Get the key of the data stored at the corresponding location.ThreadLocal<? > k = e.get();if (k == key) {// if key is equal, overwrite and return
                    e.value = value;
                    return;
                }

                if (k == null) {//key is empty, but e exists. So at this point, the key is recycled.
					// Replace the value of position I with the corresponding data
                    replaceStaleEntry(key, value, i);
                    return; }}// If no data is found, save the data
            tab[i] = new Entry(key, value);
            int sz = ++size;
            if(! cleanSomeSlots(i, sz) && sz >= threshold)// If the threshold is reached, expand the capacity
                rehash();
        }
Copy the code

When data is saved, the key’s threadLocalHashCode first calculates the desired location. However, due to hash collisions, different ThreadLocal objects may be stored at the same POS. In this case, it takes “first come, first served” approach, and the last arrived node will find an empty node in the node after pos to save.

The whole preservation process is as follows:

Obtain the ideal storage node I,
Gets an Entry to the I position
- If Entry is not empty
  - If the key of the I node is equal to the key we want to save, it is overwritten and returns
  - If the key of the I node is empty, the key of the I node has been reclaimed (because the key is weakly referenced), the key and value values are saved at the I node using the replaceStaleEntry method and returned
  - If no, use the nextIndex method to obtain the next position and repeat Step 2
- If Entry is empty, the 2 loop is broken
The data is stored on node I
If the threshold for capacity expansion is still reached after cleaning up dirty data (the cleanSomeSlots method), capacity expansion is performed using the rehash method.

Let’s take a look at the nextIndex() method

         // Get the next protection node.
        private static int nextIndex(int i, int len) {
        	If I +1 exceeds the array length, 0 is returned
            return ((i + 1 < len) ? i + 1 : 0);
        }
Copy the code

Once at this point, our node can be saved to the array, which we will explore further in a later section.

To obtain

For data acquisition, it is through get method to get data

    public T get(a) {
        Thread t = Thread.currentThread();
		// Get the ThreadLocalMap of the thread
        ThreadLocalMap map = getMap(t);
        if(map ! =null) {
            // Key method to obtain the corresponding data store node
            ThreadLocalMap.Entry e = map.getEntry(this);
            if(e ! =null) {// Get the saved data based on the current object
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                returnresult; }}// If the map does not exist or the corresponding data does not exist, the initialization data is directly returned
        return setInitialValue();
    }
Copy the code

If the map already exists, the data is retrieved through ThreadLocalMap’s GET method. If the map does not exist or the value for the key is not found in the map, setInitialValue returns a default value.

setInitialValue

    // Initialize
    private T setInitialValue(a) {
    	// Initialize the value
        T value = initialValue();
		// Get the current thread
        Thread t = Thread.currentThread();
		// Get the ThreadLocalMap of the thread
        ThreadLocalMap map = getMap(t);
		// Get data from the map. The key value of the data is ThreadLocal itself.
        if(map ! =null)
			// If the map exists, save it
            map.set(this, value);
        else
			// If it does not exist, create and save
            createMap(t, value);
		// Return value
        return value;
    }
Copy the code

In the code we used, we set initialValue() once. This is our default value. The created key and value values are saved to ThreadLocalMap. **set()** is saved in the same way.

getEntry

When a map exists, **getEntry()** is used to retrieve the node corresponding to the key from the map.

        private Entry getEntry(ThreadLocal
        key) {
        	// Calculate the values stored in the table according to theadLocal's threadLocalHashCode
            int i = key.threadLocalHashCode & (table.length - 1);
            Entry e = table[i];
			// The key value of the Entry corresponding to the current I position is the same as the parameter key, indicating that it is the node information we are looking for.
            if(e ! =null && e.get() == key)
                return e;
            else
				// If the key of the Entry at position I is not the key we are looking for, the search is iterated and the result is returned
                return getEntryAfterMiss(key, i, e);
        }


Copy the code

In the analysis of the saved source code in the previous section, we know that due to the existence of hashing conflicts, the I location may not actually store the data we want to find. At this time, we need to search for the Entry corresponding to the key from the I location in turn.

        // Select * from node I where key is equal to key.
        private Entry getEntryAfterMiss(ThreadLocal<? > key,int i, Entry e) {
            Entry[] tab = table;
            int len = tab.length;

            while(e ! =null) { ThreadLocal<? > k = e.get();if (k == key)
                    return e;
                if (k == null)
					// Empty data is encountered.
                    expungeStaleEntry(i);
                else
                    i = nextIndex(i, len);// Get the next save node.
                e = tab[i];
            }
			// No data found
            return null;
        }
Copy the code

There are two kinds of search end conditions:

Until the Entry corresponding to the key is found.
A node with empty Entry was encountered.

When the value returned by **getEntryAfterMiss()** is null, the setInitiaValue() method returns the default value. The specific logic has been mentioned in the source code analysis just now.

delete

Deletion is done primarily through the remove() method

     public void remove(a) {
         ThreadLocalMap m = getMap(Thread.currentThread());
         if(m ! =null)
             m.remove(this);
     }
Copy the code

If the corresponding ThreadLocalMap exists, its remove() method is called

        private void remove(ThreadLocal
        key) {
            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);
            for(Entry e = tab[i]; e ! =null; e = tab[i = nextIndex(i, len)]) {if (e.get() == key) {
										// Set the entry key to NULL
                    e.clear();
										// Set the value of entry to null, and then handle a memory leak
                    expungeStaleEntry(i);
                    return; }}}Copy the code

The deletion method is relatively simple. The main method is to find the corresponding Entry node and set its key and value to NULL.

capacity

Like most containers, there is a capacity expansion mechanism for storing too much data. For capacity expansion, there are two key points:

The load factor is the boundary value for expansion.
Expansion plan

To determine the threshold of

        private static final int INITIAL_CAPACITY = 16;
        // Resize when the next size value exceeds this value
		private intthreshold; ThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize the information, create an array, save the first data
            table = new Entry[INITIAL_CAPACITY];
			// Set the expansion boundary point
            setThreshold(INITIAL_CAPACITY);
        }
        // Set the resizing threshold to maintain a worst-case 2/3 load factor.
        private void setThreshold(int len) {
            threshold = len * 2 / 3;
        }
Copy the code

From the source code, when we create theadLocal for the first time, we will create an array of theadLocalMap with an initial size of 16, and set the corresponding threshold through setThreshold. The value is two-thirds the length of the current array. That is, the load factor is 2/3 (the load factor is used to indicate how full the elements in the hash table are. The larger the loading factor, the greater the chance of conflict, the higher the cost of search. If it is too small, there will be low memory usage. So there needs to be a balance in the size of the loading factor). Here the initial size of ThreadLocalMap is 16 and the loading factor is 2/3, so the available size is 10.

Expansion plan

In the process of saving the data, we mentioned that the **rehash()** method is used to expand the capacity.

        private void rehash(a) {
        	// Clean up dirty data
            expungeStaleEntries();
            // Use a low double threshold to avoid hysteresis
            if (size >= threshold - threshold / 4)
                resize();
        }
Copy the code

In this method, we can see that the expansion process of resiz() is not carried out when the threshold is reached, but the resize method is called when the used size of the array is 3/4 of the threshold. In this way, hysteresis can be effectively avoided (this is different from HashMap, which is expanded after shreshold is reached).

        private void resize(a) {
            Entry[] oldTab = table;
            int oldLen = oldTab.length;
			// Double the new length
            int newLen = oldLen * 2;
            Entry[] newTab = new Entry[newLen];
            int count = 0;

            for (int j = 0; j < oldLen; ++j) {
                Entry e = oldTab[j];
                if(e ! =null) { ThreadLocal<? > k = e.get();if (k == null) {
                        e.value = null; // Help the GC
                    } else {
                    	// Calculate the ideal storage location according to the latest length.
                        int h = k.threadLocalHashCode & (newLen - 1);
                        while(newTab[h] ! =null) h = nextIndex(h, newLen); newTab[h] = e; count++; }}}// Reset the threshold
            setThreshold(newLen);
            size = count;
            table = newTab;
        }
Copy the code

For array expansion, there are two main steps:

Create an array twice the size of the original array
Iterate over the old array and insert all of its non-dirty numbers into the new array.

There is one caveat here: If a node Entry with an empty key is found during capacity expansion, its value is set to null so that it can be garbage collected to resolve hidden memory leaks.

Hash collision

When we talked about creating ThreadLocal instances, we learned that each ThreadLocal instance has a hash value threadLocalHashCode. The ideal location for an instance in an array is key.threadLocalHashCode & (len-1). In this way, it is inevitable that there will be different instances, and eventually the calculated save position pos will be the same. This is called hash collision.

So how does ThreadLocal try to avoid this? The key is the generation of threadLocalHashCode.

    // The current hashcode of ThreadLocal.
    private final int threadLocalHashCode = nextHashCode();

    // This is a static variable. All ThreadLocal nextHashCode uses the same variable
    private static AtomicInteger nextHashCode = new AtomicInteger();
    private static final int HASH_INCREMENT = 0x61c88647;
    // Returns the next hashCode
    private static int nextHashCode(a) {
        // Get the hash value by atomically manipulating the class
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }
Copy the code

In the source code, the nextHashCode() method is used to get the corresponding value. This method is obtained by adding 0x61C88647 to the atomic operation instance of the class’s static nextHashCode variable.

Here 0x61C88647 is the magic number that guarantees the least possible hash collision.

The length of the array

The actual location of the instance in the map is obtained by **key.threadLocalHashCode & (len-1)**, so let’s look at len here first. For the size of the array, one is the default initial value, and the other is only changed when the array is expanded.

        /** * The initial size must be an integer power of 2 */
        private static final int INITIAL_CAPACITY = 16; ThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize the information, create an array, save the first data
            table = newEntry[INITIAL_CAPACITY]; . }private void resize(a) {
            Entry[] oldTab = table;
            int oldLen = oldTab.length;
			// Double the new length
            int newLen = oldLen * 2;
            Entry[] newTab = newEntry[newLen]; . table = newTab; }Copy the code

For the TABLE array, the initial size is 16, and it doubles with each expansion. So the size of the array is always 2 to the N (len=2 to the N). Then the binary corresponding to len-1 is: N consecutive ones in the low order.

The value of key.threadLocalHashCode& (len-1) is the lowest N bits of threadLocalHashCode.

The magic number 0 x61c88647

To avoid collisions, we definitely want the hashCode generated by ThreadLocal to be evenly distributed in a 2 ^ N array. How about 0x61C88647?

So let’s write a test code

public class MagicNumTest {
    private static int Magic_Num=0x61c88647;

    public static void main(String[] args) {
        hash(16);
        hash(32);
        hash(64);
    }

    public static void hash(int len) {
        int hashCode=0;
        for (int i = 0; i < len; i++) {
            hashCode=i*Magic_Num+Magic_Num;
            System.out.print(hashCode&(len-1));
            System.out.print(""); } System.out.println(); }}Copy the code

Corresponding output result:

Length of16:7 14 5 12 3 10 1 8 15 6 13 4 11 2 9 0Length of32:7 14 21 28 3 10 17 24 31 6 13 20 27 2 9 16 23 30 5 12 19 26 1 8 15 22 29 4 11 18 25 0Length of64:7 14 21 28 35 42 49 56 63 6 13 20 27 34 41 48 55 62 5 12 19 26 33 40 47 54 61 4 11 18 25 32 39 46 53 60 3 10 17 24 31 38 45 52 59 2 9 16 23 30 37 44 51 58 1 8 15 22 29 36 43 50 57 0 
Copy the code

The result hashed by 0x61C88647 is evenly distributed. The Fibonacci hash method is used, and 0x61C88647 = 2^32 * golden ratio.

A memory leak

In the case of a ThreadLocal, the specific content is in the saved Entry.

        static class Entry extends WeakReference<ThreadLocal<? >>{ Object value; Entry(ThreadLocal<? > k, Object v) {super(k); value = v; }}Copy the code

Let’s take a look at the actual reference diagram here

When the external strong reference of the external TheadLocal is null, the threadLocal instance has no strong reference link and will be reclaimed the next time the system GC is performed, so the Entry Key becomes null. In this case, the Value of the Entry cannot be queried through the Entry’s key Value. But now that Entry exists in an array of maps in Thread, there is a chain of references for values: Thread->ThreadLocalMap->Entry->Value causes the Value to be unreachable, but in practice, because the Key is retracted, the Value is never accessed, causing a memory leak.

Although these are destroyed when a Thread is destroyed, threads created from a Thread pool are not reclaimed for reuse, resulting in persistent memory leaks.

ThreadLocal refers to this memory leak as “dirty data” and already handles some of it itself.

For example, when setting data

        private void set(ThreadLocal
        key, Object value) {...if (k == null) {//key is empty, but e exists. So at this point, the key is recycled.
										// Replace the value of position I with the corresponding data
                    replaceStaleEntry(key, value, i);
                    return; }}...if(! cleanSomeSlots(i, sz) && sz >= threshold)// If the threshold is reached, expand the capacity
                rehash();
        }
Copy the code

In this method, dirty data is processed

After a hash conflict occurs, the **replaceStaleEntry()** method is called if an Entry key is found to be null during a backward loop lookup
After saving the data to the array, the **cleanSomeSlots()** method is called to detect and clean up the dirty data

replaceStaleEntry

       // Replace old data. This method has the added benefit of being able to remove dirty data from the sequence (the queue between two empty nodes) of the node where staleSlot is located
        private void replaceStaleEntry(ThreadLocal<? > key, Object value,int staleSlot) {
            Entry[] tab = table;
            int len = tab.length;
            Entry e;
            // Find the first dirty entry of the sequence
            int slotToExpunge = staleSlot;
            for (inti = prevIndex(staleSlot, len); (e = tab[i]) ! =null; i = prevIndex(i, len))if (e.get() == null)
                    slotToExpunge = i;
            for (inti = nextIndex(staleSlot, len); (e = tab[i]) ! =null; i = nextIndex(i, len)) { ThreadLocal<? > k = e.get();if (k == key) {
					// Entries with the same key are overwritten and exchanged with dirty entries.
                    e.value = value;

                    tab[i] = tab[staleSlot];
                    tab[staleSlot] = e;
                    if (slotToExpunge == staleSlot)
                        slotToExpunge = i;
                    cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                    return;
                }
                if (k == null && slotToExpunge == staleSlot)
                    slotToExpunge = i;
            }
            // If no entry is found to override during the search, a new entry is inserted into the staleSlot location
            tab[staleSlot].value = null;
            tab[staleSlot] = new Entry(key, value);
            if(slotToExpunge ! = staleSlot) cleanSomeSlots(expungeStaleEntry(slotToExpunge), len); }Copy the code

When dirty data is encountered, it not only overwrites the data at the current location, but also removes the dirty data in the sequence where staleSlot is located (i.e. the queue between two empty nodes), thus trying to deal with the problem of memory leakage.

cleanSomeSlots

        // Clean up some dirty data with key null
        private boolean cleanSomeSlots(int i, int n) {
            boolean removed = false;
            Entry[] tab = table;
            int len = tab.length;
            do {
								// Get the next node
                i = nextIndex(i, len);
                Entry e = tab[i];
                if(e ! =null && e.get() == null) {
										// If dirty data is encountered, set the flag position to true, and then set n to the length of the data to increase the number of searches
                    n = len;
                    removed = true; i = expungeStaleEntry(i); }}while ( (n >>>= 1) != 0);
							// the loop condition is n>>>1, that is, if the input parameter n is searched log2(n) times,
	            // If dirty data is found in the middle, set n to the size of the table to increase the number of searches
            return removed;
        }
Copy the code

CleanSomeSlots starts at the I node and looks back to the node.

If dirty data is found, the expungeStaleEntry method is called to clean up the data, and then n is set to the length of the array.
Each time the loop, n >>>= 1, goes n over 2.

When dirty data is found, it is considered that there may be dirty data in the vicinity of dirty data. Therefore, n is modified to expand the search times.

expungeStaleEntry

    	// Clean up dirty data (i.e. data with empty keys)
        private int expungeStaleEntry(int staleSlot) {
          	// Set the corresponding value to null
            tab[staleSlot].value = null;
            tab[staleSlot] = null;
            size--;
            Entry e;
            int i;
          	// Start from staleSlot and loop back until table[I]=null is reached
            for(i = nextIndex(staleSlot, len); (e = tab[i]) ! =null; i = nextIndex(i, len)) { ThreadLocal<? > k = e.get();// If you find dirty data during the backward search, clean it up
                if (k == null) {
                    e.value = null;
                    tab[i] = null;
                    size--;
                } else {
                    // To rehash,
                    // Because of hash conflicts, the good location of the original Entry may not be the ideal place to save it. Now, because the dirty data is cleaned up, one pos may be empty between the current saving location and the desired saving location, and it is necessary to move the processing
                    int h = k.threadLocalHashCode & (len - 1);
                    / / h! = I, indicating that the pos saved is not the ideal position, indicating that h should have been saved when it was saved, but now it is saved in the I position.
                    if(h ! = i) { tab[i] =null;

                        // Unlike Knuth 6.4 Algorithm R, we must scan until
                        // null because multiple entries could have been stale.
                        // Start at h, find the empty position, and save e to the position.
                        while(tab[h] ! =null) h = nextIndex(h, len); tab[h] = e; }}}return i;
        }
Copy the code

The expungeStaleEntry function not only cleans the dirty data at the staleSlot location, but also iterates backwards from the staleSlot location. If dirty data is found during the backward search, it will also clean up the dirty data.

ThreadLocal handles useless nodes every time a function is called, much like SparseArray on Android. Everyone interested can go to see the corresponding source.

Usage scenarios

SimpleDateFormat

Handler

conclusion

Each Thread is a Thread as an example, its internal have a called threadLocals instance members, its type is ThreadLocal ThreadLocalMap
By instantiating ThreadLocal instances, we can set thread-private variables to the currently running thread, which can be accessed by calling ThreadLocal’s set and GET methods
ThreadLocal itself is not a container; the value we access is actually stored in the ThreadLocalMap, which serves as the key for TheadLocalMap
ThreadLocal keys and values are assembled into Entry objects. ThreadLocalMap holds an array of entries.
The Entry position in the array is related to threadLocalHashCode in ThreadLocal, which is generated from the static nextHashCode variable in ThreadLocals.
If there is a conflict between the saving locations, the next location is saved. However, we cannot obtain the Entry directly. Instead, we need to determine whether the Entry key is our ThreadLocal object after obtaining it.
ThreadLocal stores data through weak references. So key values in ThreadLocalMap can be reclaimed. The corresponding value is no longer used. But if Thead is always present, ThreadLocalMap is not destroyed. As a result, the value persists and cannot be reclaimed, resulting in a memory leak. You can remove it by using the remove method (in fact, when set, it will also optimize the condition that the key is empty and the value is not empty to save new data).

reference

www.jianshu.com/p/dde92ec37…

www.cnblogs.com/ilellen/p/4…

www.jianshu.com/p/30ee77732…

zhuanlan.zhihu.com/p/40515974

www.pianshen.com/article/450…

This article is published by Kaiken!

Open Ken

Personal wechat id: on_the_wayer