In the process of learning Java concurrent programming, ThreadLocal must be avoided. In actual development, ThreadLocal application scenarios are very rich:

  1. Isolation of data between threads.
  2. Session management.
  3. Transaction management.
  4. Implicit passing of parameters (PageHelper).
  5. Dubbo RpcContext.

In order to better understand the principle of ThreadLocal, I documented the process of reading the source code, and I would appreciate it if you could point out any errors.


The source code parsing

The ThreadLocal instance is stored as a Key in the Thread’s ThreadLocalMap, so it needs to calculate a subscript based on the ThreadLocal hashCode, resolve hash conflicts, and so on. ThreadLocalMap does not calculate the hash based on the hashCode() method, but instead uses a set of incrementing rules:

/* The ThreadLocal instance is stored as a Key in the Thread's ThreadLocalMap. The subscript needs to be calculated based on the hashCode. The hashCode() method is not called, but is incremented by the step of 0x61C88647. * /
private final int threadLocalHashCode = nextHashCode();

// Generate hashCode by CAS
private static AtomicInteger nextHashCode =
		new AtomicInteger();

// hashCode increments step size, why this number? https://zhuanlan.zhihu.com/p/40515974
private static final int HASH_INCREMENT = 0x61c88647;

// Calculate the next hashCode, incrementing all the time
private static int nextHashCode(a) {
	return nextHashCode.getAndAdd(HASH_INCREMENT);
}
Copy the code

Why is the incremental step 0x61C88647? ThreadLocalMap’s underlying implementation is Entry[]. Like HashMap, the size of the array will always be 2 to the N, increasing by 0x61C88647, to make the hashCode more evenly distributed across the 2 to the N array. Specific can refer to: from the implementation of ThreadLocal hash algorithm.

2. What does set() do? When a thread calls the set() method of a ThreadLocal, it first gets the current thread’s ThreadLocalMap. If null, it creates a ThreadLocalMap, otherwise it puts elements into the ThreadLocalMap.

/* 1. Obtain the ThreadLocalMap of the current thread. 2. If the value is not null, set */
public void set(T value) {
	Thread t = Thread.currentThread();
	ThreadLocalMap map = getMap(t);
	if(map ! =null)
		map.set(this, value);
	else
		createMap(t, value);
}
Copy the code

The first time a new thread sets (), it creates a ThreadLocalMap:

/* Create a Map instance for Thread's threadLocals and add elements to it. * /
void createMap(Thread t, T firstValue) {
	t.threadLocals = new ThreadLocalMap(this, firstValue);
}
Copy the code

ThreadLocalMap ThreadLocalMap is a static inner class of ThreadLocalEntry[]Array to hold the data. Unlike a HashMap, when you have a hash conflict,EntryInstead of converting to a linked list or a red-black tree, it is implemented using linear detection of open addressing. For details on how hash conflicts are handled, see another article by the author:Common ways to resolve hash conflicts.

// Initialize the ThreadLocalMap instanceThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize. The default capacity is 16
	table = new Entry[INITIAL_CAPACITY];
	// Calculate the subscript by using the algorithm hashCode & (Len - 1), which is the same as HashMap.
	int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
	table[i] = new Entry(firstKey, firstValue);
	size = 1;
	// Set the expansion threshold to two thirds of the capacity
	setThreshold(INITIAL_CAPACITY);
}

private void setThreshold(int len) {
	threshold = len * 2 / 3;
}
Copy the code

Entry Entry inherits from WeakReference. Its Key is a WeakReference. When using it, we need to pay attention to remove() manually as far as possible to avoid memory leakage.

/* The Key of Entry is a weak reference. When there is no strong reference outside the ThreadLocal instance, GC will reclaim it. If remove() is not called, value still has a reference and cannot be reclaimed. This can easily lead to memory leaks. * /
static class Entry extends WeakReference<ThreadLocal<? >>{ Object value; Entry(ThreadLocal<? > k, Object v) {super(k); value = v; }}Copy the code

If ThreadLocalMap is not null, we need to add elements to it:

private void set(ThreadLocal
        key, Object value) {
	Entry[] tab = table;
	int len = tab.length;
	// Calculate the subscript by using the algorithm hashCode & (Len - 1), which is the same as HashMap.
	int i = key.threadLocalHashCode & (len-1);

	for (Entry e = tab[i];
		 /* If the subscript element is not null, there are two cases: 1. 2. Hash conflict. * /e ! =null;
		 /* Hash conflict resolution: open addressing method of linear detection. If the current subscript is taken, look for next, find the tail and start from scratch. Until you find an unoccupied index. * /e = tab[i = nextIndex(i, len)]) { ThreadLocal<? > k = e.get();if (k == key) {
			// For the same Key, override value.
			e.value = value;
			return;
		}

		if (k == null) {
			The /* subscript is used, but key.get () is null. This indicates that ThreadLocal was reclaimed. You need to replace it. * /
			replaceStaleEntry(key, value, i);
			return;
		}
	}

	tab[i] = new Entry(key, value);
	int sz = ++size;
	/* 1. Check whether some slots can be cleared. 2. If the data is cleared successfully, capacity expansion is not required because some space has been freed up for future use. 3. If the clearing fails, check whether the capacity needs to be expanded. * /
	if(! cleanSomeSlots(i, sz) && sz >= threshold) rehash(); }Copy the code

After the element is added successfully, ThreadLocalMap cleans up the element’s reclaimed keys. For performance reasons, ThreadLocalMap does not check all elements, but rather samples some of the data.

/* Clear some slots. 1. If the clearing is successful, there is no need to expand the capacity because some space has been vacated. 2. For performance reasons, not all elements will be cleaned up, but samples will be cleaned up. Set (), n=size, the search range is small. * /
private boolean cleanSomeSlots(int i, int n) {
	boolean removed = false;
	Entry[] tab = table;
	int len = tab.length;
	do {
		i = nextIndex(i, len);
		Entry e = tab[i];
		if(e ! =null && e.get() == null) {
			// Once the expired elements are found, then n=len to expand the search scope
			n = len;
			removed = true;
			// Real clean logic
			i = expungeStaleEntry(i);
		}
		/ * sampling rules: n > > > = 1 (binary) example: 100 > > > > > 6 12 25 50 * > 1/3
	} while ( (n >>>= 1) != 0);
	return removed;
}
Copy the code

If an expired Key is found, it’s time to clean it up:

/* Delete expired elements: elements that hold subscripts but whose ThreadLocal instance has already been reclaimed. * /
private int expungeStaleEntry(int staleSlot) {
	Entry[] tab = table;
	int len = tab.length;

	// Clear the current Entry
	tab[staleSlot].value = null;
	tab[staleSlot] = null;
	size--;

	// Rehash until we encounter null
	Entry e;
	int i;
	// Continue searching until null is found
	for(i = nextIndex(staleSlot, len); (e = tab[i]) ! =null; i = nextIndex(i, len)) { ThreadLocal<? > k = e.get();if (k == null) {
			// Find expired elements again and clean them up
			e.value = null;
			tab[i] = null;
			size--;
		} else {
			// Handle the rehash logic
			int h = k.threadLocalHashCode & (len - 1);
			if(h ! = i) { tab[i] =null;

				// Unlike Knuth 6.4 Algorithm R, we must scan until
				// null because multiple entries could have been stale.
				while(tab[h] ! =null) h = nextIndex(h, len); tab[h] = e; }}}return i;
}
Copy the code

When you clean up, you do not just clean up the current Entry, but continue to look for expired entries in the circle. As long as you find them, you will clean up until TAB [I]==null. In the process of clearing, you will also do a rehash operation on the elements.

If the clearing fails, check whether the size exceeds the threshold. If the size exceeds the threshold, complete the clearing and check whether the capacity is expanded.

private void rehash(a) {
	// Clear expired entries in full
	expungeStaleEntries();

	// If the size still exceeds three-quarters of the threshold, expand the capacity
	if (size >= threshold - threshold / 4)
		resize();
}
Copy the code

Clearing expired entries in full:

// Clear expired entries in full
private void expungeStaleEntries(a) {
	Entry[] tab = table;
	int len = tab.length;
	for (int j = 0; j < len; j++) {
		Entry e = tab[j];
		// Iterate through the array and clean up expired elements
		if(e ! =null && e.get() == null) expungeStaleEntry(j); }}Copy the code

If the size still exceeds three-quarters of the threshold, expand the capacity:

// Expansion rule: double capacity expansion
private void resize(a) {
	Entry[] oldTab = table;
	int oldLen = oldTab.length;
	int newLen = oldLen * 2;
	Entry[] newTab = new Entry[newLen];
	int count = 0;

	for (int j = 0; j < oldLen; ++j) {
		Entry e = oldTab[j];
		if(e ! =null) { ThreadLocal<? > k = e.get();if (k == null) {
				// Expired elements are found during expansion and will be skipped
				e.value = null; // Help the GC
			} else {
				// Move elements that are not expired from the old array to the new array
				int h = k.threadLocalHashCode & (newLen - 1);
				while(newTab[h] ! =null) h = nextIndex(h, newLen); newTab[h] = e; count++; }}}// Reset the threshold
	setThreshold(newLen);
	size = count;
	table = newTab;
}
Copy the code

At this point, all logic for Set() ends.

3. What does get() do? When get() obtains a Value, it first determines whether the current thread’s ThreadLocalMap is null. If it is null, it calls initialValue() to get an initialValue and sets () to the ThreadLocalMap.

/* When obtaining a Value: 1. Obtain ThreadLocalMap 2 for the current thread. If null, the Map is created and the initial value is set. 3. If the value is not null, search by Map. * /
public T get(a) {
	Thread t = Thread.currentThread();
	ThreadLocalMap map = getMap(t);
	if(map ! =null) {
		ThreadLocalMap.Entry e = map.getEntry(this);
		if(e ! =null) {
			@SuppressWarnings("unchecked")
			T result = (T)e.value;
			returnresult; }}return setInitialValue();
}
Copy the code

If ThreadLocalMap is not null, it is time to start looking:

/* Obtain Entry */ by Key
private Entry getEntry(ThreadLocal
        key) {
	// Calculate the subscript
	int i = key.threadLocalHashCode & (table.length - 1);
	Entry e = table[i];
	if(e ! =null && e.get() == key) {
		// If the corresponding subscript node is not null and the Key is equal, the match is returned directly
		return e;
	} else {
		/* Otherwise there are two cases: 1.Key does not exist. 2. Hash conflict, backward ring lookup is required. * /
		returngetEntryAfterMiss(key, i, e); }}Copy the code

If it is hit, it is returned directly. If it is not hit, there are two cases:

  1. The Key does not exist.
  2. Hash conflict. We need to look backwards.
/* Find logic that cannot be directly hit */
private Entry getEntryAfterMiss(ThreadLocal<? > key,int i, Entry e) {
	Entry[] tab = table;
	int len = tab.length;

	while(e ! =null) {// e==null Indicates that the Key does not exist and null is directly returnedThreadLocal<? > k = e.get();if (k == key)
			// There is a hash conflict
			return e;
		if (k == null)
			// The Key exists, but has expired, needs to be cleaned up, and returns null
			expungeStaleEntry(i);
		else
			// Backward ring lookup
			i = nextIndex(i, len);
		e = tab[i];
	}
	return null;
}
Copy the code

If the Entry node is successfully found, simply return its value.

3. What does () do? Gets the ThreadLocalMap of the current thread and removes the elements.

// Find the ThreadLocalMap of the current thread and remove the element
public void remove(a) {
	ThreadLocalMap m = getMap(Thread.currentThread());
	if(m ! =null)
		m.remove(this);
}
Copy the code

The main logic is in threadLocalMap.remove () :

// Delete Entry by Key
private void remove(ThreadLocal
        key) {
	Entry[] tab = table;
	int len = tab.length;
	// Calculate the subscript
	int i = key.threadLocalHashCode & (len-1);
	/* Delete is the same, because there is a hash conflict, can not directly locate the subscript and directly delete. Before deleting keys, check whether the keys are equal. If the keys are not equal, ring search. * /
	for(Entry e = tab[i]; e ! =null;
		 e = tab[i = nextIndex(i, len)]) {
		if (e.get() == key) {
			/* Clean it up when you find it. Instead of cleaning up directly, the Reference Reference to the Key is cleared, and then expungeStaleEntry() is called to clean up expired elements. In addition, you can clean up subsequent nodes. * /
			e.clear();
			expungeStaleEntry(i);
			return; }}}Copy the code

Because of the existence of hash conflict, the node cannot be located and directly deleted. It needs to confirm whether the Key is equal. If the Key is not equal, it needs to ring search until the correct Key is found.

Instead of simply emptying, the Key reference is first nulled and then the expungeStaleEntry() method is called to clean up expired elements. This process cleans up subsequent nodes and rehash operations in turn.


The problem

1. Why use weak references?

Each thread has its own ThreadLocalMap. If ThreadLocalMap strongly references ThreadLocal, then ThreadLocal cannot be reclaimed even if we execute ThreadLocal=null. Remove the current ThreadLocal of the ThreadLocalMap of all threads.

Because of the use of weak references, as long as there is no strong reference outside the ThreadLocal instance, it can be collected by GC. When ThreadLocalMap performs some read and write operations, it will automatically do some expiration checks and delete expired entries to avoid memory leaks to the maximum extent.

2. Why does the memory leak?

A weak reference to a ThreadLocalMap applies only to a Key. If a ThreadLocal does not have a strong reference, the GC will collect it, but the Value will not be collected because it has a strong reference to an Entry. This will result in a memory leak where some values will never be accessed.

This, of course, is something the JDK does its best to avoid. When reading and writing a ThreadLocal, there are many places where it can be triggered to perform expiration checks to remove expired entries and avoid memory leaks.

3. When will the expiration check be triggered?

  1. callset()Method, and the system will continue to check during capacity expansion.
  2. callget()Method, no direct hit, for backward ring lookup.
  3. callremove()In addition to clearing the current Entry, it will continue to clean the Entry backward.

4. How to avoid memory leaks

When using a ThreadLocal, it is generally recommended to declare it static final to avoid frequently creating ThreadLocal instances. Try to avoid storing large objects. If you do, try to call remove() to remove them after access is complete.

The Value of a ThreadLocal can cause memory leaks, but the JDK has done a lot to avoid them. For example, as mentioned above, expired entries will be automatically cleared in many scenarios, so that invalid values can be recycled. In general, normal use will not be too much of a problem, may cause some Value may have a brief memory leak, but in the subsequent expired check, will be cleared. However, it is advisable to call remove() in time.


Articles you may be interested in:

  • AQS source guide
  • It’s a showdown. I’m gonna hand write an RPC
  • The Java lock bloat process and the effect of consistency hashing on lock bloat
  • CMS and three-color labeling algorithm
  • Accessibility analysis algorithm for vernacular understanding