As mentioned earlier when analysing the NioEventLoop source, Netty by default uses io.netty.util.concurrent.DefaultThreadFactory as the thread factory for creating new threads. It creates FastThreadLocalThread threads to drive NioEventLoop execution instead of plain JDK threads, because FastThreadLocalThread improves the performance of FastThreadLocal.

Why would Netty reinvent the wheel when the JDK already provides ThreadLocal? Simply because the JDK's ThreadLocal is not very efficient.

The JDK stores each ThreadLocal as a key and its value as the value in a ThreadLocalMap, and every Thread maintains its own map. As the number of ThreadLocal objects used by a thread increases, so does the probability of hash collisions. ThreadLocalMap resolves collisions by linear probing: it computes an index from the key's hash, and if that array slot is already occupied (a collision), it steps to the next slot, wrapping around, until it finds an empty one. It then wraps the mapping in an Entry node and stores it in the array. get() follows the same process: on a collision it has to probe the following slots until it finds the key.
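
To make the probing cost concrete, here is a minimal sketch of an open-addressed table with linear probing. It is not the JDK's ThreadLocalMap, which also deals with stale entries and resizing; the class name LinearProbeTable and its lastProbes counter are hypothetical and exist only for illustration.

```java
// Simplified illustration of linear probing; not the JDK's actual ThreadLocalMap.
final class LinearProbeTable {
    private final Object[] keys;
    private final Object[] values;
    int lastProbes; // slots inspected by the most recent put/get (demo only)

    LinearProbeTable(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % keys.length;
    }

    void put(Object key, Object value) {
        int i = indexFor(key);
        lastProbes = 1;
        // Walk forward (wrapping around) until we find the key or an empty slot.
        while (keys[i] != null && !keys[i].equals(key)) {
            if (lastProbes == keys.length) {
                throw new IllegalStateException("table is full");
            }
            i = (i + 1) % keys.length;
            lastProbes++;
        }
        keys[i] = key;
        values[i] = value;
    }

    Object get(Object key) {
        int i = indexFor(key);
        lastProbes = 1;
        while (keys[i] != null) {
            if (keys[i].equals(key)) {
                return values[i];
            }
            if (lastProbes == keys.length) {
                break; // every slot has been inspected
            }
            i = (i + 1) % keys.length;
            lastProbes++;
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbeTable table = new LinearProbeTable(8);
        // 1, 9 and 17 all map to slot 1 in a table of 8, so they collide.
        table.put(1, "a");
        table.put(9, "b");
        table.put(17, "c");
        // prints "c found after 3 probes"
        System.out.println(table.get(17) + " found after " + table.lastProbes + " probes");
    }
}
```

The more keys collide, the more slots each lookup has to inspect, which is exactly the cost FastThreadLocal is designed to avoid.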

To improve performance, Netty implements its own faster variant, FastThreadLocal. In this article we look at how FastThreadLocal achieves that performance.

FastThreadLocal source

FastThreadLocal is a variant of ThreadLocal that provides better access performance when used in conjunction with FastThreadLocalThreads.

Note that FastThreadLocal must be used together with FastThreadLocalThread. Otherwise it falls back to the JDK ThreadLocal and loses its performance advantage.

InternalThreadLocalMap

InternalThreadLocalMap is what Netty uses in place of the JDK's ThreadLocal.ThreadLocalMap. It stores values in a plain array instead of a hash table. When a FastThreadLocal is created, it is assigned a globally unique, monotonically increasing index, which is its slot in that array, and its value is stored directly at that slot. There are no hash collisions at all, and access is always O(1). The downside is that it wastes a bit of memory, but in a world where memory is getting cheaper, it’s worth it.
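
The idea can be reduced to a few lines. The sketch below is a simplified model and not Netty's code: IndexedLocal, NEXT_INDEX and SLOTS are hypothetical names, a plain JDK ThreadLocal holds the per-thread array, and the array grows with a simple doubling policy chosen for brevity.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of index-addressed thread-local storage; not Netty's implementation.
final class IndexedLocal<T> {
    // One global counter hands out a unique, ever-increasing slot index.
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();
    // One Object[] per thread; every IndexedLocal owns one fixed slot in it.
    private static final ThreadLocal<Object[]> SLOTS =
            ThreadLocal.withInitial(() -> new Object[4]);

    final int index = NEXT_INDEX.getAndIncrement();

    void set(T value) {
        Object[] slots = SLOTS.get();
        if (index >= slots.length) {
            // Grow so that the slot exists (assumed policy: at least double).
            slots = Arrays.copyOf(slots, Math.max(slots.length * 2, index + 1));
            SLOTS.set(slots);
        }
        slots[index] = value; // direct array write: no hashing, no probing
    }

    @SuppressWarnings("unchecked")
    T get() {
        Object[] slots = SLOTS.get();
        return index < slots.length ? (T) slots[index] : null;
    }

    public static void main(String[] args) {
        IndexedLocal<String> name = new IndexedLocal<>();
        IndexedLocal<Integer> count = new IndexedLocal<>();
        name.set("worker");
        count.set(7);
        System.out.println(name.get() + " " + count.get());
    }
}
```

Because every variable has its own slot, two variables can never compete for the same position; the price is that the array must be at least as long as the largest index in use.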

Let’s take a look at a few FastThreadLocal-related attributes in InternalThreadLocalMap, which we’ll use later:

    /* If a plain thread uses FastThreadLocal, Netty creates an InternalThreadLocalMap and stores it in the thread's native threadLocals, i.e. as one more entry in the JDK ThreadLocalMap. */
    private static final ThreadLocal<InternalThreadLocalMap> slowThreadLocalMap =
            new ThreadLocal<InternalThreadLocalMap>();

    // Index generator
    private static final AtomicInteger nextIndex = new AtomicInteger();

    // The default array size
    private static final int INDEXED_VARIABLE_TABLE_INITIAL_SIZE = 32;

    // A sentinel object used instead of null to mark a slot that has no value
    public static final Object UNSET = new Object();

    // Stores the values of all FastThreadLocal instances
    private Object[] indexedVariables;

The static variable variablesToRemoveIndex has a value of 0. Slot 0 of the array stores a Set<FastThreadLocal<?>> holding every FastThreadLocal the thread has used, so that removeAll() can remove them in bulk. The instance constant index is the FastThreadLocal's own slot index, which is globally unique and only ever increases.

/* Index 0 of the array holds a Set<FastThreadLocal<?>>: all FastThreadLocal instances used by the current thread. It exists so that FastThreadLocal.removeAll() can remove them in bulk. */
private static final int variablesToRemoveIndex = InternalThreadLocalMap.nextVariableIndex();

// Unique index of FastThreadLocal, globally unique and increasing, starting from 1, 0 stores the FastThreadLocal to be removed.
private final int index;

The constructor of FastThreadLocal is very simple and simply generates an index.

public FastThreadLocal() {
    // Index initialization, incremented by a global AtomicInteger
    index = InternalThreadLocalMap.nextVariableIndex();
}

set() source

When set() is called to store a value, it first checks whether the value is UNSET. UNSET means the entry should be removed; anything else is a new value to store. For a new value, it obtains the InternalThreadLocalMap bound to the thread and writes the value into the map's indexedVariables array at the slot given by index.

public final void set(V value) {
    if (value != InternalThreadLocalMap.UNSET) {
        // Get the InternalThreadLocalMap bound to the current thread
        InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
        setKnownNotUnset(threadLocalMap, value);
    } else {
        // A value of UNSET indicates a removal operation
        remove();
    }
}

InternalThreadLocalMap.get() returns the InternalThreadLocalMap of the current thread. If the thread is a FastThreadLocalThread, this is fast, because FastThreadLocalThread keeps the map in a field. If it is an ordinary Thread, the lookup is slower: the InternalThreadLocalMap is stored in a native JDK ThreadLocal.

// Gets the current thread's InternalThreadLocalMap
public static InternalThreadLocalMap get() {
    Thread thread = Thread.currentThread();
    if (thread instanceof FastThreadLocalThread) {
        // If the current thread is a FastThreadLocalThread, read its threadLocalMap field directly.
        return fastGet((FastThreadLocalThread) thread);
    } else {
        // Other threads are still supported by Netty, but performance is affected.
        return slowGet();
    }
}

For threads that are not FastThreadLocalThreads, Netty falls back to a JDK ThreadLocal that stores the InternalThreadLocalMap object.

/* Threads that are not FastThreadLocalThreads can also use FastThreadLocal. Netty supports this, but there is a performance impact. */
private static InternalThreadLocalMap slowGet() {
    /* For a plain thread, Netty creates an InternalThreadLocalMap and stores it in the thread's native threadLocals, i.e. as one more entry in the JDK ThreadLocalMap. It is fetched from the native ThreadLocal here and created if it does not exist yet. */
    InternalThreadLocalMap ret = slowThreadLocalMap.get();
    if (ret == null) {
        ret = new InternalThreadLocalMap();
        slowThreadLocalMap.set(ret);
    }
    return ret;
}
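
The difference between the two paths can be sketched without Netty on the classpath. In the simplified model below all names (SlotHolder, CarrierThread, SlotLookup) are hypothetical stand-ins: CarrierThread plays the role of a thread type that carries its map in a field, and every other thread goes through a JDK ThreadLocal.

```java
// Simplified model of the fast/slow lookup. All names are hypothetical, not Netty's API.
final class SlotHolder {
    final Object[] slots = new Object[32];
}

class CarrierThread extends Thread {
    SlotHolder holder; // plain field: reachable without any map lookup

    CarrierThread(Runnable task) {
        super(task);
    }
}

final class SlotLookup {
    private static final ThreadLocal<SlotHolder> SLOW = new ThreadLocal<>();

    static SlotHolder current() {
        Thread t = Thread.currentThread();
        if (t instanceof CarrierThread) {
            // Fast path: a single field read on the thread object.
            CarrierThread c = (CarrierThread) t;
            if (c.holder == null) {
                c.holder = new SlotHolder();
            }
            return c.holder;
        }
        // Slow path: one lookup in the JDK ThreadLocalMap first.
        SlotHolder h = SLOW.get();
        if (h == null) {
            h = new SlotHolder();
            SLOW.set(h);
        }
        return h;
    }

    // Demo helper: calls current() twice on a new thread of the chosen kind.
    static SlotHolder[] callTwiceOn(boolean carrier) {
        SlotHolder[] seen = new SlotHolder[2];
        Runnable task = () -> {
            seen[0] = current();
            seen[1] = current();
        };
        Thread t = carrier ? new CarrierThread(task) : new Thread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Either way each thread ends up with exactly one holder; the slow path simply pays for one extra hash-based lookup before it gets to the array.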

Once the thread-bound InternalThreadLocalMap has been obtained, setKnownNotUnset() stores the value in the array:

/* Called by set() once the value is known not to be UNSET. */
private void setKnownNotUnset(InternalThreadLocalMap threadLocalMap, V value) {
    // Write value into Object[] at this FastThreadLocal's index
    if (threadLocalMap.setIndexedVariable(index, value)) {
        // Add the current FastThreadLocal to the Set at index 0, for removeAll().
        addToVariablesToRemove(threadLocalMap, this);
    }
}

InternalThreadLocalMap.setIndexedVariable() stores the value at the given array index, expanding the array first if necessary.

// Set value to the index of the array
public boolean setIndexedVariable(int index, Object value) {
    Object[] lookup = indexedVariables;
    if (index < lookup.length) {
        Object oldValue = lookup[index];
        lookup[index] = value;
        return oldValue == UNSET;
    } else {
        // Expand and Set
        expandIndexedVariableTableAndSet(index, value);
        return true;
    }
}
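
The expansion step itself is not shown above. The following is a minimal sketch of one way to do it, assuming the array grows to the smallest power of two larger than the requested index and that new slots are filled with the UNSET sentinel. GrowableSlots and expandAndSet are hypothetical names, and the sizing rule is an assumption for illustration rather than a copy of Netty's code.

```java
import java.util.Arrays;

// Sketch of a slot array that expands on demand; an assumption-based model, not Netty's code.
final class GrowableSlots {
    static final Object UNSET = new Object();
    private Object[] slots;

    GrowableSlots(int initialSize) {
        slots = new Object[initialSize];
        Arrays.fill(slots, UNSET);
    }

    int capacity() {
        return slots.length;
    }

    // Returns true if the slot held no value before this call.
    boolean set(int index, Object value) {
        if (index < slots.length) {
            Object old = slots[index];
            slots[index] = value;
            return old == UNSET;
        }
        expandAndSet(index, value);
        return true;
    }

    Object get(int index) {
        return index < slots.length ? slots[index] : UNSET;
    }

    private void expandAndSet(int index, Object value) {
        int oldCapacity = slots.length;
        // Smallest power of two strictly greater than index (overflow for huge indexes ignored).
        int newCapacity = Math.max(1, Integer.highestOneBit(index) << 1);
        Object[] grown = Arrays.copyOf(slots, newCapacity);
        // Slots that did not exist before must read as "no value", not null.
        Arrays.fill(grown, oldCapacity, newCapacity, UNSET);
        grown[index] = value;
        slots = grown;
    }

    public static void main(String[] args) {
        GrowableSlots s = new GrowableSlots(32);
        s.set(32, "first value past the initial size");
        System.out.println(s.capacity()); // prints 64
    }
}
```

Note that the array is sized by the index, not by how many values the thread actually stores, so a single high index is enough to force a large array.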

After the value is saved to the array, the FastThreadLocal also has to be added to the Set at index 0 of the array, because removeAll() uses that Set to remove all FastThreadLocal instances in bulk.

// Add the FastThreadLocal to the Set at index 0 for later bulk removal
private static void addToVariablesToRemove(InternalThreadLocalMap threadLocalMap, FastThreadLocal<?> variable) {
    // Fetch the element at the fixed index 0
    Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
    Set<FastThreadLocal<?>> variablesToRemove;
    if (v == InternalThreadLocalMap.UNSET || v == null) {
        // Create the Set if there is no value yet
        variablesToRemove = Collections.newSetFromMap(new IdentityHashMap<FastThreadLocal<?>, Boolean>());
        threadLocalMap.setIndexedVariable(variablesToRemoveIndex, variablesToRemove);
    } else {
        // If there is a value, cast it to a Set
        variablesToRemove = (Set<FastThreadLocal<?>>) v;
    }
    // Add the FastThreadLocal to the Set, which removeAll() relies on.
    variablesToRemove.add(variable);
}
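
The Set in this method is built with Collections.newSetFromMap on top of an IdentityHashMap, so membership is decided by reference (==) rather than by equals(). A small sketch of what that means, using plain strings as stand-ins; the class name IdentitySetDemo and its helper methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

final class IdentitySetDemo {
    // Adds two equal-but-distinct strings and reports the resulting set size.
    static int sizeAfterAddingEqualObjects(Set<String> set) {
        set.add(new String("tl"));
        set.add(new String("tl"));
        return set.size();
    }

    static Set<String> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
    }

    public static void main(String[] args) {
        // prints 1: a HashSet treats equal strings as the same element
        System.out.println(sizeAfterAddingEqualObjects(new HashSet<String>()));
        // prints 2: the identity set keeps both, because they are different objects
        System.out.println(sizeAfterAddingEqualObjects(newIdentitySet()));
    }
}
```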

The set() operation ends here.

get() source

Now that you know the flow of set(), look at get().

To get the current thread's value for a FastThreadLocal, first obtain the InternalThreadLocalMap bound to the thread, then read the value from the array at index. If no value is there, initialize() fills in the initial value.

public final V get() {
    /* Get the InternalThreadLocalMap bound to the current thread:
       1. A FastThreadLocalThread keeps it directly in its threadLocalMap field.
       2. For any other thread, one is created and stored in the native ThreadLocal. */
    InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
    /* All FastThreadLocal values live in the InternalThreadLocalMap's Object[] and are accessed directly by index. */
    Object v = threadLocalMap.indexedVariable(index);
    if (v != InternalThreadLocalMap.UNSET) {
        // UNSET is a sentinel used instead of null to mark a slot with no value.
        return (V) v;
    }
    // If there is no value, initialize it first
    return initialize(threadLocalMap);
}

InternalThreadLocalMap.indexedVariable() is very simple: it reads the array element at the given index.

public Object indexedVariable(int index) {
    Object[] lookup = indexedVariables;
    return index < lookup.length ? lookup[index] : UNSET;
}

If the element is UNSET, the value has not been set and needs to be initialized.

/* get() finds that the value is not set and calls this method to initialize */
private V initialize(InternalThreadLocalMap threadLocalMap) {
    V v = null;
    try {
        v = initialValue(); // Returns null by default; subclasses override it
    } catch (Exception e) {
        PlatformDependent.throwException(e);
    }
    // Save the initialized value to the array
    threadLocalMap.setIndexedVariable(index, v);
    // Add FastThreadLocal to Set at bit 0 of array
    addToVariablesToRemove(threadLocalMap, this);
    return v;
}
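
One consequence of using a sentinel such as UNSET is that null becomes a legal stored value: the initializer runs once, even if it returns null. The sketch below models that for a single slot. LazySlot and its initCalls counter are hypothetical, and unlike FastThreadLocal this toy holds one value per object rather than one per thread.

```java
import java.util.function.Supplier;

// Simplified model of sentinel-based lazy initialisation for a single slot; not Netty's code.
final class LazySlot<T> {
    private static final Object UNSET = new Object();

    private final Supplier<T> initialValue;
    private Object value = UNSET;
    int initCalls; // how often the initializer ran (demo only)

    LazySlot(Supplier<T> initialValue) {
        this.initialValue = initialValue;
    }

    @SuppressWarnings("unchecked")
    T get() {
        if (value != UNSET) {
            return (T) value; // already initialised, possibly to null
        }
        initCalls++;
        T v = initialValue.get();
        value = v; // store the result so the initializer is not run again
        return v;
    }

    public static void main(String[] args) {
        LazySlot<String> slot = new LazySlot<>(() -> null);
        slot.get();
        slot.get();
        System.out.println(slot.initCalls); // prints 1
    }
}
```

If null itself were used to mean "no value", a null initial value would be recomputed on every get().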

At this point, the get() process is complete.

remove() source

When you’re done with FastThreadLocal, just call remove().

To remove FastThreadLocal, you still need to get the thread-bound InternalThreadLocalMap.

public final void remove() {
    // Get the InternalThreadLocalMap bound to the current thread (if any), then remove
    remove(InternalThreadLocalMap.getIfSet());
}

Three things need to happen next:

  1. Reset the array slot at the FastThreadLocal's index to UNSET.
  2. Remove the FastThreadLocal object from the Set.
  3. Trigger the onRemoval() hook function.

// Remove the current FastThreadLocal from InternalThreadLocalMap
public final void remove(InternalThreadLocalMap threadLocalMap) {
    if (threadLocalMap == null) {
        return;
    }

    // Reset the value at this index in the array to UNSET
    Object v = threadLocalMap.removeIndexedVariable(index);
    // Remove this FastThreadLocal from the Set at index 0 of the array
    removeFromVariablesToRemove(threadLocalMap, this);

    if (v != InternalThreadLocalMap.UNSET) {
        try {
            // Trigger the hook function
            onRemoval((V) v);
        } catch (Exception e) {
            PlatformDependent.throwException(e);
        }
    }
}

1. Reset the array slot at the FastThreadLocal's index to UNSET.

public Object removeIndexedVariable(int index) {
    Object[] lookup = indexedVariables;
    if (index < lookup.length) {
        Object v = lookup[index];
        // Fill UNSET, which means delete
        lookup[index] = UNSET;
        return v;
    } else {
        return UNSET;
    }
}

2. Fetch the Set stored at index 0 and remove the FastThreadLocal object from it.

// Fetch the Set from the fixed index 0 and remove the given FastThreadLocal from it
private static void removeFromVariablesToRemove(InternalThreadLocalMap threadLocalMap, FastThreadLocal<?> variable) {
    // Get the Set stored at index 0 of the array
    Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);

    if (v == InternalThreadLocalMap.UNSET || v == null) {
        return;
    }

    @SuppressWarnings("unchecked")
    Set<FastThreadLocal<?>> variablesToRemove = (Set<FastThreadLocal<?>>) v;
    // Remove the FastThreadLocal from the Set
    variablesToRemove.remove(variable);
}

3. If the removed value is not UNSET, a real value had been set. In that case the onRemoval() hook is triggered so that subclasses can react to the removal. It does nothing by default; subclasses override it.

// A callback that is triggered when value is removed
protected void onRemoval(@SuppressWarnings("UnusedParameters") V value) throws Exception { }
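
The three steps can be seen working together in a stripped-down model. The sketch below keeps only a single slot and a hook, with hypothetical names (RemovableSlot, RecordingSlot); it illustrates the pattern rather than reproducing FastThreadLocal.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified single-slot model of remove() with a removal hook; not Netty's code.
class RemovableSlot<T> {
    private static final Object UNSET = new Object();
    private Object value = UNSET;

    void set(T v) {
        value = v;
    }

    @SuppressWarnings("unchecked")
    final void remove() {
        Object old = value;
        value = UNSET; // reset the slot first
        if (old != UNSET) {
            onRemoval((T) old); // only notify when a real value was stored
        }
    }

    // Does nothing by default; subclasses override it to release resources.
    protected void onRemoval(T value) {
    }
}

// Example subclass that records what was removed.
final class RecordingSlot extends RemovableSlot<String> {
    final List<String> removed = new ArrayList<>();

    @Override
    protected void onRemoval(String value) {
        removed.add(value);
    }

    public static void main(String[] args) {
        RecordingSlot slot = new RecordingSlot();
        slot.remove();        // nothing stored yet: hook is not called
        slot.set("conn-1");
        slot.remove();        // hook is called with "conn-1"
        System.out.println(slot.removed); // prints [conn-1]
    }
}
```

A typical use of such a hook is closing or recycling the removed object, which is why it is skipped when the slot was empty.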

removeAll() source

removeAll() is a static method. It does not target a single FastThreadLocal, but removes every FastThreadLocal value from the current thread.

First, we still need to get the InternalThreadLocalMap bound to the current thread. Then we fetch the Set from index 0 of the array, traverse it, and remove the FastThreadLocal instances one by one.

/* Remove all FastThreadLocal instances bound to the current thread. */
public static void removeAll() {
    // Get the InternalThreadLocalMap bound to the current thread
    InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.getIfSet();
    if (threadLocalMap == null) {
        return;
    }

    try {
        // Fetch the Set at index 0, then call remove() on each FastThreadLocal in it
        Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
        if (v != null && v != InternalThreadLocalMap.UNSET) {
            @SuppressWarnings("unchecked")
            Set<FastThreadLocal<?>> variablesToRemove = (Set<FastThreadLocal<?>>) v;
            FastThreadLocal<?>[] variablesToRemoveArray = variablesToRemove.toArray(new FastThreadLocal[0]);
            for (FastThreadLocal<?> tlv : variablesToRemoveArray) {
                // Iterate and remove one by one
                tlv.remove(threadLocalMap);
            }
        }
    } finally {
        // Finally reset the InternalThreadLocalMap itself
        InternalThreadLocalMap.remove();
    }
}
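
Notice that removeAll() copies the Set into an array before looping. Each remove() call also deletes the element from that same Set, and modifying a collection while iterating over it directly may throw ConcurrentModificationException. A minimal sketch of the snapshot-then-remove pattern, with hypothetical names (BulkRemovalDemo, Var):

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Simplified model of bulk removal through a snapshot; not Netty's code.
final class BulkRemovalDemo {
    static final class Var {
        private final Set<Var> registry;
        boolean removed;

        Var(Set<Var> registry) {
            this.registry = registry;
            registry.add(this); // register on creation
        }

        void remove() {
            removed = true;
            registry.remove(this); // mutates the registry set
        }
    }

    static Set<Var> newRegistry() {
        return Collections.newSetFromMap(new IdentityHashMap<Var, Boolean>());
    }

    static void removeAll(Set<Var> registry) {
        // Snapshot first: remove() changes the set, so we must not iterate the set itself.
        Var[] snapshot = registry.toArray(new Var[0]);
        for (Var v : snapshot) {
            v.remove();
        }
    }

    public static void main(String[] args) {
        Set<Var> registry = newRegistry();
        new Var(registry);
        new Var(registry);
        removeAll(registry);
        System.out.println(registry.isEmpty()); // prints true
    }
}
```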

Conclusion

Netty’s FastThreadLocal is a variant of JDK ThreadLocal that provides better access performance when used with FastThreadLocalThreads.

Its core optimization is to use an array instead of the JDK's hash table, which avoids hash collisions and keeps reads and writes at O(1). The cost shows up when there are too many FastThreadLocal objects: the global index keeps growing, so each thread's array grows with it. Because the index only increases and never decreases, the array only expands and never shrinks. Developers need to pay special attention to the number of FastThreadLocal objects they create. Don’t abuse it, or it will cause frequent GC and even OOM!