One: Background
1. Tell a story
Someone asked me where variables marked with ThreadStatic are actually stored, and whether I could help dig them out 😂😂😂. This is actually quite a deep question. I believe few people who work in high-level languages ever come into contact with this detail, even though many know how to use the feature. I had not studied it either, but to answer the question I have to research it. To keep things accessible, let's start with the simple parts!
Two: Using ThreadStatic
1. Plain static variables
A static variable is often used as a process-level cache to improve performance, for example:
using System.Collections.Generic;

public class Test
{
    // One dictionary shared by every thread in the process.
    public static Dictionary<int, string> cachedDict = new Dictionary<int, string>();
}
As mentioned, this is a process-level cache that is visible to every thread, so in a multi-threaded environment you need to pay special attention to synchronization: either take a lock or use ConcurrentDictionary. I think this is something of a fixed mindset. Much of the time we keep patching on top of the existing foundation instead of stepping back and rethinking the foundation itself. What do I mean by all this? Let me give you an example:
Common distributed tracing frameworks on the market, such as Zipkin and SkyWalking, use a collection to store the call chain tracked on the current thread, for example A -> B -> C -> D -> B -> A. The conventional approach is to define a global cachedDict and guard it with various synchronization mechanisms. Wouldn't it be better to narrow cachedDict's access scope from global to thread level?
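For comparison, the conventional approach described above might be sketched as follows. This is only an illustration: the GlobalTraceCache class and its FillFromTwoTasks method are hypothetical names, and ConcurrentDictionary stands in for "various synchronization mechanisms".

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch of a process-wide cache shared by all threads.
public static class GlobalTraceCache
{
    // ConcurrentDictionary synchronizes internally, so callers need no lock.
    private static readonly ConcurrentDictionary<int, string> cachedDict =
        new ConcurrentDictionary<int, string>();

    // Two tasks write to the same dictionary; returns the total entry count.
    public static int FillFromTwoTasks()
    {
        cachedDict.Clear();
        var task1 = Task.Run(() =>
        {
            cachedDict.TryAdd(1, "mary");
            cachedDict.TryAdd(2, "john");
        });
        var task2 = Task.Run(() =>
        {
            cachedDict.TryAdd(3, "python");
            cachedDict.TryAdd(4, "jaskson");
            cachedDict.TryAdd(5, "elen");
        });
        Task.WaitAll(task1, task2);
        return cachedDict.Count;
    }
}
```

Every write here goes through the dictionary's internal synchronization, and all threads see all five entries. That shared visibility is exactly what a per-thread call chain does not need.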
2. Mark static variables with ThreadStatic
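Making cachedDict thread-scoped is easy: just mark it with the ThreadStatic attribute:
public class Test
{
    // Note: this initializer runs only once, on the thread that triggers type
    // initialization; on every other thread the field starts out as null.
    [ThreadStatic]
    public static Dictionary<int, string> cachedDict = new Dictionary<int, string>();
}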
Then you can start multiple threads that add data to cachedDict and check whether the dict really is thread-scoped. The code is as follows:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var task1 = Task.Run(() =>
        {
            // The field may be null on this thread, so create the dict on demand.
            if (Test.cachedDict == null) Test.cachedDict = new Dictionary<int, string>();
            Test.cachedDict.Add(1, "mary");
            Test.cachedDict.Add(2, "john");
            Console.WriteLine($"thread={Thread.CurrentThread.ManagedThreadId}, dict records: {Test.cachedDict.Count}");
        });
        var task2 = Task.Run(() =>
        {
            if (Test.cachedDict == null) Test.cachedDict = new Dictionary<int, string>();
            Test.cachedDict.Add(3, "python");
            Test.cachedDict.Add(4, "jaskson");
            Test.cachedDict.Add(5, "elen");
            Console.WriteLine($"thread={Thread.CurrentThread.ManagedThreadId}, dict records: {Test.cachedDict.Count}");
        });
        Console.ReadLine();
    }
}

public class Test
{
    [ThreadStatic]
    public static Dictionary<int, string> cachedDict = new Dictionary<int, string>();
}
The output shows that the dict really is thread-scoped, which avoids the synchronization overhead between threads. 😄
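The null checks in the demo matter because a field initializer on a ThreadStatic field runs only once, on the thread that initializes the type. The sketch below illustrates that behavior and one possible way to hide the check behind a property. The ThreadCache class and its members are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper showing the ThreadStatic initializer pitfall.
public static class ThreadCache
{
    [ThreadStatic]
    private static Dictionary<int, string> rawDict = new Dictionary<int, string>();

    // An explicit static constructor makes type initialization happen on the
    // first use of any static member, on the thread that makes that first use.
    static ThreadCache() { }

    // Lazily creates the dictionary for whichever thread asks for it.
    public static Dictionary<int, string> Dict
    {
        get
        {
            if (rawDict == null) rawDict = new Dictionary<int, string>();
            return rawDict;
        }
    }

    // Calling this static method initializes the type on the calling thread,
    // so a fresh thread never runs the initializer and sees null.
    public static bool RawFieldIsNullOnNewThread()
    {
        bool isNull = false;
        var t = new Thread(() => isNull = rawDict == null);
        t.Start();
        t.Join();
        return isNull;
    }

    // The property gives the new thread its own dictionary on demand.
    public static int CountOnNewThread()
    {
        int count = -1;
        var t = new Thread(() =>
        {
            Dict.Add(1, "mary");
            count = Dict.Count;
        });
        t.Start();
        t.Join();
        return count;
    }
}
```

With a property like Dict, callers no longer need to repeat the null check before every access.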
Three: Exploring ThreadStatic with WinDbg
1. Understanding TEB and TLS
- TEB (Thread Environment Block)
Each thread has its own private data, stored in the thread's TEB. If you want to see it, you can print it out in WinDbg.
0:000> !teb
TEB at 0000001e1cdd3000
ExceptionList: 0000000000000000
StackBase: 0000001e1cf80000
StackLimit: 0000001e1cf6e000
SubSystemTib: 0000000000000000
FiberData: 0000000000001e00
ArbitraryUserPointer: 0000000000000000
Self: 0000001e1cdd3000
EnvironmentPointer: 0000000000000000
ClientId: 0000000000005980 . 0000000000005aa8
RpcHandle: 0000000000000000
Tls Storage: 000001b599d06db0
PEB Address: 0000001e1cdd2000
LastErrorValue: 0
LastStatusValue: c0000139
Count Owned Locks: 0
HardErrorMode: 0
The TEB structure holds the thread-local storage (TLS) pointer, the exception list (ExceptionList), and other related information.
- TLS (Thread Local Storage)
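After startup, a process has a total of 1088 TLS slots available (64 built-in slots plus 1024 expansion slots). A TLS index identifies a slot, and each thread has its own set of slots, so the same index yields a different value on each thread. You can verify this with WinDbg.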
0:000> !tls
Usage:
tls <slot> [teb]
slot: -1 to dump all allocated slots
{0-0n1088} to dump specific slot
teb: <empty> for current thread
0 for all threads in this process
<teb address> (not threadid) to dump for specific thread.
0:000> !tls -1
TLS slots on thread: 5980.5aa8
0x0000 : 0000000000000000
0x0001 : 0000000000000000
0x0002 : 0000000000000000
0x0003 : 0000000000000000
0x0004 : 0000000000000000
...
0x0019 : 0000000000000000
0x0040 : 0000000000000000
0:000> !t
Lock
DBG ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 5aa8 000001B599CEED90 2a020 Preemptive 000001B59B9042F8:000001B59B905358 000001b599cdb130 1 MTA
5 2 90c 000001B599CF4930 2b220 Preemptive 0000000000000000:0000000000000000 000001b599cdb130 0 MTA (Finalizer)
7 3 74 000001B59B7272A0 102a220 Preemptive 0000000000000000:0000000000000000 000001b599cdb130 0 MTA (Threadpool Worker)
9 4 2058 000001B59B7BAFF0 1029220 Preemptive 0000000000000000:0000000000000000 000001b599cdb130 0 MTA (Threadpool Worker)
The usage line "{0-0n1088} to dump specific slot" confirms that there are 1088 slots in total.
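The slot model, one index with a separate value per thread, can be tried out from managed code with the Thread.AllocateDataSlot, Thread.SetData and Thread.GetData APIs. Note that these managed data slots are only an analogy: they are not the Win32 TLS slots that !tls dumps. The SlotDemo class below is a hypothetical name used for this sketch.

```csharp
using System;
using System.Threading;

// Hypothetical demo of "same slot, different value per thread".
public static class SlotDemo
{
    // One slot handle shared by all threads, playing the role of a slot index.
    private static readonly LocalDataStoreSlot slot = Thread.AllocateDataSlot();

    // Returns what the calling thread and a second thread each see in the slot.
    public static (object onCaller, object onOther) Run()
    {
        Thread.SetData(slot, "caller value");

        object seenOnOther = "unset";
        var t = new Thread(() =>
        {
            // A fresh thread has never written to the slot, so it reads null.
            seenOnOther = Thread.GetData(slot);
            // Writing here does not affect the calling thread's value.
            Thread.SetData(slot, "other value");
        });
        t.Start();
        t.Join();

        return (Thread.GetData(slot), seenOnOther);
    }
}
```

The calling thread still reads back its own value after the second thread has written to the same slot, which is the property the TLS slots provide at the operating-system level.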
All right, with the basic concepts covered, it’s time to take a look at assembly code.
2. Look for answers in assembly code
To make the assembly easier to follow in WinDbg, I'll define a simple ThreadStatic int variable:
using System;

class Program
{
    [ThreadStatic]
    public static int i = 0;

    static void Main(string[] args)
    {
        i = 10; // line 12 in the original Program.cs
        var num = i;
        Console.ReadLine();
    }
}
Next, use !U to disassemble the Main function, focusing on line 12, where i = 10; is executed.
0:000> !U /d 00007ffbe0ae0ffb
E:\net5\ConsoleApp5\ConsoleApp5\Program.cs @ 12:
00007ffb`e0ae0fd6 48b9b0fbb7e0fb7f0000 mov rcx,7FFBE0B7FBB0h
00007ffb`e0ae0fe0 ba01000000 mov edx,1
00007ffb`e0ae0fe5 e89657a95f call coreclr!JIT_GetSharedNonGCThreadStaticBase (00007ffc`40576780)
00007ffb`e0ae0fea c7401c0a000000 mov dword ptr [rax+1Ch],0Ah
From the assembly, the value 10 is finally written to the dword (32 bits) at address rax+1Ch. So where does the address in rax come from? It is the return value of JIT_GetSharedNonGCThreadStaticBase, so the core logic lies inside that method, which is why we have to study it.
3. Debugging the core function JIT_GetSharedNonGCThreadStaticBase
Next, set a breakpoint at line 12 with !bpmd Program.cs:12. The simplified assembly code of the method is as follows:
coreclr!JIT_GetSharedNonGCThreadStaticBase:
00007ffc`2c38679a 448b0dd7894300 mov r9d, dword ptr [coreclr!_tls_index (00007ffc`2c7bf178)]
00007ffc`2c3867a1 654c8b042558000000 mov r8, qword ptr gs:[58h]
00007ffc`2c3867aa b908000000 mov ecx, 8
00007ffc`2c3867af 4f8b04c8 mov r8, qword ptr [r8+r9*8]
00007ffc`2c3867b3 4e8b0401 mov r8, qword ptr [rcx+r8]
00007ffc`2c3867b7 493b8060040000 cmp rax, qword ptr [r8+460h]
00007ffc`2c3867be 732b jae coreclr!JIT_GetSharedNonGCThreadStaticBase+0x6b (00007ffc`2c3867eb)
00007ffc`2c3867c0 4d8b8058040000 mov r8, qword ptr [r8+458h]
00007ffc`2c3867c7 498b04c0 mov rax, qword ptr [r8+rax*8]
00007ffc`2c3867cb 4885c0 test rax, rax
00007ffc`2c3867ce 741b je coreclr!JIT_GetSharedNonGCThreadStaticBase+0x6b (00007ffc`2c3867eb)
00007ffc`2c3867d0 8bca mov ecx, edx
00007ffc`2c3867d2 f644011801 test byte ptr [rcx+rax+18h], 1
00007ffc`2c3867d7 7412 je coreclr!JIT_GetSharedNonGCThreadStaticBase+0x6b (00007ffc`2c3867eb)
00007ffc`2c3867d9 488b4c2420 mov rcx, qword ptr [rsp+20h]
00007ffc`2c3867de 4833cc xor rcx, rsp
00007ffc`2c3867e1 e89a170600 call coreclr!__security_check_cookie (00007ffc`2c3e7f80)
00007ffc`2c3867e6 4883c438 add rsp, 38h
00007ffc`2c3867ea c3 ret
Let's take a closer look at the mov instructions here.
1) dword ptr [coreclr!_tls_index (00007ffc`2c7bf178)]
This one is simple: it loads _tls_index, the TLS index that coreclr uses to find its thread-local data.
2) qword ptr gs:[58h]
What does gs:[58h] mean? The gs register points to the TEB of the current thread, and 58h is an offset from the TEB address. What is stored at that offset? You can print out the data structure of the TEB to find out.
0:000> dt teb
coreclr!TEB
+0x000 NtTib : _NT_TIB
+0x038 EnvironmentPointer : Ptr64 Void
+0x040 ClientId : _CLIENT_ID
+0x050 ActiveRpcHandle : Ptr64 Void
+0x058 ThreadLocalStoragePointer : Ptr64 Void
+0x060 ProcessEnvironmentBlock : Ptr64 _PEB
...
From the line +0x058 ThreadLocalStoragePointer : Ptr64 Void above, you can see that gs:[58h] actually refers to ThreadLocalStoragePointer.
3) qword ptr [r8+r9*8]
Building on the previous two steps, this instruction is simple: it performs an index operation, ThreadLocalStoragePointer[tls_index], and thus obtains the TLS content that belongs to the current thread. ThreadStatic variables are stored in a memory block reachable from this array.
The remaining offset calculations all build on ThreadLocalStoragePointer[tls_index]. They wind through several method calls, and the assembly becomes hard to read 😂😂😂
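The index operation described above can be pictured with a small model. This is a deliberately simplified sketch and not CoreCLR code: FakeTeb and TlsLookupModel are hypothetical types, the TLS index value is arbitrary, and the later indirections in the real function (the loads through [rcx+r8] and [r8+458h]) are collapsed into a single byte block per thread.

```csharp
using System;

// Hypothetical stand-in for a thread's TEB; nothing here exists in CoreCLR.
public sealed class FakeTeb
{
    // Plays the role of TEB+0x58 (ThreadLocalStoragePointer): an array with
    // one entry per TLS index, each pointing at that thread's memory block.
    public byte[][] ThreadLocalStoragePointer = new byte[4][];
}

public static class TlsLookupModel
{
    // Plays the role of coreclr!_tls_index; the value 2 is an arbitrary assumption.
    public const int TlsIndex = 2;

    // Models "mov dword ptr [base+offset], value" for the given thread.
    public static void WriteInt(FakeTeb teb, int offset, int value)
    {
        if (teb.ThreadLocalStoragePointer[TlsIndex] == null)
            teb.ThreadLocalStoragePointer[TlsIndex] = new byte[64];
        byte[] block = teb.ThreadLocalStoragePointer[TlsIndex];
        BitConverter.GetBytes(value).CopyTo(block, offset);
    }

    // Reads the int back; a thread that never wrote gets the default 0.
    public static int ReadInt(FakeTeb teb, int offset)
    {
        byte[] block = teb.ThreadLocalStoragePointer[TlsIndex];
        return block == null ? 0 : BitConverter.ToInt32(block, offset);
    }
}
```

In this model, writing 10 at offset 0x1C through one FakeTeb leaves a second FakeTeb untouched. That mirrors how i = 10 on one thread is invisible to the others, because each thread reaches its own block through its own TEB.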
Four: Summary
Overall, we can conclude that ThreadStatic variables are stored in the array that the TEB's ThreadLocalStoragePointer points to. I have not managed to compile the .NET 5 CoreCLR successfully these past few days. If you are interested, you can debug the CoreCLR source together with the assembly to dig deeper!
For more high-quality content, see my GitHub: dotnetfly