One: Background
1. The story
Last month a friend contacted me on WeChat to say that his program had hung, and asked how to analyze it further. The screenshot is as follows:
Since he came to me, I had to find a way to ease his pain. Let's go through the three problems (the hang, the high CPU and the high memory) one by one, from easy to hard.
Two: Analysis of the three problems
1. Cause of the hang
Based on experience from 40+ dump analyses, a hang is usually caused by something that blocks threads, so that subsequent requests pile up in the thread pool. The !tp command shows the thread pool queue.
0:000> !tp
CPU utilization: 81%
Worker Thread: Total: 65 Running: 65 Idle: 0 MaxLimit: 32767 MinLimit: 64
Work Request in Queue: 2831
Unknown Function: 00007ffffcba1750 Context: 0000022ab04d4a58
Unknown Function: 00007ffffcba1750 Context: 0000022ab03e4ce8
Unknown Function: 00007ffffcba1750 Context: 0000022ab825ec88
Unknown Function: 00007ffffcba1750 Context: 0000022ab825a458
Unknown Function: 00007ffffcba1750 Context: 0000022ab8266500
Unknown Function: 00007ffffcba1750 Context: 0000022ab8268198
Unknown Function: 00007ffffcba1750 Context: 0000022ab826cb00
Unknown Function: 00007ffffcba1750 Context: 0000022ab8281578
Number of Timers: 0
Completion Port Thread:Total: 2 Free: 2 MaxFree: 128 CurrentLimit: 2 MaxLimit: 32767 MinLimit: 64
The thread pool queue holds 2831 pending work items, and all 65 worker threads are running with none idle. New requests therefore cannot be processed, and the application appears hung. Next, dump all thread stacks with !clrstack (for example ~*e !clrstack), as shown below:
Scanning through them, I found a large number of calls to System.Net.HttpWebRequest.GetResponse(). Experienced readers will recognize this as the classic case of synchronous HTTP requests that are too slow, so the program cannot keep up and hangs. Some readers may be curious whether the target URL can be dug out. Yes, it can, with the !dso command.
000000D2FBD3B840 0000023269e85698 System.Text.UTF8Encoding
000000D2FBD3B850 00000236e9dd2cb8 System.String application/x-www-form-urlencoded
000000D2FBD3B870 0000023269e85698 System.Text.UTF8Encoding
000000D2FBD3B9A8 00000231aa221a38 System.String uSyncAppxxx
000000D2FBD3B9B8 00000231aa201a70 System.String VToken={0}&Vorigin={1}&QueryJson={2}
000000D2FBD3B9C0 00000231aa202200 System.String http://xxx.xxx.com/API/xxx/BusinessCardFolder/Connector/xxx/GetPageList
Wow, this URL is even an external network address. Synchronous mode is slow to begin with, and an external address makes it worse. No wonder the program hangs.
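The usual remedy for this pattern is to stop holding a thread pool thread while the remote site responds. Below is a minimal sketch, assuming the call can be moved to HttpClient and awaited end to end. The class name, the method name and the 10-second timeout are illustrative assumptions, not taken from the application.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class RemoteApiClient   // hypothetical helper, not from the original app
{
    // Reuse one HttpClient; the timeout value is an assumption chosen for illustration.
    private static readonly HttpClient Client =
        new HttpClient { Timeout = TimeSpan.FromSeconds(10) };

    // Posts a form-encoded body and awaits the reply without blocking a pool thread.
    public static async Task<string> PostFormAsync(
        string url,
        string formBody,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        using (var content = new StringContent(
            formBody, Encoding.UTF8, "application/x-www-form-urlencoded"))
        using (var response = await Client
            .PostAsync(url, content, cancellationToken)
            .ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

The benefit only holds if callers await this method all the way up the call chain. Calling .Result or .Wait() on the returned task would block the thread again.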
2. Analyzing the high CPU
From the !tp output above, CPU utilization is currently 81%. Why is it so high? As a rule of thumb, the usual suspects are lock contention, GC activity, or an infinite loop.
- Is it a lock?
Use the !syncblk command to look at the sync block table.
0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
212 0000023ef3cdd028 3 1 0000023ef40efa40 8d70 209 000002396ad93788 System.Object
Total 297
ComClassFactory 0
Free 139
Judging from the output, nothing looks wrong with the Monitor locks. Next, use the !mlocks command to look at the other lock types and see whether anything new turns up.
0:000> !mlocks
Examining SyncBlocks...
Scanning for ReaderWriterLock(Slim) instances...
Scanning for holders of ReaderWriterLock locks...
Scanning for holders of ReaderWriterLockSlim locks...
Examining CriticalSections...
ClrThread DbgThread OsThread LockType Lock LockLevel
------------------------------------------------------------------------------
...
0x49 209 0x8d70 thinlock 000002396ad9ba90 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9baa8 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bac0 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bad8 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9baf0 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb08 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb20 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb38 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb50 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb68 (recursion:0)
0x49 209 0x8d70 thinlock 000002396ad9bb80 (recursion:0)
0xe 152 0x8e68 thinlock 0000023669f7e428 (recursion:0)
0x41 208 0x8fb4 thinlock 00000235e9f6e8d0 (recursion:0)
0x17 161 0x9044 thinlock 00000238ea94db68 (recursion:0)
0x16 159 0x911c thinlock 000002392a03ed40 (recursion:0)
0x47 206 0x9264 thinlock 000002322af08e28 (recursion:0)
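Wow, there are a lot of thinlocks, and the thread with DbgThread=209 alone holds roughly 1000 of them.
For those who don't know what a thinlock is: put simply, it is a lightweight lock internal to the CLR that burns CPU while contended, similar to a SpinLock. Next, take one of the locked objects and use !gcroot to see who references it.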
0:000> !gcroot 000002396ad9ba48
Thread 2580:
000000d2fb0bef10 00007ff806945ab3 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
rbp-80: 000000d2fb0bef50
-> 0000023769dd4008 System.Threading.Thread
-> 0000023269e776b8 System.Runtime.Remoting.Contexts.Context
-> 0000023269e773b8 System.AppDomain
-> 0000023269ee1e00 System.Threading.TimerCallback
-> 0000023269ed2d30 System.Web.Caching.CacheExpires
-> 0000023269ed2c78 System.Web.Caching.CacheSingle
-> 0000023269ed2ce0 System.Collections.Hashtable
-> 000002372ab91d90 System.Collections.Hashtable+bucket[]
-> 00000239ab32fd10 System.Web.Caching.CacheEntry
-> 000002396ad93748 System.Collections.Concurrent.ConcurrentDictionary`2[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]]
-> 00000239ab2a8248 System.Collections.Concurrent.ConcurrentDictionary`2+Tables[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]]
-> 000002396ad96b80 System.Object[]
-> 000002396ad9ba48 System.Object
From the output, the thinlocks come from inside a ConcurrentDictionary. Use the !mdt command to inspect the dictionary.
0:148> !mdt 000002396ad93748
000002396ad93748 (System.Collections.Concurrent.ConcurrentDictionary`2[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]])
m_tables:00000239ab2a8248 (System.Collections.Concurrent.ConcurrentDictionary`2+Tables[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]])
m_comparer:NULL (System.Collections.Generic.IEqualityComparer`1[[System.__Canon, mscorlib]])
m_growLockArray:true (System.Boolean)
m_keyRehashCount:0x0 (System.Int32)
m_budget:0x213 (System.Int32)
m_serializationArray:NULL (System.Collections.Generic.KeyValuePair`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]][])
m_serializationConcurrencyLevel:0x0 (System.Int32)
m_serializationCapacity:0x0 (System.Int32)
0:148> !mdt 00000239ab2a8248
00000239ab2a8248 (System.Collections.Concurrent.ConcurrentDictionary`2+Tables[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]])
m_buckets:0000023e9a2477e8 (System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.String, mscorlib],[xxx.Application.Entity.BaseManage.UserRelationEntity, xxx.Application.Entity]][], Elements: 543997)
m_locks:000002396ad96b80 (System.Object[], Elements: 1024)
m_countPerLock:00000239aa8472c8 (System.Int32[], Elements: 1024)
m_comparer:0000023269e782b8 (System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]])
From the above, m_buckets has 543,997 elements, so the dictionary holds on the order of 540,000 records. Why it is so large is a question in itself, and it also contains 1024 locks (m_locks). Interesting. Let's dig into the source code.
The source code does contain a locks[] array. To find out what exactly causes locks[] to be traversed, search all thread stacks for the ConcurrentDictionary keyword.
OS Thread Id: 0x2844 (163)
Child SP IP Call Site
000000d2fb83abb8 00007ff80a229df8 [GCFrame: 000000d2fb83abb8]
000000d2fb83aca0 00007ff80a229df8 [GCFrame: 000000d2fb83aca0]
000000d2fb83acd8 00007ff80a229df8 [HelperMethodFrame: 000000d2fb83acd8] System.Threading.Monitor.Enter(System.Object)
000000d2fb83add0 00007ff80693ea56 System.Collections.Concurrent.ConcurrentDictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].AcquireLocks(Int32, Int32, Int32 ByRef)
000000d2fb83ae20 00007ff806918ef2 System.Collections.Concurrent.ConcurrentDictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].AcquireAllLocks(Int32 ByRef)
000000d2fb83ae60 00007ff8069153f9 System.Collections.Concurrent.ConcurrentDictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].GetValues()
000000d2fb83aee0 00007ff7ae17d8ec xxx.Util.DataHelper.ToEnumerable[[System.__Canon, mscorlib],[System.__Canon, mscorlib]](System.Collections.Concurrent.ConcurrentDictionary`2<System.__Canon,System.__Canon>)
000000d2fb83af20 00007ff7ad125241 xxx.Application.Code.CacheHelper.GetCaches[[System.__Canon, mscorlib],[System.__Canon, mscorlib]](System.String)
000000d2fb83afa0 00007ff7ad12513b xxx.Application.Code.CacheHelper.GetCaches[[System.__Canon, mscorlib]](System.String)
000000d2fb83b000 00007ff7b10199e5 xxx.Application.Cache.CacheHelper.GetUserRelations()
Looking through the thread stacks, there are about 20 places where the ConcurrentDictionary locks are taken, all of them through a call to GetCaches. Let's look at the source of xxx.Application.Cache.CacheHelper.GetUserRelations() to see what it does:
public static IEnumerable<UserRelationEntity> GetUserRelations()
{
    return xxx.Application.Code.CacheHelper.GetCaches<UserRelationEntity>("xxx.BaseManage-UserRelation");
}
protected static IEnumerable<T> GetCaches<T>(string cacheKeyName)
{
    return GetCaches<T, string>(cacheKeyName);
}
private static IEnumerable<T> GetCaches<T, TKey>(string cacheKeyName)
{
    return GetConcurrentDictionaryCache<T, TKey>(cacheKeyName)?.ToEnumerable();
}
public static IEnumerable<T> ToEnumerable<TKey, T>(this ConcurrentDictionary<TKey, T> dics)
{
    return dics.Values;
}
From the source logic, every cache read ends up calling dics.Values. I was curious how the framework implements it, so let's look at the framework source:
Notice that every call to dics.Values executes Monitor.Enter(locks[i], ref lockTaken) 1024 times. In other words, it takes all 1024 internal locks, which is a key factor in the high CPU.
3. Analyzing the high memory
The last question is why memory blows up. If you look carefully, you will notice something odd in the logic of GetValues. Let me paste the code again:
private ReadOnlyCollection<TValue> GetValues()
{
    int locksAcquired = 0;
    try
    {
        AcquireAllLocks(ref locksAcquired);   // takes every lock in m_locks
        int countInternal = GetCountInternal();
        if (countInternal < 0)
            throw new OutOfMemoryException();
        List<TValue> list = new List<TValue>(countInternal);   // fresh list on every call
        for (int i = 0; i < m_tables.m_buckets.Length; i++)
            for (Node node = m_tables.m_buckets[i]; node != null; node = node.m_next)
                list.Add(node.m_value);
        return new ReadOnlyCollection<TValue>(list);
    }
    finally
    {
        ReleaseLocks(0, locksAcquired);
    }
}
Every call to GetValues allocates a new List sized for roughly 540,000 values. Note that this is a brand-new list, not a view over the ConcurrentDictionary. With that happening on every cache read, how could memory not blow up?
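The copy-on-read behavior is easy to observe on a small dictionary. The following is a minimal sketch. The class and method names are made up for illustration, and the dictionary contents are arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ValuesSnapshotDemo   // illustrative only
{
    // Reads Values twice and reports whether both reads returned the same
    // object, plus how many items the first snapshot holds after a later write.
    public static Tuple<bool, int> Probe()
    {
        var cache = new ConcurrentDictionary<string, int>();
        cache["a"] = 1;
        cache["b"] = 2;

        ICollection<int> first = cache.Values;    // copies all values into a new list
        ICollection<int> second = cache.Values;   // copies them again

        cache["c"] = 3;                           // later writes do not reach old snapshots

        return Tuple.Create(ReferenceEquals(first, second), first.Count);
    }
}
```

Probe() should report that the two reads are different objects and that the first snapshot still holds two items. With two entries the copy is trivial. With hundreds of thousands of entries, each read allocates a large array that has to be garbage collected later.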
To sum up, the pain of these three problems comes down to two causes.
- Synchronous HttpWebRequest calls against an external URL caused the application to hang.
Optimization: use asynchronous requests.
- The ConcurrentDictionary.Values pitfall caused the high memory and high CPU.
I suspect many readers never imagined that ConcurrentDictionary.Values hides such a big pitfall. That made me wonder: what does Values do on the non-thread-safe Dictionary?
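public ValueCollection Values
{
    get
    {
        if (values == null)
            values = new ValueCollection(this);   // created once, then cached
        return values;
    }
}
public sealed class ValueCollection
{
    public ValueCollection(Dictionary<TKey, TValue> dictionary)
    {
        if (dictionary == null)
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.dictionary);
        this.dictionary = dictionary;   // keeps a reference, copies nothing
    }
}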
You can clearly see that it does not create a new list, so the suggested optimizations are:
- Avoid ConcurrentDictionary.Values; use lock + Dictionary instead.
- If you must use ConcurrentDictionary, look entries up by a query condition instead of pulling everything through Values and then filtering, to reduce memory usage.
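The second suggestion can be sketched as follows, assuming callers either know the key or can express their filter as a predicate. CacheQueries and its methods are hypothetical helpers, not part of the application or the framework.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class CacheQueries   // hypothetical helper
{
    // Single-key lookup: reads one bucket and copies nothing.
    public static bool TryFind<TKey, TValue>(
        ConcurrentDictionary<TKey, TValue> cache, TKey key, out TValue value)
    {
        return cache.TryGetValue(key, out value);
    }

    // Filtered read: the dictionary's own enumerator walks the buckets without
    // acquiring all locks, so only the matching entries are materialized.
    public static List<TValue> Where<TKey, TValue>(
        ConcurrentDictionary<TKey, TValue> cache, Func<TValue, bool> predicate)
    {
        var matches = new List<TValue>();
        foreach (KeyValuePair<TKey, TValue> pair in cache)
        {
            if (predicate(pair.Value))
            {
                matches.Add(pair.Value);
            }
        }
        return matches;
    }
}
```

One trade-off to keep in mind: enumerating a ConcurrentDictionary is not a point-in-time snapshot, so entries added or removed during the loop may or may not be seen. For a read-mostly cache that is usually acceptable.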
Finally, a little Easter egg: when I sent the analysis results to my friend, he wanted me to come on site and analyze it in person. That was a first for me... what a surprise.