One: Background

1. Tell a story

I haven't written a blog post for about two months. Those who follow me probably know that I have been concentrating on my Knowledge Planet community recently. Over these two months, a number of friends have asked for help with dump analysis, and I will share the analysis approach for each case one by one.

This dump was given to me by a friend about a month ago. Because so many friends ask for help on WeChat, I could not find the relevant chat screenshots this time, so I have to break my usual routine. 😭 😭 😭

My friend said that the API was unresponsive and appeared to hang. From past experience, there are probably only three possible causes:

  • A large number of lock waits

  • Not enough threads

  • A deadlock

With these preconceptions in mind, let's turn to WinDBG.

Two: WinDBG analysis

1. Are there a lot of lock waits?

To check for lock waits, as usual, look at the sync block table.


0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
-----------------------------
Total           1673
CCW             3
RCW             4
ComClassFactory 0
Free            397


There is nothing there, so let's just look at all the thread stacks.

It turned out to be worth the look, and the result was startling: 339 threads are stuck in System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object). On second thought, though, even with 339 threads stuck there, would that really make the program hang? After all, I have seen processes with 1000+ threads that did not freeze; they only drove the CPU too high. Next, let's check whether there are not enough threads, starting with the thread pool's task queue.

2. Explore the thread pool queue

You can use the !tp command.


0:000> !tp
CPU utilization: 10%
Worker Thread: Total: 328 Running: 328 Idle: 0 MaxLimit: 32767 MinLimit: 4
Work Request in Queue: 74
    Unknown Function: 00007ffe91cc17d0  Context: 000001938b5d8d98
    Unknown Function: 00007ffe91cc17d0  Context: 000001938b540238
    Unknown Function: 00007ffe91cc17d0  Context: 000001938b5eec08
    ...
    Unknown Function: 00007ffe91cc17d0  Context: 0000019390552948
    Unknown Function: 00007ffe91cc17d0  Context: 0000019390562398
    Unknown Function: 00007ffe91cc17d0  Context: 0000019390555b30
--------------------------------------
Number of Timers: 0
--------------------------------------
Completion Port Thread:Total: 5 Free: 4 MaxFree: 8 CurrentLimit: 4 MaxLimit: 1000 MinLimit: 4


According to the output, all 328 worker threads in the thread pool are busy (Running: 328, Idle: 0) and 74 guests are waiting in the work queue. Clearly, the hang is caused by more guests arriving than the thread pool can receive.
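For a process that is still running, roughly the same worker-thread figures that !tp reports can be read through the ThreadPool class. The sketch below is only an illustration: ThreadPoolProbe and its members are hypothetical names, not something taken from the dump or from my friend's application.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the original application): reads the
// worker-thread counters of the current process's thread pool.
public static class ThreadPoolProbe
{
    // Number of worker threads currently in use:
    // the configured maximum minus the threads still available.
    public static int BusyWorkerThreads()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out _);
        return maxWorkers - freeWorkers;
    }

    // One-line summary in the spirit of the "Worker Thread" line of !tp.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out _);
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        return $"Worker threads: busy={BusyWorkerThreads()} min={minWorkers} max={maxWorkers}";
    }
}
```

Logging ThreadPoolProbe.Describe() periodically would show the busy count climbing well before the API stops responding. These calls do not expose the length of the work queue; for that, !tp on a dump is still the tool used here.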

3. Is the reception capacity really that poor?

I think this heading asks the right question. There are two angles to start from:

  • Is the code badly written?

  • Is the QPS really beyond the reception capacity?

To find out, we need to start with the 339 stuck threads and take a close look at each thread's call stack. They are stuck in roughly three places.

<1>. GetModel


public static T GetModel<T, K>(string url, K content)
{
	T result = default(T);
	HttpClientHandler httpClientHandler = new HttpClientHandler();
	httpClientHandler.AutomaticDecompression = DecompressionMethods.GZip;
	HttpClientHandler handler = httpClientHandler;
	using (HttpClient httpClient = new HttpClient(handler))
	{
		string content2 = JsonConvert.SerializeObject((object)content);
		HttpContent httpContent = new StringContent(content2);
		httpContent.Headers.ContentType = new MediaTypeHeaderValue("application/json");
		string mD5ByCrypt = Md5.GetMD5ByCrypt(ConfigurationManager.AppSettings["SsoToken"] + DateTime.Now.ToString("yyyyMMdd"));
		httpClient.DefaultRequestHeaders.Add("token", mD5ByCrypt);
		httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
		HttpResponseMessage result2 = httpClient.PostAsync(url, httpContent).Result;
		if (result2.IsSuccessStatusCode)
		{
			string result3 = result2.Content.ReadAsStringAsync().Result;
			return JsonConvert.DeserializeObject<T>(result3);
		}
		return result;
	}
}

<2>. Get

public static T Get<T> (string url, string serviceModuleName)
{
	try
	{
		T val3 = default(T);
		HttpClient httpClient = TryGetClient(serviceModuleName, true);
		using (HttpResponseMessage httpResponseMessage = httpClient.GetAsync(GetRelativeRquestUrl(url, serviceModuleName, true)).Result)
		{
			if (httpResponseMessage.IsSuccessStatusCode)
			{
				string result = httpResponseMessage.Content.ReadAsStringAsync().Result;
				if (!string.IsNullOrEmpty(result))
				{
					val3 = JsonConvert.DeserializeObject<T>(result);
				}
			}
		}
		return val3;
	}
	catch (Exception exception)
	{
		throw;
	}
}

<3>. GetStreamByApi


public static Stream GetStreamByApi<T> (string url, T content)
{
	Stream result = null;
	HttpClientHandler httpClientHandler = new HttpClientHandler();
	httpClientHandler.AutomaticDecompression = DecompressionMethods.GZip;
	HttpClientHandler handler = httpClientHandler;
	using (HttpClient httpClient = new HttpClient(handler))
	{
		httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/octet-stream"));
		string content2 = JsonConvert.SerializeObject((object)content);
		HttpContent httpContent = new StringContent(content2);
		httpContent.Headers.ContentType = new MediaTypeHeaderValue("application/json");
		HttpResponseMessage result2 = httpClient.PostAsync(url, httpContent).Result;
		if (result2.IsSuccessStatusCode)
		{
			result = result2.Content.ReadAsStreamAsync().Result;
		}
		httpContent.Dispose();
		return result;
	}
}

4. Find the truth

I have listed the code of these three methods above. Can you spot the problem? Yes: asynchronous methods are being called synchronously (through .Result), which is inherently inefficient in two ways.

  • Creating and destroying threads is itself a relatively resource-intensive and inefficient operation.

  • Frequent thread scheduling puts tremendous pressure on the CPU.

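The dump shows the consequence: every blocked call holds a thread pool thread while it waits for the HTTP response, so all 328 worker threads are occupied and new requests pile up in the queue.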

Three: Summary

Overall, this hang was caused by the developer calling asynchronous methods synchronously. The fix is simple: convert the calls to pure async/await, which frees the calling thread while the request is in flight and makes full use of the device driver's ability to complete the I/O without holding a thread.
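As a rough idea of what that transformation could look like for GetModel, here is a minimal sketch. It is not the application's real fix: ApiClient, Shared, GetModelAsync and the token parameter are hypothetical names introduced for illustration, the token is expected to be computed by the caller (the original uses its own Md5.GetMD5ByCrypt helper), and Newtonsoft.Json is assumed because the original code calls JsonConvert. The sketch also reuses one HttpClient instead of creating one per call, and it sends the body as UTF-8 JSON.

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

// Hypothetical async counterpart of GetModel (illustration only).
public static class ApiClient
{
    // One shared client for the whole process, with GZip decompression
    // enabled as in the original handler.
    public static readonly HttpClient Shared = new HttpClient(
        new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip });

    // Convenience overload that uses the shared client.
    public static Task<T> GetModelAsync<T, K>(string url, K content, string token)
        => GetModelAsync<T, K>(Shared, url, content, token);

    public static async Task<T> GetModelAsync<T, K>(
        HttpClient client, string url, K content, string token)
    {
        string json = JsonConvert.SerializeObject(content);
        using (var request = new HttpRequestMessage(HttpMethod.Post, url))
        {
            request.Content = new StringContent(json, Encoding.UTF8, "application/json");
            request.Headers.Add("token", token);
            request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            // await returns the thread to the pool while the request is in flight,
            // instead of parking it on .Result.
            using (HttpResponseMessage response =
                await client.SendAsync(request).ConfigureAwait(false))
            {
                if (!response.IsSuccessStatusCode)
                {
                    return default(T);
                }

                string body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                return JsonConvert.DeserializeObject<T>(body);
            }
        }
    }
}
```

The change only pays off if every caller up the chain also awaits the returned task. Calling .Result on GetModelAsync would block a thread pool thread again and bring back the same problem.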

This dump also reminds me of the example in the book CLR via C# (pp. 646-647), where await and async are used to rework synchronous requests.

I think this dump is the best proof of this example! 😄 😄 😄

For more high-quality content, see my GitHub: dotnetfly