Case 1: A shared thread pool lets secondary logic overwhelm the primary logic

I have a service with a single thread pool. The order service drops asynchronous requests into this pool, and so does the weather query service.

As shown in the figure, both services submit tasks to the same thread pool and wait for their tasks to complete.

What’s wrong with that?

If the weather service responds too slowly, its tasks occupy a large share of the thread pool. There are then not enough threads left to run the order service's tasks, and orders start to fail. A jitter in a secondary service takes down the primary logic, which must never be allowed. How do you solve this problem?

The fix is actually simple: isolate the thread pools. Use one dedicated pool for order requests and another for weather requests, so the two cannot affect each other.
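Here is a minimal sketch of that isolation in Java; the pool sizes and the placeOrder/queryWeather task names are made-up placeholders, not part of the original setup:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IsolatedPools {
    // Separate pools: a slow weather service can only exhaust its own pool,
    // it can no longer starve the order pool.
    private final ExecutorService orderPool   = Executors.newFixedThreadPool(20);
    private final ExecutorService weatherPool = Executors.newFixedThreadPool(5);

    public void submitOrder(Runnable placeOrder) {
        orderPool.submit(placeOrder);       // primary logic
    }

    public void submitWeatherQuery(Runnable queryWeather) {
        weatherPool.submit(queryWeather);   // secondary logic
    }
}
```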

Case 2: An unreasonable blocking queue leads to a precipitous drop in throughput

What do I mean by a precipitous drop? For example, throughput that used to be in the hundreds of requests per second suddenly collapses to single digits.

This problem is especially common in IO-intensive scenarios.

Suppose a request comes in that needs to call 20 downstream interfaces. Assuming there are no dependencies among those 20 calls, we can submit them all to a thread pool and then wait for every call to return. This is exactly the situation where you can step on a pit.

We configure the thread pool with 20 core threads, a maximum of 100 threads, a blocking queue of 1024, and a rejection policy that makes the calling thread execute the task (CallerRunsPolicy).
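Roughly, the setup looks like the sketch below. The class name, the callDownstream placeholder, and the 60-second keepAlive are assumptions for illustration; the core/max/queue sizes and the caller-runs rejection policy come from the description above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class FanOutDemo {
    // 20 core threads, 100 max threads, a 1024-slot FIFO queue,
    // and rejected tasks run on the calling thread.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            20, 100,
            60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(1024),
            new ThreadPoolExecutor.CallerRunsPolicy());

    public List<String> handleRequest() throws Exception {
        // Fan out the 20 independent downstream calls for one incoming request...
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            int idx = i;
            futures.add(pool.submit(() -> callDownstream(idx)));
        }
        // ...then block until every sub-call has returned.
        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) {
            results.add(f.get());
        }
        return results;
    }

    // Placeholder for a real IO-bound downstream call.
    private String callDownstream(int idx) {
        return "result-" + idx;
    }
}
```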

When traffic is low, the core threads plus the blocking queue can keep up. As requests increase, overall response times get slower, because a single request can occupy all 20 core threads at once, so other sub-tasks are parked in the blocking queue and cannot run immediately.

As the request volume grows further and the blocking queue fills up, extra threads are created up to the maximum, and response times keep getting worse.

Up to this point everything behaves as we expected. But as the request volume keeps climbing, throughput does not just dip, it collapses.

Say we can handle 200 requests per second at this point. With only a slight increase in load, we might suddenly be able to handle just 10 requests per second!

Why does this happen? The problem is the blocking queue. Once all the threads are busy, new sub-tasks join the blocking queue, and note that it is a first-in, first-out queue. What if your sub-task sits at the tail of that queue? It has to wait for up to 1024 queued sub-tasks ahead of it to run before it even gets a chance, while its parent request blocks the whole time waiting for it.

Even worse, this blocking propagates to the calling threads. Tomcat, for example, has 200 worker threads; if all of them are stuck here waiting for sub-tasks, Tomcat can no longer serve any requests.

In fact, once this problem kicks in, the "concurrent" fan-out is usually slower than just calling the interfaces serially.

So how should this problem be solved?

Our fix was to let parallel calls degrade to serial calls: response times get slower, but throughput no longer falls off a cliff. Concretely, we replaced the blocking queue with a SynchronousQueue and kept the rejection policy that hands the task back to the calling thread. Once the thread count reaches the maximum, new sub-tasks are executed by the caller itself, which effectively degrades to serial execution.
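A sketch of that change, reusing the earlier core and maximum thread counts (the 60-second keepAlive is again an assumption):

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DegradeToSerial {
    // A SynchronousQueue holds no tasks: each submit either hands the task to
    // an idle thread, starts a new thread (up to the max of 100), or is
    // rejected. CallerRunsPolicy then runs the rejected task on the calling
    // thread, so under overload the fan-out degrades to serial execution
    // instead of piling sub-tasks up behind a 1024-deep FIFO queue.
    static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
            20, 100,
            60, TimeUnit.SECONDS,
            new SynchronousQueue<>(),
            new ThreadPoolExecutor.CallerRunsPolicy());
}
```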

If response time is sensitive, you can apply rate limiting instead.
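One simple way to do that is a concurrency limit in front of the fan-out. The sketch below uses a plain Semaphore with an arbitrary limit of 50 concurrent requests, purely to illustrate the idea:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

public class FanOutLimiter {
    // Allow at most 50 requests to fan out at the same time; anything over the
    // limit fails fast instead of slowing every in-flight request down.
    private static final Semaphore PERMITS = new Semaphore(50);

    public static <T> T withLimit(Callable<T> call) throws Exception {
        if (!PERMITS.tryAcquire()) {
            throw new RejectedExecutionException("too many concurrent fan-outs");
        }
        try {
            return call.call();
        } finally {
            PERMITS.release();
        }
    }
}
```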

If you look at the CPU at this point, utilization is not high; the server is actually far from fully used.

So how can we make full use of the server? Java does not have a great answer here: a Java thread maps one-to-one to an operating system thread, so threads are expensive to create and you cannot simply spawn an unlimited number of them. Yet the core problem is exactly that there are not enough threads.

So for this kind of IO-intensive workload Java is not a great fit; Go is a good choice.

Case 3: I clearly set keepAliveTime, so why are threads not reclaimed?

For example, I have a thread pool configured as follows:

10 core threads, a maximum of 20 threads, a blocking queue of 500, and a keepAliveTime of 10 seconds.
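In code, that configuration is roughly the following; the rejection policy and thread factory are not specified above, so the defaults are used:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Case3Pool {
    // 10 core threads, 20 max threads, a 500-slot queue, and non-core threads
    // that should be reclaimed after 10 seconds of idleness.
    static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
            10, 20,
            10, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(500));
}
```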

One of my online services uses this thread pool, and it receives traffic all the time.

Through thread pool monitoring I found that, for whatever reason, the pool had grown to its maximum of 20 threads. Later that evening, when the service load was low, I looked again: only 2 threads were actually active, yet the pool size stayed at 20 and never shrank back to the 10 core threads.

This one is actually simple. Imagine two people working flat out to get the job done; then I hire 8 more, and now 10 people share the work that two used to do. Everyone gets a little work, so nobody ever counts as idle, even though all of them are mostly idle.

The thread pool hits the same situation: because this is an online service with constant traffic, tasks keep being spread across all the threads, no thread stays idle for a full keepAliveTime, and the extra threads are never reclaimed.

One workaround is, when the pool is not busy, to set keepAliveTime to 0 so the excess threads are reclaimed immediately, and then set it back a few seconds later.
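As a rough sketch of that trick: a scheduled task watches the pool and, when it looks over-provisioned, briefly drops keepAliveTime to 0 and then restores the 10-second value. The thresholds and intervals below are made up, and it assumes allowCoreThreadTimeOut stays false (a zero keepAliveTime is otherwise rejected):

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolShrinker {
    // Every 30 seconds, check whether the pool holds far more threads than the
    // current load needs; if so, drop keepAliveTime to 0 so the idle non-core
    // threads are reclaimed right away, then restore the normal 10s a bit later.
    public static void autoShrink(ThreadPoolExecutor pool, ScheduledExecutorService scheduler) {
        scheduler.scheduleAtFixedRate(() -> {
            if (pool.getPoolSize() > pool.getCorePoolSize()
                    && pool.getActiveCount() <= pool.getCorePoolSize() / 2) {
                pool.setKeepAliveTime(0, TimeUnit.SECONDS);                 // reclaim now
                scheduler.schedule(
                        () -> pool.setKeepAliveTime(10, TimeUnit.SECONDS),  // set it back
                        5, TimeUnit.SECONDS);
            }
        }, 30, 30, TimeUnit.SECONDS);
    }
}
```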