Reprinted from public account: JavaGuide

At around 5,000 words, this is a fairly dense article. The title is a little exaggerated, heh, but it really is a summary of the lessons I consider most important from my own use of thread pools.

Review of thread pool knowledge

Why use thread pools?

Pooling techniques should be familiar to everyone by now. Thread pools, database connection pools, HTTP connection pools, and so on are all applications of the same idea: reduce the cost of acquiring the resource each time and improve resource utilization.

Thread pools provide a way to limit and manage resources (including the tasks being executed). Each thread pool also maintains some basic statistics, such as the number of completed tasks.

Here are some of the benefits of using thread pools, borrowed from The Art of Concurrent Programming in Java:

  • Reduce resource consumption. Reduce the cost of thread creation and destruction by reusing created threads.
  • Improve response speed. When a task arrives, it can be executed immediately without waiting for the thread to be created.
  • Improve thread manageability. Threads are scarce resources. If they are created without limit, they will not only consume system resources, but also reduce system stability. Thread pools can be used for unified allocation, tuning, and monitoring.

Usage scenarios for thread pools in real projects

Thread pools are generally used to execute multiple unrelated time-consuming tasks. In the absence of multi-threading, tasks are executed sequentially. With thread pools, multiple unrelated tasks can be executed simultaneously.

Suppose we have three unrelated, time-consuming tasks to run. The diagram below shows the difference before and after using a thread pool.

Note: The following three tasks may or may not do the same thing.

Before and after using thread pools

How do I use thread pools?

Create a thread pool with the ThreadPoolExecutor constructor, then submit tasks to the pool for execution.

The ThreadPoolExecutor constructor looks like this:

/**
 * Creates a new ThreadPoolExecutor with the given initial parameters.
 */
public ThreadPoolExecutor(int corePoolSize,                   // the number of core threads
                          int maximumPoolSize,                // the maximum number of threads in the thread pool
                          long keepAliveTime,                 // how long threads beyond the core size may stay idle before being terminated
                          TimeUnit unit,                      // the time unit of keepAliveTime
                          BlockingQueue<Runnable> workQueue,  // the queue that holds tasks waiting to be executed
                          ThreadFactory threadFactory,        // the factory used to create new threads
                          RejectedExecutionHandler handler) { // the saturation policy; we can customize how rejected tasks are handled
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

Here is a quick demonstration of how to use a thread pool:

private static final int CORE_POOL_SIZE = 5;
private static final int MAX_POOL_SIZE = 10;
private static final int QUEUE_CAPACITY = 100;
private static final Long KEEP_ALIVE_TIME = 1L;

public static void main(String[] args) {
    // Create the thread pool through the ThreadPoolExecutor constructor
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
            CORE_POOL_SIZE,
            MAX_POOL_SIZE,
            KEEP_ALIVE_TIME,
            TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(QUEUE_CAPACITY),
            new ThreadPoolExecutor.CallerRunsPolicy());

    for (int i = 0; i < 10; i++) {
        executor.execute(() -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("CurrentThread name:" + Thread.currentThread().getName() + "date:" + Instant.now());
        });
    }
    // Terminate the thread pool
    executor.shutdown();
    try {
        executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    System.out.println("Finished all threads");
}

Console output:

CurrentThread name:pool-1-thread-5date:2020-06-06T11:45:31.639Z
CurrentThread name:pool-1-thread-3date:2020-06-06T11:45:31.639Z
CurrentThread name:pool-1-thread-1date:2020-06-06T11:45:31.636Z
CurrentThread name:pool-1-thread-4date:2020-06-06T11:45:31.639Z
CurrentThread name:pool-1-thread-2date:2020-06-06T11:45:31.639Z
CurrentThread name:pool-1-thread-2date:2020-06-06T11:45:33.656Z
CurrentThread name:pool-1-thread-4date:2020-06-06T11:45:33.656Z
CurrentThread name:pool-1-thread-1date:2020-06-06T11:45:33.656Z
CurrentThread name:pool-1-thread-3date:2020-06-06T11:45:33.656Z
CurrentThread name:pool-1-thread-5date:2020-06-06T11:45:33.656Z
Finished all threads

Thread pool best practices

Here is a quick summary of the pitfalls and good practices I know of when using thread pools; there doesn't seem to be an article online devoted specifically to this.


Because Guide is still a beginner, there may be things to supplement, improve, or correct; feel free to let me know in the comments or reach out to me on WeChat.

1. Declare thread pools manually using the constructor of ThreadPoolExecutor

Thread pools must be declared manually via the ThreadPoolExecutor constructor. Avoid Executors' newFixedThreadPool and newCachedThreadPool, which carry OOM risks.

The drawbacks of the thread pool objects returned by Executors are as follows:

FixedThreadPool and SingleThreadExecutor: the allowed request queue length is Integer.MAX_VALUE, so requests can pile up until an OOM occurs.

CachedThreadPool and ScheduledThreadPool: the number of threads that may be created is Integer.MAX_VALUE, so a large number of threads may be created, resulting in OOM.
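For reference, here is roughly what newFixedThreadPool and newCachedThreadPool look like in the JDK (simplified from java.util.concurrent.Executors; the exact source may differ slightly across JDK versions):

public static ExecutorService newFixedThreadPool(int nThreads) {
    // Unbounded queue: waiting tasks can pile up without limit, which can end in OOM
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

public static ExecutorService newCachedThreadPool() {
    // maximumPoolSize is Integer.MAX_VALUE: threads can be created almost without limit, which can end in OOM
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>());
}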

In plain English: use a bounded queue and put a cap on the number of threads created.

Besides the OOM risk, there are two more reasons why the convenience thread pools provided by Executors are not recommended:

  1. In real projects, you need to configure the thread pool's parameters yourself (the number of core threads, the task queue to use, the saturation policy, and so on) according to your machine's capacity and your business scenario.
  2. We should name our thread pool explicitly to help us locate problems.

2. Monitor the thread pool running status

You can check the running status of a thread pool by means such as the Actuator component in Spring Boot.

In addition, we can use ThreadPoolExecutor's own API to do some crude monitoring. As you can see in the figure below, ThreadPoolExecutor provides access to the current number of threads and active threads in the pool, the number of completed tasks, the number of queued tasks, and so on.

Here is a simple demo: printThreadPoolStatus() prints the thread pool's thread count, active thread count, completed task count, and queued task count once per second.

    /**
     * Print the thread pool's status.
     *
     * @param threadPool the thread pool object
     */
    public static void printThreadPoolStatus(ThreadPoolExecutor threadPool) {
        // createThreadFactory(...) is a helper of the surrounding project (not shown here) that builds a named thread factory
        ScheduledExecutorService scheduledExecutorService = new ScheduledThreadPoolExecutor(1,
                createThreadFactory("print-thread-pool-status", false));
        scheduledExecutorService.scheduleAtFixedRate(() -> {
            log.info("=========================");
            log.info("ThreadPool Size: [{}]", threadPool.getPoolSize());
            log.info("Active Threads: {}", threadPool.getActiveCount());
            log.info("Number of Tasks : {}", threadPool.getCompletedTaskCount());
            log.info("Number of Tasks in Queue: {}", threadPool.getQueue().size());
            log.info("=========================");
        }, 0, 1, TimeUnit.SECONDS);
    }

3. You are advised to use different thread pools for different types of services

Many people run into this question in real projects: should each service define its own thread pool, or should the whole project share one common thread pool?

It is recommended that different services use different thread pools, each configured according to that service's own situation. Different services have different concurrency levels and resource usage, so focus tuning on the services that are tied to the system's performance bottlenecks.

Let's look at a real production incident! (This case comes from "A production incident caused by improper use of a thread pool" at https://club.perfma.com/article/646639, a great case study.)

Case code overview

The code above can deadlock. Why? Let me draw a picture to explain.

Consider this extreme case:

Suppose the thread pool has n core threads and there are n parent tasks (deduction tasks), each of which has two subtasks. For each parent task, one subtask has already finished and the other is sitting in the task queue. Because the parent tasks have used up all of the pool's core threads, the subtasks cannot get a thread and stay blocked in the queue. The parent tasks wait for their subtasks to finish, while the subtasks wait for the parent tasks to release thread pool resources, which produces a "deadlock".

The solution is simply to add a new thread pool for executing subtasks.
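Below is a minimal sketch of the pattern (the class and pool names are hypothetical, not the code from the original incident): parent tasks submit their subtasks to the same pool and block waiting for them, so as soon as every core thread is occupied by a parent task, the program hangs.

import java.util.concurrent.*;

public class NestedSubmitDeadlock {

    // One pool shared by parent tasks and subtasks: prone to this kind of "deadlock".
    private static final ExecutorService SHARED_POOL = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

    public static void main(String[] args) {
        for (int i = 0; i < 2; i++) {
            SHARED_POOL.execute(() -> {
                // The parent task submits its subtask to the SAME pool...
                Future<String> sub = SHARED_POOL.submit(() -> "subtask result");
                try {
                    // ...and blocks waiting for it. Both core threads are held by parent
                    // tasks, so the subtasks never leave the queue and get() never returns.
                    System.out.println(sub.get());
                } catch (InterruptedException | ExecutionException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // The fix: submit the subtasks to a separate pool instead of SHARED_POOL.
    }
}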

4. Don’t forget to name the thread pool

When initializing a thread pool, you need to explicitly specify a meaningful name (set a thread name prefix), which helps to locate problems.

By default, the name of the created thread is similar to pool-1-thread-n, which has no service meaning and is not conducive to fault locating.

There are two common ways to name threads in a thread pool:

1. Leverage Guava’s ThreadFactoryBuilder

ThreadFactory threadFactory = new ThreadFactoryBuilder()
                        .setNameFormat(threadNamePrefix + "-%d")
                        .setDaemon(true).build();
ExecutorService threadPool = new ThreadPoolExecutor(corePoolSize, maximumPoolSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);

2. Implement the ThreadFactory interface yourself.

import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Thread factory that sets thread names, which helps us locate problems.
 */
public final class NamingThreadFactory implements ThreadFactory {

    private final AtomicInteger threadNum = new AtomicInteger();
    private final ThreadFactory delegate;
    private final String name;

    /**
     * Creates a naming thread factory that delegates actual thread creation to {@code delegate}.
     */
    public NamingThreadFactory(ThreadFactory delegate, String name) {
        this.delegate = delegate;
        this.name = name; // TODO consider uniquifying this
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = delegate.newThread(r);
        t.setName(name + "[#" + threadNum.incrementAndGet() + "]");
        return t;
    }
}
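A minimal usage sketch (the pool parameters and the "order-service" prefix are only illustrative values): wrap the default thread factory and hand the result to the ThreadPoolExecutor constructor.

ThreadFactory namedFactory = new NamingThreadFactory(Executors.defaultThreadFactory(), "order-service");
ExecutorService threadPool = new ThreadPoolExecutor(5, 10, 1L, TimeUnit.MINUTES,
        new ArrayBlockingQueue<>(100), namedFactory, new ThreadPoolExecutor.AbortPolicy());
// Threads created by this pool will be named "order-service[#1]", "order-service[#2]", and so on.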

5. Set thread pool parameters correctly

Speaking of how to configure thread pool parameters, Meituan's clever approach has impressed me to this day (more on that later)!

Let’s take a look at the recommended ways to configure thread pool parameters from various books and blogs.

The conventional approach

Many people might even think that the bigger the thread pool, the better! I think this is clearly problematic. Take a very common example from everyday life: having more people does not necessarily get a job done better, it just increases communication costs. If a job only needs 3 people but you assign 6, does efficiency improve? I don't think so. The impact of too many threads is similar to assigning too many people to a job; in the multi-threaded case the main cost is the increase in context switching. If you are not sure what context switching is, you can read my explanation below.

Context switch:

In multithreaded programming, the number of threads is generally greater than the number of CPU cores, and a CPU core can only be used by one thread at any given moment. To let all these threads run effectively, the CPU allocates a time slice to each thread and rotates among them. When a thread's time slice is used up, the thread goes back to the ready state and yields the CPU to another thread; this process is a context switch. To summarize: once its CPU time slice runs out, the current task saves its state before switching to another task, so that its state can be reloaded the next time it is switched back in. The cycle from saving a task's state to reloading it is one context switch.

Context switching is usually computationally intensive. In other words, it requires a considerable amount of processor time: with tens or hundreds of switches per second, each switch takes time on the order of nanoseconds. Context switching therefore consumes a lot of CPU time for the system, and may in fact be the most time-consuming operation in the operating system.

One of the many advantages Linux has over other operating systems, including other Unix-like systems, is that context switching and mode switching require very little time.

One thing we can say for sure is that setting the thread pool size too large or too small both cause problems; an appropriate size is best.

If we set the number of threads too small, then when a large number of tasks/requests arrive at the same time, many of them will sit in the task queue waiting to execute; the queue may fill up so that further tasks/requests cannot be handled, or a large backlog in the queue may lead to OOM. This is clearly a problem, and the CPU is not being fully utilized either.

However, if we set the number of threads too large, a large number of threads may compete for CPU resources at the same time, which can cause a lot of context switching, increasing the execution time of the threads and affecting the overall execution efficiency.

Here’s a simple formula that works for a wide range of applications:

  • CPU-intensive tasks (N+1): tasks that mainly consume CPU resources. The number of threads can be set to N (the number of CPU cores) + 1. The one extra thread is there so that an occasional page fault, or a pause caused by some other reason, does not leave the CPU idle; in that case the extra thread can make full use of the idle CPU time.
  • I/O-intensive tasks (2N): tasks where the system spends most of its time handling I/O. A thread does not occupy the CPU while it is waiting on I/O, so the CPU can be handed to another thread. For I/O-intensive applications we can therefore configure more threads, typically 2N (a quick calculation sketch follows this list).
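As a quick sketch of the rule of thumb above (the variable names are mine, and these numbers are only starting points that should be validated with load testing):

// Rule-of-thumb starting points for sizing a thread pool
int nCpu = Runtime.getRuntime().availableProcessors(); // N

int cpuBoundThreads = nCpu + 1; // CPU-intensive tasks: N + 1
int ioBoundThreads  = 2 * nCpu; // I/O-intensive tasks: 2N

ExecutorService cpuBoundPool = new ThreadPoolExecutor(
        cpuBoundThreads, cpuBoundThreads, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());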

How do you determine whether a task is CPU-intensive or I/O-intensive?

CPU-intensive simply means tasks that use the CPU's computing power, such as sorting a large amount of data in memory. Anything involving network reads or file reads is I/O-intensive. The characteristic of such tasks is that the CPU computation time is very small compared to the time spent waiting for I/O operations to complete; most of the time is spent waiting for I/O.

Meituan's clever approach

The Meituan technical team introduced the idea and method of making thread pool parameters dynamically configurable in their article "Implementation Principle of Java Thread Pool and Its Practice in Meituan Business".

The Meituan technical team's idea is to make the thread pool's core parameters configurable at runtime. The three core parameters are:

  • corePoolSize: the core number of threads, which defines the minimum number of threads that can run at the same time.
  • maximumPoolSize: when the number of tasks in the queue reaches the queue capacity, the number of threads that can run simultaneously grows up to this maximum.
  • workQueue: when a new task arrives, the pool first checks whether the number of currently running threads has reached the core size; if it has, the new task is placed in this queue.

Why these three parameters?

As I mentioned in my Thread Pool Learning Summary, these three parameters are the most important parameters of ThreadPoolExecutor and basically determine the thread pool's processing strategy for tasks.

How to support dynamic parameter configuration? Take a look at these methods provided by ThreadPoolExecutor.

Pay particular attention to corePoolSize: calling setCorePoolSize() at runtime makes the thread pool check whether the current number of worker threads is greater than the new corePoolSize; if it is, the excess worker threads are reclaimed.
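A minimal sketch of adjusting a live pool with these setters (the new values here are arbitrary placeholders; setCorePoolSize, setMaximumPoolSize and setKeepAliveTime are standard ThreadPoolExecutor methods):

// Adjust the core parameters of a running pool, e.g. when a config-center value changes
threadPool.setCorePoolSize(newCoreSize);
threadPool.setMaximumPoolSize(newMaxSize);
threadPool.setKeepAliveTime(newKeepAliveSeconds, TimeUnit.SECONDS);
// Note: there is no setter for the work queue's capacity, which is why Meituan
// resorts to a custom resizable queue (see below).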

Also, you may notice that there is no method for dynamically changing the queue capacity. Meituan's approach is a custom queue called ResizableCapacityLinkedBlockIngQueue (essentially LinkedBlockingQueue with the final keyword removed from its capacity field so that the capacity becomes mutable).

The resulting dynamically modifiable thread pool parameters are as follows.

The final effect of dynamically configuring thread pool parameters