Hard to believe? Once you understand how Java thread pools work, your interviewer may be the one offering you a raise

Let’s get started. Feedback is welcome.

Thread pools come up in interviews at many companies. Why are interviewers so keen on them? Because they are the foundation of multithreaded programming in Java: you need to know how to set the key parameters of ThreadPoolExecutor, which Executor suits which scenario, how to choose the work queue, and which rejection policy goes with it.

Here are some interview questions about thread pools:

What are the usage scenarios for thread pools?

What is the meaning of the parameters of the thread pool? What queues and rejection policies do you use?

Where are thread pools used in your program, and what are the benefits of using them?

How would you implement a thread pool yourself?

Which thread pools does the JDK provide by default?

Why is using the JDK’s default thread pools discouraged?

How did you figure out the parameters in the thread pool, and based on what?

Describe the workflow in your custom thread pool

…

This article does not analyze these questions one by one. It covers only the core principles of thread pools and the practice of dynamically adjusting their parameters, so that you come away with a clear understanding. Once you know the principles and combine them with your own practice, thread pool questions in an interview become easy to handle.

The concept of thread pool

1.1 What is a thread pool

A thread pool is a thread usage pattern. Too many threads bring extra costs, including the cost of creating and destroying threads and the cost of scheduling them, and they reduce the overall performance of the machine. A thread pool maintains multiple threads that wait for a supervisor to assign tasks that can be executed concurrently. On the one hand, this avoids the cost of creating and destroying threads while processing tasks; on the other hand, it avoids the excessive scheduling caused by an ever-growing number of threads and keeps the CPU cores fully utilized.

1.2 Benefits of using thread pools

Reduced resource consumption: Reuse of created threads through pooling techniques to reduce wastage from thread creation and destruction.

Improved response time: Tasks can be executed immediately when they arrive without waiting for threads to be created.

Improve manageability of threads: Threads are scarce resources. If they are created without limit, they will not only consume system resources, but also cause resource scheduling imbalance due to unreasonable distribution of threads, which reduces system stability. Thread pools allow for uniform allocation, tuning, and monitoring.

More functionality: Thread pools are extensible, allowing developers to add more functionality to them. For example, the scheduled thread pool ScheduledThreadPoolExecutor allows tasks to be executed after a delay or periodically.

1.3 Core parameters of ThreadPoolExecutor

Summaries found online are no substitute for reading Doug Lea’s comments in the source code directly.

corePoolSize: the number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set

Core threads: The number of threads that remain in the thread pool, even if they are idle, unless allowCoreThreadTimeOut is set.

maximumPoolSize: the maximum number of threads to allow in the pool

Maximum number of threads: The maximum number of threads allowed in the thread pool.

keepAliveTime: when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating.

Thread idle time: A thread beyond the core thread count is reclaimed if it has not received a new task after keepAliveTime.

unit: the time unit for the keepAliveTime argument

Unit: The time unit of keepAliveTime.

workQueue: the queue to use for holding tasks before they are executed. This queue will hold only the Runnable tasks submitted by the execute method.

Queue for pending tasks: Once the number of threads has reached the core thread count, newly submitted tasks are stored here. It holds only Runnable tasks submitted through the execute method.

threadFactory: the factory to use when the executor creates a new thread

Thread factory: The factory the executor uses to create new threads. For example, with a custom thread factory that names its threads, we can tell from the thread name where a thread came from when troubleshooting and locate the problem quickly.

handler: the handler to use when execution is blocked because the thread bounds and queue capacities are reached

Rejection policy: When the queue is full and the maximum number of threads are all busy, the thread pool cannot handle further submitted tasks. The rejection policy decides what happens to them.
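To make the seven parameters concrete, here is a minimal sketch that passes all of them to the ThreadPoolExecutor constructor. The class name PoolParametersDemo, the helper namedFactory, the prefix "order-pool", and the specific sizes are illustrative assumptions, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolParametersDemo {

    // Hypothetical factory: names each thread so it is easy to spot in a thread dump.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger(1);
        return r -> new Thread(r, prefix + "-" + seq.getAndIncrement());
    }

    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                    // corePoolSize
                4,                                    // maximumPoolSize
                30L,                                  // keepAliveTime
                TimeUnit.SECONDS,                     // unit
                new ArrayBlockingQueue<>(10),         // workQueue (bounded)
                namedFactory("order-pool"),           // threadFactory
                new ThreadPoolExecutor.AbortPolicy()  // handler
        );
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool();
        pool.execute(() ->
                System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Naming the threads costs almost nothing and is what makes the troubleshooting described under threadFactory possible.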

The implementation principle of thread pool

The thread pool described in this article is the ThreadPoolExecutor class provided in JDK 8. Let’s start with the UML class diagram of ThreadPoolExecutor.

2.1 Overall Design

Solid blue line: inheritance relationship

Dotted green line: interface implementation relationship

Solid green line: interface inheritance

The top-level interface of ThreadPoolExecutor implementation is Executor. The top-level interface only provides void execute(Runnable command). An Executor provides the idea of decoupling task submission from task execution. You do not need to worry about how to create a thread or schedule a thread to execute a task. You only need to provide a Runnable object and submit the execution logic of a task to an Executor. The Executor framework takes care of thread allocation and task execution.

The ExecutorService interface adds some capabilities:

Extends the ability to execute tasks by adding methods that produce Futures for one asynchronous task or a group of them;

Provides methods for managing thread pools, such as stopping them from running.

AbstractExecutorService is a high-level abstract class that strings together the process of performing a task, ensuring that the underlying implementation only needs to focus on a single method to perform the task. The lowest implementation class, ThreadPoolExecutor, implements the most complex part of the run. ThreadPoolExecutor will maintain its own life cycle while managing threads and tasks in a good combination to execute parallel tasks.

Let’s look at the flow of ThreadPoolExecutor:

Internally, a thread pool is built as a producer-consumer model. It decouples threads from tasks so that they are not directly tied to each other, which lets it buffer tasks and reuse threads. The operation of a thread pool is divided into two parts: task management and thread management.

The task management part acts as a producer, and when a task is submitted, the thread pool determines the subsequent flow of the task:

Apply directly to the thread to perform the task

Buffered to a queue waiting for the thread to execute

Reject the task

The thread management part acts as the consumer. Threads are maintained centrally in the thread pool and allocated according to task requests. When a thread finishes a task, it goes on to fetch a new one. When a thread can no longer obtain a task, it is reclaimed.

The following three core mechanisms explain thread pool operation in detail:

How does a thread pool maintain its state

How do thread pools manage tasks

How does a thread pool manage threads

2.2 How does a thread pool maintain its state

The running state of the thread pool is not explicitly set by the user, but is maintained internally along with the running of the thread pool. A variable is used internally to maintain two values: runState and number of threads (workerCount).

The field ctl, an AtomicInteger, controls both the running state of the thread pool (runState) and the number of valid threads in the pool (workerCount). The runState is stored in the high 3 bits and the workerCount in the low 29 bits, so the two values do not interfere with each other. Storing both values in one variable avoids inconsistencies when decisions depend on both, and no lock has to be held to keep them consistent.

As you can see from reading the thread pool source code, it is often necessary to check the running state and the number of threads at the same time. The thread pool also provides several methods for reading the current running state and the number of threads. All of them are implemented with bit operations, which are very cheap.

Internally, the life cycle state and the number of threads are extracted from ctl with bit masks.
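The packing can be sketched as a standalone class. The constant and helper names below follow the private members of ThreadPoolExecutor in the JDK 8 source as far as I recall them; the class CtlDemo itself is a hypothetical wrapper for illustration.

```java
class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // mask with the low 29 bits set

    // Run states are stored in the high 3 bits.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    // Keep only the high 3 bits.
    static int runStateOf(int c)     { return c & ~CAPACITY; }

    // Keep only the low 29 bits.
    static int workerCountOf(int c)  { return c & CAPACITY; }

    // Combine a run state and a worker count into one int.
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 3);
        System.out.println("workerCount = " + workerCountOf(ctl));
        System.out.println("is RUNNING  = " + (runStateOf(ctl) == RUNNING));
        System.out.println("CAPACITY    = " + Integer.toBinaryString(CAPACITY));
    }
}
```

Because the states are ordered numerically (RUNNING is negative, the others grow from 0), a check such as "at least SHUTDOWN" becomes a plain integer comparison.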

Why can an integer variable hold both the state of the run and the number of threads?

First, an int in Java takes four bytes, so it has 32 bits.

So the binary representation of the integer 1 is: 0000 0000 0000 0000 0000 0000 0000 0001

The binary representation of the integer -1 is: 1111 1111 1111 1111 1111 1111 1111 1111

In ThreadPoolExecutor, the high 3 bits of the 32-bit integer represent the thread pool state, and the low 29 bits represent the number of valid threads in the pool.

CAPACITY = (1 << 29) - 1: 0001 1111 1111 1111 1111 1111 1111 1111

The binary representation of 1 is 0000 0000 0000 0000 0000 0000 0000 0001.

Shifting it 29 bits to the left gives 0010 0000 0000 0000 0000 0000 0000 0000.

Finally, 0010 0000 0000 0000 0000 0000 0000 0000 minus 1 is 0001 1111 1111 1111 1111 1111 1111 1111.

Let’s take a look at the states defined by ThreadPoolExecutor, which are closely related to thread execution:

RUNNING: Can accept newly submitted tasks and also process tasks in a blocking queue.

SHUTDOWN: Entered when the shutdown() method is called. Newly submitted tasks are no longer accepted, but tasks already saved in the blocking queue continue to be processed.

STOP: Entered when the shutdownNow() method is called. Newly submitted tasks are no longer accepted, tasks in the blocking queue are no longer processed, and threads that are executing tasks are interrupted.

TIDYING: All tasks have terminated and workerCount is 0. The thread that makes this transition runs the terminated() hook method.

TERMINATED: Entered after the terminated() hook method has completed.
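The run state itself is private, but the transitions can be observed from outside through isShutdown(), awaitTermination(), isTerminated(), and the protected terminated() hook. A minimal sketch, assuming a hypothetical subclass HookedPool that only records whether the hook ran:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class LifecycleDemo {

    // Hypothetical subclass: records that the terminated() hook was invoked.
    static class HookedPool extends ThreadPoolExecutor {
        final AtomicBoolean terminatedHookRan = new AtomicBoolean(false);

        HookedPool() {
            super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        }

        @Override
        protected void terminated() {
            terminatedHookRan.set(true);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HookedPool pool = new HookedPool();
        pool.execute(() -> System.out.println("task ran"));   // accepted while RUNNING

        pool.shutdown();                                      // RUNNING -> SHUTDOWN
        try {
            pool.execute(() -> System.out.println("never runs"));
        } catch (RejectedExecutionException e) {
            System.out.println("new task rejected after shutdown()");
        }

        // Returns once the pool has passed through TIDYING and reached TERMINATED.
        boolean terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("terminated = " + terminated
                + ", hook ran = " + pool.terminatedHookRan.get());
    }
}
```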

2.3 How does a Thread Pool manage tasks

2.3.1 Task Scheduling

Task scheduling is the main entry point to the thread pool. When a user submits a task, how the task will be executed is determined by this stage. Understanding this section is equivalent to understanding the core workings of thread pools.

First, all tasks are scheduled through the execute method, as in our business code:

threadPool.execute(new Job());

This part of the job is to check the current running state of the thread pool, the number of running threads, the running policy, and determine the next process to execute, whether to directly apply for the thread to execute, buffer to the queue for execution, or simply reject the task. Its execution process is as follows:

First, check the RUNNING status of the thread pool. If it is not RUNNING, reject it directly. Ensure that the thread pool executes tasks in the RUNNING state.

If workerCount < corePoolSize, a thread is created and started to execute the newly submitted task.

If workerCount >= corePoolSize and the blocking queue in the thread pool is not full, the task is added to the blocking queue.

If workerCount >= corePoolSize && workerCount < maximumPoolSize and the blocking queue in the thread pool is full, a thread is created and started to execute the newly submitted task.

If workerCount >= maximumPoolSize and the blocking queue in the thread pool is full, the task is handled according to the rejection policy. The default policy, AbortPolicy, throws a RejectedExecutionException.
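The steps above can be watched in a small experiment. The sketch below uses a deliberately tiny pool (one core thread, at most two threads, a queue of capacity one) and four tasks that block on a latch; the class name ExecuteFlowDemo and all sizes are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteFlowDemo {

    static List<String> run() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        // Every task blocks until released, so no thread becomes free during the demo.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task " + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task " + i + ": rejected");
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        run().forEach(System.out::println);
    }
}
```

Task 1 starts the core thread, task 2 goes into the queue, task 3 finds the queue full and starts a second thread, and task 4 is rejected because both the queue and the thread limit are exhausted.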

The execution flow chart is as follows:

2.3.2 Queue of Tasks to be executed

The queue of tasks to be executed is a core part of the thread pool’s ability to manage tasks. The essence of thread pool is the management of tasks and threads, and the key idea to achieve this is to decouple the tasks and threads from the direct correlation, so that the subsequent allocation work can be done. Thread pools are implemented in producer-consumer mode through a blocking queue. The blocking queue caches tasks from which the worker thread retrieves them.

A BlockingQueue is a queue that supports two additional operations.

The two additional operations are:

When the queue is empty, a thread that tries to take an element waits until the queue becomes non-empty.

When the queue is full, a thread that tries to store an element waits until the queue has free space.

Blocking queues are often used in producer-consumer scenarios, where the producer is the thread that adds elements to the queue and the consumer is the thread that takes elements from it. The blocking queue is the container into which producers put elements and from which consumers take them.
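A minimal sketch of the two blocking operations, using put and take on an ArrayBlockingQueue with room for a single element. The class name BlockingQueueDemo and the method transfer are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BlockingQueueDemo {

    // Moves the numbers 0..n-1 from a producer thread to the calling (consumer) thread.
    static List<Integer> transfer(int n) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    queue.put(i);          // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            received.add(queue.take());    // blocks while the queue is empty
        }
        producer.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(transfer(5));
    }
}
```

With a capacity of one, the producer and consumer are forced to take turns, yet neither side contains any explicit waiting or signalling code; the queue does the coordination.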

The following illustration shows Thread1 adding elements to the blocking queue and Thread2 removing elements from the blocking queue:

Different queues can implement different task access strategies. Let’s look at blocking queue members:

2.3.3 Task Application

As can be seen from the above, there are two possibilities for the execution of the task:

In one case, the task is executed directly by the newly created thread

The other is that the thread retrieves the task from the task queue and executes it, and the idle thread that completes the task will apply for the task from the queue again to execute it.

The first case happens only when a thread is first created; the second is how a thread obtains the vast majority of its tasks.

A thread therefore has to keep fetching tasks from the queue of pending tasks. The module that helps the thread fetch a task from the blocking queue is what connects thread management with task management.

This part of the strategy is implemented by the getTask method. Let’s look at the code of the getTask method.

The getTask method takes a pending task from the blocking queue and returns it. If the queue is empty, it blocks until a new task is submitted or until the wait times out (in some configurations it waits with no timeout). If a task arrives before the timeout, that task is returned. So in the normal case getTask does not return null; it blocks until the next task is available and returns it.

When the getTask method returns null, the current Worker exits and the current thread is destroyed. The getTask method returns null only in the following cases:

The number of threads in the current thread pool exceeds the maximum number. This is the result of the runtime changing the maximum number of threads by calling setMaximumPoolSize;

The thread pool is stopped. In this case all threads should be recycled and destroyed immediately;

The thread pool is SHUTDOWN and the blocking queue is empty. In this case no new tasks are submitted to the blocking queue, so the thread should be destroyed;

The thread timed out while waiting for a new task and is allowed to be reclaimed on timeout. A thread can be reclaimed on timeout in the following two situations:

Core threads are allowed to time out (a thread pool setting, allowCoreThreadTimeOut), and the thread timed out while waiting for a task

The number of threads exceeds the number of core threads, and the thread timed out while waiting for a task
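Both timeout cases can be observed through getPoolSize(). The sketch below is an illustration under assumed settings (a 50 ms keepAliveTime and a SynchronousQueue so that the second task forces a second thread); KeepAliveDemo, busyPoolOfTwo, and awaitPoolSize are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {

    // Polls until the pool reports the expected size or the timeout elapses.
    static boolean awaitPoolSize(ThreadPoolExecutor pool, int expected, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (pool.getPoolSize() != expected) {
            if (System.nanoTime() - deadline > 0) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    // Creates a pool with core=1, max=2 and keeps both threads busy until the latch opens.
    static ThreadPoolExecutor busyPoolOfTwo(CountDownLatch release) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 50L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blocker); // starts the core thread
        pool.execute(blocker); // the queue has no capacity, so a second thread is started
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = busyPoolOfTwo(release);
        System.out.println("threads while busy: " + pool.getPoolSize());

        release.countDown();
        // The extra thread times out in getTask and exits; the core thread stays.
        System.out.println("shrunk to core size: " + awaitPoolSize(pool, 1, 5000));

        pool.allowCoreThreadTimeOut(true);
        // Now the core thread may time out as well.
        System.out.println("shrunk to zero: " + awaitPoolSize(pool, 0, 5000));
        pool.shutdown();
    }
}
```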

2.3.4 Task Rejection

The task rejection module is the protective part of the thread pool. A thread pool has a maximum capacity. When its task queue is full and the number of threads has reached maximumPoolSize, further tasks must be rejected, and the rejection policy is applied to protect the thread pool.

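A rejection policy is an implementation of the RejectedExecutionHandler interface, which declares a single method: rejectedExecution(Runnable r, ThreadPoolExecutor executor).

Users can implement this interface to customize the rejection policy or choose one of the four policies provided by the JDK: AbortPolicy (throws a RejectedExecutionException; the default), CallerRunsPolicy (runs the task in the thread that submitted it), DiscardPolicy (silently drops the task), and DiscardOldestPolicy (drops the oldest task in the queue and retries the submission).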
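As an illustration of a custom policy, the sketch below implements RejectedExecutionHandler with a handler that only counts rejected tasks. CountingPolicy, RejectionDemo, and submitThree are hypothetical names; a real policy would more likely log, record a metric, or hand the task to a fallback.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RejectionDemo {

    // Hypothetical custom policy: count the rejection instead of throwing.
    static class CountingPolicy implements RejectedExecutionHandler {
        final AtomicInteger rejected = new AtomicInteger();

        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            rejected.incrementAndGet();
        }
    }

    // Submits three blocking tasks to a pool that can hold only two; returns the rejection count.
    static int submitThree(CountingPolicy policy) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1), policy);
        pool.execute(blocker); // runs on the only thread
        pool.execute(blocker); // waits in the queue
        pool.execute(blocker); // queue full and thread limit reached: handed to the policy
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return policy.rejected.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected tasks: " + submitThree(new CountingPolicy()));
    }
}
```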

2.4 How does a Thread Pool manage threads

2.4.1 Worker thread

To track the state of its threads and manage their life cycle, the thread pool defines an internal Worker class. Let’s look at its code:

The Worker class implements the Runnable interface and holds a thread (thread) and an initial task (firstTask). The thread is created from the ThreadFactory when the constructor is called and is used to execute tasks.

firstTask holds the first task passed in, and it may be null. If it is not null, the thread executes that task as soon as it starts; this is the case when execute creates a new thread for a submitted task. If it is null, the thread starts by fetching tasks from the workQueue; this is the case, for example, when a thread is added only to drain the queue or to replace a Worker that has exited.

2.4.1.1 AQS role

Worker extends AbstractQueuedSynchronizer for two main purposes:

Refine the lock granularity down to each Worker

If multiple workers shared the same lock, then while one Worker held the lock the other workers could not execute, which is obviously unreasonable.

Acquire the lock with a direct CAS (tryLock) to avoid blocking

If the lock were acquired in a blocking way, then with multiple workers a call to shutdown() would block on every Worker that is running a task and holding its lock, which is obviously unreasonable.
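The locking part of Worker can be sketched on its own. The class below is a simplified, hypothetical stand-in (WorkerLockSketch) modelled on my reading of the JDK 8 Worker: state 0 means unlocked, state 1 means locked, and tryLock is a single CAS that never blocks.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class WorkerLockSketch extends AbstractQueuedSynchronizer {

    @Override
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    @Override
    protected boolean tryAcquire(int unused) {
        // One CAS from 0 to 1. There is no owner check, so the lock is not reentrant.
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    void lock()        { acquire(1); }           // blocking; used around task execution
    boolean tryLock()  { return tryAcquire(1); } // non-blocking; used to probe for idleness
    void unlock()      { release(1); }
    boolean isLocked() { return isHeldExclusively(); }

    public static void main(String[] args) {
        WorkerLockSketch worker = new WorkerLockSketch();
        worker.lock();                                   // "the worker is running a task"
        System.out.println("busy, tryLock = " + worker.tryLock());
        worker.unlock();                                 // "the task finished"
        System.out.println("idle, tryLock = " + worker.tryLock());
    }
}
```

A caller that wants to interrupt only idle workers can call tryLock on each one and skip those where it fails, without ever waiting for a running task to finish.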

2.4.1.2 Runnable role

Worker also implements Runnable and has two fields, thread and firstTask.

firstTask holds the first task passed in, and it may be null.

If it is not null, the thread executes that task as soon as it starts; this is the case when execute creates a new thread for a submitted task.

If it is null, the thread starts by fetching tasks from the workQueue; this is the case, for example, when a thread is added only to drain the queue or to replace a Worker that has exited.

According to the overall process:

execute -> create Worker (set thread and firstTask) -> worker.thread.start() -> which actually calls worker.run() -> ThreadPoolExecutor.runWorker(worker) -> worker.firstTask.run() (if firstTask is null, a task is pulled from the wait queue).

The task execution model of Worker is shown in the figure below:

2.4.2 Worker threads increase

Threads are added through the thread pool’s addWorker method. The only job of this method is to add one thread. It does not decide whether a thread should be added at this stage; that allocation strategy was settled in the previous step. This step simply adds the thread, starts it, and returns whether it succeeded.

The addWorker method takes two parameters: firstTask and core.

The firstTask parameter specifies the first task to be executed by the new thread. This parameter can be null.

If the core parameter is true, it will determine whether the number of active threads is less than corePoolSize before adding a thread. If the core parameter is false, it will determine whether the number of active threads is less than maximumPoolSize before adding a thread.

Take a look at the addWorker source code:

Does the source code look like heavy going? It doesn’t matter; the execution flow chart gives a clearer picture.

2.4.3 Worker Thread Executing tasks

When a Worker’s thread starts, it calls the Worker’s own run method, which delegates to the runWorker method of the enclosing ThreadPoolExecutor.

The execution process is as follows:

The while loop keeps getting tasks through the getTask() method

The getTask() method takes the task from the blocking queue

If the thread pool is stopping, make sure the current thread is interrupted, otherwise make sure the current thread is not interrupted.

Execute the task

If getTask returns null, the loop exits and the processWorkerExit() method is executed to destroy the thread.
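The loop described above can be reduced to a few lines. The sketch below is a heavily simplified, hypothetical MiniWorker, not the JDK implementation: it has no locking, no interrupt handling for pool shutdown, and its getTask always uses a timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class MiniWorker implements Runnable {
    private final BlockingQueue<Runnable> queue;
    private final long idleTimeoutMillis;
    private Runnable firstTask;

    MiniWorker(Runnable firstTask, BlockingQueue<Runnable> queue, long idleTimeoutMillis) {
        this.firstTask = firstTask;
        this.queue = queue;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    // Returns the next task, or null when the wait times out, which tells the worker to exit.
    private Runnable getTask() {
        try {
            return queue.poll(idleTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    @Override
    public void run() {
        Runnable task = firstTask;
        firstTask = null;
        // Run the first task if there is one, then keep pulling tasks from the queue.
        while (task != null || (task = getTask()) != null) {
            try {
                task.run();
            } finally {
                task = null;
            }
        }
        // The real pool would call processWorkerExit at this point.
    }

    // Runs a worker inline (on the calling thread) so the demo is deterministic.
    static List<String> demo() {
        List<String> order = new ArrayList<>();
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queue.add(() -> order.add("queued-1"));
        queue.add(() -> order.add("queued-2"));
        new MiniWorker(() -> order.add("first"), queue, 50).run();
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The first task runs before anything from the queue, and the worker exits by itself once getTask returns null.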

2.4.4 Worker thread reclamation

The thread pool’s job is to maintain a certain number of thread references based on the current state of the thread pool and prevent these threads from being reclaimed by the JVM. When the thread pool decides which threads need to be reclaimed, it simply removes the references. After Worker is created, it will poll continuously and then acquire tasks for execution. Core threads can wait indefinitely to acquire tasks, while non-core threads have to acquire tasks within a limited time.

When the Worker fails to obtain the task, that is, the acquired task is empty, the loop will end and the Worker will actively eliminate its own reference in the thread pool.

Thread recycling is done in the processWorkerExit method.

When the Worker is reclaimed, the thread pool attempts to terminate itself, using the tryTerminate method:

2.4.4 The Worker thread is closed

When it comes to thread closure, we have to talk about the shutdown method and shutdownNow method.

2.4.4.1 shutdown

The shutdown method sets the thread pool state to SHUTDOWN and interrupts idle Workers. After an idle Worker is interrupted, its getTask method returns null and the Worker is reclaimed. What is an idle Worker?

An idle Worker is one that is blocked in the getTask method, waiting to take a task from the blocking queue; if no timeout applies, it waits there indefinitely. Since Worker is also an AQS, the runWorker method wraps each task execution in a lock and unlock pair, so a Worker that holds its lock is executing a task and is not idle.

Therefore, Worker is designed as an AQS so that the state of its lock tells the pool whether the thread is idle and whether it can be interrupted.

Let’s look at the interruptIdleWorkers method:

2.4.4.2 shutdownNow

The shutdown method changes the state of the thread pool to SHUTDOWN; the thread pool continues to process the tasks in the blocking queue and reclaims idle workers. The shutdownNow method is different. It changes the thread pool state to STOP, so tasks in the blocking queue are no longer processed and new tasks are not accepted.

Unlike shutdown, shutdownNow calls the interruptWorkers method:

2.4.4.3 Worker thread closing summary

The shutdown method updates the state to SHUTDOWN. It does not affect the execution of tasks already in the blocking queue, but new tasks are not accepted. Idle workers are reclaimed at the same time; the definition of an idle worker was given above.

The shutdownNow method updates the state to STOP. Tasks in the blocking queue are no longer executed and new tasks are not accepted. All workers are reclaimed.
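The difference can be checked with a single-thread pool that has one running task and two queued tasks. This is a minimal sketch; ShutdownDemo and its method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownDemo {

    static ThreadPoolExecutor singleThreadPool() {
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    // shutdown(): the running task and both queued tasks still complete.
    static int completedAfterShutdown() throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            completed.incrementAndGet();
        });
        pool.execute(completed::incrementAndGet);
        pool.execute(completed::incrementAndGet);
        pool.shutdown();
        release.countDown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return completed.get();
    }

    // shutdownNow(): the running task is interrupted and the queued tasks are handed back.
    static String shutdownNowResult() throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        pool.execute(() -> { });
        pool.execute(() -> { });
        started.await();                              // make sure the first task is running
        List<Runnable> neverRun = pool.shutdownNow(); // drains the queue and interrupts workers
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return "pending=" + neverRun.size() + " interrupted=" + interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed after shutdown(): " + completedAfterShutdown());
        System.out.println("shutdownNow(): " + shutdownNowResult());
    }
}
```

Note that shutdownNow can only interrupt a running task; a task that ignores interruption keeps running until it finishes on its own.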

Summary

Many people do not use thread pools in their day-to-day work; they get multithreading by defining a class that extends Thread or implements Runnable. But if you are a senior Java developer, you should never say this in an interview. It makes the interviewer think you are not senior. If you don’t know thread pools yet, it’s not too late to learn; if you’re already working with thread pools, this article should give you a better idea of how they work and make them easier to use in your projects.

That’s all for today. Corrections and feedback are welcome!