Thread pools are not only a common technique in projects, but also a must-ask topic in job interviews. Before we get into thread pools, let’s take a look at what a process and a thread are.

Process

  • Program: a file, typically a collection of CPU instructions, stored statically on a storage device such as a hard disk
  • Process: when a program is run by the computer, a runtime instance of the program is created in memory; this instance is called a process

After a user gives the command to run a program, a process is created. Multiple processes can be created from the same program (a one-to-many relationship), so that multiple users can run the same program at the same time without conflict.

Processes require resources such as CPU time, memory, files, and I/O devices to do their work, and only one process can be running at any given moment on each CPU core. However, an application usually has more than one task to execute, and creating a process is time-consuming and resource-consuming, which makes it a heavyweight operation:

  1. Creating a process consumes a lot of resources
  2. Communication between processes requires data to be passed between different memory spaces, so inter-process communication is relatively time-consuming and resource-consuming

Thread

A thread is the smallest unit of execution that an operating system can schedule. In most cases it is contained in a process and is the actual unit of work of that process. A process can have multiple concurrent threads, each performing a different task. Threads in the same process share that process’s virtual resources, such as its virtual address space, file descriptors, and signal handlers. However, each thread in a process has its own call stack.

A process can have many threads, each performing different tasks in parallel.

Data in threads

  1. Local data on the thread stack: for example, local variables created while a method executes. The thread model in Java is stack-based, and each thread has its own stack space.
  2. Global data shared throughout the process: a running Java application is a process (you can see how many Java processes are running with `ps -ef | grep java`). Global data such as static fields is isolated between processes but shared between the threads of the same process.
  3. Private data for the thread: in Java we can use `ThreadLocal` to create variables that are private to each thread.

Local data on a thread’s stack is valid only within the method that created it, whereas thread-private data (such as a `ThreadLocal` value) can be accessed by any method running on that thread.
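To make these three kinds of data concrete, here is a minimal sketch (the class and variable names are my own, invented for illustration) showing a stack-local variable, a shared static field, and a ThreadLocal value:

```java
public class ThreadDataDemo {
    // Global data: a static field is shared by every thread in this process
    private static int sharedCounter = 0;

    // Thread-private data: each thread sees its own copy via ThreadLocal
    private static final ThreadLocal<Integer> threadPrivate = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) {
        Runnable task = () -> {
            int local = 1;                                   // local data on this thread's stack
            sharedCounter++;                                 // visible to all threads (and not thread-safe here)
            threadPrivate.set(threadPrivate.get() + local);  // visible only to the current thread
            System.out.println(Thread.currentThread().getName()
                    + " private=" + threadPrivate.get()
                    + " shared=" + sharedCounter);
        };
        new Thread(task, "t1").start();
        new Thread(task, "t2").start();
    }
}
```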

CPU-intensive and IO-intensive

Understanding whether a workload is CPU-intensive or IO-intensive helps us set the thread pool parameters properly. We will come back to how to set them later; for now, just keep these two concepts in mind.

  • IO-intensive: the CPU is idle most of the time, waiting for disk or network I/O operations to complete
  • CPU-intensive: I/O is mostly idle; the task spends most of its time doing CPU computation

The thread pool

A thread pool is an application of pooling. There are many common pooling techniques, such as database connection pools, memory pools, and the constant pool in Java. Why pool at all? A program runs by consuming system resources (CPU, memory, network, disk, and so on) to process information. Creating an object instance in the JVM, for example, consumes CPU and memory; if your program frequently creates large numbers of short-lived objects that then need to be destroyed just as frequently, that code can easily become a performance bottleneck. The benefits of pooling can be summarized as follows:

  • Reuse the same resources to reduce waste and the cost of creation and destruction;
  • Reduce the cost of managing resources individually by letting the “pool” manage them uniformly;
  • Centralize management to reduce “fragmentation”;
  • Improve system response time, because resources already exist in the pool and do not need to be created again.

Pooling, then, exists to solve these problems. Simply put, a pool holds objects so that the next time an object is needed it can be taken from the pool and reused, avoiding frequent creation and destruction. In Java everything is an object, so a thread is an object too; Java threads wrap operating system threads, and creating one consumes operating system resources, hence the thread pool. But how do we create one?
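To make the idea of pooling concrete, here is a deliberately tiny object-pool sketch (the class name and the use of a bounded queue are my own illustration, not how any particular library implements it):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// A minimal pool: hand out an idle instance if one is available, otherwise create one;
// releasing an instance puts it back so it can be reused instead of being garbage collected.
public class SimplePool<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> factory;

    public SimplePool(int capacity, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
    }

    public T borrow() {
        T obj = idle.poll();                  // reuse an idle instance if we have one
        return obj != null ? obj : factory.get();
    }

    public void release(T obj) {
        idle.offer(obj);                      // return it to the pool; drop it if the pool is full
    }
}
```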

Java provides four thread pools

Java provides us with four ways to create thread pools.

  • Executors.newCachedThreadPool: creates a cached thread pool with no limit on the number of threads. If no idle thread is available when a task arrives, a new thread is created; if a thread stays idle for more than 60 seconds, it is destroyed. Simply put, it creates temporary threads without limit when busy and recycles them when idle.

    ```java
    public static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                      60L, TimeUnit.SECONDS,
                                      new SynchronousQueue<Runnable>());
    }
    ```
  • Executors.newFixedThreadPool: creates a fixed-size pool, which controls the maximum number of concurrent threads; tasks beyond that wait in a queue. Simply put, when the pool is busy, extra tasks are placed in a queue of unbounded length.

    ```java
    public static ExecutorService newFixedThreadPool(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads,
                                      0L, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<Runnable>());
    }
    ```
  • Executors.newSingleThreadExecutor: creates a thread pool containing a single thread, which executes all tasks and guarantees that they run in the order they were submitted.

    ```java
    public static ExecutorService newSingleThreadExecutor() {
        return new FinalizableDelegatedExecutorService
            (new ThreadPoolExecutor(1, 1,
                                    0L, TimeUnit.MILLISECONDS,
                                    new LinkedBlockingQueue<Runnable>()));
    }
    ```
  • Executors.newScheduledThreadPool: creates a fixed-size thread pool that supports scheduled and periodic task execution.

    ```java
    public ScheduledThreadPoolExecutor(int corePoolSize) {
        super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
              new DelayedWorkQueue());
    }
    ```

How to create a thread pool

If we click into the source code of these four implementations, we can see that the underlying creation principle is the same: they all use ThreadPoolExecutor, just with different parameters, which produces the four different kinds of thread pools. Let’s look at the parameters of the ThreadPoolExecutor constructor.

```java
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)
```

So what exactly do these parameters mean?

  • corePoolSize: the number of core threads in the thread pool
  • maximumPoolSize: the maximum number of threads allowed in the thread pool
  • keepAliveTime: when the number of threads is greater than corePoolSize, this parameter sets how long idle threads may stay alive before they are destroyed
  • unit: the time unit for keepAliveTime
  • workQueue: the work queue. Once the number of threads in the pool has reached the number of core threads, subsequent tasks are placed in this queue
  • threadFactory: threads are produced by a factory when they are created; this parameter lets us supply a custom thread-creation factory
  • handler: the rejection policy executed when the maximum number of threads is exceeded

Now let’s combine these parameters and see what their processing logic is.

  1. While the number of threads is below corePoolSize, a new thread is created for each submitted task
  2. Once the number of threads in the pool has reached corePoolSize, the next task goes into the workQueue we configured above
  3. If the workQueue is also full, temporary threads are created for incoming tasks. If we set keepAliveTime, or enable allowCoreThreadTimeOut, the pool checks thread activity and destroys threads that have been idle past the timeout
  4. If the number of threads in the pool has reached maximumPoolSize and the queue is full, the handler rejection policy we configured is executed
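To make the parameters concrete, here is a minimal sketch of building a bounded ThreadPoolExecutor by hand (the pool sizes, queue capacity, and rejection policy are arbitrary choices for illustration):

```java
import java.util.concurrent.*;

public class BoundedPoolDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                         // corePoolSize
                4,                                         // maximumPoolSize
                30L, TimeUnit.SECONDS,                     // keepAliveTime + unit for idle non-core threads
                new ArrayBlockingQueue<>(100),             // bounded workQueue
                Executors.defaultThreadFactory(),          // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()  // handler used when pool and queue are both full
        );
        pool.execute(() -> System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```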

Why is it recommended not to use the thread pool creation methods provided by Java

Now that we understand these parameters, let’s see why the Alibaba Java Development Manual recommends against using the Executors factory methods.

Having seen how the four thread pools above are implemented, you should already understand why Alibaba has such a rule.

  • FixedThreadPool and SingleThreadExecutor: in both implementations the work queue is a LinkedBlockingQueue. This is a linked-list-based queue with no length limit, i.e. an unbounded queue, so if a large number of requests arrive, tasks can pile up in the queue and eventually cause an OOM.
  • CachedThreadPool and ScheduledThreadPool: in both implementations the maximum number of threads is set to Integer.MAX_VALUE, which effectively allows Integer.MAX_VALUE threads to be created. This can also cause an OOM when a large number of requests come in.

How to set the parameters

So if we want to use thread pools in a project, it is recommended to create a thread pool tailored to the project and the machine. How do you set these parameters? To size a thread pool properly, you need to understand your machine’s configuration, the resources required, and the nature of the tasks. For example, how many CPUs does the machine have? How much memory? Are the tasks mainly IO-intensive or CPU-intensive? Do they require a scarce resource such as a database connection?

If you have several categories of tasks with very different behaviors, consider using multiple thread pools. That way each kind of task gets a pool customized for it, and a failure in one type of task cannot overwhelm another.

  • CPU-intensive tasks: tasks that mainly perform computation. If there are N CPUs, setting the pool size to N+1 usually achieves optimal utilization: if a CPU-intensive thread is occasionally suspended because of a page fault or some other reason, the extra thread ensures that CPU cycles are not wasted in the meantime.

  • IO-intensive tasks: the CPU spends most of its time waiting on blocking I/O operations. In this case you can increase the size of the thread pool, and a reasonable value can be estimated from a few parameters (a worked example follows the page-fault note below):

    • N: the number of CPUs
    • U: the target CPU utilization, 0 <= U <= 1
    • W/C: the ratio of wait time to compute time
    • The optimal pool size is then N * U * (1 + W/C)

A page fault (also called a hard fault or page fault interrupt) occurs when software tries to access a page that is mapped in its virtual address space but is not currently loaded in physical memory; the CPU’s memory management unit raises an interrupt so the operating system can load the page.
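As a quick worked example of the sizing formula (only a sketch; the utilization and wait/compute times below are made-up numbers that you would replace with measurements):

```java
public class PoolSizing {
    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors(); // N: number of CPUs
        double u = 0.8;    // U: target CPU utilization (assumed)
        double w = 50.0;   // average wait time per task in ms (assumed)
        double c = 5.0;    // average compute time per task in ms (assumed)

        int optimalSize = (int) Math.ceil(n * u * (1 + w / c));
        System.out.println("Suggested pool size: " + optimalSize);
        // e.g. with 8 CPUs: 8 * 0.8 * (1 + 10) = 70.4, rounded up to 71 threads
    }
}
```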

In practice, the size of the thread pool should also reflect the business type. For example, when tasks need pooled resources such as database connections, the size of the thread pool and the size of the resource pool affect each other. If every task requires a database connection, the size of the connection pool limits the effective size of the thread pool; conversely, if thread pool tasks are the only consumers of connections, the size of the thread pool limits the effective size of the connection pool.

Thread destruction in the thread pool

The creation and destruction of threads is governed by the pool’s corePoolSize, maximumPoolSize, and keepAliveTime. Let’s review how the thread pool creates and destroys threads.

  • Number of current threads < number of core threads: a new thread is created for each task
  • Number of current threads = number of core threads: incoming tasks are placed in the queue
  • Number of current threads > number of core threads: this only happens once the queue is full, at which point new threads are created (up to the maximum). Threads that stay idle longer than keepAliveTime are then reclaimed

One might want to set corePoolSize to 0 (if you recall, CachedThreadPool does exactly this), so that threads are created only on demand and none are kept around when the pool is idle. That is fine, but if we set this parameter to 0 and the work queue is not a SynchronousQueue, we have a problem, because new threads are created only when the queue is full. In the code below I use the unbounded LinkedBlockingQueue; take a look at the output.

```java
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(0, Integer.MAX_VALUE,
        1, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
for (int i = 0; i < 10; i++) {
    threadPoolExecutor.execute(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.printf("1");
        }
    });
}
```

If you run this demo, you will see that a 1 is printed every second, which is the opposite of what we want from a thread pool: we are essentially running on a single thread.

But if we replace the work queue with a SynchronousQueue, we see that all the 1s are printed at roughly the same time.

A SynchronousQueue is not really a queue but a mechanism for handing work directly from one thread to another. Think of it as a producer handing an element to the SynchronousQueue: if a consumer thread is waiting to receive it, the element is delivered directly; otherwise the hand-off cannot complete (a blocking put() waits, while a non-blocking offer() simply fails).
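A tiny sketch of this hand-off behavior (the timings and messages are arbitrary) is shown below; the failing offer() in the first case is exactly what makes the thread pool create a new thread instead of queueing the task:

```java
import java.util.concurrent.SynchronousQueue;

public class SynchronousQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> queue = new SynchronousQueue<>();

        // offer() with no waiting consumer fails immediately
        System.out.println("offer without consumer: " + queue.offer("task"));

        // with a consumer blocked in take(), the hand-off succeeds
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("received: " + queue.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        Thread.sleep(100);  // give the consumer time to block in take()
        System.out.println("offer with consumer: " + queue.offer("task"));
        consumer.join();
    }
}
```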

So when we set the parameters of a thread pool, we should think through the thread creation and destruction process; otherwise our custom thread pool may be no better than the four pools Java already provides.

Rejection policies in thread pools

ThreadPoolExecutor provides four rejection policies. If we look at the four thread pool factory methods Java provides, we can see that they all use the default rejection policy defined below. So what other rejection policies are there besides this one?

```java
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();
```

All rejection policies implement the RejectedExecutionHandler interface, so let’s look at that interface first.

```java
public interface RejectedExecutionHandler {

    /**
     * Method that may be invoked by a {@link ThreadPoolExecutor} when
     * {@link ThreadPoolExecutor#execute execute} cannot accept a
     * task.  This may occur when no more threads or queue slots are
     * available because their bounds would be exceeded, or upon
     * shutdown of the Executor.
     *
     * <p>In the absence of other alternatives, the method may throw
     * an unchecked {@link RejectedExecutionException}, which will be
     * propagated to the caller of {@code execute}.
     *
     * @param r the runnable task requested to be executed
     * @param executor the executor attempting to execute this task
     * @throws RejectedExecutionException if there is no remedy
     */
    void rejectedExecution(Runnable r, ThreadPoolExecutor executor);
}
```

AbortPolicy

This is the default rejection policy used by the four thread pool factory methods Java provides. Let’s look at its implementation.

```java
public static class AbortPolicy implements RejectedExecutionHandler {

    public AbortPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}
```

So this rejection policy simply throws a RejectedExecutionException.

CallerRunsPolicy

This rejection policy simply hands off the task to the caller to execute directly.

```java
public static class CallerRunsPolicy implements RejectedExecutionHandler {

    public CallerRunsPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            r.run();
        }
    }
}
```

Why is the task executed by the caller? Notice that the policy calls the task’s run() method directly instead of start(), so the task runs synchronously on the thread that submitted it rather than on a pool thread.

DiscardOldestPolicy

As you can see from the source code, this rejection policy discards the oldest task in the queue and then resubmits the new task.

```java
public static class DiscardOldestPolicy implements RejectedExecutionHandler {

    public DiscardOldestPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}
```

DiscardPolicy

As the source code shows, this rejection policy does nothing with the rejected task; simply put, the current task is discarded and never executed.

```java
public static class DiscardPolicy implements RejectedExecutionHandler {

    public DiscardPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}
```

These are the four rejection policies that the thread pool provides by default. Of course, we can also define a custom rejection policy to make the thread pool fit our business better. We will also look at Tomcat’s own rejection policy when we discuss how it customizes its thread pool.
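As an illustration of a custom policy (a minimal sketch of my own, not Tomcat’s actual implementation; the logging and retry behavior are arbitrary choices), all it takes is implementing RejectedExecutionHandler and passing it as the handler argument of the ThreadPoolExecutor constructor:

```java
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Logs the rejection, then tries to re-queue the task for a short while before dropping it
public class BlockingRetryPolicy implements RejectedExecutionHandler {

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        System.err.println("Task rejected, retrying: " + r);
        try {
            // offer with a timeout instead of failing immediately
            if (!executor.getQueue().offer(r, 1, TimeUnit.SECONDS)) {
                System.err.println("Task dropped after retry: " + r);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```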

Thread starvation deadlock

Thread pools introduce a new possibility for deadlock: thread-starvation deadlock. In a single-threaded executor, if one task submits another task to the same Executor and waits for its result, a deadlock will occur: the second task sits in the work queue waiting for the first task to complete, while the first task cannot complete because it is waiting for the second task’s result.

In the code below, note that the thread pool we define is a SingleThreadExecutor with only one thread, purely to make the situation easy to reproduce. In a larger pool, the same thing can happen if every running thread is blocked waiting for tasks that are still sitting in the work queue; this situation is called thread-starvation deadlock. So try to avoid submitting tasks that depend on each other to the same thread pool.

```java
public class AboutThread {
    ExecutorService executorService = Executors.newSingleThreadExecutor();

    public static void main(String[] args) {
        AboutThread aboutThread = new AboutThread();
        aboutThread.threadDeadLock();
    }

    public void threadDeadLock() {
        Future<String> taskOne = executorService.submit(new TaskOne());
        try {
            System.out.printf(taskOne.get());
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }

    public class TaskOne implements Callable {

        @Override
        public Object call() throws Exception {
            Future<String> taskTwo = executorService.submit(new TaskTwo());
            return "TaskOne" + taskTwo.get();
        }
    }

    public class TaskTwo implements Callable {

        @Override
        public Object call() throws Exception {
            return "TaskTwo";
        }
    }
}
```

Extending ThreadPoolExecutor

If we want to extend the thread pool, we can use some of the hooks that ThreadPoolExecutor reserves for us, which allow us to customize the thread pool at a deeper level.

Thread factory

If we want to give custom names to the threads in our pool, we can use a thread factory to implement such custom behavior. As long as we pass our custom factory to ThreadPoolExecutor, whenever the pool needs to create a thread it will do so through our factory. Let’s first look at the ThreadFactory interface; implementing it lets us customize our threads’ properties.

```java
public interface ThreadFactory {

    /**
     * Constructs a new {@code Thread}.  Implementations may also initialize
     * priority, name, daemon status, {@code ThreadGroup}, etc.
     *
     * @param r a runnable to be executed by new thread instance
     * @return constructed thread, or {@code null} if the request to
     *         create a thread is rejected
     */
    Thread newThread(Runnable r);
}
```

Next, let’s look at our own thread factory class.

```java
class CustomerThreadFactory implements ThreadFactory {

    private String name;
    private final AtomicInteger threadNumber = new AtomicInteger(1);

    CustomerThreadFactory(String name) {
        this.name = name;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread thread = new Thread(r, name + threadNumber.getAndIncrement());
        return thread;
    }
}
```

All you need to do is add the factory class when you instantiate the thread pool

```java
public static void customerThread() {
    ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(0, Integer.MAX_VALUE,
            1, TimeUnit.SECONDS, new SynchronousQueue<>(),
            new CustomerThreadFactory("customerThread"));

    for (int i = 0; i < 10; i++) {
        threadPoolExecutor.execute(new Runnable() {
            @Override
            public void run() {
                System.out.printf(Thread.currentThread().getName());
                System.out.printf("\n");
            }
        });
    }
}
```

We then execute this statement and find that the name of each thread has changed

```
customerThread1
customerThread10
customerThread9
customerThread8
customerThread7
customerThread6
customerThread5
customerThread4
customerThread3
customerThread2
```

Extending by inheriting ThreadPoolExecutor

If we look at the ThreadPoolExecutor source code, we can see three methods that are declared protected.

```java
protected void beforeExecute(Thread t, Runnable r) { }
protected void afterExecute(Runnable r, Throwable t) { }
protected void terminated() { }
```

Note: protected members are visible within the same package and to subclasses.

We can override these methods in a subclass to implement our own extensions. The thread executing a task calls beforeExecute and afterExecute, which can be used to add logging, timing, monitoring, or statistics gathering. afterExecute is called whether the task returns normally from run or throws an exception (it is not called if the task completes with an Error). If beforeExecute throws a RuntimeException, the task is not executed and afterExecute is not called.

terminated is called when the thread pool completes shutdown, after all tasks have finished and all worker threads have shut down. It can be used to release resources the Executor allocated during its lifetime, as well as to send notifications, write logs, or gather final statistics.
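As a sketch of this kind of extension (the class name and what gets logged are my own choices), a subclass can use these hooks to time each task and report an average when the pool terminates:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class TimingThreadPool extends ThreadPoolExecutor {

    private final ThreadLocal<Long> startTime = new ThreadLocal<>();
    private final AtomicLong totalTime = new AtomicLong();
    private final AtomicLong numTasks = new AtomicLong();

    public TimingThreadPool(int corePoolSize, int maximumPoolSize) {
        super(corePoolSize, maximumPoolSize, 30L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(100));
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startTime.set(System.nanoTime());      // remember when this task started on this thread
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            long elapsed = System.nanoTime() - startTime.get();
            numTasks.incrementAndGet();
            totalTime.addAndGet(elapsed);
            System.out.printf("task %s took %d ns%n", r, elapsed);
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    protected void terminated() {
        try {
            if (numTasks.get() > 0) {
                System.out.printf("average task time: %d ns%n",
                        totalTime.get() / numTasks.get());
            }
        } finally {
            super.terminated();
        }
    }
}
```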

This article code address

If you are interested, you can follow my new WeChat official account by searching for [program ape 100 treasure bag], or by scanning the QR code below.

References

  • What is the difference between a process and a thread?
  • What’s the Diff: Programs, Processes, and Threads
  • Wikipedia – Process
  • Wikipedia – Threads
  • Pooling technology (JAVA) analysis
  • How does Tomcat extend thread pools
  • Java Concurrency in Practice
  • Understand the ThreadPoolExecutor thread pool
  • JAVA extension ThreadPoolExecutor for multithreading