Interviewer: Tell me about the locks in the thread pool.

Hi, I’m crooked.

A reader recently told me about an interview in which thread pools came up. The conversation went well and he answered most of the questions, but one of them stumped him.

The interviewer asked him: Tell me about the lock in the thread pool.

It turned out that everything he knew about thread pools came from various blog posts. He had never read the source code himself, so he had not noticed that there are locks in the thread pool.

He also complained to me:

Hearing that, I realized that when people talk about thread pools, they rarely mention the locks used inside them.

They really do keep a low profile.

So why don’t I cover them here?

mainLock

In fact, locks are used in quite a few places in the thread pool.

For example, as I mentioned earlier, there is a variable called workers in the thread pool, and it holds something that you can think of as threads in the thread pool.

And the data structure of that object is a HashSet.

HashSet is not a thread-safe collection class, you know?

So, look at the comment on it and see what it says:

It may be accessed only while holding mainLock.

It hardly needs an introduction: even if you only guess from the name, mainLock must be a lock.

Is it really, and if so, what kind of lock is it?

The variable mainLock is located directly above workers:

It turns out that it is really a ReentrantLock.

There’s nothing wrong with protecting a HashSet with a ReentrantLock.

So how do mainLock and workers work together?

Let’s take the key addWorker method:

Where there is a lock, there must be something that needs exclusive access.

What are you usually trying to do when you lock a shared resource?

Most of the time you want to modify it, to put something into it, right?

Following that line of thought, what does the locked section of addWorker have exclusive access to?

There is not much to analyze: only two pieces of shared data are involved, and both are written there. They are the workers object and the largestPoolSize variable.

workers, whose underlying data structure is the thread-unsafe HashSet, was described earlier.
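The pattern can be sketched as follows. This is a minimal illustration, not the JDK source: the class GuardedWorkerSet, the snapshot method, and the use of String in place of real Worker objects are all assumptions made for the example. It shows a plain HashSet and a peak-size counter that are both touched only while mainLock is held.

```java
import java.util.HashSet;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for the pool; not the JDK source.
class GuardedWorkerSet {
    private final ReentrantLock mainLock = new ReentrantLock();
    // Plain HashSet: only safe because every access happens under mainLock.
    private final HashSet<String> workers = new HashSet<>();
    // Peak size of the set; also read and written only under mainLock.
    private int largestPoolSize;

    void addWorker(String name) {
        mainLock.lock();
        try {
            workers.add(name);
            int s = workers.size();
            if (s > largestPoolSize) {
                largestPoolSize = s;
            }
        } finally {
            mainLock.unlock();
        }
    }

    // Returns {current size, largest size}, read under the same lock.
    int[] snapshot() {
        mainLock.lock();
        try {
            return new int[] {workers.size(), largestPoolSize};
        } finally {
            mainLock.unlock();
        }
    }

    // Four threads add 1000 distinct names each.
    static int[] demo() {
        GuardedWorkerSet pool = new GuardedWorkerSet();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    pool.addWorker("worker-" + id + "-" + i);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return pool.snapshot();
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("size=" + result[0] + ", largest=" + result[1]);
    }
}
```

Because every add goes through the lock, no element is lost and the recorded peak matches the final size.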

What is largestPoolSize and why is it locked?

This field is used to record the maximum number of threads that have ever appeared in the thread pool.

Reading this value is also done under mainLock:

At first glance, it looks as if declaring largestPoolSize volatile would be enough, and the mainLock locking could be dropped.

It would still be thread-safe.

I wonder if you feel the same way?

If that’s what you think, I’m sorry, but you’re wrong.

Many other fields in the thread pool use volatile:

Why not largestPoolSize?

Look again at how the getLargestPoolSize method shown earlier reads the value.

If the field were volatile and no lock were taken, the mainLock.lock() call would be gone.

Removing that call also removes a possible blocking wait.

Suppose a thread calls getLargestPoolSize just before the addWorker method changes the value of largestPoolSize.

Without blocking, the caller simply gets the value of largestPoolSize at that moment, which is not necessarily the value after addWorker has finished.

With blocking, the caller has to wait while largestPoolSize may be changing, so what it reads is the value after addWorker has finished its locked section.

So my understanding is that the lock is there to make this value as accurate as possible.
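A small sketch of that blocking behavior, under stated assumptions: the class LargestPoolSizeReader and its demo method are hypothetical, and the main thread stands in for the locked section of addWorker. The reader thread is started while the lock is held, so it can only return after the writer has updated the value and released the lock.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; only the locking pattern mirrors the thread pool.
class LargestPoolSizeReader {
    private final ReentrantLock mainLock = new ReentrantLock();
    private int largestPoolSize; // guarded by mainLock

    int getLargestPoolSize() {
        mainLock.lock(); // blocks while a writer holds the lock
        try {
            return largestPoolSize;
        } finally {
            mainLock.unlock();
        }
    }

    // Returns the value the reader thread observed.
    static int demo() {
        LargestPoolSizeReader pool = new LargestPoolSizeReader();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = pool.getLargestPoolSize());

        pool.mainLock.lock(); // stand-in for the locked section of addWorker
        try {
            reader.start();           // the reader cannot get past lock() yet
            pool.largestPoolSize = 5; // the update the reader must wait for
        } finally {
            pool.mainLock.unlock();
        }
        try {
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```

With a volatile field and no lock, the reader could just as well have returned the old value 0.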

Besides the places mentioned above, mainLock is used in a number of other places:

I won’t go through them one by one. You can read them yourself; the code is easy to understand at a glance and not very interesting.

Let me tell you something interesting instead.

Have you ever wondered why Doug Lea combined a thread-unsafe HashSet with a ReentrantLock to achieve thread safety?

Why not just use a thread-safe Set, for example one created with Collections.synchronizedSet?

The answer has actually already appeared above, but I did not point it out, so you may not have noticed.

Right here in mainLock’s comment:

Let me pick out the key points for you.

First look at this:

While we could use a concurrent set of some sort, it turns out to be generally preferable to use a lock.

There should be no unfamiliar words in this sentence.

The phrase to notice is “it turns out to be”, which means that experience has shown something to be the case.

So the whole sentence reads: although we could use some kind of concurrency-safe set, it turns out that using a lock is generally better.

Doug Lea then goes on to explain why a lock is better.

I’m not making this up. It is all well founded, because this is Doug Lea himself explaining why he did not use a thread-safe set.

The first reason goes like this:

Among the reasons is that this serializes interruptIdleWorkers, which avoids unnecessary interrupt storms, especially during shutdown. Otherwise exiting threads would concurrently interrupt those that have not yet interrupted.

Here is how I read it, with my own understanding added.

First, the opening sentence contains “serializes interruptIdleWorkers”, and the two words together look a little confusing.

“Serializes” here has nothing to do with object serialization in Java. It means making things happen serially, one after another.

interruptIdleWorkers is not an English word at all; it is a method of the thread pool:

The first thing this method does is acquire mainLock, and then it tries to interrupt the threads.

Because of mainLock.lock(), multiple threads that call this method are serialized: they go through it one at a time.

What are the benefits of serialization?

That’s what follows: It avoids unnecessary interrupt storms, especially when the shutdown method is called, and prevents the exiting thread from interrupting the threads that have not been interrupted.

Why is the shutdown method specifically mentioned here?

Because the shutdown method calls interruptIdleWorkers:

So what does that mean?

This is going to be a proof by contradiction.

Suppose we use a concurrency-safe Set instead of mainLock.

Suppose five threads call the shutdown method at the same time. Since there is no mainLock, nothing blocks, and every thread runs interruptIdleWorkers.

So the first thread starts interrupting the workers, and those workers are now in the middle of being interrupted. Then the second thread comes along and issues interrupts again, to workers that are already being interrupted.

Well, it’s kind of like a tongue twister.

So let me repeat: interrupts are issued to workers whose interruption is already in progress.

Therefore, locks are used to avoid the risk of interrupt storms.

Under concurrency, only one thread at a time should be issuing the interrupts, so a lock is required. Once there is a lock, the Set is protected by it anyway, so there is no need for a concurrency-safe Set.

So my understanding is that mainLock is used here to serialize, and at the same time it guarantees that the Set is never accessed concurrently.

As long as every access to the Set happens while the lock is held, a concurrency-safe Set is unnecessary.

In other words: workers is accessed only under mainLock.

Remember, you might get tested.
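To make the serialization concrete, here is a hedged sketch. The class SerializedInterrupts and its counters are hypothetical, and the method body only stands in for the real loop over workers. Five threads call the method repeatedly, and the sketch records the largest number of callers that were ever inside the locked section at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of "serializing" a method with a lock.
class SerializedInterrupts {
    private final ReentrantLock mainLock = new ReentrantLock();
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    // Stand-in for interruptIdleWorkers: only the locking is modeled.
    void interruptIdleWorkers() {
        mainLock.lock();
        try {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            Thread.yield(); // placeholder for looping over the workers
            inside.decrementAndGet();
        } finally {
            mainLock.unlock();
        }
    }

    // Returns the highest number of callers seen inside at the same time.
    static int demo() {
        SerializedInterrupts pool = new SerializedInterrupts();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] callers = new Thread[5];
        for (int i = 0; i < callers.length; i++) {
            callers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    return;
                }
                for (int n = 0; n < 1000; n++) {
                    pool.interruptIdleWorkers();
                }
            });
            callers[i].start();
        }
        start.countDown(); // release all five callers together
        for (Thread t : callers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return pool.maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1
    }
}
```

Without the lock, several callers could be inside at once, which is the situation the comment calls an interrupt storm.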

Then comes the second reason Doug Lea gives:

It also simplifies some of the associated statistics bookkeeping of largestPoolSize etc.

This is what we saw earlier: the largestPoolSize field is maintained under the lock.

Oh, and there is an “etc.” here, meaning “and so on”.

The “etc.” covers fields such as completedTaskCount, for the same reason:

Another lock

In addition to mainLock, there is a thread pool lock that is often overlooked.

That is the Worker object.

As you can see, Worker extends AQS (AbstractQueuedSynchronizer), and many of its methods are lock-related.

It also implements the Runnable interface, so it is essentially a wrapped thread that runs the tasks submitted to the thread pool. When there are no tasks, it waits on the queue with take or poll, and the unlucky ones get recycled.

Let’s take a look at where it locks, in the crucial runWorker method:

java.util.concurrent.ThreadPoolExecutor#runWorker

So here’s the question:

This is where the thread in the thread pool is executing the logic of the submitted task. Why lock?

And why implement a custom lock instead of using the existing ReentrantLock?

Again, the answer is in the comment:

I know a big paragraph of English like this can make you lose interest.

But don’t panic. I’ll walk you through it piece by piece.

The first sentence came straight to the point:

Class Worker mainly maintains interrupt control state for threads running tasks.

The main purpose of the Worker class is to maintain the interrupt control state of threads.

And not of just any threads, but of threads that are running tasks.

How should we understand “maintaining interrupt control state”?

If you look at the lock and tryLock methods of the Worker class, each of them is called from only one place.

The lock method is called in the runWorker method as described earlier.

The tryLock method is called here:

This method is our old friend interruptIdleWorkers, which is used to interrupt threads.

What kind of threads does it interrupt?

Threads that are waiting for a task, i.e. threads waiting here:

java.util.concurrent.ThreadPoolExecutor#getTask

In other words: threads executing tasks should not be interrupted.

How does the thread pool know which threads are executing tasks and should not be interrupted?

Let’s look at the condition it checks:

The key condition is the call to w.tryLock().

So take a look at the core logic of the tryLock method:

The core logic is a CAS operation that updates a state from 0 to 1, and if it succeeds, tryLock succeeds.

What are “0” and “1” respectively?

The comment. The answer is, once again, in the comment:

So the core logic in tryLock, compareAndSetState(0, 1), is a lock operation.

If tryLock fails, why?

It must be that the state is already 1.

So when does the state go to 1?

One case is when the lock method is executed, because it also ends up calling tryAcquire.

And when is lock called?

In the runWorker method, a task is obtained and ready to be executed.

In other words, the worker in state 1 must be the thread executing the task and cannot be interrupted.

In addition, the initial value of the state is set to -1.

We can write some simple code to verify the three states above:

First we define a thread pool, and then we call the prestartAllCoreThreads method to start all core threads in advance, so that they wait for tasks to arrive.

What are the states of the three workers at this time?

It has to be 0, unlocked.

Of course, you might see something like this:

Where does the minus 1 come from?

Don’t panic, I’ll tell you later, let’s see where the 1 is first?

According to the previous analysis, we only need to submit a task to the thread pool:

At this point, what happens if we call shutdown?

Interrupt idle threads, of course.

What about the thread that is executing the task?

Since it is a while loop, the getTask method is called again after the task is completed:

The getTask method first checks the state of the thread pool. At this point it sees that the pool has been shut down and returns null, and the worker quietly exits.
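The behavior described above can be observed with a real ThreadPoolExecutor. The following is a minimal sketch, not the original test code of this article; the class name ShutdownSparesBusyWorker and the latches are assumptions made for the example. One worker is kept busy, shutdown is called, and the task records whether it was interrupted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; only ThreadPoolExecutor itself is real API.
class ShutdownSparesBusyWorker {
    // Returns true if the pool terminated and the running task was never interrupted.
    static boolean demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 3, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.prestartAllCoreThreads(); // three idle workers waiting in getTask

        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        pool.execute(() -> {
            started.countDown(); // the worker already holds its own lock here
            try {
                release.await(); // would throw if shutdown() interrupted this thread
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });

        try {
            started.await();
            pool.shutdown();     // interrupts idle workers only
            release.countDown(); // let the busy task finish
            boolean terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && !interrupted.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```

The two idle workers are interrupted and exit, while the busy one finishes its task, calls getTask again, gets null, and exits as well.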

OK, after all that, just keep one big premise in mind: the main purpose of the custom Worker class is to maintain interrupt control state, because threads that are executing tasks should not be interrupted.

Let’s read on in the comment:

We implement a simple non-reentrant mutual exclusion lock rather than use ReentrantLock because we do not want worker tasks to be able to reacquire the lock when they invoke pool control methods like setCorePoolSize.

This explains why Doug Lea did not use ReentrantLock and chose to build the lock into the Worker class himself.

What he wanted was a non-reentrant mutex, and ReentrantLock is reentrant.

As the earlier analysis of the tryAcquire method shows, the lock is non-reentrant:

The parameters passed in are not used at all, and there is no cumulative logic in the code.

In case you haven’t figured it out yet, let me show you the ReentrantLock logic:

You see that? There’s a cumulative process.

When the lock is released, there is a corresponding decrement, and only when the count reaches 0 has the current thread actually released the lock:

This logic of accumulating and decrementing is completely absent from the Worker class.
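The difference can be shown side by side. In the sketch below, SimpleMutex is a hypothetical class modeled on the locking part of Worker as described above (a CAS from 0 to 1, with no counting); it is an illustration, not a copy of the JDK class. ReentrantLock is the real JDK lock.

```java
import java.util.Arrays;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

class NonReentrantDemo {
    // Hypothetical non-reentrant mutex: state 0 = unlocked, 1 = locked.
    static final class SimpleMutex extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int unused) {
            // No owner check and no counting: a second acquire simply fails.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        boolean tryLock() { return tryAcquire(1); }
        void unlock() { release(1); }
    }

    // Returns {reentrant 1st, reentrant 2nd, mutex 1st, mutex 2nd} tryLock results.
    static boolean[] demo() {
        ReentrantLock reentrant = new ReentrantLock();
        boolean r1 = reentrant.tryLock();
        boolean r2 = reentrant.tryLock() && reentrant.getHoldCount() == 2;
        reentrant.unlock(); // one unlock per successful acquire
        reentrant.unlock();

        SimpleMutex mutex = new SimpleMutex();
        boolean m1 = mutex.tryLock();
        boolean m2 = mutex.tryLock(); // same thread, but not reentrant
        mutex.unlock();
        return new boolean[] {r1, r2, m1, m2};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [true, true, true, false]
    }
}
```

The same thread can take the ReentrantLock twice, but its second tryLock on the simple mutex fails.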

So the question remains: What would happen if it were reentrant?

The goal is the same as before: you don’t want to interrupt a thread that is executing a task.

The comment also mentions a method called setCorePoolSize.

This is a method that I discussed in the thread pool dynamic adjustment section:

Unfortunately, I was focusing on the logic in delta>0.

Now let’s look at what I’ve framed here.

When workerCountOf(ctl.get()) > corePoolSize is true,

it means that the current number of workers is greater than the new corePoolSize I am setting, so the number needs to be reduced.

How to reduce it?

Call the interruptIdleWorkers method.

We have just analyzed this method. Let’s look at it again:

There’s a tryLock in there. What happens if it’s reentrant?

Is it possible to interrupt the executing worker?

Is this appropriate?
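Here is a hedged sketch of exactly that situation with a real ThreadPoolExecutor: a task calls setCorePoolSize on its own pool with a smaller value. The class name SetCorePoolSizeFromTask is an assumption made for the example. Because the worker running the task already holds its non-reentrant lock, the tryLock inside interruptIdleWorkers fails for that worker, and the task’s own thread is left alone.

```java
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; only ThreadPoolExecutor itself is real API.
class SetCorePoolSizeFromTask {
    // Returns true if the task's own thread was interrupted by setCorePoolSize.
    static boolean demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.prestartAllCoreThreads(); // worker count is now 2
        try {
            Future<Boolean> selfInterrupted = pool.submit(() -> {
                // 2 workers > new core size 1, so idle workers get interrupted.
                pool.setCorePoolSize(1);
                // This worker holds its own lock, so it was skipped.
                return Thread.currentThread().isInterrupted();
            });
            return selfInterrupted.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints false
    }
}
```

If the worker lock were reentrant, the tryLock would succeed on the calling worker too, and the task would interrupt itself.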

Okay, on to the last sentence of the comment:

Additionally, to suppress interrupts until the thread actually starts running tasks, we initialize lock state to a negative value, and clear it upon start (in runWorker).

This is meant to suppress interrupts before the thread actually starts running the task. So initialize the worker’s state to negative (-1).

And clear it upon start (in runWorker).

That is, the negative state is cleared when the worker starts.

Doug Lea is even kind enough to tell you where: in runWorker.

So if you look at runWorker, you’ll see why it begins with an unlock call followed by an “allow interrupts” comment:

It calls unlock because the worker’s state may still be -1 at this point, and unlock resets the state to 0.

It also explains where the -1 that I didn’t explain before came from:

Okay? Where does minus one come from?

It must have been read during startup, while the Worker object was still in the -1 state, that is, before runWorker had executed its unlock.
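The three states can be walked through with a small model. WorkerStateSketch below is a hypothetical class that only imitates the state handling described in this section (initial state -1, unlock sets 0, lock sets 1); it is not the JDK Worker.

```java
import java.util.Arrays;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical model of the Worker lock state: -1 new, 0 unlocked, 1 locked.
class WorkerStateSketch extends AbstractQueuedSynchronizer {
    WorkerStateSketch() {
        setState(-1); // suppress interrupts until the worker has started
    }

    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) { // fails while the state is still -1
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    void lock() { acquire(1); }
    boolean tryLock() { return tryAcquire(1); }
    void unlock() { release(1); }
    int state() { return getState(); }

    // Returns {state when created, tryLock before start (0 = failed),
    //          state after the first unlock, state while locked}.
    static int[] demo() {
        WorkerStateSketch w = new WorkerStateSketch();
        int created = w.state();
        int lockedBeforeStart = w.tryLock() ? 1 : 0; // an interrupter would skip it
        w.unlock();          // what runWorker does first: allow interrupts
        int started = w.state();
        w.lock();            // a task has been obtained and is about to run
        int running = w.state();
        w.unlock();
        return new int[] {created, lockedBeforeStart, started, running};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [-1, 0, 0, 1]
    }
}
```

While the state is -1, a tryLock from an interrupting thread cannot succeed, which is how interrupts are suppressed before the worker starts running.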

One last word

All right, since you have read this far, give the article a thumbs up. Writing is tiring and needs some positive feedback.

Here’s one for readers:

This article is also archived on my personal blog. You are welcome to drop by.

www.whywhy.vip/
