Hello, I’m Why.
I don’t know about you, but when I first saw this question, I was stunned.
And it’s an interview question.
It wasn’t that I didn’t know the answer; it was that I had only come across it before by sheer serendipity.
Honestly, testing a candidate on this knowledge point in an interview is not appropriate, unless the candidate volunteered that they had done this kind of optimization themselves.
And what I fear is that the interviewer also just happened to skim this topic in some book or blog post, decided they had mastered a secret martial art, and went out to test others with it.
It’s not appropriate.
Back to the subject.
Strictly speaking, this question really belongs in the category of operating-system knowledge.
But the interviewer specifically added “How to implement in Java”.
So let’s talk about that.
Java thread
Before I talk about how to bind, I’ll give you a bit of background: Java thread implementation.
We all know that most of Thread’s methods are native:
When a method is declared native in Java, in most cases it does not or cannot be implemented by platform-independent means.
That means it has to touch something very low-level, outside the scope of the Java language.
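You can see this for yourself with a few lines of reflection, listing the methods of Thread that are declared native (the exact list varies by JDK version):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class NativeThreadMethods {
    public static void main(String[] args) {
        // print every method of java.lang.Thread declared with the native keyword
        for (Method m : Thread.class.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                System.out.println(m.getName());
            }
        }
    }
}
```

On common JDKs this prints low-level entry points such as start0, whose real bodies live in the JVM, not in Java code.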
Regardless of the Java language, there are three main ways to implement threads:
1. Kernel-thread implementation (1:1)
2. User-thread implementation (1:N)
3. Hybrid implementation using user threads plus lightweight processes (N:M)
These three implementations are described in detail in section 12.4 of Understanding the Java Virtual Machine for those who are interested.
In summary, you need to know that there are three different threading models, and that Java, as an application sitting on top of the stack, is not aware of the differences between them.
The JVM specification does not mandate which model must be used.
Because the threading model supported by the operating system largely determines how the threads of the Java virtual machine running on it are mapped, this is hard to agree on across platforms.
So the JVM specification does not, and should not, specify which threading model Java threads must be implemented with.
Meanwhile, regarding the topic to be discussed in this paper, I also found a similar question on Zhihu:
https://www.zhihu.com/question/64072646/answer/216184631
There’s an answer from R大 in there that’s worth a read.
He also starts from the threading model.
I’ll focus here on the model that uses the kernel thread implementation (1:1).
Because the HotSpot virtual machine, the one we use the most, implements Java threads with the 1:1 model.
What does that mean?
In plain terms: a Java thread maps directly to an operating-system native thread, with no extra indirection, and the HotSpot virtual machine does not interfere with thread scheduling, leaving all of it to the underlying operating system.
The most you can do is set a thread priority and the operating system will give you a suggestion when scheduling.
But when a thread is suspended, when it wakes up, when it gets a time slice, which processor core it runs on, and so on: everything about a thread’s life cycle and execution is decided by the operating system.
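A small stdlib-only illustration of how little control Java gives you: setPriority is about the only knob, and even that is just a suggestion handed to the OS scheduler.

```java
public class PriorityHintDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            // the OS still decides when and on which core this runs
        });
        // priority is merely a hint to the OS scheduler,
        // not a guarantee of CPU time or core placement
        t.setPriority(Thread.MAX_PRIORITY);
        t.start();
        t.join();
        System.out.println("priority was " + t.getPriority());
    }
}
```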
This isn’t just me saying so; both R大 and Zhou Zhiming, the book’s author, have said as much.
https://www.zhihu.com/question/64072646/answer/216184631
For the 1:1 threading model, remember this picture from the book:
- LWP: Light-Weight Process
- KLT: Kernel-Level Thread
- UT: User Thread
Kernel threads are threads supported directly by the operating system kernel. Switching between them is done by the kernel, which schedules them through its scheduler and is responsible for mapping their tasks onto the processors.
Now look at the picture above: each KLT has a corresponding LWP.
What is LWP?
Programs typically do not use kernel threads directly; instead they use a higher-level interface to kernel threads called lightweight processes (LWP), which are what we usually mean by threads.
Then remember the following passage from the book, which can be said to be one of the cornerstone theory of Java multithreading implementation:
Thanks to kernel thread support, each lightweight process becomes an independent scheduling unit, and even if one lightweight process blocks in a system call, the whole process can still continue to work.
However, lightweight processes also have their limitations.
First, since they are implemented on top of kernel threads, thread operations such as creation, destruction, and synchronization all require system calls. System calls are relatively expensive, requiring switches back and forth between user mode and kernel mode.
Second, each lightweight process needs a kernel thread backing it, so lightweight processes consume some kernel resources (such as kernel thread stack space), which limits the number of lightweight processes a system can support.
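That system-call cost is easy to feel with a rough stdlib-only measurement: under the 1:1 model, every new Thread below is backed by a full kernel thread, and creating it goes through the kernel.

```java
public class ThreadCreationCost {
    public static void main(String[] args) throws InterruptedException {
        int n = 500;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            // each start() ultimately makes a system call
            // (e.g. clone / pthread_create on Linux)
            Thread t = new Thread(() -> {});
            t.start();
            t.join();
        }
        long avgNanos = (System.nanoTime() - start) / n;
        // typically tens of microseconds per thread,
        // versus nanoseconds for a plain method call
        System.out.println("avg create+start+join: " + avgNanos + " ns");
    }
}
```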
All right, the setup is finally done.
All of this is to make one point:
In any case, binding a thread to a CPU is operating-system-level work. Java, as a high-level language, certainly can’t do it directly; it needs a lower-level language to do the real work, which Java then calls into via JNA.
The solution is also mentioned in R’s answer:
- On Linux, you can use the taskset command to pin a process or thread to a specific core.
- At the Java level, there is already a ready-made library for core binding: OpenHFT/Java-Thread-Affinity, for those interested.
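As a quick sketch of the taskset route (the PID 12345 and app.jar below are placeholders of my own, not from the project):

```shell
# launch a command pinned to CPU core 2
taskset -c 2 java -jar app.jar

# re-pin an already-running process (12345 is a placeholder PID)
taskset -cp 2 12345

# query the current affinity mask of a process
taskset -p 12345
```

taskset operates at process/thread granularity from the outside; the Java library below does the same thing from inside the JVM, per thread.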
Java-Thread-Affinity
This open source project is essentially the answer to the interview question.
https://github.com/OpenHFT/Java-Thread-Affinity
Its documentation has a Q&A on how to use it to bind a core:
So without further ado, let’s get right to the demo.
Get dependencies into the project first:
```xml
<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>affinity</artifactId>
    <version>3.2.3</version>
</dependency>
```
Then we call the main method:
```java
public static void main(String[] args) {
    // pin the current thread to CPU 5 for the duration of the try block
    try (AffinityLock affinityLock = AffinityLock.acquireLock(5)) {
        // do some work while locked to a CPU
        while (true) {}
    }
}
```
As described on GitHub, I wrote an infinite loop in the method for a better demonstration.
That means the infinite loop should execute on logical CPU 5, driving its utilization to 100%.
So let’s see what happens.
Here’s the state before the program starts:
Here’s what it looks like after startup:
CPU 5 was pegged immediately.
There are also two lines of log output, which I will cut out for you:
The project has several versions on Maven:
On my machine, any version higher than 3.2.3 throws this exception:
It feels like a dependency conflict; I didn’t dig into it, but if you also want to run this, consider it a heads-up.
We’ve now seen the effect: the project works very smoothly and binds the thread to the specified core.
This feature has real application scenarios, though it is a rather extreme means of performance optimization.
Binding the core makes better use of caching and reduces context switching for threads.
Speaking of which, I have to mention the scene where I first learned about this core-binding trick.
It was the first Database Performance Competition, or better known as Tianchi, held in 2018.
I entered that edition of the competition just for fun, and my result was bad enough not to mention.
But I went to carefully watch the top few post-match sharing, everyone’s ideas are similar.
I also have to grumble quietly that the final stage of the competition came down to differences in language and parameter tuning. C++ has a natural advantage there, which is why all the top spots went to C++ players.
One small detail mentioned by many teams is the binding of the core.
And I first learned about this open source project through this article “PolarDB Database Performance Competition Java Players Share”.
At the time, I pulled down his competition code and got a basic understanding of the core-binding operation, but I didn’t actually dig into the implementation.
I just learned the incantation: write it this way, the thread gets pinned, done.
Later, when I looked at Disruptor, I saw that it had a waiting strategy like this:
com.lmax.disruptor.BusySpinWaitStrategy
There is a comment on this policy:
It is best used when threads can be bound to specific CPU cores.
In other words: if you’re going to use this strategy, it’s best if threads can be pinned to specific CPU cores.
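The busy-spin idea behind that strategy can be sketched with the stdlib alone. This is a conceptual illustration of my own, not Disruptor’s actual code: the consumer burns its core polling a flag instead of parking, which is exactly the pattern that benefits from owning a dedicated, pinned core.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class BusySpinDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread consumer = new Thread(() -> {
            // busy-spin: keep the core hot instead of parking the thread;
            // lowest wake-up latency, but only sensible if this thread
            // has a CPU core to itself
            while (!ready.get()) {
                Thread.onSpinWait(); // hint to the CPU (PAUSE instruction)
            }
            System.out.println("event received");
        });
        consumer.start();
        Thread.sleep(100);
        ready.set(true); // publish the "event"
        consumer.join();
    }
}
```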
And just like that, this obscure bit of knowledge was awakened.
I knew how to bind a core: through the open source project Java-Thread-Affinity.
The question then becomes: How does it do it?
How it does it
As for the specifics, I’ll just write up a few key points as a quick analysis; if you’re interested, pull the source down and take a look.
First, JNA is important for Java-thread-affinity:
In fact, Java-Thread-Affinity is just a Java skin; the real work belongs to the operating system and is implemented in lower-level C or C++.
So this project is essentially based on JNA, calling into native libraries (such as DLL files on Windows) to achieve core binding.
The corresponding code looks like this:
net.openhft.affinity.Affinity
First determine the operating system type in the static code block of this class:
I’m running Windows.
net.openhft.affinity.IAffinity
This is an interface with thread-affinity implementations for the various platforms:
For example, in the implementation class WindowsJNAAffinity, you can see the logic called in its static code block:
net.openhft.affinity.impl.WindowsJNAAffinity.CLibrary
This is done, as mentioned earlier, by calling into kernel32.dll via JNA.
Some of the building blocks for making this feature available on Windows platforms are here.
Second point: How do I bind to a given core?
In its core class there is a method like this:
net.openhft.affinity.AffinityLock#acquireLock(int)
The parameter here is the CPU number; remember that CPU numbering starts at 0.
But CPU 0 is not recommended:
So the program also guards against binding to CPU 0.
You end up with this method:
net.openhft.affinity.AffinityLock#bind(boolean)
Inside, a BitSet is used: the bit at the position of the CPU you want to bind to is set to true.
On the Windows platform, this method is then called:
net.openhft.affinity.impl.WindowsJNAAffinity.CLibrary#SetThreadAffinityMask
This method is a Windows API that restricts which CPUs a thread may run on.
https://docs.microsoft.com/zh-cn/windows/win32/api/winbase/nf-winbase-setthreadaffinitymask?redirectedfrom=MSDN
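The mask-building part can be sketched like this (my own illustration, not the project’s exact code): setting bit N in a BitSet yields exactly the kind of affinity mask that SetThreadAffinityMask expects, where bit N means "may run on CPU N".

```java
import java.util.BitSet;

public class AffinityMaskDemo {
    // build an affinity mask allowing only the given CPU
    static long maskFor(int cpuId) {
        BitSet allowed = new BitSet();
        allowed.set(cpuId);                  // bit N set = may run on CPU N
        long[] words = allowed.toLongArray();
        return words.length > 0 ? words[0] : 0L;
    }

    public static void main(String[] args) {
        // CPU 5 -> bit 5 -> 0b100000 = 32
        System.out.println(Long.toBinaryString(maskFor(5))); // prints 100000
    }
}
```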
Third point: How is the Solaris platform implemented?
As we know, HotSpot virtual machine on Solaris supports both 1:1 and N:M threading model.
Logically there should be two binding schemes to handle, so I clicked in and, boy:
they kept it as simple as possible and just made it unsupported.
Point 4: Who’s using it?
Netty uses this library:
SOFAJRaft also relies on this package:
https://github.com/sofastack/sofa-jraft/blob/master/README_zh_CN.md
And beyond the competition I mentioned earlier, I also saw it come up on Zhihu:
Well, this is the end of the article.
Think about the interview question again. If this really is what the interviewer wanted, is it appropriate?
And ask yourself: what kind of business scenario does your company have that needs optimization down to this level?
Is it high-frequency trading?
If it is, then once people join, they probably won’t be seeing many plain thread pools anyway, right?
All right, manual survival, call it a day.
Please imagine the emoji yourself: tactical-lean-back.gif