Preface
For most Java developers, complex concurrency scenarios are rarely encountered in real-world work; at most, a thread pool is used to run tasks concurrently and improve execution efficiency. That does not mean a superficial grasp of concurrency is enough. Now that Moore’s law is failing, concurrent processing has become a new driving force of computing and the most effective weapon we have for squeezing computing power out of the CPU. For these reasons, I believe that learning Java concurrency systematically, from its ideas and underlying mechanisms upward, is far more rewarding than piecemeal study and casual application. This article systematically introduces the basic building block of Java concurrency: the thread.
Thread
First of all, we must realize that concurrency in a computer system does not have to depend on multithreading. Multi-process concurrency, for example, is common in PHP. Concurrency in the Java language, however, is built on multithreading.
As we all know, threads are more lightweight than processes: they decouple execution scheduling from the process’s resource allocation, so each thread shares its process’s resources while being scheduled independently. In the Java language, the thread is the smallest unit of program scheduling (although this may change in the future if Java successfully introduces fibers).
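To make “shared resources, independently scheduled” concrete, here is a minimal sketch (the class and field names are my own, not from the original article): two threads increment a counter belonging to their common process while the operating system interleaves their execution.

    import java.util.concurrent.atomic.AtomicInteger;

    public class SharedCounterDemo {
        // A process-level resource shared by every thread in this JVM process.
        private static final AtomicInteger COUNTER = new AtomicInteger();

        public static void main(String[] args) throws InterruptedException {
            Runnable task = () -> {
                for (int i = 0; i < 100_000; i++) {
                    COUNTER.incrementAndGet(); // both threads touch the same memory
                }
            };
            Thread t1 = new Thread(task, "worker-1");
            Thread t2 = new Thread(task, "worker-2");
            t1.start(); // each thread is scheduled independently by the OS
            t2.start();
            t1.join();
            t2.join();
            System.out.println("counter = " + COUNTER.get()); // always 200000
        }
    }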
From the point of view of the computer system, there are three main ways to implement threads: the kernel thread implementation (1:1), the user thread implementation (1:N), and the hybrid implementation (N:M).
Kernel thread implementation
Kernel threads are supported directly by the operating system kernel, which schedules them through its scheduler and maps them onto processors for execution. In theory this is the most direct implementation, but programs normally do not manipulate kernel threads themselves; they use the kernel thread interface, the lightweight process, each of which is backed by exactly one kernel thread. It is this lightweight process that we usually call a thread.
Since each lightweight process requires a kernel thread to support it, lightweight processes and kernel threads exist in a 1:1 ratio. This is why the kernel thread model is also known as the 1:1 implementation.
[Advantage] Because each thread is backed by its own kernel thread and is an independent scheduling unit, even if one thread dies, it does not bring the whole process down.
[Disadvantage] This implementation also has limitations. Because it is built on kernel threads, every thread operation (creation, running, switching, and so on) involves the system scheduler, and each such call forces a switch back and forth between user mode and kernel mode, which is relatively expensive.
The cost of switching between user mode and kernel mode lies mainly in responding to the interrupt and in saving and restoring the thread’s execution context. A running program depends on its context, which means different things from different perspectives. From the programmer’s point of view, it is the variables and resources of a method call; from the thread’s point of view, it is the various information stored on the method call stack; from the system’s point of view, it is the concrete data held in registers and memory. When a thread switch occurs, the system must save a snapshot of everything the current thread was executing, which involves copying data back and forth between cache and memory, and must then restore the context of the thread about to run, so the whole operation is relatively expensive.
User thread implementation
In a broad sense, any thread that is not a kernel thread can be considered a user thread, so a lightweight process also qualifies. However, since a lightweight process is always built on top of a kernel thread, it has none of the advantages of user threads in the narrow sense.
In the narrow sense, user threads are implemented entirely by a thread library in user space, and the system kernel cannot perceive their existence. Creation, running, scheduling, and all other management operations are completed in user mode, without help from the kernel. [Advantage] Since no kernel help is needed, there is no need to switch into kernel mode if things are done properly; system overhead is much lower, and large-scale thread concurrency becomes easier to achieve. Multithreading in many of today’s high-performance databases is implemented on user threads. This 1:N relationship between a process and its user threads is why the model is also known as the one-to-many implementation. [Disadvantage] The absence of the kernel is both the strength and the weakness of user threads: with no kernel to help, thread scheduling must be implemented by the user program itself, and scheduling is complicated enough that getting it wrong can leave threads blocked forever or even crash the program.
In the past, languages such as Java and Ruby tried user threads but eventually abandoned them. In recent years Go and Erlang, both known for concurrency, have adopted user threads, which explains why those languages support high concurrency so naturally.
Hybrid implementation
A hybrid implementation contains both user threads and lightweight processes, and their numbers can vary independently, so it is also called the N:M implementation. Under this model we enjoy both the low overhead of user threads and the scheduling support of kernel threads, which greatly reduces the risk of blocking the whole process while increasing thread concurrency.
Java thread
How Java threads are implemented is not mandated by the Java Virtual Machine Specification and can vary from virtual machine to virtual machine. In the HotSpot virtual machine, each Java thread is mapped directly to a kernel thread, and the VM itself does not intervene in thread scheduling, thread states, and so on; those are left to the operating system.
Java thread scheduling
There are two main ways of thread scheduling: cooperative thread scheduling and preemptive thread scheduling.
[Cooperative thread scheduling] Under cooperative scheduling, a thread’s running time is determined by the thread itself: when it finishes its work, it actively notifies the system to switch to another thread. [Advantage] The implementation is simple and efficient, and because switches are visible to the thread, there are essentially no thread synchronization problems. [Disadvantage] The running time of a thread is not controllable; if badly written code causes a thread to block and it never notifies the system to switch, the whole program hangs forever.
[Preemptive thread scheduling] Under preemptive scheduling, each thread’s running time and the switches between threads are controlled by the system, and the thread itself has no say. In the Java language, for example, Thread::yield() can give up execution time, but because scheduling is preemptive, it is still up to the system whether the current thread actually yields.
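A small sketch of that behavior (the class name is my own): both threads call Thread::yield() after every line of output, but since the call is only a hint, the interleaving may differ from run to run.

    public class YieldDemo {
        public static void main(String[] args) {
            Runnable polite = () -> {
                for (int i = 0; i < 5; i++) {
                    System.out.println(Thread.currentThread().getName() + ": " + i);
                    Thread.yield(); // merely a hint; the scheduler may ignore it
                }
            };
            new Thread(polite, "A").start();
            new Thread(polite, "B").start();
        }
    }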
The downside of preemptive scheduling, which most of us have run into, is the need for synchronization. Java thread scheduling is preemptive, so the rest of this discussion of Java threads assumes preemption.
Although under preemption a thread cannot decide how long it runs, a program can “suggest” that the system allocate more time to certain threads; this feature is known as thread priority. The number of priority levels varies from operating system to operating system, but the idea is the same everywhere. So although the Java language defines 10 priority levels for threads, on some operating systems several of those levels map to the same native priority.
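A minimal sketch of the priority API (class name my own; the actual effect on scheduling depends entirely on the operating system):

    public class PriorityDemo {
        public static void main(String[] args) {
            Thread t = new Thread(() -> System.out.println("worker running"));
            // Java defines priorities from MIN_PRIORITY (1) to MAX_PRIORITY (10);
            // the OS may map several of them onto the same native priority.
            t.setPriority(Thread.MAX_PRIORITY); // a suggestion, not a guarantee
            t.start();
        }
    }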
Java thread state and state switching
The Java language defines six thread states in java.lang.Thread.State. A thread can be in only one state at any moment, but it can move between states in specific ways.
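The six states can be listed straight from the enum, and Thread::getState() reports the current state of any thread; a quick sketch:

    public class StateListDemo {
        public static void main(String[] args) {
            // Prints NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED.
            for (Thread.State s : Thread.State.values()) {
                System.out.println(s);
            }
        }
    }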
[NEW] The state of a thread that has been created but not yet started.
[RUNNABLE] Covers both the Running and Ready states of operating-system threads: a thread in this state may be executing, or it may be waiting for the operating system to allocate it execution time.
[WAITING] A thread in this state is allocated no CPU execution time and must be explicitly woken by another thread. There are several ways for a thread to enter this state (a sketch follows the list):
1. Object::wait() with no timeout specified;
2. Thread::join() with no timeout specified;
3. LockSupport::park().
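A minimal sketch of the third route (the class name and the sleep-based timing are my own simplifications): LockSupport::park() puts the thread into WAITING, and another thread wakes it explicitly with unpark.

    import java.util.concurrent.locks.LockSupport;

    public class WaitingDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread parked = new Thread(LockSupport::park, "parked");
            parked.start();
            Thread.sleep(100); // crude: give the thread time to actually park
            System.out.println(parked.getState()); // WAITING
            LockSupport.unpark(parked); // the explicit wake-up by another thread
            parked.join();
            System.out.println(parked.getState()); // TERMINATED
        }
    }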
[TIMED_WAITING] A thread in this state returns to RUNNABLE on its own once the specified timeout elapses, without needing an explicit wake-up. There are several ways for a thread to enter this state (a sketch follows the list):
1. Object::wait(long timeout) with a timeout specified;
2. Thread::join(long millis) with a timeout specified;
3. LockSupport::parkNanos(long nanos);
4. LockSupport::parkUntil(long deadline);
5. Thread::sleep(long millis).
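A matching sketch for TIMED_WAITING (class name and timing mine), using Thread::sleep:

    public class TimedWaitingDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread sleeper = new Thread(() -> {
                try {
                    Thread.sleep(1_000); // wakes up by itself after the timeout
                } catch (InterruptedException ignored) {
                }
            }, "sleeper");
            sleeper.start();
            Thread.sleep(100); // crude: let it reach the sleep call first
            System.out.println(sleeper.getState()); // TIMED_WAITING
        }
    }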
[BLOCKED] The thread is blocked. Note the essential difference between the blocked state and the waiting states. A thread becomes BLOCKED when it loses the competition for an exclusive lock, and it stays blocked until the thread holding the lock releases it; the blocked thread then becomes RUNNABLE, waits for the CPU to schedule it, and rejoins the competition for the lock. In the waiting states, by contrast, the thread enters the state voluntarily by calling a specific method, and becomes RUNNABLE again only after a timeout elapses or another thread explicitly wakes it, after which it likewise waits for CPU scheduling and may compete for the lock again. A sketch follows.
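A minimal sketch of that difference (names and timing mine): the second thread loses the competition for a monitor held by the first and is observed as BLOCKED.

    public class BlockedDemo {
        private static final Object LOCK = new Object();

        public static void main(String[] args) throws InterruptedException {
            Thread holder = new Thread(() -> {
                synchronized (LOCK) {
                    try {
                        Thread.sleep(1_000); // hold the monitor for a while
                    } catch (InterruptedException ignored) {
                    }
                }
            }, "holder");
            Thread contender = new Thread(() -> {
                synchronized (LOCK) { // cannot enter until holder releases LOCK
                }
            }, "contender");
            holder.start();
            Thread.sleep(100); // crude: make sure holder owns the lock first
            contender.start();
            Thread.sleep(100);
            System.out.println(contender.getState()); // BLOCKED
        }
    }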
[TERMINATED] The state of a thread whose execution has completed.
From the above analysis, the transitions between thread states are easy to describe. For a more intuitive picture, subdivide RUNNABLE into Ready and Running. Thread::start() takes a thread from NEW to Ready; the scheduler moves it back and forth between Ready and Running; the methods listed above move a running thread into WAITING or TIMED_WAITING, from which an explicit wake-up or an expired timeout returns it to Ready; losing the competition for a monitor moves it to BLOCKED, and acquiring the released lock returns it to Ready; and once run() completes, the thread is TERMINATED.
Through this article, we have learned the main ways operating systems implement threads and their respective advantages and disadvantages, seen how the mainstream virtual machine implements Java threads, and mastered the Java thread states and the transitions between them. With these basics in place, we can meaningfully go on to discuss Java concurrency.