Scheduling

When a computer is a multi-programming system, there are frequently many processes or threads competing for CPU slices at the same time. This happens when two or more processes/threads are in a ready state. If only one CPU is available, then you must choose which process/thread can run next. There is a role in the operating system called a scheduler that does just that, using an algorithm called the scheduling algorithm.

Although there are some differences, many of the same principles that apply to process scheduling also apply to thread scheduling. When the kernel manages threads, scheduling usually occurs at the thread level, with little or no regard for which process the thread belongs to. We will first focus on process scheduling and then explicitly turn to thread scheduling and the problems it raises.

An introduction to scheduling

Back in the early days of batch systems, with jobs on cards or tape as input, the scheduling algorithm was simple: run each job on the tape in turn. For multiprogramming systems it is more complicated, because there are often multiple users waiting for service. Some mainframes still combine batch and time-sharing service, requiring the scheduler to decide whether to run a batch job or serve a user at a terminal next. Since CPU time is a scarce resource on these machines, a good scheduler can make a big difference in perceived performance and user satisfaction.

Process behavior

Almost all processes alternate between bursts of computation and (disk or network) I/O requests

As shown in the figure above, a process computes for a while without stopping and then issues a system call to read or write a file, waiting for the I/O to complete. Once the system call finishes, it computes again until it needs to read or write more data. An I/O activity here means the process is blocked waiting for an external device to complete its work.

Process A above is CPU-intensive; process B is I/O-intensive. A is compute-bound or CPU-bound because it spends most of its time computing; B is I/O-bound because its I/Os occur more frequently. Compute-bound processes have long CPU bursts and infrequent I/O waits, while I/O-bound processes have short CPU bursts and frequent I/O waits. Note that the difference between the two kinds of processes lies in the length of the CPU bursts, not the length of the I/O waits: I/O-bound processes are I/O bound because they do little computing between I/Os, not because their I/O requests take especially long. No matter how long the data takes to arrive, issuing the hardware request to read a disk block takes about the same amount of time.

It is worth noting that as CPUs get faster, processes tend to become more I/O-bound. This happens because CPU speeds are increasing much faster than disk speeds. As a result, scheduling of I/O-bound processes is likely to become more important in the future. The basic idea is that if an I/O-bound process wants to run, it should get the chance quickly so that it can issue its disk request and keep the disk busy.

When to schedule

The first scheduling question is when to make scheduling decisions. There are various situations that require scheduling. First, after creating a new process, a decision is needed about whether to run the parent or the child. Since both processes are in the ready state, this is a normal scheduling decision and can go either way; the scheduler can legitimately choose to run either the parent or the child first.

Second, a scheduling decision is needed when a process exits. That process can no longer run (it no longer exists), so another process must be selected from the set of ready processes. If no process is ready, the idle process supplied by the system normally runs.

What is an idle process

The System Idle Process is a system process in the Microsoft Windows operating system. It consists of one thread per processor, whose sole task is to occupy processor time when the system is not running any other thread. The System Idle Process is not a real process; it is a virtual process created by the kernel and exists only in multitasking operating systems. When no other process is runnable, the System Idle Process runs. You can simply think of it as representing the idle state of the CPU: the higher its value, the more idle the processor is. You can check CPU utilization in Windows through the Task Manager.

The third case is when a process blocks on I/O, on a semaphore, or for some other reason, and another process must be selected to run. Sometimes the reason for the block can factor into the choice. For example, if A is an important process waiting for B to leave its critical region, running B next would let it leave the critical region so that A can proceed. In practice, however, schedulers generally do not take this into account.

Fourth, a scheduling decision can be made when an I/O interrupt occurs. If the interrupt came from an I/O device that has now finished its work, a process that was waiting on that I/O may now be ready to run. It is up to the scheduler to decide whether to run the newly ready process or to resume the one that was interrupted.

If a hardware clock provides periodic interrupts at 50 or 60 Hz or some other frequency, a scheduling decision can be made at every clock interrupt or at every k-th clock interrupt. Scheduling algorithms can be divided into two types according to how they handle clock interrupts. A nonpreemptive scheduling algorithm picks a process and lets it run until it blocks (on I/O or waiting for another process) or until it voluntarily releases the CPU. Even if it runs for hours, it will not be forcibly suspended; no scheduling decision is made at clock interrupts. After the clock interrupt has been processed, the process that was running before the interrupt simply resumes.

The other case is preemptive scheduling, in which a process is selected and allowed to run for at most some fixed amount of time. If it is still running at the end of the interval, it is suspended and the scheduler selects another process to run (provided one is ready). Preemptive scheduling requires a clock interrupt at the end of the interval to return control of the CPU to the scheduler. If no clock is available, nonpreemptive scheduling is the only option.

Classification of scheduling algorithms

There is no doubt that different environments require different scheduling algorithms. This is because different application areas and different kinds of operating systems have different goals; in other words, what the scheduler should optimize for differs from system to system. Three environments are worth distinguishing:

  • Batch
  • Interactive
  • Real time

Batch systems are widely used in business for processing payroll, inventory, accounts receivable, accounts payable, interest calculation, claims processing, and other periodic work. In batch systems, nonpreemptive algorithms, or preemptive algorithms with a long time slice for each process, are generally acceptable. This approach reduces process switching and therefore improves performance.

In an environment with interactive users, preemption is needed to keep one process from hogging the CPU and denying service to the others. Even if no process intentionally runs forever, one faulty process could shut out all the others indefinitely, so preemption is necessary to avoid this. Servers also fall into this category, since they typically serve multiple (remote) users who are all in a hurry.

In real-time systems, preemption is sometimes unnecessary because processes know they may not run for long, so they usually do their work quickly and block. The difference from interactive systems is that real-time systems run only programs intended to advance the application at hand, whereas interactive systems are general-purpose and may run arbitrary programs that are uncooperative or even malicious.

Goals of the scheduling algorithm

In order to design a scheduling algorithm, it is necessary to think about what a good scheduling algorithm should do. Some goals depend on the environment (batch, interactive, or real-time), but some are desirable in all cases. We discuss them below.

All systems

In all cases, fairness is important. It is unfair to give one process a larger slice of CPU time than an equivalent process. Of course, different types of processes can take different approaches.

Related to fairness is enforcing the system's policies. What does that mean? If a company's policy is to process payroll on the 15th of the month, but because of the epidemic the boss decides the payroll must go out on the night of the 14th, then the scheduler must enforce that policy and force the payroll process to run on the night of the 14th.

Another common goal is to keep all parts of the system as busy as possible. If the CPU and all the I/O devices can be kept running all the time, more work gets done per second than if some components are idle. In a batch system, for example, the scheduler controls which jobs are brought into memory to run. Keeping a mix of CPU-intensive and I/O-intensive processes in memory is a better idea than first loading and running all the CPU-intensive jobs and then, once they are done, loading and running all the I/O-intensive jobs. With the latter approach, while the CPU-intensive processes compete for the CPU, the disk sits idle, and while the I/O-intensive processes compete for the disk, the CPU sits idle. Clearly, by mixing I/O-intensive and CPU-intensive processes, the whole system can be kept running more smoothly and efficiently.

Batch systems

There are generally three metrics for batch systems: throughput, turnaround time, and CPU utilization. Throughput is the number of jobs the system completes per hour. All things considered, 50 jobs per hour is better than 40. Turnaround time is the average time from the moment a batch job is submitted until the moment it is completed. It measures how long the average user has to wait for the output. The smaller the turnaround time, the better.

CPU utilization is often used as a metric on batch systems. Even so, it is not a very good one; the really valuable metrics are how many jobs the system completes per hour (throughput) and how long it takes to get a job back (turnaround time). Using CPU utilization as a metric is like rating cars by how many revolutions per hour the engine makes. On the other hand, knowing when CPU utilization is approaching 100% is useful for knowing when it is time to get more computing power.

Interactive systems

For interactive systems, the metrics are different. The most important one is minimizing response time, the time between issuing a command and getting the result. On a personal computer where background processes are running (for example, reading and sending e-mail over the network), a user request to start a program or open a file should take precedence over the background work. Having all interactive requests served first is perceived as good service.

A related issue is proportionality: users have a fixed (though often incorrect) idea of how long things should take. When a request that is perceived as complex takes a long time, users accept it, but when an application that is perceived as simple takes a long time, they get annoyed. Take color printing and photocopying as an example: color printing may take a minute, but users consider it complicated and are willing to wait; photocopying, by contrast, is seen as simple and is expected to finish in about five seconds, so a long wait for it would be irritating.

Real-time systems

A real-time system has different considerations from an interactive system, and thus different scheduling goals. Real-time systems are characterized by deadlines that must be met. For example, if a computer controls a device that produces data at a fixed rate, failing to run the data-collection process on time may mean losing data. The most important requirement in a real-time system is therefore to meet all (or most) deadlines.

In some real-time systems, especially those involving multimedia, predictability is important. Missing an occasional deadline does not matter much, but if the audio process runs erratically, the sound quality deteriorates noticeably. Video also causes problems, but the ear is much more sensitive to jitter than the eye. To avoid such problems, process scheduling must be highly predictable and regular.

Scheduling in batch systems

Now let’s switch our focus from general scheduling to specific scheduling algorithms. Next we will look at scheduling in batch processing.

First come, first served

It is just what it sounds like. Perhaps the simplest of all nonpreemptive scheduling algorithms is first-come, first-served: the CPU is allocated to processes in the order they request it. At its most basic, there is a single queue of ready processes. When the first job enters the system from the outside, it is started immediately and allowed to run as long as it wants; it is not interrupted because it has run too long. As other jobs come in, they are placed at the end of the ready queue. When the running process blocks, the first process in the queue runs next. When a blocked process becomes ready again, it is placed at the end of the queue, behind all waiting processes, like a newly arrived job.

The great strength of this algorithm is that it is easy to understand and easy to program. A single linked list keeps track of all ready processes. To pick a process to run, simply remove one from the head of the queue; when a new job arrives or a blocked process becomes ready, simply append it to the tail. The implementation could hardly be simpler.

The drawback: if 100 I/O-bound processes are queued and the 101st is CPU-bound, the CPU-bound process has to wait for all 100 I/O-bound processes to run before it gets its turn. That is unacceptable in practice, so priorities or preemption are needed to let important processes run first.
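
As a rough illustration of the mechanics described above (the process names and the run/block events below are invented, and this is only a sketch of the ready-queue bookkeeping, not a full scheduler):

```python
from collections import deque

class FCFSScheduler:
    """Minimal sketch of a first-come, first-served ready queue."""

    def __init__(self):
        self.ready = deque()          # single list of ready processes

    def admit(self, process):
        self.ready.append(process)    # new or newly unblocked process goes to the tail

    def pick_next(self):
        if not self.ready:
            return "idle process"     # nothing ready: run the system idle process
        return self.ready.popleft()   # always run the process at the head

sched = FCFSScheduler()
for p in ["A", "B", "C"]:
    sched.admit(p)
running = sched.pick_next()           # "A" runs until it blocks or exits
sched.admit(running)                  # when A unblocks, it rejoins at the tail
print(running, list(sched.ready))     # A ['B', 'C', 'A']
```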

Shortest job first

The second scheduling algorithm for batch systems is shortest job first; here we assume the running times are known in advance. An insurance company, for example, can predict quite accurately how long it will take to process a batch of 1,000 claims, because it does similar work every day. When several equally important jobs are sitting in the input queue waiting to be started, the scheduler should pick the shortest job first.

As shown in figure (a), there are four jobs A, B, C, and D, with running times of 8, 4, 4, and 4 minutes respectively. If they are run in the order shown, the turnaround time of A is 8 minutes, B is 12 minutes, C is 16 minutes, and D is 20 minutes, for an average of 14 minutes.

Now consider running the four jobs in shortest-job-first order, as shown in figure (b). The turnaround times are now 4, 8, 12, and 20 minutes, an average of 11 minutes. It can be proved that shortest job first is optimal. Consider four jobs with running times a, b, c, and d respectively. The first job finishes at time a, the second at time a + b, and so on, so the average turnaround time is (4a + 3b + 2c + d) / 4. Clearly a contributes the most to the average, so it should be the shortest job, followed by b, then c, and finally d, which affects only its own turnaround time.

It is important to note that shortest job first is optimal only when all the jobs are available at the same time.
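
To make the arithmetic concrete, here is a small sketch that recomputes the averages above, assuming (as in the figure) that all four jobs are available at time 0 and never block:

```python
def average_turnaround(run_times):
    """All jobs arrive at time 0 and run back to back in the given order."""
    clock, total = 0, 0
    for t in run_times:
        clock += t        # this job finishes at the current clock value
        total += clock    # its turnaround time equals its finish time
    return total / len(run_times)

fcfs_order = [8, 4, 4, 4]                 # A, B, C, D as in figure (a)
sjf_order = sorted(fcfs_order)            # shortest job first, as in figure (b)

print(average_turnaround(fcfs_order))     # 14.0 minutes
print(average_turnaround(sjf_order))      # 11.0 minutes
```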

Shortest remaining time next

The preemptive version of shortest job first is known as shortest remaining time next. With this algorithm, the scheduler always chooses the process whose remaining running time is shortest. When a new job arrives, its total time is compared with the remaining time of the current process. If the new job needs less time than the current process has remaining, the current process is suspended and the new job is run. In this way, short new jobs get good service.
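
Below is a minimal sketch of the preemption rule, simulating time in one-unit steps; the job names, arrival times, and run times are invented:

```python
import heapq

def srtn(arrivals):
    """Shortest remaining time next.

    arrivals: list of (arrival_time, name, run_time); returns finish times.
    Simplification: time advances in 1-unit steps and ties break by name."""
    arrivals = sorted(arrivals)
    ready, finish, clock, i = [], {}, 0, 0
    while i < len(arrivals) or ready:
        # admit everything that has arrived by now
        while i < len(arrivals) and arrivals[i][0] <= clock:
            _, name, need = arrivals[i]
            heapq.heappush(ready, (need, name))
            i += 1
        if not ready:                    # CPU idle until the next arrival
            clock = arrivals[i][0]
            continue
        need, name = heapq.heappop(ready)
        clock += 1                       # run the job with the least remaining time
        if need - 1 == 0:
            finish[name] = clock
        else:
            heapq.heappush(ready, (need - 1, name))  # may be preempted by a new arrival
    return finish

print(srtn([(0, "A", 8), (1, "B", 4), (2, "C", 2)]))   # {'C': 4, 'B': 7, 'A': 14}
```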

Scheduling in interactive systems

Interactive systems are so common in PCs, servers, and other machines that it is worth exploring interactive scheduling.

Round-robin scheduling

One of the oldest, simplest, fairest, and most widely used algorithms is round robin. Each process is assigned a time interval, called a quantum, during which it is allowed to run. If the process is still running when the quantum ends, the CPU is preempted and given to another process. If the process blocks or finishes before the quantum ends, the CPU is switched immediately. Round robin is easy to implement: all the scheduler has to do is maintain a list of runnable processes, as in (a) in the figure below; when a process uses up its quantum, it is moved to the end of the list, as in (b).

The only really interesting issue in round-robin scheduling is the length of the quantum. Switching from one process to another takes time for administration: saving and loading registers and memory maps, updating tables and lists, flushing and reloading the memory cache, and so on. This is called a process switch or context switch. If it takes 1 ms to switch processes (including memory maps, cache flushing and reloading, and so on) and the quantum is 4 ms, then after every 4 ms of useful work the CPU spends 1 ms on switching. As a result, 20% of the CPU's time is wasted on administrative overhead, which is expensive.

To improve CPU efficiency, we could set the quantum to 100 ms. Now only about 1% of the time is wasted. But consider what happens when 50 requests arrive within a very short interval, each with different CPU requirements. All 50 processes are placed on the list of runnable processes. If the CPU is idle, the first one starts immediately, the second does not start until about 100 ms later, and so on. The unlucky last one may have to wait roughly 5 seconds before getting a chance, and most users will find a 5-second response to a short command sluggish. This is bad design if a request near the end of the queue needed only a moment of CPU time.

Another factor is that if the quantum is set longer than the average CPU burst, preemption will rarely happen. Instead, most processes will block before the quantum runs out, causing a process switch anyway. Eliminating preemption improves performance, because process switches then happen only when they are logically necessary, that is, when a process blocks and cannot continue.

The conclusion can be stated as follows: setting the quantum too short causes too many process switches and lowers CPU efficiency, but setting it too long may cause poor response to short interactive requests. A quantum of around 20 to 50 ms is often a reasonable compromise.
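
The overhead argument is easy to reproduce; the sketch below uses the same 1 ms switch cost assumed in the text:

```python
def switching_overhead(quantum_ms, switch_cost_ms=1.0):
    """Fraction of CPU time lost to process switching, assuming every
    process uses its full quantum before being preempted."""
    return switch_cost_ms / (quantum_ms + switch_cost_ms)

def worst_case_wait(quantum_ms, waiting_processes, switch_cost_ms=1.0):
    """Time until the last of the waiting processes first gets the CPU."""
    return waiting_processes * (quantum_ms + switch_cost_ms)

print(f"{switching_overhead(4):.0%}")              # 20% of the CPU wasted
print(f"{switching_overhead(100):.2%}")            # about 1% wasted
print(worst_case_wait(100, 49) / 1000, "seconds")  # ~5 s before the 50th process runs
```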

Priority scheduling

Round-robin scheduling assumes that all processes are equally important, which is often not the case. For example, in a university hierarchy there is first the dean, then the professors, the secretaries, the support staff, and finally the students. Taking such external factors into account leads to priority scheduling.

The basic idea is clear: each process is assigned a priority, and the process with the highest priority runs first.

This does not mean that high-priority processes can run forever, however, as the scheduler reduces the priority of the currently running process during each clock interrupt. If this action causes its priority to drop below that of the next highest process, a process switch occurs. Alternatively, each process can be assigned a maximum interval of time allowed to run. When the interval runs out, the next higher-priority process gets a chance to run.

Processes can be assigned priorities statically or dynamically. On a military computer, you can set the process initiated by a general to priority 100, a colonel to priority 90, a major to priority 80, a captain to priority 70, a lieutenant to priority 60, and so on. There is a UNIX command called nice that allows users to voluntarily lower the priority of their processes in favor of others, but it is generally not used.

Priorities can also be assigned dynamically by the system to achieve certain goals. For example, some processes are I/O-bound and spend most of their time waiting for I/O to complete. When such a process wants the CPU, it should get it immediately so that it can issue its next I/O request, which can then proceed in parallel with another process doing computation. Making an I/O-bound process wait a long time for the CPU only means it occupies memory for a long time. A simple algorithm for giving good service to I/O-bound processes is to set the priority to 1/f, where f is the fraction of the last quantum that the process actually used. A process that used only 1 ms of its 50 ms quantum gets priority 50, a process that ran 25 ms before blocking gets priority 2, and a process that used the whole quantum gets priority 1.
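
A tiny sketch of the 1/f rule (the quantum length matches the 50 ms used above; the three usage figures are the same cases from the text):

```python
QUANTUM_MS = 50

def dynamic_priority(cpu_used_ms):
    """Priority = 1/f, where f is the fraction of the last quantum the
    process actually used before blocking; a larger value means higher priority."""
    f = cpu_used_ms / QUANTUM_MS
    return 1 / f

for used in (1, 25, 50):
    print(used, "ms used -> priority", dynamic_priority(used))
# 1 ms used  -> priority 50  (very I/O-bound, scheduled soonest)
# 25 ms used -> priority 2
# 50 ms used -> priority 1   (used the whole quantum)
```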

It is often convenient to group processes into priority classes, use priority scheduling between the classes, and use round-robin scheduling within each class. A system with four priority classes is shown below.

The scheduling algorithm works as follows: as long as there are runnable processes in priority class 4, each of them runs for one quantum in round-robin fashion, and the lower-priority classes are ignored. If class 4 is empty, the class 3 processes are run round robin. If classes 4 and 3 are both empty, class 2 is run round robin, and so on. If priorities are never adjusted, the lower classes are prone to starvation.
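
A minimal sketch of this class-based scheme, with four classes and round robin inside each; the process names and class assignments are invented:

```python
from collections import deque

# one ready queue per priority class, highest class first
classes = {4: deque(["P1", "P2"]), 3: deque(["P3"]), 2: deque(), 1: deque(["P4"])}

def pick_next():
    """Scan from the highest class down; round-robin within the chosen class."""
    for prio in sorted(classes, reverse=True):
        queue = classes[prio]
        if queue:
            process = queue.popleft()
            queue.append(process)     # rotate: it goes to the back of its own class
            return process
    return None                       # nothing runnable: run the idle process

print([pick_next() for _ in range(5)])   # ['P1', 'P2', 'P1', 'P2', 'P1'] — classes 3 and 1 starve
```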

Multi-level queue

One of the earliest systems to use priority scheduling was CTSS (Compatible Time Sharing System). A problem with CTSS was that process switching was very slow, because the memory of the IBM 7094 it ran on could hold only one process at a time.

The IBM 7094 was the machine of the Columbia University Computer Center from 1964 to 1968.

Each switch meant swapping the current process out to disk and reading a new one in from disk. The CTSS designers quickly realized that it was more efficient to give CPU-bound processes a long quantum once in a while than to give them short quanta frequently (it reduces swapping). On the other hand, as mentioned above, long quanta hurt response time, so they set up priority classes. Processes in the highest class run for one quantum, processes in the next class for two quanta, processes in the class after that for four quanta, and so on. Whenever a process uses up its allotted quanta, it is moved down one class.
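
The saving is easy to quantify. The sketch below counts how many times a CPU-bound job needing 100 quanta is swapped in under the doubling rule (the 100-quanta job length is just an illustration):

```python
def slices_needed(total_quanta, base_quantum=1):
    """Count how many times a CPU-bound job is swapped in before it finishes,
    when each demotion doubles its allotted quanta: 1, 2, 4, 8, ..."""
    swaps, allotted, used = 0, base_quantum, 0
    while used < total_quanta:
        used += allotted      # the job runs its whole allotment, then is demoted
        allotted *= 2
        swaps += 1
    return swaps

print(slices_needed(100))   # 7 swaps (1+2+4+8+16+32+64 >= 100) instead of 100 with a fixed quantum
```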

Shortest process first

For batch systems, shortest job first yields the minimum average turnaround time, so it would be nice to use it for interactive processes as well. To some extent this is possible. Interactive processes typically follow the pattern: wait for command, execute command, wait for command, execute command... If we treat the execution of each command as a separate job, we can minimize response time by running the shortest one first. The only problem is figuring out which of the currently runnable processes is the shortest.

One approach is to make estimates based on the process's past behavior and run the process with the shortest estimated running time. Suppose the estimated running time per command for some terminal is T0, and suppose its next run is measured to take T1. The estimate can be updated by taking a weighted sum of these two values, aT0 + (1 - a)T1. By choosing a we can decide to forget old run times quickly or to remember them for a long time. With a = 1/2 we get the following sequence.

As you can see, after three rounds, T0’s share of the new estimate has dropped to 1/8.

This technique of estimating the next value in a series by taking a weighted average of the current measured value and the previous estimate is sometimes called aging. It applies to many situations where a prediction must be made on the basis of previous values.
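
A short sketch of the aging estimate with a = 1/2 (the initial estimate and the measured run times below are invented):

```python
def aged_estimate(measurements, a=0.5, initial=10.0):
    """Exponentially weighted estimate: new = a*estimate + (1-a)*measurement."""
    estimate = initial
    history = [estimate]
    for t1 in measurements:
        estimate = a * estimate + (1 - a) * t1
        history.append(estimate)
    return history

# after three rounds the initial estimate T0 contributes only 1/8 of the result
print(aged_estimate([6.0, 4.0, 8.0]))   # [10.0, 8.0, 6.0, 7.0]
```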

Guaranteed scheduling

A completely different approach to scheduling is to make explicit performance promises to users and then live up to them. One promise that is realistic and easy to keep is this: if n users are logged in while you are working, each of you will receive about 1/n of the CPU power. Similarly, on a single-user system with n processes running, all things being equal, each process should get 1/n of the CPU time.

Lottery scheduling

Making promises to users and then living up to them is fine, but hard to do. There is, however, an algorithm that gives similarly predictable results with a much simpler implementation: lottery scheduling.

The basic idea is to give processes lottery tickets for various system resources, such as CPU time. Whenever a scheduling decision has to be made, a ticket is drawn at random, and the process holding that ticket gets the resource. When applied to CPU scheduling, the system might hold a lottery 50 times a second, with each winner receiving, say, 20 ms of CPU time as its prize.

In the words of George Orwell, all processes are equal, but some processes are more equal than others. More important processes can be given extra tickets to increase their odds of winning. If 100 tickets are outstanding and one process holds 20 of them, it has a 20% chance of winning each lottery, and in the long run it will get about 20% of the CPU. In contrast, with a priority scheduler it is hard to say what having priority 40 actually means; here the rule is clear: a process holding a fraction f of the tickets gets approximately a fraction f of the resource.

Cooperating processes may exchange tickets if they wish. For example, if a client process sends a message to a server process and then blocks, it may hand the server all of its tickets to increase the chance that the server runs next. When the server is finished, it returns the tickets so the client can run again. In fact, in the absence of clients, a server has no need for tickets at all.

Think of a lottery ticket as a buff that gives you, say, a 15% chance of dropping speed boots.
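
A minimal sketch of a lottery draw; the process names and ticket counts are invented, and over many draws each process wins roughly in proportion to its tickets:

```python
import random

def draw_winner(tickets):
    """tickets: dict mapping process name -> number of lottery tickets held."""
    total = sum(tickets.values())
    winner = random.randrange(total)          # pick one ticket uniformly at random
    for process, count in tickets.items():
        if winner < count:
            return process                    # this process holds the winning ticket
        winner -= count

tickets = {"video_server": 20, "editor": 10, "backup": 70}
wins = {name: 0 for name in tickets}
for _ in range(10_000):                       # 10,000 scheduling decisions
    wins[draw_winner(tickets)] += 1
print(wins)                                   # roughly 20% / 10% / 70% of the draws
```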

Fair share scheduling

So far, we have assumed that each process is scheduled on its own, without regard to who owns it. As a result, if user 1 starts nine processes and user 2 starts one, then with round-robin scheduling or equal priorities, user 1 will get 90% of the CPU time and user 2 only 10%.

To prevent this, some systems take a process's owner into account when scheduling. In this model, each user is allocated some fraction of the CPU, and the scheduler picks processes so as to enforce it. Thus if two users have each been promised 50% of the CPU, they will each get half, no matter how many processes each of them has.

Scheduling in real-time systems

A real-time system is one in which time plays an essential role. Typically, one or more external physical devices send service requests to the computer, and the computer must respond appropriately within a fixed amount of time. For example, the computer in a CD player receives a bit stream from the drive and must convert it into music within a very tight time interval; if the computation takes too long, the music sounds wrong. Other examples include patient monitoring devices in hospital intensive-care units, the autopilot in an aircraft, and smoke-alarm devices on trains. In all these cases, a correct response that arrives too late is often just as bad as no response at all.

Real-time systems can be divided into two categories: hard real-time systems, in which absolute deadlines must be met, and soft real-time systems, in which missing an occasional deadline is undesirable but tolerable. In both cases, real-time behavior is achieved by dividing the program into a group of processes, each of whose behavior is predictable and known in advance. These processes are generally short-lived and run to completion extremely quickly. When an external event is detected, it is the scheduler's job to schedule the processes in such a way that all deadlines are met.

Events that a real-time system may have to respond to can be further classified as periodic (occurring at regular intervals) or aperiodic (occurring at unpredictable times). A system may have to respond to multiple periodic event streams, and depending on how much time each event requires, it may not even be possible to handle them all. For example, if there are m periodic events, event i occurs with period Pi and requires Ci seconds of CPU time to handle, then the load can be handled only if

C1/P1 + C2/P2 + … + Cm/Pm ≤ 1

A real-time system that meets this criterion is said to be schedulable, meaning it can actually be implemented. A set of processes that fails this test cannot be scheduled, because the total amount of CPU time they require is more than the CPU can provide.

As an example, consider a soft real-time system with three periodic events, with periods of 100 ms, 200 ms, and 500 ms. If these events require 50 ms, 30 ms, and 100 ms of CPU time respectively, the system is schedulable because 0.5 + 0.15 + 0.2 < 1. If a fourth event with a period of 1 second is added, the system remains schedulable as long as this event needs no more than 150 ms of CPU time per period (ignoring context-switch overhead).
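
The schedulability test is easy to check mechanically. The sketch below uses the numbers from the example (CPU times and periods in milliseconds):

```python
def schedulable(events):
    """events: list of (cpu_time_ms, period_ms); the load is feasible only if
    the total CPU utilisation sum(Ci / Pi) does not exceed 1."""
    return sum(c / p for c, p in events) <= 1.0

events = [(50, 100), (30, 200), (100, 500)]
print(schedulable(events))                      # True: 0.5 + 0.15 + 0.2 = 0.85
print(schedulable(events + [(150, 1000)]))      # True: adds 0.15, total exactly 1.0
print(schedulable(events + [(200, 1000)]))      # False: total 1.05, deadlines will be missed
```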

The scheduling algorithm of a real-time system can be static or dynamic. The former makes its scheduling decisions before the system starts running; the latter makes them at run time. Static scheduling works only when perfect information is available in advance about the work to be done and the deadlines to be met; dynamic scheduling has no such restriction.

Scheduling policies and mechanisms

So far, we have implicitly assumed that all the processes in the system belong to different users and compete with one another for the CPU. While this is often true, sometimes a process has many children running under its control. For example, a database management system process may have many children, each handling a different request or performing a different function (query analysis, disk access, and so on). It is entirely possible that the main process knows best which of its children are the most important (or most urgent) and which the least. Unfortunately, none of the schedulers discussed above accepts any input from user processes about scheduling decisions, so the scheduler rarely makes the best choice.

The solution to this problem is to separate the scheduling mechanism from the scheduling policy, a long-established principle. It means that the scheduling algorithm is parameterized in some way, and the parameters can be filled in by user processes. Consider the database example again. Suppose the kernel uses a priority scheduling algorithm and provides a system call by which a process can set (and change) the priorities of its children. In this way the parent can control in detail how its children are scheduled, even though it does not do the scheduling itself. Here the mechanism is in the kernel, but the policy is set by a user process. Policy/mechanism separation is the key idea.
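
As a rough sketch of the idea on a Unix-like system (the child roles and niceness values are hypothetical): the kernel provides the mechanism, a per-process priority that its scheduler consults, while the parent provides the policy by deciding which child gets which niceness.

```python
import os, time

# Policy lives in the parent: it decides which child matters most.
# Mechanism lives in the kernel: its scheduler consults the niceness we set.
child_policy = {"request_parser": 0, "disk_writer": 10}   # lower nice = higher priority

children = []
for role, niceness in child_policy.items():
    pid = os.fork()
    if pid == 0:                                      # child: pretend to do this role's work
        time.sleep(1)
        os._exit(0)
    os.setpriority(os.PRIO_PROCESS, pid, niceness)    # parent sets the child's priority
    children.append(pid)

for _ in children:                                    # parent reaps its children
    os.wait()
```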

Thread scheduling

When several processes have multiple threads, there are two levels of parallelism: process and thread. Scheduling processing in such systems differs substantially, depending on whether user-level threads or kernel-level threads are supported (or both).

Let's start with user-level threads. Since the kernel is unaware that threads exist, it operates as it always has, picking a process, say A, and giving A control for its quantum. The thread scheduler inside A decides which thread to run, say A1. Since there are no clock interrupts to multiprogram threads, this thread may continue running as long as it wants. If it uses up the process's entire quantum, the kernel will select another process to run.

When process A finally runs again, thread A1 resumes and continues to consume all of A's time until it finishes its work. However, this behavior does not affect other processes: they still get whatever share the scheduler assigns them, regardless of what happens inside process A.

Now consider the case where A's threads have relatively little work to do per CPU burst, for example, 5 ms of work within a 50-ms quantum. Each thread then runs for a while and hands the CPU back to the thread scheduler, so before the kernel switches to process B there will be the sequence A1, A2, A3, A1, A2, A3, A1, A2, A3, A1, as shown below.

The runtime system can use any of the scheduling algorithms described above. In practice, round-robin and priority scheduling are most common. The only constraint is the absence of a clock interrupt to stop a thread that has run too long; since threads within a process usually cooperate, this is normally not a problem.

Now consider the situation with kernel-level threads, where the kernel picks a particular thread to run. It need not take into account which process the thread belongs to, although it can if it wants to. The chosen thread is given a quantum and is forcibly suspended if it exceeds it. If, as before, each thread blocks after 5 ms of work within a 50-ms quantum, then over a period of 30 ms the threads might run in the order A1, B1, A2, B2, A3, B3, as shown in the figure below.

The main difference between user-level and kernel-level threads here is performance. Switching user-level threads takes only a handful of machine instructions (think of thread switching inside a Java program), whereas switching kernel threads requires a full context switch, changing the memory map and invalidating the cache, which is orders of magnitude slower. On the other hand, with kernel-level threads, a thread blocking on I/O does not suspend the entire process, as it does with user-level threads.

Since the kernel knows that switching from a thread in process A to a thread in process B is much more expensive than running a second thread in process A (it involves changing the memory map and spoiling the cache), it can take this information into account when making a decision.

Afterword

I have submitted a corrigendum for the China Machine Press translation. I do not know whether it will be seen, but I wanted to mention it here: the translation is quite imprecise.

Article Reference:

Modern Operating Systems

Modern Operating Systems, Fourth Edition

www.encyclopedia.com/computing/n…

J00ru.vexillium.org/syscalls/nt…

www.bottomupcs.com/process_hie…

En.wikipedia.org/wiki/Runtim…

En.wikipedia.org/wiki/Execut…

Zhidao.baidu.com/question/11…

baike.baidu.com/item/ Waiting queue /9…

www.columbia.edu/cu/computin…