Welcome to visit my GitHub and star bestJavaer

Hello, I'm cxuan. I put together a summary of operating system interview questions a while back, but going through it again recently I found it wasn't very complete. With interview season here, I spent a week revising and adjusting the questions. This may well be the most complete set of operating system interview questions on the market. Don't just take my word for it: I checked them systematically, and if you don't believe me, you're welcome to punch me in the face.

This 43-question article covers both campus and corporate job interviews. Read it and you will be able to hold your own with your interviewer and improve your chances of landing a job at a major company!

So without further ado, let’s get straight to the interview questions.

Introduction to the operating system

Explain what an operating system is

An operating system is software that manages hardware and software. It is the most important software running on the computer: it manages the computer's resources and processes as well as all of its hardware and software. It provides an intermediate layer between computer hardware and applications, separating application software from the hardware, so that we do not need to care about how the hardware is implemented and can focus on the application itself.

Typically, there are many applications running on the computer that need to interact with the memory and CPU, and the purpose of the operating system is to ensure that these accesses and interactions are accurate.

The main functions of the operating system

In general, modern operating systems mainly provide the following functions

  • Process management: the main job of process management is scheduling. On a single-core processor, the operating system assigns one task to a process at a time, so process management is simple; on a multi-core processor, the operating system not only has to assign tasks to processes but also has to solve the problems of scheduling, allocating, and reclaiming the processors.
  • Memory management: the operating system is responsible for allocating and reclaiming memory, allocating memory when a process needs it and reclaiming it when the process finishes, coordinating memory resources, and swapping pages in and out through a reasonable page replacement algorithm.
  • Device management: allocates devices according to specified device allocation principles so that devices and the host can work in parallel, and provides users with a good device interface.
  • File management: manages file storage space effectively, organizes and manages the file system reasonably, and provides more effective methods and means for file access and file protection.
  • Providing a user interface: the operating system provides an interface to applications and hardware, enabling users to operate the hardware and achieve the desired functions by making system calls through applications.

Several ways in which software can access hardware

Software accessing hardware is essentially an I/O operation: when software accesses hardware, that is I/O.

Hardware I/O is roughly divided into parallel and serial, corresponding to parallel and serial interfaces.

With the development of computer technology, I/O control methods are also developing. There are three principles for selecting and measuring I/O control methods

(1) The data transmission speed is fast enough to meet the needs of users without losing data;

(2) the system overhead is small and the processing control procedures required are few;

(3) It can give full play to the ability of hardware resources to make I/O devices as busy as possible and CPU waiting time as little as possible.

According to the above control principles, I/O operations can be divided into four categories

  • Direct program control: the user process directly controls the transfer of information between main memory (or the CPU) and the peripheral device. Direct program control is also known as busy/wait mode.
  • Interrupt-driven: in order to reduce CPU waiting time and improve the parallelism of the system, the interrupt mechanism was introduced. With interrupts, a peripheral only issues an interrupt request to the CPU when an operation ends normally or abnormally. While the I/O device is transferring each piece of data, the CPU does not need to intervene, so the CPU and the I/O device work in parallel to a certain extent.

The two methods above are both CPU-centric: data transfer is carried out by software, and this software-driven transfer limits the data transfer speed. The following two I/O control methods implement I/O control in hardware instead.

  • DMA (direct memory access): in order to further reduce the CPU's involvement in I/O operations, and to prevent data loss caused by too many devices operating in parallel or by speed mismatches, the DMA control mode was introduced.
  • Channel control mode: a channel is an input/output processor independent of the CPU that controls the exchange of data directly between a device and memory. It has its own channel instructions; it is started by the CPU and sends an interrupt signal to the CPU when the operation finishes.

What is the main purpose of an operating system

An operating system is a piece of software that serves three main purposes

  • Manages computer resources, including CPU, memory, disk drives, printers, and so on.
  • Provide a graphical interface, as we described earlier, that provides a bridge between the user and the computer.
  • Providing services to other software, the operating system interacts with the software in order to allocate any necessary resources to it to run.

What are the types of operating systems

The operating system is usually pre-installed before you buy the computer. Most users will use the default operating system, but you can upgrade or even change the operating system. But there are only three common operating systems: Windows, macOS, and Linux.

Why can’t Linux applications run directly on Windows

This is an old chestnut of a question; here is a concrete answer.

One of the reasons is that Linux and Windows use different executable formats. A format is essentially a protocol: meaningful data placed at fixed locations. The executable format under Linux is ELF, and you can use the readelf command to view an ELF file header.

The executable program under Windows is PE format, which is a portable executable file.

In Linux, the API for requesting kernel services is the system call, traditionally implemented with the soft interrupt int 0x80. In Windows, APIs are stored in dynamic-link library files, or DLLs as Windows developers call them, which are libraries containing code and data. A Linux executable does not obtain system resources the same way a Windows one does, so it obviously will not run on Windows.
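
As a quick illustration of the ELF point above, here is a minimal sketch in C that reads the first four bytes of an executable and checks for the ELF magic number. The path /bin/ls is only an example, and the check looks at just the magic bytes, not the full header that readelf prints.

    #include <stdio.h>
    #include <string.h>

    /* Minimal sketch: check whether a file starts with the ELF magic
     * number (0x7f 'E' 'L' 'F'). The path "/bin/ls" is just an example. */
    int main(void)
    {
        unsigned char ident[4];
        FILE *fp = fopen("/bin/ls", "rb");
        if (fp == NULL) {
            perror("fopen");
            return 1;
        }
        if (fread(ident, 1, sizeof(ident), fp) == sizeof(ident) &&
            memcmp(ident, "\x7f" "ELF", 4) == 0) {
            printf("This is an ELF file\n");
        } else {
            printf("Not an ELF file\n");
        }
        fclose(fp);
        return 0;
    }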

Operating system architecture

Monolithic systems

In most systems, the entire operating system runs as a single program in kernel mode. The whole operating system is written as a collection of procedures, linked together into one large binary executable; such a system is called a monolithic system.

To construct the actual object program of a monolithic system, all the individual procedures (or the files containing them) are first compiled and then bound together into a single executable using the system linker.

In a monolithic system, there is one service procedure for each system call, which takes care of it and executes it. A set of utility procedures is also needed to do the things the service procedures require, such as fetching data from user programs. This organization suggests a three-layer model: a main program, a set of service procedures that carry out the system calls, and a set of utility procedures.

Many operating systems support loadable extensions beyond the core operating system that is loaded when the computer first boots, such as I/O device drivers and file systems. These parts can be loaded on demand. In UNIX they are called shared libraries, and in Windows they are called dynamic-link libraries (DLLs). Their file extension is .dll, and there are more than 1000 DLL files in C:\Windows\System32, so don't delete files on drive C carelessly, or things may blow up.

Layered system

Layered systems use layers to separate different functional units. Each layer communicates only with the upper and lower layers of that layer. Each layer uses the layer below to perform its functions. Communication between layers is via predefined fixed interfaces.

The microkernel

To achieve high reliability, the operating system is divided into small, well-defined modules; only one of them, the microkernel, runs in kernel mode, and the rest run as ordinary user processes. Because each device driver and file system runs as a separate ordinary user process, a bug in one of these modules can crash that module, but not the entire system.

MINIX 3 is a masterpiece of the microkernel, and its structure is as follows

Outside the kernel, the system is structured into three layers, all running in user mode, with the device drivers at the bottom. Because they run in user mode, they cannot physically access the I/O port space or issue I/O commands directly. Instead, to program an I/O device, a driver builds a structure specifying which values are to be written to which I/O ports and then makes a kernel call, asking the kernel to carry out the writes.

Client-server mode

The microkernel strategy divides processes into two kinds: servers, each of which provides some service, and clients, which use those services. This pattern is called the client-server model.

The client-server model comes in two variants. In one case, the client and the server run on the same computer, in which case the operating system can make certain optimizations; but it is also common for the client and the server to be on different machines, connected over a LAN or WAN.

The client communicates with the server by sending messages. The client does not need to know whether the messages are handled on its own machine or sent across a network to a remote machine; from the client's point of view both cases look the same: a request is sent and a response comes back.

Why is it called trapping into the kernel

If we draw the software structure in layers, it looks something like this, with applications on the outer layer and the operating system kernel on the inner layer.

Applications run at privilege level 3, and the operating system kernel runs at privilege level 0. When a user program wants to use an operating system resource, it makes a system call and traps into the kernel: the CPU switches to kernel mode and executes kernel code. As for why it is called "trapping" (falling into), you can see from the picture that the kernel is drawn as a sunken structure, which gives the feeling of falling in, hence the name.

What are user mode and kernel mode

User mode and kernel mode are two operating states of the operating system.

  • Kernel mode: a CPU in kernel mode can access any data, including peripheral devices such as network cards and hard disks. A CPU in kernel mode can switch from one program to another, and its occupation of the CPU will not be preempted. Generally, kernel mode corresponds to privilege level 0.

  • User mode: a CPU in user mode can only access memory in a restricted way and is not allowed to access peripheral devices directly. A CPU in user mode cannot be monopolized; that is, the CPU can be taken over by other programs.

So why distinguish user mode and kernel mode at all?

This is mainly a matter of access rights. There are dangerous operations in a computer, such as setting the clock or clearing memory, which must be done in kernel mode. If any program could perform these operations at will, your system would crash over and over again.

How to switch between user and kernel mode?

All user processes run in user mode, but as mentioned above, user programs have limited access rights. Some important operations, such as reading data from the hard disk or getting data from the keyboard, can only be done in kernel mode, and this data is very important to user programs. So two mode transitions are involved, namely user mode -> kernel mode -> user mode; the only thing that can trigger these transitions is a system call, and system calls are carried out by the operating system.

The transition from user mode to kernel mode is called trapping into the kernel, and the instruction that causes it is called a trap instruction.

Their workflow is as follows:

  • First, the user program calls glibc. glibc is a standard library, and also a core set of libraries that defines key APIs.
  • The glibc library knows how to make system calls for different architectures: it prepares the system call by setting up the parameters passed by the user process according to the architecture's application binary interface (ABI).
  • The glibc library then invokes the software interrupt instruction (SWI). This instruction updates the CPSR register, switches the mode to supervisor mode, and then jumps to address 0x08.
  • Up to this point, the whole flow is still in user mode. After the SWI instruction executes, the process is allowed to execute kernel code, and the MMU now allows kernel virtual memory access.
  • Starting from address 0x08, the process loads and jumps to the interrupt handler, which on ARM is vector_swi().
  • In vector_swi(), the system call number (SCNO) is extracted from the SWI instruction and then used as an index into the system call table sys_call_table to find the system call function.
  • After the system call has executed, the user-mode registers are restored and execution continues in user mode.
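
To make the glibc-wrapper-versus-trap idea concrete, here is a minimal Linux sketch (the architecture-specific details such as SWI and CPSR are hidden by the C library). The first call goes through the glibc wrapper, the second issues the same system call by number through syscall(). This only illustrates the path described above; it is not the ARM-specific mechanism itself.

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        const char msg[] = "hello from a system call\n";

        /* Path 1: the glibc wrapper prepares the arguments and traps into the kernel. */
        write(STDOUT_FILENO, msg, sizeof(msg) - 1);

        /* Path 2: bypass the wrapper and issue the same system call by number. */
        syscall(SYS_write, STDOUT_FILENO, msg, sizeof(msg) - 1);

        return 0;
    }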

What is the kernel

In computing, the kernel is the computer program at the core of the operating system; it controls everything in the operating system. The kernel is usually the first program loaded after the boot loader.

Here we also need to understand what a boot loader is.

A boot loader, also known as a bootstrap loader, is a program that loads the computer's operating system into memory. When the power is turned on or the computer restarts, the BIOS performs some initial tests and then transfers control to the master boot record (MBR), where the boot loader resides.

What is a real-time system

A real-time operating system places strict requirements on time. Real-time operating systems come in two kinds: hard real-time and soft real-time.

A hard real-time operating system requires that an action absolutely must be completed or occur at a specified time. For example, in a car factory, the welding machines must weld at a certain moment; welding too early or too late causes permanent damage to the car.

A soft real-time operating system would prefer not to miss a deadline occasionally, but an occasional miss is acceptable and causes no permanent damage. Digital audio, multimedia, and mobile phones are examples of soft real-time systems.

You can understand hard real time and soft real time by two criteria: whether the deadline absolutely must be met at that moment, and whether missing it causes serious damage.

Start the Linux operating system

After the computer is powered on, the BIOS performs a power-on self-test (POST) to check and initialize the hardware, because the operating system will use the disk, screen, keyboard, mouse and other devices. Next, the first sector of the disk, known as the MBR (Master Boot Record), is read into a fixed memory area and executed. This sector contains a very small program of 512 bytes. The program loads a boot program from the disk, and that boot program copies itself into high memory in order to free low memory for the operating system.

After the copy completes, the boot program reads the root directory of the boot device; the boot program must understand the file system and directory formats. Then the boot program reads in the kernel and transfers control to it. At this point the boot program has done its job and the system kernel starts running.

The kernel startup code is written in assembly language and mainly includes setting up the kernel stack, identifying the CPU type, calculating the amount of memory, disabling interrupts, enabling the memory management unit, and so on; it then calls the main function of the C-language part to run the rest of the operating system.

This part also does a number of things. First, a message buffer is allocated to help debug problems; debugging information is written into this buffer, and if something goes wrong, the information can be retrieved by a diagnostic program.

Then the operating system automatically configures devices: it probes for devices and loads configuration files; if a probed device responds, it is added to the table of attached devices; if it does not respond, it is classified as not connected and ignored.

After all the hardware is configured, the next thing to do is carefully set up process 0 by hand: set up its stack, run it, perform initialization, configure the clock, and mount the file system. Then the init process (process 1) and the daemon (process 2) are created.

The init process checks a flag to determine whether it should serve a single user or multiple users. In the former case, it calls fork to create a shell process and waits for that process to terminate. In the latter case, it calls fork to create a process that runs the system initialization shell script (i.e. /etc/rc), which checks file system consistency, mounts file systems, starts daemons, and so on.

Then this /etc/rc process reads /etc/ttys, which lists all terminals and their attributes. For each enabled terminal, the process calls fork to create a copy of itself, which does some internal processing and then runs a program called getty.

The getty program prints on the terminal

login:

and waits for the user to enter a user name. After the user name is entered, getty exits and the /bin/login program starts running. login prompts for a password and compares it with the one saved in /etc/passwd. If the password is correct, login replaces itself with a user shell program and waits for the first command. If it is incorrect, login asks for the user name again.

The entire system startup process is as follows

Processes and threads

Advantages of multiprocessing systems

As the number of processors has grown, our computer systems have changed from single-processor systems to multiprocessor systems. Multiprocessor systems have relatively high throughput; they contain multiple processors working in parallel that share the clock, memory, bus, peripheral devices, and so on.

Multiprocessing systems can save money by sharing resources. The reliability of the whole system is also improved.

What are processes and process tables

A process is an instance of an executing program. For example, a Web program is a process, a shell is a process, and the article editor Typora is a process.

The operating system is responsible for managing all running processes, allocating a specific amount of time to each process to occupy CPU, and allocating specific resources to each process.

The operating system maintains a process table to keep track of the activity status of each process. Inside the process table, the status of each process and the resources used by each process are listed.
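
As a rough illustration of what a process-table entry might hold, here is a hypothetical, heavily simplified sketch; real kernels (for example, Linux's task_struct) keep far more fields than this.

    #include <stdio.h>

    /* Hypothetical, highly simplified process-table entry, just to
     * illustrate the bookkeeping described above. */
    enum proc_state { READY, RUNNING, BLOCKED };

    struct process_entry {
        int             pid;             /* process identifier */
        enum proc_state state;           /* scheduling state */
        unsigned long   program_counter; /* where to resume execution */
        unsigned long   stack_pointer;   /* saved stack pointer */
        long            cpu_time_used;   /* accounting information */
        int             open_files[16];  /* resources held by the process */
    };

    int main(void)
    {
        struct process_entry p = { .pid = 1, .state = READY };
        printf("process %d is in state %d\n", p.pid, p.state);
        return 0;
    }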

What is a thread and the difference between a thread and a process

This is another age-old question; let's answer it from the operating system's perspective.

We mentioned above that a process is an instance of a running program; a thread is a single flow of execution within a process. Threads are also called lightweight processes because they possess some of the properties of a process. If the browser is a process, each tab page in the browser can be thought of as a thread.

The following shows the difference between threads and processes in terms of the resources they hold

Threads are not as independent as processes, and they share data

The overhead of creating a thread is much lower than that of creating a process, because creating a thread only requires a stack and a program counter, while creating a process requires the operating system to allocate a new address space, data resources, and so on, which is expensive.
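
The cost and sharing differences are easy to see in code. The sketch below assumes a POSIX system and compilation with -pthread: the thread shares the creating process's address space, so its increment of the counter is visible, while the forked child only modifies its own copy.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* The thread shares the parent's address space, so it sees this counter. */
    static int counter = 0;

    static void *worker(void *arg)
    {
        counter++;                     /* visible to the creating thread */
        return NULL;
    }

    int main(void)
    {
        /* Creating a thread: only a stack and registers are set up. */
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);
        printf("after thread: counter = %d\n", counter);   /* 1 */

        /* Creating a process: the child gets its own copy of the address space. */
        pid_t pid = fork();
        if (pid == 0) {
            counter++;                 /* modifies the child's copy only */
            exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("after fork:   counter = %d\n", counter);   /* still 1 */
        return 0;
    }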

What is context switching

On a single-core, single-threaded CPU, only one instruction can execute at a time. A context switch is the mechanism for reallocating the CPU from one process to another. From the user's point of view, the computer appears to run multiple processes in parallel precisely because the operating system switches contexts rapidly. During a switch, the operating system first saves the state of the current process (including its memory-space pointers, the currently executing instruction, and so on), then loads the state of the next process and runs it.

What are the benefits of using multiple threads

Multithreading is one of the basic things programmers have to know, so here are some of the benefits of multithreading

  • Improved responsiveness to users
  • Resource sharing within a process
  • Economy: threads are cheaper to create and switch than processes
  • Better utilization of multiprocessor architectures

How the process terminates

Termination of a process

After a process is created, it starts running and does its work. However, nothing lasts forever, including processes. A process will terminate sooner or later, usually for one of the following reasons

  • Normal exit (voluntary)
  • Wrong exit (voluntary)
  • Serious error (involuntary)
  • Killed by another process (involuntarily)

The normal exit

Most processes terminate because they have finished their work. When a compiler has finished compiling the program given to it, it makes a system call to tell the operating system that it is done. This call is exit on UNIX and ExitProcess on Windows. Screen-oriented software also supports voluntary termination: word processors, Internet browsers and similar programs always have an icon or menu item that the user can click to tell the process to delete any temporary files it has open and then terminate.

Exit on error

The second reason a process terminates is that it discovers a serious error. For example, suppose a user runs the command

cc foo.c	

in order to compile foo.c, but the file does not exist; the compiler simply announces the error and exits. When given bad parameters, an interactive process usually does not exit directly, because that would be unreasonable from the user's point of view: the user needs to know what happened and may want to try again. So applications usually pop up a dialog box telling the user that an error occurred and asking whether to retry or quit.

Serious mistakes

The third cause of process termination is an error caused by the process itself, usually due to a bug in the program: executing an illegal instruction, referencing memory that does not exist, dividing by zero, and so on. In some systems, such as UNIX, a process can tell the operating system that it wants to handle certain kinds of errors itself, in which case it receives a signal (is interrupted) rather than being terminated when such an error occurs.

Killed by another process

The fourth reason a process terminates is that some process makes a system call telling the operating system to kill another process. In UNIX that system call is kill. The corresponding Win32 function is TerminateProcess (note that it is not a system call).
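
Here is a minimal POSIX sketch of this last case: a parent forks a child and then asks the kernel to kill it with the kill system call.

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            while (1)              /* child loops until it is killed */
                pause();
        }
        sleep(1);                  /* give the child time to start */
        kill(pid, SIGKILL);        /* parent asks the kernel to kill the child */
        waitpid(pid, NULL, 0);     /* reap the terminated child */
        printf("child %d was killed\n", (int)pid);
        return 0;
    }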

Communication between processes

There are many ways to communicate between processes. First, you need to understand the following concepts

  • Race conditions: when two or more threads modify shared data at the same time and the outcome affects the correctness of the program, this is called a race condition.

  • Critical sections: it is not only shared resources that cause race conditions; in fact, shared files and shared memory can also cause them. So how can we avoid them? We must forbid more than one process from reading or writing shared resources (including shared memory and shared files) at the same time. In other words, we need mutual exclusion: if one process is using a shared variable or file in a certain way, other processes are prohibited from doing the same thing (accessing the same resource).

    A good solution should include the following four conditions

    1. At no time can two processes be in a critical region at the same time
    2. No assumptions should be made about the speed and number of CPUs
    3. Processes outside a critical area must not block other processes
    4. You cannot make any process wait indefinitely to enter a critical section

  • Busy-waiting mutual exclusion: while one process is modifying a resource, other processes must wait. Processes must access the resource mutually exclusively. The solutions discussed here are based on busy-waiting mutual exclusion. (A small mutex-based sketch follows this list.)
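
Here is the mutex sketch promised above: a minimal POSIX-threads example (compile with -pthread) in which a lock turns the shared increment into a critical section, so the race condition described earlier cannot occur.

    #include <pthread.h>
    #include <stdio.h>

    /* A mutex turns the increment into a critical section, so the race
     * condition described above cannot occur. */
    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);    /* enter the critical section */
            counter++;                    /* shared data modified exclusively */
            pthread_mutex_unlock(&lock);  /* leave the critical section */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);   /* always 200000 with the lock */
        return 0;
    }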

The formal term for inter-process communication is Inter-Process Communication (IPC), which mainly includes the following 7 types of communication

  • Message passing: message passing is a mechanism for processes to communicate and to synchronize by waiting. With message passing, processes can communicate directly without sharing variables. Message passing involves a sender and a receiver.
  • FIFO queue: a FIFO allows communication between two unrelated processes; the two processes can exchange data with each other. This is a full-duplex communication mode.
  • Pipe: a pipe is used for communication between two related processes. It is a half-duplex communication mode; if full duplex is needed, a second pipe is required. (A small pipe example follows this list.)
  • Direct communication: in this mode of process communication, only one link exists between the two processes, and the communicating parties must name each other explicitly.
  • Indirect communication: the communicating parties do not establish a connection directly but go through an intermediary, such as a mailbox. A process can place messages in it and remove messages from it, thereby achieving inter-process communication.
  • Message queue: a message queue is a linked list of messages stored in the kernel, identified by a message queue identifier; it provides a full-duplex communication channel between different processes.
  • Shared memory: shared memory establishes communication through a region of memory shared among the processes. This type requires the processes to synchronize their accesses in order to protect one another.
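
And here is the small pipe example referred to in the list: a minimal POSIX sketch of half-duplex communication between a parent and its child.

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    /* Minimal sketch of the pipe described above: a half-duplex channel
     * between a parent and its child. */
    int main(void)
    {
        int fd[2];
        char buf[64];

        if (pipe(fd) == -1)
            return 1;

        if (fork() == 0) {                 /* child: writes into the pipe */
            close(fd[0]);
            const char msg[] = "hello from the child";
            write(fd[1], msg, sizeof(msg));
            close(fd[1]);
            return 0;
        }

        close(fd[1]);                      /* parent: reads from the pipe */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("parent received: %s\n", buf);
        }
        close(fd[0]);
        wait(NULL);
        return 0;
    }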

Process state models

Three-state model of processes

When a process starts running, it may go through one of these states

There are three states involved in the diagram

  1. Running state: the process is actually using the CPU (consuming its CPU time slice) at that moment
  2. Ready state: the process is runnable, but is temporarily paused because another process is running
  3. Blocked state: also known as the sleeping state; the process is unable to run until some external event happens.

Logically, the running state and the ready state are very similar: in both cases the process is willing to run, but in the second case it has not been given a CPU time slice. The third state differs from the first two in that the process cannot run, even if the CPU is idle.

These three states involve four transitions between them. Transition 1 occurs when the operating system finds that a process cannot continue. In some systems a process makes a system call, such as pause, to enter the blocked state. In other systems, including UNIX, when a process reads from a pipe or a special file (such as a terminal) and no input is available, the process is automatically blocked.

Transitions 2 and 3 are both caused by the process scheduler (a part of the operating system), without the process itself even knowing about them. Transition 2 occurs when the scheduler decides that the current process has run long enough and it is time to give another process some CPU time. Transition 3 occurs when all the other processes have had their share and it is time for the first process to get the CPU again.

Scheduling refers to deciding which processes should be run first and for how long, which is important. Many algorithms have been designed to try to balance the overall efficiency of the system against the competing requirements of the various processes.

Transition 4 occurs when the external event a process is waiting for happens, such as the arrival of some input from outside. If no other process is running at that moment, transition 3 is triggered immediately and the process starts running; otherwise the process stays in the ready state, waiting for the CPU to become free so that its turn comes.

Five-state model of process

On the basis of the three-state model, two states are added, namely, the new state and the termination state.

  • New state: The new state of a process is when the process is first created

Creating a process requires two steps: allocate the required resources and space for the new process, set the process to ready state, and wait for scheduling to execute.

  • Terminating state: The terminating state of a process is when the process finishes executing, reaches its end point, or has to terminate because of an error.

Terminating a process requires two steps:

  1. Wait for the operating system or the related processes to perform the cleanup.

  2. The occupied resources are reclaimed and deleted by the system.

What are the scheduling algorithms

Scheduling algorithms can be divided into three categories: batch scheduling, interactive scheduling and real-time scheduling

Scheduling in batch processing

First come, first served

It's first come, first served, just like it sounds. Perhaps the simplest non-preemptive scheduling algorithm is first-come, first-served. With this algorithm, the CPU is allocated to processes in the order in which they request it. At bottom there is a single queue of ready processes. When the first job enters the system from outside, it is started immediately and allowed to run as long as it wants; it will not be interrupted for running too long. As other jobs come in, they are placed at the end of the ready queue. When the running process blocks, the first process in the queue runs next. When a blocked process becomes ready again, it is placed at the end of the queue like a newly arrived job, behind all the other processes.

The strength of this algorithm is that it is easy to understand and to program. In this algorithm, a single linked list keeps track of all ready processes. Picking a process to run just means removing one from the head of the queue; adding a new job or an unblocked process just means appending it to the end of the queue. It is a very simple implementation.

However, if 100 I/O-bound processes are queued up and the 101st is CPU-bound, you would have to wait for the 100 I/O-bound processes to finish before the CPU-bound process gets to run. This is unacceptable in practice, so priorities or preemption are needed so that important processes can run first.

Shortest job first

In batch systems, the second scheduling algorithm is Shortest Job First; here we assume the running times are known in advance. An insurance company, for example, can predict with considerable accuracy how long it takes to process a batch of 1,000 claims, because it does similar work every day. When several equally important jobs are sitting in the input queue waiting to be started, the scheduler should use the shortest job first algorithm.

As shown in figure A, there are four jobs A, B, C, and D, with running times of 8, 4, 4, and 4 minutes respectively. If running in the order shown in the figure, the turnaround time of A is 8 minutes, B is 12 minutes, C is 16 minutes, D is 20 minutes, and the average time is 14 minutes.

Now consider running the four jobs shortest job first, as shown in figure B. The turnaround times are now 4, 8, 12 and 20 minutes, an average of 11 minutes, and it can be proved that shortest job first is optimal. Consider four jobs with running times a, b, c and d respectively. The first job finishes at time a, the second at time a + b, and so on, so the average turnaround time is (4a + 3b + 2c + d) / 4. Clearly a contributes the most to the average, so the shortest job should be scheduled first, followed by b, then c, and finally d, which only affects its own turnaround time.

It is important to note that shortest job first is only optimal when all the jobs are available at the same time.
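
The arithmetic above is easy to reproduce; the small sketch below computes the average turnaround time for the two orderings of the same four jobs.

    #include <stdio.h>

    /* Reproduce the arithmetic above: average turnaround time for the four
     * jobs (8, 4, 4, 4 minutes) under two different orderings. */
    static double avg_turnaround(const int runtime[], int n)
    {
        int finish = 0, total = 0;
        for (int i = 0; i < n; i++) {
            finish += runtime[i];   /* each job finishes after all earlier ones */
            total += finish;        /* its turnaround time equals its finish time */
        }
        return (double)total / n;
    }

    int main(void)
    {
        int fcfs[] = {8, 4, 4, 4};  /* A, B, C, D in arrival order */
        int sjf[]  = {4, 4, 4, 8};  /* shortest job first: B, C, D, A */
        printf("FCFS order: %.1f minutes\n", avg_turnaround(fcfs, 4)); /* 14.0 */
        printf("SJF  order: %.1f minutes\n", avg_turnaround(sjf, 4));  /* 11.0 */
        return 0;
    }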

Minimum remaining time is preferred

The preemptive version of shortest job first is known as the Shortest Remaining Time Next algorithm. With this algorithm, the scheduler always picks the process whose remaining running time is shortest. When a new job arrives, its total time is compared with the remaining time of the current process; if the new job needs less time than the current one, the current process is suspended and the new one is run. In this way, short jobs get good service.

Scheduling in interactive systems

Interactive systems are so common on PCs, servers and other machines that it is worth discussing interactive scheduling.

Round-robin scheduling

One of the oldest, simplest, fairest and most widely used algorithms is round robin. Each process is assigned a time interval, called a quantum, during which it is allowed to run. If the process is still running when its quantum expires, the CPU is preempted and given to another process. If the process blocks or finishes before the quantum expires, the CPU is switched immediately. Round robin is easy to implement: all the scheduler has to do is maintain a list of runnable processes, as in A in the figure below, and move a process to the end of the list when it uses up its quantum, as in B in the figure below.

Priority scheduling

Not all processes are equally important. For example, in a university hierarchy, first comes the dean, then the professors, secretaries, support staff, and finally the students. Taking such external factors into account leads to priority scheduling.

The basic idea is clear: each process is assigned a priority, and the process with the highest priority runs first.

This does not mean that high-priority processes can run forever, however, as the scheduler reduces the priority of the currently running process during each clock interrupt. If this action causes its priority to drop below that of the next highest process, a process switch occurs. Alternatively, each process can be assigned a maximum interval of time allowed to run. When the interval runs out, the next higher-priority process gets a chance to run.

Shortest process first

For batch systems, shortest job first tends to give the shortest response times, so it would be nice to use it for interactive processes too. One approach is to make estimates based on a process's past behavior and run the process with the shortest estimated running time. Suppose the estimated running time of a command on some terminal is T0, and its next measured run time is T1; then the estimate can be updated by taking a weighted sum of these two values, namely aT0 + (1 - a)T1. By choosing the value of a, you can decide whether old run times are forgotten quickly or remembered for a long time. With a = 1/2 you get the following sequence

As you can see, after three rounds, T0’s share of the new estimate has dropped to 1/8.

This technique of estimating the next value by taking the weighted average of the current measured value and the previous estimate is sometimes called aging. It applies to many situations where a prediction must be made based on current values.
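
A tiny sketch of the aging formula with a = 1/2; the measured run times below are made up purely for illustration.

    #include <stdio.h>

    /* Aging estimate: new_estimate = a * old_estimate + (1 - a) * measured,
     * with a = 1/2. The measured values are hypothetical. */
    int main(void)
    {
        double a = 0.5;
        double estimate = 10.0;                     /* initial estimate T0 */
        double measured[] = {6.0, 4.0, 8.0, 5.0};   /* hypothetical measurements */

        for (int i = 0; i < 4; i++) {
            estimate = a * estimate + (1 - a) * measured[i];
            printf("round %d: estimate = %.3f\n", i + 1, estimate);
        }
        /* After three rounds the weight of the original T0 has dropped to 1/8. */
        return 0;
    }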

Lottery scheduling

There is an algorithm that both gives predictable results and is relatively simple to implement: lottery scheduling. Its basic idea is to give processes lottery tickets for various system resources. When a scheduling decision has to be made, a ticket is drawn at random, and the process holding that ticket gets the resource. For CPU scheduling, for example, the system could hold a drawing 50 times per second, with each winner receiving extra running time as its prize.

Think of a lottery ticket as a buff that, say, gives you a 15% chance of dropping speed boots.

Fair share scheduling

If user 1 starts nine processes and user 2 starts one, using rotation or the same priority scheduling algorithm, user 1 will get 90% CPU time and user 2 will get 10% CPU time.

To prevent this, some systems take a process's owner into account before scheduling it. In this model, each user is allocated a share of the CPU, and the scheduler picks processes so as to enforce those shares. Thus if two users are each promised 50% of the CPU, they will each get 50% no matter how many processes they have.

What are the metrics that affect the scheduler

Several factors determine whether a scheduler is good or bad

  • CPU usage:

The percentage of time the CPU is executing a task (that is, not idle).

  • Waiting time

The total time the process spends waiting in the ready queue.

  • Throughput

The number of processes completed per unit of time

  • Response time

This is the time that passes between submitting the process and getting useful output.

  • Turnaround time

The time elapsed from submission to completion of the process.

What is RR scheduling algorithm

The round-robin (RR) scheduling algorithm is mainly aimed at time-sharing systems. RR scheduling gives each process the same time slice, cycling through them in turn. RR scheduling has no notion of priority, and the algorithm is simple to implement; since every process gets a time slice in turn, there is no starvation problem.

Memory Management

What is on-demand paging

In an operating system, processes are loaded into memory page by page, and demand paging is a virtual memory management technique. In a system using demand paging, the operating system copies a page from disk into memory only when an attempt to access that page finds it is not already in memory, that is, when a page fault occurs.

What is virtual memory

Virtual memory is a memory allocation scheme, a mechanism that can be used to assist memory management. As we know, applications are loaded into memory page by page, but not all pages are loaded into memory; the computer's hardware and software make up for the shortage of RAM by temporarily moving data from RAM to disk. If there were no virtual memory, then once memory filled up the computer would say to you

Uh, no, sorry, you cannot load any more applications. Please close an application to load a new one. With virtual memory, the computer can instead look for regions of memory that have not been used recently and copy them to the hard disk, freeing up RAM; that is why you can install and run so many programs. The copying happens automatically, so you are not even aware it is going on.

The implementation of virtual memory

With virtual memory, a job can be brought into memory in several pieces at different times. With contiguous allocation, a large part of the memory space would sit temporarily or permanently idle, which wastes memory badly and makes it impossible to logically expand memory capacity. Therefore, virtual memory is actually built on top of discrete (non-contiguous) memory management. Virtual memory can be implemented in three ways:

  • Request paging storage management.
  • Request segmented storage management.
  • Request segment-page storage management.

Either way, you need some hardware support. Support is generally required in the following aspects:

  • A certain amount of memory and external storage.
  • Page table mechanism (or segment table mechanism), as the primary data structure.
  • Interrupt mechanism: when the user program accesses a part that has not yet been brought into memory, an interrupt (page fault) is raised.
  • Address translation mechanism: translates logical addresses to physical addresses.

Why is memory segmented

Memory is a random-access device: to access it you do not have to start from the beginning, you just give the address. Memory segmentation goes back to the 8086 CPU. The 8086 has 16-bit registers, and a 16-bit register can express a range of 2 to the 16th power, that is, 64 KB. If two identical programs were both compiled to the same addresses, they could not run at the same time. To solve this problem, the designers had the CPU access memory using a segment base address plus an offset within the segment. The advantage is that programs can be relocated, and this is the original reason memory was segmented.

So what is relocation?

Simply put, relocation means changing the instruction addresses in a program to different addresses, while the content stored there is still the original content.

Since the CPU accesses memory in the form of segment base address + segment offset, special registers are needed to hold the segment bases; these are CS, DS, ES and so on. If you are not familiar with registers, you can read my article:

Love it, love it: this piece on registers is kind of interesting

In other words, for a program to use any piece of memory, it first needs to load the appropriate segment base into a segment register and then give an offset address relative to that segment base. The address adder in the CPU combines the two addresses and sends the result over the address bus to memory.

The 8086 CPU has a 20-bit address bus, so its maximum addressing capacity is 1 MB, while a segment base register is only 16 bits wide, giving at most 64 KB, which obviously does not cover the full 1 MB range; hence memory has to be segmented. Each segment can address at most 64 KB, which still does not reach 1 MB, so an offset address is needed. The offset is also held in a 16-bit register, so it too can only address 64 KB, which again is not enough for 1 MB. So the CPU designers changed the address calculation: the segment base address is shifted left by four bits and then added to the 16-bit offset within the segment, which achieves 1 MB of addressing capability. So the second purpose of memory segmentation is to be able to access all of memory.
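
The address calculation is simple enough to show directly. A small sketch, with example segment and offset values chosen arbitrarily:

    #include <stdio.h>
    #include <stdint.h>

    /* Sketch of 8086 real-mode address formation: the 16-bit segment base is
     * shifted left by 4 bits and added to the 16-bit offset, producing a
     * 20-bit physical address (up to 1 MB). */
    static uint32_t physical_address(uint16_t segment, uint16_t offset)
    {
        return ((uint32_t)segment << 4) + offset;
    }

    int main(void)
    {
        /* Example values: segment 0x1234, offset 0x0010 -> 0x12350 */
        printf("0x%05X\n", physical_address(0x1234, 0x0010));
        return 0;
    }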

Physical address, logical address, valid address, linear address, virtual address difference

A physical address is the real address in memory. It is like your house number: your house has exactly that number, and it is unique. No matter what kind of address you start from, it is eventually mapped to a physical address.

In real mode, segment base + segment offset is processed by the address adder, transmitted through the address bus, and eventually converted to a physical address.

In protected mode, however, segment base + segment offset is called a linear address, and the segment "base" is no longer an actual address. Instead, it is called a selector, which is an index, the subscript of an array; through it, the corresponding segment descriptor can be found in the GDT. The segment descriptor records the segment's starting address, its size, and other information, from which the real base address is obtained. If paging is not enabled, the linear address can be used directly as the physical address to access memory. If paging is enabled, the linear address goes by another name: the virtual address.

The offset within a segment is called the effective address in both real mode and protected mode. An effective address is also a logical address.

A linear address can be considered a virtual address. A virtual address is not a real physical address, but the virtual address will eventually be mapped to a physical address. Below is the mapping of virtual addresses -> physical addresses.

Free memory management

When the operating system allocates memory dynamically (malloc, new), it needs to keep track of free memory. Two methods are generally used: bitmaps and free lists.

Use bitmaps for management

With the bitmap approach, memory may be divided into allocation units as small as a few words or as large as several kilobytes. Each allocation unit corresponds to one bit in the bitmap: 0 means free and 1 means occupied (or vice versa). A region of memory and its corresponding bitmap are shown below.

Figure A shows a stretch of memory with five processes and three holes. The tick marks are allocation units, and the shaded regions are free (represented by 0 in the bitmap). Figure B shows the corresponding bitmap; figure C shows the same information as a linked list.

The size of the allocation unit is an important design issue: the smaller the allocation unit, the larger the bitmap. However, even with an allocation unit as small as 4 bytes, 32 bits of memory need only 1 bit in the bitmap; 32n bits of memory need an n-bit bitmap, so the bitmap takes up only 1/32 of memory. If a larger allocation unit is chosen, the bitmap is smaller, but if the size of a process is not an exact multiple of the allocation unit, some memory is wasted in its last allocation unit.

A bitmap provides a simple way to keep track of memory usage in a fixed amount of memory, because the size of the bitmap depends only on the size of memory and of the allocation unit. The problem with this approach is that when it has been decided to bring a process of k allocation units into memory, the memory manager must search the bitmap for a run of k consecutive 0 bits. The downside of bitmaps is that searching for a run of zeros of a given length is a slow operation. (You can think of it as looking for a long run of free elements in a scattered array.)
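
A minimal sketch of the bitmap search described above; the bitmap contents are made up for illustration.

    #include <stdio.h>

    /* Bitmap-based allocation: scan the bitmap for a run of k consecutive
     * 0 bits (free allocation units). Bit values are illustrative. */
    #define UNITS 16

    static int bitmap[UNITS] = {1,1,0,0,0,1,0,0,0,0,1,1,0,1,0,0};

    /* Returns the index of the first run of k free units, or -1 if none exists. */
    static int find_free_run(int k)
    {
        int run = 0;
        for (int i = 0; i < UNITS; i++) {
            run = (bitmap[i] == 0) ? run + 1 : 0;
            if (run == k)
                return i - k + 1;
        }
        return -1;
    }

    int main(void)
    {
        printf("4 free units start at index %d\n", find_free_run(4)); /* 6 */
        return 0;
    }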

Use free linked lists

Another way of keeping track of memory is to maintain a linked list of allocated and free memory segments, where each segment is either a process or a hole between two processes. Figure C above shows memory usage represented this way. Each entry in the list records whether it is a hole (H) or a process (P), the address at which it starts, its length, and a pointer to the next entry.

In this example, the segment list is kept sorted by address. The advantage of this approach is that when a process terminates or is swapped out, updating the list is straightforward. A terminating process usually has two neighbors (except when it is at the very top or bottom of memory); these neighbors may be processes or holes, giving four possible combinations when merging.

When storing processes and free areas in a linked list by address, there are several algorithms that can allocate memory for processes that are created (or swapped in from disk).

  • First fit: the memory manager scans along the list until it finds the first hole that is big enough. Unless the hole is exactly the right size for the process, it is split into two parts, one for the process and one that becomes a new, smaller hole. First fit is a fast algorithm because it searches as few list nodes as possible. (A small first-fit sketch follows this list.)
  • Next fit: works the same way as first fit, except that each time it finds a suitable hole it remembers where it stopped. The next search starts from the place where it left off, instead of always starting from the beginning as first fit does.
  • Best fit: searches the entire list to find the smallest hole that can hold the process. The problem is that although this does find the hole that fits best, in most cases the hole is still split in two, with the leftover part being a tiny hole that is hard to reuse.
  • Worst fit: the opposite of best fit; each allocation takes the largest available hole, so that the hole left over after splitting is still big enough to be useful.
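
As promised in the list above, here is a minimal first-fit sketch over a toy free list. The hole addresses and sizes are made up; a real allocator would also keep the list of allocated segments and merge adjacent holes on free.

    #include <stdio.h>

    /* First fit over a free list: each node describes a hole with a start
     * address and a length; the first hole big enough is split. */
    struct hole { int start; int length; };

    static struct hole holes[] = { {0, 5}, {10, 20}, {35, 8} };
    static int nholes = 3;

    /* Returns the start address of the allocated block, or -1 on failure. */
    static int first_fit(int size)
    {
        for (int i = 0; i < nholes; i++) {
            if (holes[i].length >= size) {
                int addr = holes[i].start;
                holes[i].start += size;     /* the remainder stays a (smaller) hole */
                holes[i].length -= size;
                return addr;
            }
        }
        return -1;
    }

    int main(void)
    {
        printf("allocated at %d\n", first_fit(12));  /* first hole >= 12 is {10,20} */
        return 0;
    }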

What are page replacement algorithms

During address mapping, if the page being accessed is found not to be in memory, a page fault is generated. When a page fault occurs and there are no free page frames left, the operating system must select a page in memory and move it out to make room for the page being brought in. The rules used to choose which page to evict are called page replacement algorithms.

My summary of these page replacement algorithms below is fairly complete; I only give a brief introduction to each, and readers can study each algorithm's principle and implementation on their own.

  • The optimal algorithm: evicts the page that will not be referenced for the longest time in the future. Unfortunately, there is no way to know which page that will be, so in practice the algorithm cannot be used. However, it is useful as a yardstick against which other algorithms can be measured.
  • NRU (Not Recently Used): classifies pages into four categories based on the states of the R and M bits, and evicts a page at random from the lowest-numbered non-empty category. NRU is easy to implement, but its performance is not very good; there are better algorithms.
  • FIFO: keeps track of the order in which pages were loaded into memory by keeping them in a linked list, and always evicts the oldest page. But the oldest page may still be in use, so this algorithm is not a good choice either.
  • Second chance: a modification of FIFO that checks whether a page is still in use before evicting it; if it is, the page is kept. This modification greatly improves performance.
  • Clock: the clock algorithm is another implementation of second chance. It performs about the same, but takes less time to execute. (A small sketch of the clock algorithm is given below.)
  • LRU (Least Recently Used): an excellent algorithm, but difficult to implement without special hardware (TLB). Without that hardware, LRU cannot be used directly.
  • NFU (Not Frequently Used): similar in spirit to LRU, but its performance is not very good.
  • Aging: an algorithm that approximates LRU closely and can be implemented efficiently, so it is a good choice.
  • The last two, the working set algorithm and WSClock, are based on the working set. The working set algorithm gives reasonable performance, but it is expensive to implement. WSClock is a variant that not only performs well but can also be implemented efficiently.

The best algorithms are aging and WSClock. They are based on LRU and the working set respectively; both give good paging performance and can be implemented efficiently. Other good algorithms exist, but in practice these two are probably the most important.
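
As a concrete taste of these algorithms, here is a minimal sketch of the clock algorithm mentioned in the list; the frame contents and reference bits are made up for illustration.

    #include <stdio.h>

    /* Clock algorithm: frames are arranged in a circle; the hand skips frames
     * whose reference bit R is set (clearing it) and evicts the first frame
     * whose R bit is 0. Values below are illustrative. */
    #define FRAMES 4

    static int page[FRAMES] = {7, 3, 5, 9};   /* page loaded in each frame */
    static int rbit[FRAMES] = {1, 0, 1, 1};   /* reference bits */
    static int hand = 0;                      /* current position of the clock hand */

    static int choose_victim(void)
    {
        for (;;) {
            if (rbit[hand] == 0) {
                int victim = hand;
                hand = (hand + 1) % FRAMES;
                return victim;                /* evict this frame */
            }
            rbit[hand] = 0;                   /* give the page a second chance */
            hand = (hand + 1) % FRAMES;
        }
    }

    int main(void)
    {
        int v = choose_victim();
        printf("evict frame %d (page %d)\n", v, page[v]);  /* frame 1, page 3 */
        return 0;
    }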

File System

A way to improve file system performance

Access to disk is much slower than memory, so it’s time to revisit this chart

So disk optimization is necessary; we discuss several techniques below.

The cache

The most common technique to reduce the number of disk accesses is to use block cache or buffer cache. A cache is a series of blocks that logically belong to disk but are actually held in memory for performance reasons.

Different algorithms exist for managing the cache, but a common one is to check all read requests to see whether the needed block is in the cache. If it is, the read can be satisfied without accessing the disk. If the block is not in the cache, it is first read into the cache and then copied to wherever it is needed. Subsequent requests for the same block are then satisfied from the cache.

The operation of caching is shown below

Because there are many blocks in the cache, some way is needed to determine quickly whether a required block is present. The usual method is to hash the device and disk address and look up the result in a hash table. Blocks with the same hash value are chained together in a linked list (doesn't this data structure look a lot like a HashMap?), so you can search along the collision chain for the block you want.

If the cache is full and a new block must be brought in, one of the existing blocks has to be evicted; if the evicted block has been modified since it was read in, it must be written back to disk first. This situation is very similar to paging.

Block read ahead of time

The second significant way to improve file system performance is to try to get blocks into the cache before they are needed, thereby raising the hit rate. Many files are read sequentially. When the file system is asked to produce block k of a file, it does so, but when it is finished it checks the cache to see whether block k + 1 is already there. If it is not, the file system schedules a read-ahead for block k + 1, in the hope that the block will already be in the cache by the time it is needed.

Of course, the block-ahead read policy only applies to files that are actually read sequentially. For randomly accessed files, reading ahead doesn’t work at all. It can even get in the way.

Reduces the disk arm movement

Caching and read-ahead are not the only ways to improve file system performance. Another important technique is to place blocks that are likely to be accessed in sequence close together, preferably on the same cylinder, thereby reducing the number of disk arm movements. When writing an output file, the file system has to allocate disk blocks one after another, on demand. If free blocks are recorded in a bitmap and the whole bitmap is in memory, it is easy to choose a free block as close as possible to the previous one. With a free list, part of which is on disk, it is much harder to allocate adjacent free blocks.

However, even with a free list, block clustering can be used: disk storage is tracked in clusters of consecutive blocks rather than in single blocks. If a sector is 512 bytes, the system might use 1-KB blocks (2 sectors) but allocate disk storage in units of 2 blocks (4 sectors). This is not the same as having 2-KB disk blocks, since the cache still uses 1-KB blocks and data transfers between disk and memory are still 1 KB, but reading such a file sequentially on an otherwise idle system reduces the number of seeks by half, noticeably improving performance. A variation on this theme takes rotational positioning into account: when allocating blocks, the system tries to place the consecutive blocks of a file on the same cylinder.

Another performance bottleneck in systems that use inodes, or any inode-like system, is that reading a very short file also requires two disk accesses: one to the inode and one to the block. Typically, inodes are placed as shown below

Here all the inodes are placed near the beginning of the disk, so the average distance between an inode and the blocks it points to is about half the number of cylinders, which requires long seeks.

A simple improvement is to store the inode in the middle of the disk instead of at the beginning, so that the seek time between the inode and the first block is halved. Another approach is to divide the disk into cylinder groups, each with its own inodes, data blocks, and free tables, as shown in Figure B.

Of course, it only makes sense to talk about seek times and spin times if you have disk arms on the disk. More and more computers are using solid-state drives (SSDS), for which many of the problems of traditional hard drives disappear because the same manufacturing techniques used in flash memory allow random and sequential access to transfer at similar speeds. But it also raises new questions.

Disk defragmentation

After the initial installation of the operating system, files are constantly created and erased, resulting in fragmentation of the disk. When a file is created, its blocks are scattered all over the disk, reducing performance. Reclaiming disk blocks after deleting files may cause holes.

Disk performance can be restored by moving files so that they are adjacent to one another and by putting all (or at least most) of the free space into one or more large contiguous regions. Windows has a program called defrag that does exactly this, and Windows users should run it regularly, except on SSDs.

Defragmentation works better on some file systems than on others. Linux file systems (especially ext2 and ext3) generally suffer less from fragmentation than Windows because of the way disk blocks are chosen, so manual defragmentation is rarely required. Moreover, SSDs are not affected by fragmentation at all; in fact, defragmenting an SSD is counterproductive: it wears out the SSD rather than improving performance, so defragmentation will only shorten the SSD's life.

Disk arm scheduling algorithm

Generally, the time to read or write a disk block is determined by the following factors

  • Seek time – the time needed to move the disk arm to the proper cylinder
  • Rotational delay – the time spent waiting for the proper sector to rotate under the head
  • Data transfer time – the time needed to actually read or write the data

Together these three quantities make up a disk access. For most disks the seek time dominates, so reducing the seek time effectively improves disk performance.

If the disk driver accepts one request at a time and services requests in the order they arrive, that is, first-come, first-served (FCFS), little can be done to optimize the seek time: the arm simply goes wherever the next request happens to be, possibly travelling a long way while requests for nearby cylinders sit waiting in the queue.

Normally, while the arm is seeking on behalf of one request, other processes generate new disk requests. The driver keeps a table, indexed by cylinder number, in which the pending requests for each cylinder are chained together in a linked list headed by the corresponding table entry.

An alternative to first-come, first-served is the shortest seek first (SSF) algorithm, described below.

Suppose that while the arm is positioned at cylinder 6, requests for cylinders 11, 2, 4, 14, 8, 15, and 3 are pending. With first-come, first-served, the arm moves as shown in the figure below

The arm crosses 5 + 9 + 2 + 10 + 6 + 7 + 12 = 51 cylinders in total. With shortest seek first, the arm motion is

only 4 + 1 + 1 + 4 + 3 + 3 + 1 = 17 cylinders, roughly a third of the 51 needed by first-come, first-served.
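To make the arithmetic concrete, here is a minimal sketch that computes the total arm motion for both strategies on the request list above, with the head starting at cylinder 6. The function names fcfs and ssf are just illustrative; running the program reproduces the 51 and 17 figures.

```c
#include <stdio.h>
#include <stdlib.h>

#define N 7

/* First-come, first-served: serve the requests in arrival order. */
static int fcfs(int pos, const int *req)
{
    int moved = 0;
    for (int i = 0; i < N; i++) {
        moved += abs(req[i] - pos);
        pos = req[i];
    }
    return moved;
}

/* Shortest seek first: always serve the closest pending cylinder next. */
static int ssf(int pos, const int *req)
{
    int pending[N], moved = 0;
    for (int i = 0; i < N; i++)
        pending[i] = req[i];
    for (int served = 0; served < N; served++) {
        int best = -1;
        for (int i = 0; i < N; i++)
            if (pending[i] >= 0 &&
                (best < 0 || abs(pending[i] - pos) < abs(pending[best] - pos)))
                best = i;
        moved += abs(pending[best] - pos);
        pos = pending[best];
        pending[best] = -1;                  /* mark as served */
    }
    return moved;
}

int main(void)
{
    int req[N] = { 11, 2, 4, 14, 8, 15, 3 }; /* pending requests, head at cylinder 6 */
    printf("FCFS: %d cylinders, SSF: %d cylinders\n", fcfs(6, req), ssf(6, req));
    return 0;
}
```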

However, shortest seek first is not perfect: it has a fairness problem. If requests keep arriving for cylinders near the arm's current position, requests for distant cylinders may be postponed indefinitely.

A useful model here is the elevator found in everyday life: the elevator algorithm balances the conflicting goals of efficiency and fairness. The elevator keeps moving in one direction until there are no more requests in that direction, and only then reverses.

The elevator algorithm maintains a single bit of state, the current direction: UP or DOWN. After finishing a request, the disk or elevator driver checks this bit. If it is UP, the arm (or elevator car) moves to the next pending request at a higher position; if there are no pending requests above, the direction is reversed. If the bit is DOWN, the arm moves to the next pending request at a lower position; if there is none, it simply stops and waits.

As an example of the elevator algorithm, suppose the cylinders are served in the order 4, 7, 10, 14, 9, 6, 3, 1. The arm motion is shown below

The elevator algorithm therefore moves the arm across 3 + 3 + 4 + 5 + 3 + 3 + 1 = 22 cylinders
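The elevator algorithm can be sketched the same way. The version below reuses the request list from the SSF example and assumes the arm starts at cylinder 6 moving upward (both assumptions of this sketch, not of the figure above); it sweeps up through 8, 11, 14, 15, reverses, and finishes with 4, 3, 2.

```c
#include <stdio.h>
#include <stdlib.h>

#define N 7

/* Elevator (SCAN): keep moving in the current direction, serving every pending
 * cylinder on the way, and reverse only when nothing is left in that direction. */
static int elevator(int pos, int up, int *req, int n)
{
    int moved = 0, left = n;
    while (left > 0) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (req[i] < 0)
                continue;                                   /* already served */
            int ahead = up ? req[i] >= pos : req[i] <= pos;
            if (ahead && (best < 0 || abs(req[i] - pos) < abs(req[best] - pos)))
                best = i;
        }
        if (best < 0) {                                     /* nothing left ahead */
            up = !up;                                       /* reverse direction  */
            continue;
        }
        moved += abs(req[best] - pos);
        pos = req[best];
        req[best] = -1;
        left--;
    }
    return moved;
}

int main(void)
{
    int req[N] = { 11, 2, 4, 14, 8, 15, 3 };
    printf("elevator: %d cylinders\n", elevator(6, 1, req, N));   /* prints 22 */
    return 0;
}
```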

In terms of total arm motion, the elevator algorithm is usually slightly worse than SSF, but it is fairer: no request can be postponed indefinitely.

A further optimization is possible on disk controllers that let the software inspect the current sector number under the head: if two or more requests are pending for the same cylinder, the driver can issue the one for the sector that will pass under the head next.

Note that a cylinder contains several tracks, one per head, so consecutive requests may be for different tracks of the same cylinder. Choosing between them costs nothing, since selecting a head involves neither arm movement nor rotational delay.

Seek time and rotational delay dominate disk performance, so reading only one or two sectors per operation is very inefficient. For this reason, many disk controllers always read several sectors and cache them, even when only one was requested. Typically, a request for one sector causes that sector plus much or all of the rest of its track to be read, depending on how much space is free in the controller's cache.

The disk controller's cache differs somewhat from the operating system's cache: the controller caches blocks that have not actually been requested, whereas the operating system's cache holds blocks that were explicitly read and that it expects to be used again in the near future.

When several drives hang off the same controller, the operating system should keep a separate pending-request table for each drive. As soon as a drive becomes idle, a seek should be issued to move its arm to the cylinder of its next request. When the current transfer finishes, the driver can start a transfer on whichever drive's arm is already positioned correctly; if none is, it issues a new seek on the drive that has just finished transferring and waits for the next interrupt to see which arm reaches its cylinder first.

Different RAID levels

RAID stands for Redundant Array of Independent (or Inexpensive) Disks, disk array for short. It uses virtualization to combine several physical disks into one or more logical units in order to improve performance, provide data redundancy, or both.

RAID has different levels

  • RAID 0 – Striping without fault tolerance
  • RAID 1 – Mirroring and duplexing
  • RAID 2 – Memory-style error-correcting codes
  • RAID 3 – Bit-interleaved parity
  • RAID 4 – Block-interleaved parity
  • RAID 5 – Block-interleaved distributed parity
  • RAID 6 – P + Q redundancy
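The parity used by RAID 4 and RAID 5 is easy to demonstrate: the parity block is the bitwise XOR of the data blocks, so any single lost block can be rebuilt by XOR-ing the surviving blocks with the parity. The sketch below uses made-up 8-byte "blocks" purely for illustration.

```c
#include <stdio.h>
#include <string.h>

#define BLOCK 8                                   /* toy block size for the demo */

int main(void)
{
    unsigned char d0[BLOCK] = "block-0", d1[BLOCK] = "block-1", d2[BLOCK] = "block-2";
    unsigned char parity[BLOCK], rebuilt[BLOCK];

    /* Parity block = XOR of all data blocks (what RAID 4/5 writes to the parity disk). */
    for (int i = 0; i < BLOCK; i++)
        parity[i] = d0[i] ^ d1[i] ^ d2[i];

    /* Pretend the disk holding d1 failed: rebuild it from the survivors plus parity. */
    for (int i = 0; i < BLOCK; i++)
        rebuilt[i] = d0[i] ^ d2[i] ^ parity[i];

    printf("recovered \"%s\" (%s)\n", rebuilt,
           memcmp(rebuilt, d1, BLOCK) == 0 ? "matches the lost block" : "mismatch");
    return 0;
}
```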

I/O

What is a clock in an operating system

Clocks, also called timers, are essential to any multiprogrammed system. Among other things, the clock maintains the time of day and prevents one process from monopolizing the CPU. Clock software is a kind of device driver as well. As usual, we look at the hardware first and then the software, working from the bottom up, because the lowest layer is the most important.

The clock hardware

There are two types of clocks in computers, and these clocks are completely different from those used in real life.

  • The simpler kind of clock is tied to the 110 V or 220 V power line and causes an interrupt on every voltage cycle, at 50 or 60 Hz. These clocks used to dominate, but they are rare nowadays.
  • The other clock consists of a crystal oscillator, a counter and a register, as shown in the diagram below

This kind of clock is called a programmable clock, and it has two modes. In one-shot mode, when the clock is started it copies the value of its holding register into the counter; each pulse of the crystal oscillator then decrements the counter by 1, and when the counter reaches 0 an interrupt is generated and the clock stops until the software explicitly restarts it. In square-wave mode, the holding register is automatically copied back into the counter each time it reaches 0 and an interrupt is generated, so the interrupts occur periodically. Each of these periodic interrupts is called a clock tick.
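A user-level analogue of a programmable clock can be sketched with the POSIX interval timer: it_value plays the role of the one-shot countdown and it_interval plays the role of the square-wave reload value, with SIGALRM standing in for the clock interrupt. The 100 ms period and the tick count are arbitrary choices for this sketch.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t ticks = 0;

static void on_tick(int sig)
{
    (void)sig;
    ticks++;                                   /* one "clock interrupt" received */
}

int main(void)
{
    struct sigaction sa = { 0 };
    sa.sa_handler = on_tick;
    sigaction(SIGALRM, &sa, NULL);             /* SIGALRM plays the clock interrupt */

    struct itimerval it = { 0 };
    it.it_value.tv_usec    = 100000;           /* initial countdown (one-shot part)   */
    it.it_interval.tv_usec = 100000;           /* reload value (square-wave behavior) */
    setitimer(ITIMER_REAL, &it, NULL);

    while (ticks < 10)
        pause();                               /* sleep until the next tick */
    printf("received %d clock ticks\n", (int)ticks);
    return 0;
}
```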

Main functions of the device controller

A device controller is an addressable device that has a unique device address when it controls only one device. If the device controller controls multiple connected devices, it should contain multiple device addresses and make each device address correspond to one device.

Device controllers fall into two main categories: those that control character devices and those that control block devices

The main functions of the device controller are as follows

  • Receiving and recognizing commands: the device controller accepts and decodes commands from the CPU; it contains registers to hold the commands and their parameters
  • Data exchange: data is exchanged between the CPU and the controller and between the controller and the device. The CPU writes commands and data to the controller's registers over the bus, or reads data from them, and the controller transfers the data to or from the device
  • Address recognition: every device has its own address, and the controller must recognize the addresses of the devices it controls. In addition, each register inside the controller needs a unique address so that the CPU can read and write it
  • Error detection: the device controller also detects errors in the data transmitted from the device

Interrupt handling procedure

There are many interrupt handling schemes. The book ARM System Developer's Guide: Designing and Optimizing System Software lists several of them

  • Non-nested – handles one interrupt at a time and is the simplest kind of handler
  • Nested – handles multiple interrupts, but without assigning priorities
  • Reentrant – handles multiple interrupts using priorities
  • Prioritized simple – handles prioritized interrupts with a simple mechanism
  • Prioritized standard – handles higher-priority interrupts in a shorter time than lower-priority ones
  • Prioritized direct – handles higher-priority interrupts in a shorter time and jumps directly to the specific service routine
  • Prioritized grouped – handles interrupts that are grouped into different priority classes

Below are some general interrupt handler steps, the details of which vary from operating system to operating system

  • Save all registers that are not saved by the interrupt hardware
  • Set up a context for the interrupt service routine, which may involve setting up the TLB, the MMU, and the page tables (if these concepts are unfamiliar, see my article on memory management)
  • Set up a stack for the interrupt service routine
  • Acknowledge the interrupt controller; if there is no centralized interrupt controller, re-enable interrupts
  • Copy the registers from where they were saved into the process table
  • Run the interrupt service routine, which extracts information from the registers of the device controller that raised the interrupt
  • Choose an appropriate process to run next; if the interrupt has made a higher-priority process ready, pick that one
  • Set up the MMU context (and possibly the TLB) for the chosen process, as required
  • Load the new process's registers, including its PSW
  • Start running the new process

The steps above are only a general outline; the details differ from one operating system and interrupt handler to another. Below are the detailed steps of a nested interrupt handler

What is a device driver

In a computer, a device driver is a computer program that controls or operates a specific device connected to the computer. Drivers provide software interfaces that interact with hardware, enabling the operating system and other computer programs to access a particular device without needing to know the specifics of its hardware.

What is DMA

DMA stands for Direct Memory Access. It means that the CPU grants an I/O module the authority to read from or write to memory without involving the CPU in each transfer; the process is managed by a chip called a DMA controller (DMAC). Because a DMA device moves data directly between the device and memory instead of using the CPU as an intermediary, it relieves congestion on the bus. DMA also improves concurrency, since the CPU can keep executing while the DMA controller transfers data over the system and memory buses.

Direct memory access features

DMA mode has the following characteristics:

  • Data transfer is based on data blocks

  • The transmitted data is either sent directly from the device to main memory or output directly from main memory to the device

  • CPU intervention is required only at the beginning and end of the transmission of one or more blocks of data, and the transmission of the whole block of data is done under the control of the controller.

Compared with interrupt-driven I/O, DMA further reduces the CPU's involvement in I/O operations and increases the degree of parallelism between the CPU and I/O devices.

DMA hardware is simple and inexpensive and is well suited to bulk transfers between high-speed devices and main memory; the fast devices in small computer systems generally use it. Its functionality is limited, however, and it cannot satisfy complex I/O requirements on its own.

Deadlocks

What is a zombie process

A zombie process is a process that has completed and terminated but still has an entry in the process table. Zombies arise in the parent-child relationship: the child's entry is kept around because the parent still needs to read the child's exit status.
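A zombie is easy to produce on a POSIX system: fork a child that exits immediately and delay the parent's wait(). While the parent sleeps, ps shows the child in state Z (defunct). The 30-second window is just an arbitrary choice for this sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0)
        exit(0);                     /* child terminates immediately            */

    /* The child is now a zombie: it has exited, but its process-table entry
     * survives until the parent collects the exit status below.               */
    printf("child %d has exited; try `ps -o pid,stat,comm -p %d` to see state Z\n",
           (int)pid, (int)pid);
    sleep(30);

    waitpid(pid, NULL, 0);           /* reaping the child removes the zombie    */
    return 0;
}
```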

Cause of deadlock

There are two main causes of deadlocks: competition for resources and an improper ordering of requests as processes advance

The necessary conditions for a deadlock to occur

A resource deadlock can occur only when all four of the following conditions hold (a minimal demonstration in code follows the list)

  • Mutual exclusion: each resource is either assigned to exactly one process or is available
  • Hold and wait: a process already holding resources may request additional ones
  • No preemption: a resource granted to a process cannot be forcibly taken away from it; it can only be released by the process holding it
  • Circular wait: there is a circular chain of two or more processes, each waiting for a resource held by the next member of the chain
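The minimal demonstration promised above: two threads each grab one mutex and then wait for the other's, which exhibits all four conditions at once (the mutexes are exclusive, each thread holds one while waiting, mutexes cannot be preempted, and the waits form a cycle). The sleep() calls only make the interleaving deterministic; the program hangs forever. Build with `cc -pthread`.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

static void *t1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&a);          /* hold a ...                         */
    sleep(1);
    pthread_mutex_lock(&b);          /* ... and wait for b                 */
    pthread_mutex_unlock(&b);
    pthread_mutex_unlock(&a);
    return NULL;
}

static void *t2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&b);          /* hold b ...                         */
    sleep(1);
    pthread_mutex_lock(&a);          /* ... and wait for a: circular wait  */
    pthread_mutex_unlock(&a);
    pthread_mutex_unlock(&b);
    return NULL;
}

int main(void)
{
    pthread_t x, y;
    pthread_create(&x, NULL, t1, NULL);
    pthread_create(&y, NULL, t2, NULL);
    printf("both threads started; the join below never returns\n");
    pthread_join(x, NULL);
    pthread_join(y, NULL);
    return 0;
}
```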

Deadlock recovery mode

Once a deadlock has been detected, the system must recover from it. Several recovery strategies are discussed below.

Recovery through preemption

In some cases a resource can be temporarily taken away from its holder and given to another process, possibly without even notifying the original process. This kind of recovery is generally difficult and rather crude, and it is usually undesirable.

Recovery through rollback

If the system designers and machine operators know that deadlocks are likely, they can arrange for processes to be checkpointed periodically. Checkpointing a process means writing its state to a file so that it can be restarted later; the checkpoint contains not only the memory image but also the resource state. To be effective, new checkpoints should not overwrite old ones: each checkpoint is written to a new file, so a whole series of checkpoints accumulates as the process runs.

To recover, a process that holds a needed resource is rolled back to an earlier checkpoint taken before it acquired that resource. The process thus returns to a point in time at which it did not yet hold the resource, which can then be given to one of the deadlocked processes.

Recovery by killing processes

The crudest but simplest way to break a deadlock is to kill one of the processes in the cycle. Killing one process may not be enough, in which case additional processes must be killed until the deadlock is broken.

Another option is to choose a process outside the cycle as the victim, so that releasing its resources frees what the deadlocked processes need.

How do I break deadlocks

Since all four conditions listed earlier are necessary for a deadlock to occur, a deadlock can be prevented by breaking any one of them.

Break the mutex condition

First consider breaking the mutual exclusion condition: if no resource were ever assigned exclusively to a single process, deadlock could never occur. Letting two processes write to the printer at the same time leads to chaos, however, so the practical solution for printers is spooling, a technique that allows several processes to generate output at the same time. In this model the only process that ever actually requests the physical printer is the printer daemon (a background process), and since the daemon never requests any other resources, deadlock on the printer is eliminated.

The daemon is usually written to print a file only after the complete file is available. If two processes each fill half of the spooling space with output that is not yet complete, neither can finish, and the result is a deadlock on the spool space.

The moral is to avoid assigning a resource unless it is absolutely necessary, and to try to make sure that as few processes as possible can actually claim it.

Break the condition that keeps waiting

The second approach: if processes holding resources could be prevented from requesting further resources, deadlocks would be eliminated. One way to achieve this is to require every process to request all of its resources before it starts executing. If everything it needs is available, the resources are allocated and the process runs to completion; if one or more resources are busy, nothing is allocated and the process waits.

Many processes, however, do not know how many resources they will need until they have started running; if they did know, the banker's algorithm could be used. Another problem is that this approach uses resources inefficiently.

A slightly different approach is to require a process that requests an additional resource to first release all the resources it currently holds and then try to acquire everything it needs in one go.
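For reference, here is a minimal sketch of the safety check at the heart of the banker's algorithm mentioned above: a state is safe if the processes can finish in some order, each using the currently available resources plus whatever finishing processes return. The matrices in main are invented numbers for illustration only.

```c
#include <stdio.h>

#define P 3                     /* processes      */
#define R 2                     /* resource types */

/* Return 1 if the state (available, allocated, still-needed) is safe. */
static int is_safe(int avail[R], int alloc[P][R], int need[P][R])
{
    int work[R], done[P] = { 0 };
    for (int r = 0; r < R; r++)
        work[r] = avail[r];

    for (int round = 0; round < P; round++) {
        int progressed = 0;
        for (int p = 0; p < P; p++) {
            if (done[p])
                continue;
            int ok = 1;
            for (int r = 0; r < R; r++)
                if (need[p][r] > work[r]) { ok = 0; break; }
            if (ok) {                              /* p can run to completion ...   */
                for (int r = 0; r < R; r++)
                    work[r] += alloc[p][r];        /* ... and returns its resources */
                done[p] = 1;
                progressed = 1;
            }
        }
        if (!progressed)
            break;
    }
    for (int p = 0; p < P; p++)
        if (!done[p])
            return 0;                              /* some process can never finish */
    return 1;
}

int main(void)
{
    int avail[R]    = { 1, 1 };
    int alloc[P][R] = { { 1, 0 }, { 0, 1 }, { 1, 1 } };
    int need[P][R]  = { { 1, 1 }, { 1, 0 }, { 0, 1 } };
    printf("the state is %s\n", is_safe(avail, alloc, need) ? "safe" : "unsafe");
    return 0;
}
```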

Break the non-preemption condition

The no-preemption condition can also be attacked. For some resources this can be done through virtualization: spooling printer output to disk and letting only the printer daemon touch the real printer, for example, removes any need to preempt the printer itself.

Break the loop wait condition

That leaves only one condition. The circular wait can be broken in several ways. One way is to adopt the rule that a process is entitled to only one resource at a time: if it needs a second resource, it must first release the one it holds.

Another option is to uniformly number all the resources, as shown in the figure below

A process may issue requests whenever it wants, but all requests must be made in increasing order of resource number. With this rule the resource-allocation graph can never contain a cycle, so no deadlock can occur.
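Resource numbering is exactly how lock ordering works in practice. In the sketch below, two threads logically want the locks in opposite orders, but a helper sorts the requests by resource number first, so no circular wait (and hence no deadlock) is possible. The lock_pair and unlock_pair names are inventions of this sketch; build with `cc -pthread`.

```c
#include <pthread.h>
#include <stdio.h>

/* Number every lock and always acquire in ascending order of that number. */
static pthread_mutex_t locks[2] = { PTHREAD_MUTEX_INITIALIZER,
                                    PTHREAD_MUTEX_INITIALIZER };

static void lock_pair(int i, int j)
{
    if (i > j) { int t = i; i = j; j = t; }   /* sort by resource number */
    pthread_mutex_lock(&locks[i]);
    pthread_mutex_lock(&locks[j]);
}

static void unlock_pair(int i, int j)
{
    pthread_mutex_unlock(&locks[i]);
    pthread_mutex_unlock(&locks[j]);
}

static void *worker(void *arg)
{
    int first = *(int *)arg;                  /* this thread "wants" lock `first` first */
    for (int k = 0; k < 100000; k++) {
        lock_pair(first, 1 - first);          /* the ordering rule prevents any cycle   */
        unlock_pair(first, 1 - first);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    int zero = 0, one = 1;
    pthread_create(&t1, NULL, worker, &zero);
    pthread_create(&t2, NULL, worker, &one);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("both threads finished without deadlock\n");
    return 0;
}
```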

Other deadlock types and related problems

Two-stage locking

Although deadlock avoidance and prevention handle many cases, they do not work well everywhere, and over time many specialized algorithms have been developed for particular situations. For example, in database systems a common operation is to lock several records and then update them all; when several processes do this concurrently, there is a real risk of deadlock.

One solution is two-phase locking. As the name implies, there are two phases. In the first phase the process tries to lock, one at a time, all the records it needs. If it succeeds, the second phase begins: the updates are performed and the locks are released. No real work is done in the first phase.

If, during the first phase, some record the process needs is already locked, the process releases all the locks it has acquired and starts the first phase over again. In a sense, this approach is similar to requesting all resources in advance, or at least before doing anything irreversible.
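Here is a minimal sketch of the first phase, using trylock as the "is it already locked?" test: either every lock is acquired, or everything acquired so far is released and the phase starts over. The lock_all helper and the two-record example are assumptions of this sketch, not a real database API; build with `cc -pthread`.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Phase 1: try to take every lock the transaction needs; on any failure,
 * back out completely and start phase 1 again. Phase 2 (the update) only
 * begins once phase 1 has succeeded, and releases all locks at the end.  */
static void lock_all(pthread_mutex_t **locks, int n)
{
    for (;;) {
        int i;
        for (i = 0; i < n; i++)
            if (pthread_mutex_trylock(locks[i]) != 0)
                break;                         /* record i is held by someone else */
        if (i == n)
            return;                            /* phase 1 complete                 */
        while (--i >= 0)
            pthread_mutex_unlock(locks[i]);    /* release everything and retry     */
        sched_yield();
    }
}

int main(void)
{
    pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER, b = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t *records[] = { &a, &b };

    lock_all(records, 2);                      /* phase 1: lock both records   */
    printf("both records locked, performing the update\n");
    pthread_mutex_unlock(&a);                  /* phase 2 done: release locks  */
    pthread_mutex_unlock(&b);
    return 0;
}
```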

In general applications, however, two-phase locking is not universally applicable: it is often unacceptable for a process to abort partway through and start over simply because a resource it needs is unavailable.

Communication deadlock

So far we have been discussing resource deadlocks; they are the most common kind, but not the only kind. Communication deadlocks can occur when two or more processes communicate by sending messages. Suppose process A sends a message to process B and then blocks until B returns a reply. If the request message is lost, A blocks forever waiting for the reply while B blocks waiting for a request that will never arrive: a deadlock.

Although this is a deadlock, it is not a resource deadlock: A does not hold any resource that B wants, and in fact no resources are visible at all. It still fits the definition of deadlock, though, since each process is blocked waiting for an event that only the other process can cause. To distinguish it from the more common resource deadlock, this situation is called a communication deadlock.

Communication deadlocks cannot be prevented by ordering resources (there are none) or avoided by careful scheduling, but they can be broken with a concept that is extremely important in communication protocols: the timeout. Whenever a message is sent, the sender starts a timer; if the timer expires before a reply arrives, the message is assumed to have been lost and is retransmitted. In this way communication deadlocks are avoided.
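The timeout idea can be sketched with an ordinary poll() loop: "send" a request, wait a bounded time for the reply, and retransmit if nothing arrives. Here the read end of a pipe stands in for the network connection and no reply ever comes, so the program retries and then gives up; the 1-second timeout and the 3 attempts are arbitrary choices for this sketch.

```c
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 1;
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };   /* our "reply" channel */

    for (int attempt = 1; attempt <= 3; attempt++) {
        printf("sending request, attempt %d\n", attempt);
        /* ... the request would be written to the peer here ... */

        if (poll(&pfd, 1, 1000) > 0) {         /* wait up to 1 s for a reply  */
            printf("reply received\n");
            return 0;
        }
        printf("timed out, assuming the message was lost\n");
    }
    printf("giving up after 3 attempts\n");
    return 1;
}
```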

However, not every deadlock that occurs in a network is a communication deadlock; resource deadlocks exist there as well. The following is a typical example.

When a packet arrives at a router from a host, it is placed in a buffer and then forwarded to the next router, and the next, until it reaches its destination. Buffers are resources, and their number is limited. In the figure below each router has 10 buffers (in reality there are many more).

Suppose that all of router A's packets must go to router B, all of router B's packets must go to router D, and all of router D's packets must go back to router A. If every router's buffers are already full, no packet can move, because there is no free buffer at the receiving end: a classic resource deadlock.

Livelock

In some situations, a process that discovers it cannot acquire the next lock it needs politely releases the lock it already holds, waits a very short time, and tries again. Picture two people meeting in a narrow alley: each steps aside to let the other pass, but because they both step to the same side at the same moment, neither ever gets through.

Now imagine a pair of processes that each need the same two resources. Each acquires one lock, fails to get the other, releases what it holds, and tries again, over and over. Clearly no process is blocked, yet neither makes any progress. This condition is called livelock.

Starvation

A problem closely related to deadlock and livelock is starvation. For a process the crucial thing is resources: if a process is repeatedly passed over when resources are handed out, it may never run at all. Such a process is said to starve; it is never served, even though it is not blocked.

Suppose, for example, that the printer is always allocated to the process with the smallest file to print. A process with a huge file may then be put off forever: it starves and is delayed indefinitely, even though it is never blocked.

Afterword.

This is the end of this article. I will continue to write interview questions about computer networks, computer fundamentals, Java, and Java architecture.

If you found this article useful, please like it, share it, and leave a comment. You are also welcome to follow my WeChat public account [Programmer Cxuan], which is full of useful material.

Finally, your support is what motivates me to keep writing. I hope you land a job at a major company. Good luck!

In addition, add me on WeChat (becomecxuan) to join my question-of-the-day group, where one interview question is shared every day. For more content, see my GitHub repository bestJavaer; this article is included there as well, see the original link for details.

I have also prepared six PDFs. Search for the public account "Programmer Cxuan" on WeChat and reply "cxuan" in the background to get all of them. The six PDFs are listed below

Six PDF links