In addition to the three operating-system abstractions discussed in our previous article — processes, address spaces, and files — the operating system also controls all of the computer's I/O devices. The operating system must issue commands to the devices, catch interrupts, and handle errors. It should also provide an interface between the devices and the rest of the system that is simple and easy to use. How the operating system manages I/O is our next focus.

Different people look at I/O hardware differently. To an electronics engineer, I/O hardware is the chips, wires, power supplies, and other physical components that make up the hardware. To us programmers, I/O is the interface the hardware presents to software: the commands the hardware accepts, the functions it carries out, and the errors it reports back. We focus on how to program the hardware, not on how it works internally.

I/O devices

What is an I/O device? I/O devices, also known as input/output devices, are the hardware used to communicate with a computer. An input device sends data to the computer, and an output device receives data from it.

I/O devices can be divided into two types: block devices and character devices.

Block devices

A block device stores information in fixed-size blocks, each of which can be read and (optionally) written independently; the blocks correspond to sectors or clusters, and each block has its own physical address. Block sizes typically range from 512 bytes to 65,536 bytes. All transfers are made in units of one or more whole blocks. The essential property of a block device is that each block is independent and can be read or written on its own. Common block devices include hard disks, Blu-ray discs, and USB drives.

Block devices generally require fewer pins than character devices.

Disadvantages of block devices

Block access to a given type of solid-state memory is slower than byte addressing of the same memory, because every read or write must start at the beginning of a block. To read any part of a block, you must find the start of the block, read the whole block, and discard the parts you do not need. To write part of a block, you must find the start of the block, read the whole block into memory, modify the data, find the start of the block again, and then write the whole block back to the device.

Character device

The other type of I/O device is the character device. A character device sends or receives a stream of characters, one character at a time, without regard to any block structure. It is not addressable and has no seek operation. Common character devices are printers, network interfaces, mice, and most other devices that are not disk-like.

Data rates for some common devices are shown below.

Device controller

First, we need to understand the concept of a device controller.

The device controller is the component that processes the signals passing between the CPU and the device. The device is connected to the computer through a plug and socket, and the socket is connected to the device controller. The device controller receives data from the connected device and stores it in special-purpose registers, also known as local buffers, inside the controller.

Special-purpose registers, as the name implies, are registers designed for one task only. For example, CS, DS, GS, and the other segment registers are special-purpose registers because they exist to hold segment selectors. EAX, ECX, and the like are general-purpose registers that can be used freely: for example, you cannot freely manipulate DS, but you can freely move data into EAX or EBX.

General purpose registers such as EAX, ECX, EDX, EBX, ESI, EDI, EBP, ESP

Special purpose registers include CS, DS, SS, ES, FS, GS, EIP, and EFLAGS

Each device controller has a corresponding device driver, and the controller communicates with the operating system through the driver's interface, using interrupts. Device controllers are hardware; device drivers are software.

I/O devices usually consist of a mechanical component and an electronic component. The electronic component is called the device controller or adapter. On personal computers it usually takes the form of a chip on the motherboard or a printed circuit card that plugs into a (PCIe) expansion slot.

The mechanical component is the device itself; its composition is shown below

The controller card usually has a connector into which a cable leading to the device itself can be plugged, and many controllers can handle two, four, or even eight identical devices.

The interface between the controller and the device is usually a very low-level one. For example, a disk might be formatted with 2,000,000 sectors of 512 bytes each. What actually comes off the drive, however, is a serial bit stream, starting with a preamble, followed by the 4096 bits in a sector, and ending with a checksum or ECC (Error-Correcting Code). The preamble is written when the disk is formatted and contains the cylinder and sector numbers, the sector size, and similar data, as well as synchronization information.

The controller's job is to convert the serial bit stream into a block of bytes and perform any necessary error correction. The block of bytes is usually first assembled, bit by bit, in a buffer inside the controller; after the checksum has been verified and the block declared error-free, it is copied into memory.

Memory-mapped I/O

Each controller has several registers to communicate with the CPU. By writing to these registers, the operating system can command the device to send data, receive data, turn the device on or off, and so on. By reading information from these registers, the operating system can know the status of the device, whether it is ready to receive a new command, and so on.

In addition to control registers, many devices have data buffers that the operating system can read and write. For example, the conventional way to display pixels on a screen is to use video RAM, which is basically just a data buffer that programs and the operating system write into.

How does the CPU communicate with the device control registers and device data buffers? There are two alternatives. In the first, each control register is assigned an I/O port number, an 8- or 16-bit integer. The collection of all I/O ports forms the I/O port space, which is protected so that ordinary user programs cannot access it (only the operating system can). Using a special I/O instruction such as

IN REG,PORT

The CPU can read the contents of the control register PORT and place the results in the CPU register REG. Similarly, use

OUT PORT,REG

The CPU can write the contents of the REG to the control register. Most early computers, including almost all mainframes, such as the IBM 360 and all its successors, worked this way.

A control register is a processor register that changes or controls the general behavior of a CPU or other digital device. Common tasks performed by control registers include interrupt control, switching addressing modes, paging control, and coprocessor control.

In this scenario, the memory address space and I/O address space are not the same, as shown in the figure below

The instruction

IN R0,4

and

MOV R0,4

are completely different in this design. The former reads the contents of I/O port 4 and puts it in R0, while the latter reads the contents of memory word 4 and puts it in R0. The 4s in these two instructions refer to different and unrelated address spaces.

The second method was introduced by the PDP-11.

What is the PDP-11? The PDP-11 was a well-known series of 16-bit minicomputers made by DEC (Digital Equipment Corporation).

It maps all the control registers into the memory space, as shown below

Memory-mapped I/O is a way of exchanging data and commands between a CPU and its attached peripherals. In this scheme, the processor and the I/O devices share the same address space: device registers are assigned memory addresses, and the processor accesses them with ordinary memory references.

In most systems, the addresses assigned to the control registers are at or near the top of the address space.

Here is a hybrid approach

In this approach the device data buffers are memory-mapped, while the control registers have separate I/O ports. The x86 uses this architecture: in IBM PC compatibles, memory addresses 640K to 1M−1 are reserved for device data buffers, in addition to I/O ports 0 through 64K−1.

How do these schemes work? When the CPU wants to read a word, either from memory or from an I/O port, it puts the needed address on the bus's address lines and asserts a READ signal on one of the bus's control lines. A second signal line indicates whether I/O space or memory space is meant. If it is memory space, the memory responds to the request; if it is I/O space, the I/O device responds. If there is only one address space (as with memory-mapped I/O), every memory module and every I/O device compares the address lines to the range of addresses it serves, and whichever one the address falls within responds. Since an address is never assigned to both memory and an I/O device, there is no ambiguity or conflict.

Advantages and disadvantages of memory-mapped I/O

The two addressing controller schemes have different advantages and disadvantages. Let’s take a look at the benefits of memory-mapped I/O.

  • First, if special I/O instructions are required to read and write the device control registers, then accessing them requires assembly code, because C or C++ has no way to execute IN and OUT instructions directly; calling such an assembly procedure adds I/O overhead. With memory mapping, control registers are just memory locations that can be addressed like any other variable in C (a minimal sketch follows this list).
  • Second, with memory-mapped I/O, no special protection mechanism is needed to keep user processes from performing I/O: the operating system only has to ensure that the part of the address space containing the control registers is never mapped into a user's virtual address space.
  • Third, with memory-mapped I/O, every instruction that can reference memory can also reference the control registers, which makes them convenient to manipulate.
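
To make the first point concrete, here is a minimal sketch in C. The register addresses, bit masks, and the UART-style device are all invented for illustration; real values come from the hardware's datasheet. It contrasts a memory-mapped control register, which plain C can poll through a volatile pointer, with a separate I/O port, which needs an assembly-level OUT instruction (x86 shown).

```c
#include <stdint.h>

/* Hypothetical memory-mapped registers of a UART-like device.  The addresses
 * and bit layout are invented for this sketch. */
#define UART_STATUS ((volatile uint8_t *)0xF0001000u)
#define UART_DATA   ((volatile uint8_t *)0xF0001004u)
#define TX_READY    0x01u

/* With memory-mapped I/O the control register is just a memory location, so
 * ordinary C can poll and write it; volatile keeps the compiler from caching it. */
static void mmio_putc(char c)
{
    while ((*UART_STATUS & TX_READY) == 0)
        ;                                /* busy-wait until the device is ready */
    *UART_DATA = (uint8_t)c;
}

/* With separate I/O ports, C has no IN/OUT operators, so the access drops
 * down to assembly (x86 shown, again purely illustrative). */
static inline void port_outb(uint16_t port, uint8_t value)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}
```

The volatile qualifier matters here: it stops the compiler from holding the status register in a CPU register, which echoes the hardware caching issue discussed next.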

In computer design, almost everything is a trade-off, and memory-mapped I/O has drawbacks too. First, most computers today cache memory words. Caching a device control register would be disastrous: the first reference would be cached, and later references would return the stale cached value instead of the device's current state. To avoid this with memory-mapped I/O, the hardware must be able to disable caching selectively, for example on a per-page basis, which adds extra complexity to both the hardware and the operating system.

Second, if there is only one address space, then all memory modules and all I/O devices must check all memory references to infer who is responding.

What is a memory module? In computing, a memory module is a printed circuit board on which a memory integrated circuit is installed.

If the computer is a single bus architecture, as shown below

It is simple to have each memory module and I/O device view each address.

However, the trend for modern personal computers is dedicated high-speed memory buses, as shown in the figure below

This bus is equipped to optimize memory access speed, and x86 systems can also have multiple buses (memory, PCIe, SCSI, and USB). As shown in the figure below

The trouble with using a separate memory bus on a memory-mapped machine is that the I/O devices cannot see the memory addresses as they go by on the memory bus, so they cannot respond to them. Special measures must therefore be taken to make memory-mapped I/O work on a system with multiple buses. One possibility is to send all memory references to memory first; if the memory fails to respond, the CPU tries the other buses.

A second design is to put a snooping device on the memory bus that forwards all addresses potentially of interest to the I/O devices. The problem here is that I/O devices may not be able to process requests as fast as memory can.

A third possible design is to filter addresses in the memory controller, which matches the design in the figure above. In this case, the memory controller chip contains range registers that are preloaded at boot time. The disadvantage of this design is the need to decide at boot time which addresses are device addresses rather than real memory addresses. So every design has arguments for and against it, and compromises and trade-offs are inevitable.

Direct memory access

Whether or not a CPU has memory-mapped I/O, it needs to address the device controllers in order to exchange data with them. The CPU can request data from an I/O controller one byte at a time, but doing so wastes CPU time, so a scheme called DMA (Direct Memory Access) is often used instead. For simplicity, assume that the CPU accesses all devices and memory through a single system bus connecting the CPU, the memory, and the I/O devices, as shown in the figure below

Modern systems are actually more complex, but the principle is the same. The operating system can use DMA only if the hardware has a DMA controller. Sometimes this controller is integrated into disk controllers and other controllers, but such a design requires a separate DMA controller for each device. More commonly, a single DMA controller is available for regulating transfers to multiple devices, often concurrently.

Wherever it is physically located, the DMA controller has access to the system bus independently of the CPU, as shown in the figure above. It contains several registers that the CPU can read and write, including a memory address register, a byte count register, and one or more control registers. The control registers specify the I/O port to use, the direction of transfer (reading from or writing to the I/O device), the transfer unit (one byte or one word at a time), and the number of bytes to transfer in one burst.

To explain how DMA works, let’s first look at how disk reads can be done without DMA.

  • First, the disk controller reads a block (one or more sectors) from the disk drive serially, bit by bit, until the entire block is in the controller's internal buffer.
  • Next, the controller checks the checksum to make sure no read errors occurred. The controller then raises an interrupt, and when the operating system starts running, it reads the block from the controller's buffer in a loop, one byte or one word at a time, and stores it in memory.

How DMA works

When DMA is used, the procedure is different. First the CPU programs the DMA controller by setting its registers, so the DMA controller knows what to transfer and where (step 1). The CPU also issues a command to the disk controller telling it to read data from the disk into its internal buffer and verify the checksum. Once valid data is in the disk controller's buffer, DMA can begin.

The DMA controller starts the transfer by issuing a read request over the bus to the disk controller (step 2). This read request looks like any other read request, and the disk controller does not know or care whether it came from the CPU or from the DMA controller. Typically, the memory address to write to is on the bus's address lines, so when the disk controller fetches the next word from its internal buffer, it knows where to write it. The write to memory is another standard bus cycle (step 3). When the write completes, the disk controller sends an acknowledgement signal over the bus to the DMA controller (step 4).

The DMA controller then increments the memory address and decrements the byte count. If the byte count is still greater than 0, steps 2 through 4 are repeated until it reaches 0. At that point, the DMA controller interrupts the CPU to tell it the transfer is complete. When the operating system starts running, it does not have to copy the disk block into memory, because it is already there.
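
As a rough illustration of step 1, here is a sketch of how a driver might program such a DMA controller. Every register address, bit value, and the function name are hypothetical; a real controller defines its own layout.

```c
#include <stdint.h>

/* Hypothetical DMA controller registers; a real controller defines its own layout. */
#define DMA_ADDR   ((volatile uint32_t *)0xF0002000u)  /* memory address register */
#define DMA_COUNT  ((volatile uint32_t *)0xF0002004u)  /* byte count register     */
#define DMA_CTRL   ((volatile uint32_t *)0xF0002008u)  /* control register        */

#define DMA_START      0x1u   /* begin the transfer                    */
#define DMA_DIR_READ   0x2u   /* direction: device -> memory           */
#define DMA_MODE_BURST 0x4u   /* burst mode rather than cycle stealing */

/* Step 1 above: program the controller, then let it run on its own. */
static void dma_start_disk_read(uint32_t phys_addr, uint32_t nbytes)
{
    *DMA_ADDR  = phys_addr;   /* where the data should land in memory */
    *DMA_COUNT = nbytes;      /* how many bytes to move               */
    *DMA_CTRL  = DMA_START | DMA_DIR_READ | DMA_MODE_BURST;
    /* The CPU now goes off to do other work; the controller raises an
     * interrupt when the byte count reaches zero (steps 2-4 run in hardware). */
}
```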

The complexity of DMA controllers varies greatly. The simplest handle one transfer at a time, as described above. More complex ones handle many transfers at once; such a controller has multiple sets of registers internally, one for each channel. After transferring each word, the DMA controller decides which device to serve next. It may be set up to use a round-robin algorithm, or it may have a priority scheme so that some devices get more attention than others. Multiple requests to different device controllers may be pending at the same time, provided there is an unambiguous way to tell the acknowledgement signals apart.

Many buses can operate in two modes: word-at-a-time mode and block mode, and some DMA controllers can also operate in both. In the former mode, the DMA controller requests the transfer of one word and gets it. If the CPU also wants the bus, it has to wait; the device sneaks in and steals a bus cycle from the CPU, delaying it slightly. This mechanism is called cycle stealing.

In block mode, the DMA controller tells the device to acquire the bus, perform a series of transfers, and then release the bus. This form of operation is called burst mode. It is more efficient than cycle stealing because acquiring the bus takes time, and in burst mode the cost of one acquisition is spread over many words. The disadvantage is that a long burst may block the CPU and other devices for a substantial time.

In the model we have been discussing, sometimes called fly-by mode, the DMA controller tells the device controller to transfer the data directly to memory. An alternative used by some DMA controllers is to have the device controller send the word to the DMA controller, which then issues a second bus request to write the word to wherever it is supposed to go. This scheme requires an extra bus cycle per word transferred, but it is more flexible, because it can also perform device-to-device copies and even memory-to-memory copies (by first reading from memory and then writing back to memory).

Most DMA controllers use physical addresses for their transfers. Using physical addresses requires the operating system to translate the virtual address of the target memory buffer into a physical address and write that physical address into the DMA controller's address register. An alternative scheme used by a few DMA controllers is to write virtual addresses into the DMA controller; the DMA controller must then use the MMU to perform the virtual-to-physical translation. Virtual addresses can be put on the bus only if the MMU is part of the memory rather than part of the CPU.

Interrupts

In a PC architecture, the interrupt structure would look like this

When an I/O device finishes its work, it generates an interrupt (assuming the operating system has enabled interrupts). It does this by asserting a signal on a bus line it has been assigned. The interrupt controller chip on the motherboard detects this signal and carries out the interrupt operation.

If no other interrupts are pending, the interrupt controller handles the interrupt immediately. If another interrupt is in progress, or another device has raised a higher-priority interrupt, this device is simply not serviced for the moment. In that case, the device keeps asserting its interrupt signal on the bus until the CPU services it.

To handle the interrupt, the interrupt controller puts a number on the address lines specifying which device wants attention and asserts a signal to interrupt the CPU. The interrupt signal causes the CPU to stop what it is doing and start doing something else. The number on the address lines is used as an index into a table called the interrupt vector to fetch a new program counter, which points to the start of the corresponding interrupt service routine. Traps and interrupts generally use the same mechanism from this point on and often share the same interrupt vector. The interrupt vector can be hardwired into the machine or located anywhere in memory, with a CPU register pointing to its start.

Shortly after the interrupt service routine starts running, it acknowledges the interrupt by writing a value to one of the interrupt controller's I/O ports, telling the interrupt controller that it is free to issue another interrupt. By deferring this acknowledgement until the CPU is ready to handle the next interrupt, races involving multiple nearly simultaneous interrupts can be avoided. Some older computers do not have a centralized interrupt controller; each device requests its own interrupt.

The hardware always saves certain information before the service routine starts. Which information is saved, and where, varies greatly from CPU to CPU. At a minimum, the program counter must be saved so the interrupted process can be resumed; this holds for all CPUs. All the visible registers and a large number of internal registers may also need to be saved.

Where should the hardware save this information? One option is to put it in internal registers that the operating system can read as needed. The problem with this approach is that the interrupt controller cannot be acknowledged until all potentially relevant information has been read out of those registers, lest a second interrupt overwrite the saved state; so the device cannot be serviced again for a while.

The second approach, which most CPUs use, is to save the information on the stack. This approach has problems too, because it is not clear whose stack to use. If the current stack is used, it is probably the stack of the user process; its stack pointer may not even be legal, which would cause a fatal error when the hardware tried to write at the address it points to. If the kernel stack is used, the stack pointer is much more likely to be legal and to point to a pinned page. However, switching into kernel mode may require changing MMU contexts and may invalidate the cache or the TLB; reloading these, statically or dynamically, increases the time needed to process an interrupt and wastes CPU time.

Precise interrupts and imprecise interrupts

Another problem: modern CPUs are heavily pipelined and sometimes superscalar (internally parallel). On older systems, after each instruction finished executing, the microprogram or hardware checked for pending interrupts. If one was pending, the program counter and PSW were pushed onto the stack to start the interrupt sequence. After the interrupt was serviced, the old PSW and program counter were popped off the stack and the interrupted process resumed.

Here is a pipeline model

What happens if an interrupt occurs while the pipeline is full? Many instructions are in different stages of execution, and when the interrupt occurs, the value of the program counter may not correctly reflect the boundary between instructions that have executed and instructions that have not. In fact, many instructions may be partially executed, to differing degrees. In this situation, the program counter most likely reflects the address of the next instruction to be fetched and pushed into the pipeline, not the address of the instruction the execution unit just finished.

In superscalar designs, it can be even worse

Each instruction may be decomposed into micro-operations, which may execute out of order, depending on the availability of internal resources such as functional units and registers. When an interrupt occurs, some instructions started long ago may not have finished, while some started recently may be almost done. At the moment the interrupt signal arrives, there may be many instructions in various states of completion that bear little relation to the program counter.

An interrupt that leaves the machine in a well-defined state is called a precise interrupt. Such an interrupt has four properties:

  • The PC (program counter) is kept in a known place
  • All instructions prior to the instruction to which the PC points have been fully executed
  • None of the instructions after the one the PC points to are executed
  • The execution state of the instruction to which the PC points is known

Interrupts that do not meet these requirements are called imprecise interrupts, and they are a pain in the neck. The figure above depicts an imprecise interrupt: the execution state and degree of completion of the instructions are uncertain and difficult to recover.

I/O Software Principles

Goals of I/O software

Device independence

Now let’s turn to the study of I/O software. An important goal of I/O software design is device independence. What does that mean? This means that we can write applications that access any device without specifying a specific device in advance. For example, if you write an application that reads files from a device, the application can read files from a hard disk, DVD, or USB instead of having to customize the application for each device. This is the concept of device independence.

For example, you could enter one of the following commands

sort <input >output

The input can then come from any type of disk or from the keyboard, and the output can go to any type of disk or to the screen.

The operating system mediates between software and these devices: different hardware requires different command sequences, so the operating system must handle the translation.

Closely related to device independence is the goal of uniform naming. The name of a device should simply be a string or an integer and not depend on the device in any way. In UNIX, all disks can be integrated into the file-system hierarchy, so users do not need to remember a special name for each device; they only need to remember the corresponding path. For example, if a USB disk is mounted at /usr/cxuan/backup, then copying a file to /usr/cxuan/backup/device copies the file onto the USB disk. In this way, writing a file to any disk is simply writing a file to the corresponding path.
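
A small example of what uniform naming buys us: with standard POSIX calls the same code reads from an ordinary file on a disk and from a device file, without knowing which is which. The two paths below are only examples.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* The same open/read calls work whether the path names a regular file on a
 * disk or a device file; the program does not care which it is. */
static void dump_first_bytes(const char *path)
{
    char buf[16];
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); return; }
    ssize_t n = read(fd, buf, sizeof(buf));   /* identical call for either case */
    printf("%s: read %zd bytes\n", path, n);
    close(fd);
}

int main(void)
{
    dump_first_bytes("/etc/hostname");   /* an ordinary file on some disk      */
    dump_first_bytes("/dev/urandom");    /* a character device, same code path */
    return 0;
}
```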

Error handling

In addition to device independence, the second important goal of I/O software is error handling. In general, errors should be handled as close to the hardware as possible. If the device controller discovers a read error, it should try to correct it itself if it can. If the controller cannot handle the problem, the device driver should handle it, perhaps simply by retrying the read; many errors are transient. If the driver cannot handle the error either, it passes the error up to the layer above. In most cases, the upper layers do not need to know how a lower layer fixed an error. It is much like a project manager who does not have to report every decision to the boss, or programmers who do not have to tell the project manager how each line of code is written: the handling is transparent to the layers above.

Synchronous and asynchronous transmission

The third goal that I/O software implements is synchronous and asynchronous (interrupt driven) transmission. Let’s talk about synchronous and asynchronous first.

In synchronous transmission, data is usually sent as blocks or frames. The sender and receiver should have a synchronization clock prior to data transmission. Asynchronous transmission, in which data is usually sent as bytes or characters, does not require a synchronous clock, but parity bits are added to the data before transmission. Here are the main differences between synchronous and asynchronous

Back to the point: most physical I/O is asynchronous. The CPU starts the transfer and goes off to do something else until an interrupt arrives; when the interrupt occurs, the CPU returns to the transfer.

There are two kinds of I/O: physical I/O and logical I/O.

Physical I/O actually fetches data from a storage device such as a disk. Logical I/O obtains the data from a storage abstraction (blocks, buffers).

Buffering

Another issue for I/O software is buffering. Data leaving one device usually does not go straight to its final destination; along the way it passes through a series of checks, verification steps, and buffers. For example, a packet arriving from the network is first placed in a buffer and checked; buffering decouples the rate at which data arrives from the rate at which it can be consumed and helps prevent buffer overruns.

Sharing and exclusivity

The last issue I/O software must deal with is shared versus dedicated devices. Some I/O devices can be used by many users at the same time; disks, for example, can be used by multiple users simultaneously without causing problems. Other devices must be dedicated: only one user may use the device at a time, and other users must wait until it is released.

Now, let’s explore how to use programs to control I/O devices. There are three ways to control an I/O device

  • Use programs to control I/O
  • Use interrupt to drive I/O
  • Use DMA to drive I/O

Use programs to control I/O

Programmed I/O means that data transfers are initiated by the CPU under the control of driver software, which accesses registers or other storage on the device. The CPU issues a command and then waits for the I/O operation to complete. Because the CPU is much faster than the I/O module, the problem with programmed I/O is that the CPU may wait a long time for the result; while waiting, it polls, or busy-waits, and overall system performance suffers badly. Programmed I/O is very simple, however, and is acceptable when the wait is short. Programmed I/O goes through the following steps (a sketch follows the list)

  • The CPU requests an I/O operation
  • The I/O module performs the operation
  • The I/O module sets its status bits
  • The CPU periodically checks the status bits
  • The I/O module does not directly notify the CPU when the operation completes
  • The I/O module does not interrupt the CPU
  • The CPU may keep waiting, or may come back and check later
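
Here is a minimal sketch of programmed I/O in the spirit of these steps, loosely modeled on the classic example of printing a string one byte at a time. The printer's register addresses and status bit are assumptions made up for the example.

```c
#include <stdint.h>

/* Hypothetical printer registers, invented for this sketch. */
#define PRN_STATUS ((volatile uint8_t *)0xF0003000u)
#define PRN_DATA   ((volatile uint8_t *)0xF0003004u)
#define PRN_READY  0x1u

/* Programmed I/O: the CPU itself moves every byte and busy-waits in between. */
static void pio_print(const char *buf, int count)
{
    for (int i = 0; i < count; i++) {
        while ((*PRN_STATUS & PRN_READY) == 0)
            ;                           /* polling (busy waiting): wasted CPU cycles  */
        *PRN_DATA = (uint8_t)buf[i];    /* the CPU copies the data, one byte at a time */
    }
}
```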

Use interrupt to drive I/O

In view of the above shortcomings of programmable I/O, we propose an improved solution where we want the CPU to be able to do other things while waiting for the I/O device, and when the I/O device completes, it generates an interrupt that stops the current process and saves the current state. A possible schematic is shown below

Although interrupts relieve the CPU of waiting on I/O devices, they are still inefficient for large data transfers, because a large number of word-by-word transfers are still required between the CPU and the I/O module. The basic steps of interrupt-driven I/O are

  • The CPU issues a read command
  • The I/O module fetches the data from the peripheral while the CPU does other work
  • The I/O module interrupts the CPU
  • The CPU requests the data
  • The I/O module transfers the data

So what we need to solve now is the efficiency of data transmission between CPU and I/O module.

I/O using DMA

DMA stands for Direct Memory Access, meaning that the CPU grants the I/O module permission to read from or write to memory without involving the CPU, so the transfer itself happens without the CPU. The process is managed by a chip called a DMA controller (DMAC). Because the DMA device can transfer data directly to and from memory, rather than using the CPU as an intermediary, congestion on the bus is relieved. DMA improves system concurrency by letting the CPU work on other tasks while the DMA controller moves data across the system and memory buses.

I/O hierarchy

I/O software is typically organized into four layers, whose general structure is shown in the following figure

Each layer and its upper and lower layers have well-defined functions and interfaces. Let’s take the opposite approach to computer networks, which is to look at these programs from the bottom up.

Here is another diagram showing all the layers of an input/output software system and its main functions.

Let’s explore the above hierarchy in detail

Interrupt handler

Interrupts occur all the time in computer systems, and they are often unpleasant to deal with. Interrupt handlers, also known as interrupt service routines (ISRs), form the layer closest to the hardware. An interrupt handler is started by a hardware interrupt, a software interrupt instruction, or a software exception, and is used to implement device drivers and transitions between protected modes of operation (such as system calls).

The interrupt handler does whatever has to be done to handle the interrupt. The cleanest arrangement is for the driver that started the I/O operation to block until the I/O completes; when the interrupt occurs, the handler finishes its work and then unblocks the waiting driver. Depending on the implementation, this notification is usually delivered in one of three ways

  • Semaphore implementation: perform an up on a semaphore
  • Monitor implementation: perform a signal on a condition variable in a monitor
  • In other cases, send a message to the blocked driver

Either way, the purpose is to get the blocked driver running again.
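
As a sketch of the semaphore variant, the fragment below uses POSIX semaphores purely to illustrate the idea; a real kernel would use its own synchronization primitives, and the function names here are invented.

```c
#include <semaphore.h>

/* io_done would be initialized elsewhere with sem_init(&io_done, 0, 0). */
static sem_t io_done;

void driver_read_block(void)
{
    /* ... write the command sequence to the controller's registers ... */
    sem_wait(&io_done);      /* block until the interrupt handler signals completion */
    /* ... copy the result to the caller and return ... */
}

void interrupt_handler(void)
{
    /* ... acknowledge the interrupt controller, read the device status ... */
    sem_post(&io_done);      /* the "up" operation that unblocks the waiting driver */
}
```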

There are many interrupt handling schemes. The ARM System Developer's Guide: Designing and Optimizing System Software lists several of them

  • Non-nested: the non-nested handler is the simplest; interrupts are serviced one at a time
  • Nested: handles multiple interrupts, but without assigning priorities
  • Reentrant: handles multiple interrupts that can be prioritized
  • Prioritized simple: handles prioritized interrupts in a straightforward way
  • Prioritized standard: higher-priority interrupts are handled with lower latency than lower-priority ones
  • Prioritized direct: higher-priority interrupts are handled in a shorter time and jump directly to a specific service routine
  • Prioritized grouped: handles interrupts organized into groups of different priorities

Below are some general interrupt handler steps, the details of which vary from operating system to operating system

  • Save any registers not already saved by the interrupt hardware
  • Set up a context for the interrupt service routine, possibly including setting up the TLB, the MMU, and a page table; if you are not familiar with these concepts, please refer to the earlier article
  • Set up a stack for the interrupt service routine
  • Acknowledge the interrupt controller; if there is no centralized interrupt controller, re-enable interrupts
  • Copy the registers from where they were saved to the process table
  • Run the interrupt service routine, which extracts information from the registers of the device controller that raised the interrupt
  • Choose which process to run next; if the interrupt has made some higher-priority process ready, choose that one
  • Set up the MMU context (possibly including the TLB) for the chosen process, as needed
  • Load the new process's registers, including its PSW
  • Start running the new process

Above we have listed some general interrupt steps. Different operating systems and interrupt handlers can handle different interrupt steps and details. Below is the detailed operation steps of a nested interrupt

Device driver

Earlier we saw what the device controller does: each controller has internal registers used to communicate with the device, issue commands, read the device's status, and so on.

Therefore, every I/O device attached to a computer needs some device-specific code to control it. For example, a mouse driver has to accept information from the mouse telling it how far it has moved and which buttons were pressed, and a keyboard driver has to know which key was struck, and so on. This device-specific code that sits between the I/O device's controller and the rest of the system is called the device driver.

To access the device's hardware, meaning the controller's registers, the device driver usually has to be part of the operating system kernel, at least in today's architectures. However, it is also possible to build user-space device drivers that perform their reads and writes through system calls. This avoids the problem of a faulty driver interfering with the kernel and crashing it, so implementing drivers in user space is a useful way to improve system stability. MINIX 3 does exactly that, as shown below

However, most desktop operating systems require drivers to run in the kernel.

Operating systems typically classify drivers as character devices and block devices, as we described above

In UNIX, the operating system is a binary program that contains all the drivers that need to be compiled into it. If you want to add a new device to UNIX, you need to recompile the kernel and put the new drivers into the binary.

With the advent of personal computers and their enormous variety of I/O devices, however, this static compilation approach stopped working well. Starting with MS-DOS, operating systems therefore moved to drivers that are dynamically loaded into the system during execution.

The device driver has many functions, such as receiving read and write requests, initializing the device, managing power and logs, and checking the validity of input parameters.

After receiving a read or write request, the device driver checks whether the device is currently in use. If it is, the request is queued for later processing. If the device is idle, the driver checks the hardware to see whether the request can be serviced now. Before transfers begin, the device or its motor may have to be started; the driver waits until the device is ready, and then the actual control begins. Controlling the device means issuing commands to it.

To issue the commands, the driver writes them into the controller's device registers. After writing each command, it checks whether the controller accepted the command and is ready to accept the next one. Controlling a device generally involves issuing a series of commands, called a command sequence; the driver checks, command by command, whether each was accepted and whether the next can be written, until the whole sequence has been issued.

After the commands have been issued, one of two situations applies. In many cases the device driver must wait until the controller does some work for it. Here it helps to recall exactly what a device controller is

The primary responsibility of the device controller is to control one or more I/O devices for data exchange between the I/O devices and the computer.

The device controller receives instructions from the CPU to control the hardware

A device controller is an addressable device that has a unique device address when it controls only one device. If the device controller controls multiple connected devices, it should contain multiple device addresses and make each device address correspond to one device.

There are two main types of device controllers: controllers for character devices and controllers for block devices

The main functions of the device controller are as follows

  • Receive and recognize commands: The device controller can accept and recognize commands from the CPU. There will also be registers inside the device controller to store instructions and parameters

  • Data exchange: Data is exchanged between the CPU, controller, and device. The CPU sends instructions to the controller over the bus or reads data from the controller in parallel. The controller writes data to the specified device.

  • Address recognition: Each hardware device has its own address. The device controller can recognize these different addresses to achieve the purpose of controlling the hardware. In addition, in order for the CPU to write or read data into registers, these registers should have unique addresses.

  • Error detection: The device controller can also detect errors in the data transferred from the device.

In the first situation, the device driver blocks until an interrupt arrives to unblock it. In the other situation, the operation completes without any delay, so the driver does not need to block. In the first case the driver is awakened by the interrupt; in the second case it never goes to sleep.

Device drivers must be reentrant, because a running driver may block, be woken up, and block again, and it has to expect being called a second time before the first call has finished. Drivers are not allowed to make system calls, but they usually do need to interact with the rest of the kernel.

Device-independent I/O software

There are two types of I/O software: device-specific software, which we described above, and device-independent software, which means that no specific device is required. The boundary between device drivers and device-independent software depends on the system. The functions shown below are implemented by device-independent software

The basic function of device-independent software is to perform common I/O functions for all devices and provide a unified interface to user-layer software.

Buffering

Buffering is an important consideration for both block and character devices. Below is the process of reading data from an ADSL(modem), which is the device we use to connect to the Internet.

One way to handle the arriving characters is this: the user program makes a read system call and blocks, waiting for a character to arrive. Each arriving character causes an interrupt, and the interrupt service routine hands the character to the user process and unblocks it. After putting the character somewhere, the process reads the next character and blocks again. The model looks like this

In this scheme there is no buffering: the user process blocks for every character and can do nothing else while waiting, which is unacceptable to the user, and it is started up and suspended once per arriving character, which is severely inefficient. Unbuffered software is therefore not a good design.

As an improvement, we can use a buffer in user space: the process provides an n-byte buffer and reads n characters at a time. In this case, the interrupt service routine puts arriving characters into the buffer until it is full, and only then wakes up the user process. This scheme is considerably better than the previous one (a small comparison follows).
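
The difference between the two schemes shows up even at the system-call level: one read per character versus one read per buffer. This is only a user-space analogy using POSIX calls, not the kernel code itself.

```c
#include <unistd.h>

#define N 4096

/* One system call (and one wakeup) per character. */
void read_one_at_a_time(int fd)
{
    char c;
    while (read(fd, &c, 1) == 1) {
        /* process c */
    }
}

/* One system call per buffer of up to N characters. */
void read_buffered(int fd)
{
    char buf[N];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* process buf[0..n-1] */
    }
}
```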

But this scheme has a problem too: what happens if the buffer has been paged out of memory when a character arrives? One solution is to lock (pin) the buffer in memory. Locking a few buffers is fine, but if many processes lock pages, the pool of pages available for paging shrinks and performance degrades.

One solution is to create a buffer inside the kernel and have the interrupt service program place characters in the buffer inside the kernel.

When the kernel buffer fills up, the page containing the user buffer is brought into memory (if necessary) and the kernel buffer is copied into the user-space buffer. This scheme also faces a problem: what if new characters arrive while the user page is still being brought in from disk and the kernel buffer is already full? Because the buffer is full, there is nowhere to put the new characters.

A very simple way out is to provide a second kernel buffer: after the first buffer fills up, new characters go into the second buffer while the first one is being emptied

When the second buffer is also full, it copies the data into user space, and then the first buffer is used to accept new characters. This design with two buffers is called double buffering.

Another form of buffering is the circular buffer. It consists of a region of memory and two pointers. One pointer points to the next free word, where new data can be placed; the other points to the first word of data in the buffer that has not yet been removed. In many situations, the hardware advances the first pointer as it adds new data, while the operating system advances the second pointer as it removes and processes data. When either pointer reaches the end of the region, it wraps around to the beginning.
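
Below is a minimal C sketch of such a circular (ring) buffer, with one index advanced by the producer and one by the consumer; the size and the full/empty convention are arbitrary choices made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 256

struct ring {
    char   data[RING_SIZE];
    size_t head;   /* next free slot: the producer writes here        */
    size_t tail;   /* oldest unread item: the consumer reads from here */
};

static bool ring_put(struct ring *r, char c)
{
    size_t next = (r->head + 1) % RING_SIZE;
    if (next == r->tail)
        return false;            /* buffer full: caller must drop or wait */
    r->data[r->head] = c;
    r->head = next;              /* wraps back to the start automatically */
    return true;
}

static bool ring_get(struct ring *r, char *out)
{
    if (r->tail == r->head)
        return false;            /* buffer empty */
    *out = r->data[r->tail];
    r->tail = (r->tail + 1) % RING_SIZE;
    return true;
}
```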

Buffers are also important for output. The output is described similarly to the input

Buffering is widely used, but it has a downside: if data is buffered too many times, performance suffers. Consider, for example, the following case.

The data is sent over the network, arrives at the receiver's network controller, is copied to a kernel buffer on the receiver, and is then copied to the receiver's user buffer. If a packet is copied from buffer to buffer too many times, performance easily degrades.

Error handling

In I/O, errors are a perfectly normal occurrence. When errors occur, the operating system must handle them as best it can. Some errors are handled only by a particular device, and some are handled by the framework, independent of a particular device.

One class of I/O errors is programming errors, such as reading from a stream whose file was never opened, or never closing a stream so that resources run out. Such problems are for the programmer to fix. The other class is actual I/O errors, such as writing to a disk block that has gone bad and cannot be written no matter how many times it is tried. These are handled by the driver, and if the driver cannot handle them, they are passed up to the layers above, as we mentioned earlier.

Unified interface for device drivers

As we said in the operating system overview, a very important function of the operating system is to hide differences in hardware behind a uniform standard, and this applies to device drivers as well. Different manufacturers write different drivers for different hardware; if each driver exposed its own separate interface to the system, the drivers could not be managed, so a single, unified driver interface is necessary.

Allocate and release

Some devices, such as printers, can be used by only one process at a time, so the operating system must check whether the device can accept another request. A simple, direct approach is to require an open on the device's special file; if the device is unavailable, the open fails. An alternative is not to fail but to block, waiting until another process releases the device and then completing the open. This leaves the choice of whether to wait up to the caller.

Note: Blocking can be implemented in many ways, such as blocking queues

Device independent blocks

Different disks have different sector sizes, but higher-level software should not have to care about them; the device-independent layer hides this and provides a uniform block size to the layers above. Some character devices deliver their data one byte at a time, while others deliver it in larger units; these differences can also be hidden.

User space I/O software

Although most I/O software lives inside the kernel, some of it is implemented in user space; there is no absolute boundary. Some I/O software consists of library procedures linked into user programs, which ultimately do their work through system calls.

Disks

Among I/O hardware, the disk has a relatively simple structure and is at the same time extremely important. Let's start with disks and their physical structure

Disk hardware

There are many kinds of disks. The simplest are magnetic hard disks, also known as hard drives, HDDs, and so on. A disk is paired with magnetic heads mounted on an arm that read and write data on its surfaces, so reads and writes are equally fast. Data on a disk is accessed randomly, meaning individual blocks of data can be stored and retrieved in any order, with the head moving to wherever the desired block lies. Disks are non-volatile devices: the data survives even after a power failure.

For a long time, computer data was stored mainly on such magnetic disks. However, with the growing popularity of solid-state drives (SSDs), which contain no moving parts, SSDs have increasingly become the preferred storage in computers.

Disk structure

To organize and retrieve data, disks are organized into specific structures, namely tracks, sectors, and cylinders

Each disk surface is divided into many concentric tracks, like the rings of a tree


The platters are mounted on a common spindle and organized into cylinders: the set of tracks at one arm position forms a cylinder, and each track is divided into a number of sectors. Floppy disks have about 8 to 32 sectors per track, hard disks can have several hundred sectors per track, and there are typically 1 to 16 heads.

A device feature with important implications for the disk driver is whether the controller can perform seeks on two or more drives at the same time, known as overlapped seeks. The controller can start a seek on one drive while other drives wait for their seeks to finish, and many controllers can also read or write on one drive while seeking on others; a floppy disk controller, however, cannot read or write on two drives at the same time.

RAID

RAID stands for Redundant Array of Independent Disks, or disk array for short. It uses virtualization technology to combine multiple disks into one or more arrays in order to improve performance or provide data redundancy.

RAID has different levels

  • RAID 0 – Striping without fault tolerance
  • RAID 1 – Mirroring and duplexing
  • RAID 2 – Memory-style error-correcting codes
  • RAID 3 – Bit-interleaved parity
  • RAID 4 – Block-interleaved parity
  • RAID 5 – Block-interleaved distributed parity
  • RAID 6 – P + Q redundancy

Disk formatting

A disk consists of one or more platters of aluminum, alloy, or glass. When a disk is manufactured, it holds no information at all. Before it can be used, it must be given a low-level format. Here is what a low-level-formatted sector looks like

The preamble marks the start of the sector. It begins with a particular bit pattern and also contains the cylinder and sector numbers, the sector size, and similar bookkeeping data, as well as synchronization information. The preamble is followed by the data area, whose size is determined by the low-level formatting program; most disks use 512-byte sectors. After the data area comes the ECC (error-correcting code). Unlike simple error detection, an ECC can also be used to recover from read errors. Its size differs between disk manufacturers; the choice depends on how much disk space the designer is willing to sacrifice for higher reliability and on how complex an ECC the controller can handle. A 16-byte ECC field is typical. In addition, hard disks reserve a certain number of spare sectors that are used to replace sectors with manufacturing defects.
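
As a rough picture, the layout just described could be written down like this; the preamble and ECC sizes are only illustrative, since the real formats are manufacturer-specific.

```c
#include <stdint.h>

/* Illustrative sketch of a low-level-formatted sector. */
struct sector {
    uint8_t preamble[16];   /* start pattern, cylinder and sector numbers, sync info */
    uint8_t data[512];      /* data area, size fixed at low-level format time        */
    uint8_t ecc[16];        /* error-correcting code covering the data area          */
};
```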

After low-level formatting, the position of sector 0 is offset from the previous track, as shown in the following figure

This technique, called cylinder skew, is used to improve performance. The idea is this: while the head is moving (seeking) from one track to the next, the disk keeps rotating. If sector 0 were at the same angular position on every track, then by the time the head arrived at the next track, sector 0 would already have passed under it, and the drive would have to wait almost a full rotation before it could continue reading. Offsetting sector 0 on each successive track eliminates that wait.

The amount of cylinder skew depends on the drive geometry. The cylinder skew is the offset of sector 0 between two adjacent tracks, as shown in the figure below

Note that in addition to cylinder skew there is also head skew, although it is comparatively small.

Formatting reduces the disk's capacity. How much capacity is lost depends on the size of the preambles, the gaps between sectors, the size of the ECC, and the number of spare sectors reserved.

The final step before the disk can be used is to give each partition a high-level format, which sets up a boot block, the free-storage administration (a bitmap or a free list), a root directory, and an empty file system. This step also puts a code in the partition table entry telling which file system the partition uses, since many operating systems support multiple (incompatible) file systems. After this step, the system can be booted.

When the power is turned on, the BIOS runs first; it reads in the master boot record and jumps to it. This boot program then checks which partition is active, reads in the boot sector from that partition, and runs it. The boot sector contains a small program that loads a larger boot loader, which searches the file system for the operating system kernel; the kernel is then loaded into memory and executed.

Here’s what a boot sector is: A boot sector is a reserved sector of a disk or storage device that contains data or code necessary to complete the computer or disk boot process.

The boot sector stores boot record data, which is used to provide instructions when the computer boots up. There are two different types of boot sectors

  • Master Boot Record is called the Master boot sector
  • Volume Boot Record Volume boot record

On a partitioned disk, the boot sector is the master boot record; on an unpartitioned disk, it is the volume boot record.

Disk arm scheduling algorithm

Let's now discuss the algorithms that affect disk read and write performance. In general, the time required to read or write a disk block is determined by the following factors

  • Seek time – The seek time is the time to move the disk arm to the disk block to be read
  • Rotation delay – The time required to wait for the appropriate sector to rotate under the head
  • The actual reading or writing time of data

Together, these three components determine the disk access time. In general, the seek time has the greatest impact on the total, so effectively reducing seek time improves disk read performance the most (a rough calculation follows).
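
A back-of-the-envelope calculation shows how the three components add up. All the numbers below (a 9 ms average seek, a 7200 RPM spindle, a 150 MB/s transfer rate, a 4 KB request) are assumed values for illustration only.

```c
#include <stdio.h>

int main(void)
{
    double seek_ms      = 9.0;                        /* assumed average seek time          */
    double rpm          = 7200.0;
    double rotation_ms  = 60000.0 / rpm;              /* one full rotation: about 8.33 ms   */
    double rot_delay_ms = rotation_ms / 2.0;          /* on average, half a rotation        */
    double transfer_ms  = 4096.0 / (150e6 / 1000.0);  /* 4 KB at an assumed 150 MB/s        */

    printf("seek %.2f + rotation %.2f + transfer %.3f = %.2f ms\n",
           seek_ms, rot_delay_ms, transfer_ms,
           seek_ms + rot_delay_ms + transfer_ms);
    return 0;
}
```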

If the disk driver accepts requests one at a time and services them in the order received, that is, first-come, first-served (FCFS), little can be done to optimize seek time. Each request is processed in arrival order regardless of position, so the arm may wait almost a full rotation to finish one request while requests on other cylinders that could be serviced immediately sit in the queue.

Normally, while the disk is seeking, other processes will make additional disk requests. The disk driver maintains a table in which cylinder numbers are recorded as indexes. Unfinished requests for each cylinder form a linked list. The linked list heads are stored in the corresponding entries in the table.

An alternative to the first-come, first-served algorithm is Shortest Seek First (SSF), described below.

Suppose that while the arm is at track 6, requests arrive for tracks 11, 2, 4, 14, 8, 15, and 3. With first-come, first-served, the arm moves as shown in the figure below

We can work out that the disk arm crosses 5 + 9 + 2 + 10 + 6 + 7 + 12 = 51 cylinders. If Shortest Seek First is used instead, the arm movement is


The arm now crosses only 2 + 1 + 1 + 6 + 3 + 3 + 1 = 17 cylinders (serving the tracks in the order 4, 3, 2, 8, 11, 14, 15), about a third of the 51 required by first-come, first-served.
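
The arithmetic above can be checked with a few lines of C. The program below computes the total arm movement for the example request list under first-come, first-served and under a greedy shortest-seek-first, breaking ties toward the lower-numbered cylinder (which is why it visits 4 before 8).

```c
#include <stdio.h>
#include <stdlib.h>

/* Total arm movement when requests are served in arrival order (FCFS). */
static int fcfs_movement(int start, const int *req, int n)
{
    int pos = start, total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(req[i] - pos);
        pos = req[i];
    }
    return total;
}

/* Total arm movement under greedy shortest seek first (SSF). */
static int ssf_movement(int start, const int *req, int n)
{
    int pending[16];
    for (int i = 0; i < n; i++) pending[i] = req[i];
    int pos = start, total = 0;
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++) {          /* pick the nearest pending cylinder */
            if (pending[i] < 0) continue;      /* already served                    */
            if (best < 0 ||
                abs(pending[i] - pos) < abs(pending[best] - pos) ||
                (abs(pending[i] - pos) == abs(pending[best] - pos) &&
                 pending[i] < pending[best]))
                best = i;
        }
        total += abs(pending[best] - pos);
        pos = pending[best];
        pending[best] = -1;
    }
    return total;
}

int main(void)
{
    int req[] = {11, 2, 4, 14, 8, 15, 3};
    int n = sizeof(req) / sizeof(req[0]);
    printf("FCFS: %d cylinders\n", fcfs_movement(6, req, n)); /* prints 51 */
    printf("SSF:  %d cylinders\n", ssf_movement(6, req, n));  /* prints 17 */
    return 0;
}
```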

However, the Shortest Seek First algorithm is not perfect; it has a fairness problem: if requests keep arriving for cylinders near the arm's current position, requests for cylinders far away may be postponed for a very long time.

A useful model here is the elevator in everyday life, which uses the elevator algorithm to reconcile the conflicting goals of efficiency and fairness. An elevator generally keeps moving in one direction until there are no more requests in that direction, and then reverses.

The elevator algorithm maintains a single bit, the current direction: UP or DOWN. When a request has been handled, the disk or elevator driver checks this bit. If it is UP, the arm or cab moves to the next pending request at a higher position; if there are no pending requests above, the direction is reversed. When the bit is DOWN and there is a pending request below, the arm moves there. If nothing is pending, it simply stops and waits.

Let’s take an example to describe the elevator algorithm. For example, each cylinder is served in the order of 4,7,10,14,9,6,3,1. Then its flow chart is as follows

So the number of cylinders the elevator algorithm crosses is 3 + 3 + 4 + 5 + 3 + 3 + 2 = 23

In terms of total arm motion, the elevator algorithm is usually slightly worse than SSF.

A further optimization is possible with disk controllers that let the software check the sector number currently passing under the head: if two or more requests for the same cylinder are pending, the driver can issue a request for the sector that will pass under the head next.

Note that when a cylinder has multiple tracks, consecutive requests may be for different tracks. This choice costs nothing, because selecting a different head requires no arm movement and incurs no rotational delay.

Seek time and rotational latency are the dominant performance costs of a disk, so reading only one or two sectors per request is very inefficient. For this reason, many disk controllers always read several sectors and cache them, even when only one was requested. Typically the requested sector is read together with the rest of its track, or with as much of the track as fits in the controller’s cache.

The disk controller’s cache is somewhat different from the operating system’s cache: the controller’s cache holds blocks that were not actually requested but happened to pass under the head, whereas the operating system’s cache holds blocks that were explicitly read and that the operating system expects to need again in the near future.

When several drives hang off the same controller, the operating system should maintain a separate table of pending requests for each drive. Whenever a drive is idle, a seek should be issued to move its arm to the cylinder of its next request. When the current transfer finishes, the driver checks whether any drive is already positioned at the correct cylinder; if none is, it issues a new seek on the drive that has just completed its transfer and waits for the next interrupt to see which arm reaches its target first.

Error handling

A disk may have defects from manufacturing. If a defect is small, only a few bits, the bad sector can simply continue to be used, with the ECC correcting the errors on every access. If the defect is larger, the error cannot be masked that way.

In general there are two ways to deal with bad blocks: handle them in the controller, or handle them at the operating system level.

Both approaches are commonly used. Consider, for example, a disk with 30 data sectors and two spare sectors, in which sector 4 is defective.

One thing the controller can do is remap the defective sector to one of the spare sectors.

Another way is to shift all the sectors up by one to bypass the bad one.

In either case the controller has to know which sector is which. It can track this information in internal tables, or it can rewrite the preambles to give the remapped sector numbers. If the preambles are rewritten, the approach that shifts the sectors requires rewriting all the subsequent preambles as well, but it ultimately gives better performance.
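The table-based approach can be pictured with a tiny sketch. Everything here is invented for illustration (the class SectorRemap, its methods, and the sector numbering); it is not how any particular controller is implemented:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of controller-side bad-sector remapping using an
// internal table: defective sectors are redirected to spare sectors.
public class SectorRemap {
    private final Map<Integer, Integer> remap = new HashMap<>();

    // Record that a defective sector should be served by a spare sector.
    void markBad(int badSector, int spareSector) {
        remap.put(badSector, spareSector);
    }

    // Translate the sector number requested by the host into the one actually used.
    int translate(int requestedSector) {
        return remap.getOrDefault(requestedSector, requestedSector);
    }
}
```

For the example above, assuming the data sectors are numbered 0 to 29 and the spares are 30 and 31, markBad(4, 30) would redirect every request for sector 4 to spare sector 30.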

Stable storage

Disk errors do occur: a good sector can suddenly become a bad one, or a whole drive can fail. RAID can protect against a few bad sectors or even a failed drive, but it cannot protect against writes that record bad data in the first place, nor against a crash during a write that corrupts the original block.

We would like disks to work perfectly, which they do not. What we can aim for is a disk subsystem with the following property: when a write command is issued, the disk either writes the data correctly or does nothing, leaving the existing data intact. Such a system is called stable storage. The goal of stable storage is to keep the disk consistent at all costs.

Stable storage uses a pair of identical disks; the corresponding blocks on the two disks work together to form one error-free block. Stable storage defines the following three operations:

  • Stable write
  • Stable read
  • Crash recovery

A stable write first writes the block to, say, drive 1 and then reads it back to verify that it was written correctly. If verification fails, the write and read-back are retried; if the block still cannot be written correctly after several attempts, a spare block is used instead and the write and verification are repeated there until they succeed. However many spare blocks it takes, drive 2 is written and read back only after the write to drive 1 has succeeded, so in effect we always end up writing to both drives.

A stable read first reads the block from drive 1. If the read produces an incorrect ECC, it is retried; if every retry gives an incorrect ECC, the block is read from drive 2 instead. Since a successful stable write always leaves two good copies of the block, one of them can be read.

Crash recovery means that, after a crash, a recovery program scans both disks and compares corresponding blocks. If a pair of blocks are both good and identical, nothing is done. If one of them shows an ECC error, the bad block is overwritten with its good counterpart. If both are good but different, the block from drive 1 is written over the block on drive 2.
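To make the three operations concrete, here is a highly simplified sketch in Java. The Disk interface and its methods are hypothetical stand-ins for the real drive plus ECC hardware, and the retry limit is an arbitrary choice for the example:

```java
// Highly simplified sketch of stable storage over two drives.
// `Disk` is a hypothetical interface invented for this illustration;
// writeVerified() is assumed to write a block and read it back,
// returning false if verification (e.g. the ECC check) fails.
interface Disk {
    boolean writeVerified(int block, byte[] data);
    byte[] readWithEcc(int block);           // returns null on an ECC error
}

class StableStorage {
    private final Disk drive1, drive2;
    private static final int MAX_RETRIES = 3; // arbitrary retry limit for the sketch

    StableStorage(Disk d1, Disk d2) { this.drive1 = d1; this.drive2 = d2; }

    // Stable write: keep writing and verifying on drive 1 until it succeeds,
    // then do the same on drive 2, so both drives hold a verified copy.
    void stableWrite(int block, byte[] data) {
        while (!drive1.writeVerified(block, data)) { /* retry, possibly on a spare block */ }
        while (!drive2.writeVerified(block, data)) { /* retry, possibly on a spare block */ }
    }

    // Stable read: try drive 1 a few times, then fall back to drive 2.
    byte[] stableRead(int block) {
        for (int i = 0; i < MAX_RETRIES; i++) {
            byte[] data = drive1.readWithEcc(block);
            if (data != null) return data;
        }
        return drive2.readWithEcc(block);     // a stable write left a good copy here
    }

    // Crash recovery: compare each pair of blocks and repair inconsistencies.
    void recover(int blockCount) {
        for (int b = 0; b < blockCount; b++) {
            byte[] d1 = drive1.readWithEcc(b);
            byte[] d2 = drive2.readWithEcc(b);
            if (d1 == null && d2 != null) drive1.writeVerified(b, d2);      // fix bad copy on drive 1
            else if (d2 == null && d1 != null) drive2.writeVerified(b, d1); // fix bad copy on drive 2
            else if (d1 != null && d2 != null && !java.util.Arrays.equals(d1, d2))
                drive2.writeVerified(b, d1);                                // both good but different: drive 1 wins
        }
    }
}
```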

This scheme works as long as the CPU does not crash, because a stable write always leaves two valid copies of each block and we assume that spontaneous errors never hit both corresponding blocks at the same time. What happens if the CPU crashes in the middle of a stable write? Depending on exactly when the crash occurs, there are five scenarios:

  • In the first case, the crash occurs before either write has started. Nothing needs to be repaired during recovery, and the old value remains in place.

  • In the second case, the CPU crashes while drive 1 is being written, corrupting the contents of that block. The recovery program detects this and restores the block on drive 1 from the good copy on drive 2, so the old value survives.

  • In the third case, the crash occurs after drive 1 has been written but before drive 2 is written. Because drive 1 was written and verified successfully, recovery copies its new block onto drive 2, so the new value takes effect.

  • In the fourth case, the crash occurs after drive 1 has been written and while drive 2 is being written, corrupting drive 2’s block. During recovery the bad block is replaced with the good one from drive 1, so both blocks end up with the new value.

  • In the final case, the crash occurs after both drives have been written, in which case no problem arises.

Optimizations and improvements of this scheme are possible, but they come at a cost. One improvement is to keep track of which block is being written during a stable write, so that after a crash only that block needs to be checked rather than the whole disk pair. Nonvolatile RAM, which keeps its contents across a crash, can be used to record this, though it is not always available.

The clock

Clocks, also called timers, are essential to any multiprogrammed system. The clock maintains the time of day, prevents one process from monopolizing the CPU, and performs several other functions. Clock software is also a form of device driver. As usual we introduce the hardware first and then the software, working bottom-up; this also reflects that the lowest layer is the most fundamental.

The clock hardware

There are two kinds of clocks used in computers, and both are quite different from the clocks and watches people use in daily life.

  • The simpler kind is tied to the 110 V or 220 V power line, so that every voltage cycle causes an interrupt, at 50 or 60 Hz. These clocks used to dominate.
  • The other kind is built from a crystal oscillator, a counter, and a holding register, as shown in the diagram below.

This kind of clock is called a programmable clock, and it has two modes. In one-shot mode, when the clock is started it copies the value of the holding register into the counter; each pulse from the crystal oscillator then decrements the counter by 1. When the counter reaches 0, the clock generates an interrupt and stops until the software explicitly starts it again. In square-wave mode, when the counter reaches 0 an interrupt is generated and the value of the holding register is automatically copied back into the counter, so the whole cycle repeats indefinitely. These periodic interrupts are called clock ticks.
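The difference between the two modes can be mimicked with a small simulation. This is purely illustrative; real programmable clocks are hardware, and the names below are invented:

```java
// Tiny software simulation of a programmable clock, for illustration only.
class ProgrammableClock {
    enum Mode { ONE_SHOT, SQUARE_WAVE }

    private final int holdingRegister;    // value to load into the counter
    private final Mode mode;
    private int counter;
    private boolean stopped;

    ProgrammableClock(int holdingRegister, Mode mode) {
        this.holdingRegister = holdingRegister;
        this.mode = mode;
        this.counter = holdingRegister;   // started: copy the register into the counter
    }

    // Called once per crystal-oscillator pulse; returns true when an interrupt fires.
    boolean pulse() {
        if (stopped) return false;
        if (--counter == 0) {
            if (mode == Mode.ONE_SHOT) {
                stopped = true;              // wait until software starts the clock again
            } else {
                counter = holdingRegister;   // square-wave: reload and keep going
            }
            return true;                     // interrupt, i.e. one clock tick
        }
        return false;
    }
}
```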

The clock software

The clock hardware does nothing more than generate interrupts at known intervals; everything else is done by the clock software. The exact duties vary from one operating system to another, but they generally include the following (a sketch of a tick handler covering several of them appears after the list):

  • Maintain the time of day
  • Prevent a process from running longer than its allotted time
  • Collect statistics about CPU usage
  • Handle the alarm system call made by user processes
  • Provide watchdog timers for parts of the system itself
  • Do profiling, monitoring, and statistics gathering
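Here is the promised sketch of a tick handler covering the first few duties. It is illustrative Java; all field and method names are invented:

```java
// Illustrative sketch of what clock software does on every clock tick.
class ClockSoftware {
    private long ticksSinceBoot;       // maintains the time of day, in ticks
    private int currentQuantum;        // remaining time slice of the running process
    private long currentProcCpuTicks;  // CPU-usage accounting for the running process

    // Called from the clock interrupt handler, once per tick.
    void onTick() {
        ticksSinceBoot++;              // 1. maintain the time of day
        currentProcCpuTicks++;         // 3. collect CPU-usage statistics
        if (--currentQuantum <= 0) {   // 2. stop a process that used up its quantum
            scheduleNextProcess();
        }
        // Duties 4-6 (alarms, watchdog timers, profiling) would also be checked here.
    }

    private void scheduleNextProcess() {
        // hand the CPU to another process (details omitted in this sketch)
    }
}
```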

Soft timer

A programmable clock can be set to trigger interrupts at whatever rate a program needs. Interrupts produced this way are hard (hardware) interrupts, and for some applications their overhead is unacceptable.

In that case a soft timer can be used to avoid interrupts. Whenever the kernel is entered for some other reason, it checks, just before returning to user mode, whether a soft timer has expired. If it has, the scheduled event is run on the spot; there is no need for an extra switch into kernel mode because we are already there. This avoids frequent switching between kernel mode and user mode and improves efficiency.

How often the kernel is entered, and therefore how often soft timers are checked, depends on what causes the entry. Kernel entries are triggered by events such as the following (a sketch of the check appears after the list):

  • System calls
  • TLB misses
  • Page faults
  • I/O interrupts
  • The CPU going idle
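Here is the promised sketch of the soft-timer check. It is illustrative only; a real kernel would keep such timers in a sorted list and use its internal clock, but the idea is the same:

```java
// Illustrative sketch of a soft timer: no interrupt is programmed; instead,
// every kernel entry (system call, TLB miss, page fault, I/O interrupt, ...)
// checks whether a scheduled event is due before returning to user mode.
class SoftTimer {
    private long expiresAt;              // time at which the event is due
    private final Runnable event;        // work to perform when the timer expires

    SoftTimer(long expiresAt, Runnable event) {
        this.expiresAt = expiresAt;
        this.event = event;
    }

    // Called on every kernel entry, while we are already in kernel mode,
    // so running the event needs no extra user/kernel switch.
    void checkOnKernelEntry(long now, long interval) {
        if (now >= expiresAt) {
            event.run();
            expiresAt = now + interval;  // rearm the timer for the next interval
        }
    }
}
```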

I have launched an open source project, bestJavaer, at github.com/crisxuan/be… Welcome to give it a Star.