Introduction

While reading Linux Kernel Design and Implementation, I found that reading a book alone gives too narrow a view of how a real kernel is designed; it takes practice to really understand it. I therefore found two minimalist kernels (compared with a real kernel, these two small pieces of code are more like kernel components) and use them here to illustrate how to design a simple kernel. Reading this article requires some knowledge of assembly language and C.

Basics

First let's look at what the Linux kernel is and what it does. The operating system is the most important piece of software on a computer: users need it in order to run all kinds of applications. User applications obtain computer resources only indirectly, by asking the operating system, while the operating system can use those resources directly, so the operating system acts as the computer's resource manager.

An operating system includes some basic components, such as a text editor, a compiler, programs for user interaction, and the kernel. The kernel is the most important and most basic part of the operating system. It interacts directly with the hardware, acts as the underlying driver layer, and addresses the various devices and components in the system. It manages system resources such as processes, file systems, synchronization, memory, and network protocols, together with permission control. Through the kernel, the shared resources of the computer (CPU, memory, and so on) are allocated to each process. The kernel also provides a set of system calls, and applications invoke system calls much as they call ordinary C functions.

From the outermost layer to the innermost, the layering of kernel and applications is: user applications > UNIX commands and libraries > system call interface > kernel > hardware.
Here is a complete map of the Linux kernel. It shows more than 400 functions grouped by subsystem, with lines connecting the functions that interact with each other:

The architecture of the Linux kernel is shown below:

The Linux kernel mainly consists of the system call interface (SCI), process management, memory management, the virtual file system, the network protocol stack, device drivers, and architecture-dependent code.

1. System call interface. The SCI provides a set of system calls that act as a bridge between user space and kernel space. It is a function-call multiplexing and demultiplexing service, and the interface depends on the architecture the kernel runs on.

2. Process management. Linux does not draw a sharp distinction between threads and processes: a thread is treated as a lightweight process. A thread has no separate address space; all threads in a process share the address space and resources such as file system information, file descriptors, and signal handlers, while each thread keeps its own private data, such as its stack. Processes are created through the SCI's fork and exec APIs, a process suspends itself until a child changes state through the wait APIs, processes are terminated through kill and exit, and they communicate and synchronize through signals or the POSIX mechanisms. The process descriptor records the state of each process, and the scheduler finally decides which process gets to run.

3. Memory management. The kernel uses memory management so that multiple processes can safely share the machine's memory. Memory is managed in pages, which are the smallest unit of memory from the virtual-memory point of view. The page size depends on the architecture: 4 KB is the usual size, including on x86 and x86-64, while some architectures, such as Alpha, use 8 KB pages. Linux provides abstractions on top of these 4 KB buffers, such as the slab allocator. This scheme uses the 4 KB buffers as its base, allocates structures from them, tracks how the pages are used, and adjusts memory usage dynamically according to system needs. Because memory has many users, the available memory can be exhausted. In that case, pages can be moved out of memory and onto disk.
This process is called swapping, because pages are swapped between memory and disk.

4. Virtual file system. The virtual file system (VFS) is an interface layer between the Linux kernel and the underlying file systems and I/O devices. Through this interface the kernel can access different I/O devices in the same way, and applications can use the same interface to read and write data on different file systems and different media. The VFS can connect various file systems because it defines the basic, conceptual interfaces and data structures that all file systems support. The VFS takes an object-oriented approach: its objects are structures that contain both data and pointers to the functions that operate on that data. The four main VFS object types are the superblock, the inode, the directory entry, and the file. Because the VFS is part of the Linux kernel and is pure software, it requires no hardware support.

5. Network protocol stack. The network stack implements the network protocols, and its design follows the layered architecture of the protocols themselves.

6. Device drivers. A device driver is a special program that lets the computer communicate with a device. It acts as the interface to the hardware, through which the operating system controls the device. Much of the kernel source code consists of drivers for specific hardware devices. The Linux source tree has a drivers subdirectory, which is further divided by the kind of device supported, such as Bluetooth, I2C, serial, and so on.

7. Architecture-dependent code. Linux is largely independent of the architecture it runs on, but some parts must take the architecture into account to operate correctly and efficiently. The architecture-dependent portion of the kernel lives in the arch directory of the source tree, which contains one subdirectory per architecture (together these make up the BSP).
Each architecture subdirectory contains further subdirectories, each focusing on one aspect of the kernel, such as boot, kernel, memory management, and so on.

Linux uses a monolithic kernel architecture: all kernel functionality is built into one large image, so one part of the kernel can call another directly as a function call, which avoids communication overhead inside the kernel. The drawback is that an error in one function can affect everything that follows, and maintainability suffers. To improve extensibility and maintainability, the Linux kernel uses a modular design in which software components can be added or removed. These components are called loadable kernel modules (LKMs). They can be compiled separately but cannot run without the kernel; when loaded, they are linked into the kernel and run in kernel space. Loadable kernel modules include device drivers and other kernel extensions. They can be loaded at boot time or at any later time as needed, to implement file systems, add devices, add system calls, or extend the kernel in other ways.

By now you should have a general idea of the components of the Linux kernel. A complete kernel involves many more details, of course, such as system calls, kernel data structures, scheduling algorithms, interrupts, and synchronization. Because this article only shows how to make a tiny kernel, the knowledge above is more than enough for what follows. Readers interested in the full implementation details of the Linux kernel can read Linux Kernel Design and Implementation and download the Linux source code to study.

The rest of this article uses two minimalist kernels on GitHub to learn how to write a simple kernel.

Example one

This example's kernel prints “My First kernel” on the screen and then hangs. You need a machine (a virtual machine is fine) with the NASM assembler installed (I used Ubuntu 20.04 for the demonstration).
Before starting the demonstration, we need to know what happens between powering on a computer and running user applications.

Step 1. After the computer is powered on, the motherboard signals the power supply, which checks that the hardware is in a normal state. The CPU then clears the data in its registers and sets them to predefined values. The reset vector (the address of the first instruction executed after the CPU is reset) is 0xFFFFFFF0 on the 80386 and later CPUs: 0xFFFFFFF0 = 0xFFFF0000 (base address of the CS register) + 0xFFF0 (value of the EIP register, the offset). The code at 0xFFFFFFF0 is a jump instruction that transfers control to the BIOS.

Step 2. After the BIOS finishes its initialization, it reads the MBR (the first sector of the boot disk) and copies its contents, including the master boot program, to physical address 0x7C00. The master boot program prepares the hardware for the kernel and is also responsible for loading and running GRUB (the boot loader, which finds the kernel and loads it into memory).

Step 3. Once GRUB has loaded the kernel into memory and the kernel has been decompressed, start_kernel() is called. It runs a series of initialization functions and initializes the various devices, at which point the kernel is fully up. The first program the kernel runs is /sbin/init, which, on systems that use SysV init, reads /etc/inittab and initializes the system accordingly. After the system is initialized and the kernel modules are loaded, the user reaches the login prompt.

Let's use the first example to mimic how a kernel is started.
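The address arithmetic in Step 1 and the load address in Step 2 can be checked with a few lines of C. This is only a sketch of the calculation, not code from either example kernel; the names linear_address, reset_vector, and the constants are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values described in Step 1 for an 80386-class CPU coming out of reset. */
#define RESET_CS_BASE    0xFFFF0000u  /* base address held by the CS register */
#define RESET_EIP        0x0000FFF0u  /* value of EIP, used as the offset */

/* Physical address to which the BIOS copies the boot sector (Step 2). */
#define MBR_LOAD_ADDRESS 0x00007C00u

/* Address of the next instruction: segment base plus offset. */
static uint32_t linear_address(uint32_t segment_base, uint32_t offset)
{
    /* At reset the offset fits in 16 bits. */
    assert(offset <= 0xFFFFu);
    return segment_base + offset;
}

/* Address of the first instruction fetched after reset. */
static uint32_t reset_vector(void)
{
    return linear_address(RESET_CS_BASE, RESET_EIP);
}
```

Calling reset_vector() gives 0xFFFFFFF0, which is 16 bytes below the 4 GB boundary, so there is only room there for a jump into the BIOS code.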
After the boot loader starts the kernel code, a real kernel sets up the kernel stack, the BSS segment, and so on, jumps to the kernel entry point, and then runs the rest of the kernel (Linux uses a monolithic architecture, so the kernel contains all the necessary functionality). This is a complex and rigorous process. In this example, however, the kernel code has only one function, because the example only demonstrates how a kernel is started. High-level languages such as C cannot do everything needed at this stage (for instance, switching from real mode to protected mode can only be done in assembly), so we write a small assembly program that starts the kernel code:

As the comments show, after setting up data, variables, and so on, the program calls kmain at the end. Now take a look at kmain:

Here kmain is our kernel. It calls no other functions, and what it does is very simple. The vidptr pointer points to address 0xB8000, where text-mode video memory starts. Each character cell on the screen takes two bytes: the character itself and an attribute byte. The first while loop clears the screen by writing a blank character with attribute 0x07 (light grey on black) to every cell, and the second while loop writes the string “My First kernel” with attribute 0x07 into video memory so that it appears on the screen.

In addition to the two files above, we need a linker script, link.ld, which we pass as a parameter to the linker.
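To make the two loops of kmain concrete, here is a minimal sketch that can run as an ordinary user-space program. It is not the original kernel.c: the real kernel writes through a pointer to 0xB8000, whereas this sketch substitutes an ordinary array (fake_vidmem) so that it runs without hardware, and the function names clear_screen and print_string are chosen for illustration. The 80 x 25 screen size is the standard VGA text mode and is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    VGA_COLS = 80,
    VGA_ROWS = 25,
    VGA_CELL_BYTES = 2,           /* one character byte + one attribute byte */
    VGA_BUFFER_BYTES = VGA_COLS * VGA_ROWS * VGA_CELL_BYTES,
    VGA_ATTR_LIGHT_GREY = 0x07    /* light grey on black */
};

/* Stand-in for the text-mode buffer that a real kernel reaches at 0xB8000. */
static uint8_t fake_vidmem[VGA_BUFFER_BYTES];

/* First loop: fill every cell with a blank character in light grey. */
static void clear_screen(uint8_t *vidptr)
{
    size_t j = 0;

    assert(vidptr != NULL);
    while (j < (size_t)VGA_BUFFER_BYTES) {
        vidptr[j] = ' ';
        vidptr[j + 1] = VGA_ATTR_LIGHT_GREY;
        j += VGA_CELL_BYTES;
    }
}

/* Second loop: copy the string into the buffer, one cell per character.
 * Returns the number of characters written. */
static size_t print_string(uint8_t *vidptr, const char *str)
{
    size_t i = 0;   /* index into the string */
    size_t j = 0;   /* byte index into the buffer */

    assert(vidptr != NULL && str != NULL);
    while (str[i] != '\0' && i < (size_t)(VGA_COLS * VGA_ROWS)) {
        vidptr[j] = (uint8_t)str[i];
        vidptr[j + 1] = VGA_ATTR_LIGHT_GREY;
        ++i;
        j += VGA_CELL_BYTES;
    }
    return i;
}
```

Calling clear_screen(fake_vidmem) and then print_string(fake_vidmem, "My First kernel") leaves the characters at the even byte offsets and 0x07 at the odd ones, which is the layout the display hardware expects in the real buffer.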

The link.ld script sets the output format of the executable to a 32-bit executable. ENTRY(start) names the symbol start as the entry point of the executable. The location counter (.) is set to 0x100000, and the kernel code is placed starting from there (the value is used for demonstration and can be changed). .text : { *(.text) }, .data : { *(.data) }, and .bss : { *(.bss) } make the linker merge the .text sections of all object files into the .text section of the executable, the .data sections into its .data section, and the .bss sections into its .bss section. After the linker has placed the .text output section, the value of the location counter becomes 0x100000 plus the size of the .text output section.

With these three files in place, we still need GRUB to load the kernel, which it does according to the Multiboot specification. The specification requires the kernel to have a header within its first 8 KB, so four lines of code are added to the .text section of kernel.asm. The align directive aligns the header (the Multiboot specification requires it to be 32-bit aligned). The magic field is set to 0x1BADB002, which identifies the header. The flags field is not needed here and is set to 0. The checksum field must give 0 when added to the magic field and the flags field. With the header added, kernel.asm is complete.
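The checksum rule is plain 32-bit arithmetic and can be verified in C. The following is a sketch of the calculation only; the function names multiboot_checksum and multiboot_header_ok are hypothetical, and in the example kernel the value is produced by the assembler rather than computed at run time.

```c
#include <assert.h>
#include <stdint.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u

/* Checksum such that magic + flags + checksum == 0 in 32-bit arithmetic. */
static uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
    uint32_t sum = (uint32_t)(0u - (magic + flags));

    /* Unsigned arithmetic wraps modulo 2^32, so the three fields cancel. */
    assert((uint32_t)(magic + flags + sum) == 0u);
    return sum;
}

/* Returns 1 if the three header fields satisfy the rule, 0 otherwise. */
static int multiboot_header_ok(uint32_t magic, uint32_t flags,
                               uint32_t checksum)
{
    return magic == MULTIBOOT_HEADER_MAGIC
        && (uint32_t)(magic + flags + checksum) == 0u;
}
```

With flags set to 0, multiboot_checksum(MULTIBOOT_HEADER_MAGIC, 0) is 0xE4524FFE, and a boot loader can confirm that a candidate header is valid by repeating the addition, as multiboot_header_ok does.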

Once these files are ready, use NASM to assemble kernel.asm into one object file, compile kernel.c into another object file, and finally use link.ld to link the two object files together.

GRUB can be configured to boot the kernel built above. For convenience, we simply use the QEMU emulator to see the result:

Example 2

This example extends Example 1. Here the kernel communicates with an I/O device through I/O ports (from the kernel's point of view, a port is a specific address on the I/O bus). Through the I/O ports the kernel accesses the device's control register (a register that controls the behavior of the device) and its data register, so that it can receive a-z, 0-9, and some symbols from the keyboard and print them to the screen. The environment is the same as in Example 1. As in Example 1, this example needs a kernel.asm file to start the kernel.

The code above is commented. Here is a brief description of how this kernel.asm differs from the one in Example 1. keyboard_handler is the keyboard interrupt handler; it calls keyboard_handler_main, which handles keyboard input, including key code conversion and output. read_port reads an I/O port: the port number is placed in the DX register, and the in instruction puts the data read into the AL register. write_port writes an I/O port: it uses the out instruction to write the data in the AL register to the port given in the DX register. load_idt loads the interrupt descriptor table (IDT). keyboard_map.h:

Above is the keyboard mapping table, which keyboard_handler_main() uses to convert scan codes to ASCII. kernel.c:

The IDT_entry structure and idt_init above create the IDT so that keyboard input can be read from the I/O port. The interrupt descriptor table (IDT) is a system table through which the CPU finds the handler to run when an interrupt occurs.

A word about interrupts. On early computers, finding out which device had an event to report required monitoring the state of the devices, a method called polling. Because the program could only check the devices one after another, resource utilization was low. Interrupts were introduced to improve utilization, to make control more convenient for the user, and to let programs run more concurrently. Put simply, interrupts let the kernel make sure that every process gets a share of the resources and does not “starve”. They also let the user raise an interrupt so that the CPU stops the currently running process and enters kernel mode, where the kernel handles each interrupt number differently; this is how users can control the computer's resources themselves.

Here we need to watch keyboard input through interrupts and process it. When a key is pressed, the keyboard sends a signal over its IRQ line to the PIC (programmable interrupt controller, which receives hardware interrupts and passes them on to the CPU). The PIC stores an offset during initialization; the offset plus the number of the input line gives the interrupt number. The processor then looks up the interrupt number in the IDT to find the entry address of the handler, and runs the code at that address to process the keyboard input. IDT_entry and idt_init above are the concrete implementation of the IDT. idt_init first populates the IDT entry for the keyboard interrupt and then sets up the two PICs: PIC1 uses 0x20 as its command port and 0x21 as its data port, and PIC2 uses 0xA0 as its command port and 0xA1 as its data port.
The PICs are then initialized with initialization command words. ICW1 starts the initialization and makes each PIC wait for three more initialization words on its data port. ICW2, written to the data port, sets the PIC's offset, which is added to the input line number to obtain the interrupt number. ICW3 tells the PICs how they are wired as master and slave; here the output of one PIC is not connected to an input of the other, so no cascading is configured. ICW4 gives additional information about the environment; its lowest bit is set to put the PICs in 8086 mode.

Next we map the keyboard's interrupt number to the address of the keyboard handler, so we need to know which interrupt number the keyboard uses. In the code above the offset of PIC1 is initialized to 0x20; adding the keyboard's IRQ line number, 1, gives interrupt number 0x21, so the address of the keyboard handler must be placed in IDT entry 0x21. Besides the handler address, the entry is given the kernel code segment offset 0x08 and the interrupt gate type 0x8E, and the remaining bits of the IDT entry are filled with 0. The interrupt is mapped to the keyboard_handler function written in kernel.asm above.

Finally, load_idt() passes a pointer to the IDT descriptor structure, idt_ptr, to the lidt instruction to tell the CPU where the IDT is, and then executes the sti instruction to enable interrupts. An IRQ line can be disabled or enabled through the PIC's interrupt mask register (IMR): setting bit n of the IMR to 1 disables IRQ line n. The IRQ lines were disabled this way earlier; now that the IDT is set up and loaded, kb_init enables the keyboard's IRQ line. keyboard_handler() calls keyboard_handler_main() to handle keyboard input.
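The IDT entry described above can be sketched in C as follows. This is an illustration under stated assumptions, not the original kernel.c: the field names follow the usual layout of a 32-bit interrupt gate, the helper names (interrupt_number, set_idt_gate, imr_enable_only) are made up, and the handler address is passed as a plain 32-bit number so that the sketch also compiles on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>

enum {
    IDT_SIZE = 256,
    PIC1_VECTOR_OFFSET = 0x20,          /* offset given to PIC1 in ICW2 */
    KEYBOARD_IRQ_LINE = 1,              /* the keyboard is on IRQ line 1 */
    KERNEL_CODE_SEGMENT_OFFSET = 0x08,
    INTERRUPT_GATE = 0x8E
};

/* One 8-byte gate descriptor in 32-bit protected mode. */
struct idt_entry {
    uint16_t offset_lowerbits;    /* handler address, bits 0..15 */
    uint16_t selector;            /* code segment selector */
    uint8_t  zero;                /* always 0 */
    uint8_t  type_attr;           /* gate type and attributes */
    uint16_t offset_higherbits;   /* handler address, bits 16..31 */
};

static struct idt_entry idt[IDT_SIZE];

/* Interrupt number = offset stored in the PIC + number of the IRQ line. */
static unsigned interrupt_number(unsigned pic_offset, unsigned irq_line)
{
    return pic_offset + irq_line;
}

/* Fill one IDT entry so that `vector` leads to the handler at the address. */
static void set_idt_gate(struct idt_entry *table, unsigned vector,
                         uint32_t handler_address)
{
    assert(vector < (unsigned)IDT_SIZE);
    table[vector].offset_lowerbits = (uint16_t)(handler_address & 0xFFFFu);
    table[vector].selector = KERNEL_CODE_SEGMENT_OFFSET;
    table[vector].zero = 0;
    table[vector].type_attr = INTERRUPT_GATE;
    table[vector].offset_higherbits = (uint16_t)(handler_address >> 16);
}

/* IMR value that leaves only `irq_line` enabled (bit n = 1 disables IRQ n). */
static uint8_t imr_enable_only(unsigned irq_line)
{
    assert(irq_line < 8u);
    return (uint8_t)~(1u << irq_line);
}
```

For the keyboard, interrupt_number(PIC1_VECTOR_OFFSET, KEYBOARD_IRQ_LINE) is 0x21, and imr_enable_only(KEYBOARD_IRQ_LINE) is 0xFD, the mask value that enables IRQ line 1 and keeps the other seven lines of PIC1 disabled.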
keyboard_handler_main() first sends an EOI (end of interrupt) signal to the PIC to acknowledge the interrupt. It then reads port 0x64 to get the keyboard status and checks whether the buffer is empty. If data is waiting, it reads the key code from port 0x60, converts the key code to a character using the keyboard_map.h table described above, and prints the character to the screen.

kernel.c and kernel.asm are then compiled and linked into an executable.
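The status check and key code conversion can be sketched in C without touching real ports. Everything hardware-related is simulated here: fake_read_port stands in for the assembly read_port routine, the table is a small excerpt of the common scan code set 1 layout rather than the full keyboard_map.h, and the names handle_key and scancode_to_ascii are hypothetical. The EOI write to the PIC is left out because there is no PIC to acknowledge in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    KEYBOARD_DATA_PORT = 0x60,
    KEYBOARD_STATUS_PORT = 0x64,
    OUTPUT_BUFFER_FULL = 0x01    /* lowest status bit: a key code is waiting */
};

/* Signature of a port reader, so the logic can be run without hardware. */
typedef uint8_t (*read_port_fn)(uint16_t port);

/* Simulated keyboard controller registers (assumption: test stand-in). */
static uint8_t fake_status;
static uint8_t fake_data;

static uint8_t fake_read_port(uint16_t port)
{
    return port == KEYBOARD_STATUS_PORT ? fake_status : fake_data;
}

/* Convert a key code to ASCII; unlisted codes and key releases give 0. */
static char scancode_to_ascii(uint8_t keycode)
{
    static const char map[128] = {
        [0x02] = '1', [0x03] = '2', [0x10] = 'q',
        [0x1C] = '\n', [0x1E] = 'a', [0x39] = ' '
    };

    if (keycode >= 128) {
        return 0;    /* bit 7 set: the key was released, not pressed */
    }
    return map[keycode];
}

/* Returns the character for a pending key press, or 0 if there is none. */
static char handle_key(read_port_fn read_port)
{
    uint8_t status;

    assert(read_port != NULL);
    status = read_port(KEYBOARD_STATUS_PORT);
    if (!(status & OUTPUT_BUFFER_FULL)) {
        return 0;    /* buffer empty: nothing to read */
    }
    return scancode_to_ascii(read_port(KEYBOARD_DATA_PORT));
}
```

Passing the port reader as a function pointer is only a device for running the logic off-target; in the example kernel the handler calls read_port directly.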

Use the QEMU emulator to see the effect:

Conclusion

This is the author's first encounter with kernel development, so this article is quite rough. If you find mistakes or shortcomings, please point them out and bear with me; advice from more experienced readers is very welcome. Thank you.

References

Linux Kernel Design and Implementation
github.com/arjun024/mk…
github.com/arjun024/mk…
tldp.org/LDP/lkmpg/2…
zh.wikipedia.org/wiki/%E5%86…
blog.csdn.net/go_str/arti…
0xax.gitbooks.io/linux-insid…
0xax.gitbooks.io/linux-insid…
0xax.gitbooks.io/linux-insid…
