This article was translated by marsCatXDU_ Li Jingwei. If you spot any mistakes, corrections are welcome. Original address: wiki.osdev.org/How_kernel,… 23 December 2019, 10:56

Kernel

The kernel is the core of an operating system. In traditional designs, the kernel takes care of memory management, I/O, interrupt management, and so on.

In modern designs such as microkernels and exokernels, some of this work has been moved to user space, which is beyond the scope of this article.

All kernels provide their services through system calls, but how a system call is made and how its result is returned differ from kernel to kernel.
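To make the idea concrete, here is a minimal sketch of how a kernel might route a system call number to a handler. The call numbers, the handler names and `syscall_dispatch` are all hypothetical and do not correspond to any real kernel. A real kernel would also receive the number and arguments through registers or the stack, and the trap handler that does so is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical system call numbers, invented for this sketch. */
enum { MYOS_SYS_GETPID = 0, MYOS_SYS_ADD = 1, MYOS_SYS_COUNT };

typedef intptr_t (*syscall_handler)(intptr_t a, intptr_t b);

/* Toy handlers: a real kernel would look up the calling process, etc. */
static intptr_t sys_getpid(intptr_t a, intptr_t b) { (void)a; (void)b; return 1; }
static intptr_t sys_add(intptr_t a, intptr_t b)    { return a + b; }

/* Table indexed by system call number. */
static const syscall_handler syscall_table[MYOS_SYS_COUNT] = {
    [MYOS_SYS_GETPID] = sys_getpid,
    [MYOS_SYS_ADD]    = sys_add,
};

/* Would be called by the trap handler; returns -1 for unknown numbers. */
intptr_t syscall_dispatch(size_t number, intptr_t a, intptr_t b)
{
    if (number >= MYOS_SYS_COUNT || syscall_table[number] == NULL)
        return -1;
    return syscall_table[number](a, b);
}
```

Calling `syscall_dispatch(MYOS_SYS_ADD, 2, 3)` runs `sys_add` and yields 5, while an out-of-range number is rejected instead of indexing past the table.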

C Library

For more information, see C Library (wiki.osdev.org/C_Library). To create a C library, see Wiki.osdev.org/Creating_a_…

When we develop our own kernel, we must provide a C library if we want the system to support C programs, because C programs need one to run properly. We can port an existing library or write a new one. The C library implements the standard C functions (those declared in headers such as <stdlib.h>, <math.h>, and <stdio.h>) and provides them in binary form that user-space programs can link against.

In addition to the standard C functions defined in the ISO standard, C libraries generally implement extra functions that go beyond it (such as networking functions, which the C standard does not mention at all). The POSIX standard used by Unix-like systems, for example, defines further things that a C library must include. Be aware that the libraries provided by different operating systems can differ considerably.

The C library must use system calls to implement its functionality, so if you want to build an operating system that supports a C library, you must implement system calls in the kernel and tell the C library how to use them.
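The layering can be sketched as follows. Everything here is hypothetical: `MYOS_SYS_WRITE`, `myos_syscall`, `my_write` and `my_puts` are invented names, and the kernel is replaced by an ordinary function that copies bytes into a buffer so the example can run anywhere. In a real port, `myos_syscall` would be a few lines of architecture-specific assembly that trap into the kernel.

```c
#include <stddef.h>
#include <string.h>

#define MYOS_SYS_WRITE 1   /* hypothetical system call number */

/* Stand-in for the kernel side: captures "written" bytes in a buffer. */
char fake_console[64];
size_t fake_console_len;

long myos_syscall(long number, long fd, const void *buf, size_t len)
{
    if (number != MYOS_SYS_WRITE || fd != 1)
        return -1;
    if (len > sizeof fake_console - fake_console_len)
        len = sizeof fake_console - fake_console_len;   /* truncate if full */
    memcpy(fake_console + fake_console_len, buf, len);
    fake_console_len += len;
    return (long)len;
}

/* Library layer: a thin wrapper that knows the system call number. */
long my_write(int fd, const void *buf, size_t len)
{
    return myos_syscall(MYOS_SYS_WRITE, fd, buf, len);
}

/* Higher-level library function built only on top of the wrapper. */
int my_puts(const char *s)
{
    if (my_write(1, s, strlen(s)) < 0)
        return -1;
    return my_write(1, "\n", 1) < 0 ? -1 : 0;
}
```

Only `myos_syscall` needs to change when the library is moved to a different kernel; `my_puts` and anything built on it stay the same.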

More on library calls can be found at wiki.osdev.org/Library_Cal…; see also C Library (wiki.osdev.org/C_Library) and creating a C library (wiki.osdev.org/Creating_a_…).

Compiler/assembler

The assembler converts a plain-text source file into binary machine code: that is, it turns source code into object code and adds extra information such as symbol names and relocation information.

The compiler can directly convert high-level language code into object files, or it can first convert high-level language into assembly language, and then call the assembler to convert assembly language into object files.

However, the object file generated by the compiler or assembler does not contain any code for the C standard library functions, which means that it is incomplete and cannot run on its own. For example, if we call printf() in our code and compile it into an object file, the object file contains only a reference to printf(). To run the program, the implementation of that function must also be linked with the object file, turning it into an executable file.
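As a small illustration, consider the translation unit below. The file name `report.c` and the function `report` are hypothetical. Assuming a GCC toolchain, `gcc -c report.c` produces `report.o`, and running `nm report.o` on it would typically list `printf` as undefined (marked `U`), because only the reference is present.

```c
#include <stdio.h>

/* <stdio.h> only declares printf(); its machine code lives in the C library. */
int report(int value)
{
    /* printf() returns the number of characters it printed. */
    return printf("value = %d\n", value);
}
```

The reference is resolved only when the object file is linked against a C library that defines `printf`.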

Some compilers call library functions internally. In that case the object file may refer to memset(), memcpy(), or other functions even if we never include the corresponding header or call those functions ourselves. The linker must then be given a library containing implementations of these functions, or it will report an error. GCC's freestanding environment requires only memset(), memcpy(), memcmp(), and memmove(), plus the libgcc library. Some more complex operations (such as 64-bit division on 32-bit systems) may use compiler-internal functions; in GCC these are implemented in libgcc. The contents of that library do not depend on the operating system in use, and linking it does not burden the compiled kernel with licensing problems.
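The sketch below shows two pieces of ordinary C that can produce such hidden references. The function and type names are hypothetical, and whether a call is actually emitted depends on the target and the optimization level, so treat the comments as typical GCC behavior rather than a guarantee.

```c
#include <stdint.h>

struct page_table { uint32_t entries[1024]; };

/* On a 32-bit target such as i386, GCC typically compiles this division
   into a call to __udivdi3, which is provided by libgcc. */
uint64_t ticks_to_seconds(uint64_t ticks, uint64_t ticks_per_second)
{
    return ticks / ticks_per_second;
}

/* Copying a large struct may be compiled into a call to memcpy(),
   even though this file never includes <string.h>. */
void copy_table(struct page_table *dst, const struct page_table *src)
{
    *dst = *src;
}
```

If a kernel built from code like this fails to link with a missing `__udivdi3` or `memcpy`, the usual remedy is to link against libgcc and to supply the four memory functions listed above.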

There are two kinds of C compilation environment: hosted and freestanding.

Hosted: the complete C standard library is available; this is the environment for user-space programming. Freestanding: only a few headers containing definitions and types are available; this is the environment for kernel programming.

The compiler targets the hosted environment by default. Pass the -ffreestanding option to make it target the freestanding environment instead.

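Freestanding headers included in GCC are:

<float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, <stdnoreturn.h>, and some others (for CPUID, SSE, etc.)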
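A short sketch of code that stays within the freestanding subset is shown below. It uses only <stdbool.h>, <stddef.h> and <stdint.h>; the names `built_hosted`, `k_strlen` and `vga_entry` are hypothetical. The macro `__STDC_HOSTED__` is defined by the C standard and lets the code check which environment it was compiled for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* __STDC_HOSTED__ is 1 in a hosted environment and 0 when the compiler
   targets a freestanding one (for GCC, with -ffreestanding). */
bool built_hosted(void)
{
    return __STDC_HOSTED__ == 1;
}

/* <string.h> is not a freestanding header, so a kernel supplies
   its own string helpers (hypothetical name). */
size_t k_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Fixed-width types from <stdint.h>: pack a character and a color
   attribute into one 16-bit VGA text-mode cell. */
uint16_t vga_entry(unsigned char ch, uint8_t color)
{
    return (uint16_t)((uint16_t)ch | ((uint16_t)color << 8));
}
```

The same source compiles in both environments, which makes it easy to unit-test kernel helpers on the host before building them with -ffreestanding.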





The linker

The linker links the object code generated by the compiler or assembler against the C library, libgcc.a, or whatever other libraries we provide.

There are two ways to link: static linking and dynamic linking.

Static linking

When linking statically, the linker runs right after the compiler or assembler has finished. It looks for unresolved references in the object code and tries to resolve them using the available libraries, copying the needed binary code from the library into the executable it generates. A statically linked executable needs only the kernel to run and requires no additional support.
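The resolution step can be modeled with a toy example. This is only a sketch of the idea, not how any real linker is implemented: symbols are plain strings, the function names are hypothetical, and no code is actually copied.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears in the library's list of defined symbols. */
static int library_defines(const char *const *lib, size_t lib_count,
                           const char *name)
{
    for (size_t i = 0; i < lib_count; i++)
        if (strcmp(lib[i], name) == 0)
            return 1;
    return 0;
}

/* Counts the references that stay unresolved. A real linker would copy
   the code for each resolved symbol into the executable and report an
   error for each one that is missing. */
size_t count_unresolved(const char *const *undefined, size_t undef_count,
                        const char *const *lib, size_t lib_count)
{
    size_t missing = 0;
    for (size_t i = 0; i < undef_count; i++)
        if (!library_defines(lib, lib_count, undefined[i]))
            missing++;
    return missing;
}
```

With undefined references to `printf`, `memcpy` and `alloca` and a library that defines `printf`, `memcpy` and `memset`, the function returns 1, because `alloca` has no definition.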

The downside of this approach is that copying library code directly makes each executable larger, and the same library code ends up duplicated across many programs, both on disk and in memory, wasting space.

Dynamic linking

With dynamic linking, the linker is invoked when the program is loaded. Unresolved references in the object code are resolved only at that point, against the libraries installed on the current system. This greatly reduces the disk space occupied by executables and allows shared libraries to be used to save memory.

Dynamic linking sounds good, but the resulting executables will not run on a system that lacks the libraries they need.
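Load-time binding, and the failure mode just described, can be modeled in a few lines. This is a toy sketch with hypothetical names (`struct export`, `struct import`, `load_program`), not the mechanism of any real dynamic loader: the "installed libraries" are a static table and binding means filling in a function pointer.

```c
#include <stddef.h>
#include <string.h>

typedef int (*func_ptr)(int);

/* A symbol exported by a library that is "installed" on the system. */
struct export { const char *name; func_ptr addr; };

/* An import slot in the executable, filled in at load time. */
struct import { const char *name; func_ptr addr; };

static int lib_square(int x) { return x * x; }

/* Hypothetical set of installed library symbols. */
static const struct export installed[] = { { "square", lib_square } };

/* Toy loader: binds every import to an installed symbol.
   Returns 0 on success, -1 if any symbol cannot be found. */
int load_program(struct import *imports, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        imports[i].addr = NULL;
        for (size_t j = 0; j < sizeof installed / sizeof installed[0]; j++)
            if (strcmp(imports[i].name, installed[j].name) == 0)
                imports[i].addr = installed[j].addr;
        if (imports[i].addr == NULL)
            return -1;   /* library missing: the program cannot start */
    }
    return 0;
}
```

A program that imports `square` loads and can call it through the bound pointer, whereas a program that imports a symbol no installed library provides fails to load at all.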

Shared libraries

Shared libraries let multiple dynamically linked executables share one library. All executables that use the same library can refer to a single copy of its code in memory, instead of each carrying its own copy of exactly the same code.

Implementing shared libraries raises some problems of its own. For example, a shared library either cannot have state of its own (because multiple programs share the same memory, the library cannot have static or global data), or it must provide separate state for each program. This is even harder in multithreaded systems, where a single executable may have several flows of control at once.
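One common way to sidestep the state problem is to keep no state in the library at all and let each caller own it. The sketch below contrasts the two styles; the names `next_id_global`, `struct id_context`, `id_init` and `next_id` are hypothetical.

```c
/* Problematic style: one hidden counter shared by every caller. */
static unsigned shared_counter;

unsigned next_id_global(void)
{
    return ++shared_counter;
}

/* Stateless style: the caller owns the state and passes it in,
   so the library itself needs no static or global data. */
struct id_context { unsigned last; };

void id_init(struct id_context *ctx)
{
    ctx->last = 0;
}

unsigned next_id(struct id_context *ctx)
{
    return ++ctx->last;
}
```

Two callers with separate `struct id_context` objects each see their own sequence 1, 2, 3, and so on, whereas all callers of `next_id_global` advance the same counter.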

In a virtual memory environment, it is generally not possible to map a shared library at the same virtual address in every program. For the library code to work at any virtual address, it must be position-independent code (PIC). With GCC, the -fPIC option produces position-independent code. This technique requires support from the binary format (that is, relocation tables) and can make code less efficient on some architectures.
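As an illustration, the source file below could be built into a shared library. The file name `counter.c`, the library name `libcounter.so` and the function are hypothetical, and the build commands in the comment assume GCC on an ELF-based system.

```c
/* counter.c: hypothetical source file for a small shared library.
 *
 * Possible build commands with GCC on an ELF system:
 *   gcc -fPIC -c counter.c -o counter.o
 *   gcc -shared counter.o -o libcounter.so
 */

/* With -fPIC, this variable is accessed in a position-independent way
   rather than through a fixed absolute address, so the code works
   wherever the library is mapped. */
static int counter;

int counter_increment(void)
{
    return ++counter;
}
```

The C source itself does not change between a position-dependent and a position-independent build; only the generated machine code and relocation information differ.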

ABI – Application Binary Interface

The system ABI defines how to use library function calls and system calls, including how to use stacks or registers to pass parameters, how to locate function entry points in the library, and so on.

A statically linked executable must use the same ABI as the kernel it runs on; a dynamically linked executable must also use the same ABI as the libraries it actually loads.

Unresolved symbols

During linking we sometimes run into unresolved symbols: references to things such as alloca() or memcpy() that we never added to the program ourselves and that the environment does not provide. This usually means that the toolchain or its command-line options are set up incorrectly, or that we are using functionality that the library or runtime environment does not implement. If the link fails with a message saying that a symbol was not found, check whether a library was left out of the link command.