Computer has become an indispensable tool in modern People’s Daily work, study and life. Operating system is the soul of the computer, as the interface of the user to use the computer, it is responsible for scheduling the execution of each user program, so that the computer to complete specific tasks; As the manager of computer hardware resources, it is responsible for coordinating various devices in the computer to work efficiently. The importance of operating systems is self-evident.


For software engineers, understanding how an operating system works and its key mechanisms is a prerequisite for designing high-quality applications, but it is difficult to do so.

On the one hand, operating system design involves all aspects of computer science and engineering, including data structure and algorithm, computer composition and system structure, computer network, and even the core knowledge of programming language and compiler system, as well as the core concepts of concurrency, synchronization and communication.

On the other hand, as a complex and huge software product, the understanding of the operating system requires a deep combination of theory and practice.

Operating system related learning materials are very rich. Those who explain basic principles, those who analyze typical systems, and those who construct sample systems; Some are for professional theorists and some are for applied practitioners. Angle is varied, content is simple and complex differ.

                                                     

The biggest characteristic of this book lies in the author’s years of practical teaching experience in Linux operating system. As an experienced senior software engineer and professional teacher, based on his own experience in learning and studying Linux, the author of this book innovatively conducted teaching and experimental organization on a mykernel and MenuOS based experimental platform, realizing the natural integration of theoretical learning and engineering practice and achieving twice the result with half the effort.

At the same time, the book designed a wealth of unit tests and experiments, guide readers step by step to master the knowledge, and effectively promote readers to think deeply and practice what they have learned.

Based on this book, the author opens the operating system course, which involves face-to-face classroom teaching and online MOOC teaching. The selected subjects include both software engineering masters and general engineering practitioners, and the number of students has reached tens of thousands. The publication of this book reflects the author’s careful absorption of a large number of student feedback, continuous optimization of the course’s teaching content and process organization.

This paper mainly introduces the working principle of computer, including the working model of stored program computer, basic assembly language, and how to execute the assembly code compiled by C language program step by step on the working model of stored program computer. It focuses on the analysis of function call stack related assembly instructions, such as Call/RET and PUSHL/POPL.

Stored program computer working model

The concept of a stored program computer, though simple, was revolutionary in the history of computing and is still a very significant invention in the history of computing. The ability of a computer or smartphone with limited hardware to install a wide variety of software and perform a wide variety of programs may seem to be taken for granted.

Stored program The main idea of the computer is to store the program in the computer memory, and then execute the first instruction of the program according to the first address of the stored program in the memory, and then execute the instructions written in the program until the end of the program execution.

Many people, especially those majoring in computer science, have heard of the Turing machine and the von Neumann machine. The Turing machine was concerned with the philosophical definition of computing and was a virtual abstract machine, the first description of the modern computer. A Turing machine can do any operation provided it is properly programmed. Computers built on Turing machines store data in memory, and the logic of programs is embedded in hardware.

Unlike the Turing machine, the von Neumann machine was an actual architecture, called the von Neumann architecture, which is still the basis of almost every computer platform. We all know the idiom “paoding ding Xie Niu”, which means that after repeated practice, he has mastered the objective laws of things and worked with ease and ease. Von Neumann architecture is an “objective law” that various computer architectures need to follow, and it is very important to understand computers and operating systems. Let’s take a look at what a von Neumann architecture is.

In 1944-45, von Neumann showed that programs and data were logically identical and that programs could be stored in memory. The main points of the von Neumann architecture include:

Von Neumann architecture is shown in Figure 1-1, in which the computer hardware is composed of five basic types of components: arithmetic unit, memory unit, controller, input device and output device.

Figure 1-1 Von Neumann architecture

Inside the computer, instructions and data are represented by binary;

The program and data will be written in the memory, and then start the computer work, this is the basic meaning of the stored program.

The basis of computer hardware is the CPU, which interacts with memory and input/output (I/O) devices, receiving data from input devices and sending data to output devices.

The CPU consists of an arithmetic unit (ALU), a controller, and a number of registers. There is a very important register called the program counter, or EIP in IA32 (x86-32), which indicates the address in memory of the next instruction to be executed.

C/C++ programmers can think of EIP as a pointer because it always points to the address of an instruction. The CPU takes an instruction from the address that the EIP points to and executes it. After the execution, the EIP will automatically add one, execute the next instruction, and then take the next instruction to execute it. The CPU always “eats” instructions in the memory like a “snake”.

The CPU, memory, and I/O devices are connected by a bus. Memory holds instructions and data.

The “binary representation of instructions and data inside the computer” indicates that instructions and data are functionally and processed differently, but both can be stored in memory in binary mode.

The third point above points out that the core of the von Neumann architecture is the stored program computer.

We use programmer’s thinking to abstract the stored program computer, as shown in Figure 1-2.

Figure 1-2 Schematic diagram of the working principle of a stored program computer

We can abstract the CPU as a for loop because it is always executing next Instruction and then fetching the next instruction from memory to execute. From this point of view, memory holds instructions and data, the CPU is responsible for interpreting and executing these instructions, and they are connected by a bus. This explains the principle that computers can automate the execution of programs.

There is a problem here. We need a definition of what instructions the CPU can recognize. For those of you who have studied programming, you probably know apis, or application programming interfaces.

For programmers, there is also an interface called the ABI, which mainly encodes instructions. In terms of instruction coding, we’re not going to go into that kind of detail, just assembly stuff.

How these instructions are encoded into binary machine instructions is of no concern to the interested reader. In addition, these instructions will involve registers, and these registers have conventions that we can use for what instructions.

At the same time, we also need to understand the layout of the registers. Also, most instructions can access memory directly, which is an important concept for x86-32 computer instruction sets.

For x86-32 computer with a memory of the EIP register points to a particular instruction, EIP is automatically plus one (not a byte, nor is it a 32-bit, but add an instruction), even though the x86-32 each instruction of storage space, but it can intelligently automatically added to the next instruction, it can also be other instructions to modify, Such as Call, RET, JMP, etc. These instructions correspond to function calls, return, and if else statements in C.

The basic core of most computing devices today, from smart phones to supercomputers, can be described in terms of von Neumann architecture (stored program computers). Therefore, the stored program computer is a very basic concept, which is the basis for our understanding of how computer systems work.

X86-32 assembly basics

The Intel processor family, also known as x86, has evolved through key phases of architecture including 16 -, 32 -, and 64-bit. The 32-bit architecture is referred to as IA32 and the 64-bit architecture as x86-64, but for the sake of clear distinction, the 32-bit architecture is referred to as x86-32 in this book. This book is consistent with the assembly format used by the Linux kernel in AT&T assembly format.

1. Register of the X86-32 CPU

In order to facilitate readers to understand, the following to introduce the 16-bit 8086 CPU register. The 8086 CPU has a total of 14 16-bit registers: AX, BX, CX, DX, SP, BP, SI, DI, IP, FLAG, CS, DS, SS, and ES. The 14 registers are divided into three types: general register, control register and segment register.

General purpose registers are divided into data registers, pointer registers and indexing registers.

AX, BX, CX, and DX are collectively called data registers.

QAX (Accumulator) : An Accumulator.

QBX (Base) : Base address register.

QCX (Count) : counter register.

QDX (Data) : indicates the Data register.

SP and BP are collectively referred to as pointer registers.

QSP (Stack Pointer) : Stack Pointer register.

QBP (Base Pointer) : indicates the Base Pointer register.

SI and DI are collectively referred to as indexing registers.

QSI (Source Index) : Source indexing register.

QDI (Destination Index) : Destination indexing register.

The control register is mainly divided into instruction pointer register and flag register.

QIP: Instruction Pointer register.

QFLAG: indicates the flag register.

Segment registers mainly include code segment registers, data segment registers, stack segment registers and additional segment registers.

QCS: Code Segment register.

Data Segment (qDS) : Data Segment register.

Stack Segment (qSS) : Stack Segment register.

QES (Extra Segment) : Additional Segment register.

The above data registers AX, BX, CX, and DX can be used as two separate 8-bit registers, as shown in Figure 1-3. The AX register is used as an example.

Figure 1-3 Schematic diagram of AX register

The qAX register can be divided into two independent 8-bit AH and AL registers.

QBX registers can be divided into two independent 8-bit BH and BL registers.

The qCX register can be divided into two independent 8-bit CH and CL registers.

The qDX register can be divided into two independent 8-bit DH and DL registers.

Except for the above four data registers, all other registers can not be divided into two independent 8-bit registers. Note that each separate register has its own name and can be accessed independently. Programmers can take advantage of this “separable” nature of the data register to handle word/byte information flexibly.

Now that we know about the registers of the 16-bit 8086 CPU, let’s look at the 32-bit registers.

The IA32 contains the following registers:

Q4 data registers (EAX, EBX, ECX and EDX).

Q2 indexing and pointer registers (ESI and EDI).

Q2 pointer registers (ESP and EBP).

Q6 segment registers (ES, CS, SS, DS, FS and GS).

Q1 instruction pointer registers (EIP).

Q1 flag registers (EFlags)

The 32-bit register simply extends the corresponding 16-bit register to 32 bits, as shown in Figure 1-4 in a schematic of the EAX register, which adds an E. All registers that begin with E are generally 32 bits.

The EAX accumulative register, EBX base address register, ECX count register, and EDX data register are all general-purpose registers, and programmers can define how to use them when writing sink codes. EBP is a pointer to the base of the stack, which is important; ESI and EDI are indexing registers; ESP is also important as it is the top stack register.

The concept of a stack may be involved here, and those of you who have taken data structure courses will know the concept of a stack. Later in the book, we will talk about push and pop, which push and pop data onto and off a stack. These are 32-bit general purpose registers.

Figure 1-4 EAX register diagram

It is important to note that AX, BX, CX, and DX cannot be used as base and indexing registers to store the address of a memory cell on a 16-bit CPU. However, in a 32-bit CPU, the 32-bit registers EAX, EBX, ECX, and EDX can not only transmit data, temporarily store data to store the results of arithmetic logic operations, but also act as pointer registers. So these 32 bit registers are more versatile.

In addition to the general purpose registers, there are some segment registers. Although the segment register is used less frequently in this book, it is worth looking at. In addition to CS, DS, ES, and SS, there are additional segment registers FS and GS.

The CS register and SS register are commonly used. Our instructions are stored in code snippets. When locating an instruction, use CS:EIP to specify its exact address.

In other words, it is necessary to know which code segment the code is in first, and then to know the relative offset address (EIP) of the instruction in the code segment. Generally, CS:EIP is used to accurately mark the memory address of an instruction. There are also stack segments, which each process has its own stack segment (in Linux, each process has a kernel-mode stack and a user-mode stack).

The details of the function of the flag register are complicated and complicated, so this book will not be introduced in detail. It is enough for readers to know that the flag register can save some current states.

Today’s mainstream computers are mostly 64-bit CPUS, so we also need a brief understanding of x86-64 registers. In fact, there is not much difference between 64-bit and 32-bit registers, it is just extended from 32-bit to 64-bit. The ones preceded by “R” refer to 64-bit registers such as RAX, RBX, RCX, RDX, RBP, RSI, RSP, Flags changed to RFLAGS, and EIP changed to RIP.

In addition, also added more general purpose registers, such as R8, R9, etc., these increased general purpose registers and other general purpose registers are just the name is not the same, in use are to follow the rules of the use of the caller, simply say is casually used.

2. Data format

In Intel’s terminology specification, words represent 16-bit data types; In IA32, 32 bits are called double words; In x86-64, the 64-bit bits are called quadwords. Figure 1-5 shows the IA32 representation of the basic type in C, and the assembly code suffixes listed are often seen in assembly code.

Figure 1-5 IA32 representation of basic types in C

3. Addressing modes and common assembly instructions

Assembly instructions contain opcodes and operands, which are divided into the following three types:

(1) the immediate number is a constant, such as $8, starting with $followed by a number;

(2) The number of registers, indicating the value saved in a register, such as %eax; For byte operations, one of eight single-byte registers, such as %al (the lower 8 bits in the EAX register);

(3) memory reference, access to a location of memory according to a calculated valid address.

There are also some common assembly instructions, let’s see how they work. The most common assembly instruction is the MOV instruction, where L in MOVL refers to 32 bits, B in MOVB refers to 8 bits, W in MOVW refers to 16 bits, and Q in MOVQ refers to 64 bits. We introduce the 32 bits.

First, register addressing. Register addressing refers to operating on registers, not memory, as in %eax, where % is followed by a register name.

Movl % eax and % edx

The code above puts the contents of register %eax into %edx. If you think of the register name as a variable name in C code, it is equivalent to:

edx = eax;

Immediate addressing begins with a $followed by a number. Such as:

Movl $0 x123, % edx

The hexadecimal value 0x123 is placed directly into the EDX register. If you think of the register name as a variable name in C code, it is equivalent to:

edx = 0x123;

Immediate addressing also has nothing to do with memory.

Direct addressing is using a numeric value directly, without the $sign at the beginning. A value with a $sign at the beginning indicates that it is an immediate number; There is no $sign to indicate that this is an address. Such as:

Movl 0 x123, % edx

Put the data stored in the hexadecimal memory address 0x123 into the EDX register. This is equivalent to C code:

edx = *(int*)0x123;

Forcing the value 0x123 into a 32-bit pointer to an int, using an *, and placing it in the EDX register is called direct addressing.

In other words, the memory address is used to directly access the data in memory.

Indirection is a register with a little bracket. For example, the value in the %ebx register is a memory address. Parentheses indicate the data stored at this memory address. We place it in the EDX register:

move (%ebx), %edx

Equivalent to:

edx = *(int*)ebx;

Forcing the value stored in the EBX register into a 32-bit pointer to an int variable, then taking the value it points to with an *, and placing it in the EDX register is called indirection.

Variable addressing is a little more complicated than indirect addressing. Such as:

movl 4(%ebx), %edx

The reader will notice that “(%ebx)” is preceded by a 4 in the code, which is an immediate 4 added to the original address based on indirection, equivalent to:

edx = *(int*)(ebx+4)

Add 4 to the EBX register, force it to a 32-bit pointer to int, and place it in the EDX register with an *. This is called indexing.

As mentioned above, the CPU on the register and memory operation method, are more basic knowledge, need to firmly master.

Most instructions in x86-32 have direct access to memory, but there are also instructions that operate directly on memory, such as push/ POP. They push and unload according to the memory location pointed to by the ESP register. Note that this is the default for the instruction to execute using a specific register.

It should also be noted that the AT&T assembly format used in this book is also the assembly format used by the Linux kernel, which is slightly different from Intel assembly format.

We might run into Intel assembly code when we’re searching for information, and generally speaking, if it’s all capital letters it’s Intel assembly, and if it’s all lowercase letters it’s AT&T assembly.

The code in this book uses register names in AT&T assembly format in all lowercase, whereas register names used in text are generally capitalized because they are acronyms.

There are also several important directives: PUSHL/POPL and Call/RET. Pushl stands for 32-bit push, as in:

pushl %eax

Push the value of the EAX register to the top of the stack. It actually does two things, the first of which is:

% subl $4, esp

Subtract 4 from the ESP register at the top of the stack. Because the stack grows downward, the subtraction instruction subl is used, which reserves a storage unit at the top of the stack. The second action is:

Movl % eax and % (esp)

By adding a parenthesis (indirection) to the ESP register, the value of the EAX register is placed where the ESP register points to the reserved storage unit.

The popL directive is introduced as follows:

popl %eax

This is to take a memory unit (32-bit value) from the top of the stack and place it in the EAX register from the top of the stack. Pushing a stack also corresponds to two operations:

movl (%esp), %eax

addl $4, %esp

The first step is to put the value of the top of the stack into the EAX register, and then add 4 to the top of the stack with the instruction ADDL. The pushl stack grows with each execution, and the popL stack shrinks with each execution.

The call directive is a function call that invokes an address. Such as:

call 0x12345

The above code actually does two actions, the following two pseudo-instructions, notice that there is no actual corresponding instruction for these two actions, we use “(*)” to mark, these two actions are completed by the hardware at one time. For security reasons, EIP registers cannot be used or modified directly.

pushl %eip (*)

movl $0x12345, %eip (*)

The pseudoinstruction first stacks the current EIP register, putting the immediate number 0x12345 into the EIP register, which tells the CPU where to store the next instruction.

To stack the current value of the EIP register is to save the address of the next instruction, and then assign a new value 0x12345 to the EIP register, so that the next instruction to be executed by the CPU is fetched from 0x12345.

The ret directive, which corresponds to the call directive, returns the function, for example:

ret

The above code actually does an action, the following pseudo-instruction, notice that there is no actual corresponding instruction for this action, we use “(*)” to mark it specially, this action is completed by the hardware once. For security reasons, EIP registers cannot be used or modified directly.

popl %eip(*)

This is to place a location on the top of the current stack (typically the contents pushed by the Call instruction) into an EIP register. The actions performed by the pusHL/POPL and Call/RET assembly instructions are summarized as shown in Figure 1-6.

Figure 1-6 PusHL/POPL and Call/RET assembly instructions

To summarize, the call directive corresponds to the starting address of a function we call in C. The RET instruction restores the value of the EIP register (that is, the address of the next instruction of the call instruction) that is pressed when the function is called to the EIP register. The next instruction after the RET instruction also returns to the next instruction at the position of the function call.

In other words, the function call ends, and the next instruction after the function call continues, which is strictly corresponding to the function call process in C language. However, it should be noted that the instructions with a “(*)” indicate that these instructions cannot be directly used by programmers, and are pseudo-instructions.

Because EIP registers cannot be modified directly by programmers, they can only be modified indirectly by special instructions such as Call, RET, and JMP.

If a programmer can modify the EIP register directly, there are serious security risks. The reader may consider why? We’re not going to get into that.

4. Assembly code sample analysis

Now that we have a general understanding of instructions and registers, let’s do an exercise to verify our understanding. When the stack is empty, what happens to the stack and registers after executing the following assembly code snippet?

1 push $8

2 movl %esp, %ebp

3 subl $4, %esp

4 movl $8, (%esp)

We analyze what the assembly code does at each step. First, when the stack is empty, both EBP and ESP registers point to the bottom of the stack.

The first statement is to stack the immediate number 8 (that is, subtract 4 from the ESP register and then put the immediate number 8 at the top of the stack).

The ESP register is stored in the EBP register, and the EBP register points to the same position as the CURRENT ESP register.

In other words, a logically empty stack has been created on the stack, which is not easy to understand and doesn’t matter to the reader for the time being. Later in the book, we will analyze how function calls are implemented by assembling C programs into assembly code, including a function call stack framework.

The instruction in line 3 is subl, which subtracts the value stored in the ESP register by 4. That is, the top pointer to the ESP register is moved down by one storage unit (4 bytes).

The last line puts the immediate number 8 at the memory address pointed to by the ESP register, i.e. the immediate number 8 is placed at the top of the stack by indirect addressing.

This example is about some operations on the stack and register, we can compare the above text instructions step by step to follow the stack and register changes, in order to understand the function of the instruction more accurately.

What happens to the stack and registers after executing the following assembly code fragment with an empty stack?

1 pushl $8

2 movl %esp, %ebp

3 pushl $8

Let’s also examine what the assembly code does at each step. First, the EBP and ESP registers point to the bottom of the stack when the stack is empty.

The first statement pushes the immediate number 8, which means that the stack has an additional memory unit and an immediate number 8, and also changes the ESP register.

Line 2 puts the value of the ESP register into the EBP register. The stack space does not change, but the EBP register does.

Line 3 pushes the immediate number 8, that is, the stack has an additional memory location and an immediate number 8.

The reader will find that this example has exactly the same effect as the previous one.

After a little experimentation, let’s look at a bit more complicated assembly code:

1 pushl $8

2 movl %esp, %ebp

3 pushl %esp

4 pushl $8

5 addl $4, %esp

6 popl %esp

This assembly code also starts with the EBP and ESP registers pointing to the bottom of the stack when the stack is empty.

The first statement pushes the immediate number 8, that is, adding an extra storage unit to the stack and holding the immediate number 8, while also changing the ESP register.

Line 2 puts the ESP register into the EBP register. The stack space does not change, but the EBP register does.

Line 3 pushes the contents of the ESP register to the storage unit at the top of the stack. Note that the PUSHL instruction itself changes the ESP register. The “pushl % ESP” statement corresponds to the following two instructions:

subl $4, %esp

movl %esp, (%esp)

Obviously, the ESP register was changed before saving the ESP register value to the stack, and the data saved to the top of the stack should be the current ESP register value minus 4. The VALUE of the ESP register is changed, and a storage unit is added to the stack space to save the changed VALUE of the ESP register.

The fourth statement pushes the immediate number 8, meaning that an extra storage unit is added to the stack to hold the immediate number 8, and the ESP register is changed.

Line 5 increases the VALUE of the ESP register by 4, which is equivalent to shrinking stack space by one storage unit.

The last statement is equivalent to the following two instructions:

movl (%esp), %esp

addl $4, %esp

That is, the current top of the stack into the ESP register, and then add 4 to the ESP register. This code is complicated because the ESP register is both an operand and is used and modified during execution by the pusHL/POPL instruction.

The reader needs to carefully analyze and think about this assembly code to understand the entire execution process. Later in the book, we will further understand the assembly code involved in building a function call stack and dismantling a function call stack, combining function calls and function returns of C code.

                                                        

Analysis of paoding Niu Linux Kernel

By Meng Ning, Lou Jiapeng and Liu Yudong

From the core of understanding of computer hardware, the book work mechanism and the function call stack (stored program computer) and how to through the system call user mode application in kernel (interrupt) to obtain, through two direction two-way attack strategy, and use the actual operational procedures of the disassembly code from the Angle of practice, to understand the operating system kernel, Then start to analyze the Linux kernel source code, from the system call into the kernel, process scheduling and process switching, and finally return to the user process.

                                               

Long press the QR code, you can follow us yo

I share IT articles with you every day.

If you reply “follow” in the background of “Asynchronous books”, you can get 2000 online video courses for free

Asynchronous book benefits to send non-stop

Invite 10 friends to pay attention to get a book asynchronously within 10 days (click the text to get details of the activity)

Click to read the original article to buy asynchronous books

Read the original