Overview

We saw in the previous chapter how the operating system is read from disk into memory and starts running. This article is mainly about the interface an operating system provides for upper-level applications to use.

System calls

The interface that the operating system provides is what we call the system call, and the idea behind it is easy to understand. In Java we wrap an implementation behind an interface and program against that interface; the operating system does the same thing. Unix has the POSIX specification, which defines the names of the system calls, and different operating systems implement it.
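As a small illustration of programming against that interface, here is a minimal sketch that uses the POSIX write() call. The helper name say is made up for this example, and the loop assumes only that write() may accept fewer bytes than requested.

```c
#include <string.h>
#include <unistd.h>

/* say() is a hypothetical helper: it writes a whole NUL-terminated
 * string to fd through the POSIX write() call.
 * Returns 0 on success and -1 if write() reports an error. */
int say(int fd, const char *msg)
{
    size_t left = strlen(msg);

    while (left > 0) {
        ssize_t n = write(fd, msg, left);   /* may write fewer bytes than asked */
        if (n < 0)
            return -1;
        msg  += n;
        left -= (size_t)n;
    }
    return 0;
}
```

Calling say(1, "hello\n") would send the text to standard output on any system that implements this part of POSIX, whatever the kernel does underneath.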

Kernel mode and user mode

The operating system and our applications are both in memory. Why do we need to call interfaces provided by the operating system instead of directly calling functions inside the operating system? Quite simply, kernel space stores a lot of important information that user programs must not access directly.

So how does the operating system prevent user programs from accessing kernel space? It relies on help from the CPU hardware. We know that the CPU locates the next instruction through CS:IP, where CS is the code segment. The code segment of a user program is a user segment, and the code segment of the operating system is a kernel segment, so before executing an instruction the CPU checks whether the access to the target segment is legal. If code running in a user segment tries to access a kernel segment, the CPU refuses to execute it.

CPL and DPL

The lowest two bits of the CS register indicate the privilege level of the code that is currently running: 0 is kernel mode and 3 is user mode.

  1. CPL (current privilege level) is the level of the instruction currently executing: 3 for user code and 0 for operating system code.
  2. DPL (descriptor privilege level) is the level of the target segment that the instruction wants to access or jump to.

So the check the CPU performs when executing an instruction is very simple: it only needs to verify that DPL >= CPL. User code has CPL 3 and kernel segments have DPL 0, so the check fails and a user-mode program cannot access kernel data.
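The check can be sketched in a few lines of C. This is only a model of the simplified rule stated here, not the full set of checks a real CPU performs; the function names are made up, and the selector values 0x08 and 0x0f mentioned in the usage note are assumed example values for a kernel and a user code segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* The privilege level is stored in the lowest two bits of a selector. */
unsigned privilege_level(uint16_t selector)
{
    return selector & 0x3u;
}

/* Simplified rule from the text: access is allowed only when DPL >= CPL. */
bool access_allowed(unsigned cpl, unsigned dpl)
{
    return dpl >= cpl;
}
```

With these definitions, privilege_level(0x0f) gives 3 and privilege_level(0x08) gives 0, and access_allowed(3, 0) is false, which is the user-to-kernel case the CPU rejects.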

The int 0x80 interrupt

As we have just seen, user access to kernel data is blocked at the CPU level, and the system calls provided by the operating system also run in kernel mode. So how does a user-mode program call a kernel-mode system call? The answer is an interrupt.


Interrupts are the only way for a user program to enter the kernel. The gate for int 0x80 has a DPL of 3, which means the operating system opens up a DPL 3 entry point that user programs can call without being blocked by the privilege check. The process looks like this:

  1. The user program contains a piece of code with the int 0x80 instruction.
  2. The operating system installs an interrupt handler, and the handler reads the number passed along with the int instruction.
  3. The operating system executes the corresponding code according to that number.

System call details

Now that we know how system calls work in general, let's look at the implementation in detail. We need to figure out a few things:

  1. All user programs make system calls through the int 0x80 interrupt, so with so many system calls, how does the kernel know which one to run?
  2. How does the operating system handle the int 0x80 interrupt?

Start with the write system call

The C library defines macros for system calls with different numbers of parameters. Let's look at the three-parameter one, _syscall3:

#define _syscall3(type,name,atype,a,btype,b,ctype,c) \
type name(atype a,btype b,ctype c) \
{ \
    long __res; \
    __asm__ volatile ("int $0x80" \
        : "=a" (__res) \
        : "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b)),"d" ((long)(c))); \
    return (type) __res; \
}

This macro works by pasting its parameters together into a new function. Let's use write to see how _syscall3 unfolds. printf ends up calling write, which the library defines with _syscall3(int,write,int,fd,const char *,buf,off_t,count), so the macro expands as follows:

int write(int fd, const char *buf, off_t count)
{
    long __res;
    /* name = write, so __NR_##name becomes __NR_write */
    __asm__ volatile ("int $0x80"
        : "=a" (__res)                /* after the interrupt, __res = %eax */
        : "0" (__NR_write),           /* %eax = system call number */
          "b" ((long)(fd)), "c" ((long)(buf)), "d" ((long)(count)));
    return (int) __res;
}

We see that after the macro expands, write loads __NR_write into the EAX register before executing int 0x80. This __NR_write is called the system call number.

linux/include/unistd.h
#define __NR_write 4
......

The operating system knows which system call is wanted because the macro builds the system call number from the function name and passes it to the kernel. The kernel then handles the request according to that number.
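The name-to-number step can be tried out in ordinary user code. The sketch below is a model only: demo_trap is a made-up stand-in for int 0x80 that simply hands the number back, and the DEMO_ names are hypothetical. The value 4 for write comes from unistd.h above; 3 for read is assumed from its position in sys_call_table.

```c
#include <stdint.h>

/* write is 4 as in unistd.h; read is assumed to be 3 */
#define DEMO_NR_read  3
#define DEMO_NR_write 4

/* Hypothetical stand-in for "int 0x80": a real kernel would dispatch on nr,
 * this model just returns the number it was given. */
long demo_trap(long nr, intptr_t a, intptr_t b, intptr_t c)
{
    (void)a; (void)b; (void)c;
    return nr;
}

/* Same shape as _syscall3: the ## operator pastes the name into both the
 * generated function name and the system call number. */
#define DEMO_SYSCALL3(type, name, atype, a, btype, b, ctype, c) \
type demo_##name(atype a, btype b, ctype c) \
{ \
    return (type)demo_trap(DEMO_NR_##name, (intptr_t)(a), (intptr_t)(b), (intptr_t)(c)); \
}

DEMO_SYSCALL3(int, write, int, fd, const char *, buf, long, count)
DEMO_SYSCALL3(int, read, int, fd, char *, buf, long, count)
```

Here demo_write(1, "hi", 2) ends up passing 4 to demo_trap and demo_read passes 3, purely because of the name given to the macro.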

Interrupt handling of int 0x80

Now that we know how the operating system tells different system calls apart, let's take a look at what the int 0x80 interrupt does.

void sched_init(void){
    set_system_gate(0x80,&system_call);
}

Clearly, set_system_gate is used to install the handler for interrupt 0x80.

IDT table

As described in the chapter on operating system startup, the operating system has a GDT table that handles segment address mapping for CS. For interrupt handler entries, the operating system also has an IDT table that maps each interrupt number to its handler. Now that we know what the IDT is, let's look at what happens in set_system_gate. Notice that this is already kernel code.

#define set_system_gate(n,addr) \
    _set_gate(&idt[n],15,3,addr)

#define _set_gate(gate_addr,type,dpl,addr) \
__asm__ ("movw %%dx,%%ax\n\t" \
         "movw %0,%%dx\n\t" \
         "movl %%eax,%1\n\t" \
         "movl %%edx,%2" \
         :...)

Looking at the code above, we can see that _set_gate sets the DPL of the gate for system_call to 3, so a user program is allowed to invoke it. When the interrupt is taken, the CPU loads CS with the kernel code segment selector stored in the gate, which makes CPL 0, so the handler can then call other kernel functions. This part is still a little vague to me, and I will update it after I finish reading the operating system book.
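To make the role of the type and dpl arguments more concrete, here is a sketch of how the two 32-bit words of a gate can be packed in plain C. It assumes the standard 32-bit x86 gate layout and a kernel code selector of 0x0008; the struct and function names are made up for illustration and are not kernel code.

```c
#include <stdint.h>

/* The two 32-bit words of a gate descriptor, in memory order. */
struct gate_words {
    uint32_t low;    /* selector in bits 31..16, handler offset 15..0 */
    uint32_t high;   /* handler offset 31..16, then present/DPL/type bits */
};

/* Pack a gate for the handler at addr. Selector 0x0008 is an assumption. */
struct gate_words pack_gate(uint32_t addr, unsigned type, unsigned dpl)
{
    struct gate_words g;
    /* bit 15 = present, bits 14..13 = DPL, bits 11..8 = type */
    uint32_t flags = 0x8000u | ((uint32_t)dpl << 13) | ((uint32_t)type << 8);

    g.low  = (0x0008u << 16) | (addr & 0x0000FFFFu);
    g.high = (addr & 0xFFFF0000u) | flags;
    return g;
}

/* Read the DPL back out of the high word. */
unsigned gate_dpl(struct gate_words g)
{
    return (g.high >> 13) & 0x3u;
}
```

For type 15 and dpl 3 the flag bits come out as 0xEF00, and gate_dpl returns 3, which is why a user program at CPL 3 passes the check when it executes int 0x80.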

Interrupt handler: system_call

After int 0x80 is executed, control enters the interrupt handler system_call:

movl $0x10,%edx    # point ds and es at kernel space
mov %dx,%ds
mov %dx,%es
call _sys_call_table(,%eax,4)

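_sys_call_table is a table of function pointers. The operand _sys_call_table(,%eax,4) means _sys_call_table + %eax * 4: each entry is 4 bytes, so the system call number in %eax selects the corresponding system call in the table.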

fn_ptr sys_call_table[] = { sys_setup, sys_exit, sys_fork, sys_read, sys_write, .... };

sys_call_table contains a pointer to each system call, and the entry at index 4 is the address of the sys_write function. The internals of sys_write are explored later.
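The table lookup itself is easy to model in user-space C. In the sketch below, the demo_ functions are made-up stand-ins that only return a tag, and the bounds check returning -1 is an assumption of this model rather than something shown in the listing above.

```c
typedef int (*demo_fn_ptr)(void);

/* Hypothetical stand-ins for the kernel functions; each returns a tag. */
int demo_sys_setup(void) { return 100; }
int demo_sys_exit(void)  { return 101; }
int demo_sys_fork(void)  { return 102; }
int demo_sys_read(void)  { return 103; }
int demo_sys_write(void) { return 104; }

/* Same order as sys_call_table in the text, so index 4 is the write entry. */
demo_fn_ptr demo_call_table[] = {
    demo_sys_setup, demo_sys_exit, demo_sys_fork, demo_sys_read, demo_sys_write
};

#define DEMO_NR_CALLS (sizeof demo_call_table / sizeof demo_call_table[0])

/* Index the table with the call number, like call _sys_call_table(,%eax,4).
 * Numbers outside the table are rejected with -1 in this model. */
int demo_dispatch(unsigned nr)
{
    if (nr >= DEMO_NR_CALLS)
        return -1;
    return demo_call_table[nr]();
}
```

demo_dispatch(4) runs demo_sys_write, in the same way that system call number 4 reaches sys_write through the real table.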