Overview
In the previous chapter we saw how an operating system is read from disk into memory and starts running. This article is mainly about the interface an operating system provides for upper-level applications to use.
System calls
The interface that the operating system provides is called the system call, and the idea behind it is easy to understand. In Java we wrap an implementation behind an interface and program against that interface; the operating system does the same thing. Unix has the POSIX specification, which defines the names of the system calls, and different operating systems implement them.
Kernel mode and user mode
The operating system and our applications both live in memory. Why do we have to call interfaces provided by the operating system instead of calling functions inside it directly? Quite simply, kernel space holds a lot of important information that user programs must not access directly. So how does the operating system stop user programs from reaching into kernel space? It relies on help from the CPU hardware. The CPU locates the next instruction through CS and IP, where CS is the code segment register. A user program's code lives in a user segment and the operating system's code lives in a kernel segment, so when the CPU executes an instruction it checks whether access to the target segment is legal. If code in a user segment tries to access the kernel segment, the CPU refuses to execute it.
CPL and DPL
The lowest two bits of the CS register indicate the privilege level of the code that is currently running: 0 is kernel mode and 3 is user mode.
- CPL (current privilege level) is the level of the currently executing code: 3 for user code and 0 for operating system code.
- DPL (descriptor privilege level) is the level of the target segment being accessed or jumped to.
So the check the CPU makes while executing an instruction is very simple: it only needs to test DPL >= CPL, which guarantees that a user-mode program cannot access kernel data.
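The check can be modelled in a few lines of C. This is only a sketch of the rule as stated above, not how the hardware expresses it. The helper names privilege_of and access_allowed are made up for illustration, and the selector values in the usage note (0x08 for kernel code, 0x0f for user code) are assumed from Linux 0.11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the privilege level is the lowest two bits of a selector. */
static inline unsigned privilege_of(uint16_t selector)
{
    return selector & 0x3u;
}

/* Hypothetical helper modelling the rule from the text: access is allowed
 * only when DPL >= CPL (a numerically larger level means less privilege). */
static inline bool access_allowed(unsigned cpl, unsigned dpl)
{
    assert(cpl <= 3 && dpl <= 3);   /* x86 has four levels, 0 to 3 */
    return dpl >= cpl;
}
```

With a user selector such as 0x0f, privilege_of returns 3, and access_allowed(3, 0) is false. That is exactly the case of user code reaching for a kernel segment.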
The int 0x80 interrupt
As we have just seen, user access to kernel data is blocked at the CPU level, and the system calls the operating system provides also live in kernel mode. How does a user-mode program call a kernel-mode system call? The answer is an interrupt.
Interrupts are the only way for a user program to enter the kernel. The entry for int 0x80 has a DPL of 3, which means the operating system opens up an interface at DPL 3 so that user programs can call it without being blocked by the CPU. It works in three steps:
- The user program contains a piece of code with the int 0x80 instruction.
- The operating system provides the interrupt handler, which reads the number passed along with the int instruction.
- The operating system executes the corresponding code according to that number.
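The three steps above can be simulated in ordinary C without touching real interrupts. This is a toy model under stated assumptions: struct regs, trap_0x80, user_write and NR_WRITE are hypothetical names, the function call stands in for the int 0x80 instruction, and the number 4 is the value Linux 0.11 assigns to write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register file: only the registers the convention uses. */
struct regs { long eax, ebx, ecx, edx; };

enum { NR_WRITE = 4 };   /* 4 matches __NR_write in Linux 0.11; the enum name is made up */

/* Steps 2 and 3: the "kernel" reads the number from eax and runs the matching code. */
void trap_0x80(struct regs *r)
{
    assert(r != NULL);
    switch (r->eax) {
    case NR_WRITE:
        r->eax = r->edx;   /* pretend every byte was written */
        break;
    default:
        r->eax = -1;       /* unknown system call number */
        break;
    }
}

/* Step 1: the "user program" loads the number and arguments, then traps. */
long user_write(int fd, const char *buf, long count)
{
    struct regs r = { NR_WRITE, fd, (long)buf, count };
    trap_0x80(&r);         /* stands in for the int 0x80 instruction */
    return r.eax;          /* the result comes back in eax */
}
```

Calling user_write(1, "hi", 2) returns 2 in this model. The user side never calls the handler's code by name; it only hands over a number.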
System call details
Now that you know how a system call works, let's look at the implementation in detail. We need to figure out a few things:
- All user programs make system calls through the int 0x80 interrupt, so with so many system calls, how does the kernel know which one to run?
- How does the operating system handle the int 0x80 interrupt?
Start with the write system call
The C library defines one macro for each parameter count. Here we look at the three-parameter macro, _syscall3.
#define _syscall3(type,name,atype,a,btype,b,ctype,c) \
type name(atype a,btype b,ctype c) \
{ \
long __res; \
__asm__ volatile ("int $0x80" : "=a" (__res) \
    : "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b)),"d" ((long)(c))); \
return (type) __res; \
}
This macro pastes its parameters together into a new function. Let's use the write function to see how it unfolds. printf ends up calling write, and write is generated by _syscall3(int,write,int,fd,const char *,buf,off_t,count), so the macro expands as follows:
int write(int fd, const char *buf, off_t count)
{
    long __res;
    // inline assembly: name = write, so __NR_##name becomes __NR_write
    //   mov __NR_write, %eax; mov fd, %ebx; .....
    //   int 0x80
    //   after the instruction completes, __res = %eax
    return __res;
}
We see that when the macro for write expands, it builds the name __NR_write and loads its value into the EAX register. This __NR_write is called the system call number.
linux/include/unistd.h
#define __NR_write 4
......
The operating system knows which system call to run because the macro builds a system call number from the function name and passes it to the kernel. The kernel then handles the request according to that number.
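The name-to-number step relies on the preprocessor's ## operator, which can be tried out in isolation. The sketch below copies the shape of _syscall3 but never executes a real interrupt. DEMO_SYSCALL3, DEMO_NR_write, demo_trap and demo_last_nr are hypothetical names, and demo_trap simply pretends that write succeeded.

```c
#include <assert.h>

/* Hypothetical mirror of __NR_write; 4 is the value from unistd.h above. */
#define DEMO_NR_write 4

/* Hypothetical stand-in for the kernel side of int 0x80. */
long demo_last_nr = -1;
long demo_trap(long nr, long a, long b, long c)
{
    (void)a; (void)b;
    assert(nr >= 0);                         /* system call numbers are non-negative */
    demo_last_nr = nr;                       /* remember which number arrived */
    return nr == DEMO_NR_write ? c : -1;     /* pretend write wrote all bytes */
}

/* Same shape as _syscall3: the name argument is pasted onto a prefix
 * to select the number, and the three arguments are passed along. */
#define DEMO_SYSCALL3(type, name, atype, a, btype, b, ctype, c) \
type demo_##name(atype a, btype b, ctype c)                     \
{                                                               \
    long res = demo_trap(DEMO_NR_##name,                        \
                         (long)(a), (long)(b), (long)(c));      \
    return (type)res;                                           \
}

/* Expands to: int demo_write(int fd, const char *buf, long count) { ... } */
DEMO_SYSCALL3(int, write, int, fd, const char *, buf, long, count)
```

After demo_write(1, "hi", 2) returns, demo_last_nr holds 4. The wrapper carries no number in its source text; the preprocessor derived it from the word write.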
Interrupt handling of int 0x80
Now that you know how the operating system tells different system calls apart, let's take a look at what the int 0x80 interrupt does.
void sched_init(void){
    set_system_gate(0x80,&system_call);
}
Clearly, set_system_gate is used to install the handler for interrupt 0x80.
The IDT
As mentioned in the chapter on operating system startup, the operating system has a GDT that maps segment selectors such as CS to segment addresses. For interrupt entry points it also has an IDT, which maps interrupt numbers to their handlers. Now that we know what the IDT is, let's look at what happens inside set_system_gate. Note that this is already kernel code.
#define set_system_gate(n,addr) \
_set_gate(&idt[n],15,3,addr)
#define _set_gate(gate_addr,type,dpl,addr) \
__asm__("movw %%dx,%%ax\n\t" \
"movw %0,%%dx\n\t" \
"movl %%eax,%1\n\t" \
"movl %%edx,%2" \
:...)
Looking at the code above, we can see that set_system_gate passes a DPL of 3 to _set_gate, so the IDT entry for int 0x80 has DPL 3 and there is no problem when a user program calls system_call. Once control passes through the gate into the handler, the CPL becomes 0, so the handler can call other functions inside the kernel. This part is still a little vague to me, and I will update it after I finish reading the operating system book.
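To see where the DPL ends up, here is a sketch that packs one IDT entry in C instead of inline assembly. It follows the x86 gate descriptor layout; struct gate, make_gate and gate_dpl are hypothetical names. The type 15 and DPL 3 come from set_system_gate above, while the selector 0x0008 in the usage note is my assumption for the kernel code segment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 8-byte IDT entry as two 32-bit words. */
struct gate { uint32_t low, high; };

/* Pack a gate from handler offset, segment selector, gate type and DPL. */
struct gate make_gate(uint32_t addr, uint16_t selector, unsigned type, unsigned dpl)
{
    assert(type <= 15 && dpl <= 3);
    struct gate g;
    /* low word: selector in bits 16..31, offset bits 0..15 in bits 0..15 */
    g.low  = ((uint32_t)selector << 16) | (addr & 0xffffu);
    /* high word: offset bits 16..31, present bit 15, DPL in bits 13..14, type in bits 8..11 */
    g.high = (addr & 0xffff0000u) | 0x8000u | (dpl << 13) | (type << 8);
    return g;
}

/* Read the DPL back out of a packed gate. */
unsigned gate_dpl(struct gate g)
{
    return (g.high >> 13) & 0x3u;
}
```

For example, make_gate(0x12345678, 0x0008, 15, 3) gives low = 0x00085678 and high = 0x1234EF00, and gate_dpl reads back 3. The CPU compares that value with the caller's CPL when int 0x80 executes.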
Interrupt handler: system_call
Here is what the handler for int 0x80 does:
movl $0x10,%edx                 # 0x10 selects the kernel data segment
mov %dx,%ds
mov %dx,%es                     # ds and es now point at kernel space
call _sys_call_table(,%eax,4)   # call the entry at _sys_call_table + %eax*4
_sys_call_table is a table of function pointers. The handler uses the system call number in %eax, scaled by 4 because each entry is a 4-byte pointer, to find the corresponding system call in that table.
fn_ptr sys_call_table[] = { sys_setup, sys_exit, sys_fork, sys_read, sys_write, .... };
sys_call_table contains a pointer to each system call, and the entry at index 4 is the address of the sys_write function. What happens inside sys_write is explored later.
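The table lookup can be sketched in C as well. The handlers below are hypothetical stubs that only return their own index, and demo_dispatch is a made-up name. The bounds check is my addition for safety in the example and is not taken from the assembly shown above.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_ptr)(void);

/* Hypothetical stubs standing in for the real handlers; each returns its own index. */
static int demo_sys_setup(void) { return 0; }
static int demo_sys_exit(void)  { return 1; }
static int demo_sys_fork(void)  { return 2; }
static int demo_sys_read(void)  { return 3; }
static int demo_sys_write(void) { return 4; }

static const fn_ptr demo_sys_call_table[] = {
    demo_sys_setup, demo_sys_exit, demo_sys_fork, demo_sys_read, demo_sys_write,
};

#define DEMO_NR_CALLS (sizeof demo_sys_call_table / sizeof demo_sys_call_table[0])

/* Look up the handler by number and call it; reject numbers past the end. */
int demo_dispatch(long nr)
{
    if (nr < 0 || (size_t)nr >= DEMO_NR_CALLS)
        return -1;
    assert(demo_sys_call_table[nr] != NULL);
    return demo_sys_call_table[nr]();
}
```

demo_dispatch(4) lands on the write stub. Indexing the array is the C counterpart of _sys_call_table(,%eax,4): the compiler multiplies the index by the pointer size, which is 4 bytes on 32-bit x86.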