Extraction of the multi-process image

1. Intuitive ideas for CPU management

2. Multi-process images

Multiple processes reside in memory at the same time, so the contents at an address used by one process may be overwritten by a second process. This is obviously unacceptable.

Our first thought might be the DPL: could we set privilege levels so that processes cannot touch each other's memory? But this misses the point. The DPL is designed to protect the operating system from user code; every user process runs at DPL 3, the user privilege level, so processes at the same level can still access each other. Privilege levels are the wrong tool here.

So how do we make sure that processes don't conflict? The intuitive solution is to confine each process, restricting the memory addresses it can access.

The tool for limiting access is the mapping table. The idea is to give each process a virtual address space; the mapping table translates virtual addresses to physical ones, separating processes by address.
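A minimal sketch of the idea in C (a toy lookup table, not the real x86 paging hardware; all names and sizes here are made up for illustration):

#include <stdio.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 16

// A toy per-process mapping table: virtual page -> physical page.
// Real hardware walks page tables set up by the OS; this only shows
// why the same virtual address in two processes lands in two
// different physical locations.
struct process {
    int page_table[NUM_PAGES];
};

unsigned translate(struct process *p, unsigned vaddr) {
    unsigned vpage  = vaddr / PAGE_SIZE;
    unsigned offset = vaddr % PAGE_SIZE;
    return p->page_table[vpage] * PAGE_SIZE + offset;
}

int main(void) {
    struct process p1 = {{ [0] = 7 }};   // virtual page 0 -> physical page 7
    struct process p2 = {{ [0] = 8 }};   // virtual page 0 -> physical page 8
    // same virtual address, two different physical addresses
    printf("%#x %#x\n", translate(&p1, 0x100), translate(&p2, 0x100));
    return 0;
}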

At this point we see that the so-called multi-process image involves not only process management but also memory management.

How do multiple processes collaborate

Let’s start with the following example

Suppose there is a free region of memory and two processes both want to use it. If both write to it, the region ends up holding two interleaved, incomplete fragments. Processes can cooperate through shared memory, but the cooperation has to be coordinated.

So how do we coordinate? On paper, we can see it in the producer-consumer example:

Producers continually put data into the buffer, and consumers take data out of the buffer for processing. Producers block when the buffer is full, and consumers block when the buffer is empty.

We will implement the producer-consumer problem in three ways, in the process synchronization section below.

Thread extraction and implementation

From the previous section we know that the multi-process image is the most basic and core image of an operating system. Multiple processes can be organized, can cooperate, and can be switched.

Multi-process organization is achieved through states and queues

How do so many processes get switched? Read on.


1. User-level threads

Introducing threads

If we want to learn about switching between processes, why are we learning about threads?

First, look at process switching at the macro level. A process, namely an instruction execution sequence plus resources, switches to another process after an interrupt; during execution, the memory addresses its instructions access are obtained through the mapping table. The switch is caused by an interrupt and therefore enters the kernel, where it involves reallocating resources, above all the mapping table, and that resource switch costs time. So, to make execution more efficient: can we separate the resources from the instruction execution, and switch only the instruction sequence without moving the resources?

Stated intuitively: keep the resources in place and switch only the execution sequence.

This brings us to the thread, a lightweight process.

The value of threads

To ask whether threads are practical is to ask whether the idea of "multiple execution sequences plus one address space" is practical.

Let’s look at the following example

What this page shows is that first the text is loaded, then the image is loaded, and finally the entire content is displayed.

In fact, its program execution flow is multiple threads reading the same resource from the browser buffer. When the program executes the display text thread, it switches to the display image thread during this period, and then switches back again after a period of time, giving the user a better experience.

Put a little awkwardly but accurately: multithreaded execution is, in a nutshell, several instruction streams advancing in the same address space:
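As an illustration in ordinary user code (POSIX threads stand in here; the course builds its own threads by hand below), two execution sequences sharing one address space:

#include <pthread.h>
#include <stdio.h>

// Both threads read the same buffer, the way a browser's text thread
// and image thread share the browser's download buffer.
static const char *shared_buffer = "page content";

static void *show_text(void *arg)  { printf("text: %s\n", shared_buffer);  return NULL; }
static void *show_image(void *arg) { printf("image: %s\n", shared_buffer); return NULL; }

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, show_text, NULL);
    pthread_create(&t2, NULL, show_image, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}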

Thread switching

As the opening picture shows, switching between threads is driven by a function called Yield, so our core task is to write the implementation of the Yield function.

To complete a switch, we must save the state of the outgoing thread, and that state lives on the stack. Let's start with two execution sequences sharing one stack.

Two execution sequences and one stack

What does this picture mean? Thread 1 runs A, which calls B, pushing the return address 104 onto the shared stack; inside B, Yield is called at address 204, so 204 is pushed and Yield jumps to thread 2. Thread 2 runs C, which calls D, and D calls Yield, pushing 404; this Yield jumps back to 204 to resume thread 1.

B now finishes and executes ret. Uh-huh, did I pick up the wrong script? We wanted to pop 104 and return into A, but the top of the stack holds 404, thread 2's return address, so execution jumps to 404 and the two threads chase each other in an endless loop…

Why? Because the two threads share the same stack!!

So we naturally need to go from one stack to two stacks.

Two execution sequences and two stacks

Now when a thread switches, the first thing the Yield function must do is switch the stack. That requires a container holding each thread's stack state, above all its stack pointer: the TCB, the thread control block.

With two stacks, Yield only needs to switch the stack pointer through the TCB. When Yield reaches its closing }, the ret instruction executes, pops the return address from the new stack, and execution continues in the other thread.

The pops now come from thread 1's own stack: 204 at the end of Yield, then 104 at the end of B. A thread switch is thus neatly completed.

  • There is no need for a jmp 204: once the stack is switched, the Yield function simply runs on to its end, and sequential execution plus the closing ret take care of the jump.

After completing the two stacks, we finish with the ThreadCreate function, which allocates a stack and a TCB for each new thread, associates the two, and seeds the stack with the thread's start address. A compilable sketch follows.
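Here is a sketch of the whole machinery, with POSIX ucontext standing in for the hand-written stack-switching assembly (Yield, ThreadCreate, and the two-TCB layout mirror the course's design; all names are assumptions):

#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

// one TCB per thread: its saved context plus its own stack
struct tcb { ucontext_t ctx; char stack[STACK_SIZE]; };

static struct tcb tcbs[2];
static int current = 0;

static void Yield(void) {
    int prev = current;
    current = 1 - current;                              // pick the other thread
    swapcontext(&tcbs[prev].ctx, &tcbs[current].ctx);   // save state, switch stacks
}

static void worker(int id) {
    for (int i = 0; i < 3; i++) { printf("thread %d\n", id); Yield(); }
    exit(0);                                 // threads never return from here
}

static void ThreadCreate(struct tcb *t, int id) {
    getcontext(&t->ctx);
    t->ctx.uc_stack.ss_sp   = t->stack;      // associate the TCB with its stack
    t->ctx.uc_stack.ss_size = STACK_SIZE;
    t->ctx.uc_link = NULL;                   // worker exits, so no successor needed
    makecontext(&t->ctx, (void (*)(void))worker, 1, id);
}

int main(void) {
    ThreadCreate(&tcbs[0], 0);
    ThreadCreate(&tcbs[1], 1);
    ucontext_t main_ctx;
    swapcontext(&main_ctx, &tcbs[0].ctx);    // start thread 0
    return 0;
}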

Thus the picture of a complete process emerges: one address space plus several user-level threads, each with its own stack and TCB —


At this point we can answer why this part is called user-level threads: Yield is an ordinary user-level function, invisible to the kernel.

If user-level threads can switch so tidily, why do we need kernel-level threads? Because of the user level's inherent limitation, of course; seen from another angle, kernel-level threads are the extension of user-level threads.

Disadvantages of user-level threads

When a user-level thread starts an operation that blocks, say I/O, the CPU traps into kernel mode and runs the handler. The kernel only knows about the process, not the threads inside it, so it switches to another process; the blocked process's other user-level threads cannot be scheduled in, and none of them execute until the whole process resumes.

2. Kernel-level threads

3. Kernel-level thread implementation example

Let’s start by reviewing kernel-level thread switching

At a macro level, kernel-level threads evolve from the two stacks of user-level threads to two sets of stacks: each thread gets a user stack and a kernel stack, linked by the interrupt mechanism. The switch works in roughly five steps

  1. The user stack jumps to the kernel stack via the int interrupt instruction
  2. The outgoing thread's state is saved on its kernel stack, reachable through its thread control block TCB1
  3. The schedule function picks the next thread and switches kernel stacks
  4. The incoming thread's state is restored from its kernel stack, reachable through TCB2
  5. An iret jumps from the kernel stack back to the user stack

Let’s implement this kernel level thread:


The story starts with entering the kernel, that is, with an interrupt. As our system call we choose fork(). fork() creates a process; a process is essentially resources plus instruction sequences, and the instruction sequences are carried by threads, so by following fork we can watch a kernel-level thread switch being performed.

Let’s look at the following example

Interrupt entry

A function A is called in main, and during A's execution the fork system call is made. fork raises interrupt 0x80; on entry the hardware first saves the user-mode stack, pushing SS and SP (along with EFLAGS, CS, and IP), and then jumps to the interrupt handler, the system_call function —

system_call starts the switching logic after sys_fork returns: the current thread's PCB pointer is loaded into eax and its state field is tested. Non-zero state means blocked, so schedule must run to complete a switch; schedule is also called when the time slice (counter) has run out.

Interrupt exit

So how do we get back? Through the interrupt exit. What does it do? It mirrors the interrupt entry: every push done on the way in gets a matching pop on the way out, ending with iret.

We described the interrupt entry and the interrupt exit, so what happens during the interrupt?

The interrupt first finds the thread to switch to, and then switches to that thread. This involves thread scheduling and switching, and we’ll talk about thread scheduling later, but let’s just look at thread switching

Interrupt process – thread switching

Thread switching is done by switch_to; in Linux 0.11 the function achieves the jump mainly through a long jump (ljmp) to a TSS selector —

The TSS, the task state segment, is where the CPU saves a task's registers; it is part of the process control block PCB.

The CPU locates the current TSS through the TR register (a selector register, like CS): TR indexes a TSS descriptor in the GDT table, and that descriptor points to the TSS segment in memory.

The long jump then does little more than this: the hardware dumps all current registers into the old task's TSS, follows the jump's selector to the other thread's TSS descriptor, finds its TSS segment, and loads every register from it.

However, implementing the jump through the TSS is inefficient, because the long jump is a long microcoded sequence of register saves and loads. So instead we will complete the switch using the kernel stack.

Refer to the Linux process switch with TSS structure blog.51cto.com/stonehorse/…
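A sketch of the kernel-stack alternative, in the style of the course's lab rather than the 0.11 source (the kernelstack field and the registers carrying the PCB pointers are assumptions):

switch_to:				# assume: old PCB in %eax, next PCB in %ebx
	pushl %ebp			# save callee state on the OLD kernel stack
	movl %esp, kernelstack(%eax)	# old PCB remembers its kernel stack top
	movl %ebx, current		# current = next
	movl kernelstack(%ebx), %esp	# adopt the NEW thread's kernel stack
	popl %ebp			# state saved when the new thread last left
	ret				# pops the return address from the NEW stack

No TSS descriptor, no microcoded register dump: a handful of mov instructions swap the stacks.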


In the previous part we learned about thread switching; now let's zoom in on thread creation, the kernel-level counterpart of the ThreadCreate function

Start with the sys_fork function

In sys_fork, the remaining user-mode registers are pushed first, so that the child can be handed the same user-level context. sys_fork then calls copy_process, and those pushed values are exactly its long list of parameters.

So what are the details of child process creation?

First, we request one page of memory. Recall where pages come from: mem_map is initialized at startup and memory is divided into pages, with 4 KB as the smallest unit; when kernel space is needed, a page whose mem_map entry is 0, i.e. an unused page, is picked.

Then the TCB is created at the start of the requested page, and the kernel stack takes the rest of the page, growing down from the page's top (p + 4 KB).

Then create a kernel stack and a user stack, and finally associate the TCB with the stack.

Finally a series of initializations. The ones of interest: eip is initialized to the address of the instruction after the interrupt, the one popped from the stack on return, and eax is initialized to zero. So what does eax do?

It explains a judgment we see all the time: if (!fork()) { /* child code */ }. fork's return value comes back in eax, so in the child eax == 0 and the branch is taken, while in the parent eax != 0 (the child's pid) and the branch is skipped.

With the child's execution condition clear, let's look at the code the child actually executes, which is loaded by the exec system call —

We go from the parent's shell to the child's exec. When the kernel is done setting up the child and needs to return to user mode, where does it return to? We know the interrupt ends with an iret, so the return address is the eip value that iret pops from the kernel stack. Where does that eip point?

Where eip points is decided inside the exec (do_execve) system call. During its execution, the entry address a_entry from the executable's header struct exec is written into the saved eip slot, so the iret jumps straight to the program's entry point. a_entry is fixed at link time, when the executable is produced; at exec time the file's contents on disk are brought into memory and executed.
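The key lines, paraphrased from Linux 0.11's do_execve (eip points at the saved user-mode context on the kernel stack; take this as a sketch of the idea, not the verbatim source):

int do_execve(unsigned long *eip, long tmp, char *filename, ...){
    struct exec ex;		// header of the a.out executable, read from disk
    // ... read the header, set up paging, build the new user stack ...
    eip[0] = ex.a_entry;	// saved eip slot: iret jumps to the program entry
    eip[3] = p;			// saved esp slot: the new user stack pointer
}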

4. The operating system tree

A system call is a kind of soft interrupt handler

In this section we won't learn new material, because the old material is still a tangle. We need to string the earlier knowledge points into one line, so that it sprouts from a bud, is watered a little at a time, and finally grows into a towering tree.

Even the operating system, a very big tree indeed, spreads from a very small spark. As teacher Li Zhijun reminds us –

The mind is not a vessel that needs filling, but wood that needs igniting

Our brain is not a container full of water, but a flame.

So for the rest of our content we will start with a very small, very specific, very valuable IDEA —

  • We want to print A and B alternately on the screen

Let’s first review the thread switching idea

  • Make the CPU work -> the CPU alone doesn't behave -> to keep it busy, jump from function A to function B -> jumping with 1 stack and 1 Yield causes confusion -> try two stacks and two TCBs -> works fine, but only in user mode, which is not enough -> introduce two sets of stacks, namely kernel stack switching

Let’s start with the user code

main(){
    if(!fork()){ while(1) printf("A"); }
    if(!fork()){ while(1) printf("B"); }
    wait();
}

Let’s express it in assembly

main(){
	mov _NR_fork,%eax
	int 0x80
100:	mov %eax,res
	cmpl res,0
	jne 208
200:	printf("A")
	jmp 200
208:	...
304:	wait()
}

The int 0x80 interrupt —

main(){
	mov _NR_fork,%eax
	int 0x80
100:	mov %eax,res
	cmpl res,0
}

set_system_gate(0x80,&system_call);
system_call:
	call sys_call_table(,%eax,4)

First the system call number __NR_fork is loaded into eax, and int 0x80 takes us into system_call. There, call sys_call_table(,%eax,4) indexes the table with eax to reach sys_fork; the parent eventually gets a non-zero value (the child's pid) back in eax, while the child gets 0 —

sys_fork:
	pushl ...		# push gs, esi, edi, ebp, eax: arguments for copy_process
	call copy_process
	ret

copy_process creates the child process as a copy of the parent —

copy_process(..., long eip, ...)
{
    p = (PCB *)get_free_page();     // one free page holds the PCB and the kernel stack
    p->tss.esp0 = (long)p + 4096;   // kernel stack grows down from the top of the page
    p->tss.esp = esp;               // user stack: same as the parent's
    p->tss.eax = 0;                 // the child's fork() return value
    p->tss.eip = eip;               // child resumes at the instruction after int 0x80
}

Once the PCB is created and initialized, the child exists; ret returns into the parent's kernel context, and iret returns to user mode.

Back in main, the two forks create child after child, and main then executes down to wait(), whose implementation is —

sys_waitpid(){
	current->state = TASK_INTERRUPTIBLE;
	schedule();
}

Set the state of the current thread to blocked, and start the switch —

schedule(){
    if((*p)->state == TASK_RUNNING && (*p)->counter > c)
        c = (*p)->counter, next = i;
    ...
    switch_to(next);
}

Among tasks whose state is running (ready), the scheduling algorithm picks the next thread to switch to, then enters the switch —

After switching, B gets printed. But we need the output to alternate, so we must get from B back to A, which means another switch. The switch happens in kernel mode, and user mode enters the kernel through an interrupt. So which interrupt do we use?

The clock interrupt!!

void sched_init(void){
    set_intr_gate(0x20, &timer_interrupt);
}

timer_interrupt:		# assembly entry of the clock interrupt
	call do_timer

void do_timer(...){
    if (--current->counter > 0)	// time slice not used up yet
        return;
    current->counter = 0;
    schedule();
}

The clock interrupt vector 0x20 is bound to timer_interrupt, which calls do_timer. do_timer decrements the current thread's counter, its remaining time slice; if the result is still greater than 0, no switch is made, otherwise schedule is called to switch threads.

Finally switch_to is called inside schedule. The cycle keeps turning, and A and B alternate. The end ~

CPU scheduling

In the description above, a thread switch went silky smooth, but we always treated CPU scheduling deliberately as a black box. Scheduling really is a thorny task, yet understanding it deepens our understanding of system design. So in this section, knowing full well there are tigers on the mountain, we head for the mountain anyway.


1. CPU scheduling policy

When a thread blocks, we must switch to another thread, but which one? Intuitively, we want tasks done as quickly as possible. But how is that efficiency measured? Tasks come in different types depending on the angle of classification: I/O-bound and CPU-bound, foreground and background.

We pick out a few key points to design a scheduling algorithm around

  1. Short turnaround time: from a task's arrival to its completion, i.e. finish as fast as possible. Turnaround time = waiting time + running time.
  2. Short response time: from the user's action to the system's response.
  3. Low overhead: the system wastes little time on itself, so throughput is high and many tasks get done.

Let's look at concrete algorithm ideas and measure each against the metrics above:

FCFS

First come, first served is the most intuitive scheduling algorithm, but a system is complex, and considering fairness alone yields only half a result.

Assume there are n tasks and the turnaround time of task i is Ti. The average turnaround time is (T1 + T2 + ... + Tn) / n, and under FCFS the turnaround time of the nth task is the sum of all n run times: t1 + t2 + ... + tn. If t1 alone is greater than the sum of all the others, every task queued behind it inherits that wait and the average turnaround time blows up, which is clearly unreasonable.

Could FCFS be fixed by ordering execution by task length? That is exactly SJF, the shortest-job-first scheduling algorithm

SJF

Assume there are n tasks scheduled shortest-first as P1, P2, ..., Pn. The total turnaround time is n·t1 + (n-1)·t2 + ... + 1·tn, where ti is the run time of Pi: the shortest jobs carry the largest weights, so the total (and average) turnaround time drops markedly.
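A quick check with made-up numbers: four tasks with run times 10, 1, 1, 1.

\[
\text{FCFS } (10,1,1,1):\ \bar T = \frac{10+11+12+13}{4} = 11.5
\qquad
\text{SJF } (1,1,1,10):\ \bar T = \frac{1+2+3+13}{4} = 4.75
\]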

But what if a long job is urgent? Shouldn't it get some share of the CPU early? This brings us to RR, the round-robin time-slice scheduling algorithm.

RR

The idea of round robin is to subdivide the work: the CPU hands each job a time slice in turn, interleaving the tasks at the granularity of a slice.

But there is a trade-off here: if the time slice is large, response time gets long; if the time slice is small, the system burns its time on switching, overhead rises, and throughput drops.
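To put rough numbers on the trade-off (a standard back-of-envelope estimate; n ready tasks, time slice q, and switch cost s are assumed symbols):

\[
\text{worst-case response} \approx (n-1)(q+s),
\qquad
\text{overhead fraction} = \frac{s}{q+s}
\]

A large q stretches the first expression; a tiny q drives the second toward 1.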

At this point we can only manage a wry smile ~

What if our need for turnaround time and response time exists at the same time?

Set priorities! And follow one principle: what people care most about is interaction latency, so foreground tasks must respond quickly. So we schedule foreground tasks by RR and background tasks by SJF, and move between the two kinds of tasks by dynamically adjusting priorities.

The pile of metrics in the previous section suggests the algorithm must be cumbersome, yet the schedule function in Linux 0.11 handles the problem gracefully

2. An actual schedule function

Before we start, note that time slice and priority are related: a large time slice is, to some extent, itself a high priority. And an algorithm that truly solves a complex problem carries the same depth of thought even when its code is short.

Now let's look at the implementation of the schedule function —

void schedule(void)
{
    int i, next, c;
    struct task_struct **p;

    while (1) {
        c = -1; next = 0; i = NR_TASKS;
        p = &task[NR_TASKS];
        while (--i) {                     // scan the task array from the back
            if (!*--p) continue;
            if ((*p)->state == TASK_RUNNING && (*p)->counter > c)
                c = (*p)->counter, next = i;   // ready task with the largest counter
        }
        if (c) break;                     // found one with time slice left
        for (p = &LAST_TASK; p > &FIRST_TASK; --p)   // all slices used up:
            if (*p)                                  // recompute every counter
                (*p)->counter = ((*p)->counter >> 1) + (*p)->priority;
    }
    switch_to(next);
}

In this function counter is the key: it serves as both the time slice and the priority

// the time slice
void do_timer(...){
    if (--current->counter > 0)
        return;
    current->counter = 0;
    schedule();
}

That is, when the time slice reaches 0, a switch is made. Here counter implements round-robin rotation, which guarantees the response time.

And in the following program, counter represents the priority —

if ((*p)->state == TASK_RUNNING && (*p)->counter > c) {   // pick the ready task with the largest counter
    c = (*p)->counter;
    next = i;
}

for (p = &LAST_TASK; p > &FIRST_TASK; --p) {   // recompute every task's counter
    (*p)->counter = ((*p)->counter >> 1) + (*p)->priority;
}

Blocked processes come out of the recomputation with higher counters than processes that kept running (half the old value plus priority, accumulated while blocked), so I/O-bound, i.e. foreground-like, processes are favored.

But will counter grow without bound and ruin the response time? We can look at it quantitatively.

Suppose a process's priority is p. Each recomputation sets counter = counter/2 + p, so a process that stays blocked approaches the limit p + p/2 + p/4 + … = 2p. The time slice is therefore bounded by 2p, and so is the response time.

Process synchronization and deadlocks

In the previous chapter we learned how switching between threads is completed, and a process equals resources plus instruction sequences (threads). To complete process switching, that is, to run processes concurrently, it is not enough to know how the CPU is scheduled: how are resources represented and counted, what problems arise when scheduled processes contend for them, and how are those problems solved? All of these questions are answered in the sections below


1. Process synchronization and semaphore

So-called process cooperation means several processes completing one task together. We met the producer-consumer example before; here we push the analysis further —

For the producer, it produces a resource and puts it into a buffer, and when the buffer is full, it stops producing.

Similarly, for the consumer, it consumes resources from the buffer, and when the buffer is empty, it stops consuming.

From this mechanism, we see that we first need a buffer to put the resource —

#define BUFFER_SIZE 10
// the structure of a resource item (fields omitted)
typedef struct { /* ... */ } item;
// the buffer
item buffer[BUFFER_SIZE];
int in = 0, out = 0, counter = 0;

Next, let’s look at the most intuitive producer-consumer process model

// producer
while (true) {
    while (counter == BUFFER_SIZE);   // busy-wait while the buffer is full
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}

// consumer
while (true) {
    while (counter == 0);             // busy-wait while the buffer is empty
    item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
}

This most basic model does control the relative timing of producer and consumer, but only by spinning, and process scheduling is unpredictable. So we extend it: let the counter signal put processes to sleep and wake them up.

// producer
while (true) {
    while (counter == BUFFER_SIZE) {
        sleep();	// buffer full: the producer joins the sleep queue
    }
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
    if (counter == 1) {
        wakeup();	// buffer was empty: wake the consumer
    }
}

// consumer
while (true) {
    while (counter == 0) {
        sleep();	// buffer empty: the consumer joins the sleep queue
    }
    item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    if (counter == BUFFER_SIZE - 1) {
        wakeup();	// buffer was full: wake the producer
    }
}

Thus the second version of the producer-consumer model takes shape. It is undoubtedly closer to correct, but a detail is wrong, because the counter signal only records a macro state

Suppose the buffer is full, so counter == BUFFER_SIZE. Producer P1 runs and goes onto the sleep queue; producer P2 runs and also goes onto the sleep queue. A consumer then runs once: the buffer shrinks by 1 and the wakeup fires, waking one producer. But before that producer executes, the consumer runs again; now counter == BUFFER_SIZE - 2, the wakeup condition no longer holds, and producer P2 is never woken.

The root cause: the counter signal only describes how full the buffer is, not how many processes are asleep waiting on it. The state "one slot free" is not enough; we also need to know "and two producers are still waiting".

So we want a signal that can also express the sleeping and waking states. Of course, at that point it is no longer a mere signal, so why not just call it a semaphore?

From signal to semaphore

From the semaphore we can determine both when to sleep and whom to wake.

Write the semaphore as sem. When sem == 0, there are no free resources and nobody waiting (note that for the producer, the "resource" is free buffer space). When sem < 0, processes are waiting for the resource: sem == -2 means two processes are waiting. When sem > 0, it counts how many resources are free.

Next we give the official definition of semaphore:

In 1965 the Dutch scholar Dijkstra proposed the semaphore: an integer used for record-keeping, combined with sleep and wakeup used for signalling.

We further refine the semaphore:

struct semaphore {
    int value;    // the number of resources (negative: how many processes wait)
    PCB *queue;   // the queue of waiting processes
};

From semaphores come the P and V primitives: P for test (Dutch proberen), V for increment (verhogen).

The following is an implementation of P,V primitives —

// P primitive: take a resource; if there is none, sleep on the queue
P(semaphore s){
    s.value--;
    if (s.value < 0)
        sleep(s.queue);
}

// V primitive: release a resource; if someone waits, wake the queue
V(semaphore s){
    s.value++;
    if (s.value <= 0)
        wakeup(s.queue);
}

At this point, we can return to the producer and consumer problem, without further ado, the code:

int fd = open("buffer.txt");
write(fd, 0, sizeof(int));	// write in = 0
write(fd, 0, sizeof(int));	// write out = 0

semaphore full = 0;
semaphore empty = BUFFER_SIZE;
// the mutual-exclusion semaphore is 1: one modifier at a time
semaphore mutex = 1;

Producer(item){
    P(empty);	// wait for a free slot (blocks when the buffer is full)
    P(mutex);	// enter the critical section
    // write item at position in, advance in
    V(mutex);	// leave the critical section
    V(full);	// one more filled slot
}

Consumer(item){
    P(full);	// wait for a filled slot (blocks when the buffer is empty)
    P(mutex);	// enter the critical section
    // read item from position out, advance out
    V(mutex);	// leave the critical section
    V(empty);	// one more free slot
}

Silky smooth ~

This is a big step, but one small flaw remains: because the scheduling order is uncertain, two processes may access and modify a semaphore at the same time

2. The critical section of the semaphore

Why must critical sections be protected? And what is a critical section?

Because two processes race on the same piece of data, the code that touches that data must become a critical section. A race condition here means a scheduling-dependent semantic error on shared data.

  • Critical section: a section of code that only one process at a time is allowed to enter. The basic principle of critical sections is mutual exclusion.

However, declaring a piece of code a critical section is not enough. A good critical-section scheme must also satisfy progress (if nobody is inside, someone who wants in gets in) and bounded waiting.

Start with the first attempt at implementing a critical section: the turn-taking (rotation) method.

1. The rotation method

The design is mutually exclusive, but it forces strict alternation: if process P0 has just left the critical section, it cannot enter again until P1 takes its turn, which violates the progress ("free entry") principle. A sketch is below, and then on to the next attempt —
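A sketch of the turn-taking method for two processes (variable names assumed):

int turn = 0;		// shared: whose turn it is

// process P0:
while (turn != 0);	// spin until it is P0's turn
/* critical section */
turn = 1;		// hand the turn to P1

// process P1:
while (turn != 1);	// spin until it is P1's turn
/* critical section */
turn = 0;		// hand the turn back to P0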

2. The marking method

Marking (each process raises a flag before entering) solves the long CPU idling and is still mutually exclusive, but consider the following scenario —

If both marks are true at the same time, neither process ever runs. An analogy from daily life: there is no milk in the fridge; the husband sees the note his wife left and assumes she will buy it, while the wife sees his note and assumes he will, so in the end nobody actually buys the milk.

This is, in effect, a deadlock of politeness; in real life it would take a rather proactive person to break the standoff. The method, sketched:
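(The milk scenario is exactly the case where both processes raise their flags before either tests the other's.)

int flag[2] = { false, false };		// flag[i]: process i wants to enter

// process P0:
flag[0] = true;		// raise my flag first
while (flag[1]);	// wait while P1 also wants in
/* critical section */
flag[0] = false;

// process P1:
flag[1] = true;
while (flag[0]);	// if both flags are up, both spin here forever
/* critical section */
flag[1] = false;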

3. Peterson algorithm

Mapping this back to process collaboration: when both flags are true, one process must be allowed to execute while the other does nothing.

That is the idea of Peterson's algorithm: it meets all the conditions of a good critical section and is the classic two-process solution!!
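Peterson's algorithm in its standard form, combining the flag and the turn (i is this process's index, j = 1 - i is the other's):

int flag[2] = { false, false };
int turn;

// process Pi:
flag[i] = true;			// mark: I want to enter
turn = j;			// turn: politely let the other go first
while (flag[j] && turn == j);	// wait only if the other wants in AND holds the turn
/* critical section */
flag[i] = false;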

With two-process contention over the critical section solved, what about n processes?


Bakery algorithm

As emphasized at the beginning, the criteria for a critical-section algorithm are mutual exclusion, progress (free entry), and bounded waiting. In the previous part we saw that satisfying them takes a combination of the rotation idea and the marking idea. How does that combination work for many processes?

  • Rotation: every process takes a queue number, and the smallest number enters first

  • Marking: while a process is choosing its number, it is marked as choosing

First process i is marked as choosing, and num[i] is set to one more than the largest number currently taken.

Then the mark is cleared; it exists only so that others wait while a number is being picked. Process i then waits, for each other process, until that process is not choosing and holds no smaller (number, id) pair; then i enters the critical section, and on exit sets num[i] back to 0 and the next round begins.
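The bakery algorithm for N processes, in its standard form (the max over num[] is written informally):

#define N 5			// number of processes (assumed)
bool choosing[N];		// process j is picking its number
int num[N];			// 0: not waiting; otherwise the queue number

// process i:
choosing[i] = true;
num[i] = 1 + max(num[0], ..., num[N-1]);	// take a ticket past the largest
choosing[i] = false;
for (int j = 0; j < N; j++) {
    while (choosing[j]);			// wait while j picks its number
    while (num[j] != 0 &&
           (num[j] < num[i] ||			// j holds a smaller ticket,
            (num[j] == num[i] && j < i)));	// or an equal ticket and smaller id
}
/* critical section */
num[i] = 0;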

The bakery algorithm does solve multi-process synchronization, but what a headache this cumbersome algorithm is. Step back and ask why critical sections are needed at all: because scheduling is uncontrolled. Scheduling lives in the kernel, and the kernel is entered through interrupts, so couldn't a process simply forbid interrupts to protect the critical section? Does that work?

This brings in help from the hardware.

The method: cli() clears the interrupt-enable flag so the CPU stops responding to interrupts; after the critical-section code runs, sti() sets the flag again. With no clock interrupt there is no switch, so only one process runs the critical section at a time. The core of the method is that flag, but it sits inside each CPU: on a multi-core machine, clearing interrupts on one CPU has no effect on the others, so the method is limited.

So what then? Clearly we need a global marker for the state of the critical section, manipulated atomically. Here we turn to the hardware atomic instruction

// atomic: the hardware guarantees this runs to completion without a switch
boolean TestAndSet(boolean *x){
    boolean rv = *x;	// remember the old value
    *x = true;		// lock it either way
    return rv;		// true: it was already locked; false: it was free
}

while (TestAndSet(&lock));	// spin until we saw the lock free
/* critical section */
lock = false;			// unlock
/* remainder section */

In other words: TestAndSet atomically reads the old lock value and sets the lock. If the old value was false, the lock was free, the while falls through, and we enter the critical section already holding the lock; if it was true, someone else holds it and we keep spinning. Because the instruction is atomic, this approach also works with multiple CPUs.

3. Code implementation of critical section

We have finished distilling semaphores and critical-section protection; next, let's see how to implement semaphores with protected critical sections in code. Without further ado, to the point

// sem.c -- in the kernel
struct {			// the semaphore structure
    char name[20];
    int value;
    task_struct *queue;		// processes sleeping on this semaphore
} semtable[20];

// producer.c -- a user-mode program
main(){
    // get the index of "empty" in the kernel's semaphore table
    sd = sem_open("empty");
    for (int i = 0; i <= 5; i++) {
        sem_wait(sd);		// may block and switch processes
        write(fd, &i, 4);	// write 4 bytes of data to file fd
    }
}

// in the kernel: find (or create) the named semaphore, return its index
sys_sem_open(char *name);

sys_sem_wait(int sd){
    cli();			// disable interrupts: protect the critical section
    semtable[sd].value--;
    if (semtable[sd].value < 0) {
        // set the current process to blocked and put it on semtable[sd].queue
        schedule();
    }
    sti();			// re-enable interrupts
}

Complex ideas eventually condense into these few lines of code, but even so, we should not forget the ninety percent of the iceberg beneath.

This idea is applied all over Linux 0.11, so let's learn from a real instance there

// read a disk block
bread(int dev, int block){
    struct buffer_head *bh;
    // start the read, sleep, and let the disk interrupt wake us: synchronization
    ll_rw_block(READ, bh);
    wait_on_buffer(bh);
}

// lock the buffer (the critical section)
lock_buffer(struct buffer_head *bh){
    cli();
    // while, not if: after waking we must re-check the lock (see below)
    while (bh->b_lock)
        sleep_on(&bh->b_wait);
    bh->b_lock = 1;	// lock only once we have seen it unlocked
    sti();
}

void sleep_on(struct task_struct **p){
    // join the queue -- the world's best-hidden queue?
    struct task_struct *tmp;
    tmp = *p;			// remember the previous head of the queue
    *p = current;		// the head now points to us
    current->state = TASK_UNINTERRUPTIBLE;
    schedule();
    if (tmp)
        tmp->state = 0;	// on waking, also wake the one that slept before us
}

Now let's take on the world's best-hidden queue

Every process that calls sleep_on has its own tmp, a local variable living on that process's kernel stack, which points to the process that went to sleep before it; *p always points to the most recent sleeper. So the queue is the chain of tmp variables scattered across the sleepers' kernel stacks: *p is the head, and each sleeper's tmp links back to the previous one.

And why while rather than if? Because waking only marks processes runnable; every woken process re-checks the condition, whoever the scheduler runs first grabs the lock, and the others see b_lock == 1 again and go back to sleep.

Here p is a second-level pointer, a pointer to a pointer: it holds the address of the queue-head variable. By the same understanding there are third-level pointers, n-level pointers, and so on —

So by the time the process switches away inside sleep_on, it has already added itself to the blocking queue.

Unlocking the critical section is driven by the disk interrupt: the interrupt handler calls wake_up, which sets the state of the process at the head of the queue back to runnable, and then the handler finishes with an iret return.


4. Deadlock handling

Using the producer-consumer model as an example: in both the producer and the consumer, the first step checks whether the buffer is full or empty (P(empty) / P(full)), and only then is the mutex taken. What happens if we swap that check with the code after it? The producer would take the mutex first, find the buffer full, and go to sleep still holding the mutex; the consumer would then block on the mutex and never consume.

This can lead to deadlocks, which, in layman’s terms, means that multiple processes are waiting for each other to hold resources, so that no one can execute

In essence, a deadlock is a cycle of waiting, each process holding some resources while requesting resources held by the next.

Let's look at the necessary conditions for deadlock: mutual exclusion, hold-and-wait, no preemption, and circular wait.

So how do you handle deadlocks?

There are four general approaches: prevention, avoidance, detection plus recovery, and ignoring. What can we do concretely for each in process synchronization?

  1. Deadlock prevention: request all needed resources in one go, so the circular wait over contended resources can never form; but the efficiency is so low that it has little practical value. Another intuitive idea is to request resources strictly as needed, never holding some while waiting for others; it avoids deadlock but wastes resources badly, so we abandon that pit too.

  2. Deadlock avoidance: The idea of deadlock avoidance is “perfectly” implemented by the banker algorithm

    If all processes in the system have a sequence of execution that can be completed, the system is said to be in a safe state, and the banker algorithm is used to find this safe sequence.

    For a process, the relationship between it and resources is divided into three types: currently occupied resources, still needed resources and available resources.

    Let’s look at the implementation

    int Available[1..m];	// remaining amount of each resource type
    int Allocation[1..n,1..m];	// resources currently allocated to each process
    int Need[1..n,1..m];	// resources each process still needs
    int Work[1..m];		// work vector
    bool Finish[1..n];		// whether each process can run to completion

    Work = Available;
    Finish[1..n] = false;
    while (true) {
        found = false;
        for (i = 1; i <= n; i++) {
            if (Finish[i] == false && Need[i] <= Work) {
                Work = Work + Allocation[i];	// let i finish, reclaim its resources
                Finish[i] = true;
                found = true;
            }
        }
        if (!found)
            break;	// nobody else can make progress with what is left
    }
    for (i = 1; i <= n; i++)
        if (Finish[i] == false)
            return "Deadlock";
    // T(n) = O(m * n^2)

    The banker's algorithm is easy enough to understand, but its time complexity is high, and it is high because it checks every detail, while deadlock itself is a low-probability event; checking everything every time is a waste.

    So can we tentatively grant a request, run the banker's algorithm on the pretend state, and see whether it still works? That is the resource-request procedure: on each request, assume the allocation, check whether any safe sequence of processes still exists, and commit only if one does.

  3. Deadlock detection + recovery: the core idea is to find the problem first and deal with it then; when a deadlock occurs, roll a process back. The implementation mechanism is rather complicated

  4. Deadlock ignore: The ultimate secret – restart;


Life continues until death.