Different operating systems have different overall goals when scheduling processes, so their scheduling algorithms differ as well.

The choice depends on factors such as the type of process (CPU-intensive? I/O-intensive?) and its priority.

On Linux x86 platforms, CFS (the Completely Fair Scheduler) is generally used.

It is called completely fair because the operating system dynamically calculates each thread's CPU usage and tries to give every process an equal share of the CPU.

When we create a thread, the default scheduling policy is SCHED_OTHER, with a default priority of 0.

PS: In Linux, the kernel object for a thread is very similar to that of a process (in fact, they share most structure members), which is why a thread is called a lightweight process.

In this article, "thread" can be read as roughly interchangeable with "process", or in some places "task"; different contexts use different idioms.

To put it simply: if there are N processes in the system, each process gets 1/N of the chances to execute. After each process runs for a certain period of time, it is switched out and the next process runs.

If N is so large that each process's time is up almost as soon as it starts executing, and tasks keep being scheduled at this rate, the system's resources are all spent on process context switching.

For this reason, operating systems introduce a minimum granularity: each process is guaranteed a minimum execution time, called a timeslice.

In addition to the SCHED_OTHER scheduling algorithm, Linux supports two real-time scheduling policies:

SCHED_FIFO: schedules processes by priority; once a process gets the CPU, it runs until it exits or is preempted by a higher-priority process;

SCHED_RR: adds the concept of time slices to SCHED_FIFO. When a process has occupied the CPU for a certain amount of time, the scheduler places it at the end of the queue for its priority and selects another process with the same priority to execute.

This article tests a mix of SCHED_FIFO and regular SCHED_OTHER scheduling policies.





On Linux, priority management can be confusing. Take a look at the following figure:

This figure shows that kernel priorities are divided into two segments.

The first segment, values 0-99, is for real-time tasks; the second segment, values 100-139, is for normal tasks.

The lower the value, the higher the priority of the task.

Again, these are priorities from the kernel's perspective.

Okay, here’s the point:

When we create a thread in the application layer, we set a priority number; this is the priority from the application layer's perspective.

However, the kernel does not use this application-layer value directly. Instead, it performs a calculation to obtain the priority value used inside the kernel (0-139).

1. For real-time tasks

When creating a thread, we can set the priority value (0-99) as follows:

struct sched_param param;
param.__sched_priority = xxx;

When the kernel creates the underlying thread, it uses the following formula to calculate the real priority value:

kernel priority = 100 - 1 - param.__sched_priority

If the application layer passes in the value 0, then the priority value in the kernel is 99, which is the lowest priority of all real-time tasks.

If the application layer passes in the value 99, then the priority value in the kernel is 0, which is the highest priority of all real-time tasks.

Therefore, from the application layer's perspective, the larger the priority value passed in, the higher the thread's priority; the smaller the value, the lower the priority.

Exactly the opposite of the kernel's perspective!
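This real-time mapping can be captured in a tiny helper (the function name rt_kernel_priority is mine, for illustration):

```c
// Convert an application-layer real-time priority (0..99) to the
// kernel's internal priority value, per the formula in the text:
//   kernel priority = 100 - 1 - app priority
static int rt_kernel_priority(int app_priority)
{
    return 100 - 1 - app_priority;
}

// rt_kernel_priority(0)  == 99 : lowest real-time priority in the kernel.
// rt_kernel_priority(99) == 0  : highest real-time priority in the kernel.
```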

2. For ordinary tasks

The priority of ordinary tasks is adjusted through nice values. The kernel likewise has a formula to convert the nice value passed in by the application layer into a kernel priority value:

kernel priority = 100 + 20 + nice

The legal range for nice values is -20 to 19.

If the application layer sets the thread nice value to -20, then the priority value in the kernel is 100, which is the highest priority of all common tasks.

If the application layer sets the thread nice value to 19, then the priority value in the kernel is 139, which is the lowest priority of all common tasks.

Therefore, from the application layer's perspective, the smaller the nice value passed in, the higher the thread's priority; the larger the value, the lower the priority.

Exactly the same as the kernel's perspective!
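This mapping can also be written as a small helper (the function name nice_kernel_priority is mine, for illustration):

```c
// Convert a nice value (-20..19) to the kernel's internal priority,
// per the formula in the text:
//   kernel priority = 100 + 20 + nice
static int nice_kernel_priority(int nice_value)
{
    return 100 + 20 + nice_value;
}

// nice_kernel_priority(-20) == 100 : highest priority of all normal tasks.
// nice_kernel_priority(19)  == 139 : lowest priority of all normal tasks.
```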

Now that the background is clear, we can finally get to testing the code!





Note:

  1. #define _GNU_SOURCE must be defined before #include <sched.h>;

  2. #include <sched.h> must be included before #include <pthread.h>;

// filename: test.c
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <pthread.h>

// Prints the current thread information: what is the scheduling policy? What is the priority?
void get_thread_info(const int thread_index)
{
    int policy;
    struct sched_param param;

    printf("\n====> thread_index = %d \n", thread_index);

    pthread_getschedparam(pthread_self(), &policy, &param);
    if (SCHED_OTHER == policy)
        printf("thread_index %d: SCHED_OTHER \n", thread_index);
    else if (SCHED_FIFO == policy)
        printf("thread_index %d: SCHED_FIFO \n", thread_index);
    else if (SCHED_RR == policy)
        printf("thread_index %d: SCHED_RR \n", thread_index);

    printf("thread_index %d: priority = %d \n", thread_index, param.sched_priority);
}

// Thread entry function. Four threads, with index numbers 1 to 4, used in the printed messages.
void *thread_routine(void *args)
{
    int thread_index = *(int *)args;

    // To ensure that all threads are created, let each thread sleep for 1 second.
    sleep(1);

    // Print thread information: scheduling policy, priority.
    get_thread_info(thread_index);

    long num = 0;
    for (int i = 0; i < 10; i++)
    {
        for (int j = 0; j < 5000000; j++)
        {
            // Do nothing; purely simulate CPU-intensive computation.
            float f1 = ((i + 1) * 345.45) * 12.3 * 45.6 / 78.9 / ((j + 1) * 4567.89);
            float f2 = (i + 1) * 12.3 * 45.6 / 78.9 * (j + 1);
            float f3 = f1 / f2;
        }
        printf("thread_index %d: num = %ld \n", thread_index, num++);
    }

    printf("thread_index %d: exit \n", thread_index);
    return 0;
}

void main(void)
{
    // Create four threads: 1 and 2 - real-time threads, 3 and 4 - normal threads (non-real-time).
    int thread_num = 4;
    int index[4] = {1, 2, 3, 4};
    pthread_t ppid[4];
    pthread_attr_t attr[2];
    struct sched_param param[2];

    // Real-time threads can only be created by root.
    if (0 != getuid())
    {
        printf("Please run as root \n");
        exit(0);
    }

    // Create the 4 threads.
    for (int i = 0; i < thread_num; i++)
    {
        if (i <= 1) // The first two threads: SCHED_FIFO, priorities 51 and 52.
        {
            pthread_attr_init(&attr[i]);
            // Set the scheduling policy to SCHED_FIFO.
            pthread_attr_setschedpolicy(&attr[i], SCHED_FIFO);
            // Set the priorities to 51 and 52.
            param[i].__sched_priority = 51 + i;
            pthread_attr_setschedparam(&attr[i], &param[i]);
            // Set the thread attributes: do not inherit the scheduling policy and priority of the main thread.
            pthread_attr_setinheritsched(&attr[i], PTHREAD_EXPLICIT_SCHED);
            pthread_create(&ppid[i], &attr[i], thread_routine, (void *)&index[i]);
        }
        else // The last two threads: default policy (SCHED_OTHER), priority 0.
        {
            pthread_create(&ppid[i], 0, thread_routine, (void *)&index[i]);
        }
    }

    for (int i = 0; i < 4; i++)
        pthread_join(ppid[i], 0);

    for (int i = 0; i < 2; i++)
        pthread_attr_destroy(&attr[i]);
}

The command to compile it into an executable program:

gcc -o test test.c -lpthread





Let's state the expected results first; without an expectation, there is nothing to compare the actual output against.

There are four threads:

  1. Thread indexes 1 and 2: real-time threads (SCHED_FIFO with priorities 51 and 52);

  2. Thread indexes 3 and 4: normal threads (SCHED_OTHER with priority 0);

My test environment: Ubuntu 16.04, a virtual machine running on Windows 10.

My desired outcome is:

  1. First, print the information of thread 1 and thread 2, because they are real-time tasks and need to be scheduled first.

  2. The priority of thread 1 is 51, lower than thread 2's priority of 52, so thread 1 should execute only after thread 2 finishes.

  3. Threads 3 and 4 are normal processes that need to wait until threads 1 and 2 have finished executing, and threads 3 and 4 should execute alternately because their scheduling policies and priorities are the same.

Full of hope, I ran the test on my work computer, and it printed the following:

====> thread_index = 4 
thread_index 4: SCHED_OTHER 
thread_index 4: priority = 0 

====> thread_index = 1 
thread_index 1: SCHED_FIFO 
thread_index 1: priority = 51 

====> thread_index = 2 
thread_index 2: SCHED_FIFO 
thread_index 2: priority = 52 
thread_index 2: num = 0 
thread_index 4: num = 0 

====> thread_index = 3 
thread_index 3: SCHED_OTHER 
thread_index 3: priority = 0 
thread_index 1: num = 0 
thread_index 2: num = 1 
thread_index 4: num = 1 
thread_index 3: num = 0 
thread_index 1: num = 1 
thread_index 2: num = 2 
thread_index 4: num = 2 
thread_index 3: num = 1 
...

The question is obvious: why are four threads being executed at the same time?

Thread 1 and thread 2 should be executed first because they are real-time tasks!

How did it end up like this? It's a complete mess! Not at all what I expected!

Unable to think of a reason, I could only turn to the internet, but found nothing of value there.

One piece of information did concern Linux's scheduling policy, and is worth noting here.

In Linux, to prevent real-time tasks from completely occupying CPU resources, a small time gap is reserved in which ordinary tasks can execute.

In the /proc/sys/kernel directory, there are two files used to limit the CPU usage of real-time tasks:

sched_rt_runtime_us: default 950000
sched_rt_period_us: default 1000000

Meaning: within each period of 1,000,000 microseconds (1 second), real-time tasks may occupy at most 950,000 microseconds (0.95 seconds), leaving 0.05 seconds for normal tasks.

Without this limitation, if a SCHED_FIFO task with a particularly high priority happened to have a bug that hogged the CPU, we would have no chance to kill it, because the system could not schedule any other process to execute.

With this limitation, we can use the 0.05 second execution time to kill the buggy real-time task.

Back to the point: that source says that if real-time tasks are not scheduled first, you can remove this time limit as follows:

sysctl -w kernel.sched_rt_runtime_us=-1

I did it and it still didn’t work!





Was it the machine environment? So I moved the test code to an Ubuntu 14.04 virtual machine on another laptop.

When compiling, a small problem came up, with this error:

error: 'for' loop initial declarations are only allowed in C99 mode

Simply add the C99 standard to the compile command:

gcc -o test test.c -lpthread -std=c99

Execute the program and print the following information:

====> thread_index = 2 

====> thread_index = 1 
thread_index 1: SCHED_FIFO 
thread_index 1: priority = 51 
thread_index 2: SCHED_FIFO 
thread_index 2: priority = 52 
thread_index 1: num = 0 
thread_index 2: num = 0 
thread_index 2: num = 1 
thread_index 1: num = 1 
thread_index 2: num = 2 
thread_index 1: num = 2 
thread_index 2: num = 3 
thread_index 1: num = 3 
thread_index 2: num = 4 
thread_index 1: num = 4 
thread_index 2: num = 5 
thread_index 1: num = 5 
thread_index 2: num = 6 
thread_index 1: num = 6 
thread_index 2: num = 7 
thread_index 1: num = 7 
thread_index 2: num = 8 
thread_index 1: num = 8 
thread_index 2: num = 9 
thread_index 2: exit 

====> thread_index = 4 
thread_index 4: SCHED_OTHER 
thread_index 4: priority = 0 
thread_index 1: num = 9 
thread_index 1: exit 

====> thread_index = 3 
thread_index 3: SCHED_OTHER 
thread_index 3: priority = 0 
thread_index 3: num = 0 
thread_index 4: num = 0 
thread_index 3: num = 1 
thread_index 4: num = 1 
thread_index 3: num = 2 
thread_index 4: num = 2 
thread_index 3: num = 3 
thread_index 4: num = 3 
thread_index 3: num = 4 
thread_index 4: num = 4 
thread_index 3: num = 5 
thread_index 4: num = 5 
thread_index 3: num = 6 
thread_index 4: num = 6 
thread_index 3: num = 7 
thread_index 4: num = 7 
thread_index 3: num = 8 
thread_index 4: num = 8 
thread_index 3: num = 9 
thread_index 3: exit 
thread_index 4: num = 9 
thread_index 4: exit

Threads 1 and 2 execute simultaneously, and then threads 3 and 4 execute simultaneously.

But this is still not as expected: thread 2 has a higher priority than thread 1 and should finish executing first!

Not knowing how to investigate this problem and out of ideas, I had to consult a Linux kernel expert, who suggested checking the kernel version.

At this point, I remembered that, for some reason, the kernel on the Ubuntu 16.04 machine had been downgraded.

I checked in this direction and finally confirmed that the kernel version was not the cause of the problem.





I had to go back and look at the difference between the two printed messages:

  1. Ubuntu 16.04 on the work machine: all four threads were scheduled simultaneously; neither the scheduling policy nor the priority took effect;

  2. Ubuntu 14.04: real-time tasks 1 and 2 executed first, indicating that the scheduling policy worked, but the priority did not.

Suddenly, CPU affinity popped out of my head!

Following that thought, the likely culprit emerged: multiple cores.

So I bound all four threads to CPU0, that is, I set their CPU affinity.

At the start of the thread entry function thread_routine, add the following:

cpu_set_t mask;
int cpus = sysconf(_SC_NPROCESSORS_CONF);
CPU_ZERO(&mask);
CPU_SET(0, &mask);
if (pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask) < 0)
{
    printf("set thread affinity failed! \n");
}

Then I verified again in the Ubuntu 16.04 virtual machine, and the printed information was perfect, exactly as expected:

====> thread_index = 1 

====> thread_index = 2 
thread_index 2: SCHED_FIFO 
thread_index 2: priority = 52 
thread_index 2: num = 0 
... 
thread_index 2: num = 9 
thread_index 2: exit 
thread_index 1: SCHED_FIFO 
thread_index 1: priority = 51 
thread_index 1: num = 0 
... 
thread_index 1: num = 9 
thread_index 1: exit 

====> thread_index = 3 
thread_index 3: SCHED_OTHER 
thread_index 3: priority = 0 

====> thread_index = 4 
thread_index 4: SCHED_OTHER 
thread_index 4: priority = 0 
thread_index 3: num = 0 
thread_index 4: num = 0 
... 
thread_index 4: num = 8 
thread_index 3: num = 8 
thread_index 4: num = 9 
thread_index 4: exit 
thread_index 3: num = 9 
thread_index 3: exit

At this point, the cause of the problem is clear: the multicore processor!

In addition, the number of CPU cores allocated to the two test VMs at installation time differed, which is why their output differed.





Finally, let's confirm the CPU information in the two virtual machines:

cpuinfo in Ubuntu 16.04:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 158
model name      : Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
stepping        : 10
cpu MHz         : 2807.996
cache size      : 9216 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
... other information

processor       : 1
(same as processor 0, except core id : 1)
... other information

processor       : 2
(same as processor 0, except core id : 2)
... other information

processor       : 3
(same as processor 0, except core id : 3)
... other information

In this virtual machine there are exactly four cores, and my test code creates exactly four threads, so each core is assigned one thread, each busily executing its own; they all run simultaneously.

Therefore, the printed information shows that four threads are executing in parallel.

At this point, neither the scheduling policy nor the priority had any visible effect! (To be precise: scheduling policy and priority still matter among the threads on the same CPU.)

If I had started with 10 threads in my test code, I would probably have found the problem faster!

The cpuinfo in Ubuntu 14.04:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 142
model name      : Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
stepping        : 9
microcode       : 0x9a
cpu MHz         : 2304.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
... other information

processor       : 1
(same as processor 0, except core id : 1)
... other information

In this virtual machine there are two cores, so the two real-time tasks 1 and 2 execute first (their relative priority is meaningless here, because the two cores run them simultaneously), and then threads 3 and 4 execute.





Looking back on this round of testing, I really wanted to knock myself on the head with the keyboard: why didn't I consider the multicore factor earlier?!

Deep reasons:

  1. Many of my previous projects ran on single-core platforms such as ARM, MIPS and STM32, so multicore interference was not on my radar.

  2. I have worked on some x86 projects, but they did not involve real-time tasks; the default scheduling policy was generally used. This reflects an important goal of Linux scheduling on x86 as a general-purpose platform: letting every task use CPU resources fairly.

As x86 platforms are gradually adopted in the field of industrial control, real-time requirements are becoming more prominent and more important.

That's why there is InTime on Windows, and real-time patches such as PREEMPT_RT and Xenomai on Linux.





Recommended reading

[1] C language pointers: from the underlying principles to usage tricks, explained thoroughly with diagrams and code
[2] Step-by-step analysis: how to achieve object-oriented programming in C
[3] The underlying debugging principles of GDB turn out to be this simple
[4] Is inline assembly terrifying? Finish this article and put an end to that!
[5] It is said that software architecture should be layered and modular; what exactly should be done