This is the 16th day of my participation in the Gwen Challenge.

Single-threaded vs. multi-threaded

Suppose you have these four tasks

  • Task 1 is to calculate A=1+2;
  • Task 2 is to calculate B=20/5;
  • Task 3 is to calculate C=7*8;
  • Task 4 is to display the result of the final calculation.

With a single thread, the four tasks are executed one after another, taking four steps.

With multithreading, only two steps are needed. The first step is to use three threads to execute the first three tasks at the same time; the second step is to perform the fourth task and display the calculated results.

By comparison, you can see that single-threaded execution takes four steps and multi-threaded execution only takes two steps, so using parallel processing can greatly improve performance.
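The two steps above can be sketched with Python's `threading` module (a minimal illustration; on CPython the GIL limits true CPU parallelism, but the two-step structure is the same):

```python
import threading

results = {}

def compute(name, fn):
    # Each worker thread writes its result into the shared dict.
    results[name] = fn()

# Step 1: three threads execute the first three tasks at the same time.
threads = [
    threading.Thread(target=compute, args=("A", lambda: 1 + 2)),
    threading.Thread(target=compute, args=("B", lambda: 20 / 5)),
    threading.Thread(target=compute, args=("C", lambda: 7 * 8)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Step 2: display the final calculation results.
print("A =", results["A"], "B =", results["B"], "C =", results["C"])
```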

Threads vs. processes

Multithreading can process tasks in parallel, but threads cannot exist alone. They are started and managed by processes. So what is a process?

A process is a running instance of a program. When a program is started, the operating system allocates a block of memory for it, which holds the code, the running data, and a main thread that performs tasks. We call this running environment a process.

Now add the concept of a process and reconsider single-threaded versus multi-threaded processing:

As can be seen from the figure, threads are attached to processes, and using multi-thread parallel processing in processes can improve computing efficiency.

The relationship between processes and threads has four characteristics.

1. If any thread fails, the entire process crashes.

Suppose we change task B from the example above to B = 20/0:

```
A = 1 + 2
B = 20 / 0
C = 7 * 8
```

When B is computed, the denominator is 0, so the thread fails to execute; this crashes the whole process, and of course the other two threads fail along with it.

2. Data in the process is shared between threads.

Threads can read and write to the process’s common data.

Thread 1, thread 2, and thread 3 write their execution results to A, B, and C, respectively. The display thread then reads A, B, and C to show the final result.
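Within one process, threads need no special mechanism to share data: any object in process memory is visible to all of them. A tiny Python sketch:

```python
import threading

data = []  # ordinary process memory, visible to every thread

def writer():
    data.append("written by the worker thread")

t = threading.Thread(target=writer)
t.start()
t.join()

# The main thread sees the worker's write: threads share process memory.
print(data[0])
```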

3. After a process is shut down, the operating system reclaims the memory occupied by the process.

When a process exits, the operating system reclaims all the resources the process had requested. Even if a thread leaked memory through improper operation, that memory is reclaimed correctly once the process exits.

Memory leak: a memory leak occurs when a program fails, for some reason, to release dynamically allocated heap memory, wasting system memory, slowing the program down, or even crashing the system.

4. The contents of processes are isolated from each other.

Process isolation is an operating-system technique that protects processes from interfering with each other. Each process can access only the data it owns, which prevents, say, process A from writing data into process B. Because data is strictly isolated between processes, a crash or hang of one process does not affect the others. When processes do need to exchange data, a mechanism for interprocess communication (IPC) is required.

Interprocess communication

1. Pipes

Anonymous pipes

```shell
first_command | second_command
```

The result of the first command is the input to the second command.

An anonymous pipe carries a plain, unformatted stream of bytes with a limited buffer size, and it is a one-way channel: data flows in only one direction, so two-way communication requires creating two pipes. An anonymous pipe can also only be used between related processes, such as a parent and its child, and its lifetime follows the process: it is created with the process and disappears when the process terminates.
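In code, an anonymous pipe is just a pair of file descriptors with a one-way flow. A minimal Python sketch (shown inside one process for brevity; normally the read and write ends would be split between a parent and its child):

```python
import os

r, w = os.pipe()         # read end, write end: data flows w -> r only

os.write(w, b"hello")    # the "first command" writes
os.close(w)              # closing the write end lets the reader see EOF

data = os.read(r, 1024)  # the "second command" reads
os.close(r)
print(data)
```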

Named pipes

Create a pipe and write to it in one terminal; you can see that the writing process blocks:

```shell
mkfifo pipe
echo "Hello" > pipe
```

In another terminal, read from the pipe and the writer unblocks:

```shell
cat < pipe
```

Conclusion:

Whether the pipe is anonymous or named, data written by one process is buffered in the kernel, and the reading process naturally fetches it from the kernel. Communication data follows the first-in, first-out principle, and file-positioning operations such as lseek are not supported.

2. Message queues

Message queues overcome the drawbacks of signals carrying little information and pipes carrying only a plain byte stream with a limited buffer size. A message queue is actually a "list of messages" stored in the kernel. A message body can be a user-defined data type: data is sent as independent message bodies, and the receiver must use the same data type as the sender to ensure the data reads back correctly. Message-queue communication is not the most timely, since every write and read involves copying data between user space and kernel space.

3. Shared memory

Shared memory removes the overhead of copying data between user mode and kernel mode that message-queue communication incurs: it allocates a shared region that every participating process can access directly, as conveniently as its own address space, with no kernel transition or system call on each access. This greatly improves communication speed and makes shared memory the fastest interprocess communication mechanism. But this convenience and efficiency bring a new problem: multiple processes competing for the same shared resource can leave the data in disarray.
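Python 3.8's `multiprocessing.shared_memory` exposes this mechanism; a minimal sketch in which the "second process" is simulated by attaching to the block's name in the same script:

```python
from multiprocessing import shared_memory

# "Process A": create a named shared block and write into it directly.
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# "Process B": attach to the same block by name and read, with no copying
# through the kernel on access.
other = shared_memory.SharedMemory(name=shm.name)
data = bytes(other.buf[:5])
print(data)

other.close()
shm.close()
shm.unlink()  # release the block once all processes are done with it
```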

4. Semaphores

To keep a shared resource safe, then, we need semaphores, which ensure that only one process can access the resource at any one time: mutually exclusive access. Semaphores can implement not only mutual exclusion but also synchronization between processes. A semaphore is essentially a counter representing the number of available resources, and its value is controlled through two atomic operations, the P operation and the V operation.

A semaphore is a counter that can be used to control access to a shared resource by multiple processes. It is often used as a locking mechanism to prevent other processes from accessing a shared resource while one process is accessing it. Therefore, it is mainly used as a means of synchronization between processes and between different threads within the same process.
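A sketch of P/V protecting a shared counter; threads inside one process are used here for brevity, but `multiprocessing.Semaphore` offers the same P (`acquire`) and V (`release`) operations across processes:

```python
import threading

sem = threading.Semaphore(1)  # counter starts at 1: one resource, so mutual exclusion
counter = 0

def add():
    global counter
    for _ in range(50_000):
        sem.acquire()      # P operation: take the resource or wait
        counter += 1       # critical section: only one thread at a time
        sem.release()      # V operation: give the resource back

threads = [threading.Thread(target=add) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000
```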

5. Signals

Similar in name to the semaphore, but entirely different in function, is the signal, which notifies a receiving process that some event has occurred. Signals are the asynchronous mechanism among the interprocess communication mechanisms: they let application processes interact directly with the kernel, and the kernel can also use a signal to tell a user-space process what system event has happened. Signal events come mainly from hardware sources (such as pressing Ctrl+C on the keyboard) and software sources (such as the kill command). Once a signal arrives, a process can respond in three ways: 1. perform the default action; 2. catch the signal and run a handler; 3. ignore the signal. Two signals, SIGKILL and SIGSTOP, can neither be caught nor ignored by the application process, so that we can terminate or stop a process at any time.
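A minimal sketch of response 2, catching a signal, assuming a Unix-like system; the process sends itself SIGUSR1 through the software route (the same mechanism as the `kill` command):

```python
import os
import signal

caught = []

def handler(signum, frame):
    # Response 2: catch the signal and run our own code instead of the default.
    caught.append(signum)

signal.signal(signal.SIGUSR1, handler)   # register the handler
os.kill(os.getpid(), signal.SIGUSR1)     # software source: deliver the signal
print(caught)
```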

6. Sockets

The mechanisms above all work within a single host. To communicate between processes on different hosts, you need socket communication. Sockets are used not only between processes on different hosts but also between processes on the same local host. Depending on the socket type, there are three common communication modes: TCP, UDP, and local process communication.
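A minimal TCP echo sketch over the loopback interface; between real hosts only the address would change, the socket calls stay the same:

```python
import socket
import threading

def server(listener):
    conn, _ = listener.accept()
    data = conn.recv(1024)
    conn.sendall(data.upper())  # echo the message back, upper-cased
    conn.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
srv.listen(1)

t = threading.Thread(target=server, args=(srv,))
t.start()

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"hello")
reply = cli.recv(1024)
cli.close()
t.join()
srv.close()
print(reply)  # b'HELLO'
```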

These are the main mechanisms for interprocess communication.
