Interprocess communication (IPC) under Linux is largely inherited from Unix. The two major contributors to Unix, AT&T’s Bell Labs and BSD (the Berkeley Software Distribution, developed at the University of California, Berkeley), took different approaches to IPC. The former systematically improved and extended the early Unix IPC mechanisms, forming “System V IPC”, in which communication is confined to a single machine. The latter went beyond this limitation and built an IPC mechanism based on sockets. Linux inherits both, as shown here:

Among them, the original Unix IPC mechanisms are pipes, FIFOs, and signals; System V IPC comprises System V message queues, System V semaphores, and System V shared memory; POSIX IPC comprises POSIX message queues, POSIX semaphores, and POSIX shared memory. Two brief notes: 1) Because of the diversity of Unix versions, the Institute of Electrical and Electronics Engineers (IEEE) developed an independent Unix standard. This new ANSI Unix standard is called the Portable Operating System Interface for Computer Environments (POSIX). Most existing Unix systems and popular variants follow the POSIX standard, and Linux has followed it from the beginning; 2) BSD did not ignore IPC within a single machine (sockets themselves can be used for it). In fact, many Unix versions carry traces of BSD in their single-machine IPC, such as the anonymous memory mapping supported by 4.4BSD and the reliable signal semantics implemented in 4.3+BSD.

Figure 1 shows the IPC methods supported by Linux. The rest of this article focuses on interprocess communication in the Linux environment and refers to other Unix versions as little as possible, to avoid conceptual confusion. Where Linux supports more than one implementation of a mechanism (shared memory, for example, comes in POSIX and System V versions), the POSIX API is the one mainly introduced.

Interprocess communication in Linux

  1. Pipe and named pipe (FIFO): A pipe can be used for communication between related processes. A named pipe overcomes the limitation that a pipe has no name, so in addition to everything a pipe does, it allows communication between unrelated processes.

  2. Signal: A relatively complex communication method used to notify a receiving process that an event has occurred. Besides sending signals to other processes, a process can also send signals to itself. In addition to the early Unix function signal, Linux supports sigaction, whose semantics conform to the POSIX.1 standard (in fact, sigaction derives from BSD, which, to provide reliable signals behind a unified external interface, reimplemented signal on top of sigaction).

  3. Message queue: A message queue is a linked list of messages; there are POSIX message queues and System V message queues. A process with sufficient permissions can add messages to the queue, and a process with read permission can read messages from it. Message queues overcome the drawbacks that signals carry little information and that pipes carry only an unformatted byte stream and have a limited buffer size.

  4. Shared memory: Lets multiple processes access the same region of memory and is the fastest form of IPC available. It was designed to address the low efficiency of the other communication mechanisms. It is often used together with other mechanisms, such as semaphores, to achieve synchronization and mutual exclusion between processes.

  5. Semaphore: Mainly used as a means of synchronization between processes and between different threads of the same process.

  6. Socket: A more general interprocess communication mechanism that can also be used between processes on different machines. Originally developed by the BSD branch of Unix, it has since been ported to other Unix-like systems: both Linux and System V variants support sockets.

In general, a process under Linux contains the following key elements:

  • It has an executable program to run;

  • It has its own dedicated system stack space;

  • The kernel keeps a control block for it (the process control block), which describes the resources the process holds and lets the kernel schedule it;

  • It has its own independent storage space.

Process creation

To create a new process, the kernel first allocates a task_struct in memory for it, copies the parent’s task_struct contents into it, and then modifies some of the fields. It also allocates a new kernel stack and a new PID, and adds the new task_struct to the task list. The so-called creation is really a “copy”.

When the child process starts, the kernel does not allocate separate physical memory for it. Instead, the child shares the parent’s memory read-only, and a page is copied only when it is written to. This is “copy-on-write”.

fork is implemented in the kernel by do_fork. The simplified flow of do_fork is shown as follows:

The fork function

The fork function is called once but returns twice: once in the parent and once in the child. In the child, the return value is 0; in the parent, it is the PID of the child. If fork fails, it returns -1 in the parent and no child is created. The program can therefore have the parent and child execute different code depending on the return value.

An image of the process:

Run a demo like this:

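#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;
    int n = 0;

    pid = fork();
    if (pid < 0) {
        /* fork failed: no child was created */
        perror("fork failed");
        exit(1);
    }

    /* Both processes run this loop, each on its own copy of n.
       The demo runs until interrupted (Ctrl+C). */
    while (1) {
        if (pid == 0) {
            n--;    /* child counts down */
            printf("child's n is:%d\n", n);
        } else {
            n++;    /* parent counts up */
            printf("parent's n is:%d\n", n);
        }
        sleep(1);
    }
}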

You can see that the child and the parent do not affect each other’s variables: each process changes only its own copy of n.

In general, the order of parent and child execution after fork is uncertain, depending on the kernel scheduling algorithm. Synchronization between processes requires process communication.

When to fork?

When a parent process wants its child to execute a different code segment at the same time as itself. This is common in network servers: the parent waits for a service request from a client, and when the request arrives, the parent calls fork and lets the child handle the request.

When a process wants to execute a different program. In this case the child usually calls exec immediately after fork returns.

The vfork function

vfork vs. fork:

The same:

The return values have the same meaning: 0 in the child and the child’s PID in the parent.

Different:

fork gives the child a copy of the parent’s data space, heap, and stack. vfork creates a child that shares the parent’s memory until it calls exec or exits.

vfork guarantees that the child runs first; the parent resumes only after the child calls exec or exits. Because the child runs in the parent’s address space, it should leave with _exit rather than exit.

Why vfork?

Since vfork is usually followed immediately by an exec call, the child never uses the parent’s data space and there is no need to spend time copying it, so vfork is “made for exec.”