Process

A process is an instance of a program that is running in the system.

When we open Activity Monitor or Task Manager, we can see each running process:

Multiple processes
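For instance (just a tiny sketch), a NodeJS program can print its own process ID, which is exactly what these tools list:

// show-pid.js (hypothetical file name)
// Print this process's ID, then keep the event loop busy so the process
// stays alive and visible in Activity Monitor / ps while we look for it.
console.log('pid:', process.pid);
setInterval(() => {}, 1000);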

Replicating processes

NodeJS provides the child_process module, and its child_process.fork() function is used to replicate (fork) processes.

An example 🌰

Create worker.js and master.js files in one directory:

worker.js

const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello NodeJS! \n');
}).listen(Math.round((1 + Math.random()) * 2000), '127.0.0.1'); // listen on a random port between 2000 and 4000

master.js

const { fork } = require('child_process');
const { cpus } = require('os');

// fork one worker per CPU core
cpus().forEach(() => {
  fork('./worker.js');
});

Start the master with node master.js, then check the number of processes with ps aux | grep worker.js. Under ideal conditions, the number of worker processes equals the number of CPU cores, with each process occupying one core, which makes full use of a multi-core CPU:

This is the classic master-worker mode.

In practice, forking a process is expensive. The purpose of replicating processes is to make full use of multi-core CPU resources, while NodeJS itself relies on an event-driven model to handle high concurrency on a single thread.

Creating child processes

The child_process module provides four methods for creating child processes:

  • child_process.spawn(command[, args][, options])
  • child_process.exec(command[, options][, callback])
  • child_process.execFile(file[, args][, options][, callback])
  • child_process.fork(modulePath[, args][, options])

A comparison of the four:

The last three methods are all extensions of spawn().
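A minimal sketch of the four creation methods side by side (assuming a Unix-like system for the ls examples; the commands and file names are only illustrative):

const { spawn, exec, execFile, fork } = require('child_process');

// spawn: no shell by default, output is exposed as streams
spawn('ls', ['-l']).stdout.pipe(process.stdout);

// exec: runs the command in a shell and buffers the output into a callback
exec('ls -l', (err, stdout) => console.log(stdout));

// execFile: like exec, but runs an executable directly without spawning a shell
execFile('node', ['--version'], (err, stdout) => console.log(stdout));

// fork: a special case of spawn for Node modules, with an IPC channel set up automatically
fork('./worker.js');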

Communication between processes

In NodeJS, the send() method on a child process object lets the main process send data to the child process, and the message event lets it listen for data sent back by the child process.

An example 🌰

Create parent.js and child.js files in one directory:

parent.js

const { fork } = require('child_process');
const sender = fork(__dirname + '/child.js');

sender.on('message', msg => {
  console.log('Master process receives message from child process:', msg);
});

sender.send('Hey! child');

child.js

process.on('message', msg => {
  console.log('Child process receives message from master process:', msg);
});

process.send('Hey! master');

When we execute node parent.js, we should see output like the following:
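Roughly (the two lines may appear in either order, since both processes run concurrently):

Child process receives message from master process: Hey! child
Master process receives message from child process: Hey! master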

With that, we have implemented the most basic form of inter-process communication.

IPC

IPC stands for inter-process communication, which enables different processes to access resources and coordinate their efforts.

In fact, before creating a child process, the parent process first creates the IPC channel and starts listening on it, and only then creates the child process. Through an environment variable (NODE_CHANNEL_FD) it tells the child process the file descriptor of the IPC channel, and when the child process starts it connects to the IPC channel using that file descriptor, thereby establishing the connection with the parent process.
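One small way to observe this (a sketch only): a process created with fork() ends up with a working process.send() and process.connected, while the same script started directly with node does not.

// ipc-check.js (hypothetical file name)
// When started via fork(), the IPC channel set up through NODE_CHANNEL_FD
// gives this process a process.send() function and process.connected === true.
console.log('has IPC channel:', typeof process.send === 'function', process.connected);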

Handle transfer

A handle is a reference that can be used to identify a resource; it contains a file descriptor that points to the resource object.

In general, when we want multiple processes to listen on a single port, we might consider having the main process act as a proxy:

However, with this proxy scheme every request consumes two file descriptors, one for receiving the request and one for forwarding it, and since the operating system only has a limited number of file descriptors, this limits the system's ability to scale.

So, why use a handle? In real application scenarios, complex data processing may be involved once IPC communication is established. A handle can be passed as the second, optional parameter of send(), which means the resource identifier itself is transmitted directly over IPC, avoiding the extra file descriptors that proxy forwarding would consume.

Here are the types of handles that can be sent:

  • net.Socket
  • net.Server
  • net.Native
  • dgram.Socket
  • dgram.Native
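A minimal sketch of passing a net.Server handle from a parent to two children, following the pattern shown in the Node.js documentation (the port 1337 and the file names are only illustrative):

server-parent.js

const { fork } = require('child_process');
const net = require('net');

const server = net.createServer();
server.listen(1337, () => {
  // pass the listening server as the second argument of send()
  fork('./server-child.js').send('server', server);
  fork('./server-child.js').send('server', server);
});

server-child.js

process.on('message', (msg, server) => {
  if (msg === 'server') {
    // the restored handle is a server listening on the same port as the parent's
    server.on('connection', socket => {
      socket.end(`handled by child ${process.pid}\n`);
    });
  }
});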

Handle send and restore

NodeJS processes only pass messages between each other and do not actually pass objects.

Before sending a message, the send() method wraps it into two parts, the handle and the message, and the message is serialized with JSON.stringify. In other words, when a handle is passed, it is not the whole object that travels across the IPC channel but a string, which the receiving end parses with JSON.parse and restores into an object.

Listening on the same port

As shown above, why can multiple processes listen on the same port?

The reason is that the main process uses the send() method to send the handle of one of its own server objects to multiple child processes, so for every child process the server object obtained after restoring the handle is the same one. When a network request arrives, the processes serve it preemptively, so listening on the same port causes no exception.
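To see this preemption in action with the sketch above, one could open a few connections to the shared port and log which child answered each one (again, only illustrative):

// probe.js (hypothetical): connect to the shared port a few times
const net = require('net');
for (let i = 0; i < 4; i++) {
  const socket = net.connect(1337, '127.0.0.1');
  socket.pipe(process.stdout); // each line shows which child handled the connection
}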

Cluster

To quote the official understanding of Cluster from egg.js:

  • Start multiple processes simultaneously on the server.
  • Each process runs the same source code (like dividing the work of a previous process among multiple processes).
  • Even more amazingly, these processes can listen on the same port at the same time.

Among them:

  • The process responsible for starting the other processes is called the Master process. It is like a "contractor": it does not do the actual work itself, it is only responsible for starting the other processes.
  • The processes it starts are called Worker processes. They are, literally, the "workers" that do the actual work, receiving requests and serving them externally.
  • The number of Worker processes is generally determined by the number of CPU cores on the server, so that multi-core resources can be fully utilized.

For example:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Workers can share any TCP connection
  // In this case it is an HTTP server
  http.createServer(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}

Simply put, the cluster module is a combination of the child_process module and the net module.

When a cluster starts, it starts a TCP server internally and sends the file descriptor of that TCP server socket to the worker processes.

In a cluster application, one master process can only manage one group of worker processes; this mode of operation is not as flexible as child_process, but it is more stable:

To make the cluster more stable and robust, the Cluster module also exposes a number of events:

  • fork
  • online
  • listening
  • disconnect
  • exit
  • setup

These events are encapsulated on the basis of inter-process messaging, ensuring the stability and robustness of the cluster.
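A minimal sketch of listening for some of these events in the master branch of the cluster example above (the log messages are only illustrative):

const cluster = require('cluster');

if (cluster.isMaster) {
  cluster.on('fork', worker => console.log(`forked worker ${worker.process.pid}`));
  cluster.on('online', worker => console.log(`worker ${worker.process.pid} is online`));
  cluster.on('listening', (worker, address) => console.log(`worker ${worker.process.pid} is listening on port ${address.port}`));
  cluster.on('disconnect', worker => console.log(`worker ${worker.process.pid} disconnected`));
  cluster.on('exit', (worker, code, signal) => console.log(`worker ${worker.process.pid} exited (${signal || code})`));
}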

Process daemon

Uncaught exception

Node.js provides the process.on('uncaughtException', handler) interface to catch uncaught exceptions. However, when a Worker process hits an uncaughtException it is already in an indeterminate state, so at that point we should let the process exit gracefully:

  • Close all TCP servers of the abnormal Worker process (quickly disconnecting existing connections and no longer accepting new ones), and disconnect the IPC channel with the Master, so that no new user requests are accepted.
  • The Master immediately forks a new Worker process, keeping the total number of "workers" online unchanged.
  • The abnormal Worker waits for a while, finishes handling the requests it has already accepted, and then exits.
+---------+                 +---------+
| Worker  |                 | Master  |
+---------+                 +----+----+
     | uncaughtException         |
     +------------+              |
     |            |              |                   +---------+
     | <----------+              |                   | Worker  |
     |                           |                   +----+----+
     |        disconnect         |   fork a new worker    |
     +-------------------------> + ---------------------> |
     |         wait...           |                        |
     |          exit             |                        |
     +-------------------------> |                        |
     |                           |                        |
    die                          |                        |
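A minimal sketch of this graceful-exit flow (the port, timeout and handler details are only illustrative; a production framework such as egg-cluster handles many more edge cases):

const cluster = require('cluster');
const http = require('http');

if (cluster.isMaster) {
  cluster.fork();
  // when a worker disconnects, immediately fork a replacement to keep the worker count stable
  cluster.on('disconnect', () => cluster.fork());
} else {
  const server = http.createServer((req, res) => res.end('ok'));
  server.listen(8000);

  process.on('uncaughtException', err => {
    console.error(err);
    // stop accepting new connections; requests already accepted are allowed to finish
    server.close();
    // disconnect the IPC channel so the Master can fork a new Worker
    cluster.worker.disconnect();
    // if draining takes too long, force the exit
    setTimeout(() => process.exit(1), 5000).unref();
  });
}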

OOM and system exceptions

When a process crashes because of an exception, or is killed by the system because of OOM, we no longer get the chance to let the process keep running that we have with an uncaught exception; all we can do is let the current process exit directly and have the Master immediately fork a new Worker.

References

  • Node.js official documentation
  • Node.js documentation (Chinese)
  • Egg.js official documentation