Multithreading versus single threading

C# and Java are both multithreaded languages. In their synchronous, multithreaded model, CPU time is divided among several threads, and each thread executes its work synchronously.

Node uses a single-threaded, asynchronous, non-blocking model: each computation has the CPU to itself, and when it makes an I/O request the CPU is freed for other work. When the I/O completes, an event triggers the next computation. This is ideal for a single process.
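As a rough illustration (a minimal sketch; the file path is arbitrary), an I/O request in Node is handed off to the system and the thread keeps running:

```js
const fs = require('fs');

// The read is handed off to the system; the thread is not blocked.
fs.readFile('/etc/hosts', 'utf8', (err, data) => {
  if (err) throw err;
  console.log('read finished:', data.length, 'characters');
});

// This line runs before the file read completes.
console.log('still executing while the I/O is in flight');
```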

But one CPU and one process are not enough to handle large amounts of I/O (such as reading from the network, or accessing a database or file system). No matter how powerful your server is, a single thread can only support a limited amount of processing.

In fact, while Node runs in single-threaded mode, it can also make use of multiple processes, as well as clusters.

The best way to scale a Node application is with multiple processes, which is why Node was designed to build distributed applications with multiple nodes.

The child_process module

Using Node’s child_process module, you can easily spawn child processes that communicate with each other through the event messaging system.

There’s a lot we can do in a child process. For example:

  • You can access and control the operating system by executing system commands

  • You can control the input stream of the child process and listen to the output stream of the child process.

  • You can control the parameters of commands passed to the underlying operating system

  • We can control the command’s output and do whatever we want with it, for example passing the output of one command as input to another (as we do in Linux), because both the input and output of a command can be exposed as Node streams.

Node’s child_process module provides four asynchronous functions (spawn(), exec(), execFile(), and fork()) and three synchronous ones (spawnSync(), execSync(), and execFileSync()) to create child processes.

We’ll look at how the four asynchronous functions are used and the differences between them.

spawn()

Format: `child_process.spawn(command[, args][, options])`

  • `command` <string> The command to run.
  • `args` <string[]> List of string arguments.
  • `options` <Object>
      • `shell` <boolean> | <string> If `true`, runs the command inside a shell.
      • `stdio` <Array> | <string> The child process’s standard input/output configuration.
      • `cwd` <string> Current working directory of the child process.
      • `env` <Object> Environment variable key-value pairs. Default: `process.env`.

The spawn function launches a command in a new process, and we can pass the command any arguments through it. For example, spawn a new process to execute the `pwd` command:

```js
const { spawn } = require('child_process')
const child = spawn('pwd')
```

We destructure the spawn function from child_process, pass it the system command `pwd`, and execute it.

The spawn function returns a ChildProcess instance, which inherits from EventEmitter, so we can add event callbacks to it. For example, we can register a callback that listens for the child process exiting.

```js
const { spawn } = require('child_process')
const child = spawn('pwd')

// code: the exit code if the child process exited on its own
// signal: the signal that terminated the child process
child.on('exit', (code, signal) => {
  console.log('Child process exited: ' + `code ${code} and signal ${signal}`)
})

// Result: Child process exited: code 0 and signal null
```

In the example above, the callback takes two arguments, code and signal. When the child process terminates normally, code is 0 and signal is null.

The ChildProcess class inherits from EventEmitter, so its instances emit the following events.

exit

This event is emitted after the child process ends. If the process exited, code is the final exit code; otherwise it is null. If the process terminated because it received a signal, signal is the string name of that signal; otherwise it is null. One of the two will always be non-null.

The child process’s standard input/output streams may still be open when the ‘exit’ event fires.

disconnect

The ‘disconnect’ event is emitted after calling subprocess.disconnect() in the parent process or process.disconnect() in the child process. Once disconnected, messages can no longer be sent or received, and the subprocess.connected property is false.
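For example, a minimal sketch (assuming a child.js module such as the one shown later in this article) where the parent closes the IPC channel:

```js
const { fork } = require('child_process');

const child = fork('child.js'); // assumes a child.js module exists

child.on('disconnect', () => {
  // After this point, child.connected is false and messages can no longer be sent.
  console.log('IPC channel closed, connected =', child.connected);
});

// Close the IPC channel from the parent side.
child.disconnect();
```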

error

This event is emitted when the process cannot be spawned, cannot be killed, or a message fails to be sent to it. The ‘exit’ event may or may not fire after an error occurs.

When listening for both the ‘exit’ and ‘error’ events, guard against handler functions being invoked multiple times by accident.
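One simple way to do that (a sketch, using a deliberately invalid command name) is a guard flag:

```js
const { spawn } = require('child_process');

// A command that should fail to spawn on most systems.
const child = spawn('some-nonexistent-command');

let handled = false;

child.on('error', err => {
  if (handled) return;
  handled = true;
  console.error('Failed to spawn:', err.message);
});

child.on('exit', (code, signal) => {
  if (handled) return;
  handled = true;
  console.log(`Exited: code ${code}, signal ${signal}`);
});
```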

close

This event is emitted after the process terminates and the child process’s standard input/output streams have been closed. It is distinct from the ‘exit’ event, because multiple processes may share the same standard input/output streams. The ‘close’ event always fires after ‘exit’, or after ‘error’ if the child failed to spawn.

  • `code` is the exit code if the child process exited on its own.
  • `signal` is the signal that terminated the child process.

```js
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

ls.on('close', (code, signal) => {
  console.log(`child process close all stdio with code ${code}`);
});

ls.on('exit', (code, signal) => {
  console.log(`child process exited with code ${code}`);
});
```

message

This is an important event. The ‘message’ event is emitted when a child process uses process.send() to send a message. This is how parent and child processes can communicate with each other. We’ll see an example below.

Each child process also has the standard stdio streams, which we can access via child.stdin, child.stdout, and child.stderr.

The close event is emitted when those stdio streams are closed. It is not the same as the exit event, because multiple child processes can share the same stdio streams, so one child exiting does not mean the streams were closed. In other words, when the exit event fires, the close event has not necessarily fired yet.
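One way to see the gap between the two events (a sketch that assumes a Unix-like shell) is to have the shell exit while a backgrounded command still holds the inherited stdout pipe open:

```js
const { spawn } = require('child_process');

// The shell exits immediately, but the backgrounded sleep inherits the
// stdout pipe and keeps it open, so 'close' fires ~3 seconds after 'exit'.
const child = spawn('sh', ['-c', 'sleep 3 & echo started']);

child.stdout.on('data', data => console.log(`stdout: ${data}`));
child.on('exit', code => console.log('exit fired, code', code));
child.on('close', code => console.log('close fired, code', code));
```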

Since all streams inherit from EventEmitter, we can listen for different events on the stdio streams attached to each child process. Unlike in a normal process, in a child process the stdout/stderr streams are readable while the stdin stream is writable, which is the inverse of what we find in the main process. Most importantly, on the readable streams we can listen for data events, which fire whenever the command produces output or an error is encountered while executing it:

```js
const { spawn } = require('child_process')
const child = spawn('pwd')

child.on("exit", (code, signal) => {
  console.log(
    "Child process exited: " + `code ${code} and signal ${signal}`
  )
})

child.stdout.on("data", data => {
  console.log(`child stdout:\n${data}`);
});

child.stderr.on("data", data => {
  console.error(`child stderr:\n${data}`);
});

// Result:
//
// child stdout:
// /Users/liujianwei/Documents/personal_code/node-demo
//
// Child process exited: code 0 and signal null
```

The example above shows that both standard output and standard error handle execution results by listening for data events. In the output you can see that code is 0 when the child process exits, indicating that no errors occurred.

You can pass arguments to the command with spawn’s second argument, which is an array. For example, to use the find command to find all files in the current directory, the -type f arguments are required:

```js
const child = spawn("find", [".", "-type", "f"]);
```

If an error occurs while the command executes, the child.stderr data event fires, and the exit event reports a code of 1, indicating that an error occurred. The actual error value depends on the host operating system and the kind of error.
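For example, a sketch where find is pointed at a directory that (presumably) does not exist:

```js
const { spawn } = require('child_process');

// /no/such/dir is assumed not to exist, so find reports an error.
const child = spawn('find', ['/no/such/dir', '-type', 'f']);

child.stderr.on('data', data => {
  console.error(`child stderr:\n${data}`);
});

child.on('exit', (code, signal) => {
  // find typically exits with code 1 on error.
  console.log(`exit: code ${code} and signal ${signal}`);
});
```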

The child process’s stdin is a writable stream. We can use it to pass input to the command. As with any writable stream, the easiest way to feed it is with the pipe function, which simply streams a readable stream into a writable one. The main process’s stdin is a readable stream and the child process’s stdin is a writable stream, so they can be connected via pipe. For example:

const { spawn } = require("child_process");

const child = spawn("wc");

process.stdin.pipe(child.stdin);

child.stdout.on("data".data= > {
  console.log(`child stdout:\n${data}`);
});
Copy the code

In the example above, the child process runs the wc command and waits for input. The main process then pipes its stdin into the child’s stdin. With this combination we get a standard input mode where we can type something and finish with Ctrl+D, and whatever we typed will be used as input to the wc command.

We can also chain multiple child processes together using the pipe function, just as in Linux. For example:

```js
const { spawn } = require("child_process");

const find = spawn("find", [".", "-type", "f"]);
const wc = spawn("wc", ["-l"]);

find.stdout.pipe(wc.stdin);

wc.stdout.on("data", data => {
  console.log(`Number of files ${data}`);
});
```

Shell syntax and exec()

The exec method spawns a subshell, executes the command in that shell, buffers the data it produces, and, once the child process completes, passes the entire output to a callback. In other words, exec returns the complete buffered output of the child process. By default this buffer is capped at 200KB; if the child returns more data than that, the program crashes with the error message “Error: maxBuffer exceeded”. You can work around this by setting a larger maxBuffer in exec’s options, but you generally shouldn’t, because exec is not designed to return large amounts of data.
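If you really do need more room, raising the limit looks like this (a sketch; the 10MB value is an arbitrary choice):

```js
const { exec } = require('child_process');

// maxBuffer is in bytes; exceeding it terminates the child with an error.
exec('find . -type f', { maxBuffer: 1024 * 1024 * 10 }, (err, stdout, stderr) => {
  if (err) {
    console.error(`exec error: ${err}`);
    return;
  }
  console.log(`output length: ${stdout.length}`);
});
```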

Let’s implement the above example with exec

```js
const { exec } = require("child_process");

exec("find . -type f | wc -l", (err, stdout, stderr) => {
  if (err) {
    console.error(`exec error: ${err}`);
    return;
  }

  console.log(`Number of files ${stdout}`);
});
```

Because exec executes the command in a shell, we can write shell syntax directly, taking full advantage of features like shell pipes.

By default, spawn does not create a shell to execute the command, but we can pass an options object with shell: true to run the command in a shell. As mentioned above, spawn returns an object with stdout and stderr streams, and pipe makes it easy to consume the data.

```js
const child = spawn("find . -type f | wc -l", {
  stdio: "inherit",
  shell: true
});
```

With stdio: "inherit", the child inherits stdin, stdout, and stderr from the main process, so the child’s output is written directly to the main process’s process.stdout and the results appear immediately.

With shell: true set, we can use shell syntax just as we can with exec.

When you want the child process to return large amounts of data to Node, for example when processing images or reading binary data, you should use the spawn method.

When you want to use shell syntax and the data returned is small, exec is the best choice.
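As an illustration of the spawn case, here is a minimal sketch (the tar invocation, the src directory, and the backup.tar.gz path are just example assumptions) that streams potentially large binary output to a file instead of buffering it in memory:

```js
const fs = require('fs');
const { spawn } = require('child_process');

// Write the archive to stdout ('-f -') and stream it into a file,
// so the data is never buffered in the parent's memory.
const tar = spawn('tar', ['-czf', '-', 'src']);

tar.stdout.pipe(fs.createWriteStream('backup.tar.gz'));

tar.on('close', code => console.log('archive finished, code', code));
```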

There are two other important spawn options worth mentioning, cwd and env. cwd sets the current working directory:

```js
const child = spawn("find . -type f | wc -l", {
  stdio: "inherit",
  shell: true,
  cwd: "/Users/liujianwei/Downloads"
});
```

env sets the child’s environment variables and defaults to process.env. If this option is set, the child will not see the variables from process.env:

```js
const { spawn } = require('child_process')
const child = spawn("echo $ANSWER ; \n echo $HOME; ", {
  stdio: "inherit",
  shell: true,
  env: { ANSWER: 42 }
})
```

In the example above, $ANSWER prints 42, while $HOME prints nothing because it is not defined in the env object we passed.
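If you want to add variables while still keeping the existing environment, a common sketch is to spread process.env into the new object:

```js
const { spawn } = require('child_process')

// Keep the inherited environment and add/override ANSWER.
const child = spawn("echo $ANSWER ; \n echo $HOME; ", {
  stdio: "inherit",
  shell: true,
  env: { ...process.env, ANSWER: 42 }
})
```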

execFile()

The child_process.execFile() function is similar to child_process.exec() except that, by default, it does not spawn a shell. Instead, the specified executable file is spawned directly as a new process, making it slightly more efficient than child_process.exec().

It supports the same options as child_process.exec(). Since no shell is spawned, behaviors such as I/O redirection and file globbing are not supported.

```js
const { execFile } = require('child_process');
const child = execFile('node', ['--version'], (error, stdout, stderr) => {
  if (error) {
    throw error;
  }
  console.log(stdout);
});
```

fork()

child_process.fork() is a special form of spawn() for running Node modules in child processes; for example, fork('./son.js') is equivalent to spawn('node', ['./son.js']). Unlike the spawn method, fork establishes a communication (IPC) channel between the parent and child processes.

Here’s an example:

Parent file: parent.js

```js
const { fork } = require("child_process");

const forked = fork("child.js");

forked.on("message", msg => {
  console.log("Message from child", msg);
});

forked.send({ hello: "world" });
```

Child file: child.js

```js
process.on("message", msg => {
  console.log("Message from parent:", msg);
});

let counter = 0;

setInterval(() => {
  process.send({ counter: counter++ });
}, 1000);
```

Take a look at the implementation:

In the parent file, we fork child.js and listen for message events, which fire when the child sends a message via process.send.

Similarly, we can send messages from the parent process to the child via forked.send(), which the child receives with process.on('message').

Let’s look at an example closer to what we might use in a real project:

```js
const http = require("http");

const longComputation = () => {
  let sum = 0;
  for (let i = 0; i < 1e9; i++) {
    sum += i;
  }
  return sum;
};

const server = http.createServer();

server.on("request", (req, res) => {
  if (req.url === "/compute") {
    const sum = longComputation();
    return res.end(`Sum is ${sum}`);
  } else {
    res.end("Ok");
  }
});

server.listen(3000);
```

We have an HTTP service with two endpoints. When we request the /compute endpoint, the longComputation function performs heavy CPU computation and blocks the event loop. If we access the other endpoint at that point, we will not get an OK response until the computation finishes.

The solution is to put the longComputation function in a separate module and fork it into a child process, so it doesn’t block the event loop. The code is as follows:

Create a compute.js file containing the computation:

```js
const longComputation = () => {
  let sum = 0;
  for (let i = 0; i < 1e9; i++) {
    sum += i;
  }
  return sum;
};

process.on("message", msg => {
  const sum = longComputation();
  process.send(sum);
});
```

Then modify the server:

```js
const http = require("http");
const { fork } = require("child_process");

const server = http.createServer();

server.on("request", (req, res) => {
  if (req.url === "/compute") {
    const compute = fork("compute.js");
    compute.send("start");
    compute.on("message", sum => {
      res.end(`Sum is ${sum}`);
    });
  } else {
    res.end("Ok");
  }
});

server.listen(3000);
```

The code above is of course limited by the number of processes we can fork, but when we run it and request the /compute endpoint over HTTP, the main server is not blocked at all and can accept further requests.

The Cluster module of Node is based on this idea.
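As a rough sketch of that idea (not the full Cluster API), a master process can fork one worker per CPU core, with all workers sharing the same port:

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker process per CPU core.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs its own server; the master distributes connections.
  http.createServer((req, res) => {
    res.end(`Handled by worker ${process.pid}`);
  }).listen(3000);
}
```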

Finally

This article draws heavily on Node.js Child Processes: Everything You Need to Know.

References

Node.js Child Processes: Everything you need to know

Is Node really single-threaded?

Node.js child processes (exec, spawn, fork)