Node.js is a runtime that executes JavaScript on a single thread with asynchronous, non-blocking I/O, so how do you take advantage of multi-core CPUs? One answer is the child_process module, which creates child processes. In Node.js, there are four ways to create child processes:

  1. exec
  2. execFile
  3. spawn
  4. fork

All four methods return a ChildProcess instance (which extends EventEmitter) with three standard stdio streams:

  1. child.stdin
  2. child.stdout
  3. child.stderr

The following events can be registered for listening during the child process lifecycle:

  • exit: emitted when the child process ends. The listener receives the exit code and, if the process was terminated by a signal, the signal name.
  • close: emitted after the child process has ended and its stdio streams have been closed. It receives the same arguments as the exit event.
  • disconnect: emitted after the parent process calls child.disconnect() or the child process calls process.disconnect().
  • error: emitted when the child process cannot be created, cannot be killed, or a message cannot be sent to it.
  • message: emitted when the child process sends a message with process.send().
  • spawn: emitted once the child process has been created successfully (this event was added in Node.js v15.1).
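
As a minimal sketch of how these events can be observed, the snippet below starts a short-lived Node.js child and records the order in which spawn, exit, and close fire. The inline child script, the exit code 3, and the `seen` array are illustrative assumptions, and the spawn event needs a Node.js version that supports it.

```javascript
const { spawn } = require("child_process")

// Hypothetical child: a Node.js one-liner that exits with code 3.
const child = spawn(process.execPath, ["-e", "process.exit(3)"])

const seen = [] // order in which lifecycle events were observed

child.on("spawn", () => seen.push("spawn"))
child.on("error", (err) => console.error(`child error: ${err.message}`))
child.on("exit", (code, signal) => {
  seen.push("exit")
  console.log(`exit: code=${code} signal=${signal}`)
})
child.on("close", (code) => {
  seen.push("close")
  console.log(`close: code=${code}, events so far: ${seen.join(" -> ")}`)
})
```

Listening for close rather than exit is usually the safer choice when you also read child.stdout, because close waits for the streams to finish.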

The exec and execFile methods also accept a callback that is called when the child process terminates. Each method is examined in detail below:

exec

The exec method runs a command in a shell and takes the command string as its argument. For example, to count the number of files in the current directory with exec:

const { exec } = require("child_process")
exec("find . -type f | wc -l", (err, stdout, stderr) => {
  if (err) return console.error(`exec error: ${err}`)
  console.log(`Number of files ${stdout}`)
})

exec spawns a shell to run the command, buffers its output, and passes that output to the callback when the command finishes.

As you may already know, exec is dangerous with untrusted input. If a user-supplied string becomes part of the exec argument, you run the risk of command injection, for example:

find . -type f | wc -l; rm -rf /;

Also, since exec buffers the entire output in memory, spawn is a better choice when the output is large.
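
The buffering also has a ceiling: exec stops collecting once the output exceeds its maxBuffer option and reports an error instead. The sketch below forces that situation with a deliberately tiny limit. The 100-byte value and the inline Node.js command are assumptions chosen only for illustration.

```javascript
const { exec } = require("child_process")

// Hypothetical command: have Node.js itself print 1000 characters.
const command = `"${process.execPath}" -e "process.stdout.write('x'.repeat(1000))"`

// maxBuffer caps how many bytes of stdout/stderr exec will hold in memory.
const result = new Promise((resolve) => {
  exec(command, { maxBuffer: 100 }, (err, stdout) => {
    resolve({ code: err && err.code, kept: stdout.length })
  })
})

result.then(({ code, kept }) => {
  console.log(`error code: ${code}, characters kept: ${kept}`)
})
```

When the limit is hit, the callback receives an error whose code is ERR_CHILD_PROCESS_STDIO_MAXBUFFER, which is a reasonable signal to switch to a streaming approach.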

execFile

execFile differs from exec in that it does not create a shell but runs the executable file directly, so it is more efficient. For example:

const { execFile } = require("child_process")
const child = execFile("node", ["--version"], (error, stdout, stderr) => {
  if (error) throw error
  console.log(stdout)
})

Because no shell is created, the program's arguments are passed as an array and are never interpreted by a shell, which makes command injection much harder.
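
To make that concrete, here is a small sketch in which a string full of shell metacharacters reaches the child as a single, untouched argument. The `userInput` value and the inline echo script are hypothetical stand-ins for real untrusted input and a real program.

```javascript
const { execFile } = require("child_process")

// Hypothetical untrusted input containing shell metacharacters.
const userInput = "hello; echo injected"

// Inline child script that prints back its last command-line argument.
const script = "console.log(process.argv[process.argv.length - 1])"

const echoed = new Promise((resolve, reject) => {
  // Each array element reaches the program as one argument, uninterpreted.
  execFile(process.execPath, ["-e", script, userInput], (error, stdout) => {
    if (error) return reject(error)
    resolve(stdout.trim())
  })
})

echoed.then((value) => console.log(`child received: ${JSON.stringify(value)}`))
```

The semicolon is delivered as plain text instead of ending one command and starting another, which is the property the array form provides.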

spawn

spawn is similar to execFile in that no shell is used by default. The difference is that execFile buffers the command's output and passes it to a callback, whereas spawn exposes the output as streams, which makes it easy to connect the input and output of processes. For example, with the wc command:

const { spawn } = require("child_process")
const child = spawn("wc")
process.stdin.pipe(child.stdin)
child.stdout.on("data", (data) => {
  console.log(`child stdout:\n${data}`)
})

Here the child reads from the parent's stdin. wc keeps reading until it reaches the end of input, which in a terminal means pressing Enter and then Ctrl+D, and then prints its counts to stdout.

wc is short for "word count" and counts lines, words, and bytes. Its syntax is:

wc [OPTION]... [FILE]...

If you run wc with no arguments in a terminal, it counts whatever you type on the keyboard. Press Enter, then Ctrl+D, and it prints the totals.

Commands can also be combined. On the Linux command line, for example, counting the files in the current directory looks like this:

find . -type f | wc -l

In Node.js, the same pipeline is built by connecting two spawned processes, mirroring the command line:

const { spawn } = require("child_process")
const find = spawn("find", [".", "-type", "f"])
const wc = spawn("wc", ["-l"])
find.stdout.pipe(wc.stdin)
wc.stdout.on("data", (data) => {
  console.log(`Number of files ${data}`)
})

spawn accepts a rich set of options, such as:

const child = spawn("find . -type f | wc -l", {
  stdio: "inherit", // Inherit the parent's input and output streams
  shell: true, // Run the command through a shell
  cwd: "/Users/keliq/code", // Working directory for the command
  env: { ANSWER: 42 }, // Environment variables for the child (process.env by default)
  detached: true, // Let the child run independently of the parent
})
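
One detail of the env option is worth a sketch: the object you pass replaces the child's environment rather than extending it. Assuming you want the child to keep inherited variables, spread process.env first and then add your own. The INHERITED_FLAG variable and the inline child script are hypothetical; ANSWER mirrors the option shown above.

```javascript
const { spawn } = require("child_process")

// Stand-in for a variable the parent already has (hypothetical name).
process.env.INHERITED_FLAG = "from-parent"

// Inline child script: print the two variables it can see.
const script =
  "console.log(JSON.stringify([process.env.ANSWER, process.env.INHERITED_FLAG]))"

const child = spawn(process.execPath, ["-e", script], {
  // Spread process.env first so inherited variables are kept, then override.
  env: { ...process.env, ANSWER: "42" },
})

let output = ""
child.stdout.on("data", (chunk) => (output += chunk))
child.on("close", () => console.log(`child saw: ${output.trim()}`))
```

Passing only `{ ANSWER: "42" }` would leave the child without the inherited variable, which is easy to overlook when a command depends on something like PATH.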

fork

fork is a variant of spawn for starting Node.js processes. A communication channel is created automatically between the parent and the child: the parent uses send() and the message event on the returned ChildProcess, and the child uses process.send() and the message event on its global process object. For example, the parent process, parent.js:

const { fork } = require("child_process")
const forked = fork("./child.js")

forked.on("message", (msg) => {
  console.log("Message from child", msg);
})

forked.send({ hello: "world" })

The child process, child.js:

process.on("message", (msg) => {
  console.log("Message from parent:", msg)
})

let counter = 0
setInterval(() => {
  process.send({ counter: counter++ })
}, 1000)

When fork("./child.js") is called, the file is run with Node.js, much like spawn("node", ["./child.js"]) but with the communication channel set up as well.
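
The relationship can be sketched with spawn directly: adding "ipc" to the stdio option opens the same kind of message channel that fork provides by default. The inline child script below is a hypothetical stand-in for child.js.

```javascript
const { spawn } = require("child_process")

// Hypothetical inline child: process.send exists only when an IPC channel is present.
const script = "process.send({ hasChannel: typeof process.send === 'function' })"

// The fourth stdio entry, "ipc", asks spawn to open the message channel
// that fork sets up automatically.
const child = spawn(process.execPath, ["-e", script], {
  stdio: ["inherit", "inherit", "inherit", "ipc"],
})

const firstMessage = new Promise((resolve) => child.once("message", resolve))
firstMessage.then((msg) => console.log("Message from child", msg))
```

In everyday code fork remains the more convenient choice, since it handles the channel and the path to the Node.js executable for you.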

A typical use of fork is as follows. Suppose an HTTP server is created in Node.js and performs a time-consuming operation whenever the /compute route is requested:

const http = require("http")
const server = http.createServer()
server.on("request", (req, res) => {
  if (req.url === "/compute") {
    const sum = longComputation()
    return res.end(`Sum is ${sum}`)
  } else {
    res.end("OK")
  }
})

server.listen(3000)

This time-consuming operation can be simulated with the following code:

const longComputation = () => {
  let sum = 0
  for (let i = 0; i < 1e9; i++) {
    sum += i
  }
  return sum
}

Because Node.js runs JavaScript on a single thread, the long computation occupies the CPU as soon as the server receives a /compute request. Every other request is blocked until it finishes, so the server appears unresponsive.

The easiest way to solve this problem is to move the time-consuming operation into a child process. For example, create a compute.js file as follows:

const longComputation = () => {
  let sum = 0
  for (let i = 0; i < 1e9; i++) {
    sum += i
  }
  return sum
}

process.on("message", (msg) => {
  const sum = longComputation()
  process.send(sum)
})

Change the server code slightly:

const http = require("http")
const { fork } = require("child_process")
const server = http.createServer()
server.on("request", (req, res) => {
  if (req.url === "/compute") {
    const compute = fork("compute.js")
    compute.send("start")
    compute.on("message", (sum) => {
      res.end(`Sum is ${sum}`)
      compute.kill() // Stop the child so it does not keep waiting for messages
    })
  } else {
    res.end("OK")
  }
})
server.listen(3000)

This way, the main process is not blocked: it keeps handling other requests and responds to /compute once the child reports its result. A simpler approach is to use the cluster module, which is left for a later discussion for space reasons.
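
The request handler above wires fork, send, and the message event together by hand. One possible way to package that pattern is a small Promise-returning helper, sketched below. The helper name `runInChild`, the temporary worker file, and the reduced loop bound are all assumptions made to keep the example self-contained and fast.

```javascript
const { fork } = require("child_process")
const fs = require("fs")
const os = require("os")
const path = require("path")

// Write a throwaway worker script (hypothetical) so the sketch is self-contained.
const workerPath = path.join(os.tmpdir(), `compute-worker-${process.pid}.js`)
fs.writeFileSync(
  workerPath,
  `process.on("message", (limit) => {
    let sum = 0
    for (let i = 0; i < limit; i++) sum += i
    process.send(sum)
  })`
)

// Hypothetical helper: run one computation in a forked child and
// resolve with the first message it sends back.
function runInChild(file, input) {
  return new Promise((resolve, reject) => {
    const child = fork(file)
    child.once("message", (result) => {
      child.kill() // the worker would otherwise keep waiting for messages
      resolve(result)
    })
    child.once("error", reject)
    child.send(input)
  })
}

const pending = runInChild(workerPath, 1e6)
pending
  .then((sum) => console.log(`Sum is ${sum}`))
  .finally(() => fs.unlinkSync(workerPath))
```

Inside a request handler this could be used as `runInChild("compute.js", "start").then((sum) => res.end(...))`. A production version would probably also handle a child that exits without replying, which this sketch leaves out.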

Conclusion

Having mastered the above four methods of creating child processes, the following three rules emerge:

  • Use fork to create Node.js child processes, because it comes with its own communication channel.
  • Use execFile or spawn to create non-Node.js child processes. If the output is small, use execFile: the result is buffered and passed to the callback for easy processing. If the output is large, use spawn: streaming avoids holding all of it in memory.
  • exec is the more convenient way to run complex, fixed shell commands. But keep in mind that exec creates a shell, so it is less efficient than execFile and spawn and carries the risk of command injection.