What exactly is the Node single thread? What about Node multithreading? I hope this article makes it clear.

The reading time is about 10 to 13 minutes

This test environment: System: macOS Mojave 10.14.2 CPU: 4-core 2.3 GHz Node: 10.15.1

Start with the Node thread

It is generally understood that Node is single-threaded, so the number of threads should be 1 after Node is started. Let’s experiment.

setInterval((a)= > {
  console.log(new Date().getTime())
}, 3000)
Copy the code

You can see that the Node process occupies seven threads. Why are there seven threads?

As we all know, the core of Node is the V8 engine. After Node is started, an instance of V8 is created, which is multi-threaded.

  • Main thread: compiles and executes code.
  • Compile/optimize threads: Code can be optimized while the main thread is executing.
  • Analyzer thread: Logs profiling code runtime to provide a basis for optimizing code execution by the Crankshaft.
  • Several threads of garbage collection.

So when we say Node is single-threaded we mean that JavaScript is executed single-threaded, but the JavaScript hosting environment, both Node and browser, is multi-threaded.

Node has two compilers: full-codeGen: simply and quickly compile JS into simple but slow mechanical code. Crankshaft: More sophisticated real-time optimization compiler that compiles high performance executable code.

Some asynchronous IO takes up extra threads

As in the above example, we read a file while the timer is running:

const fs = require('fs')

setInterval((a)= > {
    console.log(new Date().getTime())
}, 3000)

fs.readFile('./index.html', () = > {})Copy the code

The number of threads became 11 because some IO operations (DNS, FS) and some CPU intensive computations (Zlib, Crypto) enabled Node’s thread pool, which defaulted to 4 because the number of threads became 11.

We can manually change the thread pool default size:

process.env.UV_THREADPOOL_SIZE = 64
Copy the code

One line of code easily makes the thread 71.

Is cluster multithreading?

Node’s single threading also brings some problems, such as CPU underutilization, an uncaught exception that can cause an entire program to exit, and so on. Because Node provides the Cluster module, cluster encapsulates child_process and implements the multi-process model by forking child processes. For example, pM2, the most commonly used one, is the best representative.

Let’s look at a cluster demo:

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(` main process${process.pid}Running ');
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log('Working process${worker.process.pid}Have withdrawn from `);
  });
} else {
  // Worker processes can share any TCP connection.
  // In this case, the HTTP server is shared.
  http.createServer((req, res) = > {
    res.writeHead(200);
    res.end('Hello World');
  }).listen(8000);
  console.log('Working process${process.pid}Has launched `);
}
Copy the code

At this point take a look at the activity monitor:

There are nine processes, including one primary process. Number of cpus x Number of CPU cores = 2 x 4 = 8 child processes.

So neither child_process nor cluster is a multithreaded model, but a multiprocess model. While developers are aware of the problems with the single-threaded model, it does not address the root of the problem and provides a multi-process approach to simulate multi-threading. As you can see from the previous experiments, Although Node (V8) has the ability to multithread itself, the developers are not able to take full advantage of this ability, and are more likely to use multithreading in some ways provided by the underlying Node. Node official says:

You can use the built-in Node Worker Pool by developing a C++ addon. On older versions of Node, build your C++ addon using NAN, Newer versions use n-api. node-webworker-threads offers a JavaScript-only way to access node’s Worker Pool.

But for JavaScript developers, there hasn’t been a standard, user-friendly way to use Node’s multithreading capabilities.

True – Node Multi-threaded

It wasn’t until the release of Node 10.5.0 that an experimental module called Worker_Threads was officially introduced to give Node true multithreading capabilities.

Take a look at a simple demo:

const {
  isMainThread,
  parentPort,
  workerData,
  threadId,
  MessageChannel,
  MessagePort,
  Worker
} = require('worker_threads');

function mainThread() {
  for (let i = 0; i < 5; i++) {
    const worker = new Worker(__filename, { workerData: i });
    worker.on('exit', code => { console.log(`main: worker stopped with exit code ${code}`); });
    worker.on('message', msg => {
      console.log(`main: receive ${msg}`);
      worker.postMessage(msg + 1); }); }}function workerThread() {
  console.log(`worker: workerDate ${workerData}`);
  parentPort.on('message', msg => {
    console.log(`worker: receive ${msg}`);
  }),
  parentPort.postMessage(workerData);
}

if (isMainThread) {
  mainThread();
} else {
  workerThread();
}
Copy the code

The code above opens five child threads in the main thread, and the main thread sends simple messages to the child threads.

Since worker_thread is still in the experimental stage, the –experimental-worker flag needs to be added at startup to observe the activity monitor after running:

No more, no less, exactly five more child threads.

worker_threadThe module

Worker_thread Core code

The Worker_Thread module has four objects and two classes.

  • IsMainThread: whether the main thread, the source is throughthreadId === 0Judge.
  • MessagePort: Used for communication between threads, inherited from EventEmitter.
  • MessageChannel: Used to create channel instances for asynchronous, bidirectional communication.
  • ThreadId: indicates the threadId.
  • Worker: Used to create child threads in the main thread. The first argument is filename, which represents the entry to be executed by the child thread.
  • ParentPort: an object of type MessagePort representing the parent process in the worker thread, null in the main thread
  • WorkerData: Used to pass data (copies of data) to child processes in the main process

Let’s look at an example of process communication:

const assert = require('assert');
const {
  Worker,
  MessageChannel,
  MessagePort,
  isMainThread,
  parentPort
} = require('worker_threads');
if (isMainThread) {
  const worker = new Worker(__filename);
  const subChannel = new MessageChannel();
  worker.postMessage({ hereIsYourPort: subChannel.port1 }, [subChannel.port1]);
  subChannel.port2.on('message', (value) => {
    console.log('received:', value);
  });
} else {
  parentPort.once('message', (value) => {
    assert(value.hereIsYourPort instanceof MessagePort);
    value.hereIsYourPort.postMessage('the worker is sending this');
    value.hereIsYourPort.close();
  });
}
Copy the code

See the official documentation for more details.

Multi-process vs. multi-thread

According to the university textbook saying: “process is the smallest unit of resource allocation, thread is the smallest unit of CPU scheduling”, this sentence is enough for the exam, but in the actual work, we still need to make reasonable choices according to the needs.

Here is a comparison between multi-threading and multi-process:

attribute Multiple processes multithreading To compare
data Data sharing is complex and requires IPC; The data is separate and easy to synchronize Because process data is shared, data sharing is simple and synchronization is complex Each has his strong point
CPU, memory, Large memory usage, complex switchover, low CPU usage Low memory usage, simple switchover, and high CPU utilization Multithreading is better
Destroy, switch Creation, destruction, switching complex, slow speed Create destroy, switch simple, fast Multithreading is better
coding Coding is simple and debugging is convenient Encoding and debugging are complicated Multiple processes are better
reliability Processes run independently and do not affect each other Threads breathe and share a common destiny Multiple processes are better
distributed Can be used for multi-machine multi-core distributed, easy to expand Can only be used for multi-core distribution Multiple processes are better

The above comparison is general and not absolute.

Work_thread gives Nodes true multithreading capability, which is a big step forward.

Go to [IVWEB community] public account for more dry goods articles