Preface

The previous article described how to use the worker_threads module to spawn worker threads, so that CPU-intensive tasks no longer block the Node.js main thread.

In practice, thread pools should be used when working with worker threads. Otherwise, the overhead of creating a worker thread may outweigh the benefits.

This article introduces the async_hooks module and the thread pool concept, and outlines how async_hooks can be used to encapsulate a thread pool.

Prerequisites

To follow this article, you need:

  • Basic knowledge of synchronous and asynchronous JavaScript programming
  • An understanding of how Node.js works

Introduction to async_hooks

The async_hooks module provides a set of APIs for tracking the life cycle of asynchronous resources. The module first appeared in Node.js v8.0.0. In the latest LTS documentation (v16.14.0) and in the latest Current release (v17.5.0) it is still marked Stability: 1 - Experimental, and it has not been promoted to a stable feature. After many iterations, however, it is reasonable to believe that async_hooks has matured and is more stable than a typical experimental API.

Official API Overview

const async_hooks = require('async_hooks');

// Returns the ID of the current execution context.
const eid = async_hooks.executionAsyncId();

// Returns the handle ID responsible for triggering the current execution scope callback.
const tid = async_hooks.triggerAsyncId();

// Create a new AsyncHook instance. All of these callbacks are optional. See below for details.
const asyncHook =
    async_hooks.createHook({ init, before, after, destroy, promiseResolve });

// Allows the callback of this AsyncHook instance to be called. This is not an implicit operation after the constructor is run; it must be run explicitly to begin the callback.
asyncHook.enable();

// Disable listening for new asynchronous events.
asyncHook.disable();

// The following callbacks can be passed to createHook().

// init is called during object construction. The resource may not have
// finished construction when this callback runs, so some fields of the
// resource referenced by "asyncId" may not have been populated yet.
function init(asyncId, type, triggerAsyncId, resource) {}

// before is called just before the resource's callback is called. For handles
// (such as TCPWrap) it can be called 0-N times; for requests (such as
// FSReqCallback) it is called exactly 1 time.
function before(asyncId) {}

// after is called just after the resource's callback has finished.
function after(asyncId) {}

// destroy is called when the resource is destroyed.
function destroy(asyncId) {}

// promiseResolve is called only for promise resources, when the "resolve"
// function passed to the "Promise" constructor is invoked (directly or
// through other means of resolving a promise).
function promiseResolve(asyncId) {}

Let's take a simple example:

setTimeout(() => {}, 1000)

Print the Timeout object that this call returns and look at its contents.

Focus on the following properties:

  • _idleTimeout: the timeout delay
  • _onTimeout: the callback function
  • _repeat: whether the callback is repeated
  • _destroyed: whether the timer has been destroyed
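As a rough illustration, the sketch below reads those four properties directly from the object returned by setTimeout. The underscore-prefixed fields are Node.js internals rather than a documented API, so their names and values may differ between versions; the `snapshot` object is just a name chosen for this example.

```javascript
// Inspect the internal fields of the Timeout object returned by setTimeout.
// These underscore-prefixed fields are internal and may change between versions.
const timer = setTimeout(() => {}, 1000);

const snapshot = {
  idleTimeout: timer._idleTimeout,        // the delay in milliseconds
  onTimeoutType: typeof timer._onTimeout, // the callback is stored as a function
  repeat: timer._repeat,                  // null for setTimeout (no repetition)
  destroyed: timer._destroyed,            // false while the timer is pending
};
console.log(snapshot);

// Cancel the timer so the process can exit immediately.
clearTimeout(timer);
console.log('destroyed after clearTimeout:', timer._destroyed);
```

Cancelling the timer flips `_destroyed`, which corresponds to the end of the resource's life cycle.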

A setTimeout resource goes through a life cycle: it is created, its callback runs, and it is finally destroyed.

async_hooks provides a set of hooks for monitoring the different phases of this life cycle, and we can attach a callback function to each of them. The four most common hooks are init, before, after, and destroy. They fire, respectively, when the asynchronous resource is created, just before its callback runs, just after its callback finishes, and when the resource is destroyed.
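To make the four phases concrete, here is a minimal sketch that records the order in which the hooks fire for a single timer. The `events` array and `trackedId` variable are names made up for this example; only the first Timeout resource is tracked so that other resources do not pollute the log.

```javascript
const fs = require('fs');
const async_hooks = require('async_hooks');

const events = [];     // ordered log of life cycle events for one timer
let trackedId = null;  // asyncId of the first Timeout resource we see

async_hooks.createHook({
  init(asyncId, type) {
    if (type === 'Timeout' && trackedId === null) {
      trackedId = asyncId;
      events.push('init');
    }
  },
  before(asyncId) {
    if (asyncId === trackedId) events.push('before');
  },
  after(asyncId) {
    if (asyncId === trackedId) events.push('after');
  },
  destroy(asyncId) {
    if (asyncId === trackedId) {
      events.push('destroy');
      // Synchronous write: safe to call from inside a hook callback.
      fs.writeSync(1, `${events.join(' -> ')}\n`);
    }
  },
}).enable();

// The timer's own callback runs between the before and after hooks.
setTimeout(() => events.push('callback'), 10);
```

The timer callback itself is logged as `callback`, so the resulting line shows where user code sits relative to the hooks.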

Understanding async_hooks through sample code

Create main.js as follows:

const fs = require('fs')
const async_hooks = require('async_hooks')

// Log the creation and destruction of every asynchronous resource.
// fs.writeSync writes synchronously to file descriptor 1 (standard output).
async_hooks.createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    fs.writeSync(1, `init ${type}(${asyncId}): trigger: ${triggerAsyncId}\n`)
  },
  destroy (asyncId) {
    fs.writeSync(1, `destroy: ${asyncId}\n`)
  }
}).enable()

async function A () {
  fs.writeSync(1, `A -> ${async_hooks.executionAsyncId()}\n`)
  setTimeout(() => {
    fs.writeSync(1, `A in setTimeout -> ${async_hooks.executionAsyncId()}\n`)
    B()
  })
}

async function B () {
  fs.writeSync(1, `B -> ${async_hooks.executionAsyncId()}\n`)
  setTimeout(() => {
    fs.writeSync(1, `B in setTimeout -> ${async_hooks.executionAsyncId()}\n`)
  })
}

fs.writeSync(1, `top level -> ${async_hooks.executionAsyncId()}\n`)
A()

Run:

node main.js

Output:

top level -> 1
init PROMISE(2): trigger: 1
A -> 1
init Timeout(3): trigger: 1
A in setTimeout -> 3
init PROMISE(4): trigger: 3
B -> 3
init Timeout(5): trigger: 3
destroy: 3
B in setTimeout -> 5
destroy: 5

The code starts by calling async_hooks.createHook to register an init callback, which tracks the initialization of every asynchronous resource, and a destroy callback, which listens for the destruction of asynchronous resources. The hook is then switched on by calling .enable().

fs.writeSync(1, msg) prints to standard output: the first argument of writeSync is a file descriptor, and 1 is standard output. Why not use console.log? Because console.log is an asynchronous operation. Calling it inside a callback registered with async_hooks.createHook creates new asynchronous resources, which trigger the hook callbacks again and cause infinite recursion.
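One way to keep console.log-style formatting while staying synchronous is a small helper built on fs.writeSync and util.format. The sketch below is only an illustration; `debug` is a hypothetical helper name, not part of async_hooks.

```javascript
const fs = require('fs');
const util = require('util');

// Hypothetical helper: formats its arguments like console.log does, but writes
// synchronously to file descriptor 1, so it is safe inside hook callbacks.
// Returns the number of bytes written, as reported by fs.writeSync.
function debug(...args) {
  return fs.writeSync(1, `${util.format(...args)}\n`);
}

debug('init %s(%d): trigger: %d', 'Timeout', 3, 1);
```

Because the write is synchronous, calling `debug` from init or destroy does not create any new asynchronous resource.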

To track asynchronous resources, Node.js gives every function, asynchronous or synchronous, an async scope to run in. Calling async_hooks.executionAsyncId() returns the ID of the current async scope (its asyncId), and calling async_hooks.triggerAsyncId() returns the asyncId of the caller, that is, of the resource that caused the current scope to run.
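The relationship between the two IDs can be checked with a short sketch. The names `outerId` and `seen` are made up for this example; the point is that, inside the timer callback, the trigger ID matches the ID of the scope that created the timer.

```javascript
const fs = require('fs');
const async_hooks = require('async_hooks');

// ID of the scope in which the timer is created.
const outerId = async_hooks.executionAsyncId();
const seen = {};

setTimeout(() => {
  // Inside the callback we are in the timer's own async scope.
  seen.executionId = async_hooks.executionAsyncId();
  // The trigger is the scope that created the timer.
  seen.triggerId = async_hooks.triggerAsyncId();
  fs.writeSync(1, `outer=${outerId} execution=${seen.executionId} trigger=${seen.triggerId}\n`);
}, 0);
```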

When an asynchronous resource is created, the init callback is triggered. Its first parameter, asyncId, is the ID of the new asynchronous resource. The second parameter, type, is the type of the resource (such as TCPWRAP, PROMISE, Timeout, Immediate, TickObject, and so on). The third parameter, triggerAsyncId, is the asyncId of the scope that created the resource. The fourth parameter, resource, is a reference to the object representing the asynchronous operation; if you hold on to it, release it in destroy.

When an asynchronous resource is destroyed, the destroy callback is triggered. It takes a single argument, the asyncId of the resource.
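A common pattern that follows from this is to record something per resource in init and drop it in destroy. The sketch below assumes a plain Map named `live` (a name chosen for this example) and stores only metadata rather than the resource object itself, so nothing is kept alive by the bookkeeping.

```javascript
const fs = require('fs');
const async_hooks = require('async_hooks');

const live = new Map();     // asyncId -> { type, triggerAsyncId }
let firstTimeoutId = null;  // remembered only so the demo can look it up later

async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    live.set(asyncId, { type, triggerAsyncId });
    if (type === 'Timeout' && firstTimeoutId === null) firstTimeoutId = asyncId;
  },
  destroy(asyncId) {
    // Release the bookkeeping entry once the resource is gone.
    live.delete(asyncId);
  },
}).enable();

setTimeout(() => {}, 10);
const trackedWhilePending = live.has(firstTimeoutId);

// Check again after the first timer has fired and been destroyed.
setTimeout(() => {
  fs.writeSync(1, `pending: ${trackedWhilePending}, after firing: ${live.has(firstTimeoutId)}\n`);
}, 100);
```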

What is a thread pool

Here’s a quote from Wikipedia:

A thread usage pattern. Too many threads bring scheduling overhead, which affects cache locality and overall performance. A thread pool maintains multiple threads that wait for a supervising manager to assign them tasks that can be executed concurrently. This avoids the cost of creating and destroying threads for short-lived tasks. A thread pool not only keeps the processor cores fully utilized, but also prevents over-scheduling. The number of available threads should depend on the number of available concurrent processors, processor cores, memory, network sockets, and so on. For example, for computationally intensive tasks the number of threads is usually the number of CPUs + 2, because too many threads result in additional thread-switching overhead.

Encapsulating a thread pool with async_hooks

Here’s the general idea:

  1. Use worker_threads to create a reasonable number of worker threads, based on the number of CPUs in the system.
  2. Have the main thread maintain a task queue for the worker threads.
  3. Use async_hooks to monitor the worker threads and schedule the task queue.
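As a rough preview of steps 1 and 2 only, the sketch below builds a tiny pool with a task queue. It is not the implementation from the follow-up article: `TinyPool`, `workerSource`, and the summing task are all hypothetical names and choices made for this example, the worker body is inlined with `eval: true` to keep everything in one file, and the async_hooks integration from step 3 is left out.

```javascript
const fs = require('fs');
const os = require('os');
const { Worker } = require('worker_threads');

// Hypothetical worker body: sums the integers 1..n for each message it receives.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i;
    parentPort.postMessage(sum);
  });
`;

class TinyPool {
  constructor(size = Math.max(1, os.cpus().length)) {
    this.queue = [];   // pending tasks: { data, resolve, reject }
    this.idle = [];    // workers currently waiting for a task
    this.workers = [];
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task and return a promise for its result.
  run(data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ data, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is free again
        task.resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        // A failed worker is not returned to the pool in this sketch.
        worker.off('message', onMessage);
        task.reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(task.data);
    }
  }

  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}

// Usage: three tasks share two workers, so one task waits in the queue.
const pool = new TinyPool(2);
const results = Promise.all([10, 100, 1000].map((n) => pool.run(n)));
results
  .then((values) => fs.writeSync(1, `${values.join(', ')}\n`))
  .finally(() => pool.close());
```

Reusing the two workers for three tasks is the whole point: the cost of starting a thread is paid once per worker rather than once per task.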

The full implementation will be covered in the next article.

Stay tuned for "Node.js Thread Pool with async_hooks (Part 2)". Interested readers are welcome to join the discussion in the comments section.