The official documentation

The async_hooks module provides an API for tracking asynchronous resources. I recently used the AsyncLocalStorage API from Node.js's async_hooks module in a project. I had heard of async_hooks before but had never used it in practice, so I took the opportunity to learn more about it.

What are asynchronous resources

Asynchronous resources here refer to objects with associated callbacks, which have the following characteristics:

  1. Callbacks can be called once or multiple times. For example, fs.open creates an FSReqCallback object that listens for the complete event and executes its callback once when the asynchronous operation finishes, while net.createServer creates a TCP object that listens for connection events and executes its callback multiple times.
  2. Resources can be closed before their callback is ever called.
  3. AsyncHook is an abstraction over these asynchronous resources and does not distinguish between the different kinds of asynchrony.
  4. If Worker threads are used, each thread gets its own independent async_hooks interface with its own set of asyncIds.
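The first two characteristics can be observed with a small sketch. It is an illustration only: the resource type 'DemoResource' and the names once, many and never are hypothetical, and AsyncResource (covered later in this article) stands in for real I/O so that everything runs synchronously.

```javascript
const { AsyncResource, createHook } = require('async_hooks');

// asyncId -> how many times the resource's callback has run
const beforeCount = new Map();

const hook = createHook({
  init(asyncId, type) {
    // Only track the hypothetical 'DemoResource' type used below
    if (type === 'DemoResource') beforeCount.set(asyncId, 0);
  },
  before(asyncId) {
    if (beforeCount.has(asyncId)) {
      beforeCount.set(asyncId, beforeCount.get(asyncId) + 1);
    }
  },
}).enable();

const once = new AsyncResource('DemoResource');
const many = new AsyncResource('DemoResource');
const never = new AsyncResource('DemoResource');

once.runInAsyncScope(() => {});                           // callback runs once
for (let i = 0; i < 3; i++) many.runInAsyncScope(() => {}); // callback runs three times
never.emitDestroy();                                      // closed before any callback ran

hook.disable();

const summary = {
  once: beforeCount.get(once.asyncId()),
  many: beforeCount.get(many.asyncId()),
  never: beforeCount.get(never.asyncId()),
};
console.log(summary); // { once: 1, many: 3, never: 0 }
```

The before hook fires once per callback invocation, so the counts mirror how often each resource's callback ran.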

Why track asynchronous resources

Node.js is built on an event loop with an asynchronous, non-blocking I/O model. An asynchronous call is issued now and its callback runs in a later turn of the loop, so by the time the callback runs there is no way to tell who made the asynchronous call.

Scenario 1

const fs = require('fs')

function callback(err, data) {
    console.log('callback', data)
}

fs.readFile("a.txt", callback)
console.log('after a')
fs.readFile("b.txt", callback)
console.log('after b')
// after a
// after b
// callback undefined
// callback undefined

The example above represents asynchronous I/O in Node, and the output is as expected. But which callback belongs to which call? Did a or b finish first? We cannot determine the call chain from the log.

Scenario 2

function main() {
  setTimeout(() => {
    throw Error(1)
  }, 0)
}

main()
// Error: 1
// at Timeout._onTimeout (/Users/zhangruiwu/Desktop/work/async_hooks-test/stack.js:3:11)
// at listOnTimeout (internal/timers.js:554:17)
// at processTimers (internal/timers.js:497:7)

When an asynchronous callback throws, the stack trace does not include the full call chain. The event loop severs the link between an asynchronous call and its callback. Me: the thread is broken, how do I tie it back together? async_hooks: I heard someone was looking for me 👀

AsyncHooks

Here’s the official overview:

const async_hooks = require('async_hooks');

// Returns asyncId for the current execution context.
const eid = async_hooks.executionAsyncId();

// Returns asyncId that triggers the current execution context.
const tid = async_hooks.triggerAsyncId();

// Create asyncHook instances and register various callbacks
const asyncHook =
    async_hooks.createHook({ init, before, after, destroy, promiseResolve });

// Enable asyncHook to execute callback only after asyncHook is enabled
asyncHook.enable();

// Disable asyncHook
asyncHook.disable();

//
// Here is the callback passed to createHook.
//

// Hook called when an asynchronous resource is initialized
function init(asyncId, type, triggerAsyncId, resource) {}

// Hook called before the asynchronous callback executes; may fire multiple times
function before(asyncId) {}

// Hook called after the asynchronous callback completes
function after(asyncId) {}

// Hook called when the asynchronous resource is destroyed
function destroy(asyncId) {}

// Hook called when a promise's resolve function is invoked
function promiseResolve(asyncId) {}

When asyncHook is enabled, each asynchronous resource fires these lifecycle hooks. The following describes the parameters of init:

asyncId

Unique ID of the asynchronous resource. The id starts from 1 and increments

type

A string that identifies an asynchronous resource. The following are some of the built-in types, which can also be customized

FSEVENTWRAP, FSREQCALLBACK, GETADDRINFOREQWRAP, GETNAMEINFOREQWRAP, HTTPINCOMINGMESSAGE,
HTTPCLIENTREQUEST, JSSTREAM, PIPECONNECTWRAP, PIPEWRAP, PROCESSWRAP, QUERYWRAP,
SHUTDOWNWRAP, SIGNALWRAP, STATWATCHER, TCPCONNECTWRAP, TCPSERVERWRAP, TCPWRAP,
TTYWRAP, UDPSENDWRAP, UDPWRAP, WRITEWRAP, ZLIB, SSLCONNECTION, PBKDF2REQUEST,
RANDOMBYTESREQUEST, TLSWRAP, Microtask, Timeout, Immediate, TickObject

triggerAsyncId

AsyncId of the asynchronous resource that triggers initialization of the current asynchronous resource.

const async_hooks = require('async_hooks');
const fs = require('fs');
const net = require('net');
const { fd } = process.stdout;

async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    const eid = async_hooks.executionAsyncId();
    fs.writeSync(
      fd,
      `${type}(${asyncId}): trigger: ${triggerAsyncId} execution: ${eid}\n`);
  }
}).enable();

net.createServer((conn) => {}).listen(8080);
// Output after startup:
// TCPSERVERWRAP(4): trigger: 1 execution: 1
// TickObject(5): trigger: 4 execution: 1 # listen

// After connecting with `nc localhost 8080`:
// TCPWRAP(6): trigger: 4 execution: 0 # connect callback

When a new connection is established, a TCPWrap instance is created. This happens in C++ with no JavaScript stack, so executionAsyncId is 0. On its own that does not tell you which asynchronous resource caused the creation, so triggerAsyncId is needed to state which asynchronous resource is responsible for it.
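The difference between the two ids can be sketched without a network connection. This is a minimal illustration, assuming two hypothetical resource types, 'ParentDemo' and 'ChildDemo', built with AsyncResource (described later in this article) and its triggerAsyncId option.

```javascript
const { AsyncResource, executionAsyncId, triggerAsyncId } = require('async_hooks');

// 'ParentDemo' and 'ChildDemo' are made-up type names for this sketch
const parent = new AsyncResource('ParentDemo');

// Explicitly declare that `parent` is responsible for creating `child`
const child = new AsyncResource('ChildDemo', { triggerAsyncId: parent.asyncId() });

// Inside the child's scope, read both ids
const seen = child.runInAsyncScope(() => ({
  execution: executionAsyncId(), // whose callback is running right now
  trigger: triggerAsyncId(),     // who caused that resource to be created
}));

console.log(seen.execution === child.asyncId()); // true
console.log(seen.trigger === parent.asyncId());  // true
```

executionAsyncId answers "where am I running", while triggerAsyncId answers "who caused this".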

resource

An object representing an asynchronous resource from which you can obtain some data related to the asynchronous resource. For example, an asynchronous resource object of type GETADDRINFOREQWRAP provides a hostname.

Usage example

Let’s take a look at an official example:

const async_hooks = require('async_hooks');
const fs = require('fs');
const net = require('net');
const { fd } = process.stdout;

let indent = 0;
async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    const eid = async_hooks.executionAsyncId();
    const indentStr = ' '.repeat(indent);
    fs.writeSync(
      fd,
      `${indentStr}${type}(${asyncId}) : ` +
      ` trigger: ${triggerAsyncId} execution: ${eid}\n`);
  },
  before(asyncId) {
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}before:  ${asyncId}\n`);
    indent += 2;
  },
  after(asyncId) {
    indent -= 2;
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}after:  ${asyncId}\n`);
  },
  destroy(asyncId) {
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}destroy:  ${asyncId}\n`);
  },
}).enable();

net.createServer().listen(8080, () => {
  // Let's wait 10ms before logging the server started.
  setTimeout(() => {
    console.log('>>>', async_hooks.executionAsyncId());
  }, 10);
});

Output after starting the service:

TCPSERVERWRAP(4): trigger: 1 execution: 1  # listen: create the TCP server and listen for connect events
TickObject(5): trigger: 4 execution: 1     # the user callback is placed in nextTick
before:  5
  Timeout(6): trigger: 5 execution: 5      # setTimeout
after:  5
destroy:  5
before:  6
>>> 6
  TickObject(7): trigger: 6 execution: 6   # console.log
after:  6
before:  7
after:  7

The official explanation for the TickObject on the second line is that binding to a port without a hostname is a synchronous operation, so the user callback is placed in nextTick to make it asynchronous. That leads to an interview (thinking) question: what does the following code print?

const net = require('net');
net.createServer().listen(8080, () => { console.log('listen') })

Promise.resolve().then(() => console.log('c'))
process.nextTick(() => { console.log('b') })
console.log('a')

console.log is itself an asynchronous operation, so it also triggers AsyncHooks callbacks. Calling console.log inside an AsyncHooks callback therefore recurses infinitely:

const { createHook } = require('async_hooks');

createHook({
  init(asyncId, type, triggerAsyncId, resource) {
    console.log(222)
  }
}).enable()

console.log(111)
// internal/async_hooks.js:206
// fatalError(e);
//   ^
//
// RangeError: Maximum call stack size exceeded
// (Use `node --trace-uncaught ...` to show where the exception was thrown)

Instead, write synchronously to a file or to standard output:

const { createHook } = require('async_hooks');
const { writeSync } = require('fs');
const { fd } = process.stdout // 1

createHook({
  init(asyncId, type, triggerAsyncId, resource) {
    // console.log(222)
    writeSync(fd, '222\n')
  }
}).enable()

console.log(111)
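For anything beyond a fixed string, a small helper keeps hook logging readable. The sketch below assumes two hypothetical helpers, formatLine and debug, built on util.format and fs.writeSync; the 'DebugDemo' resource type is also made up for the illustration.

```javascript
const fs = require('fs');
const util = require('util');
const { AsyncResource, createHook } = require('async_hooks');

// Format arguments the way console.log would, but return the string
function formatLine(...args) {
  return `${util.format(...args)}\n`;
}

// Synchronous write to stdout, so it can be called from inside a hook
// without triggering the hook again
function debug(...args) {
  fs.writeSync(process.stdout.fd, formatLine(...args));
}

const hook = createHook({
  init(asyncId, type, triggerAsyncId) {
    debug('%s(%d): trigger: %d', type, asyncId, triggerAsyncId);
  },
}).enable();

new AsyncResource('DebugDemo'); // fires init once, which prints one line
hook.disable();
```

Keeping the formatting in a separate pure function also makes the log format easy to check without running any hooks.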

How do I track asynchronous resources

We use AsyncHooks to solve the above two scenarios:

Scenario 1

const fs = require('fs')
const async_hooks = require('async_hooks');
const { fd } = process.stdout;

let indent = 0;
async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    const eid = async_hooks.executionAsyncId();
    const indentStr = ' '.repeat(indent);
    fs.writeSync(
      fd,
      `${indentStr}${type}(${asyncId}) : ` +
      ` trigger: ${triggerAsyncId} execution: ${eid} \n`);
  },
  before(asyncId) {
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}before:  ${asyncId}\n`);
    indent += 2;
  },
  after(asyncId) {
    indent -= 2;
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}after:  ${asyncId}\n`);
  },
  destroy(asyncId) {
    const indentStr = ' '.repeat(indent);
    fs.writeSync(fd, `${indentStr}destroy:  ${asyncId}\n`);
  },
}).enable();

function callback(err, data) {
    console.log('callback', data)
}

fs.readFile("a.txt", callback)
console.log('after a')
fs.readFile("b.txt", callback)
console.log('after b')

Output:

FSREQCALLBACK(4): trigger: 1 execution: 1      # a
after a
TickObject(5): trigger: 1 execution: 1
FSREQCALLBACK(6): trigger: 1 execution: 1      # b
after b
before:  5
after:  5
before:  4
callback undefined
  TickObject(7): trigger: 4 execution: 4       # trigger by a
after:  4
before:  7
after:  7
before:  6
callback undefined
  TickObject(8): trigger: 6 execution: 6       # trigger by b
after:  6
before:  8
after:  8
destroy:  5
destroy:  7
destroy:  4
destroy:  8
destroy:  6

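TickObject(7) is triggered by FSREQCALLBACK(4), which belongs to a, and TickObject(8) is triggered by FSREQCALLBACK(6), which belongs to b. So the first callback is a's and the second is b's.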

Scenario 2

const async_hooks = require('async_hooks');

function stackTrace() {
  const obj = {}
  Error.captureStackTrace(obj, stackTrace)
  return obj.stack
}

const asyncResourceMap = new Map();
async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    asyncResourceMap.set(asyncId, {
      asyncId,
      type,
      triggerAsyncId,
      stack: stackTrace()
    })
  },
  destroy(asyncId) {
    asyncResourceMap.delete(asyncId)
  },
}).enable();

function main() {
  setTimeout(() => {
    throw Error(1)
  }, 0)
}

main()

// Print the recorded creation stack of every resource in the trigger chain, outermost first
function getTrace(asyncId) {
  const resource = asyncResourceMap.get(asyncId);
  if (!resource) {
    return;
  }
  if (resource.triggerAsyncId) {
    getTrace(resource.triggerAsyncId);
  }
  console.log(`${resource.type}(${resource.asyncId})\n${resource.stack}`)
}

process.on('uncaughtException', (err) => {
  getTrace(async_hooks.executionAsyncId())
})

Output:

Timeout(2)
Error
    at AsyncHook.init (/Users/zhangruiwu/Desktop/work/async_hooks-test/async-error.js:16:14)
    at emitInitNative (internal/async_hooks.js:199:43)
    at emitInitScript (internal/async_hooks.js:467:3)
    at initAsyncResource (internal/timers.js:157:5)
    at new Timeout (internal/timers.js:191:3)
    at setTimeout (timers.js:157:19)
    at main (/Users/zhangruiwu/Desktop/work/async_hooks-test/async-error.js:25:3)
    at Object.<anonymous> (/Users/zhangruiwu/Desktop/work/async_hooks-test/async-error.js:30:1)
    at Module._compile (internal/modules/cjs/loader.js:1063:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)

AsyncHooks combined with Error.captureStackTrace can trace the entire call chain.
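The chain-walking part of this idea can be isolated into a small sketch. It leaves out the stack capture and only follows triggerAsyncId links. The helper chainOf and the resource types 'OuterStep' and 'InnerStep' are hypothetical names, and AsyncResource (covered in the next section) stands in for real asynchronous operations so the result is available synchronously.

```javascript
const { AsyncResource, createHook, executionAsyncId } = require('async_hooks');

// asyncId -> { type, triggerAsyncId }, recorded at creation time
const parents = new Map();

const hook = createHook({
  init(asyncId, type, triggerAsyncId) {
    parents.set(asyncId, { type, triggerAsyncId });
  },
}).enable();

// Follow triggerAsyncId links from the given resource back to the oldest known one
function chainOf(asyncId) {
  const chain = [];
  for (let node = parents.get(asyncId); node; node = parents.get(node.triggerAsyncId)) {
    chain.push(node.type);
  }
  return chain;
}

const outer = new AsyncResource('OuterStep');
const chain = outer.runInAsyncScope(() => {
  // triggerAsyncId defaults to executionAsyncId(), i.e. the id of `outer`
  const inner = new AsyncResource('InnerStep');
  return inner.runInAsyncScope(() => chainOf(executionAsyncId()));
});

hook.disable();
console.log(chain); // [ 'InnerStep', 'OuterStep' ]
```

Storing a stack captured with Error.captureStackTrace next to each entry, as in the scenario above, turns this chain of types into a chain of stack traces.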

Performance impact

Using AsyncHook incurs some performance overhead: Github.com/bmeurer/asy… Below is the data from a run on my machine. Registering only the init hook is not expensive, but registering all hooks is quite expensive. Here’s another reason why.

// Node v14.16.0:
regular hapiserver: 11734.73 reqs.
init hapiserver: 8768.21 reqs.
regular koaserver: 17418.8 reqs.
init koaserver: 17183.6 reqs.
full koaserver: 14097.82 reqs.

Practical applications

Bubbleprof, a performance profiling tool from Clinic.js, uses AsyncHooks + Error.captureStackTrace to trace the call chain.

AsyncResource

Asynchronous resources can be customized

const { AsyncResource, executionAsyncId } = require('async_hooks');

// AsyncResource is generally meant to be extended; here it is instantiated directly
const asyncResource = new AsyncResource(
  type, { triggerAsyncId: executionAsyncId(), requireManualDestroy: false });

// Run the function in the execution context of the asynchronous resource:
// * Establish the context of the asynchronous resource
// * Trigger before
// * Execute the function
// * Trigger after
// * Restore the original execution context
asyncResource.runInAsyncScope(fn, thisArg, ...args);

// Trigger destroy
asyncResource.emitDestroy();

// Return the asyncId
asyncResource.asyncId();

// Return the triggerAsyncId
asyncResource.triggerAsyncId();
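As a small sketch of how runInAsyncScope handles its arguments: thisArg becomes this inside the function, the remaining arguments are passed through, and the function's return value is returned to the caller. The resource type 'SumDemo' and the { base: 1 } object are made up for the illustration.

```javascript
const { AsyncResource, executionAsyncId } = require('async_hooks');

const resource = new AsyncResource('SumDemo'); // hypothetical type name

// A regular function (not an arrow) so that `this` is the thisArg we pass in
const result = resource.runInAsyncScope(function (a, b) {
  return {
    sum: this.base + a + b,
    insideScope: executionAsyncId() === resource.asyncId(),
  };
}, { base: 1 }, 2, 3);

// After the call, the previous execution context is restored
const restored = executionAsyncId() !== resource.asyncId();

resource.emitDestroy();

console.log(result);   // { sum: 6, insideScope: true }
console.log(restored); // true
```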

Here’s an example

const asyncHooks = require('async_hooks');

class MyResource extends asyncHooks.AsyncResource {
    constructor() {
        super('my-resource');
    }

    close() {
        this.emitDestroy();
    }
}

function p() {
    return new Promise(r => {
        setTimeout(() => {
            r()
        }, 1000)
    })
}

let resource = new MyResource;
resource.runInAsyncScope(async () => {
    console.log('hello')
    await p()
})

resource.close();

With the logging hooks from the earlier example enabled, we can see that the function passed to runInAsyncScope executes inside the callback of our custom asynchronous resource:

my-resource(4): trigger: 1 execution: 1
before:  4
  PROMISE(5): trigger: 4 execution: 4
hello
  TickObject(6): trigger: 4 execution: 4
  PROMISE(7): trigger: 4 execution: 4
  Timeout(8): trigger: 4 execution: 4
  PROMISE(9): trigger: 7 execution: 4
after:  4
before:  6
after:  6
destroy:  4
destroy:  6
before:  8
after:  8
before:  9
after:  9
destroy:  8

AsyncLocalStorage

AsyncLocalStorage is used to create asynchronous state within callbacks and promise chains. It allows data to be stored throughout the lifetime of a web request or any other asynchronous duration, similar to thread-local storage (TLS) in other languages.

const http = require('http');
const { AsyncLocalStorage } = require('async_hooks');

const asyncLocalStorage = new AsyncLocalStorage();

function logWithId(msg) {
  const id = asyncLocalStorage.getStore();
  console.log(`${id !== undefined ? id : '-'}:`, msg);
}

let idSeq = 0;
http.createServer((req, res) => {
  asyncLocalStorage.run(idSeq++, () => {
    logWithId('start');
    // Imagine any chain of async operations here
    setImmediate(() => {
      logWithId('finish');
      res.end();
    });
  });
}).listen(8080);

http.get('http://localhost:8080');
http.get('http://localhost:8080');
// Prints:
// 0: start
// 1: start
// 0: finish
// 1: finish
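The scoping rules can also be seen without an HTTP server. The following is a minimal sketch; the store values 'outer', 'inner' and 'request-7' are arbitrary strings chosen for the illustration. It shows that the store is undefined outside run, that nested run calls shadow and then restore the outer store, that exit hides it temporarily, and that the store survives a timer.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

const als = new AsyncLocalStorage();

// No run() is active at this point
const outside = als.getStore();

const nested = als.run('outer', () => {
  const before = als.getStore();                    // 'outer'
  const inner = als.run('inner', () => als.getStore()); // nested run shadows the store
  const exited = als.exit(() => als.getStore());    // store hidden inside exit()
  const after = als.getStore();                     // outer store is visible again
  return { before, inner, exited, after };
});

// The store follows the asynchronous call chain into the timer callback
const later = als.run('request-7', () =>
  new Promise((resolve) => setTimeout(() => resolve(als.getStore()), 10)));

console.log(outside); // undefined
console.log(nested);  // { before: 'outer', inner: 'inner', exited: undefined, after: 'outer' }
later.then((store) => console.log(store)); // request-7
```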

Implementation principle

Github.com/nodejs/node…

// Required here so the excerpt can run on its own
const { AsyncResource, createHook, executionAsyncResource } = require('async_hooks');

const storageList = [];
const storageHook = createHook({
  init(asyncId, type, triggerAsyncId, resource) {
    const currentResource = executionAsyncResource();
    // Value of currentResource is always a non null object
    for (let i = 0; i < storageList.length; ++i) {
      storageList[i]._propagate(resource, currentResource);
    }
  }
});

class AsyncLocalStorage {
  constructor() {
    this.kResourceStore = Symbol('kResourceStore');
    this.enabled = false;
  }

  disable() {
    if (this.enabled) {
      this.enabled = false;
      // If this.enabled, the instance must be in storageList
      storageList.splice(storageList.indexOf(this), 1);
      if (storageList.length === 0) {
        storageHook.disable();
      }
    }
  }

  // Propagate the context from a parent resource to a child one
  _propagate(resource, triggerResource) {
    const store = triggerResource[this.kResourceStore];
    if (this.enabled) {
      resource[this.kResourceStore] = store;
    }
  }

  enterWith(store) {
    if (!this.enabled) {
      this.enabled = true;
      storageList.push(this);
      storageHook.enable();
    }
    const resource = executionAsyncResource();
    resource[this.kResourceStore] = store;
  }

  run(store, callback, ...args) {
    const resource = new AsyncResource('AsyncLocalStorage');
    return resource.runInAsyncScope(() => {
      this.enterWith(store);
      return callback(...args);
    });
  }

  exit(callback, ...args) {
    if (!this.enabled) {
      return callback(...args);
    }
    this.enabled = false;
    try {
      return callback(...args);
    } finally {
      this.enabled = true;
    }
  }

  getStore() {
    const resource = executionAsyncResource();
    if (this.enabled) {
      return resource[this.kResourceStore];
    }
  }
}

run:

  • Create an AsyncResource and save the store on it
  • Enable the asyncHook (init)
  • Execute the function passed to run
  • When a new asynchronous resource is created, the init callback is triggered
  • The store of the current asynchronous resource is passed on to the new asynchronous resource
  • And so on

getStore: gets the store from the current asynchronous resource
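The steps above can be condensed into a few lines. The following is a stripped-down sketch of the same propagation idea, not Node's actual implementation: the names kStore, runWith and getStore and the resource types 'MiniStore' and 'Child' are hypothetical, and it supports only a single store.

```javascript
const { AsyncResource, createHook, executionAsyncResource } = require('async_hooks');

const kStore = Symbol('store'); // property under which the store is kept on a resource

// init: copy the store from the currently executing resource to the new one
const hook = createHook({
  init(asyncId, type, triggerAsyncId, resource) {
    const current = executionAsyncResource();
    if (current && current[kStore] !== undefined) {
      resource[kStore] = current[kStore];
    }
  },
}).enable();

// run: create a resource, attach the store to it, execute fn in its scope
function runWith(store, fn) {
  const resource = new AsyncResource('MiniStore');
  return resource.runInAsyncScope(() => {
    executionAsyncResource()[kStore] = store;
    return fn();
  });
}

// getStore: read the store from the currently executing resource
const getStore = () => executionAsyncResource()[kStore];

// A resource created inside runWith inherits the store through init
const result = runWith('req-1', () => {
  const child = new AsyncResource('Child');
  return child.runInAsyncScope(getStore);
});

// The same propagation reaches a real timer callback
const viaTimer = new Promise((resolve) => {
  runWith('req-2', () => setTimeout(() => resolve(getStore()), 5));
});

hook.disable(); // resources created so far already carry their store

console.log(result); // req-1
viaTimer.then((store) => console.log(store)); // req-2
```

Because the store lives on the resource objects themselves, no separate cleanup is needed; it disappears when the resource is garbage collected.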

Performance impact

Kuzzle measured a difference of about 8% between using AsyncLocalStorage and not using it. There are also issues in the community that track ALS performance problems, such as "AsyncLocalStorage kills 97% of performance in an async…" in nodejs/node. That issue is an interesting one: the title says ALS kills 97% of the performance, but a later comment points out that the test is not representative, and once the test function is changed, ALS has little impact.

let fn = async () => /test/.test('test');

Performed 180407 iterations to warmup
Performed 205741 iterations (with ALS enabled)
Performed 6446728 iterations (with ALS disabled)
ALS penalty: 96.8%

let fn = promisify(setTimeout).bind(null, 2);

Performed 44 iterations to warmup
Performed 4214 iterations (with ALS enabled)
Performed 4400 iterations (with ALS disabled)
ALS penalty: 4.23%

So the performance impact still needs to be evaluated in the context of actual application scenarios, and the loss from ALS may be insignificant compared to the business code.

Application scenarios

Implementing CLS

Continuation-local storage (CLS) is similar to thread-local storage (TLS) in other languages. It gets its name from continuation-passing style (CPS) in functional programming. CPS resembles the chained-callback style in Node, and CLS is designed to keep data available throughout a chain of function calls. cls-hooked is a library that implements CLS using async_hooks, and it also handles versions that do not support async_hooks. asynchronous-local-storage is implemented on top of ALS and falls back to cls-hooked on older versions; it is the library used in this project. Here’s the official example:

const { als } = require('asynchronous-local-storage')
const express = require('express')
const app = express()
const port = 3000

app.use((req, res, next) => {
  als.runWith(() => {
    next();
  }, { user: { id: 'defaultUser' } }); // sets default values
});

app.use((req, res, next) => {
  // overrides default user value
  als.set('user', { id: 'customUser' });
  next();
});

app.get('/', (req, res) => res.send({ user: als.get('user') }))

app.listen(port, () => console.log(`Example app listening at http://localhost:${port}`))

We can use it to share data across the asynchronous call chain and call get whenever and wherever we need that data. Now that data can be shared across the asynchronous call chain, does our middleware still need to pass req and res along? The middleware designs of both Express and Koa require passing fixed parameters between middleware, and anything we need has to be read from the ctx in those parameters. ALS removes this restriction: ctx no longer has to come from a parameter, which makes different designs possible, such as Midway's integrated invocation scheme. Farrow, a functional framework that has become popular recently, also uses ALS to offer a different development experience.
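To make the idea concrete without pulling in a framework, here is a minimal sketch built directly on AsyncLocalStorage. The names requestContext, currentUser, handler and handleRequest are hypothetical, and plain functions stand in for middleware; the point is only that handler never receives a ctx argument.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

// Holds the per-request context; plays the role that ctx usually plays
const requestContext = new AsyncLocalStorage();

// Can be called from anywhere in the call chain, no ctx parameter needed
function currentUser() {
  return requestContext.getStore().user;
}

// A "handler" that takes no arguments and still knows who is calling
async function handler() {
  await new Promise((resolve) => setImmediate(resolve)); // some async work
  return `hello ${currentUser().id}`;
}

// Entry point: establish the context once per request
function handleRequest(user) {
  return requestContext.run({ user }, handler);
}

// Two concurrent "requests" each see their own user
const replies = Promise.all([
  handleRequest({ id: 'alice' }),
  handleRequest({ id: 'bob' }),
]);

replies.then((result) => console.log(result)); // [ 'hello alice', 'hello bob' ]
```

The trade-off is that dependencies become implicit: handler only works when called inside requestContext.run, which is something to keep in mind when testing such code.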

References

  • Async hooks | Node.js v16.1.0 Documentation
  • bmeurer/async-hooks-performance-impact
  • Making async_hooks fast (enough)
  • NodeJs async_hooks
  • Use Async Hooks to handle HTTP request context for link tracing in Node.js
  • Medium.com/nmc-techblo…
  • Blog.Kuzzle.io/nodejs-14-a…
  • Itnext.io/request-id-…
