The async_hooks module is an experimental API that was officially added to Node.js in the v8.x release line. We have also run it in production on v8.x.x.

So what are async_hooks?

The async_hooks module provides an API for tracking asynchronous resources, which are objects with associated callbacks.

In short, the async_hooks module can be used to track asynchronous callbacks. So how do you use this tracking capability, and what problems come with it?

Getting to know async_hooks

In v8.x.x, async_hooks has two main parts: createHook, for tracking the lifecycle, and AsyncResource, for creating asynchronous resources.

const { createHook, AsyncResource, executionAsyncId } = require('async_hooks')

const hook = createHook({
  init (asyncId, type, triggerAsyncId, resource) {},
  before (asyncId) {},
  after (asyncId) {},
  destroy (asyncId) {}
})
hook.enable()

function fn () {
  console.log(executionAsyncId())
}

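const asyncResource = new AsyncResource('demo')
// runInAsyncScope requires Node.js v9.6.0+; on v8.x, wrap fn() in emitBefore()/emitAfter() instead
asyncResource.runInAsyncScope(fn)
asyncResource.runInAsyncScope(fn)
asyncResource.emitDestroy()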

The code above does the following:

  1. Creates a hooks instance that declares the init, before, after and destroy hook functions, which run at the corresponding points in the lifecycle of every asynchronous operation.
  2. Enables this hooks instance.
  3. Manually creates an asynchronous resource of type demo. This triggers the init hook with the resource's ID asyncId, its type type (here demo), the ID of the context that created it triggerAsyncId, and the resource itself resource.
  4. Executes fn twice within this asynchronous resource. This triggers before twice and after twice, each with the resource's asyncId, which is the same value that executionAsyncId() returns inside fn.
  5. Manually triggers the destroy lifecycle hook.
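To make the sequence in the list above observable, here is a minimal sketch that records each hook event in an array instead of leaving the hook bodies empty. It assumes Node.js v9.6.0 or later for runInAsyncScope. The names tracked, events and seenIds are illustrative and not part of the original code.

```javascript
const { createHook, AsyncResource, executionAsyncId } = require('async_hooks')

const tracked = new Set() // asyncIds of the 'demo' resources we care about
const events = []         // hook events in the order they fire

createHook({
  init (asyncId, type) {
    if (type !== 'demo') return
    tracked.add(asyncId)
    events.push('init')
  },
  before (asyncId) { if (tracked.has(asyncId)) events.push('before') },
  after (asyncId) { if (tracked.has(asyncId)) events.push('after') },
  destroy (asyncId) { if (tracked.has(asyncId)) events.push('destroy') }
}).enable()

const seenIds = []
function fn () {
  seenIds.push(executionAsyncId())
  events.push('fn')
}

const asyncResource = new AsyncResource('demo')
asyncResource.runInAsyncScope(fn)
asyncResource.runInAsyncScope(fn)
asyncResource.emitDestroy() // the destroy hook itself is delivered asynchronously

console.log(events.join(' ')) // init before fn after before fn after
console.log(seenIds.every((id) => id === asyncResource.asyncId())) // true
```

The init event appears once while before and after appear once per execution, and both calls to fn see the asyncId of the same resource.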

Asynchronous operations such as async/await, promises and requests all create asynchronous resources that trigger these lifecycle hook functions.

So, in the init hook function, we can link the creating context's triggerAsyncId to the current asynchronous resource's asyncId. Chaining these links together gives us a complete call tree. Inside a callback (fn in the code above), executionAsyncId() returns the asyncId of the asynchronous resource that is executing the current callback, and from there the call chain can be traced back to its source.
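As a rough illustration of linking triggerAsyncId to asyncId, the sketch below stores each link in a Map and walks it back towards the root. The parents map, the callChain helper and the resource types A, B and C are hypothetical names chosen for this example, and runInAsyncScope assumes Node.js v9.6.0 or later.

```javascript
const { createHook, AsyncResource } = require('async_hooks')

// asyncId -> triggerAsyncId, filled in by the init hook.
// Entries are never removed here; a real service would clean up in destroy.
const parents = new Map()
createHook({
  init (asyncId, type, triggerAsyncId) {
    parents.set(asyncId, triggerAsyncId)
  }
}).enable()

// Walk from an asyncId back towards the root of its call chain
function callChain (asyncId) {
  const chain = []
  for (let id = asyncId; id !== undefined; id = parents.get(id)) {
    chain.push(id)
  }
  return chain
}

// Nest three resources so that each one is created inside the previous one
const a = new AsyncResource('A')
const b = a.runInAsyncScope(() => new AsyncResource('B'))
const c = b.runInAsyncScope(() => new AsyncResource('C'))

// Starts with the ids of c, b and a, then continues up to the root context
console.log(callChain(c.asyncId()))
```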

It is important to note that init is an asynchronous resource creation hook, not an asynchronous callback hook, and is executed only once when an asynchronous resource is created.

Request tracking

For exception detection and data analysis, we want our Node.js services on the Ada architecture to automatically add the request-id from the request header received from the client to the header of every request the server sends to middle-tier and back-end services.

A simple design for this feature is as follows:

  1. Use the init hook so that asynchronous resources on the same call chain share one storage object.
  2. Parse request-id from the request header and add it to the storage of the current asynchronous call chain.
  3. Override the request method of the http and https modules so that, when a request is made, it reads request-id from the storage of the current call chain.

Example code is as follows:

const http = require('http')
const { createHook, executionAsyncId } = require('async_hooks')
const fs = require('fs')

// Trace the call chain and create the call chain store object
const cache = {}
const hook = createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    if (type === 'TickObject') return
    // console.log is also asynchronous in Node.js and would trigger the init hook again, so we log synchronously
    fs.appendFileSync('log.out', `init ${type}(${asyncId}: trigger: ${triggerAsyncId})\n`)
    // Initialize the call chain store object if it does not exist yet
    if (!cache[triggerAsyncId]) {
      cache[triggerAsyncId] = {}
    }
    // Share the parent node's storage with the current asynchronous resource by reference
    cache[asyncId] = cache[triggerAsyncId]
  }
})
hook.enable()

// HTTP
const httpRequest = http.request
http.request = (options, callback) => {
  const client = httpRequest(options, callback)
  // Write the header to the request-id of the storage of the asynchronous resource to which the current request belongs
  const requestId = cache[executionAsyncId()].requestId
  console.log('cache', cache[executionAsyncId()])
  client.setHeader('request-id', requestId)

  return client
}

function timeout () {
  return new Promise((resolve, reject) => {
    setTimeout(resolve, Math.random() * 1000)
  })
}

// Create a service
http
  .createServer(async (req, res) => {
    // Get the request-id of the current request to write to storage
    cache[executionAsyncId()].requestId = req.headers['request-id']
    // Simulate some other time-consuming operations
    await timeout()
    // Send a request
    http.request('http://www.baidu.com', (res) => {})
    res.write('hello\n')
    res.end()
  })
  .listen(3000)

Running the code and sending a test request shows that request-id is obtained correctly.

Pitfall

It is also important to note that init is an asynchronous resource creation hook, not an asynchronous callback hook, and is executed only once when an asynchronous resource is created.

The problem with the code above is that, as the code in the earlier introduction to async_hooks demonstrates, one asynchronous resource can execute different functions repeatedly; that is, an asynchronous resource may be reused. This applies in particular to asynchronous resources such as TCP that are created by the C/C++ layer: multiple requests may use the same TCP asynchronous resource, so the init hook runs only once even though several requests reach the server. As a result, the call chains of those requests all trace back to the same triggerAsyncId and therefore reference the same store.
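The effect of a reused resource can be reproduced without any networking. The sketch below is only an illustration under stated assumptions: FakeConnection stands in for a connection-level resource such as TCP, FakeWork stands in for whatever asynchronous work a request starts, and onRequest is a hypothetical handler. It assumes Node.js v9.6.0 or later for runInAsyncScope.

```javascript
const { createHook, AsyncResource, executionAsyncId } = require('async_hooks')

// Same sharing strategy as the example above: children reuse the parent's store
const cache = {}
createHook({
  init (asyncId, type, triggerAsyncId) {
    if (!cache[triggerAsyncId]) cache[triggerAsyncId] = {}
    cache[asyncId] = cache[triggerAsyncId]
  }
}).enable()

// Stand-in for a resource that outlives a single request; init fires once for it
const connection = new AsyncResource('FakeConnection')

function onRequest (requestId) {
  // Write: every request handled on this connection sees the same store
  cache[executionAsyncId()].requestId = requestId
  // Work started by the request; init copies the shared store reference
  const work = new AsyncResource('FakeWork')
  // Read later, from inside the context of that work
  return () => work.runInAsyncScope(() => cache[executionAsyncId()].requestId)
}

const readFirst = connection.runInAsyncScope(onRequest, null, 'first')
const readSecond = connection.runInAsyncScope(onRequest, null, 'second')

console.log(readFirst(), readSecond()) // second second
```

Both reads return the id written by the later request, because the two requests never left the context of the one shared resource.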

To verify this, let’s make the following changes to the previous code. The store initialization section saves triggerAsyncId so you can see the trace relationship for asynchronous calls:

    if (!cache[triggerAsyncId]) {
      cache[triggerAsyncId] = { id: triggerAsyncId }
    }

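The timeout function now performs a long operation first and then a short one:

// Shared across calls: the first call waits 5000 ms, the second 1000 ms
const delays = [1000, 5000]
function timeout () {
  return new Promise((resolve, reject) => {
    setTimeout(resolve, delays.pop() || 1000)
  })
}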

Sending two requests with curl produces the following output:

{ id: 1, requestId: 'Id of second request' }
{ id: 1, requestId: 'Id of second request' }

This shows that when requests are concurrent and operations of unpredictable duration run between writing to and reading from the store, the value stored by the request that reached the server first is overwritten by the request that arrived later, so the earlier request reads the wrong value. You could, of course, make sure that no time-consuming operation ever sits between the write and the read, but a guarantee that depends on developers remembering it is clearly unreliable in a complex service. Instead, before each request reads or writes the store, the JS code needs to enter a new asynchronous resource context, that is, obtain a new asyncId, to avoid this kind of reuse. The call chain storage needs to be modified as follows:

const http = require('http')
const { createHook, executionAsyncId } = require('async_hooks')
const fs = require('fs')
const cache = {}

const httpRequest = http.request
http.request = (options, callback) => {
  const client = httpRequest(options, callback)
  const requestId = cache[executionAsyncId()].requestId
  console.log('cache', cache[executionAsyncId()])
  client.setHeader('request-id', requestId)

  return client
}

// Extract the initialization of storage into a separate method
async function cacheInit (callback) {
  // Use the await operation to make the code behind the await into a new asynchronous context
  await Promise.resolve()
  cache[executionAsyncId()] = {}
  // Execute callback here so that subsequent operations belong to the new asynchronous context
  return callback()
}

const hook = createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    if (!cache[triggerAsyncId]) {
      // The init hook no longer initializes the store
      return fs.appendFileSync('log.out', 'not initialized with cacheInit method')
    }
    cache[asyncId] = cache[triggerAsyncId]
  }
})
hook.enable()

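// Shared across calls: the first call waits 5000 ms, the second 1000 ms
const delays = [1000, 5000]
function timeout () {
  return new Promise((resolve, reject) => {
    setTimeout(resolve, delays.pop() || 1000)
  })
}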

http
.createServer(async (req, res) => {
  // Pass subsequent operations as callbacks to cacheInit
  await cacheInit(async function fn() {
    cache[executionAsyncId()].requestId = req.headers['request-id']
    await timeout()
    http.request('http://www.baidu.com', (res) => {})
    res.write('hello\n')
    res.end()
  })
})
.listen(3000)

It is worth noting that organizing the code around a callback in this way matches the Koa middleware pattern.

async function middleware (ctx, next) {
  await Promise.resolve()
  cache[executionAsyncId()] = {}
  return next()
}

Node.js v14

Creating a new asynchronous context with await Promise.resolve() always feels like a bit of a hack. Fortunately, Node.js v9.x provides an official way to create an asynchronous context: asyncResource.runInAsyncScope. Even better, Node.js v14.x provides an official implementation of asynchronous call chain data storage, which tracks asynchronous call relationships, creates new asynchronous contexts and manages the data for you. Without going into the details of the API, we'll go straight to adapting the previous implementation to it:

const http = require('http')
const { AsyncLocalStorage } = require('async_hooks')
// Create an asyncLocalStorage instance directly without managing async lifecycle hooks
const asyncLocalStorage = new AsyncLocalStorage()
const storage = {
  enable (callback) {
    // Use the run method to create a brand new store, and the subsequent operations need to be executed as callbacks to the run method to use the brand new asynchronous resource context
    asyncLocalStorage.run({}, callback)
  },
  get (key) {
    return asyncLocalStorage.getStore()[key]
  },
  set (key, value) {
    asyncLocalStorage.getStore()[key] = value
  }
}

// HTTP
const httpRequest = http.request
http.request = (options, callback) => {
  const client = httpRequest(options, callback)
  // Get the request-id of the asynchronous resource store and write it to header
  client.setHeader('request-id', storage.get('requestId'))

  return client
}

// Usage
http
  .createServer((req, res) => {
    storage.enable(async function () {
      // Get the request-id of the current request to write to storage
      storage.set('requestId', req.headers['request-id'])
      http.request('http://www.baidu.com', (res) => {})
      res.write('hello\n')
      res.end()
    })
  })
  .listen(3000)

As you can see, the official AsyncLocalStorage.run API is structurally consistent with our second implementation.
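To check that AsyncLocalStorage.run really isolates concurrent requests, the following sketch replays the earlier "long operation first, short operation second" scenario without an HTTP server. The handle function and the delay values are hypothetical stand-ins for two overlapping requests.

```javascript
const { AsyncLocalStorage } = require('async_hooks')

const asyncLocalStorage = new AsyncLocalStorage()

// Simulates one request: write the id, wait, then read it back
function handle (requestId, delay) {
  return asyncLocalStorage.run({}, async () => {
    asyncLocalStorage.getStore().requestId = requestId
    await new Promise((resolve) => setTimeout(resolve, delay))
    return asyncLocalStorage.getStore().requestId
  })
}

// The first request takes longer, so the second one writes and finishes in between
const results = Promise.all([handle('first', 30), handle('second', 5)])
results.then((ids) => console.log(ids)) // [ 'first', 'second' ]
```

Each call to run receives its own store object, so the later write no longer overwrites the value the earlier request reads.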

As a result, request tracing with the async_hooks module is easy to implement in Node.js v14.x.
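For services that cannot yet move to v14, the runInAsyncScope approach mentioned above could replace the await Promise.resolve() trick. The following is only a sketch under assumptions, not the implementation used in production: the RequestScope type name, the handle function and the seen array are hypothetical, store cleanup is omitted, and Node.js v9.6.0 or later is assumed.

```javascript
const { createHook, AsyncResource, executionAsyncId } = require('async_hooks')

// Stores are never deleted in this sketch; a real service would clean up in destroy
const cache = {}
createHook({
  init (asyncId, type, triggerAsyncId) {
    // Only propagate stores that cacheInit created; never create one here
    if (cache[triggerAsyncId]) cache[asyncId] = cache[triggerAsyncId]
  }
}).enable()

// Run callback inside a fresh asynchronous context that owns a fresh store
function cacheInit (callback) {
  const scope = new AsyncResource('RequestScope')
  cache[scope.asyncId()] = {}
  return scope.runInAsyncScope(callback)
}

const seen = []
function handle (requestId, delay) {
  cacheInit(() => {
    cache[executionAsyncId()].requestId = requestId
    // The timer is created inside the scope, so it inherits the scope's store
    setTimeout(() => seen.push(cache[executionAsyncId()].requestId), delay)
  })
}

handle('first', 30) // written first, read last
handle('second', 5) // written last, read first
setTimeout(() => console.log(seen), 60) // [ 'second', 'first' ]
```

Each simulated request reads back its own id even though the other request wrote to its store in between.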