Full-link tracing for Node.js applications

The two core elements of full-link tracing are acquiring full-link information, and storing and displaying that information.

This article is divided into three chapters:

  • Chapter 1 describes how to obtain full-link information in a Node.js application.
  • Chapter 2 introduces the full-link tracing system for Node.js applications.

Full-link tracing system for Node.js applications

1. Introduction

At present, mainstream Node.js architectures follow one of two designs:

  • General architecture: Node.js handles only SSR and BFF, not backend servers or microservices;
  • Full-scenario architecture: Node.js covers SSR, BFF, backend servers, and microservices.

The corresponding architecture description of the above two schemes is shown in the figure below:

In both architectures, Node.js faces the same problem:

As request links grow longer and more services are called, including various microservices, the following needs arise:

  • How to quickly pinpoint the problem when a request fails;
  • How to quickly find the cause of a slow response;
  • How to quickly locate the root cause of a problem from log files.

To meet these needs, we need a technique that aggregates the key information of each request and links all related calls together. With it, we can see how many service or microservice calls a single request involves, and in which context each call was made.

This technique is full-link tracing for Node.js applications. It is an essential safeguard for Node.js in complex server-side business scenarios.
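To make the idea of "aggregating and linking" concrete, here is a minimal sketch of the kind of records such a system might collect. The field names (traceId, spanId, parentSpanId), the sample data, and the helpers buildTraceTree and countCalls are assumptions made for illustration; they do not come from any particular tracing system.

```javascript
// Hypothetical records collected for ONE incoming request (one traceId).
// Each record describes a single call and points to the call that made it.
const spans = [
  { traceId: 't1', spanId: 'a', parentSpanId: null, name: 'GET /page (SSR)' },
  { traceId: 't1', spanId: 'b', parentSpanId: 'a', name: 'BFF: getUser' },
  { traceId: 't1', spanId: 'c', parentSpanId: 'b', name: 'microservice: user-service' },
  { traceId: 't1', spanId: 'd', parentSpanId: 'a', name: 'BFF: getOrders' }
]

// Link the flat records into a tree so the calling context of each call is visible.
function buildTraceTree (records) {
  const nodes = new Map(records.map((r) => [r.spanId, { ...r, children: [] }]))
  let root = null
  for (const node of nodes.values()) {
    if (node.parentSpanId === null) {
      root = node
    } else {
      nodes.get(node.parentSpanId).children.push(node)
    }
  }
  return root
}

// Count how many downstream calls a request triggered, directly or indirectly.
function countCalls (node) {
  return node.children.reduce((sum, child) => sum + 1 + countCalls(child), 0)
}

const tree = buildTraceTree(spans)
console.log(tree.name, 'triggered', countCalls(tree), 'calls')
```

With records like these, "how many calls does a request contain" and "in which context was a call made" become simple tree queries.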

In short, Node.js applications need full-link tracing. Having explained why, we now turn to how full-link information is obtained in a Node.js application.

2. Full-link information acquisition

Full-link information acquisition is the most important part of full-link tracing. Storage and display can only happen after the link information has been obtained.

For multithreaded languages such as Java and Python, full-link information is obtained with the help of a thread-local context such as ThreadLocal. Node.js, however, is single-threaded and performs asynchronous operations through I/O callbacks, so it has no natural equivalent, and obtaining full-link information is difficult. So how can this problem be solved?
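The difficulty can be illustrated with a small sketch. It assumes a naive design in which the "current request" is kept in a single module-level variable (currentRequestId and handle are hypothetical names). Because all requests share one thread and interleave at every asynchronous boundary, the variable is overwritten by whichever request ran last.

```javascript
// Naive "global context": one variable shared by every in-flight request.
let currentRequestId = null

async function handle (id, delayMs) {
  currentRequestId = id
  // Simulate asynchronous I/O; other requests run while this one waits.
  await new Promise((resolve) => setTimeout(resolve, delayMs))
  // After the wait, the variable may belong to a different request.
  return { id, seen: currentRequestId }
}

// Two requests handled concurrently on the single thread.
const results = Promise.all([handle('req-1', 20), handle('req-2', 5)])

results.then((r) => console.log(r))
// req-1 reports seen: 'req-2', because req-2 overwrote the variable while req-1 was waiting.
```

A thread-per-request runtime avoids this because each thread has its own storage; in Node.js the context has to follow the chain of asynchronous calls instead.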

3. Industry solutions

Node.js is designed around a single thread and non-blocking I/O. For full-link information acquisition, there are four main schemes so far:

  • domain: a Node.js API;
  • zone.js: a product of the Angular community;
  • Explicit passing: passing the context manually or mounting it in middleware;
  • Async Hooks: a Node.js API.

Of these four schemes, domain has been deprecated because of serious memory leaks. zone.js takes a brute-force approach, its API is obscure, and its biggest drawback is that monkey patching can only patch APIs, not language features. Explicit passing is too cumbersome and intrusive. After a comprehensive comparison, the fourth scheme, Async Hooks, works best. It has the following advantages:

  • It is a core module added in Node 8.x, it is used by the Node.js maintainers themselves, and it does not leak memory;
  • It is well suited to implicit link tracing: it is minimally intrusive and is currently the best option for implicit tracing;
  • It provides APIs to track the life cycle of asynchronous resources in Node.js;
  • It makes it possible to associate context across asynchronous calls with async_hooks.

We therefore use Async Hooks to obtain full-link information.
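For comparison, the explicit-passing scheme can be sketched as follows. This is only an illustration of why it is considered intrusive; the function names (handleRequest, getUser, queryDb) and the shape of ctx are hypothetical.

```javascript
// Explicit passing: every function on the call path must accept and forward `ctx`.
function handleRequest (traceId) {
  const ctx = { traceId, calls: [] }
  return getUser(ctx, 42)
}

function getUser (ctx, userId) {
  ctx.calls.push(`getUser(${userId})`)
  // The context has to be handed on manually, even if getUser itself does not need it.
  return queryDb(ctx, 'users')
}

function queryDb (ctx, table) {
  ctx.calls.push(`queryDb(${table})`)
  return { traceId: ctx.traceId, calls: ctx.calls }
}

const result = handleRequest('trace-1')
console.log(result)
```

Every signature between the entry point and the lowest-level call has to change, including code in third-party libraries that cannot easily be changed. Implicit tracing with Async Hooks removes that requirement.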

4. async_hooks

The official documentation describes async_hooks as a module for tracking asynchronous resources, that is, for observing the lifetime of asynchronous resources.

The async_hooks module provides an API to track asynchronous resources.

Because it tracks asynchronous resources, every asynchronous resource has two IDs:

  • asyncId: the ID of the asynchronous resource for its current life cycle
  • triggerAsyncId: the ID of the parent asynchronous resource, that is, the parentAsyncId

Both can be obtained through the following APIs:

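const async_hooks = require('async_hooks');

// ID of the asynchronous resource whose callback is currently executing
const asyncId = async_hooks.executionAsyncId();

// ID of the resource that triggered the current one (its parent)
const triggerAsyncId = async_hooks.triggerAsyncId();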

See the official async_hooks API documentation for more details.
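The relationship between the two IDs can be checked with a small sketch. The names outerId and observed are introduced here for illustration only; the sketch records both IDs inside a timer callback and compares them with the ID of the context that scheduled the timer.

```javascript
const async_hooks = require('async_hooks')

// ID of the execution context that schedules the timer.
const outerId = async_hooks.executionAsyncId()
const observed = {}

setTimeout(() => {
  // Inside the callback we are executing within the Timeout resource:
  // executionAsyncId() is the Timeout's own ID,
  // triggerAsyncId() is the ID of the context that created the Timeout.
  observed.executionAsyncId = async_hooks.executionAsyncId()
  observed.triggerAsyncId = async_hooks.triggerAsyncId()
  console.log(`outer: ${outerId}, inside timer: ${observed.executionAsyncId}, trigger: ${observed.triggerAsyncId}`)
}, 0)
```

The trigger ID seen inside the callback equals outerId, which is exactly the parent-child link that a tracer follows to reconstruct the call chain.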

Since async_hooks tracks asynchronous resources, what are those resources? In everyday projects we commonly use the following:

  • Promise
  • setTimeout
  • fs, net, process, and other modules built on the underlying APIs

The official documentation, however, lists many more types. Besides those mentioned above, even console.log involves an asynchronous resource: TickObject.

FSEVENTWRAP, FSREQCALLBACK, GETADDRINFOREQWRAP, GETNAMEINFOREQWRAP, HTTPINCOMINGMESSAGE,
HTTPCLIENTREQUEST, JSSTREAM, PIPECONNECTWRAP, PIPEWRAP, PROCESSWRAP, QUERYWRAP,
SHUTDOWNWRAP, SIGNALWRAP, STATWATCHER, TCPCONNECTWRAP, TCPSERVERWRAP, TCPWRAP,
TTYWRAP, UDPSENDWRAP, UDPWRAP, WRITEWRAP, ZLIB, SSLCONNECTION, PBKDF2REQUEST,
RANDOMBYTESREQUEST, TLSWRAP, Microtask, Timeout, Immediate, TickObject
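A few of these type names can be observed directly. The following sketch (seenTypes is a name chosen for illustration) records the type argument that the init callback receives while a timer, an immediate, and a nextTick callback are scheduled.

```javascript
const async_hooks = require('async_hooks')

const seenTypes = []

// init is called synchronously when a resource is created, so the types are
// collected in a plain array instead of being printed from inside the hook.
const hook = async_hooks.createHook({
  init (asyncId, type) {
    seenTypes.push(type)
  }
}).enable()

setTimeout(() => {}, 0) // creates a Timeout resource
setImmediate(() => {}) // creates an Immediate resource
process.nextTick(() => {}) // creates a TickObject resource

hook.disable()

// Printing is safe here because the hook is already disabled.
console.log(seenTypes)
```

The recorded list is expected to contain Timeout, Immediate, and TickObject, three of the types from the list above.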

async_hooks.createHook

We can use asyncId to follow an asynchronous resource through its life cycle.

Create a hook with async_hooks.createHook:

const async_hooks = require('async_hooks')

const asyncHook = async_hooks.createHook({
  // asyncId: ID of the asynchronous resource
  // type: type of the asynchronous resource
  // triggerAsyncId: ID of the parent asynchronous resource
  // resource: the object that represents the asynchronous resource
  init (asyncId, type, triggerAsyncId, resource) {},
  before (asyncId) {},
  after (asyncId) {},
  destroy (asyncId) {}
})

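We focus on the four most important callbacks:

  • init: called when an asynchronous resource is created. In this callback we can obtain the call chain of the resource (through triggerAsyncId) as well as its type, both of which are important.
  • destroy: called when an asynchronous resource is destroyed. Note that a setTimeout resource can be destroyed, whereas a Promise may never be destroyed, so a CLS (continuation-local storage) implementation built on async_hooks could leak memory here!
  • before: called just before the callback of the asynchronous resource runs.
  • after: called right after the callback of the asynchronous resource has finished.

In the following example, the comments mark where the before and after hooks of the Timeout resource fire (op stands for arbitrary synchronous work):

// Placeholder for any synchronous operation
const op = () => {}

setTimeout(() => {
  // The before hook fires just before this callback starts
  console.log('Async Before')
  op()
  op()
  op()
  op()
  op()
  // The after hook fires once this callback has finished
  console.log('Async After')
})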
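The CLS idea mentioned in the destroy bullet can be sketched with only the init and destroy callbacks. This is a simplified illustration under stated assumptions, not a production implementation: contexts, setContext, getContext, and handleRequest are hypothetical names, and context is inherited purely through triggerAsyncId.

```javascript
const async_hooks = require('async_hooks')

// asyncId -> context object of the request that resource belongs to
const contexts = new Map()

async_hooks.createHook({
  init (asyncId, type, triggerAsyncId) {
    // A new resource inherits the context of the resource that triggered it.
    if (contexts.has(triggerAsyncId)) {
      contexts.set(asyncId, contexts.get(triggerAsyncId))
    }
  },
  destroy (asyncId) {
    // If destroy is never called for a resource, its entry stays in the Map:
    // this is where the leak described above would come from.
    contexts.delete(asyncId)
  }
}).enable()

function setContext (ctx) {
  contexts.set(async_hooks.executionAsyncId(), ctx)
}

function getContext () {
  return contexts.get(async_hooks.executionAsyncId())
}

const seen = []

function handleRequest (traceId) {
  setContext({ traceId })
  setTimeout(() => {
    // No context argument was passed in, yet the right one is found.
    seen.push(`${traceId} -> ${getContext().traceId}`)
  }, 10)
}

// Each "request" starts in its own asynchronous resource.
setImmediate(() => handleRequest('trace-a'))
setImmediate(() => handleRequest('trace-b'))
```

Each timer callback finds the context of the request that scheduled it, even though the two requests interleave. Recent Node.js versions also ship AsyncLocalStorage in the async_hooks module, which provides this kind of storage without hand-written hooks.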

async_hooks debugging and testing

const fs = require('fs')
const async_hooks = require('async_hooks')
async_hooks.createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    fs.writeSync(1, `${type}(${asyncId}): trigger: ${triggerAsyncId}\n`)
  },
  destroy (asyncId) {
    fs.writeSync(1, `destroy: ${asyncId}\n`);
  }
}).enable()
async function A () {
  fs.writeSync(1, `A -> ${async_hooks.executionAsyncId()}\n`)
  setTimeout(() => {
    fs.writeSync(1, `A in setTimeout -> ${async_hooks.executionAsyncId()}\n`)
    B()
  })
}
async function B () {
  fs.writeSync(1, `B -> ${async_hooks.executionAsyncId()}\n`)
  process.nextTick(() => {
    fs.writeSync(1, `B in process.nextTick -> ${async_hooks.executionAsyncId()}\n`)
    C()
    C()
  })
}
function C () {
  fs.writeSync(1, `C -> ${async_hooks.executionAsyncId()}\n`)
  Promise.resolve().then(() => {
    fs.writeSync(1, `C in promise.then -> ${async_hooks.executionAsyncId()}\n`)
  })
}
fs.writeSync(1, `top level -> ${async_hooks.executionAsyncId()}\n`)
A()

async_hooks.createHook can register four callbacks, init, before, after, and destroy, to track the corresponding events of all asynchronous resources. Call .enable() on the returned hook to turn tracking on, and .disable() to turn it off.
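The effect of .enable() and .disable() can be seen in isolation with a short sketch. The counter names (initCount, countAfterDisable) are chosen for illustration; only resources created while the hook is enabled are reported to init.

```javascript
const async_hooks = require('async_hooks')

let initCount = 0
const hook = async_hooks.createHook({
  init () {
    initCount++
  }
})

setTimeout(() => {}, 0) // not reported: the hook has not been enabled yet

hook.enable()
setTimeout(() => {}, 0) // reported to init
hook.disable()

const countAfterDisable = initCount

setTimeout(() => {}, 0) // not reported: the hook is disabled again

console.log({ countAfterDisable, finalCount: initCount })
```

The count stops changing once the hook is disabled, so a tracer can switch tracking on and off at runtime.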

The test program (functions A, B, and C) only handles the init and destroy events of asynchronous resources and prints them to standard output with fs.writeSync(1, msg). The first argument of writeSync is a file descriptor, and 1 denotes standard output. Why not use console.log? Because console.log is itself an asynchronous operation: calling it inside the init, before, after, or destroy handlers would cause infinite recursion. No other asynchronous operation may be used there either.

Running the test program prints the following: