Understanding the Node.js Event Loop, Timers, and process.nextTick()

Translator's note:

Why translate this? Before starting, I searched for existing Chinese translations and found none that read clearly, so I decided to write my own. My ability is limited, so if you spot mistakes in the article, corrections are welcome.

There are a few small questions at the end of the article that we can think through together.

If you are interested in NodeJs, please visit S.E.L.D. or contact me on WeChat (w979436427) to discuss your Node learning experience.

What is an Event Loop?

Although JavaScript is single-threaded, the Event loop lets NodeJs perform non-blocking I/O by offloading operations to the system kernel whenever possible.

Since most modern system kernels are multithreaded, they can perform multiple operations in the background. When one of these operations is complete, the kernel notifies NodeJs so that the specified callback is added to the poll queue for final execution. We’ll talk more about this in a later chapter.

The Event Loop explained

When NodeJs starts, it initializes the Event loop and executes the provided input script (running a script in the REPL is beyond the scope of this article). The script may call asynchronous APIs, schedule timers, or call process.nextTick(); after that, NodeJs begins processing the Event loop.

NodeJs executes synchronous code first, and asynchronous APIs may be called while it runs. Once the synchronous code and any process.nextTick() callbacks have completed, the Event loop starts.
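A minimal sketch of this ordering, assuming a plain script run with node. The `order` array and its labels are illustrative names, not part of any Node.js API:

```javascript
// Records the order in which each piece of code runs.
const order = [];

setTimeout(() => order.push('timeout'), 0);      // runs in the timers phase
process.nextTick(() => order.push('nextTick'));  // runs before the loop starts
order.push('sync');                              // ordinary synchronous code

// Print the recorded order once everything above has had a chance to run.
setTimeout(() => console.log(order.join(' -> ')), 20);
```

Run as a script, this should print sync -> nextTick -> timeout: the synchronous push happens first, the process.nextTick() callback runs before the Event loop starts, and the timer callback runs once the loop reaches the timers phase.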

Here is a brief overview of the Event loop's order of operations:

   ┌───────────────────────┐
┌─>│        timers         │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
│  │     I/O callbacks     │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
│  │     idle, prepare     │
│  └──────────┬────────────┘      ┌───────────────┐
│  ┌──────────┴────────────┐      │   incoming:   │
│  │         poll          │<─────┤  connections, │
│  └──────────┬────────────┘      │   data, etc.  │
│  ┌──────────┴────────────┐      └───────────────┘
│  │        check          │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
└──┤    close callbacks    │
   └───────────────────────┘

Note: Each box represents a stage in the Event loop

Each phase has a FIFO (first-in, first-out) callback queue waiting to execute. Although each stage is unique in its own way, in general, when the Event loop reaches the specified stage, it performs any operations in that stage and performs the corresponding callbacks until there are no callbacks available in the queue or the callback execution limit is reached, and then the Event loop moves on to the next stage.

Any of these operations may schedule more operations, and new events processed in the poll phase are queued by the kernel, so poll events can be queued while other poll events are being processed. As a result, long-running callbacks can keep the poll phase running much longer than a timer's threshold.

Note: There are slight implementation differences between Windows and Unix/Linux, but they are not important for this article. There are actually seven or eight steps, but the ones that matter, those Node.js actually uses, are listed above.

Phase overview

timers: executes callbacks scheduled by setTimeout() and setInterval()
I/O callbacks: executes almost all callbacks, with the exception of close callbacks, timer callbacks, and setImmediate() callbacks
idle, prepare: used internally only
poll: retrieves new I/O events; node blocks here when appropriate (== when is it appropriate? ==)
check: setImmediate() callbacks are invoked here
close callbacks: for example, socket.on('close', ...)

After each pass of the Event loop, Node.js checks whether it is still waiting for any asynchronous I/O or timers, and if not, the process exits.

Stage details

timers

A timer specifies a threshold after which the given callback may be executed, not the exact time at which it will be executed. Timer callbacks run as early as they can be scheduled once the threshold has passed, but operating system scheduling or other running callbacks may delay them.

Note: Strictly speaking, the poll phase controls when timers are executed.
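The effect of other code on a timer can be observed with a small sketch. It assumes a 10ms timer and a busy loop of about 50ms standing in for "other work"; the variable names are illustrative only:

```javascript
const scheduledAt = Date.now();
let observedDelay = null;

// Ask for the callback after 10ms.
setTimeout(() => {
  observedDelay = Date.now() - scheduledAt;
  console.log(`asked for 10ms, ran after ${observedDelay}ms`);
}, 10);

// Keep the single JavaScript thread busy for about 50ms.
// The timer cannot fire until this synchronous loop has finished.
const blockUntil = Date.now() + 50;
while (Date.now() < blockUntil) {
  // do nothing
}
```

The reported delay should be at least 50ms, because the threshold is only a lower bound on when the callback may run.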

As an example, suppose a timer is given a threshold of 100ms and it takes 95ms to read a file asynchronously

const fs = require('fs');

function someAsyncOperation(callback) {
  // Assume 95ms is spent here
  fs.readFile('/path/to/file', callback);
}

const timeoutScheduled = Date.now();

setTimeout(function() {

  const delay = Date.now() - timeoutScheduled;

  console.log(delay + 'ms have passed since I was scheduled');
}, 100);


// the asynchronous operation is complete after 95ms
someAsyncOperation(function() {

  const startCallback = Date.now();

  // This takes 10ms
  while (Date.now() - startCallback < 10) {
    // do nothing
  }
});

In this case, when the Event loop enters the poll phase, its queue is empty (fs.readFile() has not completed yet), so it waits there until the soonest timer's threshold is reached. While it waits, 95ms pass, fs.readFile() finishes reading the file, and its callback, which takes 10ms to complete, is added to the poll queue and executed. When the callback finishes and there are no more callbacks in the queue, the Event loop sees that the threshold of the soonest timer has been reached and wraps back to the timers phase to execute the timer's callback. In this example, 105ms pass between the timer being scheduled and its callback being executed.

Note: To prevent the poll phase from starving the Event loop, libuv (libuv.org/) also has a hard maximum (system dependent) before it stops polling for more events.

I/O callbacks stage

This phase executes callbacks for some system operations, such as certain types of TCP errors. For example, if a TCP socket receives ECONNREFUSED when attempting to connect, some *nix systems want to wait to report the error. That error is queued to execute in the I/O callbacks phase.

Poll phase

The poll phase has two functions:

Executing scripts for timers whose threshold has been reached
Processing events in the poll queue

When the Event loop enters the poll phase and no timers are scheduled, one of two things happens:

If the poll queue is not empty, the Event loop iterates through its queue of callbacks, executing them synchronously until the queue is exhausted or the system-dependent hard limit is reached
If the poll queue is empty, one of two more things happens:
- If the script has scheduled callbacks with setImmediate(), the Event loop ends the poll phase and continues to the check phase to execute them
- If the script has not scheduled anything with setImmediate(), the Event loop waits for callbacks to be added to the queue, then executes them immediately

Once the poll queue is empty, the Event loop checks for timers whose thresholds have been reached. If one or more timers are ready, the Event loop wraps back to the timers phase to execute those timers' callbacks.
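The branching described for the poll phase can be summarized as a small decision function. This is a simplified, hypothetical model for illustration only: pollDecision, its parameters, and its return strings are made-up names, and the real logic lives inside libuv and Node.js internals.

```javascript
// Simplified, hypothetical model of the poll-phase decision.
// It is NOT libuv's implementation; names and return strings are made up.
function pollDecision({ pollQueueLength, immediatesPending, msUntilNextTimer }) {
  if (pollQueueLength > 0) {
    return 'run queued poll callbacks';
  }
  if (immediatesPending) {
    return 'end poll phase, go to check phase';
  }
  if (msUntilNextTimer === null) {
    // No timers at all: nothing bounds the wait.
    return 'block until an I/O event arrives';
  }
  if (msUntilNextTimer <= 0) {
    return 'wrap back to timers phase';
  }
  // A timer is pending: wait for I/O, but no longer than its threshold.
  return `block for at most ${msUntilNextTimer}ms`;
}

console.log(pollDecision({ pollQueueLength: 0, immediatesPending: true, msUntilNextTimer: 5 }));
console.log(pollDecision({ pollQueueLength: 0, immediatesPending: false, msUntilNextTimer: 0 }));
```

The model only captures the order in which the conditions are considered; it leaves out details such as the hard limit on how many callbacks are processed.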

The check phase

This phase lets callbacks run immediately after the poll phase has completed. If the poll phase becomes idle and scripts have been queued with setImmediate(), the Event loop continues to the check phase rather than waiting.

setImmediate() is actually a special timer that runs in a separate phase of the Event loop. It uses a libuv API that schedules callbacks to execute after the poll phase has completed.

Typically, as the code executes, the Event loop eventually reaches the poll phase, where it waits for new events (such as incoming connections, requests, and so on). However, if a callback has been scheduled with setImmediate() and the poll phase becomes idle, the Event loop ends the poll phase and continues to the check phase instead of waiting for poll events.

The close callbacks phase

If a socket or handle is closed abruptly (for example, socket.destroy()), the 'close' event is emitted in this phase. Otherwise it is emitted via process.nextTick().

setImmediate() and setTimeout()

setImmediate() and setTimeout() look similar, but they behave differently depending on when they are called.

setImmediate() is designed to execute a script once the current poll phase completes
setTimeout() schedules a script to run after a minimum threshold in milliseconds has elapsed

The order in which the two callbacks run depends on the context in which they are called. If both are called from the main module, the timing is bound by the performance of the process (which can be affected by other applications running on the same machine).

For example, if the following script is not run inside an I/O loop (that is, it runs from the main module), the order in which the two timers fire is non-deterministic (== why? ==), because it is bound by the performance of the process:

// timeout_vs_immediate.js
setTimeout(function timeout() {
  console.log('timeout');
}, 0);

setImmediate(function immediate() {
  console.log('immediate');
});

$ node timeout_vs_immediate.js
timeout
immediate

$ node timeout_vs_immediate.js
immediate
timeout

But if you move the two calls inside an I/O loop (that is, inside an I/O callback), the setImmediate() callback is always executed first:

// timeout_vs_immediate.js
const fs = require('fs');

fs.readFile(__filename, () => {
  setTimeout(() => {
    console.log('timeout');
  }, 0);
  setImmediate(() => {
    console.log('immediate');
  });
});


$ node timeout_vs_immediate.js
immediate
timeout

$ node timeout_vs_immediate.js
immediate
timeout

The main advantage of using setImmediate() over setTimeout() is that, when scheduled inside an I/O loop, setImmediate() always runs before any timers, no matter how many timers are present.

process.nextTick()

Understanding process.nextTick()

You may have noticed that process.nextTick() does not appear in the diagram above, even though it is part of the asynchronous API. This is because process.nextTick() is not technically part of the Event loop. Instead, the nextTickQueue is processed after the current operation completes, regardless of the current phase of the Event loop.

Looking back at the diagram, any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() are resolved before the Event loop continues. This can create some bad situations, because it allows you to "starve" your I/O by making recursive process.nextTick() calls, which prevents the Event loop from reaching the poll phase and receiving new I/O events.
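A bounded sketch of this starvation effect, assuming a chain of 1000 recursive process.nextTick() calls (an unbounded chain would never let the loop continue). The names events, remaining, and spin are illustrative:

```javascript
const events = [];
let remaining = 1000;

// This callback needs the Event loop to reach the check phase.
setImmediate(() => events.push('immediate'));

// Each call re-queues itself with process.nextTick(), so the whole chain
// is drained before the Event loop is allowed to move on.
function spin() {
  if (remaining-- > 0) {
    process.nextTick(spin);
  } else {
    events.push('nextTick chain finished');
  }
}
process.nextTick(spin);

setTimeout(() => console.log(events), 20);
```

Although setImmediate() was called first, its callback is recorded only after the entire nextTick chain has finished.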

Why is this allowed?

So why is something like this included in Node.js? Part of the reason is a design philosophy of Node.js: an API should always be asynchronous, even where it doesn't have to be. Here's an example:

function apiCall(arg, callback) {
  if (typeof arg !== 'string')
    return process.nextTick(callback,
                            new TypeError('argument should be string'));
}

This snippet validates an argument and, if it is not correct, passes the error to the callback. process.nextTick() was recently updated so that any arguments passed after the callback are propagated as the callback's arguments, so you don't have to nest functions.

What we are doing here is passing the error back to the user, but only after the rest of the user's (synchronous) code has had a chance to run. By using process.nextTick() we guarantee that apiCall() always runs its callback after the rest of the user's code and before the Event loop is allowed to proceed. To achieve this, the JS call stack is allowed to unwind (== what is stack unwinding? ==) and the provided callback is then executed immediately, which allows recursive calls to process.nextTick() (== how? ==) without hitting a RangeError: Maximum call stack size exceeded from V8.

This philosophy can lead to some potentially problematic situations. Take this snippet, for example:
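To make the guarantee concrete, here is a sketch that completes the apiCall() example with a success path. The success branch (upper-casing the argument) and the log array are assumptions added purely for illustration:

```javascript
function apiCall(arg, callback) {
  if (typeof arg !== 'string') {
    // Deliver the error asynchronously, after the caller's code has run.
    return process.nextTick(callback, new TypeError('argument should be string'));
  }
  // Hypothetical success path, also deferred so the API is always async.
  process.nextTick(callback, null, arg.toUpperCase());
}

const log = [];

apiCall(42, (err) => log.push(`callback: ${err.message}`));
log.push('after apiCall');

setImmediate(() => console.log(log));
```

The 'after apiCall' entry is recorded before the callback's entry, even though the argument check itself needs no asynchronous work.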

let bar;

// this has an asynchronous signature, but calls callback synchronously
function someAsyncApiCall(callback) { callback(); }

// the callback is called before `someAsyncApiCall` completes.
someAsyncApiCall(() => {

  // since someAsyncApiCall hasn't completed, bar hasn't been assigned any value
  console.log('bar', bar); // undefined

});

bar = 1;

The user defines someAsyncApiCall() with an asynchronous signature (as its name suggests), but it actually operates synchronously. When it is called, the callback it is given runs in the same phase of the Event loop, because someAsyncApiCall() doesn't actually do anything asynchronously. As a result, the callback tries to read the variable bar before it has been assigned a value, because the script has not yet run to completion.

By placing the callback in a process.nextTick(), the script can still run to completion (all synchronous code executes), so all variables, functions, and so on are initialized before the callback is called. It also has the advantage of not allowing the Event loop to continue. It can be useful to alert the user to an error before the Event loop is allowed to continue. Here is the previous example rewritten with process.nextTick():

let bar;

function someAsyncApiCall(callback) {
  process.nextTick(callback);
}

someAsyncApiCall(() => {
  console.log('bar', bar); // 1
});

bar = 1;

Here’s a practical example:

const net = require('net');

const server = net.createServer(() => {}).listen(8080);

server.on('listening', () => {});

When only a port is passed, the port is bound immediately, so the 'listening' callback could be called immediately. The problem is that the .on('listening') callback has not been registered by that time.

To get around this, the 'listening' event is queued in a nextTick() so that the script can run to completion first (all synchronous code executes). This lets users set up whatever event handlers they need (in synchronous code).

process.nextTick() and setImmediate()

As far as users are concerned, these two calls are very similar, but their names are confusing.

process.nextTick() fires immediately, in the same phase
setImmediate() fires on the following iteration, or "tick", of the Event loop

In essence, the names should be swapped: process.nextTick() fires more immediately than setImmediate(). However, this is an artifact of the past that is unlikely to change. Swapping the names would break a large share of the packages on npm, and with more new packages being added every day, the potential breakage grows the longer we wait. So although the names are confusing, they will not change.
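The difference can be seen by scheduling both from inside a setImmediate() callback, that is, from within the check phase. This is a small sketch; the order array and its labels are illustrative names:

```javascript
const order = [];

setImmediate(() => {
  order.push('immediate 1');

  // Scheduled from the check phase: runs on the next loop iteration.
  setImmediate(() => order.push('immediate 2'));

  // Runs as soon as the current callback returns, in the same phase.
  process.nextTick(() => order.push('nextTick'));
});

setTimeout(() => console.log(order.join(' -> ')), 20);
```

The process.nextTick() callback is recorded between the two setImmediate() callbacks, even though it was scheduled last.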

We recommend that developers use setImmediate() in all situations, as this allows your code to be compatible with more environments such as browsers.

Why use process.nextTick()?

There are two main reasons:

To let users handle errors, clean up resources that are no longer needed, or retry a request before the Event loop continues
To allow a callback to run after the call stack has unwound but before the Event loop continues, which is sometimes necessary

One example is matching the user's expectations. Consider this simple case:

const net = require('net');
const server = net.createServer();
server.on('connection', function(conn) {});
server.listen(8080);
server.on('listening', function() {});

Suppose listen() is run at the beginning of the Event loop, but the listening callback is placed in a setImmediate(). Unless a hostname is passed, binding to the port happens immediately. For the Event loop to proceed, it must reach the poll phase, which means there is a non-zero chance that a connection is received first, allowing the connection event to fire before the listening event.

Another example is running a constructor that inherits from EventEmitter and emits an event from within the constructor:

const EventEmitter = require('events');
const util = require('util');

function MyEmitter() {
  EventEmitter.call(this);
  this.emit('event');
}
util.inherits(MyEmitter, EventEmitter);

const myEmitter = new MyEmitter();
myEmitter.on('event', function() {
  console.log('an event occurred!');
});

You can't emit an event from the constructor immediately, because the script has not yet reached the point where the user assigns a callback to that event. Instead, within the constructor itself, you can use process.nextTick() to schedule the emit for after the constructor has finished, which gives the expected result:

const EventEmitter = require('events');
const util = require('util');

function MyEmitter() {
  EventEmitter.call(this);

  // use nextTick to emit the event once a handler is assigned
  process.nextTick(function() {
    this.emit('event');
  }.bind(this));
}
util.inherits(MyEmitter, EventEmitter);

const myEmitter = new MyEmitter();
myEmitter.on('event', function() {
  console.log('an event occurred!');
});
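The same contrast can be written with class syntax. This is a sketch, not the original article's code: EagerEmitter, DeferredEmitter, and received are hypothetical names chosen for illustration.

```javascript
const EventEmitter = require('events');

// Emits synchronously inside the constructor: no listener can exist yet.
class EagerEmitter extends EventEmitter {
  constructor() {
    super();
    this.emit('event');
  }
}

// Defers the emit so listeners attached right after `new` are in place.
class DeferredEmitter extends EventEmitter {
  constructor() {
    super();
    process.nextTick(() => this.emit('event'));
  }
}

const received = [];
new EagerEmitter().on('event', () => received.push('eager'));
new DeferredEmitter().on('event', () => received.push('deferred'));

setImmediate(() => console.log(received)); // prints [ 'deferred' ]
```

Only the deferred emitter's listener is called; the eager emitter fires its event before any listener has been attached, so that event is lost.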

Q&A

After translating this article, I was left with a few questions:

When does the poll phase block?
Why is the execution order of setTimeout and setImmediate not deterministic outside an I/O loop?
What does it mean to unwind the JS call stack?
Why can process.nextTick() be called recursively?

I will discuss these questions in a follow-up article, "Q&A: Understanding the Event Loop, Timers and process.nextTick() in NodeJs". If you are interested, you can follow my official account, Front-end S.E.L.D. -NodeJs series, for the latest updates.

Original address: github.com/nodejs/node…