Last week I wrote down some thoughts on asynchronous programming in JS, tracing its evolution step by step from counter-intuitive asynchronous callbacks to the async/await keywords, which let asynchronous code be written as if it ran synchronously and in sequence. That was a qualitative leap in my ability to handle asynchronous programming, and it led me to the idea that "the highest state of asynchronous programming is not having to care whether the code is asynchronous or not".

So here are the questions:

How is the asynchrony of Node.js implemented when JS itself is single-threaded?

What are the benefits, limitations and bottlenecks of Node.js's asynchronous design?

Node.js architecture

Node.js is mainly divided into four parts: Node Standard Library, Node Bindings, V8 and Libuv. The structure diagram of Node.js is as follows:

As you can see, the structure of Node.js is roughly divided into three layers:

  • Node Standard Library: the standard library we use every day, such as the http, Buffer and fs modules. These are all written in JavaScript and can be loaded directly with require(..).
  • Node Bindings: the bridge between JS and C++. It encapsulates the details of V8 and Libuv and exposes basic APIs to the layer above.
  • The third layer is the key to running Node.js and is implemented in C/C++:
  • V8: the JavaScript engine developed by Google. It gives JavaScript an environment to run in outside the browser, so it can be called the engine of Node.js. Its efficiency is one of the reasons Node.js performs so well.
  • Libuv: provides Node.js with cross-platform support, a thread pool, an event loop and asynchronous I/O, and is what makes Node.js so powerful.
  • C-ares: provides asynchronous handling of DNS-related operations.
  • http_parser, OpenSSL, zlib, etc.: provide other capabilities such as HTTP parsing, SSL and data compression.
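
One quick way to see these components from inside a running program is process.versions, which Node.js fills in with the versions of its bundled dependencies. The following is only a small sketch: the helper name bundledComponents is made up for illustration, and because the exact set of keys can differ between Node.js builds, it reports only the keys it actually finds.

```javascript
// List the bundled components named above, as reported by the running binary.
// `bundledComponents` is a hypothetical helper name used only in this sketch.
function bundledComponents() {
  const wanted = ['node', 'v8', 'uv', 'ares', 'zlib', 'openssl'];
  const found = {};
  for (const name of wanted) {
    // Keys that are missing in a given build are skipped rather than guessed.
    if (typeof process.versions[name] === 'string') {
      found[name] = process.versions[name];
    }
  }
  return found;
}

console.log(bundledComponents());
```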

Libuv architecture

Below is the architecture diagram of Libuv on the official website

From left to right, the diagram has two parts: one for network I/O requests, and the other for file I/O, DNS ops and user code.

As you can see from the figure, the underlying mechanisms supporting asynchronous processing are completely different for Network I/O and another type of request, represented by File I/O.

For network I/O requests, Libuv uses a different mechanism on each OS platform: epoll on Linux, kqueue on OSX and BSD, event ports on SunOS, and IOCP on Windows.

For requests of the kind represented by file I/O, a thread pool is used. Implementing asynchronous request handling with a thread pool has the advantage of being well supported on every kind of OS.

For example

var fs = require('fs');
fs.open('./test.txt', 'w', function (err, fd) {
	// .. do something
});

The call chain is lib/fs.js → src/node_file.cc → uv_fs.

The general flow chart is as follows:

Specifically, fs.open(..) opens the file at the given path with the given flags and obtains a file descriptor, which is the first step for all subsequent I/O operations on that file.

Node.js then calls the Open function in the C/C++ layer through process.binding, which in turn calls the concrete Libuv method uv_fs_open.

At this point the JavaScript call returns immediately, which ends the first phase of the asynchronous call, the phase initiated by the JavaScript layer. The JavaScript thread can carry on with the rest of the current task. The I/O operation waits in the thread pool to be executed, and whether or not it blocks there has no effect on the JavaScript thread, which is how asynchrony is achieved.

The second phase is the callback notification. When the I/O operation in the thread pool finishes, the event loop is notified that it is done. On every iteration, the event loop checks whether any I/O has completed. If so, it fetches the result and runs the corresponding callback, which is how the callback function passed in from JavaScript gets called.

At this point, the asynchronous I/O process is complete.
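
The two phases can be observed from JavaScript. The sketch below is only an illustration: the scratch file name and the variables order and done are assumptions, not part of the original example. It records that the statement after fs.open(..) runs before the callback does.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Records the order in which things happen.
const order = [];
// Hypothetical scratch file in the OS temp directory.
const file = path.join(os.tmpdir(), `async-open-demo-${process.pid}.txt`);

// `done` resolves once the callback has run and the file has been cleaned up.
const done = new Promise((resolve, reject) => {
  // Phase 1: the request is handed off and the call returns immediately.
  fs.open(file, 'w', (err, fd) => {
    // Phase 2: the event loop invokes the callback after the open completes.
    if (err) return reject(err);
    order.push('callback');
    fs.close(fd, () => fs.unlink(file, () => resolve(order)));
  });
  order.push('after fs.open');
});

done.then((result) => console.log(result.join(' -> '))); // after fs.open -> callback
```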

Note that which platform-specific Libuv implementation handles the call is determined at compile time, not at run time.

Event loop

"An event loop is a programming construct that waits for and dispatches events or messages in a program."

When the process starts, Node.js creates a loop similar to while(true). Each time the loop body runs, it checks whether any event is waiting to be handled. If there is one, it takes out the event and its associated callback functions, and runs those callbacks if any exist. It then moves on to the next iteration, and the process exits once there are no more events to handle.
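
The loop just described can be modelled in a few lines. This is a toy sketch only: the real loop is implemented in C inside Libuv, and the function name runToyLoop and the shape of the event objects are invented for illustration.

```javascript
// Toy model of the loop described above, not how Node.js implements it.
function runToyLoop(initialEvents) {
  const queue = [...initialEvents];
  const log = [];
  // Callbacks may enqueue further events, just as real callbacks start new I/O.
  const emit = (event) => queue.push(event);

  while (true) {
    if (queue.length === 0) break;   // no pending events: exit the loop
    const event = queue.shift();     // take out the next pending event
    if (typeof event.callback === 'function') {
      event.callback({ emit, log }); // run its associated callback
    }
  }
  return log;
}

const log = runToyLoop([
  {
    name: 'open',
    callback: ({ emit, log }) => {
      log.push('open done');
      emit({ name: 'read', callback: ({ log }) => log.push('read done') });
    },
  },
  { name: 'no-callback' }, // an event without a callback is simply skipped
]);
console.log(log); // [ 'open done', 'read done' ]
```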

That is only a simplified description of the event loop. Node.js offers more than asynchronous I/O; it also has other asynchronous APIs such as setTimeout, setInterval and setImmediate. How do these fit into the loop?

The Node.js event loop is divided into six phases, each with its own purpose:

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       I/O callbacks       │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

  • Timers: runs the callbacks of expired setTimeout() and setInterval() timers.
  • I/O callbacks: runs a small number of I/O callbacks that were deferred from the previous iteration to this phase.
  • Idle, prepare: used internally only.
  • Poll: the most important phase. It runs I/O callbacks and, under appropriate conditions, blocks here.
  • Check: runs setImmediate() callbacks.
  • Close callbacks: runs the callbacks of close events, such as socket.on("close", func).

Each iteration of the event loop passes through the phases above in order. Each phase has its own callback queue. When the loop enters a phase, callbacks are taken from that queue and executed. When the queue is empty, or the number of callbacks executed reaches the system limit, the loop moves on to the next phase. Passing through all six phases completes one iteration.
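
The phase order has a visible consequence. When both a setImmediate and a zero-delay setTimeout are scheduled from inside an I/O callback, the check phase comes right after the poll phase, so the immediate runs before the loop gets back to timers. The sketch below illustrates this; the use of fs.stat('.') as the I/O operation and the variable name phaseOrder are choices made for this illustration.

```javascript
const fs = require('fs');

// Resolves with the order in which the two callbacks ran.
const phaseOrder = new Promise((resolve) => {
  const order = [];
  // The fs.stat callback runs as an I/O callback in the poll phase.
  fs.stat('.', () => {
    setTimeout(() => {
      order.push('timeout');   // timers phase of the next iteration
      resolve(order);
    }, 0);
    setImmediate(() => {
      order.push('immediate'); // check phase, right after poll
    });
  });
});

phaseOrder.then((order) => console.log(order.join(' -> '))); // immediate -> timeout
```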

Here’s an example:

console.log(1);
console.log(2);
const timeout1 = setTimeout(function () {
    console.log(3);
    const timeout2 = setTimeout(function () {
        console.log(6);
    });
}, 0);
const timeout3 = setTimeout(function () {
    console.log(4);
    const timeout4 = setTimeout(function () {
        console.log(7);
    });
}, 0);
console.log(5);

If you can predict what the example above prints, you have a general understanding of how the JS thread cooperates with the event loop and how the event loop itself works.

  • Executing in order, 1 is printed.
  • Executing in order, 2 is printed.
  • The JS thread hands timeout1 over to the event loop's timers and returns (the variable names are used to refer to the timers for ease of explanation, here and below).
  • The JS thread hands timeout3 over to the event loop's timers and returns.
  • Executing in order, 5 is printed.
  • In the timers phase, Libuv checks whether any timer has expired. When it finds that timeout1 is due, it notifies the JS thread to run the timeout1 callback, which prints 3 and hands timeout2 over to the event loop's timers.
  • It then finds that timeout3 is due and notifies the JS thread to run the timeout3 callback, which prints 4 and hands timeout4 over to the event loop's timers.
  • The event loop then moves through the remaining phases until it reaches the timers phase again, where the expired timers are taken out in order of expiry and their callbacks run: 6 is printed, then 7.

Note that timeout2 inside timeout1, and timeout4 inside timeout3, are handed to the event loop by the JS thread only once the timeout1 and timeout3 callbacks run. In other words, timeout1 and timeout3 do not run in the same iteration of the event loop as timeout2 and timeout4.

Advantages and Difficulties

The most distinctive thing Node.js brings is the event-driven, non-blocking I/O model at its core. Non-blocking I/O lets the CPU and I/O proceed without waiting on each other, which makes better use of resources.

Node.js uses the event loop to make the JavaScript thread act like a butler who hands out tasks and processes results. Each thread in the I/O thread pool is a helper who diligently carries out the task it is given. The helpers and the butler do not depend on one another, so overall efficiency stays high.

The drawback of this model is that the butler cannot take on too much detailed work himself. If he does, task scheduling suffers: the butler is busy all the time while the helpers have nothing to do. For example, a JS loop that runs a million times blocks the JavaScript thread, leaving the butler too busy with the loop to schedule tasks.
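
The effect of a busy butler is easy to measure. In the sketch below, a timer is set for 10 ms, but the JavaScript thread then spins for about 100 ms, so the callback cannot run until the loop is free again. The constant BLOCK_MS and the variable timerDelay are illustrative names and values, not from the original text.

```javascript
// How late does a 10 ms timer fire when the JS thread is busy for ~100 ms?
const BLOCK_MS = 100; // illustrative value

const timerDelay = new Promise((resolve) => {
  const start = Date.now();
  setTimeout(() => resolve(Date.now() - start), 10);

  // Busy loop: the butler is stuck doing the work himself.
  while (Date.now() - start < BLOCK_MS) {
    // spin
  }
});

timerDelay.then((ms) => console.log(`10 ms timer fired after ${ms} ms`));
```

The reported delay is at least BLOCK_MS, because the timer callback can only run after the synchronous loop has finished.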

In the event loop model, a large number of requests are all handled on a single thread, so no single computation can be allowed to take up too much of that thread's time. As long as computation does not get in the way of scheduling asynchronous I/O, Node.js can also be used in CPU-intensive scenarios.

It is recommended that a single computation not occupy the CPU for more than 10 ms, or that a large computation be broken up into many small ones scheduled via setImmediate(..). As long as the asynchronous model of Node.js and the high performance of V8 are used sensibly, both CPU and I/O resources can be put to full use.
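
Splitting a computation this way might look like the following sketch. It assumes the work is a simple sum over an array; the function name sumInChunks and the default chunk size are illustrative choices, and a real chunk size would need to be tuned so that each slice stays within the time budget.

```javascript
// Sum a large array in slices, yielding to the event loop between slices.
// `sumInChunks` and the default chunk size are assumptions for illustration.
function sumInChunks(numbers, chunkSize = 10000) {
  return new Promise((resolve) => {
    let index = 0;
    let total = 0;

    function processChunk() {
      const end = Math.min(index + chunkSize, numbers.length);
      for (; index < end; index++) {
        total += numbers[index];
      }
      if (index < numbers.length) {
        // Give the event loop a turn before starting the next slice.
        setImmediate(processChunk);
      } else {
        resolve(total);
      }
    }

    processChunk();
  });
}

const data = Array.from({ length: 1000000 }, (_, i) => i);
sumInChunks(data).then((total) => console.log(total)); // 499999500000
```

Between slices the loop passes through its other phases, so timers and I/O callbacks still get a chance to run while the sum is in progress.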
