Preface

Coding should be a lifelong career, not just a job for your twenties. This article is included in the GitHub repository https://github.com/ponkans/F2E, which is continuously updated. Stars are welcome.

This article analyzes the core architecture and fundamentals of Node.js. I hope that after reading it you will understand:

  • What each layer of the Node.js architecture does and how the layers relate to one another
  • How Node.js interacts with the underlying operating system, for example what happens when a file is read
  • How Node.js handles highly concurrent requests and makes the best use of server resources
  • The advantages of the event-driven model, and how it is implemented

For many front-end beginners, especially college students, Node.js is the first technical bottleneck they run into. The main reason is that some important concepts have not been understood and some fundamentals, such as compilation principles, have not been mastered.

PS: Many things that look very complicated can be traced back to the fundamentals, that is, to how the computer itself works.

For example, the core of the popular cross-platform frameworks is a transformation of the abstract syntax tree (AST).

Well, let’s get down to business. Here are some of the most interesting things about Node.js.

Architecture

If you are a front-end developer, you can probably say at least a few things about Node.js.

Let’s start by looking at a comparison between the browser and Node. After all, many front-end beginners may not have touched Node yet and just run projects in the browser.

The picture on the left shows a simplified browser architecture. We usually write front-end projects in three parts: HTML, CSS, and JavaScript.

HTML and CSS are processed by the WebKit engine, go through a series of transformations, and are finally rendered to the screen. Steve Kobes, a member of the Chrome team, has given a talk that walks through the browser rendering process from the bottom up; I will share it in a later article.

JavaScript is parsed and executed by the V8 engine, which is not covered in this article.

Now look further down at the middle layer. In Chrome this layer is limited because it is confined to the browser. Manipulating local files from the browser, for example, used to be very difficult. With the spread of HTML5 some of this has become possible, but it is still a long way from what Node’s middle layer can do.

Remove the red part on the left and what remains is essentially a simple Node architecture. In Node we can manipulate files and even set up all kinds of services. Node does not handle a UI layer, but its mechanism and operating principles are the same as the browser’s, and its middle layer is far more capable.

With that in mind, if we brought the WebKit engine back and combined it with Node, could we develop Node projects that have a UI but run outside the browser? You have probably guessed the answer already: Electron does just that, and there is nothing particularly magical about it.

So, put simply, Node is a JavaScript runtime outside the browser that still runs on Chrome’s V8 engine.

The introduction on the official website also highlights several important features of Node: it is lightweight, efficient, event-driven, and uses non-blocking I/O. Next, taking Node’s operating mechanism as the starting point, we will analyze step by step how single-threaded Node achieves high concurrency and how it makes full use of server resources.

The Node architecture diagram above is relatively simple, but let’s look at the complete one below.

The architecture can be roughly divided into the following three layers.

The upper layer

This layer is the Node standard library. Put simply, it is JavaScript code, and your own code can call its APIs directly. Node provides many powerful APIs, which are best explored in depth through practice. To take a very simple example, we can use Node to write a scheduled script that regularly emails your girlfriend what you want to say to her. She is happy, you learn the technology, perfect ~

The middle layer

Node bindings (implemented in C++) are the key layer that bridges JavaScript and the lower layer, allowing the two to interact.

The lower layer

This layer is the key to the Node.js runtime, and there is a lot in it. Let’s go through the pieces one by one.

V8 can be summed up, simply and bluntly, as the most impressive JavaScript engine in the industry. There have been attempts to use alternatives to V8, such as the node-chakracore project and the SpiderNode project, but Node.js still uses V8 by default.

c-ares is an asynchronous DNS request library written in C.

http_parser, OpenSSL, zlib, and others provide further basic capabilities.

libuv is a high-performance, event-driven I/O library that provides a cross-platform API covering systems such as Windows and Linux. It enforces an asynchronous, event-driven programming style; its core job is to provide an event loop plus callbacks driven by I/O and other event notifications. It also provides core utilities such as timers, non-blocking networking, asynchronous file system access, and child processes.
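
If you want to see these bundled components on your own machine, one quick way is to print process.versions. The sketch below is only illustrative: the helper name bundledVersions is made up for this example, and the set of keys depends on your Node version (newer releases, for instance, report llhttp rather than http_parser), so anything missing is simply skipped.

```javascript
// List the versions of the low-level libraries bundled with this Node binary.
// Which keys exist depends on the Node version, so missing ones are skipped.
const wanted = ['node', 'v8', 'uv', 'ares', 'zlib', 'openssl', 'http_parser', 'llhttp'];

// Hypothetical helper for this article, not part of any Node API.
function bundledVersions() {
  const found = {};
  for (const name of wanted) {
    if (process.versions[name] !== undefined) {
      found[name] = process.versions[name];
    }
  }
  return found;
}

console.log(bundledVersions());
```

The uv entry is libuv and the ares entry is c-ares, which maps directly onto the list above.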


What happens under the hood when Node writes a love letter

Here is an example from a book on Node.js.

Suppose we need to open a local TXT file to write a love letter to our girlfriend. The code can be written like this:

const fs = require('fs');

// 'w' opens the file for writing and creates it if it does not exist yet.
fs.open('./love letter.txt', 'w', function (err, fd) {
    // May we always be together and never be apart.
});


fs.open() opens the file at the given path with the given flags and passes a file descriptor to its callback.
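
To make the file descriptor a bit more concrete, here is a minimal sketch that actually writes through it and then closes it. The helper name writeLoveLetter and the temp-directory path are assumptions made for this example so that it can run anywhere; they are not from the book.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: open a file, write text through the descriptor, then close it.
function writeLoveLetter(filePath, text, callback) {
  fs.open(filePath, 'w', (openErr, fd) => {
    if (openErr) return callback(openErr);
    // fd is a small integer that identifies the open file to the operating system.
    fs.write(fd, text, (writeErr) => {
      // Always close the descriptor, even if the write failed.
      fs.close(fd, (closeErr) => callback(writeErr || closeErr));
    });
  });
}

// Writing into the temp directory is an assumption so the sketch runs on any machine.
const target = path.join(os.tmpdir(), 'love-letter.txt');
writeLoveLetter(target, 'Never apart.\n', (err) => {
  if (err) throw err;
  console.log('saved to', target);
});
```

Every step takes a callback, so the JavaScript thread is never blocked while the operating system does the actual work.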

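Let’s go to Node’s lib directory and look at the underlying source code. The version shown here is the promise-based open (note FileHandle and kUsePromises); the callback version in lib/fs.js likewise validates its arguments and then calls into the C++ binding: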

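async function open(path, flags, mode) {
  // Normalize and validate the arguments on the JavaScript side.
  mode = modeNum(mode, 0o666);
  path = getPathFromURL(path);
  validatePath(path);
  validateUint32(mode, 'mode');
  // Hand the real work to the C++ binding and wrap the result in a FileHandle.
  return new FileHandle(
    await binding.openFileHandle(pathModule.toNamespacedPath(path),
                                 stringToFlags(flags),
                                 mode, kUsePromises));
}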


JavaScript code calls the C++ core modules to perform lower-level operations. The calling process can be described as follows.

JavaScript calls the Node.js standard library, the library calls a C++ module, and the C++ module makes system calls through libuv. This is the most common call path in Node. libuv also provides implementations for both UNIX and Windows platforms, which gives Node.js its cross-platform capability.

Just like that, the love letter is done, your relationship moves to the next level, and you live happily ever after.

It’s single-threaded after all, so I need an explanation

I am writing an article that analyzes the architecture of a high-concurrency flash-sale system for face masks built on Node.js.

Having solved the first problem, let’s look at the second: since Node is single-threaded, how does it handle high-concurrency scenarios?

In fact, Node is multi-threaded in many places except for the JavaScript part.

As the love letter example above shows, Node’s I/O is actually handed over to libuv, which provides a complete thread pool implementation. So all I/O operations can run in parallel; only the user’s JavaScript code cannot.
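
One way to watch the thread pool at work is to start several expensive operations at once. The sketch below uses the asynchronous form of crypto.pbkdf2, which runs on libuv’s worker threads. The helper name deriveKeysInParallel, the password, the salts, and the iteration count are all arbitrary choices for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: start `count` key derivations at once and wait for all of them.
function deriveKeysInParallel(count) {
  const jobs = [];
  for (let i = 0; i < count; i++) {
    jobs.push(new Promise((resolve, reject) => {
      // The asynchronous form of pbkdf2 runs on a libuv worker thread.
      crypto.pbkdf2('secret', `salt-${i}`, 50000, 32, 'sha256', (err, key) => {
        if (err) reject(err);
        else resolve(key);
      });
    }));
  }
  return Promise.all(jobs);
}

const started = Date.now();
deriveKeysInParallel(4).then((keys) => {
  console.log(`${keys.length} keys derived in ${Date.now() - started} ms`);
});
// This line prints first: the main thread is not blocked by the hashing.
console.log('main thread is free while the pool works');
```

The pool has four threads by default (adjustable through the UV_THREADPOOL_SIZE environment variable), so on a multi-core machine four derivations should take roughly as long as one, not four times as long.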

If you are not familiar with thread pools, you must have been too busy writing love letters to your girlfriend in college!!

In fact, there are only two ways to handle I/O in an operating system, blocking and non-blocking.

Blocking I/O means that an invocation needs to wait for all operations to complete before the invocation ends. As a result, the CPU waits for the I/O to complete and its processing power is not fully utilized.

For example, imagine you are a CPU with two things to do. The first is to text your girlfriend, who is out shopping, to ask whether she is coming back for dinner (because you have to cook, hahaha). The second is to clean the house.

Synchronous I/O: you send the message and then sit there waiting until, an hour later, she finally replies. Only then do you start cleaning. She comes home, sees that the house is still not clean after all this time, and raps you on the head.

Asynchronous I/O: you send the message and immediately start cleaning. By the time she replies, the house is clean and dinner is ready. Isn’t that nice? ~
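
The asynchronous version of the story can be sketched in a few lines. Here fs.stat on the Node binary stands in for “texting your girlfriend”; that choice, and the helper name runErrands, are just assumptions to keep the example self-contained.

```javascript
const fs = require('fs');

// Hypothetical helper: reports the order in which the steps actually ran.
function runErrands(done) {
  const steps = [];
  steps.push('send message');
  // Asynchronous call: it returns at once and the callback runs later.
  fs.stat(process.execPath, (err) => {
    if (err) return done(err);
    steps.push('reply received');
    done(null, steps);
  });
  // This runs before the callback above, while the request is still in flight.
  steps.push('clean the house');
}

runErrands((err, steps) => {
  if (err) throw err;
  console.log(steps.join(' -> '));
  // send message -> clean the house -> reply received
});
```

With the synchronous variant (fs.statSync), the reply would be in hand before the cleaning starts, and the thread would do nothing else while it waited.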

Back to the operating system. It provides non-blocking I/O calls that return immediately, after which the CPU can work on other tasks. But because the I/O has not completed, what comes back immediately is only the status of the call. To get the final result, the application has to call again repeatedly to check whether the operation is complete. This is known as polling.
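
The idea of “return immediately, then keep checking” can be imitated in plain JavaScript. To be clear, this is only a simulation of the pattern, not what the operating system does internally: startOperation and pollUntilDone are hypothetical names, and a timer plays the role of the pending I/O.

```javascript
// Hypothetical stand-in for a non-blocking call: it returns a handle right away,
// and the result only becomes available some time later.
function startOperation(delayMs) {
  const handle = { done: false, result: null };
  setTimeout(() => {
    handle.done = true;
    handle.result = 'data';
  }, delayMs);
  return handle;
}

// Poll the handle at a fixed interval until the operation reports completion.
function pollUntilDone(handle, intervalMs, callback) {
  let checks = 0;
  const timer = setInterval(() => {
    checks += 1;
    if (handle.done) {
      clearInterval(timer);
      callback(handle.result, checks);
    }
  }, intervalMs);
}

const handle = startOperation(50);
console.log('call returned immediately, done =', handle.done); // done = false
pollUntilDone(handle, 10, (result, checks) => {
  console.log(`got "${result}" after ${checks} checks`);
});
```

Most of the checks find nothing ready, which is exactly the waste that the more advanced mechanisms try to reduce.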

Currently, there are several common polling technologies:

  1. read

    This is the most primitive approach: the call is repeated until the final result can be read, and the CPU spends the whole time waiting.

  2. select

    The kernel monitors all the sockets that select is responsible for. When data is ready on any of them, select returns, and the user process then calls read to copy the data from the kernel into the user process. select, and later poll and epoll, are also known as I/O multiplexing. select uses an array of length 1024 to store state, so at most 1024 file descriptors can be checked at a time.

  3. poll

    poll uses a linked list instead, which removes the array-length limit and improves on select.

  4. epoll

    This is the most efficient I/O event notification mechanism on Linux. If no I/O event is detected while polling, the caller sleeps until an event occurs and wakes it up, so no CPU is wasted.

  5. kqueue

    Its implementation is similar to epoll, but it exists only on BSD systems.

Don’t be intimidated by all these terms: they are simply different polling mechanisms.

Polling, although non-blocking, is still a synchronous call from the application’s point of view. Linux does provide a native asynchronous I/O mechanism, but its scope of application is narrow, so Node takes another route to implement fully asynchronous I/O.

Therefore, “single-threaded” in Node really means only that there is one main JavaScript thread; the time-consuming asynchronous operations are carried out by the thread pool. Node hands these operations to the thread pool and itself only does the scheduling back and forth, performing no real I/O on the main thread.

Single thread and CPU-intensive tasks

Single-threading brings the benefit of not having to worry about state synchronization, but it also comes with several weaknesses:

  • It cannot make use of multi-core CPUs
  • A single error can cause the entire application to exit
  • CPU-intensive tasks occupy the thread, so asynchronous I/O callbacks cannot be invoked in the meantime
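
The last weakness is easy to reproduce. In this small sketch (measureTimerDelay is a made-up helper), a timer is scheduled for 10 ms, but a busy loop then hogs the only JavaScript thread, so the callback cannot run until the loop gives up the CPU.

```javascript
// Hypothetical helper: schedule a 10 ms timer, then hog the CPU for `busyMs`.
function measureTimerDelay(busyMs, callback) {
  const start = Date.now();
  setTimeout(() => callback(Date.now() - start), 10);
  // Busy loop: nothing else, including the timer callback, can run meanwhile.
  while (Date.now() - start < busyMs) {
    // burn CPU
  }
}

measureTimerDelay(200, (elapsed) => {
  console.log(`timer asked for 10 ms, fired after ${elapsed} ms`);
});
```

The same thing happens to I/O callbacks: the thread pool may have finished the work long ago, but the result sits in the queue until the main thread is free.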

Node.js has a blunt solution for CPU-intensive tasks on a single thread: start a child process. Computation is handed to the child process through child_process, and the results are passed back as event messages between processes, that is, through inter-process communication. (Node uses pipes for this communication.)
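
Here is a rough sketch of that idea. To keep it in a single file, the child’s source is passed inline with node -e and an 'ipc' channel is requested in the stdio option; in a real project you would more likely put the worker in its own file and use child_process.fork. The Fibonacci task and the helper name fibInChild are arbitrary choices for illustration.

```javascript
const { spawn } = require('child_process');

// Source for the child process, passed inline so the sketch stays in one file.
const childSource = `
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  process.on('message', (n) => {
    process.send({ n, result: fib(n) });
  });
`;

// Hypothetical helper: compute fib(n) in a child process and report via callback.
function fibInChild(n, callback) {
  const child = spawn(process.execPath, ['-e', childSource], {
    // The fourth entry opens an IPC channel between parent and child.
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });
  child.once('error', callback);
  child.once('message', (msg) => {
    // Closing the channel lets the child exit once it has nothing left to do.
    child.disconnect();
    callback(null, msg.result);
  });
  child.send(n);
}

fibInChild(25, (err, result) => {
  if (err) throw err;
  console.log('fib(25) computed in a child process:', result); // 75025
});
```

While the child is busy calculating, the parent’s event loop stays free to handle other callbacks.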

What, not familiar with operating systems? Then you really do need to brush up on the fundamentals ~ Skip the National Day trip and study instead, ha ha ha ~~

Event-driven

The essence of the event-driven model is that the program runs as a main loop plus event triggers.

The job of an event loop is to continuously wait for an event to occur and then execute all handlers of that event in the order in which they subscribed to the event. When all handlers of the event have been executed, the event loop continues to wait for the next event to fire, over and over again.
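
The “handlers run in subscription order” part can be seen with Node’s own EventEmitter. Note that EventEmitter is the user-level publish/subscribe class, not the libuv loop itself, so treat this as an illustration of the dispatch order only; the event name and messages are made up.

```javascript
const EventEmitter = require('events');

const bus = new EventEmitter();
const calls = [];

// Handlers are invoked in the order in which they were registered.
bus.on('reply', (text) => calls.push(`first handler: ${text}`));
bus.on('reply', (text) => calls.push(`second handler: ${text}`));

bus.emit('reply', 'coming home for dinner');
console.log(calls);
// [ 'first handler: coming home for dinner',
//   'second handler: coming home for dinner' ]
```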

Event loop

The Node event loop uses libuv’s default event loop, which can be seen in src/node.cc.

Create the Node runtime environment:

Environment* env = CreateEnvironment(
    node_isolate,
    uv_default_loop(),
    context,
    argc,
    argv,
    exec_argc,
    exec_argv);


Start the event loop

bool more;
do {
  // Run one iteration of the libuv loop; the result says whether work remains.
  more = uv_run(env->event_loop(), UV_RUN_ONCE);
  if (more == false) {
    EmitBeforeExit(env);
    // Emit `beforeExit` if the loop became alive either after emitting
    // event, or after running some callbacks.
    more = uv_loop_alive(env->event_loop());
    if (uv_run(env->event_loop(), UV_RUN_NOWAIT) != 0)
      more = true;
  }
} while (more == true);
code = EmitExit(env);
RunAtExit(env);


The variable more indicates whether to proceed to the next iteration. Node.js decides what to do based on its value:

  • If more is true, the loop continues with the next iteration.
  • If more is false, there are no events waiting to be processed. EmitBeforeExit(env) emits the process’s beforeExit event and runs the corresponding handlers; if they scheduled no new work, execution leaves the loop.

After that, the exit event is emitted, its callbacks are executed, and Node.js finishes by releasing its resources.
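
The same sequence can be observed from JavaScript through the process events beforeExit and exit. This is a small sketch; the variable names and the 10 ms timer are arbitrary.

```javascript
const lifecycle = [];
let extraWorkScheduled = false;

process.on('beforeExit', () => {
  lifecycle.push('beforeExit');
  if (!extraWorkScheduled) {
    extraWorkScheduled = true;
    // Scheduling new work here makes the loop alive again, so it keeps running.
    setTimeout(() => lifecycle.push('timer'), 10);
  }
});

process.on('exit', (code) => {
  lifecycle.push('exit');
  // Only synchronous work is possible here; the loop will not run again.
  console.log(`exit code ${code}:`, lifecycle.join(' -> '));
});
```

Run as a standalone script, this should record beforeExit, timer, beforeExit, exit: the first beforeExit handler schedules a timer, which revives the loop, and only when a later beforeExit adds nothing new does the process move on to exit.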

Observers

Each event loop has observers, and the loop asks these observers whether there are events to be processed. In Node.js, events mainly come from network requests, file I/O, and so on, and each of these sources has its own observer.

The request object

What were you thinking of? Not that kind of partner, this kind of object!!

The request object is an intermediate product in the transition from Node initiating a call to the kernel performing the I/O operation. For example, when a file I/O call goes through libuv, an FSReqWrap request object is created right away. The arguments passed in from JavaScript and the current method are wrapped in this request object, which is then pushed to the I/O thread pool to wait for execution.

Event-driven advantages

Event loops, observers, request objects, and I/O thread pools make up Node’s event-driven asynchronous I/O model.

Apache handles requests by starting a thread for each one. Although threads are relatively lightweight, each still needs a certain amount of memory. When a large number of concurrent requests arrive, memory usage becomes very high and the server slows down.

Node.js handles requests in an event-driven way and does not need to create a thread per request, which saves much of the overhead of thread creation, destruction, and context switching. It can deliver good performance even under heavy concurrency. Nginx uses the same event-driven model as Node and, thanks to its superior performance, has been gradually replacing Apache as the dominant web server.
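
To get a feel for this, here is a minimal sketch of a single-threaded server in which every request waits 100 ms on a timer (a stand-in for a slow database or file call). The helper fireRequests, the port choice, and the request count of 50 are assumptions for illustration only.

```javascript
const http = require('http');

// Each request waits 100 ms on a timer, standing in for slow I/O.
const server = http.createServer((req, res) => {
  setTimeout(() => res.end('ok'), 100);
});

// Hypothetical helper: fire `count` requests at once and resolve with the bodies.
function fireRequests(port, count) {
  const one = () => new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
  return Promise.all(Array.from({ length: count }, one));
}

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();
  const started = Date.now();
  fireRequests(port, 50)
    .then((bodies) => {
      console.log(`${bodies.length} responses in ${Date.now() - started} ms`);
    })
    .finally(() => server.close());
});
```

Because the waits overlap instead of queuing behind one another, the total time should be far closer to 100 ms than to 50 × 100 ms, and no extra thread is created per request.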

Conclusion


Many of the sections above do not go very deep, but they cover the general framework and structure. If you want to know more, you can refer to the book “Simple Node.js”.

Reading Awesome Node.js five times before you start is not out of the question. But I think it mainly depends on how you learn: some people prefer to learn through practice, and in that case there is really no need to read it five times ~~~

The Node ecosystem keeps growing, and front-end developers deal with it every day. The basic architecture and concepts mentioned in this article are things you must master.

A series on Node.js is coming soon. Another article in the same series:

  • “Punch the Interviewer” series: a Node.js Double 11 flash-sale system (a must-read for leveling up)
