Preface

You cannot learn Node without running into asynchronous I/O, and asynchronous I/O is closely tied to the event loop. I had never carefully sorted this topic out, but a recent project gave me some thoughts worth recording, so I am trying to organize this piece of knowledge here. If there are mistakes, please point them out — and go easy on me.

Some of the concepts

Synchronous/asynchronous & blocking/non-blocking

While reading up on this, I found that many people confuse asynchrony with non-blocking. They are completely different things: synchronous vs. asynchronous describes a behavior — the relationship between two communicating parties — while blocking vs. non-blocking describes the state of one party on its own.

As an example, many of you have written code like the following:

$.ajax(url).done(() => {
    // do something with the response
})

Synchronous/asynchronous: in the synchronous case, the client sends a request and waits for the server to finish handling it before executing any subsequent logic. In this way the client and server stay in a synchronized state.

In the asynchronous case, the client sends the request and returns immediately, even though the request may not yet have reached the server or may still be in flight. The client usually registers a callback to handle the result once the request completes, as in the callback registered above.

Blocking/non-blocking: first understand that JS is single-threaded, but the browser is not — your request actually runs on another browser thread.

If the call is blocking, the thread waits until the request completes and only then is released to serve other requests.

If the call is non-blocking, the thread can initiate the request and continue doing other things without waiting for it to complete.

To summarize: the two are often confused because it is unclear which participant is under discussion. Synchronous/asynchronous describes both parties of a communication, while blocking/non-blocking describes one party's own state.

IO and CPU

I/O devices and the CPU can work simultaneously.

IO:

I/O (Input/Output) usually refers to the input and output of data between internal memory and external storage or other peripheral devices.

CPU:

The CPU interprets computer instructions and processes the data in computer software.

Asynchronous IO model in Node

I/O is divided into disk I/O and network I/O. Either kind involves two steps:

  1. Waiting for the data to be ready
  2. Copying the data from the kernel to the process

Disk I/O in Node

The following discussion is based on *nix systems. Ideal asynchronous I/O would work as discussed above, as shown below:

In reality, operating systems do not implement such a call perfectly. Node's asynchronous I/O for operations like reading files is implemented with a thread pool: as the figure shows, Node performs the I/O operation on another thread and notifies the main thread when it completes:

Under Windows, the IOCP interface is used instead. From the user's point of view IOCP is a perfect asynchronous call, but it actually uses a thread pool inside the kernel; the difference from *nix is that there the thread pool is provided at the user level.

Network I/O in Node

Before getting into the topic, let's look at the I/O models of Linux. I recommend reading this article; it is summarized as follows:

Blocking I/O (Blocking IO)

So, the characteristic of blocking IO is that both phases of IO execution are blocked.

Non-blocking I/O (nonblocking IO)

When a user process issues a read operation, if the data in the kernel is not ready, it does not block the user process, but immediately returns an error. From the user process’s point of view, when it initiates a read operation, it does not wait, but gets a result immediately. When the user process determines that the result is an error, it knows that the data is not ready, so it can send the read operation again. Once the kernel is ready and receives a system call from the user process again, it copies the data to the user’s memory and returns.

I/O multiplexing

So, I/O multiplexing is characterized by a mechanism whereby a process can wait for multiple file descriptors at the same time, and select() returns when any one of these file descriptors (socket descriptors) is read ready.

Asynchronous I/O (Asynchronous IO)

As soon as the user process initiates the read operation, it can start doing other things. On the other hand, from the kernel’s point of view, when it receives an asynchronous read, it first returns immediately, so no blocks are generated for the user process. The kernel then waits for the data to be ready and copies the data to the user’s memory. When this is done, the kernel sends a signal to the user process telling it that the read operation is complete.

Node uses the I/O multiplexing model. I/O multiplexing itself has several sub-modes, such as read, select, poll and epoll; Node uses epoll, the best of them. Below is a brief description of the differences and why epoll is optimal.

read. This is the most primitive, lowest-performance technique: it repeatedly checks the I/O state until the data is finally available, burning CPU on the repeated checks. Figure 1 is a schematic of polling with read.

select. An improved version of read: it judges readiness from event states on file descriptors. Figure 2 is a schematic of polling with select. select has a serious limitation: it uses a 1024-length array to store state, so it can check at most 1024 file descriptors at a time.

poll. An improvement over select: it avoids the array-length limit by using a linked list, and it avoids some unnecessary checks. But with many file descriptors its performance is still very poor.

epoll. This is the most efficient I/O event notification mechanism on Linux. If no I/O event is detected during polling, the process sleeps until an event occurs. It uses event notification to run callbacks instead of traversal queries, so it doesn't waste CPU and is more efficient.

In addition, select and poll have the following disadvantages (quoted from the article):

  1. Every call to select copies the fd set from user mode to kernel mode, which is expensive when there are many fds.
  2. Every call to select also requires the kernel to iterate over all the fds passed in, again expensive when there are many fds.
  3. select supports too few file descriptors; the default is 1024.

How epoll improves on these

Since epoll is an improvement on select and poll, it should avoid these three disadvantages. How does it do that? First, look at the difference in their call interfaces. select and poll each provide a single function — select or poll — while epoll provides three: epoll_create, epoll_ctl and epoll_wait. epoll_create creates an epoll handle; epoll_ctl registers the event types to listen for; epoll_wait waits for events to occur.

For the first drawback, epoll's solution lies in epoll_ctl. Each time a new event is registered into the epoll handle (by specifying EPOLL_CTL_ADD in epoll_ctl), the fd is copied into the kernel then, rather than being copied repeatedly during epoll_wait. epoll guarantees each fd is copied only once in the whole process.

For the second drawback, instead of adding the current process to every device's wait queue on each call as select or poll does, epoll hangs the current process just once (in epoll_ctl, which is essential) and registers a callback for each fd. When a device becomes ready and wakes the waiters on its queue, this callback runs and adds the ready fd to a ready linked list. The job of epoll_wait is then simply to check this ready list for ready fds (using schedule_timeout(), similar to step 7 in the select implementation).

As for the third drawback, epoll has no such limit: the maximum number of fds it supports is the maximum number of files that can be opened, which is generally far larger than 2048 — around 100,000 on a 1GB machine — and depends largely on system memory.

The asynchronous network I/O in Node is implemented with epoll. Put simply, a single thread manages numerous I/O requests and communicates completion through the event mechanism.

Event loop

Having understood the low-level implementation of disk I/O and network I/O in Node, we can see from the discussion above that Node runs a series of handlers registered as events once I/O completes; internally this is driven by the event loop mechanism.

The event loop works like this: after each synchronous task finishes, JS checks whether the call stack is empty; if so, it executes callbacks from the registered event queues, then continues the loop. Node's event loop has six phases:

Each of these phases handles related events:

  • Timers: execute expired callbacks scheduled by setTimeout and setInterval.
  • Pending callbacks: execute I/O callbacks deferred to the next loop iteration.
  • Idle, prepare: used internally only.
  • Poll: retrieve new I/O events and execute I/O-related callbacks (almost all of them, except close callbacks, timer callbacks and setImmediate() callbacks); Node may block here. (This is the phase most relevant to this article.)
  • Check: setImmediate() callbacks are executed here.
  • Close callbacks: execute close-event callbacks, such as socket.on('close', fn) or http.server.on('close', fn).

OK, that explains how Node executes our registered events, but one piece is still missing: how does Node associate events with I/O requests? This involves an intermediate request object. Take opening a file as an example:

fs.open = function(path, flags, mode, callback) {
  // ...
  binding.open(pathModule._makeLong(path),
               stringToFlags(flags),
               mode,
               callback);
};

fs.open() opens a file with the given path and flags to obtain a file descriptor, the starting point for all subsequent I/O operations. As the code above shows, the JavaScript-level code calls down into the C++ core module for the lower-level operation.

JavaScript calls Node's core module; the core module calls the C++ built-in module; and the built-in module makes the system call through libuv. libuv acts as an encapsulation layer with two platform-specific implementations, which in essence both call the uv_fs_open() method. During the call to uv_fs_open(), an FSReqWrap request object is created. The parameters passed in from the JavaScript layer, along with the current method, are wrapped in this request object, and the callback we care about most is set on the object's oncomplete_sym property:

req_wrap->object_->Set(oncomplete_sym, callback);

On Windows, the QueueUserWorkItem() method pushes this request object into the thread pool. It takes three arguments: the first is a reference to the method to run, here uv_fs_thread_proc; the second is the parameter uv_fs_thread_proc needs when it runs; the third is an execution flag. When a thread in the pool becomes available, uv_fs_thread_proc() is called; it dispatches to the corresponding underlying function based on the type of the request. In the case of uv_fs_open(), the fs_open() method is what actually gets called.

At this point the JavaScript call returns immediately, ending the first phase of the asynchronous call initiated from the JavaScript layer. The JavaScript thread can go on executing subsequent operations of the current task while the I/O operation waits in the thread pool; whether or not the I/O blocks, the JavaScript thread is unaffected, which is the point of asynchrony.

The request object is an important intermediary in asynchronous I/O: all state is stored on it, including its submission to the thread pool and the callback processing after the I/O completes. There isn't much more detail needed here — it's enough to know that such a request object exists. Finally, a summary of the whole asynchronous I/O flow:


At this point Node's entire asynchronous I/O flow should be clear: the thread pool / epoll, the event loop and the request object together constitute its management mechanism.

Why Node is better suited to I/O-intensive workloads

Node is touted as better suited to I/O-intensive systems, with better performance there, and this is due to its asynchronous I/O.

For a single request that depends on an I/O result, both asynchronous I/O and synchronous blocking I/O (one thread per request) must wait until the I/O completes. A thread blocked on synchronous I/O gives up its CPU time slices anyway, so why is asynchronous better?

The fundamental reason is that synchronous blocking I/O needs to create one thread per request. While a thread is blocked on I/O it does not consume CPU, but it does have its own memory overhead, and context switching between threads also costs CPU. Under large concurrent load, memory is quickly exhausted and the server slows down. Node's asynchronous I/O is driven by the event mechanism and does not need a thread per request, which is why Node performs better here.

Especially for the Web, which is I/O-intensive, this is a big advantage. Besides Node, Nginx is another server built on an event mechanism; if you understand Node's mechanism, Nginx should be easy to understand too.

Conclusion

Before really studying Node's asynchronous I/O, we often see arguments about whether Node is suitable as a server-side development language — many of them one-sided. In truth, it depends on your business scenario.

If your business is CPU-intensive, Node is not a good choice. Why? Because Node is single-threaded: while you are computing, you block, and other events cannot be handled, requests cannot be handled, callbacks cannot be handled.

So is Node better than Java for I/O-intensive work? Not necessarily — again, it depends on the business. Suppose there is a doorway through which Node can admit 10 people at a time while Java queues them up one by one. If exactly 10 people arrive at once, Node clearly has the advantage. But if 100 people arrive (say, 10,000 in-flight asynchronous requests), Node's asynchronous mechanism can cause the application to hang, memory to surge and I/O to jam, forcing a restart — while Java processes the queue in order, slower but steadily. The cost of an online accident when a server goes down is immeasurable. (Of course, with sufficient server resources, Node can handle it too.)

Finally, Java does have asynchronous I/O libraries as well, but Node's syntax is more natural and better suited to it.

References & quotations

  • How to understand the difference between blocking/non-blocking and synchronous/asynchronous?
  • Linux epoll & Node.js Event Loop & I/O multiplexing: the core and key to high-concurrency, high-performance Node.js applications
  • Is asynchronous I/O performance better than synchronous blocking I/O? Why?
  • Nodejs