Introduction

Netty is a high-performance network communication framework and a product of the evolution of IO models. Built on Java NIO, Netty is an asynchronous, event-driven network application framework used to rapidly develop high-performance, highly reliable network servers and clients. Many open source frameworks choose Netty as their network communication module. This article walks through the evolution of IO models and compares the similarities and differences between them, so that we gain a deeper understanding of the Java IO model, which I believe is an important foundation for understanding how Netty achieves high-performance network communication. Without further ado, let's get started.

PS: There's an Easter egg at the end of the article!

IO model

1. What is IO

Before diving into BIO, NIO, and AIO, let's look at what an IO model is. Whether it is an application or a platform, its function, highly abstracted, can be described as a process: input arrives through external conditions and data, the application or platform processes it, and new output is produced. An IO model is simply a description of this process of input and output in the computer world.

For a computer, the keyboard and mouse are input devices, while the monitor and disk are output devices. For example, when we write a design document on a computer, we input data through the keyboard; when the document is finished, we save it, outputting it to the computer's disk.

The IO flow in the figure above is the famous Von Neumann architecture, which roughly describes the IO interaction between external devices and the computer.

2. Application IO interaction

Now that we've covered the general process of interacting with peripheral devices, how does our application interact with IO? The code we write does not stand on its own: it is deployed on Linux servers or in containers, where the application is launched and then serves requests. Therefore, network request data first has to pass through the operating system before it is handed to the corresponding program for business processing.

In the Linux world, everything is a file: regular files, directories, sockets, and so on are all files. So what exactly is a file? A file is essentially a binary stream, and binary streams are the data medium through which the human world interacts with the computer world. An application reading data from a stream is a read operation, and writing data to a stream is a write operation. But how does Linux tell different files apart? Through the file descriptor (FD): an integer that serves as an index into the table of open files that the kernel maintains for each process. Operating on this integer is operating on the corresponding file (stream).

In the case of a network connection, we create a socket, and the system call returns a file descriptor (an integer). Subsequent operations on the socket are converted into operations on that descriptor. The main operations involved are the accept, read, and write calls. These calls are how programs interact with the computer through the Linux kernel. So what exactly is this kernel? (PS: the kernel is not the focus of this article, so here is just a brief explanation.)

// socket function: create an IPv6 TCP socket; the return value is a file descriptor
socket(PF_INET6, SOCK_STREAM, IPPROTO_IP);

But applications don’t actually get data directly from the computer’s network card, which means you don’t write programs that operate directly on the computer’s underlying hardware.

As shown in the figure above, in the Linux architecture, user applications operate computer hardware through the Linux kernel. So why does an application need a kernel in between when it cannot interact directly with the underlying hardware? There are mainly the following considerations.

(1) Unified management of computer resources

The Linux kernel manages process scheduling and system resources such as CPU and memory in a unified manner. These are extremely sensitive system resources, so having the kernel manage them allows network communication, user management, the file system, and other subsystems to run safely and stably, and prevents user applications from damaging system data.

(2) Unified encapsulation of underlying hardware calls

Imagine if there were no kernel: every user application would need to implement its own hardware drivers to interact with the hardware. Such a design would be hard to accept. Following the idea of encapsulation, the kernel takes sole responsibility for managing hardware: it manages all hardware devices downward and provides unified system calls to user processes upward, so that applications can interact with system hardware as conveniently as making ordinary function calls.

3. Five IO models

(1) Blocking IO

When a user application process initiates a system call, the call blocks if the kernel data is not ready. Once the kernel data is ready, it is copied from kernel space to user space, the user process obtains the data, and the call completes. For example, imagine you are a delivery rider: you go to the restaurant to pick up an order, but the food is not ready yet, so you can only wait at the pickup counter until it is. Only then can you take the food out for delivery.
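To make the blocking behavior concrete, here is a minimal sketch using Java's classic blocking socket API (the class and method names are my own, for illustration only). Both accept() and readLine() block the calling thread until a connection or data actually arrives, just like the rider waiting at the pickup counter:

```java
import java.io.*;
import java.net.*;

public class BlockingEchoDemo {
    // Starts a one-shot echo server, sends msg to it, and returns the echoed reply.
    static String roundTrip(String msg) throws Exception {
        ServerSocket server = new ServerSocket(0);   // port 0 = pick any free port
        Thread t = new Thread(() -> {
            try (Socket client = server.accept();     // blocks until a connection arrives
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                out.println(in.readLine());           // readLine blocks until data is ready
            } catch (IOException ignored) { }
        });
        t.start();
        try (Socket socket = new Socket("localhost", server.getLocalPort());
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println(msg);
            return in.readLine();                     // blocks until the server replies
        } finally {
            t.join();
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));       // prints "hello"
    }
}
```

Note how every step in this model is a wait: the server thread cannot do anything else while parked in accept() or readLine().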

(2) Non-blocking IO

Non-blocking IO is an IO model in which the application repeatedly polls to check whether the kernel data is ready. If it is not, the call returns EWOULDBLOCK immediately, and the application can go handle other work before calling recvfrom again. When the kernel data is ready, it is copied into user space. This is like the delivery rider repeatedly asking whether the order is ready while waiting (he is in a hurry, and the delivery deadline is approaching), checking every 30 seconds until it is.
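In Java, the same polling pattern can be sketched with a channel switched into non-blocking mode (the class and method names here are mine, for illustration): accept() returns null instead of blocking, so the caller must keep asking, exactly like the impatient rider.

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingPollDemo {
    // Polls a non-blocking server channel until a connection arrives,
    // counting how many "not ready yet" polls happened along the way.
    static int pollUntilAccepted() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0));
        server.configureBlocking(false);             // accept() now returns null instead of blocking
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        SocketChannel[] clientHolder = new SocketChannel[1];
        int polls = 0;
        SocketChannel accepted = server.accept();    // no client yet: returns null immediately
        while (accepted == null) {
            polls++;
            if (polls == 3) {                        // simulate a client showing up later
                clientHolder[0] = SocketChannel.open(
                        new InetSocketAddress("localhost", port));
            }
            Thread.sleep(10);                        // "ask again in a bit", like the rider
            accepted = server.accept();
        }
        accepted.close();
        if (clientHolder[0] != null) clientHolder[0].close();
        server.close();
        return polls;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("polled " + pollUntilAccepted() + " times before data was ready");
    }
}
```

The cost is visible in the loop: every unproductive poll is still a system call, which is why this model burns CPU when many connections are idle.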

(3) Multiplexing IO

Linux mainly provides three multiplexed IO implementations: select, poll, and epoll. Why three? They appeared in chronological order, each solving problems found while using its predecessor. In a real-world scenario, a back-end server receives a great many socket connections. IO multiplexing means using functions provided by the kernel whose parameter is a set of file descriptors (FDs) to listen on; when some FD becomes ready, only that FD needs to be handled.

Select, poll, and epoll

Select:

select is a system call function provided by the operating system kernel. It passes a set of FDs to the operating system, which traverses the set to check which FDs are ready.

There are some problems with using select: (1) select can listen on at most 1024 connections, so it supports only a small number of connections; (2) select does not return only the ready FDs; the user process must traverse the whole set one by one to find the ready ones; (3) when the user process calls select, the FD set must be copied from user space to kernel space, which becomes relatively expensive when there are many FDs.

Poll:

The poll mechanism is not much different from select, except that poll removes the 1024-connection limit.

Epoll:

epoll solves most of the problems of select and poll, mainly in the following respects: (1) how FD readiness is discovered: the kernel no longer finds ready FDs by polling and traversal but is woken up by asynchronous IO events; when an event occurs on a socket, a callback function adds the ready FD to a ready-event list, which avoids scanning the whole FD set. (2) How FDs are returned: the kernel returns only the ready FDs to the user, so the application does not have to traverse the set to find them itself. (3) How FDs are copied: epoll and the kernel share the same block of memory, which stores the set of file descriptors that are already readable or writable, reducing the memory-copy overhead between the kernel and the program.

(The picture is from the Internet)

(4) Signal-driven IO

The system provides a signal-capture facility that can be associated with a socket. After the user process registers a handler via the sigaction call, it can go process other work. When the kernel has the data ready, the user process receives a SIGIO signal, interrupts its current task, and initiates a recvfrom call to read the data from the kernel into user space for processing.

(5) Asynchronous IO

In the so-called asynchronous IO model, after a user process initiates a system call, the call returns immediately whether or not the requested data is ready in the kernel; the current process is never blocked and can continue handling other work. When the kernel data is ready, the system copies it from kernel space to user space and then signals the user process that it can read the data.

IO model in Java

We have described Linux's own IO models. Java has corresponding IO models in the Java world, namely BIO, NIO, and AIO. All of them provide IO-related APIs that ultimately rely on system-level IO for data processing, so the Java IO models are really encapsulations of the system-level IO models. Let's look at each of them.

BIO

BIO (Blocking IO) is Java's synchronous blocking IO model. When a user thread initiates an IO operation, it blocks until the server sends the data back; the thread is not released from the blocked state until the data has been fully read and returned, so the entire data-reading process blocks.

In addition, as we can see from the following figure, for each client connection the server has a dedicated processing thread to handle the corresponding request. Back to the restaurant example: if, for every customer who walks in, the restaurant assigns one waiter who serves that customer exclusively until they have eaten and drunk their fill and walked out, how many waiters would the restaurant need? With that many waiters, the owner would probably go broke.

So BIO works fine when there are few network connections. But when the number of connections climbs, say to hundreds of thousands or even millions, the BIO model's IO interaction becomes unsustainable, with the following drawbacks: (1) frequently creating and destroying large numbers of threads consumes system resources and puts great pressure on the server; (2) a large number of handler threads occupies too much JVM memory, leaving the application little room to do anything else; (3) context switching between that many threads is itself expensive.
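The thread-per-connection layout criticized above can be sketched as follows (the class and method names are my own, and the demo serves a fixed number of clients so it terminates; a real BIO server would loop forever). Every accepted socket gets a brand-new thread, which is exactly the per-customer waiter from the restaurant analogy:

```java
import java.io.*;
import java.net.*;
import java.util.*;

public class ThreadPerConnectionDemo {
    // Serves `clients` connections, one freshly created thread per connection,
    // and returns the replies the clients received.
    static List<String> serveAndQuery(int clients) throws Exception {
        ServerSocket server = new ServerSocket(0);
        Thread acceptor = new Thread(() -> {
            for (int i = 0; i < clients; i++) {
                try {
                    Socket s = server.accept();            // blocks per connection
                    new Thread(() -> {                     // BIO style: one thread per client
                        try (BufferedReader in = new BufferedReader(
                                     new InputStreamReader(s.getInputStream()));
                             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                            out.println("echo:" + in.readLine());
                        } catch (IOException ignored) { }
                    }).start();
                } catch (IOException ignored) { }
            }
        });
        acceptor.start();
        List<String> replies = new ArrayList<>();
        for (int i = 0; i < clients; i++) {
            try (Socket c = new Socket("localhost", server.getLocalPort());
                 PrintWriter out = new PrintWriter(c.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(c.getInputStream()))) {
                out.println("c" + i);
                replies.add(in.readLine());
            }
        }
        acceptor.join();
        server.close();
        return replies;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serveAndQuery(3));
    }
}
```

With 3 clients this is harmless; with a million it means a million threads, which is precisely the problem the next models address.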

The BIO-based model has these problems when dealing with large numbers of connections, so we need a more efficient threading model to handle hundreds of thousands or even millions of client connections.

NIO

In the BIO model, there is no way to know when data can be read or written during an IO operation. BIO is too honest for its own good: it has no better option than to wait, and socket read and write operations cannot be interrupted. Therefore, as new connections arrive, new threads must constantly be created to handle them, resulting in the performance problems above.

So how do we solve this problem? We all know the root cause: the BIO model blocks and waits because we do not know when data will be readable or writable. If we could know when data is ready to be read or written, we would not have to block waiting for a response or create a new thread for every connection.

To improve IO efficiency and avoid blocking, Java 1.4 introduced NIO. Some people call NIO non-blocking IO, but I prefer to call it New IO, because it is an IO model based on IO multiplexing rather than a simple synchronous non-blocking model. IO multiplexing here means using a single thread to handle a large number of connections.

So what's wrong with the plain synchronous non-blocking model? NIO's read, write, and accept methods are non-blocking while waiting for data to become ready. As described above, in synchronous non-blocking mode the application process continuously calls into the kernel, asking whether the data is ready yet. This is an improvement over the synchronous blocking model, since constant polling avoids blocking the call, but there is room for further optimization: the application keeps making system IO calls, which is CPU-intensive. This is where the IO multiplexing model comes in, which Java NIO uses to improve IO performance. (The epoll mechanism is used for illustration here.)

Java NIO processes stream data through channels and buffers. Thanks to the operating system's multiplexing mechanism (epoll on Linux), the multiplexer Selector polls continuously; when a channel's events (read events, write events, connect events, etc.) are ready, it finds the SelectionKey corresponding to that channel and performs the corresponding operations to read and write data.
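The channel/Selector/SelectionKey loop just described can be sketched like this (the class and method names are mine; a built-in blocking client is included so the example is self-contained). One selector thread multiplexes both accept and read events, which is the Java-level view of the kernel's ready-FD list:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class SelectorDemo {
    // One selector thread multiplexes accept + read events for any number of clients.
    static String readOneMessage() throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);  // interested in new connections
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // A plain client thread that connects and sends one message.
        new Thread(() -> {
            try (SocketChannel c = SocketChannel.open(new InetSocketAddress("localhost", port))) {
                c.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
            } catch (Exception ignored) { }
        }).start();

        while (true) {
            selector.select();                              // blocks until some channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {                   // ready event: incoming connection
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {              // ready event: data to read
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    int n = client.read(buf);
                    client.close();
                    server.close();
                    selector.close();
                    return new String(buf.array(), 0, n, StandardCharsets.UTF_8);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readOneMessage());
    }
}
```

The key point is that selector.select() parks one thread on behalf of all registered channels; only channels with actual events show up in selectedKeys(), so no per-connection thread and no busy polling is needed.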

AIO

AIO (Asynchronous IO), introduced in Java 7 as NIO 2, is an asynchronous IO model. It is based on events and callbacks: a call returns immediately without blocking, and when the background data processing completes, the operating system notifies the corresponding thread to carry out the follow-up processing. In terms of efficiency, AIO is in theory the best of the three. The fly in the ointment, however, is that Linux, the system the vast majority of servers run, does not provide mature support for AIO, so we cannot comfortably adopt it. Netty actually experimented with AIO, but it did not bring a significant performance improvement, and Netty is currently implemented on top of Java NIO.
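For completeness, here is a minimal sketch of the Java 7 AIO API using its Future-returning form (the class and method names are my own; the callback-based CompletionHandler form works similarly). accept() and read() return immediately, and the thread collects the results only when it chooses to:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Future;

public class AioDemo {
    // accept/read return Futures immediately; the thread is free to do other
    // work until it decides to collect the results with get().
    static String asyncRead() throws Exception {
        AsynchronousServerSocketChannel server =
                AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        Future<AsynchronousSocketChannel> pending = server.accept();  // returns at once

        AsynchronousSocketChannel client = AsynchronousSocketChannel.open();
        client.connect(new InetSocketAddress("localhost", port)).get();
        client.write(ByteBuffer.wrap("pong".getBytes(StandardCharsets.UTF_8))).get();

        AsynchronousSocketChannel accepted = pending.get();           // collect the connection
        ByteBuffer buf = ByteBuffer.allocate(64);
        int n = accepted.read(buf).get();                             // kernel fills the buffer
        client.close();
        accepted.close();
        server.close();
        return new String(buf.array(), 0, n, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(asyncRead());
    }
}
```

Note that on Linux this API is typically simulated over epoll by the JDK rather than backed by true kernel asynchronous IO, which is consistent with the point above about AIO offering little real gain there.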

Conclusion

This article started from computer IO interaction, introduced what an IO model is and the five common IO models along with their advantages and disadvantages, and analyzed the evolution of Java BIO, NIO, and AIO from the perspective of system optimization, examining the shortcomings of Java BIO from a designer's point of view. Let's review the whole evolution once more.

In future articles, I will continue to delve into the wonders of Netty as a high-performance network communication framework. Stay tuned.

I’m Mufeng. Thanks for your likes, favorites and comments. See you next time!

A true master always has the heart of an apprentice

WeChat search: Mufeng technical notes. Quality articles are updated continuously, and we have a study check-in group I can pull you into so we can aim for the big companies together; there are also plenty of learning and interview materials available for you.

An almost-new keyboard: I had used it only twice when a friend sent me another one. I originally planned to sell it on Xianyu, but then decided it would be better to give it directly to one of my official-account and Juejin followers. No tricks, free shipping nationwide; come claim the first Razer keyboard of autumn.