NIO concepts at the operating system level

BIO: Blocking IO

NIO: Nonblocking IO

NIO concepts at the language/architecture level

NIO: New IO, an application-level interface built on top of the newer OS-level interfaces

The actual system call behind something like accept() in Java socket programming uses poll (the multiplexing system call interface) rather than the old IO interface accept, which does not support multiplexing.

AIO: async IO

graph TD
Server --> Kernel --> Client

The server creates a socket to listen for client connections; this socket corresponds to a file descriptor (say fd5). The socket binds a port number, listens on it, and blocks in accept waiting for a client connection. When a client (Client1) connects, accept returns fd6, a file descriptor referring to the connected client. The server then issues the recvfrom system call, passing fd6, to receive input from that client. Both of these system calls block, so if Client2 tries to reach the server while it is still blocked in recvfrom, Client2 cannot be served. Instead, each time accept returns a client fd, a new thread is created to handle that client's input, while the main thread goes back to blocking in accept to receive connections from other clients.

This is the multi-threaded server model.
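
A minimal sketch of this thread-per-connection model in C (POSIX assumed; the port 9090, buffer size, and echo behavior are illustrative choices, and error checking is omitted): accept and recv both block, so each connected client gets its own thread while the main thread returns to accept.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

static void *handle_client(void *arg) {
    int client_fd = *(int *)arg;   /* the fd returned by accept (the "fd6" above) */
    free(arg);
    char buf[1024];
    ssize_t n;
    /* recv blocks until this particular client sends data */
    while ((n = recv(client_fd, buf, sizeof(buf), 0)) > 0) {
        send(client_fd, buf, (size_t)n, 0);   /* echo back */
    }
    close(client_fd);
    return NULL;
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);   /* the listening socket ("fd5") */

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9090);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    for (;;) {
        /* the main thread blocks here waiting for the next client */
        int *client_fd = malloc(sizeof(int));
        *client_fd = accept(listen_fd, NULL, NULL);

        /* hand the connected client to its own thread, then go back to accept */
        pthread_t tid;
        pthread_create(&tid, NULL, handle_client, client_fd);
        pthread_detach(tid);
    }
}
```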

The kernel provides a non-blocking socket interface that marks an fd as non-blocking. Operations on that fd (accept, recv, read, ...) either return a result right away or return an error code immediately, without ever blocking.
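
As a small illustration (POSIX/Linux assumed), a descriptor can be put into non-blocking mode with fcntl; afterwards, accept or recv on it returns -1 with errno set to EAGAIN/EWOULDBLOCK when there is nothing to do, instead of blocking:

```c
#include <fcntl.h>

/* Mark an existing fd as non-blocking; afterwards accept/recv/read on it
 * return immediately instead of blocking. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```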

graph TD
Server --> Kernel --> Client1,Client2,Client3...

bind -> listen
while(true){
    accept -> recvfrom (once per connected client fd)
}

In the server's while loop, recvfrom returns immediately, accept can still pick up connections from other clients, and each iteration walks the list of connected clients to check whether any of them has data to read.
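
A rough sketch of this polling approach in C (Linux/POSIX assumed; the port 9090, the MAX_CLIENTS limit, and the echo-back behavior are illustrative, and error handling is omitted). Note that the loop spins and issues one recv per client per iteration:

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_CLIENTS 1024

static void set_nonblocking(int fd) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    set_nonblocking(listen_fd);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9090);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    int clients[MAX_CLIENTS];
    int nclients = 0;
    char buf[1024];

    for (;;) {
        /* non-blocking accept: returns -1 immediately when nobody is connecting */
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0 && nclients < MAX_CLIENTS) {
            set_nonblocking(fd);
            clients[nclients++] = fd;
        }

        /* one recv per connected client per iteration: the number of system
         * calls grows with the number of clients */
        for (int i = 0; i < nclients; i++) {
            ssize_t n = recv(clients[i], buf, sizeof(buf), 0);
            if (n > 0) {
                send(clients[i], buf, (size_t)n, 0);   /* echo back */
            } else if (n == 0) {
                close(clients[i]);                     /* client disconnected */
                clients[i--] = clients[--nclients];
            }
            /* n < 0 with EAGAIN/EWOULDBLOCK just means "no data yet" */
        }
    }
}
```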

The problem with this non-blocking approach is that the number of system calls (one recvfrom per client per iteration) grows in proportion to the number of clients.

The ideal scenario is for the number of system calls to be independent of the number of clients: no matter how many clients are connected, a single system call should be enough to find out which clients have input data ready.

The kernel provides a newer system call, select (or poll), which takes the set of fds to watch for reads, the set of fds to watch for writes, and so on. The return value of select tells user space which file descriptors are readable and which are writable; user space then issues system calls only on those fds, which greatly reduces the total number of system calls.

select is what is meant by multiplexing: a single system call is reused to obtain the results that would otherwise require many individual system calls.
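
A sketch of the same server using select (again with an illustrative port and echo behavior, and error handling omitted). One select call reports which registered fds are readable, and recv is issued only on those:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9090);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    fd_set all_fds;                /* every fd we care about */
    FD_ZERO(&all_fds);
    FD_SET(listen_fd, &all_fds);
    int max_fd = listen_fd;
    char buf[1024];

    for (;;) {
        /* the whole fd set is copied into the kernel on every call;
         * select returns once at least one fd is ready */
        fd_set read_fds = all_fds;
        select(max_fd + 1, &read_fds, NULL, NULL, NULL);

        /* the listening socket is readable when a new connection is pending */
        if (FD_ISSET(listen_fd, &read_fds)) {
            int fd = accept(listen_fd, NULL, NULL);
            if (fd >= 0) {
                FD_SET(fd, &all_fds);
                if (fd > max_fd) max_fd = fd;
            }
        }

        /* user space walks the set, but only issues recv on ready fds */
        for (int fd = 0; fd <= max_fd; fd++) {
            if (fd == listen_fd || !FD_ISSET(fd, &read_fds)) continue;
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n > 0) {
                send(fd, buf, (size_t)n, 0);   /* echo back */
            } else {
                close(fd);                     /* client disconnected */
                FD_CLR(fd, &all_fds);
            }
        }
    }
}
```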

The multiplexing approach above still has two problems. First, the kernel still has to traverse the client fds to check whether each one has a read or write event. Second, the argument to select is the full set of client file descriptors, so copying that set from user space into kernel space on every call keeps consuming CPU as the number of clients grows.

To address this, the kernel provides a newer system call family, epoll (event poll), made up of three related calls. epoll_create creates a region of memory in kernel space and returns an fd that refers to it. epoll_ctl(fd1, op, fd2, event) takes fd1, the fd returned by epoll_create; op, the operation that adds, removes, or modifies fds inside fd1; fd2, the fd being registered into fd1; and event, the events fd2 should be monitored for, such as read events or write events. Rather than blocking in accept, the server first creates fd1 in kernel space, registers the listening socket (fd2) in it, and calls epoll_wait() to wait for events from kernel space. When a client establishes a connection, a kernel callback fetches fd2 from fd1 and returns it to user space through epoll_wait. Knowing that a client is connecting, user space accepts the connection, which produces fd3 referring to that client, and registers fd3 into kernel space with epoll_ctl(fd1, add, fd3, read event), and so on.
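
A sketch of this flow using the epoll calls (Linux assumed; epoll_create1 is used as the modern form of epoll_create, and the port, MAX_EVENTS, and echo behavior are illustrative choices with error handling omitted):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9090);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    int epfd = epoll_create1(0);          /* "fd1": the kernel-space event area */

    struct epoll_event ev = {0}, events[MAX_EVENTS];
    ev.events = EPOLLIN;                  /* watch for read events */
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);   /* register the listening socket ("fd2") */

    char buf[1024];
    for (;;) {
        /* one system call, regardless of how many clients are registered */
        int n_ready = epoll_wait(epfd, events, MAX_EVENTS, -1);

        for (int i = 0; i < n_ready; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* a new client connected: accept it ("fd3") and register it */
                int client_fd = accept(listen_fd, NULL, NULL);
                ev.events = EPOLLIN;
                ev.data.fd = client_fd;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &ev);
            } else {
                /* this client fd was reported readable by the kernel */
                ssize_t n = recv(fd, buf, sizeof(buf), 0);
                if (n > 0) {
                    send(fd, buf, (size_t)n, 0);          /* echo back */
                } else {
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                }
            }
        }
    }
}
```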

With epoll there is no need to copy the fd set into the kernel or traverse it on every call, as select requires. Client events are sensed via CPU interrupts: when an interrupt fires, the CPU invokes the callback registered for that interrupt number, so the kernel learns about file descriptor state changes without scanning them.