Blocking I/O
In this I/O model, we create a thread for each client connection. Regardless of whether the connected client is actually doing anything (sending or reading data, etc.), the thread holds the connection until it is disconnected. Creating too many threads consumes too many resources, as in the case of Java BIO:
- BIO is synchronous, blocking I/O
- The implementation of Java threads depends on the underlying operating system. On Linux, a Java thread maps to a lightweight process in user space, which in turn relies on a kernel thread to do its work
- Scheduling threads, saving and restoring state on context switches, and so on all consume a lot of CPU and cache resources
- Synchronous: after a client sends a request to the server, suppose the server takes 1 second to handle it. Even if a second client sends many requests in the meantime, the server cannot keep up; it must finish the previous request before handling the next one. Of course, we can use pseudo-asynchronous I/O, which means introducing a thread pool: client requests are dropped into the pool and the server can move on to the next request
- Blocking: InputStream.read(data) receives data via recvfrom and stays blocked as long as the kernel's data is not ready
Thus, blocking I/O cannot support high concurrency scenarios
public static void main(String[] args) throws IOException {
    ServerSocket serverSocket = new ServerSocket(9999);
    // Create a new thread to accept client connections
    // (handing each connection to a thread pool here instead would be the pseudo-asynchronous I/O variant)
    new Thread(() -> {
        while (true) {
            System.out.println("Start blocking, waiting for client to connect");
            try {
                Socket socket = serverSocket.accept();
                // Create a thread for each new connection
                new Thread(() -> {
                    byte[] data = new byte[1024];
                    int len;
                    System.out.println("Client connection successful, blocking while waiting for incoming data from client");
                    try {
                        InputStream inputStream = socket.getInputStream();
                        // Block on read until the client disconnects
                        while ((len = inputStream.read(data)) != -1) {
                            // Or fetch the data
                            System.out.println(new String(data, 0, len));
                            // Process the data
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }).start();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }).start();
}
You can see that serverSocket.accept() blocks until a client connects, and InputStream.read() blocks until data arrives; while a thread sits in read() it cannot serve any other connection, and once it does get data it must finish processing before it can read the next batch.
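The pseudo-asynchronous I/O mentioned above simply hands each accepted socket to a fixed thread pool instead of spawning a new thread per connection. A minimal sketch under that idea; the class name and the pool size of 10 are arbitrary assumptions, not something prescribed by the text:

import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PseudoAsyncServer {
    public static void main(String[] args) throws IOException {
        // Fixed-size pool: extra connections queue up instead of spawning new threads
        ExecutorService pool = Executors.newFixedThreadPool(10);
        ServerSocket serverSocket = new ServerSocket(9999);
        while (true) {
            // Still blocks waiting for a connection
            Socket socket = serverSocket.accept();
            // Hand the connection to the pool so accept() can be called again immediately
            pool.execute(() -> {
                byte[] data = new byte[1024];
                int len;
                try (InputStream in = socket.getInputStream()) {
                    // read() still blocks this pool thread until data arrives or the client disconnects
                    while ((len = in.read(data)) != -1) {
                        System.out.println(new String(data, 0, len));
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        }
    }
}

The pool caps the number of threads, but every blocked read() still ties up a pool thread, which is why this is only "pseudo" asynchronous.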
Non-blocking I/O
Socket socket = serverSocket.accept();
// Keep polling the kernel to ask whether the socket's data is ready
while (true) {
    data = socket.read();
    if (data != EWOULDBLOCK) {
        // The data was obtained successfully
        doSomething();
    }
}
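In Java the same idea can be expressed with channels switched into non-blocking mode, where accept() returns null and read() returns 0 instead of blocking. A minimal busy-polling sketch handling a single client; the class name is an assumption, the port matches the earlier example:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingPollingServer {
    public static void main(String[] args) throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(9999));
        server.configureBlocking(false);              // accept() now returns null instead of blocking
        SocketChannel client = null;
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {                                // busy-polling: keeps asking the kernel over and over
            if (client == null) {
                client = server.accept();             // null while no connection is pending
                if (client != null) {
                    client.configureBlocking(false);
                }
            } else {
                int n = client.read(buffer);          // 0 means the kernel has no data ready yet
                if (n > 0) {
                    buffer.flip();
                    System.out.println(new String(buffer.array(), 0, n));
                    buffer.clear();
                } else if (n == -1) {                 // the client closed the connection
                    client.close();
                    client = null;
                }
            }
        }
    }
}

The loop keeps a CPU core busy just asking the kernel "is anything ready yet?", which is exactly the waste that the multiplexing model below removes.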
Multiplexing I/O
NIO in Java adopts the multiplexing mechanism, which has different implementations on different operating systems, such as select on Windows and epoll on Linux. The poll model is a slightly updated version of select and works roughly the same way. select came first, and because of some of its pain points, for example that on a 32-bit system a single process can open at most 1024 file descriptors (in Linux, I/O operations go through file descriptors), poll made some optimizations, such as removing the 1024 limit: there is no fixed cap on the number of file descriptors it can watch (it still depends on system resources). Both of those models still have a significant performance problem, which gave rise to epoll. More on that later.
select
- The file list corresponds to the sockets that have been created; in Linux, I/O operations are carried out through file descriptors
- The work queue is the set of processes waiting to be scheduled onto the CPU
- Blocking does not consume CPU resources
- When select is called the process blocks: it is removed from the work queue, the sockets to be monitored are passed into the kernel, and process A is placed on the wait queue of each of those sockets
When the kernel detects that some socket is ready, it removes process A from the sockets' wait queues and puts it back on the work queue, where it waits for the CPU.
Process A still does not know which socket is ready, so it has to iterate over all the sockets it passed in to see which ones are ready, and then handle them, as shown below.
Here is a piece of pseudocode to illustrate:
List<Socket> sockets = getSockets(); // sockets obtained from serverSocket.accept()
while (true) {
    // Block: pass all sockets into the kernel and let it check whether any data is ready
    // n indicates how many sockets are ready
    int n = select(sockets);
    for (int i = 0; i < sockets.length; i++) {
        // FD_ISSET checks the sockets one by one to see whether kernel data is ready
        if (FD_ISSET(sockets[i])) {
            // Handle each ready socket
            doSomething();
        }
    }
}
You can see some of the drawbacks of select:
- A single process can open at most 1024 file descriptors
- Monitoring sockets requires passing all of their file descriptors into the kernel and attaching the process to each of them on every call
- When it is woken up, the process does not know which sockets got data and has to traverse all of them again
poll
poll is similar to select with some optimizations, such as no fixed limit on the number of file descriptors a single process can open, and an underlying implementation based on a linked list.
epoll
epoll appeared a few years after select, and it optimizes heavily over select and poll, as follows.
Compared with select, epoll creates an eventpoll object inside the kernel; the sockets to be monitored are attached to this eventpoll rather than directly to the process, and the eventpoll maintains a ready list (rdList) together with a wait queue on which the process sleeps.
List<Socket> sockets = getSockets(); // sockets obtained from serverSocket.accept()
// Create the eventpoll
int epfd = epoll_create();
// Add all the sockets that need to be monitored to the eventpoll
epoll_ctl(epfd, sockets);
while (true) {
    // Block until some sockets are ready; n is how many
    int n = epoll_wait();
    // Only the sockets that received data are traversed; there is no need to scan all sockets
    // How is this done? See below
    for (each socket that received data) {
        doSomething();
    }
}
Wait queue
The wait queue works the same way as in the select case, except that the process is suspended on the eventpoll rather than on every socket. At this point process A is blocked, has been removed from the work queue, and needs to be woken up later.
Ready queue
The ready queue is the rdList mentioned above. It is a member of the eventpoll and records which sockets have data ready in the kernel. To achieve this, epoll_ctl() registers a callback for each socket; when a socket becomes ready, the callback adds it to rdList, which is a doubly linked list.
This allows us to retrieve data directly from rdList with a single system call without having to iterate over all sockets.
epoll improves the concurrency of the system: with the same limited resources it can serve more connections than select. Its advantages over select/poll can be summarized as follows:
- There is no need to pass all socket file descriptors into the kernel and detach them again on every call; each socket only needs to be registered once via epoll_ctl
- In the select/poll model, the process that is woken up does not know which sockets are ready and must iterate over all of them. epoll maintains rdList, into which ready sockets are inserted by callbacks, so the ready sockets can be fetched directly without scanning the others, which improves efficiency
Finally, consider the scenarios epoll suits: it works well as long as not too many sockets are in the ready list at the same moment and each one is handled quickly. Nginx, for example, processes each event extremely fast; if it created a thread for every request, that overhead could never support such high concurrency.
Finally, let's look at Netty, which is also built on the multiplexing model and uses epoll on Linux. How can it be used more efficiently? If handling a single socket takes relatively long, say 100 ms, it greatly reduces the concurrency this model can deliver. How do we deal with that? The code is as follows.
The code below comes from the first chapter of Flash's Netty book, Netty Getting Started and in Action: writing a WeChat-like IM instant messaging system.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.Set;

public class NIOServer {
    public static void main(String[] args) throws IOException {
        Selector serverSelector = Selector.open();
        Selector clientSelector = Selector.open();
        new Thread(() -> {
            try {
                // Start the server as in classic IO programming
                ServerSocketChannel listenerChannel = ServerSocketChannel.open();
                listenerChannel.socket().bind(new InetSocketAddress(8000));
                listenerChannel.configureBlocking(false);
                listenerChannel.register(serverSelector, SelectionKey.OP_ACCEPT);
                while (true) {
                    // Block until an accept event is ready
                    if (serverSelector.select() > 0) {
                        Set<SelectionKey> set = serverSelector.selectedKeys();
                        Iterator<SelectionKey> keyIterator = set.iterator();
                        while (keyIterator.hasNext()) {
                            SelectionKey key = keyIterator.next();
                            if (key.isAcceptable()) {
                                try {
                                    // Each time a new connection arrives, no thread needs to be created;
                                    // the channel is registered with clientSelector directly
                                    SocketChannel clientChannel = ((ServerSocketChannel) key.channel()).accept();
                                    clientChannel.configureBlocking(false);
                                    clientChannel.register(clientSelector, SelectionKey.OP_READ);
                                } finally {
                                    keyIterator.remove();
                                }
                            }
                        }
                    }
                }
            } catch (IOException ignored) {
            }
        }).start();
        new Thread(() -> {
            try {
                while (true) {
                    // Block until a read event is ready
                    if (clientSelector.select() > 0) {
                        Set<SelectionKey> set = clientSelector.selectedKeys();
                        Iterator<SelectionKey> keyIterator = set.iterator();
                        while (keyIterator.hasNext()) {
                            SelectionKey key = keyIterator.next();
                            if (key.isReadable()) {
                                try {
                                    SocketChannel clientChannel = (SocketChannel) key.channel();
                                    ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
                                    // Buffer-oriented: read the data into a ByteBuffer
                                    clientChannel.read(byteBuffer);
                                    byteBuffer.flip();
                                    System.out.println(Charset.defaultCharset().newDecoder().decode(byteBuffer)
                                            .toString());
                                } finally {
                                    keyIterator.remove();
                                    key.interestOps(SelectionKey.OP_READ);
                                }
                            }
                        }
                    }
                }
            } catch (IOException ignored) {
            }
        }).start();
    }
}
Let’s analyze the above code
- Use serverSelector to handle all client connection requests
- Use clientSelector to handle all read operations after successful client connections
- Register SelectionKey.OP_ACCEPT with serverSelector
This is equivalent to creating an eventpoll, monitoring the current serverSocket, and registering the ACCEPT (connection-establishment) event; the current thread is removed from the work queue and placed on the wait queue of that eventpoll
- serverSelector.select() > 0 indicates that a socket is ready to be connected
This is equivalent to epoll_wait returning the number of ready (connected) sockets; we then fetch the sockets in the ready queue via serverSelector.selectedKeys()
- We know that establishing a connection is fast, so once it is established the socket is registered with clientSelector for the READ event
This is equivalent to creating another eventpoll and passing in the sockets whose read events need monitoring (the sockets = getSockets() step); the current thread is then removed from the work queue, all the monitored sockets point to this eventpoll, and its wait queue holds the second new Thread
- Once a socket's read is ready, the rdList of that eventpoll has data in it and the waiting thread is woken up to process the data
Think about it: the thread that handles connections only binds the READ event to clientSelector, and establishing a connection is so fast that this cost is negligible. But fetching data on the clientSelector side usually also involves business logic, for example:
if (key.isReadable()) {
    doSomething();
}

void doSomething() throws InterruptedException {
    Thread.sleep(500);
}
If that happens, then because everything runs on a single thread, read-ready events on other sockets may not get a timely response. The common practice is therefore not to run time-consuming operations on this thread, since that would greatly reduce concurrency; relatively slow operations are handed off to a thread pool instead.
if (key.isReadable()) {
    // Time-consuming work is thrown into the thread pool
    executor.execute(task);
}
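As a concrete illustration, here is a sketch of how the read branch of the loop above could do that hand-off. The pool name businessPool, its size, and the doSomething() placeholder are assumptions, not part of the original code; the quick non-blocking read stays on the selector thread so the buffer is already filled before the task is queued:

// Requires java.util.concurrent.ExecutorService / Executors.
// Created once, outside the selector loop; the size 4 is an arbitrary assumption.
ExecutorService businessPool = Executors.newFixedThreadPool(4);

...

if (key.isReadable()) {
    SocketChannel clientChannel = (SocketChannel) key.channel();
    ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
    // The read itself is non-blocking and fast, so it stays on the selector thread
    clientChannel.read(byteBuffer);
    byteBuffer.flip();
    keyIterator.remove();
    // The slow business logic runs in the pool, so the selector thread
    // can immediately move on to the next ready key
    businessPool.execute(() -> doSomething(byteBuffer));
}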
When Netty is used, we write ServerBootstrap.group(bossGroup, workerGroup): the bossGroup handles connection establishment on one thread, while the workerGroup handles the subsequent I/O with a number of threads tied to the CPU core count (n * cores), which significantly improves concurrency.
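For reference, a minimal sketch of that typical Netty bootstrap; the port 8000, the single-threaded boss group, and the StringDecoder handler are illustrative choices, not something prescribed by the text:

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.string.StringDecoder;

public class NettyServerSketch {
    public static void main(String[] args) {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup workerGroup = new NioEventLoopGroup();  // handles I/O; sized from the CPU core count by default

        ServerBootstrap bootstrap = new ServerBootstrap();
        bootstrap.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        // Business handlers go here; slow work should still be
                        // dispatched to a separate executor, as discussed above
                        ch.pipeline().addLast(new StringDecoder());
                    }
                });
        bootstrap.bind(8000);
    }
}

In a real server you would normally wait on the bind future and shut both groups down gracefully; the sketch only shows how the two groups map onto the boss/worker split described above.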
In the end, Netty only spends a pool thread on a client when there is actual work to do for it; this is not the same as blocking I/O creating a thread for every connection.
The difference is this: with blocking I/O, one thread is created for each incoming connection (and even with a thread pool there is a lot of thread creation and maintenance overhead). With multiplexing, connection establishment is handled by a single thread, which then registers the accepted socket's read event with another selector. From the user's point of view, a client does not keep firing requests the whole time after connecting, so multiplexing pays off: once the connection is established, the Linux kernel maintains it and no thread has to be created for it. Only when a real read request arrives are resources allocated to execute it (handed to a thread pool if it is time-consuming). The number of requests actually arriving at any moment is far smaller than the number of established sockets, and the thread overhead of the pool is far lower than creating one thread per request.
However, multiplexing is not a good fit for scenarios where the ready queue is close to full every time it is fetched, i.e. where almost every connection is active at once.
Reference:
- What is the nature of epoll