How does single-threaded Redis handle so many concurrent client connections? Why is it single-threaded, and why is it fast?

Redis relies on IO multiplexing, implemented with epoll. Connection information and events are placed in a queue and fed one at a time to the file event dispatcher, which distributes them to event handlers. Redis runs in a single thread, so all operations are performed sequentially. However, a read or write blocks while it waits for input or output, so under normal circumstances an I/O call cannot return right away. One blocked file operation would then stall the whole process and leave it unable to serve other clients. I/O multiplexing exists to solve this problem.

I/O multiplexing is a mechanism that monitors multiple descriptors and, once a descriptor is ready (usually for reading or writing), tells the program to read or write accordingly. It is implemented by select, poll, and epoll. Multiple connections share one blocking object, and the application waits on that single object instead of on every connection. When a connection has new data to process, the operating system notifies the application, and the thread returns from the blocked state to begin processing.

Redis builds its network event handler on the Reactor pattern and calls it the file event handler (each network connection corresponds to a file descriptor). It consists of four parts:

  • Multiple sockets,
  • IO multiplexing program,
  • File event dispatcher,
  • Event handler.

Redis is called single-threaded because the queue of the file event dispatcher is consumed by a single thread.

Refer to Redis Design and Implementation, as shown in the figure below

I/O multiplexing model

1. I/O: network I/O.
2. Multi-way: multiple client connections (a connection is a socket descriptor, that is, a socket or channel).
3. Reuse: one thread or a small group of threads is reused.

This means that one thread or a group of threads can handle multiple TCP connections, and a single process can handle multiple client connections at the same time.

In a word: one server process can handle multiple socket descriptors simultaneously. Its development can be described in three stages: select -> poll -> epoll.

A short story: I had a meeting in the morning and missed lunch in the company canteen, so at noon I went with the company's chief architect to the rice noodle shop downstairs. When we arrived, there was a long line.

The architect immediately said: "Look at that queue! Doesn't the cashier look like an Nginx reverse proxy? She takes requests but doesn't process them; she forwards them to the kitchen." We paid, took our number, left the register, found a seat, and waited. Architect: "You see, this is asynchronous processing. We can walk away after ordering, and when the rice noodles are ready the speaker will 'call back' and we go pick up the food. If it were synchronous, we would have to stand at the counter until the meal came out, the requests behind us could not be handled, and customers would give up and leave."

Next, the architect stared at the paper number slip in his hand.

Architect: "Look, the kitchen, the 'server', has this number too. Isn't it a session ID? It tells us apart, so my spare-rib rice noodles won't go to someone else." After a while the queue grew longer and people became impatient, while the cashier was sweating and as busy as could be.

Architect: "You see, this system doesn't scale elastically. With this many people they should add cashier desks, but there is no spare equipment, so however anxious the boss is, it doesn't help." The boss saw that he could not help at the register, and orders were piling up in the kitchen, so he hurried to the kitchen to make rice noodles himself.

The architect spoke up again: "Fortunately the back end can process in parallel, so resources can be added at will to handle requests." I said: "These are all the resources he has. Apart from the boss, nobody else can make rice noodles." Before we knew it we had waited 20 minutes, and the rice noodles still had not come. Architect: "You see, the system has reached its processing limit." By now few people were queuing at the register, but many were still waiting for rice noodles.

The boss ran over, put the cleaner on the register, and sent the cashier to help in the kitchen. The cleaner fumbled at the register and was not as quick as the original cashier.

Architect: "This is called service degradation. To keep the rice noodle service running, all other services are shut down." After another 20 minutes, the chef in the kitchen called out: "No. 237, we are out of spare ribs for the rice noodles you ordered. Can we replace them with tomato?"

The architect whispered to me: "Look, too many people, and the system throws an exception." Then he stood up: "No. The system has to compensate: a refund."

With that he led me away, still hungry, without looking back.

The key concepts are as follows:

Synchronous: the caller must wait for the result of the call before it can continue. The call does not return until the result is known.

Asynchronous: the callee replies right away so the caller can move on, and then computes the result. When the final result is ready, the callee notifies the caller, usually through a callback.

Understanding synchronous and asynchronous: the distinction is about the callee (the service provider), with emphasis on how the result of the call is delivered.

Blocking: while waiting, the caller does nothing else; the current thread is suspended.

Non-blocking: the call returns immediately without suspending the current thread, so the caller can do something else in the meantime.

Understanding blocking and non-blocking: the distinction is about the caller (the service requester), focusing on its behavior while waiting for the result and whether it can do something else.

Conclusion:

  • Synchronous blocking: the waiter says "you are almost up, stay right here and I will seat you shortly." The customer waits at the reception desk of Haidilao hotpot, doing nothing.
  • Synchronous non-blocking: the waiter says "you are almost up, don't leave yet." The customer scrolls Douyin at the reception desk of Haidilao hotpot while waiting to be called.
  • Asynchronous blocking: the waiter says "it will be a while, go for a stroll and I will notify you later." The customer, afraid of missing the number, stays at the Haidilao hotpot reception holding the queue receipt, does nothing, and keeps waiting for the notice.
  • Asynchronous non-blocking: the waiter says "it will be a while, go for a stroll and I will notify you later." The customer takes the queue receipt and scrolls Douyin until the clerk sends the notice.
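
The combinations above can be sketched in Java. This is only an illustration and not part of the original walkthrough: `cookNoodles` is a hypothetical stand-in for a slow service, and a `CompletableFuture` plays the role of the queue receipt. It covers synchronous blocking, asynchronous non-blocking, and asynchronous blocking; the synchronous non-blocking case would correspond to polling the receipt in a loop.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

final class CallStyles {
    // Hypothetical slow service: stands in for the kitchen preparing an order.
    static String cookNoodles(int orderNo) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "noodles#" + orderNo;
    }

    // Synchronous + blocking: the caller waits inside the call and gets the result back.
    static String syncBlocking(int orderNo) {
        return cookNoodles(orderNo);
    }

    // Asynchronous: the call returns a future at once; the callback stores the result later.
    static CompletableFuture<Void> async(int orderNo, AtomicReference<String> table) {
        return CompletableFuture.supplyAsync(() -> cookNoodles(orderNo))
                                .thenAccept(table::set);
    }

    public static void main(String[] args) {
        System.out.println(syncBlocking(1));

        AtomicReference<String> table = new AtomicReference<>();
        CompletableFuture<Void> receipt = async(2, table);
        // Asynchronous + non-blocking: the caller is free to do other work here.
        System.out.println("doing something else while order 2 is cooked");
        // Asynchronous + blocking: the caller chooses to wait on the receipt anyway.
        receipt.join();
        System.out.println(table.get());
    }
}
```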

Five IO models in Unix network programming

  • Blocking IO
  • Non-blocking IO
  • IO multiplexing
  • Signal-driven IO
  • Asynchronous IO

BIO

When a user process calls recvfrom, the kernel begins the first phase of IO: preparing the data. For network IO, the data has often not arrived yet (for example, a complete UDP packet has not been received), so the kernel has to wait for enough data to arrive and be copied into an operating system kernel buffer. On the user side, the whole process is blocked during this time (by the process's own choice, of course). Once the data is ready, the kernel copies it from the kernel to user memory and returns the result; only then does the user process unblock and start running again. So BIO is typically blocked in both phases of IO execution.

First we demonstrate accept, that is, the socket server listening for client connections. The code is as follows: RedisServer

package com.zzyy.study.iomultiplex.one;

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/**
 * @author zzyy
 * @create 2020-12-06
 */
public class RedisServer
{
    public static void main(String[] args) throws IOException
    {
        byte[] bytes = new byte[1024];

        ServerSocket serverSocket = new ServerSocket(6379);

        while(true)
        {
            System.out.println("-----111 waiting for connection");
            Socket socket = serverSocket.accept();
            System.out.println("-----222 successfully connected");
        }
    }
}

RedisClient01 as follows

package com.zzyy.study.iomultiplex.one;

import java.io.IOException;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-06 10:20
 */
public class RedisClient01
{
    public static void main(String[] args) throws IOException
    {
        System.out.println("------RedisClient01 start");
        Socket socket = new Socket("127.0.0.1", 6379);
    }
}

RedisClient02 is as follows:

 
package com.zzyy.study.iomultiplex.one;

import java.io.IOException;
import java.net.Socket;

/**
 * @author zzyy
 * @create 2020-12-06 10:20
 */
public class RedisClient02
{
    public static void main(String[] args) throws IOException
    {
        System.out.println("------RedisClient02 start");
        Socket socket = new Socket("127.0.0.1", 6379);
    }
}

Then we demonstrate read, which reads information sent by the socket client. RedisServerBIO is as follows:

 
package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

/**
 * @author zzyy
 * @create 2020-12-08
 */
public class RedisServerBIO
{
    public static void main(String[] args) throws IOException
    {

        ServerSocket serverSocket = new ServerSocket(6379);

        while(true)
        {
            System.out.println("-----111 waiting for connection");
            Socket socket = serverSocket.accept();// block 1, waiting for the client to connect
            System.out.println("-----222 successfully connected");

            InputStream inputStream = socket.getInputStream();
            int length = -1;
            byte[] bytes = new byte[1024];
            System.out.println("-----333 waiting to read");
            while((length = inputStream.read(bytes)) != -1)// block 2, waiting for the client to send data
            {
                System.out.println("-----444 read successfully"+new String(bytes,0,length));
                System.out.println("====================");
                System.out.println();
            }
            inputStream.close();
            socket.close();
        }
    }
}

RedisClient01 is as follows:

 
package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-08 15:21
 */
public class RedisClient01
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();

        //socket.getOutputStream().write("RedisClient01".getBytes());

        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

RedisClient02 is as follows:

 
package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-08 15:21
 */
public class RedisClient02
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();

        //socket.getOutputStream().write("RedisClient01".getBytes());

        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

The problem with the above model: once a client has connected and does not send data, the server stays stuck in read(), and connections from other clients are not accepted. The server can serve only one client at a time, which is very unfriendly to clients.

Now that we know the problem, how do we solve it?

Multithreaded mode

Once a socket is connected, the server starts a dedicated thread to handle it, so that read() blocks only that thread and not the main thread.

The main thread is only responsible for listening for client connections and blocks in accept(). When client 1 connects to the server, a thread (thread1) is started to run read(), and the main thread goes back to listening.

When client 2 connects to the server, another thread (thread2) is started to run read(), and the main thread goes back to listening.

When client 3 connects to the server, another thread (thread3) is started to run read(), and the main thread goes back to listening, and so on.

As soon as data arrives on the socket of any of these threads, its read() returns and the data is processed by the CPU.

Code changes are as follows: RedisServerBIOMultiThread

package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

/**
 * @author zzyy
 * @create 2020-12-08
 */
public class RedisServerBIOMultiThread
{
    public static void main(String[] args) throws IOException
    {
        ServerSocket serverSocket = new ServerSocket(6379);

        while(true)
        {
            // System.out.println("-----111 waiting for connection");
            Socket socket = serverSocket.accept();// block 1, waiting for the client to connect
            // System.out.println("-----222 successfully connected");

            new Thread(() -> {
                try {
                    InputStream inputStream = socket.getInputStream();
                    int length = -1;
                    byte[] bytes = new byte[1024];
                    System.out.println("-----333 waiting to read");
                    while((length = inputStream.read(bytes)) != -1)// block 2, waiting for the client to send data
                    {
                        System.out.println("-----444 read successfully"+new String(bytes,0,length));
                        System.out.println("====================");
                        System.out.println();
                    }
                    inputStream.close();
                    socket.close();
                } catch(IOException e) {
                    e.printStackTrace();
                }
            },Thread.currentThread().getName()).start();

            System.out.println(Thread.currentThread().getName());
        }
    }
}

RedisClient01 is as follows:

 
package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-08 15:21
 */
public class RedisClient01
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();

        //socket.getOutputStream().write("RedisClient01".getBytes());

        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

RedisClient02 is as follows:

 
package com.zzyy.study.iomultiplex.bio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-08 15:21
 */
public class RedisClient02
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();

        //socket.getOutputStream().write("RedisClient01".getBytes());

        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

Existing problems

Multithreaded model: every client that connects gets its own thread, so 10,000 clients mean 10,000 threads. A user-mode program cannot create a thread on its own; it has to call into the kernel to create one, and that involves switching between user mode and kernel mode (a context switch), which consumes a lot of resources.

Now that we know the problem, how do we solve it?

Solutions

First: use thread pools

This works for a small number of client connections, but with a large number of users you cannot know how big the thread pool should be. If it is too large, memory may run out, so this is not feasible either.

Because read() blocks, multiple threads had to be started. If read() could be made non-blocking, there would be no need for multiple threads. That calls for another IO model, NIO.

Versions of Tomcat before Tomcat 7 used BIO with multiple threads to handle multiple connections.
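
A thread-pool variant of the BIO server might look like the sketch below. It is an assumption-laden illustration rather than the article's code: the class name `RedisServerBIOThreadPool`, the pool size of 4, and echoing the data back (instead of printing it) are all choices made here for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class RedisServerBIOThreadPool {
    // Reads from one client until it disconnects, echoing every chunk back.
    static void handle(Socket socket) {
        try (Socket s = socket;
             InputStream in = s.getInputStream();
             OutputStream out = s.getOutputStream()) {
            byte[] bytes = new byte[1024];
            int length;
            while ((length = in.read(bytes)) != -1) { // blocks only this pool thread
                out.write(bytes, 0, length);
                out.flush();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Accepts connections and hands each one to a bounded pool of worker threads.
    static void serve(ServerSocket serverSocket, int poolSize) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            while (true) {
                Socket socket = serverSocket.accept(); // blocks the accepting thread
                pool.execute(() -> handle(socket));
            }
        } finally {
            pool.shutdown(); // reached when accept() fails, e.g. the server socket is closed
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(6379)) {
            serve(serverSocket, 4); // pool size 4 is an arbitrary illustrative choice
        }
    }
}
```

With a pool of 4, a fifth client that connects while four idle clients hold the workers is accepted but not read from until a worker frees up, which is the sizing dilemma described above.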

NIO

When a user process issues a read and the data in the kernel is not ready, the kernel does not block the process but immediately returns an error. From the user process's point of view, a read never waits; it gets a result right away. When the result is an error, the process knows that the data is not ready and can issue the read again. Once the data is ready and the kernel receives the system call again, it copies the data to user memory and returns. So the defining feature of NIO is that the user process has to keep actively asking the kernel whether the data is ready.

In the non-blocking I/O model, the application sets a socket to non-blocking mode, which tells the kernel: when a requested I/O operation cannot be completed, do not put the process to sleep; return an error instead. The application's I/O code then keeps polling until the data is ready.

Summary answer for interviews

In NIO mode, everything is non-blocking:

  • The accept() method is non-blocking and returns an error if there is no client connection

  • The read() method is non-blocking and returns an error if there is no data to read; it blocks only for the time it takes to actually read data that is already there

In NIO mode there is only one thread: when a client connects to the server, its socket is added to an array, and the thread iterates over the array again and again to see whether read() on each socket returns data. In this way a single thread handles the connections and reads of multiple clients.

The BIO socket API above is blocking, so NIO is written against a different API, ServerSocketChannel. The RedisServerNIO code is as follows:

package com.zzyy.study.iomultiplex.nio;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;

/**
 * @author zzyy
 * @create 2020-12-06
 */
public class RedisServerNIO
{
    static ArrayList<SocketChannel> socketList = new ArrayList<>();
    static ByteBuffer byteBuffer = ByteBuffer.allocate(1024);

    public static void main(String[] args) throws IOException
    {
        System.out.println("---------RedisServerNIO launch waiting......");
        ServerSocketChannel serverSocket = ServerSocketChannel.open();
        serverSocket.bind(new InetSocketAddress("127.0.0.1", 6379));
        serverSocket.configureBlocking(false);// Set to non-blocking mode

        while (true)
        {
            for (SocketChannel element : socketList)
            {
                int read = element.read(byteBuffer);
                if(read > 0)
                {
                    System.out.println("----- Read data:"+read);
                    byteBuffer.flip();
                    byte[] bytes = new byte[read];
                    byteBuffer.get(bytes);
                    System.out.println(new String(bytes));
                    byteBuffer.clear();
                }
            }

            SocketChannel socketChannel = serverSocket.accept();
            if(socketChannel != null)
            {
                System.out.println("----- successful connection:");
                socketChannel.configureBlocking(false);// Set to non-blocking mode
                socketList.add(socketChannel);
                System.out.println("-----socketList size: "+socketList.size());
            }
        }
    }
}

The RedisClient01 code is as follows:

package com.zzyy.study.iomultiplex.nio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-06 10:20
 */
public class RedisClient01
{
    public static void main(String[] args) throws IOException
    {
        System.out.println("------RedisClient01 start");
        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();
        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

The RedisClient02 code is as follows:

package com.zzyy.study.iomultiplex.nio;

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Scanner;

/**
 * @author zzyy
 * @create 2020-12-06
 */
public class RedisClient02
{
    public static void main(String[] args) throws IOException
    {
        System.out.println("------RedisClient02 start");


        Socket socket = new Socket("127.0.0.1", 6379);
        OutputStream outputStream = socket.getOutputStream();

        while(true)
        {
            Scanner scanner = new Scanner(System.in);
            String string = scanner.next();
            if (string.equalsIgnoreCase("quit")) {
                break;
            }
            socket.getOutputStream().write(string.getBytes());
            System.out.println("------input quit keyword to finish......");
        }
        outputStream.close();
        socket.close();
    }
}

Existing problems, advantages, and disadvantages

NIO removes the need for BIO's one thread per connection. In NIO a single thread can handle multiple sockets, but two problems remain.

Problem 1: this model works well when there are few clients, but with many clients, say 10,000 connections, all 10,000 sockets are traversed on every pass. If only 10 of the 10,000 sockets have data, all 10,000 are still traversed, which is a lot of useless work. Every read that finds no data (it returns 0 in the Java code above) is still a wasted system call.

Problem 2: the traversal runs in user mode, but finding out whether a socket has data still means calling the kernel's read(). Every check therefore involves a switch between user mode and kernel mode, which is expensive.

Advantages: the process is not blocked while the kernel waits for data. Every I/O request returns immediately without waiting, so responsiveness is good.

Disadvantages: polling keeps asking the kernel, which takes up a lot of CPU time and makes poor use of system resources, so web servers generally do not use this I/O model.

Conclusion: let the Linux kernel do this work. We pass a batch of file descriptors to the kernel in one system call and the kernel does the traversal, which truly solves the problem. This is where IO multiplexing comes in: it moves the work above into the Linux kernel, so there is no switch between the two modes for every socket, and the results come straight from the kernel without blocking on any single socket.

Problem escalation: how can a single thread handle a large number of connections?

IO Multiplexing

IO multiplexing is what select, poll, and epoll provide, and it is also called event-driven IO. It is a mechanism by which a single process can monitor multiple descriptors; when a descriptor becomes ready (usually for reading or writing), the program is told to read or write accordingly. Instead of using multiple threads (one thread per file descriptor, a new thread every time), the process waits on one blocking object for readiness on many descriptors at once, which greatly saves system resources. So the defining feature of I/O multiplexing is that a process can wait on multiple file descriptors at the same time, and as soon as any one of them (any socket descriptor) becomes ready to read, the select() function returns.

In I/O multiplexing, "multiplexing" means that a single thread manages multiple I/O streams at the same time by recording the state of each socket (I/O stream). The goal is to increase the throughput of the server as much as possible. Nginx is a familiar example: it uses epoll to receive requests. Nginx has many connections coming in, epoll monitors them all, and then, like a rotary switch, whichever connection has data gets the corresponding code called to process it. Redis works in a similar way.

A file descriptor is a computer science term, an abstract concept used to describe a reference to a file. Formally, a file descriptor is a non-negative integer. In fact, it is an index into the table of open files that the kernel maintains for each process. When a program opens an existing file or creates a new file, the kernel returns a file descriptor to the process. Low-level programming often revolves around file descriptors. However, the concept usually applies only to operating systems such as UNIX and Linux.

In plain language:

Simulate a TCP server to handle 30 client sockets. Suppose you are an invigilator and ask 30 students to solve a competition question. Then you are responsible for checking the students’ answers. You have several choices:

The first option: check the answers one by one in order, first A, then B, then C, D, and so on. If one student gets stuck, the whole class is delayed. This is looping through the sockets one by one, with no concurrency at all.

Option two: you create 30 clones of yourself (threads), each of which checks one student's answer. This is like creating a process or thread for each user to handle the connection.

The third option is, you stand on the stage and whoever answers the question raises their hand. At this time, C and D raise their hands, indicating that they have answered the question, you go down to check the answers of C and D in turn, and then continue to return to the platform. At this point, E and A raise their hands again, and then deal with E and A… This is the IO multiplexing model. Select, poll, and epoll under Linux do just that.

Register the fds of the user sockets with epoll, and epoll monitors which sockets have messages arriving, which avoids a lot of useless operations. The sockets should be in non-blocking mode. In this way the process blocks only in the call to select, poll, or epoll; sending and receiving client messages never blocks, so the process or thread is fully used. This is the event-driven approach, the so-called Reactor pattern.

The Reactor design pattern is based on the I/O multiplexing model: multiple connections share one blocking object, and the application waits on that single object instead of on every connection. When a connection has new data to process, the operating system notifies the application, and the thread returns from the blocked state to begin processing.
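
In Java, the shared blocking object is a `java.nio.channels.Selector` (on Linux the JDK's default selector is typically backed by epoll). The sketch below is a minimal single-threaded server under that assumption; the class name `RedisServerSelector` and the `onMessage` callback parameter are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.function.Consumer;

final class RedisServerSelector {
    // Runs the event loop on the calling thread; onMessage receives each chunk read.
    static void serve(Selector selector, ServerSocketChannel server,
                      Consumer<String> onMessage) throws IOException {
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(1024);

        while (true) {
            selector.select(); // the single blocking point: wait until some channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove(); // the application must clear handled keys itself
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    readFrom((SocketChannel) key.channel(), buffer, onMessage);
                }
            }
        }
    }

    // Reads whatever is available; closes the channel on end of stream or I/O error.
    static void readFrom(SocketChannel client, ByteBuffer buffer, Consumer<String> onMessage) {
        try {
            buffer.clear();
            int read = client.read(buffer);
            if (read == -1) {
                client.close(); // closing the channel also cancels its key
            } else if (read > 0) {
                buffer.flip();
                onMessage.accept(StandardCharsets.UTF_8.decode(buffer).toString());
            }
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 6379));
            serve(selector, server, System.out::println);
        }
    }
}
```

Unlike the polling NIO server, this loop only touches channels that the selector reports as ready, so idle connections cost no read() calls.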

The Reactor pattern is an event-driven pattern for handling service requests that are delivered concurrently to a service handler through one or more inputs. It is also called the Dispatcher pattern. The service handler demultiplexes the incoming requests and dispatches them synchronously to the handlers that process them. In other words, I/O multiplexing listens for events in one place and, when an event arrives, dispatches it (to a process or thread). This is an essential technique for writing high-performance network servers.

There are two key components in the Reactor pattern:

1) Reactor: the Reactor runs in a separate thread, listening for events and distributing them to the appropriate handlers to react to IO events. It is like a company telephone operator who takes calls from customers and transfers each line to the appropriate contact.

2) Handlers: the handlers do the actual work that an I/O event calls for, like the person in the company whom the customer actually wants to talk to. The Reactor responds to I/O events by dispatching the appropriate handlers, and the handlers perform non-blocking actions.

Why is Redis single threaded

The Redis service uses the Reactor pattern to implement its file event handler (each network connection corresponds to a file descriptor).

Redis developed the network event handler based on the Reactor pattern, which is called the file event handler. It consists of four parts:

  • Multiple sockets,
  • IO multiplexing program,
  • File event dispatcher,
  • Event handler.

Redis is called single-threaded because the queue of the file event dispatcher is consumed by a single thread.
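
How the four parts fit together can be modelled in a few lines of Java. This is a toy, in-memory sketch and not Redis source code (Redis itself is written in C): the class `FileEventLoopSketch`, the `EventType` values, and the fd numbers are all hypothetical. The multiplexer is reduced to a method that enqueues ready events, and the dispatcher drains the queue on one thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

final class FileEventLoopSketch {
    enum EventType { ACCEPT, READ, WRITE }

    // One ready event on one socket, as the multiplexer would report it.
    static final class FileEvent {
        final int fd;
        final EventType type;

        FileEvent(int fd, EventType type) {
            this.fd = fd;
            this.type = type;
        }
    }

    private final Queue<FileEvent> ready = new ArrayDeque<>();
    private final Map<EventType, Consumer<FileEvent>> handlers = new EnumMap<>(EventType.class);

    // Associates an event handler with an event type.
    void register(EventType type, Consumer<FileEvent> handler) {
        handlers.put(type, handler);
    }

    // Stand-in for the IO multiplexing program: it queues sockets that became ready.
    void enqueue(FileEvent event) {
        ready.add(event);
    }

    // File event dispatcher: one thread drains the queue, one event at a time, in order.
    void dispatchAll() {
        FileEvent event;
        while ((event = ready.poll()) != null) {
            Consumer<FileEvent> handler = handlers.get(event.type);
            if (handler != null) {
                handler.accept(event);
            }
        }
    }

    // Feeds four events through the loop and returns what the handlers logged.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        FileEventLoopSketch loop = new FileEventLoopSketch();
        loop.register(EventType.ACCEPT, e -> log.add("accept fd=" + e.fd));
        loop.register(EventType.READ, e -> log.add("read fd=" + e.fd));
        loop.register(EventType.WRITE, e -> log.add("reply fd=" + e.fd));
        loop.enqueue(new FileEvent(5, EventType.ACCEPT));
        loop.enqueue(new FileEvent(5, EventType.READ));
        loop.enqueue(new FileEvent(6, EventType.ACCEPT));
        loop.enqueue(new FileEvent(5, EventType.WRITE));
        loop.dispatchAll();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because one thread consumes the queue, handlers never run concurrently, so they need no locks; the price is that a slow handler delays every event queued behind it.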

Select, poll, and epoll are all concrete implementations of I/O multiplexing

I/O multiplexing means that once the kernel finds that one or more of the I/O conditions a process specified are ready, it notifies the process. In other words, it is a mechanism that monitors multiple descriptors and, once a descriptor is ready (usually for reading or writing), tells the program to perform the read or write. It is implemented by select, poll, and epoll.

Multiple connections share a blocking object, and the application waits on only one blocking object instead of all connections.

When a connection has new data to process, the operating system notifies the application, and the thread returns from the blocked state to begin processing the business.

The select method

Select was the first implementation (implemented in BSD circa 1983)

C language code is as follows: [listing missing from the source]

Advantages

select improves on NIO by copying the fd array that user mode used to traverse into the kernel and letting the kernel traverse it. In NIO, user mode had to call into the kernel for every socket to find out whether it had data. With select there is no need to switch between user mode and kernel mode for every socket during the traversal.

When the select system call returns, &rset has been filled in, so the user can quickly tell which sockets have data to read with a simple bit test, which improves efficiency.

Problems

1. The bitmap is at most 1024 bits by default, so a single select call can monitor at most 1024 file descriptors.

2. &rset is not reusable: select overwrites it, setting the bits of the sockets that have data, so it must be reinitialized before every call.

3. The file descriptor array still has to be copied into the kernel. This avoids the context-switch overhead of one system call per descriptor, but the copy itself is overhead, and in high-concurrency scenarios it can consume a lot of resources. (The kernel could be optimized to use asynchronous event notification, so that no copy is needed.)

4. select does not tell user space which socket has data. It only returns the number of ready file descriptors, so the program still needs an O(n) traversal to find them. (This could be optimized to return only the ready file descriptors, so that the program does no useless traversal.)

Our own exercise, redisservernio.java, does the same thing, except that select moves the traversal into the kernel.

A short summary of select

select allows one thread to handle multiple client connections (file descriptors) while reducing the number of system calls: for all the monitored file descriptors there is only one select system call, plus one read system call for each ready file descriptor.
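The saving in system calls can be illustrated with a rough sketch. The helper names (`poll_by_hand`, `poll_with_select`, `demo`) are hypothetical, and the call counts are tallied by the sketch itself rather than measured. Note one difference from the C API: Python's `select.select` returns the ready sockets as a list, whereas in C the caller tests every descriptor with `FD_ISSET`; the loop in `poll_with_select` mirrors that O(n) check.

```python
import select
import socket


def poll_by_hand(servers):
    """Non-blocking loop: one recv() call per socket, ready or not."""
    syscalls, received = 0, {}
    for index, sock in enumerate(servers):
        syscalls += 1
        try:
            received[index] = sock.recv(1024)
        except BlockingIOError:
            pass  # no data on this socket; the call was wasted
    return syscalls, received


def poll_with_select(servers):
    """One select() call, then one recv() call per ready socket."""
    syscalls, received = 1, {}
    readable, _, _ = select.select(servers, [], [], 1.0)
    # Mirror the C-style O(n) scan: check every monitored socket.
    for index, sock in enumerate(servers):
        if sock in readable:
            syscalls += 1
            received[index] = sock.recv(1024)
    return syscalls, received


def demo(total=8, senders=(1, 5)):
    # socketpair() stands in for client connections (an assumption).
    pairs = [socket.socketpair() for _ in range(total)]
    servers = [server for _client, server in pairs]
    for server in servers:
        server.setblocking(False)

    for i in senders:
        pairs[i][0].sendall(b"a%d" % i)
    by_hand = poll_by_hand(servers)

    for i in senders:
        pairs[i][0].sendall(b"b%d" % i)
    with_select = poll_with_select(servers)

    for client, server in pairs:
        client.close()
        server.close()
    return by_hand, with_select
```

With 8 connections and 2 senders, the hand-written loop makes 8 calls while the select version makes 3 (one select plus two reads). The traversal over all sockets is still there, which is problem 4 above.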

Poll method

Poll was implemented in 1997

[Figure: C code example for poll]

Advantages

1. poll uses a pollfd array instead of the bitmap used by select. The array has no 1024 limit, so more clients can be managed at once. Removing the limit of 1024 file descriptors is the main difference from select.

2. When an event occurs on an entry in the pollfds array, the corresponding revents field is set and is then reset to zero during traversal, so the pollfds array can be reused.

Problems

poll solves the first two shortcomings of select, but it is essentially the same approach as select, so the other problems of select remain:

  • The pollfds array is still copied into the kernel, which has overhead.
  • poll does not tell user space which socket has data, so an O(n) traversal is still required.
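A minimal sketch of the reuse that poll allows, assuming a Unix-like system where Python's `select.poll` is available. `poll_demo` is a hypothetical helper and `socket.socketpair()` stands in for client connections. Python's wrapper keeps the registrations and returns only the entries whose revents is non-zero; in C the program would scan the whole pollfd array itself.

```python
import select
import socket


def poll_demo(total=5, first_senders=(0, 3), second_sender=4):
    pairs = [socket.socketpair() for _ in range(total)]
    fd_to_index = {server.fileno(): i for i, (_client, server) in enumerate(pairs)}

    # Register every descriptor once; the set is reused for every poll() call.
    poller = select.poll()
    for fd in fd_to_index:
        poller.register(fd, select.POLLIN)

    def ready_indexes():
        events = poller.poll(1000)  # timeout in milliseconds
        return sorted(fd_to_index[fd] for fd, revents in events
                      if revents & select.POLLIN)

    for i in first_senders:
        pairs[i][0].sendall(b"x")
    first = ready_indexes()
    for i in first:
        pairs[i][1].recv(1024)  # drain so these are no longer readable

    # No re-registration is needed before waiting again.
    pairs[second_sender][0].sendall(b"y")
    second = ready_indexes()

    for client, server in pairs:
        client.close()
        server.close()
    return first, second
```

Unlike the `&rset` bitmap of select, nothing has to be rebuilt between the two waits.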

Epoll method

epoll was invented by Davide Libenzi in 2002

Three calls

  • epoll_create: creates an epoll handle
  • epoll_ctl: adds, modifies, or deletes the file descriptors that the kernel monitors
  • epoll_wait: waits for events on the monitored descriptors, similar to the select() call
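The three calls can be tried through Python's `select.epoll` wrapper, which is only available on Linux. In this sketch `epoll_demo` is a hypothetical helper and `socket.socketpair()` stands in for client connections; the comments show which underlying call each method corresponds to.

```python
import select
import socket


def epoll_demo(total=6, senders=(2, 4)):
    pairs = [socket.socketpair() for _ in range(total)]
    fd_to_index = {server.fileno(): i for i, (_client, server) in enumerate(pairs)}

    ep = select.epoll()                          # epoll_create
    for fd in fd_to_index:
        ep.register(fd, select.EPOLLIN)          # epoll_ctl, add

    for i in senders:
        pairs[i][0].sendall(b"data")

    # epoll_wait: only descriptors with pending events are returned.
    events = ep.poll(1.0)                        # timeout in seconds
    ready = sorted(fd_to_index[fd] for fd, _mask in events)

    # Stop monitoring the first sender. The other one still has unread
    # data, so in the default level-triggered mode it is reported again.
    ep.unregister(pairs[senders[0]][1].fileno())  # epoll_ctl, delete
    after_removal = sorted(fd_to_index[fd] for fd, _mask in ep.poll(1.0))

    ep.close()
    for client, server in pairs:
        client.close()
        server.close()
    return ready, after_removal
```

The result lists contain only the ready connections, so the program does not traverse the four idle ones.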

[Figure: C code example for epoll]

Event notification mechanism

1. When data arrives at the network card (NIC), it is first written by DMA into a buffer in memory that the NIC can access directly.

2. The NIC raises an interrupt to the CPU, so that the CPU services the NIC first.

3. The interrupt number is bound to a callback in memory. Whichever socket has data, the callback function puts that socket into the ready list.
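The steps above can be modelled with a toy data structure. This is a pure-Python simulation for illustration only, not kernel code: `ToyEpoll`, `on_interrupt`, and `simulate` are hypothetical names, and the "interrupt" is just a method call.

```python
from collections import deque


class ToyEpoll:
    """Toy model of the epoll ready list (illustrative only)."""

    def __init__(self):
        self.buffers = {}     # fd -> bytes received but not yet read
        self.ready = deque()  # fds that currently have unread data

    def add(self, fd):
        self.buffers[fd] = bytearray()

    def on_interrupt(self, fd, payload):
        # Callback bound to the simulated interrupt: store the data and
        # put the socket on the ready list (once).
        buffer = self.buffers[fd]
        was_empty = not buffer
        buffer.extend(payload)
        if was_empty:
            self.ready.append(fd)

    def wait(self):
        # Hand back only the ready sockets; the cost depends on how many
        # are ready, not on how many are monitored.
        result = []
        while self.ready:
            fd = self.ready.popleft()
            result.append((fd, bytes(self.buffers[fd])))
            self.buffers[fd].clear()
        return result


def simulate():
    ep = ToyEpoll()
    for fd in range(1000):
        ep.add(fd)
    ep.on_interrupt(7, b"a")
    ep.on_interrupt(42, b"c")
    ep.on_interrupt(7, b"b")  # more data for a socket that is already queued
    return ep.wait(), ep.wait()
```

Although 1000 descriptors are monitored, `wait` touches only the two that received data.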

Conclusion

The reason multiplexing is fast is that the operating system provides system calls that replace the many system calls made in a while loop with a single system call, after which the kernel traverses the file descriptors. epoll is the most advanced I/O multiplexer available today, and on Linux it is used by Redis, Nginx, and Java NIO. “Multi-channel” refers to the multiple network connections, and “multiplexing” refers to reusing the same thread. epoll has two advantages:

1. In the life cycle of a socket there is only one copy from user space to the kernel, so the cost is small.

2. It uses an event notification mechanism: whenever a socket has data, the kernel is notified and the socket is added to the ready list, so there is no need to traverse all the sockets.

In the multiplexed I/O model, a kernel thread continually polls the state of multiple sockets, and the actual read and write operations are invoked only when real read or write events occur. Because a single thread can manage multiple sockets, the system does not need to create new processes or threads or to maintain them, and I/O resources are used only when there really are read or write events, which greatly reduces resource usage.

The multiplexed I/O model relies on the ability of select, poll, and epoll to monitor the I/O events of multiple streams at the same time. When idle, the current thread blocks. When one or more streams have I/O events, the thread wakes up from the blocked state, and the program then goes through the streams (with epoll, only the streams that actually raised an event) and handles the ready streams in sequence, which avoids a lot of useless work.

Multiplexed I/O lets a single thread handle many connection requests efficiently (minimizing the time spent on network I/O), and Redis manipulates data in memory very quickly, so in-memory operations do not become a bottleneck for Redis performance.

The three methods are compared as follows:

Conclusion

The reason multiplexing is fast is that the operating system provides system calls that replace the many system calls made in a while loop with a single system call, after which the kernel traverses the file descriptors.

Why do you keep all three?