Understanding of I/O
There are several types of I/O, including memory I/O and disk I/O; here we are talking about network I/O. Because communication between processes on different hosts is programmed with sockets, network I/O essentially means reading from and writing to sockets.
A network I/O request generally consists of two steps:
- Step 1: Wait for the data to arrive from the network. When the data arrives at the network card, it is copied into a buffer in kernel memory, at which point the data is ready.
- Step 2: Copy the data from the kernel buffer into the memory of the user-mode process.
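The two steps can be observed on a local socket pair. The following is only a sketch, assuming Linux, where the FIONREAD ioctl reports how many bytes are waiting in a socket's kernel buffer; the helper names bytes_in_kernel_buffer and two_step_demo are made up for illustration.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bytes currently waiting in the kernel receive buffer of fd, or -1 on error. */
static int bytes_in_kernel_buffer(int fd) {
    int n = 0;
    return ioctl(fd, FIONREAD, &n) == 0 ? n : -1;
}

/* Returns 0 if both steps behaved as described, -1 otherwise. */
static int two_step_demo(void) {
    int sv[2];
    char user_buf[16];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    /* Step 1: the data "arrives" and sits in the kernel buffer of sv[0]. */
    if (write(sv[1], "hello", 5) == 5 && bytes_in_kernel_buffer(sv[0]) == 5) {
        /* Step 2: read() copies it from the kernel buffer into user memory. */
        if (read(sv[0], user_buf, sizeof user_buf) == 5 &&
            bytes_in_kernel_buffer(sv[0]) == 0)
            rc = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

A local socket pair stands in for a real network connection here, so step 1 completes instantly; on a real network, that wait is where most of the time goes.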
The concepts involved in the I/O model
Discussions of I/O models usually involve four terms: blocking, non-blocking, synchronous, and asynchronous. Let’s first pin down what each of them means, to better understand the I/O models.
Blocking and non-blocking describe whether a call waits for its result or returns right away.
Blocking means that an I/O call must finish all of its work before it returns to user mode. The calling process is suspended until the result comes back, and it does not re-enter the work queue (runnable state) until the operation is complete.
Non-blocking means that a status code is returned immediately after an I/O operation is invoked, without waiting for the operation to complete. The caller is not suspended while the result is pending, so it has to issue the I/O call again each time it gets to run.
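As a small illustration of "a status code is returned immediately", the sketch below puts one end of a socket pair into non-blocking mode and reads from it while it is empty. The helper names set_nonblocking and read_empty_socket_errno are hypothetical; fcntl with O_NONBLOCK is the standard POSIX way to switch modes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Switch fd to non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Reads from an empty non-blocking socket and returns the errno that
 * read() reported (0 if the read unexpectedly succeeded). */
static int read_empty_socket_errno(void) {
    int sv[2];
    char buf[8];
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return errno;
    if (set_nonblocking(sv[0]) == 0) {
        ssize_t n = read(sv[0], buf, sizeof buf);  /* returns at once */
        err = (n == -1) ? errno : 0;
    } else {
        err = errno;
    }
    close(sv[0]);
    close(sv[1]);
    return err;
}
```

The value returned should be EAGAIN (or EWOULDBLOCK, which is the same number on Linux): the kernel's way of saying "nothing yet, try again later".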
Synchronous and asynchronous describe how the result of a call is communicated.
A synchronous call waits until the function returns. It is like a phone call: as long as neither of us has hung up, we each wait politely for the other to finish speaking and do not interrupt.
An asynchronous call returns right away without waiting for the result, and the caller goes on with other work; the result is delivered later through a status, a signal, and so on. It is like text messaging: I send you a message and, if you have not replied, I do not wait around; I go back to writing code. You text me when you want to reply!
What are the I/O models
There are five I/O models.
- Blocking I/O
- Non-blocking I/O
- I/O multiplexing
- Signal-driven I/O
- Asynchronous I/O
Blocking I/O model
An application makes a system call to read a socket. If there is no data in the kernel yet, the application blocks and does nothing until the data has been copied from the kernel buffer to the application buffer.
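The blocking behaviour can be measured with a sketch like the one below, assuming a POSIX system. A child process plays the remote peer and sends data only after a delay, so the parent's read() stays blocked for roughly that long. The function name blocking_read_demo and the use of a forked child as the "network" are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static long elapsed_ms(const struct timespec *a, const struct timespec *b) {
    return (b->tv_sec - a->tv_sec) * 1000L +
           (b->tv_nsec - a->tv_nsec) / 1000000L;
}

/* Returns roughly how many milliseconds passed before read() returned,
 * or -1 on error. */
static long blocking_read_demo(long delay_ms) {
    int sv[2];
    char buf[8];
    struct timespec t0, t1;
    ssize_t n;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    pid = fork();
    if (pid < 0) { close(sv[0]); close(sv[1]); return -1; }
    if (pid == 0) {
        /* Child: pretend to be a slow peer, then send two bytes. */
        struct timespec d = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
        nanosleep(&d, NULL);
        _exit(write(sv[1], "hi", 2) == 2 ? 0 : 1);
    }

    n = read(sv[0], buf, sizeof buf);  /* the parent is suspended here */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    waitpid(pid, NULL, 0);
    close(sv[0]);
    close(sv[1]);
    return n == 2 ? elapsed_ms(&t0, &t1) : -1;
}
```

Calling blocking_read_demo(100) should report a value of about 100 or slightly more: the whole delay is spent inside read(), during which the process does no other work.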
Non-blocking I/O model
When a process makes an I/O call and no data is ready, the system returns an error code instead of blocking. The application process can continue to execute, but it has to keep issuing the system call to find out whether the I/O can complete. Once there is data in the kernel buffer, the kernel returns it to the process. Although every I/O request returns immediately, the repeated requests consume a large amount of CPU.
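The polling loop described above might look like the following sketch. To keep it in a single process, the "peer" writes its data after a chosen number of failed attempts; that trick, and the name busy_poll_demo, are assumptions of the example rather than part of any real server.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Polls a non-blocking socket until data shows up. The data is written
 * after `arrive_after` failed attempts. Returns the number of wasted
 * read() calls, or -1 on error. */
static int busy_poll_demo(int arrive_after) {
    int sv[2], flags, wasted = 0;
    char buf[8];

    if (arrive_after < 1) return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = read(sv[0], buf, sizeof buf);
        if (n > 0) break;                      /* data copied to user space */
        if (n == 0 || (errno != EAGAIN && errno != EWOULDBLOCK)) {
            wasted = -1;                       /* peer closed or real error */
            break;
        }
        wasted++;                              /* nothing ready: error code */
        if (wasted == arrive_after && write(sv[1], "x", 1) != 1) {
            wasted = -1;
            break;
        }
    }
    close(sv[0]);
    close(sv[1]);
    return wasted;
}
```

Every iteration that ends in EAGAIN is a full system call that achieved nothing, which is where the CPU cost of this model comes from.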
Signal-driven I/O model
When a process initiates an I/O operation, it registers a signal handler with the kernel and returns without blocking. When the data is ready in the kernel, the kernel sends a signal to the process, which then makes the I/O call inside the signal handler to read the data.
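On Linux this model is usually set up with SIGIO: the process installs a handler, makes itself the owner of the socket with F_SETOWN, and turns on O_ASYNC. The sketch below assumes Linux/glibc; the globals and the names on_sigio and sigio_demo are hypothetical, and a local socket pair again stands in for the network.

```c
#define _GNU_SOURCE   /* assumption: Linux/glibc, needed for O_ASYNC */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static int g_fd = -1;                       /* socket the handler reads   */
static char g_buf[16];
static volatile sig_atomic_t g_nread = 0;   /* bytes read in the handler  */

/* SIGIO handler: the kernel says data is ready, so read it here. */
static void on_sigio(int signo) {
    int saved_errno = errno;
    ssize_t n;
    (void)signo;
    n = read(g_fd, g_buf, sizeof g_buf);    /* read() is async-signal-safe */
    if (n > 0) g_nread = (sig_atomic_t)n;
    errno = saved_errno;
}

/* Returns 0 if the handler received "ping", -1 otherwise. */
static int sigio_demo(void) {
    int sv[2], flags, installed = 0, rc = -1;
    struct sigaction sa, old;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    g_fd = sv[0];
    g_nread = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, &old) != 0) goto out;
    installed = 1;

    /* Ask the kernel to signal this process when sv[0] becomes readable. */
    if (fcntl(sv[0], F_SETOWN, getpid()) == -1) goto out;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 ||
        fcntl(sv[0], F_SETFL, flags | O_ASYNC | O_NONBLOCK) == -1) goto out;

    /* Registration returned at once; now the "network" delivers data. */
    if (write(sv[1], "ping", 4) != 4) goto out;

    /* Normally the signal has already been handled; wait briefly if not. */
    for (int i = 0; i < 100 && g_nread == 0; i++) {
        struct timespec ts = { 0, 1000000L };
        nanosleep(&ts, NULL);
    }
    if (g_nread == 4 && memcmp(g_buf, "ping", 4) == 0) rc = 0;
out:
    close(sv[0]);
    close(sv[1]);
    if (installed) sigaction(SIGIO, &old, NULL);
    return rc;
}
```

Note that the copy from the kernel buffer (the read() in the handler) is still performed by the process itself; only the waiting has been handed to the kernel.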
Asynchronous I/O model
When a process initiates an I/O operation, the call returns immediately without blocking, but also without the result. After the kernel has completed the entire I/O operation, it notifies the process of the result, and if the operation succeeded the process can use the data directly. This resembles signal-driven I/O in that the kernel notifies the application when an event occurs. The difference is that signal-driven I/O tells the application when it can start the I/O, while asynchronous I/O hands the whole operation to the kernel, which notifies the application process when it is done.
I/O multiplexing model
With basic socket programming, a server can only communicate one-to-one: it cannot accept another client while it is still processing I/O from the current one. A server that can only deal with one client at a time is a waste of resources! So how do we handle multiple clients, especially 10,000+ client requests?
It’s tempting to fork multiple processes to handle multiple sockets (i.e., multiple client connections), or to create multiple threads to handle them. However, with 10,000+ connections the server would have to maintain tens of thousands of processes or threads, which consumes a lot of server resources, and process/thread context switches are expensive as well.
This is where I/O multiplexing comes in, allowing a single process to handle multiple client connection requests. So what magic does it use to handle multiple connections? Let’s unveil it now.
It does this by calling a function that asks the kernel to monitor a set of sockets and report the ones that are ready; the application then iterates over them and handles the reads and writes on the ready sockets. Copying the set and traversing it makes each individual operation slightly slower than in the one-to-one case, but overall throughput is much higher, because many sockets are served in the same period instead of one after another. This is a bit like how a CPU runs many processes concurrently. The application process obtains the ready sockets from the kernel through the system calls select, poll, and epoll.
In the I/O multiplexing model, a single process manages multiple sockets, and actual read and write calls are made only when a socket has a real read or write event, so far fewer resources are needed.
select
select puts the file descriptors of the sockets taken from the TCP full-connection queue into a set and copies the set into the kernel. The kernel polls the sockets for read/write events, marks the sockets that are readable or writable, and copies the entire file descriptor set back to user space. When select returns, the application has to iterate over the set again to find which descriptors are readable or writable and then process them.
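A minimal sketch of this flow, using socket pairs in place of accepted TCP connections: wait_readable builds the fd_set, calls select, and then scans the set again to find the ready descriptors. Both function names are hypothetical; the FD_* macros and select itself are the standard POSIX interface.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Waits on n descriptors with select(); stores the indexes of the readable
 * ones in ready[] and returns how many there are (-1 on error). */
static int wait_readable(const int fds[], int n, int ready[], int timeout_ms) {
    fd_set rset;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    int maxfd = -1, count = 0;

    FD_ZERO(&rset);
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE) return -1;
        FD_SET(fds[i], &rset);          /* this set is copied to the kernel */
        if (fds[i] > maxfd) maxfd = fds[i];
    }
    if (select(maxfd + 1, &rset, NULL, NULL, &tv) < 0) return -1;

    /* select() only reports that something is ready: scan the set again. */
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset)) ready[count++] = i;
    return count;
}

/* Three socket pairs, data only on the second one.
 * Returns the index reported as readable, or -1 on error. */
static int select_demo(void) {
    int sv[3][2], fds[3], ready[3], made = 0, result = -1;

    for (; made < 3; made++) {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv[made]) != 0) goto done;
        fds[made] = sv[made][0];
    }
    if (write(sv[1][1], "x", 1) == 1 && wait_readable(fds, 3, ready, 1000) == 1)
        result = ready[0];
done:
    for (int i = 0; i < made; i++) { close(sv[i][0]); close(sv[i][1]); }
    return result;
}
```

Because select overwrites the set it is given, a real event loop has to rebuild (or copy) the fd_set before every call, which is part of the per-call cost discussed below.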
The specific process
To go into a little more detail, this involves some knowledge of operating system scheduling and interrupts.
When an application process calls select, it traps into kernel mode. The kernel polls the sockets for read/write events; if there are none, it places the calling process on the wait queue of each socket being checked (a socket consists of a write buffer, a read buffer, and a wait queue). In other words, the process is suspended and the CPU switches to another process.
Once an event occurs on any socket, that is, when a network packet arrives, it triggers the interrupt that signals completed network data transfer. The CPU executes the interrupt handler, which works out which socket the packet belongs to (from the port number in the TCP header) and puts the packet into that socket’s read buffer. The handler then checks whether any process is waiting in the socket’s wait queue; if so, it moves the waiting process back to the work queue, and the interrupt ends. Control of the CPU returns to user mode. The formerly suspended process, now back in the work queue, gets another chance at a CPU time slice. When it runs, it executes select again to check the sockets for read/write events and marks those that have them as readable or writable.
Several disadvantages:
- A fixed-size bitmap is used to represent the set of file descriptors, so the number of supported file descriptors is limited. In Linux, the maximum is FD_SETSIZE, which is 1024, so select can only monitor file descriptors 0 to 1023.
- Moving the set of file descriptors from user space to kernel space has a copying overhead.
- select does not tell the application which file descriptors have data, so the application has to traverse the whole set again, which is inefficient.
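The first limitation can be checked directly from the headers. This is a small sketch with hypothetical helper names; FD_SETSIZE and fd_set are the standard definitions from <sys/select.h>.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>

/* Returns 1 if fd can be stored in an fd_set, 0 otherwise.
 * Passing a larger descriptor to FD_SET is undefined behaviour. */
static int fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Prints the compile-time limits of select() on this system. */
static void print_select_limits(void) {
    printf("FD_SETSIZE = %d, fd_set occupies %zu bytes\n",
           (int)FD_SETSIZE, sizeof(fd_set));
}
```

A server that may hold more descriptors than FD_SETSIZE should guard every FD_SET with a check like fits_in_fd_set, or move to poll or epoll.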
poll
poll is an enhancement of select. Instead of a fixed-size bitmap, the caller passes an array of pollfd structures, which the kernel stores in a linked list, so the file descriptor limit of select no longer applies; the number of descriptors is bounded only by system resources such as kernel memory.
However, both the kernel and the application process still have to traverse the whole set of file descriptors, and the set is still copied between user space and the kernel on every call.
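A sketch of the same idea with poll, again using socket pairs as stand-ins for client connections. The name poll_demo and the constant NPAIRS are made up for the example; struct pollfd and poll are the standard POSIX interface.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NPAIRS 3

/* Data is written only to the last pair.
 * Returns the index poll() reported as readable, or -1 on error. */
static int poll_demo(void) {
    int sv[NPAIRS][2];
    struct pollfd pfds[NPAIRS];
    int made = 0, result = -1;

    for (; made < NPAIRS; made++) {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv[made]) != 0) goto done;
        pfds[made].fd = sv[made][0];
        pfds[made].events = POLLIN;     /* interested in readability */
        pfds[made].revents = 0;
    }
    if (write(sv[NPAIRS - 1][1], "x", 1) != 1) goto done;

    /* The whole array goes to the kernel; its length is not tied to
     * FD_SETSIZE. */
    if (poll(pfds, NPAIRS, 1000) != 1) goto done;

    /* As with select, the caller still scans every entry. */
    for (int i = 0; i < NPAIRS; i++)
        if (pfds[i].revents & POLLIN) result = i;
done:
    for (int i = 0; i < made; i++) { close(sv[i][0]); close(sv[i][1]); }
    return result;
}
```

Unlike select, poll keeps the request (events) and the answer (revents) in separate fields, so the array can be reused across calls without being rebuilt.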
epoll
epoll uses a red-black tree and a ready list to overcome the shortcomings of select/poll. This I/O multiplexing mechanism was introduced in Linux kernel 2.5.44.
There are three main system calls:
// The kernel creates epoll instances, including red-black trees and ready lists
int epoll_create(int size);
// Add, modify, or delete a socket node in the red-black tree
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
// Wait until the ready list is non-empty (or the timeout expires),
// then copy up to maxevents items from the ready list to events
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
First, epoll_create is called to create an epoll instance; the red-black tree and the ready list are set up in the kernel.
Calling epoll_ctl adds, modifies, or deletes a socket node in the red-black tree:
- ADD checks whether the socket is already in the red-black tree. If it is, the call fails with EEXIST. If it is not, the socket is inserted into the tree and an event callback is registered for it, which adds the socket to the ready list whenever an event occurs; a socket that is already active is added to the ready list right away.
- DEL removes the socket from every structure of the epoll instance (the red-black tree and the ready list).
- MOD modifies the events monitored for the socket and checks it again. If the socket is active under the new settings, it is added to the ready list; otherwise the registered callback adds it to the ready list when an event occurs.
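The three operations can be exercised with a short sketch, assuming Linux. The function name epoll_ctl_demo and its two output parameters are hypothetical; EEXIST for a duplicate ADD and ENOENT for deleting an unregistered descriptor are the error codes documented for epoll_ctl.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Runs ADD, ADD (duplicate), MOD, DEL, DEL (again) on one socket.
 * Stores the errno of the two calls that are expected to fail.
 * Returns 0 if the sequence behaved as expected, -1 otherwise. */
static int epoll_ctl_demo(int *dup_add_errno, int *second_del_errno) {
    int sv[2], epfd, rc = -1;
    struct epoll_event ev = { .events = EPOLLIN };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    epfd = epoll_create1(0);
    if (epfd == -1) { close(sv[0]); close(sv[1]); return -1; }
    ev.data.fd = sv[0];

    /* ADD: the socket is inserted into the epoll instance. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) != 0) goto done;
    /* ADD again: the socket is already registered, so this must fail. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) != -1) goto done;
    *dup_add_errno = errno;

    /* MOD: change the set of monitored events. */
    ev.events = EPOLLIN | EPOLLOUT;
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) != 0) goto done;

    /* DEL: remove the socket; a second DEL has nothing left to remove. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], &ev) != 0) goto done;
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], &ev) != -1) goto done;
    *second_del_errno = errno;
    rc = 0;
done:
    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

The key point is that registration happens once per socket, not once per wait, which is what removes the repeated copying of the whole descriptor set.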
epoll_wait checks the ready list. If it is empty, the caller sleeps until it is woken up; if it contains ready sockets, they are copied back to user space.
Since epoll only needs to copy active sockets from kernel space to user space, it avoids the large copy overhead and the wasted traversal of select/poll.
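Putting the three calls together, the sketch below registers several sockets, makes one of them readable, and lets epoll_wait report only that one. It assumes Linux and uses epoll_create1, the newer variant of epoll_create; the name epoll_demo, the constant NCONN, and the use of data.u32 to carry a connection index are choices made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NCONN 4

/* Registers NCONN sockets and makes the one at index `active` readable.
 * Returns the number of events epoll_wait reported (-1 on error) and
 * stores the index of the first ready connection in *ready_idx. */
static int epoll_demo(int active, int *ready_idx) {
    int sv[NCONN][2], made = 0, epfd, n = -1;
    struct epoll_event ev, events[NCONN];

    *ready_idx = -1;
    if (active < 0 || active >= NCONN) return -1;
    epfd = epoll_create1(0);
    if (epfd == -1) return -1;

    while (made < NCONN) {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv[made]) != 0) goto done;
        made++;
        ev.events = EPOLLIN;
        ev.data.u32 = (uint32_t)(made - 1);   /* which connection this is */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[made - 1][0], &ev) != 0)
            goto done;
    }
    if (write(sv[active][1], "x", 1) != 1) goto done;

    /* Only ready sockets are copied back: there is no scan over all
     * NCONN registrations in user space. */
    n = epoll_wait(epfd, events, NCONN, 1000);
    if (n > 0) *ready_idx = (int)events[0].data.u32;
done:
    for (int i = 0; i < made; i++) { close(sv[i][0]); close(sv[i][1]); }
    close(epfd);
    return n;
}
```

With four registered sockets and one active, the loop that handles events runs once instead of four times; the gap widens as the number of idle connections grows.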
Applicable scenario
epoll is not necessarily better than select/poll; each technique has scenarios that suit it. With low concurrency and mostly active sockets, there is no need to build a red-black tree and a ready list: the cost of the two traversals is small and nearly every visited node does useful work, so select/poll is the better fit. With high concurrency and only a few sockets active at any one time, epoll is the better fit, because each call copies only the active sockets to user space.