Preface
In recent years, the terms “asynchronous” and “non-blocking” have become ubiquitous in server-side development. They are often described together, so many people conflate the two. This article tries to explain the difference between them from a Linux I/O perspective.
This article covers the following:
- Linux I/O basics
- What an I/O model is, and the common categories:
    - blocking I/O;
    - multithreaded blocking I/O;
    - non-blocking I/O;
    - I/O multiplexing: select/poll/epoll;
    - asynchronous I/O
- How libuv solves the I/O problem
In addition, the examples in this article are hosted on GitHub; feel free to download and try them out.
Linux I/O basics
Start with the example of reading a file
Now there is a requirement to read the file through a shell script. I’m sure most of you can do it right away:
```shell
$ cat say-hello.txt
```
Now let's do the same thing in C, where the details are more visible.
```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char const *argv[])
{
    char buffer[1024];
    ssize_t size;
    int fd = open("say-hello.txt", O_RDONLY, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    while ((size = read(fd, buffer, sizeof(buffer) - 1)) > 0) {
        buffer[size] = '\0';  /* read() does not null-terminate the buffer */
        printf("%s", buffer);
    }
    if (size < 0) {
        perror("read");
    }
    close(fd);
    return 0;
}
```
Call open to get a number, use it for write and read operations, and call close. This is the basic Linux I/O flow.
If you write JavaScript all day and aren't familiar with C, you may have two questions:

- What is the number returned by open?
- Why does the code after read have to wait while it fetches data from the hard disk?
With that in mind, let’s begin with the following.
File descriptors
We know Linux's famous slogan: “Everything is a file.” An important embodiment of that philosophy is the file descriptor mechanism.
Let's list the common I/O operations:

- TCP/UDP network communication
- standard input and output
- file reads and writes
- DNS resolution
- pipes (inter-process communication)
Linux gives all of these a consistent interface through the file descriptor mechanism:

- it introduces the file descriptor (hereinafter fd);
- the unified read and write functions perform I/O on any fd.
As the earlier example showed, open returns a number, the file descriptor, which identifies an open file within the current process.
Each Linux process's control block (PCB, the task_struct) contains a fixed-size array whose entries point to records in the open file table that the kernel maintains for that process.
```c
struct task_struct {
    ...
    /* Open file information: */
    struct files_struct *files;
    ...
};

/*
 * Open file table structure
 */
struct files_struct {
    spinlock_t file_lock ____cacheline_aligned_in_smp;
    unsigned int next_fd;
    unsigned long close_on_exec_init[1];
    unsigned long open_fds_init[1];
    unsigned long full_fds_bits_init[1];
    struct file __rcu * fd_array[NR_OPEN_DEFAULT];
};
```
An fd is simply the index of the corresponding file structure in this array, which is why fds start at 0.
So if you pass fd in for a read or write operation, Linux will find the file you need to manipulate.
To see fds in the wild, use the lsof command. For example, the following prints the fds currently held by Chrome (if you are using Chrome to read this page):

```shell
$ lsof -p $(ps -ef | grep Chrome | awk 'NR==1{print $2}')
```
Now for the second question raised by the example above: why does the code after read have to wait while it fetches data from the hard disk?
In Linux, the executing unit of a program is a process or thread (since kernel 2.4, the basic scheduling unit has been the thread, with the process acting as a container for threads).
While running, a process transitions between a few basic states: ready, running, and blocked.
In the example above, once read is called, the process enters the blocked state. When the I/O completes, a system interrupt moves it back to the ready state, where it waits for a time slice and then returns to the running state.
This means that our process will block when I/O operations occur, which explains the problem above.
I/O model
The mechanism described above is what I/O-model terminology calls blocking I/O.
Blocking I/O
Blocking I/O is the default mode of the read/write functions: during a read/write operation the process is blocked; once the I/O completes, a system interrupt moves the process to the ready state, where it waits for a time slice before running again.
Blocking I/O has a problem, though: I/O operations cannot run concurrently, nor can CPU computation overlap with I/O. In a web request/response scenario, while one request is blocked in read, no other request can be processed.
We need to solve this problem.
Multithreading blocks I/O
The first idea is to use multithreading.
We pre-initialize a thread pool whose threads block on a semaphore's wait primitive. When an I/O operation is needed, a thread is woken via the semaphore's signal to perform it. See the code for details.
The downside of multithreaded blocking I/O, however, is that once the number of connections grows large, the cost of thread switching becomes significant.
Ideally, then, all I/O waiting could be handled in a single thread, avoiding the context-switch overhead of spawning many threads. Is there a way to do this? This is where non-blocking I/O comes in.
Non-blocking I/O
Non-blocking I/O is a mechanism in which an I/O read/write call returns immediately: if the buffer is not readable or writable, the call returns -1 instead of blocking. Here is an example of building a web server with non-blocking I/O; take a look at the code.
The key call is fcntl, which puts the fd into non-blocking mode:

```c
/* Fetch the current flags first so we don't clobber them. */
int flags = fcntl(fd, F_GETFL, 0);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
```
As you can see, the process no longer enters the blocked state, and other CPU work can proceed alongside the I/O.
Non-blocking I/O lets a single thread issue I/O while doing other work, but it creates problems of its own:

- polling for completion in a while loop wastes CPU cycles, because while the I/O has not finished, read returns nothing useful;
- forcing the process to sleep between polls with sleep/usleep brings back the old problem: the I/O result is not handled promptly.
So, does the system offer a mechanism that lets us natively wait on multiple I/O operations at once? It does, and that brings us to I/O multiplexing.
I/O multiplexing
I/O multiplexing, as the name suggests, means monitoring multiple I/O operations simultaneously within a single process. It has gone through its own evolution: select, poll, and epoll (the counterpart on macOS is kqueue). Let's take them one by one.
select
Using select generally involves three functions (for more details, see the select(2) man page):
```c
void FD_SET(int fd, fd_set *fdset);
int  select(int nfds, fd_set *restrict readfds,
            fd_set *restrict writefds, fd_set *restrict errorfds,
            struct timeval *restrict timeout);
int  FD_ISSET(int fd, fd_set *fdset);
```
select lets you monitor fds in batches: when the buffer of any fd in the passed-in fd_set becomes readable or writable, the call unblocks, and you loop over the set with FD_ISSET to find which fds are ready.
```c
fd_set tmpset = *fds;
struct timeval tv;
tv.tv_sec = 5;
tv.tv_usec = 0;  /* leaving tv_usec uninitialized makes the timeout undefined */

int r = select(fd_size, &tmpset, NULL, NULL, &tv);
if (r < 0) {
    perror("select()");
} else if (r) {
    printf("fd_size %d\n", r);
    QUEUE *q;
    QUEUE_FOREACH(q, &loop->wq->queue) {
        watcher_t *watcher = QUEUE_DATA(q, watcher_t, queue);
        int fd = watcher->fd;
        if (FD_ISSET(fd, &tmpset)) {
            int n = watcher->work(watcher);
        }
    }
} else {
    printf("%d No data within five seconds.\n", r);
}
```
Here is an example of building a web server with select; take a look at the code.
Although select solves the I/O multiplexing problem, it has some defects:

- fd_set holds at most 1024 fds; beyond that limit you may still need multiple threads;
- performance overhead:
    - every call to select copies the fd_set from user space to kernel space;
    - the kernel has to poll the state of every fd in the fd_set;
    - back in user space, the result must also be scanned to find which fds became readable.
poll
poll solves the first problem: it accepts an array of pollfd structures, so there is no 1024 limit. The poll function is declared as follows:
```c
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
```
The exact usage is very similar to select:
```c
struct pollfd *fds = poll_init_fd_set(loop, count);

int r = poll(fds, count, 5000);
if (r < 0) {
    perror("poll()");
} else if (r) {
    QUEUE *q;
    QUEUE_FOREACH(q, &loop->wq->queue) {
        watcher_t *watcher = QUEUE_DATA(q, watcher_t, queue);
        if (watcher->fd_idx == -1) {
            continue;
        }
        if ((fds + watcher->fd_idx)->revents & POLLIN) {
            watcher->work(watcher);
        }
    }
} else {
    printf("%d No data within five seconds.\n", r);
}
```
Here is an example of building a web server with poll; take a look at the code.
However, the performance overhead noted for select remains a problem. epoll is where it was finally solved.
epoll
Details can be found here. In simple terms, epoll splits the work of select/poll into three steps:
- epoll_create creates an epoll_fd, which is used for monitoring in step 3;
- epoll_ctl binds the fds you want to monitor to the epoll_fd;
- epoll_wait takes the epoll_fd and blocks the process; the process is woken as soon as any monitored fd changes state.
Let's look at how epoll eliminates the overhead listed above:

- fd binding happens once, in epoll_ctl; epoll_wait only takes the epoll_fd, so the fd collection is not copied in on every call;
- the fds passed to epoll_ctl are kept in a red-black tree in kernel space; when an I/O operation completes, the fd is located through the tree in O(log n), so the kernel avoids polling;
- the array returned to user space contains only the fds that are actually readable or writable, so the user no longer polls the full set.
For this scenario, epoll is the best multiplexing option. Here is an example of building a web server with epoll.
But is epoll without defects? Not quite: epoll currently supports only fds backed by pipes, sockets, and the like; it does not support fds produced by regular file systems.
Asynchronous I/O
Blocking I/O, non-blocking I/O, and I/O multiplexing, as described above, are all forms of synchronous I/O: the user must wait for the I/O operation to complete and fetch the returned content. Operating systems also provide asynchronous I/O solutions, which differ by OS:
- Linux
    - aio: widely criticized, and it supports only direct I/O (for file operations);
    - io_uring: added to the Linux kernel in version 5.1 and widely regarded as the new home of Linux asynchronous I/O;
- Windows
    - IOCP: libuv's asynchronous mechanism on Windows. (The author hasn't studied Windows enough to say more.)
So far, several common I/O models have been introduced.
On Linux, epoll is still the recommended mechanism today. Since epoll cannot monitor file fds, let's look at how libuv works around this.
Libuv’s solution
libuv builds the body of its event loop on epoll:

- fd types that epoll can monitor, such as sockets and pipes, are watched through epoll_wait;
- file operations, DNS resolution, compression and decompression, and similar work go to worker threads; requests and results travel through two queues, and the workers notify the main thread over a pipe whose fd epoll is watching, telling it when to read the result queue.
That’s the end of this article. A quick summary:
First, we introduced file descriptors and the basic Linux process state transitions. Then came blocking I/O and its defects, which led us through multithreaded blocking I/O, non-blocking I/O, I/O multiplexing, and asynchronous I/O. Finally, combining all of the above, we briefly described how libuv handles I/O internally.
Finally, a small plug: ByteDance is looking for great front-end and Node.js engineers to join us and build fun things. Send your résumé to [email protected].
Well, I’ll see you next time.