Introduction
No program can do without IO. Some IO is obvious, such as reading and writing files, and some is less obvious, such as sending data over the network. So what IO models are there, and how should we choose among them? How do the more advanced models kqueue and epoll work? Let's take a look.
Blocking IO and non-blocking IO
Let’s take a look at the two simplest IO models: blocking IO and non-blocking IO.
For example, suppose multiple threads need to read data from a socket. Each read can be divided into two parts: first waiting for the socket data to be ready, then reading that data for business processing. With blocking IO, it works like this:
- A thread calls read on the socket and is suspended until the data is ready.
- When the data is ready, the thread wakes up and processes it.
- Other threads that want to read from the same socket wait for the first thread to finish before they continue.
Why is it called blocking IO? Because the calling thread can do nothing else while it waits for the data, and any threads queued behind it can only wait as well. The IO operation blocks them.
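The steps above can be sketched in C. This is a minimal illustration, not a definitive implementation: the function name blocking_read_demo is hypothetical, and a local socketpair stands in for a real network socket so that the example is self-contained.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends "hello" through a local socket pair and
 * reads it back with a plain blocking read().
 * Returns the number of bytes read, or -1 on error. */
ssize_t blocking_read_demo(char *buf, size_t len)
{
    int sv[2];
    const char *msg = "hello";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (write(sv[0], msg, strlen(msg)) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* read() suspends the calling thread until at least one byte is
     * available. Data was written above, so it returns at once here.
     * On an idle socket this call would simply not return. */
    ssize_t n = read(sv[1], buf, len);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling blocking_read_demo with a 16-byte buffer returns as soon as the data written to the other end is available.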
What is non-blocking IO?
Taking the same example, non-blocking IO works like this:
- A thread attempts to read data from the socket.
- If the data in the socket is not ready, the call returns immediately.
- The thread keeps attempting to read data from the socket.
- Once the data in the socket is ready, the thread proceeds with the subsequent processing steps.
Why is it called non-blocking IO? Because the call returns immediately when there is no data on the socket, so the thread is never blocked by an IO operation on the socket.
As you can see from the above analysis, although non-blocking IO never blocks on the socket, the thread is not really freed for other work either, because it has to keep polling the socket, which wastes CPU time.
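The polling behaviour can be sketched as follows. This is an illustrative example under stated assumptions: nonblocking_read_demo is a hypothetical name, a socketpair stands in for a network socket, and the peer's data is simulated by writing to the other end after the first empty read.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: polls a non-blocking socket and returns how many
 * reads came back empty before data arrived, or -1 on error. */
int nonblocking_read_demo(void)
{
    int sv[2];
    char buf[16];
    int empty_reads = 0;
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Switch the reading end to non-blocking mode. */
    int flags = fcntl(sv[1], F_GETFL, 0);
    if (flags != -1 && fcntl(sv[1], F_SETFL, flags | O_NONBLOCK) != -1) {
        for (int attempt = 0; attempt < 3; attempt++) {
            ssize_t n = read(sv[1], buf, sizeof buf);
            if (n > 0) {                 /* data was ready */
                result = empty_reads;
                break;
            }
            if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                /* No data yet: read() returned immediately. */
                empty_reads++;
                /* Simulate the peer sending data after the first miss. */
                if (attempt == 0 && write(sv[0], "x", 1) != 1)
                    break;
                continue;
            }
            break;                       /* unexpected error or EOF */
        }
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

The first read fails with EAGAIN instead of blocking, and the loop has to try again. In a real program that retry loop is exactly the polling cost described above.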
IO multiplexing and select
There are many IO multiplexing models, and select is the most common one. In fact, both Netty and Java NIO use the select model.
How does the select model work?
The select model is somewhat similar to non-blocking IO, except that a separate thread checks whether the data in the sockets is ready. When it finds that data is ready, select can notify a specific data-processing thread through a previously registered event handler.
The advantage is that although the select thread itself blocks, the threads that actually process the data do not. A single select thread can also monitor multiple socket connections, which improves the efficiency of IO processing, so the select model is used in many scenarios.
To understand how select works in more detail, let's look at the Unix select function:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
Let's explain what these parameters mean. In Unix, everything is a file, so fd stands for file descriptor.
fds stands for file descriptor set, that is, a set of file descriptors.
nfds is an integer: the highest-numbered file descriptor in any of the sets, plus 1.
readfds is the set of descriptors to check for readability.
writefds is the set of descriptors to check for writability.
errorfds is the set of descriptors to check for exceptional conditions.
timeout is the maximum time select will wait for a descriptor to become ready.
select works by scanning all the file descriptors in these sets and reporting which of them are ready.
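The parameters above can be seen in a short sketch. This is only an illustration: select_demo is a hypothetical name, and two local socketpairs stand in for two client connections, of which only the first has data.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: watches two sockets with select() when only the
 * first has data. Returns the number of ready descriptors if select()
 * reports exactly that socket, otherwise -1. */
int select_demo(void)
{
    int a[2], b[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, a) == -1)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    if (write(a[0], "x", 1) == 1) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(a[1], &readfds);
        FD_SET(b[1], &readfds);

        /* nfds is the highest descriptor in the set plus 1. */
        int nfds = (a[1] > b[1] ? a[1] : b[1]) + 1;
        struct timeval timeout = { .tv_sec = 1, .tv_usec = 0 };

        /* select() rewrites readfds so only ready descriptors remain. */
        int ready = select(nfds, &readfds, NULL, NULL, &timeout);
        if (ready == 1 && FD_ISSET(a[1], &readfds) && !FD_ISSET(b[1], &readfds))
            result = ready;
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return result;
}
```

Because select overwrites the sets it is given, a real event loop has to rebuild readfds before every call, which is part of the per-call cost.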
poll
The poll function is similar to select, except that it describes the set of file descriptors differently: it takes an array of pollfd structures instead of fd_set bitmaps. poll is defined by POSIX and is mostly used on POSIX systems.
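For comparison with select, here is the same scenario written with poll. It is a sketch under the same assumptions: poll_demo is a hypothetical name, and local socketpairs stand in for network connections.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: watches two sockets with poll() when only the
 * first has data. Returns the number of ready descriptors if poll()
 * reports exactly that socket, otherwise -1. */
int poll_demo(void)
{
    int a[2], b[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, a) == -1)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    if (write(a[0], "x", 1) == 1) {
        /* One pollfd entry per descriptor: what to watch goes in
         * events, what happened comes back in revents. */
        struct pollfd fds[2] = {
            { .fd = a[1], .events = POLLIN },
            { .fd = b[1], .events = POLLIN },
        };

        int ready = poll(fds, 2, 1000 /* timeout in milliseconds */);
        if (ready == 1 && (fds[0].revents & POLLIN) && !(fds[1].revents & POLLIN))
            result = ready;
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return result;
}
```

Unlike select, the input array is not overwritten, so the same pollfd array can be reused across calls. The kernel still walks every entry on each call.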
epoll
In fact, select and poll are both forms of multiplexed IO, but both have some disadvantages. epoll and kqueue are optimizations of them.
epoll is a Linux system call interface whose name can be read as "event poll". It was first introduced in version 2.5.44 of the Linux kernel, and it is used to monitor IO readiness on multiple file descriptors.
With traditional select and poll, every call has to walk through all the monitored file descriptors, so the cost of each call is O(n). With epoll, this cost can be reduced to O(1).
This is because epoll delivers notifications when specific monitored events occur, so it does not need to poll the way select does and is therefore more efficient.
epoll uses a red-black tree (RB-tree) data structure to track all the file descriptors that are currently being monitored.
Epoll has three API functions:
int epoll_create1(int flags);
Creates an epoll instance and returns a file descriptor that refers to it. The flags argument controls the behavior of that descriptor and is either 0 or EPOLL_CLOEXEC.
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
This function controls an epoll instance: it specifies which file descriptors to monitor and which events to watch for on each of them.
The op argument can be EPOLL_CTL_ADD, EPOLL_CTL_MOD, or EPOLL_CTL_DEL, which add, modify, or delete a registration.
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
epoll_wait waits for events on the file descriptors registered with epoll_ctl and returns the ones that are ready.
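The three functions fit together roughly as follows. This is a minimal Linux-only sketch, not a definitive implementation: epoll_demo is a hypothetical name, and a socketpair stands in for a network connection.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: registers one socket with epoll, makes it
 * readable, and waits for the event. Returns the number of events
 * reported by epoll_wait() if it is the expected one, otherwise -1. */
int epoll_demo(void)
{
    int sv[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int epfd = epoll_create1(0);          /* 1. create the epoll instance */
    if (epfd == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;                  /* interested in readability */
    ev.data.fd = sv[1];                   /* handed back with the event */

    /* 2. register the descriptor once, not on every wait */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[1], &ev) == 0 &&
        write(sv[0], "x", 1) == 1) {
        struct epoll_event events[4];

        /* 3. wait up to 1000 ms; only ready descriptors are returned */
        int n = epoll_wait(epfd, events, 4, 1000);
        if (n == 1 && events[0].data.fd == sv[1] && (events[0].events & EPOLLIN))
            result = n;
    }

    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

The registration happens once in epoll_ctl, and epoll_wait returns only the descriptors that are ready, so the caller never scans the full set.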
Epoll provides two trigger modes, edge-triggered and level-triggered.
Suppose a pipe registered with epoll receives data. The call to epoll_wait returns, indicating that there is data to read. In level-triggered mode, epoll_wait keeps returning for as long as the pipe's buffer contains data to be read. In edge-triggered mode, however, epoll_wait returns again only when new data is written to the pipe.
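The difference between the two modes can be observed with a small Linux-only experiment. This is an illustrative sketch: count_wakeups is a hypothetical helper, and it uses a socketpair rather than a pipe. It writes one byte, never reads it, and calls epoll_wait twice.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: makes a socket readable once, leaves the data
 * unread, and counts how many of two epoll_wait() calls report it.
 * Pass 0 for level-triggered or EPOLLET for edge-triggered mode.
 * Returns the number of wakeups, or -1 on error. */
int count_wakeups(uint32_t extra_flags)
{
    int sv[2];
    int wakeups = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct epoll_event ev = {0};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = sv[1];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[1], &ev) == 0 &&
        write(sv[0], "x", 1) == 1) {
        wakeups = 0;
        for (int i = 0; i < 2; i++) {
            struct epoll_event out;
            /* The byte is deliberately never read between calls. */
            if (epoll_wait(epfd, &out, 1, 100) == 1)
                wakeups++;
        }
    }

    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return wakeups;
}
```

In level-triggered mode both calls report the socket because unread data is still in the buffer. In edge-triggered mode only the first call does, which is why edge-triggered code usually reads until EAGAIN before waiting again.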
kqueue
kqueue, like epoll, is used to replace select and poll. The difference is that kqueue is used on FreeBSD, NetBSD, OpenBSD, DragonFly BSD, and macOS.
kqueue can handle not only file descriptor events but also various other notifications, such as file modification monitoring, signals, asynchronous I/O (AIO) events, child process state changes, and timers with nanosecond resolution. In addition to the events provided by the kernel, kqueue also supports user-defined events.
kqueue provides two APIs. The first creates a kqueue:
int kqueue(void);
The second is kevent, which registers events with the kqueue and retrieves pending ones:
int kevent(int kq, const struct kevent *changelist, int nchanges, struct kevent *eventlist, int nevents, const struct timespec *timeout);
The first parameter of kevent is the kqueue to operate on. changelist is the list of events to monitor, nchanges is the length of changelist, eventlist is the list that receives the returned events, nevents is the length of eventlist, and the last parameter is the timeout.
In addition, kqueue has an EV_SET macro that initializes a kevent structure:
EV_SET(&kev, ident, filter, flags, fflags, data, udata);
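Putting kqueue, kevent, and EV_SET together might look like the sketch below. It is an assumption-laden illustration rather than a definitive implementation: kqueue_demo is a hypothetical name, a socketpair stands in for a network connection, and the preprocessor guard is added here so the file still compiles on systems without kqueue, where the function just returns -2.

```c
#include <stddef.h>

#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: registers one socket for EVFILT_READ, makes it
 * readable, and waits for the event. Returns the number of events
 * reported by kevent() if it is the expected one, otherwise -1. */
int kqueue_demo(void)
{
    int sv[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int kq = kqueue();
    if (kq == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct kevent change;
    EV_SET(&change, sv[1], EVFILT_READ, EV_ADD, 0, 0, NULL);

    /* First call: nevents is 0, so it only registers the change. */
    if (kevent(kq, &change, 1, NULL, 0, NULL) == 0 &&
        write(sv[0], "x", 1) == 1) {
        struct kevent event;
        struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };

        /* Second call: nchanges is 0, so it only collects events. */
        int n = kevent(kq, NULL, 0, &event, 1, &timeout);
        if (n == 1 && (int)event.ident == sv[1] && event.filter == EVFILT_READ)
            result = n;
    }

    close(kq);
    close(sv[0]);
    close(sv[1]);
    return result;
}
#else
/* kqueue is not available on this platform (for example Linux). */
int kqueue_demo(void)
{
    return -2;
}
#endif
```

The same kevent call can both submit changes and collect events. The sketch splits the two into separate calls only to keep each step visible.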
Advantages of epoll and kqueue
epoll and kqueue are more advanced than select and poll because they make full use of the underlying capabilities of the operating system. The operating system always knows when data is ready, so by registering the corresponding events with it, the polling that select performs can be avoided and efficiency improves.
Note that epoll and kqueue require support from the underlying operating system. When using them, pay attention to the corresponding native libraries.
This article is available at www.flydean.com/14-kqueue-e…