Article source: preparatory knowledge for the Muduo network library
I have been reading the source code under muduo/base, the utility library that supports the network library proper. Before analyzing the network library itself, let me first summarize the relevant background knowledge.
What does TCP network programming need to care about? The Muduo network library sums it up as "three and a half events."
The three and a half events of TCP network programming:
1. Connection establishment: accept() on the server side and connect() on the client side. Once the connection is established, the two ends are symmetric peers.
2. Connection teardown: active close (close(), shutdown()) and passive close (read() returning 0).
3. Message arrival: the file descriptor becomes readable. This is the most important event; how it is handled determines the style of the whole program: blocking vs. non-blocking, how to split the byte stream into messages, how to design the application-layer buffer, and so on.
3.5. Message sent, which counts only as half an event. "Sent" here means the data has been written into the operating system's kernel buffer; the TCP stack is then responsible for transmitting and retransmitting it. It does not mean the peer has received the data. For low-throughput services this event can usually be ignored.
In non-blocking network programming, the application layer must maintain its own buffers.
Why does the sender application layer use buffers?
Suppose the application wants to send 40KB of data, but the kernel's TCP send buffer only has 25KB of free space. What happens to the remaining 15KB? Waiting for the kernel buffer to drain would block the current thread, because we do not know when space will become available. Instead, the application layer should cache the remaining 15KB in its own output buffer and send it as soon as the socket becomes writable; this way the "send" operation never blocks. If the application then wants to send another 50KB while the output buffer is still non-empty, it should append the new data to the end of the output buffer rather than calling write() on the socket directly, which would scramble the order of the data. The application layer only needs to care about generating data, not about whether it goes out in one write or several; that is the network library's job. The application simply calls TcpConnection::send() and the library handles the sending: it puts the unsent 15KB into the TcpConnection's output buffer, registers interest in the POLLOUT event, and sends the remaining data in the event callback. If the remaining data still cannot be sent in one go, the library keeps POLLOUT registered; once everything is sent, it stops watching POLLOUT to avoid a busy loop.
Why does the receiving application layer use buffering?
If one read() does not yield a complete message, the partial data must be stored somewhere until the rest arrives. Moreover, the program must still work even in the worst case: the data arrives one byte at a time, at 10ms intervals, and each byte triggers a separate readable event on the file descriptor.
How should the buffer be designed?
The application-layer buffer stores incoming and outgoing data. On the one hand, we want the buffer to be large, so that each send or receive handles more data and fewer system calls are needed. On the other hand, we want it to be small, because every connection owns its own buffers; if they are too large, a server with many connections wastes a lot of memory that sits idle most of the time. See the analysis of muduo::Buffer for details.
What is the Reactor pattern?
The Reactor pattern is based on synchronous I/O. We register I/O events with the reactor along with callback functions; when an event occurs, the reactor invokes the corresponding callback. This differs from dedicating a thread or process to wait on each event.
In the Reactor pattern, a number of events are registered with the reactor; when an event arrives, a dispatcher invokes the corresponding callback to handle it.
This is just the basic Reactor pattern, in which the callbacks run in the same thread as the reactor. There are several variations, such as Reactor + ThreadPool and Multiple Reactors.
non-blocking IO + IO multiplexing
Non-blocking I/O plus I/O multiplexing. On top of this, the Muduo library imposes a further constraint: one loop per thread. Each thread of the program runs at most one event loop (reactor), which handles read/write events and timer events.
The EventLoop represents the thread's main loop. To have a thread do work, register a timer or an I/O channel with that thread's EventLoop. A dedicated thread can be reserved for I/O events that require low latency.
Muduo's recommended mode
Muduo recommends one (event) loop per thread plus a thread pool.
The event loop serves as the I/O multiplexer, combined with non-blocking I/O and timers.
The thread pool handles computation tasks; it can be implemented as a producer-consumer queue.
Impedance matching rule for thread pool size
If the tasks a thread pool runs spend a fraction P of their time on dense computation (0 < P <= 1) and the system has C CPUs, the empirical formula for the pool size that keeps all C CPUs busy without overloading them is T = C/P. For example, with C = 8 and P = 0.5, T = 16: each thread computes only half the time, so 16 threads keep 8 CPUs saturated.
Why does EventLoop use level trigger?
1. Compatibility with traditional poll(), which is level-triggered.
2. Level-triggered programming is simpler, and it is harder to accidentally miss events.
3. There is no need to keep reading or writing until EAGAIN, which reduces the number of system calls.