Author: Wu Xianjian, senior database engineer at Tencent Cloud. He took part in the development of 360's open source project Pika and now works on Redis database research and development.
The Redis server is an event-driven program, and events are at its core. It handles two kinds of events: I/O events (file events) and time events. The Redis server connects to clients through sockets, and a file event can be understood as the server's abstraction of a socket operation. Communication between server and client generates file events, and the server carries out its network communication by listening for and handling these events. In addition, Redis has some internal operations that must run at given points in time (judging from the Redis 4.0 code, currently only serverCron), and time events are the server's abstraction of such timed operations.
I. aeEventLoop
Before looking at the code, let's see what aeEventLoop, the core of event handling, looks like:
/* State of an event based program */
Creating an aeEventLoop requires only one parameter, setsize. It specifies the maximum number of file descriptors the aeEventLoop can listen on. Redis normally passes server.maxclients + CONFIG_FDSET_INCR, that is, the user-specified maximum number of client connections plus 128. The extra 128 descriptors cover needs such as opening AOF and RDB files and the file handles Redis uses internally for master-slave and cluster communication. When the aeEventLoop is created, setsize determines the size of the aeFileEvent and aeFiredEvent arrays.
1. aeFileEvent
aeFileEvent holds two function pointers, which point to the functions to call when a readable or writable event occurs, and an untyped pointer to the associated data. events is an array of aeFileEvent structures indexed by socket descriptor. For example, if the socket I currently care about is 9, then events[9] is its file event structure. (As CSAPP notes, a system call that returns a descriptor always returns the smallest descriptor not currently open in the process, so there is no need to worry that descriptors which are repeatedly created and destroyed will keep growing.)
2. aeFiredEvent
aeFiredEvent stores a socket (fd) and a mask of the events that have fired on it. The fired array is only filled in when aeApiPoll is called. For example, if sockets 6 and 8 are found to have readable events and socket 10 a writable event, the first three elements of the fired array are set to {fd = 6, mask = AE_READABLE}, {fd = 8, mask = AE_READABLE} and {fd = 10, mask = AE_WRITABLE}. Because socket 6 has a readable event, the rfileProc stored in events[6] is called to handle it.
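The relationship between the two arrays can be sketched as follows. This is a simplified model, not the Redis source: every `sk_` name is hypothetical, the structures carry only the fields needed for the illustration, and `sk_record_fired` stands in for what a multiplexer backend would do when it reports a ready descriptor.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_NONE     0
#define SK_READABLE 1
#define SK_WRITABLE 2

/* Hypothetical stand-ins for aeFileEvent and aeFiredEvent. */
typedef struct { int mask; } sk_file_event;
typedef struct { int fd; int mask; } sk_fired_event;

typedef struct {
    int setsize;            /* capacity of both arrays */
    sk_file_event *events;  /* indexed directly by file descriptor */
    sk_fired_event *fired;  /* filled from slot 0 on every poll */
} sk_loop;

/* Allocate a loop whose two arrays both hold `setsize` entries. */
sk_loop *sk_loop_create(int setsize) {
    assert(setsize > 0);
    sk_loop *loop = malloc(sizeof(*loop));
    if (loop == NULL) return NULL;
    loop->setsize = setsize;
    /* calloc zeroes the slots, so every mask starts as SK_NONE. */
    loop->events = calloc((size_t)setsize, sizeof(*loop->events));
    loop->fired = calloc((size_t)setsize, sizeof(*loop->fired));
    if (loop->events == NULL || loop->fired == NULL) {
        free(loop->events);
        free(loop->fired);
        free(loop);
        return NULL;
    }
    return loop;
}

/* Record that `fd` became ready with `mask` in the next free fired slot.
 * Returns the new number of fired entries. */
int sk_record_fired(sk_loop *loop, int numfired, int fd, int mask) {
    assert(fd >= 0 && fd < loop->setsize && numfired < loop->setsize);
    loop->fired[numfired].fd = fd;
    loop->fired[numfired].mask = mask;
    return numfired + 1;
}

void sk_loop_free(sk_loop *loop) {
    if (loop == NULL) return;
    free(loop->events);
    free(loop->fired);
    free(loop);
}
```

Recording descriptors 6, 8 and 10 in that order fills `fired[0]` to `fired[2]`, while the per-socket state stays at `events[6]`, `events[8]` and `events[10]`: `fired` is dense and rebuilt on every poll, `events` is sparse and looked up by descriptor.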
aeEventLoop *aeCreateEventLoop(int setsize){
For time events, aeEventLoop has a timeEventHead pointer to the first time event. Since a newly created aeEventLoop contains no time events, timeEventHead is initialized to NULL. A new time event is always added at the head of the list, in front of the current timeEventHead. Because each aeTimeEvent structure has a next pointer to the following aeTimeEvent, once we have timeEventHead we can iterate over all current time events. One detail to note: the first event is inserted while timeEventHead is still NULL, so the next pointer of the last aeTimeEvent in the list is NULL, and the time events form a NULL-terminated singly linked list.
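Head insertion can be sketched in a few lines. The types and `sk_` function names below are hypothetical and carry only an id and a next pointer. The sketch assumes the head starts as NULL, which means the first event inserted ends up last with a NULL next pointer, and traversal stops there.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, stripped-down time event: just an id and a link. */
typedef struct sk_time_event {
    long long id;
    struct sk_time_event *next;
} sk_time_event;

typedef struct {
    long long next_id;    /* id handed to the next event created */
    sk_time_event *head;  /* NULL while no time event exists */
} sk_timers;

/* Create an event and link it in front of the current head.
 * Returns the new id, or -1 if allocation fails. */
long long sk_timers_add(sk_timers *t) {
    assert(t != NULL);
    sk_time_event *te = malloc(sizeof(*te));
    if (te == NULL) return -1;
    te->id = t->next_id++;
    te->next = t->head;  /* NULL for the very first event */
    t->head = te;
    return te->id;
}

/* Walk from the head until the NULL terminator. */
int sk_timers_count(const sk_timers *t) {
    int n = 0;
    for (const sk_time_event *te = t->head; te != NULL; te = te->next) n++;
    return n;
}

/* Free every event and reset the head. */
void sk_timers_clear(sk_timers *t) {
    while (t->head != NULL) {
        sk_time_event *te = t->head;
        t->head = te->next;
        free(te);
    }
}
```

After three insertions the head is the most recently added event, which is why the list order says nothing about firing order.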
II. File events
As mentioned in the introduction, file events are the server's abstraction of socket operations. When a readable or writable event fires on a socket, the corresponding handler function must be called.
/* File event structure */
When aeEventLoop is initialized, space is allocated for the aeFileEvent array (events). The size of the array is given by setsize, the maximum number of sockets Redis can have open under the current configuration. Each socket maps to one aeFileEvent: the aeFileEvent object is found in the events array by using the socket descriptor as the index.
When we register a file event with aeEventLoop, we first check whether the given socket falls outside the bounds of the events array. If it does not, we fetch the aeFileEvent object for that socket. aeApiAddEvent is then called to register the file descriptor and the events of interest with the underlying I/O multiplexing mechanism (epoll, select, evport or kqueue). We also record the function to call when a readable or writable event occurs, and any private data for the file event is kept in the object pointed to by clientData.
int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
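The registration steps described above can be sketched as below. This is an illustration under assumptions rather than the Redis implementation: all `sk_` names are hypothetical, the call into the I/O multiplexing layer is reduced to a comment, and signalling the out-of-range case with `ERANGE` is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_READABLE 1
#define SK_WRITABLE 2

/* Handler type: called with the ready fd, the private data and the mask. */
typedef void sk_file_proc(int fd, void *client_data, int mask);

typedef struct {
    int mask;                  /* events currently registered */
    sk_file_proc *rfile_proc;  /* called on readable */
    sk_file_proc *wfile_proc;  /* called on writable */
    void *client_data;         /* private data handed to the handlers */
} sk_file_event;

typedef struct {
    int maxfd;                 /* highest registered fd, -1 if none */
    int setsize;               /* number of slots in events */
    sk_file_event *events;     /* indexed by fd */
} sk_loop;

/* Register `proc` for `mask` on `fd`. Returns 0 on success, -1 on error. */
int sk_create_file_event(sk_loop *loop, int fd, int mask,
                         sk_file_proc *proc, void *client_data) {
    assert(loop != NULL && proc != NULL);
    if (fd < 0 || fd >= loop->setsize) {  /* outside the events array */
        errno = ERANGE;
        return -1;
    }
    sk_file_event *fe = &loop->events[fd];
    /* A real loop would register fd with epoll/kqueue/select here. */
    fe->mask |= mask;
    if (mask & SK_READABLE) fe->rfile_proc = proc;
    if (mask & SK_WRITABLE) fe->wfile_proc = proc;
    fe->client_data = client_data;
    if (fd > loop->maxfd) loop->maxfd = fd;
    return 0;
}

/* Example handler: counts how often it was invoked. */
void sk_count_calls(int fd, void *client_data, int mask) {
    (void)fd;
    (void)mask;
    ++*(int *)client_data;
}
```

With an eight-slot `events` array, registering descriptor 5 succeeds and raises `maxfd` to 5, while descriptor 8 is rejected because valid indexes end at 7.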
III. Time events
Time events inside Redis fall into two types: timed events, which fire once at a given point in the future, and periodic events, which, unlike timed events, fire again at regular intervals.
Which type an event is depends on the return value of its handler function. If the function returns AE_NOMORE (i.e. -1), the event does not need to fire again and its ID is set to AE_DELETED_EVENT_ID. If the function returns a value n greater than or equal to 0, the event must fire again after another n milliseconds (when_sec and when_ms are updated from the return value). The serverCron event mentioned at the beginning of this post is a periodic event: its function returns 1000/server.hz. server.hz defaults to 10, so serverCron is called about once every 100 ms on average.
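A small sketch of how a handler's return value decides between the two types. The `sk_` names are hypothetical, the firing time is kept in a single millisecond field instead of separate second and millisecond fields, and the value -1 used for the deleted-ID marker is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NOMORE (-1)            /* handler asks not to be called again */
#define SK_DELETED_EVENT_ID (-1)  /* assumed marker for "to be removed" */

typedef struct {
    long long id;
    long long when_ms;  /* absolute due time in milliseconds (simplified) */
} sk_time_event;

/* Apply the handler's return value after the event has fired at now_ms. */
void sk_apply_retval(sk_time_event *te, long long now_ms, int retval) {
    assert(te != NULL);
    if (retval == SK_NOMORE) {
        te->id = SK_DELETED_EVENT_ID;    /* one-shot: mark for removal */
    } else {
        te->when_ms = now_ms + retval;   /* periodic: due again in retval ms */
    }
}

/* Interval a cron-style handler returns for a given frequency in Hz. */
int sk_cron_interval_ms(int hz) {
    assert(hz > 0);
    return 1000 / hz;
}
```

With a frequency of 10 the handler returns 100, so an event that fired at time 5000 is due again at 5100; returning `SK_NOMORE` instead marks it for removal.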
/* Time event structure */
Redis calls aeCreateTimeEvent to create a time event, and the implementation is very simple. The two parameters to focus on are milliseconds and proc: milliseconds specifies how far from now the event should fire, and proc is the function to call when it fires. Internally, aeAddMillisecondsToNow computes the timestamp at which the event is due and stores it in when_sec and when_ms, and timeProc is then pointed at the function to call when the time event arrives.
After the fields of the aeTimeEvent structure have been assigned, the event is added at the head of the linked list that stores time events inside aeEventLoop, and the time event is created. Note that because new time events are always added at the head, the list is not ordered by firing time, so the whole list must be traversed to make sure every time event that is already due has been processed. However, as mentioned at the beginning, Redis has only one time event, serverCron, so the cost of traversing the list is not a concern.
static void aeAddMillisecondsToNow(long long milliseconds, long *sec, long *ms) {
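The arithmetic behind such a helper can be sketched like this. It is a sketch under assumptions, not the Redis code: the `sk_` names are hypothetical, and the calculation is split into a pure function that takes the current time as arguments (so it can be checked with fixed inputs) plus a thin wrapper that reads the clock with `gettimeofday`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Add `milliseconds` to the time (cur_sec, cur_ms) and store the result
 * as whole seconds plus a millisecond part in the range 0..999. */
void sk_add_ms(long cur_sec, long cur_ms, long long milliseconds,
               long *sec, long *ms) {
    assert(sec != NULL && ms != NULL && milliseconds >= 0);
    long when_sec = cur_sec + (long)(milliseconds / 1000);
    long when_ms = cur_ms + (long)(milliseconds % 1000);
    if (when_ms >= 1000) {  /* carry the overflow into the seconds */
        when_sec++;
        when_ms -= 1000;
    }
    *sec = when_sec;
    *ms = when_ms;
}

/* Same calculation, starting from the current wall-clock time. */
void sk_add_ms_to_now(long long milliseconds, long *sec, long *ms) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    sk_add_ms((long)tv.tv_sec, (long)(tv.tv_usec / 1000), milliseconds, sec, ms);
}
```

For instance, adding 100 ms to 10 s 950 ms carries over into the next second and gives 11 s 50 ms.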
Redis uses the aeDeleteTimeEvent function to delete a time event; the only argument needed is the ID of the event to delete. The deletion is actually lazy: the id field of the aeTimeEvent is set to AE_DELETED_EVENT_ID, instead of the object being unlinked from the list and freed right away. I think the reasons for this implementation are safety and code simplicity. Consider a time event handler that means to delete another time event but, because of a wrong ID, ends up deleting its own aeTimeEvent. Freeing its own aeTimeEvent object while the handler is still running would be dangerous.
int aeDeleteTimeEvent(aeEventLoop *eventLoop, long long id)
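Lazy deletion splits the work into two phases: marking and sweeping. The sketch below illustrates the idea with hypothetical `sk_` names and a list that holds nothing but ids; the marker value -1 is an assumption, and in a real loop the sweep would happen while the time events are being processed.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_DELETED_EVENT_ID (-1)  /* assumed marker value */

typedef struct sk_time_event {
    long long id;
    struct sk_time_event *next;
} sk_time_event;

/* Helper for the example: push a new event with `id` at the head. */
void sk_push(sk_time_event **head, long long id) {
    sk_time_event *te = malloc(sizeof(*te));
    assert(te != NULL);
    te->id = id;
    te->next = *head;
    *head = te;
}

/* Phase 1: only mark the event. Nothing is unlinked or freed, so this is
 * safe even when called from inside the handler of the same event.
 * Returns 0 if the id was found, -1 otherwise. */
int sk_delete_time_event(sk_time_event *head, long long id) {
    for (sk_time_event *te = head; te != NULL; te = te->next) {
        if (te->id == id) {
            te->id = SK_DELETED_EVENT_ID;
            return 0;
        }
    }
    return -1;
}

/* Phase 2: unlink and free every marked event. Returns how many were freed. */
int sk_sweep_deleted(sk_time_event **head) {
    assert(head != NULL);
    int freed = 0;
    sk_time_event **link = head;  /* points at the pointer to fix up */
    while (*link != NULL) {
        sk_time_event *te = *link;
        if (te->id == SK_DELETED_EVENT_ID) {
            *link = te->next;
            free(te);
            freed++;
        } else {
            link = &te->next;
        }
    }
    return freed;
}
```

Because marking never frees memory, a handler that marks its own event still returns into valid memory; the object disappears only at the next sweep.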
IV. Scheduling and execution of events
Redis is single-threaded and spends its whole life in a while loop inside aeMain. The loop keeps calling aeProcessEvents, which schedules the file events and time events described above and decides when to handle them.
void aeMain(aeEventLoop *eventLoop) {
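The shape of such a main loop can be sketched as follows. This is a toy model with hypothetical `sk_` names: the processing step is a function pointer supplied by the caller, and the loop runs until something sets the stop flag.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sk_loop {
    int stop;                               /* non-zero ends the loop */
    int iterations;                         /* bookkeeping for the example */
    void (*process)(struct sk_loop *loop);  /* stands in for event processing */
} sk_loop;

/* Keep processing events until some handler sets loop->stop. */
void sk_main(sk_loop *loop) {
    assert(loop != NULL && loop->process != NULL);
    loop->stop = 0;
    while (!loop->stop) {
        loop->process(loop);
    }
}

/* Example processing step: asks the loop to stop on the third pass. */
void sk_process_three_times(sk_loop *loop) {
    if (++loop->iterations == 3) loop->stop = 1;
}
```

Everything the server does happens inside the processing step, so one slow step delays every event behind it.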
What the aeProcessEvents function does inside is actually quite simple, as outlined below:
1. Call aeSearchNearestTimer to obtain the time event whose arrival time is closest to the current time.
2. Calculate how long it will be until the time event found in step 1 fires, and store the result in a struct timeval (the struct timeval * pointer is NULL if step 1 found no time event);
3. Block and wait for file events to occur. The maximum blocking time is determined by step 2.
4. If a file event is obtained within the maximum blocking time, the corresponding read event handler or write event handler is called according to the type of the file event.
5. Traverse the time event list and call the handler of every event that is due. A time event already marked for deletion (ID equal to AE_DELETED_EVENT_ID) may be encountered along the way; it is removed from the list and freed. The handler's return value determines whether the event must fire again after a given interval.
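Steps 1 and 2 above can be sketched as a pair of small functions. The `sk_` names are hypothetical and the sketch makes two simplifying assumptions: due times are single millisecond values, and the "no time event" case is reported as -1 rather than as a NULL `struct timeval` pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sk_time_event {
    long long when_ms;  /* absolute due time in milliseconds */
    struct sk_time_event *next;
} sk_time_event;

/* Step 1: the list is unordered, so scan all of it for the earliest event. */
const sk_time_event *sk_search_nearest(const sk_time_event *head) {
    const sk_time_event *nearest = NULL;
    for (const sk_time_event *te = head; te != NULL; te = te->next) {
        if (nearest == NULL || te->when_ms < nearest->when_ms) nearest = te;
    }
    return nearest;
}

/* Step 2: how long the wait for file events may block.
 * Returns -1 when there is no time event (block until a file event arrives)
 * and 0 when the nearest time event is already due (do not block at all). */
long long sk_poll_timeout_ms(const sk_time_event *head, long long now_ms) {
    assert(now_ms >= 0);
    const sk_time_event *nearest = sk_search_nearest(head);
    if (nearest == NULL) return -1;
    long long wait = nearest->when_ms - now_ms;
    return wait > 0 ? wait : 0;
}
```

The returned value would then be handed to the multiplexing call in step 3 as its timeout, so the loop wakes up in time for the next time event even when no socket becomes ready.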
V. Questions
Q1: Is the timing of events accurate?
A1: Time events cannot fire at exactly the scheduled time; they generally fire slightly later. In addition, under Redis's single-threaded model time events run serially, so if one time event takes a long time to process, the timing of the events after it becomes even less accurate. Also, because the time event list is unordered, in extreme cases an event that should fire later may be handled before one that should fire earlier. Fortunately, Redis has only one time event, so the impact is small.
Q2: The number of file descriptors aeEventLoop can listen on is fixed when it is created. What happens when the maximum number of client connections is later changed dynamically with the config set maxclients command?
A2: aeEventLoop provides the aeResizeSetSize function, which reallocates the events and fired arrays to adjust the number of sockets aeEventLoop can listen on. When the new maxclients value is larger than the previous one, this function is called to increase the number of file descriptors aeEventLoop can listen on, so that more client connections can be supported.
int aeResizeSetSize(aeEventLoop *eventLoop, int setsize) {
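A resize along these lines can be sketched as follows. The `sk_` names are hypothetical and plain `realloc` is used as the allocator; the rule of refusing to shrink below the highest registered descriptor is part of this sketch, added so that a live socket never loses its slot.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int mask; } sk_file_event;
typedef struct { int fd; int mask; } sk_fired_event;

typedef struct {
    int maxfd;              /* highest registered fd, -1 if none */
    int setsize;            /* current capacity of both arrays */
    sk_file_event *events;
    sk_fired_event *fired;
} sk_loop;

/* Change the capacity of both arrays. Returns 0 on success, -1 on error. */
int sk_resize_setsize(sk_loop *loop, int setsize) {
    assert(loop != NULL && setsize > 0);
    if (setsize == loop->setsize) return 0;
    if (loop->maxfd >= setsize) return -1;  /* would drop a registered fd */

    sk_file_event *events =
        realloc(loop->events, sizeof(*events) * (size_t)setsize);
    if (events == NULL) return -1;
    loop->events = events;

    sk_fired_event *fired =
        realloc(loop->fired, sizeof(*fired) * (size_t)setsize);
    if (fired == NULL) {
        /* events already has the new capacity: never advertise more. */
        if (setsize < loop->setsize) loop->setsize = setsize;
        return -1;
    }
    loop->fired = fired;

    /* Slots above maxfd hold no registration; realloc leaves new memory
     * uninitialized, so clear their masks explicitly. */
    for (int i = loop->maxfd + 1; i < setsize; i++) loop->events[i].mask = 0;
    loop->setsize = setsize;
    return 0;
}
```

Growing from 8 to 32 slots with descriptor 5 registered succeeds and leaves slots 6 to 31 empty, whereas shrinking to 4 slots is refused because descriptor 5 would no longer fit.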
VI. Summary
Redis handles events in a very neat way. File events and time events cooperate: the time remaining before the next time event is due is spent waiting for and handling file events, which avoids busy-waiting on the CPU while still handling file events promptly. In addition, whether a time event is removed or fired again is left entirely to the user through the return value of the timeProc function, which makes time events flexible to use.