If someone asks you, “What business scenarios would you use Redis for?” I think you’re probably going to say, “I’m going to use this as a cache because it stores the data from the back-end database in memory and then reads it directly from memory, so it’s very responsive.” Yes, this is a common scenario for Redis, but there is one problem that should never be ignored: if the server goes down, all the data in memory will be lost.
An easy solution would be to recover the data from the back-end database, but that approach has two problems. First, frequent access puts a lot of strain on the database. Second, reading from a slow database is certainly not as fast as reading from Redis, so applications that depend on the data respond slowly. It is therefore critical for Redis to persist its data so that recovery does not have to go through the back-end database.
Currently, Redis has two main persistence mechanisms: AOF logging and RDB snapshots. Over the next two lessons we will study them separately, starting with AOF logging in this lesson.
How is AOF logging implemented?
When it comes to logging, we are most familiar with the database Write-Ahead Log (WAL): modifications are recorded in the log file before the data itself is written, so that they can be replayed after a failure. AOF logging works the other way around; it is a write-after log, meaning Redis first executes the command and applies the change in memory, and only then writes the log, as shown in the following figure:
So why does AOF execute commands before logging them? To answer that, we first need to know what is recorded in the AOF.
Whereas traditional database logs, such as the redo log, record the modified data itself, AOF records every write command Redis receives, stored as plain text.
Let’s take a look at the AOF entry that Redis records after receiving the command “set testkey testvalue”. “*3” indicates that the command has three parts. Each part starts with “$” plus a number, followed by the actual command name, key, or value, where the number is the byte length of that part. For example, “$3 set” means this part is three bytes long: the “set” command.
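As a rough illustration, here is a minimal Python sketch that serializes a command into this RESP-style text format. It is not Redis source code, and the function name `encode_aof_entry` is made up for this example:

```python
def encode_aof_entry(*parts: str) -> str:
    """Serialize a command into the RESP-style text format used by the AOF.

    '*3' says the entry has three parts; each part is prefixed with
    '$<byte length>' and followed by the part itself.
    """
    out = [f"*{len(parts)}\r\n"]
    for part in parts:
        data = part.encode("utf-8")
        out.append(f"${len(data)}\r\n{part}\r\n")
    return "".join(out)

entry = encode_aof_entry("set", "testkey", "testvalue")
# entry == "*3\r\n$3\r\nset\r\n$7\r\ntestkey\r\n$9\r\ntestvalue\r\n"
```

Reading the entry back is just the reverse: take the part count from the `*` line, then for each part read the `$` length and that many bytes.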
However, to avoid extra checking overhead, Redis does not syntax-check these commands when writing them to the AOF. So if the log were written before the command executed, an invalid command could be recorded, and replaying it during recovery would cause an error.
With a write-after log, the system executes the command first and records it in the log only if execution succeeds; otherwise, it reports the error to the client. So one big benefit of Redis’s write-after logging is that it avoids recording invalid commands.
In addition, AOF has another benefit: it logs after the command is executed, so it does not block the current write operation.
However, there are two potential risks to AOF.
First, if Redis goes down right after a command executes but before it is logged, that command and its data are at risk of being lost. If Redis is only used as a cache, you can reload the data from the back-end database, but if Redis is used directly as a database, the data cannot be recovered, because the command was never logged.
Second, while AOF avoids blocking the current command, it risks blocking the next one. The AOF log is also written in the main thread, so if the disk is under heavy write pressure when the log is flushed to disk, the write will be slow and subsequent operations will stall.
If you look closely, you’ll see that both risks come down to the timing of when the AOF is written back to disk. In other words, if we can control when the AOF log is written back to disk after a write command executes, we can manage both risks.
Three write-back strategies
In fact, the AOF mechanism gives us three choices here: the three values of the AOF configuration item appendfsync.
- Always, synchronous write-back: the log is written back to disk synchronously as soon as each write command finishes executing.
- Everysec, write-back every second: after each write command executes, the log is first written to the in-memory AOF buffer, and the buffer’s contents are flushed to disk once per second.
- No, operating-system-controlled write-back: after each write command executes, the log is first written to the in-memory AOF buffer, and the operating system decides when to flush the buffer’s contents to disk.
None of the three write-back strategies manages both to avoid blocking the main thread and to prevent data loss. Let’s look at why.
- “Synchronous write-back” loses essentially no data, but every write command is followed by a slow disk write, which inevitably hurts main-thread performance.
- With “operating-system-controlled write-back”, Redis can move on to subsequent commands as soon as the log is in the buffer, but the timing of the flush is out of Redis’s hands: any AOF record not yet written back to disk is lost if a crash occurs.
- “Write-back every second” flushes once per second, avoiding the performance cost of synchronous write-back. The impact on performance is smaller, but in a crash the commands from the last second that have not reached disk are still lost. It is a compromise between preserving main-thread performance and not losing data.
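To make the trade-offs concrete, here is a hedged Python toy model of the three policies. The `AofWriter` class and its `tick()` method are inventions for this sketch (real Redis flushes the Everysec buffer from a background task roughly once per second); it illustrates the bookkeeping, not Redis internals:

```python
import os
import tempfile

class AofWriter:
    """Toy model of the three appendfsync policies (not Redis internals)."""

    def __init__(self, path, policy):
        self.f = open(path, "a")
        self.policy = policy      # "always", "everysec", or "no"
        self.written = 0          # commands handed to the OS page cache
        self.synced = 0           # commands known to be safely on disk

    def log(self, command):
        self.f.write(command + "\n")
        self.f.flush()
        self.written += 1
        if self.policy == "always":
            os.fsync(self.f.fileno())   # blocks until the data is on disk
            self.synced = self.written

    def tick(self):
        """Stand-in for the once-per-second flush under 'everysec'."""
        if self.policy == "everysec":
            os.fsync(self.f.fileno())
            self.synced = self.written
        # under "no", fsync is never called; the OS flushes when it likes
```

With `"always"`, `synced` tracks `written` exactly but every `log()` pays for an fsync; with `"everysec"`, at most the last second’s commands sit between `synced` and `written`; with `"no"`, the gap is unbounded until the OS flushes.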
I’ve put together a list of all three strategies for you to review, along with their pros and cons.
At this point, we can choose a write-back strategy based on the system’s performance and reliability requirements. To sum up: for high performance, choose the No strategy; for high reliability, choose the Always strategy; and if you can tolerate a little data loss but don’t want performance to suffer much, choose the Everysec strategy.
However, choosing a write-back strategy by performance requirements alone does not settle the matter once and for all. After all, the AOF is a file that records every write command received, so as write commands accumulate, the file grows larger and larger, and we must watch out for the performance problems that large AOF files bring.
The performance problems lie mainly in three areas. First, the file system itself limits how large a single file can be, so it cannot store an arbitrarily large AOF. Second, appending new command records to an oversized file becomes inefficient. Third, after an outage, the commands recorded in the AOF must be re-executed one by one for recovery; if the file is too large, the whole recovery process is very slow, which affects normal use of Redis.
So we have to have some control, and that’s where the AOF rewrite comes in.
What if the log file is too large?
In simple terms, the AOF rewrite mechanism has Redis create a new AOF file based on the database’s current state: Redis reads every key-value pair in the database and records each one with a single write command. For example, when it reads the key-value pair “testkey”: “testvalue”, the rewrite mechanism records the command set testkey testvalue. Then, during recovery, running that one command is enough to restore “testkey”: “testvalue”.
Why does rewriting make the log file smaller? Because the rewrite mechanism has a “many-to-one” effect: multiple commands in the old log file become a single command in the new log after the rewrite.
As we know, the AOF file appends every write command it receives, one by one, so when a key-value pair is modified repeatedly, the AOF records all the corresponding commands. During a rewrite, however, commands are generated from the key-value pair’s current state. This lets a single command in the rewritten log stand for one key-value pair, and during recovery, executing that one command restores it directly.
Here’s an example:
Suppose a list ends up as [“D”, “C”, “N”] after six modifications. The single command LPUSH u:list “N” “C” “D” is then enough to restore it, saving the space of five commands. For keys that have been modified hundreds or thousands of times, the space saved by rewriting is of course far greater.
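The many-to-one effect can be sketched in Python. The six modifications below are hypothetical (the lesson does not list them), chosen so the list ends as ["D", "C", "N"]; `replay` and `rewrite` are made-up helper names, not Redis functions:

```python
def replay(commands):
    """Replay logged write commands for a single list key (toy model)."""
    state = []
    for op, key, *values in commands:
        if op == "LPUSH":
            for v in values:
                state.insert(0, v)   # each value is pushed to the head
        elif op == "LPOP":
            state.pop(0)             # remove the current head
    return state

def rewrite(key, state):
    """Emit the one command a rewrite would log for this list's final state.

    LPUSH pushes head-first, so to rebuild [head, ..., tail] the values
    must be listed from tail to head.
    """
    return ("LPUSH", key, *reversed(state))

# Six hypothetical modifications of u:list recorded in the old AOF:
ops = [("LPUSH", "u:list", "A", "B"), ("LPOP", "u:list"), ("LPOP", "u:list"),
       ("LPUSH", "u:list", "N"), ("LPUSH", "u:list", "C"), ("LPUSH", "u:list", "D")]

final = replay(ops)                 # ["D", "C", "N"]
one_command = rewrite("u:list", final)  # ("LPUSH", "u:list", "N", "C", "D")
```

Replaying `one_command` alone rebuilds exactly the same final list, which is the point: six log entries collapse into one.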
However, even though the log file shrinks after an AOF rewrite, writing out the operation log for the entire database’s latest data is still a very time-consuming process. This brings us to the next question: will the rewrite block the main thread?
Will AOF rewriting block?
Unlike the AOF log, which is written back by the main thread, the rewrite is performed by the background child process bgrewriteaof, precisely to avoid blocking the main thread and degrading Redis’s performance.
I summarize the process of rewriting as “one copy, two logs”.
“One copy” means that each time a rewrite starts, the main thread forks the background bgrewriteaof child process. The fork gives the child a copy of the main thread’s memory, which contains the database’s latest data. The bgrewriteaof child can then write the copied key-value pairs, one by one, into the rewrite log as commands, without affecting the main thread.
What is “two logs”?
Since the main thread is not blocked, it can keep processing incoming operations. If a write operation arrives during the rewrite, the first log is the AOF log currently in use: Redis writes the operation to its buffer, so even if there is an outage, the current AOF log is still complete and can be used for recovery.
The second log is the new AOF rewrite log: the operation is also written to the rewrite log’s buffer, so the rewrite log never misses the latest writes. Once the child has finished turning all the copied data into operation records, the latest operations buffered in the rewrite log are appended to the new AOF file, ensuring it records the database’s latest state. At that point, the new AOF file can replace the old one.
In summary, each AOF rewrite starts with a memory copy for the rewrite, and then two logs ensure that data written during the rewrite is not lost. And because Redis uses a child process for the rewrite, the process does not block the main thread.
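Here is a simplified Python sketch of “one copy, two logs”. It stands in for fork with a deep copy, models only SET commands, and uses invented names (`RewriteSketch`, `finish_rewrite`), so treat it as an illustration of the bookkeeping, not of Redis internals:

```python
import copy

class RewriteSketch:
    """Toy model of 'one copy, two logs' during an AOF rewrite."""

    def __init__(self, data):
        self.data = dict(data)    # live in-memory database
        self.aof = []             # log 1: the AOF currently in use
        self.rewrite_buf = None   # log 2: buffer for the new AOF (during rewrite)

    def write(self, key, value):
        self.data[key] = value
        cmd = ("SET", key, value)
        self.aof.append(cmd)               # always goes to the current AOF
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)   # also buffered for the new AOF

    def bgrewriteaof(self):
        # Stands in for fork: the child gets a consistent copy of the data.
        self.rewrite_buf = []
        return copy.deepcopy(self.data)

    def finish_rewrite(self, snapshot):
        # "Child" output: one command per key in the copied data...
        new_aof = [("SET", k, v) for k, v in snapshot.items()]
        # ...then append the writes that arrived during the rewrite.
        new_aof += self.rewrite_buf
        self.aof, self.rewrite_buf = new_aof, None  # swap in the new AOF
        return new_aof
```

Replaying the new AOF from an empty database yields exactly the current data, including writes made while the rewrite was in progress.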
Summary
In this lesson, I introduced the AOF mechanism that Redis uses to avoid data loss. It ensures data reliability by recording operation commands one by one and re-executing them one by one during recovery.
This approach may seem “simple”, but it carefully weighs the impact on Redis performance. Specifically, it offers three write-back strategies for the AOF log, Always, Everysec, and No, which in that order go from most to least reliable and from slowest to fastest.
In addition, to keep log files from growing too large, Redis provides the AOF rewrite mechanism, which generates write commands directly from the latest state of the data in the database to produce a new log. The rewrite is done by a background child process, avoiding blocking the main thread.
The three write-back strategies embody an important principle in system design: the trade-off, here between performance and reliability. I consider this a key philosophy of system design and development, and I really hope you understand it well enough to apply it in your daily work.
However, you may have noticed that both the write-back timing and the rewrite mechanism do their work while the log is being written: the write-back timing avoids blocking the main thread during logging, and the rewrite keeps the log file from growing too large. But when the log is used, that is, when AOF is replayed for recovery, all the logged operations still have to be executed. Given Redis’s single-threaded command execution, they can only run sequentially, one by one, which makes the “replay” process slow.
So, is there a way both to avoid data loss and to recover faster? Of course there is: the RDB snapshot. We will study it together in the next lesson; stay tuned.
One question per lesson
In this lesson, I have two quick questions for you:
- The AOF rewrite is performed by the bgrewriteaof child process without involving the main thread. We said today that the rewrite is non-blocking in the sense that the child process does not block the main thread. But can you spot any other potential blocking risks in the rewrite process? If so, where do they occur?
- The AOF rewrite uses a separate rewrite log. Why doesn’t it simply reuse the AOF log that is already in use?
I hope you think about these two questions and share your answers in the comments section. In addition, you are welcome to forward the content of this lesson to more people to exchange and discuss.