Introduction
Redis operates in memory, so reads and writes are very fast, with QPS reaching around 100,000. It is often used as a cache server to relieve pressure on the database. Data held in memory is easily lost, for example when the machine goes down, so a mechanism is needed to protect Redis data from being lost in a failure. This is Redis's persistence mechanism, which comes in two forms:
- Snapshot
- AOF log
Snapshot
Snapshot principle
As mentioned in the Basics of Redis, Redis is single-threaded. This one thread has to handle client requests as well as the logical reads and writes on the in-memory data structures. If it also had to perform the file I/O for persistence, it would be hard to maintain high performance. Redis therefore relies on the operating system's multi-process copy-on-write (COW) mechanism to implement snapshot persistence. For the principle behind copy-on-write, see the article on Linux copy-on-write.
Fork (multiple processes)
Redis calls the glibc fork() function to create a child process when it persists a snapshot. The child process handles all of the snapshot persistence, while the parent process continues to serve client requests. The child shares the parent's code and data segments in memory. It does not modify the existing in-memory data structures; it simply iterates over the in-memory data and serializes it to disk. Meanwhile, the parent keeps receiving client requests and modifying the data in memory. When a client request modifies some data, the operating system's COW mechanism separates the affected part of the data segment. Put simply, a data segment consists of many data pages. When data is modified, the system copies the page that holds it and applies the modification to the copy. The page seen by the child does not change, so the data it sees in memory is frozen at the moment the process is created, just like a snapshot.
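The fork-based snapshot described above can be sketched in a few lines of Python. This is only an illustration, not how Redis is implemented: it assumes a Unix-like system where os.fork() is available, uses JSON as a stand-in for the real snapshot format, and the names snapshot_with_fork and demo are hypothetical.

```python
import json
import os
import tempfile


def snapshot_with_fork(data, path):
    """Fork a child that serializes `data` to `path`; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        # Child: its view of memory is the parent's memory at fork time.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(data, f)
            status = 0
        finally:
            # Exit immediately so the child never runs the parent's code path.
            os._exit(status)
    return pid


def demo():
    data = {"counter": 1}
    fd, path = tempfile.mkstemp(suffix=".snapshot")
    os.close(fd)
    try:
        pid = snapshot_with_fork(data, path)
        # Parent: keeps modifying memory while the child persists the old view.
        data["counter"] = 2
        os.waitpid(pid, 0)
        with open(path) as f:
            snapshot = json.load(f)
    finally:
        os.remove(path)
    return snapshot, data
```

The parent's write after the fork is not visible to the child, so the file holds the value from the moment of the fork while the parent already works with the new value.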
AOF log
AOF principle
The AOF log stores the commands Redis has executed, in order of execution. It records only commands that modify memory, similar to the MySQL binlog.
AOF writing process:
When Redis receives a modification command from a client, it first validates the parameters. If they are valid, it immediately appends the command to the AOF log, that is, writes it to disk. Once the commands are in the AOF log, the data can be rebuilt by replaying the log even if the in-memory data is lost in a crash, which restores Redis to its state before the crash.
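The append-then-replay idea can be illustrated with a toy log in Python. This is a minimal sketch under simplifying assumptions: the one-line "SET key value" text format is invented for the example and is not the real AOF file format, keys may not contain spaces, and append_command, replay, and demo are hypothetical names.

```python
import os
import tempfile


def append_command(log_path, key, value):
    """Append one SET-style command to the log and force it to disk."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"SET {key} {value}\n")
        f.flush()
        os.fsync(f.fileno())


def replay(log_path):
    """Rebuild the in-memory state by re-executing every logged command."""
    state = {}
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            op, key, value = line.rstrip("\n").split(" ", 2)
            if op == "SET":
                state[key] = value
    return state


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "toy.aof")
        append_command(log_path, "a", "1")
        append_command(log_path, "b", "x y")
        append_command(log_path, "a", "2")
        # Simulate a restart: memory is gone, only the log remains.
        return replay(log_path)
```

Replaying the three logged commands yields the latest value for each key, which is the state the process held before the simulated restart.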
As Redis runs for a long time, it receives a great many commands, so the AOF log keeps growing. Many of the commands in the log, however, overwrite earlier ones, and only the latest data needs to be kept. For example, suppose a key AAA starts with the value 1 and is modified 1000 times while Redis is running, ending with the value 111. The AOF then contains 1000 commands for key AAA, yet recording only its latest value, 111, is enough to guarantee persistence. The AOF therefore needs to be slimmed down.
AOF rewrite
As the name implies, AOF rewrite rewrites the commands in the AOF in order to slim it down. Redis provides the bgrewriteaof command to rewrite the AOF log. The rewrite works as follows:
- Redis creates a child process that traverses memory, converts it into a series of Redis commands, and serializes them into a new AOF log file. The incremental AOF log produced while the rewrite was running is then appended to the new file, and the new AOF log immediately replaces the old one.
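The effect of the rewrite can be sketched in Python using the key AAA example from earlier. This is a simplified model, not Redis's implementation: commands are represented as hypothetical ("SET", key, value) tuples, and rewrite_aof, apply_commands, and demo are illustrative names.

```python
def apply_commands(commands):
    """Replay a list of ("SET", key, value) commands into a fresh dict."""
    state = {}
    for op, key, value in commands:
        if op == "SET":
            state[key] = value
    return state


def rewrite_aof(memory, incremental):
    """Build a new log from the in-memory data plus the incremental commands."""
    # Step 1: one command per key currently in memory.
    new_log = [("SET", key, value) for key, value in memory.items()]
    # Step 2: append the commands that arrived while the rewrite was running.
    new_log.extend(incremental)
    return new_log


def demo():
    # Old log: key AAA written 1000 times, the last value being 111.
    old_log = [("SET", "AAA", str(i)) for i in range(999)]
    old_log.append(("SET", "AAA", "111"))
    memory = apply_commands(old_log)
    incremental = [("SET", "BBB", "5")]
    new_log = rewrite_aof(memory, incremental)
    return old_log, incremental, new_log
```

In this sketch the new log holds two commands instead of 1001, and replaying it produces the same state as replaying the old log followed by the incremental commands.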
Persistence summary
- A snapshot is taken by forking a child process that traverses the entire memory, which is a resource-intensive operation. The large writes to disk also increase system load.
- fsync for the AOF is a time-consuming I/O operation that degrades Redis performance and increases the I/O burden on the system.
For these reasons, persistence is typically performed on replica nodes, while the master node serves clients.
Hybrid persistence
RDB + incremental AOF log format: when Redis restarts, it first restores from the RDB content and then replays the incremental AOF log, which gives fast startup and data recovery.
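The two-phase recovery can be sketched in Python as follows. This is a rough model only: a JSON string stands in for the RDB content, the incremental log is a list of hypothetical (op, key, value) tuples, and recover is an illustrative name rather than anything Redis provides.

```python
import json


def recover(snapshot_text, incremental_log):
    """Load the snapshot in bulk, then replay only the short incremental log."""
    # Phase 1: bulk-load the snapshot (stand-in for the RDB part).
    state = json.loads(snapshot_text)
    # Phase 2: replay the commands recorded after the snapshot was taken.
    for op, key, value in incremental_log:
        if op == "SET":
            state[key] = value
        elif op == "DEL":
            state.pop(key, None)
    return state


def demo():
    snapshot_text = '{"a": "1", "b": "2"}'
    incremental_log = [("SET", "a", "3"), ("DEL", "b", None)]
    return recover(snapshot_text, incremental_log)
```

Startup is fast because most of the data comes from one bulk load, and only the commands issued after the snapshot have to be replayed one by one.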