Understanding Redis persistence

There are already many similar introductions online, but I wrote my own summary, which aims to be fairly complete in both content and detail.

The article runs to more than 4K words and is fairly dense. I wrote it on and off over several days, and I hope it helps you. I plan to keep publishing Redis-related articles and learning along with you.

Okay, let's get to the main content:

Redis has two persistence methods: RDB and AOF. Below I describe in detail what each method does at each stage and what its characteristics are.

1. RDB persistence

RDB persistence is the default Redis persistence mode.

The RDB file it generates is a compressed binary file from which you can restore the database state as it was when the RDB file was generated.

PS: Database state refers to the Redis server's non-empty databases and their key-value pairs.

1.1 Creating RDB Files

There are two commands to generate RDB files, one is SAVE and the other is BGSAVE.

The difference is that SAVE blocks the Redis server process until the RDB file has been created.

While the server process is blocked, it cannot process any command requests.

BGSAVE does not block the server process: it forks a child process that creates the RDB file while the server process (the parent process) continues processing command requests.

When the database state is written, the new RDB file replaces the old RDB file atomically.
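The fork-then-replace pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Redis itself is written: the names bgsave and finish_bgsave are hypothetical, the snapshot is JSON rather than the real RDB binary format, and the code needs a Unix-like system because it uses os.fork.

```python
import json
import os


def bgsave(db, rdb_path):
    """Fork a child that dumps `db` to a temp file; return (child_pid, tmp_path)."""
    tmp_path = rdb_path + ".tmp"
    pid = os.fork()
    if pid == 0:
        # Child process: it sees `db` exactly as it was at fork time.
        exit_code = 1
        try:
            with open(tmp_path, "w") as f:
                json.dump(db, f)
                f.flush()
                os.fsync(f.fileno())
            exit_code = 0
        finally:
            os._exit(exit_code)  # exit immediately, skipping the parent's cleanup handlers
    # Parent process: returns right away and can keep serving commands.
    return pid, tmp_path


def finish_bgsave(pid, tmp_path, rdb_path):
    """Wait for the child, then atomically replace the old snapshot file."""
    _, status = os.waitpid(pid, 0)
    if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
        return False
    os.replace(tmp_path, rdb_path)  # atomic rename over the old file
    return True
```

Between the two calls the parent is free to modify db; the snapshot still reflects the state at the moment of the fork. A SAVE-style call would simply do the dump in the parent and block until it finishes.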

What happens if a client sends SAVE, BGSAVE, or BGREWRITEAOF to a server during BGSAVE execution?

Answer: None of the three commands is executed while BGSAVE is running.

Detailed reason: SAVE and BGSAVE are rejected outright, to prevent two rdbSave calls from running at the same time and causing a race condition. BGREWRITEAOF is deferred until BGSAVE has finished. Conversely, if BGREWRITEAOF is running, a BGSAVE command is rejected. Both BGREWRITEAOF and BGSAVE are executed by child processes, so they do not conflict functionally; they are kept from running simultaneously for performance reasons, because two child processes would both be doing heavy disk writes at the same time.
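The rules above fit in a small lookup table. The sketch below only encodes the four cases this section states; the function name dispatch and the outcome strings are made up for illustration, and combinations the text does not mention are deliberately left unspecified.

```python
# (running child task, incoming command) -> outcome, as described in the text above
RULES = {
    ("BGSAVE", "SAVE"): "rejected",
    ("BGSAVE", "BGSAVE"): "rejected",
    ("BGSAVE", "BGREWRITEAOF"): "deferred",
    ("BGREWRITEAOF", "BGSAVE"): "rejected",
}


def dispatch(running, command):
    """Return what happens to `command` while `running` (or None) is in progress."""
    if running is None:
        return "executed"
    # Cases the article does not cover are reported as such rather than guessed.
    return RULES.get((running, command), "not covered here")
```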

1.2 RDB file loading

The RDB file is loaded automatically when the server starts, so there is no dedicated command for loading it. The main process is blocked while the file is being loaded.

As long as AOF persistence is not enabled, RDB files are automatically loaded when detected at startup.

When AOF persistence is enabled on the server, the server will use AOF files to restore database state in preference. The reason is that AOF files are usually updated more frequently than RDB files.
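The startup decision described in the last two paragraphs boils down to a simple preference order. Here is a minimal sketch; the function name and its boolean parameters are hypothetical and only mirror the rule as stated here, not the actual server code.

```python
def choose_startup_source(aof_enabled, rdb_exists):
    """Pick which file the server restores from at startup."""
    if aof_enabled:
        # AOF is preferred because it is usually more up to date than the RDB file.
        return "AOF"
    if rdb_exists:
        return "RDB"
    return None  # nothing to load: start with an empty database
```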

1.3 Automatic interval saving

For RDB persistence, we generally use BGSAVE for persistence because it does not block the server process.

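In the Redis configuration file, you can set the conditions under which the server executes the BGSAVE command.

The default Redis configuration is as follows:

save 900 1 // the database was modified at least once within 900 seconds
save 300 10 // the database was modified at least 10 times within 300 seconds
save 60 10000 // the database was modified at least 10000 times within 60 seconds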

As long as one of these conditions is met, the server executes the BGSAVE command.
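The "any one condition is enough" rule can be sketched as follows. The names SAVE_POINTS and should_bgsave and the parameters dirty and last_save are illustrative assumptions; the sketch assumes the server keeps a counter of changes since the last save and the time of that save, and checks them periodically.

```python
# (seconds, changes) pairs taken from the default configuration quoted above
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(dirty, last_save, now, save_points=SAVE_POINTS):
    """Return True if any save point is satisfied.

    dirty     -- number of modifications since the last successful save
    last_save -- timestamp (seconds) of the last successful save
    now       -- current timestamp (seconds)
    """
    elapsed = now - last_save
    return any(
        elapsed > seconds and dirty >= changes
        for seconds, changes in save_points
    )
```

For example, a single change is saved only after 900 seconds have passed, while 10000 changes trigger a save after 60 seconds.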

2. AOF persistence

As we know from the introduction above, RDB persistence persists by saving database state. AOF is different in that it records database state by saving write commands to the database.

For example, when SET key 123 is executed, Redis saves this write command to the AOF file.

The next time the server is started, you can restore the database state before the server was shut down by loading and executing the commands saved in the AOF file.

The overall process is the same as for RDB persistence: a file (RDB or AOF) is created, and it is loaded to restore the data the next time the server starts.
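To make the "log the write commands, replay them at startup" idea concrete, here is a toy version. It is a sketch under assumptions: the function names are hypothetical, only SET and DEL are handled, and each command is stored as one JSON line, whereas a real AOF file stores commands in the Redis protocol format.

```python
import json


def append_command(aof_path, *args):
    """Append one write command, e.g. ("SET", "key", "123"), to the log file."""
    with open(aof_path, "a") as f:
        f.write(json.dumps(args) + "\n")


def replay(aof_path):
    """Rebuild the database state by re-executing every logged command in order."""
    db = {}
    with open(aof_path) as f:
        for line in f:
            name, *rest = json.loads(line)
            if name.upper() == "SET":
                key, value = rest
                db[key] = value
            elif name.upper() == "DEL":
                for key in rest:
                    db.pop(key, None)
    return db
```

Replaying a log that contains SET key 123 followed by SET other 1 and DEL other yields a database holding only key.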

So, how exactly does AOF persistence work?

2.1 AOF Persistence implementation

The implementation of AOF persistence can be divided into three steps: command append, file write, and file synchronization.

Command append is easy to understand: write commands are appended to the end of the AOF buffer.

What about file write and file synchronization? I was confused at first too and eventually found the answer online; interested readers can consult the references at the end of the article.

File write means writing the buffer contents to the AOF file; file synchronization means saving the AOF file to disk.

Now that we know what the terms mean, let's take a closer look at how these two steps work.

As described in Redis Design and Implementation, the Redis server process is an event loop in which file events (socket readable and writable events) are responsible for receiving command requests from clients and sending command results back to them.

While handling file events, the server may execute write commands, which append content to the end of the AOF buffer. Therefore, the flushAppendOnlyFile function is called each time before the server finishes an iteration of the event loop.

This method does two things:

WRITE: Depending on the condition, writes the buffer contents to the AOF file.
SAVE: Depending on the condition, calls the fsync or fdatasync function to save the AOF file to disk.

Both steps are performed according to the appendfsync option in the Redis configuration file. There are three options:

appendfsync always: save after every command
appendfsync everysec (default, recommended): save once per second
appendfsync no: never save explicitly

Here is how the three options differ:

appendfsync always: WRITE and SAVE are both executed every time a command is executed.
appendfsync everysec: SAVE is, in principle, executed once per second.
appendfsync no: WRITE is executed after each command, but SAVE is skipped; SAVE is performed only in one of the following cases:
- Redis is shut down
- The AOF feature is turned off
- The operating system flushes its write cache (because the cache is full or a periodic flush runs; the timing depends on the OS, usually about 30 seconds)
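The difference between WRITE and SAVE under the three policies can be sketched with a small writer class. Everything here is illustrative: AofWriter, its attributes, and the sync_count counter are hypothetical names, and the once-per-second sync runs inline rather than in the background, which is a simplification of what a real server would do.

```python
import os
import time


class AofWriter:
    def __init__(self, path, policy="everysec"):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy      # "always", "everysec" or "no"
        self.buffer = bytearray() # the AOF buffer: commands waiting to be written
        self.last_sync = 0.0
        self.sync_count = 0       # only here so the behaviour can be observed

    def append(self, command):
        """Command append: add the encoded command to the in-memory buffer."""
        self.buffer += command

    def flush(self, now=None):
        """Called once per event-loop iteration: WRITE, then SAVE if the policy says so."""
        now = time.monotonic() if now is None else now
        wrote = bool(self.buffer)
        if wrote:
            # WRITE: hand the data to the OS (a robust version would loop on short writes).
            os.write(self.fd, bytes(self.buffer))
            self.buffer.clear()
        if self.policy == "always":
            if wrote:
                self._save(now)
        elif self.policy == "everysec":
            if now - self.last_sync >= 1.0:
                self._save(now)
        # policy "no": never sync explicitly; the OS decides when data reaches the disk.

    def _save(self, now):
        os.fsync(self.fd)         # SAVE: force the file contents to disk
        self.last_sync = now
        self.sync_count += 1

    def close(self):
        os.close(self.fd)
```

With policy "always" every flush that wrote something also syncs; with "everysec" several flushes within the same second share one sync; with "no" the writer never calls fsync itself.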

The operating characteristics of the three modes are as follows:

Mode	Does WRITE block the main process?	Does SAVE block the main process?	Data lost on a crash
appendfsync always	Blocks	Blocks	At most the data of one command
appendfsync everysec	Blocks	Does not block	Generally no more than 2 seconds of data
appendfsync no	Blocks	Blocks	All data written since the operating system last synced the AOF file

Since AOF persistence works by saving write commands to a file, the AOF file records more and more content over time. The file keeps growing, and restoring from it takes longer and longer.

Redis provides AOF file rewriting to address this problem.

2.2 AOF rewrite

AOF rewrite creates a new AOF file to replace the old one. Both files hold the same database state, but the new file contains no redundant commands, so it is usually much smaller than the old file.

Why doesn't the new file contain any redundant commands?

Because the rewrite is implemented by reading the server's current database state. Although it is called a “rewrite”, the old file is neither read nor modified.

For example, suppose the old file stored four SET commands for a key. After rewriting, the new file records only the last SET command for that key. Therefore, the new file does not contain any redundant commands.
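The four-SET example can be reproduced with a toy model. This is a sketch only: apply_commands and rewrite are hypothetical helpers, commands are plain tuples, and only SET and DEL are modelled. The point it illustrates is that the rewrite looks at the in-memory state and never at the old log.

```python
def apply_commands(commands):
    """Simulate the server executing write commands; returns the in-memory state."""
    db = {}
    for name, key, *value in commands:
        if name == "SET":
            db[key] = value[0]
        elif name == "DEL":
            db.pop(key, None)
    return db


def rewrite(db):
    """Emit one SET per key from the current state; the old log is never consulted."""
    return [("SET", key, value) for key, value in db.items()]


old_aof = [
    ("SET", "key", "1"),
    ("SET", "key", "2"),
    ("SET", "key", "3"),
    ("SET", "key", "4"),
]
current_state = apply_commands(old_aof)  # what the server holds in memory
new_aof = rewrite(current_state)         # [("SET", "key", "4")]
```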

Because rewriting involves a lot of IO, Redis performs it in a child process; otherwise it would block the main process. The child process works on its own copy of the parent process's data, which avoids locking and keeps the data safe.

However, while the child process is rewriting, the main process keeps handling commands. New write commands may modify the database, so the current database state can become inconsistent with the state stored in the rewritten AOF file.

To solve this problem, Redis sets up an AOF rewrite buffer. During an AOF rewrite, the main process performs the following three steps:

Execute the client's command
Append the executed write command to the AOF buffer
Append the executed write command to the AOF rewrite buffer

When the child process finishes rewriting, it sends a signal to the main process, which calls a signal handler that performs the following steps:

Write the contents of the AOF rewrite buffer to the new AOF file, so that the database state saved in the new file matches the current database state
Rename the new file so that it atomically overwrites the existing AOF file, completing the swap of old and new files

When the signal handler completes, the main process resumes processing client commands.

Thus, throughout an AOF rewrite, the main process is blocked only while the signal handler runs, and at no other time.
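The interplay between the rewrite buffer and the final rename can be sketched in a single process. This is a simplified model under several assumptions: the Server class and its methods are hypothetical, a dictionary copy stands in for the data the forked child would see, commands are stored as JSON lines, and the regular AOF buffer is skipped by writing straight to the file.

```python
import json
import os


class Server:
    def __init__(self, aof_path):
        self.db = {}
        self.aof_path = aof_path
        self.rewrite_buffer = None  # None means no rewrite is in progress
        self.snapshot = None

    def execute(self, *cmd):
        # Step 1: execute the client's command.
        name, key, *value = cmd
        if name == "SET":
            self.db[key] = value[0]
        elif name == "DEL":
            self.db.pop(key, None)
        line = json.dumps(cmd) + "\n"
        # Step 2: append the command to the current AOF.
        with open(self.aof_path, "a") as f:
            f.write(line)
        # Step 3: during a rewrite, also remember it in the rewrite buffer.
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(line)

    def start_rewrite(self):
        self.rewrite_buffer = []
        self.snapshot = dict(self.db)  # stands in for the child's view after fork

    def finish_rewrite(self):
        """What the signal handler would do once the child reports completion."""
        tmp_path = self.aof_path + ".rewrite"
        with open(tmp_path, "w") as f:
            for key, value in self.snapshot.items():   # the child's compact output
                f.write(json.dumps(["SET", key, value]) + "\n")
            f.writelines(self.rewrite_buffer)          # commands that arrived meanwhile
        os.replace(tmp_path, self.aof_path)            # atomic swap of old and new file
        self.rewrite_buffer = None
        self.snapshot = None
```

If the server executes SET a 1 and SET a 2, starts a rewrite, executes SET b 3, and then finishes the rewrite, the new file contains just SET a 2 followed by SET b 3.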

3. Official recommendations for choosing a persistence method

So far we have covered most of what there is to say about Redis's two persistence methods. You may be wondering which one to choose in a real project. Here is the official advice:

In general, if you need a high level of data safety, use both persistence methods together. If you can live with a few minutes of data loss in a disaster, you can use RDB alone.

Many users use only AOF, but RDB lets you take a full snapshot of the data from time to time and gives faster restarts, so it is best to use RDB as well.

In terms of data recovery, RDB starts up faster for two reasons:

An RDB file holds only one record per piece of data, unlike an AOF log, which may hold multiple operations on the same piece of data. So each piece of data only needs to be written once.
The RDB storage format matches the encoding Redis uses for data in memory, so no re-encoding is needed and loading consumes far less CPU than loading an AOF log.

Note:

The child process forked for RDB snapshot persistence can, in the worst case, consume as much memory as the parent process. When copy-on-write really has to copy pages, it has a large impact on performance and memory consumption. For example, suppose the machine has 8 GB of memory and Redis already uses 6 GB. A save could then need up to another 6 GB, for 12 GB in total, which exceeds the system's 8 GB. The system starts swapping, and if virtual memory also runs out, the process crashes and data is lost. So when using Redis, you must plan the system's memory capacity.
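The arithmetic in this note is easy to turn into a quick capacity check. The helper names below are hypothetical, and the doubling is a worst-case assumption (every page gets copied during the snapshot), not a measurement.

```python
def worst_case_snapshot_memory_gb(redis_used_gb):
    """Worst case: the child ends up with a full private copy of the parent's data."""
    return 2 * redis_used_gb


def fits_in_ram(redis_used_gb, total_ram_gb):
    """True if even the worst-case snapshot stays within physical memory."""
    return worst_case_snapshot_memory_gb(redis_used_gb) <= total_ram_gb
```

With the numbers from the note, worst_case_snapshot_memory_gb(6) gives 12, and fits_in_ram(6, 8) is False.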

A common design at present is to use replication to compensate for the performance cost of AOF and snapshots while still achieving persistence. That is, neither snapshots nor AOF are enabled on the master, to preserve its read/write performance, while snapshots and AOF are enabled on the slave to keep the data safe.

Conclusion

The article is rather long and covers a lot of ground, so here is a summary to help you review:

RDB persistence is the default Redis persistence mode. It records database state by saving the database's key-value pairs. RDB files are created by the SAVE and BGSAVE commands: the former blocks the main Redis process, the latter does not.
For RDB, the configuration file can set how often the BGSAVE command is executed.
AOF records the current database state by appending write commands. Its implementation can be divided into three steps: command append (to the AOF buffer), file write (buffer contents written to the AOF file), and file synchronization (AOF file saved to disk).
The file write and file synchronization behavior is determined by the appendfsync option in the configuration file.
To address the ever-growing AOF file, Redis provides AOF rewriting, which does not block the main process.
To keep the database state stored in the new AOF file consistent with the current database state, Redis introduced the AOF rewrite buffer, which stores the write commands executed while the child process is rewriting the AOF file.
Finally, some official recommendations for choosing between the two persistence methods.

References:
Redis Design and Implementation
www.cnblogs.com/zhoujinyi/a…
redisbook.readthedocs.io/en/latest/I…

PS: This article was originally published on the WeChat official account “not only Java”.