Preface

Redis is an in-memory key-value database. Because reads and writes are served from memory, its performance is very high. The trade-off is that if the server goes down, the data held in memory is lost. Redis therefore provides persistence: it synchronizes the in-memory data to disk so that the original data can be restored when Redis restarts.

Redis can be persisted in three ways:

  • RDB snapshot: writes the in-memory state of the database at a given moment to disk in binary form.
  • AOF (Append Only File): records every write command and appends it to a file as text.
  • Hybrid persistence: available since Redis 4.0, it combines the strengths of RDB and AOF. AOF keeps data loss to a minimum and is the first choice for recovery, while RDB provides cold backups at various points in time and fast recovery when the AOF file is lost, corrupted, or otherwise unavailable.

Which of the three to use depends on the application scenario. RDB and AOF are the two modes we need to master, and this article explains both in detail.

I. Overview of RDB

RDB is short for Redis DataBase. It writes a snapshot of memory at a certain moment to disk in binary form. RDB persistence can be triggered in two ways: manual saving and automatic saving.

1.1 Manual Save

Two commands perform RDB persistence manually: SAVE and BGSAVE.

  • The SAVE command blocks the server process until the RDB file has been created; during that time the server cannot process any command requests.
  • The BGSAVE command forks a child process that creates the RDB file, while the server process continues to handle command requests.

SAVE demo:

Before the test, the dump.rdb file was last modified in July. After we run the SAVE command, the file's modification time is updated, which shows that SAVE successfully triggered Redis persistence. Because SAVE blocks the Redis server while it runs, it is a dangerous command to use.

BGSAVE demo:

With BGSAVE, the save is performed by a child process, so the Redis server can still process client command requests while the child writes the file. Note that two BGSAVE commands cannot run at the same time: while one background save is in progress, another BGSAVE request is rejected.

1.2 Automatic Saving

The main difference between the two manual save commands is:

  • The SAVE command is performed by the server process and blocks the server.
  • The BGSAVE command is performed by a child process and does not block client requests.

For this reason, BGSAVE is normally preferred. Redis also has a save configuration option that triggers automatic saving at intervals, and automatic saving is also triggered by FLUSHALL and by master-slave synchronization.

1.2.1 The save option

The Redis configuration file has a save option. Some readers may assume it runs the SAVE command described earlier, but it actually performs a BGSAVE: a child process is created to write the RDB file.

Note that a Redis snapshot is a full snapshot: each time one is triggered, all data currently in memory is written to disk. If snapshots run frequently on a large data set, Redis performance may suffer.

# Save if at least 1 key changed within 900 seconds
save 900 1
# Save if at least 10 keys changed within 300 seconds
save 300 10
# Save if at least 10000 keys changed within 60 seconds
save 60 10000
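
The save points above can be read as a list of (seconds, changes) pairs that the server checks periodically. The following is a minimal sketch of that check, not Redis source code: the function name and the `dirty` and `seconds_since_last_save` parameters are hypothetical names chosen for illustration.

```python
# Illustrative sketch of how "save <seconds> <changes>" rules can be evaluated.
# All names here are hypothetical; this is not the Redis implementation.

SAVE_POINTS = ((900, 1), (300, 10), (60, 10000))


def should_snapshot(dirty, seconds_since_last_save, save_points=SAVE_POINTS):
    """Return True if any save point is satisfied.

    dirty: number of key changes since the last successful save.
    seconds_since_last_save: time elapsed since the last successful save.
    """
    return any(
        dirty >= changes and seconds_since_last_save > seconds
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(should_snapshot(dirty=1, seconds_since_last_save=901))
    print(should_snapshot(dirty=9, seconds_since_last_save=301))
```

A single matching rule is enough to trigger a background save, which is why a busy instance is usually saved by the 60-second rule and an idle one by the 900-second rule.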

1.2.2 The FLUSHALL command

This command clears the database and, when executed, also generates a dump.rdb file (which is now empty).

1.2.3 Exit Trigger

When Redis exits normally, persistence is also triggered and the dump.rdb file is generated.

1.2.4 Master-Slave Synchronization Trigger

In Redis master-slave replication, when a slave node performs a full synchronization, the master node executes BGSAVE and sends the resulting RDB file to the slave node, so Redis persistence is triggered automatically.

1.3 Data Recovery

Redis loads data into memory at startup. If the server detects an RDB file in its startup directory, it loads the file automatically. If dump.rdb is not in that directory, move it there before starting Redis. To verify that the file was loaded, check the startup log: Redis prints a message showing whether the RDB file was loaded.

1.4 COW mechanism

To understand RDB, we need to mention the operating system's COW mechanism. The RDB file is written by a child process. While the child is persisting the data, can the main process still modify it? Suppose key A has the value 1 when the snapshot starts and the main process then changes it to 2. The conclusion is that the data can be modified, but the value written to the RDB file is 1, the value before the modification.

Let’s analyze this process:

When fork creates a child process, the child shares memory with the parent: through a copy of the parent's page table, the child references the same physical memory as the parent. So if the parent modifies data in those pages, will the child read the modified data?

It will not, and this is where COW comes in. Redis uses the operating system's copy-on-write (COW) mechanism across processes to implement snapshot persistence. What is the advantage of this?

With COW, shared memory is not copied when the child is created. Only when a process writes to a shared page is a copy of that page made; the writer then works on its own copy while the other process keeps seeing the original. The two processes therefore operate on different memory, and data problems are avoided. This is illustrated as follows:

So the child process still saves the original value, 1. The value of A in the parent's memory is already 2, but that value will only be written the next time RDB is triggered.
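
The behaviour described here can be observed with a plain fork, independent of Redis. The sketch below assumes a Unix-like system where `os.fork` is available; the function name and the pipe-based handshake are illustrative choices. The parent changes A from 1 to 2 after forking, and the child, standing in for the RDB child process, still reports the value it had at fork time.

```python
import os


def child_view_after_parent_write():
    """Fork, let the parent modify A, and return (child's view, parent's view)."""
    data = {"A": "1"}
    result_r, result_w = os.pipe()   # child -> parent: value the child sees
    ready_r, ready_w = os.pipe()     # parent -> child: "I have modified A"

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the snapshot process.
        os.close(result_r)
        os.close(ready_w)
        os.read(ready_r, 1)                      # wait until the parent has written
        os.write(result_w, data["A"].encode())   # still the value from fork time
        os._exit(0)

    # Parent: plays the role of the Redis main process.
    os.close(result_w)
    os.close(ready_r)
    data["A"] = "2"                  # write after the fork
    os.write(ready_w, b"x")          # tell the child it may report now
    child_view = os.read(result_r, 16).decode()
    os.waitpid(pid, 0)
    os.close(result_r)
    os.close(ready_w)
    return child_view, data["A"]


if __name__ == "__main__":
    print(child_view_after_parent_write())
```

The child never sees the parent's later write, which is exactly why the RDB file contains the data as it was when the snapshot started.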

In the worst case COW ends up copying all of memory: if every page is modified during the snapshot, every page has to be copied, so the data set can safely use only about half of the available memory. Enough free memory is therefore required for COW. This also gives us another way to define a snapshot: Redis saves the data as it was at the moment RDB was triggered. The data may change afterwards, but what is saved is frozen at that moment.

Summary of COW mechanism:

fork creates a child process that shares the parent's memory. Memory is actually allocated and data copied only when the parent writes to a page, and only that page is copied, not the whole of memory.
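
As a rough way to reason about how much memory a snapshot can cost, the summary can be turned into a small estimate. This is a simplified model under stated assumptions: a 4096-byte page size is assumed, and huge pages, allocator overhead, and the child's own buffers are ignored. The function name is hypothetical.

```python
import math


def peak_memory_during_snapshot(dataset_bytes, pages_written, page_size=4096):
    """Estimate peak memory while a snapshot child is alive.

    Each page the parent writes to during the snapshot is copied once,
    so the extra cost is bounded by the size of the data set itself.
    """
    total_pages = math.ceil(dataset_bytes / page_size)
    copied_pages = min(pages_written, total_pages)
    return dataset_bytes + copied_pages * page_size


if __name__ == "__main__":
    ten_pages = 10 * 4096
    print(peak_memory_during_snapshot(ten_pages, pages_written=0))
    print(peak_memory_during_snapshot(ten_pages, pages_written=3))
    print(peak_memory_during_snapshot(ten_pages, pages_written=100))
```

With no writes during the snapshot the cost is close to zero extra memory; when every page is written, the estimate doubles, which matches the worst case discussed above.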

1.5 Advantages and Disadvantages

Advantages:

  1. RDB is well suited to cold backups. The file is compact and useful for disaster recovery (DR).
  2. RDB has very little impact on the read and write services Redis provides, so Redis can maintain high performance: the main Redis process only needs to fork a child process and let the child perform the disk I/O for RDB persistence.
  3. Compared with AOF persistence, restarting and restoring a Redis process directly from an RDB data file is much faster.

Disadvantages:

  1. RDB is prone to data loss: after an outage, the data written in the last few minutes is lost.
  2. When the data set is large, forking the child process to generate the RDB file can degrade Redis performance and make the server pause for a short time, typically a few milliseconds.

II. Overview of AOF

In addition to RDB persistence, Redis provides AOF persistence. Unlike an RDB snapshot, AOF does not save the data as of a point in time; instead, it appends every write command to an AOF file. At startup, the server restores the database state from before shutdown by loading the AOF file and executing the commands saved in it.

Redis does not enable AOF persistence by default; it can be turned on in the configuration file by setting appendonly to yes.

To test AOF on its own, comment out the RDB save lines, set save to "", and restart Redis.

Run a few commands and open the appendonly file: it contains the commands we used to modify Redis.

2.1 Persistence Implementation

AOF persistence can be divided into three steps: command append, file write and file sync.

2.1.1 Command Append

When AOF persistence is enabled, the server, after executing a write command, appends that command in protocol format to the end of the aof_buf buffer in the server state.
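
To make "protocol format" concrete, here is a minimal sketch of encoding a command as a RESP array of bulk strings, which is the text form that appears in an AOF file. The helper name `encode_resp` and the in-memory `aof_buf` bytearray are illustrative stand-ins, not Redis internals.

```python
def encode_resp(*args):
    """Encode a command such as ("SET", "k1", "v1") as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


# Stand-in for the server's AOF buffer: commands are appended after they execute.
aof_buf = bytearray()
aof_buf += encode_resp("SET", "k1", "v1")
aof_buf += encode_resp("RPUSH", "list", "a", "b")

if __name__ == "__main__":
    print(bytes(aof_buf))
```

Each command carries its argument count and the byte length of every argument, so the file can be parsed back unambiguously when it is replayed.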

2.1.2 File writing and synchronization

To make file writes more efficient, modern operating systems do not write straight to disk when a program calls the write function. The data is first held in an in-memory buffer and is only really written to disk once the buffer fills up or a time limit is exceeded, as in the following flow chart:

Buffering improves write efficiency, but it also creates a safety problem: if the machine goes down, the data still in the buffer is lost. The operating system therefore provides two synchronization functions, fsync and fdatasync, which force the buffered data to be written to disk immediately and so keep it safe.

When data is synchronized is controlled by the appendfsync setting in redis.conf, which determines both the efficiency and the safety of AOF persistence.

| appendfsync value | Behavior |
| --- | --- |
| always | Every write command from a client is synced to the AOF file. This is the safest policy, but every write request incurs disk I/O, so it is slow. |
| everysec | The default policy. The AOF file is synced to disk once per second, so at most about 1 second of data may be lost. |
| no | Redis does not sync the AOF file itself and leaves the timing to the operating system. This is faster but less safe. |
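
The three policies differ only in when fsync is called after the buffer has been written. The sketch below illustrates that difference with a hypothetical `AofWriter` class; it is a simplification (for example, it calls fsync inline and assumes `os.write` writes the whole buffer, which is reasonable for small writes to a regular file) and is not how Redis itself is structured.

```python
import os
import time


class AofWriter:
    """Toy AOF writer: buffer commands, write them out, fsync according to policy."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown appendfsync policy: %r" % policy)
        self.policy = policy
        self.buf = bytearray()
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0  # counted so the effect of each policy is visible

    def append(self, command):
        self.buf += command

    def flush(self, now=None):
        """Write the buffer to the file, then fsync if the policy asks for it."""
        now = time.monotonic() if now is None else now
        if self.buf:
            os.write(self.fd, bytes(self.buf))   # lands in the OS page cache
            self.buf.clear()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.fd)                    # force the data to disk
            self.last_fsync = now
            self.fsync_calls += 1
        # policy "no": never fsync here; the operating system decides when.

    def close(self):
        os.close(self.fd)
```

Calling `flush` after every command with `always` yields one fsync per command, while `everysec` yields at most one per second however many commands arrive, which is where the "at most about 1 second of data" bound comes from.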

2.2 Loading AOF Files (Load)

The AOF file contains all the commands needed to restore the state of the database, so the server can restore the state of the database as long as it reads the AOF file and executes the commands again.

Redis loads the AOF file and restores the database state as follows:

1. Create a pseudo client without a network connection. Redis commands can only be executed in a client context, so the server uses a pseudo client with no network connection to execute the write commands stored in the AOF file.

2. Parse and read one write command from the AOF file.

3. Use the pseudo client to execute that write command.

4. Repeat steps 2 and 3 until all write commands in the AOF file have been processed.
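
The loading steps above can be sketched as a small replay loop. This is an illustration under stated assumptions: a plain dict stands in for the database, only SET, RPUSH and DEL are handled, and the function names are hypothetical. A real AOF file contains other commands as well, which this sketch rejects.

```python
def parse_resp_commands(data):
    """Yield each command in the AOF bytes as a list of strings."""
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = data.index(b"\r\n", pos)
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])
            start = end + 2
            args.append(data[start:start + length].decode("utf-8"))
            pos = start + length + 2  # skip the argument and its trailing CRLF
        yield args


def replay_aof(data):
    """Play the role of the pseudo client: execute every command in order."""
    db = {}
    for name, *args in parse_resp_commands(data):
        name = name.upper()
        if name == "SET":
            db[args[0]] = args[1]
        elif name == "RPUSH":
            db.setdefault(args[0], []).extend(args[1:])
        elif name == "DEL":
            for key in args:
                db.pop(key, None)
        else:
            raise ValueError("command not supported by this sketch: " + name)
    return db


if __name__ == "__main__":
    aof = (
        b"*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n"
        b"*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv2\r\n"
    )
    print(replay_aof(aof))
```

Because every command is executed again in its original order, the time to load grows with the length of the log rather than with the size of the final data set.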

2.3 AOF rewrite

Redis first executes a write command and then writes it to the AOF buffer. As Redis runs for a long time, the AOF log grows longer and longer. If Redis goes down and restarts, replaying the whole AOF file becomes very time-consuming, and Redis cannot serve requests for a long time. The AOF log therefore needs to be slimmed down: the commands are rewritten so that the file does not become so bloated.

2.3.1 Implementation of AOF file rewrite

To solve the problem of AOF files becoming bloated, Redis can create a new AOF file to replace the existing AOF file in a process called AOF file rewriting.

However, AOF file rewriting does not require reading, analyzing, and writing of existing AOF files, but is implemented based on the current database state.

For example, the original AOF file might contain commands like the following, all 10 of which have to be saved.

The rewrite ignores the old AOF file and simply reads the current value of each key from the database. For example, if the list key currently has five elements, it can be written as:

rpush list a b c d e

For the five SET commands, we only need the current value of k1, which is v5: a single command setting k1 to v5 is written to the new AOF file, and the earlier, superseded commands are discarded.

This is how AOF rewriting slims the file down: the commands are regenerated from the current database state.
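
A minimal sketch of this idea follows: the new command list is generated purely from the current state, one command per key, without looking at the old log. A plain dict stands in for the database and `rewrite_from_state` is a hypothetical name; the sketch only covers strings and lists.

```python
def rewrite_from_state(db):
    """Return the shortest command list that rebuilds `db` (strings and lists only)."""
    commands = []
    for key, value in db.items():
        if isinstance(value, list):
            if value:  # an empty list key would not exist in Redis
                commands.append(["RPUSH", key, *value])
        else:
            commands.append(["SET", key, str(value)])
    return commands


if __name__ == "__main__":
    current_state = {"list": ["a", "b", "c", "d", "e"], "k1": "v5"}
    for command in rewrite_from_state(current_state):
        print(" ".join(command))
```

However many commands originally produced the state, the rewritten log needs only two here: one RPUSH for the list and one SET for k1.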

2.3.2 AOF background rewrite

Just as RDB uses BGSAVE to write data in the background, AOF rewriting also needs to run in the background.

When Redis runs the AOF rewrite routine, it does so in a child process, so the Redis main process can keep handling client command requests during the rewrite. This raises a data consistency problem, however: if data changes while the rewrite is running, the current database state no longer matches the state captured in the AOF file rewritten by the child process. What should be done?

To solve this problem, the Redis server sets up an AOF rewrite buffer, which comes into use once the server has created the child process. From then on, whenever the server executes a write command, it sends that command to both the AOF buffer and the AOF rewrite buffer.

Why keep two buffers? The AOF buffer records all write commands, whereas the AOF rewrite buffer holds only the write commands executed after the child process was created, so their contents differ.

After the child process completes the AOF file rewrite, it sends a signal to the parent process. Upon receiving the signal, the parent process calls a signal handler and performs the following:

  • 1. Write all contents in the AOF rewrite buffer into the new AOF file to ensure that the database state saved by the new AOF file is consistent with the current state of the server.
  • 2. Rename the new AOF file, overwrite the original AOF file, and complete the replacement
  • 3. Continue processing client request commands.
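
The interplay between the child's view of the data and the rewrite buffer can be simulated in a few lines. This is a toy model under stated assumptions: lists of commands stand in for files, `copy.deepcopy` stands in for the fork-time view of memory, and the class and function names are hypothetical.

```python
import copy


def apply_command(db, cmd):
    name, key, *rest = cmd
    if name == "SET":
        db[key] = rest[0]
    elif name == "RPUSH":
        db.setdefault(key, []).extend(rest)
    else:
        raise ValueError("command not supported by this sketch: " + name)


def dump_state(db):
    """Regenerate one command per key from a database state."""
    return [
        ["RPUSH", k, *v] if isinstance(v, list) else ["SET", k, v]
        for k, v in db.items()
    ]


class ToyServer:
    def __init__(self):
        self.db = {}
        self.aof = []              # stands in for the AOF file on disk
        self.rewrite_buf = None    # exists only while a rewrite is running
        self._child_view = None

    def execute(self, cmd):
        apply_command(self.db, cmd)
        self.aof.append(cmd)                  # normal AOF path
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)      # also remembered for the new file

    def start_rewrite(self):
        self._child_view = copy.deepcopy(self.db)   # what the child sees after fork
        self.rewrite_buf = []

    def finish_rewrite(self):
        new_aof = dump_state(self._child_view)      # written by the child
        new_aof.extend(self.rewrite_buf)            # step 1: append buffered writes
        self.aof = new_aof                          # step 2: replace the old file
        self.rewrite_buf = None
        self._child_view = None


def replay(commands):
    db = {}
    for cmd in commands:
        apply_command(db, cmd)
    return db


if __name__ == "__main__":
    server = ToyServer()
    server.execute(["SET", "k1", "v1"])
    server.execute(["SET", "k1", "v2"])
    server.start_rewrite()
    server.execute(["SET", "k1", "v3"])   # arrives while the child is rewriting
    server.finish_rewrite()
    print(server.aof, replay(server.aof) == server.db)
```

Without the rewrite buffer, the new file would restore k1 as v2 and silently lose the write that arrived during the rewrite.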

The premise of rewriting is that the current database state stays consistent with the contents of the AOF file. So after the child process is created, subsequent write commands are placed in the rewrite buffer. Some of them may be redundant, such as set k1 v1 followed by set k1 v2, but to guarantee consistency they are all appended to the end of the new AOF file once the rewrite finishes. The result is not as compact as it could be, but data consistency comes first.

That is AOF background rewriting, which is quite an interesting mechanism.

2.4 Advantages and Disadvantages

Advantages:

AOF simply appends to a log file, so it has little impact on server performance; appending is faster than generating an RDB snapshot and consumes less memory.

Disadvantages:

  • 1. The log file generated by AOF is large. Even after an AOF rewrite, the file is still large.
  • 2. Loading and recovering data after a restart is slower than with RDB.

III. How to Choose

When restarting Redis, we rarely use RDB to restore the memory state because too much data would be lost. We usually replay the AOF log instead, but that is much slower than loading an RDB file. So which should we choose?

The official website advises:

  • If you only use Redis as a cache, you do not need persistence.
  • If you can afford to lose a few minutes of data, RDB persistence alone is enough.
  • If both RDB and AOF are enabled, Redis loads the AOF file first when restoring. RDB is still worth keeping because it is better suited to backing up the database, restarts faster, and is not exposed to potential AOF bugs.

This does not solve the problem above. Since Redis 4.0, however, a new persistence option is available: hybrid persistence, which stores the RDB content together with an incremental AOF log. The AOF part is no longer a full log but only the increment, that is, the commands that occurred between the start and the end of the persistence run, which is usually small.

This model improves on the official recommendation. When Redis restarts, it first loads the RDB content and then replays the incremental AOF log, instead of replaying a full AOF as before, which greatly speeds up the restart.

The mixed persistence configuration is as follows:

# yes: enable, no: disable
aof-use-rdb-preamble yes
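
One way to picture the hybrid format: RDB data begins with the five bytes "REDIS", whereas a plain AOF begins with a "*" array marker, so a loader can tell from the first bytes whether an RDB preamble is present. The sketch below shows only that detection step; the function names and the returned labels are hypothetical, and the sample bytes are made up for illustration.

```python
RDB_MAGIC = b"REDIS"


def starts_with_rdb_preamble(aof_bytes):
    """Return True if the file content begins with an RDB preamble."""
    return aof_bytes[:len(RDB_MAGIC)] == RDB_MAGIC


def choose_loader(aof_bytes):
    """Pick a loading strategy based on the first bytes of the file."""
    if starts_with_rdb_preamble(aof_bytes):
        return "load RDB preamble, then replay AOF tail"
    return "replay plain AOF"


if __name__ == "__main__":
    print(choose_loader(b"REDIS0009" + b"\x00" * 8))
    print(choose_loader(b"*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n"))
```

Loading the binary preamble is fast, and only the short tail of commands written after the rewrite began has to be replayed one by one.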

The above is my understanding of AOF and RDB in Redis. This article took about 5 hours to research and write, all typed and drawn by hand. If you find any mistakes, please leave a comment so we can discuss them; likes and bookmarks are also welcome.
