Original blog address: pjmike’s blog

Preface

Persistence means writing the data that Redis holds in memory to disk. Redis is an in-memory database, so all of its data lives in memory. To avoid losing that data when the process exits, it has to be persisted to disk, so that the next time Redis restarts it can use the persistence files to recover the data.

There are generally two types of persistence:

  • Snapshot: Redis RDB
  • Logging: Redis AOF

For more details on Redis persistence, you can read the classic article Redis Persistence Demystified.

RDB

RDB persistence generates a snapshot of the current process's data and saves it to disk as a binary file. Saving data as snapshots is common in many other areas too, for example cloud server backups and MySQL backups.

RDB persistence is generally triggered in two ways:

  • Manual trigger
  • Automatic trigger

Manual trigger

There are two commands for manual triggering:

  • save: blocks the current Redis server until the RDB process completes
  • bgsave (the main way RDB persistence is triggered): the Redis process forks a child process, and the child process creates the RDB file

The save command

127.0.0.1:6379> save
OK

Executing the save command blocks the current Redis server and returns OK when it completes.

bgsave

The bgsave execution flow is as follows (from Redis Development and Operations):

  • When the bgsave command is executed, the parent Redis process checks whether a child process, such as an RDB or AOF child process, is already running. If one is, the bgsave command returns immediately
  • The parent process forks to create a child process. The parent process blocks during the fork operation. Run the info stats command and check the latest_fork_usec field to get the duration of the most recent fork operation in microseconds
  • After the fork completes, the bgsave command returns the message Background saving started. The parent process is no longer blocked and can continue to respond to other commands
  • The child process creates the RDB file: it generates a temporary snapshot file from the parent process's memory and then atomically replaces the original file
  • The child process signals completion to the parent process, which updates its statistics
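
The fork, temporary file, and atomic replace steps above can be sketched in Python. This is a toy model, not how Redis is implemented: it assumes a POSIX system where os.fork is available, it writes JSON instead of the RDB binary format, and background_save and snapshot_demo are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def background_save(data: dict, path: str) -> int:
    """Toy bgsave: fork, let the child write a temp file, then rename it."""
    pid = os.fork()
    if pid == 0:
        # Child process: it works on its own copy of the parent's memory
        # as it was at the moment of the fork.
        exit_code = 1
        try:
            tmp_path = f"{path}.tmp-{os.getpid()}"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            # Replace the old snapshot in one step (atomic on POSIX).
            os.replace(tmp_path, path)
            exit_code = 0
        finally:
            os._exit(exit_code)
    # Parent process: returns right away and keeps serving requests.
    return pid


def snapshot_demo() -> dict:
    data = {"k": "v1"}
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "dump.json")
        pid = background_save(data, path)
        data["k"] = "v2"  # a write that happens after the fork
        _, status = os.waitpid(pid, 0)
        if os.WEXITSTATUS(status) != 0:
            raise RuntimeError("child failed to write the snapshot")
        with open(path) as f:
            return json.load(f)
```

The snapshot contains the value from the moment of the fork, because the write made by the parent afterwards is not visible to the child.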

Automatic trigger

bgsave is triggered automatically when the data set is modified n times within m seconds. The default Redis automatic trigger configuration is as follows:

# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""

save 900 1
save 300 10
save 60 10000


For example, save 900 1 means that bgsave is executed if the Redis data has been modified at least once within 900 seconds.
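
The save point rule can be modelled as a small predicate. This is a simplified sketch rather than the Redis source: should_bgsave and its parameters are hypothetical names, and the exact boundary comparison that Redis uses is not taken from the original text.

```python
# (seconds, changes) pairs matching the default "save" lines above.
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(elapsed_seconds, changes_since_last_save,
                  save_points=DEFAULT_SAVE_POINTS):
    """Return True if any save point is satisfied.

    A save point (seconds, changes) is satisfied when at least `seconds`
    have passed since the last save and at least `changes` writes have
    happened in that time.
    """
    return any(
        elapsed_seconds >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

For instance, 12 changes after 400 seconds satisfy save 300 10, while 5 changes after 400 seconds satisfy none of the three save points.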

RDB configuration

In redis.conf there are a number of configurations related to RDB, as follows:

# The name of the snapshot file
dbfilename dump.rdb

# Directory for storing snapshots
dir ./

# Whether to compress the snapshot.
# yes: compress, at the cost of some CPU.
# no: do not compress, at the cost of more disk space.
rdbcompression yes

# Whether to checksum the RDB file, both when writing and when loading it.
# Disabling it improves write and startup load performance by about 10%,
# but RDB corruption can no longer be detected.
rdbchecksum yes

# Automatic trigger
# Create a snapshot after 900 seconds if at least 1 key has changed
save 900 1
# Create a snapshot after 300 seconds if at least 10 keys have changed
save 300 10
# Create a snapshot after 60 seconds if at least 10000 keys have changed
save 60 10000

If the database fails, the data in the RDB file is not up to date: everything written between the last RDB file generation and the moment Redis went down is lost. For example, if snapshots are created every 5 minutes or more, Redis may lose the last few minutes of data when it stops working (for instance during an unexpected power outage). RDB is therefore not suitable for real-time persistence.

Other mechanisms

  • Full replication: when a slave node performs a full replication, the master node automatically executes bgsave to generate an RDB file and sends it to the slave node
  • debug reload: reloading Redis with the debug reload command also triggers a save operation
  • shutdown: if AOF persistence is not enabled, bgsave is executed automatically when the shutdown command runs

AOF

AOF (append-only file) persistence: each write command is recorded in a separate log, and the commands in the AOF file are executed again when Redis restarts in order to recover the data, similar to the MySQL binlog.

Compared with RDB, AOF persists data much closer to real time.

Enabling AOF

Enable AOF persistence in the redis.conf configuration file:

# Enable AOF (default: no)
appendonly yes

# AOF file name (default: "appendonly.aof")
appendfilename "appendonly.aof"

# The save path is the same as for RDB and is configurable
dir ./

The AOF workflow:

  • Command write (append): all write commands are appended to the aof_buf buffer
  • File sync: the AOF buffer is synchronized to disk according to the configured policy
  • Rewrite: as the AOF file grows, it is rewritten periodically to compact it
  • Load: when Redis restarts, the AOF file is loaded to recover the data
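
The append and load steps can be illustrated with a toy log that is replayed to rebuild state. This sketch stores one JSON array per line instead of the real AOF format, supports only SET and DEL, and the names append_command, replay, and replay_demo are hypothetical.

```python
import json
import os
import tempfile


def append_command(path, *command):
    """Append one write command to the end of the log file."""
    with open(path, "a") as f:
        f.write(json.dumps(command) + "\n")


def replay(path):
    """Rebuild the data set by re-executing every logged command in order."""
    state = {}
    if not os.path.exists(path):
        return state
    with open(path) as f:
        for line in f:
            name, key, *args = json.loads(line)
            if name == "SET":
                state[key] = args[0]
            elif name == "DEL":
                state.pop(key, None)
            else:
                raise ValueError(f"unsupported command: {name}")
    return state


def replay_demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "toy.aof")
        append_command(path, "SET", "a", "1")
        append_command(path, "SET", "b", "2")
        append_command(path, "DEL", "a")
        return replay(path)
```

Replaying the three logged commands leaves only key b, which is the state the data set had before the simulated restart.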

Command write

Redis first appends the command to the aof_buf buffer. This buffer lives in user space, more specifically in Redis's own buffer memory, which includes the client buffers, the replication backlog buffer, and the AOF buffer. So at this step Redis is really appending the command to the AOF buffer.

Commands are written to the buffer first because writing every command straight to the AOF file on disk would make disk I/O the bottleneck for Redis under load.

AOF stores commands directly in the text protocol format defined by Redis (RESP, the Redis serialization protocol); see the related material for details. It is a plain-text format with good compatibility. With AOF enabled, every write command involves an append operation, so using the protocol format directly avoids the overhead of converting the command a second time.
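
To make the protocol format concrete, here is a minimal sketch of how a command can be encoded as a RESP array of bulk strings. The function name encode_command is hypothetical, and the sketch only handles string arguments.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings.

    Layout: "*<argument count>\r\n", then for every argument
    "$<length in bytes>\r\n<argument>\r\n".
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)
```

For example, encode_command("SET", "key", "value") produces the bytes *3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n, where \r\n stands for a carriage return followed by a line feed.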

File synchronization

File synchronization flushes the data in the user-space buffer (specifically, the AOF buffer) to the operating system kernel through system calls, and from there to disk.

Two important system calls are involved:

  • write: writes the data to the kernel buffer and returns immediately. The operating system writes the data to disk later, which improves I/O performance
  • fsync: forces the data to be synchronized to disk. The call blocks until the data has been written to disk

File synchronization has three policies, which can be configured in the redis.conf configuration file:

  • appendfsync always: fsync is called on every write to flush the data to disk. This gives the strongest persistence guarantee, but it is also the slowest option because fsync blocks. It is not recommended.
  • appendfsync everysec: each write goes to the kernel buffer through write and returns as soon as that completes. To strengthen persistence, a dedicated background thread calls fsync once per second to flush the data to disk, which limits how much data in the kernel buffer can be lost if the system fails. This is a good compromise between performance and persistence and is the recommended approach.
  • appendfsync no: data is written to the kernel buffer through the write system call, and the operating system decides when to write it from there to disk, usually within about 30 seconds. Performance is the best but the persistence guarantee is the weakest, so it is not recommended. If the system crashes, up to the last 30 seconds of data may be lost
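
The difference between the three policies comes down to when fsync is called after write. The sketch below is a simplified model under stated assumptions: ToyAofWriter and count_fsyncs are hypothetical names, timestamps are passed in explicitly to keep the example deterministic, and fsync runs inline here whereas the text above describes a background thread for everysec.

```python
import os
import tempfile


class ToyAofWriter:
    """Appends payloads with write and applies an appendfsync-like policy."""

    def __init__(self, path, policy):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.last_fsync = 0.0
        self.fsync_calls = 0

    def append(self, payload, now):
        # write hands the bytes to the kernel buffer and returns.
        os.write(self.fd, payload)
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            # fsync blocks until the kernel has pushed the data to disk.
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self):
        os.close(self.fd)


def count_fsyncs(policy, timestamps):
    """Append one byte per timestamp and report how often fsync ran."""
    with tempfile.TemporaryDirectory() as directory:
        writer = ToyAofWriter(os.path.join(directory, "toy.aof"), policy)
        try:
            for now in timestamps:
                writer.append(b"x", now)
        finally:
            writer.close()
        return writer.fsync_calls
```

With five writes spread over roughly 1.3 seconds, always calls fsync five times, everysec twice, and no never calls it, leaving the flush to the operating system.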

Rewriting mechanism

As commands are written to the AOF, the file keeps growing. To solve this, Redis introduced AOF rewriting to reduce the file size. A rewrite can, for example, merge several commands into one, so the AOF file becomes smaller and Redis can load it faster.
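
The idea of merging commands can be sketched by replaying a command history into its final state and then emitting one command per surviving key. This is an illustrative toy, not the Redis rewrite implementation: rewrite_commands is a hypothetical name, only SET, RPUSH, and DEL are handled, and type conflicts between strings and lists are not modelled.

```python
def rewrite_commands(commands):
    """Collapse a command history into the shortest equivalent list."""
    state = {}
    for name, key, *args in commands:
        if name == "SET":
            state[key] = args[0]
        elif name == "RPUSH":
            state.setdefault(key, []).extend(args)
        elif name == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {name}")

    rewritten = []
    for key, value in state.items():
        if isinstance(value, list):
            # Several RPUSH calls become a single RPUSH with all elements.
            rewritten.append(("RPUSH", key, *value))
        else:
            # Only the last SET for a key survives.
            rewritten.append(("SET", key, value))
    return rewritten
```

A history of six commands that overwrites key a, pushes twice to list l, and creates and deletes key b collapses to just two commands: one SET for a and one RPUSH for l.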

An AOF rewrite can be triggered manually or automatically:

  • Manual trigger: call the bgrewriteaof command directly
  • Automatic trigger: controlled by two parameters
    • auto-aof-rewrite-min-size: the minimum file size at which an AOF rewrite runs. The default is 64MB
    • auto-aof-rewrite-percentage: the AOF growth rate, that is, how much the current AOF file size has grown relative to the AOF file size after the last rewrite

In the redis.conf configuration file, the automatic trigger parameters are set as follows:

# Start rewriting AOF automatically when AOF grows 100% and reaches 64MB
auto-aof-rewrite-percentage 100  
auto-aof-rewrite-min-size 64mb
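
How the two parameters combine can be written as a small check. This is a simplified sketch under assumptions: should_rewrite_aof and its argument names are hypothetical, and the exact boundary comparisons used by Redis are not taken from the original text.

```python
MB = 1024 * 1024


def should_rewrite_aof(current_size, size_after_last_rewrite,
                       percentage=100, min_size=64 * MB):
    """Return True when both automatic rewrite conditions hold.

    The file must have reached `min_size`, and it must have grown by at
    least `percentage` percent relative to its size after the last rewrite.
    """
    if percentage <= 0 or current_size < min_size:
        return False
    # Guard against division by zero when no rewrite has happened yet.
    base = size_after_last_rewrite if size_after_last_rewrite > 0 else 1
    growth = (current_size * 100) // base - 100
    return growth >= percentage
```

With the settings above, a file that was 64MB after the last rewrite triggers a new rewrite once it reaches 128MB, while at 100MB it has grown by only 56% and does not.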

Reload

Redis loads its persistence files on restart. If both RDB and AOF are enabled, the AOF file is loaded in preference to the RDB file (the original flow diagram was taken from Redis Development and Operations).

References & acknowledgements

  • Redis Development and Operations
  • Redis Persistence Demystified
  • Learn more about Redis (2) : Persistence
  • Redis: Persistence mechanism