Foreword: Redis is an essential tool in day-to-day work, and when it comes to Redis persistence, RDB and AOF are the two mechanisms you need to know. Let's explore them

The meaning of persistence

When you use Redis, you rarely think about why persistence matters. In my opinion, the main purpose of Redis persistence is disaster recovery. Suppose Redis is deployed as a cache: without persistence, a catastrophic failure loses all the data. If the data is persisted to disk, and the persisted files are periodically synchronized and backed up to a cloud storage service, most of the data survives and at least part of it can still be recovered

Persistence

**RDB:** Periodically persists the data in Redis according to configured rules

**AOF:** Appends every write command to a log file in append-only mode. When Redis restarts, the entire data set can be reconstructed by replaying the write commands in the AOF log

RDB

RDB persistence generates a snapshot of the data in Redis memory and saves it to disk. It can be triggered manually or automatically

Manual trigger

SAVE: This command is synchronous. It blocks the current Redis process until the RDB file has been written, and the service is unavailable during that time, so using it is not recommended

192.168.42.129:6379> save
OK

**BGSAVE:** This command creates a child process through fork(), and the child process saves the snapshot in the background

192.168.42.129:6379> BGSAVE
Background saving started

Automatic trigger

In actual development, we usually do not generate RDB files by running commands. In Redis, automatically triggered RDB persistence is enabled by default. We can check this in the Redis configuration file

#redis.conf
save 900 1
save 300 10
save 60 10000

**save 60 10000:** If at least 10000 keys have changed within 60 seconds, a new dump.rdb file is generated, which is a complete snapshot of the current Redis memory
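The way several save rules combine can be sketched in a few lines of Python. This is only an illustration of the rule semantics, not Redis source code: the names SAVE_RULES and should_snapshot are hypothetical, and the check is assumed to be "any rule whose time window has elapsed and whose change count has been reached triggers a snapshot".

```python
# Each tuple is (seconds, changes), mirroring the "save" lines in redis.conf.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if any save rule is satisfied (hypothetical helper)."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


print(should_snapshot(61, 10000))   # True: 10000 changes in just over 60 seconds
print(should_snapshot(61, 5))       # False: too few changes for any rule so far
print(should_snapshot(301, 10))     # True: 10 changes in just over 300 seconds
print(should_snapshot(30, 50000))   # False: no time window has elapsed yet
```

The rules are independent of each other, so a busy instance is snapshotted by the 60-second rule while a nearly idle one is still covered by the 900-second rule.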

The working process

  1. Redis runs the bgsave command. Redis checks whether a child process, such as an RDB/AOF child process, already exists; if one does, the bgsave command returns immediately
  2. The parent process forks a child process. The Redis parent process blocks during the fork operation
  3. When the fork completes, the bgsave command returns and the parent process is no longer blocked
  4. The child process generates a temporary RDB snapshot file from the in-memory data
  5. The child process replaces the old snapshot file with the temporary file and tells the parent that it is done
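Steps 4 and 5 follow a common "write to a temporary file, then swap" pattern. The sketch below shows that pattern in Python under loud assumptions: it uses JSON instead of the real binary RDB format, it does not fork a child process, and save_snapshot and load_snapshot are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write data to a temporary file, then atomically replace the old snapshot."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())      # make sure the bytes reach the disk
        os.replace(tmp_path, path)      # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)         # never leave a half-written file behind
        raise


def load_snapshot(path):
    """Load the snapshot back into a dict."""
    with open(path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as workdir:
    dump = os.path.join(workdir, "dump.rdb")
    save_snapshot({"a": "1"}, dump)
    save_snapshot({"a": "2", "b": "3"}, dump)   # replaces the first snapshot
    print(load_snapshot(dump))                  # {'a': '2', 'b': '3'}
```

Because the old file is only replaced once the new one is complete, a crash in the middle of writing leaves the previous snapshot intact.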

Data recovery experiment based on RDB persistence mechanism

(1) Save some data in Redis, stop the Redis process immediately, and then restart Redis. You will find that the data is still there. Why?

How to trigger the RDB mechanism

Stopping Redis with redis-cli SHUTDOWN is a safe exit mode. When Redis exits this way, it immediately generates a complete snapshot of the data in memory and saves it

(2) Save several new pieces of data in Redis, then use kill -9 to forcibly kill Redis. This simulates an abnormal exit caused by a failure, and the in-memory data that has not been persisted is lost

Now configure a save rule so that an RDB file is generated automatically every 5 seconds whenever at least one key has changed. Insert some data, wait for the snapshot, then kill Redis and restart it. The inserted data is still there

Advantages and disadvantages of RDB

advantages

  1. RDB generates multiple data files, and each data file represents the data in Redis at a certain point in time. Having multiple data files like this is very suitable for cold backup

    Advantages of using RDB for cold backup

    1. With RDB, Redis itself controls generating snapshot files at fixed intervals. Doing the same with AOF requires writing scripts yourself, which is inconvenient
    2. When RDB files are used as the cold backup, data recovery in the worst case is faster than with AOF
  2. RDB has minimal impact on the read/write service that Redis provides and allows Redis to maintain high performance, because the main Redis process only needs to fork a child process, and the child performs the disk I/O for persistence

    With RDB, each write goes directly to Redis memory; the data is written to disk only at certain points in time

    AOF writes to a file on every write. Although AOF writes to the OS cache, which is fast, it still has a certain time cost, so it is definitely slower than RDB

  3. Restarting and restoring a Redis process directly from an RDB data file is much faster than doing so from AOF

    AOF stores a log of commands, and data recovery in fact means replaying and executing every logged command to rebuild all the data in memory

    An RDB file is a data file that can be loaded directly into memory during recovery

Combined with the above advantages, RDB is ideal for cold backup

disadvantages

  1. RDB is not as good as AOF if you want to lose as little data as possible when Redis fails. In general, RDB snapshot files are generated every 5 minutes or more, so you have to accept that if the Redis process goes down, the data from the last 5 minutes or more will be lost

    This is the biggest disadvantage of RDB: it is not suitable as the first-priority recovery scheme. If you rely on RDB as the first-priority recovery scheme, more data will be lost

  2. Every time RDB forks a child process to generate a snapshot file, if the data set is very large, service to clients may be suspended for milliseconds, or even seconds

    In general, do not make the RDB interval too long; otherwise the RDB file generated each time is too large and the performance of Redis itself may be affected

AOF

If the Redis process crashes, the data written since the last RDB snapshot is lost. AOF solves this problem by logging every write command to a file. AOF first writes the command to the OS cache and then flushes it to disk according to the configured policy **(appendfsync)**

AOF configuration

In Redis, AOF persistence is disabled by default. If you want to enable AOF persistence, you need to modify some settings in the Redis configuration file

appendonly yes                        # whether to enable AOF persistence
appendfilename "appendonly-6379.aof"  # AOF file name
appendfsync everysec                  # AOF fsync policy
dir /redis/data                       # directory where the persistence files are stored
no-appendfsync-on-rewrite yes         # do not fsync the AOF while a rewrite is in progress
auto-aof-rewrite-percentage 100       # growth percentage that triggers an automatic rewrite
auto-aof-rewrite-min-size 64mb        # minimum AOF size before an automatic rewrite

appendfsync

**always:** Every time a piece of data is written, the corresponding write log is immediately fsynced to disk. Very poor performance and very low throughput, but the least data loss

**everysec:** The data in the OS cache is fsynced to disk once per second. This is the recommended setting

**no:** Redis is only responsible for writing data to the OS cache; the OS then flushes the data to disk from time to time according to its own policy
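The difference between the three policies comes down to when fsync is called after a write has reached the OS cache. The following Python sketch makes that visible by counting fsync calls. It is a toy model under stated assumptions: AppendOnlyLog is a hypothetical class, commands are stored as plain text lines rather than in the real AOF format, and the everysec check runs inline on each append, whereas Redis performs it in a background thread.

```python
import os
import tempfile
import time


class AppendOnlyLog:
    """Toy append-only log that supports the three appendfsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_count = 0

    def append(self, command):
        self.file.write(command.encode() + b"\n")
        self.file.flush()                      # hand the data to the OS cache
        if self.policy == "always":
            self._fsync()                      # durable after every single write
        elif self.policy == "everysec":
            if time.monotonic() - self.last_fsync >= 1.0:
                self._fsync()                  # durable at most about 1 second late
        # policy "no": never fsync here, the OS decides when to flush

    def _fsync(self):
        os.fsync(self.file.fileno())
        self.last_fsync = time.monotonic()
        self.fsync_count += 1

    def close(self):
        self.file.close()


with tempfile.TemporaryDirectory() as workdir:
    for policy in ("always", "everysec", "no"):
        log = AppendOnlyLog(os.path.join(workdir, policy + ".aof"), policy)
        for i in range(100):
            log.append("SET key%d %d" % (i, i))
        print(policy, "fsync calls:", log.fsync_count)
        log.close()
```

With always, the number of fsync calls equals the number of writes, which is where the throughput cost comes from; with everysec and no, a burst of writes shares very few fsync calls or none at all.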

auto-aof-rewrite-percentage, auto-aof-rewrite-min-size

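For example, suppose the AOF file was 128MB after the last AOF rewrite

Redis then keeps appending to the AOF, starting from 128MB. With auto-aof-rewrite-percentage 100, once the file has grown by 100% and reaches 256MB, a rewrite becomes possible

Because 256MB > 64MB (auto-aof-rewrite-min-size), the rewrite is triggered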
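The two settings can be read as one combined condition. Here is a minimal Python sketch of that condition, assuming the check is "current size above the minimum size, and growth since the last rewrite at or above the percentage"; should_rewrite is a hypothetical helper, not a Redis API.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file right after the last rewrite, in bytes
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size if base_size > 0 else 1     # avoid dividing by zero
    growth = current_size * 100 // base - 100    # growth in percent
    return growth >= percentage


print(should_rewrite(200 * MB, 128 * MB))   # False: only 56% growth
print(should_rewrite(256 * MB, 128 * MB))   # True: 100% growth and above 64MB
print(should_rewrite(40 * MB, 10 * MB))     # False: still below the 64MB minimum
```

The minimum size keeps a small AOF from being rewritten over and over just because it doubled from a few kilobytes.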

The working process

The amount of data in Redis is bounded: the data in Redis memory cannot grow without limit. The AOF, however, records every write command, so left alone it would keep growing

Redis has a fixed amount of memory, and when that limit is reached Redis uses a cache eviction algorithm, such as LRU, to automatically evict some data from memory

When the AOF becomes large enough, Redis performs a rewrite: it builds a new AOF file based on the data currently in Redis memory and then deletes the old AOF file

  1. Redis forks a child process

  2. The child process builds the log from the data currently in memory and starts writing it to a new temporary AOF file

  3. While this is happening, the main Redis process keeps receiving new write operations from clients. It writes their logs to an in-memory buffer and also continues to append them to the old AOF file

  4. After the child process finishes writing the new log file, Redis appends the logs buffered in memory to the new AOF file

  5. Redis replaces the old log file with the new one
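The idea behind the rewrite can be shown with a toy command log. The sketch below is an assumption-laden illustration, not Redis code: commands are plain strings, only SET, INCR and DEL are supported, and replay and rewrite are hypothetical helper names. It rebuilds the data set by replaying the old log, emits the minimal log for that data set, and then appends the writes that arrived while the rewrite was running.

```python
def replay(commands):
    """Rebuild the in-memory data set by executing logged commands in order."""
    data = {}
    for command in commands:
        parts = command.split()
        name = parts[0].upper()
        if name == "SET":
            data[parts[1]] = parts[2]
        elif name == "INCR":
            data[parts[1]] = str(int(data.get(parts[1], "0")) + 1)
        elif name == "DEL":
            data.pop(parts[1], None)
        else:
            raise ValueError("unsupported command: " + command)
    return data


def rewrite(data):
    """Produce the minimal log that recreates the given data set."""
    return ["SET %s %s" % (key, value) for key, value in sorted(data.items())]


old_log = ["SET a 1", "SET b 2", "INCR a", "INCR a", "DEL b",
           "SET c hello", "SET c world"]
state = replay(old_log)
print(state)             # {'a': '3', 'c': 'world'}
print(rewrite(state))    # ['SET a 3', 'SET c world']

# Writes received during the rewrite are buffered, then appended to the new log.
rewrite_buffer = ["INCR a", "SET d 4"]
new_log = rewrite(state) + rewrite_buffer
print(replay(new_log) == replay(old_log + rewrite_buffer))   # True
```

Seven commands shrink to two, and replaying the new log plus the buffered writes yields the same data as replaying the full old log.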

AOF damaged file repair

If Redis goes down while data is being appended to the AOF file, the AOF file may be corrupted

Use the redis-check-aof --fix command to repair a corrupted AOF file
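To see why a half-written tail can be repaired, here is a rough Python sketch that finds the end of the last complete command in a buffer of RESP-encoded commands and cuts off everything after it. It is not the redis-check-aof implementation, and it assumes a plain command-only AOF with no other content; encode, parse_command and valid_prefix_length are hypothetical names.

```python
def encode(*args):
    """Encode one command as a RESP array of bulk strings."""
    out = b"*%d\r\n" % len(args)
    for arg in args:
        data = arg.encode()
        out += b"$%d\r\n%s\r\n" % (len(data), data)
    return out


def parse_command(buf, pos):
    """Return the offset just past the command at pos, or None if it is incomplete."""
    def read_line(p):
        end = buf.find(b"\r\n", p)
        if end == -1:
            return None, p
        return buf[p:end], end + 2

    line, pos = read_line(pos)
    if line is None or not line.startswith(b"*") or not line[1:].isdigit():
        return None
    for _ in range(int(line[1:])):
        line, pos = read_line(pos)
        if line is None or not line.startswith(b"$") or not line[1:].isdigit():
            return None
        length = int(line[1:])
        if buf[pos + length:pos + length + 2] != b"\r\n":
            return None                     # argument cut short or not terminated
        pos += length + 2
    return pos


def valid_prefix_length(buf):
    """Length of the longest prefix made only of complete commands."""
    pos = 0
    while pos < len(buf):
        end = parse_command(buf, pos)
        if end is None:
            break
        pos = end
    return pos


good = encode("SET", "a", "1") + encode("SET", "b", "2")
broken = good + encode("SET", "c", "3")[:-4]      # simulate a crash mid-append
repaired = broken[:valid_prefix_length(broken)]
print(repaired == good)                           # True
```

The repair keeps every complete command and drops the incomplete one at the end, which is why a fix can cost you the last piece of data.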

AOF and RDB work simultaneously

  1. If an RDB snapshot operation is in progress, Redis will not perform an AOF rewrite; if Redis is performing an AOF rewrite, it will not perform an RDB snapshot
  2. If the RDB is performing a snapshot and the user runs the BGREWRITEAOF command, AOF rewrite will be performed only after the RDB snapshot is generated
  3. If there are RDB snapshot files and AOF log files at the same time, when Redis restarts, AOF will be used for data recovery first because the logs are more complete

Persistent data recovery experiment based on AOF

  1. Start with both an RDB file (dump) and an AOF file (appendonly) present, with some data in the RDB file and some data in the AOF file

  2. Simulate a corrupted AOF file and then repair it with fix; the repair deletes one piece of data

  3. Restart Redis with the repaired AOF file, and you will find that only one piece of data is left

Data recovery depends entirely on what has been persisted to the underlying disk: if the data is in neither the RDB file nor the AOF file, it is gone

Advantages and disadvantages of AOF

advantages

  1. AOF protects better against data loss. Generally, AOF executes an fsync once per second through a background thread, so at most one second of data is lost
  2. AOF log files are written in append-only mode, so there is no disk seek overhead, write performance is very high, and the file is not prone to corruption. Even if the tail of the file is corrupted, it is easy to repair
  3. Even if the AOF log file grows too large, the background rewrite operation does not affect client reads and writes. During the rewrite, the commands are compacted into the minimal log needed to restore the data. While the new log file is being created, the old log file is still written as usual. When the new, merged log file is ready, it is swapped in for the old one
  4. The commands in AOF log files are recorded in a very readable form, which is ideal for emergency recovery after a catastrophic accidental deletion. For example, if someone accidentally clears all data with the FLUSHALL command, then as long as a background rewrite has not yet happened, you can copy the AOF file, delete the final FLUSHALL command from it, and put the file back to recover the data

disadvantages

  1. AOF log files are usually larger than RDB data snapshot files for the same data
  2. When AOF is enabled, the write QPS Redis can support is lower than with RDB alone, because AOF is typically configured to fsync the log file once per second. The performance is still very high, though
  3. The only big disadvantage is that data recovery is slow. AOF is also inconvenient for cold backups and regular backups: you may have to write complex scripts yourself, so it is not well suited to cold backup

How to choose between RDB and AOF

  1. Don't use RDB alone, because that will cause you to lose a lot of data
  2. Don't use AOF alone either, because there are two problems with that. First, recovering from a cold backup made with AOF is slower than recovering from one made with RDB. Second, RDB simply generates a full snapshot each time, which is more robust and avoids the bugs that a complex backup and recovery mechanism such as AOF can have
  3. Use the AOF and RDB persistence mechanisms together. Use AOF as the first choice for data recovery, to ensure that data is not lost. Use RDB for cold backups at various intervals, and for quick data recovery when the AOF files are lost or corrupted and unavailable