Preface

This article describes the two persistence methods of Redis in detail, including how they work, the persistence process, practical strategies, and some of the theory behind them. The previous article covered only RDB persistence, but Redis persistence is best understood as a whole, and covering one method alone is not systematic, so this article reorganizes the material.

Redis is an in-memory database: all data is stored in memory. Compared with traditional relational databases such as MySQL, Oracle, and SQL Server, which save data directly to the hard disk, Redis reads and writes very efficiently. However, keeping data in memory has a major drawback: on a power failure or crash, the contents of the in-memory database are completely lost. To compensate, Redis can persist in-memory data to files on disk and restore data from those files. This is the Redis persistence mechanism.

Redis supports two types of persistence: RDB snapshots and AOF.

RDB persistence

In official terms, RDB persistence takes a point-in-time snapshot of your data set at specified intervals. It saves a memory snapshot of all data objects in the Redis database at a certain moment into a compact binary file, which can be used for Redis data backup, transfer, and recovery. It is still the persistence mode enabled by default.

How RDB works

Since RDB is a point-in-time snapshot of data sets in Redis, let’s take a quick look at how data objects in Redis are stored and organized in memory.

By default, Redis has 16 databases, numbered 0 to 15. Each database is represented by a redisDb object, and redisDb uses a hash table to store key-value objects. To make this easier to understand, I took one of the DBs as an example and drew a schematic diagram of the Redis internal data storage structure.

[Figure: schematic of the Redis internal data storage structure, using one DB as an example]

A point-in-time snapshot is the state of every data object in every DB in Redis at a certain point in time. If we assume that all data objects stay unchanged at this point in time, we can read them one by one and write them to a file following the data structure relationships in the figure, which persists Redis. When Redis restarts, the contents of this file are read according to the same rules and written back into Redis memory, restoring the persisted state.

Of course, this only works if the assumption above holds; otherwise we would have nowhere to start with a data set that changes constantly. There are two ways to make it hold. First, client command processing in Redis is single-threaded, so if persistence is handled as a command, the data set is guaranteed to be static while it runs. Second, a child process created with the fork() function provided by the operating system gets the same memory data as the parent process, which amounts to a copy of the memory data. Once fork completes, the parent process carries on with its own work and leaves persisting the state to the child process.

Obviously, the first approach is undesirable: a persistence backup makes the Redis service unavailable for a short period, which is unacceptable for high-availability systems. The second approach is therefore the main practice for RDB persistence. Because the parent's data keeps changing after the fork and the child does not synchronize with the parent, RDB persistence cannot guarantee real-time durability. If a power failure or crash occurs after an RDB persistence completes, the data written since then is lost. The backup frequency determines how much data can be lost, but raising it means forking more often, which consumes more CPU and causes more disk I/O.
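To make the fork() idea concrete, here is a minimal sketch, not Redis code. The dataset variable and the value_seen_by_child function are hypothetical names for illustration, and a POSIX system is assumed. The parent changes its data after the fork, and the child still reports the value it had at the moment of the fork.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the Redis data set: a single integer. */
static int dataset = 1;

/*
 * Forks a child, changes `dataset` in the parent afterwards, and returns
 * the value the child still sees (reported through its exit status).
 * Returns -1 if pipe(), fork(), write() or waitpid() fails.
 */
int value_seen_by_child(void) {
    int ready[2];
    if (pipe(ready) == -1) return -1;

    dataset = 1;
    pid_t pid = fork();
    if (pid == -1) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wait until the parent has modified its own copy. */
        char token;
        close(ready[1]);
        if (read(ready[0], &token, 1) != 1) _exit(255);
        _exit(dataset); /* still the value from the moment of the fork */
    }

    /* Parent: keep "serving writes", then let the child report. */
    close(ready[0]);
    dataset = 2;
    char go = 'x';
    ssize_t sent = write(ready[1], &go, 1);
    close(ready[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || sent != 1 || !WIFEXITED(status))
        return -1;
    assert(dataset == 2); /* the parent's own copy did change */
    return WEXITSTATUS(status);
}
```

Calling value_seen_by_child() returns the child's view of the data, which stays at the pre-fork value even though the parent has already moved on. This is the property that lets the child write a consistent snapshot.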

Persistence process

RDB persistence in Redis is carried out by two functions, rdbSave and rdbSaveBackground (in the source file rdb.c). First, a brief look at the difference between them:

  • rdbSave: executed synchronously; the persistence process starts as soon as the function is called. Because Redis processes commands in a single thread, it blocks during persistence and cannot serve clients.
  • rdbSaveBackground: executed in the background (asynchronously). This function forks a child process; the actual persistence is performed in the child (by calling rdbSave) while the main process continues to serve clients.

RDB persistence is always triggered through one of these two functions, either manually or automatically. Manual triggering is easy to understand: we issue a persistence command to the Redis server through a Redis client, and the server then starts the persistence process. Automatic triggering means Redis itself starts persistence when preset conditions are met. The automatic trigger scenarios are as follows (excerpted from one of the reference articles):

  • The save m n rules in the configuration are matched in serverCron;
  • When a slave node performs full replication, the master node executes bgsave to generate an RDB file and sends it to the slave node;
  • The debug reload command is executed to reload Redis;
  • By default (AOF not enabled), an RDB save is performed automatically when shutdown is executed.

Combining the source code and the reference articles, I organized the RDB persistence process into a diagram to give an overall picture before going into the details.

[Figure: flowchart of the RDB persistence process]

From the figure we can see:

  • Automatically triggered RDB persistence runs in a child process through rdbSaveBackground;
  • Manual triggering comes from client commands, namely save and bgsave. The save command calls rdbSave in the Redis command processing thread and therefore blocks.

The automatic trigger flow is a complete path that covers rdbSaveBackground, rdbSave, and so on. Next, I take serverCron as an example to analyze the whole process.

Save rules and checks

serverCron is a periodic function in Redis that runs every 100 milliseconds by default. One of its jobs is to check the save rules in the configuration file against the current state and to start persistence if the conditions are met. Let's look at the implementation of this part.

redisServer has several fields related to RDB persistence.

struct redisServer {
    /* omit other fields */
    /* RDB persistence */
    long long dirty;                /* Changes to DB from the last save */
    struct saveparam *saveparams;   /* Save points array for RDB */
    int saveparamslen;              /* Number of saving points */
    time_t lastsave;                /* Unix time of last successful save */
    /* omit other fields */
};

/* Corresponds to the save parameter in redis.conf */
struct saveparam {
    time_t seconds;                 /* Statistical time range */
    int changes;                    /* Number of data changes */
};

saveparams corresponds to the save rules in redis.conf. The save parameter is the policy by which Redis triggers automatic backups. save m n means that a snapshot is triggered if at least n writes occur within m seconds. Several save rules can be configured to cover different conditions. To disable the automatic RDB backup policy, use save "". Example configuration:

# Save if at least 1 key changes within 900 seconds (15 minutes)
save 900 1
# Save if at least 10 keys change within 300 seconds (5 minutes)
save 300 10
# Save if at least 10,000 keys change within 60 seconds (1 minute)
save 60 10000
# This setting turns off RDB persistence
save ""

The code in serverCron that checks the RDB save rules is as follows:

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    int j;
    /* omit other logic */

    /* omit: start an AOF rewrite that was scheduled while an RDB save was running */

    /* Check if a background saving or AOF rewrite in progress terminated. */
    if (hasActiveChildProcess() || ldbPendingChildren())
    {
        run_with_period(1000) receiveChildInfo();
        checkChildrenDone();
    } else {
        /* There is no saving/rewrite child, so check each save rule */
        for (j = 0; j < server.saveparamslen; j++) {
            struct saveparam *sp = server.saveparams+j;

            /* A rule matches when the number of changes is reached, the time
             * window has elapsed, and either the retry delay has passed or
             * the last background save succeeded */
            if (server.dirty >= sp->changes &&
                server.unixtime-server.lastsave > sp->seconds &&
                (server.unixtime-server.lastbgsave_try >
                 CONFIG_BGSAVE_RETRY_DELAY ||
                 server.lastbgsave_status == C_OK))
            {
                serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                    sp->changes, (int)sp->seconds);
                rdbSaveInfo rsi, *rsiptr;
                rsiptr = rdbPopulateSaveInfo(&rsi);
                /* Execute the bgsave procedure */
                rdbSaveBackground(server.rdb_filename,rsiptr);
                break;
            }
        }

        /* omit: Trigger an AOF rewrite if needed. */
    }
    /* omit other logic */
}

If no RDB persistence or AOF rewrite child process is running in the background, serverCron decides whether to persist by checking whether dirty and lastsave satisfy any of the rules in the saveparams array. If a rule matches, rdbSaveBackground is called and the asynchronous persistence process starts.
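The rule check can be isolated as a small pure function, which makes it easy to try with concrete numbers. This is a simplified sketch under my own assumptions: struct save_rule and first_matching_rule are hypothetical names, and the retry-delay condition from the real code is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical mirror of the two fields of struct saveparam. */
struct save_rule {
    time_t seconds; /* length of the time window, in seconds */
    int changes;    /* minimum number of changes in that window */
};

/*
 * Returns the index of the first rule that is satisfied, or -1 if none is.
 * dirty:   number of changes since the last successful save
 * elapsed: seconds since the last successful save
 */
int first_matching_rule(const struct save_rule *rules, size_t n,
                        long long dirty, time_t elapsed) {
    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (dirty >= rules[i].changes && elapsed > rules[i].seconds)
            return (int)i;
    }
    return -1;
}
```

With the rules save 900 1, save 300 10, and save 60 10000, 12 changes after 301 seconds match the second rule, while 5 changes after 301 seconds match none: the first rule needs more than 900 seconds and the second needs at least 10 changes.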

rdbSaveBackground

rdbSaveBackground is a helper function for RDB persistence. Its main job is to fork a child process, after which execution follows one of two paths depending on which process is running (parent or child).

  • In the parent process, after forking the child, save the child process information and return directly.
  • In the child process, call rdbSave to perform the RDB persistence logic, then exit once persistence finishes.

int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
    pid_t childpid;

    if (hasActiveChildProcess()) return C_ERR;

    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL);

    // fork the child process
    if ((childpid = redisFork(CHILD_TYPE_RDB)) == 0) {
        int retval;

        /* Child: Change the process title */
        redisSetProcTitle("redis-rdb-bgsave");
        redisSetCpuAffinity(server.bgsave_cpulist);
        // Perform RDB persistence
        retval = rdbSave(filename,rsi);
        if (retval == C_OK) {
            sendChildCOWInfo(CHILD_TYPE_RDB, 1, "RDB");
        }
        /* Exit the child process after persisting */
        exitFromChild((retval == C_OK) ? 0 : 1);
    } else {
        /* Parent: record the child process information */
        if (childpid == -1) {
            server.lastbgsave_status = C_ERR;
            serverLog(LL_WARNING,"Can't save in background: fork: %s",
                strerror(errno));
            return C_ERR;
        }
        serverLog(LL_NOTICE,"Background saving started by pid %ld", (long) childpid);
        // Record the start time and type of the child process.
        server.rdb_save_time_start = time(NULL);
        server.rdb_child_type = RDB_CHILD_TYPE_DISK;
        return C_OK;
    }
    return C_OK; /* unreached */
}

rdbSave is the function that actually performs persistence. It involves a lot of I/O and computation, which takes time and CPU. Within the single-threaded model of Redis, running it directly would occupy the thread for the whole persistence process and leave Redis unable to serve other requests. To solve this, rdbSaveBackground forks a child process that completes the persistence work, so that the parent's resources are not tied up.

Note that if the parent process uses a lot of memory, the fork itself takes longer, and the parent cannot serve clients while it runs. You also need to consider the memory of the machine: after the fork, the child shares the parent's memory pages through copy-on-write, so in the worst case, when the parent rewrites most of its data during the save, memory usage can approach twice what it was before. The latest_fork_usec field in the output of the info stats command shows the time taken by the most recent fork operation.

rdbSave

rdbSave is the function that actually performs RDB persistence. The process has a great many details, but overall it can be summarized as: create and open a temporary file, write the Redis memory data to the temporary file, flush the temporary file to disk, rename the temporary file to the official RDB file, and update the persistence status information (dirty, lastsave). Of these, writing the Redis memory data to the temporary file is the core and the most complex step, and the write process directly reflects the RDB file format. In the spirit of a picture being worth a thousand words, I drew a figure following the source code.

[Figure: rdbSave write process and RDB file layout]

As an aside, the step that traverses the key-value pairs of the current database and writes them (bottom right of the figure) uses different on-disk formats depending on the Redis data type and the underlying data structure; I will not expand on that here. I think an intuitive understanding of the whole process is enough, because it helps us understand the inner workings of Redis.
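The "temporary file, flush, rename" sequence is a general pattern for replacing a file safely, and it can be sketched in a few lines. This is only an illustration of the pattern on a POSIX system, not the rdbSave implementation: save_atomically is a hypothetical name, the payload is an opaque byte buffer rather than the RDB format, and the temporary file is created in the current directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Writes `len` bytes to a temporary file, forces them to disk, then renames
 * the temporary file over `final_path`. Readers of `final_path` see either
 * the old file or the complete new one, never a half-written file.
 * Returns 0 on success, -1 on failure.
 */
int save_atomically(const char *final_path, const void *data, size_t len) {
    assert(final_path != NULL && (data != NULL || len == 0));

    char tmp_path[256];
    int n = snprintf(tmp_path, sizeof(tmp_path), "temp-%d.rdb", (int)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp_path)) return -1;

    FILE *fp = fopen(tmp_path, "wb");
    if (fp == NULL) return -1;

    /* Write to the stdio buffer, push it to the kernel, then to the disk. */
    if (fwrite(data, 1, len, fp) != len ||
        fflush(fp) == EOF ||
        fsync(fileno(fp)) == -1) {
        fclose(fp);
        unlink(tmp_path);
        return -1;
    }
    if (fclose(fp) == EOF) {
        unlink(tmp_path);
        return -1;
    }

    /* rename() replaces the target in one step on POSIX file systems. */
    if (rename(tmp_path, final_path) == -1) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Because the final name only ever points at a fully written file, a crash in the middle of a save leaves the previous RDB file untouched, which is why the rename comes last.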

AOF persistence

In the previous section, we learned that RDB is a point-in-time snapshot, suitable for data backup and disaster recovery. Because of this inherent limitation in how it works, it cannot guarantee real-time persistence, which is a serious problem for systems with zero tolerance for data loss. Hence AOF.

How AOF works

AOF, short for Append Only File, is the full persistence strategy of Redis, supported since version 1.1. The file stores the commands that modify Redis data (for example set, hset, del), appended in the order in which the Redis server processes them. When Redis restarts, it can read the commands in the AOF from the beginning and replay them, restoring the data state from before shutdown.

AOF persistence is disabled by default. To enable it, modify the following settings in redis.conf.

# no: off, yes: on; the default is no
appendonly yes
appendfilename appendonly.aof

At its core, AOF exists for persistence. What is persisted is the state of each key in Redis, and the purpose is to restore Redis to the state it had before a restart or failure. In contrast to RDB, AOF persists every command that can change the state of objects in Redis, in order of execution, so it is both ordered and selective. You can transfer the AOF file to any Redis server and replay the commands from start to finish to rebuild the same state. Here's an example:

set number 0, incr number, incr number, get number, incr number, incr number, get number, incr number

In this sequence, only the set and incr commands change the state of number, and their execution order is known; no matter how many times get is executed, the state of number is not affected. Therefore, it is enough to keep all the set and incr commands and persist them to the AOF file. Following the design of AOF, the contents of the file should look like this (simplified; the actual file uses the RESP protocol):

set number 0
incr number
incr number
incr number
incr number
incr number
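To show what replaying such a log means, here is a toy sketch that applies the simplified text log above to a single integer. The function replay_number_log is a hypothetical name, it understands only "set number <n>" and "incr number", and it is unrelated to how Redis actually parses RESP.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Replays a newline-separated log of "set number <n>" and "incr number"
 * commands against one integer and returns its final value.
 * Unrecognised lines (for example "get number") are ignored.
 */
long replay_number_log(const char *log) {
    assert(log != NULL);
    long number = 0;
    const char *line = log;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[64];

        if (len < sizeof(buf)) {
            long value;
            memcpy(buf, line, len);
            buf[len] = '\0';
            if (sscanf(buf, "set number %ld", &value) == 1)
                number = value;       /* set overwrites the state */
            else if (strcmp(buf, "incr number") == 0)
                number += 1;          /* incr depends on the previous state */
        }

        line += len;
        if (*line == '\n') line++;    /* skip the line terminator */
    }
    return number;
}
```

Replaying the six lines above ends with number equal to 5, the same state the server had before the restart. The order matters: the incr commands only make sense after the set that precedes them.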

The essence of AOF can be summed up in two words: command replay. However, given the complexity of real production environments and constraints such as the operating system, Redis has much more to consider than this example:

  • After the Redis server starts, commands keep being appended to the AOF file, so the file grows larger and larger. The larger the file, the longer recovery takes after a restart and the harder the file is to move; left unmanaged, it could fill the disk. Clearly the file needs to be slimmed down at the right time. In this example, the set and the five incr commands can obviously be replaced with a single set command, so there is plenty of room for compression.
  • As we all know, file I/O is a weak point of operating system performance. To improve efficiency, file systems have a sophisticated caching mechanism, and appending a command in Redis only writes the data into a buffer (aof_buf). When to write from the buffer to the physical file is a trade-off between performance and safety, and different settings make different choices.
  • Compressing the file means rewriting it. A rewrite can either merge the commands in the existing AOF file, or take a snapshot of the current data state in Redis and then append the commands that arrived while the snapshot was being stored.
  • The AOF file exists for data recovery, so Redis also needs a complete scheme to handle the format, integrity, and other properties of AOF files.

Persistence process

In terms of process, AOF can be summarized into several steps: command append, file write and sync, file rewrite, and load. The details of each step and the design philosophy behind each step are explained in turn.

Command append

When AOF persistence is enabled, after Redis executes a write command it appends the command, in protocol format (RESP), to the end of the AOF buffer maintained by the Redis server. The AOF file is only ever appended to by a single thread, with no seek or other complex operations, so the risk of file corruption is small even on power failure or a crash; at worst the last command is left incomplete. In addition, a text protocol has several benefits:

  • Text protocols have good compatibility;
  • The text protocol is the client's request command as received, so no secondary processing is needed, which saves overhead during storage and loading;
  • The text protocol is readable and easy to view and modify.

The AOF buffer is an SDS, a string data structure designed by Redis itself. Depending on the command type, Redis uses different functions (catAppendOnlyGenericCommand, catAppendOnlyExpireAtCommand, etc.) to format the command content, which is finally written to the buffer.
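To see what a command looks like in RESP before it lands in the buffer, here is a minimal encoder sketch. The function resp_encode is a hypothetical name and works on NUL-terminated C strings for simplicity, so unlike the real SDS-based code it is not binary safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Encodes argv[0..argc-1] as a RESP array of bulk strings into `out`:
 *   *<argc>\r\n  then for each argument  $<length>\r\n<bytes>\r\n
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the buffer is too small.
 */
int resp_encode(int argc, const char *const *argv, char *out, size_t cap) {
    assert(argc > 0 && argv != NULL && out != NULL);

    int n = snprintf(out, cap, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;

    for (int i = 0; i < argc; i++) {
        n = snprintf(out + used, cap - used, "$%zu\r\n%s\r\n",
                     strlen(argv[i]), argv[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Encoding the arguments set, number, and 0 produces *3\r\n$3\r\nset\r\n$6\r\nnumber\r\n$1\r\n0\r\n, which is the form in which the first line of the earlier example would appear in a real AOF file.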

Note that if an AOF rewrite is in progress when a command is appended, the command is also appended to the rewrite buffer (aof_rewrite_buffer).

File writing and synchronization

AOF file writing and synchronization depend on the operating system. Before we begin, we need some background on Linux I/O buffers. Disk I/O is slow, and file reads and writes are far slower than the CPU. If every file write waited for the data to reach the disk, overall system performance would suffer. To solve this problem, the operating system provides a delayed write mechanism to improve disk I/O performance.

Traditional UNIX implementations have a buffer cache or page cache in the kernel, and most disk I/O goes through it. When writing data to a file, the kernel usually copies the data into a buffer first. If the buffer is not yet full, it is not queued for output; the kernel waits until it is full, or until it needs to reuse the buffer for other disk block data, and only then places the buffer on the output queue. The actual I/O happens when the buffer reaches the head of the queue. This kind of output is called delayed write.

Delayed writes reduce the number of disk reads and writes, but they slow down updates to the file contents: data written to a file may not reach the disk for some time, and if the system fails in that window, the updates are lost. To keep the file system on disk consistent with the contents of the buffer cache, UNIX systems provide the sync, fsync, and fdatasync functions to force data to disk.

Redis calls flushAppendOnlyFile at the end of each event loop iteration. flushAppendOnlyFile writes the data in the aof_buf buffer to the kernel buffer, and the appendfsync setting determines when fsync() is called to write the data from the kernel buffer to disk. The setting has three options: always, no, and everysec.

  • always: fsync() is called on every write. This is the safest policy and the slowest.
  • no: fsync() is never called by Redis; the operating system decides when to flush. Best performance, worst safety.
  • everysec: fsync() is called only when the synchronization condition is met, roughly once per second. This is the officially recommended policy and the default. It balances performance and data safety; in theory, only about one second of data is lost if the system suddenly goes down.
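The three policies boil down to one decision after each write to the kernel buffer: call fsync() now or not. A minimal sketch of that decision follows; enum fsync_policy and should_fsync are hypothetical names, and the real flushAppendOnlyFile handles more cases, such as running the everysec fsync in a background thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical names for the three appendfsync settings. */
enum fsync_policy { FSYNC_NO, FSYNC_EVERYSEC, FSYNC_ALWAYS };

/*
 * Decides whether fsync() should follow a write of the AOF buffer.
 * now:        current Unix time in seconds
 * last_fsync: Unix time of the previous fsync
 */
bool should_fsync(enum fsync_policy policy, time_t now, time_t last_fsync) {
    switch (policy) {
    case FSYNC_ALWAYS:
        return true;                 /* every write is forced to disk */
    case FSYNC_EVERYSEC:
        return now > last_fsync;     /* at most once per second */
    case FSYNC_NO:
        return false;                /* leave flushing to the kernel */
    }
    assert(!"unknown fsync policy");
    return false;
}
```

With everysec, two writes within the same second lead to a single fsync(), which is where the "about one second of data" bound comes from.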

Note: the policies above are affected by the no-appendfsync-on-rewrite setting, which tells Redis whether to skip fsync() while an AOF rewrite is in progress. The default is no.

If appendfsync is set to always or everysec, a BGSAVE or BGREWRITEAOF running in the background can consume so much disk I/O that, under some Linux configurations, Redis calls to fsync() block for a long time. There is currently no fix for this, because even when fsync() is executed in a different thread, the synchronous write call is still blocked.

To mitigate this problem, use this option to prevent fsync() from being called in the main process while BGSAVE or BGREWRITEAOF is being done.

  • Setting it to yes means that while a child process is running BGSAVE or BGREWRITEAOF, the durability of AOF is the same as with appendfsync set to no. In the worst case (with default Linux settings), this can lose up to 30 seconds of data.
  • If your system has the latency problem described above, set this option to yes; otherwise keep it as no.

File rewrite

As mentioned earlier, when Redis runs for a long time and commands keep being written to the AOF, the file grows larger and larger, which may affect the stability of the host if left unchecked.

To solve the problem of AOF file size, Redis introduced AOF rewriting, which generates a new AOF file from the latest state of the data objects in Redis. The old and new files represent the same data state, but the new file is smaller. Rewriting reduces the disk space used by the AOF file and speeds up data recovery when Redis restarts. In the earlier example, the 6 commands in the old file are equivalent to 1 command in the new file, so the compression effect is obvious.

We said that an AOF file that is too big triggers a rewrite, but how big is too big? Under what circumstances is a rewrite triggered?

Like RDB persistence, AOF rewriting can be triggered manually or automatically. Manual triggering calls the bgrewriteaof command directly: if no child process is running, the rewrite starts immediately; otherwise it is scheduled to run after the child process ends. Automatic triggering is done by the periodic function serverCron when certain conditions are met. Two configuration items are involved:

  • auto-aof-rewrite-percentage: the percentage by which the current AOF file size (aof_current_size) must have grown relative to the AOF file size after the last rewrite (aof_base_size).
  • auto-aof-rewrite-min-size: the minimum AOF file size required before BGREWRITEAOF runs automatically. The default value is 64MB.

aof_base_size is initialized to the size of the AOF file when Redis starts, and is updated whenever an AOF rewrite completes while Redis is running. aof_current_size is the size of the AOF file at the time serverCron runs. A rewrite is triggered when both of the following conditions are met:

Percentage increase: (aof_current_size - aof_base_size) / aof_base_size * 100 >= auto-aof-rewrite-percentage
File size: aof_current_size > auto-aof-rewrite-min-size
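The two conditions can be tried with concrete numbers using a small pure function. This is a sketch that follows the integer arithmetic of the serverCron check; aof_rewrite_due is a hypothetical name, and all sizes are in bytes.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when an automatic AOF rewrite is due.
 * current_size: current AOF size (aof_current_size)
 * base_size:    AOF size after the last rewrite (aof_base_size)
 * min_size:     auto-aof-rewrite-min-size
 * percentage:   auto-aof-rewrite-percentage (0 disables automatic rewrite)
 */
bool aof_rewrite_due(long long current_size, long long base_size,
                     long long min_size, int percentage) {
    assert(current_size >= 0 && base_size >= 0);
    if (percentage == 0) return false;
    if (current_size <= min_size) return false;

    /* Guard against division by zero when the base file is empty. */
    long long base = base_size ? base_size : 1;
    long long growth = (current_size * 100 / base) - 100;
    return growth >= percentage;
}
```

For example, with a 64MB minimum size and a percentage of 100, a file that was 64MB after the last rewrite triggers a new rewrite at 130MB (growth of 103%) but not at 100MB (growth of 56%).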

The code that starts scheduled and automatic rewrites is as follows, also in the periodic function serverCron:

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    /* omit other logic */

    /* If a user requests an AOF rewrite while Redis is doing RDB persistence,
     * Redis schedules the rewrite to run after the RDB persistence completes.
     * aof_rewrite_scheduled being true means that request still has to run. */
    if (!hasActiveChildProcess() &&
        server.aof_rewrite_scheduled)
    {
        rewriteAppendOnlyFileBackground();
    }

    /* Check if a background saving or AOF rewrite in progress terminated. */
    if (hasActiveChildProcess() || ldbPendingChildren())
    {
        run_with_period(1000) receiveChildInfo();
        checkChildrenDone();
    } else {
        /* omit RDB persistence condition checking */

        /* AOF rewrite condition check: AOF enabled, no child process running,
         * growth percentage set, current file size exceeds the minimum size */
        if (server.aof_state == AOF_ON &&
            !hasActiveChildProcess() &&
            server.aof_rewrite_perc &&
            server.aof_current_size > server.aof_rewrite_min_size)
        {
            long long base = server.aof_rewrite_base_size ?
                server.aof_rewrite_base_size : 1;
            /* Calculate the percentage increase */
            long long growth = (server.aof_current_size*100/base) - 100;
            if (growth >= server.aof_rewrite_perc) {
                serverLog(LL_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
                rewriteAppendOnlyFileBackground();
            }
        }
    }
    /* omit other logic */
}

What is the process of AOF file rewriting? I heard that Redis supports mixed persistence, what is the impact on AOF file rewriting?

Since version 4.0, Redis supports a mixed persistence scheme in AOF mode, so there are two variants: pure AOF and RDB+AOF. This is controlled by the configuration parameter aof-use-rdb-preamble (use RDB as the first part of the AOF file), which was disabled (no) by default in 4.0 and has been enabled (yes) by default since 5.0. As a result, there are two ways the file can be written during an AOF rewrite, depending on the value of aof-use-rdb-preamble:

  • No: Writes commands in the AOF format, which is the same as that before 4.0.
  • Yes: Write the data state in RDB format first, and then write the contents of the AOF buffer during the rewrite in AOF format. The first half of the file is in RDB format and the second half is in AOF format.

Combining the source code (version 6.0; there is too much to paste here, see aof.c) and the reference materials, I drew a flowchart of AOF rewriting (BGREWRITEAOF).

[Figure: AOF rewrite (BGREWRITEAOF) flowchart]

Based on the figure, the AOF rewrite process can be summarized as follows:

  • rewriteAppendOnlyFileBackground starts and checks whether an AOF rewrite or RDB persistence child process is already running. If so, it exits; if not, it goes on to create the pipes used to transfer data between the parent and child processes, then calls fork(), after which the parent and child follow different paths.
  • The parent process:
    • Records the child process information (PID), timestamp, etc.;
    • Continues to respond to other client requests;
    • Collects the commands that arrive during the AOF rewrite and appends them to aof_rewrite_buffer;
    • Waits for the child and sends the contents of aof_rewrite_buffer to it;
  • The child process:
    • Changes the current process name, creates a temporary file for the rewrite, and calls rewriteAppendOnlyFile;
    • Depending on the aof-use-rdb-preamble setting, writes the first part in RDB or AOF format and syncs it to disk;
    • Receives the incremental AOF commands from the parent process, writes them as the second part in AOF format, and syncs them to disk;
    • Renames the AOF file and exits.

Data loading

When Redis starts, the loadDataFromDisk function loads the data. Note that although persistence can use AOF, RDB, or both at once, only one source can be chosen when loading; loading from both separately would produce a mess.

In theory, AOF persistence is more up to date than RDB, so when AOF persistence is enabled, Redis gives priority to the AOF when loading data. In addition, since AOF supports mixed persistence from Redis 4.0 on, loading an AOF file has to handle both formats. The data loading process of Redis is shown below:

[Figure: Redis data loading flowchart]

In AOF mode, a file generated with mixed persistence enabled consists of an RDB header followed by an AOF tail; with mixed persistence disabled, the whole file is in AOF format. To handle both formats, if Redis finds that the AOF file starts with an RDB header, it reads and restores the first part with the RDB loading method and then reads and restores the rest as AOF. Because data stored in AOF format consists of RESP protocol commands, Redis uses a pseudo-client to execute the commands and restore the data.
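Telling the two layouts apart comes down to looking at the first bytes of the file: an RDB payload starts with the five-byte signature REDIS, while a pure AOF file starts with a RESP array marker. Here is a minimal sketch of that check; has_rdb_preamble is a hypothetical helper name, not a Redis function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if the file at `path` starts with the 5-byte RDB signature
 * "REDIS", which is how a mixed-persistence AOF file begins.
 * Returns false for a pure AOF file, a short file, or a missing file.
 */
bool has_rdb_preamble(const char *path) {
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return false;

    char sig[5];
    bool is_rdb = fread(sig, 1, sizeof(sig), fp) == sizeof(sig) &&
                  memcmp(sig, "REDIS", sizeof(sig)) == 0;
    fclose(fp);
    return is_rdb;
}
```

A loader built on this check would hand the file to the RDB reader first when the function returns true and then continue with command replay for the remainder; otherwise it would replay the whole file as commands.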

If a crash occurs while AOF commands are being appended, the last RESP command in the AOF file may be incomplete (truncated) because of delayed writes. In this case, Redis behaves according to the aof-load-truncated setting, which tells Redis what to do when it starts reading the AOF file and finds it truncated:

  • yes: load as much data as possible and notify the user through the log;
  • no: abort with an error and refuse to start; the user must repair the file before restarting.

Conclusion

Redis offers two options for persistence. RDB takes point-in-time snapshots of the data set at specific intervals. AOF persists every write command received by the Redis server to a log and restores the data by replaying the commands when Redis restarts. The log uses the RESP format and is only ever appended to, so the risk of corruption is small. When the AOF file grows too large, Redis can rewrite and compact it automatically.

Of course, you can also disable persistence if you don’t need to persist data, but in most cases you do. In fact, RDB and AOF can be used together; what matters is understanding the differences between them so that we can use them properly.

RDB vs AOF

Advantages of RDB

  • An RDB file is a compact binary file that represents a snapshot of Redis data at a point in time, which makes it ideal for backup, full replication, and similar scenarios.
  • RDB is well suited to disaster recovery and data migration: RDB files can be moved and reloaded wherever they are needed.
  • RDB is a memory snapshot of Redis data, so data recovery is faster than replaying the commands in an AOF.

RDB shortcomings

  • RDB cannot provide real-time or second-level persistence. Persistence is done by a forked child process whose memory is a snapshot of the parent's data at the moment of the fork. After the fork, the parent keeps serving clients and its data keeps changing, while the child's data is no longer updated, so there is always a gap between the two and real-time persistence is impossible.
  • The fork in RDB persistence can, in the worst case, double the memory footprint, and the more data the parent holds, the longer the fork takes.
  • Under highly concurrent requests, save rules may be hit frequently, making the frequency of fork operations and persistence backups hard to control.
  • RDB files have a specific file format, and different Redis versions adjust that format, so an old version may not be able to read files written by a new version.

AOF advantages

  • AOF persistence offers better durability, and we can choose among three fsync policies (appendfsync): no, everysec, and always. everysec, the default policy, still performs well and in extreme cases may lose about one second of data.
  • The AOF file is only appended to, with no complex seek operations, so the risk of corruption is small. Even if the last write is truncated, it is easy to repair with the redis-check-aof tool;
  • When the AOF file becomes large, Redis can rewrite it automatically in the background. The old file continues to be written during the rewrite, the new file is smaller after the rewrite, and the commands that arrived during the rewrite are appended to the new file.
  • The AOF file contains all the commands that modified the data in Redis, in a format that is easy to understand and parse. Even if we accidentally wipe all data (for example with FLUSHALL), as long as the AOF file has not been rewritten, we can recover the data by removing the last command and restarting.
  • AOF supports mixed persistence, so file size can be kept under control and data loading is more efficient.

AOF shortcomings

  • AOF files are usually larger than RDB files for the same data set;
  • Under certain fsync policies, AOF is slightly slower than RDB. In general, performance with fsync every second is still high, and with fsync disabled it is comparable to RDB. However, RDB can give better guarantees on maximum latency under heavy write load.
  • Redis has encountered some rare bugs with AOF that are almost impossible with RDB. Some special commands (such as BRPOPLPUSH) caused the reloaded data to differ from the data before persistence. The Redis developers tested under the same conditions but could not reproduce the problem.

Use advice

Now that we understand the working principles, execution flow, and trade-offs of RDB and AOF, we can consider how to weigh them in practice and use the two persistence modes sensibly. If you use Redis only as a cache and all data can be rebuilt from the backing database, you can turn persistence off and concentrate on cache warm-up and on protection against cache penetration, breakdown, and avalanche.

In general, Redis takes on more than caching, such as distributed locks, leaderboards, and registries, and persistence then plays a bigger role in disaster recovery and data migration. A few principles are recommended:

  • Do not use Redis as the primary database; where possible, all data should be rebuildable automatically by the application.
  • With Redis 4.0 or higher, use AOF+RDB mixed persistence.
  • Plan the maximum memory used by Redis sensibly to prevent resource shortages during an AOF rewrite or save.
  • Avoid deploying multiple instances on a single machine.
  • In most production environments, persistence can be enabled on the slave so that the master can concentrate on serving writes.
  • Backup files should be uploaded automatically to a remote data center or cloud storage for disaster recovery.

About fork()

From the analysis above, we know that both RDB snapshots and AOF rewrites require a fork, a heavyweight operation that blocks Redis. To avoid affecting the responsiveness of the Redis main process, we need to keep this blocking as short as possible.

  • Reduce the frequency of forks, for example by triggering RDB snapshots and AOF rewrites manually;
  • Control the maximum memory used by Redis to prevent long fork time;
  • Use higher performance hardware;
  • Configure Linux memory allocation policies to avoid fork failures due to insufficient physical memory.

References

  • Redis persistence principle juejin.cn/post/684490…
  • Redis persistence: what are RDB and AOF? What are their advantages and disadvantages? What is the operation process? www.chenxm.cc/article/38….
  • Redis interview questions (including answers) blog.csdn.net/Butterfly_r…
  • Redis: Sentry + replication + transaction + cluster + persistence blog.csdn.net/qq_41699100…
  • Redis Persistence mechanism juejin.cn/post/684490…
  • Redis Persistence redis.io/topics/pers…
  • Redis RDB Persistence Details programming.vip/docs/redis-…