It’s okay to stand out, and it’s okay to fit in; what matters is figuring out what kind of life you want and what price you’re willing to pay for it.

We usually use Redis as a cache to improve read performance. But if Redis goes down, all the data in memory is lost, and if heavy traffic is then sent straight to MySQL, it may cause even more serious problems.

In addition, slowly reloading data from the database back into Redis is far slower than reading from Redis itself, which also leads to slow responses.

To achieve fast recovery from downtime, Redis designed two killer features: the AOF (Append Only File) log and the RDB snapshot.

When learning a technology, we often pick up only scattered technical points without building a complete knowledge framework and architecture in our minds, and without a systematic view. That makes things hard: it feels like you understand, then you forget, and you end up completely confused.

Follow “Code Brother Byte” to thoroughly understand Redis, deeply master its core principles and practical skills, build a complete knowledge framework, and learn to organize the whole knowledge system.

This article is hardcore. I suggest you bookmark and like it, then read it carefully; I believe you will gain a lot.

The previous article, Redis Core: The Only Unbreakable Secret, analyzed Redis’s core data structures, I/O model, threading model, and how it picks the appropriate encoding for different data, to understand deeply why Redis is really fast.

This article will focus on the following points:

  • How can I quickly recover from an outage?
  • How does Redis avoid data loss when it goes down?
  • What is an RDB memory snapshot?
  • AOF logging implementation mechanism
  • What is copy-on-write technology?
  • …

The knowledge points involved are shown in the figure:

Redis panorama

The panorama can be expanded around two dimensions, which are:

Application dimensions: cache usage, clustering usage, clever use of data structures

System dimension: can be summarized as the “three highs”

  1. High performance: thread model, network IO model, data structure, persistence mechanism;
  2. High availability: master/slave replication, Sentinel clusters, Cluster sharded clusters;
  3. High scalability: load balancing

The Redis series unfolds around the mind map above; this installment explores one secret of Redis’s high performance: its persistence mechanism.

With the panorama in hand, you can master the system view.

In fact, the system view is very important. To some extent, when solving problems, having a system view means that you can locate and solve problems in a systematic way.

RDB memory snapshot for quick recovery from downtime

65 Brother: Redis went down for some reason, so all the traffic hit the backend MySQL. I restarted Redis immediately, but its data is stored in memory, so after the restart there was still no data. How do we prevent data loss across a restart?

Don’t worry. “Code Brother Byte” will take you step by step through how Redis recovers quickly after downtime.

Redis data is stored in memory. Could we write the in-memory data to disk, so that when Redis restarts, the data stored on disk is quickly restored into memory and the service can resume after the restart?

65 Brother: I came up with a plan: write to disk every time we write to memory.

This scheme has a fatal problem: each write instruction not only writes to memory but also to disk, and disk performance is too slow relative to memory, causing Redis performance to degrade significantly.

Memory snapshot

65 Brother: How do we avoid this write-twice problem?

We usually use Redis as a cache, so even if Redis does not persist all data, anything missing can be read back from the database. Therefore Redis does not persist every write; instead, its persistence uses the “RDB data snapshot” approach to achieve fast recovery from downtime.

65 Brother: What is an RDB memory snapshot?

As Redis executes write commands, the data in memory keeps changing. A memory snapshot is the state of the data in Redis memory at a particular moment.

It is like taking a photo: time is frozen at one instant, and the photo completely records the scene of that moment.

Redis is similar in that it takes a snapshot of a moment’s worth of data as a file and writes it to disk. This snapshot file is called an RDB file, which stands for Redis DataBase.

Redis takes RDB memory snapshots periodically, so the disk is written only when a snapshot runs, not on every write command. This keeps Redis fast while still achieving persistence and fast recovery from downtime.
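As a sketch, snapshot frequency is controlled in redis.conf by `save <seconds> <changes>` rules; the classic defaults look like this:

```conf
# Snapshot if at least 1 key changed within 900 s,
# 10 keys within 300 s, or 10000 keys within 60 s
save 900 1
save 300 10
save 60 10000
```

A snapshot is triggered as soon as any one of the rules is satisfied.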

During data restoration, RDB files are directly read into the memory to complete data restoration.

65 Brother: Which data do we snapshot, and how often? These choices affect the efficiency of snapshot execution.

Good question: you are starting to think about efficiency. As we learned in Redis Core: The Only Unbreakable Secret, Redis’s single-threaded model means we should avoid operations that block the main thread as much as possible, including blocking caused by generating the RDB file.

RDB generation strategies

Redis provides two instructions for generating RDB files:

  • save: executed by the main thread; blocking;
  • bgsave: calls glibc’s fork to create a child process that writes the RDB file. Snapshot persistence is handled entirely by the child process while the parent continues serving client requests. This is the default way RDB files are generated.

65 Brother: Can the in-memory data be modified while a snapshot is being taken? That is, can write commands still be processed normally?

First, let’s be clear: avoiding blocking is not the same as being able to handle writes during RDB file generation. Even if the main thread is not blocked, a naive approach could only process reads during the snapshot, because the data being snapshotted must not change, in order to guarantee the snapshot’s consistency.

Obviously, suspending write operations while generating the RDB file is not something Redis can accept.

How does Redis process write requests while generating RDB files?

Redis uses the operating system’s multi-process copy-on-write (COW) technique to implement snapshot persistence, an interesting and little-known mechanism. Multi-process COW is also a good indicator of the breadth of a programmer’s knowledge.

Redis calls glibc’s fork during persistence to generate a child process. Snapshot persistence is left entirely to the child process, and the parent process continues to process client requests.

When the child process is created, it shares the code and data segments in memory with the parent process. You can think of the parent and child as conjoined twins sharing one body.

This is a Linux mechanism: to save memory, the operating system shares as much as possible. At the instant the processes separate, memory usage barely grows.

The bgsave child process can thus share all of the main thread’s in-memory data, reading it and writing it to the RDB file.

When you run the SAVE command or BGSAVE command to create a new RDB file, the program checks the database keys, and expired keys are not saved to the newly created RDB file.

When the main thread executes a write command that modifies some data, that data is copied first. The bgsave child reads the copy and writes it to the RDB file, while the main thread modifies the original directly.

This ensures snapshot integrity and allows the main thread to modify data at the same time, avoiding impact on normal services.
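The fork-then-snapshot behavior can be sketched in a few lines of Python (a toy illustration, not Redis’s actual code; the dict and pipe are stand-ins): after fork, the child serializes the data exactly as it was at fork time, while the parent keeps modifying its own copy.

```python
import json
import os

data = {"key": "MageByte"}  # pretend this is the Redis dataset

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child ("bgsave"): serialize the snapshot it inherited at fork time
    os.close(r)
    os.write(w, json.dumps(data).encode())
    os._exit(0)

# Parent (main thread): keeps serving writes while the child snapshots
os.close(w)
data["key"] = "modified-after-fork"
snapshot = json.loads(os.read(r, 4096).decode())
os.waitpid(pid, 0)
print(snapshot["key"])  # the child still sees the pre-fork value
```

The parent’s modification never reaches the child, which is exactly why the RDB file stays a consistent point-in-time snapshot.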

Redis uses bgsave to take a snapshot of all data currently in memory. The work is done in the background by the child process, which allows the main thread to keep modifying data at the same time.

65 Brother: Can we generate an RDB file every second, so that at most one second of data is lost in an outage?

Performing full data snapshots too frequently has two serious performance costs:

  1. RDB files are generated and written to disk so frequently that disk pressure becomes too high: the last RDB run has not finished before the next one starts, in an endless loop.
  2. Forking the bgsave child blocks the main thread, and the larger the main thread’s memory, the longer the blocking lasts.

Advantages and disadvantages

Snapshot recovery is fast, but the RDB generation frequency is hard to tune. If it is too low, more data is lost on downtime; if it is too high, it incurs extra overhead.

RDB uses binary and data compression to write data to disks, resulting in small file size and fast data recovery.

Beyond full RDB snapshots, Redis also designed the AOF post-write log. Let’s discuss what the AOF log is.

AOF post-write logs to avoid data loss during downtime

The AOF log stores the sequence of write commands received by the Redis server; it records only the commands that modify memory.

If the AOF log records every modifying command since the Redis instance was created, then executing all of those commands in order on an empty Redis instance, a process known as “replay,” restores the in-memory state of the current instance.
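Replay can be illustrated with a toy command log applied to an empty dict (hypothetical commands and state model, not real Redis internals): executing the log in order rebuilds the final state from nothing.

```python
# A miniature "AOF": modifying commands recorded in execution order
log = [
    ("SET", "title", "Redis"),
    ("SET", "author", "MageByte"),
    ("DEL", "title"),
]

db = {}  # an "empty Redis instance"
for cmd, *args in log:
    if cmd == "SET":
        db[args[0]] = args[1]
    elif cmd == "DEL":
        db.pop(args[0], None)

print(db)  # state rebuilt purely from the log
```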

Write-ahead vs. post-write logs

Write-Ahead Log (WAL): the change is written to a log file before the data itself is actually modified, to guarantee fault recovery.

For example, the redo log in MySQL’s InnoDB storage engine records changes before the actual data is modified.

Post-write log: the write command executes first, writing the data to memory, and the log is recorded afterwards.

Log format

When Redis receives the command “SET key MageByte,” it writes the data to memory and records the command in the AOF file in the following format:

  • *3: the command consists of three parts. Each part starts with “$” plus a number, followed by the part’s content: the command, the key, or the value.
  • The number gives the size in bytes of that part. For example, “$3” means the part is three bytes long: the “SET” command.
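This format is the RESP serialization of the command. A small Python sketch (the helper name is our own) of how “SET key MageByte” would be encoded:

```python
def encode_resp(*parts: str) -> str:
    """Encode a command as RESP: *<argc>, then $<byte-len> + payload per part."""
    out = [f"*{len(parts)}\r\n"]
    for p in parts:
        out.append(f"${len(p)}\r\n{p}\r\n")
    return "".join(out)

encoded = encode_resp("SET", "key", "MageByte")
print(repr(encoded))  # '*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$8\r\nMageByte\r\n'
```

Note “$8” before “MageByte”: the value is eight bytes long.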

65 Brother: Why does Redis log after the write?

Post-write logging avoids extra checking overhead: a command that has already executed needs no syntax check. With a pre-write log, the syntax would have to be validated first; otherwise an erroneous command would be recorded and cause errors during log-based recovery.

In addition, the log is logged after the write and does not block the execution of the current “write” command.

65 Brother: So with AOF, is data safety guaranteed?

Silly boy, it’s not that simple. If Redis goes down after executing a command but before logging it, the data written by that command may be lost.

Also, AOF avoids blocking on the current command, but risks blocking on the next command. AOF logs are executed by the main thread. During the process of writing logs to disks, if the disk pressure is high, disk write is slow and subsequent write commands block.

Note that these two problems are related to disk write back. If you can properly control the time when AOF logs are written back to disk after the “write” command is executed, the problem is solved.

Write-back strategies

To improve file-writing efficiency, when a user calls the write function, the operating system usually stores the data temporarily in a memory buffer; only when the buffer fills up, or a time limit is exceeded, is the data actually written to disk.

While this improves efficiency, it also poses a security problem for writing data, because if the computer goes down, the written data stored in the memory buffer will be lost.

For this purpose, the system provides two synchronization functions, fsync and fdatasync, which can force the operating system to immediately write the data in the buffer to the hard disk to ensure data writing security.
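The buffer-then-flush pipeline can be seen from Python (a sketch; the file path is arbitrary): `write` fills a user-space buffer, `flush` hands the bytes to the OS page cache, and `os.fsync` forces them onto the physical disk, which is the primitive a synchronous write-back policy relies on.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")

with open(path, "a") as f:
    f.write("*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$8\r\nMageByte\r\n")
    f.flush()             # user-space buffer -> OS page cache
    os.fsync(f.fileno())  # page cache -> disk; survives a power loss

print(os.path.getsize(path) > 0)
```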

Redis’s appendfsync configuration item for AOF write-back directly determines the efficiency and safety of AOF persistence:

  • always: synchronous write-back. After each write command executes, the contents of the aof_buf buffer are immediately written to the AOF file on disk.
  • everysec: write back every second. After a write command executes, the log goes only into the AOF file’s memory buffer, and the buffer’s contents are synced to disk once per second.
  • no: write-back is controlled by the operating system. After a write command executes, the log goes into the AOF file’s memory buffer, and the OS decides when to write it to disk.
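These policies map directly to redis.conf settings; a minimal AOF configuration might look like:

```conf
appendonly yes        # enable the AOF log
appendfsync everysec  # one of: always | everysec | no
```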

There is no one-size-fits-all strategy, and we need to make a trade-off between performance and reliability.

always: synchronous write-back loses no data, but every write command must hit the disk, giving the worst performance.

everysec: writing back once per second avoids the performance cost of synchronous write-back. In an outage, up to one second of data not yet flushed to disk may be lost; a trade-off between performance and reliability.

no: under operating-system control, a write command can proceed as soon as the log reaches the AOF file buffer. Performance is best, but much more data may be lost.

Brother 65: So how do I choose a strategy?

You can select a write back policy based on the system’s requirements for high performance and reliability. To summarize: For high performance, choose the No strategy; To ensure high reliability, choose the Always policy. If you want to allow a bit of data loss, but you don’t want performance to suffer too much, choose the Everysec policy.

Advantages and disadvantages

Advantages: Logs are recorded only after the command is successfully executed, avoiding the overhead of checking the syntax of instructions. Also, it does not block the current “write” instruction.

Disadvantages: AOF records the content of every command (see the log format above). During failure recovery every command must be re-executed, so if the log file is very large, recovery can be very slow.

In addition, file systems limit file sizes, so arbitrarily large files cannot be kept, and as the file grows, append efficiency decreases.

Log too large: AOF rewriting mechanism

65 brother: What if the AOF log file is too large?

The AOF post-write log records every “write” operation. It avoids the performance cost of RDB full snapshots, but replaying it is slower than loading an RDB, and an oversized log file itself causes performance problems. For Redis, whose whole identity is speed, problems caused by the log are absolutely intolerable.

Therefore, Redis designed a killer feature, the “AOF rewrite mechanism,” and provides the bgrewriteaof command to slim down AOF logs.

The principle: fork a child process that traverses memory and converts it into a series of Redis commands, serialized into a new AOF log file. After serialization, the incremental AOF entries produced during the rewrite are appended to the new file, which then immediately replaces the old AOF file. The slimming job is done.

Why does AOF rewriting shrink the log file?

The rewrite mechanism has a “many-to-one” effect: multiple commands in the old log for the same key can be condensed into a single command after rewriting.

As follows:

Three LPUSH commands are rewritten by AOF rewrite into one, and the reduction is even more pronounced for keys that have been modified many times.
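The many-to-one effect can be sketched in Python (a toy model, not Redis’s rewrite code): replay the old log to get the final list, then emit a single command that rebuilds it.

```python
# Old AOF: three separate LPUSH commands for the same key
old_log = [("LPUSH", "list", "a"), ("LPUSH", "list", "b"), ("LPUSH", "list", "c")]

# Replay to obtain the final in-memory state
state = []
for _cmd, _key, val in old_log:
    state.insert(0, val)  # LPUSH prepends to the head

# Rewritten AOF: one RPUSH that reproduces the same list left-to-right
new_log = [("RPUSH", "list", *state)]
print(new_log)  # [('RPUSH', 'list', 'c', 'b', 'a')]
```

Three log entries become one, yet replaying the new log yields the identical list.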

65 Brother: After the rewrite, the AOF log becomes smaller, and what finally lands on disk is the operation log of the database’s latest data. Does rewriting block the main thread?

The AOF log itself is written back by the main thread, but the AOF rewrite is done by the background bgrewriteaof child process, precisely to avoid blocking the main thread.

The rewrite process

Unlike the AOF log written back by the main thread, the rewrite is done by the background child process bgrewriteaof, to avoid blocking the main thread and degrading database performance.

In short, a rewrite involves “one copy, two logs”: one copy of the memory data, plus the old AOF log and the new AOF rewrite log.

Redis records “write” operations received during the rewrite in both the old AOF buffer and the AOF rewrite buffer, so the rewrite log also captures the latest operations. Once all operations from the copied data have been rewritten, the latest operations recorded in the rewrite buffer are also written to the new AOF file.

For each AOF rewrite, Redis first makes a memory copy (via fork), which is used to iterate over the data and generate the rewritten records. Using two logs ensures that newly written data is not lost and data consistency is maintained throughout the rewrite.

The AOF rewrite has its own rewrite log. Why doesn’t it just share the existing AOF log?

This is a good question for two reasons:

  1. One reason is that parent and child processes writing the same file will inevitably cause contention issues, and controlling contention means that the performance of the parent process will be affected.
  2. If the AOF rewrite fails, the original AOF file is contaminated and cannot be restored. So Redis AOF overwrites a new file. If the overwrite fails, delete the file directly, without affecting the original AOF file. When the rewrite is complete, simply replace the old file.

Redis 4.0 hybrid logging model

When rebooting Redis, we rarely use RDB to restore memory state because a lot of data is lost. We usually use AOF log replay, but replay AOF log performance is much slower than RDB, so it can take a long time to start up with a large Redis instance.

Redis 4.0 addresses this problem with a new persistence option: hybrid persistence. The AOF file stores the contents of an RDB snapshot together with incremental AOF logs. Here the AOF log is no longer the full log, but only the increment produced between the start and end of persistence, which is usually small.

Therefore, when Redis restarts, the contents of RDB can be loaded first, and then the incremental AOF log can completely replace the previous full AOF file replay, so the restart efficiency is greatly improved.
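Hybrid persistence is toggled in redis.conf (and is enabled by default in newer Redis versions):

```conf
aof-use-rdb-preamble yes   # new AOF files begin with an RDB preamble
```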

So RDB memory snapshots are performed at a slightly slower frequency, using AOF logging to record all “writes” that occur between the two RDB snapshots.

In this way, snapshots do not need to be executed frequently. In addition, AOF only needs to record the “write” instructions that occur between two snapshots, and does not need to record all operations, avoiding the problem of large files.

Conclusion

Redis designed bgsave and copy-on-write to minimize the impact on read and write commands during snapshot execution, but frequent snapshots stress the disk, and each fork can block the main thread.

Redis has designed two killer features to achieve fast recovery from downtime without data loss.

The AOF rewrite mechanism avoids oversized logs: based on the database’s latest data state, it regenerates the write operations as a new log, done in the background so the main thread is not blocked.

The combination of AOF and RDB in Redis 4.0 provides a new persistence strategy, a hybrid logging model. When Redis is restarted, the RDB can be loaded first, and then the incremental AOF log can be replayed completely instead of the previous full AOF file replay, which greatly improves the restart efficiency.

Finally, regarding the choice of AOF and RDB, “Code brother” has three suggestions:

  • When data must not be lost, a combination of memory snapshots and AOF is a good choice;
  • If minute-level data loss is acceptable, RDB alone can be used;
  • If only AOF is used, prefer the everysec policy, because it balances reliability and performance.

After two series of articles on Redis, readers should have an overview of Redis.

Next, “Code Brother” will bring a hands-on piece, “Redis High Availability: The Mystery of Master-Slave Architecture,” presenting both practice and principles!

Stay tuned……

More hardcore articles

Redis Core: The Only Unbreakable Secret

Tomcat architecture principle analysis to design reference

Finish kafka from an interview perspective

Analysis of volatile and synchronized principles from JMM
