Redis, part two: the server went down and the data is lost. What now?

Preface

Redis is an in-memory database, so it is very fast, but that speed comes with a hidden danger: if the server goes down and restarts, is the data in memory still there?

One obvious solution is to reload the data from the backend database, which is viable when the data set is small. With a large data set, however, reading everything back puts heavy pressure on the backend database, and recovery is extremely slow.

For Redis, then, data persistence and fast recovery are critical.

This article introduces the two Redis persistence mechanisms: the AOF log and the RDB snapshot.

What is an AOF log?

The AOF (Append Only File) log is a write-after log: Redis first executes the command and writes the data to memory, and only then records the log entry.

The AOF log is a text file. Each command that Redis receives and executes successfully is appended to it in a fixed format.

What are the benefits of a write-after log? As follows:

A write-ahead log records a command whether or not it executes successfully, but Redis writes the log only after the command has executed successfully, which keeps incorrect commands out of the log.
In addition, because the log is written after the command has executed, logging does not block the execution of the current command.
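The ordering described above can be sketched in a few lines of Python. This is only a toy model, not how Redis is implemented: the TinyStore class, the two-command "protocol", and the file name toy.aof are all made up for illustration.

```python
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs a command only after it has executed."""

    def __init__(self, aof_path):
        self.data = {}
        self.aof_path = aof_path

    def execute(self, command):
        parts = command.split()
        # Step 1: validate and execute in memory. An invalid command raises
        # here, so it never reaches the log.
        if len(parts) == 3 and parts[0].upper() == "SET":
            self.data[parts[1]] = parts[2]
        else:
            raise ValueError(f"unsupported command: {command!r}")
        # Step 2: only now append the command to the log file.
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(command + "\n")


aof_path = os.path.join(tempfile.mkdtemp(), "toy.aof")
store = TinyStore(aof_path)
store.execute("SET key1 value1")
try:
    store.execute("BOGUS key1")
except ValueError:
    pass  # rejected before anything was logged

with open(aof_path, encoding="utf-8") as log:
    logged = log.read().splitlines()
print(logged)
```

Only the valid SET ends up in the file, which is the first benefit listed above.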

However, the AOF log also carries potential risks:

If the server goes down after a command has executed but before its log entry reaches the disk, that command is missing from the log file and its data is lost when the server restarts and recovers. (This matters when the data cannot be recovered from the backend database.)
Logging does not block the current command, but it also runs in the main thread (Redis executes commands in a single thread). If writing the log to disk stalls, the next command is delayed.

To address these risks, AOF provides three write-back strategies.

Three write-back strategies

The AOF mechanism provides three write-back policies, configured through appendfsync:

Always (synchronous write-back): the log is synced to disk immediately after each command is executed.
Everysec (write-back every second): after each command is executed, the log is written to the AOF file's memory buffer, and the buffer's contents are written to disk once per second.
No (write-back controlled by the operating system): after each command is executed, the log is written only to the AOF file's memory buffer, and the operating system decides when to write the buffer back to disk.

In fact, none of the three strategies fully solves both problems, main-thread blocking and data loss. The analysis is as follows:

Synchronous write-back: loses essentially no data, but every write command pays for a slow disk write, which inevitably hurts main-thread performance.
Write-back every second: the AOF file is flushed to disk once per second, so up to one second of data is lost in an outage.
Operating-system-controlled write-back: the main thread continues as soon as the log is in the buffer, but whatever is still in the buffer and has not reached the disk is lost in an outage.

The advantages and disadvantages of the three strategies are summarized below:

Strategy	Advantages	Disadvantages
Always	High reliability; essentially no data is lost	Every write command must be flushed to disk, which has a significant impact on performance
Everysec	Moderate performance	Up to one second of data is lost in an outage
No	Good performance	A lot of data may be lost in an outage
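To make the trade-off concrete, here is a rough Python sketch of the three policies. It is an illustration under simplifying assumptions, not Redis code: the AofWriter class is hypothetical, and the everysec branch simply checks the clock on each append, which does not model how Redis actually schedules its once-per-second flush. The only system call that matters is os.fsync, which forces buffered file data to disk.

```python
import os
import tempfile
import time


class AofWriter:
    """Hypothetical log writer mimicking the three appendfsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.file = open(path, "ab")
        self.policy = policy
        self.last_sync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command):
        self.file.write(command.encode("utf-8") + b"\n")
        self.file.flush()  # hand the bytes to the OS; not yet durable
        if self.policy == "always":
            self._sync()  # pay the disk cost on every command
        elif self.policy == "everysec":
            if time.monotonic() - self.last_sync >= 1.0:
                self._sync()  # at most about one second of data at risk
        # "no": never call fsync; the OS writes back when it chooses

    def _sync(self):
        os.fsync(self.file.fileno())
        self.last_sync = time.monotonic()
        self.fsync_calls += 1

    def close(self):
        self.file.close()


tmp_dir = tempfile.mkdtemp()
counts = {}
for policy in ("always", "everysec", "no"):
    writer = AofWriter(os.path.join(tmp_dir, policy + ".aof"), policy)
    for i in range(100):
        writer.append(f"SET key{i} value{i}")
    counts[policy] = writer.fsync_calls
    writer.close()
print(counts)
```

The number of fsync calls per policy is a crude proxy for the performance cost in the table: one per command for always, roughly one per second for everysec, and none for no.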

What if the log file is too large?

As the volume of data grows, the AOF log file inevitably becomes large, which slows down both appending to it and recovering from it. AOF provides a rewrite mechanism to solve this problem.

The rewrite mechanism is simple to understand: Redis creates a new AOF log file and, for each key-value pair, writes one command that records its final value.

For example, when the rewrite reads the key-value pair key1:value1, it records the following command in the new AOF log file:

set key1 value1

Only the final value after all the changes is recorded in the new AOF log file, so recovery can execute that one command directly.

Why does rewriting shrink the file? When a key is changed several times, the old AOF log file holds one command for every change. The rewrite generates a single write command from the key's latest state, so many commands in the old file become one command in the new file.
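The shrinking effect can be shown with a small Python sketch. It is a simplification: the rewrite_aof function is hypothetical and handles only SET and DEL, and it derives the latest state by replaying the old commands, whereas Redis generates the new log from the data currently in memory.

```python
def rewrite_aof(commands):
    """Collapse a command history into one SET per surviving key."""
    state = {}
    for command in commands:
        parts = command.split()
        op = parts[0].upper()
        if op == "SET":
            state[parts[1]] = parts[2]
        elif op == "DEL":
            state.pop(parts[1], None)
    # One command per key, built from the key's latest value.
    return [f"SET {key} {value}" for key, value in state.items()]


old_log = [
    "SET key1 a",
    "SET key1 b",
    "SET key2 x",
    "DEL key2",
    "SET key1 value1",
]
new_log = rewrite_aof(old_log)
print(new_log)  # ['SET key1 value1']
```

Five commands in the old log become one in the new log, and replaying either log yields the same data.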

The author has drawn a rewrite flow chart for reference only, as follows:

Does AOF rewriting block the main thread?

Although AOF rewriting shrinks the log file and reduces the time needed for logging and recovery, writing the rewritten log for an entire database to disk is very time-consuming when the data set is large. Won't it block the main thread?

The answer is no, apart from the brief fork call itself. The rewrite is carried out by a background child process, started by bgrewriteaof, precisely to avoid blocking the main thread and degrading database performance.

In fact, the rewrite process can be summed up as one copy and two logs.

One copy: each time a rewrite is performed, the main thread forks a background child process. The fork gives the child a copy of the main thread's memory, which contains the latest data in the database. The child can then write the new AOF without affecting the main thread.

What are the two logs? As follows:

The first log is the AOF log currently in use by the main thread. The child's rewrite does not block the main thread, which keeps processing requests and keeps appending to this log, so the data is complete even if there is an outage during the rewrite.
The second log is the new, rewritten AOF log. Operations that arrive during the rewrite are also written to the rewrite buffer, so the new log does not miss the latest operations. Once the child has finished writing out the copied data, the operations held in the rewrite buffer are appended to the new AOF file, so that it reflects the latest state of the database. At that point the new AOF file can replace the old one.

Summary: Redis forks a child process for the AOF rewrite, so the main thread is not blocked; the child works on a copy of memory, and two logs ensure that no new data is lost during the rewrite.
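The "one copy, two logs" flow can be modeled in Python as follows. This is a single-process toy, assuming a made-up RewritableStore class: a plain dict copy stands in for the fork, and Python lists stand in for the log files and the rewrite buffer.

```python
class RewritableStore:
    """Toy store modeling the one-copy, two-logs rewrite flow."""

    def __init__(self):
        self.data = {}
        self.aof = []               # first log: the one currently in use
        self.rewrite_buffer = None  # only exists while a rewrite is running
        self._frozen = None

    def set(self, key, value):
        self.data[key] = value
        command = f"SET {key} {value}"
        self.aof.append(command)  # the old log keeps receiving writes
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(command)  # also kept for the new log

    def start_rewrite(self):
        # Stand-in for fork(): freeze a view of memory for the child.
        self._frozen = dict(self.data)
        self.rewrite_buffer = []

    def finish_rewrite(self):
        # The "child" emits one command per key from the frozen view...
        new_aof = [f"SET {k} {v}" for k, v in self._frozen.items()]
        # ...then the writes that arrived during the rewrite are appended.
        new_aof.extend(self.rewrite_buffer)
        self.aof = new_aof  # the new file replaces the old one
        self.rewrite_buffer = None
        self._frozen = None


def replay(commands):
    state = {}
    for command in commands:
        _, key, value = command.split()
        state[key] = value
    return state


store = RewritableStore()
store.set("a", "1")
store.set("a", "2")
store.set("b", "1")
store.start_rewrite()
store.set("a", "3")  # arrives while the rewrite is in progress
store.set("c", "9")
store.finish_rewrite()
print(store.aof)
```

Replaying the new log reproduces the current data, including the two writes that arrived mid-rewrite, which is exactly what the second log is for.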

The disadvantage of AOF

Although the AOF log file is much smaller after a rewrite, recovery still replays it command by command (and only sequentially, because Redis executes commands in a single thread), so recovery is very slow.

Summary

AOF persists data by logging operation commands one by one, and provides three write-back policies to protect the data: Always, Everysec, and No. In that order, the three policies go from high to low reliability and from low to high performance.

To keep the log file from growing too large, Redis provides a rewrite mechanism. Each rewrite forks a child process that works on a copy of memory, reduces the multiple commands for each key to a single command that recreates the key-value pair, and writes the result out as a new log.

What is RDB?

Redis DataBase (RDB) is the other persistence method: the memory snapshot.

RDB records the data in memory at a certain moment, not the operation commands.

This is similar to taking a photo, which preserves the image of a single moment. A memory snapshot writes the state of memory at a given moment to a file on disk, so that the data captured in it survives an outage. This snapshot file is called an RDB file.

Because the file holds the in-memory data as of one point in time, recovery is very fast: there is no need to replay recorded commands one by one, as with the AOF log.
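The difference from command replay can be sketched in Python. This is an illustration only: JSON stands in for the real RDB file format, which is a compact binary format, and the function names and the file name toy.rdb.json are made up. Writing to a temporary file and renaming it is a common way to avoid leaving a half-written snapshot behind.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write the whole data set as it is right now, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk
    os.replace(tmp_path, path)  # atomic rename over any previous snapshot


def load_snapshot(path):
    """Recovery is a single bulk load; no commands are replayed."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


data = {"key1": "value1", "counter": "42"}
path = os.path.join(tempfile.mkdtemp(), "toy.rdb.json")
save_snapshot(data, path)
restored = load_snapshot(path)
print(restored == data)  # True
```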

What data does a snapshot cover?

To ensure data reliability, Redis performs a full snapshot, that is, it writes all the data in memory to disk.

As the amount of data grows, writing everything to disk in one go would inevitably block the thread doing the work, which affects Redis performance.

Redis provides two commands for generating the RDB file, and they differ in how they block:

save: executes in the main thread, so it blocks the main thread.
bgsave: forks a child process dedicated to writing the RDB file, which avoids blocking the main thread. This is the default way Redis generates RDB files.

In this way, you can use the bgsave command to take a full snapshot, which ensures data reliability while avoiding blocking the main thread.

Can data be modified during a snapshot?

While the child process is taking a full snapshot, the main thread keeps receiving requests. Reads are no problem. But if data is modified, how can the integrity of the snapshot be guaranteed?

For example, suppose a full snapshot starts at time T, the data volume is 8 GB, and writing it to disk takes at least 20 seconds. If data in memory were modified in place during those 20 seconds, the snapshot would no longer be a consistent image of time T.

On the other hand, forbidding modifications while a snapshot is being taken would have a huge impact on Redis performance. How does Redis solve this problem?

Redis relies on the copy-on-write (COW) technology provided by the operating system, so that write operations can proceed normally while a snapshot is being taken.

The bgsave command forks a child process that shares all of the main thread's memory data; the child reads that data and writes it to the RDB file.

As shown in the figure above, reads by the main thread (for example, of key-value pair A) do not affect the child process. If the main thread modifies a piece of memory (for example, key-value pair D), the operating system first copies that piece of memory and the main thread modifies the copy, while the bgsave child process keeps reading the original data and writes it to the RDB file.
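The effect of fork plus copy-on-write can be observed directly from Python on a POSIX system (os.fork is not available on Windows). The sketch below is not Redis code: the bgsave_sketch function and the JSON output file are made up for illustration. The parent changes key D right after forking, yet the child still writes the value D had at fork time.

```python
import json
import os
import tempfile


def bgsave_sketch(data, path, mutate):
    """Fork a child that writes `data` to `path` while the parent mutates it."""
    pid = os.fork()
    if pid == 0:
        # Child process: it sees memory exactly as it was at fork time.
        status = 1
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump(data, f)
            status = 0
        finally:
            os._exit(status)  # leave immediately, skipping parent cleanup
    # Parent process: keeps serving "writes" while the child saves.
    mutate(data)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


data = {"A": "1", "D": "old"}
path = os.path.join(tempfile.mkdtemp(), "toy.rdb.json")
ok = bgsave_sketch(data, path, lambda d: d.update(D="new"))

with open(path, encoding="utf-8") as f:
    snapshot = json.load(f)
print(snapshot["D"], data["D"])  # old new
```

The parent never waits for the child before modifying D, and the snapshot is still consistent, because after the fork the two processes no longer see each other's writes.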

How often should snapshots be taken?

A snapshot records the data at a single point in time. If snapshots are far apart and the server goes down in between, the data written since the last snapshot is lost.

For example, one snapshot is taken at time T1 and the next at time T1+t. If the server goes down at some point between T1 and T1+t, only the snapshot taken at T1 exists, and the modifications made during the interval t are not recorded (they are lost). The diagram below:

The figure above makes it clear that RDB is not a perfect persistence scheme; the amount of lost data can only be reduced by shrinking the interval t.

So the question is: can the interval be shortened to one second, that is, can a snapshot be taken every second?

A full snapshot records all the data in memory at a point in time. Taking one every second would have a significant impact on Redis performance, which leads to the idea of incremental snapshots.

Incremental snapshots

An incremental snapshot means that once a full snapshot has been taken, subsequent snapshots record only the data modified since then. This avoids paying the cost of a full snapshot every time.

The premise of incremental snapshots is that Redis can remember which data has been modified, and that bookkeeping is itself expensive: it has to store the complete modified key-value pairs, which consumes a lot of memory.

To solve this problem, Redis uses a combination of AOF and RDB.
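A short Python sketch shows where that bookkeeping cost comes from. It is purely hypothetical: the DirtyTrackingStore class is invented for illustration and does not correspond to a Redis feature.

```python
class DirtyTrackingStore:
    """Hypothetical store that remembers what changed since the last snapshot."""

    def __init__(self):
        self.data = {}
        self.dirty = {}  # modified key -> latest value

    def set(self, key, value):
        self.data[key] = value
        # The bookkeeping cost: every modified key and value is held twice.
        self.dirty[key] = value

    def full_snapshot(self):
        self.dirty.clear()
        return dict(self.data)

    def incremental_snapshot(self):
        delta = dict(self.dirty)
        self.dirty.clear()
        return delta


store = DirtyTrackingStore()
for i in range(3):
    store.set(f"key{i}", "v0")
base = store.full_snapshot()

store.set("key1", "v1")  # only one key changes afterwards
delta = store.incremental_snapshot()
print(delta)  # {'key1': 'v1'}

# Recovery applies the delta on top of the last full snapshot.
recovered = dict(base)
recovered.update(delta)
```

The delta is small when few keys change, but under a write-heavy workload the dirty map grows toward the size of the data set itself, which is the memory overhead described above.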

Using AOF and RDB in combination

This approach was introduced in Redis 4.0. In simple terms, memory snapshots are taken at a certain frequency, such as once an hour, and between snapshots all command operations are recorded in the AOF log.

In this way, memory snapshots do not need to be taken frequently, and the AOF does not record every operation command ever issued, only the commands executed between two snapshots. The AOF log file does not grow too large, which avoids the overhead of AOF rewrites.

This solution combines the fast recovery of RDB with the simplicity of AOF, which logs only operation commands, and it is highly recommended.
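Recovery under the combined scheme can be sketched in Python as "bulk-load the snapshot, then replay the short command tail". The recover function and the sample data are illustrative assumptions; in Redis 4.0 and later the relevant configuration directive is aof-use-rdb-preamble, which this sketch does not attempt to model in detail.

```python
def recover(snapshot, aof_tail):
    """Load the last snapshot, then replay only the commands logged after it."""
    state = dict(snapshot)  # fast part: bulk load
    for command in aof_tail:  # slow part, but the tail is short
        parts = command.split()
        op = parts[0].upper()
        if op == "SET":
            state[parts[1]] = parts[2]
        elif op == "DEL":
            state.pop(parts[1], None)
    return state


# State captured by the last snapshot.
snapshot = {"key1": "value1", "key2": "value2"}
# Commands logged between that snapshot and the outage.
aof_tail = ["SET key1 value9", "DEL key2", "SET key3 value3"]

recovered = recover(snapshot, aof_tail)
print(recovered)  # {'key1': 'value9', 'key3': 'value3'}
```

Nothing written after the snapshot is lost, and only three commands have to be replayed instead of the full history.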

Summary

An RDB memory snapshot records the data in memory at a certain point in time and allows fast recovery. Combining AOF and RDB gives fast recovery after an outage and keeps the AOF log file from growing too large.

Conclusion

This article introduced two schemes for data persistence and recovery: AOF and RDB.

What was covered for AOF? As follows:

AOF is a write-after log that persists data by recording operation commands.
Because the AOF log entry is generated after the command is executed, data is lost if the server goes down before the entry is written to disk, and the main thread is blocked if the disk write stalls. To address these issues, the AOF mechanism provides three write-back strategies, each with its own advantages and disadvantages.
What if the AOF log file grows too large? AOF forks a child process that writes a new, rewritten log file from a copy of the main thread's memory, while the latest write commands are tracked in a buffer. Doing the rewrite in the child process avoids blocking the main thread.

What was covered for RDB? As follows:

An RDB snapshot is a memory snapshot: it records the data in memory at a certain point in time, not the operation commands.
Redis provides two commands, save and bgsave, for taking a full snapshot. The difference is that save executes in the main thread and blocks it, while bgsave forks a child process that shares the main thread's memory.
RDB uses the operating system's copy-on-write technology so that the main thread can keep modifying data while the snapshot is being taken.
If the server goes down, the data generated between two snapshots is lost. Starting with Redis 4.0, the AOF log is used to record the commands executed between snapshots (AOF and RDB in combination).

