We all know that Redis is an in-memory database. If the server process exits (power failure, restart, and so on), the data held in memory is lost. To solve this problem, Redis provides two ways to persist data to disk: memory snapshots (RDB) and the AOF log.

1. What is a memory snapshot

A memory snapshot, as the name implies, is a picture of memory: at a certain point in time, the data in memory is recorded and saved to disk as a file, so the data survives even if the server goes down. After the server restarts, it only needs to restore the data from the "photo".

RDB persistence is a process that generates a snapshot (a compressed binary file) of the data in the current process at a certain moment and saves it to disk. RDB persistence can be triggered manually or automatically.

1.1 Manual Triggering

Manual triggering corresponds to two commands, save and bgsave:

1.1.1 The save command

The save command blocks the Redis server process until the RDB file has been created. The server cannot process any command requests while it is blocked, so commands sent by clients during a save are not handled until the save completes.

redis> save    // Wait until the RDB file is created
OK

Note:

Because Redis executes commands on a single thread, we should avoid any operation that blocks the main thread. The save command blocks the server process for its whole duration, which can be a long time for instances with a lot of memory, so it is not recommended in production environments.

1.1.2 The bgsave command

The bgsave command forks a child process (rather than starting a thread) that creates the RDB file, while the parent process continues to process commands.

redis> bgsave
Background saving started    // The RDB file is created by the child process
redis>                       // Continue processing other commands

Note:

  1. While bgsave is running, the server rejects any save command sent by a client, to avoid a race condition between the parent and the child both calling rdbSave at the same time.
  2. If bgsave is running, a bgrewriteaof (AOF rewrite) command is delayed until bgsave finishes. If bgrewriteaof is running, a bgsave command sent by a client is rejected by the server.
  3. Although bgsave generates the RDB file in a child process, the fork() call that creates the child still blocks the parent for a short time (see below).

1.2 Automatic Trigger

Because bgsave can run without blocking the server process, Redis can be configured to run it automatically at intervals through the save option in the server configuration. For example, we can give the server the following configuration (these values were the defaults in redis.conf for many Redis releases; check the redis.conf of your version):

save 900 1
save 300 10
save 60 10000

The bgsave command is executed if one of the following conditions is met:

  • The server made at least one change to the database within 900 seconds
  • The server made at least 10 changes to the database in 300 seconds
  • The server made at least 10,000 changes to the database in 60 seconds
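The check behind these save points can be sketched in a few lines. This is a simplified illustration, not Redis source code: the function name `should_bgsave` and its parameters (`dirty` for the number of changes since the last save, `last_save` for the time of the last save) are assumptions made for this example.

```python
# Save points as (seconds, changes) pairs, mirroring the three "save" lines above.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(dirty, last_save, now, save_points=SAVE_POINTS):
    """Return True if any save point is satisfied.

    dirty     -- number of changes since the last successful save (assumed counter)
    last_save -- timestamp in seconds of the last successful save
    now       -- current timestamp in seconds
    """
    elapsed = now - last_save
    return any(
        dirty >= changes and elapsed > seconds
        for seconds, changes in save_points
    )


# One change is enough once more than 900 seconds have passed ...
print(should_bgsave(dirty=1, last_save=0, now=901))
# ... but not after only 100 seconds.
print(should_bgsave(dirty=1, last_save=0, now=100))
```

A server would run a check like this periodically and reset the change counter after each successful save.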

1.3 Loading of RDB files

When Redis starts, it automatically loads the RDB file if one is detected. Note the following:

  • Since AOF files are usually updated more frequently than RDB files, if AOF persistence is enabled, the server will use AOF files to restore database state in preference.
  • The RDB file is used by the server to restore the database state only when AOF persistence is turned off.

Note: The server blocks while loading the RDB file until the load is complete

2. Problems with memory snapshots

Now that you know what Redis’s RDB persistence is, let’s think about two questions.

2.1 Can data be modified during a snapshot

Redis RDB persistence takes a picture of all the data in memory at a given moment. This raises a question: can the data be modified while the snapshot is being taken?

If we use the save command, Redis blocks for the whole persistence process and cannot execute any other command, so nothing can change. One might say that bgsave solves this, but is bgsave really enough on its own?

When we have our picture taken, the photographer asks us not to move, because movement blurs the photo. The same is true when Redis takes a memory snapshot: if some data is modified while the snapshot is being written, the correctness and integrity of the snapshot are undermined.

For example, we start a snapshot at time t, intending to record all the data in memory as it was at time t. Suppose the RDB operation takes 10s, and at t+2s we change key1 from A to B before the RDB operation has written key1 to disk. If the value of key1 is then read at t+5s and written to disk, the snapshot records B instead of A, the value at time t. This breaks the correctness of the RDB file.

Generating an RDB file takes time, and a business system cannot accept a rule that data must not be modified while a snapshot is running. So how does Redis solve this problem?

Redis relies on the copy-on-write (COW) mechanism provided by the operating system so that it can keep handling writes while a snapshot is in progress. In simple terms, when bgsave forks a child process, the operating system does not duplicate the parent's memory. It copies only the necessary bookkeeping structures, such as the page table, so the child initially shares the same physical memory pages as the parent. Once the bgsave child is running, it reads the in-memory data and writes it to the RDB file. If the main thread only reads this data, the two processes do not affect each other. If the main thread modifies a piece of data, the operating system first copies the affected memory page: the main thread's write goes to its own private copy, while the child still sees the page as it was at fork time. The child therefore writes the data as of the moment of the fork to the RDB file, and the main thread can keep modifying data without affecting the snapshot.
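The effect of fork on a snapshot can be observed with a small experiment. The sketch below is not how Redis is implemented; it only assumes a Unix-like system where `os.fork` is available, and the function name `snapshot_via_fork` is made up for this example. The parent changes key1 from A to B right after the fork, and the child then reports the value it sees.

```python
import os


def snapshot_via_fork(data):
    """Fork a child, modify data in the parent, return the value the child sees."""
    result_r, result_w = os.pipe()  # child -> parent: the value in the child's view
    go_r, go_w = os.pipe()          # parent -> child: "the parent has modified the data"
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the bgsave child process.
        try:
            os.close(result_r)
            os.close(go_w)
            os.read(go_r, 1)  # wait until the parent has written to key1
            os.write(result_w, data["key1"].encode())
        finally:
            os._exit(0)
    # Parent: plays the role of the main thread that keeps serving writes.
    os.close(result_w)
    os.close(go_r)
    data["key1"] = "B"        # modify the data after the fork
    os.write(go_w, b"x")      # let the child read its view
    seen = os.read(result_r, 1024).decode()
    os.waitpid(pid, 0)
    os.close(result_r)
    os.close(go_w)
    return seen


data = {"key1": "A"}
seen_by_child = snapshot_via_fork(data)
print("child saw:", seen_by_child, "| parent now has:", data["key1"])
```

The child still reads the value from the moment of the fork even though the parent has already changed it, which is the property bgsave depends on.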

2.2 Can snapshots be taken frequently

Suppose we take a snapshot at time t and plan the next one at time t+n, and data changes in between. If the system goes down before the second snapshot completes, it can only be restored from the snapshot taken at time t, so the changes made during those n seconds are lost and cannot be recovered.

Therefore, to lose as little data as possible, the interval between snapshots must be shortened. The shorter the interval, the less data is lost. But can snapshots be taken frequently?

We know that bgsave does not block the main thread while it runs, but this does not mean that snapshots can be taken frequently.

On the one hand, persistence means writing to disk. Frequently writing the full data set puts a lot of pressure on the disk. If a new snapshot starts before the previous one has finished, the snapshots compete for limited disk bandwidth, which creates a vicious cycle.

On the other hand, although the child process forked by bgsave does not block the parent while it works, the fork itself is performed by the main process and blocks it. Forking requires copying the necessary data structures, including the memory page table (the index that maps virtual memory to physical memory). This copy consumes a lot of CPU, and the main process is blocked until it completes. The blocking time depends on the memory size of the instance: the larger the instance, the larger the page table, and the longer the fork blocks.
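To get a feel for how the page table grows with instance size, here is a rough back-of-the-envelope estimate. The figures are assumptions for illustration only: 4 KiB pages and 8 bytes per page table entry (typical on x86-64), counting only the last-level entries and ignoring huge pages.

```python
def page_table_bytes(instance_bytes, page_size=4096, entry_size=8):
    """Rough size of the last-level page table entries for an instance.

    page_size and entry_size are assumed values, not read from the system.
    """
    pages = -(-instance_bytes // page_size)  # ceiling division
    return pages * entry_size


GIB = 1024 ** 3
MIB = 1024 ** 2
for size_gib in (1, 10, 50):
    mib = page_table_bytes(size_gib * GIB) / MIB
    print(f"{size_gib} GiB instance -> about {mib:.0f} MiB of page table entries")
```

Under these assumptions the page table grows linearly with the data set, roughly 2 MiB per GiB of memory, which is why fork takes longer on large instances.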

Some of you might wonder whether incremental snapshots are possible, that is, snapshots that cover only the data changed since the last snapshot.

At first glance the answer is yes, but incremental snapshots require remembering which data has changed since the last snapshot. Recording this takes additional metadata and therefore additional space, which is not a good fit for Redis, where memory is precious.

If snapshots cannot be taken frequently, how can we reduce the data lost between two snapshots? Redis provides another persistence method for this: AOF (Append Only File) logging.

3. What is AOF log persistence

3.1 Writing Logs After execution

Unlike memory snapshots, which save the data currently in memory, AOF persistence records the database state by saving the write commands executed by the Redis server. That is, every time a write command is executed, it is appended to a log file.

Note that the log is written after Redis has executed the command and written the data to memory.

The advantage of logging after execution is that only commands that executed successfully are recorded, so no extra syntax check is needed before logging. If the log were written before the command ran, a command with a syntax error could end up in the log file unless it was checked first, and restoring data from that log would then fail. Writing the log after execution avoids this problem, and it does not block the current write operation.

But there are two problems:

  1. While AOF avoids blocking the current command, it risks blocking the next one. The AOF log is written by the main process, so if the disk is under heavy write pressure, writing the log is slow and subsequent operations cannot proceed.
  2. If the server goes down right after a command has executed but before it has been logged, the command and its data may be lost. If Redis is used as a cache, the data can be reloaded from the back-end database, but if Redis is used directly as a database, the command was never logged and cannot be recovered from the log.

3.2 AOF buffer

In response to the above two problems, Redis provides a buffer method for recording AOF logs to avoid blocking and data loss problems as much as possible.

That is, Redis does not write to the disk log file directly after executing the command for persistence, but writes to the AOF buffer first, and then writes to the disk through a certain policy.

The two problems mentioned above are related to when the logs are written to disk from the buffer.

3.3 Three write-back strategies

The Redis AOF mechanism provides three strategies for writing back to disk.

  • Always (synchronous write back): after the command is written to the AOF buffer, fsync is called to synchronize it to the AOF file. The thread returns only after fsync completes.
  • Everysec (write back every second): after the command is written to the AOF buffer, the write system call is issued and the thread returns when it completes. A dedicated thread calls fsync once per second.
  • No (write back controlled by the operating system): after the command is written to the AOF buffer, the write system call is issued. Redis does not call fsync on the AOF file; the operating system decides when to synchronize to disk.
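The difference between the three strategies comes down to when fsync is called after write. The toy class below illustrates this with the real `os.write` and `os.fsync` calls. It is a sketch, not Redis code: the class name `AofWriter`, its methods, and the one-command-per-line file format are assumptions for this example, and the everysec case checks a timer inline instead of using a dedicated background thread.

```python
import os
import time


class AofWriter:
    """Toy append-only log illustrating the three write-back policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.buffer = bytearray()  # stands in for the AOF buffer
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command):
        # Executed commands go into the in-memory buffer first.
        self.buffer += command.encode() + b"\n"

    def flush(self):
        # write() hands the buffered data to the operating system's page cache.
        if self.buffer:
            os.write(self.fd, bytes(self.buffer))
            self.buffer.clear()
        # fsync() forces the data to disk; whether it runs depends on the policy.
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1
        # With "no", fsync is never called; the OS flushes when it chooses.

    def close(self):
        self.flush()
        os.close(self.fd)
```

With policy "always", every flush pays for an fsync; with "no", none does; "everysec" pays for at most about one per second, which bounds the data at risk to roughly the last second of writes.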

But as you can see, none of these three write back strategies can solve the problem perfectly.

With always, the AOF file is synchronized on every write. A hard disk cannot write as fast as memory, so this works against Redis's high-performance design.

With no, the interval between AOF file synchronizations is not under Redis's control, and the amount of data synchronized each time grows. This improves performance, but data safety is not guaranteed.

Everysec is the default policy and balances performance and data safety. In the worst case, however, about 1 second of data is lost.

In real use, we can analyze these three write back strategies according to specific performance and data integrity requirements, and select the appropriate strategy for persistence.

Write-back policy | Advantages | Disadvantages
--- | --- | ---
Always (synchronous write back) | High reliability; data is not lost | Poor performance
Everysec (write back every second) | Moderate performance | Up to about 1 second of data lost on a crash
No (write back controlled by the operating system) | Good performance | A lot of data may be lost on a crash

3.4 AOF rewrite

3.4.1 What to do when the log file keeps growing

With a suitable write-back strategy chosen, what other problems does AOF persistence have?

Because AOF persistence records the database state by storing the write commands that were executed, the AOF file grows larger and larger over time. An oversized AOF file not only slows down appending commands, it may also affect the Redis server or even the whole host, and the larger the file, the longer it takes to restore data from it.

This is where the AOF rewrite mechanism comes in.

redis> set testKey testValue
OK
redis> set testKey testValue1
OK
redis> del testKey
(integer) 1
redis> set testKey hello
OK
redis> set testKey world
OK

The AOF file records the write commands it receives one by one by appending them. When a key-value pair is modified repeatedly by several write commands, the AOF file records every one of them. In the example above, Redis appends 5 commands to the AOF file, but the only one needed to reproduce the final state is set testKey world.

In an AOF rewrite, Redis creates a new AOF file based on the current state of the database. It reads all the key-value pairs in the database and records each one with a single write command. For example, when it reads the key-value pair "testKey": "world", the rewrite mechanism records the command set testKey world. When the data needs to be restored, running this command again writes "testKey": "world".

As a result, the log shrinks from five commands to one, and the savings are even greater for keys that have been changed hundreds or thousands of times.

Although the log file is smaller after an AOF rewrite, writing the commands for the latest data of the entire database back to disk is still a very time-consuming process. So we have to ask: will the rewrite block the server? Let's look at how the AOF rewrite process works.
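The idea of the rewrite can be sketched as follows. This is a simplified model that only understands SET and DEL on string values; the function names `replay` and `rewrite_aof` are made up for this example, and `replay` stands in for "the current state of the database", which the real rewrite reads directly from memory rather than from the old log.

```python
def replay(commands):
    """Apply SET/DEL commands to an empty dict and return the resulting state."""
    state = {}
    for line in commands:
        parts = line.split()
        op = parts[0].upper()
        if op == "SET":
            state[parts[1]] = parts[2]
        elif op == "DEL":
            for key in parts[1:]:
                state.pop(key, None)
        else:
            raise ValueError("unsupported command: " + line)
    return state


def rewrite_aof(commands):
    """Build a new log with one SET per key that still exists."""
    state = replay(commands)
    return ["SET {} {}".format(key, value) for key, value in state.items()]


old_aof = [
    "set testKey testValue",
    "set testKey testValue1",
    "del testKey",
    "set testKey hello",
    "set testKey world",
]
new_aof = rewrite_aof(old_aof)
print(new_aof)  # a single SET for testKey
```

Replaying the new log yields the same state as replaying the old one, with far fewer commands.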

3.4.2 AOF rewrite process

Because an AOF rewrite is very time-consuming and Redis executes commands on a single thread, the rewrite is done by a child process that the parent forks for bgrewriteaof, just as with memory snapshots.

Using a child process (rather than a thread) for the AOF rewrite avoids locking and keeps the data safe, but it can leave the child inconsistent with the parent. For example, if the parent process handles a new write while the rewrite is in progress, the child cannot know about it, so the database state captured by the rewrite differs from the parent's current state.

The following table shows an example:

Time | Server process (parent process) | Child process
--- | --- | ---
T1 | Run SET K1 V1 |
T2 | Run SET K1 V1 |
T3 | Create a child process to rewrite the AOF file | Start the AOF rewrite
T4 | Run SET K2 V2 | Perform the rewrite
T5 | Run SET K3 V3 | Perform the rewrite
T6 | Run SET K4 V4 | Complete the AOF rewrite

At time T6, the server process has four keys, while the child process has only one key.

To resolve this inconsistency, Redis sets up an AOF rewrite buffer.

While the child process performs the AOF rewrite, the server process needs to do the following three things:

  1. Run client commands
  2. Append to the AOF buffer after execution
  3. Append to the AOF rewrite buffer after execution

When the child completes the AOF rewrite, it sends a signal to the parent. The parent calls a signal handler that appends the contents of the AOF rewrite buffer to the new AOF file and then replaces the existing AOF file with the new one. Once the handler finishes, the parent can accept client commands again. In the whole background rewrite, this signal handler is the only part that blocks the server process. The following table shows the complete AOF background rewrite process:

Time | Server process (parent process) | Child process
--- | --- | ---
T1 | Run SET K1 V1 |
T2 | Run SET K1 V1 |
T3 | Create a child process to rewrite the AOF file | Start the AOF rewrite
T4 | Run SET K2 V2 | Perform the rewrite
T5 | Run SET K3 V3 | Perform the rewrite
T6 | Run SET K4 V4 | Complete the AOF rewrite and send a signal to the parent process
T7 | On receiving the signal, append the write commands executed at T4, T5, and T6 to the end of the new AOF file |
T8 | Replace the old AOF file with the new AOF file |

This ensures that all operations during the log rewrite will also be written to the new AOF file.

Note that the work done at T7 and T8 blocks the server from processing commands.

In general, every AOF rewrite forks a child process to do the rewriting, and two buffers (the AOF buffer and the AOF rewrite buffer) ensure that data written during the rewrite is not lost.
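The timeline above can be modelled with a short simulation. It is a sketch of the mechanism as described here, not Redis source code: the `Server` class, its method names, and the use of a dict copy to stand in for the forked child's view of the data are all assumptions for this example.

```python
class Server:
    """Toy model of the parent process during a background AOF rewrite."""

    def __init__(self):
        self.db = {}
        self.aof = []               # the current AOF file, as a list of commands
        self.rewrite_buffer = None  # only active while a rewrite is running

    def set(self, key, value):
        cmd = ("SET", key, value)
        self.db[key] = value                 # 1. run the client command
        self.aof.append(cmd)                 # 2. append to the (old) AOF
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(cmd)  # 3. append to the AOF rewrite buffer

    def start_rewrite(self):
        self.rewrite_buffer = []
        # The child sees the database as it was at fork time.
        return dict(self.db)

    def finish_rewrite(self, child_snapshot):
        # The child's output: one SET per key it saw.
        new_aof = [("SET", k, v) for k, v in child_snapshot.items()]
        # T7: append the commands buffered during the rewrite.
        new_aof.extend(self.rewrite_buffer)
        # T8: replace the old AOF with the new one.
        self.aof = new_aof
        self.rewrite_buffer = None


def replay_aof(aof):
    """Rebuild a database dict by replaying SET commands."""
    db = {}
    for _, key, value in aof:
        db[key] = value
    return db


server = Server()
server.set("K1", "V1")             # T1
server.set("K1", "V1")             # T2
snapshot = server.start_rewrite()  # T3
server.set("K2", "V2")             # T4
server.set("K3", "V3")             # T5
server.set("K4", "V4")             # T6
server.finish_rewrite(snapshot)    # T7, T8
print(server.aof)
```

Replaying the new AOF rebuilds all four keys, even though the child only ever saw K1.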

3.4.3 Recovering AOF Files

When AOF persistence is enabled, the AOF file is loaded in preference after the Redis server restarts. Since the AOF file contains all the write commands needed to rebuild the database state, the server can re-execute those commands to restore the state it had before shutdown.

Since Redis commands can only be executed in a client context, Redis creates a fake client with no network connection to execute the contents of the AOF file.

4. Summary

This article covered Redis persistence: RDB memory snapshots and AOF logging, including the three write-back strategies for disk synchronization and how oversized log files are rewritten. Redis offers both AOF and RDB persistence. What are the advantages and disadvantages of each? How should we choose between them in real use, and what problems might we encounter?