Quote of the day

Bowing your head is an ability. It is neither inferiority nor cowardice, but a clear-headed kind of growth. Sometimes lowering your head a little makes life more rewarding.

Introduction

Redis is an in-memory key-value (K-V) database server. A server usually contains any number of non-empty databases, and each non-empty database can store any number of key-value pairs, as shown in the following figure:

  • The strong performance of Redis is largely due to the fact that all data is stored in memory. In order to ensure that data is not lost after Redis restarts, data needs to be synchronized from memory to disk in some form. This process is called persistence.

  • Data cached in Redis lives in memory, so once the service fails, that data is lost. A persistence scheme is therefore needed to write the data in Redis memory to disk and to recover it from disk after Redis restarts.

Redis server structure

  • This raises a problem: because Redis is an in-memory database, if it only keeps data in memory and never persists it to disk, the data in the database disappears as soon as the server process exits.

  • There are two main persistence mechanisms: RDB and AOF. The AOF mechanism was introduced in the previous article; if you are interested, you can check it out. This article focuses on the RDB mechanism.

RDB persistence mode

RDB persistence writes a snapshot of the dataset in Redis memory to disk at specified intervals. To do so, the Redis server forks a child process when the configured conditions are met, and the child process writes the dataset to a temporary file. After the write succeeds, the temporary file replaces the previous snapshot. The result is a compressed binary file, dump.rdb, stored on disk.

RDB mechanism

  • Redis provides RDB persistence, which saves the in-memory database state to disk to avoid accidental data loss.

  • RDB persistence can be triggered manually or run periodically based on the server configuration. It saves a snapshot of the data at a point in time to an RDB file.

RDB advantages

  • With RDB, the entire Redis dataset is contained in a single file, which is ideal for backups. For example, you might archive a snapshot every hour for the last 24 hours and one every day for the last 30 days. With such a backup strategy, you can easily recover from catastrophic system failures.

  • RDB is a great choice for disaster recovery. It is very easy to compress a single file and transfer it to another storage medium.

  • Maximum performance. To start persisting, the only thing the Redis server has to do is fork a child process; the child then does all the persistence work, so the server process never performs the disk I/O itself.

  • Compared to AOF, RDB allows faster restarts when the dataset is large.

RDB disadvantages

  • RDB is not a good choice if you need to minimize the chance of data loss. If the system goes down before the next scheduled snapshot, the data that has not yet been written to disk is lost.

  • Because RDB relies on fork to persist data in a child process, a large dataset can cause the server to stop serving clients for hundreds of milliseconds, or even a second.

RDB configuration rules

In the redis 6379.conf configuration file:

Backup configuration parameters

save <seconds> <changes>

save <seconds> <changes> means: if at least <changes> update operations have been performed within <seconds> seconds, the data in memory is synchronized to the hard disk. The official default is to write a snapshot of the in-memory data to disk after 1 change in 900 seconds, 10 changes in 300 seconds, or 10000 changes in 60 seconds.

# After 900 seconds, dump a memory snapshot if at least 1 key has changed
save 900 1
# After 300 seconds, dump a memory snapshot if at least 10 keys have changed
save 300 10
# After 60 seconds (1 minute), dump a memory snapshot if at least 10000 keys have changed
save 60 10000
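To make the rule semantics concrete, here is a minimal Python sketch of how such save rules could be evaluated. The function name `should_snapshot` and the use of `>=` for the time comparison are assumptions made for illustration; this is not the Redis source code.

```python
from typing import List, Tuple

# (seconds, changes) pairs matching the three default rules quoted above.
SAVE_RULES: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save: float,
                    changes_since_last_save: int,
                    rules: List[Tuple[int, int]] = SAVE_RULES) -> bool:
    """Return True if at least one rule has both its conditions met.

    A rule (seconds, changes) fires only when enough time has passed
    AND enough keys have changed since the last snapshot.
    """
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )


if __name__ == "__main__":
    print(should_snapshot(100, 5))       # too few changes for the 60s rule
    print(should_snapshot(301, 10))      # satisfies the 300s / 10 changes rule
    print(should_snapshot(30, 50000))    # many changes, but not enough time yet
```

Note that the rules are combined with "or": any single rule that is satisfied is enough to trigger a snapshot.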

File configuration parameters

By default, the RDB file is stored in the current directory and is named dump.rdb. You can change the path and the filename in the configuration file with dir and dbfilename respectively.

# RDB file storage path
dir ./
# RDB filename
dbfilename dump.rdb

Compressed configuration parameters

Whether to compress the snapshot when it is dumped.

# Enabled by default in Redis.
# yes: compress, at the cost of some CPU.
# no: do not compress; requires more disk space.
rdbcompression yes
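As a rough illustration of how these directives fit together, the following Python sketch reads a redis.conf-style fragment and collects the RDB-related settings. The helper names `parse_rdb_settings` and `rdb_path` and the sample text are hypothetical, and the `dir ./` value is assumed from the statement that the default path is the current directory. The real Redis parser handles many more directives and edge cases.

```python
import os
from typing import Dict, List, Tuple, Union

Setting = Union[str, bool, List[Tuple[int, int]]]

# Hypothetical sample fragment assembled from the directives discussed above.
SAMPLE_CONF = """\
# snapshot rules
save 900 1
save 300 10
save 60 10000
dir ./
dbfilename dump.rdb
rdbcompression yes
"""


def parse_rdb_settings(text: str) -> Dict[str, Setting]:
    """Collect save, dir, dbfilename and rdbcompression from config text."""
    rules: List[Tuple[int, int]] = []
    settings: Dict[str, Setting] = {"save": rules}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, *args = line.split()
        name = name.lower()
        if name == "save" and len(args) == 2:
            rules.append((int(args[0]), int(args[1])))
        elif name in ("dir", "dbfilename") and len(args) == 1:
            settings[name] = args[0]
        elif name == "rdbcompression" and len(args) == 1:
            settings[name] = args[0].lower() == "yes"
    return settings


def rdb_path(settings: Dict[str, Setting]) -> str:
    """Join dir and dbfilename, falling back to the defaults from the text."""
    directory = str(settings.get("dir", "."))
    filename = str(settings.get("dbfilename", "dump.rdb"))
    return os.path.join(directory, filename)


if __name__ == "__main__":
    parsed = parse_rdb_settings(SAMPLE_CONF)
    print(parsed["save"])
    print(rdb_path(parsed))
```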

If no automatic snapshot has been triggered, you can run the save or bgsave command to take a snapshot manually.

  • SAVE: The snapshot is taken by the main process, which blocks other requests until it finishes.
  • BGSAVE: The snapshot is taken by a forked child process, so other requests are not blocked.

Note: Since Redis uses fork to copy the current process, the child process starts out with the same memory contents as the main process. In the worst case, if the main process modifies all of its data while the snapshot is running, memory usage can approach double: for example, a main process holding 8GB may need up to 16GB during the backup. If physical memory runs out, virtual memory (swap) is used and performance becomes very poor.
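The 8GB and 16GB figures in the note correspond to the worst case, in which every piece of data is modified while the snapshot is running. A back-of-the-envelope estimate might look like the sketch below, where `modified_fraction` is a hypothetical parameter for the share of the dataset the main process rewrites during the backup; real usage also depends on page granularity and allocator behavior.

```python
def peak_memory_gb(dataset_gb: float, modified_fraction: float) -> float:
    """Rough peak memory during a snapshot.

    The child shares unmodified memory with the parent, so only the part
    the parent modifies during the snapshot has to exist twice.
    """
    if not 0.0 <= modified_fraction <= 1.0:
        raise ValueError("modified_fraction must be between 0 and 1")
    return dataset_gb + dataset_gb * modified_fraction


if __name__ == "__main__":
    print(peak_memory_gb(8, 1.0))   # worst case: everything rewritten
    print(peak_memory_gb(8, 0.25))  # a quarter of the data rewritten
```

A read-heavy workload keeps `modified_fraction` small, so the extra memory needed is usually well below the worst case.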

The snapshot process is as follows:

  1. Redis calls fork to create a copy (the child process) of the current process (the parent process).
  2. The parent process continues to receive and process commands from clients, while the child process starts writing the data in memory to a temporary file on the hard disk.
  3. When the child process has finished writing all the data, it replaces the old RDB file with the temporary file, and the snapshot operation is complete. (Note: there will be a command compression cache to record the operation when writing to the RDB file.)

When fork is executed, the operating system uses a copy-on-write policy. That is, at the moment fork occurs, the parent and child processes share the same memory. When the parent process changes a piece of data (for example, when executing a write command), the operating system first copies that data so that the child process is not affected. The new RDB file therefore stores the in-memory data as it was at the moment of the fork.
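The three steps and the point-in-time behavior can be sketched in Python on a Unix-like system. This is only an illustration under stated assumptions: `fork_snapshot` and `demo_snapshot` are hypothetical names, JSON stands in for the real RDB binary format, and the parent waits for the child here only so the result can be inspected.

```python
import json
import os
import tempfile
from typing import Dict, Tuple


def fork_snapshot(data: Dict[str, int], target_path: str) -> int:
    """Fork a child that dumps `data` to a temp file, then swaps it in."""
    pid = os.fork()
    if pid == 0:
        # Child process: it sees `data` exactly as it was at fork time.
        status = 1
        try:
            directory = os.path.dirname(target_path) or "."
            fd, tmp_path = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w") as tmp_file:
                json.dump(data, tmp_file)
                tmp_file.flush()
                os.fsync(tmp_file.fileno())
            # Atomic rename: readers see the old file or the new one,
            # never a half-written file.
            os.replace(tmp_path, target_path)
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    return pid  # parent process: returns immediately


def demo_snapshot() -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (data stored in the snapshot, live data in the parent)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "dump.json")
        data = {"visits": 1}
        pid = fork_snapshot(data, path)
        data["visits"] = 2  # parent keeps handling writes after the fork
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError("snapshot child failed")
        with open(path) as snapshot_file:
            saved = json.load(snapshot_file)
    return saved, data


if __name__ == "__main__":
    print(demo_snapshot())
```

The write made by the parent after the fork does not appear in the snapshot, because the child works on its own view of memory.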

Redis does not modify the RDB file during the snapshot process. It replaces the old file with the new one only after the snapshot is completed. In other words, the RDB file is complete at any time. This enables Redis database backups by periodically backing up RDB files.
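Because the RDB file on disk is always a complete snapshot, a periodic backup can be as simple as copying it under a timestamped name. The sketch below assumes a hypothetical helper `backup_rdb` and a `dump-<UTC timestamp>.rdb` naming scheme; scheduling (for example with cron) is left out.

```python
import os
import shutil
import time
from typing import Optional


def backup_rdb(rdb_path: str, backup_dir: str,
               now: Optional[float] = None) -> str:
    """Copy the RDB file into backup_dir under a timestamped name.

    `now` is a Unix timestamp; when omitted, the current time is used.
    """
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    target = os.path.join(backup_dir, f"dump-{stamp}.rdb")
    shutil.copy2(rdb_path, target)  # copies file content and metadata
    return target
```

Called once an hour and once a day with different backup directories, a helper like this would produce the hourly and daily archives described in the advantages section.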

Snapshot process compression analysis:

RDB files use a compressed binary format (as mentioned above, rdbcompression can be set to disable compression to save CPU), so they usually take up less space than the data does in memory, which makes them more convenient to transfer.
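The space saving is easy to see on repetitive data. The sketch below uses zlib from the Python standard library purely as a stand-in: Redis itself compresses string values with LZF, so the actual ratios will differ. The sample payload and the function name `compression_ratio` are made up for illustration.

```python
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(zlib.compress(payload)) / len(payload)


# Hypothetical, highly repetitive value of the kind caches often hold.
SAMPLE_VALUE = b"user:1000:name=alice;" * 1000

if __name__ == "__main__":
    print(f"{compression_ratio(SAMPLE_VALUE):.4f}")
```

Data that is already compressed or random gains little, which is the case where spending CPU on compression is least worthwhile.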

Snapshot loading process:
  • When Redis starts, it reads the RDB snapshot file and loads the data from the hard disk into memory. This time varies depending on the size and structure of the data and server performance. It typically takes 20 to 30 seconds to load a 1GB snapshot file of 10 million string keys into memory.

  • With RDB persistence, if Redis exits unexpectedly, all data changed since the last snapshot is lost. Developers therefore need to tune the automatic snapshot conditions to the application so that the possible data loss stays within an acceptable range. If the data is too important to afford any loss, consider using AOF for persistence.
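To reason about how much data is at risk, one can estimate how long it takes for the next automatic snapshot to trigger under a steady write rate. The sketch below is a simplification with assumed names (`seconds_until_snapshot`, `writes_per_second`): it treats every write as one changed key and ignores the time the snapshot itself takes.

```python
from typing import Sequence, Tuple

# (seconds, changes) pairs matching the three default rules quoted earlier.
DEFAULT_RULES: Tuple[Tuple[int, int], ...] = ((900, 1), (300, 10), (60, 10000))


def seconds_until_snapshot(writes_per_second: float,
                           rules: Sequence[Tuple[int, int]] = DEFAULT_RULES) -> float:
    """Approximate time until the first save rule fires.

    A rule (seconds, changes) fires once BOTH the time window has elapsed
    and enough changes have accumulated, i.e. at
    max(seconds, changes / writes_per_second). The earliest rule wins.
    """
    if writes_per_second <= 0:
        return float("inf")  # no writes, so no rule ever fires
    return min(
        max(seconds, changes / writes_per_second)
        for seconds, changes in rules
    )


if __name__ == "__main__":
    for rate in (0.01, 1, 1000):
        print(rate, seconds_until_snapshot(rate))
```

Under these assumptions, a crash just before the snapshot loses roughly that many seconds of writes: about 300 seconds at one write per second, and about 60 seconds at a thousand writes per second.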

Advantages and disadvantages of RDB

Advantages:

  1. Suitable for large-scale data recovery.
  2. RDB is a good choice if the business does not have strict requirements for data integrity and consistency.

Disadvantages:

  1. Data integrity and consistency are not high, because Redis may go down before the latest backup completes, losing the changes made since the previous snapshot.
  2. The backup takes up extra memory because Redis forks a separate child process during the backup and writes data to a temporary file (in the worst case, memory usage during this time can approach twice the size of the dataset), and then replaces the backup file with the temporary file.
  3. Because RDB relies on fork to persist data in a child process, a large dataset can cause the server to stop serving clients for hundreds of milliseconds, or even a second. (Write back and overwrite using the main process).

Choosing between RDB and AOF (AOF is not covered in this article, but it is mentioned here for reference)

  • If the system is willing to sacrifice some performance for higher cache consistency, choose AOF.

  • If writes are frequent and you want higher performance, do not enable automatic backups; instead, take RDB backups by running save manually.

Redis allows both AOF and RDB to be enabled at the same time, which keeps the data safe while still making backups and similar operations easy. After a restart, Redis uses the AOF file to recover data, because AOF persistence typically loses less data.

Conclusion

  • Redis enables RDB persistence by default. If a specified number of write operations are performed within a specified period of time, data in the memory is written to the disk.

  • RDB persistence is suitable for large-scale data recovery but has poor data consistency and integrity.

  • AOF persistence must be enabled manually. By default, the log of write operations is appended to the AOF file once per second.

Given these costs, it is reasonable to schedule Redis persistence and data recovery during off-peak hours, such as late at night.