Preface

  • This article was first published on the WeChat official account [Code Ape Technology Column]: [Redis every day, what do you know about the persistence plan?](https://mp.weixin.qq.com/s?__biz=MzI0ODYzMzIwOA==&mid=2247484106&idx=1&sn=f2dc239478b6481e52e9bbe94687d662&chksm=e99c80dddeeb09cbc241f4e0fefab7ffee7639338de7eb8809455ec91f42ab961f96e16a9c75&scene=126&sessionid=1587346810&key=087e4453ab2c10b91659b80845b2b74c5fb001ce66f3d4e0ddb18cb8bbb391009c4518efb34e1fde3a28f36fd6605ce0fa5b5a2f2ca2e399d976d46d28626c1bfeb90a361de7cc442db9da177701e750&ascene=1&uin=MTA3MjI0MTk2&devicetype=Windows+10&version=62080079&lang=zh_CN&exportkey=AarKwCVq4htSp1dFaUsiQC4%3D&pass_ticket=vpBJqr5QTpXblOduck%2BE8iwsQJsQiFnM5ZRMsNTec0M%3D)
  • Redis has become a mainstream in-memory database, but most people just use it. Do you really understand the inner workings of Redis?
  • This article will introduce two methods of Redis persistence from the following five aspects:

    1. **What is RDB and how does RDB persist?**
    2. **What is AOF and how does AOF persist?**
    3. **The difference between AOF and RDB.**
    4. **How is data restored on restart?**
    5. **Persistence performance issues and solutions.**

RDB

  • RDB persistence generates a snapshot of the current process's data and saves it to disk. It can be triggered manually or automatically.
  • When an RDB save completes, a file is generated in the directory configured by ‘dir’, with the name specified by ‘dbfilename’.
  • By default, Redis compresses the generated RDB file with the LZF algorithm. The compressed file is far smaller than the in-memory data, which is why compression is enabled by default.

Manual trigger

  • The manual trigger commands are ‘save’ and ‘bgsave’.
  • ‘save’: blocks the Redis server until the RDB process completes, so it is not recommended in production.
  • ‘bgsave’: Redis forks a child process that performs the RDB operation, so blocking only occurs during the fork phase, which is usually short.

Automatic trigger

  • In addition to manual triggers, the Redis server triggers RDB automatically in the following scenarios:

    1. According to the ‘save m n’ configuration rules: ‘bgsave’ is triggered when the data set has been modified n times within m seconds.
    2. When a slave node performs a full replication, the master node automatically runs ‘bgsave’ to generate an RDB file and sends it to the slave node.
    3. When you run the ‘debug reload’ command to reload Redis, the save operation is also triggered automatically.
    4. When ‘shutdown’ is executed and AOF persistence is not enabled, RDB persistence is performed automatically by default.
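
  • The ‘save m n’ rules can be sketched as follows. This is a minimal model, not Redis source code: the class name ‘SavePointChecker’ and its methods are hypothetical, and the caller is assumed to pass in the current time.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SavePointChecker:
    """Toy model of 'save m n' rules, stored as (seconds, changes) pairs."""

    rules: list                      # e.g. [(900, 1), (300, 10), (60, 10000)]
    dirty: int = 0                   # writes since the last snapshot
    last_save: float = field(default_factory=time.time)

    def record_write(self, count: int = 1) -> None:
        self.dirty += count

    def should_bgsave(self, now: float) -> bool:
        elapsed = now - self.last_save
        # One satisfied rule is enough to trigger a background snapshot.
        return any(elapsed > seconds and self.dirty >= changes
                   for seconds, changes in self.rules)

    def mark_saved(self, now: float) -> None:
        # After a successful snapshot the change counter starts again.
        self.dirty = 0
        self.last_save = now
```

  • With the rules ‘[(900, 1), (60, 10000)]’, a single write is only snapshotted after 900 seconds, while 10000 writes are snapshotted after 60 seconds.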

RDB execution process

  • The main method of RDB is ‘bgsave’. [RDB execution process](https://p1-jj.byteimg.com/tos-cn-i-t2oaga2asx/gold-user-assets/2020/4/20/1719540935186502~tplv-t2oaga2asx-image.image)
  • The figure above shows the RDB execution process, which is as follows:

    1. When ‘bgsave’ is executed, the parent process first checks whether an AOF or RDB child process is already running. If one is, ‘bgsave’ returns immediately.
    2. The parent process forks a child process. The parent is blocked while the fork is in progress.
    3. After the fork completes, the child process generates a temporary snapshot file from the parent process's memory and replaces the original RDB file once it is done. Run the ‘lastsave’ command to view the time of the last RDB save.
    4. When the child process finishes, it sends a signal to the parent process, which updates the statistics.
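
  • The fork-then-replace flow above can be sketched in a few lines. This is only an illustration, assuming a POSIX system where ‘os.fork’ is available. JSON stands in for the binary RDB format, and the function names ‘bgsave’ and ‘wait_for_child’ are hypothetical.

```python
import json
import os
import tempfile


def bgsave(data: dict, rdb_path: str) -> int:
    """Fork a child that dumps `data` to a temp file, then renames it."""
    pid = os.fork()  # the parent is blocked only while fork() itself runs
    if pid == 0:
        # Child: sees memory as it was at fork time (copy-on-write).
        status = 1
        try:
            directory = os.path.dirname(rdb_path) or "."
            fd, tmp_path = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w") as tmp:
                json.dump(data, tmp)
                tmp.flush()
                os.fsync(tmp.fileno())
            os.replace(tmp_path, rdb_path)  # atomically swap in the snapshot
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code path
    return pid


def wait_for_child(pid: int) -> bool:
    """Parent side: reap the child and report whether the snapshot succeeded."""
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

  • Writes made by the parent after ‘bgsave’ returns do not appear in the snapshot, because the child works on its own copy of the memory pages.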

The advantages of RDB

  • An RDB file is a compact binary file that represents a snapshot of Redis data at a point in time. It is suitable for backup and full replication scenarios. For example, run a ‘bgsave’ backup every 6 hours and copy the RDB file to a remote machine or file system for disaster recovery.
  • Redis restores data from ‘RDB’ much faster than from ‘AOF’.

The disadvantage of RDB

  • RDB cannot provide ‘real-time persistence’ or ‘second-level persistence’. Because ‘bgsave’ forks a child process every time it runs, it is a heavyweight operation that is too expensive to execute frequently.
  • RDB files are saved in a specific binary format. The format has gone through multiple versions as Redis has evolved, so an old Redis server may not be able to read an RDB file written in a newer format.
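
  • The compatibility problem can be illustrated with the file header. The sketch below relies on the RDB header layout, in which the file starts with the five bytes ‘REDIS’ followed by a four-digit ASCII version number. The function names and the ‘max_supported’ parameter are hypothetical.

```python
def rdb_version(header: bytes) -> int:
    """Return the format version from the first 9 bytes of an RDB file."""
    if len(header) < 9 or header[:5] != b"REDIS":
        raise ValueError("not an RDB file")
    return int(header[5:9].decode("ascii"))


def can_load(header: bytes, max_supported: int) -> bool:
    """A server can only read formats up to the newest one it knows about."""
    return rdb_version(header) <= max_supported
```

  • For example, ‘can_load(b"REDIS0009", max_supported=7)’ is false: a server that only understands format version 7 has to reject a version 9 file.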

AOF

  • AOF (append only file) persistence records each write command in an independent log and re-executes the commands in the AOF file on restart to recover data. The main role of AOF is to solve the real-time problem of data persistence, and it has become the ‘mainstream way’ of Redis persistence.
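
  • The log-and-replay idea can be sketched as follows. This is a toy model under stated assumptions: one space-separated command per line instead of the real protocol format, values without spaces, and only ‘set’ and ‘del’ supported. The function names are hypothetical.

```python
def append_command(aof_path: str, *args: str) -> None:
    """Append one write command to the log, one command per line."""
    with open(aof_path, "a", encoding="utf-8") as aof:
        aof.write(" ".join(args) + "\n")


def replay(aof_path: str) -> dict:
    """Rebuild the data set by re-executing every logged command in order."""
    data = {}
    with open(aof_path, encoding="utf-8") as aof:
        for line in aof:
            parts = line.split()
            if not parts:
                continue
            name, *rest = parts
            if name == "set":
                key, value = rest
                data[key] = value
            elif name == "del":
                for key in rest:
                    data.pop(key, None)
    return data
```

  • Read commands never reach the log, so replaying it reproduces only the final state of the data.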

How to enable AOF

  • AOF is disabled by default. To enable it, set ‘appendonly yes’. The AOF filename is set through the ‘appendfilename’ configuration, and the default filename is ‘appendonly.aof’. The save path is the same as the RDB persistence path and is specified through the ‘dir’ configuration.

AOF overall execution process

  • The AOF execution process can be roughly divided into four steps: ‘command writing’, ‘file synchronization’, ‘file rewriting’ and ‘restart loading’, as shown below: [AOF execution process](https://p1-jj.byteimg.com/tos-cn-i-t2oaga2asx/gold-user-assets/2020/4/20/1719540935bbcea6~tplv-t2oaga2asx-image.image)
  • From the figure above, we have a general understanding of the AOF execution process. The following four steps are analyzed one by one.

Command writing

  • AOF commands are written directly in the text protocol format. For example, the command ‘set hello world’ appends the following text to the AOF buffer:

    *3\r\n$3\r\nset\r\n$5\r\nhello\r\n$5\r\nworld\r\n

  • Commands are written to the AOF buffer rather than directly to the file. Why? Redis responds to commands with a single thread, and if every command were appended directly to the file on disk, performance would depend entirely on the current disk load. Writing to the buffer ‘aof_buf’ first has another benefit: Redis can provide multiple buffer synchronization strategies to balance performance and safety.
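
  • The text protocol encoding described above is simple enough to sketch. The function name ‘encode_command’ is hypothetical. The layout is an array header with the number of arguments, followed by one length-prefixed bulk string per argument.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]          # array header: number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        # Each argument: "$<byte length>\r\n<bytes>\r\n"
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


aof_buf = bytearray()                          # in-memory buffer, not the file
aof_buf += encode_command("set", "hello", "world")
print(bytes(aof_buf))
# b'*3\r\n$3\r\nset\r\n$5\r\nhello\r\n$5\r\nworld\r\n'
```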

File synchronization

  • Redis provides multiple synchronization policies for flushing the AOF buffer to the file, controlled by the ‘appendfsync’ parameter:

    • When set to ‘always’, the AOF file is synchronized to disk on every write. On ordinary SATA disks, Redis can then only sustain a few hundred write TPS, which runs counter to the high performance of Redis, so this setting is not recommended.
    • When set to ‘no’, the operating system decides when to synchronize the AOF file. The interval is uncontrollable and the amount of data synchronized each time grows. This improves performance, but data safety cannot be guaranteed.
    • When set to ‘everysec’ (the default configuration), the file is synchronized once per second. This is the recommended policy, balancing performance and data safety. In theory, only 1 second of data is lost in a sudden system outage (strictly speaking, this figure is not exact).
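
  • The three policies differ only in when ‘fsync’ is called. The sketch below is a toy writer, not how Redis is implemented: the class ‘AofWriter’ is hypothetical, ‘flush’ is assumed to be called once per event-loop iteration, and the ‘everysec’ sync runs inline here for simplicity.

```python
import os
import time
from typing import Optional


class AofWriter:
    """Toy AOF writer: buffer commands, then write and fsync by policy."""

    def __init__(self, path: str, appendfsync: str = "everysec") -> None:
        if appendfsync not in ("always", "everysec", "no"):
            raise ValueError(f"unknown appendfsync policy: {appendfsync}")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = appendfsync
        self.aof_buf = bytearray()
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0  # kept only to make the policies observable

    def feed(self, command: bytes) -> None:
        self.aof_buf += command  # commands only touch memory here

    def flush(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        wrote = bool(self.aof_buf)
        if wrote:
            os.write(self.fd, bytes(self.aof_buf))  # reaches the OS page cache
            self.aof_buf.clear()
        if self.policy == "always" and wrote:
            self._fsync(now)                        # sync after every write
        elif self.policy == "everysec" and now - self.last_fsync >= 1.0:
            self._fsync(now)                        # sync at most once a second
        # "no": the operating system decides when data reaches the disk

    def _fsync(self, now: float) -> None:
        os.fsync(self.fd)
        self.last_fsync = now
        self.fsync_calls += 1

    def close(self) -> None:
        os.close(self.fd)
```

  • Data that has been written but not yet synced is what can be lost in an outage, which is why ‘everysec’ bounds the loss at roughly one second.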

File rewriting mechanism

  • As commands keep being written to the AOF, the file gets bigger and bigger. To solve this problem, Redis introduced AOF rewriting to reduce the file size. AOF file rewriting is the process of converting the data in the Redis process into write commands and synchronizing them to a new AOF file.
  • **Why rewrite the file?** Because rewriting makes the AOF file smaller, Redis can load it faster.
  • The rewriting process is divided into manual triggering and automatic triggering.

    • Manual trigger: run the ‘bgrewriteaof’ command directly.
    • Automatic trigger: the timing is determined by the ‘auto-aof-rewrite-min-size’ and ‘auto-aof-rewrite-percentage’ parameters.
  • ‘auto-aof-rewrite-min-size’: the minimum AOF file size at which a rewrite may run. The default is 64MB.
  • ‘auto-aof-rewrite-percentage’: the ratio of the current AOF file size (‘aof_current_size’) to the AOF file size after the last rewrite (‘aof_base_size’).
  • **Automatic trigger condition: aof_current_size > auto-aof-rewrite-min-size && (aof_current_size - aof_base_size) / aof_base_size >= auto-aof-rewrite-percentage**. ‘aof_current_size’ and ‘aof_base_size’ can be viewed in the ‘info persistence’ statistics.
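
  • The trigger condition can be written as a small function. This is a sketch under stated assumptions: the function name ‘should_rewrite’ is hypothetical, the percentage is interpreted as growth in percent (so 100 means the file has doubled), and a base size of 0 is treated as 1 to avoid dividing by zero.

```python
def should_rewrite(aof_current_size: int,
                   aof_base_size: int,
                   min_size: int = 64 * 1024 * 1024,
                   percentage: int = 100) -> bool:
    """Return True when an automatic AOF rewrite would be triggered."""
    if aof_current_size <= min_size:
        return False                      # file is still too small to bother
    base = aof_base_size or 1             # assumption: avoid division by zero
    # (current - base) / base >= percentage / 100, in integer arithmetic
    return (aof_current_size - base) * 100 >= percentage * base
```

  • For example, with the defaults used here, a file that was 64MB after the last rewrite triggers a new rewrite once it reaches 128MB.
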
  • **Why does the rewritten AOF file become smaller?** There are several reasons:

    1. Data that has expired in the process is not written to the new AOF file.
    2. The old AOF file contains invalid commands, such as ‘del key1’ and ‘hdel key2’. The rewrite is generated directly from in-process data, so the new AOF file retains only the write commands for the final data.
    3. Multiple write commands can be merged into one. For example, ‘lpush list a’, ‘lpush list b’, and ‘lpush list c’ can be converted into ‘lpush list a b c’. To prevent client buffer overflow caused by a single large command, operations on the ‘list’, ‘set’, ‘hash’, and ‘zset’ types are split into multiple commands at a boundary of 64 elements.
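
  • The merging and the 64-element boundary can be sketched for a list. The function ‘rewrite_list’ is hypothetical. It emits ‘rpush’ rather than ‘lpush’ so that replaying the commands restores the elements in their in-memory order, which is a choice made for this sketch.

```python
ITEMS_PER_CMD = 64  # boundary mentioned above


def rewrite_list(key: str, values: list) -> list:
    """Turn the final in-memory list into as few write commands as allowed."""
    commands = []
    for start in range(0, len(values), ITEMS_PER_CMD):
        chunk = values[start:start + ITEMS_PER_CMD]
        commands.append(["rpush", key, *chunk])  # one command per 64 elements
    return commands
```

  • A list of 130 elements that was built by 130 separate pushes is rewritten as three commands, carrying 64, 64, and 2 elements.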
  • The file rewriting process is shown below: [File rewriting process](https://p1-jj.byteimg.com/tos-cn-i-t2oaga2asx/gold-user-assets/2020/4/20/1719540936b02496~tplv-t2oaga2asx-image.image)
  • After looking at the figure above, we have a general understanding of the process of file rewriting. For the process of rewriting, we can add the following:

    1. During the rewrite, the main thread is not blocked. It keeps executing other commands and keeps writing data to the old AOF file, which preserves the integrity of the existing file and ensures that no data is lost if the rewrite fails.
    2. To prevent the new file from missing data written during the rewrite, those writes are also kept in a buffer reserved for the rewrite and are written to the new file when the child process finishes.
    3. The rewrite generates commands directly from the data currently in memory. It does not need to read the old AOF file to analyze and merge commands.
    4. AOF files use the ‘text protocol’ directly, mainly because it offers good compatibility, is easy to append to, and is highly readable, so the file can be edited and repaired by hand.
    5. Both ‘RDB’ and ‘AOF’ write a temporary file first and then rename it to complete the replacement.
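
  • The points above (generate from memory, append the buffered writes, then rename) can be combined into one sketch. It is deliberately simplified: everything runs in a single process, commands are plain text lines, and the names ‘rewrite_aof’ and ‘rewrite_buf’ are hypothetical.

```python
import os
import tempfile


def rewrite_aof(snapshot: dict, rewrite_buf: list, aof_path: str) -> None:
    """Write a compact AOF to a temp file, then atomically replace the old one."""
    directory = os.path.dirname(aof_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-rewriteaof-")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        # 1. One command per key, generated from memory, not from the old log.
        for key, value in snapshot.items():
            tmp.write(f"set {key} {value}\n")
        # 2. Commands that arrived while the rewrite was running.
        for command in rewrite_buf:
            tmp.write(command + "\n")
        tmp.flush()
        os.fsync(tmp.fileno())
    # 3. Rename last: if anything above fails, the old file is untouched.
    os.replace(tmp_path, aof_path)
```

  • Because the rename is the final step, a reader of ‘aof_path’ sees either the complete old file or the complete new file, never a partially written one.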

The advantages of AOF

  • Using AOF persistence makes Redis very durable: you can choose between different fsync policies, such as no fsync, fsync every second, or fsync every time a write command is executed. The default AOF policy is fsync once per second. With this configuration, Redis still performs well and loses at most about one second of data in the event of an outage (fsync is performed in a background thread, so the main thread can continue to process command requests).

The disadvantage of AOF

  • AOF files are usually larger than RDB files for the same data set. Depending on the fsync strategy used, AOF may be slower than RDB. Fsync per second performance is still very high under normal conditions, and turning off fsync allows the AOF to be as fast as the RDB, even under high loads. However, RDB can provide a more guaranteed maximum latency when dealing with large write loads.
  • The data recovery speed is slower than that of RDB.

The difference between AOF and RDB

  • RDB persistence writes snapshots of the in-memory data set to disk at specified intervals. In practice, Redis forks a child process that first writes the data set to a temporary file. After the data is written successfully, the temporary file replaces the original file and is stored in compressed binary form.
  • AOF persistence records every write and delete operation processed by the server in the form of a log. Query operations are not recorded. The log is stored as text, so you can open the file to see the detailed operation records.

Restart loading

  • Both RDB and AOF can be used to restore data when the server restarts. The process is as follows: [Restart loading process](https://p1-jj.byteimg.com/tos-cn-i-t2oaga2asx/gold-user-assets/2020/4/20/17195409372562d5~tplv-t2oaga2asx-image.image)
  • The figure above shows how Redis restores data at startup. It first checks whether AOF is enabled and the AOF file exists, and loads the AOF file if so. Otherwise, it checks whether an RDB file exists and loads that instead.
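
  • The order of the checks can be sketched as a small decision function. It follows the order described above and nothing more. The function name and its return values are hypothetical.

```python
def choose_load_source(aof_enabled: bool, aof_exists: bool, rdb_exists: bool) -> str:
    """Decide which persistence file to load at startup."""
    if aof_enabled and aof_exists:
        return "aof"   # AOF is checked first
    if rdb_exists:
        return "rdb"   # otherwise fall back to the snapshot
    return "none"      # nothing to restore: start with an empty data set
```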

Performance problems and solutions

  • From the analysis above, we know that both RDB snapshots and AOF rewrites require a fork, which is a heavyweight operation that blocks Redis. To avoid affecting the response time of the Redis main process, we need to keep this blocking as short as possible.
  • So how do you reduce the blocking caused by fork operations?

    1. Prefer physical machines, or virtualization technologies that support efficient fork operations.
    2. Limit the maximum memory available to each Redis instance. Fork time is proportional to the amount of memory, so in production it is recommended to keep each Redis instance within 10GB.
    3. Configure the Linux memory allocation policy appropriately to avoid fork failures caused by insufficient physical memory.
    4. Reduce the frequency of fork operations, for example by moderately relaxing the automatic AOF rewrite trigger and by avoiding unnecessary full replication.
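
  • One rough way to see why fork time grows with memory: fork does not copy the data itself, but it does copy the parent's page table. The estimate below assumes 4 KiB pages and 8-byte page table entries and ignores the upper levels of the table. The function name is hypothetical.

```python
def page_table_bytes(memory_bytes: int,
                     page_size: int = 4096,
                     entry_size: int = 8) -> int:
    """Estimate the size of the page table that fork() has to copy."""
    pages = -(-memory_bytes // page_size)  # ceiling division
    return pages * entry_size


GIB = 1024 ** 3
MIB = 1024 ** 2
print(page_table_bytes(10 * GIB) // MIB)  # 20
```

  • Under these assumptions, a 10GB instance has to copy about 20MB of page table on every fork, and the cost scales linearly with instance size.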

Conclusion

  • This article introduced two different strategies for Redis persistence. Most of the content is aimed at operations staff, but back-end developers need to understand it too. After all, in small companies one person often handles the whole stack, haha.
  • **If you think Chen wrote this well and you gained something from it, please follow and share. Your attention is the biggest motivation for Chen's writing. Thank you for your support!**