What is persistence
Persistence is the mechanism of saving data to permanent storage media and restoring that saved data at a later point in time.
Why persist
To prevent accidental data loss and keep data safe
What does persistence store
- The current data state, saved as a snapshot. This stores the resulting data; the storage format is simple and the focus is on the data (RDB)
- The operations performed on the data, saved as a log. This stores the process that produced the data; the storage format is complex and the focus is on the operations (AOF)
RDB
1. RDB startup modes
1. The save command
Run the save command to manually perform one save operation.
2. Configuration related to the save command
- dbfilename dump.rdb Description: sets the file name of the local database; the default is dump.rdb. Experience: usually set to dump-<port number>.rdb
- dir Description: sets the path where the .rdb file is stored. Experience: usually set to dump-<port number>
- rdbcompression yes Description: sets whether data is compressed when it is stored to the local database; LZF compression is used. Experience: usually left enabled, which is the default. Setting it to no saves CPU time but makes the stored file much larger.
- rdbchecksum yes Description: sets whether the RDB file is checksummed; the check runs both when the file is written and when it is read. Experience: enabled by default
Note: the save command blocks the Redis server until the RDB process completes, which can cause long stalls, so it is not recommended in production environments.
3. The bgsave command
Command: bgsave. Function: manually starts a save operation in the background; it is not necessarily executed immediately. The bgsave command works as follows: Redis forks a child process that writes the RDB file, while the main process continues to serve clients.
Note: bgsave is an optimization for the blocking problem of save. All RDB operations inside Redis use bgsave, so the save command can be abandoned.
4. Configuration related to the bgsave command
- stop-writes-on-bgsave-error yes Description: sets whether Redis stops accepting writes when an error occurs during a background save. Experience: enabled by default
5. The save configuration: automatic saving (runs bgsave)
- Configuration
save <seconds> <changes>
- Effect: when the number of key changes within the given time range reaches the given count, Redis persists the data
- Parameters: seconds is the monitoring time range; changes is the number of key changes being monitored
- Location: set in the conf file
- Example
save 900 1
save 300 10
save 60 10000
Note: 1. Set the save rules according to the actual workload. A frequency that is too high or too low can cause performance problems, and the result may be disastrous. 2. When a save rule is triggered, the operation that runs is bgsave.
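The rule check described above can be sketched in a few lines of Python. This is only an illustration, not Redis source code: the class `SavePoints` and its method names are hypothetical, and it assumes a rule fires once at least `changes` keys have changed since the last save and more than `seconds` seconds have passed since that save.

```python
class SavePoints:
    """Hypothetical tracker for `save <seconds> <changes>` rules."""

    def __init__(self, rules, now=0.0):
        self.rules = rules        # list of (seconds, changes) pairs
        self.dirty = 0            # key changes since the last save
        self.last_save = now      # timestamp of the last save

    def record_write(self, count=1):
        """Count key changes as write commands arrive."""
        self.dirty += count

    def should_save(self, now):
        """True if any rule is satisfied (assumed semantics, see lead-in)."""
        elapsed = now - self.last_save
        return any(
            self.dirty >= changes and elapsed > seconds
            for seconds, changes in self.rules
        )

    def mark_saved(self, now):
        """Reset the counters after a background save has been started."""
        self.dirty = 0
        self.last_save = now
```

With the three sample rules, ten changes are not enough to trigger a save after 60 seconds (that rule needs 10000 changes), but they are enough once more than 300 seconds have passed.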
6. Comparison of the RDB startup modes (the save configuration runs bgsave)
Mode | save command | bgsave command |
---|---|---|
Read/write | synchronous | asynchronous |
Blocks client commands | yes | no |
Extra memory consumption | no | yes |
Starts a new process | no | yes |
2. Special ways RDB is triggered
- Full replication
This is explained in detail in the discussion of master-slave replication
- Reloading while the server is running
debug reload
- Saving data when shutting down the server
shutdown save
By default, running shutdown also saves an RDB snapshot automatically (if AOF persistence is not enabled). This save blocks like save rather than running in the background like bgsave.
3. Advantages and disadvantages of RDB
Advantages:
- An RDB file is a compact binary file with high storage efficiency
- RDB stores a snapshot of Redis data at a point in time, which suits data backup, full replication, and similar scenarios
- RDB recovers data much faster than AOF
- Application: run a bgsave backup on the server every X hours and copy the RDB file to a remote machine for disaster recovery.
Disadvantages:
- Whether it is triggered by a command or by configuration, RDB cannot persist in real time, so the chance of losing data is higher
- bgsave forks a child process every time it runs, which costs some performance
- The RDB file format has not been unified across Redis versions, so services running different versions may have incompatible data formats
AOF
- AOF (Append Only File) persistence: every write command is recorded in a separate log, and when Redis restarts, the commands in the AOF file are executed again to recover the data. Compared with RDB, it can be described simply as switching from recording the data to recording the process that produced the data
- AOF mainly addresses the real-time requirement of data persistence and has become the mainstream approach to Redis persistence
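The idea of recording commands and replaying them on restart can be illustrated with a toy log. This is a minimal sketch, not how Redis stores its AOF: the helpers `append_command` and `replay` are hypothetical, the log is an in-memory list rather than a file, and only `set`, `del`, and `lpush` are handled.

```python
def append_command(log, *args):
    """Record one write command in the log (a plain Python list here)."""
    log.append(tuple(args))


def replay(log):
    """Rebuild the data set by re-executing every logged command in order."""
    data = {}
    for command in log:
        op, key, values = command[0].lower(), command[1], command[2:]
        if op == "set":
            data[key] = values[0]
        elif op == "del":
            data.pop(key, None)
        elif op == "lpush":
            items = data.setdefault(key, [])
            for value in values:
                items.insert(0, value)   # lpush adds each value at the head
        else:
            raise ValueError(f"unsupported command in this sketch: {op}")
    return data
```

Replaying the log always walks every recorded command, which is why a long log takes longer to recover from than a snapshot of the final data.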
Three policies for writing AOF data (appendfsync)
- always (every write): every write operation is synced to the AOF file. No data is lost, but performance is low
- everysec (every second): the commands in the buffer are synced to the AOF file once per second. Data accuracy is high and performance is good. If the system suddenly goes down, up to 1 second of data is lost
- no (system controlled): the operating system decides when to sync to the AOF file. The overall process is not controllable
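The difference between the three policies comes down to when fsync is called after a write. The sketch below is a simplified illustration with a hypothetical `AofWriter` class: it checks the one-second interval inline on each append, whereas a real server would not depend on the next write arriving to trigger the sync.

```python
import os
import time


class AofWriter:
    """Hypothetical append-only writer illustrating the appendfsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_sync = time.monotonic()
        self.fsync_calls = 0      # counted only to make the behaviour visible

    def append(self, command, now=None):
        """Write one command (bytes), then sync according to the policy."""
        now = time.monotonic() if now is None else now
        os.write(self.fd, command)            # data reaches the OS buffer only
        due = self.policy == "everysec" and now - self.last_sync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)                 # force the buffer to disk
            self.fsync_calls += 1
            self.last_sync = now
        # policy "no": never call fsync; the operating system decides

    def close(self):
        os.close(self.fd)
```

Under always, every append pays for an fsync; under everysec, appends within the same second share one; under no, the writer never asks for a sync at all.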
Enabling AOF
- Configuration:
appendonly yes|no
- Effect: sets whether AOF persistence is enabled; it is disabled by default
- Configuration:
appendfsync always|everysec|no
- Effect: sets the AOF write policy
AOF-related configuration
- Configuration:
appendfilename filename
- Effect: sets the AOF persistence file name; the default is appendonly.aof. Recommended setting: appendonly-<port number>.aof
- Configuration:
dir
- Effect: sets the path where the AOF persistence file is stored; it is the same as the path for RDB persistence files
AOF rewrite
As commands keep being written to the AOF, the file grows larger and larger. To solve this, Redis introduced AOF rewriting to shrink the file. An AOF rewrite converts the data in the Redis process into write commands and syncs them to a new AOF file. Put simply, it replaces the several commands that were executed on the same data with the command that produces the final result, and records that instead.
What AOF rewriting does
- Reduces disk usage and improves disk utilization
- Improves persistence efficiency, lowers the time spent on persistent writes, and improves I/O performance
- Reduces data recovery time and improves recovery efficiency
AOF rewrite rules
- Data in the process that has already expired is no longer written to the file
- Invalid commands are ignored. The rewrite is generated directly from the data in the process, so the new AOF file keeps only the write commands for the final data. For example: del key1, hdel key2, srem key3, set key4 111, set key4 222
- Multiple write commands on the same data are merged into one. For example, lpush list1 a, lpush list1 b, lpush list1 c can become: lpush list1 a b c.
To avoid client buffer overflow caused by oversized commands, each command for the list, set, hash, and zset types writes at most 64 elements
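The rewrite rules above can be sketched as a function that generates commands from the final in-memory data instead of from the old log. This is an illustration under assumptions, not the Redis implementation: `rewrite_commands`, the `expires` mapping, and the choice of `rpush` (used here so that replaying the chunks in order reproduces the list order) are all choices made for this sketch, and only strings and lists are handled.

```python
ITEMS_PER_CMD = 64  # maximum elements per generated command, per the rule above


def rewrite_commands(data, expires=None, now=0.0):
    """Build a minimal command list that recreates `data`.

    `expires` maps key -> expiry timestamp; keys already expired at `now`
    are skipped, matching the first rewrite rule.
    """
    expires = expires or {}
    commands = []
    for key, value in data.items():
        if key in expires and expires[key] <= now:
            continue                              # expired data is not written
        if isinstance(value, list):
            # Split long lists so that no single command exceeds 64 elements.
            for start in range(0, len(value), ITEMS_PER_CMD):
                chunk = value[start:start + ITEMS_PER_CMD]
                commands.append(["rpush", key, *chunk])
        else:
            commands.append(["set", key, value])  # only the final value survives
    return commands
```

A key that was set twice produces a single set with its final value, and a list of 130 elements produces three commands holding 64, 64, and 2 elements.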
AOF rewrite methods
- Manual rewrite (run at the console):
bgrewriteaof
- Automatic rewriting:
auto-aof-rewrite-min-size size
auto-aof-rewrite-percentage percentage
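How the two automatic-rewrite settings combine can be sketched as follows. This reflects my understanding of the trigger condition and should be checked against the Redis version in use: the function name `should_auto_rewrite` is hypothetical, `current_size` stands for the present AOF size, and `base_size` stands for the AOF size after the previous rewrite.

```python
def should_auto_rewrite(current_size, base_size, min_size, percentage):
    """Return True if an automatic AOF rewrite would be triggered.

    Assumed rule: the file must be larger than auto-aof-rewrite-min-size,
    and its growth since the last rewrite, as a percentage of the size at
    that time, must reach auto-aof-rewrite-percentage.
    """
    if percentage == 0 or current_size <= min_size:
        return False                      # disabled, or file still too small
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

For example, with a 64 MB minimum and a percentage of 100, a file that has grown from 64 MB to 128 MB qualifies, while one that has grown from 60 MB to 100 MB does not.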
AOF automatic rewriting
AOF workflow
AOF rewrite process
The policy for syncing the AOF buffer to the file is controlled by the appendfsync parameter
The write and fsync system calls:
- write triggers a delayed write. Linux provides a page buffer in the kernel to improve disk I/O performance, and write returns as soon as the data is in that system buffer. Syncing to disk then depends on the system's scheduling, for example when the buffer page space is full or a set period has elapsed. If the system goes down before the sync, the data in the buffer is lost.
- fsync forces a sync to disk for a single file (such as the AOF file). fsync blocks until the data has been written to disk and only then returns, which guarantees that the data is persisted.
Differences between RDB and AOF
Persistence mode | RDB | AOF |
---|---|---|
Storage space used | Small (data level: compressed) | Large (command level: reduced by rewriting) |
Storage speed | Slow | Fast |
Recovery speed | Fast | Slow |
Data safety | Data may be lost | Depends on the policy |
Resource consumption | High/heavyweight | Low/lightweight |
Startup priority | Low | High |
Choosing between RDB and AOF
- If the application is very sensitive to data loss, the default AOF persistence scheme is recommended
- Use everysec as the AOF policy, so that fsync runs once per second. With this policy Redis still maintains good performance, and when a problem occurs at most 0-1 seconds of data is lost.
- Note: because AOF files are large, recovery is slower
- If data only needs to be valid as of a given stage, the RDB persistence scheme is recommended
- Data can reliably be kept without loss within a stage (the stages are maintained manually by developers or O&M staff), and recovery is fast. Restoring data to a stage point usually uses RDB
- Note: using RDB for tightly spaced persistence lowers Redis performance considerably.
- Overall comparison
- Choosing between RDB and AOF is really a trade-off; each has advantages and disadvantages
- If you cannot tolerate losing a few minutes of data and the business data is sensitive, choose AOF
- If you can tolerate losing a few minutes of data and want fast recovery of large data sets, choose RDB
- Use RDB for disaster recovery
- Double insurance: enable RDB and AOF together. After a restart, Redis loads from the AOF first, which reduces the amount of data lost