Redis serves external requests from data held in memory. If the data were stored only in memory, all of it would be lost whenever the system restarts.
1 Introduction to persistence
1.1 What is Persistence
Redis keeps all data in memory and saves updates to disk asynchronously. Persistence mainly serves disaster recovery and data restoration, so it can be regarded as part of high availability.
If Redis goes down, the priority is to make it available again as soon as possible.
Restarting Redis gets it serving again quickly, but without a data backup the restarted instance is empty. What can it serve?
A large number of requests may then arrive, miss the cache because Redis no longer holds the data, and cause a cache avalanche: every request falls through to the MySQL database, which suddenly has to absorb high concurrency and crashes.
Once MySQL is down, there is no data left to restore into Redis, because the data in Redis comes from MySQL in the first place.
With Redis persistence and a sound backup and recovery plan in place, you can restore from backup after a Redis failure and resume serving as soon as the data is recovered.
1.2 Persistence Mode
Redis provides two methods of persistence:
Redis RDB – Snapshot
RDB takes point-in-time snapshots of the data set at specified intervals, similar to a MySQL dump.
Redis AOF – Command log
AOF records every write command received by the server and replays those commands at startup to rebuild the original data set. Commands are logged in the same format as the Redis protocol itself, in append-only fashion. When the log grows too large, Redis can rewrite it in the background. This is similar to the MySQL binlog and the HBase HLog.
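To make "the same format as the Redis protocol" concrete, here is a minimal Python sketch that encodes one write command as a RESP array of bulk strings. The function name `encode_resp` is hypothetical, and a real AOF file can contain more than a plain sequence of such commands, so treat this as an illustration only.

```python
def encode_resp(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    # "*<n>" announces how many arguments follow.
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # "$<len>" announces the byte length of the next argument.
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


if __name__ == "__main__":
    print(encode_resp("SET", "hello", "world"))
    # b'*3\r\n$3\r\nSET\r\n$5\r\nhello\r\n$5\r\nworld\r\n'
```

Appending the bytes for each write command to a file, in order, gives a log that can be replayed from the beginning.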
If you want Redis to be used only as a pure memory cache, you can also disable RDB and AOF.
You can use both AOF and RDB in the same instance. In that case, Redis uses the AOF file to rebuild the data set on restart, because the data in the AOF is guaranteed to be the most complete.
The most important thing is to understand the different tradeoffs between RDB and AOF persistence.
2 RDB – Full write
The key-value pairs that a Redis server stores across its databases can be understood as one state of Redis. Each write moves Redis from one state to another. Full persistence writes all Redis data to disk at a point in time to form a snapshot. When Redis restarts, it can return to the last persisted state by loading the most recent snapshot.
2.1 Trigger Mode
The save command
SAVE can be triggered explicitly by a client or when Redis shuts down. SAVE runs serially in the main thread, so Redis may stall for a long time when the data set is large. Because no other command is executed during the backup, the data stays consistent while it is written.
If an old RDB file exists, the new one replaces it. The time complexity is O(N).
bgsave
BGSAVE can be triggered in three ways:
- Explicitly by a client
- Automatically by the configured save points
- By slave nodes in a master-slave architecture
When BGSAVE is executed, Redis forks a child process. As soon as the fork completes, Redis returns a response to the client, and the child performs the backup asynchronously in the background without affecting normal request handling.
At the moment the parent forks the child, the child gets a view of the memory state as of that instant. Changes made while the snapshot is being written are not reflected in the backup.
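Triggering through configuration instead of commands
In the default configuration of Redis, BGSAVE is triggered automatically when any of the following conditions is met, that is, when at least the given number of changes has occurred within the given number of seconds: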
configuration | seconds | changes |
---|---|---|
save | 900 | 1 |
save | 300 | 10 |
save | 60 | 10000 |
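The save-point rule in the table can be sketched in a few lines of Python. This is only an illustration of the "any condition met" logic: the function name `should_bgsave` and the exact boundary comparisons (`>=`) are assumptions, not taken from the Redis source.

```python
# (seconds, changes) pairs copied from the table above.
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(seconds_since_last_save, changes_since_last_save,
                  save_points=DEFAULT_SAVE_POINTS):
    """Return True if any save point is satisfied.

    A save point (seconds, changes) is satisfied when at least `changes`
    writes have accumulated and at least `seconds` have elapsed since the
    last successful save. The >= comparisons are an assumption.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(should_bgsave(61, 10000))  # busy instance: the 60 s rule fires
    print(should_bgsave(61, 9999))   # not enough changes for any rule yet
    print(should_bgsave(900, 1))     # quiet instance: the 900 s rule fires
```

The rules are ordered from "few changes, long wait" to "many changes, short wait", so a busy instance snapshots more often than an idle one.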
The advantage of BGSAVE over SAVE is asynchronous execution: the command does not block subsequent commands. However, forking the child process involves copying the parent's memory mappings, which increases memory overhead on the server. If the overhead is high enough that the system falls back on virtual memory, the fork itself blocks Redis and can make it unavailable for seconds. BGSAVE therefore requires the server to have enough free memory.
Command | save | bgsave |
---|---|---|
I/O type | synchronous | asynchronous |
Blocking | yes | no (blocks only during fork) |
Time complexity | O(N) | O(N) |
Advantages | Consumes no extra memory | Does not block client commands |
Disadvantages | Blocks client commands | Requires a fork and consumes extra memory |
RDB optimal configuration
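Disable automatic RDB snapshots. Here ${port} is a placeholder for the instance's port; Redis does not expand variables in its configuration file, so substitute the actual value:
save ""
dbfilename dump-${port}.rdb
dir /redisDataPath
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes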
Other triggers to be aware of
- The master node runs BGSAVE for full resynchronization during master-slave replication
- debug reload
- shutdown
- FlushDB, flushAll
RDB summary
- RDB is a snapshot of Redis memory written to disk for persistence
- SAVE usually blocks Redis
- BGSAVE does not block Redis, but forks a new process
- With automatic save configured, a snapshot runs as soon as any one of the save conditions is met
Advantages of RDB
- RDB can produce multiple data files, each representing all the data in Redis at one moment. This makes it ideal for cold backup: complete data files can be shipped to remote storage, such as ODPS distributed storage, to back up Redis periodically according to a predefined policy
- RDB has very little impact on the read and write service Redis provides, so Redis keeps its high performance: the main process only has to fork a child, and the child performs the RDB persistence
- Compared to AOF, it is faster to restart and restore Redis processes directly based on RDB files
RDB shortcomings
- Time-consuming: O(N)
- fork(): consumes memory under the copy-on-write policy
Each RDB snapshot forks a child process to generate the data file. If the data set is very large, service to clients may pause for milliseconds or even seconds
- Hard to control, and data is easily lost
RDB snapshots are generally taken every 5 minutes or at longer intervals. If Redis goes down in between, the data written since the last snapshot is lost
2.2 Restoration Process
When Redis restarts, it loads the previously persisted file from local disk. Once recovery is complete, it begins processing requests.
3 AOF (Append only File) – Incremental mode
RDB records the full data of a given state, whereas AOF records each write command. Replaying all the recorded write commands restores the final data state.
3.1 Write Process
AOF's three strategies
always
- Every buffer flush triggers a synchronization to disk. Because every write operation is synchronized, this policy reduces Redis throughput, but it offers the highest fault tolerance.
everysec
- Synchronization is triggered asynchronously once per second. This is the Redis default.
no
- The operating system decides when to synchronize. Redis cannot determine when data reaches the disk, so the timing is out of its control.
Comparison
Policy | always | everysec | no |
---|---|---|---|
Advantages | No data loss | One fsync per second; at most about 1 second of data is lost | Nothing to manage |
Disadvantages | High I/O overhead; a typical SATA disk handles only a few hundred TPS | Up to 1 second of data can be lost | Not controllable |
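The difference between the three policies comes down to when `fsync` is called after a write. The following Python sketch makes that visible by counting `fsync` calls; the class `AofWriter` and the helper `count_fsyncs` are hypothetical names. It is a simplification: here the `everysec` check runs inline on each append, whereas Redis performs that synchronization asynchronously.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only log writer illustrating the three fsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, payload: bytes) -> None:
        self.file.write(payload)
        self.file.flush()  # hand the data to the OS page cache
        now = time.monotonic()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.file.fileno())  # force the data onto the disk
            self.last_fsync = now
            self.fsync_calls += 1
        # policy "no": never fsync here, the OS flushes when it chooses

    def close(self) -> None:
        self.file.close()


def count_fsyncs(policy: str, writes: int) -> int:
    """Append `writes` records under `policy` and return the fsync count."""
    with tempfile.TemporaryDirectory() as tmp:
        writer = AofWriter(os.path.join(tmp, "appendonly.aof"), policy)
        for i in range(writes):
            writer.append(b"set key %d\r\n" % i)
        writer.close()
        return writer.fsync_calls


if __name__ == "__main__":
    for policy in ("always", "everysec", "no"):
        print(policy, count_fsyncs(policy, 1000))
```

With `always` the count equals the number of writes, which is where the I/O cost in the table comes from; with `no` it stays at zero.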
3.2 Playback Process
AOF playback also happens at startup. If an AOF file exists, Redis chooses incremental playback.
Because incremental persistence writes to disk continuously, its data is more complete than that of full persistence. Playback re-executes the commands stored in the AOF, after which Redis goes back to accepting new commands from clients.
AOF rewrite
As Redis keeps running, a large amount of incremental data is appended to the AOF file. To reduce disk usage and speed up recovery, Redis uses a rewrite mechanism to merge historical AOF records, as follows:
Native AOF
set hello world
set hello java
set hello hehe
incr counter
incr counter
rpush mylist a
rpush mylist b
rpush mylist c
(stale data)
AOF rewrite
set hello hehe
set counter 2
rpush mylist a b c
The role of AOF rewrite
- Reduces disk usage
- Speeds up recovery
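The native and rewritten logs above can be reproduced with a small Python sketch: replay the native log into an in-memory state, then emit one command per key from that state. The functions `replay` and `rewrite` are hypothetical and support only the three commands used in the example; this is a toy model, not how Redis implements the rewrite.

```python
def replay(commands):
    """Apply a list of text commands to an empty in-memory state."""
    state = {}
    for line in commands:
        op, key, *args = line.split()
        op = op.lower()
        if op == "set":
            state[key] = args[0]
        elif op == "incr":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "rpush":
            state.setdefault(key, []).extend(args)
        else:
            raise ValueError(f"unsupported command: {op}")
    return state


def rewrite(state):
    """Emit the minimal command list that rebuilds `state`."""
    compact = []
    for key, value in state.items():
        if isinstance(value, list):
            compact.append("rpush " + key + " " + " ".join(value))
        else:
            compact.append(f"set {key} {value}")
    return compact


native_aof = [
    "set hello world",
    "set hello java",
    "set hello hehe",
    "incr counter",
    "incr counter",
    "rpush mylist a",
    "rpush mylist b",
    "rpush mylist c",
]

if __name__ == "__main__":
    for command in rewrite(replay(native_aof)):
        print(command)
    # set hello hehe
    # set counter 2
    # rpush mylist a b c
```

Eight commands shrink to three, and replaying either log yields the same state, which is why the rewritten file is both smaller and faster to load.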
3.3 AOF rewriting is implemented in two ways
bgrewriteaof
AOF rewrite configuration
Configuration items and statistics
- auto-aof-rewrite-min-size: minimum AOF file size required for a rewrite
- auto-aof-rewrite-percentage: AOF file growth rate that triggers a rewrite
- aof_current_size: current AOF size, in bytes
- aof_base_size: AOF size when Redis last started or rewrote the file, in bytes
Automatic trigger conditions (both must hold)
aof_current_size > auto-aof-rewrite-min-size
(aof_current_size - aof_base_size) / aof_base_size > auto-aof-rewrite-percentage
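As a worked example of the two trigger conditions, the Python sketch below evaluates them for a few file sizes. The function name `should_rewrite_aof` is hypothetical; the defaults of 64 MB and 100 percent mirror the configuration values used later in this article, and the strict `>` comparisons follow the conditions as written above rather than the Redis source.

```python
MB = 1024 * 1024


def should_rewrite_aof(aof_current_size, aof_base_size,
                       min_size=64 * MB, percentage=100):
    """Return True when both automatic rewrite conditions hold.

    Sizes are in bytes; `percentage` is the growth threshold in percent.
    """
    if aof_current_size <= min_size:
        return False
    base = aof_base_size or 1  # guard against division by zero
    growth_percent = (aof_current_size - base) * 100 / base
    return growth_percent > percentage


if __name__ == "__main__":
    print(should_rewrite_aof(200 * MB, 64 * MB))  # grew by 212.5%
    print(should_rewrite_aof(100 * MB, 64 * MB))  # grew by only 56.25%
    print(should_rewrite_aof(32 * MB, 1 * MB))    # still below the minimum size
```

The minimum size keeps small files from being rewritten over and over, while the percentage ties rewrite frequency to how much the file has grown since the last rewrite.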
3.4 AOF rewrite process
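AOF rewrite configuration
Modify the configuration file as follows. Here ${port} is a placeholder for the instance's port and must be replaced with the actual value:
appendonly yes
appendfilename "appendonly-${port}.aof"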
appendfsync everysec
dir /opt/soft/redis/data
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
no-appendfsync-on-rewrite yes
The advantages of AOF
- Better protection against data loss
In general, AOF performs an fsync once per second in a background thread, so at most 1 second of data is lost
- The log is written in append-only mode
There is therefore no disk seek overhead, write performance is high, and the file is hard to corrupt. Even if the tail of the file is damaged, it is easy to repair
- Even when the log file grows too large and is rewritten in the background, client reads and writes are not affected
This is because the rewrite compacts the commands into the minimal log needed to restore the data. While the new log is being created, the old log file keeps being written as usual. When the new, merged log file is ready, the old and new files are swapped
- Commands are recorded in a very readable form
This makes AOF well suited to emergency recovery from catastrophic accidental deletions. For example, if someone empties all the data with flushall, then as long as no background rewrite has happened yet, you can copy the AOF file, delete the final flushall command from it, put the file back, and let the recovery mechanism restore all the data
Disadvantages of AOF
- For the same data, AOF log files are generally larger than RDB snapshot files
- With AOF enabled, the write QPS Redis can sustain is lower than with RDB, because AOF is typically configured to fsync the log once per second. Performance is still very high
- AOF has had bugs in the past where replaying the log did not reproduce exactly the same data
A command-log, merge, and playback approach such as AOF is more complex and therefore somewhat more bug-prone than RDB, which persists one complete snapshot at a time. To avoid bugs in the rewrite process, AOF rewrite does not merge the old command log. Instead, it rebuilds the commands from the data in memory at that moment, which is much more robust
4 Selection and best practice
Item | RDB | AOF |
---|---|---|
Startup priority | low | high |
File size | small | large |
Recovery speed | fast | slow |
Data safety | may lose data | depends on the fsync policy |
Weight | heavyweight | lightweight |
4.1 Optimal RDB Policy
- Turn off automatic RDB
- Manage RDB operations manually and centrally
- Enable automatic configuration on the slave node, but do not run RDB frequently
4.2 AOF optimal Strategy
- Enabling AOF is recommended, unless Redis is used purely as a cache
- Manage AOF rewrites centrally
- Use the everysec policy
4.3 Choice between RDB and AOF
- Do not use only RDB, because that can cause you to lose a lot of data
- Do not use only AOF either, because that has two problems:
- First, cold backups restored from AOF recover more slowly than cold backups restored from RDB
- Second, RDB simply generates a fresh snapshot each time, which is more robust and avoids the bugs that complex backup and recovery mechanisms such as AOF can have
- Use AOF and RDB in combination:
- Make AOF the first choice for data recovery, so that as little data as possible is lost
- Use RDB for cold backups at different intervals, so that data can still be recovered quickly when the AOF files are lost or damaged
4.4 Some best practices
- Small shards
For example, set the maxmemory parameter so that each Redis instance stores at most 4 GB, which keeps every operation from becoming too slow
- Monitoring (hard disk, memory, load, network)
- Enough memory
References
- redis.io/topics/pers…
- Redis Design and Implementation