Redis series: table of contents

Redis series – distributed lock

Redis series – Cache penetration, cache breakdown, cache avalanche

Why Is Redis so fast?

Redis series — Data Persistence (RDB and AOF)

Redis series – consistent hash algorithm

Redis series – High Availability (Master slave, Sentinel, Cluster)

Redis series – Transactions and optimistic locking

Redis series — Geospatial: Do you have Lao Wang next door?

Bitmaps: Did you check in today?

What is a Bloom filter?!

Databases (such as MySQL) and caches (such as Redis) have each borrowed the other's strengths to make up for their own shortcomings. MySQL, a persistent database, has used caching to speed up data access: with the query cache (available before MySQL 8.0), once an SQL query completes, MySQL derives a key from the SQL text and caches the query result under that key. If the same SQL is run again, the server returns the result straight from the cache, with no need to parse, optimize, or execute the statement. Redis, a cache, went the other way and added a persistence mechanism to solve the problem of data loss caused by downtime.

Redis supports two persistence mechanisms, RDB and AOF. Persistence prevents data loss when the process exits: on the next restart, the data can be recovered from the previously persisted file.

Today we are going to talk about Redis persistence. It is a frequent interview question, and also something you must consider in real production environments.

RDB (Redis DataBase)

By default, only RDB is enabled. An RDB file is a binary file.

How it works

RDB mode is also called snapshot mode. At a given trigger point, a snapshot of Redis's current memory is saved to the dump.rdb file on disk. The main command involved in this process is bgsave.

  1. bgsave is triggered when certain conditions are met.

  2. The parent Redis process checks whether a child process (such as an RDB or AOF child) is already running; if so, the bgsave command returns immediately.

  3. The parent process forks to create a child process. During the fork, the parent is blocked and cannot respond to client requests. The latest_fork_usec field in the output of info stats gives the time, in microseconds, spent on the most recent fork. Fork creates a copy of the current process: all of the new process's data (variables, environment variables, program counter, and so on) match the original's, but it is a new process and a child of the original.

  4. Once the fork completes, the bgsave command returns the "Background saving started" message; the parent is no longer blocked and can continue to respond to client commands.

  5. The child process builds the RDB file: it writes a temporary snapshot file from the memory it shares with the parent.

  6. The temporary file then atomically replaces the original RDB file. The lastsave command returns the time the last RDB was generated, which corresponds to the rdb_last_save_time field of info persistence.

  7. The child process exits.
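Steps 5 and 6 rely on a common pattern: write to a temporary file, then rename it over the old one. The sketch below is a minimal Python illustration of that pattern only, not Redis's implementation. It uses JSON as a stand-in for the RDB binary format and does not fork; the names save_snapshot and load_snapshot and the temporary file naming are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_snapshot(data: dict, target_path: str) -> None:
    """Write `data` to a temporary file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Keep the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)        # stand-in for the RDB binary encoding
            tmp.flush()
            os.fsync(tmp.fileno())      # push the bytes to disk before the swap
        os.replace(tmp_path, target_path)  # atomic replacement of the old snapshot
    except BaseException:
        # On failure, remove the partial temp file and leave the old snapshot intact.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by save_snapshot."""
    with open(path) as f:
        return json.load(f)
```

Because the old file is replaced only after the new one is fully written, a crash in the middle of a save leaves the previous snapshot readable.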

Configuration

################################ SNAPSHOTTING ################################
# If at least 1 key has changed within 900 seconds, take a snapshot
save 900 1
# If at least 10 keys have changed within 300 seconds, take a snapshot
save 300 10
# If at least 10000 keys have changed within 60 seconds, take a snapshot
save 60 10000
# Whether to stop accepting writes when the last background save failed
stop-writes-on-bgsave-error yes
# Whether to compress the RDB file (costs some CPU)
rdbcompression yes
# Whether to add a checksum when saving the RDB file and verify it when loading
rdbchecksum yes
# Name of the persistence file
dbfilename dump.rdb
# Directory where the RDB file is saved
dir ./

Triggering

RDB persistence can be triggered manually or automatically.

Manual trigger

After connecting to Redis with redis-cli, you can use the save and bgsave commands to trigger RDB persistence.

  • save: blocks the Redis server until the RDB process completes. On instances with a large memory footprint this causes a long block, so it is not recommended in production.

  • bgsave: the Redis process forks a child process, which carries out the RDB persistence and exits when it finishes. Blocking occurs only during the fork phase, which is usually very short.

bgsave was clearly designed to fix the blocking problem of save. Redis's own automatic RDB operations therefore take the bgsave approach, and the save command is best avoided in production.

Automatic trigger

Besides manual triggering by command, Redis also triggers RDB persistence automatically in these cases:

  • When a save m n rule from the configuration above is satisfied, Redis automatically runs bgsave.
  • When a slave performs a full copy from the master, it sends a request to the master, which runs bgsave and sends the resulting RDB file to the slave.
  • When the shutdown or flushall command is executed, Redis automatically triggers RDB persistence.
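The save m n check in the first bullet can be sketched as a small predicate. This is a simplified Python illustration rather than Redis's code: the names should_bgsave, dirty, and seconds_since_last_save are assumptions, and the comparison treats each rule as "at least n changes, and at least m seconds since the last save".

```python
from typing import List, Tuple

# (seconds, changes) pairs mirroring "save 900 1", "save 300 10", "save 60 10000"
SAVE_RULES: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(dirty: int, seconds_since_last_save: float,
                  rules: List[Tuple[int, int]] = SAVE_RULES) -> bool:
    """Return True if any save rule is satisfied.

    dirty: number of changes since the last successful save (hypothetical counter).
    """
    return any(
        dirty >= changes and seconds_since_last_save >= seconds
        for seconds, changes in rules
    )
```

For example, 10 changes are not enough to trigger a snapshot after 60 seconds, but they are after 300 seconds; any single rule being satisfied is sufficient.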

To disable RDB persistence, run config set save "" in redis-cli.

Data recovery

RDB files are restored automatically. In redis-cli, run config get dir to find Redis's working directory (the dir setting), move the backup file (dump.rdb) into that directory, and start the service. Redis then loads the file's data into memory automatically.

When Redis starts, it reads the RDB snapshot file and loads the data from disk into memory. How long this takes depends on the size and structure of the data and on server performance. Because RDB is a binary format, recovery is fast: loading a 1 GB snapshot containing 10 million string keys typically takes 20 to 30 seconds.

127.0.0.1:6379> config get dir    # get the Redis working directory
1) "dir"
2) "/usr/local/bin"    # put dump.rdb in this directory and restart Redis to restore the data automatically

Advantages and disadvantages

Advantages

An RDB file represents a snapshot of Redis data at a point in time, which makes it suitable for backups and full copies. For example, run bgsave every 6 hours and copy the RDB file to a remote machine or file system (such as HDFS) for disaster recovery (appropriate when requirements for data completeness and consistency are modest).

RDB is stored in binary form, so Redis restores data from an RDB file much faster than from an AOF (suitable for large-scale data recovery).

Disadvantages

RDB cannot provide real-time or per-second persistence. Every bgsave forks a child process, which is a heavyweight operation (the child logically holds a copy of the in-memory data, so in the worst case roughly twice the memory must be budgeted for), and running it frequently is expensive (performance impact).

RDB files use a specific binary format, and there have been multiple RDB format versions as Redis has evolved. Older Redis servers cannot read newer RDB formats (version incompatibility).

Snapshots are taken at intervals, so if Redis goes down unexpectedly, all changes since the last snapshot are lost.

AOF (Append Only File)

By default, AOF is turned off. AOF content is written in the Redis text protocol format, so you can inspect the file with an editor such as vim.
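To make the text protocol format concrete, here is a small Python sketch that encodes a command the way it appears in an AOF file, as a RESP array of bulk strings. The function name encode_command is an assumption for illustration; the byte layout follows the Redis serialization protocol.

```python
from typing import List


def encode_command(args: List[str]) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]            # array header: number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(raw), raw))  # length line, then payload
    return b"".join(out)


print(encode_command(["SET", "name", "redis"]))
# prints b'*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$5\r\nredis\r\n'
```

Opened in a text editor, each \r\n shows up as a line break, which is why an AOF written this way reads as alternating length lines and values.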

How it works

AOF records each write command in a separate log and re-executes the commands in the AOF file on restart to recover the data. Its main purpose is to make persistence real-time, and it has become the mainstream approach to Redis persistence. Understanding AOF is very helpful for balancing data safety against performance. AOF records all write commands executed by Redis (reads are not recorded). The file is append-only: new commands are added at the end and existing content is not modified in place. When Redis restarts, it reads the file and executes the write commands from first to last to rebuild the data.

AOF persistence involves three steps: command write (append), file sync, and file rewrite.

  1. All write commands are appended to aof_buf (the AOF buffer).
  2. The AOF buffer is synced to disk according to the configured policy.
  3. As the AOF file grows, it is rewritten periodically to compact it.
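The first two steps, plus replay on restart, can be modelled with a toy in-memory simulation. This is a sketch under stated assumptions, not how Redis is implemented: the class MiniAOF, its fields, and the replay function are hypothetical, "disk" is just a Python list, and only SET and DEL are understood.

```python
class MiniAOF:
    """Toy model of the AOF pipeline: buffer, then sync according to a policy."""

    def __init__(self, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.buffer = []       # stand-in for aof_buf
        self.disk = []         # stand-in for the synced part of the AOF file
        self.last_sync = 0.0

    def write(self, command: tuple, now: float) -> None:
        self.buffer.append(command)                 # step 1: append to the buffer
        if self.policy == "always":
            self.sync(now)                          # sync after every write
        elif self.policy == "everysec" and now - self.last_sync >= 1.0:
            self.sync(now)                          # sync at most once per second
        # policy "no": timing is left to the OS (never synced in this toy model)

    def sync(self, now: float) -> None:
        self.disk.extend(self.buffer)               # step 2: buffer reaches disk
        self.buffer.clear()
        self.last_sync = now


def replay(commands) -> dict:
    """Rebuild the dataset by re-executing logged commands in order."""
    data = {}
    for op, key, *rest in commands:
        if op == "SET":
            data[key] = rest[0]
        elif op == "DEL":
            data.pop(key, None)
    return data
```

If the process "crashes" while commands are still in the buffer, replay(aof.disk) recovers only what had been synced, which is the window of data the everysec policy can lose.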

Rewrite principle: when the AOF file grows too large, Redis forks a child process to rewrite it (as with RDB, the child writes a temporary file first and then renames it). The child traverses the data in its memory and emits one write command, such as SET, for each record. The rewrite does not read the old AOF file; it writes the entire in-memory database out as commands into a new AOF file, much like taking a snapshot.
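The idea that a rewrite works from the current data rather than from the old log can be shown in a few lines of Python. This is an illustrative sketch only: apply_log and rewrite are hypothetical helpers, commands are plain tuples, and only SET, INCR, and DEL are handled.

```python
def apply_log(log) -> dict:
    """Execute a command log in order and return the resulting dataset."""
    data = {}
    for op, key, *rest in log:
        if op == "SET":
            data[key] = rest[0]
        elif op == "INCR":
            data[key] = str(int(data.get(key, "0")) + 1)
        elif op == "DEL":
            data.pop(key, None)
    return data


def rewrite(data: dict) -> list:
    """Emit one SET per key from the in-memory data; the old log is never read."""
    return [("SET", key, value) for key, value in data.items()]


old_log = [
    ("SET", "counter", "0"),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
]
data = apply_log(old_log)
new_log = rewrite(data)
print(len(old_log), len(new_log))  # prints 5 1
```

The five-command history collapses to a single SET, and replaying the new log yields the same dataset as replaying the old one.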

Configuration

############################## APPEND ONLY MODE ###############################
# AOF persistence is disabled by default, because RDB is sufficient for typical workloads
appendonly no

# AOF persistence file name
appendfilename "appendonly.aof"

# no: never call fsync; flushing is left entirely to the operating system, usually every 30 seconds. Best performance, but durability is not guaranteed. Not recommended
# appendfsync no
# always: call fsync on every write command appended to the AOF. Full durability with no data loss, but the worst performance. Not recommended
# appendfsync always
# everysec: call fsync once per second. A compromise between performance and durability. Recommended, and also the default
appendfsync everysec

# Whether to skip fsync on the AOF while a BGSAVE or BGREWRITEAOF child is running. The default, no, keeps syncing
no-appendfsync-on-rewrite no
# Trigger a rewrite when the AOF has grown by 100% relative to its size after the last rewrite and is at least 64 MB
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# When loading the AOF, whether to continue if the file is damaged at the end (for example, truncated). yes: keep loading; no: report an error and stop, in which case the AOF must be repaired before Redis can be restarted
aof-load-truncated yes

Pay particular attention to the settings that control syncing (appendfsync) and the settings that control rewriting (auto-aof-rewrite-percentage and auto-aof-rewrite-min-size).

Triggering

AOF syncing is governed by the appendfsync setting configured above, which takes one of three values: no, always, and everysec. Choose according to your performance and durability requirements. If you are unsure, keep the default everysec, with which you may lose about 1 second of data.

AOF rewriting is governed by the rewrite settings configured above. It can be triggered manually or automatically.

Manually triggering rewrite

After connecting to Redis with redis-cli, run the bgrewriteaof command to trigger an AOF rewrite.

Automatically triggering rewrite

The auto-aof-rewrite-min-size and auto-aof-rewrite-percentage settings determine when a rewrite is triggered automatically. A rewrite starts only when both conditions are met.
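The two conditions combine roughly as in the Python sketch below. It is a simplified model, not Redis's source: the function name and parameters are assumptions, base_size stands for the AOF size recorded after the previous rewrite, and the defaults mirror the configuration shown earlier.

```python
def should_auto_rewrite(current_size: int, base_size: int,
                        min_size: int = 64 * 1024 * 1024,
                        percentage: int = 100) -> bool:
    """Return True when both automatic-rewrite conditions hold."""
    if percentage == 0:
        return False                      # a percentage of 0 disables auto rewrite
    if current_size <= min_size:
        return False                      # condition 1: file must exceed the minimum size
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100   # growth relative to the base, in percent
    return growth >= percentage           # condition 2: file has grown enough
```

With the defaults, an AOF that was 64 MB after the last rewrite triggers a new rewrite once it reaches 128 MB, while a file that has multiplied in size but is still under 64 MB does not.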

Data recovery

When Redis starts, it checks whether AOF is enabled. If it is, only the AOF file is used for data recovery, not the RDB file.
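The selection rule can be written down as a tiny Python function. It is a sketch of the rule as stated here, with hypothetical names (choose_recovery_file, appendonly, aof_path, rdb_path); returning None stands for starting with an empty dataset.

```python
import os
from typing import Optional


def choose_recovery_file(appendonly: bool, aof_path: str,
                         rdb_path: str) -> Optional[str]:
    """Pick the file to recover from, following the rule described above."""
    if appendonly:
        # AOF enabled: only the AOF is considered, even if an RDB file exists.
        return aof_path if os.path.exists(aof_path) else None
    # AOF disabled: fall back to the RDB snapshot.
    return rdb_path if os.path.exists(rdb_path) else None
```

One consequence worth noticing in this model: if AOF is switched on in the configuration while only dump.rdb exists on disk, the function returns None, so the RDB data would not be loaded.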

During recovery, the AOF file may have an incomplete tail; for example, a sudden power failure can leave the last command only partly written. Redis provides the aof-load-truncated setting for this case, and it is enabled by default: when the AOF is loaded, the incomplete tail is ignored and startup continues.

If aof-load-truncated is disabled, Redis fails to start when the AOF file is incomplete, and client connections fail as well:

[root@redis ~]# redis-cli -p 6379
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> exit
[root@redis ~]#

Back up the damaged AOF file first, then run redis-check-aof --fix from the Redis installation directory to repair it. After the repair, you can use the Linux command diff -u to compare the repaired file with the backup and find out which data was lost; some of it can be completed by hand.

[root@redis ~]# redis-check-aof --fix appendonly.aof
0x              a4: Expected \r\n, got: 6461
AOF analyzed: size=185, ok_up_to=139, diff=46
This will shrink the AOF from 185 bytes, with 46 bytes, to 139 bytes
Continue? [y/N]: y    # enter y to confirm the fix
Successfully truncated AOF
[root@redis ~]#
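The ok_up_to value in the output above is the offset up to which the file contains complete commands. The Python sketch below shows one way such an offset could be computed for a plain-text AOF. It is an illustration under assumptions, not the redis-check-aof implementation: the function names are hypothetical, and it handles only commands encoded as RESP arrays of bulk strings.

```python
def _read_line(buf: bytes, pos: int):
    """Return (line, next_pos) for a CRLF-terminated line at pos, or None."""
    idx = buf.find(b"\r\n", pos)
    if idx == -1:
        return None
    return buf[pos:idx], idx + 2


def _parse_command(buf: bytes, pos: int):
    """Return the end offset of one complete command starting at pos, or None."""
    header = _read_line(buf, pos)
    if header is None or not header[0].startswith(b"*"):
        return None
    try:
        argc = int(header[0][1:])
    except ValueError:
        return None
    pos = header[1]
    for _ in range(argc):
        size_line = _read_line(buf, pos)
        if size_line is None or not size_line[0].startswith(b"$"):
            return None
        try:
            size = int(size_line[0][1:])
        except ValueError:
            return None
        pos = size_line[1]
        # The payload must be fully present and followed by CRLF.
        if size < 0 or buf[pos + size:pos + size + 2] != b"\r\n":
            return None
        pos += size + 2
    return pos


def valid_prefix_length(buf: bytes) -> int:
    """Return the byte offset up to which buf holds only complete commands."""
    pos = 0
    while pos < len(buf):
        end = _parse_command(buf, pos)
        if end is None:
            break
        pos = end
    return pos
```

Truncating the file to the returned offset drops the partly written tail while keeping every complete command before it.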

Advantages and disadvantages

Advantages

  1. Every write is recorded, so the persisted data is more complete.
  2. You can choose the sync policy that fits your workload to balance performance against durability. With everysec, at most about 1 second of data is lost.

Disadvantages

  1. AOF files are much larger than RDB files, and data recovery is slower than with RDB.

Summary

1. RDB persistence stores point-in-time snapshots of the data at specified intervals. It is enabled by default.

2. AOF persistence records every write operation sent to the server. When the server restarts, these commands are executed again to restore the original data. Each write is appended to the end of the file, and Redis can rewrite the AOF so that it does not grow too large. AOF is disabled by default.

3. If Redis is used purely as a cache, for example when a separate service loads the MySQL data into Redis after every restart, you do not need any persistence.

4. Enabling both persistence methods

  • In this case, when Redis restarts it loads the AOF file first to recover the original data, because the AOF usually holds a more complete dataset than the RDB file.
  • RDB data is not real-time, and when both are enabled the server uses only the AOF file on restart. So should you use AOF alone? That is not recommended: RDB is better suited to backing up the database (an AOF changes constantly and is awkward to back up), restarts faster because it is stored in binary form, and is free of potential AOF bugs, so keep it as a safety net.

5. Performance suggestions

  • Since RDB files are used only as backups, it is recommended to persist RDB only on the slave, and one backup every 15 minutes is enough: keep only the save 900 1 rule.
  • If you enable AOF, the benefit is that you lose no more than two seconds of data in the worst case, and the startup script is simpler because it only needs to load its own AOF file. The costs are, first, continuous disk I/O, and second, the blocking at the end of an AOF rewrite, when new data produced during the rewrite is written to the new file, which is almost unavoidable. As far as disk space allows, keep AOF rewrites infrequent: the default base size for rewrite, 64 MB (auto-aof-rewrite-min-size 64mb), is too small and can be set to 5 GB or more, and the default of rewriting when the file exceeds its original size by 100% can be changed to a more suitable value.
  • If AOF is not enabled, high availability can rest on master-slave replication alone, which saves a great deal of I/O and avoids the system jitter caused by rewrites. The cost is that if the master and slave go down at the same time, more than ten minutes of data is lost, and the startup script must compare the RDB files on the master and the slave and load the newer one.

That's it, all done!

[Spread knowledge, share value] Thank you all for your attention and support. I am [Zhuge small ape], an Internet worker finding my way through the struggle!