Hello, students. In the previous few classes we covered the underlying data structures of Redis in detail; they are at the core of what sets Redis apart from other databases. Today we turn to how Redis is designed for data safety, a topic that also comes up often in interviews. Our main subject today is Redis persistence.

What is Redis persistence

We know that Redis stores its data in memory. If the server suddenly goes down, everything in memory is lost. To prevent this, Redis uses a mechanism that ensures data is not lost because of a failure. We call it the Redis persistence mechanism, and its main purpose is to write in-memory data to disk.

Redis provides two persistence mechanisms: Redis DataBase (RDB) and Append Only File (AOF).

RDB snapshot

A snapshot is the simplest Redis persistence mode: Redis captures the data set as it exists at a point in time and writes it to an RDB file. The data in an RDB file is very compact, so it can be read very quickly during recovery.

RDB file

There are two ways to trigger an RDB snapshot

Manual trigger

Run the save or bgsave command manually to explicitly trigger snapshot generation

  • save: blocks the current Redis server until the RDB process is complete. On instances with a lot of memory it causes a long block, so it is not recommended for production environments

  • bgsave: the Redis process forks to create a child process. The child process is responsible for the RDB persistence work and exits automatically when it finishes. Blocking occurs only during the fork phase, which is usually very short

Automatic trigger

A snapshot is triggered automatically in the following situations:

  • The save m n configuration directive: bgsave is triggered automatically when the data set has been modified n times within m seconds
  • Full replication by a slave node: the master node automatically runs bgsave to generate an RDB file and sends it to the slave node
  • The debug reload command: save is triggered automatically when Redis reloads
  • The shutdown command: by default, if save points are configured, Redis writes an RDB file with a blocking save before it exits
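The save m n rule from the first item can be sketched in a few lines. This is only a minimal sketch, not the code Redis runs: should_bgsave and SAVE_POINTS are hypothetical names, and the three rules are example values chosen for illustration.

```python
def should_bgsave(save_points, changes_since_save, seconds_since_save):
    """Return True if at least one `save m n` rule is satisfied.

    save_points: list of (seconds, changes) pairs, e.g. (900, 1) for `save 900 1`.
    """
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )


# Example rules (assumed for illustration): save 900 1, save 300 10, save 60 10000
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]
```

A rule fires only when both conditions hold, so 5 changes after 100 seconds trigger nothing, while 10 changes after 300 seconds do.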

Note: There are two questions to consider during RDB persistence

  1. Does Redis stop serving clients while the RDB snapshot is being taken?
  2. If the service is not stopped, how are new requests handled?

Let’s take a look at the RDB persistent execution process

RDB persistence process

The main process forks a child process to do the persistence work. Parent and child share the same data area, which is marked read-only. Reads of this area proceed without any problem, but a write triggers copy-on-write. Let’s look at what COW (Copy On Write) is.

Copy On Write (COW) mechanism

Copy On Write (COW) is an operating-system mechanism for sharing memory between processes. During persistence, Redis calls the glibc fork function to create a child process. After the fork, parent and child share the same code and data segments in memory.

The persistence work is handed over entirely to the child process, while the parent process continues to handle client requests. While this happens, the operating system uses COW to separate the data of the two processes page by page. A data segment is made up of many operating-system pages. When the parent process modifies data on one of those pages, the shared page is copied first and the parent modifies its own copy, while the page seen by the child process stays unchanged.

In simple terms, the process works as follows. After a fork on Linux, the operating system marks all memory pages of the parent process as read-only, and the address space of the child process points to the same pages. As long as the parent only reads, nothing happens. When it writes to memory, the CPU detects that the page is read-only and raises a page fault, which traps into the fault handler of the operating system. In the handler, the kernel copies the page that caused the fault, so that the parent and child processes each hold an independent copy. If a large number of writes arrive during this time, a large number of page faults occur, and each one triggers a copy.

It is called a snapshot because, from the moment the child process is created, the data the child sees in memory is frozen and no longer changes.
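The snapshot effect of fork can be observed directly. The following is a minimal sketch, assuming a Unix-like system where os.fork is available; snapshot_with_fork is a hypothetical helper, not part of Redis. The child waits until the parent has modified the data and then reports what it still sees.

```python
import json
import os


def snapshot_with_fork(data, mutate):
    """Fork, let the parent mutate `data`, and return the child's view of it."""
    ready_r, ready_w = os.pipe()    # parent -> child: "my write is done"
    result_r, result_w = os.pipe()  # child -> parent: serialized snapshot
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy-on-write view of the parent at fork time.
        os.close(ready_w)
        os.close(result_r)
        os.read(ready_r, 1)  # block until the parent has written
        os.write(result_w, json.dumps(data, sort_keys=True).encode())
        os._exit(0)
    # Parent: keeps "serving writes" while the child holds the snapshot.
    os.close(ready_r)
    os.close(result_w)
    mutate(data)
    os.write(ready_w, b"x")
    chunks = []
    while True:
        chunk = os.read(result_r, 4096)
        if not chunk:  # EOF: the child has exited and closed its end
            break
        chunks.append(chunk)
    os.waitpid(pid, 0)
    os.close(ready_w)
    os.close(result_r)
    return json.loads(b"".join(chunks).decode())
```

Calling snapshot_with_fork({"k": "old"}, lambda d: d.update(k="new")) returns the child's view, which still holds "old", while the dictionary in the parent now holds "new". The isolation comes from the semantics of fork; COW is how the kernel provides it without copying all memory up front.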

Advantages and disadvantages of RDB

Advantages:
  1. Maximum performance: a forked child process does the writing while the main process keeps handling commands, which preserves the high performance of Redis
  2. Fast recovery on restart: even with a large amount of data, Redis parses the binary RDB file directly and loads the resulting data into memory, so startup is more efficient than with AOF
Disadvantages:
  1. Low data safety: if a failure occurs between two snapshots, the data written since the last snapshot is lost, so this method suits cases where the data-safety requirements are not strict
  2. System performance cost: as described in the COW mechanism above, a large number of page faults during persistence means a lot of page copying, which can cost considerable performance

AOF (Append Only File)

As mentioned above, snapshots are not a viable option in cases where losing data is unacceptable, and AOF covers those cases well.

AOF principle

The principle is simple: every command that modifies memory is recorded by appending it to the AOF log. The log therefore holds the full sequence of modification commands since the Redis instance was created, and recovery consists of executing all of them again in order.
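The append-and-replay idea can be shown with a toy model. This is a sketch under simplifying assumptions: TinyAof, apply and replay are hypothetical names, the log is an in-memory list standing in for the AOF file, and only SET and DEL are supported.

```python
def apply(state, command):
    """Apply one write command (a sequence of strings) to an in-memory dict."""
    op, key, *args = command
    if op == "SET":
        state[key] = args[0]
    elif op == "DEL":
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {op}")


class TinyAof:
    """Toy store that records every write command in an append-only log."""

    def __init__(self):
        self.state = {}
        self.log = []  # stands in for the AOF file

    def execute(self, *command):
        apply(self.state, command)      # modify memory
        self.log.append(list(command))  # then append the command to the log


def replay(log):
    """Rebuild the data set by executing the logged commands in order."""
    state = {}
    for command in log:
        apply(state, command)
    return state
```

After any sequence of execute calls, replay(db.log) yields the same dictionary as db.state, which is exactly what recovery from an AOF relies on.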


Redis executes commands on a single thread. If every command were appended directly to the AOF file on disk, processing performance would suffer badly. Redis therefore writes to the AOF buffer first, and the buffer is synced to the AOF file on disk according to the user-configured sync policy:

  • always: fsync is called after every write operation. Data is safest in this mode, but performance suffers because fsync runs every time
  • no: Redis never calls fsync itself to sync the AOF log to disk, so flushing depends entirely on operating-system scheduling. On most Linux systems, the buffered data is written to disk about every 30 seconds.
  • everysec: Redis calls fsync once per second to write the buffered data to disk; this is the default. If an fsync call takes longer than 1 second, Redis delays the next write by another second. After two seconds the write is performed regardless of how long it takes, and because the file descriptor is blocked while fsync is running, that write blocks.

Note that this is also one of the parameters that affects Redis performance. appendfsync everysec (the default) is recommended.
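The difference between the three policies comes down to when fsync is called after the buffer has been written. The sketch below is a simplified model, not the Redis implementation: AofWriter and count_fsyncs are hypothetical names, the clock is injected so the behaviour can be checked without waiting, and the delayed-write handling of everysec is left out.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy AOF writer: buffer commands, write them out, fsync per policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.buffer = []
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, command):
        self.buffer.append(command)  # command is bytes; only memory is touched

    def flush(self):
        data = b"".join(self.buffer)
        self.buffer.clear()
        while data:  # write() hands the data to the OS page cache
            written = os.write(self.fd, data)
            data = data[written:]
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)  # force the data onto the disk
            self.last_fsync = now
            self.fsync_calls += 1
        # policy "no": never fsync here, the OS decides when to flush

    def close(self):
        os.close(self.fd)


def count_fsyncs(policy, flush_times):
    """Flush once at each fake timestamp and report how many fsyncs ran."""
    now = [0.0]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "appendonly.aof")
        writer = AofWriter(path, policy, clock=lambda: now[0])
        for t in flush_times:
            now[0] = t
            writer.append(b"SET key value\n")
            writer.flush()
        writer.close()
        return writer.fsync_calls
```

With five flushes spread over a little more than two seconds, always calls fsync five times, everysec about once per elapsed second, and no never calls it.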

AOF rewrite

Over a long period of running, the Redis AOF log gets bigger and bigger, and restoring from it gets slower. The goal of rewriting is to slim down the log.

The log is slimmed down in the following ways:

  1. Commands whose effect has been superseded can be removed, such as del key1, hdel key2, srem keys, or set a 111 followed by set a 222; only the command that generates the final data is saved
  2. Multiple commands can be merged, for example, lpush list a, lpush list b, and lpush list c can be converted to lpush list a b c
  3. And so on; I won’t list every case here
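Both ideas amount to the same thing: keep only what is needed to rebuild the final data. The sketch below illustrates this on a toy command log. It is an assumption-laden model, not the Redis rewrite code: rewrite is a hypothetical function, only SET, DEL and LPUSH are understood, and each list is emitted as a single RPUSH that lists the elements in their final order.

```python
def rewrite(log):
    """Replay `log`, then emit the fewest commands that rebuild the final state."""
    strings, lists = {}, {}
    for op, key, *args in log:
        if op == "SET":
            lists.pop(key, None)
            strings[key] = args[0]           # a later SET overwrites an earlier one
        elif op == "DEL":
            strings.pop(key, None)           # deleted keys leave no trace
            lists.pop(key, None)
        elif op == "LPUSH":
            strings.pop(key, None)
            target = lists.setdefault(key, [])
            for value in args:
                target.insert(0, value)      # LPUSH adds each value at the head
        else:
            raise ValueError(f"unsupported command: {op}")
    compact = [["SET", key, value] for key, value in strings.items()]
    compact += [["RPUSH", key, *values] for key, values in lists.items() if values]
    return compact
```

Seven logged commands that set a twice, create and delete a temporary key, and push three elements one at a time shrink to two commands: one SET and one RPUSH.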

Redis slims the log down with the bgrewriteaof command. It creates a child process that traverses memory and converts the data into a series of commands, which are serialized to a new file. The incremental AOF entries produced while the rewrite was running are then appended to the new file, which finally replaces the old one.

AOF rewriting can be triggered in two ways

  1. Manual trigger: the bgrewriteaof command
  2. Automatic trigger: the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage parameters determine when a rewrite is triggered
  • auto-aof-rewrite-min-size: the minimum AOF file size at which a rewrite may run. The default size is 64MB.

  • auto-aof-rewrite-percentage: the growth of the current AOF file size (aof_current_size) relative to the AOF file size after the last rewrite (aof_base_size), as a percentage.

auto-aof-rewrite-percentage  100

auto-aof-rewrite-min-size    64mb


By default, a rewrite is triggered when the AOF file is larger than 64MB and its size has grown by at least 100% over the base size.
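The trigger condition is easy to express as a function. This is a minimal sketch of the rule as described here, with should_rewrite as a hypothetical name; treating a percentage of 0 as "automatic rewriting disabled" is an assumption of the sketch.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, min_size=64 * MB, percentage=100):
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the last rewrite, in bytes
    """
    if percentage <= 0:           # assumption: 0 turns automatic rewriting off
        return False
    if current_size <= min_size:  # file is still too small to bother
        return False
    base = base_size or 1         # avoid dividing by zero before the first rewrite
    growth = (current_size - base) * 100 / base
    return growth >= percentage
```

A 128MB file with a 64MB base has grown by 100%, so a rewrite is due. A 100MB file with the same base has grown by only about 56%, and a 40MB file never qualifies because it is below the minimum size.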

Advantages and disadvantages of AOF

Advantages:

Data safety: with the appendfsync property set to always, every command is synced to the AOF file as it is executed

Disadvantages:

When the data set is large, startup is less efficient than with RDB

Hybrid Persistence (Redis 4.0)

As we saw above, RDB can lose a large amount of data, while AOF recovery is slow. Redis 4.0 therefore introduced mixed persistence, which stores the RDB contents and the incremental AOF log in the same file. The AOF part is no longer the full log; it contains only the incremental commands recorded between the start and the end of the persistence run, which is usually small, so restart efficiency improves greatly.

Mixed persistence

When loading, Redis first checks whether the AOF file starts with the string REDIS. If it does, that part is loaded in RDB format, and once it is loaded the rest of the file is loaded in AOF format.
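The format check at load time can be sketched as follows. This shows only the decision, not a real parser: load_plan is a hypothetical function, and the sample bytes are made up for illustration.

```python
RDB_MAGIC = b"REDIS"  # an RDB payload begins with this string


def load_plan(file_head):
    """Return the loading steps for an AOF file, given its first bytes."""
    if file_head.startswith(RDB_MAGIC):
        # Mixed file: RDB preamble first, then the incremental AOF commands.
        return ["rdb", "aof"]
    # Plain AOF: the whole file is a sequence of commands.
    return ["aof"]
```

A file whose first bytes are REDIS is loaded in two steps, while a file that starts directly with a command is loaded as plain AOF.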



Conclusion

This section covered Redis persistence, which comes in two forms: RDB snapshots and AOF. RDB cannot guarantee data safety, and AOF is slow to recover, so since version 4.0 a mixed mode has been available that combines RDB and AOF to offset the shortcomings of both. In the next class we will continue exploring the security design of Redis. Class is over!