Redis master-slave architecture

Master-slave replication

A single Redis instance can typically handle a QPS on the order of tens of thousands. A cache generally has to support high read concurrency, so Redis is deployed in a master-slave architecture with one master and multiple slaves. The master handles writes and replicates data to the slave nodes, and the slave nodes handle reads: all read requests go to the slaves. This also makes it easy to scale horizontally and support high read concurrency.

Redis Replication -> Master/slave Architecture -> Read/write Separation -> Horizontal expansion supports high read concurrency
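
The read/write separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: FakeNode and ReadWriteRouter are hypothetical names, the nodes are plain in-memory dictionaries rather than real Redis connections, and replication is modelled as an immediate copy even though real Redis replicates asynchronously.

```python
import itertools


class FakeNode:
    """In-memory stand-in for a Redis node (hypothetical, not a real client)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next_slave = itertools.cycle(slaves)

    def set(self, key, value):
        # Writes always go to the master.
        self.master.data[key] = value
        # Replication is modelled as an immediate copy for simplicity;
        # real Redis replicates asynchronously, so slaves may lag behind.
        for slave in self.slaves:
            slave.data[key] = value

    def get(self, key):
        # Reads rotate over the slaves (round-robin).
        slave = next(self._next_slave)
        return slave.name, slave.data.get(key)


master = FakeNode("master")
slaves = [FakeNode("slave-1"), FakeNode("slave-2")]
router = ReadWriteRouter(master, slaves)
router.set("user:1", "alice")
first = router.get("user:1")
second = router.get("user:1")
```

Adding another FakeNode to the slaves list is the toy equivalent of horizontal expansion: read capacity grows while the single master remains the only write path.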

The core mechanism of Redis Replication

Redis replicates data to slave nodes asynchronously. Starting with Redis 2.8, however, slave nodes periodically acknowledge to the master how much data they have replicated.
A master node can be configured with multiple slave nodes.
Slave nodes can also connect to other slave nodes.
A slave node does not block the normal operation of the master node while it replicates.
A slave node does not block its own queries while replicating either; it keeps serving requests from its old data set. Once the replication completes, however, it must delete the old data set and load the new one, and it stops serving requests during that window.
Slave nodes are used mainly for horizontal scaling and read/write separation; adding slave nodes increases read throughput.

Note that if you use a master/slave architecture, you should enable persistence on the master node. Relying on slave nodes as the only hot standby for the master's data is not recommended: if persistence is turned off on the master, the master may come back with an empty data set after a crash and restart, and as soon as the slaves replicate from it, their data is lost as well.

In addition, back up the master using several backup schemes. If all local files are lost, pick an RDB file from the backups to restore the master, so that it has data when it starts. Even with the high availability mechanism described later, in which a slave node automatically takes over from the master, the master may restart on its own before Sentinel detects the failure, and that can still cause the data on all of its slave nodes to be wiped.

The core principles of Redis master-slave replication

When a slave node is started, it sends a PSYNC command to the master node.

If this is the first time the slave node connects to the master node, a full resynchronization (full replication) is triggered. The master forks a background child process to generate an RDB snapshot file, while buffering in memory all new write commands received from clients. Once the RDB file is generated, the master sends it to the slave. The slave first writes the RDB file to its local disk and then loads it from disk into memory. The master then sends the buffered write commands to the slave, and the slave applies them to catch up.

If the slave node is disconnected from the master node because of a network fault, it automatically reconnects to the master node. After reconnecting, the master node copies only the missing data to the slave node.

Resumable master/slave replication

Since Redis 2.8, master/slave replication can resume from where it left off (partial resynchronization). If the network connection drops during replication, replication continues from the last replicated position instead of starting from scratch.

The master node maintains a backlog in memory. Both the master and the slave keep a replication offset and the master run ID; the offset identifies a position in the backlog. If the network connection between master and slave breaks, the slave asks the master to continue replication from its last replication offset. If the master cannot find that offset in the backlog, a full resynchronization is performed.

Identifying the master node by host + IP is unreliable. If the master node restarts or its data set changes, the slave node should tell the difference by the run ID: a different run ID means it is no longer talking to the same master instance.
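
The decision the master makes when a slave reconnects can be sketched as follows. This is a toy model, not Redis source code: ToyMaster and its methods are hypothetical names, the backlog is a plain byte string trimmed to a fixed size, and the "CONTINUE" and "FULLRESYNC" labels only mirror the two outcomes described above.

```python
class ToyMaster:
    """Toy master holding a run ID, a replication offset, and a backlog."""

    def __init__(self, run_id, backlog_size):
        self.run_id = run_id
        self.backlog_size = backlog_size
        self.backlog = b""   # most recent bytes of the write stream
        self.offset = 0      # total bytes ever appended to the stream

    def write(self, command: bytes):
        self.offset += len(command)
        # Keep only the newest `backlog_size` bytes, like a ring buffer.
        self.backlog = (self.backlog + command)[-self.backlog_size:]

    def psync(self, slave_run_id, slave_offset):
        """Return ("CONTINUE", missing_bytes) or ("FULLRESYNC", None)."""
        backlog_start = self.offset - len(self.backlog)
        if slave_run_id != self.run_id:
            # The slave was following a different master instance.
            return "FULLRESYNC", None
        if not (backlog_start <= slave_offset <= self.offset):
            # The requested offset is no longer (or not yet) in the backlog.
            return "FULLRESYNC", None
        return "CONTINUE", self.backlog[slave_offset - backlog_start:]


toy = ToyMaster(run_id="run-a", backlog_size=10)
toy.write(b"AAAA")
toy.write(b"BBBB")
resume = toy.psync("run-a", 4)       # slave missed only b"BBBB"
toy.write(b"CCCCCCCC")               # pushes offset 4 out of the backlog
too_old = toy.psync("run-a", 4)
other_master = toy.psync("run-b", 16)
```

The sketch also shows why the backlog size matters: the longer a slave stays disconnected while writes continue, the more likely its offset has already been overwritten and a full resynchronization is needed.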

Diskless replication

With diskless replication, the master creates the RDB in memory and streams it directly to the slave instead of writing it to its own disk first. To enable it, set repl-diskless-sync yes in the configuration file.

repl-diskless-sync yes
# Wait 5s before starting the replication, so that more slaves have time to reconnect
repl-diskless-sync-delay 5

Handling expired Keys

A slave never expires keys on its own; it waits for the master to expire them. When the master expires a key or evicts one through LRU, it synthesizes a DEL command and sends it to the slaves.
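
A minimal sketch of this behaviour, assuming a toy in-memory model: ExpiringMaster and ToySlave are hypothetical classes, commands are plain tuples, and time is passed in explicitly so the example stays deterministic.

```python
class ToySlave:
    """Applies whatever the master sends; never expires keys by itself."""

    def __init__(self):
        self.data = {}

    def apply(self, command):
        op, key, *args = command
        if op == "SET":
            self.data[key] = args[0]
        elif op == "DEL":
            self.data.pop(key, None)


class ExpiringMaster:
    """Owns expiry decisions and propagates them to slaves as DEL."""

    def __init__(self, slaves):
        self.data = {}  # key -> (value, expire_at or None)
        self.slaves = slaves

    def _propagate(self, command):
        for slave in self.slaves:
            slave.apply(command)

    def set(self, key, value, expire_at=None):
        self.data[key] = (value, expire_at)
        self._propagate(("SET", key, value))

    def expire_keys(self, now):
        # Only the master checks expiry times; slaves just receive DEL.
        for key, (_, expire_at) in list(self.data.items()):
            if expire_at is not None and expire_at <= now:
                del self.data[key]
                self._propagate(("DEL", key))


slave = ToySlave()
master = ExpiringMaster([slave])
master.set("session", "abc", expire_at=100)
master.expire_keys(now=99)
still_there = "session" in slave.data
master.expire_keys(now=100)
gone = "session" not in slave.data
```

Keeping expiry decisions on the master means master and slaves remove a key at the same point in the command stream, rather than each node consulting its own clock.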

The complete process of replication

When a slave node starts, it only saves the master node's information, including the master's host (IP) and port; the replication process has not started yet.

The slave node runs a scheduled task every second that checks whether there is a new master node to connect to and replicate from. If there is, the slave opens a socket connection to the master and sends it a PING command. If the master has requirepass set, the slave must send the password configured in masterauth to authenticate. The first time, the master performs a full replication and sends all of its data to the slave. After that, the master asynchronously replicates each write command to the slave.

Full replication

The master runs BGSAVE to generate an RDB snapshot file locally.
The master node sends the RDB snapshot file to the slave node. If the transfer takes longer than 60 seconds (repl-timeout), the slave node considers the replication failed, so adjust this parameter as needed. (At a typical transfer rate of about 100MB per second, a 6GB file is likely to take more than 60 seconds.)
While the master node generates the RDB, it buffers all new write commands in memory. Once the RDB has been transferred, the master node replicates those buffered write commands to the slave node.
If the replication buffer stays above 64MB for 60 seconds during replication, or exceeds 256MB at any moment, the master stops the replication and the replication fails. The limits come from the following setting:

client-output-buffer-limit slave 256MB 64MB 60
While it receives the RDB, the slave node keeps serving requests from its old data; once the RDB has arrived, it clears the old data and loads the RDB into its own memory.
If AOF is enabled on the slave node, it immediately runs BGREWRITEAOF to rewrite the AOF.
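
The two limits in the steps above (the 60-second repl-timeout and the 256MB / 64MB / 60 output buffer limit) are easy to check with a little arithmetic. The sketch below is a simplified approximation, not Redis's actual implementation: the function names are hypothetical, sizes are in MB, and the buffer usage is given as a list of (second, used_mb) samples.

```python
def transfer_seconds(rdb_size_mb, rate_mb_per_s):
    """Time needed to ship an RDB file at a constant transfer rate."""
    return rdb_size_mb / rate_mb_per_s


def exceeds_repl_timeout(rdb_size_mb, rate_mb_per_s, repl_timeout_s=60):
    """True if the transfer would take longer than repl-timeout."""
    return transfer_seconds(rdb_size_mb, rate_mb_per_s) > repl_timeout_s


def buffer_limit_hit(samples, hard_mb=256, soft_mb=64, soft_seconds=60):
    """Approximate the hard/soft output buffer rule.

    samples: (second, used_mb) pairs in time order.
    Returns True if usage ever exceeds hard_mb, or stays above soft_mb
    for more than soft_seconds without dropping back below it.
    """
    over_since = None
    for second, used_mb in samples:
        if used_mb > hard_mb:
            return True
        if used_mb > soft_mb:
            if over_since is None:
                over_since = second
            if second - over_since > soft_seconds:
                return True
        else:
            over_since = None  # usage dropped, restart the soft timer
    return False


six_gb_seconds = transfer_seconds(6 * 1024, 100)
six_gb_times_out = exceeds_repl_timeout(6 * 1024, 100)
```

With 6GB taken as 6 × 1024 MB and a rate of 100MB per second, the transfer needs a little over 61 seconds, which is why a larger repl-timeout is worth considering for big data sets.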

Incremental replication

If the master-slave network connection breaks during the full replication, incremental replication is triggered when the slave reconnects to the master.
The master takes the missing data directly from its own backlog and sends it to the slave node. The default backlog size is 1MB.
The master locates the data in the backlog using the offset in the PSYNC command sent by the slave.

Heartbeat

The master and slave nodes send heartbeat messages to each other.

By default, the master sends a heartbeat every 10 seconds, and the slave node sends a heartbeat every 1 second.

Asynchronous replication

After receiving a write command, the master writes data internally and asynchronously sends the data to the slave node.

How can Redis be highly available

A system is considered highly available if it is available 99.99% of the time over a full year (365 days).
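
To make the 99.99% figure concrete, the downtime it allows over 365 days can be computed directly. The helper name below is hypothetical and the calculation is plain arithmetic.

```python
def allowed_downtime_minutes(availability, days=365):
    """Minutes of downtime permitted for a given availability ratio."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability)


four_nines = allowed_downtime_minutes(0.9999)
```

Over 365 days this works out to roughly 52.56 minutes, so any master outage that requires manual intervention can consume the whole yearly budget in a single incident.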

The failure of one slave does not affect availability, as other slaves provide the same external query service with the same data.

But what happens if the master node dies? No data can be written, so every write to the cache fails. The slave nodes are of little use either: with no master replicating data to them, the system is effectively unavailable.

Redis's high availability architecture relies on failover, also known as master/slave switchover.

When the master node fails, the failure is detected automatically and one of the slave nodes is promoted to master. This process, called master/standby switchover, is what provides high availability under the Redis master-slave architecture.
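
The switchover can be illustrated with a very small sketch. Everything here is an assumption made for illustration: Node and failover are hypothetical names, failure detection is reduced to an alive flag, and the promotion rule (take the first healthy slave) is a placeholder, since the text above does not say how the new master is chosen.

```python
class Node:
    """Toy node with a role ("master" or "slave") and a health flag."""

    def __init__(self, name, role, alive=True):
        self.name = name
        self.role = role
        self.alive = alive


def failover(nodes):
    """Return the current master, promoting a healthy slave if it is down."""
    master = next(node for node in nodes if node.role == "master")
    if master.alive:
        return master
    candidates = [n for n in nodes if n.role == "slave" and n.alive]
    if not candidates:
        raise RuntimeError("no healthy slave to promote")
    new_master = candidates[0]  # placeholder selection rule (assumption)
    new_master.role = "master"
    master.role = "slave"       # the old master is demoted in this model
    return new_master


nodes = [
    Node("redis-a", "master", alive=False),
    Node("redis-b", "slave"),
    Node("redis-c", "slave"),
]
promoted = failover(nodes)
```

After the call, writes would be directed to the promoted node, which is the point of the switchover: the write path comes back without waiting for the failed master to be repaired.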