### What is master-slave replication
- A master can have multiple slaves
- A slave can only have one master
- The data flow is one-way, from master to slave
### Full replication and partial replication
##### run id
- View the Redis run ID

redis-cli -p 6379 info server | grep run

##### Replication offset
The replication offset is used to compare how far the master and the slave have each progressed. The difference between the two should not be too large.
- View the replication offset
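As a rough sketch of how the offsets can be compared, the snippet below parses text in the format printed by `redis-cli -p 6379 info replication` on a master and reports how far each slave is behind. The helper names and the sample values are made up for illustration; only the field names (`master_repl_offset`, `slaveN:...offset=...`) come from Redis.

```python
def parse_info(text):
    """Parse the `key:value` lines of INFO output into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Replication"
        key, _, value = line.partition(":")
        info[key] = value
    return info


def slave_offset_gaps(info_text):
    """Return {slave_name: master_offset - slave_offset} for a master's INFO output."""
    info = parse_info(info_text)
    master_offset = int(info["master_repl_offset"])
    gaps = {}
    for key, value in info.items():
        # Per-slave lines look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
        if key.startswith("slave") and "offset=" in value:
            fields = dict(item.split("=", 1) for item in value.split(","))
            gaps[key] = master_offset - int(fields["offset"])
    return gaps


# Made-up sample, shaped like the replication section of INFO on a master.
SAMPLE = """\
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1860,lag=1
master_repl_offset:1865
"""

print(slave_offset_gaps(SAMPLE))  # {'slave0': 5}
```

A small gap is normal under load; a gap that keeps growing suggests the slave is falling behind.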
##### Full replication
Redis replication is simple to use and configure: it lets slave Redis servers be exact copies of a master server. Whenever the link breaks, the slave automatically reconnects to the master and tries to become an exact copy of it again, regardless of what happens to the master.
This system uses three main mechanisms:
- When the master and slave are well connected, the master keeps the slave up to date by sending it a stream of commands that replicates every change to the master's dataset: client writes, key expirations, evictions, and so on.
- When the link between master and slave breaks, because of a network problem or because a timeout is detected on either side, the slave reconnects and attempts a partial resynchronization: it tries to obtain only the part of the command stream it missed while disconnected.
- When a partial resynchronization is not possible, the slave requests a full resynchronization. This is a more complex process in which the master creates a snapshot of all its data, sends it to the slave, and then continues sending the command stream as the dataset changes.
Redis uses asynchronous replication by default, which is low latency and high performance, and is the natural replication mode for the vast majority of Redis use cases. However, Redis slaves periodically and asynchronously acknowledge to the master the amount of data they have received.
Clients can request synchronous replication of certain data with the WAIT command. However, WAIT only ensures that the specified number of acknowledged copies exist in other Redis instances: acknowledged writes can still be lost during a failover for various reasons, or depending on the exact configuration of Redis persistence. See the Sentinel or Redis Cluster documentation for more information on high availability and failover. The rest of this article focuses on the basic characteristics of Redis replication.
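The guarantee WAIT gives can be illustrated with a toy model. This is only a sketch, not how Redis implements it: the function name, the slave names, and the offsets are hypothetical, and the timeout argument of the real command is left out.

```python
def wait_acks(write_offset, acked_offsets, numreplicas):
    """Toy model of WAIT: count the slaves whose acknowledged replication
    offset already covers the write, and report whether the target is met."""
    acked = sum(1 for offset in acked_offsets.values() if offset >= write_offset)
    return acked, acked >= numreplicas


# Hypothetical state: the write ends at offset 100 in the replication stream.
acked_offsets = {"B": 120, "C": 95}   # last offset each slave acknowledged
acked, satisfied = wait_acks(100, acked_offsets, numreplicas=1)
print(acked, satisfied)               # 1 True

# Only B acknowledged the write. If C were promoted during a failover,
# the write would be missing from the new master.
lost_if_c_promoted = acked_offsets["C"] < 100
print(lost_if_c_promoted)             # True
```

The second check shows why an acknowledged write is not the same as a durable one: the acknowledgement says nothing about which slave is promoted later.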
Here are some very important facts about Redis replication:
- Redis uses asynchronous replication, with slaves asynchronously acknowledging to the master the amount of data they have processed.
- A master can have more than one slave.
- Slaves can accept connections from other slaves. Besides connecting several slaves to the same master, slaves can also be connected to other slaves in a cascading structure. Since Redis 4.0, all sub-slaves receive exactly the same replication stream from the master.
- Redis replication is non-blocking on the master side. This means the master continues to handle queries while one or more slaves perform the initial synchronization or a partial resynchronization.
- Replication is also largely non-blocking on the slave side. While the slave performs the initial synchronization, it can handle queries using the old version of the dataset, assuming you configured Redis to do so in redis.conf. Otherwise, you can configure the slave to return an error to clients while the replication stream is down. However, after the initial synchronization, the old dataset must be deleted and the new one loaded. The slave blocks incoming connections during this brief window (which can last many seconds for very large datasets). Since Redis 4.0, Redis can be configured so that the deletion of the old dataset happens in a different thread, but loading the new initial dataset still happens in the main thread and blocks the slave.
- Replication can be used for scalability, with multiple slaves serving read-only queries (for example, slow O(N) operations can be offloaded to slaves), or simply for data safety.
- Replication can be used to avoid the cost of having the master write the full dataset to disk: a typical technique is to configure the master's redis.conf so that it never persists to disk, and then connect a slave configured to save from time to time or with AOF enabled. However, this setup must be handled with care, because a restarted master starts with an empty dataset: if the slave then tries to synchronize with it, the slave is emptied as well.

##### Replication safety when the master is down
In setups that use Redis replication, it is strongly recommended to enable persistence on both the master and the slaves. If this is not feasible, for example because of latency concerns caused by very slow disks, configure the instances so that they do not restart automatically after a reboot.
To better understand why a master with persistence turned off and configured to restart automatically is dangerous, consider the following failure mode, in which the data is wiped from the master and all of its slaves:
- We have a setup where node A is the master with persistence turned off, and nodes B and C replicate from node A.
- Node A crashes, but an automatic restart mechanism brings the process back up. Because persistence is turned off, the node restarts with an empty dataset.
- Nodes B and C replicate from node A, which is now empty, so they effectively destroy their own copy of the data.

When Redis Sentinel is used for high availability, turning off persistence on the master while automatically restarting the process is also dangerous. For example, the master can restart so quickly that Sentinel does not detect a failure, and the failure mode described above occurs.
Whenever data safety is important and replication is used with a master configured without persistence, automatic restart of the instances should be disabled.
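The failure mode can be replayed with a small in-memory model. This is a deliberately simplified sketch: the `Node` class and `full_resync` function are hypothetical stand-ins, not Redis code, and a full resynchronization is reduced to copying a dictionary.

```python
class Node:
    """Toy Redis node: an in-memory dataset plus an optional on-disk copy."""

    def __init__(self, name, persistence):
        self.name = name
        self.persistence = persistence
        self.memory = {}
        self.disk = {}

    def write(self, key, value):
        self.memory[key] = value
        if self.persistence:
            self.disk[key] = value

    def restart(self):
        # After a restart, only what was persisted to disk comes back.
        self.memory = dict(self.disk)


def full_resync(master, slave):
    """The slave drops its dataset and takes an exact copy of the master's."""
    slave.memory = dict(master.memory)
    if slave.persistence:
        slave.disk = dict(slave.memory)


a = Node("A", persistence=False)   # master, persistence turned off
b = Node("B", persistence=True)
c = Node("C", persistence=True)

a.write("user:1", "alice")
for slave in (b, c):
    full_resync(a, slave)
print(b.memory, c.memory)          # {'user:1': 'alice'} {'user:1': 'alice'}

a.restart()                        # automatic restart: A comes back empty
for slave in (b, c):
    full_resync(a, slave)          # the slaves faithfully copy the empty master
print(a.memory, b.memory, c.memory)  # {} {} {}
```

Even though B and C had persistence enabled, their copies are gone, because a slave always converges on whatever the master currently holds.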
##### How Redis replication works
Each Redis master has a replication ID: a large pseudo-random string that marks a given history of the dataset. Each master also keeps an offset that increments for every byte of the replication stream it produces to be sent to slaves, so that the slaves' state can be updated with the new changes to the dataset. The replication offset increases even when no slave is connected, so every (replication ID, offset) pair identifies an exact version of the master's dataset.
When a slave connects to a master, it uses the PSYNC command to send its old master replication ID and the offset it has processed so far. This lets the master send only the increment that is needed. However, if there is not enough backlog in the master's buffers, or if the slave refers to a history (replication ID) that the master no longer knows, a full resynchronization happens: the slave gets a full copy of the dataset, from scratch.
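The decision between a partial and a full resynchronization can be sketched as follows. This is a simplified model under stated assumptions: the `Master` class, the tiny backlog size, and the exact offset convention are illustrative, and the return values only mimic the idea of the real `+CONTINUE` and `+FULLRESYNC` replies.

```python
from collections import deque


class Master:
    """Toy master: a replication ID, a running offset, and a bounded backlog."""

    def __init__(self, replid, backlog_size):
        self.replid = replid
        self.offset = 0                            # bytes produced so far
        self.backlog = deque(maxlen=backlog_size)  # only the most recent bytes

    def feed(self, data):
        """Append bytes to the replication stream, slave connected or not."""
        self.backlog.extend(data)
        self.offset += len(data)

    def psync(self, replid, offset):
        """`offset` is the number of stream bytes the slave has processed."""
        first_available = self.offset - len(self.backlog)
        if replid == self.replid and first_available <= offset <= self.offset:
            missing = bytes(list(self.backlog)[offset - first_available:])
            return "CONTINUE", missing
        return "FULLRESYNC", None


master = Master(replid="abc", backlog_size=10)
master.feed(b"SET a 1;")           # the slave processed this (offset 8)
master.feed(b"SET b 2;")           # written while the slave was disconnected

print(master.psync("abc", 8))      # ('CONTINUE', b'SET b 2;')
print(master.psync("abc", 2))      # ('FULLRESYNC', None): fell out of the backlog
print(master.psync("other", 8))    # ('FULLRESYNC', None): unknown history
```

The bounded `deque` plays the role of the replication backlog: the longer a slave stays away, or the smaller the backlog, the more likely the bytes it needs have already been overwritten.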
#### Here’s how full synchronization works in more detail:
- The master starts a background save to produce an RDB file, and at the same time starts buffering all new write commands received from clients. When the background save is complete, the master transfers the RDB file to the slave, which saves it to disk and then loads it into memory. The master then sends all buffered commands to the slave. This is done as a stream of commands, in the same format as the Redis protocol itself.
- You can try it yourself via telnet. Connect to the Redis port and issue the SYNC command while the server is doing some work. You will see a bulk transfer, and then every command received by the master is re-issued in the telnet session. SYNC is actually an old protocol that newer Redis instances no longer use, although it is kept for backward compatibility: it does not allow partial resynchronization, so PSYNC is used instead.
- As mentioned earlier, a slave is able to reconnect automatically when the master/slave link goes down for some reason. If the master receives multiple concurrent slave synchronization requests, it performs a single background save to serve all of them.

#### Diskless replication
Normally, a full resynchronization requires creating an RDB file on disk and then reloading the same RDB from disk in order to feed the slaves with the data.
With slow disks this can be a very stressful operation for the master. Redis 2.8.18 is the first version to support diskless replication. In this setup, the child process sends the RDB directly to the slaves over the wire, without using the disk as intermediate storage.

#### Full replication cost
- bgsave time
- RDB file network transfer time
- Time for the slave to flush its old data
- Time for the slave to load the RDB

If the connection drops only briefly, partial replication is used instead. While the slave is disconnected, the master keeps writing commands to the replication backlog buffer. When the slave reconnects, it tells the master its offset and run ID. If the data the slave missed is still within the range held by the buffer, the master sends that data from the buffer to the slave, so only part of the data is synchronized.

### Replication configuration
##### slaveof command
1. Replication command

slaveof 127.0.0.1 6380

The slave discards its old dataset and starts synchronizing with the new master instead.

2. Cancel replication

slaveof no one

This makes the slave turn off replication and switch back to being a master, without discarding the dataset it has already synchronized.

##### Modify the following parameters:
- port
- logfile
- slaveof
- pidfile
- daemonize (whether to run in the background)

slaveof ip port
slave-read-only yes

The second line sets the slave to read-only.
##### Compare the two methods
Method | Command | Configuration file |
---|---|---|
Advantages | No restart required | Unified configuration |
Disadvantages | Hard to manage | Requires a restart |
- I configured two Redis services, one on port 6379 and one on port 6380, and started both in the background
# 6379 configuration
port 6379
pidfile /var/run/redis_6379.pid
# slaveof <masterip> <masterport>
logfile "6379.log"
daemonize yes
# 6380 configuration
port 6380
pidfile /var/run/redis_6380.pid
slaveof 127.0.0.1 6379
logfile "6380.log"
daemonize yes
- Then I set a value on the master and read it back on the slave
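The two configurations above differ only in the port number and the `slaveof` line, so they can be generated from one template. The helper below is a hypothetical convenience, not part of Redis; it only reproduces the directives used in this article.

```python
def render_conf(port, master=None):
    """Build a minimal redis.conf for one instance.

    `master` is an optional (host, port) tuple; when given, the instance
    is configured as a slave of that master.
    """
    lines = [
        f"port {port}",
        f"pidfile /var/run/redis_{port}.pid",
        f'logfile "{port}.log"',
        "daemonize yes",
    ]
    if master is not None:
        host, master_port = master
        lines.append(f"slaveof {host} {master_port}")
    return "\n".join(lines) + "\n"


print(render_conf(6379))                              # master
print(render_conf(6380, master=("127.0.0.1", 6379)))  # slave of 6379
```

Writing each result to its own file and starting `redis-server` with that file gives the same pair of instances as the hand-written configuration.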