Part 3: Implementing the multi-machine database

Replication

1. Implementation of the old replication feature

  • Old replication is divided into two phases: synchronization and command propagation

  • Steps of the synchronization phase

    • The slave sends the SYNC command to the master.
    • On receiving SYNC, the master runs BGSAVE to generate an RDB file and uses a buffer to record every write command it executes from that point on (so that the slave can catch up on them afterwards and end up consistent with the master).
    • When BGSAVE completes, the master sends the resulting RDB file to the slave. The slave receives and loads it, bringing its database to the state the master was in when BGSAVE was executed.
    • The master then sends every write command recorded in the buffer to the slave. The slave executes them, bringing its database to the master's current state.
  • Command propagation phase

    • Whenever the master executes a write command, it sends the same command to the slave. Once the slave has executed it, master and slave are consistent again.
  • Flowchart of the SYNC command
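
The two phases above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a dict stands in for the database, a deep copy stands in for the RDB file, and the class and method names (Master, Slave, begin_sync, finish_sync) are made up for illustration and are not Redis internals.

```python
import copy


class Master:
    """Toy master: a dict as the database plus a buffer used during SYNC."""

    def __init__(self):
        self.db = {}
        self.sync_buffer = None  # a list while a SYNC is in progress, else None

    def execute_write(self, key, value):
        self.db[key] = value
        if self.sync_buffer is not None:
            # Writes that arrive while the snapshot is being made are buffered.
            self.sync_buffer.append((key, value))

    def begin_sync(self):
        """Stand-in for BGSAVE: snapshot the data and start buffering writes."""
        self.sync_buffer = []
        return copy.deepcopy(self.db)  # plays the role of the RDB file

    def finish_sync(self):
        """Return the buffered writes and stop buffering."""
        buffered, self.sync_buffer = self.sync_buffer, None
        return buffered


class Slave:
    def __init__(self):
        self.db = {}

    def load_snapshot(self, snapshot):
        self.db = dict(snapshot)

    def apply(self, writes):
        for key, value in writes:
            self.db[key] = value


master, slave = Master(), Slave()
master.execute_write("k1", "v1")

snapshot = master.begin_sync()      # SYNC received, BGSAVE starts
master.execute_write("k2", "v2")    # a write that arrives during BGSAVE
slave.load_snapshot(snapshot)       # slave loads the "RDB file"
state_after_rdb = dict(slave.db)    # only k1 so far
slave.apply(master.finish_sync())   # buffered writes are replayed

master.execute_write("k3", "v3")    # command propagation phase
slave.apply([("k3", "v3")])
```

After loading the snapshot the slave only knows k1; replaying the buffer adds k2, and command propagation keeps it in step from then on.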

2. Defects of the old replication feature

  • Suppose slave A drops offline. When it reconnects to the master (which by now holds keys up to A120), the slave sends the SYNC command again. The master must then save every key-value pair in the database (A1 to A120) to an RDB file, even though slave A only needs the data written while it was offline (A100 to A120) to become consistent again. Having the master redo BGSAVE just so that the slave can make up a small amount of missing data is very inefficient.

3. Implementation of the new replication function

  • How can synchronization after a reconnection be made more efficient? It is enough to keep the write commands executed after the disconnection and have the master send just those to the slave for execution.
  • The new replication feature fixes the old feature's inefficiency in handling disconnection and reconnection by using the PSYNC command instead of SYNC for synchronization.
  • PSYNC has two modes: full resynchronization and partial resynchronization
    • Full resynchronization handles initial replication. It follows the same steps as SYNC: the master creates and sends an RDB file, then sends the write commands held in its buffer.
    • Partial resynchronization handles replication after a disconnection. When the slave reconnects to the master, and if conditions allow, the master sends the slave only the write commands executed while the connection was down. Once the slave has received and executed them, its database matches the master's current state.
  • Partial resynchronization needs fewer resources and is faster, because only the write commands the slave missed have to be sent.

4. Partial resynchronization

  • Replication offset

    • Both master and slave maintain a replication offset. Each time the master propagates N bytes of data to the slave, it adds N to its own replication offset.

    • Each time the slave receives N bytes of data propagated by the master, it adds N to its own replication offset

    • Comparing the two replication offsets shows at once whether master and slave are in a consistent state

    When the offsets differ, how is it decided whether a full or a partial resynchronization is needed, and how does the master supply the data the slave missed during the outage? Both depend on the replication backlog.

  • Replication backlog

    • The replication backlog is a fixed-size first-in, first-out (FIFO) queue maintained by the master, 1 MB by default

    • When the master propagates a command, it not only sends it to all slaves but also appends it to the replication backlog

    • When the slave reconnects to the master, it sends its replication offset with the PSYNC command. If the data after that offset is still in the replication backlog, a partial resynchronization is performed. If it is no longer there (the slave was offline too long), a full resynchronization is performed.

    • The minimum size of the replication backlog can be estimated as average reconnection time in seconds * average amount of write data (in bytes) the master generates per second. To make partial resynchronization possible for most disconnections, set the backlog to 2 * average reconnection time * write data per second

  • Server run ID

    • The run ID is used to check whether the master a slave reconnects to is the same master it was replicating before the disconnection
    • When a slave replicates a master for the first time, the master sends its run ID to the slave, and the slave saves it. When the slave reconnects after a disconnection, it sends the saved run ID to the master it is now connected to
      • If the saved ID matches the run ID of the master it is connected to, the slave was replicating this same master before the disconnection. Whether a partial or a full resynchronization is performed then depends on the replication backlog
      • If the IDs differ, the slave was replicating a different master before the disconnection, and a full resynchronization is required.
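
To make the replication offset and the replication backlog concrete, here is a minimal sketch. It is not how Redis implements the backlog internally; the class ReplicationBacklog, its methods, and suggested_backlog_size are hypothetical names. The sketch also assumes a simplified convention: an offset is the number of bytes sent (master) or received (slave) so far.

```python
class ReplicationBacklog:
    """Fixed-size FIFO holding the most recently propagated bytes."""

    def __init__(self, size=1024 * 1024):  # 1 MB default, as in the notes
        self.size = size
        self.data = bytearray()
        self.master_offset = 0  # total bytes propagated so far

    def feed(self, payload: bytes):
        """Record propagated bytes and drop the oldest ones when full."""
        self.data += payload
        self.master_offset += len(payload)
        if len(self.data) > self.size:
            del self.data[: len(self.data) - self.size]

    @property
    def first_offset(self):
        """Stream position of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.data)

    def can_serve(self, slave_offset):
        """True if everything after slave_offset is still in the backlog."""
        return self.first_offset <= slave_offset <= self.master_offset

    def read_from(self, slave_offset):
        if not self.can_serve(slave_offset):
            raise ValueError("data no longer in backlog: full resync needed")
        return bytes(self.data[slave_offset - self.first_offset:])


def suggested_backlog_size(avg_reconnect_seconds, write_bytes_per_second):
    """2 * average reconnection time * write data per second."""
    return 2 * avg_reconnect_seconds * write_bytes_per_second


backlog = ReplicationBacklog(size=10)   # tiny size to force eviction
backlog.feed(b"AAAAA")
slave_offset = backlog.master_offset    # slave is in sync, then disconnects

backlog.feed(b"BBBB")                   # written while the slave is offline
short_outage = backlog.read_from(slave_offset)      # partial resync possible

backlog.feed(b"CCCCCCCC")               # more writes push old data out
long_outage_ok = backlog.can_serve(slave_offset)    # now a full resync
```

With a slave that takes 5 seconds on average to reconnect and a master producing 1 MB of write data per second, suggested_backlog_size(5, 1024 * 1024) gives 10 MB.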

5. Implementation of the PSYNC command

  • PSYNC <runid> <offset>: runid is the run ID of the master replicated last time, and offset is the slave's current replication offset
  • The master replies +FULLRESYNC <runid> <offset>: a full resynchronization is performed. runid is the master's run ID, and offset is the master's current replication offset, which the slave uses as its initial offset
  • The master replies +CONTINUE: a partial resynchronization is performed. The slave just waits for the master to send the data it is missing.
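
The master's choice between the two replies can be sketched as a single function. This is a simplified illustration, not the actual Redis code: handle_psync and its parameters are hypothetical names, the run ID below is a made-up value, and offsets follow the simplified convention "bytes sent or received so far". The "?" and -1 arguments model a slave that has never replicated before, which is what a first-time PSYNC looks like according to the Redis documentation.

```python
def handle_psync(master_runid, master_offset, backlog_first_offset,
                 slave_runid, slave_offset):
    """Return the reply a master would give to PSYNC <runid> <offset>."""
    # Was the slave replicating this very master before it disconnected?
    same_master = slave_runid == master_runid
    # Is everything the slave missed still in the replication backlog?
    in_backlog = backlog_first_offset <= slave_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"
    return f"+FULLRESYNC {master_runid} {master_offset}"


RUNID = "53b9b28df8042fdc9ab5e3fcbbbabff1d5dce2b3"  # made-up run ID

first_time = handle_psync(RUNID, 10119, 10000, "?", -1)
short_gap = handle_psync(RUNID, 10119, 10000, RUNID, 10086)
long_gap = handle_psync(RUNID, 10119, 10000, RUNID, 500)
other_master = handle_psync(RUNID, 10119, 10000, "some-other-runid", 10086)
```

Only short_gap yields +CONTINUE; the other three cases fall back to a full resynchronization.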

6. Implementation of replication

SLAVEOF <master_ip> <master_port>

  1. Set the master's address and port number
  2. Establish a socket connection (the slave becomes a client of the master)
  3. Send the PING command (checks the connection to the master)
  4. Authentication
  5. Send port information (so that the master can record information about the slave)
  6. Synchronization (the slave sends PSYNC to perform the synchronization. Communication changes from one-way to two-way here: master and slave become clients of each other.)
  7. Command propagation
  8. Heartbeat detection (once per second the slave sends the master REPLCONF ACK <replication_offset>, where the argument is the slave's current replication offset)
    • Checks the network connection between master and slave
    • Supports the min-slaves options (the master refuses write commands when it has fewer than min-slaves-to-write slaves, or when the lag of those slaves exceeds min-slaves-max-lag seconds)
    • Detects lost commands (if a write command propagated by the master is lost because of a network problem, the offset in REPLCONF ACK shows that the slave is behind, and the master resends the missing data from the replication backlog)
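
The three uses of the heartbeat can be sketched from the master's side as follows. This is an illustrative model only: MasterHeartbeat and its methods are hypothetical names, time is passed in explicitly so the example is deterministic, and a slave is counted as healthy when its last REPLCONF ACK is at most min_slaves_max_lag seconds old.

```python
class MasterHeartbeat:
    """Tracks REPLCONF ACK messages from slaves (toy model)."""

    def __init__(self, min_slaves_to_write=0, min_slaves_max_lag=10):
        self.min_slaves_to_write = min_slaves_to_write
        self.min_slaves_max_lag = min_slaves_max_lag
        self.master_offset = 0
        self.slaves = {}  # name -> {"offset": int, "last_ack": float}

    def on_replconf_ack(self, name, offset, now):
        """Called when a slave sends REPLCONF ACK <offset>."""
        self.slaves[name] = {"offset": offset, "last_ack": now}

    def lag(self, name, now):
        """Seconds since the slave's last heartbeat."""
        return now - self.slaves[name]["last_ack"]

    def can_accept_writes(self, now):
        """min-slaves check: enough slaves with a recent heartbeat?"""
        if self.min_slaves_to_write == 0:
            return True  # feature disabled
        good = sum(1 for s in self.slaves.values()
                   if now - s["last_ack"] <= self.min_slaves_max_lag)
        return good >= self.min_slaves_to_write

    def missing_bytes(self, name):
        """Bytes the slave has not received; > 0 means data must be resent."""
        return self.master_offset - self.slaves[name]["offset"]


master = MasterHeartbeat(min_slaves_to_write=2, min_slaves_max_lag=10)
master.master_offset = 200
master.on_replconf_ack("s1", 200, now=100.0)
master.on_replconf_ack("s2", 197, now=100.0)   # s2 lost 3 bytes in transit

writes_ok_early = master.can_accept_writes(now=101.0)
s2_missing = master.missing_bytes("s2")

master.on_replconf_ack("s1", 200, now=115.0)   # s2 has gone silent
writes_ok_late = master.can_accept_writes(now=116.0)
```

Here s2 is 3 bytes behind, so the master would resend them from the backlog; once s2 stops acknowledging, only one healthy slave remains and writes are refused.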

Summary

  • When a client sends the SLAVEOF command to a slave to make it replicate a master, the slave first performs synchronization, updating its database state to the master's current state. Consistency does not last on its own after synchronization: as soon as the master executes a write command from a client, master and slave diverge. The master therefore propagates each write command to the slave, which is command propagation.
  • Replication before Redis 2.8 could not handle replication after a disconnection efficiently; the partial resynchronization feature added in Redis 2.8 solves this problem.
  • Partial resynchronization is implemented with three parts: the replication offset, the replication backlog, and the server run ID.
  • At the start of replication, the slave becomes a client of the master and carries out the replication steps by sending command requests to it. In the later stages (from synchronization onward), master and slave are clients of each other: the master sends commands to the slave (command propagation, partial resynchronization) to keep the data consistent.
  • The master keeps the slave consistent with itself by propagating commands to it, while the slave sends commands to the master for heartbeat detection and lost-command detection.

Sentinel mechanism

Sentinel is the high-availability solution for Redis. A Sentinel system, made up of one or more Sentinel instances, can monitor any number of masters and all the slaves under them. When a monitored master goes offline, Sentinel automatically promotes one of that master's slaves to be the new master, which then takes over command processing from the offline master (failover).

The composition of Sentinel

  • Sentinel startup steps

    • Initialize the server (Sentinel is essentially a Redis server running in a special mode, so parts of the initialization differ)
    • Replace some of the code used by an ordinary Redis server with Sentinel-specific code (for example, Sentinel offers commands such as PING and INFO but not GET or SET)
    • Initialize the Sentinel state
    • Initialize the list of masters that Sentinel monitors
    • Create network connections to each master: a command connection and a subscription connection. The former is used to send commands to the master; the latter subscribes to the master's __sentinel__:hello channel to receive messages published by other Sentinels.
  • Sentinel reads its configuration file and records all monitored masters in a dictionary. By sending the INFO command to a master, it obtains information about all the slaves under that master.

Getting master server information

  • By default, Sentinel sends the INFO command to each monitored master over the command connection once every 10 seconds, and parses the reply to obtain the master's current information (the master's own status and information about its slaves).

Getting slave server information

  • When Sentinel discovers a new slave, it creates an instance structure for it and also opens a command connection and a subscription connection to that slave

    • Once the command connection exists, Sentinel sends the INFO command to the slave every 10 seconds

      • From the reply it obtains the slave's run ID, role, priority, and replication offset, along with information about its connection to the master (the master's address and the link status)

Sending messages to master and slave servers

By default, every two seconds Sentinel uses the command connection to publish a message describing itself and the monitored master to the __sentinel__:hello channel of every monitored master and slave

PUBLISH __sentinel__:hello

Receiving channel messages from master and slave servers

  • Once Sentinel has established a subscription connection to a master or slave, it subscribes to that server's __sentinel__:hello channel over the connection (SUBSCRIBE __sentinel__:hello). The subscription lasts until Sentinel disconnects from the server.

  • When several Sentinels monitor the same server, a message published by one Sentinel is received by all the others. They use it to update what they know about the sending Sentinel and about the monitored server.

Detecting the subjectively down state

  • By default, Sentinel sends a PING command once per second to every instance it has a command connection with (masters, slaves, and other Sentinels), and judges from the reply whether the instance is online.
  • The down-after-milliseconds option in the Sentinel configuration file specifies how long an instance must go without a valid reply before it is judged subjectively down. If an instance keeps returning invalid replies for down-after-milliseconds milliseconds, Sentinel marks it as subjectively down by setting the SRI_S_DOWN flag in the flags of its instance structure (the S stands for subjective).
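
The subjective check boils down to "no valid reply for longer than down-after-milliseconds". A minimal sketch follows. The class MonitoredInstance and its methods are hypothetical names, and the set of replies treated as valid (+PONG, -LOADING, -MASTERDOWN) is taken from the Redis Sentinel documentation rather than from the notes above.

```python
VALID_REPLIES = {"+PONG", "-LOADING", "-MASTERDOWN"}


class MonitoredInstance:
    """What a Sentinel remembers about one monitored instance (toy model)."""

    def __init__(self, name, down_after_ms, now_ms):
        self.name = name
        self.down_after_ms = down_after_ms
        self.last_valid_reply_ms = now_ms
        self.flags = set()

    def on_ping_reply(self, reply, now_ms):
        # Invalid replies are ignored, so the timer keeps running.
        if reply in VALID_REPLIES:
            self.last_valid_reply_ms = now_ms

    def check_subjectively_down(self, now_ms):
        """Set or clear SRI_S_DOWN depending on the time since a valid reply."""
        if now_ms - self.last_valid_reply_ms > self.down_after_ms:
            self.flags.add("SRI_S_DOWN")
        else:
            self.flags.discard("SRI_S_DOWN")
        return "SRI_S_DOWN" in self.flags


inst = MonitoredInstance("mymaster", down_after_ms=30000, now_ms=0)
inst.on_ping_reply("+PONG", now_ms=1000)
down_at_20s = inst.check_subjectively_down(now_ms=20000)

inst.on_ping_reply("-ERR unexpected", now_ms=21000)   # not a valid reply
down_at_31s = inst.check_subjectively_down(now_ms=31500)

inst.on_ping_reply("+PONG", now_ms=32000)             # instance recovers
down_at_32s = inst.check_subjectively_down(now_ms=32500)
```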

Detecting the objectively down state

  • After Sentinel judges a master to be subjectively down, it asks the other Sentinels that monitor the same master whether they also consider it down, in order to confirm that the master really is offline. Once it has received enough down judgments from other Sentinels, it marks the master as objectively down, and a failover is performed on that master.
  • The SRI_O_DOWN flag is set in flags (the O stands for objective)
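
"Enough" down judgments can be expressed as a simple count against a threshold. In the sketch below, the threshold is called quorum, on the assumption that it corresponds to the quorum value configured for the monitored master, and the Sentinel's own judgment is counted as one vote. The function name is hypothetical.

```python
def is_objectively_down(own_opinion, other_opinions, quorum):
    """Decide SRI_O_DOWN from this Sentinel's view plus the others' answers.

    own_opinion:    True if this Sentinel already flagged SRI_S_DOWN.
    other_opinions: iterable of booleans, one per other Sentinel asked.
    quorum:         number of agreeing Sentinels required (assumed config value).
    """
    if not own_opinion:
        return False  # the question is only asked after a subjective-down verdict
    agreeing = 1 + sum(1 for opinion in other_opinions if opinion)
    return agreeing >= quorum


two_of_three = is_objectively_down(True, [True, False], quorum=2)
one_of_three = is_objectively_down(True, [False, False], quorum=2)
not_even_subjective = is_objectively_down(False, [True, True], quorum=2)
```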

How the leader Sentinel is elected

  • When a master is judged objectively down, the Sentinels monitoring it negotiate to elect a leader Sentinel, which performs the failover of the offline master.
SENTINEL is-master-down-by-addr <master_ip> <master_port> <current_epoch> <sentinel_runid>

Failover

  1. From the slaves of the offline master, select one and convert it into a master (promotion)
  2. Make all the other slaves of the offline master replicate the new master
  3. Set the offline master to be a slave of the new master, so that when it comes back online it replicates the new master (demotion).

Cluster

Redis Cluster is the distributed database solution provided by Redis. A cluster shares data through sharding and provides replication and failover.

This part covers cluster nodes, slot assignment, command execution, resharding, redirection, failover, and messages.

  • A node adds another node to its cluster through a handshake (CLUSTER MEET <ip> <port>)

  • The 16,384 slots in the cluster can be assigned to each node in the cluster, and each node records which slots are assigned to itself and which slots are assigned to other nodes.

  • The clusterState.slots array records which node every slot in the cluster is assigned to, so finding the owner of a slot takes O(1) time. The clusterNode.slots array records only the slots assigned to the node that the clusterNode structure represents. When nodes exchange slot assignments, a node therefore only has to send its clusterNode.slots array, and the receiver updates the matching clusterNode in its dictionary. Without clusterNode.slots, the sender would have to traverse every element of clusterState.slots to collect the slots assigned to itself before sending, which takes O(N) time. (Figure: the clusterState structure of node 7000.)

  • When a node receives a command request, it checks whether the slot of the key involved is assigned to itself (by looking at the clusterState.slots array). If not, it returns a MOVED error to the client. The information carried by the MOVED error directs the client to the node responsible for that slot.

  • Each node also uses a skip list to record the relationship between slots and keys (for example, to quickly find the database keys that belong to a given slot)

  • Resharding a Redis cluster is carried out by redis-trib. The core of resharding is moving all the key-value pairs that belong to a slot from one node to another.

  • If node A is migrating slot i to node B, then when node A cannot find the key named in a command in its own database, it returns an ASK error to the client, directing the client to node B to continue looking for the key.

  • A MOVED error means that responsibility for the slot has been moved from one node to another, whereas an ASK error is only a temporary measure used between two nodes while a slot is being migrated.

  • Slave nodes in the cluster replicate a master node and take over request processing when that master goes offline.

  • Nodes in a cluster communicate by sending and receiving messages. Common messages include MEET, PING, PONG, PUBLISH, and FAIL.
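
The slot lookup and the MOVED and ASK redirections can be sketched together. This is a toy model under several assumptions: the Node class and its fields are made-up names; a plain Python list stands in for clusterState.slots; the key-to-slot mapping (CRC16 of the key, masked to 16,384 slots) comes from the Redis Cluster specification rather than from the notes above; and hash tags in keys are ignored.

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 slots."""
    return crc16(key.encode()) & 16383


class Node:
    def __init__(self, addr, slots):
        self.addr = addr
        self.slots = slots    # slots[i] is the address of the owner of slot i
        self.db = {}
        self.migrating = {}   # slot -> address of the node it is moving to

    def get(self, key):
        slot = key_slot(key)
        owner = self.slots[slot]          # O(1) lookup, like clusterState.slots
        if owner != self.addr:
            return f"-MOVED {slot} {owner}"
        if key in self.db:
            return self.db[key]
        if slot in self.migrating:
            # Key not found here and the slot is being migrated: ask the target.
            return f"-ASK {slot} {self.migrating[slot]}"
        return None


A, B = "127.0.0.1:7000", "127.0.0.1:7001"
slots = [A] * 16384                # for brevity, every slot starts on node A
node_a, node_b = Node(A, slots), Node(B, slots)

slot = key_slot("msg")
node_a.db["msg"] = "hello"

hit = node_a.get("msg")            # node A owns the slot and has the key
moved = node_b.get("msg")          # node B redirects the client to node A

node_a.migrating[slot] = B         # slot is now being migrated to node B
del node_a.db["msg"]               # this key has already been moved
ask = node_a.get("msg")            # node A points the client at node B
```

The difference between the two errors shows in the last two calls: MOVED reflects the slot table itself, while ASK only appears while a migration is in progress and the key is no longer on the source node.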