
1. Scaling principle

Redis provides a flexible way to scale nodes in and out: without interrupting the cluster's service to clients, you can add nodes to expand the cluster or take nodes offline to shrink it.

In this article we work with a three-master, three-slave Redis Cluster (as shown in the figure below). The earlier post on Redis Cluster communication process analysis walked through how such a cluster is built, in detail and based on the source code.

This post focuses on the Redis cluster expansion and contraction process. First, set up the cluster shown in the figure, following the earlier introduction to building a Redis Cluster, and check the result:

127.0.0.1:6379> cluster nodes
29978c0169ecc0a9054de7f4142155c1ab70258b 127.0.0.1:6379 myself,master - 0 0 7 connected 0-5461
8f285670923d4f1c599ecc93367c95a30fb8bf34 127.0.0.1:6380 master - 0 1496717082785 3 connected 5462-10922
66478bda726ae6ba4e8fb55034d8e5e5804223ff 127.0.0.1:6381 master - 0 1496717085793 2 connected 10923-16383
961097d6be64ebd2fd739ff719e97565a8cee7b5 127.0.0.1:6382 slave 29978c0169ecc0a9054de7f4142155c1ab70258b 0 1496717084791 7 connected
6fb7dfdb6188a9fe53c48ea32d541724f36434e9 127.0.0.1:6383 slave 8f285670923d4f1c599ecc93367c95a30fb8bf34 0 1496717087797 4 connected
e0c7961a1b07ab655bc31d8dfd583da565ec167d 127.0.0.1:6384 slave 66478bda726ae6ba4e8fb55034d8e5e5804223ff 0 1496717086795 2 connected


The following figure shows the slot information of the master nodes:

2. Expand the cluster

Expanding a cluster is a common requirement in distributed storage. Expanding a Redis cluster involves three steps:

Prepare the new nodes

Join them to the cluster

Migrate slots and data

2.1 Preparing a New Node

We need two new nodes, listening on ports 6385 and 6386. Their configuration is basically the same as that of the existing cluster nodes; only the port differs, which keeps them easy to manage. The configuration of node 6385 is as follows:

port 6385                               // Port
cluster-enabled yes                     // Enable cluster mode
cluster-config-file nodes-6385.conf     // Cluster internal configuration file
cluster-node-timeout 15000              // Node timeout in milliseconds
// Other settings are the same as in standalone mode

Start the two nodes:

sudo redis-server conf/redis-6385.conf
sudo redis-server conf/redis-6386.conf

After startup, each new node runs as an orphan: it knows only about itself and does not yet communicate with any other node.
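You can confirm this before joining it to the cluster, for example by asking the new node how many nodes it knows about (a quick sketch):

redis-cli -p 6385 CLUSTER INFO | grep cluster_known_nodes
cluster_known_nodes:1
// the only known node is itself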

2.2 Joining a Cluster

We can add node 6385 to the cluster with the CLUSTER MEET command.

127.0.0.1:6379> CLUSTER MEET 127.0.0.1 6385
OK
127.0.0.1:6379> CLUSTER NODES
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 master - 0 1496731333689 8 connected
......

Alternatively, you can use Redis's cluster management tool redis-trib.rb, located in the Redis source directory, to add node 6386 to the cluster:

sudo src/redis-trib.rb add-node 127.0.0.1:6386 127.0.0.1:6379
127.0.0.1:6379> CLUSTER NODES
cdfb1656353c5c7f29d0330a754c71d53cec464c 127.0.0.1:6386 master - 0 1496731447703 0 connected
......


Either method works. A newly added node starts as a master, but because it has not been assigned any slots it cannot accept read or write operations (see the example after this list). There are two things we can do with a new node:

Migrate slots and data to it, expanding the cluster's capacity.

Make it a slave of an existing master so that it can take part in failover.
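For example, before any slots are assigned to it, the new master can only redirect requests. Asking it for a key whose slot belongs to another node returns a MOVED error (a sketch using slot 6918, which is owned by node 6380 in this cluster):

127.0.0.1:6385> GET key:{test}:555
(error) MOVED 6918 127.0.0.1:6380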

2.3 Migrating Slots and Data

Once the new node has joined the cluster, we can migrate slots and data to it. Slots can be migrated either with the redis-trib.rb tool or manually with commands; in either case, each master should end up responsible for a roughly equal share of slots. In practice this is usually done in batches with redis-trib.rb, but to show what the migration involves we will do it manually here.
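For reference, the batch version with redis-trib.rb would look roughly like this (a sketch; the non-interactive flags depend on your redis-trib.rb version, and the node IDs are the ones from this cluster):

src/redis-trib.rb reshard --from 8f285670923d4f1c599ecc93367c95a30fb8bf34 \
    --to cb987394a3acc7a5e606c72e61174b48e437cedb --slots 100 --yes 127.0.0.1:6379
// move 100 slots from node 6380 to node 6385 without interactive prompts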

We start by creating a few keys that belong to the same slot, and then migrate that slot to the new node.

127.0.0.1:6379> SET key:{test}:555 value:test:555
-> Redirected to slot [6918] located at 127.0.0.1:6380
OK
127.0.0.1:6380> SET key:{test}:666 value:test:666
OK
127.0.0.1:6380> SET key:{test}:777 value:test:777
OK
127.0.0.1:6380> CLUSTER KEYSLOT key:{test}:555
(integer) 6918
127.0.0.1:6380> CLUSTER KEYSLOT key:{test}:666
(integer) 6918
127.0.0.1:6380> CLUSTER KEYSLOT key:{test}:777
(integer) 6918

The first key was created against node 6379, but the client was redirected to node 6380, because CRC16 maps these keys to slot 6918, which node 6380 is responsible for.

If the key name contains {}, only the string inside the braces is hashed, so the three keys we created all belong to the same slot.
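You can check the hash-tag behaviour directly with CLUSTER KEYSLOT: since only the text inside the braces is hashed, any key carrying the {test} tag lands in the same slot as the bare string test (key:{test}:888 below is just another arbitrary key name used for illustration):

127.0.0.1:6380> CLUSTER KEYSLOT test
(integer) 6918
127.0.0.1:6380> CLUSTER KEYSLOT key:{test}:888
(integer) 6918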

The source code for calculating the hash value is as follows:

unsigned int keyHashSlot(char *key, int keylen) {
    int s, e; /* start-end indexes of { and } */
    // Find the '{' character
    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;
    // No '{' found: hash the entire key
    if (s == keylen) return crc16(key,keylen) & 0x3FFF;
    // '{' found: look for the matching '}'
    for (e = s+1; e < keylen; e++)
        if (key[e] == '}') break;
    // No '}' found, or "{}" is empty: hash the entire key
    if (e == keylen || e == s+1) return crc16(key,keylen) & 0x3FFF;
    // Otherwise hash only what lies between '{' and '}'
    return crc16(key+s+1,e-s-1) & 0x3FFF;
}

Now we know the slot we want to move: 6918. The migration proceeds as follows. First, set slot 6918 to the importing state on the target node, 6385:

127.0.0.1:6385> CLUSTER SETSLOT 6918 importing 8f285670923d4f1c599ecc93367c95a30fb8bf34
OK
// 8f285670923d4f1c599ecc93367c95a30fb8bf34 is the node ID of 6380

View the import status of slot 6918 on node 6385

127.0.0.1:6385> CLUSTER NODES
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 myself,master - 0 0 8 connected [6918-<-8f285670923d4f1c599ecc93367c95a30fb8bf34]

On the source 6380 node, set slot 6918 to the export state

127.0.0.1:6380> CLUSTER SETSLOT 6918 migrating cb987394a3acc7a5e606c72e61174b48e437cedb
OK
// cb987394a3acc7a5e606c72e61174b48e437cedb is the node ID of 6385

View the export status of slot 6918 on source node 6380

127.0.0.1:6380> CLUSTER NODES
8f285670923d4f1c599ecc93367c95a30fb8bf34 127.0.0.1:6380 myself,master - 0 0 3 connected 5462-10922 [6918->-cb987394a3acc7a5e606c72e61174b48e437cedb]

Fetch up to 5 keys from slot 6918 in one batch:

127.0.0.1:6380> CLUSTER GETKEYSINSLOT 6918 5
1) "key:{test}:555"
2) "key:{test}:666"
3) "key:{test}:777"

Verify that these three keys exist on the source 6380 node.

127.0.0.1:6380> MGET key:{test}:777 key:{test}:666 key:{test}:555
1) "value:test:777"
2) "value:test:666"
3) "value:test:555"

Run the MIGRATE command to move them to the target node:

127.0.0.1:6380> MIGRATE 127.0.0.1 6385 "" 0 1000 KEYS key:{test}:777 key:{test}:666 key:{test}:555
OK

This bulk form of MIGRATE (with the KEYS option) was added in Redis 3.0.6. The command's parameters are as follows:

MIGRATE host port key dbid timeout [COPY] [REPLACE]
MIGRATE host port "" dbid timeout [COPY] [REPLACE] KEYS key1 key2 ... keyN
// host port: address of the destination node
// dbid: the database number on the destination to migrate into
// timeout: migration timeout, in milliseconds
// COPY: do not delete the keys from the source node
// REPLACE: overwrite keys that already exist on the destination node
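For instance, copying a single key to the new node while keeping it on the source and overwriting any existing copy on the target would look roughly like this (a sketch only, with a made-up key name, not part of the walkthrough above):

127.0.0.1:6380> MIGRATE 127.0.0.1 6385 somekey 0 1000 COPY REPLACE
// somekey is a hypothetical key; COPY keeps it on 6380, REPLACE overwrites it on 6385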

When the migration is complete, querying these three keys on the source node 6380 returns an ASK error:

127.0.0.1:6380> MGET key:{test}:777 key:{test}:666 key:{test}:555
(error) ASK 6918 127.0.0.1:6385
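At this point a client is expected to follow the ASK redirect: connect to the node named in the error, send ASKING, then retry the command; the target node serves keys in an importing slot only after ASKING. A minimal sketch of the client side:

127.0.0.1:6385> ASKING
OK
127.0.0.1:6385> GET key:{test}:555
"value:test:555"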

Finally, send the CLUSTER SETSLOT <slot> NODE <target_node_id> command to any node to assign the slot to the target node; that node then propagates the new assignment to the rest of the cluster.

CLUSTER SETSLOT 6918 node cb987394a3acc7a5e606c72e61174b48e437cedb
// cb987394a3acc7a5e606c72e61174b48e437cedb is the node ID of 6385

Run the command on node 6381

127.0.0.1:6381> CLUSTER SETSLOT 6918 node cb987394a3acc7a5e606c72e61174b48e437cedb
OK

View the slot assignment information on node 6379

127.0.0.1:6379> CLUSTER NODES
29978c0169ecc0a9054de7f4142155c1ab70258b 127.0.0.1:6379 myself,master - 0 0 7 connected 0-5461
66478bda726ae6ba4e8fb55034d8e5e5804223ff 127.0.0.1:6381 master - 0 1496736248776 2 connected 10923-16383
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 master - 0 1496736244766 10 connected 6918
8f285670923d4f1c599ecc93367c95a30fb8bf34 127.0.0.1:6380 master - 0 1496736247773 3 connected 5462-6917 6919-10922
// Slave nodes and masters without assigned slots are omitted from the output

You can see that node 6380 is now responsible for slots 5462-6917 and 6919-10922, while slot 6918 is handled by node 6385.
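You can also verify the move from the client side: connect with redis-cli in cluster mode (-c) and request one of the migrated keys; the request is now redirected to 6385 (a quick sketch):

redis-cli -c -p 6379
127.0.0.1:6379> GET key:{test}:555
-> Redirected to slot [6918] located at 127.0.0.1:6385
"value:test:555"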

2.4 Adding a Slave Node

Earlier we added two new nodes to the cluster. Node 6385 has taken over slots and data as a master, but it has no slave yet and therefore no failover capability.

Now we make node 6386 a slave of node 6385 so that the cluster stays highly available. Use the CLUSTER REPLICATE <master_id> command on the new node to attach it to its master; in cluster mode, the SLAVEOF command is not supported.

127.0.0.1:6386> CLUSTER REPLICATE cb987394a3acc7a5e606c72e61174b48e437cedb
OK
127.0.0.1:6386> CLUSTER NODES
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 master - 0 1496742992748 10 connected 6918
cdfb1656353c5c7f29d0330a754c71d53cec464c 127.0.0.1:6386 myself,slave cb987394a3acc7a5e606c72e61174b48e437cedb 0 0 0 connected

This completes the expansion of the cluster. The cluster relationship is shown in the following figure:

3. Shrink the cluster

Shrinking a cluster means scaling it down: some nodes have to be taken offline safely. First check whether the node to be removed owns any slots; if it does, migrate those slots to other nodes so that the slot-to-node mapping stays complete after the node goes offline.

Once the node owns no slots, or if it is a slave, the other nodes in the cluster can be told to forget it. When every node has forgotten the offline node, it can be shut down normally.
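A quick way to check whether a node still owns slots is to look at its own line in the CLUSTER NODES output (a sketch; at this point 6385 still owns slot 6918, so that slot has to be migrated away first):

redis-cli -p 6385 CLUSTER NODES | grep myself
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 myself,master - 0 0 10 connected 6918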

This time we use the redis-trib.rb tool to migrate the slot before taking the node offline. The process is very similar to expansion, just in the opposite direction: 6380 is now the target node and 6385 the source node, shrinking the cluster we just expanded back to its original layout.

./redis-trib.rb reshard 127.0.0.1:6385
>>> Performing Cluster Check (using node 127.0.0.1:6385)
...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
// How many slots do you want to migrate
How many slots do you want to move (from 1 to 16384)? 1    // migrate one slot
// ID of the target node
What is the receiving node ID? 8f285670923d4f1c599ecc93367c95a30fb8bf34    // enter the ID of the target node 6380
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
// Enter the source nodes the slot is to be migrated from
// all: use every node as a source
// done: finish entering source nodes
Source node #1:cb987394a3acc7a5e606c72e61174b48e437cedb
Source node #2:done
.....
// Whether to execute the resharding plan immediately
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 6918 from 127.0.0.1:6385 to 127.0.0.1:6380: ...

Check out the results:

127.0.0.1:6380> CLUSTER NODES
8f285670923d4f1c599ecc93367c95a30fb8bf34 127.0.0.1:6380 myself,master - 0 0 11 connected 5462-10922
cb987394a3acc7a5e606c72e61174b48e437cedb 127.0.0.1:6385 master - 0 1496744498017 10 connected

Node 6380 has taken over the slots of node 6385.

Finally, have every node in the cluster forget the offline nodes: run CLUSTER FORGET <down_node_id> on each node, or use the tool.
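If you do it by hand, CLUSTER FORGET has to be sent to every remaining node, and fairly quickly: a forgotten node is only banned from gossip for about 60 seconds, after which the others would learn about it again. A minimal sketch for forgetting node 6386 (the del-node command below does this, plus the shutdown, automatically):

for port in 6379 6380 6381 6382 6383 6384 6385; do
    redis-cli -p $port CLUSTER FORGET cdfb1656353c5c7f29d0330a754c71d53cec464c
done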

./redis-trib.rb del-node 127.0.0.1:6379 cdfb1656353c5c7f29d0330a754c71d53cec464c
>>> Removing node cdfb1656353c5c7f29d0330a754c71d53cec464c from cluster 127.0.0.1:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
./redis-trib.rb del-node 127.0.0.1:6379 cb987394a3acc7a5e606c72e61174b48e437cedb
>>> Removing node cb987394a3acc7a5e606c72e61174b48e437cedb from cluster 127.0.0.1:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

Note that the slave node is taken offline first and its master afterwards, to avoid unnecessary full replication. Once the offline nodes are forgotten via 6379, the other nodes in the cluster forget them as well.

127.0.0.1:6380> CLUSTER NODES
6fb7dfdb6188a9fe53c48ea32d541724f36434e9 127.0.0.1:6383 slave 8f285670923d4f1c599ecc93367c95a30fb8bf34 0 1496744890808 11 connected
29978c0169ecc0a9054de7f4142155c1ab70258b 127.0.0.1:6379 master - 0 1496744892814 7 connected 0-5461
66478bda726ae6ba4e8fb55034d8e5e5804223ff 127.0.0.1:6381 master - 0 1496744891810 2 connected 10923-16383
e0c7961a1b07ab655bc31d8dfd583da565ec167d 127.0.0.1:6384 slave 66478bda726ae6ba4e8fb55034d8e5e5804223ff 0 1496744888804 2 connected
8f285670923d4f1c599ecc93367c95a30fb8bf34 127.0.0.1:6380 myself,master - 0 0 11 connected 5462-10922
961097d6be64ebd2fd739ff719e97565a8cee7b5 127.0.0.1:6382 slave 29978c0169ecc0a9054de7f4142155c1ab70258b 0 1496744889805 7 connected

The master on port 6380 no longer lists the removed nodes, so they have been taken offline safely.