This is the 16th day of my participation in the August More Text Challenge.

1. Introduction

1.1 Features

  • Fast

  • Multiple data structures

  • Feature-rich

  • Simple and stable

  • Clients for many programming languages

  • Persistence support

  • High-availability architecture

1.2 Application Scenarios

  • Key expiration: caches, session storage, coupon expiration

  • Lists: leaderboards

  • Counters: post view counts, video play counts, comment counts

  • Sets: interest tags, ad targeting

  • Message queue: for example, in an ELK pipeline

2. Installation and deployment

2.1 Directory Planning

  • Redis download directory: /data/soft/

  • Redis installation directory: /opt/redis_cluster/redis_{PORT}/{conf,logs,pid}

  • Redis data directory: /data/redis_cluster/redis_{PORT}/redis_{PORT}.rdb

  • Redis operations script: /root/scripts/redis_shell.sh

2.2 Redis installation

mkdir -p /data/soft
mkdir -p /data/redis_cluster/redis_6379
mkdir -p /opt/redis_cluster/redis_6379/{conf,pid,logs}
cd /data/soft/
wget http://download.redis.io/releases/redis-3.2.9.tar.gz
tar zxf redis-3.2.9.tar.gz -C /opt/redis_cluster/
ln -s /opt/redis_cluster/redis-3.2.9/ /opt/redis_cluster/redis
cd /opt/redis_cluster/redis
make && make install

make install copies the Redis executables to the /usr/local/bin directory. After it runs, the following output is displayed:


INSTALL install

INSTALL install

INSTALL install

INSTALL install

INSTALL install


2.3 Modifying the Configuration File

When writing a Redis configuration file, the best reference is the official one. To generate it, run the install_server.sh script in the utils directory:

cd /opt/redis_cluster/redis-3.2.9/utils/
./install_server.sh

The following output information is displayed:


Welcome to the redis service installer

This script will help you easily set up a running redis server

Please select the redis port for this instance: [6379]

Selecting default: 6379

Please select the redis config file name [/etc/redis/6379.conf]

Selected default - /etc/redis/6379.conf

Please select the redis log file name [/var/log/redis_6379.log]

Selected default - /var/log/redis_6379.log

Please select the data directory for this instance [/var/lib/redis/6379]

Selected default - /var/lib/redis/6379

Please select the redis executable path [/usr/local/bin/redis-server]

Selected config:

Port : 6379

Config file : /etc/redis/6379.conf

Log file : /var/log/redis_6379.log

Data dir : /var/lib/redis/6379

Executable : /usr/local/bin/redis-server

Cli Executable : /usr/local/bin/redis-cli

Is this ok? Then press ENTER to go on or Ctrl-C to abort.

Copied /tmp/6379.conf => /etc/init.d/redis_6379

Installing service...

Successfully added to chkconfig!

Successfully added to runlevels 345!

Starting Redis server...

Installation successful!


The generated configuration file is /etc/redis/6379.conf. We use it as a reference.

Next, let’s configure our own configuration file:


cd /opt/redis_cluster/redis_6379/conf

vim redis_6379.conf

daemonize yes
bind 172.21.0.3
port 6379
# Locations of the pid file and the log file
pidfile /opt/redis_cluster/redis_6379/pid/redis_6379.pid
logfile /opt/redis_cluster/redis_6379/logs/redis_6379.log
# Number of databases (the default database is 0)
databases 16
# RDB file name
dbfilename redis_6379.rdb
# Local database directory
dir /data/redis_cluster/redis_6379

2.4 Starting the Server


redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf

Check whether the startup is successful:

[root@VM-0-3-centos conf]# ps -ef | grep redis
root 6763 1 0 10:53 ? 00:00:00 /usr/local/bin/redis-server 127.0.0.1:6379
root 10626 1 0 11:12 ? 00:00:00 redis-server 172.21.0.3:6379
root 11440 27666 0 11:14 pts/0 00:00:00 grep --color=auto redis

2.5 Shutting down the Server


[root@VM-0-3-centos conf]# redis-cli

127.0.0.1:6379> SHUTDOWN


3. Redis basic operation commands

Establish a connection:

redis-cli -h 172.21.0.3

3.1 Global Commands

  1. View all keys (very dangerous; do not run this in production)

KEYS *

  2. View the total number of keys (DBSIZE does not iterate over all keys; it reads a key counter that Redis maintains internally)

DBSIZE

  3. Check whether a key exists (returns 1 if it does, 0 if it does not)

EXISTS key

  4. Delete a key

DEL key

  5. Key expiration

  • Redis supports adding an expiration time to a key. Once that time has passed, the key is deleted automatically.

  • Run the TTL command to view the remaining lifetime of a key:

  • >=0: the remaining time before the key expires

  • -1: no expiration time is set

  • -2: the key does not exist

  • Note: if you SET a key again after giving it an expiration time, the expiration is cleared and the key never expires. In a coupon feature, for example, such an overwrite would make the coupon permanent.

EXPIRE key seconds

TTL key

  6. Remove the expiration time

PERSIST key

  7. View the data type of a key

TYPE key
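
The expiration rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not how Redis implements expiration: the `ExpiringStore` class, its manual clock `now`, and the `coupon` key are all hypothetical names introduced for the example.

```python
class ExpiringStore:
    """Toy model of Redis key expiration, driven by a manual clock."""

    def __init__(self):
        self.now = 0          # simulated time in seconds
        self.values = {}      # key -> value
        self.deadlines = {}   # key -> absolute expiry time

    def _purge(self, key):
        # Drop the key if its deadline has passed.
        deadline = self.deadlines.get(key)
        if deadline is not None and self.now >= deadline:
            self.values.pop(key, None)
            self.deadlines.pop(key, None)

    def set(self, key, value):
        # A plain SET overwrites the value and clears any expiration.
        self.values[key] = value
        self.deadlines.pop(key, None)

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self.values:
            return 0
        self.deadlines[key] = self.now + seconds
        return 1

    def ttl(self, key):
        self._purge(key)
        if key not in self.values:
            return -2                      # key does not exist
        if key not in self.deadlines:
            return -1                      # no expiration set
        return self.deadlines[key] - self.now

    def persist(self, key):
        self._purge(key)
        return 1 if self.deadlines.pop(key, None) is not None else 0


store = ExpiringStore()
store.set("coupon", "10-off")
store.expire("coupon", 60)
store.now += 20                            # 20 seconds pass
ttl_after_20s = store.ttl("coupon")        # 40 seconds left
store.set("coupon", "20-off")              # overwrite clears the expiration
ttl_after_overwrite = store.ttl("coupon")  # -1: the coupon no longer expires
```

The last two lines reproduce the coupon pitfall from the note: after the overwrite, the expiration has to be set again with EXPIRE.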

3.2 Strings

  1. Set a key

172.21.0.3:6379> SET key1 value

OK

  2. Get a key

172.21.0.3:6379> GET key1

"value"

  3. Set multiple keys at once

172.21.0.3:6379> MSET key2 value2 key3 value3

OK

  4. Get multiple keys at once

172.21.0.3:6379> MGET key1 key2 key3

1) "value"

2) "value2"

3) "value3"

  5. Increment and decrement a value (INCR parses the stored string as an integer)

172.21.0.3:6379> INCR k1

(integer) 101

172.21.0.3:6379> INCRBY k1 10

(integer) 111

172.21.0.3:6379> DECR k1

(integer) 110

172.21.0.3:6379> DECRBY k1 10

(integer) 100
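
As a rough illustration of "INCR parses the string", the counter commands can be modelled as follows. This is a sketch, not Redis code: the `incrby` helper and the plain dictionary standing in for the keyspace are assumptions, and the starting value of 100 for k1 is inferred from the output above.

```python
def incrby(store, key, amount=1):
    """Add amount to the integer stored as a string under key."""
    current = int(store.get(key, "0"))   # a missing key counts as 0
    store[key] = str(current + amount)   # the result is stored back as a string
    return current + amount


store = {"k1": "100"}
results = [
    incrby(store, "k1"),        # INCR k1
    incrby(store, "k1", 10),    # INCRBY k1 10
    incrby(store, "k1", -1),    # DECR k1
    incrby(store, "k1", -10),   # DECRBY k1 10
]
print(results)  # [101, 111, 110, 100]
```

In this model a value that is not an integer string, such as "value", makes `int()` raise a ValueError; Redis likewise rejects INCR on such a value with an error.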

3.3 Lists

  1. Add a new element to the left (head) of the list with LPUSH

172.21.0.3:6379> LPUSH list1 A

(integer) 1

172.21.0.3:6379> LPUSH list1 B

(integer) 2

172.21.0.3:6379> TYPE list1

list

  2. Add a new element to the right (tail) of the list with RPUSH

172.21.0.3:6379> RPUSH list1 C

(integer) 3

172.21.0.3:6379> RPUSH list1 D

(integer) 4

  3. View the list length

172.21.0.3:6379> LLEN list1

(integer) 4

  4. View the data

172.21.0.3:6379> LRANGE list1 0 -1

1) "B"

2) "A"

3) "C"

4) "D"

172.21.0.3:6379> LRANGE list1 0 1

1) "B"

2) "A"

  5. Remove an element from the right

172.21.0.3:6379> RPOP list1

"D"

172.21.0.3:6379> LRANGE list1 0 -1

1) "B"

2) "A"

3) "C"

  6. Remove an element from the left

172.21.0.3:6379> LPOP list1

"B"

172.21.0.3:6379> LRANGE list1 0 -1

1) "A"

2) "C"

3.4 Hashes

  1. Set hash fields

172.21.0.3:6379> HMSET user1 name wys age 20

OK

  2. Get one field of a hash

172.21.0.3:6379> HGET user1 name

"wys"

  3. Get all fields and values of a hash

172.21.0.3:6379> hgetall user1

1) "name"

2) "wys"

3) "age"

4) "20"

3.5 Sets

  1. Add elements to a set (unlike lists, sets do not allow duplicate data)

172.21.0.3:6379> SADD set1 3 4 5 5

(integer) 3

  2. Get the members of a set

172.21.0.3:6379> SMEMBERS set1

1) "3"

2) "4"

3) "5"

  3. Remove a given value from a set

172.21.0.3:6379> SREM set1 5

(integer) 1

172.21.0.3:6379> SMEMBERS set1

1) "3"

2) "4"

  4. Compute the set difference (members of the first set that are not in the second)

172.21.0.3:6379> SADD set2 3 5 6

(integer) 3

172.21.0.3:6379> SADD set3 5 7

(integer) 2

172.21.0.3:6379> SDIFF set2 set3

1) "3"

2) "6"

  5. Compute the set intersection (members that both sets have)

172.21.0.3:6379> SINTER set2 set3

  6. Compute the set union (members that appear in any of the sets)

172.21.0.3:6379> SUNION set2 set3

1) "3"

2) "5"

3) "6"

4) "7"
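
The three set operations map directly onto Python's built-in set type, which makes it easy to check the expected results for set2 and set3 by hand. This is only an analogy for the semantics; the `sadd` helper is a hypothetical stand-in for SADD's return value (the number of members actually added).

```python
def sadd(target, *members):
    """Add members to a set and return how many were new, like SADD does."""
    before = len(target)
    target.update(members)
    return len(target) - before


set1 = set()
added = sadd(set1, "3", "4", "5", "5")   # the duplicate "5" is counted once

set2 = {"3", "5", "6"}
set3 = {"5", "7"}

difference = set2 - set3      # SDIFF set2 set3
intersection = set2 & set3    # SINTER set2 set3
union = set2 | set3           # SUNION set2 set3

print(added)                  # 3
print(sorted(difference))     # ['3', '6']
print(sorted(intersection))   # ['5']
print(sorted(union))          # ['3', '5', '6', '7']
```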

4. Redis persistence

4.1 RDB

RDB persistence generates a snapshot of the current in-memory data and saves it to disk. It can be triggered manually or automatically.

4.1.1 Manual Triggering

  1. Insert data

172.21.0.3:6379> set k1 v1

OK

  2. Manually save the RDB file

172.21.0.3:6379> BGSAVE

Background saving started

  3. The persistence file now appears in the data directory

cd /data/redis_cluster/redis_6379
ll
total 4
-rw-r--r-- 1 root root 88 Mar 8 15:34 redis_6379.rdb

  4. Shut down Redis

172.21.0.3:6379> SHUTDOWN

not connected> exit

  5. Start Redis and connect

redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf

redis-cli -h 172.21.0.3

  6. Check the data

172.21.0.3:6379> keys *

1) "k1"

Note: if Redis finds an .rdb file in its data directory at startup, it loads that file to recover the data.

4.1.2 Automatic Trigger

Add the following content to the configuration file:

# Run BGSAVE if at least 1 update occurred within 900 seconds (15 minutes)
save 900 1
# Run BGSAVE if at least 10 updates occurred within 300 seconds (5 minutes)
save 300 10
# Run BGSAVE if at least 10000 updates occurred within 60 seconds (1 minute)
save 60 10000
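
How the three save rules combine can be sketched as a small check: a background save is due as soon as any one rule is satisfied. This is a simplified model rather than the Redis source; the `should_bgsave` function and its parameter names are hypothetical, and the exact boundary handling inside Redis may differ.

```python
# (seconds, changes) pairs, mirroring "save 900 1", "save 300 10", "save 60 10000"
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if any rule's time window and change count are both reached."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


print(should_bgsave(61, 10000))   # True: the "save 60 10000" rule fires
print(should_bgsave(61, 9999))    # False: too few changes for any rule so far
print(should_bgsave(900, 1))      # True: the "save 900 1" rule fires
```

The rules are alternatives, not a sequence: a busy instance is caught by the 60-second rule, while a nearly idle one is still saved by the 900-second rule.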

4.1.3 Advantages

  1. It can be used for backup and restoration. An RDB file is a compact binary file that represents a snapshot of the Redis data at a point in time. It is ideal for backup and full-recovery scenarios, such as running BGSAVE every 6 hours and copying the RDB file to a remote machine for disaster recovery.

  2. Fast recovery. Redis restores data from an RDB file much faster than from an AOF file.

4.1.4 Disadvantages

  1. There is no way to do real-time or per-second persistence. Because BGSAVE forks a child process every time it runs, it is a heavyweight operation that is costly to execute frequently.

  2. The RDB file format may not be compatible between versions. RDB files are saved in a specific binary format, and several RDB versions have appeared as Redis evolved, so an older Redis may be unable to read an RDB file written by a newer one.

4.1.5 Special notes

  1. Running SHUTDOWN is effectively two steps: Redis first saves the data set to disk (when save points are configured) and then shuts down. That is why the data is not lost even though we never ran BGSAVE ourselves.

  2. The same is true for a plain kill. It is handled as a normal exit: Redis finishes what it is doing, saves, and only then exits.

  3. kill -9 loses data. The process is terminated immediately, whatever it is doing.

4.2 AOF

The AOF storage mode records all write commands executed by the server and restores the data set by re-executing these commands when the server starts up.

All commands in the AOF file are saved in the Redis protocol format, and new commands are appended to the end of the file.

4.2.1 Configuration File

Add the following to the configuration file. Note that the two persistence modes can coexist.

# Enable AOF persistence
appendonly yes
# always: every command is immediately synchronized to the AOF
# appendfsync always
# everysec: synchronize once per second
appendfsync everysec
# no: hand the write to the operating system, which decides when to flush
# appendfsync no
appendfilename "appendonly.aof"

4.2.2 Workflow

  • Every command is stored by appending it to the file

  • When the file grows too large, Redis rewrites it and drops commands that no longer affect the final data

  1. set k1 v1

  2. set k2 v2

  3. del k1

  4. get k2

The rewrite drops records 1, 3, and 4: k1 was set and then deleted, and get does not change any data, so only record 2 is needed to rebuild the data set.
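
The idea behind the rewrite can be sketched by replaying the log into a dictionary and then emitting one command per surviving key. This is a toy model under simplifying assumptions (only set, del, and get are understood, and values contain no spaces); the `rewrite_log` function is hypothetical and is not how Redis implements AOF rewriting internally.

```python
def rewrite_log(commands):
    """Replay a command log and return the minimal commands that rebuild the data."""
    state = {}
    for command in commands:
        name, *args = command.split()
        name = name.lower()
        if name == "set":
            key, value = args
            state[key] = value
        elif name == "del":
            for key in args:
                state.pop(key, None)
        # Read-only commands such as get leave the state untouched.
    return [f"set {key} {value}" for key, value in state.items()]


log = ["set k1 v1", "set k2 v2", "del k1", "get k2"]
compacted = rewrite_log(log)
print(compacted)  # ['set k2 v2']
```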

4.2.3 Advantages

  • It keeps data loss to a minimum

4.2.4 Disadvantages

  • Data recovery using AOF is slow

  • The log file is large

4.2.5 Special notes

  • When both an RDB file and an AOF file exist, the AOF file is used to recover data.

The specific test method is as follows:

  1. Copy the old RDB file.

  2. Write new data.

  3. Run the SHUTDOWN command.

  4. Overwrite the new RDB file with the old RDB file.

  5. Restart Redis.

  6. Check whether the newly inserted data is present in Redis. If it is, AOF was used to restore the data. If not, RDB was used.

4.3 Interview questions

  1. What are the Redis persistence methods, and how do they differ?

  • There are two methods, RDB and AOF.

  • RDB recovers data quickly, and AOF recovers data slowly.

  • RDB is not real-time. AOF is close to real-time; some data may still be lost, but the loss is kept to a minimum.

  • The AOF log records a large number of commands and is comparable to MySQL's binlog.

  • The AOF format is portable across versions, while an RDB file written by a different version may not be readable.

  2. In RDB persistence, why is the data not lost after SHUTDOWN?

  • Running SHUTDOWN is effectively two steps, a save followed by the shutdown itself, so the data is persisted.

  • The same is true for kill.

  • kill -9 exits immediately and loses data.

  3. When both RDB and AOF persistence exist, which one is used to recover data?

  • AOF is preferred for data recovery; this can be verified with a test.

5. Redis security mode

Redis enables protected mode by default, which allows only connections from the local machine to log in and access the database.

Disabling protected mode

# Protected mode: whether to allow only local access

protected-mode yes/no

  1. bind: specify the IP addresses to listen on

vim /opt/redis_cluster/redis_6379/conf/redis_6379.conf

bind 172.21.0.3 127.0.0.1

  2. Add requirepass {password}

vim /opt/redis_cluster/redis_6379/conf/redis_6379.conf

requirepass 123456

  3. Verify the password

[root@VM-0-3-centos conf]# redis-cli -h 172.21.0.3

172.21.0.3:6379> set k1 v1

(error) NOAUTH Authentication required.

172.21.0.3:6379> auth 123456

OK

172.21.0.3:6379> set k1 v1

OK

172.21.0.3:6379> get k1

"v1"

6. Redis primary/secondary replication

6.1 Setting up a Primary/Secondary Replication Environment

Redis master-slave replication is normally tested on two machines. Because I only have one machine, I set up both the master and the slave on it.


mkdir -p /home/service/

mkdir -p /home/work/

mkdir -p /home/work/redis_cluster/redis_6380

mkdir -p /home/service/redis_cluster/redis_6380/{conf,pid,logs}

cp /opt/redis_cluster/redis_6379/conf/redis_6379.conf /home/service/redis_cluster/redis_6380/conf/redis_6380.conf


The contents of the configuration file /home/service/redis_cluster/redis_6380/conf/redis_6380.conf are as follows:

daemonize yes
bind 172.21.0.3
port 6380
# Locations of the pid file and the log file
pidfile /home/service/redis_cluster/redis_6380/pid/redis_6380.pid
logfile /home/service/redis_cluster/redis_6380/logs/redis_6380.log
# Number of databases (the default database is 0)
databases 16
# RDB file name; the default value is dump.rdb
dbfilename redis_6380.rdb
save 900 1
save 300 10
save 60 10000
# Directory of the local database
dir /home/work/redis_cluster/redis_6380
# AOF settings (commented out on this instance)
# every command is immediately synchronized to the AOF
#appendfsync always
# write once per second
#appendfsync everysec
# write to the operating system; the operating system determines the buffer size
#appendfsync no
#appendfilename "appendonly.aof"

6.2 Testing primary/Secondary Replication

  1. Start the master on port 6379 and log in

redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf

redis-cli -h 172.21.0.3 -p 6379

  2. Start the slave on port 6380 and log in

redis-server /home/service/redis_cluster/redis_6380/conf/redis_6380.conf

redis-cli -h 172.21.0.3 -p 6380

  3. Query data on the slave

172.21.0.3:6380> get k1

(nil)

  4. Set up master-slave replication (run on the slave)

172.21.0.3:6380> SLAVEOF 172.21.0.3 6379

OK

  5. Insert data on the master

172.21.0.3:6379> set k1 v1

OK

  6. Read the data on the slave

172.21.0.3:6380> get k1

"v1"

  7. Insert data on the slave (not allowed)

172.21.0.3:6380> set k2 v2

(error) READONLY You can't write against a read only slave.

  8. Stop replication (after the slave disconnects, it keeps the data it already has but no longer receives changes from the master)

172.21.0.3:6380> slaveof no one

OK

  9. After replication is stopped, data can be inserted on the slave

172.21.0.3:6380> set k3 v3

OK

6.3 Viewing Primary/Secondary Replication Logs

6.3.1 Master library logs


cd /opt/redis_cluster/redis_6379/logs/

tail redis_6379.log

7649:M 08 Mar 17:45:36.277 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
7649:M 08 Mar 17:45:36.277 * The server is now ready to accept connections on port 6379
7649:M 08 Mar 17:46:57.983 * Slave 172.21.0.3:6380 asks for synchronization
7649:M 08 Mar 17:46:57.983 * Full resync requested by slave 172.21.0.3:6380
7649:M 08 Mar 17:46:57.983 * Starting BGSAVE for SYNC with target: disk
7649:M 08 Mar 17:46:57.983 * Background saving started by pid 9619
9619:C 08 Mar 17:46:58.043 * DB saved on disk
9619:C 08 Mar 17:46:58.043 * RDB: 0 MB of memory used by copy-on-write
7649:M 08 Mar 17:46:58.088 * Background saving terminated with success
7649:M 08 Mar 17:46:58.088 * Synchronization with slave 172.21.0.3:6380 succeeded

  • Full resync requested by slave 172.21.0.3:6380: the slave has connected to the master and requested a full resynchronization

  • Starting BGSAVE for SYNC with target: disk: the master starts persisting its data to disk

  • Background saving started by pid 9619: a child process is started to persist the data

  • DB saved on disk: the persisted data has been written to disk

  • Background saving terminated with success: the data was persisted successfully

  • Synchronization with slave 172.21.0.3:6380 succeeded: the data has been synchronized to the slave

6.3.2 Secondary library logs


cd /home/service/redis_cluster/redis_6380/logs/

tail redis_6380.log

8206:S 08 Mar 17:46:57.982 * Connecting to MASTER 172.21.0.3:6379
8206:S 08 Mar 17:46:57.982 * MASTER <-> SLAVE sync started
8206:S 08 Mar 17:46:57.982 * Non blocking connect for SYNC fired the event.
8206:S 08 Mar 17:46:57.982 * Master replied to PING, replication can continue...
8206:S 08 Mar 17:46:57.982 * Partial resynchronization not possible (no cached master)
8206:S 08 Mar 17:46:57.983 * Full resync from master: a2fc4c71b2f3803e0c7f11f68a124396981e3454:1
8206:S 08 Mar 17:46:58.088 * MASTER <-> SLAVE sync: receiving 76 bytes from master
8206:S 08 Mar 17:46:58.088 * MASTER <-> SLAVE sync: Flushing old data
8206:S 08 Mar 17:46:58.088 * MASTER <-> SLAVE sync: Loading DB in memory
8206:S 08 Mar 17:46:58.088 * MASTER <-> SLAVE sync: Finished with success

  • Connecting to MASTER 172.21.0.3:6379: the slave is establishing a connection with the master

  • MASTER <-> SLAVE sync started: master-slave synchronization has started

  • MASTER <-> SLAVE sync: receiving 76 bytes from master: the slave is receiving 76 bytes from the master; 76 bytes is the size of the RDB file

  • MASTER <-> SLAVE sync: Flushing old data: the slave clears its old data. Data that the slave held before replication is overwritten; you can test this by putting data on the slave first and checking that it is gone after replication.

  • MASTER <-> SLAVE sync: Loading DB in memory: the slave loads the data sent by the master into memory

  • MASTER <-> SLAVE sync: Finished with success: synchronization between the master and the slave succeeded

6.4 Master/Slave Synchronization Process

  1. The slave sends a synchronization request

  2. The master receives the request and runs BGSAVE to save its current in-memory data to disk

  3. The master sends the persisted data to the slave's data directory

  4. After receiving the persisted data from the master, the slave clears all data in its own memory

  5. The slave loads the persistence file sent by the master into its own memory
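
The five steps above can be traced with a toy in-memory model. It covers only the initial full synchronization and ignores the ongoing command stream that follows it; the `Node` class and its `memory` and `disk` attributes are hypothetical names for this sketch, not Redis internals.

```python
import copy


class Node:
    """Toy Redis node with an in-memory data set and an on-disk snapshot."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk = {}
        self.master = None

    def bgsave(self):
        # Persist the current in-memory data as a snapshot.
        self.disk = copy.deepcopy(self.memory)
        return self.disk

    def slaveof(self, master):
        self.master = master                      # step 1: request synchronization
        snapshot = master.bgsave()                # step 2: master saves to disk
        self.disk = copy.deepcopy(snapshot)       # step 3: snapshot is transferred
        self.memory.clear()                       # step 4: slave flushes old data
        self.memory.update(copy.deepcopy(self.disk))  # step 5: slave loads snapshot


master = Node("6379")
master.memory["k1"] = "v1"

slave = Node("6380")
slave.memory["old"] = "data"      # data the slave held before replication

slave.slaveof(master)
print(slave.memory)               # {'k1': 'v1'}
```

Step 4 is the reason for backing up a slave before attaching it: whatever it held before replication is gone afterwards.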

6.5 Conclusion

  1. Back up the data before setting up master-slave replication

  2. It is recommended to write the master-slave replication settings into the configuration file

  3. Set up master-slave replication during off-peak business hours

  4. Copying the data consumes bandwidth

  5. Master-slave switchover is not automatic; manual intervention is required

7. Sentinel

7.1 Introduction to Sentinel

In Redis master-slave mode, once the master node fails and can no longer provide service, someone has to promote a slave node to master by hand and update the client configuration at the same time. This is unacceptable for many application scenarios.

The Sentinel architecture removes the need for manual intervention in a Redis master-slave setup.

Redis Sentinel is the high-availability implementation of Redis, and it helps improve overall system availability in a real production environment.

7.2 Main functions of Sentinel

Redis Sentinel is a distributed system that provides high availability for Redis: automatic failover can occur without human intervention.

The Redis Sentinel system manages multiple Redis server instances. It has the following three functions:

  1. Monitoring

Sentinel regularly checks whether your primary and secondary servers are running properly.

  2. Notification

Sentinel can send notifications to administrators or other applications via the API when a monitored Redis server has a problem.

  3. Automatic failover

When a primary server fails, Sentinel starts an automatic failover: it promotes one of the failed primary's secondary servers to be the new primary and makes the other secondary servers replicate from it. When a client tries to connect to the failed primary, it is given the address of the new primary, so the new primary takes the place of the failed server.

The Redis Sentinel architecture is as follows:

When the master goes down, the switchover steps are: pick one of the slaves to leave master-slave replication and become the new master, then change the other slave to replicate from it.

slave 1: slaveof no one
slave 2: slaveof slave1_ip slave1_port

Sentinel is a special kind of node, and every node can be monitored.
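
The switchover steps can be written down as a small function over a replication map. This is only a sketch of the bookkeeping, not of how Sentinel elects the new master; the `fail_over` function and the dictionary layout (node address mapped to its master, or None for the master itself) are assumptions made for the example.

```python
def fail_over(replication, failed_master, new_master):
    """Return the replication map after promoting new_master."""
    updated = {}
    for node, master in replication.items():
        if node == failed_master:
            continue                      # the failed master drops out
        if node == new_master:
            updated[node] = None          # equivalent to: slaveof no one
        else:
            updated[node] = new_master    # equivalent to: slaveof <new master>
    return updated


topology = {
    "172.21.0.3:6379": None,                  # current master
    "172.21.0.3:6380": "172.21.0.3:6379",     # slave 1
    "172.21.0.3:6381": "172.21.0.3:6379",     # slave 2
}

after = fail_over(topology, "172.21.0.3:6379", "172.21.0.3:6380")
print(after)
# {'172.21.0.3:6380': None, '172.21.0.3:6381': '172.21.0.3:6380'}
```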

7.3 Directory Planning

role IP port
Master 172.21.0.3 6379
Slave1 172.21.0.3 6380
Slave2 172.21.0.3 6381
Sentinel-01 172.21.0.3 26379
Sentinel-02 172.21.0.3 26380
Sentinel-03 172.21.0.3 26381

7.4 Installation and Configuration Commands

Sentinel mode is based on master slave replication, so master slave replication needs to be deployed first.

Since we are deploying on the same machine, we create two more similar directories with different port numbers.

7.4.1 Creating a Configuration File


mkdir -p /opt/redis_cluster/redis_6380/

mkdir -p /opt/redis_cluster/redis_6381/

cp -r /opt/redis_cluster/redis_6379/* /opt/redis_cluster/redis_6380/

cp -r /opt/redis_cluster/redis_6379/* /opt/redis_cluster/redis_6381/

rm /opt/redis_cluster/redis_6380/logs/redis_6379.log

rm /opt/redis_cluster/redis_6381/logs/redis_6379.log

rm /opt/redis_cluster/redis_6380/pid/redis_6379.pid

rm /opt/redis_cluster/redis_6381/pid/redis_6379.pid

mv /opt/redis_cluster/redis_6380/conf/redis_6379.conf /opt/redis_cluster/redis_6380/conf/redis_6380.conf

mv /opt/redis_cluster/redis_6381/conf/redis_6379.conf /opt/redis_cluster/redis_6381/conf/redis_6381.conf

vim /opt/redis_cluster/redis_6380/conf/redis_6380.conf

:%s/6379/6380/g

vim /opt/redis_cluster/redis_6381/conf/redis_6381.conf

:%s/6379/6381/g

mkdir -p /data/redis_cluster/redis_6380/

mkdir -p /data/redis_cluster/redis_6381/


The Sentinel ports are the data-node ports prefixed with a 2, namely 26379, 26380, and 26381. Let's create their directories.


mkdir -p /data/redis_cluster/redis_26379

mkdir -p /data/redis_cluster/redis_26380

mkdir -p /data/redis_cluster/redis_26381

mkdir -p /opt/redis_cluster/redis_26379/{conf,pid,logs}

mkdir -p /opt/redis_cluster/redis_26380/{conf,pid,logs}

mkdir -p /opt/redis_cluster/redis_26381/{conf,pid,logs}


Modify the configuration file. The Sentinel configuration file is similar to a regular Redis configuration file, with a few additional lines.

vim /opt/redis_cluster/redis_26379/conf/redis_26379.conf

The configuration file is as follows:

bind 172.21.0.3
port 26379
daemonize yes
logfile /opt/redis_cluster/redis_26379/logs/redis_26379.log
dir /data/redis_cluster/redis_26379
sentinel monitor mymaster 172.21.0.3 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 18000

Let’s copy two more configuration files and modify them.


cp /opt/redis_cluster/redis_26379/conf/redis_26379.conf /opt/redis_cluster/redis_26380/conf/redis_26380.conf

cp /opt/redis_cluster/redis_26379/conf/redis_26379.conf /opt/redis_cluster/redis_26381/conf/redis_26381.conf

vim /opt/redis_cluster/redis_26380/conf/redis_26380.conf

:%s/26379/26380/g

vim /opt/redis_cluster/redis_26381/conf/redis_26381.conf

:%s/26379/26381/g


7.4.2 Configuration File Description

  • sentinel monitor mymaster 172.21.0.3 6379 2: mymaster is the alias of the master node, followed by the master's IP address and port. The 2 means that two Sentinel nodes must agree before the master is judged to have failed.

  • sentinel down-after-milliseconds mymaster 3000: the number of milliseconds without a valid reply after which a Sentinel considers the server disconnected.

  • sentinel parallel-syncs mymaster 1: the number of slaves that may start replicating from the new master at the same time. 1 means the slaves replicate one after another, as shown in the picture below (right).

  • sentinel failover-timeout mymaster 18000: the failover timeout, in milliseconds.
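
How down-after-milliseconds and the quorum of 2 work together can be sketched as two checks: each Sentinel first decides on its own that the master is down (Sentinel logs this as sdown), and the master is treated as really down only when enough Sentinels agree. The function names and the list-of-delays input below are hypothetical; the real protocol involves more than this vote count.

```python
DOWN_AFTER_MS = 3000   # sentinel down-after-milliseconds mymaster 3000
QUORUM = 2             # last argument of sentinel monitor mymaster ... 2


def subjectively_down(ms_since_last_reply, down_after_ms=DOWN_AFTER_MS):
    """One Sentinel's own view: no valid reply for longer than the limit."""
    return ms_since_last_reply > down_after_ms


def objectively_down(ms_since_last_reply_per_sentinel, quorum=QUORUM):
    """The master counts as down once at least quorum Sentinels agree."""
    votes = sum(subjectively_down(ms) for ms in ms_since_last_reply_per_sentinel)
    return votes >= quorum


print(objectively_down([3500, 4000, 100]))   # True: two of three Sentinels agree
print(objectively_down([3500, 100, 200]))    # False: only one Sentinel sees it down
```

With three Sentinels and a quorum of 2, a single Sentinel with a bad network link cannot trigger a failover on its own.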

7.4.3 Configuring the Primary/secondary Server

vim /opt/redis_cluster/redis_6380/conf/redis_6380.conf
slaveof 172.21.0.3 6379

vim /opt/redis_cluster/redis_6381/conf/redis_6381.conf
slaveof 172.21.0.3 6379

7.4.4 Starting Three Instances


redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf

redis-server /opt/redis_cluster/redis_6380/conf/redis_6380.conf

redis-server /opt/redis_cluster/redis_6381/conf/redis_6381.conf


[root@VM-0-3-centos conf]# ps -ef | grep redis

root 3010 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6379

root 3014 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6380

root 3019 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6381


172.21.0.3:6379> set test1 hhh

OK

172.21.0.3:6380> get test1

"hhh"

172.21.0.3:6381> get test1

"hhh"

172.21.0.3:6380> set test2 hhh

(error) READONLY You can't write against a read only slave.

172.21.0.3:6381> set test2 hhh

(error) READONLY You can't write against a read only slave.


7.4.5 Starting the Sentinel


redis-sentinel /opt/redis_cluster/redis_26379/conf/redis_26379.conf

redis-sentinel /opt/redis_cluster/redis_26380/conf/redis_26380.conf

redis-sentinel /opt/redis_cluster/redis_26381/conf/redis_26381.conf


[root@VM-0-3-centos conf]# ps -ef | grep redis

root 3010 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6379

root 3014 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6380

root 3019 1 0 11:05 ? 00:00:00 redis-server 172.21.0.3:6381

root 3138 3054 0 11:07 pts/1 00:00:00 redis-cli -h 172.21.0.3 -p 6380

root 3139 3096 0 11:07 pts/2 00:00:00 redis-cli -h 172.21.0.3 -p 6381

root 3190 1 0 11:09 ? 00:00:00 redis-sentinel 172.21.0.3:26379 [sentinel]

root 3200 1 0 11:09 ? 00:00:00 redis-sentinel 172.21.0.3:26380 [sentinel]

root 3204 1 0 11:09 ? 00:00:00 redis-sentinel 172.21.0.3:26381 [sentinel]

root 3218 1383 0 11:10 pts/0 00:00:00 grep --color=auto redis


7.4.6 Viewing changes in the Sentinel Configuration File

bind 172.21.0.3
port 26379
daemonize yes
logfile "/opt/redis_cluster/redis_26379/logs/redis_26379.log"
dir "/data/redis_cluster/redis_26379"
sentinel myid 341cd7783bbaac95262ddb07083c52acf4d3d700
sentinel monitor mymaster 172.21.0.3 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 18000
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 172.21.0.3 6381
sentinel known-slave mymaster 172.21.0.3 6380
sentinel known-sentinel mymaster 172.21.0.3 26381 caf9ec5e1bfc9148377d58fddd962943a6e08da9
sentinel known-sentinel mymaster 172.21.0.3 26380 c9771fe6db4884e2a77c2ffa39805e42621cc9e3
sentinel current-epoch 0

bind 172.21.0.3
port 26380
daemonize yes
logfile "/opt/redis_cluster/redis_26380/logs/redis_26380.log"
dir "/data/redis_cluster/redis_26380"
sentinel myid c9771fe6db4884e2a77c2ffa39805e42621cc9e3
sentinel monitor mymaster 172.21.0.3 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 18000
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 172.21.0.3 6381
sentinel known-slave mymaster 172.21.0.3 6380
sentinel known-sentinel mymaster 172.21.0.3 26381 caf9ec5e1bfc9148377d58fddd962943a6e08da9
sentinel known-sentinel mymaster 172.21.0.3 26379 341cd7783bbaac95262ddb07083c52acf4d3d700
sentinel current-epoch 0

bind 172.21.0.3
port 26381
daemonize yes
logfile "/opt/redis_cluster/redis_26381/logs/redis_26381.log"
dir "/data/redis_cluster/redis_26381"
sentinel myid caf9ec5e1bfc9148377d58fddd962943a6e08da9
sentinel monitor mymaster 172.21.0.3 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 18000
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 172.21.0.3 6381
sentinel known-slave mymaster 172.21.0.3 6380
sentinel known-sentinel mymaster 172.21.0.3 26379 341cd7783bbaac95262ddb07083c52acf4d3d700
sentinel known-sentinel mymaster 172.21.0.3 26380 c9771fe6db4884e2a77c2ffa39805e42621cc9e3
sentinel current-epoch 0

After all nodes are started, the content of the configuration file changes in the following aspects:

  1. The Sentinel node automatically discovers the slave node and the other Sentinel nodes.

  2. Settings that were left at their default value, such as parallel-syncs 1, are removed; failover-timeout 18000 is kept, as the listings show.

  3. Added epoch related parameters

7.4.7 Viewing Sentinel Log Files

3190:X 09 Mar 11:09:39.815 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
3190:X 09 Mar 11:09:39.826 # Sentinel ID is 341cd7783bbaac95262ddb07083c52acf4d3d700
3190:X 09 Mar 11:09:39.826 # +monitor master mymaster 172.21.0.3 6379 quorum 2
3190:X 09 Mar 11:09:39.827 * +slave slave 172.21.0.3:6380 172.21.0.3 6380 @ mymaster 172.21.0.3 6379
3190:X 09 Mar 11:09:39.834 * +slave slave 172.21.0.3:6381 172.21.0.3 6381 @ mymaster 172.21.0.3 6379
3190:X 09 Mar 11:09:51.400 * +sentinel sentinel c9771fe6db4884e2a77c2ffa39805e42621cc9e3 172.21.0.3 26380 @ mymaster 172.21.0.3 6379
3190:X 09 Mar 11:09:56.620 * +sentinel sentinel caf9ec5e1bfc9148377d58fddd962943a6e08da9 172.21.0.3 26381 @ mymaster 172.21.0.3 6379

7.5 Common Sentinel API

  1. Log in

redis-cli -h 172.21.0.3 -p 26379

  2. View Sentinel information

172.21.0.3:26379> INFO SENTINEL
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.21.0.3:6379,slaves=2,sentinels=3

  3. Query master information

172.21.0.3:26379> Sentinel masters

1) 1) "name"

2) "mymaster"

3) "ip"

4) "172.21.0.3"

5) "port"

6) "6379"

7) "runid"

8) "134a225eb4509100ab06f9edd6429909c6f2ee22"

9) "flags"

10) "master"

11) "link-pending-commands"

12) "0"

13) "link-refcount"

14) "1"

15) "last-ping-sent"

16) "0"

17) "last-ok-ping-reply"

18) "500"

19) "last-ping-reply"

20) "500"

21) "down-after-milliseconds"

22) "3000"

23) "info-refresh"

24) "2358"

25) "role-reported"

26) "master"

27) "role-reported-time"

28) "785446"

29) "config-epoch"

30) "0"

31) "num-slaves"

32) "2"

33) "num-other-sentinels"

34) "2"

35) "quorum"

36) "2"

37) "failover-timeout"

38) "18000"

39) "parallel-syncs"

40) "1"

Copy the code
  4. Get information about a single master

172.21.0.3:26379> sentinel master mymaster

  5. View slave information

172.21.0.3:26379> sentinel slaves mymaster

  6. View the other Sentinels

172.21.0.3:26379> sentinel sentinels mymaster

  7. Get the master address (with this command, a client can find out which IP and port to connect to)

172.21.0.3:26379> sentinel get-master-addr-by-name mymaster
1) "172.21.0.3"
2) "6379"

  8. Force a failover (this command triggers a failover; it does not report a status)

172.21.0.3:26379> sentinel failover mymaster
OK

  9. Flush the current configuration to disk

172.21.0.3:26379> sentinel flushconfig
OK
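
A client that wants to follow the current master can parse the reply of sentinel get-master-addr-by-name. The following is only a minimal sketch: parse_master_addr is a hypothetical helper name, and it assumes the reply text looks like the redis-cli output shown above (numbered, quoted values) or like two bare lines when the output is piped.

```python
import re
from typing import List, Tuple


def parse_master_addr(reply: str) -> Tuple[str, int]:
    """Turn the two-element reply of get-master-addr-by-name into (host, port)."""
    values: List[str] = []
    for line in reply.splitlines():
        line = line.strip()
        if not line:
            continue
        # Interactive redis-cli prints `1) "172.21.0.3"`; otherwise take the bare value.
        match = re.match(r'^\d+\)\s*"(.*)"$', line)
        values.append(match.group(1) if match else line)
    if len(values) != 2:
        raise ValueError(f"expected host and port, got {values!r}")
    return values[0], int(values[1])


# Reply text copied from the session above.
print(parse_master_addr('1) "172.21.0.3"\n\n2) "6379"\n'))
```

In a real client the reply would come from a Redis library or from running redis-cli; only the parsing step is shown here.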

7.6 Simulating failover

Normally, a Sentinel deployment with one master and two slaves is spread across three machines, with one Redis node and one Sentinel on each machine. A failure therefore usually means that a whole machine is lost.

So we kill both processes that would live on that machine: 6379 and 26379.

By looking at the 26380 log, we can see that an automatic failover has occurred. The master is now the process on port 6381.

3200:X 09 Mar 11:43:27.553 # +sdown slave 172.21.0.3:6379 172.21.0.3 6379 @ mymaster 172.21.0.3 6381
3200:X 09 Mar 11:43:27.554 # +sdown sentinel 341cd7783bbaac95262ddb07083c52acf4d3d700 172.21.0.3 26379 @ mymaster 172.21.0.3 6381

Let’s look at the Sentinel configuration.

bind 172.21.0.3
port 26380
daemonize yes
logfile "/opt/redis_cluster/redis_26380/logs/redis_26380.log"
dir "/data/redis_cluster/redis_26380"
sentinel myid c9771fe6db4884e2a77c2ffa39805e42621cc9e3
sentinel monitor mymaster 172.21.0.3 6381 2
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 18000
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 172.21.0.3 6379
sentinel known-slave mymaster 172.21.0.3 6380
sentinel known-sentinel mymaster 172.21.0.3 26381 caf9ec5e1bfc9148377d58fddd962943a6e08da9
sentinel known-sentinel mymaster 172.21.0.3 26379 341cd7783bbaac95262ddb07083c52acf4d3d700
sentinel current-epoch 1

You can see that the port of the master node has changed.

Let’s query again to verify:

172.21.0.3:26380> sentinel get-master-addr-by-name mymaster
1) "172.21.0.3"
2) "6381"

We can also log in to the Redis nodes to check whether the master has been switched.

172.21.0.3:6381> set GGG HHH
OK
172.21.0.3:6380> set LLL HHH
(error) READONLY You can't write against a read only slave.
172.21.0.3:6380> get GGG
"HHH"

You can see that the failover has succeeded.

Let’s start the failed node again.

redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf
redis-sentinel /opt/redis_cluster/redis_26379/conf/redis_26379.conf

As you can see, the Sentinel log has changed. The restarted node has been added as a slave and a master-slave replication has been performed.

3200:X 09 Mar 11:54:48.331 * +reboot slave 172.21.0.3:6379 172.21.0.3 6379 @ mymaster 172.21.0.3 6381
3200:X 09 Mar 11:54:48.383 # -sdown slave 172.21.0.3:6379 172.21.0.3 6379 @ mymaster 172.21.0.3 6381
3200:X 09 Mar 11:55:08.014 # -sdown sentinel 341cd7783bbaac95262ddb07083c52acf4d3d700 172.21.0.3 26379 @ mymaster 172.21.0.3 6381

7.7 Failover Process

By default all slaves have the same weight (slave-priority) and hold the same data, so Sentinel falls back to comparing run IDs when it picks the new master. The Redis Sentinel documentation describes the lexicographically smaller run ID as the final tie-breaker.

When we want to force the choice, we can set the weights ourselves.

When Redis Sentinel has multiple slave nodes and you want a specific slave to be promoted, you can set slave-priority to 0 on the other slaves, because a slave with priority 0 is never promoted. After the failover, slave-priority should be set back to its original value.

There are two ways to do this. One is to give the node you want promoted a better priority than the others (a lower non-zero value is preferred). The other is to set the other nodes to 0 so that they are excluded.

  1. Check the weight

172.21.0.3:6379> CONFIG GET slave-priority
1) "slave-priority"
2) "100"

  2. Change the weight (we want port 6379 to become the master, so the weight of the other ports is set to 0; port 6379 does not need to be changed)

172.21.0.3:6380> CONFIG SET slave-priority 0
OK
172.21.0.3:6381> CONFIG SET slave-priority 0
OK

  3. Force a failover (run on a Sentinel node)

172.21.0.3:26379> sentinel failover mymaster
OK

  4. View the master node at this point

172.21.0.3:26379> sentinel get-master-addr-by-name mymaster
1) "172.21.0.3"
2) "6379"

  5. Set the weights of the other two nodes back to 100 so that the next automatic failover works normally.

172.21.0.3:6381> CONFIG SET slave-priority 100
OK
172.21.0.3:6380> CONFIG SET slave-priority 100
OK
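
To make the effect of slave-priority concrete, here is a simplified sketch of the ordering that the Redis Sentinel documentation describes: slaves with priority 0 are skipped, a lower priority wins, then a larger replication offset, then the smaller run ID. The Replica class, the function name, and the offsets and short run IDs are illustrative assumptions, and the real selection also filters out slaves that were disconnected for too long.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    port: int
    priority: int     # slave-priority; 0 means "never promote"
    repl_offset: int  # how much of the master's data the slave has processed
    run_id: str


def pick_promotion_candidate(replicas: List[Replica]) -> Optional[Replica]:
    """Return the slave that would be preferred under the simplified ordering."""
    eligible = [r for r in replicas if r.priority > 0]
    if not eligible:
        return None
    # Lower priority first, then larger offset, then lexicographically smaller run ID.
    return min(eligible, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# State after step 2 above: 6380 is set to 0, 6379 keeps the default 100.
# Offsets and run IDs are made up for the example.
slaves = [
    Replica(port=6379, priority=100, repl_offset=5000, run_id="bbbb"),
    Replica(port=6380, priority=0, repl_offset=5000, run_id="aaaa"),
]
print(pick_promotion_candidate(slaves).port)
```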

8. Redis Cluster

8.1 Cluster Introduction

Redis Cluster is a distributed solution of Redis, released in version 3.0.

When a single machine hits its limits in memory, concurrency, or traffic, the Cluster architecture can be used to spread the load.

Prior to Redis Cluster, there were two distributed solutions:

  1. Client-side partitioning:

  • Advantages: the partitioning logic is under your control.

  • Disadvantages: you have to handle data routing, high availability, and failover yourself.

  2. Proxy scheme:

  • Advantages: simplifies the distributed logic in the client; upgrades and maintenance are convenient.

  • Disadvantages: heavier architecture and a higher performance cost.

8.2 Data Distribution

A distributed database first has to solve the problem of mapping the whole data set onto multiple nodes according to partitioning rules. That is, the data set is divided among the nodes, and each node is responsible for a subset of the overall data. The point to pay attention to is the sharding rule: Redis Cluster uses hash-based sharding.
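
As an illustration of hash sharding, the sketch below computes the slot of a key as CRC16(key) mod 16384, which is the rule given in the Redis Cluster specification. The function names are my own, and hash tags (the {...} syntax) are deliberately ignored to keep the example short.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str, total_slots: int = 16384) -> int:
    """Slot for a key, ignoring hash tags."""
    return crc16_xmodem(key.encode("utf-8")) % total_slots


# "key1" is the key used in the MOVED examples later in this article,
# where Redis reports slot 9189 for it.
print(key_slot("key1"))
```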

8.3 Directory Planning

  • Redis installation directory: /opt/redis_cluster/redis_{PORT}/{conf,logs,pid}

  • Redis data directory: /data/redis_cluster/redis_{PORT}/redis_{PORT}.rdb

  • Redis operations script: /root/scripts/redis_shell.sh

8.4 Manually Setting up a Deployment Cluster

We are still going to build a Redis cluster on a single machine.

In a real deployment there should be three machines, with one master and one slave on each machine.

Our plan is as follows:

  • Master (6479) <—–> Slave (6480)

  • Master (6481) <—–> Slave (6482)

  • Master (6483) <—–> Slave (6484)

  1. Create the directories and configuration files

mkdir -p /opt/redis_cluster/redis_{6479,6480,6481,6482,6483,6484}/{conf,logs,pid}
mkdir -p /data/redis_cluster/redis_{6479,6480,6481,6482,6483,6484}
cat > /opt/redis_cluster/redis_6479/conf/redis_6479.conf <<EOF
bind 172.21.0.3
port 6479
daemonize yes
pidfile "/opt/redis_cluster/redis_6479/pid/redis_6479.pid"
logfile "/opt/redis_cluster/redis_6479/logs/redis_6479.log"
dbfilename "redis_6479.rdb"
dir "/data/redis_cluster/redis_6479/"
cluster-enabled yes
cluster-config-file nodes_6479.conf
cluster-node-timeout 15000
EOF
cd /opt/redis_cluster/
cp redis_6479/conf/redis_6479.conf redis_6480/conf/redis_6480.conf
sed -i 's#6479#6480#g' redis_6480/conf/redis_6480.conf
cp redis_6479/conf/redis_6479.conf redis_6481/conf/redis_6481.conf
sed -i 's#6479#6481#g' redis_6481/conf/redis_6481.conf
cp redis_6479/conf/redis_6479.conf redis_6482/conf/redis_6482.conf
sed -i 's#6479#6482#g' redis_6482/conf/redis_6482.conf
cp redis_6479/conf/redis_6479.conf redis_6483/conf/redis_6483.conf
sed -i 's#6479#6483#g' redis_6483/conf/redis_6483.conf
cp redis_6479/conf/redis_6479.conf redis_6484/conf/redis_6484.conf
sed -i 's#6479#6484#g' redis_6484/conf/redis_6484.conf

  2. Start the Redis services

redis-server /opt/redis_cluster/redis_6479/conf/redis_6479.conf
redis-server /opt/redis_cluster/redis_6480/conf/redis_6480.conf
redis-server /opt/redis_cluster/redis_6481/conf/redis_6481.conf
redis-server /opt/redis_cluster/redis_6482/conf/redis_6482.conf
redis-server /opt/redis_cluster/redis_6483/conf/redis_6483.conf
redis-server /opt/redis_cluster/redis_6484/conf/redis_6484.conf

8.4.1 Interpreting the Configuration File

# Enable cluster mode
cluster-enabled yes
# Cluster configuration file
cluster-config-file nodes_6479.conf
# Cluster node timeout
cluster-node-timeout 15000

cd /data/redis_cluster/redis_6479
ll
cat nodes_6479.conf
514ca73ae546b1a71e055c63b08caa06b53ff35e :0 myself,master - 0 0 0 connected
vars currentEpoch 0 lastVoteEpoch 0

8.4.2 Cluster Commands

  1. View cluster nodes (the node information is the same as in the nodes.conf file)

172.21.0.3:6479> CLUSTER NODES
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 :6479 myself,master - 0 0 0 connected

  2. Discover cluster nodes

172.21.0.3:6479> CLUSTER MEET 172.21.0.3 6481
OK
172.21.0.3:6479> CLUSTER NODES
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 master - 0 1615289900344 1 connected
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 myself,master - 0 0 0 connected

  3. Add another node to the two-node cluster

172.21.0.3:6483> CLUSTER NODES
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 :6483 myself,master - 0 0 0 connected
172.21.0.3:6483> CLUSTER MEET 172.21.0.3 6479
OK
172.21.0.3:6483> CLUSTER NODES
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 master - 0 1615289968089 1 connected
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 master - 0 1615289968901 0 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 myself,master - 0 0 2 connected

8.4.3 Manually Configuring Node Discovery

When we start all the nodes and look at the process, we will find that all the processes have the word cluster:

root 15524     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6479 [cluster]
root 15526     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6480 [cluster]
root 15528     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6481 [cluster]
root 15532     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6482 [cluster]
root 15536     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6483 [cluster]
root 15544     1 0 19:37 ?     00:00:00 redis-server 172.21.0.3:6484 [cluster]
root 15556 14740 0 19:37 pts/0 00:00:00 grep --color=auto redis

However, when we connect to a Redis server and run the CLUSTER NODES command, we only see the ID of that node itself. At this point the nodes in the cluster have not discovered each other, so the first step in building a Redis cluster is to let the nodes discover each other.

Before running the node discovery command, let’s look at the data directory of a node to find the cluster configuration file that was generated.

It contains only the node's own entry. After the nodes have discovered each other, the IDs of the discovered nodes are written into this file.

In cluster mode, Redis has a cluster configuration file in addition to the regular one. When the node information in the cluster changes, for example when a node is added, a node goes offline, or a failover happens, each node automatically saves the cluster state to this file. Note that Redis maintains the cluster configuration file itself. Do not edit it by hand, otherwise the node may get confused when it restarts.

Tip: The node discovery command can be run on any node in the cluster.

We add each node to the cluster:

172.21.0.3:6479> CLUSTER MEET 172.21.0.3 6481
OK
172.21.0.3:6483> CLUSTER MEET 172.21.0.3 6479
OK
172.21.0.3:6480> CLUSTER MEET 172.21.0.3 6479
OK
172.21.0.3:6482> CLUSTER MEET 172.21.0.3 6479
OK
172.21.0.3:6484> CLUSTER MEET 172.21.0.3 6479
OK
172.21.0.3:6479> CLUSTER NODES
8260b2c30b5e94f3768d5f192cfafd475988214a 172.21.0.3:6482 master - 0 1615290155164 4 connected
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 master - 0 1615290154158 1 connected
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 myself,master - 0 0 0 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615290156170 2 connected
016a681b4dd4aa45b72130dbe55be37568ee3851 172.21.0.3:6480 master - 0 1615290157174 3 connected
61b33f247ba176806caf498ad7493a63d6e142de 172.21.0.3:6484 master - 0 1615290152146 5 connected

8.4.4 Redis Cluster Startup Flowchart

8.4.5 Redis Cluster Communication flow

In distributed storage, you need to provide a mechanism for maintaining node metadata. Metadata refers to the data that a node is responsible for and whether a fault occurs.

The Redis cluster uses the Gossip protocol. The Gossip protocol works by constantly exchanging information with each other. After a period of time, all nodes will know the complete information of the cluster.

The communication process is as follows:

  1. Each node in the cluster opens an extra TCP channel for node-to-node communication. Its port is the base port plus 10000.

  2. At a fixed interval, each node selects several nodes according to specific rules and sends them ping messages.

  3. A node that receives a ping message replies with a pong message. Each node chooses the nodes it communicates with according to certain rules, so a node may know all of the other nodes or only some of them. As long as the nodes can communicate normally, they eventually reach a consistent state. When a node fails, a new node joins, or a master/slave role changes, the continuous ping/pong exchange spreads the change until all nodes are in sync.

The Gossip protocol is responsible for information exchange. Common Gossip messages are classified into ping, pong, meet, and fail.

  • Meet: announces that a new node is joining. The sender tells the receiver to join the current cluster. After the meet exchange completes normally, the receiving node joins the cluster and starts exchanging ping and pong messages.

  • Ping: the most frequently exchanged message. Every second, each node in the cluster sends ping messages to several other nodes to check whether they are online and to exchange state information.

  • Pong: sent back to the sender as the reply to a ping or meet message, confirming that the message got through. A node can also broadcast its own pong message to the cluster to tell all nodes to update its status.

  • Fail: when a node decides that another node in the cluster is offline, it broadcasts a fail message to the cluster. After receiving it, the other nodes mark that node as offline.
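
The way knowledge spreads through ping/pong exchanges can be pictured with a toy simulation. This is not how Redis implements the protocol: real messages carry only a sample of known nodes and are sent on timers. Here every node simply merges its view with every node it already knows once per round, and all function names are illustrative.

```python
from typing import Dict, List, Set, Tuple


def apply_meets(nodes: List[int], meets: List[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Initial views: each node knows itself plus whoever it shook hands with."""
    views = {n: {n} for n in nodes}
    for sender, receiver in meets:
        views[sender].add(receiver)
        views[receiver].add(sender)
    return views


def gossip_round(views: Dict[int, Set[int]]) -> None:
    """One round: every node exchanges its (start-of-round) view with known peers."""
    snapshot = {n: set(known) for n, known in views.items()}
    for node, known in snapshot.items():
        for peer in known - {node}:
            views[node] |= snapshot[peer]
            views[peer] |= snapshot[node]


def rounds_until_converged(views: Dict[int, Set[int]], max_rounds: int = 10) -> int:
    everyone = set(views)
    rounds = 0
    while any(known != everyone for known in views.values()) and rounds < max_rounds:
        gossip_round(views)
        rounds += 1
    return rounds


# The CLUSTER MEET calls used in this article: everyone meets through 6479.
ports = [6479, 6480, 6481, 6482, 6483, 6484]
meets = [(6479, 6481), (6483, 6479), (6480, 6479), (6482, 6479), (6484, 6479)]
print(rounds_until_converged(apply_meets(ports, meets)))
```

With a hub such as 6479 one round is enough in this toy model; a chain of introductions (1 meets 2, 2 meets 3, 3 meets 4) needs more rounds, which is the "after a period of time" part of the description.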

8.4.6 Manually Assigning Slots

Although the nodes have discovered each other, the cluster is still unavailable because no slots have been assigned to the nodes. The cluster becomes available only after all slots have been assigned; if even a single slot is left unassigned, the whole cluster is unavailable.

  1. Try to write a key to the cluster (the write is rejected)

172.21.0.3:6479> set k1 v1
(error) CLUSTERDOWN Hash slot not served

  2. View cluster information (cluster_state:fail indicates that the cluster is in the fail state)

172.21.0.3:6479> CLUSTER INFO
cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:0
cluster_current_epoch:5
cluster_my_epoch:0
cluster_stats_messages_sent:21311
cluster_stats_messages_received:21311

  3. Select the nodes that will receive slots. As noted above, although we have six nodes, only three of them actually handle writes; the other three are only slaves of those masters. So only three nodes need slots: 6479, 6481, and 6483.

  4. There are two ways to assign slots (the assignment has to be done for each master node):

  • Log in to each master node with the client and run the command there.

  • From one machine, use the Redis client to connect remotely to the master nodes on the other machines and run the commands.

  5. Number of slots per node:

  • 16384 / 3 = 5461.333

  • Therefore, the slot ranges of the three nodes are {0..5461}, {5462..10922}, and {10923..16383}.

  • Slots are numbered 0 to 16383, so the one slot left over by the division has to be placed explicitly; here it is included in the first range.

  6. Assign the slots {0..5461}, {5462..10922}, and {10923..16383}:

redis-cli -h 172.21.0.3 -p 6479 cluster addslots {0..5461}
redis-cli -h 172.21.0.3 -p 6481 cluster addslots {5462..10922}
redis-cli -h 172.21.0.3 -p 6483 cluster addslots {10923..16383}

  7. View cluster information again (cluster_state:ok indicates that the cluster is healthy)

172.21.0.3:6479> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:5
cluster_my_epoch:0
cluster_stats_messages_sent:23763
cluster_stats_messages_received:23763
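
The slot arithmetic above can be checked with a few lines of code. This sketch splits the 16384 slots into contiguous ranges and gives the leftover slot to the first master, which reproduces the ranges used in this article; split_slots is a hypothetical helper name.

```python
from typing import List, Tuple


def split_slots(total: int = 16384, masters: int = 3) -> List[Tuple[int, int]]:
    """Split slots 0..total-1 into contiguous (first, last) ranges, one per master."""
    base, extra = divmod(total, masters)
    ranges = []
    start = 0
    for index in range(masters):
        size = base + (1 if index < extra else 0)  # first `extra` masters get one more
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Print the matching addslots commands for the three masters of this article.
for port, (first, last) in zip([6479, 6481, 6483], split_slots()):
    print(f"redis-cli -h 172.21.0.3 -p {port} cluster addslots {{{first}..{last}}}")
```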

8.4.7 Manually Configuring HA for a Cluster

Although the cluster is available at this point, the whole cluster becomes unavailable as soon as one master fails. Therefore, the other three nodes are used as slaves of the three current masters, so that an automatic switchover can take place when a master fails and the cluster stays available.

Note:

  1. Do not let a replica replicate the master on its own machine: if that machine goes down, both copies are lost and the cluster is still unavailable. A replica should replicate the master of another server.

  2. Be aware that when the redis-trib tool assigns replicas automatically, a replica and its master can end up on the same machine.

Since we are building the Redis cluster on one machine, let 6480 replicate 6479, 6482 replicate 6481, and 6484 replicate 6483.

Note the following during the operation:

  1. The commands have to be run on the slave nodes;

  2. Be careful not to confuse the master and slave IDs.

172.21.0.3:6480> CLUSTER NODES
8260b2c30b5e94f3768d5f192cfafd475988214a 172.21.0.3:6482 master - 0 1615341504146 4 connected
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 master - 0 1615341501126 1 connected 5462-10922
016a681b4dd4aa45b72130dbe55be37568ee3851 172.21.0.3:6480 myself,master - 0 0 3 connected
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 master - 0 1615341503139 0 connected 0-5461
61b33f247ba176806caf498ad7493a63d6e142de 172.21.0.3:6484 master - 0 1615341498100 5 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615341502132 2 connected 10923-16383

redis-cli -h 172.21.0.3 -p 6480 CLUSTER REPLICATE a43fc8f84d0f887f2a41292c742ca7cd69c36b18
redis-cli -h 172.21.0.3 -p 6482 CLUSTER REPLICATE 9c7864769ea4461a150d627418f297d17763f622
redis-cli -h 172.21.0.3 -p 6484 CLUSTER REPLICATE 9aa430f83c3585ebf56f3f48cfbde8c77b9461b9
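
Since mixing up node IDs is the main risk here, the IDs can be looked up by port instead of being copied by hand. The sketch below parses CLUSTER NODES output (first field is the node ID, second is ip:port, optionally followed by @bus-port in newer versions) and builds the replicate commands. The helper names are assumptions, and the sample text is taken from the output above.

```python
from typing import Dict, List

SAMPLE_NODES = """\
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 master - 0 1615341501126 1 connected 5462-10922
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 master - 0 1615341503139 0 connected 0-5461
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615341502132 2 connected 10923-16383
"""


def node_ids_by_port(cluster_nodes_output: str) -> Dict[int, str]:
    """Map port -> node ID from the text printed by CLUSTER NODES."""
    ids = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node line
        node_id, address = fields[0], fields[1]
        port_text = address.rsplit(":", 1)[1].split("@")[0]
        ids[int(port_text)] = node_id
    return ids


def replicate_commands(ids: Dict[int, str], plan: Dict[int, int], host: str) -> List[str]:
    """plan maps slave port -> master port."""
    return [
        f"redis-cli -h {host} -p {slave} cluster replicate {ids[master]}"
        for slave, master in plan.items()
    ]


plan = {6480: 6479, 6482: 6481, 6484: 6483}
for command in replicate_commands(node_ids_by_port(SAMPLE_NODES), plan, "172.21.0.3"):
    print(command)
```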

Test whether replication works:

[root@VM-0-3-centos ~]# redis-cli -h 172.21.0.3 -p 6479
172.21.0.3:6479> set key1 value1
(error) MOVED 9189 172.21.0.3:6481

The write is rejected because slot 9189 belongs to 6481. That is fine; let’s connect to 6481 and test there.

[root@VM-0-3-centos ~]# redis-cli -h 172.21.0.3 -p 6481
172.21.0.3:6481> set key1 value1
OK

Let’s go to 6482 and see whether the key is there.

[root@VM-0-3-centos ~]# redis-cli -h 172.21.0.3 -p 6482
172.21.0.3:6482> get key1
(error) MOVED 9189 172.21.0.3:6481
172.21.0.3:6482> keys *
1) "key1"

This shows that master-slave replication works.

8.5 Cluster Testing

Let’s write data the same way we would write to a standalone Redis and see what happens.

[root@VM-0-3-centos ~]# redis-cli -h 172.21.0.3 -p 6479
172.21.0.3:6479> set key1 value1
(error) MOVED 9189 172.21.0.3:6481

The result is an error, but it gives the address of another node in the cluster. So was the data written at all? Let’s look at nodes 6479 and 6481 in turn.

172.21.0.3:6479> get key1
(error) MOVED 9189 172.21.0.3:6481
[root@VM-0-3-centos ~]# redis-cli -h 172.21.0.3 -p 6481
172.21.0.3:6481> get key1
(nil)

The data was not written. Why?

Because the data is sharded, the key is not stored on 6479; it belongs on the 6481 node. Writing and reading cluster data involves another cluster concept: request routing.

In cluster mode, when Redis receives any key-related command it first calculates the slot of the key and then finds the node responsible for that slot. If the node is itself, it processes the command. Otherwise it replies with a MOVED redirection error that tells the client which node to ask instead. This process is known as MOVED redirection.

So how do we deal with this situation? We add the -c parameter when connecting to the Redis server. According to the redis-cli help, -c does the following:

-c Enable cluster mode (follow -ASK and -MOVED redirections).

We set the key again, and this time the command succeeds.

[root@VM-0-3-centos ~]# redis-cli -c -h 172.21.0.3 -p 6479
172.21.0.3:6479> set key1 value1
-> Redirected to slot [9189] located at 172.21.0.3:6481
OK
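
What a cluster-aware client does with such an error can be sketched as follows: it parses the redirection, reconnects to the node named in it, and repeats the command. Only the parsing step is shown; parse_redirect is a hypothetical helper, and the error strings are the ones printed earlier in this article.

```python
from typing import Optional, Tuple


def parse_redirect(error: str) -> Optional[Tuple[str, int, str, int]]:
    """Return (kind, slot, host, port) for a MOVED/ASK error, else None."""
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, port = parts[2].rsplit(":", 1)
    return parts[0], int(parts[1]), host, int(port)


print(parse_redirect("MOVED 9189 172.21.0.3:6481"))
print(parse_redirect("READONLY You can't write against a read only slave."))
```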

8.6 Simulating failover

We’ve set up the Redis cluster above, but we haven’t simulated a cluster failover yet. Let’s run kill -9 against master 6479.

The cluster status before the simulation is as follows:

172.21.0.3:6481> CLUSTER NODES
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 master - 0 1615346493813 0 connected 0-5461
61b33f247ba176806caf498ad7493a63d6e142de 172.21.0.3:6484 slave 9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 0 1615346492804 5 connected
8260b2c30b5e94f3768d5f192cfafd475988214a 172.21.0.3:6482 slave 9c7864769ea4461a150d627418f297d17763f622 0 1615346491799 4 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615346490791 2 connected 10923-16383
016a681b4dd4aa45b72130dbe55be37568ee3851 172.21.0.3:6480 slave a43fc8f84d0f887f2a41292c742ca7cd69c36b18 0 1615346494820 3 connected
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 myself,master - 0 0 1 connected 5462-10922
172.21.0.3:6481> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:5
cluster_my_epoch:1
cluster_stats_messages_sent:123805
cluster_stats_messages_received:123805

The 6480 log shows the following:

15526:S 10 Mar 11:22:52.886 # Error condition on socket for SYNC: Connection refused
15526:S 10 Mar 11:22:53.99 # Failover election won: I'm the new master.
15526:S 10 Mar 11:22:53.199 # configEpoch set to 6 after successful failover
15526:M 10 Mar 11:22:53.199 * Discarding previously cached master state.
15526:M 10 Mar 11:22:53.199 # Cluster state changed: ok

From the log, we can clearly see that a reelection and Failover process occurred.

Let’s look at the state of the node.

172.21.0.3:6481> CLUSTER NODES
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 master,fail - 1615346554608 1615346550183 0 disconnected
61b33f247ba176806caf498ad7493a63d6e142de 172.21.0.3:6484 slave 9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 0 1615346597554 5 connected
8260b2c30b5e94f3768d5f192cfafd475988214a 172.21.0.3:6482 slave 9c7864769ea4461a150d627418f297d17763f622 0 1615346598561 4 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615346596540 2 connected 10923-16383
016a681b4dd4aa45b72130dbe55be37568ee3851 172.21.0.3:6480 master - 0 1615346599568 6 connected 0-5461
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 myself,master - 0 0 1 connected 5462-10922

You can see that 6480 becomes the primary node, while the previous 6479 node shows fail.

Let’s start node 6479 again and look at the cluster status to see what happens.

redis-server /opt/redis_cluster/redis_6479/conf/redis_6479.conf

172.21.0.3:6481> CLUSTER NODES
a43fc8f84d0f887f2a41292c742ca7cd69c36b18 172.21.0.3:6479 slave 016a681b4dd4aa45b72130dbe55be37568ee3851 0 1615346830399 6 connected
61b33f247ba176806caf498ad7493a63d6e142de 172.21.0.3:6484 slave 9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 0 1615346829394 5 connected
8260b2c30b5e94f3768d5f192cfafd475988214a 172.21.0.3:6482 slave 9c7864769ea4461a150d627418f297d17763f622 0 1615346824361 4 connected
9aa430f83c3585ebf56f3f48cfbde8c77b9461b9 172.21.0.3:6483 master - 0 1615346828386 2 connected 10923-16383
016a681b4dd4aa45b72130dbe55be37568ee3851 172.21.0.3:6480 master - 0 1615346826377 6 connected 0-5461
9c7864769ea4461a150d627418f297d17763f622 172.21.0.3:6481 myself,master - 0 0 1 connected 5462-10922

As you can see, 6479 automatically becomes a slave node when it starts, and it replicates 6480.

We also want to test whether a failover loses data and whether the cluster keeps working. How do we measure that? We can keep reading and writing after killing the node.

The observed result is that writes to the cluster fail for a while and then succeed again. This is because cluster-node-timeout 15000 is set in the Redis configuration file: the cluster performs a failover only after this timeout has elapsed.

Conclusion:

  1. When a master node fails, the cluster automatically fails over and its slave becomes the master.

  2. When the failed master rejoins the cluster, it becomes a slave.

  3. Failover takes time. The timeout is usually set to 5-10 seconds. If it is too long, the outage lasts longer. If it is too short, for example 1 second, ordinary network jitter can trigger master/slave switchovers again and again. A switchover is itself costly because the RDB file has to be copied, and with a large file the copy can take longer than the jitter lasted.

8.7 Using tools to Build and Deploy the Redis Cluster

Building a cluster by hand makes the process and the details of cluster creation easy to understand, but it takes many steps. With many cluster nodes, the complexity and the operational cost inevitably grow. For that reason the official tool redis-trib.rb is provided to build a cluster quickly.

redis-trib.rb is a Redis Cluster management tool written in Ruby. Its commands simplify common operations such as cluster creation, checking, slot migration, and rebalancing.

yum makecache fast
yum install rubygems
gem sources --remove https://rubygems.org/
gem sources -a http://mirrors.aliyun.com/rubygems/
gem update --system
gem install redis -v 3.3.5

We can stop all the nodes, wipe the data, and return to a brand-new cluster. The commands are as follows:

pkill redis
rm -rf /data/redis_cluster/redis_64{79,80,81,82,83,84}/*

Start all nodes after everything has been cleared:

redis-server /opt/redis_cluster/redis_6479/conf/redis_6479.conf
redis-server /opt/redis_cluster/redis_6480/conf/redis_6480.conf
redis-server /opt/redis_cluster/redis_6481/conf/redis_6481.conf
redis-server /opt/redis_cluster/redis_6482/conf/redis_6482.conf
redis-server /opt/redis_cluster/redis_6483/conf/redis_6483.conf
redis-server /opt/redis_cluster/redis_6484/conf/redis_6484.conf

Create a cluster using the redis-trib.rb tool:

cd /opt/redis_cluster/redis/src/
./redis-trib.rb create --replicas 1 172.21.0.3:6479 172.21.0.3:6481 172.21.0.3:6483 172.21.0.3:6480 172.21.0.3:6482 172.21.0.3:6484

--replicas 1 means that each master gets one replica. So which nodes are masters and which are slaves? The nodes listed first become the masters, and the nodes listed after them become the slaves.

Check cluster integrity:

./redis-trib.rb check 172.21.0.3:6479

Note that this tool has a problem: a replica may be assigned to the master on its own machine. That does not give high availability in a production environment, because if that machine dies, the cluster becomes unavailable. Since we deploy on a single machine, we cannot see this problem here, but be aware of it. In that case the replication relationships have to be adjusted manually.
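
A quick way to spot this situation is to compare the host part of each master/replica pair. The sketch below is only illustrative: colocated_pairs is a hypothetical helper, the pairs are typed in by hand rather than read from the cluster, and the 10.0.0.x addresses are invented to show a three-machine layout.

```python
from typing import List, Tuple


def colocated_pairs(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (master, replica) pairs whose addresses share the same host."""
    def host(address: str) -> str:
        return address.rsplit(":", 1)[0]

    return [(master, replica) for master, replica in pairs if host(master) == host(replica)]


# Single-machine layout from this article: every pair is flagged.
single_machine = [
    ("172.21.0.3:6479", "172.21.0.3:6480"),
    ("172.21.0.3:6481", "172.21.0.3:6482"),
    ("172.21.0.3:6483", "172.21.0.3:6484"),
]
# Hypothetical three-machine layout where each replica lives on another host.
three_machines = [
    ("10.0.0.1:6479", "10.0.0.2:6480"),
    ("10.0.0.2:6481", "10.0.0.3:6482"),
    ("10.0.0.3:6483", "10.0.0.1:6484"),
]
print(len(colocated_pairs(single_machine)), len(colocated_pairs(three_machines)))
```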

8.8 Adding nodes Using tools

We can also use the tool to expand the cluster. The number of slots is fixed at 16384, so expansion means reassigning slots: slots are migrated to the new node, and the data in them moves with them. The steps are as follows:

  1. Prepare the new nodes

  2. Join them to the cluster

  3. Migrate slots and data

Let’s demonstrate cluster expansion using the tool:

  1. Create the new nodes:

mkdir -p /opt/redis_cluster/redis_64{85,86}/{conf,logs,pid}
mkdir -p /data/redis_cluster/redis_64{85,86}
cp /opt/redis_cluster/redis_6479/conf/redis_6479.conf /opt/redis_cluster/redis_6485/conf/redis_6485.conf
cp /opt/redis_cluster/redis_6479/conf/redis_6479.conf /opt/redis_cluster/redis_6486/conf/redis_6486.conf
sed -i 's#6479#6485#g' /opt/redis_cluster/redis_6485/conf/redis_6485.conf
sed -i 's#6479#6486#g' /opt/redis_cluster/redis_6486/conf/redis_6486.conf

  2. Start the nodes

redis-server /opt/redis_cluster/redis_6485/conf/redis_6485.conf
redis-server /opt/redis_cluster/redis_6486/conf/redis_6486.conf

  3. Let the nodes discover the cluster

redis-cli -c -h 172.21.0.3 -p 6485 cluster meet 172.21.0.3 6479
redis-cli -c -h 172.21.0.3 -p 6486 cluster meet 172.21.0.3 6479

  4. Use the tool to expand capacity

cd /opt/redis_cluster/redis/src/
./redis-trib.rb reshard 172.21.0.3:6479

  5. The tool is now interactive. It asks how many slots to move to the new node. With four masters, 16384/4 = 4096, so we move 4096 slots to the new node.

How many slots do you want to move (from 1 to 16384)? 4096

  6. Next the tool asks which node ID will receive the slots. Port 6485 should receive them, so we enter the ID of the 6485 node.

What is the receiving node ID? 2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

  7. Next we enter the source node IDs. We can enter them one by one, or take slots from all existing nodes by typing "all".

Source node #1:all

  8. After the migration completes, the tool exits automatically. We can then check the state of the migration.

./redis-trib.rb rebalance 172.21.0.3:6479

  9. The new node also needs a replica, so we let 6486 replicate 6485. In a production environment a replica must not sit on the same machine as its master; adjust the replication relationship as needed, for example:

redis-cli -c -h 172.21.0.3 -p 6486 cluster replicate 2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

8.9 Using tools to shrink Nodes

Node shrinkage is the reverse operation of node expansion. The process is as follows:

  1. Check whether the node that is going offline owns slots. If it does, migrate them to other nodes so that the slot-to-node mapping stays complete after the node is gone.

  2. The slots should be distributed evenly among the remaining nodes. For example, we now have four masters; to remove one of them we distribute its 4096 slots over the other three, 4096/3 = 1365.333. The reshard has to be run three times, because each run moves slots to a single receiving node.

  3. If the node going offline owns no slots, or is a slave, the other nodes in the cluster can be told to forget it. Once all nodes have forgotten it, the node can be shut down normally.
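
The three amounts used in the reshard runs below can be derived as follows. This is a small sketch with a hypothetical function name; it puts the remainder into the last move, mirroring the "move the remaining slots" step.

```python
from typing import List


def shrink_plan(slots_to_move: int, receivers: int) -> List[int]:
    """Number of slots to move in each reshard run, one run per receiving node."""
    base, extra = divmod(slots_to_move, receivers)
    # Every run moves `base` slots; the last run also takes whatever is left over.
    return [base] * (receivers - 1) + [base + extra]


print(shrink_plan(4096, 3))  # amounts for the three reshard runs
```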

Let’s do it:

  1. First slot migration:

cd /opt/redis_cluster/redis/src/

./redis-trib.rb reshard 172.21.0.3:6485

How many slots do you want to move (from 1 to 16384)? 1365

What is the receiving node ID? d118021cf062f3ff2dc5d4a7aee62918db22dfc4

Source node #1:2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

Source node #2:done

  1. Second slot migration:

./redis-trib.rb reshard 172.21.0.3:6485

How many slots do you want to move (from 1 to 16384)? 1365

What is the receiving node ID? 9017f1e2bda6515408601b64b7e567f908b54c4e

Source node #1:2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

Source node #2:done

  1. Third slot migration (move the remaining slots):

./redis-trib.rb reshard 172.21.0.3:6485

How many slots do you want to move (from 1 to 16384)? 1366

What is the receiving node ID? 347a7e63742b624f1c23789214c4ab4bc2552a4f

Source node #1:2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

Source node #2:done

  1. Forget the nodes:

./redis-trib.rb del-node 172.21.0.3:6485 2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5

./redis-trib.rb del-node 172.21.0.3:6486 206fb6c3f6b0abb91c6995c51011d9c051bbbb76
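
Before a node is forgotten it should own no slots, as step 3 of the procedure says. One way to confirm this is to count the slots listed for the node in `redis-cli cluster nodes` output. The sketch below is a hypothetical helper, not a Redis command; it assumes slot fields start at column 9 and appear either as a single slot or as a start-end range. The slot numbers in the sample are invented.

```shell
#!/bin/bash
# Hypothetical helper: count the slots a node still owns.
# Reads `cluster nodes` output on stdin. Slot fields start at column 9 and are
# either a single slot ("16383") or a range ("0-5460"). Bracketed entries
# describe in-progress migrations and are skipped.
slots_owned() {
    awk -v id="$1" '($1 "") == (id "") {
        count = 0
        for (i = 9; i <= NF; i++) {
            if ($i ~ /^\[/) continue
            n = split($i, r, "-")
            count += ((n == 2) ? (r[2] - r[1] + 1) : 1)
        }
        print count
    }'
}

# Illustration data only: IDs come from this article, slot numbers are invented.
sample_nodes='2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5 172.21.0.3:6485 master - 0 1 7 connected
d118021cf062f3ff2dc5d4a7aee62918db22dfc4 172.21.0.3:6479 master - 0 1 1 connected 0-5460 16383'

echo "$sample_nodes" | slots_owned 2a8b867d98534ef9d7ec9d454dd8a2f8000dbdf5
```

A result of 0 for the node being removed suggests the three migrations moved everything and del-node can proceed.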

9. Redis operation and maintenance tools

9.1 Data Import and Export Tool

Background

When switching to a Redis cluster you may need to import existing data. The redis-migrate-tool utility can import data from a single-node instance into the cluster.

Official repository

https://github.com/vipshop/redis-migrate-tool

Installing the tool


cd /opt/redis_cluster/

git clone https://github.com/vipshop/redis-migrate-tool.git

cd redis-migrate-tool/

yum install autoconf automake libtool ncurses-devel

autoreconf -fvi

./configure

make && make install


Creating a Configuration File


cd

vim redis_6379_to_6479.conf

[source]
type: single
servers:
 - 172.21.0.3:6379

[target]
type: redis cluster
servers:
 - 172.21.0.3:6479

[common]
listen: 0.0.0.0:8888
source_safe: true

Generate test data


vim input_key.sh

#!/bin/bash
for i in $(seq 1 1000)
do
    redis-cli -c -h 172.21.0.3 -p 6379 set wys_${i} v_${i} && echo "set wys_${i} is OK"
done

chmod 777 input_key.sh

redis-server /opt/redis_cluster/redis_6379/conf/redis_6379.conf

bash input_key.sh

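
The script above starts one redis-cli process per key. As an alternative sketch, the commands can be generated as text and piped into a single redis-cli process, which reads commands from standard input when none are given on the command line. `gen_set_commands` is a hypothetical helper name; the key and value pattern matches the script above.

```shell
#!/bin/bash
# Print one SET command per line for keys PREFIX_1 .. PREFIX_N.
gen_set_commands() {
    local prefix=$1 count=$2 i
    for (( i = 1; i <= count; i++ )); do
        echo "SET ${prefix}_${i} v_${i}"
    done
}

# Preview three commands without touching any server.
gen_set_commands wys 3
```

To load the data, the output would be piped to the source instance, for example `gen_set_commands wys 1000 | redis-cli -h 172.21.0.3 -p 6379`.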

Verify data before import


172.21.0.3:6379> get wys_999

"v_999"

172.21.0.3:6479> get wys_999

(nil)


Data import


redis-migrate-tool -c redis_6379_to_6479.conf


Data validation

  • Manual query data:
172.21.0.3:6479> get wys_999

"v_999"
  • Verify using tools:

redis-migrate-tool -c redis_6379_to_6479.conf -C redis_check


9.2 Analyzing key Values

Background

Redis is using too much memory and holds a large number of keys, but it is not known which keys take up the most space. Analyzing this on the live instance would affect performance, so the RDB file is analyzed offline instead.

Installing the tool


yum install python-pip gcc python-devel

cd /opt/

git clone https://github.com/sripathikrishnan/redis-rdb-tools

cd redis-rdb-tools

python setup.py install

Usage


cd /data/redis_cluster/redis_6479/

rdb -c memory redis_6479.rdb -f redis_6479.rdb.csv

Sort the exported report by size


awk -F ',' '{print $4,$2,$3,$1}' redis_6479.rdb.csv | sort -rn > 6479.txt

The size column is printed first and sorted in reverse numeric order, so the biggest key is at the top.
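
To look at only the largest few keys, the same idea can be wrapped in a small function. This is a sketch under assumptions: `top_keys` is a hypothetical helper, the CSV is assumed to start with the columns database, type, key, size_in_bytes (consistent with the awk command above using column 4 as the size), and key names are assumed to contain no commas. The sample rows are invented.

```shell
#!/bin/bash
# Hypothetical helper: list the N largest keys from an rdb memory report.
# Assumes the CSV columns start with database,type,key,size_in_bytes and that
# key names contain no commas. The header row is skipped.
top_keys() {
    local csv=$1 n=${2:-10}
    awk -F ',' 'NR > 1 {print $4, $3, $2, $1}' "$csv" | sort -rn | head -n "$n"
}

# Invented sample report, for illustration only.
sample_csv=$(mktemp)
cat > "$sample_csv" <<'EOF'
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,string,wys_1,56,string,8,8
0,hash,user:42,1024,ziplist,10,20
0,list,queue,300,quicklist,5,12
EOF

top_keys "$sample_csv" 2
rm -f "$sample_csv"
```

With the real report the call would be `top_keys redis_6479.rdb.csv 20`.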

9.3 Monitoring Expired Keys

Background

Developers repeatedly re-submitted the same keys, which wiped out the expiration time of coupons on an e-commerce site.

Problem analysis

If a key has an expiration time and the key is written again with SET, the expiration time is cleared.

Solution

The goal is to collect the expiration times of the monitored keys in batches without affecting server performance. There are two options:

  1. KEYS *: finds all matching key names in one call. The TTL of each key is then read in a loop.

  2. SCAN: iterates over the key names in batches using a cursor. The TTL of each key is then read in a loop.

  • KEYS is a heavy operation that affects server performance, so it is only suitable on a slave node that is not serving traffic.

  • SCAN puts little load on the server, but it takes multiple calls to cover all keys, so it needs a script.

The script content

cat 01get_key.sh
#!/bin/bash
# Collect the names of all keys matching each prefix in key_list.txt.
key_num=0
> key_name.log
for line in $(cat key_list.txt)
do
    while true
    do
        # The first line of the SCAN reply is the next cursor, the rest are key names.
        scan_num=$(redis-cli -h 192.168.47.75 -p 6380 SCAN ${key_num} match ${line}\* count 1000 | awk 'NR==1{print $0}')
        key_name=$(redis-cli -h 192.168.47.75 -p 6380 SCAN ${key_num} match ${line}\* count 1000 | awk 'NR>1{print $0}')
        echo ${key_name} | xargs -n 1 >> key_name.log
        ((key_num=scan_num))
        # A cursor of 0 means the iteration is complete.
        if [ ${key_num} == 0 ]
        then
            break
        fi
    done
done
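
The script collects key names only; the TTL of each key still has to be read. A minimal sketch of that step follows. `ttl_status` and `check_ttls` are hypothetical helper names, and the host and port are the ones used in the script. The TTL command returns -2 for a missing key, -1 for a key with no expiration, and the remaining seconds otherwise.

```shell
#!/bin/bash
# Interpret the integer returned by the Redis TTL command.
#   -2 : the key does not exist
#   -1 : the key exists but has no expiration (the case to alert on)
#  >=0 : seconds remaining
ttl_status() {
    case "$1" in
        -2) echo "missing" ;;
        -1) echo "no-expiry" ;;
        *)  echo "expires-in-${1}s" ;;
    esac
}

# Hypothetical wrapper: read key names from a file (one per line) and report
# the TTL status of each key. Requires a reachable Redis instance.
check_ttls() {
    local key ttl
    while read -r key; do
        ttl=$(redis-cli -h 192.168.47.75 -p 6380 TTL "$key")
        echo "${key} $(ttl_status "$ttl")"
    done < "$1"
}

# Classify a TTL value without contacting a server.
ttl_status -1
```

Running `check_ttls key_name.log | grep no-expiry` would then list the keys whose expiration time has been lost.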
