Persistence mechanism
What is persistence
Persistence is the mechanism of writing data to permanent storage media so that the saved data can be restored at a later time.
To protect the data held in memory, Redis saves it as a file on the hard disk. After the server restarts, the data on disk is automatically loaded back into Redis.
Redis supports two types of persistence:
- snapshotting mode (the default)
- append-only file (AOF)
Snapshotting snapshot
This persistence mode is enabled by default. All data in Redis is stored in a single file on disk (named dump.rdb by default).
If the dataset is very large (say 10-20 GB), it is not suitable to run this persistence operation frequently.
-
You can set the save location and the backup file name:

```
# snapshot file name, dump.rdb by default
dbfilename dump.rdb
# directory where the snapshot file is stored
dir ./
```
Other Configuration Items
```
# verify the RDB file with a checksum (default yes)
rdbchecksum yes
# stop accepting writes if a background save fails
stop-writing-on-bgsave-error yes
```
-
Manually Initiating a Snapshot
Method 1: from a logged-in client
Simply execute bgsave
Method 2: without logging in
./redis-cli -a <password> bgsave
-
Automatic execution (configuration files)
Every n seconds, or after n write operations, the data is dumped from memory into an RDB file, compressed, and placed in the backup directory.
-
How to enable it (enabled by default, with built-in trigger conditions):

```
save 900 1      # snapshot if at least 1 key changed within 900 seconds
save 300 10     # snapshot if at least 10 keys changed within 300 seconds
save 60 10000   # snapshot if at least 10000 keys changed within 60 seconds
```
Note: you can disable snapshot mode by commenting out the trigger conditions.
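The trigger logic of these save rules can be sketched in Python. This is a toy model for illustration only, not Redis's actual implementation; the names `SaveCondition` and `should_snapshot` are invented:

```python
class SaveCondition:
    """One 'save <seconds> <changes>' rule from redis.conf."""
    def __init__(self, seconds, changes):
        self.seconds = seconds
        self.changes = changes

def should_snapshot(rules, dirty_keys, seconds_since_last_save):
    # A snapshot fires as soon as ANY rule is satisfied: at least
    # `changes` keys modified within `seconds` of the last save.
    return any(seconds_since_last_save >= r.seconds and dirty_keys >= r.changes
               for r in rules)

rules = [SaveCondition(900, 1), SaveCondition(300, 10), SaveCondition(60, 10000)]
print(should_snapshot(rules, dirty_keys=12, seconds_since_last_save=301))  # True (300/10 rule)
print(should_snapshot(rules, dirty_keys=5, seconds_since_last_save=120))   # False
```

The rules are OR-ed together, which is why even a single changed key eventually triggers a snapshot once 900 seconds pass.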
-
Advantages of RDB
RDB is a compact binary file with high storage efficiency
RDB stores redis data snapshots at a certain point in time, which is very suitable for data backup, full replication and other scenarios
RDB can recover data much faster than AOF
Typical use: run a bgsave backup on the server every X hours and copy the RDB file to a remote machine for disaster recovery
- RDB shortcomings
Whether snapshots are triggered by command or by configuration, they are taken at intervals; so if Redis goes down unexpectedly, all changes since the last snapshot are lost.
bgsave forks a child process every time it runs, which costs some performance.
The RDB file format is not unified across the many versions of Redis, so files produced by one version may not be readable by another.
append-only file (AOF)
Disadvantages of RDB storage
-
Low storage efficiency with large data volumes
Based on the snapshot concept, every save reads and writes the entire dataset; when the data volume is large, this is very inefficient
-
The I/O performance of a large amount of data is low
-
Creating child processes based on fork incurs additional memory consumption
-
Data loss risks caused by downtime
Essence: record every "write" command (add, modify, delete) in an independent log; on restart, replay the commands in the AOF file to rebuild the data. Compared with RDB, this is a shift from recording the data itself to recording the process that produces the data.
-
AOF three data writing strategies
```
appendfsync always    # fsync after every write command: slowest, zero data loss; not recommended
appendfsync everysec  # fsync once per second: a good compromise between performance and durability; recommended
appendfsync no        # leave it to the OS: best performance, no durability guarantee
```
- How to enable it

```
appendonly yes
appendfsync everysec
# name of the command log file (the directory is configurable)
appendfilename appendonly.aof
```
- Aof file rewrite
Problem: every command is appended to the AOF. If one key is operated on 100 times, that produces 100 lines, and the AOF file becomes very large.
For example, running the incr number operation repeatedly stores the incr number command in the AOF just as many times.
This inflates the file; rewriting the AOF compresses such repeated commands into a single one.
For example, ten incr number commands on a key that started at 1 can be rewritten as the single command set number 11.
As commands write to AOF, files get bigger and bigger. To solve this problem, Redis introduced AOF rewriting to reduce file size.
Aof file rewrite is the process of converting data in the Redis process into write commands to synchronize to a new AOF file.
Simply put, it is to convert the execution results of several commands on the same data into the corresponding instructions of the final result data for recording.
Function:
Reduce disk usage and improve disk utilization
Improves the persistence efficiency, reduces the persistent write time, and improves I/O performance
Reduces the data recovery time and improves the data recovery efficiency
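The compaction idea can be sketched with a toy model. Real AOF rewrite works from the in-memory dataset rather than by replaying the log, and the command subset and the name `rewrite_aof` here are invented for illustration:

```python
def rewrite_aof(commands):
    """Replay write commands into a final state, then emit one SET per
    surviving key: the result of many commands becomes one command each."""
    state = {}
    for cmd in commands:
        op, key, *args = cmd.split()
        if op == "set":
            state[key] = args[0]
        elif op == "incr":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "decrby":
            state[key] = str(int(state.get(key, "0")) - int(args[0]))
        elif op == "del":
            state.pop(key, None)        # deleted keys leave no trace in the new AOF
    return [f"set {k} {v}" for k, v in state.items()]

log = ["set number 0"] + ["incr number"] * 10 + ["set tmp 1", "del tmp"]
print(rewrite_aof(log))  # ['set number 10'] -- thirteen commands become one
```

Note how the temporary key contributes nothing to the rewritten file: only final-result commands are kept, which is exactly why the rewrite shrinks the log.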
-
Manually trigger a rewrite:
From a logged-in client: run bgrewriteaof
Without logging in: ./bin/redis-cli -a <password> bgrewriteaof
-
Conditions for automatic rewrite:

```
# rewrite when the file has grown 100% since the last rewrite
auto-aof-rewrite-percentage 100
# ...and only once the AOF file is at least 64 MB
auto-aof-rewrite-min-size 64mb
# stop fsyncing the AOF while a rewrite is in progress
no-appendfsync-on-rewrite yes
```
Other problems
- While RDB is dumping, AOF synchronization pauses; are those writes lost?
No: the operations are cached in an in-memory queue and applied in one batch after the dump completes
- What does AOF rewrite mean
AOF rewriting means regenerating the .aof log from the data in memory, to solve the problem of the log growing too large.
- If both RDB and AOF files exist, who is the first to recover data
aof
- Whether the two can be used simultaneously
Yes, and recommended
- When recovering, which is faster, RDB or AOF
RDB is faster, because it is a memory image of the data and is loaded directly into memory, whereas AOF is a list of commands that must be executed one by one.
Note: if both persistence modes are enabled, AOF takes precedence on recovery; even though snapshot recovery is faster, Redis assumes the AOF file is the more complete record.
Redis transactions
A Redis transaction is a queue of commands: a series of predefined commands is wrapped into a whole (a queue) and, on execution, the commands run in order in one go, without interruption or interference.
Redis vs. mysql transactions
|  | MySQL | Redis |
|---|---|---|
| open | start transaction | multi |
| statements | ordinary SQL | ordinary commands |
| failure | rollback | discard |
| success | commit | exec |
Basic operation
```
set zhao 1000
set wang 2000
multi             # open the transaction
decrby zhao 100   # queued
incrby wang 100   # queued
exec              # actually execute
mget zhao wang    # "900" "2100"
```
```
mget zhao wang    # "900" "2100"
multi             # open the transaction
decrby zhao 200   # queued
incrby wang 200   # queued
discard           # cancel the transaction, release the queue
mget zhao wang    # still "900" "2100"
```
A statement error can occur in two ways
1. Syntax errors
In this case an error is reported when the command is queued, and none of the statements are executed
```
mget zhao wang    # "900" "2100"
multi
decrby zhao 200
aghdsajd          # syntax error: the whole transaction is refused
exec
mget zhao wang    # "900" "2100"
```
2. The syntax itself is correct, but the command is applied to the wrong kind of object.
For example, zadd run against a list. After exec, the correct statements execute and the inappropriate ones fail and are skipped
```
mget zhao wang    # "900" "2100"
multi
decrby zhao 200
sadd wang 200     # no syntax error, but wang is a string and sadd is a set command
exec              # partial success: this somewhat violates transaction atomicity
mget zhao wang    # "700" "2100"
```
Watch the lock
The watch command monitors one or more keys; if any of them is changed (or deleted) before exec, the subsequent transaction is not executed.
Monitoring lasts until the exec command (the queued commands run at exec, and the monitored keys are automatically unwatched once exec completes)
Scene: I’m buying a ticket
ticket -1 , money -100
There is only one ticket; if someone else buys it between multi and exec, the count becomes 0.
How can this situation be observed so that we do not commit?
- Pessimistic view: the world is full of danger, someone is bound to compete with me, so lock the ticket and let only me operate on it (pessimistic lock)
- Optimistic view: probably nobody is competing with me; I only need to pay attention, and act only if nobody has changed the ticket's value (optimistic lock)
In Redis transactions, optimistic locking is used: watch is only responsible for monitoring whether the key has been changed
```
set ticket 1
set money 100
watch ticket      # monitor ticket; if its value changes, the transaction aborts
# OK
multi
decr ticket
decrby money 100
exec              # returns (nil) if ticket was modified by another client
```
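The ticket scenario can be simulated without a server. The class below is a toy model of the WATCH/MULTI/EXEC semantics (all names are invented): the transaction commits only if the version of the watched key is unchanged, otherwise it returns nothing, like EXEC returning (nil):

```python
class TicketStore:
    """Toy key-value store with compare-and-set, mimicking WATCH/EXEC."""
    def __init__(self):
        self.data = {"ticket": 1, "money": 100}
        self.version = {"ticket": 0, "money": 0}

    def watch(self, key):
        return self.version[key]          # remember the version seen at WATCH time

    def exec_if_unchanged(self, key, seen_version, updates):
        if self.version[key] != seen_version:
            return None                   # watched key changed: like EXEC -> (nil)
        for k, v in updates.items():
            self.data[k] = v
            self.version[k] += 1
        return "OK"

store = TicketStore()
v = store.watch("ticket")
# another client buys the ticket between our WATCH and EXEC:
store.exec_if_unchanged("ticket", store.watch("ticket"), {"ticket": 0})
# our EXEC now fails instead of overselling:
print(store.exec_if_unchanged("ticket", v, {"ticket": 0, "money": 0}))  # None
```

A real client would retry the whole watch/multi/exec loop after a nil result rather than blindly resubmitting.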
A distributed lock
Business scenario: How to avoid the last item being purchased by more than one person at the same time? [Oversold problem]
Solution:
Set a public lock with setnx, relying on the return value of the setnx command (it fails if the key already has a value, succeeds if it does not)
If the set succeeds, this client has the right to perform the next business operation
If the set fails, the client has no control; it queues or waits
Release the lock with the del operation
```php
$redis = new Redis();
$redis->connect('127.0.0.1');
$redis->auth('321612');
// acquire the lock
$lock = $redis->setnx('lock-num', 1);
if ($lock) {
    $num = $redis->get('num');
    if ($num > 0) {
        $redis->decr('num');
    }
    // release the lock
    $redis->del('lock-num');
}
```
A deadlock
Relying on the distributed lock above: a user's client acquires the lock and then crashes mid-operation. How is this resolved?
Analysis:
Because locking is controlled entirely by the client, there is a risk the lock is never released
Unlocking therefore cannot depend on the client alone; the system must provide a corresponding guarantee
Solution:
Use expire to add a time limit to the lock key

```
expire key seconds
pexpire key milliseconds
```
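A toy model shows why the expiry matters: even if the lock holder crashes, the lock becomes re-acquirable once its TTL passes. `LockStore` and its methods are invented stand-ins for setnx plus expire:

```python
import time

class LockStore:
    """Toy SETNX + EXPIRE: a lock auto-expires, so a crashed client
    cannot hold it forever."""
    def __init__(self):
        self.locks = {}   # lock name -> expiry timestamp

    def acquire(self, name, ttl_seconds, now=None):
        now = time.monotonic() if now is None else now
        exp = self.locks.get(name)
        if exp is not None and exp > now:
            return False                  # lock held and not yet expired
        self.locks[name] = now + ttl_seconds
        return True

    def release(self, name):
        self.locks.pop(name, None)

store = LockStore()
print(store.acquire("lock-num", ttl_seconds=10, now=0))   # True
print(store.acquire("lock-num", ttl_seconds=10, now=5))   # False: still held
print(store.acquire("lock-num", ttl_seconds=10, now=11))  # True: expired, so no deadlock
```

In real Redis the same effect is achieved atomically with `set key value nx ex seconds`, so the lock and its expiry cannot be split by a crash in between.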
Deletion policy
Stale data
Redis is a memory-level database: all data lives in memory, and a key's state can be inspected with the ttl command, which returns:
a positive number: volatile data, with that many seconds left to live
-1: permanently valid data (no expiry set)
-2: expired data, deleted data, or a key that never existed
Data Deletion Policy
Goal of the data deletion policy:
Find a balance between memory usage and CPU usage; tilting too far either way degrades overall Redis performance and can even cause server outages or memory leaks
- Timed deletion
Create a timer per key; when the key expires, the timer task deletes it immediately
Advantages: saves memory; expired keys are removed at once, quickly freeing memory that is no longer needed
Disadvantages: heavy CPU pressure; the timers fire no matter how loaded the CPU already is, affecting the Redis server's response time and instruction throughput
Summary: trades processor performance for storage space (time for space)
- Lazy deletion
Nothing happens when a key reaches its expiry time; it is handled on the next access:
if the key has not expired, its value is returned
if it has expired, it is deleted and "does not exist" is returned
Advantages: saves CPU; a key is deleted only when it must be
Disadvantages: heavy memory pressure; expired data can occupy memory for a long time
Summary: trades storage space for processor performance (space for time)
- Periodic deletion
Redis periodically polls the volatile keys in each database, samples them randomly, and adjusts the deletion frequency according to the proportion of expired keys found
Feature 1: CPU usage peaks are bounded, and the check frequency can be configured
Feature 2: memory pressure stays moderate; cold expired data occupying memory is cleaned out continuously
Summary: periodically checks storage space (randomly, with the frequency driven by the expired ratio)
Principle:
When the Redis server starts, it reads the configured server.hz value (default 10)
serverCron() runs server.hz times per second
→ databasesCron()
→ activeExpireCycle()
activeExpireCycle() inspects each expires[*] dictionary in turn, spending 250ms / server.hz per pass
For a given expires[*], W keys are selected at random and checked:
if a key has expired, delete it
if more than W*25% of the sampled keys were deleted, repeat the sampling round
if at most W*25% were deleted, move on to the next expires[*] (looping over databases 0-15)
W = the value of the ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP property
The current_db variable records which expires[*] activeExpireCycle() has reached; if the time budget runs out, the next invocation continues from current_db
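The sampling loop can be sketched as follows. W and the 25% threshold mirror the description above; everything else (names, round limit, data) is invented for this toy model:

```python
import random

def active_expire_cycle(expires, now, w=20, max_rounds=16):
    """Toy periodic deletion: sample W volatile keys, delete the expired
    ones, repeat while more than 25% of the sample was expired."""
    deleted = 0
    for _ in range(max_rounds):
        keys = list(expires)
        if not keys:
            break
        sample = random.sample(keys, min(w, len(keys)))
        expired = [k for k in sample if expires[k] <= now]
        for k in expired:
            del expires[k]
        deleted += len(expired)
        if len(expired) <= len(sample) * 0.25:
            break        # few expired keys found: stop and try again next tick
    return deleted

# 200 volatile keys; half expired at t=5, half expire at t=50
expires = {f"k{i}": (5 if i % 2 else 50) for i in range(200)}
removed = active_expire_cycle(expires, now=10)
print(removed, "expired keys removed this cycle")
```

The loop never scans the whole keyspace, which is the point: CPU cost per tick stays bounded even with millions of volatile keys, at the price of expired keys lingering a little longer.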
- Deletion policy comparison

|  | Timed deletion | Lazy deletion | Periodic deletion |
|---|---|---|---|
| memory | saved, no lingering data | heavy usage, expired data lingers | cleared periodically and randomly |
| CPU | occupied at all times, high frequency | deferred execution, CPU used efficiently | a fixed slice per second maintains memory |
| conclusion | trade time for space | trade space for time | random sampling, focused checks |
Internally, Redis combines lazy deletion with periodic deletion
Eviction algorithm
What happens when new data arrives and Redis has run out of memory?
Redis stores data in memory and calls freeMemoryIfNeeded() before executing each command to check that memory is sufficient.
If the memory does not meet the minimum storage requirements of the newly added data, Redis temporarily deletes some data to clear space for the current instruction.
The strategy for choosing which data to clean up is called the eviction algorithm.
Note:
A single eviction pass is not guaranteed to free enough usable memory; if it falls short, it is repeated.
When every candidate has been tried and the memory requirement is still not met, an error is raised.
Configuration items related to data eviction:

```
# maximum usable memory, as a share of physical memory; default 0 = no limit
maxmemory
# how many keys to sample per eviction check; scanning the whole keyspace
# would be too slow, so candidates are picked at random
maxmemory-samples
# the eviction policy: which data to delete when maxmemory is reached
maxmemory-policy
```
There are three types of eviction strategies:
- Evict from volatile data (the set of keys that may expire, server.db[i].expires)
volatile-lru: evict the least recently used keys
volatile-lfu: evict the least frequently used keys
volatile-ttl: evict the keys closest to expiring
volatile-random: evict random volatile keys
- Evict from the full keyspace (all data, server.db[i].dict)
allkeys-lru: evict the least recently used keys
allkeys-lfu: evict the least frequently used keys
allkeys-random: evict random keys
- Do not evict
no-eviction: refuse to evict (the default policy since Redis 4.0); writes fail with an OOM (Out Of Memory) error
Example: Specific configuration
```
maxmemory-policy volatile-lru
```
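For intuition about the LRU idea behind volatile-lru/allkeys-lru, here is a minimal cache sketch. It is not Redis's actual algorithm (Redis uses an approximate LRU that samples maxmemory-samples keys); the `LruCache` class is invented for illustration:

```python
from collections import OrderedDict

class LruCache:
    """Sketch of allkeys-lru: when capacity is reached, evict the
    least recently used key to make room for the new one."""
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)        # touching a key makes it most recent
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.max_keys:
            self.data.popitem(last=False) # evict the least recently used key
        self.data[key] = value

c = LruCache(max_keys=2)
c.set("a", 1); c.set("b", 2)
c.get("a")               # "a" is now the most recently used
c.set("c", 3)            # cache full: "b" is evicted, not "a"
print(list(c.data))      # ['a', 'c']
```

The recency tracking is the whole trick: a key you keep reading survives eviction, while cold keys drift to the front of the queue and get dropped first.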
Data expulsion policy configuration basis
Use the info command to output monitoring information, check the cache hit and miss counters, and tune the Redis configuration according to business needs:
```
keyspace_hits
keyspace_misses
```
Server Configuration
Server setup
```
# run the server as a daemon
daemonize yes|no
# bind host address
bind 127.0.0.1
# server port
port 6379
# number of databases
databases 16
```
The log configuration
```
# server log level
loglevel debug|verbose|notice|warning
# log file name
logfile "<port>.log"
```
Client Configuration
```
# maximum number of simultaneous client connections; when the limit is
# reached, Redis closes new connections; 0 disables the limit
maxclients 0
# close a connection after it has been idle this many seconds; 0 disables
timeout 300
```
Quick configuration of multiple servers
Import and load a shared configuration file to create Redis instance configurations quickly: keep the common settings in one file and have each instance's configuration include it

```
include /path/server-<port>.conf
```
The advanced data type bitmaps
Function: Used for information state statistics
Basic operation
- setbit key offset value: set the bit at the given offset of the key to 1 or 0

```
setbit bits 0 1
```
- getbit key offset: get the bit value at the given offset of the key

```
getbit bits 0
# 1
getbit bits 10
# 0
```
Extend the operation
Movie website
Statistics on whether a movie is on demand at a given time each day
Count how many movies are shown on demand every day
Count how many movies are on demand per week/month/year
Figure out which movies weren’t on demand this year
- bitcount key [start end]: count the number of bits set to 1 in the key

```
setbit 20200808 11 1
setbit 20200808 333 1
setbit 20200808 1024 1
setbit 20200809 44 1
setbit 20200809 55 1
setbit 20200809 1024 1
bitcount 20200808
# 3
bitcount 20200809
# 3
setbit 20200808 6 1
bitcount 20200808
# 4
```
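The movie-site counters above can be modelled with a plain Python integer, one bit per movie id. This `Bitmap` class is a toy stand-in for setbit/getbit/bitcount, invented for illustration:

```python
class Bitmap:
    """Toy SETBIT/GETBIT/BITCOUNT backed by an arbitrary-precision int."""
    def __init__(self):
        self.bits = 0

    def setbit(self, offset, value):
        if value:
            self.bits |= 1 << offset      # turn the bit on
        else:
            self.bits &= ~(1 << offset)   # turn the bit off

    def getbit(self, offset):
        return (self.bits >> offset) & 1

    def bitcount(self):
        return bin(self.bits).count("1")

day = Bitmap()                  # one bitmap per day, one bit per movie id
for movie_id in (11, 333, 1024):
    day.setbit(movie_id, 1)
print(day.getbit(333))          # 1: movie 333 was played that day
print(day.bitcount())           # 3 movies played in total
```

Per-week or per-year answers come from OR-ing the daily bitmaps together, which is exactly what bitop does server-side.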
- bitop op destKey key1 [key2…]: perform a bitwise operation across the given keys and store the result in destKey. The supported operations are:
and: intersection
or: union
not: negation
xor: exclusive or

```
bitop or 08-09 20200808 20200809
```
The advanced data type HyperLogLog
Purpose: used for cardinality statistics
Cardinality: The cardinality is the number of elements in the dataset after deduplication
Note:
It is used only for cardinality statistics; it is not a collection and does not store the elements themselves, only the information needed for counting
The core of the algorithm is cardinality estimation, so the final value carries some error
Margin of error: the estimate is an approximation with a standard error of 0.81%
Space consumption is extremely low: each HyperLogLog key uses at most 12 KB of memory to track the cardinality
A key produced by pfmerge also occupies 12 KB, regardless of how much data went into the merge
Basic operation
- pfadd key element [element…] add data

```
pfadd hll 1
pfadd hll 1
pfadd hll 1
pfadd hll 1
pfadd hll 2
pfadd hll 2
pfadd hll 3
```
- pfcount key [key…] statistics
```
pfcount hll
# 3
```
- pfmerge destkey sourcekey [sourcekey…] Merge data
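For intuition, a minimal HyperLogLog fits in a few lines. This sketch (all names invented) uses only 256 registers, so its error is far larger than Redis's 0.81%, but the mechanism is the same: each register keeps the maximum leading-zero rank seen, and a harmonic mean turns the registers into an estimate:

```python
import hashlib
import math

class TinyHll:
    """Minimal HyperLogLog sketch with m = 2^b registers."""
    def __init__(self, b=8):
        self.b = b
        self.m = 1 << b
        self.reg = [0] * self.m

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h & (self.m - 1)                      # low b bits pick the register
        rest = h >> self.b                          # remaining 64-b hash bits
        rank = 64 - self.b - rest.bit_length() + 1  # position of the first 1 bit
        self.reg[idx] = max(self.reg[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        e = alpha * self.m * self.m / sum(2.0 ** -r for r in self.reg)
        zeros = self.reg.count(0)
        if e <= 2.5 * self.m and zeros:             # small-range correction
            e = self.m * math.log(self.m / zeros)
        return int(e)

hll = TinyHll()
for i in range(10000):
    hll.add(i % 1000)          # 10000 insertions, only 1000 distinct values
print(hll.count())             # roughly 1000, duplicates do not inflate it
```

Note the memory story: the registers cost a fixed amount no matter how many elements flow through, which is why a Redis HyperLogLog never grows past 12 KB.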
The advanced data type GEO
Purpose: used for geographic position calculations
Basic operation
-
Add the coordinates of the geographic location
```
geoadd key longitude latitude member [longitude latitude member ...]

geoadd geos 1 1 a
geoadd geos 2 2 b
```
-
Gets the coordinates of a geographic location
```
geopos key member [member ...]

geopos geos a
geopos geos b
```
-
Calculate the distance between two positions
```
geodist key member1 member2 [unit]

geodist geos a b m
geodist geos a b km
```
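geodist's result can be approximated with the haversine formula. The sketch below assumes a spherical Earth of radius 6371 km, so its output will differ slightly from Redis's own numbers:

```python
import math

def geodist_km(lon1, lat1, lon2, lat2):
    """Great-circle distance (haversine), the same idea geodist uses."""
    r = 6371.0                               # mean Earth radius in km (assumption)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# members a(1,1) and b(2,2) from the geoadd example above
print(round(geodist_km(1, 1, 2, 2), 1), "km")   # about 157 km
```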
-
Retrieves a set of geographic locations within a specified range according to the latitude and longitude coordinates given by the user
GEORADIUS key longitude latitude radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] [ASC|DESC] [STORE key] [STOREDIST key]
```
geoadd geos 1 1 1,1
geoadd geos 1 2 1,2
geoadd geos 1 3 1,3
geoadd geos 2 1 2,1
geoadd geos 2 2 2,2
geoadd geos 2 3 2,3
geoadd geos 3 1 3,1
geoadd geos 3 2 3,2
geoadd geos 3 3 3,3
geoadd geos 5 5 5,5
georadius geos 1.5 1.5 90 km
# "1,2" "2,2" "1,1" "2,1"
```
-
Gets a set of geographic locations within a specified range based on a location stored in the location set
GEORADIUSBYMEMBER key member radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] [ASC|DESC] [STORE key] [STOREDIST key]
```
georadiusbymember geos 2,2 180 km
# all nine members within 180 km:
# "1,1" "1,2" "1,3" "2,1" "2,2" "2,3" "3,1" "3,2" "3,3"
```
-
Returns the geohash value of one or more location objects
GEOHASH key member [member …]
```
geohash geos 2,2
# "s037ms06g70"
```
A master-slave replication
To reduce the load on a single Redis server, several servers can be deployed in master/slave mode:
one server carries the "write" load (add, modify, delete) while others carry the "read" load, and the master's data is "automatically" synchronized to the slaves.
Redis supports easy-to-use master/slave replication, which lets a slave server become an exact replica of the master server.
The role of master-slave replication:
Read/write separation: Master writes data and slave reads data, improving the read/write load of the server
Load balancing: Based on the master-slave structure, with read/write separation, the slave shares the master load and changes the number of slaves according to the change of demand. The data read load is shared by multiple slave nodes, greatly improving the concurrency and data throughput of redis server
Fault recovery: When the master fails, the slave provides services for rapid fault recovery
Data redundancy: Hot data backup is a data redundancy method other than persistence
High availability cornerstone: Based on master/slave replication, build sentinel mode and cluster to realize the high availability solution of Redis
Note:
Redis uses asynchronous replication, which does not block the master or the slave servers
A master server can have multiple slave servers. Not only can the master server have slave servers, but slave servers can also have their own slave servers
There are three phases of master/slave replication:
Establish connection phase
Data synchronization phase
Command propagation phase
Master slave communication process
The configuration steps
Prepare two VMS:
192.168.1.69 (master), 192.168.1.70 (slave)
Master server configuration:
```
# comment out the loopback-only binding:
#bind 127.0.0.1
# change protected-mode yes to:
protected-mode no
```
Slave server configuration:
-
Use slaveof to specify the role and the master server's address and port

```
# slaveof <master-ip> <master-port>
slaveof 192.168.1.69 6379
```
-
Secondary server read only
```
# since Redis 2.6 the slave supports read-only mode, configured with
# the slave-read-only item; this is the slave's default mode
slave-read-only yes
```
-
Specifies the password for the secondary server to connect to the primary server
If the master has a password set through the requirepass option, configure that password on the slave through masterauth so that the synchronization can proceed smoothly.

```
masterauth 321612   # the master server's password
```
**Verify**: run info replication on the slave to check the configuration; master_link_status:up means it succeeded, down means it failed.

**Cancel**: simply comment out the configuration above on the slave.

Notes for the master:

* If the master holds a large dataset, avoid synchronizing during traffic peaks, so the master is not blocked and normal service is unaffected.
* If the replication buffer is sized badly, data can overflow it: when a full replication takes so long that the data needed for partial replication has already been dropped, a second full replication is forced and the slave falls into a replication loop.
* Keep the master's own memory usage moderate, ideally 50%-70% of the host's memory, leaving 30%-50% for bgsave execution and the replication buffer.
```
# default 1mb
repl-backlog-size 1mb
```
Notes for the slave:

* Consider refusing external service during full or partial replication, to avoid blocked responses or reads of half-synchronized data:

```
slave-serve-stale-data yes|no
```
* During data synchronization, the master sends information to the slave (ping); the master can be seen as a client of the slave, actively pushing commands to it.
* If multiple slaves request synchronization at the same time, the number of RDB files the master must send spikes, which hits the bandwidth hard; when the master's bandwidth is limited, stagger the synchronizations according to business needs.

## Command propagation phase

* When the master's dataset changes, master and slave become inconsistent and must be brought back to a consistent state; this synchronization action is called command propagation.
* The master forwards every write command it receives to the slave, and the slave executes it on arrival.
* Network interruptions during command propagation are handled by their duration:
    * intermittent flapping: ignored
    * short interruption: partial replication
    * long interruption: full replication
* The three core elements of partial replication:
    * the run IDs of the servers
    * the master's replication backlog buffer
    * the replication offsets of master and slave

## Common problems with master/slave replication

**Frequent full replication**, **frequent network interruptions**, **inconsistent data**.

Every time a slave disconnects, whether deliberately or through a network fault, it must take a full RDB dump from the master and then the AOF; the whole synchronization process starts over. So remember: if you have multiple slave servers, do not start them all at once.

## Sentinel mode

A sentinel is a distributed system used to monitor each server in the master/slave structure. On failure, a new master is chosen by a voting mechanism and all slaves are connected to it.

Functions of sentinels:

* Monitoring
    * continuously check whether the master and slaves are running properly
    * master liveness detection, master and slave health detection
* Notification (alerting)
* Automatic failover
    * disconnect the failed master, promote a slave to master, connect the other slaves to the new master, and inform clients of the new server address

Note: a sentinel is also a Redis server, but it provides no data service. Sentinel deployments usually use an odd number of sentinels.

## Enabling sentinel mode

* configure a one-master-two-slaves structure
* configure three sentinel.conf files (same settings, different ports):
```
# port of this sentinel
port 26379
# working directory
dir /tmp
# mymaster: a custom name; 2: how many sentinels must agree the master is
# down, usually (number of sentinels / 2) + 1
sentinel monitor mymaster 127.0.0.1 6379 2
# how long the master may fail to respond before being considered down
sentinel down-after-milliseconds mymaster 30000
# how many slaves sync with the new master at a time during failover
sentinel parallel-syncs mymaster 1
# failover timeout
sentinel failover-timeout mymaster 180000
```
The sentinels go through three stages during a master/slave switchover:

* Monitoring: synchronize the status information of each node
    * get the status of every sentinel (online or not)
    * get the master's status
        * runid
        * role: master
        * details of each slave
    * get the status of all slaves (based on the master's information)
        * runid
        * role: slave
        * master_host, master_port
        * offset
        * ...
* Notification
* Failover
    * vote for which sentinel leads the switchover
    * pick the replacement master; it must be online, responsive, and not long disconnected from the original master, with ties broken by priority, then offset, then runid
    * send the instructions (from the sentinel):
        * slaveof no one to the new master
        * slaveof <new-master-ip> <port> to the other slaves
The cluster
A cluster uses the network to connect a number of computers and provides a unified management interface, so that from the outside it presents the effect of a single service
Cluster role:
Load balancing is implemented by distributing the access pressure on a single server
The storage pressure of a single server is dispersed to achieve scalability
Reduce service disasters caused by the failure of a single server
Redis cluster structure design
Data storage design
- An algorithm computes, from the key, the location where it should be saved
- The whole storage space is divided into 16384 slots, and each host stores a share of them
- Each slot represents a unit of storage space, not the storage for one particular key
- The key is placed into the slot the calculation points to
- This design makes the cluster easy to scale
Design of cluster internal communication
- Every node communicates with the others, and each node stores the slot-number data of every database
- On a hit, the data is returned directly
- On a miss, the client is told which node holds the data
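The slot calculation is CRC16 of the key modulo 16384. The sketch below implements the CRC16/XMODEM variant that Redis Cluster uses, including hash tags (`{...}`), which make only the tagged part count so that related keys land in the same slot; the function names are my own:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            # shift left; on carry-out, fold in the polynomial
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    # If the key contains a non-empty {...}, only the part inside the
    # braces is hashed, forcing related keys into the same slot.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384

print(key_slot(b"foo"))                                       # a slot in 0..16383
print(key_slot(b"{user1}.name") == key_slot(b"{user1}.age"))  # True: same slot
```

The hash-tag trick matters for multi-key commands (mget, transactions): in a cluster they only work when all the keys involved map to one slot.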
Cluster Sets up the cluster structure
Three masters, each with one slave (one-to-one)
Configure redis.conf:
```
# enable cluster mode
cluster-enabled yes
# cluster configuration file; this file is generated automatically
cluster-config-file nodes-6379.conf
# timeout for a node's response
cluster-node-timeout 10000
# minimum number of slaves a master must keep
cluster-migration-barrier <count>
```
Start the cluster
```
# --replicas 1: each master gets one slave; the first three addresses become masters
./redis-trib.rb create --replicas 1 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 127.0.0.1:6382 127.0.0.1:6383 127.0.0.1:6384
```
operation
```
# connect in cluster mode
redis-cli -c
# set data
set name xiaoming
# get data
get name
```
Enterprise-level solutions
Cache warming
Problem: The server fails quickly after startup
Symptoms:
1. High number of requests
2. The data throughput between master and slave is large and the data synchronization operation frequency is high
Solution:
-
Preparatory work:
1. Daily routine data access records and hotspot data with high access frequency
2. Use LRU data deletion strategy to build data retention queue
-
Preparations:
The data in the statistics result is classified. Redis preferentially loads the hotspot data with a higher level
Using distributed multiple servers to read data at the same time to speed up the data loading process
-
Implementation:
Trigger the data warm-up process with a script at fixed times
If possible, use CDN(content delivery Network), the effect is better
Conclusion:
Cache preheating means that relevant cache data is directly loaded to the cache system before system startup. Avoid the problem of querying the database first and then caching the data when the user requests it! Users directly query cached data that has been preheated.
Cache avalanche
Database server crash:
-
The system runs smoothly, suddenly the database connection quantity surges
-
The application server cannot process requests in a timely manner
-
A large number of 408 and 500 error pages appear
-
The customer refreshes the page repeatedly for data
-
Database crash
-
Application Server Crash
-
Restarting the application server fails
-
Redis server crashes
-
The Redis cluster crashes
-
After the database is restarted, it is knocked down by instantaneous traffic again
Troubleshoot problems
-
More sets of keys in the cache expire over a shorter period of time
-
During this period, Redis is asked for data that has already expired; the cache misses, and the requests fall through to the database
-
The database receives a large number of requests simultaneously and cannot process them in a timely manner
-
Redis has a large number of requests backlogged and starts to time out
-
The database crashes due to the database traffic surge
-
No data is available in the cache after the restart
-
The Redis server resources are heavily occupied and the Redis server crashes
-
The Redis cluster collapses
-
The application server fails to receive data and respond to requests in a timely manner. As a result, the number of requests from clients increases, and the application server crashes
-
The application server, Redis and database are all restarted, but the effect is not ideal
Problem analysis:
- Within a short time frame
- A large number of keys expire in a concentrated burst
Solution (strategy):
- Serve more pages statically
- Build a multi-level cache architecture
  Nginx cache + Redis cache + Ehcache
- Optimize business logic and investigate serious MySQL time consumption
  Troubleshoot database bottlenecks, such as timed-out queries and long-running transactions
- Disaster-warning system
  Monitor Redis server performance metrics:
  - CPU occupancy and CPU usage
  - Memory capacity
  - Average query response time
  - Number of threads
- Traffic limiting and degradation
  Temporarily sacrifice some customer experience: restrict access for some requests to reduce the pressure on the application server, then gradually reopen access once services are running smoothly again
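Traffic limiting as described above is commonly implemented with a token bucket: requests beyond the allowed rate are rejected (and degraded to a static page, a queue, etc.). A minimal single-process sketch, with hypothetical rate and capacity values:

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: admit roughly `rate` requests per
    second, with bursts up to `capacity`; excess requests are rejected."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens according to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should degrade: static page, queue, retry later

bucket = TokenBucket(rate=1, capacity=5)
results = [bucket.allow() for _ in range(8)]
print(results.count(True))  # typically 5 -- the burst capacity; the rest rejected
```

In a multi-server deployment the counter usually lives in Redis itself (e.g. `INCR` plus `EXPIRE` per time window) so all application servers share one limit.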
Solution (tactics):
- Switch between the LRU and LFU eviction policies
- Adjust the data validity-period policy
  - Stagger peaks by classifying business data by validity period: 90 minutes for class A, 80 minutes for class B, 70 minutes for class C
  - Use "fixed time + random value" expirations to dilute the number of keys expiring together
- Use permanent keys for super-hot data
- Regular maintenance (automatic + manual)
  Analyze the traffic of data that is about to expire, decide whether to extend it, and extend hot data based on the traffic statistics
- Locking
  Use with caution!
Conclusion
A cache avalanche occurs when a very large amount of data expires at once, causing a surge of pressure on the database server.
If concentrated expiration can be avoided, avalanches can largely be prevented (concentrated expiration accounts for roughly 40% of causes). Combine this with the other policies, monitor the server's runtime data, and adjust quickly based on the records.
Cache breakdown
Database server crash (symptom timeline):
- The system is running smoothly
- The number of database connections shoots up
- No large number of keys expired on the Redis server
- Redis memory is smooth, with no fluctuations
- The Redis server's CPU usage is normal
- The database crashes
Troubleshooting:
- A single key in Redis expired while it was being heavily accessed
- Multiple data requests pressed from the server directly onto Redis, but none hit
- A large number of accesses to the same data are initiated against the database within a short period
Problem analysis:
- Hotspot data concentrated on a single key
- That key expired
Solution (tactics):
- Presetting
  Take e-commerce as an example: based on store level, each merchant designates several flagship products, and the expiration time of those information keys is increased during a shopping festival
  Note: the shopping festival does not refer only to the day itself; in the following days, peak visits tend to decrease gradually
- Adjustment
  Monitor traffic, and for data with a natural traffic surge, extend the expiration period or set the key as permanent
- Background data refresh
  Start scheduled tasks that refresh the data validity periods before peak hours, ensuring the hot keys do not expire
- Second-level cache
  Set different expiration times for the levels to ensure the data is not evicted from both at the same time
- Locking
  A distributed lock prevents breakdown, but watch out for the performance bottleneck; use with caution!
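The locking idea can be illustrated in a single process: on a miss for the hot key, only one thread rebuilds the value from the database while concurrent requests wait for the result instead of stampeding the database. This sketch uses `threading.Lock` as a stand-in for the distributed lock (which in a multi-server setup would typically be a Redis `SET key value NX PX ...`); the loader and key names are hypothetical:

```python
import threading

class SingleFlightCache:
    """On a cache miss, let exactly one thread query the database;
    others wait and then reuse the cached result (breakdown protection)."""

    def __init__(self, loader):
        self.loader = loader           # the slow database query
        self.cache = {}
        self.lock = threading.Lock()
        self.db_calls = 0              # instrumentation for the demo

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value
        with self.lock:                # only one rebuilder at a time
            # Double-check: another thread may have filled the cache
            # while we were waiting for the lock.
            value = self.cache.get(key)
            if value is None:
                self.db_calls += 1
                value = self.loader(key)
                self.cache[key] = value
        return value

cache = SingleFlightCache(loader=lambda k: f"value-of-{k}")
threads = [threading.Thread(target=cache.get, args=("hot-key",))
           for _ in range(20)]
for t in threads: t.start()
for t in threads: t.join()
print(cache.db_calls)  # 1 -- the database was queried only once
```

The double-check inside the lock is what guarantees a single database hit; without it, every waiting thread would re-query after acquiring the lock.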
Conclusion
Cache breakdown happens at the moment a single piece of hotspot data expires: the data is under heavy traffic, Redis misses, and a large number of database accesses to the same data are initiated, putting pressure on the database server.
The coping strategy should be prevention based on business-data analysis, combined with runtime monitoring and real-time policy adjustment; after all, the expiration of a single key is difficult to monitor.
Cache penetration
Database server crash (symptom timeline):
- The system is running smoothly
- Application server traffic increases over time
- The Redis server hit ratio decreases over time
- Redis memory is smooth, with no memory pressure
- Redis server CPU usage increases rapidly
- Database server pressure spikes
- The database crashes
Troubleshooting:
- A large area of misses appears in Redis
- Abnormal URL accesses occur
Problem analysis:
- The requested data does not exist in the database, so the database query returns nothing
- Redis returns null for such data without caching it
- The same process repeats the next time that data is requested
- The server is under attack (e.g., a hacker probing with non-existent keys)
Solution (tactics):
- Cache null values
  Cache data whose query result is null (clear it periodically if used long-term), and set a short time limit, such as 30-60 seconds and at most 5 minutes
- Whitelist policy
  - Preheat bitmaps for each data category in advance, using the id as the bitmap offset; this acts as a data whitelist. Normal data passes the load-time check, and abnormal data is intercepted directly (low efficiency)
  - Use a Bloom filter (the false-positive rate of a Bloom filter is negligible under these conditions)
- Monitoring
  Monitor in real time the Redis hit ratio (usually a stable fluctuation range when business is normal) against the proportion of null-data hits
  - Fluctuation in non-active periods: usually 3-5 times the baseline; above 5 times, include it in the key screening targets
  - Fluctuation in active periods: usually 10-50 times the baseline; above 50 times, include it in the key screening targets
  Start different troubleshooting processes depending on the multiple, then use a blacklist for prevention and control (an operations task)
- Key encryption
  When the problem occurs, temporarily enable disaster-prevention business keys: encrypt keys at the business layer and set up a verification program to check incoming keys
  For example, randomly allocate 60 encryption strings every day, pick two or three, and mix them into the page data ids; if an access key does not match the rules, reject the data access
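The Bloom-filter whitelist mentioned above answers "definitely absent" or "probably present" for an id, so requests for ids that cannot exist are rejected before they ever reach Redis or the database. A toy in-process sketch (size, hash count, and the example ids are arbitrary; Redis modules such as RedisBloom provide a production version):

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = 0  # an int used as a bit array

    def _positions(self, item):
        # Derive `hashes` independent bit positions from one item.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
for real_id in ["id-1", "id-2", "id-3"]:  # ids that actually exist
    bf.add(real_id)
print(bf.might_contain("id-2"))      # True  -- known id passes
print(bf.might_contain("id-99999"))  # almost certainly False -- rejected early
```

Ids the filter has seen always pass; an unknown id is rejected except for a tiny false-positive chance, which for penetration protection only means a harmless extra database lookup.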
Conclusion
Cache penetration: access to non-existent data skips the Redis caching stage that legitimate data goes through, so every such access hits the database directly and puts pressure on the database server. Normally the amount of such data is low; when the situation occurs, raise an alarm promptly.
The countermeasures should focus more on temporary contingency-plan prevention.
Both the blacklist and the whitelist put pressure on the whole system; remove them as soon as the alarm is cleared.
Performance Indicator Monitoring
- Performance indicators: Performance
Name | Description |
---|---|
latency | The time Redis takes to respond to a request |
instantaneous_ops_per_sec | Average number of requests processed per second |
hit rate (calculated) | Cache hit ratio, derived as keyspace_hits / (keyspace_hits + keyspace_misses) |
- Memory indicators: Memory
Name | Description |
---|---|
used_memory | Used memory |
mem_fragmentation_ratio | Memory fragmentation rate |
evicted_keys | The number of keys removed due to the maximum memory limit |
blocked_clients | Number of clients blocked on BLPOP, BRPOP, or BRPOPLPUSH |
- Basic Activity indicator: Basic Activity
Name | Description |
---|---|
connected_clients | Number of client connections |
connected_slaves | Number of slaves |
master_last_io_seconds_ago | Number of seconds since last master-slave interaction |
keyspace | Total number of keys in the database |
- Persistence indicator: Persistence
Name | Description |
---|---|
rdb_last_save_time | The timestamp of the last persistent save to disk |
rdb_changes_since_last_save | The number of changes to the database since the last persistence |
- Error indicator: Error
Name | Description |
---|---|
rejected_connections | The number of connections rejected because the maxClient limit was reached |
keyspace_misses | Number of key search failures (no match) |
master_link_down_since_seconds | Duration of master/slave disconnection in seconds |
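The calculated hit rate from the performance table is derived from the two `INFO` counters `keyspace_hits` and `keyspace_misses`. A minimal sketch, where the stats dict mimics the relevant fields of `INFO stats` output:

```python
def hit_ratio(info_stats):
    """Cache hit ratio from Redis INFO counters:
    keyspace_hits / (keyspace_hits + keyspace_misses)."""
    hits = info_stats["keyspace_hits"]
    misses = info_stats["keyspace_misses"]
    total = hits + misses
    return hits / total if total else 0.0  # avoid dividing by zero on a fresh server

# Hypothetical counter values as they would appear in `INFO stats`
stats = {"keyspace_hits": 9000, "keyspace_misses": 1000}
print(f"{hit_ratio(stats):.1%}")  # 90.0%
```

A steadily falling hit ratio is one of the cache-penetration symptoms described earlier, which is why this derived value is worth graphing alongside the raw counters.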
Commands for monitoring performance indicators
- benchmark
The command
redis-benchmark [-h <host>] [-p <port>] [-c <clients>] [-n <requests>] [-k <boolean>]

// Default: 50 connections, 10000 requests
redis-benchmark
// 100 connections, 5000 requests
redis-benchmark -c 100 -n 5000
- Monitor Displays debugging information about the server
monitor
- slowlog [operator]
- get: obtain the slow-query log
- len: obtain the number of slow-query log entries
- reset: reset the slow-query log
slowlog get
Related configuration:

slowlog-log-slower-than 1000  # log commands that run longer than 1000 microseconds
slowlog-max-len 100           # keep at most 100 entries in the slow-query log