Contents

  • Overview
  • Redis data structures and common commands
  • Data persistence
  • Memory management and data eviction mechanisms
  • Pipelining
  • Transactions and Scripting
  • Redis performance tuning
  • Master/slave replication and cluster sharding
  • Choosing a Redis Java client


This article starts from the basic characteristics of Redis and gives an intuitive introduction to its core capabilities by walking through its data structures and main commands.

It then gives an overview of the advanced capabilities Redis offers, with more in-depth guidance on deployment, maintenance, and performance tuning.

This article is intended for developers who use Redis in everyday work, as well as architects who handle technology selection, architectural design, and performance tuning for Redis.

Overview

Redis is an open source, memory-based structured data storage medium that can be used as a database, caching service, or messaging service.

Redis supports a variety of data structures, including strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLogs, and more.

Redis supports LRU eviction, transactions, and different levels of on-disk persistence. It also supports replication and high availability through Redis Sentinel, as well as automatic data sharding through Redis Cluster.

The main functionality of Redis is based on a single-threaded model: Redis uses a single thread to serve all client requests.

At the same time, Redis uses non-blocking IO and carefully optimizes the time complexity of its commands, which means:

  • Redis is thread-safe (because there is only one thread); every command is atomic, and concurrency cannot cause data anomalies
  • Redis is very fast (because it uses non-blocking IO and most commands have O(1) time complexity)
  • Time-consuming Redis commands are dangerous: they occupy the single thread for a long time and slow down every other request. (For example, the O(N) KEYS command is strictly prohibited in production environments.)

Redis data structure and related common commands

This section introduces the main data structures supported by Redis and the common commands associated with each.

Key

Redis is fundamentally a key-value store. Any binary sequence can be used as a Redis key (such as a regular string or the contents of a JPEG image).

Some things to note about keys:

  • Don't use overly long keys. A 1024-byte key, for example, is a bad idea: it consumes more memory and makes lookups less efficient
  • Don't use overly short keys either. For example, "u1000flw" saves very little storage space compared with "user:1000:followers", but hurts readability and maintainability
  • It is best to follow a uniform naming convention for keys, such as "object-type:id:attr". Keys designed with this convention might be "user:1000" or "comment:1234:reply-to"
  • The maximum key length allowed by Redis is 512MB (the limit for values is also 512MB)

String

String is the basic data type of Redis. Redis does not have the concepts of Int, Float, Boolean, etc. All basic data types are represented by String in Redis.

Common String related commands:

  • SET: Sets the value of a key. The EX/PX options specify the key's time to live, and the NX/XX options make the write conditional on the key not existing/already existing. Time complexity O(1)
  • GET: Returns the value of a key. Time complexity O(1)
  • GETSET: Sets the value of a key and returns its previous value. Time complexity O(1)
  • MSET: Sets values for multiple keys. Time complexity O(N)
  • MSETNX: Same as MSET, except that if any of the specified keys already exists, no operation is performed. Time complexity O(N)
  • MGET: Returns the values of multiple keys. Time complexity O(N)

As mentioned above, String is the only basic data type in Redis. However, Redis can treat a String as an integer or floating-point number, mainly through the INCR/DECR family of commands:

  • INCR: Increments the value of the key by 1 and returns the value after the increment. Applies only to String values that can be parsed as integers. Time complexity O(1)
  • INCRBY: Increments the value of the key by the specified integer and returns the value after the increment. Applies only to String values that can be parsed as integers. Time complexity O(1)
  • DECR/DECRBY: Same as INCR/INCRBY, but decrement instead of increment

The INCR/DECR family of commands requires the value being operated on to be a String that can be parsed as a 64-bit signed integer; otherwise an error is returned.

In other words, the value must be in the range -2^63 to 2^63-1.

As mentioned earlier, Redis uses a single-threaded model, which is naturally thread-safe, making INCR/DECR commands very convenient to achieve precise control in high concurrency scenarios.

Example 1: Inventory control

Accurately check the remaining inventory in a high-concurrency scenario to ensure that nothing is oversold.

Set the total inventory:

SET inv:remain "100"


Inventory deduction + margin check:

DECR inv:remain


If DECR returns a value greater than or equal to 0, the inventory check passes; if it returns a value less than 0, the inventory is exhausted.

Assuming 300 concurrent inventory deduction requests, Redis guarantees that the 300 requests receive the return values 99 down to -200, each exactly once; no two requests ever receive the same return value.
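
The guarantee described above can be illustrated with a small simulation. This is only a sketch: CounterModel and simulate_orders are hypothetical names, and the lock merely stands in for the fact that Redis executes one command at a time. No Redis server or client library is involved.

```python
import threading


class CounterModel:
    """Toy stand-in for a Redis integer key (hypothetical, not a Redis client)."""

    def __init__(self, value):
        self._value = value
        # The lock models Redis running commands one at a time.
        self._lock = threading.Lock()

    def decr(self):
        """Mimic DECR: subtract 1 and return the new value atomically."""
        with self._lock:
            self._value -= 1
            return self._value


def simulate_orders(stock, requests):
    """Fire `requests` concurrent deductions against `stock` units."""
    counter = CounterModel(stock)
    results = []

    def order():
        results.append(counter.decr())

    threads = [threading.Thread(target=order) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = simulate_orders(stock=100, requests=300)
# A request is accepted only when the value returned to it is >= 0.
accepted = [r for r in results if r >= 0]
```

Each simulated request sees a distinct return value, so exactly as many requests are accepted as there were units in stock.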

Example 2: Auto-increment sequence generation

Implement RDBMS-like sequence functionality to generate a series of unique sequence numbers.

Set the sequence start value:

SET sequence "10000"


Get a sequence value:

INCR sequence


Just use the return value as a sequence.

Get a batch of (say 100) sequence values:

INCRBY sequence 100


Assuming the return value is N, the values in [N-99, N] are all available sequence values.

When multiple clients request an increment sequence from Redis at the same time, Redis can ensure that each client gets a globally unique sequence value or range, and there will never be a situation where different clients get a duplicate sequence value.
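
The arithmetic for turning an INCRBY return value into a batch of sequence values can be sketched as follows. batch_range is a hypothetical helper name, and the return values 10100 and 10200 are what two consecutive INCRBY sequence 100 calls would produce if the sequence started at 10000 as in the example above.

```python
def batch_range(incrby_result, batch_size):
    """Return the sequence values reserved by one INCRBY call.

    If INCRBY returned N for a batch of size B, the caller owns
    the values N-B+1 .. N inclusive.
    """
    first = incrby_result - batch_size + 1
    return range(first, incrby_result + 1)


# Two clients each reserve 100 values; their ranges cannot overlap
# because every INCRBY call returns a different N.
client_a = batch_range(10100, 100)
client_b = batch_range(10200, 100)
```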

List

A Redis List is a linked-list data structure; you can push and pop elements at both ends of the List using commands such as LPUSH/RPUSH/LPOP/RPOP.

Lists also support inserting and reading elements at a specific index, but these operations have high time complexity (O(N)) and should be used with caution.

Common commands related to List:

  • LPUSH: Inserts one or more elements at the left (head) of the specified List and returns the length of the List after insertion. Time complexity O(N), where N is the number of inserted elements
  • RPUSH: Same as LPUSH, but inserts one or more elements at the right (tail) of the specified List
  • LPOP: Removes an element from the left (head) of the specified List and returns it. Time complexity O(1)
  • RPOP: Same as LPOP, but removes an element from the right (tail) of the specified List and returns it
  • LPUSHX/RPUSHX: Same as LPUSH/RPUSH, except that no operation is performed if the key does not exist
  • LLEN: Returns the length of the specified List. Time complexity O(1)
  • LRANGE: Returns the elements in the specified range of the specified List (both ends inclusive, i.e. LRANGE key 0 10 returns 11 elements). Time complexity O(N)


Note: Keep the number of elements fetched at a time as small as possible, because fetching a large range of List elements in one call causes latency. For Lists of unpredictable length, avoid full traversals such as LRANGE key 0 -1.

List commands that should be used with caution:

  • LINDEX: Returns the element at the specified index of the specified List, or nil if the index is out of range. Negative indexes count from the tail: -1 is the last element and -2 is the second-to-last. Time complexity O(N)
  • LSET: Sets the element at the specified index of the specified List to value and returns an error if the index is out of range. Time complexity O(N), or O(1) when operating on the head/tail element
  • LINSERT: Inserts a new element before/after the specified element in the specified List and returns the length of the List after the operation. Returns -1 if the specified element does not exist. If the specified key does not exist, no operation is performed. Time complexity O(N)

Because a Redis List is a linked list, all three commands above have to traverse the List, so their running time is hard to predict and grows significantly with the length of the List. Use them with caution.

In other words, the Redis List is designed for implementing queues, not random-access lists like ArrayList.

If you are not implementing a double-ended queue, try to avoid the Redis List data structure.

To support queues, Redis also provides blocking commands such as BLPOP/BRPOP, which offer BlockingQueue-like behavior: if the List is empty, the connection blocks until an element becomes available to pop.

Hash

A Redis Hash is a hash table: a field-value data structure like a traditional hash table. You can think of it as a HashMap moved into Redis.

A Hash is well suited to representing object data, with the fields of the Hash corresponding to the fields of the object.

Advantages of Hash include:

  • Two-level lookups are possible, such as "find the age of the user with ID 1000"
  • Compared with serializing the entire object as a String, a Hash reduces network transfer overhead because individual fields can be read and written
  • When a Hash is used to maintain a collection, it offers much more efficient random access than a List

Common commands related to Hash:

  • HSET: Sets field in the Hash stored at key to value. If the Hash does not exist, it is created automatically. Time complexity O(1)
  • HGET: Returns the value of field in the specified Hash. Time complexity O(1)
  • HMSET/HMGET: Same as HSET/HGET, but operates on multiple fields under the same key in one call. Time complexity O(N), where N is the number of fields in the call
  • HSETNX: Same as HSET, except that no operation is performed if field already exists. Time complexity O(1)
  • HEXISTS: Checks whether field exists in the specified Hash. Returns 1 if it exists and 0 if it does not. Time complexity O(1)
  • HDEL: Deletes one or more fields from the specified Hash. Time complexity O(N), where N is the number of fields in the call
  • HINCRBY: Same as INCRBY, but applied to a field in the Hash. Time complexity O(1)

Hash commands that should be used with caution:

  • HGETALL: Returns all field-value pairs in the specified Hash. The result is an array with alternating fields and values. Time complexity O(N)
  • HKEYS/HVALS: Returns all fields/values in the specified Hash. Time complexity O(N)

All three commands traverse the entire Hash, so their running time grows linearly with the number of fields in the Hash.

For a Hash whose size is unpredictable, strictly avoid the three commands above and use HSCAN for cursor-based traversal instead.

Set

A Redis Set is an unordered collection of unique strings.

Common commands related to Set:

  • SADD: Adds one or more members to the specified Set. If the Set does not exist, it is created automatically. Time complexity O(N), where N is the number of added members
  • SREM: Removes one or more members from the specified Set. Time complexity O(N), where N is the number of removed members
  • SRANDMEMBER: Returns one or more random members of the specified Set. Time complexity O(N), where N is the number of returned members
  • SPOP: Randomly removes and returns count members from the specified Set. Time complexity O(N), where N is the number of removed members
  • SCARD: Returns the number of members in the specified Set. Time complexity O(1)
  • SISMEMBER: Checks whether the specified value is a member of the specified Set. Time complexity O(1)
  • SMOVE: Moves the specified member from one Set to another. Time complexity O(1)

Set commands that should be used with caution:

  • SMEMBERS: Returns all members of the specified Set. Time complexity O(N)
  • SUNION/SUNIONSTORE: Computes the union of multiple Sets and returns it/stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved
  • SINTER/SINTERSTORE: Computes the intersection of multiple Sets and returns it/stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved
  • SDIFF/SDIFFSTORE: Computes the difference between one Set and one or more other Sets and returns it/stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved


The commands above can involve a large amount of computation and should be used with caution. Strictly avoid them when the size of the Sets involved is unknown.

Sorted Set

A Redis Sorted Set is an ordered collection of unique strings.

Each member of a Sorted Set is assigned a score, and the Sorted Set orders its members by score in ascending order.

If multiple members have the same score, they are ordered lexicographically in ascending order. Sorted Sets are very well suited to implementing rankings.

Sorted Set main commands:

  • ZADD: Adds one or more members to the specified Sorted Set. Time complexity O(M*log(N)), where M is the number of added members and N is the number of members in the Sorted Set
  • ZREM: Removes one or more members from the specified Sorted Set. Time complexity O(M*log(N)), where M is the number of removed members and N is the number of members in the Sorted Set
  • ZCOUNT: Returns the number of members within the specified score range of the specified Sorted Set. Time complexity O(log(N))
  • ZCARD: Returns the number of members in the specified Sorted Set. Time complexity O(1)
  • ZSCORE: Returns the score of the specified member in the specified Sorted Set. Time complexity O(1)
  • ZRANK/ZREVRANK: Returns the rank of the specified member in the specified Sorted Set. ZRANK ranks members in ascending order and ZREVRANK in descending order. Time complexity O(log(N))
  • ZINCRBY: Same as INCRBY, but increments the score of the specified member in the specified Sorted Set. Time complexity O(log(N))

Sorted Set commands that should be used with caution:

  • ZRANGE/ZREVRANGE: Returns all members within the specified rank range of the specified Sorted Set. ZRANGE orders by score ascending, and ZREVRANGE by score descending. Time complexity O(log(N)+M), where M is the number of returned members
  • ZRANGEBYSCORE/ZREVRANGEBYSCORE: Returns all members within the specified score range of the specified Sorted Set, in ascending/descending order. min and max can be given as -inf and +inf to return all members. Time complexity O(log(N)+M)
  • ZREMRANGEBYRANK/ZREMRANGEBYSCORE: Removes all members within the specified rank range/score range of the specified Sorted Set. Time complexity O(log(N)+M)

Avoid passing [0 -1] or [-inf +inf] to the commands above to traverse an entire Sorted Set, especially when its size is unpredictable.

Bitmap and HyperLogLog

These two Redis data structures are less commonly used than the previous ones and are only briefly introduced in this article.

A Bitmap is not an actual data type in Redis, but a way of using a String as a bitmap.

You can think of it as treating a String as an array of bits. Using a Bitmap to store simple true/false data saves a great deal of space.

HyperLogLog is a data structure used mainly for counting. Like a Set, a HyperLogLog tracks a collection of unique strings.

However, a HyperLogLog does not store the members themselves, only an (approximate) count of them.

That is, a HyperLogLog can only be used to count the number of distinct elements in a collection, which is why it uses far less memory than a Set.
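
The "String as an array of bits" idea behind Bitmaps can be sketched in a few lines. This is a local model rather than Redis code: setbit, getbit, and bitcount are hypothetical helpers that operate on a Python bytearray. They follow the bit ordering Redis uses, where offset 0 is the most significant bit of the first byte.

```python
def setbit(buf, offset, value):
    """Set the bit at `offset`, growing the buffer with zero bytes if needed.

    Returns the previous bit value.
    """
    byte_index, bit = divmod(offset, 8)
    if byte_index >= len(buf):
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    mask = 0x80 >> bit  # offset 0 is the most significant bit of byte 0
    old = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return old


def getbit(buf, offset):
    """Return the bit at `offset`; bits past the end read as 0."""
    byte_index, bit = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0
    return 1 if buf[byte_index] & (0x80 >> bit) else 0


def bitcount(buf):
    """Count the bits that are set to 1."""
    return sum(bin(byte).count("1") for byte in buf)


# Hypothetical use: one bit per user ID, 1 meaning "signed in today".
signed_in = bytearray()
setbit(signed_in, 7, 1)
setbit(signed_in, 1000, 1)
```

Tracking user 1000 this way takes 126 bytes in total, which is where the space saving for true/false data comes from.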

Other Common Commands

  • EXISTS: Checks whether the specified key exists. Returns 1 if the key exists and 0 if it does not. Time complexity O(1)
  • DEL: Deletes the specified keys and their values. Time complexity O(N), where N is the number of deleted keys
  • EXPIRE/PEXPIRE: Sets the time to live of a key, in seconds or milliseconds. Time complexity O(1)
  • TTL/PTTL: Returns the remaining time to live of a key, in seconds or milliseconds. Time complexity O(1)
  • RENAME/RENAMENX: Renames key to newkey. With RENAME, if newkey already exists, its value is overwritten. With RENAMENX, if newkey already exists, no operation is performed. Time complexity O(1)
  • TYPE: Returns the type of the specified key: string, list, set, zset, or hash. Time complexity O(1)
  • CONFIG GET: Returns the current value of a Redis configuration item; the * wildcard is supported. Time complexity O(1)
  • CONFIG SET: Sets a new value for a Redis configuration item. Time complexity O(1)
  • CONFIG REWRITE: Rewrites redis.conf so that it reflects the configuration the server is currently using


Data persistence

Redis provides the ability to automatically persist data to hard disk on a regular basis, including RDB and AOF solutions.

The two schemes each have advantages and disadvantages, and they can run together to keep data safe.

Is data persistence required?

Redis data persistence can be turned off. If you use Redis only as a caching service, and the data stored in Redis is never the authoritative copy but only a synchronized copy of data held elsewhere, you can turn persistence off.

In general, however, it is still recommended to enable at least RDB-based persistence, because:

  • RDB persistence costs Redis itself almost no performance. The only thing the main Redis process has to do is fork a child process, which does all the persistence work
  • After Redis crashes for whatever reason, the data recorded in the last RDB snapshot is restored automatically on restart. This removes the need to resynchronize data manually from other sources (such as a DB) and is faster than any other recovery method
  • Hard drives are so big these days that space is rarely a concern

RDB

In RDB persistence mode, Redis periodically saves a snapshot of its data to an RDB file and automatically loads the RDB file on startup to restore the previously saved data.

You can configure the Redis snapshot saving time in the configuration file:

save [seconds] [changes]


If [changes] data changes occur within [seconds], an RDB snapshot will be saved, for example:

save 60 100


This tells Redis to check for data changes every 60 seconds and to save an RDB snapshot if 100 or more changes have occurred.

Multiple save directives can be configured so that Redis applies a multi-level snapshot policy.

Redis enables RDB snapshot by default. The default RDB policy is as follows:

save 900 1
save 300 10
save 60 10000


You can also manually trigger RDB snapshot saving using the BGSAVE command.
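
How multiple save directives combine can be modeled as "snapshot as soon as any one rule is satisfied". The sketch below is a simplified model of that rule, not Redis source code: should_snapshot is a hypothetical function, and the exact boundary handling inside Redis may differ.

```python
# The default policy quoted above, as (seconds, changes) pairs.
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(save_points, seconds_since_last_save, changes_since_last_save):
    """Return True if any save rule is satisfied.

    A rule (seconds, changes) is satisfied when at least `seconds` have
    passed since the last snapshot AND at least `changes` writes happened.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

Under this model a busy instance (10000 writes per minute) snapshots about once a minute, while a nearly idle one (a single write) waits the full 900 seconds.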

Advantages of RDB:

  • Minimal impact on performance. As mentioned above, Redis forks a child process to save an RDB snapshot, which has almost no effect on how efficiently Redis handles client requests
  • Each snapshot is a complete data file, so snapshots from multiple points in time can be kept by other means (for example, backing up each day's midnight snapshot to other storage media), which makes for very reliable disaster recovery
  • Restoring data from an RDB file is much faster than from AOF

Disadvantages of RDB:

  • Snapshots are generated periodically, so some data is always lost when Redis crashes
  • If the data set is very large and the CPU is not powerful enough (a single-core CPU, for example), forking the child process can take a relatively long time (up to 1 second), which affects client requests during that time

AOF

With AOF persistence, Redis records every write request in a log file.

When Redis restarts, all write operations recorded in the AOF file are replayed in order, which restores the data to its latest state.

AOF is disabled by default. To enable AOF, perform the following configuration:

appendonly yes


AOF provides three fsync configurations, always/everysec/no, specified through the appendfsync configuration item:

  • appendfsync no: Never calls fsync; the OS decides when to flush the file. Fastest
  • appendfsync always: Calls fsync on every log write. Safest for the data, but slowest
  • appendfsync everysec: A compromise; a background thread performs fsync once per second


As AOF keeps logging write operations, some of the log entries inevitably become useless.

For example, if SET key1 "ABC" is executed at one point and SET key1 "BCD" is executed later, the first command is clearly useless.

A large number of useless log entries makes the AOF file too large and recovery too slow.

So Redis provides AOF rewrite, which allows you to rewrite AOF files with only the smallest set of writes needed to restore data to its latest state.
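
The idea of "the smallest set of writes needed to restore the latest state" can be shown with a toy log. This is only a conceptual sketch: rewrite_log and replay are hypothetical helpers, the log holds just SET and DEL entries, and the compaction here works by scanning the old log, whereas Redis builds the rewritten file from the data it currently holds in memory.

```python
def replay(log):
    """Rebuild the key-value state by applying every log entry in order."""
    state = {}
    for command, key, value in log:
        if command == "SET":
            state[key] = value
        elif command == "DEL":
            state.pop(key, None)
    return state


def rewrite_log(log):
    """Return a minimal log that replays to the same final state."""
    latest = {}
    for command, key, value in log:
        latest.pop(key, None)  # any earlier write to this key is now useless
        if command == "SET":
            latest[key] = value
    return [("SET", key, value) for key, value in latest.items()]


log = [
    ("SET", "key1", "ABC"),
    ("SET", "key2", "x"),
    ("SET", "key1", "BCD"),  # makes the first entry useless
    ("DEL", "key2", None),   # makes the second entry useless
]
compact = rewrite_log(log)
```

Replaying the one-entry compact log yields the same state as replaying all four original entries.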

AOF rewrite can be triggered with the BGREWRITEAOF command, or Redis can be configured to trigger it automatically:

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb


Redis records the size of the AOF file after each rewrite and triggers an automatic rewrite when the file has grown by 100% relative to that size. If the file has not yet reached 64mb, no rewrite is performed.
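
The interaction of the two settings can be expressed as a small predicate. This is a simplified model under my own assumptions, not Redis source code: should_auto_rewrite is a hypothetical function, sizes are in bytes, and boundary cases may be handled slightly differently by Redis.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, size_after_last_rewrite,
                        percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF file now.
    size_after_last_rewrite: size recorded right after the previous rewrite.
    percentage / min_size: model auto-aof-rewrite-percentage and
    auto-aof-rewrite-min-size.
    """
    if percentage == 0 or current_size < min_size:
        return False
    base = size_after_last_rewrite or 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the configuration above, a file that was 100 MB after its last rewrite is rewritten again at 200 MB, while a 40 MB file is left alone no matter how much it has grown.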

Advantages of AOF:

  • In the best case, no written data is lost with appendfsync always, and at most 1 second of data is lost with appendfsync everysec
  • AOF files are not corrupted by problems such as power outages; even if a log entry is only half-written, the file can be easily repaired with the redis-check-aof tool
  • AOF files are human-readable and editable. After a mistaken data operation, as long as the AOF file has not been rewritten, you can back up the file, delete the wrong commands from it, and then restore the data

Disadvantages of AOF:

  • AOF files are usually larger than RDB files
  • Performance consumption is higher than RDB
  • The data recovery speed is slower than RDB


Memory management and data eviction mechanisms

Maximum memory setting

By default, Redis uses a maximum of 3GB of memory on a 32-bit OS, and no limit on a 64-bit OS.

When using Redis, you should have a fairly accurate estimate of the maximum amount of space that data will take up, and set the maximum amount of memory that Redis will use.

Otherwise, on a 64-bit OS, Redis takes up memory without limit (swap space is used when physical memory is full), which can cause all kinds of problems.

Control the maximum memory used by Redis through the following configuration:

maxmemory 100mb


Once memory usage has reached maxmemory, when data is written to Redis, Redis will:

  • Try to evict data according to the configured eviction policy to free space
  • If no data can be evicted, or no eviction policy is configured, return an error for all write requests; read requests still execute normally


When setting maxmemory for Redis, note the following:

If Redis master/slave replication is used, the master needs some additional memory while synchronizing data to a slave.

If maxmemory is too close to the host's available memory, there may not be enough memory left for data synchronization.

So do not set maxmemory too close to the host's available memory; leave some headroom for master/slave synchronization.


Data eviction mechanism

Redis offers five data eviction policies:

  • volatile-lru: Evicts data using the LRU algorithm (the least recently used keys are evicted first). Only keys with an expiration time are evicted
  • allkeys-lru: Evicts data using the LRU algorithm; all keys are eligible
  • volatile-random: Evicts data at random. Only keys with an expiration time are evicted
  • allkeys-random: Evicts data at random; all keys are eligible
  • volatile-ttl: Evicts the keys with the shortest remaining time to live


It is best to configure an effective eviction policy together with the maxmemory setting, to avoid write failures when memory is full.

In general, the recommended policy is volatile-lru, combined with a clear view of how important each kind of data stored in Redis is.

For data that is important and must never be evicted (configuration data, for example), do not set an expiration time, so that Redis never evicts it.

For less important data that can be reloaded on demand (such as a cache of recently logged-in user information, which is read from the DB when it is not found in Redis), set an expiration time so that Redis can evict it when memory runs low.
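
The behavior described for volatile-lru (only keys with an expiration time are candidates, least recently used first, and writes fail when nothing can be evicted) can be modeled as below. This is a deliberately simplified sketch: VolatileLruModel is a hypothetical class, capacity is counted in keys rather than bytes, expiration itself is not simulated, and the LRU ordering is exact, whereas Redis limits memory usage and approximates LRU by sampling.

```python
from collections import OrderedDict


class VolatileLruModel:
    """Toy key-value store with volatile-lru style eviction."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()  # least recently used key comes first
        self.volatile = set()      # keys that were given a time to live

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # reading marks the key as recently used
        return self.data[key]

    def set(self, key, value, ttl=None):
        if key not in self.data and len(self.data) >= self.max_keys:
            # Pick the least recently used key among those with a TTL.
            victim = next((k for k in self.data if k in self.volatile), None)
            if victim is None:
                raise MemoryError("no evictable key: write rejected")
            del self.data[victim]
            self.volatile.discard(victim)
        self.data[key] = value
        self.data.move_to_end(key)
        if ttl is None:
            self.volatile.discard(key)
        else:
            self.volatile.add(key)


cache = VolatileLruModel(max_keys=3)
cache.set("config", "v1")            # important: no TTL, never evicted
cache.set("user:1", "a", ttl=60)
cache.set("user:2", "b", ttl=60)
cache.get("user:1")                  # user:2 is now the LRU volatile key
cache.set("user:3", "c", ttl=60)     # store is full, so user:2 is evicted
```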

Configuration method:

# The default is noeviction, meaning no data is evicted
maxmemory-policy volatile-lru


Pipelining

Pipelining

Redis provides many commands for batch operations, such as MSET/MGET/HMSET/HMGET, etc. These commands exist to reduce the resources and time consumed in maintaining network connections and transferring data.

For example, using the SET command five times to set five different keys has the same effect as using one MSET command to set all five keys.

However, the former costs more round trips (RTT, Round Trip Time), so the latter should always be preferred.

However, if the client is to perform multiple operations in a row that cannot be combined by the Redis command, for example:

SET a "abc"
INCR b
HSET c name "hi"


In such cases, the Pipelining feature provided by Redis can be used to execute multiple commands in a single interaction.

With Pipelining, the client sends multiple commands (separated by \r\n) to Redis at once; Redis executes them in order and returns the results of all commands together, in the same order, for example:

$ (printf "PING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
+PONG
+PONG
+PONG


Pipelining is supported by most Redis clients, so developers usually don’t need to manually assemble their own command lists.
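
For readers curious what a client assembles on their behalf, the sketch below encodes the three commands from the earlier example into a single buffer using the Redis wire protocol (RESP), in which each command is an array of length-prefixed strings. encode_command and encode_pipeline are hypothetical helper names; the code only builds bytes and does not open a connection.

```python
def encode_command(*parts):
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part if isinstance(part, bytes) else str(part).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def encode_pipeline(commands):
    """Concatenate several encoded commands so they can be sent in one write."""
    return b"".join(encode_command(*command) for command in commands)


payload = encode_pipeline([
    ("SET", "a", "abc"),
    ("INCR", "b"),
    ("HSET", "c", "name", "hi"),
])
```

Writing payload to the socket in one call and then reading three replies is what saves the extra round trips.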

Limitations of Pipelining

Pipelining can only be used for a sequence of commands that do not depend on each other; it cannot be used when building a command requires the return value of a previous command.

This limitation can be worked around with the Scripting feature.

Transactions and Scripting

Pipelining lets Redis process multiple commands in a single interaction, but in some scenarios you also need to ensure that the group of commands is executed without interruption.

For example, read the current cumulative page view (PV) count and reset it to 0:

> GET vCount
12384
> SET vCount 0
OK


If another client's INCR vCount is executed between the GET and SET commands, the vCount value the client obtained is inaccurate.

Redis transactions ensure atomicity when executing a group of commands.

That is, Redis guarantees that the commands in a transaction are executed strictly consecutively, and that no command from any other connection is inserted before they have all finished.

Add these two commands to a transaction through the MULTI and EXEC commands:

> MULTI
OK
> GET vCount
QUEUED
> SET vCount 0
QUEUED
> EXEC
1) 12384
2) OK


Redis starts a transaction upon receiving the MULTI command; after that, all read and write commands are placed in a queue without being executed.

When the EXEC command is received, Redis executes all the commands in the queue in order and returns the result of each command as an array.

You can use the DISCARD command to abandon the current transaction and empty the command queue.

Note that Redis transactions do not support rollback. If a command in a transaction has a syntax error, most client drivers return an error.

Redis 2.6.5 and later also detects syntax errors in the queued commands; if there are any, the transaction is automatically abandoned at EXEC and an error is returned.

But if a command in a transaction has a non-syntax error (such as an HSET on a String), neither the client driver nor Redis can detect it before the command is actually executed, so all the commands in the transaction are still executed in order.

In this case, some commands in the transaction succeed and some fail. Unlike an RDBMS, however, Redis does not provide transaction rollback, so the data can only be rolled back by other means.
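
The "no rollback" behavior can be made concrete with a toy executor. This is a model under my own assumptions, not Redis code: run_transaction, cmd_set, and cmd_hset are hypothetical names, the store is a plain dict, and a Python TypeError stands in for the wrong-type error Redis reports.

```python
def cmd_set(store, key, value):
    store[key] = value
    return "OK"


def cmd_hset(store, key, field, value):
    target = store.setdefault(key, {})
    if not isinstance(target, dict):
        # Models an HSET issued against a key that holds a String.
        raise TypeError("wrong kind of value for key %r" % key)
    created = field not in target
    target[field] = value
    return 1 if created else 0


COMMANDS = {"SET": cmd_set, "HSET": cmd_hset}


def run_transaction(store, queued):
    """Run queued commands in order; a failing command does not stop the rest."""
    results = []
    for name, *args in queued:
        try:
            results.append(COMMANDS[name](store, *args))
        except TypeError as err:
            results.append(err)  # the error is reported, nothing is undone
    return results


store = {}
results = run_transaction(store, [
    ("SET", "a", "1"),
    ("HSET", "a", "field", "v"),  # fails at run time: "a" holds a String
    ("SET", "b", "2"),            # still executed
])
```

After the call, both SET commands have taken effect even though the command between them failed.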

Implementing CAS with transactions

Redis provides the WATCH command, which is used together with transactions to implement a CAS (check-and-set) optimistic locking mechanism.

Suppose you want to implement changing the status of an item to sold:

if(exec(HGET stock:1001 state) == "in stock")
    exec(HSET stock:1001 state "sold");


This pseudo-code is not safe under concurrency: multiple clients may all read the "in stock" status, so the same item can be sold multiple times.

The WATCH command and a transaction solve this problem:

exec(WATCH stock:1001);
if(exec(HGET stock:1001 state) == "in stock") {
    exec(MULTI);
    exec(HSET stock:1001 state "sold");
    exec(EXEC);
}


WATCH works as follows: when EXEC is executed, Redis checks the watched keys, and the transaction is executed only if none of them has changed since WATCH was issued.

If a watched key changes between the WATCH command and the EXEC command, EXEC does not execute the transaction and returns a failure (a nil reply).
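
The check-and-set flow can be simulated locally to see why only one of two competing clients succeeds. Everything here is a hypothetical model: WatchModel and its methods are illustrative names, and the per-key version counter is just a modeling device for "has this key changed since WATCH", not a description of how Redis tracks it internally.

```python
class WatchModel:
    """Toy store where every write bumps a per-key version number."""

    def __init__(self):
        self.hashes = {}
        self.versions = {}

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def hget(self, key, field):
        return self.hashes.get(key, {}).get(field)

    def watch(self, key):
        """Remember the state of the key; returns the version seen."""
        return self.versions.get(key, 0)

    def exec_hset(self, key, seen_version, field, value):
        """Model MULTI / HSET / EXEC guarded by an earlier WATCH."""
        if self.versions.get(key, 0) != seen_version:
            return None  # the key changed after WATCH: nothing is executed
        self.hset(key, field, value)
        return ["applied"]


db = WatchModel()
db.hset("stock:1001", "state", "in stock")

# Two clients both WATCH and both read "in stock" before either writes.
seen_a = db.watch("stock:1001")
seen_b = db.watch("stock:1001")
state_a = db.hget("stock:1001", "state")
state_b = db.hget("stock:1001", "state")

reply_a = db.exec_hset("stock:1001", seen_a, "state", "sold")
reply_b = db.exec_hset("stock:1001", seen_b, "state", "sold")
```

Client B sees a failed EXEC and would typically re-read the state, find "sold", and give up instead of selling the item a second time.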

Scripting

The EVAL and EVALSHA commands let Redis execute Lua scripts. This is similar to stored procedures in an RDBMS: read/write-intensive interaction between the client and Redis can be moved to the server side, avoiding excessive round trips and improving performance.

Scripting was introduced as an alternative to transactions, and everything transactions can do can also be done with Scripting. Redis officially recommends Lua scripts over transactions, as scripts are more efficient and more convenient.

This article does not cover Scripting in detail; please refer to the official documentation at https://redis.io/commands/eval

Redis performance tuning

Just because Redis is a very fast in-memory data storage medium doesn’t mean Redis doesn’t cause performance problems.

As mentioned earlier, Redis uses a single-threaded model in which all commands are executed sequentially by a single thread.

So when one command takes a long time to execute, it slows down every command after it, which makes Redis especially sensitive to the efficiency of each individual command.

Redis performance optimization mainly covers the following aspects:

  • First and foremost, make sure that Redis is not made to execute time-consuming commands
  • Use Pipelining to combine consecutive commands into a single interaction
  • Transparent Huge Pages must be disabled at the operating system level:
echo never > /sys/kernel/mm/transparent_hugepage/enabled

  • If you are running Redis in a virtual machine, the virtualization layer may add intrinsic latency. You can measure it by running ./redis-cli --intrinsic-latency 100. If you have high performance requirements, deploy Redis directly on physical machines whenever possible
  • Check the data persistence policy
  • Consider introducing a read-write separation mechanism


Time-consuming commands

The time complexity of most Redis read and write commands ranges from O(1) to O(N). The time complexity of each command is explained in this article and in the official documentation.

Generally speaking, O(1) commands are safe. Use O(N) commands with caution: if the order of magnitude of N is unpredictable, avoid them.

For example, running HGETALL/HKEYS/HVALS on a Hash with an unknown number of fields is usually fast. However, if the Hash has a very large number of fields, the time these commands take grows linearly with the field count and can become very long.

Likewise, take care when using SUNION to compute the union of two Sets, or SORT to sort a List or Set.

There are several ways to avoid problems with these O(N) commands:

  • Do not use a List as a general-purpose list; use it only as a queue
  • Keep the size of each Hash, Set, and Sorted Set strictly bounded by some mechanism
  • When possible, perform sorting, union, intersection, and similar operations on the client side
  • Never use the KEYS command
  • Avoid traversing all the members of a collection type at once. Instead, use the SCAN family of commands for batched, cursor-based traversal


Redis provides the SCAN command to traverse all keys stored in Redis with a cursor, avoiding the performance problems caused by the KEYS command.

In addition, the SSCAN, HSCAN, and ZSCAN commands perform cursor traversal of the elements in a Set, Hash, and Sorted Set, respectively.

For details about how to use the SCAN command, see the official documents:

https://redis.io/commands/scan
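
The cursor loop behind SCAN can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes a client object whose scan(cursor=..., match=..., count=...) method returns a (next_cursor, keys) pair, as the redis-py client's scan method does. The function name iter_keys and the default batch size of 100 are illustrative choices.

```python
from typing import Any, Iterator, Optional


def iter_keys(client: Any, match: Optional[str] = None, count: int = 100) -> Iterator[Any]:
    """Yield keys batch by batch using SCAN instead of a single KEYS call.

    `client` is assumed to expose scan(cursor=..., match=..., count=...)
    returning (next_cursor, keys), as redis-py does.
    """
    cursor = 0
    while True:
        # Each call does a small, bounded amount of work on the server.
        cursor, keys = client.scan(cursor=cursor, match=match, count=count)
        yield from keys
        # The server signals the end of the iteration with cursor 0.
        if cursor == 0:
            break
```

Because each call returns quickly, other clients' commands can run between batches. Note that SCAN may return the same key more than once, so callers that need uniqueness should deduplicate the results.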

Redis provides the Slow Log function to automatically record commands that take a long time. There are two related configuration parameters:

# Commands that take longer than xxx microseconds are recorded in the Slow Log
slowlog-log-slower-than xxx
# Maximum number of entries kept in the Slow Log
slowlog-max-len xxx


Running SLOWLOG GET [number] outputs the most recent number Slow Log entries.

Running SLOWLOG RESET clears the Slow Log.

Network-induced latency

  • Use persistent connections or connection pooling whenever possible to avoid frequently creating and destroying connections
  • Bulk operations performed by the client should be completed in a single interaction using pipelining. See the Pipelining section of this article

Delays caused by data persistence

Persisting data inherently introduces latency in Redis, so choose a persistence strategy that balances the required level of data safety against performance requirements:

  • AOF + fsync always gives the strongest data safety guarantee, but every write triggers an fsync, which has a significant impact on Redis performance
  • AOF + fsync every second is a good compromise: fsync runs only once per second
  • AOF + fsync never provides the best performance under AOF persistence
  • RDB persistence generally performs better than AOF, but pay attention to the RDB snapshot policy configuration
  • Every RDB snapshot and AOF rewrite requires the Redis main process to fork. The fork itself can be time-consuming, depending on the CPU and on how much memory Redis occupies. Schedule RDB snapshots and AOF rewrites accordingly to avoid latency caused by overly frequent forks

When Redis forks a child process, the page table must be copied to the child. For example, a Redis instance occupying 24 GB of memory needs to copy 48 MB of page table data, and this fork takes about 216 ms on a physical machine with a single Xeon 2.27 GHz CPU. The latest_fork_usec field returned by the INFO command shows how long the most recent fork took, in microseconds.
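
The 48 MB figure follows from simple arithmetic. The sketch below assumes 4 KB memory pages and 8 bytes per page table entry, which is typical for Linux on x86-64; both values are assumptions not stated in the text above, and the function name is illustrative.

```python
def page_table_bytes(resident_bytes: int, page_size: int = 4096, entry_size: int = 8) -> int:
    """Estimate the page table size that fork() must copy.

    Assumes one fixed-size entry per memory page; the 4 KB page size and
    8-byte entry size are typical for Linux on x86-64, not universal.
    """
    pages = resident_bytes // page_size
    return pages * entry_size


GIB = 1024 ** 3
MIB = 1024 ** 2

size = page_table_bytes(24 * GIB)
print(size // MIB, "MB")  # prints "48 MB"
```

The estimate scales linearly with the memory Redis occupies, which is why large instances pay a noticeably higher fork cost.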


Delay caused by Swap

When Linux moves memory pages used by Redis to swap, the Redis process blocks, causing abnormal latency.

Swap usually occurs when physical memory is insufficient or when some process is performing a large amount of I/O. Avoid both situations as far as possible.

The /proc/<pid>/smaps file records the swap usage of a process, where <pid> is the Redis process ID. By inspecting this file, you can determine whether Redis latency is caused by swap.

If the file shows large Swap values, the latency is most likely due to swap.
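
As a rough illustration, the Swap fields of an smaps file can be added up with a few lines of code. This is a sketch under stated assumptions: it expects the usual Linux smaps layout with lines such as "Swap:   12 kB", and the function names are hypothetical. The Redis process ID can be taken from the process_id field of the INFO command.

```python
def total_swap_kb(smaps_text: str) -> int:
    """Sum the 'Swap:' fields (in kB) across all mappings in smaps output."""
    total = 0
    for line in smaps_text.splitlines():
        fields = line.split()
        # Lines look like "Swap:   12 kB"; "SwapPss:" lines are skipped.
        if len(fields) >= 2 and fields[0] == "Swap:":
            total += int(fields[1])
    return total


def swap_kb_for_pid(pid: int) -> int:
    """Read /proc/<pid>/smaps (Linux only) and return the total swapped kB."""
    with open(f"/proc/{pid}/smaps") as f:
        return total_swap_kb(f.read())
```

A total close to zero suggests swap is not the cause of the latency; a large total points to swap as the likely culprit.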

Delays caused by key expiration

Redis also experiences latency when a large number of keys expire within the same second. When setting expiration times, try to stagger them.
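
One common way to stagger expirations is to add a small random offset to each TTL. The following is a minimal sketch of that idea; the function name jittered_ttl and the default spread of 10% are illustrative assumptions, not part of Redis.

```python
import random
from typing import Optional


def jittered_ttl(base_seconds: int, spread: float = 0.1,
                 rng: Optional[random.Random] = None) -> int:
    """Return base_seconds plus or minus a random offset of up to `spread`.

    With base_seconds=3600 and spread=0.1, the result lies in [3240, 3960],
    so keys written together do not all expire in the same second.
    """
    rng = rng or random
    offset = int(base_seconds * spread)
    return base_seconds + rng.randint(-offset, offset)
```

The returned value can then be passed as the expiration when the key is written, for example as the argument of the EXPIRE command.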

Introduce read/write separation

Redis’ master-slave replication capability enables a master-slave multi-node architecture, in which the master node receives all write requests and synchronizes data to multiple slave nodes.

On this basis, we can let slave nodes serve read requests that can tolerate slightly stale data, reducing the pressure on the master node.

In particular, statistics tasks that use time-consuming commands can be executed on one or more slave nodes to prevent these time-consuming commands from affecting the response of other requests.

Master/slave replication and cluster sharding

Master-slave replication

Redis supports a master slave replication architecture with one master and many slaves. A Master instance handles all write requests, and the Master synchronizes write operations to all slaves.

With master/slave replication in Redis, read/write separation and high availability are possible:

  • Read requests that do not require high real-time performance can be completed on the Slave to improve efficiency. In particular, some statistics tasks that are periodically executed may require the execution of some time-consuming Redis commands. One or more slaves can be specially planned to serve these statistics tasks
  • With the help of Redis Sentinel, high availability can be realized. When the Master crashes, Redis Sentinel can automatically promote a Slave to Master and continue to provide service

Enabling master-slave replication is simple: set up multiple Redis instances and add the following to the configuration of each instance that acts as a Slave:

# IP address and port of the Master
slaveof 192.168.1.1 6379


After the Slave starts, it performs a cold-start data synchronization from the Master.

The Master triggers BGSAVE to generate an RDB file and pushes it to the Slave for import. After the import is complete, the Master synchronizes incremental data to the Slave through the Redis protocol.

From then on, data between Master and Slave is kept in sync through the Redis protocol.

Use Sentinel for automatic failover

The master/slave replication function of Redis is only for data synchronization, but does not provide monitoring and automatic failover

To make Redis highly available through master-slave replication, one more component needs to be introduced: Redis Sentinel

Redis Sentinel is a monitoring component officially developed by Redis. It can monitor the status of Redis instances, automatically discover Slave nodes through the Master node, elect a new Master when the Master node fails, and push the new Master/Slave configuration to all Redis instances.

At least three Redis Sentinel instances should be deployed so that they can vote in an election.

Key configuration:

# Name, IP, and port of the Master to monitor, followed by the quorum: the number of Sentinels that must agree the Master is down
sentinel monitor mymaster 127.0.0.1 6379 2
# How long (in milliseconds) the Master may go without responding before it is considered down
sentinel down-after-milliseconds mymaster 60000
# Failover timeout (in milliseconds), which governs the interval between two failover attempts
sentinel failover-timeout mymaster 180000
# With multiple Slaves, the number of Slaves that may synchronize data from the new Master at the same time. This prevents the query service from becoming unavailable because all Slaves are synchronizing at once
sentinel parallel-syncs mymaster 1


It is also important to note that the automatic failover performed by Redis Sentinel does not keep the same IP and port.

In other words, the new Master produced by automatic failover serves on a different IP address and port than the original Master. Therefore, to achieve HA, the client must support Sentinel and be able to query Sentinel for the address of the new Master.

Cluster sharding

Why cluster sharding:

  • The amount of data stored in Redis is too large for the physical memory of a single host
  • The concurrency of Redis write requests is too high for a single Redis instance to handle


When either of these problems occurs, Redis must be sharded.

There are many Redis sharding schemes. For example, many Redis clients implement sharding themselves, and there are also proxy-based schemes such as Twemproxy.

However, the preferred solution is the Redis Cluster sharding solution released in version 3.0.

This article will not introduce the specific installation and deployment details of Redis Cluster, but focus on the benefits and disadvantages of Redis Cluster.

Redis Cluster capability

  • The ability to automatically spread data across multiple nodes
  • If the requested key is not on the node that receives the request, the client is automatically redirected to the correct shard
  • Services can still be provided when some nodes in the cluster fail


The third point is based on master-slave replication. Each data shard in a Redis Cluster uses a master-slave structure, and the principle is exactly the same as the master-slave replication described above.

The only difference is that the additional Redis Sentinel component is not needed: Redis Cluster itself handles node monitoring and automatic failover within a shard.

Redis Cluster sharding principle

A Redis Cluster has 16384 hash slots. Redis computes the CRC16 of each key and takes the result modulo 16384 to determine which hash slot the key belongs to.

You also need to specify the slots assigned to each data shard in the Redis Cluster. Slot assignments can be changed at any time.

When a client reads or writes a key, it can connect to any node in the Cluster. If the key's Slot is not served by that node, the node replies with a MOVED redirection naming the correct node, and the client reissues the request there. Cluster-aware clients handle this automatically.
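
In the Redis Cluster protocol, this redirection arrives as an error reply of the form "MOVED <slot> <host>:<port>" (or "ASK ..." while a slot is being migrated). The sketch below shows how a client might parse such a reply; the Redirect type and parse_redirect function are hypothetical names used for illustration, and real client libraries handle this internally.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str  # "MOVED" or "ASK"
    slot: int  # hash slot the key belongs to
    host: str  # node that serves the slot
    port: int


def parse_redirect(error_message: str) -> Optional[Redirect]:
    """Parse a cluster redirection error; return None for other errors."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # Split on the last colon so that hosts containing colons still work.
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))
```

For example, parse_redirect("MOVED 3999 127.0.0.1:6381") yields slot 3999 on 127.0.0.1:6381, and the client would reissue the command to that node.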

Hash tags

In addition to the basic sharding principle, Redis also supports hash tags. When a key contains a hash tag, only the part between { and } is hashed, so keys that share the same hash tag are guaranteed to land in the same Slot.

For example, {uiv}user:1000 and {uiv}user:1001 have the same hash tag {uiv} and are stored in the same Slot.
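
The slot calculation, including the hash tag rule, can be sketched as follows. This is a minimal sketch, assuming the CRC16 variant used by Redis Cluster is the XMODEM one (polynomial 0x1021, initial value 0) and that only the first non-empty {...} section counts as a hash tag; the function names are illustrative.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Return the hash slot (0..16383) for a key, honoring hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; with "{}" the whole key is hashed.
        if end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384


# Both keys hash only the tag "uiv", so they share a slot.
print(key_slot(b"{uiv}user:1000") == key_slot(b"{uiv}user:1001"))  # prints "True"
```

On a live cluster, the CLUSTER KEYSLOT command returns the slot for a given key and can be used to cross-check such a calculation.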

When using Redis Cluster, the keys involved in a pipeline, transaction, or Lua script must be on the same data shard, or an error is returned.

To use these features in a Redis Cluster, hash tags must be used to ensure that all keys operated on in a pipeline or transaction are in the same Slot.

Some clients, such as Redisson, implement cluster-aware pipelining: they automatically group the commands in a pipeline by the shard where each key resides and execute each group on its shard. However, Redis does not support cross-shard transactions, so transactions and Lua scripts must still follow the rule that all keys are in one shard.


Master/slave replication vs cluster sharding

When designing a software architecture, how do you choose between master-slave replication and clustered sharding?

At first glance, Redis Cluster looks superior to master-slave replication in every respect:

  • Redis Cluster can solve the problem of too much data for a single node
  • Redis Cluster can solve the problem of too much access pressure on a single node
  • Redis Cluster includes master-slave replication capabilities


Does that mean Redis Cluster is always a better choice than master-slave replication?

Not necessarily.

A more complex software architecture is not automatically better: complexity brings significant benefits but also corresponding drawbacks. The disadvantages of using Redis Cluster include:

1. Increased maintenance difficulty.

When Redis Cluster is used, the number of Redis instances to maintain multiplies, the number of hosts to monitor increases, and data backup/persistence becomes more complex.

Adding or removing shards also requires a reshard operation, which is far more complex than adding a Slave in master/slave mode.

2. Client resource consumption increases

When a client uses connection pooling, it needs to maintain a connection pool for each data shard. As a result, the number of connections maintained by the client grows in proportion to the number of shards, consuming client and operating system resources.

3. Performance optimization becomes more difficult

You may need to view Slow logs and Swap logs on multiple shards to locate performance issues.

4. Increased cost of using transactions and Lua scripts

There are strict restrictions on using transactions and Lua scripts in Redis Cluster.

All keys operated on by a transaction or script must be on the same shard, which requires extra planning and conventions at development time for the keys involved.

If the application makes heavy use of transactions and scripts, distributing data evenly across shards while keeping these two features working becomes difficult.

Therefore, when choosing between master-slave replication and Cluster sharding, weigh the characteristics of the application, the volume of data and traffic, and future growth plans, and use Redis Cluster only when data sharding is genuinely necessary.

Here are some questions to consider:

  • How much data do you need to store in Redis? How large will it be in the next 2 years? Does all of it need to be kept long-term? Can the LRU algorithm be used to evict data that is not hot? Taking these factors into account, estimate the physical memory Redis will need.
  • How much physical memory does the host that will run Redis have? How much of it can be allocated to Redis? Is that enough to cover the memory requirement estimated in the previous point?
  • How much concurrent write pressure will Redis face? Even without pipelining, Redis can handle more than 100,000 writes per second (for more benchmarks, see https://redis.io/topics/benchmarks)
  • Will pipelining and transactions be used with Redis? In how many scenarios?


Based on the above considerations, if the available physical memory of a single host is sufficient to support the capacity requirements of Redis, and the concurrent write pressure of Redis is still far from the Benchmark value, it is recommended to adopt the master-slave replication architecture, which can save a lot of unnecessary trouble.

If pipelining and transactions are used extensively in applications, master/slave replication is recommended to reduce design and development complexity.

Redis Java client of choice

There are many Java clients for Redis, and three are officially recommended: Jedis, Redisson, and Lettuce.

Here is a comparison of Jedis and Redisson.

Jedis:

  • Lightweight, simple, easy to integrate and transform
  • Support connection pooling
  • Supports Pipelining, transactions, LUA Scripting, Redis Sentinel, and Redis Cluster
  • Does not support read/write separation; you have to implement it yourself
  • Poor documentation (really bad, almost nonexistent)


Redisson:

  • Netty based implementation, using non-blocking IO, high performance
  • Support for asynchronous requests
  • Support connection pooling
  • Supports Pipelining, LUA Scripting, Redis Sentinel, and Redis Cluster
  • Transactions are not supported, and LUA Scripting is officially recommended instead
  • Support for pipelining under the Redis Cluster architecture
  • Read/write separation and read load balancing are supported in both primary/secondary replication and Redis Cluster architectures
  • Built-in Tomcat Session Manager provides Session sharing for Tomcat 6/7/8
  • Can be integrated with Spring Session for Redis based Session sharing
  • Documentation is rich, with Chinese documents


The same principle applies when choosing between Jedis and Redisson: although Jedis has several disadvantages compared with Redisson, use Redisson only when you need its advanced features, to avoid unnecessary complexity.
