This article starts from the basic characteristics of Redis and gives an intuitive introduction to its core capabilities through its data structures and main commands. It then provides an overview of the advanced capabilities Redis offers and more in-depth guidance on deployment, maintenance, and performance tuning.

This article is intended for developers who use Redis day to day, as well as architects responsible for technology selection, architecture design, and performance tuning around Redis.

Contents

  • Overview
  • Redis data structures and related common commands
  • Data persistence
  • Memory management and data eviction mechanisms
  • Pipelining
  • Transactions and Scripting
  • Redis performance tuning
  • Master/slave replication and cluster sharding
  • Choosing a Redis Java client

Overview

Redis is an open-source, in-memory data structure store that can be used as a database, cache, or message broker.

Redis supports a variety of data structures, including strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLogs, and more.

Redis provides LRU eviction, transactions, and different levels of on-disk persistence. It also supports replication and high availability through Redis Sentinel, as well as automatic data sharding through Redis Cluster.

The main functionality of Redis is implemented on a single-threaded model: Redis uses one thread to serve all client requests. At the same time, Redis uses non-blocking IO and carefully optimizes the algorithmic time complexity of each command. This means:

  • Redis is thread-safe (because there is only one thread), all operations are atomic, and there are no data exceptions due to concurrency
  • Redis is very fast (because it uses non-blocking IO and most commands have O(1) algorithm time)
  • Running time-consuming commands on Redis is dangerous: they consume a large share of the single thread's processing time and slow down all other requests. (For example, the O(N) KEYS command is strictly prohibited in production environments.)

Redis data structures and related common commands

This section introduces the main data structures supported by Redis and the associated common Redis commands. This section only gives a brief overview of the Redis commands and lists only the more common commands. For the full Redis command set, or for details on how to use a command, please refer to the official documentation:

https://redis.io/commands

Key

Redis is based on a key-value data model. Any binary sequence can be used as a Redis key (for example, a regular string or even the bytes of a JPEG image).

Some things to note about keys:

  • Don't use keys that are too long. A 1024-byte key, for example, is a bad idea: it consumes more memory and makes lookups less efficient
  • Don't use keys that are too terse either. "u1000flw" saves little space compared with "user:1000:followers" but hurts readability and maintainability
  • It is best to follow a uniform naming convention for keys, such as object-type:id:attribute. Keys designed this way look like "user:1000" or "comment:1234:reply-to"
  • The maximum key length allowed by Redis is 512MB (the same limit applies to values)

String

String is the basic data type of Redis. Redis does not have the concepts of Int, Float, Boolean, etc. All basic data types are represented by String in Redis.

Common String related commands:

  • SET: sets the value of a key. The EX/PX options specify the key's expiration time, and the NX/XX options make the command apply only when the key does not exist (NX) or only when it already exists (XX). Time complexity O(1)
  • GET: Obtains the value of a key. Time complexity O(1)
  • GETSET: Sets the value of a key and returns the original value of the key, time complexity O(1)
  • MSET: Set values for multiple keys, time complexity O(N)
  • MSETNX: same as MSET, if any of the specified keys already exists, no operation is performed, time complexity O(N)
  • MGET: Obtain the values of multiple keys, time complexity O(N)

As mentioned above, the basic data type of Redis is only String. However, Redis can use String as an integer or floating point number. This is mainly reflected in the commands of INCR and DECR classes:

  • INCR: Increases the value of the key by 1 and returns the value after the increment. Applies only to String data that can be converted to integers. Time complexity O(1)
  • INCRBY: increments the value of the key by the specified integer and returns the incremented value. Applies only to String data that can be converted to an integer. Time complexity O(1)
  • DECR/DECRBY: same as INCR/INCRBY, but decrements instead of increments.

The INCR/DECR series of commands require that the value of the operation be of type String and can be converted to a 64-bit signed integer number, otherwise an error will be returned.

In other words, the value of the INCR/DECR command must be in the range from -2^63 to 2^63-1.

As mentioned earlier, Redis uses a single-threaded model, which is naturally thread-safe, making INCR/DECR commands very convenient to achieve precise control in high concurrency scenarios.

Example 1: Inventory control

Accurately check the remaining inventory under high concurrency to ensure the item is never oversold.

Set the total inventory:

SET inv:remain "100"

Inventory deduction + margin check:

DECR inv:remain

When the return value of the DECR command is greater than or equal to 0, it indicates that the inventory margin check passes; if the return value is less than 0, it indicates that the inventory is exhausted.

Assuming 300 concurrent requests for inventory deduction, Redis can ensure that the 300 requests get a return value of 99 to -200, and that each request gets a unique return value, never finding two requests that get the same return value.
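
A minimal client-side sketch of this pattern, assuming the Jedis client and a Redis instance on localhost:6379 (class and variable names are illustrative):

import redis.clients.jedis.Jedis;

public class InventoryExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("inv:remain", "100");            // set the total inventory
            long remaining = jedis.decr("inv:remain"); // atomic deduction, safe under concurrency
            if (remaining >= 0) {
                System.out.println("Stock check passed, " + remaining + " left");
            } else {
                System.out.println("Stock exhausted, reject the request");
            }
        }
    }
}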

Example 2: Self-increasing sequence generation

Implement RDBMS-like sequence functionality to generate a series of unique sequence numbers.

Set the sequence start value:

SET sequence "10000"

Get a sequence value:

INCR sequence

Just use the return value as a sequence.

Get a batch of (say 100) sequence values:

INCRBY sequence 100

Assuming the return value is N, the values from N - 99 to N are all usable sequence numbers.

When multiple clients request an increment sequence from Redis at the same time, Redis can ensure that each client gets a globally unique sequence value or range, and there will never be a situation where different clients get a duplicate sequence value.
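
A sketch of a batch sequence allocator built on INCRBY, again assuming the Jedis client (the key name sequence matches the example above):

import redis.clients.jedis.Jedis;

public class SequenceExample {
    // Reserves a block of `batchSize` sequence numbers and returns the first one.
    // If INCRBY returns N, the caller may use the values N - batchSize + 1 .. N.
    static long reserveBlock(Jedis jedis, String key, int batchSize) {
        long upper = jedis.incrBy(key, batchSize);
        return upper - batchSize + 1;
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("sequence", "10000");
            long first = reserveBlock(jedis, "sequence", 100);
            System.out.println("Usable range: " + first + " .. " + (first + 99));
        }
    }
}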

List

The Redis List is a linked-list data structure: you can insert and pop elements at both ends of a List using commands like LPUSH/RPUSH/LPOP/RPOP. Lists also support inserting and reading elements at a specific index, but those operations have high time complexity (O(N)) and should be used with caution.

Common commands related to List:

  • LPUSH: Inserts one or more elements at the left end (head) of the specified List, returning the length of the List after insertion. Time complexity O(N), where N is the number of inserted elements
  • RPUSH: Like LPUSH, inserts 1 or more elements to the right (tail) of the specified List
  • LPOP: Removes an element from the left side of the specified List and returns, time complexity O(1)
  • RPOP: Same as LPOP, removes one element from the right (tail) of the specified List and returns it
  • LPUSHX/RPUSHX: Similar to LPUSH/RPUSH except that the LPUSHX/RPUSHX operation does not perform any operations if the key does not exist
  • LLEN: Returns the length of the specified List, time complexity O(1)
  • LRANGE: returns the elements in the specified range of the specified List (inclusive at both ends, i.e. LRANGE key 0 10 returns 11 elements), time complexity O(N). Keep the number of elements fetched at a time under control: fetching a large range in one call causes latency, and for Lists of unpredictable length avoid full traversals such as LRANGE key 0 -1.


List commands should be used with caution:

  • LINDEX: Returns the element at the specified index of the specified List, or nil if the index is out of bounds. The index can be negative, i.e. -1 refers to the last element of the List and -2 to the second-to-last. Time complexity O(N)
  • LSET: sets the element at the specified index of the specified List to value; returns an error if the index is out of bounds. Time complexity O(N), or O(1) when operating on the head/tail element
  • LINSERT: Inserts a new element before/after the specified element in the specified List and returns the List length after the operation. Returns -1 if the specified element does not exist. If the specified key does not exist, no operation is performed. Time complexity O(N)

Because the Redis List is a linked list, the three commands above are inefficient: they must traverse the List, their execution time is hard to predict, and it grows significantly with the List length, so use them with caution.

In other words, Redis's List is really designed to be used as a queue, not as a list like ArrayList. If a (double-ended) queue is not what you need, avoid the List data structure.

To support queuing, Redis also provides a series of blocking commands, such as BLPOP/BRPOP, which behave much like a BlockingQueue: if the List is empty, the connection blocks until an element can be popped. Blocking commands are not discussed in detail here; please refer to the "Blocking operations on lists" section of the official documentation (https://redis.io/topics/data-types-intro).
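
For illustration, a minimal producer/consumer sketch built on RPUSH and the blocking BLPOP, assuming the Jedis 3.x client (the queue name jobs is arbitrary):

import java.util.List;
import redis.clients.jedis.Jedis;

public class QueueExample {
    public static void main(String[] args) {
        try (Jedis producer = new Jedis("localhost", 6379);
             Jedis consumer = new Jedis("localhost", 6379)) {
            producer.rpush("jobs", "job-1");           // enqueue at the tail

            // BLPOP blocks for up to 5 seconds waiting for an element;
            // the reply is [keyName, value], or null/empty on timeout.
            List<String> reply = consumer.blpop(5, "jobs");
            if (reply != null && !reply.isEmpty()) {
                System.out.println("Dequeued " + reply.get(1) + " from " + reply.get(0));
            }
        }
    }
}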

Hash

Hash is a hash table. A Redis Hash stores field-value pairs like a traditional hash table; you can think of it as a HashMap moved into Redis.

Hash is a great way to represent object type data, using the field of the Hash corresponding to the field of the object.

Advantages of Hash include:

  • Individual fields can be read or written on their own, e.g. "get the age of the user with ID 1000"
  • Compared with serializing the whole object into a String, a Hash greatly reduces network transfer overhead
  • When maintaining a collection with a Hash, random access to members is far more efficient than with a List

Common commands related to Hash:

  • HSET: Sets the field in the Hash corresponding to the key to value. If the Hash does not exist, one will be created automatically. Time complexity O(1)
  • HGET: Returns the value of the field field in the specified Hash, time complexity O(1)
  • HMSET/HMGET: similar to HSET and HGET, multiple fields under the same key can be operated in batches. The time complexity is O(N), where N is the number of fields in one operation
  • HSETNX: same as HSET, but if field already exists, HSETNX will not perform any operation, time complexity O(1)
  • HEXISTS: checks whether field exists in the specified Hash. Returns 1 if field exists and 0 if field does not exist. Time complexity O(1)
  • HDEL: deletes one or more fields from the specified Hash. Time complexity: O(N), where N is the number of fields in the operation
  • HINCRBY: INCRBY a field in the Hash, same as INCRBY, time complexity O(1)

Hashing commands that should be used with caution:

  • HGETALL: Returns all field-value pairs in the specified Hash. The result is an array with alternating fields and values. Time complexity O(N)
  • HKEYS/HVALS: Returns all fields/values in the specified Hash, time complexity O(N)

The commands above all traverse the entire Hash, so their execution time grows linearly with the number of fields. For Hashes whose size is unpredictable, strictly avoid them and use HSCAN for cursor-based traversal instead (a client-side sketch follows the link below):

https://redis.io/commands/scan
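
A cursor-traversal sketch with HSCAN, assuming the Jedis 3.x client (the key user:1000 is illustrative):

import java.util.Map;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class HashScanExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String cursor = ScanParams.SCAN_POINTER_START;       // "0"
            ScanParams params = new ScanParams().count(100);      // batch-size hint per call
            do {
                ScanResult<Map.Entry<String, String>> page =
                        jedis.hscan("user:1000", cursor, params);
                for (Map.Entry<String, String> entry : page.getResult()) {
                    System.out.println(entry.getKey() + " = " + entry.getValue());
                }
                cursor = page.getCursor();                        // becomes "0" when traversal completes
            } while (!"0".equals(cursor));
        }
    }
}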

Set

A Redis Set is an unordered, non-repeatable collection of strings.

Common commands related to Set:

  • SADD: Adds 1 or more members to the specified Set. If the specified Set does not exist, one is automatically created. Time complexity O(N), where N is the number of added members
  • SREM: Removes one or more members from a specified Set. The time complexity is O(N), where N is the number of members to be removed
  • SRANDMEMBER: Returns one or more members randomly from the specified Set. The time complexity is O(N), where N is the number of returned members
  • SPOP: Randomly removes and returns count members from a Set. Time complexity O(N), where N is the number of removed members
  • SCARD: returns the number of members in the specified Set, time complexity O(1)
  • SISMEMBER: check whether the specified value exists in the specified Set, time complexity O(1)
  • SMOVE: moves the specified member from one Set to another

Set commands that should be used with caution:

  • SMEMBERS: Returns all members of the specified Set, time complexity O(N)
  • SUNION/SUNIONSTORE: Computes the union of multiple Sets and returns it / stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved
  • SINTER/SINTERSTORE: Computes the intersection of multiple Sets and returns it / stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved
  • SDIFF/SDIFFSTORE: Computes the difference between one Set and one or more other Sets and returns it / stores it in another Set. Time complexity O(N), where N is the total number of members of all Sets involved

The commands above involve a large amount of computation and should be used with caution; when the size of the Sets involved is unpredictable, they should be strictly avoided. Consider using the SSCAN command to iterate over the members of a Set with a cursor (see https://redis.io/commands/scan). If you need unions, intersections, or differences, they can be computed on the client side (as in the sketch below), or such queries, when they are not latency-sensitive, can be executed on a Slave.
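
The client-side computation mentioned above, sketched with SSCAN and the Jedis 3.x client (key names are illustrative):

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class ClientSideIntersection {
    // Fetch all members of a Set in batches using cursor-based SSCAN calls.
    static Set<String> fetchAll(Jedis jedis, String key) {
        Set<String> members = new HashSet<>();
        String cursor = ScanParams.SCAN_POINTER_START;
        do {
            ScanResult<String> page = jedis.sscan(key, cursor, new ScanParams().count(500));
            members.addAll(page.getResult());
            cursor = page.getCursor();
        } while (!"0".equals(cursor));
        return members;
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Set<String> a = fetchAll(jedis, "tags:java");
            Set<String> b = fetchAll(jedis, "tags:redis");
            a.retainAll(b);                      // intersection computed on the client
            System.out.println("Common members: " + a);
        }
    }
}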

Sorted Set

A Redis Sorted Set is an ordered, non-repeatable collection of strings. Each element in a Sorted Set needs to be assigned a score, and the Sorted Set sorts the elements in ascending order based on that score. If multiple members have the same score, they are sorted in ascending lexicographical order.

Sorted Sets are very well suited to ranking scenarios such as leaderboards (a sketch follows the command list below).

Sorted Set main commands:

  • ZADD: Adds one or more members to the specified Sorted Set. The time complexity is O(Mlog(N)), where M is the number of added members and N is the number of members in the Sorted Set
  • ZREM: Deletes one or more members from a given Sorted Set. The time complexity is O(Mlog(N)), where M is the number of deleted members and N is the number of members in the Sorted Set
  • ZCOUNT: returns the number of members within the specified score range in the specified Sorted Set. Time complexity: O(log(N))
  • ZCARD: Returns the number of members in the specified Sorted Set, time complexity O(1)
  • ZSCORE: Returns the score of the specified member in the specified Sorted Set, time complexity O(1)
  • ZRANK/ZREVRANK: Returns the rank of the specified member in the Sorted Set. ZRANK ranks members in ascending score order, ZREVRANK in descending order. Time complexity O(log(N))
  • ZINCRBY: same as INCRBY, increment the score of specified member in specified Sorted Set, time complexity O(log(N))
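
The leaderboard sketch referred to above, assuming the Jedis 3.x client (key and member names are illustrative):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Tuple;

public class LeaderboardExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.zadd("game:score", 3500, "alice");      // ZADD: insert/update a member with its score
            jedis.zadd("game:score", 4200, "bob");
            jedis.zincrby("game:score", 100, "alice");    // ZINCRBY: add 100 to alice's score

            // Top 3 players, highest score first (ZREVRANGE with scores, small fixed range)
            for (Tuple t : jedis.zrevrangeWithScores("game:score", 0, 2)) {
                System.out.println(t.getElement() + " -> " + t.getScore());
            }
            // Rank of a single player in descending score order (0-based)
            System.out.println("bob's rank: " + jedis.zrevrank("game:score", "bob"));
        }
    }
}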

Sorted Set commands that should be used with caution:

  • ZRANGE/ZREVRANGE: Returns all members within the specified rank range of the Sorted Set, sorted by score in ascending (ZRANGE) or descending (ZREVRANGE) order. Time complexity O(log(N)+M), where M is the number of members returned
  • ZRANGEBYSCORE/ZREVRANGEBYSCORE: Returns all members within the specified score range, in ascending/descending order; min and max can be set to -inf and +inf to return all members. Time complexity O(log(N)+M)
  • ZREMRANGEBYRANK/ZREMRANGEBYSCORE: Removes all members within the specified rank range / score range from the Sorted Set. Time complexity O(log(N)+M)

Avoid using ranges like [0 -1] or [-inf +inf] to traverse the entire Sorted Set, especially when its size is unpredictable. Use ZSCAN for cursor-based traversal (see https://redis.io/commands/scan), or use the LIMIT parameter of ZRANGEBYSCORE/ZREVRANGEBYSCORE to cap the number of members returned and iterate in batches.

Bitmap and HyperLogLog

These two data structures of Redis are not commonly used and are only briefly introduced in this article. For details about these two data structures and their related commands, please refer to the official documentation

The section of https://redis.io/topics/data-types-intro

A Bitmap is not an actual data type in Redis, but a way of using String as a Bitmap. This can be interpreted as converting a String to an array of bits. Using bitmaps to store simple data of type true/false is a huge space saver.

HyperLogLog is a data structure used mainly for cardinality estimation. Conceptually it behaves like a Set of distinct strings, but it does not store the members themselves, only an approximation of how many distinct members have been added. In other words, a HyperLogLog can only be used to count the number of distinct elements, which makes it far more memory-efficient than a Set.
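
A tiny sketch of counting unique visitors with a HyperLogLog, assuming the Jedis client (key and member names are illustrative):

import redis.clients.jedis.Jedis;

public class UniqueVisitorExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.pfadd("uv:2024-01-01", "user:1", "user:2", "user:3");
            jedis.pfadd("uv:2024-01-01", "user:2");              // duplicates do not increase the count
            System.out.println(jedis.pfcount("uv:2024-01-01"));  // ~3 (approximate distinct count)
        }
    }
}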

Other Common Commands

  • EXISTS: checks whether the specified key exists; returns 1 if it exists and 0 otherwise. Time complexity O(1)
  • DEL: deletes the specified key and its value. The time complexity is O(N), where N is the number of deleted keys
  • EXPIRE/PEXPIRE: Sets the expiration date for a key in seconds or milliseconds, time complexity O(1)
  • TTL/PTTL: Returns the remaining valid time of a key, in seconds or milliseconds. Time complexity O(1)
  • RENAME/RENAMENX: Renames the key to newKey. When RENAME is used, if the newkey already exists, its value is overwritten. When using RENAMENX, if newkey already exists, no operation will be performed. Time complexity O(1)
  • TYPE: Returns the TYPE of the specified key, string, list, set, zset, hash. Time complexity O(1)
  • CONFIG GET: obtain the current value of a Redis configuration item, you can use * wildcard, time complexity O(1)
  • CONFIG SET: SET a new value for a Redis configuration item, time complexity O(1)
  • CONFIG REWRITE: rewrites redis.conf so that it reflects the configuration currently in effect (persisting changes made with CONFIG SET)

Data persistence

Redis can automatically persist data to disk on a regular basis, in two forms: RDB and AOF. Each has its own advantages and disadvantages, and the two can be enabled together to improve data safety.

Must data persistence be used?

Redis's data persistence can be turned off. If you use Redis purely as a cache, so that everything stored in Redis is only a synchronized copy of data whose authoritative source lives elsewhere, you can disable persistence.

In general, however, it is still recommended to enable at least RDB-based persistence, because:

  • RDB persistence imposes almost no performance cost on Redis itself: the only thing the main Redis process does for RDB persistence is fork a child process, which then performs all the persistence work
  • After Redis crashes for whatever reason, the data recorded in the last RDB snapshot is automatically restored upon restart. This eliminates the need to manually synchronize data from other sources, such as DB, and is faster than any other data recovery method
  • Hard drives are so big these days, there’s really no shortage of space

RDB

In RDB mode, Redis periodically saves a snapshot of the data to an RDB file and automatically loads that file on startup to restore the previously saved data. The snapshot schedule is configured in the configuration file:

save [seconds] [changes]

This means an RDB snapshot is saved if at least [changes] writes occur within [seconds] seconds. For example:

save 60 100

This tells Redis to check every 60 seconds whether at least 100 writes have occurred, and to save an RDB snapshot if so.

Multiple save commands can be configured to enable Redis to execute a multi-level snapshot saving policy.

Redis enables RDB snapshot by default. The default RDB policy is as follows:

save 900 1
save 300 10
save 60 10000

You can also manually trigger RDB snapshot saving using the BGSAVE command.

Advantages of RDB:

  • Minimal impact on performance. As mentioned above, Redis will fork out child processes when saving RDB snapshots, with little impact on Redis’s efficiency in handling client requests.
  • Each snapshot is a complete data file, so snapshots from multiple points in time can be kept by other means (for example, backing up the midnight snapshot each day to other storage media), which makes RDB very suitable for disaster recovery.
  • Data recovery using RDB files is much faster than using AOF.

Disadvantages of RDB:

  • Snapshots are generated periodically, so if Redis crashes, the data written since the last snapshot is lost.
  • If the data set is very large and the CPU is weak (for example a single-core CPU), forking the child process can take a relatively long time (up to around 1 second), affecting client requests during that time.

AOF

With AOF persistence, Redis appends every write request to a log file. When Redis restarts, all the write operations recorded in the AOF file are replayed in order, restoring the data to its latest state.

AOF is disabled by default. To enable AOF, perform the following configuration:

appendonly yes

AOF provides three fsync configurations, always/everysec/no, specified through the appendfsync configuration item:

  • appendfsync no: never calls fsync from Redis; flushing is left entirely to the OS. Fastest, least safe
  • appendfsync always: fsync is performed after every write to the log. Highest data safety, lowest speed
  • appendfsync everysec: a compromise; a background thread performs fsync once per second

As AOF keeps appending write operations, some useless entries inevitably accumulate. For example, if SET key1 "ABC" is executed at one point and SET key1 "BCD" later, the first command is clearly redundant. A large number of useless entries makes the AOF file too large and recovery too slow.

So Redis provides AOF rewrite, which allows you to rewrite AOF files with only the smallest set of writes needed to restore data to its latest state.

AOF rewrite can be triggered by the BGREWRITEAOF command, or Redis can be configured to do this automatically at regular intervals:

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

Redis records the AOF file size after each rewrite; when the file has grown by 100% relative to that baseline, a rewrite is triggered automatically, but not before the file reaches at least 64MB.

Advantages of AOF:

  • With appendfsync always, no acknowledged write is lost; with appendfsync everysec, at most 1 second of data is lost.
  • AOF files are not easily corrupted by events such as power failures, and even a half-written last entry can be repaired easily with the redis-check-aof tool.
  • AOF files are human-readable and editable. If a wrong command has been executed, then as long as the AOF has not been rewritten you can back up the AOF file, remove the offending command, and replay the file to restore the data.

Disadvantages of AOF:

  • AOF files are usually larger than RDB files
  • Performance consumption is higher than RDB
  • The data recovery speed is slower than RDB

Memory management and data eviction mechanisms

Maximum memory setting

By default, Redis uses a maximum of 3GB of memory on a 32-bit OS, and no limit on a 64-bit OS.

When using Redis, you should have a reasonably accurate estimate of the maximum space your data will occupy and set a maximum amount of memory for Redis to use. Otherwise, on a 64-bit OS, Redis will consume memory without limit (and swap will be used once physical memory is exhausted), which causes all kinds of problems.

Control the maximum memory used by Redis through the following configuration:

maxmemory 100mb

When a write arrives after Redis has reached maxmemory, Redis will:

  • Try to evict data according to the configured eviction policy, to free up space
  • If there is nothing to evict, or no eviction policy is configured, return an error for all write requests while continuing to serve read requests normally

When setting maxmemory for Redis, note:

  • If master/slave replication is used, the master needs additional memory while synchronizing data to slaves. If maxmemory is set too close to the host's available memory, memory may run short during synchronization, so leave some headroom between maxmemory and the host's available memory for replication.

Data eviction mechanism

Redis offers five eviction strategies:

  • volatile-lru: evicts using the LRU algorithm (least recently used keys are evicted first), considering only keys that have an expiration time set
  • allkeys-lru: evicts using the LRU algorithm; any key can be evicted
  • volatile-random: evicts keys at random, considering only keys that have an expiration time set
  • allkeys-random: evicts keys at random; any key can be evicted
  • volatile-ttl: evicts the keys with the shortest remaining time to live

It is best to configure an appropriate eviction policy together with maxmemory, to avoid write failures when memory fills up.

In general, the recommended strategy is volatile-lru, combined with a clear view of how important each piece of data in Redis is. For data that is important and must not be evicted (configuration data, for example), do not set an expiration time, so Redis will never evict it. For less important data that can be reloaded on demand (such as a cache of recently logged-in users' information, which falls back to the DB on a miss), set an expiration time so that Redis can evict it when memory runs out.

Configuration method:

maxmemory-policy volatile-lru   # the default is noeviction, which evicts nothing

Pipelining

Pipelining

Redis provides many commands for batch operations, such as MSET/MGET/HMSET/HMGET, etc. These commands exist to reduce the resources and time consumed in maintaining network connections and transferring data.

For example, issuing the SET command five times in a row to set five different keys has the same effect as a single MSET setting the same five keys, but the former costs more round-trip time (RTT), so the latter should be preferred.

However, a client may need to perform several consecutive operations that no single Redis command can combine, for example:

SET a "abc"
INCR b
HSET c name "hi"

The Pipelining feature provided by Redis can be used to execute multiple commands in a single interaction.

With Pipelining, the client sends multiple commands (separated by \r\n) to Redis at once; Redis executes them in order and assembles the reply to each command into a single response, for example:

$ (printf "PING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
+PONG
+PONG
+PONG

Pipelining is supported by most Redis clients, so developers usually don’t need to manually assemble their own command lists.
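
The three commands above, issued through a Jedis pipeline in a single round trip (a sketch assuming the Jedis 3.x client):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline pipeline = jedis.pipelined();
            Response<String> setReply = pipeline.set("a", "abc");
            Response<Long> incrReply  = pipeline.incr("b");
            Response<Long> hsetReply  = pipeline.hset("c", "name", "hi");
            pipeline.sync();                        // send all commands at once and read the replies

            System.out.println(setReply.get());     // OK
            System.out.println(incrReply.get());    // 1
            System.out.println(hsetReply.get());    // 1 (a new field was created)
        }
    }
}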

Limitations of Pipelining

Pipelining can only be used to execute sequential, uncorrelated commands, and cannot be used when the generation of a command depends on the return of a previous command.

This limitation can be circumvented through Scripting features

Transactions and Scripting

Pipelining enables Redis to process multiple commands in a single interaction, but in some scenarios it may be necessary to ensure that the set of commands is executed consecutively.

For example, reading the current cumulative PV count and resetting it to 0:

> GET vCount
12384
> SET vCount 0
OK

If another client's INCR vCount slips in between the GET and the SET, the client will read an inaccurate vCount.

Redis transactions ensure that a group of commands is executed atomically: Redis guarantees that the commands in a transaction run strictly one after another, with no command from any other connection interleaved until they have all finished.

Add these two commands to a transaction through the MULTI and EXEC commands:

> MULTI
OK
> GET vCount
QUEUED
> SET vCount 0
QUEUED
> EXEC
1) 12384
2) OK

After receiving MULTI, Redis starts a transaction. All subsequent read and write commands are queued rather than executed. When EXEC is received, Redis executes all the queued commands in order and returns their results as an array.

You can use the DISCARD command to abandon the current transaction and clear the queued commands.

Note that Redis transactions do not support rollback:

If a command in a transaction has a syntax error, most client drivers return the error to the caller. Redis 2.6.5 and later also check the queued commands for syntax errors when EXEC runs and, if any are found, abandon the whole transaction and return an error.

But if a command in a transaction fails for a non-syntactic reason (such as running HSET against a String), neither the client driver nor Redis can detect it before the command actually runs, so all commands in the transaction are still executed in order. In that case some commands in the transaction succeed and some fail, and, unlike an RDBMS, Redis provides no rollback, so the data can only be repaired by other means.

Implementing CAS with transactions

Redis provides the WATCH command to be used with transactions to implement the CAS optimistic locking mechanism.

Suppose you want to implement changing the status of an item to sold:

if (exec(HGET stock:1001 state) == "in stock")
    exec(HSET stock:1001 state "sold");

This pseudo-code is not safe under concurrency: multiple clients may all read the "in stock" status, causing the same item to be sold multiple times.

Use the WATCH command and transaction to solve this problem:

exec(WATCH stock:1001);
if (exec(HGET stock:1001 state) == "in stock") {
    exec(MULTI);
    exec(HSET stock:1001 state "sold");
    exec(EXEC);
}

WATCH works as follows: when EXEC runs, Redis checks the WATCHed keys and executes the queued commands only if none of those keys has changed since WATCH was issued. If any WATCHed key changed between WATCH and EXEC, EXEC returns a failure instead.
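
The same CAS pattern sketched with the Jedis 3.x client (the key stock:1001 follows the pseudo-code above; retry handling is omitted for brevity):

import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class CasExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.watch("stock:1001");                        // WATCH before reading
            if ("in stock".equals(jedis.hget("stock:1001", "state"))) {
                Transaction tx = jedis.multi();
                tx.hset("stock:1001", "state", "sold");
                List<Object> result = tx.exec();              // null/empty if the key changed after WATCH
                if (result == null || result.isEmpty()) {
                    System.out.println("Conflict detected, retry or give up");
                }
            } else {
                jedis.unwatch();                              // release the WATCH if we do not proceed
            }
        }
    }
}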

Scripting

The EVAL and EVALSHA commands let Redis execute Lua scripts. This is similar to stored procedures in an RDBMS: read/write-intensive logic that would otherwise require many round trips between the client and Redis can be moved to the server side, avoiding excessive back-and-forth and improving performance.

Scripting was effectively born as a replacement for the transaction feature: everything transactions can do, scripts can do too. Redis officially recommends using Lua scripts instead of transactions, as they are more efficient and more convenient.

For details about the use of Scripting, please refer to the official documentation

https://redis.io/commands/eval
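
As an example, the stock check-and-set from the transaction section can be written as a single atomic Lua script; a sketch via Jedis EVAL (the script body is illustrative, not taken from the original):

import redis.clients.jedis.Jedis;

public class ScriptExample {
    // Atomically sets the item state to "sold" only if it is currently "in stock".
    private static final String SCRIPT =
        "if redis.call('HGET', KEYS[1], 'state') == ARGV[1] then " +
        "  redis.call('HSET', KEYS[1], 'state', ARGV[2]) " +
        "  return 1 " +
        "else " +
        "  return 0 " +
        "end";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Object sold = jedis.eval(SCRIPT, 1, "stock:1001", "in stock", "sold");
            System.out.println("sold = " + sold);   // 1 on success, 0 if the item was not in stock
        }
    }
}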

Redis performance tuning

Just because Redis is a very fast in-memory data storage medium doesn’t mean Redis doesn’t cause performance problems.

As mentioned above, Redis adopts a single-thread model, in which all commands are executed sequentially by one thread. Therefore, when a command takes a long time to execute, all subsequent commands will be slowed down, which makes Redis more sensitive to the execution efficiency of each task.

The performance optimization of Redis mainly starts from the following aspects:

  • First and foremost, make sure Redis is not executing time-consuming commands
  • Use pipelining to combine consecutive commands into a single interaction
  • Disable the operating system's Transparent Huge Pages feature:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

  • If you run Redis in a virtual machine, the virtualized environment itself introduces intrinsic latency. You can measure it with redis-cli --intrinsic-latency 100. If you have high performance requirements for Redis, deploy it directly on physical machines whenever possible.
  • Check the data persistence policy
  • Consider introducing a read-write separation mechanism

Time-consuming commands

The time complexity of most Redis read and write commands is between O(1) and O(N), and the time complexity of each command is explained in the text and official documents.

Generally speaking, O(1) commands are safe, while O(N) commands need care: if the magnitude of N cannot be predicted, avoid them. For example, running HGETALL/HKEYS/HVALS against a Hash with an unknown number of fields usually completes quickly, but if the Hash contains a very large number of fields, the execution time of these commands rises sharply.

Similarly, be careful when using SUNION to take the union of two large Sets, when using SORT on a List or Set, and so on.

There are several ways to avoid problems with these O(N) commands:

  • Do not use a List as an array-like list; use it only as a queue
  • Strictly control the size of Hash, Set and Sorted Set structures through application-level mechanisms
  • Where possible, perform sorting, unions, intersections, and similar computations on the client side
  • The KEYS command is absolutely forbidden
  • Avoid traversing all members of a collection type in one go; use SCAN-family commands for batched, cursor-based traversal

Redis provides the SCAN command to perform cursor traversal of all KEYS stored in Redis, avoiding performance problems caused by using the KEYS command. In addition, SSCAN, HSCAN, ZSCAN and other commands are used to perform cursor traversal of the elements in the Set, Hash, and Sorted Set, respectively. For details about how to use the SCAN command, see the official documents:

https://redis.io/commands/scan

Redis provides the Slow Log function to automatically record commands that take a long time. There are two related configuration parameters:

slowlog-log-slower-than xxx   # execution-time threshold (in microseconds); commands slower than this are recorded
slowlog-max-len xxx           # length of the Slow Log, i.e. the maximum number of entries kept

Using the SLOWLOG GET [number] command, you can output the latest number of Slow Log entries.

You can RESET Slow Log by running the SLOWLOG RESET command

Network-induced latency

  • Use long-lived connections or a connection pool whenever possible, and avoid creating and tearing down connections frequently (see the sketch after this list)
  • When the client performs bulk data operations, use the Pipelining feature to complete them in a single interaction. See the Pipelining section of this article
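
The connection-pool sketch referred to above, assuming Jedis and its JedisPool (pool sizes are illustrative):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class PoolExample {
    public static void main(String[] args) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(50);        // illustrative pool size
        config.setMaxIdle(10);

        try (JedisPool pool = new JedisPool(config, "localhost", 6379)) {
            // Borrow a pooled connection instead of creating a new one per request
            try (Jedis jedis = pool.getResource()) {
                jedis.set("greeting", "hello");
                System.out.println(jedis.get("greeting"));
            } // the connection is returned to the pool here
        }
    }
}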

Delays caused by data persistence

Redis data persistence is inherently delayed and requires a reasonable persistence strategy based on data security levels and performance requirements:

  • AOF + fsync always guarantees absolute data safety, but fsync is triggered on every operation, which significantly affects Redis performance
  • AOF + fsync every second is a good compromise: fsync runs once per second
  • AOF + fsync never offers the best performance available under AOF persistence
  • RDB persistence generally performs better than AOF, but the RDB save policy needs to be configured carefully
  • Every RDB snapshot and every AOF rewrite requires the main Redis process to fork. The fork itself can be time-consuming, depending on the CPU and on how much memory Redis uses, so configure RDB snapshots and AOF rewrites so that forks do not happen too often and cause latency

When Redis forks a child process, the memory page table must be copied to the child. For a Redis instance occupying 24GB of memory, about 48MB of page-table data has to be copied, and on a physical machine with a single Xeon 2.27GHz CPU this fork takes roughly 216ms. The latest_fork_usec field returned by the INFO command shows how long the last fork took, in microseconds.

Delay caused by Swap

When Linux moves the memory pages used by Redis to swap, it blocks the Redis process, causing an abnormal delay in Redis. Swap usually occurs when physical memory is insufficient or some processes are performing a large number of I/O operations. Avoid the above two situations.

The /proc/<pid>/smaps file stores the swap record of the process. By viewing this file, you can determine whether the delay of Redis is caused by swap. If a large Swap size is recorded in this file, the delay is most likely due to Swap.

Delays caused by key expiration

Redis can also stall when a large number of keys expire within the same second. When setting expiration times, try to stagger them.

Introduce read/write separation

Redis’ master-slave replication capability enables a master-slave multi-node architecture, in which the master node receives all write requests and synchronizes data to multiple slave nodes.

On top of this, slave nodes can serve read requests that do not require strong real-time consistency, reducing the load on the master.

In particular, statistics tasks that use time-consuming commands can be executed on one or more slave nodes to prevent these time-consuming commands from affecting the response of other requests.

For details on read/write separation, see the following sections

Master/slave replication and cluster sharding

Master-slave replication

Redis supports a master slave replication architecture with one master and many slaves. A Master instance handles all write requests, and the Master synchronizes write operations to all slaves.

With master/slave replication in Redis, read/write separation and high availability are possible:

  • Read requests that do not require high real-time performance can be completed on the Slave to improve efficiency. In particular, some statistics tasks that are periodically executed may require the execution of some time-consuming Redis commands. One or more slaves can be specially planned to serve these statistics tasks
  • With the help of Redis Sentinel, high availability can be realized. When the Master crashes, Redis Sentinel can automatically promote a Slave to Master and continue to provide service

Enabling master-slave replication is very simple: run multiple Redis instances and add the following to the configuration of each instance that should act as a Slave:

slaveof 192.168.1.1 6379   # the Master's IP address and port

When a Slave starts, it performs a cold-start synchronization from the Master: the Master triggers a BGSAVE to generate an RDB file and pushes it to the Slave, which imports it. Once the import completes, the Master sends the incremental writes to the Slave over the Redis protocol, and from then on master and slave stay in sync that way.

Use Sentinel for automatic failover

The master/slave replication function of Redis is only for data synchronization, but does not provide monitoring and automatic failover. To realize high availability of Redis through the master/slave replication function, one component is needed: Redis Sentinel

Redis Sentinel is a monitoring component officially developed by Redis. It can monitor the status of Redis instances, automatically discover Slave nodes through the Master node, elect a new Master when the Master node fails, and push the new Master/Slave configuration to all Redis instances.

Redis Sentinel requires at least three instances to be deployed to form an election relationship.

Key configuration:
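
A minimal sentinel.conf sketch using the commonly used Sentinel directives; the master name, address, and thresholds below are illustrative and must be adapted to the actual deployment:

port 26379
sentinel monitor mymaster 192.168.1.1 6379 2    # monitor the master named "mymaster"; 2 sentinels must agree it is down
sentinel down-after-milliseconds mymaster 30000 # mark the master as down after 30s without a valid reply
sentinel failover-timeout mymaster 180000       # overall failover timeout
sentinel parallel-syncs mymaster 1              # how many slaves re-sync with the new master at a time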

Also note that the automatic failover performed by Redis Sentinel does not happen on the same IP and port: the new Master produced by a failover serves on a different IP/port than the previous one. So to achieve HA, the client must also support Sentinel and be able to query Sentinel for the address of the current Master.

Cluster sharding

Why cluster sharding:

  • The amount of data stored in Redis is too large for the physical memory of a host
  • The concurrency of Redis write requests is too large for one Redis instance to handle

When these two problems occur, Redis must be sharded.

There are many Redis sharding schemes: many Redis clients implement client-side sharding themselves, and there are proxy-based schemes such as Twemproxy. The preferred solution, however, is the official Redis Cluster sharding introduced in version 3.0.

This article will not introduce the specific installation and deployment details of Redis Cluster, but focus on the benefits and disadvantages of Redis Cluster.

Redis Cluster capability

  • The ability to automatically spread data across multiple nodes
  • If the access key is not in the current shard, the request is automatically forwarded to the correct shard
  • Services can still be provided when some nodes in the cluster fail

The third point is built on master-slave replication: every data shard in a Redis Cluster uses a master-slave structure. The principle is exactly the same as the master-slave replication described above; the only difference is that the separate Redis Sentinel component is no longer needed, because Redis Cluster itself monitors the shard nodes and performs automatic failover internally.

Redis Cluster sharding principle

A Redis Cluster has 16384 hash slots. Redis computes the CRC16 of each key and takes the result modulo 16384 to decide which hash slot the key belongs to. Each data shard in the cluster is assigned a set of slots, and slot assignments can be changed (resharded) at any time.

When a client reads or writes a key, it can connect to any shard in the Cluster. If the key is not in the Slot of the shard, the Redis Cluster automatically redirects the request to the correct shard.

hash tags

Beyond the basic sharding rule, Redis also supports hash tags, which guarantee that keys sharing the same hash tag map to the same slot: only the substring inside {} is hashed. For example, {uiv}user:1000 and {uiv}user:1001 share the hash tag {uiv} and are therefore stored in the same slot.

When using Redis Cluster, the keys involved in pipelining, transaction and LUA Script functions must be on the same data shard or an error will be returned. To use the above functionality in a Redis Cluster, hash tags must be used to ensure that all keys operated in a pipeline or transaction are in the same Slot.

Some clients, such as Redisson, implement clustered Pipelining, which automatically groups commands in a pipeline into the shards where the key resides and executes them on different shards. However, Redis does not support cross-shard transactions, and transactions and LUA Script must still follow the rules that all keys are in one shard.

Master/slave replication vs cluster sharding

When designing a software architecture, how do you choose between master-slave replication and clustered sharding?

In terms of capability, Redis Cluster is superior to plain master-slave replication in every respect:

  • Redis Cluster can solve the problem of large amount of data on a single node
  • The Redis Cluster can solve the problem of excessive single-node access
  • The Redis Cluster contains master-slave replication capabilities

Does that mean Redis Cluster is always a better choice than master-slave replication?

No, it isn't.

A more complex software architecture is not automatically better: a complex architecture brings significant benefits but also corresponding costs. The disadvantages of using Redis Cluster include:

  • Maintenance becomes harder. With Redis Cluster, the number of Redis instances to maintain multiplies, more hosts have to be monitored, and data backup/persistence becomes more complex. Adding or removing shards also requires a reshard operation, which is far more complex than adding a Slave in master-slave mode.
  • Client resource consumption increases. When a client uses connection pooling, it must maintain a pool per data shard, so the number of connections the client keeps open multiplies, consuming resources on both the client and the operating system.
  • Performance tuning becomes harder. You may need to inspect Slow Logs and swap activity on several shards to locate a performance problem.
  • The cost of using transactions and LUA Script rises. Redis Cluster places strict restrictions on both: the keys they operate on must live on the same shard, which forces extra planning and key-naming rules during development. If the application relies heavily on transactions and scripts, evenly distributing data across shards while keeping these two features working becomes difficult.

Therefore, when choosing between master-slave replication and Cluster sharding, weigh the application's characteristics, the data and traffic volumes, and future growth plans, and adopt Redis Cluster only when data sharding is genuinely needed.

Here are some suggestions:

  1. How much data do you need to store in Redis? How large will it be within the next 2 years? Does all of it need to be kept long-term, or can the LRU algorithm evict the non-hot data? Based on these factors, estimate the physical memory Redis will need.
  2. How much physical memory does the host running Redis have? How much of it can be given to Redis? Is that enough for the estimate from (1)?
  3. How much concurrent write pressure will Redis face? Without pipelining, Redis can handle more than 100,000 writes per second (see https://redis.io/topics/benchmarks for more benchmarks).
  4. Will pipelining and transactions be used? In how many scenarios?

Based on the above, if the available physical memory of a single host is sufficient for Redis's capacity needs and the concurrent write pressure is still far below the benchmark figures, a master-slave replication architecture is recommended; it saves a lot of unnecessary trouble. Master-slave replication is also recommended if the application makes heavy use of pipelining and transactions, to keep design and development complexity down.

Choosing a Redis Java client

Redis has many Java clients; three are officially recommended: Jedis, Redisson and Lettuce.

Here is a comparison of Jedis and Redisson

Jedis:

  • Lightweight, simple, easy to integrate and transform
  • Support connection pooling
  • Supports Pipelining, transactions, LUA Scripting, Redis Sentinel, and Redis Cluster
  • Does not support read/write separation out of the box; it must be implemented manually
  • Poor documentation (really bad, almost no…)

Redisson:

  • Netty based implementation, using non-blocking IO, high performance
  • Support for asynchronous requests
  • Support connection pooling
  • Supports Pipelining, LUA Scripting, Redis Sentinel, and Redis Cluster
  • Transactions are not supported, and LUA Scripting is officially recommended instead
  • Support for pipelining under the Redis Cluster architecture
  • Read/write separation and read load balancing are supported in both primary/secondary replication and Redis Cluster architectures
  • Built-in Tomcat Session Manager provides Session sharing for Tomcat 6/7/8
  • Can be integrated with Spring Session for Redis based Session sharing
  • Documentation is rich, with Chinese documents

The same principle applies when choosing between Jedis and Redisson: although Jedis has various shortcomings compared with Redisson, only introduce Redisson when its advanced features are actually needed, to avoid unnecessary complexity.