Redis: How to implement distributed locks

Why do we need distributed locks

The purpose of a distributed lock is to ensure that only one client can operate on a shared resource at a time.

We often encounter concurrency problems when doing logic processing in distributed applications.

For example, if an operation wants to modify the user’s state, it needs to read the user’s state first, modify it in memory, and then save it back. If such operations are performed simultaneously, concurrency problems arise because the two operations of reading and saving state are not atomic.
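The read-modify-save race described above can be reproduced deterministically by interleaving two clients by hand. This is a minimal sketch, not taken from the original article: the in-memory map stands in for the shared store, and the key name `user:1:points` is a hypothetical example.

```java
import java.util.HashMap;
import java.util.Map;

final class LostUpdateSketch {
    // Hypothetical shared store standing in for a database or cache.
    static final Map<String, Integer> store = new HashMap<>();

    static int runInterleaved() {
        store.put("user:1:points", 0);
        // Both clients read the state before either one saves it.
        int readByA = store.get("user:1:points");
        int readByB = store.get("user:1:points");
        // Each client modifies its own copy in memory.
        int newA = readByA + 1;
        int newB = readByB + 1;
        // Both clients save back; B's write overwrites A's.
        store.put("user:1:points", newA);
        store.put("user:1:points", newB);
        return store.get("user:1:points");
    }

    public static void main(String[] args) {
        // Two increments were attempted, but the stored value is 1.
        System.out.println("final value = " + runInterleaved());
    }
}
```

One update is lost because the read and the save are not a single atomic step, which is the situation a lock is meant to prevent.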

A distributed lock is used to limit concurrent execution of the program. Redis, as a caching middleware system, can provide this distributed locking mechanism.

The essence is to claim a slot (a key) in Redis. When another process tries to claim the same slot, it finds that the slot is already taken, so it either gives up or waits and tries again later.

In general, distributed locks available in production environments should meet the following requirements:

  • Mutual exclusion. This is the basic feature of a lock: only one thread can hold the lock at a time and perform the critical operations.
  • Timeout release. This is another essential lock feature, comparable to the innodb_lock_wait_timeout setting of the MySQL InnoDB engine. Releasing the lock on timeout prevents threads from waiting unnecessarily and wasting resources.
  • Reentrancy. In a distributed environment, a thread on a given node that already holds the lock can acquire it again successfully.
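Mutual exclusion and reentrancy can be illustrated with an owner id plus a hold count. The sketch below is an in-memory illustration only, under the assumption that each requester is identified by a string such as `node-1:thread-7`; the class and method names are hypothetical and do not come from any library.

```java
final class ReentrantHoldSketch {
    private String owner;   // hypothetical owner id, e.g. "node-1:thread-7"
    private int holdCount;  // how many times the owner has acquired the lock

    synchronized boolean tryLock(String requester) {
        if (owner == null) {            // lock is free
            owner = requester;
            holdCount = 1;
            return true;
        }
        if (owner.equals(requester)) {  // re-entry by the current owner
            holdCount++;
            return true;
        }
        return false;                   // mutual exclusion: someone else holds it
    }

    synchronized boolean unlock(String requester) {
        if (owner == null || !owner.equals(requester)) return false;
        if (--holdCount == 0) owner = null; // fully released after the last unlock
        return true;
    }

    static boolean demo() {
        ReentrantHoldSketch lock = new ReentrantHoldSketch();
        boolean first = lock.tryLock("node-1:thread-7");
        boolean again = lock.tryLock("node-1:thread-7");      // same owner re-enters
        boolean other = lock.tryLock("node-2:thread-3");      // different owner is rejected
        lock.unlock("node-1:thread-7");
        boolean stillHeld = !lock.tryLock("node-2:thread-3"); // one hold remains
        lock.unlock("node-1:thread-7");
        boolean freed = lock.tryLock("node-2:thread-3");      // now available
        return first && again && !other && stillHeld && freed;
    }
}
```

The lock becomes available to other requesters only after the owner has unlocked as many times as it locked.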

Implementation

Use SETNX

SETNX is used as follows: SETNX key value sets the key to value only when the key does not exist. If the key already exists, SETNX does nothing.

// jedis is an already-connected redis.clients.jedis.Jedis instance
boolean result = jedis.setnx("lock-key", "true") == 1L;
if (result) {
    try {
        // do something
    } finally {
        // release the lock
        jedis.del("lock-key");
    }
}

A fatal problem with this approach is that if the thread holding the lock cannot release it because of an unexpected failure (such as the process or machine going down), the lock is never released.
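The set-if-absent behavior and the stuck-lock problem can be sketched without a Redis server. This is an in-memory stand-in, not Jedis: `ConcurrentHashMap.putIfAbsent` plays the role of SETNX, and the class name is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SetnxStuckLockSketch {
    // In-memory stand-in for a Redis keyspace without expiration.
    private final ConcurrentMap<String, String> keys = new ConcurrentHashMap<>();

    // Mirrors SETNX: returns 1 if the key was set, 0 if it already existed.
    long setnx(String key, String value) {
        return keys.putIfAbsent(key, value) == null ? 1L : 0L;
    }

    static long[] demo() {
        SetnxStuckLockSketch redis = new SetnxStuckLockSketch();
        long a = redis.setnx("lock-key", "true");  // client A acquires the lock
        // Client A crashes here and never deletes "lock-key".
        long b1 = redis.setnx("lock-key", "true"); // client B is rejected
        long b2 = redis.setnx("lock-key", "true"); // and rejected again on every retry
        return new long[] {a, b1, b2};
    }
}
```

Because nothing ever removes the key, every later attempt returns 0 and the lock stays held indefinitely.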

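To address this, we can add a timeout to the lock.

Executing SET key value EX seconds is the same as executing SETEX key seconds value.

Executing SET key value PX milliseconds is the same as executing PSETEX key milliseconds value.

Combining the NX option with EX or PX in a single SET command sets the value and the timeout atomically, and only if the key does not already exist.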

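// Assumes Jedis 3.x or later; SetParams is redis.clients.jedis.params.SetParams
// NX: set only if the key is absent; EX 5: expire after 5 seconds
String result = jedis.set("lock-key", "true", SetParams.setParams().nx().ex(5));
if ("OK".equals(result)) {
    try {
        // do something
    } finally {
        jedis.del("lock-key");
    }
}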

The scheme looks perfect, but it still has a problem.

Imagine that thread A acquires a lock and sets the expiration time to 10 seconds, and then takes 15 seconds to execute the business logic. By the time it finishes, the lock acquired by thread A has already been released automatically by the Redis expiration mechanism.

After those 10 seconds, the lock may have been acquired by another thread. When thread A unlocks (DEL key) after executing the business logic, it may delete a lock that is now held by that other thread.
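The timeline above can be replayed with a manually advanced clock. The sketch below is an in-memory stand-in for Redis expiration, not real client code; `setNxEx`, `advance` and the class name are hypothetical helpers introduced for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleDeleteSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    private long nowSeconds = 0; // manual clock, advanced by the demo

    void advance(long seconds) { nowSeconds += seconds; }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowSeconds) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    // Mirrors SET key value NX EX seconds.
    boolean setNxEx(String key, String value, long ttlSeconds) {
        evictIfExpired(key);
        if (values.containsKey(key)) return false;
        values.put(key, value);
        expiresAt.put(key, nowSeconds + ttlSeconds);
        return true;
    }

    String get(String key) { evictIfExpired(key); return values.get(key); }

    void del(String key) { values.remove(key); expiresAt.remove(key); }

    static String demo() {
        StaleDeleteSketch redis = new StaleDeleteSketch();
        redis.setNxEx("lock-key", "A", 10); // t = 0: thread A locks for 10 s
        redis.advance(12);                  // t = 12: A is still working, its lock expired at t = 10
        redis.setNxEx("lock-key", "B", 10); // thread B acquires the now-free lock
        redis.advance(3);                   // t = 15: A finishes its business logic
        redis.del("lock-key");              // A's unconditional DEL removes B's lock
        return redis.get("lock-key");       // null, although B's lock was valid until t = 22
    }
}
```

After thread A's DEL, thread B still believes it holds the lock, while a third thread could now acquire it.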

The best way to tell whether a lock is your own is to set the key's value to a unique value (a random value, a UUID, a combination of machine number and thread number, a signature, and so on).

When unlocking, that is, deleting the key, first check whether the value of the key equals the value set earlier, and delete the key only if it does.

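// Unique value identifying this lock holder (requires java.util.UUID);
// a timestamp alone could collide between two clients
String value = UUID.randomUUID().toString();
String result = jedis.set("lock-key", value, SetParams.setParams().nx().ex(5));
if ("OK".equals(result)) {
    try {
        // do something
    } finally {
        // Non-atomic operation: GET and DEL are two separate commands
        if (value.equals(jedis.get("lock-key"))) {
            jedis.del("lock-key");
        }
    }
}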

The problem here is easy to spot: GET and DEL are separate commands, so the lock can expire and be acquired by another client in the interval between them, in which case the DEL removes that client's lock.

The problem is solved if the unlock code is made atomic.

Here we introduce a new approach, a Lua script, as shown in the following example:

if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end

Here KEYS[1] is the lock key, and ARGV[1] is the unique value specified when the key was set.

Redis executes a Lua script atomically: while the script is running, commands from other clients are not executed until the script has finished.
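The compare-then-delete contract of the script can be mimicked in plain Java. This is an analogy, not how Redis runs Lua: it assumes an in-memory map in place of Redis and relies on `ConcurrentHashMap.remove(key, value)`, which removes an entry only if it is currently mapped to the given value, as a single atomic step. The class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class CompareAndDeleteSketch {
    private final ConcurrentMap<String, String> keys = new ConcurrentHashMap<>();

    boolean lock(String key, String uniqueValue) {
        return keys.putIfAbsent(key, uniqueValue) == null;
    }

    // Same contract as the Lua script: 1 if the key was deleted, 0 otherwise.
    long unlock(String key, String uniqueValue) {
        return keys.remove(key, uniqueValue) ? 1L : 0L;
    }

    static long[] demo() {
        CompareAndDeleteSketch redis = new CompareAndDeleteSketch();
        // A's lock is assumed to have expired already; B now holds the lock.
        redis.lock("lock-key", "value-B");
        long staleUnlock = redis.unlock("lock-key", "value-A"); // A's late unlock is ignored
        long ownerUnlock = redis.unlock("lock-key", "value-B"); // B releases its own lock
        long repeated = redis.unlock("lock-key", "value-B");    // nothing left to delete
        return new long[] {staleUnlock, ownerUnlock, repeated};
    }
}
```

Since the comparison and the removal cannot be separated by another operation, a late unlock from a previous holder can no longer delete the current holder's lock.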

Ensure that the expiration time is longer than the business execution time

To prevent multiple threads from executing the business code at the same time, the lock must not expire while the business logic is still running. One way to achieve this is to refresh the expiration time periodically while the lock is held:

  • Add a Boolean property, isOpenExpirationRenewal, that indicates whether periodic refreshing of the expiration time is enabled.
  • Add a scheduleExpirationRenewal method that starts a thread to refresh the expiration time.
  • In the lock code, after the lock is acquired successfully, set isOpenExpirationRenewal to true and call scheduleExpirationRenewal to start the refresh thread.
  • In the unlock code, set isOpenExpirationRenewal to false so that the refresh thread stops polling.
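The renewal steps above can be sketched as follows. This is a minimal in-memory illustration, not the article's actual class: only isOpenExpirationRenewal and scheduleExpirationRenewal come from the text, while the deadline field, the generation counter, the renewal interval of one third of the lease, and the other names are assumptions made for the sketch.

```java
final class RenewingLockSketch {
    private final long leaseMillis;
    private long expiresAtMillis;            // stand-in for the key's expiration in Redis
    private long generation;                 // distinguishes successive lock holders
    private int renewals;
    private boolean isOpenExpirationRenewal; // guarded by "this"

    RenewingLockSketch(long leaseMillis) { this.leaseMillis = leaseMillis; }

    synchronized boolean tryLock() {
        if (isHeld()) return false;
        expiresAtMillis = System.currentTimeMillis() + leaseMillis;
        isOpenExpirationRenewal = true;
        scheduleExpirationRenewal(++generation);
        return true;
    }

    synchronized void unlock() {
        isOpenExpirationRenewal = false;     // tells the refresh thread to stop
        expiresAtMillis = 0;
    }

    synchronized boolean isHeld() { return expiresAtMillis > System.currentTimeMillis(); }

    synchronized int renewalCount() { return renewals; }

    // Pushes the deadline forward; returns false once renewal has been switched off.
    private synchronized boolean renew(long lockGeneration) {
        if (!isOpenExpirationRenewal || lockGeneration != generation) return false;
        expiresAtMillis = System.currentTimeMillis() + leaseMillis;
        renewals++;
        return true;
    }

    private void scheduleExpirationRenewal(long lockGeneration) {
        Thread refresher = new Thread(() -> {
            try {
                do {
                    Thread.sleep(leaseMillis / 3); // assumed interval: a third of the lease
                } while (renew(lockGeneration));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        refresher.setDaemon(true);
        refresher.start();
    }

    static boolean demo() {
        RenewingLockSketch lock = new RenewingLockSketch(900);
        boolean acquired = lock.tryLock();
        try {
            Thread.sleep(1500);              // "business logic" outlasts the 900 ms lease
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        boolean stillHeld = lock.isHeld() && lock.renewalCount() >= 1;
        lock.unlock();
        return acquired && stillHeld && !lock.isHeld();
    }
}
```

In a real implementation the renewal step would reset the key's expiration in Redis, and it should verify the lock's unique value first so that it never extends a lock owned by someone else.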

Redisson implementation

If the lock is acquired successfully, Redisson starts a scheduled task that periodically checks the lock and renews it.

The interval between runs is internalLockLeaseTime / 3, which is 10 seconds with the default settings.

By default, the lock lease time is 30 seconds. If the business logic has not finished, the task renews the lock every 10 seconds, that is, when 30 - 10 = 20 seconds remain, and resets the lease to 30 seconds.
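The arithmetic behind these numbers is small enough to write down. The sketch below only reproduces the calculation described in the text and does not call Redisson; the class and method names are hypothetical, and internalLockLeaseTime is used as a plain parameter name.

```java
final class WatchdogTimelineSketch {
    // Interval between renewals, as described in the text: lease time / 3.
    static long renewalIntervalMillis(long internalLockLeaseTime) {
        return internalLockLeaseTime / 3;
    }

    // Time left on the lock at the moment a renewal fires.
    static long remainingAtRenewalMillis(long internalLockLeaseTime) {
        return internalLockLeaseTime - renewalIntervalMillis(internalLockLeaseTime);
    }

    public static void main(String[] args) {
        long lease = 30_000L; // default lease of 30 seconds
        System.out.println("renew every " + renewalIntervalMillis(lease) + " ms");
        System.out.println("remaining at renewal: " + remainingAtRenewalMillis(lease) + " ms");
    }
}
```

With a 30,000 ms lease this gives a renewal every 10,000 ms, each firing when 20,000 ms remain.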

RedLock

In a cluster, when the master node fails, a replica takes over, and the switch is largely transparent to the client. Suppose the first client acquires a lock on the master, but the master crashes before the lock is replicated to the replica. The replica is then promoted to master, and it does not contain the lock, so when another client requests the lock it is granted immediately. As a result, the same lock is held by two clients at the same time, which is unsafe.
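The failover scenario can be replayed with two in-memory nodes. This is a simplified model under stated assumptions: replication is represented only by the fact that nothing is copied before the crash, and the Node class is a hypothetical stand-in for a Redis instance.

```java
import java.util.HashMap;
import java.util.Map;

final class FailoverSketch {
    // Hypothetical stand-in for one Redis instance.
    static final class Node {
        final Map<String, String> keys = new HashMap<>();

        boolean setNx(String key, String value) {
            if (keys.containsKey(key)) return false;
            keys.put(key, value);
            return true;
        }
    }

    static boolean[] demo() {
        Node master = new Node();
        Node replica = new Node();

        // Client 1 is granted the lock by the master.
        boolean client1 = master.setNx("lock-key", "client-1");
        // Replication is asynchronous: the master fails before the key reaches the replica.
        Node promoted = replica; // failover: the replica becomes the new master
        // Client 2 asks the new master, which has never seen the key.
        boolean client2 = promoted.setNx("lock-key", "client-2");
        return new boolean[] {client1, client2};
    }
}
```

Both calls succeed, so two clients hold the same lock at once.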

The Redlock algorithm is designed to solve this problem.

To use Redlock, you need several Redis instances that are completely independent of each other, with no master-replica relationship between them. Like many distributed algorithms, Redlock relies on a majority mechanism.

When locking, the client sends a SET command to the nodes, and the lock is considered acquired if the SET succeeds on more than half of them. To release the lock, it sends a DEL command to all nodes. However, Redlock also has to handle many details such as error retries and clock drift, and because it reads from and writes to multiple nodes, its performance is lower than that of a single Redis instance.

The Redlock algorithm is a high-availability mode built on top of the single-node Redis lock. It is based on N completely independent Redis nodes, usually an odd number greater than 3 (N is typically set to 5), which makes it very unlikely that all nodes fail at the same time.

Assume that there are five nodes, and the client running the Redlock algorithm performs the following steps to acquire the lock:

  • The client records the current system time in milliseconds.
  • The client tries to acquire the lock on each node in turn, using the same key and unique value. Each request uses a network connection and response timeout that is shorter than the lock expiration time, so that an unreachable node does not stall the client.
  • The client computes the time spent acquiring the lock by subtracting the start time from the current time. The lock is considered acquired if and only if it was obtained from more than half of the Redis nodes and the time spent is less than the lock expiration time.
  • If the lock is acquired, the key's real validity time is the expiration time minus the time spent acquiring the lock.
  • If the lock is not acquired, the client should unlock all Redis instances, including the nodes on which the request appeared to fail, because a response may have been lost even though the key was actually set on the server.

That is, if the lock expires in 30 seconds and it takes 31 seconds to lock three nodes, the lock attempt has failed.
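The acquisition steps above can be sketched against in-memory nodes. This is an illustrative model, not a production Redlock client: the nodes do not implement key expiration or network timeouts, an unreachable node is represented by a `down` flag, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RedlockSketch {
    // In-memory stand-in for one independent Redis node.
    static final class Node {
        final Map<String, String> keys = new ConcurrentHashMap<>();
        boolean down; // simulates an unreachable node

        boolean setNx(String key, String value) {
            return !down && keys.putIfAbsent(key, value) == null;
        }

        void releaseIfOwner(String key, String value) {
            if (!down) keys.remove(key, value); // delete only if the value matches
        }
    }

    // Returns the remaining validity time in ms, or -1 if the lock was not acquired.
    static long tryLock(List<Node> nodes, String key, String value, long ttlMillis) {
        long start = System.currentTimeMillis();           // record the start time
        int acquired = 0;
        for (Node node : nodes) {                           // try every node in turn
            if (node.setNx(key, value)) acquired++;
        }
        long elapsed = System.currentTimeMillis() - start;  // time spent acquiring
        long validity = ttlMillis - elapsed;                // real validity time
        if (acquired >= nodes.size() / 2 + 1 && validity > 0) return validity;
        unlock(nodes, key, value);                          // failed: release everywhere
        return -1;
    }

    static void unlock(List<Node> nodes, String key, String value) {
        for (Node node : nodes) node.releaseIfOwner(key, value);
    }

    static boolean demo() {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 5; i++) nodes.add(new Node());
        nodes.get(0).down = true;
        nodes.get(1).down = true; // 3 of 5 reachable: still a majority
        boolean first = tryLock(nodes, "lock-key", "client-1", 30_000) > 0;
        boolean second = tryLock(nodes, "lock-key", "client-2", 30_000) == -1;
        unlock(nodes, "lock-key", "client-1");
        nodes.get(2).down = true; // 2 of 5 reachable: no majority possible
        boolean third = tryLock(nodes, "lock-key", "client-3", 30_000) == -1;
        boolean cleanedUp = nodes.get(3).keys.isEmpty() && nodes.get(4).keys.isEmpty();
        return first && second && third && cleanedUp;
    }
}
```

In the demo, the first client succeeds with three of five nodes, the second is refused while the lock is held, and the third fails for lack of a majority and leaves no stray keys behind.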

An implementation of RedLock is built into Redisson, the Java client recommended by Redis

redis.io/topics/dist…

github.com/redisson/re…

Problems with RedLock:

RedLock only guarantees high availability of the lock, not its correctness.

RedLock relies heavily on the system clocks of the machines involved.

Martin Kleppmann’s critique of RedLock:

  • For locks that are used only for efficiency, RedLock is too heavyweight.
  • For locks that are used for correctness, RedLock does not provide a strong enough guarantee.