Distributed locks

In a single-process application, some pieces of code must be executed by only one thread at a time.

Without that guarantee, multiple threads can produce wrong results. For example, suppose two threads add to the same number at the same time. Both read the current value, say 40; one adds 10 and the other adds 20, so the correct result is 70.

But each thread works on its own copy of the value: one computes 50 and the other computes 60, and both write their result back over the original number.

One of the writes is then overwritten by the other, because read -> calculate -> save is not an atomic operation.

The end result is therefore either 50 or 60, not 70. This can easily happen under concurrent or parallel execution.

When this happens in a single-process application, the locking constructs provided by the language (such as synchronized in Java) guarantee that the code is executed by only one thread at a time, so the updates happen one after another and the result is correct.
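The 40 + 10 + 20 example can be sketched with a process-local lock. This is only an illustration in Python (the text mentions Java's synchronized; `threading.Lock` plays the same role here), and the names `balance`, `balance_lock` and `add` are made up for the example.

```python
import threading

balance = 40                     # the shared number from the example
balance_lock = threading.Lock()  # visible only to threads inside this process


def add(amount):
    """Run read -> calculate -> save as one step that other threads cannot interleave."""
    global balance
    with balance_lock:
        current = balance           # read
        updated = current + amount  # calculate
        balance = updated           # save


threads = [threading.Thread(target=add, args=(n,)) for n in (10, 20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 70
```

The lock object lives in one process's memory, which is exactly why it cannot help once the same code runs in several processes.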

In distributed applications, these language-level locks are no longer sufficient, because the same application may run as several processes, on one machine or on different machines.

Because the same code may be executed by different processes or on different machines, a separate mechanism is needed to make that code run sequentially (only one thread at a time) across all of them.

This requires middleware to coordinate the processes. Many middleware systems can provide a distributed lock, and Redis is one of them.

How Redis implements a distributed lock

The principle of a distributed lock in Redis is to set a value in Redis that only one thread can store. (All instances of the distributed application must of course connect to the same Redis, which acts as the common coordination point; otherwise the lock is useless.) When another thread, or a process on a different machine, tries to store the value and finds it already present, the lock is already held, and the caller either waits and retries or gives up.

The SETNX (SET if Not eXists) command is generally used to acquire the lock. If the key does not exist, it is set; if it already exists, nothing is set. This is the key to acquiring the lock.
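A minimal sketch of this idea follows. To run without a Redis server it uses `InMemoryRedis`, a hypothetical stand-in written for this example; its `setnx` and `delete` methods are meant to mirror the redis-py client methods of the same names. `try_lock` and `unlock` are also illustrative names, not part of any library.

```python
class InMemoryRedis:
    """Hypothetical stand-in that mimics only SETNX and DEL (not a real client)."""

    def __init__(self):
        self._data = {}

    def setnx(self, name, value):
        # Store the value only when the key is absent.
        if name in self._data:
            return False
        self._data[name] = value
        return True

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def try_lock(client, key):
    """Return True if this caller created the key and therefore holds the lock."""
    return bool(client.setnx(key, "true"))


def unlock(client, key):
    """Release the lock by deleting the key."""
    client.delete(key)


client = InMemoryRedis()
first = try_lock(client, "LOCK-KEY-NAME")   # key absent, lock acquired
second = try_lock(client, "LOCK-KEY-NAME")  # key present, lock refused
unlock(client, "LOCK-KEY-NAME")
third = try_lock(client, "LOCK-KEY-NAME")   # key gone again, lock acquired
print(first, second, third)  # True False True
```

With a real server, the same two functions should work if `client` is a redis-py connection pointing at the shared Redis instance.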

The deadlock problem

A distributed lock can be implemented with SETNX and DEL, but there is a catch: if a thread acquires the lock with SETNX and then crashes for some reason before releasing it with DEL, the lock is never released and every other client is blocked forever.

The solution is to give the lock an expiration time, after which it is released automatically.

Use EXPIRE to set the expiration time (in seconds) for the lock, as follows:

SETNX LOCK-KEY-NAME true
EXPIRE LOCK-KEY-NAME 5

However, there is still a problem. Because SETNX and EXPIRE are two separate commands, SETNX may complete and then the machine or program may fail before EXPIRE runs, so a deadlock is still possible.

Can transactions solve this problem?

No. EXPIRE depends on the result of SETNX: it should run only if SETNX succeeded, and must not run otherwise. A Redis transaction (MULTI/EXEC) has no if/else branching; the queued commands are executed as a batch, all or none.

Redis 2.6.12 added extended options to the SET command that let SETNX and EXPIRE be performed together, atomically.

SET LOCK-KEY-NAME true ex 5 nx
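In client code the atomic form might look like the sketch below. The call `client.set(key, value, nx=True, ex=ttl)` follows the redis-py signature; `InMemoryRedis` and its injectable `clock` are hypothetical, added only so the expiry behaviour can be shown without a server or real waiting.

```python
import time


class InMemoryRedis:
    """Hypothetical stand-in mimicking SET with nx/ex; `clock` is injectable for the demo."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _purge(self, name):
        # Drop the key if its expiration time has passed.
        entry = self._data.get(name)
        if entry is not None and entry[1] is not None and entry[1] <= self._clock():
            del self._data[name]

    def set(self, name, value, ex=None, nx=False):
        self._purge(name)
        if nx and name in self._data:
            return None  # redis-py returns None when NX prevents the write
        expires_at = None if ex is None else self._clock() + ex
        self._data[name] = (value, expires_at)
        return True


def try_lock(client, key, ttl_seconds):
    """Acquire the lock and set its expiry in one command."""
    return client.set(key, "true", nx=True, ex=ttl_seconds) is True


now = [0.0]  # fake clock so the demo does not have to sleep
client = InMemoryRedis(clock=lambda: now[0])
held = try_lock(client, "LOCK-KEY-NAME", 5)       # acquired at t=0
blocked = try_lock(client, "LOCK-KEY-NAME", 5)    # still held
now[0] = 6.0                                      # holder crashed, 6 seconds pass
recovered = try_lock(client, "LOCK-KEY-NAME", 5)  # key expired, lock free again
print(held, blocked, recovered)  # True False True
```

Because the value and the expiry are written by a single command, there is no window in which the key exists without a timeout.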

Timeout problems

Setting an expiration time on the lock solves the deadlock problem, but what if the protected code runs longer than the expiration time? The lock is released automatically while the code is still running, another client acquires it and starts executing, and errors result.

In typical development scenarios, we set the lock time generously, for example 60s, and the application can normally finish within that time. But what if, in some unlucky case, the execution cannot complete within 60s?

In that case a renewal scheme can be used: while the program is executing, it keeps checking whether the lock is about to expire and whether the code has finished. If the code has not finished and the lock is about to expire, the lock is renewed, so that it is not released automatically before the code completes. In Java, this scheme is implemented by a framework called Redisson, which can be used directly.
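The renewal idea can be sketched as a background thread that keeps re-arming the TTL while the work runs. This is not how Redisson is implemented, only an illustration: `Watchdog` and `RecordingClient` are hypothetical names, `expire(name, seconds)` is meant to mirror the redis-py method, and the tiny interval is chosen so the demo finishes quickly. A real version would use an interval that is a fraction of the TTL and would check that the caller still owns the lock before extending it.

```python
import threading
import time


class RecordingClient:
    """Hypothetical stand-in: records EXPIRE calls instead of talking to Redis."""

    def __init__(self):
        self.expire_calls = []

    def expire(self, name, seconds):
        self.expire_calls.append((name, seconds))
        return True


class Watchdog:
    """Resets the lock's TTL every `interval` seconds until the block exits."""

    def __init__(self, client, key, ttl_seconds, interval):
        self._client, self._key = client, key
        self._ttl, self._interval = ttl_seconds, interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once the event is set.
        while not self._stop.wait(self._interval):
            self._client.expire(self._key, self._ttl)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()     # work finished: stop renewing
        self._thread.join()


client = RecordingClient()
with Watchdog(client, "LOCK-KEY-NAME", ttl_seconds=60, interval=0.02):
    time.sleep(0.2)  # stands in for the long-running protected code

renewals = len(client.expire_calls)
print(renewals >= 1)  # True
```

If the process crashes, the watchdog thread dies with it, so the lock still expires on its own and the deadlock protection is kept.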

Releasing a lock by mistake

While a lock is in use, another application that does not hold the lock may still execute DEL. This releases the lock held by the application that is currently executing, so some other client can acquire the lock and enter the protected code section at the same time.

The solution is to set the value, at SETNX time, to a random, globally unique string of digits or characters, which the locking thread keeps. When releasing the lock, the thread compares its string with the value stored in the lock. If they match, the lock can be released; if they do not match, the release is a mistake made by someone else and is refused.

But comparing the value and deleting the key are two separate steps, not one atomic operation, and Redis does not provide a single command for this, so a Lua script is used instead (Redisson also does this).

The most important point is that every caller must release the lock through the same entry point. If some application skips the check above and calls DEL directly, the scheme does not work (I have used DEL directly for testing purposes).
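A sketch of the token check follows. The Lua script is the commonly used compare-and-delete form; `client.eval(script, numkeys, *keys_and_args)` follows the redis-py signature. `InMemoryRedis` is a hypothetical stand-in whose `eval` does not interpret Lua at all: it hard-codes the behaviour of this one script so the example runs without a server.

```python
import uuid

# Deletes the key only if it still holds the caller's token; Redis runs a script atomically.
UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


class InMemoryRedis:
    """Hypothetical stand-in; eval() ignores the script text and mimics UNLOCK_SCRIPT only."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, ex=None, nx=False):
        if nx and name in self._data:
            return None
        self._data[name] = value  # expiry is not simulated in this sketch
        return True

    def eval(self, script, numkeys, *keys_and_args):
        key, token = keys_and_args[0], keys_and_args[1]
        if self._data.get(key) == token:
            del self._data[key]
            return 1
        return 0


def try_lock(client, key, ttl_seconds):
    """Return a unique token on success, or None if the lock is already held."""
    token = uuid.uuid4().hex
    return token if client.set(key, token, nx=True, ex=ttl_seconds) else None


def unlock(client, key, token):
    """Release the lock only if `token` matches the stored value."""
    return client.eval(UNLOCK_SCRIPT, 1, key, token) == 1


client = InMemoryRedis()
token = try_lock(client, "LOCK-KEY-NAME", 5)
wrong_release = unlock(client, "LOCK-KEY-NAME", "someone-else")  # refused
right_release = unlock(client, "LOCK-KEY-NAME", token)           # accepted
print(wrong_release, right_release)  # False True
```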

Reentrancy

Reentrancy means that a thread which already holds a lock can acquire it again. If a lock can be locked multiple times by the same thread, the lock is reentrant; in Java, ReentrantLock is such a lock. A reentrant lock typically keeps a count that increases on each acquisition and decreases on each release. When the count reaches 0, the distributed lock is released.

A Redis lock needs the same kind of counting if it is to be reentrant, but this logic adds complexity. It is generally better to restructure the code that needs the lock so that it does not acquire the distributed lock repeatedly. (Redisson also supports reentrant locks.)
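One way to picture the counting is a wrapper that keeps a per-thread counter around a non-reentrant lock. This is a sketch under assumptions, not Redisson's approach: `ReentrantRedisLock` is a hypothetical name, and `acquire` / `release` are placeholder callables that in real code would wrap the SET ... NX and unlock steps. Here they only flip an in-memory flag.

```python
import threading


class ReentrantRedisLock:
    """Wraps non-reentrant acquire/release callables with a per-thread counter."""

    def __init__(self, acquire, release):
        self._acquire = acquire  # returns True if the underlying lock was obtained
        self._release = release  # releases the underlying lock
        self._local = threading.local()

    def lock(self):
        count = getattr(self._local, "count", 0)
        if count == 0 and not self._acquire():
            return False  # another holder owns the lock
        self._local.count = count + 1
        return True

    def unlock(self):
        count = getattr(self._local, "count", 0)
        if count == 0:
            raise RuntimeError("unlock() called without holding the lock")
        self._local.count = count - 1
        if self._local.count == 0:
            self._release()  # only the outermost unlock touches the shared lock


held = {"value": False}  # stands in for the Redis key
calls = []


def acquire():
    if held["value"]:
        return False
    held["value"] = True
    calls.append("acquire")
    return True


def release():
    held["value"] = False
    calls.append("release")


lock = ReentrantRedisLock(acquire, release)
lock.lock()
lock.lock()                  # second acquisition by the same thread succeeds
lock.unlock()
still_held = held["value"]   # count is 1, so the shared lock is still held
lock.unlock()                # count reaches 0, the shared lock is released
print(calls, still_held, held["value"])  # ['acquire', 'release'] True False
```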

Redlock

The approach above looks sound, but Redis itself can fail. For example, in a Sentinel deployment, the primary node can fail and a replica can be promoted to primary without the client knowing about it.

Suppose the client's SETNX succeeds on the primary, but the primary fails before the lock is replicated. The replica is promoted to primary, and the new primary has no record of the lock.

When another client then requests the lock, its SETNX succeeds immediately, so two clients execute the same code at the same time and the lock is unsafe again.

The industry offers a solution called Redlock. It uses multiple Redis instances that are independent of each other, with no primary-replica relationship, and, like many other distributed algorithms, it relies on a majority mechanism.

When locking, the set(key, value, nx=True, ex=xxx) command is sent to the nodes, and the lock is acquired as long as more than half of them set the key successfully. When releasing the lock, the DEL command is sent to all nodes.
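The majority rule can be sketched as follows. This is a simplified illustration, not the full Redlock algorithm: it omits retries and clock-drift compensation, and it releases with a compare-and-delete on each node instead of a plain DEL. `InMemoryNode`, `delete_if_equal`, `redlock_acquire` and `redlock_release` are hypothetical names; only the `set(..., nx=True, px=...)` call shape is taken from redis-py.

```python
import time
import uuid


class InMemoryNode:
    """Hypothetical stand-in for one independent Redis instance (expiry not simulated)."""

    def __init__(self, available=True):
        self.available = available
        self._data = {}

    def set(self, name, value, px=None, nx=False):
        if not self.available:
            raise ConnectionError("node unreachable")
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True

    def delete_if_equal(self, name, value):
        # Assumed helper; against real Redis this would be a compare-and-delete script.
        if self.available and self._data.get(name) == value:
            del self._data[name]


def redlock_release(nodes, key, token):
    """Ask every node to drop the key if it still holds our token."""
    for node in nodes:
        node.delete_if_equal(key, token)


def redlock_acquire(nodes, key, ttl_ms):
    """Try every node; succeed only with a majority inside the lock's lifetime."""
    token = uuid.uuid4().hex
    start = time.monotonic()
    acquired = 0
    for node in nodes:
        try:
            if node.set(key, token, nx=True, px=ttl_ms):
                acquired += 1
        except ConnectionError:
            pass  # an unreachable node does not count toward the majority
    elapsed_ms = (time.monotonic() - start) * 1000
    quorum = len(nodes) // 2 + 1
    if acquired >= quorum and elapsed_ms < ttl_ms:
        return token
    redlock_release(nodes, key, token)  # roll back any partial acquisition
    return None


nodes = [InMemoryNode(), InMemoryNode(), InMemoryNode(available=False),
         InMemoryNode(), InMemoryNode(available=False)]
first = redlock_acquire(nodes, "LOCK-KEY-NAME", 5000)   # 3 of 5 nodes accept
second = redlock_acquire(nodes, "LOCK-KEY-NAME", 5000)  # no node accepts
redlock_release(nodes, "LOCK-KEY-NAME", first)
third = redlock_acquire(nodes, "LOCK-KEY-NAME", 5000)   # free again after release
print(first is not None, second is None, third is not None)  # True True True
```

With five nodes the lock survives the loss of two of them, which is the property a single primary with asynchronous replication cannot give.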

The Redlock algorithm (already supported by Redisson) has to handle details such as retrying on errors and clock drift. Redlock also has to read from and write to multiple nodes, so its performance is lower than that of a single Redis instance.

If the business scenario has a low tolerance for errors and can accept a slight performance degradation, consider using the Redlock algorithm.