This is the 27th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

In a distributed system, there are situations where resources shared by multiple nodes need to be locked, and distributed locks are needed. Distributed locks are usually stored in a shared storage system and can be shared and accessed by multiple nodes.

The nature of the lock

Simply put, a lock can be represented by a variable. For example, in a stand-alone multithreaded program, the lock on a resource can be represented by a single bit of data. That is, 0 indicates that no resource can be accessed, and 1 indicates that the lock of the resource has been acquired by another thread and cannot be accessed.

Acquiring and releasing locks on a particular resource is essentially acquiring and modifying the value of this variable. If the value is 0, change it to 1 to complete the acquisition process. If the value is not 0, the lock acquisition fails. Changing the value of the variable representing the lock to 0 is the operation of releasing the lock if it was previously acquired.

In a distributed scenario, the lock is implemented in the same way, except that the variable representing the resource lock needs to be kept in a shared storage system. This shared storage system can be Redis or any other system that provides data storage.

Distributed lock implementation based on Redis

Step 1: Initial implementation of functionality

In the case of Redis as the shared storage system, the variable representing the lock on a resource is a key-value pair in Redis. For example, if the resource to which a distributed lock is to be added is resource_A, we can call the key of the lock variable of Resource_A in Redis lock_A.

For example, when a node needs to acquire a lock, it accesses the value of LOCK_A in Redis. Assuming that the value obtained is 0, the node sets the value to 1 to complete the lock operation. In this case, node 2 also needs to obtain the lock of Resource_A. When it accesses the value of lock_A in Redis and finds that the lock is 1, it indicates that the lock has been obtained by another node and has not been released. Therefore, node 2 fails to lock resource_A.

When a node needs to release the lock, it only needs to set the value of LOCK_A in Redis to 0 to complete the release. After that, other nodes can acquire the lock again.

Step 2: Atomize lock operation

In the above description, locking is not a single operation, but involves several steps: reading the lock variable, determining the value of the variable, and modifying the lock variable. These three operations need to be atomized.

In Redis, there is a SETNX command used to SET the value of a KEY/value pair. Unlike the SET command, this command determines whether the KEY/value pair exists first. The value is SET only if the specified KEY does not exist. SETNX stands for SET if Not eXist. It is used in the same way as SET:

SETNX lock_a 1
Copy the code

If the value is set successfully, the lock is obtained; if the value fails, the lock is not obtained. To release the lock, use the DEL operation to delete the key-value pair.

This enables atomization of lock acquisition and lock release.

Step 3: prevent lock after not release

Next, consider a problem early. If a node does not release the lock due to program exceptions after obtaining the lock, the lock will always be held by it and cannot be released, and other nodes cannot access resources.

To avoid this, we must set the expiration time for the lock variable. When the lock variable expires, we can request the lock again, thus avoiding this problem.

The SETNX command does not have the option to SET an expiration time. Fortunately, Redis provides an NX option for the SET command that mimics SETNX.

SET lock_a 1 NX PX 10000
Copy the code

The above command means that if lock_A does not exist, set its value to 1 and expire after 10 seconds.

Step 4: Who locks who releases

The last problem is that if node one acquires the lock and node two for some reason performs a DEL operation, then the other nodes can acquire the lock again.

To solve this problem, we can modify the contents of the lock variable. In the previous logic, when we apply for a lock, it is to determine whether the lock variable exists, not the value stored in it, so we can use that value.

If the value is saved as a unique identifier for each node, the value can be checked before DEL is performed to determine whether the lock was added to the current node, and if so, the lock can be released. This implements “lock and release”.

In this part, no single instruction can complete the operation of reading lock variables, judging and deleting, so it can be implemented using Lua scripts. Get the value of the current lock variable in the script, and compare it with the given node id. Delete it only if it matches, otherwise it will not be operated.

To release the lock, execute the Lua script.

Step 5: Implement high availability

After completing the functionality, we will finally implement high availability. If we use a single Redis as a shared storage system for distributed locks, then, if this Redis is unavailable, then all the parts involved in distributed locks are unavailable, which is very brittle, which is why high availability is necessary.

At this point, we need to move out the distributed lock algorithm Redlock proposed by Antirez, the author of Redis. In short, it is to let the lock applicant, to multiple independent Redis instances to request locks, if more than half of the Redis to complete the lock operation, then the successful acquisition of the lock, otherwise the acquisition of failure.

When a lock is released, the Lua script that successfully removes the lock variable on more than half of the instances is considered successful.