Why distributed locks?
Before we get to that, let’s look at a business scenario:
System A is an e-commerce system, currently deployed on a single machine. It has an interface for users to place orders, but before an order is placed it must check the inventory to make sure there is enough stock.
Because the system has a certain level of concurrency, the inventory of goods is loaded into Redis in advance, and the Redis inventory is updated when a user places an order.
The system architecture is as follows:
However, there is a problem with this: suppose that at some moment the inventory of an item in Redis is 1 and two requests arrive at the same time. One of them reaches step 3 in the figure above and updates the database inventory to 0, but step 4 has not yet been executed.
The other request reaches step 2, sees that the inventory in Redis is still 1, and continues on to step 3.
As a result, 2 items are sold even though there is only 1 item in stock.
That is obviously wrong! This is the classic inventory oversell problem.
At this point a solution comes to mind easily: lock steps 2, 3, and 4 so that only after one thread has finished them can another thread come in and execute step 2.
As shown above, use Java's synchronized keyword or a ReentrantLock to acquire the lock before step 2 and release it after step 4.
In this way, steps 2, 3, and 4 are "locked" and can only be executed sequentially across threads.
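A minimal sketch of this single-machine locking is shown below; the stockService and orderDao helpers are illustrative placeholders rather than code from the system above:

```java
import java.util.concurrent.locks.ReentrantLock;

public class OrderService {
    // one lock per JVM: only one thread at a time may run steps 2-4
    private final ReentrantLock lock = new ReentrantLock();

    // stockService and orderDao are hypothetical collaborators, injected elsewhere
    private StockService stockService;
    private OrderDao orderDao;

    public void placeOrder(long itemId) {
        lock.lock();                        // acquired before step 2
        try {
            int stock = stockService.getStockFromRedis(itemId);      // step 2: check inventory in Redis
            if (stock > 0) {
                orderDao.deductStockAndCreateOrder(itemId);           // step 3: update the database
                stockService.updateStockInRedis(itemId, stock - 1);   // step 4: write the inventory back to Redis
            }
        } finally {
            lock.unlock();                  // released only after step 4
        }
    }
}
```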
But the good times did not last long: the concurrency of the whole system soared, and one machine could no longer handle it. Now we need to add a machine, as shown below:
After adding the machine, the system looks like the figure above. Oh my God!
Suppose the requests from two users arrive at the same time but land on different machines. These two requests can then execute simultaneously, and the inventory oversell problem appears again.
Why is that? Because the two instances of system A in the figure above run in two different JVMs, the locks they acquire are only valid for threads inside their own JVM and have no effect on threads in the other JVM.
So the problem here is that the native locking mechanism provided by Java fails in a multi-machine deployment scenario, because the locks on the two machines are not the same lock (they live in different JVMs).
So, we just have to make sure that the locks on both machines are the same, and the problem is solved, right?
This is where the distributed lock makes its grand entrance. The idea of a distributed lock is:
provide one globally unique "thing" in the whole system for acquiring locks; whenever any system needs to lock, it asks this "thing" for the lock, so that different systems end up contending for what is effectively the same lock.
This “thing” could be Redis, Zookeeper, or a database.
A text description is not very intuitive, so let's look at the picture below:
From the analysis above, we know that in a distributed deployment Java's native locking mechanism cannot guarantee thread safety, so we need a distributed locking scheme.
So how do you implement distributed locks? Keep reading!
Distributed lock based on Redis
We have analyzed why a distributed lock is needed; here we look at how a distributed lock should actually be implemented. Extension: How does Redisson implement distributed locking?
One of the most common solutions is to use Redis for distributed locking.
The idea of using Redis for distributed locking is as follows: set a value in Redis to indicate that the lock has been acquired, and delete the key when releasing the lock.
The code looks like this:
```
// Acquire lock:
SET anyLock unique_value NX PX 30000

// Release lock: executed as a Lua script, because releasing the lock involves
// two commands (GET and DEL) that are not atomic on their own; Redis runs Lua
// scripts atomically.
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
```
This approach has several main points:
- Be sure to use the SET key value NX PX milliseconds command. If you instead set the value first and then set the expiration time separately, that is not an atomic operation: the client may crash before setting the expiration time, leaving the key there forever and causing a deadlock.
- The value must be unique. When releasing the lock, verify that the value matches the one set when locking before deleting the key. This avoids the following situation: suppose client A acquires the lock with a 30-second expiration time but only finishes after 35 seconds; by then the lock has already expired and may have been acquired by client B. When A goes to release "its" lock, the value check prevents it from deleting client B's lock. (A concrete acquire/release sketch follows this list.)
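Here is a minimal sketch of acquiring and releasing such a lock with the Jedis client; the lockKey and uniqueValue names are only for illustration:

```java
import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

Jedis jedis = new Jedis("127.0.0.1", 6379);
String lockKey = "anyLock";
String uniqueValue = UUID.randomUUID().toString();   // must be unique per client/request

// Acquire: a single atomic SET ... NX PX command
String result = jedis.set(lockKey, uniqueValue, SetParams.setParams().nx().px(30000));
boolean locked = "OK".equals(result);

// Release: GET + DEL wrapped in one Lua script, so the check and the delete run
// atomically and only the holder of uniqueValue can delete the key
String releaseScript =
        "if redis.call('get', KEYS[1]) == ARGV[1] then " +
        "  return redis.call('del', KEYS[1]) " +
        "else " +
        "  return 0 " +
        "end";
jedis.eval(releaseScript,
        Collections.singletonList(lockKey),
        Collections.singletonList(uniqueValue));
```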
Besides thinking about how the client implements distributed locking, we also need to consider how Redis itself is deployed.
Redis can be deployed in three ways:
- Standalone mode
- Master-slave + Sentinel election mode
- Redis Cluster mode
In standalone mode, the drawback of using Redis for distributed locking is the single point of failure: if Redis goes down, locking stops working.
In master-slave mode, the lock is only written to one node while locking. Even with Sentinel providing high availability, if the master fails and a master/slave switchover happens, the lock may be lost.
Based on these considerations, the author of Redis proposed the RedLock algorithm, which goes roughly like this:
Assume Redis is deployed with a total of 5 master nodes. A lock is acquired through the following steps (a sketch of the logic follows the list):
- Get the current timestamp in milliseconds.
- Try to create the lock on each master node in turn, using a short timeout for each attempt, typically tens of milliseconds.
- Try to establish the lock on a majority of the nodes, e.g., 3 out of 5 (n/2 + 1).
- The client then computes how long acquiring the lock took; if that time is shorter than the lock's validity period, the lock is considered successfully acquired.
- If acquiring the lock fails, the locks already created on the nodes are deleted in turn.
- As long as someone else holds the distributed lock, you have to keep polling and retrying.
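To make these steps more concrete, here is a rough sketch of the acquisition logic; RedisClient, tryLockOnNode and unlockOnNode are hypothetical helpers, not a real library API:

```java
// A rough sketch of the Redlock acquisition steps listed above (hypothetical helpers).
public boolean tryRedlock(List<RedisClient> masters, String lockKey, String uniqueValue,
                          long validityMillis) {
    int quorum = masters.size() / 2 + 1;               // majority, e.g. 3 out of 5
    long start = System.currentTimeMillis();           // step 1: current timestamp
    int acquired = 0;
    for (RedisClient master : masters) {               // step 2: try each master in turn
        // each attempt uses a short per-node timeout (tens of milliseconds)
        if (tryLockOnNode(master, lockKey, uniqueValue, validityMillis)) {
            acquired++;
        }
    }
    long elapsed = System.currentTimeMillis() - start;
    // steps 3-4: locked on a majority of nodes, and acquisition was faster than the validity period
    if (acquired >= quorum && elapsed < validityMillis) {
        return true;
    }
    // step 5: failed - release whatever was acquired, node by node
    for (RedisClient master : masters) {
        unlockOnNode(master, lockKey, uniqueValue);
    }
    return false;                                       // caller keeps polling and retrying
}
```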
However, this algorithm is quite controversial; it may run into problems, and it cannot guarantee that the locking process is always correct.
Another way: Redisson
Besides implementing Redis distributed locks yourself on top of a Redis client's native API, you can also use an open-source framework: Redisson.
Redisson is an enterprise-grade open-source Redis client that also provides distributed lock support. I highly recommend it. Why?
Recall from above that if you write the code yourself to set a value in Redis, you use the following command:
```
SET anyLock unique_value NX PX 30000
```
Here the timeout is set to 30 seconds. If the business logic is not finished within 30 seconds, the key expires and another thread may acquire the lock.
In that case the second thread comes in before the first thread has finished its business logic, so we would have to maintain the expiration time ourselves, which is troublesome.
How does Redisson solve this? Feel the thrill of using Redisson:
```java
Config config = new Config();
config.useClusterServers()
    .addNodeAddress("redis://192.168.31.101:7001")
    .addNodeAddress("redis://192.168.31.101:7002")
    .addNodeAddress("redis://192.168.31.101:7003")
    .addNodeAddress("redis://192.168.31.102:7001")
    .addNodeAddress("redis://192.168.31.102:7002")
    .addNodeAddress("redis://192.168.31.102:7003");

RedissonClient redisson = Redisson.create(config);

RLock lock = redisson.getLock("anyLock");
lock.lock();
lock.unlock();
```
It's as simple as that: we only need the lock and unlock calls of its API to get a working distributed lock, and it takes care of many details for us:
- All Redisson commands are executed through Lua scripts, and Redis executes Lua scripts atomically.
- Redisson sets the default expiration time of the lock key to 30 seconds. What if a client holds the lock for more than 30 seconds? Redisson has the concept of a "watchdog": after you acquire the lock, it resets the key's expiration time back to 30 seconds every 10 seconds. So as long as the lock is held, the key never expires and no other thread can take the lock away. (See the usage note after this list.)
- Redisson's watchdog logic also ensures that no deadlock occurs: if the client machine goes down, its watchdog dies with it, the key stops being renewed, the lock expires automatically after 30 seconds, and other threads can then acquire it.
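One practical consequence, visible in the source below: the watchdog only runs when you do not pass an explicit lease time. A small usage sketch:

```java
RLock lock = redisson.getLock("anyLock");

// no leaseTime given: the key gets the default 30s TTL and the watchdog renews it every 10s
lock.lock();
lock.unlock();

// explicit leaseTime: the key simply expires after 10 seconds and the watchdog is not started
// (this corresponds to the leaseTime != -1 branch of tryAcquireAsync below)
lock.lock(10, TimeUnit.SECONDS);
lock.unlock();
```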
Here is the implementation code:
```java
private <T> RFuture<Long> tryAcquireAsync(long leaseTime, TimeUnit unit, final long threadId) {
    if (leaseTime != -1) {
        return tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
    }
    // call a Lua script that sets the key with the watchdog's default expiration time
    RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(
            commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
            TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
    ttlRemainingFuture.addListener(new FutureListener<Long>() {
        @Override
        public void operationComplete(Future<Long> future) throws Exception {
            if (!future.isSuccess()) {
                return;
            }
            Long ttlRemaining = future.getNow();
            // lock acquired
            if (ttlRemaining == null) {
                // start the watchdog logic
                scheduleExpirationRenewal(threadId);
            }
        }
    });
    return ttlRemainingFuture;
}

<T> RFuture<T> tryLockInnerAsync(long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    internalLockLeaseTime = unit.toMillis(leaseTime);
    return commandExecutor.evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            "if (redis.call('exists', KEYS[1]) == 0) then " +
                "redis.call('hset', KEYS[1], ARGV[2], 1); " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                "return nil; " +
            "end; " +
            "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                "return nil; " +
            "end; " +
            "return redis.call('pttl', KEYS[1]);",
            Collections.<Object>singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
}

// the watchdog renewal is scheduled here
private void scheduleExpirationRenewal(final long threadId) {
    if (expirationRenewalMap.containsKey(getEntryName())) {
        return;
    }
    // this task runs after a delay of internalLockLeaseTime / 3, i.e. 10 seconds by default
    Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) throws Exception {
            // this call resets the key's expiration time back to 30s
            RFuture<Boolean> future = renewExpirationAsync(threadId);
            future.addListener(new FutureListener<Boolean>() {
                @Override
                public void operationComplete(Future<Boolean> future) throws Exception {
                    expirationRenewalMap.remove(getEntryName());
                    if (!future.isSuccess()) {
                        log.error("Can't update lock " + getName() + " expiration", future.cause());
                        return;
                    }
                    if (future.getNow()) {
                        // reschedule itself: by calling this method again,
                        // the expiration time keeps being extended while the lock is held
                        scheduleExpirationRenewal(threadId);
                    }
                }
            });
        }
    }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);

    if (expirationRenewalMap.putIfAbsent(getEntryName(), new ExpirationEntry(threadId, task)) != null) {
        task.cancel();
    }
}
```
In addition, Redisson provides support for the Redlock algorithm, which is also easy to use:
```java
RedissonClient redisson = Redisson.create(config);

RLock lock1 = redisson.getFairLock("lock1");
RLock lock2 = redisson.getFairLock("lock2");
RLock lock3 = redisson.getFairLock("lock3");

RedissonRedLock multiLock = new RedissonRedLock(lock1, lock2, lock3);
multiLock.lock();
multiLock.unlock();
```
Summary:
This section analyzed the concrete scheme for implementing a distributed lock with Redis and some of its limitations, and then introduced the Redis client framework Redisson, which I recommend: it takes care of far more details than hand-written code would.
Distributed lock based on ZooKeeper
Besides Redis, ZooKeeper can also be used to implement a distributed lock.
Before introducing ZooKeeper's mechanism for distributed locks, let's take a quick look at what ZK is:
ZooKeeper is a centralized service that provides configuration management, distributed coordination, and naming.
The ZK data model looks like this: ZK consists of a series of nodes called znodes, organized much like a file system, where each znode represents a directory. Znodes have several notable features:
- Ordered (sequential) nodes: if we have a parent node /lock, we can create child nodes under it. ZooKeeper provides an optional sequential feature: for example, you can create child nodes named "/lock/node-" and ask for sequencing, and ZooKeeper will automatically append an increasing integer based on the current number of children. That is, the first child created becomes /lock/node-0000000000, the next /lock/node-0000000001, and so on.
- Ephemeral (temporary) nodes: a client can create an ephemeral node; when the session ends or times out, ZooKeeper automatically deletes the node.
- Event listeners (watches): when reading data, we can also set a watch on the node; when the node's data or structure changes, ZooKeeper notifies the client. ZooKeeper has the following four kinds of events:
  - Node created
  - Node deleted
  - Node data changed
  - Children changed
Based on the ZK features above, we can easily arrive at a concrete scheme for implementing a distributed lock with ZK:
- Using ZK's ephemeral and sequential nodes, each thread that wants the lock creates an ephemeral sequential node in ZK, for example under the /lock directory.
- After the node is created successfully, list all ephemeral nodes in the /lock directory and check whether the node created by the current thread has the smallest sequence number of all nodes.
- If the node created by the current thread has the smallest sequence number, the lock is considered acquired.
- If the node created by the current thread does not have the smallest sequence number, add an event listener (watch) on the node immediately before it. For example, if the node obtained by the current thread is /lock/003 and the full node list is [/lock/001, /lock/002, /lock/003], a listener is added to /lock/002.
When a lock is released, the thread watching its node is woken up and performs step 3 again to check whether it now has the smallest sequence number.
For example, when /lock/001 is released, /lock/002 receives the watch event; the node set is now [/lock/002, /lock/003], so /lock/002 has the smallest sequence number and acquires the lock.
The whole process is as follows:
That is the general implementation idea. Writing the code by hand is fairly involved, so only a minimal sketch is shown below; in practice the Curator client introduced next handles these details for you.
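Assuming the parent node /lock already exists, a minimal sketch of this flow with the raw ZooKeeper client might look like the following (error handling, reconnection and edge cases are omitted, and the class name is illustrative):

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ZkLockSketch {
    private final ZooKeeper zk;
    private String ourNode;

    public ZkLockSketch(ZooKeeper zk) {
        this.zk = zk;
    }

    public void lock() throws Exception {
        // step 1: create an ephemeral sequential node under /lock
        ourNode = zk.create("/lock/node-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        while (true) {
            // step 2: list all children and sort them by sequence number
            List<String> children = zk.getChildren("/lock", false);
            Collections.sort(children);
            String ourName = ourNode.substring("/lock/".length());
            int index = children.indexOf(ourName);
            if (index == 0) {
                return; // step 3: ours is the smallest node, lock acquired
            }
            // step 4: watch the node immediately before ours and wait for it to disappear
            String previous = "/lock/" + children.get(index - 1);
            CountDownLatch latch = new CountDownLatch(1);
            Stat stat = zk.exists(previous, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    latch.countDown();
                }
            });
            if (stat != null) {
                latch.await();
            }
            // loop back and re-check whether our node is now the smallest
        }
    }

    public void unlock() throws Exception {
        zk.delete(ourNode, -1); // deleting the ephemeral node releases the lock
    }
}
```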
Introducing Curator
Curator is an open-source ZooKeeper client that also provides a distributed lock implementation.
Using it is simple:
```java
InterProcessMutex interProcessMutex = new InterProcessMutex(client, "/anyLock");
interProcessMutex.acquire();
interProcessMutex.release();
```
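In practice you would usually wrap the lock in try/finally and may prefer the timed acquire variant; a small sketch, where doBusinessLogic is a placeholder for your own code:

```java
InterProcessMutex mutex = new InterProcessMutex(client, "/anyLock");
// wait at most 3 seconds for the lock; acquire() without arguments blocks until acquired
if (mutex.acquire(3, TimeUnit.SECONDS)) {
    try {
        doBusinessLogic();
    } finally {
        mutex.release();   // always release, even if the business logic throws
    }
}
```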
The core source code of the distributed lock is as follows:
```java
private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception {
    boolean haveTheLock = false;
    boolean doDelete = false;
    try {
        if (revocable.get() != null) {
            client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
        }

        while ((client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock) {
            // get the sorted list of all current child nodes
            List<String> children = getSortedChildren();
            // get the name of the node created by the current thread
            String sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash

            // check whether the current node has the smallest sequence number
            PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
            if (predicateResults.getsTheLock()) {
                // lock acquired
                haveTheLock = true;
            } else {
                // lock not acquired: register a watcher on the node just before the current one
                String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();

                synchronized (this) {
                    Stat stat = client.checkExists().usingWatcher(watcher).forPath(previousSequencePath);
                    if (stat != null) {
                        if (millisToWait != null) {
                            millisToWait -= (System.currentTimeMillis() - startMillis);
                            startMillis = System.currentTimeMillis();
                            if (millisToWait <= 0) {
                                doDelete = true; // timed out - delete our node
                                break;
                            }
                            wait(millisToWait);
                        } else {
                            wait();
                        }
                    }
                }
                // else it may have been deleted (i.e. lock released). Try to acquire again
            }
        }
    } catch (Exception e) {
        doDelete = true;
        throw e;
    } finally {
        if (doDelete) {
            deleteOurPath(ourPath);
        }
    }
    return haveTheLock;
}
```
The underlying principle of the distributed lock implemented by Curator is essentially the same as what we analyzed above. Here is a diagram to illustrate the principle in detail:
Summary:
This section introduced the distributed lock scheme implemented with ZooKeeper, the basic usage of the open-source ZK client Curator, and a brief look at its implementation principle.
Compare the advantages and disadvantages of the two schemes
Having learned both distributed lock implementations, this section compares the pros and cons of the Redis and ZK approaches.
Redis distributed locks have the following disadvantages:
- It acquires the lock in a simple, crude way: if it cannot get the lock, it keeps retrying, which costs performance.
- The design orientation of Redis means its data is not strongly consistent, which can cause problems in some extreme cases; the lock model is not robust enough.
- Even with the Redlock algorithm, correctness cannot be guaranteed 100% in some complex scenarios. For a discussion of Redlock, see "How to do distributed locking".
- A Redis distributed lock in fact requires the client to keep trying to acquire the lock, which is relatively costly.
On the other hand, Redis-based distributed locks are common in many companies, and most of the time the scenarios encountered are not "extremely complex".
So using Redis as a distributed lock is also a good solution. Most importantly, Redis has high performance and can support highly concurrent lock acquire and release operations.
For ZK distributed locks:
- ZooKeeper is designed for distributed coordination and strong consistency; its lock model is robust, easy to use, and well suited to distributed locking.
- If you cannot get the lock, you only need to add a listener instead of polling all the time, so the performance cost is small.
However, ZK also has its disadvantage: if many clients frequently request and release locks, the pressure on the ZK cluster becomes heavier.
Summary:
To sum up, both Redis and ZooKeeper have their advantages and disadvantages. When doing technology selection, we can use these points as reference factors.
Advice
From the previous analysis, there are two common solutions for implementing distributed locks: Redis and ZooKeeper, each of which has its own advantages. How to choose?
Personally, I prefer the lock implemented by ZK:
Because Redis may have hidden problems that can lead to incorrect data. But how to choose still depends on the specific situation in your company.
If the company already has a ZK cluster, ZK is preferred; but if the company only has a Redis cluster and it is not practical to set up a ZK cluster,
then implementing the lock with Redis is also fine. In addition, the system designer may consider that the system already depends on Redis and not want to introduce yet another external dependency; in that case Redis can be chosen.
This is an architectural trade-off for the system designer to make.