1 | 0 Foreword

Many systems today are designed around distributed or microservice architectures. Such a system consists of several microservices that call one another. Any service invocation can be delayed or fail, and when that happens the server may retry the call or the user may click submit several times. However many times the operation is repeated, the final data must stay consistent, for example in a payment scenario. This is what a business idempotence scheme has to guarantee.

2 | 0 What is idempotence

Idempotence is originally a mathematical concept. For example, f(n) = 1^n always equals 1, no matter what n is. In programming, idempotence means that no matter how many times an operation is applied to a resource, the effect is the same. In other words, when an interface is called repeatedly, the impact on the system is the same each time, although the return values may differ, as with queries. Idempotence covers not only operations that never affect the resource, but also operations whose first execution affects it while later executions do not. Idempotence is concerned with the effect on the resource, not with the returned result.

Take SQL as an example:

1) select * from table where id=1: idempotent. It never changes the data however many times it is executed, although the returned results may differ.

2) insert into table(id,name) values(1,'heima'): idempotent if id or name has a unique constraint, because repeated executions can insert only one record. Without a unique constraint it is not idempotent, because repeated executions produce multiple records.

3) update table set score=100 where id=1: idempotent. The effect on the data is the same no matter how many times it is executed.

4) update table set score=50+score where id=1: not idempotent. The statement involves a calculation, so every execution changes the data.

5) delete from table where id=1: idempotent. Executing it many times produces the same result.
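The difference between the update statements can be made concrete with a small sketch. The class below is a hypothetical in-memory stand-in for the table (the names `IdempotenceDemo`, `setScore`, `addScore` and `delete` are assumptions for illustration, not part of the original article); each method mirrors one of the SQL statements above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory "table" (id -> score) used only to illustrate the SQL examples.
final class IdempotenceDemo {
    private final Map<Integer, Integer> scores = new HashMap<>();

    IdempotenceDemo() {
        scores.put(1, 0); // one existing row with id = 1
    }

    // Mirrors: update table set score=100 where id=?  (idempotent)
    void setScore(int id, int value) {
        scores.computeIfPresent(id, (key, old) -> value);
    }

    // Mirrors: update table set score=score+50 where id=?  (not idempotent)
    void addScore(int id, int delta) {
        scores.computeIfPresent(id, (key, old) -> old + delta);
    }

    // Mirrors: delete from table where id=?  (idempotent)
    void delete(int id) {
        scores.remove(id);
    }

    // Returns the score, or null if the row does not exist.
    Integer score(int id) {
        return scores.get(id);
    }
}
```

Calling `setScore(1, 100)` twice leaves the score at 100, whereas calling `addScore(1, 50)` twice changes the data on each call. Calling `delete(1)` twice leaves the same final state as calling it once.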

Idempotent design is considered along two dimensions, space and time. Space defines the scope of idempotence: for example, the same order must not be generated twice. Time defines the validity period of idempotence: some services, such as orders and payments, require permanent guarantees, while others only need to be idempotent for a period of time. In addition, idempotence is usually combined with locks to solve concurrency-safety problems.

3 | 0 Interface idempotence

Idempotence has to be considered in two places: the interaction between front end and back end, and the interaction between services. Both need their own idempotent implementations. For the front-end/back-end interaction there are three main solutions: front-end deduplication, PRG mode, and the token mechanism.

3.1 Front-end deduplication

Guaranteeing idempotence on the front end is the simplest approach: it only requires setting the relevant front-end attributes and JS code. However, it is unreliable, because an experienced user can bypass the page with tools and still submit repeatedly. It is mainly suitable for preventing repeated form submissions and repeated button clicks.

3.2 PRG mode

PRG stands for Post-Redirect-Get. After the user submits a form, the server redirects to a separate success page instead of staying on the original form page. This avoids repeated submissions caused by refreshing the page, and also prevents the form from being resubmitted through the browser's forward/back buttons. It is a common front-end deduplication strategy.

3.3 Token mechanism

3.3.1) Scheme introduction

Guaranteeing idempotence with a token mechanism is a very common solution and suits most scenarios. It requires some interaction between the front end and the back end.

1) The server provides an interface for the client to obtain a token. After generating the token, the server stores it in Redis if the architecture is distributed, or in a JVM cache if it is a single-instance architecture.

2) After obtaining the token, the client sends its request carrying the token.

3) On receiving the request, the server first checks whether the token exists in Redis. If it does, the server executes the business logic and deletes the token once processing is complete. If it does not, the request is a duplicate and the corresponding identifier is returned to the client directly.

However, there is a problem: here the business logic is executed before the token is deleted. Under high concurrency, it is quite possible that the first request finds the token and starts its business operation, and before the token is deleted the client sends another request with the same token. Because the token still exists, the second request also passes the check and the business operation is performed a second time.

The solution is to turn parallel execution into serial execution, at the cost of performance and throughput. The first option is to wrap the business code and the token deletion in a thread lock, so that later threads block and queue. The second option relies on Redis being single-threaded and INCR being atomic. When the token is first issued, it is used as the key and incremented, and then returned to the client. When the client carries the token and invokes the business code, the server does not check whether the token exists and then delete it; instead it executes INCR on the token again. If the value returned by INCR is 2, the request is valid and is allowed to proceed. Any other value means the request is invalid, and it is returned directly.
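The INCR-based option can be sketched as follows. This is a minimal illustration that uses an in-memory map as a stand-in for Redis, so it shows the counting logic only; the class and method names (`IncrTokenService`, `issueToken`, `tryConsume`) are assumptions made for this example.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for the Redis INCR scheme: token -> counter.
final class IncrTokenService {
    private final ConcurrentHashMap<String, AtomicLong> store = new ConcurrentHashMap<>();

    // Issuing a token corresponds to the first INCR: the counter becomes 1.
    String issueToken() {
        String token = UUID.randomUUID().toString();
        store.put(token, new AtomicLong(1));
        return token;
    }

    // A business request increments again; only the first request sees the value 2.
    // Unknown tokens are rejected without being created. With real Redis, INCR on a
    // missing key would create it with value 1, so an existence check is assumed there too.
    boolean tryConsume(String token) {
        AtomicLong counter = store.get(token);
        return counter != null && counter.incrementAndGet() == 2;
    }
}
```

Because the increment is atomic, two concurrent requests carrying the same token can never both observe the value 2, so the business code runs at most once per token.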

What if the token is deleted before the business logic is executed? This can also cause problems. If the business code times out or fails and no clear result is returned to the client, the client may retry the request. But the token has already been deleted, so the retry is treated as a duplicate and is not processed.

This scenario needs no extra handling: one token represents exactly one request. If business execution fails, the client obtains a new token and sends a new request. Deleting the token first is the recommended approach. Either way, however, one drawback remains: every business request requires an additional request to obtain a token. In production, perhaps ten out of ten thousand requests fail or time out, so to protect those ten requests the other nine thousand nine hundred and more must each make an extra request, which is a poor trade-off. Redis performs well, but this is still a waste of resources.
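A minimal sketch of the delete-first approach is shown below. It again uses an in-memory set as a stand-in for Redis, and the names `DeleteFirstTokenService`, `issueToken` and `handle`, as well as the "OK" and "DUPLICATE" return values, are assumptions for illustration. The key point is that checking and deleting the token happen in one atomic step.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for "delete the token first, then run the business logic".
final class DeleteFirstTokenService {
    private final Set<String> tokens = ConcurrentHashMap.newKeySet();

    String issueToken() {
        String token = UUID.randomUUID().toString();
        tokens.add(token);
        return token;
    }

    // remove() is atomic: exactly one caller per token gets true.
    // If the business logic throws, the token stays consumed and the
    // client has to fetch a new token before trying again.
    String handle(String token, Runnable business) {
        if (!tokens.remove(token)) {
            return "DUPLICATE";
        }
        business.run();
        return "OK";
    }
}
```

A second call to `handle` with the same token returns "DUPLICATE" without running the business logic, which matches the rule that one token represents exactly one request.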

4 | 0 Service idempotence

4.1 Anti-duplicate table

Another way to prevent duplicate submissions is an anti-duplicate table. The idea is simple: create a dedicated table and put a unique index on one or more of its fields to serve as the anti-duplicate key, so that only one row can exist even under concurrency. Before inserting data into the business table, insert into the anti-duplicate table first. If that insert fails, the data is a duplicate.
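The flow can be sketched without a database. In the example below a concurrent set plays the role of the unique index on the anti-duplicate table, and a list plays the role of the business table; all names (`DedupTableDemo`, `createOrder`, `requestNo`) are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-memory stand-in: the set behaves like a unique index, the list like the business table.
final class DedupTableDemo {
    private final Set<String> dedupTable = ConcurrentHashMap.newKeySet();
    private final List<String> orderTable = new CopyOnWriteArrayList<>();

    // Returns false when the request number was already recorded (a "unique index violation").
    boolean createOrder(String requestNo) {
        if (!dedupTable.add(requestNo)) {
            return false; // duplicate request, business table is not touched
        }
        orderTable.add(requestNo);
        return true;
    }

    int orderCount() {
        return orderTable.size();
    }
}
```

Only the first `createOrder("r1")` inserts an order; any later call with the same request number is rejected before the business table is reached.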

One might ask why pessimistic locking is not used instead of an anti-duplicate table. The reason is that deadlocks can occur when pessimistic locks are used. Suppose pessimistic locking is implemented by locking tables. User A accesses table A (locking it) and then tries to access table B, while user B accesses table B (locking it) and then tries to access table A. Table B is locked by user B, so user A must wait for user B to release it. Table A is locked by user A, so user B must wait for user A to release it. Neither can proceed, so a deadlock has occurred.

4.2 MySQL optimistic lock guarantees idempotence

A MySQL optimistic lock is a database-based implementation of a distributed lock. It can be implemented in two ways: by version number or by condition. Both implementations rely on MySQL row locks.

Version-number control is a very common approach that works for most scenarios. In an inventory deduction scenario, however, it means that when several people purchase at the same time and all of their queries show that stock is available, only one of them succeeds in the end, which is not acceptable. In fact, it is enough to ensure that the inventory is never oversold, and this can be controlled with a condition instead.
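The difference between the two variants can be sketched with an in-memory row. The class below is an illustration only: `StockRow` and its methods are hypothetical, and the SQL in the comments assumes a table named goods with columns stock and version, which the article does not specify.

```java
// In-memory stand-in for one row of a hypothetical "goods" table.
final class StockRow {
    private int stock;
    private int version;

    StockRow(int stock) {
        this.stock = stock;
    }

    synchronized int version() {
        return version;
    }

    synchronized int stock() {
        return stock;
    }

    // Mirrors: update goods set stock=stock-?, version=version+1 where id=? and version=?
    synchronized boolean deductByVersion(int count, int expectedVersion) {
        if (version != expectedVersion) {
            return false; // someone else updated the row first
        }
        stock -= count;
        version++;
        return true;
    }

    // Mirrors: update goods set stock=stock-? where id=? and stock>=?
    synchronized boolean deductByCondition(int count) {
        if (stock < count) {
            return false; // would oversell
        }
        stock -= count;
        return true;
    }
}
```

With the version variant, two buyers who both read version 0 cannot both succeed even when stock is sufficient. With the condition variant, every buyer succeeds as long as enough stock remains.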

MySQL optimistic locking is better suited to tables that need counting, and is recommended when contention is low and concurrent conflicts are unlikely. Although concurrency control can be achieved with MySQL optimistic locks, the locking acts directly on the database and affects its performance to some extent. In addition, the number of MySQL connections is limited. If a large number of lock operations occupy connections, MySQL becomes a performance bottleneck.

4.3 ZooKeeper Distributed Lock

Implementation idea

ZooKeeper has built-in features that make it well suited to implementing distributed locks. The implementation relies mainly on znode types and the watch mechanism.

ZooKeeper nodes are divided into four types:

**Persistent nodes:** Once created, they exist in ZooKeeper permanently unless deleted manually.

**Persistent ordered nodes:** Once created, they exist in ZooKeeper permanently unless deleted manually. Each node also carries a sequence number, and the sequence numbers increase in order, for example demo000001, demo000002 … demo00000N.

**Temporary (ephemeral) nodes:** After creation, they are deleted automatically when the client session that created them ends, for example when that client disconnects or goes down.

**Temporary ordered (ephemeral sequential) nodes:** They are deleted automatically when the client session that created them ends. Each node also carries a sequence number, and the sequence numbers increase in order, for example demo000001, demo000002 … demo00000N.

Watch monitoring mechanism

The watch mechanism is used to monitor changes in node state and trigger follow-up actions. Suppose B watches node A: once node A is modified or deleted, or its list of child nodes changes, B is notified of the change and can then carry out further work.

Implementation principle

The idea is as follows. When a thread wants to lock a method, a parent node corresponding to that method is first created in ZooKeeper. Every thread that wants the lock then creates a temporary ordered node under this parent node. Because sequence numbers increase, threads that request the lock later get larger sequence numbers. The node with the smallest sequence number therefore belongs to the thread that requested the lock first, so that node is designated as the lock holder. Each thread that wants the lock checks whether its own node has the smallest sequence number, and if so, it holds the lock. To release the lock, a thread simply deletes its own temporary ordered node.

Under concurrency, each thread creates its own temporary ordered node under the method's parent node. So how does ZooKeeper hand the lock to the threads in order? This is where the watch mechanism comes in. Each newly created temporary node watches the node immediately before it and waits for its notification. When that previous node is deleted, the current node holds the lock, and so on down the queue.

Advantages and disadvantages

1) ZooKeeper follows the CP model, which guarantees strong data consistency.

2) The watch mechanism provides automatic notification of lock release, and lock operations perform well.

3) Nodes are created frequently, which puts considerable pressure on the ZooKeeper servers, and throughput is lower than with Redis.

Principle analysis

Inefficient locking approach

There is another way to implement a distributed lock with ZooKeeper. It is also very common, but it is inefficient, so it is worth discussing here.

In this approach there is only one lock node. A thread tries to create the lock node: if the node does not exist, creation succeeds and the thread has acquired the lock. If creation fails, another thread already holds the lock, so the thread watches the lock node for its release. When the lock node is released, the waiting thread tries to create it again in order to acquire the lock.

This scheme is inefficient because there is only one lock node and all waiting threads watch it. Once the lock node is released, every waiting thread is notified and competes to create the lock node. Such a burst of notifications seriously degrades ZooKeeper performance. A large number of notifications triggered by a change to one watched znode is called the herd effect.

Efficient locking approach

To avoid the herd effect, a common solution in the industry is to queue the threads that want the lock, with each one watching only the one before it, in order. This is the recommended way to implement a distributed lock.

In this process, a temporary ordered node is created under the root node for every thread waiting to acquire the lock. The node with the smallest sequence number holds the lock, and every other node watches only the node directly in front of it, so the lock is handed over in an orderly and efficient way.
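The ordering rule above can be isolated into a small pure function. The sketch below takes the list of child node names and the caller's own node name, and returns either nothing (the caller holds the lock) or the name of the single node to watch. The class name `SequentialLockOrder` and the node names used in the usage note are assumptions for illustration; no ZooKeeper client is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Decides, from the sorted children of the lock's parent node, who holds the lock
// and which predecessor each waiter should watch.
final class SequentialLockOrder {

    // Returns Optional.empty() if 'self' has the smallest sequence number (it holds the lock),
    // otherwise the name of the node immediately before 'self'.
    static Optional<String> nodeToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // zero-padded sequence numbers sort lexicographically
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("unknown node: " + self);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

For children lock0000000001, lock0000000002 and lock0000000003, the first node gets an empty result and holds the lock, while lock0000000003 is told to watch lock0000000002 only. Releasing one lock therefore wakes exactly one waiter.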

Code implementation

    // AbstractLock.java
    import org.I0Itec.zkclient.ZkClient;

    public abstract class AbstractLock {

        // ZooKeeper server address
        public static final String ZK_SERVER_ADDR = "192.168.200.131:2181";

        // ZooKeeper timeouts, in milliseconds
        public static final int CONNECTION_TIME_OUT = 30000;
        public static final int SESSION_TIME_OUT = 30000;

        // ZooKeeper client
        protected ZkClient zkClient = new ZkClient(ZK_SERVER_ADDR, SESSION_TIME_OUT, CONNECTION_TIME_OUT);

        /** Try to acquire the lock; returns true on success. */
        public abstract boolean tryLock();

        /** Block until the lock may be available again. */
        public abstract void waitLock();

        /** Release the lock. */
        public abstract void releaseLock();

        public void getLock() {
            String threadName = Thread.currentThread().getName();
            if (tryLock()) {
                System.out.println(threadName + ": lock succeeded");
            } else {
                System.out.println(threadName + ": lock failed, waiting");
                // Wait for the lock, then try again
                waitLock();
                getLock();
            }
        }
    }

    // HighLock.java
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;
    import org.I0Itec.zkclient.IZkDataListener;

    public class HighLock extends AbstractLock {

        private static final String PARENT_NODE_PATH = "/high_lock";

        // Path of the node created by the current thread
        private String currentNodePath;

        // Path of the node directly in front of the current node
        private String preNodePath;

        private CountDownLatch countDownLatch;

        @Override
        public boolean tryLock() {
            // Create the parent node if it does not exist yet
            if (!zkClient.exists(PARENT_NODE_PATH)) {
                zkClient.createPersistent(PARENT_NODE_PATH);
            }

            // Create this thread's temporary ordered node once
            if (currentNodePath == null || "".equals(currentNodePath)) {
                currentNodePath = zkClient.createEphemeralSequential(PARENT_NODE_PATH + "/", "lock");
            }

            List<String> childrenNodeList = zkClient.getChildren(PARENT_NODE_PATH);
            // Ascending sort
            Collections.sort(childrenNodeList);

            if (currentNodePath.equals(PARENT_NODE_PATH + "/" + childrenNodeList.get(0))) {
                // The current node has the smallest sequence number: lock acquired
                return true;
            } else {
                // Not the smallest: remember the node directly in front of the current one
                int length = PARENT_NODE_PATH.length();
                int currentNodeNumber = Collections.binarySearch(childrenNodeList, currentNodePath.substring(length + 1));
                preNodePath = PARENT_NODE_PATH + "/" + childrenNodeList.get(currentNodeNumber - 1);
            }
            return false;
        }

        @Override
        public void waitLock() {
            // Create the latch before subscribing so that a deletion event cannot be missed
            countDownLatch = new CountDownLatch(1);

            IZkDataListener zkDataListener = new IZkDataListener() {
                @Override
                public void handleDataChange(String dataPath, Object data) throws Exception {
                }

                @Override
                public void handleDataDeleted(String dataPath) throws Exception {
                    if (countDownLatch != null) {
                        countDownLatch.countDown();
                    }
                }
            };

            // Watch the previous node for changes
            zkClient.subscribeDataChanges(preNodePath, zkDataListener);
            if (zkClient.exists(preNodePath)) {
                try {
                    countDownLatch.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            zkClient.unsubscribeDataChanges(preNodePath, zkDataListener);
        }

        @Override
        public void releaseLock() {
            zkClient.delete(currentNodePath);
            zkClient.close();
        }
    }

4.4 Redis Distributed Lock

Principle & Implementation

A key property of a distributed lock is mutual exclusion: when multiple callers compete at the same time, only one of them can acquire the lock. Redis executes commands on a single thread, so concurrent requests from callers are queued and only one request can obtain the lock. The following operations are used:

setNx(): saves a key-value pair to Redis. The set succeeds only if the key does not exist; otherwise 0 is returned. This provides mutual exclusion.

expire(): sets an expiration time on the key to avoid deadlocks.

delete(): deletes the key to release the lock.

1) Write a utility class that acquires the lock through jedis.set. A return value of OK means the lock was acquired. If locking fails, the thread spins and keeps trying to acquire the lock. requestId identifies the lock currently held by each thread.

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPool;
    import redis.clients.jedis.params.SetParams;

    public class SingleRedisLock {

        JedisPool jedisPool = new JedisPool("192.168.200.128", 6379);

        // Lock expiration time, in milliseconds
        protected long internalLockLeaseTime = 30000;

        // Maximum time to wait for the lock, in milliseconds
        private long timeout = 999999;

        // SET ... NX PX <lease>: set only if absent, with an expiration time
        SetParams setParams = SetParams.setParams().nx().px(internalLockLeaseTime);

        /**
         * Lock.
         *
         * @param lockKey   lock key
         * @param requestId unique identifier of the request
         * @return true if the lock was acquired, false on timeout
         */
        public boolean tryLock(String lockKey, String requestId) {
            String threadName = Thread.currentThread().getName();
            Jedis jedis = this.jedisPool.getResource();
            long start = System.currentTimeMillis();
            try {
                for (;;) {
                    String lockResult = jedis.set(lockKey, requestId, setParams);
                    if ("OK".equals(lockResult)) {
                        System.out.println(threadName + ": lock succeeded");
                        return true;
                    }
                    System.out.println(threadName + ": lock failed, waiting");
                    // Give up once the elapsed time exceeds the timeout
                    long elapsed = System.currentTimeMillis() - start;
                    if (elapsed >= timeout) {
                        return false;
                    }
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            } finally {
                jedis.close();
            }
        }
    }

When unlocking, a thread must not release a lock held by someone else. Suppose thread A acquires the lock. After a while thread A tries to unlock, but by then its lock has already expired. Meanwhile thread B has tried to lock and, because thread A's lock had expired, succeeded. Thread A's unlock would now release thread B's lock. To solve this, a requestId is stored at lock time. When unlocking, check whether the value stored under the lock key equals the requestId passed in. If they are equal, the caller is the holder and may unlock; otherwise it may not. Many people implement this by querying the value, comparing it, and then deleting the key. This is on the right track but ignores atomicity: if the check and the deletion are two separate steps, atomicity is not guaranteed and the same problem can occur. Unlocking must therefore both verify the holder and be atomic, so a Lua script is used to perform the query and the delete together.

    /**
     * Unlock (a method of SingleRedisLock; requires java.util.Collections).
     *
     * @param lockKey   lock key
     * @param requestId unique identifier of the request
     * @return true if the lock was held by requestId and has been released
     */
    public boolean releaseLock(String lockKey, String requestId) {
        String threadName = Thread.currentThread().getName();
        System.out.println(threadName + ": release lock");
        Jedis jedis = this.jedisPool.getResource();
        // Compare the stored value with requestId and delete the key in one atomic script
        String lua =
                "if redis.call('get',KEYS[1]) == ARGV[1] then" +
                "   return redis.call('del',KEYS[1]) " +
                "else" +
                "   return 0 " +
                "end";
        try {
            Object result = jedis.eval(lua, Collections.singletonList(lockKey),
                    Collections.singletonList(requestId));
            if ("1".equals(result.toString())) {
                return true;
            }
            return false;
        } finally {
            jedis.close();
        }
    }

Test class

    import java.util.UUID;
    import java.util.concurrent.TimeUnit;

    public class LockTest {

        public static void main(String[] args) {
            // Start five threads that compete for the same lock
            for (int i = 0; i < 5; i++) {
                Thread thread = new Thread(new LockRunnable());
                thread.start();
            }
        }

        private static class LockRunnable implements Runnable {
            @Override
            public void run() {
                SingleRedisLock singleRedisLock = new SingleRedisLock();
                String requestId = UUID.randomUUID().toString();
                boolean lockResult = singleRedisLock.tryLock("lock", requestId);
                if (lockResult) {
                    try {
                        // Simulate business processing
                        TimeUnit.SECONDS.sleep(5);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                // Harmless if the lock was not acquired: the script compares requestId first
                singleRedisLock.releaseLock("lock", requestId);
            }
        }
    }

As the output shows, multiple threads compete for the same lock, and a thread that fails to acquire it spins and keeps retrying. Each time a thread releases the lock, another thread acquires it, and so on.

Existing problems

Lock renewal

When a business operation is locked, the lock expiration time must not be set arbitrarily. Suppose thread A acquires the lock and sets an expiration time, but the business operation runs longer than that. The lock is then released automatically before the operation finishes, another thread can acquire the lock, and the same business logic is executed again while thread A is still running. The lock timeout must therefore be chosen according to the business execution time, so that the lock expires later than the operation completes. This is only a basic solution and still has problems: too many factors affect execution time to determine an exact value, so it can only be estimated, and there is no 100% guarantee that the lock is held by only one thread while the operation runs. To guarantee this, a daemon thread can be created when the lock is acquired, running a scheduled task that extends the expiration time of the still-held lock at regular intervals. When the business operation completes and the lock is released, the daemon thread is stopped. This idea solves the lock renewal problem.
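The renewal idea can be sketched as follows. This is an in-memory illustration rather than a Redis implementation: `RenewableLease` and `startWatchdog` are hypothetical names, the lease is just a timestamp, and a real version would extend the key's expiration in Redis instead.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// In-memory model of a lock lease that a watchdog keeps extending while the lock is held.
final class RenewableLease {
    private final long leaseMillis;
    private long expiresAt;
    private boolean released;

    RenewableLease(long nowMillis, long leaseMillis) {
        this.leaseMillis = leaseMillis;
        this.expiresAt = nowMillis + leaseMillis;
    }

    // Extends the lease only if the lock is still held and has not expired yet.
    synchronized boolean renew(long nowMillis) {
        if (released || nowMillis >= expiresAt) {
            return false;
        }
        expiresAt = nowMillis + leaseMillis;
        return true;
    }

    synchronized void release() {
        released = true;
    }

    synchronized boolean isHeld(long nowMillis) {
        return !released && nowMillis < expiresAt;
    }

    // Starts a daemon thread that renews the lease at a fixed period.
    // The caller shuts the returned scheduler down after releasing the lock.
    static ScheduledExecutorService startWatchdog(RenewableLease lease, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-watchdog");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(
                () -> lease.renew(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

The renewal period should be clearly shorter than the lease (one third of the lease is a common choice, used here only as an assumption), so that a renewal always arrives before the lock expires.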

Service single point & cluster problem

Redis can perform the locking operation on a single node, but once that Redis node fails, locking is no longer available. In production, Redis is usually deployed as primary/secondary with asynchronous replication to ensure high availability. When the primary node writes data successfully, the data is replicated to the secondary node asynchronously, and when the primary node goes down, the secondary node is promoted to primary and continues working. Now suppose the primary node writes the lock successfully but goes down before the data is replicated to the secondary node. The secondary node that is promoted to primary has no lock information, so other threads can acquire the lock, and mutual exclusion fails.

4.5 Redisson Distributed Lock

Redisson is a third-party library recommended by Redis for implementing distributed locks. Its implementation is powerful and covers many kinds of locks, and it is simple to use, letting the user concentrate on business logic. Redisson is used here to solve the two problems of the single-node lock described above.

Standalone Redisson implementation

Dependencies

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-pool2</artifactId>
    </dependency>
    <!-- Redis distributed lock -->
    <dependency>
        <groupId>org.redisson</groupId>
        <artifactId>redisson-spring-boot-starter</artifactId>
        <version>3.13.1</version>
    </dependency>

Configuration file

    spring:
      redis:
        host: 192.168.200.150
        port: 6379
        database: 0
        jedis:
          pool:
            max-active: 500
            max-idle: 1000
            min-idle: 4

Startup class

    @Value("${spring.redis.host}")
    private String host;

    @Value("${spring.redis.port}")
    private String port;

    @Bean
    public RedissonClient redissonClient() {
        RedissonClient redissonClient;
        Config config = new Config();
        String url = "redis://" + host + ":" + port;
        config.useSingleServer().setAddress(url);
        try {
            redissonClient = Redisson.create(config);
            return redissonClient;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

Lock utility

    import java.util.concurrent.TimeUnit;
    import org.redisson.api.RLock;
    import org.redisson.api.RedissonClient;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    @Component
    public class RedissonLock {

        @Autowired
        private RedissonClient redissonClient;

        /** Lock. */
        public boolean addLock(String lockKey) {
            try {
                if (redissonClient == null) {
                    System.out.println("redisson client is null");
                    return false;
                }
                RLock lock = redissonClient.getLock(lockKey);
                // Set the lock timeout to 5 seconds
                lock.lock(5, TimeUnit.SECONDS);
                System.out.println(Thread.currentThread().getName() + ": lock acquired");
                return true;
            } catch (Exception e) {
                e.printStackTrace();
                return false;
            }
        }

        /** Release the lock. */
        public boolean releaseLock(String lockKey) {
            try {
                if (redissonClient == null) {
                    System.out.println("redisson client is null");
                    return false;
                }
                RLock lock = redissonClient.getLock(lockKey);
                lock.unlock();
                System.out.println(Thread.currentThread().getName() + ": release lock");
                return true;
            } catch (Exception e) {
                e.printStackTrace();
                return false;
            }
        }
    }

Test class

    import java.io.IOException;
    import java.util.concurrent.TimeUnit;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.test.context.junit4.SpringRunner;

    @SpringBootTest
    @RunWith(SpringRunner.class)
    public class RedissonLockTest {

        @Autowired
        private RedissonLock redissonLock;

        @Test
        public void easyLock() {
            // Start ten threads that compete for the same lock
            for (int i = 0; i < 10; i++) {
                Thread thread = new Thread(new LockRunnable());
                thread.start();
            }
            try {
                // Keep the test alive until a key is pressed
                System.in.read();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private class LockRunnable implements Runnable {
            @Override
            public void run() {
                redissonLock.addLock("demo");
                try {
                    // Simulate business processing
                    TimeUnit.SECONDS.sleep(3);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                redissonLock.releaseLock("demo");
            }
        }
    }

The output shows that when multiple threads try to acquire the lock concurrently and one thread obtains it, the other threads cannot obtain it and keep trying internally. When the thread holding the lock releases it, the other threads continue to compete for it.

Source code analysis

lock() source code analysis

Once the RLock object has been obtained, its lock() method is called to perform the locking operation. According to the description in the source code, a thread that fails to acquire the lock spins until it succeeds. Once a lock is acquired, it is held until it is released manually by calling unlock() or automatically according to the leaseTime passed in. Two parameters are passed here: the lock timeout and its unit of time. The timeout is used to avoid deadlocks: if the node holding the lock goes down, the lock is released automatically after it expires.

This lock() method in turn calls another overload of lock(), passing three parameters: the expiration time, the unit of time, and whether the wait can be interrupted.

In the three-argument overload of lock(), the current thread ID is obtained first, and tryAcquire() is then called to attempt to acquire the lock. If it returns null, the lock has been acquired. If the return value is not null, an asynchronous task is created based on the current thread ID and placed in the thread pool, and the thread then spins. During the spin, tryAcquire() is called again. If the lock is acquired, the spin exits; otherwise the thread keeps trying.

At the heart of lock() is tryAcquire(), whose core implementation calls tryAcquireAsync(), passing in the expiration time, the unit of time, and the current thread ID. If leaseTime is not -1, a lease time has been specified and tryLockInnerAsync() is called directly to acquire the lock. If leaseTime is -1, no fixed expiration was requested: a default of 30 seconds is used and an asynchronous task is registered on the result. If the lock was not acquired, the task does nothing. If it was acquired, scheduleExpirationRenewal() is called to keep extending the lock held by the current thread ID.

Finally, tryLockInnerAsync() is the concrete implementation of lock acquisition. Internally, the lock is acquired with a Lua script. Because acquiring the lock involves several steps, Lua is used to make the whole process atomic. The most important thing is to understand how the Lua script executes.

In this Lua script, KEYS[1] is the key to be locked, ARGV[1] is the lock timeout, and ARGV[2] is the unique identifier of the lock holder. In short, the script does the following:

1) If the lock key does not exist, set the lock key with the unique identifier and an initial value of 1, and set the expiration time of the lock key.

2) If the lock key exists and the unique identifier matches, the lock is held by the current thread, so the reentrant count is increased by 1 and the expiration time is set again.

3) Otherwise, the lock is held by another thread, and the remaining time to live of the lock key is returned in milliseconds.

unlock() source code analysis

To release a lock, unlock() internally calls unlockAsync() to release the lock held by the current thread. This eventually executes unlockInnerAsync(), which releases the lock and returns the result.

unlockInnerAsync() is also implemented with a Lua script. Its parameters are:

KEYS[1]: the current lock key.

KEYS[2]: the channel name of the Redis message. Each lock has its own channel name.

ARGV[1]: the Redis message body, which indicates that the lock key has been unlocked and notifies other threads so they can apply for the lock.

ARGV[2]: the lock timeout.

ARGV[3]: the unique identifier of the lock holder.

1) Check whether the lock key and the unique identifier match. If they do not match, the lock is not held by the current thread, and the script returns directly.

2) If the current thread holds the lock, the reentrant count is decremented by 1.

3) If the count after decrementing is greater than 0, the lock is still held, and its expiration time is set again.

4) If the count after decrementing is 0, the lock key is deleted and a message announcing the release is published, notifying other threads so they can apply for the lock.
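The reentrant counting described in the lock and unlock steps can be modeled in a few lines. The sketch below is an in-memory illustration under stated assumptions: `ReentrantLockModel` is a hypothetical class, it keeps only the holder identifier and the hold count, and it leaves out expiration and the publish/subscribe notification that the real scripts perform.

```java
// In-memory model of the holder identifier and reentrant count kept under the lock key.
final class ReentrantLockModel {
    private String owner;   // unique identifier of the holder, null when the lock is free
    private int holdCount;  // number of times the holder has locked without unlocking

    // Returns true if the lock was acquired or re-entered by the same holder.
    synchronized boolean tryAcquire(String id) {
        if (owner == null) {
            owner = id;
            holdCount = 1;
            return true;
        }
        if (owner.equals(id)) {
            holdCount++;
            return true;
        }
        return false; // held by someone else
    }

    // Returns null if the caller is not the holder, FALSE if the lock is still held
    // after decrementing, and TRUE if the lock was fully released.
    synchronized Boolean release(String id) {
        if (owner == null || !owner.equals(id)) {
            return null;
        }
        holdCount--;
        if (holdCount > 0) {
            return Boolean.FALSE;
        }
        owner = null;
        return Boolean.TRUE;
    }
}
```

A holder that locks twice has to unlock twice before another identifier can acquire the lock, which is the behavior the four unlock steps describe.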

Lock renewal

The lock renewal problem was introduced in the single-node Redis lock implementation: renewal prevents business logic from being executed repeatedly because of a timeout or downtime. The analysis of the lock method shows that when an expiration time is passed in, the expiration time of the lock is fixed and is never extended. After the lock expires, it is released automatically. Locking with this form of lock() can therefore leave hidden risks for business execution.

Red lock

When a Redis lock is implemented in a single point of Redis, it cannot be locked once the Redis server goes down. Redis is therefore considered as a master-slave structure, but in a master-slave structure, data replication is implemented asynchronously. Suppose that in a master-slave structure, the master asynchronously copies data to the slave. Once a thread holds the lock, the master breaks down before the data is copied to the slave. The slave is promoted to master, but the master that is promoted to slave does not have the previous thread’s lock information, and other threads can be re-locked

Redlock is a distributed lock algorithm based on multiple independent Redis nodes, designed to address the Redis single point of failure described above. Deploying five Redis servers is the recommended setup for the algorithm. The Redis official website also describes the Redlock algorithm in detail. Address: redis.io/topics/dist… .

The whole process is divided into five steps:

1) Record the current time before acquiring the lock.

2) Try to acquire the lock on every Redis instance using the same key and value, with a per-instance acquisition timeout much shorter than the lock's automatic release time. If the automatic release time is 10 seconds, the acquisition timeout should be in the range of 5 to 50 ms. This keeps the client from waiting too long on an instance that is down: if one instance is unavailable, the client moves on to the next.

3) Subtract the time recorded in step 1 from the time at which all instances have been tried. The resulting elapsed time must be less than the automatic release time, otherwise the lock may already have expired. In addition, the lock counts as acquired only if more than half of the Redis instances granted it. Without a majority, several clients could hold the lock at the same time, which defeats the lock.

4) Once the lock is acquired, its real validity time is: expiration time - elapsed time from step 3.

5) If the client fails to acquire the lock, it releases the lock on all Redis instances. To acquire locks efficiently, a retry policy can be set so that the client tries again after a delay, but not endlessly: cap the number of retries.
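The five steps can be sketched against in-memory stand-ins for the Redis instances. This is a simplified illustration under stated assumptions, not a production implementation: RedlockSketch and Instance are hypothetical names, the per-instance timeout of step 2 and the key expiration inside each instance are not modeled, and an unavailable instance simply refuses the request.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of the five Redlock steps; illustrative only. */
class RedlockSketch {
    /** Stand-in for one independent Redis instance. */
    static class Instance {
        private final Map<String, String> locks = new HashMap<>();
        boolean available = true;

        boolean tryAcquire(String key, String value) {
            return available && locks.putIfAbsent(key, value) == null;
        }

        /** Removes the key only if it still carries this client's value. */
        void release(String key, String value) {
            locks.remove(key, value);
        }
    }

    /** Returns the remaining validity in milliseconds, or -1 if the lock was not acquired. */
    static long tryLock(List<Instance> instances, String key, String value, long ttlMillis) {
        long start = System.currentTimeMillis();           // step 1: record the start time
        int acquired = 0;
        for (Instance instance : instances) {              // step 2: same key and value everywhere
            if (instance.tryAcquire(key, value)) {
                acquired++;
            }
        }
        long elapsed = System.currentTimeMillis() - start; // step 3: time spent acquiring
        long validity = ttlMillis - elapsed;               // step 4: real validity time
        int quorum = instances.size() / 2 + 1;             // more than half of the instances
        if (acquired >= quorum && validity > 0) {
            return validity;
        }
        for (Instance instance : instances) {              // step 5: release everywhere on failure
            instance.release(key, value);
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Instance> instances = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            instances.add(new Instance());
        }
        instances.get(0).available = false;                // two of five instances are down
        instances.get(1).available = false;

        long validityA = tryLock(instances, "demo", "clientA", 10_000);
        long validityB = tryLock(instances, "demo", "clientB", 10_000);
        System.out.println("clientA acquired: " + (validityA > 0));
        System.out.println("clientB acquired: " + (validityB > 0));
    }
}
```

Releasing by value in step 5 matters: clientB's cleanup must not delete the entries that clientA holds on the same instances.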

Although Redlock handles the Redis single-point problem more effectively, a hidden danger remains. Suppose persistence is not enabled and all Redis instances restart after clientA has obtained the lock: clientA's lock record disappears, and clientB can acquire the same lock. The odds of this are extremely low, but it cannot be ruled out.

One safeguard is to enable AOF persistence, but pay attention to the fsync policy. With fsync every second (everysec), a restart within that second still loses the data. With always, every write is synced, which causes a sharp drop in performance.

The recommendation is to keep the default AOF policy, fsync every second, and after a Redis instance stops, restart it only once the lock TTL has elapsed. The drawback is that this instance cannot serve requests during the TTL.

Implementation

Redisson already provides a complete implementation of Redlock; the whole operation can be carried out through its API (RedissonRedLock).

import org.redisson.Redisson;
import org.redisson.RedissonRedLock;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RedissonRedLockConfig {

    // Builds a RedLock for lockKey over five independent single-server Redis instances
    public RedissonRedLock initRedissonClient(String lockKey) {
        Config config1 = new Config();
        config1.useSingleServer().setAddress("redis://192.168.200.150:7000").setDatabase(0);
        RedissonClient redissonClient1 = Redisson.create(config1);

        Config config2 = new Config();
        config2.useSingleServer().setAddress("redis://192.168.200.150:7001").setDatabase(0);
        RedissonClient redissonClient2 = Redisson.create(config2);

        Config config3 = new Config();
        config3.useSingleServer().setAddress("redis://192.168.200.150:7002").setDatabase(0);
        RedissonClient redissonClient3 = Redisson.create(config3);

        Config config4 = new Config();
        config4.useSingleServer().setAddress("redis://192.168.200.150:7003").setDatabase(0);
        RedissonClient redissonClient4 = Redisson.create(config4);

        Config config5 = new Config();
        config5.useSingleServer().setAddress("redis://192.168.200.150:7004").setDatabase(0);
        RedissonClient redissonClient5 = Redisson.create(config5);

        // One RLock per instance, all using the same key
        RLock rLock1 = redissonClient1.getLock(lockKey);
        RLock rLock2 = redissonClient2.getLock(lockKey);
        RLock rLock3 = redissonClient3.getLock(lockKey);
        RLock rLock4 = redissonClient4.getLock(lockKey);
        RLock rLock5 = redissonClient5.getLock(lockKey);

        RedissonRedLock redissonRedLock = new RedissonRedLock(rLock1, rLock2, rLock3, rLock4, rLock5);
        return redissonRedLock;
    }
}

Test class

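import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.redisson.RedissonRedLock;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

@SpringBootTest
@RunWith(SpringRunner.class)
public class RedLockTest {

    @Autowired
    private RedissonRedLockConfig redissonRedLockConfig;

    @Test
    public void easyLock() {
        // Start 10 threads that compete for the same lock
        for (int i = 0; i < 10; i++) {
            Thread thread = new Thread(new RedLockTest.RedLockRunnable());
            thread.start();
        }
        try {
            // Block the test thread so the worker threads can finish
            System.in.read();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private class RedLockRunnable implements Runnable {
        @Override
        public void run() {
            RedissonRedLock redissonRedLock = redissonRedLockConfig.initRedissonClient("demo");
            boolean lockResult = false;
            try {
                // Wait up to 100 seconds for the lock; hold it for at most 10 seconds
                lockResult = redissonRedLock.tryLock(100, 10, TimeUnit.SECONDS);
                if (lockResult) {
                    System.out.println("lock succeeded");
                    TimeUnit.SECONDS.sleep(3);
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                // Only unlock if this thread actually acquired the lock
                if (lockResult) {
                    redissonRedLock.unlock();
                    System.out.println("release lock");
                }
            }
        }
    }
}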

RedissonRedLock locking source code analysis

public boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException {
    long newLeaseTime = -1;
    if (leaseTime != -1) {
        newLeaseTime = unit.toMillis(waitTime) * 2;
    }

    long time = System.currentTimeMillis();
    long remainTime = -1;
    if (waitTime != -1) {
        remainTime = unit.toMillis(waitTime);
    }
    long lockWaitTime = calcLockWaitTime(remainTime);

    /** 1. Limit on the number of nodes allowed to fail locking: N - (N/2 + 1) */
    int failedLocksLimit = failedLocksLimit();
    /** 2. Iterate over all nodes and try to lock each one */
    List<RLock> acquiredLocks = new ArrayList<>(locks.size());
    for (ListIterator<RLock> iterator = locks.listIterator(); iterator.hasNext();) {
        RLock lock = iterator.next();
        boolean lockAcquired;
        /** 3. Try to lock the current node */
        try {
            if (waitTime == -1 && leaseTime == -1) {
                lockAcquired = lock.tryLock();
            } else {
                long awaitTime = Math.min(lockWaitTime, remainTime);
                lockAcquired = lock.tryLock(awaitTime, newLeaseTime, TimeUnit.MILLISECONDS);
            }
        } catch (RedisResponseTimeoutException e) {
            // The lock may have been set even though the response timed out,
            // so unlock this node to be safe
            unlockInner(Arrays.asList(lock));
            lockAcquired = false;
        } catch (Exception e) {
            lockAcquired = false;
        }

        if (lockAcquired) {
            /** 4. Locking succeeded: remember the node */
            acquiredLocks.add(lock);
        } else {
            /**
             * 5. Locking this node failed. The Redlock algorithm requires at least
             * N/2 + 1 nodes to be locked successfully. If that many locks are already
             * held, i.e. the nodes not locked equal the allowed failure count
             * N - (N/2 + 1), there is no need to try the remaining nodes.
             */
            if (locks.size() - acquiredLocks.size() == failedLocksLimit()) {
                break;
            }

            // No failures left to spare: release what was acquired, then fail or start over
            if (failedLocksLimit == 0) {
                unlockInner(acquiredLocks);
                if (waitTime == -1 && leaseTime == -1) {
                    return false;
                }
                failedLocksLimit = failedLocksLimit();
                acquiredLocks.clear();
                // reset iterator
                while (iterator.hasPrevious()) {
                    iterator.previous();
                }
            } else {
                failedLocksLimit--;
            }
        }

        /**
         * 6. Subtract the time spent on this node from the remaining wait time.
         * If no wait time is left, release the acquired locks and return false.
         */
        if (remainTime != -1) {
            remainTime -= System.currentTimeMillis() - time;
            time = System.currentTimeMillis();
            if (remainTime <= 0) {
                unlockInner(acquiredLocks);
                return false;
            }
        }
    }

    // Reset the expiration of every acquired lock to the requested leaseTime
    if (leaseTime != -1) {
        List<RFuture<Boolean>> futures = new ArrayList<>(acquiredLocks.size());
        for (RLock rLock : acquiredLocks) {
            RFuture<Boolean> future = ((RedissonLock) rLock).expireAsync(unit.toMillis(leaseTime), TimeUnit.MILLISECONDS);
            futures.add(future);
        }

        for (RFuture<Boolean> rFuture : futures) {
            rFuture.syncUninterruptibly();
        }
    }

    /** 7. All the logic above succeeded, so the lock is acquired: return true */
    return true;
}
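The numbers that drive this method are easier to follow with concrete values. The sketch below reproduces only the arithmetic: the failure limit N - (N/2 + 1) comes from the comments in the source above, while the per-node wait time formula (remaining wait time divided by the node count, at least 1 ms) is an assumption about how calcLockWaitTime is overridden for the red lock. The class and its static helpers are hypothetical and take the node count as a plain int.

```java
/** Arithmetic behind the red lock's quorum and per-node wait time; illustrative only. */
class RedLockArithmeticSketch {
    /** Minimum number of nodes that must be locked: N/2 + 1. */
    static int minLocksAmount(int nodeCount) {
        return nodeCount / 2 + 1;
    }

    /** Number of nodes allowed to fail: N - (N/2 + 1). */
    static int failedLocksLimit(int nodeCount) {
        return nodeCount - minLocksAmount(nodeCount);
    }

    /** Assumed per-node wait time: remaining wait time split evenly, at least 1 ms. */
    static long calcLockWaitTime(long remainTimeMillis, int nodeCount) {
        return Math.max(remainTimeMillis / nodeCount, 1);
    }

    public static void main(String[] args) {
        int nodeCount = 5;
        long waitTimeMillis = 100_000; // tryLock(100, 10, TimeUnit.SECONDS) from the test class

        System.out.println(minLocksAmount(nodeCount));                   // 3
        System.out.println(failedLocksLimit(nodeCount));                 // 2
        System.out.println(calcLockWaitTime(waitTimeMillis, nodeCount)); // 20000
        // Temporary lease used while acquiring: waitTime * 2, as in newLeaseTime above
        System.out.println(waitTimeMillis * 2);                          // 200000
    }
}
```

With five nodes, the lock is granted once three nodes are locked, and up to two nodes may fail before the attempt is abandoned or restarted.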