Written at the front

Recently, many readers have told me that when learning high-concurrency programming they do not quite understand what problem a distributed lock is meant to solve, and some do not even understand what a distributed lock is. Why do problems appear as soon as you use your home-grown distributed lock in production? Why does the same program become several orders of magnitude slower once a distributed lock is added? Today we will talk about how to implement distributed locks in a high-concurrency environment. Not every lock can support high concurrency.

Not all locks are distributed locks!!

So which locks best support high-concurrency scenarios? Today we will demystify a typical distributed lock architecture for high-concurrency environments, combine it with the other articles in the “high-concurrency” series, and put it into practice.

What problem does a lock solve?

Have you ever wondered, in the applications and highly concurrent programs we write, why we need locks at all? What problem does locking solve for us?

In many business scenarios, the applications we write face heavy contention for shared resources. To resolve this resource contention, we introduce locks into high-concurrency programs.

E-commerce oversold problem

Let's start with a simple business scenario. In an e-commerce (mall) system, when a user submits an order to buy a commodity, the first step is to check whether the inventory of that commodity is sufficient; only if it is can the order be placed successfully. When the order is placed, we subtract the ordered quantity from the inventory and write the updated inventory back to the database. The whole process can be simplified as shown below.

Many readers have also asked me to provide code so they can learn and master the relevant knowledge more easily. So here is the corresponding snippet. The following code represents the user placing an order; the inventory of the commodity is stored in Redis.

@RequestMapping("/submitOrder")
public String submitOrder(){
    int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
    if(stock > 0){
        stock -= 1;
        stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
        logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
    }else{
        logger.debug("Inventory shortage, inventory reduction failed.");
        throw new OrderException("Inventory shortage, inventory reduction failed.");
    }
    return "success";
}

Note: the code snippet above is fairly simple and is written this way only for ease of demonstration; you should not write code like this in a real project.

The above code looks fine, but you cannot judge it just by reading it from top to bottom, because code in the JVM is not necessarily executed in the order in which we wrote it. And even if it were, the interface we expose is hit by thousands of clients at the same time, so it is accessed concurrently.

Is the above code thread-safe in a high-concurrency environment? Certainly not, because the inventory deduction above is a read-modify-write sequence executed concurrently by multiple threads.
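To make the race concrete, here is a minimal, self-contained sketch (plain Java, no Redis or Spring) that mimics the read-deduct-write sequence above with three concurrent "orders"; the class and variable names are made up purely for illustration.

import java.util.concurrent.CountDownLatch;

public class OversoldDemo {
    // Shared "inventory", analogous to the value stored under the "stock" key
    private static volatile int stock = 50;

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            new Thread(() -> {
                try {
                    start.await();            // line the three "orders" up
                    int read = stock;         // 1. read the inventory
                    if (read > 0) {
                        Thread.sleep(10);     // widen the race window
                        stock = read - 1;     // 2. write back read - 1
                        System.out.println("Deducted, current inventory: " + stock);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown();
        done.await();
        // Three orders should leave 47, but all three threads usually print 49
        System.out.println("Final inventory: " + stock);
    }
}

Because each thread reads the old value before any of them writes the new one, the deductions overwrite each other, which is exactly the "oversold" symptom reproduced below with JMeter.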

We can use Apache JMeter to test the above interface, and that is what I will do here.

In JMeter, I set the number of concurrent threads to 3, with the configuration shown below.

The order submission interface is accessed concurrently as an HTTP GET request. Running JMeter against the interface produces the following log output on the command line.

Inventory deduction succeeded, the current inventory is: 49
Inventory deduction succeeded, the current inventory is: 49
Inventory deduction succeeded, the current inventory is: 49

Here, we clearly made 3 requests, that is, submitted 3 orders, so why is the inventory the same after every deduction? In e-commerce there is a technical term for this phenomenon: "oversold".

If oversold occurs in a large, high-concurrency e-commerce system such as Taobao, Tmall or JD.com, the loss is incalculable, and the architects and developers of that system can expect to be laid off. Therefore, as technical people we must treat technology rigorously and get every technical link of the system right.

Locks provided in the JVM

The JVM provides locks such as synchronized and Lock, which can be used to implement simple mutual exclusion between threads. So, as an aspiring architect, do you know the underlying principles of JVM locks?

The JVM lock principle

When it comes to JVM locking, we have to talk about object headers in Java.

Object header in Java

Every Java object has an object header. For a non-array type the object header occupies 2 machine words, and for an array it occupies 3. On a 32-bit virtual machine a word is 32 bits wide; on a 64-bit virtual machine a word is 64 bits wide.

The following table shows the contents of the object header.

Length      Content                  Description
32/64 bit   Mark Word                Stores the object's hashCode or lock information
32/64 bit   Class Metadata Address   Pointer to the object's type (class metadata)
32/64 bit   Array length             Length of the array (only present for arrays)

The format of the Mark Word is shown below.

Lock state        29-bit or 61-bit content                   1 bit: biased lock?                 2-bit lock flag
Unlocked                                                     0                                   01
Biased lock       Thread ID                                  1                                   01
Lightweight lock  Pointer to the lock record in the stack    (not used to mark biased locking)   00
Heavyweight lock  Pointer to the mutex (heavyweight lock)    (not used to mark biased locking)   10
GC mark                                                      (not used to mark biased locking)   11

As you can see, in the biased-lock state the Mark Word stores the ID of the biased thread; in the lightweight-lock state it stores a pointer to the lock record in the thread's stack; and in the heavyweight-lock state it stores a pointer to a monitor object in the heap.
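If you want to see the Mark Word for yourself, the OpenJDK JOL tool can print an object's layout. The following is only a small sketch, and it assumes the org.openjdk.jol:jol-core dependency is on the classpath.

import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderDemo {
    public static void main(String[] args) {
        Object lock = new Object();
        // Print the object header (Mark Word + class pointer) before locking
        System.out.println(ClassLayout.parseInstance(lock).toPrintable());
        synchronized (lock) {
            // Print it again inside the synchronized block; the lock bits in the Mark Word change
            System.out.println(ClassLayout.parseInstance(lock).toPrintable());
        }
    }
}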

For more information about Java object headers, see Java Multithreading in A Nutshell.

The JVM lock principle

In a nutshell, locking in the JVM works like this.

For example, when the first thread executes a piece of code, it checks the lock flag in the Java object header, finds that the object is unlocked, locks the object, and sets the flag in the object header to the locked state. When a second thread executes the same code, it also checks the lock flag in the object header, finds that the object is already locked, and therefore enters the corresponding blocking queue and waits.

A key point here is how the lock flag in the Java object header is implemented.

Weaknesses of JVM locks

Synchronized and Lock work at the JVM level. As you all know, when we run a Java program, a JVM process is started to run our application, so synchronized and Lock are effective only within that single JVM process. If we are developing a distributed application, relying on synchronized and Lock alone to solve high-concurrency problems in a distributed scenario is a bit of a stretch.
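Within a single JVM process, though, the oversold problem in the earlier order interface could be fixed simply by serializing the deduction, for example with synchronized. The sketch below reuses the fields from the earlier snippet (stringRedisTemplate, logger, OrderException) and only illustrates the single-JVM case; as explained next, it stops working once the application runs on multiple JVMs.

@RequestMapping("/submitOrder")
public String submitOrder(){
    // Serialize the read-deduct-write sequence, but only within this one JVM process
    synchronized (this) {
        int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
        if (stock > 0) {
            stock -= 1;
            stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
            logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
        } else {
            logger.debug("Inventory shortage, inventory reduction failed.");
            throw new OrderException("Inventory shortage, inventory reduction failed.");
        }
    }
    return "success";
}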

Synchronized and Lock support mutually exclusive threads within the same JVM process

Synchronized and Lock guarantee mutual exclusion for highly concurrent programs at the JVM level, as illustrated in the following figure.

However, synchronized and Lock cannot guarantee mutual exclusion once the application is deployed in a distributed architecture or runs in different JVM processes.

Synchronized and Lock do not implement thread exclusion between multiple JVM processes

Both a distributed architecture and multiple JVM processes boil down to the same thing: the application is deployed on different JVM instances, that is, it is multi-JVM by nature.

A distributed lock

When we implement a distributed lock, we can borrow the idea behind JVM locks. A JVM lock works by changing the lock flag in the object header of a Java object; in other words, all threads check the lock flag in the same object header.

We can apply the same idea to a distributed lock: after the application is split up and deployed in a distributed architecture, every thread that wants to enter the critical section checks, in one and the same place, whether the critical section is currently locked, and the lock and unlock states are marked in that single, unified place.

As you can see, the idea behind a distributed lock is not that different from a JVM lock. In a distributed lock implementation, the service that stores the lock state can be built on MySQL, Redis or Zookeeper.

However, in high-concurrency Internet environments, the Redis-based distributed lock scheme is the most widely used. Next, we will use Redis to dig into the architectural design of distributed locks.

How does Redis implement distributed locking

Redis command

Redis provides a command that we do not use very often, shown below.

SETNX key value

This command is SET if Not Exists, that is, the value is SET only when it does Not exist.

Set the value of key to value only if key does not exist. If the key already exists, the SETNX command does nothing.

The return value of this command is as follows.

  • The command returns 1 if the value is set successfully.
  • The command returns 0 if the value is not set (the key already exists).

Therefore, we can use the Redis SETNX command to implement a distributed lock in a distributed, high-concurrency environment. Suppose thread A and thread B try to enter the critical section at the same time, and thread A executes the SETNX command first and gets 1 back. When thread B then executes SETNX, it gets 0 back and cannot continue. Only after thread A runs the DEL command to delete the lock flag can thread B successfully run SETNX, set the lock flag, and continue.

Introducing distributed locks

Now that we know how to use Redis commands to implement a distributed lock, we can modify the order interface to add the distributed lock, as shown below.

/**
 * For the sake of demonstration, I simply define a constant as the commodity ID.
 * In practice, this commodity ID is passed in by the front end of the order operation.
 */
public static final String PRODUCT_ID = "100001";

@RequestMapping("/submitOrder")
public String submitOrder(){
    // Call Redis SETNX through stringRedisTemplate: key = commodity ID, value = the string "binghe"
    // In fact, value can be any string
    Boolean isLocked = stringRedisTemplate.opsForValue().setIfAbsent(PRODUCT_ID, "binghe");
    // Failed to acquire the lock
    if(!isLocked){
        return "failure";
    }
    int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
    if(stock > 0){
        stock -= 1;
        stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
        logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
    }else{
        logger.debug("Inventory shortage, inventory reduction failed.");
        throw new OrderException("Inventory shortage, inventory reduction failed.");
    }
    // Delete the PRODUCT_ID key to release the lock
    stringRedisTemplate.delete(PRODUCT_ID);
    return "success";
}

So, now that we have added a distributed lock to the code above, can it guarantee the atomicity of the business in a high-concurrency scenario? The answer is yes, it can. However, in a real scenario, the distributed lock code above is still not usable!!

Assume that thread A executes the stringRedisTemplate.opsForValue().setIfAbsent() method first and it returns true, then continues down into the business code, an exception is thrown, and thread A exits the JVM directly. At this point the line stringRedisTemplate.delete(PRODUCT_ID); never gets a chance to run. From then on, every thread that enters the order submission method and calls stringRedisTemplate.opsForValue().setIfAbsent() gets false back, and all subsequent orders fail. This is the deadlock problem in distributed scenarios.

Therefore, the implementation of distributed locks in the above code is not desirable in a real scenario!!

Introduce try-finally code blocks

We are going to add a try-finally block to the order interface method.

/**
 * For the sake of demonstration, I simply define a constant as the commodity ID.
 * In practice, this commodity ID is passed in by the front end of the order operation.
 */
public static final String PRODUCT_ID = "100001";

@RequestMapping("/submitOrder")
public String submitOrder(){
    // Call Redis SETNX through stringRedisTemplate: key = commodity ID, value = the string "binghe"
    // In fact, value can be any string
    Boolean isLocked = stringRedisTemplate.opsForValue().setIfAbsent(PRODUCT_ID, "binghe");
    // Failed to acquire the lock
    if(!isLocked){
        return "failure";
    }
    try{
        int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
        if(stock > 0){
            stock -= 1;
            stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
            logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
        }else{
            logger.debug("Inventory shortage, inventory reduction failed.");
            throw new OrderException("Inventory shortage, inventory reduction failed.");
        }
    }finally{
        // Delete the PRODUCT_ID key to release the lock
        stringRedisTemplate.delete(PRODUCT_ID);
    }
    return "success";
}

So, does the code above really solve the deadlock problem? When we write code, we cannot just look at it and assume nothing is wrong; the real production environment is very complex. If, after the lock is acquired successfully, the server goes down while the thread is still executing the business code, before it has had time to execute the code that removes the lock flag, the program does not exit the JVM gracefully. Subsequent threads entering the order submission method will then fail because they cannot set the lock flag, and their orders fail. So the above code still has a problem.

Introduce Redis timeout mechanism

The automatic expiration time of the cache can be set in Redis, which we can introduce into the implementation of distributed locks, as shown in the code below.

/**
 * For the sake of demonstration, I simply define a constant as the commodity ID.
 * In practice, this commodity ID is passed in by the front end of the order operation.
 */
public static final String PRODUCT_ID = "100001";

@RequestMapping("/submitOrder")
public String submitOrder(){
    // Call Redis SETNX through stringRedisTemplate: key = commodity ID, value = the string "binghe"
    // In fact, value can be any string
    Boolean isLocked = stringRedisTemplate.opsForValue().setIfAbsent(PRODUCT_ID, "binghe");
    // Failed to acquire the lock
    if(!isLocked){
        return "failure";
    }
    try{
        stringRedisTemplate.expire(PRODUCT_ID, 30, TimeUnit.SECONDS);
        int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
        if(stock > 0){
            stock -= 1;
            stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
            logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
        }else{
            logger.debug("Inventory shortage, inventory reduction failed.");
            throw new OrderException("Inventory shortage, inventory reduction failed.");
        }
    }finally{
        // Delete the PRODUCT_ID key to release the lock
        stringRedisTemplate.delete(PRODUCT_ID);
    }
    return "success";
}

In the above code, we added the following line to set the expiration time for the lock flag in Redis.

stringRedisTemplate.expire(PRODUCT_ID, 30, TimeUnit.SECONDS);

At this point, we set the expiration time to 30 seconds.

So the question is: does this really solve the problem? Is there really no "pit" in the code above? The answer is: there is still a "pit"!!

“Pit location” analysis

We introduced a timeout mechanism into the distributed lock in the order method, but the code still does not truly avoid deadlock, so where is the "pit"? Just think: the program has executed the stringRedisTemplate.opsForValue().setIfAbsent() method and is just about to execute stringRedisTemplate.expire(PRODUCT_ID, 30, TimeUnit.SECONDS) when the server goes down. Do not say this cannot happen; the production environment is very complex, and it can be exactly that unlucky. At that point, when subsequent requests enter the order submission method, they cannot set the lock flag successfully, and the order process can no longer execute normally.

Now that we have found the "pit" in the code above, how do we fill it? How do we solve the problem? Don't worry, Redis already provides what we need: when saving data to Redis, you can specify an expiration time in the same command. So we can transform the code as follows.

/**
 * For the sake of demonstration, I simply define a constant as the commodity ID.
 * In practice, this commodity ID is passed in by the front end of the order operation.
 */
public static final String PRODUCT_ID = "100001";

@RequestMapping("/submitOrder")
public String submitOrder(){
    // Call Redis SETNX through stringRedisTemplate: key = commodity ID, value = the string "binghe",
    // and set the expiration time in the same call. In fact, value can be any string
    Boolean isLocked = stringRedisTemplate.opsForValue().setIfAbsent(PRODUCT_ID, "binghe", 30, TimeUnit.SECONDS);
    // Failed to acquire the lock
    if(!isLocked){
        return "failure";
    }
    try{
        int stock = Integer.parseInt(stringRedisTemplate.opsForValue().get("stock"));
        if(stock > 0){
            stock -= 1;
            stringRedisTemplate.opsForValue().set("stock", String.valueOf(stock));
            logger.debug("Inventory deduction succeeded, the current inventory is: {}", stock);
        }else{
            logger.debug("Inventory shortage, inventory reduction failed.");
            throw new OrderException("Inventory shortage, inventory reduction failed.");
        }
    }finally{
        // Delete the PRODUCT_ID key to release the lock
        stringRedisTemplate.delete(PRODUCT_ID);
    }
    return "success";
}

In the code above, we set the expiration time at the same moment we set the lock flag in Redis. Now, as long as the flag is set successfully, even if our business system goes down, the data in Redis will be deleted automatically when it expires. Subsequent threads entering the order submission method can then set the lock flag successfully and proceed with the normal order flow.

At this point, the code above has basically solved the deadlock problem from a functional point of view. So is the program now perfect? Ha ha, many of you will surely say it is not! And indeed it is not, but do you know what is still imperfect? Let's continue the analysis.

Analyze code from a development integration perspective

When we develop common system components, such as distributed locks, we will certainly extract some common classes to perform corresponding functions for the system to use.

Here, suppose we define a RedisLock interface, as follows.

public interface RedisLock{
    // Lock
    boolean tryLock(String key, long timeout, TimeUnit unit);
    // Unlock operation
    void releaseLock(String key);
}

Next, the RedisLockImpl class implements the RedisLock interface, providing the specific locking and unlocking implementation, as shown below.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        return stringRedisTemplate.opsForValue().setIfAbsent(key, "binghe", timeout, unit);
    }
    @Override
    public void releaseLock(String key){
        stringRedisTemplate.delete(key);
    }
}

From a development and integration perspective, when a thread runs the code from top to bottom, it first locks, then executes the business code, then releases the lock, and in theory the Redis key used for locking and for releasing the lock is the same. However, suppose another developer writes code that does not call tryLock() at all but calls releaseLock() directly, and the key he passes to releaseLock() happens to be the key you passed to tryLock(). Then there is a problem: his code will suddenly release the lock that you acquired!!

Therefore, the code above is not safe: anyone can casually delete the lock you added. This is the problem of wrongly deleting a lock, it is very dangerous, and so the program above has a very serious problem!!

So how do we ensure that only the thread that acquired the lock can perform the corresponding unlock operation? Keep reading.

How to realize the normalization of locking and unlocking?

What is the normalization of locking and unlocking? To put it simply, after a thread performs the locking operation, the subsequent unlocking operation must be performed by this thread, and the locking and unlocking operation is completed by the same thread.

To ensure that only the thread that acquired the lock can release it, we need to bind the lock and unlock operations to the same thread. How do we do that? The answer is ThreadLocal. Yes, using the ThreadLocal class does solve this problem.

At this point, we modify the code for the RedisLockImpl class to look like this.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        String uuid = UUID.randomUUID().toString();
        threadLocal.set(uuid);
        return stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        if(threadLocal.get().equals(stringRedisTemplate.opsForValue().get(key))){
            stringRedisTemplate.delete(key);
        }
    }
}

The main logic of the code above is: when trying to lock, first generate a UUID and bind it to the current thread, use the key parameter as the Redis key, save the generated UUID as the value in Redis, and set the expiration time. Before unlocking, check whether the UUID bound to the current thread is the same as the UUID stored in Redis; the lock flag is deleted only when they are the same. This avoids the situation where one thread locks the program and another thread releases that lock.
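A side note on the compare-then-delete in releaseLock(): in the code above, the GET and the DEL are two separate Redis calls. If you ever need them to happen as a single atomic step, a small Lua script can do it. The sketch below is only an optional variation under that assumption (the class and method names are made up), not part of this article's implementation.

import java.util.Collections;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;

public class SafeUnlock {
    // Compare the stored UUID and delete the key in one atomic Lua call
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "    return redis.call('del', KEYS[1]) " +
            "else " +
            "    return 0 " +
            "end";

    private final StringRedisTemplate stringRedisTemplate;

    public SafeUnlock(StringRedisTemplate stringRedisTemplate) {
        this.stringRedisTemplate = stringRedisTemplate;
    }

    // Returns true if the lock was held under this uuid and has now been deleted
    public boolean unlock(String key, String uuid) {
        Long deleted = stringRedisTemplate.execute(
                new DefaultRedisScript<>(UNLOCK_SCRIPT, Long.class),
                Collections.singletonList(key),
                uuid);
        return deleted != null && deleted > 0;
    }
}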

Continue to analyze

Let’s change the locking and unlocking methods to look like this.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;
    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();
    private String lockUUID;
    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        String uuid = UUID.randomUUID().toString();
        threadLocal.set(uuid);
        lockUUID = uuid;
        return stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the saved lockUUID equals the UUID stored in Redis
        if(lockUUID.equals(stringRedisTemplate.opsForValue().get(key))){
            stringRedisTemplate.delete(key);
        }
    }
}

I believe many of you can see the problem with the code above!! Yes, it is a thread-safety issue: lockUUID is a member variable shared by all threads, so one thread can overwrite the UUID saved by another.

So, here, we need to use ThreadLocal to solve the thread-safety problem.

Reentrancy analysis

In the code above, when one thread sets the lock flag successfully, any other thread that tries to set it fails. But consider another scenario: the order submission interface method calls service A, service A calls service B, and service B's method also locks and unlocks the same commodity.

In that case, once the order submission interface has set the lock flag successfully, service B further down the call chain can no longer set it and cannot continue. In other words, the distributed lock we have implemented so far is not reentrant.

So we have a reentrancy problem, and we want the distributed lock we design to be reentrant. What is reentrancy? In simple terms, the same thread can acquire the same lock multiple times and keep executing its operations in order.

In fact, many locks in the JDK support reentrancy, such as synchronized and the Lock implementations (for example ReentrantLock, added in JDK 1.5).

How do you achieve reentrancy?

Mapping this onto our lock and unlock methods, how do we allow the same thread to acquire the lock (set the lock flag) more than once? A simple design is: if the current thread has no UUID bound to it, generate one, bind it to the current thread, and set the lock flag in Redis; if the current thread already has a UUID bound to it, return true directly, meaning the current thread has already set the lock flag, that is, it already holds the lock.

Based on the analysis above, we modify the code of the RedisLockImpl class as follows.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        Boolean isLocked = false;
        if(threadLocal.get() == null){
            String uuid = UUID.randomUUID().toString();
            threadLocal.set(uuid);
            isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
        }else{
            isLocked = true;
        }
        return isLocked;
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        if(threadLocal.get().equals(stringRedisTemplate.opsForValue().get(key))){
            stringRedisTemplate.delete(key);
        }
    }
}

It seems fine to write it this way, but if we think about it carefully, is it really OK?

Problem analysis of reentrancy

Since the reentrancy of the distributed lock above is still problematic, let's analyze the root cause of the problem!

Suppose that in the order submission method we first acquire the distributed lock for a code block through the RedisLock interface, and then, inside the locked code, call service A, which also calls the lock and unlock operations of the RedisLock interface. No matter how many times the lock operation is invoked along the way, as long as the earlier lock has not expired it simply returns true, which means the lock is really only acquired once. Then, when the logic in service A finishes and service A calls the unlock method of the RedisLock interface, the lock held by the current thread is released immediately, even though the outer order method has not finished with it yet.

We can simplify this process by using the following figure.

So the question is, how do you solve the problem of reentrancy?

Address reentrancy issues

I believe many of you can think of using a counter to solve the reentrancy problem above. Yes, a counter is exactly the answer; the overall process is as follows.

So, what does that look like in program code? Let’s modify the code for the RedisLockImpl class, as shown below.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    private ThreadLocal<Integer> threadLocalInteger = new ThreadLocal<Integer>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        Boolean isLocked = false;
        if(threadLocal.get() == null){
            String uuid = UUID.randomUUID().toString();
            threadLocal.set(uuid);
            isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
        }else{
            isLocked = true;
        }
        // After the lock is acquired successfully, increment the counter by 1
        if(isLocked){
            Integer count = threadLocalInteger.get() == null ? 0 : threadLocalInteger.get();
            threadLocalInteger.set(count + 1);
        }
        return isLocked;
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        if(threadLocal.get().equals(stringRedisTemplate.opsForValue().get(key))){
            Integer count = threadLocalInteger.get();
            // Release the lock only when the counter drops to 0
            if(count == null || count <= 1){
                stringRedisTemplate.delete(key);
            }else{
                // Not the outermost unlock yet; just decrement the counter
                threadLocalInteger.set(count - 1);
            }
        }
    }
}

At this point, we have basically solved the reentrancy problem of distributed locks.

Having said that, let me ask you one question: does the solution above really have no problems?

Blocking and non-blocking locks

In the order submission method, when a request fails to acquire the Redis distributed lock, we simply return "failure" to indicate that the order request failed. Imagine that in a high-concurrency environment, once one request has acquired the distributed lock, every other request that calls the order method gets an order-failure result until that request releases the lock. In practice this is very unfriendly. Instead, we can block subsequent requests until the current request releases the lock, and then wake a blocked request so it can acquire the distributed lock and execute the method.

Therefore, our distributed lock design needs to support blocking and non-blocking features.

So how do you implement blocking? We can do this using spin, continuing to modify the code for RedisLockImpl as shown below.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    private ThreadLocal<Integer> threadLocalInteger = new ThreadLocal<Integer>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        Boolean isLocked = false;
        if(threadLocal.get() == null){
            String uuid = UUID.randomUUID().toString();
            threadLocal.set(uuid);
            isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
            // If the lock could not be acquired, spin until it is acquired
            if(!isLocked){
                for(;;){
                    isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
                    if(isLocked){
                        break;
                    }
                }
            }
        }else{
            isLocked = true;
        }
        // After the lock is acquired successfully, increment the counter by 1
        if(isLocked){
            Integer count = threadLocalInteger.get() == null ? 0 : threadLocalInteger.get();
            threadLocalInteger.set(count + 1);
        }
        return isLocked;
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        if(threadLocal.get().equals(stringRedisTemplate.opsForValue().get(key))){
            Integer count = threadLocalInteger.get();
            // Release the lock only when the counter drops to 0
            if(count == null || count <= 1){
                stringRedisTemplate.delete(key);
            }else{
                // Not the outermost unlock yet; just decrement the counter
                threadLocalInteger.set(count - 1);
            }
        }
    }
}

Blocking and non-blocking locks are very important concepts to keep in mind when designing distributed locks.

Lock failure problem

Although we implemented the blocking feature of distributed locks, there was another issue that we had to consider. That’s the problem with lock failure.

What happens if the business takes longer to execute than the lock's expiration time? Many of you will have guessed: the earlier request has not finished, the lock expires, a later request acquires the distributed lock and starts executing, and the program can no longer achieve true mutual exclusion or guarantee the atomicity of the business.

So how do we solve this? The answer: we must make sure the distributed lock is released only after the business code has finished executing. The idea is clear; how do we implement it?

In plain terms, we need to run the following line periodically while the business code executes, so that the lock is not released by the timeout before the business code finishes.

stringRedisTemplate.expire(PRODUCT_ID, 30, TimeUnit.SECONDS);

Here we need a timing strategy for running the line above. Note that we cannot wait the full 30 seconds before running it, because by then the lock would already have expired. For example, we could run it every 10 seconds.

Can we solve this by adding a while(true) loop to the RedisLockImpl class? Let's modify RedisLockImpl in that way first and see whether there is any problem.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    private ThreadLocal<Integer> threadLocalInteger = new ThreadLocal<Integer>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        Boolean isLocked = false;
        if(threadLocal.get() == null){
            String uuid = UUID.randomUUID().toString();
            threadLocal.set(uuid);
            isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
            // If the lock could not be acquired, spin until it is acquired
            if(!isLocked){
                for(;;){
                    isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
                    if(isLocked){
                        break;
                    }
                }
            }
            // Periodically renew the lock expiration time
            while(true){
                Integer count = threadLocalInteger.get();
                // If the current lock has already been released, exit the loop
                if(count == null || count <= 0){
                    break;
                }
                stringRedisTemplate.expire(key, 30, TimeUnit.SECONDS);
                try{
                    // Run every 10 seconds
                    Thread.sleep(10000);
                }catch(InterruptedException e){
                    e.printStackTrace();
                }
            }
        }else{
            isLocked = true;
        }
        // After the lock is acquired successfully, increment the counter by 1
        if(isLocked){
            Integer count = threadLocalInteger.get() == null ? 0 : threadLocalInteger.get();
            threadLocalInteger.set(count + 1);
        }
        return isLocked;
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        if(threadLocal.get().equals(stringRedisTemplate.opsForValue().get(key))){
            Integer count = threadLocalInteger.get();
            // Release the lock only when the counter drops to 0
            if(count == null || count <= 1){
                stringRedisTemplate.delete(key);
            }else{
                // Not the outermost unlock yet; just decrement the counter
                threadLocalInteger.set(count - 1);
            }
        }
    }
}

If you look at the code, you will see the problem: the renewal of the lock expiration time must not be written this way. The current thread would block forever inside the while(true) loop that renews the lock timeout and would never return a result. So, instead of blocking the current thread, we need to run the renewal as an asynchronous, scheduled task.

At this point, we continue to modify the code of the RedisLockImpl class to execute the timed update lock timeout code in a separate thread, as shown below.

public class RedisLockImpl implements RedisLock{
    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

    private ThreadLocal<Integer> threadLocalInteger = new ThreadLocal<Integer>();

    @Override
    public boolean tryLock(String key, long timeout, TimeUnit unit){
        Boolean isLocked = false;
        if(threadLocal.get() == null){
            String uuid = UUID.randomUUID().toString();
            threadLocal.set(uuid);
            isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
            // If the lock could not be acquired, spin until it is acquired
            if(!isLocked){
                for(;;){
                    isLocked = stringRedisTemplate.opsForValue().setIfAbsent(key, uuid, timeout, unit);
                    if(isLocked){
                        break;
                    }
                }
            }
            // Start a new thread that periodically renews the lock expiration time
            new Thread(new UpdateLockTimeoutTask(uuid, stringRedisTemplate, key)).start();
        }else{
            isLocked = true;
        }
        // After the lock is acquired successfully, increment the counter by 1
        if(isLocked){
            Integer count = threadLocalInteger.get() == null ? 0 : threadLocalInteger.get();
            threadLocalInteger.set(count + 1);
        }
        return isLocked;
    }
    @Override
    public void releaseLock(String key){
        // Delete the lock only if the UUID bound to the current thread equals the UUID stored in Redis
        String uuid = stringRedisTemplate.opsForValue().get(key);
        if(threadLocal.get().equals(uuid)){
            Integer count = threadLocalInteger.get();
            // Release the lock only when the counter drops to 0
            if(count == null || count <= 1){
                stringRedisTemplate.delete(key);
                // Get the thread that renews the lock timeout and interrupt it
                long threadId = Long.parseLong(stringRedisTemplate.opsForValue().get(uuid));
                Thread updateLockTimeoutThread = ThreadUtils.getThreadByThreadId(threadId);
                if(updateLockTimeoutThread != null){
                    // Interrupt the thread that renews the lock timeout
                    updateLockTimeoutThread.interrupt();
                    stringRedisTemplate.delete(uuid);
                }
            }else{
                // Not the outermost unlock yet; just decrement the counter
                threadLocalInteger.set(count - 1);
            }
        }
    }
}

Create the UpdateLockTimeoutTask class to perform the update lock timeout.

public class UpdateLockTimeoutTask implements Runnable{
    // The UUID bound to the thread that holds the lock
    private String uuid;
    private StringRedisTemplate stringRedisTemplate;
    private String key;
    public UpdateLockTimeoutTask(String uuid, StringRedisTemplate stringRedisTemplate, String key){
        this.uuid = uuid;
        this.stringRedisTemplate = stringRedisTemplate;
        this.key = key;
    }
    @Override
    public void run(){
        // Save the current thread ID to Redis, using the UUID as the key
        stringRedisTemplate.opsForValue().set(uuid, String.valueOf(Thread.currentThread().getId()));
        // Periodically renew the lock expiration time
        while(true){
            stringRedisTemplate.expire(key, 30, TimeUnit.SECONDS);
            try{
                // Run every 10 seconds
                Thread.sleep(10000);
            }catch(InterruptedException e){
                // The lock has been released and this thread has been interrupted, so stop renewing
                return;
            }
        }
    }
}

Next, we define a ThreadUtils utility class with a getThreadByThreadId(long threadId) method that looks up a thread by its ID.

public class ThreadUtils{
    // Get a thread handle based on the thread ID
    public static Thread getThreadByThreadId(long threadId){
        // Walk up to the root thread group so that all threads can be enumerated
        ThreadGroup group = Thread.currentThread().getThreadGroup();
        while(group.getParent() != null){
            group = group.getParent();
        }
        Thread[] threads = new Thread[(int)(group.activeCount() * 1.2)];
        int count = group.enumerate(threads, true);
        for(int i = 0; i < count; i++){
            if(threadId == threads[i].getId()){
                return threads[i];
            }
        }
        return null;
    }
}

In the field of distributed locks, there is a technical term for this: "asynchronous renewal". Note that when the business code finishes executing, we must stop the thread that renews the lock timeout. Here, I defined the renewal logic in a dedicated UpdateLockTimeoutTask class and injected the UUID and the StringRedisTemplate into that task class. When the scheduled renewal starts, it first saves the ID of the renewal thread to Redis, using the UUID passed in as the key.

When the distributed lock is acquired for the first time, a new thread is started and the UUID and StringRedisTemplate are passed to the task class to run the renewal. When releaseLock() is called after the business code has finished, the ID of the renewal thread is read from Redis by UUID, the renewal thread is looked up by that ID, and its interrupt() method is called to interrupt it.

From then on, when the distributed lock is released, the thread that renews the lock timeout exits because it has been interrupted.
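As a side note, the same asynchronous renewal idea can also be expressed with a ScheduledExecutorService instead of a hand-rolled thread plus sleep. The sketch below only illustrates that alternative under its own assumptions (the class and method names are made up); it is not the implementation used above.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import org.springframework.data.redis.core.StringRedisTemplate;

public class LockRenewer {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final StringRedisTemplate stringRedisTemplate;

    public LockRenewer(StringRedisTemplate stringRedisTemplate) {
        this.stringRedisTemplate = stringRedisTemplate;
    }

    // Renew the lock key every 10 seconds, resetting its TTL back to 30 seconds
    public ScheduledFuture<?> startRenewal(String key) {
        return scheduler.scheduleAtFixedRate(
                () -> stringRedisTemplate.expire(key, 30, TimeUnit.SECONDS),
                10, 10, TimeUnit.SECONDS);
    }

    // Called from releaseLock(): cancelling the task plays the same role as interrupting the renewal thread
    public void stopRenewal(ScheduledFuture<?> renewalTask) {
        renewalTask.cancel(false);
    }
}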

Basic requirements for implementing distributed locks

Combined with the above cases, we can draw the basic requirements of realizing distributed lock:

  • Support mutual exclusion
  • Support lock timeout
  • Support blocking and non-blocking acquisition
  • Support reentrancy
  • Support high availability

Universal distributed solutions

In the Internet industry, distributed locks are an unavoidable topic, and there are many general-purpose solutions for them. Among these, one of the most widely used is the open-source Redisson framework.


Since the Redisson framework is so good, can we use it to guarantee that distributed locking is 100% reliable? The answer is no. In the distributed field, no company and no architect can guarantee 100% fault-free operation; even a giant like Alibaba and its chief architect cannot.

In the distributed world there is no such thing as 100% fault-free; what we pursue is a certain number of "9"s of availability, such as 99.999%.
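For a sense of scale, 99.999% availability still allows roughly 365 × 24 × 60 × (1 - 0.99999) ≈ 5.3 minutes of downtime per year.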

Theory of CAP

In the distributed world, there is a very important theory called CAP theory.

  • C: Consistency
  • A: Availability
  • P: Partition tolerance

In a distributed system, partition tolerance, the "P", must be guaranteed, so we can only choose between CP and AP.

Here we can make a simple comparison between Redis and Zookeeper: a distributed lock built on Redis follows the AP architecture, while a distributed lock built on Zookeeper follows the CP architecture.

  • Distributed lock model based on Redis AP architecture

In the AP-based distributed lock model implemented with Redis, data written to Redis node 1 is acknowledged immediately, and the data is then synchronized to the other Redis nodes asynchronously.

  • Distributed lock model based on Zookeeper CP architecture

In the CP-based distributed lock model implemented with Zookeeper, after data is written to node 1, the write waits for the synchronization result; only when the data has been synchronized successfully to the majority of Zookeeper nodes is the result returned.

When implementing distributed locks using the REDIS-based AP architecture, there is a problem to be aware of, which can be illustrated in the following figure.

Namely, data synchronization between the Redis Master and Slave nodes may fail. Suppose a thread writes the lock data to the Master node, but the Master fails to synchronize it to the Slave node; if another thread then reads from the Slave node, it finds that the distributed lock is not set, and the problem occurs!!

Therefore, when designing distributed locking scheme, we also need to pay attention to data synchronization between Redis nodes.

Implementation of red lock

The Redisson framework implements the Redlock mechanism: the RedissonRedLock object implements the Redlock locking algorithm. It can group multiple RLock objects into one red lock, and each RLock instance can come from a different Redisson instance. Locking is considered successful only when more than half of the RLocks are locked successfully, which improves the availability of the distributed lock.

We can use the Redisson framework to implement red locking.

public void testRedLock(RedissonClient redisson1, RedissonClient redisson2, RedissonClient redisson3){
    RLock lock1 = redisson1.getLock("lock1");
    RLock lock2 = redisson2.getLock("lock2");
    RLock lock3 = redisson3.getLock("lock3");
    RedissonRedLock lock = new RedissonRedLock(lock1, lock2, lock3);
    try {
        // Lock lock1, lock2 and lock3 together as one red lock; it succeeds when most nodes are locked
        lock.lock();
        // Try to lock: wait at most 100 seconds to acquire, and unlock automatically 10 seconds after locking
        boolean res = lock.tryLock(100, 10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
        e.printStackTrace();
    } finally {
        lock.unlock();
    }
}

In practice, red locks are rarely used. Using a red lock affects performance in high-concurrency environments and makes the user experience worse, so in real scenarios we generally focus on making the Redis cluster itself reliable instead. Also note that if the number of RLocks locked successfully is less than half of the total, the red lock returns a locking failure, even though some individual nodes were locked successfully at the business level. In addition, using a red lock requires several sets of Redis master-slave deployments, and the Master nodes of these deployments must be completely independent, with no data exchange between them.

High-concurrency "black tech" and winning tricks

Let's assume we use Redis to implement the distributed lock, and that Redis can handle roughly 50,000 concurrent reads and writes, while our mall business needs to support around 1,000,000 concurrent requests. If all 1,000,000 requests hit Redis, Redis is very likely to fall over. So how do we solve this problem? Let's explore the question together.

In a high-concurrency mall system, if Redis is used to cache data, then the concurrency capacity of the Redis cache is key, because many upstream operations need to access Redis. Asynchronous peak shaving is only a basic measure; the crux is still guaranteeing the concurrent processing capacity of Redis.

The key idea for solving this problem is divide and conquer: split the commodity inventory into shares.

Split the inventory

When we store the inventory quantity of a commodity in Redis, we can "split" that inventory to improve Redis's read and write concurrency.

For example, suppose the original commodity ID is 10001 and the inventory is 1000 units, stored in Redis as (10001, 1000). If we split the inventory into 5 shares, each share holds 200 units, and what we store in Redis becomes (10001_0, 200), (10001_1, 200), (10001_2, 200), (10001_3, 200), (10001_4, 200).

After the split, each share of inventory is stored under a key made of the commodity ID plus a numeric suffix. Because these keys hash to different values, there is a high probability that they do not land in the same Redis slot, which improves the performance and concurrency of Redis requests.

After splitting the inventory, we also need to store in Redis a mapping from the commodity ID to the keys of the split inventory: the key of the mapping is the commodity ID, 10001, and the value is the set of keys that store the split inventory, namely 10001_0, 10001_1, 10001_2, 10001_3 and 10001_4. In Redis we can use a List to store these values.

When actually handling inventory, we can first query from Redis all the split inventory keys for the commodity, use an AtomicLong to count the incoming requests, and take the request count modulo the number of keys returned from Redis, giving 0, 1, 2, 3 or 4. Concatenating the commodity ID in front then yields the actual inventory cache key, and with this key we can read the corresponding inventory information directly from Redis.
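A minimal sketch of this key-selection logic is shown below; the class name, the way the key list is loaded, and the surrounding wiring are assumptions for illustration only.

import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import org.springframework.data.redis.core.StringRedisTemplate;

public class SplitStockSelector {
    // Counts incoming requests so they are spread evenly over the inventory shares
    private final AtomicLong requestCount = new AtomicLong(0);
    private final StringRedisTemplate stringRedisTemplate;

    public SplitStockSelector(StringRedisTemplate stringRedisTemplate) {
        this.stringRedisTemplate = stringRedisTemplate;
    }

    // Pick the cache key of one inventory share for the given commodity, e.g. "10001_3"
    public String selectStockKey(String productId) {
        // The keys of the split shares (10001_0 ... 10001_4) are kept in a Redis List under the commodity ID
        List<String> splitKeys = stringRedisTemplate.opsForList().range(productId, 0, -1);
        // The request count modulo the number of shares decides which share this request uses
        long index = requestCount.getAndIncrement() % splitKeys.size();
        // Concatenate the commodity ID and the share index to get the actual inventory cache key
        return productId + "_" + index;
    }

    // Read the inventory of the selected share
    public int getStock(String productId) {
        String stockKey = selectStockKey(productId);
        return Integer.parseInt(stringRedisTemplate.opsForValue().get(stockKey));
    }
}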

At the same time, we can store the different inventory shares on different Redis servers to further improve the concurrency of Redis.

Substitute stealthily

In high-concurrency business scenarios, we can access the cache directly from the load balancing layer using Lua scripts (for example with OpenResty).

Consider a scenario: in a high-concurrency flash-sale business, the items are snapped up in an instant. After that, when users keep sending requests, if the system still forwards every request from the load balancing layer to the application layer, and each application-layer service then queries the cache and the database, it is essentially pointless: the goods are already sold out, so passing the requests through layer after layer of checks adds nothing. Meanwhile, the application layer can only handle concurrency in the hundreds, which to some extent drags down the concurrency of the whole system.

To solve this, we can extract the user ID, commodity ID, activity ID and other information carried in the user's request at the load balancing layer of the system, and read the inventory information in the cache directly there through Lua scripts and similar techniques. If the inventory of the item is less than or equal to 0, we return a "sold out" message to the user immediately, without going through the application-layer checks at all.

Big welfare

Search WeChat for the "Glacier Technology" official account and follow this programmer who digs deep; hard-core technical articles are published there every day. Reply [PDF] inside the account to get the interview material from first-tier companies and the original, super hard-core PDF technical documents I have prepared, as well as a set of resume templates that I keep updating. I hope everyone can find the right job. The road of learning can be lonely at times, so smile now and then and keep going. If you have already made it into the company of your choice, do not slack off: career growth, like learning new technology, never stops. If we are lucky, we will meet again in the jianghu!

In addition, every PDF I open source will be continuously updated and maintained. Thank you all for your long-term support of Glacier!!

Written at the end

If you think Glacier writes well, please search for and follow the "Glacier Technology" WeChat official account, and learn high concurrency, distributed systems, microservices, big data, Internet and cloud-native technologies together with Glacier. The "Glacier Technology" account has published a large number of technical topics, and every article is packed with useful material. Many readers have read the articles there and successfully moved to major companies, and many others have made a leap in their technical ability and become the technical backbone of their companies. If you also want to improve your ability, achieve a leap in your technical skills, join a big company, and get promoted with a raise, then follow the "Glacier Technology" account; hard-core technical content is published every day, so you will no longer be confused about how to improve your technical ability!