Hello, nice to meet you.

Starting in September, I decided to write a weekly blog, with the goal of publishing at least one post a week, either reading through some source code or documenting a problem I was working on.

After a period of aimless study, I found that it wasn’t of much use: I didn’t take notes, and after a while I forgot what I had read.

“What you learn, you put out.” I think this is the best way for me to learn, and it is the origin of this every-Monday blog. Friends, if you often forget things you have read before, join in.

This is the fourth blog post in November and the first in the distributed series.

  • This post was first published on my blog: Javageekers.club
  • This article is included in my personal knowledge base: Back-end Technology as I Understand It

Distributed locks are an important part of distributed systems, and practical experience with them on your resume may be a good way to impress an interviewer. This week’s topic is the distributed lock, which I will discuss from the following aspects:

  1. Why do we need distributed locks?
  2. What can go wrong with distributed locks?
  3. How to implement a distributed lock?

1. Why do we need distributed locks?

Let’s break “distributed lock” down into “distributed” and “lock”: it is a lock in a distributed system. In a single application, we use locks to control access to shared resources; distributed locks solve the same problem of controlling access to shared resources, but across a distributed system.

Unlike in a single application, the minimum granularity of competition for shared resources in a distributed system is raised from threads to processes, or microservices.

Imagine this scenario: a production environment runs two instances, A and B, and a scheduled task generates a report for each order. To avoid generating the same report twice and wasting system resources, we can use a distributed lock: the order is locked while its report is being generated, which guarantees that the report for each order is generated only once.

This scenario is a read business, or an idempotent business. The distributed lock is used to avoid repeated execution, improve efficiency, and save system resources.
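
As a rough illustration of the report scenario, the sketch below simulates two instances competing for one shared lock store. The class `ReportLockDemo` and its in-memory `ConcurrentHashMap` are made-up stand-ins for a middleware that both instances can reach; they are assumptions for this example, not part of any library discussed in this post.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ReportLockDemo {
    // Hypothetical stand-in for a lock store that every instance can access.
    static final ConcurrentHashMap<String, String> sharedLocks = new ConcurrentHashMap<>();
    static final AtomicInteger reportsGenerated = new AtomicInteger();

    // Returns true if this instance generated the report,
    // false if another instance already locked the order.
    static boolean generateReport(String instance, String orderNo) {
        // putIfAbsent is atomic: only the first caller for a key gets null back.
        if (sharedLocks.putIfAbsent("report:" + orderNo, instance) != null) {
            return false;
        }
        reportsGenerated.incrementAndGet(); // the real report work would go here
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> generateReport("A", "order-1"));
        Thread b = new Thread(() -> generateReport("B", "order-1"));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("reports generated: " + reportsGenerated.get());
    }
}
```

Whichever instance reaches the store first wins; the other one simply skips the work.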

Another scenario is a write business, specifically a non-idempotent one, such as a payment: after an order has been placed successfully, only one payment channel is allowed to execute the payment successfully.

The reason for using a distributed lock here is to avoid data inconsistencies caused by repeated operations; in the example above, to avoid paying more than once.

2. Possible problems of distributed locks

Having covered why distributed locks are needed in a distributed environment, I will next summarize some problems with distributed locks that I have found online or run into myself. This is very useful for optimizing the distributed lock later.

2.1 Distributed deadlock

Strictly speaking, the distributed deadlock problem is not a “deadlock”, because it does not satisfy the hold-and-wait condition of a deadlock. It is simply a distributed lock that has “died”.

Common scenario: ServerA crashes after acquiring the lock, or fails to release the distributed lock because of an exception. As a result, every later attempt by a client to acquire the lock fails.

Solution: Set an expiration time when the distributed lock is acquired, that is, an automatic unlock mechanism.
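
A minimal sketch of the expiration idea, assuming an in-memory store and a manually advanced clock so the behavior is deterministic. `ExpiringLockStore` and its `nowMillis` field are hypothetical names for this example; a real middleware would handle expiry itself.

```java
import java.util.HashMap;
import java.util.Map;

class ExpiringLockStore {
    private final Map<String, Long> expiresAt = new HashMap<>(); // key -> expiry time in ms
    long nowMillis = 0; // hypothetical manual clock, advanced by the caller

    // Acquire the lock if it is free or the previous holder's lease has run out.
    synchronized boolean tryLock(String key, long ttlMillis) {
        Long expiry = expiresAt.get(key);
        if (expiry != null && expiry > nowMillis) {
            return false; // still held and not yet expired
        }
        expiresAt.put(key, nowMillis + ttlMillis);
        return true;
    }

    public static void main(String[] args) {
        ExpiringLockStore store = new ExpiringLockStore();
        System.out.println(store.tryLock("order-1", 30_000)); // ServerA locks, then "crashes"
        store.nowMillis = 10_000;
        System.out.println(store.tryLock("order-1", 30_000)); // ServerB tries while the lease is live
        store.nowMillis = 30_000;
        System.out.println(store.tryLock("order-1", 30_000)); // ServerB tries after the lease ran out
    }
}
```

Because the lease carries its own deadline, a crashed holder blocks other clients for at most the expiration time.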

2.2 Automatic Unlock Problem

The automatic expiration mechanism solves the distributed deadlock problem, but it brings a new problem of its own: automatic unlocking.

Common scenario: ServerA holds a distributed lock, but the business processing takes longer than the expiration time, so the lock is released automatically. As a result, there is no lock left to release when the business logic finishes.

Solution: Start another thread that automatically renews the distributed lock. For example, Redisson’s Watch Dog provides automatic renewal (it checks every 10 seconds and renews the lock if the client still holds it).

2.3 Differentiating operations on Different Clients

In a multi-instance scenario, a distributed lock also needs to be able to tell operations from different clients apart. Put simply, apart from being unlocked automatically, a lock added by ServerA must only be unlocked by ServerA itself.

Common scenario: ServerA holds a distributed lock. Because every client uses the same lock key in Redis, ServerB’s unlock command releases it. As a result, ServerA cannot unlock the distributed lock after finishing its business logic.

Solution: Add a client identifier when locking, for example key = the product ID for inventory reduction and value = A:timestamp of the expiration time. Check the value when unlocking, so that active unlocking can only be performed by ServerA.
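
The owner check can be sketched in memory as follows. `OwnedLockStore` is a hypothetical class for this example; it relies on `ConcurrentHashMap.remove(key, value)`, which removes the entry only when the stored value matches.

```java
import java.util.concurrent.ConcurrentHashMap;

class OwnedLockStore {
    private final ConcurrentHashMap<String, String> locks = new ConcurrentHashMap<>();

    // Store the client identifier as the lock value.
    boolean tryLock(String key, String clientId) {
        return locks.putIfAbsent(key, clientId) == null;
    }

    // Compare-and-delete in one atomic step: only the client whose
    // identifier is stored under the key can remove it.
    boolean unlock(String key, String clientId) {
        return locks.remove(key, clientId);
    }

    public static void main(String[] args) {
        OwnedLockStore store = new OwnedLockStore();
        store.tryLock("product-42", "ServerA");
        System.out.println(store.unlock("product-42", "ServerB")); // not the owner
        System.out.println(store.unlock("product-42", "ServerA")); // the owner
    }
}
```

The comparison and the removal happen as one step here. With a remote store, checking the value and deleting the key in two separate calls leaves a small window in between, which is why the two are usually combined into a single atomic operation.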

2.4 Reentrant lock

A reentrant lock is one that a client already holding it can acquire again inside its business logic without deadlocking.

Common scenario: After ServerA acquires the lock successfully, an internal call needs to acquire it again.

Solution: Make the distributed lock reentrant. When locking, check the existing lock: if the key and client ID are the same and the lock has not expired, extend the lock’s expiration time.
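
A minimal in-memory sketch of this rule, assuming a manually advanced clock. `ReentrantLeaseStore`, `Lease`, and `nowMillis` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ReentrantLeaseStore {
    private static final class Lease {
        final String clientId;
        long expiresAt;

        Lease(String clientId, long expiresAt) {
            this.clientId = clientId;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();
    long nowMillis = 0; // hypothetical manual clock, advanced by the caller

    synchronized boolean tryLock(String key, String clientId, long ttlMillis) {
        Lease lease = leases.get(key);
        if (lease == null || lease.expiresAt <= nowMillis) {
            // Free, or the old lease ran out: take the lock.
            leases.put(key, new Lease(clientId, nowMillis + ttlMillis));
            return true;
        }
        if (lease.clientId.equals(clientId)) {
            // Same client re-entering an unexpired lock: extend the expiration.
            lease.expiresAt = nowMillis + ttlMillis;
            return true;
        }
        return false; // held by a different client
    }

    synchronized long expiresAt(String key) {
        Lease lease = leases.get(key);
        return lease == null ? -1 : lease.expiresAt;
    }
}
```

This sketch only extends the lease on re-entry. It does not count how many times the lock was acquired, so an implementation that must pair every lock with an unlock would need a hold counter as well.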

3. How to implement a distributed lock?

Having explained why distributed locks are needed in a distributed environment and the problems they can run into, let’s talk about how to implement one.

As I see it, the idea behind implementing a distributed lock is to use a “middleware” that every instance can access, and whose properties can guarantee that the lock is exclusive, fault tolerant, and able to avoid deadlocks.

There are three common ways to implement distributed locks: database, Zookeeper, and Redis.

First, let me state the application scenario: after an order is placed successfully, only one payment channel is allowed to execute the payment successfully. We also assume that each of the three middleware options above is deployed as a single instance, not a cluster.

3.1 Distributed lock based on MySQL

The first solution is to implement the distributed lock through a database. Taking MySQL as an example, we create a distributed lock table with a unique index on the order number, which ensures that only one record per order can be inserted successfully.

CREATE TABLE `distributed_lock`  (
  `id` int(0) NOT NULL,
  `order_no` varchar(36) DEFAULT NULL COMMENT 'Order Number',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE INDEX `idx_order_no`(`order_no`) USING BTREE
) ENGINE = InnoDB;

The pseudocode for a client competing for the distributed lock looks like this:

public Integer lock(String orderNo) {
  // Query whether a distributed lock record already exists for this orderNo
  if (queryLockByOrderNo(orderNo)) {
    return null;
  }
  try {
    // Insert succeeded, i.e. the lock was acquired; return the primary key ID of the lock record
    return insertDistributedLock(orderNo);
  } catch (Exception e) {
    // Insert failed, i.e. the competition for the lock was lost
    return null;
  }
}
  • First, query by orderNo whether a distributed lock for the order exists. If it exists, the lock is already held; if not, enter the locking logic.
    • Note that the query here needs to use a current read to prevent phantom reads, such as select * from distributed_lock where order_no='xxx' for update;
  • The locking logic inserts a row into the distributed_lock table and returns the primary key ID of the lock record on success.

The pseudocode above shows the main idea of using a database to implement a distributed lock. Measured against the problems discussed earlier, this locking logic still leaves several questions open:

  • If the payment service throws an error, how do we make sure the lock is still released? (Distributed deadlock problem)
  • How do we make sure that a lock added by “me” can only be released by “me”? (Differentiating operations on different clients)
  • How do we make the lock reentrant?

In addition, the main bottleneck of implementing distributed locks through a database is performance:

  • Because the query uses a current read, it takes an exclusive lock, which affects performance and can cause blocking or even deadlocks.
  • If two requests try to acquire the distributed lock at the same time, then once one of them has executed the query statement, the other is blocked and its response is delayed. With a large number of requests, this is bound to affect the normal execution of other services.
  • Once the distributed_lock table grows to a certain size, query performance suffers, and extra logic is needed to remove stale locks.

For the above reasons, a distributed lock is generally not implemented using a database.

3.2 Distributed Lock Implementation Based on Zookeeper

Omitted.

Yes, you read that right. Because I know little about Zookeeper, I won’t embarrass myself here. I’ll leave a TODO mark and come back after filling out my Zookeeper skill tree.

You can get a preliminary understanding of how a Zookeeper-based distributed lock works from this article.

3.3 Distributed lock based on Redis

The third solution is to implement the distributed lock through Redis. This is how I implement distributed locks in my own project, and it is probably the most commonly used approach at present.

Note: I implement the distributed lock with org.springframework.data.redis.core.RedisTemplate, the utility that ships with Spring. The spring-boot-starter-data-redis dependency version is 2.3.5.RELEASE.

3.3.1 Version1 (Basic Distributed Lock)

Redis has a command called SETNX (SET if Not eXists), which sets the key only if it does not already exist and reports whether it succeeded. This feature can be used to make a distributed lock exclusive. Among the methods provided by RedisTemplate, setIfAbsent(K key, V value) uses the SETNX command, which gives us a basic distributed lock:

import javax.annotation.Resource;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.util.StringUtils; // assuming Spring's StringUtils

@Resource
private RedisTemplate<String, String> redisTemplate;

public boolean lock(String key, String value) {
  // setIfAbsent returns a Boolean that may be null, so compare against Boolean.TRUE
  return Boolean.TRUE.equals(redisTemplate.opsForValue().setIfAbsent(key, value));
}

public boolean unlock(String redisKey) {
  // Unlock fails if the key does not exist in Redis
  String redisValue = redisTemplate.opsForValue().get(redisKey);
  if (StringUtils.isEmpty(redisValue)) {
    return false;
  }
  // If the key exists, the unlock succeeds when the key is deleted successfully
  return Boolean.TRUE.equals(redisTemplate.delete(redisKey));
}

3.3.2 Version2 (Resolve distributed deadlocks)

Distributed deadlock is the most common problem with the basic version of the Redis-based distributed lock: if a client holds the lock but never releases it, because of an exception or a crash, the lock stays in Redis forever and every later attempt to lock that key fails.

As discussed earlier, the solution to distributed deadlocks is to add an expiration time to the lock.

RedisTemplate provides an overloaded method with an expiration time, setIfAbsent(K key, V value, long timeout, TimeUnit unit), which can be used to solve the distributed deadlock problem.

@Resource
private RedisTemplate<String, String> redisTemplate;

public boolean lock(String key, String value, long timeout, TimeUnit unit) {
  // The key is set together with its expiration time in a single call
  return Boolean.TRUE.equals(redisTemplate.opsForValue().setIfAbsent(key, value, timeout, unit));
}

3.3.3 Version3 (Solve automatic Unlocking)

Adding an expiration time to the distributed lock avoids deadlocks caused by program exceptions or crashes, but it also introduces a new problem: auto-unlock.

As discussed earlier, auto-unlock can be solved by adding automatic renewal to the distributed lock. For example, Redisson’s Watch Dog can renew keys automatically: it checks every 10 seconds and extends the expiration time of keys that are still held by the client.

However, this also introduces a new question: how should auto-unlock be handled for a distributed lock whose expiration time is shorter than 10 seconds? In any case, let’s solve the auto-unlock problem first.

I have no experience with Redisson, so I’m not going to teach you how to use it, but I think there are a few ways to approach the auto-unlock problem:

  • Set a proper expiration time. Run the business logic several times in the development environment beforehand, calculate the average execution time, and add a suitable margin to it.
  • Maintain a container of the distributed locks currently held on the client side, and use a separate thread to renew each lock at 1/3 of its expiration time.
    • This approach imitates Redisson, except that the automatic renewal is implemented in custom client-side code.
    • Of course, this approach has resource costs and raises performance and concurrency issues, but it is just one possible design.
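
The client-side container idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: `LockRenewer`, the `Extender` callback (a stand-in for "extend this key's expiration in Redis"), and the explicit `nowMillis` parameters that keep the example deterministic. It is not how Redisson implements its Watch Dog.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class LockRenewer {
    // Hypothetical callback: extend the key's expiration in the lock store.
    // It should return false when this client no longer holds the lock.
    interface Extender {
        boolean extend(String key, long ttlMillis);
    }

    private static final class Held {
        final long ttlMillis;
        long nextRenewAt;

        Held(long ttlMillis, long nextRenewAt) {
            this.ttlMillis = ttlMillis;
            this.nextRenewAt = nextRenewAt;
        }
    }

    private final Map<String, Held> held = new HashMap<>(); // locks this client holds
    private final Extender extender;

    LockRenewer(Extender extender) {
        this.extender = extender;
    }

    // Call after acquiring a lock; the first renewal is due at 1/3 of the expiration time.
    synchronized void register(String key, long ttlMillis, long nowMillis) {
        held.put(key, new Held(ttlMillis, nowMillis + ttlMillis / 3));
    }

    // Call when the lock is released so that it is no longer renewed.
    synchronized void unregister(String key) {
        held.remove(key);
    }

    // One pass of the renewal thread; returns how many locks were extended.
    synchronized int renewDue(long nowMillis) {
        int renewed = 0;
        Iterator<Map.Entry<String, Held>> it = held.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Held> entry = it.next();
            Held lock = entry.getValue();
            if (nowMillis < lock.nextRenewAt) {
                continue; // not due yet
            }
            if (extender.extend(entry.getKey(), lock.ttlMillis)) {
                lock.nextRenewAt = nowMillis + lock.ttlMillis / 3;
                renewed++;
            } else {
                it.remove(); // the lock is gone, stop renewing it
            }
        }
        return renewed;
    }
}
```

In a real client, a background thread (for example a `ScheduledExecutorService`) would call `renewDue(System.currentTimeMillis())` at a short fixed interval, and the business code would call `unregister` in a `finally` block right before unlocking.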

3.3.4 Version4 (Differentiating different client operations)

The auto-unlock problem is solved by setting a suitable expiration time for the distributed lock or by adding automatic renewal. But that’s not the end of it. One issue raised earlier remains unsolved: how to distinguish operations from different clients, so that a lock “I” add can only be actively unlocked by me.

The solution is to add a client ID to the value stored in Redis and check it when unlocking.

// This item can be set as a configuration item in the application configuration file
private final static String ServerSingle = "ServerA:";

public boolean lock(String key, String value, long timeout, TimeUnit unit) {
  // The stored value is ServerSingle + value, so the lock carries the client flag
  return Boolean.TRUE.equals(
      redisTemplate.opsForValue().setIfAbsent(key, ServerSingle + value, timeout, unit));
}

public boolean unlock(String redisKey) {
  String redisValue = redisTemplate.opsForValue().get(redisKey);
  // If there is no key in Redis, or the value does not carry this client's flag, the unlock fails
  if (StringUtils.isEmpty(redisValue) || !redisValue.startsWith(ServerSingle)) {
    return false;
  }
  // Note: get and delete are two separate commands, so this check is not atomic
  // The unlock succeeds only if the key is deleted successfully
  return Boolean.TRUE.equals(redisTemplate.delete(redisKey));
}

3.3.5 Version5 (Reentrant Lock)

This brings us to the last issue listed earlier: reentrant locks. A reentrant lock means that after an outer function acquires the lock, an inner function in the same thread that also contains locking code can acquire the lock again without deadlocking.

The code to turn the distributed lock into a reentrant lock is as follows:

// This item can be set as a configuration item in the application configuration file
private final static String ServerSingle = "ServerA:";

public boolean lock(String key, String value, long timeout, TimeUnit unit) {
  String lockValue = ServerSingle + value;
  if (Boolean.TRUE.equals(redisTemplate.opsForValue().setIfAbsent(key, lockValue, timeout, unit))) {
    return true;
  }
  // Failed to set the distributed lock. It may be held by another client or by this client
  String redisValue = redisTemplate.opsForValue().get(key);
  if (StringUtils.isEmpty(redisValue) || !redisValue.startsWith(ServerSingle)) {
    return false;
  }
  // Held by this client: extend the expiration. setIfPresent only writes if the key
  // still exists, so a lock that expired in the meantime is not silently re-created
  return Boolean.TRUE.equals(redisTemplate.opsForValue().setIfPresent(key, lockValue, timeout, unit));
}

public boolean unlock(String redisKey) {
  String redisValue = redisTemplate.opsForValue().get(redisKey);
  // If there is no key in Redis, or the value does not carry this client's flag, the unlock fails
  if (StringUtils.isEmpty(redisValue) || !redisValue.startsWith(ServerSingle)) {
    return false;
  }
  // The unlock succeeds only if the key is deleted successfully
  return Boolean.TRUE.equals(redisTemplate.delete(redisKey));
}
  • The client flag first rules out operations by other clients
  • Then the setIfPresent method ensures the lock is only extended if it has not yet expired

This version of the Redis-based distributed lock is the most complete one and basically solves all the problems mentioned above.

That concludes the discussion of distributed locks. If anything in the article is wrong, I sincerely hope readers will point it out; thank you in advance.

4. Review

  • Why do we need distributed locks?
  • Possible problems with distributed locks?
  • How to implement a distributed lock?
