This article walks through solutions for the seckill (flash sale) scenario.

What is seckill?

A “seckill” (flash sale) is a burst of a very large number of requests arriving within a very short period of time, which can easily crash the service or leave the data inconsistent.

Common examples include Taobao's Double 11 sale, ride-hailing drivers competing for orders, and ticket grabbing on 12306.

Reproducing the seckill overselling bug under high concurrency

Here is a quick example of a flash sale for a single product.

  1. Write the code following the normal logic: on each request, query the inventory first, deduct it if it is greater than 0, and then run the rest of the order logic;
/** Order service: checks the inventory, deducts it, and saves the order. */
@Service
public class GoodsOrderServiceImpl implements OrderService {

    @Autowired
    private GoodsDao goodsDao;

    @Autowired
    private OrderDao orderDao;

    /**
     * Places an order.
     *
     * @param goodsId the commodity ID
     * @param userId  the user ID
     * @return true if the order was placed, false if the item is sold out
     */
    @Override
    public boolean grab(int goodsId, int userId) {
        // Query inventory
        int stock = goodsDao.selectStock(goodsId);
        try {
            // Sleep for 2 seconds so that all concurrent requests get past the query before any of them updates, simulating a real flood of requests
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        // If the inventory is greater than 0, save the order
        if (stock > 0) {
            goodsDao.updateStock(goodsId, stock - 1);
            orderDao.insert(goodsId, userId);
            return true;
        }
        return false;
    }
}
@Service("grabNoLockService")
public class GrabNoLockServiceImpl implements GrabService {

    @Autowired
    OrderService orderService;

    /**
     * Purchase logic without any lock.
     *
     * @param goodsId the commodity ID
     * @param userId  the user ID
     * @return always null in this demo
     */
    @Override
    public String grabOrder(int goodsId, int userId) {
        try {
            System.out.println("User " + userId + " starts the purchase logic");
            boolean b = orderService.grab(goodsId, userId);
            if (b) {
                System.out.println("User " + userId + " purchase succeeded");
            } else {
                System.out.println("User " + userId + " purchase failed");
            }
        } finally {
            // No lock to release in this version
        }
        return null;
    }
}
  2. Set the inventory to 2;

  3. Start a 10-thread load test using JMeter.
  • Load test results
    • Remaining inventory: 1

    • Orders placed: 10

Something is wrong, and it is a big problem!

There were 2 items in stock. Now 1 is still left, yet 10 seckill orders succeeded. That is a serious overselling problem!

Problem analysis

The cause is actually very simple. When the seckill starts, 10 requests arrive at the same time and query the inventory at the same time. Each of them sees inventory = 2, writes the inventory back as 1, and reports success. In total 10 orders are created, but the inventory only drops by 1.
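The race described above can be reproduced without a database. The following is a minimal sketch, not the article's project code: `OversellDemo` is a hypothetical class, two `AtomicInteger` counters stand in for the goods and order tables, and a `CountDownLatch` plays the role of the 2-second sleep by making every request read the inventory before any of them writes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for the goods and order tables.
class OversellDemo {

    /** Runs `buyers` unlocked purchase attempts; returns {remaining stock, orders created}. */
    static int[] run(int initialStock, int buyers) {
        AtomicInteger stock = new AtomicInteger(initialStock); // the "goods" row
        AtomicInteger orders = new AtomicInteger();            // rows in the "order" table
        // Plays the role of Thread.sleep(2000): nobody writes until everybody has read.
        CountDownLatch allHaveRead = new CountDownLatch(buyers);
        Thread[] threads = new Thread[buyers];

        for (int i = 0; i < buyers; i++) {
            threads[i] = new Thread(() -> {
                int seen = stock.get();      // 1. query the inventory
                allHaveRead.countDown();
                try {
                    allHaveRead.await();     // 2. wait until all requests have queried
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (seen > 0) {              // 3. check-then-act on a stale value
                    stock.set(seen - 1);     //    every thread writes the same value
                    orders.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] {stock.get(), orders.get()};
    }

    public static void main(String[] args) {
        int[] r = run(2, 10);
        System.out.println("stock=" + r[0] + " orders=" + r[1]); // prints "stock=1 orders=10"
    }
}
```

With an inventory of 2 and 10 buyers, every thread sees 2 and writes 1, which matches the load test result: 1 item left and 10 orders.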

So how do we solve this problem? That is also fairly simple: add a lock.

Solution in single-machine mode

Adding a JVM lock

First, in single-machine mode there is only one service instance, so a JVM-level lock is enough. Either synchronized or Lock can be used.

@Service("grabJvmLockService")
public class GrabJvmLockServiceImpl implements GrabService {

    @Autowired
    OrderService orderService;

    /**
     * Purchase logic protected by a JVM lock.
     *
     * @param goodsId the commodity ID
     * @param userId  the user ID
     * @return always null in this demo
     */
    @Override
    public String grabOrder(int goodsId, int userId) {
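        String lock = (goodsId + "");

        // intern() returns the canonical String instance, so all requests for the same goodsId lock on the same object
        synchronized (lock.intern()) {
            try {
                System.out.println("User " + userId + " starts the purchase logic");
                boolean b = orderService.grab(goodsId, userId);
                if (b) {
                    System.out.println("User " + userId + " purchase succeeded");
                } else {
                    System.out.println("User " + userId + " purchase failed");
                }
            } finally {
                // The monitor is released automatically when the synchronized block exits
            }
        }
        return null;
    }
}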

Here synchronized is used as the example. After adding the lock, restore the inventory and run the load test again. The results are as follows:

  • Load test results
    • Remaining inventory: 0

    • Orders placed: 2

Done!
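The text mentions that Lock can be used instead of synchronized. Below is a minimal sketch of that variant, under stated assumptions: `JvmLockDemo` is a hypothetical class, the inventory and order count are plain in-memory fields rather than DAO calls, and one `ReentrantLock` per goods ID is kept in a `ConcurrentHashMap`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory service; the fields stand in for the goods and order tables.
class JvmLockDemo {
    // One lock per goods ID; computeIfAbsent creates each lock exactly once.
    private final ConcurrentHashMap<Integer, Lock> locks = new ConcurrentHashMap<>();
    private int stock;   // only touched while holding the lock
    private int orders;  // only touched while holding the lock

    JvmLockDemo(int initialStock) {
        this.stock = initialStock;
    }

    boolean grab(int goodsId) {
        Lock lock = locks.computeIfAbsent(goodsId, id -> new ReentrantLock());
        lock.lock();
        try {
            // Query and deduct now happen as one step with respect to other threads
            if (stock > 0) {
                stock--;
                orders++;
                return true;
            }
            return false;
        } finally {
            lock.unlock(); // always release, even if the business logic throws
        }
    }

    /** Runs `buyers` concurrent purchases of goods 1; returns {remaining stock, orders}. */
    static int[] run(int initialStock, int buyers) {
        JvmLockDemo demo = new JvmLockDemo(initialStock);
        Thread[] threads = new Thread[buyers];
        for (int i = 0; i < buyers; i++) {
            threads[i] = new Thread(() -> demo.grab(1));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join also makes the workers' writes visible here
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] {demo.stock, demo.orders};
    }

    public static void main(String[] args) {
        int[] r = run(2, 10);
        System.out.println("stock=" + r[0] + " orders=" + r[1]); // prints "stock=0 orders=2"
    }
}
```

Compared with synchronized on an interned string, an explicit lock map avoids depending on the string pool, at the cost of keeping one lock object per goods ID in memory.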

Do JVM locks still work in cluster mode?

The issue is solved in single-machine mode, but do JVM-level locks still work in cluster mode?

Deploy two service instances behind a gateway layer that does load balancing, and run the load test again:

  • Load test results

    • Remaining inventory: 0

    • Orders placed: 4

The answer is obvious: the lock does not work!

Solution in cluster mode

Problem analysis

The reason is that a JVM-level lock is a different lock object in each of the two services. Each service acquires its own lock and sells on its own, so the two are not mutually exclusive.
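This failure can be simulated inside a single process. The sketch below rests on a few assumptions: `TwoJvmDemo` is a hypothetical class, two plain lock objects stand in for the locks of the two service instances, the shared `AtomicInteger` stands in for the database row, and a `CyclicBarrier` replaces the 2-second sleep so that one request from each instance reads the inventory before either writes it.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical single-process simulation of two service instances sharing one database.
class TwoJvmDemo {

    /** Returns {remaining stock, orders created} for an inventory of 2 and 4 requests. */
    static int[] run() {
        AtomicInteger stock = new AtomicInteger(2);  // shared database row
        AtomicInteger orders = new AtomicInteger();
        // Stands in for the sleep: one request per instance reads before either writes.
        CyclicBarrier bothHaveRead = new CyclicBarrier(2);
        Object lockA = new Object();                 // lock that lives in JVM A
        Object lockB = new Object();                 // lock that lives in JVM B
        Object[] routedTo = {lockA, lockA, lockB, lockB}; // gateway sends 2 requests to each

        Thread[] threads = new Thread[routedTo.length];
        for (int i = 0; i < threads.length; i++) {
            Object jvmLock = routedTo[i];
            threads[i] = new Thread(() -> {
                synchronized (jvmLock) {             // excludes only requests in the same JVM
                    int seen = stock.get();
                    try {
                        bothHaveRead.await();
                    } catch (InterruptedException | BrokenBarrierException e) {
                        throw new IllegalStateException(e);
                    }
                    if (seen > 0) {
                        stock.set(seen - 1);
                        orders.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] {stock.get(), orders.get()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("stock=" + r[0] + " orders=" + r[1]); // prints "stock=0 orders=4"
    }
}
```

Each lock serializes the requests inside its own instance, but one request from A and one from B always run side by side, so an inventory of 2 produces 4 orders, the same numbers as in the load test.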

So what can we do? It is easy: pull the lock out of the services so that both of them contend for the same lock, that is, a distributed lock.

Distributed locks

A distributed lock is a way to control synchronized access to shared resources across a distributed system.

The parts of a distributed system often need to coordinate their actions. If different systems, or different hosts of the same system, share one resource or a group of resources, access to those resources must be mutually exclusive to prevent interference and keep the data consistent. This is where distributed locks are needed.

Common ways to implement a distributed lock include MySQL, Redis, and ZooKeeper.

Distributed lock – MySQL

With MySQL, a dedicated table serves as the lock.

  • To lock, insert the ID of the item being purchased into the table as the primary key or under a unique index. Any other thread that tries to lock the same item fails to insert, which guarantees mutual exclusion;

  • To unlock, delete the record; other threads can then acquire the lock.

Part of the code written for this scheme:

  • The lock
/** Distributed lock implemented on top of MySQL */
@Service
@Data
public class MysqlLock implements Lock {

    @Autowired
    private GoodsLockDao goodsLockDao;

    private ThreadLocal<GoodsLock> goodsLockThreadLocal;

    @Override
    public void lock() {
        // Keep trying until the insert succeeds; a loop avoids the unbounded recursion of calling lock() again
        while (!tryLock()) {
            // Back off briefly before the next attempt
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        System.out.println("Lock acquired");
    }

    /** Non-blocking lock: returns true on success and false on failure, without waiting. */
    @Override
    public boolean tryLock() {
        try {
            GoodsLock goodsLock = goodsLockThreadLocal.get();
            goodsLockDao.insert(goodsLock);
            System.out.println("Lock object:" + goodsLockThreadLocal.get());
            return true;
        } catch (Exception e) {
            // The insert fails when the row already exists, i.e. someone else holds the lock
            return false;
        }
    }

    @Override
    public void unlock() {
        goodsLockDao.delete(goodsLockThreadLocal.get().getGoodsId());
        System.out.println("Unlock object:" + goodsLockThreadLocal.get());
        goodsLockThreadLocal.remove();
    }

    @Override
    public void lockInterruptibly() throws InterruptedException {
        // TODO Auto-generated method stub

    }

    @Override
    public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
        // TODO Auto-generated method stub
        return false;
    }
    
    @Override
    public Condition newCondition() {
        // TODO Auto-generated method stub
        return null;
    }
}
  • Purchase logic
@Service("grabMysqlLockService")
public class GrabMysqlLockServiceImpl implements GrabService {

    @Autowired
    private MysqlLock lock;
    
    @Autowired
    OrderService orderService;

    ThreadLocal<GoodsLock> goodsLock = new ThreadLocal<>();

    @Override
    public String grabOrder(int goodsId, int userId) {
        // Build the lock record (the key)
        GoodsLock gl = new GoodsLock();
        gl.setGoodsId(goodsId);
        gl.setUserId(userId);
        goodsLock.set(gl);
        lock.setGoodsLockThreadLocal(goodsLock);

        // lock
        lock.lock();

        // Perform business
        try {
            System.out.println("User " + userId + " starts the purchase logic");

            boolean b = orderService.grab(goodsId, userId);
            if (b) {
                System.out.println("User " + userId + " purchase succeeded");
            } else {
                System.out.println("User " + userId + " purchase failed");
            }
        } finally {
            // Release the lock
            lock.unlock();
        }
        return null;
    }
}

After restoring the inventory, the load test was run again, and the results matched expectations.

  • Load test results
    • Remaining inventory: 0

    • Orders placed: 2

Problems and solutions

  1. What if the lock is never released, for example because of a sudden network disconnection?

A: Add start-time and end-time fields to the lock table as the lock's validity period. If the lock is not released in time for whatever reason, the validity period tells you whether it is still valid.

  2. What if the lock expires before the thread has finished its task?

A: Introduce a Watch Dog mechanism that keeps renewing the lock until the task ends. This is explained later.

Distributed lock – Redis

The MySQL solution works for small and medium-sized projects, and for large projects too if the MySQL setup is scaled up, but Redis is the most commonly used option.

The Redis lock is implemented with the setnx command, in the form setnx key value.

setnx stands for “SET if Not eXists”: if the key does not exist, it is set to value; if the key already exists, nothing is done.

  • Lock: setnx key value
  • Unlock: del key

Redis distributed lock – the deadlock problem

Cause

The service holding the lock crashes during execution. The lock is never released and stays in Redis, so no other service can acquire it.

Solution

Set an expiration time on the key so that it expires automatically. Once it has expired, the key no longer exists and other services can acquire the lock.

  • Note that when adding the expiration time, you can't do it this way:
setnx key value;
expire key time_in_second;

With this approach the service can still crash after setnx succeeds but before the expiration time is set, which again results in a deadlock.

  • The correct solution is to lock and set the expiration time in a single command, in the following format:
set key value nx ex time_in_second;

This form is supported since Redis 2.6.12; older versions of Redis can use a Lua script.

Problems caused by the expiration time

Problem 1: Suppose the lock expiration time is 10 seconds. Service 1 acquires the lock but does not finish within 10 seconds, so the lock expires. Service 2 can now acquire the lock as well, and both services hold the lock at the same time.

Problem 2: When service 1 finishes after 14 seconds and releases the lock, it actually releases the lock held by service 2. Service 3 can then acquire the lock as well.

Solution

Problem 2 is easy to solve: when releasing the lock, check whether it is your own. If it is, release it; if not, skip the release.
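The owner check can be sketched as follows. This is an in-memory illustration, not Redis client code: `OwnedLockDemo`, its `Entry` class, and the manually advanced clock are all hypothetical. `tryLock` mimics a set with nx and an expiration time, and `unlock` deletes the key only if the stored owner matches. With a real Redis, the comparison and the delete have to run atomically, which is commonly done with a Lua script.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for Redis keys with an expiration time.
class OwnedLockDemo {
    private static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<String, Entry> store = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong(); // fake clock in ms, advanced by hand

    void advance(long millis) {
        clock.addAndGet(millis);
    }

    /** Sets the key only if it is absent or expired, together with its expiration time. */
    boolean tryLock(String key, String owner, long ttlMillis) {
        long now = clock.get();
        Entry fresh = new Entry(owner, now + ttlMillis);
        Entry stored = store.compute(key,
                (k, old) -> (old == null || old.expiresAt <= now) ? fresh : old);
        return stored == fresh;
    }

    /** Releases the lock only if it still belongs to `owner`. */
    boolean unlock(String key, String owner) {
        Entry e = store.get(key);
        // remove(key, e) deletes only if the mapping is still this exact entry
        return e != null && e.owner.equals(owner) && store.remove(key, e);
    }

    boolean isHeldBy(String key, String owner) {
        Entry e = store.get(key);
        return e != null && e.owner.equals(owner) && e.expiresAt > clock.get();
    }

    /** Replays problem 2 from the text with a 10-second expiration time. */
    static boolean[] demo() {
        OwnedLockDemo redis = new OwnedLockDemo();
        String key = "goods:1";
        boolean s1Locked = redis.tryLock(key, "service-1", 10_000);
        redis.advance(11_000);                        // service 1's lock has expired
        boolean s2Locked = redis.tryLock(key, "service-2", 10_000);
        redis.advance(3_000);                         // t = 14 s: service 1 finishes
        boolean s1Released = redis.unlock(key, "service-1");
        boolean s2StillHolds = redis.isHeldBy(key, "service-2");
        boolean s3Locked = redis.tryLock(key, "service-3", 10_000);
        return new boolean[] {s1Locked, s2Locked, s1Released, s2StillHolds, s3Locked};
    }
}
```

In the replay, service 1's late unlock is rejected, service 2 keeps its lock, and service 3 cannot get in, so problem 2 no longer occurs. Problem 1 is still present: service 2 was able to lock while service 1 was still working.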

Solution to problem 1: the Watch Dog mechanism

While the main thread is still executing the business logic, a child thread (the watchdog) renews the lock every one third of the expiration time, so the lock cannot expire before the main thread has finished.

  • Implementation of the Watch Dog mechanism
@Service
public class RenewGrabLockServiceImpl implements RenewGrabLockService {

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    @Override
    @Async
    public void renewLock(String key, String value, int time) {
        System.out.println("Renewing " + key + " " + value);
        int sleepTime = time / 3;
        String v = redisTemplate.opsForValue().get(key);
        // Keep renewing for as long as the key still holds our value (a loop instead of unbounded recursion)
        while (StringUtils.isNotBlank(v) && v.equals(value)) {
            try {
                Thread.sleep(sleepTime * 1000L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // Push the expiration time back to the full TTL
            redisTemplate.expire(key, time, TimeUnit.SECONDS);
            v = redisTemplate.opsForValue().get(key);
        }
    }
}
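The same renewal idea can also be sketched with a ScheduledExecutorService and no Redis at all. Everything here is an assumption made for illustration: `WatchdogDemo` is a hypothetical class, the lock store is an in-memory map with expiration timestamps, and `Thread.sleep` stands in for the business logic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory lock store with a watchdog that renews every ttl / 3.
class WatchdogDemo {
    private static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<String, Entry> store = new ConcurrentHashMap<>();

    boolean tryLock(String key, String owner, long ttlMillis) {
        long now = System.currentTimeMillis();
        Entry fresh = new Entry(owner, now + ttlMillis);
        return store.compute(key,
                (k, old) -> (old == null || old.expiresAt <= now) ? fresh : old) == fresh;
    }

    /** Extends the expiration time only if the key is still alive and still ours. */
    boolean renew(String key, String owner, long ttlMillis) {
        long now = System.currentTimeMillis();
        Entry e = store.computeIfPresent(key, (k, old) ->
                (old.owner.equals(owner) && old.expiresAt > now)
                        ? new Entry(owner, now + ttlMillis) : old);
        return e != null && e.owner.equals(owner) && e.expiresAt > now;
    }

    boolean isHeldBy(String key, String owner) {
        Entry e = store.get(key);
        return e != null && e.owner.equals(owner) && e.expiresAt > System.currentTimeMillis();
    }

    void unlock(String key, String owner) {
        store.computeIfPresent(key, (k, old) -> old.owner.equals(owner) ? null : old);
    }

    /** Runs the simulated work under the lock; returns true if the lock was still ours at the end. */
    boolean runLocked(String key, String owner, long ttlMillis, long workMillis, boolean withWatchdog) {
        if (!tryLock(key, owner, ttlMillis)) {
            return false;
        }
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        if (withWatchdog) {
            long period = ttlMillis / 3;
            watchdog.scheduleAtFixedRate(() -> renew(key, owner, ttlMillis),
                    period, period, TimeUnit.MILLISECONDS);
        }
        try {
            Thread.sleep(workMillis);       // simulated business logic
            return isHeldBy(key, owner);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            watchdog.shutdownNow();         // stop renewing once the work is done
            unlock(key, owner);
        }
    }
}
```

For example, `new WatchdogDemo().runLocked("goods:1", "svc-1", 600, 1500, true)` keeps the lock for the whole 1.5 seconds of work, while the same call with a shorter TTL and `withWatchdog` set to false loses it once the TTL has passed.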

Single Redis node failure

If Redis goes down during execution, no service can acquire the lock any more. This is the single-point-of-failure problem.

Solution

Use multiple Redis instances.

First, a question: can the multiple Redis instances be set up as master and slave?

The Redis master-slave problem

Suppose a thread acquires the lock, but the Master node goes down before the key has been replicated. The Slave node does not have the key, so another service can still acquire the lock.

Therefore, the master-slave scheme cannot be used.

Another option is Redlock.

Redlock

The Redlock scheme also uses multiple Redis instances, but they are unrelated to one another: each is an independent Redis.

To lock, acquire the lock on one Redis and then immediately move on to the next. If the lock is acquired on more than half of the Redis instances, locking succeeds; otherwise it fails.
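The majority rule can be sketched as below. This is a simplified illustration, not a Redlock implementation: `RedlockSketch` and its `Node` class are hypothetical in-memory stand-ins for independent Redis instances, and expiration times and the elapsed-time check of the real algorithm are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "lock on more than half of the nodes" rule.
class RedlockSketch {
    /** Stand-in for one independent Redis instance. */
    static final class Node {
        private final ConcurrentHashMap<String, String> keys = new ConcurrentHashMap<>();
        volatile boolean up = true;

        boolean setIfAbsent(String key, String owner) {
            return up && keys.putIfAbsent(key, owner) == null;
        }

        void deleteIfOwner(String key, String owner) {
            if (up) {
                keys.remove(key, owner); // removes only if the value is still `owner`
            }
        }

        boolean isEmpty() {
            return keys.isEmpty();
        }
    }

    static boolean tryLock(List<Node> nodes, String key, String owner) {
        int quorum = nodes.size() / 2 + 1;   // strictly more than half
        int acquired = 0;
        for (Node node : nodes) {
            if (node.setIfAbsent(key, owner)) {
                acquired++;
            }
        }
        if (acquired >= quorum) {
            return true;
        }
        unlock(nodes, key, owner);           // roll back a partial acquisition
        return false;
    }

    static void unlock(List<Node> nodes, String key, String owner) {
        for (Node node : nodes) {
            node.deleteIfOwner(key, owner);
        }
    }

    /** Builds `size` nodes, the first `down` of which are unreachable. */
    static List<Node> cluster(int size, int down) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            Node node = new Node();
            node.up = i >= down;
            nodes.add(node);
        }
        return nodes;
    }

    static boolean[] demo() {
        List<Node> twoDown = cluster(5, 2);
        boolean first = tryLock(twoDown, "goods:1", "svc-1");   // 3 of 5: success
        boolean second = tryLock(twoDown, "goods:1", "svc-2");  // 0 of 5: failure
        List<Node> threeDown = cluster(5, 3);
        boolean third = tryLock(threeDown, "goods:1", "svc-1"); // 2 of 5: failure
        boolean rolledBack = threeDown.get(4).isEmpty();        // partial locks removed
        return new boolean[] {first, second, third, rolledBack};
    }
}
```

With 5 nodes the quorum is 3, so the lock survives two failed nodes but cannot be acquired when three are down.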

Can Redlock still oversell?

Yes!

Suppose the operations team is very diligent and has automated everything, so a crashed Redis is restarted immediately. The restarted Redis no longer has the key that was locked before, so another thread can still acquire the lock, and two threads end up holding the lock at the same time.

  • Solution: delay restarting the crashed Redis. Starting it a day late is not a problem; restarting it too soon is.
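The restart scenario can be replayed with three in-memory nodes. As before, this is only a sketch under assumptions: `RedlockRestartDemo` and `Node` are hypothetical, the nodes have no persistence, and expiration times are ignored.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical replay of "a crashed Redis restarts immediately and has lost its keys".
class RedlockRestartDemo {
    static final class Node {
        private final ConcurrentHashMap<String, String> keys = new ConcurrentHashMap<>();
        volatile boolean up = true;

        boolean setIfAbsent(String key, String owner) {
            return up && keys.putIfAbsent(key, owner) == null;
        }

        /** No persistence in this sketch: a restart comes back with no keys at all. */
        void crashAndRestartImmediately() {
            keys.clear();
        }
    }

    static boolean tryLock(Node[] nodes, String key, String owner) {
        int acquired = 0;
        for (Node node : nodes) {
            if (node.setIfAbsent(key, owner)) {
                acquired++;
            }
        }
        return acquired >= nodes.length / 2 + 1;
    }

    /** Returns {service 1 locked, service 2 locked}. */
    static boolean[] run() {
        Node[] nodes = {new Node(), new Node(), new Node()};
        nodes[2].up = false;                              // node 2 is unreachable at first
        boolean first = tryLock(nodes, "goods:1", "svc-1");   // nodes 0 and 1: 2 of 3
        nodes[2].up = true;                               // node 2 comes back, empty
        nodes[1].crashAndRestartImmediately();            // node 1 forgets svc-1's key
        boolean second = tryLock(nodes, "goods:1", "svc-2");  // nodes 1 and 2: 2 of 3
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "true true": both hold the lock
    }
}
```

If the crashed node stayed down until the keys it used to hold would have expired anyway, the second service could only reach one fresh node and would not get a majority.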

The final problem

Is the solution perfect now?

No!

The program is running, the lock is held, and the watchdog keeps renewing it. Everything looks fine, but Java has one last problem: STW (Stop The World).

When a Full GC occurs, the JVM stops the world (STW). The main thread executing the task is paused, and so is the watchdog that renews the lock, so the lock in Redis gradually runs out. Once it has expired, other JVMs can acquire the lock, and the old problem is back: two services hold the lock at the same time.

Solution
  • Option 1: the ostrich algorithm (ignore the problem)

  • Option 2, the final solution: ZooKeeper + MySQL optimistic lock

Distributed lock – ZooKeeper + MySQL optimistic lock

How does ZooKeeper solve the STW problem?

  • When locking, create an ephemeral sequential node in ZooKeeper. ZooKeeper generates a sequence number, which is saved in the version field in MySQL for later verification.

    • If an STW pause occurs before the lock is released and the lock expires, the version field in MySQL is changed by whoever acquires the lock next.
  • When unlocking, verify that the version field still holds the value written when the lock was acquired.

    • If it does, delete the node and release the lock.
    • If it does not, the holder was paused (in a coma) in the meantime, and the operation fails.
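The version check can be sketched in memory as follows. The assumptions are stated plainly: `FencingDemo` is a hypothetical class, an `AtomicLong` stands in for the sequence numbers ZooKeeper would hand out, the fields stand in for one MySQL row, and the version check is folded into the stock update in the style of an optimistic-lock UPDATE.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory model of one goods row guarded by a version (sequence number).
class FencingDemo {
    private final AtomicLong sequence = new AtomicLong(); // stand-in for ZooKeeper sequence numbers
    private int stock;     // "stock" column
    private long version;  // "version" column: sequence number of the current lock holder

    FencingDemo(int initialStock) {
        this.stock = initialStock;
    }

    /** Lock: take the next sequence number and record it in the version field. */
    synchronized long acquire() {
        long seq = sequence.incrementAndGet();
        version = seq;
        return seq;
    }

    /**
     * Optimistic update: succeeds only if the caller's sequence number is still the
     * one stored in the row, roughly "UPDATE ... WHERE version = ? AND stock > 0".
     */
    synchronized boolean deductStock(long seq) {
        if (version != seq || stock <= 0) {
            return false;
        }
        stock--;
        return true;
    }

    synchronized int stock() {
        return stock;
    }

    public static void main(String[] args) {
        FencingDemo row = new FencingDemo(1);
        long a = row.acquire();   // service A locks, then is frozen by an STW pause
        long b = row.acquire();   // A's lock expires; service B locks and changes the version
        System.out.println(row.deductStock(b)); // prints "true": B holds the current version
        System.out.println(row.deductStock(a)); // prints "false": A wakes up with a stale version
        System.out.println(row.stock());        // prints "0"
    }
}
```

Even though A still believes it holds the lock when it wakes up, its write is rejected because the version in the row has moved on, so the single item is sold only once.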

The world became quiet.

Related code

  • gitee: distributed-lock

Reference

Mashibing Education (马士兵教育)