preface

In day-to-day development we use a database for storage, but when a large number of concurrent requests arrives at once — for example a seckill (flash sale) or a burst of requests for hot data — sending every request straight to the database forces it to perform tens of thousands of disk reads and writes in a very short time, which can easily bring the database down.

At this point we introduce a caching layer to absorb most of the requests and relieve the pressure on the database. But introducing a cache layer often brings problems of its own: cache breakdown, cache penetration, and cache avalanche.

This article uses Redis as an example to simulate and solve all three problems.

Cache breakdown

Cache breakdown occurs when a piece of data exists in the database but not in the cache (usually because its cache entry has just expired), and a large number of concurrent users read the cache at the same moment, all miss, and all go to the database for the data. If the code implements no synchronization mechanism, this burst of requests lands directly on the database and puts it under real pressure.

Simulating the requirement

Simulated requirement: a seckill event is about to start; 10,000 simultaneous requests are simulated, each fetching the detail of the same item.

Expectation: only one request reaches the database; all other requests are served by Redis or another cache.

A flawed example

Let's start with a flawed example. The code below does no synchronization at all and simulates the scenario above.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * @author HeyS1
 * @date 2020/3/12
 */
public class ConcurrentTest {
    // Number of requests
    private int reqestQty = 10000;
    // Counts down once per request; the main thread resumes after reqestQty requests have finished
    private CountDownLatch latch = new CountDownLatch(reqestQty);

    // Number of times a request landed on the database
    private AtomicInteger dbSelectCount = new AtomicInteger();
    // Number of times a request landed on the cache
    private AtomicInteger cacheSelectCount = new AtomicInteger();
    // A HashMap simulates the cache store
    private Map<String, String> cache = new HashMap<>();

    public static void main(String[] args) {
        new ConcurrentTest().go();
    }


    private void go() {
        // Start 10,000 threads at the same time
        for (int i = 0; i < reqestQty; i++) {
            new Thread(() -> {
                this.getGoodsDetail("Product id");
                latch.countDown();
            }).start();
        }


        // await() blocks until the counter reaches 0
        try {
            latch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("Database query times: " + dbSelectCount.get());
        System.out.println("Cache query times: " + cacheSelectCount.get());
    }


    /**
     * Obtain product data
     *
     * @param key the product ID
     * @return the product data
     */
    public String getGoodsDetail(String key) {
        // Query the cache first
        String data = this.selectCache(key);
        if (data != null) {
            return data;
        }

        // On a miss, query the database and put the result into the cache
        data = this.selectDB(key);
        cache.put(key, data);
        return data;
    }


    /**
     * Get data from the cache
     *
     * @param key the product ID
     * @return the cached data, or null on a miss
     */
    public String selectCache(String key) {
        cacheSelectCount.addAndGet(1); // record the hit count

        System.out.println(Thread.currentThread().getId() + " fetching from cache ====");
        return cache.get(key);
    }

    /**
     * Get data from the database
     *
     * @param key the product ID
     * @return the product data
     */
    public String selectDB(String key) {
        sleep(100); // simulate a 100 ms database query
        dbSelectCount.addAndGet(1); // record the hit count

        System.out.println(Thread.currentThread().getId() + " fetching from DB ====");
        return "Data of data";
    }

    private static void sleep(long m) {
        try {
            Thread.sleep(m);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
Result:
Database query times: 202
Cache query times: 10000

As you can see, when the getGoodsDetail method does nothing to synchronize, a couple of hundred requests still go straight to the database — and that is cache breakdown.


Solution 1: synchronized

The simplest fix is a synchronized block, but it has a drawback: in a distributed system or cluster, there is no way to synchronize across nodes. That rules it out for anything that must be strictly correct cluster-wide, such as preventing oversold inventory in a seckill. For merely querying product details, though, it is usually good enough — it depends on the business.

We only need to modify the getGoodsDetail method from the example above:

    /**
     * Obtain product data
     *
     * @param key the product ID
     * @return the product data
     */
    public String getGoodsDetail(String key) {
        // Query the cache first
        String data = this.selectCache(key);
        if (data != null) {
            return data;
        }
        // Synchronized block
        synchronized (this) {
            // Query the cache again: threads that were waiting to enter the
            // block must not fall through to the database once it is populated
            data = this.selectCache(key);
            if (data != null) {
                return data;
            }

            // On a miss, query the database and put the result into the cache
            data = this.selectDB(key);
            cache.put(key, data);
            return data;
        }
    }
Result:
Database query times: 1
Cache query times: 10276

Solution 2: Redis distributed locks

This scheme suits a cluster or distributed architecture. It also works on a single machine, but there it adds nothing over synchronized.

There are three common ways to implement a distributed lock; this article uses Redis:

  1. An optimistic lock in the database
  2. A distributed lock based on Redis
  3. A distributed lock based on ZooKeeper

For a distributed lock to be reliable, the implementation must satisfy at least the following four conditions:

  1. Mutual exclusion. At any given time, only one client can hold the lock.
  2. No deadlocks. Even if a client crashes while holding the lock and never unlocks it, other clients can still acquire the lock later.
  3. Fault tolerance. Clients can lock and unlock as long as most Redis nodes are running properly.
  4. Whoever locks must unlock. Locking and unlocking must be done by the same client; a client must not release someone else's lock. Suppose thread C1 acquires the lock, but the lock expires before C1 finishes its work, and C2 then acquires it. When C1 finishes and releases the lock, the lock it releases now belongs to C2, which is still working and suddenly has no lock protecting it. C2 can in turn release C3's lock, and so on, causing serious problems.

Complete example:

RedisTemplate is used here; integrate Redis with Spring before running this example.


import java.time.Duration;
import java.util.Collections;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.test.context.junit4.SpringRunner;

import lombok.extern.slf4j.Slf4j;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = App.class)
@Slf4j
public class ConcurrentTest3 {

    @Autowired
    RedisTemplate<String, String> redisTemplate;

    // Number of requests
    private int reqestQty = 10000;
    // Counts down once per request; the test thread resumes after reqestQty requests have finished
    private CountDownLatch latch = new CountDownLatch(reqestQty);

    // Number of times a request landed on the database
    private AtomicInteger dbSelectCount = new AtomicInteger();
    // Number of times a request landed on the cache
    private AtomicInteger cacheSelectCount = new AtomicInteger();


    @Test
    public void go() {
        // Start 10,000 threads at the same time
        for (int i = 0; i < reqestQty; i++) {
            new Thread(() -> {
                this.getGoodsDetail("Product id");
                latch.countDown();
            }).start();
        }

        // await() blocks until the counter reaches 0
        try {
            latch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("Database query times: " + dbSelectCount.get());
        System.out.println("Cache query times: " + cacheSelectCount.get());
    }


    public String getGoodsDetail(String key) {
        // Query the cache first
        String data = this.selectCache(key);
        if (data != null) {
            return data;
        }

        // requestId ensures that client A cannot release a lock held by client B
        String lockKey = "The Key lock";
        String requestId = "Request client ID";

        // Acquire the lock
        if (!this.lock(lockKey, requestId, 10)) {
            // Failing to lock means another thread holds it: wait briefly and retry
            sleep(100);
            return this.getGoodsDetail(key);
        }

        // Lock acquired
        try {
            // Query the cache again: a thread that waited for the lock must not
            // hit the database once the cache has been populated
            data = this.selectCache(key);
            if (data != null) {
                return data;
            }

            // Query the database and put the result into the cache
            data = this.selectDB(key);
            redisTemplate.opsForValue().set(key, data, 60, TimeUnit.SECONDS);
            return data;
        } finally {
            // Release the lock (also on the early return above)
            this.unLock(lockKey, requestId);
        }
    }


    /**
     * Mutual exclusion via Redis SET NX
     *
     * @param lockKey
     * @param requestId
     * @param expireTime lock expiry in seconds; if the holder never unlocks,
     *                   the lock releases itself and a deadlock is avoided
     * @return true if the lock was acquired
     */
    public boolean lock(String lockKey, String requestId, int expireTime) {
        Boolean res = redisTemplate.opsForValue().setIfAbsent(lockKey, requestId, Duration.ofSeconds(expireTime));
        return res != null && res;
    }

    /**
     * Release the lock; a Lua script makes the check-and-delete atomic
     *
     * @param lockKey
     * @param requestId
     * @return true if the lock was released
     */
    public boolean unLock(String lockKey, String requestId) {
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";
        DefaultRedisScript<Boolean> redisScript = new DefaultRedisScript<>(script, Boolean.class);
        Boolean res = redisTemplate.execute(redisScript, Collections.singletonList(lockKey), requestId);
        return res != null && res;
    }


    /**
     * Get data from the cache
     *
     * @param key
     * @return the cached data, or null on a miss
     */
    public String selectCache(String key) {
        cacheSelectCount.addAndGet(1); // record the hit count

        System.out.println(Thread.currentThread().getId() + " fetching from cache ====");
        return redisTemplate.opsForValue().get(key);
    }

    /**
     * Get data from the database
     *
     * @param key
     * @return the product data
     */
    public String selectDB(String key) {
        sleep(100); // simulate a 100 ms database query
        dbSelectCount.addAndGet(1); // record the hit count

        System.out.println(Thread.currentThread().getId() + " fetching from DB ====");
        return "Data of data";
    }

    private static void sleep(long m) {
        try {
            Thread.sleep(m);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
Result:
Database query times: 1
Cache query times: 32095

See also: "Redis implements distributed locking the correct way (Java version)"

Solution expansion

As long as a synchronization mechanism is in place, breakdown is solved at the root. Beyond that, a few techniques help avoid breakdown in the first place:

  1. Never let hotspot data expire.
  2. Warm the cache: for example, load the data into Redis before a seckill event begins.
  3. Run a script that scans for keys that are about to expire but are currently receiving heavy traffic, and push their expiry back.
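The third idea can be sketched without Redis. The sketch below (all names and thresholds are invented for illustration) tracks expiry timestamps and per-key hit counts, and pushes back the expiry of keys that are both hot and close to expiring:

```java
import java.util.HashMap;
import java.util.Map;

public class HotKeyRefresher {
    static class Entry {
        long expireAt;   // expiry timestamp, millis
        int hits;        // access counter since the last scan
        Entry(long expireAt) { this.expireAt = expireAt; }
    }

    final Map<String, Entry> entries = new HashMap<>();

    void put(String key, long ttlMillis, long now) {
        entries.put(key, new Entry(now + ttlMillis));
    }

    void touch(String key) {
        Entry e = entries.get(key);
        if (e != null) e.hits++;
    }

    // Scan: any key expiring within `window` ms that received at least
    // `minHits` accesses gets its expiry pushed back by `extension` ms.
    void scan(long now, long window, int minHits, long extension) {
        for (Entry e : entries.values()) {
            if (e.expireAt - now < window && e.hits >= minHits) {
                e.expireAt += extension;
            }
            e.hits = 0; // reset counters for the next scan period
        }
    }

    long expireAt(String key) { return entries.get(key).expireAt; }
}
```

Against a real Redis the same scan would use the TTL and EXPIRE commands instead of a local map; the hit counting would have to be maintained by the application.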

Cache penetration

Normally, the data we query exists somewhere.

Cache penetration is when a request queries data that exists in neither the cache nor the database, so every such request falls through to the database.

This pattern of querying nonexistent data is what we call cache penetration.

It is usually a hacker attack: sending large numbers of requests with nonexistent IDs, each of which reaches the database, can overload the database until it goes down.

Solution 1: Cache empty values

Penetration happens because the cache holds no entry at all for the missing key, so every query goes to the database.

We can therefore set the value of such keys to null or an empty string and put them into the cache. Any subsequent request for the key then gets the empty value straight from the cache.

This keeps the requests off the database — but don't set the expiration time too long; something short like 5 minutes is appropriate.
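A minimal in-memory sketch of this null-value caching, using a Map in place of Redis (the NULL_PLACEHOLDER sentinel and the selectDB stub are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class NullCacheDemo {
    // Sentinel stored in the cache for keys that do not exist in the database
    static final String NULL_PLACEHOLDER = "";
    static final Map<String, String> cache = new HashMap<>();
    static int dbHits = 0;

    // Simulated database: only product "1001" exists
    static String selectDB(String key) {
        dbHits++;
        return "1001".equals(key) ? "goods-1001" : null;
    }

    static String getGoodsDetail(String key) {
        String data = cache.get(key);
        if (data != null) {
            // The placeholder means "known to be missing": return null without touching the DB
            return NULL_PLACEHOLDER.equals(data) ? null : data;
        }
        data = selectDB(key);
        // Cache the miss as well; in Redis this entry would get a short TTL
        cache.put(key, data == null ? NULL_PLACEHOLDER : data);
        return data;
    }

    public static void main(String[] args) {
        getGoodsDetail("no-such-id"); // first miss hits the DB
        getGoodsDetail("no-such-id"); // second miss is absorbed by the cache
        System.out.println("DB hits: " + dbHits);
    }
}
```

With RedisTemplate the placeholder write would look like `set(key, "", 5, TimeUnit.MINUTES)`, so the junk entry cleans itself up.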

Solution 2: Bloom filter

Solution 1 handles most ordinary business scenarios, but under an attack that keeps requesting different nonexistent keys, Redis memory would fill up with placeholder entries.

That is where the other solution comes in: the Bloom filter.

A Bloom filter is a clever data structure that can answer whether an element might be in a set.

For the details, see any of the many articles online.
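To sketch the idea, here is a tiny Bloom filter built on java.util.BitSet with double hashing; it is a toy for illustration, and production code would normally use an existing implementation such as Guava's BloomFilter:

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing)
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | (h1 << 16); // cheap second hash: rotate by 16 bits
        return Math.floorMod(h1 + i * h2, size);
    }

    public void put(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false => the key is definitely absent; true => the key is probably present
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

On startup all valid product IDs would be loaded into the filter; a request whose ID fails mightContain can be rejected before it touches the cache or the database. False positives are possible, false negatives are not.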

Cache avalanche

Cache avalanche is conceptually similar to cache breakdown.

Cache avalanche is when a large amount of cached data expires at the same time while query volume is high, so the load lands on the database and may even take it down. The difference from breakdown: breakdown is many concurrent queries for the same expired key, while avalanche is many different keys expiring together, so many different queries miss the cache and go to the database.

Once breakdown and penetration are handled, avalanches largely take care of themselves: even if a lot of data expires at one moment, the requests will not all hit the database directly, as long as Redis itself can take the load.

Of course, it is still best to avoid the situation: give cached entries randomized expiration times so that large numbers of keys do not expire at the same moment.
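The randomized-expiry idea is a one-liner; a minimal sketch (the base TTL and jitter range below are arbitrary illustration values):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TtlJitter {
    // Spread expirations uniformly over [baseSeconds, baseSeconds + maxJitterSeconds)
    // so that keys written together do not all expire at the same instant.
    public static long randomizedTtl(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds);
    }
}
```

With RedisTemplate this would be used as `redisTemplate.opsForValue().set(key, data, TtlJitter.randomizedTtl(600, 300), TimeUnit.SECONDS)`.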

The techniques listed under "Solution expansion" above also apply to avalanche; apply them flexibly according to the business.