Caching is familiar to all of us. Whether in front-end or server-side development, it is one of the essential optimization techniques. In real production environments, guidelines for cache usage are taken very seriously: if a cache is used improperly, it is easy to run into serious failure scenarios such as cache breakdown and cache avalanche, which can have disastrous consequences for the system.

To avoid the damage caused by cache misuse, it is important to understand the cause of each of these failure scenarios and how to resolve it, so that we can take better preventive measures.

Cache penetration

Cache penetration occurs when the requested data exists in neither the cache nor the database, so every request misses the cache and goes on to query the database. If a large number of such requests arrive at the same time, they put enormous query pressure on the database and can even bring it down.


For example, suppose someone queries the goods with ID -1. Such an ID certainly does not exist in the goods table. Without special handling, an attacker can easily crash the system with requests like this. How can we avoid this situation?

In general, there are two common solutions for cache penetration:

Cache empty objects

If neither the cache nor the database has data for the requested key, the empty result can itself be written to the cache. The next time the key is requested, the empty result is served directly from the cache without going to the DB. Of course, to avoid storing too many empty objects, it is common to give them a short expiration time, such as 30 seconds:

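// Store an empty-string placeholder for 30 seconds; null is not a reliable value to store through RedisTemplate
redisTemplate.opsForValue().set(key, "", 30, TimeUnit.SECONDS);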
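To make the read path around that one-liner more concrete, here is a minimal sketch that uses an in-memory map as a stand-in for Redis. The class name `EmptyObjectCache`, the `database` function, and the two TTL parameters are hypothetical names made up for this illustration; a real service would go through its Redis client instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch only: an in-memory map stands in for Redis. */
class EmptyObjectCache {
    private static final class Entry {
        final String value;          // null means "the database has no such record"
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database; // hypothetical lookup, returns null if absent
    private final long emptyTtlMillis;               // short TTL for empty results
    private final long valueTtlMillis;               // normal TTL for real values

    EmptyObjectCache(Function<String, String> database, long emptyTtlMillis, long valueTtlMillis) {
        this.database = database;
        this.emptyTtlMillis = emptyTtlMillis;
        this.valueTtlMillis = valueTtlMillis;
    }

    /** Returns the value, or null when the key does not exist. */
    String get(String key) {
        long now = System.currentTimeMillis();
        Entry hit = cache.get(key);
        if (hit != null && hit.expiresAtMillis > now) {
            // Cache hit, possibly a cached "empty" result: the database is not touched
            return hit.value;
        }
        String loaded = database.apply(key);
        long ttl = (loaded == null) ? emptyTtlMillis : valueTtlMillis;
        cache.put(key, new Entry(loaded, now + ttl));
        return loaded;
    }
}
```

With a 30-second empty TTL, repeated requests for the goods with ID -1 reach the database at most once every 30 seconds instead of on every call.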


There are two problems with this approach:

  • If an attacker iterates over a large number of distinct keys, the cached empty objects can take up valuable memory.
  • The empty object stays cached until it expires. If the database gains data for that key in the meantime, the cache and the database are inconsistent for that period.

To address these problems, we can use a better solution: the Bloom filter.

Bloom filter

The Bloom filter was proposed by Burton Howard Bloom in 1970. It is a probabilistic data structure composed of a long bit vector and a series of hash functions. It is very space-efficient and is used to test whether a specific element is in a set.

Design idea

A Bloom filter consists of a bit array of m bits and k hash functions. When an element is added to the set, the k hash functions map it to k positions in the bit array, and those bits are set to 1. To look an element up, we check whether all of its k positions are 1. If any of them is 0, the element is definitely not in the set. If all of them are 1, the element is very likely in the set.

Why is "all 1s" only "very likely"? Because different elements can hash to the same positions. Such hash collisions mean that every bit for an element that was never added may already have been set to 1 by other elements.

For example, take a Bloom filter with 18 bits and 3 hash functions. When an element W is queried, the three hash functions give three positions. If any of those bits is 0, we can be certain that W is not in the set.
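The design above can be sketched in a few lines of Java. This is a minimal illustration, not a production implementation: the class name `SimpleBloomFilter` is made up, and deriving the k positions from two FNV-1a-style hashes is just one simple choice assumed here. Real projects usually rely on an existing library.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Minimal Bloom filter sketch: m bits, k hash positions per element. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of hash functions

    SimpleBloomFilter(int m, int k) {
        if (m <= 0 || k <= 0) {
            throw new IllegalArgumentException("m and k must be positive");
        }
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    /** Sets the k bits that the element maps to. */
    void add(String element) {
        for (int i = 0; i < k; i++) {
            bits.set(index(element, i));
        }
    }

    /** false: definitely absent. true: probably present (false positives are possible). */
    boolean mightContain(String element) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(element, i))) {
                return false;
            }
        }
        return true;
    }

    // The i-th position is h1 + i * h2, reduced into the range [0, m)
    private int index(String element, int i) {
        byte[] data = element.getBytes(StandardCharsets.UTF_8);
        int h1 = hash(data, 0x811C9DC5);
        int h2 = hash(data, 0x01000193) | 1; // keep the step odd so it is never zero
        return Math.floorMod(h1 + i * h2, m);
    }

    // FNV-1a-style hash with a configurable starting value (an illustrative choice)
    private static int hash(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }
}
```

For cache penetration, one would add every valid key (for example, all goods IDs) to such a filter and reject a request early whenever `mightContain` returns false, since that answer is never wrong.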


Advantages and disadvantages

Advantages:

  • Space saving: the data itself is not stored, only the bits that its hashes map to.
  • Low time complexity: elements are located by hashing, so insertion and lookup are both O(k), where k is the number of hash functions.

Disadvantages:

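  • False positives: when the filter says an element exists, the element may in fact not be in the set. The accuracy depends on the number of hash functions, as well as on the size of the bit array and the number of elements stored.
  • Cannot delete elements: bits are shared between elements, so an element that is deleted from the data set cannot be removed from the filter. Its bits stay set to 1, which further increases the chance that the filter reports elements that do not exist.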
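How much the false positive rate depends on these parameters can be estimated with the standard approximation p ≈ (1 - e^(-kn/m))^k, where m is the number of bits, n the number of inserted elements, and k the number of hash functions. The helper below is only a sketch of that formula; the class and method names are made up for this example.

```java
/** Standard approximations for Bloom filter sizing. */
class BloomFilterMath {
    /** Approximate false positive probability: (1 - e^(-k*n/m))^k. */
    static double falsePositiveRate(long m, long n, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }

    /** Hash count that minimizes the estimate above: (m / n) * ln 2, at least 1. */
    static int optimalHashCount(long m, long n) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }
}
```

The estimate makes the trade-off visible: for a fixed number of elements, a larger bit array lowers the error rate, while too many or too few hash functions both make it worse.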

Applicable scenarios

  • Crawler URL deduplication
  • Spam filtering
  • Blacklists

Cache breakdown

Cache breakdown is easily confused with cache penetration because the names are so similar, and many interviewers like to exploit that. If you understand what each term means, though, you won't be fooled.

Put simply, cache breakdown involves a single very hot key that is continuously serving heavy concurrent traffic. At the moment the key expires, that sustained traffic breaks through the cache and goes straight to the database, like floodwater surging through a sudden breach in a dam.

When a cache breakdown occurs, the query pressure on the database multiplies, and a large number of requests end up blocked.

One way to deal with this is to refresh hot data in the background: when the data needs to be updated, an asynchronous thread rebuilds the cache entry once the key has expired, so the request threads never load it from the database themselves.

Of course, this solution is only suitable when strict data consistency is not required, because while the background thread is rebuilding the cache, other threads are probably still reading and will get the old data.

If you need strict data consistency, you can use a mutex.
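The background-refresh idea can be sketched as follows. Everything here is an assumption made for illustration: the class name `BackgroundRefreshCache`, the in-memory map standing in for Redis, the injected `Executor` and clock, and the rule that a stale value is returned while the refresh runs.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Sketch: hot entries are never evicted; stale ones are rebuilt by a background task. */
class BackgroundRefreshCache {
    private static final class Entry {
        final String value;
        final long staleAtMillis; // logical expiry time
        Entry(String value, long staleAtMillis) {
            this.value = value;
            this.staleAtMillis = staleAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Set<String> refreshing = ConcurrentHashMap.newKeySet();
    private final Function<String, String> database; // hypothetical loader
    private final Executor refresher;                // e.g. a thread pool
    private final LongSupplier clock;                // e.g. System::currentTimeMillis
    private final long ttlMillis;

    BackgroundRefreshCache(Function<String, String> database, Executor refresher,
                           LongSupplier clock, long ttlMillis) {
        this.database = database;
        this.refresher = refresher;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    void put(String key, String value) {
        entries.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value at once, even if stale, and triggers at most one refresh per key. */
    String get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null; // not a preloaded hot key
        }
        if (entry.staleAtMillis <= clock.getAsLong() && refreshing.add(key)) {
            refresher.execute(() -> {
                try {
                    put(key, database.apply(key));
                } finally {
                    refreshing.remove(key);
                }
            });
        }
        return entry.value; // may be the old value while the refresh is running
    }
}
```

The last line of `get` is exactly the consistency trade-off described above: readers are never blocked, but they may see old data until the refresh finishes.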

Mutex

With a mutex, when a key expires, only one thread reads the data and rebuilds the cache, while the other threads wait until the cache has been rebuilt and then read from it again.

On a stand-alone system, the JDK's built-in synchronization tools, synchronized or ReentrantLock, are enough. Generally speaking, though, a system with enough traffic to need protection against cache breakdown is hardly going to be stand-alone; it is almost certainly distributed. In that case we can use a distributed lock to achieve mutual exclusion.

To make the process easier to follow, I have prepared some pseudo code for you, as usual:

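public String getData(String key) {
    String data = redisTemplate.opsForValue().get(key);
    if (StringUtils.isNotEmpty(data)) {
        return data;
    }
    String lockKey = this.getClass().getName() + ":" + key;
    RLock lock = redissonClient.getLock(lockKey);
    try {
        // Wait up to 5 seconds for the lock; it is released automatically after 5 seconds
        boolean locked = lock.tryLock(5, 5, TimeUnit.SECONDS);
        if (!locked) {
            // Sleep for a while, then request again
            Thread.sleep(200L);
            return getData(key);
        }
        // Check the cache again: another thread may have rebuilt it while we waited
        data = redisTemplate.opsForValue().get(key);
        if (StringUtils.isNotEmpty(data)) {
            return data;
        }
        // Read data from the database
        data = getDataByDB(key);
        if (StringUtils.isNotEmpty(data)) {
            // Build the data into the cache
            setDataToRedis(key, data);
        }
    } catch (InterruptedException e) {
        // Restore the interrupt flag; add logging or rethrow as needed
        Thread.currentThread().interrupt();
    } finally {
        // Only the thread that holds the lock may release it
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }
    return data;
}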
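For the stand-alone case mentioned earlier, the same check, lock, check-again pattern can be sketched with ReentrantLock alone. The class `LocalMutexCache` and its in-memory map are assumptions made so that the example is self-contained; they are not part of the pseudo code above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Single-JVM sketch: one lock per key, so only one thread rebuilds a given entry. */
class LocalMutexCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Locks are never removed in this sketch; a real service would bound this map
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Function<String, String> database; // hypothetical loader

    LocalMutexCache(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock(); // other threads for the same key wait here
        try {
            // Second check: the thread that held the lock before us may have filled the cache
            value = cache.get(key);
            if (value == null) {
                value = database.apply(key);
                if (value != null) {
                    cache.put(key, value);
                }
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    /** Simulates expiry of a key. */
    void invalidate(String key) {
        cache.remove(key);
    }
}
```

However many threads ask for the same expired key at once, the loader runs once and the rest read the rebuilt entry after the lock is released.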


Of course, the mutex approach also has its drawbacks. When the cache entry expires, only one thread at a time reads the database and writes back to the cache, and all the other threads are blocked. In a high-concurrency scenario, a large number of blocked threads inevitably reduces throughput. How do we deal with that? All I can say is that no design is perfect. You cannot have both strict data consistency and maximum throughput. To make the system more robust, sacrificing some performance when necessary is an acceptable measure. How to choose between the two depends on the actual business scenario; there is no universal technical solution.

Cache avalanche

Cache avalanche is also a failure scenario in which a large number of requests hit the database after keys expire. The difference from cache breakdown is that breakdown is caused by the expiry of a single hot key, whereas an avalanche means that a large amount of cached data expires at the same time, so a huge number of requests go directly to the DB layer, overloading it or even bringing it down. This matches the literal meaning of "avalanche".


The solution

The solutions for a cache avalanche are the same as for cache breakdown: keep hot keys from expiring, or use a mutex.

In addition, to prevent a large number of keys from expiring at the same time, you can add a random value to each key's expiration time so that expirations are spread out as evenly as possible and large amounts of data do not expire at the same moment.

redisTemplate.opsForValue().set(key, value, time + (long) (Math.random() * 1000), TimeUnit.SECONDS);
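The random offset can also be pulled out into a small helper so that every cache write uses the same rule. This is a sketch; the class name `TtlJitter` and the idea of passing the maximum jitter as a parameter are assumptions, not something the snippet above prescribes.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Spreads expirations by adding a random number of seconds to a base TTL. */
class TtlJitter {
    /** Returns a TTL in the range [baseSeconds, baseSeconds + maxJitterSeconds). */
    static long withJitter(long baseSeconds, long maxJitterSeconds) {
        if (maxJitterSeconds <= 0) {
            return baseSeconds; // no jitter requested
        }
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds);
    }
}
```

For example, `TtlJitter.withJitter(3600, 300)` gives keys written in the same batch lifetimes between one hour and one hour plus five minutes, so they do not all expire in the same second.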


You can also combine the mutex with a primary/backup cache strategy to make it more reliable:

Primary cache: its validity period is set from experience. Reads go to the primary cache by default, and the latest value is loaded from the database after the primary cache expires.

Backup cache: the cache that is read when the lock cannot be acquired. Whenever the primary cache is updated, the backup cache must be updated along with it.
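Here is one way the primary/backup idea could look in code. It is a simplified sketch under several assumptions: two in-memory maps stand in for the two caches, a per-key "rebuilding" flag plays the role of the lock, and `expirePrimary` simulates the primary entry timing out. None of these names come from a real library.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch: readers that cannot take the rebuild lock fall back to the backup copy. */
class PrimaryBackupCache {
    private final Map<String, String> primary = new ConcurrentHashMap<>();
    private final Map<String, String> backup = new ConcurrentHashMap<>();
    private final Set<String> rebuilding = ConcurrentHashMap.newKeySet(); // acts as a try-lock per key
    private final Function<String, String> database; // hypothetical loader

    PrimaryBackupCache(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String value = primary.get(key);
        if (value != null) {
            return value;
        }
        if (rebuilding.add(key)) { // lock acquired: this caller rebuilds the entry
            try {
                value = database.apply(key);
                if (value != null) {
                    primary.put(key, value);
                    backup.put(key, value); // keep the backup in sync with the primary
                }
                return value;
            } finally {
                rebuilding.remove(key);
            }
        }
        // Lock not acquired: serve the (possibly older) backup value instead of waiting
        return backup.get(key);
    }

    /** Simulates expiry of the primary entry only. */
    void expirePrimary(String key) {
        primary.remove(key);
    }
}
```

Compared with the plain mutex, callers that lose the race are not blocked; the price is that they may briefly read the previous value from the backup.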

Generally, the three cache failure scenarios above are the ones asked about most, and knowing the basics is enough. Some interviewers, however, like to go further and extend their questions to other scenarios, so we will also cover two more common cache topics, just in case.

Cache warming

Cache warming means loading the relevant data into the cache after the system goes online, so that user requests do not have to query the database directly.

Whether to warm the cache depends mainly on traffic volume and data volume. If traffic is low, there is no need for warming: with few requests, the normal cache read path is sufficient.

If traffic is high, the warming approach depends on the amount of data:

  • When the amount of data is small, load it into the cache when the project starts. Typical examples are things like the promotional slots on an e-commerce home page.
  • When the amount of data is large, set up a scheduled task script to refresh the cache.
  • When the amount of data is too large to load completely, give priority to loading hotspot data into the cache in advance, and do not allow the cached data to change while it is being served. For example, use a timer to load the product information into the cache 30 minutes before a sales event, and do not let back-office operators change product attributes during the event.
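The first two cases can be sketched together: load a list of hot keys once at startup, and optionally repeat the load on a schedule. The class `CacheWarmer`, the in-memory map used as the cache, and the `database` function are hypothetical names chosen for this example.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Sketch of cache warming: load known hot keys before users ask for them. */
class CacheWarmer {
    private final Map<String, String> cache;         // stand-in for the real cache
    private final Function<String, String> database; // hypothetical loader, null if absent

    CacheWarmer(Map<String, String> cache, Function<String, String> database) {
        this.cache = cache;
        this.database = database;
    }

    /** Loads the given keys into the cache, e.g. at startup. Returns how many were loaded. */
    int warmUp(Collection<String> hotKeys) {
        int loaded = 0;
        for (String key : hotKeys) {
            String value = database.apply(key);
            if (value != null) {
                cache.put(key, value);
                loaded++;
            }
        }
        return loaded;
    }

    /** Repeats the warm-up at a fixed interval, for larger data sets. */
    void scheduleRefresh(ScheduledExecutorService scheduler, Collection<String> hotKeys, long periodMinutes) {
        scheduler.scheduleAtFixedRate(() -> warmUp(hotKeys), periodMinutes, periodMinutes, TimeUnit.MINUTES);
    }
}
```

For the sales-event case, the same `warmUp` call could simply be triggered by a timer 30 minutes before the event starts.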

Cache degradation

Cache degradation means that when the cache becomes invalid or the cache server goes down, we do not fall back to the database. Instead, we return default data or read data held in the service's own memory.

In real projects, it is common to keep some hot data in the service's own memory, using tools such as HashMap or Guava. Once the cache fails, the service can use this in-memory data directly and avoid putting huge pressure on the database.
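A degradation path of this kind might look like the sketch below. The `remoteCache` function is a hypothetical stand-in for a Redis lookup that throws a RuntimeException when the cache server is unavailable, and `DegradingCacheReader` is a name invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch: when the remote cache fails, answer from local memory or a default value. */
class DegradingCacheReader {
    private final Function<String, String> remoteCache; // hypothetical; throws when the server is down
    private final Map<String, String> localHotData = new ConcurrentHashMap<>();
    private final String defaultValue;

    DegradingCacheReader(Function<String, String> remoteCache, String defaultValue) {
        this.remoteCache = remoteCache;
        this.defaultValue = defaultValue;
    }

    /** Keeps a copy of hot data in the service's own memory. */
    void putLocal(String key, String value) {
        localHotData.put(key, value);
    }

    String get(String key) {
        try {
            return remoteCache.apply(key);
        } catch (RuntimeException e) {
            // Degrade: do not touch the database, use in-memory data or the default
            return localHotData.getOrDefault(key, defaultValue);
        }
    }
}
```

Note that each service instance has its own local copy, which is where the inconsistency risk in a distributed deployment comes from.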

Of course, this approach hurts the business, and data inconsistency easily arises in a distributed system. So in general we prefer to address the problem at the operations level by keeping the cache servers highly available, for example by deploying Redis as a cluster and maintaining backups. In short, try to avoid degradation.

Final thoughts

That is all we have to say about handling cache failure scenarios. Although we have given a solution for each one, that does not mean the solutions can be applied as-is. In real development, cache measures have to be chosen according to the actual situation. For example, although a Bloom filter is very effective at preventing cache penetration, it is not commonly used for that. These days, malicious attacks are mostly handled by rate limiting at the operations level, while at the business-code level the focus is more on validating parameters and data.
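As a small illustration of the parameter check mentioned above, a request with an obviously invalid ID can be rejected before it reaches the cache or the database. The rule that valid goods IDs are positive is an assumption made for this sketch, as are the names `GoodsLookupGuard` and `cacheThenDb`.

```java
import java.util.Optional;
import java.util.function.LongFunction;

/** Sketch: validate the ID first, so invalid requests never reach the cache or the DB. */
class GoodsLookupGuard {
    /** Assumption for this sketch: valid goods IDs are positive numbers. */
    static boolean isValidId(long id) {
        return id > 0;
    }

    /** Runs the lookup only for valid IDs. cacheThenDb is a hypothetical lookup function. */
    static Optional<String> find(long id, LongFunction<String> cacheThenDb) {
        if (!isValidId(id)) {
            return Optional.empty(); // rejected without any cache or database access
        }
        return Optional.ofNullable(cacheThenDb.apply(id));
    }
}
```

A query for the goods with ID -1 is then answered immediately, with no need for cached empty objects or a Bloom filter.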

If you have to consider this complexity in every place that you use caching, you’re going to have to do a lot more work. Overdesigning your code just makes it harder to maintain, and it’s not necessarily practical. As for programmers, the less they have to worry about, the better. After all, our biggest enemy is not 996, but our precious, thinning hair.



For more articles, follow my WeChat official account (search for I Xue on WeChat). Reply [e-book] there to get learning materials. See you next time!