This article is shared in the Huawei Cloud community: “[High Concurrency] What is cache penetration? Breakdown? Avalanche? How to solve it?”, author: Ice River.

When it comes to Redis, most scenarios use it as the system's cache. And when it comes to caching, especially a distributed cache in a real high-concurrency scenario, a slight misstep can cause cache penetration, cache breakdown, and cache avalanche problems. So what is cache penetration? What are cache breakdown and cache avalanche? How are they caused, and how do we solve them? Today we will look at these questions.

Cache penetration

First, let's talk about cache penetration. What is it? The cache penetration problem is partly related to the cache hit ratio: if the cache is poorly designed and the hit ratio is very low, most of the data-access burden falls on the back-end database.

What is cache penetration?

If a request finds no match at either the cache layer or the database layer, that is, neither layer produces a hit, the situation is called cache penetration.

We can use the following figure to illustrate the phenomenon of cache penetration.

The main cause of cache penetration is this: a Key is queried, there is no corresponding data in the Redis cache, so the query goes directly to the database; the database finds nothing either and returns null, and Redis does not cache that null result. As a result, every subsequent query for that Key goes straight to the database. Because Redis never caches the empty result, this creates the cache penetration problem.
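As a concrete illustration, here is a minimal sketch of the naive read-through pattern described above, using the redis-py client. The query_db helper is a hypothetical database lookup that returns None for missing rows; because that None is never cached, every request for a missing Key falls through to the database.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def query_db(key):
    # Hypothetical database lookup; returns None when no row matches.
    return None

def naive_get(key):
    value = r.get(key)                 # 1. check the cache first
    if value is not None:
        return value                   # cache hit
    value = query_db(key)              # 2. cache miss: query the database
    if value is not None:
        r.set(key, value, ex=3600)     # cache only non-empty results
    # 3. When the database also returns None, nothing is cached, so the
    #    next request for this Key hits the database again: penetration.
    return value
```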

How to solve the cache penetration problem?

We now know the main cause of cache penetration: there is no corresponding data in the cache, the query goes directly to the database, the database returns an empty result, and the cache never stores that empty result.

The first solution follows naturally: cache empty objects. When the first query to the database returns an empty result, we load an empty object into the cache and set a reasonable expiration time, which protects the back-end database to some extent.
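A minimal sketch of this approach with redis-py follows. The NULL_PLACEHOLDER sentinel, the query_db helper, and the TTL values are illustrative assumptions; the short expiration on the placeholder keeps a key that later appears in the database from staying invisible for long.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

NULL_PLACEHOLDER = "__NULL__"   # sentinel meaning "not in the database"

def get_with_null_caching(key, query_db, ttl=3600, null_ttl=60):
    value = r.get(key)
    if value == NULL_PLACEHOLDER:
        return None                          # known-missing Key: skip the DB
    if value is not None:
        return value                         # normal cache hit
    value = query_db(key)
    if value is None:
        # Cache the empty result with a short expiration time.
        r.set(key, NULL_PLACEHOLDER, ex=null_ttl)
        return None
    r.set(key, value, ex=ttl)                # cache the real result
    return value
```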

The second solution to cache penetration is the Bloom filter, which is well suited to large sets of regularly shaped Key values. Whether a record exists is essentially a Boolean that needs only one bit of storage, and a Bloom filter compresses these yes/no answers into a compact data structure. For example, a familiar piece of data such as a user's gender is ideal for this kind of processing.
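To make the idea concrete, here is a toy Bloom filter in pure Python; the bit-array size and hash count are arbitrary illustrative choices, and a production setup would more likely use Redis bitmaps or a dedicated module such as RedisBloom. Keys the filter reports as absent can be rejected before the cache or the database is ever touched.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: num_hashes hash functions over a bit array."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k independent positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("user:1001")
print(bf.might_contain("user:1001"))   # True
print(bf.might_contain("user:9999"))   # almost certainly False: skip the DB
```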

Cache breakdown

If we set the same expiration time for most of the data in the cache, at some point the data in the cache will expire in bulk.

What is cache breakdown?

If the data in the cache expires in bulk at some point, the majority of user requests will land directly on the database. This phenomenon is called cache breakdown.

We can use the following figure to illustrate the cache breakdown scenario.

The main cause of cache breakdown is that we set expiration times on cached data. If a large amount of data is loaded from the database at once and given the same expiration time, all of it becomes invalid simultaneously, causing cache breakdown.

How to solve the cache breakdown problem?

For hot data, we can set the cached entries to never expire. We can also renew an entry's expiration time whenever it is accessed. And if cache items are stored in batches, we can assign them different, staggered expiration times so they do not all become invalid at the same moment, as sketched below.
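A minimal sketch of the last two ideas with redis-py; the base TTL and the jitter window are arbitrary illustrative values.

```python
import random
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

BASE_TTL = 3600   # one hour
JITTER = 600      # spread expirations over a 10-minute window

def cache_batch(items):
    """Store a batch of entries with staggered TTLs so they never expire together."""
    for key, value in items.items():
        r.set(key, value, ex=BASE_TTL + random.randint(0, JITTER))

def get_and_refresh(key):
    """Sliding expiration: renew the TTL of a hot key on every access."""
    value = r.get(key)
    if value is not None:
        r.expire(key, BASE_TTL)
    return value
```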

Another solution is to use a distributed lock so that, for each Key, only one thread at a time queries the back-end service. While that thread runs the query, other threads cannot acquire the lock and must wait. In high-concurrency scenarios, however, this approach puts heavy pressure on the distributed lock itself.
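Here is a minimal sketch of that per-Key mutex built on Redis's atomic SET with NX and EX. The lock TTL, retry interval, and query_db helper are illustrative assumptions; a production version would release the lock with a Lua script so the check-and-delete is atomic.

```python
import time
import uuid
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_with_mutex(key, query_db, ttl=3600, lock_ttl=10):
    while True:
        value = r.get(key)
        if value is not None:
            return value                       # someone rebuilt the cache
        token = str(uuid.uuid4())
        # SET NX EX is atomic: only one caller per Key wins the lock.
        if r.set(f"lock:{key}", token, nx=True, ex=lock_ttl):
            try:
                value = query_db(key)
                if value is not None:
                    r.set(key, value, ex=ttl)
                return value
            finally:
                # Delete only our own lock (a Lua script would make this
                # check-and-delete step atomic in production).
                if r.get(f"lock:{key}") == token:
                    r.delete(f"lock:{key}")
        time.sleep(0.05)   # lock held elsewhere: wait, then re-check the cache
```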

Cache avalanche

If the caching system fails, all the concurrent traffic goes directly to the database.

What is cache avalanche?

If at some point a large portion of the cached data expires at once, or the cache system itself fails, all concurrent traffic goes directly to the database. Call volume on the data storage layer surges, and before long the database is overwhelmed by the heavy traffic. This cascading failure of services is called a cache avalanche.

The phenomenon of cache avalanche can be illustrated in the following figure.

The main cause of a cache avalanche is the mass expiration of cached data or a failure of the cache service, which allows a sudden surge of concurrent traffic to overwhelm the database.

How to solve the cache avalanche problem?

The most common solution to the cache avalanche problem is to ensure the high availability of Redis. Deploying Redis as a highly available cluster (with a multi-site active-active deployment if necessary) can effectively prevent cache avalanches.
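For example, clients can connect through Redis Sentinel so that reads and writes survive a master failover. A minimal redis-py sketch follows; the Sentinel host names and the "mymaster" group name are assumptions to be adapted to your own deployment.

```python
from redis.sentinel import Sentinel

# Assumed Sentinel endpoints and master group name; adjust to your deployment.
sentinel = Sentinel(
    [("sentinel-1", 26379), ("sentinel-2", 26379), ("sentinel-3", 26379)],
    socket_timeout=0.5,
)

master = sentinel.master_for("mymaster", socket_timeout=0.5)   # writes
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)   # reads

master.set("hot:key", "value")
print(replica.get("hot:key"))
```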

To mitigate large bursts of concurrent traffic, we can also use rate limiting and service degradation to prevent a cache avalanche. For example, after a cache failure, the number of threads that read the database and rewrite the cache can be controlled with locks or queues: for some Keys, only one thread is allowed to query the data and write it back to the cache while other threads wait. This effectively cushions the database against the impact of heavy concurrent traffic, as in the sketch below.
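A minimal in-process sketch of that limiting idea, using a bounded semaphore; the slot count and timeout are illustrative, and a request that cannot obtain a slot is degraded to a fallback response instead of reaching the database.

```python
import threading
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Allow at most 10 concurrent cache-rebuild queries against the database.
db_slots = threading.BoundedSemaphore(value=10)

def get_with_limiting(key, query_db, ttl=3600, timeout=2.0):
    value = r.get(key)
    if value is not None:
        return value
    if not db_slots.acquire(timeout=timeout):
        return None          # degrade: serve a fallback, protect the database
    try:
        value = query_db(key)
        if value is not None:
            r.set(key, value, ex=ttl)
        return value
    finally:
        db_slots.release()
```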

In addition, we can use data preheating: before a large burst of concurrent access is expected, we manually trigger the loading of the relevant data into the cache ahead of time, and we set different expiration times on the entries so that cache invalidation is spread out as evenly as possible rather than happening all at once.
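A minimal warm-up sketch along those lines; load_hot_data_from_db is a hypothetical bulk query for known hot data, and the TTL and jitter values are illustrative.

```python
import random
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def load_hot_data_from_db():
    # Hypothetical bulk query returning {key: value} for known hot data.
    return {"product:1": "...", "product:2": "..."}

def preheat(base_ttl=3600, jitter=1800):
    """Warm the cache before a traffic peak, spreading out expirations."""
    for key, value in load_hot_data_from_db().items():
        r.set(key, value, ex=base_ttl + random.randint(0, jitter))

preheat()
```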
