Hello everyone, I'm Lao San.
In this post I'll share the classic interview trio: cache breakdown, cache penetration, and cache avalanche.
Before digging into these three problems, it helps to know some rough numbers: a single-node instance of Redis, the most common distributed cache, can handle concurrency on the order of tens of thousands of requests per second, while a typical relational database like MySQL handles on the order of thousands. That is roughly a tenfold gap, so we want to intercept as much traffic as possible at the cache layer.
Why is that? Think of water flow: if a big lake suddenly discharges a lot of water, the small river channel downstream may be washed away. You may have heard that the water of the Yangtze comes down from heaven; Baiyangdian certainly cannot hold it.
Cache breakdown
What is cache breakdown
Let’s start with cache breakdown.
Cache breakdown: a hot key that is being accessed with heavy concurrency expires at some moment, so all of those requests go straight to the DB.
Cache breakdown spikes the database load, so let's see how to alleviate it.
How to solve cache breakdown
Lock the update
On a cache miss, acquire a lock so that only one thread rebuilds the cache from the DB while the other threads wait; once the cache is repopulated, the waiting threads read from it.
Asynchronous update
Another possible solution is to set the cache to never expire. Then how does the cache get updated? Asynchronously.
For example, a daemon thread in the background refreshes the cache periodically, though the right refresh interval is hard to pin down.
The asynchronous update mechanism is actually better suited for cache warming.
Cache penetration
What is cache penetration
Cache penetration: querying data that exists in neither the cache nor the database, so every request goes straight to the database as if the cache were not there.
With cache penetration, non-existent data hits the storage layer on every request, defeating the purpose of the cache as a shield for back-end storage.
This can significantly increase the load on back-end storage. If you observe a large number of empty hits, cache penetration is probably happening.
There are two possible reasons for cache penetration:
- Own business code issues
- Malicious attacks or crawlers producing a flood of empty hits
So let’s see how we can solve this.
How to solve cache penetration
Cache null/default values
One approach: after a database miss, save an empty object or a default value into the cache. Later requests for the same data are then served from the cache, protecting the database.
There are two major problems with caching null values:

- Null values take up cache space: more keys end up stored in the cache layer, requiring more memory (which gets much worse under an attack). The effective countermeasure is to set a short expiration time on such entries so they are removed automatically.
- Data in the cache layer and the storage layer may be inconsistent for a period of time, which can affect the business. For example, if the expiration time is 5 minutes and the real data is written to the storage layer within that window, the cache layer and the storage layer will disagree until the null entry expires. A message queue or another asynchronous mechanism can be used to clean up the empty objects in the cache.
Bloom filter
Besides caching empty objects, we can also put a Bloom filter in front, filtering requests before they reach the cache and the storage layer.
The Bloom filter records whether data exists; if it determines that the data does not exist, we skip the storage layer entirely.
So what exactly is a Bloom filter? And is it slow to query?
What is a Bloom filter?
If you know a bit about hash tables, a Bloom filter is a similar idea.
It is a contiguous bit array; each slot stores a single bit, 0 or 1, used to mark whether data exists.
When storing a value, K different hash functions map it to K positions in the bit array, and those bits are set to 1.
To check whether a key exists, the same K hash functions map it to K positions, and we check whether those bits are all 1:
- If any of them is 0, the key definitely does not exist;
- If all of them are 1, the key only might exist.
Why only "might"? Because hash functions can collide.
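The mechanics above can be sketched in a few lines of pure Python. This toy version derives its K hash functions by salting MD5 with an index; production systems use purpose-built filters (for example the RedisBloom module) with properly sized bit arrays:

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, k=3):
        self.size = size          # number of bits in the array
        self.k = k                # number of hash functions
        self.bits = [0] * size

    def _positions(self, key):
        # derive k positions from k salted hashes of the key
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False -> definitely absent; True -> possibly present
        return all(self.bits[pos] for pos in self._positions(key))
```

Queries cost only K hashes and K bit reads, so lookup is fast and the memory footprint is tiny; the trade-offs are false positives and the inability to delete individual keys.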
Let's briefly compare the two main solutions for cache penetration: caching null values is simple to implement but consumes extra memory and may be briefly inconsistent with the storage layer, while a Bloom filter is memory-efficient and rejects non-existent keys up front, at the cost of occasional false positives.
Cache avalanche
Now let’s look at the most serious case, cache avalanche.
What is cache avalanche
Cache avalanche: at some moment a large part of the cache fails at once, for example the cache service goes down, or a large number of keys expire at the same time. The result is that a flood of requests hits the DB directly, which may crash the whole system. This is called an avalanche.
How can cache avalanche be resolved
Cache avalanche is the most serious of the three cache problems, so let’s take a look at how to prevent and handle it.
Improve cache availability
- Cluster deployment: To improve cache availability, use Redis Cluster or a third-party Cluster solution such as Codis.
- Multi-level cache: set up multiple cache levels. When the level-1 cache fails, requests fall through to the level-2 cache, and each level is given a different expiration time.
Expiration time
- Uniform expiration: to avoid a large number of keys expiring at the same time, add a random offset to each key's expiration time so that expirations are spread out.
- Hotspot data never expires.
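The uniform-expiration idea above is just a one-line helper: take a base TTL and add random jitter so keys cached in the same burst expire at different times (the base value and 20% jitter ratio here are arbitrary examples):

```python
import random

BASE_TTL = 3600              # base expiration time in seconds (example value)

def ttl_with_jitter(base=BASE_TTL, jitter_ratio=0.2):
    # spread expirations over [base, base * (1 + jitter_ratio)]
    return int(base + random.uniform(0, base * jitter_ratio))
```

When writing to the cache, use `ttl_with_jitter()` instead of a fixed TTL, e.g. the equivalent of `SET key value EX ttl` with a jittered `ttl`.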
Circuit breaking and degradation
- Circuit breaking: when the cache server goes down or responses time out, business services temporarily stop accessing the cache system to keep the failure from avalanching through the whole system.
- Service degradation: when a large number of cache failures occur and the business system is under high concurrency and heavy load, requests for non-core interfaces and data are temporarily rejected and a prepared fallback error message is returned.
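A minimal circuit-breaker sketch for the cache path: after N consecutive cache failures the breaker "opens" and requests skip the cache for a cool-down period, then one request is allowed through to probe it. All class and method names here are illustrative, not from any real library:

```python
import time

class CacheCircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown=30):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown       # seconds to skip the cache once open
        self.failures = 0
        self.opened_at = None          # None means the breaker is closed

    def allow_cache_access(self):
        if self.opened_at is None:
            return True
        if time.time() - self.opened_at >= self.cooldown:
            # cool-down over: close the breaker and let requests probe again
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.time()   # trip the breaker

    def record_success(self):
        self.failures = 0                  # reset the consecutive-failure count
```

The business code checks `allow_cache_access()` before touching the cache; while the breaker is open it would go to a degraded path (e.g. a fallback message) instead of hammering a dead cache.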
Conclusion
Summary in one chart: