Hi, I’m Li Ge.

Last time we discussed the architecture of caching in distributed systems, from browser cache to client cache to CDN cache to reverse-proxy cache to local cache to distributed cache. There are a lot of caches along the request path.

If an interviewer asks you, “Given the chance, is there any way you could bypass all these layers of caching and crush our slow devices?” (slow devices here generally means disk-based relational databases such as MySQL)

Is this what the legend describes: “even when outnumbered, a great general can take the head of the enemy commander from the midst of an army of a million”?

That’s our topic today: cache penetration.

What is cache penetration?

As we know, a cache works like this: fetch the data from the cache first; if it’s there, return it directly to the user; if not, read the actual data from the slow device and put it into the cache. Populating the cache synchronously looks like this:
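Here is a minimal Java sketch of that synchronous read path; the in-memory maps are stand-ins for the real cache (say, Redis) and the slow device (say, MySQL), purely for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Synchronous cache-aside read: try the cache, fall back to the slow device
// on a miss, and populate the cache before returning.
public class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for MySQL

    public String get(String key) {
        String value = cache.get(key);   // 1. try the cache first
        if (value != null) {
            return value;                // 2. hit: return directly to the user
        }
        value = db.get(key);             // 3. miss: read the slow device
        if (value != null) {
            cache.put(key, value);       // 4. put it into the cache for next time
        }
        return value;                    // 5. still null? that's the penetration case
    }
}
```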

Or the cache can be kept in sync asynchronously, like this:
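A sketch of one asynchronous variant, where a background task periodically pushes data from the slow device into the cache; the interfaces, key list, and 5-second period are all illustrative assumptions:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Asynchronous cache sync: a background task refreshes the cache on a
// schedule, so the read path never has to populate it inline.
public class AsyncCacheSync {
    interface Cache { void put(String key, String value); }
    interface Db    { String query(String key); }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public AsyncCacheSync(Cache cache, Db db, Iterable<String> hotKeys) {
        scheduler.scheduleAtFixedRate(() -> {
            for (String key : hotKeys) {
                cache.put(key, db.query(key)); // refresh outside the read path
            }
        }, 0, 5, TimeUnit.SECONDS);            // the 5s period is an arbitrary choice
    }
}
```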

But what happens if there is never any data in the cache or on the slow device?

Right: if neither the cache nor the database has the data, the request still passes through the cache and then hits the database, and the next request follows exactly the same path. This is cache penetration.

Summary: cache penetration means that every request misses the cache and falls through to a slow device that has no data either. Under concurrent access, the slow device may be overwhelmed.

What are the possible problems with cache penetration?

Now that we know what cache penetration looks like, the next thing to understand is: what are the pain points?

  1. Under high concurrency, the slow device may be overwhelmed;
  2. Every request passes through a cache layer that does nothing, so requests are relatively slow.

What are the solutions to cache penetration?

Solution to Problem 1:

High concurrency is what crushes slow devices, so there are two angles of attack: turn the high concurrency into low concurrency, or make the slow device strong enough to withstand the pressure without collapsing.

Turn high concurrency into low concurrency

How do you turn high concurrency into low concurrency?

You’re on the right track: lock it. The lock can be local or distributed. A local lock means each service instance allows only one thread to access the slow device at a time, so with N instances at most N threads get through; a distributed lock means that across the whole cluster only one thread can access the slow device at any moment. You can also play with rate-limiting algorithms such as counters, token buckets, and leaky buckets, but that’s not today’s topic, so we’ll skip it for now; there will be a dedicated article on it.

Back to business.

Local locks work well when the number of service instances is small, because a local lock is faster than a distributed one.
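A minimal local-lock sketch, again with in-memory stand-ins for the cache and DB; the single lock object and the double-check after acquiring it are the point:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local lock: within one instance, only one thread at a time may fall
// through to the slow device; waiting threads re-check the cache first.
public class LocalLockLoader {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for MySQL
    private final Object lock = new Object();

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) return value;
        synchronized (lock) {            // one thread per instance gets through
            value = cache.get(key);      // double-check: another thread may have loaded it
            if (value != null) return value;
            value = db.get(key);
            if (value != null) cache.put(key, value);
            return value;
        }
    }
}
```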

If the number of service instances is large, a distributed lock is needed to cap the traffic that reaches the slow device.
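A distributed-lock sketch using Redis SET NX EX via the Jedis client (one possible choice; the key name, token, and 30-second TTL are illustrative, and this is a simple sketch, not a hardened Redlock):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Distributed lock via Redis SET NX EX: across all instances, only the
// caller that manages to set the key gets to query the slow device.
public class RedisLock {
    public boolean tryLock(Jedis jedis, String lockKey, String token) {
        // NX: only set if absent; EX 30: auto-expire so a crashed holder can't deadlock us.
        String ok = jedis.set(lockKey, token, SetParams.setParams().nx().ex(30));
        return "OK".equals(ok);
    }

    public void unlock(Jedis jedis, String lockKey, String token) {
        // Delete only our own lock: compare-and-delete atomically via Lua.
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] "
                      + "then return redis.call('del', KEYS[1]) else return 0 end";
        jedis.eval(script, 1, lockKey, token);
    }
}
```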

However, if the data simply never exists, the invalid requests keep coming. So when the thread holding the lock reads the slow device and finds nothing, we put an agreed-upon placeholder value into the cache; when a later request reads that placeholder, it returns “no data” immediately without touching the slow device.
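Here’s a minimal sketch of that agreement; the sentinel string and the idea of giving it a short TTL are assumptions for illustration, not a fixed convention:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache an agreed-upon placeholder when the slow device has no data, so
// later misses are answered by the cache instead of penetrating it.
public class NullValueCache {
    private static final String NULL_SENTINEL = "##NULL##"; // the agreed value
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> db = new ConcurrentHashMap<>(); // stand-in for MySQL

    public String get(String key) {
        String value = cache.get(key);
        if (NULL_SENTINEL.equals(value)) return null; // known-empty: no DB access
        if (value != null) return value;
        value = db.get(key);
        cache.put(key, value != null ? value : NULL_SENTINEL);
        // In production, give the sentinel a short TTL so real data can appear later.
        return value;
    }
}
```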


Improve the high availability of slow devices

By slow devices we generally mean the DB. How can we make the DB highly available in this scenario, so that it doesn’t get crushed?

We have two ideas:

Cap the maximum number of threads or connections on the DB side, using a fixed-size pool to rate-limit the work it processes (see the sketch after the next paragraph);

Or deploy the DB in a master-slave setup, spread reads evenly across the slave databases, and scale the slaves out horizontally almost without limit.
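For the first idea, here is a sketch using HikariCP’s bounded connection pool; the pool size, timeout, and JDBC URL are illustrative choices:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// A fixed-size connection pool caps how many queries can hit the DB at
// once; excess requests wait (or fail fast) instead of piling onto MySQL.
public class BoundedDbPool {
    public static HikariDataSource create() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/demo"); // illustrative URL
        config.setUsername("app");
        config.setPassword("secret");
        config.setMaximumPoolSize(20);     // hard cap on concurrent DB work
        config.setConnectionTimeout(2000); // fail fast rather than queue forever
        return new HikariDataSource(config);
    }
}
```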

Solution to Problem 2:

Problem 2: every request passes through a cache layer that does nothing, so requests are relatively slow.

If you were paying attention, you’ll have noticed we already solved this while turning high concurrency into low concurrency: once the placeholder is cached, requests stop being cache misses and the cache layer is effective again.

Another way to think about it:

Why does this situation exist in the first place? (No data in the cache, no data on the slow device, yet the requests keep coming.)

My guesses:

  1. An attacker.
  2. The data was modified or deleted, so the original query path can no longer find it.
  3. A slow-device failure caused data loss, which in turn left the cache empty.

So if it’s an attacker, we can block the IP once its requests pass a threshold, and filter the query content up front. For example, if no record with id ≤ 0 can exist, such requests can be rejected immediately.
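A sketch of such a filter, assuming IDs are positive longs and using a hypothetical in-memory per-IP counter (the threshold and names are made up for illustration; a real setup would likely count in Redis with expiry):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Reject obviously invalid queries before they reach cache or DB, and
// block IPs whose invalid-request count crosses a threshold.
public class PenetrationFilter {
    private static final int BLOCK_THRESHOLD = 100; // illustrative threshold
    private final Map<String, AtomicInteger> invalidHits = new ConcurrentHashMap<>();

    public boolean allow(String clientIp, long id) {
        AtomicInteger count =
                invalidHits.computeIfAbsent(clientIp, ip -> new AtomicInteger());
        if (count.get() >= BLOCK_THRESHOLD) return false; // IP already blocked
        if (id <= 0) {                                    // id <= 0 can never exist
            count.incrementAndGet();
            return false;
        }
        return true;
    }
}
```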

If data updates break the original query path and the resulting high concurrency knocks the slow device over, the remedy is to minimize how often hotspot data is updated and to refresh hotspot data asynchronously (as in the async sync sketch earlier), so that hotspot data never expires from the cache.

If it’s a slow-device failure, there’s not much more a programmer can do at the code level; this is where architectures such as same-city active-active and multi-region active-active come in to guarantee high availability, like Alibaba’s “two locations, three data centers” and “three locations, five data centers” setups. Don’t worry, these will be introduced in detail later; stay tuned.

Conclusion

Cache penetration means that neither the cache nor the slow device has the data, and highly concurrent requests can bring the slow device down.

Cache penetration solutions:

  1. If it’s an attacker, block the IP
  2. Filter data in advance to reduce the number of accesses to the slow device
  3. On a cache miss, acquire a lock before accessing the slow device to reduce concurrent access
  4. Design a high-availability architecture

You can follow my WeChat official account: Li Ge technology

Well, that’s all for this issue. Thanks for reading.

Since you’ve read this far, your follow + like is the biggest support for me.