Author: Ma Gongwei, software development engineer at Qingyun Technology.
Currently working on the Qingyun database management platform; previously engaged in server-side development.
In high-concurrency scenarios, a cache is often used to relieve database pressure, which greatly improves user experience and system stability. Redis is well suited for caching because of its rich feature set. A typical cache query flow checks the cache first and falls back to the database on a miss.
While caching brings significant performance gains over querying the database directly, it also introduces some problems of its own.
This article introduces and distinguishes three common cache problems: cache breakdown, cache penetration, and cache avalanche, and presents solutions for each.
| Cache breakdown
A query request arrives for data that exists in the database but not in the cache layer. In a high-concurrency scenario, a hot key suddenly expires and all requests for it fall through to the database, putting it under heavy pressure.
Solutions
Never expire
There are two approaches:
- Set no expiration time on the Redis key of the cached data, and refresh the cache whenever the database is written.
- Run an asynchronous task that refreshes the cache shortly before the Redis key expires.
Use a mutex
A mutex is widely used in practice. In simple terms, only one thread is allowed to rebuild the cache; the remaining threads wait until the rebuild completes and then read the data from the cache.
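The mutex approach can be sketched as follows. This is a minimal single-process simulation: a `dict` and a `threading.Lock` stand in for Redis and a distributed lock, and `load_from_db` is a hypothetical slow database query. In production the lock would typically be a Redis `SET key value NX EX` lock instead.

```python
import threading
import time

cache = {}                       # stands in for Redis
cache_lock = threading.Lock()    # stands in for a distributed lock

def load_from_db(key):
    # hypothetical slow database query
    time.sleep(0.01)
    return f"value-for-{key}"

def get_with_mutex(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread rebuilds the cache; the rest block here,
    # then find the freshly written value on the double-check read.
    with cache_lock:
        value = cache.get(key)   # double-check after acquiring the lock
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value
```

The double-check inside the lock is what prevents every waiting thread from querying the database once it acquires the lock in turn.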
| Cache penetration
A query request arrives for data that exists in neither the cache layer nor the database. Because the storage layer finds nothing, nothing is written back to the cache, so every request for the non-existent data must hit the storage layer. In high-concurrency scenarios this overloads the database.
Solutions
Cache null values
For data that is absent from both the cache and the database, a null value can be cached with a short expiration time.
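A minimal sketch of null-value caching, again simulating Redis with an in-process `dict` of `(value, expires_at)` pairs; the sentinel string, TTLs, and `db_query` stub are illustrative assumptions, not part of the original article.

```python
import time

cache = {}                 # key -> (value, expires_at); stands in for Redis

NULL_SENTINEL = "__NULL__"  # marks "this key is absent in the database"

def db_query(key):
    # hypothetical database lookup that finds nothing
    return None

def get(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        value = entry[0]
        return None if value == NULL_SENTINEL else value
    value = db_query(key)
    if value is None:
        # cache the miss with a short TTL so repeated lookups skip the DB
        cache[key] = (NULL_SENTINEL, time.time() + 60)
    else:
        cache[key] = (value, time.time() + 3600)
    return value
```

The short TTL on the sentinel bounds how long a key stays "known absent" in case the data is later inserted into the database.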
Interface-layer validation
For obviously invalid requests such as ID = -1, add parameter and validity checks at the interface layer, and return directly if the checks fail.
Rate limiting by IP address
Limit the request rate of each individual IP address, for example to 10 requests per 2 seconds. Cryptocurrency exchanges commonly use this approach.
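The per-IP limit above can be sketched with a fixed-window counter. This is an in-process illustration (the window length, limit, and `allow` function are assumptions for the example); a production version would usually keep the counters in Redis with `INCR` and `EXPIRE`.

```python
import time
from collections import defaultdict

WINDOW = 2.0   # window length in seconds
LIMIT = 10     # max requests per IP per window

windows = defaultdict(lambda: [0.0, 0])  # ip -> [window_start, count]

def allow(ip, now=None):
    # Return True if this request from `ip` is within the rate limit.
    now = time.time() if now is None else now
    start, count = windows[ip]
    if now - start >= WINDOW:
        windows[ip] = [now, 1]          # start a fresh window
        return True
    if count < LIMIT:
        windows[ip][1] = count + 1      # still under the limit
        return True
    return False                        # limit exceeded in this window
```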
Bloom filter
A Bloom filter maps each stored element to K bit positions using K hash functions. To query a value, compute its K hash values and check the corresponding K bits: if any bit is 0, the element is definitely absent and the request can be rejected without touching the database; if all K bits are 1, the element may be present (false positives are possible).
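The check just described can be sketched as a small Bloom filter. The bit-array size, K = 3, and the salted SHA-256 hashing scheme are illustrative choices, not prescribed by the article.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, k=3):
        self.size = size
        self.k = k
        self.bits = bytearray(size)   # the bit array (one byte per bit, for clarity)

    def _positions(self, item):
        # derive K hash values by salting the item with the hash index
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # any bit clear -> definitely absent; all K set -> possibly present
        return all(self.bits[pos] for pos in self._positions(item))
```

In the cache-penetration scenario, all valid keys are loaded into the filter up front; a request whose key fails `might_contain` is rejected before reaching cache or database.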
| Cache avalanche
In a high-concurrency scenario, a large number of cached keys expire at the same time and all requests fall through to the database, overloading it. The difference from cache breakdown: breakdown is concurrent queries for the same piece of data, while an avalanche is the simultaneous failure of many different cached keys.
Solutions
Randomized Redis key expiration
When writing to the cache, add a random offset to each key's expiration time, such as 1 to 10 extra seconds, so that keys written together do not all expire together.
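A one-function sketch of the jitter idea; the base TTL of one hour and the 1-10 second range are example values.

```python
import random

BASE_TTL = 3600  # base expiration in seconds (example value)

def ttl_with_jitter():
    # spread expirations so keys cached in the same batch don't all
    # expire in the same instant
    return BASE_TTL + random.randint(1, 10)
```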
Use a mutex
Same as the cache breakdown solution.
Cache warming
Before the system goes live, write hot data into the cache so that the first wave of concurrent traffic after launch does not hit the database directly.
Never expire
Same as the cache breakdown solution.
Note that in a distributed system the mutex must be a distributed lock, and under high concurrency the waiting threads may cause user requests to time out, so locking noticeably limits system concurrency.
| Conclusion
The solutions above also have some disadvantages, summarized below.
Problem type | Approach | Disadvantage |
---|---|---|
Breakdown & avalanche | Never expire | Stale (dirty) data is easily produced |
Breakdown & avalanche | Use a mutex | System throughput decreases |
Penetration | Bloom filter | False positives occur |
Therefore, each system should be analyzed against its own business scenarios to choose the appropriate solution.