First of all, this post is excerpted from Aobing's blog; see the original article for the full version.

Cache avalanche

What is cache avalanche

Let’s start with the current mainstream system architecture

At present, the home page and hotspot data of e-commerce platforms are all cached. Generally, the cache is refreshed by a scheduled task, or updated after a lookup misses. However, scheduled refreshes have a problem.

Here’s an example:

Suppose all the home-page keys have a 12-hour TTL and are refreshed at 12:00 every day. Now suppose a flash-sale event starts at midnight on some day, bringing a flood of users, say 6,000 requests per second. The cache could have handled 5,000 requests per second, but at that moment all of its keys have just expired. So all 6,000 requests per second land on the database, which cannot possibly carry that load: the database alarms, and in practice it may go down before the DBA can even react. If there is no special plan for handling the failure, the DBA restarts the database, only for it to be immediately killed again by the incoming traffic. This is a cache avalanche.

When a large number of keys in Redis expire at the same moment, Redis is effectively absent, and sending requests of that magnitude straight to the database is almost always disastrous. If the downed database also backs user-facing services, every other interface that relies on it collapses with it, and the user experience deteriorates dramatically.

How to resolve cache avalanche

There are many ways to handle cache avalanches

  • When writing data to Redis in batches, add a random value to each key's expiration time, so that keys in Redis do not expire in large numbers at the same moment: setRedis(key, value, time + Math.random() * 10000);

  • If Redis is deployed as a cluster, distributing hotspot data evenly across the different Redis nodes prevents all of it from failing at once.

  • Alternatively, set hotspot data to never expire and update the cache whenever the underlying data changes (for example, when operations staff update the commodities on the home page, just refresh the corresponding cache entries and set no expiration time). Home-page data on e-commerce platforms can safely use this approach.
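The jitter trick in the first bullet can be sketched in Java. In a real system the cache write would go through a Redis client such as Jedis; here an in-memory map stands in, and the method and constant names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Sketch: spread key expirations with a random jitter so that keys
// written in the same batch do not all expire at the same instant.
public class JitteredExpiry {

    static final long BASE_TTL_MS = 12 * 60 * 60 * 1000L; // 12 hours
    static final long MAX_JITTER_MS = 10_000L;            // up to 10 s of jitter

    // Per-key TTL: base TTL plus a random offset in [0, maxJitterMs).
    public static long ttlWithJitter(long baseMs, long maxJitterMs) {
        return baseMs + ThreadLocalRandom.current().nextLong(maxJitterMs);
    }

    // Hypothetical cache write; a real implementation would call
    // something like jedis.psetex(key, ttlMs, value).
    public static void redisSet(Map<String, String> fakeRedis,
                                String key, String value, long ttlMs) {
        fakeRedis.put(key, value); // TTL omitted in this in-memory stand-in
    }

    public static void main(String[] args) {
        Map<String, String> fakeRedis = new HashMap<>();
        long ttl = ttlWithJitter(BASE_TTL_MS, MAX_JITTER_MS);
        redisSet(fakeRedis, "homepage:banner", "...", ttl);
        System.out.println("TTL for this key: " + ttl + " ms");
    }
}
```

Because each key gets its own offset, a batch written at the same second expires spread out over the jitter window instead of all at once.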

Cache penetration and breakdown, and how they differ from avalanche

What is cache penetration

Cache penetration refers to requests for data that exists in neither the cache nor the database, while users keep initiating such requests. Our database IDs start from 1 and auto-increment; if someone requests data with an ID of -1, or an ID so large that no such row exists, that user may well be an attacker. Every such request misses the cache and hits the database, and the resulting pressure can overwhelm or even shut down the database.

In other words, if you do not validate the parameters passed in by users, then since database IDs are always greater than 0, an attacker can keep requesting with IDs less than 0. Every request bypasses Redis and goes straight to the database, the lookup fails every time, and under high concurrency the database easily crashes.

What is the cache breakdown

Cache breakdown is similar to, but different from, a cache avalanche. In an avalanche, a large area of the cache fails at the same time, sending user requests straight to the DB and paralyzing it. Cache breakdown instead refers to a single very hot Redis key that is carrying a large amount of concurrent traffic, with many requests all concentrated on that one point. The instant that key expires, the sustained concurrent traffic punches through the cache and hits the DB directly, like drilling a hole in an otherwise intact bucket.

Cache penetration and cache breakdown solutions

Cache penetration

The solution to cache penetration is to add validation at the interface layer, such as user authentication and parameter checks, returning an error code immediately for invalid parameters. For example, do basic sanity checks on the ID and directly intercept any request with id <= 0.

If the data cannot be found in either the cache or the database, you can also write a null value for the corresponding key (returning something like "unknown error, retry later"), so that the same ID cannot be used to attack the database repeatedly. Normal users never issue such requests anyway, and on top of that, operations staff can block IP addresses whose request rate exceeds a threshold.
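Caching the "not found" result can be sketched as follows. The maps stand in for Redis and the database, and NULL_SENTINEL is an illustrative marker, not a real API; in Redis the sentinel entry would get a short TTL (say 60 seconds):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of caching "known missing" results so repeated requests for a
// nonexistent id stop at the cache instead of hammering the database.
public class NullCaching {

    static final String NULL_SENTINEL = "__NULL__"; // marks "known missing"

    final Map<Long, String> cache = new HashMap<>(); // stand-in for Redis
    final Map<Long, String> db = new HashMap<>();    // stand-in for the DB
    int dbHits = 0; // counts how often we fall through to the database

    public Optional<String> get(long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return NULL_SENTINEL.equals(cached) ? Optional.empty()
                                                : Optional.of(cached);
        }
        dbHits++;
        String fromDb = db.get(id);
        // Cache the miss too; in Redis this entry would get a short TTL.
        cache.put(id, fromDb == null ? NULL_SENTINEL : fromDb);
        return Optional.ofNullable(fromDb);
    }

    public static void main(String[] args) {
        NullCaching nc = new NullCaching();
        nc.db.put(1L, "widget");
        System.out.println(nc.get(1L));   // hits the DB once, then cached
        System.out.println(nc.get(999L)); // the miss itself is now cached
    }
}
```

After the first miss for a given ID, repeated attacks on that ID are absorbed by the cache; the short TTL keeps the sentinel from hiding data that is inserted later.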

Alternatively, use a Bloom filter with Redis, which can also prevent cache penetration. The principle is simple: an efficient data structure and algorithm determine whether the key a user requests could exist in the database at all. If it cannot, return immediately; if it might, query the DB and refresh the key-value pair in Redis.
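A minimal hand-rolled Bloom filter illustrates the mechanism. Real deployments would use the RedisBloom module or a library such as Guava rather than this sketch; the hash mixing here is illustrative:

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: keys known to the database are added up
// front, and a lookup the filter rejects can be answered without touching
// Redis or the DB. False positives are possible; false negatives are not.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public SimpleBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th bit index for a key by mixing it with two constants.
    private int indexFor(long key, int i) {
        long h = key * 0x9E3779B97F4A7C15L + i * 0xC2B2AE3D27D4EB4FL;
        h ^= (h >>> 32);
        return (int) Math.floorMod(h, (long) size);
    }

    public void add(long key) {
        for (int i = 0; i < hashes; i++) bits.set(indexFor(key, i));
    }

    public boolean mightContain(long key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(indexFor(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(10_000, 3);
        for (long id = 1; id <= 100; id++) filter.add(id);
        System.out.println(filter.mightContain(50L)); // true: id was added
        System.out.println(filter.mightContain(-1L)); // almost certainly false
    }
}
```

Because a "no" answer is always correct, requests for IDs that were never loaded into the filter (such as negative IDs) are rejected before they reach Redis or the database.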

Cache breakdown

Cache breakdown can be resolved by setting hotspot data to never expire, or by adding a mutex.
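The mutex approach can be sketched with a local lock: when the hot key expires, only one thread rebuilds it from the DB while the rest wait and then read the refreshed cache. In a distributed setup the lock would typically be taken in Redis itself with SET key value NX PX <ttl>; here a ReentrantLock stands in, and loadFromDb is a hypothetical slow database read:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the mutex fix for cache breakdown: a single thread rebuilds
// the expired hot key; concurrent readers block briefly instead of all
// stampeding the database at once.
public class MutexRebuild {
    final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final ReentrantLock lock = new ReentrantLock();
    final AtomicInteger dbLoads = new AtomicInteger();

    // Hypothetical slow database read.
    String loadFromDb(String key) {
        dbLoads.incrementAndGet();
        return "value-for-" + key;
    }

    public String get(String key) {
        String v = cache.get(key);
        if (v != null) return v;   // fast path: cache hit
        lock.lock();
        try {
            v = cache.get(key);    // double-check: another thread may have rebuilt it
            if (v == null) {
                v = loadFromDb(key);
                cache.put(key, v);
            }
            return v;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MutexRebuild mr = new MutexRebuild();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> mr.get("hot"));
        }
        for (Thread t : workers) t.start();
        for (Thread t : workers) t.join();
        System.out.println("DB loads: " + mr.dbLoads.get());
    }
}
```

The double-check inside the lock is what guarantees the database is read only once per expiry, no matter how many threads pile up on the hot key.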

Conclusion

Redis cache avalanche, cache penetration, and cache breakdown are similar, but there are some differences.

Ways to prevent these problems

  • Beforehand: make Redis highly available with master/slave + Sentinel, or Redis Cluster, to avoid a total Redis crash.
  • During the incident: use a local EhCache cache plus Hystrix rate limiting and degradation, to keep MySQL from being killed.
  • Afterwards: with Redis persistence (RDB + AOF), a restarted Redis automatically loads data from disk and quickly restores the cache.