
Redis is often used as a cache in systems, because disk I/O alone cannot keep up with the massive read and write traffic of Internet applications.

1. Cache penetration

Cache penetration occurs when requests keep arriving for data that exists in neither the cache nor the database, such as an ID of -1 or an ID far larger than any real record. Since every such request falls through to the database, an attacker can exploit this to overwhelm the application's database.

1. Common solutions

There are three common solutions to the cache penetration problem:

  • Validation at the interface layer: check user permissions and perform basic validation on fields such as ID, for example directly rejecting requests with ID <= 0;

  • Cache empty results: when a database query returns nothing, cache that empty result as well, but give it a short TTL so it does not crowd out normally cached data, as in the code below.

    public Student getStudentsByID(Long id) {
        // Try to read the student from Redis first
        Student student = (Student) redisTemplate.opsForValue().get(String.valueOf(id));
        if (student != null) {
            return student;
        }
        // Cache miss: query the database and write the result back to Redis
        student = studentDao.selectByStudentId(id);
        if (student != null) {
            redisTemplate.opsForValue()
                    .set(String.valueOf(id), student, 60, TimeUnit.MINUTES);
        } else {
            // Cache the miss as well, but only for a short time
            // (in practice an empty placeholder object is often stored instead,
            // since many serializers reject null values)
            redisTemplate.opsForValue()
                    .set(String.valueOf(id), null, 60, TimeUnit.SECONDS);
        }
        return student;
    }

  • Use a Bloom filter: a Bloom filter is a probabilistic data structure with a small, known error rate. When it says a value exists, the value may or may not actually exist; but when it says a value does not exist, it definitely does not exist. A usage sketch follows this list.
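For illustration, here is a rough sketch (not from the original article) of putting a Bloom filter in front of the student lookup. It uses Guava's BloomFilter as one common off-the-shelf implementation and assumes the filter is pre-loaded at startup with every student ID that really exists in the database:

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;

    public class StudentBloomGuard {

        // Hypothetical sizing: about one million IDs with a 1% false-positive rate
        private final BloomFilter<Long> idFilter =
                BloomFilter.create(Funnels.longFunnel(), 1_000_000, 0.01);

        // Called once at startup for every ID that actually exists in the database
        public void preload(Iterable<Long> existingIds) {
            existingIds.forEach(idFilter::put);
        }

        public Student getStudentsByID(Long id) {
            // If the filter says the ID does not exist, it definitely does not exist:
            // reject the request without touching Redis or the database
            if (!idFilter.mightContain(id)) {
                return null;
            }
            // Otherwise fall back to the cache-then-database lookup shown above
            return lookupFromCacheOrDatabase(id);
        }

        // Placeholder for the getStudentsByID logic from the previous example
        private Student lookupFromCacheOrDatabase(Long id) {
            return null; // hypothetical stub
        }
    }

The filter can only answer "definitely not present" or "possibly present", so requests for IDs that were never loaded are rejected before they reach Redis or the database.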

2. Bloom filter

In business code we might use a HashMap to check whether a value exists; lookups complete in O(1) time, which is very efficient. But a HashMap is limited by memory: once the set of values to check grows to the hundred-million level, it consumes far too much memory.

BloomFilter solves this problem in a simple way: replace the HashMap's bucket array with a bit array to save space, then hash each Key with several different hash functions and set the bit at each resulting position to 1.

When checking whether an element exists, we check whether all the bits that the value hashes to are set to 1. If they are, the element may or may not exist; but if even one of those bits is 0, the Key definitely does not exist.

Note: a BloomFilter supports only adding elements, not deleting them. This is easy to understand: deleting would mean setting the corresponding bits back to 0, but the bits set by your Key may also be shared by other Keys.
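To make the mechanism concrete, here is a minimal, simplified Bloom filter sketch (mine, not the article's): a BitSet plus k hash positions derived from the element's hash code. It shows both the lookup rule and why a bit cannot safely be cleared once set.

    import java.util.BitSet;

    public class SimpleBloomFilter {

        private final BitSet bits;
        private final int size;       // number of bits in the filter
        private final int hashCount;  // number of hash positions per key (k)

        public SimpleBloomFilter(int size, int hashCount) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashCount = hashCount;
        }

        public void add(String key) {
            // Set the bit at every hashed position to 1
            for (int i = 0; i < hashCount; i++) {
                bits.set(position(key, i));
            }
        }

        public boolean mightContain(String key) {
            // If any hashed bit is 0, the key was definitely never added;
            // if all of them are 1, the key may have been added (false positives possible)
            for (int i = 0; i < hashCount; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }

        // Derive the i-th position from the key's hash code (deliberately simplified)
        private int position(String key, int i) {
            int h = key.hashCode() * (31 + i) + i;
            return Math.abs(h % size);
        }
    }

Because several Keys can map to the same bits, mightContain may return true for a Key that was never added, which is exactly the false-positive behaviour described above; and clearing a bit to delete one Key could silently "delete" other Keys that share it.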

3. Comparison between caching empty data and Bloom filters

Both schemes are briefly described above. Caching empty data and Bloom filters can both effectively solve the cache penetration problem, but their usage scenarios differ slightly.

  • Caching empty data is not a good fit when malicious queries use a large number of different Keys: every one of those Keys has to be stored, so the memory footprint is high, and many of them may be queried only once, which makes storing them pointless. In that scenario a Bloom filter is the better choice;

  • However, if the Keys that return empty data are limited in number and the same Keys are requested repeatedly, caching empty data works well.

2. Cache breakdown

Cache breakdown occurs when a hot Key expires just as many threads are accessing it concurrently. Because the cached entry has only just expired, all of those concurrent requests hit the database at once.

The solution

  • Set hotspot data to never expire.

  • Add a mutex: a mutex limits how many threads can query the database at the same time, but it also reduces system throughput, so use it according to the actual situation. A pseudocode sketch is shown below.

    public String get(String key) {
        String value = redis.get(key);
        if (value == null) { // the cached entry has expired
            // Try to acquire a mutex with a 3-minute expiry (SET NX EX style),
            // so a crashed loader cannot hold the lock forever
            if (redis.setnx(key_mutex, 1, 3 * 60) == 1) {
                // This thread won the lock: rebuild the cache from the database
                value = db.get(key);
                redis.set(key, value, expire_secs);
                redis.del(key_mutex);
                return value;
            } else {
                // Another thread is already rebuilding the cache:
                // wait briefly, then retry
                sleep(50);
                return get(key);
            }
        } else {
            return value;
        }
    }

3. Cache avalanche

A cache avalanche occurs when a large number of cached entries expire at the same time, or the cache fails across a wide area, so that a flood of requests hits the database directly, overloading its CPU and memory or even bringing it down.

A simple avalanche process:

  1. The Redis cluster suffers large-scale failures;

  2. The cache is unavailable, but large numbers of requests keep hitting the Redis cache servers;

  3. Since these Redis requests fail, the traffic turns to the database;

  4. Because the application depends on both the database and Redis, the failure quickly cascades through the server cluster and eventually brings down the whole system.

The solution

  • High-availability cache: the goal is to keep the cache from failing as a whole, so that even if individual nodes, machines, or entire machine rooms go down, the system can still serve requests. Both Redis Sentinel and Redis Cluster can provide this high availability;

  • Cache degradation (temporary measure): when traffic spikes and the service starts to struggle, how do we keep it available? Hystrix, which is widely used in China, limits the damage after an avalanche through three mechanisms: circuit breaking, degradation, and rate limiting. As long as the database is alive, the system can always respond to requests; that is how 12306 survives every Spring Festival rush, and as long as it keeps responding there is at least a chance of getting a ticket. A minimal degradation sketch follows this list;

  • Redis backup and quick warm-up: back up and restore Redis data, and warm the cache up quickly after recovery.
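As a rough sketch of the degradation idea (not code from the original article), the following Hystrix command wraps the student lookup from the earlier examples and falls back to a default value when the call fails, times out, or the circuit breaker is open; the Supplier wiring is a hypothetical simplification:

    import java.util.function.Supplier;

    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;

    // Student is the entity type used in the earlier examples
    public class GetStudentCommand extends HystrixCommand<Student> {

        // The real lookup (cache + database) is passed in as a supplier,
        // so this sketch does not depend on a concrete service class
        private final Supplier<Student> lookup;

        public GetStudentCommand(Supplier<Student> lookup) {
            super(HystrixCommandGroupKey.Factory.asKey("StudentService"));
            this.lookup = lookup;
        }

        @Override
        protected Student run() {
            // Normal path: delegate to the cache/database lookup
            return lookup.get();
        }

        @Override
        protected Student getFallback() {
            // Degraded path: runs when run() throws, times out,
            // or the circuit breaker is open; return a safe default instead
            return null;
        }
    }

A caller would execute it with something like new GetStudentCommand(() -> getStudentsByID(id)).execute(), so that even when Redis and the database are struggling, the application returns a degraded response instead of hanging.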