Preface

Cache avalanche, cache penetration, cache breakdown, and Redis/MySQL data consistency have been written about endlessly on the Internet, but I decided to write another article to explain my own understanding, hoping it helps someone. If you are a new developer and don't know these terms yet, don't be afraid: they sound impressive, but they are really simple. They are just fancy names for problems you may not have considered when using Redis in development.

Cache penetration

Before talking about cache penetration, let's first look at a business scenario. Take our company's APP as an example: we have a business that queries logistics information.

    public ExpressInfo findByDeliveryOrderId(Long id) {
        Object obj = redisTemplate.opsForValue().get("hosjoy-b2b-express:express-info:" + id);
        if (obj != null) {
            return (ExpressInfo) obj;
        } else {
            ExpressInfo expressInfo = expressMapper.selectByDeliveryOrderId(id);
            if (expressInfo != null) {
                // Write to Redis with a 2-hour expiration time
                redisTemplate.opsForValue().set("hosjoy-b2b-express:express-info:" + id, expressInfo, Duration.ofHours(2L));
                return expressInfo;
            } else {
                throw new ClientException("invoice: {} logistics information does not exist", id);
            }
        }
    }

If it’s not in the cache, I go to the database, write the result back to Redis with an expiration time, and return it. It looks fine at first glance, but there is a potential problem: what if someone passes in a negative id? Then the cache never hits and every request falls through to the database. In the last post we talked about Hibernate-Validator validation; the @Positive annotation easily rejects negative parameters. But if I pass a randomly generated id of type Long, parameter validation cannot stop it, and a random id is almost certainly not in your database either. So with the code above, every such request misses the cache, goes to the database, and still finds nothing.
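As a reminder of that validation step, here is a minimal sketch; the controller class, path, and ExpressService are hypothetical names for illustration, only the @Positive annotation is the point:

    import javax.validation.constraints.Positive;
    import org.springframework.validation.annotation.Validated;
    import org.springframework.web.bind.annotation.*;

    @Validated   // Required on the class so @Positive on method parameters is enforced
    @RestController
    public class ExpressController {

        private ExpressService expressService;  // injected by Spring, wiring omitted

        // Spring rejects any id <= 0 with a validation error before our service code runs
        @GetMapping("/express/{id}")
        public ExpressInfo findByDeliveryOrderId(@PathVariable @Positive Long id) {
            return expressService.findByDeliveryOrderId(id);
        }
    }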

The idea of using Redis as cache middleware is that requests go to Redis first, and only when the data in Redis has expired (or is missing) do we access MySQL and refresh Redis.

The scenario above hits MySQL on every request, which clearly defeats the purpose of using Redis as cache middleware: a query for a resource that does not exist causes MySQL to be accessed every single time. This is called cache penetration.

How can you avoid this problem? Personally I don't think cache penetration is as significant as cache breakdown and cache avalanche; after all, it is a low-probability event, and few people will go out of their way just to hammer our servers. But if you must fix it, there are ways. The most common is to do proper parameter validation, which at the very least blocks obviously illegal parameters such as negative ids. Second, remember the advanced data structure we mentioned in the Redis basics interview post: the Bloom filter. We can put all resource ids into a Bloom filter and check it first in the code. When the Bloom filter says an id does not exist, it is guaranteed not to exist, so we can return immediately. Here Redisson is used as the concrete Bloom filter implementation, and the check is added before the business code.

    public ExpressInfo findByDeliveryOrderId(Long id) {
        RBloomFilter<Long> bloomFilter = redissonClient.getBloomFilter("hosjoy-b2b-product:bloom-filter:express-info");
        if (!bloomFilter.contains(id)) {
            throw new ClientException("invoice: {} logistics information does not exist", id);
        }
        // ... then continue with the cache/database lookup shown above
    }
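For completeness, a minimal sketch of how the Bloom filter might be populated with Redisson; the expected-insertions count, false-positive rate, and the selectAllDeliveryOrderIds mapper method are assumptions for illustration, not values from the original code:

    // Somewhere during startup or after batch-importing orders
    RBloomFilter<Long> bloomFilter = redissonClient.getBloomFilter("hosjoy-b2b-product:bloom-filter:express-info");
    // tryInit(expectedInsertions, falseProbability) only initializes the filter if it does not exist yet
    bloomFilter.tryInit(1_000_000L, 0.01);

    // Add every existing delivery-order id; newly created ids must also be added as they appear
    for (Long deliveryOrderId : expressMapper.selectAllDeliveryOrderIds()) {
        bloomFilter.add(deliveryOrderId);
    }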

Cache breakdown

Similarly, let's not worry about the definition of cache breakdown yet; look at the pseudocode first. Again taking our company's APP as an example, we now have a business that queries product categories.

    /**
     * Query product category information
     */
    @SuppressWarnings("unchecked")
    public List<ProductCategory> findProductCategory() {
        Object obj = redisTemplate.opsForValue().get("hosjoy-b2b-product:product-category");
        if (obj == null) {
            // Query the database
            List<ProductCategory> categoryList = productCategoryMapper.selectProductCategory();
            // Write to Redis with a 2-hour expiration time
            redisTemplate.opsForValue().set("hosjoy-b2b-product:product-category", categoryList, Duration.ofHours(2L));
            return categoryList;
        } else {
            return (List<ProductCategory>) obj;
        }
    }

Again, I first look up the product categories in Redis; if they are not there, I query the database, write the result back to Redis, and return it. Note that data such as product categories is accessed very frequently (otherwise there would be no need to keep it in Redis at all): every user browsing products hits it. The code above looks fine, but there is a big problem. Everything works normally until the two-hour expiration we set on the product-category key kicks in. At the moment the key expires, if traffic is heavy, tens of thousands or even hundreds of thousands of requests (which is actually small for sites like Taobao or JD) all miss Redis and go straight to the database. The database obviously cannot withstand that much concurrent pressure and is crushed instantly. When data expires in Redis and a flood of requests hits MySQL directly, that is called cache breakdown.

This issue is more important than cache penetration and must be addressed. The solution is actually very simple: since we know that a large number of requests pour into MySQL at the moment the Redis data expires, we just need to put a lock around the MySQL query. If the data is not in Redis, only one request is allowed to query the database and write the value back to Redis.

    /**
     * Query product category information
     */
    @SuppressWarnings("unchecked")
    public List<ProductCategory> findProductCategory() {
        Object obj = redisTemplate.opsForValue().get("hosjoy-b2b-product:product-category");
        if (obj == null) {
            synchronized (this) {
                // Check again in case the previous thread holding the lock has already refreshed Redis
                obj = redisTemplate.opsForValue().get("hosjoy-b2b-product:product-category");
                if (obj != null) {
                    return (List<ProductCategory>) obj;
                }
                List<ProductCategory> categoryList = productCategoryMapper.selectProductCategory();
                redisTemplate.opsForValue().set("hosjoy-b2b-product:product-category", categoryList, Duration.ofHours(2L));
                return categoryList;
            }
        } else {
            return (List<ProductCategory>) obj;
        }
    }

In this way, only one request at a time can access MySQL; once it has written the result to Redis, it releases the lock. The other concurrent threads then enter the synchronized block one by one, check Redis first, find that the data already exists, and never touch MySQL.

Here are a few details:

  • Since Spring beans are singletons by default, it is safe to use synchronized(this) in the current Service class
  • Inside the synchronized block we must query Redis again, in case the last thread that held the lock has already refreshed Redis
  • This scenario does not necessarily require a distributed lock

You might wonder why we don't use a distributed lock here. After all, the product service in production certainly runs as a cluster, and synchronized(this) only guarantees that the code is executed by one request at a time within the current application instance, not across the other instances in the cluster. The key point is that we are not protecting a data modification here; we just want to keep a flood of requests off MySQL. If the product service runs as a cluster of 10 instances, then in the worst case 10 requests hit MySQL at the same time, which is not a problem. Of course, a distributed lock works too, as sketched below.
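If you do want the cluster-wide guarantee, a minimal sketch with a Redisson distributed lock might look like this; the lock name and the wait/lease times are assumptions for illustration:

    @SuppressWarnings("unchecked")
    public List<ProductCategory> findProductCategory() throws InterruptedException {
        Object obj = redisTemplate.opsForValue().get("hosjoy-b2b-product:product-category");
        if (obj != null) {
            return (List<ProductCategory>) obj;
        }
        RLock lock = redissonClient.getLock("hosjoy-b2b-product:lock:product-category");
        // Wait up to 3 seconds for the lock, hold it for at most 10 seconds
        if (lock.tryLock(3, 10, TimeUnit.SECONDS)) {
            try {
                // Double-check: another instance may have refreshed the cache already
                obj = redisTemplate.opsForValue().get("hosjoy-b2b-product:product-category");
                if (obj != null) {
                    return (List<ProductCategory>) obj;
                }
                List<ProductCategory> categoryList = productCategoryMapper.selectProductCategory();
                redisTemplate.opsForValue().set("hosjoy-b2b-product:product-category", categoryList, Duration.ofHours(2L));
                return categoryList;
            } finally {
                lock.unlock();
            }
        }
        // Could not get the lock in time: fall back to the database (or retry, depending on requirements)
        return productCategoryMapper.selectProductCategory();
    }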

Cache avalanche

Cache avalanche is similar to cache breakdown. An e-commerce system stores a lot of data in Redis, and we usually set an expiration time on cached data. Now imagine that a large number of keys expire at the same moment: a large number of requests miss Redis, pour into MySQL, and the database may be crushed by the pressure. A mass expiration of keys in Redis that sends a flood of requests to the database and brings it down is called a cache avalanche.

The solution to cache avalanche is relatively simple: just make sure the keys in Redis do not all expire at the same time, by adding a random offset to the expiration time when writing to Redis.

    Duration expire = Duration.ofHours(2L).plus(Duration.ofSeconds((int) (Math.random() * 100)));
    redisTemplate.opsForValue().set("key","value", expire);

Redis and MySQL data consistency

This question is often asked in interviews. First, what is data consistency?

We know that Redis serves as the cache middleware: when data in MySQL is updated, we then write it to Redis, which is a two-step operation. With multiple threads, those two steps can interleave in different orders, leaving MySQL and Redis inconsistent and potentially causing business problems. Let's see exactly what can go wrong.

Double write mode

The so-called double-write mode is what our code examples above do: first update MySQL, then update Redis. Look at the pseudocode below.

    @Transactional
    public void update(Object obj) {
        xxxMapper.update(obj);                         // Update the database
        redisTemplate.opsForValue().set("key", obj);   // Update the cache
    }

In fact this approach is fine for most companies of ordinary scale; it only becomes a problem under heavy concurrency. Suppose thread A finishes updating the database first and thread B updates it second, but thread A then hits network jitter towards the Redis service: B's result is written to Redis first, and A's result is written afterwards, overwriting it. This produces dirty data, because B holds the latest database modification, yet its value in Redis has been overwritten by the value A wrote, as shown in the figure:

Both threads modified MySQL and then updated Redis. We know that writing MySQL and writing Redis is not an atomic operation, so we need a lock to make the two steps behave atomically. Since both reads and writes are involved, a read-write lock fits best: reads cannot proceed until the write lock is released, and writes queue up behind each other, so readers never see inconsistent data.

    /**
     * Double-write mode with a read-write lock
     */
    @Transactional
    public void update(Object obj) throws InterruptedException {
        RReadWriteLock lock = redissonClient.getReadWriteLock("lock");
        RLock writeLock = lock.writeLock();
        // waitTime and leaseTime are configured elsewhere
        if (writeLock.tryLock(waitTime, leaseTime, TimeUnit.SECONDS)) {
            try {
                xxxMapper.update(obj);                         // Update the database
                redisTemplate.opsForValue().set("key", obj);   // Update the cache
            } finally {
                writeLock.unlock();
            }
        }
    }
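The read side then takes the read lock before touching the cache or the database. A minimal sketch, assuming the same key name as the write example and a hypothetical xxxMapper.select() query:

    public Object query() throws InterruptedException {
        RReadWriteLock lock = redissonClient.getReadWriteLock("lock");
        RLock readLock = lock.readLock();
        // Readers block while a writer holds the write lock, so they never see a half-finished update
        if (readLock.tryLock(waitTime, leaseTime, TimeUnit.SECONDS)) {
            try {
                Object obj = redisTemplate.opsForValue().get("key");
                if (obj != null) {
                    return obj;
                }
                obj = xxxMapper.select();                     // Fall back to the database
                redisTemplate.opsForValue().set("key", obj);  // Refresh the cache
                return obj;
            } finally {
                readLock.unlock();
            }
        }
        return xxxMapper.select();
    }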

Failure mode

The so-called failure mode means that after updating the database we delete the data in Redis. Since our business code queries MySQL and refreshes Redis whenever it cannot find the value in Redis, the next user request automatically pulls in the latest value.

    /**
     * Failure mode: update the database, then delete the cache
     */
    @Transactional
    public void update(Object obj) {
        xxxMapper.update(obj);          // Update the database
        redisTemplate.delete("key");    // Delete the cache
    }

In fact the code above also has problems. First, since we delete the key in Redis, we need to guard against cache breakdown on the query side, which was covered above. Second, it can also produce dirty data, as shown in the diagram below:

In this scenario, thread A updates MySQL and deletes the key in Redis, and thread B updates MySQL after it. At the same time thread C is reading: it finds nothing in Redis (A has just deleted it) and reads MySQL. B's modification has not been committed yet, so C reads the value A wrote. B then commits its transaction and deletes the Redis key normally, while C, slowed down by the network, only now writes what it read into Redis. After B's delete, the value C puts into Redis is actually the value A wrote to MySQL, yet the latest value in MySQL should be B's modification. Dirty data again.

If the text is hard to follow, look at the picture. The situation is similar to the double-write mode above: data inconsistency appears. We can avoid it by adding a read-write lock so that updating the database and deleting the cache form one atomic operation. The locking code is almost the same as in the double-write mode, so I won't post it here.

Use the middleware Canal to subscribe to the binary log

As you can see, both patterns require locking to avoid data inconsistency, and locking hurts system throughput. If we do not want to lock and still want to keep MySQL and Redis consistent in near real time, a good solution is to use the middleware Canal to subscribe to MySQL's binary log.

Canal is an open-source middleware from Alibaba. It works by pretending to be a MySQL slave node: it sends the dump protocol to MySQL and subscribes to its binary log, so whenever data changes in MySQL (inserts, deletes, index creation, new tables, and so on), Canal hears about it. The problem then becomes simple: we only need to write a client program that receives the event types from the Canal server and publishes a message to MQ, plus an application that consumes the MQ messages and synchronizes the data into Redis according to the change type in the binlog. This eliminates the data inconsistency problems of the two modes above. See the diagram below:
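As a rough illustration of the client side, here is a minimal sketch using Canal's official Java client; the server address, destination name, subscription filter, and the sendToMq helper are all assumptions for illustration:

    import com.alibaba.otter.canal.client.CanalConnector;
    import com.alibaba.otter.canal.client.CanalConnectors;
    import com.alibaba.otter.canal.protocol.CanalEntry;
    import com.alibaba.otter.canal.protocol.Message;

    import java.net.InetSocketAddress;

    public class CanalToMqClient {

        public void run() throws Exception {
            CanalConnector connector = CanalConnectors.newSingleConnector(
                    new InetSocketAddress("127.0.0.1", 11111), "example", "", "");
            connector.connect();
            connector.subscribe(".*\\..*");  // Listen to all tables here; narrow the filter in practice

            while (true) {
                Message message = connector.getWithoutAck(100);   // Pull up to 100 binlog entries
                long batchId = message.getId();
                if (batchId != -1 && !message.getEntries().isEmpty()) {
                    for (CanalEntry.Entry entry : message.getEntries()) {
                        if (entry.getEntryType() != CanalEntry.EntryType.ROWDATA) {
                            continue;
                        }
                        CanalEntry.RowChange rowChange = CanalEntry.RowChange.parseFrom(entry.getStoreValue());
                        // Forward the change type and row data to MQ; a consumer updates Redis
                        sendToMq(entry.getHeader().getTableName(), rowChange);
                    }
                }
                connector.ack(batchId);  // Confirm the batch so Canal can move on
            }
        }

        private void sendToMq(String table, CanalEntry.RowChange rowChange) {
            // ... publish to the message queue (implementation omitted)
        }
    }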

Note that MQ is introduced here (why use a message queue? For asynchrony, peak shaving, and decoupling), so we must guarantee that the MQ messages are consumed in order; otherwise the Redis data can still end up inconsistent.
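One common way to get that ordering, sketched here with Kafka as an assumed message queue (the topic name and key choice are illustrative): key each message by the primary key of the changed row, so all changes to the same row land on the same partition and are consumed in order.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Changes for the same rowId always go to the same partition, preserving their relative order
    public void publishChange(KafkaProducer<String, String> producer, String table, Long rowId, String changeJson) {
        producer.send(new ProducerRecord<>("binlog-changes", table + ":" + rowId, changeJson));
    }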

This solution is relatively good: once it is in place we don't even need to touch Redis in the business code, and it solves the MySQL/Redis consistency problem fairly completely. The downside is that we introduce extra middleware to maintain, and two of them at once; to avoid single points of failure and keep things highly available, we also have to cluster them... That is a considerable cost.

Conclusion

As for MySQL and Redis data consistency, some other solutions are proposed online, such as delayed double delete; those who are interested can study it, but in fact it does not really solve the problem... Another suggestion is to shorten the cache expiration time, which treats the symptom rather than the cause. Honestly, the approach I like most is this: talk to the business side and get them to accept a short window of inconsistent data, since Redis will hold the latest value once the key expires, and then you don't even need a lock. This is how you show off the status of developers in the company......

I'm kidding, of course; I still think locking is a good way to ensure data consistency.

Considering I'm writing this over the Dragon Boat Festival holiday, if this article helped you, remember to like and follow. Your support is my motivation to keep writing!