preface

My knowledge is limited, so if there are any mistakes, corrections are welcome. Thank you!

The following conventions apply to the concepts discussed in this article:

  1. Cache: Data in Redis
  2. Database: MySQL

This article will not cover Redis or MySQL themselves; it focuses on the Cache Aside Pattern.

What is the Cache Aside Pattern

The Cache Aside Pattern is the classic read/write pattern for combining a cache with a database.

  1. On a read, read the cache first. If the cache misses, read the database, then put the data into the cache and return it.

  2. On an update, delete the cache first, then update the database (cache and database double write). A minimal sketch of both paths follows this list.
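
The sketch below is a minimal illustration, not a production implementation: the key name user:{id}, the TTL, and the query_user_from_db / update_user_in_db helpers are made-up placeholders standing in for real MySQL access, and the cache calls use the redis-py client.

    import json
    import redis

    # Hypothetical stand-ins for MySQL access; in a real service these would run
    # SELECT / UPDATE statements through your MySQL client of choice.
    def query_user_from_db(user_id):
        return {"id": user_id, "name": "placeholder"}   # pretend SELECT result

    def update_user_in_db(user_id, data):
        pass                                            # pretend UPDATE statement

    r = redis.Redis(host="localhost", port=6379, db=0)
    CACHE_TTL = 3600  # seconds; an expiry is optional but common in practice

    def read_user(user_id):
        key = f"user:{user_id}"
        cached = r.get(key)
        if cached is not None:                          # cache hit
            return json.loads(cached)
        data = query_user_from_db(user_id)              # cache miss: read the database
        if data is not None:
            r.setex(key, CACHE_TTL, json.dumps(data))   # write back for the next reader
        return data

    def update_user(user_id, data):
        r.delete(f"user:{user_id}")                     # delete the cache first ...
        update_user_in_db(user_id, data)                # ... then update the database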

Understanding the Cache Aside Pattern

In the Cache Aside Pattern, a read request reads from the cache first. If there is no data in the cache, the data is read from the database, written into the cache, and then returned in the response. That part is intuitive.

A closer look at the update path, which touches both the cache and the database, may raise the following questions:

  1. Why delete the cache first and then update the database
  2. Why delete the cache instead of updating it

Let’s take each one in turn.

Why delete the cache first and then update the database

If we modify the database first and then delete the cache, and the delete fails, the database holds the new data while the cache still holds the old data, so the two are inconsistent. Deleting the cache first avoids this: even if the subsequent database update fails, the cache is merely empty, and the next read simply reloads the unchanged old value from the database.
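
To make that failure window concrete, here is a minimal sketch in which plain Python dicts stand in for MySQL and Redis; the stock key, the values, and the delete_fails flag are invented purely for illustration:

    db = {"stock": 100}       # pretend MySQL row
    cache = {"stock": 100}    # pretend Redis key

    def update_db_first_then_delete_cache(new_value, delete_fails=False):
        db["stock"] = new_value        # step 1: the database now holds the new value
        if delete_fails:               # step 2: the cache delete fails (timeout, Redis down, ...)
            return                     #         and the old value is left behind
        cache.pop("stock", None)

    update_db_first_then_delete_cache(80, delete_fails=True)
    print(db["stock"], cache["stock"])  # 80 100 -> readers keep getting the stale 100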

Why delete the cache instead of updating it

Reason 1: In a high-concurrency scenario, multiple concurrent writes may cause data inconsistency.

For example, suppose write requests 1 and 2 arrive concurrently. There is no guarantee of ordering between them, so the database and the cache may be operated on in different orders:

  • Request 1 updates the database first, then request 2 updates the database
  • Request 2 updates the cache first, then request 1 updates the cache

Now the database holds request 2's value while the cache holds request 1's value. If we simply delete the cache instead, this kind of inconsistency cannot occur (see the sketch below).
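
The interleaving above can be written out step by step. Again, this is only a sketch with plain dicts in place of MySQL and Redis, and the key name and values are invented:

    db, cache = {}, {}

    # Request 1 writes "A", request 2 writes "B"; the steps interleave as described above.
    db["key"] = "A"       # request 1 updates the database
    db["key"] = "B"       # request 2 updates the database (later, so "B" should win)
    cache["key"] = "B"    # request 2 updates the cache
    cache["key"] = "A"    # request 1's slower cache update lands last

    print(db["key"], cache["key"])   # B A -> the cache and the database disagree
    # If both requests had only deleted the cache, the key would simply be gone,
    # and the next read would reload the final value "B" from the database.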

Reason 2: The lazy-loading (lazy initialization) idea

The data we just updated may turn out to be cold data with a low access frequency, so there is no need to refresh the cache right away. A cached value that is only read, say, a hundred times an hour does not justify being recomputed on every write; it is enough to rebuild it when the next read request for that data arrives. This is the lazy-loading idea.

Reason 3: Updating the cache may require complex calculations

If the cached value is assembled from several tables or derived through an expensive computation, recomputing it on every write is costly; deleting it instead defers that cost to the moment the value is actually read.

Is deleting the cache first and then updating the database foolproof

This can still be problematic in high concurrency scenarios.

  1. Suppose the data is being changed: the cache has been deleted, but the database has not yet been updated.

  2. Then a read request comes in, checks the cache, finds it empty, queries the database, gets the old pre-modification data, and puts it back into the cache.

  3. Then the writing request completes its database update.

And there it is: the database and the cache now disagree. This is the problem of cache and database double-write inconsistency in high-concurrency scenarios.
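
Written out as a step-by-step sketch (plain dicts stand in for MySQL and Redis, and the price values are invented):

    db = {"price": 100}      # old value in the database
    cache = {"price": 100}   # old value in the cache

    # Write request: wants to change the price to 120.
    cache.pop("price", None)           # step 1: cache deleted, database not yet updated

    # Read request arrives in the gap between the two write steps.
    value = cache.get("price")         # step 2: cache miss ...
    if value is None:
        value = db["price"]            # ... so it reads the old value 100 from the database
        cache["price"] = value         # and puts the old value back into the cache

    # Write request finally finishes.
    db["price"] = 120                  # step 3: the database now holds 120

    print(db["price"], cache["price"])  # 120 100 -> double-write inconsistency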

conclusion

This article first gave a basic definition of the Cache Aside Pattern (readers can refer to the figure below), then answered two questions about it, and finally pointed out the cache and database double-write inconsistency that arises under high concurrency. We will look at that problem in a follow-up article.

reference

  1. Cache Aside Pattern parsing

  2. Huoshan's course on a billion-level-traffic e-commerce product detail page system

    Thanks to my teacher for giving me a first understanding of the Cache Aside Pattern by drawing plenty of diagrams and explaining things in plain language