Preface

In a distributed system that has both a cache and a database, which one should a write operation update first: the database or the cache? Let's think through the problems each ordering can cause. Below I walk through several solutions in turn.

Github.com/whx123/Java…

Cache maintenance solution 1

Suppose there is one write operation (thread A) and one read operation (thread B), and the write works on the cache first and then on the database. The steps are as follows:

1) Thread A initiates a write operation. Its first step is to delete the cache entry

2) Thread A's second step is to write the new data to the DB

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B then writes that data into the cache

Seen this way, everything is fine. Now consider a second ordering of the same operations:

1) Thread A initiates a write operation. Its first step is to delete the cache entry

2) Thread B initiates a read operation and gets a cache miss

3) Thread B goes on to read the DB and gets the old data

4) Thread B writes the old data into the cache

5) Thread A writes the latest data to the DB

The cache now holds the old data, and every subsequent read returns it. The cache is inconsistent with the database.
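
The second ordering can be replayed deterministically. This is only a minimal sketch: the plain dicts `cache` and `db` and the key `user:1` are stand-ins invented for illustration, not a real cache or database client.

```python
# Replay of "delete cache first, then write DB" with an unlucky interleaving.
# `cache` and `db` are plain dicts standing in for a real cache and database.
cache = {"user:1": "old"}
db = {"user:1": "old"}

# 1) Thread A deletes the cache entry.
cache.pop("user:1", None)

# 2) Thread B reads and gets a cache miss.
value_seen_by_b = cache.get("user:1")

# 3) Thread B falls back to the DB and reads the old value.
if value_seen_by_b is None:
    value_seen_by_b = db["user:1"]

# 4) Thread B caches the old value.
cache["user:1"] = value_seen_by_b

# 5) Thread A finally writes the new value to the DB.
db["user:1"] = "new"

print(cache["user:1"], db["user:1"])  # old new
```

Nothing in the read path ever refreshes the entry afterwards, so the stale value stays until the key is deleted or expires.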

Cache maintenance solution 2

Double-write operation: write the cache first, then the database.

1) Thread A initiates a write operation. Its first step is to set the cache

2) Thread A's second step is to write the new data to the DB

3) Thread B initiates a write operation and sets the cache

4) Thread B writes the new data to the DB

That ordering causes no problem. But things do not always go as expected. Consider a second ordering:

1) Thread A initiates a write operation. Its first step is to set the cache

2) Thread B initiates a write operation. Its first step is to set the cache

3) Thread B writes to the database

4) Thread A writes to the database

After both operations complete, the cache holds B's data while the database holds A's data. The cache is inconsistent with the database.
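
Both orderings can be compared side by side. The sketch below is an assumption-laden toy: the helper `replay` and the single key `k` are hypothetical, and each step is simply "write this value into this store".

```python
def replay(steps):
    """Apply (store, value) steps in order; store is 'cache' or 'db'.

    Returns the final (cache value, db value) for the single key 'k'.
    """
    stores = {"cache": {}, "db": {}}
    for store, value in steps:
        stores[store]["k"] = value
    return stores["cache"]["k"], stores["db"]["k"]

# First ordering: A finishes both writes before B starts.
serial = replay([("cache", "A"), ("db", "A"), ("cache", "B"), ("db", "B")])

# Second ordering: B's two writes land between A's cache write and A's DB write.
interleaved = replay([("cache", "A"), ("cache", "B"), ("db", "B"), ("db", "A")])

print(serial)       # ('B', 'B')
print(interleaved)  # ('B', 'A')
```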

Cache maintenance solution 3

Here there is again one write operation (thread A) and one read operation (thread B), but the write operates on the database first and then on the cache.

1) Thread A initiates a write operation. Its first step is to write the DB

2) Its second step is to delete the cache entry

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B then writes that data into the cache

This solution has no obvious concurrency problem, but the cache deletion in step 2 may fail, in which case the cache keeps serving the old value. The probability of that is relatively small, so solution 3 is better than solutions 1 and 2, and it is the one commonly used in daily work.
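
A minimal sketch of solution 3 and its weak spot follows. The class `CacheAside` and the `delete_fails` flag are hypothetical names used only to simulate a failed cache deletion; a real system would talk to an actual cache and database.

```python
class CacheAside:
    """Toy 'write DB first, then delete cache' store backed by dicts."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: load from the DB and populate the cache.
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value, delete_fails=False):
        self.db[key] = value          # step 1: write the DB
        if delete_fails:              # simulated failure of step 2
            return False
        self.cache.pop(key, None)     # step 2: delete the cache entry
        return True


store = CacheAside({"k": "v1"})
first = store.read("k")                        # loads "v1" into the cache
store.write("k", "v2")                         # DB updated, cache entry deleted
second = store.read("k")                       # reloads "v2" from the DB
store.write("k", "v3", delete_fails=True)      # DB updated, deletion "fails"
third = store.read("k")                        # cache still returns "v2"
print(first, second, third, store.db["k"])     # v1 v2 v2 v3
```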

To sum up, we generally use solution 3. But is there a way to fix its weakness, the possibility that the cache deletion fails?

Cache maintenance solution 4

This is an improvement on solution 3: it still operates on the database first and then on the cache. It works as follows:

The database's binlog is used to delete the key asynchronously. Taking MySQL as an example, Alibaba's Canal can collect the binlog and send it to a message queue (MQ). The consumer handles each UPDATE message by deleting the corresponding cache key and confirms the message through the ACK mechanism, so a deletion that fails can be retried and the cache stays consistent with the data.

But there is a problem: what if the database uses master-slave replication?
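
The acknowledge-after-delete idea can be sketched without any real middleware. Everything here is an assumption made for illustration: the message shape (`type`, `key`), the `FlakyCache` class, and the in-memory `deque` that stands in for the MQ. It does not use Canal's or any MQ's actual API.

```python
from collections import deque


class FlakyCache:
    """In-memory cache whose first `failures` delete calls raise an error."""

    def __init__(self, data, failures):
        self.data = dict(data)
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data.pop(key, None)


def consume(queue, cache, max_attempts=5):
    """Process queued binlog messages.

    A message is acked (removed from the queue) only after the cache
    deletion succeeds; otherwise it stays at the head and is retried.
    Returns the number of processing attempts made.
    """
    attempts = 0
    while queue and attempts < max_attempts:
        attempts += 1
        message = queue[0]                # peek: not acked yet
        if message["type"] != "UPDATE":
            queue.popleft()               # nothing to invalidate, ack it
            continue
        try:
            cache.delete(message["key"])
        except ConnectionError:
            continue                      # no ack: retry the same message
        queue.popleft()                   # ack after a successful delete
    return attempts


cache = FlakyCache({"user:1": "old"}, failures=2)
queue = deque([{"type": "UPDATE", "key": "user:1"}])
attempts = consume(queue, cache)
print(attempts, "user:1" in cache.data, len(queue))  # 3 False 0
```

Because the message is only removed after the deletion succeeds, two simulated failures cost two retries rather than leaving a stale entry behind.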

Cache maintenance solution 5

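Synchronization between the master and slave databases is delayed. If the cache is deleted before the change has been synchronized to the slave, a read that misses the cache will fetch dirty (old) data from the slave and put it back into the cache. How do we solve this problem? The solution is to collect the binlog from the slave rather than from the master, so that the cache is deleted only after the slave has applied the change.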

Summary of cache maintenance

To sum up, in a distributed system where a cache and a database coexist, a write operation should operate on the database first and then on the cache. The full procedure is as follows:

(1) Check whether the cache contains the relevant data

(2) If the cache contains the data, return it

(3) If the cache does not contain the data, read it from the database, store it in the cache as key->value, and then return it

(4) If the data is updated, update the database first and then delete the cache

(5) To make sure the cache deletion in step (4) succeeds, delete the cache asynchronously using the binlog

(6) If the database is master-slave, collect the binlog from the slave

(7) If there is one master and multiple slaves, collect the binlog from every slave, and have the consumer delete the cache only after it has received the binlog from the last slave
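
Step (7) can be sketched as a consumer that tracks which slaves have reported a given change. This is a rough illustration under stated assumptions: the class `ReplicaAwareInvalidator`, the slave names, and the `change_id` used to match the same change across slaves are all hypothetical.

```python
class ReplicaAwareInvalidator:
    """Delete a cache key only after every slave has reported the change."""

    def __init__(self, cache, replicas):
        self.cache = cache
        self.replicas = set(replicas)
        self.pending = {}  # change_id -> slaves that have not reported yet

    def on_binlog(self, replica, change_id, key):
        """Record one slave's binlog event; return True if the key was deleted."""
        waiting = self.pending.setdefault(change_id, set(self.replicas))
        waiting.discard(replica)
        if waiting:
            return False               # some slave may still hold old data
        del self.pending[change_id]
        self.cache.pop(key, None)      # last slave reported: safe to delete
        return True


cache = {"k": "old"}
invalidator = ReplicaAwareInvalidator(cache, ["slave-1", "slave-2"])
after_first = invalidator.on_binlog("slave-1", change_id=1, key="k")
still_cached = "k" in cache
after_last = invalidator.on_binlog("slave-2", change_id=1, key="k")
print(after_first, still_cached, after_last, "k" in cache)  # False True True False
```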
