Rest is not something the lazy get to enjoy.
Preface
Caches are widely used in all kinds of projects because of their high concurrency and high performance, and reading from the cache works basically the same way everywhere. It follows this process:
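To make that flow concrete, here is a minimal sketch of the read path, assuming a redis client and a MySQL cursor like the ones used in the pseudocode later in the post (the key 'name' and the info table are just illustrative, not from the original article):

def read_name():
    value = redis.get('name')                              # 1. look in the cache first
    if value is not None:
        return value                                       # 2. cache hit: return it directly
    cursor.execute("select name from info where id=1;")    # 3. cache miss: query the database
    value = cursor.fetchone()[0]
    redis.set('name', value)                               # 4. write the value back into the cache
    return value                                           # 5. return the freshly read value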
But when it comes to updating, should you update the cache after updating the database, or just delete the cache? Or delete the cache first and then update the database? This is the point worth discussing.
Consistency schemes
In real project development, the data in the database and the cache must be kept consistent. Otherwise someone recharges 100 yuan, keeps refreshing, and still sees 0.01 yuan; isn't that embarrassing? In theory, setting an expiration time on the cache is the ultimate fallback for data consistency: with this approach every write goes to the database, and if the database write succeeds but the cache update fails, later reads will naturally fetch the new value from the database once the cache expires and then refresh the cache (a tiny sketch of this fallback appears right after the list of schemes below). The main question discussed here is how to guarantee consistency without relying on cache expiration. Three schemes are discussed:
① Update the database first and then the cache
② Delete the cache first, then update the database
③ Update the database first, then delete the cache
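Before diving into the three schemes, here is roughly what the expiration-time fallback mentioned above boils down to; the 600-second TTL is an arbitrary number for illustration, not from the original article:

redis.set('name', 'lili', ex=600)   # even if a later cache update fails, the stale value expires after 600 s
# the next read after expiry misses the cache, loads the new value from the database and refills the cache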
Update the database first and then the cache
As far as I know, this approach is generally opposed. Why? There are two main reasons; hear me out:
First, from the data-safety angle: suppose requests A and B operate on the same data concurrently. A updates the row in the database first, then B immediately updates it again, but because of network delay or similar reasons B's cache update lands before A's. What happens? The cache ends up holding A's value rather than B's latest data, so the cache and database are inconsistent.
Second, from the business-scenario angle: if a business writes to the database frequently but reads rarely, this scheme keeps updating the cache with data that may be overwritten many times before anyone reads it, which wastes performance.
Considering these two points, this scheme is decisively ruled out. The remaining two schemes are the controversial ones: do you delete the cache and then update the database, or update the database and then delete the cache?
Delete the cache before updating the database
Suppose request A performs an update while request B performs a query at the same time. Request A deletes the cache before its write; request B then finds the cache empty and goes to query the database. If A's write has not completed yet, B reads the old value and writes that old value back into the cache. When A finally writes the new value to the database, the cache and the database become inconsistent, and if the cache has no expiration time the data stays dirty forever.
This situation can be handled with the delayed double-delete strategy: delete the cache first, then perform the database write, and once the database update is done delete the cache again. The second deletion is there to remove the dirty data that a concurrent read request may have written back. Sleep for a moment before the second deletion: estimate how long your project's read business logic takes and add a few hundred milliseconds on top, so that the read request has finished and the dirty data it caused gets cleaned up. If MySQL uses a read-write-splitting (master-slave) architecture, master-slave replication lag can also cause inconsistency, so base the sleep after the write on that lag before deleting the cache again. The pseudocode for delayed double delete is as follows:
# pseudocode; assumes a redis client, a MySQL cursor and the time module are available
def delay_delete():
    redis.delete('name')                             # delete the cache before updating the database
    sql = "update info set name='lili' where id=1;"  # update the database
    cursor.execute(sql)
    time.sleep(1)                                    # with a MySQL master-slave setup, sleep a few hundred ms longer than the replication lag
    redis.delete('name')                             # delete the cache again
But could the second cache deletion fail? If it does, the cache and the database are still inconsistent, so how do we solve that? Let's look at the next scheme.
Update the database before deleting the cache
This is the classic Cache-Aside pattern: on a read, the application fetches the data from the cache and returns it on a hit; on a miss it reads the database, puts the result into the cache, and returns it. On an update, the application writes the data to the database first and then invalidates the corresponding cache entry. The original text is as follows:
If an application updates information, it can follow the write-through strategy by making the modification to the data store, and by invalidating the corresponding item in the cache.
When the item is next required, using the cache-aside strategy will cause the updated data to be retrieved from the data store and added back into the cache.
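In code, the write path of this scheme is simply "update the database, then delete the cached key". A minimal sketch, again assuming the same hypothetical redis client and MySQL cursor as above:

def update_name(new_name):
    sql = "update info set name=%s where id=1;"
    cursor.execute(sql, (new_name,))    # 1. update the database first
    redis.delete('name')                # 2. then invalidate the cached copy; the next read repopulates it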
Can this scheme produce inconsistent data? For example, in the following situation:
There are two requests A and B. A does the query while B does the update. Suppose the following happens:
① The cache happens to be invalid (expired)
② Request A queries the database and gets the old value
③ Request B writes the new value to the database
④ After the write succeeds, request B deletes the cache
⑤ Request A writes the old value it read into the cache, producing dirty data…
If the above sequence actually happens, it does produce inconsistent data. But think about how likely that is. For this to occur, request B's operation would have to be so fast that its database write finishes sooner than request A's database read; only then can step ④ happen before step ⑤. Yet database reads are much faster than writes (otherwise why bother with read-write splitting?), so the probability is very, very low. If you still can't let it go, you can set an expiration time on the cache, or use the delayed double-delete strategy from scheme ② to make sure the dirty value is deleted after the read request finishes.
Last question
One problem remains. The fix for scheme ③'s very-low-probability inconsistency is the delayed double delete from scheme ②, but as noted in scheme ②, what if the cache deletion fails? Doesn't that still leave the data inconsistent? The answer is a retry mechanism: if the deletion fails, try again. The retry scheme goes like this (a sketch of the flow follows the list below):
① Update the database
② Deleting the cache fails for some reason
③ Put the key whose deletion failed onto a message queue
④ The service code obtains the key to be deleted from the message queue
⑤ Continue to try the deletion operation until it succeeds
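A minimal sketch of this retry flow; the mq object with send/receive methods stands in for whatever message queue you actually use and is made up for illustration, not a real library API:

def update_with_retry(new_name):
    cursor.execute("update info set name=%s where id=1;", (new_name,))  # ① update the database
    try:
        redis.delete('name')                     # try to delete the cache
    except Exception:                            # ② the deletion failed for some reason
        mq.send('cache_delete_retry', 'name')    # ③ put the failed key onto a message queue

def retry_consumer():
    while True:
        key = mq.receive('cache_delete_retry')   # ④ business code takes the key to delete from the queue
        try:
            redis.delete(key)                    # ⑤ retry the deletion
        except Exception:
            mq.send('cache_delete_retry', key)   # still failing: requeue and try again until it succeeds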
Conclusion
This article was first published on the WeChat public account Program Yuan Xiaozhuang and simultaneously on Juejin (Nuggets).
Writing this up wasn't easy; please credit the source when reposting, and if you're passing by, be a dear and tap the like button before you go (╹▽╹)