A month ago —
The whole coupon center is divided into a front end and a back end. Xiao Hui was responsible for developing the back-end RPC interface, which contains two methods: "query coupons" and "claim coupons". The general structure of the project is shown below:
Two weeks later —
Xiao Hui: Look, this is the coupon query function in action!
Xiao Hui: Look, this is the coupon claim function in action!
Three days later,
The original coupon query interface was implemented like this:
The coupons are stored as a Redis List, and the query logic is simple:
1. Query the cache. If the cache exists, return the result
2. The cache does not exist. Query the database
3. Loop the query results into the cache
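The three steps above form a classic cache-aside read. A minimal sketch, simulating the Redis List with an in-memory map for illustration (the class, key, and `loadFromDb` placeholder are assumptions, not the article's original code; real code would use a Redis client with LRANGE/RPUSH):

```java
import java.util.*;
import java.util.concurrent.*;

public class CouponService {
    // Stand-in for a Redis LIST keyed by "coupon:list"
    static final ConcurrentHashMap<String, List<String>> redis = new ConcurrentHashMap<>();

    static List<String> loadFromDb() {
        // Placeholder for the database query
        return Arrays.asList("coupon-1", "coupon-2");
    }

    public static List<String> queryCoupons() {
        // 1. Query the cache; if it exists, return the result
        List<String> cached = redis.get("coupon:list");
        if (cached != null) return cached;
        // 2. Cache miss: query the database
        List<String> fromDb = loadFromDb();
        // 3. Loop the query results into the cache (RPUSH per item in Redis terms).
        //    Under concurrency, two threads can both pass the null check above
        //    and both append here -- this is the duplication bug discussed next.
        redis.computeIfAbsent("coupon:list", k -> new CopyOnWriteArrayList<>())
             .addAll(fromDb);
        return fromDb;
    }
}
```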
However, when the cache does not exist at some point in time and the request volume is high, a cache-concurrency problem can occur: multiple threads repeatedly query the DB and update the cache. (Note that this is not cache breakdown; many people confuse the two concepts.)
The repeated DB queries are the lesser problem; the repeated cache updates are the real one. If two threads enter step 3 at the same time and each performs an RPUSH, they end up inserting two copies of the same data into the coupon-list cache.
How do you solve it? With Java's locking mechanism? Obviously not, because a production environment is typically a cluster of multiple servers, and a JVM lock only works within a single process. So Xiao Hui thought of using a distributed lock.
There are many ways to implement a distributed lock: ZooKeeper, Memcached, or Redis. With Redis, a simple approach is to have the servers share a key and use the SETNX instruction.
When the first thread executes SETNX, the key is stored, which is equivalent to successfully acquiring the lock. When a subsequent thread executes SETNX on the same key, SETNX fails because the key already exists, which is equivalent to losing the race for the lock. At the same time, to prevent a thread from holding the lock for too long due to an unexpected failure, the program sets a one-second expiration time on the key.
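A minimal sketch of these SETNX-plus-expiry semantics, simulated with an in-memory map rather than a real Redis client (class and key names are illustrative; real code would issue something like `SET key value NX PX 1000` through a client library):

```java
import java.util.concurrent.*;

public class RedisLikeLock {
    // Simulates Redis SETNX plus a one-second EXPIRE using an in-memory map
    static final ConcurrentHashMap<String, Long> store = new ConcurrentHashMap<>();
    static final long TTL_MS = 1000;

    public static boolean tryLock(String key) {
        long now = System.currentTimeMillis();
        // Drop the key if its one-second expiry has passed (EXPIRE semantics)
        store.compute(key, (k, expiry) -> (expiry != null && expiry < now) ? null : expiry);
        // putIfAbsent mirrors SETNX: only the first caller gets to store a value
        return store.putIfAbsent(key, now + TTL_MS) == null;
    }

    public static void unlock(String key) {
        store.remove(key); // DEL in Redis terms
    }
}
```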
To summarize the modified logic:
1. Query the cache. If the cache exists, return the result
2. The cache does not exist. Query the database
3. Compete for the distributed lock
4. The lock is acquired successfully; loop the query results into the cache
5. Release the distributed lock
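The five steps above can be sketched as follows, again simulating Redis with in-memory maps (names and the `loadFromDb` placeholder are illustrative). Note that on lock failure this version simply returns the DB result rather than waiting, as the article points out later:

```java
import java.util.*;
import java.util.concurrent.*;

public class CouponQueryV2 {
    // In-memory stand-ins for the Redis LIST and the SETNX lock key
    static final ConcurrentHashMap<String, List<String>> cache = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, Boolean> lockKeys = new ConcurrentHashMap<>();

    static List<String> loadFromDb() {
        return Arrays.asList("coupon-1", "coupon-2"); // placeholder DB query
    }

    public static List<String> queryCoupons() {
        // 1. Query the cache; if it exists, return the result
        List<String> cached = cache.get("coupon:list");
        if (cached != null) return cached;
        // 2. Cache miss: query the database
        List<String> fromDb = loadFromDb();
        // 3. Compete for the distributed lock (SETNX-like)
        if (lockKeys.putIfAbsent("lock:coupon", Boolean.TRUE) == null) {
            try {
                // 4. Lock acquired: loop the results into the cache (RPUSH).
                //    This version is still racy -- see the two-thread walkthrough below.
                cache.computeIfAbsent("coupon:list", k -> new CopyOnWriteArrayList<>())
                     .addAll(fromDb);
            } finally {
                // 5. Release the distributed lock (DEL)
                lockKeys.remove("lock:coupon");
            }
        }
        // Lock failure: return the DB result directly, without spinning
        return fromDb;
    }
}
```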
Three days later,
The weird bug reappeared, because Xiao Hui's last change still had a fatal flaw. Assume the cache does not exist and two threads, A and B, enter the code block at almost the same time.
In phase 1, thread A has just started querying the coupon cache, while thread B is already trying to acquire the distributed lock:
In phase 2, thread A queries the database because the cache does not exist, while thread B has acquired the lock and starts updating the cache:
In phase 3, thread A tries to acquire the distributed lock, while thread B has already released it:
In phase 4, thread A acquires the lock and updates the cache a second time, while thread B has already returned successfully:
In this way, the cache is updated twice, and the data-duplication bug appears again.
How to break this deadlock? Have the thread check the coupon cache again after it successfully acquires the lock:
To summarize the modified logic:
1. Query the cache. If the cache exists, return the result
2. The cache does not exist. Query the database
3. Compete for the distributed lock
4. The lock is acquired successfully; check the cache again
5. If the cache still does not exist, loop the results of the query into the cache
6. Release the distributed lock
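The corrected flow can be sketched like this, again with in-memory stand-ins for Redis (names are illustrative). The only change from the previous version is the second cache check inside the lock:

```java
import java.util.*;
import java.util.concurrent.*;

public class CouponQueryV3 {
    // In-memory stand-ins for the Redis LIST and the SETNX lock key
    static final ConcurrentHashMap<String, List<String>> cache = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, Boolean> lockKeys = new ConcurrentHashMap<>();

    static List<String> loadFromDb() {
        return Arrays.asList("coupon-1", "coupon-2"); // placeholder DB query
    }

    public static List<String> queryCoupons() {
        // 1. Query the cache; if it exists, return the result
        List<String> cached = cache.get("coupon:list");
        if (cached != null) return cached;
        // 2. Cache miss: query the database
        List<String> fromDb = loadFromDb();
        // 3. Compete for the distributed lock (SETNX-like)
        if (lockKeys.putIfAbsent("lock:coupon", Boolean.TRUE) == null) {
            try {
                // 4. Lock acquired: check the cache AGAIN, in case another
                //    thread already populated it while we were at step 2
                if (cache.get("coupon:list") == null) {
                    // 5. Still missing: loop the results into the cache
                    cache.put("coupon:list", new CopyOnWriteArrayList<>(fromDb));
                }
            } finally {
                // 6. Release the distributed lock (DEL)
                lockKeys.remove("lock:coupon");
            }
        }
        return fromDb;
    }
}
```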
This mechanism of checking a second time has a name: double-checking (double-checked locking). The same approach is often used in thread-safe singleton patterns.
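For comparison, here is the standard double-checked locking singleton in Java (a textbook sketch, not code from the article):

```java
public class Singleton {
    // volatile prevents instruction reordering, so other threads can never
    // observe a half-constructed instance
    private static volatile Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {                   // first check, without the lock
            synchronized (Singleton.class) {
                if (instance == null) {           // second check, under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```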
Xiao Hui's recollection ends here —
A few points to add:
1. The distributed lock used here is not a "full-fledged" distributed lock: when a thread fails to grab the lock, it simply returns the DB query result directly, rather than spinning and waiting for the lock.
2. Why is the coupon information cached as a Redis List instead of one long JSON string for the whole list? This is a business requirement: a List makes it easy to modify an individual coupon's information in place (with the LSET instruction).
3. Why not use Redis's Set or Hash type to store the coupon list and get automatic deduplication? With a Set, deduplication requires comparing entire members, and each coupon is a long JSON string, so comparison is inefficient. A Hash can deduplicate efficiently, but it does not fundamentally solve the repeated-update problem.
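As an illustration of that design choice, here is a minimal in-memory sketch of the LSET semantics (the list contents are made up; with the Jedis client the real-world equivalent would be a call along the lines of `jedis.lset(key, index, json)`):

```java
import java.util.*;

public class CouponListStore {
    // In-memory stand-in for a Redis LIST of coupon JSON strings
    static final List<String> coupons = new ArrayList<>(Arrays.asList(
        "{\"id\":1,\"discount\":5}",
        "{\"id\":2,\"discount\":10}"));

    // Mirrors LSET: replace only the element at the given index, instead of
    // deserialising, editing, and re-serialising one giant JSON string
    public static void lset(int index, String json) {
        coupons.set(index, json);
    }
}
```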
—— The END ——