I suddenly noticed that the XXX topics I had written about before attracted little attention. Maybe I simply didn't write them well; maybe most of them were idle scribblings. Without further ado: what prompted me to look into this topic was a situation at work where I needed to implement multi-level caching myself. In Spring Boot, for example, we can use either Redis or EhCache, but as far as I know only one cache is active by default. Suppose I now need EhCache and Redis at the same time, with EhCache serving as the local level-1 cache. Ignoring the caching built into database ORM frameworks such as MyBatis and Hibernate, and considering only application-level caching: with Spring Boot, EhCache, and Redis already in place, combining EhCache and Redis via custom annotations, aspect interception, and a two-level cache is not very difficult. The difficulty lies in the details, such as how to make my own annotations and aspects obey Spring Boot's caching switch @EnableCaching, how to keep the local EhCache and the centralized Redis cache in sync in a multi-node deployment, and how to lock the cache. This topic focuses on locking when using a cache.
In fact, back when we first learned Java's thread-synchronization tools, we met synchronized, Lock, read/write locks, and so on. The API documentation for the ReentrantReadWriteLock class contains an example that demonstrates using a read/write lock around a cached object, as follows:
class CachedData {
    Object data;
    volatile boolean cacheValid;
    ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    void processCachedData() {
        rwl.readLock().lock();
        if (!cacheValid) {
            // Must release read lock before acquiring write lock
            rwl.readLock().unlock();
            rwl.writeLock().lock();
            // Recheck state because another thread might have acquired
            // write lock and changed state before we did.
            if (!cacheValid) {
                data = ...
                cacheValid = true;
            }
            // Downgrade by acquiring read lock before releasing write lock
            rwl.readLock().lock();
            rwl.writeLock().unlock(); // Unlock write, still hold read
        }
        use(data);
        rwl.readLock().unlock();
    }
}
Read/write lock, double-checked locking, lock downgrading: the craftsmanship goes without saying. I remembered this example fondly and felt the code was perfect for showing off, so I planned to play the same trick in my own cache implementation. But after copying it over and changing it to the logic I needed, I spotted the problem: played this way, the showing off might not come off at all. So I searched the Internet to see whether others had hit the same embarrassment, and turned up plenty of code like the following (Baidu search: Java read and write lock cache):
package test;

import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Design a cache system: an application of read/write locks.
 * JDK 1.5 ships built-in read/write locks: read/read does not exclude,
 * read/write excludes, write/write excludes.
 * Why use a read/write lock? In a word, to improve system performance. How?
 * Read operations need no mutual exclusion among themselves, whereas making a
 * method thread-safe with the synchronized keyword serializes reads and writes
 * alike across all threads. With many concurrent reads that becomes a
 * performance bottleneck.
 *
 * So when multiple threads call a method that performs both reads and writes,
 * the thread-safe way to improve performance is a read/write lock: reads and
 * writes exclude each other, writes exclude each other, reads do not.
 * @author zhurudong
 */
public class CacheTest {

    private Map<String, Object> map = new HashMap<String, Object>();
    private ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

    /**
     * Get data by key.
     * @param key
     * @return
     */
    public Object getData(String key) {
        readWriteLock.readLock().lock();
        Object value = null;
        try {
            // Try to fetch the data from the cache
            value = map.get(key);
            if (value == null) {
                readWriteLock.readLock().unlock(); // Target value is null: release the read lock...
                readWriteLock.writeLock().lock();  // ...and acquire the write lock
                try {
                    value = map.get(key); // Re-check under the write lock: take this step very seriously
                    if (value == null) {
                        // Simulate a DB operation
                        value = new Random().nextInt(10000) + "test";
                        map.put(key, value);
                        System.out.println("db completed!");
                    }
                    readWriteLock.readLock().lock(); // Downgrade: take the read lock before releasing the write lock
                } finally {
                    readWriteLock.writeLock().unlock();
                }
            }
        } finally {
            readWriteLock.readLock().unlock();
        }
        return value;
    }

    /**
     * Test main.
     * @param args
     */
    public static void main(String[] args) {
        final CacheTest cache = new CacheTest();
        final String key = "user";
        for (int i = 0; i < 1000; i++) {
            new Thread() {
                public void run() {
                    System.out.println(cache.getData(key));
                }
            }.start();
        }
    }
}
I'm sure many people look at the code above and, like me after adapting the API-doc example, find it damn rigorous. After my change I thought it through (I like to validate logic against real-life scenes; after all, the ways a program handles a scenario are abstractions of the real world) and thought of the storage lockers at a supermarket entrance: a read operation is like picking up a package from the locker, a write operation is like depositing one, and the Map is the cabinet itself. Restore the code above to that scene: when I take my receipt to pick up a package, others can pick up theirs at the same time; but when I go to deposit a package, everyone except me must wait behind me, and only after I have finished may anyone approach the cabinet. Isn't that overbearing behavior exactly how the code above performs? Having read that description, do you still want to show off with it?

Look more closely at the code in the API documentation: the locked method takes no parameters, and the cached data is a member variable shared as the cached value by every thread calling processCachedData. Shared data naturally requires synchronization in a multi-threaded environment. The code above, however, takes a key, so the cached value fetched through one key may or may not be shared by all calling threads, and data under different keys does not affect each other. Locking in this case is therefore best differentiated by the input parameter. That led me to the following idea; I wonder whether anyone else has thought of it:
public Object getData1(String key) {
    if (key == null) return null;
    Object data = null;
    synchronized (key.intern()) { // lock per key, not one global lock
        data = dataMap.get(key);
        if (data == null) {
            data = "query object"; // simulate the DB query
            dataMap.put(key, data);
        }
    }
    return data;
}
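Why synchronize on key.intern() rather than on key itself? Here is a minimal sketch of my own (plain JDK; the class name InternDemo is made up) showing that two strings with equal contents need not be the same object, so locking on the raw reference may amount to no locking at all:

// Made-up demo class, not from any library
public class InternDemo {
    public static void main(String[] args) {
        String a = "user";                   // reference into the string pool
        String b = new String("user");       // a distinct object on the heap
        System.out.println(a == b);          // false: equal contents, different objects
        System.out.println(a == b.intern()); // true: intern() returns the pooled reference
    }
}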
In getData1 above, the key serves as the monitor, but locking on the key alone would be a little sloppy: as the sketch shows, two strings may have identical contents while one is a reference into the string pool and the other a reference into the heap; they are not the same object at all, and the result could be equivalent to not locking. Hence key.intern(), so that both return the pooled reference. In locker terms, this control means different people holding different receipts can access their packages at the same time, and if two people arrive with the same receipt, one accesses the package while the other waits. In the real world every receipt should be unique, but in a program it is reasonable to assume keys repeat. Back to the program: threads operating on data under different keys do not wait for one another; threads operating on the same key must wait for the lock to be released. That looks very reasonable, but in the typical caching scenario the cache's greatest value is that a short burst of repeated reads of the same key can all be served quickly from the cache. If we do not distinguish reads from writes and lock uniformly, multiple threads cannot read the same key concurrently and must wait for one another, which also hurts performance.

So, the two locking schemes above: the first distinguishes reads from writes, so reads do not exclude reads, but it ignores the fact that different keys need no mutual exclusion at all; the second handles the different-keys case without mutual exclusion, but fails to let reads of the same key proceed concurrently. Unfortunately, the Java API provides no way to obtain different read/write locks from different parameters, which is embarrassing. I figured every cache that claims high performance must run into this same embarrassment, so I read the source of the pure-Java cache EhCache and found its core idea: by default, build an array of 2048 ReentrantReadWriteLocks; hash each key to an int, take the array index corresponding to that hash, and use the ReentrantReadWriteLock found there to control reading and writing for that key. Reads of the same key do not exclude each other, but reads and writes do. The extracted code looks like this (appropriately simplified):
// In EhCache this is actually an array of ReadWriteLockSync, a wrapper around
// ReentrantReadWriteLock. I don't like beating around the bush, so here it is
// the ReentrantReadWriteLock array directly.
final ReadWriteLock[] rwlocks = new ReentrantReadWriteLock[2048];
{
    for (int i = 0; i < rwlocks.length; i++) {
        rwlocks[i] = new ReentrantReadWriteLock();
    }
}

/**
 * Select the lock index for a key.
 */
public static int selectLock(final Object key, int numberOfLocks) throws CacheException {
    /*
     * The goal is to ensure that the number of locks is a power of two (2^n).
     * For example, 8 and 7: 1000 & 0111 = 0000. If the result is non-zero,
     * the number cannot be a power of two.
     */
    int number = numberOfLocks & (numberOfLocks - 1);
    if (number != 0) {
        throw new CacheException("Lock number must be a power of two: " + numberOfLocks);
    }
    if (key == null) {
        return 0;
    } else {
        /*
         * When numberOfLocks is a power of two, this is equivalent to the % operation:
         * 8 % 4  = 8 & 3   -> 1000 & 0011 = 0000 = 0
         * 11 % 8 = 11 & 7  -> 1011 & 0111 = 0011 = 3
         * Every time I see someone use bit operations like this, I can only sigh:
         * now that is showing off!
         */
        int hash = hash(key) & (numberOfLocks - 1);
        return hash;
    }
}

public static int hash(Object object) {
    int h = object.hashCode();
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
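A quick sanity check of those two bit tricks, in a throwaway snippet of my own (not EhCache code; the class name MaskDemo is made up):

// Made-up demo class verifying the power-of-two test and the mask-as-modulo trick
public class MaskDemo {
    public static void main(String[] args) {
        int n = 2048; // a power of two has exactly one bit set
        System.out.println((n & (n - 1)) == 0); // true: the power-of-two test passes
        int h = "user".hashCode();
        // For a power-of-two n, masking with n - 1 is a non-negative modulo,
        // and this holds even when the hash is negative.
        System.out.println((h & (n - 1)) == Math.floorMod(h, n)); // true
    }
}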
The matching read/write locking code is as follows:
/*
 * We use a map with concurrency support so the container itself has no
 * concurrency problems: its individual access operations are atomic. The
 * synchronization in getData2 is business-level concurrency control, there to
 * prevent dirty reads of the business data. Container-level concurrency
 * control matters too: without it, two different keys written to the same slot
 * could leave the data inconsistent -- like two people scrambling to store
 * their packages in the same box. Using a concurrent collection does not mean
 * business code needs no concurrency control; the concurrency library only
 * guarantees that each individual read or write is atomic.
 */
private Map<String, Object> dataMap = new ConcurrentHashMap<>();

public Object getData2(String key) {
    int locknum = selectLock(key, 2048);
    ReadWriteLock rwlock = rwlocks[locknum];
    rwlock.readLock().lock();
    try {
        Object data = dataMap.get(key);
        if (data == null) {
            rwlock.readLock().unlock();
            rwlock.writeLock().lock();
            try {
                data = dataMap.get(key); // recheck: another thread may have loaded it already
                if (data == null) {
                    data = "query object"; // simulate the DB query
                    dataMap.put(key, data);
                }
                rwlock.readLock().lock(); // downgrade: take the read lock before releasing the write lock
            } finally {
                rwlock.writeLock().unlock();
            }
        }
        return data;
    } finally {
        rwlock.readLock().unlock();
    }
}
The idea throughout is to reduce lock granularity as much as possible: one read/write lock is replaced by 2048 small locks, much like ConcurrentHashMap before JDK 1.8, which used lock striping split into 16 segments; the idea is the same, shrink the lock granularity to achieve efficient concurrency control. But once read/write traffic grows past a certain point, 16-way striping alone presumably helps little, which is presumably why the official implementation switched to a CAS-based, largely lock-free algorithm in 1.8. That makes me wonder: does a newer EhCache (the version discussed in this article is 2.10.4) also use CAS? I haven't studied that source yet; the human pursuit of performance is endless, and interested friends can explore it for themselves.
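Incidentally, since dataMap above is already a ConcurrentHashMap, on JDK 8 the same per-key exclusion for the load can be had with no explicit locks at all via Map.computeIfAbsent. A minimal sketch of my own (the class and method names are made up; this is not how EhCache works):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Made-up demo class illustrating per-key loading without explicit locks
public class ComputeIfAbsentCache {
    private final Map<String, Object> dataMap = new ConcurrentHashMap<>();

    public Object getData3(String key) {
        Object data = dataMap.get(key); // non-blocking fast path for cache hits
        if (data == null) {
            // Only one thread runs the loader for a given absent key;
            // other threads asking for the same key wait for its result,
            // while threads on different keys usually proceed in parallel.
            data = dataMap.computeIfAbsent(key, k -> "query object" /* simulate the DB query */);
        }
        return data;
    }
}

The up-front get keeps cache hits entirely free of blocking; computeIfAbsent only comes into play on a miss.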
Well, that's the end of it. If there are any mistakes in the article, please point them out. The next topic will discuss how to seamlessly implement a custom level-2 cache on top of Spring Boot with EhCache and Redis.