Public search “code road mark”, point of concern not lost!
MyBatis has a level 1 cache and a level 2 cache. However, we usually run MyBatis with its default configuration and know little about how the caching works. The earlier article on Executor touched on MyBatis caching, but only briefly and not systematically.
This article walks through the MyBatis cache mechanism in detail, covering the execution flow and working principles of the level 1 and level 2 caches, so that they can be used sensibly during development. As before, we start with a "panorama" to get an overall picture of the cache mechanism and then analyze each part of the process in turn. As shown in the figure, the level 1 cache is implemented in BaseExecutor and is only valid for the lifetime of a SqlSession; the level 2 cache is implemented in CachingExecutor and is shared globally at the namespace level.
If the level 1 and level 2 caches are both enabled, a query proceeds as follows: MyBatis checks the level 2 cache first; if the level 2 cache misses, it checks the level 1 cache; if the level 1 cache also misses, it queries the database and then updates the level 1 and level 2 caches. The level 1 cache is enabled by default and cannot be turned off; the level 2 cache is not active by default and has to be enabled per mapper. With this general picture in mind, let's look at the implementation of the level 1 and level 2 caches in turn.
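The lookup order can be sketched in plain Java. This is a simplified illustration, not MyBatis code: TwoLevelLookup, sharedCache, localCache and the database function are hypothetical names, and the sketch writes to the shared cache immediately, whereas the real level 2 cache defers writes until commit (covered later).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified sketch of the lookup order: level 2 cache -> level 1 cache -> database.
// All names are hypothetical; this is not MyBatis source code.
class TwoLevelLookup {
    private final Map<String, Object> sharedCache;                   // stands in for the namespace-level (level 2) cache
    private final Map<String, Object> localCache = new HashMap<>();  // stands in for the session-level (level 1) cache
    private final Function<String, Object> database;                 // stands in for the real query
    int databaseHits = 0;

    TwoLevelLookup(Map<String, Object> sharedCache, Function<String, Object> database) {
        this.sharedCache = sharedCache;
        this.database = database;
    }

    Object query(String key) {
        Object value = sharedCache.get(key);   // 1. level 2 cache
        if (value != null) {
            return value;
        }
        value = localCache.get(key);           // 2. level 1 cache
        if (value == null) {
            databaseHits++;
            value = database.apply(key);       // 3. database
            localCache.put(key, value);
        }
        sharedCache.put(key, value);           // assumption: written at once; MyBatis waits for commit
        return value;
    }
}

class TwoLevelLookupDemo {
    public static void main(String[] args) {
        Map<String, Object> shared = new HashMap<>();
        TwoLevelLookup sessionA = new TwoLevelLookup(shared, key -> "row-for-" + key);
        TwoLevelLookup sessionB = new TwoLevelLookup(shared, key -> "row-for-" + key);
        sessionA.query("user:1");   // goes to the database
        sessionB.query("user:1");   // served from the shared cache
        System.out.println(sessionA.databaseHits + " " + sessionB.databaseHits); // prints "1 0"
    }
}
```

Each TwoLevelLookup instance plays the role of one session: the local map is private to it, while the shared map is passed in and visible to every session.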
Level 1 cache
Let's start with the simpler level 1 cache. The level 1 cache is implemented in BaseExecutor and uses PerpetualCache, an implementation of the Cache interface, as its storage container; internally PerpetualCache is backed by a HashMap. This is much like the key-value storage or Redis caching we do every day. Naturally, we need to answer the following questions:
- How and when does MyBatis write to the cache?
- How and when does MyBatis read from the cache?
- When does MyBatis invalidate the cache, and what invalidation strategies does it use?
- What methods do the Cache interface and its implementation classes provide, and how does MyBatis use them?
- The level 1 cache is a local cache inside the application. In distributed scenarios, can it cause problems such as dirty reads, and how can they be solved?
That is quite a few questions. The first three concern how the cache is used, the fourth is about the source code, and the last is about how to use the level 1 cache given its characteristics. With these questions in mind, let's get down to business.
Cache interface and PerpetualCache
The Cache interface lives in the org.apache.ibatis.cache package and is the core interface of MyBatis caching; both the level 1 and level 2 caches depend on it. The class diagram below shows the inheritance relationships and the central position of Cache. The level 1 cache only uses PerpetualCache, so at this stage we only look at the source code of the Cache interface and of PerpetualCache. The Javadoc of the Cache interface:
/**
* SPI for cache providers.
* One instance of cache will be created for each namespace.
* The cache implementation must have a constructor that receives the cache id as an String parameter.
* MyBatis will pass the namespace as id to the constructor.
*
* <pre>
* public MyCache(final String id) {
* if (id == null) {
* throw new IllegalArgumentException("Cache instances require an ID");
* }
* this.id = id;
* initialize();
* }
* </pre>
*
* @author Clinton Begin
*/
The Cache interface defines the SPI of the MyBatis cache architecture. One Cache instance is created per namespace. The Javadoc also requires every implementation class to accept the namespace through its constructor and use it as the cache id, so each Cache instance has a unique id. The Cache interface definition:
public interface Cache {
  /**
   * Gets the unique identifier of the cache.
   *
   * @return The identifier of this cache
   */
  String getId();

  /**
   * Adds an entry to the cache; the key is usually a CacheKey.
   *
   * @param key Can be any object but usually it is a {@link CacheKey}
   * @param value The result of a select.
   */
  void putObject(Object key, Object value);

  /**
   * Gets an entry from the cache by key.
   *
   * @param key The key
   * @return The object stored in the cache.
   */
  Object getObject(Object key);

  /**
   * Removes the entry for the key from the cache.
   *
   * @param key The key
   * @return Not used
   */
  Object removeObject(Object key);

  /**
   * Clears this cache instance.
   */
  void clear();
}
The Cache interface is small. It provides the basic operations: add, get, remove and clear. Add, get and remove all take a key, which is usually a CacheKey; CacheKey is the utility class MyBatis uses to build cache keys, with rules that keep the hash robust. As for PerpetualCache, it uses a HashMap as its storage container, so each method of the Cache interface maps directly onto a HashMap operation, and we will not go through its code here.
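To make the "HashMap behind a Cache interface" idea concrete, here is a minimal sketch. SimpleCache and MapBackedCache are hypothetical stand-ins written for this article; they mirror the five methods listed above but are not the MyBatis classes themselves.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Cache interface shown above (same five methods).
interface SimpleCache {
    String getId();
    void putObject(Object key, Object value);
    Object getObject(Object key);
    Object removeObject(Object key);
    void clear();
}

// Sketch of a HashMap-backed implementation in the spirit of PerpetualCache.
class MapBackedCache implements SimpleCache {
    private final String id;
    private final Map<Object, Object> store = new HashMap<>();

    MapBackedCache(String id) {
        if (id == null) {
            throw new IllegalArgumentException("Cache instances require an ID");
        }
        this.id = id;
    }

    // Every cache operation is a one-line delegation to the map.
    @Override public String getId() { return id; }
    @Override public void putObject(Object key, Object value) { store.put(key, value); }
    @Override public Object getObject(Object key) { return store.get(key); }
    @Override public Object removeObject(Object key) { return store.remove(key); }
    @Override public void clear() { store.clear(); }
}

class MapBackedCacheDemo {
    public static void main(String[] args) {
        SimpleCache cache = new MapBackedCache("LocalCache");
        cache.putObject("key-1", "value-1");
        System.out.println(cache.getObject("key-1"));   // prints "value-1"
        cache.clear();
        System.out.println(cache.getObject("key-1"));   // prints "null"
    }
}
```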
Cache management
Managing the level 1 cache involves creating the PerpetualCache, querying the cache, adding to the cache and clearing the cache. Apart from creation, these steps are triggered while the Executor's update, query, commit and rollback methods execute.
Cache creation
The PerpetualCache is initialized in the BaseExecutor constructor with the id LocalCache, which is why the level 1 cache is also called the local cache. The initialization code looks like this:
protected BaseExecutor(Configuration configuration, Transaction transaction) {
  this.transaction = transaction;
  this.deferredLoads = new ConcurrentLinkedQueue<>();
  // Cache initialization
  this.localCache = new PerpetualCache("LocalCache");
  this.localOutputParameterCache = new PerpetualCache("LocalOutputParameterCache");
  this.closed = false;
  this.configuration = configuration;
  this.wrapper = this;
}
Cache query and add
The purpose of the cache is to speed up queries, so the cache lookup is triggered in BaseExecutor#query. If the query hits the cache, the cached content is returned directly. On a miss, the result is loaded from the database and added to the cache so that the next query is fast. The cache handling during a query is in BaseExecutor#query and BaseExecutor#queryFromDatabase:
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
  ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
  if (closed) {
    throw new ExecutorException("Executor was closed.");
  }
  if (queryStack == 0 && ms.isFlushCacheRequired()) {
    clearLocalCache();
  }
  List<E> list;
  try {
    queryStack++;
    // Check whether the cache is hit and use the cached result if it is
    list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
    if (list != null) {
      handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
    } else {
      // Cache miss: query the database
      list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
    }
  } finally {
    queryStack--;
  }
  if (queryStack == 0) {
    for (DeferredLoad deferredLoad : deferredLoads) {
      deferredLoad.load();
    }
    // issue #601
    deferredLoads.clear();
    // If the scope of the local cache (level 1 cache) is STATEMENT, clear the cache
    if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
      // issue #482
      clearLocalCache();
    }
  }
  return list;
}

private <E> List<E> queryFromDatabase(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
  List<E> list;
  // Reserve the cache slot with a placeholder
  localCache.putObject(key, EXECUTION_PLACEHOLDER);
  try {
    // Query the database
    list = doQuery(ms, parameter, rowBounds, resultHandler, boundSql);
  } finally {
    // Remove the placeholder
    localCache.removeObject(key);
  }
  // Add the database query result to the cache
  localCache.putObject(key, list);
  if (ms.getStatementType() == StatementType.CALLABLE) {
    localOutputParameterCache.putObject(key, parameter);
  }
  return list;
}
Clear the cache
When the underlying data changes, stale entries in the cache must be cleared to keep the data consistent, so that the next query loads the latest data from the database. Any operation that changes the data is therefore a point at which the cache is cleared.
In BaseExecutor, the update, commit and rollback methods are all associated with data changes. The source code confirms that each of them calls clearLocalCache internally to clear the cache:
public int update(MappedStatement ms, Object parameter) throws SQLException {
  ErrorContext.instance().resource(ms.getResource()).activity("executing an update").object(ms.getId());
  if (closed) {
    throw new ExecutorException("Executor was closed.");
  }
  clearLocalCache();
  return doUpdate(ms, parameter);
}

public void commit(boolean required) throws SQLException {
  if (closed) {
    throw new ExecutorException("Cannot commit, transaction is already closed");
  }
  clearLocalCache();
  flushStatements();
  if (required) {
    transaction.commit();
  }
}

public void rollback(boolean required) throws SQLException {
  if (!closed) {
    try {
      clearLocalCache();
      flushStatements(true);
    } finally {
      if (required) {
        transaction.rollback();
      }
    }
  }
}

public void clearLocalCache() {
  if (!closed) {
    localCache.clear();
    localOutputParameterCache.clear();
  }
}
Note that in the query method, if the scope of the local cache (level 1 cache) is STATEMENT, clearLocalCache is also called once the query has finished.
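The read, write and invalidation rules above can be condensed into a small sketch. SessionSketch and its fields are hypothetical names for illustration only; the in-memory table map stands in for the database, and the statementScope flag imitates the STATEMENT cache scope.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a session-scoped cache; not MyBatis source code.
class SessionSketch {
    private final Map<String, String> table = new HashMap<>();       // stands in for the database
    private final Map<String, String> localCache = new HashMap<>();  // stands in for the level 1 cache
    private final boolean statementScope;                            // imitates localCacheScope = STATEMENT
    int databaseReads = 0;

    SessionSketch(boolean statementScope) {
        this.statementScope = statementScope;
    }

    String query(String id) {
        String row = localCache.get(id);
        if (row == null) {
            databaseReads++;
            row = table.get(id);
            if (row != null) {
                localCache.put(id, row);   // cache the result for the next query
            }
        }
        if (statementScope) {
            localCache.clear();            // STATEMENT scope: nothing survives the statement
        }
        return row;
    }

    void update(String id, String value) {
        localCache.clear();                // clear first, then write, as BaseExecutor#update does
        table.put(id, value);
    }
}

class SessionSketchDemo {
    public static void main(String[] args) {
        SessionSketch session = new SessionSketch(false);
        session.update("1", "alice");
        session.query("1");
        session.query("1");                               // second read is a cache hit
        System.out.println(session.databaseReads);        // prints "1"
        session.update("1", "bob");                       // invalidates the cache
        System.out.println(session.query("1"));           // prints "bob"
        System.out.println(session.databaseReads);        // prints "2"
    }
}
```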
Level 1 Cache summary
Life cycle
As we learned in the previous article, an Executor is held by a SqlSession, and the level 1 cache, localCache, is just a field of BaseExecutor. Therefore, when the SqlSession's life ends, the level 1 cache is reclaimed with it. In other words, the level 1 cache can only be used within the life cycle of a single SqlSession and cannot be shared between different SqlSessions. The relationship between them is shown below:
How do distributed systems avoid data inconsistencies
The level 1 cache is only shared within a single SqlSession. When there are multiple SqlSessions, or in a distributed environment, the cache invalidation triggered by methods such as update and commit does not reach other SqlSessions, so those sessions can read stale data. This problem can be avoided by changing the scope of the level 1 cache (localCacheScope) to STATEMENT.
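In the MyBatis configuration file this corresponds to the localCacheScope setting. A minimal fragment, assuming the standard XML configuration with a settings element:

```xml
<settings>
  <!-- SESSION (the default) keeps results for the whole SqlSession;
       STATEMENT clears the local cache after each statement -->
  <setting name="localCacheScope" value="STATEMENT"/>
</settings>
```

The trade-off is that repeated identical queries within one session go back to the database each time.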
Cache read and write timing and principle
The level 1 cache is implemented by PerpetualCache, which reads and writes entries through an internal HashMap, so in essence the level 1 cache is a HashMap. The key in the map is a CacheKey built from the MappedStatement and the query's parameters, which identifies each query uniquely.
The level 1 cache exists to speed up queries. The cache is checked before a query runs. On a hit, the cached content is returned as the result and the database is not queried. On a miss, the database query is executed, the result is added to the cache, and the result is returned.
Level 2 cache
The life cycle of the level 1 cache is limited to a SqlSession. In practice, however, we create many SqlSession objects, and cached content cannot be shared between them. The level 2 cache solves this problem. It is implemented by CachingExecutor and is consulted before the level 1 cache. Once the level 2 cache is enabled, the Executor's query flow becomes: level 2 cache > level 1 cache > database.
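The idea of a composite key can be sketched as follows. QueryKey is a hypothetical class written for this article: it only shows that two queries with the same statement id, SQL and parameters map to the same entry. The real CacheKey uses its own hashing scheme and may include more components; the statement id and SQL below are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical composite key; not the MyBatis CacheKey class.
final class QueryKey {
    private final List<Object> parts = new ArrayList<>();

    QueryKey(String statementId, String sql, Object... parameters) {
        parts.add(statementId);
        parts.add(sql);
        parts.addAll(Arrays.asList(parameters));
    }

    // Two keys are equal when all of their components are equal, in order.
    @Override
    public boolean equals(Object other) {
        return other instanceof QueryKey && parts.equals(((QueryKey) other).parts);
    }

    @Override
    public int hashCode() {
        return parts.hashCode();
    }
}

class QueryKeyDemo {
    public static void main(String[] args) {
        String id = "UserMapper.selectById";            // made-up statement id
        String sql = "select * from user where id = ?"; // made-up SQL
        Map<QueryKey, String> cache = new HashMap<>();
        cache.put(new QueryKey(id, sql, 1), "alice");
        String hit = cache.get(new QueryKey(id, sql, 1));   // same components: hit
        String miss = cache.get(new QueryKey(id, sql, 2));  // different parameter: miss
        System.out.println(hit + " " + miss);               // prints "alice null"
    }
}
```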
Enabling Level 2 Caching
To enable level-2 cache, complete the following configurations:
- Configure the cacheEnabled setting in the MyBatis configuration file. It globally enables or disables all caches configured in any mapper file. The default value is true.
<setting name="cacheEnabled" value="true"/>
- Add a cache node to the mapper configuration file.
<!-- With the defaults, this single line is enough -->
<cache/>
This uses the default cache policy. The cache node has the following attributes:
- eviction: the eviction policy. Supported values are LRU (least recently used, the default), FIFO (first in, first out), SOFT and WEAK.
- flushInterval: the flush interval. It can be any positive integer and should be a reasonable amount of time in milliseconds. By default it is not set, meaning there is no flush interval and the cache is only flushed when a statement that flushes it is called.
- size: the number of references the cache holds. The default is 1024. It can be any positive integer; keep in mind the size of the objects being cached and the memory available in the runtime environment.
- readOnly: can be set to true or false. A read-only cache returns the same instance of the cached object to all callers, so these objects must not be modified; this gives a significant performance boost. A read/write cache returns a copy of the cached object (through serialization). It is slower but safer, so the default is false.
The default settings have the following effects:
- The results of all SELECT statements in the mapping statement file will be cached.
- All INSERT, UPDATE, and DELETE statements in the mapping statement file flush the cache.
- The cache uses the Least Recently Used (LRU) algorithm to evict entries that are no longer needed.
- The cache is not flushed regularly (that is, there are no flush intervals).
- The cache holds 1024 references to lists or objects (whichever is returned by the query method).
- The cache is treated as a read/write cache, which means that the retrieved object is not shared and can be safely modified by the caller without interfering with potential changes made by other callers or threads.
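The difference between a read-only and a read/write cache can be illustrated with a small sketch. CopyOnReadCache is a hypothetical class, not MyBatis code: in read/write mode it stores the serialized bytes and hands every caller a fresh deserialized copy, while in read-only mode it returns the stored instance itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the readOnly switch; not MyBatis source code.
class CopyOnReadCache {
    private final Map<Object, Object> store = new HashMap<>();
    private final boolean readOnly;

    CopyOnReadCache(boolean readOnly) {
        this.readOnly = readOnly;
    }

    void put(Object key, Serializable value) {
        if (readOnly) {
            store.put(key, value);             // keep the instance itself
        } else {
            store.put(key, serialize(value));  // keep only the bytes
        }
    }

    Object get(Object key) {
        Object stored = store.get(key);
        if (stored == null || readOnly) {
            return stored;                     // shared instance (or a miss)
        }
        return deserialize((byte[]) stored);   // fresh copy for each caller
    }

    private static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize cached value", e);
        }
    }

    private static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize cached value", e);
        }
    }
}

class CopyOnReadDemo {
    public static void main(String[] args) {
        ArrayList<String> rows = new ArrayList<>();
        rows.add("alice");

        CopyOnReadCache readWrite = new CopyOnReadCache(false);
        readWrite.put("q", rows);
        Object first = readWrite.get("q");
        Object second = readWrite.get("q");
        System.out.println(first.equals(second) + " " + (first == second)); // prints "true false"

        CopyOnReadCache readOnly = new CopyOnReadCache(true);
        readOnly.put("q", rows);
        System.out.println(readOnly.get("q") == rows);                      // prints "true"
    }
}
```

Because the read/write mode relies on serialization, result objects stored in a read/write level 2 cache need to implement Serializable.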
Level 2 cache initialization
The level 2 cache is initialized when the mapper's XML file is parsed. You can trace the parsing through the call chain below. The cacheElement method reads the configuration, falling back to defaults, and then calls MapperBuilderAssistant#useNewCache to initialize the level 2 cache object.
- org.apache.ibatis.session.SqlSessionFactoryBuilder#build(java.io.InputStream, java.lang.String, java.util.Properties)
- org.apache.ibatis.builder.xml.XMLConfigBuilder#parse
- org.apache.ibatis.builder.xml.XMLConfigBuilder#parseConfiguration
- org.apache.ibatis.builder.xml.XMLConfigBuilder#mapperElement
- org.apache.ibatis.builder.xml.XMLMapperBuilder#parse
- org.apache.ibatis.builder.xml.XMLMapperBuilder#configurationElement
- org.apache.ibatis.builder.xml.XMLMapperBuilder#cacheElement
- org.apache.ibatis.builder.MapperBuilderAssistant#useNewCache
The cacheElement method reads the configuration and initializes it. Any attribute that is not configured takes its default value.
private void cacheElement(XNode context) {
  if (context != null) {
    // Read the cache type; the default is PERPETUAL
    String type = context.getStringAttribute("type", "PERPETUAL");
    Class<? extends Cache> typeClass = typeAliasRegistry.resolveAlias(type);
    // Read the eviction policy and resolve its type; the default is LRU
    String eviction = context.getStringAttribute("eviction", "LRU");
    Class<? extends Cache> evictionClass = typeAliasRegistry.resolveAlias(eviction);
    // Flush interval; null by default, meaning no periodic flush
    Long flushInterval = context.getLongAttribute("flushInterval");
    // Cache size; null by default, in which case LRU's default size of 1024 is used
    Integer size = context.getIntAttribute("size");
    boolean readWrite = !context.getBooleanAttribute("readOnly", false);
    boolean blocking = context.getBooleanAttribute("blocking", false);
    Properties props = context.getChildrenAsProperties();
    // Initialize the cache object
    builderAssistant.useNewCache(typeClass, evictionClass, flushInterval, size, readWrite, blocking, props);
  }
}
The useNewCache method creates the PerpetualCache object, adds the eviction policy as a decorator, sets the flush interval, sets the cache size, and so on. These steps are assembled with the decorator pattern, producing a Cache object wrapped in several nested layers.
public Cache useNewCache(Class<? extends Cache> typeClass,
    Class<? extends Cache> evictionClass,
    Long flushInterval,
    Integer size,
    boolean readWrite,
    boolean blocking,
    Properties props) {
  Cache cache = new CacheBuilder(currentNamespace)
      .implementation(valueOrDefault(typeClass, PerpetualCache.class))
      .addDecorator(valueOrDefault(evictionClass, LruCache.class))
      .clearInterval(flushInterval)
      .size(size)
      .readWrite(readWrite)
      .blocking(blocking)
      .properties(props)
      .build();
  configuration.addCache(cache);
  currentCache = cache;
  return cache;
}
During initialization, currentNamespace (the namespace of the mapper file) is used as the cache id, and the Cache object is registered in the global Configuration, which is what makes cross-session sharing possible.
Using the cache
As mentioned earlier, the level 2 cache takes effect in CachingExecutor. Looking at the org.apache.ibatis.executor.CachingExecutor#query method, its execution flow is as follows:
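The nesting produced by the decorator pattern can be sketched in plain Java. Everything below (KeyValueCache, BaseCache, LruDecorator, CountingDecorator) is a hypothetical illustration rather than MyBatis code; the LRU layer uses an access-ordered LinkedHashMap to decide which key to evict from the layer it wraps.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal cache contract for this sketch; not the MyBatis Cache interface.
interface KeyValueCache {
    void put(Object key, Object value);
    Object get(Object key);
    Object remove(Object key);
    int size();
}

// Innermost layer: plain HashMap storage.
class BaseCache implements KeyValueCache {
    private final Map<Object, Object> store = new HashMap<>();
    @Override public void put(Object key, Object value) { store.put(key, value); }
    @Override public Object get(Object key) { return store.get(key); }
    @Override public Object remove(Object key) { return store.remove(key); }
    @Override public int size() { return store.size(); }
}

// Eviction layer: tracks access order and removes the least recently used key.
class LruDecorator implements KeyValueCache {
    private final KeyValueCache delegate;
    private final Map<Object, Object> keyOrder;
    private Object eldestKey;

    LruDecorator(KeyValueCache delegate, final int capacity) {
        this.delegate = delegate;
        // accessOrder = true orders keys from least to most recently accessed
        this.keyOrder = new LinkedHashMap<Object, Object>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Object, Object> eldest) {
                boolean overflow = size() > capacity;
                if (overflow) {
                    eldestKey = eldest.getKey();   // remember which key to evict below
                }
                return overflow;
            }
        };
    }

    @Override public void put(Object key, Object value) {
        delegate.put(key, value);
        keyOrder.put(key, key);
        if (eldestKey != null) {
            delegate.remove(eldestKey);
            eldestKey = null;
        }
    }

    @Override public Object get(Object key) {
        keyOrder.get(key);   // touch the key so it counts as recently used
        return delegate.get(key);
    }

    @Override public Object remove(Object key) {
        keyOrder.remove(key);
        return delegate.remove(key);
    }

    @Override public int size() { return delegate.size(); }
}

// Outer layer: counts lookups and hits.
class CountingDecorator implements KeyValueCache {
    private final KeyValueCache delegate;
    int requests = 0;
    int hits = 0;

    CountingDecorator(KeyValueCache delegate) { this.delegate = delegate; }

    @Override public void put(Object key, Object value) { delegate.put(key, value); }
    @Override public Object get(Object key) {
        requests++;
        Object value = delegate.get(key);
        if (value != null) {
            hits++;
        }
        return value;
    }
    @Override public Object remove(Object key) { return delegate.remove(key); }
    @Override public int size() { return delegate.size(); }
}

class DecoratorChainDemo {
    public static void main(String[] args) {
        KeyValueCache cache = new CountingDecorator(new LruDecorator(new BaseCache(), 2));
        cache.put("a", "A");
        cache.put("b", "B");
        cache.get("a");        // "a" is now more recently used than "b"
        cache.put("c", "C");   // capacity exceeded: "b" is evicted
        System.out.println(cache.get("b") + " " + cache.get("a") + " " + cache.size()); // prints "null A 2"
    }
}
```

Each layer only knows the layer directly beneath it, so policies can be combined or swapped without touching the storage class.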
- Generate a CacheKey from the MappedStatement, parameterObject and related information.
- Get the Cache object held by the MappedStatement.
- Try to get the cached data through the TransactionalCacheManager: on a cache hit, return it directly; on a miss, run the query through the delegate executor and store the result through the TransactionalCacheManager.
The query method performs both the cache lookup and the cache add, as shown in the following code.
@Override
public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql)
    throws SQLException {
  Cache cache = ms.getCache();
  if (cache != null) {
    flushCacheIfRequired(ms);
    if (ms.isUseCache() && resultHandler == null) {
      ensureNoOutParams(ms, boundSql);
      @SuppressWarnings("unchecked")
      List<E> list = (List<E>) tcm.getObject(cache, key);
      if (list == null) {
        list = delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
        tcm.putObject(cache, key, list); // issue #578 and #116
      }
      return list;
    }
  }
  return delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
}
Cache invalidation
To avoid dirty reads, the cache is cleared or reset when CachingExecutor performs update, commit and rollback operations.
The update source code:
@Override
public int update(MappedStatement ms, Object parameterObject) throws SQLException {
  flushCacheIfRequired(ms);
  return delegate.update(ms, parameterObject);
}

private void flushCacheIfRequired(MappedStatement ms) {
  Cache cache = ms.getCache();
  if (cache != null && ms.isFlushCacheRequired()) {
    tcm.clear(cache);
  }
}
The commit and rollback source code:
// org.apache.ibatis.executor.CachingExecutor#commit
@Override
public void commit(boolean required) throws SQLException {
  delegate.commit(required);
  tcm.commit();
}

// org.apache.ibatis.executor.CachingExecutor#rollback
@Override
public void rollback(boolean required) throws SQLException {
  try {
    delegate.rollback(required);
  } finally {
    if (required) {
      tcm.rollback();
    }
  }
}

// org.apache.ibatis.cache.TransactionalCacheManager#commit
public void commit() {
  for (TransactionalCache txCache : transactionalCaches.values()) {
    txCache.commit();
  }
}

// org.apache.ibatis.cache.decorators.TransactionalCache#commit
public void commit() {
  if (clearOnCommit) {
    delegate.clear();
  }
  flushPendingEntries();
  reset();
}

private void reset() {
  clearOnCommit = false;
  entriesToAddOnCommit.clear();
  entriesMissedInCache.clear();
}
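The commit-time behaviour can be sketched with a small staging cache. StagedCache is a hypothetical class modelled loosely on what the commit and reset methods above suggest: writes are held in a pending map and only copied to the shared cache on commit, and a rollback simply discards them. The names and the exact handling of clear are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of commit-time flushing; not MyBatis source code.
class StagedCache {
    private final Map<Object, Object> shared;                     // visible to all sessions
    private final Map<Object, Object> pending = new HashMap<>();  // this session's uncommitted entries
    private boolean clearOnCommit = false;

    StagedCache(Map<Object, Object> shared) {
        this.shared = shared;
    }

    Object get(Object key) {
        // After a pending clear, this session must not see the old shared entries.
        return clearOnCommit ? null : shared.get(key);
    }

    void put(Object key, Object value) {
        pending.put(key, value);       // staged, not yet visible to other sessions
    }

    void clear() {
        clearOnCommit = true;          // the shared cache is only cleared at commit time
        pending.clear();
    }

    void commit() {
        if (clearOnCommit) {
            shared.clear();
        }
        shared.putAll(pending);        // publish staged entries
        reset();
    }

    void rollback() {
        reset();                       // drop staged entries, leave the shared cache untouched
    }

    private void reset() {
        clearOnCommit = false;
        pending.clear();
    }
}

class StagedCacheDemo {
    public static void main(String[] args) {
        Map<Object, Object> shared = new HashMap<>();
        StagedCache sessionA = new StagedCache(shared);
        StagedCache sessionB = new StagedCache(shared);

        sessionA.put("k", "v");
        System.out.println(sessionB.get("k"));   // prints "null": not committed yet
        sessionA.commit();
        System.out.println(sessionB.get("k"));   // prints "v"

        sessionA.put("k2", "v2");
        sessionA.rollback();
        System.out.println(sessionB.get("k2"));  // prints "null": the staged entry was discarded
    }
}
```

Staging writes this way keeps uncommitted results of one session from leaking into the cache that other sessions read.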
Summary of Level 2 Cache
Compared with the level 1 cache, the level 2 cache of MyBatis allows cached data to be shared between SqlSessions. Its granularity is also finer, down to the namespace level, and different combinations of Cache implementation classes can be assembled through the Cache interface, which gives more control over caching.
In a distributed environment, the default MyBatis Cache implementation is local to each application instance, so reading stale data is hard to avoid. Solving this requires implementing the MyBatis Cache interface on top of a centralized cache, which has a development cost. Using a distributed cache such as Redis or Memcached directly can be cheaper and safer.
Summary
This article briefly summarized the MyBatis caching mechanism, mainly to understand how it works. Although we may rarely work with it directly in day-to-day development, some of its parameters and characteristics can affect how our applications behave, and understanding the principles is the key to avoiding pitfalls. Its cache design can also offer useful ideas for our own development work.