To cope with high concurrency and relieve database read/write pressure, a cache stores data in a medium with faster reads and writes, such as memory. Caches are generally divided into local caches (such as a Java heap-memory cache) and distributed caches (such as Redis). Because a cache only holds a temporary copy of the data, consistency between that copy and the master data must be guaranteed — which brings us to cache updating, analyzed next.

Common cache update strategies are:

  1. Delete the cache first, then update the database
  2. Update the database first, then delete the cache
  3. Update the database first, then update the cache
  4. Read/write through
  5. Write back: when data is updated, only the cache is updated, not the database; the cache asynchronously flushes updates to the database in batches

Delete the cache and then update the database

This logic is clearly problematic. Suppose there are two concurrent operations, an update and a query. The update deletes the cache but has not yet updated the database when the query arrives: the query misses the cache and reads the old value from the database; the update then writes the new value to the database; finally the query writes the old value into the cache. The cache now holds dirty data, and it stays dirty until the cache expires or a new update is initiated.

Initial cache value: A = 1

Initial database value: A = 1

  1. Update: delete the cache
  2. Query: read the cache (miss)
  3. Query: read the database (A = 1)
  4. Update: update the database (A = 100)
  5. Query: write the cache (A = 1)

Cache final value: A = 1

Database final value: A = 100

There are two ways to deal with this problem:

  • Lock. Locking solves the concurrency problem by forcing concurrent operations to be processed sequentially, which naturally costs some performance. A minimal sketch follows this list.
  • Set an expiration time on the key and tune it. The business must then tolerate inconsistent data within that window, but the expiration bounds how long dirty data can live.
  • Reduce the probability of the problem occurring in the first place.
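
As a rough illustration of the locking option, here is a minimal in-process sketch. All names (LockedCacheAccessor, dbRead, dbWrite) are hypothetical, and plain maps stand in for Redis and the DAO layer; reads and writes on the same key are serialized, so a reader can no longer slip in between the cache delete and the database update:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Per-key locking: operations on the same key run one at a time, so the
// dirty-cache interleaving above cannot happen. The cost is the lower
// concurrency mentioned in the text.
public class LockedCacheAccessor<K, V> {
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<K, V> dbRead;    // stand-in for a DAO query
    private final BiConsumer<K, V> dbWrite; // stand-in for a DAO update

    public LockedCacheAccessor(Function<K, V> dbRead, BiConsumer<K, V> dbWrite) {
        this.dbRead = dbRead;
        this.dbWrite = dbWrite;
    }

    private ReentrantLock lockFor(K key) {
        return locks.computeIfAbsent(key, k -> new ReentrantLock());
    }

    public V get(K key) {
        ReentrantLock lock = lockFor(key);
        lock.lock();
        try {
            V value = cache.get(key);
            if (value == null) {          // cache miss: load from DB and fill the cache
                value = dbRead.apply(key);
                if (value != null) {
                    cache.put(key, value);
                }
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    public void update(K key, V value) {
        ReentrantLock lock = lockFor(key);
        lock.lock();
        try {
            cache.remove(key);            // delete the cache ...
            dbWrite.accept(key, value);   // ... then update the DB, with no reader in between
        } finally {
            lock.unlock();
        }
    }
}

Note that in a distributed deployment this in-process ReentrantLock would have to be replaced with a distributed lock; the sketch only shows the serialization idea.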

Update the database first, then delete the cache

This is by far the most common solution in the industry. Although it is also not perfect, the probability of problems is very small. Its read/write flow is as follows:

A write operation updates the database and, once the update succeeds, invalidates the cache. A read operation first checks the cache; on a hit it returns the cached data directly, and on a miss it reads the database and loads the result into the cache.

But it also has a problem, as shown below: a query misses the cache and reads the old value from the database; before it writes that value to the cache, another user's update operation updates the database and clears the cache; the query then writes the old value into the cache. The cache now holds dirty data, and it stays dirty until the cache expires or a new update is initiated.

Cache initial value: null

Initial database value: A = 1

  1. Query: read the cache (miss)
  2. Query: read the database (A = 1)
  3. Update: update the database (A = 100)
  4. Update: delete the cache (already empty)
  5. Query: write the cache (A = 1)

Cache final value: A = 1

Database final value: A = 100

Why is this idea so widely used when it has such an obvious problem? Because the probability of this case actually occurring is very low; all four of the following conditions must hold to produce it:

  1. The read operation misses the cache
  2. There is a concurrent write operation
  3. The write operation completes faster than the read operation
  4. The read operation reaches the database before the write does, yet writes the cache after the write finishes

In practice, a database write is much slower than a read and takes locks, so a read that enters the database before the write yet updates the cache after the write finishes is very unlikely. And even if the problem does occur, the cache expiration time serves as a safety net.

Update the database first, and then update the cache

By contrast, this approach theoretically offers better read performance than updating the database and then deleting the cache, because it prepares the data in advance. But since it writes two stores (the database and the cache), its write performance is lower, and crucially it can also produce dirty data. As shown below, with two concurrent updates, if their cache writes land in the opposite order from their database writes, the cache ends up holding the first update's value rather than the latest one.

Initial cache value: A = 1

Initial database value: A = 1

  1. Update 1: update the database (A = 100)
  2. Update 2: update the database (A = 200)
  3. Update 2: update the cache (A = 200)
  4. Update 1: update the cache (A = 100)

Cache final value: A = 100

Database final value: A = 200

Read/Write Through (cache proxy)

In the Read/Write Through pattern, the cache itself performs the updates against the backing store, which makes the application layer much simpler. The application treats the backend as a single store, and the store maintains its own cache; the database is proxied by the cache. On a cache miss, the cache loads the data from the database and the application then reads it from the cache. On a write, the cache is updated first and the database is written synchronously. The application is aware only of the cache, not the database.
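
A minimal sketch of this pattern, under the assumption that plain maps and functions stand in for the real cache and database (ReadWriteThroughStore, dbRead, and dbWrite are hypothetical names):

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Read/Write Through: the store is the single facade the application sees.
// On a read miss it loads from the database itself; on a write it updates the
// cache and synchronously writes the database. The caller never touches the DB.
public class ReadWriteThroughStore<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>(); // stand-in for the real cache
    private final Function<K, V> dbRead;
    private final BiConsumer<K, V> dbWrite;

    public ReadWriteThroughStore(Function<K, V> dbRead, BiConsumer<K, V> dbWrite) {
        this.dbRead = dbRead;
        this.dbWrite = dbWrite;
    }

    public V read(K key) {
        // Read through: on a miss, the store itself loads the database record.
        return cache.computeIfAbsent(key, dbRead);
    }

    public void write(K key, V value) {
        cache.put(key, value);      // update the cache first ...
        dbWrite.accept(key, value); // ... then write the database synchronously
    }
}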

Write Back

This is called Write Behind, or Write Back. If you are familiar with the Linux kernel, you will recognize its page write-back mechanism; yes, that is the same idea. In this mode only the cache is updated when data changes, not the database; the cache asynchronously updates the database in batches.

The problem with this approach is that the data is not strongly consistent and can be lost (this is why an abnormal Unix/Linux shutdown can lose data). In addition, the implementation logic of Write Back is complicated, because it has to track which data has been modified and flush it to the persistence layer. An operating system's write back is persisted only when the cache must be invalidated, for example when memory runs out or a process exits; this is called lazy write.
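
A toy sketch of the write-behind idea (class and member names are made up, the flush interval is arbitrary, and a plain map stands in for the cache): writes touch only the cache and a dirty-key set, and a background task flushes dirty entries to the database in one batch:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Write Behind: a put writes only the cache and marks the key dirty; a
// background task batches dirty entries to the database. Anything still
// in the dirty set when the process dies is lost — the trade-off above.
public class WriteBehindCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Set<K> dirty = ConcurrentHashMap.newKeySet();
    private final Consumer<Map<K, V>> dbBatchWrite; // stand-in for a batch DAO update
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public WriteBehindCache(Consumer<Map<K, V>> dbBatchWrite, long flushIntervalMs) {
        this.dbBatchWrite = dbBatchWrite;
        flusher.scheduleWithFixedDelay(this::flush, flushIntervalMs, flushIntervalMs, TimeUnit.MILLISECONDS);
    }

    public void put(K key, V value) {
        cache.put(key, value); // only the cache is written here
        dirty.add(key);        // remember that this key must be flushed later
    }

    public V get(K key) {
        return cache.get(key);
    }

    private void flush() {
        // Collect a snapshot of the dirty entries and write them in one batch.
        Map<K, V> batch = new HashMap<>();
        for (K key : new ArrayList<>(dirty)) {
            dirty.remove(key);
            V value = cache.get(key);
            if (value != null) {
                batch.put(key, value);
            }
        }
        if (!batch.isEmpty()) {
            dbBatchWrite.accept(batch); // one batched database write
        }
    }
}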

Take the second strategy, updating the database first and then deleting the cache, as an example:

  • Miss: the application tries to fetch data from the cache; on a miss it fetches the data from the database and puts it into the cache.
  • Hit: the application fetches data from the cache and returns it.
  • Update: save the data to the database first, and invalidate the cache once that succeeds.

The general process is as follows:

Example: fetching product details

  1. Fetch the product details from the product cache; if present, return the cached data.
  2. If not present, fetch from the product DB. Once fetched successfully, store the data in the cache; the next fetch of these product details can then be served from the cache.
  3. After the product details are updated in or deleted from the product DB, delete the product details entry from the cache.

Adding Maven dependencies

Here MyBatis handles the database DAO operations and Redis handles the caching.

Starting with Spring Boot 2, the default Redis client implementation is Lettuce, which requires the commons-pool2 dependency for connection pooling.

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-pool2</artifactId>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>${jackson-databind-version}</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>${jackson-databind-version}</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>${jackson-databind-version}</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
        <exclusions>
            <exclusion>
                <groupId>com.vaadin.external.google</groupId>
                <artifactId>android-json</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>${mysql-connector.version}</version>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>druid</artifactId>
        <version>${druid.version}</version>
    </dependency>
    <!-- MyBatis Plus enhancements and SpringBoot integration -->
    <dependency>
        <groupId>com.baomidou</groupId>
        <artifactId>mybatisplus-spring-boot-starter</artifactId>
        <version>${mybatisplus-spring-boot-starter.version}</version>
    </dependency>
</dependencies>

Note that I added the Jackson dependency above, which I’ll use later.

Configuring application.yml

Add the Redis configuration:

spring:
  cache:
    type: REDIS
    redis:
      cache-null-values: false
      time-to-live: 600000ms
      use-key-prefix: true
    cache-names: userCache,allUsersCache
  redis:
    host: 127.0.0.1
    port: 6379
    database: 0
    lettuce:
      shutdown-timeout: 200ms
      pool:
        max-active: 7
        max-idle: 7
        min-idle: 2
        max-wait: -1ms

The corresponding configuration class is:

org.springframework.boot.autoconfigure.data.redis.RedisProperties

Adding a Configuration Class

A RedisTemplate configuration class is added so that Jackson replaces the default serialization mechanism:

import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.PropertyAccessor;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.Jackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {
    /**
     * RedisTemplate uses the JDK's serialization mechanism by default, storing binary
     * bytecode, so we customize the serializer.
     *
     * @param redisConnectionFactory the Redis connection factory
     * @return RedisTemplate
     */
    @Bean
    public RedisTemplate<Object, Object> redisTemplate(RedisConnectionFactory redisConnectionFactory) {
        RedisTemplate<Object, Object> redisTemplate = new RedisTemplate<>();
        redisTemplate.setConnectionFactory(redisConnectionFactory);

        // Use Jackson2JsonRedisSerializer instead of the default serialization
        Jackson2JsonRedisSerializer jackson2JsonRedisSerializer = new Jackson2JsonRedisSerializer(Object.class);

        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.setVisibility(PropertyAccessor.ALL, JsonAutoDetect.Visibility.ANY);
        objectMapper.enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL);

        jackson2JsonRedisSerializer.setObjectMapper(objectMapper);

        // Set the serialization rules for value and key
        redisTemplate.setKeySerializer(new StringRedisSerializer());
        redisTemplate.setValueSerializer(jackson2JsonRedisSerializer);
        redisTemplate.afterPropertiesSet();
        return redisTemplate;
    }
}

Usage demonstration

The MyBatis DAO layer and the User entity class are not shown here; only the core CRUD operations of the Service are pasted:

import java.util.concurrent.TimeUnit;

import javax.annotation.Resource;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.ValueOperations;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
@Transactional
public class UserService {
    private Logger logger = LoggerFactory.getLogger(this.getClass());
    @Resource
    private UserMapper userMapper;

    @Resource
    private RedisTemplate<String, User> redisTemplate;

    /**
     * Create a user.
     * Does nothing to the cache.
     */
    public void createUser(User user) {
        logger.info("Create user start...");
        userMapper.insert(user);
    }

    /**
     * Get user info.
     * If the entry exists in the cache, read the user from the cache;
     * otherwise read from the DB and then insert into the cache.
     *
     * @param id the user ID
     * @return the user
     */
    public User getById(int id) {
        logger.info("Get user start...");
        // Retrieve user information from the cache
        String key = "user_" + id;
        ValueOperations<String, User> operations = redisTemplate.opsForValue();

        // The cache exists
        boolean hasKey = redisTemplate.hasKey(key);
        if (hasKey) {
            User user = operations.get(key);
            logger.info("User ID = obtained from cache" + id);
            return user;
        }
        
        // Cache does not exist, get from DB
        User user = userMapper.selectById(id);
        // Insert the cache
        operations.set(key, user, 10, TimeUnit.SECONDS);
        return user;
    }

    /**
     * Update a user.
     * If a cache entry exists, delete it; if not, do nothing.
     *
     * @param user the user
     */
    public void updateUser(User user) {
        logger.info("Update user start...");
        userMapper.updateById(user);
        int userId = user.getId();
        // Cache exists, delete cache
        String key = "user_" + userId;
        boolean hasKey = redisTemplate.hasKey(key);
        if (hasKey) {
            redisTemplate.delete(key);
            logger.info("Delete user from cache when updating user >>"+ userId); }}/** * Delete user * if present in cache, delete */
    public void deleteById(int id) {
        logger.info("Delete user start...");
        userMapper.deleteById(id);

        // Cache exists, delete cache
        String key = "user_" + id;
        boolean hasKey = redisTemplate.hasKey(key);
        if (hasKey) {
            redisTemplate.delete(key);
            logger.info("Delete user from cache when deleting user >>"+ id); }}}Copy the code

RedisTemplate wraps RedisConnection and provides connection management, serialization, and the Redis operations. There is also a String-focused template, StringRedisTemplate.

The operation view used here is ValueOperations, which corresponds to Redis String (key/value) operations.

There are other operation views as well: ListOperations, SetOperations, ZSetOperations, and HashOperations. ValueOperations lets you set an expiration time when inserting a cache entry; here it is set to 10 seconds. An illustrative fragment follows.
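
For example, a hypothetical fragment using a StringRedisTemplate (the key names below are invented for illustration):

import org.springframework.data.redis.core.HashOperations;
import org.springframework.data.redis.core.ListOperations;
import org.springframework.data.redis.core.SetOperations;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.ZSetOperations;

public class OperationViewsDemo {
    // Each opsForXxx() method returns a view bound to one Redis data type.
    void demo(StringRedisTemplate template) {
        HashOperations<String, String, String> hashOps = template.opsForHash();
        hashOps.put("user_hash_89", "username", "admin");

        ListOperations<String, String> listOps = template.opsForList();
        listOps.leftPush("recent_user_ids", "89");

        SetOperations<String, String> setOps = template.opsForSet();
        setOps.add("online_user_ids", "89");

        ZSetOperations<String, String> zSetOps = template.opsForZSet();
        zSetOps.add("user_scores", "89", 1.0);
    }
}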

Then write a test class and run it to see the effect:

import java.util.Random;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import org.springframework.transaction.annotation.Transactional;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = Application.class)
@Transactional
public class UserServiceTest {
    @Autowired
    private UserService userService;
    
    @Test
    public void testCache() {
        int id = new Random().nextInt(1000);
        User user = new User(id, "admin", "admin");
        userService.createUser(user);
        User user1 = userService.getById(id); // First visit
        assertEquals(user1.getPassword(), "admin");
        User user2 = userService.getById(id); // Second visit
        assertEquals(user2.getPassword(), "admin");
        user.setPassword("123456");
        userService.updateUser(user);
        User user3 = userService.getById(id); // Third visit
        assertEquals(user3.getPassword(), "123456"); userService.deleteById(id); assertNull(userService.getById(id)); }}Copy the code

Run the SpringBoot integration test and view the log as follows:

Create user start...
==>  Preparing: INSERT INTO t_user ( id, username, `password` ) VALUES ( ?, ?, ? )
==> Parameters: 89(Integer), admin(String), admin(String)
<==    Updates: 1
Get user start...
Starting without optional epoll library
Starting without optional kqueue library
==>  Preparing: SELECT id AS id,username,`password` FROM t_user WHERE id=?
==> Parameters: 89(Integer)
<==      Total: 1
Get user start...
Fetched user from cache, id = 89
Update user start...
==>  Preparing: UPDATE t_user SET username=?, `password`=? WHERE id=?
==> Parameters: admin(String), 123456(String), 89(Integer)
<==    Updates: 1
Deleted user from cache on update >> 89
Get user start...
==>  Preparing: SELECT id AS id,username,`password` FROM t_user WHERE id=?
==> Parameters: 89(Integer)
<==      Total: 1
Delete user start...
==>  Preparing: DELETE FROM t_user WHERE id=?
==> Parameters: 89(Integer)
<==    Updates: 1
Deleted user from cache on delete >> 89
Get user start...
==>  Preparing: SELECT id AS id,username,`password` FROM t_user WHERE id=?
==> Parameters: 89(Integer)
<==      Total: 0
Rolled back transaction for test: [DefaultTestContext@6e20b53a testClass = UserServiceTest
Closing org.springframework.context.annotation.AnnotationConfigApplicationContext@503f91c3
{dataSource-1} closed

You can see that the first fetch misses the cache and loads the data from the DB, while the second fetch hits the cache and returns directly. Subsequent updates and deletes also remove the entry from the cache, so the next cache miss fetches the latest data from the DB.

Summary

This article summarized five common cache update strategies, of which updating the database first and then deleting the cache is currently the most common. Deleting the cache first and then updating the database is rarely helpful because its failure probability is too high. The third and fifth strategies are also useful in specific scenarios; for example, updating the database and then updating the cache can prevent a sudden burst of cache-miss requests from penetrating through to the database under high concurrency. Every scheme has its pros and cons; in short, there is no perfect scheme, only the scheme that best fits the scenario.

For more content, please visit my personal blog, Dynasty.