1. Preface
Caches are everywhere in our daily work. Used well, a cache does a lot for system performance and user experience; used badly, it causes confusing problems. This article introduces some common ways of using caches and the pitfalls you can step into along the way, in the hope of helping you understand and use caches better. If anything here is wrong, feel free to leave a comment with a correction.
2. Java cache forms and media
As we all know, cache access is fast because the medium a cache interacts with is memory, whereas a regular data store such as MySQL interacts with disk. So which tools, in Java itself or in middleware, can we use for caching?
2.1. JVM local memory
A common way to use JVM local memory is to define a global static variable. The object it references stays directly reachable while the back-end service is running, so it is not collected by GC.
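A minimal sketch of this approach, assuming a hypothetical dictionary of status codes. The class name `DictCache` and the keys are illustrative only:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical dictionary cache held in a static field. The static reference
// keeps the map reachable, so GC does not reclaim it while the class is loaded.
final class DictCache {
    private static final Map<String, String> DICT = new ConcurrentHashMap<>();

    private DictCache() {}

    // Load (or reload) an entry, for example at application startup.
    static void put(String code, String label) {
        DICT.put(code, label);
    }

    // Returns the cached label, or null if the code was never loaded.
    static String get(String code) {
        return DICT.get(code);
    }
}
```

Nothing here ever expires or evicts entries, which is why this form suits small, rarely changing data.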
2.2. Guava Cache tool class
Guava Cache stores data in much the same way as plain JVM memory: internally it relies on subclasses of the Java collections to hold the data. On top of that it provides settings such as expiration time and eviction policy for cached data, so it behaves like a small piece of middleware.
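To make the idea of an expiration time concrete, here is a small plain-Java sketch of an expire-after-write cache. It is not Guava's implementation or API; `TtlCache` and the injected clock are assumptions made so the example runs without extra dependencies:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Illustrative only: a tiny expire-after-write cache backed by a map.
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns the value, or null if the key is absent or its entry has expired.
    V get(K key) {
        Entry<V> entry = store.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            store.remove(key, entry); // lazily drop the expired entry
            return null;
        }
        return entry.value;
    }
}
```

In production code you would pass `System::currentTimeMillis` as the clock; a real cache library adds size limits and eviction on top of this.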
2.3. Redis
Redis is a fully open source, BSD compliant, high-performance key-value database.
Compared with other key-value cache products, Redis has the following three features:
- Redis supports data persistence. Data held in memory can be saved to disk and reloaded on restart.
- Redis supports not only simple key-value data, but also data structures such as lists, sets, sorted sets (zset), and hashes.
- Redis supports data backup in master-slave mode.
Quote from: www.runoob.com/redis/redis…
2.4. Comparison
Aspect | JVM cache | Guava cache | Redis cache
---|---|---|---
Speed | Fastest | Second | Third
Cached data occupies JVM memory | Yes | Yes | No
Provides policies such as expiration time | No | Yes | Yes
Can cache large amounts of data | No | No | Yes
Cache lost when the application restarts | Yes | Yes | No
Usage scenarios | Dictionary-type data that is rarely modified after loading | Everything the JVM cache supports, plus time-sensitive data such as tokens | Everything the Guava cache supports and all the caching scenarios of daily work; restarting the application does not affect loading or using the cached data
3. Common cache pitfalls
Before looking at the pitfalls, let's first look at how reads and writes keep the database and the cache consistent.
1. Query
When querying, check the cache first. If the data is in the cache, return it directly. If it is not, query the database and write the query result into the cache.
2. Add, delete, and modify
When adding, deleting, or modifying data, change the database first and then synchronize the cache. If an exception occurs, roll back the database transaction.
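The read and write order described above can be sketched as follows. Both maps are in-memory stand-ins (an assumption): `db` plays the database and `cache` plays the cache layer. `UserService` is a hypothetical name, and transaction rollback is left out to keep the sketch short:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the read/write order; not tied to any particular cache product.
final class UserService {
    private final Map<Long, String> db;                               // stand-in for the database
    private final Map<Long, String> cache = new ConcurrentHashMap<>(); // stand-in for the cache

    UserService(Map<Long, String> initialRows) {
        this.db = new ConcurrentHashMap<>(initialRows);
    }

    // Query: cache first, then the database, then write the result back to the cache.
    String query(long id) {
        String value = cache.get(id);
        if (value != null) {
            return value;
        }
        value = db.get(id);
        if (value != null) {
            cache.put(id, value);
        }
        return value;
    }

    // Add or modify: change the database first, then synchronize the cache.
    void save(long id, String name) {
        db.put(id, name);
        cache.put(id, name);
    }

    // Delete: remove from the database first, then from the cache.
    void delete(long id) {
        db.remove(id);
        cache.remove(id);
    }

    boolean isCached(long id) {
        return cache.containsKey(id);
    }
}
```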
3.1. Cache penetration
3.1.1. Concept
Under normal circumstances the data being queried exists. When a request asks for data that does not exist, neither the cache nor the database can find it, so every such request goes all the way to the database. This phenomenon of querying non-existent data is called cache penetration. As shown in the figure above, the user keeps calling the interface with ids that do not exist (only data with id > 0 exists in the database), and the corresponding data can never be in the cache. Under high concurrency a large number of requests reach the database, and the sudden traffic may bring the database down.
3.1.2. Solution
1. Filter invalid requests
If ids are known to be greater than 0, or are generated by some rule (such as snowflake IDs), filter out requests whose ids cannot exist in the database by adding a parameter check at the method entry.
2. Cache null values
Penetration happens because the data does not exist in the database, so we cache a null value for that key and return null when the next request arrives. The cached null must have an expiration time, in case data with this id is created later. The expiration time should not be too long; set it according to the request concurrency of the actual business scenario.
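A sketch that combines the parameter check with null-value caching. `ProductLookup`, the 30-second TTL for cached nulls, and the explicit `nowMillis` argument are all assumptions chosen to keep the example self-contained:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ProductLookup {
    private static final long NULL_TTL_MILLIS = 30_000; // assumed value; tune per workload

    private static final class Entry {
        final String value;         // null means "known to be absent from the database"
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<Long, String> db;                               // stand-in for the database
    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final AtomicInteger dbHits = new AtomicInteger();

    ProductLookup(Map<Long, String> db) {
        this.db = db;
    }

    String query(long id, long nowMillis) {
        if (id <= 0) {
            return null; // parameter check: valid ids are always greater than 0
        }
        Entry entry = cache.get(id);
        if (entry != null && nowMillis < entry.expiresAtMillis) {
            return entry.value; // may be a cached null
        }
        dbHits.incrementAndGet();
        String value = db.get(id);
        // Real values never expire in this sketch; cached nulls expire quickly.
        long expiresAt = (value == null) ? nowMillis + NULL_TTL_MILLIS : Long.MAX_VALUE;
        cache.put(id, new Entry(value, expiresAt));
        return value;
    }

    int dbHits() {
        return dbHits.get();
    }
}
```

Repeated lookups of a missing id reach the database only once per TTL window, which is the point of caching the null.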
3. IP blocking
For malicious requests that keep asking for invalid data, you can set a per-IP request policy. If an IP address sends a large number of requests in a short period and the requested data does not exist, block that IP address for a period of time and reject further requests from it.
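One possible shape for such a policy is a fixed-window counter per IP. The thresholds, the class name `IpBlocker`, and the explicit `nowMillis` argument below are assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Fixed-window sketch: block an IP once it sends too many invalid requests
// inside one window.
final class IpBlocker {
    private static final int BLOCK_THRESHOLD = 3;    // invalid requests per window (assumed)
    private static final long WINDOW_MILLIS = 1_000; // window length (assumed)
    private static final long BLOCK_MILLIS = 60_000; // block duration (assumed)

    private static final class State {
        long windowStart;
        int invalidCount;
        long blockedUntil;

        State(long windowStart) {
            this.windowStart = windowStart;
        }
    }

    private final Map<String, State> states = new HashMap<>();

    // Call when a request from this IP asked for data that does not exist.
    synchronized void recordInvalidRequest(String ip, long nowMillis) {
        State state = states.computeIfAbsent(ip, key -> new State(nowMillis));
        if (nowMillis - state.windowStart >= WINDOW_MILLIS) {
            state.windowStart = nowMillis; // start a new counting window
            state.invalidCount = 0;
        }
        state.invalidCount++;
        if (state.invalidCount >= BLOCK_THRESHOLD) {
            state.blockedUntil = nowMillis + BLOCK_MILLIS;
        }
    }

    // Check this before doing any cache or database work for the request.
    synchronized boolean isBlocked(String ip, long nowMillis) {
        State state = states.get(ip);
        return state != null && nowMillis < state.blockedUntil;
    }
}
```

In a multi-instance deployment the counters would need to live in shared storage rather than in one process; that part is outside this sketch.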
4. Bloom filter
A BloomFilter is used to determine whether a certain element (key) exists in a set. We put every key that has data into the BloomFilter, and on each query we check the BloomFilter first.
Note that BloomFilter has no deletion operation. A deleted key still passes the BloomFilter check, so the query goes on to the cache and then the database. BloomFilter can therefore be combined with null-value caching: for a deleted key, cache null in the cache.
Bloom filters are available in both the Guava toolkit and Redis. If you are interested, see this article: www.cnblogs.com/ysocean/p/1…
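To show the mechanism, here is a minimal Bloom filter sketch in plain Java. It is not the Guava or Redis implementation; `SimpleBloomFilter` and its hashing scheme are assumptions for illustration. Keep in mind that a Bloom filter can report a key as present when it was never added (a false positive), but never the reverse:

```java
import java.util.BitSet;

// Minimal Bloom filter: each key sets hashCount bits; a key "might exist"
// only if all of its bits are set.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void put(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false: the key was definitely never added.
    // true: the key was probably added (false positives are possible).
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16);
        return Math.floorMod(h1 + i * h2, size);
    }
}
```

In the query path, a request whose key fails `mightContain` can be rejected before it touches the cache or the database.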
3.2. Cache breakdown
3.2.1. Concept
Going by the concept diagram in 3.1.1 Cache penetration, cache breakdown is strictly speaking a kind of cache penetration. The difference is that with cache breakdown the queried data is valid. Under high concurrency, the data is missing from the cache or has expired at the moment it is queried, so a large number of requests go to the database and the database goes down.
3.2.2. Solution
Let's analyze the scenario: a large number of requests hit the cache at the same time, the cache has no data, so they all query the database and then write the result into the cache. The problem is that many threads read and rewrite the cache almost simultaneously. A multi-threaded concurrency problem is, of course, solved with locks. A single application can lock with the synchronized keyword or ReentrantLock, and distributed services can use a distributed lock.
Pseudo code:

    public Object query() {
        // Query the cache first
        Object value = queryCache();
        if (Objects.nonNull(value)) {
            return value;
        }
        // Lock: synchronized or ReentrantLock in a single application,
        // a distributed lock across services
        synchronized (this) {
            // Query the cache again, another thread may have filled it
            value = queryCache();
            if (Objects.nonNull(value)) {
                return value;
            }
            // Query the database
            value = queryDb();
            // Write the data into the cache
            setCache(value);
            // Return the result
            return value;
        }
    }
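A runnable version of the same check, lock, re-check pattern for a single application, using ReentrantLock. `HotKeyLoader`, the single cached field, and the `demo` helper are assumptions made for illustration. The helper starts several threads at once and reports how many of them reached the stand-in database:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class HotKeyLoader {
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger dbHits = new AtomicInteger();
    private volatile String cached; // stand-in for one hot cache entry

    String query() {
        String value = cached;
        if (value != null) {
            return value; // cache hit, no locking needed
        }
        lock.lock();
        try {
            value = cached; // re-check: another thread may have loaded it already
            if (value != null) {
                return value;
            }
            value = queryDb();
            cached = value;
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Stand-in for the real database query.
    private String queryDb() {
        dbHits.incrementAndGet();
        return "hot-value";
    }

    int dbHits() {
        return dbHits.get();
    }

    // Fires `threads` concurrent queries and returns how many reached the database.
    static int demo(int threads) {
        HotKeyLoader loader = new HotKeyLoader();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                loader.query();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loader.dbHits();
    }
}
```

Because every thread re-checks the cache after acquiring the lock, only the first one queries the database no matter how many arrive together.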
3.3. Cache Avalanche
3.3.1. Concept
Cache avalanche is in turn a kind of cache breakdown: because of the expiration time or eviction policy, a large number of cache entries become invalid at the same point in time, and under high concurrency a large number of requests go to the database.
3.3.2. Solution
In a cache avalanche the request pattern is the same as in cache breakdown, so the main question is how to prevent the avalanche from happening.
1. Set hot data to never expire, and use an eviction policy that evicts the entries closest to expiry first.
2. Set cache expiration times to well-dispersed random values, so that a large number of entries do not expire at the same point in time.
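The second measure can be sketched as a small helper that adds a random offset to a base expiration time. `TtlJitter` and its parameter names are hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

final class TtlJitter {
    private TtlJitter() {}

    // Returns baseSeconds plus a random offset in [0, maxJitterSeconds].
    static long withJitter(long baseSeconds, long maxJitterSeconds) {
        if (baseSeconds < 0 || maxJitterSeconds < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }
}
```

For example, `TtlJitter.withJitter(1800, 300)` yields an expiration time between 30 and 35 minutes, so entries written together do not all expire together.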
3.4. Performance issues
3.4.1. Concept
This covers scenarios where a cache is in use but performance still does not improve. For example, during Double Eleven the volume of order data is very large. Having every add, modify, and delete go through the database first and then write back to the cache is very inefficient.
3.4.2. Solution
Use the cache as the database. This of course requires a highly available, persistent cache middleware such as Redis. Data lives in Redis and all reads and writes go directly to Redis. After the traffic peak, start a scheduled task that flushes the Redis data into the database or ES. A message queue can also be used here, but I won't go into that.
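A rough sketch of this write path, with in-memory maps standing in for Redis and for the database (both are assumptions, as is the name `OrderStore`). Writes only touch the cache store and mark the key as dirty. A flush method, meant to be called by a scheduled task, later copies dirty entries to the database. Deletes are not handled here:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class OrderStore {
    private final Map<String, String> redis = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for MySQL or ES
    private final Set<String> dirtyKeys = ConcurrentHashMap.newKeySet();

    // Hot path: write to the cache store only and remember the key.
    void save(String orderId, String payload) {
        redis.put(orderId, payload);
        dirtyKeys.add(orderId);
    }

    // Reads are served from the cache store.
    String get(String orderId) {
        return redis.get(orderId);
    }

    // Copies entries changed since the last flush into the database and
    // returns how many were written. Intended to run after the traffic peak.
    int flushToDb() {
        int flushed = 0;
        for (String key : dirtyKeys) {
            if (dirtyKeys.remove(key)) {
                String payload = redis.get(key);
                if (payload != null) {
                    db.put(key, payload);
                    flushed++;
                }
            }
        }
        return flushed;
    }

    String readFromDb(String orderId) {
        return db.get(orderId);
    }
}
```

In a real service, `flushToDb` could be registered with a `ScheduledExecutorService`, and the durability of unflushed data would depend on the persistence settings of the cache middleware.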
4. Summary
This article focused on strategies for adding, deleting, modifying, and querying cached data, and on the common pitfalls.
Data consistency
1. To query, read the cache first, then the database, and then update the cache.
2. To add, delete, or modify, change the database first and then update the cache.
Pitfalls
Understand cache penetration, cache breakdown, and cache avalanche.
High performance
Read and write a persistent cache, and flush to MySQL asynchronously.
Contact me
If you found this article useful, feel free to like, comment, and follow.
DingTalk: louyanfeng25
WeChat: baiyan_lou