1. The nature of caching

1.1 What is a cache

A cache is a hardware or software component that stores data so that subsequent requests for that data can be served faster. The data in a cache may be pre-computed results, copies of data stored elsewhere, and so on. Typical examples include the CPU cache and the disk cache.

The read and write speeds of different kinds of storage also vary greatly. From fastest to slowest: registers > cache > main memory > external storage.

1.2 Functions of cache

In traditional back-end business scenarios, the requirements on traffic and response time are not high, and a single DB is usually enough. As the business grows, however, traffic increases, the DB becomes overwhelmed, and response time rises significantly, seriously hurting the user experience (everything feels slow). At this point the DB layer is optimized by adding DB instances, separating reads and writes, and so on, but that is far from enough: traffic generally grows exponentially while this kind of scaling is linear, so the response-time requirement still cannot be met. A Cache layer is therefore introduced and the frequently accessed hot data is placed in it, which greatly improves the overall carrying capacity of the system. The original architecture thus becomes a two-layer Cache + DB architecture.

This architecture optimization has the following benefits:

  • Data reads become much faster;
  • The scalability of the system is improved: by expanding the cache, the carrying capacity of the system can be raised quickly;
  • The cost of data storage is reduced: the Cache layer takes over part of the requests that would otherwise hit the DB layer, saving machine costs.

For example, the requirement for a query interface in an actual service is that the system can bear query requests with a peak of 100,000 per second; the query result is allowed to lag behind the actual data to some extent, but the delay must not exceed 5 seconds. The throughput of Tomcat, Redis and MySQL can be roughly estimated at 5,000/s, 50,000/s and 700/s respectively. It is clear that adding a cache greatly increases the system's carrying capacity. Of course, data consistency and user-experience issues still need to be weighed against business requirements.
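As a rough back-of-the-envelope check using those figures (ignoring overhead and uneven load), serving 100,000 requests/second directly from MySQL would require about 100,000 / 700 ≈ 143 DB instances, whereas with a cache layer roughly 100,000 / 50,000 = 2 Redis instances plus 100,000 / 5,000 = 20 Tomcat instances would do.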

  • Suitable scenarios: hot data exists; response-time requirements are high; consistency requirements are not strict; distributed locking needs to be implemented.
  • Unsuitable scenarios: data is rarely read; data is updated frequently; consistency requirements are strict.

1.3 Mode of accessing cache

There are several common modes of applying access caching:

  • Dual read and dual write: for reads, the cache is read first; if the data is not in the cache, the database is read and the result is written back to the cache. For writes, the database is written first, then the cache. The advantages are simple logic and low implementation cost, but it may cause cache penetration and, under high concurrency, data inconsistency (see the sketch after this list);
  • Asynchronous update: all data is kept in the cache and no expiration time is set on the cache system. An asynchronous service propagates changed or newly added data from the database to the cache; the common practice is to push update operations to the cache through the MySQL binlog. The advantages are that there is no cache penetration and the data is strongly consistent, but maintaining the heterogeneous data pipeline is costly;
  • Serial mode: the application reads and writes the cache directly, and the cache, acting as a proxy layer, reads and writes the database according to requirements and configuration. This approach addresses data consistency, cache penetration and heterogeneous-data costs, but increases design and maintenance costs.
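
A minimal sketch of the dual read and dual write (cache-aside) mode, assuming generic `Cache` and `Database` interfaces (hypothetical names used only for illustration):

```java
import java.util.Optional;

// Hypothetical interfaces used only for illustration.
interface Cache { Optional<String> get(String key); void put(String key, String value); }
interface Database { String query(String key); void update(String key, String value); }

public class CacheAsideExample {
    private final Cache cache;
    private final Database db;

    public CacheAsideExample(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    // Read path: try the cache first, fall back to the database and write back.
    public String read(String key) {
        return cache.get(key).orElseGet(() -> {
            String value = db.query(key);    // may be null if the key does not exist
            if (value != null) {
                cache.put(key, value);       // write back so later reads hit the cache
            }
            return value;
        });
    }

    // Write path: update the database first, then the cache.
    public void write(String key, String value) {
        db.update(key, value);
        cache.put(key, value); // some variants delete the cached entry here instead
    }
}
```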

2. Common caches

There are two types of cache in common use: local in-process cache, such as a Map/Ehcache/Guava Cache, and distributed cache, such as Redis and Memcached.

  • Local cache: a cache component inside the application itself. Because the cache lives in the same process as the application, access is very fast. Local cache suits scenarios where a single application does not need cluster support, or where nodes in a cluster do not need to notify each other. Its disadvantages are that the cache cannot be shared among multiple applications or processes, which can lead to data inconsistency; it is limited by the heap, so the amount of data that can be cached is small; and cache lifetime is affected by GC. It mainly meets the need for small amounts of cached data in single-node scenarios that are not sensitive to changes in the cached data, such as general configuration management and basic static data.

  • Distributed cache: a cache component or service deployed separately from the application and isolated from it. Multiple applications can directly share the cache. It is easy to scale and keeps the cached data consistent across applications, but it generally requires separate operation and maintenance of the cache middleware, so the cost is higher.

The commonly used distributed cache middleware includes Redis/Memcache/Tair. When designing a technical architecture, the selection of distributed cache technology usually needs to consider the following factors: data types / thread model / persistence mechanism / client / high availability / queue support / transactions / data eviction strategy / memory allocation.

  • Data structures: Redis supports String/list/hash/set/zset; Memcache supports only key/value; Tair supports the same data structures as Redis;
  • Thread model: Redis is single-threaded; the other two are multi-threaded;
  • Replication model: Redis supports master-slave replication and master-slave chains; Memcache relies on third-party components; Tair implements replication at the cluster level;
  • Persistence mechanism: Redis supports RDB/AOF; Memcache relies on third-party components; Tair's persistence is determined by its storage engine;
  • Storage structure: Redis uses compressed strings/dictionaries/skip lists; Memcache uses Slab allocation; Tair supports MDB (memory)/RDB (Redis)/FDB (persistence)/LDB (LevelDB);
  • High availability: Redis is guaranteed by master-slave/Sentinel/Cluster mode; Memcache relies on third-party components; Tair is guaranteed by its cluster mode.

3. High availability and concurrency of cache

3.1 Cache high availability

Generally, a cache scheme is built on sharding plus master-slave replication: sharding splits queries over large data volumes across nodes, while master-slave replicas provide high availability and some of the performance headroom by keeping multiple copies of the data to relieve read pressure. Master-slave replication is usually asynchronous; when strong consistency is required, the write transaction starts at the master node, is sent to the slave nodes, and the master only returns success after the slaves have received it and acknowledged back. All of this happens in memory, with master and slave nodes replicating data asynchronously to stay in sync.

3.1.1 Redis master-slave architecture

Master-slave architecture is the simplest solution to achieve high availability. The core principle is master-slave synchronization:

  • The slave sends the SYNC command to the master.
  • After receiving the SYNC command, the master runs BGSAVE to generate an RDB file.
  • Since BGSAVE takes some time, the master keeps receiving write commands from clients and buffers them.
  • After BGSAVE completes, the RDB file is sent to the slave, which loads it.
  • The master then sends the buffered write commands to the slave to execute, thus ensuring data consistency.

3.1.2 Redis Sentinel mode

The master-slave synchronization mode has an obvious disadvantage: if the master node fails, data can no longer be written and the slaves become useless, so the entire architecture is unavailable. Unless the switchover is performed manually, there is no failover.

Sentinel can monitor multiple master and slave servers at the same time. When a monitored master goes offline, it automatically promotes a slave to master, and the new master continues to accept commands. The whole process is as follows:

  • Initialize Sentinel, replacing the normal Redis code with Sentinel-specific code
  • Initialize the masters dictionary and the server information; the server information mainly stores ip:port and records the address and ID of each instance
  • Create two connections to the master, a command connection and a subscription connection, and subscribe to the __sentinel__:hello channel
  • Send the INFO command to the master every 10 seconds to obtain the current information about the master and all of its slaves
  • When the master has a new slave, Sentinel establishes the same two connections with the new slave and sends the INFO command every 10 seconds to keep the information up to date
  • Sentinel sends a PING command to all servers every second; if a server keeps returning invalid responses beyond the configured response time, it is marked as offline
  • A leader Sentinel is elected, which requires the agreement of more than half of the Sentinels
  • The leader Sentinel selects one slave of the offline master and converts it into the new master
  • All the other slaves are made to replicate from the new master instead
  • The original master is set as a slave of the new master, so when it comes back online it becomes a slave of the new master

Sentinel sends a PING command every second to all instances (master and slave servers as well as other Sentinels) and decides whether they have gone offline based on the response; this judgment is called subjective offline. When an instance is judged subjectively offline, the other monitoring Sentinels are queried; if more than half agree that it is offline, it is marked as objectively offline and failover is triggered.

3.1.3 Redis cluster mode

Sentinel gives Redis high availability, but to support high concurrency while holding large amounts of data you need Redis Cluster. Redis Cluster is the distributed data storage solution provided by Redis: the cluster shares data through sharding and provides replication and failover.

A Redis cluster consists of multiple nodes, which are connected through the cluster meet command. The node handshake process is as follows:

  1. Node A receives the CLUSTER MEET command from the client
  2. Node A sends a MEET message to node B based on the IP address and port given in the command
  3. Node B returns a PONG message after receiving the MEET message
  4. Node A learns that B has received the MEET message and returns a PING message, indicating a successful handshake
  5. Finally, node A propagates node B's information to the other nodes in the cluster through the Gossip protocol, and the other nodes also shake hands with node B

Slots

Redis Cluster stores data in shards. The entire cluster keyspace is divided into 16,384 slots, and each node in the cluster handles some subset of the slots numbered 0 to 16,383. When all 16,384 slots are handled by some node, the cluster is online; conversely, if any slot is not handled, the cluster is offline. The CLUSTER ADDSLOTS command assigns slots to a node. Each node records its slots in a bit array of length 16384/8 = 2048 bytes; a bit set to 1 means the corresponding slot is handled by the node, and 0 means it is not. For example, if node A handles slots 0 to 7, those bits are set to 1, as shown in the figure.
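
A minimal sketch of the slot bit array described above, assuming each node tracks its assigned slots in a 2048-byte array (illustrative only, not Redis's actual implementation):

```java
public class SlotBitmap {
    private static final int SLOT_COUNT = 16384;            // total hash slots in a Redis cluster
    private final byte[] bits = new byte[SLOT_COUNT / 8];   // 2048 bytes, one bit per slot

    // Mark a slot as handled by this node (what CLUSTER ADDSLOTS would record).
    public void addSlot(int slot) {
        bits[slot / 8] |= (1 << (slot % 8));
    }

    // Check whether this node handles a given slot.
    public boolean hasSlot(int slot) {
        return (bits[slot / 8] & (1 << (slot % 8))) != 0;
    }

    public static void main(String[] args) {
        SlotBitmap nodeA = new SlotBitmap();
        for (int slot = 0; slot <= 7; slot++) {  // node A handles slots 0..7
            nodeA.addSlot(slot);
        }
        System.out.println(nodeA.hasSlot(3));    // true
        System.out.println(nodeA.hasSlot(100));  // false
    }
}
```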

3.2 High cache concurrency

MySQL can reach a QPS of around 100,000 when queries hit an in-memory index; Redis delivers roughly the same order of performance, which is bounded by network IO and memory. There are further ways to improve cache access performance, such as accessing independent caches in parallel and dependent caches serially, using a tool like CompletableFuture. For example, to assemble a curriculum schedule, the course, teacher and class information all need to be fetched at the same time.
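
A minimal sketch of parallel, dependency-free cache access using CompletableFuture, assuming hypothetical fetchCourse/fetchTeacher/fetchClazz methods that each read from the cache:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ScheduleAssembler {
    private final ExecutorService pool = Executors.newFixedThreadPool(3);

    // Hypothetical cache lookups; each would read Redis or a local cache in practice.
    private String fetchCourse(long id)  { return "course-" + id; }
    private String fetchTeacher(long id) { return "teacher-" + id; }
    private String fetchClazz(long id)   { return "class-" + id; }

    public String assemble(long scheduleId) {
        // The three lookups have no dependency on each other, so run them in parallel.
        CompletableFuture<String> course  = CompletableFuture.supplyAsync(() -> fetchCourse(scheduleId), pool);
        CompletableFuture<String> teacher = CompletableFuture.supplyAsync(() -> fetchTeacher(scheduleId), pool);
        CompletableFuture<String> clazz   = CompletableFuture.supplyAsync(() -> fetchClazz(scheduleId), pool);

        // Wait for all three and combine the results.
        return CompletableFuture.allOf(course, teacher, clazz)
                .thenApply(v -> course.join() + " / " + teacher.join() + " / " + clazz.join())
                .join();
    }
}
```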

4. Precautions

4.1 Cache Avalanche

A cache avalanche occurs when a large number of cached entries expire at the same time; if a large number of requests arrive at that moment, they all go straight to the database and may overwhelm it. A cache avalanche generally refers to data that exists in the database but is no longer in the cache because its expiration time has passed, so requests fall through to the database. Solutions to cache avalanche:

  • Set different expiration times for different keys to avoid simultaneous expiration (see the jitter sketch after this list).
  • Rate limiting: if a cache instance goes down, limit traffic to prevent the DB from being crushed by a flood of simultaneous requests.
  • A second-level cache, using the same scheme as for hot keys.
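
A minimal sketch of adding random jitter to expiration times so that keys do not all expire at once, assuming a generic setWithTtl method (hypothetical, for illustration):

```java
import java.util.concurrent.ThreadLocalRandom;

public class JitteredExpiry {
    private static final int BASE_TTL_SECONDS = 600;    // baseline expiration
    private static final int MAX_JITTER_SECONDS = 120;  // spread expirations over 2 minutes

    // Hypothetical cache write; in practice this would call e.g. SETEX on Redis.
    static void setWithTtl(String key, String value, int ttlSeconds) {
        System.out.printf("SETEX %s %d %s%n", key, ttlSeconds, value);
    }

    public static void cachePut(String key, String value) {
        int jitter = ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS);
        setWithTtl(key, value, BASE_TTL_SECONDS + jitter); // keys expire at slightly different times
    }

    public static void main(String[] args) {
        cachePut("user:1", "alice");
        cachePut("user:2", "bob");
    }
}
```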

4.2 Cache Breakdown

Cache breakdown is similar to a cache avalanche, except that it typically involves a single cached key expiring while a large number of concurrent requests need that key at the same moment, putting the database under strain. Cache breakdown solutions:

  • Lock so that only a single thread rebuilds the cache entry. After the first request reads the database, it writes the result to the cache, and subsequent requests read directly from the cache (see the sketch after this list).
  • Store a logical expiration time inside the value and refresh the entry asynchronously before it expires.
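
A minimal sketch of the locking approach, assuming hypothetical in-process stand-ins for the cache and database (for illustration only):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BreakdownGuard {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Object rebuildLock = new Object();

    // Hypothetical database query used only for illustration.
    private String dbQuery(String key) { return "value-of-" + key; }

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;                 // cache hit
        }
        synchronized (rebuildLock) {      // only one thread rebuilds the hot key
            value = cache.get(key);       // double-check: another thread may have filled it
            if (value == null) {
                value = dbQuery(key);
                cache.put(key, value);
            }
        }
        return value;
    }
}
```

In a distributed deployment the in-process lock would be replaced by a distributed lock, for example one built on Redis SETNX.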

4.3 Cache Penetration

Cache penetration means a large number of concurrent queries for keys that do not exist at all. Since such requests can never hit the cache, every one of them is sent through to the back-end database, putting it under heavy load. Locking does not work well for cache penetration, because the key itself does not exist: even if you limit the number of threads accessing the database, the requests still arrive. Cache penetration can be resolved by:

  • Validation at the interface layer, returning directly for obviously invalid keys. For example, if the database uses an auto-increment id, a non-integer or negative id can be rejected outright; if it uses a 32-character UUID, any id whose length is not 32 can be rejected directly.

  • Caching non-existent data: directly cache an empty value or some other agreed invalid value. With this solution it is best to give such keys a short expiration time; otherwise a large number of non-existent keys will accumulate in Redis and occupy a lot of memory.

  • Adding a layer of Bloom filter. A Bloom filter works by mapping each stored element to K positions in a bit array via hash functions and setting those bits to 1.

    In this way, when a query arrives for a key A whose bits in the Bloom filter are 0, it is rejected immediately and no request reaches the DB. Obviously one problem with a Bloom filter is false positives, because it is a single bit array and multiple values can map to the same positions; in theory, the longer the array, the lower the false-positive rate, and the trade-off depends on the actual situation (a Guava-based sketch follows this list).
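
A minimal sketch using Guava's BloomFilter (Guava is already mentioned above); the expected insertion count and false-positive rate here are illustrative assumptions:

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

import java.nio.charset.StandardCharsets;

public class PenetrationFilter {
    // Sized for ~1,000,000 keys with a ~1% false-positive rate (illustrative numbers).
    private final BloomFilter<String> existingKeys =
            BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    // Called whenever a key is written to the database, so the filter knows it exists.
    public void recordKey(String key) {
        existingKeys.put(key);
    }

    // Called before hitting cache/DB; definitely-absent keys are rejected immediately.
    public boolean mightExist(String key) {
        return existingKeys.mightContain(key);
    }
}
```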

4.4 Hotspot Data

The so-called hot key problem is that suddenly hundreds of thousands of requests hit one particular key on Redis. The traffic becomes so concentrated that it saturates the bandwidth of the physical network card, bringing the Redis server down and triggering an avalanche. There are several ways to solve the hot key problem:

  • Split the hot key across different servers in advance to reduce the pressure
  • Add a second-level cache and load hot key data into local memory in advance; if Redis goes down, fall back to the in-memory copy (see the sketch after this list)
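
A minimal sketch of a local second-level cache for hot keys using Guava Cache (mentioned earlier); the size, TTL and the redisGet fallback are illustrative assumptions:

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

public class HotKeyLocalCache {
    // Small in-process cache holding only the hottest keys.
    private final Cache<String, String> localCache = CacheBuilder.newBuilder()
            .maximumSize(10_000)                       // bound the heap footprint
            .expireAfterWrite(30, TimeUnit.SECONDS)    // keep hot data reasonably fresh
            .build();

    // Hypothetical Redis read used only for illustration.
    private String redisGet(String key) { return "value-from-redis-" + key; }

    public String get(String key) throws ExecutionException {
        // Serve from local memory first; only fall through to Redis on a miss.
        return localCache.get(key, () -> redisGet(key));
    }
}
```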

4.5 Expiration Policy

Lazy deletion means that a key is checked only when it is queried and is deleted if it has already expired. The obvious drawback is that an expired key that is never accessed again will never be deleted and will keep occupying memory.

Periodic deletion means that Redis checks the database at intervals and deletes expired keys. Since it is impossible to poll every key, Redis randomly samples some keys each time, checks them, and deletes the expired ones.

If expired keys are never caught by the random periodic checks and are never queried (so lazy deletion never triggers either), they remain in Redis. When memory runs out, Redis falls back on its memory eviction policies:

  1. volatile-lru: evict the least recently used key among keys with an expiration time set.

  2. volatile-ttl: evict the key closest to expiring among keys with an expiration time set.

  3. volatile-random: randomly evict a key among keys with an expiration time set.

  4. allkeys-lru: evict the least recently used key among all keys.

  5. allkeys-random: randomly evict a key among all keys.

  6. noeviction: new write operations return an error when the memory threshold is exceeded.

5. Relevant practices

  • Under normal circumstances, the read order is cache first, then database; the write order is database first, then cache;

  • When using a local cache, the number and life cycle of cached objects must be controlled, because JVM memory is limited; otherwise it will affect JVM performance and may even cause an out-of-memory error.

  • Based on the data structure, size, quantity and expiration time of the cached data, estimate the capacity needed over a given period according to the service situation, and apply for and allocate cache resources based on that evaluation; otherwise cache resources are easily wasted or the cache space falls short.

  • When writing to the cache, be sure to write complete and correct data;

  • Physically separate caches for different services. Core services and non-core services should use different cache instances; if possible, use a separate instance or cluster for each service to reduce the chance of applications affecting one another.

  • All cache instances need to be monitored, such as slow queries, large objects, and memory usage.

  • Use a naming convention such as (system:module:function) + key as the prefix of each application's keys to keep them isolated and avoid keys overwriting one another;

  • Every cache key should have an expiration time, and expiration times should not be concentrated at a single point; add a random offset, otherwise the cache will fill up memory or a cache avalanche will occur;

  • The amount of data per cached key should not be too large, especially for Redis: if a single cache key holds too much data, processing it will block other requests.

  • For keys that store a large number of values, do not use full-collection operations such as HGETALL, which block other requests.

  • If you need to update a large amount of data, especially when batch processing, use batch mode.

  • When using a cache, there must be a degradation path: if the cache has problems or fails, requests can fall back to the database;
