Introduction to Redis

Redis is an in-memory database written in C. Because its data is stored in memory, reads and writes are very fast.

Application scenarios

  • Highly concurrent reads and writes of hot data
  • Reads and writes of massive amounts of data
  • Data with high scalability requirements

The difference between a distributed cache and a local cache

|                 | Distributed cache | Local cache |
|-----------------|-------------------|-------------|
| Consistency     | Good | Weak; each instance has its own copy of the cache |
| Heap footprint  | None | Occupies heap memory and affects garbage collection |
| Speed           | Slower, because network transport and serialization are required | Faster |
| Usage scenarios | Heavy traffic where data consistency is required | Heavy traffic where consistency requirements are loose |

Implementation of local cache:

  • Use a concurrency-safe data structure such as ConcurrentHashMap
  • Use the open-source caching framework Ehcache, which encapsulates in-memory cache operations
  • Use Guava Cache, part of Google's open-source Guava toolkit, which provides a bounded in-memory cache
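As a minimal sketch of the local-cache idea (Python standing in for the ConcurrentHashMap approach above; the class and method names are hypothetical, not an Ehcache or Guava API):

```python
import threading
import time


class LocalCache:
    """A tiny thread-safe local cache with per-entry TTL (illustrative sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, expire_at or None)

    def put(self, key, value, ttl_seconds=None):
        expire_at = time.monotonic() + ttl_seconds if ttl_seconds is not None else None
        with self._lock:
            self._data[key] = (value, expire_at)

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return None
            value, expire_at = entry
            if expire_at is not None and time.monotonic() >= expire_at:
                del self._data[key]  # lazily drop the expired entry on read
                return None
            return value


cache = LocalCache()
cache.put("user:1", {"name": "alice"}, ttl_seconds=60)
```

Because the cache lives in the process heap, every instance of the application holds its own copy, which is exactly the weak-consistency trade-off from the table above.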

Difference between Redis and Memcache

|             | Redis | Memcache |
|-------------|-------|----------|
| Persistence | Supported (RDB/AOF) | None; data is lost on restart |
| Data types  | String, hash, list, set, zset | Key-value only |
| Speed       | Fast | Slower |

Cache processing flow

Common data structures

  • String

A single String value can hold up to 512 MB.

  • List

A list is a sequence of strings sorted by insertion order. Elements can be pushed onto the head (left) or the tail (right) of the list. A list holds up to 2^32 - 1 elements. Lists can be used to build message queues.
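The message-queue usage can be sketched like this (a `deque` stands in for the Redis list; with real Redis the equivalent commands are `LPUSH` for the producer and `RPOP`/`BRPOP` for the consumer):

```python
from collections import deque

queue = deque()  # stand-in for a Redis list


def lpush(q, value):
    """Producer: push onto the head (left), like LPUSH."""
    q.appendleft(value)


def rpop(q):
    """Consumer: pop from the tail (right), like RPOP. Returns None when empty."""
    return q.pop() if q else None


lpush(queue, "job-1")
lpush(queue, "job-2")
# Consuming from the opposite end gives first-in, first-out order.
```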

  • Hash

A Redis hash is a collection of key-value pairs: a mapping table of String-type fields and values. Hashes are especially suitable for storing objects. A hash holds up to 2^32 - 1 field-value pairs.

  • Set

A Redis set is an unordered collection of unique String members. A set holds up to 2^32 - 1 members.

  • Sorted set (zset)

A zset, like a set, is a collection of String elements with no duplicate members. The difference is that each element in a zset is associated with a score of type double, and Redis uses this score to sort the elements from smallest to largest. Members of a zset are unique, but scores may repeat. A zset holds up to 2^32 - 1 members. Zsets are a good fit for leaderboards.
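The leaderboard idea can be sketched as follows (a plain dict mirrors the zset semantics: unique members, possibly duplicate scores; function names echo the real `ZADD`/`ZREVRANGE` commands but this is not the Redis API):

```python
scores = {}  # member -> score, like a zset


def zadd(member, score):
    # Re-adding an existing member updates its score, as in a zset.
    scores[member] = score


def zrevrange(start, stop):
    """Members ordered by score, highest first (inclusive range, like ZREVRANGE)."""
    ranked = sorted(scores, key=lambda m: scores[m], reverse=True)
    return ranked[start:stop + 1]


zadd("alice", 120.0)
zadd("bob", 95.5)
zadd("carol", 120.0)  # a duplicate score is allowed; the member is still unique
```

Real Redis keeps this ordering incrementally (via a skip list, see below) instead of re-sorting on every read.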

Zset underlying implementation

This article explains the skip-list implementation in detail:

www.jianshu.com/p/dc252b5ef…

Expired-key deletion policies

Common deletion strategies:

  • Timed deletion: when the expiration time is set, create a timer that deletes the key the moment it expires
  • Lazy deletion: do nothing up front; on each access, check whether the key has expired and delete it if so (passive deletion)
  • Periodic deletion: scan the database at regular intervals and delete expired keys

Redis combines lazy deletion with periodic deletion to manage expired keys. This both reduces CPU pressure and keeps the data reasonably accurate.
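The lazy + periodic combination can be sketched like this (a simplified illustration, not Redis internals; real Redis samples keys per database in its server cron):

```python
import random
import time

store = {}    # key -> value
expires = {}  # key -> absolute expiration timestamp


def set_key(key, value, ttl=None):
    store[key] = value
    if ttl is not None:
        expires[key] = time.monotonic() + ttl
    else:
        expires.pop(key, None)


def get_key(key):
    # Lazy deletion: the expiry check happens only when the key is accessed.
    if key in expires and time.monotonic() >= expires[key]:
        store.pop(key, None)
        expires.pop(key, None)
    return store.get(key)


def periodic_sweep(sample_size=20):
    # Periodic deletion: sample some keys that carry a TTL and purge expired ones,
    # so CPU cost stays bounded even with many expiring keys.
    now = time.monotonic()
    for key in random.sample(list(expires), min(sample_size, len(expires))):
        if now >= expires[key]:
            store.pop(key, None)
            expires.pop(key, None)


set_key("session", "abc", ttl=0)  # expires immediately
```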

Memory eviction mechanism

Because lazy and periodic deletion cannot guarantee that expired keys are removed before memory fills up, Redis also needs a memory eviction mechanism. There are six eviction policies:

  • noeviction: Write requests are no longer served, but read requests are still allowed. No data is lost, but the online business cannot continue writing. This is the default policy.
  • volatile-lru: Evicts only keys that have an expiration time, least recently used first. Keys without an expiration time are never evicted, so data that must persist is not suddenly lost. (This is the most commonly used policy.)
  • volatile-ttl: Same scope as above, but keys are ranked by remaining time to live (TTL) rather than by LRU; the smaller the TTL, the sooner the key is evicted.
  • volatile-random: Randomly evicts keys from the set of keys that have an expiration time (server.db[i].expires).
  • allkeys-lru: Unlike volatile-lru, this policy evicts from the entire key space, not just keys with an expiration time, so keys without a TTL can also be evicted.
  • allkeys-random: Randomly evicts keys from the entire key space (server.db[i].dict).
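The LRU idea behind the two `-lru` policies can be sketched with an ordered map (a textbook exact-LRU illustration; note real Redis uses an approximate LRU that samples a few keys rather than tracking exact recency):

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used key once capacity is exceeded
    (the allkeys-lru idea, ignoring TTLs)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used key


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touching "a" makes it most recently used
cache.put("c", 3)  # capacity exceeded: "b" is evicted, not "a"
```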

Persistence mechanism

  • RDB: Redis periodically dumps the in-memory dataset to an RDB file on disk for persistence.
  • AOF: Redis appends each write operation to a log file.

RDB

RDB persistence writes a point-in-time snapshot of the in-memory dataset to disk at configured intervals. Concretely, Redis forks a child process that writes the dataset to a temporary file; once the write succeeds, the temporary file replaces the previous RDB file. The file is stored in a compact binary format.

Advantages:

  1. RDB is a compact binary file suitable for backup, full copy, and other scenarios
  2. RDB recovers data much faster than AOF

Disadvantages:

  1. Real-time (or even second-level) persistence is not possible
  2. RDB file formats may be incompatible across old and new Redis versions
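The snapshot intervals are configured in redis.conf; as a sketch, the classic stock thresholds look like this (directive names are real; tune the values to your workload):

```conf
# Take a snapshot if at least 1 key changed within 900 s,
# 10 keys within 300 s, or 10000 keys within 60 s.
save 900 1
save 300 10
save 60 10000

# Name of the snapshot file on disk.
dbfilename dump.rdb
```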

AOF

AOF persistence records every write and delete operation processed by the server as an append-only log. Read operations are not recorded. The log is stored as text, so you can open the file and inspect the recorded operations.

Advantages:

  1. Better protection against data loss
  2. Append-only writes give high write performance
  3. Suitable for emergency recovery after catastrophic accidental deletions

Disadvantages:

  1. For the same dataset, an AOF file is larger than an RDB snapshot
  2. AOF persistence can have an impact on QPS
  3. Recovery from an AOF file is slow, so it is not well suited for cold backups
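A minimal AOF configuration sketch in redis.conf (the directive names are the real ones; the values are illustrative):

```conf
appendonly yes                   # turn AOF persistence on
appendfilename "appendonly.aof"  # log file name

# How often to fsync the log:
#   always   - safest, slowest
#   everysec - fsync once per second (common choice)
#   no       - leave it to the operating system
appendfsync everysec
```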

Cache penetration

Cache penetration happens when requests query data that exists neither in the cache nor in the database: every such request bypasses the cache and hits the database.

Solutions:

  • Bloom filter

Store all valid query keys in a bitmap. Before querying, check the bitmap: if the key may exist, continue to the underlying cache and database; if it definitely does not exist, reject the request immediately.

Bloom filters can also be used to implement data dictionaries, deduplicate data, and compute set intersections.
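A minimal bloom-filter sketch (parameters `m_bits` and `k_hashes` are illustrative; a production filter would size them from the expected key count and target false-positive rate):

```python
import hashlib


class BloomFilter:
    """k hash functions over an m-bit array. A bloom filter can report
    false positives but never false negatives, which is why a negative
    answer can safely reject a request before it reaches the database."""

    def __init__(self, m_bits=1024, k_hashes=3):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # Derive k independent positions by salting the hash input.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter()
bf.add("user:42")
# bf.might_contain("user:42") is guaranteed True;
# absent keys are rejected with high probability.
```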

  • Caching empty objects

Cache the empty result directly. This has two problems:

  1. The cache stores many extra key-value pairs, which a malicious attacker can exploit to waste memory; this can be mitigated by setting a short expiration time on the empty entries.
  2. The database and cache can become inconsistent; data updates can be propagated via asynchronous messages.
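A sketch of caching empty objects (`DB`, `get_user`, and the TTL values are hypothetical; the point is that the miss itself gets cached with a short expiry):

```python
import time

DB = {"user:1": {"name": "alice"}}  # stand-in for the database
cache = {}                          # key -> (value, expire_at)
EMPTY = object()                    # sentinel marking "known to be missing"


def get_user(key, ttl_empty=60, ttl_hit=300):
    hit = cache.get(key)
    if hit is not None and time.monotonic() < hit[1]:
        value = hit[0]
        return None if value is EMPTY else value
    value = DB.get(key)  # fall through to the database
    if value is None:
        # Cache the miss with a short TTL so repeated lookups for a
        # nonexistent key stop hammering the database.
        cache[key] = (EMPTY, time.monotonic() + ttl_empty)
        return None
    cache[key] = (value, time.monotonic() + ttl_hit)
    return value
```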

Cache avalanche

When a large number of cached keys expire within a short period, database pressure spikes; this is a cache avalanche.

Solutions:

  • Spread out expiration times so keys do not all expire at once
  • Rate-limit access to the database
  • Use a multi-level cache design

Cache breakdown

The data is missing from the cache but present in the database. Because many users request it concurrently, database pressure spikes the moment the cache misses.

Solutions:

  1. Set hotspot data to never expire

  2. Add a mutex so that only one thread rebuilds the cache entry.
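The mutex approach can be sketched as follows (a single-process illustration with an in-process lock; in a distributed deployment this would be a distributed lock, e.g. built on Redis `SETNX`):

```python
import threading

cache = {}
db = {"hot": "value-from-db"}  # stand-in for the database
rebuild_lock = threading.Lock()


def get_hot(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread rebuilds the entry; the others block here and,
    # on the double-check below, find the freshly written value.
    with rebuild_lock:
        value = cache.get(key)  # double-check after acquiring the lock
        if value is None:
            value = db[key]     # the expensive database read happens once
            cache[key] = value
    return value
```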

Cache update strategy

Update the database first, then the cache

  • Can cause thread safety issues

Two threads updating the data at the same time can leave dirty data in the cache.

  • Updating the cache is relatively complex

Because the cached value is usually derived from the raw data through a series of computations, recomputing it on every write is wasteful.

Delete the cache first, then update the database

This can lead to inconsistency: after thread A deletes the cache but before it updates the database, thread B's read misses the cache, loads the old value from the database, and writes it back into the cache, which is then stale.

The fix is to delete the cache again after the data is successfully written (delayed double deletion).

Update the database first, then delete the cache

There may be a brief window of inconsistency between the successful database update and the cache deletion, which is usually acceptable.
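This strategy is the classic cache-aside pattern; a minimal sketch (the `db` dict stands in for the database):

```python
cache = {}
db = {"item:1": "v1"}  # stand-in for the database


def read(key):
    # Read path: cache first, then database, then populate the cache.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


def write(key, value):
    # Write path: update the database first, then delete the cache.
    # Between the two steps a reader may still see the old cached
    # value -- the brief, acceptable inconsistency described above.
    db[key] = value
    cache.pop(key, None)
```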