Redis is one of the most important technologies in distributed systems, and it rewards the time you spend on it.

A tree that fills the arms grows from a tiny sprout; a terrace of nine stories rises from a mound of earth.

Redis website

Cache principles & design

Part 1 – Basic ideas

Usage Scenario

Relieve pressure on the DB

Earliest stage: application <==> DBMS (data on disk). As page views grow (tens of thousands): master-slave architecture + splitting databases and tables. As page views grow further (tens of thousands to millions): add a cache (data in memory).

Improve response speed

Database data lives in files on disk and must be swapped into memory before it can be served.

Under a burst of simultaneous accesses (high concurrency), a single MySQL instance cannot keep up because of frequent disk I/O, and InnoDB row locks add contention on top.

The data is cached in Redis, that is, in memory. Memory naturally supports high concurrent access. Can handle a large number of requests instantly.

Redis: up to roughly 110,000 read requests/s and 80,000 write requests/s.

Session separation

Traditionally, sessions are maintained and managed by Tomcat itself. In a clustered or distributed environment, each Tomcat manages its own sessions, and session replication between Tomcat servers happens over the network and disk I/O, which badly hurts system performance.

  1. Session replication between Tomcat servers results in performance loss
  2. Session data synchronization of Tomcat cannot be guaranteed

The Session information after successful login is stored in Redis, so that multiple servers (Tomcat) can share the Session information. Redis is used for temporary storage of data

Distributed locks (Redis)

Ordinarily a lock coordinates multiple threads inside a single process.

When multiple processes (JVMs) run concurrently, the same race and ordering problems arise between processes.

A distributed lock solves this; Redis can implement one with SETNX.
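A minimal sketch of the SETNX idea, using a plain Python dict to stand in for Redis (FakeRedis, acquire_lock, and release_lock are illustrative names, not a real client API):

```python
import uuid

class FakeRedis:
    """A dict standing in for Redis, just to illustrate SETNX semantics."""
    def __init__(self):
        self.store = {}

    def setnx(self, key, value):
        # SET if Not eXists: only the first caller wins
        if key in self.store:
            return 0
        self.store[key] = value
        return 1

    def delete(self, key):
        self.store.pop(key, None)

def acquire_lock(r, name):
    token = str(uuid.uuid4())          # unique owner token
    return token if r.setnx("lock:" + name, token) else None

def release_lock(r, name, token):
    # only the owner may release, to avoid deleting someone else's lock
    if r.store.get("lock:" + name) == token:
        r.delete("lock:" + name)
        return True
    return False
```

In production you would also give the lock key an expiration (SET key value NX EX seconds) so a crashed holder cannot block everyone forever.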

Optimistic locks (Redis)

Synchronized locks and database row/table locks are pessimistic locks.

Pessimistic locks perform poorly and respond slowly.

High-performance, high-responsiveness scenarios (such as flash sales) use optimistic locks.

Redis can implement an optimistic lock with WATCH + INCR.
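The optimistic pattern can be sketched as a compare-and-set retry loop — a simplified model of WATCH aborting a transaction when the watched key changed (Counter and optimistic_decrement are illustrative names, not Redis API):

```python
class Counter:
    """Stand-in for a watched Redis key: a value plus a version number."""
    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def compare_and_set(self, expected_version, new_value):
        # Succeeds only if nobody modified the key since we read it,
        # mirroring WATCH + MULTI/EXEC aborting on interference.
        if self.version != expected_version:
            return False
        self.value = new_value
        self.version += 1
        return True

def optimistic_decrement(stock):
    """Decrement stock by 1 unless sold out, retrying on interference."""
    while True:
        value, version = stock.read()      # WATCH: remember what we saw
        if value <= 0:
            return False                   # sold out
        if stock.compare_and_set(version, value - 1):
            return True                    # EXEC succeeded
        # EXEC "failed": someone changed the key; re-read and retry
```

No thread ever blocks holding a lock; a loser of the race simply retries, which is why this scales well for flash-sale style workloads.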

“What is caching?”

Cache originally referred to a small, high-speed memory close to the CPU: it exchanges data with the CPU ahead of main memory and is extremely fast. The term now broadly means any replica of original data kept on a computer for quick access.

In Internet systems, caching is one of the key techniques for making a system respond quickly; it is the classic art of trading space for time.

When a program runs, it first looks for data in the CPU cache, then in main memory, and finally on disk.

“Cache Use in Large Sites”

“Common Cache Categories”

  • Client cache
    1. Page caching: Cookie, WebStorage (SessionStorage and LocalStorage), WebSQL, IndexedDB, Application Cache, etc.
    2. Browser cache: When a client requests a resource from the server, it reaches the browser cache first. If the browser has a copy of the resource to request, it can fetch the resource directly from the browser cache rather than from the original server.
      • Forced cache: Use the browser’s cached data directly
      • Negotiated cache: if the resource on the server is unmodified, the browser cache is used (304); otherwise the fresh server resource is returned (200)
    3. App cache: Native apps cache data in memory, file, or local database (SQLite). Like image files.
  • Network side cache
    1. Web Proxy Cache: Nginx, etc., can cache static resources from native servers
    2. Edge caching: For example, a commercialized Content Delivery Network (CDN) can be deployed on edge servers to enable users to obtain Content nearby, reduce Network congestion, and improve user access response speed and hit ratio.
  • Server side cache
    1. Database-level cache: uses the database's internal buffers. For example, MySQL's InnoDB engine caches indexes and data blocks in its buffer pool
    2. Platform-level cache: application frameworks with caching features, such as Guava Cache, Ehcache (two-level cache, can spill to disk), and OSCache (page cache). Because it lives on the application server, it is also called a server-local cache.
    3. Application-level cache (the focus here): middleware with caching capabilities: Redis (Alibaba, Baidu), Memcached, EVCache (AWS), Tair (Alibaba, Meituan), etc.

Part 2 – Advantages and Costs of caching

“The advantage”

  1. Fast response speed improves user experience
  2. Reduce server stress
  3. Improve system throughput and number of concurrent users

“Price”

  1. Extra hardware cost (which can be avoided by using cloud-hosted services)
  2. Under high concurrency, a cache failure (cache penetration, avalanche, breakdown) can make DB pressure spike and even crash it
  3. The cache cannot stay synchronized with the DB in real time
  4. Cache concurrency contention (when multiple clients set the same key at the same time, the execution order causes consistency problems)

Part 3 – Read-write mode

“Cache Aside Pattern”

The classic cache + database read/write pattern: read from the cache first; on a miss, read the database, put the result into the cache, and return the response.

Read ==> Try the cache first; if the data is not there, read it from the database, backfill the cache, and then return the response.

Write ==> Update the database first, then delete the cache entry. (Delete rather than update: a cached value is often a structure such as a List or Hash, and rewriting it on every write would require a traversal, which is very slow for tens of thousands of entries.)

Dirty reads with high concurrency:

  1. Update the database first, then update the cache: between the two steps the database holds {A:2} while the cache still holds {A:1}, so a reader sees "A is 1" — dirty data.
  2. Delete the cache first, then update the database: after the delete but before the update, a reader misses the cache, loads the old value from the database, and can repopulate the cache with stale data — a dirty read.
  3. Even "update then delete" is not airtight: while the update transaction has not yet committed, a reader can still fetch the old, not-yet-committed data from the database.

Dirty reads are unavoidable under high concurrency, but "update the database first, then delete the cache" has the lowest probability of staleness, and follow-up measures (delayed double delete, etc.) can reduce the error rate further.
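The cache-aside read and write paths above can be sketched in a few lines — dicts stand in for Redis and the database, and the class/method names are illustrative:

```python
class CacheAside:
    """Minimal cache-aside: read-through on miss, delete-on-write."""
    def __init__(self, db):
        self.db = db          # dict standing in for the database
        self.cache = {}       # dict standing in for Redis

    def read(self, key):
        if key in self.cache:               # cache hit
            return self.cache[key]
        value = self.db.get(key)            # miss: go to the database
        if value is not None:
            self.cache[key] = value         # backfill the cache
        return value

    def write(self, key, value):
        self.db[key] = value                # 1. update the database
        self.cache.pop(key, None)           # 2. delete (not update) the cache
```

Note that write deletes the cached entry instead of rewriting it; the next read repopulates it from the fresh database value.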

“Read/Write Through Pattern (read/write through the cache)”

The application talks only to the cache; the cache itself talks to the database.

  • Read through: the application reads the cache; on a miss the cache reads the database, stores the value, and returns it
  • Write through: the application writes only to the cache; the cache synchronously writes the value through to the database

This pattern requires the cache to embed a database handler, so it is more complex to build. Guava Cache is a representative implementation.

“Write Behind Caching Pattern”

The application only updates the cache. The cache asynchronously updates data to the DB in batches or after merging.

Synchronization is not real time, and data can even be lost if the cache fails before flushing.
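A write-behind cache can be sketched as a dirty-key set flushed to the DB in batches — dicts stand in for the cache and database, and the batch threshold is an illustrative assumption:

```python
class WriteBehindCache:
    """Write-behind sketch: writes touch only the cache; a flush pushes
    the accumulated dirty keys to the DB in one batch."""
    def __init__(self, db, batch_size=3):
        self.db = db              # dict standing in for the database
        self.cache = {}
        self.dirty = set()        # keys written but not yet persisted
        self.batch_size = batch_size

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)
        if len(self.dirty) >= self.batch_size:
            self.flush()          # batch threshold reached

    def flush(self):
        for key in self.dirty:    # one merged batch to the DB
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Everything in `dirty` that has not been flushed is exactly the data that would be lost if the cache process died, which is the trade-off the pattern accepts for write throughput.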

Part 4 – Cache architecture design ideas

“Multi-level”

Even if the distributed cache goes down, the local cache can still be used

“Data type”

  • Simple data types:
    • When the following conditions are met:
      1. Value is a string or binary blob
      2. Value is large (>100 KB)
      3. Only simple set/get operations are needed

    Memcached is often chosen: a pure in-memory, multi-threaded key-value cache

  • Complex data types:
    • When the following conditions are met:
      1. Value is Hash, Set, List…
      2. Relationships need to be stored, aggregated, computed…

    Redis is often chosen

“To Cluster”

Consider: Redis (distributed deployments), Codis, Sentinel + master/slave, Redis Cluster…

Data types and underlying data structures

Part 1 – Redis data types and application scenarios

Redis is a key-value storage system written in ANSI C language

  • Key: string (case sensitive)
  • Value:
    1. Common types are string, list, set, sortedSet (zset), and hash
    2. Bitmap, geo Geographic location
    3. Stream (new in Redis5.0)

“Redis key Design”

  • Key naming is much freer than identifiers in most languages: almost any character is allowed except spaces and newlines (\n)

    Counter-examples: "my key" ✗, "mykey\n" ✗
  • The form of a key is up to you
    • Recommended convention: TABLE_NAME:PRIMARY_KEY_VALUE:COLUMN_NAME ==> table name : primary key value : column name
    • Example: user:9:username
  • Avoid keys that are too short (unreadable) or too long (wastes memory); above all keep the semantics clear:
    • u:1000:pwd ✗
    • user:1000:password ✓

Common key commands

  1. del key: deletes the key
  2. type key: shows the key's type
  3. keys [pattern]: fuzzy search
    • keys *: matches all keys in the current database
    • keys n?me: matches name, neme, ngme…
    • keys n*me: matches name, naaaaame…
    • keys n[ae]me: matches only name and neme
  4. expire key seconds: sets the key's expiration time
  5. ttl key: checks the key's remaining time to live (-2: key does not exist / -1: no expiration set)
  6. exists key: checks whether the key exists

“String type”

String covers three value types: string, integer, and floating point. A value can store up to 512 MB of data, including JSON or binary image data.

【 Common operations 】

  1. set key value: assigns a value, overwriting any existing one
  2. get key: reads the value; returns nil if the key does not exist
  3. getset key value: reads the old value and assigns a new one (returns nil if the key did not exist)
  4. setex key seconds value: sets the value together with an expiration time (in seconds)
  5. setnx key value: assigns only if the key does not exist (used for distributed locks)
  6. append key value: appends to the tail of the value; if the key does not exist, behaves like set
  7. strlen key: returns the length of the string
  8. incr key: increments the number (used for optimistic locking)
  9. incrby key increment: adds a given integer
  10. decr key: decrements the number
  11. decrby key decrement: subtracts a given integer
  12. getrange key start end: extracts the substring from start to end (indices start at 0)

In real development, when multithreaded safety is involved, avoid locks where you can: locking is slow (threads keep checking for the lock, waiting, and being woken up). For such problems, consider solving them with Redis instead — Redis executes commands in a single thread.

“List type”

A list stores ordered, repeatable elements; access near the head or tail is extremely fast. A list can hold up to 2^32-1 (about 4 billion) elements.

【 Common operations 】

  1. lpush key v1 v2 …: pushes values onto the left of the list
  2. lpop key: pops from the left of the list
  3. rpush key v1 v2 …: pushes values onto the right of the list
  4. rpop key: pops from the right of the list
  5. lpushx key value: pushes onto the head only if the list already exists
  6. rpushx key value: pushes onto the tail only if the list already exists
  7. blpop key timeout: pops from the left, blocking while the list is empty; timeout is the maximum blocking time in seconds
  8. brpop key timeout: pops from the right, blocking while the list is empty; timeout is the maximum blocking time in seconds
  9. llen key: returns the number of elements in the list
  10. lindex key index: returns the element at index (indices start at 0)
  11. lrange key start end: returns the elements in the range start to end; -1 means the last index
  12. lrem key count value: removes elements equal to value
    • count>0: lrem removes from the left of the list
    • count<0: lrem removes from the right of the list
    • count=0: lrem removes every element equal to value
  13. lset key index value: sets the element at index to value
  14. ltrim key start end: trims the list, keeping only the range start to end
  15. rpoplpush key1 key2: pops from the right of list key1 and pushes onto the left of list key2
  16. brpoplpush key1 key2 timeout: like rpoplpush, but blocks while key1 is empty
  17. linsert key BEFORE|AFTER pivot value: inserts value before or after the element pivot
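The count semantics of lrem are easy to get wrong; here is a sketch mimicking them on a Python list (index 0 plays the role of the list head):

```python
def lrem(lst, count, value):
    """Mimic LREM key count value on a Python list (left = index 0)."""
    if count > 0:                       # remove from the left, at most count
        removed = 0
        out = []
        for x in lst:
            if x == value and removed < count:
                removed += 1            # skip (i.e. delete) this occurrence
            else:
                out.append(x)
        return out
    if count < 0:                       # remove from the right, at most |count|
        out = lrem(list(reversed(lst)), -count, value)
        return list(reversed(out))
    return [x for x in lst if x != value]   # count == 0: remove all
```

The negative-count branch simply reverses the list, reuses the positive-count logic, and reverses back.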

“Set type”

Unordered and unique; a set can hold up to 2^32-1 members.

【 Common operations 】

  1. sadd key mem1 mem2 …: adds members to the set
  2. srem key mem1 mem2 …: removes the given members from the set
  3. smembers key: returns all members of the set
  4. spop key: returns a random member and removes it
  5. srandmember key: returns a random member without removing it
  6. scard key: returns the number of members in the set
  7. sismember key member: tests whether member is in the set
  8. sinter key1 key2 key3: intersection of several sets
  9. sdiff key1 key2 key3: difference of several sets (members of key1 not in the others)
  10. sunion key1 key2 key3: union of several sets
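Python's built-in set operators map one-to-one onto the last three commands, which makes the semantics easy to check:

```python
# Python's set operators line up with Redis's set commands
key1 = {"a", "b", "c"}
key2 = {"b", "c", "d"}

sinter = key1 & key2        # SINTER key1 key2
sdiff  = key1 - key2        # SDIFF key1 key2 (members of key1 not in key2)
sunion = key1 | key2        # SUNION key1 key2
```

Note that sdiff, like SDIFF, is order-sensitive: it is the first key minus the rest, not a symmetric difference.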

“Sorted set (zset) type”

Members are unique; each member is associated with a score, and the set is ordered by score (scores may repeat even though members may not).

【 Common operations 】

  1. zadd key score1 member1 score2 member2 …: adds members to the sorted set

    (a score can be an integer or a float; +inf means positive infinity and -inf negative infinity)
  2. zrem key mem1 mem2: removes the given members
  3. zcard key: returns the number of members
  4. zcount key min max: counts the members whose score lies in [min, max]
  5. zincrby key increment member: increments a member's score
  6. zscore key member: returns a member's score
  7. zrank key member: returns a member's rank (ascending by score)
  8. zrevrank key member: returns a member's rank (descending by score)
  9. zrange key start end [withscores]: returns the members in the given rank range, ascending by score
  10. zrevrange key start end [withscores]: returns the members in the given rank range, descending by score
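A toy model of zadd/zrange/zrank over a plain dict shows the score-then-member ordering (these helper functions are illustrative, not a client API):

```python
def zadd(zset, score, member):
    zset[member] = score                     # re-adding a member updates its score

def zrange(zset, start, end, withscores=False):
    # sort by (score, member) ascending, mirroring Redis's tie-breaking
    ordered = sorted(zset.items(), key=lambda kv: (kv[1], kv[0]))
    stop = len(ordered) if end == -1 else end + 1
    part = ordered[start:stop]
    return part if withscores else [m for m, _ in part]

def zrank(zset, member):
    ordered = sorted(zset.items(), key=lambda kv: (kv[1], kv[0]))
    for i, (m, _) in enumerate(ordered):
        if m == member:
            return i                         # 0-based rank, as in Redis
    return None
```

Real Redis gets O(log n) ranks from a skip list instead of re-sorting, but the observable ordering rules are the same.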

“Hash type”

Redis Hash is a mapping table of fields and values of type string that provides mapping of fields and field values.

【 Common operations 】

  1. hset key field value: sets a field
  2. hmset key field1 value1 field2 value2: sets several fields at once
  3. hsetnx key field value: sets a field only if it does not already exist
  4. hexists key field: tests whether a field exists
  5. hget key field: returns a field's value
  6. hmget key field1 field2: returns several fields' values
  7. hgetall key: returns every field and value under the key
  8. hdel key field1 field2: deletes the given fields
  9. hincrby key field increment: increments a field by the given amount
  10. hlen key: returns the number of field-value pairs

“Bitmap bitmap Type”

A bitmap is manipulated bit by bit: a single bit represents an element's value or state, and the key identifies the element itself.

【 Common operations 】

  1. setbit key offset value: sets the bit at offset to 1 or 0
  2. getbit key offset: returns the bit at offset
  3. bitcount key: counts the bits set to 1 in the key
  4. bitpos key value: returns the index of the first bit equal to value
  5. bitop AND|OR|XOR|NOT destkey key [key …]: combines the keys with the bitwise operation and stores the result in destkey
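The setbit/getbit/bitcount trio can be modelled over a growable bytearray — a sketch of the semantics, with the buffer growing on demand just as Redis extends the underlying string:

```python
class Bitmap:
    """SETBIT/GETBIT/BITCOUNT semantics over a growable bytearray."""
    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):                  # grow, zero-filled, like Redis
            self.buf.extend(b"\x00" * (byte - len(self.buf) + 1))
        mask = 1 << (7 - bit)                      # bit 0 is the most significant
        old = (self.buf[byte] & mask) != 0
        if value:
            self.buf[byte] |= mask
        else:
            self.buf[byte] &= ~mask
        return int(old)                            # SETBIT returns the old bit

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            return 0                               # unset bits read as 0
        return (self.buf[byte] >> (7 - bit)) & 1

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)
```

This is why a daily-active-users bitmap is so compact: one bit per user ID, and bitcount gives the total in a single pass.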

“Geo Location Type”

Geo is Redis's support for location data, official since Redis 3.2. It builds on the Z-order curve, Base32 encoding, and the geohash algorithm.

  • Z-order curve: convert the x and y coordinates to binary and interleave their bits; connecting the resulting values from smallest to largest traces a Z-shaped curve. The Z-order curve is a way of flattening multiple dimensions into one.
  • Base32 encoding: a data-encoding scheme mainly used to turn binary data into a visible string. Its rule: split the binary data into groups of 5 bits (Base64 uses groups of 6), and encode each group as one visible character. The character set has 32 characters (0-9 and b-z, excluding a, i, l, and o), which is where the name Base32 comes from.
  • Geohash algorithm: Gustavo launched geohash.org in February 2008. Geohash is a way of encoding geographic location information: the latitude/longitude of any point on Earth maps to a short string that can easily be stored in a database, attached to mail, or used in other services. For example, the Beijing coordinates [39.928167, 116.389550] map to wx4g0s8q3jf9.

In Redis, a longitude/latitude pair is encoded as a 52-bit integer stored in a zset: the element is the member and the 52-bit geohash value is the score. Geo queries are therefore internally zset (skiplist) operations: sorting by score finds nearby elements, and decoding a score back recovers an element's original coordinates.
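To make the Z-order/geohash idea concrete, here is a minimal sketch of the two core steps — binary subdivision of a coordinate range and bit interleaving. The 4-bit precision and function names are illustrative; Redis actually uses 26 bits per coordinate to form its 52-bit score:

```python
def quantize(value, lo, hi, bits):
    """Binary-subdivide [lo, hi] to turn a coordinate into a bit string:
    emit 1 and keep the upper half if value >= midpoint, else emit 0."""
    out = []
    for _ in range(bits):
        mid = (lo + hi) / 2
        if value >= mid:
            out.append("1"); lo = mid
        else:
            out.append("0"); hi = mid
    return "".join(out)

def interleave_bits(lon_bits, lat_bits):
    """Z-order step of geohash: weave two bit strings into one
    (longitude bits in even positions, latitude bits in odd ones)."""
    out = []
    for lo, la in zip(lon_bits, lat_bits):
        out.append(lo)
        out.append(la)
    return "".join(out)
```

The interleaved string is what gets chunked into 5-bit groups for Base32, and, interpreted as an integer, what Redis stores as the zset score.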

【 Common operations 】

  1. geoadd key longitude1 latitude1 member1 longitude2 latitude2 member2 …: adds geographic coordinates
  2. geohash key member1 member2 …: returns standard geohash strings for the members
  3. geopos key member1 member2 …: returns the members' longitude and latitude
  4. geodist key member1 member2 unit: computes the distance between two members
  5. georadius key longitude latitude radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count]:

    Returns all of the location elements contained by the key that are within a given maximum distance from the center, centered at a given latitude and longitude

    A range can use one of the following units:
    • m meters / km kilometers / mi miles / ft feet

    The command returns additional information when given the following options:

    • WITHDIST returns each element together with its distance from the center
    • WITHCOORD returns each element's longitude and latitude as well
    • WITHHASH returns each element's raw geohash-encoded sorted-set score as a 52-bit signed integer

    COUNT count returns only the first count matching elements

  6. georadiusbymember key member radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] : Like the GEORADIUS command, you can find elements in a specified range, but the center point of GEORADIUSBYMEMBER is determined by the given location element

“Stream Data stream type”

Stream is a data structure added in Redis 5.0 that provides a persistent message queue

Each stream has a unique name, which is its Redis key, created automatically the first time the xadd command appends a message to it

【 Common operations 】

  1. xadd key ID|* field1 value1: appends a message to the stream (key); * generates a new ID (current time + sequence number)
  2. xread [COUNT count] [BLOCK milliseconds] STREAMS key [key …] ID [ID …]: reads from the stream

    BLOCK: maximum blocking time in milliseconds (non-blocking by default); key: stream name; ID: message ID
  3. xrange key start end [COUNT n]: reads the messages in the given ID range; COUNT limits the number returned (IDs ascending)
  4. xrevrange key start end [COUNT n]: reads the messages in the given ID range; COUNT limits the number returned (IDs descending)
  5. xdel key ID: deletes a message from the stream
  6. xgroup create key groupname ID: creates a consumer group
  7. xgroup destroy key groupname: deletes a consumer group
  8. xgroup delconsumer key groupname cname: removes a consumer from a group
  9. xgroup setid key ID: resets the group's last-delivered message ID
  10. xreadgroup GROUP groupname consumer [COUNT n] STREAMS key ID: reads from the stream as the given consumer of the group (creating the consumer if it does not yet exist)

Part 2 – Redis underlying data structures

Redis has no concept of tables. An instance's databases are distinguished by number, and each db is itself a namespace for its keys.

“RedisDB structure”

Redis does have the notion of a "database", defined by redisDb in redis.h. When the server initializes, it pre-allocates 16 databases, all stored in the db array of the redisServer struct; each redisClient holds a db pointer to the database currently in use.

RedisDb:

typedef struct redisDb {
  int id; // ID of this database (Redis has 16 databases by default)
  long avg_ttl; // average TTL of the stored objects, used for statistics
  dict *dict; // stores all key-value pairs of the database
  dict *expires; // stores the expiration times of keys
  dict *blocking_keys; // keys blocked on by BLPOP, with their client objects
  dict *ready_keys; // keys that received a push while blocked, with their client objects
  dict *watched_keys; // keys monitored by WATCH, with their client objects
} redisDb;

Focus on the id, dict, and expires fields, which are explained later.

RedisObject ☆

Overview of structural information

RedisObject:

typedef struct redisObject {
    unsigned type:4; // the object's type
    unsigned encoding:4; // the object's internal encoding
    int refcount; // reference count
    unsigned lru:LRU_BITS; // LRU_BITS (24 bits): last time the object was accessed by a command
} robj;
  • Five type values
    • REDIS_STRING (string) / REDIS_LIST (list) / REDIS_HASH (hash) / REDIS_SET (set) / REDIS_ZSET (sorted set)
  • encoding
    • Each type can be implemented by several encodings; Redis picks the encoding that suits the scenario, greatly improving flexibility and efficiency
    • Use the object encoding command to view an object's encoding

  • 24-bit lru
    • Records the last time the object was accessed by a command (24 bits since version 4.0, 22 bits in 2.6)
  • refcount
    • An integer counting how many references the object has; used for reference counting and memory reclamation

      To save memory, when the same value appears repeatedly, Redis reuses the existing object instead of allocating a new one. An object whose refcount is greater than 1 is called a shared object.
【 7 underlying data structures 】

I. String object (SDS)

Redis uses SDS (Simple Dynamic String) to store both string and integer data.

struct sdshdr {
    // number of bytes used in the buf array
    int len;
    // number of unused bytes in the buf array
    int free;
    // character array holding the string (capacity = free + len + 1)
    char buf[];
};

“Advantages”

  1. SDS adds len and free fields to the C string, so the string length can be read directly
    • Getting the length drops from O(n) for a C string to O(1) for SDS
  2. Because it records the length, SDS can reallocate memory automatically when needed, eliminating buffer overflows
  3. SDS is binary-safe, using len rather than a terminator to mark the end of the data
    • C strings ==> end at the first \0; binary data that itself contains \0 bytes would be cut short, confusing "data" with "terminator"
    • SDS ==> distinguishes binary from non-binary data:
      • non-binary data still ends with \0 (for C compatibility)
      • binary data is delimited by len
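The len/free bookkeeping can be modelled in a few lines — a sketch of the idea, not Redis's actual allocator (the doubling-style pre-allocation mirrors what Redis does for strings under 1 MB):

```python
class SDS:
    """Sketch of SDS: len/free bookkeeping gives O(1) length and
    binary safety (an embedded NUL byte is just data)."""
    def __init__(self, data=b""):
        self.len = len(data)
        self.free = 0
        self.buf = bytearray(data)

    def strlen(self):
        return self.len                      # O(1): no scan for '\0'

    def append(self, data):
        if self.free < len(data):
            # reallocate: after growing, keep as much free space as the
            # string is now long (pre-allocation to amortize appends)
            needed = self.len + len(data)
            self.buf = bytearray(self.buf[:self.len]) + bytearray(data)
            self.buf.extend(b"\x00" * needed)
            self.free = needed
        else:
            self.buf[self.len:self.len + len(data)] = data
            self.free -= len(data)
        self.len += len(data)

    def value(self):
        return bytes(self.buf[:self.len])    # delimited by len, not by '\0'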

II. Skip list

A skip list is the underlying implementation of the sorted set (zset); it is efficient and simple to implement. Basic idea: promote some nodes of an ordered linked list into extra layers, each layer itself an ordered list

  • Search: at each node, if the next node's value exceeds the target, or the next pointer is null, drop down one level and continue from the current node (this gives a binary-search-like effect)
    • Example: finding the element 9 in a list

      • With a plain loop and no layers: 8 comparisons
      • With one extra layer: 5 comparisons

      • With two extra layers: 4 comparisons

      • With three extra layers: 5 comparisons (more layers are not always better)

  • Delete: Finds the specified element and deletes that element at each level

III. Dictionary (dict)

A dict, also called a hash table, is a data structure that stores key-value pairs. The entire database is kept in dictionaries, and every Redis create/read/update/delete ultimately operates on a dictionary.

C has no built-in implementation of this structure, so Redis builds the dictionary itself.

The dictionary structure of Redis is defined as follows:

typedef struct dict {
    // type-specific operation functions
    dictType *type;
    // optional argument passed to the type functions
    void *privdata;
    // ht[0]: the main table; ht[1]: the table used during rehash
    dictht ht[2];
    // -1 when not rehashing; otherwise the table index being rehashed
    long rehashidx;
    // number of iterators currently running
    int iterators;
} dict;

The dictType struct holds the operation functions used by the dictionary:

typedef struct dictType {
    // computes a key's hash value
    unsigned int (*hashFunction)(const void *key);
    // duplicates a key
    void *(*keyDup)(void *privdata, const void *key);
    // duplicates a value
    void *(*valDup)(void *privdata, const void *obj);
    // compares two keys
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    // destroys a key
    void (*keyDestructor)(void *privdata, void *key);
    // destroys a value
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

Redis dictionaries use hash tables as the underlying implementation, structured as follows:

typedef struct dictht {
     // hash table array
     dictEntry **table;
     // hash table size
     unsigned long size;
     // size mask used to compute index values, always equal to size-1
     unsigned long sizemask;
     // number of nodes currently in the table
     unsigned long used;
} dictht;

The hash table is an array (table) whose elements each point to a dict.h/dictEntry structure, defined as follows:

// key holds the key; val holds the value, which can be a pointer, a uint64_t, or an int64_t
typedef struct dictEntry {
     // key
     void *key;
     // value
     union {
          void *val;
          uint64_t u64;
          int64_t s64;
     } v;
     // next node with the same hash, forming a linked list
     // this pointer resolves hash collisions
     struct dictEntry *next;
} dictEntry;
  • The hash algorithm
    // 1. Compute the key's hash with the dictionary's hash function
    hash = dict->type->hashFunction(key);
    // 2. Combine the hash with the table's sizemask to get the index
    index = hash & dict->ht[x].sizemask;
  • Resolve hash conflicts
    • The *next pointer in the dictionary points to the next hash table node with the same index (chained address method)
  • Expansion and shrinking
    1. For an expand/shrink operation, Redis creates ht[1] with a new size: for expansion, the first power of two ≥ ht[0].used*2; for shrinking, the first power of two ≥ ht[0].used

      (Expansion ==> a new table with roughly double the space the old table actually uses)

      (Shrinking ==> a new table with roughly half that space)
    2. The hash algorithm above is reapplied to compute each key's index in the new table, and the key-value pair is placed there
    3. Once every pair has been migrated, the old table's memory is freed
  • Conditions that trigger expansion
    • The server is not running BGSAVE or BGREWRITEAOF and the load factor (used/size) is ≥ 1
    • The server is running BGSAVE or BGREWRITEAOF and the load factor is ≥ 5
  • Progressive rehash
    • "Progressive" means the expansion or shrink is not done in one centralized pass but spread over many small steps

      A few dozen keys can be rehashed in a flash, but millions or billions cannot be rehashed all at once, so the migration proceeds incrementally
      • Lookup & delete during rehash: check the first hash table; if not found, check the second
      • Inserts during rehash: go only to the new hash table

The complete Redis dictionary data structure:
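The progressive-rehash rules above can be sketched with two bucket arrays, where every operation migrates one bucket before doing its own work — a simplified, append-only model (class and method names are illustrative):

```python
class Dict:
    """Progressive-rehash sketch: two tables; each operation migrates
    one bucket from ht0 to ht1 until rehashidx runs off the end."""
    def __init__(self, size=4):
        self.ht0 = [[] for _ in range(size)]
        self.ht1 = None
        self.rehashidx = -1          # -1: not rehashing

    def _bucket(self, table, key):
        return hash(key) & (len(table) - 1)   # sizemask = size - 1

    def start_rehash(self):
        self.ht1 = [[] for _ in range(len(self.ht0) * 2)]
        self.rehashidx = 0

    def _step(self):
        if self.rehashidx < 0:
            return
        # move every entry of one ht0 bucket into ht1
        for k, v in self.ht0[self.rehashidx]:
            self.ht1[self._bucket(self.ht1, k)].append((k, v))
        self.ht0[self.rehashidx] = []
        self.rehashidx += 1
        if self.rehashidx == len(self.ht0):   # done: ht1 becomes ht0
            self.ht0, self.ht1, self.rehashidx = self.ht1, None, -1

    def set(self, key, value):
        self._step()
        # inserts go only to the new table while rehashing
        table = self.ht1 if self.rehashidx >= 0 else self.ht0
        table[self._bucket(table, key)].append((key, value))

    def get(self, key):
        self._step()
        # lookup checks ht0 first, then ht1
        tables = [self.ht0] + ([self.ht1] if self.rehashidx >= 0 else [])
        for t in tables:
            for k, v in t[self._bucket(t, key)]:
                if k == key:
                    return v
        return None
```

No single call ever pays for the whole migration; the cost is spread one bucket at a time across ordinary gets and sets, which is the entire point of doing it progressively.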

IV. Compressed list (ziplist)

A ziplist is a sequential data structure made of a series of specially encoded, contiguous blocks of memory. Unlike an array, it can store elements of different sizes

Its defining feature is memory economy, relative to an array:

an array gives every element the same fixed size, while a ziplist prefixes each element with a length attribute that records how big that element actually is.



  • Redis uses ziplists mainly for lists and hashes:
    • When a list contains only a few items, each of which is a small integer or a short string, Redis uses a ziplist as the list's underlying implementation
    • When a hash contains only a few key-value pairs, whose keys and values are all small integers or short strings, Redis uses a ziplist as the hash's underlying implementation
  • Redis compressed list composition:

    • 【 Entry nodes 】 A ziplist can hold any number of nodes; each node stores either a byte array or an integer value
      • Structure:
        typedef struct zlentry {
            // length of the previous_entry_length field
            unsigned int prevrawlensize;
            // value stored in the previous_entry_length field
            unsigned int prevrawlen;
            // length of the encoding field
            unsigned int lensize;
            // length of the data content
            unsigned int len;
            // header length of this element: previous_entry_length field + encoding field
            unsigned int headersize;
            // data type
            unsigned char encoding;
            // start address of this element
            unsigned char *p;
        } zlentry;
      • A node can hold a byte array as:
        1. An array of bytes less than or equal to 63(2^6-1) bytes
        2. An array of bytes less than or equal to 16383(2^14-1) bytes
        3. A byte array with a length less than or equal to 4294967295(2^32-1) bytes
      • A node can hold integer values that are:
        1. An unsigned 4-bit integer between 0 and 12
        2. A 1-byte signed integer
        3. A 3-byte signed integer
        4. An int16_t integer
        5. An int32_t integer
        6. An int64_t integer
      • Node composition:
        • previous_entry_length: Records the length of the previous node in the compressed list, in bytes
        • encoding: Records the type and length of the data held by the node’s content property
        • content: Saves the value of the node (the type and length of the value are determined by the encoding property of the node)
    • [Total byte length] zlbytes records the length of the compressed list in bytes
    • [Tail offset] zltail records the offset of the last node from the start address of the compressed list
    • [Number of nodes] zllen records the number of nodes in the compressed list
    • [End marker] zlend marks the end of the compressed list; one byte, always 0xFF (255)
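The variable-width previous_entry_length field is what lets a ziplist be traversed backward while staying compact. The sketch below illustrates its encoding rule (1 byte for lengths below 254, otherwise a 0xFE marker byte followed by a 4-byte length). Byte order is simplified to host order, and the helper names are illustrative, not taken from the Redis source:

```c
#include <stdint.h>
#include <string.h>

/* Sketch: encode/decode the previous_entry_length field of a ziplist node.
 * Rule: lengths < 254 fit in 1 byte; otherwise a 0xFE marker byte is
 * followed by a 4-byte length. (Host byte order, for the sketch only.) */
static unsigned prevlen_encode(uint8_t *buf, uint32_t prevlen) {
    if (prevlen < 254) {
        buf[0] = (uint8_t)prevlen;
        return 1;                     /* field occupies 1 byte */
    }
    buf[0] = 0xFE;                    /* marker: long form follows */
    memcpy(buf + 1, &prevlen, 4);     /* 4-byte length */
    return 5;                         /* field occupies 5 bytes */
}

static uint32_t prevlen_decode(const uint8_t *buf, unsigned *fieldsize) {
    if (buf[0] < 254) { *fieldsize = 1; return buf[0]; }
    uint32_t len;
    memcpy(&len, buf + 1, 4);
    *fieldsize = 5;
    return len;
}
```

This is also why inserting a large entry can cascade: a following node whose previous_entry_length was 1 byte may suddenly need 5, which can in turn grow the node after it.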

V Integer Set

An integer set (intset) is an ordered (ascending), contiguously stored structure that holds integers without duplicates.

When the elements of a Redis set type are all integers within the range of a 64-bit signed integer (int64_t), this structure is used for storage. (A hashtable is used when the range is exceeded.)

Its structure is as follows:

  • Structure definition:
    • Structure:
      typedef struct intset{
          // Encoding mode
          uint32_t encoding;
          // The number of elements contained in the collection
          uint32_t length;
          // Hold an array of elements
          int8_t contents[];
      } intset;
      • contents array: the underlying storage of the integer set. Each element of the set is an item of the contents array; items are sorted in ascending order of value, and the array contains no duplicates
      • length: records the length of the array
      • encoding: records how the array is encoded

      The intset structure declares the contents property as an array of type int8_t, but the contents array does not actually hold any int8_t values; the true type of the contents array depends on the value of the encoding property.

      • encoding == INTSET_ENC_INT16: contents is treated as an int16_t array (each element is an int16_t integer)
      • encoding == INTSET_ENC_INT32: contents is treated as an int32_t array (each element is an int32_t integer)
      • encoding == INTSET_ENC_INT64: contents is treated as an int64_t array (each element is an int64_t integer)
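The encoding choice can be sketched as "use the smallest width that can represent the value", in the spirit of the _intsetValueEncoding helper in Redis's intset.c. The constant names below mirror that file, but the sketch is simplified:

```c
#include <stdint.h>

/* Sketch of how an integer set picks the smallest sufficient encoding
 * for a value. Constant names follow intset.c; the values here are the
 * element widths in bytes, which is what the sketch cares about. */
#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))

static uint8_t intset_value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return INTSET_ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return INTSET_ENC_INT32;
    return INTSET_ENC_INT16;
}
```

When a new value needs a wider encoding than the set currently uses, the whole contents array is upgraded to the wider width; an intset is never downgraded.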

VI Quick List

Quicklist is an important underlying data structure in Redis; it is the underlying implementation of the list type.

  • Prior to Redis 3.2, lists were implemented with doubly linked lists (adlist) and ziplists.
  • Since Redis 3.2, Redis has used quicklist, which combines the strengths of adlist and ziplist.

A quicklist is itself a doubly linked list in which each node holds a ziplist that stores multiple data elements.

The structure of quicklist is as follows:

typedef struct quicklist {
    // Point to the head of quickList
    quicklistNode *head;
    // Point to the end of the QuickList
    quicklistNode *tail;
    // The total number of all items in the list
    unsigned long count;
    // The number of QuickList nodes, i.e. the number of Ziplist nodes
    unsigned int len;
    // The ziplist size limit, configurable with the list-max-ziplist-size option
    int fill : 16;
    // Node compression depth, configurable with the list-compress-depth option
    unsigned int compress : 16;
}quicklist;
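The fill field's limit is commonly interpreted as follows (per the list-max-ziplist-size documentation): a positive value caps the number of entries per node, while a negative value −1 through −5 caps the node's ziplist size at 4/8/16/32/64 KB respectively. A small illustrative helper (the function is a sketch, not Redis code):

```c
/* Sketch: map a quicklist "fill" value to the byte cap it implies.
 * Positive fill = entry-count limit (no byte cap here, return 0);
 * fill of -1..-5 = ziplist byte cap of 4/8/16/32/64 KB. */
static long fill_limit_bytes(int fill) {
    static const long kb[] = {4096, 8192, 16384, 32768, 65536};
    if (fill >= 0) return 0;     /* entry-count limit instead of a byte cap */
    if (fill < -5) fill = -5;    /* clamp out-of-range values */
    return kb[-fill - 1];
}
```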

The quicklistNode structure is as follows:

typedef struct quicklistNode {
    // Points to the previous quicklist node
    struct quicklistNode *prev;
    // Points to the next quicklist node
    struct quicklistNode *next;
    // Data pointer: points to a ziplist if the node is not compressed, otherwise to a quicklistLZF structure
    unsigned char *zl;
    // Total size of the ziplist in bytes
    unsigned int sz;
    // Number of data items in the ziplist
    unsigned int count : 16;
    // Encoding mode: 1--ziplist (raw), 2--quicklistLZF (compressed)
    unsigned int encoding : 2;
    // Container type: 1--NONE, 2--ziplist
    unsigned int container : 2;
    // Recompress flag: compressed data is temporarily decompressed so it can be read; this flag is set to 1 so the node is compressed again afterwards
    unsigned int recompress : 1;
    // Test related
    unsigned int attempted_compress : 1;
    // Extend the field
    unsigned int extra : 10;
}quicklistNode;
  • Data compression
    • The actual data storage structure of each quicklist node is a ziplist, whose advantage is saving storage space.

      To further reduce storage space, the ziplist itself can also be compressed.

      Redis uses the LZF compression algorithm. The basic idea is: data that repeats earlier input is recorded as the position and length of the earlier occurrence, while non-repeating data is recorded as-is.
    • The compressed data is divided into fragments, each with two parts: an explanation field and a data field
    • The quicklistLZF structure is as follows:
      typedef struct quicklistLZF {
    // Number of bytes occupied after LZF compression
          unsigned int sz;
          // The flexible array points to the data section
          char compressed[];
} quicklistLZF;
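The "record repeats as position + length" idea can be illustrated with a naive longest-match search. The real LZF implementation uses a hash table of recent positions rather than this exhaustive scan, so treat this as a sketch of the principle, not Redis code:

```c
#include <stddef.h>

/* Sketch of the core idea behind LZF-style compression: data that
 * repeats earlier input can be recorded as a (start, length)
 * back-reference instead of being stored again. Naive O(n^2) search. */
static size_t longest_match(const unsigned char *data, size_t pos,
                            size_t total, size_t *match_start) {
    size_t best_len = 0;
    for (size_t start = 0; start < pos; start++) {
        size_t len = 0;
        /* The match may run past pos (overlapping copies are allowed). */
        while (pos + len < total && data[start + len] == data[pos + len])
            len++;
        if (len > best_len) { best_len = len; *match_start = start; }
    }
    return best_len;  /* 0 means "no repeat: record the byte literally" */
}
```

A compressor would emit a (start, length) token when the match is long enough to pay for itself, and a literal byte otherwise; decompression just replays the copies.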

VII Stream Objects

  • The stream type in particular consists of:

    • Underlying data structure: Listpack (Compact list) + Rax tree (radix tree)
      • listpack: a serialized string list that can store strings or integers; in a stream it stores the message content.

      • Rax: a radix tree with keys in lexicographic order, supporting fast lookup, insertion, and deletion.


        Rax is used in the Redis Stream structure to store the message queue. In a Stream, the message ID prefix is a timestamp plus a sequence number, so messages can be treated as a time series. Storing them in a rax allows a specific message to be located quickly by its ID, and then all messages after it to be traversed in order.
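Because a stream ID is a timestamp plus a sequence number, comparing IDs component-wise yields exactly the time-series order the rax relies on: big-endian byte order in the tree makes lexicographic order equal numeric order. A minimal sketch (the struct and helper names here are illustrative, not the Redis streamID API):

```c
#include <stdint.h>

/* Sketch: a Stream message ID is "ms-seq" (millisecond timestamp plus
 * sequence number). Comparing (ms, seq) pairs gives time-series order,
 * which is what makes range scans like XRANGE work. */
typedef struct { uint64_t ms, seq; } stream_id;

static int stream_id_cmp(stream_id a, stream_id b) {
    if (a.ms != b.ms) return a.ms < b.ms ? -1 : 1;    /* timestamp first */
    if (a.seq != b.seq) return a.seq < b.seq ? -1 : 1; /* then sequence  */
    return 0;
}
```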

【10 Kinds of encoding】

The encoding field is a 4-bit field that records the object’s internal encoding. For small data, Redis uses compact, compressed encodings, which reflects Redis’s flexibility.

  • string:
    • int: REDIS_ENCODING_INT (integer of type int)
    • embstr: REDIS_ENCODING_EMBSTR (embedded simple dynamic string)

      The value contains no more than 44 bytes
    • raw: REDIS_ENCODING_RAW (simple dynamic string)

      The value contains more than 44 bytes
  • list:
    • quicklist: REDIS_ENCODING_QUICKLIST
  • hash:
    • dict: REDIS_ENCODING_HT (dictionary)

      When the hash table has a large number of elements or the elements are not small integers or short strings
    • ziplist: REDIS_ENCODING_ZIPLIST (compressed list)

      When the hash table has a small number of elements and the elements are small integers or short strings
  • set:
    • intset: REDIS_ENCODING_INTSET (integer set)

      When the elements of the Redis set type are all integers within the range of a 64-bit signed integer (int64_t, −2^63 to 2^63 − 1)
    • dict: REDIS_ENCODING_HT (dictionary)

      When the elements of the Redis set type are integers outside the range of a 64-bit signed integer (int64_t)
  • zset:
    • ziplist: REDIS_ENCODING_ZIPLIST (compressed list)

      When the number of elements is small and the elements are small integers or short strings
    • skiplist: REDIS_ENCODING_SKIPLIST (skip list + dictionary)

      When the number of elements is large or the elements are not small integers or short strings
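For the string type, the thresholds above amount to a simple selection rule: numeric values get int, short strings embstr, long strings raw. A hedged sketch of that rule (the 44-byte embstr limit matches Redis ≥ 3.2; the function name is made up for illustration and is not a Redis API):

```c
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* Sketch: choose a string encoding the way the table above describes.
 * If the value parses fully as a long long -> "int"; otherwise length
 * up to 44 bytes -> "embstr", longer -> "raw". */
static const char *pick_string_encoding(const char *value) {
    char *end;
    errno = 0;
    (void)strtoll(value, &end, 10);
    if (errno == 0 && end != value && *end == '\0') return "int";
    return strlen(value) <= 44 ? "embstr" : "raw";
}
```

The payoff of embstr is locality: the robj header and the string buffer sit in one allocation, so short strings cost one malloc and one cache line instead of two.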

Part 3 – Cache expiration and eviction strategies

“Maxmemory”

Redis maxmemory defaults to 0, meaning there is no limit on the maximum amount of memory used.

  1. maxmemory not set:
    • Redis is used as a DB to guarantee data integrity, so data must not be evicted; the number of keys therefore must not grow without bound
    • Use the noeviction policy
  2. Set maxmemory:
    • Redis is used as a cache, and keys can be added continuously
  • How to set: configure maxmemory XXXXmb in redis.conf

    If maxmemory is configured, also configure maxmemory-policy (the eviction policy)
  • How to check: run config get maxmemory in the console

How much maxmemory is appropriate? For a single Redis instance, reserve about 1 GB for the operating system, and maxmemory can be set to about 3/4 of physical memory.

# In short... if you have slaves attached it is suggested that you set a lower
# limit for maxmemory so that there is some free RAM on the system for slave
# output buffers (but this is not needed if the policy is 'noeviction').

Expire Data Structure

In Redis, you can use the expire command to set a key’s TTL (time to live); after the TTL elapses, the key is deleted automatically

The redisDb structure is as follows:

typedef struct redisDb {
  dict *dict;           /* All key-value pairs in this DB (the keyspace) */
  dict *expires;        /* Keys with an expiration time, mapped to that time */
  dict *blocking_keys;  /* Keys that clients are blocked on (e.g. BLPOP) */
  dict *ready_keys;     /* Blocked keys that received a push */
  dict *watched_keys;   /* Keys WATCHed in MULTI/EXEC transactions */
  int id;               /* Database ID */
} redisDb;

Let’s focus on the dict and expires pointers:

  • dict: Maintains all key-value pairs contained in a Redis database
  • expires: Maintains the keys in a Redis database that have an expiration time (mapping each such key to its expiration time)
  • When we use the expire command to set the expiration time of a key, Redis first checks the dict dictionary to see whether the key exists:
    • If it exists ==> the key and its expiration time are added to the expires dictionary
  • When we insert data with the setex command, Redis first adds the key and value to the dict dictionary, then adds the key and its expiration time to the expires dictionary

All keys that have an expiration time, together with their specific expiration times, are maintained in the expires dictionary
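The dict/expires split can be modeled in a few lines: setex writes the value into one table and the deadline into the other, and a read checks the deadline before returning the value (the lazy-deletion behavior described below). This toy model uses fixed arrays and a manual clock instead of real dictionaries; all names are illustrative, not Redis APIs:

```c
#include <string.h>

/* Toy model of redisDb's dict/expires pair. The value lives in the
 * "dict" table, the deadline in the "expires" table (-1 = no TTL);
 * a key is visible only while now < deadline. */
#define TOYDB_MAX 16
static const char *db_keys[TOYDB_MAX];
static const char *db_vals[TOYDB_MAX];   /* the "dict" table    */
static long long   db_exp[TOYDB_MAX];    /* the "expires" table */
static int db_n = 0;

/* setex: store key+value in dict and key+deadline in expires. */
static void toy_setex(const char *k, const char *v, long long deadline) {
    db_keys[db_n] = k; db_vals[db_n] = v; db_exp[db_n] = deadline; db_n++;
}

/* Lazy expiration: the deadline is checked only when the key is read. */
static const char *toy_get(const char *k, long long now) {
    for (int i = 0; i < db_n; i++)
        if (strcmp(db_keys[i], k) == 0)
            return (db_exp[i] >= 0 && now >= db_exp[i]) ? NULL : db_vals[i];
    return NULL;
}
```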

Expiration Policy

  • [Scheduled deletion]: When setting the expiration time of a key, you can create a timer to delete the key immediately when the expiration time of the key comes
    • Tips: Although memory-friendly, it requires the creation of timers and consumes CPU, which is generally not recommended
  • 【 Lazy deletion 】: the key is deleted only when it is accessed and found to be expired
    • Tips: This strategy maximizes CPU savings, but is very memory unfriendly. In extreme cases, a large number of expired keys may not be accessed again and thus will not be cleared, occupying a large amount of memory
    • Bottom layer: the expireIfNeeded function is called. Its meaning: before data is read, check whether it has expired, and delete it if so
      int expireIfNeeded(redisDb *db, robj *key) {
        // Get the key's expiration time
        long long when = getExpire(db,key);
        // A negative expiration time means the key has no TTL (the default is -1): return 0
        if (when < 0) return 0;
        // If the server is loading data from an RDB file, do not delete the
        // expired key yet; return 0 directly
        if (server.loading) return 0;
        // Otherwise compare the key's expiration time with the current time;
        // if the key has not expired yet, return 0
        if (mstime() <= when) return 0;
        // The key has indeed expired: update the expired-key statistics,
        // propagate the expiration, then delete the key from the database
        server.stat_expiredkeys++;
        propagateExpire(db,key);
        return dbDelete(db,key);
      }
  • 【 Active deletion 】: The active deletion policy can be configured in redis.conf; the default is noeviction (no deletion).

“Memory Elimination Strategy”

Memory eviction policy: when the memory Redis uses for caching is insufficient, how to handle data that needs to be written and requires new space to be allocated

  • noeviction: new write operations report an error when memory is insufficient to accommodate the new data.
  • allkeys-lru: removes the least recently used (LRU) key from the whole keyspace when memory is insufficient to accommodate new writes.
  • allkeys-random: randomly removes a key from the whole keyspace when memory is insufficient to accommodate new writes.
  • volatile-lru: removes the least recently used key from the keys with an expiration time set when memory is insufficient to accommodate new writes.
  • volatile-random: randomly removes a key from the keys with an expiration time set when memory is insufficient to accommodate new writes.
  • volatile-ttl: removes the key with the nearest expiration time from the keys with an expiration time set when memory is insufficient to accommodate new writes.
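As a rough illustration of allkeys-lru, the sketch below keeps a last-access tick per key and, when the store is full, evicts the key with the oldest tick. Real Redis approximates LRU by sampling a few keys per eviction rather than scanning all of them; the names and capacity here are arbitrary:

```c
#include <string.h>

/* Sketch of allkeys-lru on a tiny fixed-capacity store: every access
 * stamps the key with a logical clock; when the store is full, the key
 * with the smallest stamp (least recently used) is evicted. */
#define LRU_CAP 3
static const char *lru_keys[LRU_CAP];
static long long   lru_tick[LRU_CAP];
static int lru_n = 0;
static long long lru_clock = 0;

static void lru_touch_or_insert(const char *k) {
    lru_clock++;
    for (int i = 0; i < lru_n; i++)                   /* already present: refresh */
        if (strcmp(lru_keys[i], k) == 0) { lru_tick[i] = lru_clock; return; }
    if (lru_n < LRU_CAP) {                            /* room left: just insert */
        lru_keys[lru_n] = k; lru_tick[lru_n] = lru_clock; lru_n++; return;
    }
    int victim = 0;                                   /* full: evict oldest tick */
    for (int i = 1; i < LRU_CAP; i++)
        if (lru_tick[i] < lru_tick[victim]) victim = i;
    lru_keys[victim] = k; lru_tick[victim] = lru_clock;
}

static int lru_contains(const char *k) {
    for (int i = 0; i < lru_n; i++)
        if (strcmp(lru_keys[i], k) == 0) return 1;
    return 0;
}
```

volatile-lru works the same way, except the eviction scan is restricted to keys that appear in the expires dictionary.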

  1. Load factor = number of stored hash nodes / hash table size ↩
  2. LFU (least frequently used) ↩
  3. LRU (least recently used) ↩