preface
Update from Chinese New Year that this home MySQL series has been in the past for a period of time, this paragraph of time has been busy in the other, so the blog update would be shelved, but have been thinking about what to update the content is good, just a friend gave me a book Redis, I’m going to see the combination of the official document to summarize, share with everyone, Please correct me if there is anything wrong.
Redis series, I think, in the form of “handing down close” to update, but is not necessarily the four oh, because the space is limited, too long afraid you have no patience to read, may be “up” article is divided into a few posts to account, I will also to be neat, can facilitate everybody watching the better absorption, writing articles, after all, can I get you to watch my article, is my pleasure. I have to be responsible for everyone
Anyway, Redis should be used by many people (if not, you should be able to read this article, but some basic theories need to be searched online). This article will focus on the operation commands of the basic data types of Redis, the underlying storage structure, and its application scenarios.
Attached is the brain map of the basic chapter (there is compression on the upload platform, if you are interested, you can go to my official account [6 Xixuan] to get the original picture)
It’s a bit long, so I’ve divided the data types into three chapters, so you can look at them selectively.
The body of the
Redis basic data type
The most basic and most commonly used data type is String.
The set and get commands are the operation commands for String.
Why is it called binary-safe strings?
Let’s see.
String String
Storage type
Can be used to store strings, integers, floating point numbers.
Operation command
Set multiple values (batch operation, atomicity)
mset port 2673 zk 666
Copy the code
If key exists, it will not succeed
setnx port
Copy the code
Based on this, distributed locks can be implemented and released with del key.
But what if the lock release fails and the other nodes never get the lock?
Add expiration time.
Expire cannot be used to guarantee atomicity.
multi-parameter
set key value [expiration EX seconds|PX milliseconds][NX|XX] Copy the code
The way arguments are used
set lock1 1 EX 10 NX Copy the code
(integer) the value increases
incr port
incrby port 100
Copy the code
(integer) value decrement
decr port
decrby port 100
Copy the code
Floating point increment
setF 2.6 incrByFloat f 7.3Copy the code
Get multiple values
mget port jack
Copy the code
Get value length
strlen port
Copy the code
String appending
append port good
Copy the code
Gets the specified range of characters
getrange port 0 8
Copy the code
Storage (implementation) principle
The data model
Set Hello Word for example, because Redis is the database for KV, it is implemented via Hashtable (we call this the outer hash). So every key-value pair has a dictEntry (source location: dict.h) with Pointers to key and value. Next points to the next dictEntry.
typedef struct dictEntry {
void *key; /* key specifies the keyword */
union {
void *val; uint64_t u64; /* value defines */
int64_t s64; double d;
} v;
struct dictEntry *next; /* points to the next key-value pair node */
} dictEntry;
Copy the code
- Key is a string, but instead of using C’s character array directly, Redis stores it in a custom SDS.
- Values are stored not directly as strings, nor in SDS, but in redisObjects.
- In fact, any one of the five commonly used data types is stored via redisObject.
- redisObject
RedisObject is defined in the SRC /server.h file.
typedef struct redisObject {
unsigned type:4; /* Object types, including OBJ_STRING, OBJ_LIST, OBJ_HASH, OBJ_SET, OBJ_ZSET */
unsigned encoding:4; /* Specific data structure */
unsigned lru:LRU_BITS; /* 24 bits, the last time the object was accessed by a command program, related to memory reclamation */
int refcount; /* Reference count. When refCount is 0, the object is no longer referenced by any other object and garbage collection can be performed
void *ptr; /* Points to the actual data structure of the object */
} robj;
Copy the code
You can use the type command to view external types.
127.0.0.1:6379 >type qs
string
Copy the code
The internal encoding
127.0.0.1:6379 >setNumber 1 OK 127.0.0.1:6379 >set zk "'s default port is 2181"OK 127.0.0.1:6379 >setRedis nosql OK 127.0.0.1:6379> Object Encoding number"int"127.0.0.1:6379 > object encoding redis"embstr"127.0.0.1:6379 > object encoding zk"raw"
Copy the code
There are three internal encodings for string types:
- Int, a long integer that stores 8 bytes (long, 2^63-1).
- Embstr, which represents SDS (Simple Dynamic String) in EMBSTR format, stores strings of less than 44 bytes.
- Raw, stores strings larger than 44 bytes (39 bytes prior to version 3.2). Why 39?
/* object.c */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
Copy the code
- Q.1 What is SDS?
Redis string implementation. In versions 3.2 and later, SDS has a variety of structures (SDS.h) : Sdshdr5, sdshdr8, sdshdr16, sdshdr32, and sdshdr64 are used to store strings of different lengths, indicating that 2^5=32 bytes, 2^ 8=256 bytes, and 2^ 16=65536 bytes =64KB respectively. 2 ^ 32 byte = 4 gb.
/* sds.h */ struct __attribute__(((packed__)) sdshdr8 { uint8_t len; /* The length of the current character array */ uint8_t alloc; /* Total memory allocated for the current character array */ unsigned char flags; /* The property of the current character array, which identifies whether sdshdr8 or >sdshdr16, etc. */ char buf[]; /* The true value of the string */ }; Copy the code
- Question 2. Why does Redis use SDS to implement strings?
As we know, C itself has no string type (only implemented with the character array char[]).
- To use a character array, you must first allocate enough space to the target variable, otherwise it may overflow.
- If we want to get the character length, we must traverse the character array in O(n) time.
- C String length changes will allocate memory to the character array.
- The end of the string is marked by the first ‘\0’ encountered from the beginning to the end of the string. Therefore, you cannot save binary (bytes) contents such as pictures, audio, video, and compressed files. Binary is not safe.
SDS features:
- Don’t worry about memory overflow, SDS will be expanded if needed;
- The time to get the string length is O(1) because len is defined;
- Prevent multiple memory reallocation through “space preallocation” (sdsMakeRoomFor) and “lazy space release”;
- The end indicator is the len attribute (which also ends in ‘\0’ because it allows you to manipulate string functions using the C library), which can contain ‘\0’.
C string SDS
- The complexity of getting string length is O(N);
- The API is not secure, it can cause a buffer overflow;
- If you change the string length for N times, you must reallocate memory for N times at most.
- Can only store text data can store text or binary data; – You can use all of the functions in the
library. You can use some of the functions in the
library.
- Question 3. What is the difference between EMbstr and raw?
The use of EMbSTR allocates memory space only once (because RedisObject and SDS are contiguous), whereas RAW allocates memory space twice (for RedisObject and SDS, respectively). So compared to raw, embSTR has the advantage of allocating less space on creation, freeing less space on deletion, and having all the data of an object together for ease of finding. The disadvantages of embstr are also obvious. If the length of the string increases and memory needs to be reallocated, the entire RedisObject and SDS need to be reallocated, so embSTR in Redis is implemented as read-only.
- Question 4: When are int and embstr converted to RAW?
When an int is no longer an integer, or the size of the int exceeds the size of long (2^63-1=9223372036854775807), it is automatically converted to embstr.
127.0.0.1:6379 >setK1 1 OK 127.0.0.1:6379> append k1 a (integer) 2 127.0.0.1:6379> object encoding k1 "raw" Copy the code
- Question 5: Why did it become RAW when it did not exceed the threshold?
127.0.0.1:6379 >setK2 a OK 127.0.0.1:6379> Object encoding k2"embstr" 127.0.0.1:6379> append k2 b (integer) 2 127.0.0.1:6379> object encoding k2 "raw" Copy the code
In the case of EMbstr, since its implementation is read-only, any modification to an EMbstr object is converted to RAW before modification. Therefore, whenever you modify an embstr object, the modified object must be raw, whether or not it reaches 44 bytes.
- Question 6: When the length is less than the threshold, will it be restored?
As for the conversion of Redis internal code, it conforms to the following rules: the conversion is completed when Redis writes data, and the conversion process is irreversible. Only small memory encoding can be converted to large memory encoding (but does not include re-set).
- Q7: Why wrap the underlying data structure?
Through encapsulation, the storage structure and available commands can be dynamically selected based on object types, saving space and optimizing query speed.
Application scenarios
- The cache
Type String
Such as:
Hot data cache (e.g., reports, star cheats), object cache, full page cache.
This improves the access speed of hotspot data.
Data sharing distributed
Because Redis is a distributed independent service, it can be shared among multiple applications. For example:
- Distributed Session
<dependency>
<groupId>org.springframework.session</groupId>
<artifactId>spring-session-data-redis</artifactId>
</dependency>
Copy the code
- A distributed lock
STRING setnx method that can only be added if it does not exist and returns true.
Redisdoc.com/string/set…. Parameter format is recommended
public Boolean getLock(Object lockObject){
jedisUtil = getJedisConnetion();
boolean flag = jedisUtil.setNX(lockObj, 1);
if(flag){
expire(locakObj,10);
}
return flag;
}
public void releaseLock(Object lockObject){
del(lockObj);
}
Copy the code
- Global ID
INT, INCRBY, uses atomicity
incrby userid 1000
(Take one section at a time for the scenario of database and table)
- counter
INT, INCR method
For example: the number of articles read, the number of likes on weibo, allow a certain delay, first written to Redis and then timed synchronization to the database.
- Current limiting
INT, INCR method
Use the visitor’s IP and other information as the key. Increments the count for each visit, and returns false if the count exceeds the number of visits.
- A statistical
String BITCOUNT (1.6.6 bitmap).
Characters are stored in 8-bit binary.
set k1 a
setbit k1 6 1
setbit k1 7 0
get k1
Copy the code
The CORRESPONDING ASCII code of A is 97, which is 01100001 when converted to binary data. The corresponding ASCII code of B is 98, which is 01100010 when converted to binary data
Because bit is very space-saving (1 MB=8388608 bit), it can be used to do large data statistics.
For example: online user statistics, retained user statistics
setbit onlineusers 0 1
setbit onlineusers 1 1
setbit onlineusers 2 0
Copy the code
Supports bitwise and, bitwise, and so on.
BITOP AND destkey key [key ...] , obtains a logical union for one or more keys, and saves the result to destkey. BITOP OR destkey key [key ...] To obtain logical or for one or more keys and save the result to destkey. BITOP XOR destkey key [key ...] To obtain logical XOR for one or more keys and save the result to the Destkey. BITOP NOT destKey Key Specifies the logical no of the given key and saves the result to the destkey.# count users who are online for 7 days
BITOP "AND" "7_days_both_online_users" "day_1_online_users" "day_2_online_users"."day_7_online_users"
Copy the code
- How to store the value of an object if it has multiple values?
-
For example, a key is used to store data in a table.
-
Serialization? JSON/Protobuf/XML, for example, increases the overhead of serialization and deserialization, and a value cannot be retrieved or modified individually.
-
This can be done by layering keys, for example:
Mset student:1:sno CK16666 student:1:sname mu wind student:1:company TencentCopy the code
- Fetching multiple values at once:
mget student:1:sno student:1:sname student:1:company
Copy the code
Cons: Key is too long and takes up too much space. Is there a better way?
Hash Hash
Storage type
An unordered hash table containing key-value pairs. A value can only be a string and cannot be nested with other types.
What’s the main difference between a Hash and a String?
-
Save memory by aggregating all related values into one key.
-
Only one key is used to reduce key conflicts.
-
To obtain values in batches, you only need to use one command to reduce memory, I/O, and CPU consumption.
Scenarios where Hash is not appropriate:
-
Field cannot set an expiration time separately;
-
No bit operation;
-
The distribution of data volume needs to be considered (when the value value is very large, it cannot be distributed to multiple nodes);
Operation command
hset h1 f 6
hset h1 e 5
hmset h1 a 1 b 2 c 3 d 4
Copy the code
hget h1 a
hmget h1 a b c d
hkeys h1
hvals h1
hgetall h1
Copy the code
The key operation
hget exists h1
hdel h1
hlen h1
Copy the code
Storage (implementation) principle
Redis Hash itself is also a KV structure, similar to Java HashMap.
The outer hash (an implementation of Redis KV) uses only hashtable. When we store hash data types, we call them inner hashes. The inner hash layer can be implemented using two data structures:
Ziplist: OBJ_ENCODING_ZIPLIST hashtable: OBJ_ENCODING_HT (hash) 127.0.0.1:6379 > hset h2 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa (finteger127.0.0.1) 1:6379 > hset h3 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa (finteger) 1
127.0.0.1:6379> object encoding h2
"ziplist"127.0.0.1:6379 > object encoding h3"hashtable"
Copy the code
Ziplist compressed list
What is a Ziplist compressed list?
*/ The ziplist is a specially encoded dually linked list that is designed to be very memory efficient. It stores both strings and integer values, where integers are encoded as actual integers instead of a series of characters. It allows push and pop operations on either side of the list in O(1) time. However, because every operation requires a reallocation of the memory used by the ziplist, the actual complexity is related to the amount of memory used by the ziplist.
Ziplist is a specially encoded bidirectional linked list. It does not store Pointers to the last linked list node and the next linked list node, but the length of the last node and the length of the current node. It sacrifices part of the read and write performance in exchange for efficient memory utilization, which is a time-for-space idea. It is used only in scenarios where the number of fields is small and the value of the field is small.
What’s inside ziplist?
Ziplist.c line 16:
<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
Copy the code
typedef struct zlentry {
unsigned int prevrawlensize; /* The length occupied by the last list node */
unsigned int prevrawlen; /* The number of bytes required to store the length value of the last list node */
unsigned int lensize; /* The number of bytes required to store the current list node length value */
unsigned int len; /* The length of the current list node */
unsigned int headersize; /* The header size of the current list node (prevrawlenSize + lensize), i.e., the size of the non-data field */
unsigned char encoding; /* Encoding mode */
unsigned char *p; /* The compressed linked list is stored as a string with a pointer to the start of the current node */
} zlentry;
Copy the code
Encoding (Ziplist.c, 第 204 行.)
#define ZIP_STR_06B (0 << 6) // Contains a maximum of 63 bytes
#define ZIP_STR_14B (1 << 6) // Contains a maximum of 16383 bytes
#define ZIP_STR_32B (2 << 6) // Contains a maximum of 4294967295 bytes
Copy the code
- Question: When to use Ziplist storage?
The Ziplist encoding is used when the hash object meets the following two conditions: 1) The length of the keys and values of all key-value pairs is less than or equal to 64 bytes (one letter and one byte). 2) Hash objects hold less than 512 key-value pairs.
/* SRC /redis.conf configuration */
hash-max-ziplist-value 64 // Maximum length of the value that can be stored in the ziplisthash-max-ziplist-entries 512 // Maximum number of entries that can be stored in ziplistCopy the code
/* source location: t_hash.c, when the number of fields reaches the threshold, use HT as encoding */
if (hashTypeLength(o) > server.hash_max_ziplist_entries)
hashTypeConvert(o, OBJ_ENCODING_HT);
Copy the code
/* source location: t_hash.c, when the field value is too long, to HT */
for (i = start; i <= end; i++) {
if (sdsEncodedObject(argv[i]) &&sdslen(argv[i]->ptr) > server.hash_max_ziplist_value){
hashTypeConvert(o, OBJ_ENCODING_HT);
break; }}Copy the code
If a hash object exceeds the configured threshold (the length of keys and values is greater than 64 bytes, and the number of key-value pairs is greater than 512), it is converted into a Hashtable.
Hashtable (dict)
In Redis, a hashtable is called a dictionary, and it is an array + linked list structure.
Source: dict.h
As we know, the KV structure of Redis is implemented by a dictEntry.
Redis also encapsulates dictEntry in several layers.
typedef struct dictEntry {
void *key; /* key specifies the keyword */
union {
void *val;
uint64_t u64; /* value defines */
int64_t s64;
double d;
} v;
struct dictEntry *next; /* points to the next key-value pair node */
} dictEntry;
Copy the code
DictEntry goes into dictht (hashtable) :
/* This is our hash table structure. Every dictionary has two of this as we
*implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
dictEntry **table; /* Hash table array */
unsigned long size; /* Hash table size */
unsigned long sizemask; /* Mask size, used to calculate index values. It's always going to be size minus 1 times PI
unsigned long used; /* There are already points */
} dictht;
Copy the code
Ht was put into dict:
typedef struct dict {
dictType *type; /* Dictionary type */
void *privdata; /* Private data */
dictht ht[2]; /* A dictionary has two hash tables */
long rehashidx; /* rehash index */
unsigned long iterators; /* The number of iterators currently in use */
} dict;
Copy the code
From the bottom to the top
DictEntry – dictht – dict – OBJ_ENCODING_HT
Summary: Hash storage structure
- NULL after dictht means the second HT is not used yet.
- DictEntry star followed by NULL means there’s no hash to this address.
- DictEntry followed by NULL means that no hash conflicts have occurred.
- Question: Why define two hash tables? ht[2]
Redis hash uses HT [0] by default, ht[1] does not initialize and allocate space. The hash table DICtht uses the chain address method to solve the collision problem. In this case, the performance of the hash table depends on the ratio between its size (the size attribute) and the number of nodes it holds (the used attribute) :
- When the ratio is 1:1 (a hash table ht stores only one entry), the hash table performs best.
- If the number of nodes is much larger than the size of the hash table (the ratio is expressed as ratio, where 5 means an average of 5 entries per HT), the hash table degrades into multiple linked lists, and the performance advantage of the hash table itself is lost.
In this case, expansion is required. This operation in Redis is called rehash.
Steps for rehash:
- Allocates space for the character HT [1] hash table, the size of which depends on the operation to be performed and the number of key-value pairs ht[0] currently contains. Extension: ht[1] size is the first one greater than or equal to HT [0]. Used *2.
- Rehash all nodes on HT [0] to HT [1], recalculate the hash value and index, and place them in the specified location.
- When ht[0] is fully migrated to HT [1], the space of HT [0] is freed, ht[1] is set to ht[0] table, and a new HT [1] is created in preparation for the next rehash.
Question: When will capacity expansion be triggered?
Load factors (dict.c) :
static int dict_can_resize = 1;
static unsigned int dict_force_resize_ratio = 5;
Copy the code
Dict_can_resize is 1 and dict_force_resize_ratio If the ratio between the number of used nodes and the size of the dictionary exceeds 1:5, capacity expansion is triggered
Dictexpandifneeded (dict. C)
if (d->ht[0].used >= d->ht[0].size &&(dict_can_resize || d->ht[0].used/d->ht[0].size > dict_force_resize_ratio)){
return dictExpand(d, d->ht[0].used*2);
}
return DICT_OK;
Copy the code
DictExpand (source dict.c)
static int dictExpand(dict *ht, unsigned long size) {
dict n; /* the new hashtable */
unsigned long realsize = _dictNextPower(size), i;
/* the size is invalid if it is smaller than the number of *elements already inside the hashtable */
if (ht->used > size)
return DICT_ERR;
_dictInit(&n, ht->type, ht->privdata);
n.size = realsize;
n.sizemask = realsize- 1;
n.table = calloc(realsize,sizeof(dictEntry*));
/* Copy all the elements from the old to the new table: *note that if the old hash table is empty ht->size is zero, *so dictExpand just creates an hash table. */
n.used = ht->used;
for (i = 0; i < ht->size && ht->used > 0; i++) {
dictEntry *he, *nextHe;
if (ht->table[i] == NULL) continue;
/* For each hash entry on this slot... * /
he = ht->table[i];
while(he) {
unsigned int h;
nextHe = he->next;
/* Get the new element index */
h= dictHashKey(ht, he->key) & n.sizemask; he->next = n.table[h];
n.table[h] = he; ht->used--;
/* Pass to the next element */
he = nextHe;
}
}
assert(ht->used == 0);
free(ht->table);
/* Remap the new hashtable in the old */
*ht = n;
return DICT_OK;
}
Copy the code
Shrinkage: server. C
int htNeedsResize(dict *dict) {
long long size, used;
size = dictSlots(dict);
used = dictSize(dict);
return (size > DICT_HT_INITIAL_SIZE &&(used*100/size < HASHTABLE_MIN_FILL));
}
Copy the code
Application scenarios
- String
Whatever String can do, Hash can do.
- Store object type data
For example, an object or a table’s data saves more key space than String and is easier to manage centrally.
- The shopping cart
Key: indicates the user ID. Field: commodity ID; Value: indicates the quantity of a commodity.
+ 1: hincr. 1: hdecr. Delete: hdel. All: hgetall. Number of goods: Hlen.
By the way
There is a problem? Can you leave me a message or chat privately? Just give it a thumbs up
Of course, you can also go to my official account “6 Xi Xuan”,
Reply to “Learn” and receive a copy of the Video tutorial for Advanced Architects for Java Engineers
Answer “interview”, can obtain:
MySQL brain Map MySQL brain map
There are [Ali cloud] [Tencent cloud] purchase discount oh ~ specific please contact me
The sunrise hin I am trained programmers, PHP, Android and hardware are done, but in the end or choose to focus on Java, so have what questions to ask the public for discussion (emotional pouring technology can ha ha ha), see words will reply as soon as possible, hope can with everyone common learning progress, about the server architecture, Java core knowledge analysis, career, interview summary and other articles will be pushed irregularly output, welcome to pay attention to ~~~