Redis data Structure (2)

intset

Encoding is intSet (set of small integers) when set stores integers

typedef struct intset {
    int32 encoding;
    int32 length;
    int contents[];
}
Copy the code

field	describe	instructions
encoding	Determines whether the integer bit width is 16, 32, or 64 bits	Enumeration said
length	Number of elements
contents	An integer array that stores element values

Intset stores the elements in order of smallest to largest. When storing an element, the encoding is upgraded based on the integer size, finds the location where the element is to be inserted, and if it is not the last, moves the element after the location one bit, and inserts the element last. If the inserted element is not an integer, the storage becomes a hash structure.

ziplist

If hash and Zset meet the following conditions, the encoding type is ziplist (compressed list). For details, see the configuration file.

Hash-max-ziplist-entries 512 # Hash-max-ziplist-value 64 # zset-max-ziplist-entries 128 for hash keys or values less than 64 # zset-max-ziplist-value 64 # zset < 64Copy the code

typedef struct ziplist {
    int32 zlbytes;
    int32 zltail_offset;
    int16 zllength;
    T[] entries;
    int8 zlend;
}

typedef struct entry {
    int<var> prevlen;
    int<var> encoding;
    byte[] content;
}
Copy the code

field	describe	instructions
zlbytes	Number of bytes occupied by ziplist
zltail_offset	The offset of the last element from the start of the compression list	It is used to quickly locate the last node and traverse in reverse order
zllength	Number of elements
entries	Compression element
zlend	Marks the end of the compressed list	Constant for the FF

field	describe	instructions
prevlen	Length of the previous entry in bytes	The first entry is always 0 and the byte length changes dynamically. If the string length is less than 254, one byte is used; otherwise, five bytes are used
encoding	The encoding type	The encoding type changes dynamically based on the element content, which is extremely complex and will not be described in detail in this articleZiplist encoding type
content	Element content, optional

Below is a ziplist demo

Byte 1-4, zlbytes is 25, indicating that the compressed list occupies 25 bytes in total
Bytes 5-8, zltail_offset is 22, indicating that the last element starts at 22
Bytes 9-10, zllength 3, indicating that there are three elements
Bytes 11-16, the first entry: prevlen=0 because it is preceded by no data item; Encoding =4, which indicates that the following 4 bytes are stored as strings and the data value is name
Bytes 17-21, second entry: Prevlen =6, indicating that the previous entry occupies 6 bytes. Encoding =3, which indicates that the following 3 bytes are stored as strings. The value of the data is why
Prevlen =5, indicating that the previous entry occupies 5 bytes. Encoding =0xFE, which is an integer stored in the following 1byte. The value of the data is 14
Byte 25, zlend for FF, marks the end of the compressed list

When ziplist is used to store hash structures, the key and value are stored as one entry.

As you can see, the compressed list is very compact. When an entry length becomes 254, the prevlen of the next entry expands from 1 byte to 5 bytes. This is a cascading update

quicklist

Quicklist is used to store lists. It is a hybrid of Ziplist and LinkedList, which is similar to a two-way list structure.

By default, a Ziplist in QuickList is 8K long. If the ziplist is longer than this length, another node will be created, which can be configured in the configuration file.

# -2 indicates 8K, enumeration types can be seen in the configuration file list-max-ziplist-size-2Copy the code

Quicklist has a default compression depth of 0, that is, no compression. If the compression depth is 1, the first and last two packets are not compressed. If the compression depth is 2, the first two packets and the last two packets are not compressed. You can configure this parameter in the configuration file.

list-compress-depth 0
Copy the code

skiplist

Zset uses dict to map values to scores, but also provides sorting by score, hence skiplist.

Let’s start with a demo of Skiplist

typedef struct zsl {
    zslnode* header;
    zslnode* tail;
    int maxLevel;
}
Copy the code

typedef struct zslnode {
    sds value;
    double score;
    zslforward*[] forwards;
    zslnode* backward;
}
Copy the code

typedef struct zslforward {
    zslnode* item;
    int span;
}
Copy the code

field	describe	instructions
header	A header pointer to the jump list	Value is fixed to NULL, score is fixed to 0, and BACKWARD is fixed to NULL
tail	Tail pointer to jump list
maxLevel	Maximum number of layers in the current skip table	The maximum is 64
value	Used to store string data
score	Used to store score values
backward	The fallback node	The ← arrow in the figure
forwards	Forward node	→ arrows in the figure, one for each layer
span	Span, which stores how many nodes are skipped between one node and the next	If score1 points to score5, the span value is 4,This is how ranking works

The minimum score backward is fixed with NULL, and for each newly inserted node, a random algorithm is called to assign it a reasonable number of layers

Level1 =0.75 *0.25=0.1875 Level3 =0.1875…… Leveln has a probability of (1-0.25)* math.pow (0.25,n-1)

conclusion

Redis, as a single thread memory service, has made a lot of optimization in response and data structure, which is worth learning

Object type	The encoding type
string	Int, raw, embstr
list	quicklist
hash	Dict, ziplist
set	Intset, dict
zset	Ziplist, skiplist + dict

HyperLogLog

The principle of HyperLogLog is Bernoulli’s test, that is, when a coin is lost, according to the number of consecutive tails X, it is calculated that a total of 2 X power coins have been lost. When X is very large, the error between the calculated total and the actual total is very close. For details, please refer to other articles.

pfadd

Element hashes to a fixed 64-bit value. The lower 14 bits are the bucket. The lower 50 bits are the bucketCopy the code

Assume that the hash value is {omit 45 bits}01100 00000000000101

The lower 14 bits of binary are converted to base 10 with a value of 5 (regnum), that is, we place the data in the fifth bucket
The first 1 in the top 50 is 3, so the count value is 3
Registers [5] extracts the historical value oldcount
Registers [5] = count is updated if count > oldcount
If count <= oldcount, no processing is done

HyperLogLog uses 16,384 buckets and each bucket takes up 6 bits, so a HyperLogLog takes up 12K of memory.

pfcount

Harmonic average: If my salary is 10_000 and Jack ma's salary is 1_000_000, then the average salary of Jack ma and I is 505_000, I definitely disagree... If harmonic mean is used, it is 2/(1/10_000+1/1_000_000)=19_801 similarly, the average number of buckets is n/(1/ bucket 1 bit +1/ bucket 2 bits +... +1/ bucket n digits) average number of buckets is: math.pow (2, average number of bucket digits) Total number: Case 4: constant = 0.673 * m * m; case 4: constant = 0.673 * m * m; case 4: constant = 0.673 * m * m; Case 5: constant = 0.697 * m * m; Case 6: constant = 0.709 * m * m; Default: constant = (0.7213 / (1 + 1.079 / m)) * m * m; }Copy the code

intset

ziplist

quicklist

skiplist

conclusion

HyperLogLog

pfadd

pfcount

Related Posts

Java Basics – Knowledge points

A HashMap analysis

Go implements AES encryption CBC mode | Go Theme month