preface

Http://image.tingvoa.com://zhouwenxing.github.io/
Copy the code

While Redis is normally used to speed up queries, there is an interesting data structure in Redis that saves memory at the expense of reading and writing speed. This is Ziplist. Why did Redis do this? Did you really feel that you were going too fast and that it wouldn’t hurt to sacrifice a little speed?

What is a compressed list

Ziplist is a data structure designed to save memory. A Ziplist is a sequential data structure composed of a series of special codes. A Ziplist can contain any number of entries, and each entry can hold a byte array or an integer value.

Ziplist as a list, the biggest difference between ziplist and ordinary double-ended list, such as LinkedList, is that Ziplist does not store Pointers to the front and back nodes, while linkedList generally maintains a pointer to the front node and a pointer to the back node for each node. So ziplist doesn’t maintain Pointers to front and back nodes. How does it find them?

Ziplist does not maintain Pointers to the front and back nodes, but it maintains the length of the previous node and the length of the current node, and calculates the position of the front and back nodes each time by length. Since calculations are involved, there is definitely a performance cost to storing Pointers directly, which is a typical trade of time for space. Because each time the front and back nodes need to be calculated to get the position of the front and back nodes, it will consume more time. In Redis, a pointer is 8 bytes, but in most cases, if the direct storage length is less than 8 bytes, Therefore, using the memory length design method can save memory space in most scenarios.

Ziplist storage structure

Ziplist consists of:

<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
Copy the code

Zlbytes, Zltail, zllen are the head part of Ziplist, entry is the entries part of Ziplist, each entry represents a data, and zlend represents the end part of Ziplist. As shown below:

The meanings of each attribute in Ziplist are shown in the following table:

attribute type Long degrees instructions
zlbytes uint32_t 4 bytes Records the number of bytes of memory used by the compressed list (including its own memory)4Bytes).
zltail uint32_t 4 bytes Record the number of bytes between the end of the compressed list and the start address of the compressed list.
zllen uint16_t 2 – Record the number of nodes contained in the compressed list. When the list value exceeds the maximum value that can be stored (65535), this value is stored permanently65535(i.e.216The power cut1), so it is necessary to traverse the whole compressed list to calculate the real number of nodes.
entry node Compress each node in the list, the length of which is determined by the actual data stored.
zlend uint8_t 1 byte Special characters0xFF(Decimal system255) to mark the end of the compressed list (otherwise normal nodes are not marked as255Because the255It is used to identify the end. As you can see later, normal nodes are marked as254).

Entry storage structure

Ziplist head and end store length and mark, while entry stores specific elements, which is a specially designed storage format. Each entry is prefixed with metadata containing two pieces of information, and the composition structure of each entry is as follows:

<prevlen> <encoding> <entry-data>
Copy the code

prevlen

The Prevlen property stores the length of the previous entry and allows you to traverse the list from back to front. The prevlen attribute may be 1 or 5 bytes in length:

  • When the first one in the listentryThe number of occupied bytes is smaller than254At this time,prevlenonly1Represents in bytes.
<prevlen from 0 to 253> <encoding> <entry>
Copy the code
  • When the first one in the listentryThe number of occupied bytes is greater than or equal to254At this time,prevlen5Bytes, where the first1The value of bytes is fixed254(this is a tag that indicates that a larger value has been followed), after4One byte is the actual storage before the otherentryNumber of bytes occupied by.
0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
Copy the code

Note: one byte is enough to store the size of 255, and the reason for taking 254 is because zlend is a fixed 255, so 255 is used to determine if it is the end of a Ziplist.

encoding

The Encoding property stores the type and length of the data stored in the current entry. Encoding Contains 1, 2, or 5 bytes. As mentioned earlier, each entry can hold byte arrays and integers, and the first byte of the Encoding property is used to determine whether the current entry is an integer or an array of bytes. When storing integers, the first two bytes are always 11, while when storing byte arrays, it may be one of 00, 01, or 10.

  • When storing integers, no1The first byte2A fixed for11, the other bits are used to record the type or value of an integer value (the first two digits in the encodings shown in the following table are11) :
coding The length of the Entry Data saved
11000000 1 byte Int16_t Integer
11010000 1 byte Int32_t An integer
11100000 1 byte Int64_t Integer
11110000 1 byte 24 – bit signed integer
11111110 1 byte An 8-bit signed integer
1111xxxx 1 byte xxxxOn behalf of the interval0001-1101., stores an intermediate between0-12Between integers, at this timeentry-dataAttribute omitted

Note: the XXXX bit code range is 0000-1111, but 0000,1111 and 1110 are already occupied by the data types shown in the table earlier, so the actual range is 0001-1101, which can save data 1-13, and after subtracting 1 the range is 0-12. As for the reason of subtracting 1, 0 is a very commonly used data in terms of usage, so we choose to uniformly subtract 1 to store a range of 0-12 instead of directly storing the range of 1-13.

  • When storing byte arrays, no1The first byte2Bit is00,01or10Other bits are used to record the length of the byte array:
coding The length of the Entry Data saved
00pppppp 1 byte Length less than or equal to63Byte (6Bit) byte array
01pppppp qqqqqqqq 2 – Length less than or equal to16383Byte (14Bit) byte array
10000000 qqqqqqqq rrrrrrrr ssssssss tttttttt 5 bytes Length less than or equal to232The power cut132Bit), where the first byte1The last byte of6Bit is set to0I haven’t used it yet32A (4One byte) stores data

entry-data

Entry-data Stores specific data. When storing small integers (0-12), because encoding is the data itself, the entry-data part is omitted. The entry structure in ziplist is as follows:

<prevlen> <encoding>
Copy the code

The data structure of entry in the compressed list is defined as follows (source ziplist.c file), of course, the actual storage does not directly use this structure definition, this structure is only used to receive data, so we can understand:

typedef struct zlentry {
    unsigned int prevrawlensize;// Stores the number of bytes used for prevrawlen
    unsigned int prevrawlen;// Store the number of bytes required by the last linked list node
    unsigned int lensize;// Store the number of bytes used by len
    unsigned int len;// Stores the number of bytes of the current node in the linked list
    unsigned int headersize;// PrevrawlenSize + lenSize is the size of the non-data field
    unsigned char encoding;// Encoding mode
    unsigned char *p;// Point to the starting position of the current node (because the data in the list is also a string object)
} zlentry;
Copy the code

Ziplist data example

Explained above, most may feel boring, everyone may feel fog, this has nothing to do, such as long as the in the mind have a concept, it is ok to use corresponding information when, do not need to all remember, let us together through two examples to experience how ziplist to organize data storage.

The following is an example of a compressed list containing two nodes containing integers 2 and 5:

[0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
      |             |          |       |       |     |
   zlbytes        zltail     zllen    "2"     "5"   end
Copy the code
  1. The first group4Bytes tozlbytesPart,0fIt’s going to be binary1111That is15, for the wholeziplistLength is15Bytes.
  2. The second group4byteszltailPart,0cIt’s going to be binary1100That is12, which records the number of bytes from the end of the compressed list to the start address, i.e[02 f6]The tail node is 1, 2, 3 away from the starting position12Bytes.
  3. The third group2One byte is the current recordziplistentryThe number of,02It’s going to be binary10In other words, currentlyziplist2A node.
  4. The fourth group2bytes[00 f3]That’s the first oneentry.00said0Because this is number one1So the length of the previous node is0.f3It’s going to be binary11110011, which corresponds to the code in the table1111xxxx, so the last four digits store one0-12An integer of bits.0011In decimal form3, minus1get2So the first oneentryThe stored data is2.
  5. With five2bytes[02 f6]That’s the second oneentry.02Is the2Is the length of the previous node2And notice, because the result here is less than254So that means that we’re only using1Bytes to store the length of the previous node (if equal to254That tells us what happens next4Bytes only store length), so the followingf6That is, the data of the current node is converted to binary zero11110110, corresponds to the codes in the table1111xxxxSame last four0110It’s real data. It’s calculated to be 5.
  6. The last set of 1 bytes [ff] is converted to binary11111111Which means that this is the wholeziplistIn the end.

If a Hello World string is added to the list, an entry will be added, as follows:

[02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
Copy the code
  1. The first group of1bytes02In decimal form2Represents the previous node (in the example above[02 f6]The length is2.
  2. The second group of2bytes0bConvert to binary theta00001011In order to00At the beginning, it fits the code00ppppppAnd get rid of the first two00After calculation, it is decimal11, which means that the length of the following byte array is11.
  3. The third group happens to be11One byte corresponds to the length above, so this is real storageHello WorldAn array of bytes.

Ziplist chain update problem

As mentioned above, the Prevlen attribute in Entry can be 1 byte or 5 bytes, so let’s imagine a scenario like this: Suppose that in a Ziplist, the length of consecutive entries is close to but less than 254 (between 250 and 253), then each node in the Ziplist only uses 1 byte to store the length of the previous node. If a new node is added at this time, If entry1 is larger than 254 bytes, the prelen attribute of entry2, the next entry1 node, must be changed from 1 byte to 5 bytes, that is, the space needs to be allocated. If entry3 is larger than 254 bytes, its next entry3 will have its prelen property changed to 5 bytes. In this way, the phenomenon of continuous multiple spatial redistribution is called chain renewal. Similarly, not only new nodes can be added, but also cascading updates can occur when nodes are deleted.

Although ziplist may have this kind of cascading update scenario, if it only occurs between a few nodes, the performance will not be seriously affected, and the probability of such a scenario is relatively low, so do not worry too much about the actual use.

conclusion

This paper mainly explains the Ziplist (compressed list) in Redis, a data structure with time for space, in the introduction of compressed list storage structure at the same time through a storage example to analyze ziplist is how to store data, and finally introduced ziplist may occur in the chain update problem.