Listpack (Compact list)

Redis5.0 introduces a new data structure, ListPack, which is an improved version of Ziplist. It is more frugal and compact in storage and structure than Ziplist. Listpack is currently used in the Stream data structure.

Listpack structure:

struct listpack<T> { int32 total_bytes; // The total number of bytes occupied int16 size; T[] entries; // List of elements int8 end; // Like ziplist zlend, always 0xFF}Copy the code

Entries are structured as follows:

struct lpentry {
    int<var> encoding;
    optional byte[] content;
    int<var> length;
}
Copy the code

Listpack differs from Ziplist

  1. Ziplist uses the zltail_offset field to locate the last element for reverse traversal, but listpack does not have this field.

  2. The entries structure is basically the same, but the length field is placed at the end of the listpack and stores not the length of the previous element but the length of the current one, so the total_bytes field and the length of the last element can be used to calculate the location of the last element. No additional ZLTAIL_offset field is required.

  3. The entries in the Listpack no longer store the length of the previous element, so there is no problem with cascading updates.

Lpentry length field

The length field is encoded using varint, which can be any length from 1 to 5 bytes. The encoding is the same as UTF8, which determines the encoding length by whether the highest bit of the byte is 1.

That is, if the highest bit of a byte is 1, another byte continues to represent its length until the highest bit of a byte is 0.

Lpentry encoding field

Encoding is designed with different values to distinguish between the types of content in the entry, and the size of the bytes used for encoding varies depending on the case.

string

  1. 10XXXXXX (1 byte): a string of 0 to 63 characters. The content field stores the content of the string.

  2. 1110XXXX YYYYYYYY (2 bytes): the value is a medium length string that ranges from 0 to 4095. The content field is a string.

  3. 11110000 aaAAaAAA BBBBBBBB CCCCCCCC DDDDDD (5 bytes) : a large character string. The last four bytes indicate the length. The content field stores the content.

The integer

  1. 0XXXXXXX (1 byte) : a non-negative small integer ranging from 0 to 127.

  2. 110XXXXX YYYYYYYY (2 bytes): a 13-bit signed integer ranging from -2048 to 2047.

  3. 11110001(1 byte): a 16-bit (2 byte) signed integer containing content.

  4. 11110010(1 byte): a 24-bit (3 byte) signed integer containing content.

  5. 11110011(1 byte): a 32-bit (4 byte) signed integer containing content.

  6. 11110100(1 byte): a 64 – bit (8 bytes) signed integer containing content.

terminator

  1. 11111111(1 byte): represents the end of the listpack symbol 0xFF.

Rax (Radix tree)

Radix Tree, raX full Radix Tree, is an ordered dictionary Tree that can sort by key and support fast positioning, insertion and deletion. Unlike Hash and Zset, hash does not have sorting function, and Zset sorts by score.

Evolution of Redis infrastructure:

Application scenarios

  1. English dictionary can be regarded as a RAX, its words are arranged according to the lexicographical order, and each word has its own interpretation, which is value, and the word is key. With this tree, words can be quickly retrieved and words starting with certain prefixes can be queried.

  1. For residents’ file information of public Security Bureau, key is the resident id number and value is the resume information. The prefix code of ID card is graded according to the region, so specific residents’ information can be quickly located and matched with residents’ information matching the prefix.

  2. Time series application, the timestamp is the key, value is the event of the timestamp, the timestamp encoding can be arranged according to year, month, day, hour, minute, second, millisecond, microsecond, nanosecond level, so it can also quickly locate what happened at a specific time and what happened within a range.

  3. Web server routing, routing is regarded as a tree, each URL corresponds to a request processor, when the new request comes, the URL is used to match with the tree to find the corresponding request processor, because there may be regular pattern in THE URL, and there is no requirement on the order of nodes in the same layer, so it is not a strict RAX.

Application in Redis

  1. stream

    In Redis, RAX is applied to the message queue in the stream structure. The message ID of the stream is prefixed with timestamp + ordinal number, so it conforms to the time series.

    Using raX structure for storage can quickly locate the corresponding message according to the message ID, while traversing the message after the specified message.

  2. cluster

    Slots_to_keys is used to record the mapping between slots and keys in a cluster. The raxNode key is a combination of slot number hashslot and key.

    Object keys of the same Hashslot are hung under the same raxNode, so you can quickly locate all object keys in a slot.

    The number of slots in a cluster is 16384, requiring a 2-byte representation, so the key structure is a string of 2-byte hashslot and object key concatenated as follows.

structure

Nodes under RAX can be divided into three types: root node, leaf node and intermediate node. Some intermediate nodes have value, while some intermediate nodes have no value for structural needs.

struct raxNode { int<1> isKey; Int <1> isNull; Int <1> isCompressed; Int <29> size; // Number of leaf nodes or length of compressed string byte[] data; // Store routing keys, leaf Pointers, value}Copy the code

Rax in Redis is not a standard Radix Tree structurally, if an intermediate node has multiple leaf nodes, the routing key is one character; If there is only one leaf node, the routing key is a string.

When there is only one leaf node, it is the compression node (a string of characters pressed together). The dark blue node in the image below is the compression node:

RaxNode. Data structure

Compression node

The case where there is only one leaf node is the compression node.

The pseudo-code of data field of compressed node is as follows:

Struct data {optional struct {// Depends on raxNode size field is 0 byte[] childKey; RaxNode * childNode; } child; optional string value; // Depending on the raxNode isNull field}Copy the code

If there is no node behind the compression node, the childNode does not exist, and if it is an intermediate node, the value field does not exist.

Uncompressed node

If there are multiple leaf nodes, it is not a compression node. There are multiple routing keys, and each key is a character.

struct data { byte[] childKeys; RaxNode *[] childNodes; // Multiple leaf Pointers optional string value; // Depending on the raxNode isNull field}Copy the code

Non-compressed node structure: