The cornerstone of Redis - Ziplist - Moment For Technology

Redis has a lot of data structures in its open API. List, set, hash, zset, string… In fact, however, this is just the result of encapsulation. Even if the list is the same, redis will use different data structures to implement it under different data volumes.

This time I’m talking about compressed lists. A data structure within Redis. What was it designed for? Redis is positioned as a cache, which determines that it will have a great QPS. How to improve its performance? Data will eventually be stored on disk, and the larger the amount of data, the longer it will take to retrieve data. It’s easy to think about keeping hot data in the cache. However, memory is limited, so how do you ensure that only valid data is stored in memory? The first concerns its memory flushing strategy. Based on LRU, LFU algorithm removes non-hotspot data from the memory and increases the proportion of hotspot data in the memory. The second point is to store as much data as possible in limited memory through compression algorithms. Lucene has the same idea, and lucene’s indexes are kept as small as possible so that they can be loaded directly into memory to improve query efficiency.

Ziplist is based on the second point. Ziplist’s data structure looks like this… The first 32-bit corresponding to ZLbytes stores the total size of ziplist. The second 32-bit corresponding to ZLTail stores the offset of tailentry (the last entry before Zlend is called tailentry). The third 16-bit corresponding to ZLLen stores the number of entries Zlend represents an end identifier with a value of 255

The data structure for an entry is like this. Prevlen records the length of the previous entry. Only one byte is recorded when the length of the previous entry is less than 254. Only five bytes are recorded when the length of the previous entry is more than 254. (The first byte is 255, indicating that prevlen requires five bytes, and the following four bytes record the true length.)

Encoding Records the encoding mode of data. Encoding is mapped to printed values when data itself is of type int and determines how many bits are needed to represent int. When data is a string, it is encoded in bytes based on the length of the data.

Data records actual data. Compression only for int types. For example, if the int value is -128~127, only one byte is needed to represent it, saving 3 bytes.

Ziplist can’t actually compress strings, so it’s only good for storing ints. This structure is not suitable when the k/ V base memory overhead is high. Redis automatically selects the appropriate data structure to implement the list internally

A comment on the core method is posted below

macros

/* Return total bytes a ziplist is composed of. */
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))

/* Return the offset of the last item inside the ziplist. */
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))

/* Return the length of a ziplist, or UINT16_MAX if the length cannot be
 * determined without scanning the whole ziplist. */
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))

/* The size of a ziplist header: two 32 bit integers for the total
 * bytes count and last item offset. One 16 bit integer for the number
 * of items field. */
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))

/* Size of the "end of ziplist" entry. Just one byte. */
#define ZIPLIST_END_SIZE        (sizeof(uint8_t))

/* Return the pointer to the first entry of a ziplist. */
#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)

/* Return the pointer to the last entry of a ziplist, using the
 * last entry offset inside the ziplist header. */
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))

/* Return the pointer to the last byte of a ziplist, which is, the
 * end of ziplist FF entry. */
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)

#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do {                     \
    ZIP_DECODE_PREVLENSIZE(ptr, prevlensize);                                  \
    if ((prevlensize) == 1) {                                                  \
        (prevlen) = (ptr)[0];                                                  \
    } else if ((prevlensize) == 5) {                                           \
        assert(sizeof((prevlen)) == 4);                                        \
        memcpy(&(prevlen), ((char*)(ptr)) + 1, 4);                             \
        memrev32ifbe(&prevlen);                                                \
    }                                                                          \
} while(0);

Copy the code

ZiplistNew Creates a Ziplist

Unsigned char *ziplistNew(void) {// Head +end // head is int32 + 1 int16 // end is int8 unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE; unsigned char *zl = zmalloc(bytes); ZIPLIST_BYTES(zl) = intrev32ifbe(bytes); ZIPLIST_BYTES(zl) = intrev32ifbe(bytes); ZIPLIST_TAIL_OFFSET(zl) = ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE); ZIPLIST_LENGTH(zl) = 0; zl = 0; zl = 0; 255 zl[bytes-1] = ZIP_END; 255 zl[bytes-1] = ZIP_END; return zl; }Copy the code

A ziplistPush inserts a string into the head or tail of a ziplist

unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) { unsigned char *p; P = (where == ZIPLIST_HEAD) p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl); Return __ziplistInsert(zl,p,s,slen); }Copy the code

__ziplistInsert Insert implementation

unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, Unsigned int slen) {// The first element is the total size, which contains head+end size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), // How many extra bytes reqlen is required; unsigned int prevlensize, prevlen = 0; size_t offset; int nextdiff = 0; unsigned char encoding = 0; long long value = 123456789; /* initialized to avoid warning. Using a value that is easy to see if for some reason we use it uninitialized. */ zlentry tail; /* Find out prevlen for the entry that is inserted. */ // The prevlen of the first entry will be parsed below The length of the first entry should be 0 if (p[0]! = ZIP_END) { ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); Unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);} else {// the logic is the same; If (ptail[0]! Ptail [0]! Prevlen = zipRawEntryLength(ptail); }} /* See if the entry can be encoded * */ if (zipTryEncoding(s,slen,&value,&encoding)) {/* 'encoding' is set to the appropriate integer encoding */ reqlen = zipIntSize(encoding); } else { /* 'encoding' is untouched, However zipStoreEntryEncoding will use the * string length to figure out how to encode it * */ reqlen = slen; } /* We need space for both the length of the previous entry and * the length of the payload Prevlen = zipStorePrevEntryLength(NULL,prevlen); Prevlen = zipStorePrevEntryLength(NULL,prevlen); prevlen = Prevlen (NULL,prevlen); reqlen += zipStoreEntryEncoding(NULL,encoding,slen); /* When the insert position is not equal to the tail, we need to * make sure that the next entry can hold this entry's length in * its prevlen field. */ int forcelarge = 0; // p[0] ! Nextdiff = (p[0]) nextDiff = (p[0]) nextDiff = (p[0] ! = ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0; If (nextDiff == -4 && reqlen < 4) {nextdiff = 0; forcelarge = 1; } /* Store offset because a realloc may change the address of zl. * This gives the offset from the start of ZL to the start of tailentry (end if ziplist is empty) * */ offset = p-zl; Zl = ziplistResize(zl,curlen+reqlen+nextdiff); p = zl+offset; /* Apply memory move when necessary and update tail offset. = ZIP_END) {/* Subtract one because of the ZIP_END bytes * param1 dest param2 SRC param3 len * NextDiff is set to 0 * * If nextDiff is 4 (there is no other case) * the following operation is equivalent to memmove(p+reqlen+nextdiff,p,curlen-offset-1); * */ memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff); /* Encode this entry's raw length in the next entry. How to appear?? */ if (forcelarge) zipStorePrevEntryLengthLarge(p+reqlen,reqlen); Else // Add the length of the new entry to the first zipStorePrevEntryLength(p+reqlen,reqlen); /* Update offset for tail * tailoffset add reqlen * */ ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen); /* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, A change in the * size of prevlen doesn't have an effect on the *tail* offset. */ zipEntry(p+reqlen, &tail); // If the original first is the last entry at this point, tailoffset only needs to add reqlen; otherwise, nextDiff if is required for storing prevlen (p[reqlen+tail.headersize+tail.len] ! = ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } else {/* This element will be the new tail. */ / Mark tail_offset as end ZIPLIST_TAIL_OFFSET(ZL) = intrev32IFBe (p-zL); ZIPLIST_TAIL_OFFSET(ZL) = intrev32IFBe (p-zL); } /* When nextdiff ! = 0, the raw length of the next entry has changed, So * we need to cascade the update throughout the ziplist * If the current entry applies for extra memory to store prevlen * */ if (nextdiff! = 0) { offset = p-zl; zl = __ziplistCascadeUpdate(zl,p+reqlen); p = zl+offset; } // The first part records the prevlen part and the second part corresponds to the Encoding flag. The third part stores the compressed data. Note that STR does not support compression /* Write the entry */ p += zipStorePrevEntryLength(p,prevlen); P += zipStoreEntryEncoding(p,encoding,slen); If (ZIP_IS_STR(encoding)) {memcpy(p,s,slen); ZipSaveInteger (p,value,encoding); ZIPLIST_INCR_LENGTH(zl,1); ZIPLIST_INCR_LENGTH(zl,1); return zl; }Copy the code

ZipRawEntryLength returns the total length of the current entry

unsigned int zipRawEntryLength(unsigned char *p) { unsigned int prevlensize, encoding, lensize, len; ZIP_DECODE_PREVLENSIZE(p, prevlensize); ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len); return prevlensize + lensize + len; }Copy the code

ZipTryEncoding computes the value of encoding

long long value; if (entrylen >= 32 || entrylen == 0) return 0; if (string2ll((char*)entry,entrylen,&value)) { /* Great, Check what's the smallest * of our encoding types that can hold this value. * If the value is within a special range Encoding * */ if (value >= 0 && value <= 12) {if (value >= 0 && value <= 12) { *encoding = ZIP_INT_IMM_MIN+value; Else if (value >= INT8_MIN && value <= INT8_MAX) {* Encoding = ZIP_INT_8B; } else if (value >= INT16_MIN && value <= INT16_MAX) { *encoding = ZIP_INT_16B; } else if (value >= INT24_MIN && value <= INT24_MAX) { *encoding = ZIP_INT_24B; } else if (value >= INT32_MIN && value <= INT32_MAX) { *encoding = ZIP_INT_32B; } else { *encoding = ZIP_INT_64B; } *v = value; return 1; } return 0; }Copy the code

ZipIntSize determines how many bytes the int needs to be stored based on the encoding value

unsigned int zipIntSize(unsigned char encoding) { switch(encoding) { case ZIP_INT_8B: return 1; case ZIP_INT_16B: return 2; case ZIP_INT_24B: return 3; case ZIP_INT_32B: return 4; case ZIP_INT_64B: return 8; If (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) return} // indicates that value ranges from 0 to 12, so value is stored directly in encoding 0; /* 4 bit immediate */ panic("Invalid integer encoding 0x%02X", encoding); return 0; }Copy the code

ZipStorePrevEntryLength uses Prevlen to indicate how many bytes are needed for the length passed in

unsigned int zipStorePrevEntryLength(unsigned char *p, Unsigned int len) {if (p == NULL) {return (len < ZIP_BIG_PREVLEN)? Return (len < ZIP_BIG_PREVLEN)? 1 : sizeof(len)+1; If (len < ZIP_BIG_PREVLEN) {p[0] = len; if (len < ZIP_BIG_PREVLEN) {p[0] = len; return 1; } else { return zipStorePrevEntryLengthLarge(p,len); }}}Copy the code

ZipStoreEntryEncoding Encoding How many bytes are required for the representation

unsigned int zipStoreEntryEncoding(unsigned char *p, unsigned char encoding, Unsigned char len = 1, buf[5]; unsigned char len = 1, buf[5]; If (ZIP_IS_STR(encoding)) {/* Although encoding is given it may not be set for strings, * */ if (rawlen <= 0x3f) {if (! p) return len; / / here 2 ^ 6 | rawlen just can be expressed in six buf [0] = ZIP_STR_06B | rawlen; Else if (rawlen <= 0x3fff) {len += 1; if (! p) return len; buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f); buf[1] = rawlen & 0xff; } else {// The rest of the data occupies 5 slots len += 4; if (! p) return len; buf[0] = ZIP_STR_32B; buf[1] = (rawlen >> 24) & 0xff; buf[2] = (rawlen >> 16) & 0xff; buf[3] = (rawlen >> 8) & 0xff; buf[4] = rawlen & 0xff; }} else {/* Implies INTEGER encoding, so length is always 1. p) return len; buf[0] = encoding; } /* Store this length at p. */ memcpy(p,buf,len); return len; }Copy the code

ZipSaveInteger stores ints in compression mode

void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) { int16_t i16; int32_t i32; int64_t i64; If (encoding == ZIP_INT_8B) {((int8_t*)p)[0] = (int8_t)value; } else if (encoding == ZIP_INT_16B) { i16 = value; memcpy(p,&i16,sizeof(i16)); memrev16ifbe(p); } else if (encoding == ZIP_INT_24B) { i32 = value<<8; memrev32ifbe(&i32); // memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t)); } else if (encoding == ZIP_INT_32B) { i32 = value; memcpy(p,&i32,sizeof(i32)); memrev32ifbe(p); } else if (encoding == ZIP_INT_64B) { i64 = value; memcpy(p,&i64,sizeof(i64)); memrev64ifbe(p); Else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {/* Nothing to do, the value is stored in the encoding itself. */ } else { assert(NULL); }}Copy the code

Full note :github.com/a137872798/…

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

The cornerstone of Redis — Ziplist

The cornerstone of Redis — Ziplist

Related Posts

Python modifies file names in batches in a second, saving time and hassle

Python Basics Generator | Python theme month

Go language efficient Web development four: Layered domain model and project structure layering