“This is the 29th day of my participation in the First Challenge 2022. For details: First Challenge 2022.”

SDS stands for Simple Dynamic String and is the basis of the Redis design implementation.

Redis source code to describe unstable source code.

The data structure

In order to save memory, Redis adopts different data structures for different length data. Sds.h defines the following five types. Generally, SDS_TYPE_5 is not used because it does not store data length and needs to be allocated and released each time.

#define SDS_TYPE_5  0
#define SDS_TYPE_8  1
#define SDS_TYPE_16 2
#define SDS_TYPE_32 3
#define SDS_TYPE_64 4
Copy the code

Take type = 1 as an example

typedef char *sds;

/* __attribute__ ((__packed__)) is used to tell the compiler to unalign the structure during compilation, taking */ as it actually does
struct __attribute__(((packed__)) sdshdr8 {
    uint8_t len; /* Data length */
    uint8_t alloc; /* Remove u-heads and null terminators, valid length + data length */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    // Change the length of data
    char buf[];
};
Copy the code

Show an example of data:

  • A value of 0 for the free attribute indicates that the SDS has not allocated any unused space.
  • The len attribute has a value of 5, which means that the SDS holds a string of 5 lengths.
  • The buf property is an array of type char that holds the first five bytes of the strings ‘R’, ‘e’, ‘d’, ‘I’, and ‘s’. The last character holds the null character ‘\0’.

A demonstration of data compression:

Spatial extension capacity

  • Current valid length >= New length, directly return;

  • After the update, determine whether the old and new types are consistent;

    • Use remailc consistently, otherwise use malloc + Free

      • A. The current valid length is greater than or equal to the new length
  • Increase the step length

    • The new length is less than the pre-allocated length (1024 x 1204), which is twice the length
    • The new length is greater than or equal to the pre-allocated length, and the pre-allocated length is added each time (to reduce unnecessary memory).

File location: SDS.c /_sdsMakeRoomFor

#define SDS_MAX_PREALLOC (1024*1024)

sds _sdsMakeRoomFor(sds s, size_t addlen, int greedy) {
    void *sh, *newsh;
    size_t avail = sdsavail(s);
    size_t len, newlen, reqlen;
    char type, oldtype = s[- 1] & SDS_TYPE_MASK;
    int hdrlen;
    size_t usable;

    /* Return ASAP if there is enough space left. */
    // Current length >= New length
    if (avail >= addlen) return s;

    len = sdslen(s);
    sh = (char*)s-sdsHdrSize(oldtype);
    reqlen = newlen = (len+addlen);
    assert(newlen > len);   /* Catch size_t overflow */
    if (greedy == 1) {
        // The new length is smaller than the pre-allocated length (1024 x 1024, doubled: SDS_MAX_PREALLOC = 1024 x 1024)
        if (newlen < SDS_MAX_PREALLOC)
            newlen *= 2;
       // The new length is approximately equal to the pre-allocated length
        else
            newlen += SDS_MAX_PREALLOC;
    }

    type = sdsReqType(newlen);

    /* Don't use type 5: the user is appending to the string and type 5 is * not able to remember empty space, so sdsMakeRoomFor() must be called * at every appending operation. */
    if (type == SDS_TYPE_5) type = SDS_TYPE_8;
    
    // Remalloc is used for both new and old types; otherwise, malloc + freea is used; the current valid length >= new length is returned
    hdrlen = sdsHdrSize(type);
    assert(hdrlen + newlen + 1 > reqlen);  /* Catch size_t overflow */
    if (oldtype==type) {
        newsh = s_realloc_usable(sh, hdrlen+newlen+1, &usable);
        if (newsh == NULL) return NULL;
        s = (char*)newsh+hdrlen;
    } else {
        /* Since the header size changes, need to move the string forward, * and can't use realloc */
        // If not, the memory needs to be reallocated
        newsh = s_malloc_usable(hdrlen+newlen+1, &usable);
        if (newsh == NULL) return NULL;
        memcpy((char*)newsh+hdrlen, s, len+1);
        s_free(sh);
        s = (char*)newsh+hdrlen;
        s[- 1] = type;
        sdssetlen(s, len);
    }
    usable = usable-hdrlen- 1;
    if (usable > sdsTypeMaxSize(type))
        usable = sdsTypeMaxSize(type);
    sdssetalloc(s, usable);
    return s;
}
Copy the code

Space contraction capacity

In the trim operation, lazy free is used, that is, no immediate memory reallocation is used to reclaim the shortened bytes, only moving and marking, and modifying the data.

File location SDS.c/sdSTRIm

sds sdstrim(sds s, const char *cset) {
    char *end, *sp, *ep;
    size_t len;

    sp = s;
    ep = end = s+sdslen(s)- 1;
    while(sp <= end && strchr(cset, *sp)) sp++;
    while(ep > sp && strchr(cset, *ep)) ep--;
    len = (ep-sp)+1;
    if(s ! = sp) memmove(s, sp, len); s[len] ='\ 0';
    sdssetlen(s,len);
    return s;
}
Copy the code

The actual deletion operation is shown in tryObjectEncoding in subsequent operations

File location server. C/tryObjectEncoding

void trimStringObjectIfNeeded(robj *o) { if (o->encoding == OBJ_ENCODING_RAW && sdsavail(o->ptr) > sdslen(o->ptr)/10) { o->ptr = sdsRemoveFreeSpace(o->ptr); }}Copy the code

advantages

  • Constants get string length (len)
  • Avoid buffer overflows
  • Reduces the number of frequent memory reallocations caused by string modifications
  • Binary operation security: Can hold text data as well as binary data (such as video streams, etc.) in any format
  • Ends with ‘\0’ to make it compatible with some C string functions

other

  • SDS is an alias for char*, which can be understood as allocating a contiguous block of memory (header + data) to improve access speed based on locality principle.
  • The data store does not use SDS_TYPE_5 because it needs to be overcharged every time new data of this type is generated.
  • Using C language memory layout, a 0 length array is used in SDS structure, which can not only achieve variable length, but also ensure that the memory is continuous. Therefore, in a series of OPERATIONS in SDS, it is very surprising to see such operations as S [-1], of course, s points to the buF position.

File location: SDS.c/sdSALloc

/* sdsalloc() = sdsavail() + sdslen() */
static inline size_t sdsalloc(const sds s) {
    unsigned char flags = s[- 1];
    switch(flags&SDS_TYPE_MASK) {
        case SDS_TYPE_5:
            return SDS_TYPE_5_LEN(flags);
        case SDS_TYPE_8:
            return SDS_HDR(8,s)->alloc;
        case SDS_TYPE_16:
            return SDS_HDR(16,s)->alloc;
        case SDS_TYPE_32:
            return SDS_HDR(32,s)->alloc;
        case SDS_TYPE_64:
            return SDS_HDR(64,s)->alloc;
    }
    return 0;
}
Copy the code

The resources

  • Redis Design and Implementation, Huang Jianhong
  • github.com/redis/redis