Note: The Redis source code analyzed in this series is at github.com/Sidfate/red… , which was the latest version when this article was published.

As mentioned in the earlier dictionary article, the key-value store of an entire Redis DB is itself a dict. The key of that dict is always an SDS string (see my previous article), while the value can be one of several types. To manage these value types uniformly, Redis wraps every value in a common object structure, redisObject.

Source structure

The source code structure of redisObject is as follows:

    typedef struct redisObject {
        unsigned type:4;
        unsigned encoding:4;
        unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                                * LFU data (least significant 8 bits frequency
                                * and most significant 16 bits access time). */
        int refcount;
        void *ptr;
    } robj;
The fields have the following meaning:

  • type: the object's data type. It occupies 4 bits.
  • encoding: the object's encoding. It occupies 4 bits; all of its values are listed further down.
  • lru: the idle time of the object. It occupies 24 bits.
  • refcount: the reference count. It allows robj objects to be shared under certain circumstances.
  • ptr: the data pointer, pointing to the real data. For example, for a robj representing a string, ptr might point to an SDS structure; for a robj representing a list, ptr might point to a quicklist.
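To make the role of each field concrete, here is a minimal sketch of how such an object could be allocated and filled in. It is not the Redis implementation: `create_object_sketch` is a hypothetical helper written for this article, it uses plain `malloc` instead of Redis's own allocator, and `LRU_BITS` is assumed to be 24 to match the field description above.

```c
#include <assert.h>
#include <stdlib.h>

#define LRU_BITS 24          /* assumption: the lru field is 24 bits wide */
#define OBJ_STRING 0         /* string type, as defined later in this article */
#define OBJ_ENCODING_RAW 0   /* raw encoding, as defined later in this article */

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;   /* 4 + 4 + 24 = 32 bits of bit-fields in total */
    int refcount;
    void *ptr;
} robj;

/* Hypothetical helper: wrap `ptr` in a freshly allocated object of `type`.
 * Returns NULL if allocation fails. The caller keeps ownership of `ptr`. */
robj *create_object_sketch(unsigned type, void *ptr) {
    assert(type < 16);               /* the type field only holds 4 bits */
    robj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;  /* default encoding in this sketch */
    o->lru = 0;                      /* idle-time bookkeeping is left out here */
    o->refcount = 1;                 /* a new object starts with one reference */
    o->ptr = ptr;
    return o;
}
```

Calling `create_object_sketch(OBJ_STRING, some_buffer)` yields an object whose `type` is `OBJ_STRING`, whose `refcount` is 1, and whose `ptr` points at the buffer; releasing it in this sketch is a plain `free`.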

The type field takes one of the five common object types we have already covered. Their values are as follows:

    /* The actual Redis Object */
    #define OBJ_STRING 0    /* String object. */
    #define OBJ_LIST 1      /* List object. */
    #define OBJ_SET 2       /* Set object. */
    #define OBJ_ZSET 3      /* Sorted set object. */
    #define OBJ_HASH 4      /* Hash object. */

The encoding field exists because a single type can have several internal representations. As mentioned in the string article, when type = OBJ_STRING (the robj stores a string), encoding can be one of the following:

  • OBJ_ENCODING_RAW
  • OBJ_ENCODING_INT
  • OBJ_ENCODING_EMBSTR

The encoding also tells us which underlying data structure Redis is currently using for a value, and the source code analysis in this series is organized from that perspective. Redis defines the encoding values as follows:

    /* Objects encoding. Some kind of objects like Strings and Hashes can be
     * internally represented in multiple ways. The 'encoding' field of the object
     * is set to one of this fields for this object. */
    #define OBJ_ENCODING_RAW 0     /* Raw representation */
    #define OBJ_ENCODING_INT 1     /* Encoded as integer */
    #define OBJ_ENCODING_HT 2      /* Encoded as hash table */
    #define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
    #define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
    #define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
    #define OBJ_ENCODING_INTSET 6  /* Encoded as intset */
    #define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
    #define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
    #define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
    #define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
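The relationship between type and encoding can be sketched as a small lookup. The string case comes straight from the list above; the pairs for list, set, sorted set and hash are my assumption about the Redis version discussed in this series, and `encoding_fits_type` is a hypothetical helper, not a Redis function.

```c
#include <assert.h>
#include <stdbool.h>

/* Type and encoding constants copied from the listings in this article. */
#define OBJ_STRING 0
#define OBJ_LIST 1
#define OBJ_SET 2
#define OBJ_ZSET 3
#define OBJ_HASH 4

#define OBJ_ENCODING_RAW 0
#define OBJ_ENCODING_INT 1
#define OBJ_ENCODING_HT 2
#define OBJ_ENCODING_ZIPLIST 5
#define OBJ_ENCODING_INTSET 6
#define OBJ_ENCODING_SKIPLIST 7
#define OBJ_ENCODING_EMBSTR 8
#define OBJ_ENCODING_QUICKLIST 9

/* Hypothetical helper: is (type, encoding) a pair we expect to see?
 * Only the OBJ_STRING case is stated in the article; the other cases
 * are assumptions about the analyzed Redis version. */
bool encoding_fits_type(unsigned type, unsigned encoding) {
    assert(encoding < 16);   /* encoding is a 4-bit field */
    switch (type) {
    case OBJ_STRING:
        return encoding == OBJ_ENCODING_RAW ||
               encoding == OBJ_ENCODING_INT ||
               encoding == OBJ_ENCODING_EMBSTR;
    case OBJ_LIST:
        return encoding == OBJ_ENCODING_QUICKLIST;
    case OBJ_SET:
        return encoding == OBJ_ENCODING_INTSET ||
               encoding == OBJ_ENCODING_HT;
    case OBJ_ZSET:
        return encoding == OBJ_ENCODING_ZIPLIST ||
               encoding == OBJ_ENCODING_SKIPLIST;
    case OBJ_HASH:
        return encoding == OBJ_ENCODING_ZIPLIST ||
               encoding == OBJ_ENCODING_HT;
    default:
        return false;        /* unknown type */
    }
}
```

The point of the sketch is only that the same encoding constant (for example OBJ_ENCODING_HT) can serve more than one type, while each type accepts only a small set of encodings.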

As for the lru field, only a brief introduction here: it holds the data from which the LRU algorithm computes how long the object has been idle (or, as the comment in the struct notes, LFU data). What the LRU algorithm is and how exactly this field is used will be covered in a later article.

Next I'll focus on refcount, the reference count.

refcount: the reference count

If you have read the previous articles, you have seen how much effort Redis puts into optimizing memory usage. Many languages reclaim memory automatically, but C does not, so Redis implements its own reclamation mechanism based on reference counting. Put simply, an object's refcount determines whether the object can be freed and its memory reclaimed.

The reference count information for an object changes as the object is in use:

  • When a new object is created, its reference count is initialized to 1;
  • When the object gains a new user, its reference count is increased by one;
  • When a user no longer needs the object, its reference count is decreased by one;
  • When the reference count of an object drops to 0, the memory occupied by the object is freed.
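The four rules above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the Redis source: the function names with the `_sketch` suffix and the `freed_objects` counter are hypothetical, the object is released with plain `free`, and the payload behind `ptr` is assumed to be owned by the caller.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} robj;

/* Only here so the effect of the last rule can be observed. */
static int freed_objects = 0;

/* Rule 1: a new object starts with a reference count of 1. */
robj *create_object_sketch(void *ptr) {
    robj *o = calloc(1, sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->ptr = ptr;
    return o;
}

/* Rule 2: a new user takes a reference. */
void incr_ref_count_sketch(robj *o) {
    o->refcount++;
}

/* Rules 3 and 4: a user drops its reference; dropping the last
 * reference (the count would reach 0) frees the object. */
void decr_ref_count_sketch(robj *o) {
    assert(o->refcount > 0);   /* dropping a reference nobody holds is a bug */
    if (o->refcount == 1) {
        free(o);               /* ptr is owned by the caller in this sketch */
        freed_objects++;
    } else {
        o->refcount--;
    }
}
```

With this sketch, creating an object, taking one extra reference, and then dropping two references leaves `freed_objects` at 1: the first drop only lowers the count from 2 to 1, and the second one frees the object.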

We can inspect an object's refcount with the debug object command:

    > set test_str "abc"
    > debug object test_str
    Value at:0x7fb47bc09150 refcount:1 encoding:embstr serializedlength:4 lru:3567105 lru_seconds_idle:3
    > set test_str 100
    OK
    > debug object test_str
    Value at:0x7fb47bd06320 refcount:2147483647 encoding:int serializedlength:2 lru:3567067 lru_seconds_idle:754

Did you notice anything strange? When test_str is set to "abc", the refcount is 1, which is expected: we have just created the object, so its refcount was initialized to 1. But after we set test_str to 100, the refcount suddenly becomes 2147483647. The reason for such a large number is the value 100 itself.

By default, when Redis initializes the server, it creates 10,000 string objects covering every integer value from 0 to 9999. Whenever the server needs a string object with a value in that range, it uses one of these shared objects instead of creating a new one.

The advantage of shared objects is that Redis does not have to keep allocating memory for these small integer objects. They can be used not only as plain string values but also wherever such a value appears inside other structures, for example as dict values. The reference count of these small integer objects is set to INT_MAX, which is the 2147483647 we saw above.
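The shared-integer idea can be sketched as follows. This is a simplified model, not the Redis source: `SHARED_INTEGERS`, `SHARED_REFCOUNT` and all `_sketch` functions are names chosen for this example, and two details are assumptions on my part, namely that the integer value is stored directly in `ptr` and that increments and decrements simply skip shared objects.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define SHARED_INTEGERS 10000     /* values 0..9999, as described above */
#define SHARED_REFCOUNT INT_MAX   /* marker refcount for shared objects */

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} robj;

static robj shared_integers[SHARED_INTEGERS];

static void init_int_object(robj *o, long value, int refcount) {
    o->type = 0;                       /* OBJ_STRING */
    o->encoding = 1;                   /* OBJ_ENCODING_INT */
    o->lru = 0;
    o->refcount = refcount;
    o->ptr = (void *)(intptr_t)value;  /* assumption: value lives in ptr */
}

/* Build the pool once, at "server start". */
void create_shared_integers_sketch(void) {
    for (long i = 0; i < SHARED_INTEGERS; i++)
        init_int_object(&shared_integers[i], i, SHARED_REFCOUNT);
}

/* Small values reuse the pool (an O(1) range check, no allocation);
 * anything else gets a fresh object with refcount 1. */
robj *string_object_from_long_sketch(long value) {
    if (value >= 0 && value < SHARED_INTEGERS)
        return &shared_integers[value];
    robj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    init_int_object(o, value, 1);
    return o;
}

void incr_ref_count_sketch(robj *o) {
    if (o->refcount != SHARED_REFCOUNT) o->refcount++;
}

void decr_ref_count_sketch(robj *o) {
    if (o->refcount == SHARED_REFCOUNT) return;  /* shared: never freed */
    assert(o->refcount > 0);
    if (o->refcount == 1) free(o);
    else o->refcount--;
}
```

Asking twice for the value 100 returns the same pooled object both times, with a refcount that stays at INT_MAX no matter how often it is taken or dropped, which matches what debug object reported. A value such as 10000 falls outside the pool and gets an ordinary object with refcount 1.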

Why doesn’t Redis share objects that contain strings?

Suppose we created a shared object for the string "abc". Before reusing it, Redis would have to check that the new string's content is identical to "abc", which costs O(N) in the worst case. For an integer, a single numeric comparison is enough, which is O(1).

Shared objects reduce memory allocation and simplify management, but the trade-off is that only integers are shared: there is no cheap way to determine which strings or other non-numeric objects are worth sharing.