Following on from why Redis is used, today we’ll talk about specific Redis data types and commands. This article is an important foundation for a deeper understanding of Redis.

This series is based on: Redis-3.2.12

I won’t cover all the commands, but most of them are common at work.

Most of the stuff we look at, it’s pretty straightforward to tell us what this command does, and that command takes several arguments. This way will only know the why, this paper from the time complexity of the command to the purpose, and then to the corresponding type of Redis in the low level of the structure to save data, I hope to let everyone understand more deeply, when using the heart more secure.

  1. Note here in the reading: Although many commands have O(n) time complexity, be aware of what n means.

  2. OBJECT ENCODING XXX is used to check the internal ENCODING of Redis, which is actually the value represented by ENCODING in the redisObject structure. RedisObject provides a unified representation for different types of data.

Type String

This is the most widely used data type in Redis. Some of the commands in this type are widely used. Such as:

  • Caching, which is used a lot;
  • Counter/speed limiter technology;
  • The shared Session server is also based on this data type

Note: The table only shows the 12 commands in String, and the usage scenarios are only partial.

We are often told not to use MSET/MGET commands because their time complexity is O(n), but notice that n is the number of keys set or read, so if you don’t batch read a lot of keys and each key is not very large, Then using batch operation command can save the time of network request and transmission.

The internal structure

How does String data end up stored in Redis? If we want to go into details, we have to start with the STRUCTURE of SDS, but today we will not discuss the details of the source code part, only the internal data structure. Eventually the string we set will be stored in one of three forms.

  • Int, a long integer of 8 bytes. The maximum value is 0x7fffffffffffffl
  • Embstr, a string of less than or equal to 44 bytes
  • Raw

Take a look at the code to see how Redis makes decisions about these three data structures. When we run the command SET test hello,redis on the client, the client saves the command to a BUF and executes it in the order it received it. One of these functions is processMultibulkBuffer(), which internally calls createStringObject() :

#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
	  // Check the length of the saved string and select the corresponding type
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}
Copy the code

If the length of the string hello,redis exceeds 44, if it exceeds raw, use embstr if it does not. Experiment and see:

127.0.0.1: > SET test 12345678901234567890123456789012345678901234/6379 / len = 44 OK 127.0.0.1:6379 > OBJECT encoding test "Embstr" 127.0.0.1: > SET test 123456789012345678901234567890123456789012345/6379 / len = 45 OK 127.0.0.1:6379 > OBJECT encoding test "raw"Copy the code

As you can see, once you pass 44, the underlying type changes to: RAW. Wait, didn’t we also mention that there is an int? You can’t see it in the function at all, right? No hurry, when the command we typed actually starts executing, which is when setCommand() is called, a tryObjectEncoding() function is triggered, which attempts to compress the input string. Look at the code:

robj *tryObjectEncoding(robj *o) {... . len = sdslen(s);// The length is less than or equal to 20, and can be converted to length shaping
    if(len <= 20&& string2l(s,len,&value)) { o->encoding = OBJ_ENCODING_INT; }... . }Copy the code

This function I’ve shrunk down a lot, but simply we can see that it determines whether the length is less than or equal to 20, and tries to convert it to an integer. See an example.

9223372036854775807 is the largest integer represented by 8-bit bytes, and its hexadecimal form is 0x7fffffffffffffffL

127.0.0.1:6379> SET test 9223372036854775807 OK 127.0.0.1:6379> OBJECT Encoding test "int" 127.0.0.1:6379> SET test 9223372036854775808 // bigger 1 OK 127.0.0.1:6379> OBJECT encoding test "embstr"Copy the code

At this point, the process of selecting the type of String is complete. The reference value for us is that when we use String type to save data, we should take into account the different underlying types. Different types will execute different processes inside Redis, and the corresponding execution efficiency and memory consumption are different.

Hash type

We often use it to hold structured data, such as cached information related to a user. If you use a plain String type, serializing and deserializing the String adds overhead and only reads the entire String at a time.

  • Cache structured data, such as article information, and flexibly modify one of its fields, such as reading amount.

The Hash type holds structured data, much like a record in MySQL, where you can easily modify a field, but with more flexibility, each record can contain different fields.

The internal structure

There are two types of data structures that can exist internally for Hash data:

  • ZipList can save more space. Limit: The length of key and field cannot exceed 64, and the number of fields in a key cannot exceed 512
  • HashTable

For Hash, Redis first sets it to use ZipList data structures by default, and then decides if it needs to be changed based on the conditions.


void hsetCommand(client *c) {
    int update;
    robj *o;

    if ((o = hashTypeLookupWriteOrCreate(c,c->argv[1)) = =NULL) return;
    hashTypeTryConversion(o,c->argv,2.3);// Based on the length. . update = hashTypeSet(o,c->argv[2],c->argv[3]);// Based on the number of elementsaddReply(c, update ? shared.czero : shared.cone); . . }Copy the code

HashTypeLookupWriteOrCreate () invokes the internal createHashObject () to create the Hash object.

robj *createHashObject(void) {
    unsigned char *zl = ziplistNew();
    robj *o = createObject(OBJ_HASH, zl);
    o->encoding = OBJ_ENCODING_ZIPLIST;// Set the encoding ziplist
    return o;
}
Copy the code

The hashTypeTryConversion() function internally determines the low-level data structure based on whether the hash_max_ziplist_value limit (64) is exceeded.

void hashTypeTryConversion(robj *o, robj **argv, int start, int end) {
    int i;

    if(o->encoding ! = OBJ_ENCODING_ZIPLIST)return;

    for (i = start; i <= end; i++) {
		  // Check whether the field and value lengths are too long
        if (sdsEncodedObject(argv[i]) &&
            sdslen(argv[i]->ptr) > server.hash_max_ziplist_value)
        {
            hashTypeConvert(o, OBJ_ENCODING_HT);
            break; }}}Copy the code

Then check the hashTypeSet() function to see if the number of fields has exceeded the limit of 512 for hash_max_ziplist_entries.

int hashTypeSet(robj *o, robj *field, robj *value) {
    int update = 0;

    if(o->encoding == OBJ_ENCODING_ZIPLIST) { ... .// Check whether the number of fields exceeds 512
        if (hashTypeLength(o) > server.hash_max_ziplist_entries)
            hashTypeConvert(o, OBJ_ENCODING_HT);
    } else if(o->encoding == OBJ_ENCODING_HT) { ... . }... .return update;
}
Copy the code

To verify the above logic:

127.0.0.1:6379 > HSET test name qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjsl (integer) 1 127.0.0.1:6379> HSTRLEN test name (integer) 64 127.0.0.1:6379> OBJECT Encoding test "ziplist" 127.0.0.1:6379> HSET test The name qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjslq (integer) 0 127.0.0.1:6379 > HSTRLEN test name (INTEGER) 65 127.0.0.1:6379> OBJECT Encoding test "hashtable"Copy the code

The key setting exceeds 64, and the number of fields exceeds 512, you can test by yourself.

List the type

The List type is also used in a wide range of applications, mainly to summarize common scenarios:

  • Message queue: LPUSH + BRPOP (blocking feature)
  • Cache: the user records various records, the biggest feature is to support paging
  • Stack: LPUSH + LPOP
  • Queue: LPUSH + RPOP
  • Finite queue: LPUSH + LTRIM can maintain the number of data in the queue

The internal structure

The List data types are implemented at a low level as follows:

  • QuickList: This is a LinkedList with ZipList nodes
  • ZipList (provincial storage), found in version 3.2.12 for local use
  • LinkedList, found locally in version 3.2.12

There are some articles on the web that say that LinkedList is not used in Redis 4.0, and I actually found it not used in Redis 3.2.12 either (not directly as a data store structure), None, including ZipList, was used to store data directly in 3.2.12.

To verify this, let’s set a List of 1000 elements, each of which has a value longer than 64 characters.

127.0.0.1:6379> LLEN test (integer) 1000 127.0.0.1:6379> OBJECT Encoding test "quickList" 127.0.0.1:6379> LINDEX test 0 "Qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjslq" / / 65 charactersCopy the code

The structure is always QuickList, whether we change the number of elements in the list or the length of the element value. If you don’t believe me, let’s look at the code:

void pushGenericCommand(client *c, int where) {
    int j, waiting = 0, pushed = 0;
    robj *lobj = lookupKeyWrite(c->db,c->argv[1]); . .for (j = 2; j < c->argc; j++) {
        c->argv[j] = tryObjectEncoding(c->argv[j]);
        if(! lobj) {// Create a quick list
            lobj = createQuicklistObject();
            quicklistSetOptions(lobj->ptr, server.list_max_ziplist_size,
                                server.list_compress_depth);
            dbAdd(c->db,c->argv[1],lobj); } listTypePush(lobj,c->argv[j],where); pushed++; }... . }Copy the code

The initial call to createQuicklistObject() sets its low-level data structure to quick List. There is no place to transform this structure later in the process.

The Set type

One of the important features of the Set type is its ability to de-duplicate and unorder. The nature of its collection can be widely used socially.

  • Mutual concern
  • Common preferences
  • Data to heavy

The internal structure

The Set low-level implementation uses two data structures:

  • IntSet, used when the collection members are all integers (cannot exceed the maximum integer) and the number of collection members is less than 512.
  • HashTable

The code for this command is as follows, and the two important calls to determine the type are setTypeCreate() and setTypeAdd().

void saddCommand(client *c) {
    robj *set; . .if (set= =NULL) {
		  / / initialization
        set = setTypeCreate(c->argv[2]);
    } else{... . }for (j = 2; j < c->argc; j++) {
		  // The number of elements is checked to see if the low-level structure needs to be changed
        if (setTypeAdd(set,c->argv[j])) added++; }... . }Copy the code

Let’s look at the initial creation code for a Set object:

robj *setTypeCreate(robj *value) {
    if (isObjectRepresentableAsLongLong(value,NULL) == C_OK)
        return createIntsetObject(); / / use IntSet
    return createSetObject(); / / use the HashTable
}
Copy the code

IsObjectRepresentableAsLongLong () internal determine its integer range, if there is no more than the largest integer is an integer and will use IntSet to save. Otherwise, use HashTable. The number of elements is then checked.

int setTypeAdd(robj *subject, robj *value) {
    long long llval;
    if(subject->encoding == OBJ_ENCODING_HT) { ... . }else if (subject->encoding == OBJ_ENCODING_INTSET) {
        if (isObjectRepresentableAsLongLong(value,&llval) == C_OK) {
            uint8_t success = 0;
            subject->ptr = intsetAdd(subject->ptr,llval,&success);
            if (success) {
                /* Convert to regular set when the intset contains * too many entries. */
                if (intsetLen(subject->ptr) > server.set_max_intset_entries)
                    setTypeConvert(subject,OBJ_ENCODING_HT);
                return 1; }}else {
            /* Failed to get integer from object, convert to regular set. */setTypeConvert(subject,OBJ_ENCODING_HT); . .return 1; }}... .return 0;
}
Copy the code

Take an example, using the maximum integer threshold as an example:

127.0.0.1:6379> SADD test 9223372036854775807
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"intset"
127.0.0.1:6379> SADD test 9223372036854775808
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"hashtable"
Copy the code

For the test of the number of sets, please do your own observation.

SortSet type

Nowadays, there are some leaderboards and other functions, such as investment websites showing the ranking of investment amount, shopping websites showing the ranking of consumption and so on. SortSet is perfect for this. Often used to solve the following problems:

  • Leaderboards
  • Background scripts perform operations based on the sequence of task weights
  • Range lookup, finding out what range of a value is in the collection

The internal structure

Although an ordered Set is also a Set, the low-level data structure of a Set is different from that of a Set. It also has two data structures:

  • ZipList, used when the ordered set has 128 or less elements, or when the member length is 64 or less
  • SkipList

The transformation process is as follows:

void zaddGenericCommand(client *c, int flags) {
if (zobj == NULL) {
        if (xx) goto reply_to_client; /* No key + XX option: nothing to do. */
        if (server.zset_max_ziplist_entries == 0 ||
            server.zset_max_ziplist_value < sdslen(c->argv[scoreidx+1]->ptr))
        {
            zobj = createZsetObject();// skip list
        } else {
            zobj = createZsetZiplistObject();// zip list
        }
        dbAdd(c->db,key,zobj);
    } else{... . }... .if (zobj->encoding == OBJ_ENCODING_ZIPLIST) {
        if (zzlLength(zobj->ptr) > server.zset_max_ziplist_entries)
            zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);// Convert the code according to the number
        if (sdslen(ele->ptr) > server.zset_max_ziplist_value)
            zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);// Convert encoding according to length}}Copy the code

For example, if the length of member exceeds 64:

127.0.0.1:6379 > ZADD test 77 qwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwer / / member length is 64 (integer) 1 127.0.0.1:6379> OBJECT Encoding test "ziplist" 127.0.0.1:6379> ZADD test 77 Qwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwerq / / member length is 65 (integer) 1 127.0.0.1:6379 > OBJECT encoding test "skiplist"Copy the code

When we member over 64 bits, the low-level data structure changes from ZipList to SkipList. To test the number of remaining elements, try it.

Global Common Commands

For global commands, no matter what type of data the corresponding key is, it can be operated. Note that the command KEYS cannot be used online because of the Redis single-thread mechanism. If there is too much data in the memory, the operation will be severely blocked, resulting in the entire Redis service cannot respond.

conclusion

  • Redis commands have different time complexity for each type, some of which are related to the number of corresponding elements. Some are related to the number of requests;
  • Reasonable arrangement of element correlation number and length, strive for Redis bottom using the simplest data structure;
  • Pay attention to the time complexity, know their own Redis internal elements, avoid blocking;
  • The simpler the data, the better the performance;
  • Redis each data type corresponds to a variety of data structures at the lower level, and modification and extension are not aware of the upper level.

The first article talked about why to use Redis, this article also talked about most of the commands, and Redis source code for some of their implementation, then began to focus on the concrete practice of some operations. Hope to be helpful to everyone, look forward to any form of criticism and encouragement.

Public id: dayuTalk