Redis source code analysis of the ordered set SortedSet

SortedSet is a very characteristic data structure in Redis, through this article to summarize the knowledge point.

The original address: www.jianshu.com/p/75ca5a359…

SortedSet command

Ordered sets in Redis, which allow the user to sort elements with specified values, and provide a rich set of apis for operations based on the ordered set. Examples are as follows:

Zadd table1 100 a (127.0.0.1:6379> zadd table1 100 ainteger) 1
127.0.0.1:6379> zadd table1 200 b
(integer) 1
127.0.0.1:6379> zadd table1 300 c
(integer127.0.0.1:6379> zrange table1 0 1 1 1)"a"
2) "b"Score 127.0.0.1:6379> zrangebyScore table1 150 400 1)"b"
2) "c"127.0.0.1:6379> zrem table1 b (integer1)Copy the code

In an ordered set, the value used for sorting is called score, and the value actually stored is called member.

Due to the large number of apis available in ordered concentrations, here are just a few common ones, which you can refer to the Redis documentation for.

One very common usage scenario for ordered sets is user reviews. Post a message on APP or website, and there will be many comments below. Usually, the display is in reverse order according to the release time. For this requirement, we can use the ordered set, with the timestamp of the release of comments as the score, and then find the ordered set in reverse order according to the number of displayed comments.

Two, ordered set SortedSet command source analysis

As usual, we will find the handlers for the relevant commands from the command table in the server.c file and analyze them one by one. ZaddCommand starts by adding elements again:

void zaddCommand(client *c) {
    zaddGenericCommand(c,ZADD_NONE);
}Copy the code

Here you can see that the process moves to zaddGenericCommand and passes in a pattern tag. Here’s a quick explanation of how SortedSet works. Let’s take a look at the complete zadd command:

zadd key [NX|XX] [CH] [INCR] score member [score member ...]Copy the code

Let’s look at the options one by one:

NX means that if the element exists, it is returned without substitution.
XX indicates that only existing elements are operated on.
CH returns the number of modified (add, update) elements and can only be used by the ZADD command.
INCR means adding a new score to the original score, not replacing it.

ZADD_NONE in the above snippet represents a normal operation.

ZaddGenericCommand: zaddGenericCommand: zaddGenericCommand: zaddGenericCommand

Void zaddGenericCommand(client *c, int flags) {// An error message static char * nanErr ="resulting score is not a number (NaN)"; Robj *key = c->argv[1]; robj *zobj; sds ele; double score = 0, *scores = NULL; int j, elements; int scoreidx = 0; Int added = 0; int updated = 0; int processed = 0; Scoreidx = 2; scoreidx = 2;while(scoreidx < c->argc) { char *opt = c->argv[scoreidx]->ptr; // Check whether various modes are set in the commandif(! strcasecmp(opt,"nx")) flags |= ZADD_NX;
        else if(! strcasecmp(opt,"xx")) flags |= ZADD_XX;
        else if(! strcasecmp(opt,"ch")) flags |= ZADD_CH;
        else if(! strcasecmp(opt,"incr")) flags |= ZADD_INCR;
        else break; scoreidx++; } // set mode int incr = (flags & ZADD_INCR)! = 0; int nx = (flags & ZADD_NX) ! = 0; int xx = (flags & ZADD_XX) ! = 0; int ch = (flags & ZADD_CH) ! = 0; Elements = c-> argc-scoreidx; elements = c-> argc-scoreidx; // Add a layer of judgment because score and member are in pairsif(elements % 2 || ! elements) { addReply(c,shared.syntaxerr);return; } // score = 1; // Verify parameter validityif (nx && xx) {
        addReplyError(c,
            "XX and NX options at the same time are not compatible");
        return; } // Verify parameter validityif (incr && elements > 1) {
        addReplyError(c,
            "INCR option supports a single increment-element pair");
        return; Scores = zmalloc(sizeof(double)*elements);for(j = 0; j < elements; J++) {// fill the array with elements in pairs, so each score is separated by a memberif(getDoubleFromObjectOrReply(c,c->argv[scoreidx+j*2],&scores[j],NULL) ! = C_OK) goto cleanup; Zobj = lookupKeyWrite(c->db,key);if(zobj == NULL) {// No ordered set is specified and the mode is XX (only existing elements are operated on)if(xx) goto reply_to_client; // Select different storage structures according to the number of elements to initialize the ordered setif(server. Zset_max_ziplist_entries = = 0 | | server zset_max_ziplist_value < sdslen (c - > argv [scoreidx + 1] - > PTR)) {/ / + hash table Zobj = createZsetObject();
        } else{//ziplist mode zobj = createZsetZiplistObject(); } // add dbAdd(c->db,key,zobj); }else{// Return if the collection type of the ZADD operation is not correctif (zobj->type! = OBJ_ZSET) { addReply(c,shared.wrongtypeerr); goto cleanup; }} // Here we start adding elements to the ordered setfor(j = 0; j < elements; j++) { double newscore; Score = scores[j]; int retflags = flags; Member ele = c->argv[scoreidx+1+j*2]-> PTR; // add elements to an ordered set, arguments are ordered set, score of the element to be added, element to be added, operation mode, newscore int retval = zsetAdd(zobj, score, ele, &retflags, &newscore); // Return if add failsif(retval == 0) { addReplyError(c,nanerr); goto cleanup; } // Record the operationif (retflags & ZADD_ADDED) added++;
        if (retflags & ZADD_UPDATED) updated++;
        if(! (retflags & ZADD_NOP)) processed++; Score = newscore; } server.dirty += (added+updated); // Return logic reply_to_client:if (incr) {
        if (processed)
            addReplyDouble(c,score);
        else
            addReply(c,shared.nullbulk);
    } else{ addReplyLongLong(c,ch ? added+updated : added); } // cleanup: zfree(scores);if (added || updated) {
        signalModifiedKey(c->db,key);
        notifyKeyspaceEvent(NOTIFY_ZSET,
            incr ? "zincr" : "zadd", key, c->db->id); }}Copy the code

The code is a bit long, so here’s a picture of the storage structure:

Note: Each entry is composed of score+member

With the structure diagram above, it can be thought that the deletion operation should be carried out according to the different storage structure, if ziplist performs the linked list deletion, if the hash table + hop table structure, then both collections should be deleted. What is the real logic? Let’s take a look at the delete function zremCommand source, relatively short:

Void zremCommand(client *c) {// Get the ordered set name robj *key = c->argv[1]; robj *zobj; int deleted = 0, keyremoved = 0, j; / / checkif ((zobj = lookupKeyWriteOrReply(c,key,shared.czero)) == NULL ||
        checkType(c,zobj,OBJ_ZSET)) return;

    for(j = 2; j < c->argc; J++) {// delete the specified element onceif(zsetDel(zobj,c->argv[j]->ptr)) deleted++; // If all elements in the ordered set are deleted, the ordered table is recycledif (zsetLength(zobj) == 0) {
            dbDelete(c->db,key);
            keyremoved = 1;
            break; }} // Synchronize operationsif (deleted) {
        notifyKeyspaceEvent(NOTIFY_ZSET,"zrem",key,c->db->id);
        if (keyremoved)
            notifyKeyspaceEvent(NOTIFY_GENERIC,"del",key,c->db->id); signalModifiedKey(c->db,key); server.dirty += deleted; } // Return addReplyLongLong(c,deleted); }Copy the code

Take a look at the specific delete operation source:

Int zsetDel(robj *zobj, SDS ele) {// Same as adding elements, different deletion logic is performed according to different storage structuresif(zobj->encoding == OBJ_ENCODING_ZIPLIST) { unsigned char *eptr; // Ziplist is a simple list deletion operationif((eptr = zzlFind(zobj->ptr,ele,NULL)) ! = NULL) { zobj->ptr = zzlDelete(zobj->ptr,eptr);return1; }}else if (zobj->encoding == OBJ_ENCODING_SKIPLIST) {
        zset *zs = zobj->ptr;
        dictEntry *de;
        double score;

        de = dictUnlink(zs->dict,ele);
        if(de ! // dictGetVal(de); // dictGetVal(de); DictFreeUnlinkedEntry (zs->dict,de); Int retval = zslDelete(zs-> ZSL,score,ele,NULL); serverAssert(retval); // Resize the hash table if necessaryif (htNeedsResize(zs->dict)) dictResize(zs->dict);
            return1; }}else {
        serverPanic("Unknown sorted set encoding"); } // Returns 0 if the specified element is not foundreturn 0;
}Copy the code

ZrangeCommand = zrangeCommand = zrangeCommand = zrangeCommand;

Void zrangeCommand(client *c) {// zrangeGenericCommand(c,0); // zrangeGenericCommand(c,0); } void zrangeGenericCommand(client *c, int reverse) {robj *key = c->argv[1]; robj *zobj; int withscores = 0; long start; long end; int llen; int rangelen; // Check parametersif((getLongFromObjectOrReply(c, c->argv[2], &start, NULL) ! = C_OK) || (getLongFromObjectOrReply(c, c->argv[3], &end, NULL) ! = C_OK))return; // Determine whether to return score based on additional parameter informationif(c->argc == 5 && ! strcasecmp(c->argv[4]->ptr,"withscores")) {
        withscores = 1;
    } else if (c->argc >= 5) {
        addReply(c,shared.syntaxerr);
        return; } // Ordered set checkif ((zobj = lookupKeyReadOrReply(c,key,shared.emptymultibulk)) == NULL
         || checkType(c,zobj,OBJ_ZSET)) return; Llen = zsetLength(zobj);if (start < 0) start = llen+start;
    if (end < 0) end = llen+end;
    if(start < 0) start = 0; // Return an empty setif (start > end || start >= llen) {
        addReply(c,shared.emptymultibulk);
        return;
    }
    if(end >= llen) end = llen-1; rangelen = (end-start)+1; AddReplyMultiBulkLen (c, withscores? (rangelen*2) : rangelen); // The same query logic is executed according to the structure of the ordered setif(zobj->encoding == OBJ_ENCODING_ZIPLIST) { unsigned char *zl = zobj->ptr; unsigned char *eptr, *sptr; unsigned char *vstr; unsigned int vlen; long long vlong; // Calculate the starting index in forward or backward orderif (reverse)
            eptr = ziplistIndex(zl,-2-(2*start));
        elseeptr = ziplistIndex(zl,2*start); serverAssertWithInfo(c,zobj,eptr ! = NULL); sptr = ziplistNext(zl,eptr);while(rangelen--) { serverAssertWithInfo(c,zobj,eptr ! = NULL && sptr ! = NULL); ServerAssertWithInfo (c,zobj,ziplistGet(eptr,&vstr,&vlen,&vlong)); / / return a valueif (vstr == NULL)
                addReplyBulkLongLong(c,vlong);
            elseaddReplyBulkCBuffer(c,vstr,vlen); // Return score if requiredif(withscores) addReplyDouble(c,zzlGetScore(sptr)); // Reverse order from back to front, positive order from front to backif (reverse)
                zzlPrev(zl,&eptr,&sptr);
            elsezzlNext(zl,&eptr,&sptr); }}else if(zobj->encoding == OBJ_ENCODING_SKIPLIST) { zset *zs = zobj->ptr; zskiplist *zsl = zs->zsl; zskiplistNode *ln; sds ele; // Find the start nodeif (reverse) {
            ln = zsl->tail;
            if (start > 0)
                ln = zslGetElementByRank(zsl,llen-start);
        } else {
            ln = zsl->header->level[0].forward;
            if(start > 0) ln = zslGetElementByRank(zsl,start+1); } // iterate over and return to the clientwhile(rangelen--) { serverAssertWithInfo(c,zobj,ln ! = NULL); ele = ln->ele; addReplyBulkCBuffer(c,ele,sdslen(ele));if(withscores) addReplyDouble(c,ln->score); ln = reverse ? ln->backward : ln->level[0].forward; }}else {
        serverPanic("Unknown sorted set encoding"); }}Copy the code

Above is about the ordered set SortedSet add, delete, find the source code. It can be seen that SortedSet will choose Ziplist or hash table + skip table two data structures to implement according to the number of stored elements. The source code looks very long, the main reason is to implement different code according to different data structures. As long as you master this core idea, it won’t be too difficult to look at the source code.

Ordered set SortedSet command summary

The logic of ordered sets is not difficult, but the code is a bit long, involving ziplist, Skiplist and dict data structures. In addition to the regular dict, the other two data structures have a lot of content, which is prepared to write a special article to summarize, so it will not be repeated here. The main purpose of this paper is to summarize the implementation principle of ordered set SortedSet.

Redis source code analysis of the ordered set SortedSet

SortedSet command

Two, ordered set SortedSet command source analysis

Ordered set SortedSet command summary

Related Posts

Cache and database consistency in-depth analysis

Design and implementation of Zhihu Kafka platform based on Kubernetes

Jenkins automates dotnet Core 3.1 to Docker