Realization of data object

Earlier, we went through all the major data structures used by Redis.

Instead of using these data structures directly to implement a key-value database, Redis creates a system of objects based on these data structures that contains five types of objects: string objects, list objects, hash objects, collection objects, and ordered collection objects.

Object type and encoding

Source file: object.c

Each time we create a new key-value pair in Redis’s database, we create at least two objects, one for the key of the key-value pair (the key object) and one for the value of the key-value pair (the value object).

Each object in Redis is represented by a redisObject structure:

Typedef struct redisObject {// Unsigned type:4; // Unsigned Encoding :4; // The last time the object was accessed unsigned lRU :REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */ / // Pointer to the actual storage unit void * PTR; } robj;Copy the code

This definition was introduced in the previous section, but I didn’t say it in detail, so let’s take a closer look.

  • Type This property records the type of the object. The optional constants are as follows:
Type constants Object name
REDIS_STRING String object
REDIS_LIST A list of objects
REDIS_HASH The hash object
REDIS_SET A collection of objects
REDIS_ZSET Ordered set object

Note: For Redis key-value pairs, keys are always string objects, and values are one of the five types mentioned above.

  • The PTR pointer to an encoding object points to the object’s underlying implementation data structures, which are determined by the object’s Encoding property. The encoding property records the encoding used by the object, that is, what data structure the object uses as the underlying implementation of the object. The alternative encoding is as follows.
Code constants The underlying data structure corresponding to the encoding
REDIS_ENCODING_INT An integer of type long
REDIS_ENCODING_EMBSTR A simple dynamic string encoded by embstr
REDIS_ENCODING_RAW Simple dynamic string
REDIS_ENCODING_HT The dictionary
REDIS_ENCODING_LINKEDLIST Double side chain table
REDIS_ENCODING_ZIPLIST List of compression
REDIS_ENCODING_INTSET The integer set
REDIS_ENCODING_SKIPLIST Skip lists and dictionaries

Each object can use at least two different encodings, as shown below:

Setting the encoding used by objects through the Encoding attribute, rather than associating a fixed encoding with a particular type of object, greatly increases Redis’ flexibility and efficiency because Redis can set different encoding for an object according to different usage scenarios, thus optimizing the efficiency of an object in a particular scenario.

For example, Redis uses a compressed list as the underlying implementation of a list object when the list object contains fewer elements:

1) Because compressed lists save more memory than double-entailed lists, and compress lists are more suitable when the number of elements is small;

2) When the number of elements increases or a single element becomes too large, the object moves the underlying implementation from a compressed list to a more powerful, double-ended list that is better suited for storing large numbers of elements;

  • Idle duration (LRU) : This property records the time when the object was last accessed. This value is used when Redis’s memory expiration value is obsolete. If the algorithm used by the server to reclaim memory is volatile- LRU or allkeys-lRU, then when memory reaches the upper limit, the part of key objects with higher idle time will be freed by the server first, thus recovering memory.
  • Reference counting Redis has built a reference counting technology in its object system to implement a memory reclamation mechanism, through this mechanism, the program can track the reference counting information of the object, automatically release the object and reclaim the memory at the appropriate time.

The reference count information for an object changes as the object is in use:

When a new object is created, the reference count value is initialized to 1;

When an object is used by a new program, its reference-count value is increased by one;

When an object is no longer in use by a program, its referential count value is reduced by one.

When the reference count value of an object becomes 0, the memory occupied by the object is released.

There is a special API for modifying reference object counters as follows:

The whole life cycle of an object can be divided into three stages: creating an object, manipulating an object and releasing an object. The last API called before releasing is decRefcount:

void decrRefCount(robj *o) { if (o->refcount <= 0) redisPanic("decrRefCount against refcount <= 0"); If (o-> refCount == 1) {switch(o->type) {case REDIS_STRING: freeStringObject(o); break; case REDIS_LIST: freeListObject(o); break; case REDIS_SET: freeSetObject(o); break; case REDIS_ZSET: freeZsetObject(o); break; case REDIS_HASH: freeHashObject(o); break; default: redisPanic("Unknown object type"); break; } zfree(o); } else {o->refcount--; }}Copy the code
  • PTR, which points to the actual stored data unit.

The reference count attribute of the object has the effect of ** object sharing (for integer values only) **.

For example, suppose key A creates A string object containing the integer value 100 as A value object:

If key B also creates a string object that also holds the integer value 100 as a value object, the server can do two things:

· Create a new string object containing the integer value 100 for key B;

· Let key A and key B share the same string object;

The above two methods are clearly the second method is more memory efficient.

In Redis, having multiple keys share the same value object requires two steps:

· Point the value pointer of the database key to an existing value object;

· Increment the reference count of the shared value object by one.

If key A and key B point to the same object, the current situation is as follows:

The shared object mechanism is very helpful in saving memory. The more objects of the same value are stored in the database, the more memory can be saved by the object sharing mechanism.

When Redis initializes the server, it creates 10,000 string objects with all integer values from 0 to 9999. When the server needs string objects with values from 0 to 9999, the server uses these shared objects instead of newly created objects.

Why do shared objects only share integer values, but not objects that contain strings?

Because there needs to be a validation time! The share is only shared if the target is the same as the key you want to create, so integers only need O(1), strings O(n), and lists or hashes O(n^2), which wastes CPU resources.

String object

Source file: t_string.c.

String objects are fundamental to all key-value pairs in the Redis database because keys are string objects. Here are the apis for Redis string SET, SETEX, and so on:

void setGenericCommand(redisClient *c, int flags, robj *key, robj *val, robj *expire, int unit, robj *ok_reply, robj *abort_reply) { long long milliseconds = 0; /* Initialized to avoid any harmness warning */ / fetch expire time if (expire) {// fetch the expire parameter value // T = O(N) if (getLongLongFromObjectOrReply(c, expire, &milliseconds, NULL) ! = REDIS_OK) return; If (milliseconds <= 0) {addReplyError(c,"invalid expire time in SETEX"); return; If (unit == UNIT_SECONDS) milliseconds = 1000; if (unit == UNIT_SECONDS) milliseconds = 1000; } // If NX or XX is set, check whether the condition does not meet these two Settings. If ((flags & REDIS_SET_NX && lookupKeyWrite(c->db,key)! = NULL) || (flags & REDIS_SET_XX && lookupKeyWrite(c->db,key) == NULL)) { addReply(c, abort_reply ? abort_reply : shared.nullbulk); return; SetKey (c->db,key,val); // Set the database to dirty server.dirty++; If (expire) setExpire(c->db,key,mstime()+milliseconds); // notifyKeyspaceEvent(REDIS_NOTIFY_STRING,"set",key,c->db->id); If (expire) notifyKeyspaceEvent(REDIS_NOTIFY_GENERIC, "expire",key,c->db->id); AddReply (c, ok_reply? ok_reply : shared.ok); }Copy the code

The encoding of a string object can be int, RAW, or embstr.

  • Int encoding. If the integer value of a string object can be represented by long, that integer value is stored in the PTR pointer (encoding is different from the actual parameter type).
  • Raw encoding, if the string object is a string and is larger than 39 bytes, it is saved using SDS string;
  • Embstr encoding. If the saved string is less than or equal to 39 bytes, it is saved in embstr format. Embstr is an optimized encoding method for storing short strings. Unlike RAW, emBSTR requires only one call to allocate a contiguic memory space containing redisObject and SDSHDR. Raw uses two calls to allocate redisObject and SDSHDR. Example of embstr encoding is shown below:

What’s the good of that?

  1. Embstr encoding reduces the number of memory allocations required to create string objects from two raw encoding times to one.
  2. Freeing embSTR-encoded string objects requires only one call to the memory freeing function, whereas freeing raw string objects requires two calls to the memory freeing function.
  3. Because all the data of embSTR-encoded string objects is stored in a contiguous piece of memory, embSTR-encoded string objects can take advantage of caching better than raw string objects.

If the command set MSG “hello” is used, the value object is as follows:

Note that floating-point numbers of type long double are also stored as strings in Redis, such as 6.666, which is converted to strings. When the float is needed, it is converted from a string to a float and then evaluated.

Code conversion

Int /embstr will be converted to a raw-encoded object under certain circumstances and cannot be reversed.

  • An int-encoded object will be converted from int to raw if the command is executed and the object becomes a string value.
  • Redis does not write any corresponding modification programs for embstr-encoded string objects (only int-encoded string objects and raw encoded string objects have these programs), and embstr-encoded string objects are actually read-only: When we execute any modification command on an embstr encoded string object, the program first converts the encoding of the object from EMbstr to RAW before executing the modification command. For this reason, an embstr-encoded string object always becomes a raw-encoded string object after a modification command is executed.

Note: Redis commonly used string object command

The command int The implementation method of coding embstr The implementation method of coding raw The implementation method of coding
SET Use int encoding to hold values. Embstr encoding is used to save values. Use raw encoding to save values.
GET Copies the integer value held by the object, converts the copy to a string value, and returns the string value to the client. Returns a string value directly to the client. Returns a string value directly to the client.
APPEND Convert the object to raw encoding, and then perform this operation as raw encoding. Convert the object to raw encoding, and then perform this operation as raw encoding. Call the sdSCATlen function to append the given string to the end of the existing string.
INCRBYFLOAT Take the integer value and convert it to a floating-point number of type LongDouble, add the floating-point number, and then save the resulting floating-point number. Take the string value and try to convert it to a float of type long Double, add the float, and then save the resulting float. If the string value cannot be converted to float, an error is returned to the client. Take the string value and try to convert it to a floating-point number of type LongDouble, add the floating-point number, and then save the resulting floating-point number. If the string value cannot be converted to float, an error is returned to the client.
INCRBY Add integer values, and the result is stored as an integer. The embstr encoding cannot execute this command and returns an error to the client. Raw encoding cannot execute this command and returns an error to the client.
DECRBY Subtract integer values, and the result is stored as an integer. The embstr encoding cannot execute this command and returns an error to the client. Raw encoding cannot execute this command and returns an error to the client.
STRLEN Copies the integer value held by the object, converts the copy to a string value, calculates and returns the length of the string value. Call the sdslen function to return the length of the string. Call the sdslen function to return the length of the string.
SETRANGE Convert the object to raw encoding, and execute this command as raw encoding. Convert the object to raw encoding, and execute this command as raw encoding. Sets the value on a string-specific index to the given character.
GETRANGE Copies the integer value held by the object, converts the copy to a string value, and retrieves and returns the character on the specified index of the string. Fetches and returns the character on the specified index of the string. Fetches and returns the character on the specified index of the string.

A list of objects

Source file: t_list.c.

The underlying code is ziplist or LinkedList.

  • Ziplist code, the bottom is compressed list, each entry saves a list item, there are two conditions: list objects save all string elements of length less than 64 bytes; The list object holds less than 512 elements.

Here is an example of ziplist encoding:

If either condition cannot be met, the encoding conversion of the ** object is performed, converting ** from ziplist to LinkedList. The conversion process is shown in this function:

void listTypeConvert(robj *subject, int enc) { listTypeIterator *li; listTypeEntry entry; redisAssertWithInfo(NULL,subject,subject->type == REDIS_LIST); If (enc == REDIS_ENCODING_LINKEDLIST) {list *l = listCreate(); listSetFreeMethod(l,decrRefCountVoid); Li = listTypeInitIterator(subject,0,REDIS_TAIL); while (listTypeNext(li,&entry)) listAddNodeTail(l,listTypeGet(&entry)); listTypeReleaseIterator(li); Subject ->encoding = REDIS_ENCODING_LINKEDLIST; // Release the original ziplist zfree(subject-> PTR); // Update object value pointer subject-> PTR = l; } else { redisPanic("Unsupported list conversion"); }}Copy the code
  • Linkedlist encoding, the bottom layer is a double-endian linkedlist to hold the SDS string object;

Here is an example of a double-entailed list encoding:

Note: A string object is the only one of Redis’s five types of objects that is nested by the other four types of objects.

Note: Common commands are as follows:

The command ziplist The implementation method of coding linkedlist The implementation method of coding
LPUSH Call the ziplistPush function to push the new element to the head of the compressed list. Call the listAddNodeHead function to push the new element to the head of the double-ended list.
RPUSH Call the ziplistPush function to push the new element to the end of the compressed list. The listAddNodeTail function is called to push the new element to the end of the double-ended list.
LPOP The ziplistIndex function is called to locate the table head node of the compressed list, and the ziplistDelete function is called to remove the table head node after the user is returned with the element saved by the node. The listFirst function is called to locate the head node of the double-ended list, and the listDelNode function is called to remove the head node after returning the elements held by the node to the user.
RPOP The ziplistIndex function is called to locate the end-of-table node of the compressed list, and the ziplistDelete function is called to remove the end-of-table node after returning the element saved by the node to the user. The listLast function is called to locate the tail node of the double-ended list, and after returning the elements held by the node to the user, the listDelNode function is called to remove the tail node.
LINDEX Call the ziplistIndex function to locate the specified node in the compressed list, and then return the element that the node holds. Call the listIndex function to locate the specified node in the double-ended list, and then return the element held by the node.
LLEN Call ziplistLen to return the length of the compressed list. Call the listLength function to return the length of the double-ended list.
LINSERT Use the ziplistPush function to insert a new node to the top or bottom of a compressed list. Use the ziplistInsert function when inserting a new node elsewhere in the compressed list. Call the listInsertNode function to insert the new node into the specified position on the double-ended list.
LREM Iterate through the compressed list nodes and call the ziplistDelete function to remove the node containing the given element. Iterate over the double-ended list nodes and call the listDelNode function to delete the node containing the given element.
LTRIM Call the ziplistDeleteRange function to remove all nodes in the compressed list that are not in the specified index range. Traverses the double-ended list nodes and calls the listDelNode function to remove all nodes in the list that are not in the specified index range.
LSET Call ziplistDelete to remove the existing node on the index specified by the compressed list, and then call ziplistInsert to insert a new node containing the given element onto the same index. Call the listIndex function, locate the node on the specified index of the double-ended list, and update the value of the node through an assignment operation.

The hash object

Source file t_hash.c.

The encoding is ziplist or Hashtable.

  • Ziplist encoding, the bottom is the compressed list, there are two conditions, hash object save all key value pairs of the key and value of the string length is less than 64 bytes; The number of key-value pairs stored in a hash object is less than 512.

    Whenever a new key-value pair is added to a hash object, the two nodes that hold the same key-value pair are always next to each other. The node that holds the key comes first, and the node that holds the value comes second. And these key-value pairs are first come, first served.

Note: When using Ziplist encoding, the time complexity of obtaining the corresponding value of the key is not O(1), but O(N^2)(first traverses the key, then traverses whether the value is equal)! However, since the length and key length are small when using Ziplist, the performance impact is not great.

  • Hashtable encoding, which uses a dictionary as the underlying implementation when the above two conditions are not met. Each key-value pair in a hash object is held using a dictionary key-value pair: each key and value of the dictionary is a string object;

Here is an example of a dictionary-based hash object:

Note: If ziplist does not meet one of the two criteria, all key-value pairs stored in the ziplist list are transferred and saved to the dictionary.

Note: Common commands

The command ziplist Coding implementation method hashtable The implementation method of coding
HSET The ziplistPush function is called to push the key to the end of the compressed list, and the ziplistPush function is called again to push the value to the end of the compressed list. Call dictAdd to add the new node to the dictionary.
HGET The ziplistFind function is called to find the node corresponding to the specified key in the compressed list, then the ziplistNext function is called to move the pointer to the value node next to the key node, and finally to return the value node. Call dictFind to look up the given key in the dictionary, and then call dictGetVal to return the value corresponding to the key.
HEXISTS Call ziplistFind to find the node corresponding to the specified key in the compressed list. If found, the key-value pair exists; if not, the key-value pair does not exist. Call dictFind function to look up the given key in the dictionary. If found, it indicates that the key value pair exists; if not, it indicates that the key value pair does not exist.
HDEL Call ziplistFind to find the node corresponding to the specified key in the compressed list, and then delete the corresponding key node and the value node next to the key node. Call dictDelete to delete the key-value pair corresponding to the specified key from the dictionary.
HLEN Call the ziplistLen function to get the total number of nodes in the compressed list. Divide this number by 2 to get the number of key-value pairs stored in the compressed list. Call the dictSize function to return the number of key-value pairs contained in the dictionary, which is the number of key-value pairs contained in the hash object.
HGETALL Iterate through the compressed list, using the ziplistGet function to return all keys and values (which are nodes). Iterating through the dictionary, using dictGetKey to return the dictionary key and dictGetVal to return the dictionary value.

A collection of objects

Source code ask akin t_set.c.

The encoding of a collection object can be intSet or HashTable.

  • Intset encoding, using a collection of integers as an implementation of a collection object, with two conditions: all elements held by the collection object are integer values; The collection object holds no more than 512 elements;

The set object contains all elements stored in the set of integers, as shown below:

  • Each key of the dictionary is a string object, each string object contains a collection element, and all dictionary values are set to NULL (because no key represents a value).

Note: Common commands

The command intset The implementation method of coding hashtable The implementation method of coding
SADD Call intsetAdd to add all new elements to the set of integers. Call dictAdd, add the key-value pair to the dictionary with the new element as the key and NULL as the value.
SCARD Call intsetLen to return the number of elements in the set of integers. This number is the number of elements in the set object. Call the dictSize function to return the number of key-value pairs contained in the dictionary, which is the number of elements contained in the set object.
SISMEMBER IntsetFind is called to look for the given element in the set of integers. If found, the element exists in the set; if not, the element does not exist in the set. Call the dictFind function to search for a given element in the key of the dictionary. If it is found, it indicates that the element exists in the set; if it is not found, it indicates that the element does not exist in the set.
SMEMBERS Iterate over the entire collection of integers, using the intsetGet function to return the collection element. Walk through the dictionary, using the dictGetKey function to return the dictionary key as the set element.
SRANDMEMBER Call intsetRandom to return a random element from the set of integers. Call dictGetRandomKey to return a random dictionary key from the dictionary.
SPOP Call intsetRandom to remove a random element from the set of integers. After returning the random element to the client, call intsetRemove to remove the random element from the set of integers. The dictGetRandomKey function is called to pick out a random dictionary key from the dictionary. After returning the value of the random dictionary key to the client, the dictDelete function is called to delete the key value pair corresponding to the random dictionary key from the dictionary.
SREM The intsetRemove function is called to remove all given elements from the collection of integers. Call dictDelete to delete from the dictionary all key-value pairs whose keys are given elements.

Ordered set object

Source file t_zset.c.

The codes for ordered collections can be Ziplist and skiplist.

  • Ziplist code, the use of compressed list as the underlying implementation has two conditions: ordered set to hold less than 128 elements; Ordered collections hold all element members less than 64 bytes in length;

Note: In order to ensure order, ziplist may need to move the data in each insert and delete operation.

Each collection element is stored with two compressed list nodes next to each other, the first node holding the element’s members, and the second element holding the element’s score (sorting).

  • Skiplist encoding, ordered collection objects that do not meet the above conditions use the zset structure as the underlying implementation, a zset structure contains both a dictionary and a skiplist:
typedef struct zset {
    zskiplist *zsl;
    dict *dict;
} zset;
Copy the code

The skip table in the ZSET structure stores all the set elements from small to large, and each skip table node stores one set element: the object attribute of the skip table node stores the members of the element, while the score attribute of the skip table node stores the score value of the element. Through this skip list, programs can perform range operations on ordered collections, such as ZRANK, ZRANGE and other commands are based on the skip list API to achieve.

In addition, the dict dictionary in the Zset structure creates a mapping from members to points for ordered collections, with each key-value pair in the dictionary holding a set element: the dictionary’s keys hold the elements’ members, and the dictionary’s values hold the elements’ points. This dictionary allows a program to find the score of a given member with O(1) complexity. The ZSCORE command is based on this feature, and many other ordered collection commands use this feature inside implementations.

The member of each element of an ordered collection is a string object, and the score of each element is a floating-point number of type double. It is worth mentioning that although zset structure used at the same time jump table and a dictionary to store an ordered set elements, but these two kinds of data structure will be Shared by pointer members of the same elements and score, so at the same time use jumping table and dictionaries to save collection elements won’t produce any duplicate member or score, also won’t waste because of this additional memory.

Why do we need to do that?

In theory, ordered collections can be implemented using either a dictionary or a skip list data structure alone, but both dictionaries and skip lists have lower performance than both.

For example, if we implemented ordered collections using only dictionaries, the O(1) complexity of looking up members’ scores would be retained. However, because dictionaries store collection elements in an unordered manner, every time we execute scope-type operations like ZRANK, ZRANG, and so on, Each program needs to sort all the elements stored in the dictionary, which requires at least O(NlogN) time complexity and additional O(N) memory space (because an array is created to hold the sorted elements).

On the other hand, if we implemented ordered collections using only skip lists, all the benefits of skip lists performing scoped operations would be retained, but since there would be no dictionary, the complexity of finding points by members would increase to O(logn).

For the above reasons, Redis chooses to use both dictionary and skip list data structures to implement ordered collections in order to make both lookups and scoping operations of ordered collections as fast as possible.

Note: Common commands

The command ziplist The implementation method of coding zset The implementation method of coding
ZADD Call the ziplistInsert function to insert the member and the score value as two separate nodes into the compressed list. The zslInsert function is called to add the new element to the skip list, and the dictAdd function is called to associate the new element with the dictionary.
ZCARD Call the ziplistLen function to get the number of nodes in the compressed list, and divide this number by 2 to get the number of collection elements. Access the length property of the skip table data structure, returning the number of collection elements directly.
ZCOUNT Iterate through the compressed list and count the number of nodes with scores in a given range. Iterate over the skip list to count the number of nodes with scores in a given range.
ZRANGE Iterates through the compressed list from the head to the end of the table, returning all elements in the given index range. Traverses the skip list from head to tail, returning all elements in a given index range.
ZREVRANGE Traverses the compressed list from the end of the table to the beginning of the table, returning all elements in the given index range. Traverses the skip list from the end of the table to the beginning of the table, returning all elements in a given index range.
ZRANK Traverse the compressed list from the table head to the table end to find a given member, and record the number of nodes along the way. When a given member is found, the number of nodes passing through is the ranking of the corresponding element of the member. The skip list is traversed from the head of the table to the end of the table to find a given member, and the number of nodes passing through is recorded along the way. When a given member is found, the number of nodes passing through is the ranking of the corresponding element of the member.
ZREVRANK Traverse the compressed list from the end of the table to the beginning of the table to find a given member, and record the number of nodes along the way. When a given member is found, the number of nodes passing through is the ranking of the corresponding element of the member. The skip list is traversed from the end of the table to the beginning of the table to search for a given member, and the number of nodes passing through is recorded along the way. When a given member is found, the number of nodes passing through is the ranking of the corresponding element of the member.
ZREM Iterate through the compressed list to remove all nodes that contain the given member, as well as the score node next to the deleted member node. Traverses the skip list and deletes all skip list nodes that contain a given member. Disassociate the deleted element’s member from its score in the dictionary.
ZSCORE Iterate through the compressed list to find the node that contains the given member, and then extract the element score held by the score node next to the member node. Retrieves the score of a given member directly from the dictionary.

Hyperloglog object

Source file: Hyperloglog.c.

For direct reference:

Blog.csdn.net/qq_39885372…

Type checking and polymorphism

The commands used to operate keys in Redis are basically divided into two types. One is the common operation of keys, such as DEL command, EXPIRE command, RENAME command, TYPE command, and OBJECT command. The other is key-specific operations, such as SET, GET, APPEND, STRLEN, etc., which can only be performed on string keys.

Type checking

Before executing a type-specific command, Redis checks that the type of the input key is correct before deciding whether to execute a given command. Via the type attribute of the redisObject structure:

Polymorphic command

The same object can be encoded in two ways, such as hashtable and ziplist. Also based on the encoding format of its object:

The article links

(1) Skiplist, SDS, dictionary, Hyperloglog, # nuggets # juejin.cn/post/694944…

Redis source code, interview guide (2) integer set, compressed list # nuggets article # juejin.cn/post/694955…