Explore Redis design and implementation 7: Redis internal data structure details - INTset

This article was adapted from the Internet

This series of articles will be organized into my GitHub repository for the Java Interview Guide. Check out my repository for more highlights

https://github.com/h2pl/Java-Tutorial

If you like, please click Star

The article was first posted on my personal blog:

www.how2playlife.com

WeChat public is this article number of the Java technology river’s lake 】【 “explore Redis design and implementation of” one, in this paper, some content from the network, in order to put this article speaks with clarity theme, also integrates a lot I think good technology content, reference some good blog posts, if there are any copyright infringement, please contact the author.

In this series of blog posts, you will learn how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis. As well as some use methods and considerations as a cache, in order to give you a more complete understanding of the Redis related to the technical system, to form your own knowledge framework.

If you have any suggestions or questions about this series of articles, you can also contact the author on the public account “Java Technology Jianghu”. You are welcome to participate in the creation and revision of this series of blog posts.

This article is the seventh in the Redis Internal Data Structure Series. In this article, we focus on an internal Redis data structure, IntSet.

Intset is used in Redis to implement sets as external data structures. A set structure is similar to the mathematical concept of a set in that its elements are unordered and cannot be repeated. The set structure in Redis also implements the basic set union, intersection, and difference operations. Like other data structures exposed by Redis, the underlying implementation of set varies depending on whether the element type is an integer and how many elements are added. In general, a set uses intSet as its underlying data structure when all the elements added to it are integers and the number of elements is small; otherwise, a set uses dict as its underlying data structure.

In this paper, we will be roughly divided into three parts for introduction:

Focus on intset data structures.
Discusses how sets are built on top of intsets and dict.
The paper focuses on the algorithm implementation of union, intersection and difference of SET and time complexity. Note that two algorithms are implemented in Redis to calculate the difference set.

There is one more Redis configuration we will cover in this discussion (in the ADVANCED CONFIG section of redis.conf) :

set-max-intset-entries 512
Copy the code

Note: The code discussed in this article is based on the 3.2 branch of Redis source code.

Intset Data structure introduction

Intset is, as the name suggests, a collection of integers. In fact, an intset is an ordered set of integers, making it easy to do binary lookups on it to quickly determine whether an element belongs to the set. It is similar to Ziplist in the allocation of memory, is a contiguous whole memory space, and for large and small integers (according to the absolute value) to adopt different encoding, as far as possible to optimize the use of memory.

The data structure of intset is defined as follows (from intset.h and intset.c) :

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))
Copy the code

The meanings of the fields are as follows:

encoding: Data encoding, indicating that each data element in an intset is stored in several bytes. It has three possible values: INTSETENCINT16 means each element is stored in two bytes, INTSETENCINT32 means each element is stored in four bytes, INTSETENCINT64 indicates that each element is stored in 8 bytes. Therefore, integers stored in intset can occupy at most 64bit.
length: indicates the number of elements in intset.encodingandlengthThe two fields form the headers of an intset.
contents: is a flexible array (flexible array member), indicating that the header of an intset is followed by the data element. The total length of this array (the total number of bytes) is equal toencoding * length. Flexible arrays appear in many Redis data structure definitions (e.gsds.quicklist.skiplist) to express an offset.contentsIt needs to be allocated space separately, which is not included in the intset structure.

Note that intSet may change its data encoding as data is added:

Initially, the newly created intset uses the smallest intsetENCINT16 (value 2) as data encoding.
Each time a new element is added, the data encoding is upgraded based on the size of the element.

The figure below shows a concrete example of adding data (click to see larger image).

! [intset add data for example] (http://zhangtielei.com/assets/photosredis/intset/redisintsetaddexample.png)

In the picture above:

The newly created intset has only one header, eight bytes in total. Among themencoding = 2, length= 0.
When you add 13 and 5, because they’re small integers, they can both be represented in 2 bytes, soencodingIt’s still going to be 2.
When 32768 is added, it can no longer be represented in 2 bytes (2 words represent range -2)¹⁵~ 2¹⁵Minus 1. 32,768 is 2¹⁵It is out of rangeencodingYou must upgrade to INTSETENCINT32 (value 4), that is, four bytes representing an element.
The intSet is always ordered from small to large as each element is added.
withziplistSimilarly, intsets are stored in little endian mode (see Wikipedia entry)Endianness). For example, in the figure above, after intset has added all the data, representsencodingThe four bytes of the field should be interpreted as 0x00000004, and the fifth data should be interpreted as 0x000186A0 = 100000.

Intset vs. Ziplist:

Ziplist can store arbitrary binary strings, while intset can store only integers.
Ziplist is unordered, while intset is ordered from small to large. As a result, lookups on Ziplist are traversal only, while binary lookups can be performed on IntSet for higher performance.
Ziplist can encode each data item with a different variable length (each data item is preceded by a data length field)len), while intSet can only use a unified encoding (encoding).

Intset find and add operations

To understand some of the implementation details of IntSet, focus on the two key intset operations: Find (intsetFind) and add (intsetAdd).

The key code for intsetFind is shown below (from intset.c) :

uint8_t intsetFind(intset *is, int64_t value) { uint8_t valenc = _intsetValueEncoding(value); return valenc <= intrev32ifbe(is->encoding) && intsetSearch(is,value,NULL); } static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) { int min = 0, max = intrev32ifbe(is->length)-1, mid = -1; int64_t cur = -1; /* The value can never be found when the set is empty */ if (intrev32ifbe(is->length) == 0) { if (pos) *pos = 0; return 0; } else { /* Check for the case where we know we cannot find the value, * but do know the insert position. */ if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) { if (pos) *pos = intrev32ifbe(is->length); return 0; } else if (value < _intsetGet(is,0)) { if (pos) *pos = 0; return 0; } } while(max >= min) { mid = ((unsigned int)min + (unsigned int)max) >> 1; cur = _intsetGet(is,mid); if (value > cur) { min = mid+1; } else if (value < cur) { max = mid-1; } else { break; } } if (value == cur) { if (pos) *pos = mid; return 1; } else { if (pos) *pos = min; return 0; }}Copy the code

Some things to note about the above code include:

intsetFindFinds the specified element in the specified intsetvalueReturns 1 if found, 0 if not found.
_intsetValueEncodingThe function will look it up depending on what it’s looking forvalueIn which range the corresponding data encoding is calculated (that is, how many bytes it should be stored).
ifvalueIf the required data encoding is larger than the current intset encoding, it must be out of the range of data that the current intset can store (very large or very small), so 0 is returned; Otherwise the callintsetSearchPerforms a binary search algorithm.
intsetSearchFinds the specified element in the specified intsetvalueIf found, returns 1 and takes the argumentposPoint to the location of the found element; If none is found, 0 is returned and the argument is passedposPoints to the location where the element can be inserted.
intsetSearchIs an implementation of binary search algorithm, which is roughly divided into three parts:
- Special handling of intSet is null.
- Handles two boundary cases in particular: when looking upvalueBigger than the last element or smaller than the first element. In fact, the special treatment of these two parts is not necessary in binary lookup, but they provide the possibility of quick failure in special cases here.
- Actually perform binary search. Note: If not found at last, insert position atminSpecified location.
In the codeintrev32ifbeThis is for doing size side conversions if needed. As we mentioned earlier, the data in intSet is stored in little Endian mode, so when running on a big endian machine, the data here is stored in little Endian modeintrev32ifbeIt will convert accordingly.
The total time of this search algorithm is order log n.

The key code for intsetAdd is as follows (from intset.c) :

intset *intsetAdd(intset *is, int64_t value, uint8_t *success) { uint8_t valenc = _intsetValueEncoding(value); uint32_t pos; if (success) *success = 1; /* Upgrade encoding if necessary. If we need to upgrade, we know that * this value should be either appended (if > 0) or prepended (if < 0), * because it lies outside the range of existing values. */ if (valenc > intrev32ifbe(is->encoding)) { /* This always succeeds, so we don't need to curry *success. */ return intsetUpgradeAndAdd(is,value); } else { /* Abort if the value is already present in the set. * This call will populate "pos" with the right position to  insert * the value when it cannot be found. */ if (intsetSearch(is,value,&pos)) { if (success) *success = 0; return is; } is = intsetResize(is,intrev32ifbe(is->length)+1); if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1); } _intsetSet(is,pos,value); is->length = intrev32ifbe(intrev32ifbe(is->length)+1); return is; }Copy the code

Some things to note about the above code include:

intsetAddAdd a new element to intSetvalue. ifvalueIf it already exists, it will not be added againsuccessIt’s set to zero; ifvalueDoes not exist in the original intsetvalueInsert into the appropriate position, when the argumentsuccessIt’s set to 0.
If you want to add elementsvalueIf the required data encoding is larger than the encoding of the current intSet, theintsetUpgradeAndAddUpgrade the encoding of intset and then insert itvalue.
callintsetSearchIf it can be found, it will not be added repeatedly.
If not, callintsetResizeMemory expansion of intset to accommodate newly added elements. Since intSet is a contiguous space, this operation raises memoryrealloc(seeman.cx/realloc). This may lead to a copy of the data. At the same time callintsetMoveTailMoving an element one position behind the insertion position also involves a copy of the data. It is worth noting that in theintsetMoveTailIs to call inmemmoveComplete the data copy.memmoveThis ensures that data is not overlapped or overwritten during copy. For details, seeman.cx/memmove.
intsetUpgradeAndAddIs also called in the implementation ofintsetResizeTo complete the memory expansion. When you do a coding upgrade,intsetUpgradeAndAddThe implementation of intSet takes each element from the original intSet and writes it back to the new location with the new encoding.
Pay attention to theintsetAdd, which returns a new intSet pointer. It may be with the incoming intset pointerisSame, maybe different. The caller must replace the old intset variable passed in with the new intset returned here. Interface usage patterns such as this are common in Redis implementation code, as we described earliersdsandziplistI’ve been in a similar situation.
Obviously, thisintsetAddThe total time complexity of the algorithm is O(n).

The set of Redis

To better understand the set data structure exposed by Redis, let’s take a look at some of the key commands of set. Here are some examples of commands:

! [the set command, for example] (http://zhangtielei.com/assets/photosredis/intset/redissetcmdexample.png)

What these commands mean:

saddUsed separately to sets1ands2Add elements to. The added elements are both numeric and non-numeric (” a “and” b “).
sismemberDetermines whether the specified element exists in the collection.
sinter.sunionandsdiffThey are used to calculate the intersection, union and difference of sets respectively.

As we mentioned earlier, the underlying implementation of a set varies depending on whether the element type is an integer and how many elements are added. For example, during the execution of the above command, the underlying data structure of collection S1 changes as follows:

At the beginning of the executionsadd s1 13 5And then, since I’m adding smaller integers, sos1At the bottom is an INTset, whose data is encodedencoding= 2.
The execution of thesadd s1 32768 10 100000After that,s1The underlying layer is still an INTset, but its data is encodedencodingUpgrade from 2 to 4.
The execution of thesadd s1 a bAnd then, since the element you add is no longer a number,s1The underlying implementation turns into a dict.

Dict is a data structure used to maintain the mapping between keys and values. When a set is represented by dict, what are its keys and values? In effect, key is the collection element to be added, and value is NULL.

In addition to the previously mentioned transition from intset to dict at the bottom of the collection due to the addition of non-numeric elements, there are two other situations that can cause this transition:

A number was added, but it cannot be expressed in 64-bit signed numbers. The largest integer range that an intset can express is -264 to 264-1, so if you add numbers beyond that range, it will cause an Intset to become a dict.
The number of collection elements added exceededset-max-intset-entriesIntset can also be set to dict (see ‘setTypeAdd’ in t_set.c for details).

The main reason to use intset for small collections is to save memory. Especially when the number of elements stored is small, dict incurs a much higher memory overhead (two hash tables, linked list Pointers, and a lot of other metadata). So, when storing a large number of small collections whose elements are all numbers, intset can save a considerable amount of memory.

In fact, intsets on average perform less well than dict in terms of time complexity. For lookups, intset is O(log n), while dict can be thought of as O(1). However, since intset is used with a small number of collection elements, this doesn’t matter much.

Redis set union, intersection, difference algorithm

Redis set and, intersection, difference algorithm implementation code, in t_set.c. ‘sinterGenericCommand’ is used to calculate the intersection, and ‘sunionDiffGenericCommand’ is used to calculate the union and difference. They can both operate on multiple (and possibly more than two) sets at the same time. When the difference set operation is carried out on multiple sets, it expresses the meaning that the difference set is made with the first set and the second set, and the result is made with the third set, and so on.

Here we briefly introduce the implementation of the three algorithms.

intersection

The process of calculating the intersection can be roughly divided into three parts:

Check each set and treat nonexistent sets as empty. Once there is an empty set, there is no need to continue the calculation, the final intersection is the empty set.
Sort each set in order of the number of elements. This sort makes it easier for later calculations to start with the smallest set, with fewer elements to deal with.
Traverses the first sorted set (that is, the smallest set), and for each of its elements, searches all subsequent sets in turn. Only elements found in all sets are added to the final result set.

Note that step 3 above looks up in the collection and the time complexity for intset and dict stores is O(log n) and O(1), respectively. But since intsets are only used by small sets, it is roughly assumed that intset lookups are also constant time. Therefore, as stated in the official Redis documentation (redis. IO /commands/si…) , the time complexity of sinter command is:

O(N*M) worst case where N is the cardinality of the smallest set and M is the number of sets.

And set

The easiest way to compute a union is to iterate through all the sets, adding every element to the final result set. Adding elements to a collection automatically deduplicates.

Because every element of all collections is traversed, the time complexity of the sunion command given by the official Redis documentation is (Redis. IO /commands/su…). :

O(N) where N is the total number of elements in all given sets.

Note that the process of inserting elements into the result set is the same as the previous discussion of intersection calculation. The time complexity is O(1), ignoring intset.

Difference set

There are two possible algorithms for computing the difference set with different time complexity.

The first algorithm:

The first collection is traversed, and for each of its elements, all subsequent collections are searched in turn. Only elements that cannot be found in any set are added to the final result set.

The time complexity of this algorithm is O(N*M), where N is the number of elements in the first set and M is the number of sets.

The second algorithm:

Add all the elements of the first set to an intermediate set.
Iterate through all the following sets, and for each element you encounter, delete it from the middle set.
And then what’s left of the intermediate set is the difference set.

The time complexity of this algorithm is O(N), where N is the sum of the number of elements in all sets.

At the beginning of the calculation of difference set, the expected time complexity of the two algorithms will be estimated respectively, and then the algorithm with low complexity will be selected for operation. Two more points to note:

To some extent, the first algorithm is preferred because it involves fewer operations and only adds, while the second algorithm adds and then removes.
If the first algorithm is selected, the implementation of Redis sorts all sets after the second set according to the number of elements from most to least before executing the algorithm. This sort helps to find elements with a higher probability, which leads to a faster end to the search.

For sDIff time complexity, Redis official documentation (redis. IO /commands/sd…) Only the result of the second algorithm is given, which is not accurate.

Stay tuned for the next installment in this series.

Original article, reprint please indicate the source, and include the following TWO-DIMENSIONAL code! Otherwise refuse to reprint! This paper links: zhangtielei.com/posts/blog-…

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Explore Redis design and implementation 7: Redis internal data structure details – INTset

Intset Data structure introduction

Intset find and add operations

The set of Redis

Redis set union, intersection, difference algorithm

intersection

And set

Difference set

Explore Redis design and implementation 7: Redis internal data structure details – INTset

Intset Data structure introduction

Intset find and add operations

The set of Redis

Redis set union, intersection, difference algorithm

intersection

And set

Difference set

Related Posts

Management system necessary technology (11): connect to the public number to send regular reminder notice

IDEA plugin improves work efficiency

Common Linux command line shortcut keys