This article was adapted from the Internet
This series of articles will be organized into my GitHub repository for the Java Interview Guide. Check out my repository for more highlights
https://github.com/h2pl/Java-Tutorial
If you like, please click Star
The article was first posted on my personal blog:
www.how2playlife.com
WeChat public is this article number of the Java technology river’s lake 】 【 “explore Redis design and implementation of” one, in this paper, some content from the network, in order to put this article speaks with clarity theme, also integrates a lot I think good technology content, reference some good blog posts, if there are any copyright infringement, please contact the author.
In this series of blog posts, you will learn how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis, how to use Redis. As well as some use methods and considerations as a cache, in order to give you a more complete understanding of the Redis related to the technical system, to form your own knowledge framework.
If you have any suggestions or questions about this series of articles, you can also contact the author on the public account “Java Technology Jianghu”. You are welcome to participate in the creation and revision of this series of blog posts.
This article is the seventh in the Redis Internal Data Structure Series. In this article, we focus on an internal Redis data structure, IntSet.
Intset is used in Redis to implement sets as external data structures. A set structure is similar to the mathematical concept of a set in that its elements are unordered and cannot be repeated. The set structure in Redis also implements the basic set union, intersection, and difference operations. Like other data structures exposed by Redis, the underlying implementation of set varies depending on whether the element type is an integer and how many elements are added. In general, a set uses intSet as its underlying data structure when all the elements added to it are integers and the number of elements is small; otherwise, a set uses dict as its underlying data structure.
In this paper, we will be roughly divided into three parts for introduction:
- Focus on intset data structures.
- Discusses how sets are built on top of intsets and dict.
- The paper focuses on the algorithm implementation of union, intersection and difference of SET and time complexity. Note that two algorithms are implemented in Redis to calculate the difference set.
There is one more Redis configuration we will cover in this discussion (in the ADVANCED CONFIG section of redis.conf) :
set-max-intset-entries 512
Copy the code
Note: The code discussed in this article is based on the 3.2 branch of Redis source code.
Intset Data structure introduction
Intset is, as the name suggests, a collection of integers. In fact, an intset is an ordered set of integers, making it easy to do binary lookups on it to quickly determine whether an element belongs to the set. It is similar to Ziplist in the allocation of memory, is a contiguous whole memory space, and for large and small integers (according to the absolute value) to adopt different encoding, as far as possible to optimize the use of memory.
The data structure of intset is defined as follows (from intset.h and intset.c) :
typedef struct intset {
uint32_t encoding;
uint32_t length;
int8_t contents[];
} intset;
#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))
Copy the code
The meanings of the fields are as follows:
encoding
: Data encoding, indicating that each data element in an intset is stored in several bytes. It has three possible values: INTSETENCINT16 means each element is stored in two bytes, INTSETENCINT32 means each element is stored in four bytes, INTSETENCINT64 indicates that each element is stored in 8 bytes. Therefore, integers stored in intset can occupy at most 64bit.length
: indicates the number of elements in intset.encoding
andlength
The two fields form the headers of an intset.contents
: is a flexible array (flexible array member), indicating that the header of an intset is followed by the data element. The total length of this array (the total number of bytes) is equal toencoding * length
. Flexible arrays appear in many Redis data structure definitions (e.gsds.quicklist.skiplist) to express an offset.contents
It needs to be allocated space separately, which is not included in the intset structure.
Note that intSet may change its data encoding as data is added:
- Initially, the newly created intset uses the smallest intsetENCINT16 (value 2) as data encoding.
- Each time a new element is added, the data encoding is upgraded based on the size of the element.
The figure below shows a concrete example of adding data (click to see larger image).
! [intset add data for example] (http://zhangtielei.com/assets/photosredis/intset/redisintsetaddexample.png)
In the picture above:
- The newly created intset has only one header, eight bytes in total. Among them
encoding
= 2,length
= 0. - When you add 13 and 5, because they’re small integers, they can both be represented in 2 bytes, so
encoding
It’s still going to be 2. - When 32768 is added, it can no longer be represented in 2 bytes (2 words represent range -2)15~ 215Minus 1. 32,768 is 215It is out of range
encoding
You must upgrade to INTSETENCINT32 (value 4), that is, four bytes representing an element. - The intSet is always ordered from small to large as each element is added.
- withziplistSimilarly, intsets are stored in little endian mode (see Wikipedia entry)Endianness). For example, in the figure above, after intset has added all the data, represents
encoding
The four bytes of the field should be interpreted as 0x00000004, and the fifth data should be interpreted as 0x000186A0 = 100000.
Intset vs. Ziplist:
- Ziplist can store arbitrary binary strings, while intset can store only integers.
- Ziplist is unordered, while intset is ordered from small to large. As a result, lookups on Ziplist are traversal only, while binary lookups can be performed on IntSet for higher performance.
- Ziplist can encode each data item with a different variable length (each data item is preceded by a data length field)
len
), while intSet can only use a unified encoding (encoding
).
Intset find and add operations
To understand some of the implementation details of IntSet, focus on the two key intset operations: Find (intsetFind) and add (intsetAdd).
The key code for intsetFind is shown below (from intset.c) :
uint8_t intsetFind(intset *is, int64_t value) { uint8_t valenc = _intsetValueEncoding(value); return valenc <= intrev32ifbe(is->encoding) && intsetSearch(is,value,NULL); } static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) { int min = 0, max = intrev32ifbe(is->length)-1, mid = -1; int64_t cur = -1; /* The value can never be found when the set is empty */ if (intrev32ifbe(is->length) == 0) { if (pos) *pos = 0; return 0; } else { /* Check for the case where we know we cannot find the value, * but do know the insert position. */ if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) { if (pos) *pos = intrev32ifbe(is->length); return 0; } else if (value < _intsetGet(is,0)) { if (pos) *pos = 0; return 0; } } while(max >= min) { mid = ((unsigned int)min + (unsigned int)max) >> 1; cur = _intsetGet(is,mid); if (value > cur) { min = mid+1; } else if (value < cur) { max = mid-1; } else { break; } } if (value == cur) { if (pos) *pos = mid; return 1; } else { if (pos) *pos = min; return 0; }}Copy the code
Some things to note about the above code include:
intsetFind
Finds the specified element in the specified intsetvalue
Returns 1 if found, 0 if not found._intsetValueEncoding
The function will look it up depending on what it’s looking forvalue
In which range the corresponding data encoding is calculated (that is, how many bytes it should be stored).- if
value
If the required data encoding is larger than the current intset encoding, it must be out of the range of data that the current intset can store (very large or very small), so 0 is returned; Otherwise the callintsetSearch
Performs a binary search algorithm. intsetSearch
Finds the specified element in the specified intsetvalue
If found, returns 1 and takes the argumentpos
Point to the location of the found element; If none is found, 0 is returned and the argument is passedpos
Points to the location where the element can be inserted.intsetSearch
Is an implementation of binary search algorithm, which is roughly divided into three parts:- Special handling of intSet is null.
- Handles two boundary cases in particular: when looking up
value
Bigger than the last element or smaller than the first element. In fact, the special treatment of these two parts is not necessary in binary lookup, but they provide the possibility of quick failure in special cases here. - Actually perform binary search. Note: If not found at last, insert position at
min
Specified location.
- In the code
intrev32ifbe
This is for doing size side conversions if needed. As we mentioned earlier, the data in intSet is stored in little Endian mode, so when running on a big endian machine, the data here is stored in little Endian modeintrev32ifbe
It will convert accordingly. - The total time of this search algorithm is order log n.
The key code for intsetAdd is as follows (from intset.c) :
intset *intsetAdd(intset *is, int64_t value, uint8_t *success) { uint8_t valenc = _intsetValueEncoding(value); uint32_t pos; if (success) *success = 1; /* Upgrade encoding if necessary. If we need to upgrade, we know that * this value should be either appended (if > 0) or prepended (if < 0), * because it lies outside the range of existing values. */ if (valenc > intrev32ifbe(is->encoding)) { /* This always succeeds, so we don't need to curry *success. */ return intsetUpgradeAndAdd(is,value); } else { /* Abort if the value is already present in the set. * This call will populate "pos" with the right position to insert * the value when it cannot be found. */ if (intsetSearch(is,value,&pos)) { if (success) *success = 0; return is; } is = intsetResize(is,intrev32ifbe(is->length)+1); if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1); } _intsetSet(is,pos,value); is->length = intrev32ifbe(intrev32ifbe(is->length)+1); return is; }Copy the code
Some things to note about the above code include:
intsetAdd
Add a new element to intSetvalue
. ifvalue
If it already exists, it will not be added againsuccess
It’s set to zero; ifvalue
Does not exist in the original intsetvalue
Insert into the appropriate position, when the argumentsuccess
It’s set to 0.- If you want to add elements
value
If the required data encoding is larger than the encoding of the current intSet, theintsetUpgradeAndAdd
Upgrade the encoding of intset and then insert itvalue
. - call
intsetSearch
If it can be found, it will not be added repeatedly. - If not, call
intsetResize
Memory expansion of intset to accommodate newly added elements. Since intSet is a contiguous space, this operation raises memoryrealloc
(seeman.cx/realloc). This may lead to a copy of the data. At the same time callintsetMoveTail
Moving an element one position behind the insertion position also involves a copy of the data. It is worth noting that in theintsetMoveTail
Is to call inmemmove
Complete the data copy.memmove
This ensures that data is not overlapped or overwritten during copy. For details, seeman.cx/memmove. intsetUpgradeAndAdd
Is also called in the implementation ofintsetResize
To complete the memory expansion. When you do a coding upgrade,intsetUpgradeAndAdd
The implementation of intSet takes each element from the original intSet and writes it back to the new location with the new encoding.- Pay attention to the
intsetAdd
, which returns a new intSet pointer. It may be with the incoming intset pointeris
Same, maybe different. The caller must replace the old intset variable passed in with the new intset returned here. Interface usage patterns such as this are common in Redis implementation code, as we described earliersdsandziplistI’ve been in a similar situation. - Obviously, this
intsetAdd
The total time complexity of the algorithm is O(n).
The set of Redis
To better understand the set data structure exposed by Redis, let’s take a look at some of the key commands of set. Here are some examples of commands:
! [the set command, for example] (http://zhangtielei.com/assets/photosredis/intset/redissetcmdexample.png)
What these commands mean:
sadd
Used separately to sets1
ands2
Add elements to. The added elements are both numeric and non-numeric (” a “and” b “).sismember
Determines whether the specified element exists in the collection.sinter
.sunion
andsdiff
They are used to calculate the intersection, union and difference of sets respectively.
As we mentioned earlier, the underlying implementation of a set varies depending on whether the element type is an integer and how many elements are added. For example, during the execution of the above command, the underlying data structure of collection S1 changes as follows:
- At the beginning of the execution
sadd s1 13 5
And then, since I’m adding smaller integers, sos1
At the bottom is an INTset, whose data is encodedencoding
= 2. - The execution of the
sadd s1 32768 10 100000
After that,s1
The underlying layer is still an INTset, but its data is encodedencoding
Upgrade from 2 to 4. - The execution of the
sadd s1 a b
And then, since the element you add is no longer a number,s1
The underlying implementation turns into a dict.
Dict is a data structure used to maintain the mapping between keys and values. When a set is represented by dict, what are its keys and values? In effect, key is the collection element to be added, and value is NULL.
In addition to the previously mentioned transition from intset to dict at the bottom of the collection due to the addition of non-numeric elements, there are two other situations that can cause this transition:
- A number was added, but it cannot be expressed in 64-bit signed numbers. The largest integer range that an intset can express is -264 to 264-1, so if you add numbers beyond that range, it will cause an Intset to become a dict.
- The number of collection elements added exceeded
set-max-intset-entries
Intset can also be set to dict (see ‘setTypeAdd’ in t_set.c for details).
The main reason to use intset for small collections is to save memory. Especially when the number of elements stored is small, dict incurs a much higher memory overhead (two hash tables, linked list Pointers, and a lot of other metadata). So, when storing a large number of small collections whose elements are all numbers, intset can save a considerable amount of memory.
In fact, intsets on average perform less well than dict in terms of time complexity. For lookups, intset is O(log n), while dict can be thought of as O(1). However, since intset is used with a small number of collection elements, this doesn’t matter much.
Redis set union, intersection, difference algorithm
Redis set and, intersection, difference algorithm implementation code, in t_set.c. ‘sinterGenericCommand’ is used to calculate the intersection, and ‘sunionDiffGenericCommand’ is used to calculate the union and difference. They can both operate on multiple (and possibly more than two) sets at the same time. When the difference set operation is carried out on multiple sets, it expresses the meaning that the difference set is made with the first set and the second set, and the result is made with the third set, and so on.
Here we briefly introduce the implementation of the three algorithms.
intersection
The process of calculating the intersection can be roughly divided into three parts:
- Check each set and treat nonexistent sets as empty. Once there is an empty set, there is no need to continue the calculation, the final intersection is the empty set.
- Sort each set in order of the number of elements. This sort makes it easier for later calculations to start with the smallest set, with fewer elements to deal with.
- Traverses the first sorted set (that is, the smallest set), and for each of its elements, searches all subsequent sets in turn. Only elements found in all sets are added to the final result set.
Note that step 3 above looks up in the collection and the time complexity for intset and dict stores is O(log n) and O(1), respectively. But since intsets are only used by small sets, it is roughly assumed that intset lookups are also constant time. Therefore, as stated in the official Redis documentation (redis. IO /commands/si…) , the time complexity of sinter command is:
O(N*M) worst case where N is the cardinality of the smallest set and M is the number of sets.
And set
The easiest way to compute a union is to iterate through all the sets, adding every element to the final result set. Adding elements to a collection automatically deduplicates.
Because every element of all collections is traversed, the time complexity of the sunion command given by the official Redis documentation is (Redis. IO /commands/su…). :
O(N) where N is the total number of elements in all given sets.
Note that the process of inserting elements into the result set is the same as the previous discussion of intersection calculation. The time complexity is O(1), ignoring intset.
Difference set
There are two possible algorithms for computing the difference set with different time complexity.
The first algorithm:
- The first collection is traversed, and for each of its elements, all subsequent collections are searched in turn. Only elements that cannot be found in any set are added to the final result set.
The time complexity of this algorithm is O(N*M), where N is the number of elements in the first set and M is the number of sets.
The second algorithm:
- Add all the elements of the first set to an intermediate set.
- Iterate through all the following sets, and for each element you encounter, delete it from the middle set.
- And then what’s left of the intermediate set is the difference set.
The time complexity of this algorithm is O(N), where N is the sum of the number of elements in all sets.
At the beginning of the calculation of difference set, the expected time complexity of the two algorithms will be estimated respectively, and then the algorithm with low complexity will be selected for operation. Two more points to note:
- To some extent, the first algorithm is preferred because it involves fewer operations and only adds, while the second algorithm adds and then removes.
- If the first algorithm is selected, the implementation of Redis sorts all sets after the second set according to the number of elements from most to least before executing the algorithm. This sort helps to find elements with a higher probability, which leads to a faster end to the search.
For sDIff time complexity, Redis official documentation (redis. IO /commands/sd…) Only the result of the second algorithm is given, which is not accurate.
Stay tuned for the next installment in this series.
Original article, reprint please indicate the source, and include the following TWO-DIMENSIONAL code! Otherwise refuse to reprint! This paper links: zhangtielei.com/posts/blog-…