Chapter 6: Integer set

The set of integers is one of the underlying implementations of the set key. Redis uses the set of integers as the underlying implementation of the set key when a set contains only integer numeric elements and the set has a small number of elements.

6.1 Implementation of integer set

Intset is an abstract data structure Redis uses to store integer values. It can store integer values of type INT16_t, int32_t, or int64_t without repeating elements in the set.

Each intset represents a set of integers:

// uint32_t encoding; // Uint32_t length; Int8_t contents[]; } intset;Copy the code

Contents array is the low-level implementation of integer set: each element of integer set is an item of Conters array, and each item is ordered in the array according to the size of value from small to large, and the array does not contain any duplicate items.

The length property records the number of integers in the contents array.

While the intset structure declares the contents property as an array of type INT8_T, the contents array does not actually hold any int8_T values, and the true type of the contnets array depends on the value of the encoding property:

The cintents attribute is INTSET_ENC_INT16, and the cintents attribute is an int16_T array. Each item of the array is an integer value of int16_T (-32768 to 32767). If the encoding attribute is INTSET_ENC_INT32, then cintents is an array of int32_T, and each item in the array is an integer value of int32_T (-2147483648 to 2147483647). If the encoding attribute is INTSET_ENC_INT64, then cintents is an array of type INT64_T, Each entry in the array is an integer value of type INT64_t (-9223372036854775808 to 9223372036854775807)Copy the code

In short, the array holds the type of encoding.

Figure 6-1 shows an example:

encodingThe value of the property isINTSET_ENC_INT16Represents a set of integerscontentsThe underlying implementation ofint16_tAn array of type, while collections hold bothint16_tThe integer value of the.
lengthProperty has a value of 5, indicating that the set of integers contains five elements.
contentsThe array holds the five elements of the collection in descending order.
Because each collection element is an integer value of type INT16_TcontentsThe size of the array is equal tosize(int16_t) * 5= 80.

Let’s look at another set of integers: intsetEach attribute represents the same meaning as above, with a few differences:

We found that for 1, 3, 5, we only need to use INT16 bits for storage, for- 2675256175807981027.For example, we only need to use INT64 for storage.
Why do they always end up withint64What about storage? And the reason is,contentsWhen an integer set is added, it is automatically upgraded.
For example, at the beginningint16Bit when we add oneint32If it is 32-bit, it will upgrade to 32-bit, and if it is 64-bit, it will upgrade to 32-bit againint64position

6.2 upgrade

Every time we add a new element to the set of integers, and the new element is of a longer type than all the existing elements in the set of integers, the set of integers needs to be upgraded before it can be added to the set.

Upgrading the collection and adding it is done in three steps:

Expands the size of the underlying array of integers based on the type of the new element and allocates space for the new element.
All the existing elements of the underlying array are converted to the same type as the new element, and the converted elements are placed in the correct bit. During the placement of elements, the ordered nature of the underlying array needs to remain unchanged.
Adds the new element to the underlying array

For example, let’s say we have oneINTSET_ENC_INT16A set of encoded integers containing threeint16_tType, as shown in Figure 6-3:Since each element occupies 16 bits of space, the size of the underlying array of the integer set is 3*16 = 48 bits. Figure 6-4 shows the integer set at the 48 bits of three elements.Suppose we want to add an integer of type int32_t to the set of integers. Since the type int32_t is longer than all the current elements in the set of integers, the program needs to upgrade the set of integers before adding 65535 to the set of integers.

The first thing the upgrade does is reallocate the space of our array based on the length of our new type.

The set of integers currently has three elements, with the addition of the new element 65535, the set of integers needs to allocate four elements of space, because each int32_t takes up 32 bits of space, so after space redistribution, the size of the underlying array is: 32*4 = 128 bits.

Although we have reallocated the underlying array, the original three elements 1, 2, and 3 remainint16_tThese elements are still stored in the first 48 bits of the array, so the next step in the program is to convert these three elements toint32_tType, and place the converted elements in the correct bits, preserving the order of the array as the elements are placed.First, because element 3 ranks third among elements 1, 2, 3, and 65535, it is moved to index 2 of the contents array, which is 64 to 95 bits of the array, as shown in Figure 6-6:

The other values are the same, just fill them in.

Because each addition of a new element to the collection may cause an upgrade, and each upgrade requires a type conversion of an existing element at the bottom of the array, the time to add a new element to the collection is O (N).

Other operations to upgrade the code are similar.

6.3 Benefits of The Upgrade

The integer set upgrade strategy has two benefits, one is to improve the flexibility of the integer set, and the other is to save as much memory as possible.

6.3.1 Improve flexibility

Because C is a statically typed language, we usually don’t put two different types of values in the same data structure to avoid typing errors.

For example, we usually only hold int16_t in an array of type INT16_t, and so on.

Because integer sets can be automatically upgraded, we don’t have to worry about type errors, which is very flexible.

6.3.2 Memory Saving

We might say, well, let’s just store the three values as int64_t instead of doing integer upgrades.

If all the current added integers are int16_t, a lot of memory will be wasted. To avoid this waste, we upgrade.

6.4 the drop

Integer collections do not degrade, and once an array is upgraded, the encoding remains the same.

6.5 Integer Collection API

6.6 Key Reviews

The collection of integers is one of the underlying implementations of the collection key
The underlying implementation of a collection of integers is an array, which holds the collection elements in an ordered, non-repetitive fashion. If necessary, the program changes the type of the array according to the type of the newly added element.
The upgrade operation brings operational flexibility to integer collections and saves as much memory as possible.
Integer sets only support upgrade operations, not degrade operations.

6.1 Implementation of integer set

6.2 upgrade

6.3 Benefits of The Upgrade

6.3.1 Improve flexibility

6.3.2 Memory Saving

6.4 the drop

6.5 Integer Collection API

6.6 Key Reviews

Related Posts

REST services, using Dubbo or SpringMVC?

Dubbo monitoring

How to crawl an ASP.NET dynamic web page? Fix the damned dynamic parameters, this article tells you!