You have to work really hard to look effortless!

Wechat search public number [long Coding road], together From Zero To Hero!

preface

A Bitmap, or Bitmap, is a series of contiguous binary arrays (zeros and ones) whose elements can be located by offsets. BitMap by the smallest unit bit 0 | 1 set, said an element value or state, time complexity is O (1). Since the bit is the smallest unit in a computer, using it for storage can be very space-saving, especially for scenarios where large amounts of data are used and binary statistics are used.

The binary state here means that there are only two values of 0 and 1. For example, in the check-in scenario, we only record the check-in (1) or non-check-in (0), so it is a very typical binary state. In check-in statistics, a day’s check-in of each user can be represented by 1 bit, a month’s (suppose 31 days) check-in can be represented by 31 bit, and a year’s check-in only needs 365 bit, without too complex set type. At this point, we can select the Bitmap.

Bitmaps do not belong to the basic data type of Redis, but are bitwise operations based on strings. The maximum length of a string in Redis is 512 MB, so the offset value of the BitMap is also capped. The maximum value is:

8 * 1024 * 1024 * 512  =  2^32
Copy the code

Let’s take a look at Bitmap operations.

SETBIT

Version available: >= 2.2.0

Time complexity: O(1)

The command format

SETBIT key offset value
Copy the code

Command description

  • Sets or clears the bits at the specified offset for the string value stored by key.
  • Bits are set or cleared depending on the value, that is, 1 or 0
  • When the key does not exist, a new string is created. The length of the string is extended until it satisfies the specified offset (0 ≤offset< 2^32), during which the value of the new bits is set to 0

Warning!

If offset is set to a large size, memory allocation may cause Redis to block.

If the string corresponding to the key does not exist or is short, but the offset is large (e.g., 2^ 32-1 at most), Redis needs to allocate memory for the middle bits, and Redis may block.

Take the 2010 MacBook Pro for example, offset = 2^ 32-1 (512MB memory is allocated), which takes about 300ms. Offset = 2^ 30-1 (allocate 128 memory), which takes about 80ms. Offset = 2^ 28-1 (32MB memory is allocated), which takes about 30ms. Offset = 2^ 26-1 (allocate 8MB memory), which takes about 80ms.

After allocating memory for the first time, subsequent operations on the same key have no memory allocation overhead.

Set the Bitmap

If you want to set a non-zero initial value for a Bitmap, how do you do that? One way to do this is to SET each bit to 0 or 1, but this is more cumbersome. Consider using the SET command directly to store a string.

Since bitmaps are based on strings, Bitmap data can also use String commands, mainly SET and GET.

For example, for the string ’42’, 0-7 bits are used to save ‘4’, 8-15 bits are used to save ‘2’, the ASCII code corresponding to ‘4’ is 0011 0100, and the ASCII code corresponding to ‘2’ is 0011 0010. Let’s observe by setting Bitmap one by one:

127.0.0.1:6379> SETBIT bitmapsarestrings 2 1
(integer) 0
127.0.0.1:6379> SETBIT bitmapsarestrings 3 1
(integer) 0
127.0.0.1:6379> SETBIT bitmapsarestrings 5 1
(integer) 0
127.0.0.1:6379> SETBIT bitmapsarestrings 10 1
(integer) 0
127.0.0.1:6379> SETBIT bitmapsarestrings 11 1
(integer) 0
127.0.0.1:6379> SETBIT bitmapsarestrings 14 1
(integer) 0

#By setting the bits one by one, the GET data is the string "42."127.0.0.1:6379 > GET bitmapsarestrings "42"
#Set the string directly, and find the same bit127.0.0.1:6379> set bitkey "42" OK 127.0.0.1:6379> getbit bitkey 0 (integer) 0 127.0.0.1:6379> getbit bitkey 1 (integer) 0 127.0.0.1:6379> getbit bitkey 2 (integer) 1 127.0.0.1:6379> getbit bitkey 3 (integer) 1 127.0.0.1:6379> getbit bitkey 4 (integer) 0 127.0.0.1:6379> getbit bitkey 5 (integer) 1 127.0.0.1:6379> getbit bitkey 6 (integer) 0 127.0.0.1:6379> Getbit bitkey 7 (integer) 0 127.0.0.1:6379> getbit bitkey 8 (integer) 0 127.0.0.1:6379> getbit bitkey 9 (integer) 0 127.0.0.1:6379> getbit bitkey 10 (integer) 1 127.0.0.1:6379> getbit bitkey 11 (integer) 1 127.0.0.1:6379> getbit bitkey 12 (integer) 0 127.0.0.1:6379> getbit bitkey 13 (integer) 0 127.0.0.1:6379> getbit bitkey 14 (integer) 1 127.0.0.1:6379>  getbit bitkey 15 (integer) 0Copy the code

Therefore, we can see from the above example that when setting a non-zero initial value, we do not need to set the bits one by one, but only need to give a string.

The return value

Integer: the original value of the offset position

The sample

#The corresponding ASCII code of 8 is 0011 1000
127.0.0.1:6379> set mykey "8"
OK

#The corresponding ASCII code of 8 is 0011 1000
127.0.0.1:6379> getbit mykey 0
(integer) 0
127.0.0.1:6379> getbit mykey 1
(integer) 0
127.0.0.1:6379> getbit mykey 2
(integer) 1
127.0.0.1:6379> getbit mykey 3
(integer) 1
127.0.0.1:6379> getbit mykey 4
(integer) 1
127.0.0.1:6379> getbit mykey 5
(integer) 0
127.0.0.1:6379> getbit mykey 6
(integer) 0
127.0.0.1:6379> getbit mykey 7
(integer) 0

#Set the 8th bit (index 7) to 1
127.0.0.1:6379> setbit mykey 7 1
(integer) 0
127.0.0.1:6379> getbit mykey 7
(integer) 1

Copy the code

GETBIT

Version available: >= 2.2.0

Time complexity: O(1)

The command format

GETBIT key offset
Copy the code

Command description

  • Returns the string corresponding to key,offsetBit of position
  • whenoffsetIs greater than the length of the value0
  • whenkeyIf it does not exist, value can be considered an empty stringoffsetMust be greater than the empty string length, refer to the previous item, also return0

The return value

Integer: the bit value of the offset position

The sample

#The corresponding ASCII code of 8 is 0011 1000
127.0.0.1:6379> set mykey "8"
OK

#The corresponding ASCII code of 8 is 0011 1000
127.0.0.1:6379> getbit mykey 0
(integer) 0
127.0.0.1:6379> getbit mykey 1
(integer) 0
127.0.0.1:6379> getbit mykey 2
(integer) 1
127.0.0.1:6379> getbit mykey 3
(integer) 1
127.0.0.1:6379> getbit mykey 4
(integer) 1
127.0.0.1:6379> getbit mykey 5
(integer) 0
127.0.0.1:6379> getbit mykey 6
(integer) 0
127.0.0.1:6379> getbit mykey 7
(integer) 0

Copy the code

BITCOUNT

Available version: >= 2.6.0

Time complexity: O(N)

The command format

BITCOUNT key [start end]
Copy the code

Command description

  • In the given string, the bit value is1The number of
  • The entire string is counted by default and can be specifiedstartendTo limit the scope
  • startendIt could be a negative number,- 1The last onebyte, -2 indicates the penultimate byte. Notice that this isbyte, 1 byte =8 bits
  • If key does not exist, return0

The return value

Integer: the number of bits 1

The sample

# a: 0110 0001	
# b: 0110 0010	
# c: 0110 0011
# mykey: 01100001 01100010 01100011
127.0.0.1:6379> set mykey 'abc'
OK

#Count the number of bits =1 in the entire string
127.0.0.1:6379> bitcount mykey
(integer) 10

# a
127.0.0.1:6379> bitcount mykey 0 0
(integer) 3

# ab
127.0.0.1:6379> bitcount mykey 0 1
(integer) 6

# c
127.0.0.1:6379> bitcount mykey 2 2
(integer) 4
Copy the code

Usage scenarios

For example, we have an App that needs to count the login status of each user in 2021. To meet this requirement, we take the user ID +2021 as the key and set the offset corresponding to the user’s online date to 1, so that the login status of each user in this year can be counted. The number of login days can be counted by using bitcount.

For example, if today is the 100th day of 2021 and user_id:10001 has read the website today, run SETBIT 2021:user_id:10001 100 1; If the user also logs in to the App tomorrow, run SETBIT 2021:user_id:10001 101 1, and so on.

Finally, use BITCOUNT 2021:user_id:10001 to count the number of times the user logs in to the App in this year

From the above example, we can see that using Bitmap to collect binary data is very memory saving. A user only needs 365 bits in a year, and 365*10/8=456 bytes in 10 years.

BITPOS

Available version: >= 2.8.7

Time complexity: O(N)

The command format

BITPOS key bit [start [end]]
Copy the code

Command description

  • Returns the first bit value in the string from left to rightbitOffset of (0 or 1)
  • The entire string is checked by default, but can also be specifiedstartandendVariable to specifybyteRange, as described in BITCOUNT
  • SETBITandGETBITAll of them are assignedThe bitThe offset.BITCOUNTandBITPOSSpecifies thebyteThe scope of
  • The offset returned by this command starts at 0, regardless of whether the query range is specified
  • If the key does not exist, it is considered an empty string

The return value

Integer: The first bit value is the offset of the specified bit (0 or 1)

  • If bit=1 and the string is empty, -1 is returned.
  • If the parameter bit is 0, but all the bits in the string are 1, the command returns the maximum offset+1 of the string. For example, if the string bit value is’ 11111111 ‘, 8 is returned.
  • By default, if the query bit is 0 and no range is specified, or if only start is specified, the command replaces the string with 0 to query for offset at bit=0. However, if start and end are specified and all values in the range are 0, -1 is returned, because the user specified the range and there is no 0 in the range
  • If bit=1, the string will never be appended with 1, or -1 will be returned

The sample

#Case 1
# a: 0110 0001	
127.0.0.1:6379> SET mykey "a"
OK
#Offset of the first bit=1
127.0.0.1:6379> bitpos mykey 1
(integer) 1
#The offset of the first bit=0
127.0.0.1:6379> bitpos mykey 0
(integer) 0

#Case 2
# '\xff': 1111, 1111
#When both values are 1, query bit=1 and bit=0127.0.0.1:6379> SET mykey "\ XFF "OK 127.0.0.1:6379> bitpos mykey 1 (integer) 0 127.0.0.1:6379> bitpos mykey 0 (integer) 8
#Example 3
# '\xff': 0000, 0000
#When both values are 0, query bit=1 and bit=0127.0.0.1:6379> SET mykey "\x00" OK 127.0.0.1:6379> bitpos mykey 1 (integer) -1 127.0.0.1:6379> bitpos mykey 0 (integer)  0
#Example 4
# 11111111 11111111 11111111127.0.0.1:6379> set mykey "\ XFF \ XFF \ XFF "OK#Specify the first two bytes, both 1's, do not supplement 0's, and return -1
127.0.0.1:6379> bitpos mykey 0 0 1
(integer) -1
Copy the code

BITOP

Available version: >= 2.6.0

Time complexity: O(N)

The command format

BITOP operation destkey key [key ...]
Copy the code

Command description

  • Bitwise operations are performed on multiple strings and the results are saved in destkey

  • Operation can be AND, OR, XOR, OR NOT

  • BITOP AND destkey srckey1 srckey2 srckey3 … SrckeyN: obtains the logical sum of multiple keys and saves the result to the destkey

  • BITOP OR destkey srckey1 srckey2 srckey3 … SrckeyN: obtains logical or for multiple keys and saves the result to destkey

  • BITOP XOR destkey srckey1 srckey2 srckey3 … SrckeyN: searches for or for multiple keys and saves the result to destkey

  • BITOP NOT destkey srckey specifies the value of the key and saves the value to the destkey

  • With the exception of the NOT operation, any operation can take one or more keys as input.

Strings of different lengths

When the length of the string is different in a given argument, the missing part between the shorter string and the longest string is treated as 0.

An empty key is also considered a sequence of strings containing zeros.

The return value

Integer: The length of the string saved to destkey (equal to the length of the longest string in the key given by the parameter)

The sample

127.0.0.1:6379> set key1 "\ XFF "OK 127.0.0.1:6379> set key2 "\x00" OK
#Logic and
127.0.0.1:6379> bitop AND andkey key1 key2
(integer) 1
127.0.0.1:6379> get andkey
"\x00"

#Logic or127.0.0.1:6379> bitop OR orkey key1 key2 (integer) 1 127.0.0.1:6379> get orkey "\ XFF"
#Exclusive or127.0.0.1:6379> bitop XOR xorkey key1 key2 (integer) 1 127.0.0.1:6379> get xorkey "\ XFF"
#Logic is not
127.0.0.1:6379> bitop not notkey key1
(integer) 1
127.0.0.1:6379> get notkey
"\x00"
Copy the code

BITFIELD

Version available: >= 3.2.0

Time complexity: O(1) for each subcommand

The command format

BITFIELD key [GET type offset] [SET type offset value] [INCRBY type offset increment] [OVERFLOW WRAP|SAT|FAIL]
Copy the code

Command description

  • BITFIELD treats the Redis string as an array of integers, capable of handling bits of different widths and fields of arbitrary offsets. In other words, with this command, the user can do the following: “Set the 5-bit signed integer from offset 1234 to a value,” “Get the 31-bit unsigned integer from offset 4567,” and so on

  • The BITFIELD command can also perform addition and subtraction operations on specified integers, and these operations can be set to handle overflow cases

  • BITFIELD can operate on multiple bits in a single command. That is, a command takes multiple operations and returns a list of values for each operation.

    The following command has two operations: Add the 5-bit signed integer starting at offset 100 and get the 4-bit unsigned integer starting at offset 0

    > BITFIELD mykey INCRBY i5 100 1 GET u4 0
    1) (integer) 1
    2) (integer) 0
    Copy the code

    Note:

    When GET is used outside the current current string range (and if key does not have an empty equivalent string), the excess is treated as a 0

    When INCRBY or SET is used and the string is out of range, the missing part is padded with zeros

Supported subcommands and integer types

  • GET <type> <offset>Returns the specified range of bits
  • SET <type> <offset> <value>— Sets the specified bit range and returns the old value
  • INCRBY <type> <offset> <increment>Increment or decrease (if increment is negative) the specified range of bits and return the new value

In addition to the above three subcommands, there is another subcommand that changes the behavior of SET and INCRBY in the event of an overflow:

  • OVERFLOW [WRAP|SAT|FAIL]

When the range of bits being set is an integer, we can prefix the type argument with I to indicate a signed integer, or use u to indicate an unsigned integer. For example, we can use u8 for 8-bit unsigned integers and i16 for 16-bit signed integers.

Bit value and position offset

We have two ways to set the offset: zero-based bit offset if the number is not preceded by a prefix; If the number has a ‘#’ prefix, offset is equal to the width of the supplied integer multiplied by the offset after ‘#’. Such as:

BITFIELD mystring SET i8 #0 100 SET i8 #1 200
Copy the code

The offset of the first SET is: 8*0=0

The offset of the second SET is 8*1=8

Using the # prefix saves us the trouble of manually calculating where the bits are set.

Overflow control

With the OVERFLOW command, you can control the behavior of bitfields when they OVERFLOW or underflow when performing increments or decrement:

  • WRAP: Use a WRAP to control overflow of signed and unsigned integers. For unsigned integers, wrapping is modulo the value itself with the largest unsigned integer that can be stored, which is standard behavior in C. For signed integers, an overflow causes the number to start from the smallest negative number that can be represented, while an underflow causes the number to start from the largest positive number that can be represented. For example, adding one to an i8 integer with a value of 127 yields -128.

  • SAT: the saturation algorithm is used to calculate, namely, the underflow result is the smallest integer value that can be expressed, and the overflow result is the largest integer value that can be expressed. For example, if you increment an i8 integer of 120 by 10, the result will be 127, the largest integer that the i8 type can store, which remains the same if you increment further. In contrast, if a calculation of an i8 value causes an underflow, the i8 value is set to -127.

  • FAIL: When an overflow or underflow occurs, no operation is performed and a null value is returned to the user indicating that the calculation has not been performed.

Note that the OVERFLOW subcommand only affects the SET and INCRBY commands that follow it until it hits the next OVERFLOW. WARP mode is used by default.

#They both start at 0, perform unsigned plus one, and the first incrby default uses WARP mode and the second one specifies SAT mode
> BITFIELD mykey incrby u2 100 1 OVERFLOW SAT incrby u2 102 1
1) (integer) 1
2) (integer) 1

> BITFIELD mykey incrby u2 100 1 OVERFLOW SAT incrby u2 102 1
1) (integer) 2
2) (integer) 2

> BITFIELD mykey incrby u2 100 1 OVERFLOW SAT incrby u2 102 1
1) (integer) 3
2) (integer) 3

#U2 is at most 3, plus a moment to spill up. WARP: (3+1)%4=0 SAT: Always keep the maximum value
> BITFIELD mykey incrby u2 100 1 OVERFLOW SAT incrby u2 102 1
1) (integer) 0
2) (integer) 3
Copy the code

The return value

List: the return value of each subcommand. OVERFLOW does not return data and only affects the return value of subsequent commands.

motivation

Storing a lot of small integers in a bitmap of large length, or splitting a very large key into smaller ones, can use memory very efficiently, allowing Redis to be used in a wide variety of applications — especially in the field of real-time analysis: BITFIELD’s ability to override control calculations in specified ways makes it applicable in this area.

Performance considerations

BITFIELD is a fast command, but it is important to note that processing large offsets of bits for short strings leads to memory allocation, which in turn leads to increased elapsed time. Processing existing bits takes less time.

The order of bits

BITFIELD considers the bitmap’s first byte offset 0 as the most significant bit, and so on. For example, if we set a bitmap that has been pre-set to all zeros and set its offset of 7 to a 5-bit unsigned integer value of 23 (binary 10111), the command will produce the following bitmap representation:

+ + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + | | 00000001 | 01110000 + + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- +Copy the code

When offsets and integer lengths are aligned with byte boundaries, bitfields represent bits in the same way as big Endian notation, but it is also important to understand how these bits are arranged in the absence of alignment.

The sample

127.0.0.1:6379> set mykey "\x00"
OK

#Offset starts at 0 and gets the first five signed positive digits
127.0.0.1:6379> bitfield mykey get i5 0
1) (integer) 0

#Offset starts from 0 and sets the first five signed positive digits
127.0.0.1:6379> bitfield mykey set i5 0 10
1) (integer) 0

127.0.0.1:6379> bitfield mykey get i5 0
1) (integer) 10

#Offset starts at 0 and the first five bits are signed plus one127.0.0.1:6379> bitfield mykey incrby i5 0 11) (integer) 11 127.0.0.1:6379> bitfield mykey get i5 0 1) (integer) 11Copy the code

conclusion

This article introduces Bitmap operations, including the following commands

  • SETBIT: Sets the bit
  • GETBIT: queries the bit value
  • BITCOUNT: Counts the number of bits whose value is 1
  • BITPOS: Query for the offset with the first bit value 0 or 1
  • BITOP: Perform logical and, or, xOR, and no operations on a Bitmap
  • BITFIELD: Treat a Bitmap as a set of integers and operate on the integers

More and more

Personal blog: lifelmy.github. IO /

Wechat official account: Long Coding road