Redis Bitmaps

preface

Redis Bitmaps tutorial introduces Bitmaps concepts, operations, and common applications.

Redis version: 6.2.6

A brief introduction to Bitmaps

A bitmap is not an actual data type, but rather a set of bit-oriented operations defined on a String. Because strings are binary-safe BLOBs, and their maximum length is 512 MB, they are suitable for setting up to 2^32 different bits.

Bitmaps are a set of instructions provided by Redis that operate directly on String bits. For example, we now have a String: “A”.

127.0.0.1:6379> set k1 a
OK
127.0.0.1:6379> get k1 
"a"
Copy the code

The binary of a is 0110 0001. We can use the [GETBIT key offset] instruction to get the value of the corresponding bit of the string:

127.0.0.1:6379> getbit k1 0
(integer) 0
127.0.0.1:6379> getbit k1 1
(integer) 1
127.0.0.1:6379> getbit k1 2
(integer) 1
127.0.0.1:6379> getbit k1 3
(integer) 0
127.0.0.1:6379> getbit k1 4
(integer) 0
127.0.0.1:6379> getbit k1 5
(integer) 0
127.0.0.1:6379> getbit k1 6
(integer) 0
127.0.0.1:6379> getbit k1 7
(integer) 1
Copy the code

The offset in this instruction represents the offset. As shown above, the offset bits 1, 2 and 7 are 1, and the other bits are 0. The corresponding binary is 0110 0001, which is the ASCII value of character A.

Then we can use the [SETBIT key offset value] command to change the value of a certain bit, for example:

127.0.0.1:6379> setbit k1 6 1 // offset 6 bits, set to 1 (integer) 0 127.0.0.1:6379> get k1 "c"Copy the code

As shown above, we set the offset of 6 to 1, so that the binary object becomes: 0110 0011, corresponding to the character “C”, and we change the value of the string by directly manipulating the bits of the string.

Bitmaps in Redis is a tool for bitwise manipulating strings. With this tool, we can use strings as a set of binary arrays. Here’s a simple example:

How do you track the online status of a billion users?

Here we use a string of binaries to record the login status of the billion users. Each bit of binary represents a user, 0 represents not logged in and 1 represents logged in. Bitmaps are used to update the value of the corresponding bit each time you log in or log out, and the result looks something like this:

We use the above binary string to record the login status of the billion users. Why do we do that? The main is to save space, read or update fast

Let’s calculate the amount of storage required for this type of storage:

Billion users: 1 bit space required by each user = 1000000000 Bit = 1000000000/8/1024/1024 = 119 MBCopy the code

MySQL as an example, storage space required:

Assume that there are only two fields: User ID, online Status The user ID is BIGINT and the size is 8 Bytes Online Status The user is TINYINT and the size is: 1 Bytes Required space = 1000000000 x (8 + 1) Bytes = 9000000000 Bytes = 8583 MBCopy the code

The difference is obvious, and it makes sense. Using MySQL or Redis Hash storage, the smallest unit is a byte, which is not comparable to the direct operation bit.

That’s a brief introduction to Redis Bitmaps, and I’ll take a look at the basic commands and applications of Bitmaps.

The operation of Bitmaps

SETBIT

Time complexity: O(1)

SETBIT key offset value
Copy the code

Updates the value at offset of the specified key. Value can only be 0 or 1

Note:

Offset indicates the offset. The maximum value is 2 32-1 (because the maximum value is 512MB, the symbol occupies 1 bit).

2. Allocate memory. After a single allocation, there is no overhead for subsequent allocation of the same key.

3. Return value, which is the value before offset operation.

GETBIT

Time complexity: O(1)

GETBIT key offset 
Copy the code

Returns the bit value stored at offset in the string value of key.

Note:

If the key does not exist or the offset is out of range, the integer 0 is returned

BITCOUNT

Time complexity: O(n)

BITCOUNT key [start end [BYTE|BIT]]
Copy the code

Count the number of 1s in the string

Example: 127.0.0.1:6379> set k1 a OK 127.0.0.1:6379> BITCOUNT k1 (integer) 3 127.0.0.1:6379> set k1 aa OK 127.0.0.1:6379> BITCOUNT k1 (integer) 6 127.0.0.1:6379> BITCOUNT k1 0 0 (integer) 3 127.0.0.1:6379> BITCOUNT k1 0 1 (integer) 6 127.0.0.1:6379> BITCOUNT k1 0-1 (integer) 6Copy the code

Note:

1. The start and end arguments are Byte, not bit, and will not be available until version 7.0.

2. If the key does not exist, 0 is displayed

3. The time complexity is O(n). The n refers to the amount of data between start and end parameters.

BITOP

Time complexity: O(n)

BITOP operation destkey key [key ...]
Copy the code

Performs bitbit operations between multiple keys (containing string values) and stores the results in the target key

Operation includes AND, OR, XOR, AND NOT

Destkey refers to the target key. You can store the following keys in the Destkey by bit

/ / AND, 127.0.0.1:6379> set K1 a OK 127.0.0.1:6379> set K2 AA OK 127.0.0.1:6379> set K3 AAA OK 127.0.0.1:6379> bitop and dk1 K1 k2 k3 (integer) 3 127.0.0.1:6379> get dk1 "a\x00\x00"Copy the code

As shown in the above example, k1, k2, k3, bit-by-bit and after the result is stored in dk1, dk1 \x00 is hexadecimal, a\x00\x00 is converted into binary: 0110 0001 0000 0000 0000 0000 0000.

/ / OR, bitwise OR 127.0.0.1:6379 > bitop OR dk2 k1, k2 k3 (integer) 3 127.0.0.1:6379 > get dk2 "aaa" -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - / / XOR, Bitwise exclusive or 127.0.0.1:6379 > bitop xor dk3 k1, k2 k3 (integer) 3 127.0.0.1:6379 > get dk3 "a \ x00a" -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - / / NOT, \x9e 127.0.0.1:6379> bitop not dk4 k1 (integer) 1 127.0.0.1:6379> get dk4 "\x9e"Copy the code

BITPOS

Time complexity: O(N)

BITPOS key bit [start [end [BYTE|BIT]]]
Copy the code

Returns the first position in the string set to 1 or 0.

Example 127.0.0.1:6379> setbit k1 4 1 (integer) 0 127.0.0.1:6379> setbit k1 13 1 (integer) 0 127.0.0.1:6379> bitpos k1 1 (integer) 4 127.0.0.1:6379> bitpos k1 1 0 0 (integer) 4 127.0.0.1:6379> bitpos k1 1 1 1 1 (integer) 13Copy the code

Note:

1. The start and end arguments refer to Byte. After version 7.0, you can specify Byte or bit.

2. Bitpos, setbit, and getbit follow the same bit position convention.

3. When query 1, all the keys that do not exist or the corresponding range are 0, return -1.

Select * from ‘0’;

K2 = 1111, 1111, k3 does not exist -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- / / do not specify a range, or specify only the start, and the value is 1, this time will be found out the most on the right side of the position of 1 + 1, Can be seen as the right filling 0 127.0.0.1:6379 > BITPOS k2 0 (integer) 8 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- / / do not specify a range or specify only the start, and the key was not found, Returns 0 127.0.0.1:6379 > BITPOS k3 0 0 (integer) -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- / / specified range, no 0 and scope, Return -1 127.0.0.1:6379> BITPOS k2 1 1 1 (integer) -1Copy the code

BITFIFLD

BITFIELD key [GET encoding offset] [SET encoding offset value] [INCRBY encoding offset increment] [OVERFLOW WRAP|SAT|FAIL]
Copy the code

This command treats the Redis string as an array of bits, and can handle specific integer fields with different bit widths and arbitrary non-aligned offsets. This command has get, set, and incrby operations

We can use this command to process the string bitwise, for example:

127.0.0.1:6379> set k1 aaa
OK
Copy the code
a a a
0110, 0001, 0110, 0001, 0110, 0001,

The binary of K1 is shown in the table above. Next we use the BITFIFLD command to manipulate k1

The GET:

// u8 = unsigned + 8 bits; 127.0.0.1:6379> BITFIELD k1 get u8 01) (integer) 97Copy the code
// i8 = signed + 8 bits; 127.0.0.1:6379> BITFIELD k1 get i8 11) (integer) -62Copy the code

SET:

// Change the number of bits from 0 to 7 to 98. 0110, 0010, 127.0.0.1:6379> BITFIELD k1 set U8 0 98 1) (integer) 97 127.0.0.1:6379> get k1 "baa" -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 127.0.0.1:6379 > BITFIELD k1 set u8 # 1 98 / / # 1 means Starting from the second eight 1) (integer) 97 127.0.0.1:6379 > get k1 "bba"Copy the code

INCRBY: increases or decreases

// the value of k1 is 97. K1 "bba" 127.0.0.1:6379> BITFIELD k1 incrby u8 0-1 1) (integer) 97 127.0.0.1:6379> get K1 "ABA" 127.0.0.1:6379> BITFIELD k1 incrby u8 # 1-1 1) (integer) 97 127.0.0.1:6379> get k1 "aaa"Copy the code

There is overflow control when incrementing or decrementing using INCRBY, and Redis provides three behaviors to control overflow:

WRAP: In the case of unsigned integers, a line WRAP is like modulo the maximum value an integer can contain

// Take u8 as an example, unsigned, 8 bits, BITFIELD k1 get u8 0 1) (integer) 97 127.0.0.1:6379> BITFIELD k1 overflow WRAP incrby u8 0 256 1) (integer) 97 127.0.0.1:6379> BITFIELD K1 overflow WRAP incrby u8 0 257 // 97+257 = 97+257-256 = 98 1) (integer) 98 127.0.0.1:6379> BITFIELD k1 overflow WRAP incrby u8 0 200 // 98 + 200 = 298-256 = 42 1) (integer) 42Copy the code

In the signed case, it overflows up to negative values and down to positive values, in the case of i8 127 + 1 to -128

127.0.0.1:6379> set k1 aaa OK 127.0.0.1:6379> BITFIELD k1 get U8 0 1) (integer) 97 127.0.0.1:6379> BITFIELD k1 incrby i8 0 30 1) (INTEGER) 127 127.0.0.1:6379> BITFIELD k1 incrby i8 0 1 1) (integer) -128 127.0.0.1:6379> BITFIELD k1 incrby i8  0 -1 1) (integer) 127Copy the code

SAT: The saturation algorithm is used, that is, the minimum integer value is set for underflow and the maximum integer value is set for overflow.

/ / u8, Max. 255 Min.0 127.0.0.1:6379> set k1 AAA OK 127.0.0.1:6379> BITFIELD k1 get U8 0 1) (integer) 97 127.0.0.1:6379> BITFIELD K1 overflow SAT incrby u8 0 20000 1) (integer) 255 127.0.0.1:6379> BITFIELD k1 overflow SAT incrby u8 0-213123 1) (integer) 0Copy the code

FAIL: In this mode, no action is performed on detected overflows or underflows. The corresponding return value is set to NULL to signal the condition to the caller. That is, an overflow returns NUll.

127.0.0.1:6379> set k1 aaa OK 127.0.0.1:6379> BITFIELD k1 get u8 0 1) (integer) 97 127.0.0.1:6379> BITFIELD k1 overflow FAIL incrby U8 0 200 1) (nil) 127.0.0.1:6379> BITFIELD K1 overflow FAIL incrby u8 0-98 1) (nil)Copy the code

In addition, the above BITFIELD commands can be used together:

127.0.0.1:6379> BITFIELD k1 overflow FAIL incrby u8 0-98 get u8 0 1) (nil) 2) (integer) 97Copy the code

BITFIELD_RO

A read-only variant of the BITFIELD command. It is just like the original BITFIELD, but only accepts GET subcommands and can be safely used for read-only copies.

The application of Bitmaps

In the description above, we introduced Bitmaps to keep track of users’ online status. What else can you do with it?

First, let’s summarize the features of Bitmaps:

  • There are only two states: 0 and 1.
  • It takes up very little memory
  • Extremely fast access speed (set and GET operation time complexity is O(1))

Based on these features, Bitmaps can be used to determine whether data exists in a collection or not. Bitmaps can also be used to record user actions such as logging in or viewing, closing, checking in to a page, etc. Here are some common examples.

Day live statistics

How do I count the number of users who log in to the system every day?

Use Redis Bitmaps, set the system name + date as key, such as ZCY20211215, use the user Id as offset in value, use 0 and 1 to record whether the user has logged in on that day, and set the corresponding bit as 1 for login.

When you do that, you get one bitmap per day, and if you want to get the number of users logged in on a given day, you just use the BITCOUNT operation.

If you want to count the number of users who logged in during the last 30 days, you can use the BITOP AND operation to bitmap the last 30 days.

Bloom filter

The essence of bloom filter is Hash mapping algorithm + Bitmaps.

As shown in the figure, a value enters the Bloom filter and obtains its mapping position on the Bitmap through multiple Hash algorithms. When all positions are 1, the value is considered to be in the filter; otherwise, it is considered not to exist.

Marketing data statistics

Bitmaps can be used in a variety of scenarios in a marketing system:

The use of user portrait information, a user has many labels and unlimited expansion, such as gender, whether programmer, single, renting a house, whether pet, we can consider using Bitmap to store and use.

User actions, such as clicking on ads, browsing to the bottom, and picking up coupons, are stored in a Bitmap, similar to the user’s image above.

Here’s a quick example of how Bitmaps can be used in a marketing system:

Let’s say we have an e-commerce application with 100 million users, and the product has this requirement:

When all male users entered the app’s home page, a pop-up AD for an anti-hair loss health product popped up

The requirement is very simple, one interface is done, the user enters the interface to call to get advertising, the interface queries the database to determine whether it is male, whether to return advertising content, whether to return empty.

At this time, the product felt that this kind of push was not accurate enough, and not all men would lose their hair, so it added conditions: programmers, and our demand became:

When all male programmers entered the app’s home page, a pop-up AD for an anti-hair loss supplement popped up

It is still very simple after adding a condition, the user’s gender and occupation information is most likely to exist in a table, also a query.

However, male programmers have a high probability of hair loss, but they may not all be able to afford hair loss prevention health products. At this time, it is necessary to push a little more accurate, so a new condition is added: The average monthly consumption on the platform is more than 500, which is probably not in the same table with the above two conditions of male and programmer. If the above scheme is used, it will add another DB query, which is slow and costly to the database. At this time, the effect of Bitmaps will be obvious.

The existing conditions are: male, programmer, monthly consumption on the platform more than 500

In this scenario, if you want to use Bitmaps, you need to load the data into Redis first. There is a simple and crude way to do this, which is to directly traverse the database and load the required tag information into Redis:

Public Boolean loadUserLabel() {// User gender bitmap data loading String key = "user_label_sex_male"; List<User> users = userDao.queryUser(); users.forEach( t->{ if (Objects.equals(t.getSex(),"male")) { jedis.setbit(key,t.getId(),"1"); }}); return true; }Copy the code

After the above code, the gender of the user is loaded into the Redis Bitmap, and other tags such as occupation and consumption greater than 500 are similar.

After we have the user label information we need in Redis, we can start to use it. Like the above requirements, we can use the AND operation of the BITOP command to generate a Bitmap that meets the three conditions at the same time: male, programmer AND monthly consumption over 500. Do this when there are too many labels. Here we have three conditions, which can be easily checked in the code three times:

Public Response getHomepageAds(User User) {String maleKey = "user_label_sex_male"; String programmerKey = "user_label_occupation_programmer"; String spendMonth500Key = "user_label_spend_month_500"; // Select 1 from bitmap. If (jedis.getbit(maleKey, user.getid ()) && jedis.getbit(programmerKey, user.getid ()) && Jedis. Getbit (spendMonth500Key, user getId ())) {return "content". } return "no ads "; }Copy the code

The above is a small example of Bitmap application in marketing system, which can be extended on this basis, such as real-time incremental loading of each tag, dynamic configuration of the binding relationship between each AD and tag, automatic generation of tags and so on…

Let’s take a look at the use of Bitmaps for recording user behavior. Now the product has a new requirement:

I wonder how many users clicked on our pop-up AD

After clicking the pop-up AD, the front end sends the request, and the back end records the user’s click status:

Public Response getHomepageAds(User User) {String adActionKey = "homepage_ad_action_open"; jedis.setbit(adActionKey,user.getId(),"1"); }Copy the code

Later, if you want to count the number, you can directly use the BITCOUNT command. Other behaviors are recorded similarly, such as whether to drag to the bottom, whether to stay longer than n seconds, etc., which I won’t go into here.

Collaborative graphics

This example comes from The Redis website, from Reddit’s R/Place game on April Fool’s Day 2017, which is a very interesting Bitmaps application.

Game introduction:

Each player can edit one pixel every five minutes (there are 16 colors to choose from). There is no limit to the number of players who can participate. It will expire after 72 hours.

The game was simple, everyone just had to edit the colors of the pixels, and 17 years of activity resulted in this painting (part of which is here) :

There are a million pixels in this game, according to statistics at least 100,000 people participated in this game, the concurrency is very high, how to ensure availability? Reddit uses Redis’s Bitmap here:

First, the position of pixels in the artboard is defined. X represents the abscissa, y represents the ordinate, and the position of each pixel corresponds to an offset in the Bitmap.

When the user edits pixels, the position information (X, Y) and color information are stored in Bitmap, and the drawing board data is also read directly using Bitmap.Copy the code

The idea is very simple, mainly to solve the following two problems:

1. Mapping between coordinates and Bitmap? How coordinates are converted to Bitmap offset, and how offset is converted to the x and y coordinates in the artboard.

2. How to store the information of a coordinate point directly with Bitmaps? I’ll just store the color here.

Here’s how the project works:

1. Since the total number of pixels is 1 million, the formula is designed: X + 1000y = offset

When storing, convert (x,y) to offset, such as (1,2), then offset = 1 + 2000 = 2001

When read, convert offset to (x,y), such as offset = 9008, then x = 9008% 1000 = 8,y = 9008/1000 = 9

2. Use BITFIELD command of Bitmaps to segment operations:

There are 16 colors to choose from, so each dot needs at least 4 bits, so when using BITFIELD, the BITFIELD for the operation should be u4

This is what it looks like:

Above is a brief introduction to the Bitmaps app. See the implementation and source code on the Reddit team’s blog at the end of this article.

Using the BITFIELD command, you can also determine whether the data is duplicate. For example, if there are 1 billion integers, how can you find the duplicate data? Each number can be given two bits, with 01 indicating a single occurrence and 10 indicating a repeat.

Four, small expansion

How do I use Bitmaps when the user Id is not an incremented Id?

In real development, user ids might not be auto-incremented, such as using the Snowflake algorithm, or ids generated by the UUID tool, where using Bitmaps would not work because the offset is too large. At this time, you can refer to bloom filter and design an Id mapping algorithm to map user ids within a certain range.

How do Bitmaps save space for persistent storage?

Use compression algorithms, such as RLE algorithms

With Bitmaps, there is a lot of continuous repetition of location data, and there is room for compression. For example, if the first 1000 bits are all zeros, you can just store 1000 zeros in a sentence instead of 00000… A thousand of them.

Reference documentation

redis.io/

Redditblog.com/2017/04/13/…

Recommended reading

Guava Cache actual Combat – From scenario to principle analysis

Details of HTTP2.0 and HTTPS protocols

Introduction to the JVM (Take you to the JVM from a different perspective)

, recruiting

Zhengcaiyun Technology team (Zero) is a passionate, creative and executive team based in picturesque Hangzhou. The team has more than 300 r&d partners, including “old” soldiers from Alibaba, Huawei and NetEase, as well as newcomers from Zhejiang University, University of Science and Technology of China, Hangzhou Electric And other universities. Team in the day-to-day business development, but also in cloud native, chain blocks, artificial intelligence, low code platform system, middleware, data, material, engineering platform, the performance experience, visualization technology areas such as exploration and practice, to promote and fell to the ground a series of internal technical products, continue to explore new frontiers of technology. In addition, the team is involved in community building, Currently, There are Google Flutter, SciKit-Learn, Apache Dubbo, Apache Rocketmq, Apache Pulsar, CNCF Dapr, Apache DolphinScheduler, and Alibaba Seata and many other contributors to the excellent open source community. If you want to change something that’s been bothering you, want to start bothering you. If you want to change, you’ve been told you need more ideas, but you don’t have a solution. If you want change, you have the power to make it happen, but you don’t need it. If you want to change what you want to accomplish, you need a team to support you, but you don’t have the position to lead people. If you want to change the original savvy is good, but there is always a layer of fuzzy window…… If you believe in the power of believing, believing that ordinary people can achieve extraordinary things, believing that you can meet a better version of yourself. If you want to be a part of the process of growing a technology team with deep business understanding, sound technology systems, technology value creation, and impact spillover as your business takes off, I think we should talk. Any time, waiting for you to write something and send it to [email protected]

Wechat official account

The article is published synchronously, the public number of political cloud technology team, welcome to pay attention to