Recently, while thinking about real-time data warehouse design, I ran into the problem of storing huge volumes of data in Redis. This article shares an approach worth studying.
1. Background
The application scenario is DMP cache storage. A DMP has to manage a large amount of third-party ID data, including the mapping between each media site's cookies and our own cookie (hereinafter SupperID), the population labels attached to a SupperID, the population labels of mobile terminal IDs (mainly IDFA and IMEI), as well as blacklisted IDs, IPs, and other data.
Storing hundreds of billions of records offline is not difficult with the help of HDFS, but the DMP also needs to serve real-time queries at millisecond latency. Cookie IDs are inherently unstable: the browsing behavior of many real users constantly generates new cookies, and only if the mapping data is synchronized in time can a query hit the DMP's population labels; a high hit rate cannot be achieved through preheating alone. This poses a serious challenge for cache storage.
According to actual tests, conventional storage of more than 5 billion KV records requires over 1 TB of memory; if highly available multiple replicas are needed, the consumption becomes enormous. In addition, KV entries of uneven length produce a lot of memory fragmentation. Solving these problems calls for an ultra-large-scale storage design.
2. What Kind of Data to Store
Population labels mainly cover Cookie, IMEI, and IDFA and their corresponding gender, age, geo, and so on. The mapping relationship mainly refers to the mapping between media cookies and SupperIDs. The following are examples of the data layout:
- PC IDs:
  media ID-media cookie => SupperID
  SupperID => { age => age code, gender => gender code, geo => geographic location code }
- Device IDs:
  imei or idfa => { age => age code, gender => gender code, geo => geographic location code }
Obviously, PC data needs to store both key => value and key => hashmap structures, while Device data only needs a key => hashmap.
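To make the two layouts concrete, here is a minimal sketch using in-memory Java maps in place of the Redis structures; all sample values and label codes are hypothetical, and the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the two storage layouts described above, with plain maps
// standing in for Redis keys and hashes. All sample data is hypothetical.
public class DmpLayoutSketch {
    // PC structure 1: media cookie => SupperID (plain key => value)
    static final Map<String, String> cookieToSupperId = new HashMap<>();
    // PC structure 2: SupperID => {age, gender, geo} (key => hashmap)
    static final Map<String, Map<String, String>> supperIdLabels = new HashMap<>();
    // Device structure: imei/idfa => {age, gender, geo} (key => hashmap only)
    static final Map<String, Map<String, String>> deviceLabels = new HashMap<>();

    static {
        cookieToSupperId.put("media01_c0ffee", "1605242015141689522");
        Map<String, String> labels = new HashMap<>();
        labels.put("age", "3");      // age code
        labels.put("gender", "1");   // gender code
        labels.put("geo", "110000"); // geographic location code
        supperIdLabels.put("1605242015141689522", labels);
        deviceLabels.put("2d131005dc0f37d362a5d97094103633", labels);
    }

    // A PC-side label lookup is two hops: media cookie -> SupperID -> labels.
    static Map<String, String> labelsForCookie(String mediaCookie) {
        String sid = cookieToSupperId.get(mediaCookie);
        return sid == null ? null : supperIdLabels.get(sid);
    }
}
```

Note that the PC path costs two round trips (cookie to SupperID, then SupperID to labels), while the device path is a single hash lookup.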
3. Data Characteristics
- Short keys, short values:
  A SupperID is a 21-digit number, e.g. 1605242015141689522; an IMEI is a lowercase MD5, e.g. 2d131005dc0f37d362a5d97094103633; an IDFA is uppercase with "-", e.g. 51DFFC83-9541-4411-FA4F-356927E39D04.
- The media’s own cookies vary in length;
- Services must cover the full data set: SupperIDs, media mappings, and mobile IDs are each at the billions scale;
- A billion mapping relationships are created every day;
- Hot data can be predicted over large time windows (some cookies are persistent and stable);
- Newly generated mapping data cannot be predicted as hot, since much of it consists of brand-new cookies.
4. Existing Technical Challenges
1) Keys and values of varying length cause memory fragmentation;
2) The large number of pointers leads to a high memory expansion ratio, generally around 7x, a common problem for pure in-memory stores;
3) Although the popularity of cookies can be predicted by their behavior, there are still a lot of newly generated ids every day (the percentage is sensitive and will not be disclosed);
4) Since the service must respond within 100 ms over the public network (domestic public network latency is under 60 ms), in principle all mappings and population labels updated on the current day must be in memory, rather than falling through to cold data at the back end;
5) In business, all data should be kept for at least 35 days or longer in principle;
6) Memory is expensive, so a storage plan for billions or even tens of billions of keys is imperative!
5. Solutions
5.1 Elimination Strategy
One of the main reasons for tight storage is that a lot of new data is added to the repository every day, so it’s important to clean up data in a timely manner. The main approach is to find and retain hot data and eliminate cold data.
The number of internet users is far below tens of billions. IDs have a life cycle and change constantly, so for the most part the IDs we store are actually invalid. On the query side, the front-end logic is ad exposure, which is tied to human behavior, so the accesses of a given ID within a time window (a campaign, half a month, a few months) show a certain repeatability.
Before data initialization, we aggregate and de-duplicate log IDs with HBase, generally setting the TTL to 35 days, so IDs not seen in the last 35 days are cut off automatically. In Redis the expiration is likewise set to 35 days; when a key is accessed and hit, its TTL is renewed, and keys untouched for 35 days are naturally evicted. This works well for stable cookies and IDs. Experience has shown that this renew-on-hit approach is especially practical for IDFA and IMEI, where long-term accumulation achieves a very good hit rate.
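The renew-on-hit policy above can be sketched as follows. In real Redis this is simply an EXPIRE (35 days = 3,024,000 seconds) issued after a successful read; here an in-memory map with explicit deadlines illustrates the mechanics, with the clock passed in so the behavior is easy to verify. The class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the 35-day renew-on-hit eviction policy. Redis provides this
// natively via EXPIRE; this in-memory version only illustrates the idea.
public class TtlRenewSketch {
    static final long TTL_MS = 35L * 24 * 60 * 60 * 1000; // 35 days

    final Map<String, String> values = new HashMap<>();
    final Map<String, Long> deadlines = new HashMap<>();

    void put(String key, String value, long nowMs) {
        values.put(key, value);
        deadlines.put(key, nowMs + TTL_MS);
    }

    // On a hit, extend the deadline by another 35 days. Keys that are never
    // hit simply age out: that is the natural elimination of cold IDs.
    String get(String key, long nowMs) {
        Long deadline = deadlines.get(key);
        if (deadline == null || deadline < nowMs) { // absent or expired
            values.remove(key);
            deadlines.remove(key);
            return null;
        }
        deadlines.put(key, nowMs + TTL_MS);         // renew on hit
        return values.get(key);
    }
}
```

The key property: a stable IDFA/IMEI that is queried at least once every 35 days lives forever, while a throwaway cookie disappears after one TTL window.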
5.2 Reducing Expansion
The size of the hash table and the number of keys determine the collision rate (or, viewed the other way, the load factor). Within a reasonable range, the more keys, the larger the hash table, and the more memory consumed. Add to that the large number of pointers, which are themselves long integers, and the memory expansion is considerable. So let's discuss how to reduce the number of keys.
Let's start with a storage structure. We expect key1 => value1 to be stored in Redis, so we can do this: first, a fixed-length hash derived from MD5(key1) is used as the Redis key, called the BucketId, and key1 => value1 is stored inside a hash (hashmap) structure under that BucketId. During a query, the client recomputes the hash the same way and can thus look up value1.
The change: get(key1) becomes hget(md5(key1), key1), which returns value1.
If we plan the BucketId space in advance so that many keys collide, we can deliberately hang multiple keys under one BucketId. With an average of 10 keys per BucketId, we can theoretically reduce the number of Redis keys by more than 90%.
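A minimal sketch of the hget(md5(key1), key1) pattern, with a nested map standing in for Redis hashes. For brevity it folds the first four MD5 bytes into a bucket index modulo the bucket count, which is a simplification of the article's bit-level getBucketId; the class name and bucket count are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

// Sketch of bucketed storage: get(key1) -> hget(md5(key1), key1).
// The outer map key is the BucketId; many logical keys share one bucket.
public class BucketSketch {
    final Map<String, Map<String, String>> store = new HashMap<>();
    final int buckets; // size of the BucketId space

    BucketSketch(int buckets) { this.buckets = buckets; }

    String bucketId(String key) {
        try {
            byte[] md = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            // Fold the first 4 digest bytes into a non-negative bucket index.
            int h = ((md[0] & 0xff) << 24) | ((md[1] & 0xff) << 16)
                  | ((md[2] & 0xff) << 8) | (md[3] & 0xff);
            return Integer.toString(Math.floorMod(h, buckets));
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available
        }
    }

    void set(String key, String value) { // ~ hset(bucketId(key), key, value)
        store.computeIfAbsent(bucketId(key), b -> new HashMap<>()).put(key, value);
    }

    String get(String key) {             // ~ hget(bucketId(key), key)
        Map<String, String> bucket = store.get(bucketId(key));
        return bucket == null ? null : bucket.get(key);
    }
}
```

With 1,000 logical keys and a 100-bucket space, the store holds at most 100 top-level keys, i.e. at least a 90% reduction in Redis key count.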
The implementation is a little tricky, and you need to work out the capacity scale before committing. The MD5 we usually see is a 32-character hex string, occupying 128 bits of space, which is far more than we need; we need to store billions of entries, which is about 33 bits, so we need a mechanism that hashes down to the right number of bits. To save memory, we fill the key with all character types (ASCII codes 0 to 127) instead of a hex string, so the key length can be cut in half.
Here's how to do it:
public static byte[] getBucketId(byte[] key, int bit) {
    try {
        MessageDigest mdInst = MessageDigest.getInstance("MD5");
        mdInst.update(key);
        byte[] md = mdInst.digest();
        // Only 7 bits of each byte are usable, so every byte stays a single ASCII character (0-127).
        byte[] r = new byte[(bit - 1) / 7 + 1];
        int a = (int) Math.pow(2, bit % 7) - 2;
        md[r.length - 1] = (byte) (md[r.length - 1] & a);
        System.arraycopy(md, 0, r, 0, r.length);
        for (int i = 0; i < r.length; i++) {
            if (r[i] < 0) r[i] &= 127; // clear the sign bit to stay within ASCII
        }
        return r;
    } catch (NoSuchAlgorithmException e) {
        throw new IllegalStateException(e); // MD5 is always available on the JVM
    }
}
The bit argument determines the size of the final BucketId space, drawn from a discrete set: the integer powers of 2. Why are only 7 bits of each byte available? Because Redis keys here must be stored as ASCII characters (0 to 127), not as arbitrary byte arrays. If we plan for storage at the ten-billion scale with 10 KV pairs sharing each bucket, then we only need 2^30 = 1073741824 buckets, which is the final number of Redis keys.
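The sizing arithmetic can be sketched as a few lines; the 10-billion figure is illustrative (the article only says "billions"), and the class and method names are invented.

```java
// Sizing the BucketId space: total keys divided by keys-per-bucket, rounded
// up to a power of two, gives the bucket count and hence the bit argument.
public class BucketSizing {
    static long bucketsFor(long totalKeys, long keysPerBucket) {
        long needed = totalKeys / keysPerBucket;
        long b = 1;
        while (b < needed) b <<= 1; // round up to a power of two
        return b;
    }

    public static void main(String[] args) {
        long buckets = bucketsFor(10_000_000_000L, 10);
        // 10 billion keys at 10 per bucket -> 2^30 = 1073741824 buckets,
        // i.e. bit = 30, packed into (30-1)/7 + 1 = 5 ASCII bytes per key.
        System.out.println(buckets);
        System.out.println(Long.numberOfTrailingZeros(buckets));
    }
}
```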
5.3 Fragmentation Reduction
Fragmentation is mainly caused by failed memory alignment and by memory that cannot be reallocated after expired keys are deleted. With the method described above, we store population labels and mapping data so that every Redis key has the same length. We also optimized the fields inside each hashmap by truncating the cookie or deviceId to its last six characters as the field name, which again keeps memory aligned. In theory collisions are possible, but the probability of two identical suffixes in the same bucket is extremely low: treating IDs as near-random strings, (the probability that 10 random IDs in a bucket share a suffix of this length) x (the number of buckets) gives an expected conflict count << 0.05, so a single conflicting sample is an event of minimal probability, and this expectation can be tuned by adjusting the retained suffix length. Values store only the age, gender, and geo codes, packed into three bytes.
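The suffix truncation and the collision expectation can be sketched as follows; the alphabet size of 16 assumes hex IDs like IMEI MD5s, and the class and method names are invented for illustration.

```java
// Sketch of the field-truncation trick: inside a bucket, store the hash
// field as only the last 6 characters of the cookie/device ID. Fields
// become fixed-length (good for alignment), and collisions inside a
// 10-key bucket are extremely unlikely for near-random IDs.
public class SuffixField {
    static String field(String id) {
        return id.length() <= 6 ? id : id.substring(id.length() - 6);
    }

    // Expected collisions inside one bucket of n random hex suffixes of
    // length 6: roughly (n choose 2) / 16^6 (birthday-bound estimate).
    static double expectedCollisionsPerBucket(int keysPerBucket) {
        double pairs = keysPerBucket * (keysPerBucket - 1) / 2.0;
        return pairs / Math.pow(16, 6);
    }
}
```

For 10 keys per bucket this is about 45 / 16,777,216, roughly 2.7e-6 per bucket, comfortably below the article's stated threshold even after multiplying by the bucket count's sampling effect is considered per bucket.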
Another low-level but effective way to reduce fragmentation is to restart a slave and then force a failover, which amounts to defragmenting the master's memory.
For memory allocation, Google's tcmalloc and Facebook's jemalloc are recommended; they reduce fragmentation and memory consumption when values are not large. For very large values, libc is more economical.