One, what is a bigkey
In Redis, a string can be up to 512MB, and a secondary data structure (such as a hash, list, set, or zset) can hold about 4 billion (2^32 - 1) elements. In practice, however, I would consider a key a bigkey in either of the following cases:
- String type: big because the single value is large. A value over 10KB is generally considered a bigkey.
- Non-string types: hashes, lists, sets, and sorted sets are big because they contain too many elements.
Two, the harm
A bigkey is to Redis what the proverbial rat dropping is to a pot of porridge: one is enough to spoil everything.
1. Uneven memory distribution
For example, in Redis Cluster, bigkeys make memory usage uneven across nodes, so the cluster cannot manage memory in a unified way, and data may even be lost.
2. Timeouts and blocking
Because Redis executes commands on a single thread, operations on a bigkey are usually time-consuming, which means Redis is more likely to block. That in turn can cause client timeouts or even failover, and these operations typically show up as slow queries.
For example, if you find a key like this in Redis, you can expect a call from your DBA:
```
127.0.0.1:6379> hlen big:hash
(integer) 2000000
127.0.0.1:6379> hgetall big:hash
1) "a"
2) "1"
...
```
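Such operations can be confirmed in the slow log; for example, slowlog get returns the most recent slow commands (the argument below is illustrative):

```
127.0.0.1:6379> slowlog get 3
```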
3. Network congestion
If a bigkey is 1MB and it is accessed 1000 times per second, that generates 1000MB of traffic per second, a disaster for a server with an ordinary gigabit network card (128MB/s in bytes). Worse, servers are generally deployed with multiple Redis instances per machine, which means one bigkey may drag down other instances as well, with unpredictable consequences.
4. Expiration deletion
A bigkey may be doing its job quietly (serving only simple commands such as hget, lpop, zscore), but if it has a TTL, it gets deleted when it expires. Without Redis 4.0's asynchronous expiration (lazyfree-lazy-expire yes), that deletion can block Redis. Worse, this expiration deletion does not show up in the slow log on the master node, because it is not triggered by a client but by an internal cycle event; it can only be observed through the latency monitor or the slow log on a slave node.
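For reference, these are the relevant redis.conf options on Redis 4.0+ (they all default to no; a minimal sketch, enable them according to your own deployment):

```
lazyfree-lazy-expire yes      # free expired keys in a background thread
lazyfree-lazy-eviction yes    # free evicted keys asynchronously
lazyfree-lazy-server-del yes  # async free for implicit DEL (e.g. RENAME)
```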
5. Migration difficulties
Redis Cluster moves keys with the migrate command, which is essentially dump + restore + del bundled together. If the key is too big, the migration may fail, and a slow migrate blocks Redis on both ends.
Three, how is it produced?
In general, a bigkey is the result of poor program design, or of failing to anticipate the data volume. Here are a few examples:
(1) Social: follower lists. Without careful design, the follower list of a celebrity or big-V account is bound to become a bigkey.
(2) Statistics: for example, a set that stores the users of a feature or website per day. Unless very few people use it, it is bound to become a bigkey.
(3) Caching: loading data from the database, serializing it, and putting it into Redis. This approach is very common, but two points deserve attention:
- First, is it really necessary to cache all the fields?
- Second, is unrelated data being dragged in along with it?
For example, I once saw a case where a developer cached all the video information of a celebrity's albums in one huge JSON, which grew to 6MB. When that celebrity later posted an official announcement, the surge of reads on that single key turned it into a disaster.
Four, how to discover
1. redis-cli --bigkeys
redis-cli provides a --bigkeys option for finding bigkeys, for example:
```
-------- summary -------

Biggest string found 'user:1' has 5 bytes
Biggest   list found 'taskflow:175448' has 97478 items
Biggest    set found 'redisServerSelect:set:11597' has 49 members
Biggest   hash found 'loginUser:t:20180905' has 863 fields
Biggest   zset found 'hotkey:scan:instance:zset' has 3431 members

40 strings with 200 bytes (00.00% of keys, avg size 5.00)
2747619 lists with 14680289 items (99.86% of keys, avg size 5.34)
2855 sets with 10305 members (00.10% of keys, avg size 3.61)
13 hashs with 2433 fields (00.00% of keys, avg size 187.15)
830 zsets with 14098 members (00.03% of keys, avg size 16.99)
```
--bigkeys reports the single biggest key of each data structure, along with the key count and average size for each data type.
--bigkeys is handy for troubleshooting, but keep a few things in mind when using it:
- Run it against a slave node where possible, because --bigkeys is implemented with scan.
- Run it on the local machine to reduce network overhead.
- If there is no slave node, throttle it with the -i argument (e.g. -i 0.1 sleeps 100 milliseconds per 100 scan commands); see the invocation sketch below.
- --bigkeys only reports the top 1 key of each data structure, so if one data structure contains many bigkeys, it cannot solve the whole problem.
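A typical throttled invocation against a slave node (host and port are illustrative):

```
$ redis-cli -h 127.0.0.1 -p 6380 --bigkeys -i 0.1
```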
2. debug object
Here is another common scenario:
"Hello, please help me find all keys larger than 10KB in Redis."
"Hello, please help me find all hash keys with more than 5000 fields in Redis."
Redis provides the debug object ${key} command to inspect a key:
```
127.0.0.1:6379> hlen big:hash
(integer) 5000000
127.0.0.1:6379> debug object big:hash
Value at:0x7fda95b0cb20 refcount:1 encoding:hashtable serializedlength:87777785 lru:9625559 lru_seconds_idle:2
(1.08s)
```
Here serializedlength is the byte count of the key's value after serialization. For strings, strlen gives the value length directly:

```
127.0.0.1:6379> strlen key
(integer) 947394
```
So you can scan all the keys in Redis, run debug object on each, and pick out those above your threshold.
Note the following when using debug object:
- debug object on a bigkey can itself be slow and block Redis.
- It is recommended to run it on a slave node.
- It is recommended to run it on the local machine.
- If you do not care about the exact byte count, you can replace it entirely with scan + strlen / hlen / llen / scard / zcard, all of which are O(1); see the sketch below.
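A minimal sketch of that scan-based approach with Jedis (the 10KB and 5000-element thresholds are illustrative, not recommendations):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class BigKeyScanner {
    public void findBigKeys() {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        String cursor = "0";
        do {
            ScanResult<String> scanResult = jedis.scan(cursor, new ScanParams().count(100));
            cursor = scanResult.getStringCursor();
            for (String key : scanResult.getResult()) {
                String type = jedis.type(key);
                long size;
                switch (type) {
                    case "string": size = jedis.strlen(key); break;
                    case "hash":   size = jedis.hlen(key);   break;
                    case "list":   size = jedis.llen(key);   break;
                    case "set":    size = jedis.scard(key);  break;
                    case "zset":   size = jedis.zcard(key);  break;
                    default:       continue;
                }
                // 10KB for strings, 5000 elements for collections
                boolean big = "string".equals(type) ? size > 10 * 1024 : size > 5000;
                if (big) {
                    System.out.println("bigkey candidate: " + key + " type=" + type + " size=" + size);
                }
            }
        } while (!"0".equals(cursor));
    }
}
```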
3. memory usage
The debug object approach above can be dangerous, and serializedlength is not accurate (it is the serialized size, not the memory footprint). Is there something more accurate? Redis 4.0 provides the memory usage command, which calculates the number of bytes a key occupies (both the data itself and the associated pointer overhead; see the related article for details). For example, here is the result of one execution:
```
127.0.0.1:6379> memory usage big:hash
(integer) 318663444
```
The current database holds only this one key, and total memory consumption is about 400MB, so memory usage (about 300MB) is far more accurate than debug object (about 85MB):
```
127.0.0.1:6379> dbsize
(integer) 1
127.0.0.1:6379> hlen big:hash
(integer) 5000000
# about 300MB
127.0.0.1:6379> memory usage big:hash
(integer) 318663444
# about 85MB
127.0.0.1:6379> debug object big:hash
Value at:0x7fda95b0cb20 refcount:1 encoding:hashtable serializedlength:87777785 lru:9625814 lru_seconds_idle:9
(1.06s)
127.0.0.1:6379> info memory
# Memory
used_memory_human:402.16M
```
If you are using Redis 4.0+, you can combine scan with memory usage (pipelined for speed); the good news is that memory usage itself executes quickly. Running it against a slave node and locally is, of course, still recommended. A sketch follows.
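A minimal sketch of scan + memory usage, assuming a Jedis version that exposes memoryUsage and getCursor (3.0+); the threshold parameter is illustrative, and pipelining the memoryUsage calls is left out for brevity:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class MemoryUsageScanner {
    public void findBigKeysByMemory(long thresholdBytes) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        String cursor = "0";
        do {
            ScanResult<String> scanResult = jedis.scan(cursor, new ScanParams().count(100));
            cursor = scanResult.getCursor();
            for (String key : scanResult.getResult()) {
                Long bytes = jedis.memoryUsage(key);  // MEMORY USAGE <key>
                if (bytes != null && bytes > thresholdBytes) {
                    System.out.println("bigkey: " + key + " bytes=" + bytes);
                }
            }
        } while (!"0".equals(cursor));
    }
}
```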
4. The client
If you want to catch bigkeys in real time, one option is to modify the Redis source code; another is to modify the client. Taking Jedis as an example, you can add a detection hook at the entry point where Jedis reads results:
```java
protected Object readProtocolWithCheckingBroken() {
    Object o = null;
    try {
        o = Protocol.read(inputStream);
        return o;
    } catch (JedisConnectionException exc) {
        UsefulDataCollector.collectException(exc, getHostPort(), System.currentTimeMillis());
        broken = true;
        throw exc;
    } finally {
        if (o != null) {
            if (o instanceof byte[]) {
                byte[] bytes = (byte[]) o;
                if (bytes.length > threshold) {
                    // Do whatever fits here, e.g. collect the key and report it to ELK
                }
            }
        }
    }
}
```
5. Monitoring and alarms
Redis exposes the size of each client's input buffer and the length of its output buffer (info clients reports the largest values across all clients); abnormal values there often indicate a bigkey being written or read, and are worth alerting on.
To pin down the specific client, use the client list command:
```
redis-cli client list
id=3 addr=127.0.0.1:58500 fd=8 name= age=3978 idle=25 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=26263554 events=r cmd=hgetall
```
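A minimal monitoring sketch that scans the client list output for oversized output buffers; the omem threshold is an illustrative parameter:

```java
import redis.clients.jedis.Jedis;

public class ClientBufferMonitor {
    // Alert on clients whose output buffer (omem) exceeds the threshold,
    // which often means a bigkey is being read (e.g. a huge hgetall)
    public void checkClientBuffers(Jedis jedis, long omemThresholdBytes) {
        for (String client : jedis.clientList().split("\n")) {
            for (String kv : client.split(" ")) {
                if (kv.startsWith("omem=")) {
                    long omem = Long.parseLong(kv.substring("omem=".length()));
                    if (omem > omemThresholdBytes) {
                        System.out.println("suspicious client: " + client);
                    }
                }
            }
        }
    }
}
```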
6. Modify the source code
This can certainly be done, but the cost is relatively high, so it is not practical for most companies.
Recommended best practices:
- On the Redis side and the client side: --bigkeys for ad-hoc checks, scan for long-term inspection (run locally where possible), and real-time monitoring in the client.
- Keep monitoring and alarms in place.
- Use debug object sparingly.
- Build all of this into a data platform instead of running it by hand.
- Keep reminding developers of the dangers of bigkeys.
Five, how to delete
If you find a bigkey and it is junk data, can you just del it directly? For strings, deletion speed is acceptable. For secondary data structures, however, deletion gets slower as the element count and per-element size grow, with a hidden risk of blocking Redis. It is therefore recommended to delete them progressively using hscan, ltrim, sscan, and zscan.
If you are using Redis 4.0+, asynchronous deletion with unlink already solves this, and the rest of this section can be skipped.
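unlink returns immediately and reclaims the memory in a background thread:

```
unlink bigkey
```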
1. string
In general, using the del command for strings does not block.
```
del bigkey
```
2. hash
Use the hscan command to fetch a batch of field-value pairs (say 100) at a time, then delete those fields with hdel (a pipeline can speed this up).
```java
public void delBigHash(String bigKey) {
    Jedis jedis = new Jedis("127.0.0.1", 6379);
    String cursor = "0";
    while (true) {
        ScanResult<Map.Entry<String, String>> scanResult =
                jedis.hscan(bigKey, cursor, new ScanParams().count(100));
        // Get the new cursor after each scan
        cursor = scanResult.getStringCursor();
        List<Map.Entry<String, String>> list = scanResult.getResult();
        if (list != null && !list.isEmpty()) {
            // Delete a batch of fields at once
            jedis.hdel(bigKey, getFieldsFrom(list));
        }
        // Cursor "0" means the full scan is complete
        if (cursor.equals("0")) {
            break;
        }
    }
    // Finally delete the key itself
    jedis.del(bigKey);
}

private String[] getFieldsFrom(List<Map.Entry<String, String>> list) {
    List<String> fields = new ArrayList<String>();
    for (Map.Entry<String, String> entry : list) {
        fields.add(entry.getKey());
    }
    return fields.toArray(new String[fields.size()]);
}
```
3. list
Redis does not provide an lscan-style command for traversing lists, but it does provide ltrim, which can strip elements off the list step by step until it is gone.
```java
public void delBigList(String bigKey) {
    Jedis jedis = new Jedis("127.0.0.1", 6379);
    long llen = jedis.llen(bigKey);
    int counter = 0;
    int left = 100;
    while (counter < llen) {
        // Trim 100 elements off the head each round
        jedis.ltrim(bigKey, left, llen);
        counter += left;
    }
    // Finally delete the key
    jedis.del(bigKey);
}
```
4. set
Use the sscan command to fetch a batch of members (say 100) at a time, then delete them with srem.
```java
public void delBigSet(String bigKey) {
    Jedis jedis = new Jedis("127.0.0.1", 6379);
    String cursor = "0";
    while (true) {
        ScanResult<String> scanResult = jedis.sscan(bigKey, cursor, new ScanParams().count(100));
        // Get the new cursor after each scan
        cursor = scanResult.getStringCursor();
        List<String> list = scanResult.getResult();
        if (list != null && !list.isEmpty()) {
            jedis.srem(bigKey, list.toArray(new String[list.size()]));
        }
        // Cursor "0" means the full scan is complete
        if (cursor.equals("0")) {
            break;
        }
    }
    // Finally delete the key
    jedis.del(bigKey);
}
```
5. sorted set
Use the zscan command to fetch a batch of members (say 100) at a time, then delete them with zrem; alternatively, delete by rank with zremrangebyrank.
```java
public void delBigSortedSet(String bigKey) {
    Jedis jedis = new Jedis(HOST, PORT);
    String cursor = "0";
    while (true) {
        ScanResult<Tuple> scanResult = jedis.zscan(bigKey, cursor, new ScanParams().count(100));
        // Get the new cursor after each scan
        cursor = scanResult.getStringCursor();
        List<Tuple> list = scanResult.getResult();
        if (list != null && !list.isEmpty()) {
            jedis.zrem(bigKey, getMembers(list));
        }
        // Cursor "0" means the full scan is complete
        if (cursor.equals("0")) {
            break;
        }
    }
    // Finally delete the key
    jedis.del(bigKey);
}

private String[] getMembers(List<Tuple> list) {
    List<String> members = new ArrayList<String>();
    for (Tuple tuple : list) {
        members.add(tuple.getElement());
    }
    return members.toArray(new String[members.size()]);
}

public void delBigSortedSet2(String bigKey) {
    Jedis jedis = new Jedis(HOST, PORT);
    long zcard = jedis.zcard(bigKey);
    int counter = 0;
    int incr = 100;
    while (counter < zcard) {
        // Remove the 100 lowest-ranked members each round
        jedis.zremrangeByRank(bigKey, 0, incr - 1);
        counter += incr;
    }
    // Finally delete the key
    jedis.del(bigKey);
}
```
Six, how to optimize
1. Split
- Big list: split into list_1, list_2, ... list_N.
- Big hash: shard the data a second time, e.g. route each field with hash(field) % 100 into one of 100 smaller hashes (see the sketch below).
- Date-keyed data: key_20190320, key_20190321, key_20190322, ...
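A minimal sketch of that hash-sharding idea with Jedis; the shard count and key naming are illustrative:

```java
import redis.clients.jedis.Jedis;

public class ShardedHash {
    private static final int SHARDS = 100;

    // Route a field to one of SHARDS sub-hashes, e.g. big:hash -> big:hash:37
    private String shardKey(String bigKey, String field) {
        int shard = (field.hashCode() & 0x7fffffff) % SHARDS;
        return bigKey + ":" + shard;
    }

    public void hsetSharded(Jedis jedis, String bigKey, String field, String value) {
        jedis.hset(shardKey(bigKey, field), field, value);
    }

    public String hgetSharded(Jedis jedis, String bigKey, String field) {
        return jedis.hget(shardKey(bigKey, field), field);
    }
}
```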
2. Local cache
Using a local cache reduces the number of trips to Redis and limits the damage, but note that it brings its own local overhead (for example, off-heap memory used for serialization, and the cost of serializing a bigkey in the first place). A sketch follows.
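A minimal local-cache sketch in front of Redis, assuming a single JVM and a hand-rolled TTL check; a real project would use a cache library such as Guava or Caffeine, and TTL_MS is an illustrative value:

```java
import java.util.AbstractMap;
import java.util.concurrent.ConcurrentHashMap;
import redis.clients.jedis.Jedis;

public class LocalCachedReader {
    private static final long TTL_MS = 5000;
    private final ConcurrentHashMap<String, AbstractMap.SimpleEntry<Long, String>> cache =
            new ConcurrentHashMap<>();

    public String get(Jedis jedis, String key) {
        AbstractMap.SimpleEntry<Long, String> entry = cache.get(key);
        // Serve a hot (possibly big) key locally while it is still fresh
        if (entry != null && System.currentTimeMillis() - entry.getKey() < TTL_MS) {
            return entry.getValue();
        }
        // Otherwise fall back to Redis and refresh the local copy
        String value = jedis.get(key);
        cache.put(key, new AbstractMap.SimpleEntry<>(System.currentTimeMillis(), value));
        return value;
    }
}
```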
Seven, conclusion
Because developers understand Redis to different degrees, bigkeys are hard to avoid entirely in real development; what matters is detecting them in time through a reasonable mechanism and handling them properly. As developers, we should not use Redis crudely in business code, and should be more deliberate when selecting and designing data structures. When a bigkey does appear, consider whether some optimization (such as a secondary index) could make it disappear from the business. If a bigkey is unavoidable, also ask whether every access really needs to pull everything out (sometimes hmget is enough instead of hgetall), and apply the same care to deletion: do it in an elegant, progressive way.