This article has participated in the “Digitalstar Project” and won a creative gift package to challenge the creative incentive money.
When it comes to bloem filters, redis,
Redis, as the mainstream noSQL database, has attracted much attention. Its rich value types, as well as its bias toward computing to move attributes to data, reduce the IO cost problem. Favored by developers. We usually use Redis as a data cache, but as a cache Redis has some problems, namely cache penetration, breakdown, avalanche, consistent double write. The main explanation of this is the penetration problem
First of all, let’s think about why the problem of penetration arises.
Suppose we have some data stored in MySQL, but due to the large number of users, we need to perform a filtering and interception in Redis when users access data. If there is data in REIDS, it is allowed, and if there is no data, it is directly rejected. So that the user will not be too much to operate the database, reduce the pressure of the database. But there’s a problem:
- How do we ensure redis to make a judgment when the user brings the data to us? At this time, we need to write an algorithm to disassemble the user’s data and calculate and compare the existing data in REDis. So far, we have theoretically solved the problem of data filtering.
- How do you store MySQL data in Redis? Do you store all data in Redis? If so, then redis memory based noSQL database, there is no way to store that much data. This is where redis comes in
bitmap
Type of. To store the data.
The so-called bitmap uses 1Bit to mark the corresponding value of the element, and the key is the element. Imagine that 1 bytes is 8 bits, so 1 KB is 8192Bit, 1M is 8388608Bit. It is not a problem to process large amounts of data using REids’ bitmaps
Solution to cache penetration
With that in mind: the overall solution looks like this
- First, we need to use the algorithm to load the data that needs to be cached into the Redis bitmap when the project starts
- Then write an algorithm to disassemble the data for user access and compare whether the data exists in redis bitmap. Release exists to prevent direct return.
Illustration:
The above scheme may also have a missing problem, accidentally hit through the past, this situation is not absent. However, we can add a key to redis after passing through, marking the error and preventing it from passing through again
Pay attention to
>1% (No Silver Bullet)
- Do you have any
- Some bitmap markup
- Requests may be mismarked
- However, the probability of data pass penetration is reduced
- The cost is low
To sum up: What redis tells you does not exist must not exist, 100%; But redis tells you there is, but not always
Now that the general idea of the solution is clear, let’s organize the solution:
Cache penetration solution
There are three possible solutions:
First of all, we will implement the first one :(the client implements bloom algorithm and carries bitmap by itself)
Plan a
Public class test {// Bitmap length public static final int NUM_SLOTS = 1024 * 1024 * 8; Public static final int NUM_HASH = 8; Private static BigInteger bits = new BigInteger("0"); private static BigInteger bits = new BigInteger("0"); Private static void addElement(String String) {// addElement to position 1 on bitmap // use hash function to calculate hash value: loop eight times for(int I = 0; i < NUM_HASH; i++){ int bit = hash(string, i); if(! TestBit (bit)){// The BigInteger object must run on another BigInteger object // The left shift will correspond to the bitmap position of 1 bits = bits.or(new BigInteger("1").shiftLeft(bit)); } } } private static int hash(String message, Int index) {// Other hash functions can be used here to calculate the hash value without changing the final result. try { MessageDigest md5 = MessageDigest.getInstance("md5"); byte bytes[] = message.getBytes(); md5.update(bytes); byte bits[] = md5.digest(); BigInteger bi = new BigInteger(bits); return Math.abs(bi.intValue()) % NUM_SLOTS; } catch (NoSuchAlgorithmException e) { e.printStackTrace(); } return -1; } private static Boolean check(String String) {private static Boolean check(String String) {for(int I = 0; i < NUM_HASH; i++){ int index = hash(string, i); if(! bits.testBit(index)){ return false; } } return true; }}Copy the code
Scheme 2
@Component @Slf4j public class RedisBloomUtil { @Autowired private JedisPool jp; @PostConstruct public void initJedis(){ jedisPool = jp; } private static JedisPool jedisPool; private static Jedis jedis = null; Private static long n = 1000000L; private static long n = 1000000L; Private static double FPP = 0.01f; private static double FPP = 0.01f; Private static Long numBits = optNumOfBits(n, FPP); Private static int hashNum = optNumOfHashFunction(n,numBits); @return */ public long getCount(){jedis = jedispool.getResource (); Pipeline pipeline = jedis.pipelined(); Response<Long> newsInfo = pipeline.bitcount(RedisConfig.newsCacheKey); pipeline.sync(); Long count = newsInfo.get(); pipeline.close(); return count; } public static Boolean isExist(String where,String);} public static Boolean isExist(String where,String);} public static Boolean isExist(String where,String) key){ jedis = jedisPool.getResource(); long[] indexs = getIndex(key); boolean flag; Pipeline = jedis.pipelined(); pipelined = jedis.pipelined(); try { for (long index:indexs) { pipeline.getbit(where,index); } flag = ! pipeline.syncAndReturnAll().contains(false); } finally { pipeline.close(); } // if it does not exist, put it into redis bitmap. flag){ // putRedis(where,key); // } return flag; } @param where @param key */ public static void putRedis(String where String key){jedis = jedisPool.getResource(); long[] indexs = getIndex(key); Pipeline = jedis.pipelined(); try { for (long index: indexs) { pipeline.setbit(where,index,true); } pipeline.sync(); } finally {pipeline.close(); } Long ttl = jedis.ttl(where); / / set the key if the expiration date of the 30 days (TTL = = 1 | | TTL = = 2) {jedis. Expire (where, 2592000); Public static long[] getIndex(String key){long hash1 = hashOpt(key); long hash2 = hash1 >>>16; long[] result = new long[hashNum]; for (int i = 0; i < hashNum; i++) { long combinedHash = hash1 + i * hash2; if (combinedHash<0){ combinedHash = ~combinedHash; } result[i] = combinedHash % numBits; } return result; } /** * Get a hash from guava * @param key * @return */ public static long hashOpt(String key){Charset Charset.forName("UTF-8"); return Hashing.murmur3_128().hashObject(key, Funnels.stringFunnel(charset)).asLong(); Private static long optNumOfBits(long n,double FPP){if (FPP ==) 0){ fpp = Double.MAX_VALUE; } return (long) (-n * Math.log(fpp) / (Math.log(2) *Math.log(2)) ); @param * @param numBits * @return */ private static int optNumOfHashFunction(long n, long numBits){ return Math.max(1,(int) Math.round((double) numBits/n * Math.log(2))); }}Copy the code
Dependency on the requirements of option II:
Clients </groupId> <artifactId>jedis</artifactId> <version>3.3.0</version> </dependency>Copy the code
All right, time is limited. There is no explanation for plan three. Redis itself now supports bloom filters. If I have time, I will write about plan three.
Interested partners can meet you on wechat search code to get more exciting content.