Business scenarios for Redis

Interviewer: I see Redis is used in your projects. Which business scenarios do you use Redis in?

Redis is used across many of our services; to name a few:

  • Our app's product categories (primary, secondary, and tertiary) are stored in Redis
  • Flash-sale (seckill) product information is stored in Redis
  • Data that expires after a fixed period, such as users' SMS verification codes, is stored in Redis
  • Access tokens for all third-party applications
  • We also use Redis for distributed locking, for example during inventory deduction
  • Other data that is accessed frequently but modified rarely within a given period is also stored in Redis

Interviewer: Do you have any basis for this? (Why is this data stored in Redis?)

Generally speaking, this data is modified infrequently but accessed very frequently: every user browsing products has to read it. Given that Redis can serve 100,000+ QPS, keeping it in Redis is far better than hitting MySQL. Besides, categories usually have multiple levels and a complex structure; if we read them from MySQL directly, we would have to query and assemble them in code on every request. Redis supports rich data types, so we assemble the structure in advance, store it in Redis, and use it directly on retrieval.

Storing flash-sale product information in Redis is a given: a flash-sale system faces enormous traffic, and Redis's 100,000+ QPS copes with it far better; MySQL simply cannot hold up. As for phone verification codes, they are typically valid for only 5-10 minutes, so we set an expiration time on them in Redis. Other scenarios are similar. In short, any data behind high-frequency, high-throughput interfaces is a candidate for Redis. Of course, we only use Redis as a cache middleware between the application and MySQL; most data still lives in MySQL.

Redis data types

Interviewer: You just said that Redis supports a wide range of data types. What data types have you used?

There are five common data types: string, hash, list, set, and zset

  • String: the simplest and most used type. It stores strings, implemented internally as a character array.
  • List: similar to LinkedList in Java. It is a linked list, and as we all know, inserting into or deleting from a linked list at a known position is very fast, with O(1) time complexity.
  • Hash (dictionary): similar to Java's HashMap, and it works in almost the same way: when a hash collision occurs, the colliding elements are chained together in a linked list. Note that in Java code, if you put an object into a hash value, it is actually serialized into a byte array first.
  • Set: the Java equivalent of a HashSet, where values are unordered and unique. The internal implementation is a special dictionary in which every value is NULL.
  • Zset (sorted set): somewhat like a combination of Java's SortedSet and HashMap. On the one hand it is a set, guaranteeing that values are unique; on the other hand it assigns each value a score, which serves as its sorting weight.

Interviewer: Anything else?

(Inwardly: this interviewer's level is unexpectedly high; I'm afraid I can't bluff my way through…) Yes, some of our services also use Bitmap, HyperLogLog, Bloom filter, and Geo.

  • bitmap

We have a user check-in service and want to display 365 days of check-in records. Originally each day was stored as a string:

stringRedisTemplate.opsForValue().set("userId:year:month:day", "1"); // "1" means checked in, "0" means not checked in

Considering that checked-in/not-checked-in is just a flag, storing a whole string is wasteful, since a string occupies multiple bytes. Later we switched to a bitmap to save space. A bitmap is stored bitwise: it holds consecutive binary 0s and 1s, and 1 byte = 8 bits. So a user's 365-day check-in record needs only 365 bits, which fits in 368 bits = 46 bytes, plus one slightly longer string as the key.

// Use userId:year as the key; the key "1125323041914142722:2021" takes up 368 bits
stringRedisTemplate.opsForValue().setBit("1125323041914142722:2021", 364, true); // 364 means day 364; true means checked in

This saves a lot of memory compared with storing strings directly. Note that the Redis bitmap is not a separate data structure; it is just a string operated on bit by bit.
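The sizing math above can be sketched with Java's BitSet. This is only an analogy (BitSet's bit order within a byte differs from Redis's SETBIT layout), but the space calculation is the same:

```java
import java.util.BitSet;

public class CheckinBitmap {
    // Mark the given days (0-364) as checked in, mirroring SETBIT key day 1
    static byte[] record(int... checkedInDays) {
        BitSet year = new BitSet(365);
        for (int d : checkedInDays) year.set(d);
        return year.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = record(0, 100, 364);
        // Highest set bit is day 364, so 364 / 8 + 1 = 46 bytes suffice
        System.out.println(bytes.length); // prints 46
    }
}
```

Storing 365 characters "0"/"1" would cost at least 365 bytes; the bitmap holds the same year in 46.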

  • hyperloglog

This data structure is used to estimate cardinality (the number of distinct elements left in a set after duplicates are removed). It has an error, but the error rate is small. Most importantly, its memory footprint is tiny: a HyperLogLog needs only 12 KB of memory to count the cardinality of close to 2^64 distinct elements.

We use it to count page UV. The result is not exact, but for the business it doesn't matter whether the UV is 200,000 or 200,200; they only need a rough figure for reference.
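How small is "small"? HyperLogLog's standard error is roughly 1.04/√m, where m is the number of registers; Redis uses 2^14 = 16384 registers of 6 bits each, which is where the 12 KB comes from. A quick check of the formula:

```java
public class HllError {
    // Standard error of a HyperLogLog estimate: about 1.04 / sqrt(m)
    static double standardError(int registers) {
        return 1.04 / Math.sqrt(registers);
    }

    public static void main(String[] args) {
        // Redis: 2^14 = 16384 registers, 6 bits each, roughly 12 KB total
        double err = standardError(1 << 14);
        System.out.printf("%.4f%n", err); // prints 0.0081, i.e. about 0.81%
    }
}
```

So on a UV of 200,000 the typical deviation is on the order of 1,600, which fits the "rough figure for reference" use case above.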

  • bloomfilter

It is used to determine whether an element exists in a set. Its underlying data structure is bitmap-based, so it is very memory-efficient. Like HyperLogLog it has some error, but the error is one-sided: when it says an element exists, the element may actually be absent (a false positive); when it says an element does not exist, it definitely does not.

Bloom filters work well for recommendation de-duplication. Take a short-video app that should only recommend videos the user has not watched: we put every video the user has watched into a Bloom filter, and before pushing a video we check whether it is in the filter. If it is not, the user has definitely not seen it, so it is safe to push.
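A minimal sketch of the idea (a toy filter on a BitSet, not Redis's implementation; the class name and hash scheme are ours for illustration). The key property is visible in the code: a false answer is always trustworthy, a true answer is only "probably":

```java
import java.util.BitSet;

public class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    TinyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive k positions from two base hashes (double hashing)
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) bits.set(index(item, i));
    }

    // false => definitely never added; true => probably added
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(item, i))) return false;
        return true;
    }

    public static void main(String[] args) {
        TinyBloomFilter watched = new TinyBloomFilter(1 << 16, 3);
        watched.add("video:1001"); // user has watched this one
        System.out.println(watched.mightContain("video:1001")); // prints true
        // Any video for which mightContain() is false is safe to push
    }
}
```

In production we don't hand-roll this; Redisson (mentioned later) ships an RBloomFilter backed by Redis bitmaps.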

  • geo

It stores geographic coordinates and supports various computations on them, such as distance and radius queries. The feature was added in Redis 3.2. (Inwardly: I haven't actually used this one… can't keep bluffing.)

Redis threading model

Interviewer: You also said that Redis is very fast, do you know why?

Redis is fast for two main reasons:

  • Redis is memory-based, and memory access takes on the order of 100 nanoseconds (one millisecond is 1,000,000 nanoseconds)
  • Redis is single-threaded and uses the IO multiplexing model to handle a large number of client connections

Interviewer: Can you give me a brief introduction to IO multiplexing?

IO multiplexing follows the same design idea as NIO (non-blocking IO) in Java. With traditional blocking IO on a single thread, the server must finish processing the current client's request, and the client must disconnect, before a connection with the next client can be established.

while (true) {
    socket = server.accept();
    new Thread(() -> {
        // read, write
    }).start();
}

This is the classic Java socket pattern: each client that connects gets its own thread. But if the server were single-threaded, it would have to wait for the IO to complete and the client to disconnect before it could accept the next client.

Java NIO introduces the Selector, Channel, and Buffer components. A client connection is a Channel; every new Channel is registered with a Selector, and the Selector keeps polling for Channels on which connection, read, or write events have occurred (backed by operating-system functions underneath) and handles them as they arrive. This way a single-threaded server can serve hundreds or thousands of client requests. Example code:

public static void main(String[] args) throws Exception {
    Selector selector = Selector.open();
    ServerSocketChannel serverChannel = ServerSocketChannel.open();
    serverChannel.socket().bind(new InetSocketAddress(6666));
    serverChannel.configureBlocking(false);
    serverChannel.register(selector, SelectionKey.OP_ACCEPT); // register the server channel with the selector
    while (true) {
        selector.select();
        Set<SelectionKey> selectionKeys = selector.selectedKeys();
        Iterator<SelectionKey> iterator = selectionKeys.iterator();
        while (iterator.hasNext()) {
            SelectionKey key = iterator.next();
            if (key.isAcceptable()) {
                System.out.println("the server has received a new accept event");
                SocketChannel clientChannel = serverChannel.accept(); // the client channel
                clientChannel.configureBlocking(false);
                clientChannel.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
            }
            if (key.isReadable()) {
                System.out.println("the server has received a new read event");
                SocketChannel channel = (SocketChannel) key.channel();
                ByteBuffer buffer = (ByteBuffer) key.attachment();
                channel.read(buffer);
            }
            iterator.remove(); // remove the processed event
        }
    }
}

IO multiplexing in Linux follows the same idea as Java NIO, but the implementation has evolved: select() -> poll() -> epoll(). With epoll(), the kernel hands back the set of file descriptors on which read/write events have occurred. The Redis model can be understood from the following picture.

Redis uses the IO multiplexing model to serve many clients with a single thread, queueing client commands and executing them one after another, so command execution is always thread-safe.
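The thread-safety point can be made concrete with a Java analogy (not Redis code): a single-threaded executor stands in for Redis's command loop. Many client threads submit increment "commands" concurrently, yet the counter needs no lock, because only the loop thread ever executes commands:

```java
import java.util.concurrent.*;

public class SingleThreadModel {
    // A single-threaded executor plays the role of Redis's command loop.
    static int runIncrements(int clients, int perClient) throws Exception {
        ExecutorService commandLoop = Executors.newSingleThreadExecutor();
        ExecutorService clientPool = Executors.newFixedThreadPool(clients);
        int[] counter = {0}; // no lock: only the command-loop thread touches it
        CountDownLatch submitted = new CountDownLatch(clients);
        for (int c = 0; c < clients; c++) {
            clientPool.submit(() -> {
                for (int i = 0; i < perClient; i++) {
                    commandLoop.submit(() -> counter[0]++); // one "command"
                }
                submitted.countDown();
            });
        }
        submitted.await();
        // A final "command" reads the result after all increments ran
        int result = commandLoop.submit(() -> counter[0]).get();
        commandLoop.shutdown();
        clientPool.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runIncrements(8, 1000)); // prints 8000: no lost updates
    }
}
```

With a shared mutable counter and multiple worker threads this would lose updates; serializing execution on one thread is exactly what makes Redis commands atomic with respect to each other.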

It is also worth mentioning that many people assume Java NIO means the server handles multiple clients' connect, read, and write events literally at the same time, which is wrong. Look at the Java NIO code above: the server still handles connections, reads, and writes sequentially. It is not one thread reading and writing multiple Channels simultaneously; it is one thread that can be connected to many clients at once. Each poll returns the set of Channels on which read/write events have occurred, and those reads and writes are still performed one by one in a loop. A Channel whose events have been processed will simply be handled again if another read/write event shows up in a later poll.

So it should be clear that IO multiplexing suits a server with many client connections, like Redis. If there is only one client that stays connected, select() keeps blocking and polling after each request is processed, consuming CPU with no advantage over traditional blocking IO.

Redis persistence

Interviewer: You just said that Redis is based on memory, so if Redis goes down accidentally, would all the data be lost?

Redis has persistence. Whether all data survives depends on the persistence mode you choose. Redis provides two mechanisms: RDB and AOF.

  • Persistence – RDB

RDB is snapshot backup. Following save <seconds> <changes> rules, Redis takes a snapshot at certain points: if the data has been modified at least the specified number of times within the specified interval, a backup is made. For example:

save 900 1     # if at least 1 change was made within 900 seconds, persist at that point
save 300 10    # if at least 10 changes were made within 300 seconds, ...
save 60 10000  # ...

Recovering from an RDB file is fast, but this mechanism can lose a lot of data at certain points. For example, with save 60 10000, suppose we make 10,000 changes within the first 50 seconds; the snapshot is due 10 seconds later, but Redis goes down first. Nothing was persisted, and those 10,000 changes are simply gone. For that we can choose the second mechanism, AOF.

  • Persistence – AOF

AOF appends every write command as a log entry to a disk file. But syncing every single write command to disk would hurt performance, so AOF provides a configurable sync frequency:

# appendfsync always   # sync on every write; no data loss, but slow: every command costs a disk IO
appendfsync everysec   # sync every second; the default policy, recommended for production
# appendfsync no       # never sync proactively; leave it to the OS

With everysec, at most 1 second of data can be lost, which is more acceptable than RDB's window. AOF has a problem, though: on restart, Redis recovers by replaying the log, that is, re-executing every command in the file. If the log has grown large, the replay runs too many commands and the restart takes too long. To solve this, Redis 4.0 introduced hybrid persistence.

  • Persistence – RDB & AOF

aof-use-rdb-preamble yes  # enable hybrid persistence (default: yes)

The RDB snapshot is written at the head of the AOF file, and incremental command logs are appended after it. On restart, most of the data is loaded from the RDB portion of the AOF file, which is fast, and replaying the small incremental tail adds little time.

Redis expired-key deletion strategies

Interviewer: You said before that Redis stores expired data. Do you know how Redis regularly deletes expired data?

(Inwardly: how would I know how it deletes them… isn't knowing how to use it enough?? I'm not the one developing Redis, I'm just a user…)

Redis provides two strategies for removing stale data

  • Periodic deletion

Redis puts every key that has an expiration time into a separate internal dictionary and scans it periodically, removing whatever has expired. By default Redis performs 10 such scans per second. Each scan works as follows:

  1. Randomly select 20 keys from the dictionary
  2. Delete whichever of those 20 keys have expired
  3. If the proportion of expired keys exceeded 1/4, repeat from step 1
  • Lazy deletion

When a key is accessed, Redis checks whether it has expired and deletes it immediately if it has
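The periodic scan's sample-and-repeat loop can be sketched as a small simulation (illustrative only; the map of key to expiry time stands in for Redis's internal TTL dictionary, and this is not Redis's actual code):

```java
import java.util.*;

public class ExpiryScan {
    // One round of the periodic scan: sample up to 20 keys that have a TTL,
    // delete the expired ones, and repeat while more than 1/4 were expired.
    static int scan(Map<String, Long> ttlKeys, long now, Random rnd) {
        int removedTotal = 0;
        while (true) {
            List<String> sample = new ArrayList<>(ttlKeys.keySet());
            Collections.shuffle(sample, rnd);
            sample = sample.subList(0, Math.min(20, sample.size()));
            if (sample.isEmpty()) return removedTotal;
            int removed = 0;
            for (String key : sample) {
                if (ttlKeys.get(key) <= now) { // expiry time has passed
                    ttlKeys.remove(key);
                    removed++;
                }
            }
            removedTotal += removed;
            if (removed * 4 <= sample.size()) return removedTotal; // at most 1/4 expired: stop
        }
    }

    public static void main(String[] args) {
        Map<String, Long> ttl = new HashMap<>();
        for (int i = 0; i < 100; i++) ttl.put("key:" + i, 0L); // all already expired
        int removed = scan(ttl, 0L, new Random());
        System.out.println(removed + " expired keys removed, " + ttl.size() + " left");
        // prints: 100 expired keys removed, 0 left
    }
}
```

The 1/4 cutoff is what bounds each scan's work: when few sampled keys turn out to be expired, the round ends early instead of walking the whole dictionary.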

Interviewer: Those two strategies can't guarantee that all expired data gets deleted. As business data grows, the expired keys that escape deletion will accumulate and occupy memory, and new business data needs memory too. What then?

(Inwardly: … memory not enough? Just add more…)

Indeed, in some cases the strategies above miss many expired keys. If traffic is heavy and Redis is used intensively, those missed keys pile up and occupy memory; together with new business data, that can genuinely exhaust memory. No need to panic, though: Redis also has memory eviction policies!

Redis memory eviction policies

We can configure the maxmemory parameter to cap the memory Redis may use. When actual usage exceeds it, Redis frees memory for new writes according to the eviction policy we configure. Since Redis 4.0 there are 8 policies in total:

  • volatile-lru: among keys with an expiration time set, evict the least recently used first
  • volatile-ttl: similar, except instead of LRU it compares the keys' remaining TTL; the smaller the TTL, the sooner the key is evicted
  • volatile-random: randomly evict keys from the set of keys with an expiration time
  • allkeys-lru: when memory cannot fit new writes, evict the least recently used key from the whole key space (keys without an expiration time may be evicted too; this is the most commonly used policy and recommended for production)
  • allkeys-random: randomly evict keys from the whole key space
  • noeviction: never evict; when memory cannot fit a new write, the write returns an error (the default policy, and generally one to avoid!)
  • volatile-lfu: among keys with an expiration time set, evict the least frequently used
  • allkeys-lfu: when memory cannot fit new data, evict the least frequently used key from the whole key space

That's 8 in total; we use the allkeys-lru policy
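The LRU idea behind allkeys-lru can be sketched with Java's LinkedHashMap in access order. (This is an exact LRU for illustration; Redis actually approximates LRU by sampling a few keys per eviction.)

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // stands in for the maxmemory cap

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: reads refresh recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least recently used entry on overflow
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the eldest
        cache.put("c", "3"); // exceeds capacity: evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The access-order trick is the essence of LRU: whatever has not been touched for the longest time is the first to go when space runs out.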

Redis implements distributed locking

Interviewer: You mentioned using Redis for distributed locks at the beginning. Can you tell me how a distributed lock differs from Java's locks?

Java's built-in locks are process-level locks that only take effect inside the current service instance. Production environments are usually clustered: an order service may run 10 instances, and process locks like Lock and synchronized can only lock within a single one of them

With Redis, the lock resource lives in a place shared by all instances, so it can lock across every order service

Interviewer: How do you implement distributed locking?

To implement it ourselves, we can use Redis's SET with the NX and EX options (exposed in Java as setIfAbsent with a timeout) plus spinning:

@Transactional
public void test() throws InterruptedException {
    String uuid = UUID.randomUUID().toString().replace("-", "");
    // the uuid value ensures we only ever release a lock we hold ourselves
    Boolean flag = stringRedisTemplate.opsForValue()
            .setIfAbsent("lock", uuid, 5, TimeUnit.SECONDS);
    if (flag != null && flag) { // lock acquired
        // execute the business logic...
        // release the lock (first check it is still ours, then delete it)
        // note: this check-then-delete is not atomic; a Lua script is needed for full safety
        if (uuid.equals(stringRedisTemplate.opsForValue().get("lock"))) {
            stringRedisTemplate.delete("lock");
        }
    } else {
        TimeUnit.MILLISECONDS.sleep(500); // sleep 500 ms to reduce the spin frequency
        test(); // spin to acquire the lock
    }
}

Later, however, we found Redisson, a framework listed on the Redis website that provides a much better implementation and encapsulation of Redis distributed locks, and our current project uses Redisson for the actual distributed-lock code.

RLock lock = redissonClient.getLock("lock"); // reentrant lock
RLock fairLock = redissonClient.getFairLock("fairLock"); // fair lock
RLock multiLock = redissonClient.getMultiLock(lock, fairLock); // multi-lock
RReadWriteLock readWriteLock = redissonClient.getReadWriteLock("readWriteLock");
RLock readLock = readWriteLock.readLock(); // read lock
RLock writeLock = readWriteLock.writeLock(); // write lock
RSemaphore semaphore = redissonClient.getSemaphore("semaphore"); // semaphore
RPermitExpirableSemaphore mySemaphore = redissonClient.getPermitExpirableSemaphore("mySemaphore"); // semaphore with expirable permits
RCountDownLatch latch = redissonClient.getCountDownLatch("anyCountDownLatch"); // countdown latch

The locks the framework provides are used almost exactly like the process locks in Java's JUC package, so the coding experience matches plain Java, with distributed safety on top.

It provides not only distributed locks but also higher-level APIs such as distributed collections and an encapsulated Bloom filter.

Conclusion

Space is limited, so many topics were not covered in depth, but I don't think an interview answer needs that much detail anyway. Redis interview questions don't stop here; more advanced application questions, cluster questions, and so on will come in future posts.