It's the autumn job-hopping season again, and many of you have already started taking action. To help out, I'm sending you this set of Redis interview questions. I hope it helps you prepare for interviews and saves you time searching the Internet!
Why is Redis so fast
1. Data is stored in memory
Redis stores its data in memory, so read and write operations only access memory and involve no disk I/O.
2. Underlying data structure
As follows:
- Redis stores data as key:value pairs in a hash table, so looking up a key is O(1).
- Redis defines several data structures for values, including the dynamic string, doubly linked list, compressed list (ziplist), hash table, skip list, and integer array, and picks the most efficient one based on the characteristics of the value.
3. Single-threaded model
Redis's network I/O and data reads/writes use a single-threaded model, and the thread can be bound to a CPU core, which avoids the overhead of thread context switching.
Note: Redis 6.0 introduced multithreading for handling network requests, while data reads and writes are still executed by a single thread.
Redis multi-threaded network model is shown in the following figure:
4. IO multiplexing
Redis uses the epoll network model, as shown below:
The kernel listens for new socket connections and for read/write events on established connections, and places triggered events in an event queue. Redis's single thread continuously processes this queue, so it never blocks waiting for a connection or a read/write event to arrive.
Each event is bound to a callback function, and Redis invokes the corresponding handler to process it.
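The event-loop pattern described above can be sketched with Python's selectors module, which picks epoll on Linux. This is an illustrative toy, not Redis's actual C implementation: the socket pair stands in for a client connection, and on_readable stands in for a command handler.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on macOS
a, b = socket.socketpair()         # stand-ins for a client and server socket
a.setblocking(False)
b.setblocking(False)

results = []

def on_readable(conn):
    # callback bound to the read event, like a Redis command handler
    results.append(conn.recv(1024))

sel.register(b, selectors.EVENT_READ, data=on_readable)
a.sendall(b"PING")

# one iteration of the event loop: wait for ready events,
# then dispatch each to its registered callback
for key, _ in sel.select(timeout=1):
    key.data(key.fileobj)

sel.close()
a.close()
b.close()
print(results)  # [b'PING']
```

Redis's real event loop (in ae.c) follows the same shape: wait for ready events, then dispatch each one to the handler registered for that socket.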
Redis underlying data structure
1. Redis has five data types: string, list, set, sorted set (zset), and hash.
2. Redis has six underlying data structures: the dynamic string, doubly linked list, ziplist (compressed list), hash table, skip list, and integer array.
The Redis data types map to the underlying data structures as follows:
1. The value is a string
The underlying data structure is a dynamic string.
2. List
Redis uses a compressed list if both of the following conditions are met, otherwise a doubly linked list:
- Each element in the list is smaller than 64 bytes
- The list has fewer than 128 elements (both thresholds are configurable)
A compressed list is a contiguous space in memory, structured as follows:
Looking up an element in a compressed list is O(n).
3. Set
Redis uses an ordered integer array if both of the following conditions are met, otherwise a hash table:
- All elements in the set are integers
- The set has no more than 512 elements
4. Sorted set
Redis uses a compressed list if both of the following conditions are met, otherwise a skip list:
- Every element in the set is smaller than 64 bytes
- The set contains fewer than 128 elements
Note: a sorted set also keeps a hash table mapping each element to its score, which is what makes the ZSCORE command so efficient.
The structure of the skip list is as follows:
In the figure, without an index it takes 10 steps to find the number 10; with the second-level index it takes 5 steps; with the first-level index it takes 3 steps.
Each level of the skip list is an ordered linked list, and the lowest level holds all the data. Insertion, deletion, and lookup in a skip list are all O(log N). The trade-off is that the extra index nodes add space overhead.
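The structure can be made concrete with a toy skip list in Python. This is a sketch of the idea, not Redis's actual zskiplist: the node layout, the fixed MAX_LEVEL, and the 1/2 promotion probability are illustrative.

```python
import random

class SkipNode:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level  # one next-pointer per level

class SkipList:
    MAX_LEVEL = 4

    def __init__(self):
        self.head = SkipNode(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # promote a node one level up with probability 1/2
        lvl = 1
        while lvl < self.MAX_LEVEL and random.random() < 0.5:
            lvl += 1
        return lvl

    def insert(self, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        # find the insertion point on each level, top-down
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SkipNode(value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, value):
        node = self.head
        # start at the top index level and drop down, as in the figure
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == value
```

Searching walks right along the highest index level as far as possible, then drops down a level, which is what gives the O(log N) expected cost.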
5. Hash (dictionary)
Redis uses a compressed list if both of the following conditions are met, otherwise a hash table:
- The key and value of every entry in the hash are smaller than 64 bytes
- The hash has fewer than 128 entries (configurable)
Redis cache elimination strategy
Redis has a total of eight elimination strategies, as shown below:
volatile-lfu and allkeys-lfu were added in version 4.0:
- **LRU:** evicts the least recently accessed data first. One pitfall: if a large amount of cold data happens to be accessed recently, it occupies a lot of memory, and when the cache is full some hot data may be evicted instead.
- **LFU:** evicts the least frequently accessed data first. If two entries have the same access count, the one accessed longer ago is evicted.
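The LRU rule can be sketched with an OrderedDict. Note that real Redis uses an approximate, sampled LRU (controlled by maxmemory-samples) rather than a strict queue, so this toy only illustrates the eviction principle.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU eviction: on overflow, drop the least recently accessed key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # touch "a", so "b" is now least recently used
cache.put("c", 3)    # evicts "b"
print(list(cache.data))  # ['a', 'c']
```

This also shows the pitfall from the bullet above: whatever was touched most recently survives, regardless of how hot it was historically.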
Redis data persistence
Redis has two persistence methods: the append-only file (AOF) log and the memory snapshot (RDB).
1. AOF log
The AOF log records every write command Redis receives. When Redis recovers from a failure, it replays the commands in the AOF log to restore its data.
AOF has three synchronization strategies, as shown below:
For services that can tolerate a little data loss, everysec is recommended: it rarely blocks the main thread, and at most one second of data is lost after a failure.
2. RDB snapshots
An RDB snapshot is an in-memory snapshot of all Redis data at one point in time.
3. Mixed persistence
Starting with Redis 4.0, the AOF file can also contain an RDB snapshot. When an AOF rewrite occurs, Redis writes the current in-memory state into the new AOF file as an RDB snapshot (beginning with the magic string "REDIS").
After the RDB content, the file continues to record subsequent commands in AOF format. During recovery, Redis first loads the RDB portion and then replays the commands that follow it.
4. Master/slave synchronization
During master/slave synchronization, the master sends an RDB snapshot to the slave while buffering new write commands in the replication buffer. After the slave loads the RDB file, the master sends it the buffered commands.
5. AOF rewrite
Because the AOF log is append-only, it may contain multiple commands for the same key that can be merged into one. For example, several SET commands for one key can be collapsed into just the latest one.
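The merging idea can be sketched as replaying the log into a final state and emitting one command per surviving key. This is a toy model handling only SET and DEL, not the actual rewrite code:

```python
def rewrite_aof(commands):
    """Collapse a command log to one SET per key, keeping only the
    final value -- the idea behind AOF rewrite (toy: SET/DEL only)."""
    state = {}
    for cmd, key, *args in commands:
        if cmd == "SET":
            state[key] = args[0]
        elif cmd == "DEL":
            state.pop(key, None)
    # the rewritten "log" reconstructs the same final state in one pass
    return [("SET", k, v) for k, v in state.items()]

log = [("SET", "a", "1"), ("SET", "a", "2"), ("SET", "b", "9"), ("DEL", "b")]
print(rewrite_aof(log))  # [('SET', 'a', '2')]
```

Four log entries shrink to one, which is why a rewrite keeps AOF files from growing without bound.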
6. Blocking points
During an AOF rewrite or an RDB snapshot, Redis forks a child process to do the work, and the child does not block the main thread while it runs.
But two things to note:
- During the fork itself, the Redis main thread copies the memory page table (which maps virtual to physical memory) to the child process, and this copy blocks. The larger the main thread's memory, the longer the block.
- The child process shares physical memory with the Redis main thread. If a new write request arrives, the main thread must copy the affected data page to a new memory area (copy-on-write) before modifying it. The diagram below:
Note: if huge memory pages are enabled, each copy-on-write copies a whole 2 MB page.
High availability of Redis
Below is a “one master, two slave, three sentinels” architecture diagram:
From the figure we can see that connections are established between sentinels, between sentinels and master and slave nodes, and between sentinels and clients.
If the primary node fails, the Sentinel cluster needs to complete the primary/secondary switchover, as shown below:
Let's walk through these four steps one by one.
1. Check whether the primary node is offline
When a sentinel detects that the master is offline, it sends confirmation commands to the other sentinels, and each replies "Y" or "N" based on its own judgment.
The master is marked objectively offline only when the number of agreeing sentinels reaches the quorum. The quorum parameter is configurable; with five sentinels it is typically set to 3 (roughly n/2 + 1, where n is the number of sentinels), but other values can be used.
2. Elect a new primary node
When the primary node is judged to be offline, the Sentinel cluster selects a new primary node.
Filter out unstable slaves using down-after-milliseconds * 10: down-after-milliseconds is the maximum tolerated disconnection time between master and slave, and any slave whose total disconnection time exceeds ten times that value is considered unstable and removed from the candidate list.
slave-priority: the configured priority of a slave. The sentinel picks the slave with the highest priority (in Redis, a lower slave-priority value means a higher preference).
Replication progress: Redis keeps a buffer called repl_backlog_buffer that records the commands for incremental master/slave replication.
It is a ring buffer, as shown below:
The master records its write position in master_repl_offset, and each slave records its read position in slave_repl_offset.
The slave whose slave_repl_offset is closest to master_repl_offset is chosen as the new master.
Therefore, the slave node with offset 114 in the figure above is preferentially selected as the new primary node.
Run ID: if both priority and replication progress are equal, the slave with the smaller run ID becomes the new master.
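The filter and tie-breakers above can be sketched as a sort. The field names here are made up for illustration; Sentinel's real selection logic lives in its C source.

```python
def pick_new_master(replicas, down_after_ms):
    """Toy version of Sentinel's selection: drop unstable replicas,
    then rank by priority, replication offset, and run ID."""
    # a replica disconnected longer than down-after-milliseconds * 10
    # is considered unstable and excluded
    alive = [r for r in replicas if r["disconnect_ms"] <= down_after_ms * 10]
    if not alive:
        return None
    # lower numeric priority wins in Redis; a bigger offset means more
    # replicated data; a smaller run ID breaks any remaining tie
    alive.sort(key=lambda r: (r["priority"], -r["offset"], r["run_id"]))
    return alive[0]["run_id"]

replicas = [
    {"run_id": "r1", "priority": 100, "offset": 114, "disconnect_ms": 0},
    {"run_id": "r2", "priority": 100, "offset": 110, "disconnect_ms": 0},
    {"run_id": "r3", "priority": 1, "offset": 120, "disconnect_ms": 999999},
]
print(pick_new_master(replicas, down_after_ms=10000))  # r1
```

r3 would win on priority, but it is filtered out as unstable; r1 then beats r2 because its offset is closer to the master's.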
3. Elect sentry leaders
The sentinel that first judged the master offline, after collecting enough confirmations from the other nodes, sends commands to the other sentinels to apply to become the sentinel Leader.
The requirements for becoming a Leader are as follows:
- It must receive at least quorum affirmative votes
- It must receive votes from more than half of all sentinels
Suppose the cluster has five sentinels and quorum is set to 2. If three sentinels fail, the remaining two can still judge the master objectively offline, but the failover cannot proceed, because electing a Leader requires at least three votes (a majority of five) and only two sentinels are left.
Likewise, in a cluster with only two sentinels, if one fails, a Leader can never be elected.
The following figure shows a sentinel successfully elected as Leader:
4. Switch over the active node
After the new master node and sentinel Leader are selected, the sentinel Leader will perform the operation of primary/secondary switchover.
After completion, several notifications are sent:
- Notify other sentinels of the new master node address
- Notify all slave nodes of the new master's address; on receiving it, each slave starts replicating from the new master
- Notify the client to connect to the new master node
5. Request processing during primary/secondary switchover
If a client read request goes to a slave, it is handled normally. Write requests, however, fail until the client is notified of the new master's address. While the master is offline, the client can take emergency measures such as buffering write requests locally.
To learn the new master's address promptly, clients can subscribe to Sentinel's master-offline and master-switch events.
Why does Redis slow down
There are many reasons why Redis can slow down; here are 11 of them:
As you can see from the figure, there are two main reasons why Redis slows down: blocking the main thread and operating system limitations.
1. The main thread is blocked
AOF rewrite and RDB snapshot: as mentioned earlier, Redis forks a child process to rewrite the AOF (bgrewriteaof) and another to take an RDB snapshot (bgsave).
These operations look like they don't block the main thread, but the fork itself is done on the main thread.
When forking the child, Redis has to copy the memory page table. If the instance is large, this copy consumes a lot of CPU and blocks the main thread for longer.
Huge memory pages: the operating system can back Redis memory with 2 MB huge pages. Huge pages reduce the number of memory allocations to some extent, but they hurt data persistence.
During an AOF rewrite or RDB snapshot, any new write the main thread receives triggers a copy-on-write.
With huge pages, even if Redis modifies only a 1 KB key, it still has to copy a whole 2 MB page; under heavy writes, all that copying degrades performance.
High command complexity: executing complex commands is a common cause of Redis blocking. For example, SORT on a set or list is O(N + M*log(M)).
Bigkey operation: If the value of a key is very large, it takes time to allocate memory for creation and release memory for deletion.
Since Redis 4.0, lazyfree allows memory to be reclaimed asynchronously by a background thread without blocking the main thread: replace DEL with UNLINK and the key's memory is freed asynchronously.
Redis 6.0 added the lazyfree-lazy-user-del option; when it is set to yes, a plain DEL is also freed asynchronously by the background thread.
If lazyfree-lazy-user-del is not set to yes, Redis decides at deletion time whether to free asynchronously.
Note that Redis never frees asynchronously for strings, or for types backed by integer arrays or compressed lists.
Full synchronization on a slave: during a full sync, the slave must flush its in-memory data before loading the RDB file, and this blocks; any read request received can only be served after the RDB has been loaded.
In addition, a large instance produces a large RDB file that takes a long time to load. So keep Redis instances small, for example 4 GB each, and use a sharded cluster beyond that.
AOF disk sync: the appendfsync policy has three settings: always, everysec, and no. With always, every command is synced to disk as it is written; this blocks, and the next command is processed only after the sync succeeds.
The everysec policy is recommended; if the data is not sensitive to loss, no can be used.
Memory reaches maxmemory: keys must be evicted according to the configured eviction policy, and even with lazyfree's asynchronous deletion, selecting which keys to evict blocks the main thread.
A cheaper eviction policy can be chosen, for example random eviction instead of LRU or LFU. The number of shards can also be increased so each instance has fewer keys to evict.
2. Operating system restrictions
Swap in use: the operating system resorts to swap when it cannot allocate enough physical memory to Redis, often because other processes are using a lot of it. Once swap is in use, Redis memory pages are constantly swapped in and out, which has a huge impact on performance.
Network problems: if the network card is heavily loaded, Redis performance suffers badly. The traffic may genuinely come from Redis, but other high-traffic applications may also be hogging the bandwidth. This is best monitored at the operations level.
Thread context switching: although Redis is single-threaded, context switches can still happen on multi-core CPUs.
If the main thread migrates from one physical core to another, it loses the benefit of that core's warm L1 and L2 caches.
As shown below:
To prevent this, you can bind Redis to a CPU physical core.
Slow disks: in the AOF synchronous-write scenario, poor disk performance limits Redis response time. Use better-performing SSDs.
Designing the leaderboard function
Redis's zset type stores a score for each member, which makes a leaderboard easy to implement.
For example, to rank 10 articles by reads, first create a zset holding the 10 articles. Whenever a reader opens an article, add 1 to its score with the ZINCRBY command. Finally, fetch the top articles with the ZREVRANGE command.
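Without a live Redis server, the logic can be sketched with a plain dict standing in for the zset; the zincrby and zrevrange functions below mirror the Redis commands in name only.

```python
reads = {}  # stand-in for the zset: member -> score

def zincrby(key_scores, amount, member):
    # like ZINCRBY: add amount to the member's score
    key_scores[member] = key_scores.get(member, 0) + amount
    return key_scores[member]

def zrevrange(key_scores, start, stop):
    # like ZREVRANGE: members ranked by score, highest first
    ranked = sorted(key_scores.items(), key=lambda kv: -kv[1])
    return [m for m, _ in ranked[start:stop + 1]]

# simulate readers opening articles
for article in ["a1", "a2", "a1", "a3", "a1", "a2"]:
    zincrby(reads, 1, article)  # one read = +1 score

print(zrevrange(reads, 0, 2))  # ['a1', 'a2', 'a3']
```

With a real server the calls become `r.zincrby("hot_articles", 1, article)` and `r.zrevrange("hot_articles", 0, 2)` through a client library.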
Redis implements distributed locking
1. Single-node Redis distributed lock
As shown in the following figure, a service has two clients deployed. When acquiring a distributed lock, one succeeds and the other fails.
Redis generally uses setnx to implement distributed locks. The command is as follows:
SETNX KEY_NAME VALUE
If the setting succeeds, 1 is returned; if the setting fails, 0 is returned. There are some problems with using single-node distributed locks.
Client 1 crashes after acquiring the lock: the lock can never be released, so no other client can ever acquire it.
The solution is to set the expiration time of the key with the following command:
SET key value [EX seconds] [PX milliseconds] NX
Client 2 releases the lock by mistake: the fix is to store a unique client identifier in the value when setting the key. For example, client 1 can prepend the string application1 to the value and verify it before deleting the key.
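Both fixes can be sketched in-process: a dict stands in for Redis, acquire models SET key value NX PX, and release only deletes when the stored owner matches. Note that in real Redis, the check-and-delete in release must be made atomic with a Lua script.

```python
import time
import uuid

store = {}  # stand-in for Redis: key -> (owner, expiry timestamp)

def acquire(key, owner, ttl_ms):
    """Model of SET key value NX PX ttl_ms."""
    entry = store.get(key)
    now = time.monotonic()
    if entry is None or entry[1] <= now:          # free, or lease expired
        store[key] = (owner, now + ttl_ms / 1000)
        return True
    return False

def release(key, owner):
    """Only the owner may delete -- prevents client 2 from
    releasing client 1's lock by mistake."""
    entry = store.get(key)
    if entry and entry[0] == owner:
        del store[key]
        return True
    return False

me, other = str(uuid.uuid4()), str(uuid.uuid4())
assert acquire("lock:order", me, 100)
assert not acquire("lock:order", other, 100)   # second client fails
assert not release("lock:order", other)        # wrong owner can't unlock
assert release("lock:order", me)
```

The TTL solves the crashed-client problem (the lease simply expires), and the owner check solves the mistaken-release problem.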
2. Redis red lock
A single Redis node has a reliability problem: if the node fails, locking fails with it. To handle this single point of failure, Redis designed a multi-node distributed lock, also known as the red lock (Redlock).
The main idea is that the client requests the lock from multiple Redis instances, and the distributed lock is considered acquired only when more than half of the instances grant it.
As shown in the figure below, the client requests the lock from three instances; two grant it, so the distributed lock is acquired:
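The majority rule can be sketched as a simple count. The full Redlock algorithm also accounts for elapsed time and clock drift when computing the lock's validity, which this toy check ignores.

```python
def redlock_acquired(grants, total_instances):
    """Red lock rule of thumb: the lock is held only if more than
    half of the instances granted it."""
    return sum(grants) >= total_instances // 2 + 1

# the scenario from the figure: three instances, two grants
print(redlock_acquired([True, True, False], 3))   # True
print(redlock_acquired([True, False, False], 3))  # False
```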
Cache avalanche, breakdown, penetration
1. Cache avalanche
When Redis is used as a cache, if a large number of cached keys expire at the same time, client requests flood the database and its load spikes.
The diagram below:
There are three main coping methods:
- Add a small random number to each key's expiration time
- Current limiting
- Service degradation
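The first mitigation, expiry jitter, is a one-liner; the base and spread values here are arbitrary examples.

```python
import random

BASE_TTL = 3600  # one hour, an illustrative base expiry

def ttl_with_jitter(base=BASE_TTL, spread=300):
    # add a small random number so keys written together
    # do not all expire at the same instant
    return base + random.randint(0, spread)

# keys cached in the same batch now expire spread over 5 minutes
print(ttl_with_jitter())
```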
2. Cache breakdown
A hot key suddenly expires, and the flood of requests for it goes straight to the database. The solution is to set no expiration time on hot keys.
3. Cache penetration
Cache penetration happens when the requested data exists neither in the cache nor in the database, so every request falls through to the database.
The diagram below:
There are two main coping methods:
- Cache a null or default value for the missing key
- Query the Bloom filter before querying the database
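The second mitigation can be sketched with a toy Bloom filter: a Python int serves as the bit array and sha256 derives the k bit positions. The parameters m and k are illustrative; a production system would use a tuned library instead.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hashed bit positions per item over m bits.
    A 'no' answer is definite; a 'yes' may be a false positive."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # an int used as an m-bit array

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("user:1")
print(bf.might_contain("user:1"))  # True
```

Before hitting the database, check the filter: if it says "no", the key definitely doesn't exist and the request can be rejected immediately.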
Data skew
What is data skew? Consider this interview question: Redis has a hot key with a QPS of 1,000,000; how do you store it?
If this hot key is placed on a Redis instance, the access pressure on the instance will be very high.
As shown in the following image, redis3 holds the hot key foo, which can be very stressful to access:
There are two main solutions:
1. Cache the hot key in the client's local cache.
There are two problems with this approach:
- Hot keys cached by clients can consume a lot of memory.
- The client needs to ensure that the local cache is consistent with the Redis cache.
2. Add a random prefix to the hot key and save it to different Redis instances.
There are two problems with this:
- Clients must add the prefix to the key on every access
- To delete the key, the client must delete it on every instance using each prefix
Using Bitmaps
There's a classic interview question: how do you sort a billion integers in memory?
In Java, an int occupies four bytes of memory, so a billion of them take:
1,000,000,000 × 4 / 1024 / 1024 / 1024 ≈ 3.7 GB
That's a lot of memory. What if you don't have that much?
1. The Bitmap is introduced
The Bitmap type is built on the String data structure, whose underlying storage is a binary bit array. Suppose we have four numbers: 1, 4, 6, and 9, stored in the bit array as shown below:
This bit array holds four integers in the space of 10 bits, which is very small.
Back to the interview question: we use a bit array whose length covers the whole value range (max − min + 1).
If there are negative numbers, a transformation is needed because array subscripts must start at 0: add the absolute value of the smallest negative number to every element. For example, {-2, 0, 1, 3} becomes {0, 2, 3, 5}.
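The whole trick, including the negative-number shift, can be sketched in a few lines. It assumes the integers are distinct, since a bitmap stores only presence, one bit per possible value.

```python
def bitmap_sort(nums):
    """Sort distinct integers using one bit per value in the range --
    the idea behind a Bitmap instead of 4-byte ints."""
    # shift so the smallest value maps to bit 0
    offset = -min(nums) if nums and min(nums) < 0 else 0
    bits = 0
    for n in nums:
        bits |= 1 << (n + offset)   # mark presence
    # walk the bit array from low to high: set bits come out sorted
    out, i = [], 0
    while bits:
        if bits & 1:
            out.append(i - offset)
        bits >>= 1
        i += 1
    return out

print(bitmap_sort([-2, 3, 0, 1]))  # [-2, 0, 1, 3]
```

For a billion values the bit array needs only (max − min + 1) bits, which is what brings the memory down from gigabytes of ints.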
2. Application scenarios
Employee check-in records: in a company with 100 employees, to count who had full attendance for a month, create a Bitmap for each day and set the bit of every employee who checks in to 1.
To count employees who checked in that day, just use the BITCOUNT command.
To count the employees with full attendance for the month, just intersect (AND) the Bitmaps of every day in the month.
The command is as follows:
BITOP AND srckey1 srckey2 srckey3 ... srckey30
srckeyN is the Bitmap of the check-in records for day N.
Counting a site's daily active users: for example, with 100,000 users, create a Bitmap of length 100,000 with one bit per user ID. When a user logs in, set their bit to 1; at the end of the day, use the BITCOUNT command to get the total number of users who logged in that day.
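The BITCOUNT/BITOP idea can be modeled with plain Python ints, one bit per employee; set_bit and bitcount below are illustrative stand-ins for Redis's SETBIT and BITCOUNT.

```python
def set_bit(bitmap, offset):
    # like SETBIT key offset 1
    return bitmap | (1 << offset)

def bitcount(bitmap):
    # like BITCOUNT: number of bits set to 1
    return bin(bitmap).count("1")

# two days of check-ins; bit i says whether employee i checked in
day1 = day2 = 0
for emp in [0, 1, 3]:
    day1 = set_bit(day1, emp)
for emp in [1, 3, 4]:
    day2 = set_bit(day2, emp)

full_attendance = day1 & day2        # like BITOP AND dest day1 day2
print(bitcount(full_attendance))     # 2 (employees 1 and 3)
```

The same pattern counts daily active users: set one bit per user ID on login, then BITCOUNT at the end of the day.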
Finally
If you've read this far, give it a thumbs up!!