Redis remains a very popular piece of cache middleware, so it is worth spending some time on: it is used heavily in high-concurrency scenarios, and it is a frequent topic in interviews.

Outline

Common Redis interview questions

What is Redis

Redis is, simply put, a “database”, but unlike traditional databases, Redis stores its data in “memory”, so its “read and write speed is very fast”. That is why Redis is widely used for “caching”. Redis is also often used as a “distributed lock”. It provides multiple data types to support different business scenarios, and it also supports transactions, persistence, Lua scripting, LRU eviction, and several clustering modes.

Why Redis

  1. “High performance”: the first time a user accesses some data in the database, the process is slow because it is “read from the hard disk”. If we then put that data in the cache, the next access can be served straight from the cache. Operating on the “cache means operating directly on memory, which is very fast”. If the corresponding data in the database changes, the cached copy can be updated in sync!
  2. “High concurrency”: “operating directly on the cache can handle far more requests than accessing the database directly”, so we can consider moving some of the data from the database into the cache; part of the users’ requests then go straight to the cache without ever touching the database.
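The high-performance point above is the classic cache-aside flow. Here is a minimal sketch in Python, assuming plain dicts stand in for Redis and the database (all names are illustrative, not a real Redis API):

```python
# A minimal cache-aside sketch (assumption: illustrative, not Redis itself).
# Reads hit the in-memory cache first and fall back to the slow "database";
# writes update the database and invalidate the cached copy.
database = {"user:1": "Alice"}    # stand-in for a slow disk-backed store
cache = {}                        # stand-in for Redis

def read(key):
    if key in cache:              # cache hit: served from memory
        return cache[key]
    value = database.get(key)     # cache miss: slow read from the database
    cache[key] = value            # populate the cache for next time
    return value

def write(key, value):
    database[key] = value         # update the source of truth
    cache.pop(key, None)          # invalidate so the cache stays in sync

read("user:1")                    # miss: loads from the database
write("user:1", "Bob")            # the change propagates by invalidation
print(read("user:1"))             # Bob (a fresh read repopulates the cache)
```

Invalidation on write (rather than updating the cache in place) is the simple way to keep the cached copy from going stale, at the cost of one extra miss after each write.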

What are the benefits of using Redis

  1. “Fast”: the data lives in memory, similar to a HashMap, whose advantage is O(1) time for both lookup and update.
  2. Rich data type support: “string, list, set, sorted set, hash”.
  3. “Transaction support”: the operations in a transaction are applied as a unit, meaning the queued changes to the data are either all applied or none are.
  4. Rich features: “it can be used for caching and messaging; you can set an expiration time per key, and the key is deleted automatically when it expires”.
  5. And so on…

Why Redis and not Map/Guava

Caches fall into local caches and distributed caches. Taking Java as an example, using the built-in “Map or Guava gives you a local cache”. Its main characteristics are that it is “light weight and fast, its life cycle ends when the JVM is destroyed, and, when there are multiple instances, each instance must keep its own copy of the cache, so the caches are not consistent”.

Using something like Redis or Memcached is called a distributed cache. With multiple instances, every instance shares one copy of the cached data, so the cache is “consistent”. The downside is the need to keep Redis or Memcached highly available, which makes the architecture more complex.

What advantages does Redis have over Memcached

  1. All of Memcached’s values are “simple strings”; Redis, by contrast, supports much “richer data types.”
  2. Redis is “much faster” than memcached
  3. Redis can “persist” its data

Redis thread model

Internally, Redis uses a file event handler. This handler is “single-threaded”, hence Redis’s “single-threaded model”. It uses an “IO multiplexing mechanism” to monitor many sockets at once and, according to the events occurring on each socket, “selects the corresponding event handler” to process them.

The structure of the file event handler consists of four parts:

  • Multiple sockets
  • An IO multiplexing program
  • A file event dispatcher
  • Event handler (connection reply handler, command request handler, command reply handler)

Multiple sockets may “concurrently generate different operations”, each corresponding to a different file event. The IO multiplexing program listens on all the sockets and “queues up the events” they generate; the event dispatcher takes events off the queue one at a time and hands each event to the corresponding “event handler” for processing.
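The dispatch flow described above can be sketched with Python's standard selectors module. This is a toy single-threaded loop over a socket pair, in the spirit of Redis's model but not its actual implementation (Redis is written in C):

```python
import selectors
import socket

# Toy single-threaded file event loop (assumption: illustrative only).
sel = selectors.DefaultSelector()

def handle_readable(conn):
    # Plays the role of the "command request handler": read a request
    # and write a reply back on the same socket.
    data = conn.recv(1024)
    if data:
        conn.sendall(b"+OK " + data)

# A connected socket pair simulates a client talking to the server loop.
server_side, client_side = socket.socketpair()
sel.register(server_side, selectors.EVENT_READ, handle_readable)

client_side.sendall(b"PING")

# One iteration of the loop: the multiplexer reports ready sockets,
# and the dispatcher hands each event to its registered handler.
for key, _ in sel.select(timeout=1):
    key.data(key.fileobj)

response = client_side.recv(1024)
print(response)  # b'+OK PING'
```

Because one thread drains the ready-event queue, handlers never run concurrently, which is exactly why Redis commands need no locking.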

Common Redis performance problems and solutions

  1. It is best for the Master not to do any persistence work, such as RDB memory snapshots and AOF log files
  2. If the data is important, a Slave enables AOF backup, and the policy is set to synchronize data once per second
  3. To ensure the replication speed and connection stability, it is recommended that the Master and Slave reside on the same LAN
  4. Try to avoid adding slave libraries to stressed master libraries
  5. Use a chained replication structure: Master <- Slave1 <- Slave2 <- Slave3…

Redis common data structure and application scenario analysis

String

Common commands: SET, GET, DECR, INCR, MGET, etc.

The String data structure is a simple key-value type; a value can be a number as well as a string. It is used for ordinary key-value caching and for ordinary counting: the number of posts a Weibo user has, the number of followers, and so on.

Hash

Common commands: HGET, HSET, HGETALL, etc.

A hash is a mapping table of string-type fields and values. Hashes are particularly well suited to storing objects, because you can modify the value of a single field directly. For example, we can use the hash data structure to store user information, product information, and so on.

List

Common commands: LPUSH, RPUSH, LPOP, RPOP, LRANGE, etc.

A list is a linked list. The Redis list has many application scenarios and is one of the most important Redis data structures. For example, features such as a Weibo following list, fan list, or message list can all be implemented with the Redis list structure.

The Redis list is implemented as a doubly linked list, so it supports reverse lookup and traversal, which makes it easier to operate on but brings some extra memory overhead.

In addition, the LRANGE command reads a given number of elements starting from a certain element, which lets you implement paginated queries on top of a list. This is a very useful capability.

Set

Common commands: SADD, SPOP, SMEMBERS, SUNION, etc.

A set provides functions similar to a list’s, except that it deduplicates its elements automatically.

A set is a good choice when you need to store list-like data without duplicates, and sets offer an important interface that lists do not: determining whether a member is in the set. Intersection, union, and difference operations can also be performed easily on sets.

For example, in a Weibo-style application, all the people a user follows can be stored in one set, and all their fans in another. Redis can then conveniently implement features such as common follows, common fans, and common interests; this is just the process of computing an intersection. The specific command is: SINTERSTORE key1 key2 key3, which stores the intersection of key2 and key3 in key1.
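The common-follows example boils down to set algebra. A small sketch with plain Python sets (the follow lists are made up; in Redis these would be SINTER, SUNION, and SDIFF over set keys):

```python
# Sketch of the "common follows" idea with plain Python sets
# (assumption: data is invented; Redis would use SADD + SINTER etc.).
follows_alice = {"tech_news", "redis_tips", "cooking"}
follows_bob = {"redis_tips", "cooking", "movies"}

common = follows_alice & follows_bob       # intersection, like SINTER
either = follows_alice | follows_bob       # union, like SUNION
only_alice = follows_alice - follows_bob   # difference, like SDIFF

print(sorted(common))  # ['cooking', 'redis_tips']
```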

Sorted Set

Common commands: ZADD, ZRANGE, ZREM, ZCARD, etc.

Compared with a set, a sorted set adds a weight parameter, score, so the elements in the set can be ordered by score.

For example, in the live broadcast system, the real-time ranking information includes the list of online users in the live broadcast room, the list of various gifts, and the bullet-screen messages (which can be understood as the list of messages according to the message dimension), which are suitable for storage by the SortedSet structure in Redis.

Setting an expiration time in Redis

Redis has a key-expiration feature that lets you set an expiration time on values stored in the Redis database. For a cache database this is very useful. For example, tokens and login information in typical projects, and especially SMS verification codes, are all time-limited. With a traditional database, the application generally has to check for expiry itself, which undoubtedly hurts performance badly.

When we set a key, we can give an expire time. With the expire time, we can specify the lifetime of the key.

Periodic deletion + lazy deletion

  • Periodic deletion: by default, every 100 ms Redis randomly samples some keys that have an expiration time, checks whether they have expired, and deletes the expired ones. Note that this is a random sample. Why random? Think about it: if Redis holds hundreds of thousands of keys and iterated over every key with an expiration time every 100 ms, it would put a heavy load on the CPU!
  • Lazy deletion: periodic deletion can leave many expired keys undeleted past their time, so there is also lazy deletion. If an expired key escapes periodic deletion, it stays in memory until your code next accesses that key, at which point Redis checks it and deletes it. This is called lazy deletion, and it is lazy indeed!
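A simplified model of the two strategies, assuming a toy store where periodic deletion samples random keys and reads delete expired keys lazily (function names are illustrative, not Redis internals):

```python
import random
import time

# Sketch of periodic + lazy expiry (assumption: simplified model of Redis).
store = {}    # key -> value
expires = {}  # key -> absolute expiry timestamp

def set_with_ttl(key, value, ttl):
    store[key] = value
    expires[key] = time.monotonic() + ttl

def periodic_delete(sample_size=20):
    # Randomly sample keys with a TTL instead of scanning all of them,
    # so a huge keyspace does not hog the CPU.
    candidates = random.sample(list(expires), min(sample_size, len(expires)))
    now = time.monotonic()
    for key in candidates:
        if expires[key] <= now:
            store.pop(key, None)
            expires.pop(key, None)

def get(key):
    # Lazy deletion: an expired key that periodic deletion missed
    # is removed the moment someone touches it.
    if key in expires and expires[key] <= time.monotonic():
        store.pop(key, None)
        expires.pop(key, None)
        return None
    return store.get(key)

set_with_ttl("code:123", "4567", ttl=0.01)  # e.g. an SMS verification code
time.sleep(0.02)
print(get("code:123"))  # None: expired, lazily deleted on access
```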

What if periodic deletion misses a lot of expired keys, and you never access them, so lazy deletion never fires either? If a large number of expired keys pile up in memory, Redis's memory is exhausted. How is this solved?

Redis's memory eviction mechanism.

MySQL has 20 million rows but Redis holds only 200 thousand: how to ensure that the data in Redis is hot data

When the Redis in-memory dataset grows to a certain size, a data eviction policy is applied. Redis offers six data eviction policies:

  • volatile-lru: evict the least recently used keys from the set of keys with an expiration time (server.db[i].expires)
  • volatile-ttl: evict the keys closest to expiring from the set of keys with an expiration time (server.db[i].expires)
  • volatile-random: evict random keys from the set of keys with an expiration time (server.db[i].expires)
  • allkeys-lru: evict the least recently used keys from the whole dataset (server.db[i].dict)
  • allkeys-random: evict random keys from the whole dataset (server.db[i].dict)
  • noeviction: evicting data is prohibited
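The allkeys-lru idea can be sketched with an exact LRU cache built on an ordered dict. Note that real Redis uses approximate LRU via random sampling rather than an exact ordering like this, so the sketch shows the policy, not the implementation:

```python
from collections import OrderedDict

# Sketch of allkeys-lru eviction (assumption: simplified; real Redis uses
# approximate LRU sampling, not an exact ordering).
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")        # "a" becomes the most recently used key
cache.set("c", 3)     # capacity exceeded: "b" is evicted
print(cache.get("b"), cache.get("a"))  # None 1
```

Under this policy, whatever keys are touched often (the "hot data") survive, which is exactly why allkeys-lru answers the 20-million-vs-200-thousand question above.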

What are the differences between Memcache and Redis

  1. Storage

    1. Memcached stores all of its data in memory; after a power failure the data is gone, and the dataset cannot exceed the memory size.
    2. Redis can persist part of its data to the hard disk, ensuring durability.
  2. Supported data types

    1. Memcached's support for data types is relatively simple: strings only.
    2. Redis has rich data types. Besides simple K/V data, it also provides list, set, zset, hash, and other data structures.
  3. Different underlying models

    1. Their underlying implementations, and the application protocols they use to communicate with clients, are different.
    2. Redis built its own VM mechanism, because ordinary calls to system functions waste a certain amount of time moving and requesting memory.
  4. Cluster mode

    Memcached has no native cluster mode; it relies on the client to shard data across the cluster. Redis currently supports cluster mode natively.

  5. Threading model: Memcached uses a multi-threaded, non-blocking IO multiplexing network model, while Redis uses a single-threaded IO multiplexing model.

Redis persistence mechanism

Most of the time we need to persist data, that is, write the data in memory to the hard disk, mostly so it can be reused later (for example, after restarting the machine or recovering from a machine failure), or so it can be backed up to a remote location in case of a system failure.

Redis differs from Memcached in that it supports “persistence”, and it supports two different persistence operations. One Redis persistence method is called “snapshotting (RDB)”; the other is the “append-only file (AOF)”. Each has its advantages; below I’ll go into more detail about what they are, how to use them, and how to choose the right persistence method for you.

Snapshotting Persistence (RDB)

Redis can create snapshots to obtain a point-in-time copy of data stored in memory. After Redis has created a snapshot, you can back it up, copy it to another server to create a copy of the server with the same data (the Redis master-slave structure is used to improve Redis performance), or leave the snapshot in place for use when restarting the server.

Snapshot persistence is the default Redis persistence mode, and it is configured by default in the redis.conf configuration file:

```
save 900 1     # after 900 seconds (15 minutes), Redis triggers BGSAVE to create a snapshot if at least 1 key has changed
save 300 10    # after 300 seconds (5 minutes), Redis triggers BGSAVE to create a snapshot if at least 10 keys have changed
save 60 10000  # after 60 seconds (1 minute), Redis triggers BGSAVE to create a snapshot if at least 10,000 keys have changed
```

AOF (append-only file) persistence

Compared with snapshot persistence, AOF persistence offers “better real-time durability” and has therefore become the mainstream persistence scheme. Redis does not enable AOF (append-only file) persistence by default; it can be enabled with the appendonly parameter: appendonly yes

After AOF persistence is enabled, Redis writes every command that changes data to an AOF file on the hard disk. The AOF file is saved in the same location as the RDB file, which is set with the dir parameter. The default file name is appendonly.aof.

There are three different AOF persistence methods in Redis configuration files:

```
appendfsync always   # write to the AOF file on every data change; safest, but seriously slows Redis down
appendfsync everysec # explicitly sync write commands to disk once per second
appendfsync no       # let the operating system decide when to sync
```

To balance durability and write performance, users can consider the appendfsync everysec option, which has Redis sync the AOF file once per second with almost no performance impact; even in the event of a system crash, users lose at most the data generated within one second. When the hard drive is busy performing writes, Redis also gracefully slows itself down to match the drive's maximum write speed.

Redis 4.0 optimizes persistence

Redis 4.0 begins to support mixed RDB and AOF persistence (disabled by default; it can be enabled with the aof-use-rdb-preamble configuration item).

If mixed persistence is turned on, the contents of the RDB are written directly at the beginning of the AOF file when the AOF is rewritten. The advantage is that you combine the strengths of RDB and AOF: fast loading without losing too much data. There is of course a disadvantage: the RDB part inside the AOF is in a compressed format rather than the AOF text format, so readability is poor.

AOF rewrite

AOF rewriting creates a new AOF file that holds the same database state as the original AOF file, “but in a smaller size.”

“AOF rewriting” is a somewhat misleading name: it works by reading the current “key/value” pairs out of the database, and never needs to read, analyze, or write the existing AOF file.

When the BGREWRITEAOF command is executed, the Redis server maintains an AOF “rewrite buffer” that records all write commands the server executes while the child process is creating the new AOF file. “When the child process finishes creating the new AOF file, the server appends the entire contents of the rewrite buffer to the end of the new AOF file, so that the database state stored in the new file matches the old one.” Finally, the server replaces the old AOF file with the new one, completing the AOF rewrite.
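To make the append-and-replay idea behind AOF concrete, here is a toy append-only log, assuming a single SET command type (the file layout and function names are invented for illustration; the real AOF stores the Redis protocol):

```python
import os
import tempfile

# Sketch of append-only-file persistence: every write command is appended
# to a log, and replaying the log rebuilds the dataset (names are made up).
aof_path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")

def execute_and_log(state, command):
    op, key, value = command
    if op == "SET":
        state[key] = value
    with open(aof_path, "a") as f:      # append the command to the AOF
        f.write(f"{op} {key} {value}\n")

def replay(path):
    state = {}
    with open(path) as f:
        for line in f:                  # re-run every logged command in order
            op, key, value = line.split()
            if op == "SET":
                state[key] = value
    return state

live = {}
execute_and_log(live, ("SET", "lang", "python"))
execute_and_log(live, ("SET", "lang", "redis"))
recovered = replay(aof_path)            # simulates recovery after a restart
print(recovered["lang"])  # redis
```

The redundancy is also visible here: two logged commands collapse to one final value, which is exactly the waste that AOF rewriting removes by dumping the current key/value pairs instead.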

Redis transactions

Redis implements transactions through commands such as MULTI, EXEC, and WATCH. A transaction provides a mechanism for “packaging several command requests together and then executing them at once, in order. While a transaction is in progress, the server will not interrupt it to serve other clients' command requests”: it executes all of the transaction's commands first, and only then handles other clients' requests.

In traditional relational databases, the ACID properties are often used to assess the reliability and safety of the transaction feature. In Redis, transactions “always” have atomicity, consistency, and isolation, and when Redis runs in a particular persistence mode they also have durability.
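As a sketch of these MULTI/EXEC semantics, the toy model below queues commands and then applies them back to back (the `Transaction` class and `incrby` helper are made-up names; real Redis gets the no-interleaving guarantee from its single-threaded command loop, not from a lock):

```python
import threading

# Sketch of MULTI/EXEC semantics (assumption: simplified model).
store = {"balance": 100}
lock = threading.Lock()

class Transaction:
    def __init__(self):
        self.queue = []            # MULTI starts queuing commands

    def incrby(self, key, amount):
        self.queue.append((key, amount))

    def exec(self):
        # EXEC runs every queued command back to back; no other client's
        # command is processed in between.
        with lock:
            for key, amount in self.queue:
                store[key] = store.get(key, 0) + amount
        self.queue.clear()

tx = Transaction()        # MULTI
tx.incrby("balance", -30)
tx.incrby("fees", 1)
tx.exec()                 # EXEC applies both commands as one unit
print(store["balance"], store["fees"])  # 70 1
```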

What are the common performance issues with Redis? How to solve it?

  1. When the Master writes a memory snapshot, the SAVE command schedules the rdbSave function, which blocks the main thread. When snapshots are large, the main thread's performance suffers greatly and the service pauses intermittently, so the Master is advised not to write memory snapshots.
  2. Master AOF persistence: if the AOF file is never rewritten, this method has minimal performance impact, but the AOF file keeps growing. It is best for the Master not to do any persistence work at all, neither memory snapshots nor AOF log files, and in particular not to enable memory-snapshot persistence. If the data is critical, have a Slave enable AOF backup with the policy set to sync once per second.
  3. When the Master invokes BGREWRITEAOF to rewrite the AOF file, the rewrite consumes large amounts of CPU and memory, pushing the service load too high and causing brief pauses.
  4. Master-slave replication is another Redis performance concern: for replication speed and connection stability, the Slave and Master should reside on the same LAN.

Do you understand the Redis synchronization mechanism?

Redis uses master/slave synchronization. During the first synchronization, “the master node performs a bgsave” while recording subsequent modification operations in an in-memory buffer. When the bgsave completes, “the RDB file is fully synchronized to the replica”; after receiving it, the replica “loads the RDB image into memory”. Once loading is done, the replica notifies the master, which then sends over the buffered modification operations for the replica to replay.

Have you used Redis clustering? How does clustering work?

Redis Sentinel focuses on high availability and automatically promotes the slave to Master in the event of a master outage.

Redis Cluster focuses on scalability. When a single Redis memory is insufficient, a Cluster is used to fragment storage.

Solutions to cache avalanche and cache penetration

Cache avalanche

A large portion of the cache expires at the same time, so subsequent requests all fall on the database, which collapses under the flood of requests in a short period of time.

  • Beforehand: try to ensure the high availability of the whole Redis cluster and recover from machine outages as soon as possible; choose an appropriate memory eviction strategy.
  • During the incident: use a local EhCache cache plus Hystrix rate limiting and degradation to keep MySQL from being crushed.
  • Afterwards: use the data saved by the Redis persistence mechanism to restore the cache as soon as possible.

Cache penetration

Generally, hackers deliberately request data that does not exist in the cache, causing all requests to fall on the database, causing the database to collapse under a large number of requests in a short period of time.

Workaround: there are several effective solutions to cache penetration. The most common is a Bloom filter: hash all the data that could possibly exist into a sufficiently large “bitmap”, so that a query for nonexistent data is intercepted by the bitmap and never reaches the underlying storage system. A cruder approach (which we used) is: if a “query returns empty data” (whether because the data does not exist or because the system failed), we still cache the empty result, but with a “short expiration time” of at most five minutes.
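A minimal Bloom filter sketch for the penetration defense described above, with toy sizes (a production filter would size the bitmap and the number of hash functions for a target false-positive rate):

```python
import hashlib

# Minimal Bloom filter sketch (assumption: toy sizes, illustrative only).
BITS = 1024
bitmap = 0

def _positions(item):
    # Derive several bit positions from independently seeded hashes.
    for seed in (b"h1", b"h2", b"h3"):
        digest = hashlib.sha256(seed + item.encode()).digest()
        yield int.from_bytes(digest[:4], "big") % BITS

def add(item):
    global bitmap
    for pos in _positions(item):
        bitmap |= 1 << pos

def might_contain(item):
    # False means "definitely absent": the request can be rejected
    # before it ever touches the database.
    return all(bitmap >> pos & 1 for pos in _positions(item))

for user_id in ("user:1", "user:2", "user:3"):
    add(user_id)

# The first value is True; the second is almost certainly False
# (a small false-positive probability is the price of the bitmap).
print(might_contain("user:2"), might_contain("user:999999"))
```

The one-sided guarantee is the key property: a Bloom filter can report a false "maybe present", but never a false "absent", so legitimate keys are never blocked.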

How to solve the Redis concurrent competing Key problem

The so-called Redis concurrent key-competition problem is that multiple systems operate on the same key at the same time, but the operations do not execute in the order we expect, producing a wrong result!

One recommended solution: distributed locks (both ZooKeeper and Redis can implement a distributed lock). (If there is no concurrent competition on the key, do not use a distributed lock, as it hurts performance.)

Distributed locks can be built on ZooKeeper's ephemeral sequential nodes. The rough idea: when a client wants to lock a method, it creates a unique ephemeral sequential node under the directory node designated for that method on ZooKeeper. Determining who holds the lock is then very simple: the client whose node has the lowest sequence number holds it. Releasing the lock is just deleting that ephemeral node. This also avoids deadlocks caused by locks that cannot be released when a service goes down. When the business flow finishes, the client deletes its child node to release the lock.
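The lowest-sequence-number rule can be illustrated with a tiny in-process simulation (real code would use a ZooKeeper client library such as kazoo; everything below, including the node table, is a stand-in):

```python
import itertools

# Sketch of ZooKeeper sequential-node locking (assumption: in-process
# simulation; a dict stands in for the children of /locks/resource).
counter = itertools.count()
nodes = {}  # sequence number -> client name

def acquire(client):
    seq = next(counter)     # create an "ephemeral sequential node"
    nodes[seq] = client
    return seq

def holds_lock(seq):
    # The client owning the lowest-numbered node holds the lock.
    return seq == min(nodes)

def release(seq):
    nodes.pop(seq, None)    # deleting the node releases the lock

a = acquire("client-a")
b = acquire("client-b")
print(holds_lock(a), holds_lock(b))  # True False
release(a)                           # client-a finishes; its node disappears
print(holds_lock(b))                 # True: client-b now holds the lock
```

In real ZooKeeper the node is ephemeral, so a crashed client's node disappears automatically when its session ends, which is what prevents deadlock.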

In practice, of course, reliability takes priority. That is why ZooKeeper comes first.

How to ensure data consistency between the cache and the database in dual write

As soon as you use a cache, you may be storing data in both the cache and the database, and as soon as you double-write, data consistency problems can arise. So how do you solve the consistency problem?

In general, if your system does not strictly require the cache and the database to be consistent, and the cache is allowed to occasionally drift slightly out of sync with the database, then it is best not to adopt the serialization scheme: serializing read and write requests into a single in-memory queue does guarantee that no inconsistency ever appears, but at a real cost.

Serialization drastically reduces the system's throughput: supporting the same load online can require several times more machines than normal.
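A sketch of that serialization scheme: one worker thread drains a single queue, so the cache and the database are always updated in the same order (all names are illustrative, and the throughput cost is visible in the fact that every request waits its turn):

```python
import queue
import threading

# Sketch of serializing reads and writes through one in-memory queue
# (assumption: simplified single-process model, made-up helper names).
database, cache = {}, {}
requests = queue.Queue()

def worker():
    while True:
        op, key, value, reply = requests.get()
        if op == "write":
            database[key] = value
            cache[key] = value  # cache and DB updated together, in order
        reply.put(cache.get(key, database.get(key)))
        requests.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(op, key, value=None):
    reply = queue.Queue()
    requests.put((op, key, value, reply))
    return reply.get()          # block until this request is processed

submit("write", "stock", 10)
submit("write", "stock", 9)
result = submit("read", "stock")
print(result)  # 9: reads always see writes in submission order
```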

Creating content is not easy; if you think this helped, please give it a small star. GitHub address 😁 😁 😁