1. How does Redis Cluster work? How are keys addressed in cluster mode? What are the distributed addressing algorithms? Do you know the consistent hash algorithm?

Interviewer psychoanalysis

In earlier years, if you wanted to run several Redis nodes with each node storing part of the data, you had to rely on middleware such as Codis or TwemProxy. You read and write through the Redis middleware, and the middleware is responsible for distributing your data across the Redis instances on multiple machines.

Over the past couple of years, Redis has kept evolving and releasing new versions. With the current Redis Cluster mode, you can deploy multiple Redis instances across multiple machines, each instance storing part of the data. At the same time, each Redis master instance can have slave instances attached, so that if a master instance fails, the cluster automatically switches over to one of its slaves.

With the new versions of Redis, everyone uses Redis Cluster, the clustering mode natively supported by Redis, so the interviewer will definitely grill you on it for a few rounds. If you haven't used Redis Cluster, that's fine; many people still use Codis or other client-side clustering solutions, but you should at least study Redis Cluster.

If you have a small amount of data and your main concerns are high concurrency and high performance — say your cache is only a few gigabytes — a single machine is enough. Use replication: one master with several slaves, the number of slaves depending on the read throughput you need, and then build a Sentinel cluster to guarantee the high availability of the master-slave architecture.

Redis Cluster is mainly for scenarios with massive data + high concurrency + high availability. A Redis cluster supports N Redis master nodes, and each master node can mount multiple slave nodes. This lets Redis scale horizontally: if you want to cache more data, expand the number of master nodes, and the cluster as a whole can hold more data.

Analysis of interview questions

Redis cluster is introduced

  • Automatically shards data, placing part of the data on each master
  • Provides built-in high availability, so the cluster can keep working when some masters are unavailable

In the Redis Cluster architecture, each Redis node exposes two ports: the normal service port, e.g. 6379, and that port plus 10000, e.g. 16379.

Port 16379 is used for communication between nodes, i.e. the cluster bus, which handles failure detection, configuration updates, and failover authorization. The cluster bus uses a separate binary protocol, the Gossip protocol, for efficient metadata exchange between nodes, consuming less network bandwidth and processing time.

Internal communication mechanism between nodes

Basic Communication Principles

Cluster metadata can be maintained in two modes: centralized mode and Gossip protocol. Redis Cluster nodes communicate with each other using the Gossip protocol.

Centralized mode means storing the cluster metadata (node information, failures, and so on) on a single node. A good example of centralized metadata storage is Storm in the big-data space: a distributed real-time computing engine whose metadata is kept in a centralized structure, with ZooKeeper (distributed coordination middleware) underneath storing and maintaining all of it.

 

Redis maintains cluster metadata the other way, with the Gossip protocol: every node holds a copy of the metadata, and when metadata changes on one node, that node keeps sending the change to the other nodes so that they update their copies as well.

 

The advantage of the centralized mode is that reads and updates of metadata have very good timeliness: once metadata changes, it is immediately written to the centralized store, and other nodes see it as soon as they read. The downside is that all the update pressure is concentrated in one place, which can put pressure on the metadata store.

The advantage of gossip is that metadata updates are scattered rather than concentrated in one place; update requests trickle out to all nodes over time, reducing the pressure. The downside is that metadata updates are delayed, which can cause some cluster operations to lag.

Port offset of 10000: each node has a dedicated port for node-to-node communication, which is its service port plus 10000. For example, if a node serves on port 7001, port 17001 is used for communication between nodes. Each node sends ping messages to several other nodes at regular intervals, and the receiving nodes reply with pong.

Information exchanged: Information includes fault information, node addition and deletion, hash slot information, and so on.

Gossip protocols

The Gossip protocol contains multiple messages, including ping, pong, meet, fail, and so on.

  • Meet: a node sends a meet message to a new node to bring it into the cluster, after which the new node starts communicating with the other nodes. redis-trib.rb add-node sends a gossip meet message to the newly added node, telling it to join our cluster.
  • Ping: Each node frequently sends ping messages to other nodes, including its own status and cluster metadata, to exchange metadata with each other.
  • Pong: the reply to ping and meet messages, containing the node's own status and other information; also used to broadcast and update information.
  • Fail: After a node determines that another node fails, it sends a fail message to other nodes to inform them that a node is down.

A deeper look at ping messages

Some metadata must be carried during pings. If pings are frequent, the network may be overloaded.

Each node sends 10 pings per second, each time selecting 5 other nodes it has not communicated with for the longest time. In addition, if the communication delay with some node reaches cluster-node-timeout / 2, that node is pinged immediately to avoid a long data-exchange delay; otherwise, two nodes might go, say, 10 minutes without exchanging data, leaving the cluster metadata severely inconsistent. So cluster-node-timeout can be tuned: setting it larger reduces the ping frequency.

Each ping carries the information of the sending node plus information about 1/10 of the other nodes, and sends it out for exchange. It contains information about at least 3 other nodes, and at most the total number of nodes minus 2.

Distributed addressing algorithm

  • Hash algorithm (massive cache reconstruction)
  • Consistent Hash algorithm (automatic cache migration) + Virtual Node (automatic load balancing)
  • Hash Slot algorithm of redis cluster

The hash algorithm

Given a key, first compute its hash value, then take it modulo the number of nodes to pick a master node. Once a master node goes down, all requests re-compute the modulo against the remaining number of nodes when fetching data. As a result, most incoming requests find nothing valid in the cache, and a large amount of traffic floods the database.
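Below is a minimal sketch of this mod-N addressing in Java; the class and the node names are purely illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of mod-N addressing: hash the key, take it modulo the node count.
public class ModuloRouter {
    private final List<String> masters;

    public ModuloRouter(List<String> masters) {
        this.masters = masters;
    }

    public String route(String key) {
        int hash = key.hashCode() & 0x7FFFFFFF;       // keep the hash non-negative
        return masters.get(hash % masters.size());    // pick a master by modulo
    }

    public static void main(String[] args) {
        ModuloRouter threeNodes = new ModuloRouter(Arrays.asList("m1", "m2", "m3"));
        ModuloRouter twoNodes = new ModuloRouter(Arrays.asList("m1", "m2"));
        // After one node dies, the modulo changes for most keys, so most cache lookups miss.
        System.out.println(threeNodes.route("order:1001") + " vs " + twoNodes.route("order:1001"));
    }
}
```

When the node count drops from 3 to 2, hash % size changes for roughly two thirds of all keys, which is exactly why a single node failure under this scheme sends most requests straight to the database.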

 

Consistent hash algorithm

The consistent hash algorithm organizes the entire hash value space into a virtual circle, ordered clockwise. Each master node is then hashed (using the server's IP address or host name), which determines its position on the hash ring.

Given a key, compute its hash value and locate its position on the ring, then walk clockwise from that position: the first master node encountered is where the key lives.

In the consistent hash algorithm, if a node fails, the data affected is only the data between this node and the previous node in the ring space (the first node encountered by walking counterclockwise), nothing else is affected. The same goes for adding a node.

When there are too few nodes, the consistent hash algorithm easily creates cache hot spots because keys are unevenly distributed among the nodes. To address this, the consistent hash algorithm introduces virtual nodes: each node is hashed multiple times and a virtual node is placed at each resulting position on the ring. This spreads the data evenly and balances the load.
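Here is a minimal sketch of a consistent hash ring with virtual nodes, using a sorted map as the ring; the virtual-node count of 160 and the MD5-based positioning are illustrative choices, not a fixed standard.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring with virtual nodes.
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 160;          // replicas per physical node (assumption)
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);          // place each virtual node on the circle
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // Walk clockwise from the key's position; the first node met owns the key.
    public String route(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        Long slot = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(slot);
    }

    private long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {                  // first 8 bytes as a position on the ring
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Removing a node only affects the keys between that node and its counterclockwise predecessor, and the many virtual nodes keep those keys from all landing on a single neighbor.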

 

Hash Slot algorithm of redis cluster

A Redis cluster has a fixed number of 16384 hash slots. For each key, it computes the CRC16 value and takes it modulo 16384 to get the slot the key belongs to.

Each master in a Redis cluster holds a portion of the slots; for example, with three masters, each master might hold more than 5000 hash slots. Hash slots make it easy to add and remove nodes: adding a master moves some hash slots from the other masters to it, and removing a master moves its hash slots to the remaining masters. The cost of moving hash slots is very low. On the client side, you can use hash tags to force specified keys into the same slot.

If any machine goes down, the other nodes are unaffected, because keys map to hash slots, not to machines.
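The key-to-slot mapping can be sketched as follows (Redis uses the CRC16-CCITT/XMODEM variant; honoring the {...} hash tag is what lets related keys land in the same slot). The class and key names here are illustrative.

```java
// Sketch of Redis Cluster key-to-slot mapping: CRC16(key) mod 16384, honoring hash tags.
public class HashSlot {
    public static int slotFor(String key) {
        // If the key contains a non-empty {...} section, only that part is hashed.
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes()) % 16384;
    }

    // CRC16-CCITT (XMODEM), the variant Redis uses for slot hashing.
    static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        // Both keys hash only the tag "42", so they map to the same slot.
        System.out.println(slotFor("user:{42}:profile") == slotFor("user:{42}:orders")); // true
    }
}
```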

 

High availability and active/standby switchover of redis Cluster

The high availability principle of Redis Cluster is almost the same as Sentinel's.

Check whether the node is down

If one node thinks another node is down, that is a pfail — a subjective failure. If multiple nodes think the node is down, it becomes a fail — an objective failure. This is almost the same as Sentinel's sdown and odown.

If a node does not return a pong within cluster-node-timeout, it is considered pfail.

If a node considers another node pfail, it will report this in the gossip ping messages it sends to other nodes. If more than half of the nodes consider that node pfail, its state is upgraded to fail.

Slave node filtering

If the master node is down, select one of its slave nodes to become the master node.

Check how long each slave node has been disconnected from the master. If the disconnection time exceeds cluster-node-timeout * cluster-slave-validity-factor, the slave node is not eligible to become master.

Slave node election

Each slave node sets an election delay based on its replication offset from the master: the slave with the larger offset (more replicated data) gets an earlier election time and is therefore elected preferentially.

All master nodes vote in the slave election. If the majority of master nodes (N/2 + 1) vote for a slave node, then the election passes and that slave node becomes the master.

The slave node then performs the failover and takes over as the new master.

Comparison with Sentinel

The whole process is very similar to Sentinel, so Redis Cluster is effectively a powerful, direct integration of replication and Sentinel functionality.

2. What are cache avalanche, cache penetration, and cache breakdown in Redis? What happens when Redis crashes? How does the system deal with this? How do you deal with Redis penetration?

Interviewer psychoanalysis

This is actually a must-ask question about caching, because cache avalanche and cache penetration are two of the biggest cache problems: they either never occur, or when they do occur they are fatal, so the interviewer will definitely ask you about them.

Analysis of interview questions

Cache avalanche

Take system A, with a daily peak of 5,000 requests per second. At peak times the cache could handle 4,000 of those requests per second, but then the cache machines unexpectedly go down completely. With the cache gone, all 5,000 requests per second hit the database, which cannot handle the load, raises alarms, and goes down. At this point, if no special measures are taken, the DBA frantically restarts the database, only for it to be killed again immediately by the new traffic.

This is cache avalanche.

 

About 3 years ago, a well-known Internet company in China lost tens of millions of dollars due to a cache accident, which led to an avalanche and the collapse of all the background systems.

The cache avalanche solution has three parts, before, during, and after the event:

  • Before: keep Redis highly available (master-slave + Sentinel, or Redis Cluster) to avoid a total crash.
  • During: local EhCache caching + Hystrix rate limiting & degradation, to keep MySQL from being killed.
  • After: enable Redis persistence, so that once Redis restarts it automatically reloads data from disk and quickly recovers the cached data.

 

The user sends a request. System A checks the local EhCache first; if not found, it checks Redis. If the data is in neither EhCache nor Redis, it queries the database and writes the result back to both EhCache and Redis.
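A minimal sketch of this three-level lookup is shown below; the KV interface and its ehCache / redis / db implementations are illustrative placeholders, not a concrete library API.

```java
// Sketch of the local cache -> Redis -> database fallback described above.
public class MultiLevelCache {
    interface KV {
        String get(String key);
        void put(String key, String value);
    }

    private final KV ehCache, redis, db;

    MultiLevelCache(KV ehCache, KV redis, KV db) {
        this.ehCache = ehCache;
        this.redis = redis;
        this.db = db;
    }

    public String get(String key) {
        String value = ehCache.get(key);                    // 1. local EhCache first
        if (value != null) return value;

        value = redis.get(key);                             // 2. then Redis
        if (value == null) {
            value = db.get(key);                            // 3. finally the database
            if (value != null) redis.put(key, value);       // write the result back to Redis
        }
        if (value != null) ehCache.put(key, value);         // and back to the local cache
        return value;
    }
}
```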

With the rate-limiting component you can set how many requests per second are allowed through. What about the requests that don't get through? They are degraded: return a default value, a friendly message, or a blank value.

Benefits:

  • The database never dies: the rate-limiting component ensures that only a limited number of requests per second get through to it.
  • As long as the database is alive, 2 out of every 5 requests can be handled.
  • As long as 2/5 of requests can be handled, the system is not dead; for the user it just means a few extra clicks before the page loads.

Cache penetration

Take system A again, with 5,000 requests per second, of which 4,000 turn out to be malicious requests from an attacker.

Those 4,000 malicious requests are for keys that are never in the cache, so every one of them goes to the database and finds nothing there either.

Here's an example: the database IDs start at 1, and the attacker sends requests whose IDs are all negative. Nothing gets cached, so every request queries the database directly. Cache penetration under this kind of malicious attack can kill the database.

 

 

Every time system A fails to find a value in the database, it writes a null placeholder to the cache, e.g. set -999 UNKNOWN, with an expiration time, so that the next time the same key is accessed, the answer comes straight from the cache until it expires.
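A minimal sketch of caching "not found" results, assuming a Jedis client; the 60-second TTL, the UNKNOWN marker, and the database stub are illustrative choices.

```java
import redis.clients.jedis.Jedis;

// Sketch of null-value caching to stop cache penetration.
public class NullValueCache {
    private static final String NOT_FOUND = "UNKNOWN";

    public String get(Jedis jedis, String key) {
        String cached = jedis.get(key);
        if (cached != null) {
            return NOT_FOUND.equals(cached) ? null : cached;   // a cached miss short-circuits the DB
        }
        String fromDb = queryDatabase(key);                    // placeholder for the real query
        // Cache the miss too, so repeated attacks on the same bogus id never reach the DB.
        jedis.setex(key, 60, fromDb == null ? NOT_FOUND : fromDb);
        return fromDb;
    }

    private String queryDatabase(String key) {
        return null;  // illustrative stub
    }
}
```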

Cache breakdown

Cache breakdown refers to the situation where a single very hot key is accessed at extremely high, concentrated concurrency; when that key expires, the flood of requests breaks through the cache and hits the database directly, like punching a hole through a barrier.

The solutions are also simple: either set hot data to never expire, or implement a mutex based on Redis or ZooKeeper so that only the first request rebuilds the cache and then releases the lock, after which the other requests can read the data from the cache by key.
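Here is a sketch of the Redis-based mutex approach, assuming a Jedis client; the lock key prefix, TTLs, retry interval, and database stub are illustrative assumptions.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Sketch of a mutex for cache breakdown using SET key value NX EX.
public class HotKeyLoader {
    public String get(Jedis jedis, String key) throws InterruptedException {
        String value = jedis.get(key);
        while (value == null) {
            String lockKey = "lock:" + key;
            // Only one request wins the lock and rebuilds the cache.
            String ok = jedis.set(lockKey, "1", SetParams.setParams().nx().ex(10));
            if ("OK".equals(ok)) {
                try {
                    value = queryDatabase(key);          // placeholder for the real query
                    jedis.setex(key, 300, value);        // rebuild the cache
                } finally {
                    jedis.del(lockKey);                  // release the mutex
                }
            } else {
                Thread.sleep(50);                        // others wait briefly and retry the cache
                value = jedis.get(key);
            }
        }
        return value;
    }

    private String queryDatabase(String key) {
        return "value-from-db";  // illustrative stub
    }
}
```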

3. How do you ensure consistency between the cache and the database when writing to both?

Interviewer psychoanalysis

As soon as you use a cache, you may have double storage and double writes between the cache and the database, and wherever you double-write you will have data consistency problems. So how do you solve the consistency problem?

Analysis of interview questions

In general, if slight inconsistency between the cache and the database is tolerable — that is, if your system does not strictly require "cache + database" consistency — it is best not to serialize read and write requests through an in-memory queue.

Serialization guarantees that no inconsistency will occur, but it also significantly reduces the system's throughput, requiring several times more machines than normal to support the same online load.

Cache Aside Pattern

The most classic cache + database read/write pattern is the Cache Aside Pattern:

  • On read: read the cache; if the data is not in the cache, read it from the database, put it into the cache, and return the response.
  • On update: update the database first, then delete the cache.
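A minimal sketch of the Cache Aside Pattern follows; the ProductDao and StringCache interfaces, the key format, and the 300-second TTL are illustrative assumptions.

```java
// Sketch of the Cache Aside Pattern: read-through on misses, update DB then delete cache on writes.
public class ProductService {
    interface ProductDao { String findById(String id); void update(String id, String v); }
    interface StringCache { String get(String k); void setex(String k, int ttlSeconds, String v); void del(String k); }

    private final ProductDao dao;
    private final StringCache redis;

    ProductService(ProductDao dao, StringCache redis) {
        this.dao = dao;
        this.redis = redis;
    }

    // Read path: cache first, fall back to the DB, then populate the cache.
    public String read(String id) {
        String value = redis.get("product:" + id);
        if (value == null) {
            value = dao.findById(id);
            if (value != null) {
                redis.setex("product:" + id, 300, value);
            }
        }
        return value;
    }

    // Write path: update the database first, then delete (not update) the cache.
    public void update(String id, String newValue) {
        dao.update(id, newValue);
        redis.del("product:" + id);
    }
}
```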

Why delete the cache instead of update it?

The reason is simple. A lot of times, in a more complex caching scenario, the cache is not just a value pulled straight out of the database.

For example, one field of a table may be updated, and the corresponding cache may need to query the data of the other two tables and perform operations to calculate the latest value of the cache.

Also, updating the cache can be costly. Does that mean every time you modify the database, you must also update the corresponding cache? In some scenarios yes, but in more complex scenarios the cached data is computed. If you frequently modify the tables a cache involves, the cache gets updated frequently. But the question is: is that cache actually accessed frequently?

For example, a field of a table involved in a cache is changed 20 or 100 times in a minute, so the cache is updated 20 or 100 times; but that cache is read only once in the same minute — it is mostly cold data. If instead you just delete the cache, it is recomputed at most once within that minute when it is actually read, and the overhead drops significantly. The cache is only computed when it is actually needed.

In fact, deleting the cache rather than updating it is the idea of lazy computation: don't redo a complex calculation every time, whether or not it is needed; recompute only when it actually needs to be used. MyBatis and Hibernate have the same idea with lazy loading. When you query a department, the department has a list of, say, 1000 employees; there is no need to fetch those 1000 employees every time you query the department. 80% of the time you only want the department information itself, so query the department first and only hit the database for the 1000 employees when you actually need to access them.

The most elementary cache inconsistency problem and solution

Problem: if you update the database first and then delete the cache, and the cache deletion fails, you end up with new data in the database and stale data in the cache, i.e. inconsistent data.

 

Solution: delete the cache first, then update the database. If the database update fails, the database still holds the old data and the cache is empty, so the data is not inconsistent: the next read misses the cache, reads the old data from the database, and repopulates the cache.

More complex data inconsistency problem analysis

The data changes: the cache is deleted, and the database is about to be modified but the modification has not completed yet. A request comes in, reads the cache, finds it empty, queries the database, gets the old pre-modification data, and puts it into the cache. Then the data-change process finishes modifying the database. Now the data in the database differs from the data in the cache...

Why does this problem occur when hundreds of millions of traffic are concurrent?

This problem can only occur when a piece of data is read and written concurrently. If your concurrency is very low, especially read concurrency — say 10,000 visits per day — the kind of inconsistency just described will very rarely happen. But if the daily traffic is in the hundreds of millions with tens of thousands of concurrent reads per second, then as long as there are data-update requests every second, the database + cache inconsistency described above can occur.

Solutions are as follows:

When data is updated, the operation is routed to an internal JVM queue based on the unique identifier of the data. When data is read and found missing from the cache, the "re-read data + refresh cache" operation is routed by the same unique identifier and sent to the same JVM queue.

Each queue has one worker thread, which takes the queued operations and executes them one by one. So for a data-change operation, the worker first deletes the cache, then updates the database; if a read request arrives before the update completes and misses the cache, it can put a "refresh cache" request into the queue, where it backlogs behind the update, and then synchronously wait for the cache refresh to complete.

There is an optimization point here: within one queue, it is pointless to string together multiple refresh-cache requests for the same data, so they can be filtered. If there is already a pending refresh-cache request in the queue, there is no need to enqueue another one; the new read can simply wait for the existing request to complete.

After the worker thread finishes the database modification of the previous operation, it executes the next operation, the cache refresh, which reads the latest value from the database and writes it into the cache.

If the read request is still within its waiting window and polling finds that the value has appeared in the cache, it returns it directly; if the read request has waited longer than a certain time, it reads the current (possibly old) value directly from the database.
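A rough sketch of this queue-serialization idea is shown below: operations on the same key are routed by hash to the same in-memory queue, each drained by a single worker thread. The queue count, queue capacity, 200ms wait budget, and the cache/database stubs are illustrative assumptions; the deduplication of refresh requests mentioned above is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of per-key serialization via in-memory queues and worker threads.
public class SerializedCacheUpdater {
    private static final int QUEUE_COUNT = 32;
    private final List<BlockingQueue<Runnable>> queues = new ArrayList<>();
    private final ExecutorService workers = Executors.newFixedThreadPool(QUEUE_COUNT);

    public SerializedCacheUpdater() {
        for (int i = 0; i < QUEUE_COUNT; i++) {
            BlockingQueue<Runnable> q = new ArrayBlockingQueue<>(1000);
            queues.add(q);
            workers.submit(() -> {                        // one worker per queue, ops run in order
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        q.take().run();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
    }

    private BlockingQueue<Runnable> queueFor(String key) {
        return queues.get((key.hashCode() & 0x7FFFFFFF) % QUEUE_COUNT);
    }

    // Data change: delete the cache, then update the database, strictly in queue order.
    public void update(String key, String newValue) {
        queueFor(key).add(() -> {
            deleteCache(key);
            updateDatabase(key, newValue);
        });
    }

    // Cache miss on read: enqueue a "refresh cache" op behind any pending update of the same key,
    // then poll the cache for a bounded time before falling back to the database.
    public String readOnMiss(String key) throws InterruptedException {
        queueFor(key).add(() -> reloadCacheFromDatabase(key));
        long deadline = System.currentTimeMillis() + 200;     // 200ms wait budget (assumption)
        while (System.currentTimeMillis() < deadline) {
            String v = readCache(key);
            if (v != null) return v;
            Thread.sleep(10);
        }
        return readDatabase(key);                             // timed out: read the DB directly
    }

    // Illustrative stubs for the real cache / database operations.
    private void deleteCache(String key) {}
    private void updateDatabase(String key, String v) {}
    private void reloadCacheFromDatabase(String key) {}
    private String readCache(String key) { return null; }
    private String readDatabase(String key) { return "db-value"; }
}
```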

In high concurrency scenarios, the following issues should be addressed in this solution:

(1) The read request is blocked for a long time

Because read requests are only lightly asynchronized, it is important to watch out for read timeouts: each read request must return within the timeout. The biggest risk in this solution is that data may be updated so frequently that a large number of update operations back up in the queue, causing a large number of read timeouts and sending a large number of requests straight to the database. Be sure to run realistic tests to see how frequently data is actually updated.

Also, because a single queue may back up update operations for multiple data items, you need to test against your own business profile and may need to deploy multiple services, each sharing a portion of the data-update load. If 100 inventory-modification operations for different products are squeezed into one memory queue and each modification takes 10ms to complete, the read request for the last product may wait 10 * 100 = 1000ms = 1s before getting its data, which is a long block for a read request.

You must run stress tests based on how the business system actually operates, simulating the production environment, to see how many update operations may back up in a memory queue at the busiest time and how long the read request corresponding to the last update would hang. If read requests must return within 200ms, and your calculation shows that even at the busiest time there are at most 10 backlogged updates, waiting at most 200ms, that is fine.

If a memory queue is likely to back up a particularly large number of updates, add machines: each service instance then processes a smaller share of the data, and each memory queue backs up fewer updates.

In fact, based on experience from previous projects, the data-write frequency is usually very low, so the backlog in the queues should be very small. For projects like this — high read concurrency with a read-through cache architecture — write requests are generally few; a write QPS of a few hundred per second is already considerable.

Let’s actually do a rough calculation.

Say there are 500 writes per second. Divide each second into five 200ms slices: that is 100 writes per 200ms, spread over 20 memory queues, so each queue may back up about 5 writes. If performance tests show each write operation takes about 20ms, then a read request against any queue hangs for at most about 100ms and will certainly return within 200ms.

From this rough calculation we know that a single machine supporting a write QPS of a few hundred is no problem. If the write QPS scales up 10x, scale the machines up 10x as well, each machine with 20 queues.

(2) The concurrency of read requests is too high

There is another risk: a sudden flood of read requests all arriving within tens of milliseconds and hanging on the service. The question is whether the service can hold up, and how many machines are needed to absorb the peak in the worst case.

However, not all data is updated at the same time, and the cache does not expire all at once. Each time, only the cache of a small number of data items becomes invalid, and only the read requests for those items arrive, so the concurrency should not be very large.

(3) Request routing for multi-service instance deployment

Since multiple instances of the service may be deployed, you must ensure that data-update requests and cache-refresh requests for the same data are routed through the Nginx server to the same service instance.

For example, route all read and write requests for the same product to the same machine. You can do your own hash routing between services based on a request parameter, or use Nginx's hash routing feature, etc.

(4) Routing problems of hot commodities lead to request skew

If the read and write requests for a certain product are so hot that they all go to the same queue on the same machine, that machine may come under too much pressure. But since the cache is only cleared when the product data is updated, which is what triggers the read/write concurrency, the severity depends on the business: if the update frequency is not too high, the impact is not especially big, though the load on a few machines may indeed be higher.

4. What are the concurrent contention problems in Redis? How do you solve them? Do you know the CAS scheme in Redis transactions?

Interviewer psychoanalysis

This is also a very common problem in production: multiple clients write the same key concurrently, and data that should have arrived first may arrive later, leaving the wrong version of the data; or multiple clients read the same key, modify the value, and write it back, and as soon as the order goes wrong, the data is wrong.

And Redis has its own CAS-like optimistic locking scheme that naturally solves this problem.

Analysis of interview questions

At a given moment, multiple system instances are updating the same key. You can implement distributed locking based on ZooKeeper: each system instance acquires the distributed lock through ZooKeeper, ensuring that only one instance at a time can operate on a given key, and nobody else is allowed to read or write that key.
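A minimal sketch of the ZooKeeper-based distributed lock using Apache Curator is shown below; the connection string, lock path, and retry policy are illustrative assumptions.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Sketch: one lock path per key, so only one instance at a time updates that key.
public class ZkKeyLock {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessMutex lock = new InterProcessMutex(client, "/locks/product:1001");
        lock.acquire();                       // only one system instance holds the lock at a time
        try {
            // read the latest value from MySQL and write it to Redis here
        } finally {
            lock.release();                   // always release so other instances can proceed
        }
        client.close();
    }
}
```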

 

The data you write to the cache is always fetched from MySQL, and when it was written to MySQL a timestamp was saved alongside it, so when you query the data from MySQL, the timestamp comes out with it.

Before writing to the cache, check whether the timestamp of the value you are about to write is newer than the timestamp of the value currently in the cache. If it is, write it; otherwise do not overwrite newer data with older data.
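A sketch of this timestamp check combined with Redis's optimistic WATCH/MULTI transaction is given below, assuming a Jedis client; the "timestampMillis|payload" value format is an illustrative convention, not a Redis feature.

```java
import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

// Sketch of a CAS-style guarded write: only overwrite the cached value if ours is newer.
public class TimestampGuardedWriter {
    public boolean writeIfNewer(Jedis jedis, String key, long newTs, String newPayload) {
        jedis.watch(key);                                  // abort the transaction if the key changes
        String current = jedis.get(key);
        if (current != null) {
            long cachedTs = Long.parseLong(current.split("\\|", 2)[0]);
            if (newTs <= cachedTs) {                       // incoming data is older: do not overwrite
                jedis.unwatch();
                return false;
            }
        }
        Transaction tx = jedis.multi();
        tx.set(key, newTs + "|" + newPayload);
        List<Object> result = tx.exec();                   // a null/empty result means another writer won
        return result != null && !result.isEmpty();
    }
}
```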

5. How is Redis deployed in production?

Interviewer psychoanalysis

The interviewer wants to see whether you understand the deployment architecture of your company's Redis production cluster. If you don't, you really have been remiss. Is it a master-slave architecture? A cluster architecture? Which clustering scheme is used? Is there a high-availability guarantee? Is persistence enabled so data can be recovered? How many gigabytes of memory are given to Redis online? What parameters are set? How many QPS did your Redis cluster carry in the stress tests?

Dude, you have to know this stuff, or you really haven’t thought it through.

Analysis of interview questions

The Redis cluster has 10 machines: 5 deploy Redis master instances and the other 5 deploy the corresponding slaves, so each master instance has one slave. The 5 masters provide read and write services externally, and the peak QPS of each node may reach 50,000 per second, so the five machines can handle at most 250,000 read/write requests per second.

What is the machine configuration? 32G memory + 8-core CPU + 1T disk, but only 10G of memory is allocated to the Redis process. In a typical online production environment, the memory of a single Redis instance should, as far as possible, not exceed 10G; exceeding that may cause problems.

Five machines provide external reading and writing, with a total of 50G of memory.

Because each master instance has a slave instance, the setup is highly available: if any master goes down, it fails over automatically, and Redis promotes the slave to master so it can keep serving reads and writes.

What data is written into memory, and how big is each item? Commodity data, about 10KB per item: 100 items are 1MB, and 100,000 items are 1GB. About 2 million items are resident in memory, occupying 20GB, which is less than half of the total 50GB. The current peak is around 3,500 requests per second.

In large companies, there is an infrastructure team responsible for the operation and maintenance of the cache cluster.


