
Whenever a database serves a large number of users, hotspots are not uncommon. In Redis, a Key on a single shard that is accessed far more frequently than others is called a hot key (hotspot). In this article, we discuss the common causes of hotspots, assess their impact, and propose effective solutions for dealing with them.

Common causes of hot spots

Cause 1: The size of user consumption data is much larger than the size of production data, as with popular items, popular news, popular comments, and celebrity live broadcasts.

Unexpected events occur in daily work and life, such as a popular item going on sale or promotion on a particular day. When a single item is viewed or purchased tens of thousands of times, demand becomes highly concentrated, and this can lead to hotspot problems.

Similarly, hot news, hot comments, celebrity live streams, and so on are published once and then viewed by huge numbers of users. These typical read-heavy, write-light scenarios also create hotspot problems.

Cause 2: The requests on a single shard exceed the performance threshold of a single server.

When data is stored on the server, it is usually split into shards or partitions, so each Key is accessed on a specific server. A hot-key problem occurs when the access traffic for that Key exceeds the performance threshold of the server that holds it.

The impact of hotspot problems

  • The traffic is concentrated and reaches the upper limit of the physical network adapter.

  • Too many requests are queued, causing the sharding service of the cache to crash.

  • The database is overloaded, causing an avalanche of services.

As mentioned earlier, when the hotspot traffic on a server exceeds the limit of its network adapter, the traffic becomes so concentrated that the server can no longer provide its other services.

If hotspots are densely distributed, a large number of hot keys accumulate in the cache and exhaust its capacity, causing the cache's sharding service to crash.

After the cache service crashes, new requests fall through to the backend database. Because the database has far lower throughput than the cache, it can easily be overwhelmed by a large number of requests, leading to a service avalanche and a significant performance degradation.

Recommended Solutions

A common solution is to improve performance by adding a cache layer on the server side or on the client side.

Server caching solution

The client sends requests to the server. Assuming the server is a multi-threaded service, it can maintain a local cache space governed by an LRU eviction policy.

When the server becomes congested, it serves requests directly from this cache instead of continuing to forward them to the database. Only after the congestion is cleared does the server send the client's requests to the database again and write the returned data back to the cache, accessing and rebuilding the cache in the process.
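As a rough illustration of the local LRU cache this scheme relies on, the sketch below shows a server-side read path that serves hits from a small in-process cache and only falls back to the database on a miss. The cache size and the `fetch_from_database` stand-in are assumptions for illustration, not part of any particular product.

```python
from collections import OrderedDict


class LRUCache:
    """A minimal in-process LRU cache for hot values."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)   # Mark as most recently used.
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # Evict the least recently used entry.


local_cache = LRUCache(capacity=1024)


def fetch_from_database(key):
    # Stand-in for the real (and much slower) database lookup.
    return f"db-value-for-{key}"


def handle_read(key):
    """Serve a read from the local cache when possible, otherwise rebuild it."""
    value = local_cache.get(key)
    if value is not None:
        return value                     # Cache hit: the database is not touched.
    value = fetch_from_database(key)     # Cache miss: go to the database...
    local_cache.put(key, value)          # ...and write the result back to the cache.
    return value
```

The same structure also explains the problems listed next: if many threads miss on the same key at once, they all hit the database and try to rebuild the cache concurrently.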

However, this solution also has the following problems:

  • Cache-building contention in a multi-threaded service when a cached entry becomes invalid

  • Cache-building problems when the requested data is missing from the cache

  • Dirty reads

“MemCache + Redis” solution

In this solution, a separate cache is deployed on the client side to address hot spots.

With this solution, the client first accesses the service layer and then the cache layer on the same host.

This solution has the following advantages: data is accessed close to the client, access is fast, and there is no bandwidth bottleneck. However, it also has the following disadvantages:

  • Memory resources are wasted

  • Dirty reads
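A minimal sketch of this client-side read path, assuming a local Memcached instance on the same host and a remote Redis store behind it; the addresses, the short TTL, and the use of the `pymemcache` and `redis` Python clients are illustrative choices. The local copy can lag behind Redis until it expires, which is where the dirty-read and wasted-memory drawbacks above come from.

```python
from pymemcache.client.base import Client as MemcacheClient
import redis

# Local cache on the same host as the service (illustrative address).
local_cache = MemcacheClient(("127.0.0.1", 11211))
# Remote Redis behind the proxy/SLB (illustrative address).
remote = redis.Redis(host="redis.internal.example", port=6379)

LOCAL_TTL_SECONDS = 5  # A short TTL limits how long a dirty read can persist.


def get_value(key):
    """Read through the local Memcached cache, falling back to Redis."""
    value = local_cache.get(key)
    if value is not None:
        return value                       # Near access: no network hop to Redis.
    value = remote.get(key)                # Miss: fetch from the shared Redis store.
    if value is not None:
        # Backfill the local cache; memory is spent on every host that reads the key.
        local_cache.set(key, value, expire=LOCAL_TTL_SECONDS)
    return value
```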

Local cache solution

Using local caching causes the following problems:

  • Hot spots must be detected in advance.

  • Cache capacity is limited.

  • The inconsistency lasts a long time.

  • Hotspot detection is incomplete.

If traditional hotspot solutions are flawed, how do you solve the hotspot problem?

Read/write separation solution

This solution solves the hotspot read problem. The following describes the functions of the different nodes in the architecture:

  • Load balancing is implemented at the SLB layer.

  • Implement read/write separation and automatic routing at the proxy layer.

  • Write requests are handled by the primary node.

  • Read requests are handled by read-only nodes.

  • HA is implemented on both slave and master nodes.

In effect, the client sends requests to the SLB, which distributes them across multiple proxies. Each proxy then identifies and classifies the requests and routes them onward.

For example, the proxy sends all write requests to the master node and all read requests to the read-only nodes. The read-only nodes can also be scaled out, which effectively solves the hotspot-read problem. Read/write separation thus provides flexible read-capacity scaling for hotspots, can store a large number of hot keys, and is friendly to clients.
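The sketch below illustrates the kind of routing the proxy layer performs, assuming one writable master connection, a pool of read-only replica connections, and a simple read-command whitelist; the endpoints and the command set are illustrative rather than taken from any real proxy.

```python
import random
import redis

# Illustrative endpoints: one writable master, several read-only replicas.
master = redis.Redis(host="redis-master.internal.example", port=6379)
replicas = [
    redis.Redis(host=f"redis-replica-{i}.internal.example", port=6379)
    for i in range(3)
]

# Commands the proxy treats as reads; everything else goes to the master.
READ_COMMANDS = {"GET", "MGET", "EXISTS", "TTL", "HGET", "HGETALL", "LRANGE", "ZRANGE"}


def route(command, *args):
    """Send reads to a randomly chosen replica and writes to the master."""
    if command.upper() in READ_COMMANDS:
        node = random.choice(replicas)     # Hot reads spread across the replicas.
    else:
        node = master                      # Writes always land on the master.
    return node.execute_command(command, *args)


# Example: the write lands on the master, the read on some replica.
# route("SET", "item:42", "hot value")
# route("GET", "item:42")
```

Adding more replicas to the pool is what provides the flexible read-hotspot capacity described above.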

Hot data solution

In this solution, hotspots are discovered and stored to resolve hotspot problems.

Specifically, the client accesses the SLB, which distributes requests to the proxies. Each proxy then routes the requests to the backend Redis nodes.

In addition, a cache is added on the server side.

Specifically, a local cache is added to the proxy. This cache uses the LRU algorithm to hold hotspot data. In addition, a hotspot-data calculation module is added to the backend database nodes so that they can identify and return hotspot data.
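As a rough sketch of the proxy-side read path under this design, the code below serves keys that the backend has flagged as hot from a proxy-local LRU cache and forwards everything else to Redis. The `hot_keys` set stands in for the feedback from the backend's hotspot calculation module, and the backend address and cache size are assumptions.

```python
import redis
from cachetools import LRUCache

backend = redis.Redis(host="redis-backend.internal.example", port=6379)

hot_keys = set()                    # Keys flagged as hot by the backend (fed back periodically).
hot_cache = LRUCache(maxsize=512)   # Proxy-local LRU cache holding only hot values.


def proxy_get(key):
    """Serve hot keys from the proxy's local cache; forward everything else to Redis."""
    if key in hot_keys and key in hot_cache:
        return hot_cache[key]            # Hot hit: Redis is not contacted at all.
    value = backend.get(key)             # Normal path: go to the backend Redis.
    if key in hot_keys and value is not None:
        hot_cache[key] = value           # Populate the local cache for future hot reads.
    return value
```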

The main advantages of the proxy architecture are:

  • The proxy caches hotspot data locally and has a horizontally scalable read capability.

  • The database node periodically computes hotspot data sets.

  • The database feeds hot data back to the agent.

  • The proxy architecture is completely transparent to the client, so no compatibility changes are required on the client side.

Dealing with hotspot data

Reading Hotspot Data

Hotspot processing is divided into two parts: writing and reading. During a write, the SLB receives a piece of data K1 and writes it to the Redis database through the proxy.

If the background hotspot module determines that K1 has become a hotspot, the proxy caches K1 locally. The next time the client accesses K1, the request is served directly by the proxy and bypasses Redis.

Finally, because the proxy layer can scale horizontally, the read capacity for hotspot data can be expanded almost without limit.
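One possible sketch of this feedback step, assuming the backend hotspot module publishes its current hot-key set under a well-known Redis key; the key name `hotspot:keys`, the refresh interval, and the cache size are illustrative assumptions. The proxy periodically pulls the set and warms its local cache so that subsequent reads of a key such as K1 are answered without touching Redis.

```python
import time
import redis
from cachetools import LRUCache

backend = redis.Redis(host="redis-backend.internal.example", port=6379)
hot_cache = LRUCache(maxsize=512)

REFRESH_SECONDS = 1               # How often the proxy refreshes its view of hot keys.
HOTSPOT_SET_KEY = "hotspot:keys"  # Assumed location of the backend's hot-key feedback.


def refresh_hot_cache():
    """Pull the backend's hot-key set and preload those values into the local cache."""
    for raw_key in backend.smembers(HOTSPOT_SET_KEY):
        key = raw_key.decode()
        value = backend.get(key)
        if value is not None:
            hot_cache[key] = value   # Later reads of this key are served by the proxy.


def feedback_loop():
    while True:
        refresh_hot_cache()
        time.sleep(REFRESH_SECONDS)
```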

Discovering Hotspot Data

During discovery, the database first counts the requests that occur within a statistical period. When the number of requests for a Key reaches the threshold, the database identifies it as a hotspot and stores it in an LRU list. When the client then accesses that data by sending a request to the proxy, Redis enters the feedback phase: it detects that the requested Key is a hotspot and flags it accordingly.

The database calculates hotspots using the following methods:

  • Hotspot statistics based on statistical thresholds.

  • Hotspot statistics based on the statistical period.

  • Version-number-based statistics collection, which does not require resetting initial values.

  • Compute hot spots on the database with minimal performance impact and lightweight memory footprint.
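The sketch below illustrates the threshold- and period-based statistics in a deliberately simplified form: per-key counters are kept for the current period, and any key whose count crosses the threshold is placed in an LRU list of hotspots. The threshold, period length, and list size are illustrative; a real implementation runs inside the database node with a much lighter footprint.

```python
import time
from collections import Counter
from cachetools import LRUCache

THRESHOLD = 1000        # Requests per period before a key counts as hot (illustrative).
PERIOD_SECONDS = 10     # Length of one statistical period (illustrative).

counters = Counter()                # Per-key request counts for the current period.
hot_keys = LRUCache(maxsize=256)    # LRU list of keys currently considered hot.
period_start = time.monotonic()


def record_access(key):
    """Count one access and report whether the key is (now) a hotspot."""
    global period_start
    if time.monotonic() - period_start >= PERIOD_SECONDS:
        counters.clear()             # Start a new statistical period.
        period_start = time.monotonic()
    counters[key] += 1
    if counters[key] >= THRESHOLD:
        hot_keys[key] = True         # Crossed the threshold: remember it as hot.
    return key in hot_keys
```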

Comparison of solutions

As the previous analysis shows, both of these solutions improve on the traditional approaches to the hotspot problem. In addition, both the read/write separation and hot-data solutions support flexible capacity scaling and are transparent to clients, although neither guarantees 100% data consistency.

Read/write separation solutions support storing large volumes of hotspot data, while proxy-based hotspot data solutions are more cost-effective.

Reference links:

https://www.alibabacloud.com/blog/redis-hotspot-key-discovery-and-common-solutions_594446

https://medium.com/@Alibaba_Cloud/redis-hotspot-key-discovery-and-common-solutions-95474d27e0f8
