preface

In real projects I have seen peaks of 50,000+ QPS online, and traffic of 100,000+ QPS under load testing. The topic of this post is my thoughts on controlling high-concurrency traffic.

Some ideas for dealing with heavy traffic

First of all, what counts as heavy traffic? When we hear "high traffic" we tend to think of numbers: TPS (transactions per second), QPS (queries per second), 10,000+, 50,000+, 100,000+, 1,000,000+… In fact there is no absolute threshold: if a given volume puts pressure on the system and degrades its performance, that volume can be called heavy traffic. Secondly, what are the common means of dealing with it?

Caching: to put it bluntly, get hot data into the cache as early as possible, keep it close to the application, and avoid hammering the database with frequent accesses.

Degrading: if a feature is not on the core path, degrade it. For example, today's apps personalize what each user sees; under heavy traffic, that personalized ranking can be degraded to a simpler default ordering.

Rate limiting: as anyone who has ridden the Beijing subway in the morning rush hour knows, the station limits how many people may enter at once. The idea is the same: cap the number of requests admitted within a given period of time, so the system is not overwhelmed while its throughput stays as high as possible.

Note that in some cases caching and degrading cannot solve the problem. On an e-commerce shopping festival such as Singles' Day, purchasing and order placement involve a large number of writes, and these are core links that cannot be degraded. In such cases rate limiting matters most, so the rest of this post focuses on it.

Common rate-limiting algorithms

The common rate-limiting algorithms are: counters, sliding windows, leaky buckets, and token buckets.

counter

The counter is the simplest and most widely used rate-limiting algorithm, often applied at the interface level. Count requests within a fixed time window, compare the count with a threshold, and reset the counter to zero when the window rolls over.

The important caveat is the critical point at the window boundary. Suppose the limit is 100 requests per minute: no requests arrive between 12:01:00 and 12:01:58, then 100 requests arrive at 12:01:59 and all pass; the counter resets at 12:02:00, and another 100 requests arrive immediately and also pass. Around that one boundary the system has absorbed 200 requests in about a second, twice what it was configured for, and a malicious user can exploit exactly this.
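The fixed-window counter can be sketched in a few lines of Java. This is an illustrative single-machine sketch, not library code: the class name, and the convention of passing the current time in milliseconds to tryAcquire (which keeps the behavior deterministic and testable), are my own choices.

```java
// Fixed-window counter: at most `limit` requests per `windowMillis`.
// When a request arrives after the current window has elapsed, the
// window rolls over and the counter resets to zero.
class FixedWindowCounter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = Long.MIN_VALUE / 2; // "long ago": first call starts a window
    private int count = 0;

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // The caller supplies the current time in milliseconds.
    synchronized boolean tryAcquire(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) {
            windowStart = nowMillis; // new window: reset the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

Note how a burst just before the reset and another burst just after it would both be accepted, which is exactly the boundary problem described above.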

The sliding window

The sliding window algorithm was developed to fix the counter's boundary defect.

A sliding window divides the fixed time window into smaller slots and slides forward as time passes, which neatly avoids the counter's boundary problem. The counts in this fixed number of movable slots are summed and compared against the threshold, so the number of slots determines the accuracy of the algorithm: more slots, finer granularity.
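A minimal sliding-window sketch in the same style as the counter above (names and the millisecond-timestamp convention are my own): the window is split into a ring of slots, a slot is recycled when its data is stale, and the counts of all slots still inside the window are summed.

```java
// Sliding window: `windowMillis` is split into `buckets` slots; a request is
// admitted only if the total count over all slots still inside the window is
// below `limit`. More slots give a more accurate limit.
class SlidingWindowLimiter {
    private final int limit;
    private final long slotMillis;
    private final int[] counts;
    private final long[] slotStart; // start time of the data currently in each slot

    SlidingWindowLimiter(int limit, long windowMillis, int buckets) {
        this.limit = limit;
        this.slotMillis = windowMillis / buckets;
        this.counts = new int[buckets];
        this.slotStart = new long[buckets];
        java.util.Arrays.fill(slotStart, Long.MIN_VALUE / 2); // "long ago"
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlotStart = nowMillis / slotMillis * slotMillis;
        int idx = (int) ((nowMillis / slotMillis) % counts.length);
        if (slotStart[idx] != currentSlotStart) { // slot holds stale data: recycle it
            slotStart[idx] = currentSlotStart;
            counts[idx] = 0;
        }
        long window = slotMillis * counts.length;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (nowMillis - slotStart[i] < window) total += counts[i]; // still in window
        }
        if (total < limit) {
            counts[idx]++;
            return true;
        }
        return false;
    }
}
```

Unlike the fixed counter, a burst just before a minute boundary still counts against requests just after it, because the old slots have not yet slid out of the window.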

The leaky bucket

Although the sliding window avoids the boundary problem, it still relies on the notion of time slices; the leaky bucket algorithm goes a step further. Picture a bucket of fixed size: water (requests) flows in at an arbitrary rate, flows out at a constant rate, and once the bucket is full, any more water overflows.
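The leaky bucket can be sketched as follows (again an illustrative single-machine sketch with my own names; time is passed in as milliseconds). On each request we first drain the water that has leaked out since the last call, then check whether one more unit fits in the bucket.

```java
// Leaky bucket: inflow is arbitrary, outflow drains at a constant
// `leakPerSecond`, and a request arriving while the bucket is full overflows
// (is rejected).
class LeakyBucket {
    private final double capacity;
    private final double leakPerMilli;
    private double water = 0;
    private long lastMillis;

    LeakyBucket(double capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
        this.lastMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Drain whatever has leaked out since the last request.
        water = Math.max(0, water - (nowMillis - lastMillis) * leakPerMilli);
        lastMillis = nowMillis;
        if (water + 1 <= capacity) {
            water += 1;
            return true;
        }
        return false; // bucket full: the request overflows
    }
}
```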

The token bucket

Note that the leaky bucket's outflow rate is constant, which means that under a large instantaneous burst most requests are discarded as overflow. The token bucket algorithm was devised to improve on this.

Tokens are generated at a constant rate, but there is no limit on the rate at which they are consumed. Facing an instantaneous burst, the algorithm can hand out a large number of accumulated tokens in a short time, and taking a token is cheap (one side produces tokens, the other consumes them). Whether it is a token bucket rejecting requests when no token is available, or a leaky bucket overflowing when full, the principle is the same: keep the majority of the traffic flowing normally and sacrifice the small excess. That trade-off is reasonable; if instead we insisted on serving the small excess, the whole system might be pushed to its limits and go down, doing more harm than good.
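A token bucket sketch in the same style (my own names, caller-supplied millisecond timestamps). It mirrors the leaky bucket, but inverted: the constant rate refills tokens, and a stored-up burst of tokens lets an instantaneous traffic spike through.

```java
// Token bucket: tokens refill at a constant `tokensPerSecond` up to
// `capacity`; a request takes one token if available. Accumulated tokens
// allow short bursts above the steady rate.
class TokenBucket {
    private final double capacity;
    private final double tokensPerMilli;
    private double tokens;
    private long lastMillis;

    TokenBucket(double capacity, double tokensPerSecond, long startMillis) {
        this.capacity = capacity;
        this.tokensPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Refill tokens accumulated since the last request, capped at capacity.
        tokens = Math.min(capacity, tokens + (nowMillis - lastMillis) * tokensPerMilli);
        lastMillis = nowMillis;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false; // no token available: reject
    }
}
```

Compare the two: with the same steady rate, the leaky bucket would reject a burst of three simultaneous requests beyond its capacity, while a token bucket that has sat idle serves the whole burst from stored tokens.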

A rate limiter in practice: Guava RateLimiter

Guava is not only powerful for collections, caching, asynchronous callbacks, and more; it also provides a rate-limiting API. Guava's RateLimiter is based on the token bucket algorithm: tell RateLimiter the QPS limit, and it fills the bucket with permits (tokens) at that rate; a caller then requests a permit via the non-blocking tryAcquire() method, or blocks for one with acquire().
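A minimal usage sketch, assuming Guava is on the classpath; the class name GuavaRateLimitDemo and the chosen rate of 5 permits per second are just for illustration.

```java
import com.google.common.util.concurrent.RateLimiter;

public class GuavaRateLimitDemo {
    public static void main(String[] args) {
        // Allow at most 5 permits per second.
        RateLimiter limiter = RateLimiter.create(5.0);

        for (int i = 0; i < 10; i++) {
            // tryAcquire() is non-blocking: true only if a permit is free right now.
            if (limiter.tryAcquire()) {
                System.out.println("request " + i + " handled");
            } else {
                System.out.println("request " + i + " rejected");
            }
        }
        // limiter.acquire() would instead block until a permit becomes available.
    }
}
```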

Traffic limiting in distributed scenarios

All of the approaches above are single-machine rate limiting, which is in fact sufficient in most scenarios. Distributed rate limiting usually combines several technologies, such as Nginx+Lua or Redis+Lua. This article focuses on single-machine limiting and will not cover distributed scenarios in detail. In short: make incoming traffic queue up and pass through the limiter first, rather than letting it hit the system directly.

If you have read this far, I hope you have gained something. Friends who enjoyed this can follow the author; I publish articles like this regularly.