
In high-concurrency environments, rate limiting, degradation, and circuit breaking are commonly used to keep a system stable and available.

Rate limiting, as the name implies, restricts the amount of traffic a service processes. Circuit breaking and degradation are, in essence, also forms of rate limiting: they block request traffic. This article focuses on common rate-limiting algorithms.

Why rate limit

Why do we need rate limiting? The reason is easy to understand: too many requests can crash a service. To avoid that, we limit the incoming traffic.

The following common situations can cause traffic surges:

  • Flash sales and promotions
  • Malicious users spamming orders or requests
  • Web crawlers
  • Growth in the number of regular users

All of these can cause traffic to surge. If our service can sustain at most 1000 QPS and the request rate exceeds 1000 at some moment, we should cap the traffic at 1000 to keep the service running stably. That is rate limiting.

Leaky bucket algorithm

The idea behind the leaky bucket algorithm is simple. Like a funnel, water can only flow out as fast as the spout allows; when water pours in faster than it drains, the funnel fills up and the excess overflows, never reaching the spout.

It can be seen that the core logic of the leaky bucket algorithm is:

  1. Cache incoming requests
  2. Process them at a constant rate
  3. Discard the excess
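The three steps above can be sketched as a counter-based leaky bucket. Everything below is illustrative (the class name, field names, and passing the clock in as `nowMillis` are all assumptions for the sake of a deterministic sketch), not a production implementation:

```java
/** Illustrative counter-based leaky bucket; not a production implementation. */
class LeakyBucket {
    private final long capacity;    // max requests the bucket can buffer
    private final double leakRate;  // requests drained per second
    private double water = 0;       // current bucket level
    private long lastLeakMillis;

    LeakyBucket(long capacity, double leakRatePerSecond, long nowMillis) {
        this.capacity = capacity;
        this.leakRate = leakRatePerSecond;
        this.lastLeakMillis = nowMillis;
    }

    /** Returns true if the request fits in the bucket (i.e. is cached, not discarded). */
    synchronized boolean tryAcquire(long nowMillis) {
        // Drain the bucket according to the time elapsed since the last call.
        double leaked = (nowMillis - lastLeakMillis) / 1000.0 * leakRate;
        water = Math.max(0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1 <= capacity) {
            water += 1;             // step 1: cache the request
            return true;            // step 2: it will drain at the constant leak rate
        }
        return false;               // step 3: bucket full, discard the excess
    }
}
```

Injecting the timestamp instead of calling `System.currentTimeMillis()` inside the class keeps the sketch easy to test; a real limiter would read the clock itself.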

The leaky bucket algorithm enforces a hard cap on the request rate, which leads to the following drawbacks:

Cannot handle burst traffic

For example, suppose the leak rate is 10 QPS and the capacity is 30. If a burst of 20 requests/s lasts 10 seconds, the bucket fills up by the third second, and every subsequent request is discarded until capacity frees up again. Bursts like this are common, and simply dropping the excess is a crude response.

Inefficient use of resources

Since the drain rate of the leaky bucket is fixed, suppose the requests over five seconds are 12, 15, 10, 8, and 5. The overall average is exactly 10 QPS, yet with the rate fixed at 10 and no spare bucket capacity, 7 requests are discarded over the first three seconds (2, then 5, then 0).

Token bucket algorithm

The token bucket algorithm is one of the most commonly used algorithms for traffic shaping and rate limiting. Its name often gets it confused with the leaky bucket, but the two differ significantly: the token bucket controls the amount of data sent onto the network while still allowing bursts through.

The token bucket algorithm works as follows:

  1. The system generates tokens at a constant rate and puts them into the token bucket.
  2. The bucket has a fixed capacity; once it is full, additional tokens are discarded.
  3. Each request takes a token from the bucket: on success the request is processed, on failure it is rejected.
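These steps can be sketched with lazy refilling — instead of a background thread adding tokens, each call computes how many tokens accrued since the last call. All names below are illustrative, and the bucket is assumed to start full (an assumption, not something the text specifies):

```java
/** Illustrative token bucket with lazy refill; not a production implementation. */
class TokenBucket {
    private final long capacity;     // max tokens the bucket holds
    private final double refillRate; // tokens generated per second
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, double refillRatePerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillRate = refillRatePerSecond;
        this.tokens = capacity;      // assumption: start with a full bucket
        this.lastRefillMillis = nowMillis;
    }

    /** Returns true if a token was available, i.e. the request may be processed. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Step 1/2: add tokens accrued since the last call, capped at capacity.
        double refill = (nowMillis - lastRefillMillis) / 1000.0 * refillRate;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = nowMillis;
        // Step 3: take a token if one is available, otherwise reject.
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```

With 100 tokens/s and capacity 120, a quiet second leaves spare tokens in the bucket, which is exactly what lets the next second's burst exceed 100.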

So why can the token bucket algorithm handle burst requests?

First, let's understand what a burst is. Say we want a service rate of 100 QPS. Normally requests stay under 100/s; if 150 arrive in some second, that is a burst.

With the leaky bucket algorithm, 80 requests in the first second and 150 in the second second would mean 50 requests discarded.

With the token bucket algorithm, however, generating 100 tokens per second into a bucket of capacity 120, the 80 requests in the first second consume 80 tokens and leave the rest in the bucket; when 150 requests arrive in the second second, 120 of them can be processed and only 30 are discarded. Does that mean any burst can be absorbed as long as the bucket is big enough? Of course not: if the bucket's capacity exceeds what the service itself can handle, a full bucket admits a burst the service cannot survive, and the rate limiter stops doing its job.

RateLimiter

Talk is cheap, show me the code!

RateLimiter in Guava is implemented using the token bucket algorithm.
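A minimal demonstration, assuming Guava is on the classpath; the class name, thread count, and output format are illustrative choices, not part of any official example:

```java
import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterDemo {
    public static void main(String[] args) throws InterruptedException {
        // Issue at most 3 permits per second.
        RateLimiter limiter = RateLimiter.create(3.0);

        // Ten threads compete for permits; acquire() blocks until one is free.
        for (int i = 0; i < 10; i++) {
            final int id = i;
            new Thread(() -> {
                double waitedSeconds = limiter.acquire(); // blocks, returns time spent waiting
                System.out.printf("thread %d got a token at %d, waited %.2fs%n",
                        id, System.currentTimeMillis(), waitedSeconds);
            }).start();
        }
        Thread.sleep(4000); // let all threads finish
    }
}
```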

When run, the results show that at most three threads per second obtain tokens — the request rate is controlled precisely.

Guava RateLimiter's acquire() method blocks by default: if no token is available, it waits until one is.

Besides acquire(), RateLimiter also provides:

  • tryAcquire(): tries to acquire 1 token, returning immediately on failure
  • tryAcquire(int permits): tries to acquire multiple tokens
  • tryAcquire(long timeout, TimeUnit unit): tries to acquire 1 token, waiting up to the given timeout
  • tryAcquire(int permits, long timeout, TimeUnit unit): tries to acquire multiple tokens within the timeout

It is important to note that RateLimiter only supports rate limiting on a single machine; it cannot limit traffic across a cluster.

For rate limiting in a cluster, a common approach is to use Redis to count requests per second, keyed by timestamp; once the count exceeds the limit, further requests are rejected.
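The per-second counter logic can be sketched locally. In a real cluster, the map operations below would be Redis commands — an INCR on a timestamp-suffixed key plus an EXPIRE — and every name here (class, key format, method) is an illustrative assumption:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative fixed-window counter. In a cluster, `merge` would be a Redis
 * INCR on a key like "rate:service:<epochSecond>" and the cleanup an EXPIRE,
 * so all nodes share one counter.
 */
class WindowCounterLimiter {
    private final long limitPerSecond;
    private final Map<Long, Long> counters = new ConcurrentHashMap<>();

    WindowCounterLimiter(long limitPerSecond) {
        this.limitPerSecond = limitPerSecond;
    }

    /** Returns true while this second's request count is within the limit. */
    boolean tryAcquire(long nowMillis) {
        long second = nowMillis / 1000;                      // window key (the "timestamp" in the key)
        long count = counters.merge(second, 1L, Long::sum);  // stands in for Redis INCR
        counters.keySet().removeIf(s -> s < second - 1);     // stands in for EXPIRE on old keys
        return count <= limitPerSecond;
    }
}
```

Note that a fixed window like this can admit up to twice the limit across a window boundary; sliding windows or a Redis-side token bucket (e.g. via a Lua script) tighten that at the cost of complexity.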

Conclusion

This article introduced two common rate-limiting algorithms: the leaky bucket and the token bucket.

The leaky bucket algorithm caps the request rate strictly and therefore cannot absorb burst traffic.

The token bucket algorithm can absorb bursts to a degree; the exact threshold depends on what the service can handle.

Of course, in scenarios where a strict request rate must be guaranteed, the leaky bucket algorithm is the better choice.

Guava's RateLimiter is implemented with the token bucket algorithm and works well as a single-machine rate-limiting solution; in a cluster environment, a Redis-based scheme can be used instead.

There is no best technology, only the most suitable technology.