Traffic limiting classification

There are many ways to implement traffic limiting. The common ones are as follows:

  1. Validity verification traffic limiting: for example, CAPTCHAs and IP blacklists, which effectively block malicious attacks and crawler scraping.
  2. Container traffic limiting: for example, Tomcat and Nginx. Tomcat can cap its maximum number of threads, queuing requests once concurrency exceeds that cap; Nginx provides two traffic limiting methods, one controlling the rate and the other controlling the number of concurrent connections.
  3. Server-side traffic limiting: implemented in application code with a traffic limiting algorithm, which is the focus of this article.

Validity verification traffic limiting (the familiar CAPTCHA and IP blacklist mechanisms) is mostly ordinary business code, so this article will not dwell on it. We will focus on the last two schemes: container traffic limiting and server-side traffic limiting.

Container traffic limiting

Tomcat traffic limiting

The maximum number of threads for Tomcat 8.5 is configured in conf/server.xml:

<Connector port="8080" protocol="HTTP/1.1"
          connectionTimeout="20000"
          maxThreads="150"
          redirectPort="8443" />

maxThreads is the maximum number of Tomcat threads. When the number of concurrent requests exceeds this value, requests are queued for execution, which achieves the purpose of traffic limiting.

Tip: maxThreads defaults to 150 (Tomcat 8.5.42) and can be raised, but bigger is not always better. The right value depends on your hardware: each started thread consumes roughly 1 MB of JVM memory for its thread stack, and the more threads there are, the heavier the burden on the GC. Also note that the operating system limits the number of threads per process: Windows does not allow more than 2000 threads per process, and Linux does not allow more than 1000.
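If several connectors should share one thread pool, Tomcat also lets you cap threads through a named Executor. A sketch for conf/server.xml, based on the example attribute values in the Tomcat documentation (adjust maxThreads to your hardware):

```xml
<!-- A shared thread pool; maxThreads caps concurrency across all
     connectors that reference it. -->
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
          maxThreads="150" minSpareThreads="4"/>

<!-- The connector uses the pool via the executor attribute instead of
     declaring its own maxThreads. -->
<Connector executor="tomcatThreadPool" port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000" redirectPort="8443"/>
```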

Nginx traffic limiting

Nginx provides two methods of limiting traffic: one is to control the rate and the other is to control the number of concurrent connections.

Controlling the rate

We need to use limit_req_zone to limit the number of requests per unit time, that is, the rate limit. The example configuration is as follows:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit;
    }
}

The above configuration limits each IP address to 2r/s. Because Nginx's rate-limiting statistics are millisecond-based, 2r/s means a single IP is allowed only one request per 500 ms; the next request is allowed only from 501 ms onward.

We used a single IP to send 6 requests within 10 ms. The result is as follows:

(figure: Nginx rate-limit test result)

As the results show, execution matches our expectation: only one request succeeded, while the other five were rejected (the second request would be executed normally at 501 ms).

Rate limit upgrade

The rate control above is precise, but too harsh for the real world, where we usually want to limit the total number of accesses per IP over a unit of time rather than being accurate to the millisecond. We can relax this with the burst keyword, as in the following configuration:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit burst=4;
    }
}

burst=4 means each IP address may send at most 4 burst requests beyond the rate. If a single IP sends 6 requests within 10 ms, the result is as follows:

(figure: Nginx burst test result)

From the results above, one request was processed immediately, four were queued in the burst queue for execution, and one was rejected.

Controlling concurrency

The limit_conn_zone and limit_conn directives are used to control the number of concurrent connections. An example configuration:

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

limit_conn perip 10 means a single IP address can hold at most 10 connections at a time; limit_conn perserver 100 means the server can handle at most 100 concurrent connections in total.

Tip: a connection is counted only once the whole request header has been read and the request is being processed by the server.
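The same idea that limit_conn implements, capping the number of in-flight requests, can also be sketched at the application level with a counting semaphore. This is purely illustrative; the class and method names below are our own, not an Nginx or servlet API:

```java
import java.util.concurrent.Semaphore;

// Caps concurrent requests, analogous to "limit_conn perip 10":
// once all permits are taken, further requests are rejected at once.
public class ConcurrencyLimiter {
    private final Semaphore permits;

    public ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Returns false immediately when all slots are taken,
    // like Nginx rejecting the excess connection.
    public boolean enter() {
        return permits.tryAcquire();
    }

    // Must be called when the request finishes, freeing the slot.
    public void exit() {
        permits.release();
    }
}
```

Each request handler would call enter() on the way in and exit() in a finally block on the way out.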

Server-side traffic limiting

Server-side traffic limiting must be implemented together with a traffic limiting algorithm. The algorithm is the "brain" of traffic limiting and guides the design of the limiting scheme.

Some people may feel dizzy at the word "algorithm", but there is no need: an algorithm is just a summary of the concrete steps for carrying out a task. It is really not hard to understand, so don't be frightened by the name ~

There are three common algorithms for limiting traffic:

  1. Sliding time window algorithm
  2. Leaky bucket algorithm
  3. Token bucket algorithm

Let’s look at each of them in turn.

1. Sliding time window algorithm

The sliding time window algorithm takes the current time as the end point and looks back over a fixed interval, for example the previous 60 s, within which at most 100 accesses are allowed. The execution logic is: first clear all request records older than 60 s; then check whether the number of requests in the current collection exceeds the maximum of 100. If it does, apply the rejection policy; otherwise insert a record for this request and return to the client a flag indicating it may proceed.

The sliding time window is shown in the figure below:

(figure: sliding time window)

Each cell represents 10 s, and the period enclosed by the red dashed line is the interval to be evaluated. For example, if 100 requests are allowed per 60 s, the area inside the red dashed line spans 60 s.
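Before moving to Redis, the logic above can be sketched as a minimal single-machine, in-memory limiter. The class and method names here are illustrative, not from any library:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding time window: keep the timestamps of accepted requests,
// evict the ones that fall out of the window, reject when full.
public class SlidingWindowLimiter {
    private final long periodMillis;  // window size, e.g. 60_000 for 60 s
    private final int maxCount;       // max requests allowed inside the window
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(long periodMillis, int maxCount) {
        this.periodMillis = periodMillis;
        this.maxCount = maxCount;
    }

    // Returns true if the request may proceed, false if it is limited.
    public synchronized boolean tryAcquire(long nowMillis) {
        // Step 1: evict all records older than the window.
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= nowMillis - periodMillis) {
            timestamps.pollFirst();
        }
        // Step 2: reject if the window is already full.
        if (timestamps.size() >= maxCount) {
            return false;
        }
        // Step 3: record this request and allow it.
        timestamps.addLast(nowMillis);
        return true;
    }
}
```

A deque of timestamps is enough because eviction only removes from the old end; a multi-instance deployment needs shared storage instead, which is where Redis comes in.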

We can implement the sliding time window algorithm with Redis's ordered set (ZSet). The flow is: use the ZSet key to store the traffic limiting identifier and the score to store the request time. When each request arrives, first clear the accesses that fall outside the current time window; if the count in the window is greater than or equal to the maximum, return false to limit the request; otherwise allow the business logic to execute and add a record of this access to the ZSet. The specific code is as follows.

We use the Jedis client to operate Redis. Add the Jedis dependency to pom.xml as follows:

<!-- https://mvnrepository.com/artifact/redis.clients/jedis -->
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>3.3.0</version>
</dependency>

The specific Java implementation code is as follows:

import redis.clients.jedis.Jedis;

public class RedisLimit {
    // Redis client
    static Jedis jedis = new Jedis("127.0.0.1", 6379);

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 15; i++) {
            boolean res = isPeriodLimiting("java", 3, 10);
            if (res) {
                System.out.println("Request executed normally: " + i);
            } else {
                System.out.println("Rate limited: " + i);
            }
        }
        // Sleep for 4 s, past the 3 s window, then request again
        Thread.sleep(4000);
        boolean res = isPeriodLimiting("java", 3, 10);
        if (res) {
            System.out.println("After sleep, request executed normally");
        } else {
            System.out.println("After sleep, rate limited");
        }
    }

    /**
     * Traffic limiting method (sliding time window algorithm)
     * @param key      traffic limiting identifier
     * @param period   traffic limiting window (in seconds)
     * @param maxCount maximum number of requests within the window
     */
    private static boolean isPeriodLimiting(String key, int period, int maxCount) {
        long nowTs = System.currentTimeMillis();
        // Clear all records outside the current window
        jedis.zremrangeByScore(key, 0, nowTs - period * 1000);
        long currCount = jedis.zcard(key);
        if (currCount >= maxCount) {
            // Over the limit: reject
            return false;
        }
        // Record this access (+1) and allow it
        jedis.zadd(key, nowTs, "" + nowTs);
        return true;
    }
}

The execution results of the above programs are as follows:

Request executed normally: 0
Request executed normally: 1
Request executed normally: 2
Request executed normally: 3
Request executed normally: 4
Request executed normally: 5
Request executed normally: 6
Request executed normally: 7
Request executed normally: 8
Request executed normally: 9
Rate limited: 10
Rate limited: 11
Rate limited: 12
Rate limited: 13
Rate limited: 14
After sleep, request executed normally

There are two drawbacks to this implementation:

  • ZSet stores a record for every access; with large volumes it occupies a lot of space, for example when 1,000,000 accesses are allowed per 60 s.
  • The code is not atomic: it checks first and adds afterwards, and other business logic can execute in the gap, which can make the result inaccurate.

2. Leaky bucket algorithm

The leaky bucket algorithm is inspired by a funnel, as shown in the figure below:

(figure: leaky bucket / funnel)

One problem with the sliding time window algorithm is that within a given range, say 10 requests per 60 s, if all 10 requests arrive in the first second, the remaining 59 s can only reject every request. The leaky bucket algorithm can solve this problem.

The leaky bucket algorithm works like a funnel in daily life: no matter how much water is poured in, that is, no matter how many requests arrive, it flows out slowly at a uniform rate. When the inflow rate exceeds the outflow rate, the funnel gradually fills; once full, new requests are discarded. When the inflow rate is lower than the outflow rate, the funnel never fills and keeps draining.

The implementation steps of the leaky bucket algorithm are: first declare a queue to hold requests, which acts as the funnel; when the queue is full, new requests are discarded. Then a separate thread periodically takes one or more tasks from the queue and executes them, completing the leaky bucket behavior.
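The steps above can be sketched as follows. The names and the one-at-a-time leakOne() drain are illustrative assumptions; in a real service a ScheduledExecutorService would call leakOne() at a fixed period:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Leaky bucket: a bounded queue is the "funnel"; tryAdd() is water
// pouring in, and leakOne() is the uniform outflow at the bottom.
public class LeakyBucket {
    private final BlockingQueue<Runnable> bucket;

    public LeakyBucket(int capacity) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false (request discarded) when the bucket is full.
    public boolean tryAdd(Runnable request) {
        return bucket.offer(request);
    }

    // Called by a scheduled worker at a constant rate, e.g. every 100 ms,
    // so requests leak out uniformly no matter how fast they arrive.
    public void leakOne() {
        Runnable request = bucket.poll();
        if (request != null) {
            request.run();
        }
    }
}
```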

The Nginx rate control we demonstrated above actually uses the leaky bucket algorithm. Of course, we can also implement a leaky bucket very conveniently with Redis.

We can use the redis-cell module, loadable since Redis 4.0, which implements the funnel algorithm and provides an atomic rate-limiting command; since Redis is naturally distributed, this yields a clean distributed traffic limiting solution.

Using redis-cell is also very simple: you only need the cl.throttle command, as in the following example:

> cl.throttle mylimit 15 30 60
1) (integer) 0   # 0 means allowed, 1 means rejected
2) (integer) 15  # funnel capacity
3) (integer) 14  # remaining funnel capacity
4) (integer) -1  # seconds to wait before retrying after a rejection
5) (integer) 2   # seconds until the funnel is completely empty

Here 15 is the capacity of the funnel, and 30 per 60 s is its outflow rate.

3. Token bucket algorithm

In the token bucket algorithm, a program generates tokens at a constant rate and stores them in a token bucket. Each request must obtain a token before executing; if no token is available, the request can choose to wait or give up, as shown in the following figure:

(figure: token bucket algorithm)
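Before reaching for a library, the bookkeeping behind a token bucket can be sketched in plain Java with lazy refill, minting tokens based on elapsed time instead of a background thread. All names here are illustrative:

```java
// Token bucket with lazy refill: tokens accrue with elapsed time,
// capped at the bucket capacity; each request spends one token.
public class TokenBucket {
    private final long refillIntervalMillis; // time to mint one token
    private final int capacity;              // max tokens the bucket holds
    private double tokens;
    private long lastRefillMillis;

    public TokenBucket(int capacity, long refillIntervalMillis, long nowMillis) {
        this.capacity = capacity;
        this.refillIntervalMillis = refillIntervalMillis;
        this.tokens = capacity;        // start with a full bucket
        this.lastRefillMillis = nowMillis;
    }

    public synchronized boolean tryAcquire(long nowMillis) {
        // Mint tokens for the time elapsed since the last refill.
        double minted = (nowMillis - lastRefillMillis) / (double) refillIntervalMillis;
        tokens = Math.min(capacity, tokens + minted);
        lastRefillMillis = nowMillis;
        if (tokens >= 1) {
            tokens -= 1;  // spend one token, request allowed
            return true;
        }
        return false;     // no token: the caller may wait or give up
    }
}
```

Passing the clock in as a parameter keeps the sketch testable; production code would use System.currentTimeMillis().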

We can easily implement the token bucket algorithm with Google’s open-source Guava library. First add the Guava dependency to pom.xml as follows:

<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.2-jre</version>
</dependency>

The specific implementation code is as follows:

import com.google.common.util.concurrent.RateLimiter;
import java.time.Instant;

public class RateLimiterExample {
    public static void main(String[] args) {
        // Generate 10 tokens per second (one every 100 ms)
        RateLimiter rt = RateLimiter.create(10);
        for (int i = 0; i < 11; i++) {
            new Thread(() -> {
                // Acquire one token, blocking until it is available
                rt.acquire();
                System.out.println("Normal execution method, ts: " + Instant.now());
            }).start();
        }
    }
}

The execution results of the above programs are as follows:

Normal execution method, ts: 2020-05-15T14:46:37.175Z
Normal execution method, ts: 2020-05-15T14:46:37.237Z
Normal execution method, ts: 2020-05-15T14:46:37.339Z
Normal execution method, ts: 2020-05-15T14:46:37.442Z
Normal execution method, ts: 2020-05-15T14:46:37.542Z
Normal execution method, ts: 2020-05-15T14:46:37.640Z
Normal execution method, ts: 2020-05-15T14:46:37.741Z
Normal execution method, ts: 2020-05-15T14:46:37.840Z
Normal execution method, ts: 2020-05-15T14:46:37.942Z
Normal execution method, ts: 2020-05-15T14:46:38.042Z
Normal execution method, ts: 2020-05-15T14:46:38.142Z

As the results show, a token is indeed generated every 100 ms. acquire() blocks while waiting for a token and can take an int to specify how many tokens to acquire. An alternative is tryAcquire(), which returns false immediately when no token is available instead of blocking. tryAcquire() can also take a timeout: it blocks up to the maximum waiting time and returns false if no token becomes available within it.

Note: the token bucket implemented with Guava is an application-level, single-machine traffic limiting scheme, while the redis-cell scheme above is a distributed one.

Conclusion

This article covered six concrete ways to implement traffic limiting: Tomcat’s maxThreads; Nginx’s two methods, limit_req_zone with burst for rate limiting and limit_conn_zone with limit_conn for capping the total number of concurrent connections; the sliding time window algorithm implemented with Redis sorted sets; the leaky bucket algorithm implemented with redis-cell; and the token bucket algorithm implemented with Google’s Guava library.

Note that the Redis-based schemes can be used in distributed systems, while the Guava-based scheme only works in a standalone environment. If you would rather not limit traffic on the server side, you can even use container traffic limiting (Nginx or Tomcat) without changing any code, provided it meets your business needs.

Author: sky_2cd3. Link: www.jianshu.com/p/78161722b… Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.