To shorten my commute, last year I rented out my house in the northern suburbs and moved to the southern suburbs, close to my workplace. It saves me a lot of time that I can now spend on more meaningful things, I no longer get upset over traffic jams, and my happiness has risen sharply.

But even so, life has other worries. Residential density in the south is higher, so parking is a headache. The place I rent only has unassigned spots along both sides of the road, and every time I come back from work there is no space left, so I can only park side by side with other people's cars. The problem is that every morning I get woken up by a phone call asking me to move my car, and you can imagine my mood.

But in the last few days I started to get smarter: by choosing carefully where to double-park on the first night, I no longer had to move the car the next morning, which was a "huge bonus" for me.

Vehicle restriction is a very common traffic-limiting strategy in daily life. Although it brings me some inconvenience, it also brings improvement: the rapid growth of private cars has put a heavy "burden" on our roads, and without any restriction every car might end up stuck in traffic. This is the great benefit that limiting the flow brings to our lives.

Back from life to programs: suppose a system can serve only 100,000 users. One day, because of some hot event, traffic to the system suddenly surges to 500,000 in a short time. The direct result is that the system crashes and nobody can use it. Obviously, serving a limited number of users is more in line with our expectations than serving no one at all, so at this point we apply "traffic limiting".

Traffic limiting classification

There are many implementation schemes for traffic limiting. Brother Lei has sorted them out a little here; the classification is as follows:

  1. Legitimacy verification traffic limiting: for example verification codes and IP blacklists. These means can effectively block malicious attacks and crawler scraping;
  2. Container traffic limiting: for example Tomcat and Nginx. Tomcat can set a maximum number of threads (maxThreads); when the number of concurrent requests exceeds it, requests queue for execution. Nginx provides two ways to limit traffic: controlling the rate and controlling the number of concurrent connections;
  3. Server-side traffic limiting: implementing a traffic-limiting algorithm in our own service code, which is the focus of this article.

Legitimacy verification traffic limiting is the most common in business code, namely the familiar verification codes and IP blacklists, so this article will not describe it further. We focus on the latter two kinds: container traffic limiting and server-side traffic limiting.

Container traffic limiting

Tomcat traffic limiting

For Tomcat 8.5, the maximum number of threads is configured in conf/server.xml, as shown below:

<Connector port="8080" protocol="HTTP / 1.1"
          connectionTimeout="20000"
          maxThreads="150"
          redirectPort="8443" />

maxThreads is the maximum number of Tomcat worker threads. When the number of concurrent requests exceeds this value, requests queue for execution, which accomplishes the purpose of traffic limiting.

Tip: The default value of maxThreads is 150 (Tomcat 8.5.42), but bigger is not always better; it depends on your hardware configuration. Note that each thread consumes about 1 MB of JVM memory for its stack, and the more threads there are, the heavier the GC burden. Finally, the operating system also limits the number of threads per process: as a rough rule of thumb, around 2000 per process on Windows and around 1000 per process on Linux by default.

Nginx traffic limiting

Nginx provides two ways to limit the current: one is to control the rate, and the other is to control the number of concurrent connections.

Controlling the rate

limit_req_zone is used to limit the number of requests per unit of time, that is, rate limiting. An example configuration is as follows:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server {
    location / {
        limit_req zone=mylimit;
    }
}

The configuration above limits each IP address to 2r/s. Because Nginx's rate-limiting statistics are millisecond-based, 2r/s means only one request is allowed per IP within any 500 ms; the next request is only allowed from 501 ms on.

We sent 6 requests within 10 ms from a single IP address. The execution results are as follows:

As the results show, execution matches our expectation: only one request succeeds, and the other five are rejected (the second request would only execute normally at 501 ms).

Rate limit upgrade

The rate control above is very precise but too harsh for real-world use. Usually we want to control the total number of requests per IP over a whole unit of time, not at millisecond granularity. We can use the burst keyword for this.

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server {
    location / {
        limit_req zone=mylimit burst=4;
    }
}

burst=4 means each IP is allowed up to 4 burst requests. If a single IP sends 6 requests within 10 ms, the results are as follows:

As you can see from the results, 1 request was processed immediately, 4 requests were queued in the burst queue for execution, and 1 request was rejected.

Controlling the number of concurrent connections

Use the limit_conn_zone and limit_conn directives to control the number of concurrent requests. The example configuration is as follows:

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

limit_conn perip 10 means a single IP can hold at most 10 connections at the same time; limit_conn perserver 100 means the server can handle at most 100 concurrent connections in total.

Tip: a connection is counted only after Nginx has fully read the request header.

Server-side traffic limiting

Server-side traffic limiting needs a traffic-limiting algorithm. The algorithm is the "brain" of traffic limiting: it guides how the limiting scheme is implemented.

Some people go dizzy at the word "algorithm" and think it must be esoteric, but it isn't. An algorithm is just a summary of the concrete steps for carrying out a task. It really isn't hard, so don't be scared off by appearances~

There are three common algorithms for current limiting:

  1. Sliding time window algorithm
  2. Leaky bucket algorithm
  3. Token bucket algorithm

Let’s look at them separately.

1. Sliding time window algorithm

The sliding time window algorithm takes the current time as the end point and looks back over a fixed interval, for example the previous 60 seconds, within which at most 100 requests are allowed. The execution logic is: first remove all request records older than 60 seconds; then check whether the number of remaining records is greater than or equal to the maximum (100). If so, apply the rejection policy; otherwise insert a record for this request and tell the client it may proceed.

The sliding time window is shown in the figure below:

Each small square represents 10 s, and the period enclosed by the red dashed line is the interval to be judged. For example, if 100 requests are allowed per 60 seconds, the red dashed line encloses 60 seconds.

We can use Redis's sorted set (ZSet) to implement sliding time window traffic limiting: the ZSet key stores the traffic-limiting identifier, and the score stores the request timestamp. On each request, first clear the records that fall outside the current time window; if the number of records in the window is greater than or equal to the maximum access count, return false to apply the limit; otherwise allow the business logic to run and add a record of this access to the ZSet.

We use the Jedis client to operate Redis. First add the Jedis dependency to pom.xml, as follows:

<!-- https://mvnrepository.com/artifact/redis.clients/jedis -->
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>3.3.0</version>
</dependency>

The specific Java implementation code is as follows:

import redis.clients.jedis.Jedis;

public class RedisLimit {
    static Jedis jedis = new Jedis("127.0.0.1", 6379);

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 15; i++) {
            boolean res = isPeriodLimiting("java", 3, 10);
            if (res) {
                System.out.println("Normal execution request:" + i);
            } else {
                System.out.println("Restricted flow:" + i);
            }
        }
        // Sleep past the time window, then try one more request
        Thread.sleep(4000);
        boolean res = isPeriodLimiting("java", 3, 10);
        if (res) {
            System.out.println("After sleep, the request is executed normally.");
        } else {
            System.out.println("After sleep, restricted flow.");
        }
    }

    /**
     * Traffic limiting method (sliding time window algorithm)
     * @param key      traffic limiting identifier
     * @param period   time window (in seconds)
     * @param maxCount maximum number of requests allowed in the window
     * @return true if the request may proceed
     */
    private static boolean isPeriodLimiting(String key, int period, int maxCount) {
        long nowTs = System.currentTimeMillis();
        // Delete request records that fall outside the window
        // (e.g. with period=60, remove everything older than 60 s)
        jedis.zremrangeByScore(key, 0, nowTs - period * 1000L);
        long currCount = jedis.zcard(key); // number of requests in the current window
        if (currCount >= maxCount) { // limit exceeded
            return false;
        }
        jedis.zadd(key, nowTs, "" + nowTs); // record this request
        return true;
    }
}

The execution results of the above procedures are:

Normal execution request: 0

Normal execution request: 1

Normal execution request: 2

Normal execution request: 3

Normal execution request: 4

Normal execution request: 5

Normal execution request: 6

Normal execution request: 7

Normal execution request: 8

Normal execution request: 9

Restricted flow: 10

Restricted flow: 11

Restricted flow: 12

Restricted flow: 13

Restricted flow: 14

After hibernation, the request is executed normally

There are two disadvantages to this implementation:

  • The ZSet stores a record for every request, so with heavy traffic it occupies a lot of space; for example, allowing 1,000,000 requests per 60 s means keeping 1,000,000 records in the window;
  • The execution of this code is non-atomic: it first checks, then inserts, and other business logic can run in the gap between the two, making the result inaccurate.
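The second drawback is a classic check-then-act race. On a single machine it can be closed by putting the check and the insert in one critical section, as in the hypothetical sketch below (for the Redis version, the usual fix would be to wrap both commands in a Lua script so Redis runs them atomically):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LocalSlidingWindow {
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final long periodMillis;
    private final int maxCount;

    public LocalSlidingWindow(int periodSeconds, int maxCount) {
        this.periodMillis = periodSeconds * 1000L;
        this.maxCount = maxCount;
    }

    /** Check and record in one synchronized block, so no request can slip in between. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Evict records that have slid out of the window
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= now - periodMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxCount) {
            return false;
        }
        timestamps.addLast(now);
        return true;
    }
}
```

Like the Redis version, this allows at most maxCount requests per window; unlike it, the state lives in one JVM, so it cannot coordinate multiple service instances.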

2. Leaky bucket algorithm

The leaky bucket algorithm is inspired by funnels, as shown below:

One problem with the sliding time window is burstiness: if, say, only 10 requests are allowed per 60 s and all 10 arrive in the first second, every request in the remaining 59 s must be rejected. The leaky bucket algorithm can solve this problem.

The leaky bucket algorithm works like a funnel in daily life: no matter how much water is poured in, that is, no matter how many requests arrive, the water flows out slowly at a uniform rate. When the inflow is faster than the outflow, the funnel gradually fills up; once full, newly arriving requests are discarded. When the inflow is slower than the outflow, the funnel never fills and keeps flowing.

To implement the leaky bucket algorithm: declare a queue to hold requests, which acts as the funnel. When the queue is full, new requests are discarded. Then a separate thread periodically takes one or more tasks from the queue and executes them. That is the whole leaky bucket.
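The steps above can be sketched in plain Java. This is a hypothetical in-memory leaky bucket, not production code: a bounded queue plays the funnel, and a single scheduled worker drains one task per interval regardless of how fast requests arrive.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LeakyBucket {
    private final BlockingQueue<Runnable> bucket;
    private final ScheduledExecutorService leaker = Executors.newSingleThreadScheduledExecutor();

    public LeakyBucket(int capacity, long leakIntervalMillis) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        // Drain one task per interval, no matter how fast requests arrive
        leaker.scheduleAtFixedRate(() -> {
            Runnable task = bucket.poll();
            if (task != null) {
                task.run();
            }
        }, leakIntervalMillis, leakIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns false (request rejected) when the bucket is full. */
    public boolean tryAccept(Runnable task) {
        return bucket.offer(task);
    }

    public void shutdown() {
        leaker.shutdown();
    }
}
```

With a capacity of 5, a burst of 10 requests sees 5 queued and 5 rejected; the queued ones then run at the leak rate.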

The Nginx rate control we demonstrated above actually uses the leaky bucket algorithm. Of course, we can also implement a leaky bucket very conveniently with Redis.

We can use the Redis-Cell module, available for Redis 4.0+, which implements the funnel algorithm and provides an atomic traffic-limiting command. Relying on Redis, which is naturally distributed, we can achieve fairly complete distributed traffic limiting.

Redis-Cell is simple to use: there is only one command, cl.throttle. An example:

> cl.throttle mylimit 15 30 60
1) (integer) 0   # 0 means allowed, 1 means rejected
2) (integer) 15  # funnel capacity
3) (integer) 14  # remaining funnel capacity
4) (integer) -1  # seconds to wait before retrying after a rejection; -1 means no retry needed
5) (integer) 2   # seconds until the funnel is completely empty

Here 15 is the capacity of the funnel, and 30 operations per 60 s is the funnel's outflow rate.

3. Token bucket algorithm

In the token bucket algorithm, a program generates tokens at a constant rate and stores them in a token bucket. Each request must obtain a token before executing; if no token is available, the request can choose to wait or give up, as shown in the figure below:
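Before reaching for a library, the idea can be sketched in plain Java. This is a hypothetical minimal token bucket that refills lazily based on elapsed time; it is single-machine only and far simpler than a production implementation:

```java
public class TokenBucket {
    private final long capacity;       // maximum tokens the bucket can hold
    private final double refillPerMs;  // tokens generated per millisecond
    private double tokens;
    private long lastRefillTs;

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefillTs = System.currentTimeMillis();
    }

    /** Non-blocking: returns false if no token is currently available. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Lazily add the tokens generated since the last call, capped at capacity
        tokens = Math.min(capacity, tokens + (now - lastRefillTs) * refillPerMs);
        lastRefillTs = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```

A burst can drain up to the full capacity at once, after which requests are admitted at the steady refill rate; this burst tolerance is the main difference from the leaky bucket.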

We can use Google's open-source Guava library to implement the token bucket algorithm conveniently. First add the Guava dependency to pom.xml, as follows:

<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.2-jre</version>
</dependency>

The specific implementation code is as follows:

import com.google.common.util.concurrent.RateLimiter;
import java.time.Instant;

public class RateLimiterExample {
    public static void main(String[] args) {
        // Generate 10 tokens per second (one every 100 ms)
        RateLimiter rt = RateLimiter.create(10);
        for (int i = 0; i < 11; i++) {
            new Thread(() -> {
                // Block until a token is acquired
                rt.acquire();
                System.out.println("Normal execution method, TS: " + Instant.now());
            }).start();
        }
    }
}

The execution results of the above procedures are:

Normal execution method, TS: 2020-05-15T14:46:37.175Z

Normal execution method, TS: 2020-05-15T14:46:37.237Z

Normal execution method, TS: 2020-05-15T14:46:37.339Z

Normal execution method, TS: 2020-05-15T14:46:37.442Z

Normal execution method, TS: 2020-05-15T14:46:37.542Z

Normal execution method, TS: 2020-05-15T14:46:37.640Z

Normal execution method, TS: 2020-05-15T14:46:37.741Z

Normal execution method, TS: 2020-05-15T14:46:37.840Z

Normal execution method, TS: 2020-05-15T14:46:37.942Z

Normal execution method, TS: 2020-05-15T14:46:38.042Z

Normal execution method, TS: 2020-05-15T14:46:38.142Z

As the results show, a token is generated every 100 ms. acquire() blocks and waits until a token is acquired; it can take an int specifying how many tokens to acquire. Its alternative is tryAcquire(), which returns false immediately when no token is available, so the caller does not block. tryAcquire() can also take a timeout: it blocks up to the maximum wait time, returns true if a token is obtained within that time, and false otherwise.

Note: the token bucket implemented with Guava is a program-level, single-machine traffic-limiting scheme, whereas the Redis-Cell solution above is a distributed traffic-limiting scheme.

Conclusion

This article described six concrete ways to implement traffic limiting: Tomcat's maxThreads; Nginx's limit_req_zone and burst directives for rate limiting; Nginx's limit_conn_zone and limit_conn directives for limiting concurrent connections; the sliding time window algorithm implemented with Redis sorted sets; the leaky bucket algorithm implemented with Redis-Cell; and the token bucket algorithm implemented with Google's Guava.

It is important to note that the Redis-based schemes can be used in distributed systems, while the Guava-based one only works in a single-machine environment. If you prefer not to limit traffic on the server side, you can even use container traffic limiting (Nginx or Tomcat) without changing any code, as long as it meets your business needs.

Well, that's the end of this article. See you next time~

The last word

Writing original content is not easy. If you found this article useful, please give it a "like" — that is the greatest support and encouragement for the author. Thank you!

Reference & thanks

www.cnblogs.com/biglittlean…

Follow the public account "Java Chinese community" and reply "dry goods" to get the Top 50 original articles.