
Contents

  • Preface
  • Basic concepts
  • Solutions
    • Rate limiting with Guava
    • Rate limiting at the gateway layer
    • Rate limiting with middleware
  • Common rate limiting algorithms
    • Token bucket algorithm
    • Leaky bucket algorithm
  • Rate limiting in practice
    • Rate limiting with Guava
    • Rate limiting with Nginx
    • Rate limiting component based on Redis+Lua
  • Closing thoughts

Preface

I suspect that many friends at small and mid-sized companies, or at B2B companies, have never had to deal with rate limiting, yet examples of it are all around you. For instance, many of you have probably fought for a 12306 train ticket to get home for the holidays. The picture below should look familiar.

Yes, it is the CAPTCHA that can drive you mad, especially when you are scrambling for tickets (a lot easier this year, admittedly): you keep being told your answer is wrong, or you are handed a pile of pictures you can barely make out, which sometimes makes you question life itself. This is, in fact, a rate limiting measure: by deliberately putting obstacles in front of users, it throttles incoming traffic in a fair and open manner and greatly reduces the pressure on the system.

Basic concepts

From the example above, the old cat trusts that you now have at least an intuitive feel for rate limiting, so let's take a closer look at its dimensions.

In general, a rate limiting scenario involves two dimensions of information:

  1. Time: the limit applies within a time period or at a point in time, that is, a time window, which may be defined down to the minute or even the second.
  2. Resources: the limit is placed on available resources, such as a maximum number of requests or a maximum number of available connections.

Based on these two dimensions, we can give a simple definition: rate limiting restricts access to a resource within a certain time window, for example at most 100 requests per second; a minimal sketch of this simplest case follows. In a real scenario, though, we rarely have just one limiting rule; multiple rules usually work together.
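To make the definition concrete, here is a minimal sketch of the simplest possible rule, a fixed-window counter capping requests per window. The class and names are illustrative, not from any library:

import java.util.concurrent.TimeUnit;

// Hypothetical minimal fixed-window limiter: at most `limit` requests per window.
public class FixedWindowLimiter {
    private final int limit;           // max requests allowed per window
    private final long windowMillis;   // length of the time window
    private int count = 0;             // requests seen in the current window
    private long windowStart = System.currentTimeMillis();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed within the current window
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) { // window elapsed: start a new one
            windowStart = now;
            count = 0;
        }
        return ++count <= limit;
    }
}

// Usage: "at most 100 requests per second"
// FixedWindowLimiter limiter = new FixedWindowLimiter(100, TimeUnit.SECONDS.toMillis(1));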

QPS and connection count control

For limits on connection count and request frequency (QPS), we can set rules at the IP dimension, or per individual server. In practice, rules from several dimensions are usually combined: for example, a single IP may be allowed fewer than 10 requests per second and fewer than 5 connections, while each machine is capped at a QPS of 1,000 and a maximum of 200 connections. Going further, we can treat an entire server group or data center as a whole and set higher-level rules on top. For this scenario, the old cat presents demo code later in this article.

Transfer rate

Everyone is familiar with transfer-rate limits: a cloud drive might give non-members only a few KB/s of download speed, then tens of MB/s once you pay for membership. That is rate limiting keyed on membership tier. Nginx can limit transfer rate too; see the demo later in this article.

Blacklists and whitelists

Blacklists and whitelists are a common way for many companies to restrict or admit traffic, and both lists are dynamic. For example, if an IP address visits too frequently within a period of time and the system identifies it as a bot or abnormal traffic, that IP is added to the blacklist and cut off from system resources. Take grabbing train tickets again: if you use third-party software to keep refreshing for tickets, you may have noticed that you sometimes get locked in a "little black room" for a while and released only after the decisive moment has passed. That is exactly a dynamically updated blacklist at work.

Whitelists need even less explanation: being on one is like holding a pass that lets you travel freely through all the rate limiting rules.

Distributed rate limiting

Many systems nowadays are distributed; the old cat has previously shared the distributed locking mechanism with you. So what is distributed rate limiting? It is actually quite simple: unlike the single-node scenario, it treats the whole distributed environment, all servers together, as one unit. For IP-based limiting, for example, we allow one IP address at most 100 requests per second; no matter which machine a request lands on, as long as it reaches a service node in the cluster, it counts toward the limit.

Distributed rate limiting schemes therefore generally come in the following two types:

  • Gateway-layer rate limiting: the rules are applied at the entry point of all traffic.
  • Middleware rate limiting: a middleware component, such as the Redis cache, holds the current traffic statistics, and every service node consults it to decide whether to permit or deny a request.

Solutions

In this article, I will introduce you to three solutions.

Solution 1: Rate limiting with Guava

Many of you know Guava, the toolkit from Google. We often use it for collection operations or in-memory caching, but beyond those basics it covers a lot more ground, including reflection utilities, functional programming helpers, and math. Guava has made its contribution to rate limiting as well: its concurrency module provides several supporting classes, led by RateLimiter. Keep in mind that Guava is a client-side component: its scope is limited to the current server, and it cannot limit the traffic of other servers in a cluster. A minimal example is shown below.
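A minimal sketch of the non-blocking style, assuming only the Guava dependency (the fuller Spring demo appears in the hands-on section later):

import com.google.common.util.concurrent.RateLimiter;

public class GuavaQuickStart {
    public static void main(String[] args) {
        // Issue at most 2 permits per second
        RateLimiter limiter = RateLimiter.create(2.0);
        for (int i = 0; i < 5; i++) {
            // tryAcquire() returns immediately: true if a permit was available
            System.out.println("request " + i + " allowed: " + limiter.tryAcquire());
        }
    }
}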

Solution 2: Rate limiting at the gateway layer

Let's look at the picture directly. Strictly speaking, it is a funnel model, as follows:

From the figure we can see that this is basically the normal path of an everyday request:

  1. User traffic flows from the gateway layer to the backend service
  2. The backend service receives the traffic and calls the cache for data
  3. On a cache miss, the database is queried as the source of truth

So why call it a funnel model? Quite simply because traffic decreases from top to bottom: the gateway layer gathers the densest crowd of user requests, followed by the backend service, which, after authentication, filters out some invalid requests before touching the cache; only requests that miss the cache finally reach the database layer, so the database sees the lowest request frequency. Later, the old cat will demonstrate gateway-layer rate limiting with Nginx.

Solution 3: Rate limiting with middleware

For developers, rate limiting at the gateway layer requires cooperation from the operations team, but today's young people like to keep things under their own control, so most developers prefer to implement it at the code level. At that point, the middleware Redis becomes pretty much the only choice. We can use Redis's key expiration to define the time span of a limit (for example, one request per second, or 10 requests per 10 seconds). Redis also has a special skill, script programming: we can embed the limiting logic into Redis as a script, taking the burden of rate limiting entirely off the service layer. At the same time, Redis's strong concurrency characteristics and highly available cluster architecture support rate limiting for large clusters very well.
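As a taste of the idea (the full component is shared separately later), here is a hedged sketch of a fixed-window counter evaluated atomically inside Redis via a Lua script. The key naming and the Spring Data Redis wiring are illustrative assumptions, not code from the original component:

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import java.util.Collections;

public class RedisRateLimiter {
    // INCR the counter; set its expiry on first hit; reject once it exceeds the limit.
    // Running this as one script keeps check-and-increment atomic across the cluster.
    private static final String LUA =
            "local current = redis.call('incr', KEYS[1]) " +
            "if current == 1 then redis.call('expire', KEYS[1], ARGV[1]) end " +
            "if current > tonumber(ARGV[2]) then return 0 else return 1 end";

    private final StringRedisTemplate redis;

    public RedisRateLimiter(StringRedisTemplate redis) {
        this.redis = redis;
    }

    // e.g. allow("rate:ip:1.2.3.4", 1, 10) => at most 10 requests per second for that IP
    public boolean allow(String key, int windowSeconds, int limit) {
        DefaultRedisScript<Long> script = new DefaultRedisScript<>(LUA, Long.class);
        Long result = redis.execute(script, Collections.singletonList(key),
                String.valueOf(windowSeconds), String.valueOf(limit));
        return result != null && result == 1L;
    }
}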

Common rate limiting algorithms

Algorithm 1: Token bucket algorithm

The token bucket algorithm is the most widely used rate limiting algorithm. As the name implies, it consists of the following two roles:

  • Tokens: a request that obtains a token is processed; all other requests are either queued or discarded.
  • Bucket: the place where tokens are held; every request takes its token from the bucket.

Simple token handling is shown below:

Token generation

This process involves a token generator and a token bucket. The generator adds tokens to the bucket at a predetermined rate, say 100 per second, and distributes them uniformly over time. The generator is like a tap and the token bucket like the bucket underneath: once the bucket is full, any further water simply overflows. Token issuance behaves the same way: when the bucket is full of tokens, newly generated tokens are discarded (you can picture numbered queue tickets being handed out at a fixed pace).

Token acquisition

Each arriving request must obtain a token before the logic behind it can run. When tokens are scarce and access requests are many, some requests naturally fail to get a token. For those we can set up a buffer queue to hold the excess requests. The buffer queue is optional, and not every token bucket implementation adds one. When such a queue exists, requests that fail to get a token wait in it until new tokens are generated, at which point the request at the head of the queue is matched with a token. When the queue is full, the overflowing requests are discarded. In practice you can also tune queue-related attributes, such as a time-to-live for queued requests, or ordering by priority. A minimal sketch follows.
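Here is a minimal sketch of the mechanics just described, using lazy refill instead of a background generator thread (no buffer queue; names are illustrative):

public class TokenBucket {
    private final long capacity;       // bucket size: tokens beyond this overflow
    private final double refillPerMs;  // generation rate, tokens per millisecond
    private double tokens;             // tokens currently in the bucket
    private long lastRefill = System.currentTimeMillis();

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;        // start full so an initial burst is allowed
    }

    public synchronized boolean tryAcquire(int permits) {
        long now = System.currentTimeMillis();
        // Credit the tokens the generator would have produced since the last call
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens >= permits) {
            tokens -= permits;
            return true;               // token obtained: process the request
        }
        return false;                  // no token: queue or discard the request
    }
}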

Algorithm 2: Leaky bucket algorithm

The schematic diagram is as follows:

The decision logic of the leaky bucket algorithm is similar to the token bucket's, but the object being operated on differs: the token bucket puts tokens into the bucket, while the leaky bucket puts requests into it. Likewise, when the bucket is full, subsequent requests are discarded; the leaky bucket then lets requests flow out of the bucket only at a constant rate. A minimal sketch follows.
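A minimal sketch of the same idea, with a bounded queue as the bucket, drained by one worker at a constant rate (names are illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class LeakyBucket {
    private final BlockingQueue<Runnable> bucket; // requests wait here
    private final long leakIntervalMs;            // constant outflow: one request per interval

    public LeakyBucket(int capacity, long leakIntervalMs) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        this.leakIntervalMs = leakIntervalMs;
        Thread worker = new Thread(this::leak, "leaky-bucket-worker");
        worker.setDaemon(true);
        worker.start();
    }

    // offer() returns false when the bucket is full, i.e. the request is discarded
    public boolean tryAccept(Runnable request) {
        return bucket.offer(request);
    }

    private void leak() {
        try {
            while (true) {
                bucket.take().run();          // process exactly one request...
                Thread.sleep(leakIntervalMs); // ...then wait, keeping the rate constant
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}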

How the two algorithms relate and differ:

  • Common ground: both algorithms pair a "constant" rate with a "variable" rate. The token bucket creates tokens at a constant rate, while requests consume them at a variable rate, taking as many as are available and waiting when they run out; the leaky bucket processes requests at a constant rate, while requests flow into the bucket at a variable rate.
  • Difference: the leaky bucket by its nature never produces burst traffic; even at 1,000 incoming requests per second, its downstream output rate stays constant. The token bucket, however, can "pre-store" a certain number of tokens and spend all of them within a short time when a burst arrives, so it handles burst traffic more efficiently, at the cost of putting more pressure on the backend systems.

Rate limiting in practice

Rate limiting with Guava

POM dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
</dependency>

The demo code:

// The rate limiting component issues two tokens per second
RateLimiter limiter = RateLimiter.create(2.0);

// Non-blocking rate limiting
@GetMapping("/tryAcquire")
public String tryAcquire(Integer count) {
    // count is the number of tokens this request tries to obtain
    if (limiter.tryAcquire(count)) {
        log.info("success, rate is {}", limiter.getRate());
        return "success";
    } else {
        log.info("fail, rate is {}", limiter.getRate());
        return "fail";
    }
}

// Non-blocking with a bounded wait: block at most `timeout` seconds for the tokens
@GetMapping("/tryAcquireWithTimeout")
public String tryAcquireWithTimeout(Integer count, Integer timeout) {
    if (limiter.tryAcquire(count, timeout, TimeUnit.SECONDS)) {
        log.info("success, rate is {}", limiter.getRate());
        return "success";
    } else {
        log.info("fail, rate is {}", limiter.getRate());
        return "fail";
    }
}

// Synchronous, blocking rate limiting: acquire() waits until the tokens are available
@GetMapping("/acquire")
public String acquire(Integer count) {
    limiter.acquire(count);
    log.info("success, rate is {}", limiter.getRate());
    return "success";
}

These are a few simple demos of Guava single-machine rate limiting, which falls mainly into blocking and non-blocking styles. Once you have started the project, you can vary the request parameters and watch how the logs change.

For example, let's analyze the result of a sequence of requests to localhost:10088/tryAcquire?count=2. The log output after the requests is as follows:

2021-02-18 23:41:48.615  INFO 5004 --- [io-10088-exec-9] com.ktdaddy.KTController : success, rate is 2.0
2021-02-18 23:41:49.164  INFO 5004 --- [io-10088-exec-2] com.ktdaddy.KTController : success, rate is 2.0
2021-02-18 23:41:49.815  INFO 5004 --- [o-10088-exec-10] com.ktdaddy.KTController : success, rate is 2.0
2021-02-18 23:41:50.205  INFO 5004 --- [io-10088-exec-1] com.ktdaddy.KTController : fail, rate is 2.0
2021-02-18 23:41:50.769  INFO 5004 --- [io-10088-exec-3] com.ktdaddy.KTController : success, rate is 2.0
2021-02-18 23:41:51.470  INFO 5004 --- [io-10088-exec-4] com.ktdaddy.KTController : fail, rate is 2.0

From the request log we can see that the first two requests both succeeded even though less than a second passed between them. What causes this? There is a little mystery here, which we will unravel when we cover Guava's traffic warm-up model in a later share.
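Without spoiling that later share, a rough hint: Guava's default limiter (SmoothBursty) can store up to about one second's worth of unused permits, which is what lets a short burst through after an idle moment. Guava also offers a warm-up variant, created like this:

// SmoothWarmingUp: the limiter starts out slower after idle time and ramps up
// to the stable rate of 2 permits/second over a 3-second warm-up period.
RateLimiter warmingLimiter = RateLimiter.create(2.0, 3, TimeUnit.SECONDS);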

Rate limiting with Nginx

The rate limiting configuration in nginx.conf is as follows:

# Limit the request rate by IP address
# (1) The first parameter, $binary_remote_addr: an nginx built-in variable,
#     the binary form of $remote_addr, used here to key the limit by client IP
# (2) The second parameter, zone=iplimit:20m:
#     iplimit is the name of the shared memory zone, 20m is its size
# (3) The third parameter, rate=10r/s, sets the allowed request frequency;
#     it can also be expressed per minute, e.g. 100r/m
limit_req_zone $binary_remote_addr zone=iplimit:20m rate=10r/s;

# Limit the request rate at the server level
limit_req_zone $server_name zone=serverlimit:10m rate=1r/s;

# Zone for limiting the number of connections per IP
limit_conn_zone $binary_remote_addr zone=perip:20m;

# Zone for limiting the number of connections at the server level
limit_conn_zone $server_name zone=perserver:20m;

# Normal service mapping: the domain limit-rate.ktdaddy.com maps to the
# http://127.0.0.1:10088/ service. You can configure the host mapping locally.
server {
        server_name  limit-rate.ktdaddy.com;
        location /access-limit/ {
            proxy_pass http://127.0.0.1:10088/;

            # IP-based request rate limit
            # 1) zone=iplimit => references the zone defined by limit_req_zone above
            # 2) burst=2 => a buffer of size 2 for requests that arrive faster than
            #    the limit; excess requests wait in this buffer
            # 3) nodelay => when the buffer is full, reject immediately (503 by default)
            limit_req zone=iplimit burst=2 nodelay;

            # Server-level request rate limit
            # In general, the server-level rate should be set higher than the per-IP rate.
            limit_req zone=serverlimit burst=1 nodelay;

            # Keep at most 100 connections per server
            limit_conn perserver 100;
            # Keep at most 1 connection per IP address
            limit_conn perip 1;

            # Return 504 instead of the default 503 when a request is rejected
            limit_req_status 504;
            limit_conn_status 504;
        }

        # Simple download speed limit: after the first 100m of a file,
        # throttle the download speed to 256KB/s
        location /download/ {
            limit_rate_after 100m;
            limit_rate 256k;
        }
}

The above covers Nginx rate limiting by IP, by server, by connection count per IP, and by connection count per server; we also touched on the download speed limit configuration. If you are interested, you can set up the configuration yourself and get a feel for Nginx-based rate limiting.

Rate limiting component based on Redis+Lua

This part is important enough that the old cat has decided to split it out as its own topic to share with you, so it is omitted here for now.

Closing thoughts

This article shared the concepts around rate limiting, its practical applications, and the related algorithms and implementations. The Redis+Lua script implementation of rate limiting was not covered here; I personally find it quite practical, so it will be pulled out and shared separately later. Stay tuned.