Written at the front

Recently, a lot of readers have told me that they learned a great deal from my articles. Hearing that makes me genuinely happy; knowing that my writing helps people is a wonderful thing. Some readers have landed offers from big tech companies after reading my articles, and others have kept working through them, sharpened their fundamentals, and eventually become core business developers at their companies. Glacier is truly happy for all of you. I hope you keep learning, maintain a mindset of continuous improvement, and go further and further down the road of technology.

What should I write about today? After some thought, I decided on an article about high-concurrency practice. Yes, an article about how to implement rate limiting with Nginx. If there is a topic you would like to see covered, leave me a message on WeChat or directly in the official account.

Rate limiting measures

If you have read my article “[High Concurrency] High Concurrency Seckill System Architecture Decrypt, Not all Seckills are Seckills!”, you will remember what I said there: many articles and posts on the Internet, when introducing seckill (flash sale) systems, claim that asynchronous peak shaving can serve as a rate limiting measure. That is nonsense! Placing an order happens relatively late in the overall seckill flow, so rate limiting must be applied up front; applying it in the later stages of the seckill business flow is useless.

As a high-performance web proxy and load balancing server, Nginx is usually deployed at the very front of Internet applications. There, we can configure Nginx to limit the request rate per IP address and the number of concurrent connections.

Nginx's official rate limiting modules

Nginx has two modules for limiting per-IP request rates and concurrent connections:

  • The limit_req_zone directive limits the number of requests per unit time, that is, it enforces a rate limit. It implements the leaky bucket algorithm.
  • The limit_conn directive (from ngx_http_limit_conn_module) limits the number of simultaneous connections, that is, it enforces a concurrency limit (a combined sketch of the two follows this list).
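
Before digging into each directive's parameters, here is a minimal skeleton showing how the two work together. This is my own illustrative sketch; the zone names mylimit/myconn and the proxy target 127.0.0.1:8080 are assumptions, not from the original configuration:

http {
    # rate limit: at most 2 requests per second per client IP
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
    # concurrency limit: shared zone keyed by client IP
    limit_conn_zone $binary_remote_addr zone=myconn:10m;

    server {
        listen 80;

        location / {
            limit_req zone=mylimit burst=4;
            limit_conn myconn 10;          # at most 10 concurrent connections per IP
            proxy_pass http://127.0.0.1:8080;
        }
    }
}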

limit_req_zone parameter settings

limit_req_zone parameter description

Syntax: limit_req_zone key zone=name:size rate=rate;
Default:    —
Context:    http

limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
  • The first parameter: $binary_remote_addr is the key, identifying clients by IP address. The binary_ prefix stores the address in binary form, which reduces the memory needed for each state.
  • The second parameter: zone=one:10m creates a 10 MB shared memory zone named one to store access frequency information.
  • The third parameter: rate=1r/s sets how often a client with the same key may access; here the limit is 1 request per second. Other values such as 30r/m (30 requests per minute) are also possible.
Syntax: limit_req zone=name [burst=number] [nodelay];
Default:    —
Context:    http, server, location

limit_req zone=one burst=5 nodelay;
  • The first parameter: zone=one selects which shared zone to use for the limit; it corresponds to the name given in limit_req_zone above.
  • The second parameter: burst=5 sets a buffer of size 5. When a burst of requests arrives, requests that exceed the rate limit can be placed in this buffer first instead of being rejected outright.
  • The third parameter: nodelay. If set, requests that exceed the rate and overflow the buffer are answered with 503 immediately; if not set, excess requests wait in the queue and are processed at the configured rate.

limit_req_zone example

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

    server {
        location /search/ {
            limit_req zone=one burst=5 nodelay;
        }
    }
}

The following configuration restricts access for specific user agents (such as search engine crawlers):

limit_req_zone $anti_spider zone=one:10m rate=10r/s;
limit_req zone=one burst=100 nodelay;
if ($http_user_agent ~* "googlebot|bingbot|Feedfetcher-Google") {
    set $anti_spider $http_user_agent;
}

Note that $anti_spider is empty for ordinary clients, and Nginx does not count requests whose key is empty, so only requests whose User-Agent matches the pattern are subject to this limit.
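
For completeness, here is a sketch of my own showing how these fragments might be arranged in a full configuration; the snippet above omits the surrounding blocks, and the if directive is only valid inside server or location blocks:

http {
    limit_req_zone $anti_spider zone=one:10m rate=10r/s;

    server {
        # set the key only for matching crawlers; everyone else stays unlimited
        if ($http_user_agent ~* "googlebot|bingbot|Feedfetcher-Google") {
            set $anti_spider $http_user_agent;
        }

        location / {
            limit_req zone=one burst=100 nodelay;
        }
    }
}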

Other parameters

Syntax: limit_req_log_level info | notice | warn | error;
Default:    
limit_req_log_level error;
Context:    http, server, location

This directive configures the logging level used when the server rejects or delays requests due to rate limiting. Delayed requests are logged one level lower than rejected ones; for example, with limit_req_log_level notice, delays are logged at the info level.
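
As a small illustration of my own (the zone name and path reuse the earlier example and are assumptions), combining the directive with the search configuration:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    limit_req_log_level notice;   # rejections logged at notice, delays one level lower at info

    server {
        location /search/ {
            limit_req zone=one burst=5;
        }
    }
}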

Syntax: limit_req_status code;
Default:    
limit_req_status 503;
Context:    http, server, location

Sets the status code returned for rejected requests. The value must be between 400 and 599.
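
For example, many deployments prefer 429 (Too Many Requests), which falls within the allowed range; this is a suggestion of mine, not something from the original configuration:

limit_req_status 429;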

ngx_http_limit_conn_module parameter configuration

ngx_http_limit_conn_module parameter description

This module limits the number of concurrent connections per key (for example, per client IP address). Not all connections are counted; a connection is counted only when the server is processing a request and the entire request header has already been read.

Syntax: limit_conn zone number;
Default:    —
Context:    http, server, location
limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
    location /download/ {
        limit_conn addr 1;
    }
}

Only one connection per IP address is allowed at a time.
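
On download endpoints, limit_conn is often combined with the core module's limit_rate directive to also cap per-connection bandwidth. A sketch under my own assumptions (the 100k value is illustrative):

server {
    location /download/ {
        limit_conn addr 1;     # one concurrent connection per IP
        limit_rate 100k;       # and at most 100 KB/s per connection
    }
}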

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
 
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

Multiple limit_conn directives can apply at the same time. For example, the configuration above limits the number of connections to the server per client IP address and, at the same time, the total number of connections to the virtual server.

Syntax: limit_conn_zone key zone=name:size;
Default:    —
Context:    http
limit_conn_zone $binary_remote_addr zone=addr:10m;

Here, the client IP address is used as the key. Note that it is the $binary_remote_addr variable that is used, not $remote_addr. The $remote_addr variable can be anywhere from 7 to 15 bytes long, and its stored state occupies 32 or 64 bytes of memory on 32-bit platforms and always 64 bytes on 64-bit platforms. The $binary_remote_addr variable is always 4 bytes for IPv4 addresses and 16 bytes for IPv6 addresses, and its stored state always occupies 32 or 64 bytes on 32-bit platforms and 64 bytes on 64-bit platforms. A one-megabyte zone can hold about 32,000 32-byte states or about 16,000 64-byte states. If the zone storage is exhausted, the server returns an error to all further requests.
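
A quick back-of-the-envelope for zone sizing (my own arithmetic, not from the Nginx documentation):

# at 64 bytes per state on a 64-bit platform:
# 10 MB = 10 * 1024 * 1024 = 10,485,760 bytes
# 10,485,760 / 64 = 163,840 states, i.e. roughly 160,000 distinct client IPs
limit_conn_zone $binary_remote_addr zone=addr:10m;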

Syntax: limit_conn_log_level info | notice | warn | error;
Default:    
limit_conn_log_level error;
Context:    http, server, location

Sets the logging level used when the server limits the number of connections.

Syntax: limit_conn_status code;
Default:    
limit_conn_status 503;
Context:    http, server, location

Sets the status code returned for rejected requests.

Nginx rate limiting in practice

Limiting access rate

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit;
    }
}

The rule above limits each IP address to 2 requests per second (2r/s) and applies that limit to the root location. What happens if a single IP sends multiple requests concurrently within a very short period?

We used a single IP to send 6 requests within 10ms: only 1 succeeded, and the remaining 5 were rejected. We set the rate to 2r/s, so why did only one get through? Is Nginx's limiting wrong? Of course not. Nginx enforces the limit at millisecond granularity: a rate of 2r/s means a single IP is allowed only one request every 500ms, and the second request is allowed only from the 501st millisecond onward.

Burst cache processing

As we saw, when a large number of requests arrive in a short period, Nginx rejects the ones that exceed the limit with millisecond precision. That is too harsh for real scenarios: in a real network environment, requests do not arrive at a uniform pace, and traffic is likely to come in "bursts", one surge after another. With this in mind, Nginx lets you cache burst requests with the burst keyword instead of rejecting them outright.

Look at our configuration:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit burst=4;
    }
}

We added burst=4, meaning that up to 4 burst requests are allowed per key (here, per IP). What happens if a single IP sends six requests within 10ms?

Compared with the first example, 4 more requests succeed, which matches the burst size we set. The process is as follows: 1 request is processed immediately, 4 requests are placed in the burst queue, and 1 request is rejected. With the burst parameter, Nginx's rate limiting gains the ability to cache and absorb bursts of traffic.

But be aware: burst only allows the extra requests to be queued and processed slowly. Without the nodelay parameter, requests in the queue are not processed immediately but are released at the configured rate, with millisecond precision.

nodelay reduces the queuing time

With burst cache processing, we saw that setting the burst parameter lets Nginx buffer a certain amount of burst traffic: the extra requests are queued and processed slowly, smoothing the flow. However, if the queue is large, requests spend a long time queuing, which means a longer RT from the user's perspective, and that is very unfriendly to users. What is the solution? The nodelay parameter: with it, requests are processed as soon as they enter the queue; as long as a request can get into the burst queue, it is handled by a worker immediately. Note that this means that with burst and nodelay set, the system's instantaneous QPS may exceed the threshold set by rate. The nodelay parameter only takes effect when used together with burst.

Continuing from the burst cache configuration, we add the nodelay option:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit burst=4 nodelay;
    }
}

A single IP address sends 6 requests concurrently within 10ms, and the result is as follows:

The request success rate is unchanged compared with burst cache processing, but the overall time is much shorter. How do we explain that? Without nodelay, four requests sit in the burst queue and a worker takes one every 500ms (rate=2r/s), so the last request queues for 2 seconds before being processed. With nodelay, requests enter the queue just as before, but the difference is that queued requests become eligible for processing immediately, so all five requests start being processed at once, which naturally takes less time.

Note, however, that although setting burst together with nodelay reduces the processing time of burst requests, it does not raise the long-term throughput ceiling. The long-term upper limit is still determined by rate, because nodelay only guarantees that burst requests are processed immediately; Nginx still limits the rate at which queue slots are released, just as a token bucket limits the rate at which tokens are generated.

At this point you may ask: with nodelay added, is the rate-limiting algorithm a leaky bucket or a token bucket? It is still a leaky bucket, of course. Consider what happens when a token bucket runs out of tokens: since there is a request queue, subsequent requests are cached, limited only by the queue size. But does caching those requests make sense at that point? If the server is overloaded, the cache queue grows longer and longer, and the RT grows higher and higher; even if a request is eventually processed much later, it has little value to the user. So when capacity is exhausted, the most sensible approach is to reject the user's request outright, which is exactly what the leaky bucket algorithm does.

Custom return value

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;

server {
    location / {
        limit_req zone=mylimit burst=4 nodelay;
        limit_req_status 598;
    }
}

By default, when limit_req_status is not configured, rejected requests are answered with 503; with the configuration above, the client receives 598 instead.

Well, let's call it a day! Don't forget to like and share this article so more people can see it. Let's learn and make progress together!

Written at the end

If you find Glacier's articles helpful, please search for and follow the "Glacier Technology" WeChat official account, and learn high concurrency, distributed systems, microservices, big data, Internet, and cloud native technologies together with Glacier. The "Glacier Technology" account is regularly updated with a large number of technical topics, and every article is packed with practical content! Many readers have successfully moved to big tech companies by reading the articles there, and many others have made a leap in their skills and become the technical backbone of their companies! If you, like them, want to improve your abilities, achieve a leap in technical skill, join a big company, and get a promotion and a raise, then follow the "Glacier Technology" WeChat official account: hard-core technical content is updated every day, so you will never again be confused about how to improve your skills!