This is the 9th day of my participation in the November Gwen Challenge. See details: The Last Gwen Challenge 2021.

Preface

This article has been included in GitHub - Java3C, which collects my series of articles. Everyone is welcome to give it a Star.

Hello everyone, my name is L Labrami. In my "Super Architect" column, the first two articles, "Super Architect" Graph Code Actual Traffic Limiting Algorithm (1) and "Super Architect" Graph Code Actual Traffic Limiting Algorithm (2), explained several mainstream traffic limiting algorithms with pseudo-code examples. This article explains several approaches to system traffic limiting in the form of theory + diagrams + code.

Container traffic limiting

Tomcat traffic limiting

The maximum number of worker threads is set in the Tomcat configuration file. When the number of concurrent requests exceeds the maximum number of threads, the excess requests wait in a queue, which effectively achieves the purpose of traffic limiting.

In conf/server.xml under the Tomcat installation directory:

<Connector port="8080" protocol="HTTP / 1.1"
          connectionTimeout="20000"
          maxThreads="150"
          redirectPort="8443" />
Copy the code

maxThreads is the maximum number of concurrent worker threads.

Nginx traffic limiting

Nginx traffic limiting is similar to Tomcat traffic limiting and is implemented through configuration files.

There are two main ways:

  • Limit the request rate
  • Limit the number of concurrent connections

Limit the request rate

Add a traffic limiting rule to the http block of the nginx.conf configuration file:

Format: limit_req_zone key zone=name:size rate=rate;

  • key: defines the object to be limited. $binary_remote_addr limits traffic per client IP address ($remote_addr); the binary form is used to reduce memory usage.
  • zone: defines a shared memory zone that stores the access state. zone=myRateLimit:10m declares a 10 MB zone named myRateLimit. 1 MB can hold the state of roughly 16,000 IP addresses, so 10 MB can hold roughly 160,000.
  • rate: sets the maximum request rate. rate=10r/s allows at most 10 requests per second. Nginx actually tracks requests at millisecond granularity, so 10r/s really means one request every 100 milliseconds: if another request with the same key arrives within 100 milliseconds of the previous one, it is rejected.
http {
    limit_req_zone $binary_remote_addr zone=myRateLimit:10m rate=10r/s;
}

Then configure the server block and apply the limit with the limit_req directive:

server {
    location / {
        limit_req zone=myRateLimit;
        proxy_pass http://my_upstream;
    }
}

Limit the number of concurrent connections

Use the limit_conn_zone and limit_conn directives.

Nginx official example:

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;

server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

limit_conn perip 10: the key is $binary_remote_addr, which limits a single IP address to at most 10 concurrent connections.

limit_conn perserver 100: the key is $server_name, which limits a single virtual host (server) to at most 100 concurrent connections.

Traffic limiting on the server

Traffic limiting algorithms

The traffic limiting algorithms on the server are as follows:

  • Fixed window algorithm
  • Sliding window algorithm
  • Leaky bucket algorithm
  • Token bucket algorithm

The first two articles, "Super Architect" Graph Code Actual Traffic Limiting Algorithm (1) and "Super Architect" Graph Code Actual Traffic Limiting Algorithm (2), explain these algorithms in detail with diagrams and code.
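
If you have not read those two articles, the following is a minimal fixed window counter sketch for reference (the class and field names are illustrative and not taken from those articles):

// A minimal fixed window counter: allow at most `limit` requests per window.
public class FixedWindowLimiter {
    private final int limit;           // max requests per window
    private final long windowMillis;   // window length in milliseconds
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now;   // a new window begins, reset the counter
            count = 0;
        }
        return ++count <= limit; // allow only while the counter is within the limit
    }
}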

Concrete implementation

RateLimiter from the Google Guava library

It is based on the token bucket algorithm and is suitable for a single-node (standalone) architecture.

// Submit tasks at a rate of no more than 2 per second
final RateLimiter rateLimiter = RateLimiter.create(2.0);

void submitTasks(List<Runnable> tasks, Executor executor) {
    for (Runnable task : tasks) {
        rateLimiter.acquire(); // may wait
        executor.execute(task);
    }
}
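
acquire() blocks until a permit is available. When a request should fail fast instead of waiting, tryAcquire() can be used; a minimal sketch (the handler class and return values are illustrative):

import com.google.common.util.concurrent.RateLimiter;

public class ApiGuard {
    // Allow at most 5 requests per second (illustrative value).
    private final RateLimiter limiter = RateLimiter.create(5.0);

    public String handle() {
        if (limiter.tryAcquire()) {   // returns immediately without blocking
            return "ok";              // handle the request normally
        }
        return "rejected";            // fast-fail when the rate limit is exceeded
    }
}
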
Redis + Lua

Redis provides distributed key-value storage, and the Lua script guarantees that the operations execute atomically.

The logic of the Lua script is as follows:

-- Get the first key passed in when the script is called (the key used for traffic limiting)
local key = KEYS[1]
-- Get the first argument passed in when the script is called (the limit threshold)
local limit = tonumber(ARGV[1])

-- Get the current count
local currentLimit = tonumber(redis.call('get', key) or "0")

-- Check whether the limit would be exceeded
if currentLimit + 1 > limit then
    -- Reject the request
    return 0
else
    -- Not exceeded: increment the counter
    redis.call("INCRBY", key, 1)
    -- Set the expiration time (the counting window, in seconds)
    redis.call("EXPIRE", key, 2)
    -- Allow the request
    return 1
end
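
As a sketch of how such a script could be invoked from Java, assuming the Jedis client is used (the class, key name, and limit are illustrative):

import java.util.Collections;
import redis.clients.jedis.Jedis;

public class RedisLuaRateLimiter {
    // The same logic as the script above, inlined as a string.
    private static final String SCRIPT =
            "local key = KEYS[1] " +
            "local limit = tonumber(ARGV[1]) " +
            "local currentLimit = tonumber(redis.call('get', key) or '0') " +
            "if currentLimit + 1 > limit then return 0 " +
            "else redis.call('INCRBY', key, 1) redis.call('EXPIRE', key, 2) return 1 end";

    public boolean allow(Jedis jedis, String key, int limit) {
        Object result = jedis.eval(SCRIPT,
                Collections.singletonList(key),
                Collections.singletonList(String.valueOf(limit)));
        return Long.valueOf(1L).equals(result); // 1 means allowed, 0 means rejected
    }
}

Because the script runs entirely inside Redis, the check and the increment cannot interleave with requests from other application instances.
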
Gateway layer traffic limiting

Examples include Zuul and Spring Cloud Gateway. Spring Cloud Gateway's traffic limiting is itself implemented on top of Redis + Lua, through a built-in Lua rate limiting script.
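
With Spring Cloud Gateway, the built-in RequestRateLimiter filter (backed by RedisRateLimiter and its bundled Lua script) needs a key to limit on. A minimal sketch of a KeyResolver bean that limits per client IP (the configuration class is illustrative, and null handling is omitted for brevity):

import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.core.publisher.Mono;

@Configuration
public class RateLimitConfig {
    // Resolve the limiting key from the client IP address.
    @Bean
    public KeyResolver ipKeyResolver() {
        return exchange -> Mono.just(
                exchange.getRequest().getRemoteAddress().getAddress().getHostAddress());
    }
}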

Alibaba Sentinel

Start the Sentinel server and configure the rules for your interface resources in the Sentinel console.
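
On the application side, the protected logic is wrapped with Sentinel's API using the same resource name as the rule configured in the console; a minimal sketch (the resource and method names are illustrative):

import com.alibaba.csp.sentinel.Entry;
import com.alibaba.csp.sentinel.SphU;
import com.alibaba.csp.sentinel.slots.block.BlockException;

public class OrderService {
    public String createOrder() {
        Entry entry = null;
        try {
            // "createOrder" must match the resource name of the flow rule in the console.
            entry = SphU.entry("createOrder");
            return "order created";              // normal business logic
        } catch (BlockException e) {
            return "blocked by flow rule";       // the request was limited
        } finally {
            if (entry != null) {
                entry.exit();
            }
        }
    }
}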

Service circuit breaking

The system should be designed with circuit breaking in mind. If a fault occurs in the system and cannot be fixed within a short period of time, the system automatically detects it and opens the circuit breaker, rejecting traffic so that heavy load does not overwhelm the faulty backend.

The system should also be able to monitor the recovery of the backend dynamically; once the backend has stabilized, the circuit breaker can be closed and normal service resumed. Common circuit breakers include Hystrix and Alibaba Sentinel, each with its own advantages and disadvantages, and the choice depends on the actual situation of the business.
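
As an illustration of the circuit breaking idea, here is a minimal Hystrix-style sketch (the command, group, and return values are illustrative): when the wrapped call keeps failing, the circuit opens and the fallback is returned directly instead of calling the downstream service.

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class QueryUserCommand extends HystrixCommand<String> {
    private final String userId;

    public QueryUserCommand(String userId) {
        super(HystrixCommandGroupKey.Factory.asKey("UserGroup"));
        this.userId = userId;
    }

    @Override
    protected String run() {
        // Call the real downstream service here (illustrative).
        return "user-" + userId;
    }

    @Override
    protected String getFallback() {
        return "default-user";   // degraded response while the circuit is open or the call fails
    }
}

A call such as new QueryUserCommand("42").execute() then returns either the real result or the fallback, depending on the circuit state.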

Service degradation

All the functions and services of the system are graded. When the system has problems and needs emergency traffic limiting, the less important functions can be degraded or stopped, releasing more resources to keep the core functions running.

In an e-commerce platform, for example, if traffic surges suddenly, non-core functions such as product reviews and loyalty points can be temporarily degraded and their services stopped, releasing machine and CPU resources to keep order placement working normally. After the whole system returns to normal, these degraded services can be started again and the pending orders compensated. Besides functional degradation, reading from and writing to a cache instead of operating on the database directly can also be used as a temporary degradation scheme.

Delayed processing

This pattern places a traffic buffer pool in front of the system: all requests enter the pool first instead of being processed immediately. The real backend business handlers then pull requests out of the pool and process them in turn, usually using a queue. This is equivalent to reducing backend pressure asynchronously; however, when traffic is heavy and backend capacity is limited, requests in the buffer pool may not be processed promptly, so there is some delay. The leaky bucket algorithm and the token bucket algorithm follow this idea.
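
A minimal sketch of this buffer pool idea using a blocking queue (the capacity and handling are illustrative):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BufferedProcessor {
    // Requests wait here instead of hitting the backend immediately.
    private final BlockingQueue<Runnable> buffer = new LinkedBlockingQueue<>(10_000);

    public boolean submit(Runnable request) {
        return buffer.offer(request);   // returns false (reject) when the pool is full
    }

    public void startWorker() {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    buffer.take().run();   // the backend drains requests at its own pace
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }
}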

Privileged processing

In this mode, users are classified. Based on a preset classification, the system gives priority to the user groups that require a higher service guarantee, while requests from other user groups are delayed or simply not processed.
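
A minimal sketch of privileged processing with a priority queue (the Request type and priority values are illustrative):

import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class PriorityDispatcher {
    // Lower priority value = more privileged user group.
    public record Request(String userId, int priority) {}

    private final PriorityBlockingQueue<Request> queue =
            new PriorityBlockingQueue<>(1024, Comparator.comparingInt(Request::priority));

    public void submit(Request request) {
        queue.offer(request);
    }

    public Request next() throws InterruptedException {
        return queue.take();   // always hands out the most privileged pending request
    }
}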

The difference between caching, degradation, and traffic limiting

Caching

Caching is used to increase system throughput, speed up access, and support high concurrency.

Degradation

When some service components of the system become unavailable, traffic surges, or resources are exhausted, the system temporarily shields the faulty services and continues to provide a degraded service. It gives users friendly prompts where possible and returns fallback data, so the overall business flow is not affected. After the problem is solved, the services are restored.

Traffic limiting

Traffic limiting is for scenarios where caching and degradation are not enough, for example when a threshold is reached on interface call frequency, access counts, or inventory levels. The service is limited or degraded in advance, before it becomes unavailable, so that at least a subset of users is still served well.

Finally

Creation is not easy, thank you for your likes!! 🙏 🙏