This article describes the design of a Redis-based rate limiting system, focusing on the design of the traffic limiting policy function. For the implementation, the token bucket algorithm is used, and Redis is accessed through Lua scripts.

1. Concepts

In computer networks, rate limiting is used to control the rate of traffic sent or received by a network interface controller, and can be used to prevent DoS attacks.

Traffic limiting controls the flow into and out of a system to prevent excessive traffic from exhausting resources and destabilizing the system.

A traffic limiting system is a control component for resource access. It covers two functions: the traffic limiting policy and the circuit breaker policy. Circuit breaker requirements differ between systems: some reject excess requests outright, some want them to queue, some want service degradation, and some need to customize their own circuit breaker policies. It is hard to enumerate them all, so this article gives a detailed design only for the traffic limiting policy function.

For the function of traffic limiting policy, the traffic limiting system has two basic concepts: resource and policy.

  • Resource: a scarce resource, the object whose access is controlled by traffic limiting; for example, write interfaces, external merchant interfaces, or read interfaces under heavy traffic

  • Policy: A traffic limiting policy consists of a traffic limiting algorithm and adjustable parameters

Circuit breaker policy: the policy for handling requests that exceed the rate threshold. This is my own term, not the mainstream industry term.

2. Traffic limiting algorithms

  • Limiting instantaneous concurrency

  • Limiting the maximum number of requests in a time window

  • Token bucket

2.1. Limiting instantaneous concurrency

Definition: instantaneous concurrency is the number of requests/transactions the system is processing at the same moment

Advantages: this algorithm directly controls the number of concurrent requests

Disadvantages: the applicable scenarios are relatively narrow; it is generally only used to limit inbound request traffic

Java pseudo-code implementation:

AtomicInteger atomic = new AtomicInteger(0);
try {
    if (atomic.incrementAndGet() > limit) {   // limit = maximum allowed concurrency
        // circuit breaker logic
    } else {
        // business logic
    }
} finally {
    atomic.decrementAndGet();
}
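In a single JVM, the same concurrency limit can also be expressed with java.util.concurrent.Semaphore instead of a hand-rolled counter; the following is only an illustrative sketch (the class name and the limit of 100 are assumptions, not part of the original design):

import java.util.concurrent.Semaphore;

public class ConcurrencyLimiter {
    // At most 100 requests are processed at the same time (illustrative value)
    private final Semaphore semaphore = new Semaphore(100);

    public void handle(Runnable request) {
        if (!semaphore.tryAcquire()) {
            // circuit breaker logic: reject, queue, or degrade
            return;
        }
        try {
            request.run();   // business logic
        } finally {
            semaphore.release();
        }
    }
}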

2.2. Limiting the maximum number of requests in a time window

Definition: the maximum number of requests allowed within a specified time window

Advantages: this algorithm can meet most flow control requirements, and the maximum QPS can be derived directly from the maximum number of requests in the time window (QPS = number of requests / window length in seconds; for example, 600 requests per 60-second window corresponds to 10 QPS).

Disadvantages: the traffic may not be smooth; a small slice of the time window can account for a disproportionately large share of the requests

Lua code implementation:

-- Unique identifier of the resource
local key = KEYS[1]
-- Maximum number of requests allowed in the time window
local max_window_concurrency = tonumber(ARGV[1])
-- Length of the time window, in seconds
local window = tonumber(ARGV[2])
-- Number of requests already counted in the current window
local current = tonumber(redis.call("GET", key) or 0)

if current + 1 > max_window_concurrency then
    return false
else
    redis.call("INCRBY", key, 1)
    if window > -1 then
        redis.call("EXPIRE", key, window)
    end
    return true
end
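For reference, a minimal sketch of how this script might be invoked from Java with a Jedis client; the key name and the limits (100 requests per 60-second window) are illustrative assumptions. Note that Redis converts Lua's true into 1 and false into a null reply:

import java.util.Arrays;
import redis.clients.jedis.Jedis;

public class WindowLimiterExample {
    public static void main(String[] args) {
        // windowScript holds the Lua script above, e.g. loaded from a resource file
        String windowScript = "...";
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // KEYS[1] = resource key, ARGV[1] = max requests in window, ARGV[2] = window in seconds
            Object allowed = jedis.eval(windowScript,
                    Arrays.asList("rate_limiter:read_api"),
                    Arrays.asList("100", "60"));
            // Lua true comes back as 1, Lua false comes back as null
            System.out.println("request allowed: " + (allowed != null));
        }
    }
}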

2.3. Token bucket

Algorithm description

  • If the average rate configured by the user is r, a token is added to the bucket every 1/r seconds

  • Assume the bucket can hold at most b tokens; if a token arrives when the bucket is full, it is discarded

  • When traffic arrives at rate v, each request tries to take a token from the bucket; traffic that obtains a token passes, and traffic that does not obtain one fails and goes through the circuit breaker logic

Attributes

  • In the long run, the rate of conforming (passed) traffic is bounded by the token addition rate and stabilizes at r

  • Because the token bucket can store tokens, it can absorb a certain amount of burst traffic (a worked example follows these formulas):

    • M is the maximum possible transmission rate, with M > r

    • T_max = b / (M - r): the length of time for which the maximum rate M can be sustained

    • B_max = T_max * M: the amount of traffic transmitted during that burst
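As an illustration with assumed numbers: if r = 100 tokens/s, b = 500 tokens, and M = 600 requests/s, then T_max = 500 / (600 - 100) = 1 second, and B_max = 1 * 600 = 600 requests pass during the burst.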

Advantages: the traffic is smooth, and a certain amount of burst traffic can be absorbed

Because our traffic limiting system is implemented with the token bucket algorithm, the concrete code is given in section 3.4.3 below.

3. Project implementation

3.1. Technology selection

  • MySQL: stores metadata such as the parameters of traffic limiting policies

  • Redis + Lua: token bucket algorithm implementation

Note: because we position Redis as a cache and computing medium, the metadata is stored in the database.

3.2. Architecture Diagram

3.3. Data structure

field         description
name          unique identifier of the token bucket
apps          list of applications allowed to use the token bucket
max_permits   maximum number of tokens the token bucket can hold
rate          rate at which tokens are added to the token bucket
created_by    creator
updated_by    last updater

The traffic limiting system itself is implemented on Redis and in principle could be unaware of applications. However, in order to manage the traffic limiting metadata uniformly and to manage and use it by application dimension, the apps field is added to the data structure; it also makes troubleshooting easier when problems occur.
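A minimal sketch of how a policy row read from MySQL might be written into the Redis hash that the Lua script later reads, assuming a Jedis client; the key prefix rate_limiter: and the comma-separated apps format are assumptions for illustration:

import redis.clients.jedis.Jedis;

public class PolicySync {
    // Writes one traffic limiting policy into Redis so the Lua script can read it.
    public static void syncPolicy(Jedis jedis, String name, String apps,
                                  long maxPermits, double rate) {
        String key = "rate_limiter:" + name;                       // assumed key naming
        jedis.hset(key, "max_permits", String.valueOf(maxPermits));
        jedis.hset(key, "rate", String.valueOf(rate));
        jedis.hset(key, "apps", apps);                              // e.g. "order-service,pay-service"
        jedis.hset(key, "curr_permits", "0");                       // initial value
        // last_mill_second is left unset on purpose: the script treats its absence
        // as "bucket just created" and fills it in on the first acquire
    }
}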

3.4. Code implementation

3.4.1 Problems encountered in code implementation

Following the token bucket algorithm description, the straightforward idea is to run a repeating thread inside ratelimiter-client that adds tokens to the bucket according to the configuration. This implementation has the following disadvantages (a sketch of the rejected approach follows the list):

  • A separate repeating thread is needed for every token bucket configuration

  • The repeat interval is not precise enough: the thread needs to add a token to the bucket every 1/r seconds, and when r > 1000 the interval cannot be expressed in whole milliseconds (ratelimiter-client needs to handle QPS > 5000, as reflected in the performance tests later)
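For comparison, a minimal sketch of the rejected refill-thread approach, assuming one scheduler per bucket; the class and field names are illustrative:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class RefillThreadBucket {
    private final AtomicLong tokens = new AtomicLong(0);
    private final long maxPermits;

    public RefillThreadBucket(long maxPermits, long rate) {
        this.maxPermits = maxPermits;
        // One dedicated scheduler thread per bucket (disadvantage 1).
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // One token every 1000 / rate milliseconds: for rate > 1000 this integer
        // division yields 0, which scheduleAtFixedRate rejects (disadvantage 2).
        long periodMillis = 1000 / rate;
        scheduler.scheduleAtFixedRate(
                () -> tokens.updateAndGet(t -> Math.min(t + 1, this.maxPermits)),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public boolean tryAcquire() {
        return tokens.getAndUpdate(t -> t > 0 ? t - 1 : t) > 0;
    }
}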

3.4.2 Solutions

To address these shortcomings, we add tokens in a trigger-based way, referring to the implementation of RateLimiter in Google's Guava.

Algorithm description

  • Based on the above token bucket algorithm

  • Token addition is changed to a trigger mode: fetching a token is what performs the token addition

  • When a token is fetched, the number of tokens to add is calculated from the time difference between the last addition and now, and those tokens are added to the bucket (a worked example follows the list)

    • curr_mill_second = current time in milliseconds

    • last_mill_second = time in milliseconds when tokens were last added

    • r = rate of token addition

    • reserve_permits = (curr_mill_second - last_mill_second) / 1000 * r

  • The token fetching logic is performed after the token is added
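As an illustration with assumed numbers: if the rate r = 100 tokens per second, 250 ms have passed since last_mill_second, curr_permits = 3 and max_permits = 200, then reserve_permits (called reverse_permits in the script below) = floor(250 / 1000 * 100) = 25, and the bucket is topped up to min(3 + 25, 200) = 28 tokens before the requested permits are deducted.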

3.4.3 Lua code implementation

-- Get token
-- Return code:
--    0  the token bucket is not configured
--   -1  failed to get a token (no tokens left in the bucket)
--    1  token obtained successfully
-- @param key               unique identifier of the token bucket (the resource)
-- @param permits           number of permits requested
-- @param curr_mill_second  current time in milliseconds
-- @param context           identity of the calling application
local function acquire(key, permits, curr_mill_second, context)
    local rate_limit_info = redis.pcall("HMGET", key,
        "last_mill_second", "curr_permits", "max_permits", "rate", "apps")
    local last_mill_second = rate_limit_info[1]
    local curr_permits = tonumber(rate_limit_info[2])
    local max_permits = tonumber(rate_limit_info[3])
    local rate = rate_limit_info[4]
    local apps = rate_limit_info[5]

    -- The token bucket is not configured, or the calling app is not allowed to use it
    -- (contains(apps, context) is a helper defined elsewhere in the full script)
    if type(apps) == 'boolean' or apps == nil or not contains(apps, context) then
        return 0
    end

    local local_curr_permits = max_permits

    -- When the bucket has just been created, last_mill_second is empty.
    -- Otherwise, add tokens according to the time elapsed since the last addition
    -- and update last_mill_second; if less than one whole token would be added,
    -- last_mill_second is not updated, so that short intervals can accumulate.
    if (type(last_mill_second) ~= 'boolean' and last_mill_second ~= false and last_mill_second ~= nil) then
        local reverse_permits = math.floor(((curr_mill_second - last_mill_second) / 1000) * rate)
        local expect_curr_permits = reverse_permits + curr_permits
        local_curr_permits = math.min(expect_curr_permits, max_permits)

        -- reverse_permits > 0 means at least one token was added this time,
        -- so record the time of this addition
        if (reverse_permits > 0) then
            redis.pcall("HSET", key, "last_mill_second", curr_mill_second)
        end
    else
        redis.pcall("HSET", key, "last_mill_second", curr_mill_second)
    end

    local result = -1
    if (local_curr_permits - permits >= 0) then
        result = 1
        redis.pcall("HSET", key, "curr_permits", local_curr_permits - permits)
    else
        redis.pcall("HSET", key, "curr_permits", local_curr_permits)
    end

    return result
end
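For reference, a minimal sketch of how a client might call this script, assuming a Jedis connection and assuming the complete script (as in the repository) ends by invoking acquire with KEYS[1] and the ARGV parameters; the resource key and application name are illustrative:

import java.util.Arrays;
import redis.clients.jedis.Jedis;

public class AcquireExample {
    public static void main(String[] args) {
        // tokenBucketScript holds the full Lua script, e.g. loaded from a resource file
        String tokenBucketScript = "...";
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            String sha = jedis.scriptLoad(tokenBucketScript);
            long nowMillis = System.currentTimeMillis();
            // KEYS[1] = resource key; ARGV = permits requested, current millis, calling app
            Object result = jedis.evalsha(sha,
                    Arrays.asList("rate_limiter:order_write"),
                    Arrays.asList("1", String.valueOf(nowMillis), "order-service"));
            // By the script's convention: 1 = acquired, -1 = no tokens left, 0 = not configured
            System.out.println("acquire result: " + result);
        }
    }
}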

All the implementation details of the traffic limiting system are on GitHub: https://github.com/wukq/rate-limiter. Interested readers are welcome to take a look; since the author's experience and knowledge are limited, any errors or questionable choices in the code are welcome to be discussed and corrected.

3.4.4 Management Interface

In the previous design, the traffic limiting configuration is associated with the application. To manage the configuration better, a unified management page is needed to control it:

  • Manage traffic limiting configurations by application

  • Different people are assigned different permissions: related personnel can view configurations, and the responsible person can modify or delete them

3.5. Performance test

Configuration: AWS ElastiCache Redis, 2 cores, 4 GB

Because ratelimiter-client's own logic is relatively simple, its throughput is essentially Redis's performance minus a small overhead.

  • Single thread fetching tokens: QPS of ratelimiter-client = 250/s

  • 10 threads fetching tokens: QPS of ratelimiter-client = 2000/s

  • 100 threads fetching tokens: QPS of ratelimiter-client = 5000/s

4. Summary

The traffic limiting system is relatively simple from design to implementation, but it is very practical; in one phrase, it is small but powerful. What matters more is combining your company's permission system and architecture to design a traffic limiting system that fits your own specifications.

Shortcomings:

  • We use a single-point Redis with only master-slave replication, rather than a Redis high-availability cluster (using a high-availability cluster might also bring new problems)

  • Traffic limiting is only implemented at the application layer, not at the API gateway

  • Circuit breaker policies have to be customized by each user; a better implementation would provide some common circuit breaker policy templates


Reference Books:

1. Redis Design and Implementation

2. Programming in Lua

Reference article:

1. Redis official website

2. Lua coding conventions

3. Talking about rate limiting in high-concurrency systems

4. Guava RateLimiter implementation

5. Token bucket Wikipedia entry