Introduction to rate limiting

When it comes to high availability, three protection techniques come up constantly: caching, degradation, and rate limiting. This post focuses on rate limiting. Rate limiting means that only a specified volume of events is allowed to enter the system; events beyond the limit are denied service, queued, or degraded. For a server, rate limiting guarantees that at least part of the request traffic gets a proper response, which is better than failing to respond to all requests, or even triggering a system avalanche. Rate limiting and circuit breaking are often confused. In my view, the biggest difference is that rate limiting is mainly implemented on the server side, while circuit breaking is mainly implemented on the client side. Of course, a service can act as both server and client, in which case rate limiting and circuit breaking coexist in the same service.

So why limit the rate at all? Many people's first reaction is "the service is overloaded, so we need to limit traffic", but that is not the whole picture. Rate limiting is a self-protection measure driven by the scarcity of resources, or sometimes by security concerns. It ensures that a service makes the most of its limited resources: it serves the expected volume of traffic, and anything beyond that is denied, queued, or degraded.

Support for rate limiting varies from system to system, but there are some standards. RFC 6585 defines the HTTP status code 429 Too Many Requests, which indicates that the user has sent too many requests in a given period of time ("rate limiting"). The response can include a Retry-After header that tells the client how long to wait before requesting the service again.

HTTP/1.1 429 Too Many Requests
Content-Type: text/html
Retry-After: 3600

<html>
   <head>
      <title>Too Many Requests</title>
   </head>
   <body>
      <h1>Too Many Requests</h1>
      <p>I only allow 50 requests per hour to this Web site per
         logged in user.  Try again soon.</p>
   </body>
</html>

Many application frameworks also integrate rate limiting and return explicit rate limit information in the response headers:

  • X-Rate-Limit-Limit: the maximum number of requests allowed in the current time window.
  • X-Rate-Limit-Remaining: the number of requests remaining in the current window.
  • X-Rate-Limit-Reset: the number of seconds until the current window resets.

These response headers tell the caller what rate limit the server enforces, guaranteeing an upper bound on access to the backend interface, and the client can pace its requests according to the headers in the response.
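To make these headers concrete, here is a minimal sketch, using only the Go standard library, of an HTTP middleware that enforces a fixed per-window limit and reports it through these conventional headers; the 50-requests-per-hour limit, handler, and port are illustrative assumptions, not any particular framework's implementation.

package main

import (
   "fmt"
   "net/http"
   "strconv"
   "sync"
   "time"
)

type windowLimiter struct {
   mu     sync.Mutex
   limit  int           // max requests per window
   window time.Duration // window length
   count  int           // requests seen in the current window
   reset  time.Time     // when the current window ends
}

func (l *windowLimiter) allow() (remaining int, reset time.Time, ok bool) {
   l.mu.Lock()
   defer l.mu.Unlock()
   now := time.Now()
   if now.After(l.reset) {
      l.count = 0
      l.reset = now.Add(l.window)
   }
   if l.count >= l.limit {
      return 0, l.reset, false
   }
   l.count++
   return l.limit - l.count, l.reset, true
}

func rateLimit(l *windowLimiter, next http.Handler) http.Handler {
   return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      remaining, reset, ok := l.allow()
      secs := strconv.Itoa(int(time.Until(reset).Seconds()) + 1)
      w.Header().Set("X-Rate-Limit-Limit", strconv.Itoa(l.limit))
      w.Header().Set("X-Rate-Limit-Remaining", strconv.Itoa(remaining))
      w.Header().Set("X-Rate-Limit-Reset", secs)
      if !ok {
         w.Header().Set("Retry-After", secs)
         http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
         return
      }
      next.ServeHTTP(w, r)
   })
}

func main() {
   limiter := &windowLimiter{limit: 50, window: time.Hour}
   hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      fmt.Fprintln(w, "ok")
   })
   http.ListenAndServe(":8080", rateLimit(limiter, hello))
}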

Common rate limiting algorithms under high concurrency are also covered at https://chenjiabing666.github.io/2021/08/03/…

Rate limiting classification

Take the term "rate limiting" apart and it is just two ideas: a limit, and a flow. "Limit" is easy to understand: it is a constraint. "Flow", however, refers to different resources or metrics in different scenarios, and that is where the diversity lies: in networking it may be a byte stream, in a database TPS, at an API it may be QPS or concurrent requests, for merchandise it may even be inventory. Whatever the flow is, it must be quantifiable, measurable, observable, and countable. We can classify rate limiting along several different dimensions.

Due to limited space, this article illustrates only a few common categories.

Classification by granularity

By granularity, rate limiting divides into:

  • Single-node rate limiting
  • Distributed rate limiting

Today's systems are mostly distributed architectures, and genuinely single-machine deployments are rare, so "single-machine rate limiting" is more accurately called single-service-node rate limiting: once requests to one service node reach the threshold, that node takes limiting measures on its own.

In the narrow sense, distributed rate limiting means several nodes limiting jointly at the access layer, e.g. NGINX + Redis or a distributed gateway. In the broad sense, it means organically combining multiple nodes (possibly different service nodes) into a single overall rate limiting service.

Single-node rate limiting keeps traffic from crushing an individual service node, but it has no awareness of the overall traffic. Distributed rate limiting suits fine-grained control and can match different rules to different scenarios. Unlike single-node limiting, it requires centralized storage, usually implemented with Redis (a sketch follows this list). Introducing centralized storage raises the following issues:

  • Data consistency

    The ideal model for rate limiting would be point-in-time consistency, which requires that all components see exactly the same data at any instant. But information propagates at a finite speed (at most the speed of light), so agreement at every instant is physically unattainable; there is always some window of divergence. For the consistency in CAP we only need reads to return the most recently written data, which does not require strict consistency at arbitrary times. Point-in-time consistency remains a theoretical model; in practice, rate limiting can achieve linearizability.

  • Clock consistency

    This is different from the point-in-time consistency above: it refers to whether the clocks of the service nodes agree. Suppose a cluster has three machines, where A and B read Tue Dec 3 16:29:28 CST 2019 while C's clock deviates slightly. Even synchronizing with ntpdate leaves a residual error, and for algorithms sensitive to time windows that error is exactly where things go wrong.

  • Timeouts

    In a distributed system, communication is subject to network jitter; the rate limiting middleware may respond slowly under excessive load, or the timeout threshold may simply be set badly, so the application's call to it times out. Should traffic be allowed through or denied in that case?

  • Performance and reliability

    The rate limiting middleware's resources are always limited, and it may even be a single point (a single writer), so its performance has an upper bound. And if the distributed rate limiting middleware becomes unavailable, degrading gracefully to single-node rate limiting is a sound fallback.
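As a concrete illustration of the centralized-storage approach, here is a minimal fixed-window sketch backed by Redis, assuming the github.com/redis/go-redis/v9 client; the key scheme and the fail-open fallback are illustrative choices, and a production version would make INCR+EXPIRE atomic with a Lua script and degrade to a local limiter when Redis is unavailable, as discussed above.

package limit

import (
   "context"
   "fmt"
   "time"

   "github.com/redis/go-redis/v9"
)

// AllowDistributed counts requests for one identifier in the current
// fixed window; every node in the cluster shares the same Redis counter.
func AllowDistributed(ctx context.Context, rdb *redis.Client, id string, limit int64, window time.Duration) bool {
   // One key per identifier per window, e.g. "rl:login:28391".
   key := fmt.Sprintf("rl:%s:%d", id, time.Now().UnixNano()/int64(window))
   n, err := rdb.Incr(ctx, key).Result()
   if err != nil {
      // Fail open here for simplicity; degrading to a single-node
      // limiter (as described above) is the better fallback.
      return true
   }
   if n == 1 {
      // First request in this window: bound the key's lifetime.
      // INCR+EXPIRE is not atomic; a Lua script fixes that in production.
      rdb.Expire(ctx, key, window)
   }
   return n <= limit
}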

Classification by limited object

Classification by object type:

  • Request-based rate limiting
  • Resource-based rate limiting

Request-based rate limiting generally takes two forms: limiting a total quantity, or limiting QPS. Limiting a total quantity caps some metric outright: in a sale of a product with 100,000 units of stock, at most 100,000 can be sold. A WeChat red envelope sent to a group and split into 10 parts can be grabbed by only 10 people; the eleventh person to open it sees "too slow, the red envelopes are all gone".

Limiting QPS is what people usually mean by rate limiting, and it is typically applied at the interface level: if an interface may be accessed only 100 times per second, its peak QPS is 100. The hardest part of limiting QPS is estimating where to set the threshold, which is discussed below.

Resource-based rate limiting works from the usage of service resources: identify the service's key resources and restrict them, for example the number of TCP connections, the number of threads, or memory usage. Limiting resources reflects the service's actual load more directly, but as with limiting QPS, the difficulty lies in identifying the thresholds, which must be tuned continuously in practice to converge on satisfactory values.
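As a sketch of resource-based limiting, the following Go middleware caps the number of requests handled concurrently with a buffered channel used as a semaphore; the capacity of 100 is an assumed threshold of the kind that, as noted above, needs tuning in practice.

package limit

import "net/http"

var sem = make(chan struct{}, 100) // at most 100 requests in flight

func LimitConcurrency(next http.Handler) http.Handler {
   return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      select {
      case sem <- struct{}{}: // acquire a slot
         defer func() { <-sem }() // release it when the request finishes
         next.ServeHTTP(w, r)
      default: // no slot free: reject instead of queueing
         http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
      }
   })
}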

Classification by algorithm

Whatever the dimension or manner of classification, rate limiting is ultimately implemented by an algorithm. The following sections describe the common rate limiting algorithms:

  • Counter
  • Leaky bucket algorithm
  • Token bucket algorithm

Counter

Fixed window counter

Counting is the simplest rate limiting algorithm; in daily development it usually means the fixed window counter. For example, if an interface or service may receive at most 1000 requests in 1s, we set its limit to 1000 QPS. The implementation idea is simple: maintain a counter over a fixed unit of time, and reset the counter to zero once that unit of time has passed.

Its steps:

  1. The timeline is divided into multiple independent, fixed-size windows;
  2. Each request that falls within a time window increments the counter by one;
  3. If the counter exceeds the limit, subsequent requests in this window are rejected; when time enters the next window, the counter is reset to 0.

Let's write a simple implementation.

package limit

import (
   "sync/atomic"
   "time"
)

type Counter struct {
   Count       uint64 // requests counted in the current window
   Limit       uint64 // maximum requests allowed per time window
   Interval    int64  // window size, in milliseconds
   RefreshTime int64  // start of the current window, in milliseconds
}

func NewCounter(count, limit uint64, interval, rt int64) *Counter {
   return &Counter{
      Count:       count,
      Limit:       limit,
      Interval:    interval,
      RefreshTime: rt,
   }
}

func (c *Counter) RateLimit() bool {
   now := time.Now().UnixNano() / 1e6 // current time in milliseconds
   if now < (c.RefreshTime + c.Interval) {
      atomic.AddUint64(&c.Count, 1)
      return c.Count <= c.Limit
   }
   // A new window begins: move the window start and count this request.
   // (For simplicity this ignores the race on RefreshTime.)
   c.RefreshTime = now
   atomic.StoreUint64(&c.Count, 1)
   return true
}

Test code:

package limit

import (
   "fmt"
   "testing"
   "time"
)

func Test_Counter(t *testing.T) {
   // A 100ms window allowing at most 5 requests; RefreshTime is in milliseconds.
   counter := NewCounter(0, 5, 100, time.Now().UnixNano()/1e6)
   for i := 0; i < 10; i++ {
      go func(i int) {
         for k := 0; k <= 10; k++ {
            fmt.Println(counter.RateLimit())
            if k%3 == 0 {
               time.Sleep(102 * time.Millisecond)
            }
         }
      }(i)
   }
   time.Sleep(10 * time.Second)
}

Looking at the logic above, the fixed window counter does seem easy, and that simplicity is its main advantage. It also has two serious drawbacks. Imagine a fixed 1s window with a limit of 100: if 99 requests arrive in the first 100ms, only one more request can be accepted in the remaining 900ms, so the algorithm has essentially no ability to absorb burst traffic. The second drawback: 100 requests pass in the last 500ms of one window and another 100 pass in the first 500ms of the next window; the service then handles twice the limit within a single second spanning the boundary.

Sliding window counter

The sliding window algorithm is an improvement on the fixed window algorithm and is familiar from TCP flow control. The fixed window counter can be seen as a special case of the sliding window counter. Sliding window steps:

  1. The unit of time is divided into multiple intervals, i.e. several smaller time periods;
  2. Each interval has its own counter; a request that falls into an interval increments that interval's counter by one;
  3. As each time period elapses, the window slides one interval to the right, discarding the oldest interval and incorporating a new one;
  4. To compute the total number of requests in the whole window, sum the counters of all intervals; if the total exceeds the limit, all further requests in this window are discarded.

By splitting the time window into smaller intervals and "sliding" over time, this algorithm avoids both problems of the fixed window counter. The drawback is that the finer the time intervals, the more space the algorithm needs.

Common implementations are based on a Redis ZSET or on a circular queue. With a Redis ZSET, the key is the rate limit identifier, the value only needs to be unique (a UUID works), and the score is the timestamp, preferably at nanosecond precision. Redis's ZADD, EXPIRE, ZCOUNT, and ZREMRANGEBYSCORE are enough to implement it, and enabling pipelining maximizes performance. The implementation is simple; the downside is that the ZSET data structure keeps growing.
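For completeness, here is a minimal single-node sliding window sketch in Go using a ring of slot counters (the in-memory cousin of the Redis ZSET variant); the slot count and the coarse mutex are simplifying assumptions.

package limit

import (
   "sync"
   "time"
)

type SlidingWindow struct {
   mu       sync.Mutex
   slots    []int64       // per-slot counters, used as a ring
   slotDur  time.Duration // width of one slot
   limit    int64         // max requests allowed per full window
   lastTick int64         // index of the most recently used slot, in slotDur units
}

func NewSlidingWindow(limit int64, window time.Duration, slots int) *SlidingWindow {
   return &SlidingWindow{
      slots:   make([]int64, slots),
      slotDur: window / time.Duration(slots),
      limit:   limit,
   }
}

func (s *SlidingWindow) Allow() bool {
   s.mu.Lock()
   defer s.mu.Unlock()
   tick := time.Now().UnixNano() / int64(s.slotDur)
   // Zero every slot the window has slid past since the last call,
   // capped at one full rotation of the ring.
   for i := s.lastTick + 1; i <= tick && i-s.lastTick <= int64(len(s.slots)); i++ {
      s.slots[i%int64(len(s.slots))] = 0
   }
   if tick > s.lastTick {
      s.lastTick = tick
   }
   var total int64
   for _, c := range s.slots {
      total += c
   }
   if total >= s.limit {
      return false
   }
   s.slots[tick%int64(len(s.slots))]++
   return true
}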

Leaky bucket algorithm

In the leaky bucket algorithm, water first flows into the bucket, and the bucket releases water at a fixed rate; when inflow exceeds outflow, the excess overflows and is lost. Mapped to requests, the leaky bucket acts as a server-side queue, and requests beyond the limit are denied service. The leaky bucket algorithm uses a queue to release traffic at a fixed rate, smoothing it.

You have probably seen it in one of the most widely shared diagrams on the Internet.

Leaky bucket algorithm implementation steps:

  1. Each incoming request is stored in a fixed-size queue;
  2. The queue releases requests at a fixed rate; if the queue is empty, releasing stops;
  3. If the queue is full, additional requests are rejected.

The leaky bucket algorithm has an obvious drawback: when a large burst of requests arrives in a short time, each request must wait in the queue for a while before being served, even if the server is under little load.
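A minimal sketch of a queue-based leaky bucket, assuming one drain goroutine and a drain rate of roughly one request per 200ms: requests enter a bounded channel (the bucket), and a full channel means overflow, so the request is rejected.

package main

import (
   "fmt"
   "time"
)

func main() {
   bucket := make(chan int, 5) // the bucket: at most 5 queued requests

   // Drain the bucket at a fixed rate of about one request per 200ms.
   go func() {
      for req := range bucket {
         fmt.Println("processed request", req, "at", time.Now())
         time.Sleep(200 * time.Millisecond)
      }
   }()

   // A burst of 10 requests arrives at once.
   for i := 1; i <= 10; i++ {
      select {
      case bucket <- i: // queued; it will be served at the drain rate
      default:
         fmt.Println("request", i, "rejected: bucket full")
      }
   }
   time.Sleep(3 * time.Second) // let the drain goroutine finish the demo
}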

Token bucket algorithm

The token bucket algorithm puts tokens into a bucket at a constant rate. A request must first obtain a token from the bucket before it is processed; when no token is available, the request is denied. In principle the token bucket and the leaky bucket are opposites: one controls what comes "in", the other what goes "out". Beyond that difference in direction there is a more important one: the token bucket limits the average inflow rate while permitting a degree of burst traffic, since a request may take multiple tokens at once as long as enough tokens remain.

The steps of the token bucket algorithm (a toy implementation follows the list):

  1. Tokens are generated at a fixed rate and put into the token bucket;
  2. If the token bucket is full, excess tokens are discarded; when a request arrives, it tries to take a token from the bucket, and only requests holding a token may execute;
  3. If the bucket is empty, the request is rejected.
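A toy token bucket for illustration, with tokens computed lazily from elapsed time rather than refilled by a background goroutine; the golang.org/x/time/rate limiter introduced below is a production-grade implementation of the same idea.

package limit

import (
   "sync"
   "time"
)

type TokenBucket struct {
   mu       sync.Mutex
   capacity float64   // maximum number of tokens the bucket holds
   tokens   float64   // tokens currently in the bucket
   rate     float64   // tokens added per second
   last     time.Time // time of the last refill
}

func NewTokenBucket(ratePerSec, capacity float64) *TokenBucket {
   return &TokenBucket{capacity: capacity, tokens: capacity, rate: ratePerSec, last: time.Now()}
}

func (b *TokenBucket) Allow() bool {
   b.mu.Lock()
   defer b.mu.Unlock()
   now := time.Now()
   // Refill lazily from the elapsed time, capped at the capacity.
   b.tokens += now.Sub(b.last).Seconds() * b.rate
   if b.tokens > b.capacity {
      b.tokens = b.capacity
   }
   b.last = now
   if b.tokens < 1 {
      return false // bucket empty: deny the request
   }
   b.tokens--
   return true
}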

How to choose among the four strategies?

  • Fixed window: simple to implement but crude; unless the situation is urgent and you need to stop the bleeding immediately, use it only as a temporary emergency measure.
  • Sliding window: still simple to implement, and copes with scenarios where traffic rises in modest bursts.
  • Leaky bucket: for traffic that must be absolutely uniform. Resource utilization is not maximal, but its "wide in, strict out" pattern protects the system while leaving headroom, making it a widely applicable scheme.
  • Token bucket: for systems with frequent traffic surges, where you want to squeeze as much performance out of the service as possible.

How do we set the rate limit?

Whatever the classification or implementation, every system faces a common problem: how to determine the rate limiting threshold. Some teams set a small threshold based on experience and adjust it gradually; some derive it from stress testing. The problem with the latter is that the stress test model may not match the production environment: load testing a single interface cannot reflect the state of the whole system, and even full-link stress testing cannot truly reproduce the traffic mix of real scenarios. Another idea is stress testing plus application monitoring data: from peak QPS and resource usage, extrapolate the threshold proportionally. The problem there is that the system's performance inflection point is unknown, so a naive extrapolation may be inaccurate or even wildly off. As "Overload Control for Scaling Microservices" describes, in systems with complex dependencies, overload control of a particular service may harm the whole system, or the control's implementation may itself be flawed. The hope for the future is an AI-driven feedback system that sets thresholds automatically, applying overload protection dynamically from current QPS, resource status, RT, and other related data.

Whatever the threshold ends up being, the system should pay attention to the following:

  1. Operating metrics, such as current QPS, machine resource usage, database connections, and number of concurrent threads;
  2. Relationships between resources, such as external link requests, associations among internal services, and strong/weak dependencies between services;
  3. The control strategy once the limit is hit: reject subsequent requests outright and fail fast, or queue them to wait.

Using a Go rate limiting library

There are many rate limiting libraries for Java, such as concurrency-limits, Sentinel, and Guava. For Go there is golang.org/x/time/rate (source at github.com/golang/time…). Code that makes it into the language's extended libraries deserves some reading time; those who have studied Java will have sighed at the exquisite design of AQS, and time/rate likewise has its subtle parts. Let's start with how the library is used.

github.com/golang/time/rate

The best preparation before source analysis is to understand how the library is used: its usage patterns and its API. With that initial understanding, reading the code yields twice the result for half the effort. Since space is limited, later posts will analyze the source of several rate limiting libraries. The library's API documentation: godoc.org/golang.org/…

func NewLimiter(r Limit, b int) *Limiter

NewLimiter returns a new Limiter that allows events at a rate of up to r per second and permits bursts of up to b tokens. In other words, the limiter bounds the event frequency: the bucket has capacity b and starts full with b tokens (the token pool holds at most b tokens, so at most b events can happen at once, each event consuming one token), after which tokens are added back to the bucket at a rate of r per second.

limiter := rate.NewLimiter(10, 5)

The example above creates a token bucket with capacity 5, into which 10 tokens are put every second. A careful reader will notice that the first argument of NewLimiter is of type Limit; a look at the source shows that Limit is just an alias of float64.

// Limit defines the maximum frequency of some events.
// Limit is represented as number of events per second.
// A zero Limit allows no events.
type Limit float64

The limiter can also specify the interval at which tokens are placed in the bucket as follows:

limiter := rate.NewLimiter(rate.Every(100*time.Millisecond), 5)

The two examples are equivalent: rate.Every(100*time.Millisecond) converts an interval of one token per 100ms into a Limit of 10 events per second. rate.Limiter provides three families of rate limiting methods:

  • Allow/AllowN
  • Wait/WaitN
  • Reserve/ReserveN

Below we compare the usage and application scenarios of the three families. The first:

func (lim *Limiter) Allow() bool
func (lim *Limiter) AllowN(now time.Time, n int) bool

Allow is shorthand for AllowN(time.Now(), 1). The API may look a bit abstract at first; the API documentation below helps.

AllowN reports whether n events may happen at time now. 
Use this method if you intend to drop / skip events that exceed the rate limit. 
Otherwise use Reserve or Wait.

In essence, AllowN reports whether n tokens can be taken from the bucket at the given time, i.e. whether n events may happen now. The two methods are non-blocking: if there are not enough tokens they return false immediately rather than waiting. As the second line of the documentation says, use this method when you intend to drop or skip events that exceed the rate limit. For example, if at some instant a server receives eight simultaneous requests but the bucket holds fewer than eight tokens, the requests that cannot get a token are discarded. A small example:

func AllowDemo() {
   limiter := rate.NewLimiter(rate.Every(200*time.Millisecond), 5)
   i := 0
   for {
      i++
      if limiter.Allow() {
         fmt.Println(i, "====Allow======", time.Now())
      } else {
         fmt.Println(i, "====Disallow======", time.Now())
      }
      time.Sleep(80 * time.Millisecond)
      if i == 15 {
         return
      }
   }
}

Execution Result:

1 ====Allow====== 2019-12-14 15:54:09.9852178 +0800 CST m=+0.005998001
2 ====Allow====== 2019-12-14 15:54:10.1012231 +0800 CST m=+0.122003301
3 ====Allow====== 2019-12-14 15:54:10.1823056 +0800 CST m=+0.203085801
4 ====Allow====== 2019-12-14 15:54:10.263238 +0800 CST m=+0.284018201
5 ====Allow====== 2019-12-14 15:54:10.344224 +0800 CST m=+0.365004201
6 ====Allow====== 2019-12-14 15:54:10.4242458 +0800 CST m=+0.445026001
7 ====Allow====== 2019-12-14 15:54:10.5043101 +0800 CST m=+0.525090301
8 ====Allow====== 2019-12-14 15:54:10.5852232 +0800 CST m=+0.606003401
9 ====Disallow====== 2019-12-14 15:54:10.6662181 +0800 CST m=+0.686998301
10 ====Disallow====== 2019-12-14 15:54:10.7462189 +0800 CST m=+0.766999101
11 ====Allow====== 2019-12-14 15:54:10.8272182 +0800 CST m=+0.847998401
12 ====Disallow====== 2019-12-14 15:54:10.9072192 +0800 CST m=+0.927999401
13 ====Allow====== 2019-12-14 15:54:10.9872224 +0800 CST m=+1.008002601
14 ====Disallow====== 2019-12-14 15:54:11.0672253 +0800 CST m=+1.088005501
15 ====Disallow====== 2019-12-14 15:54:11.1472946 +0800 CST m=+1.168074801

The second family (ReserveN is more complex, so let's take WaitN first):

func (lim *Limiter) Wait(ctx context.Context) (err error)
func (lim *Limiter) WaitN(ctx context.Context, n int) (err error)

Wait is shorthand for WaitN(ctx, 1). Unlike AllowN, WaitN blocks: if the bucket holds fewer than n tokens, WaitN waits until enough tokens accumulate or the context expires. The maximum blocking time can be bounded via the first parameter, by passing a context built with context.WithDeadline or context.WithTimeout.

func WaitNDemo() {
   limiter := rate.NewLimiter(10, 5)
   i := 0
   for {
      i++
      ctx, cancel := context.WithTimeout(context.Background(), 400*time.Millisecond)
      if i == 6 {
         // Cancel this context before waiting, to demonstrate the error path.
         cancel()
      }
      err := limiter.WaitN(ctx, 4)
      if err != nil {
         fmt.Println(err)
         continue
      }
      fmt.Println(i, ", executed:", time.Now())
      if i == 10 {
         return
      }
   }
}

Execution Result:

1 , executed: 2019-12-14 15:45:15.538539 +0800 CST m=+0.011023401
2 , executed: 2019-12-14 15:45:15.8395195 +0800 CST m=+0.312003901
3 , executed: 2019-12-14 15:45:16.2396051 +0800 CST m=+0.712089501
4 , executed: 2019-12-14 15:45:16.6395169 +0800 CST m=+1.112001301
5 , executed: 2019-12-14 15:45:17.0385893 +0800 CST m=+1.511073701
context canceled
7 , executed: 2019-12-14 15:45:17.440514 +0800 CST m=+1.912998401
8 , executed: 2019-12-14 15:45:17.8405152 +0800 CST m=+2.312999601
9 , executed: 2019-12-14 15:45:18.2405402 +0800 CST m=+2.713024601
10 , executed: 2019-12-14 15:45:18.6405179 +0800 CST m=+3.113002301

For scenarios where blocking is acceptable, such as consuming from a message queue, Wait/WaitN can cap the maximum consumption rate, so that a spike in throughput is throttled instead of overloading the consumer.
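A sketch of that consumption pattern, where fetchMessage and handle are hypothetical stand-ins for a real queue client:

package main

import (
   "context"
   "fmt"
   "time"

   "golang.org/x/time/rate"
)

// fetchMessage and handle are hypothetical stand-ins for a real queue client.
func fetchMessage() string { return "msg" }
func handle(m string)      { fmt.Println("handled", m, "at", time.Now()) }

func consumeLoop(ctx context.Context, limiter *rate.Limiter) {
   for {
      // Wait blocks until a token is available, capping the consume rate.
      if err := limiter.Wait(ctx); err != nil {
         return // context cancelled or deadline exceeded
      }
      handle(fetchMessage())
   }
}

func main() {
   ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
   defer cancel()
   consumeLoop(ctx, rate.NewLimiter(10, 5)) // consume at most ~10 messages/second
}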

The third family:

func (lim *Limiter) Reserve() *Reservation
func (lim *Limiter) ReserveN(now time.Time, n int) *Reservation

Unlike the first two families, Reserve/ReserveN returns a Reservation instance, which has five methods in the API documentation:

func (r *Reservation) Cancel()                       // shorthand for CancelAt(time.Now())
func (r *Reservation) CancelAt(now time.Time)
func (r *Reservation) Delay() time.Duration          // shorthand for DelayFrom(time.Now())
func (r *Reservation) DelayFrom(now time.Time) time.Duration
func (r *Reservation) OK() bool

These five methods let developers drive the process themselves to fit the business scenario, which is more flexible but also more involved than the two more automatic families above. A quick example of Reserve/ReserveN:

func ReserveNDemo() {
   limiter := rate.NewLimiter(10, 5)
   i := 0
   for {
      i++
      reserve := limiter.ReserveN(time.Now(), 4)
      // OK() returns false when the requested tokens can never be supplied,
      // e.g. when n exceeds the bucket capacity.
      if !reserve.OK() {
         return
      }
      ts := reserve.Delay()
      time.Sleep(ts)
      fmt.Println("Executed:", time.Now(), ts)
      if i == 10 {
         return
      }
   }
}

Execution Result:

Executed: 2019-12-14 16:22:26.6446468 +0800 CST m=+0.008000201 0s
Executed: 2019-12-14 16:22:26.9466454 +0800 CST m=+0.309998801 247.999299ms
Executed: 2019-12-14 16:22:27.3446473 +0800 CST m=+0.708000701 398.001399ms
Executed: 2019-12-14 16:22:27.7456488 +0800 CST m=+1.109002201 399.999499ms
Executed: 2019-12-14 16:22:28.1456465 +0800 CST m=+1.508999901 398.997999ms
Executed: 2019-12-14 16:22:28.5456457 +0800 CST m=+1.908999101 399.0003ms
Executed: 2019-12-14 16:22:28.9446482 +0800 CST m=+2.308001601 399.001099ms
Executed: 2019-12-14 16:22:29.3446524 +0800 CST m=+2.708005801 399.998599ms
Executed: 2019-12-14 16:22:29.7446514 +0800 CST m=+3.108004801 399.9944ms
Executed: 2019-12-14 16:22:30.1446475 +0800 CST m=+3.508000901 399.9954ms

If Cancel() is called before Delay(), the returned delay is zero, meaning the operation may execute immediately without being limited.

func ReserveNDemo2() {
   limiter := rate.NewLimiter(5, 5)
   i := 0
   for {
      i++
      reserve := limiter.ReserveN(time.Now(), 4)
      // OK() returns false when the requested tokens can never be supplied,
      // e.g. when n exceeds the bucket capacity.
      if !reserve.OK() {
         return
      }
      if i == 6 || i == 5 {
         reserve.Cancel()
      }
      ts := reserve.Delay()
      time.Sleep(ts)
      fmt.Println(i, "Executed:", time.Now(), ts)
      if i == 10 {
         return
      }
   }
}

Execution Result:

1 Executed: 2019-12-14 16:25:45.7974857 +0800 CST m=+0.007005901 0s
2 Executed: 2019-12-14 16:25:46.3985135 +0800 CST m=+0.608033701 552.0048ms
3 Executed: 2019-12-14 16:25:47.1984796 +0800 CST m=+1.407999801 798.9722ms
4 Executed: 2019-12-14 16:25:47.9975269 +0800 CST m=+2.207047101 799.0061ms
5 Executed: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 799.9588ms
6 Executed: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 0s
7 Executed: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 0s
8 Executed: 2019-12-14 16:25:49.5984782 +0800 CST m=+3.807998401 798.0054ms
9 Executed: 2019-12-14 16:25:50.3984779 +0800 CST m=+4.607998101 799.0075ms
10 Executed: 2019-12-14 16:25:51.1995131 +0800 CST m=+5.409033301 799.0078ms

Besides the three families of rate limiting methods above, time/rate also supports adjusting the limiter's parameters dynamically. The related APIs:

func (lim *Limiter) SetBurst(newBurst int)                    // shorthand for SetBurstAt(time.Now(), newBurst)
func (lim *Limiter) SetBurstAt(now time.Time, newBurst int)   // reset the capacity of the token bucket
func (lim *Limiter) SetLimit(newLimit Limit)                  // shorthand for SetLimitAt(time.Now(), newLimit)
func (lim *Limiter) SetLimitAt(now time.Time, newLimit Limit) // reset the rate at which tokens are put in

These four methods let an application dynamically tune the token refill rate and the bucket capacity according to its own state.
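A short sketch of such dynamic adjustment, where the trigger (a monitoring signal or config push) is assumed rather than shown:

package main

import "golang.org/x/time/rate"

func main() {
   limiter := rate.NewLimiter(100, 50)

   // Monitoring reports overload: tighten the limiter at runtime.
   limiter.SetLimit(50) // tokens now refill at 50 per second
   limiter.SetBurst(25) // and the bucket holds at most 25 tokens

   // Pressure subsides: restore the original parameters.
   limiter.SetLimit(100)
   limiter.SetBurst(50)
}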

Final thoughts

With the walkthrough above, you should now have a general grasp of each rate limiting approach, its application scenarios, and its trade-offs; I hope it helps in daily development. Rate limiting is only a small part of service governance, and it must be combined with other techniques to improve service stability and user experience.