Preface

There are three tools you can use to protect a high-concurrency system: caching, degradation, and traffic limiting:

  • Caching: The purpose of caching is to improve system access speed and increase the system's processing capacity

  • Degradation: When a service has a problem, or it affects the core flow, the service is temporarily shut off; it can be turned back on after the peak passes or the problem is resolved

  • Traffic limiting: The purpose of traffic limiting is to protect the system by limiting the rate of concurrent access/requests, or the number of requests within a time window. Once the limit is reached, the system can deny service, queue or wait, or degrade

Common traffic limiting algorithms

  1. Leaky bucket algorithm

The idea of the leaky bucket algorithm is very simple: water (requests) enters the leaky bucket first, and the bucket lets water out at a fixed rate. When the inflow rate exceeds the outflow rate, the excess water simply overflows (the requests are rejected).
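
To make the idea concrete, below is a minimal leaky-bucket sketch in Java. It is not taken from any library; the class and field names are illustrative, and the "water" is drained lazily whenever a request arrives.

/** A minimal leaky-bucket sketch; class and field names are illustrative, not from any library. */
public class LeakyBucket {
    private final long capacity;    // how much water the bucket can hold
    private final double leakRate;  // water leaked (requests processed) per second
    private double water = 0;       // current water level
    private long lastLeakNanos = System.nanoTime();

    public LeakyBucket(long capacity, double leakRatePerSecond) {
        this.capacity = capacity;
        this.leakRate = leakRatePerSecond;
    }

    /** Returns true if the request fits in the bucket, false if it overflows and is rejected. */
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // leak water at the fixed rate for the time elapsed since the last call
        water = Math.max(0, water - (now - lastLeakNanos) / 1e9 * leakRate);
        lastLeakNanos = now;
        if (water + 1 <= capacity) {
            water += 1;  // the request enters the bucket and drains at the fixed rate
            return true;
        }
        return false;    // bucket full: the incoming "water" overflows
    }
}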

  2. Token bucket algorithm

In addition to limiting the average transfer rate, many scenarios need to allow some degree of burst. The leaky bucket algorithm is not well suited for this, but the token bucket algorithm is: tokens are put into the bucket at a constant rate, and a request must obtain a token from the bucket before it can be processed. When no token is available in the bucket, service is denied.
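
For contrast, here is a matching token-bucket sketch, again with illustrative names only. The behavioral difference shows in the code: the leaky bucket drains at a fixed rate no matter what, while the token bucket lets a burst of up to capacity requests through at once.

/** A minimal token-bucket sketch; class and field names are illustrative, not from any library. */
public class TokenBucket {
    private final long capacity;      // maximum number of tokens the bucket holds
    private final double refillRate;  // tokens added per second
    private double tokens;
    private long lastRefillNanos = System.nanoTime();

    public TokenBucket(long capacity, double refillRatePerSecond) {
        this.capacity = capacity;
        this.refillRate = refillRatePerSecond;
        this.tokens = capacity;       // start full so an initial burst is allowed
    }

    /** Returns true if a token was available, false if the request is denied. */
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // put tokens into the bucket at a constant rate, capped at capacity
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillRate);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;  // the request takes a token and proceeds
            return true;
        }
        return false;     // no token available: deny service
    }
}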

RateLimiter use and source code analysis

Google’s open source toolkit Guava provides RateLimiter, a rate-limiting utility class that implements traffic limiting based on the token bucket algorithm. It is very convenient and efficient to use.

RateLimiter use

First, a brief look at how RateLimiter is used:

public void testAcquire() {
    RateLimiter limiter = RateLimiter.create(1);
    for (int i = 1; i < 10; i = i + 2) {
        double waitTime = limiter.acquire(i);
        System.out.println("cutTime=" + System.currentTimeMillis() + " acq:" + i + " waitTime:" + waitTime);
    }
}

Output result:

cutTime=1535439657427 acq:1 waitTime:0.0
cutTime=1535439658431 acq:3 waitTime:0.997045
cutTime=1535439661429 acq:5 waitTime:2.993028
cutTime=1535439666426 acq:7 waitTime:4.995625
cutTime=1535439673426 acq:9 waitTime:6.999223

First, create a rate limiter with RateLimiter.create(1); the argument is the number of tokens generated per second. Tokens are then obtained by blocking on limiter.acquire(i), which returns the time spent waiting. Alternatively, tokens can be requested with tryAcquire(int permits, long timeout, TimeUnit unit), which waits at most the given timeout; with a timeout of 0 it does not block at all and returns immediately if the tokens cannot be obtained.
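
As a companion to the blocking example, here is a sketch of the non-blocking variants (tryAcquire() and tryAcquire(int, long, TimeUnit) are Guava's actual overloads; the printed result follows from the token-bucket analysis in this article, so treat it as approximate).

import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.RateLimiter;

public void testTryAcquire() {
    RateLimiter limiter = RateLimiter.create(1);
    boolean ok1 = limiter.tryAcquire();                              // non-blocking, returns at once
    boolean ok2 = limiter.tryAcquire(1, 500, TimeUnit.MILLISECONDS); // wait at most 500 ms for a token
    System.out.println("ok1=" + ok1 + " ok2=" + ok2);                // expected: ok1=true ok2=false
}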

The output shows that RateLimiter supports pre-consumption. For example, the acquire(5) call waits about 3 seconds because the previous acquire(3) pre-consumed 3 tokens, and regenerating them takes 3 × 1 second; acquire(5) in turn pre-consumes 5 tokens, so the next request, acquire(7), waits about 5 seconds, and so on.

In other words, RateLimiter supports a certain degree of burst (pre-consumption) by shifting the wait onto subsequent requests. Keep this in mind when using RateLimiter; the implementation behind it is analyzed below.
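
A short sketch of how far pre-consumption can go; the wait times in the comments are approximate and follow the same reasoning as above.

public void testPreConsume() {
    RateLimiter limiter = RateLimiter.create(1);
    double w1 = limiter.acquire(100); // returns almost at once: all 100 tokens are pre-consumed
    double w2 = limiter.acquire(1);   // pays the previous request's debt: waits about 100 seconds
    System.out.println("w1=" + w1 + " w2=" + w2);
}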

RateLimiter Implementation principle

Guava has two traffic-limiting modes: SmoothBursty (tokens are generated at a constant rate) and SmoothWarmingUp (the token generation rate ramps up gradually until it reaches the stable rate). The two share the same basic design; the main difference lies in how the wait time is calculated. This article focuses on SmoothBursty.
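
For reference, SmoothWarmingUp is created through a separate overload of create that takes a warm-up period (this overload does exist in Guava); everything below concerns the first form.

// SmoothWarmingUp: during the 3-second warm-up the effective rate climbs
// gradually until it reaches the stable 5 permits per second
RateLimiter limiter = RateLimiter.create(5, 3, TimeUnit.SECONDS);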

The creation of RateLimiter

An instance is created through RateLimiter's create method, which in fact builds an instance of the stable mode, SmoothBursty:

public static RateLimiter create(double permitsPerSecond) {
    return create(permitsPerSecond, SleepingStopwatch.createFromSystemTimer());
  }

  static RateLimiter create(double permitsPerSecond, SleepingStopwatch stopwatch) {
    RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);
    rateLimiter.setRate(permitsPerSecond);
    return rateLimiter;
  }

The meanings of two construction parameters for SmoothBursty:

  • SleepingStopwatch: a stopwatch abstraction in Guava; RateLimiter uses it to measure elapsed time and to sleep while a request waits for tokens
  • maxBurstSeconds: how many seconds' worth of tokens can be saved up while the RateLimiter is unused. The default is 1 (see the sketch after this list)
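
The sketch below makes maxBurstSeconds concrete. It assumes the analysis in this article; since the public create factory fixes maxBurstSeconds at 1, the wait times in the comments are approximate.

public void testMaxBurstSeconds() throws InterruptedException {
    RateLimiter limiter = RateLimiter.create(1); // maxBurstSeconds is fixed at 1 here
    Thread.sleep(5000);                          // leave the limiter unused for 5 seconds
    System.out.println(limiter.acquire());       // ~0.0: served from the single stored token
    System.out.println(limiter.acquire());       // ~0.0: pre-consumes the next token
    System.out.println(limiter.acquire());       // ~1.0: only 1 second's worth was banked, not 5
}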

Before explaining how SmoothBursty works, let's go over the meanings of several of its fields:

/**
 * The work (permits) of how many seconds can be saved up if this RateLimiter is unused?
 * How many seconds' worth of tokens can be stored while the RateLimiter is unused.
 */
final double maxBurstSeconds;

/** The currently stored permits. */
double storedPermits;

/**
 * The maximum number of stored permits.
 * maxPermits = maxBurstSeconds * permitsPerSecond
 */
double maxPermits;

/**
 * The interval between two unit requests, at our stable rate. E.g., a stable rate of 5 permits
 * per second has a stable interval of 200ms.
 * Token-adding interval = SECONDS.toMicros(1L) / permitsPerSecond (1 second / tokens per second).
 */
double stableIntervalMicros;

/**
 * The time when the next request (no matter its size) will be granted. After granting a request,
 * this is pushed further in the future. Large requests push this further than small requests.
 * Since RateLimiter allows pre-consumption, after a request pre-consumes tokens, the next
 * request must wait until the moment nextFreeTicketMicros to obtain its own.
 */
private long nextFreeTicketMicros = 0L; // could be either in the past or future

Here are a few key functions

  • setRate
public final void setRate(double permitsPerSecond) {
  checkArgument(
      permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
  synchronized (mutex()) {
    doSetRate(permitsPerSecond, stopwatch.readMicros());
  }
}

This method sets the number of tokens generated per second. Internally it does so by calling SmoothRateLimiter's doSetRate.
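
Because setRate is public and synchronized, the limit can be adjusted at runtime, for instance when a dynamic configuration changes; a trivial sketch:

RateLimiter limiter = RateLimiter.create(2); // start at 2 tokens per second
// ... later, when a config change arrives, loosen the limit in place:
limiter.setRate(10);                         // now 10 tokens per second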

  • doSetRate
@Override
  final void doSetRate(double permitsPerSecond, long nowMicros) {
    resync(nowMicros);
    double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
    this.stableIntervalMicros = stableIntervalMicros;
    doSetRate(permitsPerSecond, stableIntervalMicros);
  }

It first calls resync to generate tokens and update the time at which the next token becomes available, then updates stableIntervalMicros, and finally calls SmoothBursty's doSetRate.

  • resync
/**
 * Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time.
 * Updates the time at which the next request can obtain a token, together with the number of
 * currently stored tokens (this is where tokens are effectively generated).
 */
void resync(long nowMicros) {
  // if nextFreeTicket is in the past, resync to now
  if (nowMicros > nextFreeTicketMicros) {
    double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
    storedPermits = min(maxPermits, storedPermits + newPermits);
    nextFreeTicketMicros = nowMicros;
  }
}

According to the token bucket algorithm, tokens are continuously generated and stored in the bucket, and a request can execute only after obtaining a token from it. But who continuously generates and stores those tokens?

One solution is to start scheduled tasks that continuously generate tokens. But suppose an interface needs a per-user rate limit and the system has 60,000 users: up to 60,000 scheduled tasks would be needed just to maintain the token count in each bucket. That cost is prohibitive.

The other solution is lazy calculation, which is exactly what resync does. The function is called before each token acquisition: if the current time is later than nextFreeTicketMicros, it computes how many tokens could have been generated in the meantime, adds them to the bucket, and updates the bookkeeping. This way the computation happens only once, at acquisition time.
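
A worked example of that calculation (the numbers are illustrative; for SmoothBursty, coolDownIntervalMicros() simply returns stableIntervalMicros):

// Assume permitsPerSecond = 5:
//   stableIntervalMicros = 1_000_000 / 5 = 200_000 (one token every 200 ms)
//   maxPermits = maxBurstSeconds(1) * 5 = 5
// If nextFreeTicketMicros = 0 and resync runs at nowMicros = 1_000_000:
//   newPermits    = (1_000_000 - 0) / 200_000 = 5 tokens generated lazily
//   storedPermits = min(5, 0 + 5) = 5
//   nextFreeTicketMicros = 1_000_000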

  • SmoothBursty's doSetRate
@Override
void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
  double oldMaxPermits = this.maxPermits;
  maxPermits = maxBurstSeconds * permitsPerSecond;
  if (oldMaxPermits == Double.POSITIVE_INFINITY) {
    // if we don't special-case this, we would get storedPermits == NaN, below
    // Double.POSITIVE_INFINITY stands for infinity
    storedPermits = maxPermits;
  } else {
    storedPermits =
        (oldMaxPermits == 0.0)
            ? 0.0 // initial state
            : storedPermits * maxPermits / oldMaxPermits;
  }
}

The maximum number of tokens that can be stored in the bucket is maxBurstSeconds * permitsPerSecond, i.e. the number of tokens generated within maxBurstSeconds. The point of this parameter is to make traffic control more flexible: for example, one interface may be limited to 300 calls per 20 seconds and another to 50 calls per 45 seconds. In other words, limits are not restricted to QPS (see the arithmetic sketch below).
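
The arithmetic behind those two examples, assuming maxBurstSeconds could be set accordingly (the public RateLimiter.create fixes it at 1, so these values would require the non-public SmoothBursty constructor):

// maxPermits = maxBurstSeconds * permitsPerSecond
// "300 calls / 20 seconds": permitsPerSecond = 300.0 / 20 = 15, maxBurstSeconds = 20
//     -> maxPermits = 20 * 15 = 300 tokens can be banked
// "50 calls / 45 seconds":  permitsPerSecond = 50.0 / 45 ≈ 1.11, maxBurstSeconds = 45
//     -> maxPermits = 45 * 1.11 ≈ 50 tokens can be banked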

Conclusion

Welcome to follow the WeChat public account "Code zonE", which focuses on sharing Java and cloud-computing content, including SpringBoot, SpringCloud, microservices, Docker, Kubernetes, Python, and other practical technical topics. Looking forward to meeting you!