The premise
Business background
Took the other day, the double tenth of a “rob coupon activities”, is generally started to set the hour, you think, taobao user community is very big, can reach level, and the amount of service interface can handle per second is limited, so when this problem occurs, we how to programmatically control user rob stamps, so he must be combined with the current limiting function.
The production environment
1. Suppose the service limit provided by the service interface is 500 times /s
2. The number of times the user requests the interface is unknown, and the QPS may reach 800 times /s, 1000 times /s, or higher
3. When the access frequency of the service interface exceeds 500 times /s, the service will be denied and the excess information will be lost
4. The online environment is multi-node deployment, but the same service interface is invoked
Therefore, to ensure the availability of the service, the rate of service interface invocation must be limited (interface flow limiting).
What is current limiting?
Traffic limiting is used to control incoming and outgoing traffic of the system to prevent resource shortage and system instability due to heavy incoming and outgoing traffic.
A traffic limiting system is a control component for resource access. It controls the following two functions: Current limiting and fusing strategy for fusing strategy, different systems have different fusing strategy demands, some system directly refuse, hope some systems there waiting in line, hope hope service degradation, some system can customize their fusing strategy, here only for current limiting strategy this function do detailed design.
Current limit algorithm
1. Limit the number of instantaneous concurrency
Guava RateLimiter provides the token bucket algorithm implementation: SmoothBursty and SmoothWarmingUp implementations.
2. Limit the maximum number of requests in a time window for an interface
That is, the number of requests in a time window, if you want to limit the number of requests/calls per second/minute/day for an interface/service. For example, some basic services will be called by many other systems, such as the commodity detail page service will call the basic commodity service call, but the basic service will be suspended due to the relatively large update volume, then we need to speed up the call consumption per second/minute; One way to do this is as follows:
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder() .expireAfterWrite(2, TimeUnit.SECONDS) .build(new CacheLoader<Long, AtomicLong>() { @Override public AtomicLong load(Long seconds) throws Exception { return new AtomicLong(0); }}); long limit = 1000; Long currentSeconds = System.currentTimemillis () / 1000; If (counter.get(currentSeconds).incrementandGet () > limit) {system.out.println (" currentSeconds :" + currentSeconds); continue; } // Business processing}Copy the code
We use Guava’s Cache to store the counter, set the expiration time to 2 seconds (to ensure that a counter within 1 second is available), and then we get the current timestamp and the number of seconds as the KEY for counting statistics and limiting traffic.
3. Token bucket
Algorithm description:
- If the average send rate configured by the user is r, a token is added to the bucket every 1/r second
- Assume that a bucket can hold up to B tokens. If the token arrives when the bucket is full, the token is discarded
- When traffic enters the bucket at rate V and obtains a token from the bucket at rate V, the traffic with the token passes. If the traffic with no token fails to pass, the circuit breaker logic is performed
attribute
- In the long run, the rate of conforming traffic is influenced by the rate of token addition and is stabilized at: r
- Because the token bucket has a certain amount of storage, it can withstand certain traffic emergencies
- M is the maximum possible transmission rate in bytes per second. M>r
- T Max = b/(m-r) the time to withstand the maximum transmission rate
- B Max = T Max * M The traffic transmitted during the maximum transmission rate
Advantages: Smooth flow, and can withstand a certain amount of sudden flow
4. RATELIMITER class in the tool library provided by GOOGLE GUAVA (also internally implemented by token bucket algorithm)
The fastest way to do this is to use the RateLimit class, but this is limited to a single node. In a distributed system, the QPS for each node is the same, and the number of requests to the service interface is QPS * nodes. So this scheme is not suitable for distributed situations!
5, based on REDIS implementation, store two keys, one for timing, one for counting. Each time the request is invoked, the counter increases by 1, and if the counter does not exceed the threshold during the timer time, the task can be processed.
This can solve the concurrency problem caused by multiple instances in distributed environment. Because the timers and counters set with Redis are globally unique, no matter how many nodes they use the same timers and counters, very precise flow control can be achieved.
I won’t publish the code, because it’s too personal.
The last
Reference article:
Design of current limiting system based on Redis
If you are interested, you can see how other people’s code is written: github.com/wukq/rate-l…
Original author: Zhisheng
Original address:
Design of lower limit flow system based on distributed environment
Copyright: This article is by the happy science and technology cooperation blogger original, reproduced please indicate the author and source, thank you!
Snap up wechat mini program for one yuan >>>
The link address