In system architecture design, rate limiting is a topic that has to be discussed: it looks like a small thing, yet it is vitally important. A rate limiter is like the border guard of an ancient town, defending the pass and holding off the invading army; once the pass falls, all manner of ravenous beasts pour into the city, and everything we have painstakingly built is looted and burned. So today let's take up this topic, partly to summarize the common rate-limiting techniques, and partly to throw out a brick to attract jade, prompting you to think about how rate limiting should be done in your own system.

When we talk about rate limiting, the key point is how to limit, and the limiting itself divides into single-machine rate limiting and distributed rate limiting.

Single-machine rate limiting, as the name implies, controls the traffic reaching a single Docker container or physical machine on which the application is deployed, so that the inflow stays under control, preventing the performance problems, or even total loss of responsiveness, that come from traffic arriving too fast or in too great a volume.

Distributed rate limiting limits traffic across a whole cluster. The counting for this kind of limiting is usually centralized in one place, such as Redis, ZooKeeper, or another component that can support distributed rate limiting. That way, when traffic becomes too heavy or too fast, the collapse of one machine in the cluster will not set off an avalanche effect that brings down the entire cluster.

Let's look at the various rate-limiting techniques one by one.

1. Counter-based single-machine rate limiting

This kind of rate limiting is done with a counter inside the application, implemented with an integer variable or Java's built-in AtomicLong. The principle: set a counter threshold; each incoming request increments the counter; once the threshold is reached, subsequent requests are discarded outright. The code implementation is as follows:
private static AtomicLong counter = new AtomicLong();
// Rate-limiting threshold: at most 500 requests in flight at once
private static final long counterMax = 500;

public void invoke(Request request) {
    try {
        // Reject the request once the in-flight count exceeds the threshold
        if (counter.incrementAndGet() > counterMax) {
            return;
        }
        // doBusiness(request);
    } catch (Exception e) {
        // doException(request, e);
    } finally {
        // Always release the slot, whether the request ran or was rejected
        counter.decrementAndGet();
    }
}
The code above is a simple counter-based implementation of single-machine rate limiting. It is short, easy to operate, and works well enough. But the drawback is just as obvious: the traffic that arrives first generally gets in, and the traffic that follows is mostly rejected. Each request's probability of being executed is different, so early arrivals get the chance to run while late arrivals may never be executed.

Pros: simple code, easy to operate.
Cons: first come, first served. The earliest request is executed with probability 100%, later requests with a much lower probability, so requests do not get an equal chance to execute.

So what if you want every request to have an equal chance of being executed?

2. Random-number-based single-machine rate limiting

This algorithm gives every request the same probability of being executed, which makes it friendlier than counter-based limiting. The code is as follows:
// Rate-limiting threshold: allow roughly 10% of requests through
private static final int ptgGuarder = 10;

public void invoke(Request request) {
    try {
        // Draw a random percentage from 1 to 100 for this request.
        // ThreadLocalRandom.current() must be called on the requesting thread;
        // caching the instance in a static field would wrongly share one
        // thread's generator across all threads.
        int currentPercentage = ThreadLocalRandom.current().nextInt(1, 101);
        if (currentPercentage <= ptgGuarder) {
            // doBusiness(request);
        } else {
            return;
        }
    } catch (Exception e) {
        // doException(request, e);
    }
}
As the code shows, for each request the system draws a random number between 1 and 100 and compares it with the current rate-limiting threshold (suppose the interface currently allows only 10% of traffic through). If the drawn value is at or below the threshold, the request proceeds; if it is above the threshold, the method returns without doing anything. Compared with the earlier counter-based limiting, every request now has the same probability of being executed. In a real business scenario, the threshold can of course be made dynamically configurable, so the percentage of admitted traffic per minute or per hour can be tuned on the fly.

For sudden surges of high traffic, however, this method has problems: it only controls the percentage admitted, not the absolute rate, so the number of admitted requests grows with the surge. Also, under high concurrency the intervals between requests are extremely short and nextInt produces repeated values with noticeable probability; one possible optimization is to choose a more suitable seed or generator to improve the values it produces.

Pros: simple code, easy to operate, every request has an equal chance of executing.
Cons: not suitable for applications facing sudden traffic surges.

3. Time-based single-machine rate limiting

Sometimes an application only wants to admit a fixed amount of traffic per unit of time, such as 100 requests per second, and discard the rest. There are several ways to achieve this. You could combine the counter approach with explicit time checks, but that is somewhat complicated and hard to control. So here we use a caching component instead. With a Guava cache we set a key equal to the current second, and the value is the accumulated number of requests seen during that second. The code is as follows:
// Cache keyed by the current second; entries expire 5 seconds after write
private static LoadingCache<Long, AtomicLong> guava = CacheBuilder.newBuilder()
        .expireAfterWrite(5, TimeUnit.SECONDS)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long seconds) throws Exception {
                // Start a fresh counter for each new second;
                // a CacheLoader must not return null
                return new AtomicLong(0);
            }
        });
// Rate-limiting threshold: at most 100 requests per second
private static final long requestsPerSecond = 100;

public void invoke(Request request) {
    try {
        // Key for the current second
        long guavaKey = System.currentTimeMillis() / 1000;
        long guavaVal = guava.get(guavaKey).incrementAndGet();
        if (guavaVal <= requestsPerSecond) {
            // doBusiness(request);
        } else {
            return;
        }
    } catch (Exception e) {
        // doException(request, e);
    }
}
As the code shows, we cleverly exploit the cache component's features: each incoming request increments the counter stored under the current second's key, and once the count reaches the threshold, subsequent requests in that second are rejected. This conveniently gives us time-slice rate limiting. The example limits by the second, but it is easy to change into a per-minute or per-hour implementation; see the sketch below.

Pros: simple to operate, highly reliable.
Cons: under a traffic surge every request goes through the Guava cache, and since Guava keeps this data in heap memory, there is inevitably some performance impact. If the limiter's memory churn interferes with other in-memory computation, an off-heap component such as OHC or MapDB can implement the same counting scheme; that is also a good alternative.
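Here is a minimal sketch of the per-minute variant mentioned above; requestsPerMinute, invokePerMinute, and the 2-minute expiry are illustrative names and values, not from the original.

// Per-minute variant of the same Guava-cache counting scheme
private static LoadingCache<Long, AtomicLong> perMinuteCache = CacheBuilder.newBuilder()
        .expireAfterWrite(2, TimeUnit.MINUTES)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long minute) throws Exception {
                return new AtomicLong(0);
            }
        });
// Illustrative threshold: at most 6000 requests per minute
private static final long requestsPerMinute = 6000;

public void invokePerMinute(Request request) {
    try {
        // Key for the current minute instead of the current second
        long minuteKey = System.currentTimeMillis() / 60000;
        if (perMinuteCache.get(minuteKey).incrementAndGet() <= requestsPerMinute) {
            // doBusiness(request);
        } else {
            return;
        }
    } catch (Exception e) {
        // doException(request, e);
    }
}

Only the key derivation, the threshold, and the expiry change; the structure is identical to the per-second version.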
4. Leaky-bucket-based single-machine rate limiting

A leaky bucket is just what it sounds like: a pool of water with an inlet and an outlet. The inflow may be small or large, but the outflow rate is constant. (The original article illustrates this with a diagram of a tap filling a leaky bucket.)

The tap represents the traffic each client pours into the bucket. When the inflow is small, the bucket can hold it and the outlet drains at a constant rate, so no water spills. When the inflow increases, the drain rate can no longer keep up with the intake, so the water level keeps rising. When the inflow grows larger still, the water overflows, and those requests are discarded.

Many MQ systems, such as RabbitMQ, are effectively implementations of the leaky bucket algorithm: requests are queued first, discarded when the queue is full, and consumers then drain the queue at a constant rate. So no extra production code is posted here, but a minimal sketch follows below.

Pros: very smooth, well-controlled outflow.
Cons: cannot take advantage of spare capacity to absorb traffic surges. It suits protecting systems whose performance is weak, not systems whose performance is strong: if a more powerful system could actually handle the surge, the leaky bucket's fixed drain rate wastes that capacity, so the algorithm is not appropriate there.
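Here is a minimal leaky-bucket sketch, assuming a bounded queue as the bucket and a single-threaded scheduler as the constant-rate outlet; LeakyBucketLimiter, bucketCapacity, and the 10 ms drain interval are illustrative names and values, not from the original.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LeakyBucketLimiter {
    // The bucket: a bounded queue holding pending requests
    private static final int bucketCapacity = 1000;
    private static final BlockingQueue<Request> bucket =
            new ArrayBlockingQueue<>(bucketCapacity);

    // The outlet: drain one request every 10 ms, i.e. about 100 requests/second
    private static final ScheduledExecutorService drainer =
            Executors.newSingleThreadScheduledExecutor();
    static {
        drainer.scheduleAtFixedRate(() -> {
            Request request = bucket.poll();
            if (request != null) {
                // doBusiness(request);
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
    }

    public void invoke(Request request) {
        // offer() returns false when the bucket is full: the water overflows
        // and the request is simply discarded
        if (!bucket.offer(request)) {
            return;
        }
    }
}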
5. Token-bucket-based single-machine rate limiting

A token bucket works the other way around: tokens are put into the bucket at a constant rate, and when a request arrives it must first obtain a token from the bucket before the business logic is processed; if no token can be obtained, the request is terminated. (The original article illustrates this with a diagram as well.)

Because tokens are generated at a constant rate, when surge traffic arrives and enough tokens have accumulated in the bucket, the surge can quickly grab tokens and be processed. As you can see, a token bucket is therefore allowed to handle surge traffic. Guava already ships a concrete token bucket implementation, the RateLimiter class, so we can use it directly for token-bucket rate limiting. The code is as follows:
// Token bucket that generates one token per second
private static RateLimiter limiter = RateLimiter.create(1);
// Wait at most 1000 ms for a token before giving up
private static final long acquireTimeout = 1000;

public void invoke(Request request) {
    try {
        // Try to obtain a token within the timeout; reject on failure
        if (limiter.tryAcquire(acquireTimeout, TimeUnit.MILLISECONDS)) {
            // doBusiness(request);
        } else {
            return;
        }
    } catch (Exception e) {
        // doException(request, e);
    }
}
As the code shows, we generate one token per second, so the interface is limited to handling one request per second. If we judge that a single instance of the interface can actually sustain a TPS of 1000, we can enlarge the token supply accordingly, for example to 800 tokens per second, so that when surge traffic arrives it grabs tokens and goes straight into business processing; see the sketch below. Once the bucket's tokens are used up, subsequent requests wait until the next batch of 800 tokens is generated in the following second, and then processing continues.

The pros and cons of the token bucket are therefore obvious:
Pros: simple to use, with a mature off-the-shelf component.
Cons: suited to single-machine rate limiting, not to distributed rate limiting.
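A minimal sketch of that adjustment (800 is the article's example figure; note that by default Guava's RateLimiter only saves up roughly one second's worth of unused tokens for bursts):

// 800 tokens per second; unused tokens accumulate (up to about one
// second's worth by default) and can absorb a short burst
private static RateLimiter limiter = RateLimiter.create(800);

Everything else in the invoke() method above stays the same.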
6. Distributed rate limiting based on Redis + Lua

The five methods above are all single-machine rate limiting, but in practice we often need distributed rate limiting as well, not just single-machine limiting. There are many ways to do distributed limiting, so I will not go through them one by one; the approach used here is Redis + Lua. Why Redis + Lua? Two reasons. First, Redis performs well, with strong processing capacity and good disaster-recovery capability. Second, a Lua script executes in Redis as an atomic operation, which guarantees the correctness of the counter data. Since we are limiting traffic, there must be a key that records the accumulated request count, and that key can change over time; the key also needs an expiry, to prevent a pile-up of stale data from causing Redis performance problems. Take a look at the Lua code:

-- Counter key; the suffix (for example the current time bucket) is passed
-- in by the caller
local key = 'limitKey:' .. KEYS[1]
local val = tonumber(redis.call('get', key) or 0)
-- The rate-limiting threshold is passed in as the first argument
local threshold = tonumber(ARGV[1])
if val > threshold then
    -- Request is limited
    return 0
else
    -- Increment the request count and refresh the 5-second expiry
    redis.call('incrby', key, 1)
    redis.call('expire', key, 5)
    -- Request passes
    return 1
end
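To invoke this script from Java, here is a minimal calling sketch, assuming the Jedis client; RedisLuaLimiter, LIMIT_SCRIPT, and allow() are illustrative names, not from the original.

import java.util.Collections;
import redis.clients.jedis.Jedis;

public class RedisLuaLimiter {
    // The Lua script above, loaded as a string (assumed constant)
    private static final String LIMIT_SCRIPT = "...";

    // Returns true when the request may proceed, false when it is limited
    public boolean allow(Jedis jedis, long threshold) {
        // Key suffix: the current second, so the counter rolls over every second
        String keySuffix = String.valueOf(System.currentTimeMillis() / 1000);
        Object result = jedis.eval(LIMIT_SCRIPT,
                Collections.singletonList(keySuffix),
                Collections.singletonList(String.valueOf(threshold)));
        // The script returns 1 when the request passes, 0 when it is limited
        return Long.valueOf(1L).equals(result);
    }
}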
After that, the application calls the script directly, as in the sketch above, and lets its business logic proceed depending on whether the returned value is 0 or 1. With this snippet you can control the traffic of the entire cluster and thereby avoid the avalanche effect. Of course, the same scheme can also be built on ZooKeeper; thanks to ZooKeeper's strong consistency guarantee it is another good option, but since its performance is not as good as Redis's, use Redis if you care about performance.

Pros: cluster-wide traffic control, preventing the avalanche effect.
Cons: an extra Redis component has to be introduced, and the Redis deployment must support Lua scripting.

Summary: in walking through these six rate-limiting techniques, my main aim is to throw out a brick to attract jade, and I look forward to ever better solutions. The code above is pseudocode; please verify it before using it in production, otherwise the side effects will not be worth the gain.