Dependencies between distributed microservices are complex. A single front-end request, for example, is often turned into several back-end service calls. If one of those back-end services becomes unstable or slow and there is no effective rate limiting or circuit breaking in place, the user experience suffers. In serious cases the failure cascades (the avalanche effect) and brings down the whole website. It is hard to imagine Alibaba getting through an event like Double 11 without a solid set of circuit-breaking measures; the system might not be able to withstand that level of concurrency.
Before 2012, Netflix had no well-designed fault tolerance or traffic limiting, and it struggled with system stability. Several times the website went down for lack of effective circuit-breaking measures. Since it put such a system (Hystrix) in place, Netflix has made a big leap in system stability, and there has been no large-scale avalanche incident since.
The following uses Hystrix as an example to illustrate rate limiting and circuit breaking.
A few concepts:
Circuit breaking (sometimes translated as "fusing"), isolation, rate limiting ("current limiting"), and degradation ("demotion") are the most important concepts and patterns in distributed fault tolerance.
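Circuit breaking
Think of the fuse box in your house: when a high-power appliance overloads the circuit, the fuse blows and cuts the power, which keeps the problem from spreading. A software circuit breaker works the same way: once calls to a service keep failing, it stops sending requests to that service for a while instead of letting the failures pile up.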
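To make the idea concrete, here is a minimal Python sketch of a circuit breaker. It is an illustration only, not Hystrix's implementation: the class name, the "N consecutive failures" trip rule, and the `reset_timeout` parameter are all assumptions chosen to keep the example short.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures (assumed rule)."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds the circuit stays open
        self.clock = clock                  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the breaker
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on a dependency that is already struggling.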
Isolation
Computing resources such as CPU, memory, queues, and thread pools are all limited. Without isolation, calls to one service can consume a large share of those resources and starve the calls to other services. Through this knock-on effect, a problem in a single service can make other services unreachable.
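One common way to isolate dependencies is to cap how many calls to each one may be in flight at once. The sketch below uses a semaphore per dependency; the `Bulkhead` class and the service names are hypothetical and chosen only for illustration.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a dependency already has its maximum calls in flight."""


class Bulkhead:
    """Hypothetical per-dependency cap on concurrent calls."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: reject immediately instead of queueing up.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot for this dependency")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


# One bulkhead per downstream service (names are made up), so a slow
# inventory service cannot use up the slots reserved for payments.
inventory_calls = Bulkhead(max_concurrent=2)
payment_calls = Bulkhead(max_concurrent=2)
```

Because each dependency has its own budget, exhausting one bulkhead leaves the others untouched.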
Rate limiting
When heavy traffic floods a service, we need rate limiting measures. For example, we allow only a certain number of requests through to a resource within a given period of time. If more traffic than that would cause problems for the system, rate limiting protects it.
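The "certain number of requests within a certain period" rule can be sketched as a fixed-window counter. This is a generic illustration with assumed names (`FixedWindowRateLimiter`, `limit`, `window`), not a description of how Hystrix itself restricts traffic.

```python
import time


class FixedWindowRateLimiter:
    """Hypothetical limiter: at most `limit` requests per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock              # injectable so tests can control time
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window begins: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False                    # over the limit for this window
```

A caller checks `allow()` before doing the work and rejects or delays the request when it returns `False`. A fixed window is the simplest variant; it can let through a burst of up to twice the limit around a window boundary.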
Degradation
If the system cannot provide enough capacity, it needs a way to degrade so that it does not deteriorate further, while still giving users a friendly, flexible response, such as telling them that the service is temporarily unavailable and asking them to try again later.
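A degraded response can be as simple as wrapping the real call with a fallback. In the sketch below every name (`with_fallback`, `fetch_recommendations`, `recommendations_unavailable`) is hypothetical, and the failing backend is simulated.

```python
def with_fallback(primary, fallback):
    """Return a callable that degrades to `fallback` when `primary` fails."""
    def wrapped(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception:
            # Degrade: serve a friendly default instead of propagating the error.
            return fallback(*args, **kwargs)
    return wrapped


def fetch_recommendations(user_id):
    # Hypothetical backend call, simulated here as always timing out.
    raise TimeoutError("recommendation service timed out")


def recommendations_unavailable(user_id):
    return {"items": [], "message": "Temporarily unavailable, please try again later."}


get_recommendations = with_fallback(fetch_recommendations, recommendations_unavailable)
```

Calling `get_recommendations(42)` returns the empty, user-friendly payload rather than raising, so the page can still render.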
Hystrix
Hystrix packages all of the above (circuit breaking, isolation, rate limiting, and degradation) into a single component. (Diagram: Hystrix's internal design and invocation process.)
The general workflow is as follows:
- Build a HystrixCommand object to wrap the request, and configure the parameters the request needs in its constructor
- Execute the command. Hystrix provides several ways to execute a command; the most common are synchronous (execute()) and asynchronous (queue())
- Check whether the circuit is open; if so, go straight to the fallback method
- Check whether the thread pool, queue, or semaphore is full; if so, go straight to the fallback method
- Execute the run method, typically HystrixCommand.run(), which makes the actual business call. If execution times out, fails, or throws an unexpected exception, go straight to the fallback method
- Every step in the process reports to metrics, which are used to calculate the circuit breaker's monitoring indicators
- The fallback method itself is also split into an implementation step and a backup step
- Finally, return the response to the request
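The workflow above can be sketched in a few lines of Python. This is a hypothetical stand-in, not Hystrix's Java API: `CommandExecutor`, its parameters, and the metric names are assumptions, the circuit trips on consecutive failures, and timeouts and circuit recovery are left out to keep the flow readable.

```python
import threading
from collections import Counter


class CommandExecutor:
    """Hypothetical stand-in for the flow above; this is not Hystrix's API."""

    def __init__(self, max_concurrent=10, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.slots = threading.BoundedSemaphore(max_concurrent)
        self.metrics = Counter()  # event name -> count

    def circuit_open(self):
        # Simplification: once open, this sketch never closes the circuit again.
        return self.consecutive_failures >= self.failure_threshold

    def execute(self, run, fallback):
        # Circuit check: an open circuit goes straight to the fallback.
        if self.circuit_open():
            self.metrics["short_circuited"] += 1
            return fallback()
        # Capacity check: no free slot also goes straight to the fallback.
        if not self.slots.acquire(blocking=False):
            self.metrics["rejected"] += 1
            return fallback()
        try:
            result = run()  # the actual business call
        except Exception:
            self.consecutive_failures += 1
            self.metrics["failure"] += 1
            return fallback()
        else:
            self.consecutive_failures = 0
            self.metrics["success"] += 1
            return result
        finally:
            self.slots.release()
```

Every branch records an event in `metrics`, mirroring how each step feeds the indicators that decide whether the circuit should open. The counters here are not synchronized, so a real implementation would need to protect them under concurrent use.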
Blog address: "Microservices Series 13: Circuit Breaking, Rate Limiting, Isolation, and Degradation"