1. Background
Today we look at Hystrix, the circuit breaker in Spring Cloud.
We continue to use the Eureka-Server from the previous article as the service registry.
The following Spring Boot and Spring Cloud versions are used:
- Spring Boot version: 2.3.5.RELEASE
- Spring Cloud version: Hoxton.SR9
2. What is Hystrix
In a distributed environment, some of the many service dependencies are bound to fail. Hystrix is a library that helps you control the interactions between these distributed services by adding latency tolerance and fault tolerance logic. It does this by isolating the points of access between services, stopping cascading failures, and providing fallback options, all of which improve the overall resilience of the system.
3. What Hystrix is designed for
Hystrix is designed to:
- Protect against and control latency and failures from dependencies accessed through third-party client libraries, usually over a network.
- Prevent cascading failures in complex distributed systems.
- Fail fast and recover fast.
- Fall back and degrade gracefully whenever possible.
- Enable near-real-time monitoring, alerting, and operational control.
4. What does Hystrix do
Hystrix addresses the avalanche (cascading failure) problem by providing resource isolation, degradation (fallback), circuit breaking, caching, and more.
- Resource isolation: Thread pool isolation and semaphore isolation cap the resources that calls to a given distributed service may use, so that problems with one service call do not affect calls to other services.
- Degradation: A call is degraded when it times out or when resources (threads or semaphore permits) are exhausted. Once degraded, the fallback interface returns substitute data.
- Circuit breaking: When the failure rate reaches the threshold (for example, because of network failures or timeouts), the breaker opens and calls are degraded automatically. Calls fail fast while the breaker is open, and the breaker recovers quickly once the dependency is healthy again.
- Cache: Results are cached, so subsequent requests can be served directly from the cache.
- Request merging: Requests made within a short period (typically requests to the same interface) can be combined and sent to the service provider as a single request.
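Request merging, the last item above, can be illustrated with a minimal sketch. The names `RequestCollapser`, `submit`, `flush`, and `batchCallsSent` are hypothetical and invented for this example. Hystrix's own `HystrixCollapser` closes the merging window with a timer, whereas this sketch leaves the flush to the caller so that it stays deterministic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical collapser: queues single-id lookups and sends them downstream as one batch. */
class RequestCollapser {
    // Assumed shape of the provider's batch interface: list of ids in, id -> result out.
    private final Function<List<String>, Map<String, String>> batchCall;
    private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
    int batchCallsSent = 0;

    RequestCollapser(Function<List<String>, Map<String, String>> batchCall) {
        this.batchCall = batchCall;
    }

    /** Queue one lookup; duplicate ids within the same window share one future. */
    synchronized CompletableFuture<String> submit(String id) {
        return pending.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    /** End of the window: make a single downstream call for every queued id. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<String, CompletableFuture<String>> batch = new LinkedHashMap<>(pending);
        pending.clear();
        batchCallsSent++;
        Map<String, String> results = batchCall.apply(new ArrayList<>(batch.keySet()));
        batch.forEach((id, future) -> future.complete(results.get(id)));
    }
}
```

Each caller still receives its own result through the future, while the provider sees one request per window instead of one per caller.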
Service circuit breaking and degradation
- Service circuit breaking: A protective measure that keeps the whole system from failing when a service is overloaded for some reason. Put simply, an abnormal condition trips the breaker, and the breaker is the defense mechanism that reacts to it.
- Service degradation: Returning simple fallback data is one way to handle a tripped circuit breaker.
Resource isolation:
- Thread isolation
Hystrix places thread pools between user requests and services, allocating a small thread pool to each dependency. If the pool is full, the call is rejected immediately; by default there is no queue, which shortens the time needed to detect a failure. The number of threads is configurable. Principle: a user request no longer calls the service directly. Instead, an idle thread from the pool makes the call. If the pool is full, the request is degraded rather than blocked, so the user always gets some result (for example, a friendly message) instead of waiting endlessly or seeing the system crash.
- Semaphore isolation
In this mode, receiving the request and executing the downstream dependency happen on the same thread, so there is no thread context-switching overhead. The drawback is that a blocked call cannot be timed out and abandoned, so semaphore mode is not a good choice when the downstream call may block for a long time.
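The difference between the two modes can be sketched in plain Java. This is an illustration built on `java.util.concurrent`, not Hystrix's implementation; `IsolationSketch`, `withSemaphore`, `withThreadPool`, and `newIsolatedPool` are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper contrasting the two isolation strategies. */
class IsolationSketch {

    /** Semaphore isolation: the caller's own thread runs the call; permits cap concurrency. */
    static String withSemaphore(Semaphore permits, Supplier<String> call, String fallback) {
        if (!permits.tryAcquire()) {
            return fallback;              // limit reached: reject immediately
        }
        try {
            return call.get();            // runs on the calling thread, so it cannot be timed out
        } catch (RuntimeException e) {
            return fallback;
        } finally {
            permits.release();
        }
    }

    /** Thread pool isolation: a dedicated pool runs the call; the caller waits with a timeout. */
    static String withThreadPool(ExecutorService pool, Supplier<String> call,
                                 long timeoutMs, String fallback) {
        Callable<String> task = call::get;
        Future<String> future;
        try {
            future = pool.submit(task);
        } catch (RejectedExecutionException e) {
            return fallback;              // pool full and no queue: reject immediately
        }
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);          // stop waiting for the slow call
            return fallback;
        } catch (Exception e) {
            return fallback;              // the call failed or the wait was interrupted
        }
    }

    /** A small fixed-size pool without a queue, similar in spirit to a per-dependency pool. */
    static ThreadPoolExecutor newIsolatedPool(int size) {
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>());
    }
}
```

The thread pool variant pays for an extra thread hand-off but can give up on a slow call, while the semaphore variant is only a counter and has to wait for the call to return.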
Comparison
Item | Thread pool isolation | Semaphore isolation |
---|---|---|
Circuit breaking supported | Yes. When the thread pool reaches maxSize, further requests trigger the fallback | Yes. When the semaphore reaches maxConcurrentRequests, further requests trigger the fallback |
Timeout supported | Yes. The call can return directly | No. A blocked call returns only when the underlying call protocol itself returns or times out |
Isolation principle | Each service uses its own thread pool | A semaphore acts as a counter |
Asynchronous invocation supported | Asynchronous or synchronous, depending on the method called | Synchronous only |
Resource consumption | High. Many threads and context switches can drive up machine load | Low. It is only a counter |
5. How the circuit breaker works
Three states of the circuit breaker
- CLOSED: the breaker is closed and requests are processed normally
- OPEN: the breaker is open and requests are degraded directly
- HALF-OPEN: the breaker is half-open; once the sleep window has elapsed, one request is let through as a trial
Circuit breaker configuration parameters (important)
The circuit breaker is controlled by the following six parameters:
1. circuitBreaker.enabled
Whether to enable the circuit breaker. The default value is true.
2. circuitBreaker.forceOpen
Forces the breaker open. It stays open regardless of its actual state. The default value is false.
3. circuitBreaker.forceClosed
Forces the breaker closed. It stays closed regardless of its actual state. The default value is false.
4. circuitBreaker.errorThresholdPercentage
The error rate threshold. The default value is 50 (percent). For example, if 100 requests arrive within a period (10 s) and 54 of them time out or throw an exception, the error rate is 54%. That exceeds the default of 50%, so the breaker opens.
5. circuitBreaker.requestVolumeThreshold
The default value is 20. errorThresholdPercentage is evaluated only when at least 20 requests arrive within the period. For example, if only 19 requests arrive and all of them fail, the error rate is 100%, but the breaker does not open because the total number of requests is below 20.
6. circuitBreaker.sleepWindowInMilliseconds
How long the breaker stays open before a half-open trial. The default value is 5000 ms. For example, 5000 ms after the breaker opens, it lets a trial request through to test whether the dependent service has recovered.
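The two examples above (54 failures out of 100 requests, and 19 failures out of 19) can be checked with a few lines of arithmetic. `TripDecision.shouldOpen` is a hypothetical helper written for this illustration. It assumes the breaker trips when the error percentage is at or above the threshold, which is how the Hystrix configuration documentation describes errorThresholdPercentage.

```java
/** Hypothetical helper reproducing the trip decision from the two thresholds. */
class TripDecision {
    static boolean shouldOpen(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false;   // too few requests: the error rate is not evaluated at all
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }
}
```

With the defaults (20 and 50), `shouldOpen(100, 54, 20, 50)` is true, while `shouldOpen(19, 19, 20, 50)` is false because the volume threshold is not met.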
Circuit breaker flow analysis
The circuit breaker works as follows:
The first step is to call allowRequest() to decide whether the request may be submitted to the thread pool.
If the breaker is forced open (circuitBreaker.forceOpen is true), the request is not allowed and the method returns. If the breaker is forced closed (circuitBreaker.forceClosed is true), the request is allowed. In the forced-closed case the actual state of the breaker is ignored: the breaker still maintains its statistics and switch state, they simply have no effect.
The second step is to call isOpen() to determine whether the breaker is open.
If the breaker is already open, go to the third step; otherwise continue. If the total number of requests in the current period is below circuitBreaker.requestVolumeThreshold, allow the request; otherwise continue. If the error rate in the current period is below circuitBreaker.errorThresholdPercentage, allow the request. Otherwise, open the breaker and go to the third step.
The third step is to call allowSingleTest() to decide whether a single request may pass in order to check whether the dependent service has recovered.
If the breaker is open and the time since it opened, or since the last trial request, exceeds circuitBreaker.sleepWindowInMilliseconds, the breaker enters the half-open state and lets one trial request through; otherwise the request is not allowed. In addition, to provide a basis for these decisions, each breaker maintains 10 buckets by default, one per second, and the oldest bucket is discarded when a new one is created. Each bucket keeps counters for successful, failed, timed-out, and rejected requests, which Hystrix collects and aggregates.
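The three steps can be condensed into a simplified sketch. The method names allowRequest(), isOpen(), and allowSingleTest() come from the description above. The class `SimpleCircuitBreaker`, the injectable clock, and the single pair of counters (standing in for the 10 rolling buckets) are assumptions made to keep the example short, and forceOpen/forceClosed are left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Simplified sketch of the three-step check; not the Hystrix implementation. */
class SimpleCircuitBreaker {
    private final int requestVolumeThreshold;
    private final int errorThresholdPercentage;
    private final long sleepWindowMs;
    private final LongSupplier clock;   // injectable time source (assumption, for testability)

    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong openedOrLastTestedAt = new AtomicLong();
    private final AtomicInteger total = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMs, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
    }

    /** Step 1: may this request proceed? */
    boolean allowRequest() {
        return !isOpen() || allowSingleTest();
    }

    /** Step 2: is the breaker open, or do the counters say it should open now? */
    boolean isOpen() {
        if (open.get()) {
            return true;
        }
        int seen = total.get();
        if (seen == 0 || seen < requestVolumeThreshold) {
            return false;                                   // not enough traffic to judge
        }
        if (failures.get() * 100 / seen < errorThresholdPercentage) {
            return false;                                   // error rate below the threshold
        }
        if (open.compareAndSet(false, true)) {
            openedOrLastTestedAt.set(clock.getAsLong());    // start the sleep window
        }
        return true;
    }

    /** Step 3: once the sleep window has passed, let exactly one trial request through. */
    boolean allowSingleTest() {
        long last = openedOrLastTestedAt.get();
        long now = clock.getAsLong();
        return open.get() && now > last + sleepWindowMs
                && openedOrLastTestedAt.compareAndSet(last, now);
    }

    /** Report a success; a successful trial request closes the breaker and resets the counters. */
    void markSuccess() {
        if (open.get()) {
            total.set(0);
            failures.set(0);
            open.set(false);
        } else {
            total.incrementAndGet();
        }
    }

    /** Report a failure (an exception, timeout, or rejection). */
    void markFailure() {
        total.incrementAndGet();
        failures.incrementAndGet();
    }
}
```

If the trial request fails, the breaker stays open and the compare-and-set in allowSingleTest() has already restarted the sleep window, so the next trial happens one window later.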
6. Building the project
6.1 Consumer
Add the dependencies
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-ribbon</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
Add the annotation to the startup class
@EnableCircuitBreaker
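Add the Feign client and its fallback class
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// Feign client for the producer; UserServiceImpl serves as the fallback.
@FeignClient(name = "ms-feign-producer", path = "/api/user", configuration = Config.class, fallback = UserServiceImpl.class)
public interface UserService {
    @GetMapping("/{id}")
    String selectUser(@PathVariable("id") String id);
}

// Fallback implementation, returned when the call fails or the circuit is open.
@Component
public class UserServiceImpl implements UserService {
    @Override
    public String selectUser(String id) {
        return "I'm a circuit breaker ";
    }
}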
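The configuration file
Note that in Hoxton, Feign's Hystrix support is off by default: the fallback only takes effect if feign.hystrix.enabled is set to true or the client configuration (Config.class here, which is not shown) supplies a Hystrix-enabled Feign builder.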
spring:
  application:
    name: ms-feign-consumer
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8000/eureka
    register-with-eureka: true
  instance:
    prefer-ip-address: true
    #appname: ${spring.application.name}
    instance-id: ${spring.cloud.client.ip-address}:${server.port}
    hostname: ${spring.cloud.client.ip-address}
server:
  port: 8081
hystrix:
  command:
    default:
      circuitBreaker:
        requestVolumeThreshold: 5 # at least 5 requests within the time window
        sleepWindowInMilliseconds: 5000
        errorThresholdPercentage: 50
      metrics:
        rollingStats:
          timeInMilliseconds: 5000 # time window
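Usage
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RequestMapping("/api/comsumer/user")
@RestController
public class UserController {
    @Autowired
    UserService userService;

    @GetMapping("/{id}")
    public String selectUser(@PathVariable("id") String id) {
        // Delegates to the Feign client; the fallback is returned when the call fails.
        return userService.selectUser(id);
    }
}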
6.2 Producer
Add the dependencies
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
Modify the configuration file
spring:
  application:
    name: ms-feign-producer
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8000/eureka
    register-with-eureka: true
  instance:
    prefer-ip-address: true
    #appname: ${spring.application.name}
    instance-id: ${spring.cloud.client.ip-address}:${server.port}
    hostname: ${spring.cloud.client.ip-address}
server:
  port: 8082
Provide the service
@RequestMapping("/api/user")
@RestController
public class UserController {
    @Value("${server.port}")
    Integer port;

    @GetMapping("/{id}")
    public String selectUser(@PathVariable("id") String id) {
        if ("1".equals(id)) {
            // Deliberately throws ArithmeticException so that requests with id=1 always fail.
            int i = 1 / 0;
        }
        User user = new User();
        user.setId(id);
        user.setName("wangyunqi");
        user.setPort(port);
        return user.toString();
    }
}
Test
- 1: Request the consumer with id=1 more than 5 times in a row within 5 seconds. After that, requests return the fallback directly: "I'm a circuit breaker"
Check the circuit breaker state through the health endpoint: http://192.168.1.119:8081/actuator/health
"hystrix": {
    "status": "CIRCUIT_OPEN",
    "details": {
        "openCircuitBreakers": [
            "ms-feign-producer::UserService#selectUser(String)"
        ]
    }
},
"ping": {
    "status": "UP"
},
"refreshScope": {
    "status": "UP"
}
- 2: After waiting for sleepWindowInMilliseconds, request the consumer with id=2. The data is returned normally:
User{id='2', name='111', age=0, port=8082}
Check the circuit breaker state through the health endpoint again: http://192.168.1.119:8081/actuator/health
"hystrix": {
    "status": "UP"
},
"ping": {
    "status": "UP"
},
"refreshScope": {
    "status": "UP"
}
}
7. Circuit breaker workflow
Process description:
- 1: Create a new HystrixCommand for each call, wrapping the dependent call in its run() method.
- 2: Call execute() or queue() for a synchronous or asynchronous invocation.
- 3: Check whether the circuit breaker is open. If it is, go to step 8 (fallback); if it is not, continue with the next step.
- 4: Check whether the thread pool/queue/semaphore is full. If so, go to step 8; otherwise continue with the next step.
- 5: Call the run() method of HystrixCommand to execute the dependent logic.
- 5a: The dependent call times out. Go to step 8.
- 6: Check whether the call succeeded.
- 6a: Return the result of the successful call.
- 6b: The call failed. Go to step 8.
- 7: Compute the circuit breaker state. Every outcome (success, failure, rejection, timeout) is reported to the breaker, which uses these statistics to decide its state.
- 8: getFallback() runs the fallback (degradation) logic.
- getFallback() is triggered in four cases:
- (1): The run() method throws an exception other than HystrixBadRequestException
- (2): The run() method call times out
- (3): The circuit breaker is open and intercepts the call
- (4): The thread pool/queue/semaphore is full
- 8a: A command that does not implement getFallback() throws an exception directly.
- 8b: If the fallback call succeeds, its result is returned.
- 8c: If the fallback call fails, an exception is thrown.
- 9: Return the successful execution result.
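Steps 3 to 9 can be traced in a small plain-Java sketch. `MiniCommand` and `BadRequestException` are hypothetical stand-ins for HystrixCommand and HystrixBadRequestException; the `circuitOpen` flag, the `failures` counter, and the worker pool are assumptions that replace the real breaker statistics and thread pools.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Caller errors that must not trigger the fallback (stand-in for HystrixBadRequestException). */
class BadRequestException extends RuntimeException {
    BadRequestException(String message) { super(message); }
}

/** Hypothetical mini-command mirroring steps 3 to 9; not the Hystrix implementation. */
abstract class MiniCommand<T> {
    // Daemon worker threads so that the sketch never keeps the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    private final Semaphore permits;
    private final long timeoutMs;
    volatile boolean circuitOpen = false;                 // a real breaker derives this from statistics
    final AtomicInteger failures = new AtomicInteger();   // step 7: outcomes reported for statistics

    MiniCommand(int maxConcurrent, long timeoutMs) {
        this.permits = new Semaphore(maxConcurrent);
        this.timeoutMs = timeoutMs;
    }

    protected abstract T run() throws Exception;          // step 5: the dependent call

    protected T getFallback() {                           // step 8a: no fallback implemented
        throw new UnsupportedOperationException("No fallback available");
    }

    public T execute() {
        if (circuitOpen) {
            return getFallback();                         // step 3: breaker open
        }
        if (!permits.tryAcquire()) {
            return getFallback();                         // step 4: resources exhausted
        }
        try {
            Callable<T> task = this::run;
            Future<T> future = WORKERS.submit(task);
            try {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);   // steps 6a and 9
            } catch (TimeoutException e) {
                future.cancel(true);
                failures.incrementAndGet();
                return getFallback();                     // step 5a: timeout
            } catch (ExecutionException e) {
                if (e.getCause() instanceof BadRequestException) {
                    throw (BadRequestException) e.getCause();   // bad request: no fallback
                }
                failures.incrementAndGet();
                return getFallback();                     // step 6b: the call failed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failures.incrementAndGet();
                return getFallback();
            }
        } finally {
            permits.release();
        }
    }
}
```

A subclass overrides run() with the dependent call and getFallback() with the degraded response. If getFallback() itself throws, the exception reaches the caller, which corresponds to steps 8a and 8c.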