First, an overview

  • Microservices call one another, so failures can cascade. An application in a complex distributed architecture has dozens of dependencies, each of which will inevitably fail at some point.
  • Suppose microservice A calls microservices B and C, which in turn call other microservices; this is called “fan-out”. If a microservice on the fan-out path responds too slowly or is unavailable, calls to microservice A tie up more and more system resources until the system crashes. This is the so-called “avalanche effect”.
  • Therefore, delays and errors need a dedicated handling mechanism, so that one long wait does not bring down the whole service.
  • Hystrix is an open-source library for handling latency and fault tolerance in distributed systems. In a distributed system, calls to many dependencies will inevitably fail, for example through timeouts or exceptions. Hystrix ensures that a problem in one dependency does not bring down the overall service, which avoids cascading failures and improves the resilience of the distributed system.
  • In simple terms, the Hystrix circuit breaker keeps a slow or failing service call from bringing down the whole service. For such calls, a fallback provides an alternative response, so the problem is handled before it takes down the entire server.
  • Hystrix website
  • Hystrix provides service degradation and service circuit breaking.
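The idea behind the points above, returning an alternative answer rather than waiting on a slow or failing dependency, can be sketched with the JDK alone. The following is a minimal illustration, not Hystrix code: FallbackSketch and callWithFallback are hypothetical names, and Hystrix itself adds thread pools, metrics, and circuit breaking on top of this basic pattern.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-JDK illustration of "degrade instead of waiting"; this is not Hystrix code.
final class FallbackSketch {

  // Runs the call on a separate thread and returns the fallback value if the
  // call throws an exception or does not finish within timeoutMillis.
  static <T> T callWithFallback(Callable<T> call, long timeoutMillis, T fallback) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<T> future = executor.submit(call);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException | ExecutionException e) {
      return fallback;                      // too slow or failed: degrade
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();   // preserve the interrupt flag
      return fallback;
    } finally {
      future.cancel(true);                  // stop the slow call if it is still running
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Fast call: the real result is returned.
    System.out.println(callWithFallback(() -> "real result", 500, "fallback"));
    // Slow call (hypothetical 3-second dependency): the fallback is returned after 500 ms.
    System.out.println(callWithFallback(() -> {
      Thread.sleep(3000);
      return "late result";
    }, 500, "fallback"));
    // Failing call: the fallback is returned instead of the exception.
    System.out.println(callWithFallback(() -> {
      throw new IllegalStateException("boom");
    }, 500, "fallback"));
  }
}
```

Creating an executor per call keeps the sketch short; a real service would share a bounded thread pool, which is one of the things Hystrix manages for you.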

Second, service degradation

  • Service degradation means that when a service call fails, the result of a fallbackMethod is returned instead.
  • Service degradation is triggered when the program throws an exception, a call times out, the circuit breaker is open, or the thread pool/semaphore is full.
  • Here is how to use it

1. Dependencies

  • The most important dependency is spring-cloud-starter-netflix-hystrix
<dependencies>
    <!-- hystrix -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    </dependency>
    <!-- eureka client -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
    </dependency>
    <!-- web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

2. Configuration file

  • Nothing special here; it is basically the standard configuration
server:
  port: 8001

spring:
  application:
    name: cloud-provider-hystrix-payment

eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
    service-url:
      defaultZone: http://127.0.0.1:7001/eureka,http://127.0.0.1:7002/eureka

3. Main startup class

  • Add the @EnableHystrix annotation to the main startup class to enable Hystrix
@SpringBootApplication
@EnableEurekaClient // This service will automatically register with eureka service after it is started
@EnableHystrix
public class PaymentHystrixMain8001 {
  public static void main(String[] args) {
    SpringApplication.run(PaymentHystrixMain8001.class, args);
  }
}

4. Business class

  • One method runs normally; the other simulates a 3-second delay
@Service
public class PaymentService {
  public String runsWell(Integer id) {
    return "All ok, thread name: " + Thread.currentThread().getName() + ", id: " + id;
  }

  public String runsTimeOut(Integer id) {
    try {
      Thread.sleep(3000);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can see that the thread was interrupted
      Thread.currentThread().interrupt();
    }
    return "Execution timed out, thread name: " + Thread.currentThread().getName() + ", id: " + id;
  }
}
  • Request handling. Annotate a method with @HystrixCommand to declare which fallback to use after degradation, along with parameters such as the timeout
@Slf4j
@RestController
@RequestMapping("/hystrix")
public class HystrixController {
  @Autowired
  PaymentService paymentService;

  @GetMapping("/ok/{id}")
  public String runsWell(@PathVariable("id") Integer id) {
    String result = paymentService.runsWell(id);
    log.info("result: " + result);
    return result;
  }
  
  // Set the timeout to 2s; if the call takes longer, runsTimeOutHandler is called
  @HystrixCommand(fallbackMethod = "runsTimeOutHandler", commandProperties = {
      @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "2000")
  })
  @GetMapping("/timeOut/{id}")
  public String runsTimeOut(@PathVariable("id") Integer id) {
    String result = paymentService.runsTimeOut(id);
    log.info("result: " + result);
    return result;
  }

  // Use runsTimeOutHandler as the fallback when this method throws an exception
  @HystrixCommand(fallbackMethod = "runsTimeOutHandler")
  @GetMapping("/exception/{id}")
  public String runsException(@PathVariable("id") Integer id) {
    Integer result = 10 / 0;
    log.info("result: " + result);
    return result.toString();
  }
  
  // A plain method. Note that its parameter list must match that of the method it is the fallback for
  public String runsTimeOutHandler(Integer id) {
    String result = "Triggered service degradation, current thread name:" + Thread.currentThread().getName();
    log.info(result);
    return result;
  }
}
  • At this point, simple service degradation is complete. One problem remains: if the methods have different parameter lists, you need to write a separate fallback method for each of them.
  • Therefore, you can configure a default fallback method, which is called when a method does not specify its own. First annotate the class with @DefaultProperties to declare the default fallback method, then annotate each method that needs a fallback with @HystrixCommand. Note that the default fallback method takes no parameters.
@Slf4j
@RestController
@RequestMapping("/hystrix")
@DefaultProperties(defaultFallback = "runsTimeOutHandler")
public class HystrixController {
  @Autowired
  PaymentService paymentService;

  @GetMapping("/ok/{id}")
  public String runsWell(@PathVariable("id") Integer id) {
    String result = paymentService.runsWell(id);
    log.info("result: " + result);
    return result;
  }

  @HystrixCommand(commandProperties = {
      @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
  })
  @GetMapping("/timeOut/{id}")
  public String runsTimeOut(@PathVariable("id") Integer id) {
    String result = paymentService.runsTimeOut(id);
    log.info("result: " + result);
    return result;
  }

  @HystrixCommand
  @GetMapping("/exception/{id}")
  public String runsException(@PathVariable("id") Integer id) {
    Integer result = 10 / 0;
    log.info("result: " + result);
    return result.toString();
  }

  public String runsTimeOutHandler() {
    String result = "Triggered service degradation, current thread name:" + Thread.currentThread().getName();
    log.info(result);
    return result;
  }
}

5. With Feign

  • Feign also has timeout control, so you can configure fallbacks for timeouts, exceptions, and outages
  • First, enable it in the configuration file
feign:
  hystrix:
    enabled: true  # Enable Hystrix in Feign
  • This is the Feign remote-call interface; its fallback class is declared in the @FeignClient annotation
@Service
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT", fallback = HystrixServiceFallback.class)
public interface HystrixService {
  @GetMapping("/hystrix/ok/{id}")
  String runsWell(@PathVariable("id") Integer id);

  @GetMapping("/hystrix/timeOut/{id}")
  String runsTimeOut(@PathVariable("id") Integer id);

  @GetMapping("/hystrix/exception/{id}")
  String runsException(@PathVariable("id") Integer id);
}
  • For the interface above, write the implementation class that it declares as its fallback
@Component
@Slf4j
public class HystrixServiceFallback implements HystrixService {
  @Override
  public String runsWell(Integer id) {
    String result = "Circuit breaker runsWell.";
    log.info(result);
    return result;
  }

  @Override
  public String runsTimeOut(Integer id) {
    String result = "Circuit breaker runsTimeOut";
    log.info(result);
    return result;
  }

  @Override
  public String runsException(Integer id) {
    String result = "Circuit breaker runsException";
    log.info(result);
    return result;
  }
}
  • At this point, the basics of service degradation are covered. In general, service degradation can be used on both the calling side and the called microservice. You can write a fallback for a single method, a default fallback for several methods, or a fallback class for a Feign remote-call interface.
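Conceptually, a fallback class is just a second implementation of the same interface that answers when the real call fails. The sketch below imitates that idea with a JDK dynamic proxy. It is an illustration under stated assumptions, not how Feign or Hystrix is implemented: PaymentClient, RemoteClient, ClientFallback, and withFallback are all hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Plain-JDK illustration of an interface-level fallback; not Feign or Hystrix code.
final class FallbackProxySketch {

  // Hypothetical stand-in for a remote-call interface.
  interface PaymentClient {
    String runsWell(Integer id);
    String runsException(Integer id);
  }

  // Hypothetical "remote" implementation: one method works, one always fails.
  static final class RemoteClient implements PaymentClient {
    public String runsWell(Integer id) { return "remote ok, id: " + id; }
    public String runsException(Integer id) { throw new IllegalStateException("remote failure"); }
  }

  // The fallback implements the same interface, one method per remote method.
  static final class ClientFallback implements PaymentClient {
    public String runsWell(Integer id) { return "fallback runsWell"; }
    public String runsException(Integer id) { return "fallback runsException"; }
  }

  // Returns a proxy that calls primary first and, if that call throws,
  // answers with the same method on fallback.
  @SuppressWarnings("unchecked")
  static <T> T withFallback(Class<T> type, T primary, T fallback) {
    return (T) Proxy.newProxyInstance(
        type.getClassLoader(),
        new Class<?>[] {type},
        (proxy, method, args) -> {
          try {
            return method.invoke(primary, args);
          } catch (InvocationTargetException e) {
            return method.invoke(fallback, args);
          }
        });
  }

  public static void main(String[] args) {
    PaymentClient client =
        withFallback(PaymentClient.class, new RemoteClient(), new ClientFallback());
    System.out.println(client.runsWell(1));       // answered by RemoteClient
    System.out.println(client.runsException(1));  // answered by ClientFallback
  }
}
```

The point of the design is that callers depend only on the interface, so they never need to know whether an answer came from the remote service or from the fallback.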

Third, service circuit breaker

  • A service circuit breaker trips when more than N service degradations occur within a unit of time and the proportion of degraded calls exceeds a threshold. New requests are then rejected without attempting the service call and go straight to the fallback. Afterwards, the breaker gradually tries to recover.
  • It works like a fuse: when several high-power appliances are switched on and the circuit cannot cope, the fuse blows and the circuit stops serving the load for a while.
  • The relevant papers
  • Circuit breaking: a link-protection mechanism for microservices against the avalanche effect. If a microservice on the fan-out path is unavailable or responds too slowly, the service is degraded, calls to that node are cut off, and an error response is returned quickly. Once calls to that node are detected to respond normally again, the call path is restored.
  • By default, Hystrix trips the circuit breaker when at least 20 requests are made within a 10-second window and at least 50% of them fail.
  • A circuit breaker generally has three states:
    • Closed: the breaker does not interfere, and requests call the service normally
    • Open: requests no longer call the current service. An internal timer is started, usually set to the MTTR (mean time to repair); once the breaker has been open for that long, it enters the half-open state
    • Half-open: some requests are allowed to call the current service according to the rules. If they succeed and satisfy the rules, the service is considered to have recovered and the breaker is closed
  • Three important circuit breaker parameters:
    • Snapshot time window: the circuit breaker collects request and error statistics to decide whether to open. The snapshot time window is the period over which they are collected; by default it is the most recent 10 seconds.
    • Request volume threshold: within the snapshot time window, the total number of requests must reach this threshold before the breaker is eligible to open. The default is 20, which means that if the Hystrix command is invoked fewer than 20 times within 10 seconds, the breaker will not open even if every request times out or fails for other reasons.
    • Error percentage threshold: once the request volume threshold is met within the snapshot time window, the breaker opens if the error percentage reaches this threshold. The default is 50%. For example, if 15 of 30 calls time out or throw exceptions, the error percentage is 50% and the breaker opens.
  // Circuit breaker enabled; at least 10 requests; 10-second sleep window; 60% error threshold
  @HystrixCommand(fallbackMethod = "divFallback", commandProperties = {
      @HystrixProperty(name = "circuitBreaker.enabled", value = "true"),
      @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"),
      @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"),
      @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60")
  })
  @GetMapping("/div/{i}")
  public String div(@PathVariable("i") Integer i) {
    int result = 10 / i;
    log.info("result: " + result);
    return Integer.toString(result);
  }
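To make the three states and the parameters concrete, here is a deliberately simplified circuit breaker in plain Java. It is a sketch under stated assumptions rather than Hystrix's implementation: SimpleCircuitBreaker and its fields are hypothetical names, the snapshot window is approximated by resetting the counters when the window expires (Hystrix uses rolling buckets), and the current time is passed in as a parameter so the behaviour is easy to follow.

```java
import java.util.function.Supplier;

// Simplified model of closed / open / half-open; not Hystrix's implementation.
final class SimpleCircuitBreaker {
  enum State { CLOSED, OPEN, HALF_OPEN }

  private final long windowMillis;            // snapshot time window
  private final int requestVolumeThreshold;   // minimum requests before the breaker may open
  private final int errorThresholdPercentage; // error rate at which the breaker opens
  private final long sleepWindowMillis;       // how long to stay open before a trial call

  private State state = State.CLOSED;
  private long windowStart;
  private long openedAt;
  private int total;
  private int errors;

  SimpleCircuitBreaker(long windowMillis, int requestVolumeThreshold,
                       int errorThresholdPercentage, long sleepWindowMillis) {
    this.windowMillis = windowMillis;
    this.requestVolumeThreshold = requestVolumeThreshold;
    this.errorThresholdPercentage = errorThresholdPercentage;
    this.sleepWindowMillis = sleepWindowMillis;
  }

  synchronized State state() { return state; }

  synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
    if (state == State.OPEN) {
      if (nowMillis - openedAt < sleepWindowMillis) {
        return fallback.get();                // open: do not call the service at all
      }
      state = State.HALF_OPEN;                // sleep window elapsed: allow a trial call
    }
    if (nowMillis - windowStart >= windowMillis) {
      resetCounters(nowMillis);               // start a new snapshot window
    }
    try {
      T result = action.get();
      total++;
      if (state == State.HALF_OPEN) {         // trial call succeeded: close the breaker
        state = State.CLOSED;
        resetCounters(nowMillis);
      }
      return result;
    } catch (RuntimeException e) {
      total++;
      errors++;
      boolean enoughTraffic = total >= requestVolumeThreshold;
      boolean tooManyErrors = errors * 100 >= errorThresholdPercentage * total;
      if (state == State.HALF_OPEN || (enoughTraffic && tooManyErrors)) {
        state = State.OPEN;                   // trip (or re-trip) the breaker
        openedAt = nowMillis;
      }
      return fallback.get();
    }
  }

  private void resetCounters(long nowMillis) {
    windowStart = nowMillis;
    total = 0;
    errors = 0;
  }

  public static void main(String[] args) {
    // Same values as the annotation above: 10 requests, 60% errors, 10 s sleep window.
    SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(10_000, 10, 60, 10_000);
    Supplier<String> failing = () -> { throw new ArithmeticException("/ by zero"); };
    for (int i = 0; i < 10; i++) {
      breaker.call(failing, () -> "fallback", 0L);
    }
    System.out.println(breaker.state());                                  // OPEN
    System.out.println(breaker.call(() -> "ok", () -> "fallback", 5_000L));  // fallback
    System.out.println(breaker.call(() -> "ok", () -> "fallback", 10_000L)); // ok
    System.out.println(breaker.state());                                  // CLOSED
  }
}
```

Note that while the breaker is open, even a call that would have succeeded is answered by the fallback; only after the sleep window does a trial request reach the service again.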

Fourth, the workflow

  • The official introduction

  1. Create a HystrixCommand object (used when the dependent service returns a single result) or a HystrixObservableCommand object (used when the dependent service returns multiple results).
  2. Execute the command. HystrixCommand provides the first two execution modes below, and HystrixObservableCommand provides the latter two:
    • execute(): synchronous execution. Returns a single result object from the dependent service, or throws an exception if an error occurs.
    • queue(): asynchronous execution. Immediately returns a Future object that holds the single result object once the service call completes.
    • observe(): returns an Observable that represents the results of the operation. It is a hot Observable: events are published once it is created, whether or not the event source has any subscribers, so each subscriber may join partway through and see only part of the operation.
    • toObservable(): also returns an Observable that represents the results of the operation, but a cold one: it publishes no events until there is a subscriber, so every subscriber is guaranteed to see the entire operation from the start.
  3. If request caching is enabled for the current command and the command cache hits, the cached result is immediately returned as an Observable.
  4. Check whether the circuit breaker is on. If the circuit breaker is open, Hystrix does not execute the command, but instead passes to the Fallback processing logic (step 8); If the circuit breaker is off, check if there are resources available to execute the command (step 5).
  5. Whether the thread pool/request queue/semaphore is full. If the command depends on the service’s proprietary thread pool and request queue, or if the semaphore (when thread pools are not used) is already full, Hystrix does not execute the command and instead passes to fallback processing logic (step 8).
  6. Hystrix decides how to call the dependent service based on the method we implement. HystrixCommand.run(): returns a single result, or throws an exception. HystrixObservableCommand.construct(): returns an Observable that emits multiple results, or sends an error notification via onError.
  7. Hystrix reports “success,” “failure,” “rejection,” “timeout” and other information to the circuit breaker, which maintains a set of counters to count these data. The circuit breaker uses these statistics to decide whether to turn the circuit breaker on to “fuse/short-circuit” a service-dependent request.
  8. When a command fails to execute, Hystrix attempts the fallback, which is what is usually called “service degradation”. The fallback is triggered in the following cases. Step 4: the circuit breaker is open, so the command is short-circuited. Step 5: the command's thread pool, request queue, or semaphore is full. Step 6: HystrixObservableCommand.construct() or HystrixCommand.run() throws an exception.
  9. When the command executes successfully, Hystrix returns the result either directly or as an Observable.
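Steps 3 to 9 above can be read as a chain of checks in front of the real call. The following plain-Java sketch walks through that chain. It is an illustration only: CommandFlowSketch and everything inside it are hypothetical names rather than Hystrix APIs, the circuit breaker is reduced to a boolean check, and the thread pool/semaphore is reduced to a single Semaphore.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;
import java.util.function.Function;

// Plain-JDK walk-through of steps 3 to 9; every name here is hypothetical.
final class CommandFlowSketch {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final Semaphore permits;            // step 5: limits concurrent executions
  private final BooleanSupplier circuitOpen;  // step 4: stand-in for the circuit breaker
  final AtomicInteger successes = new AtomicInteger();   // step 7: health counters
  final AtomicInteger failures = new AtomicInteger();
  final AtomicInteger rejections = new AtomicInteger();

  CommandFlowSketch(int maxConcurrent, BooleanSupplier circuitOpen) {
    this.permits = new Semaphore(maxConcurrent);
    this.circuitOpen = circuitOpen;
  }

  String execute(String key, Function<String, String> run, Function<String, String> fallback) {
    String cached = cache.get(key);
    if (cached != null) {
      return cached;                          // step 3: cache hit
    }
    if (circuitOpen.getAsBoolean()) {
      return fallback.apply(key);             // step 4: short-circuited
    }
    if (!permits.tryAcquire()) {
      rejections.incrementAndGet();           // step 7: report the rejection
      return fallback.apply(key);             // step 5: no capacity left
    }
    try {
      String result = run.apply(key);         // step 6: call the dependency
      successes.incrementAndGet();            // step 7: report success
      cache.put(key, result);
      return result;                          // step 9: normal result
    } catch (RuntimeException e) {
      failures.incrementAndGet();             // step 7: report failure
      return fallback.apply(key);             // step 8: fallback
    } finally {
      permits.release();
    }
  }

  public static void main(String[] args) {
    CommandFlowSketch flow = new CommandFlowSketch(10, () -> false);
    System.out.println(flow.execute("1", id -> "paid " + id, id -> "fallback"));
    System.out.println(flow.execute("2", id -> {
      throw new IllegalStateException("boom");
    }, id -> "fallback"));
  }
}
```

In Hystrix the counters from step 7 feed back into the circuit breaker's decision in step 4; here the two are kept separate to keep the sketch short.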

Fifth, service monitoring with Hystrix Dashboard

  • Hystrix status can be viewed using the Hystrix Dashboard tool.
  • First, register the following bean in a configuration class
  @Bean
  public ServletRegistrationBean getServlet() {
    HystrixMetricsStreamServlet streamServlet = new HystrixMetricsStreamServlet();
    ServletRegistrationBean registrationBean = new ServletRegistrationBean(streamServlet);
    registrationBean.setLoadOnStartup(1);
    registrationBean.addUrlMappings("/hystrix.stream");
    registrationBean.setName("HystrixMetricsStreamServlet");
    return registrationBean;
  }
  • Open the monitoring page at localhost:9001/hystrix and set the parameters.
    • Delay: controls the interval at which monitoring information is polled from the server. The default value is 2000 milliseconds. Adjusting it can reduce the client's network and CPU consumption.
    • Title: the text shown after the “Hystrix Stream” heading. By default, the URL of the monitored instance is used; set this parameter to display a more meaningful title.