SpringCloud Ribbon/Feign/Hystrix timeout and retry issues summary

Hi, I’m Empty Night. It’s been another week!

Today I’ll talk about how to configure timeouts for Ribbon and Feign.

In Spring Cloud, service calls are made through Feign or Ribbon. Ribbon also provides load balancing and a retry mechanism, and Feign is built on top of Ribbon.

Hystrix is often introduced to keep services highly available and to prevent problems such as cascading failures (avalanches). The Hystrix circuit breaker is also tied to timeouts.

Working out how Ribbon, Feign, and Hystrix relate to one another, and choosing configurations that let each component do its own job while cooperating with the others, is a difficult problem.

It makes my head bald to think about it.

Today I want to clarify the timeout relationships among Ribbon, Feign, and Hystrix.

First, the conclusion:

Feign integrates Ribbon and Hystrix. Feign itself imposes no timeout limits; timeouts are controlled by Ribbon and Hystrix.

Therefore, we only need to clarify the timeout relationship between Ribbon and Hystrix.

The tests below use Ribbon as the example, covering timeouts both on its own and with Hystrix integrated.

1. The default configuration of the ribbon

The default configuration of the ribbon is in the DefaultClientConfigImpl class.

    public static final int DEFAULT_READ_TIMEOUT = 5000;

    public static final int DEFAULT_CONNECTION_MANAGER_TIMEOUT = 2000;

    public static final int DEFAULT_CONNECT_TIMEOUT = 2000;

Note the first pitfall: although DEFAULT_READ_TIMEOUT is set to 5000 ms in the DefaultClientConfigImpl class, debugging shows that this default is overwritten when Ribbon builds its client configuration.

The details are as follows:

The first time an interface is called through Ribbon, an IClientConfig object is built. This happens in the RibbonClientConfiguration class, which resets ConnectTimeout, ReadTimeout, and GZipPayload:

public class RibbonClientConfiguration {

    /** * Ribbon client default connect timeout. */
    public static final int DEFAULT_CONNECT_TIMEOUT = 1000;

    /** * Ribbon client default read timeout. */
    public static final int DEFAULT_READ_TIMEOUT = 1000;

    /** * Ribbon client default Gzip Payload flag. */
    public static final boolean DEFAULT_GZIP_PAYLOAD = true;

    @RibbonClientName
    private String name = "client";

    @Autowired
    private PropertiesFactory propertiesFactory;

    @Bean
    @ConditionalOnMissingBean
    public IClientConfig ribbonClientConfig() {
        DefaultClientConfigImpl config = new DefaultClientConfigImpl();
        config.loadProperties(this.name);
        config.set(CommonClientConfigKey.ConnectTimeout, DEFAULT_CONNECT_TIMEOUT);
        config.set(CommonClientConfigKey.ReadTimeout, DEFAULT_READ_TIMEOUT);
        config.set(CommonClientConfigKey.GZipPayload, DEFAULT_GZIP_PAYLOAD);
        return config;
    }
    
    // ...
}

In summary, Ribbon’s effective default ConnectTimeout and ReadTimeout under Spring Cloud are both 1000 ms.

Let’s look at custom configurations.

2. Customize the ribbon configuration

Let’s customize the ribbon.xxx configuration to see if it works:

ribbon:
  OkToRetryOnAllOperations: true # Retry on all operations. Default: false
  ReadTimeout: 1000              # Load-balancing (read) timeout in ms. Default: 5000
  ConnectTimeout: 3000           # Connection timeout in ms. Default: 2000
  MaxAutoRetries: 1              # Number of retries on the current instance. Default: 0
  MaxAutoRetriesNextServer: 0    # Number of retries after switching instances. Default: 1

Testing showed that it didn’t take effect. What’s going on? Plenty of articles online write it exactly this way.

The reason is simple: the custom Ribbon timeout settings only take effect once ribbon.http.client.enabled=true is added.

ribbon:
  http:
    client:
      enabled: true

Let’s test the timeout and retry mechanism:

(I’ve included screenshots of the tests for you. You’re welcome!)

I’m using a Producer service to provide an interface that looks something like this:

    @GetMapping(value = "hello/{name}")
    public String hello(@PathVariable("name") String name, Integer mills) {
        logger.info("Start executing request, name: " + name + ", requested pause: " + mills + " ms");
        // Simulate a slow producer by sleeping for the requested number of milliseconds
        if (mills != null && mills > 0) {
            try {
                Thread.sleep(mills);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        return "hello, [" + name + "], this is service producer by nacos.....";
    }

Note the mills parameter, which specifies how long the Producer interface waits before responding. It lets us test the timeout and retry behavior of the Consumer service, which calls the Producer interface through Ribbon.

A consumer might look something like this:

    @GetMapping(value = "test")
    //@HystrixCommand(fallbackMethod = "testHystrix")
    public String test(String name, Integer mills) {
        logger.info("Start requesting producer, whose pause time is:" + mills);
        String producerRes = restTemplate.getForObject(
                "http://" + service_producer_name + "/producer/hello/" + name + "?mills=" + mills, String.class);
        logger.info("Request obtained successfully, start printing request result:");
        String res = "Test the Consumer /test interface based on the Ribbon using the Server-producer Hello interface. " + producerRes;
        System.out.println(res);
        return res;
    }

With the test code ready, let’s start testing:

First, retries on the current instance:

With ReadTimeout set to 5 s for this test and MaxAutoRetries set to 1, the request is attempted 1 + MaxAutoRetries = 2 times, so it fails after exactly 10 s.

Let’s look at the producer side:

Next, set both MaxAutoRetries and MaxAutoRetriesNextServer to 1:

On the producer side, a request arrives every 5 s, four requests in total:

Why four times?

Why does setting MaxAutoRetriesNextServer to 1 add two more requests?

MaxAutoRetriesNextServer translates literally as: the maximum number of retries on the next server.

Read literally, that sounds like the number of retries against the next server, so shouldn’t it add just one? And what is the “next server” here?

Don’t panic, it’s no big deal. Let’s change the configuration: set MaxAutoRetries to 2 and MaxAutoRetriesNextServer to 3.

Here we start two producers with the same name, so there are two instances of the service.

Test it out:

(I’ve marked up the analysis in the picture for you. Thoughtful, isn’t it?)

The other producer:

Judging by the timestamps in the logs, MaxAutoRetriesNextServer really means: the maximum number of times to switch service instances when a request fails. (This holds no matter how many instances there are; even with a single instance, Ribbon “switches” back to that same instance MaxAutoRetriesNextServer times.)

So we can conclude that Ribbon executes a request at most 1 + MaxAutoRetries + (MaxAutoRetries + 1) * MaxAutoRetriesNextServer times.

That is, (1 + MaxAutoRetries) * (1 + MaxAutoRetriesNextServer) times.
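To make the counting concrete, here is a minimal sketch that walks through the retry order described above and checks it against the closed-form formula. It is only a model of the observed behavior, not Ribbon’s actual code; the class and method names (RibbonAttemptCounter, countAttempts, formula) are made up for illustration, and it assumes every single attempt times out.

```java
// Sketch only: models the retry order observed in the logs; not Ribbon's implementation.
final class RibbonAttemptCounter {

    /** Counts attempts in the worst case, where every call times out. */
    static int countAttempts(int maxAutoRetries, int maxAutoRetriesNextServer) {
        int attempts = 0;
        // One initial instance selection plus MaxAutoRetriesNextServer switches.
        for (int server = 0; server <= maxAutoRetriesNextServer; server++) {
            // On each selected instance: one first try plus MaxAutoRetries retries.
            for (int attempt = 0; attempt <= maxAutoRetries; attempt++) {
                attempts++;
            }
        }
        return attempts;
    }

    /** Closed form from the article. */
    static int formula(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (1 + maxAutoRetries) * (1 + maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        // The three configurations tested in this article.
        int[][] configs = {{1, 0}, {1, 1}, {2, 3}};
        for (int[] c : configs) {
            System.out.println("MaxAutoRetries=" + c[0]
                    + ", MaxAutoRetriesNextServer=" + c[1]
                    + " -> attempts=" + countAttempts(c[0], c[1])
                    + ", formula=" + formula(c[0], c[1]));
        }
    }
}
```

For the three configurations tested above, this gives 2, 4, and 12 attempts, matching the two and four requests seen in the first two tests.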

That thoroughly covers Ribbon, the smooth-talking heartbreaker. Next up is Ribbon + Hystrix: the heartbreaker plus the nice guy.

3. Timeout and retry configurations after the ribbon integrates hystrix

Why call Hystrix the nice guy? There’s a reason, of course.

Hystrix provides service degradation, rate limiting, and circuit breaking. It effectively protects the stability of a microservice platform and prevents cascading failures (avalanches). So Hystrix really is the nice guy.

Ribbon integration with Hystrix is simple:

Add the @EnableHystrix annotation to the startup class.

Add @HystrixCommand to the interface method and configure a fallback:

    @GetMapping(value = "test")
    @HystrixCommand(fallbackMethod = "testHystrix")
    public String test(String name, Integer mills) {
        logger.info("Start requesting producer, whose pause time is:" + mills);
        String producerRes = restTemplate.getForObject(
                "http://" + service_producer_name + "/producer/hello/" + name + "?mills=" + mills, String.class);
        logger.info("Request obtained successfully, start printing request result:");
        String res = "Test the Consumer /test interface based on the Ribbon using the Server-producer Hello interface. " + producerRes;
        System.out.println(res);
        return res;
    }

    /**
     * Test circuit breaker
     * @param name
     * @return
     */
    private String testHystrix(String name, Integer mills) {
        return "sorry, " + name + ", this service is unavailable temporarily. We are returning the defaultValue by hystrix.";
    }

Test it out:

When the producer pauses for 500 ms, the request returns normally.

When the producer pauses for 1000 ms, the request is handled by the fallback method specified in @HystrixCommand:

Note: if no fallback is configured, the Hystrix timeout does not take effect and timeouts are controlled by Ribbon.

Hystrix’s default timeout is 1s, configured in the HystrixCommandProperties class:

private static final Integer default_executionTimeoutInMilliseconds = 1000; // default => executionTimeoutInMilliseconds: 1000 = 1 second

protected HystrixCommandProperties(HystrixCommandKey key, HystrixCommandProperties.Setter builder, String propertyPrefix) {

    // ...
    this.executionTimeoutEnabled = getProperty(propertyPrefix, key, "execution.timeout.enabled", builder.getExecutionTimeoutEnabled(), default_executionTimeoutEnabled);
    // ...
}

Let’s continue testing the timeout relationship between Ribbon and Hystrix.

With the Hystrix fallback configured, modify the configuration file and set the Hystrix timeout to be longer than the Ribbon timeout:

ribbon:
  OkToRetryOnAllOperations: true # Retry on all operations. Default: false
  ReadTimeout: 2000              # Load-balancing (read) timeout in ms. Default: 5000
  ConnectTimeout: 3000           # Ribbon connection timeout in ms. Default: 2000
  MaxAutoRetries: 0              # Number of retries on the current instance. Default: 0
  MaxAutoRetriesNextServer: 0    # Number of retries after switching instances. Default: 1
  # Without ribbon.http.client.enabled=true, the custom Ribbon settings above do not take effect
  http:
    client:
      enabled: true


hystrix:
  command:
    default: # "default" applies globally; a service id applies to that application only
      execution:
        timeout:
          # If false, request timeouts are left to Ribbon; if true, the timeout is used as the basis for circuit breaking
          enabled: true
        isolation:
          thread:
            timeoutInMilliseconds: 10000 # Circuit-breaker timeout. Default: 1000 ms

At this point, send a request with Postman:

As you can see, the request returns after about 2 s, which is exactly ribbon.ReadTimeout. Ribbon timed out first, and the request then went through the Hystrix circuit-breaker path (the fallback).
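The roughly 2 s result can be reproduced with a small timing model. This is only a sketch under simplifying assumptions: the producer never answers within ReadTimeout, connection time is ignored, a fallback is configured, and the Hystrix timeout is enabled. The names FallbackTiming, ribbonWorstCaseMs, and msUntilFallback are invented for illustration and are not part of Ribbon or Hystrix.

```java
// Sketch only: a simplified timing model, not Ribbon's or Hystrix's actual code.
final class FallbackTiming {

    /** Worst-case time Ribbon spends before giving up, assuming every attempt hits ReadTimeout. */
    static long ribbonWorstCaseMs(long readTimeoutMs, int maxAutoRetries, int maxAutoRetriesNextServer) {
        return readTimeoutMs * (1 + maxAutoRetries) * (1 + maxAutoRetriesNextServer);
    }

    /** Whichever limit is reached first decides when the fallback runs. */
    static long msUntilFallback(long ribbonWorstCaseMs, long hystrixTimeoutMs) {
        return Math.min(ribbonWorstCaseMs, hystrixTimeoutMs);
    }

    public static void main(String[] args) {
        // Values from the test configuration: ReadTimeout=2000, no retries, Hystrix timeout=10000.
        long ribbon = ribbonWorstCaseMs(2000, 0, 0);
        System.out.println("Ribbon gives up after " + ribbon + " ms");
        System.out.println("Fallback runs after about " + msUntilFallback(ribbon, 10000) + " ms");
    }
}
```

With the values from this test, Ribbon gives up at 2000 ms, well before the 10000 ms Hystrix timeout, so the fallback is reached at about 2 s.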

4. Conclusion

To sum up:

If a request takes longer than the Ribbon timeout, a retry is triggered.
With a fallback configured, exceeding either the Ribbon timeout or the Hystrix timeout trips the circuit breaker and invokes the fallback.

In general, the total Ribbon timeout should be less than the Hystrix timeout, because Ribbon has a retry mechanism. The Ribbon timeout here includes retries: ideally, Ribbon should be able to run through all of its retries and reach its own final timeout before Hystrix cuts the request off.

Because the connection time is generally short, ConnectTimeout can be ignored. The timeouts should then satisfy:

(1 + MaxAutoRetries) * (1 + MaxAutoRetriesNextServer) * ReadTimeout < Hystrix timeoutInMilliseconds
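As a quick way to sanity-check a configuration against this rule, the inequality can be written as a small helper. This is a sketch only: the class and method names (TimeoutRuleCheck, hystrixAllowsAllRetries) are hypothetical, and ConnectTimeout is ignored, as in the rule above.

```java
// Sketch only: checks the rule of thumb from this article; ConnectTimeout is ignored.
final class TimeoutRuleCheck {

    /**
     * Returns true when the Hystrix timeout is long enough for Ribbon to finish
     * all of its attempts, each of which may take up to ReadTimeout.
     */
    static boolean hystrixAllowsAllRetries(long readTimeoutMs, int maxAutoRetries,
                                           int maxAutoRetriesNextServer, long hystrixTimeoutMs) {
        long ribbonTotalMs = (1L + maxAutoRetries) * (1L + maxAutoRetriesNextServer) * readTimeoutMs;
        return ribbonTotalMs < hystrixTimeoutMs;
    }

    public static void main(String[] args) {
        // ReadTimeout=2000, no retries, Hystrix timeout=10000: 2000 < 10000.
        System.out.println(hystrixAllowsAllRetries(2000, 0, 0, 10000));
        // ReadTimeout=5000, one retry on each level, Hystrix timeout=10000: 20000 is not < 10000.
        System.out.println(hystrixAllowsAllRetries(5000, 1, 1, 10000));
    }
}
```

In the second case, Hystrix would cut the request off at 10 s, before Ribbon could finish its four attempts, so either the Hystrix timeout should be raised above 20 s or the retry settings reduced.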

That’s all for today. Remember to give it a like! My official account is JavaApes.
