The article “OpenFeign and Ribbon Source Code Analysis Summary” only briefly covered how the Ribbon retry mechanism is implemented. This article analyzes the retry mechanism in detail and looks to the source code for the answer we want: how to configure Ribbon so that calls to each service's interfaces use a different retry policy, for example by configuring the number of retries on failure or by customizing the RetryHandler retry policy.

  • Source analysis of the Ribbon retry mechanism
  • Configuring the Ribbon retry policy
  • How to replace RetryHandler?

The key classes involved in this source code analysis are described below.

  • LoadBalancerFeignClient: the Client used when OpenFeign integrates with Ribbon (OpenFeign uses a Client to send requests);
  • FeignLoadBalancer: the bridge between OpenFeign and Ribbon, created by LoadBalancerFeignClient;
  • LoadBalancerCommand: Ribbon's implementation that converts a request into RxJava API calls, invoked by FeignLoadBalancer;
  • CachingSpringLoadBalancerFactory: the factory, with caching capability, that creates the FeignLoadBalancer bridge when OpenFeign integrates with Ribbon;
  • RibbonLoadBalancerClient: the class with which Ribbon implements Spring Cloud's load balancing interface (LoadBalancerClient);
  • RibbonAutoConfiguration: Ribbon's auto-configuration class, which registers RibbonLoadBalancerClient with the Spring container;
  • SpringClientFactory: Ribbon manages its own group of ApplicationContexts and creates one ApplicationContext for each Client;
  • RibbonClientConfiguration: Ribbon gives each Client its own ApplicationContext to achieve environment isolation; this is the configuration class of the ApplicationContext that Ribbon creates for each Client, used to register Ribbon's functional components such as the load balancer ILoadBalancer;
  • RequestSpecificRetryHandler: an implementation class of the RetryHandler interface, and the default retry decision handler used when OpenFeign integrates with Ribbon.

Source analysis of the Ribbon retry mechanism

Ribbon's retry mechanism is built on the RxJava API, while the number of retries and the decision whether to retry are made by RetryHandler. Ribbon provides two implementation classes for RetryHandler (DefaultLoadBalancerRetryHandler and RequestSpecificRetryHandler), as shown in the figure below.

Now we need to find out which RetryHandler Ribbon uses. We only analyze the case where OpenFeign is integrated with Ribbon; the use of the @LoadBalanced annotation in Spring Cloud is not analyzed.

The auto-configuration class imported by the spring.factories file of spring-cloud-netflix-ribbon is RibbonAutoConfiguration. This configuration class injects a RibbonLoadBalancerClient into the Spring container. RibbonLoadBalancerClient is Ribbon's implementation class for Spring Cloud's load balancing interface.

When the RibbonLoadBalancerClient is created, a SpringClientFactory is passed to its constructor. The source code is shown below.

@Configuration
public class RibbonAutoConfiguration {
    // Create the RibbonLoadBalancerClient
    @Bean
    @ConditionalOnMissingBean(LoadBalancerClient.class)
    public LoadBalancerClient loadBalancerClient() {
        return new RibbonLoadBalancerClient(springClientFactory());
    }
}

SpringClientFactory is how Ribbon makes use of ApplicationContext: Ribbon creates an AnnotationConfigApplicationContext for each Client, to serve as an isolated environment.

When SpringClientFactory calls its superclass constructor, it passes in a configuration class, RibbonClientConfiguration. The source code is as follows.

public class SpringClientFactory extends NamedContextFactory<RibbonClientSpecification> {

    public SpringClientFactory() {
        super(RibbonClientConfiguration.class, NAMESPACE, "ribbon.client.name");
    }
}

The RibbonClientConfiguration configuration class takes effect when the AnnotationConfigApplicationContext of each Client is initialized, and that AnnotationConfigApplicationContext is created the first time the service's interface is called. When the ApplicationContext is created, its register method is called to register RibbonClientConfiguration and some other configuration classes, and finally the refresh method is called to initialize the ApplicationContext.

RibbonClientConfiguration is responsible for injecting, into the ApplicationContext of each Client, the service list ServerList<Server>, the service list updater ServerListUpdater, the load balancer ILoadBalancer, the load balancing algorithm IRule, the client configuration IClientConfig, the retry decision handler RetryHandler, and so on.

  • Service list ServerList<Server>: obtains the available service provider nodes from the registry;
  • Service list updater ServerListUpdater: periodically refreshes the locally cached service list (ServerList) from the registry;
  • Load balancing algorithm IRule: implements the various load balancing algorithms, such as random and round-robin;
  • Load balancer ILoadBalancer: invokes the load balancing algorithm IRule to select a service provider node to call;
  • Retry decision handler RetryHandler: decides whether a given failure should be retried.

Because the Beans registered by RibbonClientConfiguration live in an ApplicationContext that is isolated per Client, calls to each service provider's interfaces can use a different client configuration (IClientConfig), retry decision handler (RetryHandler), and so on. This is a prerequisite, though not a necessary and sufficient condition, for Ribbon to support a different retry policy for the interfaces of each service.
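The per-Client isolation described above can be pictured with a small plain-Java sketch. It is only an illustration under stated assumptions: SketchClientFactory and SketchClientContext are hypothetical stand-ins for SpringClientFactory and the per-Client ApplicationContext, and a simple map of named values stands in for the registered Beans.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a per-Client ApplicationContext: a bag of named values.
class SketchClientContext {
    final String clientName;
    final Map<String, Object> beans = new ConcurrentHashMap<>();

    SketchClientContext(String clientName) {
        this.clientName = clientName;
    }
}

// Hypothetical stand-in for SpringClientFactory: one context per Client, created on first use.
class SketchClientFactory {
    private final Map<String, SketchClientContext> contexts = new ConcurrentHashMap<>();
    private final Function<String, Integer> maxRetriesFor;

    SketchClientFactory(Function<String, Integer> maxRetriesFor) {
        this.maxRetriesFor = maxRetriesFor;
    }

    SketchClientContext getContext(String clientName) {
        // The context is built lazily, the first time this Client is requested.
        return contexts.computeIfAbsent(clientName, name -> {
            SketchClientContext ctx = new SketchClientContext(name);
            // Each Client gets its own value, so settings never leak between Clients.
            ctx.beans.put("maxRetries", maxRetriesFor.apply(name));
            return ctx;
        });
    }

    int contextCount() {
        return contexts.size();
    }
}
```

In this sketch no context exists until getContext is first called for a given name, and two Clients can hold different values under the same key.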

The RibbonClientConfiguration configuration class registers a RetryHandler retry decision handler, but this RetryHandler is not used here; it may be used elsewhere.

@Configuration
public class RibbonClientConfiguration {
    // Not used
    @Bean
    @ConditionalOnMissingBean
    public RetryHandler retryHandler(IClientConfig config) {
        return new DefaultLoadBalancerRetryHandler(config);
    }
}

When OpenFeign is integrated with Ribbon, the RetryHandler actually used is RequestSpecificRetryHandler. The Ribbon-related source code on the OpenFeign side is the FeignLoadBalancer class.

When OpenFeign integrates Ribbon, the Client that OpenFeign uses is LoadBalancerFeignClient. LoadBalancerFeignClient creates a FeignLoadBalancer and calls the FeignLoadBalancer's executeWithLoadBalancer method to make the load-balanced call.

The executeWithLoadBalancer method of FeignLoadBalancer is actually provided by its superclass AbstractLoadBalancerAwareClient; its source code is as follows (abridged).

public abstract class AbstractLoadBalancerAwareClient {
    public T executeWithLoadBalancer(final S request, final IClientConfig requestConfig) throws ClientException {
        LoadBalancerCommand<T> command = buildLoadBalancerCommand(request, requestConfig);
        try {
            return command.submit({....})
                    .toBlocking()
                    .single();
        } catch (Exception e) {
        }
    }
}

The executeWithLoadBalancer method creates a LoadBalancerCommand, and then calls the LoadBalancerCommand submit method to submit the request.

public Observable<T> submit(final ServerOperation<T> operation) {
    // .......
    // Get the retry counts
    final int maxRetrysSame = retryHandler.getMaxRetriesOnSameServer();
    final int maxRetrysNext = retryHandler.getMaxRetriesOnNextServer();
    // Use the load balancer
    Observable<T> o = (server == null ? selectServer() : Observable.just(server))
            .concatMap(new Func1<Server, Observable<T>>() {
                @Override
                public Observable<T> call(Server server) {
                    //.......
                    // Retries on the same node
                    if (maxRetrysSame > 0)
                        o = o.retry(retryPolicy(maxRetrysSame, true));
                    return o;
                }
            });
    // Retries on different nodes
    if (maxRetrysNext > 0 && server == null)
        o = o.retry(retryPolicy(maxRetrysNext, false));
    return o.onErrorResumeNext(...);
}

The submit method calls the getMaxRetriesOnSameServer and getMaxRetriesOnNextServer methods of retryHandler to obtain maxRetrysSame and maxRetrysNext respectively. maxRetrysSame is the number of retries against the same node, with a default value of 0. maxRetrysNext is the number of retries against different nodes, with a default value of 1.

The retryPolicy method returns an RxJava Func2 object that wraps the RetryHandler retry decision handler. RetryHandler ultimately decides whether a retry is needed, for example whether the exception thrown is one that may be retried. This decision is made inside the Func2 returned by retryPolicy. The source code of the retryPolicy method is shown below.
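How the two counters combine can be sketched in plain Java without RxJava. This is an illustration only: it assumes every failure is retriable and uses a hypothetical round-robin pick in place of the load balancer, and SketchRetryLoop and its parameters are not Ribbon names. Under those assumptions a call that always fails is attempted (maxRetriesSame + 1) × (maxRetriesNext + 1) times.

```java
import java.util.List;
import java.util.function.Predicate;

class SketchRetryLoop {
    // Returns the total number of attempts made before success or giving up.
    // "call" returns true when the request to the given server succeeds.
    static int run(List<String> servers, int maxRetriesSame, int maxRetriesNext,
                   Predicate<String> call) {
        int attempts = 0;
        // Outer loop: the first server plus up to maxRetriesNext further picks.
        for (int next = 0; next <= maxRetriesNext; next++) {
            // Round-robin pick: a hypothetical stand-in for the load balancer.
            String server = servers.get(next % servers.size());
            // Inner loop: the first try plus up to maxRetriesSame retries on this server.
            for (int same = 0; same <= maxRetriesSame; same++) {
                attempts++;
                if (call.test(server)) {
                    return attempts;
                }
            }
        }
        return attempts;
    }
}
```

With the defaults quoted above (0 retries on the same node, 1 on the next), an always-failing call in this sketch is attempted twice.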

private Func2<Integer, Throwable, Boolean> retryPolicy(final int maxRetrys, final boolean same) {
    return new Func2<Integer, Throwable, Boolean>() {
        @Override
        public Boolean call(Integer tryCount, Throwable e) {
            if (e instanceof AbortExecutionException) {
                return false;
            }
            // Greater than the maximum number of retries
            if (tryCount > maxRetrys) {
                return false;
            }
            if (e.getCause() != null && e instanceof RuntimeException) {
                e = e.getCause();
            }
            // Call RetryHandler to determine whether to retry
            return retryHandler.isRetriableException(e, same);
        }
    };
}
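The order of checks in retryPolicy can be reproduced with the JDK alone. The sketch below is a rough equivalent, not Ribbon code: BiPredicate stands in for Func2, SketchAbortException for AbortExecutionException, and SketchRetryDecider for the single isRetriableException method of RetryHandler; all three names are hypothetical.

```java
import java.util.function.BiPredicate;

class SketchRetryPolicy {
    // Hypothetical marker meaning "never retry".
    static class SketchAbortException extends RuntimeException {
        SketchAbortException(String message) {
            super(message);
        }
    }

    // Hypothetical one-method view of the retry decision handler.
    interface SketchRetryDecider {
        boolean isRetriable(Throwable e, boolean sameServer);
    }

    static BiPredicate<Integer, Throwable> retryPolicy(int maxRetries, boolean same,
                                                       SketchRetryDecider decider) {
        return (tryCount, e) -> {
            // 1. An abort exception always stops retrying.
            if (e instanceof SketchAbortException) {
                return false;
            }
            // 2. Stop once the retry budget is used up.
            if (tryCount > maxRetries) {
                return false;
            }
            // 3. Unwrap a RuntimeException so the decider sees the underlying cause.
            Throwable target = e;
            if (e.getCause() != null && e instanceof RuntimeException) {
                target = e.getCause();
            }
            // 4. Let the decider have the final say.
            return decider.isRetriable(target, same);
        };
    }
}
```

Note that the budget check comes before the decider is consulted, so a decider that always answers yes still cannot exceed maxRetries.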

So where does this retryHandler come from?

It is passed in when the LoadBalancerCommand object is constructed: the executeWithLoadBalancer method of FeignLoadBalancer calls the buildLoadBalancerCommand method, whose source code is shown below.

protected LoadBalancerCommand<T> buildLoadBalancerCommand(final S request, final IClientConfig config) {
    // Get the RetryHandler
    RequestSpecificRetryHandler handler = getRequestSpecificRetryHandler(request, config);
    // Construct the LoadBalancerCommand using the Builder pattern
    LoadBalancerCommand.Builder<T> builder = LoadBalancerCommand.<T>builder()
            .withLoadBalancerContext(this)
            // Pass in the RetryHandler
            .withRetryHandler(handler)
            .withLoadBalancerURI(request.getUri());
    return builder.build();
}

As the source code shows, the RetryHandler that Ribbon uses here is RequestSpecificRetryHandler. The Builder pattern is also used here.

The source code of FeignLoadBalancer's getRequestSpecificRetryHandler method is as follows:

@Override
public RequestSpecificRetryHandler getRequestSpecificRetryHandler(
        RibbonRequest request, IClientConfig requestConfig) {
    //.....
    if (!request.toRequest().httpMethod().name().equals("GET")) {
        // Call this.getRetryHandler() to get the RetryHandler
        return new RequestSpecificRetryHandler(true, false, this.getRetryHandler(),
                requestConfig);
    }
    else {
        // Call this.getRetryHandler() to get the RetryHandler
        return new RequestSpecificRetryHandler(true, true, this.getRetryHandler(), requestConfig);
    }
}
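The two branches shown above differ only in the second constructor flag. A minimal sketch of that choice follows, assuming the flag names okToRetryOnConnectErrors and okToRetryOnAllErrors from the RequestSpecificRetryHandler constructor; SketchRetryFlags is a hypothetical holder, and the elided part of the method is not reproduced. A plausible reading is that non-GET requests may not be idempotent, so they are only retried when the connection itself failed.

```java
class SketchRetryFlags {
    final boolean okToRetryOnConnectErrors;
    final boolean okToRetryOnAllErrors;

    SketchRetryFlags(boolean okToRetryOnConnectErrors, boolean okToRetryOnAllErrors) {
        this.okToRetryOnConnectErrors = okToRetryOnConnectErrors;
        this.okToRetryOnAllErrors = okToRetryOnAllErrors;
    }

    // Mirrors only the two branches shown in the listing above.
    static SketchRetryFlags forHttpMethod(String httpMethod) {
        if (!httpMethod.equals("GET")) {
            // Non-GET: retry on connection errors only.
            return new SketchRetryFlags(true, false);
        }
        // GET: retry on all errors.
        return new SketchRetryFlags(true, true);
    }
}
```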

The RequestSpecificRetryHandler constructor accepts another RetryHandler, somewhat like the parent delegation model of class loaders. For example, when RequestSpecificRetryHandler has no retry count of its own configured, it takes the retry count from the parent RetryHandler.

Which RetryHandler does this.getRetryHandler() return? (The source is in LoadBalancerContext, the grandparent class of FeignLoadBalancer.)
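The delegation to a parent handler can be sketched as follows. This is a simplified illustration rather than the Ribbon source: the Sketch* types are hypothetical, a negative value is assumed to mean "not configured", and the parent's values of 0 and 1 are the defaults quoted earlier in this article.

```java
// Hypothetical two-method view of a retry handler.
interface SketchRetryCounts {
    int getMaxRetriesOnSameServer();
    int getMaxRetriesOnNextServer();
}

// Parent handler with the default values quoted in the article.
class SketchDefaultCounts implements SketchRetryCounts {
    @Override
    public int getMaxRetriesOnSameServer() {
        return 0;
    }

    @Override
    public int getMaxRetriesOnNextServer() {
        return 1;
    }
}

// Child handler: uses its own value when one is set, otherwise asks the parent.
class SketchRequestSpecificCounts implements SketchRetryCounts {
    private final SketchRetryCounts fallback;
    private final int retrySameServer; // negative means "not configured" (assumption)
    private final int retryNextServer; // negative means "not configured" (assumption)

    SketchRequestSpecificCounts(SketchRetryCounts fallback, int retrySameServer, int retryNextServer) {
        this.fallback = fallback;
        this.retrySameServer = retrySameServer;
        this.retryNextServer = retryNextServer;
    }

    @Override
    public int getMaxRetriesOnSameServer() {
        return retrySameServer >= 0 ? retrySameServer : fallback.getMaxRetriesOnSameServer();
    }

    @Override
    public int getMaxRetriesOnNextServer() {
        return retryNextServer >= 0 ? retryNextServer : fallback.getMaxRetriesOnNextServer();
    }
}
```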

[LoadBalancerContext]
public class LoadBalancerContext {
    protected RetryHandler defaultRetryHandler = new DefaultLoadBalancerRetryHandler();

    public final RetryHandler getRetryHandler() {
        return defaultRetryHandler;
    }
}

[FeignLoadBalancer]
public class FeignLoadBalancer extends AbstractLoadBalancerAwareClient {
    public FeignLoadBalancer(ILoadBalancer lb, IClientConfig clientConfig, ServerIntrospector serverIntrospector) {
        super(lb, clientConfig);
        // Use DefaultLoadBalancerRetryHandler
        this.setRetryHandler(RetryHandler.DEFAULT);
        this.clientConfig = clientConfig;
        // The IClientConfig injected by the RibbonClientConfiguration configuration class
        this.ribbon = RibbonProperties.from(clientConfig);
        RibbonProperties ribbon = this.ribbon;
        this.connectTimeout = ribbon.getConnectTimeout();
        this.readTimeout = ribbon.getReadTimeout();
        this.serverIntrospector = serverIntrospector;
    }
}

The FeignLoadBalancer constructor shows that the parent RetryHandler of RequestSpecificRetryHandler is DefaultLoadBalancerRetryHandler.

The RetryHandler interface is defined as shown below.

RetryHandler interface method description:

  • isRetriableException method: whether the exception can be retried;
  • isCircuitTrippingException method: whether the exception is of a type that trips the circuit breaker;
  • getMaxRetriesOnSameServer method: the maximum number of retries against the same node;
  • getMaxRetriesOnNextServer method: the maximum number of retries against different nodes.
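The four methods listed above can be written down as a small interface. The sketch below mirrors the method names from the list but is not the Ribbon interface itself; SketchRetryHandler and SketchConnectOnlyRetryHandler are hypothetical names, and the choice of exception types in the sample implementation is an assumption made purely for illustration.

```java
import java.net.ConnectException;
import java.net.SocketException;

// Hypothetical mirror of the four methods described above.
interface SketchRetryHandler {
    boolean isRetriableException(Throwable e, boolean sameServer);
    boolean isCircuitTrippingException(Throwable e);
    int getMaxRetriesOnSameServer();
    int getMaxRetriesOnNextServer();
}

// Sample implementation: retries only when the connection could not be established.
class SketchConnectOnlyRetryHandler implements SketchRetryHandler {
    @Override
    public boolean isRetriableException(Throwable e, boolean sameServer) {
        return e instanceof ConnectException;
    }

    @Override
    public boolean isCircuitTrippingException(Throwable e) {
        // Any socket-level failure counts against the node in this sketch.
        return e instanceof SocketException;
    }

    @Override
    public int getMaxRetriesOnSameServer() {
        return 0;
    }

    @Override
    public int getMaxRetriesOnNextServer() {
        return 1;
    }
}
```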

Configuring the Ribbon retry policy

This section covers how to configure the maximum number of retries and the connection timeout.

FeignLoadBalancer passes in an IClientConfig when it creates the RequestSpecificRetryHandler; where this IClientConfig is created is analyzed further below. In its constructor, RequestSpecificRetryHandler reads from the IClientConfig the maximum number of retries against the same service node and the maximum number of retries against different service nodes. The source code is as follows.

public class RequestSpecificRetryHandler implements RetryHandler {
    public RequestSpecificRetryHandler(boolean okToRetryOnConnectErrors,
            boolean okToRetryOnAllErrors, RetryHandler baseRetryHandler, @Nullable IClientConfig requestConfig) {
        // ...
        // Get the two maximum retry counts from IClientConfig
        if (requestConfig != null) {
            if (requestConfig.containsProperty(CommonClientConfigKey.MaxAutoRetries)) {
                // Get the maximum number of retries on the same node
                this.retrySameServer = (Integer) requestConfig.get(CommonClientConfigKey.MaxAutoRetries);
            }
            if (requestConfig.containsProperty(CommonClientConfigKey.MaxAutoRetriesNextServer)) {
                // Get the maximum number of retries on different nodes
                this.retryNextServer = (Integer) requestConfig.get(CommonClientConfigKey.MaxAutoRetriesNextServer);
            }
        }
    }
}

requestConfig is obtained from SpringClientFactory when LoadBalancerFeignClient creates the FeignLoadBalancer (through the create method of CachingSpringLoadBalancerFactory). It too is injected by the RibbonClientConfiguration auto-configuration.

public FeignLoadBalancer create(String clientName) {
    FeignLoadBalancer client = this.cache.get(clientName);
    if (client != null) {
        return client;
    }
    // this.factory is the SpringClientFactory
    IClientConfig config = this.factory.getClientConfig(clientName);
    ILoadBalancer lb = this.factory.getLoadBalancer(clientName);
    ServerIntrospector serverIntrospector = this.factory.getInstance(clientName, ServerIntrospector.class);
    // Create the FeignLoadBalancer
    client = this.loadBalancedRetryFactory != null
            ? new RetryableFeignLoadBalancer(lb, config, serverIntrospector, this.loadBalancedRetryFactory)
            : new FeignLoadBalancer(lb, config, serverIntrospector);
    // Cache the FeignLoadBalancer
    this.cache.put(clientName, client);
    return client;
}

IClientConfig is configured in RibbonClientConfiguration; its source code is as follows:

public class RibbonClientConfiguration {
    // Default connection timeout
    public static final int DEFAULT_CONNECT_TIMEOUT = 1000;
    // Default read timeout
    public static final int DEFAULT_READ_TIMEOUT = 1000;

    // ${ribbon.client.name}
    @RibbonClientName
    private String name;

    // Register an IClientConfig instance, using DefaultClientConfigImpl
    @Bean
    @ConditionalOnMissingBean
    public IClientConfig ribbonClientConfig() {
        DefaultClientConfigImpl config = new DefaultClientConfigImpl();
        config.loadProperties(this.name);
        // Configure the connection timeout
        config.set(CommonClientConfigKey.ConnectTimeout, DEFAULT_CONNECT_TIMEOUT);
        // Configure the read timeout
        config.set(CommonClientConfigKey.ReadTimeout, DEFAULT_READ_TIMEOUT);
        config.set(CommonClientConfigKey.GZipPayload, DEFAULT_GZIP_PAYLOAD);
        return config;
    }
}

So how do we change the configuration?

The first method: configuration file

How do we set the Ribbon retry count in the application configuration file?

We can set a breakpoint in the ribbonClientConfig method of the RibbonClientConfiguration configuration class and debug, as shown in the figure below.

As the figure shows, the configuration parameter key has the following format:

<service provider name (serverId)>:<ribbon>:<parameter name>=<value>

Suppose that for the service provider sck-demo-provider we want to set the maximum number of retries on the same node to 10, the maximum number of retries on different nodes to 12, and the connection timeout to 15 seconds. We then add the following configuration to the application-[environment].yaml configuration file.

sck-demo-provider:
  ribbon:
    MaxAutoRetries: 10
    MaxAutoRetriesNextServer: 12
    ConnectTimeout: 15000

Both MaxAutoRetries and MaxAutoRetriesNextServer take effect, but ConnectTimeout does not. The reason is that when RibbonClientConfiguration creates the DefaultClientConfigImpl, it first calls the loadProperties method (the name argument passed in is the service name) to load the configuration from the configuration file, and then calls the set method to overwrite three settings: the connection timeout, the read timeout, and whether gzip compression is enabled. Configuring the connection timeout in this way therefore has no effect.
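The load-then-overwrite order is the whole explanation, and it can be reproduced with a map. The sketch below is an assumption-laden illustration: SketchClientConfig is a hypothetical stand-in for DefaultClientConfigImpl, string keys replace CommonClientConfigKey, and the flattened key form "<service name>.ribbon.<parameter name>" is assumed to be what the nested YAML above resolves to.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for the client configuration object.
class SketchClientConfig {
    private final Map<String, Object> values = new HashMap<>();

    // Copies every "<clientName>.ribbon.<key>" entry into the config under <key>.
    void loadProperties(String clientName, Properties source) {
        String prefix = clientName + ".ribbon.";
        for (String name : source.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                values.put(name.substring(prefix.length()), source.getProperty(name));
            }
        }
    }

    void set(String key, Object value) {
        values.put(key, value);
    }

    Object get(String key) {
        return values.get(key);
    }

    // Same order as the listing above: load from the file first, then overwrite the timeouts.
    static SketchClientConfig buildLikeDefaultConfiguration(String clientName, Properties source) {
        SketchClientConfig config = new SketchClientConfig();
        config.loadProperties(clientName, source);
        config.set("ConnectTimeout", 1000); // replaces whatever the file provided
        config.set("ReadTimeout", 1000);
        return config;
    }
}
```

In this sketch a MaxAutoRetries value from the file survives because nothing overwrites it afterwards, while a ConnectTimeout value from the file is replaced by the later set call.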

The second method: code configuration

Code configuration means registering the IClientConfig ourselves instead of relying on the automatic registration in RibbonClientConfiguration. The method in RibbonClientConfiguration that registers the IClientConfig carries the @ConditionalOnMissingBean conditional annotation, which is why we can register an IClientConfig ourselves.

Note, however, that RibbonClientConfiguration takes effect in the ApplicationContext that Ribbon creates for each Client, so we need to create a configuration class (Configuration) and register it with SpringClientFactory. That way, when SpringClientFactory creates the ApplicationContext for the Client, it registers the configuration class with that ApplicationContext, and the configuration class registered with SpringClientFactory becomes a configuration class of the created ApplicationContext.

@Configuration
public class RibbonConfiguration implements InitializingBean {

    @Resource
    private SpringClientFactory springClientFactory;

    @Override
    public void afterPropertiesSet() throws Exception {
        List<RibbonClientSpecification> cfgs = new ArrayList<>();
        RibbonClientSpecification configuration = new RibbonClientSpecification();
        // Which service provider this configuration applies to
        configuration.setName(ProviderConstant.SERVICE_NAME);
        // The configuration class to register
        configuration.setConfiguration(new Class[]{RibbonClientCfg.class});
        cfgs.add(configuration);
        springClientFactory.setConfigurations(cfgs);
    }

    // Take effect before RibbonClientConfiguration
    @AutoConfigureBefore(RibbonClientConfiguration.class)
    public static class RibbonClientCfg {

        @Bean
        public IClientConfig ribbonClientConfig() {
            DefaultClientConfigImpl config = new DefaultClientConfigImpl();
            config.setClientName("Fill it in, it doesn't matter, it doesn't matter.");
            config.set(CommonClientConfigKey.MaxAutoRetries, 1);
            config.setProperty(CommonClientConfigKey.MaxAutoRetriesNextServer, 3);
            config.set(CommonClientConfigKey.ConnectTimeout, 15000);
            config.set(CommonClientConfigKey.ReadTimeout, 15000);
            return config;
        }
    }
}

Because Ribbon creates the ApplicationContext only the first time an interface is called, obtaining the SpringClientFactory during the initialization phase of the application's Spring container and adding a custom configuration class to it works.

RibbonClientCfg is declared to take effect before RibbonClientConfiguration, so that RibbonClientConfiguration does not register its own IClientConfig with the container.

How to replace RetryHandler?

When OpenFeign is integrated with Ribbon, RequestSpecificRetryHandler is created by default through the getRequestSpecificRetryHandler method of FeignLoadBalancer. I also searched the source code and found no supported way to replace the RetryHandler; perhaps OpenFeign simply does not want it replaced. So we have to find another way.

Since the RetryHandler in use is the one returned by FeignLoadBalancer's getRequestSpecificRetryHandler method, can we replace the RetryHandler by subclassing FeignLoadBalancer and overriding getRequestSpecificRetryHandler? The answer is yes.

The custom FeignLoadBalancer code is as follows:

/**
 * Custom FeignLoadBalancer that replaces the default RequestSpecificRetryHandler
 */
public static class MyFeignLoadBalancer extends FeignLoadBalancer {

    public MyFeignLoadBalancer(ILoadBalancer lb, IClientConfig clientConfig, ServerIntrospector serverIntrospector) {
        super(lb, clientConfig, serverIntrospector);
    }

    @Override
    public RequestSpecificRetryHandler getRequestSpecificRetryHandler(RibbonRequest request, IClientConfig requestConfig) {
        // Return a custom RequestSpecificRetryHandler
        // Parameter 1: whether to retry on connection errors
        // Parameter 2: whether to retry on all errors
        return new RequestSpecificRetryHandler(false, false,
                getRetryHandler(), requestConfig) {
            /**
             * @param e          the exception thrown
             * @param sameServer whether this is a retry against the same node
             * @return whether to retry
             */
            @Override
            public boolean isRetriableException(Throwable e, boolean sameServer) {
                if (e instanceof ClientException) {
                    // Retry on connection errors
                    if (((ClientException) e).getErrorType() == ClientException.ErrorType.CONNECT_EXCEPTION) {
                        return true;
                    }
                    // Retry on connection timeouts
                    if (((ClientException) e).getErrorType() == ClientException.ErrorType.SOCKET_TIMEOUT_EXCEPTION) {
                        return true;
                    }
                    // Read timeouts are retried only on a different service node.
                    // If the read timed out, do not send the request to the same node again.
                    if (((ClientException) e).getErrorType() == ClientException.ErrorType.READ_TIMEOUT_EXCEPTION) {
                        return !sameServer;
                    }
                    // Server-side error: switch to a new node and retry
                    if (((ClientException) e).getErrorType() == ClientException.ErrorType.SERVER_THROTTLED) {
                        return !sameServer;
                    }
                }
                // Otherwise retry only if this is a connection exception
                return isConnectionException(e);
            }
        };
    }
}
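The decision table encoded by the custom isRetriableException can be checked in isolation. The following is a simplified sketch, not a drop-in replacement: SketchErrorType is a hypothetical enum that copies only the four error types used above plus a catch-all, and every other error type is treated as not retriable, whereas the listing above falls back to isConnectionException(e).

```java
// Hypothetical copy of the error types referenced in the listing above.
enum SketchErrorType {
    CONNECT_EXCEPTION,
    SOCKET_TIMEOUT_EXCEPTION,
    READ_TIMEOUT_EXCEPTION,
    SERVER_THROTTLED,
    GENERAL
}

class SketchCustomRetryPolicy {
    static boolean isRetriable(SketchErrorType type, boolean sameServer) {
        switch (type) {
            case CONNECT_EXCEPTION:
            case SOCKET_TIMEOUT_EXCEPTION:
                // The request most likely never reached the server: retry anywhere.
                return true;
            case READ_TIMEOUT_EXCEPTION:
            case SERVER_THROTTLED:
                // The node is slow or overloaded: retry only on a different node.
                return !sameServer;
            default:
                // Simplification: everything else is not retried in this sketch.
                return false;
        }
    }
}
```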

Because FeignLoadBalancer is created by the CachingSpringLoadBalancerFactory that OpenFeign's LoadBalancerFeignClient calls, we need to replace the CachingSpringLoadBalancerFactory registered by OpenFeign's FeignRibbonClientAutoConfiguration configuration class and override its create method. The code is as follows.

@Configuration
public class RibbonConfiguration {

    /**
     * Use a custom FeignLoadBalancer caching factory
     *
     * @return
     */
    @Bean
    public CachingSpringLoadBalancerFactory cachingSpringLoadBalancerFactory() {
        // springClientFactory: presumably the field injected into RibbonConfiguration in the earlier listing
        return new CachingSpringLoadBalancerFactory(springClientFactory) {

            private volatile Map<String, FeignLoadBalancer> cache = new ConcurrentReferenceHashMap<>();

            @Override
            public FeignLoadBalancer create(String clientName) {
                FeignLoadBalancer client = this.cache.get(clientName);
                if (client != null) {
                    return client;
                }
                IClientConfig config = this.factory.getClientConfig(clientName);
                ILoadBalancer lb = this.factory.getLoadBalancer(clientName);
                ServerIntrospector serverIntrospector = this.factory.getInstance(clientName, ServerIntrospector.class);
                // Use the custom FeignLoadBalancer
                client = new MyFeignLoadBalancer(lb, config, serverIntrospector);
                this.cache.put(clientName, client);
                return client;
            }
        };
    }
}
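The caching behaviour of the overridden create method can be sketched generically. This is an illustration with hypothetical names (SketchCachingFactory and its creator function are not Spring Cloud types), and it swaps the get-then-put sequence used above for ConcurrentHashMap.computeIfAbsent, which creates each entry at most once per key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical caching factory: builds one instance per client name and reuses it.
class SketchCachingFactory<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> creator;

    SketchCachingFactory(Function<String, T> creator) {
        this.creator = creator;
    }

    T create(String clientName) {
        // The creator runs only when no entry exists for this client name yet.
        return cache.computeIfAbsent(clientName, creator);
    }
}
```

A custom load balancer would be plugged in through the creator function in this sketch, which corresponds to the point where the listing above constructs MyFeignLoadBalancer.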