First let’s look at the flow of Dubbo calls
This article focuses on clusters.
When a cluster call fails, Dubbo offers several fault tolerance mechanisms: Failover, Failfast, Failsafe, Failback, Forking, and Broadcast. The default is Failover, which retries the call. The following table lists the features of each mechanism.
Mechanism | Description |
---|---|
Failover | Dubbo's default fault tolerance mechanism. When a call fails, other providers are tried. Users can set the number of retries with retries="2". Requests are load balanced. It is typically used for read operations and idempotent writes. Retries increase the latency of the interface, and they add further load to downstream machines that may already have reached their limit. |
Failfast | Fast failure: when a request fails, the exception is returned immediately without any retry. Requests are load balanced. It is usually used for non-idempotent interfaces. This mechanism is strongly affected by network jitter. |
Failsafe | When an exception occurs, it is ignored instead of being thrown. Requests are load balanced. A common scenario is a call where you do not care whether it succeeds and do not want an exception to affect the outer call, such as unimportant log synchronization where an occasional failure is acceptable. |
Failback | Failed requests are recorded in a failure queue and retried periodically by a scheduled thread pool. It suits asynchronous requests or requests that only need to be executed eventually. Requests are load balanced. |
Forking | Calls several providers of the same service at the same time and returns as soon as one of them returns a result. Users can set the maximum number of parallel calls with forks="2". It is usually used for interfaces with strict real-time requirements, at the cost of extra resources. |
Broadcast | Invokes all available providers, and the call fails if any node reports an error. Because the request is broadcast, no load balancing is needed. |
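To make the differences in the table more concrete, here is a minimal plain-Java sketch of three of the mechanisms. It does not use any Dubbo API: the class `FaultToleranceSketch` and its methods are hypothetical, a `Supplier` stands in for a remote call, and `ExecutorService.invokeAny` is swapped in for whatever Dubbo does internally to run forks in parallel. Returning a fallback value in the failsafe case is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical illustration only; none of these names are Dubbo classes. */
final class FaultToleranceSketch {

    /** Failfast: a single attempt; any failure propagates to the caller. */
    static <T> T failfast(Supplier<T> call) {
        return call.get();
    }

    /** Failsafe: a single attempt; a failure is swallowed and the fallback is returned. */
    static <T> T failsafe(Supplier<T> call, T fallback) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            return fallback;
        }
    }

    /** Forking: run up to `forks` calls in parallel and return the first successful result. */
    static <T> T forking(List<Supplier<T>> calls, int forks) {
        int n = Math.min(forks, calls.size());
        if (n <= 0) {
            throw new IllegalArgumentException("need at least one call and forks >= 1");
        }
        List<Callable<T>> tasks = new ArrayList<>();
        for (Supplier<T> call : calls.subList(0, n)) {
            tasks.add(call::get);
        }
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            return pool.invokeAny(tasks); // result of one task that completed without throwing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while forking", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("all forks failed", e.getCause());
        } finally {
            pool.shutdownNow(); // the remaining forks are wasted work, as the table notes
        }
    }

    public static void main(String[] args) {
        Supplier<String> broken = () -> { throw new IllegalStateException("provider down"); };
        Supplier<String> healthy = () -> "pong";
        System.out.println(failsafe(broken, "ignored"));
        System.out.println(forking(List.of(broken, healthy), 2));
    }
}
```

The sketch leaves out load balancing entirely; it only shows how each mechanism reacts to a failure.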
From the table above we can get a general idea of what each of the fault tolerance mechanisms in Dubbo means. Let’s look at how to use the above fault tolerance mechanisms in the service.
Method of use
This is usually done by setting the cluster attribute on the <dubbo:service>, <dubbo:reference>, <dubbo:consumer>, or <dubbo:provider> tag:
<dubbo:service cluster="failfast" />
- or
<dubbo:reference cluster="failfast" />
- or
<dubbo:consumer cluster="failfast" />
- or
<dubbo:provider cluster="failfast" />
Source code analysis
The Dubbo source code currently contains the following Cluster extensions, registered under dubbo-cluster/src/main/resources/META-INF/dubbo/internal. There are quite a few fault tolerance mechanisms, so which one to use in a real business depends on the specific scenario; the table above can serve as a reference.
mock=org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterWrapper
failover=org.apache.dubbo.rpc.cluster.support.FailoverCluster
failfast=org.apache.dubbo.rpc.cluster.support.FailfastCluster
failsafe=org.apache.dubbo.rpc.cluster.support.FailsafeCluster
failback=org.apache.dubbo.rpc.cluster.support.FailbackCluster
forking=org.apache.dubbo.rpc.cluster.support.ForkingCluster
available=org.apache.dubbo.rpc.cluster.support.AvailableCluster
mergeable=org.apache.dubbo.rpc.cluster.support.MergeableCluster
broadcast=org.apache.dubbo.rpc.cluster.support.BroadcastCluster
zone-aware=org.apache.dubbo.rpc.cluster.support.registry.ZoneAwareCluster
- First, let's take a look at the org.apache.dubbo.rpc.cluster.Cluster interface. Its @SPI annotation shows that the default Dubbo Cluster is FailoverCluster.
@SPI(FailoverCluster.NAME)
public interface Cluster {
    /**
     * Merge the directory invokers to a virtual invoker.
     *
     * @param <T>
     * @param directory
     * @return cluster invoker
     * @throws RpcException
     */
    @Adaptive
    <T> Invoker<T> join(Directory<T> directory) throws RpcException;
}
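The idea of "an annotation on the interface names the default implementation" can be sketched in plain Java. This is not Dubbo's ExtensionLoader: the `@DefaultName` annotation, the `Strategy` interface, and the in-memory registry below are all hypothetical stand-ins for `@SPI`, `Cluster`, and the name=class mapping file shown earlier.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;

/** Hypothetical illustration of default-extension lookup; not Dubbo code. */
final class SpiDefaultSketch {

    /** Stand-in for @SPI: carries the name of the default extension. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface DefaultName {
        String value();
    }

    /** Stand-in for the Cluster interface. */
    @DefaultName("failover")
    interface Strategy {
        String describe();
    }

    /** Stand-in for the name=class mapping file. */
    private static final Map<String, Strategy> REGISTRY = Map.<String, Strategy>of(
            "failover", () -> "retry on another provider",
            "failfast", () -> "fail immediately");

    /** Returns the requested extension, or the annotated default when no name is given. */
    static Strategy resolve(String requested) {
        String name = (requested == null || requested.isEmpty())
                ? Strategy.class.getAnnotation(DefaultName.class).value()
                : requested;
        Strategy strategy = REGISTRY.get(name);
        if (strategy == null) {
            throw new IllegalArgumentException("No such extension: " + name);
        }
        return strategy;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null).describe());       // no name given, so the default is used
        System.out.println(resolve("failfast").describe()); // explicit name
    }
}
```

In this sketch the lookup is keyed by the exact lowercase name, the same form used by the keys in the mapping file.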
- Let's take the default Dubbo Cluster as an example to analyze how fault tolerance is implemented. FailoverCluster is an implementation of Cluster; it simply creates a new Invoker and returns it.
public class FailoverCluster extends AbstractCluster {

    // The name referenced by @SPI; the default value of Dubbo Cluster
    public final static String NAME = "failover";

    @Override
    public <T> AbstractClusterInvoker<T> doJoin(Directory<T> directory) throws RpcException {
        // Directly create a FailoverClusterInvoker; the actual logic is encapsulated in that Invoker
        return new FailoverClusterInvoker<>(directory);
    }
}
Implementation logic of FailoverClusterInvoker
public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    List<Invoker<T>> copyInvokers = invokers;
    checkInvokers(copyInvokers, invocation);
    String methodName = RpcUtils.getMethodName(invocation);
    // Get the number of retries, which can be specified with retries="N"
    int len = getUrl().getMethodParameter(methodName, RETRIES_KEY, DEFAULT_RETRIES) + 1;
    // Always invoke at least once
    if (len <= 0) {
        len = 1;
    }
    RpcException le = null; // last exception
    List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyInvokers.size());
    Set<String> providers = new HashSet<String>(len);
    // Loop, retrying on failure
    for (int i = 0; i < len; i++) {
        if (i > 0) {
            checkWhetherDestroyed();
            // Re-enumerate the Invokers before retrying. The advantage is that if a
            // service has gone down, copyInvokers holds the latest available Invoker list
            copyInvokers = list(invocation);
            checkInvokers(copyInvokers, invocation);
        }
        // Select an Invoker through load balancing
        Invoker<T> invoker = select(loadbalance, invocation, copyInvokers, invoked);
        // Add the invoker to the invoked list
        invoked.add(invoker);
        // Set invoked on the RPC context
        RpcContext.getContext().setInvokers((List) invoked);
        try {
            // Call the invoke method of the selected Invoker
            Result result = invoker.invoke(invocation);
            if (le != null && logger.isWarnEnabled()) {
                logger.warn("Although retry the method " + methodName
                        + " in the service " + getInterface().getName()
                        + " was successful by the provider " + invoker.getUrl().getAddress()
                        + ", but there have been failed providers " + providers
                        + "(" + providers.size() + "/" + copyInvokers.size()
                        + ") from the registry " + directory.getUrl().getAddress()
                        + " on the consumer " + NetUtils.getLocalHost()
                        + " using the dubbo version " + Version.getVersion() + ". Last error is: "
                        + le.getMessage(), le);
            }
            return result;
        } catch (RpcException e) {
            if (e.isBiz()) { // biz exception.
                throw e;
            }
            le = e;
        } catch (Throwable e) {
            le = new RpcException(e.getMessage(), e);
        } finally {
            providers.add(invoker.getUrl().getAddress());
        }
    }
    throw new RpcException(le.getCode(), "Failed to invoke the method "
            + methodName + " in the service " + getInterface().getName()
            + ". Tried " + len + " times of the providers " + providers
            + "(" + providers.size() + "/" + copyInvokers.size()
            + ") from the registry " + directory.getUrl().getAddress()
            + " on the consumer " + NetUtils.getLocalHost() + " using the dubbo version "
            + Version.getVersion() + ". Last error is: "
            + le.getMessage(), le.getCause() != null ? le.getCause() : le);
}
As shown above, the doInvoke method of FailoverClusterInvoker first obtains the retry count and then loops over the invocation, retrying after each failure. Inside the for loop, it selects an Invoker through the load-balancing component and makes the remote call through that Invoker's invoke method. If the call fails, it records the exception and retries; business exceptions are rethrown immediately without a retry. Before each retry, the list method of the parent class is called again to re-enumerate the available Invokers.
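The retry loop described above can be boiled down to a small, self-contained sketch. Everything here is hypothetical and simplified: `FailoverSketch`, `Node`, and the round-robin `select` are stand-ins for FailoverClusterInvoker, Invoker, and the load-balancing component, and the sketch does not re-enumerate providers before a retry as the real code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical, simplified failover loop; not Dubbo code. */
final class FailoverSketch {

    /** Stand-in for an Invoker: an address plus a call that may throw. */
    static final class Node {
        final String address;
        final Supplier<String> call;

        Node(String address, Supplier<String> call) {
            this.address = address;
            this.call = call;
        }
    }

    /**
     * Makes up to retries + 1 attempts, preferring nodes that have not been tried yet,
     * and returns the first successful result. Tried addresses are appended to `tried`.
     */
    static String invoke(List<Node> nodes, int retries, List<String> tried) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("no provider available");
        }
        int attempts = Math.max(retries + 1, 1); // always call at least once
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            Node node = select(nodes, tried, i);
            tried.add(node.address);
            try {
                return node.call.get();
            } catch (RuntimeException e) {
                last = e; // record the failure and go on to the next attempt
            }
        }
        throw new IllegalStateException("Tried " + attempts + " times on " + tried, last);
    }

    /** Round-robin stand-in for load balancing that skips already-tried nodes when it can. */
    private static Node select(List<Node> nodes, List<String> tried, int attempt) {
        for (int k = 0; k < nodes.size(); k++) {
            Node candidate = nodes.get((attempt + k) % nodes.size());
            if (!tried.contains(candidate.address)) {
                return candidate;
            }
        }
        return nodes.get(attempt % nodes.size()); // every node was tried already; reuse one
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
                new Node("10.0.0.1:20880", () -> { throw new RuntimeException("timeout"); }),
                new Node("10.0.0.2:20880", () -> "hello"));
        List<String> tried = new ArrayList<>();
        System.out.println(invoke(nodes, 2, tried) + " after trying " + tried);
    }
}
```

With retries set to 2 the loop may run up to three times, which matches the `+ 1` applied to the configured retry count in doInvoke.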