The puzzle
While contributing to Nacos, I have been asked by many users why a service can still be called after they take it offline in the Nacos Console, which seems to contradict the officially advertised second-level online/offline feature. On closer inspection, the instances that kept serving traffic after going offline had one thing in common: they all used Ribbon, the load-balancing component. This article therefore looks at the problem from two angles: how Nacos implements second-level online/offline, and how Ribbon's instance update mechanism delays awareness of that change.
Nacos second-level online/offline
@CanDistro
@RequestMapping(value = "", method = RequestMethod.PUT)
public String update(HttpServletRequest request) throws Exception {
    String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    String agent = request.getHeader("Client-Version");
    if (StringUtils.isBlank(agent)) {
        agent = request.getHeader("User-Agent");
    }
    ClientInfo clientInfo = new ClientInfo(agent);
    // Java clients at version 1.0.0 or later go through updateInstance
    if (clientInfo.type == ClientInfo.ClientType.JAVA &&
            clientInfo.version.compareTo(VersionUtil.parseVersion("1.0.0")) >= 0) {
        serviceManager.updateInstance(namespaceId, serviceName, parseInstance(request));
    } else {
        serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
    }
    return "ok";
}
The parseInstance(request) method extracts the instance information from the request. The underlying updateInstance method is as follows:
public void updateInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    Service service = getService(namespaceId, serviceName);
    if (service == null) {
        throw new NacosException(NacosException.INVALID_PARAM, "service not found, namespace: " + namespaceId + ", service: " + serviceName);
    }
    if (!service.allIPs().contains(instance)) {
        throw new NacosException(NacosException.INVALID_PARAM, "instance not exist: " + instance);
    }
    addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}

public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    Service service = getService(namespaceId, serviceName);
    List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
    Instances instances = new Instances();
    instances.setInstanceList(instanceList);
    consistencyService.put(key, instances);
}
The rest of the flow is the same as registering a service instance on the Nacos Server side, which was covered in the previous post. In other words, as soon as an instance is taken offline in the Nacos Console, the instance data in the Nacos Naming Server is updated immediately.
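To make that flow concrete, here is a minimal in-memory sketch of the update path. It is not Nacos code: ToyRegistry, ToyInstance, and every method on them are hypothetical names chosen for illustration, and a plain map stands in for consistencyService.put. The point it tries to show is only that the write is synchronous, so a query issued right after the update already sees the instance as disabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the naming server's instance store (not Nacos code).
class ToyRegistry {

    // Hypothetical instance record, identified by ip:port.
    static final class ToyInstance {
        final String ip;
        final int port;
        final boolean enabled;

        ToyInstance(String ip, int port, boolean enabled) {
            this.ip = ip;
            this.port = port;
            this.enabled = enabled;
        }

        String key() {
            return ip + ":" + port;
        }
    }

    // serviceName -> (ip:port -> instance); stands in for the consistency layer.
    private final Map<String, Map<String, ToyInstance>> services = new ConcurrentHashMap<>();

    void register(String serviceName, ToyInstance instance) {
        services.computeIfAbsent(serviceName, k -> new ConcurrentHashMap<>())
                .put(instance.key(), instance);
    }

    // Same two guards as updateInstance: the service and the instance must already exist.
    void update(String serviceName, ToyInstance instance) {
        Map<String, ToyInstance> instances = services.get(serviceName);
        if (instances == null) {
            throw new IllegalArgumentException("service not found: " + serviceName);
        }
        if (!instances.containsKey(instance.key())) {
            throw new IllegalArgumentException("instance not exist: " + instance.key());
        }
        // The write is synchronous, so the very next query sees the new state.
        instances.put(instance.key(), instance);
    }

    // Returns only the instances that are still enabled.
    List<ToyInstance> selectEnabled(String serviceName) {
        List<ToyInstance> result = new ArrayList<>();
        Map<String, ToyInstance> instances = services.get(serviceName);
        if (instances != null) {
            for (ToyInstance instance : instances.values()) {
                if (instance.enabled) {
                    result.add(instance);
                }
            }
        }
        return result;
    }
}
```

Registering 10.0.0.1:8080 as enabled and then calling update with the same address and enabled set to false leaves selectEnabled with an empty list on the next call, which is the server-side half of the "second-level" behaviour.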
Ribbon instance update mechanism
Let's start with the Nacos implementation of Ribbon's instance-pulling code:
public class NacosServerList extends AbstractServerList<NacosServer> {

    private NacosDiscoveryProperties discoveryProperties;
    private String serviceId;

    public NacosServerList(NacosDiscoveryProperties discoveryProperties) {
        this.discoveryProperties = discoveryProperties;
    }

    @Override
    public List<NacosServer> getInitialListOfServers() {
        return getServers();
    }

    @Override
    public List<NacosServer> getUpdatedListOfServers() {
        return getServers();
    }

    private List<NacosServer> getServers() {
        try {
            List<Instance> instances = discoveryProperties.namingServiceInstance()
                    .selectInstances(serviceId, true);
            return instancesToServerList(instances);
        }
        catch (Exception e) {
            throw new IllegalStateException(
                    "Can not get service instances from nacos, serviceId=" + serviceId, e);
        }
    }

    private List<NacosServer> instancesToServerList(List<Instance> instances) {
        List<NacosServer> result = new ArrayList<>();
        if (null == instances) {
            return result;
        }
        for (Instance instance : instances) {
            result.add(new NacosServer(instance));
        }
        return result;
    }

    public String getServiceId() {
        return serviceId;
    }

    @Override
    public void initWithNiwsConfig(IClientConfig iClientConfig) {
        this.serviceId = iClientConfig.getClientName();
    }
}
As you can see, NacosServerList extends AbstractServerList. So where is this AbstractServerList ultimately used? Tracing the code shows that it ends up in the DynamicServerListLoadBalancer class:
protected final ServerListUpdater.UpdateAction updateAction = new ServerListUpdater.UpdateAction() {
    @Override
    public void doUpdate() {
        updateListOfServers();
    }
};

public DynamicServerListLoadBalancer(IClientConfig clientConfig) {
    initWithNiwsConfig(clientConfig);
}

@Override
public void initWithNiwsConfig(IClientConfig clientConfig) {
    try {
        super.initWithNiwsConfig(clientConfig);
        String niwsServerListClassName = clientConfig.getPropertyAsString(
                CommonClientConfigKey.NIWSServerListClassName, DefaultClientConfigImpl.DEFAULT_SEVER_LIST_CLASS);
        ServerList<T> niwsServerListImpl = (ServerList<T>) ClientFactory
                .instantiateInstanceWithClientConfig(niwsServerListClassName, clientConfig);
        // Keep the configured ServerList implementation
        this.serverListImpl = niwsServerListImpl;
        // Get the filter that is applied to the pulled server list
        if (niwsServerListImpl instanceof AbstractServerList) {
            AbstractServerListFilter<T> niwsFilter = ((AbstractServerList) niwsServerListImpl)
                    .getFilterImpl(clientConfig);
            niwsFilter.setLoadBalancerStats(getLoadBalancerStats());
            this.filter = niwsFilter;
        }
        // Get the class name of the ServerListUpdater implementation
        String serverListUpdaterClassName = clientConfig.getPropertyAsString(
                CommonClientConfigKey.ServerListUpdaterClassName, DefaultClientConfigImpl.DEFAULT_SERVER_LIST_UPDATER_CLASS);
        // The default implementation is PollingServerListUpdater
        this.serverListUpdater = (ServerListUpdater) ClientFactory
                .instantiateInstanceWithClientConfig(serverListUpdaterClassName, clientConfig);
        // Initialize or reset
        restOfInit(clientConfig);
    } catch (Exception e) {
        throw new RuntimeException(
                "Exception while initializing NIWSDiscoveryLoadBalancer:"
                        + clientConfig.getClientName()
                        + ", niwsClientConfig:" + clientConfig, e);
    }
}

void restOfInit(IClientConfig clientConfig) {
    boolean primeConnection = this.isEnablePrimingConnections();
    // turn this off to avoid duplicated asynchronous priming done in BaseLoadBalancer.setServerList()
    this.setEnablePrimingConnections(false);
    // Start the scheduled task that periodically refreshes the cached instance information
    enableAndInitLearnNewServersFeature();
    // Pull the instance list once right away
    updateListOfServers();
    if (primeConnection && this.getPrimeConnections() != null) {
        this.getPrimeConnections().primeConnections(getReachableServers());
    }
    this.setEnablePrimingConnections(primeConnection);
    LOGGER.info("DynamicServerListLoadBalancer for client {} initialized: {}", clientConfig.getClientName(), this.toString());
}

// Update the instance information cache
@VisibleForTesting
public void updateListOfServers() {
    List<T> servers = new ArrayList<T>();
    if (serverListImpl != null) {
        // Call the method that pulls the latest instance information
        servers = serverListImpl.getUpdatedListOfServers();
        LOGGER.debug("List of Servers for {} obtained from Discovery client: {}", getIdentifier(), servers);
        // Run the pulled server list through the filter
        if (filter != null) {
            servers = filter.getFilteredListOfServers(servers);
            LOGGER.debug("Filtered List of Servers for {} obtained from Discovery client: {}", getIdentifier(), servers);
        }
    }
    // Update the instance list
    updateAllServerList(servers);
}
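The consequence of this design is that requests are routed from a cached list, not from the registry. The sketch below is a deliberately reduced model of that idea, not Ribbon's implementation: ToyCachingBalancer and its members are hypothetical names, servers are plain strings, and a Supplier stands in for ServerList.getUpdatedListOfServers().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical, simplified model of a load balancer that serves from a cached server list.
class ToyCachingBalancer {

    private final Supplier<List<String>> serverList; // stands in for getUpdatedListOfServers()
    private final Predicate<String> filter;          // stands in for the server list filter
    private final AtomicInteger next = new AtomicInteger();
    private volatile List<String> cached = Collections.emptyList();

    ToyCachingBalancer(Supplier<List<String>> serverList, Predicate<String> filter) {
        this.serverList = serverList;
        this.filter = filter;
    }

    // Pull from the registry, apply the filter, then swap the cache in one step.
    void updateListOfServers() {
        List<String> fresh = new ArrayList<>();
        for (String server : serverList.get()) {
            if (filter.test(server)) {
                fresh.add(server);
            }
        }
        cached = Collections.unmodifiableList(fresh);
    }

    // Round-robin over the cache only; the registry is never consulted here.
    String choose() {
        List<String> snapshot = cached;
        if (snapshot.isEmpty()) {
            return null;
        }
        return snapshot.get(Math.floorMod(next.getAndIncrement(), snapshot.size()));
    }
}
```

If the backing list loses an entry after updateListOfServers() has run, choose() keeps returning that entry until updateListOfServers() is called again. That gap is exactly where an instance that is already offline in the registry can still receive calls.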
Let's see what enableAndInitLearnNewServersFeature() ultimately calls:
@Override
public synchronized void start(final UpdateAction updateAction) {
    if (isActive.compareAndSet(false, true)) {
        final Runnable wrapperRunnable = new Runnable() {
            @Override
            public void run() {
                if (!isActive.get()) {
                    if (scheduledFuture != null) {
                        scheduledFuture.cancel(true);
                    }
                    return;
                }
                try {
                    // The UpdateAction here wraps DynamicServerListLoadBalancer's updateListOfServers()
                    updateAction.doUpdate();
                    lastUpdated = System.currentTimeMillis();
                } catch (Exception e) {
                    logger.warn("Failed one update cycle", e);
                }
            }
        };
        // The default task execution interval is 30s
        scheduledFuture = getRefreshExecutor().scheduleWithFixedDelay(
                wrapperRunnable,
                initialDelayMs,
                refreshIntervalMs,
                TimeUnit.MILLISECONDS);
    } else {
        logger.info("Already active, no-op");
    }
}
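The scheduling pattern above can be reproduced with nothing but the JDK. The following is a small sketch under stated assumptions, not Ribbon's class: ToyPollingUpdater is a hypothetical name, it owns its own single-threaded executor, and it cannot be restarted once stopped. It keeps the two properties that matter here, namely the compare-and-set guard against double starts and the fixed-delay schedule that survives a failed cycle.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified polling updater built on the JDK scheduler (not Ribbon code).
class ToyPollingUpdater {

    private final AtomicBoolean active = new AtomicBoolean(false);
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "toy-polling-updater");
                thread.setDaemon(true); // do not keep the JVM alive for the refresh task
                return thread;
            });
    private final long initialDelayMs;
    private final long refreshIntervalMs;
    private volatile ScheduledFuture<?> scheduledFuture;

    ToyPollingUpdater(long initialDelayMs, long refreshIntervalMs) {
        this.initialDelayMs = initialDelayMs;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    // Starts the periodic refresh; a second call while active is a no-op.
    synchronized void start(Runnable updateAction) {
        if (!active.compareAndSet(false, true)) {
            return;
        }
        scheduledFuture = executor.scheduleWithFixedDelay(() -> {
            try {
                updateAction.run();
            } catch (RuntimeException e) {
                // Swallow the failure: an exception escaping here would cancel all future runs.
            }
        }, initialDelayMs, refreshIntervalMs, TimeUnit.MILLISECONDS);
    }

    // Stops the refresh for good; this sketch is one-shot and cannot be started again.
    synchronized void stop() {
        if (active.compareAndSet(true, false)) {
            if (scheduledFuture != null) {
                scheduledFuture.cancel(true);
            }
            executor.shutdownNow();
        }
    }
}
```

Between two runs of the task nothing triggers a refresh, so whatever changed in the registry during that gap stays invisible to the caller until the next run.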
So Nacos does implement second-level online/offline for instances. In Spring Cloud, however, the load-balancing component Ribbon refreshes its instance information with a scheduled task. If that task ran one second before you took the instance online or offline, Ribbon will not notice the change until the next run, which can be up to refreshIntervalMs later.
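The size of that window is simple arithmetic, sketched below. StalenessWindow is a hypothetical helper, and it assumes refreshes fire at initialDelayMs plus whole multiples of refreshIntervalMs, ignoring the time each pull takes (scheduleWithFixedDelay actually counts from the end of the previous run, so real delays are slightly longer). The 30000 ms interval matches the default mentioned above; the 1000 ms initial delay is only an illustrative value.

```java
// Hypothetical helper that estimates how long a cached server list stays stale.
class StalenessWindow {

    /**
     * Returns the milliseconds between a registry change and the next refresh,
     * assuming refreshes fire at initialDelayMs + k * refreshIntervalMs (k = 0, 1, 2, ...)
     * and that changeAtMs is measured from the moment the updater was started.
     */
    static long delayUntilNextRefreshMs(long changeAtMs, long initialDelayMs, long refreshIntervalMs) {
        if (refreshIntervalMs <= 0) {
            throw new IllegalArgumentException("refreshIntervalMs must be positive");
        }
        if (changeAtMs <= initialDelayMs) {
            // The change happened before the first scheduled refresh.
            return initialDelayMs - changeAtMs;
        }
        long sinceFirstRefresh = changeAtMs - initialDelayMs;
        long remainder = sinceFirstRefresh % refreshIntervalMs;
        // A change that coincides with a refresh is picked up at once.
        return remainder == 0 ? 0 : refreshIntervalMs - remainder;
    }
}
```

With an initial delay of 1000 ms and an interval of 30000 ms, a change at 1001 ms waits 29999 ms, while a change at 30999 ms waits only 1 ms. The average wait is therefore about half the interval and the worst case is the full interval. As far as I know, Ribbon reads this interval from its ServerListRefreshInterval client property, so lowering it narrows the window at the cost of more frequent pulls; that is worth verifying against the Ribbon version you run.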