The Introduction to Distributed Architecture section gave a general overview of distributed architecture and mentioned some of the problems, and their solutions, that arise as it evolves. Here we focus on the registry.
The source versions referenced in this article and in the official Nacos documentation:
- spring-cloud-alibaba-nacos: 2.2.5.RELEASE
- nacos-server: 2.0.2
- nacos-example: master
Why do you need a registry
In the Introduction to Distributed Architectures section, we learned that the evolution of distributed architectures will eventually (and in fact, the current mainstream architectures) evolve into something like this:
What kind of problems and challenges do we face under such an architecture?
- Once a single service is split into multiple independent services, how do the services communicate with each other?
- To avoid a single point of failure, microservice applications run multiple replicas. How are calls between services load balanced and scheduled?
- How does the calling service obtain information about the called service, including its nodes and their health status?
Remote Procedure Call (RPC) is a protocol for requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology.
RPC mainly solves the problem of remote invocation between services. In this broad sense, Web services, RESTful HTTP, and Dubbo can all be regarded as RPC protocols. Java RPC frameworks include Dubbo, Spring Cloud, Thrift, and gRPC.
This article uses the Spring Cloud demo (based on RestTemplate) as an example to explore the implementation of the Nacos registry.
Client load balancing
Distributing calls between the calling service and the multiple instances of the called service requires load balancing. Here we mainly use Spring Cloud Ribbon. The following figure shows the basic working mode of load balancing. The client does not pick a server directly. Instead, it asks the load balancer, which selects one server through its load balancing algorithm and returns that server's information to the client. The client then invokes the selected server.
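As a rough illustration of the selection step, the following sketch implements a simple round-robin rule in plain Java. The class name RoundRobinChooser and the server addresses are hypothetical, and Ribbon's own rules are more elaborate and are not reproduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin chooser, not Ribbon's actual implementation. */
class RoundRobinChooser {
    private final AtomicInteger counter = new AtomicInteger();

    /** Picks the next server address in rotation from the given list. */
    String choose(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no server available");
        }
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinChooser chooser = new RoundRobinChooser();
        List<String> servers = Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080");
        for (int i = 0; i < 4; i++) {
            // Alternates between the two addresses
            System.out.println(chooser.choose(servers));
        }
    }
}
```

The chooser only needs a list of candidate addresses. Where that list comes from is exactly the problem the registry solves.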
Principles of the Registry
With the two problems above solved, one problem still remains: how do services obtain information about each other, and how do they confirm the current health status of a service?
This is where the registry comes in. Based on these problems, we have the following requirements for the registry:
- It can save information about microservices.
- The saved information can be queried between microservices.
- The health status of a service can be detected and updated in time.
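To make the three requirements concrete, here is a minimal in-memory sketch. It is only an illustration under simple assumptions: the class SimpleRegistry, its methods, and the timeout-based health rule are hypothetical and are not how Nacos stores or checks instances internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry, for illustration only. */
class SimpleRegistry {
    // service name -> (instance address -> time of last heartbeat, in ms)
    private final Map<String, Map<String, Long>> services = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SimpleRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Requirement 1: save instance information. A heartbeat is modeled as a re-registration. */
    void register(String service, String address, long nowMillis) {
        services.computeIfAbsent(service, k -> new ConcurrentHashMap<>()).put(address, nowMillis);
    }

    /** Requirements 2 and 3: query instances, skipping those whose heartbeat has expired. */
    List<String> healthyInstances(String service, long nowMillis) {
        Map<String, Long> instances = services.getOrDefault(service, Collections.<String, Long>emptyMap());
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : instances.entrySet()) {
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result); // stable order for callers
        return result;
    }

    public static void main(String[] args) {
        SimpleRegistry registry = new SimpleRegistry(5000);
        registry.register("service-provider", "10.0.0.1:8080", 0);
        registry.register("service-provider", "10.0.0.2:8080", 0);
        registry.register("service-provider", "10.0.0.1:8080", 4000); // heartbeat from the first node only
        // At t = 6000 ms the second node has missed its heartbeat window
        System.out.println(registry.healthyInstances("service-provider", 6000));
    }
}
```

Time is passed in explicitly so the behavior is easy to follow. A real registry would use its own clock and a background check.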
Use the Nacos registry
Add the dependency
Service provider
Configure the address of the Nacos Server in
Enable service registration and discovery with the Spring Cloud native annotation @EnableDiscoveryClient:
@SpringBootApplication
@EnableDiscoveryClient
public class NacosProviderApplication {

    public static void main(String[] args) {
        SpringApplication.run(NacosProviderApplication.class, args);
    }

    @RestController
    class EchoController {

        @RequestMapping(value = "/echo/{string}", method = RequestMethod.GET)
        public String echo(@PathVariable String string) {
            return "Hello Nacos Discovery " + string;
        }
    }
}
Service consumer
Configure the address of the Nacos Server in
Enable service registration and discovery with the Spring Cloud native annotation @EnableDiscoveryClient. Add @LoadBalanced to the RestTemplate bean to integrate it with Ribbon:
@SpringBootApplication
@EnableDiscoveryClient
public class NacosConsumerApplication {

    @LoadBalanced
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(NacosConsumerApplication.class, args);
    }

    @RestController
    public class TestController {

        private final RestTemplate restTemplate;

        @Autowired
        public TestController(RestTemplate restTemplate) {
            this.restTemplate = restTemplate;
        }

        @RequestMapping(value = "/echo/{str}", method = RequestMethod.GET)
        public String echo(@PathVariable String str) {
            // "service-provider" is resolved to a concrete node by the load balancer
            return restTemplate.getForObject("http://service-provider/echo/" + str, String.class);
        }
    }
}
Nacos as the implementation of the registry
Service Registration process – Client
Spring Cloud Commons has an interface that abstracts the Spring Cloud service registration process: org.springframework.cloud.client.serviceregistry.AutoServiceRegistration. Under this interface there is an abstract implementation, org.springframework.cloud.client.serviceregistry.AbstractAutoServiceRegistration. This abstract class listens for the event published when the web container has been initialized and then registers the service information:
public void onApplicationEvent(WebServerInitializedEvent event) {
    // Listen for WebServerInitializedEvent; the event triggers the bind method
    bind(event);
}

public void bind(WebServerInitializedEvent event) {
    ApplicationContext context = event.getApplicationContext();
    if (context instanceof ConfigurableWebServerApplicationContext) {
        if ("management".equals(((ConfigurableWebServerApplicationContext) context)
                .getServerNamespace())) {
            return;
        }
    }
    this.port.compareAndSet(0, event.getWebServer().getPort());
    // Start the discovery lifecycle
    this.start();
}
From here we can see that when the application receives the WebServerInitializedEvent event, it goes on to execute the org.springframework.cloud.client.serviceregistry.AbstractAutoServiceRegistration#start method:
public void start() {
    if (!isEnabled()) {
        if (logger.isDebugEnabled()) {
            logger.debug("Discovery Lifecycle disabled. Not starting");
        }
        return;
    }

    // only initialize if nonSecurePort is greater than 0 and it isn't already running
    // because of containerPortInitializer below
    if (!this.running.get()) {
        // Publish a pre-registration event
        this.context.publishEvent(
                new InstancePreRegisteredEvent(this, getRegistration()));
        // Service registration
        register();
        if (shouldRegisterManagement()) {
            registerManagement();
        }
        // Publish a post-registration event
        this.context.publishEvent(
                new InstanceRegisteredEvent<>(this, getConfiguration()));
        this.running.compareAndSet(false, true);
    }
}
In AbstractAutoServiceRegistration# start, actually is the register method calls for service registration, issued before, during, and after registration and registered before and after the callback event.
protected AbstractAutoServiceRegistration(ServiceRegistry<R> serviceRegistry,
        AutoServiceRegistrationProperties properties) {
    this.serviceRegistry = serviceRegistry;
    this.properties = properties;
}

protected void register() {
    this.serviceRegistry.register(getRegistration());
}
The register method in the abstract class delegates the request to serviceRegistry#register, where serviceRegistry is the instance passed in through the constructor. Since we are using Nacos as the registry, the implementation here is com.alibaba.cloud.nacos.registry.NacosServiceRegistry:
public void register(Registration registration) {
    if (StringUtils.isEmpty(registration.getServiceId())) {
        log.warn("No service to register for nacos client...");
        return;
    }
    // Get the NamingService
    NamingService namingService = namingService();
    // serviceId is spring.application.name
    String serviceId = registration.getServiceId();
    String group = nacosDiscoveryProperties.getGroup();
    // Assemble the node information that needs to be registered
    Instance instance = getNacosInstanceFromRegistration(registration);
    try {
        // Register the node information through the NamingService
        namingService.registerInstance(serviceId, group, instance);
        log.info("nacos registry, {} {} {}:{} register finished", group, serviceId,
                instance.getIp(), instance.getPort());
    }
    catch (Exception e) {
        log.error("nacos registry, {} register failed... {},", serviceId,
                registration.toString(), e);
        // rethrow a RuntimeException if the registration is failed.
        rethrowRuntimeException(e);
    }
}
In com.alibaba.cloud.nacos.registry.NacosServiceRegistry#register, the NamingService is obtained first. This can be understood simply as getting an HTTP client based on the configured Nacos server address. The method then assembles the information of the current node that needs to be registered and registers the service through namingService.registerInstance.
public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
    String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
    // If it is a temporary (ephemeral) node, build the heartbeat task
    if (instance.isEphemeral()) {
        BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
        beatReactor.addBeatInfo(groupedServiceName, beatInfo);
    }
    // Register the service
    serverProxy.registerService(groupedServiceName, groupName, instance);
}
The implementation here is com.alibaba.nacos.client.naming.NacosNamingService#registerInstance. This method does two main things:
- It checks whether the node is a temporary (ephemeral) node. If so, it builds a heartbeat task.
- It registers the service.
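The idea of a heartbeat task can be sketched with a standard ScheduledExecutorService. This is only an illustration of periodic beats: the class HeartbeatSketch and its methods are hypothetical, and the way BeatReactor actually schedules and sends beats is not reproduced here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical heartbeat scheduler, for illustration only. */
class HeartbeatSketch {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "heartbeat-sketch");
        t.setDaemon(true); // do not keep the JVM alive just for heartbeats
        return t;
    });

    /** Runs sendBeat immediately and then once every periodMillis. */
    void addBeat(Runnable sendBeat, long periodMillis) {
        executor.scheduleAtFixedRate(sendBeat, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatSketch reactor = new HeartbeatSketch();
        CountDownLatch threeBeats = new CountDownLatch(3);
        reactor.addBeat(() -> {
            // A real client would send a request to the registry here
            System.out.println("beat for service-provider");
            threeBeats.countDown();
        }, 50);
        boolean seen = threeBeats.await(2, TimeUnit.SECONDS);
        reactor.shutdown();
        System.out.println("saw three beats: " + seen);
    }
}
```

Only ephemeral nodes need this: the registry treats the regular beat as proof that the node is still alive.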
public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {

    NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", namespaceId, serviceName,
            instance);

    // Encapsulate the request
    final Map<String, String> params = new HashMap<String, String>(16);
    params.put(CommonParams.NAMESPACE_ID, namespaceId);
    params.put(CommonParams.SERVICE_NAME, serviceName);
    params.put(CommonParams.GROUP_NAME, groupName);
    params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
    params.put("ip", instance.getIp());
    params.put("port", String.valueOf(instance.getPort()));
    params.put("weight", String.valueOf(instance.getWeight()));
    params.put("enable", String.valueOf(instance.isEnabled()));
    params.put("healthy", String.valueOf(instance.isHealthy()));
    params.put("ephemeral", String.valueOf(instance.isEphemeral()));
    params.put("metadata", JacksonUtils.toJson(instance.getMetadata()));

    // Invoke the registration API
    reqApi(UtilAndComs.nacosUrlInstance, params, HttpMethod.POST);
}
Invoke the API registration service
public String reqApi(String api, Map<String, String> params, Map<String, String> body, List<String> servers,
        String method) throws NacosException {

    params.put(CommonParams.NAMESPACE_ID, getNamespaceId());

    // Check whether the nacos-server address information is empty
    if (CollectionUtils.isEmpty(servers) && StringUtils.isBlank(nacosDomain)) {
        throw new NacosException(NacosException.INVALID_PARAM, "no server available");
    }

    NacosException exception = new NacosException();

    // If a single domain is configured, retry against it up to maxRetry times
    if (StringUtils.isNotBlank(nacosDomain)) {
        for (int i = 0; i < maxRetry; i++) {
            try {
                return callServer(api, params, body, nacosDomain, method);
            } catch (NacosException e) {
                exception = e;
                if (NAMING_LOGGER.isDebugEnabled()) {
                    NAMING_LOGGER.debug("request {} failed.", nacosDomain, e);
                }
            }
        }
    } else {
        // If multiple nodes are configured, start at a random node and try each one in turn
        Random random = new Random(System.currentTimeMillis());
        int index = random.nextInt(servers.size());

        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(index);
            try {
                return callServer(api, params, body, server, method);
            } catch (NacosException e) {
                exception = e;
                if (NAMING_LOGGER.isDebugEnabled()) {
                    NAMING_LOGGER.debug("request {} failed.", server, e);
                }
            }
            index = (index + 1) % servers.size();
        }
    }

    NAMING_LOGGER.error("request: {} failed, servers: {}, code: {}, msg: {}", api, servers, exception.getErrCode(),
            exception.getErrMsg());

    throw new NacosException(exception.getErrCode(),
            "failed to req API:" + api + " after all servers(" + servers + ") tried: " + exception.getMessage());
}
public String callServer(String api, Map<String, String> params, Map<String, String> body, String curServer,
        String method) throws NacosException {
    long start = System.currentTimeMillis();
    long end = 0;
    Header header = builderHeader();

    // Assemble the request URL
    String url;
    if (curServer.startsWith(UtilAndComs.HTTPS) || curServer.startsWith(UtilAndComs.HTTP)) {
        url = curServer + api;
    } else {
        if (!IPUtil.containsPort(curServer)) {
            curServer = curServer + IPUtil.IP_PORT_SPLITER + serverPort;
        }
        url = NamingHttpClientManager.getInstance().getPrefix() + curServer + api;
    }

    try {
        // Initiate an HTTP request with nacosRestTemplate
        HttpRestResult<String> restResult = nacosRestTemplate
                .exchangeForm(url, header, Query.newInstance().initParams(params), body, method, String.class);
        end = System.currentTimeMillis();

        MetricsMonitor.getNamingRequestMonitor(method, url, String.valueOf(restResult.getCode()))
                .observe(end - start);

        // Check whether the response is normal
        if (restResult.ok()) {
            return restResult.getData();
        }
        if (HttpStatus.SC_NOT_MODIFIED == restResult.getCode()) {
            return StringUtils.EMPTY;
        }
        throw new NacosException(restResult.getCode(), restResult.getMessage());
    } catch (Exception e) {
        NAMING_LOGGER.error("[NA] failed to request", e);
        throw new NacosException(NacosException.SERVER_ERROR, e);
    }
}
Finally, the HTTP request is sent to the server to register the node information with the server.
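The multi-server branch of reqApi follows a common failover pattern: start at some node, and on failure move to the next one until every node has been tried. The sketch below isolates that pattern. The class FailoverSketch, the method callWithFailover, and the server names are hypothetical, and the start index is passed in explicitly here (the client code above picks it at random) so that the example is deterministic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical failover helper, for illustration only. */
class FailoverSketch {
    /**
     * Tries each server once, starting at startIndex and wrapping around,
     * and returns the first successful response.
     */
    static String callWithFailover(List<String> servers, int startIndex, Function<String, String> call) {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no server available");
        }
        RuntimeException last = null;
        int index = Math.floorMod(startIndex, servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(index);
            try {
                return call.apply(server);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next server
            }
            index = (index + 1) % servers.size();
        }
        throw new IllegalStateException("failed after all servers " + servers + " tried", last);
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("nacos-1:8848", "nacos-2:8848", "nacos-3:8848");
        // Simulate nacos-2 being down: the call fails over to nacos-3
        String result = callWithFailover(servers, 1, server -> {
            if (server.startsWith("nacos-2")) {
                throw new IllegalStateException(server + " unreachable");
            }
            return "ok from " + server;
        });
        System.out.println(result); // prints "ok from nacos-3:8848"
    }
}
```

Starting at a random index spreads requests across the cluster, while the wrap-around loop guarantees that every node gets one attempt before the call is reported as failed.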
Service Registration Process – Server
From the client code, we can see that the HTTP request is finally sent to the /v1/ns/instance interface. This interface is handled by com.alibaba.nacos.naming.controllers.InstanceController#register.
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public String register(HttpServletRequest request) throws Exception {
    final String namespaceId = WebUtils
            .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    // Parse the request parameters to obtain the node information
    final Instance instance = parseInstance(request);
    // Initiate registration
    getInstanceOperator().registerInstance(namespaceId, serviceName, instance);
    return "ok";
}
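public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    // Cast the API instance to the naming module's core Instance type
    com.alibaba.nacos.naming.core.Instance coreInstance = (com.alibaba.nacos.naming.core.Instance) instance;
    serviceManager.registerInstance(namespaceId, serviceName, coreInstance);
}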
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    // Create an empty service. This is the service shown in the service list of the
    // Nacos console; it actually initializes an entry in serviceMap.
    createEmptyService(namespaceId, serviceName, instance.isEphemeral());
    // Get the registered service based on the namespace and service name
    Service service = getService(namespaceId, serviceName);
    if (service == null) {
        throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
    }
    // Call addInstance to create a service instance
    addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
        throws NacosException {
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    Service service = getService(namespaceId, serviceName);
    synchronized (service) {
        List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
        Instances instances = new Instances();
        instances.setInstanceList(instanceList);
        consistencyService.put(key, instances);
    }
}
Service instances are added to the collection, and the data is synchronized based on a consistency protocol.
As the code shows, the data is written through consistencyService, which has many implementation classes. In cluster mode, Nacos generally uses the Raft protocol:
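public void put(String key, Record value) throws NacosException {
    try {
        raftCore.signalPublish(key, value);
    } catch (Exception e) {
        Loggers.RAFT.error("Raft put failed.", e);
        throw new NacosException(NacosException.SERVER_ERROR, "Raft put failed, key:" + key + ", value:" + value,
                e);
    }
}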
This is the synchronization of node information based on RAFT protocol.
Service discovery process – Consumer
After looking at how providers register their information in the registry, let’s look at how consumers get to the list of providers’ services.
public List<ServiceInstance> getInstances(String serviceId) {
    try {
        return serviceDiscovery.getInstances(serviceId);
    }
    catch (Exception e) {
        throw new RuntimeException(
                "Can not get hosts from nacos server. serviceId: " + serviceId, e);
    }
}
In Spring Cloud, the service discovery client is org.springframework.cloud.client.discovery.DiscoveryClient, which provides two main methods:
- getInstances(String serviceId): gets all nodes for the specified service.
- getServices(): gets a list of all services.
The implementation class here is com.alibaba.cloud.nacos.discovery.NacosDiscoveryClient.
public List<ServiceInstance> getInstances(String serviceId) throws NacosException {
    String group = discoveryProperties.getGroup();
    List<Instance> instances = namingService().selectInstances(serviceId, group,
            true);
    return hostToServiceInstanceList(instances, serviceId);
}
Two main things are done here:
- Get the list of service instances through namingService.
- Convert the list of instances returned by Nacos into the common ServiceInstance type used in Spring Cloud.
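The conversion step is essentially a mapping from the registry's own instance type to a framework-neutral one. The following sketch shows the idea with two hypothetical classes, RegistryInstance and Endpoint, which stand in for the Nacos Instance and the Spring Cloud ServiceInstance. Skipping unhealthy nodes during the mapping is an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical mapping between a registry-side and a framework-side instance type. */
class InstanceMappingSketch {
    /** Stand-in for the registry's view of a node. */
    static class RegistryInstance {
        final String ip;
        final int port;
        final boolean healthy;

        RegistryInstance(String ip, int port, boolean healthy) {
            this.ip = ip;
            this.port = port;
            this.healthy = healthy;
        }
    }

    /** Stand-in for the framework's view, reduced to what a caller needs. */
    static class Endpoint {
        final String serviceId;
        final String host;
        final int port;

        Endpoint(String serviceId, String host, int port) {
            this.serviceId = serviceId;
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return serviceId + "@" + host + ":" + port;
        }
    }

    /** Converts registry instances to endpoints, skipping unhealthy nodes (an assumption of this sketch). */
    static List<Endpoint> toEndpoints(List<RegistryInstance> instances, String serviceId) {
        List<Endpoint> result = new ArrayList<>();
        for (RegistryInstance instance : instances) {
            if (instance != null && instance.healthy) {
                result.add(new Endpoint(serviceId, instance.ip, instance.port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<RegistryInstance> instances = Arrays.asList(
                new RegistryInstance("10.0.0.1", 8080, true),
                new RegistryInstance("10.0.0.2", 8080, false));
        System.out.println(toEndpoints(instances, "service-provider"));
    }
}
```

Because callers only see the neutral type, the same consumer code can work against a different registry implementation.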
public List<Instance> selectInstances(String serviceName, String groupName, List<String> clusters, boolean healthy,
        boolean subscribe) throws NacosException {

    ServiceInfo serviceInfo;
    // Whether to subscribe to service address changes; the default is true
    if (subscribe) {
        serviceInfo = hostReactor.getServiceInfo(NamingUtils.getGroupedName(serviceName, groupName),
                StringUtils.join(clusters, ","));
    } else {
        serviceInfo = hostReactor
                .getServiceInfoDirectlyFromServer(NamingUtils.getGroupedName(serviceName, groupName),
                        StringUtils.join(clusters, ","));
    }
    // Select the healthy nodes in the list and return them
    return selectInstances(serviceInfo, healthy);
}
public ServiceInfo getServiceInfo(final String serviceName, final String clusters) {

    NAMING_LOGGER.debug("failover-mode: " + failoverReactor.isFailoverSwitch());
    String key = ServiceInfo.getKey(serviceName, clusters);
    if (failoverReactor.isFailoverSwitch()) {
        return failoverReactor.getService(key);
    }

    // Get the cached object
    ServiceInfo serviceObj = getServiceInfo0(serviceName, clusters);

    // If the cache is empty, update the cache
    if (null == serviceObj) {
        serviceObj = new ServiceInfo(serviceName, clusters);
        serviceInfoMap.put(serviceObj.getKey(), serviceObj);

        // Add serviceName to the list of services being updated
        updatingMap.put(serviceName, new Object());
        // Start the update immediately
        updateServiceNow(serviceName, clusters);
        // Remove serviceName from the list of services being updated
        updatingMap.remove(serviceName);

    // If the current serviceName is already being updated, wait
    } else if (updatingMap.containsKey(serviceName)) {

        if (UPDATE_HOLD_INTERVAL > 0) {
            // hold a moment waiting for update finish
            synchronized (serviceObj) {
                try {
                    serviceObj.wait(UPDATE_HOLD_INTERVAL);
                } catch (InterruptedException e) {
                    NAMING_LOGGER
                            .error("[getServiceInfo] serviceName:" + serviceName + ", clusters:" + clusters, e);
                }
            }
        }
    }

    // Start the scheduled update task if it is not already running
    scheduleUpdateIfAbsent(serviceName, clusters);

    return serviceInfoMap.get(serviceObj.getKey());
}
com.alibaba.nacos.client.naming.core.HostReactor#getServiceInfo contains two main pieces of logic:
- If the local cache is empty, updateServiceNow loads the service information immediately.
- scheduleUpdateIfAbsent starts a scheduled task that queries the service information periodically.
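These two pieces of logic amount to a local cache that is loaded on first use and then refreshed in the background. The sketch below illustrates the pattern in isolation. The class ServiceCacheSketch is hypothetical, and the loader function is a stand-in for the query that the client sends to the server.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical cache; the loader stands in for a query to the registry server. */
class ServiceCacheSketch {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Set<String> scheduled = ConcurrentHashMap.newKeySet();
    private final Function<String, List<String>> loader;
    private final long refreshMillis;
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "cache-refresh-sketch");
        t.setDaemon(true); // do not keep the JVM alive just for refreshing
        return t;
    });

    ServiceCacheSketch(Function<String, List<String>> loader, long refreshMillis) {
        this.loader = loader;
        this.refreshMillis = refreshMillis;
    }

    List<String> getHosts(String serviceName) {
        // Branch 1: nothing cached yet, so load from the server right away
        List<String> hosts = cache.computeIfAbsent(serviceName, loader);
        // Branch 2: make sure exactly one periodic refresh exists for this service
        if (scheduled.add(serviceName)) {
            executor.scheduleWithFixedDelay(() -> cache.put(serviceName, loader.apply(serviceName)),
                    refreshMillis, refreshMillis, TimeUnit.MILLISECONDS);
        }
        return hosts;
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        ServiceCacheSketch cache = new ServiceCacheSketch(
                name -> Arrays.asList(name + "@10.0.0.1:8080"), 10_000);
        System.out.println(cache.getHosts("service-provider")); // loaded from the "server"
        System.out.println(cache.getHosts("service-provider")); // served from the local cache
        cache.shutdown();
    }
}
```

The benefit of this design is that most lookups never leave the process, at the cost of the cache being slightly stale between refreshes.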
HostReactor#updateServiceNow & HostReactor#updateService
private void updateServiceNow(String serviceName, String clusters) {
    try {
        // Call updateService
        updateService(serviceName, clusters);
    } catch (NacosException e) {
        NAMING_LOGGER.error("[NA] failed to update serviceName: " + serviceName, e);
    }
}

public void updateService(String serviceName, String clusters) throws NacosException {
    // Get the cached service information from before the update
    ServiceInfo oldService = getServiceInfo0(serviceName, clusters);
    try {
        // Call the API to get the list of service instances
        String result = serverProxy.queryList(serviceName, clusters, pushReceiver.getUdpPort(), false);
        // If the list returned by the server is not empty, update the cache
        if (StringUtils.isNotEmpty(result)) {
            processServiceJson(result);
        }
    } finally {
        if (oldService != null) {
            synchronized (oldService) {
                oldService.notifyAll();
            }
        }
    }
}
updateService does two things:
- Request the server to obtain the node list of serviceName
- Update the cache based on the queried data
public ServiceInfo processServiceJson(String json) {
    ServiceInfo serviceInfo = JacksonUtils.toObj(json, ServiceInfo.class);
    ServiceInfo oldService = serviceInfoMap.get(serviceInfo.getKey());

    if (pushEmptyProtection && !serviceInfo.validate()) {
        // empty or error push, just ignore
        return oldService;
    }

    boolean changed = false;

    // If oldService is not empty, compare it with the new service to find the added,
    // removed, and modified nodes, and publish the node change event
    if (oldService != null) {

        if (oldService.getLastRefTime() > serviceInfo.getLastRefTime()) {
            NAMING_LOGGER.warn("out of date data received, old-t: " + oldService.getLastRefTime() + ", new-t: "
                    + serviceInfo.getLastRefTime());
        }

        serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);

        Map<String, Instance> oldHostMap = new HashMap<String, Instance>(oldService.getHosts().size());
        for (Instance host : oldService.getHosts()) {
            oldHostMap.put(host.toInetAddr(), host);
        }

        Map<String, Instance> newHostMap = new HashMap<String, Instance>(serviceInfo.getHosts().size());
        for (Instance host : serviceInfo.getHosts()) {
            newHostMap.put(host.toInetAddr(), host);
        }

        Set<Instance> modHosts = new HashSet<Instance>();
        Set<Instance> newHosts = new HashSet<Instance>();
        Set<Instance> remvHosts = new HashSet<Instance>();

        // ... some code omitted

        if (newHosts.size() > 0 || remvHosts.size() > 0 || modHosts.size() > 0) {
            NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
                    serviceInfo.getClusters(), serviceInfo.getHosts()));
            DiskCache.write(serviceInfo, cacheDir);
        }

    // If oldService is empty, update the cache directly and publish the node change event
    } else {
        changed = true;
        NAMING_LOGGER.info("init new ips(" + serviceInfo.ipCount() + ") service: " + serviceInfo.getKey() + " -> "
                + JacksonUtils.toJson(serviceInfo.getHosts()));
        serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
        NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
                serviceInfo.getClusters(), serviceInfo.getHosts()));
        DiskCache.write(serviceInfo, cacheDir);
    }

    if (changed) {
        NAMING_LOGGER.info("current ips:(" + serviceInfo.ipCount() + ") service: " + serviceInfo.getKey() + " -> "
                + JacksonUtils.toJson(serviceInfo.getHosts()));
    }

    return serviceInfo;
}
The code of com.alibaba.nacos.client.naming.core.HostReactor#processServiceJson is fairly long, but it has two main branches:
- If oldService is not empty, oldService is compared with the new service. If the node information has changed, the heartbeat information needs to be updated. Finally, the cache is updated and the node change event is published.
- If oldService is empty, the cache is updated directly and the node change event is published.
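The comparison in the first branch can be illustrated with two maps keyed by node address. This is a simplified sketch under stated assumptions: the class HostDiffSketch is hypothetical, each node is reduced to a string description, and the part of the real method that was omitted above is not reproduced.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical diff of two host snapshots keyed by "ip:port". */
class HostDiffSketch {
    static class Diff {
        final Set<String> added = new TreeSet<>();
        final Set<String> removed = new TreeSet<>();
        final Set<String> modified = new TreeSet<>();

        boolean changed() {
            return !(added.isEmpty() && removed.isEmpty() && modified.isEmpty());
        }
    }

    /** Each map goes from "ip:port" to a description of the node (here just a string). */
    static Diff diff(Map<String, String> oldHosts, Map<String, String> newHosts) {
        Diff result = new Diff();
        for (Map.Entry<String, String> entry : newHosts.entrySet()) {
            if (!oldHosts.containsKey(entry.getKey())) {
                result.added.add(entry.getKey());
            } else if (!Objects.equals(oldHosts.get(entry.getKey()), entry.getValue())) {
                result.modified.add(entry.getKey());
            }
        }
        for (String key : oldHosts.keySet()) {
            if (!newHosts.containsKey(key)) {
                result.removed.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> oldHosts = new LinkedHashMap<>();
        oldHosts.put("10.0.0.1:8080", "weight=1");
        oldHosts.put("10.0.0.2:8080", "weight=1");
        Map<String, String> newHosts = new LinkedHashMap<>();
        newHosts.put("10.0.0.1:8080", "weight=2"); // same node, changed attribute
        newHosts.put("10.0.0.3:8080", "weight=1"); // new node
        Diff d = diff(oldHosts, newHosts);
        System.out.println("added=" + d.added + " removed=" + d.removed + " modified=" + d.modified);
        if (d.changed()) {
            System.out.println("a change event would be published here");
        }
    }
}
```

Publishing an event only when one of the three sets is non-empty keeps subscribers from being notified about refreshes that changed nothing.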
After the cache is updated, com.alibaba.nacos.client.naming.core.HostReactor#getServiceInfo returns the result to NacosNamingService#selectInstances. This completes the process by which NacosDiscoveryClient obtains the node information of a specified service.
This section focused on the implementation of Nacos as a registry in Spring Cloud.
How Nacos meets the basic requirements of a registry:
- It can save information about microservices.
  - The server saves the registration information of microservices and uses the Raft protocol to synchronize the data and keep it consistent.
- The saved information can be queried between microservices.
  - The client queries node information from the server and updates its local cache.
- The health status of a service can be detected and updated in time.
  - A scheduled task periodically checks for changes to the service nodes. If any node changes, the new data is pulled and the cache is updated.
The implementation of Nacos in Spring Cloud:
The service provider registers itself mainly through NacosServiceRegistry#register, which registers the node information of the service in Nacos. On the server, InstanceController#register receives the registration request. Service instances are added to the collection, and the data is synchronized based on a consistency protocol.
NacosDiscoveryClient#getInstances is used by service consumers to find node information of corresponding services.