Background

Some time ago, there was a news report that HashiCorp announced on its official website that its enterprise products may no longer be used, deployed, or installed in China.

Back in 2008, before ZooKeeper came out, Alibaba needed internal service discovery, so it built ConfigServer. Ten years later, in July 2018, Alibaba released Nacos 0.1.0, an open-source implementation of ConfigServer. Nearly two years on it has reached version 1.3.0, and it now supports many features:

  • Service registration and discovery: Nacos has been integrated with many RPC frameworks, such as Dubbo and Spring Cloud, so we can use it out of the box; it also exposes a fairly simple API for wiring up our own RPC framework.
  • Configuration management: a configuration center similar to Apollo, so configuration no longer lives in local files but is managed centrally in the background.
  • Address server: lets us address Nacos across different environments and isolation scenarios.
  • Security and stability: performance monitoring, encrypted transport, permission and access control, and so on.

The core functions of Nacos are service registration and configuration management, and my articles focus on these two modules. This article mainly covers the usage of Nacos service discovery and registration, its underlying principles, and some optimizations compared with other registries.

Basic concepts

Let's start by looking at some basic concepts of service discovery and registration in Nacos:

  • Namespace: the top-level structure in Nacos, used for tenant-level isolation. The most common use is isolating different environments, such as the test environment and the production environment.
  • Service: a Service corresponds one-to-one with an ordinary microservice, such as an order service or a logistics service. A namespace can contain multiple services, and different namespaces can contain services with the same name; for example, both the test environment and the production environment can have an order service.
  • Virtual cluster: all the machines in a service form a cluster, which Nacos can further divide into virtual clusters as needed.
  • Instance: roughly speaking, one machine or virtual machine is an instance; more precisely, it is a process that provides one or more services and has an accessible network address (IP:Port).

The above is the service domain model diagram provided in the official Nacos documentation. From it we can see the hierarchy: service, then cluster, then instance. Each of the three levels also stores some metadata for other requirements.

When it comes to service registration, many people first think of ZooKeeper. In fact, ZK does not directly provide service registration and subscription; to implement them on ZK you have to manage its file-and-directory-like znode tree yourself, node by node, which is very inconvenient. The Nacos API for service registration, by contrast, looks like this:

    import java.util.Properties;

    import com.alibaba.nacos.api.naming.NamingFactory;
    import com.alibaba.nacos.api.naming.NamingService;
    import com.alibaba.nacos.api.naming.listener.Event;
    import com.alibaba.nacos.api.naming.listener.EventListener;
    import com.alibaba.nacos.api.naming.listener.NamingEvent;

    Properties properties = new Properties();
    properties.setProperty("serverAddr", System.getProperty("serverAddr"));
    properties.setProperty("namespace", System.getProperty("namespace"));

    NamingService naming = NamingFactory.createNamingService(properties);

    // Register an instance of the service into cluster TEST1
    naming.registerInstance("microservice-mmp-marketing", "11.11.11.11", 8888, "TEST1");

    // Subscribe to the service and print every change
    naming.subscribe("microservice-mmp-marketing", new EventListener() {
        @Override
        public void onEvent(Event event) {
            System.out.println(((NamingEvent) event).getServiceName());
            System.out.println(((NamingEvent) event).getInstances());
        }
    });

We just need to create a NamingService, and then call the registerInstance and subscribe methods to complete the registration and subscription of our service.

Official documentation: nacos.io/zh-cn/docs/…

AP or CP

CAP

When it comes to distributed systems, the CAP theorem is indispensable. The CAP theorem, also known as Brewer's theorem, is the gateway theory for architects designing distributed systems (not just distributed transactions).

  • C (consistency): for a given client, a read returns the latest write. With data replicated across nodes, if an update on one node can immediately be read from every other node, the system is strongly consistent; if some node may still return stale data, it is not.
  • A (availability): non-failing nodes return reasonable responses (neither errors nor timeouts) within a reasonable time. The two keys are reasonable time and reasonable response: a request must not block indefinitely and should return within a bounded time, and the system must definitely return a result that is correct, e.g. 50 rather than 40.
  • P (partition tolerance): the system keeps working after a network partition occurs. For example, in a cluster of several machines, one machine loses network connectivity but the cluster as a whole keeps serving.

Anyone familiar with CAP knows you cannot have all three (the proof is easy to find if you are interested). In a distributed system the network can never be 100% reliable, so partitions are inevitable and P is not optional. If we chose CA and gave up P, then when a partition did occur we would have to reject requests to preserve consistency, which violates A. So in theory a distributed system cannot choose a CA architecture; only CP or AP are possible.

CP gives up availability in pursuit of consistency and partition tolerance; ZooKeeper is in this camp, pursuing strong consistency.

AP gives up consistency (strong consistency, that is) in pursuit of partition tolerance and availability. This is the choice of many distributed system designs, and the later BASE model is also an extension of AP.

Incidentally, CAP theory ignores network latency: it treats replication from node A to node B as instantaneous when a transaction commits, which is obviously impossible in reality, so there is always some window of inconsistency. Also, choosing CP does not mean giving up A entirely. Partitions are rare, so most of the time you still provide both C and A; and even when a partition does occur, you prepare to restore availability afterwards, for example by replaying logs to bring the other machines back in sync.

Registry selection

As mentioned above, every distributed system has to make its CAP trade-off, and registries are no exception. ZooKeeper is the first choice for many service registries, and the company I work for also uses it as the registration center. However, as the company grew, ZK became more and more unstable, and service discovery across multiple data centers became very hard. Alibaba has explained in more detail why it does not use ZK as a registry; here I will briefly summarize:

  • Performance cannot keep up and ZK cannot scale horizontally: anyone familiar with ZK knows that all writes go through the Leader (master node), so it is hard to scale horizontally. Once a company reaches a certain scale, ZK is no longer suitable as a registry, because frequent reads and writes easily make it unstable. You can split into multiple ZK clusters, but that creates a new problem: clusters that initially never interact become a headache the moment some cross-cluster business appears and services in different clusters need to call each other.
  • The ZooKeeper API is hard to use: using ZooKeeper well really requires an expert, familiar with its many exceptions and what to do about each of them.
  • A registry does not need to store historical changes: in theory a registry only needs to know which services and instances are registered at this point in time, yet ZK keeps a transaction log for later recovery.
  • No resilience across data centers: suppose we have a disaster-recovery deployment of five nodes across three rooms, as shown in the following figure:

    If the network between rooms 1, 2, and 3 partitions, i.e. the network inside each room is fine but the links between rooms are cut, any room that cannot reach a ZK quorum loses ZK entirely, and the services inside that room can no longer use the registry to call each other. This is clearly unacceptable: as long as the network within a room is healthy, calls within that room should still be allowed.

Based on the above, ZK is not suitable as our registry. In other words, a registry does not really need CP; what it should provide is AP.

Protocols in Nacos

Distro

The Nacos Instance provides an ephemeral field. This bool field has the same meaning as in ZK: it marks whether the node is temporary. By default, all instances in the registry are in fact temporary nodes.
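To make the flag concrete, here is a minimal sketch using the Nacos Java client's Instance pojo; the server address, IP, and service name are placeholder values:

    import java.util.Properties;

    import com.alibaba.nacos.api.naming.NamingFactory;
    import com.alibaba.nacos.api.naming.NamingService;
    import com.alibaba.nacos.api.naming.pojo.Instance;

    public class EphemeralDemo {
        public static void main(String[] args) throws Exception {
            Properties properties = new Properties();
            properties.setProperty("serverAddr", "127.0.0.1:8848");

            NamingService naming = NamingFactory.createNamingService(properties);

            Instance instance = new Instance();
            instance.setIp("11.11.11.11");
            instance.setPort(8888);
            // true (the default) -> temporary node, kept alive by heartbeats via Distro;
            // false -> persistent node, stored and synchronized via Raft (see below).
            instance.setEphemeral(true);

            naming.registerInstance("microservice-mmp-marketing", instance);
        }
    }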

To implement AP, Nacos defined a custom protocol called Distro. Let's look at what Distro really is.

Pure in-memory storage

Distro keeps all of its data in memory; inside DistroConsistencyService the store looks like this:

As you can see, Distro uses a ConcurrentHashMap as its storage container, with no extra files needed. Some readers will ask: if my machine crashes and all the in-memory information is lost, how do I recover that data?
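Since the screenshot is hard to reproduce here, the following is a simplified sketch of the idea; the real code lives in the Distro consistency service and its DataStore, and the class and field names below are illustrative, not Nacos's exact ones:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Simplified sketch of Distro's in-memory store: everything lives in a
    // ConcurrentHashMap, and no disk files are involved at all.
    public class InMemoryDataStore {

        // Hypothetical record standing in for Nacos's Datum (key + value + version).
        public static class Datum {
            public String key;
            public Object value;
            public long timestamp;
        }

        private final Map<String, Datum> dataMap = new ConcurrentHashMap<>();

        public void put(String key, Datum datum) {
            dataMap.put(key, datum);
        }

        public Datum get(String key) {
            return dataMap.get(key);
        }

        public void remove(String key) {
            dataMap.remove(key);
        }
    }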

Eventual consistency

The put method of DistroConsistencyService adds a task to the TaskDispatcher, as shown in the following figure:

At this point some readers will ask: what if a machine has just come online and simply never received the updated data? Nacos also has a safety-net mechanism, TimedSync, a task executed every 5s with the following code:

Notice the part circled in red: it does not synchronize all data. Instead it traverses all keys, keeps only the data this node is responsible for (which data a node is responsible for is covered in the next section), then sends checksums of that data to all the other servers so they can verify and pull anything they are missing.

With these two mechanisms, real-time updates plus periodic sync, the data on all Nacos nodes is guaranteed to become eventually consistent.
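Putting the two paths together, here is an illustrative sketch (class and method names are mine, not Nacos's) of how a write lands in memory, is queued for real-time sync, and is also covered by the 5s checksum loop:

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class DistroSyncSketch {

        private final Map<String, String> data = new ConcurrentHashMap<>();
        private final BlockingQueue<String> changedKeys = new LinkedBlockingQueue<>();
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();

        public void start(List<String> otherServers) {
            // Path 1: real-time sync -- a daemon thread drains changed keys
            // and pushes the latest value for each key to all peers.
            Thread realTime = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        String key = changedKeys.take();
                        for (String server : otherServers) {
                            sendData(server, key, data.get(key));
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            realTime.setDaemon(true);
            realTime.start();

            // Path 2: safety net -- every 5s send checksums of the keys this
            // node owns, so peers can detect and pull anything they missed.
            timer.scheduleWithFixedDelay(() -> {
                for (String key : data.keySet()) {
                    if (!isResponsible(key)) {
                        continue; // only the owning node announces a key
                    }
                    for (String server : otherServers) {
                        sendChecksum(server, key, checksum(data.get(key)));
                    }
                }
            }, 5, 5, TimeUnit.SECONDS);
        }

        public void put(String key, String value) {
            data.put(key, value);   // 1. write to memory
            changedKeys.offer(key); // 2. queue for real-time sync
        }

        // Stubs standing in for HTTP calls and key ownership (see next section).
        private void sendData(String server, String key, String value) { }
        private void sendChecksum(String server, String key, String sum) { }
        private boolean isResponsible(String key) { return true; }
        private String checksum(String value) { return Integer.toHexString(value.hashCode()); }
    }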

Horizontal scaling

One drawback of ZK is that it cannot scale horizontally, which is a common problem of CP systems: as the company develops and grows, it becomes hard to sustain the business. Distro has no Leader role, and every node can handle both reads and writes, so we can scale Nacos nodes horizontally as much as we want.

In Distro every node can serve read requests, but not every node handles every write: each node uses the hash of the key to decide whether it should process a given write itself. Write requests arrive through a domain name and therefore land randomly on any node; how Nacos routes them to the responsible machine can be seen in DistroFilter:

This ServletFilter inspects each request, and if it finds the request does not belong to the current node, it forwards it to the responsible server for processing and then returns that server's result to the caller.
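A minimal sketch of the ownership check behind this routing, assuming every node sees the same ordered server list (the real hashing in Nacos's DistroMapper differs in detail):

    import java.util.List;

    // Hash the service key over the shared server list; if another node owns
    // the key, the filter forwards the request there instead of handling it.
    public class DistroRouter {

        private final List<String> servers;   // all Nacos nodes, same order on every node
        private final String localServer;

        public DistroRouter(List<String> servers, String localServer) {
            this.servers = servers;
            this.localServer = localServer;
        }

        public String responsibleServer(String serviceKey) {
            int index = (serviceKey.hashCode() & Integer.MAX_VALUE) % servers.size();
            return servers.get(index);
        }

        public boolean isResponsible(String serviceKey) {
            return localServer.equals(responsibleServer(serviceKey));
        }
    }

If isResponsible returns false, the filter proxies the request to responsibleServer(...) and relays the response back to the client.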

In fact, this is something Nacos could optimize: the forwarding is synchronous. If it sent asynchronously and enabled servlet async mode, the forwarding node could behave like a gateway with no synchronous waiting, which would increase the throughput of the Nacos cluster.
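As a sketch of that suggestion (my code, not Nacos's), using the Servlet async API plus Java 11's HttpClient:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    import javax.servlet.AsyncContext;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Instead of blocking the worker thread while the owning node answers,
    // switch the request into async mode and complete it from the callback.
    public class AsyncForwarder {

        private final HttpClient client = HttpClient.newHttpClient();

        public void forward(HttpServletRequest req, String targetServer) {
            AsyncContext ctx = req.startAsync();  // free the container thread
            HttpRequest proxied = HttpRequest.newBuilder()
                    .uri(URI.create("http://" + targetServer + req.getRequestURI()))
                    .build();
            client.sendAsync(proxied, HttpResponse.BodyHandlers.ofString())
                    .whenComplete((resp, err) -> {
                        try {
                            HttpServletResponse out = (HttpServletResponse) ctx.getResponse();
                            if (err != null) {
                                out.setStatus(500);
                            } else {
                                out.setStatus(resp.statusCode());
                                out.getWriter().write(resp.body());
                            }
                        } catch (Exception ignored) {
                            // sketch: swallow I/O errors
                        } finally {
                            ctx.complete();       // finish the original request
                        }
                    });
        }
    }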

Compared with CP protocols, Distro's advantages for a registry are very large, and the implementation of the whole protocol is much simpler and easier to understand. If you ever get involved in designing a registry protocol, this is a good idea to borrow.

Raft

Nacos also uses a strongly consistent protocol, Raft, in two places:

  • In the registry: some data in Nacos needs persistent storage, and Raft is used for its consistent replication, such as Service and namespace data. Nacos treats instances as fast-changing, temporary data that does not need Raft-backed storage, but services and namespaces change rarely and suit persistence. Note that the registry's home-grown Raft implementation does not guarantee log continuity.
  • In the configuration center: before 1.3.0 the configuration center could only use MySQL for storage. Starting from 1.3.0, in order to gradually move to sofa-jraft, a standard Raft implementation, Nacos borrowed etcd's design idea of turning single-node KV storage into distributed storage via the Raft protocol, and built a lightweight distributed relational database on top of sofa-jraft and Apache Derby, while retaining the ability to use an external data source. You can choose the storage scheme that fits your service requirements.

The details of the Raft protocol are not covered here; if you are interested, see the translated paper: www.infoq.cn/article/raf…

Registration and subscription of nodes

Another important aspect of a registry is how nodes register and subscribe, how heartbeat detection guards against node outages, and how subscribers receive updates in real time.

Node registration

    naming.registerInstance("microservice-mmp-marketing", "11.11.11.11", 8888, "TEST1");

This single line registers our node into the cluster TEST1. Under the hood the client does two things: it sends the registration request to the server, and it adds a heartbeat task for the instance:

As shown in red, a delayed heartbeat task is added to the thread pool, executed after 5s by default.
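A simplified sketch of this self-rescheduling beat task (the real one lives in the Nacos client's BeatReactor; the names and the HTTP call here are stubs):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // After registerInstance, the client schedules a task that reports the
    // instance to the server and then re-schedules itself (5s by default).
    public class HeartbeatSketch {

        private static final long BEAT_INTERVAL_MS = 5_000;
        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();

        public void addBeatTask(String serviceName, String ip, int port) {
            executor.schedule(new BeatTask(serviceName, ip, port),
                    BEAT_INTERVAL_MS, TimeUnit.MILLISECONDS);
        }

        private class BeatTask implements Runnable {
            private final String serviceName;
            private final String ip;
            private final int port;

            BeatTask(String serviceName, String ip, int port) {
                this.serviceName = serviceName;
                this.ip = ip;
                this.port = port;
            }

            @Override
            public void run() {
                sendBeat(serviceName, ip, port);  // report to nacos-server
                // Re-schedule: each run queues the next beat.
                executor.schedule(this, BEAT_INTERVAL_MS, TimeUnit.MILLISECONDS);
            }
        }

        private void sendBeat(String serviceName, String ip, int port) {
            // placeholder for the real HTTP beat request
        }
    }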

BeatTask

On nacos-server, ClientBeatCheckTask periodically scans each Service for instances that have not sent a heartbeat for too long; the default threshold is 15s, as shown in the red box below:
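In sketch form, the check looks roughly like this; 15s-to-unhealthy and 30s-to-delete are the Nacos defaults, while the surrounding class and field names are illustrative:

    import java.util.List;

    // Scan a service's instances and flag those whose last heartbeat is too
    // old: unhealthy after 15s of silence, removed entirely after 30s.
    public class ClientBeatCheckSketch implements Runnable {

        private static final long UNHEALTHY_TIMEOUT_MS = 15_000;
        private static final long DELETE_TIMEOUT_MS = 30_000;

        static class InstanceRecord {
            String ip;
            int port;
            volatile long lastBeatMs;
            volatile boolean healthy = true;
        }

        private final List<InstanceRecord> instances;

        ClientBeatCheckSketch(List<InstanceRecord> instances) {
            this.instances = instances;
        }

        @Override
        public void run() {
            long now = System.currentTimeMillis();
            for (InstanceRecord instance : instances) {
                long silent = now - instance.lastBeatMs;
                if (silent > UNHEALTHY_TIMEOUT_MS && instance.healthy) {
                    instance.healthy = false;  // stop routing traffic, push an update
                }
                if (silent > DELETE_TIMEOUT_MS) {
                    deregister(instance);      // remove the dead instance
                }
            }
        }

        private void deregister(InstanceRecord instance) {
            // placeholder for the real instance-removal call
        }
    }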

Node subscription

Node subscription is implemented differently in different registries, but the general approaches fall into two kinds: polling and push.

Push means that when a subscribed node changes, the server actively pushes to the subscribers. ZK is a push implementation: the client establishes a TCP long connection with the server and registers a Watcher, and when data changes the server pushes through that connection. Maintaining long connections consumes server resources heavily, so when there are too many Watchers and updates are frequent, ZooKeeper's performance drops sharply and it can even die.

Polling means that subscribers actively and periodically fetch the service's node information, compare it locally, and apply updates if anything changed. Consul also has a Watch mechanism, but unlike ZK it is implemented with HTTP long polling: Consul's server returns immediately if the request URL contains no wait parameter; otherwise it holds the request for the given wait time and returns early if a change occurs during the wait. Polling may perform well, but its real-time behavior is not great.

Nacos combines these two ideas and provides both polling and active push. Let's look at polling first, in the run method of the UpdateTask class:

Notice in the red box above that we poll ServiceInfo per Service and then update it.
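A simplified sketch of this polling loop; the version comparison and re-scheduling mirror the idea of UpdateTask, but the types and names here are illustrative:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Periodically query the server for a service's instance list and apply
    // it if it is newer than the local copy, then schedule the next poll.
    public class PollingSketch {

        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        private volatile long localVersion = -1;

        public void scheduleUpdate(String serviceName, long delayMs) {
            executor.schedule(() -> {
                ServiceSnapshot remote = queryServer(serviceName);
                if (remote.version > localVersion) {
                    localVersion = remote.version;
                    applyUpdate(remote);  // refresh local cache, fire listeners
                }
                scheduleUpdate(serviceName, delayMs);  // poll again
            }, delayMs, TimeUnit.MILLISECONDS);
        }

        static class ServiceSnapshot {
            long version;
            // instance list omitted in this sketch
        }

        private ServiceSnapshot queryServer(String serviceName) {
            return new ServiceSnapshot();  // placeholder for the HTTP query
        }

        private void applyUpdate(ServiceSnapshot snapshot) { }
    }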

For push, Nacos records subscribers in its PushService. Here is the core code of PushService:

  • Step 1: a ServiceChangeEvent triggers the push. Note that this event fires not only when a node is updated through Distro locally, but also when Distro syncs data in from other machines.
  • Step 2: obtain the subscribers maintained on the local machine. A client becomes a subscriber of whichever node it happened to query the service from, and those queries land randomly on different nacos-servers, so each node maintains part of the subscribers, with some duplication between them. Since delivery is over UDP, the cost of this duplicated maintenance is not high.
  • Step 3: generate the ackEntry, i.e. the content to send, and cache it. The cache mainly avoids repeating the compression work.
  • Step 4: finally, send the UDP packet, as in the sketch below.
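To tie the four steps together, here is an illustrative sketch of a UDP push with ack/retry; the retry budget and the 10s resend interval are assumptions, not Nacos's exact values:

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetSocketAddress;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // The server caches the payload ("ackEntry") and resends it until the
    // client acks or the retry budget runs out; polling covers the rest.
    public class UdpPushSketch {

        private static final int MAX_RETRY = 3;
        private final DatagramSocket socket;
        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        // pending pushes keyed by push id, removed when the ack arrives
        private final Map<String, Integer> pendingAcks = new ConcurrentHashMap<>();

        public UdpPushSketch() throws Exception {
            socket = new DatagramSocket();
        }

        public void push(InetSocketAddress client, String pushId, byte[] payload) {
            int attempts = pendingAcks.merge(pushId, 1, Integer::sum);
            if (attempts > MAX_RETRY) {
                pendingAcks.remove(pushId);  // give up; polling will catch up later
                return;
            }
            try {
                socket.send(new DatagramPacket(payload, payload.length, client));
            } catch (Exception ignored) {
                // sketch: the retry below covers transient send failures
            }
            // resend after 10s unless an ack cleared the entry in the meantime
            executor.schedule(() -> {
                if (pendingAcks.containsKey(pushId)) {
                    push(client, pushId, payload);
                }
            }, 10, TimeUnit.SECONDS);
        }

        public void onAck(String pushId) {
            pendingAcks.remove(pushId);  // client confirmed receipt
        }
    }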

Compared with ZooKeeper's TCP long connections, Nacos's UDP push saves a lot of resources; even a large number of node updates will not create much of a performance bottleneck for Nacos. When a client receives a UDP message it returns an ACK; if nacos-server does not receive the ACK within a certain time it resends, and after long enough it stops resending. But Nacos also keeps the periodic polling as a safety net, so there is no need to worry about data not being updated. Through these two means, Nacos both ensures real-time delivery and ensures that no update is ultimately missed.

Conclusion

Although Nacos is a young open-source project, its architecture and source code are very carefully designed. For example, the source makes heavy use of an EventBus style that decouples many processes, and the ideas behind the Distro protocol are well worth learning.

There are other details in the Nacos registry, such as selecting instances by label, which interested readers can find in the Nacos documentation.

If you found this article helpful, your follows and shares are the biggest support for me.