As a registry, Spring Cloud Alibaba Nacos not only provides service registration and service discovery functions, but also provides a mechanism for service availability monitoring. With this mechanism, Nacos can sense the health status of services and provide healthy service instances to service callers, ultimately ensuring the normal execution of business systems.
Two health check mechanisms
Nacos provides two health check mechanisms:
- Client active reporting mechanism.
- Reverse detection mechanism on the server.
How should we understand these two mechanisms? Imagine that a geological disaster strikes your area and you are buried under rubble. The rescue team must know you are there before they can rescue you. So how can they find out?
- First, you can shout “Help! Help! I am here!” so that the search and rescue team learns your location and condition.
- Second, the search and rescue team can use specialized equipment to detect that you are buried under the rubble.
These two approaches correspond to the two health check mechanisms of Nacos. With client active reporting, the client reports its health status to the Nacos server at regular intervals. With server reverse detection, the Nacos server itself probes the client to check its health.
How do I set the health check mechanism?
The health check mechanism in Nacos cannot be chosen directly. It is determined by the type of the service instance: Nacos has two types of service instances, and each corresponds to one health check mechanism:
- Temporary instances (also called non-persistent instances): use the client active reporting mechanism.
- Permanent instances (also called persistent instances): use the server reverse detection mechanism.
Why are two types of service instances needed? Take Taobao as an example. During the Double Eleven shopping rush, traffic is much higher than usual, so services need extra instances to cope with the high concurrency, and these instances are no longer needed once Double Eleven is over. Temporary instances suit this case. For the long-running instances of a service, permanent instances are more appropriate.
Client active reporting mechanism
A temporary instance reports its health status to the Nacos server every 5 seconds. The packets it sends are called heartbeat packets, and the mechanism for sending them is called the heartbeat mechanism.
If the Nacos server receives no heartbeat packet from an instance for more than 15 seconds, it marks the instance as unhealthy. If it receives none for more than 30 seconds, it removes the instance from the service list.
When running a Nacos project, you can see the heartbeat packets reported by the client in the log, as shown in the following figure:
As the figure shows, the Nacos client reports its health status every 5 seconds with a request like the following:
/nacos/v1/ns/instance/beat?app=unknown&namespaceId=public&port=8081&clusterName=DEFAULT&ip=192.168.3.72&serviceName=DEFAULT_GROUP@@spring-cloud-nacos-producer2
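The 15-second and 30-second thresholds described above can be sketched as a small state check. This is only an illustrative model built from the numbers in this article, not Nacos source code; the class name `HeartbeatStatus` and its method are hypothetical.

```java
import java.time.Duration;

/** Illustrative model of the heartbeat timeouts; hypothetical, not Nacos source code. */
final class HeartbeatStatus {
    enum State { HEALTHY, UNHEALTHY, REMOVED }

    // Thresholds taken from the text: unhealthy after 15 s, removed after 30 s.
    static final Duration UNHEALTHY_AFTER = Duration.ofSeconds(15);
    static final Duration REMOVE_AFTER = Duration.ofSeconds(30);

    /** Classifies a temporary instance by the time elapsed since its last heartbeat. */
    static State classify(Duration sinceLastBeat) {
        if (sinceLastBeat.compareTo(REMOVE_AFTER) > 0) {
            return State.REMOVED;
        }
        if (sinceLastBeat.compareTo(UNHEALTHY_AFTER) > 0) {
            return State.UNHEALTHY;
        }
        return State.HEALTHY;
    }

    public static void main(String[] args) {
        System.out.println(classify(Duration.ofSeconds(5)));   // HEALTHY
        System.out.println(classify(Duration.ofSeconds(20)));  // UNHEALTHY
        System.out.println(classify(Duration.ofSeconds(31)));  // REMOVED
    }
}
```

With a 5-second heartbeat interval, an instance has to miss roughly three consecutive beats before it is marked unhealthy, which leaves some tolerance for brief network jitter.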
Reverse detection mechanism of the server
Permanent instances rely on server-side reverse detection for their health checks. The detection period is 2000 ms plus a random number (within 5000 ms). If a probe fails, the service instance is marked as unhealthy, but unlike a temporary instance it is not deleted.
Nacos server reverse detection currently has three built-in probe protocols: HTTP, TCP, and MySQL.
Generally speaking, HTTP and TCP probes cover most health check scenarios. The MySQL probe is mainly used for special cases. For example, when primary and secondary databases are accessed by service name and the caller needs to know whether the database it reaches is the primary, the health check is a MySQL command that checks whether the database is the primary.
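The detection period described above (2000 ms plus a random number within 5000 ms) can be sketched as follows. This is a minimal illustration and not Nacos source code; the class name `ProbeSchedule` is hypothetical, and treating the random part as the half-open range from 0 up to (but not including) 5000 ms is an assumption.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative probe scheduling; hypothetical, not Nacos source code. */
final class ProbeSchedule {
    static final long BASE_DELAY_MS = 2000;  // fixed part, from the text
    static final long MAX_JITTER_MS = 5000;  // upper bound of the random part, from the text

    /** Returns the delay before the next probe: 2000 ms plus a random value in [0, 5000). */
    static long nextDelayMs() {
        return BASE_DELAY_MS + ThreadLocalRandom.current().nextLong(MAX_JITTER_MS);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("next probe in " + nextDelayMs() + " ms");
        }
    }
}
```

A plausible reason for the random part is that it spreads probes over time, so the server does not check every permanent instance at the same moment.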
TCP detection
By default, permanent (persistent) instances use TCP probes, which can be seen in the Nacos console, as shown below:
By default, the instance's IP and port are used for the check, as shown in the following figure:
The general logic of TCP detection is to establish a channel to the registered instance and repeatedly probe its port to check whether the instance is healthy.
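The idea of a TCP probe can be sketched with a plain socket connection attempt. This is a simplified, blocking illustration under the assumption that "healthy" means "the port accepts a connection within a timeout"; it is not how the Nacos server is implemented internally, and the class name `TcpProbe` is hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Illustrative TCP health probe; hypothetical, not Nacos source code. */
final class TcpProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isPortOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, unreachable host, and timeouts all count as unhealthy.
            return false;
        }
    }

    public static void main(String[] args) {
        // Port 8081 is the instance port from the heartbeat example; adjust as needed.
        System.out.println(isPortOpen("127.0.0.1", 8081, 500));
    }
}
```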
HTTP detection
HTTP probes need to be configured manually in the Nacos console, as shown below:
We then add the implementation of the probe interface to the service instance.
At this point, we restart the service instance. In the service details, we can see that the configured HTTP probe has taken effect and the instance is healthy, as shown in the following figure:
The Nacos server decides whether the instance is healthy by checking whether the HTTP interface returns a 200 status code.
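The following is a minimal, JDK-only sketch of both sides of an HTTP probe: a service exposing a probe endpoint, and a check that treats a 200 status code as healthy. It is not the article's original probe implementation (which is not reproduced here) and not Nacos source code. The path `/healthCheck` and the class name `HttpProbeDemo` are hypothetical; a real Spring Cloud service would more likely expose the endpoint through a controller.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Illustrative HTTP probe endpoint and checker; hypothetical, not Nacos source code. */
final class HttpProbeDemo {
    /** Starts a tiny service whose (hypothetical) /healthCheck endpoint answers 200. */
    static HttpServer startService(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/healthCheck", exchange -> {
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Mirrors the rule in the text: healthy only if the probe URL returns status 200. */
    static boolean isHealthy(String url, int timeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return false;  // unreachable or timed out: treat as unhealthy
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startService(0);  // port 0 lets the OS pick a free port
        String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/healthCheck";
        System.out.println(isHealthy(url, 1000));
        server.stop(0);
    }
}
```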
Health check mechanism of the cluster
The health check mechanism in a cluster can be summed up in one phrase: “each node does its own job”. Each service corresponds to one responsible (primary) registry node. For temporary instances, that node receives the heartbeat packets and then synchronizes the health status to the other registry nodes. Permanent instances work similarly: each service also corresponds to one responsible registry node, which performs the probing. When the responsible node detects a change in the health status of a service instance, it synchronizes that status to the other registry nodes, which is how health checking works across the cluster.
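One simple way to picture "each service corresponds to one responsible registry node" is to map the service name to a node by hashing. The sketch below assumes a plain hash-modulo assignment purely for illustration; the article does not say how Nacos actually chooses the node, and the class name `ResponsibleRegistry` and the node names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative service-to-node assignment; hypothetical, not Nacos source code. */
final class ResponsibleRegistry {
    /** Picks the registry node responsible for a service by hashing the service name. */
    static String responsibleNode(String serviceName, List<String> nodes) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(serviceName.hashCode(), nodes.size());
        return nodes.get(index);
    }

    /** A node handles heartbeats or probes for a service only if it is the responsible one. */
    static boolean isResponsible(String self, String serviceName, List<String> nodes) {
        return self.equals(responsibleNode(serviceName, nodes));
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("nacos-1", "nacos-2", "nacos-3");
        String service = "DEFAULT_GROUP@@spring-cloud-nacos-producer2";
        System.out.println(service + " is handled by " + responsibleNode(service, nodes));
    }
}
```

Because every node computes the same mapping, exactly one node checks a given service while the others wait for the synchronized health status. A plain modulo reassigns many services whenever the node list changes, which is one reason to treat this only as a conceptual model.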
Conclusion
Nacos provides two health check mechanisms: client-side active reporting for temporary instances and server-side reverse detection for permanent instances. A temporary instance sends a heartbeat packet to the Nacos server every 5 seconds; the server receives it and synchronizes the health status to the other registry nodes. Permanent instances support three detection protocols: TCP, HTTP, and MySQL. The default is TCP, which means the server repeatedly probes the instance's IP and port to check whether the instance is healthy.