Preface

In the previous article, "Microservices: Dissecting the source Code to make Health Checks for Nacos So Easy", we covered two ways to soften the impact when a microservice fails suddenly: shortening the health check interval and retrying failed requests. A friend who read it suggested that I also cover how to take a microservice offline gracefully during a normal shutdown.

Why does "graceful" matter? In distributed applications, the clients of registries such as Nacos and Eureka cache the instance list in order to satisfy the A (availability) in the CAP theorem. When an application shuts down normally, it can actively deregister itself from the registry, but the instance lists cached by clients still take some time to expire.

As a result, requests can still be routed to instances that have already been shut down. A retry mechanism can mask the problem, but every retry slows down the user's request to some extent. This is where graceful offlining comes in.
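
The stale cache window described above can be illustrated with a small simulation. This is only a sketch: the Registry and CachingClient classes below are hypothetical stand-ins, not the Nacos or Eureka client API, and the addresses are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StaleCacheSketch {

    /** Hypothetical in-memory registry standing in for Nacos or Eureka. */
    static class Registry {
        private final Set<String> instances = new LinkedHashSet<>();

        void register(String address) { instances.add(address); }

        void deregister(String address) { instances.remove(address); }

        List<String> list() { return new ArrayList<>(instances); }
    }

    /** Hypothetical client that answers lookups from a local copy of the instance list. */
    static class CachingClient {
        private final Registry registry;
        private List<String> cached = new ArrayList<>();

        CachingClient(Registry registry) { this.registry = registry; }

        /** A real client would do this on a timer or when the registry pushes a change. */
        void refresh() { cached = registry.list(); }

        List<String> instances() { return cached; }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("10.0.0.1:8081");
        registry.register("10.0.0.2:8081");

        CachingClient client = new CachingClient(registry);
        client.refresh();

        // The second instance shuts down and deregisters itself.
        registry.deregister("10.0.0.2:8081");

        // Until the next refresh the client still routes to the dead instance.
        System.out.println("before refresh: " + client.instances());
        client.refresh();
        System.out.println("after refresh:  " + client.instances());
    }
}
```

Between deregister and the next refresh, the client still lists the instance that has gone away. That gap is the window a graceful offline procedure has to cover.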

Let’s start with a few common ways to shut down processes.

Method 1: Run the kill command

Spring Cloud has built-in support for shutting down services. When a process is stopped with the kill command, a shutdown hook is invoked that deregisters the current instance. Usage:

kill <Java process ID>

This method relies on the shutdown hook mechanism (essentially provided by Spring Boot; the Spring Cloud service discovery components implement the actual deregistration). Before the service stops, it deregisters from the registry, such as Nacos or Eureka. However, this only notifies the registry; clients may not notice the change until their cache refreshes a few seconds later (Nacos defaults to 5 seconds).

The shutdown hook is triggered not only by the kill command but also by a normal program exit, a call to System.exit(), Ctrl + C in the terminal, and so on. It is not triggered by a forced termination such as kill -9 or by a server crash.
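
For readers unfamiliar with the underlying JVM mechanism, here is a minimal sketch of a shutdown hook in plain Java. It is not Spring's implementation; the "deregistration" is simulated with a flag and a log line, and the class and thread names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {

    /** Simulated registry state; a real service would call its registry client here. */
    static final AtomicBoolean deregistered = new AtomicBoolean(false);

    /** Builds the hook thread that the JVM runs on an orderly shutdown. */
    static Thread buildHook() {
        return new Thread(() -> {
            deregistered.set(true);
            System.out.println("deregistered from registry (simulated)");
        }, "deregister-hook");
    }

    public static void main(String[] args) {
        // Runs on normal exit, System.exit(), Ctrl + C, or kill <pid>; not on kill -9.
        Runtime.getRuntime().addShutdownHook(buildHook());
        System.out.println("service running, exiting normally now");
    }
}
```

Running main prints the "service running" line and then the hook's message when the JVM exits. Terminating the same process with kill -9 would skip the hook entirely.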

While this is better than a crashed service that the registry only notices after 15 seconds, it does not solve the client caching problem itself and is not recommended.

Method 2: Based on the /shutdown endpoint

Spring Boot provides the /shutdown endpoint, which can also be used for a graceful shutdown. It is essentially the same as the first approach, because it too relies on the shutdown hook. Once the shutdown hook logic has run, the service stops, but the client cache problem remains, so this approach is not recommended either.

This approach first requires adding the corresponding dependency to the project:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Then enable the /shutdown endpoint in the project configuration:

management:
  endpoint:
    shutdown:
      enabled: true
  endpoints:
    web:
      exposure:
        include: shutdown

Then stop the service with a curl command (the endpoint only accepts POST requests):

curl -X POST http://<instance address>/actuator/shutdown

Method 3: Based on the /pause endpoint

A /pause endpoint is also available through Spring Boot Actuator. Calling /pause changes an instance whose /health status is UP to DOWN.

The basic operation is to enable the pause endpoint in the configuration file:

management:
  endpoint:
    # Enable the pause endpoint
    pause:
      enabled: true
    # In some versions the pause endpoint depends on the restart endpoint
    restart:
      enabled: true
  endpoints:
    web:
      exposure:
        include: pause,restart

Then send a curl request to the endpoint to take the service out of rotation. Note that it must be a POST request.

Support for the /pause endpoint varies greatly between versions. With Spring Boot 2.4.2.RELEASE, I found that it had no effect at all. After checking the issues of the Spring Boot and Spring Cloud projects, I found that the problem has existed since 2.3.1.RELEASE. The cause appears to be that recent releases manage the web server through SmartLifecycle, and Spring Cloud seems to have given up supporting the endpoint (this remains to be confirmed). In the latest release, calling the /pause endpoint produces no response.

Because of these changes, using the /pause endpoint to take microservices offline is not recommended, but the idea behind it is worth learning from.

The basic idea is that calling the /pause endpoint changes the state of the microservice from UP to DOWN while the service itself remains available. Once the microservice is marked DOWN, it is removed from the registry. After waiting for a period of time (say 5 seconds) so that the instance lists cached by Nacos clients are updated, the service is stopped.

In other words, cut off traffic to the microservice first, then shut it down or redeploy it. This solves the problem of clients holding a stale instance list during a normal release.

Based on this idea, we can implement the feature ourselves. For example, provide a Controller with a method that deregisters the current instance from Nacos, call it, wait 5 seconds, and then shut the service down with a script or some other means.
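
The deregister, wait, stop sequence can be sketched in plain Java as follows. This is an illustration under stated assumptions, not a ready-made Spring component: the Registry and Stopper interfaces are hypothetical placeholders for a registry client call and for whatever actually stops the process, and the instance ID is made up.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class GracefulOfflineSketch {

    /** Hypothetical wrapper around the registry client, for example a Nacos deregistration call. */
    interface Registry {
        void deregister(String instanceId);
    }

    /** Hypothetical action that actually stops the service (script, endpoint call, and so on). */
    interface Stopper {
        void stop();
    }

    /** Deregister first, give client caches time to refresh, then stop. */
    static void offline(Registry registry, Stopper stopper, String instanceId, Duration drain) {
        registry.deregister(instanceId);
        try {
            // The instance keeps serving in-flight and stale-cache traffic during this window.
            Thread.sleep(drain.toMillis());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fall through to stopping the service.
            Thread.currentThread().interrupt();
        }
        stopper.stop();
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        offline(id -> events.add("deregister " + id),
                () -> events.add("stop"),
                "order-service-1",
                Duration.ofMillis(100)); // the article suggests about 5 seconds in practice
        System.out.println(events);
    }
}
```

The ordering is the important part: the instance disappears from the registry while it is still able to answer requests, and it is stopped only after the drain period has passed.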

Method 4: Based on the /service-registry endpoint

It would be better if Spring Cloud supported the approach described in Method 3 directly. As it happens, it does: Spring Cloud provides the /service-registry endpoint. As the name suggests, it is an endpoint dedicated to service registration.

Enable the /service-registry endpoint in the configuration file:

management:
  endpoints:
    web:
      exposure:
        include: service-registry
      base-path: /actuator
  endpoint:
    serviceregistry:
      enabled: true

Visiting http://localhost:8081/actuator lists the exposed endpoints:

{
    "_links": {
        "self": {
            "href": "http://localhost:8081/actuator",
            "templated": false
        },
        "serviceregistry": {
            "href": "http://localhost:8081/actuator/serviceregistry",
            "templated": false
        }
    }
}

Use a curl command to change the service status:

curl -X "POST" "http://localhost:8081/actuator/serviceregistry?status=DOWN" -H "Content-Type: application/vnd.spring-boot.actuator.v2+json;charset=UTF-8"
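
If the status change needs to be triggered from a deployment tool rather than a shell, the same request can be built with the JDK's java.net.http client (Java 11 or later). This is a sketch that mirrors the curl command above; the class and method names are made up, and the base URL is assumed to be the local instance used in this article.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ServiceRegistryStatusSketch {

    /** Builds the POST request that asks the instance to change its registry status. */
    static HttpRequest statusRequest(String baseUrl, String status) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/serviceregistry?status=" + status))
                .header("Content-Type", "application/vnd.spring-boot.actuator.v2+json;charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires the service from this article to be running on localhost:8081.
        HttpRequest request = statusRequest("http://localhost:8081", "DOWN");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
    }
}
```

Passing "UP" instead of "DOWN" would bring the instance back online in the same way.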

Before running the command, look at the instance's status in the Nacos console: the button in the instance details reads “offline”, which means the instance is currently UP. After you run the curl command, the button changes to “go online”, which means the instance has been taken offline.

The command has the same effect as manually taking an instance offline or online in the Nacos admin console.

Of course, the above is based on the combination of Spring Cloud and Nacos. In essence, Spring Cloud defines a specification: every registry integration should implement the ServiceRegistry interface. On top of the ServiceRegistry abstraction, it also defines a generic Endpoint:

@Endpoint(id = "serviceregistry")
public class ServiceRegistryEndpoint {

    private final ServiceRegistry serviceRegistry;

    private Registration registration;

    public ServiceRegistryEndpoint(ServiceRegistry<?> serviceRegistry) {
        this.serviceRegistry = serviceRegistry;
    }

    public void setRegistration(Registration registration) {
        this.registration = registration;
    }

    @WriteOperation
    public ResponseEntity<?> setStatus(String status) {
        Assert.notNull(status, "status may not by null");

        if (this.registration == null) {
            return ResponseEntity.status(HttpStatus.NOT_FOUND).body("no registration found");
        }

        this.serviceRegistry.setStatus(this.registration, status);
        return ResponseEntity.ok().build();
    }

    @ReadOperation
    public ResponseEntity getStatus() {
        if (this.registration == null) {
            return ResponseEntity.status(HttpStatus.NOT_FOUND).body("no registration found");
        }

        return ResponseEntity.ok().body(this.serviceRegistry.getStatus(this.registration));
    }
}

The endpoint we called above is implemented by this code. So not only Nacos but any registry integrated through Spring Cloud can, in essence, take services offline this way.
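
To show why the abstraction makes the endpoint registry-agnostic, here is a deliberately simplified sketch without any Spring dependencies. SimpleServiceRegistry is a made-up stand-in and not Spring Cloud's actual ServiceRegistry interface; the in-memory implementation and the instance ID are hypothetical as well.

```java
import java.util.HashMap;
import java.util.Map;

public class RegistryAbstractionSketch {

    /** Simplified stand-in for the registry abstraction; not the real Spring Cloud interface. */
    interface SimpleServiceRegistry {
        void setStatus(String instanceId, String status);

        String getStatus(String instanceId);
    }

    /** Hypothetical in-memory implementation; a real one would talk to Nacos, Eureka, and so on. */
    static class InMemoryRegistry implements SimpleServiceRegistry {
        private final Map<String, String> statuses = new HashMap<>();

        @Override
        public void setStatus(String instanceId, String status) {
            statuses.put(instanceId, status);
        }

        @Override
        public String getStatus(String instanceId) {
            // Instances are treated as UP until someone marks them otherwise.
            return statuses.getOrDefault(instanceId, "UP");
        }
    }

    /** Endpoint-like class that depends only on the interface, never on a concrete registry. */
    static class StatusEndpoint {
        private final SimpleServiceRegistry registry;
        private final String instanceId;

        StatusEndpoint(SimpleServiceRegistry registry, String instanceId) {
            this.registry = registry;
            this.instanceId = instanceId;
        }

        void write(String status) { registry.setStatus(instanceId, status); }

        String read() { return registry.getStatus(instanceId); }
    }

    public static void main(String[] args) {
        StatusEndpoint endpoint = new StatusEndpoint(new InMemoryRegistry(), "order-service-1");
        System.out.println(endpoint.read());
        endpoint.write("DOWN");
        System.out.println(endpoint.read());
    }
}
```

Swapping InMemoryRegistry for another implementation leaves StatusEndpoint untouched, which is the same reason the generic endpoint works across registries.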

Summary

Many projects are gradually migrating to microservices, but once on a microservice architecture they face more complicated situations. This article used graceful offlining with Nacos in the Spring Cloud ecosystem to analyze one common problem in microservice practice and its solutions. If you are running microservices, have you paid attention to this? If you want to learn more about microservices in practice, feel free to follow along.

Nacos series

  • Source code analysis of Spring Cloud Integration Nacos Service Discovery?
  • Want to learn the service discovery of micro-service? Let’s learn some popular science knowledge first.
  • Nacos, the Soul Ferryman of Microservices, here is a complete overview of the principle.
  • You are also interested in reading source code. Tell me how I read Nacos source code.
  • “Learn Nacos? Let’s get the service up first, Practical tutorial”
  • Microservices: What if Nacos didn’t react when the Service hung too crisp?
  • Microservices: Poking fun at the Crazy output of Nacos Logs
  • An Easy example of Spring Cloud integration with Nacos
  • Microservices: Dissecting the source Code to make Health Checks for Nacos So Easy

About the blogger: Author of the technology book SpringBoot Inside Technology, loves to delve into technology and writes technical articles.

Public account: “program new vision”, the blogger’s public account, welcome to follow ~

Technical exchange: Please contact the weibo user at Zhuan2quan