HashiCorp recently announced on its official website that its enterprise products and software, including Consul, may not be used, deployed or installed in China. Do domestic vendors offer a comparable service? The answer is yes. Let's take a look at Huawei Cloud ServiceStage.

In recent years, more and more enterprises have adopted microservices. Along the way they must choose a microservice development framework. Whether they build one in-house or pick a third-party framework, two questions have to be answered: is the framework reliable enough that the business is never interrupted, and is its communication fast enough that performance does not degrade when services move from a monolithic architecture to a microservice architecture? This article introduces how ServiceComb, Huawei's open source microservice framework, helps enterprise applications quickly gain high-performance communication and reliable service management, covering two modules: the service management center and communication processing. This first part introduces the service management center.

Overview of ServiceCenter

ServiceCenter is a microservice component for registering and discovering microservice instances. It provides a set of standard RESTful APIs for managing microservice metadata. ServiceComb's microservice registration and dynamic discovery capabilities are built on it.

In addition to the dynamic discovery described above, ServiceCenter gives applications the following capabilities:

1. Instance caching mechanism

When a microservice built on the SDK consumes a Provider microservice for the first time, it performs an instance discovery operation: the SDK sends a request to ServiceCenter, pulls the Provider's current instance set and saves it in an in-memory cache. Subsequent consumption requests use the cached instance set and are sent to one of the Provider's instances according to custom routing logic.

The advantage of this approach is that an SDK process that is already running always holds an instance cache. If the connection to ServiceCenter is lost, the SDK temporarily cannot see instance changes or refresh the cache, but reconnecting to ServiceCenter triggers a cache refresh, so the cache eventually becomes correct again. Throughout this process the SDK keeps the business available.
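The consumer-side behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the ServiceComb SDK: the class `InstanceCache`, the exception `RegistryUnavailable` and the `fetch_instances` callable are hypothetical names standing in for the real call to ServiceCenter, and round-robin is used only as a placeholder for the custom routing logic.

```python
import itertools


class RegistryUnavailable(Exception):
    """Hypothetical error: the registry (ServiceCenter) cannot be reached."""


class InstanceCache:
    """Consumer-side cache of Provider instances (illustrative sketch only)."""

    def __init__(self, fetch_instances):
        # fetch_instances(service_name) -> list of endpoints. It stands in for
        # the real request to ServiceCenter and may raise RegistryUnavailable.
        self._fetch = fetch_instances
        self._cache = {}
        self._counter = itertools.count()

    def refresh(self, service_name):
        """Pull the current instance set; keep the old set if the pull fails."""
        try:
            self._cache[service_name] = list(self._fetch(service_name))
            return True
        except RegistryUnavailable:
            return False

    def pick(self, service_name):
        """Return one endpoint; round-robin is a placeholder routing policy."""
        if service_name not in self._cache:
            # First consumption: perform instance discovery.
            if not self.refresh(service_name):
                raise RegistryUnavailable(service_name)
        instances = self._cache[service_name]
        if not instances:
            raise LookupError(f"no instances known for {service_name}")
        return instances[next(self._counter) % len(instances)]
```

The point of the sketch is that a failed `refresh` leaves the previous instance set in place, so `pick` keeps working while the registry is unreachable, and a later successful `refresh` replaces the stale set.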

2. Asynchronous caching

ServiceCenter does not store data itself. If it were designed as a proxy that simply forwards external requests to etcd, the design would be unreliable: once the back-end service fails or the network to it fails, ServiceCenter becomes unavailable and clients can no longer pull or refresh instance information. For this reason ServiceCenter has included a cache mechanism from the start of its design.

1) At startup, ServiceCenter establishes a long-lived connection (watch) to etcd and monitors resource changes in real time.

2) Before each watch, ServiceCenter performs a full list query of the resources, so that changes occurring in the connection time window, which the watch cannot observe, are not missed.

3) While running, the resource changes obtained through List & Watch are compared with the local cache, and the cache is refreshed.

4) Instance discovery and static data queries for microservices are served from the local cache first.

Because the cache is refreshed asynchronously, synchronization between ServiceCenter and etcd is decoupled from the read path, so read requests from microservices to ServiceCenter are essentially never blocked when etcd is unavailable. Between the moment a resource changes and the moment ServiceCenter receives the watch event, the data it presents externally is slightly stale, but this delay is tolerable and the data is eventually consistent. This design greatly improves the throughput of ServiceCenter while keeping it highly available.
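The List & Watch steps above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `FakeStore` is a hypothetical stand-in for etcd (a key-value map with a revisioned event log), `LocalCache` is a hypothetical stand-in for ServiceCenter's cache, and the watch is modelled as an explicit `poll_watch` call rather than a background connection so that the behaviour stays deterministic.

```python
class FakeStore:
    """Hypothetical stand-in for etcd: key-value data plus a revisioned event log."""

    def __init__(self):
        self.data = {}
        self.events = []  # tuples of (revision, op, key, value)
        self.revision = 0

    def put(self, key, value):
        self.revision += 1
        self.data[key] = value
        self.events.append((self.revision, "PUT", key, value))

    def delete(self, key):
        self.revision += 1
        self.data.pop(key, None)
        self.events.append((self.revision, "DELETE", key, None))

    def list(self):
        """Full snapshot together with the revision it corresponds to."""
        return dict(self.data), self.revision

    def events_after(self, revision):
        """Events newer than the given revision (what a watch would deliver)."""
        return [event for event in self.events if event[0] > revision]


class LocalCache:
    """Hypothetical stand-in for ServiceCenter's local cache."""

    def __init__(self, store):
        self._store = store
        self._items = {}
        self._revision = 0

    def resync(self):
        """Full list before (re)starting the watch, so no change is missed."""
        self._items, self._revision = self._store.list()

    def poll_watch(self):
        """Apply the events seen since the last known revision to the cache."""
        for revision, op, key, value in self._store.events_after(self._revision):
            if op == "PUT":
                self._items[key] = value
            else:
                self._items.pop(key, None)
            self._revision = revision

    def get(self, key):
        """Reads are served from the local cache only, never from the store."""
        return self._items.get(key)
```

In this model `get` never touches the store, which mirrors why reads stay available when the backing store is down; the cost is that a change becomes visible only after the next `poll_watch`.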

3. Self-protection mechanism

The cache mechanism described above keeps ServiceCenter readable when a network partition separates it from etcd. The self-protection mechanism keeps services available when a network partition separates the Providers from ServiceCenter.

Assume the following scenario: for some reason the network between most Providers and ServiceCenter is partitioned, and the Providers can no longer report their heartbeats. ServiceCenter then sees a large number of Provider instances age out and go offline, and it pushes these offline events to most Consumers on the network. As a result, user services break down. This would be disastrous for ServiceCenter and for the entire microservice framework.

To solve this problem, ServiceCenter needs a self-protection mechanism:

1) If, within a time window, ServiceCenter observes etcd offline events for 80% of the instances, it immediately starts the self-protection mechanism.

2) During the protection period, all offline events are held in a pending notification queue.

3) During the protection period, when ServiceCenter receives registration information reported by an instance, it removes that instance from the queue. Otherwise, once the instance's lease expires, it pushes the instance offline notification to the Consumer services.

4) When the queue is empty, the self-protection mechanism is turned off.
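The four steps above can be sketched as a small state machine. This is an illustration only, not ServiceCenter's code: the class `SelfProtection` and its method names are hypothetical, time is passed in explicitly instead of being read from a clock, and the 80% rule is interpreted here as "at least 80% of the known instances in one window", which is an assumption.

```python
class SelfProtection:
    """Illustrative sketch of the self-protection steps (hypothetical names)."""

    def __init__(self, total_instances, threshold=0.8):
        self.total = total_instances
        self.threshold = threshold  # assumed meaning: at least 80% of instances
        self.active = False
        self.pending = {}  # instance id -> lease expiry time

    def on_offline_events(self, instance_ids, lease_expiry):
        """Handle the offline events of one time window.

        Returns the ids whose offline notification can be pushed right away.
        """
        ids = list(instance_ids)
        if not self.active and len(ids) >= self.threshold * self.total:
            self.active = True  # step 1: start self-protection
        if self.active:
            for instance_id in ids:
                self.pending[instance_id] = lease_expiry  # step 2: hold events
            return []
        return ids

    def on_register(self, instance_id):
        """Step 3a: an instance reported in again, so drop its pending event."""
        self.pending.pop(instance_id, None)
        self._maybe_close()

    def on_tick(self, now):
        """Step 3b: return the ids whose lease has expired and must be notified."""
        expired = sorted(i for i, expiry in self.pending.items() if expiry <= now)
        for instance_id in expired:
            del self.pending[instance_id]
        self._maybe_close()
        return expired

    def _maybe_close(self):
        # Step 4: leave self-protection once nothing is pending.
        if self.active and not self.pending:
            self.active = False
```

Holding the events instead of dropping them means a genuinely dead instance is still reported to Consumers, only later, once its lease has expired without a new registration.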

With the self-protection mechanism enabled, even if all data stored in etcd is lost, the data can be rebuilt automatically between the SDKs and ServiceCenter without interrupting services in such an extreme scenario. Although this recovery is lossy, it keeps the business mostly available during such a disaster.

This is the high-reliability design of ServiceComb's service management center in a distributed system. In large-scale, high-concurrency enterprise applications, a reliable service management center makes the distributed system run more stably, and high-performance communication lets it handle intensive workloads more efficiently.

