With the rise of a new generation of application platforms built around containers, microservices are showing new vitality.

What exactly are microservices? How did they develop? How should they be used in practice? In this issue, we take the evolution of China Unicom’s microservice architecture as an example and discuss how the service mesh gained a foothold in the history of China Unicom’s microservices.

China United Network Communications Co., Ltd. Software Research Institute (hereinafter the China Unicom Software Research Institute) has carried out long-term research, development, and application practice in microservice technology, with significant benefits. However, having to support many different microservice architectures caused considerable trouble. After investigating and evaluating the options in the industry, the institute settled on Service Mesh as the evolution direction for its microservices, established a microservice R&D team, and completed development of the Chinaunicom Service Mesh (CSM) product.

To strengthen mesh and microservice governance capabilities, the China Unicom Software Research Institute and Baidu Intelligent Cloud jointly developed service mesh governance capabilities and integrated them into the CSM product. Drawing on Baidu Intelligent Cloud’s service mesh experience in the Baidu App, Baidu Maps, and other businesses, the two sides are advancing the pilot rollout and large-scale production use of CSM within China Unicom, and continue to accumulate service mesh experience in their respective business scenarios.

With service mesh technology, microservice capabilities sink into the “infrastructure layer”. This unifies the microservice technology stack, makes the technical architecture cloud-native, aligns microservice capabilities across multi-language scenarios, lets services be onboarded and monitored non-intrusively, and greatly improves the service governance experience.

What is a service mesh?

>> Microservices 1.0 phase:

In this phase, business services must actively “depend on an SDK” to obtain basic microservice capabilities (such as circuit breaking, load balancing, and rate limiting). These capabilities are therefore bundled with the business application and depend strongly on the programming language. For example, a C++ microservice SDK cannot be used directly in a Java service.
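
To make the coupling concrete, here is a minimal sketch of circuit breaking implemented inside the business process, the way an SDK would bundle it. The class name, thresholds, and half-open behavior are illustrative assumptions, not taken from any China Unicom SDK.

```python
import time


class CircuitBreaker:
    """Toy in-process circuit breaker (hypothetical, for illustration only)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock              # injectable so the logic can be tested
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

Because this logic lives in the same process as the business code, every language needs its own copy of it, which is the dependency the Sidecar model removes.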

>> Microservices 2.0 phase:

Following the idea of sinking capabilities into the “infrastructure”, microservice capabilities are no longer implemented in an SDK but in an “independent Sidecar”. The Sidecar runs as a separate process from the business process, each in its own container, which naturally removes the language dependency in multi-language microservice scenarios.

The network plane formed by these Sidecars is called the “service mesh”.

What is distinctive about the development of China Unicom’s microservice architecture?

The development of China Unicom’s microservice architecture has gone through the following stages:

  • Phase 1: virtual machines and RPC frameworks;
  • Phase 2: containers, Spring, and self-developed service frameworks;
  • Phase 3: Service Mesh technology, with K8s and Istio as the main target architecture.

Having established the service mesh as its target architecture, China Unicom considers not only service mesh technology itself in its implementation practice but also a compatible migration path for its existing microservice architectures.

China Unicom’s service mesh in practice

>> Migrating the RPC architecture to the service mesh

A review of the existing microservice businesses showed that many of them rely on SDKs, with RPC frameworks as the typical example. Their main technical characteristics and business requirements come down to three points:

  • Business code is written against interfaces and methods;
  • The SDK uses an interface-level service discovery mechanism;
  • The business wants to migrate to the service mesh without changing existing business code.

Taking into account China Unicom’s current business situation and the evolution route of the service mesh, we held several rounds of discussion and review with Baidu Intelligent Cloud and jointly settled on the following migration scheme:

▲ Reduce migration costs

Given what the business side needs from migration, the migration scheme had to ensure that the business “does not change the business code”. A dynamic proxy stays compatible with the business’s existing interface/method style of coding, so the business only needs to add a few lines of annotations to migrate, which greatly reduces the migration cost.
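
The dynamic-proxy idea can be sketched as follows. This is a Python analog under stated assumptions, not the CSM implementation: `OrderService`, `MeshProxy`, the request shape, and `fake_transport` are all hypothetical names. The point is only that callers keep invoking interface methods while the proxy turns each call into a request addressed to an application.

```python
class OrderService:
    """Existing business interface; callers keep using these methods."""

    def get_order(self, order_id):
        raise NotImplementedError


class MeshProxy:
    """Intercepts interface method calls and forwards them as requests."""

    def __init__(self, interface, app_name, transport):
        self._interface = interface
        self._app_name = app_name
        self._transport = transport

    def __getattr__(self, method):
        # Only methods declared on the interface may be proxied.
        if not callable(getattr(self._interface, method, None)):
            raise AttributeError(method)

        def invoke(*args):
            request = {
                "app": self._app_name,  # application-level target
                "interface": self._interface.__name__,
                "method": method,
                "args": list(args),
            }
            return self._transport(request)

        return invoke


def fake_transport(request):
    # Stand-in for the network hop through the local Sidecar (assumption).
    return {
        "handled_by": request["app"],
        "call": request["interface"] + "." + request["method"],
        "args": request["args"],
    }


orders = MeshProxy(OrderService, "order-app", fake_transport)
reply = orders.get_order(42)
```

In the Java setting the article describes, the annotation would play the role of the `MeshProxy(...)` line here: it declares which interface to proxy, and the calling code stays unchanged.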

▲ Reduce redundant registration data

Service discovery in cloud-native technology is application-level, so the service discovery mechanism is changed from interface-level to application-level, which makes the migrated architecture more cloud-native. This change also effectively reduces redundant data in the registry and eases the load on it.
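
A small sketch shows why the registry shrinks. The data model here (one record per interface and instance versus one record per application and instance) is a simplifying assumption, and the application and interface names are made up.

```python
def interface_level_entries(apps):
    """One registry record per (interface, instance) pair."""
    return [
        (interface, instance)
        for app in apps
        for interface in app["interfaces"]
        for instance in app["instances"]
    ]


def application_level_entries(apps):
    """One registry record per (application, instance) pair."""
    return [
        (app["name"], instance)
        for app in apps
        for instance in app["instances"]
    ]


# Hypothetical application: 20 interfaces served by 3 instances.
apps = [
    {
        "name": "order-app",
        "interfaces": ["com.example.OrderApi%d" % i for i in range(20)],
        "instances": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    }
]

interface_count = len(interface_level_entries(apps))
application_count = len(application_level_entries(apps))
```

For this example the interface-level registry holds 60 records and the application-level registry holds 3, because the instance list is no longer repeated once per interface.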

>> Migrating the Mesos architecture to the service mesh

At present, some business microservice architectures use Mesos + Marathon for resource scheduling and Spring Cloud for service governance. This architecture has the following characteristics:

  • Some microservice governance capabilities (such as circuit breaking and rate limiting) are implemented in an SDK;
  • Resource scheduling and load balancing are implemented by Mesos and Marathon LB;
  • Governance capabilities are scattered across multiple governance components.

The goals were a smooth migration to the service mesh without changing existing business code, and mutual access between migrated and not-yet-migrated services within existing applications. Working with Baidu Intelligent Cloud, we developed a migration scheme that satisfied the business side. The scheme is as follows:

▲ Reduce migration costs

Here too, the business does not need to modify its business code. SDK V2 (Thin SDK) is produced by removing the microservice capabilities from SDK V1 (Fat SDK) and making it compatible with the service mesh architecture; those capabilities are then provided uniformly by the infrastructure Sidecar. To accommodate the Marathon LB in the existing Mesos architecture, existing business logic is migrated smoothly using a small number of centrally managed configuration rules.

▲ Cloud-native transformation

By replacing the Mesos infrastructure with the K8s and Istio technology stack, the migrated architecture becomes more cloud-native and aligns with the industry’s mainstream service mesh architecture.

Building observability in service mesh scenarios

To promote the migration of existing businesses to the service mesh, observability must be filled in for both mesh and non-mesh businesses, while keeping the migration transparent to the business.

>> Truly non-intrusive Mesh onboarding for services

Service mesh technology achieves non-intrusiveness mainly by sinking capabilities into the infrastructure, placing microservice capabilities such as routing, rate limiting, and circuit breaking in the Sidecar.

The one remaining intrusion into the business is “trace header passthrough”. What is header passthrough? In short, business code must itself forward the trace headers generated by the Sidecar onto its outbound calls; otherwise the trace will be incomplete.
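
The manual step described above can be sketched as a helper that copies trace headers from the inbound request onto every outbound call. The header names are an assumption: they are the `x-request-id` and B3 headers commonly propagated in Envoy/Istio deployments, and the article does not say which headers CSM uses.

```python
# Assumed header set (B3 style); the actual CSM header list is not given.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def passthrough_headers(inbound_headers):
    """Return the trace headers that must be copied onto outbound calls."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in inbound_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


inbound = {
    "Content-Type": "application/json",
    "X-Request-Id": "req-1",
    "X-B3-TraceId": "463ac35c9f6413ad",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
}
outbound = passthrough_headers(inbound)
```

If the business forgets to attach `outbound` to a downstream request, the Sidecar on the next hop sees no trace context and starts a new trace, which is how traces end up broken.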

For the large number of Java services, a Java Agent (a bytecode enhancement technology) implements trace header passthrough at the bytecode level, so services need no modification and need not be aware of passthrough at all.

>> Method-level monitoring for more complete monitoring data

Service mesh technology has an inherent weakness in monitoring: monitoring information can only be generated by the Sidecar, so method-level execution details inside the business are invisible.

For the large number of Java services, a Java Agent (a bytecode enhancement technology) collects method-level execution details inside the service. Combined with Sidecar-level monitoring information, this yields a complete trace from Sidecar monitoring down to method-level monitoring inside the service.
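
As a rough Python analog of what such an agent records, the sketch below wraps a method so that each call produces a span tagged with the trace ID that arrived through the Sidecar. A Java Agent would inject the equivalent logic through bytecode instrumentation with no source changes; the decorator, the `SPANS` list, and the `x-b3-traceid` header name are assumptions made for illustration.

```python
import contextvars
import functools
import time

# Trace ID of the request currently being handled (None outside a request).
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)
SPANS = []  # collected method-level spans; a real agent would export these


def method_span(func):
    """Record one span per call, tagged with the active trace ID."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            SPANS.append({
                "trace_id": current_trace_id.get(),
                "method": func.__name__,
                "duration_s": time.perf_counter() - start,
            })

    return wrapper


@method_span
def load_order(order_id):
    return {"id": order_id}


def handle_request(headers):
    # The trace ID arrives in a header set by the Sidecar (name assumed).
    token = current_trace_id.set(headers.get("x-b3-traceid"))
    try:
        return load_order(7)
    finally:
        current_trace_id.reset(token)
```

Because the method span carries the same trace ID as the Sidecar span, a tracing backend can join the two into one trace.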

>> Support for multiple monitoring systems, for more flexibility

The OpenTelemetry specification is introduced to unify data protocols on the collection side.

Monitoring data collected in the Service Mesh architecture is sent to the OpenTelemetry Collector over OTLP. The Collector can forward the data to different monitoring systems, such as Jaeger and SkyWalking, hiding the differences between the underlying monitoring systems.
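
The decoupling can be pictured with a toy fan-out model. This is not the OpenTelemetry Collector or its API: `ToyCollector`, the span dictionary, and the two list-backed backends are hypothetical stand-ins. It only illustrates that producers emit one format to one place and backends are attached behind it.

```python
class ToyCollector:
    """Receives spans in one format and fans them out to every backend."""

    def __init__(self):
        self._exporters = {}

    def register(self, name, export):
        # `export` is any callable that accepts a span dictionary.
        self._exporters[name] = export

    def receive(self, span):
        for export in self._exporters.values():
            # Copy so one backend cannot mutate what another receives.
            export(dict(span))


backend_a = []  # stand-in for one monitoring system
backend_b = []  # stand-in for another monitoring system

collector = ToyCollector()
collector.register("backend-a", backend_a.append)
collector.register("backend-b", backend_b.append)
collector.receive({"trace_id": "abc", "name": "GET /orders"})
```

Adding or swapping a monitoring system then means changing what is registered behind the collector, not changing the services or Sidecars that produce the data.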

From traditional microservice frameworks to the service mesh, China Unicom’s microservice technology keeps sinking into the infrastructure, and the road ahead is increasingly clear. Next time, we will talk about China Unicom’s future plans for its service mesh products.

China Unicom: Wen Huaixiang, Baidu: Liu Chao