In this issue, we discuss ServiceMesh, the newest style of framework for microservices.

Microservices Framework

Microservices, in simple terms, means splitting what was a single, whole application into several small, fine-grained applications that run in a distributed fashion. Of course, breaking the whole into parts brings a series of problems, much like a small shop that grows into a chain of stores: purchasing and selling suddenly require coordination. Solving this series of problems is exactly what a microservice framework is for.

Generally, "microservice framework" in the narrow sense refers only to traditional frameworks such as Spring Cloud and Dubbo, which are also the open-source microservice frameworks most widely used in China. Another, newer type of framework is the ServiceMesh, which has gradually become popular over the past two years. Linkerd, Consul, Istio, and so on are all service mesh frameworks.

The characteristics of ServiceMesh

The first problem microservices must solve is communication: service addressing, access control, rate limiting, circuit breaking, and degradation are all communication problems, and both microservice frameworks and service meshes exist to solve them. However, the traditional microservice frameworks and the newer service mesh frameworks differ in how they solve them. The difference between the two can be seen in this diagram:

Take Spring Cloud as an example of a traditional microservice framework. All the relevant Spring Cloud dependencies must be introduced during development; in other words, the communication work is handed off to the imported dependencies (the SDK). This also tests developers, who must be familiar with the Spring Cloud framework to use it, and currently only Java is supported. A service mesh framework, simply put, takes that SDK out of the application and runs it as a separate process; as long as the SDK's functions are reproduced, the goal is achieved. The SDK that is extracted and run separately is called the Sidecar, which in service mesh terminology belongs to the data plane.
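As an illustrative sketch of this difference (all names and addresses here are made up for illustration, not any real framework's API), the question is simply where the governance code lives:

```python
import itertools

# --- SDK model: governance logic is linked into the application process ---
class RoundRobinBalancer:
    """Client-side load balancing, as a traditional SDK would do it."""
    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def pick(self):
        return next(self._cycle)

def call_with_sdk(balancer, payload):
    instance = balancer.pick()        # load balancing happens in-process
    return f"sent {payload!r} to {instance}"

# --- Sidecar model: the app only knows its local proxy ---
SIDECAR_ADDR = "127.0.0.1:15001"      # hypothetical fixed local Sidecar address

def call_with_sidecar(target_service, payload):
    # The app sends plain traffic to the Sidecar; the Sidecar (not shown)
    # performs load balancing, circuit breaking, routing, and so on.
    return f"sent {payload!r} to {SIDECAR_ADDR} for {target_service}"

lb = RoundRobinBalancer(["10.0.0.1:80", "10.0.0.2:80"])
print(call_with_sdk(lb, "hello"))
print(call_with_sidecar("service-b", "hello"))
```

In the second model the application code contains no governance logic at all, which is why it no longer matters what language the application is written in.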

Of course, the Sidecar brings its own series of problems to solve. For example, the Sidecar must become the sole proxy for the service's communication, which means traffic hijacking. There must be corresponding mechanisms for service discovery and health checking, and policy delivery and unified management also need to be considered. In service mesh terminology, these belong to the control plane.
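The control-plane bookkeeping can be sketched minimally as follows (all structures and the heartbeat timeout are hypothetical simplifications, not any real control plane's design): the control plane tracks instance health via heartbeats and pushes policy out to each Sidecar.

```python
HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before an instance is "down"

class ControlPlane:
    def __init__(self):
        self.instances = {}   # service name -> {addr: time of last heartbeat}
        self.policies = {}    # service name -> policy dict delivered to Sidecars

    def heartbeat(self, service, addr, now):
        # Service discovery and health detection share one signal: each
        # registered instance periodically reports that it is alive.
        self.instances.setdefault(service, {})[addr] = now

    def healthy(self, service, now):
        # Only instances with a recent heartbeat are handed to Sidecars.
        return [addr for addr, t in self.instances.get(service, {}).items()
                if now - t < HEARTBEAT_TIMEOUT]

    def push_policy(self, service, policy):
        # Policy delivery: in a real mesh this would be pushed to every
        # Sidecar fronting `service`; here we just record it.
        self.policies[service] = policy

cp = ControlPlane()
cp.heartbeat("service-b", "10.0.0.1:80", now=0.0)
cp.heartbeat("service-b", "10.0.0.2:80", now=2.5)
print(cp.healthy("service-b", now=4.0))  # only 10.0.0.2:80 is still fresh
```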

Service meshes do have many advantages over traditional microservice frameworks. A service's business logic and its communication are split into two parts. Business developers no longer need to care about the microservice framework or microservice governance, learn a framework during development, or even worry about which language to use. The operations or architecture team, in turn, can optimize the communication layer without touching business code, and no longer struggles to upgrade a running system. In professional terms, business and governance are decoupled: microservice governance sinks down to the operations layer, which lowers the difficulty of development and makes a layered, standardized, systematic architecture easier to achieve.

Is performance the biggest challenge for service meshes?

We know that when a monolithic application is split into microservices, calls that used to happen in-process become calls across the network, so performance is naturally worse than before. By the same logic, each microservice is now split into two running programs that must also talk to each other over the network. Doesn't this aggravate the performance problem? If we think of every call between services as a bus ride, then service A calling service B directly is one bus ride, while a call with Sidecars is service A -> Sidecar -> Sidecar -> service B, like adding two bus transfers along the way. Surely that is a big increase in latency, almost three times as much?

In fact, a closer look reveals that ServiceMesh adds essentially no processing-time cost to communication. For example, without ServiceMesh, when service A calls service B, the time is spent mainly on:

● Before A sends the request, it is first processed by A's own framework (the SDK). Because traditional microservice frameworks basically do load balancing on the client side, the SDK performs governance functions such as load balancing, instance selection, and circuit breaking/degradation, which takes about 1ms;

● The A->B network transfer: on a healthy network, no more than 0.1ms;

● After B receives the request, it is first processed by B's framework (SDK), e.g. access permissions and black/white lists; call this governance processing, which consumes about 1ms;

● After the governance processing comes the business processing: first protocol encoding and decoding, then the business code itself. For a business service of moderate complexity, this should be about 10ms.

Adding this up, the A->B microservice call, measured from the moment the request leaves A to the completion of B's processing, is roughly 12.1ms. Now look at the A -> Sidecar -> Sidecar -> B call, which roughly breaks down as:

● Service A sends the request to its own Sidecar; the transfer time is well under 0.1ms, since this is usually a local call within the same host and does not occupy network resources;

● On receiving the request, A's Sidecar does the same processing the SDK used to do; that is, circuit breaking, rate limiting, and access control are all done here, taking a similar ~1ms. Note that in a service mesh, access control and whitelists are also handled on the calling side;

● A's Sidecar transmits the request to B's Sidecar; this is the same hop as A->B in the traditional setup, so the transfer time is the same, say 0.1ms;

● When the request reaches B's Sidecar, if it is not encrypted, B's Sidecar needs to do no processing at all, because all governance-related functions were handled on the calling side; B's Sidecar simply passes the request through, consuming almost no time;

● B's Sidecar passes the request on to service B; count this local hop as another 0.1ms;

● Finally, B's protocol encoding/decoding and business processing again take nearly 10ms.

The total for the service mesh call comes to about 11.3ms. Latency goes down rather than up, a pattern we also observed in repeated load tests in our early work.
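The two breakdowns above can be checked with quick arithmetic (all figures are the rough estimates from the text, not measurements):

```python
# Rough latency breakdown (ms) of a traditional SDK-based call, A -> B.
traditional = {
    "A-side SDK governance": 1.0,   # client-side LB, circuit breaking, etc.
    "A -> B network hop":    0.1,
    "B-side SDK governance": 1.0,   # access control, black/white lists
    "B business processing": 10.0,  # protocol codec + business logic
}

# Same call through the mesh: A -> Sidecar -> Sidecar -> B.
mesh = {
    "A -> A's Sidecar (local)":   0.1,
    "A's Sidecar governance":     1.0,   # all governance on the calling side
    "Sidecar -> Sidecar network": 0.1,
    "B's Sidecar pass-through":   0.0,   # nothing left to do on the receiving side
    "B's Sidecar -> B (local)":   0.1,
    "B business processing":      10.0,
}

print(f"traditional: {sum(traditional.values()):.1f} ms")  # 12.1
print(f"mesh:        {sum(mesh.values()):.1f} ms")         # 11.3
```

The saving comes from the receiving side: its 1ms of SDK governance disappears because the calling Sidecar already did that work, which more than offsets the two extra 0.1ms local hops.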

Unexpected, but understandable

After the performance analysis above, we find that where we expected nearly three times the latency, there was in fact little change. Is this a deliberate attempt to talk up ServiceMesh, or is the architecture itself designed so that performance holds up?

In fact, once you understand the Sidecar's processing mechanism, this surprise is not so strange.

First of all, the Sidecar is not a service. It does not touch the detailed business content of a packet; it only forwards. After reading the destination from the packet header, it leaves the original packet untouched. Secondly, load balancing policies, circuit breaking, rate limiting, routing policies, and so on are all applied during forwarding. Then, communication information is recorded for monitoring and log management. To implement API-granularity management, such as API-level access control or rate limiting, the Sidecar also needs a URL routing function, applying different policies per URL to govern individual APIs or API groups. That is the Sidecar's overall workflow.
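The URL-granularity governance step can be sketched as a prefix-matched policy table (the table contents and the longest-prefix rule are illustrative assumptions, not any particular mesh's behavior):

```python
# Hypothetical per-URL policy table: the Sidecar matches the request path
# against these prefixes and applies the most specific (longest) match.
POLICIES = {
    "/api/orders/admin": {"allow": False},                  # blocked API group
    "/api/orders":       {"allow": True, "rate_limit": 100},
    "/":                 {"allow": True, "rate_limit": 1000},  # default policy
}

def match_policy(path):
    # Longest-prefix match picks the finest-grained applicable policy.
    best = max((p for p in POLICIES if path.startswith(p)), key=len)
    return POLICIES[best]

print(match_policy("/api/orders/123"))      # {'allow': True, 'rate_limit': 100}
print(match_policy("/api/orders/admin/x"))  # {'allow': False}
```

This is how a single Sidecar can govern one API group (here, the hypothetical admin endpoints) differently from the rest of the same service.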

Speaking of which, doesn't the Sidecar sound a lot like an API gateway? In fact, the Sidecar is implemented with an API gateway: each microservice instance is paired with an API gateway instance that takes over its traffic and communication, achieving transparent governance. So in terms of performance, some CPU and memory consumption is inevitable, but communication latency is not a real concern.

Conclusion

The service mesh was first proposed back in 2010 and attracted engineers' attention in 2017 and 2018; 2020 is also known as the first year of service mesh adoption in production. Once its principles are understood, the market can generally accept it; after all, the core technology lies in the API gateway, and the API gateway is a fully mature technology. Therefore, if you are undertaking a microservice transformation now, ServiceMesh is also a technical option worth considering.

In the next article, we will give a unified introduction to and comparison of API gateways, Mesh, and Sidecar. Stay tuned.
