Abstract: Service Mesh has been in production for three years, but the results have fallen short of expectations; it is time for reflection. Mecha, an application-oriented distributed capability abstraction layer, is the natural evolution of the Service Mesh pattern and is likely the inevitable direction of going cloud native and going Mesh. Let's take Mesh all the way.
Introducing Mecha
What is Mecha?
Anyone who loves anime will be familiar with the word Mecha. Yes, that Mecha:
The term Mecha appears here mainly because of Bilgin Ibryam’s blog post “Multi-Runtime Microservices Architecture”, which proposes a new vision for Microservices Architecture: Multiple Runtime.
Note: This blog post is highly recommended, and I even recommend reading it before reading this post, because what I’ve written today can be seen as an in-depth reading and reflection on it. For convenience, a Chinese translation of the multi-runtime microservices architecture is provided here.
In this blog post, Bilgin Ibryam first analyzes and summarizes four requirements for distributed applications:
- Lifecycle
- Networking
- State
- Binding
Because of the problems and limitations each requirement runs into, traditional solutions such as the Enterprise Service Bus (ESB) and its variants (message-oriented middleware, lighter-weight integration frameworks, and so on) are no longer suitable. As microservice architectures evolve and containers and Kubernetes become popular and widely used, cloud-native thinking is beginning to influence how these requirements are implemented. The future architectural trend is to move all traditional middleware functionality into other runtimes, with the ultimate goal of writing only business logic in the service.
Note: for details, please see the original text. In order to save space, only a brief overview is made here, and the content of the original text is not quoted completely.
Here is a comparison of traditional middleware platforms, which provide capabilities in the form of various SDKs, and cloud native platforms, which provide capabilities through various out-of-process runtimes (Service Mesh/Istio being the familiar example):
Therefore, the authors introduce the Multiple Runtime concept:
The authors suggest that in the future we will likely end up using multiple runtimes to implement distributed systems. Multiple runtimes not because there are multiple microservices, but because each microservice will consist of multiple runtimes — most likely two: a custom business-logic runtime and a distributed-primitives runtime.
An explanation of the multi-runtime microservice architecture and Mecha:
Do you remember the movie Avatar, and the AMP "mech suits" scientists wear to explore the wilds of Pandora? This multi-runtime architecture resembles those Mecha suits, which give their humanoid pilots superpowers. In the movie, you put on the suit to gain strength and access destructive weapons. In this software architecture, you have the business logic that forms the core of your application (called Micrologic) and a Sidecar Mecha component that provides powerful, out-of-the-box distributed primitives. Micrologic combined with Mecha functionality forms a multi-runtime microservice that uses out-of-process capabilities for its distributed-system needs. Best of all, Avatar 2 is coming out soon to help popularize this architecture. We can finally replace the old sidecar motorcycle pictures at software conferences with awesome mecha pictures ;-). Next, let's look at the details of this software architecture. It is a two-component model similar to client-server, where each component is an independent runtime. It differs from a pure client-server architecture in that both components sit on the same host with a reliable network connection between them. The two components are equal in importance: they can initiate operations in either direction and act as either client or server. One component, called Micrologic, contains a minimal amount of business logic with almost all distributed-system concerns stripped out. The other, accompanying component is the Mecha, which provides all the distributed-system capabilities discussed throughout this article (except lifecycle, which is a platform capability).
Here the author formalizes the concept of Mecha:
Smart Runtime, Dumb Pipes
My understanding of Mecha: during coding, the business logic should stay "naked", focusing purely on implementing the business and minimizing entanglement with underlying infrastructure logic; at runtime, it should put on its "mecha" — fully armed and ready for battle. A familiar flavor, isn't it? Pure, authentic cloud-native thinking.
The essence of the Mecha
In this article, the author discusses the features of the Mecha runtime:
- Mecha is a generic, highly configurable, reusable component that provides distributed primitives as off-the-shelf capabilities.
- Mecha can be deployed alongside a single Micrologic component (Sidecar mode) or shared by several of them (note: I call this Node mode).
- Mecha makes no assumptions about the Micrologic runtime: it works with polyglot microservices and even monoliths, using open protocols and formats such as HTTP/gRPC, JSON, Protobuf, and CloudEvents.
- Mecha is configured declaratively in a simple text format (such as YAML, JSON) that indicates what functionality to enable and how to bind it to a Micrologic endpoint.
- Rather than relying on multiple proxies for different purposes (a network proxy, a cache proxy, a binding proxy), a single Mecha provides all of these capabilities.
Here is my personal understanding of the above features:
- Mecha provides capabilities, embodied as distributed primitives — not limited to a mere network proxy.
- The deployment model of Mecha is not limited to Sidecar mode, Node mode may be better in some scenarios (e.g. Edge/IoT, Serverless FaaS). At the very least, Mecha has the opportunity to choose on demand, rather than being tied to Sidecar mode.
- The interaction between Mecha and Micrologic is open and API standards exist. The “protocol” between Mecha and Micrologic is embodied in the API, not the TCP communication protocol. This provides an opportunity: an opportunity to unify the way Micrologic and Mecha communicate.
- Mecha can be configured and controlled in a declarative manner, which is very much in line with the idea of cloud nativity, and also makes the API more focused on the capabilities themselves, rather than how they are configured.
- Applications need many capabilities (see the figure above: four requirements for distributed applications). If every capability required its own proxy (whether in Node or Sidecar mode), the number of proxies would be overwhelming and the operational burden terrifying. Therefore, as the name Mecha implies, the runtime should provide capabilities as an integrated suite rather than as scattered pieces.
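The declarative configuration and suite-of-capabilities ideas above can be made concrete with a purely hypothetical sketch. None of these field or kind names come from a real product; they are invented for illustration of the style:

```yaml
# Hypothetical Mecha configuration: enable a pub/sub capability
# backed by Kafka and bind it to a Micrologic HTTP endpoint.
kind: MechaComponent
metadata:
  name: order-events
spec:
  capability: pubsub            # which distributed primitive to enable
  implementation: kafka         # the underlying system the Mecha should use
  bindTo: http://localhost:8080/events   # Micrologic delivery endpoint
  metadata:
    - name: brokers
      value: kafka:9092
```

The point is that the application code never mentions Kafka; only this configuration does, and only the Mecha reads it.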
To sum it up in one sentence, I think the essence of Mecha should be:
“Application-oriented Distributed Capability Abstraction Layer”
Just as the essence of a Service Mesh is an abstraction layer for communication between services, the essence of Mecha is the various distributed capabilities and primitives required by applications, including but not limited to communication between services.
In this sense, the scope covered by Mecha is a superset of Service Mesh: after all, Service Mesh covers only part of an application's requirements (inter-service communication, and only in synchronous / one-to-one / request-response mode); there are many more distributed capabilities and primitives to cover.
In other words, the goal of Mecha should be: “Take Mesh to the end!”
Mecha’s advantages and future
The authors point out that the benefit of Mecha is the loose coupling between business logic and an increasing number of distributed system problems.
Here is the coupling of business logic and distributed system issues in different architectures:
The idea is the same as the Service Mesh, but it covers a wider range of distributed capabilities.
One question: will Mecha be the next evolution of microservices architecture? My personal answer: yes. As cloud native advances, distributed capabilities (with traditional middleware as their typical representative) will keep sinking into the infrastructure, the scope of Mesh will inevitably keep expanding, and the result will look more and more like Mecha. That is what the title of this article means: Mecha is the next step for microservices, and even for cloud native.
Microsoft Dapr
After introducing the Mecha/Multiple Runtime concept, let’s take a look at the current Microsoft Dapr project, which is the industry’s first open source Multiple Runtime practice.
Project address: github.com/dapr/dapr.
Dapr introduction
Dapr stands for Distributed Application Runtime, officially described as "a portable, event-driven runtime for building distributed applications across cloud and edge."
The details of Dapr are:
Dapr is a portable, serverless, event-driven runtime that makes it easy for developers to build resilient, stateless and stateful microservices that run on the cloud and edge, embracing the diversity of languages and developer frameworks. Dapr distills best practices for building microservice applications into open, independent building blocks, enabling you to build portable applications with the language and framework of your choice. Each building block is independent, and you can use one or more of them in your application.
The functions and positioning of Dapr can be summarized in the following chart:
- The lowest infrastructure is a variety of cloud platforms (supported by mainstream public clouds) or edge environments;
- On top of that are the distributed capabilities that Dapr provides, which Dapr calls building blocks;
- These building blocks expose their capabilities externally through a unified API (supporting both HTTP and gRPC);
- Applications can be written in a variety of languages and then consume these capabilities through the APIs provided by Dapr. Dapr also ships client libraries to simplify the API calls, which is how multi-language support is achieved.
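As a sketch of what "consuming these capabilities through a unified API" looks like, the snippet below builds the URL and body for a call to the state building block. It is illustrative only: 3500 is merely Dapr's default sidecar port, and the exact state-API path has varied across Dapr versions (recent versions scope it by state-store name, as assumed here).

```python
import json

DAPR_PORT = 3500  # default Dapr sidecar HTTP port (assumed here)

def state_save_request(store: str, key: str, value):
    """Build the URL and JSON body for saving state via Dapr's state API.

    Dapr's state API accepts a list of key/value pairs in the request body.
    """
    url = f"http://localhost:{DAPR_PORT}/v1.0/state/{store}"
    body = json.dumps([{"key": key, "value": value}])
    return url, body

url, body = state_save_request("statestore", "order-1", {"status": "paid"})
print(url)  # http://localhost:3500/v1.0/state/statestore
```

In a real application this URL would be POSTed to the local Dapr sidecar; the application neither knows nor cares which database actually stores the state.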
The specific distributed capabilities (building blocks) provided by Dapr are shown below:
The specific capabilities provided by each building block can be found in Dapr’s official documentation: github.com/dapr/docs/t…
Examples of Dapr APIs
Let's look at an example of an application calling the Dapr API to get a feel for how Dapr works.
Take service invocation as an example:
The deployment and invocation look very similar to Service Mesh/Istio, with one key difference: Dapr provides the capability behind an API it exposes, rather than by proxying a communication protocol.
Figure 1 shows ServiceA making a request to invoke a remote service. The HTTP request is as follows:
```
POST/GET/PUT/DELETE http://localhost:<daprPort>/v1.0/invoke/<appId>/method/<method-name>
```
Among them:
- The daprPort parameter is the listening port started by Dapr Runtime to receive outbound requests from the application.
- The appId parameter identifies the remote application in Dapr; every application registered with Dapr has a unique appId.
- The method-name parameter is the method name or URL of the remote application to be invoked.
The payload can be stored in an HTTP body and sent along with the request, such as JSON.
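The invocation URL described above is simple enough to assemble by hand. A minimal Python sketch (the port and names are illustrative, not taken from a real deployment):

```python
def invoke_url(dapr_port: int, app_id: str, method_name: str) -> str:
    """Build the Dapr service-invocation URL from its three parameters."""
    return f"http://localhost:{dapr_port}/v1.0/invoke/{app_id}/method/{method_name}"

# e.g. ServiceA calling method "neworder" on the app registered as "serviceB"
print(invoke_url(3500, "serviceB", "neworder"))
# http://localhost:3500/v1.0/invoke/serviceB/method/neworder
```

The application sends this request to its local sidecar; Dapr resolves the appId and routes the call to the remote application.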
Note that while both provide similar functionality, Dapr (and the Mecha concept behind it) and Service Mesh differ fundamentally in approach: one exposes an API, the other proxies a communication protocol.
A clearer example is Dapr's publish/subscribe capability, which lets an application easily publish messages, or subscribe to topics and receive messages. Below, an application publishes a message by sending the request directly to Dapr:
In the example, the topic parameter specifies the topic the message is sent to (deathStarStatus in this case). Dapr then hands the message to the underlying queue and pushes it to the applications subscribed to that topic. Receiving messages works similarly, except this time Dapr initiates the calls:
- Dapr first calls the application to ask which topics it wants to subscribe to; in this example the application returns TopicA/TopicB.
- Dapr performs the topic subscription. When a message arrives, it delivers the message to the application, distinguishing topics by different URL paths.
Note that throughout this flow the application is completely unaware of the underlying pub/sub implementation (Kafka, RocketMQ, or any messaging service offered by a public cloud) and does not pull in that implementation's client SDK. It simply uses the Dapr-defined API, decoupling itself from the underlying layer and achieving "vendor unlocking".
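Both directions of this pub/sub flow can be sketched in a few lines of Python. Treat the exact paths as assumptions: early Dapr versions published to `/v1.0/publish/<topic>` and asked the application for its subscriptions via a call to `/dapr/subscribe`; later versions add a pubsub component name to both.

```python
import json

DAPR_PORT = 3500  # illustrative default

def publish_request(topic: str, payload: dict):
    """Build the URL and body for publishing a message through Dapr."""
    url = f"http://localhost:{DAPR_PORT}/v1.0/publish/{topic}"
    return url, json.dumps(payload)

def dapr_subscribe() -> str:
    """What the application answers when Dapr asks which topics it wants.

    Dapr will then POST incoming messages to the given routes, which is how
    different topics are distinguished by URL.
    """
    return json.dumps([
        {"topic": "TopicA", "route": "/TopicA"},
        {"topic": "TopicB", "route": "/TopicB"},
    ])

url, body = publish_request("deathStarStatus", {"status": "completed"})
print(url)  # http://localhost:3500/v1.0/publish/deathStarStatus
```

Nothing here names Kafka, RocketMQ, or any other broker — that binding lives entirely in Dapr's configuration.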
To further simplify the calling process (after all, even the simplest HTTP GET requires implementing the HTTP call, connection-pool management, and so on), Dapr provides SDKs for various languages, such as Java, Go, Python, .NET, JavaScript, C++, and Rust. Both HTTP and gRPC clients are provided.
Take Java as an example. The Java Client API is defined as follows:
```java
public interface DaprClient {
    Mono<Void> publishEvent(String topic, Object event);

    Mono<Void> invokeService(Verb verb, String appId, String method, Object request);

    ...
}
```
For details, see: github.com/dapr/java-s…
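To mirror the spirit of the Java interface above, here is a minimal, hand-rolled Python client. It is not the official SDK — just an illustrative sketch that builds the HTTP requests without sending them, with the usual assumptions about the default port and API paths:

```python
import json
import urllib.request

class MiniDaprClient:
    """Illustrative stand-in for a Dapr client; it only constructs requests."""

    def __init__(self, port: int = 3500):
        self.base = f"http://localhost:{port}/v1.0"

    def publish_event(self, topic: str, event) -> urllib.request.Request:
        """Counterpart of the Java publishEvent(topic, event)."""
        return urllib.request.Request(
            f"{self.base}/publish/{topic}",
            data=json.dumps(event).encode(),
            headers={"Content-Type": "application/json"},
            method="POST")

    def invoke_service(self, verb: str, app_id: str, method: str,
                       body=None) -> urllib.request.Request:
        """Counterpart of the Java invokeService(verb, appId, method, request)."""
        data = json.dumps(body).encode() if body is not None else None
        return urllib.request.Request(
            f"{self.base}/invoke/{app_id}/method/{method}",
            data=data, method=verb)

req = MiniDaprClient().invoke_service("GET", "serviceB", "neworder")
print(req.full_url)  # http://localhost:3500/v1.0/invoke/serviceB/method/neworder
```

A real SDK adds connection pooling, retries, and gRPC support on top of exactly this kind of thin wrapper — which is why the SDK can stay lightweight.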
Analysis and summary
The Multiple Runtime/Mecha architecture idea was introduced earlier, as well as the Microsoft Dapr project, one of the reference implementations.
The Multiple Runtime/Mecha idea is brand new, having only just been proposed, and the Microsoft Dapr project is likewise very young. Both the theory and the practice are at a very early stage, and a complete methodology has yet to form.
Special disclaimer: the following content is more of my personal understanding and perception, only represents my personal opinion, there must be a lot of immature and even fallacious places, welcome correction and discussion.
Mecha and Dapr implications
- Mesh patterns should be pushed to a larger area.
As cloud native develops, the distributed capabilities applications need should sink into the infrastructure — not just the inter-service communication capability provided by Service Mesh. Applications will move further toward pure business logic and become ever more cloud native.
This is the general trend and the driving force behind the emergence and development of Mecha architecture.
- Mecha emphasizes “providing capabilities,” not communication agents.
The way Mecha is used differs greatly from the way a Service Mesh is used: Mecha emphasizes providing distributed capabilities to applications, and those capabilities are ultimately presented as fully encapsulated APIs. The API expresses the application's "demand" and "intent" for a capability, not how the capability is implemented. Implementation is the Mecha's responsibility, and which implementation is used is controlled by the Mecha.
Under Service Mesh there is no such requirement: the inter-service communication a Service Mesh provides is delivered by the Sidecar itself. There is no lower-level implementation behind it, hence nothing to isolate or replace. Constrained by the service communication protocol and payload schema, a Service Mesh can only "forward" requests, focusing on "how to forward"; no other capability needs isolating or replacing.
When Mecha extends its capabilities beyond the Service Mesh, many of those capabilities are provided by external systems: pub/sub can be backed by different message queues; state management can sit on different key-value implementations. Here, isolation and replaceability become key requirements: decouple the application from the capability's implementation so that Mecha can swap the underlying implementation (thereby avoiding vendor lock-in, among other benefits).
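This replaceability shows up directly in configuration. The fragment below follows the general shape of Dapr's component definitions (field names taken from Dapr's documentation at the time, so treat the details as approximate): switching the pub/sub capability from Redis to Kafka is a configuration change, with no application code touched.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: messagebus
spec:
  type: pubsub.redis        # change to e.g. pubsub.kafka to swap the backend
  metadata:
  - name: redisHost
    value: localhost:6379
```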
- No “zero intrusion” is required.
In the Service Mesh, “zero intrusion” is a very important feature, and traffic hijacking schemes such as Iptables are introduced for this purpose. Zero intrusion is of great advantage in some special scenarios, for example, existing applications are connected to the Service Mesh without modification. The benefits are self-evident, but zero intrusion has its own limitations: the client must be able to issue network communication requests that meet the requirements of the server, a process that cannot be interfered with by outsiders.
For inter-service communication this is not a big problem, but for other capabilities, the need to decouple from the implementation makes it inappropriate for the client to construct native protocol requests itself. Mecha therefore tends toward a lightweight SDK: still low-intrusion, and still feasible across languages and platforms, at the cost of implementing an SDK per language. Since the SDK is light enough, that cost stays acceptable.
This small amount of work and intrusion buys the convenience and cooperation a lightweight SDK provides: the abstraction of capabilities and the encapsulation of the API. Weighing the pros and cons, Mecha prefers the lightweight-SDK approach.
- Sidecar deployment is not limited.
Sidecar deployment mode has disadvantages such as resource usage and increased maintenance cost, which may not be appropriate in some cases:
- Edge network, IoT scenario: limited resources, not suitable for starting too many Sidecars;
- FaaS scenario: The application itself is light enough, even lighter than Sidecar;
- Serverless scenario: with Scale to Zero, cold-start latency requirements are strict; Sidecar startup and initialization may slow application startup.
Under Mecha, deployment mode is not limited to Sidecar, and Node mode can be selected when appropriate, or even a mixture of Node mode and Sidecar mode can be used.
- API and configuration are key.
The API is the abstraction of a distributed capability. It needs to be friendly to the customers developing business applications on top of it, easy to use, and stable. These APIs also need to be standardized, and widely accepted and adopted by the community, in order to avoid vendor lock-in, allow free migration, and increase customer value.
In addition, APIs work together with configuration: a capability is abstracted into an API without exposing fine-grained control over it; those controls are applied at runtime by the Mecha according to configuration. This can be summed up as: "API + configuration = the full capability".
The definition and standardization of APIs and configuration are expected to be key to Mecha's success.
The essence of Mecha
Program to an interface, not an implementation. Design Patterns: Elements of Reusable Object-Oriented Software (GOF, 1994)
The essence of Mecha begins with these words:
- Under Mecha, the Runtime isolates the underlying implementation for the sake of decoupling and replaceability, so this evolves into: "Program to a Runtime, not an implementation."
- Because the Runtime sits on localhost whether it is deployed in Sidecar mode or Node mode: "Program to localhost, not an implementation."
- To simplify development, Mecha still provides a lightweight SDK, offering an API as the abstraction of each capability: "Program to an API, not an implementation."
- And since APIs are usually expressed as interfaces, Mecha comes full circle: "Program to an interface, not an implementation."
Personally, the essence of Mecha lies in these key points: isolation/abstraction/decoupling/replaceable. As shown below:
- Under Mecha, Micrologic (the code implementing the business logic) is not allowed to use the distributed capabilities of the underlying implementations directly;
- The Mecha Runtime will provide distributed capabilities for Micro Logic while isolating applications from underlying implementations;
- For ease of use, a lightweight SDK is provided, whose API layer realizes the abstraction of the distributed capabilities; the application only needs to program against the API;
- The lightweight SDK works with the Mecha Runtime to decouple from, and allow replacement of, the underlying implementation.
Mecha implementation principles
In the implementation of Mecha, the principles I understand are as follows:
- The Runtime is the main workhorse — keep it thick;
- The lightweight SDK mainly plays a supporting role for the Runtime — keep it thin.
The specific responsibilities:
- Lightweight SDK: enables multi-language access with low (though not zero) intrusion;
- API: provided uniformly by the lightweight SDK, aiming for community adoption and standardization, giving developers a consistent programming experience along with portability;
- Application: with the lightweight SDK and Runtime cooperating to supply the various distributed capabilities, the application stays unaware of them — it simply uses the API, with no coupling to the underlying implementation.
In the Mecha architecture, the Runtime is naturally the core of the architecture, acting as a data plane similar to the one in the Service Mesh:
- All traffic for distributed capabilities (both within the system and to external systems) is taken over and shielded by the Runtime;
- The CRD/ control plane implements declarative configuration and management (similar to Service Mesh).
- In terms of deployment mode, Runtime can be deployed in Sidecar mode or Node mode.
Note: Mecha has a lot of capabilities and implementation details, so here is a High Level overview. Details will be covered in a series of articles, welcome to more exchanges and discussions.
Mecha summary
It was around the beginning of March when I first read "Multi-Runtime Microservices Architecture", and I felt suddenly enlightened: many questions I had repeatedly considered and weighed without reaching a conclusion were answered clearly in that article. I benefited greatly from it.
During three years of Service Mesh exploration and practice, I ran into many problems, many of which I had never anticipated. For example, we used to think the biggest problem with introducing a Sidecar would be performance; in practice, the maintenance cost the Sidecar brings has proven more troublesome than its performance overhead.
To summarize my core understanding of Mecha architecture, there are two main points:
- Mecha is the inevitable trend of going cloud native and going Mesh: as cloud native keeps developing, the distributed capabilities applications need will keep sinking, and more and more capabilities will appear in Sidecar form. But no application can carry a dozen Sidecars — that is operational hell. New forms are therefore needed to solve the too-many-Sidecars problem, and merging them into one (or a few) will become inevitable.
- Mecha is the natural evolution of the Service Mesh pattern. Service Mesh has been in production for three years, but the results have not been satisfactory; it is time to reflect. And Mecha's scope goes far beyond inter-service communication. New needs call for new thinking and new breakthroughs: Mecha can break out of the fixed patterns of today's Service Mesh and explore new paths — don't cling to Sidecar, try Node mode; don't cling to protocol forwarding, try using the Runtime to decouple the underlying implementation; don't cling to zero intrusion, try keeping a sufficiently light SDK in the application.
As the saying goes, “Microservices are the Good Part of SOA practice”, I hope to learn from the success and failure of Service Mesh practice in Mecha exploration and practice. Hopefully Mecha will also be a Good Part of the Service Mesh. It is hoped that Mecha will become the next step in the evolution of cloud native, following microservices and Service Mesh.
Back in reality, Mecha and Multi-Runtime are still a very new idea, and Dapr is just getting started. Mecha has a long way to go, and everything is still feeling its way.
Appendix: References
At the end of the article, I would like to express my special thanks to Bilgin Ibryam, the author of the article “Multi-Runtime Microservices Architecture”. I really appreciate the ideas and concepts in this article. The analysis and induction is very accurate, and the ability to refine and sublimate is admirable.
About the author (Bilgin Ibryam):
Chief Architect of Red Hat, Committer and member of the Apache Software Foundation. Open source evangelist, blogger, and occasional speaker, author of Kubernetes Patterns and Camel Design Patterns.
This article draws on the following works by Bilgin Ibryam:
- Multi-Runtime Microservices Architecture: by Bilgin Ibryam; the Mecha ideas come from this article, which is highly recommended reading. You can also read my translated version directly. As mentioned earlier, it is advisable to read that blog post before reading this one.
- The Evolution of Distributed Systems on Kubernetes: Bilgin Ibryam's talk at QCon London in March 2020, also highly recommended. It is an excellent summary and vision of the evolution of distributed systems on Kubernetes, while continuing to preach the multi-runtime microservices architecture. Many of the pictures in this article are taken from its slides.
Financial Class Distributed Architecture (Antfin_SOFA)