Microservices in 2020: some hold the line, others break new ground. Service frameworks continue to evolve and move toward the cloud, while Service Mesh keeps advancing yet remains open to question. Overall, the evolution of microservice architecture does not happen overnight, and neither excessive conservatism nor excessive aggressiveness is the answer. Long-term practice and solid fundamentals may be the way forward for microservice architecture.

In 2020, microservices, a topic that has been around for years, remain hot: Service frameworks represented by Spring Cloud and Dubbo continue to evolve and accelerate toward cloud native, while Service Mesh, the “celebrity” of both the cloud native and microservices circles, is still feeling its way forward in the fog. Facing the booming development of cloud native and microservice technologies, most enterprises inevitably have doubts: with mature, still-evolving Service frameworks on one side and Service Mesh, which represents the future direction, on the other, how should an enterprise choose the evolution path of its architecture?

Service frameworks: Evolution and dilemmas

Spring Cloud: The mystery of the arsenal

Spring Cloud is to microservices what Spring is to Java open source frameworks: critical.

In the microservices ecosystem, Spring Cloud is like an arsenal of “weapons” for solving the problems of distributed and microservice scenarios. The Spring Cloud project currently has over 30 sub-projects, some of which contain a large number of feature-rich sub-modules. On the one hand, developers can easily reuse existing open source capabilities without reinventing the wheel; on the other hand, newcomers to Spring Cloud have to pay a considerable cost to learn the usage patterns and features of the framework itself, which creates difficulties both in early adoption and in later maintenance.

Spring Cloud Netflix is an excellent sub-project of Spring Cloud that integrates Netflix OSS into the Spring Cloud ecosystem; most Spring Cloud developers first met Eureka, Hystrix, Ribbon, Zuul and other components through it. However, as these components matured and Netflix adjusted its open source strategy, core components such as Hystrix and Ribbon have been in maintenance mode since 2018, and Spring Cloud has had to provide long-term replacements for them. Perhaps influenced by this “voluntary graduation” of the Netflix components, Spring Cloud promoted several core capabilities from enterprise OSS into first-party Spring Cloud sub-projects: Zuul, the microservice gateway, was replaced by Spring Cloud Gateway, which uses a similar implementation mechanism, and Hystrix, the circuit-breaking and degradation component, was replaced by Spring Cloud Circuit Breaker, which supports multiple underlying implementations. The intention seems to be to decouple core components from any single company's open source strategy. In 2020 this refreshed arsenal continues to develop, but the new components are still maturing; it will take time for their feature completeness and performance stability to reach the level of the Netflix originals.
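To make that replacement concrete, here is a minimal sketch of the vendor-neutral approach using the Spring Cloud Circuit Breaker API, which hides a concrete implementation such as Resilience4j behind a common abstraction. The breaker name, URL and fallback value are hypothetical; this illustrates the abstraction only and is not a migration recipe for existing Hystrix code.

```java
import java.util.function.Supplier;

import org.springframework.cloud.client.circuitbreaker.CircuitBreaker;
import org.springframework.cloud.client.circuitbreaker.CircuitBreakerFactory;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class InventoryClient {

    private final RestTemplate restTemplate = new RestTemplate();
    private final CircuitBreakerFactory circuitBreakerFactory;

    // The factory is supplied by whichever starter is on the classpath,
    // e.g. spring-cloud-starter-circuitbreaker-resilience4j.
    public InventoryClient(CircuitBreakerFactory circuitBreakerFactory) {
        this.circuitBreakerFactory = circuitBreakerFactory;
    }

    public String stockLevel(String sku) {
        CircuitBreaker breaker = circuitBreakerFactory.create("inventory");

        // The remote call is wrapped by the breaker; the second argument is
        // the fallback used when the call fails or the circuit is open.
        Supplier<String> call = () ->
                restTemplate.getForObject("http://inventory/stock/" + sku, String.class);

        return breaker.run(call, throwable -> "UNKNOWN");
    }
}
```

The design choice the new sub-project makes is visible here: the calling code depends only on the abstraction, so the underlying circuit-breaker library can be swapped without touching business logic.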

Spring Cloud Alibaba, the project that integrates Alibaba OSS into the Spring Cloud ecosystem, continues to develop in 2020. It brings in Sentinel, Nacos, Dubbo and other Alibaba microservice open source components, aiming to help developers adopt them quickly. Whether this growth model, so similar to Spring Cloud Netflix's, will repeat Netflix's old path or find a way of its own is something the project's future development will have to answer.

The Spring Cloud arsenal model, built around the OSS of Netflix, Alibaba and other companies, remains a bit of a “mystery” in 2020. On the one hand, Spring Cloud wants to reuse enterprise OSS to keep the project mature; on the other hand, it has to worry that this model will once again end with a vendor “graduating” its components. Enterprise developers enjoying Spring Cloud's blessings can be forgiven for feeling a little confused.

Dubbo: Accelerate the move to cloud native

Dubbo not only resumed active development after a period of dormancy, but also graduated from the Apache Incubator to become a top-level project in 2019. In 2020, Dubbo began accelerating its move to cloud native.

In the landmark 2.7.5 release of Apache Dubbo, two major changes were made:

  • Service model adjustment: from the “interface as a service” model to an application-level model aligned with Spring Cloud and Kubernetes.

  • Protocol support adjustment: HTTP/2 (gRPC) was added as a communication protocol.

Dubbo's traditional model treats each interface as a service, which makes RPC interfaces easy for developers to maintain. In large enterprise microservice clusters, however, the number of services balloons and performance problems emerge one after another. Starting from version 2.7.5, services are no longer subdivided at the interface level. The simplified service model helps optimize performance in large-scale scenarios and, more importantly, is unified with the service model of Spring Cloud and Kubernetes, which helps enterprises keep their service systems compatible and connect seamlessly to the cloud native service ecosystem.
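For context, the hypothetical sketch below shows what the traditional interface-as-a-service style looks like with Dubbo's annotation API from the 2.7.x line (org.apache.dubbo.config.annotation.Service, later superseded by @DubboService). Each exposed interface used to be its own unit of registration and discovery, which is exactly what the application-level model collapses; the interface and implementation names are invented for illustration.

```java
import org.apache.dubbo.config.annotation.Service;

// Hypothetical RPC contract: under the traditional model, this interface
// itself is the unit of service registration and discovery.
interface OrderQueryService {
    String findOrder(String orderId);
}

// Exposing the implementation registers "OrderQueryService" as one service;
// an application with hundreds of interfaces therefore registers hundreds of
// services, whereas the application-level model of 2.7.5+ registers the
// application once and lets its interfaces share that single entry.
@Service
public class OrderQueryServiceImpl implements OrderQueryService {
    @Override
    public String findOrder(String orderId) {
        return "order-" + orderId;
    }
}
```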

Dubbo's traditional RPC protocol is a private binary protocol, which gives it a performance edge over older HTTP versions. In the cloud native world, however, the Sidecar responsible for proxying traffic sinks into the infrastructure layer, and the design of Dubbo's native protocol packs the layer-7 routing information into the request body as attachments, so a Sidecar has to parse the body before it can obtain the request parameters and schedule traffic. This all but erases the performance advantage of Dubbo's native protocol and becomes another handicap for Dubbo in the cloud native era. Adding HTTP/2 (gRPC) support makes the protocol more universal and improves performance and scalability to meet the requirements of large-scale cloud native scenarios.
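The sketch below shows where that protocol choice plugs into Dubbo's programmatic configuration API; the application name, port and GreetingService contract are hypothetical. It exports over the traditional dubbo binary protocol, with the grpc protocol name introduced in 2.7.5 noted in the comments as the cloud native alternative (in practice the gRPC protocol pairs with protobuf-defined service stubs).

```java
import org.apache.dubbo.config.ApplicationConfig;
import org.apache.dubbo.config.ProtocolConfig;
import org.apache.dubbo.config.RegistryConfig;
import org.apache.dubbo.config.ServiceConfig;

public class ProtocolBootstrap {

    // Trivial, hypothetical service contract used only for illustration.
    public interface GreetingService {
        String greet(String name);
    }

    public static void main(String[] args) {
        // The wire protocol is just a piece of configuration: "dubbo" is the
        // traditional private binary protocol; 2.7.5 adds "grpc" (HTTP/2) as an
        // alternative that is friendlier to Sidecar proxies.
        ProtocolConfig protocol = new ProtocolConfig();
        protocol.setName("dubbo");
        protocol.setPort(20880);

        ServiceConfig<GreetingService> service = new ServiceConfig<>();
        service.setApplication(new ApplicationConfig("greeting-app"));
        // "N/A" skips the registry so the sketch stays self-contained.
        service.setRegistry(new RegistryConfig(RegistryConfig.NO_AVAILABLE));
        service.setProtocol(protocol);
        service.setInterface(GreetingService.class);
        service.setRef(name -> "hello, " + name);
        service.export();
    }
}
```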

Dubbo's accelerated move toward cloud native shows a mature Service framework evolving in the right direction, but the version-compatibility and stability risks that come with such a brisk pace also deserve the attention of enterprise users. Interestingly, Dubbo chose to land such a major change in version 2.7.5; where version 3.0 goes will be just as interesting to watch.

Service Mesh: Progress, division, adoption

Progress and problems in Istio

In 2020, Istio not only kept its promise of one release per quarter (1.5, 1.6, 1.7, 1.8), but also delivered a series of capabilities aimed at ease of use and enterprise adoption:

  • Performance & Scalability: Mixer V2 & WASM

  • A monolithic control plane: Pilot, Galley and Citadel were merged into Istiod

  • Clearer multi-cluster capabilities

  • Clearer support for non-container (VM) workloads

  • Security model simplification

  • Richer troubleshooting tools

  • … (everything in the service of ease of use and enterprise adoption)

Istio's first release of 2020, version 1.5, overturned the previous design with the architectural idea of “returning to the monolith”. The 1.6 release notes even signaled that simplification would be carried through from start to finish. Following the 1.7 release, Christian Posta, former Red Hat chief architect, author of Istio in Action and Field CTO of Solo.io, said he believed Istio 1.7 would be the most stable version yet for production… All of this made Istio in 2020 look bright and hopeful, ushering in a second wave of confidence after version 1.1.

However, the light must eventually shine into reality. The drastic rework in 1.5 makes at least the 1.5 and 1.6 releases hard to take to production; 1.7 still imposes strict platform requirements (the minimum supported Kubernetes version was raised to 1.16 or above) and forces dependent APIs to be migrated. Whether version 1.8, released late in the year, will truly be the most stable version for enterprise production remains to be tested.

In 2020, Istio is still a teenager sprinting down the road.

The camps divide again

In 2020, the Service Mesh camp is still divided:

  • Google and IBM, Istio's two biggest backers, clashed over the trademark: Google donated the Istio trademarks to the Open Usage Commons rather than to the CNCF.

  • Microsoft released its own open source Service Mesh project, Open Service Mesh, which follows the Service Mesh Interface (SMI) specification drawn up jointly with other vendors.

  • Kong, a veteran of microservice API gateways, launched its own Service Mesh project, Kuma. Interestingly, Kuma chose Envoy as the data plane rather than Nginx + OpenResty, the core of the Kong gateway.

  • Nginx launched its own product, Nginx Service Mesh (NSM), positioned as a simplified Service Mesh, while Aspen Mesh serves as the commercial offering.

Such divergence hardly looks like good news for the Service Mesh community, but it does show the strategic weight that Service Mesh standardization carries in the eyes of top-tier cloud vendors.

One has to say: what a messy circle this is.

Key elements of enterprise adoption

In 2020, the implementation of enterprise Service Mesh is progressing steadily.

During the year, almost all of the big players that moved into Service Mesh early went through large-scale production verification; their work on protocol extension, registry integration, platform building, performance and stability was tested and consolidated at production level, giving more confidence to those who follow.

Meanwhile, enterprises that had been watching from the sidelines gradually started their own Service Mesh journeys. Those with relatively simple scenarios, modest performance requirements and little historical baggage can upgrade their microservice architecture to Service Mesh quickly by picking a stable pre-1.5 release. More enterprises, however, turn to Service Mesh hoping to evolve their architecture and eliminate long-standing pain points; they tend to have complex scenarios, real scale and performance requirements, and expect service migration to be imperceptible to the business. They place high hopes on Service Mesh. Can the Service Mesh of 2020 really deliver? The answer is not a simple one.

For most enterprises, adopting Service Mesh hinges on two major capabilities: support for smooth adoption and guarantees for large-scale adoption.

1. Support for smooth adoption

Brand-new services, or services that are already highly platformized, can be brought onto a mesh in a uniform way. The problems most enterprises face are far messier: different frameworks, inconsistent protocols, diverse registries, fragmented platforms, heterogeneous languages… The textbook Service Mesh becomes a luxury, and whether a mesh can support the smooth migration of existing business becomes the key question:

  • Seamlessly onboard business services that already use Spring Cloud or Dubbo and have their own registries

  • Manage services deployed on existing VMs or physical servers together with containerized services

  • Connect existing services that use private protocols or heterogeneous languages to the Service Mesh

  • Provide complete platform capabilities so that business migration is smooth and observable

2. Guarantees for large-scale adoption

In a Service Mesh architecture, the control plane manages the Sidecars and distributes service information and configuration to the data plane. This structure simplifies client logic and makes the service framework lighter: the Sidecar focuses on handling traffic, while more of the management responsibility shifts to the control plane. Supporting a large number of services at scale therefore becomes a new challenge for Service Mesh adoption.

Take the Istio control plane as an example. In its native form it watches service changes across the whole Kubernetes cluster and pushes the full configuration to every Sidecar. In production, bulk changes to cluster-wide service information quickly overwhelm the control plane. Even though Istio added the Sidecar scoping mechanism to isolate configuration distribution, this only optimizes the push from the control plane to the Sidecars; the overall performance of configuration processing in complex production environments is still poor.

The native Istio control plane also lacks self-protection mechanisms for large-scale scenarios, such as rate limiting and circuit breaking, proactive connection shedding, and connection load balancing. In extreme cases, an unprotected control plane facing a storm of service restarts and reconnections poses a serious challenge to the service SLA.

The Service Mesh architecture pushes traffic proxying down into the Sidecar, and the overall call chain is more complex than with a traditional RPC framework. In extreme cases, a malfunctioning Sidecar affects service traffic, and once such a failure spreads globally it can escalate into a system-level risk. The native components offer high-availability guarantees, but lack a complete fail-safe capability for extreme cases.

In general, Service Mesh in large-scale scenarios still needs solid performance and stability support.

Evolution of microservice architecture: Long-term practice, solid fundamentals

Service frameworks and Service Mesh will coexist for a long time

Both the Service framework and Service Mesh exist to serve the evolution of the business toward a microservice architecture; there is no question of which is better or more advanced. On the contrary, I see the future of microservices as a long-term coexistence and complementarity of Service frameworks and Service Mesh.

Spring Cloud, Dubbo and gRPC are mature Service frameworks; although they differ in positioning and development model, they remain valid long-term choices for business services. Even under a Service Mesh architecture, easy-to-use frameworks and common protocols are still needed to carry service traffic into the Sidecar. More of the service-level traffic governance moves down from the framework into the Sidecar, while the framework's code-level governance can be retained, so the fine-grained governance of the Service framework and the traffic governance of the Service Mesh complement each other. A microservice architecture in this coexisting state needs to be backed by compatible and stable solutions.

Imperceptibility to the business matters more

As an enterprise's microservice architecture evolves toward Service Mesh, keeping the change barely perceptible to the business becomes ever more important.

As noted above, enterprises face many challenges in complex scenarios when evolving from microservices to Service Mesh. Keeping the actual migration barely perceptible, or entirely imperceptible, to the business is what sustains confidence in the evolution: interconnected registries, interoperable request traffic, aligned governance capabilities, gray-release migration, unified platform control… Solving business pain points through architecture evolution is the goal, but an imperceptible business migration is the starting point of everything.

Platform convergence

Faced with the challenges of evolving a microservice architecture, enterprises either invest their own R&D resources or turn to cloud vendors for systematic guarantees. Either way, I believe the capabilities of enterprise microservice platforms will converge over the long term.

At the 2020 Cloud Native Industry Conference hosted by the China Academy of Information and Communications Technology (CAICT), a series of evaluation results in the cloud native field were released. In the first evaluation of microservice platforms, cloud vendors including Alibaba, NetEase, Tencent and Huawei passed CAICT's rigorous test of hundreds of microservice capability items and were awarded the highest rating (advanced level). The evaluation covers the complete capability loop of a microservice platform; these hundreds of capability checks not only clarify what an enterprise-grade microservice platform should provide, but also reflect, from another angle, the convergence of microservice platforms.

Honing the fundamentals, always on the road

Cloud native thinking brings a fresh opportunity to the age-old topic of microservice architecture evolution. Both the evolution of Service frameworks and the adoption of Service Mesh pose great challenges to enterprises and major cloud vendors alike. As the capabilities of enterprise-grade microservice platforms converge, honing the fundamentals of microservice capability and actively tackling the challenges of low-intrusion onboarding and large-scale production support is the right direction for the evolution of microservice architecture.

About the author

Pei Fei is a cloud computing technology expert and senior architect at NetEase Shufan, with 10 years of experience in enterprise platform architecture and development, and currently leads the NetEase microservice governance team, focusing on the research and implementation of enterprise microservice architecture and cloud native technology. Pei has led the team in delivering several projects within NetEase Group, including the NetEase Qingzhou (Light Boat) Service Mesh, the NSF microservice framework and the API gateway, as well as their productization as commercial offerings.

References

Spring Cloud Netflix Projects Entering Maintenance Mode

Apache Dubbo Annual Review and Summary, 2019 to 2020

Interview with Christian Posta — Istio 1.7 will be the most stable version available for production

Istio 1.7 — Storm Chasers

CAICT hosted the 2020 Cloud Native Industry Conference and released a series of evaluation results in the cloud native field