Introduction: The integration of HSF and Dubbo is an inevitable trend. To better serve internal and external users and to support the healthy development of both frameworks, Dubbo 3.0, and HSF 3 built on Dubbo 3.0, were created to fit the infrastructure ecosystem within the group.
Author | Guo Hao
Dubbo and HSF are both microservice RPC frameworks currently in use at Alibaba. HSF is the more widely used of the two inside Alibaba: it has carried the internal architecture through its evolution from monolithic applications to microservices and has supported the smooth running of Alibaba’s Singles’ Day. Dubbo, on the other hand, quickly became a popular microservice framework in the industry after it was open sourced in 2011 and is widely used both in China and abroad. Since its first version, 1.1, was released in May 2008, HSF has evolved over the years from a basic RPC framework into an easily extensible microservice framework that supports ten trillion calls per day. In internal scenarios, users can join the microservice system with a small amount of configuration and obtain stable, high-performance service invocation. They can also extend HSF to meet business requirements and gain enhanced capabilities across the entire call chain.
The Dubbo project was born in 2008 and was initially used only in one internal Alibaba system. In 2011, Alibaba B2B decided to open source the whole project, and within just one year it had gained a large number of users from different industries. In 2014, updates to Dubbo stopped because of internal team changes. In September 2017, open source development of Dubbo was restarted; the project then went through the Apache incubator and graduated in May 2019, becoming the second project donated by Alibaba to graduate from Apache.
Dubbo and HSF in practice at Alibaba
In 2008, HSF was the main service framework used by Taobao within the group, while Dubbo was used by B2B. The two were independent and went their separate ways. As the business grew rapidly, cross-language, cross-platform, and cross-framework requirements became increasingly apparent, and the call for interconnection between different businesses grew louder, soon becoming a demand of almost the whole group: Taobao should be able to call B2B services and vice versa.
The service framework is like railway track: it is the basis of interworking. Only when interworking at the service framework level is solved can higher-level business interworking be achieved. Unifying on a single standard and building a new generation of service framework was therefore inevitable. The final framework needed to be compatible with both the HSF 1.x and Dubbo (including 1.x and 2.x) protocols. For the group, stability and performance are the core requirements. HSF, which had been tested in high-concurrency e-commerce scenarios, was therefore selected as the core of the new generation of service framework. HSF 2.0 was subsequently released. It was refactored to address the major issues of earlier HSF versions, reducing maintenance costs and further improving stability and performance. HSF 2.0 solved framework extensibility problems such as the non-transparent support for communication protocols and serialization protocols. Based on the Java version of HSF 2.0, the group also developed clients in other languages such as C++, Node.js, and PHP.
Because the new HSF was compatible with the Dubbo protocol, existing Dubbo users could migrate smoothly to the new version. It was therefore rolled out across the group soon after its launch, with the number of deployed servers reaching hundreds of thousands. This essentially completed the unification of Alibaba’s internal microservice framework, which has since withstood the midnight traffic peak of Double 11 for many years.
Challenges and opportunities for the next generation of microservices
However, as the business developed and the frameworks themselves iterated, simple protocol-layer compatibility between the two frameworks could no longer meet the needs. With the continued development of cloud computing and the wide spread of the cloud native concept, microservices are developing along the following trends:
1. K8s has become the de facto standard for resource scheduling, and Service Mesh has been gradually accepted by more and more users since it emerged. Shielding applications from the underlying infrastructure has become a core goal of software architecture evolution, and the question facing both Alibaba and other enterprise users has changed from whether to move to the cloud to how to migrate to the cloud smoothly, stably, and at low cost.
2. Because there are many paths to the cloud, and because migrating from an existing architecture to a cloud native architecture involves a transition state, the facilities on which applications are deployed vary, and microservices on the cloud are also diversifying. Invocation across languages, vendors, and environments will inevitably drive unified protocols and frameworks based on open standards to meet interoperability requirements.
3. Access to back-end services from end devices is growing explosively, and the scale of applications and of the entire microservice system is growing accordingly.
These trends also present new challenges for HSF and Dubbo.
First, moving to the cloud affects internal closed source components. A microservice framework is a fundamental component, and most companies have to decide which one to use either early on, during technology selection, or once their business has grown to a certain size. A stable and efficient self-developed framework usually takes a long period of iteration to polish and optimize, so most companies tend to start with open source components. For Alibaba Cloud this presents a problem: the HSF framework is used internally, while the majority of users on the cloud use the open source Dubbo framework. The two frameworks differ in protocol, internal module abstraction, programming interface, and feature support. How to export the best practices and cutting-edge technologies of Alibaba Group’s internal HSF-based components to the cloud more easily and directly is a problem that every engineer working on technology commercialization will encounter and must solve.
Second, integrating the technology stack of an acquired department or company into the existing technology system is an unavoidable problem. A case in point is Kaola, which joined Alibaba in 2019. Kaola had been using Dubbo as its microservice framework and had built large-scale microservice applications on it, so migration would be costly and risky. The infrastructure teams of the group and of Kaola would need a long time for pre-migration research and solution design, could start making changes only after basic feasibility was confirmed, and would then have to go through a batched gray (canary) release before the final full rollout. A change of this kind not only requires a lot of manpower but also takes a long time, which affects the development and stability of the business.
Third, for historical reasons there has always been a certain number of Dubbo users within the group. To serve these users better, the HSF framework was made compatible with Dubbo at the protocol layer and the API layer. However, this compatibility is limited to interoperability. After years of development in the Dubbo open source community, such basic compatibility has major disadvantages in disaster recovery, performance, and ease of iteration, and it is difficult to align with Dubbo’s service governance system. It also carries stability risks, and these users cannot enjoy the technical dividends of either the group’s own technical development or the evolution of the Dubbo community.
The root cause of these problems is that closed source HSF cannot be used directly by the majority of cloud users and other external users, and the challenge that open source products pose to closed source products will only grow more severe as open source and the cloud continue to develop. The sooner this is resolved, the lower the cost of cloud native migration for Alibaba and external enterprise users, and the greater the value generated.
The simplest and most straightforward way would be to open source HSF as well. But that raises two new problems. First, Dubbo is one of Alibaba’s early open source star products. If HSF were also open sourced, it would be unclear how to delineate the relationship and application scenarios of two such similar frameworks, which would not only confuse external users but also hinder brand building. Second, existing Dubbo users in China and abroad would have to use the existing HSF-based solution if they wanted to move to Alibaba Cloud. Migrating every application that uses Dubbo to HSF would take a lot of effort, and cost and stability would have to be considered. For these two reasons, now is not the best time to open source HSF.
Since HSF cannot go out, the only remaining option is for Dubbo to come in: the HSF framework is rebuilt on the Dubbo kernel by fusing the two cores.
In terms of brand building, the integration allows Dubbo’s already extensive influence to keep growing. Large-scale adoption of Dubbo within the group serves as a strong demonstration for the brand, and gives external users more confidence to choose Dubbo when selecting a microservice framework. At the same time, users who already use Dubbo have more reason to follow the evolution of its versions and enjoy the technical dividends of Alibaba’s open source work.
In engineering practice, rebuilding HSF on Dubbo and unifying from the inside is more feasible: the complexity of migration is controllable, and it can be carried out gradually and in an orderly way. The group’s mature internal testing process and rich scenarios provide the best functional regression testing for Dubbo. Integrating the internal and external frameworks is also the best way to balance commercialization and internal support. Improving features and performance and embracing a newer, more cloud native technology stack during the refactoring is also the best way to improve the user experience within the group.
Therefore, the integration of HSF and Dubbo is an inevitable trend. To better serve internal and external users and to support the healthy development of both frameworks, Dubbo 3.0, and HSF 3 built on Dubbo 3.0, were created to fit the infrastructure ecosystem within the group.
Next generation cloud native microservices
First, a general overview of Dubbo 3.0.
- Dubbo 3.0 supports a new service discovery model. It starts from the application model, optimizes the storage structure, and aligns with the mainstream cloud native design model, which avoids interoperability problems at the model level. The new model organizes data in a highly compressed form, which can significantly improve performance and cluster scalability.
- Dubbo 3.0 proposes the next generation RPC protocol, Triple. It is a new open protocol designed on top of HTTP/2 and fully compatible with the gRPC protocol. Because it is based on HTTP/2, it is gateway friendly and passes easily through network intermediaries. Full compatibility with the gRPC protocol gives it a natural advantage in multi-language interoperability.
- For cloud native traffic governance, Dubbo 3.0 provides a unified set of governance rules that covers traditional SDK deployment, Service Mesh deployment, VM deployment, and container deployment. A single set of rules supports most scenarios, which greatly reduces the cost of traffic governance and makes global traffic governance possible across heterogeneous systems.
- Dubbo 3.0 provides solutions for joining a Service Mesh, with two access modes. The first is the Thin SDK mode, whose deployment model is exactly the same as that of current mainstream Service Mesh deployments. Dubbo is slimmed down: governance functions that duplicate those of the Mesh are switched off, and only the core RPC capability is retained. The second is the Proxyless mode, in which Dubbo takes over the responsibilities of the Sidecar, communicates actively with the control plane, and applies cloud native traffic governance based on the unified governance rules of Dubbo 3.0.
1. Application level registration discovery model
A prototype of the application-level registration and discovery model was first proposed in Dubbo 2.7.6 and, after several iterations, became the stable model in Dubbo 3.0. In Dubbo 2.7 and earlier, services were registered and discovered only at interface granularity. For each interface, the registry holds one entry per machine, and each entry carries that machine’s metadata or interface-level configuration, such as serialization, data center, unit, and timeout settings. When the servers that provide a service restart or publish, each interface changes independently at interface granularity.
For example, suppose a gateway application relies on 30 interfaces of an upstream application. When the upstream application is published, 30 address lists change as its machines go offline and come back online. Treating the interface as the first-class citizen of registration and discovery was the earliest way of decoupling in SOA and microservices, and it provides the flexibility to change anything from a single node to a single service dynamically and independently. As the business grows, the number of services that a single application depends on keeps increasing, and the number of machines behind each service provider also grows for business or capacity reasons. The total number of service addresses that clients depend on rises rapidly, and the pressure on dependent components such as registries multiplies.
We noticed two trends in the development of the microservice model. First, now that the split of monolithic applications into multiple microservice applications is basically complete, large-scale service splitting and reorganization is no longer a pain point, and most interfaces are provided by only one application or by a fixed few. Second, the large number of URLs used to mark address information is extremely redundant: settings such as timeout and serialization change very rarely yet appear in every URL. This is how application-level registration and discovery came into being.
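The URL redundancy described above can be illustrated with a small sketch. The provider URLs, interface names, and parameter values below are hypothetical and are not taken from a real registry; the point is only that settings shared by every URL can be stored once, separately from the per-machine addresses.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical interface-level provider URLs: one per (machine, interface).
urls = [
    f"dubbo://10.0.0.{host}:20880/com.example.Service{i}"
    "?timeout=3000&serialization=hessian2"
    for host in range(1, 4)
    for i in range(2)
]


def split_common(urls):
    """Separate parameters shared by all URLs from the distinct addresses."""
    parsed = [urlsplit(u) for u in urls]
    param_sets = [set(parse_qsl(p.query)) for p in parsed]
    common = set.intersection(*param_sets)       # identical in every URL
    addresses = sorted({p.netloc for p in parsed})  # one entry per machine
    return dict(common), addresses


common, addresses = split_common(urls)
print(len(urls), "URLs ->", len(addresses), "addresses + shared", common)
```

In this toy data set, six URLs collapse into three addresses plus one shared copy of the rarely changing settings.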
Application-level service discovery takes the application as the basic dimension of registration and discovery. The main difference from the interface level is as follows. At interface granularity, an application that provides 100 interfaces must register 100 nodes in the registry; if the application runs on 100 machines, every release means 10,000 virtual node changes for its clients. Application-level registration and discovery needs only one node, and only 100 virtual nodes change per release. For applications that depend on many services running on many machines, this is a reduction of tens to hundreds of times, and memory usage drops by at least half.
Finally, because the new service discovery model is consistent with the service discovery model of Spring Cloud, Service Mesh, and other architectures, Dubbo can discover, and be discovered by, nodes in those architectures at the registry level, which enables interconnection across heterogeneous microservice architectures.
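The arithmetic in the example above can be written out as a short sketch. The function names are hypothetical; the figures (100 interfaces, 100 machines) come from the text.

```python
def interface_level_changes(interfaces: int, machines: int) -> int:
    """Each interface has its own address list, so every machine
    appears once per interface when the application is released."""
    return interfaces * machines


def application_level_changes(machines: int) -> int:
    """The application is the only registered node, so each machine
    appears once regardless of how many interfaces it serves."""
    return machines


interfaces, machines = 100, 100
before = interface_level_changes(interfaces, machines)
after = application_level_changes(machines)
print(before, after, before // after)  # 10000 100 100
```

The reduction factor equals the number of interfaces per application, which is why applications exposing many interfaces benefit the most.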
2. The next generation RPC protocol — Triple
The most basic capability of an RPC framework is to complete service invocations across business processes, linking services into chains and networks, and the protocol is the core carrier of that capability. The Dubbo 2.0 protocol provides the core semantics of RPC, including the protocol header, flag bits, request ID, and request/response data, which are assembled as binary data in a fixed order.
The Dubbo 2.0 protocol faces two main challenges in the cloud native era. First, it is not interoperable with the wider ecosystem, and users cannot readily understand a binary protocol. Second, it is not friendly to gateway components such as Mesh: the whole protocol must be parsed to obtain the required call metadata, such as parts of the RPC context, which causes problems ranging from performance to ease of use. In addition, practice has shown that the design and implementation of the old Dubbo 2.0 RPC protocol limit the development of business architecture in some respects, such as interaction between end devices and back-end services, the use of multiple languages in a microservice architecture, and the data transmission model between services. So, while supporting existing functions and solving existing problems, what features should the next generation protocol have?
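As a rough illustration of what "binary data in a fixed order" means, here is a minimal sketch of packing and unpacking a Dubbo 2 style header. The field widths (a 16-byte header with a 2-byte magic number, one flag byte, one status byte, an 8-byte request ID, and a 4-byte body length) follow the commonly documented layout and are an assumption here, not something stated in this article; check the official protocol reference before relying on them. The body is left as opaque bytes.

```python
import struct

MAGIC = 0xDABB               # assumed magic number of the Dubbo 2 header
FLAG_REQUEST = 0x80          # flag bit: frame is a request
FLAG_TWO_WAY = 0x40          # flag bit: caller expects a response
SERIALIZATION_MASK = 0x1F    # low 5 bits of the flag byte: serialization id
# magic (2 bytes), flags (1), status (1), request id (8), body length (4)
HEADER = struct.Struct(">HBBQI")


def encode_request(request_id: int, body: bytes,
                   serialization_id: int = 2, two_way: bool = True) -> bytes:
    """Build a request frame: fixed header followed by the serialized body."""
    flags = FLAG_REQUEST | (serialization_id & SERIALIZATION_MASK)
    if two_way:
        flags |= FLAG_TWO_WAY
    return HEADER.pack(MAGIC, flags, 0, request_id, len(body)) + body


def decode_header(frame: bytes) -> dict:
    """Read the fixed header; the body still needs a separate deserializer."""
    magic, flags, status, request_id, length = HEADER.unpack_from(frame)
    if magic != MAGIC:
        raise ValueError("unexpected magic number")
    return {
        "is_request": bool(flags & FLAG_REQUEST),
        "two_way": bool(flags & FLAG_TWO_WAY),
        "serialization_id": flags & SERIALIZATION_MASK,
        "status": status,
        "request_id": request_id,
        "body_length": length,
    }


frame = encode_request(7, b"payload")
header = decode_header(frame)
```

Note that the header carries only framing information. Call metadata such as the RPC context lives inside the serialized body, so an intermediary has to deserialize the body to read it.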
First, the new protocol should be easy to extend, including but not limited to support for tracing and monitoring, and it should be recognizable by devices at every layer so that it is easier for users to understand.
Second, the protocol needs to solve cross-language communication. Both the traditional one-SDK-per-language mode and the Mesh cross-language mode need a more general and more easily extensible data transmission format.
Finally, the protocol should provide a more complete request model, supporting Streaming and Bidirectional models in addition to Request/Response.
Given these requirements, the combination of HTTP/2 and Protobuf is the best fit. Mention these two and gRPC easily comes to mind. So what is the relationship between the new generation protocol and gRPC?
First, the new Dubbo protocol is an extension built on gRPC, which ensures that the new protocol can interoperate with gRPC and share its ecosystem. Second, on that basis, the new Dubbo protocol supports Dubbo’s service governance more natively and offers greater flexibility. On the serialization side, since most applications do not currently use Protobuf, the new protocol supports existing serialization formats so that current applications can be adapted smoothly and migrated to Protobuf more easily. In the request model, the new protocol natively supports end-to-end, full-link Reactive semantics, which gRPC does not have.
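Since the new protocol is described as gRPC compatible, the gRPC wire format gives a feel for why it is friendlier to gateways. The sketch below covers only two pieces of gRPC over HTTP/2: the length-prefixed message framing (a 1-byte compression flag and a 4-byte big-endian length before each message) and the request headers that carry the call target. Triple-specific headers are not shown, and the service and method names are hypothetical.

```python
import struct

PREFIX = struct.Struct(">BI")  # compression flag (1 byte), length (4 bytes)


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Wrap one serialized message in the gRPC length-prefixed framing."""
    return PREFIX.pack(int(compressed), len(payload)) + payload


def unframe_messages(data: bytes) -> list:
    """Split a buffer of complete frames into (compressed, payload) pairs.
    Several frames in one direction are what a streaming call looks like."""
    messages, offset = [], 0
    while offset < len(data):
        compressed, length = PREFIX.unpack_from(data, offset)
        offset += PREFIX.size
        messages.append((bool(compressed), data[offset:offset + length]))
        offset += length
    return messages


def request_headers(service: str, method: str) -> dict:
    """HTTP/2 request headers for a gRPC call; a gateway can route on
    ':path' without touching the message body."""
    return {
        ":method": "POST",
        ":path": f"/{service}/{method}",
        "content-type": "application/grpc",
        "te": "trailers",
    }


stream = frame_message(b"a") + frame_message(b"bc")
headers = request_headers("com.example.Greeter", "SayHello")
```

Because the call target and metadata travel as ordinary HTTP/2 headers, intermediaries can make routing and governance decisions without deserializing the payload, which is the gateway friendliness the requirements above ask for.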
3. Access the Service Mesh natively
The community development team investigated various ways of fitting Dubbo into the Service Mesh system and finally settled on the two Mesh solutions that best suit Dubbo 3.0.
One is the classic Sidecar-based Service Mesh, and the other is the Proxyless Mesh without a Sidecar. In the Sidecar Mesh solution, deployment is the same as in mainstream Service Mesh deployments. Dubbo 3.0 focuses on giving business applications a transparent upgrade experience. This is more than an upgrade that requires no code changes: it also relies on the lightweight Dubbo 3.0 SDK and the Triple protocol to minimize overhead and operation and maintenance costs along the whole invocation link. This solution is also known as the Thin SDK solution, where Thin refers to removing all unnecessary components. The Proxyless Mesh deployment is the other Mesh form planned for Dubbo 3.0. Its goal is to interact with the control plane directly through the traditional SDK, without starting a Sidecar.
We envision Proxyless Mesh deployment being very useful in the following scenarios:
First, the business wants to upgrade to a Mesh solution but cannot accept the performance loss caused by the Sidecar hijacking traffic, which is common in core business scenarios.
Second, the business wants to reduce the operation and maintenance cost of deploying Sidecars and reduce system complexity.
Third, legacy systems are upgraded slowly and migration takes a long time, so multiple deployment architectures coexist for an extended period.
Finally, the deployment environment is mixed: it includes multiple deployment modes, such as VM and container, and multiple types of applications, such as Thin SDK and Proxyless. For example, Proxyless mode is used for performance-sensitive applications, Thin SDK for peripheral applications, and the multiple data planes are scheduled by a unified control plane.
Overall, Dubbo offers Mesh solutions for different business scenarios, different migration stages, and different levels of infrastructure support.
4. Enhanced flexible services
Cloud native has brought a major shift toward technical standardization. Making applications easier to create and run on the cloud, with the ability to scale elastically, is a core goal of all cloud native infrastructure components. With the elasticity of cloud native technologies, an application can scale out to a large number of machines in a very short time to support business needs. For example, to cope with midnight flash sales or emergencies, applications often need thousands or even tens of thousands of nodes to meet user demand. However, such scale-out also brings many of the problems of large-scale cluster deployment in cloud native scenarios. For example, with so many nodes in a cluster, node anomalies occur frequently, and service capacity is affected by various external factors, so nodes end up with unequal service capacity.
Dubbo hopes to solve these problems with a flexible cluster scheduling mechanism. The mechanism addresses two problems: first, keeping a distributed service stable, without avalanches, when nodes behave abnormally; second, letting large-scale applications run in their best state and deliver higher throughput and performance. From the perspective of a single service, Dubbo hopes to provide a service that cannot be overwhelmed: when the number of requests is extremely high, it can selectively reject some requests to guarantee the correctness and timeliness of the business as a whole.
From the distributed perspective, the goal is to minimize the overall performance degradation caused by complex topologies and differences in node performance. The flexible scheduling mechanism allocates traffic dynamically in an optimal way, so that a heterogeneous system distributes requests sensibly according to each node’s accurately measured service capacity at runtime and achieves optimal performance.
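The idea of selectively rejecting requests under overload can be sketched with a simple concurrency limiter. This is only an illustration of load shedding in general: the class name and the fixed limit are hypothetical, and the article does not describe the actual algorithm Dubbo uses, which may adapt the limit at runtime.

```python
import threading


class ConcurrencyLimiter:
    """Admit a request only while the number of in-flight requests is
    below a limit; reject the rest quickly instead of queueing them."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.rejected = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        with self._lock:
            if self.in_flight >= self.max_in_flight:
                self.rejected += 1   # shed load: fail fast
                return False
            self.in_flight += 1
            return True

    def release(self) -> None:
        with self._lock:
            if self.in_flight > 0:
                self.in_flight -= 1


limiter = ConcurrencyLimiter(max_in_flight=2)
results = [limiter.try_acquire() for _ in range(3)]  # third call is rejected
limiter.release()                                    # one request finishes
admitted_after_release = limiter.try_acquire()
```

Rejecting early keeps latency bounded for the requests that are admitted, which is one way a single service can stay responsive when demand exceeds its capacity.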
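Allocating requests according to node capacity can be illustrated with smooth weighted round-robin, a well-known selection technique. It is used here as a stand-in: the article does not say which algorithm Dubbo applies, and the node names and capacity figures are hypothetical. In a real system the capacities would be measured and updated at runtime.

```python
from collections import Counter


def smooth_weighted_round_robin(capacities: dict, n: int) -> list:
    """Pick n nodes so that each node receives a share of requests
    proportional to its capacity, with picks interleaved over time."""
    current = {node: 0 for node in capacities}
    total = sum(capacities.values())
    picks = []
    for _ in range(n):
        # Every node earns credit equal to its capacity on each round.
        for node, weight in capacities.items():
            current[node] += weight
        # The node with the most credit serves the request and pays it back.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Hypothetical runtime capacities: node "a" can handle five times the load.
capacities = {"a": 5, "b": 1, "c": 1}
picks = smooth_weighted_round_robin(capacities, 7)
share = Counter(picks)
```

Over one full cycle of seven requests, each node is chosen in proportion to its capacity, so slower nodes are not pushed beyond what they can serve.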
5. Business benefits
What the business cares about more is the benefit of upgrading to Dubbo 3.0. It can be summed up in two key points: better performance and stability for the application itself, and native access to the cloud.
- In terms of performance and stability, Dubbo 3.0 focuses on large-scale cluster deployment scenarios. It optimizes how data is stored to reduce resource consumption on a single machine, and it keeps the whole cluster stable during horizontal scaling of very large clusters. In addition, Dubbo 3.0 introduces the concept of flexible services, which can to a certain extent guarantee and improve the reliability and resource utilization of the whole link.
- Dubbo 3.0 is the milestone version in which Dubbo fully embraces cloud native. Dubbo currently has a huge user base in China and abroad, and with the arrival of the cloud native era these users have an increasingly strong demand to move to the cloud. Dubbo 3.0 will provide a complete and proven solution, migration path, and set of best practices to help enterprises make the transition to cloud native and reap its benefits.
Dubbo 3.0 will significantly reduce the additional resource consumption caused by the framework and improve system resource utilization. On a single machine, Dubbo 3.0 saves about 50% of the memory footprint. At the cluster level, Dubbo 3.0 can support large-scale clusters with millions of instances, laying a foundation for large-scale business expansion in the future. Dubbo 3.0 also supports communication models such as Reactive Stream, which can greatly improve overall throughput in large file transfer and streaming scenarios.
Architecturally, Dubbo 3.0 opens up more possibilities for upgrading business architecture. The original Dubbo protocol restricts, to some extent, how microservices can be accessed. For example, mobile and front-end businesses have to go through protocol conversion at the gateway layer to reach Dubbo back-end services. As another example, the original protocol supports only Request/Response communication, so scenarios that need streaming or reverse communication are poorly supported.
During cloud native transformation, what the business cares about most is the amount of change and stability. Being able to upgrade to a cloud native environment with little or no code change is crucial when a business chooses its path to the cloud. Dubbo 3.0 gives the business side a complete solution for cloud native upgrades. Whether the upgrade is driven by changes in the underlying infrastructure or undertaken proactively to address business pain points, the cloud native solutions in Dubbo 3.0 help products upgrade quickly and move into the cloud native era.
6. Status and Roadmap
In terms of internal use, Dubbo 3.0 has been fully rolled out on tens of thousands of nodes across hundreds of applications in the Kaola business, and a large number of applications have moved to the cloud easily with Dubbo 3.0. It is currently being piloted and gradually rolled out at scale in core e-commerce applications, with new features such as application-level registration and discovery and the Triple protocol enabled. Open source users and commercial applications are also migrating from HSF 2 or Dubbo 2.0 to Dubbo 3.0, and the service framework team and the community are compiling migration best practices, which will be published in due course.
Dubbo 3.0 was officially released in June this year as the milestone release since the donation to Apache, and it marks the point at which Apache Dubbo fully embraces cloud native. In November 2021 we will release Dubbo 3.1, which will bring best practices for deploying Dubbo in Mesh scenarios.
Dubbo version 3.2 will be released in March 2022, bringing full support for service flexibility and intelligent traffic scheduling to improve system stability and resource utilization in large-scale application deployments.
In retrospect, both Dubbo and HSF have played a crucial role at different stages in the development of Alibaba and of microservice frameworks. Grounded in the present and looking to the future, Dubbo 3.0, and HSF built on the Dubbo 3.0 kernel, are advancing both externally and internally, aiming to be the most stable and highest-performing microservice framework, to give users the best experience, and to continue leading the development of microservices in the cloud native era.
Guo Hao, head of Alibaba Service Framework, Dubbo 3.0 architect, focuses on distributed system architecture
This article is the original content of Aliyun and shall not be reproduced without permission.