Introduction: In this year’s Tmall Double 11, the middleware supported RMB 540.3 billion in transaction volume and was fully upgraded to a public cloud architecture. The upgrade uses open source as the kernel, the public cloud as the foundation, and OpenAPI for decoupled extension, unifying open source, in-house development, and commercialization under one architecture. Adopting and contributing back to open source promotes community building; Alibaba’s rich business scenarios polish the performance and availability of the technology; and commercial cloud services bring it to more enterprises. Together these create a better user experience and temper the competitiveness of the cloud products across the board.
Author | The middleware team supporting the Group’s move to the cloud
In 2019, 100% of Alibaba’s core systems ran on Alibaba Cloud. In 2021, Alibaba’s businesses became 100% cloud native, making Alibaba the first major technology company in the world to run all of its business on its own public cloud.
Moving all of the Group’s businesses to the public cloud not only makes the move to the cloud predictable, but also proves that Alibaba Cloud can handle technical challenges in a highly demanding and complex environment, and gives customers a more solid, proven basis for enjoying the technology dividends of the cloud.
01 A unified architecture: the trinity of open source, in-house development, and commercialization
In this year’s Tmall Double 11, the middleware supported RMB 540.3 billion in transaction volume and was fully upgraded to a public cloud architecture.
The upgrade uses open source as the kernel, the public cloud as the foundation, and OpenAPI for decoupled extension, so that open source, in-house development, and commercialization share one architecture. Adopting and contributing back to open source promotes community building; Alibaba’s rich business scenarios polish the performance and availability of the technology; and commercial cloud services bring it to more enterprises. Together these create a better user experience and temper the competitiveness of the cloud products across the board.
In the process, Alibaba’s business R&D efficiency increased by 20%, CPU resource utilization increased by 30%, applications became 100% cloud native, the number of online business containers reached one million, computing efficiency improved substantially, and the computing cost of Double 11 fell by 30%.
Below, we walk through the whole process of moving 100% to the cloud: BaaS for the middleware back end, Mesh at the runtime, and Serverless on the business side.
02 BaaS for the middleware back end: stateful applications can also be delivered in minutes
In previous years, Double 11 site delivery was linear: IaaS resources were delivered first, then the middleware, and finally the business applications.
This year, with the middleware upgraded to a public cloud architecture, IaaS resources and middleware were delivered at the same time, saving the time previously spent delivering them one after the other. The entire O&M foundation of the middleware public cloud architecture was switched to K8s, which makes even stateful middleware highly elastic. Middleware delivery time dropped from days to minutes, which greatly improves delivery efficiency and reduces both the time resources are held and their cost.
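The saving from overlapping the first two stages can be illustrated with a small calculation. This is only a sketch: the stage durations are hypothetical placeholders, not Double 11 figures, and the function names are introduced here for illustration.

```python
def sequential_minutes(iaas: float, middleware: float, business: float) -> float:
    """Linear delivery: each stage waits for the previous one to finish."""
    return iaas + middleware + business


def parallel_minutes(iaas: float, middleware: float, business: float) -> float:
    """IaaS and middleware are delivered at the same time; business follows."""
    return max(iaas, middleware) + business


# Hypothetical durations in minutes, chosen only to make the arithmetic visible.
stages = {"iaas": 30.0, "middleware": 20.0, "business": 40.0}

saved = sequential_minutes(**stages) - parallel_minutes(**stages)
print(f"time saved by overlapping IaaS and middleware: {saved} minutes")
```

Under this simple model the saving always equals the shorter of the two overlapped stages, so the benefit grows as middleware delivery itself gets faster.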
The back-end support systems were also comprehensively upgraded. For example, integrating with the Alibaba Cloud account and permission system addresses security, and integrating with the metering and billing system digitizes IT assets, so that the owners in each technical team of the Group can see their costs in the form of bills and optimize them.
The user access layer was also upgraded to support IPv6, in preparation for the overall evolution of Alibaba’s production network to an IPv6 architecture.
03 Mesh for overseas business: remote multi-active capabilities can sink into the Sidecar
Alibaba’s overseas operations span several business forms, such as AliExpress (AE) and Lazada. Their remote multi-active systems were highly intrusive to business code and their technical architectures were not unified, which hurt global high availability and R&D collaboration efficiency.
As the service mesh architecture evolved and matured, we gradually standardized service routing, layered the routing functions, and made them extensible through plug-ins. The remote multi-active system was moved down into the Sidecar and decoupled from business logic, in order to explore a universal, non-intrusive, low-cost solution for remote multi-active deployment. This year the system was fully verified in overseas business, building practical experience for future commercialization.
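The idea of layered, plug-in based routing that lives in the Sidecar rather than in business code can be sketched as follows. This is a minimal illustration and not Alibaba's implementation: `Endpoint`, `SidecarRouter`, the two plug-ins, the unit names `sg` and `us`, and the modulo-based user-to-unit mapping are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Endpoint:
    address: str
    unit: str  # deployment unit (region or cell) the instance belongs to


Request = Dict[str, str]
# A routing plug-in narrows the candidate endpoints for one request.
RoutePlugin = Callable[[Request, List[Endpoint]], List[Endpoint]]


def exclude_units(drained: Sequence[str]) -> RoutePlugin:
    """Disaster-recovery layer: skip units that are being drained."""
    def plugin(request: Request, candidates: List[Endpoint]) -> List[Endpoint]:
        return [e for e in candidates if e.unit not in drained]
    return plugin


def unit_affinity(units: Sequence[str]) -> RoutePlugin:
    """Multi-active layer: keep each user's traffic inside one home unit."""
    def plugin(request: Request, candidates: List[Endpoint]) -> List[Endpoint]:
        user_id = request.get("user_id")
        if user_id is None:
            return candidates  # no affinity information, leave candidates as-is
        home_unit = units[int(user_id) % len(units)]  # hypothetical mapping rule
        in_home_unit = [e for e in candidates if e.unit == home_unit]
        return in_home_unit or candidates  # fall back if the home unit is unavailable
    return plugin


class SidecarRouter:
    """Applies routing plug-ins in order; business code never sees them."""

    def __init__(self, plugins: Sequence[RoutePlugin]) -> None:
        self.plugins = list(plugins)

    def route(self, request: Request, endpoints: Sequence[Endpoint]) -> Endpoint:
        candidates = list(endpoints)
        for plugin in self.plugins:
            candidates = plugin(request, candidates)
        if not candidates:
            raise LookupError("no endpoint left after routing plug-ins")
        return candidates[0]


endpoints = [Endpoint("10.0.0.1:8080", "sg"), Endpoint("10.0.1.1:8080", "us")]
router = SidecarRouter([exclude_units([]), unit_affinity(["sg", "us"])])
print(router.route({"user_id": "7"}, endpoints).unit)
```

Because each layer is an independent plug-in, draining a unit only means changing the argument of `exclude_units`; the affinity layer then falls back to the remaining units without any change to the application.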
As the Mesh architecture was applied more deeply, Alibaba not only moved the remote multi-active function into the Sidecar but also unified its traffic scheduling technology and product architecture on top of the Mesh. This reduced the implementation and governance cost of traffic scheduling, improved service disaster recovery and the efficiency of online service governance, and enabled more flexible and stable scheduling rules and cross-unit traffic switching.
04 Serverless on the business side: a 38% improvement in R&D efficiency and a 200% increase in elasticity
Serverless is Alibaba’s first choice for reducing cost and improving efficiency.
On Double 11 this year, Serverless not only carried three times the peak traffic, but also supported a twofold increase in the number of application scenarios. Overall R&D and O&M efficiency improved by 38%, which is mainly reflected in the following two points.
1. Consolidating the trinity technology system, with Alibaba Cloud Function Compute (FC) supporting full Serverless adoption for the big promotion
Function Compute (FC) was integrated with Alibaba’s internal O&M system through comprehensive, standardized interfaces, closing the last mile of R&D. For the first time, a full-process Serverless R&D system covering the full “FaaS + BaaS” link was realized.
Before Function Compute was adopted inside the Group, the Serverless technology system on the cloud could not be integrated into the Group’s developer ecosystem. Its features were rich and powerful, but businesses could not use them, and adopting Serverless could even increase R&D cost. In 2021 we therefore invested in the Serverless Devs toolchain. Building on standard interfaces and working with the technical community inside the Group, we jointly created an R&D system dedicated to Serverless and brought the cloud technologies smoothly into the Group.
With the Double 11 promotion as the “whetstone”, the key core technologies were further polished and then fed back into the commercial products and toolchain on the cloud, consolidating the trinity technology system. This year it delivered a satisfactory result, fully supporting the 2021 Tmall Double 11 business scenarios. It covered business scenarios at Taobao Deals (Taote), the Taobao family of businesses, Alimama, 1688, AutoNavi, and Fliggy. The number of scenarios increased by 2 times, total peak traffic increased by 3 times compared with last year and broke through 500,000 QPS, and overall R&D efficiency improved by 38%.
2. Increasing investment in Serverless core technology: polished internally in the Tmall Double 11 scenario, and delivered externally to tens of millions of enterprises through the public cloud
In Serverless scenarios, cold start speed is key to customers’ technology choices and a core competitive strength of products on the cloud.
This year we increased investment in core technology R&D and improved cold start performance in several areas, including the elasticity strategy, image distribution, and container startup. Cold start time was reduced by a further 60%, and rigid delivery capacity increased by 200%. At the beginning of the year, when Function Compute had just been adopted inside the Group, cold start at the Runtime layer alone took on the order of seconds, and the middleware also had to be initialized, so the overall cold start time exceeded 2 s, which seriously limited where Serverless could be used.
We therefore developed Serverless Caching for image distribution. It builds data-driven, intelligent, and efficient caching systems tailored to the characteristics of different storage services, achieving software and hardware co-optimization. Even for cold starts with gigabyte-scale images, Function Compute can deliver instances within seconds.
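The article does not describe how Serverless Caching works internally. To make the general idea of caching image data close to the compute nodes concrete, the sketch below uses a plain byte-bounded LRU cache keyed by layer digest as a stand-in; it is not the data-driven policy mentioned above, and `LayerCache`, the digests, and the in-memory `store` are hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class LayerCache:
    """Byte-bounded LRU cache for image layers, keyed by layer digest."""

    def __init__(self, capacity_bytes: int, fetch: Callable[[str], bytes]) -> None:
        self.capacity_bytes = capacity_bytes
        self.fetch = fetch  # slow path, for example a pull from a remote registry
        self.layers: "OrderedDict[str, bytes]" = OrderedDict()
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def get(self, digest: str) -> bytes:
        if digest in self.layers:
            self.hits += 1
            self.layers.move_to_end(digest)  # mark as most recently used
            return self.layers[digest]
        self.misses += 1
        data = self.fetch(digest)
        self._insert(digest, data)
        return data

    def _insert(self, digest: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # too large to cache; it is served without being stored
        while self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self.layers.popitem(last=False)  # drop least recently used
            self.used_bytes -= len(evicted)
        self.layers[digest] = data
        self.used_bytes += len(data)


# Hypothetical backing store with three 4-byte "layers".
store = {"sha256:a": b"x" * 4, "sha256:b": b"y" * 4, "sha256:c": b"z" * 4}
cache = LayerCache(capacity_bytes=8, fetch=store.__getitem__)
for digest in ["sha256:a", "sha256:b", "sha256:a", "sha256:c", "sha256:b"]:
    cache.get(digest)
print(f"hits={cache.hits} misses={cache.misses}")
```

Every hit avoids a pull from the slow backing store, which is why layers shared by many functions are the ones worth keeping warm.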
In scheduling, elasticity strategies based on more indicators, such as scheduled (timed) and CPU-based scaling, were added compared with last year, and, building on the Group’s unified resource scheduling, elasticity on the order of 100,000 instances was supported for Tmall Double 11 business. At the container layer, self-developed secure container pooling further reduced container startup time to less than 50 ms.
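One way a scheduled rule and a CPU indicator can be combined into a single scaling decision is sketched below. It is an assumption-laden illustration rather than Function Compute's actual policy: `ScheduledRule`, the 60% CPU target, the instance cap, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScheduledRule:
    start_hour: int      # inclusive, 0-23
    end_hour: int        # exclusive
    min_instances: int   # capacity reserved ahead of a known peak


def cpu_based_target(current_instances: int, cpu_pct: int, target_pct: int) -> int:
    """Instances needed to bring average CPU back to the target (ceiling division)."""
    return (current_instances * cpu_pct + target_pct - 1) // target_pct


def desired_instances(
    hour: int,
    current_instances: int,
    cpu_pct: int,
    rules: Sequence[ScheduledRule],
    target_pct: int = 60,          # hypothetical CPU utilization target
    max_instances: int = 100_000,  # hypothetical upper bound
) -> int:
    # Scheduled strategy: the highest floor among rules active at this hour.
    scheduled_floor = max(
        (r.min_instances for r in rules if r.start_hour <= hour < r.end_hour),
        default=0,
    )
    # Metric strategy: scale in proportion to observed CPU utilization.
    metric_target = cpu_based_target(current_instances, cpu_pct, target_pct)
    # Take the larger demand, keep at least one instance, respect the cap.
    return min(max(scheduled_floor, metric_target, 1), max_instances)


rules = [ScheduledRule(start_hour=0, end_hour=2, min_instances=5000)]
print(desired_instances(hour=0, current_instances=1000, cpu_pct=90, rules=rules))
print(desired_instances(hour=10, current_instances=1000, cpu_pct=90, rules=rules))
```

Taking the maximum of the two strategies lets the scheduled rule pre-warm capacity before a known peak while the CPU indicator still reacts to load the schedule did not anticipate.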
These technologies, proven in the Double 11 scenario, are fully available on the public cloud and have helped our partners cope with business peaks with ease.
From Ops to Dev: the cloud native technology transformation is entering its second half
Letting customers use the same technology as Alibaba as soon as it is available is the original intention behind the middleware trinity of open source, in-house development, and commercialization. The products that come out of this trinity are helping customers on the cloud improve Ops efficiency.
The commercial outputs of the Trinity include:
- Microservices Engine (MSE): fully managed registry and configuration center (native support for Nacos/ZooKeeper/Eureka), gateway (native support for Ingress/Envoy), and non-intrusive, open source enhanced service governance (native support for Spring Cloud/Dubbo);
- Message Queue (MQ): native support for Apache RocketMQ and Apache Kafka;
- Application Real-Time Monitoring Service (ARMS): native support for Prometheus and open source based tracing;
- Application High Availability Service (AHAS): native support for Sentinel and ChaosBlade;
- Function Compute (FC): support for the open source developer tool Serverless Devs, open source observability tools, and more.
The first half of cloud computing and cloud native was more about Ops, and we believe the second half is more about Dev.
To improve developer efficiency, the middleware team has completed its technical layout in key areas including Serverless, application runtime, low code, cloud-edge integration, and online IDEs. Non-business logic is moved down into the service mesh, the application runtime, and similar technologies, and a new division of R&D labor is formed through plug-ins. Middleware R&D shields the business from complex underlying technology; security R&D establishes a trusted line of defense in the application runtime; and high availability R&D builds common capabilities such as circuit breaking, rate limiting, degradation, and remote multi-active deployment into the lower layers. As a result, business applications become lighter and more focused on business development, and build business competitiveness more efficiently.
This article is original content from Alibaba Cloud and may not be reproduced without permission.