Author | Zhao Qingjie (Lu Ling)    Source | Serverless official account
First, achievements of large-scale Serverless adoption within the group
In 2020, we made major upgrades to the Serverless infrastructure: compute moved to the fourth-generation X-Dragon architecture, storage to Pangu 2.0, and networking to the 100G Luoshen network. Overall performance doubled after the upgrade. The BaaS layer was also greatly expanded, for example with support for EventBridge and Serverless Workflow, further improving system capabilities.
In addition, we worked with more than a dozen business units (BUs) in the group to help them adopt Serverless products, including core Double 11 application scenarios. Serverless helped these applications pass the Double 11 traffic peak, showing that it remains stable even in core application scenarios.
Second, two backgrounds and two advantages: accelerating Serverless adoption
1. Two backgrounds of Serverless
Why was the group able to adopt Serverless at scale so quickly? There are two main premises:
The first background is moving to the cloud. Only by moving to the cloud can the group enjoy the cloud's elasticity dividend; on an internal cloud, later efficiency gains and cost reductions are very hard to achieve. Alibaba therefore moved 100% of its core systems onto the cloud for Double 11 (Singles' Day) in 2019. With that in place, Serverless has room to play a very useful role.
The second background is comprehensive cloud-native transformation. We built a strong family of cloud-native products on the cloud and made them available to the group's internal businesses, helping them pursue two main goals on top of the cloud: improving efficiency and reducing cost. In 2020, the Tmall Double 11 core systems became fully cloud native, with efficiency improved by 100% and cost reduced by 80%.
2. Two advantages of Serverless
- Improving efficiency
For a standard cloud-native application, going from development to launch to operations means completing all the work items marked in orange in the figure above before the service can formally go online. The first is building the CI/CD pipeline for the code. The others are the visualization-related work items of system operations, which require not only configuration and integration but also traffic evaluation of the entire data link, security evaluation, traffic management, and so on. This clearly sets a very high bar for staffing. In addition, to improve resource utilization, different workloads must be co-located, which raises the bar further.
For traditional cloud-native applications as a whole, it is very hard for a developer alone to complete all the work items needed to launch a microservice; multiple roles are required. In the Serverless era, however, developers only need to complete the coding box marked in blue in the figure above. The Serverless R&D platform handles all the subsequent work items and brings the business online directly.
- Reducing cost
Efficiency improvement mainly means saving labor cost, while cost reduction targets application resource utilization. A conventional application must reserve resources for peak load, and those resources are wasted during troughs. With Serverless, we pay only for what is used and do not reserve resources for peaks, which is the biggest cost advantage of Serverless.
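The difference between reserving for the peak and paying for actual use can be made concrete with a little arithmetic. The sketch below uses an invented 24-hour demand curve and assumes the same price per instance-hour in both models, which real pricing does not guarantee; it only illustrates why utilization is low when capacity is sized for the peak.

```python
# Hypothetical hourly demand (instances needed) over one day:
# long troughs and a short two-hour peak. The numbers are invented.
hourly_demand = [2] * 8 + [4] * 4 + [20] * 2 + [4] * 6 + [2] * 4  # 24 hours


def reserved_instance_hours(demand):
    """Traditional model: capacity sized for the peak is held all day."""
    return max(demand) * len(demand)


def on_demand_instance_hours(demand):
    """Pay-as-you-go model: each hour is billed for what it actually uses."""
    return sum(demand)


reserved = reserved_instance_hours(hourly_demand)
used = on_demand_instance_hours(hourly_demand)
utilization = used / reserved  # share of the reserved capacity that did real work

print(f"reserved={reserved} instance-hours, used={used}, utilization={utilization:.1%}")
```

With this made-up curve, most of the reserved capacity sits idle outside the peak; the flatter the real traffic, the smaller the gap becomes.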
These two backgrounds and two advantages are in line with the trend of cloud technology, so business teams within the group took to Serverless readily. Some large BUs have made Serverless adoption a campaign-level initiative to accelerate it across their business scenarios. Serverless scenarios within the group are now very rich, covering core applications, personalized recommendation, video processing, AI inference, business inspection, and more.
Third, Serverless adoption scenario: front-end light applications
Currently, the front end is the fastest-growing and most widespread Serverless scenario within the group, covering more than 10 BUs including Taobao, AutoNavi, Fliggy, Youku, and Xianyu. So why is the front-end scenario a good fit for Serverless?
The figure above shows the capability model of a full-stack engineer. A microservice application generally involves three roles: front-end engineers, back-end engineers, and operations engineers, who together release the application. To improve efficiency, the full-stack engineer role has emerged in recent years. A full-stack engineer must have the skills of all three roles: front-end application development, back-end system-level development, and attention to the underlying kernel, system resource management, and so on. For a front-end engineer, that bar is obviously very high.
Node.js has since emerged and can take over the back-end development role: a front-end engineer with front-end skills alone can act as both front-end and back-end engineer. The operations engineer, however, still cannot be replaced.
The Serverless platform takes care of the bottom three layers of the triangle above, which greatly lowers the bar for front-end engineers to become full-stack engineers. That is very attractive to front-end developers.
Another reason is that the business characteristics match. Most front-end applications have traffic peaks, which require advance business evaluation and incur evaluation cost. Front-end scenarios also iterate quickly, going online and offline rapidly, so operations cost is high. And without dynamic scaling, resources become fragmented and wasted. With Serverless, the platform addresses all of these concerns automatically, so Serverless is very attractive for front-end scenarios.
1. Front-end adoption scenarios
The figure above shows the main scenarios and technical points of front-end adoption:
BFF to SFF: BFF stands for Backend For Frontend, a layer that front-end engineers operate themselves. With SFF (Serverless For Frontend), operations are handed over entirely to the Serverless platform, and front-end engineers only need to write business code.
Slimming: front-end business logic is moved down into the SFF layer, which handles logic reuse. Operations are likewise handed to the Serverless platform, yielding a lightweight client and efficiency gains from the logic that was moved down.
Cloud-client integration: one codebase for multiple clients, a currently popular development approach, which also requires SFF support.
CSR/SSR: Serverless can serve both server-side rendering and client-side rendering so that the first screen displays quickly. Combined with a CDN, Serverless can act as a front-end acceleration solution.
NoCode: effectively a layer of encapsulation on top of the Serverless platform. A front-end page can be built by dragging and dropping a few components, each of which can be backed by Serverless functions, function aggregation, and so on, achieving a NoCode effect.
Mid-end and back-office scenarios: mainly feature-rich monolithic applications. A monolith can be hosted in Serverless mode to bring mid-end and back-office applications online, which also saves operations effort and reduces cost.
2. Changes to front-end coding
How has coding changed since Serverless was adopted in front-end scenarios?
Anyone familiar with front-end development knows that a front end is generally divided into three layers: State, View, and Logic Engine. With Serverless, some abstract business logic is moved down into cloud functions in the FaaS layer, which are then exposed as FaaS APIs. In code, the various operations can be abstracted as Actions, and each Action can be served by a FaaS function API.
Take a simple page as an example. On the left side of the page are render interfaces that fetch product details, shipping addresses, and so on, implemented on FaaS APIs. On the right is interaction logic such as buy and add, which FaaS APIs can also serve.
In page design, every FaaS API can be reused across multiple pages rather than a single one. By reusing these APIs, or by dragging and dropping them, front-end pages can be assembled, which is very convenient for front-end developers.
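The Action-to-FaaS-API mapping and its reuse across pages can be sketched as follows. This is a minimal illustration, not the group's actual framework: the registry, the Action names, and the page definitions are all hypothetical, and the functions are local stand-ins for what would be remote FaaS API calls.

```python
from typing import Callable, Dict, List

# Hypothetical registry: Action name -> function that serves it.
# In a real deployment each entry would invoke a remote FaaS API.
ACTIONS: Dict[str, Callable[[dict], dict]] = {}


def action(name: str):
    """Decorator that registers a function as the implementation of an Action."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register


@action("getProductDetail")
def get_product_detail(params: dict) -> dict:
    return {"productId": params["productId"], "title": "demo item"}


@action("getShippingAddress")
def get_shipping_address(params: dict) -> dict:
    return {"userId": params["userId"], "address": "demo address"}


@action("buy")
def buy(params: dict) -> dict:
    return {"orderFor": params["productId"], "status": "created"}


# A page is just a list of Action names, so the same Action
# (and the FaaS API behind it) is shared by several pages.
PAGES: Dict[str, List[str]] = {
    "detail_page": ["getProductDetail", "getShippingAddress", "buy"],
    "order_page": ["getShippingAddress", "buy"],
}


def render_page(page: str, params: dict) -> dict:
    """Call every Action the page declares and collect the results by name."""
    return {name: ACTIONS[name](params) for name in PAGES[page]}
```

Calling `render_page("order_page", {"userId": "u1", "productId": "p1"})` reuses the same two Actions that the detail page already relies on, without any page-specific back-end code.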
3. Front-end light application R&D efficiency: 1-5-10
After moving front-end applications to Serverless, we summarize the improvement to front-end R&D efficiency as 1-5-10, meaning:
1 minute to get started: we summarized the main scenarios into application templates. When a user starts a new business, they only need to pick the matching template, which quickly generates the business code; the user then only writes their own business functions to get going.
5 minutes to go online: the Serverless operations platform is fully reused, and its built-in capabilities handle grayscale release and similar tasks for the user. Combined with the front-end gateway, traffic switching, and other functions, it supports canary testing.
10 minutes to troubleshoot: once a Serverless function is online, the platform displays business and system metrics. Users can set alarms on these metrics, and error logs are pushed to the console, helping users locate and analyze problems quickly and grasp the health of the whole Serverless function within 10 minutes.
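The grayscale release and traffic switching mentioned in the 5-minute step boil down to sending a controlled share of requests to the new version. The sketch below shows one common way to do that, hashing a request identifier into 100 buckets; it is an illustrative assumption, not a description of how the group's gateway actually splits traffic, and `pick_version` is a hypothetical helper.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a stable slice of requests to the canary version.

    The request ID is hashed into one of 100 buckets, so the same ID
    always lands on the same version for a given percentage.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Example: send about 10% of 10,000 simulated requests to the canary.
routed = [pick_version(f"req-{i}", 10) for i in range(10_000)]
canary_share = routed.count("canary") / len(routed)
```

Raising `canary_percent` step by step toward 100 moves more traffic to the new version, and setting it back to 0 rolls everything back to the stable one.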
4. Results of front-end Serverless adoption
What are the results of front-end Serverless adoption? We compared the performance and person-hours of three applications under the traditional development model with those after adopting FaaS. Performance improved by 38.89% over the original cloud-native baseline, which is impressive for Serverless and front-end applications alike. Serverless scenarios now cover almost the whole group, helping business teams adopt Serverless and achieve the two main goals of improving efficiency and reducing cost.
Fourth, technical output and expansion into new scenarios
While rolling out Serverless within the group, we ran into many new business demands. How can existing businesses be migrated quickly and at low cost? Can the execution time limit be extended? Can resource specifications be raised? We proposed solutions to these problems and abstracted them into product features. The most important ones are introduced below:
1. Custom images
The main purpose of custom images is seamless migration of existing services: users migrate their business code to the Serverless platform in full, with zero code changes.
Migrating existing business is a major pain point. A team cannot sustain two R&D models for long, because doing so causes heavy internal friction. To move business teams onto the Serverless R&D system, a thorough transformation plan is needed. It must support new businesses adopting Serverless and also let existing businesses migrate quickly at zero cost, so we launched the custom container feature.
Characteristics of traditional monolithic web application scenarios:
- Operations burden from modernizing the application, such as fine-grained separation of responsibilities and service governance;
- Historical baggage makes moving to Serverless hard: business code, dependencies, and configuration are inconsistent on and off the cloud;
- Capacity planning, plus self-built operations and monitoring systems;
- Low resource utilization (low-traffic services monopolize resources).
Advantages of Function Compute + container images:
- Low-cost migration of monolithic applications;
- Operations-free;
- No capacity planning, automatic scaling;
- 100% resource utilization, with idle cost optimized.
The custom container feature lets traditional monolithic web applications (such as Spring Boot, WordPress, Flask, Express, and Rails) migrate to Function Compute as images without any modification, avoiding the waste caused by low-traffic services monopolizing servers. Applications also gain the benefits of no capacity planning, automatic scaling, and no operations work.
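What makes this kind of migration possible is that the application inside the image is just an ordinary HTTP server. The sketch below is a standard-library WSGI application of the sort that could be packaged into a container image unchanged. It assumes the platform forwards requests to a port the container listens on; the environment variable name `LISTEN_PORT` and the default of 9000 are illustrative assumptions, not documented platform settings.

```python
import os
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A plain WSGI application; nothing in it is Serverless-specific."""
    body = b"hello from an unmodified web app\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def listen_port(env=os.environ) -> int:
    """Read the port from the environment (hypothetical variable name)."""
    return int(env.get("LISTEN_PORT", "9000"))


def serve() -> None:
    """Start the server; this would be the container's entry point."""
    with make_server("0.0.0.0", listen_port(), app) as server:
        server.serve_forever()
```

Running `serve()` locally and running it as the entry point of an image are the same code path, which is the sense in which the business code needs no transformation.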
2. Performance instances
Performance instances relax usage limits and open up more scenarios. For example, the code package limit has been raised from 50 MB to 500 MB, the execution time limit from 10 minutes to 2 hours, and performance specifications by more than 4 times, with support for large instances of up to 16 GB and 32 GB, helping users run very time-consuming long tasks.
Function Compute serves many scenarios. Along the way we received a lot of feedback: too many constraints, a high barrier to use, and insufficient resources for compute scenarios. For these scenarios we introduced performance instances, which aim to relax the restrictions on Function Compute application scenarios and lower the barrier to use. Execution duration and other limits can be configured flexibly and on demand.
Currently, the 16-core 32 GB instances we support have exactly the same computing capability as an ECS instance of the same specification, and suit high-performance scenarios such as AI inference and audio and video transcoding. This capability will be very important for expanding application scenarios further.
Challenges:
- Elastic instances have many constraints, such as execution time and instance specification;
- Traditional monolithic applications and audio and video computing scenarios require services to be split and reworked, which adds burden;
- Elastic instances give no explicit commitment on resource dimensions such as vCPU, memory, and bandwidth.
Goals:
- Relax the usage limits of Function Compute and lower the barrier for enterprises;
- Be compatible with traditional applications and compute-intensive scenarios;
- Give users a clear resource commitment.
Practice:
- Launch performance instances with higher specifications and clearer resource commitments;
- In the future, performance instances will have higher stability SLAs and richer feature configurations.
Main scenarios: compute-intensive tasks, long-running tasks, and tasks insensitive to elastic scaling.
- Audio and video transcoding;
- AI inference;
- Other computing scenarios that require high specifications.
Advantages:
In addition to relaxed limits, performance instances retain all the capabilities of the current Function Compute product: pay-per-use, reserved mode, multiple requests per instance, integration with multiple event sources, multi-zone disaster recovery, automatic scaling, application build and deployment, and O&M.
3. Link tracing
The link tracing function includes link restoration, topology analysis, and problem location.
In a typical microservice, no single function does all the work; it depends on upstream and downstream services. If those services are healthy, link tracing is not needed. But if a downstream service misbehaves, how is the fault located? This is where link tracing helps: it quickly reveals upstream and downstream performance bottlenecks and pinpoints problems.
Function Compute investigated many open-source technical solutions inside and outside the group. It currently supports X-Trace, is compatible with open-source solutions, and provides product capabilities compatible with OpenTracing.
The figure above is a demo of link tracing. With link tracing, the database access overhead of back-end services can be visualized, avoiding the troubleshooting difficulty caused by complex call relationships among a large number of services. Function Compute also supports link analysis at the function code level to help users optimize cold start, critical code paths, and so on.
Serverless products bring great benefits from a business perspective, but encapsulation also creates a problem at this stage: the black box problem. When we provide users with link tracing and open up the black box to them, they can use that visibility to improve their business. This is also the direction in which Serverless will improve user experience, and we will keep investing here to reduce the cost of using Serverless.
Challenges:
- Serverless products bring great benefits from a business perspective, but encapsulation brings black box problems;
- Serverless connects to the cloud ecosystem, and the large number of cloud services leads to complex call relationships;
- Serverless developers still need link restoration, topology analysis, and problem location.
Main advantages of FC + X-Trace:
- Function code-level link analysis helps optimize cold start and key code paths;
- Service invocation-level link tracing helps connect cloud ecosystem services and supports distributed link analysis.
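The idea behind code-level link analysis is to wrap each step in a span that records its parent and duration, so the slow step stands out. The sketch below is a toy tracer written only to illustrate that idea; it is not the X-Trace or OpenTracing API, and the class and span names are hypothetical.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer: records nested spans with their parent and duration."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # spans that are currently open

    @contextmanager
    def span(self, name):
        record = {
            "name": name,
            "parent": self._stack[-1]["name"] if self._stack else None,
            "start": time.perf_counter(),
        }
        self._stack.append(record)
        try:
            yield record
        finally:
            record["duration"] = time.perf_counter() - record["start"]
            self._stack.pop()
            self.spans.append(record)


# Simulated function invocation with two downstream calls.
tracer = Tracer()
with tracer.span("handler"):
    with tracer.span("query_database"):
        time.sleep(0.02)   # stand-in for a slow database call
    with tracer.span("call_downstream_service"):
        time.sleep(0.001)  # stand-in for a fast service call

for s in tracer.spans:
    print(f"{s['name']:<26} parent={s['parent']} {s['duration'] * 1000:.1f} ms")
```

Reading the printed durations per span is the manual equivalent of the visualized database overhead described above: the child span with the largest share of the handler's time is the first place to look.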
4. Asynchronous configuration
In Serverless scenarios we provide offline task processing and message queue consumption, which together account for about 50% of Function Compute usage. With large-scale message consumption, business teams often raise questions about asynchronous invocation: Where do messages come from? Where do they go? Which services consume them? How long does consumption take? What is the success rate? Making the answers visible and configurable is an important problem to solve.
The figure above shows how asynchronous configuration works. First, an asynchronous call is triggered from the user-specified event source, and Function Compute immediately returns a request ID. Meanwhile, it invokes the function and delivers the execution result to Function Compute or to the message queue MNS. Triggers can then be configured on these destinations, so the results, or the topic, can be consumed again as messages. For example, if processing a message fails, it can be configured for secondary processing.
Typical application scenarios:
- First, closing the event loop, such as analyzing delivery results (for example, collecting monitoring metrics and configuring alarms). On the production side, customers can use FC not only to consume events but also to actively produce them.
- Second, routine exception handling, such as failure handling and retry policies.
- Third, resource reclamation. Users can customize how long messages are retained and discard useless messages promptly to save resources, which is a significant optimization in asynchronous scenarios.
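The flow described above (return a request ID immediately, run the function later, route the outcome to a destination, retry failures, and discard messages that have waited too long) can be simulated in a few lines. This is a minimal in-memory sketch under stated assumptions: the class, its parameters, and the two destination lists are hypothetical and do not mirror the Function Compute API.

```python
import time
import uuid
from collections import deque


class AsyncInvoker:
    """In-memory simulation of asynchronous invocation with destinations."""

    def __init__(self, handler, max_retries=2, max_age_seconds=60.0):
        self.handler = handler
        self.max_retries = max_retries          # retries after the first attempt
        self.max_age_seconds = max_age_seconds  # how long a message may wait
        self.queue = deque()
        self.on_success = []  # stand-in for a success destination
        self.on_failure = []  # stand-in for a failure destination

    def invoke_async(self, event, now=None):
        """Enqueue the event and return a request ID right away."""
        request_id = str(uuid.uuid4())
        self.queue.append({
            "id": request_id,
            "event": event,
            "attempts": 0,
            "enqueued_at": time.time() if now is None else now,
        })
        return request_id

    def drain(self, now=None):
        """Process queued messages until the queue is empty."""
        now = time.time() if now is None else now
        while self.queue:
            msg = self.queue.popleft()
            if now - msg["enqueued_at"] > self.max_age_seconds:
                # Too old: discard instead of spending resources on it.
                self.on_failure.append((msg["id"], "expired"))
                continue
            msg["attempts"] += 1
            try:
                result = self.handler(msg["event"])
            except Exception as exc:
                if msg["attempts"] <= self.max_retries:
                    self.queue.append(msg)  # retry later
                else:
                    self.on_failure.append((msg["id"], repr(exc)))
            else:
                self.on_success.append((msg["id"], result))
```

A caller keeps the request ID returned by `invoke_async` and later looks it up in `on_success` or `on_failure`; anything subscribed to the failure destination can then decide whether to process the message a second time.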
About the author: Zhao Qingjie (Lu Ling) works on the Alibaba Cloud cloud-native Serverless team, focusing on Serverless, PaaS, and distributed system architecture. He is committed to building a new generation of Serverless technology platform and making platform technology more widely accessible. He previously worked at Baidu, where he was responsible for its largest PaaS platform, which carried 80% of the online business, and has extensive experience in PaaS and back-end distributed system architecture.
This article is compiled from the January 26 session of the [Serverless Live Series]. Replay link: developer.aliyun.com/topic/serve…