Preface

First, a brief introduction to our business scenario. 1688 belongs to the Domestic Trade Division (CBU) of Alibaba Group. It is the earliest business Alibaba established and has a history of more than ten years. We are mainly responsible for 1688.com on PC and the Alibaba app on mobile, which together form the largest B2B e-commerce trading platform in China at present. The platform focuses on B2B e-commerce scenarios and provides small and medium-sized enterprises with trading channels such as retail, wholesale, distribution, and processing and customization.

I am from the wireless server-side technology team of 1688. The team mainly provides business support for the 1688 mobile app and is responsible for building the scenarios inside it, such as home page recommendations and commodity details, which are typical e-commerce business scenarios.

(E-commerce business scenario)

Serverless technology selection

(FaaS efficiency improvement practice in 1688 complex business scenarios)

1688’s exploration of FaaS (Function as a Service) technology dates back to around 2015. At that time, the biggest business goal of Alibaba Group was “All in wireless”. The mobile Internet was just emerging, so existing PC businesses, whether Taobao or 1688, had to be ported to mobile quickly and shipped as apps to capture mobile traffic.

Against this business background, 1688’s solution was to build a wireless server layer. It calls the existing PC-side business interfaces through the microservice system, performs lightweight business logic orchestration and UI-layer mapping for the mobile user-facing business, and finally exposes service interfaces with the same business capabilities to the app through the mobile gateway.

Features on mobile iterate very quickly, and in this mode we soon ran into problems: a traditional microservice takes a long time to build, develop, deploy and debug, while user-facing business changes are very frequent and must ship fast. This mismatch between technical capability and business needs pushed us to look for better solutions.

FaaS adoption in two stages

The adoption of FaaS in CBU went through two major stages. In the first stage, in 2015, the department internally developed a dynamic loading system based on the JVM to achieve rapid release, fast go-live and hot deployment, basically realizing the effect of FaaS. In the second stage, starting last year, we worked with the Alibaba Cloud FC team to replace the foundation of the whole FaaS capability with Alibaba Cloud Function Compute, gaining better elastic scaling, container isolation and other capabilities.

(1) MBOX: a FaaS system based on the JVM’s dynamic loading capability

As mentioned above, with the whole wireless business iterating rapidly, we needed the ability to publish changes to server-side interfaces quickly. Around 2015, the prevailing idea in Java engineering was to take full advantage of the JVM’s dynamic loading feature: compile external code into class bytecode in real time, without restarting the JVM, and then load it dynamically into the running JVM instance to achieve hot loading.

Based on this idea, we built MBOX, a dynamic service loading system on the JVM. In MBOX, we built a generic lightweight service container that takes a piece of code from outside (perhaps a Java class or a simple Groovy script) and compiles it in real time into class bytecode. The container then performs some security hardening on the generated bytecode (such as eliminating infinite loops) and finally loads it as a class into the running JVM through a custom class loader. Object instances are then created and middleware proxies injected, so that the code can serve external requests.
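The load-check-swap flow described above can be sketched in a few lines. This is only an illustration, written in Python rather than Java for brevity: `ServiceContainer`, `check_source`, `UnsafeCodeError` and the `handle(request)` convention are hypothetical names, not MBOX's actual API, and the loop check is a crude stand-in for real bytecode hardening.

```python
import ast


class UnsafeCodeError(Exception):
    """Raised when submitted source fails the pre-load safety check."""


def check_source(source):
    """Parse the source and reject obviously unbounded loops.

    Crude stand-in for a hardening step: a `while <truthy constant>` loop
    that contains no `break` anywhere inside it is refused.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if (isinstance(node, ast.While)
                and isinstance(node.test, ast.Constant)
                and node.test.value):
            if not any(isinstance(n, ast.Break) for n in ast.walk(node)):
                raise UnsafeCodeError("unbounded loop at line %d" % node.lineno)
    return tree


class ServiceContainer:
    """Keeps the live handler for each service name and swaps it on deploy."""

    def __init__(self):
        self._handlers = {}

    def deploy(self, name, source):
        # Compile at runtime; the hosting process is never restarted.
        tree = check_source(source)
        namespace = {}
        exec(compile(tree, "<%s>" % name, "exec"), namespace)
        handler = namespace.get("handle")
        if not callable(handler):
            raise ValueError("source must define handle(request)")
        # Replacing the dict entry is the hot swap: later calls see the new code.
        self._handlers[name] = handler

    def invoke(self, name, request):
        return self._handlers[name](request)


container = ServiceContainer()
container.deploy("greet", "def handle(request):\n    return 'hello ' + request['name']\n")
first = container.invoke("greet", {"name": "1688"})
container.deploy("greet", "def handle(request):\n    return 'hi ' + request['name']\n")
second = container.invoke("greet", {"name": "1688"})
```

Note that both versions of `greet` run inside the same process and share its memory and CPU, which is exactly the isolation weakness of this approach.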

Based on MBOX, we achieved online coding, online preview and release within seconds. From today’s perspective, it is a very typical FaaS platform, with the following characteristics. First, compared with traditional microservices, it uses an online development mode with very high R&D efficiency. Second, it has a hot-loading update mechanism: releases take seconds, so the iteration efficiency of bringing business online is very high. Third, it gives developers who use the platform a Serverless experience: all operation and maintenance, machine deployment and so on are handled by the MBOX platform, and developers only need to care about implementing their business logic. (Here, we play a role similar to that of today’s cloud service vendors.)

Over the past five years, the MBOX system has carried more than 100,000 QPS of 1688 business calls, with more than 1,500 online functions at its peak. It saved a great deal of manpower, contributed greatly during the expansion stage of the wireless business, and opened the door to our exploration of Serverless technology.

Having covered the advantages, let’s turn to the disadvantages and risks of the system:

The first is isolation. Because MBOX is based on the JVM, and the JVM itself provides no effective resource isolation mechanism (for CPU, memory and so on), there is a significant security risk: multiple services loaded in the same business container can affect each other. For example, if code written by one developer in a business cluster has a memory leak, the performance of the whole cluster may degrade and every service on it may be affected.

The second is that the development mode is too lightweight. It is pure script development: a developer can only write a snippet of code. Development is fast and pleasant, but there is no engineering structure, and frameworks and design patterns cannot be used, so the applicable scenarios are very limited and the quality of the code itself is poor.

The third is resource management, a headache for MBOX maintainers. They often see the load on the whole cluster surge but cannot determine which service is consuming the resources. Moreover, the system cannot scale elastically and relies on manual scaling. In the later stage of platform maintenance, the operation and maintenance cost was very high.

Around 2019, these problems with MBOX had become prominent. At the same time, under the influence of Kubernetes, the industry began its wave of Serverless and cloud native technology. We immediately started the corresponding technology research and, at the end of 2020, launched a joint effort with the Alibaba Cloud Function Compute team, hoping to create a truly cloud-oriented FaaS platform.

(2) Alibaba Cloud FC: a FaaS system based on Serverless + Sidecar

Building on the automated container operation and maintenance capabilities of Kubernetes, Alibaba Cloud Function Compute (FC) has created a FaaS infrastructure that is highly elastic, strongly isolated, and easy to open up and customize. It has basically become the unified FaaS solution within the whole of Alibaba Group.

In addition to its mature foundation and strong elastic, automated operation and maintenance capabilities, FC provides a very open Runtime design that allows any language, or even any team, to customize its own Runtime framework, so as to meet the needs of front-line business developers as far as possible.

For middleware invocation, a long-standing cross-language problem, the FC team and the middleware team drew on Dapr, the technology recently open-sourced by Microsoft, to implement a set of standardized Sidecar capabilities covering RPC, cache, message queue, configuration center and other common middleware. This smooths out differences between languages while slimming down the user’s runtime container, which further improves the cold start and scaling speed of functions.

Finally, with the support of FC’s powerful underlying technology, we jointly built the Runtime framework and the supporting R&D and operations facilities for Java developers in the group, replacing the original JVM-based MBOX system and completing the technological upgrade of our FaaS capability.

Looking back on the evolution of Serverless in CBU, from the earliest microservice architecture, to the self-developed JVM FaaS system, to today’s FC Function Compute, we have gradually found the technical solutions that best suit our business scenarios, and we have also taken the first step in the industry toward implementing FaaS on a large scale. As the first department in the group to put the Serverless concept into practice at scale, we have a FaaS system that covers more than 80% of the department’s business and has been in use for more than five years.

Three hard questions about adopting FaaS

Now that we understand the capabilities and implementation of FaaS, let’s move on to the question that concerns everyone most: how do you put FaaS capabilities to work in business systems?

Based on our practical experience, adopting FaaS in real business is not as “smooth” as most people imagine, and some problems are bound to arise. Here are the three core hard questions that we consider most important:

The first is the transformation of existing business. Existing businesses, especially ones like ours that have been running for years, carry a lot of historical baggage that cannot be shed. How do you convert a large number of traditional Serverful applications running online into Serverless functions? Transforming them directly is very risky. Is there a way to gain the advantages of FaaS at minimum cost while keeping the existing business running stably?

The second is the fragmentation of FaaS. Traditional Serverful applications are built in a very cohesive way: the core capabilities of a business are basically aggregated into a few microservice applications. Functions are different. Their lightweight nature leads to rapid growth in number, and without design and control it is easy to end up with dozens of fragmented functions behind a single business.

The third is improving R&D efficiency. It is widely believed that Serverless can greatly improve R&D efficiency. In real implementations, however, it is not that simple: merely replacing a traditional Serverful application with Serverless functions brings only limited gains.

How does existing complex business fit with FaaS?

Let’s answer the first question first. Through practical exploration of how to combine existing complex business with FaaS capabilities, we arrived at two practical modes: the BFF mode and the extension point mode.

(1) BFF mode

The BFF (Backend for Frontend) mode is now a common, mainstream way to adopt FaaS. We can abstract the logic inside a traditional Serverful application and, generally according to how frequently it changes, divide the business logic into two layers. One part of the code has relatively light logic and no complex dependencies, yet most requirements concentrate on it. We call it the variable layer.

The other part may include the application framework, second-party middleware dependencies, and some core business logic. It changes relatively little, but transforming it is very risky and the benefit may not be worthwhile. We call it the stable layer.

If your business application can be split along these lines, the BFF mode is a good choice: abstract the variable layer and move it into Serverless functions, so that they act as a BFF layer. The user-facing consumer then reaches the Serverful application’s stable-layer API through the function. This creates a buffer layer for the business that enables faster release and delivery with less operation and maintenance. Although there is a fair amount of old code that you cannot get rid of, you can focus 80% of your effort on improving R&D efficiency.

This mode suits user-facing business scenarios. For example, the Controller layer in the traditional MVC architecture is well suited to being replaced by FaaS.
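As a rough illustration of the split, a BFF function only orchestrates calls to the stable layer and maps the result to what the page needs. The sketch below is in Python for brevity. The names `stable_layer_get_offer` and `offer_card_handler` and all field names are hypothetical, and the stable-layer call is stubbed with fixed data where a real function would make an RPC.

```python
def stable_layer_get_offer(offer_id):
    """Hypothetical stand-in for an RPC to the Serverful app's stable-layer API."""
    return {
        "id": offer_id,
        "subject": "Cotton T-shirt",
        "price_cents": 1250,
        "stock": 0,
        "internal_flags": ["not-for-display"],
    }


def offer_card_handler(request, fetch_offer=stable_layer_get_offer):
    """Variable-layer logic: light orchestration and UI mapping only.

    Core rules (pricing, inventory) stay behind the stable-layer API, so
    frequent presentation changes never touch the Serverful application.
    """
    offer = fetch_offer(request["offerId"])
    return {
        "title": offer["subject"],
        "priceText": "CNY {:.2f}".format(offer["price_cents"] / 100),
        "buyButtonEnabled": offer["stock"] > 0,
    }


card = offer_card_handler({"offerId": 42})
```

Passing the stable-layer call in as a parameter is a design choice of this sketch: it keeps the function testable without the real service.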

(2) Extension point mode

The second mode is the extension point mode, which is better suited to middle-platform and back-office scenarios. Some middle-platform systems in our business, such as our commodity center, use this mode.

Middle-platform and back-office applications are generally very complex, with a great deal of business logic and a long history, so they are not suitable for radical transformation. However, we can abstract the complex business logic layer to some extent, design some key extension points for the future, and provide a FaaS adaptation scheme for those extension points.

In this way, subsequent incremental business logic can be delivered with FaaS, while existing business logic can stay basically unchanged, needing only a slight adaptation of the code structure to become a standard extension point implementation.

Another advantage of the extension point mode is that it opens up a previously closed architecture. Even for middle-platform applications, once the integration specification for an extension point is established, any business party can provide customized FaaS functions to get the extension capability it wants. In the 1688 commodity middle-platform system, open business customization of the commodity price calculation logic is implemented through this extension point mode.
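A minimal sketch of an extension point for price calculation might look like the following, again in Python for brevity. `PriceExtensionPoint`, the business codes and the 10% distribution discount are all made up for illustration. In a real setup a registered rule would be a FaaS function invoked through the function gateway, not a local callable.

```python
class PriceExtensionPoint:
    """A stable core exposing one customizable step: price calculation."""

    def __init__(self, default_rule):
        self._default = default_rule
        self._rules = {}

    def register(self, business_code, rule):
        # A business party plugs in its own rule; the core code is untouched.
        self._rules[business_code] = rule

    def calculate(self, business_code, ctx):
        # Fall back to the existing logic when no customization is registered.
        rule = self._rules.get(business_code, self._default)
        return rule(ctx)


def default_price(ctx):
    """Existing logic, adapted into a standard extension point implementation."""
    return ctx["unit_price_cents"] * ctx["quantity"]


def distribution_price(ctx):
    """Hypothetical customization: distributors pay 90% of the default price."""
    return default_price(ctx) * 90 // 100


prices = PriceExtensionPoint(default_price)
prices.register("distribution", distribution_price)

order = {"unit_price_cents": 1000, "quantity": 3}
wholesale_total = prices.calculate("wholesale", order)
distribution_total = prices.calculate("distribution", order)
```

Here `wholesale` has no registered rule, so it falls through to the default logic, which is how existing business stays unchanged while new business customizes.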

Avoid fragmentation problems with FaaS

(1) Trade-off of programming interface

In the early days, when we defined the programming interface of functions in MBOX, we adopted script programming: the user’s programming granularity was a piece of code, a single Java class. Although very lightweight to write, this approach led to a very large number of functions, because a slightly more complex business might require many scripts. Code quality was also poor, because there was no engineering structure and design patterns could not be used.

Therefore, when designing the programming interface of functions on FC, we set a rule based on this experience: the operational granularity of a user function should be a “Micro App”, not a “Single Function”, and the granularity of the programming interface should be a “Code Project”, not a “Single Script”.

Under this principle, a function instance is, for developers, closer to a small application that keeps a compact overall structure. It can implement a single function point at relatively low cost, and it can also take on the development of complex logic and bring in second-party and third-party libraries, which mitigates function sprawl and fragmentation.
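To make the granularity rule concrete, the sketch below shows one deployable unit that groups several related interfaces and shares helper code between them, instead of shipping each as a separate script. It is written in Python for brevity, and `MicroApp`, the routes and the event shape are hypothetical rather than the FC Runtime framework's actual interface.

```python
class MicroApp:
    """One function instance that hosts several related interfaces."""

    def __init__(self, name):
        self.name = name
        self._routes = {}

    def route(self, path):
        """Decorator that registers a handler under a path."""
        def decorator(fn):
            self._routes[path] = fn
            return fn
        return decorator

    def handle(self, event):
        """Single entry point the platform would call for every request."""
        handler = self._routes.get(event["path"])
        if handler is None:
            return {"status": 404, "body": None}
        return {"status": 200, "body": handler(event.get("params", {}))}


app = MicroApp("offer-detail")


def format_price(cents):
    """Shared helper: possible because the unit is a project, not a script."""
    return "{:.2f}".format(cents / 100)


@app.route("/price")
def price(params):
    return {"price": format_price(params["cents"])}


@app.route("/title")
def title(params):
    return {"title": params["subject"].strip()}


response = app.handle({"path": "/price", "params": {"cents": 1999}})
```

With script-level granularity, `/price` and `/title` would be two separately deployed functions with no way to share `format_price`.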

(2) Construction of internal service market

Even with function granularity defined as a Micro App, the number of functions in use in the business is still several orders of magnitude larger than the number of traditional microservice applications. To address this, we designed a four-level classification for function definitions, [business domain] - [function group] - [function] - [interface], and embedded it in the plugin of the function project template. This grouping and classification information is automatically collected and reported when a function is built and released, and is used to build a function service market for internal R&D personnel, where the existing function classification and the ownership information of each interface API can be seen at a glance.
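The four-level classification and the reporting step can be sketched as a small catalog. This is a Python illustration only: `InterfaceMeta`, `ServiceMarket` and the sample entries are hypothetical, and in practice the `report` call would be made automatically by the project template plugin at release time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceMeta:
    """Four-level classification plus ownership for one interface."""
    domain: str      # business domain
    group: str       # function group
    function: str    # function (one Micro App)
    interface: str   # interface exposed by the function
    owner: str


class ServiceMarket:
    """Collects reported metadata and renders it as a browsable tree."""

    def __init__(self):
        self._entries = []

    def report(self, meta):
        # Would be called by the build plugin when a function is released.
        self._entries.append(meta)

    def tree(self):
        """Return domain -> group -> function -> interface -> owner."""
        result = {}
        for m in self._entries:
            functions = result.setdefault(m.domain, {}).setdefault(m.group, {})
            functions.setdefault(m.function, {})[m.interface] = m.owner
        return result

    def owner_of(self, interface):
        """Look up who owns a given interface, or None if unknown."""
        for m in self._entries:
            if m.interface == interface:
                return m.owner
        return None


market = ServiceMarket()
market.report(InterfaceMeta("commodity", "detail", "offer-detail", "/price", "team-a"))
market.report(InterfaceMeta("commodity", "detail", "offer-detail", "/title", "team-a"))
market.report(InterfaceMeta("trade", "order", "order-list", "/recent", "team-b"))
catalog = market.tree()
```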

Where are the bottlenecks in R&D efficiency?

Letting developers “only focus on business” has been the core concept of Serverless since its birth. However, if we simply switch the underlying operations infrastructure to Serverless infrastructure and introduce related technical capabilities such as FaaS, R&D personnel are still far from “only focusing on business” in the real sense.

In our early Serverless practice, we did find that developers’ coding efficiency seemed much higher. But from the perspective of the business team’s overall requirement delivery, R&D efficiency did not change qualitatively: most requirements still went through a lengthy development process, and the communication and coordination costs of pushing that process forward remained high. A closer look reveals that the key bottleneck to R&D efficiency often lies not in “R&D” itself, but outside the code.

So is it a false proposition that Serverless can improve R&D efficiency? Of course not. First, Serverless and FaaS can significantly reduce the cost of operations and coding. Second, Serverless lowers the technical threshold of server-side work, so that developers who are not server-side specialists can also develop some simple business logic, which makes full-stack development of a requirement possible.

From the perspective of overall requirement delivery, if one developer can independently complete all the development work for a requirement without coordinating with others, efficiency is bound to be maximized. Serverless makes it possible to implement and popularize this new R&D mode. Perhaps this is the real meaning of “only focus on business”.

In short, Serverless is not a silver bullet for improving R&D efficiency. In fact, no technology is. When we expect R&D efficiency to improve, we should look at the whole picture rather than only at the “R&D” stage.

Efficiency improvement practice in a complex business scenario

Finally, we use a real business scenario as an example to show how 1688 combines FaaS capabilities to carry out R&D and improve efficiency in an existing complex business scenario.

Here is a brief introduction to 1688’s commodity details scenario. The commodity details page is the final display page for the buyer and carries a large amount of commodity information. It differs from the details pages of ordinary consumer-facing (B2C) e-commerce in that B2B trading has multiple transaction channels, such as spot wholesale, distribution, and processing and customization, and each channel has its own transaction mode, price logic and inventory logic. In addition, in B2B e-commerce, commodities in different industries, such as consumer goods and industrial products, are presented very differently. Channel and industry customization, with all kinds of e-commerce marketing activities layered on top, make the business complexity of 1688’s commodity details page very high.

Multiple teams were involved in the original technical architecture:
• The underlying commodity base team: connects to the group’s commodity middle-platform capabilities, maintains the services that embody the domain model, and owns some core commodity logic.
• The wireless server team: wraps the underlying basic commodity services into dedicated commodity details interfaces for the client and front end.
• The page building and delivery team: provides horizontal capabilities for page building and content delivery, and covers the iOS, Android and front-end presentation layers.
• In addition, the corresponding business teams need to be involved for some business customization requirements (such as distribution).

In this mode, as many as five or six teams work on one requirement at the same time, and the communication cost is very high. In addition, the server side uses relatively heavyweight microservice applications, which leads to low R&D and operations efficiency.

User-facing business logic consolidation: BFF mode

We first introduced FaaS in BFF mode at the user-facing business logic layer, the fastest-changing core of the business, and moved all UI-related logic into FaaS functions. On the one hand, this improves the development and deployment efficiency of the corresponding server-side logic. On the other hand, front-end and client components only need to handle the simplest presentation logic, which smooths out technical differences between clients as far as possible and makes some cross-platform capabilities feasible.

Commodity back end: customization support through the extension point mode

The main problem on the commodity back-end side is the high integration cost of the various kinds of custom business logic, so we use the FaaS extension point mode to optimize it: abstract the core logic of commodity information (such as price and inventory) into standard extension points and connect them to the function gateway, so that any business party can write a function from a template to customize its business logic, opening up the closed architecture.

After this transformation of the technical architecture, the overall mode and chain of requirement development changed significantly. In the old mode, customizing the commodity details for one business required the participation of multiple teams, from the business back end to the client, and many links in the chain had to be changed, so the cost was very high.

In the new mode, everything from customizing the core business logic to implementing the user-facing presentation logic can be done by writing simple FaaS functions, and all the back-end business logic changes can even be completed by a single developer. (If front-end and client components can achieve low-code, cross-platform development, full-stack development of business requirements becomes possible!)

Finally, let’s look at the effect of the transformation on the overall business scenario. With the FaaS R&D mode, the waiting time of requirements related to commodity details was reduced by 80%, the release frequency increased by more than 300%, and requirement throughput also improved. Most importantly, R&D staffing was reduced by 50%: the manpower investment for the whole back end went from 2 full-time employees to 0.5 full-time employee plus 1 outsourced employee. Far fewer teams and people are involved in requirement development, and the whole R&D delivery chain has become simple and clear.

Summary and Prospect

Finally, from where we stand today and from the perspective of a business team, here is a brief summary of and outlook on Serverless technology.

Based on our experience of applying it in business scenarios, we draw several key conclusions:

• Serverless is undoubtedly a new technological revolution, and the productivity gains brought by core technologies such as FaaS have been strongly demonstrated in many scenarios.
• Serverless should be applied according to actual business scenarios rather than fixating on the underlying technology.
• No technology is a “silver bullet” for efficiency. Improving R&D efficiency is more about combining business and technology from the perspective of the team’s organizational structure and its people.

As for the future development of Serverless and FaaS, here are some personal views, and rational discussion is welcome:

• FaaS will continue to evolve: although the FaaS capabilities of Serverless are mature, there is still plenty of room for improvement, for example in elasticity and cold start speed. The industry continues to explore new directions such as WASM and eBPF, and there is reason to believe we will keep seeing breakthroughs in this area.
• FaaS will not completely replace traditional microservices: traditional microservices (including those running on Kubernetes) and FaaS will continue to coexist, at least for a long time to come, and business teams should be prepared to combine the two in their business scenarios to make the most of each.
• Full-stack and low-code (or low-threshold) R&D will be a trend: the rise of Serverless brings a whole new way of thinking about R&D efficiency, because full-stack development of business requirements can significantly reduce the cost of communication and collaboration in projects and increase requirement throughput. As Serverless technology matures and spreads, this could change the shape of today’s R&D teams.
