The overall rhythm of this year’s Double 11 changed from the single “Singles’ Day” peak of previous years to a two-wave “nunchaku” rhythm, and the individual businesses went through many changes and adjustments. As the Alibaba saying goes: “The only constant is change”. These changes are a challenge as well as an opportunity. We need to support the business so that it wins first, at the same time guarantee experience and stability so that users get the best possible experience, and also pursue innovation so that the front end keeps evolving. To achieve all three at once, we did a great deal of design and safeguarding work across technical solutions, process mechanisms and personnel organization. In the end, the first double-peak Double 11 concluded successfully, and the Tao Department front end also achieved its goals: a large number of optimizations and innovative solutions were applied to improve business conversion; FaaS, PHA, ESR and other technologies were applied in more scenarios, extending front-end capabilities and boundaries to the server, the client and CDN nodes respectively; and visual restoration and integrated R&D improved R&D efficiency and greatly relieved the resource bottleneck. What follows is an overall introduction to the Tao Department front end’s thinking and lessons from this year’s Double 11, which I hope will be helpful. A series of special articles will follow, so please stay tuned.
Change & Challenge
The first thing I felt about this year’s Double 11 was constant change.
Single peak to double peak: This year Double 11 changed from one wave to two waves, and the three stages of pre-sale, warm-up and formal promotion doubled accordingly. First, the R&D workload increased significantly; completing the requirements efficiently while the schedule and headcount stayed the same and the workload grew was the first challenge. Second, across the state changes of the six stages, accurate switching, stable operation and a smooth experience were the key things to guarantee during the double peak. Finally, over an ultra-long operating period of more than 20 days, production safety, the condition of the staff and rapid response had to be guaranteed by strong organization and mechanisms.
Figure: Double 11 rhythm
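The six-stage switching described above can be illustrated with a small sketch. It is only a minimal illustration under stated assumptions: the type names, the helper functions and every date and duration below are hypothetical and are not taken from the real Double 11 calendar or from any Tao Department system.

```typescript
// All names and durations below are illustrative assumptions, not the real schedule.
type Phase = "pre-sale" | "warm-up" | "formal";

interface PhaseWindow {
  wave: 1 | 2;
  phase: Phase;
  start: number; // inclusive, epoch milliseconds
  end: number;   // exclusive, epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;
const PHASES: Phase[] = ["pre-sale", "warm-up", "formal"];

// Build six back-to-back windows (three phases per wave) from a start time
// and the length of each window in days.
function buildSchedule(start: number, daysPerWindow: number[]): PhaseWindow[] {
  const windows: PhaseWindow[] = [];
  let cursor = start;
  daysPerWindow.forEach((days, i) => {
    const end = cursor + days * DAY_MS;
    windows.push({ wave: i < 3 ? 1 : 2, phase: PHASES[i % 3], start: cursor, end: end });
    cursor = end;
  });
  return windows;
}

// Return the window that contains `now`, or null outside the promotion.
function resolvePhase(schedule: PhaseWindow[], now: number): PhaseWindow | null {
  for (const w of schedule) {
    if (now >= w.start && now < w.end) return w;
  }
  return null;
}

// Hypothetical calendar: 23 days in total, split into six windows.
const demoStart = Date.UTC(2020, 9, 20);
const demoSchedule = buildSchedule(demoStart, [9, 2, 3, 5, 3, 1]);
```

Because every page resolves its state from one shared schedule and one clock value, passing a simulated `now` is enough to check all six switches before the real date arrives.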
Home page overhaul: The first screen of the latest Taobao home page changed radically: the first-screen content was simplified, recommendations were moved up, channels were embedded in the recommendation feed as content, and so on. Without a fixed traffic entry, each business had to actively adjust its operation strategy, product strategy, design and technical solution. At the same time, the recommendation capability of each scenario had to be continuously strengthened. This year the number of slots was expanded to 1,000+, and in theory the slots can be expanded without limit; click-through rates were improved through Smart UI.
Figure: Comparison of Mobile Taobao versions
Business changes: Business innovations and new ways to play emerged one after another. Micro details, flagship stores, price expression, pen pen return, sesame purchase and many other businesses were either brand new or radically upgraded. Even when the business side is only trying something new, the technical side still has to solve problems such as architecture selection, account reconciliation, consistent expression and scheduling.
Doing our job well
The first thing is to do our own job well: guarantee requirement delivery and stability. For requirement delivery, we automated the development of most UI modules through D2C, reduced the cost of interactive R&D by building the EVA interactive system, and improved R&D and operations efficiency through Serverless integrated R&D, so that the front end is no longer a resource bottleneck. Stability was likewise guaranteed by a series of mechanisms and tools. In addition, we added strategies and plans for capital loss prevention and control, an area that may not get much attention in ordinary times.
D2C R&D efficiency improvement
For last year’s Double 11 we set up an R&D efficiency project, the core of which was to improve R&D efficiency through the Design to Code (D2C) platform Imgcook. In the end, 78.94% of the newly added module code in last year’s Double 11 promotion venues was generated automatically, and the code availability rate reached 79.34%.
This year, the intelligence-assisted front-end R&D model was upgraded. Several BUs jointly built the algorithm models and data sets for recognizing front-end design drafts, and the technical system for generating code from design drafts was comprehensively upgraded, for example with stronger intelligent recognition of UI polymorphism, live-streaming and video components, and loops. This year, 90.4% of new modules had their code generated intelligently, and the code availability rate reached 79.26% (compared with last year, the intelligent design draft inspection capability was upgraded, so visual drafts no longer need manual adjustment). Thanks to the R&D efficiency gains from D2C, this year, unlike previous years, no engineers had to be borrowed for venue development. Compared with the traditional module development mode, coding efficiency (the ratio of module complexity to R&D time) increased by 68% with design-to-code generation, and the throughput of module requirements per fixed unit of manpower and time increased by about 1.5 times.
Figure: D2C operation flow
Interactive R&D upgrade
In e-commerce, interaction is an important user growth tool: it plays an important role in raising user engagement and activity and in acquiring new users. This year, the Tao Department interactive team launched “Super Star Show Cat”. No building towers, no driving: this time everyone took part by raising cats, and three cute cats with different styles instantly captured the hearts of countless consumers. The EVA interactive system, as one set of solutions, greatly improved R&D efficiency and supported interconnection among apps such as Taobao, the Tmall app and Alipay. With the help of client capabilities and the EVA interactive system, performance and memory were kept well under control, so that most users could enjoy high-definition and stable interaction, with zero failures and second-open, and the number of Star Show Cat participants reached a new high. The subsequent series of articles will elaborate on how the Tao Department interactive front-end team delivered fast, good and stable Double 11 interaction, covering three aspects: interactive foundations, the EVA R&D system and the global stability scheme.
Figure: Interactive renderings
Integrated development of Node FaaS
The Serverless cloud + terminal integrated R&D model develops page code and service code together, so that front-end engineers can fully support everything from the front-end page to the back-end service, saving the cost of communication and joint debugging in between. Its rollout on the Tmall list and V list improved the overall R&D efficiency of Double 11 Node FaaS related business by 38.89%. The Double 11 requirements of industry shopping guides also relied on the new cloud + terminal model to bring outsourced developers on board quickly, for an overall improvement of about 20%.
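One way to picture why developing page code and service code together saves joint-debugging cost is a contract shared by both sides. The following is a minimal sketch under that assumption; `RankRequest`, `RankItem`, `rankHandler`, `renderRankList` and the sample data are hypothetical names for illustration and do not describe the actual Node FaaS platform API.

```typescript
// Shared contract: one definition used by both the function and the page.
interface RankRequest { category: string; limit: number; }
interface RankItem { id: string; title: string; sales: number; }

// "Cloud" side: a hypothetical function handler backed by in-memory sample data.
const SAMPLE_ITEMS: Record<string, RankItem[]> = {
  phones: [
    { id: "p1", title: "Phone A", sales: 120 },
    { id: "p2", title: "Phone B", sales: 340 },
    { id: "p3", title: "Phone C", sales: 90 },
  ],
};

function rankHandler(req: RankRequest): RankItem[] {
  const items = SAMPLE_ITEMS[req.category] || [];
  // Copy before sorting so the sample data is not mutated.
  return items.slice().sort((a, b) => b.sales - a.sales).slice(0, req.limit);
}

// "Terminal" side: page code that depends only on the shared contract.
function renderRankList(call: (req: RankRequest) => RankItem[], category: string): string {
  const rows = call({ category: category, limit: 2 }).map((item, i) => (i + 1) + ". " + item.title);
  return rows.join("\n");
}

const demoList = renderRankList(rankHandler, "phones");
```

If the handler's response shape changes, the page code stops compiling, so the mismatch surfaces during development rather than during joint debugging. A real function call would be asynchronous; the synchronous call here only keeps the sketch short.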
Stability guarantee
Stability guarantee runs through the whole Double 11 cycle from the start to the end of the project. The following is a brief introduction from several key aspects:
Evaluation of change: Each Double 11 stands on the shoulders of giants, since the existing system has already passed the test of the previous Double 11. The main risk therefore lies in what is new and what has changed, both in technology and in people. The changes need to be fully assessed and verified online during the 99 promotion, and nothing should change after the 99 promotion, so that we go into Double 11 in a stable state.
Pressure test: First, estimate traffic and prepare resources such as machines and bandwidth based on last year’s data and this year’s changes. Then complete single-link pressure tests to make sure that the service and its upstream and downstream run normally under the estimated traffic model. Finally, carry out full-link pressure tests to verify how concurrent businesses behave at the midnight (0:00) peak, especially underlying shared services, and to verify priority guarantees.
Backstop & Contingency plan: A backstop generally refers to how to minimize the damage to user experience and business under heavy traffic or other uncontrollable factors. A contingency plan needs to assess the situations that may be encountered and how to handle each of them.
Acceptance: There are four kinds. Function preview: operate along every path a user may take; this work is still manual. Time travel: set the state of the page and the system to the activity time for verification, which requires connecting the upstream and downstream systems so that they move together. Model acceptance: devices are basically divided into high-end, mid-range and low-end, each accepted separately; many businesses need to degrade functions on low-end devices. Stability acceptance: the performance and stability of individual pages are guaranteed separately, but problems may appear once businesses are stacked together, especially for heavy memory consumers such as venues, interaction, live streaming and flagship stores, which direct traffic to one another and are hard to keep stable after switching between them, so an overall full-link acceptance is required.
Change & Emergency: Past failure data show that most problems are caused by changes, so change control is particularly important. By time, it is divided into a weak control period and a strong control period; by business level, applications are divided into group core applications, BU core applications and non-core applications. A change CR and approval mechanism is established. Emergency response mainly refers to the escalation mechanism for problems, public opinion incidents and failures during core activities: it sets requirements on the time to discover, locate and fix a problem, and arranges decision-making at different levels.
Monitoring: The Tao Department front end keeps building and upgrading its monitoring capability. Availability at peak hours and real-time alerting must be guaranteed, covering all business scenarios. Increasingly complex scenarios need an end-to-end monitoring and data analysis platform, and grayscale release processes lacked measurement and targeted monitoring. In response to these problems and demands, JSTracker provides an overall solution for production safety: an end-to-end front-end monitoring and data analysis platform with real-time monitoring, multi-terminal coverage, data analysis and intelligence. At the same time, the Double 11 front-end data dashboard was built from page status, error logs, origin site data, FaaS logs and so on.
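The change-control tiers mentioned above (weak and strong control periods, three application levels, CR plus approval) can be expressed as a small policy table. The sketch below is only an illustration: the approver ladder, the escalation rule and the `crPassed` field are assumptions made up for this example, not the actual approval policy.

```typescript
type ControlPeriod = "none" | "weak" | "strong";
type AppLevel = "non-core" | "bu-core" | "group-core";
type Approver = "none" | "team-lead" | "bu-owner" | "group-owner";

// Hypothetical escalation ladder, lowest to highest.
const APPROVER_LADDER: Approver[] = ["none", "team-lead", "bu-owner", "group-owner"];
const LEVEL_RANK: Record<AppLevel, number> = { "non-core": 0, "bu-core": 1, "group-core": 2 };

// Assumed rule: the more important the application, the higher the approver;
// the strong control period raises every requirement by one step.
function requiredApprover(period: ControlPeriod, level: AppLevel): Approver {
  if (period === "none") return "none";
  const bump = period === "strong" ? 1 : 0;
  return APPROVER_LADDER[LEVEL_RANK[level] + bump];
}

interface ChangeRequest {
  level: AppLevel;
  crPassed: boolean;    // assumed flag: the change's CR step has passed
  approvedBy: Approver; // highest approval obtained so far
}

// A change may go out only after CR and with a high enough approval.
function canRelease(period: ControlPeriod, change: ChangeRequest): boolean {
  if (!change.crPassed) return false;
  const needed = APPROVER_LADDER.indexOf(requiredApprover(period, change.level));
  return APPROVER_LADDER.indexOf(change.approvedBy) >= needed;
}
```

Keeping the rule in one function means that switching from the weak to the strong control period is a single input change rather than a per-application decision.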
Capital loss prevention and control
Front-end capital loss prevention and control has always been a weak part of the platform, and capital loss and public opinion incidents triggered by the front end are not rare. In the past it relied entirely on the experience and awareness of individual developers, and there was no systematic capability. Last year we organized centralized screening and manual rehearsals at team level, which consumed a lot of manpower and time, made quality hard to guarantee and left little that could be accumulated and reused. Therefore, to prevent capital loss at lower cost and with better effect, from the beginning of S1 2020 we focused on designing and implementing products for capital loss prevention and control. This year we also focused on adding capital loss prevention and control on the business and operations back-office side.
We divide capital loss prevention and control into three stages: the R&D stage, the operation stage and the running stage. In the R&D stage, repositories with capital loss risk are marked, the usual cases such as price, discount and default copy are enumerated, and prevention is carried out through static scanning, UI test case scanning and other methods. The operation stage mainly refers to the stage where merchants and operators configure discounts and benefits; prevention can be carried out through unified expression (to avoid the different readings that “50% discount” and “0.5 discount” can invite), secondary confirmation, boundary value limits and low price warnings. In the running stage, there are snapshot comparison, server data reconciliation and so on; prevention at this stage comes relatively late, and real impact has very likely already occurred.
At present, however, prevention is still the priority, and there is no guarantee that capital loss failures will never occur. Next, we are still thinking about link-level prevention and control measures in the production environment, and building alarms and automatic stop-loss to escort the platform.
Business value
On the basis of doing our job well, we hope to bring incremental value to the business. This chapter introduces the performance optimization of the venues and the resulting improvement in conversion, the conversion improvement brought by the new basic link solution, the customizable strategies that make call-end technology more precise, and Smart UI, which improves clicks by serving different UIs to different groups of people.
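The operation-stage checks described above (unified expression, boundary value limits, low price warning, secondary confirmation) can be sketched as one validation function. Everything here is an illustrative assumption: the choice of "fraction of the original price the buyer pays" as the unified expression, the rule fields and the thresholds are hypothetical and not the actual product's rules.

```typescript
type Verdict = "ok" | "needs-confirmation" | "rejected";

interface DiscountRule {
  minPayRatio: number;    // boundary value: lowest ratio an operator may set
  warnBelowCents: number; // low price warning threshold
}

interface DiscountCheck {
  verdict: Verdict;
  finalPriceCents: number;
}

// Unified expression (assumed): payRatio is always the fraction of the
// original price that the buyer pays, so 0.5 means "pay half".
function checkDiscount(originalPriceCents: number, payRatio: number, rule: DiscountRule): DiscountCheck {
  const outOfRange = !isFinite(payRatio) || payRatio <= 0 || payRatio > 1;
  if (outOfRange || payRatio < rule.minPayRatio) {
    // Rejected inputs leave the price untouched.
    return { verdict: "rejected", finalPriceCents: originalPriceCents };
  }
  const finalPriceCents = Math.round(originalPriceCents * payRatio);
  // A valid but unusually low price asks for a secondary confirmation.
  const verdict: Verdict = finalPriceCents < rule.warnBelowCents ? "needs-confirmation" : "ok";
  return { verdict: verdict, finalPriceCents: finalPriceCents };
}

const demoRule: DiscountRule = { minPayRatio: 0.1, warnBelowCents: 500 };
```

With this rule, an operator who types `50` meaning "50%" or `0.05` meaning "half price" is stopped at input time instead of the mistake being discovered by reconciliation in the running stage.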
Performance improvement
Venues are one of the main characters of every Double 11, and their user experience is the point that gets the most attention each year. With business requirements growing ever more complex, keeping the user experience from degrading, or even improving it, is an eternal proposition. This year, pre-rendering and SSR were each used for optimization. First, the second-open standard was redefined: it was upgraded from the original front-end time to the time from the user’s click to the page becoming visible, which adds client routing and WebView startup time and brings the measurement closer to what users actually feel. The optimization covered dozens of scenarios, including the main venue, industry venues and outsourced venues.
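The difference between the old and the redefined metric can be shown with a few made-up samples. This is a minimal sketch: the field names, the sample values and the 1000 ms threshold are assumptions for illustration, not the team's actual tracking points or standard.

```typescript
interface OpenSample {
  clickAt: number;        // user taps the entry (ms)
  webviewReadyAt: number; // client routing and WebView startup finished (ms)
  visibleAt: number;      // page content becomes visible (ms)
}

const SECOND_OPEN_MS = 1000; // assumed threshold

// Share of samples whose duration is within the threshold.
function secondOpenRate(samples: OpenSample[], duration: (s: OpenSample) => number): number {
  if (samples.length === 0) return 0;
  const hits = samples.filter(s => duration(s) <= SECOND_OPEN_MS).length;
  return hits / samples.length;
}

// Old metric: front-end time only.
const frontEndOnly = (s: OpenSample): number => s.visibleAt - s.webviewReadyAt;
// Redefined metric: from the user's click to the page being visible.
const clickToVisible = (s: OpenSample): number => s.visibleAt - s.clickAt;

// Made-up samples for illustration.
const demoSamples: OpenSample[] = [
  { clickAt: 0, webviewReadyAt: 300, visibleAt: 900 },
  { clickAt: 0, webviewReadyAt: 500, visibleAt: 1400 },
  { clickAt: 0, webviewReadyAt: 200, visibleAt: 1000 },
  { clickAt: 0, webviewReadyAt: 400, visibleAt: 1300 },
];

const oldRate = secondOpenRate(demoSamples, frontEndOnly);
const newRate = secondOpenRate(demoSamples, clickToVisible);
```

On these samples the front-end-only metric reports every page as within the threshold, while the click-to-visible metric reports only half, which is why the stricter definition tracks the user's perception more closely.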
Pre-rendering
Pre-rendering is a technical solution used in this year’s Double 11 venues to improve the experience of opening a venue. Time-consuming operations in the original H5 page rendering process, such as WebView initialization, page resource loading and part of the JS execution, are carried out in advance, so that the page is “rendered” off-screen. When users actually tap to enter the venue, the page “rendered” in advance is reused, which greatly shortens the time to open the venue. The average time for users to open a venue was shortened by 200ms to 700ms, and the second-open rate increased by 10%-14%. The absolute gain is larger on low-end devices, where second-open for venues has now been achieved, so the venue experience is smoother for all users and most noticeably for those on low-end phones. In future articles, we will also talk about the practice and thinking behind performance optimization, including pre-rendering, data snapshots, parallel requests and so on.
Figure: Comparison of pre-rendering effect on mid-range and low-end devices
SSR
This year, without changing the existing architecture or business, Server-Side Rendering (SSR) was used in the venues, which raised the second-open rate to a new high (82.6%). While the user experience was optimized, business metrics such as click-through rate also increased significantly, bringing good business value. The following series of articles will introduce in detail the front end’s practice and methodology for engineering and for evaluating business effect, as well as the thinking and solutions for executing, isolating and optimizing front-end module code on the server side.
Figure: Comparison diagram of SSR effect for low-end models
Basic link
The basic link is the core link of e-commerce, including the home page, commodity details, micro details, transactions (placing an order, orders, shopping cart, payment success), information flow, My Taobao and other basic businesses. In the existing technical solution, Mobile Taobao uses the Native version, pursuing the ultimate experience and stability, while off-site traffic and Alibaba-family apps, including Alipay, use the H5 version, pursuing flexibility and availability. With the improvement of Alipay’s containerization system and its adoption in other apps, a new containerized version of the basic link now has fertile ground; at the same time, some disadvantages of H5, such as resources residing remotely and limited use of Native capabilities, can also be optimized.
With the help of the earlier “New Altron” and “DinamicX” solutions (which mainly solve business customization and consistency across Android, iOS and H5, so that one development takes effect on three terminals), the containerized version could be extended rapidly to achieve four-terminal consistency. In terms of performance data, loading time is about 2s better than the H5 version, basically reaching the goal of second-open. In terms of business data, the UV conversion rate of the containerized version is 70+% higher than that of the H5 version.
At present, it covers Alipay, the special edition, Youku, AutoNavi, Taoxiaopu, Yitao and other apps, and is integrated into many external media apps through the Baichuan SDK. On the business side, it has been adopted by businesses such as daily must grab, brand direct drop, Taobao special offer, Taobao live broadcast, Baichuan media, Youku, small shop, light shop and Huabei.
Call-end technology
With traffic growth peaking and the e-commerce war escalating further, user growth is a proposition that every major company must address. User growth covers a very wide range of aspects; this year the Tao Department front end focused on call-end technology, that is, the technical system that brings external traffic into the Mobile Taobao App. The threshold of call-end technology is very low: simply constructing a scheme-style URL is enough to trigger a call. Yet it is also very complex: different channels, operating systems and apps may restrict call protocols, and there are all kinds of compatibility problems; different businesses in the call-end link may have their own custom requirements, such as transparent pass-through of parameters. The efficiency of the call-end link is the core concern, and since different scenarios and businesses may differ in efficiency, call-end results need to be monitored and compared. To address these problems, we upgraded call-end technology again, building customizable call-end strategies and a detailed call-end A/B test link. Judging from this Double 11, call-end efficiency (call success rate) in different scenarios increased by 25% to 40%.
Figure: Call-end strategy diagram
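A customizable call-end strategy with parameter pass-through might look roughly like the sketch below. It is not the team's implementation: the `exampleapp://` scheme, the `link.example.com` host, the channel names and the rule format are all hypothetical placeholders chosen for illustration.

```typescript
type OS = "ios" | "android";
type LaunchMethod = "scheme" | "https-link";

interface CallContext { os: OS; channel: string; }
// A rule with a missing field matches any value of that field.
interface StrategyRule { os?: OS; channel?: string; method: LaunchMethod; }

// First matching rule wins; otherwise use the fallback method.
function pickMethod(rules: StrategyRule[], ctx: CallContext, fallback: LaunchMethod): LaunchMethod {
  for (const rule of rules) {
    const osOk = rule.os === undefined || rule.os === ctx.os;
    const channelOk = rule.channel === undefined || rule.channel === ctx.channel;
    if (osOk && channelOk) return rule.method;
  }
  return fallback;
}

// Build the launch URL and pass business parameters through unchanged.
function buildLaunchUrl(method: LaunchMethod, targetUrl: string, passthrough: Record<string, string>): string {
  // Hypothetical entry points; a real app defines its own.
  const base = method === "scheme" ? "exampleapp://open" : "https://link.example.com/open";
  const query = ["url=" + encodeURIComponent(targetUrl)];
  Object.keys(passthrough).sort().forEach(key => {
    query.push(encodeURIComponent(key) + "=" + encodeURIComponent(passthrough[key]));
  });
  return base + "?" + query.join("&");
}

// Hypothetical strategy: one channel on iOS needs an https link.
const demoRules: StrategyRule[] = [
  { os: "ios", channel: "channel-a", method: "https-link" },
  { channel: "channel-b", method: "scheme" },
];
```

Passing an A/B bucket as one of the pass-through parameters is one simple way to compare call success rates between strategies, in the spirit of the A/B test link described above.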
Smart UI
With the development of the mobile Internet and recommendation systems, the accurate matching of people and goods has greatly improved business efficiency. More and more fine-grained methods are gradually being applied to personalized recommendation, such as scenario-based recommendation and crowd selection. At the same time, product information is richer than ever (buyer shows, brand endorsement, worry-free purchase services and so on), and different users have different demands on how content is expressed in the UI, so finding the appropriate UI expression for each group of people will certainly improve business results.
At the very early stage of the project, we ran direct quantitative tests through A/B experiments, which made it clear that the same UI scheme produces different results in different scenarios, and that different UI schemes also produce different results in the same scenario. In other words, it makes sense to use different schemes for different scenarios. Double 11 in 2020 drove our first large-scale use of the productized Smart UI solution, which was implemented in a number of front-end modules, including the guess-you-like module, the commodity module and the shop module, covering the pre-sale and official opening stages of Double 11. It stood the test of peak traffic and brought stable growth: it covered more than 300 venues, and in the best-performing venue the PV click-through rate increased by 10%.
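The observation that the best UI scheme depends on the scenario can be illustrated with a very small selection rule. This is a deliberately naive sketch that picks the variant with the highest observed click-through rate per scene; the variant names, the statistics and the minimum-impression rule are made up, and the productized Smart UI solution is certainly more sophisticated than this.

```typescript
interface VariantStats {
  scene: string;
  variant: string;
  impressions: number;
  clicks: number;
}

// Pick the variant with the highest observed CTR for a scene, ignoring
// variants with too little data; fall back to the default otherwise.
function pickVariant(stats: VariantStats[], scene: string, defaultVariant: string, minImpressions: number): string {
  let best = defaultVariant;
  let bestCtr = -1;
  for (const s of stats) {
    if (s.scene !== scene || s.impressions < minImpressions) continue;
    const ctr = s.clicks / s.impressions;
    if (ctr > bestCtr) {
      best = s.variant;
      bestCtr = ctr;
    }
  }
  return best;
}

// Made-up A/B results: the winner differs between the two scenes.
const demoStats: VariantStats[] = [
  { scene: "pre-sale", variant: "card-large", impressions: 1000, clicks: 50 },
  { scene: "pre-sale", variant: "card-compact", impressions: 1000, clicks: 80 },
  { scene: "formal", variant: "card-large", impressions: 1000, clicks: 90 },
  { scene: "formal", variant: "card-compact", impressions: 1000, clicks: 60 },
  { scene: "formal", variant: "card-video", impressions: 20, clicks: 10 },
];
```

The minimum-impression guard keeps a variant with a handful of lucky clicks (`card-video` in the sample data) from being chosen before there is enough evidence.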
Technology upgrades
With the evolution of technology and business development in the industry, we have also made new attempts and iterative upgrades in technology compared with last year, including the in-depth use of FaaS, the progressive enhancement of PHA experience, the application of edge node rendering, etc.
FaaS
Serverless, once a block of hard ice in deep water, is gradually rising from the deep sea to the surface. Since last year’s promotion practice, the Tao Department has gradually applied Serverless to all aspects of the front-end field. In this year’s Double 11, the first change was scenario coverage: FaaS expanded from Taobao industry business to venue and marketing business, which greatly increased the complexity of the business it carries. Capacity also improved further, and the business volume supported rose from 2K QPS to 50K QPS. With the CPU utilization level raised from last year’s, cost at high QPS scale was reduced by about 50%. In terms of operations and maintenance, capabilities such as unit guarantee, promotion control, an expert system and function inventory were built, improving operation and maintenance efficiency by about 50%. In terms of R&D experience, a solution system was built that lowers the R&D threshold and supports the rapid onboarding of outsourced developers.
PHA
PHA stands for Progressive Hybrid App. It is an application framework for improving the Hybrid experience, which takes a progressive approach to improving the page loading speed and interactive experience of Web applications. Applications developed with PHA remain front-end development based on W3C standards in nature, yet have the features and experience of native applications. You might think of PWA, but PHA has more UI capabilities and loads faster than PWA. At present it has been implemented in many clients such as Mobile Taobao, the special edition, Lazada and CBU, and has supported many promotions such as 618 and Double 11. Working across stacks with the client, front-end and data analysis teams, PHA did a lot of performance optimization work: sorting out full-link performance tracking points, defining a new performance metric (from user click to visible), and using optimizations such as preloading, pre-rendering, accelerated resource download and offline resources.
ESR
Currently, rendering mainly happens on the terminal or on the server side, corresponding to Client Side Rendering (CSR) and Server Side Rendering (SSR), each with its own applicable scenarios, advantages and disadvantages. Alibaba Cloud can now move rendering to CDN nodes, which is the ESR (Edge Side Rendering) introduced here. It provides rendering capability for the front end and makes use of the computing resources on a large number of CDN machines.
Alibaba Cloud launched EdgeRoutine, a lightweight programming environment on the CDN, which gives us a new direction to try: we can render ahead of time at the CDN node. The access strategy of a CDN is to find the node nearest to the user, much like the last kilometer of express delivery, where parcels are always sent to the distribution point nearest to the customer. Seen this way, there is a lot of room to optimize page network scheduling time. In addition, we can use the resource sharing of CDN nodes to cache some data on them and reduce remote data requests.
This scheme is well suited to pages with a low data refresh rate and high traffic. Taking the main page as an example, the first-screen time can be improved by about 50%. At present ER technology is still in its infancy: application scenarios are limited, capabilities have many gaps and the system is still being built. However, this new technology opens up more possibilities for the front end, and we need to keep exploring and improving it.
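The idea of rendering at a CDN node and caching the result there for pages with a low refresh rate can be simulated in a few lines. The sketch below does not use the EdgeRoutine API; `EdgeRenderer`, its constructor parameters and the synchronous origin call are hypothetical simplifications meant only to show why edge caching reduces remote data requests.

```typescript
interface CacheEntry { html: string; expiresAt: number; }

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Simulates one edge node: render near the user, reuse the result until it expires.
class EdgeRenderer {
  originRequests = 0; // how many times the remote origin was asked for data
  private cache: Record<string, CacheEntry> = {};

  constructor(private ttlMs: number, private fetchFromOrigin: (pageKey: string) => string[]) {}

  render(pageKey: string, now: number): string {
    const cached = Object.prototype.hasOwnProperty.call(this.cache, pageKey)
      ? this.cache[pageKey]
      : undefined;
    if (cached !== undefined && cached.expiresAt > now) return cached.html;

    // Cache miss or expired entry: fetch data remotely and render at the edge.
    this.originRequests += 1;
    const items = this.fetchFromOrigin(pageKey);
    const html = "<ul>" + items.map(item => "<li>" + escapeHtml(item) + "</li>").join("") + "</ul>";
    this.cache[pageKey] = { html: html, expiresAt: now + this.ttlMs };
    return html;
  }
}
```

With a one-second TTL, any number of requests for the same page within that second cost a single origin request, which is why the approach fits high-traffic pages whose data changes slowly.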
First experience as a Double 11 PM
As the most important event of the e-commerce year, Double 11 receives the most investment and resources from every side. As a veteran of eight Double 11s, this was my first time as the front-end PM, and I came away with many different impressions.
Complex: First, the business is complex. There are Double 11 specific pieces such as the main venue, the main interaction and the Tmall gala; the Tao Department itself has dozens of businesses such as shopping guides, live streaming, industries and marketing; and all of this is linked with Alipay, Youku, local life, Alimama, Cainiao and other BUs, and coordinated with merchants, ISVs, logistics, media and others. The technology is also complex. A front-end page involves every link of development, build, origin site and CDN, as well as Node FaaS containers, middleware, capacity preparation, traffic allocation and data center deployment. My understanding of the whole system still needs to deepen.
Process: As the annual big exam of the e-commerce business, Double 11 has developed a mature process mechanism, with very detailed arrangements covering personnel, communication, schedule, organizational guarantees and other aspects.
Coordination: Double 11 is a very good milestone: it links all teams, all roles and all BUs together, which further improves such a huge system. Many technology upgrades and breakthroughs are launched and promoted through Double 11. The pre-rendering scheme, for instance, was implemented and verified in a very short time through close collaboration between the client and the front end.
Multi-dimensional: Problems can be viewed from more dimensions, such as different technical roles, the whole-link perspective and the business perspective. For example, when approving a change in the past, I paid more attention to the code implementation of the change. In fact, its impact on upstream and downstream, on stability and on the business, whether it introduces new risks, its scope of influence and so on all need to be weighed together. Making a judgment often requires trade-offs, and is not just a matter of technical 1s and 0s.