Recruiting

We urgently need engineers with browser rendering engine or Flutter rendering engine experience. You are welcome to join us!

Background

The author spoke at the 2020 Shanghai Bund Conference. This article is an edited transcript of that talk.

Contents

Good morning, everyone, and welcome to this session. My topic today is “Web Technology Development Trends and the Evolution of U4 Kernel Technology”.

![](https://pic2.zhimg.com/80/v2-9ad681c444ecd82b35d3e4237976c105_720w.jpg)

Today’s talk has three parts:

First, we will look at some important recent developments in Chromium.

Second, I will introduce the key technical optimizations in our U4 kernel.

Finally, we will look ahead to the future of the browser kernel.

Important technical progress in Chromium

The current state of the Web

Let’s look at these three numbers first.

I wonder how you feel when you see these three numbers.

What strikes me most is how vigorous Web technology is. New technologies keep emerging, yet the Web has been developing for more than 30 years. Besides supporting page construction in the traditional Internet field, Web technology is also very widely used in the mobile Internet field: in mini programs, information feeds, conferences, and other business scenarios, Web technology is everywhere.

Why is the Web so vigorous? I think a key reason is that it is highly standardized. Because it is highly standardized, it has strong backward compatibility and cross-platform compatibility, which makes it widely used and durable. The downside of heavy standardization is that new features are slow to land: it can take many years for a feature to be proposed, standardized, and finally implemented in browsers. For developers, it is therefore more practical to follow the progress of the browser kernel.

So today we’re going to look at the trends in Web technology from the perspective of the browser kernel.

Global browser kernel distribution

Here is StatCounter’s mobile browser market share for the past year. As you can see, Chrome is far ahead, with over 60% market share. And that is counting by browser only. Counting by kernel, Opera, UC, QQ, and other browsers are also based on the Chromium kernel, and the combined market share of Chromium-based browsers exceeds 70%. It is fair to say that Chromium leads the direction of Web technology.

Let’s take a look at some of the key developments in Chromium.

Chromium architecture optimization

Let’s start with Chromium’s three big architectural optimizations:

Onion Soup: Its main purpose is to decouple the component dependencies of the Chromium kernel so that capabilities can be provided as services, which simplifies the complex code structure and improves scalability and performance. The project started in 2015, has been running for more than five years, and is still ongoing.

Slimming Paint: A major overhaul of the rendering pipeline whose main purpose is to improve rendering correctness and performance. It has been in the making for more than five years. An experimental version is currently available, but the official version has not been released yet.

LayoutNG: A new layout system designed to shed the historical burden of the old layout system, make new features easier to add, and support fragmented, interruptible layout. LayoutNG has been in development for four years and had its first release last year in M76. However, according to our lab test data, it does not yet have an obvious performance advantage over the old layout system. Chrome is continuously optimizing this area, and I believe LayoutNG will gradually show its technical advantages as versions iterate.

Having looked at the big technical architecture optimizations for Chromium, let’s take a look at some of the important features of Chromium.

PWA

The first is PWA, a very important concept that Chromium has promoted in recent years. As its name (Progressive Web Application) suggests, its main goal is to let developers use Web technology to progressively build Web applications with a user experience comparable to native applications. PWA is a package of technologies that includes several important features: Service Workers provide offline access, A2HS (Add to Home Screen) lets users add an icon to the home screen, Web Push delivers push messages, and the Notification API triggers system notifications. A Web page can thus be accessed offline and can have an icon on the home screen. When users tap the icon to open the application, they can hardly tell whether it is implemented with Web or native technology. This is the effect that PWA aims for. Beyond these four, PWA includes several other features that I won’t cover here.
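
To make the offline-access idea concrete, here is a minimal sketch of the cache-first strategy that a Service Worker commonly applies. It is a synchronous simulation under stated assumptions: `cacheFirst`, `Network`, and the `Map` standing in for the cache are hypothetical names for illustration. In a real Service Worker the same logic would sit inside a `fetch` event handler and use the Promise-based `caches.match()` and `fetch()`.

```typescript
// Hypothetical stand-in for the network: returns the body, or null when offline.
type Network = (url: string) => string | null;

// Cache-first: serve from the cache when possible, otherwise go to the
// network and remember the response for later offline use.
function cacheFirst(
  url: string,
  cache: Map<string, string>,
  network: Network
): string | null {
  const cached = cache.get(url);
  if (cached !== undefined) {
    return cached; // works even when the device is offline
  }
  const fresh = network(url);
  if (fresh !== null) {
    cache.set(url, fresh);
  }
  return fresh;
}

// Usage: the first visit fills the cache, later visits no longer need the network.
const offlineCache = new Map<string, string>();
const alwaysOnline: Network = (url) => "<html>" + url + "</html>";
const firstVisit = cacheFirst("/home", offlineCache, alwaysOnline);
const secondVisit = cacheFirst("/home", offlineCache, () => null);
```

In this sketch the second call succeeds even though the network stub reports offline, which is the behavior that lets a PWA open without connectivity.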

Device API

Let’s look at another feature whose purpose is close to that of PWA: the Device API. The Device API is also meant to enhance the expressiveness of the Web, but it does so by opening up more device capabilities to developers. For example, M67 added the Sensor API, M81 added Web NFC, and new device APIs such as WebGPU are under development. As the Device API family keeps growing, the Web will gain more control over hardware.

WebAssembly

Let’s take a look at another feature that has gotten a lot of attention recently: WebAssembly, or WASM. WASM is a new browser language standard jointly proposed by Firefox, Chrome, Microsoft Edge, and Safari. It is distributed in a binary format and runs directly on the JS engine. Because it is compiled ahead of time (AOT), it can be executed without going through the complex JIT compilation pipeline, which gives higher performance. Another feature of WASM is portability, which is also the main reason it exists. We can use compilation tools to compile native code directly into WASM and run it in the browser, saving development and migration costs. For example, bilibili recently shared a case in which they compiled the FFmpeg audio and video decoding library, written in C, directly into WASM and ran it in the browser. With the rise of new business scenarios such as live streaming, AI, AR, and VR, WASM is gaining more and more attention, because its high performance and portability suit these scenarios well. However, WASM still has some problems, such as the lack of multithreading support and limited debugging support, and Chromium is continuously improving it.
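
As a small illustration of the binary format, the sketch below loads a hand-assembled WASM module that exports a single `add` function and calls it from script. The byte array is written by hand for this example (a real project would produce it with a compiler toolchain), and the cast on `globalThis` is only an assumption made so the snippet does not depend on a particular type library.

```typescript
// A hand-assembled module: (func (export "add") (param i32 i32) (result i32))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add
]);

// WebAssembly is a standard global in browsers and in Node.js.
const wasmApi = (globalThis as any).WebAssembly;

// Synchronous compilation and instantiation are fine for a module this small.
const wasmModule = new wasmApi.Module(wasmBytes);
const wasmInstance = new wasmApi.Instance(wasmModule, {});
const add = wasmInstance.exports.add as (a: number, b: number) => number;

const sum = add(2, 3); // 5
```

The arithmetic uses 32-bit integers, so results wrap around at the i32 boundary rather than growing like JS numbers.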

Houdini

Let’s move on to a less familiar project called Houdini. Houdini is not as well known as WebAssembly, but it is a very important project for developers. To put it simply, Houdini provides developers with a set of APIs and CSS features that control the layout and rendering pipeline, making it easier to build impressive pages. It includes a series of technical standards such as Worklets, the CSS Parser API, and the CSS Layout API, most of which Chrome already supports. For example, with the CSS Paint API, a developer can write JS code that draws an image, add it to a Worklet, and then reference the Worklet in the CSS background-image property so that the element’s background is drawn by JS. As this example shows, Houdini’s goal is to open up more browser kernel capabilities to developers, so that they can control the layout and rendering pipeline more easily, implement complex effects more efficiently, and enjoy a friendlier Web development experience overall.
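
The CSS Paint API example can be sketched as follows. The painter class below draws a checkerboard background. `PaintContext` and `PaintSize` are minimal stand-ins defined here for illustration (the browser passes a richer 2D context and geometry object), and `CheckerboardPainter` and the file name in the comments are hypothetical.

```typescript
// Minimal stand-ins for the objects the browser hands to a paint worklet.
interface PaintContext {
  fillStyle: string;
  fillRect(x: number, y: number, width: number, height: number): void;
}
interface PaintSize {
  width: number;
  height: number;
}

class CheckerboardPainter {
  // Called by the browser whenever the element's background must be drawn.
  paint(ctx: PaintContext, size: PaintSize): void {
    const cell = 20; // cell edge length in CSS pixels
    for (let row = 0; row * cell < size.height; row++) {
      for (let col = 0; col * cell < size.width; col++) {
        if ((row + col) % 2 === 0) {
          ctx.fillStyle = "#dddddd";
          ctx.fillRect(col * cell, row * cell, cell, cell);
        }
      }
    }
  }
}

// In the worklet file (browser only):
//   registerPaint("checkerboard", CheckerboardPainter);
// On the page:
//   CSS.paintWorklet.addModule("checkerboard.js");
// In the stylesheet:
//   .card { background-image: paint(checkerboard); }
```

Keeping the drawing logic in a plain class makes it possible to exercise it with a fake context outside the browser, while the three commented lines show how it would be wired up in a page.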

Performance API

Beyond functionality and performance optimizations, let’s look at another important feature: the Performance API.

The Performance API is a set of APIs that help developers accurately measure the performance of pages in production, such as load time, event response time, and memory usage, so that they can optimize the online page experience.
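
As an example of the kind of measurement this enables, the sketch below derives a few load metrics from a navigation timing record. The field names match those of the browser’s `PerformanceNavigationTiming` entry, while `NavigationTimingLike` and `summarizeNavigation` are hypothetical names introduced here so the calculation can be shown without a browser.

```typescript
// Subset of the fields found on a PerformanceNavigationTiming entry
// (all values are milliseconds relative to the start of navigation).
interface NavigationTimingLike {
  startTime: number;
  responseStart: number;
  domContentLoadedEventEnd: number;
  loadEventEnd: number;
}

interface LoadSummary {
  timeToFirstByte: number;
  domContentLoaded: number;
  pageLoad: number;
}

function summarizeNavigation(entry: NavigationTimingLike): LoadSummary {
  return {
    timeToFirstByte: entry.responseStart - entry.startTime,
    domContentLoaded: entry.domContentLoadedEventEnd - entry.startTime,
    pageLoad: entry.loadEventEnd - entry.startTime,
  };
}

// In a browser the entry would come from:
//   const [entry] = performance.getEntriesByType("navigation");
// Here a made-up sample record is used instead.
const sampleSummary = summarizeNavigation({
  startTime: 0,
  responseStart: 120,
  domContentLoadedEventEnd: 800,
  loadEventEnd: 1500,
});
```

A monitoring script would typically report such a summary to a backend after the `load` event has fired.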

Trends in Web technology

Earlier we looked at some of Chromium’s recent core technical evolutions. From them we can see two major trends in Web technology. First, the expressive power of the Web is getting ever closer to that of native applications. Both PWA and the Device API move in this direction.

Second, from the developer’s perspective, the whole Web development experience keeps getting better. Houdini, the Performance API, and improvements to developer tooling all make it easier and more efficient to build high-quality Web applications.

Given these two major trends, what technical optimizations has our U4 kernel made? Let’s move on to part two.

Introduction to U4 kernel technology optimization

Alibaba Web Platform

First, let’s look at the position of the U4 kernel within Alibaba Group. The U4 kernel used to be known mainly through UC Browser. After several years of development, it has gradually evolved from a browser kernel into a Web Platform that serves most Alibaba apps, such as Alipay, Taobao, Tmall, UC, DingTalk, and Fliggy (Feizhu). The U4 kernel is a very important part of the Web ecosystem in Alibaba Group.

What challenges did we face as we evolved from a traditional browser kernel to a Web Platform?

Challenges for the Alibaba Web Platform

In this process, our two biggest challenges were stability and security.

Stability is mainly reflected in two aspects:

  1. As our application scenarios and device models cover a wider range, we encounter more device and API compatibility problems, which affect the stability of the kernel.
  2. Alibaba’s business scenarios are very complex, which places higher demands on memory and performance.

The security challenges mainly come from the nature of the business. Most of Alibaba’s business involves money, such as e-commerce and payments. When money is involved, the security requirements are very high, which in turn places higher demands on our U4 kernel.

So how do we address these two challenges?

U4 kernel multi-process architecture

One of the most important things we did here was to overhaul the process architecture, moving from a single-process model to a multi-process model.

Why does a multi-process architecture help with these two challenges? There are three main reasons:

  1. A multi-process architecture confines crashes in some sub-modules to a child process, so that they do not bring down the main process and with it the entire app. In addition, a child process can be recovered automatically after a crash, and users can keep browsing the page without noticing.
  2. Moving some modules into child processes effectively relieves the main process of virtual memory pressure and thus reduces OOM crashes. A 32-bit application on Android has only 4 GB of virtual address space per process, so in complex scenarios it can easily run out of virtual memory and crash with OOM.
  3. We built a sandboxed Render process, which is a restricted process, and the JS engine runs inside it. Isolating the Render process in a sandbox with limited permissions effectively prevents the main process from being attacked, or core user data from being stolen, even if the JS engine has a security hole.
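
The crash isolation and automatic recovery described in the first point can be sketched as a small supervisor model. This is only an illustrative simulation: `ProcessSupervisor`, `ChildRecord`, and the method names are hypothetical and do not reflect how the U4 kernel or Chromium actually manages processes.

```typescript
type ChildState = "running" | "crashed";

interface ChildRecord {
  name: string;
  state: ChildState;
  restarts: number;
}

// Models a main process that supervises child processes (GPU, Render, ...).
class ProcessSupervisor {
  mainProcessAlive = true;
  private children = new Map<string, ChildRecord>();

  spawn(name: string): void {
    this.children.set(name, { name, state: "running", restarts: 0 });
  }

  // A child crash is contained: the main process stays alive and
  // immediately relaunches the child.
  onChildCrashed(name: string): void {
    const child = this.children.get(name);
    if (child === undefined) {
      return;
    }
    child.state = "crashed";
    this.restart(child);
  }

  info(name: string): ChildRecord | undefined {
    return this.children.get(name);
  }

  private restart(child: ChildRecord): void {
    child.restarts += 1;
    child.state = "running";
  }
}

// Usage: simulate a GPU process crash.
const supervisor = new ProcessSupervisor();
supervisor.spawn("gpu");
supervisor.spawn("render");
supervisor.onChildCrashed("gpu");
```

After the simulated crash, the "gpu" entry is running again with one recorded restart, and `mainProcessAlive` is still true, which mirrors the behavior the list describes.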

Let’s take a look at the results of the multi-process transformation.

First, stability improved enormously: main-process crashes dropped by more than 90% and OOM crashes by more than 97%.

In addition, a crashed child process recovers quickly and automatically, so users can resume browsing the page almost without noticing.

Let’s take a look at this video. On the left is Alipay’s Ant Forest running on a cloud-hosted real device, and on the right is a command-line terminal. We kill the GPU process with a shell command to simulate a GPU process crash. In the picture on the left, nothing unusual is visible. After the GPU process is killed, it recovers automatically, and the user continues to browse the H5 page without noticing.

Build a better Web platform

Are we a good Web Platform once we have solved the two challenges of stability and security? The answer is definitely no.

To build a better Web Platform, we also made further feature optimizations. Let’s start with a feature that is used quite widely: hybrid rendering.

Hybrid rendering

Hybrid rendering is primarily intended to address a business pain point like this:

Some businesses were originally implemented natively and later want to move to the Web, but some of the third-party components they rely on exist only as native versions. Redeveloping those components with Web technology can be costly, and the resulting quality is uncertain. The most typical example is the map control.

To address this pain point, we implemented a hybrid rendering scheme. It lets native components be embedded in a page and blends the page and native components seamlessly, so that the overall appearance and interaction feel natural and the business needs no additional adaptation.

Here are two videos. The first is the OFO bike mini program on Alipay. Its map is implemented natively, while the small controls and text overlaid on the map are implemented in H5. As we can see, the interaction is very natural, and users cannot tell the native components from the H5 components at all. The second video is a demo of our AR camera. The camera image is rendered natively and the card effects on top are rendered in H5, which combines the high-performance native camera with the highly dynamic H5 cards and brings together the advantages of Web and native technology.

Hybrid rendering is widely used in mini programs and many other business scenarios, and it has become one of the most important features of the U4 kernel.

Let’s take a look at another feature: game mode.

Game mode

Game mode is designed to address the pain point of poor performance in Web games. Our survey found that most game engines currently use Canvas to render the game picture, so improving Canvas rendering efficiency can greatly improve the performance of the whole game. We therefore developed a game mode that outputs Canvas content directly to an independent SurfaceView through a simplified rendering pipeline. The Surface content is then composited with the WebView content to form the final picture shown on screen. This dramatically improves game performance. Besides high performance and power savings, the adaptation cost of game mode is very low: business developers only need to set a flag through a JS API to turn it on, with no changes to the game engine.

The lab test data shows that both the frame rate and the power consumption of game mode are at a very good level, so performance and power consumption are both taken care of.

Direct rasterization & direct compositing

Game mode only improves rendering performance for specific scenarios. Is there a way to optimize rendering performance for general scenarios?

Before answering that question, let’s take a quick look at Chrome’s rendering architecture. This diagram is a very simplified abstraction of the rendering pipeline. A web page goes through three stages on its way from text to the pixels the user sees on screen:

  1. Blink parses, lays out, and paints the page content, generates a series of paint commands, and outputs them to the child compositor, the Layer Compositor.
  2. The Layer Compositor then tiles the content, rasterizes it, and uploads textures, producing a Compositor Frame that is output to its parent, the Display Compositor.
  3. Finally, the Display Compositor translates the Compositor Frame into GL drawing commands and draws the content to the Surface of the corresponding Window, where it is displayed on screen.

Because Chrome also needs to render its own UI in addition to web content, and supports features such as OffscreenCanvas, it has several child compositors running in different processes, such as the UI Compositor shown here.
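
The three stages above can be sketched as a toy data flow. This is a heavily simplified model under stated assumptions: the types and function names (`paintVisible`, `tileAndRasterize`, `toDrawCalls`) are hypothetical, rasterization is reduced to assigning paint commands to tiles, and the draw calls are plain strings rather than real GL commands.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }
interface Box extends Rect { color: string; visible: boolean; }
interface PaintCommand extends Rect { color: string; }
interface Tile extends Rect { commands: PaintCommand[]; }

// Stage 1 (stand-in for Blink): turn laid-out boxes into paint commands.
function paintVisible(boxes: Box[]): PaintCommand[] {
  return boxes
    .filter((b) => b.visible)
    .map((b) => ({ x: b.x, y: b.y, width: b.width, height: b.height, color: b.color }));
}

function intersects(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

// Stage 2 (stand-in for the Layer Compositor): split the page into tiles
// and record which paint commands each tile has to rasterize.
function tileAndRasterize(commands: PaintCommand[], pageWidth: number,
                          pageHeight: number, tileSize: number): Tile[] {
  const tiles: Tile[] = [];
  for (let y = 0; y < pageHeight; y += tileSize) {
    for (let x = 0; x < pageWidth; x += tileSize) {
      const bounds: Rect = { x, y, width: tileSize, height: tileSize };
      tiles.push({
        x, y, width: tileSize, height: tileSize,
        commands: commands.filter((c) => intersects(c, bounds)),
      });
    }
  }
  return tiles;
}

// Stage 3 (stand-in for the Display Compositor): one draw call per non-empty tile.
function toDrawCalls(tiles: Tile[]): string[] {
  return tiles
    .filter((t) => t.commands.length > 0)
    .map((t) => "drawQuad(" + t.x + "," + t.y + ")");
}

// Usage: one visible box that straddles all four tiles of a 512x512 page.
const frameDrawCalls = toDrawCalls(tileAndRasterize(
  paintVisible([{ x: 200, y: 200, width: 100, height: 100, color: "red", visible: true }]),
  512, 512, 256));
```

Even in this toy form, the sketch shows where the tile bookkeeping lives, which is the part of the pipeline that direct rasterization later removes.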

To improve the rendering performance of web pages and reduce memory consumption during rendering, we substantially reworked the Chromium rendering pipeline and implemented direct rasterization and direct compositing. Two main things were done:

  1. We removed the child compositors other than the Layer Compositor, simplifying the compositor architecture.
  2. We removed the tiling logic and rasterize web content directly onto a single surface, which saves the memory consumed by the tile cache and is more efficient.

The reworked rendering architecture is shown in this picture. We removed the unnecessary child compositors and the tiling logic, and then moved the Display Compositor into the Render process to achieve direct rasterization and direct compositing. This brings many benefits.

First, GPU memory consumption is reduced by more than 15%.

Second, the performance of MotionMark animation and first-screen rendering has improved significantly.

However, direct rasterization also brings some problems. Without a tile cache, the frame rate of inertial (fling) scrolling drops slightly on some low-end devices or in especially complex scenes. To solve this, we implemented a hybrid rasterization scheme: the kernel automatically switches between asynchronous rasterization and direct rasterization according to the complexity of the scene, striking a better balance between frame rate and memory.
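
A switching policy of this kind might look like the sketch below. The inputs (`layerCount`, `isScrolling`, `lowEndDevice`), the threshold value, and the function name are all assumptions made for illustration. The talk does not say which signals the U4 kernel actually uses to judge scene complexity.

```typescript
type RasterMode = "direct" | "async-tiled";

// Hypothetical description of the current scene.
interface SceneInfo {
  layerCount: number;    // rough proxy for page complexity
  isScrolling: boolean;  // inertial scrolling benefits most from a tile cache
  lowEndDevice: boolean;
}

// Prefer direct rasterization (less memory); fall back to asynchronous,
// tile-cached rasterization when scrolling a complex page or on a weak device.
function chooseRasterMode(scene: SceneInfo, complexityThreshold: number = 50): RasterMode {
  const isComplex = scene.layerCount > complexityThreshold;
  if (scene.isScrolling && (isComplex || scene.lowEndDevice)) {
    return "async-tiled";
  }
  return "direct";
}

// Usage: a simple page at rest stays on the memory-saving direct path.
const idleMode = chooseRasterMode({ layerCount: 12, isScrolling: false, lowEndDevice: false });
```

The design choice illustrated here is that the cheaper mode is the default and the tile cache is only paid for when scrolling smoothness is at risk.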

V8 JS engine optimization

In addition to rendering engine optimizations, we also made a number of optimizations for the V8 JS engine. Here are three important ones:

  1. We implemented our own V8 Code Cache, which increased code cache coverage and improved JS performance by 22% compared with the official version.
  2. We used LLVM to rework the V8 code generation back end, which improved JS performance by more than 20%.
  3. To make it easier for business teams in the Group to integrate the JS engine, we standardized the JS engine interface and packaged it as the JSI SDK, which greatly improved integration efficiency.
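
The idea behind a code cache, mentioned in the first item, is to avoid recompiling a script whose source has not changed. The sketch below illustrates that idea only. `CodeCache`, `hashSource`, and the injected `compile` callback are hypothetical, and the real V8 and U4 code caches store serialized compilation output rather than in-memory objects.

```typescript
interface CacheEntry<T> {
  sourceHash: number;
  compiled: T;
}

// djb2 string hash, kept within 32 bits.
function hashSource(source: string): number {
  let hash = 5381;
  for (let i = 0; i < source.length; i++) {
    hash = ((hash << 5) + hash + source.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

class CodeCache<T> {
  hits = 0;
  misses = 0;
  private entries = new Map<string, CacheEntry<T>>();
  private compile: (source: string) => T;

  constructor(compile: (source: string) => T) {
    this.compile = compile;
  }

  // Returns cached output when the script at `url` is unchanged,
  // otherwise compiles it and refreshes the cache entry.
  load(url: string, source: string): T {
    const sourceHash = hashSource(source);
    const entry = this.entries.get(url);
    if (entry !== undefined && entry.sourceHash === sourceHash) {
      this.hits += 1;
      return entry.compiled;
    }
    this.misses += 1;
    const compiled = this.compile(source);
    this.entries.set(url, { sourceHash, compiled });
    return compiled;
  }
}

// Usage with a stand-in "compiler" that just tags the source.
const demoCache = new CodeCache<string>((source) => "compiled:" + source);
demoCache.load("app.js", "let x = 1;"); // miss: compiled for the first time
demoCache.load("app.js", "let x = 1;"); // hit: reused from the cache
demoCache.load("app.js", "let x = 2;"); // miss: source changed, recompiled
```

Higher cache coverage, in these terms, means a larger share of `load` calls ending in a hit.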

U4 supporting development platform

In addition to engine-level optimization, we keep improving our supporting platforms and tools. For the three key stages of development, pre-release, and post-launch, we provide tool support across the whole process: UC Developer Tools, Luban Rule (LuBanChe), and the Web high-availability monitoring platform, respectively, to help developers build high-quality Web applications.

  • UC Developer Tools are customized and optimized based on Chrome Remote Inspector. We made Remote Inspector a standalone desktop application. In addition to fixing some of the original debugging issues, we added special debugging features that make it easier for developers to debug the U4 kernel through UC DevTools.
  • Luban Rule is an offline automatic diagnosis platform that we built on top of Chrome Lighthouse. It automatically diagnoses and analyzes a developed application and produces scores and optimization suggestions, helping developers find page problems and optimize their pages. It works very well as a release gate check.
  • The Web high-availability monitoring platform is an online problem monitoring and diagnosis platform that we built together with the internal iTrace platform. There are many similar APM platforms today, but ours is deeply integrated with the kernel, which gives it wider monitoring coverage and more accurate data, and enables many distinctive monitoring items such as white screen monitoring. When a problem occurs, we can collect the loading waterfall data, JS error messages, and so on, giving developers effective data for analysis. Besides white screen monitoring, we also provide black screen monitoring, memory monitoring, JS exception monitoring, and more, which have helped businesses find and solve a number of major online problems. To better serve Web developers, the platform has also been packaged as an external product called “Yue Ying”. Anyone interested is welcome to learn more and try it.

Future

As mentioned earlier, the browser kernel has evolved from a traditional browser kernel into a Web Platform that serves more business scenarios. In the future, I think it will continue to evolve into an App Platform. So what is the difference between a Web Platform and an App Platform?

A Web Platform mainly emphasizes breadth of application. An App Platform should not only be applied even more widely, but also deliver an experience comparable to native, in both performance and development experience.

To build an App Platform, the browser kernel needs continuous optimization, including architecture optimization, performance optimization, and support for more new standards, in order to improve the performance experience of Web applications. In addition, from the developer’s perspective, we need to keep improving developer tools and platforms and open up more browser kernel data and device capabilities, so that developers can build high-quality Web applications more efficiently.

We believe the Web will be one of the most important technologies in the mobile Internet field, and we will continue to invest in building a better App Platform. Our goal is to make the Web omnipotent!

That’s all for my talk. Thank you.

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

The U4 kernel is committed to building the best-performing and most secure Web platform, so that the Web can do anything.

Search for “U4 kernel technology” to get the latest technical updates.