Author: The Bermudarat

Preface

In more and more businesses, front-end pages must do more than display data and provide a UI for user operations; they must also give users a richer interactive experience. Animation is a key vehicle for that experience and has become an essential part of everyday front-end development, especially consumer-facing (C-side) development. Improvements in device hardware and upgrades to browser kernels have made smooth animation on web pages achievable. The typical refresh rate of a conventional device is currently 60 Hz, which means the browser’s rendering pipeline needs to output 60 images per second (60 FPS) for the user to perceive no significant lag. This article starts with the basics of the render tree, then introduces the browser rendering pipeline and some common ways to optimize animation performance.
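The 60 Hz figure translates directly into a time budget per frame. The sketch below is only arithmetic on the refresh rate; the function names frameBudgetMs and missedRefreshes are hypothetical helpers, not browser APIs.

```javascript
// Time available to produce one frame at a given refresh rate.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Number of display refreshes missed when a single frame takes `frameTimeMs` to produce.
function missedRefreshes(frameTimeMs, refreshRateHz) {
  const budget = frameBudgetMs(refreshRateHz);
  return Math.max(0, Math.ceil(frameTimeMs / budget) - 1);
}

console.log(frameBudgetMs(60).toFixed(2)); // "16.67"
console.log(missedRefreshes(40, 60));      // 2
```

A frame that takes 40 ms spans three 16.67 ms refresh intervals, so the display repeats the previous image twice before the new one appears.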

Rendering basics

Render tree

Although rendering engines differ in their rendering processes, they all need to parse HTML and CSS to generate a render tree. The rendering engine that front-end developers encounter most is WebKit (and its derivative Blink), so this article uses WebKit as the basis for describing the render tree.

Image from GPU Accelerated Compositing in Chrome

The RenderObject tree, RenderLayer tree, and GraphicsLayer tree make up the “render forest”.

RenderObject

A RenderObject holds all the information needed to draw a DOM node and, like the DOM tree it corresponds to, RenderObjects form a tree. However, RenderObjects do not correspond one-to-one with DOM nodes. According to Webkit Tech Insider, a RenderObject is created in the following cases:

  • The document node of the DOM tree;
  • Visible nodes in the DOM tree (WebKit does not create RenderObject nodes for non-visible nodes);
  • Anonymous RenderObject nodes that WebKit creates when processing requires them, such as a RenderBlock (a RenderObject subclass) node representing a block element.

To draw DOM nodes on a page, the browser needs to know the stacking level of each render node in addition to the node’s own information. The browser uses RenderLayer to define this rendering hierarchy.

RenderLayer

A RenderLayer is created by the browser based on RenderObjects. RenderLayer was originally used to implement stacking contexts, ensuring that page elements are displayed at the correct level. Again, RenderObjects and RenderLayers are not one-to-one. A RenderLayer is created for a RenderObject that meets any of the following conditions (from GPU Accelerated Compositing in Chrome):

  • It is the root node of the document;
  • It has explicit CSS position properties (e.g. relative, absolute, or transform);
  • It is transparent;
  • It has overflow, mask, or reflection properties;
  • It has the filter property;
  • It is a Canvas node with a 3D context or an accelerated 2D context;
  • It corresponds to a Video element.

We can think of each RenderLayer as a layer. Rendering means drawing the RenderObjects on each RenderLayer. This can be done by the CPU, which is called software drawing. However, software drawing cannot handle 3D drawing contexts: no layer may contain RenderObjects that use 3D drawing, such as a Canvas node with a 3D context, and CSS 3D transform properties cannot be supported. In addition, every time an element’s size or position changes during a page animation, the RenderLayer tree is rebuilt, triggering Layout and the rest of the rendering pipeline. This lowers the page frame rate and produces visible lag. Modern browsers therefore introduced hardware-accelerated drawing performed by the GPU.

Once the content of each layer has been drawn, the layers must be combined into a single image. This process is known as compositing.

Software rendering has no real need for compositing, because it draws every layer into the same memory space in back-to-front order. In modern browsers, especially on mobile devices, hardware-accelerated drawing on the GPU is more common. Hardware-accelerated drawing requires compositing, and the compositing is also done by the GPU. The whole process is called hardware-accelerated compositing. Not all drawing in modern browsers needs to be done on the GPU, as Webkit Tech Insider points out:

For common 2D drawing operations such as text, points, and lines, drawing on the GPU is not necessarily better than drawing on the CPU, because the CPU can use caching to effectively reduce the cost of repeated drawing and does not need the GPU’s parallelism.

GraphicsLayer

To save GPU memory, WebKit does not allocate a backing store for every RenderLayer. Instead, several RenderLayers are grouped according to certain rules into a new layer that has a backing store and is used for subsequent compositing. This is called a compositing layer, and its storage is represented by a GraphicsLayer. A RenderLayer that is not promoted to its own compositing layer uses the compositing layer of its parent. A RenderLayer gets its own compositing layer if it has any of the following features (from GPU Accelerated Compositing in Chrome):

  • It has CSS properties with 3D or perspective transforms;
  • It contains a Video element that uses hardware-accelerated video decoding;
  • It contains a Canvas element with a 3D context or an accelerated 2D context (note: a normal 2D context is not promoted to a compositing layer);
  • It has an animation on opacity or transform;
  • It uses hardware-accelerated CSS filters;
  • It has a descendant that is a compositing layer;
  • Overlap: it has a sibling with a lower z-index, and that sibling is a compositing layer.
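The promotion rules above can be read as a checklist. The following is a minimal sketch of that checklist over a plain object; the field names (has3DTransform, overlapsCompositedSibling, and so on) are hypothetical and do not correspond to real WebKit or Blink data structures.

```javascript
// `layer` is a hypothetical plain-object description of a RenderLayer.
// Returns the list of reasons the layer would get its own compositing layer.
function compositingReasons(layer) {
  const reasons = [];
  if (layer.has3DTransform) reasons.push('3D or perspective transform');
  if (layer.hasAcceleratedVideo) reasons.push('hardware-accelerated video');
  if (layer.hasAcceleratedCanvas) reasons.push('3D or accelerated 2D canvas');
  if (layer.animatesOpacityOrTransform) reasons.push('opacity/transform animation');
  if (layer.hasAcceleratedFilter) reasons.push('hardware-accelerated CSS filter');
  if (layer.hasCompositedDescendant) reasons.push('composited descendant');
  if (layer.overlapsCompositedSibling) reasons.push('overlaps a composited sibling below it');
  return reasons;
}

// A layer with no reason paints into its parent's compositing layer.
function isComposited(layer) {
  return compositingReasons(layer).length > 0;
}

console.log(compositingReasons({ has3DTransform: true, hasCompositedDescendant: true }));
console.log(isComposited({})); // false
```

Returning the reasons rather than a bare boolean mirrors how the DevTools Layers panel reports why each compositing layer exists.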

Compositing in Blink / WebCore: From WebCore::RenderLayer to cc:Layer illustrates promotion caused by overlap with three figures. In Figure 1, the green rectangle on top and the blue rectangle underneath are siblings, and the blue rectangle has been promoted to a compositing layer for some reason. If the green rectangle is not promoted, it shares a compositing layer with its parent, and it is then rendered incorrectly beneath the blue rectangle (Figure 2). So when overlap occurs, the green rectangle must also be promoted to a compositing layer.

The conditions for promotion to a compositing layer are described in more detail in Wireless performance optimization: Composite. With RenderLayer and GraphicsLayer in place, RenderLayers that are likely to animate (in size, position, style, and so on) can be promoted to compositing layers. (Note that not every element with a CSS animation is promoted; this is covered later in the rendering pipeline section.) This design lets the browser make better use of the GPU and gives users a smooth animation experience.

Chrome DevTools makes it easy to view the compositing layers of a page: choose “More Tools -> Layers”. The Layers panel shows not only the compositing layers of the page (the Cloud Music homepage in the author’s example) but also the detailed reason each compositing layer was created. For example, the playback bar at the bottom of that page is promoted because it “Overlaps other Composited Content”, which corresponds to the overlap condition above: a sibling with a lower z-index is a compositing layer.

In front-end pages, especially during animations, promotion caused by overlap happens easily. If every overlapping RenderLayer on top were promoted to its own compositing layer, a lot of CPU and memory would be consumed (WebKit has to allocate a backing store for each compositing layer). To avoid this layer explosion, browsers perform layer squashing: if several RenderLayers overlap the same compositing layer, they are squashed into a single compositing layer. In some cases, however, the browser cannot squash the layers, and a large number of compositing layers are created. Wireless performance optimization: Composite describes several conditions that cause layer squashing to fail; for reasons of length they are not covered here.
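To make the idea of squashing concrete, here is a deliberately simplified model. It assumes layers are given in paint order (back to front) as plain objects with a rect and a `direct` flag meaning "has its own promotion reason"; the names assignBackings and overlaps are hypothetical, and real browsers apply many more rules than this.

```javascript
// Axis-aligned rectangle overlap test; rects are {x, y, w, h}.
function overlaps(a, b) {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

// Decide where each layer is painted: its own compositing layer ('own'),
// a shared squashing layer ('squashed#N'), or its parent's layer ('parent').
function assignBackings(layers) {
  const compositedRects = []; // rects of everything composited so far
  const result = {};
  let squashId = 0;
  let squashOpen = false;     // is there a squashing layer we can still join?

  for (const layer of layers) {
    if (layer.direct) {
      result[layer.name] = 'own';
      compositedRects.push(layer.rect);
      squashOpen = false;     // a new compositing layer ends the current squashing run
    } else if (compositedRects.some((r) => overlaps(r, layer.rect))) {
      if (!squashOpen) { squashId += 1; squashOpen = true; }
      result[layer.name] = `squashed#${squashId}`;
      compositedRects.push(layer.rect);
    } else {
      result[layer.name] = 'parent';
    }
  }
  return result;
}

console.log(assignBackings([
  { name: 'A', direct: true,  rect: { x: 0,   y: 0,   w: 100, h: 100 } },
  { name: 'B', direct: false, rect: { x: 50,  y: 50,  w: 100, h: 100 } },
  { name: 'C', direct: false, rect: { x: 80,  y: 80,  w: 100, h: 100 } },
  { name: 'D', direct: false, rect: { x: 500, y: 500, w: 50,  h: 50 } },
]));
// { A: 'own', B: 'squashed#1', C: 'squashed#1', D: 'parent' }
```

Without squashing, B and C would each need a separate backing store; in this model they share one, while D, which overlaps nothing composited, stays in its parent's layer.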

RenderObject, RenderLayer, and GraphicsLayer are the basis of rendering in WebKit. RenderObject stores the information each node needs for rendering, RenderLayer determines the stacking order in which things are rendered, and GraphicsLayer uses the GPU to speed up page rendering.

Rendering pipeline

Once the browser has created the render tree, the rendering pipeline determines how this information is presented on the page. Consider the following code:

<body>
    <div id="button">Click to add</div>
    <script>
        const btn = document.getElementById('button');
        // Append a new <div> to the page each time the button is clicked.
        btn.addEventListener('click', () => {
            const div = document.createElement("div");
            document.body.appendChild(div);
        });
    </script>
</body>

The Performance tab in DevTools can record and display the page’s rendering process (the captured image is limited in width and does not include the compositor thread’s input-event portion). The recorded process is almost identical to the rendering pipeline diagram in Aerotwist - The Anatomy of a Frame.

The rendering pipeline diagram involves two processes: the Renderer Process and the GPU Process. Each page tab has its own renderer process, which includes the following threads (or thread pools):

  • Compositor thread: receives the browser’s Vsync signals (which mark the end of the previous frame and the start of the next) as well as user input such as scrolls and clicks. When GPU compositing is used, it generates the drawing instructions.
  • Main thread: the browser’s execution thread, where ordinary JavaScript computation, Layout, and Paint are performed.
  • Raster thread pool (Raster/Tile workers): there may be several raster threads, which rasterize tiles. (If the main thread only converts page content into a list of drawing instructions, the instructions are executed here to obtain pixel color values.)

The GPU Process does not run on the GPU. It is responsible for uploading the tile bitmaps drawn in the renderer process to the GPU as textures and, finally, drawing them to the screen.

Here is a detailed description of the entire rendering process:

1. Frame Start

The browser sends a Vsync signal to indicate the start of a new frame.

2. Input event handlers

The compositor thread passes input events to the main thread, which runs the callbacks for each event (including any JavaScript they execute). All input events (such as touchmove, scroll, and click) are triggered only once per frame.
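The once-per-frame delivery described above can be imitated in application code. This is a simplified model of the idea, not how the browser implements it internally; the class name FrameEventQueue and its methods are hypothetical.

```javascript
// Keeps only the latest event of each type and delivers it once per frame.
class FrameEventQueue {
  constructor() {
    this.pending = new Map();  // event type -> latest event seen this frame
    this.handlers = new Map(); // event type -> handler
  }

  on(type, handler) {
    this.handlers.set(type, handler);
  }

  // Called whenever an input event arrives, possibly many times per frame.
  push(type, event) {
    this.pending.set(type, event);
  }

  // Called once at the start of each frame,
  // e.g. from a requestAnimationFrame callback in a browser.
  flush() {
    for (const [type, event] of this.pending) {
      const handler = this.handlers.get(type);
      if (handler) handler(event);
    }
    this.pending.clear();
  }
}

const queue = new FrameEventQueue();
queue.on('scroll', (e) => console.log('scroll handled at y =', e.y));
queue.push('scroll', { y: 10 });
queue.push('scroll', { y: 25 });
queue.push('scroll', { y: 40 });
queue.flush(); // logs once, with y = 40
```

However many scroll events arrive between two frames, the handler runs at most once per flush and sees the most recent state.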

3. requestAnimationFrame

If a requestAnimationFrame (rAF) callback has been registered, it is executed here.

4. Parse HTML

If the previous steps changed DOM nodes (for example, through appendChild), HTML parsing needs to be performed.

5. Recalc Styles

If the previous steps changed CSS styles, the browser needs to recalculate the styles of the modified DOM nodes and their children.

6. Layout

Layout calculates the size, position, and other geometric information of each visible element. It normally applies to the entire document, although changes to some CSS properties do not trigger Layout (see CSS Triggers). Avoid large, complex layouts and layout thrashing points out that this computation of geometry is called Layout in Chrome, Opera, Safari, and Internet Explorer and Reflow in Firefox, but the process is the same.

7. Update Layer Tree

Next, the render tree needs to be updated: changes to DOM nodes and CSS styles result in changes to the render tree.

8. Paint

Painting actually has two steps; this refers to the first: generating the drawing instructions. The drawing instructions generated by the browser are similar to the drawing API provided by Canvas, and they can be viewed in DevTools. These instructions form a draw list, which the Paint phase outputs as an SkPicture.

The SkPicture is a serializable data structure that can capture and then later replay commands, similar to a display list.

9. Composite

In DevTools this step is called Composite Layers, but the compositing done in the main thread is not the real compositing. The main thread maintains one copy of the render tree (LayerTreeHost) and the compositor thread maintains another (LayerTreeHostImpl). With its own copy, the compositor thread can composite without interacting with the main thread, so it keeps working uninterrupted while the main thread runs JavaScript.

After the render tree changes, the two copies must be synchronized. The main thread sends the changed render tree and draw list to the compositor thread and blocks until the synchronization completes. This is what Composite Layers does. It is the last step the main thread performs in the rendering pipeline; in other words, it only produces the data used for compositing and is not the compositing itself.

10. Raster Scheduled and Rasterize

After receiving the information committed by the main thread (render tree, draw list, and so on), the compositor thread arranges for it to be rasterized, that is, converted into bitmaps of pixel values. Because each compositing layer can be as large as the entire page, layers are split into tiles before rasterization. Tiles are usually 256×256 or 512×512 pixels. To see them, open “More Tools -> Rendering” in DevTools and select “Layer Borders”: orange lines mark the borders of compositing layers and cyan lines mark the tiles. Rasterization is done tile by tile, and different tiles have different priorities; tiles near the browser viewport are usually rasterized first (see Tile Prioritization Design for details). In modern browsers, rasterization does not run on the compositor thread itself. The renderer process maintains a raster thread pool, known as the Compositor Tile Workers, and the number of threads in the pool depends on the platform and device performance.
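The tiling and prioritization described above can be sketched with plain geometry. This is an illustrative model only: tilesFor, distanceToViewport, and byRasterPriority are hypothetical names, and the real prioritization logic considers more than distance.

```javascript
// Split a layer of the given pixel size into square tiles (edge tiles may be smaller).
function tilesFor(layerWidth, layerHeight, tileSize = 256) {
  const tiles = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        w: Math.min(tileSize, layerWidth - x),
        h: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Distance from a tile to the viewport rectangle (0 when they intersect or touch).
function distanceToViewport(tile, viewport) {
  const dx = Math.max(viewport.x - (tile.x + tile.w), 0, tile.x - (viewport.x + viewport.w));
  const dy = Math.max(viewport.y - (tile.y + tile.h), 0, tile.y - (viewport.y + viewport.h));
  return Math.hypot(dx, dy);
}

// Tiles closest to the viewport come first.
function byRasterPriority(tiles, viewport) {
  return [...tiles].sort(
    (a, b) => distanceToViewport(a, viewport) - distanceToViewport(b, viewport)
  );
}

const tiles = tilesFor(600, 1000);
const ordered = byRasterPriority(tiles, { x: 0, y: 0, w: 600, h: 400 });
console.log(tiles.length);                 // 12
console.log(ordered[ordered.length - 1]);  // a tile from the bottom row (y = 768)
```

A 600×1000 layer yields a 3×4 grid of tiles; with the viewport at the top of the page, the bottom row is rasterized last.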

Rasterization can be divided into software rasterization and hardware rasterization. The difference is whether the bitmap is generated on the CPU and then uploaded to the GPU for compositing, or drawn and pixel-filled directly on the GPU. The hardware rasterization process is shown in the figure below:

Raster Threads Creating the Bitmap of Tiles and Sending to GPU

You can visit chrome://gpu/ to check whether hardware rasterization is enabled in Chrome.

11. Frame End

After the tiles are rasterized, the compositor thread gathers tile information called draw quads to create a compositor frame. The compositor frame is sent to the GPU process, and the frame ends.

Draw quads: Contains information such as the tile’s location in memory and where in the page to draw the tile taking in consideration of the page compositing. Compositor frame: A collection of draw quads that represents a frame of a page.

12. Image display

The GPU process is responsible for communicating with the GPU and completing the drawing of the final image. It receives the compositor frame; if hardware rasterization is used, the rasterized textures are already stored on the GPU. The browser provides 3D drawing APIs (such as WebKit’s GraphicsContext3D class) to merge the textures into a single bitmap.

As mentioned above, elements with opacity and similar animations are promoted to their own compositing layers. The changes these animations make are applied to the compositing layer itself: before the textures are merged, the browser applies the transformation to the compositing layer through the 3D API to achieve the effect. This is why animating with the transform and opacity properties improves rendering efficiency: these animations do not change the layout structure or the texture while running, so they do not trigger Layout or Paint.

Animation performance optimization

The browser’s rendering pipeline was described above, but a render does not always trigger the entire pipeline, and some steps may be triggered more than once. Here are some ways to improve rendering efficiency, in the order of the rendering pipeline:

Handle page scrolling properly

In the browser rendering pipeline, the compositor thread is the entry point for user input events. When an input event occurs, the compositor thread decides whether the main thread needs to take part in the subsequent rendering. For example, when the user scrolls the page and all layers have already been rasterized, the compositor thread can generate the compositor frame directly without the main thread. If event handlers are bound to elements, the compositor thread marks those regions as non-fast scrollable regions. When an input event occurs inside a non-fast scrollable region, the compositor thread passes the event to the main thread for JavaScript execution and subsequent processing.

Front-end code often uses event delegation: the events of some elements are delegated to a parent or outer element (such as document), and when an event bubbles up to that outer element its handler runs there. Event delegation reduces the memory consumed by binding the same handler to many child elements and also supports dynamically added elements, so it is widely used.

If event handling is bound to document through event delegation, the entire page is marked as a non-fast scrollable region. The compositor thread then has to send every input event to the main thread and wait for the main thread to run the JavaScript that handles it before compositing and displaying the page. In this situation, smooth page scrolling is hard to achieve.

To address this, the third argument of addEventListener accepts {passive: true} (default: false). This option tells the compositor thread that it still needs to pass the event to the main thread but can go on compositing new frames without waiting for the main thread. In this case, calling preventDefault in the event handler has no effect.

document.body.addEventListener('touchstart', event => {
    event.preventDefault(); // Has no effect: a passive listener cannot block the default behavior
}, { passive: true });

Image from Inside Look at Modern Web Browser (Part 4)

In addition, in scenarios such as lazy loading, it is often necessary to listen for page scrolling to determine whether certain elements are in the viewport. A common approach is to call Element.getBoundingClientRect() to get an element’s bounds and then compute whether it lies within the viewport. Calling Element.getBoundingClientRect() on the main thread every frame can cause performance problems (for example, careless use forces the page to re-layout). The Intersection Observer API provides a way to asynchronously detect changes in the intersection of a target element with an ancestor element or the viewport. It lets you register a callback that fires when the intersection between the observed element and the other element changes. The intersection test is thus left to the browser to manage and optimize, which improves scrolling performance.
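As an illustration of the lazy-loading case, the sketch below uses the standard IntersectionObserver constructor, its observe and unobserve methods, and the isIntersecting and target fields of each entry. Everything else is an assumption made for the example: the function name lazyLoadImages, the data-src convention, the 200px margin, and the injectable ObserverCtor parameter (present only so the sketch can be exercised outside a browser).

```javascript
// Swap in the real image URL (kept in data-src) once an image nears the viewport.
function lazyLoadImages(images, ObserverCtor = globalThis.IntersectionObserver) {
  const observer = new ObserverCtor((entries, obs) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      const img = entry.target;
      img.src = img.dataset.src; // start loading
      obs.unobserve(img);        // each image only needs to be loaded once
    }
  }, { rootMargin: '200px' });   // begin loading 200px before the image becomes visible

  for (const img of images) observer.observe(img);
  return observer;
}

// Browser usage:
// lazyLoadImages(document.querySelectorAll('img[data-src]'));
```

No scroll listener and no per-frame getBoundingClientRect() call is needed; the callback runs only when an image's intersection state changes.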

JavaScript optimization

Reduce JavaScript execution time in the main thread

For a device with a frame rate of 60 FPS, each frame must be completed within about 16.66 ms. Missing this deadline causes the content to judder on screen, which is what users perceive as lag. User input is handled on the main thread, so to protect the user experience, avoid long computations there that block the rest of the pipeline. Optimizing JavaScript execution proposes the following points:

  1. For animation effects, avoid using setTimeout or setInterval and use requestAnimationFrame instead.
  2. Move long-running JavaScript from the main thread to the Web Worker.
  3. Use micro-tasks to make DOM changes over several frames.
  4. Use the Chrome DevTools Timeline and JavaScript Profiler to assess the impact of JavaScript.

When setTimeout or setInterval drives an animation, there is no guarantee at which stage of the rendering pipeline the callback will run; if it runs right at the end of a frame, a frame may be dropped. In the rendering pipeline, rAF callbacks run after JavaScript input handling and before Layout, so this problem does not arise. Moving pure computation to a Web Worker reduces JavaScript execution time on the main thread. Large computations that must run on the main thread can be broken into small tasks and processed per frame in rAF or requestIdleCallback (see the React Fiber implementation).
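One way to break a large job into per-frame pieces is to process a queue until a small time budget is used up and then yield until the next frame. The sketch below assumes a 4 ms budget and the hypothetical names runChunk, pump, and renderItem; the clock is injectable so the logic can be checked without real timers.

```javascript
// Process queued items until the time budget is spent; returns how many were processed.
function runChunk(queue, work, budgetMs, now = () => performance.now()) {
  const start = now();
  let processed = 0;
  while (queue.length > 0 && now() - start < budgetMs) {
    work(queue.shift());
    processed += 1;
  }
  return processed;
}

// Browser usage (assumes `queue` and `renderItem` are defined by the page):
// function pump() {
//   runChunk(queue, renderItem, 4);                     // spend at most ~4 ms this frame
//   if (queue.length > 0) requestAnimationFrame(pump);  // continue in the next frame
// }
// requestAnimationFrame(pump);
```

The budget is checked before each item, so a single slow item can still overrun it; items need to be small for the scheme to work.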

Reduce forced layouts caused by improper JavaScript code

In the rendering pipeline, JavaScript/rAF operations may change the render tree, which triggers the Layout that follows. If JavaScript/rAF code reads layout properties or computed styles, such as el.offsetWidth or getComputedStyle(el).backgroundImage, after making such a change, it may trigger a forced layout: Recalc Styles or Layout, which would normally run later in the pipeline, has to run early and synchronously, which hurts rendering efficiency.

const box = document.getElementById('box'); // assumes an element with id "box"
requestAnimationFrame(logBoxHeight);
function logBoxHeight() {
  box.classList.add('super-big');
  // To return offsetHeight, the browser must first apply the super-big
  // style change and then run Layout synchronously (a forced layout).
  console.log(box.offsetHeight);
}

A better approach is to read the layout value first and then change the style:

function logBoxHeight() {
  console.log(box.offsetHeight);
  box.classList.add('super-big');
}
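The read-then-write ordering can be generalized by queueing DOM reads and writes separately and running all reads before any write. The sketch below is a minimal illustration of that pattern; createBatcher, measure, and mutate are hypothetical names, not a browser API.

```javascript
// Collects DOM reads and writes, then runs every read before any write.
function createBatcher() {
  const reads = [];
  const writes = [];
  return {
    measure(fn) { reads.push(fn); },
    mutate(fn) { writes.push(fn); },
    // Call once per frame, e.g. from a requestAnimationFrame callback.
    flush() {
      for (const fn of reads.splice(0)) fn();
      for (const fn of writes.splice(0)) fn();
    },
  };
}

const batcher = createBatcher();
const order = [];
batcher.mutate(() => order.push('write 1'));
batcher.measure(() => order.push('read 1'));
batcher.mutate(() => order.push('write 2'));
batcher.measure(() => order.push('read 2'));
batcher.flush();
console.log(order); // [ 'read 1', 'read 2', 'write 1', 'write 2' ]
```

Because no read happens after a write within the same flush, the browser never has to recompute layout in the middle of the callback.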

Reduce Layout and Paint

This is the familiar advice to reduce reflow and repaint. Of the three stages of the rendering pipeline (Layout, Paint, and compositing), Layout and Paint are the relatively expensive ones. But not every frame needs to go through the full pipeline. A change to a DOM node triggers Layout only when its size or position changes; if the change does not affect the node’s position in the document flow, the browser can skip Layout and only generate the draw list and Paint. Paint works per compositing layer: whenever a style change that triggers Paint is made to an element, the compositing layer containing that element is repainted. Therefore, animated elements can be promoted to their own compositing layer to narrow the scope of Paint.
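The relationship between a style change and the stages it triggers can be pictured as a lookup. The table below is a small hand-picked illustration, not an exhaustive or authoritative list; the classification of transform and opacity assumes the element already sits on its own compositing layer, and the name stagesTriggeredBy is hypothetical.

```javascript
// Earliest pipeline stage invalidated by a change to each property (illustrative subset).
const FIRST_STAGE = new Map([
  ['width', 'layout'],
  ['height', 'layout'],
  ['top', 'layout'],
  ['left', 'layout'],
  ['color', 'paint'],
  ['background-color', 'paint'],
  ['box-shadow', 'paint'],
  ['transform', 'composite'],
  ['opacity', 'composite'],
]);

const PIPELINE = ['layout', 'paint', 'composite'];

// Every stage from the first invalidated one onward has to run.
function stagesTriggeredBy(property) {
  const first = FIRST_STAGE.get(property);
  if (first === undefined) throw new Error(`unknown property: ${property}`);
  return PIPELINE.slice(PIPELINE.indexOf(first));
}

console.log(stagesTriggeredBy('width'));     // [ 'layout', 'paint', 'composite' ]
console.log(stagesTriggeredBy('color'));     // [ 'paint', 'composite' ]
console.log(stagesTriggeredBy('transform')); // [ 'composite' ]
```

Animating a property further down the table skips more of the pipeline, which is the reasoning behind preferring transform over top or left for movement.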

Compositing layer promotion

The render tree section mentioned that a RenderLayer meeting certain conditions is promoted to a compositing layer. Compositing layers are drawn on the GPU, which performs better than the CPU for this work. When one compositing layer needs to Paint, other compositing layers are not affected, and some animations on compositing layers trigger neither Layout nor Paint. The following are several common ways to promote an element to a compositing layer:

Use transform and opacity to write animations

As mentioned above, an element with a CSS opacity animation or a CSS transform animation is promoted to a compositing layer, and the animation is applied to the compositing layer itself. Such animations run without the main thread: the compositing layer is transformed through the 3D API before the textures are composited.

  #cube {
      transform: translateX(0);
      transition: transform 3s linear;
  }

  #cube.move {
      transform: translateX(100px);
  }

<body>
    <div id="button">Click to move</div>
    <div id="cube"></div>
    <script>
        const btn = document.getElementById('button');
        btn.addEventListener('click', () => {
            const cube = document.getElementById('cube');
            // Adding the class starts the transform transition defined above.
            cube.classList.add('move');
        });
    </script>
</body>

For the animation above, the element is promoted to a compositing layer only once the animation starts, and the promotion is dropped when the animation ends. This avoids the performance cost of the browser creating a large number of compositing layers.

will-change

This property tells the browser that a particular kind of change is about to be applied to an element. When will-change is set to opacity, transform, top, left, bottom, or right (for top, left, bottom, and right the element must also have an explicit position value such as relative), the browser promotes the element to a compositing layer. When writing styles, avoid the following:

*{ will-change: transform, opacity; }

This promotes every element to its own compositing layer and consumes a great deal of memory. Set will-change only on the elements being animated, and remove it manually once the animation is complete.
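Setting the hint just for the duration of an animation might look like the sketch below. It relies on the standard transitionend event and its propertyName field; the function name animateWithHint and the `start` callback are hypothetical. In practice the hint is best set a little before the change (for example on hover) so the browser has time to prepare.

```javascript
// `el` is expected to behave like a DOM element
// (style, addEventListener, removeEventListener).
function animateWithHint(el, property, start) {
  el.style.willChange = property;   // hint before the animation starts

  const cleanup = (event) => {
    if (event.propertyName !== property) return; // ignore other transitions
    el.style.willChange = 'auto';   // release the hint once the transition ends
    el.removeEventListener('transitionend', cleanup);
  };
  el.addEventListener('transitionend', cleanup);

  start(el);                        // e.g. add the class that triggers the transition
}

// Browser usage (assumes the #cube styles from the transform example):
// animateWithHint(document.getElementById('cube'), 'transform',
//   (el) => el.classList.add('move'));
```

Resetting will-change to auto lets the browser drop the extra compositing layer as soon as it is no longer needed.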

Canvas

Animations can be drawn on a Canvas with an accelerated 2D context or a 3D context. Because the Canvas has its own compositing layer, changes to it do not affect the painting of other compositing layers. This suits large, complex animations such as HTML5 games. In addition, several Canvas elements can be layered sensibly to reduce drawing overhead.

CSS Containment Module

Level 3 of the CSS Containment Module was published recently. Its main goal is to improve page rendering performance by isolating a particular DOM element from the DOM tree of the whole document, so that changes to that element do not affect the rest of the document. The CSS Containment Module provides two main properties to support this optimization.

contain

The contain property lets developers declare that a particular DOM element is independent of the rest of the DOM tree. For such elements, the browser can compute layout, style, size, and so on in isolation. When a DOM element with the contain property changes, the layout tree of the rest of the page is not affected, so the change does not trigger Layout and Paint for the whole page. contain accepts the following values:

layout

The internal layout of an element whose contain value is layout is independent of the rest of the page: changes inside the element do not trigger re-layout of the page.

paint

For a DOM node whose contain value is paint, descendants are not displayed outside its bounds. So if the node is off-screen or otherwise invisible, its descendants are guaranteed to be invisible too. It also has the following effects:

  • For descendants whose position is fixed or absolute, a DOM node whose contain value is paint becomes their containing block.
  • A DOM node whose contain value is paint creates a stacking context.
  • A DOM node whose contain value is paint creates a new formatting context.

size

For a DOM node whose contain value is size, its size is not affected by its children.

style

For a DOM node whose contain value is style, its CSS properties do not affect elements outside its own subtree.

inline-size

inline-size is the newest value, added in Level 3. For a DOM node whose contain value is inline-size, the intrinsic size of its principal box in the inline axis is not affected by its content.

strict

Equivalent to contain: size layout paint style

content

Equivalent to contain: layout paint style

In complex pages with a large number of DOM nodes, changing a DOM element that is not in its own compositing layer can trigger Layout and Paint for the entire page. Setting the contain property (for example, contain: strict) on such elements limits that work to the element’s subtree and can noticeably improve page performance.

An introduction to CSS Containment gives a long-list example: the contain property of the first item in a long list is set to strict, the content of that item is changed, and a forced layout of the page is triggered manually before and after the change. Compared with not setting strict, JavaScript execution time dropped from 4.37 ms to 0.43 ms, a large improvement in rendering performance.

[Figure: browser support for contain]

content-visibility

The contain property requires developers to decide at development time whether a DOM element needs rendering optimization and to set an appropriate value. content-visibility offers another way: set it to auto and let the browser decide. As mentioned above, the compositor thread splits each page-sized layer into tiles and rasterizes the tiles by priority, and the browser renders every element the user might view. For an element whose content-visibility is auto, when it is off-screen the browser still computes the element’s own size so that page structures such as the scrollbar display correctly, but it does not generate a render tree for the element’s children, so they are not rendered. When the page scrolls and the element enters the viewport, the browser starts rendering its children. This leads to a problem: because the children of an off-screen element are not laid out, their sizes are unknown and cannot contribute to the element’s size. If the element’s size is not specified explicitly, it is treated as 0, which causes errors in the page height and the scrollbar. To solve this, CSS provides another property, contain-intrinsic-size, which sets the placeholder size of an element whose content-visibility is auto. This ensures the element occupies space during Layout even when it is not explicitly sized.

.ele {
    content-visibility: auto;
    contain-intrinsic-size: 100px;
}

content-visibility: the new CSS property that boosts your rendering performance gives the example of a travel blog: with content-visibility set appropriately, rendering performance on the first page load improved sevenfold.

[Figure: browser support for content-visibility]

Conclusion

Much has been written about browser rendering mechanisms, but some of it, especially material dealing with the browser kernel, is hard to follow. This article started from the browser’s underlying rendering and described the render tree and the rendering pipeline in detail. Then, following the order of the rendering pipeline, it introduced ways to improve animation performance: handle page scrolling properly, optimize JavaScript, and reduce Layout and Paint. I hope it helps with understanding browser rendering and with everyday animation development.

References

  1. Webkit Tech Insider, Zhu Yongsheng
  2. GPU Accelerated Compositing in Chrome
  3. Compositing in Blink / WebCore: From WebCore::RenderLayer to cc:Layer
  4. Wireless performance optimization: Composite
  5. The Anatomy of a Frame
  6. Avoid large, complex layouts and layout thrashing
  7. Software vs. GPU Rasterization in Chromium
  8. Optimizing JavaScript execution
  9. Browser rendering pipeline parsing and webpage animation performance optimization
  10. Tile Prioritization Design
  11. CSS Containment Module Level 3
  12. Let’s Take a Deep Dive Into the CSS Contain Property
  13. CSS triggers
  14. Detailed process for browser rendering: redraw, rearrange, and composite are just the tip of the iceberg
  15. Use ONLY CSS to speed up page rendering

This article was published by the NetEase Cloud Music front-end team. Reproduction in any form without authorization is prohibited. We recruit front-end, iOS, and Android engineers all year round. If you are ready to change jobs and you like Cloud Music, join us: grp.music-fe(at)corp.netease.com