Singsong: I compiled this article after reading several good sources. Readers who want to dig deeper can consult the reference list at the end.

✏️ The latest content is available on Github ❗️

Why write this article?

This article lays the groundwork for CSS optimization. To build high-performance websites and applications, it is not enough for your code to run efficiently; the page itself must also render smoothly, ideally at 60 FPS. That requires understanding how the browser renders a page. Browser rendering is closely tied to CSS, so knowing how rendering works helps you write CSS that performs better.

The next article will cover optimization in practice and will touch on both JavaScript and CSS. JavaScript optimization was covered in an earlier article on common JavaScript memory leaks; this article supplies the background for the CSS side.

Contents

  • The browser
  • DOM tree
  • CSSOM tree
  • RenderObject Tree (also called Render Tree)
  • Layout
  • RenderLayer tree
  • Rendering methods
  • GraphicsLayer tree
  • Tiled Rendering
  • High Performance Animations
  • Conclusion
  • Reference articles

The browser

A browser has seven main components:

  1. User Interface: the address bar, forward/back buttons, bookmark menu, and so on. Every part of the browser display belongs to the user interface except the main window that shows the requested page.
  2. Browser Engine: passes instructions between the user interface and the rendering engine.
  3. Rendering Engine: responsible for displaying the requested content. If the requested content is HTML, it parses the HTML and CSS and displays the parsed content on the screen.
  4. Networking: handles network calls such as HTTP requests. It exposes a platform-independent interface, with a separate implementation underneath for each platform.
  5. JavaScript Interpreter: parses and executes JavaScript code.
  6. UI Backend: draws basic widgets such as combo boxes and windows. It exposes a generic, platform-independent interface and uses the operating system's user interface methods underneath.
  7. Data Storage: the persistence layer. Browsers need to keep all kinds of data, such as cookies, on disk. Browsers also support storage mechanisms such as localStorage, IndexedDB, WebSQL, and FileSystem.

This article focuses on rendering, that is, on the rendering engine, which parses the requested HTML and CSS and displays the result on the screen.

DOM tree

DOM: Document Object Model. The DOM defines a set of platform-independent and language-independent interfaces that let programs dynamically access and modify the content and structure of a document. A document represented by the DOM is a tree structure, and that tree can be manipulated through the DOM interfaces.

Here is an example of an HTML structure containing some text and an image:

<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="style.css" rel="stylesheet">
    <title>Critical Path</title>
  </head>
  <body>
    <p>Hello <span>web performance</span> students!</p>
    <div><img src="awesome-photo.jpg"></div>
  </body>
</html>

How the browser processes the HTML page:

  1. Conversion: the browser reads the raw bytes of the HTML from disk or the network and converts them into characters according to the encoding specified for the file (for example, UTF-8).
  2. Tokenizing: the browser converts the character string into the tokens defined by the W3C HTML5 standard, for example <html>, <body>, and other strings in angle brackets. Each token has a special meaning and its own set of rules.
  3. Lexing: the tokens are converted into "objects" that define their properties and rules.
  4. DOM tree construction: HTML markup defines relationships between tags (some tags are contained within other tags), so the created objects are linked into a tree data structure that captures the parent-child relationships of the original markup: the html object is the parent of the body object, body is the parent of the paragraph object, and so on.

The final output of this process is the DOM tree of the HTML page, which the browser uses for all further processing of the page. Every time the browser processes HTML markup, it goes through all of the steps above: convert bytes to characters, identify tokens, convert tokens to nodes, and build the DOM tree.
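The token and tree-building steps above can be sketched in a few lines of JavaScript. This is only an illustration, not how a real engine works: the `tokenize` and `buildTree` functions and the `VOID_TAGS` set are made-up names, attributes are ignored, text is trimmed, and none of the error recovery required by the HTML standard is attempted.

```javascript
// Minimal sketch of "characters -> tokens -> nodes -> tree".
// Assumption: well-formed input without comments or a doctype.
const VOID_TAGS = new Set(["meta", "link", "img", "br", "hr", "input"]);

function tokenize(html) {
  const tokens = [];
  // Matches either a tag (<...>) or a run of text between tags.
  const pattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>|[^<]+/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const raw = match[0];
    if (raw[0] !== "<") {
      // Whitespace-only text is dropped and other text is trimmed (a simplification).
      if (raw.trim() !== "") tokens.push({ type: "text", value: raw.trim() });
    } else if (raw[1] === "/") {
      tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    } else {
      tokens.push({ type: "startTag", name: match[1].toLowerCase() });
    }
  }
  return tokens;
}

function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // open elements; the last entry is the current parent
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "text") {
      parent.children.push({ name: "#text", value: token.value, children: [] });
    } else if (token.type === "startTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      // Void elements such as <img> never get an end tag, so they are not pushed.
      if (!VOID_TAGS.has(token.name)) stack.push(node);
    } else if (stack.length > 1) {
      stack.pop(); // an end tag closes the current element
    }
  }
  return root;
}

const dom = buildTree(
  tokenize("<html><body><p>Hello <span>web performance</span> students!</p></body></html>")
);
console.log(JSON.stringify(dom, null, 2));
```

The stack of open elements is what turns a flat token stream into parent-child relationships: whatever element is on top of the stack becomes the parent of the next node.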

CSSOM tree

CSSOM: CSS Object Model. The CSSOM defines how JavaScript can access styles: it is a set of JavaScript interfaces for reading and manipulating CSS properties, alongside the interfaces the DOM provides, so that JavaScript can change styles dynamically. The DOM gives JavaScript an interface for modifying the HTML document; the CSSOM gives JavaScript an interface for reading and modifying the style information set by CSS.

While building the DOM, the browser encounters the link tag, which references an external CSS style sheet, style.css. Anticipating that it will need this resource to render the page, it immediately issues a request for it, and the request returns the following content:

body { font-size: 16px }
p { font-weight: bold }
span { color: red }
p span { display: none }
img { float: right }

As with HTML, the browser must convert the CSS rules it receives into an internal representation that it can understand and process. So the HTML process is repeated, this time for CSS: the CSS bytes are converted into characters, then tokens, then nodes, and finally linked into a tree structure, the CSSOM tree.

The CSSOM tree is used to determine the computed style of each node object. For example, the span element gets color: red from its own rule and font-size: 16px inherited from the body element.
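As a rough illustration of how a computed style falls out of the rules plus inheritance, here is a small JavaScript sketch using the style sheet from the example. The `INHERITED` set, `matches`, and `computeStyle` are hypothetical helpers; selectors are limited to tag names and descendant combinators, and rules are applied in source order only, with no specificity handling.

```javascript
// Properties that inherit by default in CSS; display and float do not.
const INHERITED = new Set(["font-size", "font-weight", "color"]);

// The style sheet from the example as (selector, declarations) pairs.
const rules = [
  { selector: "body", declarations: { "font-size": "16px" } },
  { selector: "p", declarations: { "font-weight": "bold" } },
  { selector: "span", declarations: { color: "red" } },
  { selector: "p span", declarations: { display: "none" } },
  { selector: "img", declarations: { float: "right" } },
];

// ancestors: tag names from the root down to the parent of the element.
function matches(selector, tag, ancestors) {
  const parts = selector.split(" ");
  if (parts[parts.length - 1] !== tag) return false;
  // Every earlier part must appear among the ancestors, in order.
  let from = 0;
  for (const part of parts.slice(0, -1)) {
    const index = ancestors.indexOf(part, from);
    if (index === -1) return false;
    from = index + 1;
  }
  return true;
}

function computeStyle(tag, ancestors, parentStyle) {
  const style = {};
  // Step 1: start from the inherited subset of the parent's computed style.
  for (const [prop, value] of Object.entries(parentStyle)) {
    if (INHERITED.has(prop)) style[prop] = value;
  }
  // Step 2: apply every matching rule; later rules overwrite earlier ones.
  for (const rule of rules) {
    if (matches(rule.selector, tag, ancestors)) {
      Object.assign(style, rule.declarations);
    }
  }
  return style;
}

const bodyStyle = computeStyle("body", ["html"], {});
const pStyle = computeStyle("p", ["html", "body"], bodyStyle);
const spanStyle = computeStyle("span", ["html", "body", "p"], pStyle);
console.log(spanStyle);
```

For the span inside the paragraph, the result contains font-size: 16px and font-weight: bold from inheritance, color: red from the span rule, and display: none from the p span rule.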

RenderObject Tree (also called Render Tree)

A DOM tree contains both invisible and visible nodes. Invisible nodes are nodes that do not need to be drawn on the final page, such as meta, head, and script, as well as nodes hidden with the CSS style display: none. Visible nodes are the ones the user can see, such as body, div, span, canvas, and img. The browser has to draw the content of visible nodes on the final page, so it creates a RenderObject for each of them. A RenderObject holds the information needed to draw its DOM node. Like DOM objects, these RenderObjects form a tree, called the RenderObject tree. The RenderObject tree is a new tree derived from the DOM tree, an internal representation built for layout calculation and rendering. Its nodes do not correspond one-to-one to DOM tree nodes. A RenderObject is created for:

  • The document node of the DOM tree.

  • Visible nodes in the DOM tree, such as html, body, and div. Browsers do not create RenderObjects for invisible nodes.

  • In some cases, an anonymous RenderObject that does not correspond to any node in the DOM tree.

RenderObjects make up the RenderObject tree, and each RenderObject holds the computed style used to draw its DOM node. The RenderObject tree can therefore also be understood as the combination of the CSSOM tree and the DOM tree.
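The creation rules can be illustrated with a short JavaScript sketch that walks a DOM-like tree and keeps only the nodes that need drawing. The node shape (`tag`, `style`, `children`), the `NON_VISUAL_TAGS` set, and `buildRenderTree` are assumptions made for this example; text nodes and anonymous RenderObjects are not modeled.

```javascript
// Tags that never produce a box on the page (illustrative subset).
const NON_VISUAL_TAGS = new Set(["head", "meta", "link", "title", "script", "style"]);

function buildRenderTree(domNode) {
  // Invisible nodes get no RenderObject, and neither do their descendants.
  if (NON_VISUAL_TAGS.has(domNode.tag)) return null;
  if (domNode.style && domNode.style.display === "none") return null;

  const renderObject = { tag: domNode.tag, style: domNode.style || {}, children: [] };
  for (const child of domNode.children || []) {
    const childRender = buildRenderTree(child);
    if (childRender !== null) renderObject.children.push(childRender);
  }
  return renderObject;
}

// The example page, with computed styles already attached to the nodes.
const dom = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "meta" }, { tag: "link" }, { tag: "title" }] },
    {
      tag: "body",
      children: [
        { tag: "p", children: [{ tag: "span", style: { display: "none" } }] },
        { tag: "div", children: [{ tag: "img", style: { float: "right" } }] },
      ],
    },
  ],
};

const renderTree = buildRenderTree(dom);
console.log(JSON.stringify(renderTree, null, 2));
```

In the result, head and everything under it are gone, and so is the span hidden by display: none, while body, p, div, and img each have a render object.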

Layout

When the browser creates a RenderObject, the object does not yet know its position or size within the device viewport. The process by which the browser computes position, size, and related values from the box model is called layout calculation. Layout calculation is recursive, because computing the size of a node usually requires computing the position and size of its children first. To determine the exact size and position of every node on the page, the browser traverses the RenderObject tree starting from the root.

Example:

<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Critical Path: Hello world!</title>
  </head>
  <body>
    <div style="width: 50%">
      <div style="width: 50%">Hello world!</div>
    </div>
  </body>
</html>

The page contains two nested divs: the parent div has its width set to 50% of the viewport width, and the child div has its width set to 50% of its parent's width, which is 25% of the viewport width.
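The recursion can be sketched as follows. This is a deliberately reduced model, not a real layout algorithm: the `layout` function, the box shape, the 800px viewport, and the 18px content height are all assumptions, and margins, padding, and the default body margin are ignored. Widths are resolved top-down against the containing block, while heights are accumulated bottom-up from the children.

```javascript
function layout(box, containingWidth) {
  // A percentage width resolves against the containing block;
  // a box without a width fills its container (block-level default).
  const width =
    box.width === undefined
      ? containingWidth
      : (parseFloat(box.width) / 100) * containingWidth;

  // Children are laid out first because the parent's height depends on them.
  const children = (box.children || []).map((child) => layout(child, width));

  // A box is as tall as its stacked children, or its own content if it is a leaf.
  const height =
    children.length > 0
      ? children.reduce((sum, child) => sum + child.height, 0)
      : box.contentHeight || 0;

  return { name: box.name, width, height, children };
}

// The two nested divs from the example; contentHeight is a made-up line height.
const page = {
  name: "body",
  children: [
    {
      name: "outer div",
      width: "50%",
      children: [{ name: "inner div", width: "50%", contentHeight: 18 }],
    },
  ],
};

const laidOut = layout(page, 800); // assume an 800px-wide viewport
console.log(laidOut.children[0].width); // outer div: 400
console.log(laidOut.children[0].children[0].width); // inner div: 200
```

With an 800px viewport the outer div is 400px wide and the inner div 200px, matching the 50% and 25% figures above.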

RenderLayer tree

The rendering engine does not render directly from the RenderObject tree. To handle positioning, clipping, overflow scrolling, CSS transform/opacity/animation/filter, masks and reflections, z-index ordering, and so on, the browser creates a RenderLayer for certain RenderObjects, and these RenderLayers form a RenderLayer tree. Each of those RenderObjects is attached directly to its own RenderLayer. A descendant that has no RenderLayer of its own belongs to the RenderLayer of its nearest ancestor that has one, so every RenderObject ends up belonging to a RenderLayer, directly or indirectly. RenderObjects and RenderLayers are therefore not in a one-to-one relationship: one RenderLayer can hold many RenderObjects. The rendering engine creates a RenderLayer for a RenderObject when any of the following conditions holds:

  • It’s the root object for the page
  • It has explicit CSS position properties (relative, absolute or a transform)
  • It is transparent
  • Has overflow, an alpha mask or reflection
  • Has a CSS filter
  • Corresponds to <canvas> element that has a 3D (WebGL) context or an accelerated 2D context
  • Corresponds to a <video> element

Each RenderLayer can be thought of as one layer of an image; stacked together, the layers form the final image. The browser traverses the RenderLayer tree and, for each RenderLayer, the RenderObjects that belong to it; the RenderObjects hold the painting information and do the painting. The RenderLayer tree determines the order in which the page is painted, and the RenderObjects belonging to a RenderLayer determine that layer's content. Together they determine what the rendered page looks like.
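To make the relationship between the two trees concrete, here is a small JavaScript sketch that assigns render objects to layers. It checks only a few of the conditions listed above (root, position, transform, opacity, filter, video); `needsOwnLayer`, `assignLayers`, and the object shape are hypothetical and do not mirror WebKit's actual code.

```javascript
// Simplified subset of the layer-creation conditions listed above.
function needsOwnLayer(renderObject) {
  const s = renderObject.style || {};
  return (
    s.position === "relative" ||
    s.position === "absolute" ||
    (s.transform !== undefined && s.transform !== "none") ||
    (s.opacity !== undefined && Number(s.opacity) < 1) ||
    (s.filter !== undefined && s.filter !== "none") ||
    renderObject.name === "video"
  );
}

// Walks the render tree; each object joins its own layer if it needs one,
// otherwise the layer of its nearest ancestor that has one.
function assignLayers(renderObject, currentLayer = null, layers = []) {
  let layer = currentLayer;
  if (currentLayer === null || needsOwnLayer(renderObject)) {
    // currentLayer === null means this is the root, which always gets a layer.
    layer = { owner: renderObject.name, members: [] };
    layers.push(layer);
  }
  layer.members.push(renderObject.name);
  for (const child of renderObject.children || []) {
    assignLayers(child, layer, layers);
  }
  return layers;
}

const renderTree = {
  name: "html",
  children: [
    {
      name: "body",
      children: [
        { name: "header", style: { position: "relative" }, children: [{ name: "h1" }] },
        { name: "p" },
        { name: "banner", style: { opacity: "0.5" } },
      ],
    },
  ],
};

console.log(assignLayers(renderTree));
```

Six render objects end up in three layers: the root layer holds html, body, and p; the header layer holds header and h1; the banner has a layer of its own.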

Rendering methods

After building the RenderLayer tree, the browser uses a graphics library to draw it. The process has two stages:

  • Painting: paint the RenderObjects that belong to each RenderLayer onto that layer. This step is also called paint or rasterization; it turns drawing commands into actual pixel color values.
    • Software painting: the CPU performs the painting.
    • Hardware-accelerated painting: the GPU performs the painting.
  • Compositing: combine the RenderLayers into a single bitmap. This step may also apply translation, scaling, rotation, alpha blending, and so on.

Two terms are used below:

  • Compositing: rendering that uses the compositing technique.
  • Accelerated compositing: compositing carried out by the GPU.

Rendering engines currently render web pages in one of three ways:

  • Software rendering: the CPU paints the content of every RenderLayer (its RenderObjects) into a bitmap held in CPU memory. All layers are painted into this same bitmap, each at its own position, in back-to-front order. Software rendering therefore has no compositing stage.
  • Hardware-accelerated compositing: the GPU paints all compositing layers, and the GPU also performs the compositing.
  • Compositing with software painting: some compositing layers are painted by the CPU and others by the GPU. For a layer painted by the CPU, the result is stored in CPU memory, transferred to GPU memory, and then composited by the GPU.

The last two rendering methods both use compositing, and in both the compositing itself is done by the GPU. For common 2D drawing operations such as text, points, and lines, painting with the GPU does not necessarily perform better than painting with the CPU. The reason is that CPU-side caching effectively reduces the cost of repeated painting, and these operations do not need the parallelism of the GPU. In addition, GPU memory is scarce compared with CPU memory, and splitting a page into layers makes GPU memory usage relatively high. For these reasons it makes sense that all three methods coexist today. Their characteristics are as follows:

  • Software rendering is a widely used technique and the earliest rendering method used by browsers. It saves memory, especially valuable GPU memory, but it can only handle 2D operations. It is well suited to simple pages without complex graphics or multimedia requirements. It has two problems: first, it cannot cope with many of the newer HTML5 technologies; second, it performs poorly for content such as video and Canvas 2D. As a result, software rendering is used less and less, especially on mobile. Another important difference between software rendering and hardware-accelerated rendering is how update regions are handled. When a small area of a page needs updating (an animation, for example), software rendering may only need to repaint that small area, while hardware rendering may need to repaint one or more layers and then composite them. In that case hardware rendering can be much more expensive.
  • With hardware-accelerated compositing, each layer is painted and all layers are composited by the GPU, which is especially suitable for content that needs 3D drawing. With this method the browser has to build further internal representations after the RenderLayer tree to support hardware acceleration, which clearly consumes more memory. On the one hand, hardware acceleration can support all the current 2D and 3D drawing standards defined by HTML5. On the other hand, regarding update regions: if a region of one layer needs updating, software rendering has no backing store per layer, so it must repaint, from back to front, the overlapping region of every layer that intersects it, whereas hardware-accelerated rendering only needs to repaint the layer in which the update occurred. In some cases, then, software rendering is the more expensive one. This depends on the structure of the page and on the rendering strategy.
  • Compositing with software painting combines the advantages of the previous two methods. Many pages contain both basic HTML elements and newer HTML5 features, so some layers are painted by the CPU and others by the GPU. The reasons are the performance and memory considerations described above.

Browsers can also use a multithreaded rendering architecture. Painting web content into the backing store happens on a separate thread (the paint thread), and the original thread becomes the compositor thread. The paint thread and the compositor thread can work synchronously, partially synchronously, or asynchronously, so the browser can trade off performance against visual results as needed.

GraphicsLayer tree

For software rendering, the pipeline ends at the RenderLayer tree; no further tree is built on top of it. For hardware rendering, however, the rendering engine builds additional internal structures after the RenderLayer tree to support the mechanism.

In the hardware-accelerated compositing and compositing-with-software-painting architectures, a RenderLayer that needs a backing store creates a RenderLayerBacking object, which manages all the storage the RenderLayer needs. Ideally every RenderLayer could have its own backing store, but in practice not every RenderLayer has its own RenderLayerBacking object. A RenderLayer that does have its own backing store is called a compositing layer.

Which RenderLayers can be compositing layers? A RenderLayer is a compositing layer if it has one of the following characteristics:

  • Layer has 3D or perspective transform CSS properties
  • Layer is used by <video> element using accelerated video decoding
  • Layer is used by a <canvas> element with a 3D context or accelerated 2D context
  • Layer is used for a composited plugin
  • Layer uses a CSS animation for its opacity or uses an animated webkit transform
  • Layer uses accelerated CSS filters
  • Layer with a composited descendant has information that needs to be in the composited layer tree, such as a clip or reflection
  • Layer has a sibling with a lower z-index which has a compositing layer (in other words the layer is rendered on top of a composited layer)

Each compositing layer has a RenderLayerBacking, which manages all the backing storage the RenderLayer needs; one layer may need several storage areas. In the browser (WebKit), a storage area is represented by the GraphicsLayer class. The browser creates GraphicsLayers for these RenderLayers, and each browser provides its own GraphicsLayer implementation to manage allocating, freeing, and updating the storage. A RenderLayer that has a GraphicsLayer is painted into its own backing store. A RenderLayer without one walks up to its nearest parent or ancestor RenderLayer that has a GraphicsLayer and is painted into that ancestor's storage. In the worst case the walk reaches the root RenderLayer, which always creates a GraphicsLayer and has its own storage. The content of every RenderLayer contained in a compositing layer is painted into that compositing layer's backing store, by either software or hardware. The compositing layers are then combined into the page the user finally sees, which is essentially a single image.

The GraphicsLayers in turn form a tree parallel to the RenderLayer tree, and the relationship between RenderLayer and GraphicsLayer is similar to that between RenderObject and RenderLayer. The full chain of trees is: DOM tree → RenderObject tree → RenderLayer tree → GraphicsLayer tree.

Letting several RenderLayers share one compositing layer has two benefits: it reduces memory consumption, and it reduces the repaint cost and processing complexity that compositing brings. In the hardware-accelerated compositing and compositing-with-software-painting architectures, when the content of a RenderLayer changes, only the cache of the GraphicsLayer it belongs to has to be updated. Only the RenderLayers that belong, directly or indirectly, to that GraphicsLayer are repainted, not all RenderLayers. In particular, changes to certain CSS properties do not change the content at all; they only change some of the GraphicsLayer's blending parameters, after which the layers are blended again. Blending is faster than painting. These CSS properties are generally called accelerated properties. Support varies between browsers, but CSS transform and opacity are accelerated in essentially every browser that supports accelerated compositing. Animations that use accelerated CSS properties reach 60 frames per second more easily.

However, more RenderLayers with their own caches is not always better. Too many of them can have serious side effects:

  • Memory overhead grows considerably. This matters most on mobile devices, and on low-memory devices it can even prevent the browser from supporting accelerated layer compositing at all.
  • Compositing takes longer, so compositing performance drops. Compositing performance is closely tied to how smoothly the page scrolls and zooms, so users end up feeling that these operations are not smooth enough.

Tiled Rendering

The backing store of a compositing layer is usually divided into small, equally sized pieces called tiles. Each tile can be thought of as one OpenGL texture, and the painted result of the compositing layer is stored across these tiles. Why use a tiled backing store?

  • The layer that holds the HTML elements of the DOM tree can be very large, because web pages can be very tall. A single backing store would require one very large texture object, but real GPU hardware may only support quite limited texture sizes.
  • In a large compositing layer, often only a small part changes. With a single backing store the entire layer would have to be repainted, which adds unnecessary overhead. With a tiled backing store, only the tiles that contain updates need to be repainted.
  • When a layer scrolls, some tiles are no longer needed while the rendering engine needs new tiles to paint the newly exposed areas. Because all tiles have the same size, the storage is easy to reuse.
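The second point, repainting only the tiles touched by an update, comes down to a little arithmetic, sketched below in JavaScript. The 256px tile size, the `dirtyTiles` function, and the rectangle shape are assumptions for illustration; real engines choose tile sizes and invalidation strategies differently.

```javascript
const TILE_SIZE = 256; // assumed tile edge length in pixels

// Returns the tiles of a layer that intersect an updated rectangle.
// Assumes integer pixel coordinates and a rectangle with positive size.
function dirtyTiles(rect, layerWidth, layerHeight) {
  const cols = Math.ceil(layerWidth / TILE_SIZE);
  const rows = Math.ceil(layerHeight / TILE_SIZE);

  const firstCol = Math.max(0, Math.floor(rect.x / TILE_SIZE));
  const firstRow = Math.max(0, Math.floor(rect.y / TILE_SIZE));
  // Subtract 1 so a rectangle ending exactly on a tile boundary
  // does not mark the next tile as dirty.
  const lastCol = Math.min(cols - 1, Math.floor((rect.x + rect.width - 1) / TILE_SIZE));
  const lastRow = Math.min(rows - 1, Math.floor((rect.y + rect.height - 1) / TILE_SIZE));

  const tiles = [];
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      tiles.push({ col, row });
    }
  }
  return { total: cols * rows, tiles };
}

// A 1024x4096 layer in which a 100x50 region at (200, 500) changed.
const update = dirtyTiles({ x: 200, y: 500, width: 100, height: 50 }, 1024, 4096);
console.log(`${update.tiles.length} of ${update.total} tiles need repainting`);
```

In this example the layer has 64 tiles, and the updated region straddles a tile boundary in both directions, so 4 tiles are repainted instead of the whole layer.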

High Performance Animations

After a page has loaded, drawing a new frame usually goes through three stages: layout, paint, and composite. To improve page performance (that is, FPS), you need to reduce the time spent on each frame. Of the three stages, layout and paint are the expensive ones, while composite takes comparatively little time.

Layout

If you change a layout style of a DOM element (width, height, and so on), the browser works out which elements on the page need relayout and triggers a relayout. The elements that were laid out again are then painted, and finally the layers are composited to produce the page.

Paint

If you change a paint style of a DOM element (color, background, and so on), the browser skips layout and goes straight to paint and composite.

Composite

If you change a composite style of a DOM element (transform, opacity, and so on), the browser skips both layout and paint and goes straight to composite. This path is the cheapest and is the starting point for optimization.

To find out which of layout, paint, or composite is triggered by changing a given CSS property, see CSS Triggers.
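The rule "a change enters the pipeline at some stage and runs every stage after it" can be written down as a small lookup, sketched here in JavaScript. The `FIRST_STAGE` table contains only the properties named in this section and is an illustrative assumption, as is `stagesFor`; actual behavior differs between browsers and versions.

```javascript
const PIPELINE = ["layout", "paint", "composite"];

// Earliest stage triggered by changing each property
// (illustrative subset based on the examples in the text).
const FIRST_STAGE = {
  width: "layout",
  height: "layout",
  color: "paint",
  background: "paint",
  transform: "composite",
  opacity: "composite",
};

// Returns the stages a frame must run when the given properties change.
function stagesFor(changedProperties) {
  let earliest = PIPELINE.length; // nothing to do unless a property says otherwise
  for (const prop of changedProperties) {
    const stage = FIRST_STAGE[prop];
    if (stage === undefined) throw new Error(`unknown property: ${prop}`);
    earliest = Math.min(earliest, PIPELINE.indexOf(stage));
  }
  // Every stage from the earliest one onward has to run.
  return PIPELINE.slice(earliest);
}

console.log(stagesFor(["width"])); // layout, paint, composite
console.log(stagesFor(["color", "opacity"])); // paint, composite
console.log(stagesFor(["transform"])); // composite only
```

A single layout property in the set is enough to pull the whole frame back to the layout stage, which is why animations are best restricted to composite-only properties.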

Optimization

How can you reduce the time per frame, that is, avoid excessive layout and paint?

  • Reduce layout and paint through appropriate layering of the page. Without layering, the rendering engine may have to repaint every region when an update is requested, because working out exactly which part changed can cost the GPU even more time. Once a page is layered, an update may affect only one or a few layers rather than the whole page. Repainting just those layers and compositing them with the layers that are already painted makes use of the GPU while reducing repaint overhead.
  • Use composite properties (opacity, transform) for transitions and animations. When the compositor composites the layers, each compositing layer can carry its own transformation parameters: translation, scale, rotation, and opacity. Changing them only changes the layer's compositing parameters and needs no layout or paint, which greatly reduces the rendering time of each frame.

To summarize: with GPU hardware acceleration, a GraphicsLayer is created for certain RenderLayers, and a transition or animation is carried out by setting the transform property of the compositing layer, which effectively avoids the cost of relayout and repaint.

Conclusion

This article walked through the rendering process of the browser's rendering engine, covering the DOM tree, CSSOM tree, RenderObject tree, RenderLayer tree, and GraphicsLayer tree. It also briefly introduced the different rendering methods, including the hardware acceleration mechanism, and gave some optimization suggestions. Knowing these things helps us build high-performance web applications.

Reference articles:

  • GPU Accelerated Compositing in Chrome
  • How Browsers Work: Behind the scenes of modern web browsers
  • Constructing the Object Model
  • How Rendering Work (in WebKit and Blink)
  • Inside WebKit technology
  • Rendering performance
  • High Performance Animations