How browsers render

Preface

After reading Brother Bing’s “Browser Working Principles and Practice”, I found that many points were hard to understand, so I read related blog posts and asked more experienced developers for help. I have collected my notes here for later review. Corrections are welcome!

Overview

Browser rendering consists mainly of the following stages:

Building the DOM tree
Style calculation
Layout
Layering
Painting
Tiling
Rasterization
Compositing and display

The article closes with a look at reflow, repaint, and compositing.

As you can imagine, turning a stream of 0s and 1s into the page you finally see is a complex process. The rendering module therefore splits its work into many sub-stages: the rendering engine takes the byte stream received from the network, passes it through each sub-stage in turn, and finally outputs pixels. This process is called the rendering pipeline.

Build a DOM tree

The main job of this process is to transform the HTML content into the browser DOM tree structure

What is the DOM

The rendering engine cannot directly understand the byte stream of an HTML file received from the network, so it converts the stream into an internal structure that it can understand. That structure is the DOM.

DOM trees are almost identical to HTML content, but unlike HTML, DOM is an in-memory tree structure that can be queried or modified using JavaScript. The following diagram shows the differences between HTML and DOM trees

In short, the DOM is the internal data structure that expresses HTML, connects Web pages to JavaScript scripts, and filters out unsafe content

How is the DOM tree generated

Inside the rendering engine there is an HTML parser, whose job is to convert the HTML byte stream into a DOM structure.

Earlier we said that code is sent over the network as a byte stream, so how is that byte stream converted into the DOM? See the figure below.

As you can see from the figure, there are three stages for byte stream conversion to DOM.

Convert to tokens

In the first stage, the tokenizer converts the byte stream into tokens, which are divided into Tag Tokens and Text Tokens.

The tokens generated by lexical analysis of the HTML code above are as follows:

As the figure shows, Tag Tokens are further divided into StartTag and EndTag. In the figure, StartTag and EndTag tokens are the blue and red blocks respectively, and Text Tokens are the green blocks.
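To make the three token types concrete, here is a minimal tokenizer sketch in JavaScript. It is a simplification rather than how a real browser tokenizer is written: the tokenize function and its regular expression are made up for illustration, attributes are dropped, and comments, doctypes, and entities are not handled.

```javascript
// Simplified tokenizer: splits an HTML string into StartTag, EndTag and Text tokens.
function tokenize(html) {
  const tokens = [];
  // Alternatives: 1) end tag, 2) start tag (attributes are skipped), 3) text run.
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ type: "EndTag", name: match[1].toLowerCase() });
    } else if (match[2] !== undefined) {
      tokens.push({ type: "StartTag", name: match[2].toLowerCase() });
    } else if (match[3].trim() !== "") {
      // Whitespace-only text between tags is skipped to keep the output short.
      tokens.push({ type: "Text", value: match[3] });
    }
  }
  return tokens;
}

const tokens = tokenize("<div>1</div>");
console.log(tokens.map((t) => t.type).join(" ")); // StartTag Text EndTag
```

A real tokenizer is a state machine that works on the byte stream incrementally. The regular expression here only serves to show what the output looks like.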

Parse and add

The second and third stages run together: each Token is parsed into a DOM node, and the node is added to the DOM tree.

The HTML parser maintains a Token stack structure that is used to compute parent-child relationships between nodes, and the tokens generated in the first phase are pushed into this stack in sequence. The specific processing rules are as follows:

When the HTML parser starts working, it creates an empty DOM structure rooted at document and pushes a StartTag document Token to the bottom of the stack.
If a StartTag Token is pushed onto the stack, the HTML parser creates a DOM node for it and adds the node to the DOM tree. Its parent is the node generated by the adjacent element below it on the stack.
If the tokenizer produces a Text Token, a text node is created and added to the DOM tree. A Text Token is not pushed onto the stack. Its parent is the DOM node corresponding to the Token at the top of the stack.
If the tokenizer produces an EndTag Token, for example EndTag div, the HTML parser checks whether the element at the top of the Token stack is StartTag div. If so, it pops StartTag div from the stack, which means the div element has been fully parsed.

New tokens from the tokenizer keep being pushed and popped in this way, and parsing continues until the tokenizer has consumed the entire byte stream.

Take a look at the process using the following HTML as an example

<html>
  <body>
    <div>1</div>
    <div>test</div>
  </body>
</html>

When the HTML parser starts working, it creates an empty DOM structure rooted in document by default and pushes a StartTag Document Token to the bottom of the stack. The first StartTag HTML Token parsed by the tokenizer is then pushed onto the stack and an HTML DOM node is created and added to the document, as shown below:

Then the StartTag body and StartTag div are resolved according to the same process, and the status of their Token stack and DOM is shown in the figure below:

Next, the tokenizer produces the first Text Token, whose content is 1. The rendering engine creates a text node for this Token and adds the node to the DOM. Its parent is the node corresponding to the element at the top of the current Token stack, as shown below:

Next, the parser parses the first EndTag div. At this point, the HTML parser determines whether the element at the top of the stack is a StartTag div. If so, it pops the StartTag div from the top of the stack, as shown below:

In accordance with the same rules, parsing all the way, the final result is as follows:
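The walk-through above can be sketched in a few lines of JavaScript. This is only a model under stated assumptions: buildDom and the plain-object node shape are invented for illustration, the stack holds the created nodes rather than the tokens themselves, and the token list for the example HTML is written by hand with whitespace text left out.

```javascript
// Builds a DOM-like tree from tokens using a stack, following the rules described above.
function buildDom(tokenList) {
  const documentNode = { name: "document", children: [] };
  // The document sits at the bottom of the stack; the top of the stack is the current parent.
  const stack = [documentNode];
  for (const token of tokenList) {
    const parent = stack[stack.length - 1];
    if (token.type === "StartTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === "Text") {
      // Text nodes are attached to the current parent but never pushed.
      parent.children.push({ name: "#text", value: token.value });
    } else if (token.type === "EndTag") {
      // Pop only when the EndTag matches the element at the top of the stack.
      if (stack.length > 1 && parent.name === token.name) {
        stack.pop();
      }
    }
  }
  return documentNode;
}

// Tokens for the example HTML above, written by hand.
const exampleTokens = [
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "div" },
  { type: "Text", value: "1" },
  { type: "EndTag", name: "div" },
  { type: "StartTag", name: "div" },
  { type: "Text", value: "test" },
  { type: "EndTag", name: "div" },
  { type: "EndTag", name: "body" },
  { type: "EndTag", name: "html" },
];

const dom = buildDom(exampleTokens);
console.log(JSON.stringify(dom, null, 2));
```

A real parser also recovers from mismatched or missing end tags. This sketch simply ignores an EndTag that does not match the top of the stack.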

Style calculation

The purpose of style calculation is to work out the concrete style of every node in the DOM. It can be roughly divided into three steps:

Formatting style sheets
Standardized style sheets
Compute the specific style of each DOM node

Formatting style sheets

Like HTML, CSS arrives as a stream of 0s and 1s that the browser cannot use directly. When the rendering engine receives CSS text, it converts it into a structure the browser can understand, the styleSheets.

Standardized style sheets

body { font-size: 2em }
p { color: blue; }
span { display: none }
div { font-weight: bold }

The CSS text above contains values such as 2em, blue, and bold. These are not easy for the rendering engine to work with, so every value is converted into a standardized computed value that the engine can use directly:

body { font-size: 32px; }
p { color: rgb(0, 0, 255); }
span { display: none; }
div { font-weight: 700; }
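The conversion can be illustrated with a small JavaScript function. This is a sketch under stated assumptions, not the engine's real logic: standardize, NAMED_COLORS, and FONT_WEIGHTS are hypothetical names, the lookup tables cover only the values used in the example, and em is resolved against an assumed 16px default font size.

```javascript
// Hypothetical lookup tables covering only the values used in the example.
const NAMED_COLORS = { blue: "rgb(0, 0, 255)", red: "rgb(255, 0, 0)" };
const FONT_WEIGHTS = { normal: "400", bold: "700" };

// Converts one declaration value to a standardized form.
// parentFontSizePx is the font size that "em" is relative to (assumed to be 16px by default).
function standardize(property, value, parentFontSizePx = 16) {
  const v = value.trim().toLowerCase();
  if (property === "font-size" && /^[\d.]+em$/.test(v)) {
    return `${parseFloat(v) * parentFontSizePx}px`;
  }
  if (property === "color" && v in NAMED_COLORS) {
    return NAMED_COLORS[v];
  }
  if (property === "font-weight" && v in FONT_WEIGHTS) {
    return FONT_WEIGHTS[v];
  }
  return v; // already in standardized form, for example "none"
}

console.log(standardize("font-size", "2em")); // 32px
console.log(standardize("color", "Blue")); // rgb(0, 0, 255)
console.log(standardize("font-weight", "bold")); // 700
```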

Compute the specific style of each DOM node

There are two main points

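Inheritance: a child node inherits the values of inheritable properties, such as color and font-size, from its parent. When no value is set anywhere up the chain, the browser’s default style, also known as the UserAgent style, is used.
Cascading: cascading is a fundamental feature of CSS. It defines how declarations from multiple sources are merged, for example when a selector such as .box p {} and a plain p {} rule both match the same element.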

After style calculation, the final values for an element can be read from JavaScript through window.getComputedStyle.

In short, style calculation works out the concrete style of every node in the DOM, following the two CSS rules of inheritance and cascading. The output is the style of each DOM node, which is saved in ComputedStyle.
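The interaction of inheritance and cascading can be modelled with a toy function. Everything here is an assumption made for illustration: computeStyle, INHERITED, and UA_DEFAULTS are hypothetical, selectors are plain tag names, and the cascade is reduced to "the later matching rule wins", whereas a real engine also weighs origin and specificity.

```javascript
// Properties that inherit in this sketch (a small subset of the real list).
const INHERITED = new Set(["color", "font-size", "font-weight"]);
// Hypothetical stand-in for the user-agent style sheet.
const UA_DEFAULTS = { color: "rgb(0, 0, 0)", "font-size": "16px", "font-weight": "400", display: "inline" };

// rules: array of { tag, declarations } in source order; later matching rules win.
function computeStyle(node, rules, parentStyle = null) {
  const style = {};
  for (const prop of Object.keys(UA_DEFAULTS)) {
    // Inherited properties start from the parent's computed value, others from the default.
    style[prop] = parentStyle && INHERITED.has(prop) ? parentStyle[prop] : UA_DEFAULTS[prop];
  }
  for (const rule of rules) {
    if (rule.tag === node.tag) Object.assign(style, rule.declarations);
  }
  node.computedStyle = style;
  for (const child of node.children || []) computeStyle(child, rules, style);
  return node;
}

const styleRules = [
  { tag: "body", declarations: { "font-size": "32px", display: "block" } },
  { tag: "p", declarations: { color: "rgb(0, 0, 255)" } },
  { tag: "p", declarations: { color: "rgb(255, 0, 0)" } }, // later rule wins
];
const styledTree = { tag: "body", children: [{ tag: "p", children: [{ tag: "span", children: [] }] }] };
computeStyle(styledTree, styleRules);

const spanStyle = styledTree.children[0].children[0].computedStyle;
console.log(spanStyle.color); // rgb(255, 0, 0), inherited from p
console.log(spanStyle["font-size"]); // 32px, inherited from body through p
console.log(spanStyle.display); // inline, because display is not inherited
```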

Generate layout tree

We now have the DOM tree and the styles of its elements, but that is still not enough to display the page, because we do not yet know the geometry of the DOM elements. The next step is to calculate the geometry of the visible elements in the DOM tree, which means generating a Layout Tree (also called the Render Tree).

Chrome performs two tasks in the Layout phase: creating a Layout Tree and calculating the Layout

Create a Layout Tree

The DOM tree also contains many invisible elements, such as the head tag and elements with display: none. So before displaying the page, the browser builds an additional Layout Tree that contains only visible elements.

The figure below shows how a Layout Tree is constructed:

As you can see from the figure above, all invisible nodes in the DOM tree are not included in the layout tree

To build the layout tree, the browser basically does the following:

Walk through all the visible nodes in the DOM tree and add them to the layout tree.
Ignore invisible nodes. For example, everything under the head tag is left out, and so is the body p span element, because its style contains display: none.
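The two steps above amount to a filtered tree walk, sketched below in JavaScript. The node shape ({ tag, style, children }), the buildLayoutTree function, and the NEVER_RENDERED set are assumptions made for this example. A real engine decides visibility from the computed style of each node.

```javascript
// Tags that never produce a box in this sketch.
const NEVER_RENDERED = new Set(["head", "script", "meta", "title", "style"]);

// Returns a layout tree that contains only the nodes that are displayed.
function buildLayoutTree(domNode) {
  if (NEVER_RENDERED.has(domNode.tag)) return null;
  if (domNode.style && domNode.style.display === "none") return null;
  const children = (domNode.children || [])
    .map((child) => buildLayoutTree(child))
    .filter((child) => child !== null);
  return { tag: domNode.tag, children };
}

const domTree = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title", children: [] }] },
    {
      tag: "body",
      children: [
        { tag: "p", children: [{ tag: "span", style: { display: "none" }, children: [] }] },
        { tag: "div", children: [] },
      ],
    },
  ],
};

const layoutTree = buildLayoutTree(domTree);
console.log(JSON.stringify(layoutTree));
```

In the result, head and the hidden span are gone, while html, body, p, and div remain.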

Layout calculation

The browser calculates the position of each node on the screen based on the Layout Tree, the CSS of each node, and the dependencies between nodes.

Layout in a Web page is relative: a change in the position or size of one element often affects other nodes, so the layout has to be recalculated. This recalculation is generally called Reflow.
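A toy example shows why one change ripples through the layout. It assumes the simplest possible model, block boxes stacked vertically in a container, and the layoutBlocks function and box names are invented for illustration.

```javascript
// Stacks block boxes vertically inside a container and returns their positions.
function layoutBlocks(boxes, containerWidth) {
  let y = 0;
  return boxes.map((box) => {
    const placed = { id: box.id, x: 0, y, width: containerWidth, height: box.height };
    y += box.height;
    return placed;
  });
}

const pageBoxes = [
  { id: "header", height: 50 },
  { id: "main", height: 200 },
  { id: "footer", height: 30 },
];

const before = layoutBlocks(pageBoxes, 800);
pageBoxes[0].height = 80; // change the height of a single box
const after = layoutBlocks(pageBoxes, 800); // every box below it moves

console.log(before[2].y, after[2].y); // 250 280
```

Only the header changed, yet the positions of main and footer had to be recomputed as well.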

Layering

After building the layout tree, the browser still has to handle complex cases such as 3D transforms, page scrolling, and z-axis ordering with z-index. To make these effects easier to implement, the rendering engine generates dedicated layers for specific nodes and builds a corresponding Layer Tree.

The final page is produced by stacking these layers on top of one another in a defined order.

Look at the relationship between the layer tree and the layout tree

Each of these layers is called a PaintLayer.

Not every node in the layout tree has its own layer. A node without a layer belongs to the layer of its parent. For example, the span tags in the figure above have no layer of their own, so they belong to their parent’s layer. Either way, every node ends up belonging to a layer, directly or indirectly.

So what criteria does the rendering engine need to meet to create a new layer for a particular node?

Rendering layer

This is the first layer model built during browser rendering. Render objects in the same coordinate space (z-axis space) are merged into the same rendering layer. Following the stacking context, render objects in different coordinate spaces therefore form multiple rendering layers that express their stacking order.

The browser automatically creates a new rendering layer in two situations: when a node forms a stacking context, and when content needs to be clipped.

Forms a stacking context

The root document
Has an explicit positioning property (relative, fixed, sticky, absolute)
opacity is less than 1
An animation is currently applied to opacity, transform, filter, or backdrop-filter
overflow is not visible
Has a CSS transform property whose value is not none
Has a CSS filter property
Has a CSS mask property
Has a CSS mix-blend-mode property whose value is not normal
backface-visibility is hidden
Has a CSS reflection property
Has a CSS column-count property whose value is not auto, or a CSS column-width property whose value is not auto

Where clipping is needed

Suppose a div is small, say 50 by 50 pixels, and contains a lot of text. The text that overflows has to be clipped. If a scroll bar appears, it is promoted to its own layer as well, so there are three layers in total: the div, the text, and the scroll bar. See the figure below.
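As a rough illustration, the conditions listed above can be written as a predicate over a style object. This is a sketch and not the browser's real decision logic: needsPaintLayer is a hypothetical function, the style object uses camelCase keys chosen for this example, and only part of the list is checked (animations, reflections, and multi-column properties are left out).

```javascript
const POSITIONED = ["relative", "absolute", "fixed", "sticky"];

// Returns true when the style meets one of the (partial) conditions for a new rendering layer.
function needsPaintLayer(style, isRootDocument = false) {
  if (isRootDocument) return true;
  if (POSITIONED.includes(style.position)) return true;
  if (style.opacity !== undefined && Number(style.opacity) < 1) return true;
  // overflow other than visible also covers the clipping case described above.
  if (style.overflow !== undefined && style.overflow !== "visible") return true;
  for (const prop of ["transform", "filter", "mask"]) {
    if (style[prop] !== undefined && style[prop] !== "none") return true;
  }
  if (style.mixBlendMode !== undefined && style.mixBlendMode !== "normal") return true;
  if (style.backfaceVisibility === "hidden") return true;
  return false;
}

console.log(needsPaintLayer({ position: "static" })); // false
console.log(needsPaintLayer({ opacity: "0.5" })); // true
console.log(needsPaintLayer({ overflow: "hidden" })); // true
```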

Paint

After the rendering layers are built, the rendering engine paints each layer, which is essentially a process of filling in pixels. The same stage runs again when part of the screen has to be redrawn, either because of a reflow or because of a CSS change that does not affect layout. Redrawing in this way is called Repaint.

The rendering engine breaks the painting of a layer into many small drawing instructions and arranges them, in order, into a draw list for that layer.

As the figure shows, the instructions in a draw list are very simple. Each one performs a basic operation, such as drawing a pink rectangle or a black line. Painting one element usually takes several instructions, because its background, foreground, and border are each drawn separately. The output of the paint stage is therefore one draw list per layer.
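A draw list can be pictured as a plain array of instructions. The sketch below is only a model: buildDrawList, the instruction names (drawRect, drawBorder, drawText), and the element shape are all invented for illustration and do not correspond to a real browser API.

```javascript
// Produces a draw list for one layer: an ordered array of simple drawing instructions.
function buildDrawList(elements) {
  const drawList = [];
  for (const el of elements) {
    // Background, border and text each get their own instruction, in that order.
    if (el.background) drawList.push({ op: "drawRect", rect: el.rect, color: el.background });
    if (el.border) drawList.push({ op: "drawBorder", rect: el.rect, color: el.border });
    if (el.text) drawList.push({ op: "drawText", rect: el.rect, text: el.text });
  }
  return drawList;
}

const layerElements = [
  { rect: [0, 0, 100, 50], background: "pink", border: "black", text: "hello" },
  { rect: [0, 50, 100, 50], background: "white" },
];

const drawList = buildDrawList(layerElements);
console.log(drawList.map((item) => item.op).join(", "));
// drawRect, drawBorder, drawText, drawRect
```

The list only records what to draw and in which order. Nothing is turned into pixels at this point.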

When the Paint stage completes, the Composite stage begins.

A draw list only records the drawing order and the drawing instructions. The actual drawing is carried out by the compositing thread in the rendering engine. The following figure shows the relationship between the main rendering thread and the compositing thread:

As shown above, once the draw list of a layer is ready, the main thread submits it to the compositing thread.

Tiling

A page can be very large, but the user only sees part of it at a time. The part the user can see is called the viewport.

Some layers are very large. On a long page, for example, it can take a lot of scrolling to reach the bottom, yet the viewport only ever shows a small portion. Painting the entire layer in such cases would be expensive and unnecessary.

For this reason, the compositing thread divides each layer into tiles, usually 256×256 or 512×512 pixels, and generates bitmaps for the tiles closest to the viewport first. The actual bitmap generation is done by rasterization.
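The idea of tiling and prioritizing can be sketched as follows. The functions prioritizeTiles and distanceToViewport are hypothetical, only vertical scrolling is considered, and the real scheduler in a browser takes more factors into account than distance alone.

```javascript
// Vertical distance in pixels between a tile and the viewport; 0 when they overlap or touch.
function distanceToViewport(tileTop, tileSize, viewport) {
  const tileBottom = tileTop + tileSize;
  const viewportBottom = viewport.top + viewport.height;
  if (tileBottom <= viewport.top) return viewport.top - tileBottom;
  if (tileTop >= viewportBottom) return tileTop - viewportBottom;
  return 0;
}

// Splits a layer into fixed-size tiles and sorts them so that tiles nearest the viewport come first.
function prioritizeTiles(layerWidth, layerHeight, viewport, tileSize = 256) {
  const tiles = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({ x, y, distance: distanceToViewport(y, tileSize, viewport) });
    }
  }
  return tiles.sort((a, b) => a.distance - b.distance);
}

// A 512x2048 layer with the viewport scrolled to y = 1000.
const orderedTiles = prioritizeTiles(512, 2048, { top: 1000, height: 600 });
console.log(orderedTiles.length); // 16
console.log(orderedTiles[0].distance); // 0
console.log(orderedTiles[orderedTiles.length - 1].y); // 0
```

The tiles that intersect the viewport are rasterized first, and the tile row at the very top of the layer, which is furthest away, comes last.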

Rasterization

Rasterization means converting tiles into bitmaps. The tile is the smallest unit of rasterization. The renderer process maintains a rasterization thread pool, in which all tiles are rasterized.

When a rendering layer meets certain special conditions, it is promoted to a compositing layer, and compositing layers are rasterized on the GPU. Generating bitmaps with the GPU is called fast rasterization, or GPU rasterization, and the resulting bitmaps are stored in GPU memory.

GPU operations run in the GPU process, so when rasterization uses the GPU, the final bitmaps are generated there, which involves cross-process communication. The following figure shows the details:

As the figure shows, the renderer process sends the instructions for generating tiles to the GPU process, which generates the tile bitmaps (CompositingLayer) and stores them in GPU memory.

CompositingLayer

Rendering layers that meet certain special conditions are automatically promoted to compositing layers by the browser. A compositing layer has its own GraphicsLayer, while a rendering layer that is not a compositing layer shares a GraphicsLayer with its nearest ancestor layer that has one. Note that document is a rendering layer, not a compositing layer.

So what special conditions does a rendering layer meet to be promoted to a compositing layer? Here are some common ones:

3D transforms: translate3d, translateZ, and so on
video, canvas, iframe, and similar elements
An opacity animation implemented with Element.animate()
An opacity animation implemented with a CSS animation
position: fixed
Has the will-change property
An animation or transition is applied to opacity, transform, filter, or backdrop-filter (the animation or transition must be currently running; the compositing layer does not exist before the effect starts and is dropped after it ends)
will-change is set to opacity, transform, top, left, bottom, or right (top, left, and so on only count when the element also has an explicit positioning property such as relative)

Advantages and disadvantages of compositing layers

Advantages

The bitmap of a compositing layer is composited by the GPU, which is much faster than doing the work on the CPU
When a repaint is needed, only the layer itself is repainted and other layers are not affected
Once an element is promoted to a compositing layer, changes to transform and opacity no longer trigger a repaint; on an element that is not a compositing layer they still do

Disadvantages

Painted layers must be transferred to the GPU. When the layers are numerous or large, the transfer can be very slow, which causes flickering on some low-end and mid-range devices
Implicit compositing easily produces too many compositing layers. Each one takes up extra memory, which is a scarce resource on mobile devices, so too many layers can crash the browser and make the optimization counterproductive. This is known as layer explosion.

GraphicsLayer

A GraphicsLayer is a layer model that produces the content graphics that are finally ready to be presented. It owns a GraphicsContext, and the GraphicsContext outputs the bitmap for that layer. Bitmaps stored in shared memory are uploaded to the GPU as textures. Finally, the GPU composites the bitmaps and draws the result on the screen, and at that point the page is displayed. This compositing step does not trigger repaint or reflow.

So the GraphicsLayer is an important rendering carrier and tool, but it does not deal with rendering layers directly. It deals with compositing layers.

Compositing and display

When the compositing thread of the renderer process receives the draw lists for the layers, it hands the tiles to the rasterization thread pool, which submits them to the GPU process to be rasterized. Once all tiles are rasterized, the result is returned to the compositing thread, which then composites the layers. When compositing is done, a drawing command called DrawQuad is generated and submitted to the browser process.

The browser process has a component called viz that receives the DrawQuad command from the compositing thread. Based on this command, the browser process runs the display compositor, which combines all the layers into the page content, draws it into memory, and finally sends that memory to the graphics card. After this series of stages, the HTML, CSS, and JavaScript you wrote is displayed by the browser as a finished page.

How the display shows an image:

When the rendering pipeline produces a frame through the graphics card, the frame is stored in the graphics card’s back buffer. Once the graphics card has finished writing the composited image to the back buffer, the system swaps the back buffer and the front buffer, and the display reads the latest image from the front buffer. A typical display refreshes at 60 Hz, that is, 60 frames per second, so the rendering pipeline has about 16.67 ms (1000 ms / 60) to produce each frame. If a frame takes longer than that, the user perceives stutter.
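The frame budget arithmetic can be written out as a tiny helper. The function names frameBudgetMs and refreshIntervalsUsed are made up for this example, and the model assumes a frame is only shown at the next refresh after it is ready.

```javascript
// Time available to produce one frame at a given refresh rate, in milliseconds.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Number of refresh intervals one frame occupies; any value above 1 means refreshes were missed.
function refreshIntervalsUsed(renderTimeMs, refreshRateHz = 60) {
  return Math.max(1, Math.ceil(renderTimeMs / frameBudgetMs(refreshRateHz)));
}

console.log(frameBudgetMs(60).toFixed(2)); // 16.67
console.log(refreshIntervalsUsed(10)); // 1, the frame fits in the budget
console.log(refreshIntervalsUsed(20)); // 2, one refresh shows the old frame again
```

A frame that takes 20 ms at 60 Hz misses one refresh, so the previous image stays on screen for two intervals. This is what the user sees as stutter.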

Summary

A complete rendering process can be summarized as follows:

The renderer process converts the HTML byte stream into a DOM tree
The rendering engine converts the CSS text into styleSheets that the browser can understand and computes the style of each DOM node
Create a Layout Tree and calculate the layout information of its elements
Layer the Layout Tree and create a Layer Tree
Generate a draw list for each rendering layer and submit it to the compositing thread
The compositing thread divides each layer into tiles and converts the tiles into bitmaps in the rasterization thread pool, using GPU acceleration
The compositing thread sends the DrawQuad command to the browser process
The browser process generates the page from the DrawQuad command and displays it on the monitor

Reflow, repaint, and compositing

Reflow

Reflow is also called relayout or rearrangement.

When the size, structure, or certain properties of some or all elements in the Layout Tree change, the browser re-renders part or all of the document.

The following actions trigger reflow:

The geometry of a DOM element changes. Common geometric properties include width, height, padding, margin, left, top, and border
DOM nodes are added, removed, or moved
offset, scroll, and client properties are read or written, because the browser has to perform a reflow to obtain up-to-date values
The window.getComputedStyle method is called

Some common properties and methods that cause reflow:

clientWidth, clientHeight, clientTop, clientLeft
offsetWidth, offsetHeight, offsetTop, offsetLeft
scrollWidth, scrollHeight, scrollTop, scrollLeft
scrollIntoView(), scrollIntoViewIfNeeded()
getComputedStyle()
getBoundingClientRect()
scrollTo()

Following the rendering pipeline above, when reflow is triggered and the DOM structure has changed, the DOM tree is rebuilt and every later stage of the pipeline runs again (including the work done off the main thread).
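Why reading geometry properties matters can be shown with a simulation. FakeElement below is a toy model and not a real DOM element: it assumes that a write only marks the layout as dirty, and that reading offsetWidth forces a layout whenever the layout is dirty. The class and the two helper functions are invented for illustration.

```javascript
// Toy model of an element: writes mark layout as dirty, geometry reads force a layout if dirty.
class FakeElement {
  constructor() {
    this.width = 100;
    this.layoutDirty = false;
    this.forcedLayouts = 0;
  }
  setWidth(value) {
    this.width = value;
    this.layoutDirty = true; // the actual layout is deferred
  }
  get offsetWidth() {
    if (this.layoutDirty) {
      this.forcedLayouts += 1; // the read needs fresh geometry, so layout runs now
      this.layoutDirty = false;
    }
    return this.width;
  }
}

// Interleaved: every read that follows a write forces a layout.
function growInterleaved(el, times) {
  for (let i = 0; i < times; i++) el.setWidth(el.offsetWidth + 10);
  return el.forcedLayouts;
}

// Batched: read once, then perform all the writes.
function growBatched(el, times) {
  const start = el.offsetWidth;
  for (let i = 1; i <= times; i++) el.setWidth(start + 10 * i);
  return el.forcedLayouts;
}

console.log(growInterleaved(new FakeElement(), 5)); // 4
console.log(growBatched(new FakeElement(), 5)); // 0
```

Both versions end with the same width, but the interleaved loop forces a layout on every iteration after the first, while the batched version forces none and leaves a single pending layout.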

Repaint

When a style change on an element does not affect its position in the document flow (for example color, background-color, or visibility), the browser applies the new style to the element and paints it again. This process is called repaint (redraw).

Because the geometry of the DOM has not changed, the position information of the element does not need to be updated, so the layout stage is skipped, as shown below:

As the figure shows, changing the background color of an element does not run the layout stage, because no geometry has changed. The pipeline goes straight to the paint stage and then runs the stages that follow.

Repaint skips layout and layering, so it is more efficient than reflow.

Compositing

In another case, if you change a property that requires neither layout nor painting, the rendering engine skips both stages and only performs the compositing operations that follow. This process is called compositing.

The content of the layer does not change. Changes to text, layout, or color do change the layer and therefore involve reflow or repaint
The compositing thread applies geometric transforms, opacity changes, shadows, and similar effects to the layer as a whole. For example, scrolling a page without changing its content only moves the layer up and down, which can be done entirely on the compositing thread

In general, the longer the path through the rendering pipeline, the longer it takes to produce a frame. This is why compositing is cheaper than repaint and reflow, and repaint is cheaper than reflow.

For example, animating with the CSS transform property avoids reflow and repaint, and the animation runs directly on a non-main thread as a compositing operation. This is more efficient for two reasons: the work happens off the main thread, so it does not consume main-thread resources, and the layout and paint sub-stages are skipped entirely.
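The three paths can be summarized in a small lookup. This is a deliberately coarse sketch: the property sets are hypothetical and incomplete, "composite" stands for tiling, rasterization, and compositing together, and the composite-only path assumes the element already sits on its own compositing layer.

```javascript
// Stages that run after style calculation for each kind of change.
const STAGES_BY_KIND = {
  reflow: ["layout", "layer", "paint", "composite"],
  repaint: ["paint", "composite"],
  composite: ["composite"],
};

// Hypothetical, incomplete property tables for illustration only.
const REPAINT_PROPS = new Set(["color", "background-color", "visibility"]);
const COMPOSITE_PROPS = new Set(["transform", "opacity"]);

function classifyChange(property) {
  if (COMPOSITE_PROPS.has(property)) return "composite";
  if (REPAINT_PROPS.has(property)) return "repaint";
  return "reflow"; // geometric or unknown properties: assume the full path
}

function stagesFor(property) {
  return STAGES_BY_KIND[classifyChange(property)];
}

console.log(stagesFor("width").join(" > ")); // layout > layer > paint > composite
console.log(stagesFor("color").join(" > ")); // paint > composite
console.log(stagesFor("transform").join(" > ")); // composite
```

The shorter the list returned for a property, the less work a change to that property costs per frame.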

Closing

That covers the principles of browser rendering, along with reflow, repaint, and compositing. There are many more interesting topics, such as layer explosion, layer squashing, and whether CSS and JS block rendering, but they are left out of this article for reasons of space.

References

Brother Bing’s “Browser Working Principles and Practice”

Reflow & Repaint for browsers

Browser layer composition and page rendering optimization

CSS3 hardware acceleration also has a pit

Wireless performance optimization: Composite

CSS GPU Animation

More on layer synthesis

How Browsers Work: Behind the scenes of modern web browsers

The most complete ever! Illustrate how a browser works