Preface
The browser core (also called the browser engine or kernel) is the essential program that makes a browser run. It has two parts: the rendering engine and the JS engine. Rendering engines differ between browsers. There are currently four common browser cores: Trident (IE), Gecko (Firefox), Blink (Chrome, Opera), and WebKit (Safari). WebKit is probably the most familiar of these, and Blink itself began as a fork of WebKit. In this article, we take WebKit as an example and analyze the rendering process of modern browsers in depth.
Page loading process
Before introducing the browser rendering process, let’s briefly describe how a page is loaded, which will help in understanding the rendering steps that follow.
The main points are as follows:
- The browser obtains the IP address of the domain name from the DNS server
- The browser sends an HTTP request to the machine at that IP address
- The server receives the request, processes it, and returns an HTTP response
- The browser receives the response
For example, when you enter https://juejin.im/timeline in the browser, a DNS lookup resolves juejin.im to an IP address such as 36.248.217.149 (the IP may differ depending on when and where the lookup is made). The browser then sends an HTTP request to that IP.
The server receives the HTTP request, does its computation (for example, serving different content to different users), and returns an HTTP response.
The response body is just a long string of HTML, the format that browsers are required by the W3C standards to parse and display. Next comes the browser’s rendering process.
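To make the request/response exchange above concrete, here is a minimal sketch of what the text of a plain HTTP/1.1 GET request and response looks like. The helper names buildGetRequest and parseResponse are hypothetical, the header set and the sample response are assumptions, and a real https request would additionally be wrapped in TLS.

```javascript
// Build the text of a minimal HTTP/1.1 GET request for a URL.
// Hypothetical helper for illustration; real browsers send many more headers.
function buildGetRequest(urlString) {
  const url = new URL(urlString);
  return [
    `GET ${url.pathname}${url.search} HTTP/1.1`,
    `Host: ${url.host}`,
    'Accept: text/html',
    'Connection: close',
    '', // the blank line ends the header section
    '',
  ].join('\r\n');
}

// Split a raw HTTP response into status code, headers, and body.
function parseResponse(raw) {
  const separator = raw.indexOf('\r\n\r\n');
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);
  const [statusLine, ...headerLines] = head.split('\r\n');
  const status = Number(statusLine.split(' ')[1]);
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon > 0) {
      headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
    }
  }
  return { status, headers, body };
}

const request = buildGetRequest('https://juejin.im/timeline');
// A made-up response, standing in for what the server might return.
const response = parseResponse(
  'HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<html><body>Hello</body></html>'
);
console.log(request.split('\r\n')[0]); // GET /timeline HTTP/1.1
console.log(response.status, response.body);
```

The body returned by parseResponse is the HTML string that the rendering process starts from.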
Browser rendering process
The browser rendering process can be divided into three parts:
1) The browser parses three things:
- The first is HTML/SVG/XHTML. The HTML string describes the structure of the page, and the browser parses it into a DOM tree.
- The second is CSS. Parsing CSS produces a CSS rule tree, whose structure resembles that of the DOM.
- The third is JavaScript. Once a script has loaded, it can manipulate the DOM tree and the CSS rule tree through the DOM API and the CSSOM API.
2) After parsing, the browser engine builds the rendering tree from the DOM tree and the CSS rule tree.
- The rendering tree is not the same as the DOM tree: it contains only the nodes that need to be displayed, together with their style information.
- The CSS rule tree is used to match CSS rules and attach them to each element (that is, each frame) of the rendering tree.
- The position of each frame is then calculated. This step is called layout, or reflow.
3) Finally, the page is painted by calling the operating system’s native GUI API.
Next, we will elaborate on the important steps in this process.
Build the DOM
The browser follows a set of steps to convert an HTML file into a DOM tree. Broadly, they are:
- The browser reads the raw bytes of the HTML from disk or the network and converts them to a string based on the file’s specified encoding, such as UTF-8.
Everything that travels over the network is zeros and ones. When the browser receives these bytes, it decodes them into a string, which is the code we wrote.
- Convert the string into tokens, for example <html>, <body>, and so on. Each token is identified as a “start tag”, an “end tag”, or “text”.
At this point you may wonder how the relationships between nodes are maintained.
That is exactly what the “start tag” and “end tag” tokens are for. For example, a “Hello” text token that sits between the “title” start tag and the “title” end tag is a child node of title. Similarly, the title token lies between the start tag and end tag of head, so title is a child node of head.
- Generate node objects and build the DOM
In practice, DOM construction does not wait until all tokens have been produced before creating node objects. Tokens are consumed as they are generated: as soon as a token is produced, it is used to create a node object. Note: tokens marked as end tags do not create node objects.
Let’s take an example. Suppose we have some HTML text:
<html>
<head>
<title>Web page parsing</title>
</head>
<body>
<div>
<h1>Web page parsing</h1>
<p>This is an example Web page.</p>
</div>
</body>
</html>
The above HTML will parse like this:
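A minimal sketch can imitate this token-by-token construction with a generator and a stack. It is an illustration under simplifying assumptions, not how WebKit is implemented: tokenize, buildTree, and outline are hypothetical helpers, and attributes, comments, void elements, and error recovery are all ignored.

```javascript
// Lazily yield start-tag, end-tag, and text tokens from an HTML string.
function* tokenize(html) {
  const pattern = /<\/([a-zA-Z0-9]+)>|<([a-zA-Z0-9]+)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) yield { type: 'endTag', name: match[1] };
    else if (match[2]) yield { type: 'startTag', name: match[2] };
    else if (match[3].trim()) yield { type: 'text', value: match[3].trim() };
  }
}

// Consume each token as soon as it is produced and grow the tree.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const stack = [root]; // open elements; the top of the stack is the current parent
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === 'endTag') {
      stack.pop(); // end tags create no node; they close the current one
    } else {
      parent.children.push({ name: '#text', value: token.value, children: [] });
    }
  }
  return root;
}

// Render the tree as indented lines, one node per line.
function outline(node, depth = 0) {
  const label = node.name === '#text' ? `"${node.value}"` : node.name;
  const lines = ['  '.repeat(depth) + label];
  for (const child of node.children) lines.push(...outline(child, depth + 1));
  return lines;
}

const html = `<html>
<head>
<title>Web page parsing</title>
</head>
<body>
<div>
<h1>Web page parsing</h1>
<p>This is an example Web page.</p>
</div>
</body>
</html>`;
const tree = buildTree(tokenize(html));
console.log(outline(tree).join('\n'));
```

For the example document, the printed outline nests html under #document, head and body under html, title under head, and div (with its h1 and p children) under body, with each piece of text as a leaf. Because tokenize is a generator, buildTree pulls one token at a time, which mirrors the point that nodes are created while tokens are still being produced.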
Build CSSOM
The DOM captures the content of the page, but the browser also needs to know how the page should be presented, so it also builds the CSSOM.
Building the CSSOM is very similar to building the DOM. When the browser receives a piece of CSS, it first identifies the tokens, then builds nodes and generates the CSSOM.
Note: matching CSS rules to HTML elements is fairly complex and costly. Keep the DOM tree small, prefer ID and class selectors, and avoid selectors that are nested many levels deep.
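To see why deeply nested selectors are costly, here is a minimal sketch of how a descendant selector can be checked against one element. Browsers generally match selectors from right to left: the rightmost part is tested against the element itself, and each remaining part requires a walk up the ancestor chain. The node shape ({ tag, id, classes, parent }) and the helper names are assumptions made for this illustration, and only tag, .class, and #id parts separated by spaces are supported.

```javascript
// Test one simple selector part (tag, .class, or #id) against a node.
function matchesSimple(node, part) {
  if (part.startsWith('#')) return node.id === part.slice(1);
  if (part.startsWith('.')) return node.classes.includes(part.slice(1));
  return node.tag === part;
}

// Match a descendant selector such as ".nav ul li" from right to left.
function matches(node, selector) {
  const parts = selector.trim().split(/\s+/);
  // The rightmost part must match the node itself.
  if (!matchesSimple(node, parts[parts.length - 1])) return false;
  let ancestor = node.parent;
  // Every remaining part must match some ancestor, moving upwards.
  for (let i = parts.length - 2; i >= 0; i--) {
    while (ancestor && !matchesSimple(ancestor, parts[i])) ancestor = ancestor.parent;
    if (!ancestor) return false;
    ancestor = ancestor.parent;
  }
  return true;
}

// A tiny hand-built tree: body > div.nav > ul > li.item
const body = { tag: 'body', id: '', classes: [], parent: null };
const nav = { tag: 'div', id: '', classes: ['nav'], parent: body };
const ul = { tag: 'ul', id: '', classes: [], parent: nav };
const li = { tag: 'li', id: '', classes: ['item'], parent: ul };

console.log(matches(li, '.nav li'));      // true
console.log(matches(li, 'div ul .item')); // true
console.log(matches(li, '#main li'));     // false
```

Each extra level in the selector adds another walk up the tree for every candidate element, which is the reason for keeping selectors shallow.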
Build the render tree
Once we have generated the DOM tree and the CSSOM tree, we need to combine the two trees into a render tree.
This is not as simple as merging the two. The render tree contains only the nodes that need to be displayed and their style information. A node with display: none does not appear in the render tree.
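As a minimal sketch of this step, the function below walks a DOM-like tree, looks up each node’s computed style, and leaves out any node whose display is none, together with its whole subtree. The node shape, the styleFor lookup, and the className-based rule are assumptions made for this illustration.

```javascript
// Combine a DOM-like tree with a style lookup into a render tree.
// Nodes with display: none are skipped together with their descendants.
function buildRenderTree(domNode, styleFor) {
  const style = styleFor(domNode);
  if (style.display === 'none') return null;
  const children = [];
  for (const child of domNode.children) {
    const rendered = buildRenderTree(child, styleFor);
    if (rendered !== null) children.push(rendered);
  }
  return { name: domNode.name, style, children };
}

const dom = {
  name: 'body',
  children: [
    { name: 'h1', children: [] },
    { name: 'div', className: 'hidden', children: [{ name: 'span', children: [] }] },
    { name: 'p', children: [] },
  ],
};

// Stand-in for the CSSOM: elements with class "hidden" get display: none.
const styleFor = (node) =>
  node.className === 'hidden' ? { display: 'none' } : { display: 'block' };

const renderTree = buildRenderTree(dom, styleFor);
console.log(renderTree.children.map((child) => child.name)); // [ 'h1', 'p' ]
```

The hidden div and its span are still present in the DOM, but neither appears in the render tree.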
We may have a question: what will browsers do if they encounter JS files during rendering?
During rendering, if the browser encounters a <script> tag, it pauses DOM construction, loads and executes the script, and only then resumes parsing.
That is why, if you want the first screen to render as quickly as possible, you should not load JS files ahead of the first-screen content, and why it is recommended to place the script tag at the bottom of the body tag. Nowadays it is not strictly necessary to put the script tag at the bottom, because you can add the defer or async attribute to it (the difference between the two is described below).
A JS file doesn’t just block DOM construction; it can also cause CSSOM construction to block DOM construction.
Normally the DOM and CSSOM are built independently of each other, but once JavaScript is involved, CSSOM construction also blocks DOM construction: DOM construction resumes only after the CSSOM is complete.
What’s going on here?
This is because JavaScript can change not only the DOM but also styles, which means it can change the CSSOM. An incomplete CSSOM cannot be used, so a script that wants to read or modify the CSSOM must be given the complete CSSOM. As a result, if the browser has not finished downloading and building the CSSOM when a script is ready to run, it delays script execution and DOM construction until the CSSOM is done. In other words, the browser first downloads and builds the CSSOM, then executes the JavaScript, and then continues building the DOM.
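The dependency chain can be worked through with a toy timeline. This is a simplified model with made-up numbers, not a measurement: blockingScriptTimeline is a hypothetical helper, all times are milliseconds from the start of navigation, and the script is assumed to be a plain script tag without defer or async.

```javascript
// Model when a parser-blocking script can run and when parsing resumes.
function blockingScriptTimeline({ parserReachesScriptAt, scriptLoadedAt, cssomReadyAt, scriptRunTime }) {
  // The script cannot start before the parser reaches it, before it has
  // downloaded, or before the CSSOM is complete.
  const scriptStartsAt = Math.max(parserReachesScriptAt, scriptLoadedAt, cssomReadyAt);
  // DOM construction resumes only after the script has finished.
  const parserResumesAt = scriptStartsAt + scriptRunTime;
  return { scriptStartsAt, parserResumesAt };
}

// Slow stylesheet: the CSSOM is the bottleneck.
const slowCss = blockingScriptTimeline({
  parserReachesScriptAt: 100,
  scriptLoadedAt: 150,
  cssomReadyAt: 400,
  scriptRunTime: 20,
});

// Fast stylesheet: the script download is the bottleneck.
const fastCss = blockingScriptTimeline({
  parserReachesScriptAt: 100,
  scriptLoadedAt: 150,
  cssomReadyAt: 120,
  scriptRunTime: 20,
});

console.log(slowCss); // { scriptStartsAt: 400, parserResumesAt: 420 }
console.log(fastCss); // { scriptStartsAt: 150, parserResumesAt: 170 }
```

In the first case the script has been downloaded since 150 ms but sits idle until the CSSOM is ready at 400 ms, and the DOM stays blocked for the whole wait.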
Layout and painting
Once the browser has generated the render tree, it performs layout (also known as reflow) based on that tree. At this stage the browser works out the exact position and size of every node on the page. This behavior is often simply called “reflow.”
The output of the layout process is a “box model” that precisely captures the position and size of each element within the viewport; all relative measurements are converted to absolute pixels on the screen.
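As a small worked example of converting relative measurements to pixels, the sketch below resolves percentage widths against the width of the containing box, starting from the viewport. Only width is handled; resolveWidth and layout are hypothetical helpers, and real layout also computes heights, positions, margins, and much more.

```javascript
// Resolve a width value against the width of the containing box.
function resolveWidth(value, containerWidth) {
  if (value === undefined) return containerWidth; // block boxes fill their container by default
  if (typeof value === 'string' && value.endsWith('%')) {
    return (containerWidth * parseFloat(value)) / 100;
  }
  return value; // already a pixel number
}

// Walk the tree top-down, turning every relative width into pixels.
function layout(box, containerWidth) {
  const width = resolveWidth(box.width, containerWidth);
  const children = (box.children || []).map((child) => layout(child, width));
  return { name: box.name, width, children };
}

const page = {
  name: 'body',
  children: [
    { name: 'div', width: '50%', children: [{ name: 'p', width: '50%' }] },
  ],
};

const result = layout(page, 1000); // assume a viewport 1000 pixels wide
console.log(result.width);                         // 1000
console.log(result.children[0].width);             // 500
console.log(result.children[0].children[0].width); // 250
```

The inner paragraph ends up 250 pixels wide because its 50% is taken from the 500-pixel div, not from the viewport. Its size depends on its ancestors, which is why a change high in the tree can force many descendants to be laid out again.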
Immediately after the layout is complete, the browser issues “Paint Setup” and “Paint” events that convert the render tree into pixels on the screen.
Now that we’ve covered the important steps in the browser workflow in detail, let’s discuss some related issues:
A few additional notes
1. What do async and defer do, and how do they differ?
Let’s compare the defer and async attributes:
In the timing diagram for each case, the blue line represents JavaScript loading, the red line JavaScript execution, and the green line HTML parsing.
1) Case 1: <script src="script.js"></script>
Without defer or async, the browser loads and executes the script as soon as it reaches the tag. Parsing stops in the meantime, so the document elements after the script tag are not read until the script has finished executing.
2) Case 2: <script async src="script.js"></script> (asynchronous download)
The async attribute makes the script load asynchronously, in parallel with HTML parsing. Unlike defer, an async script executes as soon as it has finished loading, whether that happens while the HTML is still being parsed or after DOMContentLoaded has fired. Note that scripts loaded this way still block the load event. In other words, an async script may execute before or after DOMContentLoaded fires, but always executes before load fires.
3) Case 3: <script defer src="script.js"></script> (deferred execution)
The defer attribute delays execution of the script: HTML parsing does not stop while the script loads, and the two proceed in parallel. Once the whole document has been parsed and the deferred scripts have loaded (these two events can happen in either order), the deferred scripts are executed, and then the DOMContentLoaded event fires.
defer differs from a regular script in two ways: loading the JavaScript file does not block HTML parsing, and execution is postponed until the HTML has been parsed. When there are multiple scripts, async scripts execute in whatever order they finish loading, while defer scripts execute in the order in which they appear in the document.
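The ordering rules can be checked with a small simulation. It is a simplified model rather than a browser implementation: executionOrder is a hypothetical helper, the load times are made up, and execution itself is assumed to take no time.

```javascript
// Given scripts in document order, return their names in execution order.
// Each script is { name, mode: 'async' | 'defer', loadedAt }.
function executionOrder(scripts, parseDoneAt) {
  const runs = [];
  let earliestDeferRun = parseDoneAt; // defer scripts never run before parsing ends
  for (const script of scripts) {
    if (script.mode === 'async') {
      // async: runs as soon as its own download finishes
      runs.push({ name: script.name, at: script.loadedAt });
    } else {
      // defer: waits for parsing, for its own download, and for earlier defer scripts
      earliestDeferRun = Math.max(earliestDeferRun, script.loadedAt);
      runs.push({ name: script.name, at: earliestDeferRun });
    }
  }
  // Array.prototype.sort is stable, so ties keep document order.
  return runs.sort((a, b) => a.at - b.at).map((run) => run.name);
}

const order = executionOrder(
  [
    { name: 'a.js', mode: 'defer', loadedAt: 50 },
    { name: 'b.js', mode: 'defer', loadedAt: 20 },
    { name: 'c.js', mode: 'async', loadedAt: 30 },
    { name: 'd.js', mode: 'async', loadedAt: 10 },
  ],
  40 // HTML parsing finishes at 40 ms
);
console.log(order); // [ 'd.js', 'c.js', 'a.js', 'b.js' ]
```

The two async scripts run in download order (d.js before c.js, the reverse of document order), while b.js waits for a.js even though it finished downloading much earlier.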
2. Why is DOM manipulation slow?
Think of DOM and JavaScript as islands, each connected by a toll bridge. — High Performance JavaScript
JS is fast, and modifying DOM objects from JS is fast too. In the JS world everything is simple and quick. But DOM manipulation is not a solo performance by JS; it is a collaboration between two modules.
The DOM belongs to the rendering engine, while JS belongs to the JS engine. When we manipulate the DOM from JS, the JS engine and the rendering engine are in effect communicating across a boundary. This cross-boundary communication is not simple to implement and relies on binding interfaces that act as the “bridge”.
Crossing the bridge has a toll, and that cost is not negligible. Every time we touch the DOM, whether to modify it or merely to read a value, we cross the bridge. The more crossings, the more noticeable the performance problems become. So the advice to “cut down on DOM manipulation” is well founded.
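One common way to cut down on bridge crossings is to do the work in plain JS variables and touch the DOM once at the end. The two functions below produce the same text; the function names are hypothetical, and el stands for any element with a textContent property.

```javascript
// Slower pattern: every iteration reads and writes a DOM property,
// so the bridge is crossed about 2 * n times.
function listNumbersSlow(el, n) {
  el.textContent = '';
  for (let i = 0; i < n; i++) {
    el.textContent += `${i} `;
  }
}

// Faster pattern: build the string on the JS side, then cross the bridge once.
function listNumbersFast(el, n) {
  let text = '';
  for (let i = 0; i < n; i++) {
    text += `${i} `;
  }
  el.textContent = text;
}

// In a browser (the '#out' selector is an assumed example):
//   listNumbersFast(document.querySelector('#out'), 1000);
```

Both versions leave the element showing the same text; the difference is only in how many times the DOM property is accessed along the way.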
3. Do you really understand reflow and repaint?
The rendering process basically consists of four steps: (1) compute CSS styles, (2) build the render tree, (3) layout, which determines position, coordinates, and size, and (4) paint.
Note: when JavaScript dynamically modifies the DOM or CSS properties, the flow usually loops back to the layout step and triggers a relayout. Some changes do not, however, for example a modified CSS rule that matches no element.
Two concepts are important here: reflow and repaint.
- Repaint: when a change to the DOM alters an element’s style without affecting its geometry (for example, changing its color or background color), the browser does not need to recalculate geometry; it simply paints the element with the new style, skipping the reflow step.
- Reflow: when a change to the DOM alters geometry (for example, changing an element’s width or height, or hiding an element), the browser must recalculate the element’s geometry, which also affects the geometry and position of other elements, and then paint the result. This process is called reflow (also known as relayout).
A web page is rendered at least once when it is generated, and it is re-rendered repeatedly as the user interacts with it. A re-render consists of either reflow plus repaint, or repaint alone. Reflow always causes repaint, but repaint does not necessarily cause reflow. Both occur frequently when we set node styles and can greatly affect performance. Reflow is much more expensive, and changing a child node may well trigger a chain of reflows in its ancestors.
1) Common properties and methods that cause reflow
Any operation that changes the geometry of an element (its position and size) triggers reflow:
- Adding or removing visible DOM elements
- Changing element size: margin, padding, border, width, and height
- Changing content, such as the user typing text into an input box
- Resizing the browser window (when the resize event fires)
- Reading the offsetWidth and offsetHeight properties, which forces the browser to compute an up-to-date layout
- Setting the value of the style property
2) Common properties and methods that cause repaint
3) How to reduce reflow and repaint
- Use transform instead of top
- Use visibility instead of display: none, because the former causes only repaint while the latter causes reflow (it changes the layout)
- Do not read layout properties of a node inside a loop:
for (let i = 0; i < 1000; i++) {
  // Reading offsetTop forces a reflow, because the browser must return an up-to-date value
  console.log(document.querySelector('.test').offsetTop)
}
- Avoid table layout; a small change can cause the entire table to be laid out again
- Choose the animation speed with care: the faster the animation, the more reflows it causes. You can also use requestAnimationFrame
- CSS selectors are matched from right to left, so avoid deeply nested selectors
- Promote nodes that are frequently repainted or reflowed to their own layer, so that their rendering does not affect other nodes. For the video tag, for example, the browser automatically turns the node into a layer.
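Several of these tips come down to not mixing layout reads with style writes. As a minimal sketch, the function below gives a group of elements the height of the tallest one by doing all reads first and all writes afterwards, so the browser needs to compute layout at most once for the reads. equalizeHeights is a hypothetical helper, and the .card selector in the usage comment is an assumed example.

```javascript
// Set every element to the height of the tallest one.
function equalizeHeights(elements) {
  // Read phase: offsetHeight may force layout, so finish every read
  // before making any change.
  const heights = elements.map((el) => el.offsetHeight);
  const tallest = Math.max(0, ...heights);
  // Write phase: these assignments only invalidate layout; nothing is
  // read back, so no reflow is forced inside the loop.
  for (const el of elements) {
    el.style.height = `${tallest}px`;
  }
  return tallest;
}

// In a browser:
//   equalizeHeights(Array.from(document.querySelectorAll('.card')));
```

Interleaving the two phases (read one element, write it, read the next) would make each read force a fresh layout, because the preceding write has invalidated the previous one.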
Performance optimization strategy
Based on the rendering principles described above, the order in which the DOM and CSSOM are built and initialized can be used to optimize page rendering and improve page performance.
- JS optimization: the <script> tag, together with the defer and async attributes, controls how a script is downloaded and executed so that it does not block parsing of the document.
- defer attribute: the script file is downloaded without blocking the parser and executed after the document has been parsed.
- async attribute: an attribute introduced in HTML5 that downloads the script file asynchronously and interprets and executes the code as soon as the download finishes.
- CSS optimization: setting the rel attribute of the <link> tag to preload lets you declare in the HTML which resources are needed right after the page loads, so that the load order can be arranged optimally and rendering performance improved.
Conclusion
From what has been discussed above, we may come to the conclusion that…
- Browser workflow: Build DOM -> Build CSSOM -> Build render tree -> Layout -> Draw.
- CSSOM construction blocks rendering: the browser moves on to building the render tree only once the CSSOM is complete.
- Normally the DOM and CSSOM are built in parallel, but when the browser encounters a script tag without defer or async, DOM construction stops. Because JavaScript can modify the CSSOM, if the browser has not yet finished downloading and building the CSSOM at that point, it must wait until the CSSOM is complete before executing the JS, and only then resume DOM construction.