This article is taken from my public account [Sun Wukong, Don’t talk nonsense]

In front-end development, the browser is probably what we spend the most time with. Yet we often know only what the browser can do, not how it works internally.

I even sometimes wonder, “Does a front-end developer really need to understand how the browser works? What good will it do?” In fact, even a rough idea of how a browser works can be very helpful in daily development, especially when it comes to performance optimization.

Let me walk you through it.

1. The heart of the browser

The “heart” of the browser is its kernel, the most important module in the browser. Before we dive into the details of how browsers work, we need a high-level look at the browser kernel.

Typically, the browser kernel is also called the rendering engine. Rendering is the process of building a model from a description or definition and then generating an image from that model.

There are four common browser kernels on the market: Trident (IE), Gecko (Firefox), Blink (Chrome, Opera), and WebKit (Safari).

The most famous of these is WebKit, the kernel of Apple’s Safari browser. In 2005, Apple made the WebKit project open source. In 2008, Google created the Chromium project with the open-source WebKit as its kernel, and Chrome, the future king of browsers, grew out of Chromium. Although Chrome later switched its kernel to Blink, Blink is itself a fork of WebKit. From this perspective, the WebKit lineage is the real king of the browser world right now.

Let’s take a closer look at the modules contained in the browser kernel (rendering engine) using WebKit as an example.

As you can see from the figure above, the browser kernel (rendering engine) is made up of multiple modules working together. Among them, we need to pay attention to the HTML interpreter, CSS interpreter, layer layout calculation module, view drawing module and JavaScript engine.

  • HTML interpreter: parses the HTML document, starting with lexical analysis, and outputs a DOM tree.
  • CSS interpreter: parses CSS documents and generates style rules.
  • Layer layout calculation module: calculates the exact position and size of each object.
  • View drawing module: draws the images of specific nodes and renders pixels to the screen.
  • JavaScript engine: compiles and executes JavaScript code.

Within the browser kernel, the JavaScript engine has become more and more independent, so we tend to treat it separately and refer to the remaining modules collectively as WebCore.

2. Browser rendering process

With this understanding of the building blocks, we can run through the browser’s rendering process. Every page in the browser goes through the following stages when it is first rendered (the arrows do not imply strictly serial execution; some operations run in parallel, and the stages are drawn in sequence only to make them easier to understand):

(The following steps refer to Chapter 2 of WebKit Tech Insider.)

2.1 Parse HTML

In this step, the HTML interpreter turns the web page into a series of tokens. The interpreter then builds nodes from the tokens to form a DOM tree. If a node is JavaScript code, the JavaScript engine is called to interpret and execute it.
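The tokens-then-tree idea can be sketched in a few lines of JavaScript. This is only an illustration and not how WebKit implements it: the function names `tokenize` and `buildTree` are made up for this example, and the input is assumed to be well-formed markup with no attributes, comments, or void elements.

```javascript
// Minimal sketch: tokenize a tiny HTML subset, then build a tree from the tokens.
// Assumption: well-formed input, no attributes, comments, or void elements.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z][a-z0-9]*)>|<([a-z][a-z0-9]*)>|([^<]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    else if (match[2]) tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    else if (match[3].trim()) tokens.push({ type: "text", data: match[3].trim() });
  }
  return tokens;
}

function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // currently open elements, innermost last
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "startTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      stack.push(element); // following tokens become its children
    } else if (token.type === "endTag") {
      if (stack.length > 1) stack.pop(); // close the innermost open element
    } else {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const tree = buildTree(tokenize("<html><body><p>Hello</p></body></html>"));
console.log(JSON.stringify(tree, null, 2));
```

The stack of open elements is what turns a flat token stream into a nested tree: each start tag pushes a node, and each end tag pops one.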

If a node depends on other (non-JavaScript) resources, such as images, CSS, or videos, the resource loader is called to load them. They are loaded asynchronously, so they do not stop the DOM tree from being built. If the node is a JavaScript resource URL (not marked async or defer), construction of the DOM tree must pause until the JavaScript resource has been loaded and executed by the JavaScript engine.

During loading and rendering, a web page fires a “DOMContentLoaded” event and a “load” (onload) event. The “DOMContentLoaded” event fires once the DOM tree has been built; the “load” event fires once the DOM tree has been built and all the resources the page depends on have been loaded.
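In browsers these two milestones are exposed as the standard `DOMContentLoaded` event (on `document`) and the `load` event (on `window`). The sketch below records the order in which they fire; the helper name `recordLoadMilestones` is hypothetical, and it accepts any EventTarget-like objects so that it can also be tried outside a browser.

```javascript
// Sketch: record the two page-load milestones in the order they fire.
// `doc` and `win` are EventTarget-like objects; in a browser, pass `document` and `window`.
function recordLoadMilestones(doc, win, log) {
  doc.addEventListener("DOMContentLoaded", () => log.push("DOMContentLoaded"));
  win.addEventListener("load", () => log.push("load"));
  return log;
}

// Browser usage, guarded so the snippet also loads where there is no DOM.
if (typeof document !== "undefined" && typeof window !== "undefined") {
  const milestones = recordLoadMilestones(document, window, []);
  window.addEventListener("load", () => console.log(milestones));
}
```

If the script runs before the page has finished parsing, the log should show `DOMContentLoaded` first and `load` second, because `load` also waits for images, style sheets, and other dependent resources.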

2.2 Calculate styles

It is important to note that the CSS interpreter does not work until the CSS file has been loaded.

The loaded CSS file is parsed by the CSS interpreter into an internal representation (which can be thought of as a list of style rules). The CSS parser works in parallel with the HTML parser.

2.3 Calculate the layer layout

Once both the DOM tree from Step 2.1 and the list of style rules from Step 2.2 have been generated, the two are combined to build the RenderObject tree. Following the known structure of the DOM tree, the matching rules from the style rule list are attached to each node. The result is called the RenderObject tree.
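The combination step can be sketched as a walk over the DOM tree that attaches matching declarations to each node. This is a simplified illustration, not WebKit's actual RenderObject construction: `buildRenderTree` and `matchesSelector` are hypothetical names, selectors are assumed to be either a tag name or a single class, and later rules simply overwrite earlier ones.

```javascript
// Sketch: attach matching style declarations to each DOM node to form a render tree.
// Assumption: a selector is either a tag name ("p") or a single class (".note").
function matchesSelector(node, selector) {
  return selector.startsWith(".")
    ? (node.classes || []).includes(selector.slice(1))
    : node.tag === selector;
}

function buildRenderTree(node, rules) {
  const style = {};
  for (const rule of rules) {
    // Later rules overwrite earlier ones; real engines also weigh specificity.
    if (matchesSelector(node, rule.selector)) Object.assign(style, rule.declarations);
  }
  return {
    tag: node.tag,
    style,
    children: (node.children || []).map((child) => buildRenderTree(child, rules)),
  };
}

const dom = { tag: "body", children: [{ tag: "p", classes: ["note"] }, { tag: "p" }] };
const rules = [
  { selector: "p", declarations: { color: "black" } },
  { selector: ".note", declarations: { color: "gray", "font-style": "italic" } },
];
const renderTree = buildRenderTree(dom, rules);
console.log(JSON.stringify(renderTree, null, 2));
```

Note that the DOM objects are only read, never modified, which mirrors the point that the DOM tree and the render tree exist side by side.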

Creating the RenderObject tree does not mean that the DOM tree is destroyed. In fact, the various internal representations used throughout the process (including the DOM tree, the parsed CSS style rule list, the RenderObject tree, the RenderLayer tree mentioned below, and the drawing context) remain in place until the page is destroyed.

2.4 Draw layers

As RenderObject nodes are created, WebKit builds the RenderLayer tree based on the web page’s hierarchy, along with a virtual drawing context.

2.5 Integrate layers to get a page

The final image is generated based on the drawing context, a process that relies heavily on 2D and 3D graphics libraries.

Having seen the entire rendering process, we can summarize it: first, a DOM tree is built from the HTML. This DOM tree is combined with the list of style rules parsed by the CSS interpreter to create the RenderObject tree. Finally, the browser uses the RenderObject tree to calculate the layout and draw the image, and the first rendering of the page is complete.

After that, every time a new element is added to the DOM tree, the browser looks through the list of style rules parsed by the CSS interpreter, finds the rules that match the element, applies them, and then redraws.

3. Optimization suggestions based on the rendering process

The actual rendering process is more complex than described above, and we’ll cover Reflow and Repaint in CSS in future articles. But for now, let’s see what points we can optimize based on what we know about the rendering process.

3.1 Optimize the loading order of CSS and JS

Load CSS files as early as possible

In Step 2.3 we mentioned that once both the DOM tree of Step 2.1 and the list of style rules of Step 2.2 have been generated, the two are combined into the RenderObject tree. Generating the list of CSS style rules depends on the CSS source files having loaded. In other words, the RenderObject tree cannot be built until the CSS source files have loaded. That is:

CSS is the resource that blocks rendering. It needs to be downloaded to the client as soon as possible to reduce the first rendering time.

In fact, many teams already load CSS early (by putting it in the head tag) and fast (by using a CDN to speed up static resource loading). “Putting CSS first” has become an ingrained coding habit for many developers. By now we should also see that this habit is not superstition; it follows from the nature of CSS.

Choose a sensible JS loading mode

In Step 2.1, we mentioned the impact of JavaScript files on HTML parsing. Whenever the parser reaches a node that is inline JavaScript code or an external JavaScript resource (not marked async or defer), the rendering thread hands control to the JavaScript engine. This directly blocks the entire rendering process.

The browser lets JavaScript block other activities because it cannot know what changes the JavaScript is going to make, and continuing without blocking could lead to inconsistent results (JavaScript code can modify the DOM and CSS through the corresponding Web APIs).

But we’re the ones who write the code, and we know what the code does. If we can confirm that a JavaScript file doesn’t have to be executed at this point, we can avoid unnecessary blocking by marking it with defer or async, which leads to three ways to load external JavaScript files.

Three ways to load JS files

  • Normal mode:
    <script src="index.js"></script>

In this case, JS blocks the browser and the browser must wait for index.js to load and execute before it can do anything else.

  • Async mode:
    <script async src="index.js"></script>

In async mode, JS does not block the browser from doing anything else. It loads asynchronously, and when it finishes loading, the JS script executes immediately.

  • Defer mode:
    <script defer src="index.js"></script>

In defer mode, JS loads asynchronously and its execution is deferred. After the entire HTML document has been parsed, the scripts marked defer execute in document order.

In practice, async is usually the choice when the script has no strong dependency on DOM elements or on other scripts; when the script depends on DOM elements or on the results of other scripts, we choose defer.

In async and defer modes, we also have the option of putting the script tags in the head of the HTML to further improve efficiency.

By carefully adding async/defer to the script tag, we can tell the browser not to block rendering while waiting for the script to be available, which can significantly improve performance.
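To make the difference between the three modes concrete, here is a small simulation of the order in which scripts would execute. It is a rough model rather than real browser behavior: `simulateExecutionOrder` is a hypothetical helper, downloads are assumed to start when the parser reaches the tag, execution is assumed to take no time, and parsing the markup before each script (and after the last one) is assumed to take a fixed `parseMs`.

```javascript
// Simulation sketch of when scripts execute under the three loading modes.
// Assumptions: a download starts when the parser reaches the tag, execution takes
// no time, and each stretch of markup between scripts takes `parseMs` to parse.
function simulateExecutionOrder(scripts, parseMs) {
  const executed = []; // entries of { name, at }
  const deferred = [];
  let clock = 0; // the parser's current time in ms

  for (const script of scripts) {
    clock += parseMs; // parse up to this <script> tag
    const readyAt = clock + script.downloadMs;
    if (script.mode === "normal") {
      clock = readyAt; // the parser waits for download and execution
      executed.push({ name: script.name, at: readyAt });
    } else if (script.mode === "async") {
      executed.push({ name: script.name, at: readyAt }); // runs as soon as it arrives
    } else {
      deferred.push({ name: script.name, readyAt }); // runs after parsing, in order
    }
  }

  clock += parseMs; // parse the rest of the document
  for (const script of deferred) {
    clock = Math.max(clock, script.readyAt);
    executed.push({ name: script.name, at: clock });
  }

  // Array.prototype.sort is stable, so ties keep their insertion order.
  return executed.sort((a, b) => a.at - b.at).map((entry) => entry.name);
}

const order = simulateExecutionOrder(
  [
    { name: "a", mode: "normal", downloadMs: 30 },
    { name: "b", mode: "async", downloadMs: 100 },
    { name: "c", mode: "defer", downloadMs: 10 },
    { name: "d", mode: "async", downloadMs: 5 },
  ],
  10
);
console.log(order); // [ 'a', 'd', 'c', 'b' ]
```

In this run the two async scripts execute out of document order (d before b) because each runs whenever its download finishes, while the deferred script c waits until parsing is done. That is the reason async suits independent scripts and defer suits scripts that depend on the DOM or on each other.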

3.2 Optimization of CSS style sheet rules

The CSS interpreter and the list of style rules it parses are mentioned several times during rendering: it is used when the RenderObject tree is first generated, and again when the page is updated. The operation of matching a style rule from a list of style rules occurs frequently.

(CSS style rules could be matched from right to left or from left to right. I have not yet been able to find definitive data on this, so I will not take a position here.)

To make this matching process as inexpensive as possible, we can at least summarize the following performance improvement solutions:

  • Avoid wildcards and select only the elements you need.
  • Rely on inheritance for properties that can be inherited, to avoid repeated matching and repeated definitions.
  • Use tag selectors sparingly. If possible, use a class selector instead.
  • Reduce nesting. Try not to go more than three levels deep.
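To see why nesting adds cost, the sketch below counts how many element comparisons one possible matching strategy performs. It assumes, purely for illustration, that the engine checks the rightmost part of the selector first and then walks up the ancestors; the article does not claim that any particular browser works this way. The names `el` and `matchDescendant` are hypothetical, and only descendant selectors made of tag names, class names, and `*` are handled.

```javascript
// Sketch: count comparisons when matching a descendant selector against one element.
// Assumption (illustrative only): the rightmost part is tested first, then ancestors.
function el(tag, classes, parent) {
  return { tag, classes, parent };
}

function matchesSimple(node, part) {
  if (part === "*") return true;
  return part.startsWith(".") ? node.classes.includes(part.slice(1)) : node.tag === part;
}

// `parts` holds the selector split on spaces, e.g. ["div", "ul", "li", ".item"].
function matchDescendant(element, parts) {
  let checks = 1;
  let index = parts.length - 1;
  if (!matchesSimple(element, parts[index])) return { matched: false, checks };
  index--;
  let node = element.parent;
  while (index >= 0 && node) {
    checks++;
    if (matchesSimple(node, parts[index])) index--; // this ancestor satisfies the part
    node = node.parent;
  }
  return { matched: index < 0, checks };
}

const html = el("html", [], null);
const body = el("body", [], html);
const div = el("div", ["list"], body);
const ul = el("ul", [], div);
const li = el("li", [], ul);
const span = el("span", ["item"], li);

console.log(matchDescendant(span, [".item"])); // { matched: true, checks: 1 }
console.log(matchDescendant(span, ["div", "ul", "li", ".item"])); // { matched: true, checks: 4 }
console.log(matchDescendant(span, ["nav", ".item"])); // { matched: false, checks: 6 }
```

Under this assumption, a single class selector is settled in one comparison, while a nested selector that ultimately fails has to walk the whole ancestor chain before giving up. Multiplied across every element and every rule, that extra walking is the cost the suggestions above try to avoid.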

4. Summary

This article describes the rendering process of the browser kernel and gives practical optimization suggestions based on it. I hope it helps.


Front-end performance optimization series:

(1): Start with the TCP three-way handshake

(2): Blocking in the TCP transmission process

(3): Optimization of the HTTP protocol

(4): Image optimization

(5): Browser cache strategy

(6): How does the browser work?

(7): Webpack performance optimization