Preface
Browser fundamentals are a small branch of the front-end knowledge network, but one that every front-end developer must master. They run through the whole front-end stack, and project optimization also revolves around the browser.
Developers may be asked during an interview:
What happens between the time you type a web address into your browser and the time the page is fully displayed?
It’s a cliché question, but there’s no single answer. Three to five years ago there would have been a “relatively” standard answer:
1. When the browser receives this command, it starts a separate thread to process it. It first checks whether the input is a valid, reasonable URL and whether it is an HTTP request; if so, it proceeds to the next step.
2. The browser engine analyzes the URL. If a cached copy governed by “Cache-Control” exists and has not expired, the file is read from the local cache (status 200, “from memory cache”). If no cached copy exists or it has expired, the browser initiates a remote request.
3. The domain name is resolved through DNS to obtain the IP address of the site. The GET request, together with the browser’s cookies and User-Agent information, will be sent to this address.
4. Before any data flows comes the classic TCP “three-way handshake”, which establishes the connection. The HTTP session then runs over it, with the browser client sending messages to the web server for communication and data transfer.
5. The request enters the site’s back-end service, such as Tomcat or Apache, or the Node.js servers that have become popular in recent years. These servers host the application code, which may be written in many languages, such as Java, PHP, C++, C#, and JavaScript.
6. The server executes the corresponding back-end application logic based on the URL, using the “server cache” or the “database” along the way.
7. The server processes the request and returns a response. If the browser has visited the page before and holds a cached copy, the server compares it with its own last-modification record: if the resource is unchanged, 304 is returned; otherwise, 200 and the corresponding content are returned.
8. The browser receives the response and either downloads the HTML file (no usable cache, status 200) or reads the file from the local cache (cached copy still valid, status 304).
9. Once the rendering engine has the HTML file, it starts parsing it and building the DOM tree. It downloads the resources referenced by the markup according to their MIME types (such as CSS and JavaScript files) and sets up caching for them.
10. The rendering engine extends the DOM tree into a render tree based on the CSS style rules, and then lays it out and paints it (the “rearrangement” and “redraw” steps).
11. If the page contains JS files, they are executed to perform DOM manipulation, cache reads, event binding, and other operations. The final page is then displayed in the browser.
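The cache decisions in the steps above can be sketched in a few lines. This is a simplified illustration, not how any browser actually implements its cache: the record shape (`storedAt`, `maxAge`, `lastModified`) and both function names are hypothetical, and only the `max-age` form of Cache-Control and the `If-Modified-Since` validator are modeled.

```javascript
// `entry` is a hypothetical cache record:
//   { storedAt: seconds, maxAge: seconds (from Cache-Control: max-age), lastModified: string }
function planRequest(entry, nowSeconds) {
  if (!entry) {
    return { action: "fetch", headers: {} }; // nothing cached: plain GET
  }
  const age = nowSeconds - entry.storedAt;
  if (age < entry.maxAge) {
    return { action: "use-cache", headers: {} }; // still fresh: no network request
  }
  // Expired: ask the server whether the cached copy is still valid.
  return {
    action: "revalidate",
    headers: { "If-Modified-Since": entry.lastModified },
  };
}

// Pick the body to render once the server has answered.
function resolveBody(status, cachedBody, responseBody) {
  if (status === 304) return cachedBody; // Not Modified: reuse the local copy
  if (status === 200) return responseBody; // new content was downloaded
  throw new Error(`unhandled status ${status}`);
}

const entry = {
  storedAt: 1000,
  maxAge: 60,
  lastModified: "Wed, 21 Oct 2015 07:28:00 GMT",
};
console.log(planRequest(entry, 1030).action); // use-cache
console.log(planRequest(entry, 1100).action); // revalidate
console.log(planRequest(undefined, 1100).action); // fetch
```

Keeping the freshness check separate from the status handling mirrors the two places where the cache shows up in the answer: before the request is sent, and after the response arrives.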
This answer is a short summary of the “back-end oriented MVC pattern” and of how the browser responded in early Web applications. With the development of front-end technology and the advent of “front-end and back-end separation”, “middleware straight out”, and the “MNV* mode”, the answer has changed.
Take “front-end and back-end separation” as an example. After step 4 of the answer above, the request does not reach the back-end server directly. Instead, it is intercepted by an HTTP and reverse proxy server such as Nginx.
- Steps 1, 2, 3, and 4 as above.
- When Nginx receives an HTTP (port 80) or HTTPS (port 443) request, it dispatches it according to the URL, rewriting it to either the back-end server or the static resource server. Home page requests are generally dispatched to the static server, which returns an HTML file.
- Steps 7, 8, 9, and 10 as above.
- The JS scripts execute and issue asynchronous POST or GET requests through Ajax or fetch. These requests pass through Nginx again, which this time dispatches them to the back-end server (steps 5, 6, and 7). The server returns XML or JSON, usually containing a code (the return code) and a result (the associated data).
- The JS callback runs different logic depending on the return code, adding, removing, and changing page elements, which may cause a rearrangement or a redraw. The home page is now loaded.
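The dispatch rule and the callback in the steps above can be sketched as two small functions. Everything specific here is an assumption made for illustration: the `/api/` prefix, the convention that `code === 0` means success, and the function names are not taken from any real Nginx configuration or API.

```javascript
// Hypothetical routing rule: paths under /api/ go to the back-end server,
// everything else (including the home page) goes to the static server.
function dispatch(path) {
  return path.startsWith("/api/") ? "backend" : "static";
}

// Hypothetical response envelope: { code: number, result: any },
// where code 0 is assumed to mean success.
function handleApiResponse(jsonText) {
  const { code, result } = JSON.parse(jsonText);
  if (code === 0) {
    return { op: "render", data: result }; // update page elements with the data
  }
  return { op: "show-error", data: code }; // branch on the return code
}

console.log(dispatch("/index.html")); // static
console.log(dispatch("/api/user")); // backend
console.log(handleApiResponse('{"code":0,"result":{"name":"test"}}').op); // render
```

In a real deployment the first function corresponds to proxy configuration rather than application code; it is written in JavaScript here only to keep the example in one language.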
From the steps above we can see that the browser may render the page twice, which easily produces a “white screen” or “page jitter”. The “middleware straight out” mode was born to solve this problem: the middle tier outputs the rendered HTML directly. In addition, to expand the front-end camp and bring in iOS and Android, the “MNV* mode” appeared, typically represented by React Native from Facebook. However, this mode falls outside the scope of the browser, so it is not covered further here.
The process described above draws on many browser functions: the address bar input, network requests, document parsing, page rendering by the rendering engine, script execution by the JavaScript engine, client-side storage, and so on. Let’s take a look at the basic structure of the browser.
The structure of the browser
Browsers are typically made up of seven modules: the user interface (UI), browser engine, rendering engine, networking, JavaScript interpreter, UI back end, and data persistence, described as follows:
- User interface: the address bar, back/forward buttons, bookmarks, and so on; everything you see other than the window that displays the page.
- Browser engine: passes instructions between the user interface and the rendering engine, reads and writes data in the client’s local cache, and is the core of communication between the parts of the browser.
- Rendering engine: parses DOM documents and CSS rules and lays the content out in the browser to display a styled interface. It is also known as the layout engine, and when we speak of the “browser kernel” we mainly mean the rendering engine.
- Networking: the module that performs network calls and resource downloads.
- UI back end: draws the basic controls inside the browser window, such as input fields, buttons, and radio buttons. Their appearance varies between browsers, but their function is the same.
- JS interpreter: the module that interprets and executes JS scripts, such as the V8 engine or JavaScriptCore.
- Data storage: the browser saves various data, such as cookies and localStorage, on the hard disk; the data can be accessed through the APIs provided by the browser engine.
As front-end developers, we need to focus on understanding how the rendering engine works and on applying data storage techniques flexibly. Both come up often in real project development, and understanding how the rendering engine works is especially important for performance optimization. The other parts are managed by the browser and leave the developer little control. Today we focus on one of these two areas: the browser rendering engine.
Browser rendering engine
Browser rendering engines are developed by major browser vendors in accordance with W3C standards and are also known as “browser kernels”.
Currently, there are five major browser kernels in use: Trident, Gecko, Presto, WebKit, and Blink.
Trident: commonly known as the IE kernel, also called the MSHTML engine. It is used in IE11 and earlier, as well as in the IE-compatibility mode of various domestic multi-core browsers. Microsoft’s Edge browser no longer uses MSHTML; it shipped with a new engine called EdgeHTML and has since moved to Blink.
Gecko: commonly known as the Firefox kernel. It was first adopted by Netscape 6 and later by Mozilla Firefox. Gecko’s code is fully open, which makes it highly extensible: programmers around the world can write code for it and add features. Being open source, it is favored by many people and many browsers are built on it, which is an important reason why its market share grew rapidly despite its youth.
Presto: Opera’s former kernel. Why “former”? Because after version 12.17 Opera switched to the Blink kernel used by Google Chrome, leaving Presto without a home.
WebKit: the Safari kernel and the prototype from which Chrome’s kernel was derived. It is mainly used by Safari, is strong in terms of features, and is also widely used in mobile browsers.
Blink: developed by Google and Opera Software and used in Chrome (version 28 and later), Opera (version 15 and later), and Yandex. Blink is a fork of WebKit that adds optimizations and new features, such as out-of-process iframes and moving the DOM into JavaScript to speed up JavaScript access to the DOM. Many browser kernels embedded in mobile applications have gradually begun to use Blink.
Rendering engine workflow
The most important job of a browser rendering engine is to parse HTML and CSS documents and render the combined result into the browser window. As shown in the figure below, after the rendering engine receives the HTML file, the main steps are: parse the HTML to build a DOM tree -> build a render tree -> lay out the render tree -> paint the render tree.
When parsing HTML to build the DOM tree, the rendering engine parses the tag elements of the HTML file into DOM element nodes, which form a tree based on their parent-child relationships. At the same time, the CSS file is parsed into a table of CSS rules, and each rule is then matched against the DOM tree in reverse, “from right to left”, producing a render tree in which each node carries its style rules. Next come render tree layout and painting. First, the size and position of each element are computed from the style rules in the render tree; the key properties include position, width, margin, padding, top, border, and so on. Then the elements are painted according to style rules such as color, background, and shadow.
This process is also incremental. For a better user experience, the rendering engine renders content to the screen as early as possible rather than waiting until all the HTML has been parsed before building and laying out the render tree. It parses and displays part of the content while the rest is probably still being downloaded over the network.
Note also that once the first screen has rendered, any DOM manipulation can cause the rendering engine to lay out and paint the render tree again, which we call “rearrangement” and “redraw” (more commonly known as reflow and repaint). The two are ordered: a redraw does not necessarily trigger a rearrangement, but a rearrangement inevitably triggers a redraw, which can cost significant performance. When optimizing performance we should therefore follow the principle “avoid rearrangement; reduce redraw”.
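The relationship between the two operations can be made concrete with a small classifier. It is only a sketch built from the properties this article names: the two sets are not exhaustive, `box-shadow` is assumed to be what “shadow” refers to, and `renderingWork` is a hypothetical helper, not a browser API.

```javascript
// Properties the text lists as affecting size and position (layout).
const LAYOUT_PROPS = new Set(["position", "width", "margin", "padding", "top", "border"]);
// Properties the text lists as affecting only appearance; "box-shadow"
// stands in for the text's "shadow" (an assumption).
const PAINT_PROPS = new Set(["color", "background", "box-shadow"]);

// Report the heaviest work a set of style changes would trigger.
function renderingWork(changedProps) {
  if (changedProps.some((prop) => LAYOUT_PROPS.has(prop))) {
    return "rearrange+redraw"; // a rearrangement always brings a redraw with it
  }
  if (changedProps.some((prop) => PAINT_PROPS.has(prop))) {
    return "redraw"; // appearance only: no layout pass needed
  }
  return "unclassified"; // not covered by this sketch
}

console.log(renderingWork(["color"])); // redraw
console.log(renderingWork(["color", "width"])); // rearrange+redraw
```

Changing a single layout property is enough to move the whole batch into the more expensive category, which is why the principle puts avoiding rearrangement first.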
Differences between browser kernels
The page rendering process differs slightly between browser kernels.
In the WebKit kernel, HTML and CSS files are parsed at the same time, while in the Gecko kernel CSS files are parsed only after the HTML file has been parsed into the Content Sink.
Apart from differences in terminology, the two processes are basically the same. The three most important parts are “HTML parsing”, “CSS parsing”, and “render tree generation”. The principles behind them run deep, involving lexical analysis, syntax analysis, transformation, interpretation, data structures, and similar topics. The material is rather dry, and a general understanding is usually enough. Readers who want to learn more can read the article on the working principles of the browser, which explains the process and principles of these three parts in detail. There is no further elaboration here.
About matching CSS rules
As mentioned above, CSS rules are matched against the DOM tree in reverse, “right to left”, producing a render tree whose nodes carry their style rules.
But do you know why reverse matching is done “right to left”?
Let’s review the WebKit kernel workflow diagram.
CSS rule matching takes place in the WebKit engine’s “Attachment” step, in which the browser attaches CSS style rules to each element of the DOM tree. For every DOM element, the matching selectors must be found among all the style rules and the corresponding rules combined. This is where selectors are actually “parsed”: as the DOM tree is traversed, the matching selectors are looked up in the style rules.
Let’s take the simplest example:
<template>
  <div>
    <div class="t">
      <span>test</span>
      <p>test</p>
    </div>
  </div>
</template>
<style>
div { color: #000; }
div .t span { color: red; }
div .t p { color: blue; }
</style>
Here we have an HTML tree and a set of style rules that must be matched against each other.
Take the rule div .t p { color: blue; } and suppose matching runs forward, from left to right. When the <span> element is checked, the engine first has to find the parent and grandparent of <span>, check whether they satisfy div .t, and only then check whether <span> is a p tag. The match fails at that last step, so three comparisons have been wasted.
With reverse, right-to-left matching, the very first comparison checks whether <span> is a p tag. It is not, so the rule can be excluded immediately, which is more efficient.
As the HTML structure grows more complex and the CSS rule table grows larger, the advantage of “reverse matching” over “forward matching” becomes much greater, because successful matches are far rarer than failed ones. Conversely, if you put the wildcard “*” at the end of a selector, the advantage of “reverse matching” is greatly reduced, which is why many optimization guides say “avoid adding wildcards to the end of the selector”.
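The comparison counts above can be reproduced with a toy matcher. This is a minimal sketch under stated assumptions, not WebKit’s algorithm: it supports only tag and single-class compounds joined by descendant combinators (spaces), and the node shape and all function names are hypothetical.

```javascript
// Create a node for a tiny element tree (hypothetical helper).
function el(tag, classes = [], parent = null) {
  return { tag, classes, parent };
}

// Parse "div .t p" into compounds: [{tag:"div"}, {cls:"t"}, {tag:"p"}].
function parseSelector(selector) {
  return selector.trim().split(/\s+/).map((part) =>
    part.startsWith(".") ? { cls: part.slice(1) } : { tag: part }
  );
}

// Compare one node with one compound, counting every comparison made.
function matchesCompound(node, compound, counter) {
  counter.comparisons += 1;
  if (compound.cls !== undefined) return node.classes.includes(compound.cls);
  return node.tag === compound.tag;
}

// Reverse matching: test the element itself first, then walk up the ancestors.
function matchRightToLeft(node, selector) {
  const parts = parseSelector(selector);
  const counter = { comparisons: 0 };
  let k = parts.length - 1;
  if (!matchesCompound(node, parts[k], counter)) {
    return { matched: false, comparisons: counter.comparisons };
  }
  k -= 1;
  for (let a = node.parent; a !== null && k >= 0; a = a.parent) {
    if (matchesCompound(a, parts[k], counter)) k -= 1;
  }
  return { matched: k < 0, comparisons: counter.comparisons };
}

// Forward matching: satisfy the ancestor compounds first, test the element last.
function matchLeftToRight(node, selector) {
  const parts = parseSelector(selector);
  const counter = { comparisons: 0 };
  const ancestors = [];
  for (let a = node.parent; a !== null; a = a.parent) ancestors.unshift(a);
  const last = parts.length - 1;
  let k = 0;
  for (const a of ancestors) {
    if (k === last) break;
    if (matchesCompound(a, parts[k], counter)) k += 1;
  }
  const matched = k === last && matchesCompound(node, parts[last], counter);
  return { matched, comparisons: counter.comparisons };
}

// The tree from the example: div > div.t > (span, p)
const outer = el("div");
const inner = el("div", ["t"], outer);
const span = el("span", [], inner);
const p = el("p", [], inner);

const forward = matchLeftToRight(span, "div .t p");
const reverse = matchRightToLeft(span, "div .t p");
console.log(forward); // { matched: false, comparisons: 3 }
console.log(reverse); // { matched: false, comparisons: 1 }
```

For the `p` element both directions need three comparisons, so the saving comes entirely from rejecting non-matching elements early, which is the common case.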
At the extreme, if our stylesheet did not have nested relationships, it would look like this:
<template>
  <div class="t">
    <span class="div_t_span">test</span>
    <p class="div_t_p">test</p>
  </div>
</template>
<style>
div { color: #000; }
.div_t_span { color: red; }
.div_t_p { color: blue; }
</style>
The engine’s “Attachment” step is then greatly simplified, and the gain in efficiency is easy to imagine. This is why style sheets for WeChat Mini Programs are advised not to use relational (nested) selectors.
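To see why flat class selectors are cheap to match, consider a lookup keyed by class name. This is an illustrative sketch only: the `Map` and the `declarationsFor` helper are assumptions made for the example and do not describe how any particular engine stores its rules.

```javascript
// The flat rules from the example above, indexed by class name.
const flatRules = new Map([
  ["div_t_span", { color: "red" }],
  ["div_t_p", { color: "blue" }],
]);

// Collect declarations for a node by looking up each of its classes directly.
// No parent or grandparent is ever inspected.
function declarationsFor(node) {
  const style = {};
  for (const cls of node.classes) {
    Object.assign(style, flatRules.get(cls) ?? {});
  }
  return style;
}

console.log(declarationsFor({ tag: "span", classes: ["div_t_span"] })); // { color: 'red' }
```

The cost per element depends only on how many classes it carries, not on how deep it sits in the tree.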
Related performance optimization
From the examples above we can already see roughly where optimizations related to the rendering engine are possible.
They are as follows:
Reduce the impact of JS loading on DOM rendering
Place JS files at the end of the HTML document, or load the JS code asynchronously.
Avoid rearrangement and reduce redrawing
When writing CSS animations, avoid animating width, margin, padding, and other properties that affect layout; CSS3 transform can be used instead. It is also worth noting that when loading a large number of images, you should fix the image sizes in advance; otherwise the layout information is updated as each image loads, causing a large number of rearrangements.
Reduce the use of relational style sheets
Use unique class names to maximize rendering efficiency, and avoid adding wildcards at the end of selectors.
Reduce the DOM hierarchy
Removing meaningless levels of DOM nesting reduces the amount of matching work in the rendering engine’s Attachment step.
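For the first point, asynchronous loading can be done from script as well as from markup. The helper below is a minimal sketch: `loadScriptAsync` is a hypothetical name, the path is a placeholder, and the document object is passed in as a parameter only so that the function can also be exercised with a stub outside a browser.

```javascript
// `doc` is expected to behave like the browser's `document`.
function loadScriptAsync(doc, src) {
  const script = doc.createElement("script");
  script.src = src;
  script.async = true; // do not block HTML parsing while the file downloads
  doc.head.appendChild(script);
  return script;
}

// In a browser, pass the real document; "/static/app.js" is a placeholder path.
if (typeof document !== "undefined") {
  loadScriptAsync(document, "/static/app.js");
}
```

The same effect is available declaratively through the `async` or `defer` attribute on a script tag; which one fits depends on whether the script has to run in document order.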
Preview
Part 2 of the “Front-end Things” series will cover front-end optimization strategies.