Let's skip the small talk and get to the point. As front-end engineers, we work with pages every day. The page is displayed in the browser, and during development we refresh it over and over to see how it looks.
But have we ever wondered how browsers turn our projects into pages? Why does a page appear when we type a URL into the browser and hit Enter?
One might argue that the browser simply sends a request, downloads an HTML document, then parses and renders it. Yes, that is exactly what happens. But how does the browser make the request? How does it render the result?
Let's work through it with the following questions:
1. What does the browser do first when making a request? A: It looks in the local cache
Many readers will know the answer: the browser searches its local cache by URL. It first checks whether the cache holds content for the requested URL. If it does, and that content is still valid, the browser uses it directly. If not, it has to ask the server for the URL.
But don't underestimate the local cache; there is a lot to learn here. Let's first review the header fields the browser uses for caching.
- Cache-Control: specifies how long the current resource stays valid, and controls whether the browser reads the data straight from its cache or sends a request to the server to fetch it.
- Expires: a response header field that gives the browser an expiration date. Until that date, the browser can read the data from its cache without requesting it again. Expires comes from HTTP/1.0, so its role can largely be ignored today.
- Last-Modified: indicates when the requested resource was last modified.
- ETag: an identifier the web server sends in its response that identifies the current version of the resource on the server.
With so many cache fields, there must be a priority. Cache-Control takes priority over Expires, and ETag takes priority over Last-Modified. With that said, let's go straight to the picture below for a clearer view:
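The same priority rules can be sketched in code. This is a simplified illustration, not how any particular browser implements its cache: the names CachedEntry, CacheDecision, and decideCache are hypothetical, only max-age and Expires are considered for freshness, and directives such as no-cache, no-store, and heuristic freshness are ignored.

```typescript
// Hypothetical shape for a stored response; names are illustrative only.
interface CachedEntry {
  storedAt: number; // ms since epoch when the response was stored
  headers: Record<string, string | undefined>; // lower-cased header names
}

type CacheDecision =
  | { action: "use-cache" }
  | { action: "revalidate"; requestHeaders: Record<string, string> }
  | { action: "fetch" };

// Extract max-age (in seconds) from a Cache-Control value, if present.
function parseMaxAge(cacheControl: string | undefined): number | undefined {
  if (cacheControl === undefined) return undefined;
  const match = /(?:^|,)\s*max-age\s*=\s*(\d+)/i.exec(cacheControl);
  const digits = match?.[1];
  return digits === undefined ? undefined : Number(digits);
}

function isFresh(entry: CachedEntry, now: number): boolean {
  const maxAge = parseMaxAge(entry.headers["cache-control"]);
  if (maxAge !== undefined) {
    // Cache-Control wins: Expires is not consulted when max-age is present.
    return now - entry.storedAt < maxAge * 1000;
  }
  const expires = entry.headers["expires"];
  if (expires !== undefined) {
    const expiresAt = Date.parse(expires);
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  return false; // no freshness information: treat the entry as stale
}

function decideCache(entry: CachedEntry | undefined, now: number): CacheDecision {
  if (entry === undefined) return { action: "fetch" };
  if (isFresh(entry, now)) return { action: "use-cache" };

  // Stale entry: ask the server whether it changed. Both validators are sent
  // when available; a server evaluates If-None-Match (ETag) ahead of
  // If-Modified-Since (Last-Modified).
  const requestHeaders: Record<string, string> = {};
  const etag = entry.headers["etag"];
  const lastModified = entry.headers["last-modified"];
  if (etag !== undefined) requestHeaders["If-None-Match"] = etag;
  if (lastModified !== undefined) requestHeaders["If-Modified-Since"] = lastModified;

  return Object.keys(requestHeaders).length > 0
    ? { action: "revalidate", requestHeaders }
    : { action: "fetch" };
}
```

A "revalidate" decision corresponds to a conditional request: if the server answers 304 Not Modified, the cached copy is reused; otherwise the new response replaces it.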
Now that you have an idea of how browser caching works, let's look at what the browser does next.
2. What does the browser do after it has looked in the cache? A: A DNS query
First, the client sends the domain name query to the local DNS server. The local DNS server searches its previous records (its cache) first. If there is a cached record, it resolves the name directly from the cache. If not, it sends the query to a root DNS server, and the lookup works down from there through the top-level-domain and authoritative servers.
DNS resolves our domain name to an IP address and returns it to the browser, so that the browser can locate the server it wants to request.
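The cache-first lookup described above can be sketched as follows. This is a toy model under stated assumptions: CachingResolver, UpstreamLookup, and the fake upstream function are hypothetical names, the upstream function stands in for the whole root-to-authoritative lookup, and no real DNS traffic is involved. The address 192.0.2.10 comes from a range reserved for documentation.

```typescript
// Hypothetical record shape: an address plus the time its TTL runs out.
interface DnsCacheRecord {
  address: string;
  expiresAt: number; // ms since epoch
}

// Stand-in for the answer that comes back from the upstream lookup.
interface UpstreamAnswer {
  address: string;
  ttlSeconds: number;
}

type UpstreamLookup = (hostname: string) => UpstreamAnswer | undefined;

class CachingResolver {
  private readonly cache = new Map<string, DnsCacheRecord>();
  private readonly upstream: UpstreamLookup;
  upstreamQueries = 0; // counts how often the cache could not answer

  constructor(upstream: UpstreamLookup) {
    this.upstream = upstream;
  }

  resolve(hostname: string, now: number): string | undefined {
    // 1. Search the previous records first.
    const cached = this.cache.get(hostname);
    if (cached !== undefined && now < cached.expiresAt) {
      return cached.address;
    }
    // 2. Cache miss (or expired record): ask upstream and remember the answer.
    this.upstreamQueries += 1;
    const answer = this.upstream(hostname);
    if (answer === undefined) return undefined;
    this.cache.set(hostname, {
      address: answer.address,
      expiresAt: now + answer.ttlSeconds * 1000,
    });
    return answer.address;
  }
}

// Example usage with a fake upstream that knows a single name.
const fakeUpstream: UpstreamLookup = (hostname) =>
  hostname === "example.test" ? { address: "192.0.2.10", ttlSeconds: 60 } : undefined;

const demoResolver = new CachingResolver(fakeUpstream);
demoResolver.resolve("example.test", 0);      // goes upstream
demoResolver.resolve("example.test", 30_000); // answered from the cache
```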
3. Once it has the server address, the browser has to send its request to the server, which means establishing a connection first. That goes without saying.
So, based on the IP address and port, the browser starts to establish a TCP connection with the server. And when we see a TCP connection, we think of the three-way handshake.
Simplified version of the three-way handshake:
A: What are you looking at?
B: What do you think?
A: Give me some data.
Behind the joke, A and B each give the other a number: an initial sequence number, which the other side acknowledges.
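The real exchange behind the joke can be sketched as three segments. This is only a model of the sequence and acknowledgment numbers, assuming the hypothetical names TcpSegment and threeWayHandshake; a real TCP stack picks its initial sequence numbers unpredictably and handles retransmission, options, and much more.

```typescript
// Simplified view of a TCP segment: only the fields the handshake needs.
interface TcpSegment {
  from: "A" | "B";
  flags: string[];
  seq: number;
  ack?: number;
}

// Sequence numbers are 32-bit and wrap around, hence the ">>> 0".
function nextSeq(n: number): number {
  return (n + 1) >>> 0;
}

function threeWayHandshake(clientIsn: number, serverIsn: number): TcpSegment[] {
  // 1. A -> B: SYN carrying A's initial sequence number.
  const syn: TcpSegment = { from: "A", flags: ["SYN"], seq: clientIsn };
  // 2. B -> A: SYN+ACK, acknowledging A's number and announcing B's own.
  const synAck: TcpSegment = {
    from: "B",
    flags: ["SYN", "ACK"],
    seq: serverIsn,
    ack: nextSeq(clientIsn),
  };
  // 3. A -> B: ACK, acknowledging B's number. Data can flow after this.
  const ack: TcpSegment = {
    from: "A",
    flags: ["ACK"],
    seq: nextSeq(clientIsn),
    ack: nextSeq(serverIsn),
  };
  return [syn, synAck, ack];
}

// Example: A starts at 100, B starts at 300.
const demoSegments = threeWayHandshake(100, 300);
```

Each SYN consumes one sequence number, which is why every acknowledgment is the other side's initial number plus one.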
Of course, once the connection is established, the two sides can send data to each other, which brings us to the next question.
4. What does the browser do when it receives data from the server? A: It parses the data and generates the render tree
It renders the page, of course; the key is how. Let's take a look at the browser's steps in simplified form:
(1) Parse the HTML/SVG/XHTML and build the DOM tree.
(2) Parse the CSS and build the CSSOM tree.
(3) Build the render tree. The render tree is not equivalent to the DOM tree, because things like the head element or elements with display: none don't need to be in the render tree.
(4) Reflow: calculate where each element is displayed on the device.
(5) Paint: finally, draw by calling the API of the operating system's native GUI.
parse content
- The page is decoded according to the encoding information that comes back with the HTTP response.
- The charset in the HTTP Content-Type header (for example, Content-Type: text/html; charset=utf-8) has the highest priority;
- Next comes the page's own meta tag, specifically the charset part of its Content-Type information in the head. This mainly covers cases where the HTTP header does not specify an encoding, or where we open local files;
- If neither is found, browsers generally offer a menu option that lets the user specify the encoding. Some browsers, such as Firefox, can also detect the encoding automatically, using a statistics-based approach to settle an undetermined encoding.
- Once the encoding is determined, the page is decoded into a stream of Unicode characters for the HTML parser. Parsing proceeds character by character, and it runs while the download is still in progress.
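The priority order above can be sketched as a small function. The names charsetFromContentType, charsetFromMeta, and pickEncoding are hypothetical, the regular expressions are deliberately loose, and the fallback parameter stands in for the user's menu choice or the result of automatic detection; real browsers follow a more detailed algorithm.

```typescript
// Read the charset parameter from a Content-Type header value, if any.
function charsetFromContentType(value: string | undefined): string | undefined {
  if (value === undefined) return undefined;
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(value);
  return match?.[1]?.toLowerCase();
}

// Look for a charset declared by a meta tag near the start of the document.
// Covers both <meta charset="..."> and
// <meta http-equiv="Content-Type" content="text/html; charset=...">.
function charsetFromMeta(htmlPrefix: string): string | undefined {
  const match = /<meta[^>]*charset\s*=\s*["']?([^\s"'\/>;]+)/i.exec(htmlPrefix);
  return match?.[1]?.toLowerCase();
}

// Priority: HTTP header first, then the page's meta tag, then the fallback
// (assumed here to be a user choice or the result of auto-detection).
function pickEncoding(
  contentType: string | undefined,
  htmlPrefix: string,
  fallback: string,
): string {
  return (
    charsetFromContentType(contentType) ??
    charsetFromMeta(htmlPrefix) ??
    fallback
  );
}

// Example: the header says UTF-8, so the meta tag is not consulted.
const demoEncoding = pickEncoding(
  "text/html; charset=UTF-8",
  '<meta charset="gbk">',
  "windows-1252",
);
```

Once a label has been chosen, the standard TextDecoder API can turn the raw bytes into characters, provided the runtime supports that encoding label.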
build DOM tree
Conversion: The browser reads the raw bytes of the HTML and converts them to individual characters according to the specified encoding (such as UTF-8).
Tokenizing: The browser converts strings of characters into distinct tokens, such as the tags enclosed in angle brackets, as specified by the W3C HTML5 standard. Each token has a specific meaning and its own set of rules.
Lexing: The tokens are converted into objects that define their properties and rules.
DOM construction: Because HTML tags define relationships between one another, the objects created are linked into a tree data structure. The tree reflects the parent-child relationships defined in the original markup: the html object is the parent of the body object, the body object is the parent of the p object, and so on. Let's take a look at the picture to make it a little more intuitive:
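The tokenizing and tree-building steps can be sketched in miniature. This is a toy under heavy assumptions, not the HTML5 parsing algorithm: HtmlToken, DomNode, tokenize, and buildDom are hypothetical names, attributes are dropped, void elements, comments, and error recovery are ignored, and an end tag simply closes the most recently opened element.

```typescript
type HtmlToken =
  | { kind: "start"; tag: string }
  | { kind: "end"; tag: string }
  | { kind: "text"; text: string };

interface DomNode {
  tag: string; // element name, or "#text" / "#document"
  text?: string;
  children: DomNode[];
}

// Tokenizing: split the character stream into start tags, end tags, and text.
function tokenize(html: string): HtmlToken[] {
  const tokens: HtmlToken[] = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const tagName = match[2];
    const text = match[3];
    if (tagName !== undefined) {
      const tag = tagName.toLowerCase();
      tokens.push(match[1] === "/" ? { kind: "end", tag } : { kind: "start", tag });
    } else if (text !== undefined && text.trim() !== "") {
      tokens.push({ kind: "text", text: text.trim() });
    }
  }
  return tokens;
}

// DOM construction: a stack of open elements turns the flat token list
// into parent-child relationships.
function buildDom(tokens: HtmlToken[]): DomNode {
  const root: DomNode = { tag: "#document", children: [] };
  const stack: DomNode[] = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1] ?? root;
    if (token.kind === "start") {
      const node: DomNode = { tag: token.tag, children: [] };
      parent.children.push(node);
      stack.push(node); // later tokens become children of this node
    } else if (token.kind === "end") {
      if (stack.length > 1) stack.pop(); // close the innermost open element
    } else {
      parent.children.push({ tag: "#text", text: token.text, children: [] });
    }
  }
  return root;
}

// Example: html is the parent of body, body is the parent of p.
const demoDom = buildDom(tokenize("<html><body><p>Hello</p></body></html>"));
```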
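build CSSOM tree
The process is essentially the same as building the DOM tree: the CSS bytes are converted to characters, then to tokens, then to nodes, and the nodes are linked into a tree.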
build render tree
The browser needs both the DOM and the CSSOM at render time to construct the render tree. The general steps are as follows:
- Start at the root of the DOM tree and traverse every visible node.
- Some invisible elements, such as metadata tags and script tags, are skipped because they do not affect the rendered result of the page. Elements hidden through CSS are skipped as well, for example an element with display: none set. It is worth mentioning that visibility: hidden is different from display: none. The former hides the element on the page, but the element still occupies its original space in the final layout and is simply rendered blank. The latter removes the element from the render tree entirely, so the node takes no part in the final layout.
- For each visible node, find the matching style rules in the CSSOM and attach them to a newly created render tree node.
- The final output is the set of visible nodes together with their computed styles, that is, the final render tree. Again, look at the picture below:
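The traversal above can be sketched as a recursive filter. As a simplifying assumption, each input node already carries its computed style (the CSSOM matching step is not modelled), and the names StyledNode, RenderNode, NON_VISUAL_TAGS, and buildRenderTree are hypothetical.

```typescript
// A DOM node with its computed style already attached (an assumption of
// this sketch; a browser derives it by matching rules from the CSSOM).
interface StyledNode {
  tag: string;
  style: Record<string, string | undefined>;
  children: StyledNode[];
}

interface RenderNode {
  tag: string;
  style: Record<string, string | undefined>;
  children: RenderNode[];
}

// Elements that never produce visual output (an illustrative, partial list).
const NON_VISUAL_TAGS = new Set(["head", "meta", "script", "link", "title", "style"]);

function buildRenderTree(node: StyledNode): RenderNode | undefined {
  // Metadata and script elements are skipped.
  if (NON_VISUAL_TAGS.has(node.tag)) return undefined;
  // display: none removes the node and its whole subtree.
  if (node.style["display"] === "none") return undefined;
  // visibility: hidden is NOT filtered out: the node still takes up space,
  // so it stays in the render tree and is simply painted as blank.
  const children: RenderNode[] = [];
  for (const child of node.children) {
    const rendered = buildRenderTree(child);
    if (rendered !== undefined) children.push(rendered);
  }
  return { tag: node.tag, style: node.style, children };
}
```

Keeping the visibility: hidden node in the tree is what lets the later layout step reserve space for it.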
5. What happens after the render tree is generated? A: Reflow and paint
- Reflow (layout)
Now that we have the render tree, we know which nodes to display and what they look like. But at this point we don't yet know their exact position and size on the current device. That is exactly what the reflow phase does: starting from the root node, it recursively calculates the size and position of each element, and gives the exact coordinates at which each node should appear on the screen.
- Paint (draw)
Now that we know which nodes are visible, along with their styles, their geometry, and their precise location on the device, we are ready to draw. In the paint phase, the browser draws each pixel by calling the API of the operating system's native GUI, updates video memory, and sends the signal to the display, which then shows the result.
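The recursive calculation in the reflow step can be sketched with a toy block layout. This is an illustration only, assuming hypothetical names (LayoutInput, LayoutBox, layoutBlock) and a drastically reduced model: every box takes the full width of its parent, children stack vertically, and a box is as tall as its explicit height or, failing that, the sum of its children. Margins, padding, inline content, floats, and positioning are all ignored.

```typescript
interface LayoutInput {
  tag: string;
  height?: number; // explicit height in px; otherwise derived from children
  children: LayoutInput[];
}

interface LayoutBox {
  tag: string;
  x: number;
  y: number;
  width: number;
  height: number;
  children: LayoutBox[];
}

// Recursively compute the position and size of a node and its descendants,
// starting from the given top-left corner and available width.
function layoutBlock(node: LayoutInput, x: number, y: number, width: number): LayoutBox {
  const children: LayoutBox[] = [];
  let cursorY = y;
  for (const child of node.children) {
    const box = layoutBlock(child, x, cursorY, width);
    children.push(box);
    cursorY += box.height; // the next sibling starts below this one
  }
  const contentHeight = cursorY - y;
  return {
    tag: node.tag,
    x,
    y,
    width,
    height: node.height ?? contentHeight,
    children,
  };
}

// Example: lay out a body with two children in an 800px-wide viewport.
const demoLayout = layoutBlock(
  {
    tag: "body",
    children: [
      { tag: "p", height: 20, children: [] },
      { tag: "div", height: 30, children: [] },
    ],
  },
  0,
  0,
  800,
);
```

The output boxes carry exactly the information the paint phase needs: which rectangle on the screen each node occupies.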
Congratulations, now you can see the page! That brings our introduction to a close. If anything here is wrong, corrections are welcome, and if you see things differently, we would be glad to talk it over.