Browser 🌲 page rendering process

First published on Juejin. If you repost this article, please credit the original Juejin link.

Preface 🎤

Understanding how browsers render a page helps us optimize page loading, and it is an essential skill for a front-end developer. A smooth, polished page makes a good impression on users and can even help convert visitors into active users.

Rendering process

Loading a page in the browser involves the following steps:

DNS lookup
Establish a connection
Transfer the content
Parse the content
Render

DNS lookups

This process begins when the user enters a URL into the address bar, clicks a link, or submits a form. When the browser receives a URL, it first looks for the corresponding IP address in the local hosts file. If no entry is found there, it sends a request to a DNS server. After a successful lookup, the resulting IP address is cached for next time.

Typically, a page triggers one DNS lookup per hostname. A page that references many external resources on different domain names therefore triggers multiple DNS lookups. These lookups may have little noticeable impact on desktop users with stable networks, but on mobile networks every network round trip is slow.

A DNS lookup is also not a single step. You can think of it as a process of passing the buck. The request first goes to the direct DNS server, which is the DNS server configured automatically or manually in your network settings. If the direct DNS server has no cached answer, it forwards the query to one of the 13 root DNS servers. Note that these queries are made by the direct DNS server; the client only ever talks to that one server. A root server does not know the final answer, so it passes the buck: it replies with the DNS servers for the next level of the domain name. The direct DNS server then queries those servers, and so on down to the DNS server for juejin.cn, until the target address is found or the name is reported as nonexistent. Real DNS handling is more complicated than this (there is also a forwarding mode, for example), but that is beyond the scope of this article.
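The buck-passing lookup described above can be sketched as a small simulation. This is only an illustration, not real DNS: the `servers` table, the server names (`root`, `cn-tld`, `juejin-ns`), the `resolve` function, and the IP address (taken from a range reserved for documentation) are all hypothetical.

```javascript
// Hypothetical zone data: each server either knows the answer (records)
// or refers the resolver to the server for the next level (referrals).
const servers = {
  root: { referrals: { cn: 'cn-tld' } },
  'cn-tld': { referrals: { 'juejin.cn': 'juejin-ns' } },
  'juejin-ns': { records: { 'juejin.cn': '203.0.113.10' } },
};

// Cache kept by the direct DNS server.
const cache = new Map();

// Plays the role of the direct DNS server: it does all the querying
// on behalf of the client and returns the answer plus the servers it asked.
function resolve(hostname) {
  if (cache.has(hostname)) {
    return { ip: cache.get(hostname), hops: [] };
  }
  const hops = [];
  let current = 'root';
  while (current) {
    hops.push(current);
    const server = servers[current];
    const ip = server.records && server.records[hostname];
    if (ip) {
      cache.set(hostname, ip); // remember the answer for next time
      return { ip, hops };
    }
    // No answer here: follow the referral whose zone matches the hostname.
    const zone = Object.keys(server.referrals || {}).find(
      (z) => hostname === z || hostname.endsWith('.' + z)
    );
    current = zone ? server.referrals[zone] : null;
  }
  return { ip: null, hops }; // the name does not exist
}
```

Calling `resolve('juejin.cn')` a second time is answered from the cache and contacts no servers at all, which is why repeat lookups are cheap.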

Establish a connection

As we all know, browsers speak HTTP, which runs over TCP at the transport layer. Establishing a TCP connection is therefore the first step of everything.

TCP handshake 🤝

To communicate with a server, the first thing to do is open a TCP connection, using the classic three-way handshake:

C (SYN): Can you hear me?
S (SYN-ACK): Yes, can you hear me?
C (ACK): Yes.

TLS negotiation

Communicating over HTTPS also requires a TLS negotiation. The main steps are:

The client requests the public key from the server
Both sides generate a session key
Data is encrypted using the session key

A few points to clarify about the keys involved

The session key is a symmetric encryption key
The public key can be used to encrypt or decrypt
The private key can be used to encrypt or decrypt
Data encrypted with the public key can only be decrypted with the private key
Data encrypted with the private key can be decrypted with either the public or the private key, because the public key can be derived from the private key

In the first step, when the client requests the public key from the server, the public key is not sent bare. To prevent tampering, it is signed with the CA's private key to produce a certificate, and the certificate is what gets transmitted.

In detail, the handshake runs as follows.

1. The client sends a Client Hello containing: the protocol version, a random number generated by the client, the supported encryption methods, and the supported compression algorithms.
2. The server responds with: the protocol version to use, a random number generated by the server, the chosen encryption method, and the server certificate.

The client then verifies that the server certificate was issued by a trusted authority and shows a warning in the browser if it was not. If the certificate checks out, the client extracts the server public key from it. This completes the client's request for the server's public key.

At this point the client holds the client random number, the server random number, and the server public key.

3. The client responds to the server with: a pre-master key (PMK for short) encrypted with the server public key, a change-of-encoding notification (all subsequent messages are sent using the agreed encryption method and key), and a handshake-finished notification. Now the client and server both hold three random numbers (the PMK being the third) and use them to generate the session key.
4. The server returns the final response, and the handshake is complete.
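The key exchange above can be sketched with Node's built-in crypto module. This is a simplified illustration rather than real TLS: there is no certificate, and the `deriveSessionKey` helper is a hypothetical stand-in that uses a single HMAC-SHA256 instead of the key derivation function TLS actually specifies.

```javascript
const crypto = require('node:crypto');

// Server side: a key pair. In real TLS the public key reaches the client
// inside a CA-signed certificate; that part is skipped here.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

// Steps 1 and 2: both sides exchange random numbers in the clear.
const clientRandom = crypto.randomBytes(32);
const serverRandom = crypto.randomBytes(32);

// Step 3: the client generates the pre-master key and encrypts it with
// the server public key, so only the private key holder can read it.
const preMasterKey = crypto.randomBytes(48);
const encryptedPreMaster = crypto.publicEncrypt(publicKey, preMasterKey);

// The server recovers the pre-master key with its private key.
const decryptedPreMaster = crypto.privateDecrypt(privateKey, encryptedPreMaster);

// Hypothetical, simplified derivation: mix the three random values into
// one symmetric session key. Real TLS uses a different, specified function.
function deriveSessionKey(preMaster, cRandom, sRandom) {
  return crypto
    .createHmac('sha256', preMaster)
    .update(Buffer.concat([cRandom, sRandom]))
    .digest();
}

// Both sides now compute the same session key independently.
const clientSessionKey = deriveSessionKey(preMasterKey, clientRandom, serverRandom);
const serverSessionKey = deriveSessionKey(decryptedPreMaster, clientRandom, serverRandom);
```

The point of the sketch is that the session key itself never crosses the network: only the two random numbers (in the clear) and the encrypted pre-master key do.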

Transfer the content

Once the connection is established, the server starts transferring content, typically an HTML file. One detail matters here: TCP slow start, also known as the 14 KB rule.

TCP slow start / 14 KB

To avoid flooding the network the moment a new connection opens, TCP starts transmitting in slow-start mode. On the first round trip, the server sends at most 10 packets (about 14 KB in total, at roughly 1432 to 1452 bytes each). Once the client acknowledges them with an ACK, the server sends twice as much data, so the amount grows 14 KB -> 28 KB -> 56 KB -> ... until the response is complete. So if you want the page to load lightning fast, keep the home page's HTML document within 14 KB. With the rise of MVVM frameworks, a home page HTML document under 14 KB has become the norm.
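The doubling can be turned into a rough estimate of how many round trips a response needs. This is a back-of-the-envelope sketch that ignores packet loss and receiver limits; the function name `roundTripsNeeded` and the default segment size of 1440 bytes (a value inside the range quoted above) are assumptions made for illustration.

```javascript
// Estimate the number of round trips needed to deliver totalBytes
// when the sending window starts at 10 segments and doubles each trip.
function roundTripsNeeded(totalBytes, segmentBytes = 1440, initialSegments = 10) {
  let windowSegments = initialSegments;
  let sentBytes = 0;
  let roundTrips = 0;
  while (sentBytes < totalBytes) {
    sentBytes += windowSegments * segmentBytes; // send one full window
    windowSegments *= 2; // the window doubles after each acknowledged trip
    roundTrips += 1;
  }
  return roundTrips;
}
```

Under these assumptions a 14,000-byte document arrives in one round trip, a 40,000-byte one needs two, and a 100,000-byte one needs three.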

Congestion control

If the server sends too much data too quickly, packets may be dropped, and the client will not acknowledge them. When the expected ACK does not arrive, the server slows down its sending rate.

Parse the content

As soon as the browser receives the first chunk of data, it begins "speculative parsing". Parsing turns the document into the DOM and CSSOM, which are processed and wait to be painted. The browser does not wait for the whole HTML document to download; parsing starts as soon as the first fragment arrives. Before rendering can happen, however, the HTML, CSS, and JS all need to be parsed.

Build a DOM tree 🌲

Parsing starts with the HTML markup, which includes start and end tags, attribute names, and attribute values, and builds the DOM tree from it. As parsing proceeds, the elements formed by paired tags are added to the document to build the document tree. The DOM tree describes the content of the document, usually with the <html> element as the first and root node.

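When the parser finds non-blocking resources, such as images or, in some cases, CSS files, it requests them and keeps parsing. But when it encounters a <script> tag that has neither the async nor the defer attribute, it blocks, and HTML parsing stops until the script has been fetched and executed.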

Preload scanner

Building the DOM tree occupies the main thread. While the main thread is busy, the preload scanner looks ahead through the available content for high-priority resources, such as CSS, JS, or web fonts, and requests them. Fetching CSS usually does not block HTML downloading and parsing, but it does block JS. For example, a script that appears after a stylesheet link tag has to wait for that stylesheet, which blocks the parser.

Build the CSSOM tree 🌲

The CSSOM tree and the DOM tree are completely separate trees. Building the CSSOM is very fast, often taking less time than a single DNS lookup.

JS compilation

The script file is parsed, compiled (or interpreted), and then run.

Render

When everything is ready, rendering begins. The rendering steps are style, layout, and paint. In some cases there is also a compositing step, which may trigger hardware acceleration on the GPU.

Style

The DOM and CSSOM trees are merged into the render tree 🌲. Any node with display: none is ignored and does not appear in the render tree, while nodes with visibility: hidden stay in the tree, because they still take up space and are merely invisible to the user.
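The filtering rule can be sketched on a toy tree. This is not how a browser engine is implemented; the node shape `{ tag, style, children }` and the function `buildRenderTree` are hypothetical and only illustrate which nodes survive the merge.

```javascript
// Build a simplified render tree: nodes with display: none are dropped
// together with their whole subtree, everything else is kept.
function buildRenderTree(node) {
  if (node.style && node.style.display === 'none') {
    return null; // not rendered at all
  }
  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  // visibility: hidden nodes are kept: they still occupy space in layout.
  return { tag: node.tag, style: node.style || {}, children };
}

// Example input: a tiny hand-written "DOM with computed styles".
const exampleDom = {
  tag: 'body',
  children: [
    { tag: 'p', style: { visibility: 'hidden' } },
    { tag: 'div', style: { display: 'none' }, children: [{ tag: 'span' }] },
    { tag: 'h1' },
  ],
};

const exampleRenderTree = buildRenderTree(exampleDom);
```

In `exampleRenderTree` the hidden `p` is still present, while the `div` and the `span` inside it are gone.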

Layout

This step calculates the size and position of each element. The layout of the page may change later, for example when an image finishes loading, which causes a reflow.

Paint

The final step is painting, which draws each node onto the screen and applies visual styles such as colors, borders, and shadows. To keep things smooth, each frame has to be produced within 16.67 ms, which corresponds to at least 60 frames per second. To keep repaints fast, what is drawn on screen is often split into multiple layers, which are combined when necessary. Under certain conditions, some elements are promoted to their own layer and painted on the GPU.
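The 16.67 ms figure is simply one second divided by 60 frames. A small sketch makes the arithmetic explicit and estimates how many frames a long-running task costs; the helper names `frameBudgetMs` and `framesMissed` are made up for this illustration.

```javascript
// Time available to produce one frame at a given refresh rate.
function frameBudgetMs(fps) {
  return 1000 / fps;
}

// Rough count of frames skipped when a single task keeps the main
// thread busy for workMs: a task that fits in one frame misses none.
function framesMissed(workMs, fps = 60) {
  return Math.max(0, Math.ceil(workMs / frameBudgetMs(fps)) - 1);
}
```

For example, at 60 frames per second a 10 ms task fits inside one frame, while a 40 ms task spills over into two additional frames.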

Compositing

When multiple layers have been painted and overlap each other, they need to be composited in the correct order so that they appear correctly on screen. If a reflow is triggered, the paint and compositing steps run again. So setting the width and height of an image in advance reduces the reflows and repaints the browser has to do.

Interaction

Even once the page has been painted, it may still not be interactive. For example, if the main thread is blocked by some JS, no new macrotask can be processed, so the page cannot respond to interaction. (Every interaction is handled as a macrotask.)

Optimization

There are a lot of things that can be optimized during page loading.

dns-prefetch

When the browser loads several external resources that are not on the same domain, each new domain triggers another DNS lookup. To resolve DNS ahead of time and speed up loading, you can add a <link> tag with the rel attribute set to dns-prefetch. For example:

<link rel="dns-prefetch" href="https://fonts.googleapis.com/"> 

You can also use it together with preconnect. For example:

<link rel="preconnect" href="https://fonts.gstatic.com/" crossorigin>
<link rel="dns-prefetch" href="https://fonts.gstatic.com/">

This is not without drawbacks: if the page needs to open a lot of connections, preconnecting can be counterproductive. So why declare both? Because browsers that do not support preconnect can still fall back to dns-prefetch. And a browser that does not support dns-prefetch either simply gains nothing; the hint will not cause problems on the page.
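If the hints are generated by a template or build script, the pairing can be wrapped in a small helper. The sketch below is one possible way to do it; the function name `resourceHints` and its `crossorigin` option are hypothetical, not part of any library.

```javascript
// Produce the preconnect + dns-prefetch pair for one origin.
// crossorigin is only needed for resources fetched with CORS, such as fonts.
function resourceHints(origin, { crossorigin = false } = {}) {
  const corsAttribute = crossorigin ? ' crossorigin' : '';
  return [
    `<link rel="preconnect" href="${origin}"${corsAttribute}>`,
    `<link rel="dns-prefetch" href="${origin}">`,
  ].join('\n');
}
```

Calling `resourceHints('https://fonts.gstatic.com/', { crossorigin: true })` yields the same two tags shown above. Since preconnecting too widely can backfire, a helper like this is best applied only to the few origins the page is certain to use.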

Animation

CSS

CSS animation can be done with transitions or animations. A transition describes the start and end states of an element. An animation, on the other hand, lets the developer control what happens between the two states, such as the speed or intermediate keyframes.

requestAnimationFrame

requestAnimationFrame provides an efficient way to animate with JS. Its advantages are well covered elsewhere and will not be repeated here.

It is important to note that both CSS animations and requestAnimationFrame are paused when the page is not visible, or when the browser is running in the background.
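A typical requestAnimationFrame loop looks roughly like the sketch below. The helper `animate` and its parameters are hypothetical; the scheduler is passed in as the `raf` argument (defaulting to the browser's requestAnimationFrame) purely so the function can also be run outside a browser.

```javascript
// Call draw(progress) once per frame, with progress going from 0 to 1
// over durationMs. The timestamp is supplied by the scheduler.
function animate(draw, durationMs, raf = globalThis.requestAnimationFrame) {
  let startTime = null;
  function frame(now) {
    if (startTime === null) startTime = now; // first frame marks the start
    const progress = Math.min((now - startTime) / durationMs, 1);
    draw(progress);
    if (progress < 1) raf(frame); // schedule the next frame until done
  }
  raf(frame);
}
```

In a page this might be used as `animate(p => { box.style.opacity = p; }, 300)`, where `box` is some element you have already selected. Because progress is computed from the timestamp rather than from a frame count, the animation simply resumes at the right point after the tab has been in the background.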

Performance

In general, the two approaches perform about the same, but CSS animations are the more recommended choice. If the property you animate does not trigger a relayout, the element can be promoted to its own compositing layer and drawn by the GPU. The animation then runs off the main thread.

Observing elements

In some scenarios you need to determine whether an element is visible. The common approach is to call getBoundingClientRect frequently and make the decision from the returned geometry. When many elements need checking, this costs performance. A typical case is a page with infinite scrolling. Here you can optimize with Intersection Observer: it runs a callback you specify when the target element intersects the viewport or some other specified element.

// Runs whenever an observed target crosses the configured threshold.
// It must be defined before it is passed to the observer.
const callback = (entries, observer) => {
  entries.forEach(entry => {
    // Each entry describes an intersection change for one observed
    // target element:
    // entry.boundingClientRect
    // entry.intersectionRatio
    // entry.intersectionRect
    // entry.isIntersecting
    // entry.rootBounds
    // entry.target
    // entry.time
  });
};

const options = {
    root: document.querySelector('#scrollArea'), // element used as the viewport
    rootMargin: '0px',
    threshold: 1.0 // fire only when the target is fully visible
};

const observer = new IntersectionObserver(callback, options);
const target = document.querySelector('#listItem');
observer.observe(target);
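For comparison, the manual check that Intersection Observer replaces can be sketched as a pure function over the rectangle returned by getBoundingClientRect. The function name `isFullyVisible` is made up for this example, and it only covers the simple case where the viewport is the browser window.

```javascript
// rect has the shape returned by getBoundingClientRect:
// top, left, bottom, right measured from the viewport's top-left corner.
function isFullyVisible(rect, viewportWidth, viewportHeight) {
  return (
    rect.top >= 0 &&
    rect.left >= 0 &&
    rect.bottom <= viewportHeight &&
    rect.right <= viewportWidth
  );
}
```

In a page this would be called as `isFullyVisible(el.getBoundingClientRect(), window.innerWidth, window.innerHeight)` inside a scroll handler, once per element per scroll event, which is exactly the repeated work the observer avoids.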

However, if all you want is lazy loading of images, you can simply add loading="lazy" to the <img> tag.