Explains the page loading process in great detail

❝

A typical interview question: “What happens between entering a URL and the page finishing loading?” You will find that companies big and small all ask it. Why?

Because it measures not only the breadth but also the depth of the candidate’s knowledge.

❞

Preface

If this article helps you, ❤️ follow + like ❤️ to encourage the author. Articles are published first on the public account “front-end South Jiu”; follow it to get the latest articles as soon as they come out ~

In the previous article, “These browser interview questions, see how many you can answer?”, I also mentioned this classic interview question. Let’s take a closer look.

What happens from entering the URL to rendering the page? (Knowledge point)

❝

This is arguably one of the most common interview questions, and also one that can be made as difficult as the interviewer likes. Interviewers usually ask it to examine the depth and breadth of your front-end knowledge.

❞

1. The browser accepts the URL and opens a network request thread (involving: browser mechanisms, threads and processes, etc.)

2. From starting the network thread to sending a complete HTTP request (involving: DNS resolution, TCP/IP requests, the five-layer network protocol model, etc.)

3. From the server receiving the request to the corresponding back end receiving it (involving: load balancing, security interception, back-end internal processing, etc.)

4. HTTP interaction between the back end and the front end (involving: HTTP headers, response codes, packet structures, cookies, etc.)

5. Caching (involving: HTTP strong caching, negotiated caching, etc.) (see the previous article, “These browser interview questions, see how many you can answer?”, juejin.cn/post/702653…)

6. The parsing process after the browser receives the HTTP data packets (involving: HTML lexical analysis and parsing into a DOM tree, parsing CSS into a CSSOM tree, and merging the two into a render tree; then layout, painting and rendering, composite layer composition, GPU drawing, etc.)

Enter the URL in the browser address bar

When we enter a URL in the browser’s address bar, the browser opens a thread to parse the URL.

The browser’s processes and their roles (multi-process architecture):

Browser process: responsible for creating and destroying tab pages, displaying pages, downloading resources, etc.
Third-party plug-in process: manages third-party plug-ins.
GPU process: responsible for 3D drawing and hardware acceleration (at most one).
Rendering process: responsible for parsing the page document (HTML, CSS, JS), executing it, and rendering it. (There can be more than one.)

DNS Domain name Resolution

Why is DNS domain name resolution required?

The URL we type into the browser is usually a domain name rather than an IP address (purely because domain names are easier to remember than IPs). The network, however, does not route by domain name; it only works with IP addresses, so this step is needed to resolve the domain name into an IP.

URL components

Protocol: the protocol (scheme), such as HTTP, HTTPS, or FTP.
Host: the host domain name or IP address.
Port: the port number.
Path: the directory path.
Query: the query parameters.
Hash: the hash value following #, used to locate a position in the page.
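To make the components above concrete, here is a small sketch that splits a URL with the standard `URL` class. The address, the `UrlParts` interface, and the `splitUrl` helper are made up for illustration.

```typescript
// The six components listed above, as returned by the standard URL class.
interface UrlParts {
  protocol: string;
  host: string;
  port: string;
  path: string;
  query: string;
  hash: string;
}

// Hypothetical helper: parse a URL string and pick out its components.
function splitUrl(input: string): UrlParts {
  const url = new URL(input);
  return {
    protocol: url.protocol, // includes the trailing colon
    host: url.hostname,
    port: url.port, // empty string when the scheme's default port is used
    path: url.pathname,
    query: url.search, // includes the leading "?"
    hash: url.hash, // includes the leading "#"
  };
}

const exampleParts = splitUrl("https://example.com:8080/docs/index.html?lang=en#intro");
console.log(exampleParts);
```

Note that the hash is never sent to the server; the browser only uses it locally to locate a position in the page.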

The resolution process

First check the browser’s DNS cache; if there is a hit, use it directly.
If not, check the local computer’s DNS cache (localhost).
If there is still no hit, ask the recursive DNS server (i.e. the network provider’s; this server usually has its own cache).
If that cache misses as well, the query goes through the root DNS server and the TLD name server to the authoritative DNS server, which holds the record. The record is cached on the recursive server, which then returns it to the local machine.
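The lookup order above can be sketched as a chain of caches. This is a toy model, not how a real resolver is implemented: the caches are plain objects, `resolveDomain` and `cachedRecord` are hypothetical helpers, the IP is a documentation address, and a single table stands in for the root, TLD, and authoritative servers.

```typescript
type DnsCache = Record<string, string>;

interface DnsAnswer {
  ip: string;
  source: string;
}

// Read a record from a cache, or undefined on a miss.
function cachedRecord(cache: DnsCache, domain: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(cache, domain) ? cache[domain] : undefined;
}

function resolveDomain(
  domain: string,
  browserCache: DnsCache,
  osCache: DnsCache,
  recursiveCache: DnsCache,
  authoritative: DnsCache, // stands in for the root -> TLD -> authoritative chain
): DnsAnswer | undefined {
  const fromBrowser = cachedRecord(browserCache, domain);
  if (fromBrowser !== undefined) return { ip: fromBrowser, source: "browser DNS cache" };

  const fromOs = cachedRecord(osCache, domain);
  if (fromOs !== undefined) return { ip: fromOs, source: "local DNS cache" };

  const fromRecursive = cachedRecord(recursiveCache, domain);
  if (fromRecursive !== undefined) return { ip: fromRecursive, source: "recursive DNS server" };

  const fromAuthoritative = cachedRecord(authoritative, domain);
  if (fromAuthoritative === undefined) return undefined; // no such record
  recursiveCache[domain] = fromAuthoritative; // the recursive server caches the record
  return { ip: fromAuthoritative, source: "authoritative DNS server" };
}

const recursive: DnsCache = {};
const authority: DnsCache = { "example.com": "192.0.2.1" };
console.log(resolveDomain("example.com", {}, {}, recursive, authority)); // answered by the authoritative server
console.log(resolveDomain("example.com", {}, {}, recursive, authority)); // answered from the recursive server's cache
```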

“⚠️ Note:”

❝

DNS resolution takes time. If a page needs to resolve too many domain names, page performance suffers. Consider optimizing with DNS prefetching, or reducing the number of DNS lookups.

❞

Sending an HTTP request

Once you have the IP address, you can make an HTTP request. Sending an HTTP request essentially means establishing a TCP/IP connection first. Establishing the connection requires a “three-way handshake”, and closing it requires a “four-way wave”; both exist to ensure reliable transmission.

Three-way handshake

First handshake: the client sends the server a packet with the SYN flag set (SYN = 1) and a randomly generated sequence number Seq = J. From SYN = 1 the server knows that the client is asking to establish a connection.
Second handshake: after receiving the request, the server confirms the connection information and sends the client a packet with SYN = 1 and ACK = 1 (both flags set), acknowledgment number Ack = J+1 (the client’s Seq + 1), and a randomly generated sequence number Seq = K.
Third handshake: after receiving the packet, the client checks that the acknowledgment number is correct, that is, the Seq it sent in the first handshake plus one (J+1), and that the ACK flag is 1. If so, the client sends a confirmation packet with ACK = 1, acknowledgment number Ack = K+1 (the server’s Seq + 1), and sequence number Seq = J+1 (the acknowledgment number the server sent). When the server receives it and confirms that the acknowledgment number is K+1 and the ACK flag is set, the connection is established.

“Plain understanding:”

Client: Hello, are you the server? Server: Hello, I am the server. Are you the client? Client: Yes, I am the client. Once the connection is established, the next step is to formally transfer data.
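The exchange of sequence and acknowledgment numbers can be sketched as plain data. This is only a model of the three segments (no sockets are opened, and 32-bit wrap-around of sequence numbers is ignored); `TcpSegment`, `threeWayHandshake`, and `isEstablished` are names made up for the example.

```typescript
interface TcpSegment {
  from: "client" | "server";
  SYN: 0 | 1;
  ACK: 0 | 1;
  seq: number;
  ack?: number; // acknowledgment number, only present when the ACK flag is set
}

// Build the three segments for initial sequence numbers J (client) and K (server).
function threeWayHandshake(J: number, K: number): TcpSegment[] {
  const syn: TcpSegment = { from: "client", SYN: 1, ACK: 0, seq: J };
  const synAck: TcpSegment = { from: "server", SYN: 1, ACK: 1, seq: K, ack: syn.seq + 1 };
  const ack: TcpSegment = { from: "client", SYN: 0, ACK: 1, seq: J + 1, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

// The checks each side performs before treating the connection as established.
function isEstablished(segments: TcpSegment[]): boolean {
  if (segments.length !== 3) return false;
  const [syn, synAck, ack] = segments;
  return (
    syn.SYN === 1 &&
    synAck.SYN === 1 && synAck.ACK === 1 && synAck.ack === syn.seq + 1 &&
    ack.ACK === 1 && ack.ack === synAck.seq + 1
  );
}

const handshake = threeWayHandshake(100, 300);
console.log(handshake, isEstablished(handshake));
```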

Four-way wave

The client sends a FIN packet with Seq = M (FIN flag set, sequence number M) to close data transfer from the client to the server.
When the server receives the FIN, it replies with an ACK whose acknowledgment number is the received sequence number plus one (M+1).
The server closes its side of the connection and sends a FIN packet with Seq = N to the client.
The client replies with an ACK packet for confirmation; its acknowledgment number is N+1.

“Plain understanding:”

Active side: I have closed my sending channel to you, so from now on I can only receive. Passive side: I got the message that your channel is closed. Passive side: Let me tell you too, my sending channel to you is now closed as well. Active side: Got it. (After this, neither side can communicate with the other.)
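The four steps can be modelled the same way. The sketch below tracks which side may still send after each segment; the names (`CloseSegment`, `fourWayWave`, `channelStateAfter`) are hypothetical and nothing touches a real connection.

```typescript
interface CloseSegment {
  from: "client" | "server";
  flag: "FIN" | "ACK";
  seq?: number; // set on FIN segments
  ack?: number; // set on ACK segments
}

interface ChannelState {
  clientCanSend: boolean;
  serverCanSend: boolean;
}

// The four segments for sequence numbers M (client FIN) and N (server FIN).
function fourWayWave(M: number, N: number): CloseSegment[] {
  return [
    { from: "client", flag: "FIN", seq: M },
    { from: "server", flag: "ACK", ack: M + 1 },
    { from: "server", flag: "FIN", seq: N },
    { from: "client", flag: "ACK", ack: N + 1 },
  ];
}

// A FIN closes only the sender's own direction; the other direction stays open.
function channelStateAfter(segments: CloseSegment[]): ChannelState {
  const state: ChannelState = { clientCanSend: true, serverCanSend: true };
  for (const segment of segments) {
    if (segment.flag === "FIN") {
      if (segment.from === "client") state.clientCanSend = false;
      else state.serverCanSend = false;
    }
  }
  return state;
}

const wave = fourWayWave(10, 50);
console.log(channelStateAfter(wave.slice(0, 2))); // half-closed: only the server can still send
console.log(channelStateAfter(wave)); // both directions closed
```

This is why closing takes four segments rather than three: after the second step the server may still have data to send, so its FIN cannot always be merged with its ACK.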

Five-layer network protocol model

1. Application layer (DNS, HTTP): DNS resolves the domain name to an IP, and the HTTP request is sent;

2. Transport layer (TCP, UDP): establishes the TCP connection (three-way handshake);

3. Network layer (IP, ARP): IP addressing;

4. Data link layer (PPP): encapsulates the data into frames;

5. Physical layer (transmits the bit stream over a physical medium): physical transmission (through twisted pair, electromagnetic waves, and other media).

“OSI seven-layer model: physical layer, data link layer, network layer, transport layer, session layer, presentation layer, application layer”
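One way to picture the layers is as nested envelopes: on the way down, each layer wraps the data from the layer above with its own header. The sketch below uses placeholder strings such as `[TCP]` instead of real binary headers, and `protocolStack`, `encapsulate`, and `decapsulate` are names invented for the example.

```typescript
interface ProtocolLayer {
  layer: string;
  header: string;
  trailer: string;
}

// Ordered from the layer just below the application down to the data link layer.
const protocolStack: ProtocolLayer[] = [
  { layer: "transport", header: "[TCP]", trailer: "" },
  { layer: "network", header: "[IP]", trailer: "" },
  { layer: "data link", header: "[FRAME]", trailer: "[FCS]" },
];

// Sending side: wrap the application data layer by layer.
function encapsulate(data: string): string {
  return protocolStack.reduce((payload, l) => l.header + payload + l.trailer, data);
}

// Receiving side: peel the layers off in reverse order, outermost first.
function decapsulate(frame: string): string {
  return protocolStack
    .slice()
    .reverse()
    .reduce((payload, l) => payload.slice(l.header.length, payload.length - l.trailer.length), frame);
}

const frame = encapsulate("GET / HTTP/1.1");
console.log(frame); // [FRAME][IP][TCP]GET / HTTP/1.1[FCS]
console.log(decapsulate(frame)); // GET / HTTP/1.1
```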

The server receives the request and responds

The HTTP request arrives at the server, and the server processes it. Finally, the server passes the data back to the browser; that is, it returns a network response.

Like the request, a network response has three parts: the response line, the response headers, and the response body.

What happens when the response is complete? Is the TCP connection closed?

Not necessarily. If the request headers or response headers contain Connection: keep-alive, a persistent connection has been established. The TCP connection then stays alive, and later requests for resources from the same site reuse it. (In HTTP/1.1, connections are persistent by default unless Connection: close is sent.) Otherwise, the TCP connection is closed and the request-response process ends.
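The rule just described (keep-alive in either the request or the response headers) can be written as a small check. This is a simplified sketch: headers are plain objects, and `HeaderMap`, `headerValue`, and `staysOpen` are hypothetical names, not part of any HTTP library.

```typescript
type HeaderMap = Record<string, string>;

// Header names are case-insensitive, so compare them in lower case.
function headerValue(headers: HeaderMap, headerName: string): string | undefined {
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === headerName.toLowerCase()) return headers[key];
  }
  return undefined;
}

// True when either side asked for a persistent connection.
function staysOpen(request: HeaderMap, response: HeaderMap): boolean {
  const values = [headerValue(request, "Connection"), headerValue(response, "Connection")];
  return values.some(v => v !== undefined && v.trim().toLowerCase() === "keep-alive");
}

console.log(staysOpen({ Connection: "keep-alive" }, { "Content-Type": "text/html" }));
console.log(staysOpen({ Connection: "close" }, {}));
```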

Status code

The status code consists of three digits. The first digit defines the category of the response and has five possible values:

1xx: informational – the request has been received and processing continues.
2xx: success – the request has been successfully received, understood, and accepted.
3xx: redirect – further action must be taken to complete the request.
4xx: client error – the request has a syntax error or cannot be fulfilled.
5xx: server error – the server failed to fulfill a valid request.

Common status codes are: 200, 204, 301, 302, 304, 400, 401, 403, 404, 422, 500 (please look up what they mean yourself).
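Since the category depends only on the first digit, classifying a status code takes one division. A minimal sketch, with `StatusClass` and `classifyStatus` as made-up names:

```typescript
type StatusClass = "informational" | "success" | "redirect" | "client error" | "server error";

// Map a three-digit status code to its category; anything else is undefined.
function classifyStatus(code: number): StatusClass | undefined {
  if (Math.floor(code) !== code) return undefined; // not an integer
  switch (Math.floor(code / 100)) {
    case 1: return "informational";
    case 2: return "success";
    case 3: return "redirect";
    case 4: return "client error";
    case 5: return "server error";
    default: return undefined;
  }
}

for (const code of [200, 304, 404, 500]) {
  console.log(code, classifyStatus(code));
}
```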

The server returns the corresponding file

After the request succeeds, the server returns the corresponding web page, and the browser starts to download the web page after receiving the successful response packet. At this point, the network communication ends.

The browser parses and renders the page

How does the browser render the page on the screen after it receives the HTML, CSS, and JS files?

Parsing HTML to build a DOM Tree

When the browser gets the web page returned from the server, it first chooses the parsing mode according to the DTD type declared at the top of the document; the parsing itself is handed to the browser’s internal GUI rendering thread.

“DTD: Document Type Definition”

Common document type definitions

// HTML5
<!DOCTYPE html>
// HTML 4.01 Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
// HTML 4.01 Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
// XHTML 1.0 Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
// XHTML 1.0 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The job of the HTML interpreter is to turn HTML pages or resources, fetched from the network or from local disk, from a byte stream into a DOM tree 🌲 structure.

The figure above makes this process clear: first comes the byte stream, which is decoded into a character stream; the lexical analyzer then turns the characters into words (tokens); the parser builds nodes from the tokens; and finally the nodes are assembled into a DOM tree.
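Here is a deliberately tiny sketch of the last three stages (characters to tokens, tokens to nodes, nodes to a tree). It handles only bare tags such as `<p>` and text, with no attributes, comments, or error recovery, so it is nowhere near a real HTML parser. `HtmlToken`, `TreeNode`, `tokenize`, and `buildTree` are names invented for the example.

```typescript
// Tokens produced by the lexical analysis step.
type HtmlToken =
  | { kind: "open"; tag: string }
  | { kind: "close"; tag: string }
  | { kind: "text"; value: string };

// Nodes of the resulting tree (not the browser's real DOM types).
interface TreeNode {
  tag: string;
  text: string;
  children: TreeNode[];
}

// Lexical analysis: character stream -> tokens.
function tokenize(html: string): HtmlToken[] {
  const tokens: HtmlToken[] = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    if (match[2] !== undefined) {
      if (match[1] === "/") tokens.push({ kind: "close", tag: match[2] });
      else tokens.push({ kind: "open", tag: match[2] });
    } else if (match[3].trim() !== "") {
      tokens.push({ kind: "text", value: match[3].trim() });
    }
  }
  return tokens;
}

// Parsing: tokens -> nodes -> tree, using a stack of currently open elements.
function buildTree(tokens: HtmlToken[]): TreeNode {
  const root: TreeNode = { tag: "#document", text: "", children: [] };
  const openElements: TreeNode[] = [root];
  for (const token of tokens) {
    const current = openElements[openElements.length - 1];
    if (token.kind === "open") {
      const node: TreeNode = { tag: token.tag, text: "", children: [] };
      current.children.push(node);
      openElements.push(node);
    } else if (token.kind === "close") {
      if (openElements.length > 1 && current.tag === token.tag) openElements.pop();
    } else {
      current.children.push({ tag: "#text", text: token.value, children: [] });
    }
  }
  return root;
}

const tree = buildTree(tokenize("<html><body><p>Hello</p></body></html>"));
console.log(JSON.stringify(tree, null, 2));
```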

From a threading point of view, the whole process of interpretation, layout, and rendering after the character stream is basically managed by a single rendering thread (though not absolutely). Since the DOM tree can only be created and accessed on the rendering thread, building the DOM tree can only happen there. However, the step from character stream to words (tokens) can be handed to a separate thread, which is the idea Chrome uses. After the words have been produced, WebKit passes them back to the rendering thread in batches.

During this process, if the node encountered is JS code, the JS engine is called to interpret and execute it. Because the JS engine and the GUI rendering thread are mutually exclusive, the GUI rendering thread is suspended and rendering stops. If the JS code modifies the DOM tree while it runs, the DOM has to be rebuilt to reflect those changes.

If a node depends on other resources, such as images or CSS, the network module’s resource loader is called to load them. These loads are asynchronous and do not block the current DOM tree build.

If a JS resource URL is encountered (without an async or defer attribute), building the current DOM has to stop until the JS resource has been loaded and executed by the JS engine.

Parse the CSS to build the CSSOM Tree

The CSS interpreter interprets the CSS file into an internal representation and generates a CSS rule tree. This process is similar to DOM parsing: CSS bytes are converted into characters, followed by lexical analysis and syntax analysis, and finally the tree structure of the CSS Object Model (CSSOM) is formed.

Build a Render Tree

After the DOM tree and the CSSOM tree are built, they are merged into the render tree, which contains only the nodes needed to render the web page. The render tree is then used to calculate the layout of each visible element, and the result is passed to the paint stage, which renders the pixels on the screen.
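A rough sketch of the merge: walk the DOM, look up each node’s style, and leave out nodes that do not need rendering (here, anything with `display: none`, together with its subtree). Styles are matched by tag name only, which is a big simplification of real CSS matching, and all type and function names are hypothetical.

```typescript
interface DomLikeNode {
  tag: string;
  children: DomLikeNode[];
}

interface StyleRule {
  display?: string;
  color?: string;
}

// A stand-in for the CSSOM: one rule per tag name.
type StyleTable = Record<string, StyleRule>;

interface RenderNode {
  tag: string;
  style: StyleRule;
  children: RenderNode[];
}

function buildRenderTree(node: DomLikeNode, styles: StyleTable): RenderNode | undefined {
  const style: StyleRule = Object.prototype.hasOwnProperty.call(styles, node.tag) ? styles[node.tag] : {};
  if (style.display === "none") return undefined; // not rendered, so the whole subtree is skipped
  const children: RenderNode[] = [];
  for (const child of node.children) {
    const rendered = buildRenderTree(child, styles);
    if (rendered !== undefined) children.push(rendered);
  }
  return { tag: node.tag, style, children };
}

const domTree: DomLikeNode = {
  tag: "body",
  children: [
    { tag: "h1", children: [] },
    { tag: "aside", children: [{ tag: "p", children: [] }] },
    { tag: "p", children: [] },
  ],
};
const cssom: StyleTable = { h1: { color: "red" }, aside: { display: "none" } };
console.log(JSON.stringify(buildRenderTree(domTree, cssom), null, 2));
```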

Rendering (layout, paint, composite)

Compute CSS styles;
Build the render tree;
Layout: mainly determine coordinates and sizes, line breaks, and the various position, overflow, and z-index properties;
Paint: draw the image.

The process is complicated and involves two concepts: reflow and repaint. Every node in the DOM exists as a box model, and the browser has to calculate its position and size; this process is called reflow. Once the position, size, and other properties of the box model, such as color and font, are determined, the browser begins to draw the content; this process is called repaint. A page inevitably goes through reflow and repaint when it first loads. Both can be very expensive, especially on mobile devices, and can hurt the user experience, sometimes making the page stutter. So we should keep reflow and repaint to a minimum.

There is a difference between reflow and repaint:

(1) Reflow. This typically means that the content, structure, position, or size of an element has changed, requiring styles and the render tree to be recalculated.

(2) Repaint. This means that a change only affects the appearance of the element (for example, the background color, border color, or text color), so the browser simply applies the new style to the element.

Reflow costs more than repaint, and the reflow of one node often triggers the reflow of its child nodes and sibling nodes, so optimization plans generally include avoiding reflow as far as possible.

“Reflow always leads to repaint, but repaint does not necessarily lead to reflow.”
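As a rough mental model, you can sort style changes by the kind of work they trigger. The two property lists below are illustrative and far from exhaustive (the actual decision is made by the browser engine), and `workFor` is a made-up helper.

```typescript
// Properties that change geometry, so the layout must be recalculated (illustrative list).
const layoutProps = ["width", "height", "margin", "padding", "font-size", "display"];
// Properties that only change appearance (the examples named in the text above).
const paintProps = ["color", "background-color", "border-color"];

type RenderWork = "reflow + repaint" | "repaint only" | "unknown";

function workFor(changedProps: string[]): RenderWork {
  // Reflow always brings a repaint with it, so geometry changes win.
  if (changedProps.some(p => layoutProps.indexOf(p) !== -1)) return "reflow + repaint";
  if (changedProps.some(p => paintProps.indexOf(p) !== -1)) return "repaint only";
  return "unknown";
}

console.log(workFor(["color"]));
console.log(workFor(["color", "width"]));
```

The ordering of the two checks encodes the quoted rule: a change that causes reflow also causes repaint, but not the other way around.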

“Composite”

The final step is composite, where the browser sends each layer of information to the GPU, which then composes the layers and displays them on the screen

Normal and composite layers

To put it simply, the layers rendered by the browser fall into two main categories: normal layers and composite layers.

First of all, the normal document flow can be understood as one composite layer (called the default composite layer here; no matter how many elements are added to it, they all sit in the same composite layer).

Second, absolute positioning (and likewise fixed) takes an element out of the normal document flow, but the element still belongs to the default composite layer.

Then, hardware acceleration can be used to declare a new composite layer, which is allocated resources separately (and is of course taken out of the normal document flow, so whatever happens in that composite layer does not trigger reflow or repaint in the default composite layer).

To put it simply: “On the GPU, each composite layer is drawn separately, so the layers do not affect each other”, which is why hardware acceleration works so well in some scenarios.

You can see this in Chrome DevTools -> More Tools -> Rendering -> Layer Borders: the yellow borders are the composite layer information.
