What happens when you enter a URL in the browser

General overview

Broadly, the process can be divided into six steps. Each step can be expanded in detail, but this article gives only an overview of each.

DNS Domain Name Resolution

On the internet, you can remember a website's name, but its IP address is hard to remember, so you need an address book: the DNS server. The DNS system is highly available, highly concurrent, and distributed. Its servers are organized as a tree with three levels:

Root DNS server: Returns the IP address of the top-level DNS server
Top-level domain DNS server: Returns the IP address of the authoritative DNS server
Authoritative DNS server: Returns the IP address of the corresponding host

During a DNS lookup, the query between the client and the local DNS server is recursive; the queries between the local DNS server and the root, top-level domain, and authoritative servers are iterative.

Recursive process:

After the user enters the URL, the client searches in order: browser cache -> local hosts file -> local DNS resolver cache -> local DNS server. A hit at any step ends the search.
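
The lookup chain above can be sketched as a list of caches checked in order. This is only a minimal illustration: the function name `resolve_locally`, the layer contents, and the sample addresses (taken from the documentation range 192.0.2.0/24) are assumptions, not a real resolver API.

```python
from typing import Optional

Layer = tuple[str, dict[str, str]]  # (layer name, domain -> IP table)

def resolve_locally(domain: str, layers: list[Layer]) -> Optional[tuple[str, str]]:
    """Check each layer in order; the first hit ends the search."""
    for name, table in layers:
        if domain in table:
            return name, table[domain]
    return None  # no layer knows the answer: the local DNS server must look further

# Hypothetical contents for each layer, ordered by distance from the browser.
layers = [
    ("browser cache", {}),
    ("hosts file", {"dev.example.com": "127.0.0.1"}),
    ("OS resolver cache", {"www.example.com": "192.0.2.10"}),
    ("local DNS server", {"www.example.com": "192.0.2.10"}),
]

print(resolve_locally("www.example.com", layers))
```

Because the OS resolver cache already holds the name, the local DNS server is never consulted for `www.example.com`.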

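If the local DNS server cannot answer the query itself, it passes the query to the forwarders configured on it. If forwarding mode is not used, it performs an iterative lookup: it asks the root DNS server, which returns the address of the top-level domain DNS server; it asks that server, which returns the address of the authoritative DNS server; and it asks the authoritative server, which returns the IP address of the host.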
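
The iterative lookup (root server, then top-level domain server, then authoritative server) can be sketched with three toy lookup tables. Everything here is made up for illustration: the server labels, the zone data, and the function `iterative_resolve` are assumptions, and no real DNS traffic is involved.

```python
# Toy zone data; every name and address is hypothetical.
ROOT = {"com": "tld-com"}                                   # root knows the TLD servers
TLD = {"tld-com": {"example.com": "auth-example"}}          # TLD knows the authoritative servers
AUTH = {"auth-example": {"www.example.com": "192.0.2.10"}}  # authoritative knows the hosts

def iterative_resolve(host: str) -> tuple[str, list[str]]:
    """Walk root -> TLD -> authoritative, recording each server that was asked."""
    labels = host.split(".")
    tld, zone = labels[-1], ".".join(labels[-2:])
    asked = ["root"]
    tld_server = ROOT[tld]               # the root refers us to the TLD server
    asked.append(tld_server)
    auth_server = TLD[tld_server][zone]  # the TLD server refers us to the authoritative server
    asked.append(auth_server)
    return AUTH[auth_server][host], asked  # the authoritative server returns the host's IP

print(iterative_resolve("www.example.com"))
```

Each server only hands back a referral to the next level; it is the local DNS server that repeats the question, which is what distinguishes an iterative query from a recursive one.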

The combined process can be represented by a graph:

DNS has multiple levels of cache. Sorted by distance from the browser, they are: browser cache, system cache, router cache, ISP server cache, root DNS server cache, top-level domain DNS server cache, and primary DNS server cache.
Because a domain name has to be mapped to an IP address, applications get the chance to do load balancing based on domain names. This can be simple load balancing, or global load balancing based on the client's address and carrier.
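
As a rough sketch of DNS-based load balancing, the toy class below rotates through a pool of addresses (simple round-robin) and picks the pool by the client's carrier. The class `ToyDnsBalancer`, the carrier labels, and the addresses are hypothetical; real global load balancing also considers geography, health checks, and TTLs.

```python
from itertools import cycle

class ToyDnsBalancer:
    """Answer each query with the next address from the pool matching the carrier."""

    def __init__(self, records_by_carrier: dict[str, list[str]], default: list[str]):
        # One endless round-robin iterator per carrier, plus a fallback pool.
        self._pools = {carrier: cycle(ips) for carrier, ips in records_by_carrier.items()}
        self._default = cycle(default)

    def resolve(self, carrier: str) -> str:
        return next(self._pools.get(carrier, self._default))

balancer = ToyDnsBalancer(
    {"carrier-a": ["192.0.2.1", "192.0.2.2"]},
    default=["198.51.100.1"],
)
answers = [balancer.resolve("carrier-a") for _ in range(3)]
print(answers)
```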

Establishing a TCP Connection

First, the browser checks whether the URL uses HTTPS. HTTPS is HTTP plus SSL/TLS: an additional layer under HTTP that handles encryption. Both the client and the server encrypt and decrypt through TLS, so the transmitted data is encrypted.

A three-way handshake is used to establish a TCP connection.

First handshake: The client sends a connection request segment with the SYN flag set to 1 and the Sequence Number set to x. The client then enters the SYN_SENT state and waits for confirmation from the server.
Second handshake: The server receives the SYN segment and must acknowledge it, so it sets the Acknowledgment Number to x+1 (the client's Sequence Number + 1). It also sets the SYN flag to 1 and its own Sequence Number to y. The server puts all of this into one segment (the SYN+ACK segment) and sends it to the client, then enters the SYN_RECV state.
Third handshake: The client receives the SYN+ACK segment from the server, sets the Acknowledgment Number to y+1, and sends an ACK segment to the server. Once this segment has been sent and received, both the client and the server are in the ESTABLISHED state and the TCP three-way handshake is complete.
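
The sequence and acknowledgment numbers in the three handshakes can be checked with a small simulation. This is a sketch only: the dictionary layout and the function `three_way_handshake` are illustrative assumptions, and nothing is sent over a real socket.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(x: int, y: int) -> list[dict]:
    """Return the three segments exchanged for initial sequence numbers x and y."""
    syn = {"sender": "client", "SYN": 1, "ACK": 0, "seq": x}
    syn_ack = {"sender": "server", "SYN": 1, "ACK": 1, "seq": y,
               "ack": (syn["seq"] + 1) % SEQ_SPACE}       # acknowledges the client's SYN
    ack = {"sender": "client", "SYN": 0, "ACK": 1,
           "seq": (x + 1) % SEQ_SPACE,
           "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}       # acknowledges the server's SYN
    return [syn, syn_ack, ack]

for segment in three_way_handshake(x=100, y=300):
    print(segment)
```

With x=100 and y=300, the server acknowledges 101 and the client acknowledges 301, matching the x+1 and y+1 rules described above.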

SSL Handshake Procedure

Phase 1: Establish security capabilities, including the protocol version, session ID, cipher suite, compression method, and initial random numbers
Phase 2: The server sends its certificate, key exchange data, and a certificate request, and finally sends the message that signals the end of this phase
Phase 3: If the server requested a certificate, the client sends its certificate; the client then sends its key exchange data and a certificate verification message
Phase 4: Change the cipher suite and finish the handshake protocol

When this is done, the client and server can begin transferring data. More information about HTTPS can be found here:

zhuanlan.zhihu.com/p/26682342
Segmentfault.com/a/119000001…

Notes

ACK: Indicates that the acknowledgment field is valid, that is, that the TCP segment carries the Acknowledgment Number described above. It has two values, 0 and 1: the acknowledgment field is valid when ACK=1 and invalid when ACK=0. TCP requires ACK to be 1 in every segment sent after the connection is established.

SYN (SYNchronization): Used to synchronize sequence numbers when a connection is set up. SYN=1 with ACK=0 marks a connection request segment. If the peer agrees to establish the connection, its response segment has SYN=1 and ACK=1. A SYN value of 1 therefore marks either a connection request or a connection accept segment.

FIN (finis): Means finish or terminate, and is used to release a connection. FIN=1 indicates that the sender has finished sending data and requests that the connection be released.
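
The flag combinations in these notes can be read out of the TCP header's flags byte with bit masks. The bit values below are the standard ones from the TCP header layout; the helper names `decode_flags` and `classify` are assumptions made for this sketch.

```python
# Bit positions of three of the TCP header flags.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "ACK": 0x10}

def decode_flags(flags_byte: int) -> set[str]:
    """Return the names of the flags that are set in the flags byte."""
    return {name for name, bit in FLAG_BITS.items() if flags_byte & bit}

def classify(flags_byte: int) -> str:
    """Classify a segment by the SYN/ACK/FIN rules described in the notes."""
    flags = decode_flags(flags_byte)
    if "SYN" in flags and "ACK" not in flags:
        return "connection request"
    if "SYN" in flags and "ACK" in flags:
        return "connection accept"
    if "FIN" in flags:
        return "release request"
    return "other"

print(classify(0x02), "|", classify(0x12), "|", classify(0x11))
```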

Sending the HTTP request; the server processes it and returns the response

After the TCP connection is established, the browser can send requests to the server over HTTP/HTTPS. The server receives the request and parses the request headers. If the headers contain cache validators such as If-None-Match or If-Modified-Since, the server checks whether the cached copy is still valid. If it is, the server returns status code 304 and the browser uses its cached copy; otherwise the server returns the resource in a normal response.
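
The cache validation step can be sketched as a single server-side check. This is deliberately simplified: the function `respond` and the sample ETag are hypothetical, and a real server also handles If-Modified-Since, lists of ETags, and weak validators.

```python
def respond(request_headers: dict[str, str], current_etag: str, body: str) -> tuple[int, str]:
    """Return (status code, body) for a GET, honoring If-None-Match."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, ""   # the cached copy is still valid, so no body is sent
    return 200, body     # first visit or stale cache: send the full resource

etag = '"v2"'  # hypothetical version tag for the current resource
print(respond({"If-None-Match": '"v2"'}, etag, "<html>...</html>"))
print(respond({"If-None-Match": '"v1"'}, etag, "<html>...</html>"))
```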

HTTP caching, one of the processes that occurs here, is a common interview topic. For details, see:

“Browser related principles (interview questions) Detailed summary 1”

Closing a TCP Connection

First wave: Host 1 (either the client or the server) sets the Sequence Number and Acknowledgment Number and sends a FIN segment to host 2. Host 1 then enters the FIN_WAIT_1 state. This means host 1 has no more data to send to host 2.
Second wave: Host 2 receives the FIN segment from host 1 and sends back an ACK segment whose Acknowledgment Number is the received Sequence Number plus 1. In effect, host 2 tells host 1 that it agrees to the close request. On receiving this ACK, host 1 enters the FIN_WAIT_2 state.
Third wave: Host 2 sends a FIN segment to host 1 to request closing the connection, and host 2 enters the LAST_ACK state.
Fourth wave: Host 1 receives the FIN segment from host 2 and sends an ACK segment to host 2, then enters the TIME_WAIT state. Host 2 closes the connection when it receives this ACK. Host 1 waits for 2MSL; if it receives nothing further in that time, host 2 has closed normally, and host 1 closes the connection as well.
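
The four steps above can be followed as state transitions on each host. The table below is a simplified sketch: the event labels and the function `run` are assumptions, and less common paths such as both sides closing simultaneously are left out.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Host 1: the side that closes first
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # host 1 also sends the final ACK here
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Host 2: the side that receives the first FIN
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # host 2 also sends its ACK here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def run(events: list[str], state: str = "ESTABLISHED") -> str:
    """Apply the events in order and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))  # host 1
print(run(["recv FIN", "send FIN", "recv ACK"]))                  # host 2
```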

Browser rendering

In chronological order, the rendering pipeline can be divided into the following sub-stages: DOM tree building, style calculation, layout, layering, rasterization, and display. In outline:

The rendering process converts the HTML content into a DOM tree that the browser can understand.
The rendering engine converts the CSS style sheets into styleSheets that the browser can understand and calculates the style of each DOM node.
It creates a layout tree and calculates the layout information of each element.
It divides the layout tree into layers and generates a layer tree.
It generates a paint list for each layer and submits it to the compositor thread. The compositor thread divides each layer into tiles and rasterizes the tiles into bitmaps.
The compositor thread sends the command to draw the tiles to the browser process. The browser process generates the page according to the command and displays it on the monitor.

Building the DOM tree

After the browser obtains the HTML bytes from the network or disk, it parses them into a DOM tree through a series of steps. First, the raw bytes are decoded into characters using the encoding specified for the file. Next, the browser converts the character string into tokens according to the HTML specification, such as html and body. The tokens are then turned into nodes, and the nodes are linked into a tree-shaped object model: the DOM tree.

Specific steps:

Transcoding (Bytes -> Characters): Reads the received binary HTML data and converts the bytes into an HTML string using the specified encoding
Tokenizing (Characters -> Tokens): Parses the HTML string into Tokens with a clear structure. Each Token has its own meaning and rules
Building Nodes (Tokens -> Nodes): Each Node gets specific attributes (or attribute accessors). Pointers record a Node's parent, children, and siblings, as well as its treeScope (for example, the treeScope of an iframe differs from that of the outer page)
Building the DOM Tree (Nodes -> DOM Tree): The most important task is establishing the parent-child and sibling relationships of each node
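
The four steps can be imitated with Python's standard `html.parser` module, which tokenizes a string and reports start tags, end tags, and text. This is a rough sketch rather than how a browser engine works: the `Node` and `TreeBuilder` classes are assumptions, and the builder expects well-formed markup in which every opened tag is closed.

```python
from html.parser import HTMLParser

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children = tag, parent, []

class TreeBuilder(HTMLParser):
    """Turn the parser's token callbacks into linked Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):   # token -> node, linked to its parent
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):            # closing a tag moves back up the tree
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):             # non-empty text becomes a text node
        if data.strip():
            self.current.children.append(Node("#text:" + data.strip(), self.current))

def build_dom(raw: bytes, encoding: str = "utf-8") -> Node:
    text = raw.decode(encoding)              # bytes -> characters
    builder = TreeBuilder()
    builder.feed(text)                       # characters -> tokens -> nodes -> tree
    builder.close()
    return builder.root

def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(build_dom(b"<html><body><p>Hi</p></body></html>"))))
```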

Style calculation

The rendering engine converts CSS text into styleSheets, a structure the browser can understand, and then calculates the style of each DOM node.

There are three main sources of CSS styles: external CSS files referenced by link, CSS inside style tags, and CSS embedded in an element's style attribute.

Page layout

The layout process excludes functional and non-visual nodes such as script and meta, excludes nodes with display: none, and builds a layout tree that contains only visible elements. It then calculates the position of each element in that tree.
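
The filtering part of layout can be sketched as a recursive copy of the DOM that skips invisible nodes. The `DomNode` class, the `NON_VISUAL_TAGS` set, and `build_layout_tree` are assumptions for illustration, and the geometry calculation that follows in a real engine is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional

NON_VISUAL_TAGS = {"script", "meta"}  # illustrative subset of non-visual tags

@dataclass
class DomNode:
    tag: str
    style: dict[str, str] = field(default_factory=dict)
    children: list["DomNode"] = field(default_factory=list)

def build_layout_tree(node: DomNode) -> Optional[DomNode]:
    """Copy the tree, dropping non-visual nodes and display: none subtrees."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None
    kept = [sub for sub in (build_layout_tree(c) for c in node.children) if sub is not None]
    return DomNode(node.tag, dict(node.style), kept)

dom = DomNode("body", children=[
    DomNode("script"),
    DomNode("div", {"display": "none"}, [DomNode("p")]),
    DomNode("p"),
])
layout = build_layout_tree(dom)
print([child.tag for child in layout.children])
```

Note that hiding the div also removes the p inside it, because the whole subtree is excluded.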

Reflow and repaint
“Browser principles (Interview questions) Detailed Summary ii”

Generating the layer tree

Complex effects such as 3D transforms, page scrolling, and z-index ordering require the rendering engine to generate dedicated layers for specific nodes and to build a corresponding layer tree (LayerTree).

Not every node in the layout tree has its own layer. A node without a layer belongs to its parent node's layer. So what criteria must a node meet for the rendering engine to create a new layer for it? See my other article, “Browser related principles (interview questions) detailed summary two”, for details; they are not repeated here.

Rasterization

The compositor thread generates bitmaps for the tiles nearest the viewport first; the actual bitmap generation is done by rasterization. Rasterization means converting a tile into a bitmap.

A page can be large, but the user sees only part of it at a time. The part the user can see is called the viewport. Some layers are very big: on some pages you have to scroll for a long time to reach the bottom, yet through the viewport the user sees only a small portion. Drawing the entire content of such a layer would cost too much and is unnecessary, so the compositor thread divides the layer into tiles.
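
One way to picture this prioritization is to sort a layer's tiles by their distance from the viewport. The sketch below assumes a single column of equal-height tiles; the function `raster_order` and its parameters are hypothetical and do not describe any browser's actual scheduler.

```python
def raster_order(layer_height: int, tile_height: int,
                 viewport_top: int, viewport_height: int) -> list[int]:
    """Return tile indices sorted so that tiles nearest the viewport come first."""
    tile_count = -(-layer_height // tile_height)   # ceiling division
    viewport_bottom = viewport_top + viewport_height

    def distance(index: int) -> int:
        top = index * tile_height
        bottom = min(top + tile_height, layer_height)
        if bottom <= viewport_top:       # tile lies entirely above the viewport
            return viewport_top - bottom
        if top >= viewport_bottom:       # tile lies entirely below the viewport
            return top - viewport_bottom
        return 0                         # tile overlaps the viewport

    # sorted() is stable, so tiles at equal distance keep their top-to-bottom order
    return sorted(range(tile_count), key=distance)

order = raster_order(layer_height=1000, tile_height=250, viewport_top=300, viewport_height=150)
print(order)
```

Here the viewport covers pixels 300 to 450, so tile 1 is rasterized first, its two neighbors next, and the tile at the bottom of the layer last.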

Display

Finally, the compositor thread sends the command to draw the tiles to the browser process. The browser process generates the page according to the command and displays it on the monitor. The rendering process is now complete.

References

Geek Time: “Interesting Internet Protocol”
Geek Time: “How Browsers Work and Practice”
