What actually happens between entering a URL and the page being displayed?
- DNS domain name resolution
- Establishing a TCP connection
- Sending the HTTP request; the server processes it and returns the result
- Closing the TCP connection
- Browser rendering
Process flow chart
Building the URL
The user types into the address bar, and the browser decides from the input whether it is a search query or an address.
If it is a search query, the browser combines the search terms with the default search engine to build a new URL.
If the input conforms to URL rules, the browser adds the protocol (scheme) where needed to produce a complete, valid URL.
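The decision described above can be sketched in a few lines of Python. This is a simplified heuristic, not how any particular browser implements it: the `SEARCH_TEMPLATE` address and the "contains a dot and no spaces" rule are assumptions made for illustration.

```python
from urllib.parse import quote_plus, urlsplit

# Hypothetical default search engine; real browsers make this configurable.
SEARCH_TEMPLATE = "https://search.example/?q={query}"


def synthesize_url(user_input: str) -> str:
    """Turn address-bar input into a URL using a simplified heuristic."""
    text = user_input.strip()
    parts = urlsplit(text)
    # Input that already has a scheme and a host is used as it is.
    if parts.scheme in ("http", "https") and parts.netloc:
        return text
    # A single token containing a dot looks like a host name: add a scheme.
    if " " not in text and "." in text:
        return "https://" + text
    # Everything else is treated as a search query.
    return SEARCH_TEMPLATE.format(query=quote_plus(text))
```

For example, `synthesize_url("example.com")` adds the scheme, while `synthesize_url("how dns works")` is routed to the search template.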
1. DNS resolution
DNS resolution uses two kinds of query. Between the client and the local DNS server the query is recursive; between the local DNS server and the root, top-level, and authoritative servers it is iterative.
- After the user enters the URL, a recursive lookup begins, checking each of the following in order:
search the browser cache -> search the local hosts file -> search the local DNS resolver cache -> query the local DNS server
The lookup ends as soon as any step returns a result.
- If the local DNS server cannot answer the query and a forwarder is configured on it, the query is passed to the DNS server that the forwarder points to.
- If forwarding mode is not used, the local DNS server resolves the name iteratively by querying the root, top-level, and authoritative DNS servers in turn:
- Root DNS server: returns the IP address of the top-level domain DNS server
- Top-level domain DNS server: returns the IP address of the authoritative DNS server
- Authoritative DNS server: returns the IP address of the requested host
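Overall, the client sends a recursive query to the local DNS server, which then queries the root, top-level, and authoritative servers iteratively and returns the final IP address to the client.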
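As a rough illustration of the lookup order and the iterative walk, here is a toy model in Python. All names, addresses, and dictionaries are made up; a real resolver speaks the DNS protocol over the network, which this sketch does not attempt.

```python
# Toy data: every name and address below is invented for illustration.
BROWSER_CACHE = {}
HOSTS_FILE = {"localhost": "127.0.0.1"}
RESOLVER_CACHE = {}

# Each "server" is a dict; a lookup returns the next server to ask.
AUTHORITATIVE_EXAMPLE = {"www.example.com": "192.0.2.10"}
TLD_COM = {"example.com": AUTHORITATIVE_EXAMPLE}
ROOT = {"com": TLD_COM}


def iterative_lookup(name: str) -> str:
    """What the local DNS server does: ask each level in turn."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                     # root -> top-level server
    auth_server = tld_server[".".join(labels[-2:])]   # top-level -> authoritative server
    return auth_server[name]                          # authoritative -> host IP


def resolve(name: str) -> str:
    """Check the caches in order; stop at the first hit."""
    for cache in (BROWSER_CACHE, HOSTS_FILE, RESOLVER_CACHE):
        if name in cache:
            return cache[name]
    ip = iterative_lookup(name)
    # Remember the answer so the next lookup ends at the first step.
    RESOLVER_CACHE[name] = ip
    BROWSER_CACHE[name] = ip
    return ip
```

Calling `resolve("www.example.com")` twice walks the three server levels only the first time; the second call is answered from `BROWSER_CACHE`.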
The lookup process offers the following optimization points:
- DNS has multiple levels of cache. Sorted by distance from the browser, they are: browser cache, system cache, router cache, ISP DNS server cache, root DNS server cache, top-level DNS server cache, and primary (authoritative) DNS server cache.
- When domain names are mapped to IP addresses, the mapping itself can be used for load balancing: either simple DNS-based load balancing, or global load balancing that picks an address according to the client's IP address and carrier.
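Simple DNS-based load balancing is often done by rotating through several addresses for one name. The sketch below shows that idea only; the record table is invented, and real DNS servers apply their own rotation and TTL policies.

```python
from itertools import cycle

# Made-up records: one name mapped to several server addresses.
A_RECORDS = {"www.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}

# One endless rotation per name.
_rotations = {name: cycle(ips) for name, ips in A_RECORDS.items()}


def next_address(name: str) -> str:
    """Return the next address for the name, round-robin style."""
    return next(_rotations[name])
```

Successive calls to `next_address("www.example.com")` hand out the three addresses in turn and then start again from the first.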
2. Establish the TCP connection
First, the browser checks whether HTTPS is used. HTTPS is HTTP plus SSL/TLS: an extra layer beneath HTTP that handles encryption. Both client and server encrypt and decrypt traffic through TLS, so the transmitted data is encrypted. The TLS handshake runs after the TCP connection has been established.
TCP establishes a connection with a three-way handshake:
- First handshake: the client sends a connection request segment with the SYN flag set to 1 and Sequence Number set to x. The client then enters the SYN_SENT state and waits for confirmation from the server.
- Second handshake: the server receives the SYN segment and must acknowledge it, so it sets Acknowledgment Number to x+1 (the client's Sequence Number + 1). It also sets the SYN flag to 1 and its own Sequence Number to y. The server puts all of this into one segment (the SYN+ACK segment), sends it to the client, and enters the SYN_RECV state.
- Third handshake: the client receives the SYN+ACK segment, sets Acknowledgment Number to y+1, and sends an ACK segment to the server. Once this segment has been sent and received, the client and server are both in the ESTABLISHED state and the three-way handshake is complete.
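The sequence and acknowledgment numbers in the three steps above can be checked with a small model. The `Segment` class and function name are hypothetical; the sketch only tracks the numbers and flags, not real packets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    syn: bool
    ack: bool
    seq: int
    ack_num: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three segments for client ISN x and server ISN y."""
    # 1. Client -> server: SYN, seq = x
    syn = Segment(syn=True, ack=False, seq=x)
    # 2. Server -> client: SYN+ACK, seq = y, ack = x + 1
    syn_ack = Segment(syn=True, ack=True, seq=y, ack_num=syn.seq + 1)
    # 3. Client -> server: ACK, seq = x + 1, ack = y + 1
    ack = Segment(syn=False, ack=True, seq=x + 1, ack_num=syn_ack.seq + 1)
    return [syn, syn_ack, ack]
```

With `x = 100` and `y = 300`, the second segment acknowledges 101 and the third acknowledges 301.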
SSL/TLS handshake procedure
- Phase 1: establish security capabilities, including the protocol version, session ID, cipher suite, compression method, and initial random numbers.
- Phase 2: the server sends its certificate, its key exchange data, and optionally a certificate request, then signals that this phase is finished.
- Phase 3: if the server requested a certificate, the client sends its certificate; the client then sends its key exchange data and, when it sent a certificate, a certificate verification message.
- Phase 4: both sides switch to the negotiated cipher suite and finish the handshake protocol.
When this is done, the client and server can begin transferring data.
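From application code, the whole handshake is usually a single library call. The sketch below uses Python's standard `ssl` module; the function names are my own, and `negotiated_tls_info` needs network access and a reachable HTTPS host to run.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS settings with certificate and host name checks enabled."""
    return ssl.create_default_context()


def negotiated_tls_info(host: str, port: int = 443) -> tuple:
    """Connect, run the handshake, and report (protocol version, cipher suite)."""
    context = make_client_context()
    # TCP three-way handshake happens here.
    with socket.create_connection((host, port), timeout=5) as raw:
        # wrap_socket performs the TLS handshake on top of the TCP connection.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

The values returned depend on what the client and server negotiate, so no fixed output is shown here.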
3. Send the HTTP request
The browser sends the HTTP request; the server processes it and returns a response.
After the TCP connection is established, the browser can send requests to the server over HTTP/HTTPS. The server receives the request and parses the request headers. If the headers contain cache validators such as If-None-Match or If-Modified-Since, the server checks whether the client's cached copy is still valid. If it is, the server returns status code 304 (Not Modified) with no body; otherwise it returns the resource in full.
HTTP caching is the mechanism at work in this step.
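A minimal sketch of the server-side check, assuming the validator is an ETag derived from a hash of the body. The helper names and the hashing scheme are illustrative; real servers choose their own ETag format and also handle If-Modified-Since.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the resource content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET with optional validator."""
    etag = make_etag(body)
    # The client's cached copy is still valid: answer 304 with no body.
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    # No validator, or the resource changed: send the full resource.
    return 200, {"ETag": etag}, body
```

A first request without headers gets 200 plus an ETag; repeating the request with that ETag in `If-None-Match` gets 304 until the body changes.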
4. Close the TCP connection
First wave: host 1 (either the client or the server) sets the Sequence Number and Acknowledgment Number and sends a FIN segment to host 2. Host 1 enters the FIN_WAIT_1 state. This means host 1 has no more data to send to host 2.
Second wave: host 2 receives the FIN segment and replies with an ACK segment whose Acknowledgment Number is the received Sequence Number + 1. When host 1 receives this ACK, it enters the FIN_WAIT_2 state. In effect, host 2 is telling host 1 "I agree to your close request."
Third wave: host 2 sends a FIN segment to host 1 to close its side of the connection, and host 2 enters the LAST_ACK state.
Fourth wave: host 1 receives the FIN segment from host 2, sends an ACK segment back, and enters the TIME_WAIT state. Host 2 closes the connection as soon as it receives this ACK. Host 1 waits for 2MSL (twice the maximum segment lifetime); if nothing further arrives from host 2 in that time, the connection has closed normally and host 1 closes it as well.
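The state changes in the four waves can be traced with a small table-style function. This is a bookkeeping sketch only; the CLOSE_WAIT state for host 2 is not named in the text above but is the standard TCP state between the second and third waves.

```python
def four_way_close():
    """Return (segment, host 1 state, host 2 state) after each wave."""
    trace = []
    # First wave: host 1 sends FIN and has no more data to send.
    h1, h2 = "FIN_WAIT_1", "ESTABLISHED"
    trace.append(("FIN 1->2", h1, h2))
    # Second wave: host 2 acknowledges; it may still have data to send.
    h1, h2 = "FIN_WAIT_2", "CLOSE_WAIT"
    trace.append(("ACK 2->1", h1, h2))
    # Third wave: host 2 sends its own FIN.
    h2 = "LAST_ACK"
    trace.append(("FIN 2->1", h1, h2))
    # Fourth wave: host 1 acknowledges and waits 2MSL; host 2 closes on receipt.
    h1, h2 = "TIME_WAIT", "CLOSED"
    trace.append(("ACK 1->2", h1, h2))
    return trace
```

Printing the trace row by row gives a compact view of which side is in which state after each segment.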
5. Browser rendering
The renderer process converts the HTML content into a DOM tree that the browser can work with.
The rendering engine converts the CSS style sheets into styleSheets that the browser can understand, and computes the style of each DOM node.
It creates a layout tree and calculates the layout information for the elements.
It layers the layout tree and generates a layer tree.
It generates a paint list for each layer and submits it to the compositor thread. The compositor thread divides each layer into tiles and rasterizes the tiles into bitmaps.
The compositor thread sends the command to draw the tiles to the browser process. The browser process generates the page according to that command and displays it on the screen.
In chronological order, the rendering pipeline can be divided into the following sub-stages: DOM tree building, style calculation, layout, layering, rasterization, and display.
- Build a DOM tree
The renderer process converts the HTML content into a readable DOM tree through the following four steps:
- Transcoding (Bytes -> Characters): reads the received binary HTML data and converts the bytes into an HTML string using the specified encoding
- Tokenizing (Characters -> Tokens): parses the HTML string into Tokens with a clear structure. Each Token has its own meaning and rules
- Building Nodes (Tokens -> Nodes): each Node gains specific attributes (or attribute accessors). Pointers record the parent, child, and sibling relationships of Nodes and their treeScope (for example, the treeScope of an iframe differs from that of the outer page)
- Building the DOM tree (Nodes -> DOM Tree): the main task is to establish the parent, child, and sibling relationships between nodes
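The four steps can be imitated with Python's standard `html.parser`, which does the tokenizing and calls back for each start tag, end tag, and text run. The `Node` and `TreeBuilder` classes are illustrative, and the sketch assumes well-formed markup in which every tag is closed; real browser parsers also recover from errors.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent      # pointer to the parent node
        self.children = []        # pointers to child nodes
        self.text = ""


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Tokens -> Nodes: create a node and link it to its parent.
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Closing a tag moves back up to the parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_dom(raw: bytes, encoding: str = "utf-8") -> Node:
    builder = TreeBuilder()
    builder.feed(raw.decode(encoding))   # Bytes -> Characters, then tokenizing
    builder.close()
    return builder.root
```

Calling `build_dom(b"<html><body><p>Hi</p></body></html>")` returns a `#document` node whose single child is `html`.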
- Style calculation
The rendering engine converts the CSS style sheets into styleSheets that the browser can understand, then computes the style of each DOM node.
There are three main sources of CSS styles: external CSS files referenced by link, CSS inside the style tag, and CSS embedded in the style attribute of an element.
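To make the three sources concrete, the following sketch scans an HTML string and collects each kind separately. The class and function names are hypothetical, and the sketch only gathers the raw CSS text; it does not parse or apply it.

```python
from html.parser import HTMLParser


class StyleSourceCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.external = []   # href values of <link rel="stylesheet">
        self.embedded = []   # text inside <style> tags
        self.inline = []     # (tag, style attribute) pairs
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.external.append(attrs["href"])
        if tag == "style":
            self._in_style = True
        if attrs.get("style"):
            self.inline.append((tag, attrs["style"]))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style and data.strip():
            self.embedded.append(data.strip())


def collect_style_sources(html: str) -> StyleSourceCollector:
    collector = StyleSourceCollector()
    collector.feed(html)
    collector.close()
    return collector
```

The result keeps the three lists apart, which mirrors how the engine must later merge them when it computes each node's style.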
- Page layout
The layout stage excludes functional, non-visual nodes such as script and meta, excludes nodes with display: none, calculates the position and geometry of each remaining element, and builds a layout tree that contains only visible elements.
This stage is also where reflow and repaint need attention.
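The filtering part of this stage can be sketched as a recursive copy that drops excluded nodes. The dict-based node shape and the `NON_VISUAL_TAGS` set are assumptions for illustration, and the sketch does not compute any geometry.

```python
# Tags treated as non-visual in this sketch (the text names script and meta).
NON_VISUAL_TAGS = {"script", "meta"}


def build_layout_tree(node: dict):
    """Copy a DOM subtree, keeping only nodes that take part in layout."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    # A display: none node is dropped together with its whole subtree.
    if node.get("style", {}).get("display") == "none":
        return None
    children = [build_layout_tree(child) for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        "children": [child for child in children if child is not None],
    }
```

Note that a hidden `div` removes its descendants as well, which matches the behavior of `display: none`.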
- Generate the layer tree: layer the layout tree to produce a layer tree
For complex effects such as 3D transforms, page scrolling, or z-ordering with z-index, the rendering engine creates dedicated layers for the affected nodes and generates a corresponding layer tree (LayerTree).
- Rasterization
The compositor thread generates bitmaps for the tiles nearest the viewport first; the bitmap generation itself is done by rasterization, which means converting a tile into a bitmap.
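The "nearest the viewport first" rule can be modeled as a sort by vertical distance. The tile size of 256 pixels and the function names are assumptions; actual engines use their own tile sizes and richer priority rules.

```python
TILE = 256  # assumed tile size in pixels


def tiles_for_layer(width: int, height: int, tile: int = TILE):
    """Split a layer into tiles, identified by their top-left corner."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]


def raster_order(width, height, viewport_top, viewport_height, tile=TILE):
    """Sort tiles so those overlapping or nearest the viewport come first."""
    viewport_bottom = viewport_top + viewport_height

    def distance(corner):
        _, top = corner
        bottom = top + tile
        if bottom <= viewport_top:       # tile lies above the viewport
            return viewport_top - bottom
        if top >= viewport_bottom:       # tile lies below the viewport
            return top - viewport_bottom
        return 0                         # tile overlaps the viewport

    return sorted(tiles_for_layer(width, height, tile), key=distance)
```

For a 256 x 1024 layer with the viewport covering rows 600 to 700, the tile starting at y = 512 is rasterized first because it is the only one on screen.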
- Display
The compositor thread sends the command to draw the tiles to the browser process. The browser process generates the page according to that command and shows it on the display.