From entering the URL to rendering the page

Parse the information in the address bar

The browser listens to what the user types and tries to match the URL or keyword the user wants to visit. Taking Juejin as an example: after the user types something in the browser's address bar and presses Enter, the browser makes the following judgments:

Determine whether the input is a valid URL.
Yes. Continue to check whether the URL is complete. If it is not, the browser may guess the missing parts and add a prefix, a suffix, or both to the input to complete the URL. The common components of a URL include:
- Protocol (scheme): http, https, ws/wss (WebSocket)
- Domain name (host name): may be an IP address or a domain name. A domain name is divided by dots into a root domain, a top-level domain, and a second-level domain. In juejin.cn. the trailing . represents the root domain, .cn is the top-level domain, and juejin.cn is the second-level domain (that is, the host name).
- Port number: the default port is 80 for HTTP and 443 for HTTPS. The browser automatically hides the default port number.
- Path: / separates the directory levels, for example /web/user
- Query: begins with ? and uses & to separate key-value pairs, for example ?username=zhangsan&age=16
- Hash: begins with # and can be used to locate a specific position within the current page
No. The browser treats the input as a search term, passes it to the default search engine, and returns the search results.
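
The URL components listed above can be inspected with Python's standard urllib.parse module. This is only an illustrative sketch, not how a browser implements address-bar parsing; the helper name describe_url, the DEFAULT_PORTS table, and the sample URL are assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports per scheme (assumption: only the schemes mentioned in the text).
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}


def describe_url(url: str) -> dict:
    """Split a complete URL into the components described above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is written explicitly.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parse_qs(parts.query),
        "hash": parts.fragment,
    }


info = describe_url("https://juejin.cn/web/user?username=zhangsan&age=16#top")
print(info)
```

Note that the port is absent from the sample URL, so the sketch fills in 443, mirroring the way the browser hides default ports.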

Check the strong cache

The browser process sends the URL request to the network process through inter-process communication (IPC). The network process is the one that actually issues the request, but before doing so it checks whether the resource is cached locally. If a usable cached copy exists, it is returned directly to the browser process. The strong cache is checked first: if the cached resource has not expired, it is used directly without sending any request to the server. Strong caching involves two fields:

Expires. An absolute expiration time (Expires: Wed, 21 Oct 2015 07:28:00 GMT). This field dates from HTTP/1.0. It appears in the response header returned by the server and tells the browser that it may use the cached resource directly until that time. Its major disadvantage is that the expiration time is set according to the server clock but checked against the client clock, so the result is inaccurate when the two clocks disagree. HTTP/1.1 therefore introduced the Cache-Control field, which takes precedence over Expires when both are present.
Cache-Control. A relative lifetime (Cache-Control: max-age=3600, in seconds). This field was introduced in HTTP/1.1. It also appears in the response header returned by the server and tells the browser that it may use the cached resource directly within the given period. It can carry other directives as well. Some of the key ones:
- public. Both browsers and proxy servers may cache the resource
- private. Only the browser may cache the resource, not proxy servers
- no-cache. Skips the strong cache phase: the browser must send a request to the server and enter the negotiated cache phase before using the cached copy
- no-store. Do not cache at all
- s-maxage. Cache lifetime for proxy servers (shared caches)
- must-revalidate. Once the cache expires, it must be validated against the origin server before it is used again

Further notes:

How is the strong cache set? Set the Cache-Control field and its value on the response in the server code.
Where are strongly cached resources stored? In the memory cache or the disk cache, that is, in memory or on the hard disk. Generally, images, script files, and font files are kept in the memory cache, while style files are kept in the disk cache.

In what order are the caches consulted? Lookup follows the three-level caching principle: look in the memory cache first and use the resource directly if it is found; otherwise look in the disk cache and use the resource directly if it is found; otherwise make a network request and cache the returned resource according to the response header fields.
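
As a rough sketch of the strong cache check described in this section, the following Python decides whether a stored response is still fresh. It is deliberately simplified and is not a browser's actual algorithm: the function names parse_cache_control and is_fresh are hypothetical, the headers are passed as a plain dict, and details such as the Age and Date headers or heuristic freshness are ignored.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value: str) -> dict:
    """Turn 'max-age=3600, public' into {'max-age': '3600', 'public': None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip().lower()
        if not item:
            continue
        name, _, arg = item.partition("=")
        directives[name] = arg.strip('"') or None
    return directives


def is_fresh(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Return True if the cached response may be used without contacting the server."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False  # never stored, or must be revalidated first
    if cc.get("max-age") is not None:
        # Cache-Control: max-age takes precedence over Expires.
        return now - stored_at < timedelta(seconds=int(cc["max-age"]))
    if "Expires" in headers:
        return now < parsedate_to_datetime(headers["Expires"])
    return False


stored = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
later = stored + timedelta(minutes=30)
print(is_fresh({"Cache-Control": "max-age=3600"}, stored, later))
```

The example timestamps are made up; the only behaviour the sketch tries to show is that max-age is evaluated relative to when the response was stored, while Expires is compared against the current clock.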

DNS domain name resolution

If the required resource is not found in the strong cache, the browser proceeds to a network request. We normally enter a domain name in the address bar, but in network communication the destination host is identified by its IP address, so the IP address corresponding to the domain name has to be found first. What is DNS? DNS stands for Domain Name System. It stores the mapping between domain names and IP addresses in a distributed database, and looking up the IP address for a domain name through DNS is called DNS domain name resolution. The following uses juejin.cn to walk through the resolution process:

Browser DNS cache. The browser extracts the host name from the URL and looks for a record in its own DNS cache. If a record exists, the browser uses the cached IP address and the resolution is complete.
Hosts file. The hosts file on the local machine is checked for an entry for the host name. If there is one, the corresponding IP address is returned and the resolution is complete.
Local DNS server. The host sends a query request to the local DNS server. If the local DNS server has the record cached, it returns the answer to the host and the resolution is complete.
ISP DNS cache. Otherwise the local DNS server forwards the query to the DNS server provided by the ISP. If that server has the record cached, it returns the answer to the local DNS server, which returns it to the host.
Root DNS server. If no forwarder is configured on the local DNS server, the local DNS server sends the request to one of the 13 root DNS servers. The root DNS server determines who is authorized to manage the top-level domain (.cn) and returns the IP address of the name server responsible for that top-level domain. In forwarding mode, the local DNS server forwards the request to the upstream DNS server for resolution. If the upstream server cannot resolve it, that server either queries the root DNS servers or forwards the request further upstream.
Top-level domain name server. After obtaining the address of the top-level domain name server from the root DNS server, the local DNS server sends it a query request. The top-level domain name server checks its zone file. If it can answer the query itself, it returns the record to the local DNS server, which returns it to the host and the resolution is complete. If it cannot, it returns to the local DNS server the IP address of the second-level name server that manages the domain under .cn.
Second-level domain name server. After obtaining the address of the second-level name server from the top-level domain name server, the local DNS server sends it a query request. The second-level name server checks its zone file and returns the record to the local DNS server, which completes the resolution. If the name still cannot be resolved at this point, the domain name is probably wrong and an error is reported.

Recursive query: The client sends a query request to the local DNS server and waits for the local DNS server to return the final result. If the local DNS server cannot resolve the name itself, it acts as a DNS client and queries other DNS servers on the client's behalf until it can return the result to the client.

Iterative query: The local DNS server sends a query request to the root DNS server. If the root DNS server does not hold the corresponding record, it returns to the local DNS server the IP address of the next DNS server to ask (a referral). The local DNS server then queries that server, and repeats this step until it obtains the result and returns it to the client.
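
The lookup order and the iterative query can be imitated with a small in-memory simulation. Everything in it is invented for illustration: the SERVERS table, the server names, and the IP address 192.0.2.10 (taken from a documentation-only range) are assumptions, and no real DNS traffic is involved.

```python
# Hypothetical name-server data: each server either holds a record or refers
# the caller to the next server responsible for a domain suffix.
SERVERS = {
    "root": {"referrals": {"cn": "tld-cn"}, "records": {}},
    "tld-cn": {"referrals": {"juejin.cn": "ns-juejin"}, "records": {}},
    "ns-juejin": {"referrals": {}, "records": {"juejin.cn": "192.0.2.10"}},
}


def query(server: str, name: str) -> tuple:
    """One iterative step: the server answers, refers elsewhere, or gives up."""
    zone = SERVERS[server]
    if name in zone["records"]:
        return ("answer", zone["records"][name])
    for suffix, next_server in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("not_found", None)


def resolve(name: str, cache: dict) -> tuple:
    """What the local DNS server does: check its cache, then iterate from the root."""
    if name in cache:
        return cache[name], ["cache"]
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            cache[name] = value  # remember the result for later queries
            return value, path
        if kind == "not_found":
            return None, path
        server = value  # follow the referral to the next server


cache = {}
print(resolve("juejin.cn", cache))  # walks root -> tld-cn -> ns-juejin
print(resolve("juejin.cn", cache))  # second lookup is served from the cache
```

From the client's point of view the call to resolve is the recursive query: it asks once and waits, while the loop inside plays the role of the iterative queries.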

Establishing a TCP Connection

Establishing a connection follows the three-way handshake. Its main purpose is to confirm that both sides can send and receive normally, and to agree on each side's initial sequence number in preparation for the reliable transmission that follows. In essence, the client connects to the specified port on the server, a TCP connection is established, the sequence numbers and acknowledgment numbers of both sides are synchronized, and TCP window size information is exchanged. Initially the client is in the CLOSED state and the server is in the LISTEN state.

The client sends a SYN packet to the server with the synchronization bit SYN=1 and an initial sequence number seq=x. The client then enters the SYN_SENT state.
After receiving the SYN packet from the client, the server replies with its own SYN packet (SYN=1, ACK=1), specifies its own initial sequence number seq=y, and uses the client's seq + 1 = x + 1 as the acknowledgment number (ack=x+1) to indicate that it has received the client's SYN packet. The server then enters the SYN_RCVD state.
After receiving the server's SYN packet, the client sends an ACK packet (ACK=1) in reply, with its own sequence number seq=x+1, and uses the server's seq + 1 = y + 1 as the acknowledgment number (ack=y+1) to indicate that it has received the server's SYN packet. The client then enters the ESTABLISHED state, and the server enters the ESTABLISHED state once it receives the ACK packet.
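
The sequence and acknowledgment numbers in the three steps above can be checked with a tiny simulation. It does not open a real socket; the function three_way_handshake and the tuple layout (sender, flags, seq, ack, sender's new state) are assumptions chosen only to make the arithmetic visible.

```python
def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments as (sender, flags, seq, ack, sender's new state).

    x and y are the initial sequence numbers chosen by the client and the server.
    """
    return [
        ("client", {"SYN"}, x, None, "SYN_SENT"),
        ("server", {"SYN", "ACK"}, y, x + 1, "SYN_RCVD"),
        ("client", {"ACK"}, x + 1, y + 1, "ESTABLISHED"),
    ]


for sender, flags, seq, ack, state in three_way_handshake(x=100, y=300):
    print(sender, sorted(flags), "seq =", seq, "ack =", ack, "->", state)
```

The initial sequence numbers 100 and 300 are arbitrary; a real TCP stack picks them itself.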

The client initiates the request and the server processes it

The client sends the request line, request headers, and request body to the server. After receiving the request, the server performs its logical processing and returns the response data (response line, response headers, response body, and so on) based on the result. A note on how the data is actually transferred: because HTTP runs over TCP, the data is split into small packets that are sent to the receiver one by one. The receiver must acknowledge each packet it receives to the sender. If the sender does not receive the acknowledgment, it considers the packet lost and resends it. The receiver then reassembles the received packets in sequence order to obtain the complete data.
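
The split, acknowledge, resend, and reassemble cycle described above can be sketched as follows. This is a toy model rather than TCP itself: the loss pattern is supplied by hand through the hypothetical lost_once parameter, segments are identified by their byte offset, and there are no timers or windows.

```python
def split(data: bytes, size: int) -> list:
    """Cut the payload into (offset, chunk) segments of at most `size` bytes."""
    return [(seq, data[seq:seq + size]) for seq in range(0, len(data), size)]


def transfer(data: bytes, size: int, lost_once: set) -> tuple:
    """Send every segment, resending those whose acknowledgment never arrives.

    lost_once: offsets of segments that are dropped on their first attempt.
    Returns the reassembled payload and the total number of sends.
    """
    received = {}
    sends = 0
    pending = split(data, size)
    dropped = set(lost_once)
    while pending:
        retry = []
        for seq, chunk in pending:
            sends += 1
            if seq in dropped:
                dropped.discard(seq)       # lost in transit: no ACK comes back
                retry.append((seq, chunk)) # so the sender will resend it
            else:
                received[seq] = chunk      # receiver stores it and acknowledges
        pending = retry
    # The receiver puts the segments back together in sequence order.
    reassembled = b"".join(received[seq] for seq in sorted(received))
    return reassembled, sends


payload = b"GET /index.html HTTP/1.1"
print(transfer(payload, size=8, lost_once={8}))
```

With a 24-byte payload and 8-byte segments there are three segments; losing the middle one once leads to four sends in total.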

Closing the TCP connection

After the data transfer is complete, the Connection field determines whether the connection is closed. If the request header or response header contains Connection: keep-alive, the TCP connection is kept open and reused by later requests for resources on the same site. Otherwise the connection has to be closed. Closing follows the four-way handshake (the "four waves"). It can be initiated by either the client or the server, and both sides start in the ESTABLISHED state. Assuming the client initiates the close, the process is as follows:

The client sends a connection-release FIN segment (FIN=1) with a sequence number (seq=u), stops sending data, and actively closes the TCP connection. The client enters the FIN_WAIT_1 state and waits for the server's acknowledgment.
After receiving the FIN segment, the server sends an ACK packet (ACK=1) with its own sequence number (seq=v) and uses the client's seq + 1 = u + 1 as the acknowledgment number (ack=u+1) to indicate that it has received the client's segment. The server enters the CLOSE_WAIT state. TCP is now half-closed: the client-to-server direction of the connection has been released. After receiving the server's acknowledgment, the client enters the FIN_WAIT_2 state and waits for the connection-release segment from the server.
Once the server has no more data to send to the client, it sends a connection-release segment (FIN=1, ACK=1, sequence number seq=w, acknowledgment number ack=u+1) and enters the LAST_ACK state, waiting for the client's acknowledgment.
After receiving the server's connection-release segment, the client sends an acknowledgment (ACK=1, seq=u+1, ack=w+1) and enters the TIME_WAIT state. The TCP connection is not yet released at this point: the client must wait for the 2MSL timer to expire before it enters the CLOSED state.
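
The state changes on both sides of the four-way close can be written down as two small transition tables. This is a sketch of the states named above only, assuming the client closes first; the event labels and the run helper are made up for the example, and the server's final move to CLOSED on receiving the last ACK follows standard TCP behaviour.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

# The same for the side that receives the first FIN.
SERVER = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(transitions: dict, events: list, state: str = "ESTABLISHED") -> list:
    """Apply the events in order and return every state visited."""
    trace = [state]
    for event in events:
        state = transitions[(state, event)]
        trace.append(state)
    return trace


print(run(CLIENT, ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"]))
print(run(SERVER, ["recv FIN, send ACK", "send FIN", "recv ACK"]))
```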

Processing response information

After receiving the response data, the network process parses the response header and handles the response according to its status code. Common status codes are listed below:

200 OK. The request is processed successfully and the response data is placed in the response body.
301 Moved Permanently. Permanent redirection: if the original address is no longer used and resources must be accessed through a new address, set the response status code to 301. By default the browser caches the redirect, so the next time the original address is visited it goes to the redirected address automatically.
302 Found. Temporary redirection: if the original address is only temporarily out of use, return the 302 status code. For example, while a site is under maintenance, it can redirect to an explanation page in the current domain to inform visitors.
304 Not Modified. Negotiated cache. The first time the browser requests a resource, the server returns both the Last-Modified and ETag fields to the browser in the response header.
- Last-Modified. The time the resource was last modified. When the browser requests the resource again, it sends this value in the If-Modified-Since request header. The server compares the If-Modified-Since value with the current Last-Modified value of the resource on the server. If the If-Modified-Since value is earlier than the server's Last-Modified value, the resource on the server has been updated; the server returns the new resource with an updated Last-Modified and the status code 200. Otherwise it returns the 304 status code, which tells the browser to use the cached resource directly.
- ETag. An identifier of the resource's content, namely a hash value generated from the file content. When the browser requests the resource again, it sends this value in the If-None-Match request header. The server compares the If-None-Match value with the current ETag of the resource on the server. If the two values are not equal, the server returns the new resource with an updated ETag and the status code 200. Otherwise it returns the 304 status code, which tells the browser to use the cached resource directly. When both ETag and Last-Modified are present, the server judges by ETag first and then by Last-Modified to decide which status code to return.
400 Bad Request. The request parameters are incorrect
401 Unauthorized. Authentication is required
403 Forbidden. The server refuses access
404 Not Found. The server did not find the resource
500 Internal Server Error. The server encountered an error
503 Service Unavailable. The server is busy and temporarily unable to handle the request
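
The 304 negotiation described in the status code list can be sketched from the server's side. The code below is an assumption-laden illustration, not any particular server's implementation: respond and make_etag are hypothetical helpers, headers are plain dicts, the ETag is a truncated SHA-256 of the body, and when both validators are sent the ETag comparison alone decides the result.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (assumption: a truncated SHA-256 hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, last_modified: datetime) -> tuple:
    """Return (status, response headers, body) for a possibly conditional request.

    last_modified must be a timezone-aware UTC datetime with whole seconds.
    """
    etag = make_etag(body)
    validators = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    if "If-None-Match" in request_headers:
        # ETag is judged first when the browser sends it.
        not_modified = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        not_modified = last_modified <= since
    else:
        not_modified = False  # first request: nothing to compare against
    if not_modified:
        return 304, validators, b""
    return 200, validators, body


modified = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
status, headers, _ = respond({}, b"hello", modified)
print(status, headers)
status, _, _ = respond({"If-None-Match": headers["ETag"]}, b"hello", modified)
print(status)
```

The first call models the initial request, which returns 200 together with both validators; the second call replays the ETag and receives 304 with an empty body.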

The type of the response data is determined from the Content-Type field, which tells the browser what type of data the server is returning in the response body. The browser then decides how to handle the response body based on its value. If the value is application/octet-stream, the response is usually treated as a download. If the value is text/html, the browser prepares a render process.

Prepare the render process

Normally, opening a tab starts a new render process. There is a special case, however: tabs that belong to the same site share one render process. Two pages belong to the same site when:

They have the same root domain and the same protocol
This covers all subdomains under that root domain, on any port number

// Same site
https://time.geekbang.org
https://www.geekbang.org
https://www.geekbang.org:8080
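
A minimal sketch of the same-site rule, applied to the three URLs above. It assumes that the root domain is simply the last two labels of the host name, which is wrong for suffixes such as .com.cn; real browsers consult the Public Suffix List. The helper names site_of and same_site are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> tuple:
    """Return (protocol, root domain); the port and subdomain are ignored."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Assumption: the root domain is the last two labels of the host name.
    return parts.scheme, ".".join(labels[-2:])


def same_site(a: str, b: str) -> bool:
    return site_of(a) == site_of(b)


print(same_site("https://time.geekbang.org", "https://www.geekbang.org:8080"))
print(same_site("https://www.geekbang.org", "http://www.geekbang.org"))
```

The second comparison differs only in protocol, so under the rule above the two pages are not the same site.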

Document Submission stage

When the render process is ready, it enters the submit document phase. The process is as follows:

First, when the browser process receives the response header data from the network process, it sends a "submit document" message to the render process.
The render process receives the "submit document" message and establishes a "pipeline" with the network process to transfer the data.
After the document data transfer is complete, the render process returns a "confirm submit" message to the browser process.
After receiving the "confirm submit" message, the browser process updates the browser interface state, including the security state, the URL in the address bar, the back/forward history state, and the web page itself.

Build a DOM tree

Browsers cannot understand and use HTML directly, so the HTML has to be transformed into a DOM tree, a structure the browser can work with. You can view it by typing document in the browser console. The specific conversion process is as follows:

Conversion (bytes -> characters): The browser reads the raw bytes of the HTML from disk or the network and converts them into characters according to the file's specified encoding (e.g. UTF-8).
Tokenization (characters -> tokens): The browser converts the character stream into the tokens defined by the W3C standard (e.g. <html>, <body>) and other strings in angle brackets. Each token has a special meaning and its own set of rules.
Lexical analysis (tokens -> nodes): Each emitted token is converted into an "object" that defines its properties and rules.
DOM construction (nodes -> DOM): Because HTML tags define relationships between different tags, the created objects are linked in a tree data structure that also captures the parent-child relationships defined in the original markup: the html object is the parent of the body object, the body object is the parent of the paragraph object, and so on.
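
The token-to-tree steps above can be imitated with Python's standard html.parser module, which emits start-tag, end-tag, and text events that play the role of tokens. This is a simplified sketch and not the browser's parser: the TreeBuilder class and the dict-based node shape are assumptions, and the code expects well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build a nested dict tree from the parser's start/end/text events."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]  # the innermost open element is on top

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)  # attach to the current parent
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1]["tag"] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append({"tag": "#text", "text": data})


def build_dom(html: str) -> dict:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


dom = build_dom("<html><body><p>Hello</p></body></html>")
html_node = dom["children"][0]
body_node = html_node["children"][0]
print(html_node["tag"], ">", body_node["tag"], ">", body_node["children"][0]["tag"])
```

The stack is what captures the parent-child relationships: each new node is attached to whatever element is currently open.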

While parsing the HTML file, the network process may need to download script files and style files, which raises the problem of blocking DOM parsing and rendering. For details, see the article on how CSS and JS block DOM parsing and rendering. Its conclusions are:

CSS does not block DOM parsing, but it does block DOM rendering.
JS blocks DOM parsing, but the browser "peeks" ahead in the DOM and preloads the related resources.
When the browser encounters a <script> tag without the defer or async attribute and a preceding CSS resource has not finished loading, the browser waits for the CSS to load before executing the script.

Style calculation

Convert CSS to a structure that browsers can understand: browsers cannot understand CSS text directly, so when the rendering engine receives CSS text, it converts it into styleSheets. You can view them by typing document.styleSheets in the browser console. The specific conversion process is as follows:

Standardize the attribute values in the stylesheet: for example, a hexadecimal color used in the code is converted to RGB format. Li Bing's course also says that em units are converted to px and bold is converted to 700, but when I checked in the Chrome developer tools these values were not changed, which differs slightly from the course's description.
Compute the style of each node in the DOM tree: this is done mainly through the inheritance rule and the cascading rule. When the computation is complete, the style of each DOM node is output and saved in the ComputedStyle structure.
Inheritance rule: if a child node does not set properties such as font-size, color, or font-family, it inherits the parent node's values. If the parent node does not set them either, the default is the user agent style. Note: only inheritable properties can be inherited.
Cascading rule: an algorithm that defines how to combine attribute values from multiple sources.
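
The value standardization and the inheritance rule can be sketched in a few lines. The property lists here are tiny and hypothetical: INHERITED and UA_DEFAULTS stand in for the browser's real tables, the cascade between competing rules is left out entirely, and computed_style and hex_to_rgb are names invented for the example.

```python
from typing import Optional

# Assumption: a very small subset of inheritable properties and user agent defaults.
INHERITED = {"color", "font-size", "font-family"}
UA_DEFAULTS = {"color": "black", "font-size": "16px",
               "font-family": "serif", "display": "inline"}


def hex_to_rgb(value: str) -> str:
    """Standardize '#ff0000' or '#f00' to 'rgb(255, 0, 0)'."""
    digits = value.lstrip("#")
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgb({r}, {g}, {b})"


def computed_style(declared: dict, parent: Optional[dict] = None) -> dict:
    """Resolve each property: own value, else inherited value, else UA default."""
    style = {}
    for prop, default in UA_DEFAULTS.items():
        if prop in declared:
            style[prop] = declared[prop]
        elif prop in INHERITED and parent is not None:
            style[prop] = parent[prop]
        else:
            style[prop] = default
    return style


body = computed_style({"color": hex_to_rgb("#ff0000")})
paragraph = computed_style({"display": "block"}, parent=body)
span = computed_style({}, parent=paragraph)
print(span)
```

In this run the span picks up the body's color through the paragraph, but it does not pick up display: block, because display is not in the inheritable set.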

The layout phase

This phase calculates the geometric positions of the visible elements in the DOM tree. The specific process is as follows:

Create a layout tree: traverse the DOM tree and generate a layout tree that contains only the visible nodes. Invisible nodes are ignored and do not appear in the layout tree. Invisible nodes include: 1. nodes that produce no rendered output (script/link/meta/head), 2. nodes hidden by CSS (display: none).
Layout calculation: calculate the geometry of the nodes in the layout tree and save the results back into the layout tree.
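
Creating the layout tree amounts to a filtered copy of the DOM tree. The sketch below covers only that filtering step, not the geometry calculation; the dict-based node shape, the NON_RENDERED set, and the sample tree are assumptions made for the example.

```python
# Tags that never produce rendered output, as listed above.
NON_RENDERED = {"script", "link", "meta", "head"}


def build_layout_tree(node: dict):
    """Copy the DOM subtree, dropping nodes that are not visible."""
    hidden = node.get("style", {}).get("display") == "none"
    if node["tag"] in NON_RENDERED or hidden:
        return None  # the node and all of its descendants are skipped
    children = []
    for child in node.get("children", []):
        laid_out = build_layout_tree(child)
        if laid_out is not None:
            children.append(laid_out)
    return {"tag": node["tag"], "children": children}


dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "meta"}]},
    {"tag": "body", "children": [
        {"tag": "p"},
        {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "span"}]},
        {"tag": "script"},
    ]},
]}
print(build_layout_tree(dom))
```

Only html, body, and p survive: the head subtree, the script, and the hidden div together with its span are left out.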

Layering

Because pages contain many complex effects, such as 3D transforms, page scrolling, or z-index ordering, the rendering engine also needs to generate dedicated layers for particular nodes and build a corresponding layer tree (LayerTree) to make these effects easier to achieve. In general, not every node in the layout tree has its own layer. If a node has no layer of its own, it belongs to the layer of its parent node. These layers are composited into the final page. The rendering engine creates a separate layer for a node under the following conditions:

Elements with stacking context properties are promoted to a separate layer. [Figure: schematic of the stacking context properties]

Places that need to be clipped also get their own layer. When an element has a fixed width and height and its content overflows the element, the excess is clipped and the rendering engine creates a separate layer to hold the clipped content.

Layer painting

Once the layer tree is built, the rendering engine paints each layer in the tree. It breaks the painting of a layer down into many small paint instructions, which are then assembled in order into a paint list for that layer.

Rasterization

When the paint list is ready, the main thread commits it to the compositor thread, which does the actual drawing. What is a viewport? The viewport is the part of the page that is visible on the screen. In some cases a layer can be very large while the user sees only a small part of it through the viewport, so drawing the entire layer at once would be expensive. The compositor thread therefore divides layers into tiles, usually 256×256 or 512×512 in size.

The compositor thread gives priority to generating bitmaps for the tiles near the viewport. The actual bitmap generation is performed by rasterization, which means converting a tile into a bitmap. The GPU is used to accelerate this step: generating bitmaps on the GPU is called fast rasterization, and the resulting bitmaps are stored in GPU memory.
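
The tiling and prioritization idea can be sketched as follows. The ordering rule used here, distance between the tile's center and the viewport's center, is an assumption chosen for simplicity and not the compositor's actual heuristic; tiles_for_layer and raster_order are hypothetical names, and rectangles are (x, y, width, height) tuples.

```python
def tiles_for_layer(width: int, height: int, tile: int = 256) -> list:
    """Cut a layer into (x, y, w, h) tiles; edge tiles may be smaller."""
    return [(x, y, min(tile, width - x), min(tile, height - y))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def raster_order(tiles: list, viewport: tuple) -> list:
    """Sort tiles so that those closest to the viewport are rasterized first."""
    vx, vy, vw, vh = viewport
    center_x, center_y = vx + vw / 2, vy + vh / 2

    def distance(t: tuple) -> float:
        x, y, w, h = t
        # Manhattan distance between the tile center and the viewport center.
        return abs(x + w / 2 - center_x) + abs(y + h / 2 - center_y)

    return sorted(tiles, key=distance)


tiles = tiles_for_layer(512, 1024)       # a tall layer: 2 columns x 4 rows
order = raster_order(tiles, viewport=(0, 512, 512, 256))
print(order[:2])   # the two tiles covering the viewport come first
```

With the viewport scrolled to y = 512, the row of tiles at y = 512 is scheduled first and the row at the top of the layer last.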

Composition and display

After all the tiles have been rasterized, the compositor thread generates a command for drawing the tiles, DrawQuad, and sends it to the browser process. The viz component in the browser process receives the DrawQuad command and draws the page contents into memory according to it. Finally, the contents of memory are displayed on the screen.

References

Working principle and practice of browser
Practice this time to thoroughly understand the browser caching mechanism