From the input URL to the rendering of the page

Parse the information in the address bar

The browser listens to the user's input and tries to match it to the URL or keyword you want to visit. Take Juejin (juejin.cn) as an example. After you type something in the address bar and press Enter, the browser makes the following decision:

Determine whether the input is a valid URL.
Yes. The browser then checks whether the URL is complete. If it is not, the browser may guess the missing parts and add a prefix or suffix to complete it. The common components of a URL include:
- Protocol (scheme): http, https, ws/wss (WebSocket)
- Domain name (hostname): This may be an IP address or a domain name. A domain name consists of the root domain, a top-level domain, a second-level domain, and so on, and its levels are read from right to left, separated by dots. In juejin.cn., the trailing . represents the root domain, .cn represents the top-level domain, and juejin.cn represents the second-level domain (that is, the host name).
- Port number: The default port number for HTTP is 80, and the default port number for HTTPS is 443. The browser automatically hides the default port number.
- Path: Uses / to separate each level, such as /web/user
- Query: Starts with ? and separates key-value pairs with &, such as ?username=zhangsan&age=16
- Hash: Starts with # and can be used to jump to a specific location on the current page
No. The browser treats the input as a search term, queries the default search engine set by the user, and returns the results.
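
The URL components listed above can be pulled apart with Python's standard urllib.parse module. This is only a minimal sketch: the describe_url helper, the DEFAULT_PORTS table, and the sample URL are made up for illustration and do not reflect how a browser implements this step.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports the browser fills in (and hides) when none is given.
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}

def describe_url(url: str) -> dict:
    """Split a URL into the components described in the list above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parse_qs(parts.query),  # each key maps to a list of values
        "hash": parts.fragment,
    }

# Hypothetical URL used only for this example.
info = describe_url("https://juejin.cn/web/user?username=zhangsan&age=16#top")
for key, value in info.items():
    print(f"{key}: {value}")
```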

Finding a strong cache

The browser process sends the URL request to the network process via inter-process communication (IPC). The network process receives it and issues the actual request, but first it checks whether the resource is cached locally. If a usable cached copy exists, it is returned directly to the browser process. The first check is the strong cache: if a strong-cache entry exists and has not expired, it is used directly; if it has expired, the resource is requested from the server again. Strong caching involves two fields:

Expires. The expiration time (Expires: Wed, 21 Oct 2015 07:28:00 GMT). HTTP/1.0 uses this field. It appears in the response header returned by the server and tells the browser that it may use the cached resource directly until that time. Its major drawback is that the expiration time is an absolute time set by the server but checked against the client's clock, so the result is inaccurate when the two clocks disagree. HTTP/1.1 therefore introduced the Cache-Control field, which takes precedence over Expires.

Cache-Control. This field was introduced in HTTP/1.1. It also appears in the response header returned by the server and tells the browser that it may use the cached resource directly while the resource is still fresh (for example, within the number of seconds given by max-age). It supports other directives as well. Here are some key ones:
- public. Both the browser and proxy servers may cache the resource
- private. Only the browser may cache the resource, not proxy servers
- no-cache. Skip the strong cache phase: the browser must send a request to the server and enter the negotiated cache phase before using the cached copy
- no-store. Do not cache the resource at all
- s-maxage. The cache lifetime for proxy servers
- must-revalidate. Once the cache expires, the browser must go back to the origin server for validation
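
To make the directives more concrete, here is a rough sketch of how a client might decide whether a cached response can be used without contacting the server. It assumes freshness is given by the max-age directive and ignores Expires, s-maxage, and heuristics; the function names are hypothetical.

```python
def parse_cache_control(header: str) -> dict:
    """Turn 'public, max-age=600' into {'public': True, 'max-age': '600'}."""
    directives = {}
    for item in header.split(","):
        item = item.strip().lower()
        if not item:
            continue
        name, _, value = item.partition("=")
        directives[name] = value.strip('"') if value else True
    return directives

def can_use_strong_cache(header: str, age_seconds: int) -> bool:
    """Return True if the cached copy may be used without asking the server."""
    directives = parse_cache_control(header)
    # no-store: nothing was cached; no-cache: must revalidate first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no usable lifetime in this simplified model
    return age_seconds < int(max_age)

print(can_use_strong_cache("public, max-age=600", age_seconds=120))
print(can_use_strong_cache("public, max-age=600", age_seconds=900))
print(can_use_strong_cache("no-cache, max-age=600", age_seconds=10))
```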

Extension:

How do you set the strong cache? Set the Cache-Control field and its value in server-side code.
Where are strongly cached resources stored? In the memory cache or the disk cache, that is, in memory or on the hard disk. Generally, images, script files, and font files go to the memory cache, and style files go to the disk cache.
What is the priority when accessing the cache? It follows the three-level caching principle: look in the memory cache first and use the resource if it is there; otherwise look in the disk cache and use it if it is there; if neither has it, make a network request and cache the returned resource according to the response header fields.

DNS domain name resolution

If the required resource is not found in the strong cache, the browser goes on to the network request. We usually enter a domain name in the address bar, but network communication identifies the destination host by IP address, so the IP address corresponding to the domain name has to be found first. What is DNS? The Domain Name System (DNS) is a distributed database that stores the mappings between domain names and IP addresses, so the corresponding IP can be looked up through it. This lookup is called DNS domain name resolution. Take juejin.cn. as an example to walk through the resolution process:

Browser DNS cache. The browser extracts the host name from the URL and looks for a record in its own DNS cache. If one exists, the cached IP is used directly and resolution is complete.
Hosts file. The browser checks whether the hosts file on the local machine has a record. If it does, the corresponding IP is returned and resolution is complete.
Local DNS server. The query is sent to the local DNS server. If the local DNS server has a record, it returns the record to the host and resolution is complete.
ISP DNS cache. The local DNS server forwards the query to the DNS server provided by the ISP. If that server has a record, it returns the record to the local DNS server, which returns it to the host, and resolution is complete.
Root domain name server. If forwarding mode is not in use, the local DNS server sends the request to one of the 13 root DNS servers. On receiving the request, the root server determines which top-level domain (TLD) server is authorized to manage the name (.cn) and returns the IP of a TLD server responsible for it. If forwarding mode is in use, the local DNS server forwards the request to an upstream DNS server, which resolves it. If the upstream server cannot resolve the request, it either asks a root DNS server or forwards the request further upstream, and so on.
Top-level domain name server. After getting the address of the top-level domain name server from the root domain name server, the local DNS server sends a query to the top-level domain name server. The top-level domain name server checks its zone file records. If it has a record, it returns the record to the local DNS server, which returns it to the host, and resolution is complete. If it cannot resolve the name itself, it finds the second-level domain name server that manages the domain under .cn and returns that server's IP to the local DNS server.
Second-level domain name server. After getting the address of the second-level domain name server from the top-level domain name server, the local DNS server sends a query to the second-level domain name server. The second-level domain name server checks its zone file records and returns the record to the local DNS server, completing the resolution. If resolution has still not completed at this stage, the domain name may be wrong and an error is raised.

Recursive query: The client sends a query to the local DNS server and waits for the local DNS server to return the result. If the local DNS server cannot resolve the name itself, it acts as a DNS client and queries other DNS servers until it can return the result to the client.

Iterative query: The local DNS server sends a query to the root domain name server. If that server has no matching record, it returns the IP of the next domain name server to ask (sometimes referred to as root hints), and the local DNS server queries that server in turn. This repeats until the result is found and returned to the client.
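
The iterative lookup can be illustrated with a toy model. Everything below is an assumption made for the sketch: the SERVERS table, the server names, and the IP address (taken from a documentation-only range) are invented, and real resolvers speak the DNS protocol over the network rather than reading a dictionary.

```python
# Toy zone data: each server either holds a record or refers to the next server.
SERVERS = {
    "root": {"referrals": {"cn.": "tld-cn"}},
    "tld-cn": {"referrals": {"juejin.cn.": "ns-juejin"}},
    "ns-juejin": {"records": {"juejin.cn.": "192.0.2.10"}},  # made-up address
}

def resolve(name: str, cache: dict):
    """Resolve a name the way a local DNS server iterates from the root down."""
    if name in cache:
        return cache[name], ["cache"]
    server, path = "root", []
    while server is not None:
        path.append(server)
        zone = SERVERS[server]
        records = zone.get("records", {})
        if name in records:
            cache[name] = records[name]  # remember the answer for next time
            return records[name], path
        # Follow the referral whose suffix matches the queried name
        # (naive suffix match, good enough for this toy data).
        server = next(
            (target for suffix, target in zone.get("referrals", {}).items()
             if name.endswith(suffix)),
            None,
        )
    raise LookupError(f"cannot resolve {name}")

cache = {}
first = resolve("juejin.cn.", cache)
second = resolve("juejin.cn.", cache)
print(first)
print(second)
```

The second call is answered from the cache, which mirrors the cache checks that come before any server is contacted.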

Setting up a TCP connection

Connections are established with a three-way handshake. Its main purpose is to confirm that both parties can send and receive normally and to agree on initial sequence numbers in preparation for reliable transmission. In essence, the client connects to a specified port on the server, the two sides establish the TCP connection, synchronize their sequence numbers and acknowledgment numbers, and exchange TCP window size information. Initially, the client is in the CLOSED state and the server is in the LISTEN state.

The client sends a SYN segment to the server with the synchronization bit SYN = 1 and an initial sequence number seq = x. The client then enters the SYN_SENT state.
When the server receives the client's SYN segment, it replies with its own SYN segment (SYN = 1, ACK = 1), specifies its own initial sequence number seq = y, and uses the client's seq + 1 = x + 1 as the acknowledgment number (ack = x + 1), indicating that it has received the client's SYN. The server then enters the SYN_RCVD state.
When the client receives the server's SYN segment, it replies with an ACK segment (ACK = 1), uses seq = x + 1 as its own sequence number, and uses the server's seq + 1 = y + 1 as the acknowledgment number (ack = y + 1), indicating that it has received the server's SYN. The client then enters the ESTABLISHED state. When the server receives the ACK segment, it also enters the ESTABLISHED state, and the connection between the two parties is established.
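
The sequence and acknowledgment numbers in the three steps can be traced with a small simulation. This is only a sketch of the bookkeeping, not a TCP implementation; the function name and the sample values of x and y are arbitrary.

```python
def three_way_handshake(x: int, y: int):
    """Return (sender, sender's state after sending, segment) for each step."""
    syn = {"SYN": 1, "seq": x}
    syn_ack = {"SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
    ack = {"ACK": 1, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    return [
        ("client", "SYN_SENT", syn),
        ("server", "SYN_RCVD", syn_ack),
        ("client", "ESTABLISHED", ack),
    ]

# Arbitrary initial sequence numbers chosen for the example.
steps = three_way_handshake(x=100, y=300)
for sender, state, segment in steps:
    print(sender, state, segment)
```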

The client initiates the request and the server processes the request

The client sends the request line, request headers, and request body to the server.

After receiving the request, the server runs its processing logic and returns the response data (response line, response headers, response body, and so on) according to the result.

A note on how HTTP data is transferred reliably. During transmission, the data is split into small packets at the TCP layer and sent to the receiver one after another. The receiver must acknowledge each packet it receives. If the sender does not receive the acknowledgment, it judges the packet to be lost and resends it. The receiver then reassembles the received packets in order to obtain the complete data.
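
The split, acknowledge, resend, and reassemble cycle can be sketched as follows. The model is deliberately simplified: packets are sent one at a time, loss is simulated by a set of offsets whose first attempt is dropped, and all names are hypothetical.

```python
def split_into_packets(data: bytes, size: int):
    """Cut the data into (offset, chunk) pairs; the offset acts as a sequence number."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]

def send_reliably(packets, drop_first_attempt):
    """Send each packet until it is acknowledged; return what arrived and the send count."""
    received, attempts = [], 0
    for seq, chunk in packets:
        tries = 0
        while True:
            attempts += 1
            tries += 1
            lost = seq in drop_first_attempt and tries == 1
            if not lost:
                received.append((seq, chunk))  # receiver gets it and acknowledges
                break
            # No acknowledgment arrived, so the sender resends the packet.
    return received, attempts

def reassemble(received):
    """Put the packets back in order and join them into the original data."""
    return b"".join(chunk for _, chunk in sorted(received))

message = b"GET /web/user HTTP/1.1\r\nHost: juejin.cn\r\n\r\n"
packets = split_into_packets(message, 8)
received, attempts = send_reliably(packets, drop_first_attempt={8})
received.reverse()  # pretend the packets arrived out of order
rebuilt = reassemble(received)
print(rebuilt == message, attempts)
```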

Close the TCP connection

After data transmission, the Connection header determines whether the connection should be closed. If the request header or response header contains Connection: Keep-Alive, the connection is persistent and later requests for resources on the same site reuse it (in HTTP/1.1, connections are persistent by default unless Connection: close is sent). Otherwise the connection is closed. Closing follows the four-way wave (four-way handshake). Either the client or the server can initiate it. At this point both are in the ESTABLISHED state. Assuming the client initiates the close, the process is as follows:

The client sends a connection-release segment (FIN = 1) with a sequence number (seq = u), stops sending data, and actively closes the TCP connection. The client enters the FIN_WAIT_1 (termination wait 1) state and waits for the server's acknowledgment.
When the server receives the connection-release segment, it sends an acknowledgment segment (ACK = 1) with its own sequence number (seq = v) and uses the client's seq + 1 = u + 1 as the acknowledgment number (ack = u + 1), indicating that the client's segment has been received. The server enters the CLOSE_WAIT (close wait) state. TCP is now half-closed: the client-to-server direction of the connection has been released. After receiving the server's acknowledgment, the client enters the FIN_WAIT_2 (termination wait 2) state and waits for the server's connection-release segment.
If the server has no more data to send to the client, it sends a connection-release segment (FIN = 1, ACK = 1, seq = w, ack = u + 1) and enters the LAST_ACK (last acknowledgment) state, waiting for the client's acknowledgment.
On receiving the server's connection-release segment, the client sends an acknowledgment segment (ACK = 1, seq = u + 1, ack = w + 1) and enters the TIME_WAIT (time wait) state. The TCP connection is not yet released at this point: only after the timer set to 2MSL (twice the maximum segment lifetime) expires does the client enter the CLOSED state.
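
As with the handshake, the numbers exchanged in the four steps can be written out explicitly. The sketch below only reproduces the bookkeeping described above; the function name and the values of u, v, and w are arbitrary.

```python
def four_way_close(u: int, v: int, w: int):
    """Return (sender, sender's state after sending, segment) for each step."""
    return [
        ("client", "FIN_WAIT_1", {"FIN": 1, "seq": u}),
        ("server", "CLOSE_WAIT", {"ACK": 1, "seq": v, "ack": u + 1}),
        ("server", "LAST_ACK", {"FIN": 1, "ACK": 1, "seq": w, "ack": u + 1}),
        ("client", "TIME_WAIT", {"ACK": 1, "seq": u + 1, "ack": w + 1}),
    ]

# Arbitrary sequence numbers chosen for the example.
close_steps = four_way_close(u=500, v=700, w=720)
for sender, state, segment in close_steps:
    print(sender, state, segment)
# After waiting 2MSL in TIME_WAIT, the client moves to CLOSED.
```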

Processing response information

After receiving the response data, the network process begins to parse the response header. When the response status code is parsed, it will be processed according to different code values. The following are the common status codes:

200 OK. The request is processed successfully and the response data is placed in the response body.
301 Moved Permanently. Permanent redirect. If the old domain address is no longer in use and the resource must be accessed through a new address, the response status code can be set to 301. By default the browser caches the redirect and goes to the new address automatically when the original address is visited again.
302 Found. Temporary redirect. If the original address is only temporarily unavailable, return a 302 status code. For example, when the site is under maintenance, visitors can be sent to an explanation page on the current domain.
304 Not Modified. Negotiated cache. The first time a browser requests a resource, the server returns the Last-Modified and ETag fields in the response header.
- Last-Modified. The time the resource was last modified. When the browser requests the resource again, it sends this value in the If-Modified-Since request header. The server compares the If-Modified-Since value with the resource's current Last-Modified value. If If-Modified-Since is earlier than the server's Last-Modified, the resource has been updated: the server returns the new resource with the new Last-Modified value and a 200 status code. Otherwise it returns a 304 status code, telling the browser to use the cached resource directly.
- ETag. A hash value generated from the file content that identifies the current version of the resource. When the browser requests the resource again, it sends this value in the If-None-Match request header. The server compares the If-None-Match value with the resource's current ETag. If the two differ, the server returns the new resource with the new ETag and a 200 status code. Otherwise it returns a 304 status code, telling the browser to use the cached resource directly. When ETag and Last-Modified coexist, the server checks ETag first and then Last-Modified to decide which status code to return.
400 Bad Request. Incorrect request parameter
401 Unauthorized. Authentication is required.
403 Forbidden. Server access prohibited
404 Not Found. The server did not find the corresponding resource
500 Internal Server Error. The server failed.
503 Service Unavailable. The server is busy and temporarily unable to handle the request.
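
The 304 negotiation described in the list above can be sketched from the server's side. This is a simplified illustration, not a real server: the respond and make_etag helpers are hypothetical, the ETag is a truncated SHA-256 of the body, and Last-Modified is consulted only when the request carries no If-None-Match header.

```python
import hashlib
from email.utils import parsedate_to_datetime

def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (one possible scheme among many)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, last_modified: str, request_headers: dict):
    """Return (status, headers, body) for a possibly conditional request."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Last-Modified": last_modified}
    if "If-None-Match" in request_headers:
        not_modified = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        not_modified = since >= parsedate_to_datetime(last_modified)
    else:
        not_modified = False  # first request: nothing to compare against
    if not_modified:
        return 304, headers, b""  # browser reuses its cached copy
    return 200, headers, body

stamp = "Wed, 21 Oct 2015 07:28:00 GMT"
status1, headers1, _ = respond(b"v1", stamp, {})
status2, _, body2 = respond(b"v1", stamp, {"If-None-Match": headers1["ETag"]})
status3, _, _ = respond(b"v2", stamp, {"If-None-Match": headers1["ETag"]})
status4, _, _ = respond(b"v1", stamp, {"If-Modified-Since": stamp})
print(status1, status2, status3, status4)
```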

After the status code, the network process checks the Content-Type field, which tells the browser what type of data the response body contains; the browser uses its value to decide how to handle the body. If the value is application/octet-stream (a byte stream), the response is usually treated as a download. If the value is text/html, the browser prepares the rendering process.

Preparing the render process

Normally, opening a new tab starts a new renderer process, with one exception: tabs that belong to the same site share the same renderer process. Two pages belong to the same site when:

The root domain name and the protocol are the same

This includes all subdomains under the same root domain name and different port numbers

For example, the following belong to the same site:
https://time.geekbang.org
https://www.geekbang.org
https://www.geekbang.org:8080
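
A rough same-site check might look like the sketch below. It makes a strong simplifying assumption: the root domain is taken to be the last two labels of the host name, which is wrong for suffixes such as .com.cn. Browsers rely on the Public Suffix List instead, so treat this only as an illustration of the rule stated above.

```python
from urllib.parse import urlsplit

def site_of(url: str):
    """Return (protocol, root domain); the port and subdomain are ignored."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Naive assumption: the root domain is the last two labels.
    return parts.scheme, ".".join(labels[-2:])

def is_same_site(a: str, b: str) -> bool:
    return site_of(a) == site_of(b)

print(is_same_site("https://time.geekbang.org", "https://www.geekbang.org:8080"))
print(is_same_site("http://www.geekbang.org", "https://www.geekbang.org"))
```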

Submission phase

When the renderer process is ready, the document commit (submission) phase begins. The process is as follows:

First, when the browser process receives the response header data from the network process, it sends a “commit document” message to the renderer process.
When the renderer process receives the “commit document” message, it sets up a “pipeline” with the network process to transfer the document data.
When the document data transfer is complete, the renderer process returns a “commit confirmed” message to the browser process.
When the browser process receives the “commit confirmed” message, it updates the browser interface state, including the security state, the URL in the address bar, and the back/forward history state, and updates the web page.

Build a DOM tree

Browsers cannot understand and use HTML directly, so the rendering engine transforms HTML into a DOM tree, a structure the browser can work with. You can type document in the browser console to inspect it. The conversion process is as follows:

Conversion (bytes -> characters): The browser reads the raw bytes of the HTML from disk or the network and converts them to individual characters based on the file's specified encoding (such as UTF-8).
Tokenization (characters -> tokens): The browser converts strings of characters into the tokens defined by the W3C HTML standard (for example, <html> and <body>), along with any other strings inside angle brackets. Each token has a special meaning and its own set of rules.
Lexical analysis (tokens -> nodes): The emitted tokens are converted into “objects” that define their properties and rules.
DOM construction (nodes -> DOM): Because HTML markup defines the relationships between tags, the created objects are linked in a tree data structure that also captures the parent-child relationships defined in the original markup: the html object is the parent of the body object, body is the parent of the paragraph object, and so on.
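
The token-to-tree steps can be imitated with Python's built-in html.parser, which emits start-tag, end-tag, and text events much like tokens. The sketch assumes well-formed markup without void elements such as <br>; the Node and TreeBuilder classes are hypothetical and far simpler than a browser's parser.

```python
from html.parser import HTMLParser

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children, self.text = tag, parent, [], ""

class TreeBuilder(HTMLParser):
    """Link nodes into a tree as start and end tags (tokens) arrive."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back to the parent

    def handle_data(self, data):
        self.current.text += data.strip()

def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines += outline(child, depth + 1)
    return lines

builder = TreeBuilder()
builder.feed("<html><body><p>Hello</p></body></html>")
builder.close()
tree = outline(builder.root)
print("\n".join(tree))
```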

While parsing the HTML, the network process may need to download script files and style files, which raises the question of whether they block DOM parsing and rendering. For details, see the article “CSS and JS block DOM parsing and rendering”; its conclusions are:

CSS does not block DOM parsing, but it does block DOM rendering.
JS blocks DOM parsing, but browsers will “peek” at the DOM and download relevant resources in advance.
When the browser encounters a

Style calculation

Convert CSS to a structure that the browser can understand: The browser cannot understand CSS text directly, so when the rendering engine receives CSS text, it converts it into styleSheets. You can inspect them by typing document.styleSheets in the browser console.
Normalize the property values in the style sheets: For example, a hexadecimal color written in the code is converted to RGB format. Li Bing's course says that em units are converted to px and bold is converted to 700, but when I opened Chrome's developer tools I did not see that conversion, which differs slightly from the teacher's description.
Compute the style of each node in the DOM tree: This is done mainly through the inheritance rules and the cascading rules. The resulting style of each DOM node is saved in a ComputedStyle structure.
Inheritance rule: If a child node does not set inheritable properties such as font-size, color, or font-family, it inherits them from its parent node; if the parent node does not set them either, the UserAgent default styles are used. Note: only inheritable properties can be inherited.
Cascading rule: An algorithm that defines how to combine property values from multiple sources.
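
Normalization and inheritance can be illustrated with a small sketch. The property set, the UA_DEFAULTS values, and the function names are assumptions made for the example; a real engine handles far more properties and applies the full cascade before inheritance.

```python
INHERITED = {"color", "font-size", "font-family"}
# Hypothetical user-agent defaults, for illustration only.
UA_DEFAULTS = {"color": "rgb(0, 0, 0)", "font-size": "16px",
               "font-family": "serif", "display": "inline"}

def hex_to_rgb(value: str) -> str:
    """Normalize a #rrggbb color to rgb(r, g, b)."""
    digits = value.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgb({r}, {g}, {b})"

def computed_style(own: dict, parent: dict = None) -> dict:
    """Own value first, then the inherited value, then the user-agent default."""
    result = {}
    for prop, default in UA_DEFAULTS.items():
        if prop in own:
            result[prop] = own[prop]
        elif prop in INHERITED and parent is not None:
            result[prop] = parent[prop]
        else:
            result[prop] = default
    return result

body = computed_style({"color": hex_to_rgb("#ff0000"), "display": "block"})
paragraph = computed_style({"font-size": "12px"}, parent=body)
print(paragraph)
```

Here the paragraph inherits color from body but not display, because display is not an inheritable property.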

The layout phase

The layout phase calculates the geometric positions of the visible elements in the DOM tree. The process is as follows:

Create a layout tree: Traverse the DOM tree and generate a layout tree that contains only the visible nodes. Invisible nodes are ignored and do not appear in the layout tree. They include: 1. nodes that produce no rendered output (script/link/meta/head); 2. nodes hidden by CSS (display: none).
Layout calculation: Calculate the geometric position of each node in the layout tree and save the result in the layout tree.
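
The first step, filtering out invisible nodes, can be sketched as a recursive copy of the tree. The dictionary-based node format below is an assumption made for the example, and the geometry calculation itself is left out.

```python
# Tags that never produce rendered output, as listed above.
NON_RENDERED = {"head", "script", "link", "meta"}

def build_layout_tree(node: dict):
    """Copy the DOM tree, skipping nodes that will not be rendered."""
    hidden = node.get("style", {}).get("display") == "none"
    if node["tag"] in NON_RENDERED or hidden:
        return None  # the whole subtree is left out of the layout tree
    children = []
    for child in node.get("children", []):
        laid_out = build_layout_tree(child)
        if laid_out is not None:
            children.append(laid_out)
    return {"tag": node["tag"], "children": children}

dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "meta"}]},
    {"tag": "body", "children": [
        {"tag": "p"},
        {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "span"}]},
        {"tag": "script"},
    ]},
]}
layout = build_layout_tree(dom)
print(layout)
```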

Layering

Pages contain many complex effects, such as 3D transforms, page scrolling, and z-axis ordering with z-index. To implement these effects more easily, the rendering engine generates dedicated layers for specific nodes and builds a corresponding layer tree (LayerTree). Not every node in the layout tree has its own layer: a node without one belongs to the layer of its parent node. These layers are later combined into the final page. The rendering engine creates a separate layer for a node in the following cases:

Elements with stacking context properties are promoted to a separate layer.
[Figure: schematic diagram of stacking context properties]
Areas that need to be clipped are also given their own layers. When an element has a fixed width and height and its content overflows, the overflow is clipped and the rendering engine creates a separate layer to hold the clipped content.

Layer drawing

After building the layer tree, the rendering engine will draw each layer in the layer tree. The rendering engine breaks down the drawing of a layer into a number of small drawing instructions, and then organizes these instructions into a list of things to be drawn.

Rasterization (raster)

When the drawing list is ready, the main thread commits it to the compositing thread, which does the actual drawing. First, what is a viewport? The viewport is the area of the page visible on the screen. A layer can be very long, and the user sees only a small part of it through the viewport, so drawing the entire layer at once would be very expensive. The compositing thread therefore divides the layer into tiles, usually 256×256 or 512×512 in size.

The compositing thread generates bitmaps for the tiles near the viewport first. The bitmaps are produced by rasterization, which converts a tile into a bitmap. Rasterization is usually accelerated by the GPU; generating bitmaps on the GPU is called fast rasterization (GPU rasterization), and the resulting bitmaps are stored in GPU memory.
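
Tiling and viewport-first ordering can be sketched with plain rectangles. The distance measure and function names below are assumptions for illustration; the actual prioritization in a browser's compositor is more involved.

```python
def tiles_for_layer(width: int, height: int, tile: int = 256):
    """Divide a layer into (x, y, w, h) tiles, row by row."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]

def raster_order(tiles, viewport):
    """Sort tiles so that those inside or nearest the viewport come first."""
    vx, vy, vw, vh = viewport

    def distance(t):
        x, y, w, h = t
        # Gap between the tile and the viewport on each axis (0 if they overlap).
        dx = max(vx - (x + w), x - (vx + vw), 0)
        dy = max(vy - (y + h), y - (vy + vh), 0)
        return dx + dy

    return sorted(tiles, key=distance)

tiles = tiles_for_layer(512, 1024)  # a tall layer: 2 columns x 4 rows
order = raster_order(tiles, viewport=(0, 600, 512, 100))
print([t[1] for t in order])  # y offsets in rasterization order
```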

Composition and display

After all tiles have been rasterized, the compositing thread generates a DrawQuad command to draw them and sends it to the browser process. A component in the browser process called viz receives the DrawQuad command and draws the page content into memory. Finally, the contents of memory are displayed on the screen.

References

How Browsers Work and Practice
Practice this once and for all to understand the browser caching mechanism