Parse the information in the address bar
The browser listens to the user’s input and tries to match it to the URL or keyword the user wants to visit. Take Juejin (juejin.cn) as an example: after you type into the address bar and press Enter, the browser makes the following judgments:
- Determine whether the input is a valid URL.
- Yes. Continue to determine whether the URL is complete. If it is not, the browser may guess the missing parts and add a prefix or suffix to complete it. Common URL components include:
  - Protocol: http, https, ws / wss (WebSocket).
  - Domain name (hostname): this may be an IP address or a domain name. A domain name may consist of a root domain, a top-level domain, a second-level domain, and so on. Its levels are read from right to left and separated by "."; for example, in juejin.cn. the trailing "." represents the root domain, .cn the top-level domain, and juejin.cn the second-level domain (that is, the host name).
  - Port number: the default port for HTTP is 80 and the default port for HTTPS is 443. The browser automatically hides the default port number.
  - Path: levels are separated by "/", such as /web/user.
  - Query: starts with "?" and separates key-value pairs with "&", such as ?username=zhangsan&age=16.
  - Hash: starts with "#" and can be used to jump to a specific location in the current page.
- No. The browser treats the input as search terms, queries the default search engine set by the user, and returns the results.
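The URL components listed above can be pulled apart with Python's standard library. This is only a sketch of the idea, not how a browser implements it; the helper name describe_url and the default-port table are assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed default ports per protocol; browsers hide these in the address bar.
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}

def describe_url(url: str) -> dict:
    """Split a complete URL into the components described in the list above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        # Fall back to the protocol's default port when none is written.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parse_qs(parts.query),
        "hash": parts.fragment,
    }

print(describe_url("https://juejin.cn/web/user?username=zhangsan&age=16#top"))
```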
Finding a strong cache
The browser process sends the URL request to the network process via inter-process communication (IPC); the network process receives it and initiates the real request. Before doing so, however, the network process checks whether the resource is cached locally. If it is, the cached resource is returned directly to the browser process. The strong cache is checked first: if a strongly cached copy exists and has not expired, it is used directly; if it has expired, the resource is requested from the server again. Strong caching involves two fields:
- Expires. The expiration time (for example, Expires: Wed, 21 Oct 2015 07:28:00 GMT). HTTP/1.0 uses this field. It is set in the response header returned by the server and tells the browser that the cached resource can be used directly until the expiration time. Its big drawback is that it is an absolute time: when the server and client clocks are inconsistent, the expiration check is inaccurate. HTTP/1.1 therefore introduced the Cache-Control field, whose max-age directive takes precedence over Expires when both are present.
- Cache-Control. This field is used in HTTP/1.1 and is also set in the response header returned by the server, telling the browser to use the cached resource directly while it has not expired. It can carry other directives as well. Here are some key ones:
  - public. Both the browser and proxy servers can cache the resource.
  - private. Only the browser can cache the resource, not proxy servers.
  - no-cache. Skip the strong cache phase and send a request to the server, entering the negotiated cache phase.
  - no-store. Do not cache at all.
  - s-maxage. The cache lifetime for proxy servers.
  - must-revalidate. Once the cache expires, it must be revalidated with the origin server.
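The strong-cache decision can be sketched as a small function. This is a simplified illustration under stated assumptions, not a full HTTP cache: the function names are hypothetical, header names are looked up case-sensitively, and only max-age, no-cache, no-store and Expires are considered.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def parse_cache_control(value: str) -> dict:
    """Turn 'public, max-age=60' into {'public': None, 'max-age': '60'}."""
    directives = {}
    for item in value.split(","):
        item = item.strip().lower()
        if not item:
            continue
        name, _, arg = item.partition("=")
        directives[name] = arg.strip('"') or None
    return directives

def can_use_strong_cache(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Decide whether a stored response may be used without contacting the server."""
    directives = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in directives or "no-cache" in directives:
        return False
    if directives.get("max-age") is not None:
        # max-age is relative to the time the response was stored,
        # so it does not depend on the server's clock.
        return now - stored_at < timedelta(seconds=int(directives["max-age"]))
    if "Expires" in headers:
        # Expires is an absolute time and is only consulted without max-age.
        return now < parsedate_to_datetime(headers["Expires"])
    return False

stored = datetime(2015, 10, 21, 7, 0, tzinfo=timezone.utc)
print(can_use_strong_cache({"Cache-Control": "max-age=60"}, stored, stored + timedelta(seconds=30)))
```

Checking max-age before Expires mirrors the precedence between the two fields: the relative lifetime wins when both are present.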
Extension:
- How to set the strong cache? Set the Cache-Control field and its value in server-side code.
- Where are strongly cached resources stored? In the memory cache or the disk cache, that is, in memory or on the hard disk. Generally, images, script files and font files are stored in the memory cache, and style files are stored in the disk cache.
- What is the priority when accessing the cache? Follow the three-level caching principle: look in the memory cache first and use the resource directly if it is there; otherwise look in the disk cache and use the resource directly if it is there; otherwise make a network request, and cache the returned resource according to the response header fields.
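The three-level lookup order can be written down in a few lines. This is a toy model: the two caches are plain dictionaries, fetch is a hypothetical callback standing in for the network request, and the response is always stored in the disk cache rather than following the response header fields.

```python
from typing import Callable, Dict, Tuple

def load_resource(
    url: str,
    memory_cache: Dict[str, bytes],
    disk_cache: Dict[str, bytes],
    fetch: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Return the resource body and the level that served it."""
    if url in memory_cache:           # level 1: memory cache
        return memory_cache[url], "memory cache"
    if url in disk_cache:             # level 2: disk cache
        return disk_cache[url], "disk cache"
    body = fetch(url)                 # level 3: network request
    disk_cache[url] = body            # assumption: always store the result
    return body, "network"
```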
DNS domain name resolution
If the required resource is not found in the strong cache, the network request process begins. We usually enter a domain name in the address bar, but network communication identifies the destination host by its IP address, so the IP address corresponding to the domain name has to be found first. What is DNS? The Domain Name System (DNS) is a distributed database that stores the mappings between domain names and IP addresses, so the corresponding IP can be looked up through it. This process is called DNS domain name resolution. Take juejin.cn as an example to walk through the resolution process:
- Browser DNS cache. The browser extracts the host name from the URL and looks it up in its own DNS cache; if a record is found, the cached IP is used directly and resolution is complete.
- Hosts file. Check whether the hosts file on the local machine has a matching record; if so, return the corresponding IP and resolution is complete.
- Local DNS server. Send the query to the local DNS server; if it has the record, it returns the record to the host and resolution is complete.
- ISP DNS cache. The local DNS server forwards the query to the DNS server provided by the ISP; if that server has the record, it returns it to the local DNS server, which returns it to the host, and resolution is complete.
- Root domain name server. If forwarding mode is not in use, the local DNS server sends the request to one of the 13 root DNS servers. The root server looks at the top-level domain of the name (.cn) and returns the IP of the top-level domain (TLD) server authorized to manage it. If forwarding mode is in use, the local DNS server forwards the request to its upstream DNS server, which resolves it; if the upstream server cannot, it either queries the root servers or forwards the request further upstream, and so on.
- Top-level domain name server. After getting the address of the TLD server from the root server, the local DNS server sends the query to the TLD server. The TLD server checks its zone file records; if it has the record, it returns it to the local DNS server, which returns it to the host, and resolution is complete. If it cannot resolve the name itself, it returns to the local DNS server the IP of the second-level domain name server that manages the domain under .cn.
- Second-level domain name server. After getting the address of the second-level domain name server from the TLD server, the local DNS server sends the query to it. The second-level domain name server checks its zone file records and returns the record to the local DNS server, which returns it to the host, and resolution is complete. If resolution still fails at this stage, the domain name is probably wrong and an error is raised.

Recursive query: the client sends a query to the local DNS server and waits for it to return the result. If the local DNS server cannot resolve the name itself, it acts as a DNS client and queries other DNS servers until it can return the result to the client.
Iterative query (also called the root hints method): the local DNS server sends a query to the root domain name server; if the root server has no matching record, it returns the IP of the next server to ask. The local DNS server repeats this with each server in turn until it obtains the result, which it returns to the client.
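The iterative walk from the root server down to the second-level domain name server can be simulated with toy data. Everything here is hypothetical: the server names, the referral table and the address 192.0.2.10 (a documentation-only IP) are invented for illustration, and no real DNS traffic is involved.

```python
# Hypothetical zone data: each server either holds the record or refers
# the resolver to the next server that manages a matching suffix.
SERVERS = {
    "root": {"referrals": {"cn": "tld-cn"}},
    "tld-cn": {"referrals": {"juejin.cn": "ns-juejin"}},
    "ns-juejin": {"records": {"juejin.cn": "192.0.2.10"}},
}

def resolve_iteratively(name: str, cache: dict) -> list:
    """Resolve name, store the IP in cache, and return the servers asked."""
    if name in cache:
        return ["cache"]
    steps = []
    server = "root"
    while True:
        steps.append(server)
        zone = SERVERS[server]
        if name in zone.get("records", {}):
            cache[name] = zone["records"][name]
            return steps
        # Follow the referral whose suffix matches the queried name.
        next_server = next(
            (target for suffix, target in zone.get("referrals", {}).items()
             if name == suffix or name.endswith("." + suffix)),
            None,
        )
        if next_server is None:
            raise LookupError(f"cannot resolve {name}")
        server = next_server
```

A second lookup of the same name is answered from the cache, which is the role the browser, hosts file and local DNS caches play in the steps above.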
Setting up a TCP connection
Connections are established with the three-way handshake. Its main purpose is to confirm that both parties can send and receive normally, and to let each side specify its own initial sequence number in preparation for reliable transmission. In essence, the client connects to the specified port on the server, establishes the TCP connection, synchronizes the sequence numbers and acknowledgment numbers of both sides, and exchanges TCP window size information. Initially the client is in the CLOSED state and the server is in the LISTEN state.
- The client sends a SYN segment to the server with the synchronization bit SYN=1 and its initial sequence number seq=x; the client then enters the SYN_SENT state.
- When the server receives the client’s SYN segment, it replies with its own SYN segment (SYN=1, ACK=1), specifies its own initial sequence number seq=y, and uses the client’s seq + 1 = x + 1 as the acknowledgment number (ack = x + 1), indicating that it has received the client’s SYN segment; the server then enters the SYN_RCVD state.
- After the client receives the server’s SYN segment, it replies with an ACK segment (ACK=1), sets its own sequence number to seq=x+1, and uses the server’s seq + 1 = y + 1 as the acknowledgment number (ack = y + 1), indicating that it has received the server’s SYN segment; the client then enters the ESTABLISHED state. When the server receives the ACK segment, it also enters the ESTABLISHED state. At this point the two parties have established the connection.
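The sequence and acknowledgment numbers of the three segments follow directly from the two initial sequence numbers. The sketch below only computes those numbers; it opens no socket, and the function name and dictionary layout are assumptions made for illustration.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments for client ISN x and server ISN y."""
    syn = {"from": "client", "SYN": 1, "seq": x}
    syn_ack = {
        "from": "server", "SYN": 1, "ACK": 1,
        "seq": y,
        "ack": (syn["seq"] + 1) % SEQ_SPACE,      # acknowledges the client's SYN
    }
    ack = {
        "from": "client", "ACK": 1,
        "seq": (x + 1) % SEQ_SPACE,
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,  # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]

for segment in three_way_handshake(100, 300):
    print(segment)
```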
The client initiates the request and the server processes the request
The client sends the request line, request headers and request body to the server.
After receiving the request, the server runs its processing logic and returns the response data (response line, response headers, response body, and so on) according to the result.
How the data is transferred is worth mentioning here. While HTTP data is in transit, the underlying TCP layer splits it into smaller packets and sends them to the receiver one after another. The receiver must acknowledge each packet to the sender. If the sender does not receive the acknowledgment, it considers the packet lost and resends it. The receiver then reassembles the received packets in order to obtain the complete data.
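Splitting, retransmitting and reassembling can be modelled in a few lines. This is a toy model rather than real TCP: segments are (offset, bytes) pairs, and is_lost is a hypothetical callback that decides whether a given transmission attempt is dropped.

```python
def split_into_segments(data: bytes, size: int) -> list:
    """Cut data into (offset, chunk) pairs of at most size bytes."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]

def send_reliably(segments: list, is_lost) -> tuple:
    """Resend each segment until it gets through; return (received, attempts)."""
    received, attempts = [], 0
    for segment in segments:
        while True:
            attempts += 1
            if not is_lost(attempts):
                received.append(segment)  # the receiver acknowledges this one
                break
            # No acknowledgment arrived: treat the segment as lost and resend.
    return received, attempts

def reassemble(segments: list) -> bytes:
    """Put segments back in offset order, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(segments))
```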
Close the TCP connection
After the data has been transferred, the Connection field determines whether the connection should be closed. If the request header or response header contains Connection: Keep-Alive, the connection is persistent, and later requests for resources on the same site reuse it. Otherwise the connection is closed. Closing follows the four-way wave (four-way handshake), and it can be initiated by either the client or the server. At this point both the client and the server are in the ESTABLISHED state. Assuming the client initiates the close, the process is as follows:
- The client sends a connection-release FIN segment (FIN = 1) with its sequence number (seq = u), stops sending data, and actively closes the TCP connection. The client enters the FIN_WAIT_1 (terminate wait 1) state and waits for the server to confirm.
- When the server receives the FIN segment, it sends an ACK segment (ACK = 1) with its own sequence number (seq = v) and uses the client’s seq + 1 = u + 1 as the acknowledgment number (ack = u + 1), indicating that the client’s segment has been received. The server enters the CLOSE_WAIT (close wait) state. TCP is now half-closed: the client-to-server direction of the connection has been released. After receiving the server’s confirmation, the client enters the FIN_WAIT_2 (terminate wait 2) state and waits for the server’s connection-release segment.
- When the server has no more data to send to the client, it sends its own connection-release segment (FIN=1, ACK=1, seq=w, ack=u+1). The server enters the LAST_ACK (last acknowledgment) state and waits for the client to confirm.
- On receiving the server’s connection-release segment, the client sends an acknowledgment segment (ACK=1, seq=u+1, ack=w+1) and enters the TIME_WAIT (time wait) state. The TCP connection is not released yet; only after the timer of 2MSL expires does the client enter the CLOSED state.
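The four segments and the state each sender enters can be tabulated the same way as the handshake. Again this only computes the numbers described above; the function name and tuple layout are assumptions for illustration.

```python
def four_way_close(u: int, v: int, w: int) -> list:
    """Return (sender, segment, state the sender enters) for each step.

    u is the client's sequence number, v and w are the server's
    sequence numbers for its ACK and FIN segments.
    """
    return [
        ("client", {"FIN": 1, "seq": u}, "FIN_WAIT_1"),
        ("server", {"ACK": 1, "seq": v, "ack": u + 1}, "CLOSE_WAIT"),
        ("server", {"FIN": 1, "ACK": 1, "seq": w, "ack": u + 1}, "LAST_ACK"),
        ("client", {"ACK": 1, "seq": u + 1, "ack": w + 1}, "TIME_WAIT"),
    ]

for sender, segment, state in four_way_close(1000, 5000, 5200):
    print(sender, segment, "->", state)
```

The server's ACK and FIN are separate segments because the server may still have data to send between them, which is why closing takes four steps instead of three.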
Processing response information
After receiving the response data, the network process begins to parse the response header. When the response status code is parsed, it will be processed according to different code values. The following are the common status codes:
- 200 OK. The request was processed successfully and the response data is in the response body.
- 301 Moved Permanently. Permanent redirect. If the previous domain address is no longer in use and a new address must be used to access the resource, the response status code can be set to 301. The browser caches this redirect by default and goes straight to the redirected address when the original address is visited again.
- 302 Found. Temporary redirect. If the original address is only temporarily out of use, a 302 status code can be returned. For example, while the site is under maintenance, an explanation page in the current domain can inform visitors.
- 304 Not Modified. Negotiated cache. The first time a browser requests a resource, the server returns the Last-Modified and ETag fields to the browser in the response header.
  - Last-Modified. The last modified time of the resource. When the browser requests the resource a second time, it carries this value in the If-Modified-Since request header. The server compares the If-Modified-Since value from the request with the current Last-Modified value of the resource. If If-Modified-Since is earlier than the server’s Last-Modified, the resource has been updated: the server returns the new resource together with the updated Last-Modified, and the response status code is 200. Otherwise it returns a 304 status code, telling the browser to use the cached resource directly.
  - ETag. An identifier of the resource’s content: a hash value generated from the file content. When the browser requests the resource a second time, it carries this value in the If-None-Match request header. The server compares the If-None-Match value from the request with the current ETag of the resource. If the two values are not equal, the server returns the new resource together with the updated ETag, and the response status code is 200. Otherwise it returns a 304 status code, telling the browser to use the cached resource directly.
  When ETag and Last-Modified coexist, the server judges by ETag first and then by Last-Modified to decide which status code to return.
- 400 Bad Request. The request parameters are incorrect.
- 401 Unauthorized. Identity authentication is required.
- 403 Forbidden. The server refuses access.
- 404 Not Found. The server did not find the corresponding resource.
- 500 Internal Server Error. The server failed while handling the request.
- 503 Service Unavailable. The server is busy and temporarily unable to handle the request.
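The negotiated cache behind the 304 status code can be sketched from the server's side. This is an illustrative model, not a real server: respond is a hypothetical function, the ETag is assumed to be a truncated SHA-256 of the body (servers are free to use any scheme), and when If-None-Match is present it decides alone, which is how the HTTP specification orders the two validators.

```python
import hashlib
from email.utils import parsedate_to_datetime

def make_etag(body: bytes) -> str:
    """Assumed scheme: a quoted, truncated SHA-256 of the file content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers: dict, body: bytes, last_modified: str) -> tuple:
    """Return (status, response headers, response body) for a resource."""
    etag = make_etag(body)
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if "If-None-Match" in request_headers:
        # ETag comparison comes first and is the more precise check.
        unchanged = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        unchanged = parsedate_to_datetime(last_modified) <= since
    else:
        unchanged = False  # first request: nothing to compare against
    if unchanged:
        return 304, validators, b""
    return 200, validators, body
```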
While parsing the response headers, the network process also checks the Content-Type field, which tells the browser what type of data the server is returning in the response body; the browser uses its value to decide how to handle the body. If the value is application/octet-stream (a byte stream), the response is usually treated as a download; if the value is text/html, the browser prepares the render process.
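The Content-Type decision amounts to a small dispatch on the media type. The sketch below covers only the two values mentioned above; the function name and the "other" fallback are assumptions, since real browsers handle many more types.

```python
def choose_action(content_type: str) -> str:
    """Map a Content-Type header value to what the browser does next."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        return "render"
    if media_type == "application/octet-stream":
        return "download"
    return "other"  # assumption: types not covered in this article
```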
Preparing the render process
Normally, opening a new tab starts a new render process, with one exception: tabs that belong to the same site share one render process. Same-site means:
- The root domain name and the protocol are the same
- All subdomains and different port numbers under that root domain name are included

For example, the following addresses belong to the same site:
https://time.geekbang.org
https://www.geekbang.org
https://www.geekbang.org:8080
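A rough same-site check compares the protocol and the root domain name while ignoring subdomains and ports. This sketch naively takes the last two labels of the host name as the root domain, which is an assumption that fails for suffixes such as .com.cn; real browsers consult the Public Suffix List instead.

```python
from urllib.parse import urlsplit

def site_key(url: str) -> tuple:
    """Return (protocol, root domain), ignoring subdomain and port."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Naive assumption: the root domain is the last two labels.
    return parts.scheme, ".".join(labels[-2:])

def is_same_site(a: str, b: str) -> bool:
    return site_key(a) == site_key(b)

print(is_same_site("https://time.geekbang.org", "https://www.geekbang.org:8080"))
```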
Submission phase
When the rendering process is ready, it enters the document submission phase. The process is as follows:
- First, when the browser process receives the response header data from the network process, it sends a “submit document” message to the render process.
- When the render process receives the “submit document” message, it establishes a “pipeline” with the network process to transfer the data.
- When the document data has been transferred, the render process returns a “confirm submission” message to the browser process.
- When the browser process receives the “confirm submission” message, it updates the browser interface state, including the security state, the URL in the address bar and the forward/back history state, and updates the web page.
Build a DOM tree
Browsers can’t understand and use HTML directly, so HTML has to be transformed into a DOM tree, a structure that browsers can understand. You can type document in the browser console to view it. The specific conversion process is as follows:
- Conversion (bytes -> characters): the browser reads the raw bytes of the HTML from disk or the network and converts them to individual characters based on the specified encoding of the file (such as UTF-8).
- Tokenization (characters -> tokens): the browser converts the characters into the tokens defined by the W3C standard (for example <html>, <body>) and other strings inside angle brackets. Each token has a special meaning and a set of rules.
- Lexical analysis (tokens -> nodes): each emitted token is transformed into an “object” that defines its properties and rules.
- DOM build (nodes -> DOM): since HTML markup defines the relationships between different tags, the created objects are linked in a tree data structure that also captures the parent-child relationships defined in the original markup: the html object is the parent of the body object, body is the parent of the paragraph object, and so on.
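The token-to-tree steps can be imitated with Python's built-in html.parser, which does the tokenization and calls back for each start tag, end tag and text run. This is a simplified sketch, not the browser's algorithm: the Node and TreeBuilder classes are hypothetical, and the builder assumes every start tag has a matching end tag (void elements such as a bare br are not handled).

```python
from html.parser import HTMLParser

class Node:
    """One object in the tree; creating it links it to its parent."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # A start tag token opens a new node under the current one.
        self.current = Node(tag, self.current)

    def handle_endtag(self, tag):
        # An end tag token closes the current node.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        if data.strip():
            Node("#text: " + data.strip(), self.current)

def build_dom(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```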
While an HTML file is being parsed, the network process may need to download script files and style files, which raises the question of what blocks DOM parsing and rendering. For details, see the original article on how CSS and JS block DOM parsing and rendering; its conclusions are:
- CSS does not block DOM parsing, but it does block DOM rendering.
- JS blocks DOM parsing, but browsers will “peek” at the DOM and download relevant resources in advance.
-
When the browser encounters a
Style calculation
Convert CSS to a structure that the browser can understand: the browser can’t understand CSS text directly, so when the rendering engine receives CSS text, it converts it into styleSheets. You can view them by typing document.styleSheets in the browser console. The specific conversion process is as follows:
- Convert attribute values in the style sheet to normalize them: for example, a hexadecimal color written in code is converted to RGB format. Li Bing’s course also says that em units are converted to px and that bold is converted to 700; however, when I opened Chrome’s developer tools I did not see these conversions, which differs a little from the teacher’s description.
- Calculate the style of the nodes in the DOM tree: This is done mainly through inheritance rules and cascading rules. After the calculation is completed, output the style of each DOM node and save it to the structure of ComputedStyle.
  - Inheritance rule: if a child node does not set properties such as font-size, color or font-family, it inherits the style of its parent node; if the parent node does not set them either, the UserAgent style is used by default. Note: only inheritable properties can be inherited.
  - Cascading rule: an algorithm that defines how to combine property values from multiple sources.
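The inheritance rule can be expressed as a small function that fills in each property from the node itself, then from the parent, then from a default. This is a sketch under assumptions: the default values below are illustrative stand-ins for the UserAgent style, and the cascading rule is left out.

```python
# Properties that pass from parent to child when the child sets nothing.
INHERITED = {"font-size", "color", "font-family"}

# Illustrative stand-ins for the UserAgent defaults (assumed values).
USER_AGENT_DEFAULTS = {
    "font-size": "16px",
    "color": "black",
    "font-family": "serif",
    "display": "inline",
}

def computed_style(own: dict, parent_computed: dict = None) -> dict:
    """Resolve every known property for one node."""
    style = {}
    for prop, default in USER_AGENT_DEFAULTS.items():
        if prop in own:
            style[prop] = own[prop]               # set on the node itself
        elif prop in INHERITED and parent_computed is not None:
            style[prop] = parent_computed[prop]   # inherited from the parent
        else:
            style[prop] = default                 # fall back to the default
    return style
```

Because display is not in the inherited set, a child of a block-level parent still falls back to the default rather than copying the parent's value.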
The layout phase
Calculates the geometric positions of visible elements in the DOM tree. The specific process is as follows:
- Create a layout tree: traverse the DOM tree and generate a layout tree that contains only the visible nodes. Invisible nodes are ignored and do not appear in the layout tree. They include: 1. nodes that produce no rendered output (script/link/meta/head); 2. nodes hidden by CSS (display: none).
- Layout calculation: Calculates the geometric positions of the nodes in the layout tree and saves the calculated information in the layout tree.
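Creating the layout tree is essentially a filtered copy of the DOM tree. In this sketch the DOM is modelled as nested dictionaries with tag, style and children keys, which is an assumed representation chosen for brevity; the geometry calculation itself is not shown.

```python
# Tags that never produce rendered output.
NON_RENDERED_TAGS = {"script", "link", "meta", "head"}

def build_layout_tree(node: dict):
    """Copy node and its visible descendants; return None if it is invisible."""
    if node["tag"] in NON_RENDERED_TAGS:
        return None
    if node.get("style", {}).get("display") == "none":
        return None  # hidden by CSS, together with everything inside it
    children = [build_layout_tree(child) for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        "children": [child for child in children if child is not None],
    }
```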
Layering
Because pages contain many complex effects, such as 3D transforms, page scrolling, or z-axis ordering with z-index, the rendering engine generates dedicated layers for specific nodes and builds a corresponding layer tree (LayerTree) to make these effects easier to implement. In general, not every node in the layout tree has its own layer; if a node has no layer of its own, it belongs to the layer of its parent node. These layers are then composited into the final page. The conditions under which the rendering engine creates a separate layer for a node include:
- Elements with stacking context properties are promoted to a separate layer. The following is a schematic diagram of stacking context properties:
- Areas that need to be clipped are also created as layers. When an element has a fixed width and height and its content overflows, the excess is clipped, and the rendering engine creates a separate layer for the clipped content.
Layer drawing
After building the layer tree, the rendering engine will draw each layer in the layer tree. The rendering engine breaks down the drawing of a layer into a number of small drawing instructions, and then organizes these instructions into a list of things to be drawn.
Rasterization (raster)
When the draw list is ready, the main thread submits it to the compositor thread, which does the actual drawing. First, what is a viewport? The viewport is the visible area of the page on the screen. Because pages can be complex, a layer may be very long, while the user sees only a small part of it through the viewport. Drawing the whole layer at once would be very expensive, so the compositor thread divides the layer into tiles, usually 256×256 or 512×512 in size.
The compositor thread generates bitmaps for the tiles near the viewport first; the actual bitmap generation is performed by rasterization, which is the conversion of a tile into a bitmap. Rasterization is usually accelerated by the GPU. Generating bitmaps on the GPU is called fast rasterization, and the generated bitmaps are stored in GPU memory.
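Dividing a layer into tiles and ordering them by distance from the viewport can be sketched as follows. This is a simplification under assumptions: only vertical scrolling is considered, every tile has the full tile size even at the layer's edge, and the function names are hypothetical.

```python
def tiles_for_layer(layer_width: int, layer_height: int, tile_size: int = 256) -> list:
    """Return (x, y, width, height) for every tile covering the layer."""
    return [(x, y, tile_size, tile_size)
            for y in range(0, layer_height, tile_size)
            for x in range(0, layer_width, tile_size)]

def prioritize(tiles: list, viewport_top: int, viewport_height: int) -> list:
    """Order tiles so that those in or nearest the viewport come first."""
    viewport_bottom = viewport_top + viewport_height

    def distance(tile):
        _, y, _, height = tile
        if y + height <= viewport_top:    # tile lies entirely above
            return viewport_top - (y + height)
        if y >= viewport_bottom:          # tile lies entirely below
            return y - viewport_bottom
        return 0                          # tile overlaps the viewport

    return sorted(tiles, key=distance)
```

With a 512×1024 layer and a viewport covering y = 600 to 800, the tiles starting at y = 512 and y = 768 are rasterized before the ones further up the page.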
Composition and display
After all the tiles have been rasterized, the compositor thread generates a DrawQuad command to draw the tiles and submits it to the browser process. A component of the browser process called viz receives the DrawQuad command and draws the page content into memory according to it. Finally, the contents of memory are displayed on the screen.
References
- How Browsers Work and Practice
- Practice this once and for all to understand the browser caching mechanism