What happens from URL input to page rendering:
The DNS query
A TCP connection
The HTTP message
Browser rendering
Disconnect
The DNS query
When we type something into the browser's address bar, the browser first analyzes the input. If it is a search keyword, the browser passes it to the default search engine. If it is a domain name, the browser uses DNS to look up the corresponding IP address.
DNS query process: the browser first searches its own cache; if an entry exists, domain name resolution ends there. If no entry is found in the cache, the operating system's hosts file is read to see whether it contains a mapping. (PS: when a website responds very slowly, adding a mapping to the hosts file is often a useful trick!) If the local hosts file has no mapping either, the query goes to the local DNS server. If that server already has the answer, it completes the resolution. Otherwise it resolves the name recursively on our behalf, starting with a request to a root server.
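The lookup order above can be sketched as a chain of fallbacks. This is only an illustration: the function name `resolve`, the dictionaries standing in for each cache, and the `.test` domains and addresses are all assumptions made for the example, not a real resolver.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve(
    domain: str,
    browser_cache: Dict[str, str],
    hosts_file: Dict[str, str],
    local_dns_cache: Dict[str, str],
    recursive_lookup: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (ip, source), trying each step in the order described above."""
    # 1. Browser cache: resolution stops here on a hit.
    if domain in browser_cache:
        return browser_cache[domain], "browser cache"
    # 2. Operating system hosts file.
    if domain in hosts_file:
        return hosts_file[domain], "hosts file"
    # 3. Answer already known to the local DNS server.
    if domain in local_dns_cache:
        return local_dns_cache[domain], "local DNS server"
    # 4. Fall back to the query that starts at the root servers.
    ip = recursive_lookup(domain)
    if ip is not None:
        # Remember the answer so the next lookup stops at step 1.
        browser_cache[domain] = ip
    return ip, "root query"


# Hypothetical data; the addresses come from the documentation-only range.
browser_cache: Dict[str, str] = {}
hosts_file = {"dev.example.test": "192.0.2.10"}
local_dns_cache = {"cached.example.test": "192.0.2.20"}
upstream = {"www.example.test": "192.0.2.30"}

print(resolve("dev.example.test", browser_cache, hosts_file, local_dns_cache, upstream.get))
print(resolve("www.example.test", browser_cache, hosts_file, local_dns_cache, upstream.get))
print(resolve("www.example.test", browser_cache, hosts_file, local_dns_cache, upstream.get))
```

The second and third calls ask for the same name: the first of them falls through to the root query, and the repeat is answered from the browser cache.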
A TCP connection
Once the browser has the IP address, it sends a TCP connection request to that address.
- The whole life of a TCP connection can be summed up as a three-way handshake to open it and a four-way wave to close it
- TCP three-way handshake:
- The client sends a packet with SYN=1, Seq=X to the server port (the first handshake, initiated by the browser, tells the server that it wants to open a connection)
- The server sends back a response with SYN=1, ACK=X+1, Seq=Y as confirmation (the second handshake, initiated by the server, tells the browser that the request arrived and that the server is ready to accept data)
- The client sends back a packet with ACK=Y+1, Seq=X+1, which means "handshake over" (the third handshake, sent by the browser, tells the server that its reply arrived and that data is about to follow)
This raises a question: why does TCP need three handshakes, and why are two not enough?
- To transmit data reliably, both parties of a TCP connection must maintain a sequence number that identifies which packets have been received. The three-way handshake is the minimum exchange in which each party tells the other its initial sequence number and confirms that the other party has received it
- With only two handshakes, at most the initial sequence number of the connection initiator can be confirmed; the sequence number chosen by the other party is never acknowledged
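To make the argument concrete, here is a small simulation of the sequence and acknowledgment numbers exchanged during the handshake. It is a sketch under assumptions: the dictionary layout, the function names and the starting numbers 100 and 300 are invented for illustration, and no real packets are sent.

```python
from typing import Dict, List, Set

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> List[Dict[str, object]]:
    """Return the three segments of the handshake in the order they are sent."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn, "ack": None}
    # 2. Server -> client: SYN+ACK confirms client_isn and announces server_isn.
    syn_ack = {
        "from": "server",
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,
    }
    # 3. Client -> server: ACK confirms server_isn.
    ack = {
        "from": "client",
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,
    }
    return [syn, syn_ack, ack]


def confirmed_sides(segments: List[Dict[str, object]]) -> Set[str]:
    """Sides whose initial sequence number has been acknowledged by the peer."""
    confirmed = set()
    for segment in segments:
        if "ACK" in segment["flags"]:
            # An ACK sent by one side confirms the other side's number.
            confirmed.add("client" if segment["from"] == "server" else "server")
    return confirmed


segments = three_way_handshake(client_isn=100, server_isn=300)
print(sorted(confirmed_sides(segments[:2])))  # after two handshakes
print(sorted(confirmed_sides(segments)))      # after three handshakes
```

After the first two segments only the client's number has been confirmed; the server's number is confirmed only once the third segment arrives.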
The HTTP message
After the TCP three-way handshake is complete, HTTP messages are sent and received. There are two kinds of message: request messages and response messages
- The request message
The request message is divided into four parts: request line, request headers, blank line and request body.
- The response message
The response message has the same four-part layout as the request message, except that it starts with a status line: status line, response headers, blank line and response body.
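The four-part layout of an HTTP/1.1 message (a start line, headers, a blank line, a body) can be sketched as follows. The helper names `build_request` and `parse_message` and the sample values are assumptions for illustration; a real client would use an HTTP library instead of hand-assembling text.

```python
from typing import Dict, Tuple


def build_request(method: str, path: str, headers: Dict[str, str], body: str = "") -> str:
    """Assemble request line, headers, blank line and body, separated by CRLF."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The empty string between headers and body produces the blank line.
    return "\r\n".join([request_line, *header_lines, "", body])


def parse_message(raw: str) -> Tuple[str, Dict[str, str], str]:
    """Split a request or response into (start line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


raw_request = build_request(
    "POST",
    "/login",
    {"Host": "example.test", "Content-Type": "application/x-www-form-urlencoded"},
    "user=alice",
)
print(parse_message(raw_request))
print(parse_message("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"))
```

The same parser handles both messages because only the first line differs: a request line in one case, a status line in the other.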
What can we do about this phase?
- CDN
What is a CDN?
A CDN (content delivery network) is a group of cache servers distributed across different locations, so that a user's request can be served by a node close to them instead of by the origin server.
How does a CDN work?
When the browser requests data from a CDN node, the node checks whether its cached copy has expired. If it has not, the node returns the cached data directly to the client. Otherwise, the node issues a back-to-origin request to the server, pulls the latest data, updates its local cache, and returns the latest data to the client. CDN service providers generally let you set the cache time along several dimensions, such as file suffix and directory, which gives users finer-grained cache management.
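The expiry check and back-to-origin request can be sketched as a small cache class. This is a simplified model, not how any particular CDN is implemented: `CdnNode`, the single fixed `ttl_seconds` and the injected `clock` are assumptions for illustration (the clock is injectable only so the example runs without waiting).

```python
import time
from typing import Callable, Dict, List, Tuple


class CdnNode:
    """Edge node that serves cached data until it expires."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[bytes, float]] = {}  # path -> (data, expires_at)

    def get(self, path: str) -> Tuple[bytes, str]:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and now < entry[1]:
            # Cached copy has not expired: answer without contacting the origin.
            return entry[0], "HIT"
        # Missing or expired: back-to-origin request, then refresh the cache.
        data = self._fetch(path)
        self._cache[path] = (data, now + self._ttl)
        return data, "MISS"


origin_calls: List[str] = []


def origin(path: str) -> bytes:
    """Hypothetical origin server that records every request it receives."""
    origin_calls.append(path)
    return f"content of {path}".encode()


fake_now = [0.0]
node = CdnNode(origin, ttl_seconds=60, clock=lambda: fake_now[0])
print(node.get("/app.js")[1])  # MISS: nothing cached yet
fake_now[0] = 30.0
print(node.get("/app.js")[1])  # HIT: still within 60 seconds
fake_now[0] = 61.0
print(node.get("/app.js")[1])  # MISS: expired, fetched from the origin again
```

A per-suffix or per-directory cache time, as mentioned above, would replace the single `ttl_seconds` with a lookup keyed on the path.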
- Browser cache
When the browser requests a resource from the server, it first checks whether the strong cache is hit, and then whether the negotiation cache is hit.
Strong cache:
When loading a resource, the browser uses the header information stored with the locally cached copy to decide whether the strong cache is hit. If it is, the browser uses the cached resource directly and does not send a request to the server. The header information here refers to Expires and Cache-Control.
- Expires.
This field comes from the HTTP/1.0 specification. Its value is an absolute time written as a GMT-format string, such as Expires: Mon, 18 Oct 2066 23:59:59 GMT. This time is the expiration time of the resource; requests made before it hit the cache. One obvious disadvantage is that, because the expiration time is absolute, the cache can behave incorrectly when the server and client clocks diverge significantly.
- Cache-Control.
For example, Cache-Control: max-age=3600 indicates that the resource is valid for 3600 seconds. Besides max-age, Cache-Control has several other common directives:
- no-cache: does not use the strong cache; the browser goes through the negotiation cache and asks the server to confirm whether the cached copy may be used.
- no-store: disables caching entirely; the data is requested again every time.
- public: can be cached by everyone, including end users and intermediate proxy servers such as a CDN.
- private: can be cached only by the end user's browser, not by intermediate cache servers such as a CDN.
- Cache-Control and Expires can both be enabled in the server configuration; when both are present, Cache-Control takes priority.
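The strong cache decision can be sketched as a function over the stored response headers. This is a simplified illustration rather than a browser's actual algorithm: the names `parse_cache_control` and `strong_cache_hit` are made up, header names are matched case-sensitively, and only max-age, no-cache, no-store and Expires are considered.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict


def parse_cache_control(value: str) -> Dict[str, str]:
    """Turn 'public, max-age=3600' into {'public': '', 'max-age': '3600'}."""
    directives = {}
    for part in value.split(","):
        name, _, argument = part.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"')
    return directives


def strong_cache_hit(headers: Dict[str, str], stored_at: datetime, now: datetime) -> bool:
    """Decide whether the cached copy can be used without asking the server."""
    directives = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in directives or "no-cache" in directives:
        return False
    if "max-age" in directives:
        # Cache-Control takes priority, so Expires is not consulted here.
        return now - stored_at < timedelta(seconds=int(directives["max-age"]))
    if "Expires" in headers:
        # Absolute time: compared against the client's own clock.
        return now < parsedate_to_datetime(headers["Expires"])
    return False


stored_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = stored_at + timedelta(minutes=30)
print(strong_cache_hit({"Cache-Control": "max-age=3600"}, stored_at, later))
print(strong_cache_hit({"Cache-Control": "no-cache"}, stored_at, later))
print(strong_cache_hit({"Expires": "Mon, 18 Oct 2066 23:59:59 GMT"}, stored_at, later))
```

Because the Expires branch compares an absolute time with the client's clock, a skewed clock changes the result, which is the disadvantage described above; the max-age branch only measures elapsed time.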
Negotiation cache:
When the strong cache is not hit, the browser sends a request to the server, and the server decides whether the cache is hit based on certain fields in the request header. If it is hit, the server returns 304, telling the browser that the resource has not changed and that the local cached copy can be used. The header information here refers to Last-Modified/If-Modified-Since and ETag/If-None-Match.
- Last-Modified/If-Modified-Since:
When the browser requests a resource for the first time, the server adds Last-Modified to the response header. Last-Modified is the time the resource was last modified. When the browser requests the resource again, the request header contains If-Modified-Since, whose value is the Last-Modified value returned earlier and stored with the cached copy. After receiving If-Modified-Since, the server compares it with the resource's last modification time to decide whether the cache is hit. If it is, the server returns 304 without the resource content and without Last-Modified.
Disadvantage: Last-Modified is only accurate to the second, so if a resource changes again within a very short period, Last-Modified does not change and the stale copy still counts as a hit.
- ETag/If-None-Match:
The server returns an ETag, an identifier for the current version of the resource. On the next request the browser sends it back in If-None-Match, and the server compares it with the resource's current ETag. Last-Modified and ETag can be used together. The server verifies the ETag first. If the ETag is consistent, the server goes on to compare Last-Modified, and finally decides whether to return 304.
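The server side of the negotiation cache can be sketched as follows. This is an illustration under assumptions: `conditional_get` and its parameters are invented for the example, ETags are compared as plain strings, and the order of checks follows the description above. Real servers may differ; the HTTP specification, for instance, tells servers to ignore If-Modified-Since whenever If-None-Match is present.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Tuple


def conditional_get(
    request_headers: Dict[str, str],
    etag: str,
    last_modified: str,
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for a possibly conditional request."""
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    def unmodified_since(http_date: str) -> bool:
        # Hit when the resource has not changed after the date the browser holds.
        return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(http_date)

    if if_none_match is not None:
        # ETag is verified first ...
        hit = if_none_match == etag
        # ... and, if it matches, Last-Modified is compared as well.
        if hit and if_modified_since is not None:
            hit = unmodified_since(if_modified_since)
    elif if_modified_since is not None:
        hit = unmodified_since(if_modified_since)
    else:
        hit = False  # first visit: nothing to validate against

    if hit:
        # 304 carries no resource content; the browser reuses its cached copy.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Last-Modified": last_modified}, body


modified = "Mon, 01 Jan 2024 00:00:00 GMT"
print(conditional_get({}, '"v1"', modified, b"hello")[0])
print(conditional_get({"If-None-Match": '"v1"'}, '"v1"', modified, b"hello")[0])
print(conditional_get({"If-None-Match": '"v0"'}, '"v1"', modified, b"hello")[0])
```

The three calls model a first visit, a revalidation of an unchanged resource, and a revalidation after the resource changed.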
Browser rendering
The browser takes the HTML in the response body and parses it together with the files it references: the HTML is converted into the corresponding DOM tree, the CSS is parsed into a CSS rule tree, and the DOM tree and CSS rule tree are then combined to build the render tree. The browser then performs layout, using the information about the render objects in the tree to calculate the position and size of each one. Finally, it walks the render tree, calls each render object's "paint" method, and draws the result on the page.
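The three stages (build the render tree, lay it out, paint it) can be pictured with a toy model. Everything here is a strong simplification invented for illustration: the class and function names are hypothetical, CSS rules are matched by tag name only, the only style used is a `height` in pixels, and elements are simply stacked vertically. Real browser engines are far more involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomNode:
    tag: str
    children: List["DomNode"] = field(default_factory=list)


@dataclass
class RenderObject:
    tag: str
    style: Dict[str, str]
    y: int = 0
    height: int = 0
    children: List["RenderObject"] = field(default_factory=list)


def build_render_tree(node: DomNode, rules: Dict[str, Dict[str, str]]) -> RenderObject:
    """Combine the DOM tree with the CSS rules into a render tree."""
    render_object = RenderObject(node.tag, dict(rules.get(node.tag, {})))
    render_object.children = [build_render_tree(child, rules) for child in node.children]
    return render_object


def layout(obj: RenderObject, y: int = 0) -> int:
    """Compute position and size of every render object; return the height used."""
    obj.y = y
    cursor = y + int(obj.style.get("height", "0"))  # the element's own height
    for child in obj.children:
        cursor += layout(child, cursor)  # children are stacked vertically
    obj.height = cursor - y
    return obj.height


def paint(obj: RenderObject, output: Optional[List[str]] = None) -> List[str]:
    """Walk the render tree and 'draw' each object by recording a line of text."""
    output = [] if output is None else output
    output.append(f"{obj.tag} at y={obj.y} height={obj.height}")
    for child in obj.children:
        paint(child, output)
    return output


dom = DomNode("body", [DomNode("h1"), DomNode("p")])
css_rules = {"h1": {"height": "40"}, "p": {"height": "20"}}
render_tree = build_render_tree(dom, css_rules)
layout(render_tree)
print("\n".join(paint(render_tree)))
```

Changing the `height` rule for `h1` would force `layout` to run again and shift `p` down, while a style that does not affect geometry would only require `paint`.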
What can we do about this phase?
- Reflow (backflow)
When reflow occurs: the browser calculates the position and size of the elements to be rendered on the page before the view is drawn.
- Repaint (redraw)
A repaint does not necessarily trigger a reflow, but a reflow always triggers a repaint.
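The relationship between reflow (backflow) and repaint (redraw) can be sketched as a lookup on the kind of style change. The two property sets are small, deliberately incomplete examples chosen for illustration, and `work_triggered` is a hypothetical helper, not a browser API.

```python
from typing import List

# Changes to geometry force positions and sizes to be recalculated.
GEOMETRY_PROPERTIES = {"width", "height", "margin", "padding", "font-size", "display"}
# Changes to appearance alone leave the layout untouched.
APPEARANCE_PROPERTIES = {"color", "background-color", "visibility"}


def work_triggered(changed_property: str) -> List[str]:
    """Return the rendering steps a change to this CSS property causes."""
    if changed_property in GEOMETRY_PROPERTIES:
        # A reflow is always followed by a repaint.
        return ["reflow", "repaint"]
    if changed_property in APPEARANCE_PROPERTIES:
        # A repaint on its own does not require a reflow.
        return ["repaint"]
    raise ValueError(f"property not covered by this sketch: {changed_property}")


print(work_triggered("width"))
print(work_triggered("color"))
```

This is why batching geometry changes, or preferring appearance-only changes where possible, reduces the work done in this phase.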
References
- Why rustling cold, "TCP three-way handshake rather than two shake hands": https://blog.csdn.net/lengxiao1993/article/details/82771768
- null tsai, "Practice this time, thoroughly understand the browser cache mechanism": https://segmentfault.com/a/1190000017962411