Preface
Once you know the browser’s navigation flow, you can answer the classic question of what happens between entering a URL, pressing Enter, and the page being displayed. This article covers the first half of that question: the process from the moment the user submits the URL to the moment the page begins to be parsed, that is, the network request part.
As we know, Chrome uses a multi-process architecture. Navigation involves the browser process, the network process, and the renderer process. Let’s briefly review the main responsibilities of these three processes:
- Browser process: mainly responsible for page display, user interaction, child process management, file storage, and similar functions.
- Network process: mainly provides network resource requests for the browser process and the renderer process.
- Renderer process: its core task is to turn HTML, CSS, and JavaScript into web pages that users can interact with.
Let’s take a look at how these processes collaborate to fulfill a network request.
User input
The process starts with the user entering a query into the address bar:
- The user enters text in the address bar. The browser process determines whether it is a request URL or search content.
  - If it is search content, the browser uses the default search engine to synthesize a URL that contains the search keywords.
  - If it is a request URL, the browser checks whether it is complete and, if it is not, completes it according to the rules (for example, by adding the protocol).
- The user presses Enter, and the browser enters the loading state.
  - Before the process continues, the current page gets a chance to run its beforeunload event handler, which can clean up data, interrupt the process, or cancel the navigation.
  - If there is no beforeunload handler, or the user agrees to continue, the flow continues and the browser enters the loading state.
  - The page content is still the same as before, however, and is not replaced until the document submission stage.
For example, suppose the page currently shows https://www.npmjs.com/package/egg. I type https://www.npmjs.com/package/koa into the address bar and press Enter. The browser enters the loading state (the icon in the upper left corner spins, and the status text in the lower left corner says it is waiting for a response), but the page still shows egg.
- Next comes the network resource request process: the browser process sends the URL request to the network process through inter-process communication (IPC).
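The address bar decision described above can be sketched roughly as follows. This is only an illustration: the function name resolveAddressBarInput, the regular expressions, and the search URL template are assumptions made for the example, and Chrome’s real logic is considerably more involved.

```typescript
// Assumed default search engine template (hypothetical, for illustration only).
const SEARCH_TEMPLATE = "https://www.google.com/search?q=";

// Returns the normalised URL, or null if the candidate cannot be parsed.
function tryParseUrl(candidate: string): string | null {
  try {
    return new URL(candidate).href;
  } catch {
    return null;
  }
}

// Turns raw address bar text into the URL that will actually be requested.
function resolveAddressBarInput(input: string): string {
  const text = input.trim();
  const toSearchUrl = () => SEARCH_TEMPLATE + encodeURIComponent(text);

  // Case 1: already a complete URL with a scheme, so use it as-is.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return tryParseUrl(text) ?? toSearchUrl();
  }

  // Case 2: looks like a host name (no whitespace, contains a dot),
  // so complete it by adding a scheme.
  if (/^[^\s/]+\.[^\s/]+(\/\S*)?$/.test(text)) {
    return tryParseUrl("https://" + text) ?? toSearchUrl();
  }

  // Case 3: anything else is treated as search content.
  return toSearchUrl();
}

console.log(resolveAddressBarInput("juejin.cn"));
console.log(resolveAddressBarInput("how browsers work"));
```

The order of the checks matters in this sketch: text is only treated as search content after both URL interpretations have been ruled out.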
Network Resource Request
Find the cache
The browser process has handed the URL request over to the network process, so what does the request flow look like from here?
The network process does not immediately send a request over the network, because the fastest request is one that never has to be made.
- Following the browser’s cache policy, the network process checks whether the resource is allowed to be cached (Cache-Control) and whether a copy is cached locally (the cache locations are searched in sequence).
  - If a usable cached copy exists, the network process returns it directly to the browser process and the network request ends.
  - If the resource is not found or needs to be validated, the actual network request process begins.
Caching is a big topic that will be covered in more detail in the performance tuning section.
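As a small preview, the cache check above might look something like the sketch below. It only looks at three Cache-Control directives (no-store, no-cache, max-age); the names CachedEntry, parseCacheControl, and checkCache are hypothetical, and a real browser cache considers many more headers and rules.

```typescript
type CacheDecision = "use-cache" | "revalidate" | "fetch";

interface CachedEntry {
  storedAtMs: number;   // time the response was stored, in milliseconds
  cacheControl: string; // value of the stored response's Cache-Control header
}

// Parses e.g. "no-cache, max-age=60" into a map of directive name -> value.
function parseCacheControl(header: string): Map<string, string> {
  const directives = new Map<string, string>();
  for (const part of header.split(",")) {
    const [name, value = ""] = part.trim().split("=");
    if (name) directives.set(name.toLowerCase(), value);
  }
  return directives;
}

// Decides whether a request can be answered locally.
function checkCache(entry: CachedEntry | undefined, nowMs: number): CacheDecision {
  if (!entry) return "fetch"; // nothing cached locally

  const directives = parseCacheControl(entry.cacheControl);
  if (directives.has("no-store")) return "fetch";      // caching not allowed
  if (directives.has("no-cache")) return "revalidate"; // must ask the server first

  // A missing or malformed max-age is treated as "not fresh" in this sketch.
  const maxAgeSec = Number(directives.get("max-age") ?? "0");
  const ageSec = (nowMs - entry.storedAtMs) / 1000;
  return ageSec < maxAgeSec ? "use-cache" : "revalidate";
}

const entry: CachedEntry = { storedAtMs: 0, cacheControl: "max-age=60" };
console.log(checkCache(entry, 30_000)); // entry is 30 seconds old
console.log(checkCache(entry, 61_000)); // entry is 61 seconds old
```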
DNS
The first step in a network request is to get the IP address of the requested server, because a network request boils down to one computer requesting resources from another, and the connection between them is established through the IP address.
What users usually see is a domain name, which is a string of labels separated by dots. For example, Juejin is juejin.cn, Taobao is www.taobao.com, and Baidu is www.baidu.com. The domain name is extremely important because it is the entrance to a website. You could say that behind every good domain name there is a good dream. I heard that Facebook originally used the domain name thefacebook.com and later raised money to buy the facebook.com domain name for $200,000. If you could go back in time, you could become a billionaire by registering a bunch of domain names.
To get back to the point: a domain name can correspond to multiple IP addresses. It can be the IP address of the origin server, an IP address handed out by a CDN, or a temporary IP address used while migrating servers. The domain name system lets us do a lot of interesting things.
Domain names are mapped to the correct IP address by their own resolution system, the DNS. If you want to know more about it, see the DNS article I wrote earlier.
So let’s continue the network request process:
- If an IP address was entered, the flow goes straight to the TCP connection.
- If a domain name was entered, DNS resolution is performed to obtain the server’s IP address, and then the TCP connection is established.
- DNS results are cached too. The lookup order is: browser cache -> operating system cache -> hosts file -> local (non-authoritative) DNS server -> root name server -> top-level domain name server -> authoritative name server.
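The layered lookup above can be modelled as a chain in which the first layer that knows the answer wins. The sketch below uses in-memory maps in place of real caches and name servers; the names Layer and resolveHost, the example host names, and the IP addresses (taken from the documentation ranges) are all made up for illustration.

```typescript
type Lookup = (host: string) => string | undefined;

interface Layer {
  name: string;   // e.g. "browser cache", "hosts file"
  lookup: Lookup; // returns an IP address, or undefined on a miss
}

// Walks the layers in order and returns the first hit.
function resolveHost(
  host: string,
  layers: Layer[],
): { ip: string; source: string } | undefined {
  // An IPv4 literal needs no resolution at all.
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) {
    return { ip: host, source: "literal" };
  }
  for (const layer of layers) {
    const ip = layer.lookup(host);
    if (ip !== undefined) return { ip, source: layer.name };
  }
  return undefined; // no layer could resolve the name
}

// Hypothetical data standing in for the real caches and servers.
const browserCache = new Map([["a.example", "192.0.2.1"]]);
const hostsFile = new Map([["b.example", "192.0.2.2"]]);
const authoritative = new Map([["c.example", "192.0.2.3"]]);

const layers: Layer[] = [
  { name: "browser cache", lookup: (h) => browserCache.get(h) },
  { name: "hosts file", lookup: (h) => hostsFile.get(h) },
  { name: "authoritative name server", lookup: (h) => authoritative.get(h) },
];

console.log(resolveHost("b.example", layers));
```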
TCP connection
After obtaining the IP address, the network process enters the TCP connection phase.
- The request enters the TCP queue. Under HTTP/1.1, Chrome keeps at most six TCP connections per domain name. If that limit is exceeded, the request has to wait in the queue.
- The browser establishes a TCP connection with the server through a three-way handshake:
  - The client sends a connection request segment (SYN) to the server.
  - The server receives the connection request segment and, if it agrees to connect, sends an acknowledgement (SYN-ACK).
  - After receiving that acknowledgement, the client sends its own acknowledgement (ACK) to the server.
- If the protocol is HTTPS, a TLS connection must also be established on top of the TCP connection (to exchange the symmetric key securely).
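The queueing rule above (at most six connections per domain name, further requests wait) can be illustrated with a small slot counter. This is a minimal sketch under that assumption: HostConnectionQueue is a hypothetical class, and no real sockets are opened.

```typescript
const MAX_CONNECTIONS_PER_HOST = 6; // the limit mentioned in the text

class HostConnectionQueue {
  private active = 0;
  private waiting: Array<() => void> = [];

  // Resolves as soon as a connection slot is available for the caller.
  acquire(): Promise<void> {
    if (this.active < MAX_CONNECTIONS_PER_HOST) {
      this.active++;
      return Promise.resolve();
    }
    // All slots are busy, so the request waits in the queue.
    return new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  // Frees a slot. If a request is waiting, the slot is handed to it directly.
  release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // active count is unchanged: one finishes, one starts
    } else if (this.active > 0) {
      this.active--;
    }
  }

  get activeCount(): number {
    return this.active;
  }

  get queuedCount(): number {
    return this.waiting.length;
  }
}

const queue = new HostConnectionQueue();
for (let i = 0; i < 8; i++) void queue.acquire(); // eight requests to one host
console.log(queue.activeCount, queue.queuedCount);
```

In this sketch a released slot goes straight to the oldest waiting request, so requests are served in the order they arrived.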
Making an HTTP request
After the connection is established, the network process formally issues the HTTP request.
- The browser constructs the request line, request headers, and so on, attaches the cookies associated with the domain name to the request headers, and then sends the request to the server.
- The request line briefly describes what the client wants to do with the server’s resource. It consists of three parts: the request method, the request target, and the version number.
- Header fields take a key-value form. The key and the value are separated by a colon (:), usually followed by a space, and each field ends with a CRLF line break.
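To make the message format concrete, here is a minimal sketch that assembles an HTTP/1.1 request as text. The function name buildHttpRequest and the cookie value are invented for the example; a real browser adds many more headers.

```typescript
const CRLF = "\r\n";

// Builds the text of an HTTP/1.1 request without a body.
function buildHttpRequest(
  method: string,
  url: string,
  headers: Record<string, string> = {},
): string {
  const { host, pathname, search } = new URL(url);

  // Request line: method, request target, version number.
  const requestLine = `${method} ${pathname}${search} HTTP/1.1`;

  // Header fields: "key: value", one per line.
  const allHeaders: Record<string, string> = { Host: host, ...headers };
  const headerLines = Object.entries(allHeaders).map(
    ([key, value]) => `${key}: ${value}`,
  );

  // Every line ends with CRLF, and an empty line closes the header section.
  return [requestLine, ...headerLines].join(CRLF) + CRLF + CRLF;
}

const raw = buildHttpRequest("GET", "https://www.npmjs.com/package/koa", {
  Cookie: "sid=abc", // hypothetical cookie for this domain
});
console.log(JSON.stringify(raw)); // JSON.stringify makes the CRLFs visible
```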
The server responds with data
- After the server receives the request, it generates response data based on it and returns that data (including the status line, response headers, and response body).
  - The status line describes the status of the server’s response and again consists of three parts: the version number, the status code, and the reason phrase.
- The server returns the response to the network process, which receives and parses it:
  - Parse the response headers:
    - If the status code is 301 or 302, the server is asking the browser to redirect to another URL. The network process reads the redirect address from the Location field of the response headers and initiates a new network request.
    - If the status code is 200, the browser can continue processing the request.
  - Determine the response data type. The Content-Type header field distinguishes the data types:
    - If the browser determines from the Content-Type value that the response is a download type, the request is handed to the browser’s download manager and the navigation flow for this URL ends here.
    - If it is an HTML type, the browser continues the navigation process.
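The decisions above can be sketched as a single function that looks at the status line and the headers. This is a simplified illustration: the names parseStatusLine and decideNextStep are hypothetical, only 301, 302, and 200 are handled, and treating application/octet-stream as the download type is an assumption made for the example.

```typescript
type NextStep =
  | { kind: "redirect"; location: string }
  | { kind: "download" }
  | { kind: "navigate" }
  | { kind: "other"; status: number };

// Splits a status line into version number, status code, and reason phrase.
function parseStatusLine(line: string): { version: string; status: number; reason: string } {
  const match = /^(HTTP\/\d(?:\.\d)?) (\d{3}) ?(.*)$/.exec(line);
  if (!match) throw new Error(`Malformed status line: ${line}`);
  return { version: match[1], status: Number(match[2]), reason: match[3] };
}

function decideNextStep(statusLine: string, headers: Record<string, string>): NextStep {
  const { status } = parseStatusLine(statusLine);

  // Header names are case-insensitive, so normalise them first.
  const lower = new Map<string, string>();
  for (const [key, value] of Object.entries(headers)) {
    lower.set(key.toLowerCase(), value);
  }

  // 301/302: read the Location header and start a new request.
  const location = lower.get("location");
  if ((status === 301 || status === 302) && location !== undefined) {
    return { kind: "redirect", location };
  }
  if (status !== 200) return { kind: "other", status };

  // 200: the Content-Type decides between download and navigation.
  const contentType = (lower.get("content-type") ?? "").split(";")[0].trim().toLowerCase();
  if (contentType === "text/html") return { kind: "navigate" };
  if (contentType === "application/octet-stream") return { kind: "download" };
  return { kind: "other", status }; // not covered by this sketch
}

console.log(decideNextStep("HTTP/1.1 302 Found", { Location: "https://example.com/new" }));
console.log(decideNextStep("HTTP/1.1 200 OK", { "Content-Type": "text/html; charset=utf-8" }));
```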
The network process now has the HTML data, and the navigation continues. Since Chrome renders pages in a renderer process, the next step is to prepare one.
Prepare the render process
- By default, Chrome assigns a renderer process to each page.
  - Typically, a newly opened page gets its own renderer process.
  - However, if page B is opened from page A, and A and B belong to the same site (the same protocol and root domain name), page B reuses page A’s renderer process. Otherwise, the browser process creates a new renderer process for B.
- Once the renderer process is ready, it cannot start parsing the document right away, because the document data is still in the network process and has not been submitted to the renderer process. The next step is therefore to submit the document.
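The same-site rule above (same protocol and same root domain name) can be approximated as follows. Note the loud assumption: naiveRootDomain simply takes the last two labels of the host name, which is wrong for suffixes such as co.uk. Real browsers consult the Public Suffix List instead, so treat this purely as a sketch.

```typescript
// Naive "root domain": the last two labels of the host name.
// Assumption for illustration only; it mishandles suffixes like "co.uk".
function naiveRootDomain(hostname: string): string {
  return hostname.split(".").slice(-2).join(".");
}

// True if both URLs share the protocol and the (naive) root domain.
function isSameSite(urlA: string, urlB: string): boolean {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return (
    a.protocol === b.protocol &&
    naiveRootDomain(a.hostname) === naiveRootDomain(b.hostname)
  );
}

// Page B opened from page A would reuse A's renderer process only if this is true.
console.log(isSameSite("https://www.npmjs.com/package/egg", "https://docs.npmjs.com/"));
console.log(isSameSite("https://www.taobao.com/", "https://www.baidu.com/"));
```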
Submit the document
Submitting a document means that the browser process hands the HTML data received by the network process over to the renderer process.
The specific process is as follows:
- First, when the browser process receives the response data from the network process, it sends a “submit document” message to the renderer process.
- After receiving the “submit document” message, the renderer process establishes a “pipeline” with the network process to transfer the data.
- After the document data transfer is complete, the renderer process returns a “confirm submission” message to the browser process.
- After receiving the “confirm submission” message, the browser process updates the browser interface state, including the security state, the URL in the address bar, and the forward/back history state, and updates the web page.
This also explains why, when you type an address into the browser’s address bar, the previous page does not disappear immediately, but takes a while to load before the page is updated.
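The ordering of these messages can be traced with a toy model. Everything here is hypothetical (the Message and BrowserUiState types, the commitDocument function, and the message texts); it only mirrors the sequence described above and is not how Chrome implements its IPC.

```typescript
interface Message {
  from: "browser" | "renderer";
  to: "browser" | "renderer" | "network";
  text: string;
}

interface BrowserUiState {
  addressBarUrl: string;
  history: string[];
}

// Replays the submit-document handshake and updates the UI state at the end.
function commitDocument(newUrl: string, ui: BrowserUiState): Message[] {
  const log: Message[] = [];

  // 1. The browser process tells the renderer process to take the document.
  log.push({ from: "browser", to: "renderer", text: "submit document" });

  // 2. The renderer process opens a data pipe to the network process.
  log.push({ from: "renderer", to: "network", text: "open data pipe" });

  // 3. Once the data has arrived, the renderer process confirms.
  log.push({ from: "renderer", to: "browser", text: "confirm submission" });

  // 4. Only now does the browser UI switch over to the new page.
  ui.addressBarUrl = newUrl;
  ui.history.push(newUrl);
  return log;
}

const ui: BrowserUiState = {
  addressBarUrl: "https://www.npmjs.com/package/egg",
  history: ["https://www.npmjs.com/package/egg"],
};
const log = commitDocument("https://www.npmjs.com/package/koa", ui);
console.log(log.map((m) => `${m.from} -> ${m.to}: ${m.text}`).join("\n"));
```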
Enter the rendering phase
After the document is submitted, the renderer process begins parsing the page and loading subresources.
Once the page has been generated, the renderer process sends a message to the browser process, which stops the loading animation on the tab icon when it receives the message.
Navigation Process Summary
This is the navigation process from the user’s URL request to page parsing.
In short, it goes as follows:
- The browser process assembles the user’s input into a complete URL and hands it to the network process.
- The network process obtains the IP address through DNS and establishes a TCP connection with the server. It then sends an HTTP request and starts parsing once it receives the data from the server.
- After the HTML data is received, it is handed to the renderer process by submitting the document, ready to enter the page rendering stage.
What exactly the browser does in the render phase will be sorted out in the next article, so stay tuned.