From entering a URL to rendering the page: the entire process

This is an interview question about how browsers work.

Browser process: mainly responsible for user interaction, child process management and file storage.

Network process: provides network download functionality for the renderer process and the browser process.

Renderer process: its main responsibility is to turn the HTML, JavaScript, CSS, images and other resources downloaded from the network into pages that can be displayed and interacted with. Because everything the renderer process handles comes from the network, some of it may be malicious code that tries to exploit browser vulnerabilities to attack the system, so code running in the renderer process is not trusted. This is why Chrome runs the renderer process inside a security sandbox: to keep the system safe.

The whole process

Step 1: User input

First, the browser process receives the URL entered by the user and forwards it to the network process. The actual URL request is then made in the network process.

Detailed flow after user input

1. When the user types into the address bar, the address bar decides whether the input is search content or a requested URL

Search content: the address bar uses the browser's default search engine to synthesize a new URL containing the search keywords.
Requested URL: for example, if you enter Baidu.com, the address bar combines this content with a protocol according to its rules and synthesizes a complete URL, such as https://www.baidu.com/. Likewise, when you enter www.baidu.com, it becomes https://www.baidu.com/.

When the first request goes out over plain HTTP, Chrome reports a 307 Internal Redirect with Location: https://www.baidu.com/ in the response headers, telling the browser to redirect there. Chrome labels this redirect "Internal" because the browser generates it itself (typically because of an HSTS policy for the site) instead of receiving it from the server.
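The two scenarios above can be sketched roughly as follows. This is a minimal sketch, not the browser's real rules: the function name, the "contains a dot and no spaces" heuristic, the default https scheme and the search URL template are all assumptions made for illustration.

```python
from urllib.parse import quote_plus, urlsplit

# Hypothetical search template; a real browser uses the user's default engine.
SEARCH_TEMPLATE = "https://www.baidu.com/s?wd={query}"


def resolve_address_bar_input(text: str) -> str:
    """Return the URL the address bar would navigate to for the given input."""
    text = text.strip()
    # Input that already carries a scheme is treated as a URL; only add "/" if the path is empty.
    if "://" in text:
        parts = urlsplit(text)
        return parts._replace(path=parts.path or "/").geturl()
    # Simplified heuristic (assumption): no spaces and at least one dot looks like a host name.
    if " " not in text and "." in text:
        host, _, rest = text.partition("/")
        # Assumption: default to https and lower-case the host.
        return f"https://{host.lower()}/{rest}"
    # Anything else is treated as search keywords.
    return SEARCH_TEMPLATE.format(query=quote_plus(text))
```

For example, `resolve_address_bar_input("www.baidu.com")` yields `https://www.baidu.com/`, while a phrase with spaces is turned into a search URL.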

2. After the user types the keyword and presses Enter, the browser checks whether the current page has a beforeunload handler

Pressing Enter means the current page is about to be replaced with a new one. Before the navigation continues, the browser gives the current page a chance to handle the beforeunload event, which lets the page clean up data before it exits and also lets it ask the user whether to leave. For example, the current page may contain an unfinished form, so the user can cancel the navigation through the beforeunload prompt, and the browser then performs none of the subsequent work.
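The decision the browser makes at this point can be modelled in a few lines. This is only a toy model under stated assumptions: `should_leave_page`, the `unsaved_checks` callables and `confirm_leave` are hypothetical names standing in for the page's beforeunload handlers and the browser's "Leave site?" prompt.

```python
from typing import Callable, Iterable


def should_leave_page(
    unsaved_checks: Iterable[Callable[[], bool]],
    confirm_leave: Callable[[], bool],
) -> bool:
    """Return True if the navigation may continue, False if it is cancelled."""
    # Each check stands in for a beforeunload handler reporting unsaved work.
    if any(check() for check in unsaved_checks):
        # The user is only asked when some handler requests a confirmation.
        return confirm_leave()
    # No handler objected, so the navigation continues without a prompt.
    return True
```

With no handlers the navigation always continues; with a handler that reports unsaved work, the outcome depends entirely on the user's answer.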

As you can see from the figure, the icon on the tab enters the loading state as soon as the browser starts loading an address. At this point, however, the page still shows the content of the previously opened page; it has not been replaced by the Baidu home page yet. The page content is not replaced until the document commit phase, which is explained later.

Step 2: The URL request process

Next comes the page resource request. The browser process sends the URL request to the network process via interprocess communication (IPC). The network process receives it and initiates the actual URL request from there.

Detailed request process flow

1. Cache check: the network process first checks whether the resource is in the local cache. If it is, the cached resource is returned directly to the browser process. If it is not, the flow continues to the network request. Cache checking is a complex process in its own right; see the cache-related articles in this directory.
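A minimal sketch of the cache check, assuming a plain in-memory dictionary and a fixed freshness lifetime. `CacheEntry`, `load_resource` and the `fetch` callback are hypothetical names, and real HTTP caching (validation with ETag, Cache-Control directives and so on) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    body: bytes
    stored_at: float  # time the entry was stored, in seconds
    max_age: float    # freshness lifetime, in seconds


def load_resource(
    url: str,
    cache: Dict[str, CacheEntry],
    fetch: Callable[[str], bytes],
    now: float,
    max_age: float = 60.0,
) -> Tuple[bytes, str]:
    """Return (body, source), where source is "cache" or "network"."""
    entry = cache.get(url)
    # A fresh cached copy is returned without touching the network.
    if entry is not None and now - entry.stored_at < entry.max_age:
        return entry.body, "cache"
    # Missing or stale: fall through to the network request and store the result.
    body = fetch(url)
    cache[url] = CacheEntry(body, now, max_age)
    return body, "network"
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would read the clock itself.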

2. DNS resolution: the first step before sending the request is DNS resolution, which obtains the IP address of the server for the requested domain name. A TCP connection is then established, and if the request protocol is HTTPS, a TLS connection is established on top of it. DNS resolution, TCP connections and TLS connections are covered in related articles in this directory.
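The order of these three steps can be sketched with Python's standard `socket` and `ssl` modules. This is an illustration only: `open_connection` and `default_port` are hypothetical helpers, the first resolved address is used without any fallback, and calling `open_connection` needs real network access.

```python
import socket
import ssl

DEFAULT_PORTS = {"http": 80, "https": 443}


def default_port(scheme: str) -> int:
    """Return the default TCP port for an http or https URL."""
    return DEFAULT_PORTS[scheme.lower()]


def open_connection(host: str, scheme: str = "https", timeout: float = 5.0) -> socket.socket:
    """Resolve the host, open a TCP connection, and add TLS for https."""
    port = default_port(scheme)
    # 1. DNS resolution: turn the domain name into an address (first result only).
    family, socktype, proto, _canon, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    # 2. TCP connection to the resolved address.
    sock = socket.socket(family, socktype, proto)
    try:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
        # 3. TLS handshake on top of TCP, only when the scheme is https.
        if scheme.lower() == "https":
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
    return sock
```

The caller is responsible for closing the returned socket once the request is finished.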

3. Process the returned data

After receiving the request, the server generates the response data (response line, response headers and response body) and sends it to the network process. Once the network process has received the response line and headers, it parses the headers and, for status code 200, passes the parsed data to the browser process. After receiving the response header data from the network process, the browser process sends a “CommitNavigation” message to the renderer process. On receiving it, the renderer process prepares a blank page to receive the HTML data, which it does by establishing a data pipe directly with the network process. Finally, the renderer process sends a “confirm commit” message to the browser process, telling it that the renderer is ready to accept and parse the page data. When the browser process receives the “confirm commit” message from the renderer, it removes the old document and updates the page state in the browser process. The rendering flow can now begin.

What happens when the status code is something other than 200?

Status codes are a topic of their own; see the related articles in this directory.

For example, the status code may be 301, 302, 303, 307 or 308. These are the status codes for redirection.

They mean the server wants the browser to go to another URL. In this case the network process reads the redirect address from the Location field in the response headers and initiates a new HTTP or HTTPS request, and everything starts over again.
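The redirect handling described above can be sketched like this. It is a simplified model under stated assumptions: `follow_redirects` and the `fetch` callback (returning a status code and a header dictionary) are hypothetical, the status set is the one HTTP defines for redirects that carry a Location header, and the redirect limit of 20 is an arbitrary choice.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}

Response = Tuple[int, Dict[str, str]]  # (status code, response headers)


def follow_redirects(
    url: str,
    fetch: Callable[[str], Response],
    max_redirects: int = 20,
) -> Tuple[str, List[str]]:
    """Follow Location headers and return (final URL, list of URLs visited)."""
    chain = [url]
    while True:
        status, headers = fetch(url)
        location = headers.get("Location")
        # Anything that is not a redirect with a Location ends the loop.
        if status not in REDIRECT_STATUSES or location is None:
            return url, chain
        if len(chain) > max_redirects:
            raise RuntimeError("too many redirects")
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        chain.append(url)
```

Each hop issues a completely new request, which matches the "everything starts over again" behaviour in the text.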

How is the type of the response data handled?

It is handled according to the response headers.

Content-Type is a very important field in the HTTP headers: it tells the browser what type of data the server is returning in the response body. The browser then uses the value of Content-Type to decide how to display the response body.

Content-Type: text/html; charset=utf-8 is the HTML type, so the browser hands the data to the renderer process to render as HTML.

Content-Type: application/octet-stream is the byte stream type, so the browser hands the data to the download manager.
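A small sketch of this dispatch on Content-Type follows. The function names and the returned labels are made up for illustration, only the two types mentioned above are distinguished, and a real browser handles many more types.

```python
from typing import Dict, Tuple


def parse_content_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a Content-Type value into (media type, parameters)."""
    media_type, *params = [part.strip() for part in value.split(";")]
    parsed: Dict[str, str] = {}
    for param in params:
        name, sep, val = param.partition("=")
        if sep:
            # Parameter names are case-insensitive; tolerate stray spaces and quotes.
            parsed[name.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), parsed


def choose_handler(content_type: str) -> str:
    """Decide which component receives the response body (simplified)."""
    media_type, _ = parse_content_type(content_type)
    if media_type == "text/html":
        return "renderer"
    if media_type == "application/octet-stream":
        return "download manager"
    # Every other type is lumped together in this sketch.
    return "other"
```

For instance, `choose_handler("text/html; charset=utf-8")` selects the renderer, while `choose_handler("application/octet-stream")` selects the download manager.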

Step 3: Prepare for rendering

The network process passes the response for the requested document to the browser's main process, and the main process notifies the renderer process. The renderer process then sets up a data pipe with the network process so that it can receive the HTML document and render it into a page.

Even when the renderer process is ready, it cannot start parsing the document immediately, because the document data is still in the network process and has not yet been handed over to the renderer process. The next step is therefore to commit the document.

Committing the document

First, when the browser process receives the response header data from the network process, it sends a “commit document” message to the renderer process.

After receiving the “commit document” message, the renderer process establishes a data “pipe” with the network process to transfer the document data.

After the document data transfer is complete, the renderer process returns a “confirm commit” message to the browser process.

After receiving the “confirm commit” message, the browser process updates the browser UI state, including the security state, the URL in the address bar, the back/forward history, and the web page itself.
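The ordering of this exchange can be traced with a toy model. The step labels are informal names, not real Chromium IPC identifiers, and `BrowserUI` and `commit_document` are hypothetical; the point is only that the address bar and history change after the renderer has confirmed, not before.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BrowserUI:
    address_bar: str
    history: List[str] = field(default_factory=list)


def commit_document(ui: BrowserUI, new_url: str) -> List[Tuple[str, str]]:
    """Replay the exchange and record the address bar value at each step."""
    trace: List[Tuple[str, str]] = []

    def record(step: str) -> None:
        trace.append((step, ui.address_bar))

    record("network -> browser: response headers")
    record("browser -> renderer: commit document")
    record("renderer <-> network: data pipe established")
    record("renderer -> browser: confirm commit")
    # Only after the confirmation does the browser process update its UI state.
    ui.address_bar = new_url
    ui.history.append(new_url)
    record("browser: UI state updated")
    return trace
```

Every entry in the trace except the last still shows the old URL, which mirrors why the previous page stays visible while the new one is loading.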

This is why the page does not switch the instant we enter an address: the resource request, the data transfer and the parsing all have to happen first.

By default, Chrome assigns a renderer process to each page, meaning that a new renderer process is created for each new page that is opened. There are exceptions, however: in some cases the browser lets multiple pages run in the same renderer process. So far Juejin, Jianshu, CSDN, GitHub and Geek Time have been tested, and only Geek Time had multiple pages sharing one renderer process.

Chrome's architecture is not limited to the five kinds of process usually listed: the main browser process, renderer processes, the network process, plug-in processes (there may be several) and the GPU process. The version tested here is 92.0.4515.159.

It also shows processes such as the Storage service and the Audio Service, which suggests that the browser architecture is now dynamic: whichever modules you use, the corresponding processes are started on demand.

Step 4: Page rendering

Once the browser process has confirmed that the document is committed, the renderer process starts parsing the page and loading its sub-resources. When the page has finished loading, the renderer process sends a message to the browser process. The browser process receives the message and stops the loading animation on the tab icon.

See the browser rendering process article in this directory for details.

Conclusion

Each step expands into a great deal of content. Try to think every step through until you understand it.

If you don’t understand anything, or if there are any inadequacies or mistakes in my article, please point them out in the comments section. Thank you for reading.
