Preface
The title is one of those interview clichés: what happens between entering a URL and the page being displayed? You may feel it has been done to death, and that merely knowing how to recite an answer means little. I have recently been studying some of the browser's underlying mechanics, though, and I believe you will take something away from this write-up, so allow me one clickbait title.
Background
Browser processes
Every program we run executes as one or more processes, so let's start with the processes involved in opening a web page. From top to bottom they are:
- Browser process: responsible for the interface display, user interaction, and communication between child processes. It is the core process and coordinates the others.
- GPU process: Chrome's UI is drawn using the GPU. This is a process of the browser itself.
- Network process: responsible for fetching network resources.
- Renderer process: the most familiar one. The layout engine and the JS engine run here and turn our HTML, CSS, and JS into a page.
- Plugin process: runs while a plugin is executing. Browser plugins are mostly third-party and can be quite unstable, so they are kept in a process of their own.
Browser-server communication
The process of accessing a web page is nothing more than a process of transmitting data packets from one host to another.
IP
An IP address is the unique identity of a host. When data travels between hosts, an IP header carrying the address is placed in front of each packet, and that is how the packet finds its way to us.
TCP
A computer runs many processes, so how do we make sure the data reaches the right one? And since packets can arrive out of order, how are the received packets put back in sequence? This is the job of the TCP header, which contains the port number and a sequence number. Each communicating process is bound to its own port number, and once the TCP handshake completes, the connection ties that process's port to the server.
TCP addresses two points:
- The retransmission mechanism
- Packet sorting mechanism
This is where the TCP three-way handshake and four-way teardown would normally come up, but that is too much to cover, so I won't go into it here. I'll only add that during the data transfer phase the receiver has to acknowledge every packet: after receiving a packet it sends an acknowledgement back to the sender, and if the sender gets no acknowledgement within the allowed time, it sends the packet again (one possible reason why pages sometimes load slowly).
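The two mechanisms can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not real TCP: `TcpSegment`, `reassemble`, and `sendWithRetransmit` are hypothetical names, real TCP numbers bytes rather than whole packets, and the `deliver` callback stands in for the network and its timeout.

```typescript
// Toy model of two TCP ideas: ordering by sequence number and retransmission.
// All names are hypothetical; real TCP sequence numbers count bytes.
interface TcpSegment {
  seq: number; // position of this segment in the original stream
  payload: string;
}

// Receiver side: put segments back in order and ignore duplicates
// (a duplicate appears when a segment is resent but the first copy also arrived).
function reassemble(received: TcpSegment[]): string {
  const seen: Record<number, boolean> = {};
  const unique: TcpSegment[] = [];
  for (const segment of received) {
    if (!seen[segment.seq]) {
      seen[segment.seq] = true;
      unique.push(segment);
    }
  }
  unique.sort((a, b) => a.seq - b.seq);
  return unique.map((s) => s.payload).join("");
}

// Sender side: send a segment and resend it until it is acknowledged.
// `deliver` returns true when an acknowledgement came back in time.
function sendWithRetransmit(
  segment: TcpSegment,
  deliver: (s: TcpSegment) => boolean,
  maxAttempts: number
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (deliver(segment)) {
      return attempt; // acknowledged: report how many sends it took
    }
    // No acknowledgement within the time limit: loop around and send again.
  }
  throw new Error(`segment ${segment.seq} was never acknowledged`);
}
```

Every failed attempt costs one timeout before the resend, which is why a lossy connection makes a page feel slow even though all the data eventually arrives.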
The HTTP request
After the TCP connection is completed between hosts, HTTP requests are sent to communicate and transfer data.
Let’s take a look at the process
The cache lookup step here covers two kinds of cache, which we'll come back to later:
- DNS cache
- Content caching
Getting to the point
With those preliminaries covered, what does the browser actually do between the URL being entered and the page being displayed?
Enter the URL
There are two cases here:
- If the browser decides the input is not a web address, it uses the browser's default search engine to build a new URL containing the search keywords.
- If the input matches URL rules, the address bar adds the protocol to the content and builds a complete URL.
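The two cases above can be sketched as a small function. This is an illustrative heuristic only, not how any browser actually decides: `resolveAddressBarInput`, the regular expressions, the `https://` default, and the `search.example` template are all assumptions made for the example.

```typescript
// Hypothetical address-bar logic: is the input a URL or a search query?
// The heuristic and the search URL template are illustrative assumptions.
const SEARCH_TEMPLATE = "https://search.example/?q="; // placeholder search engine

function resolveAddressBarInput(input: string): string {
  const text = input.trim();
  // Already has a scheme such as http:// or https://, so use it unchanged.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return text;
  }
  // Looks like a host name (no spaces, contains a dot, optional path):
  // add a protocol in front to make a complete URL.
  if (/^[^\s\/]+\.[^\s\/]+(\/\S*)?$/.test(text)) {
    return "https://" + text;
  }
  // Anything else is treated as search keywords.
  return SEARCH_TEMPLATE + encodeURIComponent(text);
}
```

A real address bar handles far more cases (for example `localhost`, IP addresses, and internationalized domain names), which this sketch deliberately ignores.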
Once the URL is typed, we press Enter. Let's see what happens in the browser. As soon as Enter is pressed, the browser gives the current page one chance to run its beforeunload event, which lets the page register a handler that blocks the navigation. If no handler is listening, navigation proceeds by default. Also, after Enter is pressed the view does not switch to the new URL's page immediately; it only switches once the server has returned a response.
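A handler that blocks the navigation might look roughly like this. It is a sketch: `guardUnsavedChanges` and the two small interfaces are hypothetical and exist only so the example does not depend on a browser environment. In a real page the target would be the `window` object.

```typescript
// Minimal shapes of the objects involved, declared here so the sketch runs
// outside a browser. In a page, `target` would be `window`.
interface BeforeUnloadLikeEvent {
  preventDefault(): void;
  returnValue: unknown;
}
interface UnloadTarget {
  addEventListener(type: "beforeunload", listener: (e: BeforeUnloadLikeEvent) => void): void;
}

// Ask the browser to confirm leaving the page while there are unsaved changes.
function guardUnsavedChanges(target: UnloadTarget, hasUnsavedChanges: () => boolean): void {
  target.addEventListener("beforeunload", (event) => {
    if (hasUnsavedChanges()) {
      event.preventDefault(); // signals that the page wants a confirmation prompt
      event.returnValue = true; // older browsers look at this property instead
    }
    // When nothing is unsaved the handler does nothing and navigation goes ahead.
  });
}
```

The browser decides whether and how to show the prompt, so the page can ask for confirmation but cannot force the user to stay.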
Send the request
Check the cache
The network process checks to see if there is a valid local cache of the domain’s content before making a request.
What makes a cache valid: servers normally send cacheable data with an expiry time. If the cached copy has expired we still have to continue to the next step; if it is still usable, it is returned straight to the browser process for display.
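The expiry check can be sketched as follows, assuming the server expressed the time limit as `Cache-Control: max-age=<seconds>`. The `CachedResponse` shape and the function names are hypothetical, and real HTTP caches consider more directives than this.

```typescript
// Minimal freshness check for a cached response, assuming the server sent
// "Cache-Control: max-age=<seconds>". The CachedResponse shape is hypothetical.
interface CachedResponse {
  body: string;
  storedAtMs: number; // when the response was stored, in ms since the epoch
  cacheControl: string; // raw Cache-Control header value
}

// Extract max-age in seconds, or null when the directive is missing.
function parseMaxAge(cacheControl: string): number | null {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

// True when the cached copy can be used without asking the server.
function isFresh(entry: CachedResponse, nowMs: number): boolean {
  const maxAgeSeconds = parseMaxAge(entry.cacheControl);
  if (maxAgeSeconds === null) {
    return false; // no expiry information: do not reuse without asking the server
  }
  const ageSeconds = (nowMs - entry.storedAtMs) / 1000;
  return ageSeconds < maxAgeSeconds;
}
```

Passing the current time in as `nowMs` rather than reading the clock inside the function keeps the check easy to test.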
If there is no cached copy, we go ahead and send a DNS request to resolve the domain name in the URL to an IP address. DNS results are cached as well, so a domain that has already been resolved can be answered directly from the cache.
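A DNS cache of this kind can be sketched as a lookup table with an expiry time per entry. The `DnsCache` class, the injected `lookup` function, and the single fixed TTL are assumptions for illustration; a real resolver is asynchronous and takes the TTL from each DNS record.

```typescript
// Sketch of a DNS cache with a time-to-live. The class, the injected `lookup`
// function and the fixed TTL are assumptions, not how any real browser works.
type Lookup = (hostname: string) => string; // returns an IP address

class DnsCache {
  private entries: Record<string, { ip: string; expiresAtMs: number }> = {};
  private lookup: Lookup;
  private ttlMs: number;

  constructor(lookup: Lookup, ttlMs: number) {
    this.lookup = lookup;
    this.ttlMs = ttlMs;
  }

  resolve(hostname: string, nowMs: number): string {
    const hit = this.entries[hostname];
    if (hit && hit.expiresAtMs > nowMs) {
      return hit.ip; // still valid: no DNS request needed
    }
    const ip = this.lookup(hostname); // cache miss or expired: ask the resolver
    this.entries[hostname] = { ip, expiresAtMs: nowMs + this.ttlMs };
    return ip;
  }
}
```

Repeated visits to the same domain within the TTL skip the DNS round trip entirely, which is the saving the paragraph above describes.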
Establishing a TCP Connection
The next step is to establish a TCP connection to the server using the IP address. Once the connection is established, the browser constructs the request line, request headers, and so on, attaches data associated with the domain, such as cookies, to the request headers, and sends the constructed request to the server. The server generates the response data (response line, response headers, and response body) based on the request and sends it to the network process. Once the network process has received the response line and headers, it parses the headers.
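To make the request line, headers, and response line concrete, here is a sketch of the text involved in a plain HTTP/1.1 exchange. `buildGetRequest` and `parseResponseHead` are hypothetical helpers, and the sketch assumes a well-formed response; real browsers send many more headers and often use HTTP/2 or HTTP/3, which are binary.

```typescript
// Sketch of the text a browser sends for an HTTP/1.1 GET, and of parsing the
// response line and headers. Function names and values are hypothetical.
function buildGetRequest(host: string, path: string, cookies: Record<string, string>): string {
  const cookieHeader = Object.keys(cookies)
    .map((name) => `${name}=${cookies[name]}`)
    .join("; ");
  const lines = [`GET ${path} HTTP/1.1`, `Host: ${host}`]; // request line, then headers
  if (cookieHeader) {
    lines.push(`Cookie: ${cookieHeader}`); // data stored for this domain
  }
  return lines.join("\r\n") + "\r\n\r\n"; // a blank line ends the header section
}

interface ResponseHead {
  status: number;
  headers: Record<string, string>; // header names lower-cased
}

function parseResponseHead(raw: string): ResponseHead {
  const head = raw.split("\r\n\r\n")[0]; // everything before the body
  const lines = head.split("\r\n");
  const status = Number(lines[0].split(" ")[1]); // "HTTP/1.1 200 OK" -> 200
  const headers: Record<string, string> = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
    }
  }
  return { status, headers };
}
```

Header names are lower-cased on parsing because HTTP header names are case-insensitive, which makes later lookups such as `headers["content-type"]` reliable.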
Server response data
The server's response falls into many cases (just think of how many status codes there are, which is a little scary). Here are a few of them.
Redirect
Once it receives the response headers from the server, the network process starts parsing them. If the status code is 301 or 302, the server wants the browser to redirect to another URL. The network process reads the redirect address from the Location field of the response headers, issues a new HTTP or HTTPS request, and the whole process starts over.
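The redirect loop can be sketched like this. `fetchOnce` is a hypothetical stand-in for one HTTP round trip, the hop limit is an assumption added to avoid looping forever, and the sketch assumes the Location value is an absolute URL.

```typescript
// Sketch of redirect handling. `fetchOnce` is a hypothetical stand-in for one
// HTTP round trip; the hop limit is an assumption to avoid infinite loops.
interface SimpleResponse {
  status: number;
  headers: Record<string, string>; // header names lower-cased
}

function followRedirects(
  url: string,
  fetchOnce: (url: string) => SimpleResponse,
  maxHops: number
): { finalUrl: string; response: SimpleResponse } {
  let current = url;
  for (let hop = 0; hop <= maxHops; hop++) {
    const response = fetchOnce(current);
    const location = response.headers["location"];
    const isRedirect = response.status === 301 || response.status === 302;
    if (!isRedirect || !location) {
      return { finalUrl: current, response }; // not a redirect: navigation continues here
    }
    current = location; // start over with the address from the Location header
  }
  throw new Error("too many redirects");
}
```

Only the two status codes named in the text are handled; other redirect codes such as 307 and 308 exist but are left out of the sketch.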
The cache is still usable
If the cached copy has expired, we send a request. If the resource has not changed, the server returns a 304 status code, which is the server telling the browser: "This cache can still be used, so we won't send you the data again this time." The network process then hands the cached content straight to the browser process for display.
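One common way the server learns that nothing has changed is a validator such as an ETag, sent back by the browser in an `If-None-Match` header. The text above does not say which validator is used, so treat the ETag choice, the `CacheEntry` and `ServerReply` shapes, and the `send` callback as assumptions of this sketch.

```typescript
// Sketch of revalidation with ETag / If-None-Match. The entry shape and the
// `send` callback are hypothetical; a real cache also supports Last-Modified.
interface CacheEntry {
  etag: string;
  body: string;
}
interface ServerReply {
  status: number;
  body: string;
  etag: string;
}

function revalidate(
  entry: CacheEntry,
  send: (requestHeaders: Record<string, string>) => ServerReply
): CacheEntry {
  const reply = send({ "If-None-Match": entry.etag });
  if (reply.status === 304) {
    return entry; // not modified: keep using the cached body
  }
  return { etag: reply.etag, body: reply.body }; // changed: replace the cache entry
}
```

A 304 reply carries no body, so the saving is the transfer of the resource itself, not the round trip.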
Response type
With those two cases out of the way, we can finally handle the data returned by the server. But here is another wrinkle: you have probably clicked a link that downloaded something directly instead of opening a web page. That comes down to Content-Type.
If the Content-Type in the response headers is application/octet-stream, the browser handles the request as a download. If it is HTML, the browser continues with the navigation flow. Since Chrome renders pages in the renderer process, the next step is to prepare a renderer process.
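The decision can be sketched as a small dispatch on the media type. `decideHandling` and the `ResponseHandling` type are hypothetical, and the sketch only covers the two cases discussed here; browsers also render many other types, such as images and plain text.

```typescript
// Sketch of choosing what to do with a response based on Content-Type.
// Only the two cases from the text are modelled; everything else is "other".
type ResponseHandling = "render" | "download" | "other";

function decideHandling(contentType: string): ResponseHandling {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mediaType = contentType.replace(/;.*$/, "").trim().toLowerCase();
  if (mediaType === "text/html") {
    return "render"; // continue navigation and prepare a renderer process
  }
  if (mediaType === "application/octet-stream") {
    return "download"; // hand the response over to the download manager
  }
  return "other";
}
```

Media types are case-insensitive, so the lower-casing step makes a value like `Application/Octet-Stream` behave the same as the canonical spelling.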
Rendering begins
There are three steps:
- At this point the network process has the data returned by the server and reports back to the browser process, which then connects the network process to the renderer process.
- The renderer process sends a message to the browser process once it has received the data.
- When the browser process receives that message, it updates the browser UI state, including the security state, the URL in the address bar, and the back/forward history, and updates the web page. This is the moment the view actually switches from the previous page to the new URL's page (which may not be fully rendered yet). Once the page has been generated, the renderer process sends a message to the browser process, and when the browser process receives it, it stops the loading animation on the tab icon. OK, the page is loaded.
Conclusion
This post summarizes what I took away from Li Bing's course "Browser Working Principles and Practice", breaking the journey from URL to page into its individual steps. This part, the network request, is mostly about the underlying workings of the browser.