This question is practically a must-ask in interviews. Given the time limit of an interview, the answer usually touches on the DNS query, the HTTP request, the TCP three-way handshake and four-way teardown, parsing HTML to build the DOM tree, computing CSS styles on the DOM tree, compositing the image, and drawing it on the screen. It is a very open-ended question, and the material it involves is enough to fill a whole article.

First, open a browser

Before entering a URL, you first have to open the browser. On startup, Chrome launches multiple processes: main (browser), renderer, network, plug-in, GPU, and so on.

Processes and threads

First, a process is the smallest unit to which the operating system allocates resources, and a thread is the smallest unit the CPU schedules. Think of the departments at a company: resources are allocated per department, so the front-end department and the back-end department each get their own headcount, and a department that runs short can apply for more. Threads are the staff inside a department. The staff of one department share its resources (its code base, for example) and all work for that department. And of course front-end staff do not write the back-end's interfaces: work does not cross department lines, just as a thread does not reach into another process's memory.
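To make the "staff share the department's resources" idea concrete, here is a minimal Python sketch. The names shared_counter and add_many are made up for illustration. Four threads of one process update the same variable, and because the memory is shared, a lock keeps the updates from interleaving.

```python
import threading

shared_counter = 0          # lives in the process; visible to every thread
lock = threading.Lock()


def add_many(times: int) -> None:
    """Each thread adds to the same process-wide counter."""
    global shared_counter
    for _ in range(times):
        with lock:          # threads share memory, so guard the update
            shared_counter += 1


threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counter)  # 4000: all four threads wrote to one variable
```

A second process would get its own copy of shared_counter, which is the isolation that Chrome relies on when it gives each tab its own process.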

When you start Chrome, several processes are created.

  1. The main process acts as an administrative department, managing the address bar, bookmarks, back and forward buttons, and the creation and destruction of tabs.

  2. Plug-in processes. Almost everyone who uses a browser has a few plug-ins installed, such as Adblock. Each plug-in gets its own process, which is created when the plug-in is used;

  3. The GPU process. There is only one. It was originally used to draw 3D content, but now web pages and the browser interface are all drawn by the GPU;

  4. The network process, which became an independent process in the past couple of years (it used to live inside the main process), is mainly responsible for requesting network resources;

  5. The renderer process, which is what we front-end developers focus on.

    First, when you open a tab, Chrome creates a process for it, which is essentially an instance of the renderer process. Within this process there are various threads, such as the GUI rendering thread, network request threads, the event trigger thread, and the JS engine thread (the V8 engine).

    In addition, there is a spare renderer process that stays resident. It differs from a tab's renderer process: when a new tab needs a renderer, the spare can be promoted to a regular one, and a new spare is then created.

Enter the URL

URL stands for Uniform Resource Locator. Take this address as an example: https://www.example.com/index.html?uid=1#ch1.

  • https:// is the protocol scheme. Other schemes include ftp: and javascript:;

  • www.example.com is the host name, which DNS resolves into an IP address. Of course, we can also skip the domain name and reach the server directly by IP address; during development, a LAN address such as 192.168.1.1 often points at the Node server running on the development machine;

  • /index.html is the path of the HTML document we want to get;

  • ?uid=1 is the query string, used to pass parameters;

  • #ch1 is called the fragment identifier. It marks a location inside the retrieved resource (a sub-resource), and hashRouter implementations are based on it.
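The same breakdown can be reproduced with Python's standard urllib.parse module. This is only an illustrative sketch: the helper name split_url is made up here, and a browser uses its own URL parser rather than this one.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url: str) -> dict:
    """Split a URL into the parts described in the list above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,           # protocol scheme
        "host": parts.hostname,           # resolved to an IP address by DNS
        "path": parts.path,               # the document being requested
        "query": parse_qs(parts.query),   # parameters as name -> list of values
        "fragment": parts.fragment,       # handled by the browser only
    }


parts = split_url("https://www.example.com/index.html?uid=1#ch1")
for name, value in parts.items():
    print(name, "=", value)
```

Note that the fragment is never sent to the server, which is why a hash-based router can change it without triggering a new request.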

Second, network request

With the browser open and the URL entered, the next step is to find the way out to the network and request the resource.

Before requesting resources, one more thing needs to be made clear.

There is a separate network process, yet every new tab (that is, every renderer process) also has its own HTTP request threads. So when the URL is entered and the network request goes out, is it initiated by the network process or by the renderer process?

The answer is that it is initiated by the tab's own renderer process. The network process, on the other hand, handles network requests made by the browser itself, such as Google account requests and downloads.

The port number

Anyone who knows a little about networking knows that a network request goes in and out through a network cable or Wi-Fi. A computer can carry a lot of traffic, and the browser is not the only program on the network: WeChat, NetEase Cloud Music, and other apps are making requests too. So how does the computer tell which program a given response should be handed to?

Think about it: how does a takeout order get from downstairs to your front door? By the house number. When the renderer process needs to communicate over the network, it requests a port from the operating system. Note that this is a port number, not a PID.
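As a small illustration of "applying for a port", the sketch below asks the operating system for any free port by binding a socket to port 0 on the loopback address. The helper name ephemeral_port is made up; a real client usually gets its port implicitly when it connects, so treat this as a stand-in for that step.

```python
import os
import socket


def ephemeral_port() -> int:
    """Bind to port 0 so the OS picks a free port, then report which one."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))      # port 0 means "any free port"
        return sock.getsockname()[1]     # the port number the OS handed out


port = ephemeral_port()
# The port identifies the socket on the network; the PID identifies the process.
print("port:", port, "pid:", os.getpid())
```

The two numbers are unrelated: incoming packets are matched to a program by port, never by PID.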

Then there is the question of how the computer knows to send packets to the router and not to some other device, in a home full of networked devices large and small. Back to the takeout example: before reaching your door, how does the courier find your building? By its address. Every device has an address too. It is called the MAC address, and it is unique. Before sending data, the computer first broadcasts a query (this is the job of ARP) to find the MAC address of the gateway. It works like paging someone in a shopping mall: the mall does not know where the person is, so it broadcasts and waits for the person to respond.

With that, the browser has found its way out and can start sending data.

The DNS

Servers are located by IP address, but IP addresses are hard to remember. To solve this problem, the DNS service came into being: it lets you look up an IP address by domain name, and a domain name by IP address. Therefore, after the URL is filled in, the network resource cannot be requested right away; the steps below come first. The lookup can stop early: as soon as the IP address is found at any stage, the remaining steps are skipped.

  • Chrome’s own DNS cache
  • Operating system cache
  • The local hosts file
  • Send a domain name resolution request to the preferred DNS server configured for the system, for example 114.114.114.114, a public DNS server commonly used on China Telecom, China Mobile, and China Unicom networks
  • The carrier's DNS server then starts an iterative resolution. It first finds the IP address of a root DNS server and sends the request. The root does not store the mapping between domain names and IP addresses; it returns the IP address of the DNS server for the domain suffix. For example.com, the root returns the IP address of the com domain's server.
  • The carrier then queries the DNS server of the suffix domain (the com domain). The com domain does not return the actual IP address either; it returns the address of the DNS server for example.com, which is generally provided by the domain registrar, such as Wanwang or Xinnet.
  • After the registrar's DNS server provides the IP address, the carrier returns it to the operating system, which then passes it to the browser.
  • If none of the above yields a result, the operating system checks the NetBIOS name cache, which holds the names and IP addresses of computers it has communicated with successfully in the recent period.
  • Query the WINS server
  • Broadcast a lookup
  • Read the LMHOSTS file
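The "stop as soon as any stage answers" behaviour of the list above can be sketched as a chain of lookups. Everything here is illustrative: the stage names, the in-memory tables, and the address 203.0.113.10 (taken from a range reserved for documentation) are assumptions, and no real DNS query is made.

```python
from typing import Callable, Dict, List, Optional, Tuple

Lookup = Callable[[str], Optional[str]]


def resolve(domain: str, stages: List[Tuple[str, Lookup]]) -> Tuple[Optional[str], List[str]]:
    """Try each stage in order; return the IP and the stages that were consulted."""
    tried: List[str] = []
    for name, lookup in stages:
        tried.append(name)
        ip = lookup(domain)
        if ip is not None:
            return ip, tried          # found: later stages are never asked
    return None, tried                # every stage missed: resolution fails


# Hypothetical tables standing in for the real caches and servers.
browser_cache: Dict[str, str] = {}
os_cache: Dict[str, str] = {}
hosts_file: Dict[str, str] = {"localhost": "127.0.0.1"}
dns_server: Dict[str, str] = {"www.example.com": "203.0.113.10"}

stages: List[Tuple[str, Lookup]] = [
    ("browser cache", browser_cache.get),
    ("OS cache", os_cache.get),
    ("hosts file", hosts_file.get),
    ("DNS server", dns_server.get),
]

print(resolve("localhost", stages))
print(resolve("www.example.com", stages))
print(resolve("unknown.invalid", stages))
```

A real resolver would also write each successful answer back into the earlier caches, so the next lookup stops sooner.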

After the DNS resolution, you get the IP address of the target server, and you can happily request data. Of course, if no IP address is found, the resolution fails.

A TCP connection

When a request is sent out, it passes through roughly these steps: network card => router => switch => server, with various kinds of forwarding and physical-layer links in between. That is not the focus of this article, so I will not expand on it. Before making an HTTP request, you need to establish a connection with the server. TCP is used here; the alternative is UDP. The main difference between the two protocols is this: UDP does not need to establish a connection and can send whenever it wants, which makes it unreliable, while TCP establishes a connection first and delivers data reliably.

A TCP connection is full-duplex, which means that data can be transmitted in both directions at the same time. The process of establishing a TCP connection with the server is commonly called the TCP three-way handshake.

  1. The first handshake is initiated by the client, which sends a SYN packet (seq=j) to the server and enters the SYN_SENT state, waiting for the server to reply.
  2. After receiving the request, the server acknowledges the client's SYN (ack=j+1) and at the same time sends a SYN packet of its own (seq=x). At this point the server enters the SYN_RECV state. This is the second handshake.
  3. In the last handshake, the client sends an ACK packet (ack=x+1) to the server to confirm. Both server and client enter the ESTABLISHED state, and the TCP connection is complete.

Field: Meaning (uppercase words, such as ACK, denote flag bits, whose value is 1 or 0; lowercase words, such as ack, denote sequence or acknowledgment numbers)
URG: Whether the urgent pointer is valid. A value of 1 indicates that some data in the packet needs to be processed first.
ACK: Whether the acknowledgment number is valid. Generally set to 1.
PSH: Prompts the receiving application to read the data from the TCP buffer immediately.
RST: Resets the connection; the other party asks to re-establish it.
SYN: Requests to establish a connection and initializes the sequence number in the sequence number field. Set to 1 when establishing a connection.
FIN: Requests to close the connection.
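The three-way handshake and the flag bits described above can be traced with a tiny simulation. It is a sketch only: no packets are sent, the Segment class and the function name are made up for illustration, and the initial sequence numbers j and x are passed in by hand, whereas a real TCP stack picks them itself.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: frozenset      # uppercase flag bits that are set, e.g. {"SYN", "ACK"}
    seq: int = 0          # sequence number
    ack: int = 0          # acknowledgment number


def three_way_handshake(j: int, x: int):
    """Walk both sides through the handshake and return their final states."""
    client, server = "CLOSED", "LISTEN"
    log = []

    # 1. client -> server: SYN with the client's initial sequence number j
    syn = Segment(frozenset({"SYN"}), seq=j)
    client = "SYN_SENT"
    log.append(("client", syn))

    # 2. server -> client: SYN+ACK, acknowledging j+1 and sending its own x
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=x, ack=syn.seq + 1)
    server = "SYN_RECV"
    log.append(("server", syn_ack))

    # 3. client -> server: ACK, acknowledging x+1
    ack = Segment(frozenset({"ACK"}), seq=syn.seq + 1, ack=syn_ack.seq + 1)
    client = "ESTABLISHED"
    log.append(("client", ack))

    # The server only considers the connection open once this ACK arrives.
    if ack.ack == syn_ack.seq + 1:
        server = "ESTABLISHED"
    return client, server, log


client_state, server_state, log = three_way_handshake(j=100, x=300)
for sender, segment in log:
    print(sender, sorted(segment.flags), "seq =", segment.seq, "ack =", segment.ack)
print(client_state, server_state)
```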

Here is the question: why three handshakes?

Because the network environment is complex and packet loss is very common. Suppose there were only two handshakes. The server sends its acknowledgment to the client; if that packet is lost, the client stays in the SYN_SENT state, while the server considers the connection established and starts sending data. Since the client believes the connection has not been established, it ignores the data sent by the server, and the server retransmits again and again after each timeout, which ends in a deadlock.

In short, the server needs to know that the connection has actually been established, rather than only knowing that it sent its own packet.

HTTP transport

Once the connection is established, HTTP, a text-based protocol carried over the TCP channel, is used to request the web page with a GET request. On the first visit, resources are requested from the server; if CDN caching is in use, they are requested from the CDN server.

If we have visited this page before, the browser checks the cache headers stored with the earlier HTTP response. One of them is Cache-Control (along with Expires): if the cache hits and has not expired, the resource is read from the cache, so there is no need to fetch it from the server and no need for a TCP connection. The HTTP status is 200. This is strong caching.

Otherwise the request is sent to the server, and the server decides whether the cached copy is still usable. Last-Modified/If-Modified-Since and ETag/If-None-Match are the related fields. The server validates the ETag first, then goes on to compare Last-Modified, and finally decides whether to return 304. This is negotiated caching.
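The two caching decisions can be sketched as two small functions, one for the browser side and one for the server side. This is a simplification under stated assumptions: the function names are made up, timestamps are plain integers in seconds instead of HTTP date strings, and the server checks ETag first and then Last-Modified as described above (some servers let a matching ETag decide on its own).

```python
from typing import Dict


def is_fresh(stored_at: int, max_age: int, now: int) -> bool:
    """Strong caching: reuse the cached copy while its age is below max-age."""
    return now - stored_at < max_age


def revalidate(request_headers: Dict[str, str], etag: str, last_modified: int) -> int:
    """Negotiated caching: 304 if the client's copy still matches, otherwise 200."""
    client_etag = request_headers.get("If-None-Match")
    client_time = request_headers.get("If-Modified-Since")
    if client_etag is None and client_time is None:
        return 200                    # nothing to validate against
    if client_etag is not None and client_etag != etag:
        return 200                    # ETag differs: send the new body
    if client_time is not None and int(client_time) < last_modified:
        return 200                    # resource changed after the client's copy
    return 304                        # client may keep using its cached copy


# Browser side: a copy stored at t=0 with max-age=60
print(is_fresh(stored_at=0, max_age=60, now=30))    # still fresh, no request sent
print(is_fresh(stored_at=0, max_age=60, now=120))   # stale, ask the server

# Server side: the current version has ETag "v2" and was modified at t=1000
print(revalidate({"If-None-Match": '"v2"', "If-Modified-Since": "1000"}, '"v2"', 1000))
print(revalidate({"If-None-Match": '"v1"'}, '"v2"', 1000))
```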

Pipeline process

Third, render the page

With the HTML document in hand, it is time to start rendering the page. The network request described above is also made in the renderer process of the current tab, which then uses inter-process communication (IPC) to notify the browser process of changes to the page, such as stopping the loading icon.

Parsing

As the HTML is parsed, the DOM tree is constructed at the same time. The root of the DOM tree is usually <html>. DOM elements are added to the tree as they are parsed, and their attributes are attached to the current node. In addition, the browser also needs to parse the CSS and build the RenderObject tree; both are processed as streams. The browser takes the elements of the DOM tree in turn and applies the CSS rules to them according to selector priority. After that, the position of each element in the document is determined, that is, layout, which generates the render tree.
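Streaming tree construction can be illustrated with Python's standard html.parser module. This is a minimal sketch, not how a browser engine is written: the Node and TreeBuilder classes are made up for illustration, error recovery is ignored, and void elements written without a closing slash (such as a bare br tag) are not handled.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])   # attributes attached to this node
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree as markup arrives, using a stack of open elements."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]         # the last entry is the current node

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)   # add to the tree right away
        self.stack.append(node)                # it becomes the current node

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1].children.append(Node("#text", [("value", text)]))


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
# Feed the document in two chunks, as if it were arriving over the network.
builder.feed("<html><body>")
builder.feed('<p class="intro">Hello</p></body></html>')
builder.close()
tree_lines = outline(builder.root)
print("\n".join(tree_lines))
```

After the first chunk the tree already contains html and body, which is the point of stream processing: the parser does not wait for the whole document.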

Painting

Graphics rendering in Chrome is handled by the Skia engine, which, incidentally, is also the drawing engine for Android and Flutter. What Skia draws is actually a picture, and each picture is a frame. The web page in front of us is therefore really an image that is constantly being redrawn. Not the whole image every time, of course: the area is divided into tiles, and only the parts that need updating are redrawn.

Since JS can manipulate DOM elements, the concepts of repaint and reflow arise. Reflow happens when the render tree needs to be rebuilt; for example, changing an element's display: none value to show the element causes a reflow. Repaint, on the other hand, only modifies values in the render tree. Reflow inevitably triggers repaint, and either one causes the rendering engine to redraw the composited bitmap.

Meanwhile, the GUI thread and the JS engine thread are mutually exclusive. While the JS engine is executing, the GUI thread is suspended, and GUI updates are stored in a queue and executed as soon as the JS engine becomes idle. This ensures that the rendering engine sees consistent element data, and avoids JS and the GUI manipulating the same element at the same time.

Task queue

As we all know, the window.onload handler fires after the entire page has loaded. After that comes the JS event logic. JS is single-threaded and all tasks are queued, but the JS engine can set aside tasks that are waiting on I/O (network requests, mouse clicks, etc.), process the other tasks first, and handle the pending tasks once the I/O returns its results.
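The queueing behaviour can be modelled with a toy event loop. This is a sketch under stated assumptions: the function name run_event_loop and the job names are made up, I/O is represented as a number of loop ticks until the result is ready, and nothing here corresponds to the real scheduling code of a JS engine.

```python
from collections import deque


def run_event_loop(jobs):
    """Run jobs given as (name, io_ticks) pairs and return the execution order.

    io_ticks == 0 means the job is ready now; otherwise its callback only
    becomes ready after that many loop ticks (a stand-in for waiting on I/O).
    """
    ready = deque(name for name, io_ticks in jobs if io_ticks == 0)
    waiting = [(io_ticks, name) for name, io_ticks in jobs if io_ticks > 0]
    order = []
    tick = 0
    while ready or waiting:
        tick += 1
        # I/O that has finished by this tick puts its callback in the queue.
        done = [name for due, name in waiting if due <= tick]
        waiting = [(due, name) for due, name in waiting if due > tick]
        ready.extend(done)
        if ready:
            order.append(ready.popleft())   # one task at a time: single thread
    return order


order = run_event_loop([("fetch callback", 3), ("log A", 0), ("log B", 0)])
print(order)
```

The job that waits on I/O is listed first but runs last: the loop keeps working through the ready tasks instead of blocking on it.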

Fourth, endnotes

That concludes this article. I consulted a lot of material while writing it; Li Bing's "Browser Working Principles and Practice" was a great help and is well worth reading. Given how much this question covers, the core part is the network request. As for the graphics rendering mentioned later, that belongs to the research field of computer graphics.