What happens between entering a URL in the browser and the page being rendered? For the whole 30 minutes of the interview there was no coding and no code questions, only this seemingly old-fashioned question. To be specific, every other question grew out of this one. “This question, if you dig deeper, is a good way to gauge how complete a candidate’s knowledge system is,” the interviewer said.

Q: What happens between entering a URL in the browser and the page being rendered?

URL parsing

After receiving the URL, the browser parses it into fields, each with a specific meaning.

Q: What are those fields, and what does each one mean?

  • Protocol: the scheme, which tells the browser how to handle the resource being opened, for example HTTP or FTP
  • Host: the domain name or IP address of the machine that serves the resource. A domain name is resolved to an IP address through DNS
  • Port: the port number on that host
  • Path: the directory or file name being requested
  • Query: the query parameters passed along with the request
  • Fragment: the fragment or anchor, often used for front-end routing

DNS resolution

Once the URL is parsed, its fields are handed to the network request thread, which uses DNS to turn the Host field into an IP address. The lookup goes through several levels in order. The browser first checks its own DNS cache. On a miss it checks the operating system's DNS cache, and on another miss it checks the system hosts file. If nothing is found on the local machine, the query goes to the local DNS server. If the local DNS server has no answer either, it queries the root name server, the top-level domain server (.com, for example), and the authoritative name server iteratively, and finally returns the IP address to the local host. If no record is found anywhere, an error is reported.
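
The lookup order can be sketched as a chain of tables that are consulted one after another. This is only a toy model: the dictionaries stand in for the real caches, `query_upstream` is a hypothetical stand-in for the local DNS server and its iterative query, and the address comes from a range reserved for documentation.

```python
def resolve(host, layers, query_upstream):
    """Return (ip, source), checking each local layer before going upstream.

    layers: ordered list of (name, dict) pairs, nearest cache first.
    query_upstream: callable that returns an IP string or None.
    """
    for name, table in layers:
        if host in table:
            return table[host], name
    ip = query_upstream(host)
    if ip is None:
        raise LookupError(f"cannot resolve {host}")
    # Assumption for this sketch: remember the answer in the first layer
    # (the browser cache) so that the next lookup stops there.
    layers[0][1][host] = ip
    return ip, "local DNS server"


layers = [
    ("browser cache", {}),
    ("OS cache", {}),
    ("hosts file", {"localhost": "127.0.0.1"}),
]
# Pretend upstream answer; 192.0.2.0/24 is reserved for documentation.
upstream = {"example.com": "192.0.2.10"}.get

first = resolve("example.com", layers, upstream)
second = resolve("example.com", layers, upstream)
print(first)
print(second)
```

The second lookup is answered by the browser cache, which is the whole point of checking the nearest layer first.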

Q: What is the difference between a process and a thread?

A process is a running instance of a program, and the operating system allocates it a separate block of memory for the code and data it needs. Each process has at least one main thread and may have any number of additional threads. The threads of a process share the resources allocated to that process, and multithreading can improve efficiency when work can proceed concurrently.

  • If one thread crashes, the whole process goes down with it
  • Processes are isolated from each other, so the crash of one process does not affect the others. Each process can access only the resources the system allocated to it, but processes can communicate with each other through IPC mechanisms
  • Threads share the data of the process they belong to
  • When a process exits, the operating system reclaims all the resources it occupied, even if one of its threads leaked memory
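
A small sketch of two of these points: threads writing to the same data, and a parent process surviving the crash of a child. The function names are invented for the example, and the child is simply another Python interpreter started through `subprocess`.

```python
import subprocess
import sys
import threading


def run_threads(n=4):
    """Threads of one process all append to the same list object."""
    shared = []
    lock = threading.Lock()

    def worker(i):
        with lock:  # shared data needs synchronization
            shared.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared)


def run_crashing_child():
    """Start a child process that dies from an uncaught exception.

    The parent keeps running and only sees a non-zero exit code.
    """
    result = subprocess.run(
        [sys.executable, "-c", "raise RuntimeError('boom')"],
        stderr=subprocess.DEVNULL,  # hide the child's traceback
    )
    return result.returncode


print(run_threads())
print(run_crashing_child() != 0)
```

This isolation is the reason browsers put tabs in separate processes: one crashed page does not take the others down.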

Establishing an HTTP connection

After DNS resolution, the browser has the IP address of the target server and establishes a connection to it through a TCP three-way handshake. If HTTPS is used, a TLS connection also has to be established on top of TCP, during which the two sides negotiate the encryption algorithm.

Q: Talk about the TCP/IP model and TCP connections.

The TCP/IP model borrows from the OSI model, which has seven layers:

  • Application layer: provides the interface through which applications use network services. The HTTP protocol lives in this layer
  • Presentation layer: handles data encoding and decoding, encryption and decryption, compression and decompression
  • Session layer: establishes, manages, and tears down sessions between communicating systems
  • Transport layer: establishes the end-to-end connection so that data can travel from one endpoint to the other. TCP and UDP live in this layer
  • Network layer: gives network devices logical addresses and routes packets, so that hosts at different locations can reach each other. IP lives in this layer
  • Data link layer: provides reliable data transfer over an unreliable physical link
  • Physical layer: defines the standards for physical devices, such as the connector types of network cables and optical fibers

The TCP/IP model folds the application, presentation, and session layers into a single application layer. Once the local device is attached to a network, the data link and network layers are in place, and the transport layer can establish the end-to-end connection over which HTTP requests are sent. The common transport-layer protocols are TCP and UDP, and HTTP uses TCP for reliable delivery. TCP is connection-oriented: before any data is transmitted, the client and the server establish a connection and confirm that both sides can send and receive (the three-way handshake), and when they are done they close the connection with the four-way wave.
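
The relationship between the two models can be written down as a simple mapping. The text above only states that the top three OSI layers merge; the grouping of the lower layers into "internet" and "link" follows the common four-layer description of TCP/IP and is an assumption of this sketch.

```python
# OSI layer -> TCP/IP layer that absorbs it (four-layer naming assumed).
OSI_TO_TCPIP = {
    "application": "application",
    "presentation": "application",
    "session": "application",
    "transport": "transport",
    "network": "internet",
    "data link": "link",
    "physical": "link",
}


def tcpip_layers():
    """Group the OSI layers under the TCP/IP layer they belong to."""
    grouped = {}
    for osi, tcpip in OSI_TO_TCPIP.items():
        grouped.setdefault(tcpip, []).append(osi)
    return grouped


for layer, members in tcpip_layers().items():
    print(f"{layer}: {', '.join(members)}")
```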

Q: Let’s move on to the three-way handshake and the four-way wave.

Three-way handshake:

  • First handshake: the client sends a SYN to the server and enters the SYN_SENT state, waiting for the server to confirm
  • Second handshake: the server prepares the connection and replies with a single SYN+ACK segment, which acknowledges the client's SYN and carries the server's own SYN. The server enters the SYN_RCVD state
  • Third handshake: the client sends an ACK to the server. Both sides are now in the ESTABLISHED state
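
The exchange can be traced with a few lines of code. This is a simulation only, not real networking: the initial sequence numbers 100 and 300 are arbitrary values chosen for the example, while the rule that each ACK number is the peer's sequence number plus one is how TCP acknowledges a SYN.

```python
def three_way_handshake(client_isn=100, server_isn=300):
    """Return the three segments and the final state of each side."""
    client, server = "CLOSED", "LISTEN"
    log = []

    # 1. Client -> server: SYN with the client's initial sequence number.
    log.append(("client", "SYN", {"seq": client_isn}))
    client = "SYN_SENT"

    # 2. Server -> client: SYN+ACK, acknowledging the client's SYN.
    log.append(("server", "SYN+ACK", {"seq": server_isn, "ack": client_isn + 1}))
    server = "SYN_RCVD"

    # 3. Client -> server: ACK, acknowledging the server's SYN.
    log.append(("client", "ACK", {"seq": client_isn + 1, "ack": server_isn + 1}))
    client = "ESTABLISHED"
    server = "ESTABLISHED"  # once the ACK arrives

    return log, client, server


log, client_state, server_state = three_way_handshake()
for sender, flags, numbers in log:
    print(sender, flags, numbers)
```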

When the request completes, the TCP connection is closed with a four-way wave:

  • The client sends a FIN to tell the server that it has finished sending data, and enters the FIN_WAIT_1 state
  • After receiving the FIN, the server sends an ACK to tell the client that the FIN arrived, and enters the CLOSE_WAIT state. The client enters FIN_WAIT_2 when it receives this ACK
  • The server sends its own FIN to the client to indicate that it has also finished sending data, and the server enters the LAST_ACK state
  • After receiving the FIN, the client enters the TIME_WAIT state and sends an ACK to the server. The server enters the CLOSED state when it receives that ACK, and the client closes after waiting out TIME_WAIT
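
The teardown is easier to follow as a state table. The sketch below uses the standard TCP state names (FIN_WAIT_1, FIN_WAIT_2, and so on) and covers only the simple case where the client closes first; the event labels are informal names made up for this example.

```python
# (current state, event) -> next state, teardown only.
TRANSITIONS = {
    # Side that closes first (the client in the description above).
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the final ACK is sent here
    ("TIME_WAIT", "timeout"): "CLOSED",
    # Side that closes second (the server).
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the ACK is sent here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(start, events):
    """Follow a list of events and return every state visited."""
    states = [start]
    for event in events:
        states.append(TRANSITIONS[(states[-1], event)])
    return states


client_path = walk("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN", "timeout"])
server_path = walk("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```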

Q: Why a three-way handshake and a four-way wave?

Three handshakes are enough to confirm that both sides can send and receive, so a fourth is unnecessary. Four waves are needed to confirm that both sides have finished sending their data and to make sure the connection is closed reliably. The four waves cannot be merged into three the way the handshake is: when the server receives the client's close request, it may still have work of its own to finish. It still has to tell the client right away that the close request arrived, so its ACK and its FIN are sent separately.

Q: What if the TCP connection has been established, but the client suddenly fails?

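TCP provides a keepalive timer, which is usually set to two hours. Every time the server receives data from the client, it resets the timer. If two hours pass without any data, the server sends a probe segment every 75 seconds. If a run of consecutive probes gets no response (10 in the classic textbook description), the server concludes that the client has failed and closes the connection.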
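
As a rough sketch, keepalive can be switched on per socket with `SO_KEEPALIVE`. The tuning options `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are not available on every platform, so they are guarded here. The probe count of 10 is an assumed value for the example, and real system defaults vary.

```python
import socket


def enable_keepalive(sock, idle=7200, interval=75, probes=10):
    """Turn on TCP keepalive and, where supported, set its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, hence the guards.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def detection_time(idle=7200, interval=75, probes=10):
    """Seconds from the last data until a dead peer is given up on."""
    return idle + interval * probes


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()

print(keepalive_on)
print(detection_time())
```

With these numbers a vanished client is only noticed after more than two hours, which is why many servers add their own shorter application-level timeouts.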

Get and parse the content

After the connection is established, the browser starts sending requests for content. First the HTML and CSS files are parsed, and the DOM (Document Object Model) and the CSSOM (CSS Object Model) are built. Once both trees exist, the browser merges them into a render tree. It first performs style calculation to find the style that applies to each node in the DOM tree. After style calculation, it builds the layout tree to work out the position of each node, leaving out nodes hidden with display: none. When the layout tree is complete, the layer tree is built, mainly to handle complex layering cases such as scrollbars, z-index, and position. Finally, the rendering process turns the drawing instructions for each node into actual pixels on the screen to display the whole page. Reflow (backflow) and repaint (redraw) can occur during rendering.
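
A heavily simplified sketch of the merge step: walk the DOM, attach the matching style to each node, and drop anything with display: none. Matching styles by tag name alone is an assumption made to keep the example short; real browsers match full selectors and resolve the cascade.

```python
def build_render_tree(node, styles):
    """Return the subtree with styles attached, without hidden nodes."""
    style = styles.get(node["tag"], {})
    if style.get("display") == "none":
        return None  # hidden nodes, and their children, are left out
    children = [build_render_tree(c, styles) for c in node.get("children", [])]
    return {
        "tag": node["tag"],
        "style": style,
        "children": [c for c in children if c is not None],
    }


# Toy DOM and CSSOM, invented for the example.
dom = {"tag": "body", "children": [
    {"tag": "h1"},
    {"tag": "aside"},
    {"tag": "p", "children": [{"tag": "span"}]},
]}
cssom = {"h1": {"color": "red"}, "aside": {"display": "none"}}

tree = build_render_tree(dom, cssom)
print([child["tag"] for child in tree["children"]])
```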

Q: Talk about reflow (backflow) and repaint (redraw)

  • Repaint (redraw) is caused by changes that do not affect the layout, such as changing a color. The browser can skip the layout work and paint directly from the recomputed styles, so the impact on rendering is small
  • Reflow (backflow, also called rearrangement) is caused by changes that affect the layout, such as changing width and height or hiding a node. Because the layout changes, style calculation, the layout tree, and the layer tree all have to be redone, so the impact is large. Every page reflows at least once, when it is first loaded
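
The difference can be modelled as a question of which pipeline stages have to run again. In this sketch the property list is illustrative rather than exhaustive, and the stage names are simplified labels for the steps described earlier, not a browser API.

```python
# Properties assumed, for this example, to change geometry.
LAYOUT_PROPERTIES = {"width", "height", "display", "margin", "padding",
                     "position", "top", "left", "font-size"}


def rendering_work(changed_properties):
    """Return the stages that rerun for a set of changed CSS properties."""
    if any(prop in LAYOUT_PROPERTIES for prop in changed_properties):
        # Reflow: geometry changed, so everything after style reruns.
        return ["style", "layout", "layer", "paint"]
    # Repaint only: layout and layering are reused.
    return ["style", "paint"]


print(rendering_work({"color"}))
print(rendering_work({"color", "width"}))
```

This is why batching several layout-affecting changes into one update is cheaper than applying them one at a time.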