Preface

I'm preparing to change jobs, so I've collected the questions that came up most often in a round of interviews to share with you. This article covers common browser questions; roughly 6 out of 10 interviews asked something similar (mainly at large and mid-sized companies). (I've forgotten part of the interview content, so to tell a complete story and improve readability, about 20% of the content is fictional.) I now work on the Didi Chuxing Chengdu team.

A recommended article on computer fundamentals: juejin.cn/post/684490…

Question: What happens between entering a URL in the browser address bar and the response coming back?

This looks like a worn-out, trivial question, but the interviewers (of the 996 variety) extended it far beyond the question itself. If you don't believe me, read on.

I answered that the URL is parsed first, and then the IP address is looked up through DNS.

I had barely finished when an interviewer from a company fond of "overtime is a blessing" interrupted me: why does the URL need to be parsed, and what are the DNS query rules? I thought to myself that this was not following the usual script, since write-ups online generally don't ask these two questions. Then I thought again: as the saying goes, everything is hard at the start. Get through this wave, answer well, and it will be clear skies afterwards!

First, why do URLs have to be parsed (that is, encoded)?

  • Because web standards dictate that URLs may contain only letters, digits, and certain special symbols (-_.~!*'();:@&=+$,/?#[]; I looked these up afterwards and can't remember them all, but the most commonly noted point is that the percent sign and double quotes are not among them). Without escaping, ambiguity arises. For example, in http://www.baidu.com?key=value, if my key itself contains an equals sign, such as ke=y=value, you can't tell whether a given = is the separator between key and value or part of the key itself.
  • The interviewer pressed on: what is the rule for URL encoding? I said UTF-8.
  • Why UTF-8? Is that true of all browsers? Isn't GB2312 used for Chinese? And if the browser doesn't use UTF-8 as you say, how can you make sure everything is UTF-8 encoded?
  • I hesitated and said I only half knew this: it should depend on the encoding of the HTML document itself. As for guaranteeing UTF-8, I think you can use encodeURIComponent.
  • What’s the difference between encodeURIComponent and encodeURI?
  • The difference is that encodeURIComponent escapes a wider range of characters and is suitable for encoding parameters, while encodeURI is suitable for encoding the URL itself (for example location.origin). Of course, projects generally handle this with the qs library.
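A minimal sketch of the difference, runnable in Node.js or a browser console. The key and value are made up so that the key contains an equals sign, as in the ke=y example above. Both functions always emit UTF-8 percent-escapes, regardless of the page's own encoding.

```javascript
// Hypothetical key and value, chosen so the key itself contains "=".
const key = "ke=y";
const value = "a&b 中";

// encodeURIComponent escapes reserved characters such as = & / ? #,
// so it is the right tool for a single key or value.
const query = `${encodeURIComponent(key)}=${encodeURIComponent(value)}`;

// encodeURI leaves reserved characters alone, so it suits a whole URL.
const url = encodeURI("http://www.baidu.com/s?wd=中 文");

console.log(query); // ke%3Dy=a%26b%20%E4%B8%AD
console.log(url);   // http://www.baidu.com/s?wd=%E4%B8%AD%20%E6%96%87

// Decoding recovers the original key without ambiguity.
console.log(new URLSearchParams(query).get("ke=y")); // a&b 中
```

After encoding, the only literal = left in the query string is the separator, which is exactly the ambiguity the escaping is meant to remove.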

Then I talked about the DNS resolution process, and how HTML can optimize DNS

First of all, DNS is something I read about long ago in Xie Xiren's Computer Networks textbook. I've forgotten some details, but I remember the general process. For example, to look up the address www.baidu.com:

1. When the www.baidu.com domain name is entered, the operating system first checks whether the hosts file has a record for it. If so, it returns the mapped IP address.

2. If the hosts file has no record, check whether the local DNS resolver has a cached result. (I didn't mention this in my answer.)

3. Then ask the DNS server configured on our computer. If it has the record cached, it returns it.

4. If not, the query goes to a root DNS server (there are 13 root server addresses worldwide, with fixed IPs), which points to the servers that manage the .com domain. Those in turn point to the baidu.com name servers, and so on until the IP address of www.baidu.com is found.

Note: There are two modes of DNS query, forwarding mode and non-forwarding mode. The four steps above describe the non-forwarding mode.
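The lookup order above can be sketched as a toy simulation. Everything here is an assumption for illustration: the maps, the zone table, and the IP address (taken from a documentation-only range) are made up, and no real DNS query is sent.

```javascript
// All records below are made-up illustration data, not real DNS answers.
const hostsFile = new Map([["localhost", "127.0.0.1"]]);
const resolverCache = new Map(); // step 2: local resolver cache
const localDnsCache = new Map(); // step 3: configured DNS server's cache

// Toy hierarchy: each zone either delegates to a child zone or holds the answer.
const zones = {
  ".": { delegate: { com: "com." } },
  "com.": { delegate: { "baidu.com": "baidu.com." } },
  "baidu.com.": { records: { "www.baidu.com": "203.0.113.10" } },
};

// Step 4: walk from the root down, following delegations.
function iterativeQuery(name) {
  const path = [];
  let zone = ".";
  while (zone) {
    path.push(zone);
    const server = zones[zone];
    if (server.records && server.records[name]) {
      return { ip: server.records[name], path };
    }
    const suffix = Object.keys(server.delegate || {}).find(
      (s) => name === s || name.endsWith("." + s)
    );
    zone = suffix ? server.delegate[suffix] : null;
  }
  return { ip: null, path };
}

function resolve(name) {
  if (hostsFile.has(name)) return { ip: hostsFile.get(name), source: "hosts" };
  if (resolverCache.has(name)) return { ip: resolverCache.get(name), source: "resolver cache" };
  if (localDnsCache.has(name)) return { ip: localDnsCache.get(name), source: "local DNS cache" };
  const { ip } = iterativeQuery(name);
  if (ip) {
    // Remember the answer so the next lookup stops earlier.
    localDnsCache.set(name, ip);
    resolverCache.set(name, ip);
  }
  return { ip, source: "iterative query" };
}

console.log(resolve("www.baidu.com").source); // iterative query
console.log(resolve("www.baidu.com").source); // resolver cache
```

The second call never reaches the root, which is the whole point of the caches in steps 2 and 3.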

For front-end DNS optimization, you can add DNS prefetch hints to the HTML page head, for example:

<meta http-equiv="x-dns-prefetch-control" content="on" />
<link rel="dns-prefetch" href="http://bdimg.share.baidu.com" />

I finally got past the first round of nagging questions, and went on describing what happens between entering the URL in the browser address bar and the response coming back.

Once the IP is found, the TCP three-way handshake takes place before the HTTP request is sent (and later the four-way teardown is involved).

Just as I was getting back into my rhythm and ready to talk, the interviewer interrupted me again: you said three-way handshake, so why not two, and what happens during the three handshakes?

Damn, careless of me, I didn't dodge that one. Does every sentence I say have to be picked apart with questions? Too hard!!

Nothing for it but to keep answering. I said I would first explain what happens during the three-way handshake, briefly:

  • First handshake: Host A sends a TCP packet with the flag SYN = 1 to the server and randomly generates a sequence number, seq (part of the TCP header). Host B sees SYN = 1 and knows that host A is requesting a connection.

  • Second handshake: After receiving the request, host B sends a TCP packet with the acknowledgment number (host A's seq + 1), SYN = 1, ACK = 1, and seq = a random number.

  • Third handshake: After receiving the packet, host A checks whether the acknowledgment number is correct, that is, whether it equals the seq that A sent the first time plus 1, and whether the ACK flag is 1. If so, host A sends the acknowledgment number (host B's seq + 1) with ACK = 1. After receiving the packet, host B verifies the acknowledgment number and that ACK = 1, and the connection is then established.

Then I filled in the smaller question of why two handshakes are not enough. With only two, host B cannot confirm that host A has received its confirmation. B would consider the connection established and start sending data, even though those packets might never be received by A. It also makes attacks very easy: I could just send B connection requests and never accept its replies, and the server would soon go down.
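The seq and ack bookkeeping is easier to follow with concrete numbers. This is a toy model only: the function name and the segment objects are assumptions for illustration, real TCP headers carry many more fields, and real sequence numbers are 32-bit values that wrap around.

```javascript
// Toy model of the three segments exchanged by host A (client) and host B (server).
function threeWayHandshake(clientIsn, serverIsn) {
  // 1) A -> B: SYN with A's initial sequence number.
  const syn = { from: "A", SYN: 1, ACK: 0, seq: clientIsn };

  // 2) B -> A: SYN + ACK, acknowledging A's seq + 1 and carrying B's own seq.
  const synAck = { from: "B", SYN: 1, ACK: 1, seq: serverIsn, ack: syn.seq + 1 };

  // A checks B's reply before answering.
  if (synAck.ACK !== 1 || synAck.ack !== clientIsn + 1) {
    throw new Error("A rejects the SYN+ACK");
  }

  // 3) A -> B: ACK, acknowledging B's seq + 1.
  const ack = { from: "A", SYN: 0, ACK: 1, seq: clientIsn + 1, ack: synAck.seq + 1 };

  // B checks A's final ACK; only now does B know A received its reply.
  if (ack.ACK !== 1 || ack.ack !== serverIsn + 1) {
    throw new Error("B rejects the ACK");
  }
  return [syn, synAck, ack];
}

// Hypothetical initial sequence numbers.
const segments = threeWayHandshake(100, 300);
console.log(segments.map((s) => `${s.from}: seq=${s.seq} ack=${s.ack ?? "-"}`));
```

The check before the final return is the step a two-way handshake would lack: without the third segment, B has no evidence that its own sequence number ever reached A.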

Then the interviewer offered a bonus question: I see you don't have a formal computer science background, so just answer as much as you can. The question was what happens between a packet leaving the network adapter and arriving at the server, and he hinted at the OSI reference model.

I thought, OK, isn't this computer networking? Fortunately I had read a book on it, although long ago, so I could only rely on my own understanding to answer.

  • I said that first the data is sent from my machine to the company's LAN switch (if the mapping between MAC address and IP address is not cached locally, it has to be obtained through the ARP protocol first). The benefit of a switch is that it separates collision domains (Ethernet uses the CSMA/CD protocol, under which only one machine can send data on a shared line at a time), so more than one machine can send network packets at the same moment.
  • The switch then sends the data to the router, which acts as the company's gateway (ours is a small company). The router handles packet forwarding and routing (it builds a routing table through its chosen routing protocol and periodically exchanges routing information with neighboring routers). At this point the data has passed through the physical layer and the data link layer (Ethernet) and is handed to the network layer.
  • Then the router forwards IP datagrams. Generally, company IP addresses are translated by NAT so that intranet IPs can also access the Internet. At our company, I noticed that the intranet addresses start with 192.168. Forwarded from router to router, the data finally reaches the server.
  • Then the server's upper-layer protocol, the transport layer protocol, comes into play. Based on the port number in the TCP segment, the right service on the server processes the incoming packet. Moreover, TCP is byte-stream oriented (TCP has four characteristics: reliable transmission, flow control, congestion control, and connection management). That is why the request object in Node listens for data events: TCP itself is a byte stream, and the data the request object receives (at the HTTP layer) arrives as chunks carried over TCP.
  • Finally, the data passes from the transport layer to the application layer, that is, the HTTP (or HTTPS) service. The back end runs its logic and returns the data to the front end.
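The byte-stream point is worth a small illustration, assuming Node.js (for its global Buffer). The payload and the split position are made up; the loop stands in for what a "data" handler on a request object would do, and the final concat for what an "end" handler would do.

```javascript
// Hypothetical request body; each Chinese character takes 3 bytes in UTF-8.
const body = Buffer.from('{"city":"成都"}', "utf8");

// TCP does not respect message or character boundaries, so a chunk may end
// anywhere. Here the split at byte 10 falls in the middle of "成".
const chunks = [body.subarray(0, 10), body.subarray(10)];

// What a "data" handler would do: keep the raw bytes, do not decode yet.
const received = [];
for (const chunk of chunks) {
  received.push(chunk);
}

// What an "end" handler would do: join all bytes first, then decode and parse.
const text = Buffer.concat(received).toString("utf8");
const parsed = JSON.parse(text);

console.log(parsed.city); // 成都
```

Decoding each chunk separately would corrupt the character that straddles the boundary, which is why the bytes are collected first and decoded once.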

At this point I told the interviewer that I only knew the general process and was hazy on the specifics, but that I would fill in the gaps later…

The interviewer told me to go on. So, picking up after the three-way handshake, I said that once the connection is established it is time to request the HTML file. If the HTML file is in the cache, the browser returns it directly; if not, it requests it from the server.

As soon as I mentioned the cache I had an ominous feeling, and sure enough, the interviewer first asked me to explain caching. I thought a stock question like caching would be easy to handle, yet I still came away with a head full of bumps…

What I said was roughly:

  • When the browser successfully loads the resource for the first time, the server returns 200. The browser not only downloads the resource but also caches the response headers (the Date header is very important: it is used to calculate how much time has passed between that response and the next request for the same resource).

  • The next time the resource is loaded, the strong cache is checked first. Cache-Control has the highest priority. For example, with Cache-Control: no-cache the browser skips the strong cache and goes straight to the negotiated cache. With max-age=xxx, the current time is compared with the time the 200 was last returned. If max-age has not been exceeded, the file is read from the local cache without sending a request. If it has expired, the browser moves to the next stage, the negotiated cache.

  • In the negotiated cache phase, a request is sent to the server with the If-None-Match and If-Modified-Since headers. The server first compares the ETag and returns 304 if it matches, meaning the negotiated cache is hit. If not, it returns the new resource file with a new ETag value and status 200.

  • The second important field in the negotiated cache is If-Modified-Since. If the If-Modified-Since value sent by the client is the same as the time the file was last modified, the negotiated cache is hit and 304 is returned. If not, the server returns the new file with a new Last-Modified value and status 200.
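The two phases can be sketched as two small functions. This is a simplified model under stated assumptions: the function names and object shapes are hypothetical, header names are assumed to be lowercased, and Expires, s-maxage, and weak ETag comparison are not modeled.

```javascript
// Browser side (strong cache): may the cached copy be used without a request?
function isFresh(cached, nowMs) {
  const cacheControl = cached.headers["cache-control"] || "";
  if (/\bno-cache\b/.test(cacheControl) || /\bno-store\b/.test(cacheControl)) {
    return false; // always go to the server
  }
  const match = /\bmax-age=(\d+)/.exec(cacheControl);
  if (!match) return false; // Expires and heuristics are not modeled here
  const ageSeconds = (nowMs - Date.parse(cached.headers["date"])) / 1000;
  return ageSeconds < Number(match[1]);
}

// Server side (negotiated cache): answer 304 or send the resource with 200?
function revalidate(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  if (ifNoneMatch !== undefined) {
    // The ETag comparison takes precedence over the date comparison.
    return ifNoneMatch === resource.etag ? 304 : 200;
  }
  const ifModifiedSince = requestHeaders["if-modified-since"];
  if (
    ifModifiedSince !== undefined &&
    Date.parse(ifModifiedSince) >= Date.parse(resource.lastModified)
  ) {
    return 304;
  }
  return 200;
}

// Hypothetical cached response and server-side resource.
const respondedAt = "Wed, 21 Oct 2015 07:28:00 GMT";
const cachedEntry = { headers: { "cache-control": "max-age=60", date: respondedAt } };
const serverFile = { etag: '"v2"', lastModified: "Tue, 20 Oct 2015 10:00:00 GMT" };

console.log(isFresh(cachedEntry, Date.parse(respondedAt) + 30_000)); // true
console.log(isFresh(cachedEntry, Date.parse(respondedAt) + 90_000)); // false
console.log(revalidate({ "if-none-match": '"v1"' }, serverFile));    // 200
console.log(revalidate({ "if-none-match": '"v2"' }, serverFile));    // 304
```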

Next question: do I know what "from disk cache" and "from memory cache" are, and when are they triggered?

  • I said that both are triggered by a strong cache hit. I don't know the exact behavior, but it goes roughly like this:
1. First look in memory. If the resource is there, load it from memory.
2. If it is not found in memory, look on the hard disk. If it is found there, load it from the hard disk.
3. If it is not found on the hard disk either, make a network request.
4. Cache the loaded resource to disk and memory.
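The four steps map onto a short lookup function. This is only a sketch of the order described above, using two in-process Maps as stand-ins for the browser's memory and disk caches; the function names and the URL are hypothetical, and real browsers apply their own rules about what goes where.

```javascript
// Stand-ins for the browser's two cache tiers.
const memoryCache = new Map();
const diskCache = new Map();

function loadResource(url, fetchFromNetwork) {
  // 1. Memory first.
  if (memoryCache.has(url)) {
    return { body: memoryCache.get(url), from: "memory cache" };
  }
  // 2. Then disk.
  if (diskCache.has(url)) {
    return { body: diskCache.get(url), from: "disk cache" };
  }
  // 3. Otherwise go to the network.
  const body = fetchFromNetwork(url);
  // 4. Store the result in both tiers.
  diskCache.set(url, body);
  memoryCache.set(url, body);
  return { body, from: "network" };
}

// Hypothetical fetcher that just fabricates a body.
const fakeFetch = (url) => `contents of ${url}`;

console.log(loadResource("/app.js", fakeFetch).from); // network
console.log(loadResource("/app.js", fakeFetch).from); // memory cache
memoryCache.clear(); // e.g. the tab was closed and reopened
console.log(loadResource("/app.js", fakeFetch).from); // disk cache
```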

Do you know what a heuristic cache is and under what conditions it triggers?

This question left me with just two words: meng bi (completely baffled)! I honestly said I didn't know. (What I looked up afterwards is roughly as follows.)

Heuristic cache:

If the response does not include Expires, Cache-Control: max-age, or Cache-Control: s-maxage, and does not contain any other caching restrictions, the cache can use a heuristic to calculate the freshness lifetime. The cache time is typically computed from two time fields in the response headers: it is 10% of the difference between Date and Last-Modified.

// The cache time is 10% of (Date - Last-Modified).
freshness_lifetime = max(0, Date - Last-Modified) / 10
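A worked example of the 10% rule, as a sketch. The function name and the header values are made up for illustration, header names are assumed to be lowercased, and the 10% figure is the typical value described above rather than something every cache is required to use.

```javascript
// Heuristic freshness lifetime in seconds: 10% of (Date - Last-Modified).
function heuristicFreshnessSeconds(headers) {
  const date = Date.parse(headers["date"]);
  const lastModified = Date.parse(headers["last-modified"]);
  if (Number.isNaN(date) || Number.isNaN(lastModified)) {
    return 0; // without both fields the heuristic cannot be applied
  }
  return Math.max(0, (date - lastModified) / 1000) / 10;
}

// Hypothetical response: the file was last modified 10 days before it was served.
const lifetime = heuristicFreshnessSeconds({
  "date": "Thu, 11 Jan 2024 00:00:00 GMT",
  "last-modified": "Mon, 01 Jan 2024 00:00:00 GMT",
});

console.log(lifetime); // 86400, i.e. one day
```

So a file that had not changed for 10 days is treated as fresh for one more day, and a file that changed a few minutes ago is cached for only a few seconds.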

CSSOM + DOM tree = render tree, followed by layout and painting

  • Build the DOM tree: parse the HTML document from top to bottom to generate the DOM tree, also called the content tree;

  • Build the CSSOM (CSS Object Model) tree: load and parse the styles to generate the CSSOM tree;

  • Execute JavaScript: load and execute JavaScript code (inline code or JavaScript files);

  • Build the render tree: generate the render tree from the DOM tree and the CSSOM tree;

  • Render tree: a series of rectangles with visual properties such as font, color, and size, displayed on the screen in order;

  • Layout: based on the render tree, place each node at its correct position on the screen;

  • Painting: traverse the render tree and draw every node, applying the corresponding style to each. This is done by the UI back-end module.
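The "DOM tree plus CSSOM gives the render tree" step can be shown with a toy model. Everything here is an assumption for illustration: the node shape, the tag-only selectors, and the style table are made up, and style inheritance and the cascade are not modeled. The one real rule it demonstrates is that nodes with display: none do not appear in the render tree.

```javascript
// Toy DOM: just tags and children.
const domTree = {
  tag: "body",
  children: [
    { tag: "h1", children: [] },
    { tag: "script", children: [] },
    { tag: "p", children: [{ tag: "span", children: [] }] },
  ],
};

// Toy CSSOM: styles keyed by tag name only.
const cssom = {
  h1: { color: "red" },
  script: { display: "none" },
  span: { "font-size": "12px" },
};

// Combine the two: attach styles and drop anything that is not displayed.
function buildRenderTree(node, styles) {
  const style = styles[node.tag] || {};
  if (style.display === "none") {
    return null; // the node and its whole subtree are left out
  }
  const children = node.children
    .map((child) => buildRenderTree(child, styles))
    .filter(Boolean);
  return { tag: node.tag, style, children };
}

const renderTree = buildRenderTree(domTree, cssom);
console.log(renderTree.children.map((n) => n.tag)); // [ 'h1', 'p' ]
```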

The interviewer then asked me some ways to optimize the rendering layer of the page.

Page rendering optimization

  • Keep the HTML document structure as shallow as possible, preferably no deeper than six levels;
  • Put scripts as late as possible, for example just before the closing body tag;
  • Put a small amount of first-screen styles inline in the page;
  • Keep the style hierarchy as simple as possible;
  • Minimize DOM operations in scripts, cache style information read from the DOM wherever possible, and avoid triggering reflow too often;
  • Instead of using JavaScript to change an element's style directly, change its class name to modify styles or animations;
  • Use animations on absolutely or fixed positioned elements where possible;
  • When an element is hidden off screen, or while the page is scrolling, try to stop its animations;
  • Cache DOM lookups as much as possible, and keep selectors as simple as possible;
  • If multiple domain names are involved, enable DNS prefetching.
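The advice about caching DOM reads and avoiding repeated reflow can be sketched like this. The function name and the half-width rule are made up for illustration; offsetWidth and style.height are the standard DOM properties, and in a browser the elements array would come from something like Array.from(document.querySelectorAll(".card")).

```javascript
// Read all layout values first, then write all styles.
// Interleaving a read after each write would force the browser to
// recompute layout on every iteration.
function setHeightsFromWidths(elements) {
  // Read phase: one pass that only reads layout.
  const widths = elements.map((el) => el.offsetWidth);

  // Write phase: one pass that only writes styles, using the cached reads.
  elements.forEach((el, i) => {
    el.style.height = `${widths[i] / 2}px`;
  });
  return widths;
}

// Stand-in objects so the sketch also runs outside a browser.
const fakeElements = [
  { offsetWidth: 100, style: {} },
  { offsetWidth: 50, style: {} },
];
setHeightsFromWidths(fakeElements);
console.log(fakeElements.map((el) => el.style.height)); // [ '50px', '25px' ]
```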

Finally, the interviewer asked me how to diagnose the various performance indicators of page rendering. I said roughly that I use the Chrome developer tools, for example the Network panel to inspect network requests and the Performance panel to inspect page rendering. After the interview I looked up some material myself, such as this article on Zhihu, which I think is very detailed. I plan to write my own summary of it when I get the chance.

zhuanlan.zhihu.com/p/105561186

By the way, Didi in Chengdu is hiring front-end, Java, and test engineers. I will submit referrals directly to the internal referral system. You are welcome to send your résumé to [email protected]