What happens after you enter the URL and press enter

Generally speaking, going from entering a URL to displaying the page involves the following steps:

  1. Enter the address.
  2. DNS resolution is performed to find the IP address of the domain name entered in the address bar.
  3. A TCP connection to the server is established using that IP address.
  4. The client sends an HTTP request to the server. If the server returns a redirect such as 301, the browser sends the request again to the URL given in the Location response header.
  5. The server receives the HTTP request, processes it to generate the HTML, and returns it to the browser in the response.
  6. The browser parses the HTML, renders the page, and displays it.
  7. The connection is closed.

Enter the address

When we start typing an address into the browser, the browser is already intelligently matching possible URLs. It looks for matches in bookmarks, history, and so on, and then suggests completions for the URL. Google’s Chrome browser can even display pages straight from the cache, so the page appears while we are still entering the address.

DNS resolution

The Domain Name System (DNS) is an application-layer protocol, like HTTP. It resolves domain names to IP addresses. Users usually reach other computers by host name or domain name rather than by IP address.

DNS resolution is the process of finding which machine holds the resources you need. When you enter an address such as www.baidu.com in the browser, that is not the real address of the Baidu website. Every computer on the Internet is uniquely identified by its IP address, but IP addresses are hard to remember, so users prefer easy-to-remember names such as Baidu’s. The Internet’s designers therefore had to balance user convenience against what machines need, and the compromise is a translation from domain name to IP address, a process known as DNS resolution. DNS acts as a translator from web addresses to IP addresses.

What is the DNS

The Domain Name System (DNS) is a service of the Internet. As a distributed database that maps domain names and IP addresses to each other, it makes it easier for people to access the Internet. DNS uses TCP and UDP port 53. Currently, each level of a domain name is limited to 63 characters, and the total length of a domain name cannot exceed 253 characters. — Wikipedia
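The length limits quoted above can be checked mechanically. The following is a minimal sketch in Python; the function name is hypothetical, and it only checks lengths, not which characters are allowed in a label.

```python
def is_valid_domain_name(name: str) -> bool:
    """Check the two length limits: 63 characters per label, 253 in total."""
    # A trailing dot denotes the root and is not counted here.
    if name.endswith("."):
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    # Every label must be non-empty and at most 63 characters long.
    return all(0 < len(label) <= 63 for label in labels)


print(is_valid_domain_name("www.baidu.com"))      # True
print(is_valid_domain_name("a" * 64 + ".com"))    # False
```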

Domain name resolution is a hierarchical query:

  1. Browser cache: The browser first checks its own cache of recently resolved names. In Chrome, enter chrome://net-internals/#dns in the address bar to view the current state of the cache.
  2. Operating system cache: Next, the cache held in the operating system’s memory is checked. On a Mac, you can clear the system DNS cache with the following command.
dscacheutil -flushcache
  3. Hosts file: If nothing is found in the caches, the entries in the system’s hosts file are read.
  4. Router cache: Some routers also cache DNS results; the domain names that have been visited are stored on the router.
  5. ISP DNS cache: Internet service providers (such as China Telecom) also run DNS servers, and there are well-known public ones such as 114.114.114.114. If the name cannot be resolved locally, the query goes to the ISP’s DNS server. That server looks in its own cache and returns the IP address if it has a record; otherwise it starts querying from the root DNS server.
  6. Root DNS server / top-level DNS server: On receiving a request, the root DNS server works out which server is authoritative for the top-level domain (.com) and returns the IP address of that top-level DNS server. The requester then queries the top-level DNS server. If that server cannot resolve the name itself, it returns the IP address of the next-level DNS server (nicefilm.com). The search continues in this way until a server that knows the host (www.nicefilm.com) is found.
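The lookup order above can be sketched as a chain of layers that are asked one after another. This is only an illustration: the layer contents, the names resolve and make_table_lookup, and the addresses (taken from the documentation range 192.0.2.0/24) are all assumptions, not how any particular browser or operating system implements it.

```python
from typing import Callable, Optional

# Each lookup layer is modelled as a function: name -> IP address or None.
Lookup = Callable[[str], Optional[str]]


def make_table_lookup(table: dict[str, str]) -> Lookup:
    """Wrap a plain dict so that it behaves like one cache layer."""
    return lambda name: table.get(name)


def resolve(name: str, layers: list[tuple[str, Lookup]]) -> Optional[tuple[str, str]]:
    """Ask each layer in order and return (layer_name, ip) for the first hit."""
    for layer_name, lookup in layers:
        ip = lookup(name)
        if ip is not None:
            return layer_name, ip
    return None  # nothing cached anywhere: a full query would start at the root


layers = [
    ("browser cache", make_table_lookup({})),
    ("OS cache", make_table_lookup({})),
    ("hosts file", make_table_lookup({"localhost": "127.0.0.1"})),
    ("router cache", make_table_lookup({})),
    ("ISP DNS cache", make_table_lookup({"www.example.com": "192.0.2.10"})),
]
print(resolve("www.example.com", layers))  # ('ISP DNS cache', '192.0.2.10')
```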

We can view the records returned by domain name resolution with the dig command:

dig math.stackexchange.com

If we focus on the answer section of the response, we see four records, giving four IP addresses for the site:

;; ANSWER SECTION:
math.stackexchange.com. 34 IN A 151.101.129.69
math.stackexchange.com. 34 IN A 151.101.65.69
math.stackexchange.com. 34 IN A 151.101.1.69
math.stackexchange.com. 34 IN A 151.101.193.69

34 is the TTL (time to live) in seconds, the length of time the answer may be cached; within that time the domain name does not need to be queried again. A is the record type, indicating that an IPv4 address is returned. Other record types include NS (the name servers responsible for the domain), AAAA (an IPv6 address), CNAME (an alias for another domain name), and so on.
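To make the fields concrete, here is a small sketch that splits one line of a dig answer section into name, TTL, class, record type, and data. The ResourceRecord type and the function name are hypothetical helpers for this article, not part of dig or of any DNS library.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str
    ttl: int      # seconds the answer may be cached
    rclass: str   # almost always "IN" (Internet)
    rtype: str    # e.g. "A", "AAAA", "CNAME", "NS"
    data: str     # for an A record, the IPv4 address


def parse_answer_line(line: str) -> ResourceRecord:
    """Parse one answer line of the form: name TTL class type data."""
    name, ttl, rclass, rtype, data = line.split(maxsplit=4)
    return ResourceRecord(name, int(ttl), rclass, rtype, data)


record = parse_answer_line("math.stackexchange.com. 34 IN A 151.101.129.69")
print(record.rtype, record.ttl, record.data)  # A 34 151.101.129.69
```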

The resolution process

DNS resolution is a recursive query process.

The picture above shows the process of finding the IP address of www.google.com. If the local DNS server does not have the IP address cached, it sends a request to the root DNS server. The root DNS server does not hold the record itself and refers the local DNS server to the com top-level DNS server, which the local DNS server queries next, and so on, until the local DNS server finally gets Google’s IP address and caches it for the next query. From this process we can see that a domain name is resolved from right to left: com -> google.com -> www.google.com.

But did you notice that something is missing, namely the root step? The full domain name is actually "www.google.com." with a trailing dot, and that dot is not a typo: it stands for the root domain. By default every domain name ends with this dot; it is normally omitted for the user’s convenience and added back automatically when the DNS query is made. So the real resolution order of a domain name is: . -> com. -> google.com. -> www.google.com.
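The right-to-left order, including the root, can be derived from the name itself. The sketch below (with a hypothetical function name) lists the zones visited from the root down to the full host name; it does no network queries.

```python
def resolution_order(hostname: str) -> list[str]:
    """List the zones visited from the root down to the full name."""
    # Drop empty pieces so that both "a.b" and "a.b." are accepted.
    labels = [label for label in hostname.split(".") if label]
    order = ["."]  # the root, i.e. the usually omitted trailing dot
    # Walk the labels right to left, prepending one label at a time.
    for i in range(len(labels) - 1, -1, -1):
        order.append(".".join(labels[i:]) + ".")
    return order


print(" -> ".join(resolution_order("www.google.com")))
# . -> com. -> google.com. -> www.google.com.
```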

DNS optimization

Knowing how DNS resolution works, what can we learn from it? The lookup of Google’s IP address described above goes through 8 steps and involves multiple requests (both UDP and TCP requests occur; why there are two kinds is left for the reader to look up). Is it not too time-consuming to go through so many steps every time? How can the number of steps be reduced? The answer is the DNS cache.

DNS cache

DNS has multiple levels of cache. Sorted by distance from the browser, they are: browser cache, system cache, router cache, ISP server cache, root DNS server cache, top-level DNS server cache, and primary DNS server cache.

  • Type chrome://dns/ into the Chrome address bar to see Chrome’s DNS cache.
  • On Linux, the system’s static host name mappings are stored in /etc/hosts.
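Whatever the level, a DNS cache works the same way: it remembers an answer until its TTL runs out. Here is a minimal sketch of that idea, assuming a simple in-memory dictionary; the class name DnsCache and the injectable clock are illustrative choices, not a description of any real resolver.

```python
import time
from typing import Callable, Optional


class DnsCache:
    """Tiny in-memory cache that forgets entries once their TTL has elapsed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}  # name -> (ip, expiry)

    def put(self, name: str, ip: str, ttl: float) -> None:
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: drop it and report a miss
            return None
        return ip
```

Passing the clock in as a parameter is only there so the expiry behaviour can be tried out without actually waiting for the TTL to pass.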

DNS Load Balancing

Have you ever wondered whether DNS returns the same IP address every time? If it did, would that mean every request goes to the same machine? How much performance and storage would that machine need to handle billions of requests? In reality there are thousands of servers behind a large website, sometimes many more. From the user’s point of view, all that matters is that the request gets handled; which machine handles it is irrelevant. DNS can return the IP address of a suitable machine, chosen for example by the load on each machine or by its geographical distance from the user. This is DNS load balancing, also known as DNS redirection. Content Delivery Networks (CDNs) rely on DNS redirection: the DNS server returns the IP address of the node closest to the user, and the server at that CDN node responds to the user’s request with the required content.
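One possible selection policy is sketched below: prefer the nearest server that is not overloaded, and fall back to the least loaded one. The Server type, the 0.8 load threshold, and the addresses are assumptions made up for this example; real DNS load balancers use their own, usually more elaborate, rules.

```python
from dataclasses import dataclass


@dataclass
class Server:
    ip: str
    load: float         # 0.0 (idle) to 1.0 (saturated)
    distance_km: float  # rough distance from the user


def pick_server(servers: list[Server], max_load: float = 0.8) -> Server:
    """Prefer the nearest server below max_load; otherwise the least loaded."""
    healthy = [s for s in servers if s.load < max_load]
    if healthy:
        return min(healthy, key=lambda s: s.distance_km)
    return min(servers, key=lambda s: s.load)


servers = [
    Server("192.0.2.1", load=0.95, distance_km=10),    # close but overloaded
    Server("192.0.2.2", load=0.40, distance_km=300),
    Server("192.0.2.3", load=0.10, distance_km=2000),
]
print(pick_server(servers).ip)  # 192.0.2.2
```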

TCP connection

TCP is a connection-oriented transport-layer protocol. It ensures reliable communication between the hosts at both ends (sender and receiver), handles packet loss and out-of-order delivery during transmission, makes efficient use of bandwidth, and eases congestion.

Once it has the IP address of the server that hosts the resource, the browser opens a TCP connection to the server through an operating system socket. (Generally speaking, the operating system implements TCP/IP and other protocols and exposes sockets for applications to use. This touches on the standard network model, which is not expanded on here.)
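From the application’s side, opening a TCP connection is just a socket call; the operating system performs the handshake. The sketch below uses Python’s standard socket module against a listener on the local machine, so it needs no external server. The function name is hypothetical, and it assumes a small payload that fits in the socket buffers.

```python
import socket


def loopback_roundtrip(payload: bytes) -> bytes:
    """Open a TCP connection to a local listener and send payload across it."""
    # Port 0 asks the OS for any free port; 127.0.0.1 keeps the traffic local.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        address = listener.getsockname()
        # create_connection() returns once the OS has completed the handshake.
        with socket.create_connection(address, timeout=5) as client:
            server_side, _peer = listener.accept()
            with server_side:
                server_side.settimeout(5)
                client.sendall(payload)
                # Assumes a small payload that arrives in a single read.
                return server_side.recv(len(payload))


print(loopback_roundtrip(b"hello"))  # b'hello'
```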

Establishing the connection is known as the three-way handshake; the client actively opens the connection.

  • First, the client sets the SYN flag to 1 with seq = x and sends the packet to the server. The client state becomes SYN_SENT.
  • Second, after receiving the packet, the server changes its state to SYN_RCVD, sets the SYN and ACK flags to 1 with seq = y and ack = x + 1, and sends the packet to the client.
  • Third, after receiving the packet, the client changes its state to ESTABLISHED, sets the ACK flag to 1 with seq = x + 1 and ack = y + 1, and sends the packet to the server. After receiving it, the server also changes its state to ESTABLISHED.

Note that some articles are vague about the difference between the ACK flag bit and the acknowledgement number (ack), and some diagrams simply write them as one. The two are related but not the same thing, and keeping them apart makes the handshake easier to understand. Other articles describe the second handshake as two separate packets, which is also incorrect.

  • ACK flag set to 1 means: I acknowledge receipt of the packet whose seq is x, and I reply with ack = x + 1.
  • SYN flag set to 1 means: this is my first packet, and I have randomly generated x as my initial seq. For each packet I send afterwards, seq increases by n over the previous one (n is the number of data bytes in the previous packet; a SYN counts as 1 even though it carries no data). Once seq has been initialized, SYN no longer needs to be set, which explains why the three segments of the handshake carry SYN, SYN/ACK, and ACK.
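The three segments, with the flag bits kept separate from the seq and ack numbers, can be written out as a small simulation. This is a sketch of the bookkeeping only, with hypothetical names; no packets are sent, and x and y stand for the initial sequence numbers chosen by the client and the server.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    sender: str
    flags: frozenset  # flag bits set in the header, e.g. {"SYN", "ACK"}
    seq: int          # sequence number
    ack: int          # acknowledgement number (only meaningful with the ACK flag)


def three_way_handshake(x: int, y: int) -> list[Segment]:
    """Return the three handshake segments for client ISN x and server ISN y."""
    return [
        Segment("client", frozenset({"SYN"}), seq=x, ack=0),
        Segment("server", frozenset({"SYN", "ACK"}), seq=y, ack=x + 1),
        Segment("client", frozenset({"ACK"}), seq=x + 1, ack=y + 1),
    ]


for seg in three_way_handshake(x=1000, y=5000):
    print(seg.sender, sorted(seg.flags), "seq =", seg.seq, "ack =", seg.ack)
```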

TCP flags: the basic TCP header is 20 bytes long, 6 bits of which are the TCP flags that indicate the type of the current packet: URG, ACK, PSH, RST, SYN, and FIN. (Two more bits, CWR and ECE, were originally reserved.)

  • URG: Urgent flag; marks the packet as carrying urgent data that the receiver should handle without waiting for the data ahead of it to be processed.
  • ACK: Acknowledges that data has been received successfully.
  • PSH: Push flag; indicates that the data should be delivered immediately without waiting for additional data.
  • RST: Reset flag; indicates that the connection is being closed abnormally.
  • SYN: Synchronize flag; indicates that a TCP connection is being initialized.
  • FIN: Finish flag; indicates that the sender has finished and wants to close the connection opened by the earlier SYN. A complete TCP connection therefore has both SYN and FIN packets.

Now that we know how a TCP connection works, the channel is open, and it is time to send something through it. We go from the transport layer back to the application layer.
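As a quick recap of the six flag bits listed above before moving on, the sketch below decodes the flags byte of a TCP header into flag names. The bit values follow the standard TCP header layout; the enum and function names are hypothetical.

```python
import enum


class TcpFlag(enum.IntFlag):
    """The six classic TCP flag bits, from least to most significant."""
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    PSH = 0x08
    ACK = 0x10
    URG = 0x20


def decode_flags(byte: int) -> list[str]:
    """Return the names of the classic flags set in the header's flags byte."""
    return [flag.name for flag in TcpFlag if byte & flag]


print(decode_flags(0x02))  # ['SYN'], first handshake segment
print(decode_flags(0x12))  # ['SYN', 'ACK'], second handshake segment
```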

Sending an HTTP request

HyperText Transfer Protocol (HTTP) is an application-layer protocol for distributed, collaborative, hypermedia information systems [1]. HTTP is the basis for data communication on the World Wide Web. HTTP was originally designed to provide a way to publish and receive HTML pages. Resources requested over HTTP or HTTPS are identified by Uniform Resource Identifiers (URIs).

At the application layer, the browser parses the URL and builds the request message to send. The request message consists of the request line, the request headers, a blank line, and the request body. The default port is 443 for HTTPS and 80 for HTTP.

  • Request line: Contains the request method, the path, and the protocol version.
  • Request headers: Carry additional information about the request as key-value pairs, for example the content types the client accepts (Accept) and cache-related settings.
  • Blank line: The protocol requires the request headers and the request body to be separated by a blank line.
  • Request body: In a POST request, for example, the parameters are not placed in the URL, so they need a carrier, which is the request body.
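The four parts can be assembled by hand to see what actually goes over the wire. This is a simplified sketch with a hypothetical build_request helper; a real browser sends many more headers, and www.example.com is just a placeholder host.

```python
from typing import Optional


def build_request(method: str, host: str, path: str = "/",
                  headers: Optional[dict] = None, body: bytes = b"") -> bytes:
    """Assemble request line, headers, blank line, and body into one message."""
    all_headers = {"Host": host, **(headers or {})}
    if body:
        all_headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.1"]                               # request line
    lines += [f"{key}: {value}" for key, value in all_headers.items()]  # headers
    head = "\r\n".join(lines) + "\r\n\r\n"                              # blank line
    return head.encode("ascii") + body                                  # body


request = build_request("GET", "www.example.com", headers={"Accept": "text/html"})
print(request.decode("ascii"))
```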

The server responds to the HTTP request

After receiving and interpreting the request message, the server returns an HTTP response message. The HTTP response consists of three parts: the status line, the message headers, and the response body.

The status code consists of three digits. The first digit defines the class of the response and has five possible values:

  • 1xx: Informational: the request has been received and processing continues
  • 2xx: Success: the request was successfully received, understood, and accepted
  • 3xx: Redirection: further action must be taken to complete the request
  • 4xx: Client error: the request has a syntax error or cannot be fulfilled
  • 5xx: Server error: the server failed to fulfill a valid request

Common status codes, with their reason phrases and meanings:

  • 200 OK: The client request succeeded
  • 400 Bad Request: The client request has a syntax error and cannot be understood by the server
  • 401 Unauthorized: The request is not authorized; this status code must be used together with the WWW-Authenticate header field
  • 403 Forbidden: The server received the request but refuses to serve it
  • 404 Not Found: The requested resource does not exist, e.g. an incorrect URL was entered
  • 500 Internal Server Error: An unexpected error occurred on the server
  • 503 Service Unavailable: The server cannot currently process the client’s request, but may recover after a period of time
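The class of a status code follows directly from its first digit, and Python’s standard library already knows the reason phrases. The sketch below combines the two; describe_status and the STATUS_CLASSES table are hypothetical helpers written for this article.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Return e.g. '404 Not Found (client error)' for a status code."""
    category = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # not a code the standard library knows about
    return f"{code} {phrase} ({category})"


print(describe_status(404))  # 404 Not Found (client error)
print(describe_status(503))  # 503 Service Unavailable (server error)
```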

HTTP message headers include general headers, request headers, response headers, and entity headers; they are not covered in detail here. The response body is the content of the resource returned by the server.

HTTP cache

HTTP caching is an optimization that lets browsers avoid unnecessary requests and so speeds up page rendering. Caching is enabled simply by setting HTTP headers. There are generally three ways to set it up:

Last-Modified + If-Modified-Since: When the server returns a resource, it sets Last-Modified to the time the resource was last modified. The browser saves this time and sends it in the If-Modified-Since request header on the next request. On receiving the request, the server checks whether the resource has been updated since that time. If not, it returns status code 304 and the browser retrieves the resource from its cache. Otherwise, it returns 200 and the resource content.

ETag (response header) + If-None-Match (request header): Whether the file has been modified is determined from a resource identifier. Each time the server returns a resource, it puts the resource’s identifier in the ETag header. The browser stores this identifier and sends it in If-None-Match on the next request. The server checks whether it matches: if not, it returns 200 and the new resource; if so, it returns 304 and the browser retrieves the resource from its cache.
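The server-side decision for both validators can be sketched as one function that returns 304 or 200. This is a simplified illustration, not a full implementation of HTTP conditional requests: the function name is hypothetical, headers are passed as a plain dict, and weak ETags (the W/ prefix) are not handled.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When the client sends an ETag, it is checked first.
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in client_tags or "*" in client_tags else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        changed = (parsedate_to_datetime(last_modified)
                   > parsedate_to_datetime(if_modified_since))
        return 200 if changed else 304
    return 200  # no validator sent: always return the full resource


current = "Wed, 21 Oct 2015 07:28:00 GMT"
print(conditional_status({"If-None-Match": '"abc"'}, '"abc"', current))  # 304
```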

The third, Expires and Cache-Control, is strictly speaking not a separate method but the result of the protocol evolving. Back in the HTTP 1.0 era, the cache lifecycle was controlled with Pragma and Expires. Setting Pragma to no-cache turns caching off, and Expires sets the time at which the cached copy expires. Note that this expiry time is expressed in server time, so if the client’s clock is changed or differs from the server’s, the cache may expire at the wrong moment.

To solve this problem, HTTP 1.1 added Cache-Control, which controls the cache lifetime through its max-age directive. Within this period the resource is considered fresh, and the browser does not request it again when it is needed.
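Because max-age is a relative number of seconds, the freshness check no longer depends on comparing clocks. The sketch below shows the idea with two hypothetical helpers; it looks only at max-age, no-cache, and no-store and ignores the other Cache-Control directives.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age from a Cache-Control header value, if present."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """A cached response is fresh while its age is below max-age."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


print(is_fresh("public, max-age=3600", age_seconds=120))   # True
print(is_fresh("public, max-age=3600", age_seconds=7200))  # False
```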

Rendering the page

The browser begins to display the page before it has received the entire HTML document. Different browsers may parse and render pages differently; here we describe only WebKit’s rendering process.

  1. Parse the HTML and build the DOM tree
  2. Parse the CSS and build the CSS rule tree
  3. Combine the DOM tree and the CSS rules to build the render tree
  4. Lay out the render tree (layout/reflow), computing the size and position of each element
  5. Paint the render tree (paint), drawing the pixels of the page
  6. The browser sends each layer’s information to the GPU, which composites the layers and displays them on the screen
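Step 1 can be illustrated with Python’s standard html.parser module, which reports start and end tags much as a browser’s tokenizer does. The sketch below builds a bare element tree from those events. It is a toy model only, with hypothetical class names: it ignores text, attributes, and the error recovery that real browser parsers perform, and it covers none of the later steps.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag: str, parent: "Node | None" = None) -> None:
        self.tag = tag
        self.parent = parent
        self.children: list = []


class DomBuilder(HTMLParser):
    """Build a bare-bones element tree from start and end tags."""
    VOID = {"br", "img", "meta", "link", "input", "hr"}  # tags with no end tag

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node  # descend: later tags become children

    def handle_endtag(self, tag):
        if self.current.tag == tag and self.current.parent is not None:
            self.current = self.current.parent  # climb back up


def outline(node: Node, depth: int = 0) -> list[str]:
    """Render the tree as indented lines, one per element."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines += outline(child, depth + 1)
    return lines


builder = DomBuilder()
builder.feed("<html><body><h1>Hi</h1><p>text<br>more</p></body></html>")
print("\n".join(outline(builder.root)))
```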

Note that this is a gradual process. To display content as early as possible, the rendering engine starts rendering before the whole document has been received. If a script is encountered during parsing, document parsing stops and the script is parsed and executed immediately; if the script is external, parsing waits until it has been downloaded, parsed, and executed. For this reason, scripts are typically placed at the end of the document so that they do not block rendering.

In the HTML4 and HTML5 specifications, a script can also be marked defer, so that document parsing does not stop and the script is executed after parsing is complete. HTML5 adds the option of marking a script async, so that it is downloaded in parallel with parsing and executed as soon as it is available.

Closing the connection

Keep-alive is enabled by default to save connection setup time, so a TCP connection is typically not closed until the tab is closed. The closing process is known as the four-way wave. Because TCP is full duplex, either side may start the close, so the order in which the packets are sent is not fixed. Usually the client initiates it, and the process is as follows.

Suppose the last data sent by the client had seq = x and ack = y:

  • The client sends a packet with the FIN flag set to 1, ack = y, and seq = x + 1, and the client state becomes FIN_WAIT_1.
  • After receiving the packet, the server changes to CLOSE_WAIT and sends a packet with the ACK flag set to 1 and ack = x + 2. The client state changes to FIN_WAIT_2 after receiving it.
  • After finishing its remaining work, the server sends a FIN packet to the client and sets its own state to LAST_ACK.
  • After receiving the packet, the client changes to TIME_WAIT and sends an ACK packet with ack = y + 1 to the server, then closes the connection after waiting 2MSL.

Why does the client wait for 2MSL? MSL stands for Maximum Segment Lifetime. The wait makes the close reliable by ensuring the server receives the ACK packet: if the server does not receive it, the server resends its FIN packet to the client. The waiting time therefore covers the lifetime of the client’s ACK plus the lifetime of a retransmitted FIN.
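The four steps of a client-initiated close can be tabulated in the same way as the handshake. The sketch below follows this article’s numbering (client FIN at seq = x + 1) and is a simplification with a hypothetical function name: it lists only the main flag of each segment, whereas real FIN segments usually carry the ACK flag as well.

```python
def four_way_close(x: int, y: int) -> list[tuple[str, str, int, int, str]]:
    """Steps of a client-initiated close; the client last sent seq = x, ack = y.

    Each tuple is (sender, flag, seq, ack, sender_state_afterwards).
    """
    return [
        ("client", "FIN", x + 1, y, "FIN_WAIT_1"),
        ("server", "ACK", y, x + 2, "CLOSE_WAIT"),
        ("server", "FIN", y, x + 2, "LAST_ACK"),
        ("client", "ACK", x + 2, y + 1, "TIME_WAIT"),  # then wait 2MSL and close
    ]


for sender, flag, seq, ack, state in four_way_close(x=100, y=300):
    print(f"{sender:6} {flag:3} seq={seq} ack={ack} -> {state}")
```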