What happens between typing a web address into the browser’s address bar and rendering the page?
Network request
Building the request
When you type a website address into the browser’s address bar and press Enter, the browser constructs a request line, which consists of the request method, the request URI, and the HTTP version. For example:
GET / HTTP/1.1 // The request method is GET, the path is the root path, and the HTTP version is 1.1
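As a minimal sketch of how the three parts fit together, the request line can be assembled by joining them with single spaces. The function name build_request_line is hypothetical and only used for illustration.

```python
def build_request_line(method: str, path: str = "/", version: str = "HTTP/1.1") -> str:
    """Join the request method, request URI, and HTTP version with single spaces."""
    return f"{method.upper()} {path} {version}"


print(build_request_line("GET"))  # GET / HTTP/1.1
```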
Checking the strong cache
After building the request, the browser first checks the strong cache. If there is a hit, it uses the cached copy directly; otherwise it goes on to DNS resolution. So what is strong caching?
Strong caching, also known as local caching, is controlled by the Expires and Cache-Control fields in the HTTP headers, which indicate how long a resource may be cached. A normal refresh ignores the strong cache but does not clear it; that takes a forced refresh, which the browser performs by sending Cache-Control: no-cache and Pragma: no-cache. So what are Expires and Cache-Control?
Expires comes from HTTP/1.0. Its value is an absolute time in GMT format that gives the expiration date of the resource. As long as a request is made before the Expires time, the local cache is valid and the data is read from the cache. Because the expiry is an absolute time, a large clock difference between the server and the client can make the cache expire too early or too late. If Cache-Control: max-age and Expires are both present, max-age takes precedence.
Cache-Control: max-age=3600 means that the resource stays valid in the cache for 3600 seconds. Besides max-age, Cache-Control has several other common directives:
no-cache: the local copy is not used without checking. Negotiated caching is required: the browser verifies with the server whether the response has changed. If the previous response carried an ETag, the request is validated against it, and if the resource has not changed, the re-download is avoided.
no-store: forbids the browser from caching the data at all. Every time the user requests the resource, a request is sent to the server and the complete resource is downloaded.
public: the response can be cached by anyone, including end users and intermediate proxy servers such as a CDN.
private: the response can be cached only by the end user’s browser, not by intermediate cache servers such as a CDN.
Cache-Control and Expires can both be enabled in the server configuration; when both are present, Cache-Control takes priority.
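The precedence rules above can be sketched as a small freshness check. This is a simplified illustration, not how any particular browser implements its cache: the function name is_fresh and the stored_at parameter are assumptions, and real caches also take headers such as Age and Date into account.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Return True if a stored response may be served from the strong cache."""
    raw = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in raw.split(",") if d.strip()]
    # no-store forbids caching; no-cache requires revalidation with the server.
    if "no-store" in directives or "no-cache" in directives:
        return False
    # max-age takes precedence over Expires when both are present.
    for directive in directives:
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return now - stored_at < timedelta(seconds=max_age)
    # Fall back to the absolute expiry time from Expires.
    expires = headers.get("Expires")
    if expires:
        return now < parsedate_to_datetime(expires)
    return False


stored = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_fresh({"Cache-Control": "max-age=3600"}, stored, stored + timedelta(minutes=30)))
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.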
DNS resolution
We enter a domain name, but packets are delivered to the other party by IP address, so the browser needs the IP address that corresponds to the domain name. This relies on a service that maps domain names to IP addresses, called DNS (Domain Name System). The process of obtaining the specific IP address is DNS resolution.
The browser also caches DNS results: once a domain name has been resolved, the result is cached, and the next lookup is served directly from the cache without another DNS resolution.
If no port is specified, the default port is used: 80 for HTTP and 443 for HTTPS.
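The caching behavior can be illustrated with a small resolver that remembers earlier answers. This is only a sketch: the class name CachingResolver is hypothetical, socket.gethostbyname stands in for the real lookup, and entries never expire here, whereas a real DNS cache honors each record's time to live.

```python
import socket
from typing import Callable, Dict


class CachingResolver:
    """Resolve hostnames to IP addresses, remembering earlier results."""

    def __init__(self, lookup: Callable[[str], str] = socket.gethostbyname):
        self._lookup = lookup
        self._cache: Dict[str, str] = {}

    def resolve(self, hostname: str) -> str:
        # Only perform a real lookup on a cache miss.
        if hostname not in self._cache:
            self._cache[hostname] = self._lookup(hostname)
        return self._cache[hostname]
```

Calling resolve() twice with the same hostname triggers only one lookup; the second call is answered from the in-memory dictionary.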
Establishing a TCP Connection
Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport layer protocol.
A TCP connection goes through the following three stages:
Connection establishment. The client and server establish a connection through a three-way handshake (that is, a total of three packets are sent to confirm that the connection has been established).
Data transfer. An important mechanism here is that the receiver must acknowledge each packet to the sender. If the sender does not receive the acknowledgment, it treats the packet as lost and resends it. There is also an optimization on the sending side: large packets are split into smaller ones that are transmitted one by one, and the receiver reassembles them into the complete data according to their order.
Disconnection. Once the data transfer is complete, the connection is closed with four waves.
So what are the three-way handshake and the four waves?
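The split-and-reassemble idea from the data transfer stage can be sketched in a few lines. This is a toy model, not TCP itself: the function names segment and reassemble are hypothetical, each piece is tagged with the byte offset at which it starts, and acknowledgments and retransmission are left out.

```python
from typing import List, Tuple


def segment(data: bytes, size: int, initial_seq: int = 0) -> List[Tuple[int, bytes]]:
    """Split data into pieces of at most `size` bytes, each tagged with its sequence number."""
    return [(initial_seq + i, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble(segments: List[Tuple[int, bytes]]) -> bytes:
    """Put the pieces back in order by sequence number, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(segments, key=lambda s: s[0]))


pieces = segment(b"hello world", 4)
print(reassemble(list(reversed(pieces))))  # b'hello world'
```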
Three-way handshake
A three-way handshake means that establishing a TCP connection requires the client and server to exchange three packets in total. Its main purpose is to confirm that both parties can send and receive normally and to agree on their initial sequence numbers in preparation for reliable transmission. In practice, the client connects to the specified port on the server, a TCP connection is established, the sequence numbers and acknowledgment numbers of both sides are synchronized, and TCP window size information is exchanged.
At first, the client is in the CLOSED state and the server is in the LISTEN state.
The three handshakes:
- First handshake: The client sends a SYN packet to the server carrying the client’s initial sequence number, ISN(c), and enters the SYN_SENT state. The segment has SYN=1 in the header and sequence number seq=x. A SYN segment cannot carry data, but it consumes one sequence number.
- Second handshake: After receiving the client’s SYN, the server responds with its own SYN packet carrying its initial sequence number, ISN(s). It also uses the client’s ISN + 1 as the acknowledgment number, indicating that it has received the client’s SYN, and enters the SYN_RCVD state. In this segment, SYN=1, ACK=1, ack=x+1, and seq=y.
- Third handshake: After receiving the server’s SYN, the client sends an ACK packet, using the server’s ISN + 1 as the acknowledgment number to indicate that the server’s SYN has been received. The client enters the ESTABLISHED state, and the server also enters ESTABLISHED once it receives the ACK. At this point the connection is established. In this segment, ACK=1, ack=y+1, and seq=x+1 (the client’s initial sequence number was x and the SYN consumed one, so this second segment from the client uses x+1). The ACK segment can carry data; if it does not, it does not consume a sequence number.
The sender of the first SYN performs active open and the recipient of the next SYN performs passive open. In socket programming, a client executing connect() triggers a three-way handshake.
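The sequence and acknowledgment numbers in the three segments can be traced with a short sketch. The function name three_way_handshake and the dictionary layout are assumptions made for illustration; x and y are the client and server initial sequence numbers from the description above, and arithmetic wraps at 2**32 because TCP sequence numbers are 32 bits wide.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three handshake segments for client ISN x and server ISN y."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": x}
    # The server acknowledges the client's SYN, which consumed one sequence number.
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"}, "seq": y, "ack": (x + 1) % MOD}
    # The client acknowledges the server's SYN in the same way.
    ack = {"from": "client", "flags": {"ACK"}, "seq": (x + 1) % MOD, "ack": (y + 1) % MOD}
    return [syn, syn_ack, ack]


for seg in three_way_handshake(100, 300):
    print(seg)
```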
As shown in the figure:
Four waves
Establishing a connection takes a three-way handshake, and terminating one takes four waves (sometimes called a four-way handshake). This is a consequence of TCP’s half-close: TCP lets one end of a connection keep receiving data from the other end after it has finished sending its own.
Tearing down a TCP connection requires four packets. Either the client or the server can initiate it.
Both sides start in the ESTABLISHED state. Suppose the client initiates the close first. The four waves proceed as follows:
- First wave: The client sends a FIN packet with a given sequence number (FIN=1, seq=u), stops sending data, actively closes the TCP connection, and enters the FIN_WAIT1 state to wait for the server’s acknowledgment.
- Second wave: After receiving the FIN, the server sends an ACK packet whose acknowledgment number is the client’s sequence number + 1, indicating that the FIN has been received (ACK=1, ack=u+1, seq=v). The server enters the CLOSE_WAIT state. TCP is now half-closed: the client-to-server direction of the connection has been released. After receiving the server’s acknowledgment, the client enters the FIN_WAIT2 state and waits for the server’s connection-release segment.
- Third wave: When the server also wants to disconnect and has no more data to send to the client, it sends a FIN packet with a given sequence number, just as the client did in the first wave (FIN=1, ACK=1, seq=w, ack=u+1). The server enters the LAST_ACK state and waits for the client’s acknowledgment.
- Fourth wave: After receiving the FIN, the client sends an ACK packet whose acknowledgment number is the server’s sequence number + 1 (ACK=1, seq=u+1, ack=w+1) and enters the TIME_WAIT state. After receiving this ACK, the server enters the CLOSED state. The client’s side of the connection is not released yet: it waits for the 2MSL timer (twice the maximum segment lifetime) to expire, so that the server has time to receive the ACK, and only then enters the CLOSED state.
Receiving a FIN only means that no more data will flow in that direction. Typically the client performs the active close and enters TIME_WAIT, while the server performs the passive close and does not enter TIME_WAIT.
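The state changes during the four waves can be written down as a small transition table. This sketch covers only the path described above, where the client closes first; the names TRANSITIONS and run and the event labels are made up for illustration, and the full TCP state machine has more states and transitions.

```python
# (current state, event) -> next state, for the teardown path only
TRANSITIONS = {
    # Active close (the client in the description above)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT1",
    ("FIN_WAIT1", "recv ACK"): "FIN_WAIT2",
    ("FIN_WAIT2", "recv FIN"): "TIME_WAIT",  # the final ACK is sent here
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Passive close (the server in the description above)
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the ACK is sent here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events: list, state: str = "ESTABLISHED") -> list:
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


print(run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))  # client side
print(run(["recv FIN", "send FIN", "recv ACK"]))  # server side
```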
Sending an HTTP request
Now that the TCP connection is established, the browser can communicate with the server by sending HTTP requests. An HTTP request has three parts: the request line, the request headers, and the request body.
Cache-Control, If-Modified-Since, and If-None-Match are all fields that may appear in the request headers to carry cache information.
Finally, there is the request body, which is typically present only for methods such as POST; a common scenario is form submission.
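To make the three parts concrete, here is a minimal sketch that serializes a request into the bytes sent over the TCP connection. The function name build_request and the example header values are hypothetical; a real client would also handle header encoding rules and set Content-Length automatically.

```python
def build_request(method: str, path: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize the request line, headers, and optional body into one message."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


form = b"user=alice"
message = build_request(
    "POST",
    "/login",
    {
        "Host": "example.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form)),
    },
    form,
)
print(message.decode("ascii"))
```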
Network response
When the HTTP request arrives at the server, the server processes it and finally sends the data back to the browser as a network response.
Like the request part, a network response has three parts: the response line, the response header, and the response body.
The response line consists of the HTTP protocol version, status code, and status description. As follows:
HTTP/1.1 200 OK
The response headers contain information about the server and the data it returned, such as when the server generated the data, the type of the data, and the cookies to be set.
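The three parts of a response can be pulled apart with a short parser. This is a simplified sketch: the function name parse_response is hypothetical, header names are lowercased for lookup, and a repeated header such as Set-Cookie would overwrite the earlier value, which a real client must not do.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into version, status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nSet-Cookie: id=1\r\n\r\n<html></html>"
print(parse_response(raw))
```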
What happens when the response is complete? Is the TCP connection closed?
Not necessarily. If the request headers or response headers contain Connection: keep-alive, a persistent connection has been established: the TCP connection stays open, and later requests for resources on the same site reuse it.
Otherwise, the TCP connection is disconnected and the request-response process ends.
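The decision can be sketched as a small check on the Connection header. The function name is_persistent is hypothetical, and the header dictionaries are assumed to use the exact key "Connection". The sketch also reflects a detail the text above does not spell out: in HTTP/1.1, connections are persistent by default unless Connection: close is sent, whereas HTTP/1.0 needs an explicit keep-alive.

```python
def is_persistent(version: str, request_headers: dict, response_headers: dict) -> bool:
    """Decide whether the TCP connection stays open after the response."""
    tokens = set()
    for headers in (request_headers, response_headers):
        value = headers.get("Connection", "")
        tokens.update(t.strip().lower() for t in value.split(",") if t.strip())
    if "close" in tokens:
        return False
    if "keep-alive" in tokens:
        return True
    # No explicit token: HTTP/1.1 defaults to persistent, HTTP/1.0 does not.
    return version == "HTTP/1.1"


print(is_persistent("HTTP/1.0", {"Connection": "keep-alive"}, {}))  # True
```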