What happens when you enter a URL in the browser?

There are basically 6 steps (update…)

DNS domain name resolution

What we type into the address bar is usually an easy-to-remember domain name, such as www.xxx.com. Computers on a network, however, communicate with each other through IP addresses. The domain name in the URL therefore has to be resolved into an IP address before a connection to the remote server can be established. This process is called DNS domain name resolution.

DNS domain name resolution process:

Check whether the browser cache holds a resolved IP address for the domain name. If it does, resolution is complete; if not, go to the next step.
Check whether the operating system cache holds the IP address. If it does, resolution is complete; if not, go to the next step.
Query the local DNS server (LDNS). In fact, about 80 percent of DNS resolution is completed at this step.
If the local DNS server has no answer, it queries a root DNS server.
The root DNS server returns to the local DNS server the address of the top-level domain server (for example a gTLD server) responsible for the domain being queried.
The local DNS server sends the request to the top-level domain server returned in the previous step.
The top-level domain server looks up and returns the address of the authoritative server for this domain name.
The local DNS server queries the authoritative server, which looks up its table of domain-name-to-IP mappings and returns the matching record.
The local DNS server returns the resolution result to the user, and the result is cached locally according to its TTL value. Domain name resolution is now complete.
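
The cache-first part of this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular browser or operating system implements it: the names Cache, cache_get, resolve, and query_local_dns are hypothetical, and the local DNS server is passed in as a plain function so that the sketch runs without network access.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Each cache maps a domain name to (ip_address, expiry_timestamp).
Cache = Dict[str, Tuple[str, float]]


def cache_get(cache: Cache, name: str, now: float) -> Optional[str]:
    """Return the cached IP if present and not expired, else None."""
    entry = cache.get(name)
    if entry is None:
        return None
    ip, expires_at = entry
    if now >= expires_at:
        del cache[name]  # the TTL has elapsed, so drop the stale record
        return None
    return ip


def resolve(
    name: str,
    caches: List[Cache],
    query_local_dns: Callable[[str], Tuple[str, float]],
    now: Optional[float] = None,
) -> str:
    """Check each cache in order (browser, then OS), then fall back to the LDNS."""
    now = time.time() if now is None else now
    for cache in caches:
        ip = cache_get(cache, name, now)
        if ip is not None:
            return ip  # cache hit: resolution is complete
    # Hypothetical stand-in for the local DNS server; returns (ip, ttl_seconds).
    ip, ttl = query_local_dns(name)
    for cache in caches:
        cache[name] = (ip, now + ttl)  # cache the answer until the TTL expires
    return ip
```

Calling resolve twice for the same name within the TTL reaches query_local_dns only once; after the TTL has passed, the stale entries are dropped and the local DNS server is queried again.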

There are two query methods

1. Recursive query: the host sends a recursive query to the local DNS server. If the local DNS server does not know the IP address of the queried domain name, it acts as a DNS client itself, sends queries to other DNS servers (starting with a root DNS server) on the host's behalf, and then returns the final result to the host.

2. Iterative query: the local DNS server sends iterative queries to the root DNS server. When the root DNS server receives a query from the local DNS server, it either returns the requested IP address or tells the local DNS server which DNS server to query next, leaving the local DNS server to carry out the subsequent queries itself.
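
To make the iterative case concrete, here is a toy simulation in Python. Everything in it is an assumption for illustration: the SERVERS table, the server names, and the address 192.0.2.10 are made up, and real DNS servers exchange binary messages over the network rather than looking things up in a dictionary. The point is only to show the local DNS server following referrals by itself until some server returns an address.

```python
from typing import Dict, List, Tuple

# Toy name-server data (all names and addresses are made up for illustration).
# Each server replies with either ("referral", next_server) or ("answer", ip).
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"example.com": ("referral", "ns.example.com")},
    "ns.example.com": {"www.example.com": ("answer", "192.0.2.10")},
}


def ask(server: str, name: str) -> Tuple[str, str]:
    """Return the server's reply for the longest suffix of `name` it knows."""
    labels = name.split(".")
    table = SERVERS[server]
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    raise LookupError(f"{server} has no data for {name}")


def iterative_resolve(
    name: str, start: str = "root", max_hops: int = 10
) -> Tuple[str, List[str]]:
    """Follow referrals from the root until a server returns an address."""
    server = start
    path = [server]
    for _ in range(max_hops):
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value  # referral: the local DNS server queries the next server itself
        path.append(server)
    raise LookupError(f"too many referrals while resolving {name}")
```

With this toy data, iterative_resolve("www.example.com") returns the address together with the list of servers that were asked: root, then tld-com, then ns.example.com.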

Browser cache

After parsing the URL, the browser first checks whether it has a cached copy of the resource. If it does, it uses the cached copy directly; if not, it proceeds to the next step. Related topics: cookies and sessions.

Establishing a TCP Connection

Three-way handshake

Sending the HTTP request; the server processes it and returns the response

HTTP request methods

GET: obtains a resource
POST: transmits an entity body
PUT: transfers a file
HEAD: obtains the response headers only
DELETE: deletes a file
OPTIONS: asks which methods are supported
TRACE: traces the request path
CONNECT: asks a proxy to establish a tunnel

GET and POST

What’s the difference between GET and POST? This highly upvoted answer on Zhihu covers it in depth.

GET is used to “read” a resource. Reading does not change the data on the server, so GET is idempotent, and because it is a read, the browser can cache the result. POST, in contrast, is used to make the server “do something”, so it is not idempotent and is generally not cached by the browser.
Parameter location: GET parameters go in the URL, and POST parameters go in the body. Note that this is only a convention, one that browsers follow. The HTTP protocol itself imposes no such restriction, and in various interfaces (such as Ajax APIs) parameters can be written in the URL or in the body, depending on the situation.
Parameter size: because browsers put GET parameters in the URL and limit URL length, the size of GET parameters is limited in the browser. POST is not subject to this limit because it transmits its data in the body.
Security: because GET parameters are directly visible in the URL, GET is often said to be less secure. Technically, though, HTTP is transmitted in plaintext, so neither is secure. Security is usually improved by using HTTPS.
Encoding: it is often said that GET parameters support only ASCII while POST supports arbitrary encodings. But as noted above, both GET and POST can carry data in the URL and in the body, so the real question is which encoding HTTP allows in the URL and which in the body. A URL may contain only ASCII characters, so other characters must be percent-encoded, whereas the body can use any encoding declared in its Content-Type header.
Number of requests: an HTTP request can be roughly divided into “request headers” and a “request body”. By convention, all control information goes in the request headers and the actual data goes in the request body. The server therefore always parses the request headers first, because it needs the control information to decide how to handle the request, that is, whether to accept or reject it. A client can exploit this by sending only the request headers first (with the Expect: 100-continue header) for the server to validate. If the server accepts them, it replies with “100 Continue” and the client sends the rest of the data. If it rejects them, it replies with an error such as 400 and the interaction ends. On this basis the client can make some optimizations, for example an internal setting that sends only the request headers first when a POST carries more than 1 KB of data and otherwise sends everything at once. The client can even adopt an adaptive strategy that tracks the success rate of its requests and, if the rate is very high, always sends everything at once. The client is therefore free to decide whether to send the request in one go or in N parts.
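
The points about parameter location, encoding, and sending headers first can be sketched with Python's standard urllib.parse module. This is a rough illustration rather than a real HTTP client: build_request and expect_threshold are hypothetical names, the host and paths are placeholders, and the 1024-byte default simply mirrors the 1 KB example above. The function returns the header block and the body separately, so a caller could send the headers, wait for “100 Continue”, and only then send the body.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_request(
    method: str,
    host: str,
    path: str,
    params: Dict[str, str],
    expect_threshold: int = 1024,  # assumption: mirrors the 1 KB example
) -> Tuple[bytes, bytes]:
    """Build a raw HTTP/1.1 request as (header_block, body).

    GET carries the parameters in the URL; other methods carry them in the body.
    """
    # urlencode percent-encodes non-ASCII characters as UTF-8 bytes,
    # so the result is always plain ASCII.
    encoded = urlencode(params)
    headers = [f"Host: {host}"]
    body = b""
    if method == "GET":
        target = f"{path}?{encoded}" if encoded else path
    else:
        target = path
        body = encoded.encode("ascii")
        headers.append("Content-Type: application/x-www-form-urlencoded")
        headers.append(f"Content-Length: {len(body)}")
        if len(body) > expect_threshold:
            # Large body: ask the server to approve the headers first.
            headers.append("Expect: 100-continue")
    head = f"{method} {target} HTTP/1.1\r\n" + "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii"), body
```

For the same parameters, a GET request ends up with the percent-encoded query string in its request line and an empty body, while a POST request keeps the request line short and carries the same encoded string in the body.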

HTTP and HTTPS (SSL, encryption algorithms); HTTP/1.0, HTTP/1.1, HTTP/2 (persistent and short-lived connections); HTTP message structure; TCP/IP layers; differences between TCP and UDP

Browser rendering

Closing the TCP connection

Four-way handshake

P.S. Common network attacks

See the articles “From entering a URL to page load: how to build up your front-end knowledge system from a single question” and “Understanding the process of a URL request”.