What happens between entering a URL and the page being displayed?
1. The browser checks whether the URL is cached and, if so, whether the cached copy has expired.
2. DNS resolves the domain name in the URL to an IP address.
3. A TCP connection is established (three-way handshake).
4. The browser sends an HTTP request.
5. The server processes the request and the browser receives the HTTP response.
6. The browser parses the response and renders the page.
7. The TCP connection is closed (four-way handshake).
URL
A URL consists of a protocol, host, port, and path.
Protocol
Specifies the protocol used to transfer the resource. The most common are HTTP and HTTPS.
Host (hostname)
The Domain Name System (DNS) host name or the IP address of the server where the resource is stored.
Port (port number)
This part is optional. Every protocol has a default port number (80 for HTTP, 443 for HTTPS), which is used when the port is omitted.
Path (path)
A string of zero or more segments separated by slashes (/), usually indicating the location of a directory or file on the host.
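As a minimal sketch of the four parts described above, the following uses Python's standard urllib.parse module to split a URL. The helper name split_url, the DEFAULT_PORTS table, and the example.com URLs are illustrative assumptions, not part of the original text.

```python
from urllib.parse import urlsplit

# Default ports for the two protocols named above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Return the protocol, host, port, and path of a URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not spell out a port,
    # in which case the protocol's default port applies.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path or "/",  # an empty path means the root
    }


print(split_url("https://www.example.com/docs/index.html"))
print(split_url("http://www.example.com:8080"))
```

The first call falls back to port 443 because the URL omits the port; the second keeps the explicit 8080 and reports the root path.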
Entering the URL
As we type a URL, the browser looks for possible matches in the history, bookmarks, and so on, and offers suggestions. When you press Enter, if the URL contains a domain name rather than an IP address, the browser performs domain name resolution (DNS resolution).
DNS domain name resolution
The domain name entered in the address bar is not the actual location of the resource. A domain name is only a name mapped to the IP address of a server on the network. Because IP addresses are numerous and hard to remember, domain names were introduced, and each domain name corresponds to an IP address.
1. There is a hosts file on the local hard disk. It acts as a small “database” that associates some commonly used domain names with their IP addresses. The system looks in the hosts file first. If a matching entry exists, the system uses that IP address directly and no DNS query is needed.
2. If there is no match, the system checks the cache of the local DNS resolver for a mapping for the domain name. If one exists, it is returned.
3. If neither the hosts file nor the local DNS resolver cache holds the mapping, the system queries the preferred DNS server configured in its TCP/IP settings, which we call the local DNS server here. When this server receives the query and the domain name belongs to a zone it is configured to serve, it returns the result to the client and resolution is complete. This answer is authoritative.
4. If the local DNS server cannot answer from its zone files or from its cache, what happens next depends on its settings (whether a forwarder is configured). If forwarding is not used, the local DNS server sends the request to one of the 13 root DNS servers. The root server determines which server is authorized to manage the top-level domain (for example, .com) and returns the IP address of a server responsible for that top-level domain. The local DNS server then contacts the server responsible for the .com domain. If that server cannot resolve the name itself, it returns the address of the next-level DNS server that manages the domain under .com. The local DNS server queries that server in turn and repeats this process until some server returns the answer.
5. If forwarding is enabled, the local DNS server forwards the request to an upper-level DNS server for resolution. If the upper-level server cannot resolve the request, it either queries the root DNS servers or forwards the request further up. Whether the local DNS server uses forwarding or root hints, the result is eventually returned to the local DNS server, which in turn returns it to the client.
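The lookup order in steps 1 to 4 can be sketched as a toy simulation. Everything here is an assumption made for illustration: the server names (tld-com, ns-example), the data tables, and the address 192.0.2.10 (taken from a documentation-only range) do not describe real DNS infrastructure, and a real resolver speaks the DNS protocol rather than reading dictionaries.

```python
# Toy data standing in for real DNS infrastructure (all values are made up).
HOSTS_FILE = {"localhost": "127.0.0.1"}
RESOLVER_CACHE = {}
ROOT = {"com": "tld-com"}  # the root knows which server manages each TLD
SERVERS = {
    "tld-com": {"example.com": "ns-example"},        # referral to next level
    "ns-example": {"www.example.com": "192.0.2.10"},  # holds the record
}


def iterative_lookup(name):
    """Start at the root and follow referrals until a server has the record.

    Names that are not in the toy tables raise an error; a real resolver
    would return a "no such domain" answer instead.
    """
    server = ROOT[name.split(".")[-1]]  # root refers us to the TLD server
    trail = ["root", server]
    while True:
        zone = SERVERS[server]
        if name in zone:  # this server can answer directly
            return zone[name], trail
        # Otherwise follow the referral for the matching parent domain.
        suffix = next(s for s in zone if name.endswith("." + s))
        server = zone[suffix]
        trail.append(server)


def resolve(name):
    """Check the hosts file, then the resolver cache, then query servers."""
    if name in HOSTS_FILE:
        return HOSTS_FILE[name], "hosts file"
    if name in RESOLVER_CACHE:
        return RESOLVER_CACHE[name], "resolver cache"
    ip, trail = iterative_lookup(name)
    RESOLVER_CACHE[name] = ip  # remember the answer for next time
    return ip, " -> ".join(trail)


print(resolve("localhost"))
print(resolve("www.example.com"))  # walks root -> tld-com -> ns-example
print(resolve("www.example.com"))  # second call is served from the cache
```

The second lookup of the same name never reaches the root, which is the point of checking the local sources first.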
Recursive query: In this mode, the DNS server must return a definitive result to the client after receiving its request. If the DNS server does not hold the requested record locally, it queries other servers itself and returns the final result to the client.
DNS cache: There are multiple levels of DNS cache. Sorted by distance from the browser, they are: browser cache, system cache, router cache, ISP server cache, root DNS server cache, top-level DNS server cache, and primary DNS server cache.
DNS delegation: A DNS server assigns a subdomain of its zone to another DNS server for management. When a client then submits a query for a name in that subdomain, the DNS server of the parent domain refers the request to the DNS server that maintains the subdomain.
DNS forwarding: All queries for names that are not in the local zone and cannot be found in the cache are forwarded to the configured DNS forwarder. The forwarder performs the resolution and caches the result, so over time its cache holds a large amount of domain name information. The forwarder is therefore likely to answer non-local queries from its cache, which avoids repeating external queries and reduces traffic. Forwarding occurs only if the server is not authoritative for the name and has no record for it in its cache.
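To illustrate why a forwarder's cache reduces external traffic, here is a small sketch. The Forwarder class, the fake_upstream function, and the address are hypothetical; real forwarders also honor record lifetimes (TTL), which this sketch leaves out.

```python
class Forwarder:
    """Answer from the cache when possible, otherwise ask upstream once."""

    def __init__(self, upstream):
        self.upstream = upstream    # hypothetical callable: name -> IP address
        self.cache = {}
        self.external_queries = 0   # how many times we had to go upstream

    def query(self, name):
        if name not in self.cache:
            self.external_queries += 1
            self.cache[name] = self.upstream(name)
        return self.cache[name]


def fake_upstream(name):
    """Stand-in for the servers the forwarder would query externally."""
    return {"www.example.com": "192.0.2.10"}[name]


forwarder = Forwarder(fake_upstream)
for client in ("client-a", "client-b", "client-c"):
    forwarder.query("www.example.com")

print(forwarder.external_queries)  # three client queries, one external query
```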
Establishing a TCP connection
To ensure reliable data transmission, TCP establishes a transport connection between application processes. This is a logical connection between two transport users, in which each party confirms the other as the endpoint of its transport connection.
Before a connection is established, the server passively opens a well-known port and listens on it. When the client wants to connect to the server, it actively opens a port (usually a temporary one) and sends a connection request. The three-way handshake then begins.
Informally, it can be understood like this:
1. A calls B: can you hear me? A does not know whether B can hear.
2. B hears A and replies: I can hear you, can you hear me? B does not know whether A heard the reply.
3. A can now confirm two things: B can hear A, and A can hear B. A replies to B: I can hear you too, so we can talk.
Why a three-way handshake
The three-way handshake prevents a stale connection request segment that suddenly reaches the server from causing an error.
Suppose a connection request segment sent by the client is not lost but is held up at a network node for a long time, so that it arrives only after the connection it belonged to has been released. By then it is an invalid segment. However, when the server receives it, the server mistakes it for a new connection request from the client and sends an acknowledgement to the client agreeing to establish a connection.
If the three-way handshake were not used, a new connection would be considered established as soon as the server sent its acknowledgement. Since the client did not actually request a connection, it ignores the server’s acknowledgement and sends no data. The server, however, believes the connection is established and keeps waiting for data from the client, so server resources are wasted. The three-way handshake prevents this: the client does not acknowledge the server’s acknowledgement, and because the server receives no acknowledgement, it knows that the client did not request a connection.
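The handshake and the stale-request case can be sketched as a small simulation. The segment names SYN, SYN-ACK, and ACK are the standard TCP terms for the three messages. The Client and Server classes, the dictionary used as a segment, and the sequence numbers are simplifications assumed for this example; real TCP is implemented by the operating system, not by application code like this.

```python
class Server:
    def __init__(self):
        self.half_open = {}       # client id -> server's initial sequence number
        self.established = set()

    def on_syn(self, client_id, client_seq):
        """Step 2: answer a SYN with a SYN-ACK, but do not establish yet."""
        server_seq = 300  # arbitrary initial sequence number for the sketch
        self.half_open[client_id] = server_seq
        return {"ack": client_seq + 1, "seq": server_seq}

    def on_ack(self, client_id, ack):
        """Step 3 received: only now is the connection established."""
        expected = self.half_open.get(client_id)
        if expected is not None and ack == expected + 1:
            del self.half_open[client_id]
            self.established.add(client_id)


class Client:
    def __init__(self, client_id):
        self.client_id = client_id
        self.pending_seq = None  # set only while this client is connecting

    def connect(self, server, seq=100):
        """Step 1: send a SYN, then handle the server's reply."""
        self.pending_seq = seq
        self.on_syn_ack(server, server.on_syn(self.client_id, seq))

    def on_syn_ack(self, server, syn_ack):
        # Ignore a SYN-ACK that does not answer a SYN we are waiting on.
        if self.pending_seq is None or syn_ack["ack"] != self.pending_seq + 1:
            return
        self.pending_seq = None
        server.on_ack(self.client_id, syn_ack["seq"] + 1)


server = Server()
Client("a").connect(server)
print("a" in server.established)  # normal handshake completes

# A stale SYN from an old, already released connection arrives late.
b = Client("b")
stale_reply = server.on_syn("b", 42)
b.on_syn_ack(server, stale_reply)  # b never asked, so it sends no ACK
print("b" in server.established)   # the server never treats it as connected
```

Because the server waits for the third message before treating the connection as established, the stale request leaves only a half-open entry rather than a connection that waits for data.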
Making an HTTP request
An HTTP request is a message sent from the client to the server. When the browser sends a request to the web server, it sends a block of data, the request message, which consists of three parts: (1) the request line (method, URI, protocol/version), (2) the request headers, and (3) the request body.
1. Request line: method, URI, and protocol/version. For example: GET /sample.jsp HTTP/1.1. Common methods are GET, POST, PUT, DELETE, OPTIONS, and HEAD.
2. Request headers: The request headers contain a lot of useful information about the client environment and the request body. For example, they can declare the browser language, the length of the request body, and so on.
3. Request body: With methods such as POST and PUT, the client usually needs to send data to the server. This data is carried in the request body, and some of the request headers describe it.
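As a rough sketch of how these three parts fit together on the wire, the following assembles an HTTP/1.1 request by hand. The function name build_request, the host www.example.com, and the header and body values are assumptions made for the example; in practice a browser or an HTTP library builds this message.

```python
def build_request(method, uri, host, headers=None, body=b""):
    """Assemble a request: request line, headers, blank line, then body."""
    lines = [f"{method} {uri} HTTP/1.1", f"Host: {host}"]
    all_headers = dict(headers or {})
    if body:
        # A header that describes the request body.
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # Each line ends with CRLF; an empty line separates headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "POST", "/sample.jsp", "www.example.com",
    headers={"Accept-Language": "en"}, body=b"name=value",
)
print(request.decode("ascii"))
```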
The server processes the request and returns an HTTP response
The browser receives the HTTP response. An HTTP response also consists of three parts: a status line containing the status code, the response headers, and the response body.
Status code: The status code consists of three digits. The first digit defines the category of the response and has five possible values:
1xx: informational – The request has been received and processing continues.
2xx: success – The request was successfully received, understood, and accepted.
3xx: redirect – Further action must be taken to complete the request.
4xx: client error – The request has a syntax error or cannot be fulfilled.
5xx: server error – The server failed to fulfill a valid request.
Common status codes are 200, 204, 301, 302, 304, 400, 401, 403, 404, 422, and 500 (look up what each one means yourself).
Response header:
Response body: The content returned by the server to the browser. HTML, CSS, JS, and image data are usually carried in this part.
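A minimal sketch of splitting a raw response into its parts and classifying the status code by its first digit. The function name parse_response and the sample response are assumptions for illustration, and the parser is simplified: it expects a reason phrase in the status line and does not handle repeated headers or chunked bodies.

```python
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def parse_response(raw):
    """Split raw response bytes into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(code),
        "reason": reason,
        "category": CATEGORIES[int(code) // 100],  # first digit of the code
        "headers": headers,
        "body": body,
    }


raw = (b"HTTP/1.1 404 Not Found\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 9\r\n"
       b"\r\n"
       b"not found")
response = parse_response(raw)
print(response["status"], response["category"], response["headers"])
```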
The browser parses and renders the page
After the browser receives the HTML, CSS, and JS files, how does it render the page to the screen? The process described here is the WebKit rendering process.
The main steps of parsing and rendering in the browser are:
1. The browser parses the HTML to build a DOM tree.
2. CSS resources are parsed into a CSS rule tree (CSSOM).
3. The DOM and CSSOM are combined to build a render tree.
4. The browser lays out the render tree and draws it to the screen. (This process is complicated and involves reflow and repaint.) Every element in the DOM is a box in the box model, so the browser has to calculate its position, size, and so on. This process is called reflow. Once the position, size, and other properties of the boxes, such as color and font, are determined, the browser draws the content, a process called repaint. A page inevitably goes through reflow and repaint when it first loads. Reflow is very expensive and should be minimized.
Render tree vs. DOM tree: Invisible DOM elements (such as … or elements with display: none) are not inserted into the render tree. There are also nodes that are absolutely positioned or floated. These nodes are outside the normal text flow, so they sit in different places in the two trees: the render tree records their actual position, while the DOM tree records their original position with a placeholder structure.
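The difference between the two trees can be sketched with a toy model in which each DOM node already carries its computed style. The Node class and the build_render_tree and tags helpers are hypothetical names for this example; a real engine first matches CSS rules against the DOM and handles far more cases than display: none.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    style: dict = field(default_factory=dict)     # computed style (assumed given)
    children: list = field(default_factory=list)


def build_render_tree(node):
    """Copy a DOM subtree, leaving out nodes with display: none.

    Returns None when the node itself is not rendered, which also drops
    all of its descendants.
    """
    if node.style.get("display") == "none":
        return None
    rendered = (build_render_tree(child) for child in node.children)
    kept = [child for child in rendered if child is not None]
    return Node(node.tag, dict(node.style), kept)


def tags(node):
    """List the tags of a tree in document order."""
    return [node.tag] + [t for child in node.children for t in tags(child)]


dom = Node("body", children=[
    Node("h1"),
    Node("div", {"display": "none"}, [Node("p")]),  # hidden, with a child
    Node("p", {"color": "red"}),
])

print(tags(dom))                     # every node is in the DOM tree
print(tags(build_render_tree(dom)))  # the hidden div and its child are gone
```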
If a <script> tag is encountered (leaving aside the defer and async attributes), the browser executes it immediately and the page blocks: the browser waits not only for the JS file to download but also for the script to be parsed and executed before it resumes loading and parsing the HTML document. This is because JS may modify the DOM tree, in which case the browser would have to rebuild the tree and go back to re-render that part of the page. The browser therefore blocks the download and rendering of other content to avoid unnecessary reflow (also called backflow or rearrangement). Loading CSS files does not affect the loading of JS files, but it does affect their execution: the browser must make sure the CSS has been downloaded and loaded before it executes the JS code.
If the page contains image resources, the browser also sends requests for them. These requests are asynchronous, so the browser does not wait for the images to download and continues rendering the rest of the HTML document. When the server returns an image file, if the width and height of the image were not set beforehand, the image now takes up space and changes the layout of the content that follows it, so the browser has to reflow.
Closing the TCP connection
TCP is full-duplex: both the server and the client can send and receive data. When a TCP connection is closed, both sides need to confirm that the other side will send no more data.
1. A tells B that it has finished and wants to close the connection. This means A will send no more data but can still receive.
2. B replies to A that it has received the close request (so that A does not keep resending it because it got no reply). B may still have data to send to A at this point.
3. To make sure the data transfer completes normally and reliably, B first finishes sending its remaining data. It then tells A that it is ready to close and waits for A’s confirmation.
4. A confirms the close to B, and B releases the connection when it receives the confirmation. A waits for a timeout period before releasing its own side.
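The four steps can be replayed as a sequence of state changes. The state names used here (FIN_WAIT_1, CLOSE_WAIT, LAST_ACK, TIME_WAIT, and so on) are the standard TCP state names and do not appear in the text above. The function close_connection and the log format are assumptions for this sketch, which only records states and sends nothing over a network.

```python
def close_connection():
    """Replay the four-way close and record each side's state after a step."""
    a = b = "ESTABLISHED"
    log = []

    # 1. A sends FIN: A will send no more data but can still receive.
    a = "FIN_WAIT_1"
    log.append(("A->B FIN", a, b))

    # 2. B acknowledges the FIN; B may still send data to A.
    b = "CLOSE_WAIT"
    a = "FIN_WAIT_2"
    log.append(("B->A ACK", a, b))

    # 3. B has sent its remaining data and now sends its own FIN.
    b = "LAST_ACK"
    log.append(("B->A FIN", a, b))

    # 4. A acknowledges; B closes on receipt, A waits before closing.
    a = "TIME_WAIT"
    b = "CLOSED"
    log.append(("A->B ACK", a, b))

    # After the wait expires, A releases the connection too.
    a = "CLOSED"
    log.append(("timeout", a, b))
    return log


for step, a_state, b_state in close_connection():
    print(f"{step:10} A={a_state:12} B={b_state}")
```

The side that closes first is the one that ends up waiting through the timeout, which is why A is still in TIME_WAIT after B has already released the connection.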