Original: itsOli @ Front-End 10,000 Hours. This article was first published on the public account “Front-End 10,000 Hours”. Copyright belongs to the author; please do not reprint without authorization! This article is adapted from the paid Yuque column “Front End | Ten Thousand Hours: From Zero Basis to Easily Obtaining Employment”.
❗️ The following link is the latest erratum to this article: 【Column preview】Front-End/Back-End Interaction | (02) Rules and standards of interaction: HTTP, ② HTTP's “three-way handshake”, URI, URL, and URN
1 HTTP three-way handshake
1.1 Basic Concepts
Before we get into the HTTP three-way handshake, we need to clarify one concept:
Before any “HTTP request” can be sent and its response returned between the “client” and the “server”, a “TCP connection” has to be created first!
❗️ This is because HTTP itself has no concept of a “connection”; it only has the two concepts of “request” and “response”!
Both “request” and “response” are “packets”, and transmitting a “packet” requires a “transmission channel”. This “channel/connection” is created by TCP.
Once a “TCP connection” is created, it can be declared persistent (kept alive) so that it stays open. Our “HTTP requests” are then sent over this “connection”.
Of course, since the “connection” will remain there after it is created, we can send multiple “HTTP requests” on this “TCP connection.”
❓ Why do we keep the “TCP connection” open? A: Because creating a “TCP connection” involves a “three-way handshake”, and that process costs three network transfers. If the connection were closed after every request, re-creating it each time would add a lot of overhead and delay!
The “three-way handshake” means that three network transfers take place between the client and the server. Only after these three transfers are complete is the “TCP connection” created and can an HTTP request be sent.
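To make the reuse of one “TCP connection” concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the local throwaway server, the handler name HelloHandler, the paths /first and /second, and the function name are all made up for this example and do not come from the article. It sends two “HTTP requests” over a single connection and records the client's local port each time.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler used only for this demonstration."""

    # HTTP/1.1 lets the server keep the TCP connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"hello from {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the response ends,
        # so the connection does not have to be closed to signal the end.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    # Port 0 asks the OS for any free port (an assumption for a local demo).
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    results = []
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)  # the first call opens the TCP connection
            response = conn.getresponse()
            body = response.read().decode()
            local_port = conn.sock.getsockname()[1]
            results.append((response.status, body, local_port))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, body, port in fetch_twice_on_one_connection():
        print(status, body, "client port:", port)
```

If both requests report the same client port, they travelled over the same “TCP connection”, so only one handshake was paid for.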
1.2 Sequence Diagram of the “Three-Way Handshake”
1️⃣ First, the “client” sends the “server” a packet that says “I want to create a connection”:
- SYN=1: SYN is a flag bit; setting it marks this as a SYN packet;
- Seq=X: the packet also carries a sequence number, Seq, equal to some number X. X is the client's initial sequence number, which real TCP stacks pick unpredictably rather than fixing it at 1.
2️⃣ Then, when the “server” receives the request from 1️⃣, it knows that a client wants to establish a connection with it. The “server” then opens a TCP socket and returns a packet to the “client”:
- SYN=1: SYN is again a flag bit;
- ACK=X+1: the ACK flag bit is set as well, and the acknowledgment number equals the Seq sent by the “client” in 1️⃣, plus 1 (❗️ SYN and ACK are used together, meaning the “server” returns a packet marked SYN/ACK to the “client”);
- Seq=Y: the “server” also returns its own Seq, equal to a new number Y.
3️⃣ Finally, when the “client” receives the packet sent by the “server” in 2️⃣, it knows that the “server” has agreed to create a “TCP connection”. The “client” then returns a packet to the “server”:
- ACK=Y+1: ACK is a flag bit, marking this as an ACK packet sent from the “client” to the “server”; its acknowledgment number equals the Seq sent by the “server” in 2️⃣, plus 1;
- Seq=Z: the “client” also sends a Seq. In real TCP, Z equals X+1, the value the “server” just acknowledged, rather than an unrelated new number.
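The number relationships in the three steps can be written down as a small Python sketch. This is a toy model, not a real TCP implementation: the Segment class and the three_way_handshake function are hypothetical names introduced only for this illustration, and the starting numbers X and Y are passed in by the caller. Note that in real TCP the third segment carries Seq = X+1.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass(frozen=True)
class Segment:
    """Hypothetical, simplified view of one handshake packet."""
    sender: str
    flags: Tuple[str, ...]
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn, server_isn):
    """Return the three segments for starting numbers X and Y."""
    # Step 1: client -> server, SYN=1, Seq=X
    syn = Segment("client", ("SYN",), seq=client_isn)
    # Step 2: server -> client, SYN=1 and ACK, Seq=Y, Ack=X+1
    syn_ack = Segment("server", ("SYN", "ACK"),
                      seq=server_isn, ack=(syn.seq + 1) % SEQ_SPACE)
    # Step 3: client -> server, ACK, Seq=X+1, Ack=Y+1
    ack = Segment("client", ("ACK",),
                  seq=(syn.seq + 1) % SEQ_SPACE,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(segment)
```

The values 1000 and 5000 are arbitrary; any pair shows the same "+1" pattern between each Seq and the Ack that answers it.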
❓ Why do we need a “three-way handshake” process? A: This is to prevent the “server” side from opening some useless “connections”.
As we know, data has to travel through optical fiber and various intermediate proxy servers, which inevitably introduces some “delay”. Suppose there were no “three-way handshake”: as soon as the request from 1️⃣ arrived at the “server”, the “server” would directly create the “connection” and return the related data to the “client”.
Now the accident happens: because of network problems, the returned “data” is lost in transit! The “client” never receives the data returned by the server. Meanwhile, the “client” may have a “timeout” configured: if no data arrives from the server within that period, it closes the “connection” and sends a new request.
However, without the “three-way handshake”, the “server” has no way of knowing that the “client” never received the “data”, because the “client” never sends any feedback! The port on the “server” therefore stays open and is wasted.
So the “three-way handshake” mainly exists to spare the “server” the cost of connections left useless by delay or loss during network transmission.
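The reasoning above can be sketched as a toy Python model. It is an assumption-laden illustration, not how a real TCP stack is built: the class HandshakeServer, its method names, and the timeout value are all invented here. The point it shows is that a connection only counts as established once the third packet (the client's ACK) arrives, and that attempts which never get that feedback are dropped instead of being held open.

```python
class HandshakeServer:
    """Hypothetical server-side bookkeeping for pending handshakes."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}         # client id -> time the SYN/ACK was sent
        self.established = set()  # clients that completed all three steps

    def on_syn(self, client, now):
        # Steps 1 and 2: remember the attempt, but do not trust it yet.
        self.pending[client] = now

    def on_ack(self, client, now):
        # Step 3: the client confirms it really received our SYN/ACK.
        sent_at = self.pending.pop(client, None)
        if sent_at is not None and now - sent_at <= self.timeout:
            self.established.add(client)

    def expire(self, now):
        # Attempts with no feedback are released instead of wasting resources.
        stale = [c for c, t in self.pending.items() if now - t > self.timeout]
        for client in stale:
            del self.pending[client]
        return stale


if __name__ == "__main__":
    server = HandshakeServer(timeout=3)
    server.on_syn("client-a", now=0)
    server.on_syn("client-b", now=0)   # its SYN/ACK will be "lost"
    server.on_ack("client-a", now=1)
    print("expired:", server.expire(now=10))
    print("established:", server.established)
```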
1.3 Using Wireshark to Visualize the Three-Way Handshake
I will use Wireshark, a powerful network packet capture tool, to capture some packets for illustration.
❗️ Installing Wireshark is not covered here; official installers are available for Windows and macOS. For now, just follow along with this simple usage, and study the details further as your actual work requires them.
First, enter the following command in the terminal to view the IP address of the host (marked by the red box in the image below):
ifconfig    # on Windows, use ipconfig
My LAN IP is 192.168.8.107:
Then, open Wireshark and click the item marked by the red box in the following figure to capture the packets on the machine's current Wi-Fi interface:
You will immediately capture a lot of packets (the red box in the upper left corner of the app stops the capture). Here is one example from the capture:
- 57030: a port on this machine;
- 443: a port on the server.
Take a close look at the information in each handshake “packet”. Both the “flag bits” used and the relationships between the “values” match what the “three-way handshake” sequence diagram above describes.
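The two port numbers seen in a capture can also be observed from code. The following Python sketch is an illustration only: it uses a loopback listener in place of a real server, and the function name show_connection_ports is made up for this example. The operating system performs the three-way handshake inside connect(); the program only sees the resulting pair of ports.

```python
import socket


def show_connection_ports():
    # A listening socket stands in for the "server"; port 0 lets the OS
    # choose a free port (an assumption made for a local demo).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        server_port = listener.getsockname()[1]

        # create_connection() returns once the kernel has completed the
        # three-way handshake with the listener.
        with socket.create_connection(("127.0.0.1", server_port)) as client:
            accepted, peer_address = listener.accept()
            with accepted:
                return {
                    # Temporary port the OS assigned to the client side.
                    "client_port": client.getsockname()[1],
                    # Fixed port the server is listening on.
                    "server_port": client.getpeername()[1],
                    # The client port as the server sees it.
                    "client_port_seen_by_server": peer_address[1],
                }


if __name__ == "__main__":
    print(show_connection_ports())
```

Running Wireshark on the loopback interface while this script runs should show the same SYN, SYN/ACK, ACK exchange between these two ports.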
2 URI, URL, and URN
2.1 Definitions
- URI: Uniform Resource Identifier
- URL: Uniform Resource Locator
- URN: Uniform Resource Name
The URL is a Web page address that you need to enter when accessing a Web page using a browser, for example, http://www.baidu.com
URIs are more general resource identifiers and comprise two main subsets:
- URL: Describes a resource by describing its location;
- URN: Identifies a resource by name, independent of its location (not widely used at present, so a passing familiarity is enough).
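As a rough rule of thumb, the two subsets can be told apart by looking at the scheme. The Python sketch below is a simplification under stated assumptions: the function name classify_uri is hypothetical, the ISBN value is only a sample, and treating every non-urn scheme as a URL is a shortcut rather than a formal definition.

```python
from urllib.parse import urlparse


def classify_uri(uri):
    """Very rough classification based only on the scheme."""
    scheme = urlparse(uri).scheme.lower()
    if not scheme:
        return "not an absolute URI"
    if scheme == "urn":
        return "URN"  # identifies by name, e.g. urn:isbn:...
    return "URL"      # identifies by location, e.g. http://...


if __name__ == "__main__":
    for candidate in ("http://www.baidu.com",
                      "urn:isbn:0451450523",
                      "/olizhao/qdywxs"):
        print(candidate, "->", classify_uri(candidate))
```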
2.2 Components of a URL
Our common URLs consist of three main parts:
- Scheme, which is what we usually call the protocol;
- Server location;
- Resource path.
For example: https://www.yuque.com/olizhao/qdywxs
Specifically, a common URL consists of nine parts:
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<hash>
- <scheme>: the most common protocols for Web pages are HTTP and HTTPS;
- <user>:<password>: user and password are uncommon now. We no longer write the user name and password in the URL in plain text; users are authenticated through a login instead;
- <host>: the host can be an IP address or a domain name;
- <port>: the port number distinguishes the processes on a host and makes the Web server easy to locate. The default HTTP port is 80, so the process that provides the HTTP service listens on TCP port 80. It is as if a bank had multiple windows in its service hall, one of which offers foreign currency exchange. To make that window easy to find, the bank's headquarters stipulates that window 80 of every branch provides foreign currency exchange by default, so a customer who needs that service can walk into any branch and go straight to window 80. Here, each branch address can be understood as an IP address, and each window in the hall as a port. What each window offers can be arranged by the lobby manager, who can be understood as the server administrator. This means that although the default HTTP port is 80, an administrator can move the HTTP service from port 80 to port 81, or assign port 80 to another service such as SSH;
- <path>: the path is the path of the resource, that is, where the resource is located. It does not have to correspond to a physical path, but it must follow the routing conventions of the Web server;
- <params>: some protocols require parameters in order to access a resource. Parameters are name/value pairs, and a URL can contain multiple parameter fields, separated from one another and from the rest of the path by semicolons (;). For example, in ftp://prep.ai.mit.edu/pub/gnu;type=d the parameter is type=d, where the parameter name is type and the value is d;
- <query>: the query is the most common way to pass parameters in GET requests. It is separated from the rest of the URL by the character ?, for example ?a=1&b=2&c=3;
- <hash>: the hash, also known as the fragment, is designed to identify a part of a document, and many MVVM frameworks use it to implement routing. Some resource types, such as HTML, can be divided below the resource level. For a large text document with sections, for example, the URL allows a hash to identify a fragment within the resource. It is appended to the right end of the URL, preceded by the character #. For example, in http://www.baidu.com/tools.html#drills the fragment refers to a part of the /tools.html page on the Baidu server, and that part is named drills.
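The nine parts can be pulled out of a URL with Python's standard urllib.parse module. The sketch below is only an illustration: the function name split_url, the DEFAULT_PORTS table, and the sample URL with the user oli on www.example.com are assumptions made for this example. Python calls the hash part the fragment.

```python
from urllib.parse import parse_qs, urlparse

# Well-known default ports, used when the URL leaves out <port>.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def split_url(url):
    """Split a URL into the nine parts described above."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is written.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "params": parts.params,          # text after ";" in the last path segment
        "query": parse_qs(parts.query),  # name -> list of values
        "hash": parts.fragment,
    }


if __name__ == "__main__":
    sample = "http://oli:secret@www.example.com:81/pub/gnu;type=d?a=1&b=2#drills"
    for name, value in split_url(sample).items():
        print(f"{name:>8}: {value}")
```

For a URL such as https://www.yuque.com/olizhao/qdywxs, which has no explicit port, the same function falls back to 443 from the table.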
Wishing you all the best. QdyWXS ♥ you!