Introduction to HTTP
HyperText Transfer Protocol (HTTP) is one of the most widely used network protocols on the Internet. Resources on the World Wide Web are transferred according to this standard. HTTP was originally designed as a way to publish and retrieve HTML pages. Around 1960, the American Ted Nelson conceived a way of processing text information by computer, which he called hypertext; this idea became the foundation on which the HyperText Transfer Protocol was later built.
The above is not the focus of this article. For more details about HTTP, please see www.baidu.com.
HTTP is a stateless, application-layer hypertext protocol that runs on top of TCP/IP.
Getting the IP address
When a user requests a resource under a domain name, for example by typing http://www.qq.com into a browser, the browser first has to find the IP address for that domain name. Where does it look? This is where DNS comes in; it can be thought of as a ledger that maps domain names to IP addresses. When the client sends a DNS request, the local DNS server receives it first and checks its own cache for a mapping between the domain name and an IP address. If one exists, it returns the IP address directly; if not, it asks other DNS servers.
A quick word on how DNS servers are organized on the network: they form a tree. At the top are the root servers, whose child nodes are the top-level domain servers (such as .com and .cn). Below the top-level domain servers are the authoritative DNS servers, which hold the actual records for individual domains.
When the local DNS server has no cached answer, it queries the tree in the order above (root, then top-level domain, then authoritative server) until it finds the IP address for the domain name, and then caches the result locally for later requests. The end result of this process is the IP address that corresponds to the domain name. If the client entered an IP address directly, the lookup is skipped.
Any website reachable on the Internet is ultimately addressed by its IP address.
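The lookup order described above (local cache first, then root, top-level domain, and authoritative server) can be sketched as a toy model. This is only an illustration under simplifying assumptions: the server names, the domain www.example.com, and the address 192.0.2.10 (from a range reserved for documentation) are all made up, and a real resolver speaks the DNS wire protocol rather than reading dictionaries.

```python
# Toy model of DNS resolution: local cache, then root -> TLD -> authoritative.
# Every name and address below is hypothetical illustration data.

ROOT = {"com": "tld-com", "cn": "tld-cn"}          # root knows the TLD servers
TLD = {
    "tld-com": {"example.com": "auth-example"},    # TLD knows authoritative servers
    "tld-cn": {},
}
AUTHORITATIVE = {
    "auth-example": {"www.example.com": "192.0.2.10"},  # holds the actual records
}


class LocalDnsServer:
    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0  # how many times the tree had to be walked

    def resolve(self, name):
        # Step 1: answer from the local cache when possible.
        if name in self.cache:
            return self.cache[name]
        # Step 2: otherwise walk the tree from the root downwards.
        self.upstream_queries += 1
        labels = name.split(".")
        tld_server = ROOT[labels[-1]]
        zone = ".".join(labels[-2:])
        auth_server = TLD[tld_server][zone]
        ip = AUTHORITATIVE[auth_server][name]
        # Step 3: remember the answer for later requests.
        self.cache[name] = ip
        return ip


local_dns = LocalDnsServer()
first = local_dns.resolve("www.example.com")   # walks the tree
second = local_dns.resolve("www.example.com")  # served from the cache
print(first, second, local_dns.upstream_queries)
```

The second call never leaves the local server, which is the whole point of the cache.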
Establishing a TCP connection
Once the HTTP request has been issued and the correct server IP address obtained, the connection can be established. One thing to be clear about is that HTTP runs on top of TCP, so the first step is to establish a TCP connection. This is what many web articles call the three-way handshake: Client: Hi, I’m the client. Server: Hello client, this is the server. Client: Hello server…
The three-way handshake can be written as the following sequence: client’s question (SYN) > server’s answer (SYN+ACK) > client’s answer (ACK).
Some interviewers, for want of a better question, ask why the handshake takes three steps instead of two, four, or five. Think of two people, A and B, who want to talk to each other: the simplest way to confirm that both can speak and hear is for A to ask a question and get an answer from B, and for B to ask a question and get an answer from A. That would be four messages, but B can send its answer to A and its own question in the same message, which leaves three. This is the heart of the three-way handshake.
Generally speaking, TCP is connection-oriented. The connection here is really an agreement between the two parties on how they will communicate (including the ordering of packets, buffer sizes, and so on). There is no physical link; the connection is maintained logically.
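To make the question-and-answer pattern concrete, here is a small simulation of the three segments. It is a simplified sketch: only the flags, sequence numbers, and acknowledgement numbers are modelled, the initial sequence numbers 100 and 300 are arbitrary, and the function name is made up for this example. No real network traffic is involved.

```python
# Simplified model of the TCP three-way handshake.
# Real segments carry many more fields (window size, options, checksum, ...).

def three_way_handshake(client_isn, server_isn):
    """Return the three segments as (sender, flags, seq, ack) tuples."""
    # 1. The client asks: "here is my initial sequence number".
    syn = ("client", "SYN", client_isn, None)
    # 2. The server answers the client AND asks its own question in one segment.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # 3. The client answers the server's question.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
for sender, flags, seq, ack in segments:
    print(f"{sender:>6}: {flags:<7} seq={seq} ack={ack}")
```

Each acknowledgement number is the other side's sequence number plus one, which is how each party proves it received the other's opening message.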
Leaving through the gateway
Once the TCP connection is established, the HTTP request can be assembled into data and sent. This article assumes HTTP/1.1, which is still very widely deployed. In this version, the Connection: keep-alive header indicates that the TCP connection underlying the HTTP exchange should be kept alive after the response; in HTTP/1.1 this behavior is enabled by default.
Many articles on the Internet talk about HTTP long connections, which is a misnomer: what stays alive is the TCP connection. Turning on keep-alive for HTTP simply keeps the underlying TCP connection open so that later requests can reuse it.
An HTTP request message has three parts: the request line, the request headers, and the request body. The finer points of the HTTP protocol are not covered here because there are too many of them. HTTP sits at the application layer, so the message is handed down to the next layer as the payload of the packet to be sent.
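The three parts of a request are easy to see when a message is built and taken apart by hand. The sketch below is illustrative only: the helper names build_request and split_request, the path /login, and the host www.example.com are assumptions made for this example, and real code would normally rely on an HTTP library instead.

```python
# Build a raw HTTP/1.1 request and split it back into its three parts.

def build_request(method, path, host, body=b""):
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def split_request(raw):
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = build_request("POST", "/login", "www.example.com", b"user=alice")
request_line, headers, body = split_request(raw)
print(request_line)  # request line
print(headers)       # request headers
print(body)          # request body
```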
The next layer is the transport layer, which has two main protocols: TCP and UDP. HTTP uses TCP, and every TCP segment carries two ports: the source port and the destination port. HTTP requests usually go to destination port 80, for example. The transport layer adds the port information and passes the request down to the network layer.
The protocol at the network layer is IP, which adds the source IP address and the destination IP address (the destination IP is the address of the requested website, obtained from the DNS query).
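The layering so far can be pictured as nested envelopes: the HTTP bytes go inside a TCP segment, which goes inside an IP packet. The following sketch models only the fields mentioned in the text; the class names, the private address 192.168.1.23, the source port 50432, and the destination 192.0.2.10 are hypothetical values chosen for illustration, and real headers contain many more fields.

```python
from dataclasses import dataclass


@dataclass
class TcpSegment:
    src_port: int     # chosen by the client's operating system
    dst_port: int     # 80 for plain HTTP
    payload: bytes    # the HTTP message from the application layer


@dataclass
class IpPacket:
    src_ip: str
    dst_ip: str       # the address obtained from DNS
    payload: TcpSegment


def encapsulate(http_bytes, src_ip, src_port, dst_ip, dst_port=80):
    """Wrap the HTTP bytes in a TCP segment, then in an IP packet."""
    segment = TcpSegment(src_port, dst_port, http_bytes)
    return IpPacket(src_ip, dst_ip, segment)


packet = encapsulate(
    b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n",
    src_ip="192.168.1.23", src_port=50432,
    dst_ip="192.0.2.10",
)
print(packet)
```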
The operating system now knows the destination IP address and uses the subnet mask to decide whether that address is on the local area network. If it is not, the request has to go out through the gateway (the gateway’s IP address is usually configured by DHCP). How does the operating system find out where the gateway is? It broadcasts an ARP request on the LAN. Every device on the LAN receives the ARP request and compares the requested IP address with its own; the device that owns the address, here the gateway, replies. After this exchange the system has the gateway’s MAC address. The gateway’s MAC address and the local MAC address are then added to the request at the data link (MAC) layer, and finally the NIC sends the frame to the gateway.
A MAC address is used to locate a machine on the same LAN and is meaningful only within that LAN.
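The two decisions described above, "is the destination on my LAN?" and "which MAC address do I send the frame to?", can be sketched with Python's standard ipaddress module. The ARP part is only simulated with a dictionary standing in for the devices that would answer a broadcast; all addresses and the function names are hypothetical.

```python
import ipaddress


def next_hop_ip(src_ip, netmask, dst_ip, gateway_ip):
    """Return the IP the frame must be delivered to on the local link."""
    network = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dst_ip) in network:
        return dst_ip        # same LAN: deliver directly
    return gateway_ip        # different network: hand it to the gateway


# Stand-in for the devices on the LAN that would answer an ARP broadcast.
ARP_RESPONDERS = {
    "192.168.1.1": "aa:aa:aa:aa:aa:01",    # the gateway
    "192.168.1.42": "aa:aa:aa:aa:aa:42",   # another machine on the LAN
}


def arp_lookup(ip, arp_cache):
    """Resolve an IP on the LAN to a MAC address, caching the answer."""
    if ip not in arp_cache:
        # A real host would broadcast "who has <ip>?" here.
        arp_cache[ip] = ARP_RESPONDERS[ip]
    return arp_cache[ip]


arp_cache = {}
hop = next_hop_ip("192.168.1.23", "255.255.255.0", "192.0.2.10", "192.168.1.1")
gateway_mac = arp_lookup(hop, arp_cache)
print(hop, gateway_mac)
```

Because 192.0.2.10 is outside 192.168.1.0/24, the frame is addressed to the gateway's MAC address even though the packet's destination IP is still the remote server.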
Reaching the target server
When the request reaches the gateway, the gateway checks whether the destination MAC address of the frame is its own. If so, it accepts the frame and then looks at the destination IP address. If the destination IP address is not on one of its own LANs, it forwards the packet to the next connected gateway according to its routing rules. How gateways talk to each other and compute the best path is not covered here. Take a common home router as an example: its external IP address is assigned by the carrier, and outgoing packets usually have their source IP address rewritten to that address (NAT). The specific steps are:
- The gateway checks whether the destination IP address is on its own LAN. If not, it looks up the next gateway on the path and that gateway's MAC address, sets the destination MAC address of the frame to the next gateway's MAC address, and sets the source MAC address to its own. The destination IP address stays the same for the whole trip; with NAT, the source IP address is changed to the gateway's external IP address.
- After receiving the frame, the next gateway checks whether the destination MAC address matches its own. If it does, the gateway checks whether the destination IP address is on its LAN; if not, it repeats the previous step.
- These steps repeat until the packet reaches the gateway of the network where the target server resides.
- After receiving the packet, the target server's gateway sees that the destination IP address is within its own LAN, so it does not forward the packet to another gateway. Instead, it sends an ARP request on the LAN to find the target server. The target server replies, and the gateway delivers the request to it.
- The target server receives the frame and parses it. After confirming that the MAC and IP addresses are its own, it reads the port information and looks for the local program listening on that port.
- The request is handed through the port to that program, which parses the HTTP request and produces a response based on its content.
- The response travels back to the requester along the same kind of path (a NAT router remembers which internal host each outgoing connection belongs to), which completes the HTTP request.
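How a home router can "remember" where a reply belongs can be sketched as a translation table. This is a deliberately minimal model under stated assumptions: the class name NatRouter, the public address 203.0.113.5, and the starting port 40000 are made up, and a real NAT implementation also tracks the protocol, the remote endpoint, and timeouts, and reuses the mapping for packets of the same connection.

```python
class NatRouter:
    """Minimal source-NAT model: one table entry per outgoing connection."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.table = {}  # public port -> (private IP, private port)

    def outbound(self, src_ip, src_port):
        """Rewrite a private source address to the router's public one."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (src_ip, src_port)
        return self.public_ip, public_port

    def inbound(self, dst_port):
        """Map a reply arriving on a public port back to the private host."""
        return self.table[dst_port]


router = NatRouter("203.0.113.5")
public_addr = router.outbound("192.168.1.23", 50432)
private_addr = router.inbound(public_addr[1])
print(public_addr, private_addr)
```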
The next-hop gateway address is determined by the routing tables of the gateways (routers).
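A routing table lookup and a single forwarding step can be sketched as follows. The table contents, router names, and MAC addresses are hypothetical, and the selection rule shown (longest-prefix match, with 0.0.0.0/0 as the default route) is the usual convention rather than something this article spells out. Note what changes at each hop: only the MAC addresses, not the destination IP.

```python
import ipaddress

# Hypothetical routing table: network prefix -> name of the next-hop router.
ROUTING_TABLE = {
    "0.0.0.0/0": "isp-router",             # default route, matches everything
    "192.0.2.0/24": "datacenter-router",   # more specific route
}
NEXT_HOP_MACS = {
    "isp-router": "cc:cc:cc:cc:cc:01",
    "datacenter-router": "dd:dd:dd:dd:dd:01",
}


def pick_next_hop(routing_table, dst_ip):
    """Longest-prefix match: the most specific matching route wins."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routing_table.items()
        if dst in ipaddress.ip_network(prefix)
    ]
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


def forward_frame(frame, router_mac, next_hop_mac):
    """Rewrite only the link-layer addresses; the IP addresses are untouched."""
    forwarded = dict(frame)
    forwarded["src_mac"] = router_mac
    forwarded["dst_mac"] = next_hop_mac
    return forwarded


frame = {
    "src_mac": "aa:aa:aa:aa:aa:23", "dst_mac": "aa:aa:aa:aa:aa:01",
    "src_ip": "203.0.113.5", "dst_ip": "192.0.2.10",
}
hop = pick_next_hop(ROUTING_TABLE, frame["dst_ip"])
forwarded = forward_frame(frame, "bb:bb:bb:bb:bb:01", NEXT_HOP_MACS[hop])
print(hop, forwarded)
```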
Final words
The above is only an overview of the HTTP request process. In practice every step is quite complicated, and many topics, such as routing protocols and IP address allocation, are not covered in detail here.