Preface

It all starts with the classic question:

What happens when you type www.google.com/ into your browser and press Enter?

Whaaat!!! It’s the year 9012 and you’re still asking me that!

Parse the URL, run a DNS query, get the server’s IP address, send an HTTP request to that address; the server receives it, responds, and returns the page; the browser receives it, renders it, bingo!

You smile with satisfaction, another perfect answer.

Stop! I’m not done yet. How did the request you sent reach the target server and how did the response from the server come back?

You fall silent, then mumble something about TCP, the OSI model, routers, and so on.

The question then arises: how does the browser send the request?

The silence in the air deepens, so thick it could almost drip...

In this article, I will focus on the common TCP/IP model of network communication and explain the stages a request passes through. Whether it is an ordinary page visit or one of the HTTP requests common in our business code, it follows this same process. It is beneficial for front-end developers to understand the whole picture. Let’s go!

Contents

  1. What is the TCP/IP reference model
  2. A complete network communication process
  3. In-depth understanding of each stage

1. What is the TCP/IP reference model

Before getting into the specifics, let’s do a quick review of TCP/IP.

To keep network communication orderly, standards bodies began to formulate communication protocols, such as the early Network Control Protocol (NCP) and the well-known seven-layer OSI model. The Internet developed rapidly under the constraints of these protocols. Things change all the time, and technology naturally changes with them. TCP and IP were proposed in the 1970s, were adopted in the early 1980s on ARPANET (the forerunner of the Internet, a project of the United States Department of Defense), and quickly became the mainstream general-purpose protocols of Internet communication thanks to their excellent performance.

The protocol family takes its name from these two most important protocols, TCP and IP, which came first. Later, all kinds of other Internet communication protocols (HTTP, DNS, ARP, and so on) were brought into the same system, which is collectively called the “TCP/IP protocol family”.

In other words, the TCP/IP protocol family originally contained only TCP and IP, and is now a collection of protocols related to network communication. Alongside the protocols, the TCP/IP reference model was developed. It is an abstract, layered model: every protocol in the family is assigned to one of its four layers. The layers are independent of each other, each layer provides services to the layer above it, and together they do the main work of Internet communication.

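These four layers are the network access layer (also known as the data link layer or network interface layer), the network layer, the transport layer, and the application layer. They are usually mapped onto the more detailed seven-layer OSI model as follows: the application layer covers the OSI application, presentation, and session layers; the transport and network layers correspond to the OSI layers of the same names; and the network access layer covers the OSI data link and physical layers.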

In the TCP/IP reference model, each communication protocol belongs to one layer. The ones we commonly meet in the browser, HTTP (Hypertext Transfer Protocol) and DNS (Domain Name System), along with FTP (File Transfer Protocol) and SMTP (Simple Mail Transfer Protocol), are application layer protocols. TCP and UDP are transport layer protocols, and IP is a network layer protocol. You can look up more protocols on your own. In this article, we will analyze the communication process based on HTTP.

2. A complete network communication process

To give a better understanding of how an HTTP request is communicated, I will start from the four layers of TCP/IP, look at the communication entities or media that belong to each layer (the browser, routers, network cables, and so on), and see how each protocol works within these entities to get a request to the server and to bring the server’s response back.

On the whole, a complete communication is much like delivering a parcel. The goods are packed, labelled with an address and contact details, and passed from one courier station to the next until they reach the recipient. In a network request, the request data is the goods to be delivered; the IP address and MAC address are the delivery address; network cables are the roads; routers, switches, and hubs are the distribution hubs; and network protocols can be seen as the couriers and delivery policies. Like a parcel, network data can also be lost or damaged in transit (packet loss). So even a simple HTTP request needs many operating system components, communication protocols, and communication entities to take part before it can be delivered smoothly.

So, how does this delivery process work?

The basic principle is that, following the layer order, the sender passes data from top to bottom and the receiver passes it from bottom to top. At each layer, some necessary information is added to the data as a header, so that the data can be delivered intact to the next stage and handled according to the protocol’s requirements. When you send an HTTP request, you often set request headers, such as Accept: text/html, to tell the server that you would like an HTML page back rather than anything else.
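As a rough illustration of what such headers look like on the wire, here is a minimal Python sketch that assembles a GET request by hand. The helper name build_http_request is made up for this example, and a real browser sends many more headers. Accept is the request header that states which response format the client prefers.

```python
def build_http_request(host: str, path: str = "/") -> bytes:
    """Assemble a minimal HTTP/1.1 GET request as raw bytes (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # which site on this server we want
        "Accept: text/html",     # the client would like an HTML page back
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_http_request("www.google.com")
print(request.decode("ascii"))
```

Everything before the blank line is header information; a GET request like this one has no body.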

HTTP headers may also carry the protocol version, request path, request method, and so on, but that is only the application layer. Over the whole communication, the data passes through four layers and successively gains an HTTP header, a TCP header, an IP header, and an Ethernet header. Wrapped up in this way, it leaves your network card and travels through network cables, switches, and routers to the subnet where the target server lives. The server then does the opposite, unwrapping the data layer by layer according to the headers that were added. Finally the request reaches the server program for processing (which is the responsibility of the server code), and the response data is returned. The overall process is shown as follows:

Data is passed from layer to layer, and the unit being passed can be loosely called a packet. Packets have different names at different layers. At the application layer, the one we know best, it is called an HTTP request message; with its headers added, it is handed to the transport layer. There a TCP header is added and it is called a segment. At the network layer an IP header is added and it becomes an IP packet. At the last layer, the network access layer, the already well-wrapped data gains an Ethernet header and also a trailer, and becomes a frame. This process is called encapsulation. Whatever the name, data moves between layers in the form of packets, and when there is a lot of data it can be split across several packets.
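The wrapping and unwrapping can be sketched in a few lines of Python. This is a toy model only: the bracketed labels stand in for real binary headers, and all names here are invented for the illustration.

```python
# Placeholder "headers"; real ones are binary structures with many fields.
TCP_HEADER = b"[TCP]"
IP_HEADER = b"[IP]"
ETH_HEADER = b"[ETH]"
ETH_TRAILER = b"[FCS]"  # stands in for the frame check sequence at the tail


def encapsulate(http_message: bytes) -> bytes:
    """Sender side: walk down the layers, adding one header per layer."""
    segment = TCP_HEADER + http_message          # transport layer: segment
    packet = IP_HEADER + segment                 # network layer: IP packet
    frame = ETH_HEADER + packet + ETH_TRAILER    # network access layer: frame
    return frame


def strip(data: bytes, header: bytes, trailer: bytes = b"") -> bytes:
    """Remove an expected header (and optional trailer) from the data."""
    if not (data.startswith(header) and data.endswith(trailer)):
        raise ValueError("unexpected header or trailer")
    return data[len(header):len(data) - len(trailer)]


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: walk up the layers, removing one header per layer."""
    packet = strip(frame, ETH_HEADER, ETH_TRAILER)
    segment = strip(packet, IP_HEADER)
    return strip(segment, TCP_HEADER)


message = b"GET / HTTP/1.1\r\n\r\n"
frame = encapsulate(message)
print(frame)
print(decapsulate(frame) == message)
```

The receiver peels the layers off in exactly the reverse order in which the sender added them.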

Let me extend this process to show how the protocols mentioned earlier take part at the various layers of TCP/IP after you type www.google.com/:

3. What happened at each stage

1. The DNS query

A URL is simply a location identifier for an Internet resource, designed to make web applications easier for humans to identify. The browser first parses the URL and obtains the protocol (HTTPS), the domain name (www.google.com), and the path (/). Since the path is just /, with nothing after it, the request is for Google’s default home page. Once parsing is done, DNS takes the stage.
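A minimal sketch of this parsing step, using Python’s standard urllib.parse module. The parse_url helper and its default of https when no scheme is typed are assumptions made for the example; real browsers apply more elaborate rules.

```python
from urllib.parse import urlsplit


def parse_url(typed: str, default_scheme: str = "https"):
    """Split what the user typed into (scheme, host, path)."""
    if "://" not in typed:
        # Assumption for this sketch: fall back to HTTPS when no scheme is given.
        typed = f"{default_scheme}://{typed}"
    parts = urlsplit(typed)
    return parts.scheme, parts.hostname, parts.path or "/"


print(parse_url("www.google.com/"))
```

For the address in the classic question, this yields the scheme https, the host www.google.com, and the path /.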

DNS, the Domain Name System, is responsible for finding the IP address that corresponds to the domain name in the URL. Every operating system has a built-in socket library, and a resolver in that library takes charge of this process. The lookup involves many steps (although, of course, we do not have to wait long for it). For those of you who are interested, check out How DNS Works, a resource listed at the end of this article, which shows the process in lively and entertaining animation.

After the DNS query, the browser has the IP address of the requested resource. An IP address, or Internet Protocol address, is a numeric identifier assigned to a device on the network. Any device connected to the Internet, such as our computers, phones, and routers, has one. It is the identifier our request data needs in order to locate the server on the network.
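From application code, the resolver is reached through the socket library. The sketch below uses Python’s socket.getaddrinfo, which hands the lookup to the operating system; the resolve helper is a name invented for this example.

```python
import socket


def resolve(hostname: str) -> list:
    """Ask the operating system's resolver for the IP addresses of a host."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]          # first element of the socket address is the IP
        if ip not in addresses:   # drop duplicates, keep the resolver's order
            addresses.append(ip)
    return addresses


# A numeric address needs no lookup and is returned as-is.
print(resolve("127.0.0.1"))
```

Calling resolve("www.google.com") on a machine with network access returns whichever addresses the resolver gives at that moment, so the result varies by time and place.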

2. Query the MAC address

IP addresses are used at the network layer, but when data is transmitted on an actual data link, the computers on the same link must be identified by another address: the MAC address, also known as the physical address.

In network communication, we usually deal with three kinds of address: the IP address, the MAC address, and the port number.

  1. IP address: identifies a host or router on the interconnected network.
  2. MAC address: the physical address of each network interface card (NIC).
  3. Port number: identifies different applications on the same host, also known as the program address.

Therefore, a single communication is identified by five values: source IP address, destination IP address, protocol, source port number, and destination port number.
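To see why all five values are needed, here is a small Python sketch. The FiveTuple type is invented for the example, and the addresses are taken from ranges reserved for documentation, not real hosts.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """The five values that together identify one communication."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


# Two browser tabs talking to the same server differ only in source port.
tab_a = FiveTuple("192.0.2.10", "198.51.100.7", "TCP", 50001, 443)
tab_b = FiveTuple("192.0.2.10", "198.51.100.7", "TCP", 50002, 443)

connections = {tab_a: "tab A", tab_b: "tab B"}
print(len(connections))
```

Because the tuples differ in one field, the operating system can keep the two conversations apart even though the hosts and the server port are the same.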

Anyway, a MAC address is written into every network card when it is manufactured and is meant to be globally unique. So how do we get the MAC address of the server? For this, another protocol steps in: ARP.

ARP, the Address Resolution Protocol, obtains the physical address of a device from its IP address. It works by broadcast: the sender asks every host on the same Ethernet, and the host whose IP address matches the one in the request replies with its MAC address. If an ARP request were broadcast every time, the network would fill up with ARP packets, so each host keeps an ARP cache of recently used MAC addresses. The host checks the cache first and only broadcasts if the entry is missing. ARP only reaches hosts on the same link, so for a server on another network the host looks up the MAC address of its gateway router instead, and each router along the way does the same for the next hop. Once the MAC address is known, the packet, with the IP header added at the network layer, is handed to the network access layer. In fact, you find caches everywhere in network communication, such as the HTTP cache, DNS cache, and ARP cache, all there to improve efficiency and performance. It is worth noting that IPv6 uses NDP (Neighbor Discovery Protocol) for this job instead of ARP.
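The cache-then-broadcast idea can be sketched as follows. This is a simplified model, not how any operating system implements it: the ArpCache class, the LAN table, and the fake_broadcast function are all invented for the example, and real caches also expire their entries after a while.

```python
class ArpCache:
    """Look up MAC addresses, broadcasting only on a cache miss."""

    def __init__(self, broadcast):
        self._table = {}             # IP address -> MAC address
        self._broadcast = broadcast  # callable standing in for a real ARP broadcast
        self.broadcasts_sent = 0

    def lookup(self, ip: str):
        if ip in self._table:        # cache hit: no network traffic needed
            return self._table[ip]
        self.broadcasts_sent += 1    # cache miss: ask everyone on the link
        mac = self._broadcast(ip)
        if mac is not None:
            self._table[ip] = mac    # remember the answer for next time
        return mac


# Hypothetical hosts on the local Ethernet.
LAN = {
    "192.0.2.1": "aa:bb:cc:00:00:01",
    "192.0.2.20": "aa:bb:cc:00:00:14",
}


def fake_broadcast(ip: str):
    """Only the host that owns the IP address answers."""
    return LAN.get(ip)


cache = ArpCache(fake_broadcast)
first = cache.lookup("192.0.2.1")
second = cache.lookup("192.0.2.1")
print(first, second, cache.broadcasts_sent)
```

The second lookup is answered from the cache, so only one broadcast is sent.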

3. Data transfer: sockets to the rescue

As mentioned earlier, the operating system contains a protocol stack that handles much of the network communication on the local machine; the resolver used for the DNS query is part of it. Once the IP address and MAC address are available, the conditions for data transfer are in place. The browser delegates the work to the operating system’s protocol stack by calling components in the socket library to create a socket. A socket can be understood as the entrance and exit of a data channel, and using one is an “open, read/write, close” process.

During data transfer, both the client and the server create a socket. The client then calls the connect component in the socket library, passing it the descriptor (a token that identifies the socket), the server’s IP address, and the port number. This establishes a transport channel between the browser and the target server (there is no real physical channel, of course; the data still crosses any number of gateways, routers, and firewalls). The famous TCP three-way handshake happens at this stage, and we will explain what TCP does later.

Once the channel is established, the next step is reading and writing data. Our program code cannot drive the socket directly; it still delegates the write and read operations to components in the socket library. When the exchange is over, one side (here, the server) calls the close component in the socket library to disconnect. When the browser detects this, it closes its end as well, and the data transfer is complete.
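The “open, read/write, close” cycle can be tried out on a single machine. In this sketch a tiny echo server on the loopback address stands in for the web server; the function names are invented for the example, and a real server would handle many connections and far larger messages.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept one connection, answer it, then close."""
    conn, _addr = listener.accept()
    with conn:                            # leaving this block closes the connection
        data = conn.recv(1024)            # read
        conn.sendall(b"echo: " + data)    # write


def echo_round_trip(message: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # open
    client.connect(("127.0.0.1", port))   # the TCP handshake happens in here
    client.sendall(message)               # write
    chunks = []
    while True:
        chunk = client.recv(1024)         # read
        if not chunk:                     # empty read: the server has closed
            break
        chunks.append(chunk)
    client.close()                        # close our end as well

    server.join()
    listener.close()
    return b"".join(chunks)


print(echo_round_trip(b"hello"))
```

Note how the client learns that the server has disconnected: recv returns an empty result, and only then does the client close its own socket.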

As you can see, behind the application code, the operating system’s built-in libraries and the Internet protocols work hand in hand, like the left and right arms of network communication, and carry the whole process.

4. The three-way handshake

TCP and UDP are the two main protocols at the transport layer. The difference is this: TCP is a connection-oriented (the socket channel above, in effect), reliable, stream-based protocol, while UDP is unreliable and takes a “best effort” approach. As a rule of thumb, applications such as browsers and mail typically use TCP to transfer data, while short exchanges such as DNS queries use UDP. We’ll focus on TCP.

The reason you set up a socket connection before sending data is that TCP is a connection-oriented transport protocol. It does not care what you send; it treats everything as a stream of bytes of some length.

To make data transmission reliable, TCP performs a three-way handshake when establishing a socket connection.

Here’s a three-way handshake flow chart:

SYN in the figure is short for Synchronize Sequence Numbers; it marks the starting sequence number of a data transfer.

ACK is short for acknowledgement.

In layman’s terms, it’s a process like this:

First handshake: The client attempts to connect and sends a SYN packet (synchronize sequence number, seq = J) to the server, then enters the SYN_SENT state and waits for the server to confirm: I’m ready to go!

Second handshake: The server receives the SYN packet from the client, acknowledges it (ack = J + 1), and sends its own SYN (seq = K) in the same packet, that is, a SYN + ACK packet. The server then enters the SYN_RECV state: request received, ready whenever you are.

Third handshake: The client receives the SYN + ACK packet from the server and sends back an ACK (ack = K + 1). Once this packet has been sent, the client and the server enter the ESTABLISHED state and the three-way handshake is complete: OK, got it, let’s start sending data.
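The three steps can be traced with a small simulation. It models only the bookkeeping of sequence and acknowledgement numbers, with packets as plain dictionaries; nothing is sent over a network, and the function name is invented for the example.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Simulate the handshake; returns the final states and the three packets."""
    client_state, server_state = "CLOSED", "LISTEN"

    # 1. client -> server: SYN with seq = J
    syn = {"flags": "SYN", "seq": client_isn}
    client_state = "SYN_SENT"

    # 2. server -> client: SYN + ACK with seq = K, ack = J + 1
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    server_state = "SYN_RECV"

    # 3. client checks the acknowledgement, then replies: ACK with ack = K + 1
    if syn_ack["ack"] != client_isn + 1:
        raise ValueError("server acknowledged the wrong sequence number")
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": syn_ack["seq"] + 1}
    client_state = "ESTABLISHED"

    # The server considers the connection open only after this final ACK.
    if ack["ack"] != server_isn + 1:
        raise ValueError("client acknowledged the wrong sequence number")
    server_state = "ESTABLISHED"

    return client_state, server_state, [syn, syn_ack, ack]


client_state, server_state, packets = three_way_handshake(client_isn=100, server_isn=300)
for packet in packets:
    print(packet)
print(client_state, server_state)
```

Here J is 100 and K is 300, so the server acknowledges 101 and the client acknowledges 301. Real initial sequence numbers are chosen unpredictably rather than fixed like this.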

You may find this rather troublesome. Why not confirm just once: Did you get it? Got it, go ahead. Why do we need three steps?

In fact, three steps is the minimum needed to establish a connection reliably. Imagine this situation: the client sends a connection request, but network problems delay it on the way to the server. The client gives up waiting, connects again, finishes its transfer, and the connection is closed. Only then does the delayed request arrive. The server takes it for a new connection request from the client, agrees, and sets up a new connection. But the client never asked for this connection, so it ignores the server’s acknowledgement and sends nothing. With only two steps, the server would now wait for data that never comes, wasting its resources. With the third step, the server treats the connection as established only after the client confirms, so the stale request is harmless.

I saw another way of putting it on Zhihu: the fundamental reason for the three-way handshake is that the transmission channel is unreliable and you never know when the network will go wrong. To build reliable transmission on top of it, three steps is the theoretical minimum.

When TCP transmits data, it does not throw every piece onto the network the moment it arrives. It stores the data in a send buffer and waits for the next piece from the application. It works this way because sending each scrap of data immediately would produce many tiny packets, leaving the channel underused and network efficiency low. It would be like having a wide road and insisting on crossing it in single file, as if it were a log bridge.

How does TCP decide when to send? It has a mechanism for this, based roughly on two things: how much data has accumulated relative to the length of a network packet, and how long it has been since the application handed over data. Even if there is very little data, TCP will not wait forever for more to arrive.

The other situation is that the data to send is very large. It is then split into pieces in the buffer and sent piece by piece. When splitting, the TCP module notes at which byte each piece starts and ends and records this in the TCP header for the other side to check. Based on these sequence numbers, the receiver can tell whether any data is missing.
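A simplified sketch of this splitting and gap detection follows. Each piece is represented as a pair of starting sequence number and payload; the function names, the piece size of 4 bytes, and the starting number 1000 are all chosen just for the illustration.

```python
def split_into_segments(data: bytes, max_size: int, initial_seq: int = 0):
    """Cut data into pieces, tagging each with the sequence number of its first byte."""
    return [
        (initial_seq + offset, data[offset:offset + max_size])
        for offset in range(0, len(data), max_size)
    ]


def find_gaps(received, initial_seq: int, total_length: int):
    """Return the (start, end) byte ranges that the receiver is still missing."""
    gaps = []
    expected = initial_seq
    for seq, payload in sorted(received):
        if seq > expected:                      # bytes between expected and seq never arrived
            gaps.append((expected, seq))
        expected = max(expected, seq + len(payload))
    end = initial_seq + total_length
    if expected < end:                          # the tail is missing
        gaps.append((expected, end))
    return gaps


data = bytes(range(10))
segments = split_into_segments(data, max_size=4, initial_seq=1000)
print([(seq, len(payload)) for seq, payload in segments])

# Pretend the middle segment was lost on the way.
arrived = [segments[0], segments[2]]
print(find_gaps(arrived, initial_seq=1000, total_length=len(data)))
```

With ten bytes cut into pieces of four, the segments start at 1000, 1004, and 1008. Dropping the middle one leaves the receiver knowing exactly which byte range, 1004 up to 1008, needs to be sent again.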

Waiting for an ACK after every single packet would also waste resources. To use that idle time, the sender can keep sending further packets without waiting for each ACK. But this creates a new problem: sending blindly can exceed what the receiver is able to process, so that data is dropped and has to be sent again, which does more harm than good. To solve this, TCP uses a sliding window strategy, which works like this:

The sliding window is a flow control strategy. Imagine all the data as one sequence; the window is the amount of data the receiver can currently handle. The receiver writes its window size into the header of each ACK packet to inform the sender, and the sender adjusts how much it sends accordingly. Because transmission is ongoing, the window size keeps changing. This achieves flow control, whose whole purpose is to stop the sender from sending faster than the receiver can process.
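The effect of a changing window can be shown with a deliberately simplified simulation. It counts whole segments rather than bytes, assumes every burst is acknowledged in full before the next one, and takes the advertised window sizes as a given list; real TCP is considerably more subtle.

```python
def simulate_window(total_segments: int, advertised_windows: list) -> list:
    """Return the bursts of segment numbers sent under a changing receive window."""
    if not advertised_windows or any(w <= 0 for w in advertised_windows):
        raise ValueError("this sketch needs positive window sizes")
    bursts = []
    base = 0                            # first segment not yet acknowledged
    step = 0
    window = advertised_windows[0]
    while base < total_segments:
        # Send as many segments as the current window allows.
        burst = list(range(base, min(base + window, total_segments)))
        bursts.append(burst)
        base += len(burst)              # the receiver acknowledges the whole burst...
        step += 1
        # ...and advertises a new window in that ACK (reuse the last value if we run out).
        window = advertised_windows[min(step, len(advertised_windows) - 1)]
    return bursts


bursts = simulate_window(total_segments=10, advertised_windows=[3, 2, 4])
for burst in bursts:
    print(burst)
```

With windows of 3, then 2, then 4, the ten segments go out as bursts of three, two, four, and finally the one remaining segment. The sender never has more data in flight than the receiver said it could take.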

The road ahead

We just mentioned sending and receiving, but how does data find its way to a server on the vast Internet, with so many gateways, routers, switches, and so on between the browser and the server?

This is where network layer protocols come in.

In fact, all of this was prepared before the request was sent. We used DNS and ARP to obtain an IP address and a MAC address, and these two addresses guide the packet to the server. The IP address picks out the target host among all the hosts on the network. It contains a network ID and a host ID: the former distinguishes network segments, while the latter identifies a host within a segment. When the data reaches a router, the router forwards it toward the right network segment based on the network ID.
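The split into network ID and host ID can be computed with Python’s standard ipaddress module. In this sketch the /25 prefix length, which says how many leading bits form the network ID, and the example addresses (from a range reserved for documentation) are assumptions chosen for illustration.

```python
import ipaddress


def split_address(cidr: str):
    """Return (network ID, host ID) for an address written like '192.0.2.130/25'."""
    interface = ipaddress.ip_interface(cidr)
    network = interface.network
    # The host ID is whatever remains after masking off the network bits.
    host_id = int(interface.ip) & int(network.hostmask)
    return str(network.network_address), host_id


def same_segment(cidr_a: str, cidr_b: str) -> bool:
    """Two hosts share a segment when their network IDs are equal."""
    return ipaddress.ip_interface(cidr_a).network == ipaddress.ip_interface(cidr_b).network


print(split_address("192.0.2.130/25"))
print(same_segment("192.0.2.130/25", "192.0.2.200/25"))
print(same_segment("192.0.2.130/25", "192.0.2.5/25"))
```

The first address falls in network 192.0.2.128 with host ID 2. The address ending in 200 shares that network; the one ending in 5 belongs to network 192.0.2.0, so reaching it means going through a router.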

At this point you may ask: since the IP address can find the target server, why do we also need a MAC address?

Let me explain:

Within a single network segment, knowing the MAC address is enough to find the target precisely. But the Internet is enormous; network operators divide it into many segments, each with multiple subnets. With only a MAC address, you would have to check it against every device on the network, which is a huge job. This is where the IP address shows its cleverness: it is used to find the way, like the province and city on a parcel, whereas the MAC address is more like a person’s name. Finding one particular Xiao Qiang in the whole of China is far too hard, but once you have narrowed it down to a village or a neighbourhood, it is easy.

Another reason is that switches only understand MAC addresses. Just as routers use IP addresses to route network requests, switches use MAC addresses to work out where a network card is.

Routers, switches, network cables, and optical fiber operate at the bottom of the four layers: the TCP/IP network access layer, which corresponds to the data link layer and physical layer of the OSI model. At this layer, the data gains an Ethernet header, is encapsulated into a frame, and is forwarded from router to router. Between routers, PPP (Point-to-Point Protocol) is commonly used; it is a network access layer protocol that sends frames with frame delimiters before and after them. Finally, on reaching the subnet where the server resides, the frame is handed up to the network layer for upward delivery.
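Frame delimiters raise one practical question: what if the delimiter byte appears inside the data? The sketch below shows the byte-stuffing approach used by PPP in its HDLC-like framing, where 0x7E marks the frame boundaries and 0x7D is an escape byte. It covers only the delimiting; a real PPP frame also carries address, control, and protocol fields and a checksum, all omitted here, and the function names are invented for the example.

```python
FLAG = 0x7E    # marks the start and end of a frame
ESCAPE = 0x7D  # announces that the next byte has been altered


def frame(payload: bytes) -> bytes:
    """Wrap the payload in flag bytes, escaping any flag or escape bytes inside it."""
    body = bytearray()
    for byte in payload:
        if byte in (FLAG, ESCAPE):
            body += bytes([ESCAPE, byte ^ 0x20])  # escape, then the byte with bit 5 flipped
        else:
            body.append(byte)
    return bytes([FLAG]) + bytes(body) + bytes([FLAG])


def unframe(data: bytes) -> bytes:
    """Strip the flag bytes and undo the escaping."""
    if len(data) < 2 or data[0] != FLAG or data[-1] != FLAG:
        raise ValueError("missing frame delimiter")
    payload = bytearray()
    escaped = False
    for byte in data[1:-1]:
        if escaped:
            payload.append(byte ^ 0x20)           # restore the original byte
            escaped = False
        elif byte == ESCAPE:
            escaped = True
        else:
            payload.append(byte)
    return bytes(payload)


original = b"\x01\x7e\x02"
framed = frame(original)
print(framed.hex())
print(unframe(framed) == original)
```

The 0x7E inside the payload is sent as the pair 0x7D 0x5E, so the receiver never mistakes it for the end of the frame.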

The upward journey, as described earlier, is the reverse of the downward one: the header information is parsed layer by layer until the data finally reaches the server program.

We can see this clearly in the earlier figure of the whole process. Once the request data reaches the router of the server’s subnet, it travels bottom-up, the mirror image of how the client sent it: at each layer, the header that was added on the way down is parsed and removed, and the rest is handed to the layer above. With the help of the entire TCP/IP protocol stack, our request data finally reaches the target server through the server’s network card. The server’s operating system also has a protocol stack, which uses the port number to find the corresponding server program and hands the data to it for processing.

The server processes the request according to our request headers and produces a result; the response then sets off on the same kind of journey back to the browser.

The browser receives the response, parses it, renders it, and presents the page to us.

References

  1. How DNS works
  2. Xie Xiren, Computer Network
  3. Hogenqin, How Networks Are Connected
  4. what-happens-when…
  5. TCP three handshakes and four breakups are understood in colloquial terms