TCP/IP is a layered network model made up of a series of protocols.
The network layer
Why layering?
Because the network is unstable
- Application layer: mainly responsible for application protocols such as HTTP for web browsers, FTP for file transfer, SMTP for email, and DNS for the domain name system.
- The HTTP protocol
- Transport layer: mainly responsible for transmitting the application layer's data packets.
- TCP provides reliable transmission and reassembles data blocks (suited to larger data).
- UDP is particularly efficient; live streaming, games, and similar applications do not need lost data to be retransmitted.
- Network layer: primarily the IP protocol, responsible for addressing (finding the location of the target device).
- The IP protocol sends and receives network data in minimal units (packets).
- Data link layer: mainly responsible for converting between digital data and physical binary signals.
- Ethernet, Wi-Fi, etc.
Data transfer process
Data link layer > network layer > transport layer > application layer: the received data is decoded layer by layer until the browser finally obtains the index.html sent by the target device.
The role of the four-layer network protocols
TCP/IP refers to a protocol family developed for the Internet, not just TCP and IP.
- The sending end passes data from the upper layers down to the lower layers, and each layer's protocol adds its own information as a header in front of the data.
- The receiving end works from the bottom up: each layer parses the data received from the layer below and removes that layer's header before passing the rest to the layer above.
- After this layer-by-layer encapsulation and decapsulation, the application layer finally gets the data it needs.
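The wrapping and unwrapping of headers can be sketched in a few lines of Python. This is only a toy model: the bracketed header strings and the function names `encapsulate` and `decapsulate` are made up for illustration and do not correspond to real protocol headers.

```python
# Toy model of layered encapsulation: each layer prepends its own header
# on the way down and strips it on the way up. The header strings are
# hypothetical placeholders, not real protocol headers.
LAYERS = ["transport", "network", "link"]  # order applied when sending


def encapsulate(payload: bytes) -> bytes:
    """Sender side: wrap application data layer by layer, top to bottom."""
    frame = payload
    for layer in LAYERS:
        header = f"[{layer}]".encode()
        frame = header + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: remove headers bottom to top, outermost first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


request = b"GET /index.html"
on_the_wire = encapsulate(request)
print(on_the_wire)               # b'[link][network][transport]GET /index.html'
print(decapsulate(on_the_wire))  # b'GET /index.html'
```

The outermost header belongs to the lowest layer, which is why the receiver has to process the layers in the reverse order of the sender.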
OSI 7-layer network model and TCP/IP 4-layer network model
- The OSI seven-layer model came out of standardization work (ISO) and is used mostly in academic settings. As the number of layers suggests, it divides the network in finer detail, which also makes it more complex to implement, so its value is mainly academic. The TCP/IP network model grew out of working implementations backed by the major players in the computer industry and can be viewed as a simpler counterpart of the OSI seven-layer model. Its specifications are open, so users can use it directly.
- The four-layer model is the de facto industry standard. It merges some of the OSI layers (the application, presentation, and session layers into the application layer; the data link and physical layers into the network interface layer) or distributes their functions to other layers, and it is the model most widely used in practice.
- The five-layer model is a compromise often adopted when teaching the principles of computer networks: the OSI seven-layer model has too many layers and too much detail for learning, while the four-layer model is too simplified, so the five-layer model combines the strengths of OSI and TCP/IP. It is concise and still explains the concepts clearly (learning the OSI model and the TCP/IP four-layer model side by side is impractical).
As you can see from the figure, the bottom two layers of the four-layer model are given different names in different sources, so it is all the more important to understand what these two layers do.
- The first layer, the host-to-network layer, also known as the network interface layer, handles data exchange between the host and the network: it receives IP datagrams from the network interconnection layer and sends them out over the network, or receives physical frames from the network, extracts the IP datagrams, and hands them to the network interconnection layer. It corresponds to the data link layer and physical layer of the OSI seven-layer model.
In the TCP/IP four-layer model, the protocols of this layer are not defined. Instead, each network taking part in the interconnection uses its own physical layer and data link layer protocols. Why “each network”? Because the physical and data link layers deal directly with hardware, and different network types often use different underlying protocols. Classified by transmission medium, for example, there are wired networks (twisted pair, coaxial cable), wireless networks, optical networks, and so on; since their signals and frequencies differ, the protocols they use to interpret data necessarily differ too.
- The network interconnection layer, also known as the internet layer, handles communication between networks, as the name implies. It is the core of the entire TCP/IP protocol stack, and its job is to deliver packets to the target network or host. This layer also has to deal with congestion control.
Appendix:
Differences between TCP and UDP
- TCP is connection-oriented while UDP is connectionless, that is, no connection needs to be established before sending data.
- TCP ensures data correctness and ordering; UDP may lose packets and does not guarantee order. In other words, data sent over a TCP connection arrives error-free, without loss, without duplication, and in order, while UDP offers only best-effort delivery with no guarantee of reliability. TCP achieves reliable delivery through checksums, retransmission control, sequence numbers, sliding windows, and acknowledgments: lost packets are retransmitted, and segments that arrive out of order are put back in sequence.
- TCP supports only one-to-one communication; UDP supports one-to-one and one-to-many.
- TCP is byte-stream oriented, UDP is message (datagram) oriented. UDP has better real-time behaviour and lower overhead than TCP. UDP has no congestion control, so network congestion does not reduce the sender's transmission rate; this suits real-time applications such as IP telephony and video conferencing, which tolerate some packet loss.
- TCP requires more system resources, while UDP requires less.
- TCP’s header is at least 20 bytes, larger than UDP’s 8 bytes.
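The header sizes in the last point can be checked by writing the fixed header fields down as `struct` layouts. This is a sketch of the field widths only (the TCP layout assumes a header without options); the names `UDP_HEADER` and `TCP_HEADER` are chosen here for illustration.

```python
import struct

# Fixed-size header layouts in network byte order ("!"), which also
# disables padding between fields.
# UDP: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")
# TCP without options: source port, destination port, sequence number,
# acknowledgment number, data offset + reserved bits, flags, window,
# checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

print(UDP_HEADER.size)  # 8
print(TCP_HEADER.size)  # 20
```

The extra 12 bytes in TCP are mostly the sequence number, acknowledgment number, and window fields, which are exactly what its reliability and flow control depend on.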
A TCP connection
Connection: refers to the TCP connection
- HTTP is stateless
- TCP is a stateful interaction
Three-way handshake and four-way wave
There are a few fields in the figure above that need to be highlighted:
- (1) Sequence number: seq, 32 bits. It identifies the position in the byte stream sent from the TCP source to the destination; the sender marks it on the data it sends.
- (2) Acknowledgment number: ack, 32 bits. The acknowledgment number field is valid only when the ACK flag bit is 1. During the handshake, ack = the received seq + 1. (Do not confuse it with the ACK flag bit; the two are not the same thing.)
- (3) Flag bits: 6 in total, URG, ACK, PSH, RST, SYN, and FIN, with the following meanings:
- SYN: requests a connection and initializes the sequence number in the sequence number field. It is set to 1 when establishing a connection.
- ACK: indicates that the acknowledgment number field is valid. It is set to 1 in almost every segment after the initial SYN.
- FIN: releases a connection.
- PSH: asks the receiver to deliver the data to the application layer as soon as possible instead of leaving it in the TCP buffer.
- RST: resets the connection, for example when the peer requires the connection to be re-established.
- URG: indicates that the urgent pointer field is valid.
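As a small illustration of how these six flag bits sit in the header, the sketch below decodes a flags byte into flag names. The bit positions follow the classic TCP header layout; the names `FLAG_BITS` and `decode_flags` are made up for this example.

```python
# Bit masks of the six classic TCP flags within the flags byte of the
# TCP header, from the least significant bit upwards.
FLAG_BITS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(flags_byte: int) -> list:
    """Return the names of the flags that are set in the given byte."""
    return [name for name, bit in FLAG_BITS.items() if flags_byte & bit]


print(decode_flags(0x02))  # ['SYN']
print(decode_flags(0x12))  # ['SYN', 'ACK']
print(decode_flags(0x11))  # ['FIN', 'ACK']
```

The values 0x02, 0x12, and 0x11 are the flag combinations seen in a connection request, the server's reply to it, and a connection release segment that also acknowledges earlier data.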
TCP connection establishment: three-way handshake
First handshake: the client sends a connection request segment to the server with the SYN flag in the header set to 1 and a randomly generated initial sequence number seq=x. The TCP client process then enters the SYN-SENT state. According to TCP, a SYN segment (SYN=1) cannot carry data but still consumes one sequence number. This first handshake tells the server that the client wants to establish a connection.
Second handshake: after receiving the request segment, the TCP server sends an acknowledgment segment if it agrees to the connection. In this segment the flags are SYN=1 and ACK=1, the acknowledgment number is ack=x+1, and the server randomly initializes its own sequence number seq=y. The TCP server process then enters the SYN-RCVD state. This segment also carries no data but again consumes one sequence number. With both the SYN (establish a connection) and ACK (acknowledge) flags set, it asks the client whether it is ready.
Third handshake: after receiving the server’s acknowledgment, the TCP client process sends its own acknowledgment to the server, with the flag ACK=1, acknowledgment number ack=y+1, and sequence number seq=x+1. The TCP connection is now established and the client enters the ESTABLISHED state. According to TCP, an ACK segment may carry data; if it carries none, it does not consume a sequence number. Here the client is saying “I’m ready”.
Finally, write it out once from memory:
1. Client: "I want to send you messages"
2. Server: "OK, and I want to send you messages too"
3. Client: "OK"
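The bookkeeping of sequence and acknowledgment numbers during the handshake can be traced with a short simulation. It models only the numbers and flags, not real sockets; `three_way_handshake` and the dictionary keys are illustrative names, and `random` merely stands in for however a real stack picks its initial sequence numbers.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three handshake segments as dicts, given the initial
    sequence numbers x (client) and y (server)."""
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": x}
    # The SYN consumes one sequence number, so the server acknowledges x + 1.
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1,
               "seq": y, "ack": (x + 1) % SEQ_SPACE}
    # The server's SYN also consumes one, so the client acknowledges y + 1.
    ack = {"from": "client", "SYN": 0, "ACK": 1,
           "seq": (x + 1) % SEQ_SPACE, "ack": (y + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


# Stand-in initial sequence numbers; real stacks choose them unpredictably.
x = random.getrandbits(32)
y = random.getrandbits(32)
for segment in three_way_handshake(x, y):
    print(segment)
```

Each side ends up having told the other its own initial sequence number and having confirmed the other's, which is the minimum both directions need before data can flow.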
Why a three-way handshake
In the fourth edition of Computer Network by Xie Xiren, the purpose of the “three-way handshake” is “to prevent an invalid connection request segment from suddenly reaching the server and causing errors”. Another classic book, Computer Networks, describes the purpose of the three-way handshake as solving the problem of “delayed duplicate packets in the network”.
Xie Xiren’s book also gives an example: “An ‘invalid connection request segment’ arises as follows. The first connection request segment sent by the client is not lost but is held up at a network node for a long time, so it reaches the server only some time after the connection has been released. By then it is an invalid segment. However, when the server receives it, the server mistakes it for a new connection request from the client and sends an acknowledgment segment agreeing to establish a connection. If the ‘three-way handshake’ were not used, a new connection would be established as soon as the server sent its acknowledgment. Since the client has not sent a connection request, it ignores the server’s acknowledgment and sends no data. The server, however, assumes that the new transport connection has been established and keeps waiting for data from the client, so many of its resources are wasted. The three-way handshake prevents this. In the example, the client does not acknowledge the server’s acknowledgment, and since the server receives no acknowledgment, it knows that the client has not requested a connection.”
In summary, it prevents the server from wasting resources by waiting.
Closing a TCP connection: four-way wave
First wave: the client sends a connection release segment with FIN=1 and sequence number seq=u, then enters the FIN-WAIT-1 state. As with a SYN, a FIN consumes one sequence number.
Second wave: after receiving the connection release segment, the server replies with an acknowledgment segment with ACK=1, ack=u+1, and its own sequence number seq=v, and enters the CLOSE-WAIT state. The TCP server notifies the higher-level application process that the connection in the client-to-server direction is released. The connection is now half-closed: the client has no more data to send, but if the server sends data, the client still accepts it. This state lasts for a while, namely the whole duration of the CLOSE-WAIT state.
After receiving the server’s acknowledgment, the client enters the FIN-WAIT-2 state and waits for the server to send its connection release segment (before that, it still has to receive the last data sent by the server).
Third wave: the server sends a FIN to the client to close its side of the connection. After sending its last data, the server sends a connection release segment with FIN=1 and ack=u+1. Because the server was in the half-closed state and may have sent more data, assume its sequence number is now seq=w. The server then enters the LAST-ACK state and waits for the client’s acknowledgment.
Fourth wave: the client sends an ACK to confirm the close, with the acknowledgment number set to the received sequence number plus 1. After receiving the server’s connection release segment, the client sends an acknowledgment with ACK=1, ack=w+1, and its own sequence number seq=u+1, and enters the TIME-WAIT state. Note that the TCP connection is not yet released at this point: the client enters the CLOSED state and its TCB is removed only after 2MSL (twice the maximum segment lifetime) has elapsed.
The server enters the CLOSED state as soon as it receives the client’s acknowledgment and likewise removes its TCB, ending the TCP connection. As you can see, the server ends the TCP connection earlier than the client.
Finally, write it out once from memory:
1. Client: "I will not send you any more messages"
2. Server: "OK"
3. Server: "I will not send you any more messages either"
4. Client: "OK"
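The state changes on both sides during the four-way wave can be replayed as a small table-driven simulation. It assumes the client closes first, as in the description above; the event labels and the names `STEPS` and `run_close` are informal choices for this sketch, not part of any TCP implementation.

```python
# Each step: (side, state it must be in, informal event label, next state),
# for a close that the client starts.
STEPS = [
    ("client", "ESTABLISHED", "send FIN", "FIN-WAIT-1"),
    ("server", "ESTABLISHED", "recv FIN, send ACK", "CLOSE-WAIT"),
    ("client", "FIN-WAIT-1", "recv ACK", "FIN-WAIT-2"),
    ("server", "CLOSE-WAIT", "send FIN", "LAST-ACK"),
    ("client", "FIN-WAIT-2", "recv FIN, send ACK", "TIME-WAIT"),
    ("server", "LAST-ACK", "recv ACK", "CLOSED"),
    ("client", "TIME-WAIT", "2MSL timeout", "CLOSED"),
]


def run_close() -> list:
    """Replay the steps, checking that each side is in the expected state,
    and return the list of (side, event, new state) transitions."""
    state = {"client": "ESTABLISHED", "server": "ESTABLISHED"}
    history = []
    for side, expected, event, next_state in STEPS:
        if state[side] != expected:
            raise RuntimeError(f"{side} is in {state[side]}, not {expected}")
        state[side] = next_state
        history.append((side, event, next_state))
    return history


for side, event, new_state in run_close():
    print(f"{side:6} {event:20} -> {new_state}")
```

The last two rows of the output show the server reaching CLOSED before the client, which only gets there after the 2MSL wait.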
Q&A
① Why shake hands three times (would two be enough)? Suppose the client sends a first connection request segment, but because of a poor network the request does not reach the server promptly. Instead it is held up at some network node and arrives only after a delay, by which time it is no longer valid. The server still sends an acknowledgment to the client, agreeing to the connection. If the three-way handshake were not used, the new connection would be established as soon as the server sent that acknowledgment. But the request is stale: the client ignores the server’s acknowledgment and sends nothing further, while the server believes the new connection exists and wastes resources waiting for data. The three-way handshake prevents this: because the server never receives the client’s final acknowledgment, it knows the client did not ask for a connection. In short, it prevents the server from wasting resources by waiting.
② Why wave four times? During the handshake, B can send its SYN and its ACK to A together in a single segment. When waving, however, A says it wants to disconnect, but B may not have finished sending its last data. So B first replies to A: “I have received your request to disconnect, but you have to wait for my final content.” B therefore needs two steps: (1) acknowledge A; (2) send its remaining data, and only then its own FIN, to make sure the data gets through.
③ What if the client suddenly fails? If the client fails while the connection is otherwise normal and nothing is done about it, the server keeps the connection open and waits idle for a long time. The solution is a keepalive timer on the server, which is reset every time the server receives data from the client. The timeout is usually set to 2 hours. If the server receives nothing from the client for two hours, it sends a probe segment and then repeats the probe every 75 seconds. If 10 probes go unanswered, the client is considered faulty and the connection is closed.
④ Why does the TIME_WAIT state have to last 2MSL before returning to CLOSED? TIME_WAIT exists so that an ACK that may have been lost can be resent. If A’s final ACK to B is lost, B retransmits its FIN, and within this period A can send the ACK again. Once the wait exceeds the maximum lifetime of a segment in both directions, no further packet could arrive anyway, so the connection can be closed.
For the same domain name, a PC browser opens 6 to 8 simultaneous connections to the server, while mobile browsers generally limit the number to 4 to 6 (this varies with the browser engine). Requests beyond the browser’s maximum number of connections are blocked until a connection becomes free.
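The effect of such a per-domain cap can be imitated with a semaphore: requests beyond the limit wait until a slot frees up. This is only a sketch of the queuing behaviour, not of how any browser is implemented; the cap of 6, the `asyncio.sleep` stand-in for a request, and the names `fetch` and `run` are assumptions made for illustration.

```python
import asyncio

MAX_CONNECTIONS_PER_HOST = 6  # assumed cap, in the range quoted above


async def fetch(limiter, active, peak):
    """Simulate one request that needs a connection slot."""
    async with limiter:               # blocks while all slots are in use
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)     # stand-in for the network round trip
        active[0] -= 1


async def run(request_count: int) -> int:
    """Issue the requests concurrently and return the peak number of
    requests that were in flight at the same time."""
    limiter = asyncio.Semaphore(MAX_CONNECTIONS_PER_HOST)
    active, peak = [0], [0]
    await asyncio.gather(
        *(fetch(limiter, active, peak) for _ in range(request_count))
    )
    return peak[0]


print(asyncio.run(run(20)))  # 6
```

Even with 20 requests issued at once, no more than 6 are ever in flight; the rest queue behind the semaphore, which is the blocking described above.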
Long connection and short connection
Short connection
Concept: as shown in the figure below, the client and the server establish a connection to communicate, and the TCP connection is closed after one exchange or a specified number of exchanges. When the next communication is needed, the client establishes a new TCP connection.
Advantages: it does not occupy server memory for long, so the server can handle a large number of connections.
Disadvantages:
- If the server wants to send data to the client, it has no way to do so and must wait for the client’s next request. For example, if messages are pulled by polling (every 30 seconds or longer), real-time communication between server and client is lost.
- If clients poll to get information in real time, or a large number of clients communicate over short connections, a great deal of CPU and bandwidth is spent establishing and releasing connections; resources are wasted, and connections may even fail to be established. A classic example is HTTP long polling (the WeChat web client).
Long connection
Concept: as shown in the following figure, the client keeps its TCP connection to the server open until the service is no longer needed. Advantages:
- Fast data transfer
- The server can actively push data to the client right away
Disadvantages:
- Because the client and server stay connected, in a highly concurrent distributed cluster system the growing number of clients occupies a lot of system resources
- TCP itself is stateful, which makes back-end design difficult in a highly concurrent distributed system
Why long connections
- The server needs to push messages to the client: if no connection exists, the server can never reach the client on its own initiative and cannot deliver messages in a timely manner
- The client and server communicate frequently: with short connections, a connection must be established for every message. Under high concurrency, a large number of sockets may pile up in the TIME_WAIT state, so that later connections cannot be established
- The server needs to react when the client goes offline: for example, clearing its cache or other resources, or business-level actions such as showing the QQ profile picture as offline
Problems to consider when designing TCP long connections
- The default TCP keep-alive timeout is too long: the default is 7200 seconds (2 hours), although it can be changed
- Socket proxies break TCP keep-alive: a proxy application forwards only TCP application data and cannot forward TCP’s internal packets
- Mobile networks require keep-alive signaling: for smart terminals such as mobile phones that access the Internet over mobile networks, operators try to close sockets that have sent no data for more than 45 or 60 seconds in order to save channel resources
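On the first point, the keep-alive timers can be shortened per socket rather than system-wide. The sketch below uses Python's `socket` module; the timer values are arbitrary examples, `enable_keepalive` is a hypothetical helper name, and the three `TCP_KEEP*` options are platform-dependent, so each is set only where the module defines it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and, where the platform exposes the knobs,
    shorten the timers. The default values here are illustrative only."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist on Linux; other platforms may lack some of
    # them, so each one is applied only if the socket module defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

Shorter timers do not help with the other two points: a proxy or a mobile operator can still drop an idle connection, which is why an application-level heartbeat is usually added on top.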
Heartbeat detection (long connection implementation)
Heartbeat detection is a mechanism implemented by the user at the application layer to imitate TCP’s keepalive mechanism. The user sends a heartbeat packet to the peer at a fixed interval. If no reply is received within the specified time, the appropriate action is taken (such as closing the TCP connection).
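The timeout side of this mechanism can be sketched independently of any networking framework. The class below only tracks when the peer was last heard from; sending the heartbeat packets and closing the connection are left to the caller. `HeartbeatMonitor`, `on_message`, and `is_dead` are hypothetical names, and the injectable clock exists only so the example can run without waiting.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from and reports a timeout."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def on_message(self) -> None:
        """Call for every heartbeat reply (or any other data) from the peer."""
        self.last_seen = self.clock()

    def is_dead(self) -> bool:
        """True once more than `timeout` seconds have passed in silence."""
        return self.clock() - self.last_seen > self.timeout


# Drive the monitor with a fake clock so the example runs instantly.
now = [0.0]
monitor = HeartbeatMonitor(timeout=30, clock=lambda: now[0])

now[0] = 20
print(monitor.is_dead())  # False: 20 s of silence
monitor.on_message()      # a heartbeat reply arrives at t = 20
now[0] = 45
print(monitor.is_dead())  # False: 25 s since the last reply
now[0] = 51
print(monitor.is_dead())  # True: 31 s since the last reply
```

In a real service the check would run on a timer, and a `True` result would trigger the action mentioned above, such as closing the TCP connection.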
Implementation in Netty