Solidify your networking fundamentals and never be afraid of TCP/IP/UDP questions again

This is the 9th day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

TCP/IP five-layer model

First look at the diagram, which shows the layered network model and the complete path of one HTTP request through the five-layer model

The application layer: the highest layer, which provides application-specific protocols. Protocols running at this layer include HTTP, FTP, SSH, WebSocket, and so on
The transport layer: provides general-purpose data transfer between processes on two hosts, using protocols such as TCP and UDP
The network layer: responsible for addressing and routing, so that packets are sent to the right computer. The main protocol is IP. Routers work at this layer
The link layer: responsible for converting between binary data packets and network signals. Switches and network cards work at this layer
The physical layer: mainly includes receivers, transmitters, repeaters, fiber optic cables, and so on

Layering makes the responsibilities of each layer clear. The layers cooperate through well-defined interfaces, so an upper layer can use the functions of the layers below it without worrying about how they are implemented. It is just like developing encapsulated components: each component handles its own business and does not interfere with the others, which also increases reuse. The basic transfer process in the diagram above, which is the layered workflow of an HTTP request, goes as follows:

Host A initiates a request. The application layer encapsulates the data according to the HTTP protocol, adds the request headers, and passes it to the next layer. On the way down the stack, the data is divided into smaller pieces, called packets
The transport layer takes the data and adds a TCP header (or a UDP header for UDP traffic; HTTP here uses TCP). The header carries the port numbers, which determine which application on the target computer the data is for. The result is passed to the next layer
The network layer takes the data and adds an IP header containing the IP address of the target computer to each packet, decides which route to send it along or which host should receive it, and then passes the encapsulated packet to the next layer
The link layer encapsulates each packet into a data frame and hands it to the physical layer, where it is converted into electrical signals
The physical layer transmits the signals over the cable to host B, where they are handed up to the link layer
After receiving the data, the link layer of host B checks the destination address of each frame to determine whether it is addressed to host B. If it is not, the frame is discarded. Otherwise the link layer determines the protocol type from the frame and passes the data to the IP module at the network layer
The network layer strips the IP header, checks that the destination IP address in the header matches its own, and passes the data to TCP or UDP according to the protocol type in the header
After receiving the packets, TCP at the transport layer computes and verifies the checksum to confirm the integrity of the data. It then puts the packets back in order and uses the port number to decide which program to deliver them to
Finally, after receiving the data, the application layer parses it according to the HTTP protocol
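The steps above can be sketched as a toy encapsulation and decapsulation round trip. This is only an illustration under loud assumptions: the header layout (a 2-byte length followed by JSON), the function names `wrap`, `unwrap`, `send_down` and `receive_up`, and the addresses are all made up for the example. Real TCP, IP and Ethernet headers are binary and carry many more fields.

```python
import json
import struct

# Hypothetical hosts; the addresses are documentation/example values only.
HOST_A = {"mac": "aa:aa:aa:aa:aa:aa", "ip": "192.0.2.1"}
HOST_B = {"mac": "bb:bb:bb:bb:bb:bb", "ip": "192.0.2.2"}

def wrap(header, payload):
    """Prepend a toy header (2-byte length + JSON) to the payload."""
    raw = json.dumps(header).encode()
    return struct.pack("!H", len(raw)) + raw + payload

def unwrap(data):
    """Split off the outermost toy header and return (header, payload)."""
    (size,) = struct.unpack("!H", data[:2])
    return json.loads(data[2:2 + size]), data[2 + size:]

def send_down(http_message, src, dst, dst_port=80):
    """Sender side: each layer adds its own header around the data from above."""
    segment = wrap({"src_port": 50000, "dst_port": dst_port}, http_message)  # transport
    packet = wrap({"src_ip": src["ip"], "dst_ip": dst["ip"]}, segment)       # network
    return wrap({"src_mac": src["mac"], "dst_mac": dst["mac"]}, packet)      # link

def receive_up(frame, host):
    """Receiver side: each layer checks and strips its header, or drops the data."""
    link, packet = unwrap(frame)
    if link["dst_mac"] != host["mac"]:
        return None                      # frame is not addressed to this host
    network, segment = unwrap(packet)
    if network["dst_ip"] != host["ip"]:
        return None                      # IP address does not match
    transport, message = unwrap(segment)
    return transport["dst_port"], message  # the port selects the application

request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
frame = send_down(request, HOST_A, HOST_B)
print(receive_up(frame, HOST_B))
print(receive_up(frame, HOST_A))  # None: host A is not the destination
```

Each layer only looks at its own header and treats everything inside as opaque payload, which is the point of the layering described above.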

The rest of this article expands on IP at the network layer and on TCP and UDP at the transport layer

IP

Inside a LAN, hosts are addressed by MAC address; outside the LAN, they are addressed by IP address. A MAC address is like an ID card, and an IP address is like a postal address. All TCP, UDP, and ICMP data is transmitted in the form of IP datagrams

IP itself has no way to report that a packet failed to reach its destination address, nor does it provide a direct way to obtain diagnostic information, such as which routers a packet passed through or the round-trip time. ICMP is responsible for this

ICMP does not make the IP network reliable. It is only used to feed back fault and configuration information, and a lost packet does not necessarily trigger an ICMP message

The familiar ping command works by sending ICMP query messages (echo request and echo reply). ICMP is carried directly in IP datagrams and skips the transport layer, so the ping program has no port number
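To make the "no port number" point concrete, here is a minimal sketch that builds the bytes of an ICMP echo request, the message ping sends. The function names and the identifier, sequence and payload values are assumptions for illustration. The sketch only constructs the packet; actually sending it would need a raw socket, which usually requires elevated privileges.

```python
import struct

def internet_checksum(data):
    """16-bit one's-complement checksum over 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

def build_echo_request(identifier, sequence, payload):
    """ICMP echo request: type 8, code 0, checksum, identifier, sequence, payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)  # checksum field zeroed
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(len(packet))                # 8-byte ICMP header + 4-byte payload
print(internet_checksum(packet))  # a packet with a valid checksum sums to 0
```

The header has a type, a code, a checksum, an identifier and a sequence number, but no source or destination port, because ICMP sits directly on top of IP.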

The IP protocol has the following features:

IP is an unreliable transport protocol. If a transmission error occurs, IP discards the packet and may respond with an ICMP error message to the sender. Any reliability requirements must be provided by upper-layer protocols such as TCP
IP is connectionless. It keeps no state about successive datagrams, and each datagram is handled independently, so datagrams can arrive in a different order from the one in which they were sent. No connection state is maintained at this layer (that is something TCP adds, as covered below).

UDP

The characteristics of UDP

Connectionless: data is sent directly, with no handshake to set up a connection and no wave to tear it down.
Unreliable: UDP is only a carrier of data. It sends each packet once, keeps no backup copy, does not care whether the other side received it correctly, and does not guarantee transmission order. Since the network layer is also unreliable, only the application layer can guarantee reliability
- At the sending end, the application layer hands the data to UDP at the transport layer, which only adds a UDP header and passes the data straight to the network layer.
- At the receiving end, the network layer strips the IP header and passes the data up to the transport layer, which only removes the UDP header and hands the message to the application layer.
- UDP does nothing else, which reduces overhead and the latency before data is sent
Broadcast support: UDP offers unicast, multicast, and broadcast, so it supports not only one-to-one transmission but also one-to-many and many-to-many
Small header overhead: the UDP header is only 8 bytes, made up of four 2-byte fields: source port (optional), destination port, UDP length (the length of the whole datagram), and UDP checksum (used to detect errors in the datagram). A datagram that fails the check, or whose destination port has no matching process, is discarded. Because UDP demands little and does little, it needs few header fields, whereas a TCP header takes at least 20 bytes. The data part can be empty, so the smallest UDP datagram is 8 bytes
Message-oriented: UDP suits sending a small amount of data at a time. Whatever length of message the application layer hands down, UDP sends it as it is: one complete datagram per send, with no merging and no splitting. If the message is too long, UDP still passes it to the network layer whole, and the network layer must fragment it because the link layer imposes an MTU. This hurts the efficiency of the network layer
No congestion control: UDP itself never adjusts its sending rate, even when network conditions are poor. This can lead to packet loss on a bad network, but it is a clear advantage for real-time applications. Scenarios with strict real-time requirements, such as chat, online video, and VoIP, use UDP instead of TCP. For example, a WeChat call that occasionally breaks up is not a big problem. Of course, applications can add their own remedies, such as forward error correction or retransmission
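The 8-byte header described in the list above can be written out with Python's `struct` module. This is a sketch, not a real network stack: the helper names are made up, and the checksum is left at 0, which in UDP over IPv4 means "no checksum computed" (a real stack computes it over a pseudo-header plus the data).

```python
import struct

# Four 2-byte fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port, dst_port, payload):
    """Prepend a UDP header; length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_udp_datagram(datagram):
    """Split a datagram into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }

print(UDP_HEADER.size)                        # 8
print(len(build_udp_datagram(5000, 53, b""))) # 8: an empty datagram is just the header
print(parse_udp_datagram(build_udp_datagram(5000, 53, b"hello")))
```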

Why is UDP unreliable

No connection needs to be established before transferring data
Message delivery is not guaranteed: the transport layer of the remote host does not send any acknowledgment after receiving UDP packets
Ordering is not guaranteed: no packet numbers are assigned and no reordering is performed, so no head-of-line blocking occurs either
Congestion control is not implemented: there is no built-in feedback mechanism, no retransmission, and no timeout

TCP

This is the protocol we use most of the time, especially between front end and back end. TCP provides a completely different service to applications than UDP: it is a reliable, connection-oriented service. Connection-oriented means that before two applications can exchange data over TCP, they must first establish a TCP connection by contacting each other. TCP presents a byte stream abstraction to applications and does not insert record markers or message boundaries. For example, the sender may write 10 bytes and then 30 bytes, but the receiver may read them as two chunks of 20 bytes

The characteristics of TCP

Connection-oriented: both parties must establish a connection before they can communicate
Unicast only: transmission is point-to-point, and a TCP connection has exactly two endpoints
Reliable delivery: data is protected by an integrity check, lost packets are retransmitted, and data is delivered in order
Byte-stream oriented: unlike UDP, where each packet is transmitted independently, TCP transmits data as a byte stream and does not preserve message boundaries
Congestion control: when the network is congested, TCP reduces the rate and amount of data it sends to relieve the congestion and keep the network stable
Full-duplex communication: both sides can send and receive data at the same time, because each side has a send buffer and a receive buffer
- Send buffer: holds data that is queued and ready to be sent, plus data that has been sent but not yet acknowledged by the receiver. Unacknowledged data cannot be thrown away because it may need to be retransmitted; this is how TCP makes sure delivery is reliable
- Receive buffer: holds data that arrived in order but has not yet been read by the receiving application, plus data that arrived out of order and must be sorted before the application can read it in sequence

Why is TCP reliable

The receiver sends an ACK acknowledgment after receiving data, so the sender knows that its data has arrived. If the sender does not receive the ACK within a certain period of time, it resends the data. Therefore, whether the data never reached the receiver or the receiver's ACK was lost, the retransmission mechanism ensures that the message eventually gets through correctly

The retransmission mechanism

Because the network layer underneath TCP may lose, duplicate, or reorder packets, TCP itself has to provide reliable data transmission.

To ensure correct delivery, the sender starts a timer after sending a packet. If no ACK is received before the timer expires, the packet is retransmitted. If the packet still has not been acknowledged after a certain number of retransmissions, TCP gives up and sends a reset signal
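The timer-and-retry idea can be sketched without any real networking. Everything here is hypothetical: `channel` stands in for "send the packet and wait for the timer", returning True if an ACK came back in time and False on timeout, and `lossy_channel` replays a scripted list of outcomes. Real TCP also adapts the timeout to the measured round-trip time, which this sketch leaves out.

```python
def send_with_retransmission(packet, channel, max_retries=3):
    """Send once, then retransmit on timeout up to max_retries times.

    Returns the number of transmissions it took to get an ACK.
    Raises ConnectionResetError when every attempt times out.
    """
    for attempt in range(1, max_retries + 2):  # first send + max_retries retransmissions
        if channel(packet):
            return attempt
    raise ConnectionResetError(f"no ACK after {max_retries} retransmissions")

def lossy_channel(outcomes):
    """Build a fake channel that replays scripted outcomes, then always times out."""
    remaining = iter(outcomes)
    def channel(packet):
        return next(remaining, False)
    return channel

# Two timeouts followed by a successful ACK: three transmissions in total.
print(send_with_retransmission(b"data", lossy_channel([False, False, True])))  # 3
```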

Congestion control mechanism

It is mainly reflected in four aspects

First, slow start: do not send large amounts of data at the beginning. Probe the network first, then grow the congestion window from small to large (doubling it roughly every round trip until it reaches a threshold)
Second, congestion avoidance: once the congestion window reaches the slow start threshold, it grows slowly and linearly. If congestion is detected through a timeout, the slow start threshold is set to half of the current congestion window, the congestion window is reset to 1, and the slow start algorithm begins again
Third, fast retransmit: as soon as the sender receives three duplicate acknowledgments in a row, it immediately retransmits the segment the receiver is missing, without waiting for the retransmission timer to expire
Fourth, fast recovery: if the network were badly congested, several duplicate acknowledgments in a row would not get through, so the sender concludes that the network is probably not congested. Instead of running the slow start algorithm, it halves the congestion window and continues with the congestion avoidance algorithm
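How the four mechanisms interact is easier to see as numbers. The following is a deliberately simplified, Reno-style model, counted in whole segments per round trip. The function names, the event names and the initial threshold of 8 are assumptions for this sketch; real stacks count in bytes and differ in many details.

```python
def next_cwnd(cwnd, ssthresh, event):
    """Return (cwnd, ssthresh) after one round trip.

    event is "ack" (the whole window was acknowledged), "timeout",
    or "3dupacks" (three duplicate ACKs, triggering fast retransmit).
    """
    if event == "timeout":
        return 1, max(cwnd // 2, 2)                # halve the threshold, restart slow start
    if event == "3dupacks":
        ssthresh = max(cwnd // 2, 2)
        return ssthresh, ssthresh                  # fast recovery: skip slow start
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh   # slow start: exponential growth
    return cwnd + 1, ssthresh                      # congestion avoidance: linear growth

def trace(events, cwnd=1, ssthresh=8):
    """Replay a list of events and record the congestion window after each one."""
    history = [cwnd]
    for event in events:
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, event)
        history.append(cwnd)
    return history

events = ["ack"] * 5 + ["3dupacks", "ack", "timeout", "ack", "ack", "ack"]
print(trace(events))  # [1, 2, 4, 8, 9, 10, 5, 6, 1, 2, 3, 4]
```

In the trace, the window doubles up to the threshold, then grows by one per round trip; three duplicate ACKs only halve it, while a timeout drops it all the way back to 1.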

Flow control

The idea is to keep the sender from sending data so fast that the receiver cannot keep up.

When the receiver's buffer still holds data waiting to be processed, the receiver shrinks the window it advertises, which slows the sender down and gives the receiver enough time to handle the packets. When the receiver is idle, it advertises a larger window so that the sender can speed up and make sensible use of network resources
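A small simulation shows the window at work. It is a sketch under simplifying assumptions: the `Receiver` class, the `transfer` function and the "one send, then one application read per round" rhythm are all invented for illustration, and sizes are plain byte counts rather than real TCP window fields.

```python
class Receiver:
    """Receive buffer with a fixed capacity; the free space is the advertised window."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0

    def window(self):
        return self.capacity - self.buffered

    def accept(self, nbytes):
        if nbytes > self.window():
            raise OverflowError("sender overran the advertised window")
        self.buffered += nbytes

    def app_read(self, nbytes):
        """The application drains data, which reopens the window."""
        self.buffered -= min(nbytes, self.buffered)

def transfer(total, receiver, read_per_round):
    """Send `total` bytes without ever exceeding the window; return bytes sent per round."""
    if read_per_round <= 0:
        raise ValueError("the application must read something each round")
    sent_per_round = []
    remaining = total
    while remaining:
        chunk = min(remaining, receiver.window())  # the sender obeys the window
        receiver.accept(chunk)
        remaining -= chunk
        sent_per_round.append(chunk)
        receiver.app_read(read_per_round)
    return sent_per_round

print(transfer(100, Receiver(40), read_per_round=10))  # [40, 10, 10, 10, 10, 10, 10]
print(transfer(100, Receiver(40), read_per_round=40))  # [40, 40, 20]
```

A slow reader keeps the window small and throttles the sender; a fast reader keeps it open and the transfer finishes in fewer rounds.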

TCP Three-way handshake

The first handshake: the client sends a message to the server (SYN, seq)
- A SYN packet
- The client initializes a random sequence number (seq)
The second handshake: after receiving it, the server replies to the client (SYN, ACK, seq, ack)
- Its own SYN packet together with an ACK packet
- The server initializes its own random sequence number (seq)
- An acknowledgment number ack = the sequence number sent by the client + 1, indicating that the server has received the client's SYN
The third handshake: after receiving the acknowledgment from the server, the client sends (ACK, seq, ack) to the server
- An ACK packet confirming the server's reply
- A seq equal to the ack value sent by the server during the second handshake
- An acknowledgment number ack = the server's sequence number + 1, telling the server that its SYN was received
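The number bookkeeping in the three steps above can be written down directly. This sketch only computes the seq and ack fields; the function name and the dictionary layout are assumptions, and no packets are actually sent.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around

def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as dicts of flags, seq and ack."""
    if client_isn is None:
        client_isn = random.getrandbits(32)  # client's random initial sequence number
    if server_isn is None:
        server_isn = random.getrandbits(32)  # server's random initial sequence number

    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # acknowledges the client's SYN
    }
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],                # the value the server just acknowledged
        "ack": (server_isn + 1) % SEQ_SPACE,  # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

With the fixed initial numbers 100 and 300, the second segment carries ack 101 and the third carries seq 101 and ack 301.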

Why not twice?

To confirm that both sides can send and receive normally
To let each side confirm the other's initial sequence number. With only two packets, the server cannot know whether its sequence number has been acknowledged by the peer. As a result, the server might accept invalid segments (for example, a stale connection request that was delayed in the network), resulting in errors.

Why not four times?

Because there is no need: three exchanges already confirm everything that has to be confirmed

TCP four-way wave

Both the client and the server can initiate the shutdown operation. The following uses the client as an example

The browser first sends a FIN segment, with seq set to its sequence number, to the server and stops sending data, but it can still receive data that the server sends in response
After receiving it, the server sends an ACK, with ack = the browser's sequence number + 1, to the browser, indicating that the FIN was received
Once the server has sent all of its data, it sends its own FIN segment, with seq set to its sequence number, to the browser
The browser receives it and sends an ACK, with ack = the server's sequence number + 1, to the server, indicating that the FIN was received

After the fourth wave, the browser waits for a period of time (a timer set to 2MSL) to make sure that the server has received its ACK before it closes. The server closes the connection as soon as it receives the ACK

Why wait a while before closing? What if it does not wait?

The wait guards against the final acknowledgment being lost or corrupted on its way to the server, which would leave the server unable to close normally. The waiting time is 2MSL, where MSL (maximum segment lifetime) is the longest time a packet can survive on the network; a packet older than this is discarded

RFC 793 sets MSL to 2 minutes, but in practice 30 seconds is commonly used. If a packet from the passive closer arrives after this time has passed, the active closer replies with a packet that has the RST flag set, indicating that the connection has been reset. The passive closer then knows that the other side has already closed the connection

If the active closer did not wait, then because ports are reused, it might already have opened a new connection on the same port while the passive closer is still retrying its FIN, and the new connection would receive a lot of useless packets. Since packets carry sequence numbers, the active closer can tell that they do not belong to the current connection and ignore them. Even so, it is better to make the active closer wait, so that the passive closer will not send any more FINs by the time the port is reused

Why four waves

TCP needs four waves because a TCP connection is full-duplex, so each direction has to be released separately. When one side releases its direction, it only means that it can no longer send data to the other side; it can still receive

So when the connection is being closed and the server receives a FIN, it probably will not close the socket immediately, because it may still have messages that have not been sent. It can only reply with an ACK first, telling the client "I have received your message". Only after all of its own messages have been sent can it send its own FIN. The ACK and the FIN therefore cannot be sent together, which is why it takes four waves
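The "closed in one direction, still open in the other" state can be observed from application code with `shutdown`. This sketch assumes `socket.socketpair()` as a stand-in for a TCP connection: it returns two already connected stream sockets (Unix domain sockets on most Unix systems), so no real FIN segments are involved, but the half-close semantics seen by the application are the same.

```python
import socket

def half_close_demo():
    client, server = socket.socketpair()  # connected stream sockets, standing in for TCP
    with client, server:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)   # like sending FIN: "I have nothing more to send"

        received = b""
        while chunk := server.recv(1024): # recv returns b"" once the client's side is closed
            received += chunk

        # The server-to-client direction is still open, so the server can finish its work.
        server.sendall(b"late response")
        server.shutdown(socket.SHUT_WR)   # now the server closes its direction too

        reply = b""
        while chunk := client.recv(1024): # the client could still receive after its own close
            reply += chunk
    return received, reply

print(half_close_demo())  # (b'request', b'late response')
```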

TCP sticky packets and how to handle them

To avoid sending lots of tiny packets, TCP may combine several small pieces of data into one TCP segment. A sticky packet is several messages glued together into one: seen from the receive buffer, the head of the next message sits right after the tail of the previous one. TCP is a stream transport, and the biggest problem with a stream is that it has no boundaries; without boundaries, data sticks together

Unpacking is the opposite case: a message is split into several pieces for sending, so the receiver may get only part of it in one read

Scenarios that cause sticky packets

The receiver does not read from its buffer in time, so several packets pile up and are read together
TCP uses the Nagle algorithm by default, and the algorithm itself can cause sticky packets
Reuse of a TCP connection can cause sticky packets. A TCP connection can be shared by multiple processes on a host, and when data of many different structures is written into the same stream, telling where one message ends and the next begins becomes a real problem
Oversized messages can cause sticky packets. When the application sends a message larger than the buffer, the message is split: the first part may already have been received while the rest is still in the buffer waiting to be sent, so the receiver sees only part of a message
Flow control and congestion control can also lead to sticky packets

The Nagle algorithm does two things. First, it sends the next packet only after the previous one has been acknowledged. Second, it collects multiple small packets and sends them together as one segment when an acknowledgment arrives or when their combined size reaches the maximum segment size (MSS). If the receiver does not handle message boundaries properly, the messages stick together when it unpacks them
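The Nagle algorithm described above can be switched off per socket with the standard `TCP_NODELAY` option. The helper name below is made up for this sketch, and the socket is only created and configured, never connected.

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 asks the stack to send small writes immediately
    # instead of holding them back to merge with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_low_latency_socket()
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True when the option is set
sock.close()
```

Turning Nagle off reduces latency for small writes, but TCP remains a byte stream, so the receiver still needs its own way to find message boundaries.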

How to deal with sticky packets

If the Nagle algorithm is the cause, it can be turned off where the application scenario allows
If it is not, there are several options
- Trailing delimiter. Mark the end of each message with a special identifier, such as \n\r\t or some non-printing character
- Header with length, read in two steps. Use an application protocol in which every message starts with a header that stores a start marker and the message length. The receiver reads the header first, parses the message length, and then reads exactly that many bytes of content
- Fixed length. The application layer sends every message with a fixed length, and the receiver treats each block of that length as one complete message. Messages that are too short are padded with a filler character
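The header-with-length approach from the list above might look like the following. The wire format (a 4-byte big-endian length and no start marker) and the names `frame` and `FrameDecoder` are assumptions chosen for this sketch. The decoder works on whatever chunks arrive, which is how data comes out of `recv()` on a stream.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical format: 4-byte big-endian body length

def frame(message):
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message

class FrameDecoder:
    """Reassemble length-prefixed messages from arbitrary chunks of a byte stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add received bytes and return every complete message now available."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet; wait for more bytes
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages

# Two messages stuck together in one stream, then delivered in 7-byte chunks.
stream = frame(b"hello") + frame(b"world!")
decoder = FrameDecoder()
for start in range(0, len(stream), 7):
    print(decoder.feed(stream[start:start + 7]))
# prints [], then [b'hello'], then [b'world!']
```

The same decoder handles both problems: messages glued together in one read, and one message split across several reads.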

Why does UDP not have sticky packets

Because UDP is a message-oriented protocol. Each UDP segment is one message, and the application must extract data in units of whole messages; it cannot read an arbitrary number of bytes at a time
UDP preserves message boundaries. Each UDP packet has its own header (source port, destination port, and so on), so it is easy for the receiver to tell packets apart. The protocol transmits each piece of data as an independent message, and the receiver can only receive independent messages. If a message is too large for the receiver to accept in one read, the excess data is lost, because the receiver will not pick up the remainder in a second read
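Both claims above, one read per message and loss of the excess on a short read, can be tried on the loopback interface. This is a sketch with assumptions: the function name is made up, it relies on loopback delivering the datagrams in order (usual, though UDP does not promise it), and the truncation behavior shown is what Linux does; some platforms raise an error instead.

```python
import socket

def udp_boundary_demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)         # do not hang forever if a datagram is lost
        address = receiver.getsockname()

        sender.sendto(b"first message", address)
        sender.sendto(b"second", address)

        # Each recvfrom returns exactly one datagram, even though the
        # buffer is large enough to hold both.
        first, _ = receiver.recvfrom(4096)
        second, _ = receiver.recvfrom(4096)

        # A buffer that is too small gets only the start of the datagram;
        # on Linux the rest is discarded, not kept for the next read.
        sender.sendto(b"0123456789", address)
        truncated, _ = receiver.recvfrom(4)
    return first, second, truncated

print(udp_boundary_demo())
```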

Differences and application scenarios between TCP and UDP

TCP transmission is slower; UDP is faster
TCP is reliable and provides congestion control and flow control; UDP is unreliable and provides neither congestion control nor flow control
TCP is connection-oriented and requires a three-way handshake; UDP is connectionless and needs no handshake
TCP supports only one-to-one connections; UDP supports broadcast and can be one-to-one, one-to-many, or many-to-many
TCP has a minimum header size of 20 bytes; UDP has a header of 8 bytes
TCP is byte-stream oriented; UDP is message oriented
TCP numbers the segments it transmits; UDP does not

In addition, the TCP and UDP port numbers are independent, so they can be the same

Applicable scenario

TCP suits scenarios that transfer large amounts of data and need reliable transmission (acknowledgment, retransmission, and ordering), such as login and file transfer. UDP suits scenarios that transfer small amounts of data and need high efficiency, such as real-time applications, instant messaging, chat, and video calls

Conclusion

If this article helped you, a like would be much appreciated

Thanks for reading this far, and keep going!

