Details on the differences between TCP and UDP
For computers and other network devices to communicate, both parties must send and receive data packets according to the same rules. These rules, which are agreed in advance, cover how to find the communication target, how to start communication, and how to end it. Such a set of rules is called a protocol.
The TCP/IP protocol suite is the foundation of the Internet. It is a group of network protocols: TCP, UDP, IP, FTP, HTTP, ICMP, and SMTP, for example, all belong to the TCP/IP family. From the bottom up, these protocols are divided into four layers: link layer, network layer, transport layer, and application layer.
- Link layer: sends and receives ARP/RARP packets.
- Network layer: mainly responsible for transmitting data packets between hosts. This layer contains IP and the Routing Information Protocol (RIP).
- Transport layer: mainly responsible for locating the specific process that handles the data and delivering the data to it (TCP provides a reliable byte-stream service; UDP provides an unreliable datagram service).
- Application layer: responsible for providing services to user applications, for example HTTP, FTP, Telnet, DNS, and SMTP.
In this layered architecture, communication always takes place between peer layers of the two parties, never across different layers. During transmission, as the data passes down through each layer on the sender's side, that layer attaches its own protocol header (only the link layer also appends a protocol trailer).
UDP and TCP are the core members of the TCP/IP protocol suite.
1. UDP
The User Datagram Protocol (UDP) is a connectionless protocol that provides an unreliable datagram service. UDP is defined in RFC 768, published in 1980.
UDP data structure
The UDP header contains only four fields: source port, destination port, UDP length, and UDP checksum. Each field is 16 bits (two bytes), so the whole header is 8 bytes.
- Source port: the port number of the sending process (range: 0 to 65535). The receiver can use it to reply to the sender. The field is optional and is set to zero when unused, so it cannot always be relied on.
- Destination port: the port number of the receiving process (range: 0 to 65535).
- UDP length: the total length, in bytes, of the UDP header plus the data, that is, the size of the entire datagram.
- UDP checksum: calculated over a pseudo-header taken from the IP header, the UDP header, and the data. The receiver uses it to verify the data and detect errors introduced during transmission.
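As a rough illustration of how the checksum field is computed, the sketch below implements the 16-bit ones' complement sum used by UDP over IPv4. The function names, the sample addresses, and the payload are all made up for the example; only the algorithm follows RFC 768 and RFC 1071.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> int:
    """Checksum of a UDP segment whose checksum field is currently zero (IPv4)."""
    # Pseudo-header: source IP, destination IP, zero byte, protocol 17, UDP length.
    pseudo_header = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_segment))
    result = internet_checksum(pseudo_header + udp_segment)
    return result or 0xFFFF                  # a computed zero is sent as all ones

# Hypothetical datagram: documentation addresses and a 2-byte payload.
src_ip, dst_ip = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"hi"
header = struct.pack("!HHHH", 65404, 53, 8 + len(payload), 0)
checksum = udp_checksum(src_ip, dst_ip, header + payload)

# The receiver repeats the sum over the segment including the checksum;
# an undamaged datagram yields zero.
filled = header[:6] + struct.pack("!H", checksum) + payload
pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(filled))
verified = internet_checksum(pseudo + filled) == 0
```

The receiver-side check at the end is the reason the checksum can detect corruption: any change to the covered bytes is very likely to make the sum nonzero.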
Example of UDP header data
DNS, for example, commonly uses UDP to obtain domain name resolution results. The header of such a datagram might look like this:
0000 ff 7c 00 35 00 23 c2 6e
The values of the four fields in the UDP header are as follows:
- Source port: 0xFF7C = 65404
- Destination port: 0x0035 = 53 (DNS uses port 53)
- UDP length: 0x0023 = 35
- UDP checksum: 0xC26E
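The same decoding can be reproduced in a few lines of Python. This is a minimal sketch using the standard `struct` module; the variable names are illustrative, and the leading `0000` in the dump above is taken to be the byte offset rather than data.

```python
import struct

# The eight header bytes from the dump above.
raw_header = bytes.fromhex("ff 7c 00 35 00 23 c2 6e")

# "!HHHH": network byte order, four unsigned 16-bit fields.
src_port, dst_port, length, checksum = struct.unpack("!HHHH", raw_header)

payload_length = length - 8   # the length field covers the 8-byte header too
print(src_port, dst_port, length, hex(checksum), payload_length)
```

Subtracting the 8-byte header from the length field gives the size of the DNS message carried in this datagram.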
The position of UDP in data transmission
Here we can divide the transfer process from application to application into two parts: host-to-host data transfer and host-to-application data forwarding.
- The destination port number in the UDP header is used to locate the specific process that handles the data and to hand the data over to it.
- The Internet Protocol (IP), the layer beneath UDP, is responsible for transmitting data packets between hosts.
UDP is called a transport layer protocol, but the actual transfer of data between hosts is done by IP. UDP only identifies the specific process on each host.
UDP data transmission features
- Connectionless: unlike TCP, UDP does not need a three-way handshake to establish a connection before sending data. It sends data right away. UDP simply carries datagrams; it neither splits nor merges them.
- Unreliable: the unreliability shows first in the lack of a connection, since data is sent whenever the sender wants without any setup. UDP also passes data along as it receives it, keeps no copy, and does not care whether the other side received it correctly. In addition, network conditions fluctuate, but UDP has no congestion control, so it does not adjust its sending rate even when the network is in poor condition. The drawback is that packets may be lost when the network is poor. The advantage is equally clear: in scenarios with strict real-time requirements (such as live streaming and teleconferencing), UDP is used instead of TCP.
- Flexible communication patterns: because UDP does not establish a connection, it can send data to anyone. Besides one-to-one, it supports one-to-many, many-to-many, and many-to-one transmission.
- Message-oriented: the sender's UDP takes the message handed down by the application, adds a header, and passes it to the IP layer. Messages from the application layer are neither merged nor split, so message boundaries are preserved.
- Low header overhead and efficient transmission: the UDP header is only 8 bytes, so datagrams are transmitted efficiently (UDP is often used in real-time scenarios such as live streaming, conference calls, and media transmission).
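The connectionless and message-oriented behavior described above can be observed with two sockets on the loopback interface. This is only a sketch: the messages and variable names are made up, and on a real network either datagram could be lost or reordered, which is why the receiver uses a timeout.

```python
import socket

# Receiver: bind to any free port on localhost; no listen() or accept() is needed.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)                  # UDP gives no delivery guarantee
receiver_addr = receiver.getsockname()

# Sender: no connection setup, each sendto() emits one datagram immediately.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for message in (b"hello", b"world"):
    sender.sendto(message, receiver_addr)

# Each recvfrom() returns exactly one datagram: boundaries are preserved,
# so the two messages are never merged into b"helloworld".
received = [receiver.recvfrom(1024)[0] for _ in range(2)]

sender.close()
receiver.close()
```

With a TCP stream socket the same two writes could arrive as a single chunk, because TCP treats the data as a byte stream rather than as separate messages.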
2. TCP
The Transmission Control Protocol (TCP), defined in RFC 793, is a connection-oriented, reliable, byte-stream-based transport layer protocol.
When users view a web page or an email, they want to see the content complete and in the correct order, with nothing missing. When they download a file, they want the whole file, not just part of it. TCP is the transport layer protocol suited to these scenarios.
TCP data structure
- Source port and destination port: the port number of the sending process and the port number of the receiving process (0 to 65535).
- Sequence number: mainly solves the problem of reordering. Once packets are numbered, the receiver knows which comes first and which comes later.
- Acknowledgment number: every packet sent should be acknowledged, so that the sender knows whether the other side received it. If not, the packet is sent again. This solves the problem of packet loss.
- Status bits (flags): SYN initiates a connection, ACK acknowledges, RST resets the connection, and FIN ends the connection. TCP is connection-oriented, so both parties maintain connection state, and packets carrying these flags change the state on both sides.
- Window size: used for TCP flow control. Each party declares a window that indicates how much data it can currently handle.
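To make the fields above concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with `struct`. The sample header is fabricated for illustration (a SYN from port 50000 to port 80), and `parse_tcp_header` is a hypothetical helper, not part of any library.

```python
import struct

# Bit positions of the status flags within the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", header[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                                  # sequence number
        "ack": ack,                                  # acknowledgment number
        "header_len": (offset_flags >> 12) * 4,      # data offset is in 32-bit words
        "flags": sorted(n for n, bit in FLAG_BITS.items() if offset_flags & bit),
        "window": window,                            # advertised window size
    }

# Fabricated sample: a SYN segment with a 20-byte header and no options.
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
fields = parse_tcp_header(sample)
```

Compared with the 8-byte UDP header, the extra fields (sequence number, acknowledgment number, flags, window) are what carry TCP's ordering, reliability, and flow control.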
TCP three-way handshake
Before TCP sends data, the two ends must establish a connection. This is done with the TCP three-way handshake:
- First handshake: the client sends a connection request packet (SYN) to the server. After sending it, the client enters the SYN-SENT state.
- Second handshake: after receiving the connection request, the server, if it agrees to connect, sends a reply (SYN+ACK) and enters the SYN-RECEIVED state.
- Third handshake: after receiving the server's reply, the client sends an acknowledgment (ACK) to the server and enters the ESTABLISHED state. The server also enters the ESTABLISHED state once it receives this acknowledgment. At this point the connection has been established.
Why does TCP need three handshakes to establish a connection rather than two? TCP has to ensure reliable transmission while keeping the exchange efficient. With three messages, each side both sends a packet and receives a reply to it, so both directions are confirmed to work. With only two, the server would never learn whether its reply reached the client. Three is the smallest number of messages that meets both requirements.
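The three steps can be traced with a small simulation. This is a toy model rather than a real TCP implementation: the function name, the dictionary layout, and the initial sequence numbers 100 and 300 are all assumptions chosen for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Simulate the handshake and return the final states and the segments sent."""
    states = {"client": "CLOSED", "server": "LISTEN"}
    segments = []

    # First handshake: the client sends SYN with its initial sequence number.
    segments.append({"from": "client", "flags": {"SYN"}, "seq": client_isn})
    states["client"] = "SYN-SENT"

    # Second handshake: the server acknowledges the client's SYN (ack = seq + 1)
    # and sends its own SYN in the same segment.
    segments.append({"from": "server", "flags": {"SYN", "ACK"},
                     "seq": server_isn, "ack": client_isn + 1})
    states["server"] = "SYN-RECEIVED"

    # Third handshake: the client acknowledges the server's SYN.
    segments.append({"from": "client", "flags": {"ACK"},
                     "seq": client_isn + 1, "ack": server_isn + 1})
    states["client"] = "ESTABLISHED"
    states["server"] = "ESTABLISHED"   # reached when the final ACK arrives
    return states, segments

final_states, trace = three_way_handshake(client_isn=100, server_isn=300)
```

Each side's SYN is acknowledged exactly once in the trace. Dropping the third segment would leave the server's SYN unacknowledged, which is the gap a two-step handshake would have.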
TCP four-way wave
Closing a TCP connection takes four steps, known as the four-way wave:
- First wave (A: "B, I'm leaving."): client A sends a connection release request (FIN) to server B.
- Second wave (B: "OK, A, I see."): after receiving the connection release request, server B sends an ACK packet and enters the CLOSE_WAIT state. From this point on, B no longer accepts new data from A, but B can continue sending any data it has not finished sending.
- Third wave (B: "A, I'm done too. Bye."): server B sends a connection release request (FIN) to A and enters the LAST-ACK state.
- Fourth wave (A: "OK, B. Bye."): after receiving the release request, client A sends an acknowledgment to server B and enters the TIME-WAIT state. This state lasts for 2MSL (MSL, the maximum segment lifetime, is the longest time a segment may stay in the network before it is discarded). If A receives no retransmitted release request from B within this period, it enters the CLOSED state. B enters the CLOSED state as soon as it receives the acknowledgment.
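The second wave, where B can keep sending after A has finished, can be observed with real sockets on the loopback interface. This is a sketch under assumptions: the names `client_a` and `server_b` and the messages are invented, and the CLOSE_WAIT and TIME-WAIT states are maintained by the operating system, so they appear here only in comments.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
listener.settimeout(2.0)

client_a = socket.create_connection(listener.getsockname(), timeout=2.0)
server_b, _ = listener.accept()
server_b.settimeout(2.0)

client_a.sendall(b"last data from A")
client_a.shutdown(socket.SHUT_WR)       # first wave: A sends FIN, keeps reading

from_a = b""
while chunk := server_b.recv(1024):     # recv() returns b"" once A's FIN arrives
    from_a += chunk
# B's side of the connection is now in CLOSE_WAIT (state kept by the kernel).

server_b.sendall(b"remaining data from B")   # B may still send its pending data
server_b.close()                             # third wave: B sends its own FIN

from_b = b""
while chunk := client_a.recv(1024):     # A reads until it sees B's FIN
    from_b += chunk

client_a.close()
listener.close()
```

Because each direction is closed separately, teardown needs four segments, whereas connection setup can combine the server's SYN and ACK into one.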
TCP features
Compared with UDP, TCP is connection-oriented, guarantees ordering, delivers data reliably, and performs congestion control.
To guarantee ordering, each TCP packet carries a sequence number (ID). The starting ID is agreed when the connection is established, and packets are then sent one after another in ID order. To ensure that no packet is lost, every packet sent must be acknowledged. The acknowledgment is not sent packet by packet. Instead, the receiver acknowledges a single ID, which means that all packets up to and including that ID have been received. This is called cumulative acknowledgment. To keep track of all packets sent and received, the sender and the receiver each cache these records.
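Cumulative acknowledgment can be expressed in a few lines. In this sketch packets are identified by small integers, as in the prose above (real TCP numbers bytes, not packets), and `cumulative_ack` is a hypothetical helper.

```python
def cumulative_ack(received_ids: set, first_id: int = 1) -> int:
    """Return the highest ID such that every packet up to it has been received."""
    ack = first_id - 1                 # nothing acknowledged yet
    while ack + 1 in received_ids:     # advance only across a gap-free prefix
        ack += 1
    return ack

in_order = cumulative_ack({1, 2, 3, 4, 5})         # every packet up to 5 arrived
with_gap = cumulative_ack({1, 2, 3, 4, 5, 8, 9})   # 6 and 7 are missing
```

In the second call the acknowledgment stays at 5 even though 8 and 9 have arrived, because acknowledging 9 would wrongly claim that 6 and 7 were received too.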
The TCP sender's cache is ordered by packet sequence ID and divided into four parts according to how far each packet has been processed:
- Sent and acknowledged;
- Sent but not yet acknowledged;
- Not sent yet, but waiting to be sent;
- Not sent and cannot be sent for the time being.
In TCP, the receiver reports an advertised window size to the sender. It equals the combined size of the second and third parts above. Anything beyond this window is more than the receiver can handle, so the sender cannot send it for now.
For example, take a sender cache queue holding packets 1 to 15:
- 1, 2, 3 have been sent and acknowledged;
- 4, 5, 6, 7, 8, 9 have been sent but not yet acknowledged;
- 10, 11, 12 have not been sent yet but are waiting to be sent;
- 13, 14, 15 cannot be sent yet because the receiver has no room for them.
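The four-part split of the sender cache can be sketched as a function of three boundaries. The function and parameter names are hypothetical, and the window is counted in packets to match the example (real TCP counts bytes).

```python
def partition_sender_queue(packet_ids, last_acked, last_sent, window_end):
    """Split the sender's queue into the four parts described above."""
    return {
        "sent_acked":     [p for p in packet_ids if p <= last_acked],
        "sent_unacked":   [p for p in packet_ids if last_acked < p <= last_sent],
        "unsent_allowed": [p for p in packet_ids if last_sent < p <= window_end],
        "unsent_blocked": [p for p in packet_ids if p > window_end],
    }

# Boundaries taken from the example: ACKed up to 3, sent up to 9, window ends at 12.
parts = partition_sender_queue(range(1, 16), last_acked=3, last_sent=9, window_end=12)

# The advertised window covers the second and third parts.
advertised_window = len(parts["sent_unacked"]) + len(parts["unsent_allowed"])
```

When an acknowledgment arrives, `last_acked` moves forward and, if the receiver still advertises the same window, `window_end` moves with it. This is why the mechanism is often called a sliding window.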
The TCP receiver's cache is divided into three parts:
- Received and acknowledged;
- Not yet received, but about to be received;
- Not received and cannot be received yet.
For example, in the receiver cache queue for the same packets 1 to 15:
- 1, 2, 3, 4, 5 have been received and acknowledged;
- 6, 7 are still awaited, while 8, 9 have been received but not acknowledged;
- 10, 11, 12, 13, 14, 15 cannot be accepted for the time being.
Comparing the sender queue and the receiver queue above gives the following picture:
- 1, 2, 3: no problem, both sides agree;
- 4, 5: the receiver has sent the ACK, but the sender has not received it yet;
- 6, 7, 8, 9: all have been sent. 8 and 9 have arrived, but 6 and 7 have not, so the packets are out of order. 8 and 9 are cached but cannot be acknowledged.
This example shows that both reordering and packet loss can occur.
Suppose the sender has received the ACK for 4, but the ACK for 5 was lost, and packets 6 and 7 were lost as well. What happens then?
- One method is timeout retransmission: a timer is set for every packet that has been sent but not acknowledged, and the packet is resent once the timer expires. The timeout must be longer than the round-trip time, but not too long, otherwise recovery becomes slow. For example, after a while the timers for 5, 6, and 7 all expire without an ACK, so the sender resends those packets. Once they arrive, the receiver sends the ACKs.
- Another method is fast retransmission: when the receiver gets a segment whose sequence number is larger than expected, it detects a gap in the data stream and sends three duplicate ACKs. On receiving them, the sender knows that a segment was lost and retransmits it. For example, the receiver finds that 6, 8, and 9 have arrived but 7 has not, so it sends three ACKs for 6, asking for 7 next. When the sender receives the three ACKs, it realizes that 7 is missing and resends it immediately, without waiting for the timer.
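The fast retransmission rule can be simulated with two small functions. This sketch follows the common convention that the sender reacts to the third duplicate after the original ACK, so the sample adds a packet 10 to produce a third duplicate. The function names and the arrival order are made up for illustration.

```python
def receiver_acks(arrivals, first_id=1):
    """Return the cumulative ACK the receiver sends after each arriving packet."""
    received, ack, acks = set(), first_id - 1, []
    for packet_id in arrivals:
        received.add(packet_id)
        while ack + 1 in received:     # advance over the gap-free prefix
            ack += 1
        acks.append(ack)               # out-of-order arrivals repeat the old ACK
    return acks

def fast_retransmit(acks, dup_threshold=3):
    """Return the packet IDs the sender retransmits after enough duplicate ACKs."""
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                retransmitted.append(ack + 1)   # the segment right after the ACKed one
        else:
            last_ack, dup_count = ack, 0
    return retransmitted

# Packet 7 is lost; 8, 9, and 10 arrive out of order behind the gap.
acks = receiver_acks([1, 2, 3, 4, 5, 6, 8, 9, 10])
resent = fast_retransmit(acks)
```

Because the duplicates arrive within roughly one round trip, the sender can repair the loss well before a retransmission timer would have expired.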
References
UDP, RFC 768: tools.ietf.org/html/rfc768
TCP, RFC 793: tools.ietf.org/html/rfc793
Stackoverflow, UDP checksum calculation, Sep 2017: stackoverflow.com/questions/1…
Baidu encyclopedia, UDP: baike.baidu.com/item/UDP/57…
Baidu encyclopedia, TCP: baike.baidu.com/item/TCP/33…
The difference between TCP and UDP: blog.csdn.net/zhang622328…
Understanding the difference between TCP and UDP: www.cnblogs.com/fundebug/p/…