Protocol Feature Overview

| Protocol | Connectivity | Duplex | Reliability | Ordering | Message boundaries | Congestion control | Transmission speed | Efficiency | Header size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UDP | Connectionless | n:m | Unreliable (data is lost when packets are lost) | Unordered | Preserved (no sticky packets) | None | Fast | High | 8 bytes |
| TCP | Connection-oriented | 1:1, full duplex | Reliable (retransmission mechanism) | Ordered (sorted by sequence number) | None (packets can stick together) | Yes | Slow | Low | 20–60 bytes |

UDP

Message-oriented

UDP is a message-oriented protocol: it neither splits nor merges the messages handed down by the application layer.

To be specific:

  • At the sending end, the application layer passes data to UDP at the transport layer. UDP simply prepends a UDP header and hands the result down to the network layer
  • At the receiving end, the network layer passes data up to the transport layer. UDP only removes the IP header and delivers the data to the application layer, without any splitting or merging, as the sketch below illustrates
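A minimal sketch of this boundary-preserving behaviour, using Python's standard socket module over loopback (the payloads are arbitrary example values): each sendto() becomes exactly one datagram and each recvfrom() returns exactly one datagram.

```python
import socket

# Two UDP sockets on loopback: UDP never splits or merges application messages.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", addr)        # one send ...
sender.sendto(b"world", addr)        # ... is one datagram

print(receiver.recvfrom(1024)[0])    # b'hello' -- first message, intact
print(receiver.recvfrom(1024)[0])    # b'world' -- second message, intact

sender.close()
receiver.close()
```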

Unreliability

  1. UDP is connectionless: communication does not require establishing or tearing down a connection
  2. UDP is unreliable: it does not back up the data it sends and does not guarantee that data reaches the peer
  3. UDP has no congestion control and always sends data at a constant rate, which can lead to packet loss under poor network conditions, but is an advantage in scenarios with strict real-time requirements

Efficiency

UDP is not as complex as TCP, with a header of only 8 bytes.

It mainly contains the following fields (a parsing sketch follows this list):

  • Two 16-bit port numbers: source port (optional) and destination port
  • The length of the entire datagram
  • A checksum over the entire datagram (optional in IPv4), used to detect errors in the header and the data
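As an illustration, the 8-byte header can be packed and unpacked with Python's struct module. The field layout below follows the list above; the port numbers and payload are made-up example values, and a checksum of 0 means "not computed", which IPv4 permits.

```python
import struct

# The 8-byte UDP header: source port, destination port, length (header + data),
# checksum. All four fields are 16-bit, big-endian ("network byte order").
def build_udp_header(src_port, dst_port, payload, checksum=0):
    length = 8 + len(payload)            # header is always 8 bytes
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

def parse_udp_header(segment):
    src, dst, length, checksum = struct.unpack("!HHHH", segment[:8])
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}

header = build_udp_header(5353, 53, b"query")
print(parse_udp_header(header))
# {'src_port': 5353, 'dst_port': 53, 'length': 13, 'checksum': 0}
```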

Transmission modes

UDP supports not only one-to-one transmission but also one-to-many, many-to-many, and many-to-one; in other words, it provides unicast, multicast, and broadcast.
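A small sketch of the unicast and broadcast cases with a single UDP socket (the addresses and port are arbitrary example values; broadcast additionally requires the SO_BROADCAST option on most systems). Multicast works similarly but needs group-membership options, which are omitted here.

```python
import socket

# The same UDP socket can unicast or broadcast.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Unicast: send to one specific host.
sock.sendto(b"to one host", ("127.0.0.1", 9999))

# Broadcast: opt in first, then send to the broadcast address.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.sendto(b"to every host on the subnet", ("255.255.255.255", 9999))

sock.close()
```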

TCP

Header

The following fields in the TCP header are important (a parsing sketch follows this list):

  • Sequence number: ensures that TCP segments are kept in order; the receiver can reassemble segments according to this number
  • Acknowledgement number: indicates the sequence number of the next byte the receiver expects to get, which also implies that all bytes before that number have been received
  • Window size: indicates how many more bytes of data the receiver can accept; used for flow control
  • Flag bits:
    • URG=1: the data portion of this segment contains urgent information and should be treated with high priority; the urgent pointer is valid. The urgent data sits at the very front of the data portion, and the urgent pointer marks where the urgent data ends.
    • ACK=1: the acknowledgement number field is valid. TCP also specifies that every segment sent after the connection is established must have ACK set to 1.
    • PSH=1: the receiver should push the data to the application layer immediately instead of waiting until its buffer is full.
    • RST=1: there is a serious problem with the current TCP connection, which may have to be re-established. It can also be used to reject illegal segments or refuse connection requests.
    • SYN=1: when SYN=1 and ACK=0, the segment is a connection request; when SYN=1 and ACK=1, the segment is a reply agreeing to establish the connection.
    • FIN=1: the segment is a request to release the connection.
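A sketch of how these fields sit in the fixed 20-byte header, again using Python's struct; the example SYN segment below is hand-built with made-up port numbers.

```python
import struct

# Fixed 20-byte TCP header layout: src port, dst port, sequence number,
# acknowledgement number, data offset + flags, window size, checksum, urgent pointer.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(segment):
    (src, dst, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    flags = {name: bool(offset_flags & (1 << i)) for i, name in enumerate(FLAG_NAMES)}
    return {"seq": seq, "ack": ack, "window": window, "flags": flags}

# A hand-built SYN segment (seq=100, ack=0, only the SYN flag set, window=65535).
syn = struct.pack("!HHIIHHHH", 1234, 80, 100, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```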

State machine

HTTP is said to be connectionless, and in a sense the same is true of the underlying TCP protocol: although TCP appears to connect the two ends, in reality both ends simply maintain a shared state together:

TCP's state machine is quite complex and is closely tied to the handshakes used when establishing and closing a connection, so we will describe both handshake procedures in detail.

In addition, RTT denotes the round-trip time from when the sender sends data until it receives the corresponding response from the peer.

Three-way handshake

In the TCP protocol, the client actively initiates the request and the server passively accepts it. After a TCP connection is established, both the client and the server can send and receive data, so TCP is also a full-duplex protocol.

Initially, both ends are in the CLOSED state. Before communication begins, each side creates a TCB (Transmission Control Block). The server then enters the LISTEN state and waits for the client to send data.

The purpose of the three-way handshake is for both parties to confirm each other’s ability to send and receive messages

First handshake

The client initiates the connection

The client sends a connection request segment to the server; the segment carries the client's initial sequence number x. After sending the request, the client enters the SYN-SENT state.

Second handshake

The server validates the client’s ability to send

After the server receives the connection request segment, if it agrees to the connection, it sends a reply that carries its own initial sequence number y. After sending the reply, the server enters the SYN-RECEIVED state.

Third handshake

The client verifies the server’s ability to receive and send

The server, on receiving this message, confirms the client's ability to receive

When the client receives a connection approval reply, it sends an acknowledgement message to the server. After sending the packet, the client enters the ESTABLISHED state. After receiving the response, the server also enters the ESTABLISHED state.

PS: The third handshake can carry data, via TCP Fast Open (TFO). In fact, any protocol that involves a handshake can use a TFO-like approach: the client and server store the same cookie, and the cookie is sent during the next handshake to reduce the RTT.
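The exchange can be summarised as a small state-transition sketch (x and y stand for the client's and server's initial sequence numbers):

```python
# A walk through the three-way handshake described above.
def three_way_handshake():
    client, server = "CLOSED", "LISTEN"      # TCBs created, server listening

    # 1st handshake: client sends SYN with seq = x.
    client = "SYN-SENT"
    print(f"client -> server: SYN, seq=x              client={client}, server={server}")

    # 2nd handshake: server replies SYN+ACK with seq = y, ack = x + 1.
    server = "SYN-RECEIVED"
    print(f"server -> client: SYN+ACK, seq=y, ack=x+1  client={client}, server={server}")

    # 3rd handshake: client sends ACK with ack = y + 1; both sides are connected.
    client = server = "ESTABLISHED"
    print(f"client -> server: ACK, ack=y+1             client={client}, server={server}")

three_way_handshake()
```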

Four-way wave

TCP is full-duplex. When the connection is disconnected, both ends need to send FIN and ACK packets.

The purpose of the four waves is to let each side confirm that the other has finished sending data, so that both directions of the connection can be closed.

First wave

Client: I’ve finished sending the message

If client A considers its data transfer complete, it needs to send a connection release request to server B.

Second wave

Server: OK, but wait a minute. I’ll take care of the rest

After receiving the connection release request, B tells the application layer to release the TCP connection, then sends an ACK packet and enters the CLOSE-WAIT state. This indicates that the connection from A to B has been released and B will no longer accept data sent by A. But because the TCP connection is bidirectional, B can still send data to A.

Third wave

Server: OK, now I have finished sending my data too

Once B has finished sending any remaining data, it sends a connection release request to A. B then enters the LAST-ACK state.

PS: The second and third waves can be combined by means of delayed acknowledgement (usually with a time limit, otherwise the peer will mistakenly assume that a retransmission is needed).

Fourth wave

Client: The message is finished, so let’s disconnect

Server: Received

After receiving the release request, A sends a confirmation reply to B and enters the TIME-WAIT state. This state lasts for 2 MSL (maximum segment lifetime, the longest time a segment can survive in the network; segments that exceed it are discarded). If B does not resend its request within this period, A enters the CLOSED state. When B receives the confirmation reply, it also enters the CLOSED state.

Why does A enter the TIME-WAIT state and wait 2 MSL before entering the CLOSED state?

To ensure that B can receive A's acknowledgement. If A entered the CLOSED state immediately after sending the acknowledgement, and that acknowledgement were lost due to network problems, B would be unable to close normally.

Why can't the second and third waves be merged, and what happens between them?

When the server sends the second wave, the client will not send any more data to the server, but the server may still be in the middle of sending data to the client. The server therefore waits until that transmission is complete before sending its own close request.
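The four waves and the TIME-WAIT behaviour can likewise be summarised as a small state-transition sketch (the FIN-WAIT states for A are the standard names; the text above only names B's states and A's TIME-WAIT):

```python
# A walk through the four waves described above; printed states follow the text.
def four_way_wave():
    a, b = "ESTABLISHED", "ESTABLISHED"

    a = "FIN-WAIT-1"                    # 1st wave: A sends FIN (done sending)
    print(f"A -> B: FIN        A={a}, B={b}")

    b, a = "CLOSE-WAIT", "FIN-WAIT-2"   # 2nd wave: B ACKs, but may still send data
    print(f"B -> A: ACK        A={a}, B={b}")

    b = "LAST-ACK"                      # 3rd wave: B sends FIN once it is done too
    print(f"B -> A: FIN        A={a}, B={b}")

    a = "TIME-WAIT"                     # 4th wave: A ACKs and waits 2 * MSL
    print(f"A -> B: ACK        A={a}, B={b}")

    b = "CLOSED"                        # B closes on receiving the final ACK
    a = "CLOSED"                        # A closes after the 2MSL timer expires
    print(f"after 2MSL         A={a}, B={b}")

four_way_wave()
```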

Keepalive timer

In addition to the TIME-WAIT timer, TCP also has a keepalive timer. Imagine a scenario where a client actively establishes a TCP connection with a server, but then the client's host suddenly fails. The server can obviously no longer receive data from the client, so measures are needed to keep the server from waiting in vain. This is where the keepalive timer comes in.

Every time the server receives data from the client, it resets this timer; the timeout is usually set to two hours. If no data arrives from the client within two hours, the server sends a probe segment every 75 seconds. If the client does not respond to 10 consecutive probe segments, the server considers the client unreachable and closes the connection.
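For reference, these keepalive parameters can be tuned per socket. A minimal sketch with Python on Linux (the TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT option names are Linux-specific), using the values mentioned above:

```python
import socket

# Tune the kernel's keepalive probes for one TCP socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)       # turn keepalive on
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 7200)   # idle seconds before probing (2 h)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)    # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10)      # probes before giving up
```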

How does TCP ensure reliable transmission

  • Checksum: the sender computes a checksum before sending and the receiver recomputes it after receiving; if the two do not match, the transmission was corrupted (a checksum sketch follows this list)
  • Acknowledgement and sequence numbers: every byte transmitted over TCP is numbered, and each ACK the receiver sends carries an acknowledgement number
  • Timeout retransmission: see the ARQ protocol below. In short, if the sender receives no ACK within a certain time after sending data, it retransmits the data
  • Connection management: the three-way handshake and four-way wave, described above
  • Flow control: see the sliding window section below
  • Congestion control: see the congestion control section below
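As an illustration of the first point, here is a sketch of the 16-bit ones'-complement checksum that TCP and UDP use (the payload is an arbitrary example; real TCP also covers a pseudo-header, omitted here):

```python
def internet_checksum(data: bytes) -> int:
    """Sum the data as 16-bit words, fold carries back in, return the complement."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

payload = b"some tcp payload"                      # 16 bytes, arbitrary example
checksum = internet_checksum(payload)
print(hex(checksum))

# Receiver side: recomputing over data + checksum gives 0 only if nothing changed.
print(internet_checksum(payload + checksum.to_bytes(2, "big")) == 0)               # True
print(internet_checksum(b"some tcp payloaD" + checksum.to_bytes(2, "big")) == 0)   # False
```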

ARQ protocol

ARQ is the so-called timeout retransmission mechanism: it guarantees correct delivery of data through acknowledgements and timeouts. ARQ comes in two flavours: stop-and-wait ARQ and continuous ARQ.

Stop-and-wait ARQ

  • Normal transmission: after A sends a segment to B, it stops sending and starts a timer while waiting for the peer's reply. If the reply arrives before the timer expires, A cancels the timer and sends the next segment

  • Packet loss or error: a packet may be lost in transit. When the timer expires, the data is sent again, repeatedly, until the peer responds; for this reason every piece of sent data must be backed up. Even a packet that reaches the peer may be corrupted along the way, in which case the peer discards it and waits for the retransmission. PS: the timer is generally set to a value larger than the average RTT (round-trip time).

  • ACK timeout or loss: the peer's reply may itself be lost or delayed. If the timer expires, A retransmits the packet; when B receives a packet with a sequence number it has already accepted, it discards the packet and replies again, until A moves on to the next sequence number. In the delayed case, A checks whether that sequence number has already been acknowledged; if so, it simply discards the duplicate reply.

The drawback of this protocol is its low transmission efficiency: even in a good network environment, every packet must wait for the peer's ACK.
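A toy simulation of stop-and-wait ARQ (the loss rate and payloads are arbitrary; real timers are replaced by a simple "no ACK means timeout" check):

```python
import random

# Send one packet at a time, wait for its ACK, retransmit on "timeout".
def unreliable_channel(packet, loss_rate=0.3):
    return None if random.random() < loss_rate else packet

def stop_and_wait(packets):
    for seq, data in enumerate(packets):
        while True:
            delivered = unreliable_channel((seq, data))      # send and keep a backup
            ack = unreliable_channel(seq) if delivered is not None else None
            if ack == seq:                                   # ACK arrived before timeout
                print(f"seq={seq} acknowledged")
                break
            print(f"seq={seq} timed out, retransmitting")    # timer expired: resend

stop_and_wait([b"first", b"second", b"third"])
```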

Continuous ARQ

In continuous ARQ, the sender maintains a send window and can keep sending the data inside the window without waiting for replies. Compared with stop-and-wait ARQ, this reduces waiting time and improves efficiency.

Cumulative acknowledgement

In continuous ARQ, the receiver receives packets continuously. Replying to every single packet, as stop-and-wait ARQ does, would waste resources; instead, the receiver can send one reply after receiving several packets. The ACK in that reply tells the sender that all data up to and including that sequence number has been received, and asks for sequence number + 1 next.

But cumulative acknowledgement has a downside. When the receiver gets the packet with sequence number 5, it may miss number 6 yet receive number 7 and later packets. In that case the ACK can only ask for 6, which causes the sender to retransmit data the receiver already has. SACK can be used to solve this, as mentioned below.
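A tiny sketch of this rule: whatever arrives, the cumulative ACK can only name the sequence number right after the longest unbroken prefix received so far.

```python
# Compute the cumulative ACK ("send me this sequence number next").
def cumulative_ack(received, next_expected=1):
    got = set(received)
    while next_expected in got:
        next_expected += 1
    return next_expected

print(cumulative_ack([1, 2, 3, 4, 5]))        # 6: everything up to 5 arrived
print(cumulative_ack([1, 2, 3, 4, 5, 7, 8]))  # still 6: number 6 is missing, so 7 and 8
                                              # cannot be acknowledged without SACK
```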

The sliding window

The send window mentioned in the previous section exists at both ends in TCP: there is a sender window and a receiver window.

The sender window contains data that has been sent but not yet acknowledged, together with data that can be sent but has not been sent yet.

The sender window is determined by the remaining size of the receiver window. The receiver writes the current remaining size of its receive window into the reply packet; on receiving the reply, the sender sets the size of its send window based on this value and the current level of network congestion. The size of the send window therefore changes constantly.

After receiving the reply packet, the sender slides the window

The sliding window implements flow control: the receiver tells the sender, via its reply packets, how much data it can still accept, ensuring that the receiver is able to keep up with the incoming data.
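A minimal sketch of the sender side of this mechanism (the byte counts and the advertised window of 4 are arbitrary example values; congestion control is ignored here):

```python
# Sender-side window: at most `receiver_window` unacknowledged bytes in flight;
# each ACK slides the left edge forward and frees room to send more.
class SenderWindow:
    def __init__(self, receiver_window):
        self.receiver_window = receiver_window   # advertised by the peer's ACKs
        self.base = 0                            # oldest unacknowledged byte
        self.next_seq = 0                        # next byte to send

    def can_send(self, nbytes):
        return self.next_seq + nbytes - self.base <= self.receiver_window

    def send(self, nbytes):
        assert self.can_send(nbytes), "window full, must wait for an ACK"
        self.next_seq += nbytes

    def on_ack(self, ack_no, new_receiver_window):
        self.base = max(self.base, ack_no)       # slide the window forward
        self.receiver_window = new_receiver_window

w = SenderWindow(receiver_window=4)
w.send(4)                                        # window is now full
print(w.can_send(1))                             # False: must wait
w.on_ack(ack_no=2, new_receiver_window=4)        # peer acknowledged 2 bytes
print(w.can_send(2))                             # True: room again
```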

Zero window

While packets are being sent, the peer may advertise a zero window. In that case the sender stops sending data and starts a persist timer, which periodically sends probes to ask the peer for its current window size. If the number of retries exceeds a certain limit, the TCP connection may be torn down.

Congestion control

Congestion control is different from flow control. The latter acts on the receiver, ensuring that the receiver has time to accept the data; the former acts on the network, preventing too much data from congesting it and keeping the network load from becoming excessive.

Congestion control involves four algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery.

Slow start algorithm

The slow start algorithm, as its name suggests, starts with a small send window at the beginning of a transmission and then grows it exponentially, which avoids congesting the network with a large amount of data right away.

The steps of the slow-start algorithm are as follows:

  1. When the connection is established, initialize the congestion window (cwnd) to 1 MSS (the maximum amount of data in one segment)
  2. Double the window size after every RTT
  3. When this exponential growth reaches a certain threshold (ssthresh), switch to the congestion avoidance algorithm

Congestion avoidance algorithm

The congestion avoidance algorithm is comparatively simple: the window grows by 1 MSS per RTT. This avoids the congestion that exponential growth would cause and slowly adjusts the window toward its optimal size.

If the timer times out during transmission, TCP assumes the network is congested and immediately does the following (a cwnd sketch follows this list):

  • Set the threshold (ssthresh) to half of the current congestion window
  • Set the congestion window to 1 MSS
  • Restart the slow start algorithm
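The interplay of slow start, congestion avoidance, and the timeout reaction above can be sketched as a tiny simulation (cwnd is in MSS units; the initial ssthresh of 16 and the timeout at RTT 7 are arbitrary example values):

```python
# cwnd evolution: double per RTT in slow start, +1 per RTT in congestion
# avoidance; a timeout halves ssthresh, resets cwnd to 1, and restarts slow start.
def simulate(rtts, timeout_at=None, cwnd=1, ssthresh=16):
    for rtt in range(1, rtts + 1):
        if rtt == timeout_at:                   # timer expired: assume congestion
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            cwnd *= 2                           # slow start: exponential growth
        else:
            cwnd += 1                           # congestion avoidance: linear growth
        print(f"RTT {rtt:2d}: cwnd={cwnd:2d} ssthresh={ssthresh}")

simulate(rtts=10, timeout_at=7)
```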

Fast retransmit

Fast retransmit usually works together with fast recovery. When the receiver gets packets out of order, it keeps acknowledging only the last in-order packet (assuming no SACK). If the sender receives three duplicate ACKs, it starts fast retransmit immediately instead of waiting for the timer to expire. There are two specific algorithms (a comparison sketch follows their descriptions):

TCP Tahoe:

  • Set the threshold to half of the current congestion window
  • Set the congestion window to 1 MSS
  • Restart the slow start algorithm

TCP Reno:

  • Halve the congestion window
  • Set the threshold to this halved congestion window
  • Enter the fast recovery phase (retransmit the segment the peer is asking for; this phase is normally exited once a new, non-duplicate ACK arrives)
  • Resume the congestion avoidance algorithm
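A small sketch contrasting the Tahoe and Reno reactions to three duplicate ACKs described above (cwnd values are in MSS units; the floor of 2 MSS is a common convention assumed here):

```python
# What happens to cwnd and ssthresh when three duplicate ACKs arrive.
def on_three_duplicate_acks(cwnd, variant="reno"):
    ssthresh = max(cwnd // 2, 2)
    if variant == "tahoe":
        return 1, ssthresh, "slow start"            # back to cwnd = 1 MSS
    cwnd = ssthresh                                 # Reno: halve the window instead
    return cwnd, ssthresh, "fast recovery -> congestion avoidance"

print(on_three_duplicate_acks(32, "tahoe"))  # (1, 16, 'slow start')
print(on_three_duplicate_acks(32, "reno"))   # (16, 16, 'fast recovery -> congestion avoidance')
```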

TCP New Reno

The TCP New Reno algorithm improves on a defect of TCP Reno: previously, fast recovery was exited as soon as any new ACK packet arrived. In TCP New Reno, the sender first records the highest sequence number outstanding when the three duplicate ACKs arrive; if a later ACK acknowledges only part of the data up to that point, the sender knows another segment was lost and retransmits it without leaving fast recovery.

TCP sticky packets

If the client continuously sends data packets to the server, two data packets may be stuck together.

The reasons are as follows:

  1. TCP is byte-stream based. Although the application layer and TCP exchange data blocks of varying sizes, TCP treats these blocks as an unstructured stream of bytes with no boundaries.
  2. Looking at the TCP frame structure, there is no data-length field in the TCP header.

Because of these two points, packet sticking or splitting can occur when data is transmitted over TCP.

The receiver ends up with two packets, but each of them may be incomplete or carry extra bytes that belong to the other.

How sticky packets arise

  1. Sticky packets produced by the sender

A client and server that transmit data over TCP usually maintain a long-lived connection. With the connection constantly open, the two sides can transmit data continuously. However, when the packets being sent are very small, TCP enables the Nagle algorithm by default, which merges these small packets before sending them; the merging takes place in the send buffer.

  2. Sticky packets produced by the receiver

The procedure for receiving data using TCP is as follows:

  • Data destined for the receiver is passed up from the lower layers of the network model to the transport layer
  • The transport layer's TCP protocol places it in the receive buffer, from which the application layer actively reads it

The problem arises when the reading routine in the program cannot take data out of the buffer in time: the next chunk of data arrives and is appended to the end of the buffer, so by the time the data is read, it has stuck together.

Solving sticky and split packets

  1. Use special delimiter characters
  2. Add the packet length to the header (sketched below)
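A minimal sketch of the second option: prefix every message with a 4-byte big-endian length so the receiver can split the byte stream back into whole messages, no matter how TCP merged or split them on the wire (the framing format here is just an example).

```python
import struct

# Length-prefix framing over a TCP byte stream.
def frame(message: bytes) -> bytes:
    return struct.pack("!I", len(message)) + message

def unframe(stream: bytes):
    messages, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        if offset + 4 + length > len(stream):
            break                                   # incomplete message: wait for more bytes
        messages.append(stream[offset + 4 : offset + 4 + length])
        offset += 4 + length
    return messages, stream[offset:]                # leftover bytes stay buffered

# Two messages "stuck together" in one TCP read are still recovered cleanly.
stream = frame(b"hello") + frame(b"world")
print(unframe(stream))                              # ([b'hello', b'world'], b'')
```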
