Background knowledge

MTU and MSS

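The MTU (maximum transmission unit) of the data link layer defines the maximum length of a network packet that a single frame can carry, for example 1500 bytes on standard Ethernet.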

If the length of an IP packet exceeds the MTU, it is divided into several fragments smaller than the MTU, each with its own IP header. The Total Length field of the IP header is 16 bits, which means that an IP packet can be up to 65535 bytes. When the IP layer is asked to send a message, it obtains the MTU of the local network and fragments the message accordingly. The receiver then reassembles the fragments into the original IP packet based on the IP headers. Fragmentation is inefficient and fragile: if a single fragment is lost, the whole packet is lost. To avoid it, TCP defines the maximum segment size (MSS).

The MSS defines the maximum length of the data in a single segment that a host expects the peer host to send over a TCP connection.

During the TCP three-way handshake, each host announces its MSS to the peer. Then, when the host calls the TCP layer to send data, the kernel splits the byte stream into multiple segments according to the MSS announced by the other side, and calls the IP layer to send them. If a segment is still larger than the MTU of a router on the path and the packet is marked Don't Fragment, the router returns an ICMP error carrying the MTU it can accept, so that the sending host can lower its MSS. (This segmentation happens when the operating system copies the data to be sent from user space to kernel space.)
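The relationship between MTU and MSS can be sketched in a few lines. This is only an illustration: it assumes IPv4 and TCP headers without options (20 bytes each), and the helper names mss_for_mtu and segment are made up for this example.

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IP_HEADER - TCP_HEADER

def segment(stream: bytes, mss: int) -> list[bytes]:
    """Split a byte stream into chunks of at most mss bytes."""
    return [stream[i:i + mss] for i in range(0, len(stream), mss)]

mss = mss_for_mtu(1500)                # standard Ethernet MTU
chunks = segment(b"x" * 4000, mss)
print(mss, [len(c) for c in chunks])   # 1460 [1460, 1460, 1080]
```

Because every chunk plus its headers fits within the MTU, the IP layer has no need to fragment it.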

TCP three-way handshake process

The three-way handshake establishes a TCP connection between two hosts. A TCP connection is a logical connection. When the hosts at both ends communicate, the client first creates a socket. TCP adds a TCP header to the data to form a TCP segment, which is passed from the socket down to the network layer and travels to the server. When the server receives the segment, a dedicated socket and TCP receive buffer are created for that client, and the data in the segment is delivered to the server process through the socket. So a TCP connection really consists of the send and receive buffers, the TCP segments, and the sockets that communicate with the processes. A TCP three-way handshake is therefore just an exchange of TCP segments between two end systems (and so is the four-way wave).

① Initially, the client TCP is in the CLOSED state. The client TCP generates a TCP segment and sets the SYN flag to 1. The client also randomly selects an initial sequence number and places it in the sequence number field of the SYN segment. The segment is then encapsulated in an IP datagram and sent to the server, and the client TCP enters the SYN_SENT state. (Note: choosing a random initial sequence number reduces the chance that a delayed segment from an earlier, already terminated connection with the same port numbers is mistaken for a valid segment of the new connection. During data transmission, the sequence number of a TCP segment is the byte-stream number of the first byte in that segment.)

② The server is listening on a port (for example, port 80) and is in the LISTEN state. The server receives the IP datagram, extracts the SYN segment, and allocates the TCP receive buffer for the connection. The server then generates a segment, sets the SYN flag to 1, selects a random initial sequence number of its own, and places it in the sequence number field. It places the client's initial sequence number + 1 in the acknowledgment number field, indicating the sequence number of the next byte it expects from the client, and sets the ACK flag to 1. The SYN+ACK segment is then put into an IP datagram and sent to the client. The server changes from the LISTEN state to the SYN_RCVD state. (Note: the send and receive buffers act as intermediaries between the socket and the lower layers; data is queued in them until it can be sent or read.)

③ The client receives the IP datagram, extracts the SYN+ACK segment, and allocates the TCP send buffer for the connection. The client then generates a segment with the SYN flag set to 0 and the ACK flag set to 1. Its sequence number field holds the client's initial sequence number + 1, and its acknowledgment number field holds the sequence number of the received SYN+ACK segment + 1. This segment may already carry data, because from the client's point of view the connection is established. The client sends the segment and enters the ESTABLISHED state. When the server receives it, the server also enters the ESTABLISHED state, and the TCP connection is established.
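The numbers exchanged in the three steps can be traced with a small simulation. This is a sketch only: the Segment class and the three_way_handshake function are hypothetical names for this example, and no real packets are sent. Note that the third segment's sequence number is the client's initial sequence number + 1, because the SYN itself consumes one sequence number.

```python
import random
from dataclasses import dataclass

@dataclass
class Segment:
    syn: bool
    ack_flag: bool
    seq: int
    ack: int = 0

def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the SYN, SYN+ACK and ACK segments for the given initial sequence numbers."""
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = Segment(syn=True, ack_flag=False, seq=client_isn)
    syn_ack = Segment(syn=True, ack_flag=True, seq=server_isn,
                      ack=(syn.seq + 1) % wrap)
    ack = Segment(syn=False, ack_flag=True, seq=syn_ack.ack,
                  ack=(syn_ack.seq + 1) % wrap)
    return [syn, syn_ack, ack]

# Random initial sequence numbers, as chosen by each side in steps 1 and 2
for s in three_way_handshake(random.getrandbits(32), random.getrandbits(32)):
    print(s)
```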

Three-way handshake from the socket perspective

When a server binds and listens on a port, the kernel creates a SYN queue (half-connection queue) and an accept queue for that port. The client calls connect() to start a connection, which sends a SYN segment. The server receives the SYN segment, places the connection in the SYN queue, and returns a SYN+ACK to the client. After receiving the SYN+ACK, the client sends an ACK to the server. When the server receives the ACK, the kernel takes the connection out of the SYN queue and places it in the accept queue; at that point the connection is established. The server calls accept() to fetch the connected socket from the accept queue. (If syncookies are enabled and the SYN queue is full, the server does not store the half-open connection. Instead it encodes the connection information into a cookie carried in the SYN+ACK. The client's ACK carries the cookie back, and the server uses it to restore the connection.)
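The queue behavior can be observed with Python's standard socket module on the loopback interface. In this minimal sketch, connect() returns as soon as the kernel has completed the handshake, before the server application has called accept(). The backlog value 5 and the use of port 0 (let the OS pick a free port) are arbitrary choices for the example.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(5)                # backlog hint for the accept queue
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))  # returns once the kernel has finished the handshake

# accept() has not been called until now; the established connection
# has been waiting in the accept queue
conn, peer = server.accept()
client.sendall(b"hello")
received = conn.recv(1024)
print(received)

for s in (conn, client, server):
    s.close()
```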

Why does TCP need a three-way handshake?

① To avoid confusion caused by an old connection request arriving at the server late. In the normal case, the client sends a connection request, receives no confirmation because the request was lost, and retransmits it. The retransmitted request is confirmed, a connection is established, data is transferred, and the connection is released. Now suppose the first request was not lost but delayed in the network, and it reaches the server after that connection has been released. The server takes it for a new connection request and sends a confirmation to the client, agreeing to establish a connection. Without a three-way handshake, the connection would exist as soon as the server sent its confirmation. The client, which never asked for a new connection, ignores the confirmation and sends no data, while the server keeps waiting for data and wastes resources. With a three-way handshake, the client does not acknowledge the server's confirmation, so the server knows that no connection was established.

② The sequence number is the key to TCP's reliable transmission. Only with three segments can both the client and the server be sure that their initial sequence numbers and the other information needed to set up the connection (such as the MSS) have been reliably synchronized, which is what makes full-duplex communication between them possible.

③ Only when the server receives the ACK of the third handshake does it remove the connection from the SYN half-connection queue and place it in the accept queue. Until then, the connection is not valid.

TCP four-way wave

Four-way wave process

Either of the two processes in a TCP connection can terminate it. Here we assume that the client initiates the termination (see the three-way handshake above for how segments are exchanged).

① The client generates a segment with the FIN flag set to 1 and sends it to the server. The client closes its sending direction of the connection, and its state changes from ESTABLISHED to FIN_WAIT_1.

② After receiving the FIN segment, the server generates an ACK segment and sends it to the client. The server enters the CLOSE_WAIT state. At this point the client-to-server direction of the connection is released and TCP is in the half-closed state: the server can still send data to the client. The client receives the ACK and enters the FIN_WAIT_2 state.

③ When the server has no more data to send, it generates a FIN segment and sends it to the client. The server enters the LAST_ACK state.

④ After receiving the FIN segment, the client generates an ACK segment and sends it to the server. The client enters the TIME_WAIT state. The server receives the ACK and enters the CLOSED state. At this point both directions of the full-duplex connection are released and the four-way wave is complete.
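The state changes in the four steps above can be written down as a small transition table. This is a simplified sketch: the event names are invented for this example, sending an ACK is folded into the "recv FIN" events, and cases such as both sides closing at the same time are left out.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",   # active side, step 1
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # passive side, step 2
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",    # active side, step 2
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",      # passive side, step 3
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # active side, step 4
    ("LAST_ACK", "recv ACK"): "CLOSED",          # passive side, step 4
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",     # active side, after waiting
}

def run(state: str, events: list[str]) -> list[str]:
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path

client_path = run("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
server_path = run("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```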

TIME_WAIT on the active closing side:

Reasons for existence:

  • If the last ACK sent by the client is lost, the server retransmits its FIN segment. The client must therefore stay in the TIME_WAIT state so that it can receive the retransmitted FIN and resend the ACK.

  • If a segment sent by the server is delayed and arrives after the client has closed the connection, it could corrupt data when a new connection reuses the same addresses and ports and the segment's sequence number happens to be valid for that new connection. So the client should keep discarding delayed segments for some time.

Duration of TIME_WAIT: generally 2MSL, where MSL is the maximum segment lifetime, the longest time a segment can survive in the network. If a segment arrives while the connection is in TIME_WAIT, the 2MSL timer is restarted. The value is twice the MSL because the client's last ACK may take up to one MSL to reach the server, and a FIN retransmitted by the server may take up to one MSL to travel back. After 2MSL, all segments of this connection have disappeared from the network, which ensures that a new connection will not be damaged.

A large number of connections on the passive closing side are in the CLOSE_WAIT state:

The application on the passive closing side is blocked, or is busy with a time-consuming operation, and does not close its own direction of the connection. As a result, no FIN segment is sent and the connection stays stuck in the CLOSE_WAIT state.

Why does TCP need a four-way wave?

To ensure that all data can be transmitted completely. When the passive closing side receives a FIN segment from the active closing side, it only means that the active side has no more data to send. Because a TCP connection is full duplex, the passive side may still have data to send, so it first replies with just an ACK, which closes the direction from the active side to the passive side. After the passive side has sent its remaining data, it sends its own FIN segment to indicate that it has no more data to send either. Only then is the connection closed in both directions.

How does TCP ensure reliable transmission?

Checksum:

The checksum field in the TCP header covers the TCP header, the TCP data, and a pseudo-header containing the source IP address, the destination IP address, the protocol number, and the TCP length.
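As an illustration, the checksum is the 16-bit one's complement of the one's complement sum of all 16-bit words. The sketch below assumes IPv4; the function names are made up for this example, and the segment passed to tcp_checksum must have its checksum field set to zero when computing a fresh value.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                 # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP header and data."""
    pseudo_header = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
                     + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    return internet_checksum(pseudo_header + segment)

print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))  # 0x220d
```

A receiver can verify a segment by running the same computation over the received bytes, checksum field included: the result is 0 if nothing was corrupted.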

Sequence number and acknowledgment number:

In a TCP segment, the sequence number is the byte-stream number of the first byte of the segment's data, and the acknowledgment number is the sequence number of the next byte that the host expects to receive. That is, when the peer returns an acknowledgment number equal to the sequence number of the last byte sent + 1, it has successfully received all bytes up to that point.
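A short sketch of how a receiver could derive the acknowledgment number from the segments it has received. The function next_ack is hypothetical, and the sketch ignores sequence number wraparound and overlapping segments.

```python
def next_ack(initial_expected: int, segments: list[tuple[int, bytes]]) -> int:
    """Return the acknowledgment number after receiving (seq, payload) segments."""
    expected = initial_expected
    buffered = {}                       # out-of-order segments, keyed by sequence number
    for seq, payload in segments:
        buffered[seq] = payload
        while expected in buffered:     # deliver every segment that is now in order
            expected += len(buffered.pop(expected))
    return expected

# In-order delivery: 100 + 50 bytes starting at 1000
print(next_ack(1000, [(1000, b"a" * 100), (1100, b"b" * 50)]))  # 1150
# A gap at 1100: the acknowledgment number stays at the start of the gap
print(next_ack(1000, [(1000, b"a" * 100), (1200, b"c" * 50)]))  # 1100
```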

The retransmission mechanism

  • Timeout retransmission: After sending a segment, the sender puts it into a queue and starts a retransmission timer. If no ACK is received within the timeout period, the segment is resent. If an ACK is received, the segment is removed from the queue. (The retransmission timeout should be slightly longer than the RTT, the round-trip time of a segment. When a retransmitted segment times out again, TCP doubles the next timeout interval, to avoid retransmitting too frequently when the network is in poor condition.)
  • Fast retransmission and SACK (selective acknowledgment): described under congestion control.
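The doubling of the timeout can be sketched as follows. The function name, the initial value of 1 second, and the 60 second cap are assumptions chosen for the illustration, not values taken from this article.

```python
def backoff_schedule(initial_rto: float, retries: int, max_rto: float = 60.0) -> list[float]:
    """Timeout used for each successive retransmission of the same segment."""
    rto = initial_rto
    schedule = []
    for _ in range(retries):
        schedule.append(rto)
        rto = min(rto * 2, max_rto)   # double after every timeout, up to the cap
    return schedule

print(backoff_schedule(1.0, 5))   # [1.0, 2.0, 4.0, 8.0, 16.0]
```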

Flow control

Each end of a TCP connection has a receive buffer for the connection. When TCP receives bytes that are correct and in order, it puts the data into the receive buffer, from which the application process reads it. If the application reads too slowly, or the sender sends too much too fast, the receive buffer overflows. TCP therefore provides flow control to prevent the receive buffer from overflowing.

Both sides of the TCP connection maintain a receive window. The receiver puts its receive window RWND into the window field of the segments it sends, telling the sender how much space is available in its receive buffer. The sender tracks the amount of data it has sent but that has not yet been acknowledged, and keeps this amount within RWND so that the receiver's buffer does not overflow. The same applies in the other direction, so each side keeps track of the other's receive window.

Note: TCP sends a segment only when it has data or an acknowledgment to send. So when the receive window is full, the sender stops sending, and the receiver, having nothing to send, cannot tell the sender when space becomes available again. For this reason the TCP specification requires that, when the receive window is zero, the sender keeps sending segments containing one byte of data; the receiver's acknowledgments of these segments carry the current receive window. The bytes in the sliding window are held in operating system buffers, so if the application layer does not read the data sent by the peer in time, the receive window shrinks.
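The sender's bookkeeping can be sketched like this. The function and parameter names are hypothetical, and the persist timer that paces zero-window probes in a real stack is left out.

```python
def usable_window(rwnd: int, last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the sender may still send without overflowing the receiver's buffer."""
    in_flight = last_byte_sent - last_byte_acked   # sent but not yet acknowledged
    return max(0, rwnd - in_flight)

def bytes_to_send(rwnd: int, last_byte_sent: int, last_byte_acked: int, pending: int) -> int:
    """How many of the pending bytes go out now."""
    if rwnd == 0 and pending > 0:
        return 1                                   # one-byte zero-window probe
    return min(pending, usable_window(rwnd, last_byte_sent, last_byte_acked))

print(usable_window(4096, 3000, 1000))           # 2096
print(bytes_to_send(4096, 3000, 1000, 10000))    # 2096
print(bytes_to_send(0, 5000, 5000, 100))         # 1
```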

Reducing small packets: Each packet carries a fixed 20-byte TCP header and a 20-byte IP header, so sending many tiny packets wastes bandwidth. Related mechanisms are silly window syndrome avoidance and delayed acknowledgment (which reduces the number of ACKs sent).



Congestion control:

TCP lets the application layer send byte streams of arbitrary length, so a single TCP connection can try to occupy the full bandwidth. If many TCP connections try to do so at the same time, severe congestion may occur. Congestion control reduces network congestion, prevents TCP senders from stalling because the IP network is congested, and improves the overall transmission speed of all TCP connections.

The congestion control algorithm consists of four parts:

① Slow start: for each TCP connection, the sender maintains a congestion window variable CWND, which limits how much data the sender may have in flight in the network. The send window is the minimum of the peer's receive window and CWND. During slow start, CWND increases by one MSS for every ACK received, so it roughly doubles every RTT (exponential growth). The initial congestion window is commonly 10 MSS.

② Congestion avoidance: a slow start threshold, ssthresh, is defined. When CWND reaches ssthresh, the sender enters the congestion avoidance phase, in which CWND grows linearly: by MSS * MSS / CWND for each ACK, which is about one MSS per RTT. When the continued growth of CWND causes packet loss detected by a timeout, ssthresh is set to half of the current CWND, CWND drops to 1 MSS, and slow start begins again.
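The growth pattern of slow start and congestion avoidance can be sketched per round trip. This is a simplified model, not a real TCP implementation: CWND is counted in whole MSS units, growth is applied once per RTT rather than per ACK, slow start is capped at ssthresh, and loss_rounds marks the rounds in which a timeout is assumed to occur. The example starts from 1 MSS instead of 10 so that the doubling is easier to see.

```python
def cwnd_per_rtt(rounds: int, ssthresh: int, initial_cwnd: int = 10,
                 loss_rounds: frozenset = frozenset()) -> list[int]:
    """CWND (in MSS units) at the start of each round trip."""
    cwnd = initial_cwnd
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:                 # loss detected by timeout
            ssthresh = max(cwnd // 2, 2)     # threshold becomes half of CWND
            cwnd = 1                         # restart from slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: doubles every RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 MSS per RTT
    return history

print(cwnd_per_rtt(8, ssthresh=16, initial_cwnd=1, loss_rounds=frozenset({6})))
# [1, 2, 4, 8, 16, 17, 18, 1]
```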

③ Fast retransmission and fast recovery:

The segment sent by the sender is lost, but its send window still has room, so the sender goes on to send the following segments without noticing the loss. The segments that the receiver then receives are out of order. Each time the receiver sends an ACK, the acknowledgment number is still the sequence number of the lost segment (because the receiver always acknowledges the start of the gap it is waiting for). If the sender receives three duplicate ACKs, it does not wait for the retransmission timer to fire, but immediately retransmits the missing segment.

When packet loss triggers fast retransmission, fast recovery begins: ssthresh is set to half of CWND, and CWND is set to ssthresh + 3 MSS. CWND then increases by one MSS for each further duplicate ACK received. When a new ACK arrives, CWND is set to ssthresh.
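The sender's reaction to duplicate ACKs can be sketched as a small handler. It is a simplified illustration: the state dictionary, its keys, and the returned action strings are invented for this example, CWND is counted in MSS units, and the normal window growth on new ACKs outside recovery is omitted.

```python
def on_ack(state: dict, ack: int) -> str:
    """Update cwnd/ssthresh (in MSS units) for one incoming ACK and name the action taken."""
    if ack == state["last_ack"]:                  # duplicate ACK
        state["dup_acks"] += 1
        if state["dup_acks"] == 3:                # third duplicate: loss assumed
            state["ssthresh"] = max(state["cwnd"] // 2, 2)
            state["cwnd"] = state["ssthresh"] + 3
            state["in_recovery"] = True
            return "fast retransmit"
        if state["in_recovery"]:                  # further duplicates inflate the window
            state["cwnd"] += 1
            return "inflate"
        return "duplicate"
    state["last_ack"] = ack                       # new data acknowledged
    state["dup_acks"] = 0
    if state["in_recovery"]:                      # recovery ends, window deflates
        state["cwnd"] = state["ssthresh"]
        state["in_recovery"] = False
        return "recovered"
    return "new ack"

state = {"cwnd": 20, "ssthresh": 64, "last_ack": 1000, "dup_acks": 0, "in_recovery": False}
actions = [on_ack(state, a) for a in (1000, 1000, 1000, 1000, 2000)]
print(actions, state["cwnd"], state["ssthresh"])
```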

④ SACK and selective retransmission: how much should the sender retransmit? If it retransmits only the missing segment, what if segments after it were lost as well? Retransmitting everything is very inefficient. TCP therefore defines two header options. The option with Kind 4 (SACK-Permitted) announces that selective acknowledgment is supported. The option with Kind 5 (SACK) carries the ranges of out-of-order segments that have already been received. The sender thus knows which segments after the missing one the receiver already has, and does not need to retransmit them.
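Given the cumulative acknowledgment number and the SACK ranges, the sender can work out which byte ranges are still unacknowledged. The sketch below uses a hypothetical function name and half-open [start, end) ranges, and ignores sequence number wraparound.

```python
def unacked_ranges(cum_ack: int, sack_blocks: list[tuple[int, int]],
                   highest_sent: int) -> list[tuple[int, int]]:
    """Byte ranges [start, end) covered neither by the cumulative ACK nor by a SACK block."""
    missing = []
    cursor = cum_ack
    for left, right in sorted(sack_blocks):
        if left > cursor:
            missing.append((cursor, left))    # hole in front of this SACK block
        cursor = max(cursor, right)
    if cursor < highest_sent:
        # Data after the last block: not yet acknowledged, though not necessarily lost
        missing.append((cursor, highest_sent))
    return missing

print(unacked_ranges(1000, [(1500, 2000), (2500, 3000)], 3000))
# [(1000, 1500), (2000, 2500)]
```

Only the holes need to be considered for retransmission; the ranges inside the SACK blocks are already at the receiver.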

⑤ Nagle algorithm: When an application process calls the send method, it may send only a small chunk of data each time (for example because of window limits), so the TCP and IP headers of each small packet occupy a disproportionate share of the bandwidth. Small TCP packets should therefore be combined into a larger one and sent together. The Nagle algorithm allows a TCP connection to have at most one small packet that has been sent but not yet acknowledged. Until that packet is acknowledged, no further small packets are sent; the data accumulates in the meantime and is sent together as one larger packet.
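The send decision of the Nagle algorithm can be sketched as a single predicate. The function nagle_can_send is a hypothetical simplification that ignores the receive and congestion windows. The second part uses the standard TCP_NODELAY socket option, which disables the algorithm for latency-sensitive applications.

```python
import socket

def nagle_can_send(data_len: int, mss: int, unacked_bytes: int) -> bool:
    """Decide whether buffered data may be sent right now."""
    if data_len >= mss:
        return True               # a full-sized segment is always worth sending
    return unacked_bytes == 0     # a small segment only if nothing is in flight

print(nagle_can_send(1460, 1460, 500))   # True
print(nagle_can_send(10, 1460, 500))     # False
print(nagle_can_send(10, 1460, 0))       # True

# Turning the algorithm off on a real socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```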

Keep-alive:

A long-lived connection that transfers no data for a long time still occupies memory. With keep-alive, if no data has been exchanged within a configured idle period, the keep-alive mechanism kicks in and sends probe segments. If a probe is acknowledged, the connection is alive and the keep-alive timer is reset. If no ACK arrives after a certain number of retries, the connection is closed. Note that a TCP connection is identified by its four-tuple (source IP address, source port, destination IP address, destination port) and has no separate connection ID, so it cannot be carried over to a different IP address or port.
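Keep-alive is switched on per socket. In the sketch below, SO_KEEPALIVE is the portable option; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not available on every platform, so they are set only when Python's socket module exposes them. The values 60, 10 and 5 are arbitrary examples.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)   # enable keep-alive probes

# Platform-dependent tuning knobs (example values):
#   TCP_KEEPIDLE  - idle seconds before the first probe
#   TCP_KEEPINTVL - seconds between probes
#   TCP_KEEPCNT   - unanswered probes before the connection is closed
for name, value in (("TCP_KEEPIDLE", 60), ("TCP_KEEPINTVL", 10), ("TCP_KEEPCNT", 5)):
    if hasattr(socket, name):
        sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)

keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_enabled)
sock.close()
```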

Differences between TCP and UDP

① TCP is connection-oriented. After a connection is established, the sockets at both ends maintain send and receive buffers for it. (The so-called connection is a set of data structures created on the server and the client to track the state of the interaction between the two sides; these structures are what give TCP its connection-oriented characteristics.) UDP is connectionless. A TCP connection is identified by the source IP address, destination IP address, source port, and destination port, which enables accurate point-to-point transmission. The UDP header carries only the source port and destination port, and UDP keeps no connection state. UDP can also be broadcast, in which case any host that receives the packet can parse it.

② TCP provides reliable delivery. Data transmitted through TCP is guaranteed to arrive in order, without errors, without duplication, and without loss. UDP performs no handshake before or after sending and is little more than a thin layer over IP; it cannot guarantee that data is not lost or that it arrives in order.

③ TCP is byte-stream oriented: data is sent as a continuous stream, and the receiving application has to reassemble messages from it according to its own rules. UDP, on the other hand, is datagram oriented: every datagram has its own header and keeps its message boundaries.

④ TCP provides congestion control, adjusting its sending rate according to packet loss and network conditions. UDP does not control its sending rate; it sends as fast as the application asks.

⑤TCP is a stateful service, that is, it records which packets are sent, which packets are successfully received, and which packets need to be retransmitted. UDP is a stateless service and does not record this.
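The connectionless, datagram-oriented nature of UDP can be seen with two sockets on the loopback interface. This is a minimal sketch: there is no connect() and no handshake, and each sendto() arrives as one separate datagram. The 2 second timeout is an arbitrary safety margin so that the example cannot block forever.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2.0)
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first", addr)          # no connection is set up beforehand
sender.sendto(b"second", addr)

# Each recvfrom() returns exactly one datagram, never a merged byte stream
datagrams = [receiver.recvfrom(1024)[0] for _ in range(2)]
print(datagrams)

sender.close()
receiver.close()
```

With TCP, the same two writes could be delivered to the reader as a single chunk such as b"firstsecond", because a byte stream has no message boundaries.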