Preface
We know that the IP layer carries TCP segments from the source IP address to the destination IP address. If something goes wrong along the way (for example, the 16-bit IP header checksum fails), IP simply discards the datagram and does not generate an error message. In that case it is TCP that detects the loss and retransmits the data. In this article I want to explore how TCP does this, and as a former designer, I can’t help but draw two diagrams.
How the IP, UDP, and TCP headers differ
The TCP header carries six flag bits:
- URG: the urgent pointer is valid
- ACK: the acknowledgment number is valid
- PSH: the receiver should pass the data to the receiving process as soon as possible
- RST: reset the connection
- SYN: synchronize sequence numbers; used to initiate a connection
- FIN: the sender has finished sending data
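To make the flag bits concrete, here is a minimal sketch that decodes them from the flags byte of a TCP header. The bit positions follow the standard header layout; the function name `decode_flags` is just an illustrative choice.

```python
# Bit masks for the six classic TCP flags, in the order they occupy the
# low-order bits of the flags byte in the TCP header.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(flag_byte):
    """Return the names of the flags set in flag_byte, lowest bit first."""
    return [name for name, mask in TCP_FLAGS.items() if flag_byte & mask]


if __name__ == "__main__":
    print(decode_flags(0x02))  # the first handshake segment
    print(decode_flags(0x12))  # the second handshake segment
    print(decode_flags(0x11))  # a FIN that also carries an ACK
```

Run on the three sample bytes, it reports SYN, then SYN and ACK, then FIN and ACK.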
Connection process and states
I drew a diagram to illustrate the TCP connection process and state changes:
General case (purple and green in figure)
Three-way handshake
First purple arrow (top to bottom):
The server listens on a port and sits in the LISTEN state. When a client wants to connect to this port, it randomly generates an initial sequence number J. It puts J in the 32-bit sequence number field of the header and sets the SYN flag from 0 to 1. Because the SYN itself consumes one sequence number, the client's next segment will use J+1. After the client sends this segment, its state changes to SYN_SENT.
First green arrow:
When the server receives the SYN segment from the client, the server changes from the LISTEN state to SYN_RCVD and replies with a segment in which ACK is set to 1 and the 32-bit acknowledgment number is the client's sequence number plus one (J+1), indicating that the client's SYN has been received. SYN is also set to 1, with the server's own initial sequence number K. It means: I want to connect with you, too.
Second purple arrow:
When the client receives the SYN+ACK from the server, it changes its state to ESTABLISHED and sends a segment with ACK set and acknowledgment number K+1, indicating that it has received the server's SYN. At this point the client considers the connection established; whether the server does depends on whether it receives this ACK. Once the server receives it, its state also changes to ESTABLISHED. The connection is now established and data can be exchanged.
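The number bookkeeping in the three arrows can be sketched in a few lines. This is only a model, not a real TCP stack: the dictionary layout and the function name `three_way_handshake` are my own, and it assumes the SYN carries J itself while each SYN consumes one sequence number.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(j=None, k=None):
    """Return the three handshake segments, using the J and K of the text."""
    j = random.getrandbits(32) if j is None else j
    k = random.getrandbits(32) if k is None else k
    # Client -> server: SYN with the client's initial sequence number J.
    syn = {"flags": {"SYN"}, "seq": j, "ack": None}
    # Server -> client: SYN with K, plus an ACK of J+1.
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": k, "ack": (j + 1) % SEQ_SPACE}
    # Client -> server: ACK of K+1; the client's own number is now J+1.
    ack = {"flags": {"ACK"}, "seq": (j + 1) % SEQ_SPACE, "ack": (k + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(j=100, k=300):
        print(sorted(segment["flags"]), segment["seq"], segment["ack"])
```

With J=100 and K=300, the second segment acknowledges 101 and the third acknowledges 301.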
Four-way wave
Third purple arrow:
Leaving aside special circumstances such as power outages, the party that actively closes (the client in the figure, but not always) sends a FIN with sequence number M, and its state changes to FIN_WAIT_1.
Second green arrow:
The passively closing party (the server in the figure, but not always) replies with a segment that has ACK set to 1 and acknowledgment number M+1, and changes its state to CLOSE_WAIT. After receiving this ACK, the client's state changes to FIN_WAIT_2.
Third green arrow:
While in CLOSE_WAIT the server may still do other work, such as sending its remaining data. When it is done, it sends its own FIN with sequence number N and changes its state to LAST_ACK, meaning it is waiting for the last acknowledgment.
Fourth purple arrow:
After receiving the FIN, the client changes its state to TIME_WAIT and sends an ACK with acknowledgment number N+1. After receiving this ACK, the server changes its state to CLOSED. At this point the connection is closed.
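The purple and green paths can be written down as a small transition table. This is a sketch of the general case only (no simultaneous open or close, no resets), and the event names are labels I made up for illustration.

```python
# (current state, event) -> next state, for the normal open and close paths.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "send_syn"): "SYN_SENT",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "send_fin"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSE_WAIT", "send_fin"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an illegal step
        path.append(state)
    return path


CLIENT_EVENTS = ["send_syn", "recv_syn_ack", "send_fin",
                 "recv_ack", "recv_fin", "timeout_2msl"]
SERVER_EVENTS = ["passive_open", "recv_syn", "recv_ack",
                 "recv_fin", "send_fin", "recv_ack"]

if __name__ == "__main__":
    print(" -> ".join(run(CLIENT_EVENTS)))
    print(" -> ".join(run(SERVER_EVENTS)))
```

Both sides start and end in CLOSED; only the active closer passes through TIME_WAIT.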
Why four waves instead of three?
One question about the wave is why the ACK and the FIN can’t be sent together. The reason is TCP half-close. A TCP connection is full-duplex (data can flow in both directions at the same time), so each direction must be closed separately. Receiving a FIN only means that the peer will not send me any more data; TCP still allows me to send data to the peer (although in real applications very few programs do this). When the passive side has nothing more to send, its ACK and FIN can travel in one segment, which is why packet captures in Wireshark and similar tools often show only three waves.
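Half-close is easy to try on the loopback interface. The sketch below is my own example, not taken from the figure: the client sends its FIN with `shutdown(SHUT_WR)` and then still reads a reply that the server sends after it has seen that FIN. The names `upper_case_server` and `half_close_demo` are hypothetical.

```python
import socket
import threading


def upper_case_server(listener):
    """Read until the client's FIN (recv returns b''), then reply and close."""
    conn, _peer = listener.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the peer has closed its sending side
                break
            chunks.append(data)
        # Our own direction is still open, so we can send after the FIN.
        conn.sendall(b"".join(chunks).upper())


def half_close_demo(payload=b"hello"):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    worker = threading.Thread(target=upper_case_server, args=(listener,))
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)  # send FIN but keep the read side open
    reply = b""
    while True:
        data = client.recv(1024)
        if not data:  # the server has closed its side too
            break
        reply += data
    client.close()
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(half_close_demo())
```

The reply arrives only after the client's FIN, which is exactly the window in which one direction is closed and the other is still open.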
What is 2MSL next to TIME_WAIT?
The TIME_WAIT state is also called the 2MSL wait state. MSL is the maximum segment lifetime, the longest a TCP segment can survive in the network. When one end sends the last ACK of the close (the last purple line in the lower right corner of the figure), it must stay in TIME_WAIT for twice the MSL in case the peer never receives that ACK and retransmits its FIN. The ACK going out and the retransmitted FIN coming back each take at most one MSL, so the round trip takes at most two. With an MSL of two minutes, the connection stays in TIME_WAIT for 240 s, and the port remains occupied until then; the actual value depends on the operating system. On a Linux server, a commonly suggested tweak is to set net.ipv4.tcp_fin_timeout=30 in the /etc/sysctl.conf file. But there are often problems with this: the kernel documentation describes that parameter as the FIN_WAIT_2 timeout for orphaned connections, not the length of TIME_WAIT.
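A more common way to live with an occupied port is not to shorten the wait but to let a restarted server bind anyway with the `SO_REUSEADDR` socket option. This is a standard sockets option, but whether it suits a given deployment is a judgment call; the helper name `make_listener` is hypothetical.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a listening socket that may bind while old connections on the
    same port are still in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() to have any effect.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```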
Simultaneous open (purple and orange)
It is rare, but possible, for two applications to actively open a connection to each other at the same time. In that case each side is both a client and a server. The process is the same as usual, except that the handshake takes four segments instead of three. TCP treats this as establishing one connection, not two.
Simultaneous close (purple and orange)
For convenience, I drew simultaneous open and simultaneous close together. In fact, a simultaneous open does not imply a simultaneous close, and a connection can be closed simultaneously without having been opened simultaneously.
How a TCP server handles port numbers
Let’s look at a Telnet server using the netstat command with three options. -a shows all sockets, including listening ones; -n prints addresses in dotted-decimal notation; -f inet restricts the output to TCP and UDP. First, it looks like this when there is no connection:
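(netstat output not preserved; the surviving state column reads: LISTEN)
(netstat output not preserved; the surviving state column reads: LISTEN, ESTABLISHED)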
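Since the netstat listings are not available here, a small loopback experiment can show the same states from the program's side. It is my own sketch (the name `port_demo` is hypothetical): the listening socket keeps its port, and the socket returned by `accept()` for the established connection uses that same local port, distinguished only by the client's address and port.

```python
import socket


def port_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)               # this socket is now in LISTEN
    listen_port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", listen_port))
    conn, peer = listener.accept()   # conn is the ESTABLISHED socket

    result = {
        "listen_port": listen_port,
        "accepted_local_port": conn.getsockname()[1],
        "client_local_port": client.getsockname()[1],
        "peer_port_seen_by_server": peer[1],
    }
    for sock in (conn, client, listener):
        sock.close()
    return result


if __name__ == "__main__":
    print(port_demo())
```

The listening port and the accepted socket's local port are equal, while the client side uses an ephemeral port chosen by the OS.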
Incoming connection request queue
If the server receives a large number of connection requests at the same time and the application cannot accept them fast enough (for example, because a higher-priority process is running), TCP first places the requests in a fixed-length queue (its length, called the backlog, is traditionally 5). The application layer continuously takes connections off this queue. Once the queue is full, the server ignores incoming SYN segments and returns nothing. In this case, the client eventually reports a timeout.
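The queue is visible from application code: the argument to `listen()` is the backlog, and connections complete in the kernel before the application calls `accept()`. The sketch below, with a hypothetical function `queued_connections`, stays under the backlog on purpose, because what happens when the queue overflows differs between operating systems.

```python
import socket


def queued_connections(n=3, backlog=5):
    """Connect n clients before accepting any, then drain the queue."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(backlog)  # size of the incoming connection queue
    addr = listener.getsockname()

    # Each connect returns once the kernel has completed the handshake,
    # even though the application has not called accept() yet.
    clients = [socket.create_connection(addr, timeout=2) for _ in range(n)]

    accepted = []
    for _ in range(n):
        conn, _peer = listener.accept()  # the application consumes the queue
        accepted.append(conn)

    count = len(accepted)
    for sock in accepted + clients + [listener]:
        sock.close()
    return count


if __name__ == "__main__":
    print(queued_connections())
```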
Delayed acknowledgment
Typically TCP does not send an ACK immediately upon receiving data. Instead, it delays the ACK so that it can be sent along with any data going in the same direction (this is known as piggybacking the ACK on data). Most implementations use a delay of 200 ms. That is the maximum: if there is always data waiting to be sent in that direction, the ACK rides along with it and there is no delay at all.
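As a simplified model of that rule (real stacks drive this from their own timers, so treat it as an illustration only), the moment the ACK leaves is either the next outgoing data segment or the 200 ms deadline, whichever comes first. The function `ack_send_time` and its parameters are hypothetical.

```python
def ack_send_time(data_arrival_ms, next_outgoing_data_ms=None, max_delay_ms=200):
    """Return (time the ACK is sent, whether it was piggybacked on data)."""
    deadline = data_arrival_ms + max_delay_ms
    if (next_outgoing_data_ms is not None
            and data_arrival_ms <= next_outgoing_data_ms <= deadline):
        # Data is going the other way in time: the ACK rides along with it.
        return next_outgoing_data_ms, True
    # Nothing to piggyback on before the deadline: send a bare ACK.
    return deadline, False


if __name__ == "__main__":
    print(ack_send_time(0, next_outgoing_data_ms=50))
    print(ack_send_time(0))
    print(ack_send_time(0, next_outgoing_data_ms=500))
```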
Nagle’s algorithm
Nagle’s algorithm is designed to solve the small segment problem. A “small segment” is a data block smaller than the MSS (maximum segment size). For example, an application that generates one byte of data at a time (such as interactive input) can overload the network with too many packets. The basic idea of Nagle’s algorithm is that there can be at most one unacknowledged small segment outstanding at any time. Here are the rules:
1. If the packet length reaches the MSS, the packet can be sent.
2. If the packet contains a FIN, the packet can be sent.
3. If the TCP_NODELAY option is set, the packet can be sent.
4. If the TCP_CORK option is not set and all previously sent packets have been acknowledged, the packet can be sent.
5. If none of the above conditions is met but a timeout (generally 200 ms) occurs, the packet is sent immediately.
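The five rules above translate almost line by line into a decision function. This is a sketch of the rules as listed, not a copy of any kernel's code, and the function and parameter names are my own.

```python
def nagle_can_send(segment_len, mss, has_fin=False, nodelay=False,
                   cork=False, all_sent_acked=True, timed_out=False):
    """Decide whether a pending segment may be sent now."""
    if segment_len >= mss:           # rule 1: a full-sized segment
        return True
    if has_fin:                      # rule 2: the segment carries a FIN
        return True
    if nodelay:                      # rule 3: TCP_NODELAY is set
        return True
    if not cork and all_sent_acked:  # rule 4: nothing unacknowledged in flight
        return True
    return timed_out                 # rule 5: send only once the timer fires


if __name__ == "__main__":
    # One typed byte while an earlier small segment is still unacknowledged.
    print(nagle_can_send(1, 1460, all_sent_acked=False))
    # The same byte once the earlier segment has been acknowledged.
    print(nagle_can_send(1, 1460, all_sent_acked=True))
```

The first call holds the byte back and the second lets it go, which is the self-clocking behavior of the algorithm.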
The advantage of Nagle’s algorithm is that it is adaptive: the faster acknowledgments arrive, the faster data is sent. Its disadvantage is the added delay before sending. We can turn off Nagle’s algorithm with the TCP_NODELAY socket option.
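In Python, for example, setting the option looks like this. The helper name `disable_nagle` is hypothetical; `IPPROTO_TCP` and `TCP_NODELAY` are the standard constants from the `socket` module.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY on a TCP socket and report whether it took effect."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(disable_nagle(s))
    s.close()
```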
Conclusion
For front-end engineers, understanding TCP helps us better understand and use HTTP. It also helps us optimize network behavior in finer detail. My own understanding is still limited, and I hope to discuss it with you.
References
- TCP/IP Illustrated, Volume 1
- Diagram of TCP/IP