TCP is the most important of the network protocols and the one we talk about most often, so this time let's study and analyze it properly.
2.1 Introducing TCP
TCP provides a reliable, connection-oriented byte stream service. The term "connection-oriented" means that two applications using TCP must establish a TCP connection by contacting each other before they can exchange data. The typical analogy is making a phone call: you dial, wait for the other person to pick up and say "Hello", and then say who is calling. That is exactly how the two ends of a TCP connection start communicating with each other (the three-way handshake).
2.1.1 TCP Reliability
- TCP must convert a sending application's byte stream into a set of packets that IP can carry. This is called packetization. Each packet carries a sequence number, which in TCP is the byte offset of the packet's first byte within the whole data stream, not a packet count. At the receiving end the packets are put back in order, which is called reassembly.
- TCP maintains a mandatory checksum. The checksum covers the TCP header, any associated application data, and some fields of the IP header (the pseudo-header).
- TCP sets a retransmission timer and waits for an acknowledgment. TCP does not set a separate retransmission timer for each segment. Instead, it sets one timer when it sends a window of data and updates the timeout as ACKs arrive. If an acknowledgment is not received in time, the segment is retransmitted.
- TCP provides flow control in both directions: each end tells the other how much data it is willing to receive.
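The checksum mentioned in the list above is the standard one's-complement Internet checksum. The following is a minimal sketch of that arithmetic, not a full TCP implementation: the function name `inet_checksum` is made up for illustration, and it assumes the caller has already laid out the pseudo-header, TCP header, and data in one buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, read as big-endian
 * 16-bit words. Hypothetical helper: a real TCP checksum feeds in the
 * pseudo-header, the TCP header, and the payload. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);
    assert(len <= 65535);   /* keeps the 32-bit accumulator from overflowing */

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)           /* an odd trailing byte is padded with a zero byte */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)       /* fold the carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same function over the data with the transmitted checksum included; a result of 0 means no error was detected.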
2.1.2 The TCP header
The TCP header immediately follows the IP header and is usually 20 bytes long (when there are no option bytes). Let's now examine it.
This diagram is much more complex than the UDP header covered yesterday. TCP really is a complex protocol, but however complex it is, today we are going to analyze and understand it.
Source port: the sending port. Destination port: the receiving port. The source port and destination port, together with the source and destination IP addresses, uniquely identify a connection.
Sequence number: identifies a position in the byte stream sent from the TCP source to the TCP target. It is the number of the first byte of data in the segment.
Ack num: the acknowledgment number field is valid only when the ACK flag is 1. It contains the sequence number of the next byte that the sender of the acknowledgment expects to receive from the other end.
Header len: the length of the header in units of 4 bytes. The field is 4 bits wide, so the maximum value is 15 and the maximum header length is 60 bytes, but it is usually 20 bytes.
Resv: reserved bits. They are followed by eight 1-bit flag fields:
- CWR: Congestion Window Reduced (the sender has reduced its sending rate)
- ECE: ECN echo (the sender received an earlier congestion notification)
- URG: urgent (the urgent pointer field is valid; rarely used)
- ACK: acknowledgment (the acknowledgment number field is valid; normally set on every segment once the connection is established)
- PSH: push (the receiver should pass this data to the application as soon as possible; not reliably implemented or used)
- RST: reset the connection (the connection is aborted, often because of an error)
- SYN: synchronize sequence numbers, used to initiate a connection
- FIN: the sender of the segment has finished sending data to its peer
Window size: the advertised window. TCP flow control is done through this field, which gives the number of bytes the receiver is willing to accept. TCP checksum: the mandatory checksum described earlier. Urgent pointer: valid only when the URG flag is set. Option bytes: covered later.
The TCP header is really complicated, but it gets more complicated later. Keep going.
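To make the field layout concrete, here is a small sketch that pulls the fixed 20-byte part of the header out of a raw buffer. The struct and function names (`tcp_header`, `tcp_parse_header`, the `FLAG_*` constants) are invented for this example and are not taken from any kernel or library; options are not parsed.

```c
#include <stddef.h>
#include <stdint.h>

/* The eight 1-bit flags, as they sit in byte 13 of the header. */
enum {
    FLAG_FIN = 0x01, FLAG_SYN = 0x02, FLAG_RST = 0x04, FLAG_PSH = 0x08,
    FLAG_ACK = 0x10, FLAG_URG = 0x20, FLAG_ECE = 0x40, FLAG_CWR = 0x80
};

/* Illustrative, host-order view of the fixed header fields. */
struct tcp_header {
    uint16_t sport, dport;
    uint32_t seq, ack;
    uint8_t  header_len;   /* in bytes, 20..60 */
    uint8_t  flags;
    uint16_t window, checksum, urgent;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer cannot hold a valid header. */
int tcp_parse_header(const uint8_t *buf, size_t len, struct tcp_header *h)
{
    if (len < 20)
        return -1;
    h->sport      = rd16(buf);
    h->dport      = rd16(buf + 2);
    h->seq        = rd32(buf + 4);
    h->ack        = rd32(buf + 8);
    h->header_len = (uint8_t)((buf[12] >> 4) * 4);  /* 4-bit field, units of 4 bytes */
    h->flags      = buf[13];
    h->window     = rd16(buf + 14);
    h->checksum   = rd16(buf + 16);
    h->urgent     = rd16(buf + 18);
    if (h->header_len < 20 || h->header_len > len)
        return -1;
    return 0;
}
```

Reading the fields byte by byte avoids depending on the host's byte order or on struct padding, which is why the sketch does not simply cast the buffer to a struct.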
2.2 TCP Connection Management
TCP is a connection-oriented unicast protocol. Before sending data, communication parties must establish a connection with each other.
2.2.1 TCP three-way handshake
Finally, we get to the three-way handshake. I heard about it many years ago without knowing exactly what it was, and only recently have I come to understand how important it is. A TCP connection is identified by a 4-tuple consisting of two IP addresses and two port numbers. (At the application layer, this corresponds to the fd of a socket.) A TCP connection has three phases: setup, data transfer, and teardown. Setup is the three-way handshake, so I used a drawing tool to draw my own version of it, just to make a stronger impression.
I suggest everyone open a drawing tool and draw it too; drawing it a few times helps you remember the three-way handshake. The drawing turned out quite clear, so the text description does not need to be as detailed.
1. The client initiates the connection. Because this is the first segment, the SYN bit in the TCP header (one of the 8 flag bits above) must be set; it says that this is the start of a conversation. The segment also carries the source IP address, destination IP address, source port, and destination port, the 4-tuple that has been mentioned many times.
2. After receiving the SYN, the server stores the connection request in the SYN queue, replies to the client with a SYN+ACK, and waits for the client to reply again.
3. After receiving the SYN+ACK from the server, the client replies with an ACK. The server then searches the SYN queue for the entry matching the 4-tuple. (What goes into the accept queue will be covered later.)
To acknowledge the previous segment, the reply must carry an acknowledgment number equal to that segment's sequence number plus 1, because a SYN consumes one sequence number.
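The "+1" rule is easiest to see with numbers. The sketch below fills in the three handshake segments from two initial sequence numbers; `struct segment` and `three_way_handshake` are hypothetical names used only for this illustration, and no packets are actually sent.

```c
#include <assert.h>
#include <stdint.h>

/* Only the fields that matter for the handshake. Illustrative type. */
struct segment {
    int      syn;       /* SYN flag */
    int      ack_flag;  /* ACK flag */
    uint32_t seq;       /* sequence number */
    uint32_t ack;       /* acknowledgment number (meaningful if ack_flag) */
};

/* Fills out[0..2] with SYN, SYN+ACK, ACK for the given initial sequence
 * numbers. A SYN consumes one sequence number, hence the "+ 1". */
void three_way_handshake(uint32_t client_isn, uint32_t server_isn,
                         struct segment out[3])
{
    assert(out != 0);

    /* 1. client -> server: SYN */
    out[0] = (struct segment){ 1, 0, client_isn, 0 };
    /* 2. server -> client: SYN+ACK, acknowledging the client's SYN */
    out[1] = (struct segment){ 1, 1, server_isn, client_isn + 1 };
    /* 3. client -> server: ACK, acknowledging the server's SYN */
    out[2] = (struct segment){ 0, 1, client_isn + 1, server_isn + 1 };
}
```

Because the fields are unsigned 32-bit values, an initial sequence number of 0xFFFFFFFF simply wraps around to 0 in the acknowledgment, which matches how TCP sequence space behaves.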
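2.2.2 How to determine the length of the SYN and accept queues?
The backlog argument of listen() has been interpreted in different ways, depending on the kernel version. Originally it simply specified the length of the SYN queue (connections whose handshake is not yet complete). Since Linux 2.2, the listen(2) man page describes it as the length of the accept queue (completed connections waiting for accept()), with net.ipv4.tcp_max_syn_backlog limiting the SYN queue; on recent kernels the backlog value in practice also bounds the SYN queue.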
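For reference, this is where the backlog value enters an application. The helper below is only a sketch: `open_listener` is a made-up name, it binds to the loopback address for simplicity, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket bound to 127.0.0.1:port and puts it in the
 * listening state with the given backlog. Port 0 lets the kernel pick.
 * Returns the fd, or -1 on failure. Hypothetical helper. */
int open_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int fd;

    assert(backlog >= 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {   /* backlog is the value discussed above */
        close(fd);
        return -1;
    }
    return fd;
}
```

On Linux the kernel silently caps the requested value at net.core.somaxconn, so asking for a very large backlog does not by itself guarantee a large queue.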
Cases where the queue size should be increased:
- When the rate of incoming connection requests is very high, the SYN queue may need to be larger, even for a high-performance service.
- The number of entries in the SYN queue is the number of connections still waiting for the final ACK. So the greater the average round-trip time to the clients, the more connections pile up in the SYN queue. When most clients are far away from the server, for example with round-trip times of several hundred milliseconds, a larger queue is appropriate.
- The TCP_DEFER_ACCEPT option, if turned on, causes the socket to remain in SYN-RECV for a longer time, increasing the time it spends in the SYN queue.
However, making the backlog too large can also have negative effects:
- Each slot in the SYN queue requires some memory, and during a SYN flood there is no point in wasting resources on attack packets. Each inet_request_sock structure in the SYN queue takes up 256 bytes of memory under the 4.14 kernel.
This part is based on another blog post about the TCP SYN queue and accept queue. I'm a beginner and don't know much about this area, so I relied on someone else's write-up.
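The trade-off above can be turned into a rough back-of-the-envelope calculation. The sketch below assumes that the number of half-open connections is approximately the SYN arrival rate multiplied by the round-trip time (a Little's-law style estimate, which is my assumption and not something stated in the text), and uses the 256 bytes per slot quoted for the 4.14 kernel. Both function names are hypothetical.

```c
#include <stdint.h>

/* Rough number of SYN queue slots in use: arrival rate times the time
 * each entry waits for the final ACK (about one round trip), rounded up. */
uint64_t syn_queue_slots(uint64_t syn_per_sec, uint64_t rtt_ms)
{
    return (syn_per_sec * rtt_ms + 999) / 1000;
}

/* Memory for that many slots, at 256 bytes per inet_request_sock
 * (the figure quoted for the 4.14 kernel). */
uint64_t syn_queue_bytes(uint64_t slots)
{
    return slots * 256;
}
```

For example, 10,000 SYNs per second from clients 200 ms away gives about 2,000 slots, or roughly 500 KB. The figure is modest for legitimate traffic, but a flood of spoofed SYNs can fill whatever queue size is configured.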
2.2.3 The TCB control block
If you are curious about what is stored in the accept queue, it is actually a TCP control block, called a TCB. A TCB maintains the state of one TCP connection. When the three-way handshake succeeds, the TCB is placed in the accept queue. The TCB is released only when the connection is closed, so its lifetime is the entire life of the TCP connection.
How do we find the corresponding TCB? Many people will say by fd, and I thought so too at first. But after Teacher King's explanation I realized that the fd belongs to the application layer, and we are now in the transport layer, where there is no fd. So looking it up by fd is a mistake. This is where the 5-tuple comes in. The 5-tuple is very important in networking and has been emphasized many times. Every packet the other side sends carries the 5-tuple, so when we receive data we use the 5-tuple to find the TCB, and then use the TCB to find the application-layer fd. That way the application layer only needs to care about an fd to do network communication.
The structure of a TCB looks like this:
#include <stdint.h>

struct tcb {
    uint32_t sip;          // 5-tuple: source IP
    uint32_t dip;          // destination IP
    uint16_t sport;        // source port
    uint16_t dport;        // destination port
    uint8_t proto;         // protocol
    unsigned char status;  // one of the 11 TCP states, more on that later
    int fd;                // the application-layer fd
    unsigned char *rbuf;   // receive buffer (network transmission is full-duplex)
    unsigned char *sbuf;   // send buffer
};
Looking at this data structure should make things clearer. It is definitely not the actual definition in the kernel code; we will come back to that when we analyze the kernel.
2.2.4 The send/recv functions
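The lookup described above, from 5-tuple to TCB to fd, can be sketched with a plain array. This is a deliberately naive illustration: the struct is a trimmed-down copy of the one above, `tcb_lookup` is a made-up name, and a real stack would use a hash table rather than a linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down TCB: just the 5-tuple and the application-layer fd. */
struct tcb {
    uint32_t sip, dip;
    uint16_t sport, dport;
    uint8_t  proto;
    int      fd;
};

/* Linear search for the TCB whose 5-tuple matches the incoming packet.
 * Returns NULL when no connection matches. */
struct tcb *tcb_lookup(struct tcb *table, size_t n,
                       uint32_t sip, uint32_t dip,
                       uint16_t sport, uint16_t dport, uint8_t proto)
{
    size_t i;

    assert(table != NULL || n == 0);

    for (i = 0; i < n; i++) {
        struct tcb *t = &table[i];
        if (t->sip == sip && t->dip == dip &&
            t->sport == sport && t->dport == dport && t->proto == proto)
            return t;
    }
    return NULL;
}
```

Once the TCB is found, its `fd` member is what links the transport layer back to the socket the application is reading from.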
Both send() and recv() operate on an fd. When send() returns successfully, it does not mean that the data has been received by the other party, only that it has been copied into the kernel's queue. There is then a second copy, from the sk_buff to the NIC driver, and the NIC driver does the actual sending.
The important point is the path the data takes from the application until it is on the wire: we copy data from the application layer into kernel space (an sk_buff), and then copy it from kernel space to the NIC driver, so it is copied twice. One of the most popular techniques today is zero-copy. Zero-copy is not literally zero copies; it moves data between the network card and the application layer directly, skipping the kernel-space copy. Netmap is a zero-copy technology.
The recv() path is the reverse of the send() path and also takes two copies to reach the application layer. This is where the 5-tuple is used to find the TCB, the TCB is used to find the fd, and the data is then copied into the application's buffer, which completes the whole process.
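Since a successful send() only means "copied into the kernel", it may also accept fewer bytes than requested when the kernel buffer is nearly full. A common pattern is to loop until everything has been handed over. The helper below is a sketch with a made-up name, `send_all`, and assumes a blocking socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keeps calling send() until all len bytes have been copied into the
 * kernel's send queue. Returns len on success, -1 on error. Success says
 * nothing about whether the peer has received the data yet. */
ssize_t send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    assert(buf != NULL || len == 0);

    while (left > 0) {
        ssize_t n = send(fd, p, left, 0);
        if (n < 0) {
            if (errno == EINTR)   /* interrupted by a signal: just retry */
                continue;
            return -1;
        }
        p    += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

If the application needs to know that the peer actually processed the data, that has to come from an application-level reply; the return value of send() cannot provide it.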
2.2.5 TCP four-way wave
We've talked about setup and data transfer. Once the data has been transferred, it's time to say goodbye. TCP connections can be short or long; we'll talk about long connections later. A short connection means you connect, send the data, and disconnect.
Disconnecting brings us to TCP's four-way wave. It is less famous than the three-way handshake, but what was started has to be finished: a connection that was opened must also be closed, and, as with a breakup, it should be done amicably.
The wording is a bit playful, but it makes it easy to see why the wave takes four steps: there are two ends, each end has to say that it is done sending, and each of those announcements has to be acknowledged before the connection is really closed.
I won't describe the four steps in detail. You can follow them from the picture, and I know you are all experienced.
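From the application's point of view, each side's "I'm done sending" is triggered by shutdown() or close(). The sketch below shows one way to close politely: announce that we are done, keep reading until the peer announces the same, then release the fd. `graceful_close` is a hypothetical helper, and it assumes a blocking stream socket.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Half-closes our sending direction (on TCP this sends our FIN), drains
 * whatever the peer still sends until it closes its side too (recv()
 * returns 0), then closes the fd. Returns the number of bytes drained,
 * or -1 on error. */
long graceful_close(int fd)
{
    char buf[512];
    long drained = 0;

    assert(fd >= 0);

    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0) {
            drained += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        close(fd);
        return n == 0 ? drained : -1;   /* 0 means the peer finished as well */
    }
}
```

Calling close() straight away also works in most programs; the half-close is useful when the peer may still have data in flight that should not be thrown away.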
2.2.6 11 TCP States
The TCP state diagram is a bit difficult, but you can find one on the Internet and then draw a simpler state transition diagram yourself.
The classic diagram is a headache to look at, but it is a classic, so we still need to work through it. Instead of analyzing that diagram directly, we will analyze one we draw ourselves. My drawing is a little ugly, but it was done by hand, and the palest ink is better than the best memory: write and draw more until it becomes familiar.
1. The client and server both start in the CLOSED state.
2. The server switches to the LISTEN state by calling listen().
3. The client sends a SYN and enters SYN_SENT. After receiving the SYN, the server returns a SYN+ACK and enters SYN_RECV. When the client receives it, the client returns an ACK and changes its state to ESTABLISHED.
4. After receiving the ACK from the client, the server changes to ESTABLISHED and the connection is ready.
5. When the client wants to close the connection, it sends a FIN and switches to FIN_WAIT_1.
6. After receiving the FIN from the client, the server sends an ACK and switches to CLOSE_WAIT.
7. After receiving that ACK, the client switches to FIN_WAIT_2.
8. Once the server has sent the data it still had to send, it sends its own FIN to the client, meaning "I am closing the connection too", and switches to LAST_ACK to wait for an ACK.
9. After receiving the FIN, the client switches to TIME_WAIT and sends an ACK to the server. The TIME_WAIT state lasts 2MSL, and the connection cannot be re-established during this period.
10. When the server receives the ACK, it switches to the CLOSED state.
11. If the FIN reaches the client before the ACK of its own FIN does, the client enters the CLOSING state, and it changes to TIME_WAIT after receiving that ACK.
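The steps above can be written down as a small transition function. This is a simplified sketch, not the full specification: the state and event names are invented for the example, and it leaves out resets, simultaneous open, timeouts other than 2MSL, and segments that combine FIN and ACK.

```c
#include <assert.h>

enum tcp_state {
    ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RECV, ST_ESTABLISHED,
    ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_CLOSE_WAIT, ST_CLOSING,
    ST_LAST_ACK, ST_TIME_WAIT,
    ST_COUNT                      /* number of states: 11 */
};

enum tcp_event {
    EV_LISTEN,        /* application calls listen() */
    EV_CONNECT,       /* application calls connect(), SYN is sent */
    EV_RECV_SYN,      /* SYN arrives, SYN+ACK is sent */
    EV_RECV_SYN_ACK,  /* SYN+ACK arrives, ACK is sent */
    EV_RECV_ACK,      /* ACK of our SYN+ACK or of our FIN arrives */
    EV_CLOSE,         /* application closes, FIN is sent */
    EV_RECV_FIN,      /* FIN arrives, ACK is sent */
    EV_TIMEOUT_2MSL   /* the 2MSL timer expires */
};

/* Returns the next state; events that do not apply leave the state as is. */
enum tcp_state tcp_next_state(enum tcp_state s, enum tcp_event e)
{
    assert(s < ST_COUNT);

    switch (s) {
    case ST_CLOSED:
        if (e == EV_LISTEN)       return ST_LISTEN;
        if (e == EV_CONNECT)      return ST_SYN_SENT;
        break;
    case ST_LISTEN:
        if (e == EV_RECV_SYN)     return ST_SYN_RECV;
        break;
    case ST_SYN_SENT:
        if (e == EV_RECV_SYN_ACK) return ST_ESTABLISHED;
        break;
    case ST_SYN_RECV:
        if (e == EV_RECV_ACK)     return ST_ESTABLISHED;
        break;
    case ST_ESTABLISHED:
        if (e == EV_CLOSE)        return ST_FIN_WAIT_1;
        if (e == EV_RECV_FIN)     return ST_CLOSE_WAIT;
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RECV_ACK)     return ST_FIN_WAIT_2;
        if (e == EV_RECV_FIN)     return ST_CLOSING;   /* FIN beat the ACK */
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RECV_FIN)     return ST_TIME_WAIT;
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_CLOSE)        return ST_LAST_ACK;
        break;
    case ST_CLOSING:
        if (e == EV_RECV_ACK)     return ST_TIME_WAIT;
        break;
    case ST_LAST_ACK:
        if (e == EV_RECV_ACK)     return ST_CLOSED;
        break;
    case ST_TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return ST_CLOSED;
        break;
    default:
        break;
    }
    return s;
}
```

Feeding the client's events in order (connect, SYN+ACK, close, ACK, FIN, 2MSL timeout) walks it from CLOSED through TIME_WAIT and back to CLOSED, which is a handy way to check a hand-drawn diagram.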
2.2.7 The TIME_WAIT state
The TIME_WAIT state is also called the 2MSL wait state. In this state, TCP waits for twice the Maximum Segment Lifetime (MSL), which is why it is sometimes called the double wait. Each implementation must choose a value for the MSL. It represents the maximum time that any segment is allowed to exist on the network before being discarded. We know this lifetime is bounded because TCP segments are carried in IP datagrams, whose TTL field (IPv4) or hop limit field (IPv6) limits how long they can survive.
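On Linux, net.ipv4.tcp_fin_timeout is often cited as the timeout for the 2MSL state, although in current kernels that setting controls how long an orphaned connection stays in FIN_WAIT_2, while the TIME_WAIT duration is a compile-time constant (TCP_TIMEWAIT_LEN, 60 seconds). On Windows, the following registry key stores the timeout: HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpTimeWaitDelay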
Has anyone wondered what happens if the last ACK does not arrive?
The 2MSL wait exists so that the client can send the final ACK again if it is lost. The final ACK is resent not because TCP retransmits ACKs, but because the other end retransmits its FIN. In fact, TCP keeps retransmitting a FIN until it receives the final ACK.
TIME_WAIT also has a downside. While the client is in the TIME_WAIT state, both parties treat the connection (the same 4-tuple) as unusable. It can be used again only when the 2MSL wait ends, or when a new connection uses an initial sequence number higher than the highest sequence number used by the previous instance, or when the timestamp option is enabled so that segments from the previous instance can be told apart and confusion is avoided. Unfortunately, some implementations are stricter: if a port number is in use by any endpoint in the 2MSL wait state, that port number cannot be used again.
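Two small pieces of arithmetic sit behind this section: the length of the wait itself, and what "a higher sequence number" means when sequence numbers are 32-bit values that wrap around. The sketch below shows both; the function names are made up for illustration, and the wraparound comparison relies on the usual two's-complement conversion to a signed type.

```c
#include <stdint.h>

/* The TIME_WAIT duration is twice the Maximum Segment Lifetime. */
unsigned time_wait_seconds(unsigned msl_seconds)
{
    return 2 * msl_seconds;
}

/* Returns 1 if sequence number a comes after b in 32-bit sequence space,
 * taking wraparound into account, and 0 otherwise. The unsigned
 * subtraction wraps, and the sign of the result tells the direction. */
int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}
```

With this comparison, a new connection whose initial sequence number is 5 still counts as "after" an old instance that ended at 0xFFFFFFF0, because the distance is measured around the wrap rather than by plain magnitude.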