TCP is a reliable, flow-controlled transport protocol that requires a connection between two parties before they can exchange data. A connection is a set of state that both sides agree on: the socket pair (source IP, source port, destination IP, destination port), the sequence numbers, and the window sizes. These three pieces of information make up the connection.
TCP uses a handshake to exchange this connection information, which is what makes reliability and flow control possible. The two parties establishing a TCP connection therefore need to agree on all three. Each socket consists of an IP address and a TCP port number and identifies an endpoint. The window size is used for flow control. The sequence number tells the receiver how to reassemble the received data; the receiver uses it to tell the sender which data has arrived, and the sender uses it to track which data has been sent.
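As a rough illustration of these three pieces of information, the state that one endpoint keeps for a connection could be modeled as below. This is only a sketch: the class and field names (`SocketPair`, `ConnectionState`, `send_next`, `recv_next`, `recv_window`) and the example addresses are assumptions made for this example, not the layout of any real TCP stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketPair:
    """The four-tuple that identifies a TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class ConnectionState:
    """What one endpoint remembers about a connection (illustrative only)."""
    sockets: SocketPair
    send_next: int    # next sequence number this side will send
    recv_next: int    # next sequence number this side expects to receive
    recv_window: int  # bytes this side is currently willing to accept


# Hypothetical values; the addresses come from documentation-only ranges.
client_view = ConnectionState(
    sockets=SocketPair("192.0.2.10", 50000, "198.51.100.5", 80),
    send_next=1001,
    recv_next=5001,
    recv_window=65535,
)
```

Each side holds its own copy of this state, and the handshake is what fills it in consistently on both ends.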
We now know that TCP uses a three-way handshake to synchronize the sockets, sequence numbers, and window sizes. So the question becomes: how does the three-way handshake synchronize this information, and why does it take three steps?
The three-way handshake prevents old connections from being initialized again
According to RFC 793, the principal reason for the three-way handshake is to prevent old duplicate connection initiations from causing confusion, so that the two parties do not establish the wrong connection.
Imagine the following scenario. If connections were established with a two-way handshake, the sender could not cancel a connection request once it had been sent. On a complex or poor network, the sender may send several connection requests in a row, and old ones may arrive late. With only two steps, the receiver can only accept or reject each request; it has no way to know whether a request is an expired one that was delayed by the network.
TCP therefore uses a three-way handshake and adds the RST control flag. When the receiver gets a connection request, it replies with an acknowledgment number equal to the sender's SEQ + 1. The sender uses that number to check whether the reply belongs to its current request or to a historical one.
If the reply belongs to a historical connection, that is, the acknowledged SEQ has expired or timed out, the sender immediately sends an RST to terminate it. Otherwise, the sender replies with an ACK and the connection is established.
The three-way handshake and RST together give the sender the final say over whether a connection is established, because only the sender has enough context to tell that a connection request is stale or invalid.
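The check described above can be sketched as a small simulation. This is a simplified model rather than a real TCP implementation: the `Segment` and `Client` classes, the `server_reply` helper, and the sequence numbers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: frozenset  # e.g. frozenset({"SYN", "ACK"})
    seq: int = 0
    ack: int = 0


def server_reply(syn: Segment, server_isn: int) -> Segment:
    """The receiver answers any SYN with SYN+ACK, acknowledging SEQ + 1."""
    return Segment(frozenset({"SYN", "ACK"}), seq=server_isn, ack=syn.seq + 1)


class Client:
    def __init__(self, current_isn: int):
        # Initial sequence number of the SYN the client still cares about.
        self.current_isn = current_isn

    def on_syn_ack(self, seg: Segment) -> Segment:
        if seg.ack == self.current_isn + 1:
            # The reply matches the current request: finish the handshake.
            return Segment(frozenset({"ACK"}), seq=seg.ack, ack=seg.seq + 1)
        # The reply acknowledges an old SYN: reset that half-open connection.
        return Segment(frozenset({"RST"}), seq=seg.ack)


client = Client(current_isn=100)
old_syn = Segment(frozenset({"SYN"}), seq=90)   # delayed duplicate
new_syn = Segment(frozenset({"SYN"}), seq=100)  # current request

reply_to_old = client.on_syn_ack(server_reply(old_syn, server_isn=300))
reply_to_new = client.on_syn_ack(server_reply(new_syn, server_isn=400))
print(sorted(reply_to_old.flags))  # ['RST']
print(sorted(reply_to_new.flags))  # ['ACK']
```

The server treats both SYNs the same way; only the client can tell them apart, which is why the decision is left to the third step.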
Initial sequence number
Another important function of the three-way handshake is to let the two parties exchange their initial sequence numbers. As a reliable transport protocol, TCP has to provide reliable delivery over an unreliable network, where packets can be lost or reordered. Common problems include:
- The sender transmits a packet more than once, so the receiver gets duplicate packets
- Packets are dropped by routers or other nodes in transit
- Packets do not arrive at the receiver in the order they were sent
To solve these problems, TCP's designers added a sequence number. With sequence numbers:
- The receiver can tell which packets are duplicates and discard them
- The receiver's ACKs show which packets have not arrived, that is, may have been dropped, so the sender can retransmit them
- The receiver can sort and reassemble the data by sequence number
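The three uses of the sequence number listed above can be sketched in a few lines. This is a minimal model that assumes segments never partially overlap; the `reassemble` function and the sample data are hypothetical.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, payload) pairs in arrival order.

    Returns the contiguous data received so far and the next sequence
    number the receiver expects, which is what it would send as its ACK.
    """
    buffered = {}
    for seq, payload in segments:
        # Keep the first copy of each segment; later duplicates are dropped.
        buffered.setdefault(seq, payload)

    data = bytearray()
    expected = initial_seq
    # Walk the stream in sequence order until the first gap.
    while expected in buffered:
        payload = buffered[expected]
        data += payload
        expected += len(payload)
    return bytes(data), expected


# Out of order, with one duplicate and one gap (bytes 1011-1015 are missing).
arrivals = [(1006, b"world"), (1001, b"hello"), (1001, b"hello"), (1016, b"!")]
data, next_expected = reassemble(arrivals, initial_seq=1001)
print(data, next_expected)  # b'helloworld' 1011
```

The returned ACK value of 1011 tells the sender that everything before it has arrived and that the segment starting at 1011 may need to be retransmitted.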
Random initial sequence number
This raises another question: how does the sender determine whether a connection is valid? It needs a random initial sequence number. Suppose the sender always used a fixed initial sequence number, such as 0. If it sent several connection requests, each would carry sequence number 0 and each reply would carry ACK 1, so the sender could not tell which request a given reply belongs to. With a random initial sequence number, two connection requests almost never share the same value. The sender can then use the ACK in the reply to find the matching request and establish the connection for it.
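The difference between a fixed and a random initial sequence number can be sketched as follows. The `Connector` class and the attempt labels are hypothetical, and `secrets.randbits` is used only as a stand-in for however a real TCP stack chooses its initial sequence number.

```python
import secrets

SEQ_SPACE = 2**32  # TCP sequence numbers are 32 bits wide


class Connector:
    def __init__(self):
        self.pending = {}  # expected ACK number -> label of the attempt

    def send_syn(self, label, isn=None):
        """Record a connection attempt and return the ISN it used."""
        if isn is None:
            isn = secrets.randbits(32)
        self.pending[(isn + 1) % SEQ_SPACE] = label
        return isn

    def match_syn_ack(self, ack):
        """Find the attempt that a received ACK number belongs to."""
        return self.pending.get(ack)


# Fixed ISN: both attempts expect ACK 1, so they cannot be told apart.
fixed = Connector()
fixed.send_syn("attempt-1", isn=0)
fixed.send_syn("attempt-2", isn=0)
print(len(fixed.pending))  # 1

# Random ISN: each attempt almost certainly expects a different ACK.
rand = Connector()
isn1 = rand.send_syn("attempt-1")
isn2 = rand.send_syn("attempt-2")
matched = rand.match_syn_ack((isn1 + 1) % SEQ_SPACE)
```

With the fixed value, the second attempt silently overwrites the first in the lookup table, which is the ambiguity described above.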
The window size
While establishing the connection, the sender and the receiver tell each other how many bytes they are able to receive. Both sides buffer incoming data until the application reads it, but the buffer has a fixed capacity. If more data arrives than the buffer can hold, the excess is discarded, which wastes bandwidth.
Consider what would happen without window size information. Once the receiver's buffer is full, the sender keeps sending large amounts of data, and the receiver has to discard it. The sender does not know the buffer is full, so after a timeout it retransmits the data. The receiver still has no room, so the retransmitted data only wastes bandwidth and may cause congestion. With window size information, a receiver that has run out of buffer space advertises a window size of 0. The sender then knows the receiver cannot accept more data and stops sending.
How does the sender know when the receiver has room again? While the window is 0, the sender periodically sends a probe segment containing at least one byte of data (RFC 793 recommends a two-minute interval), and the receiver reports its new window size once space is available.
When a segment arrives while the receiver's window is 0, the receiver must still reply with an ACK that carries the next sequence number it expects and its current window size (0).
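The zero-window behavior can be sketched with a toy receiver. This model assumes in-order delivery and ignores timers entirely; the `Receiver` class, its methods, and the 4-byte buffer are hypothetical.

```python
class Receiver:
    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffered = 0        # bytes waiting for the application
        self.next_expected = 0   # next sequence number we want

    def window(self):
        """Free buffer space, advertised to the sender as the window size."""
        return self.buffer_size - self.buffered

    def receive(self, seq, payload):
        """Accept in-order data if it fits; always answer with (ack, window)."""
        if seq == self.next_expected and len(payload) <= self.window():
            self.buffered += len(payload)
            self.next_expected += len(payload)
        return self.next_expected, self.window()

    def application_read(self, n):
        """The application consumes up to n bytes, freeing buffer space."""
        self.buffered -= min(n, self.buffered)


rx = Receiver(buffer_size=4)
print(rx.receive(0, b"abcd"))  # (4, 0): buffer full, window closes
print(rx.receive(4, b"e"))     # (4, 0): probe is not accepted, but still ACKed
rx.application_read(2)
print(rx.receive(4, b"e"))     # (5, 1): space is free again, probe accepted
```

The second call plays the role of the one-byte probe: the data is dropped, but the reply still tells the sender which sequence number is expected and that the window is 0.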
Can TCP carry data on the first handshake?
A related question is why the sender does not carry data in the first handshake segment. The receiver could store the data when it arrives and use it after the handshake completes. The downside is that an attacker could keep sending first-handshake requests that each carry a large amount of data. The receiver would have to buffer all of it, quickly run out of memory, and be unable to serve legitimate clients.
However, every new connection has to complete the three-way handshake before any data can be sent, and that delay is costly. TFO (TCP Fast Open) was developed to reduce the latency caused by the handshake.
TFO uses a TFO cookie to prove that the client has already completed a valid connection with the server. This is needed because, as described above, the three-way handshake both synchronizes the connection information and proves that the other party is reachable.
How it works: the TFO cookie (a TCP option) in the SYN packet at the start of the handshake validates a client that has connected before. If validation succeeds, the server can start sending data before it receives the final ACK of the three-way handshake, which saves a round trip and reduces latency at the start of the transfer. The cookie is generated cryptographically by the server, issued during the initial connection, and stored on the client. The client then presents it again each time it reconnects.
Request a Fast Open Cookie
sequenceDiagram
Client->>Server: SYN + Cookie Request
Server-->>Client: SYN + ACK + Cookie
Client-)Server: ACK
- On its first connection, the client includes a cookie request in the SYN of the first handshake.
- On receiving the cookie request, the server generates a cookie and returns it to the client in the SYN + ACK.
- The client saves the cookie for use in later handshakes.
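One way a server could generate and later verify such a cookie is sketched below. The construction is an assumption made for illustration: it uses a truncated HMAC over the client IP address as a stand-in, and the names `SERVER_KEY`, `make_cookie`, and `is_valid_cookie` are hypothetical, not part of any TCP stack's API.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; only the server ever knows it.
SERVER_KEY = secrets.token_bytes(16)


def make_cookie(client_ip: str, key: bytes = SERVER_KEY) -> bytes:
    """Derive an opaque cookie bound to the client's IP address."""
    digest = hmac.new(key, client_ip.encode(), hashlib.sha256).digest()
    return digest[:8]  # truncated so it fits in a TCP option (assumed length)


def is_valid_cookie(client_ip: str, cookie: bytes, key: bytes = SERVER_KEY) -> bool:
    """Recompute the cookie for this IP and compare in constant time."""
    return hmac.compare_digest(make_cookie(client_ip, key), cookie)


cookie = make_cookie("192.0.2.10")
print(is_valid_cookie("192.0.2.10", cookie))   # True
print(is_valid_cookie("203.0.113.7", cookie))  # False
```

Because the cookie is derived from a server-side secret, the server does not need to store one per client; it can recompute and compare on each connection.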
Start using TFO
sequenceDiagram
Client->>Server: SYN + Cookie + data
Server-->>Client: SYN + ACK
Server-->>Client: response data
Client-)Server: ACK
- To connect to the server again, the client sends a first handshake segment that carries three things: the SYN, the cookie stored on the client, and the data to be sent.
- The server receives the segment and verifies the cookie. If the cookie is valid, the server can send its own response data along with the handshake reply.
- The client sends an ACK to complete the handshake and acknowledge the data.
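The server's decision in the second step can be sketched as below. This is a simplified model: the `handle_syn` function, its return format, and the cookie values are hypothetical, and the fallback branch assumes the server behaves like a regular handshake when the cookie does not validate.

```python
def handle_syn(cookie: bytes, data: bytes, valid_cookies: set) -> dict:
    """Decide what the server does with a SYN that carries a cookie and data."""
    if cookie in valid_cookies:
        # Valid cookie: hand the data to the application right away and
        # acknowledge it, without waiting for the client's final ACK.
        return {"reply": "SYN+ACK", "acks_data": True, "deliver_to_app": data}
    # Missing or invalid cookie: acknowledge only the SYN and ignore the
    # data, so the client has to resend it after a normal handshake.
    return {"reply": "SYN+ACK", "acks_data": False, "deliver_to_app": b""}


known = {b"\x01\x02\x03\x04"}  # hypothetical cookies the server accepts
fast = handle_syn(b"\x01\x02\x03\x04", b"GET /", known)
slow = handle_syn(b"\xff\xff\xff\xff", b"GET /", known)
print(fast["deliver_to_app"], slow["deliver_to_app"])  # b'GET /' b''
```

In both branches the client still completes the handshake with an ACK; the cookie only decides whether the data in the SYN is used immediately.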
TFO allows TCP to transfer data during the handshake, which reduces handshake latency, but it has security weaknesses. Although the cookie is created with a cryptographic algorithm that third parties cannot forge, this does not stop an on-path attacker from capturing one. With a captured cookie, the attacker can forge the source IP address and port and send large amounts of data in the first handshake, exhausting server memory and interrupting service.
Conclusion
A connection request must carry the socket information, a random initial sequence number, and the window size. If the receiver does not reply within the timeout, the sender retransmits the request with the same information. It may retransmit several times (the number of retries is limited), but all of these requests belong to the same connection. If the user manually reconnects instead, the sender generates new connection information: a new socket, a new random initial sequence number, and a new window size. In that case there are two different connections. With only a two-way handshake, one of them would be established for nothing and waste resources. The sender can also forge its source IP address, in which case the receiver would not be connected to a real sender at all.
The handshake is essentially a test of whether the network is reachable. From the sender's side, sending a connection request and receiving the confirmation proves that the network is reachable and the connection can be set up. From the receiver's side, however, receiving a connection request proves nothing by itself: the request may be an old one from a client that has already gone offline, an invalid one, or one with a forged IP address. The receiver also needs to confirm that the network is reachable, so after receiving the confirmation the sender must acknowledge it to the receiver.