Hello, I’m Xiao Lin.
Why do the client and server have different initial sequence numbers during the TCP three-way handshake?
I suspect many people have wondered about this, so today I'll walk through it step by step.
Why do the client and server require different initial sequence numbers during the TCP three-way handshake?
The main reason is to prevent historical packets from being accepted by a later connection that uses the same four-tuple.
Doesn't the TIME_WAIT state of the TCP four-way close last for 2 MSL, so any historical packets would have vanished long before?
Yes, if the connection is closed normally with the four-way close, the TIME_WAIT state lasts for 2 MSL, and historical packets will have disappeared before the next connection is established.
But there is no guarantee that every connection is closed cleanly with the four-way close.
Assume that every time a connection is established, the client and server both start their sequence numbers at 0:
The process is as follows:
- A client establishes a TCP connection with a server, and a data packet sent by the client gets held up in the network. Meanwhile the server restarts and loses its connection state, so it answers the client's packets with RST, which tears the connection down.
- Next, the client establishes a new connection to the server with the same four-tuple as the previous one;
- After the new connection is established, the packet that was held up during the previous connection finally reaches the server, and its sequence number happens to fall inside the server's receive window. The server therefore accepts it as valid data, which corrupts the data stream.
As you can see, if the client and server start from the same initial sequence number every time a connection is established, a historical packet can easily be accepted by the next connection with the same four-tuple.
Couldn't the same thing still happen when the client and server use different initial sequence numbers?
Yes, it is possible to receive historical packets even if the initial sequence numbers of the client and server are different.
However, whether a historical packet is accepted by the peer depends on whether its sequence number falls inside the peer's receive window. If it does not, the packet is discarded; if it does, the packet is accepted.
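The window check described above can be sketched in a few lines of Python. This is a simplified illustration, not the full acceptance test from the TCP specification (which also takes segment length into account); the function name `in_receive_window` and the parameter names `rcv_nxt` and `rcv_wnd` are assumptions chosen for this example.

```python
SEQ_SPACE = 2**32  # sequence numbers are 32-bit and wrap around


def in_receive_window(seq, rcv_nxt, rcv_wnd):
    """Return True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd) modulo 2**32."""
    # Distance from the next expected byte, computed with wraparound.
    return (seq - rcv_nxt) % SEQ_SPACE < rcv_wnd


# A delayed segment from an old connection that started at sequence number 0.
stale_seq = 1000
window = 65535

# The new connection also started near 0: the stale segment is accepted.
print(in_receive_window(stale_seq, rcv_nxt=1, rcv_wnd=window))              # True
# The new connection started somewhere very different: the segment is dropped.
print(in_receive_window(stale_seq, rcv_nxt=3_000_000_001, rcv_wnd=window))  # False
```

The modulo in the comparison keeps the check valid even when the window itself straddles the wrap from 2^32 - 1 back to 0.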
If the client and server pick different initial sequence numbers each time a connection is established, the sequence numbers of historical packets will very likely fall outside the peer's receive window, so historical packets are largely kept out, as shown in the following figure:
Conversely, if the client and server start from the same sequence numbers every time a connection is established, the sequence numbers of historical packets will very likely fall inside the peer's receive window, and the new connection will accept them.
Therefore, using different initial sequence numbers largely prevents historical packets from being accepted by the next connection with the same four-tuple. Note the word “largely”: the problem is reduced, not eliminated.
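To put a rough number on “largely”, here is a back-of-the-envelope sketch. It assumes the stale packet's sequence number is effectively uniformly distributed relative to the new connection's window, which is an idealization introduced for this example; `stale_hit_probability` is a hypothetical helper name.

```python
SEQ_SPACE = 2**32  # size of the 32-bit sequence number space


def stale_hit_probability(rcv_wnd):
    """Chance that a uniformly placed stale sequence number lands in the window."""
    return min(rcv_wnd, SEQ_SPACE) / SEQ_SPACE


# Window sizes in bytes: a classic 64 KB window, 1 MB, and 1 GB.
for wnd in (65_535, 1 << 20, 1 << 30):
    print(f"window {wnd:>10} bytes -> {stale_hit_probability(wnd):.6%}")
```

Under this assumption, small windows make an accidental hit very unlikely, while very large windows make it far more plausible, which is why random initial sequence numbers alone are not a complete fix.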
The client and server initial sequence numbers are random, so couldn't the same sequence number be generated by chance?
RFC 793 drives the ISN from a clock, and RFC 1948 (since obsoleted by RFC 6528) adds a per-connection hash to randomize it: ISN = M + F(localhost, localport, remoteHost, remotePort).
- M is a timer that increments by one every 4 microseconds.
- F is a hash algorithm that produces a pseudo-random value from the source IP address, destination IP address, source port, and destination port. The hash must not be easy for an outsider to compute.
As you can see, the ISN keeps advancing with the clock-driven timer, so it is almost impossible for two connections to end up with the same initial sequence number by chance.
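The formula ISN = M + F(...) can be sketched as follows. This is a minimal illustration, not how any particular operating system implements it: the choice of HMAC-SHA256 for F, the per-process `SECRET_KEY`, and the function names are all assumptions made for this example.

```python
import hashlib
import hmac
import os
import time

SEQ_SPACE = 2**32
SECRET_KEY = os.urandom(16)  # hypothetical secret so outsiders cannot compute F


def clock_ticks(now_us=None):
    """M: a counter that advances by one every 4 microseconds."""
    if now_us is None:
        now_us = time.monotonic_ns() // 1000  # current time in microseconds
    return (now_us // 4) % SEQ_SPACE


def tuple_hash(local_ip, local_port, remote_ip, remote_port, key=SECRET_KEY):
    """F: a keyed hash of the four-tuple, truncated to 32 bits."""
    material = f"{local_ip}:{local_port}:{remote_ip}:{remote_port}".encode()
    digest = hmac.new(key, material, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def generate_isn(local_ip, local_port, remote_ip, remote_port,
                 now_us=None, key=SECRET_KEY):
    """ISN = M + F(four-tuple), reduced modulo 2**32."""
    m = clock_ticks(now_us)
    f = tuple_hash(local_ip, local_port, remote_ip, remote_port, key)
    return (m + f) % SEQ_SPACE


print(generate_isn("10.0.0.1", 40000, "10.0.0.2", 80))
```

For a fixed four-tuple, F is constant, so successive ISNs still advance with the clock; different four-tuples get unrelated offsets because the hash output differs.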
OK, so since the client and server initial sequence numbers are randomly generated, a connection can avoid receiving historical packets?
Yes, largely, but not completely.
To better understand this reason, let’s first look at the sequence number (SEQ) and the initial sequence number (ISN).
- Sequence number: a TCP header field that identifies a byte in the data stream from the TCP sender to the TCP receiver. TCP is a reliable, byte-stream-oriented protocol. To guarantee ordering and reliability, TCP numbers every byte in each direction of transmission, which makes it possible to acknowledge successful delivery, retransmit after loss, and keep data in order at the receiver. The sequence number is a 32-bit unsigned integer, so it wraps back to 0 after covering 4 GB (2^32 bytes).
- Initial sequence number: when a TCP connection is established, the client and server each generate an initial sequence number from a clock-based random value, so that each connection starts from a different number. The initial sequence number generator can be viewed as a 32-bit counter that increments by 1 every 4 microseconds and takes about 4.55 hours to cycle (the figure quoted in RFC 793).
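The wraparound of a 32-bit sequence number can be illustrated with modular arithmetic. This is a small sketch; `next_seq` and `seq_before` are hypothetical helper names, and the "before" comparison is only meaningful when the two numbers are less than 2^31 apart.

```python
SEQ_SPACE = 2**32  # 32-bit unsigned sequence space


def next_seq(seq, payload_len):
    """Sequence number of the byte that follows payload_len bytes starting at seq."""
    return (seq + payload_len) % SEQ_SPACE


def seq_before(a, b):
    """True if a comes before b, assuming they are less than 2**31 apart."""
    return a != b and (b - a) % SEQ_SPACE < 2**31


start = 4_294_967_000           # close to the top of the sequence space
after = next_seq(start, 500)    # sending 500 bytes wraps past 2**32 - 1
print(after)                    # 204
print(seq_before(start, after)) # True: 204 is "after" 4294967000 despite being smaller
```

The example shows why a plain numeric comparison is not enough: after a wrap, newer data carries a numerically smaller sequence number.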
Seq in the figure below is the sequence number; the red boxes mark the initial sequence numbers generated by the client and the server respectively.
As we have seen, neither the sequence number nor the initial sequence number grows without bound: both wrap around to their starting value, which means old and new data cannot be told apart by sequence number alone.
Do not assume that because the sequence space is 4 GB, it is too large to wrap quickly. When a large amount of data is transferred over a fast enough network, the time to wrap becomes short. If it becomes short enough, we again face the problem that a previously delayed packet arrives while its sequence number is still valid.
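A quick calculation shows how short the wrap time can get. The sketch below assumes the link is fully saturated by a single connection, which is an idealization; `wrap_seconds` is a hypothetical helper name.

```python
SEQ_SPACE_BYTES = 2**32  # one sequence number per byte, 32-bit space


def wrap_seconds(bits_per_second):
    """Time to send 2**32 bytes at the given line rate, in seconds."""
    bytes_per_second = bits_per_second / 8
    return SEQ_SPACE_BYTES / bytes_per_second


# Line rates in bits per second: 100 Mbps, 1 Gbps, 10 Gbps.
for label, rate in (("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)):
    print(f"{label:>8}: sequence numbers wrap after about {wrap_seconds(rate):.1f} s")
```

At 1 Gbps the sequence space is exhausted in roughly 34 seconds, and at 10 Gbps in under 4 seconds, which is well within the time a delayed packet can linger in the network.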
To solve this problem, TCP needs timestamps. The tcp_timestamps parameter is enabled by default. When it is on, TCP segments carry the timestamp option in the header, which has two benefits: it allows accurate RTT calculation, and it enables Protection Against Wrapped Sequence numbers (PAWS).
Consider the following example. Assume that the TCP send window is 1 GB and the timestamp option is in use, so the sender stamps every TCP segment with a timestamp value.
The 32-bit sequence number wraps around between times D and E. Suppose a segment is lost at time B and retransmitted, and suppose the lost segment takes a long route through the network and reappears at time F. If TCP cannot detect the wraparound, data integrity is compromised.
The timestamp option effectively prevents this problem. If the lost segment reappears at time F, its timestamp is 2, which is smaller than the most recent valid timestamp (5 or 6), so PAWS discards it.
PAWS requires both sides to keep the timestamp of the most recently received segment (Recent TSval). The timestamp of each newly received segment is compared with Recent TSval; if it is smaller, the segment is considered expired and is discarded.
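The PAWS check can be sketched as a small filter that remembers the most recent timestamp. This is a simplified illustration under stated assumptions: real TCP stacks apply extra conditions before updating the remembered timestamp, and the class name `PawsFilter` and helper `ts_before` are hypothetical. The timestamp values reuse the example above, where a stale segment stamped 2 shows up after timestamp 5.

```python
TS_SPACE = 2**32  # timestamp values are also 32-bit and can wrap


def ts_before(a, b):
    """True if timestamp a is older than b, assuming they are under 2**31 apart."""
    return a != b and (b - a) % TS_SPACE < 2**31


class PawsFilter:
    """Tracks Recent TSval and rejects segments with an older timestamp."""

    def __init__(self):
        self.recent_tsval = None  # nothing received yet

    def accept(self, tsval):
        # Discard the segment if its timestamp is older than the last one seen.
        if self.recent_tsval is not None and ts_before(tsval, self.recent_tsval):
            return False
        self.recent_tsval = tsval
        return True


paws = PawsFilter()
# Timestamps 1..5 arrive in order, then the stale segment (2), then 6.
results = [paws.accept(ts) for ts in (1, 2, 3, 4, 5, 2, 6)]
print(results)  # [True, True, True, True, True, False, True]
```

The stale segment is rejected purely on its timestamp, even if its sequence number happens to fall inside the current receive window after a wrap.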
Got it: the initial sequence numbers of both the client and the server are randomly generated, which largely prevents historical packets from being accepted by the next connection with the same four-tuple. The timestamp mechanism is then introduced, which completely avoids the problem of historical packets being accepted.
Yes, exactly.