My previous TCP article, “TCP three handshakes, four waves, and some details”, got good feedback, so I’ll continue with the topic of timeouts and retransmissions.
We all know that TCP has a retransmission mechanism: if the sender thinks a packet has been lost, it resends the packet. Obviously, we need a way to “guess” whether packet loss has occurred. The simplest idea is for the receiver to send an ACK for each packet it receives. Conversely, if the sender does not receive an ACK within a certain period of time, it assumes the packet was most likely lost and resends it until an ACK arrives.
You may have noticed that I said “guess”. Even when the timer expires, the packet may not have been lost at all; it may simply have taken a long detour and arrived late. After all, TCP is a transport layer protocol, and it cannot know exactly what is happening at the data link layer and the physical layer. This does not interfere with the timeout retransmission mechanism, however, because the receiver automatically ignores duplicate packets.
The concept of timeout and retransmission is as simple as that, but there are a lot of details inside. One of the first questions that comes to mind is: how long should the timeout be?
How is timeout determined?
A one-size-fits-all solution would be for me to set the timeout to a fixed value, say 200ms, but that would be a problem. Our computers interact with many servers, both at home and abroad, with vastly different latency. For example:
- My personal blog is hosted in China, and the one-way delay is about 30 ms, so under normal circumstances the ACK comes back in about 60 ms. With a fixed timeout, however, it takes 200 ms to detect a lost packet (90 to 120 ms would normally be enough), which is rather inefficient.
- If you visit a foreign website where the delay is 130 ms, things get worse. The round trip takes about 260 ms, so perfectly normal packets are judged to have timed out, and a large number of packets are resent. The resent packets are just as likely to be misjudged as timed out, which sets off an avalanche effect.
So a fixed value is not very reliable. We have to adjust the timeout dynamically based on the network latency: the larger the delay, the longer the timeout.
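To make the arithmetic in the two examples concrete, here is a tiny sketch. It assumes the delays quoted above are one-way delays, so one round trip is twice the delay; the function names are mine and purely illustrative.

```python
FIXED_TIMEOUT_MS = 200  # the fixed timeout value used in the text


def ack_arrives_after_timeout(one_way_delay_ms, timeout_ms=FIXED_TIMEOUT_MS):
    """True if a normal ACK (one full round trip) returns only after the timer fired."""
    return 2 * one_way_delay_ms > timeout_ms


def slack_ms(one_way_delay_ms, timeout_ms=FIXED_TIMEOUT_MS):
    """Time spent waiting beyond one round trip before a real loss is noticed."""
    return max(0, timeout_ms - 2 * one_way_delay_ms)


for label, delay in [("blog in China", 30), ("foreign website", 130)]:
    print(label, ack_arrives_after_timeout(delay), slack_ms(delay))
```

For the 30 ms case the timer is 140 ms longer than it needs to be; for the 130 ms case every packet looks lost even though nothing went wrong.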
Two concepts are introduced here:
- RTT (Round Trip Time): the time between sending a packet and receiving its ACK. RTT is per connection: each connection has its own RTT.
- RTO (Retransmission Time Out): the retransmission timeout period.
Compare this with the standard definition of RTT:
Measure the elapsed time between sending a data octet with a particular sequence number and receiving an acknowledgment that covers that sequence number (segments sent do not have to match segments received). This measured elapsed time is the Round Trip Time (RTT).
Classic methods
The original specification, “RFC0793”, uses the following formula to obtain a smoothed estimate of RTT (called SRTT):
SRTT <- α·SRTT + (1 - α)·RTT
Here RTT is the latest sample. This estimation method is called an “exponentially weighted moving average”. The name sounds fancy, but the formula is easy to understand: it takes a weighted average of the existing SRTT value and the latest measured RTT value.
With SRTT in hand, “RFC0793” computes the RTO as follows:
RTO = min(ubound, max(lbound, β·SRTT))
ubound is the upper bound of the RTO, lbound is the lower bound, and β is the delay variance factor, with a recommended value of 1.3 to 2.0. The formula takes β·SRTT as the RTO, but clamps it between the lower and upper bounds.
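Here is a minimal sketch of the classic calculation, with times in seconds. The default values (α = 0.9, β = 2.0, a 1 second lower bound and a 60 second upper bound) are the example values I remember from RFC0793, and seeding SRTT with the first sample is my own assumption, since the formula needs some starting value. The function name is illustrative.

```python
def classic_rto(samples, alpha=0.9, beta=2.0, lbound=1.0, ubound=60.0):
    """Yield (srtt, rto) after each RTT sample, using the RFC0793-style formulas."""
    srtt = None
    for rtt in samples:
        if srtt is None:
            srtt = rtt  # assumption: seed the average with the first sample
        else:
            srtt = alpha * srtt + (1 - alpha) * rtt  # weighted moving average
        rto = min(ubound, max(lbound, beta * srtt))  # clamp beta * SRTT
        yield srtt, rto


# Two quiet samples followed by a sudden spike to 1 second.
for srtt, rto in classic_rto([0.1, 0.1, 1.0], lbound=0.0):
    print(round(srtt, 3), round(rto, 3))
```

Notice how little SRTT moves after the spike: with α = 0.9, a single sample only contributes 10% of its value.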
This calculation method, at first glance, seems fine (at least to me), but when applied in practice, there are two drawbacks:
There were two known problems with the RTO calculations specified in RFC-793. First, the accurate measurement of RTTs is difficult when there are retransmissions. Second, the algorithm to compute the smoothed round-trip time is inadequate [TCP:7], because it incorrectly assumed that the variance in RTT values would be small and constant. These problems were solved by Karn’s and Jacobson’s algorithm, respectively.
This quote is from “RFC1122”. Let me explain:
- RTT calculation becomes “troublesome” when packets are retransmitted. I have drawn a picture to illustrate these situations:
The picture shows two cases in which RTT is calculated differently (this is known as retransmission ambiguity). Here T0 is when the packet was first sent, T1 is when it was retransmitted, and T2 is when the ACK arrived:
- Case 1: RTT = T2-T0
- Case 2: RTT = T2-T1
However, the client cannot tell which case has actually occurred, and picking the wrong one makes the RTT too large or too small, which distorts the RTO calculation. (The simplest and crudest solution is to ignore retransmitted packets and only take samples from packets that were never retransmitted, but this can cause other problems. See Karn’s Algorithm.)
- Another problem is that this algorithm assumes relatively small RTT fluctuations. The weighted average acts as a low-pass filter, so it is not sensitive to sudden changes in the network. If the network delay suddenly increases, the actual RTT becomes much larger than the estimate, so the RTO is too short, which leads to unnecessary retransmissions. (An increase in RTT already indicates that the network is overloaded, and these unnecessary retransmissions burden it even further.)
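For the first problem (retransmission ambiguity), here is a minimal sketch of the sampling rule in Karn's algorithm: a segment that has been retransmitted never contributes an RTT sample. The class and method names are my own, and the second half of Karn's algorithm (keeping the backed-off RTO until a valid sample arrives) is left out.

```python
class RttSampler:
    """Collects RTT samples only from segments that were sent exactly once."""

    def __init__(self):
        self.sent = {}     # seq -> (first send time, was it retransmitted?)
        self.samples = []  # unambiguous RTT samples

    def on_send(self, seq, now):
        if seq in self.sent:
            first, _ = self.sent[seq]
            self.sent[seq] = (first, True)  # retransmission: sample is ambiguous
        else:
            self.sent[seq] = (now, False)

    def on_ack(self, seq, now):
        entry = self.sent.pop(seq, None)
        if entry is None:
            return  # ACK for something we are not tracking
        first, retransmitted = entry
        if not retransmitted:
            self.samples.append(now - first)


sampler = RttSampler()
sampler.on_send(1, 0.0)
sampler.on_ack(1, 0.05)   # clean sample
sampler.on_send(2, 0.1)
sampler.on_send(2, 0.4)   # retransmitted
sampler.on_ack(2, 0.45)   # ignored: T2-T0 or T2-T1? We cannot know.
print(sampler.samples)
```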
The standard method
To tell the truth, this standard method is rather troublesome, so I will just paste the formulas:
SRTT <- (1 - α)·SRTT + α·RTT // As in the classic method, a weighted average for SRTT
rttvar <- (1 - h)·rttvar + h·|RTT - SRTT| // The gap between SRTT and the measured value (the absolute error |Err|), smoothed with a weighted average in the same way
RTO = SRTT + 4·rttvar // The new RTO estimate; the coefficient 4 on rttvar was found by tuning
The whole idea of this algorithm is to estimate the RTO from both the mean (as in the classic method) and the mean deviation, with a round of almost magical parameter tuning on top. To learn more about this algorithm, refer to “RFC6298”.
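A minimal sketch of this estimator, with times in seconds, might look like the following. As far as I know, RFC6298 uses α = 1/8 and a gain of 1/4 for the deviation (the h in the formulas above, which the RFC calls beta), updates rttvar before SRTT so that the deviation is measured against the old SRTT, initializes rttvar to half of the first sample, and rounds the RTO up to at least 1 second. The class name and parameter names are mine.

```python
class RtoEstimator:
    """Jacobson-style RTO estimator following the rules in RFC6298 (sketch)."""

    def __init__(self, alpha=1 / 8, h=1 / 4, k=4, granularity=0.0,
                 min_rto=1.0, max_rto=60.0):
        self.alpha, self.h, self.k = alpha, h, k
        self.granularity = granularity  # clock granularity G
        self.min_rto, self.max_rto = min_rto, max_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial RTO before any measurement

    def update(self, rtt):
        """Feed one RTT sample and return the new RTO."""
        if self.srtt is None:
            # First measurement: seed the mean and the deviation.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The deviation uses the SRTT value from *before* this update.
            self.rttvar = (1 - self.h) * self.rttvar + self.h * abs(self.srtt - rtt)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt
        rto = self.srtt + max(self.granularity, self.k * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto


est = RtoEstimator(min_rto=0.0)  # floor disabled so the effect is visible
for sample in [0.1, 0.1, 0.1, 0.5]:
    print(round(est.update(sample), 3))
```

With the floor disabled, the RTO shrinks while the samples are stable and jumps as soon as one sample deviates, because the deviation term reacts much faster than the mean.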
Retransmission – an important TCP event
Timer based retransmission
Under this mechanism, each packet has a corresponding timer. If the RTO elapses and no ACK has been received, the packet is resent. Packets that have not yet been acknowledged are kept in the retransmission buffer and removed from it once they are ACKed.
To be clear, a timeout retransmission is a very important event for TCP (the RTO is usually more than twice the RTT, so a timeout usually means congestion). When it happens, TCP not only retransmits the corresponding segment but also lowers its current sending rate, because it assumes the network is congested.
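The retransmission buffer and its timers can be sketched roughly like this. It is a toy model: the class and method names are mine, sequence numbers count packets rather than bytes, and the only reaction to a timeout that I model is doubling the RTO (the timer backoff described in RFC6298). The reduction of the sending rate mentioned above belongs to congestion control and is not modeled here.

```python
class RetransmitQueue:
    """Toy retransmission buffer: one deadline per unacknowledged packet."""

    def __init__(self, rto):
        self.rto = rto
        self.unacked = {}  # seq -> [payload, deadline]

    def send(self, seq, payload, now):
        self.unacked[seq] = [payload, now + self.rto]

    def ack(self, ack_no):
        """Cumulative ACK: everything below ack_no has been received."""
        for seq in [s for s in self.unacked if s < ack_no]:
            del self.unacked[seq]

    def tick(self, now):
        """Return the packets whose timer expired and re-arm them with a doubled RTO."""
        expired = sorted(s for s, (_, deadline) in self.unacked.items()
                         if deadline <= now)
        if expired:
            self.rto *= 2  # back off the timer
            for seq in expired:
                self.unacked[seq][1] = now + self.rto
        return expired


q = RetransmitQueue(rto=1.0)
q.send(1, b"a", now=0.0)
q.send(2, b"b", now=0.0)
q.ack(2)             # packet 1 is acknowledged and leaves the buffer
print(q.tick(1.0))   # packet 2 timed out and must be resent
```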
Simple timeout retransmission mechanisms tend to be inefficient, as in the following case:
If packet 5 is lost while packets 6, 7, 8, and 9 have all reached the receiver, the client can only wait for the server to send an ACK. Note that the server cannot acknowledge packets 6, 7, 8, and 9, because under the sliding window mechanism an ACK is cumulative, so the client has no idea how many packets are missing. It may pessimistically assume that the packets after 5 were lost too and retransmit all five packets (5 through 9), which is wasteful.
Fast retransmission
The fast retransmission mechanism “RFC5681” triggers retransmission based on the feedback from the receiving end, rather than the retransmission timer timeout.
As mentioned earlier, timer based retransmissions tend to wait a long time. Fast retransmission solves this in a clever way: if the server receives a packet out of order, it still replies to the client, repeating the ACK for the data it is waiting for. For example, the server sends ACK = 5 each time it receives the out-of-order packets 6, 7, 8, and 9, so the client knows there is a gap at 5. In general, if the client receives three duplicate ACKs in a row, it retransmits the corresponding packet without waiting for the timer to expire.
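The duplicate ACK counting can be sketched as follows. This is a simplification: “RFC5681” puts more conditions on what counts as a duplicate ACK (for example, it carries no data and does not change the advertised window), and the class name is my own.

```python
DUP_ACK_THRESHOLD = 3  # three duplicate ACKs trigger a fast retransmit


class FastRetransmitDetector:
    """Counts duplicate ACKs and reports when a fast retransmit is due."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return the sequence number to retransmit, or None."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack_no  # the receiver is still waiting for this one
        else:
            self.last_ack = ack_no  # new data acknowledged: reset the counter
            self.dup_count = 0
        return None


detector = FastRetransmitDetector()
# ACK = 5 arrives once normally, then again for packets 6, 7, 8 and 9.
print([detector.on_ack(a) for a in [3, 4, 5, 5, 5, 5, 5]])
```

The first ACK = 5 is the original one, so the retransmission is triggered by the fourth ACK = 5, which is the third duplicate.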
But fast retransmission still doesn’t solve the second problem: how many packets should be retransmitted?
Retransmission with selective acknowledgment
This improvement is SACK (Selective Acknowledgment). Simply put, on top of fast retransmission, the receiver also returns the sequence number ranges of the segments it has recently received, so the client knows which packets have already reached the server.
A few simple examples:
- Case 1: The first packet is lost and the remaining 7 packets are received. Whenever any of the seven packets is received, the receiver returns an ACK with the SACK option to tell the sender which out-of-order data it has received. Note: Left Edge and Right Edge are the left and right boundaries of the out-of-order data.
    Triggering    ACK       Left Edge    Right Edge
    Segment

    5000          (lost)
    5500          5000      5500         6000
    6000          5000      5500         6500
    6500          5000      5500         7000
    7000          5000      5500         7500
    7500          5000      5500         8000
    8000          5000      5500         8500
    8500          5000      5500         9000
- Case 2: The 2nd, 4th, 6th and 8th packets are lost.
  - When the first packet is received, nothing is out of order, so the receiver replies with a normal ACK.
  - When the 3rd, 5th, and 7th packets are received, the receiver replies with an ACK carrying SACK, because they arrive out of order.
  - Because the data is so fragmented in this case, the SACK option carries several blocks. The number of blocks is capped, of course, by the size limit of the TCP option field.
    Triggering    ACK       First Block      2nd Block        3rd Block
    Segment                 Left    Right    Left    Right    Left    Right
                            Edge    Edge     Edge    Edge     Edge    Edge

    5000          5500
    5500          (lost)
    6000          5500      6000    6500
    6500          (lost)
    7000          5500      7000    7500     6000    6500
    7500          (lost)
    8000          5500      8000    8500     7000    7500     6000    6500
    8500          (lost)
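To see where the numbers in the two tables come from, here is a small sketch of a receiver that builds the ACK and the SACK blocks for each arriving segment. It follows my reading of “RFC2018”: the first block contains the segment that triggered the ACK, the other blocks follow from most to least recently updated, and at most 3 blocks are reported (the limit when the timestamp option is also in use). The class name is mine, and edges are byte sequence numbers with the right edge exclusive.

```python
class SackReceiver:
    """Tracks out-of-order data and produces (ack, sack_blocks) per segment."""

    def __init__(self, next_expected, max_blocks=3):
        self.ack = next_expected  # next in-order byte we are waiting for
        self.blocks = []          # out-of-order (left, right) ranges, newest first
        self.max_blocks = max_blocks

    def receive(self, left, right):
        kept = []
        for l, r in self.blocks:
            if l <= right and r >= left:  # touches or overlaps the new data
                left, right = min(l, left), max(r, right)
            else:
                kept.append((l, r))
        if left <= self.ack:
            # The data is in order (or fills the gap): advance the cumulative ACK.
            self.ack = max(self.ack, right)
            self.blocks = kept
        else:
            # Still out of order: report this block first.
            self.blocks = [(left, right)] + kept
        return self.ack, self.blocks[:self.max_blocks]


# Case 2: segments 5000, 6000, 7000 and 8000 arrive, every other one is lost.
rx = SackReceiver(next_expected=5000)
for seq in (5000, 6000, 7000, 8000):
    print(seq, rx.receive(seq, seq + 500))
```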
However, the SACK specification, “RFC2018”, contains a bit of a trap. The receiver may report data to the sender in a SACK and then “renege”; that is, it may discard those (out-of-order) packets even though it has already reported them to the sender. The following is an excerpt from “RFC2018”:
Note that the data receiver is permitted to discard data in its queue that has not been acknowledged to the data sender, even if the data has already been reported in a SACK option. Such discarding of SACKed packets is discouraged, but may be used if the receiver runs out of buffer space.
In other words, this behavior is discouraged, but the receiver is allowed to do it when its buffer is about to run out.
Because of this, the sender cannot remove data from its retransmission buffer as soon as it is SACKed. It has to wait until the receiver sends a normal ACK number greater than the data's highest sequence number. The retransmission timer is affected as well: it should ignore SACK, because data discarded by the receiver is no different from a lost packet.
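On the sender side, the rule can be sketched like this: a SACK only marks a segment as “probably received”, the buffer is freed only by the cumulative ACK, and a retransmission timeout clears all the marks because the receiver may have reneged (as far as I know, this is what “RFC2018” recommends). The class and method names are mine.

```python
class SenderBuffer:
    """Retransmission buffer that treats SACK as a hint, not as a final ACK."""

    def __init__(self):
        self.segments = {}  # left edge -> [right edge, sacked?]

    def sent(self, left, right):
        self.segments[left] = [right, False]

    def on_ack(self, ack_no, sack_blocks=()):
        # Only the cumulative ACK is allowed to free buffered data.
        for left in [l for l, (r, _) in self.segments.items() if r <= ack_no]:
            del self.segments[left]
        # SACK blocks merely mark segments so they are not retransmitted early.
        for left, seg in self.segments.items():
            if any(bl <= left and seg[0] <= br for bl, br in sack_blocks):
                seg[1] = True

    def on_timeout(self):
        # The receiver may have discarded SACKed data: forget every mark.
        for seg in self.segments.values():
            seg[1] = False

    def to_retransmit(self):
        return sorted(l for l, (_, sacked) in self.segments.items() if not sacked)


buf = SenderBuffer()
for left in range(5000, 7000, 500):
    buf.sent(left, left + 500)
buf.on_ack(5000, [(5500, 6500)])
print(buf.to_retransmit())   # the SACKed segments are skipped
print(len(buf.segments))     # but nothing has been freed yet
```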
DSACK extension
DSACK, or duplicate SACK, builds on SACK and carries additional information that tells the sender which packets were received more than once. The purpose of DSACK is to help the sender determine whether packets were reordered, an ACK was lost, a packet was duplicated, or a retransmission was spurious, so that TCP can make better flow control decisions.
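How does the sender recognize a DSACK? As I understand “RFC2883”, the duplicate is always reported in the first SACK block, and the sender checks two things: whether that block lies below the cumulative ACK, or whether it is contained in the second block. A minimal sketch follows; the function name and the sample numbers are mine, chosen only for illustration.

```python
def is_dsack(ack_no, sack_blocks):
    """Return True if the first SACK block reports duplicate data."""
    if not sack_blocks:
        return False
    left, right = sack_blocks[0]
    if left < ack_no:
        # The block covers data that the cumulative ACK already acknowledged.
        return True
    if len(sack_blocks) > 1:
        left2, right2 = sack_blocks[1]
        # A duplicate above the cumulative ACK sits inside the second block.
        return left2 <= left and right <= right2
    return False


print(is_dsack(4000, [(3000, 3500)]))                # duplicate below the ACK
print(is_dsack(3500, [(5000, 5500), (5000, 6000)]))  # duplicate inside a block
print(is_dsack(3500, [(5000, 5500), (4000, 4500)]))  # an ordinary SACK
```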
“RFC2883” gives many examples of DSACK. Interested readers can go and read it; I won't go into the details here.
That is about all there is to timeouts and retransmissions. I hope it helps you.