The origin of TCP congestion control
In 1986, network throughput from LBL to UC Berkeley dropped dramatically from 32Kbps to 40bps due to Congestion. Van Jacobson’s 1988 paper “Congestion Avoidance and Control” started with this problem. The law of packet conservation and the algorithm of slow start, congestion control and fast retransmission were proposed, and the fast recovery algorithm was put forward in 1990.
Packet conservation principle
The flow of packets over a smoothly running TCP connection should be conserved, meaning that new packets can be added to the connection only after the old packets have been successfully transferred to the peer. In TCP, we can use ack to determine whether the packet has reached the peer. That is, when the sender receives a good ACK (an ACK greater than the maximum ack the sender has received), it can send a new packet. This mechanism of deciding to continue sending packets based on ACK is called self clocking (also called ACK clocking).
Slow start
According to the principle of packet conservation, we know that we can use ack to decide whether to send new data, and send data before receiving ACK. Slow start starts behavior control when sending data. The whole idea of a slow start is to start with a very low initial value and gradually increase the speed of data delivery until a timeout or packet loss is reached. The implementation idea of slow start is as follows:
- Maintains a variable CWND (Congestion Window) per connection
- Set CWND to 1, MSS (maximum segment size)
- Each time a new ACK is received, CWND adds one
- When sending data, the number of data packets that can be sent is min(CWND, AWND), and AWND is the size of the sliding window of the receiving end (Reciver’s advertised window).
As can be expected, slow starts grow exponentially without timeouts or packet loss, so slow starts are not actually that slow. “slow” is slow with only 1 MSS at its starting point.
Congestion avoidance
As mentioned earlier, the purpose of slow start is to gradually increase the transmission speed to test until there is congestion, and what to do when there is congestion is what congestion avoidance does. Congestion avoidance mainly consists of two parts:
- A mechanism for determining current network congestion
- A mechanism to reduce transmission speed in case of congestion
The implementation idea of congestion avoidance is as follows:
- When timeout occurs, set CWND to half of the current value (that is, congestion is considered when timeout occurs)
- CWND + 1/ CWND + 1/ CWND + 1/ CWND + 1/ CWND + 1
The behavior of the two changes here, CWND, is commonly referred to as “multiplication decreases” and “addition increases.”
Combine slow start and congestion avoidance algorithms
It is worth noting that slow start and congestion avoidance are actually two different algorithms, one for testing the upper limit of network resources and the other for behavior when resource utilization reaches or approaches the upper limit. In 1988, an algorithm combining slow start and congestion avoidance was presented, and the specific implementation ideas are as follows:
- The sender maintains two variables: Congestion window CWND (Congestion window) and slow start threshold SSthRESH (slow Start threshold), which determine whether the slow start or congestion avoidance algorithm should be implemented.
- When sending data, the number of packets that can be sent is min(CWND, AWnd)
- When a timeout occurs, SSTHRESH is updated to CWND /2 (but not less than 2) and CWND is set to 1
- Each time a new ZCK is received, the sender’s behavior is:
- if
cwnd<ssthresh
.cwnd+=1
(In slow start phase, window index level increases) - Otherwise,
cwnd+=1/cwdn
(Congestion avoidance stage, window increases linearly)
- if
Fast retransmission
The purpose of fast retransmission is to make the sender feel the packet loss as soon as possible. The TCP sender starts a timer every time it sends a segment, and if the corresponding packet is confirmed not to have been sent back within a certain time, the sender assumes that the segment has been lost on the network and needs to be resent. This is also the measure TCP uses to estimate RTT.
Repeated confirmation
Duplicate acknowledgment is based on the following process: If the receiver receives a segment, the sequence number of the segment is sent back to the sender with a value of data bytes as the segment acknowledgment number, indicating that the sender is expected to send the segment of the next sequence number. However, if the recipient receives the segment with a larger sequence number early, or if the segment arrives out of order, the recipient needs to send the segment confirmation immediately using the previous confirmation number. At this time, if the sender receives the segmented confirmation with the same confirmation number from the receiver more than once, and the segmented timeout timer of the corresponding serial number has not yet timed out, it is a duplicate confirmation and fast retransmission is required.
Fast to send the retransmission mechanism is based on the following: if repeated threshold to 3, when the sender received four of the same confirmation number segment confirmation (first received confirmed expectations serial number, repeat 3 times the expectations of the serial number to confirm), you can think to continue to send more serial number of the segmented recipient will be discarded, and won’t be able to order delivery. The sender should ignore the timeout timer waiting for retransmission and immediately retransmit the fragment corresponding to the serial number in the repeated fragment acknowledgement.
The implementation of TCP congestion control
The implementation of TCP congestion control is listed here.
- TCP Tahoe/Reno, Reno added the “fast Recovery” phase for the first time
- TCP Vegas
- TCP New Reno
- TCP BIC/CUBIC
- TCP Westwood/Westwood+
- Compound TCP
- TCP PRR
- TCP BBR