In-depth understanding of TCP congestion control

With the rapid development of network technology, more and more work depends on network to complete, and the quality and real-time performance of internet-based real-time communication system also depends on network quality to a great extent. However, the occurrence of congestion is inherent in the TCP/IP architecture of the Internet. Network congestion is refers to the users of network resources (including the link bandwidth, storage space and processor capacity, etc.) the demand more than its intrinsic capacity and capacity, compared with the UDP, TCP congestion control mechanism, its being and the need to guarantee reliable data transmission, it can cause a certain of audio and video real-time transmission based on TCP. This article will explain in depth the congestion control mechanism of TCP and how to design a real-time audio and video system based on TCP transmission.

PART 01 Introduction to TCP Congestion control

After TCP/IP stack became widely used, the network began to suffer from congestion and collapse. That is, the data sender sends its data packets to the Internet at the recommended speed. When some routers become congested, the data packets are discarded. For TCP, a transmission protocol with a retransmission mechanism, when data loss occurs, retransmission prolongs the time of data arrival. At the same time, the high frequency of retransmission will not alleviate network congestion, resulting in more congestion. To avoid such problems, TCP congestion control was introduced into the network protocol in the late 1980s.

Broadly speaking, TCP congestion control allows each source to determine how much capacity is available on the network so that it knows how many packets it can safely transmit, preventing too much data from being injected into the network, and preventing routes or links in the network from becoming overloaded. When congestion occurs on a network, congestion control reduces the speed of sending data to the network to prevent a vicious cycle. At the same time, when the network is idle, the speed of sending data is improved to maximize the utilization of network resources. Of course, determining available network capacity is not always easy, as new TCP connections are added and removed; To make matters worse, the available bandwidth varies over time, meaning that any given source must be able to adjust the number of packets it has in transit.

PART 02 Classification of TCP congestion control algorithms

In theory, congestion control can be implemented in two ways:

End-to-end congestion control: In this congestion control method, the sender determines whether it is congested and then adjusts the transmission rate.
Network-assisted congestion control: The router in the network tells the sender about the congestion condition of the network.

To realize congestion control through feedback of congestion information at the network layer, it needs the support of network devices and the modification of the underlying hardware. Most commonly used TCP protocols use end-to-end congestion control, that is, the sender determines whether congestion is present. If the sender detects this, it should slow down the rate at which it sends data. If not, it can slowly increase the rate. Congestion control algorithm needs to solve the following three problems:

How TCP limits the transmission rate of data;
How does TCP detect network congestion?
What algorithm does TCP use to adjust rates (when and by how much)?

During the development of TCP congestion control algorithm, the following different ideas appeared:

Congestion control based on packet loss: regard packet loss as congestion, adopt slow detection method, gradually increase the congestion window, when packet loss occurs, reduce the congestion window, such as Tahoe, Reno, BIC-TCP, Cubic, etc.
Delay based congestion control: the delay growth is regarded as congestion, and the congestion window is increased when the delay increases, and the congestion window is reduced when the delay decreases, such as Vegas and Westwood.
Congestion control based on link capacity: real-time measurement of network bandwidth and delay, congestion occurs when the total number of packets on the network is greater than the product of bandwidth and delay, such as BBR.
Learning-based congestion control: without specific congestion signals, but with the aid of evaluation functions, a control strategy based on training data is generated using machine learning methods, such as Remy.

PART 03 Common TCP congestion control algorithms

TCP Reno

Reno algorithm contains slow start, congestion avoidance, fast retransmission and fast recovery mechanisms, which are the basis of many existing congestion control algorithms based on packet loss. The sender maintains a state variable called congestion window CWND (Congestion Window) and slow start threshold SSthRESH state variable. Ssthresh is used as follows:

When CWND < SSTHRESH, the slow start algorithm is used.

When CWND > SSTHRESH, congestion avoidance algorithm is used instead.

When CWND = SSTHRESH, slow start with congestion avoidance algorithm arbitrary.

(1) Slow Start algorithm

CWND = 1 is initialized when the connection is established, indicating that one MSS size of data can be passed.

Whenever an ACK is received, cwnd++; It goes up linearly.
Whenever an RTT passes, CWND = CWND *2; It’s going up exponentially.

(2) Congestion Avoidance algorithm — Congestion Avoidance

When CWND >= SSTHRESH, the Congestion Avoidance algorithm is entered. The algorithm is as follows:

When receiving an ACK, CWND = CWND + 1/ CWND
For each RTT, CWND = CWND + 1

(3) Congestion state algorithm – Fast Retransmit

Reno starts retransmission when three duplicate acks are received, instead of waiting until the RTO times out. When congestion occurs:

cwnd = cwnd/2
sshthresh = cwnd

(4) Fast Recovery

CWND = sshthRESH + 3 * MSS

Retransmission of packets specified by Duplicated ACKs;
If DUPLICATED Acks are received, CWND = CWND +1;
If a new Ack is received, then CWND = sSHTHRESH and the congestion avoidance algorithm is entered.

BIC – TCP and CUBIC

In the case of TCP-RENO with large congested Windows, it takes a long time to recover the window narrowing caused by the loss of a packet (only increasing by 1 each time). In this way, the bandwidth utilization rate is not very high, and with the continuous improvement of the link bandwidth of the network, this disadvantage will become more and more obvious. In order to improve the performance of Reno congestion avoidance phase, BIC-TCP proposed the following dichotomous idea: When packet loss occurs, the optimal window value should be smaller than this value, so BIC sets CWND to max_win and multiplicative to min_win. BIC then performs dichotomy between the two — skipping to the midpoint of max_win and min_win.

BIC-TCP has good scalability, fairness among competing streams and low window oscillation stability in high-speed networks. However, bIC-TCP’s congestion control window growth can still be too large, especially over short RTT or low speed networks. In addition, several different stages of congestion window control (binary search increase, maximum probe, Smax, and Smin) add complexity to protocol implementation and performance analysis.

CUBIC is an improved algorithm of BIC-TCP. By looking for a new window growth function, CUBIC function, CUBIC algorithm keeps the advantages of BIC-TCP (especially its stability and scalability) while simplifying the complexity of window control. CUBIC window control function is as follows:

Where, W(t) is the size of the window at time t, C is CUBIC parameter, t is the time passed by the last window reduction, K is the time cycle required by the above function to increase W to Wmax without further loss event, when congestion event occurs, CUBIC sets the current CWND to Wmax * β, and multiplication reduces the coefficient; Therefore, K is:

As you can see from the picture below,

When CWND < Wmax. CUBIC is in the convex function region, namely the congestion avoidance region, the increase of CWND decreases with the increase of time.
When CWND >=Wmax. CUBIC is in the concave function area, i.e., the new Wmax detection area, the longer it is since the last congestion event, the faster the CWND growth is.

TCPW (TCP Westwood)

Congestion control method based on packet loss interpreted packet loss happened to the network congestion, and assume that link error caused by packet loss is ignored in high speed network, however, this assumption is not established, when higher data transfer rate, or in the wireless network environment, link error can not be ignored. Packet loss does not necessarily mean that the network is congested. At the same time, the congestion control method based on packet loss tends to fill the buffer. When the buffer of the bottleneck link is large, it takes a long time to empty the packets in the buffer, resulting in a large network delay, which is called buffer bloat.

Unlike congestion control based on packet loss, the TCPW sender monitors the rate at which ACK packets are received to estimate the data transmission rate (available bandwidth) that can be achieved by the current connection. When the sender detects packet loss (timeout or three duplicate ACKS), the sender sets the congestion window size (CWND) and slow start threshold (SSthRESH) according to the estimated sending rate.

TCPW evaluates the bandwidth of the current sampling point as follows:

Dk represents the amount of data sent, tk, tK-1 represents the time when the ACK package is received and the time when the last ACK was received. In order to remove the rate sampling noise brought by ACK, TCPW applies a low pass filter to the rate of sampling to obtain the low frequency part of the available bandwidth, resulting in the following discrete time filter:

For ease of understanding, suppose tK-TK-1 is a constant; The current bandwidth estimation can be simplified as:

It can be considered that the current assessed bandwidth is a smooth value of the previous assessed bandwidth and the last two assessed bandwidth; When ACK loss occurs, the bandwidth of the current sampling time is considered to be 0. It can be seen that TCPW considers the current bandwidth to be a multiplication reduction on the last evaluated bandwidth.

Tcp-westwood avoids too conservative window reduction operation. Compared with congestion control algorithm based on packet loss, TCP-Westwood is more suitable for TCP connection of wireless link class. TCPW came up with a similar idea for Google BBR back in the late 1990s, estimating the capacity of the network by continuously measuring bandwidth and minimum RTT, and eventually data convergence would occur.

BBR

BBR is a congestion algorithm designed by Google and released in 2016. In the past, most congestion algorithms are based on packet loss as a signal to reduce transmission rate, while BBR is based on model active detection.

Since the optimal bandwidth and delay cannot be measured at the same time (THE measurement of btlBw will cause the network cache to increase RTT, while the measurement of RTprop requires the network cache to be empty), the bandwidth (btlBw) and delay (RTprop) are estimated respectively, and finally CWND is calculated. At the same time, the variable pacing rate (btlBw * gain coefficient) is added to control the sending rate of the sender to solve the network queuing problem caused by the sender burst.

On the one hand, TCP BBR can improve the transmission rate in packet loss environment and make full use of network bandwidth. On the other hand, TCP BBR can also reduce the utilization of network link buffer and reduce transmission delay. TCP BBR is not only suitable for TCP scenarios, but also QUIC uses BBR as congestion control algorithm.

PART 04 Real-time Multimedia QoS and TCP congestion control

Although most of the real-time multimedia communication is based on UDP protocol to achieve, but there are some cases, need to use TCP to transmit audio and video; For example, UDP port masking. Compared with UDP data transmission packet loss, disorder, TCP network transmission data delay, queue blocking and other problems, real-time audio and video transmission also brings greater challenges.

Some features of real-time multimedia:

For the demand of video hd, under the scenario of large bit rate, the unit throughput and single frame size increase significantly, resulting in a large increase in the number of network packet loss.
The requirement of low delay in real-time communication leads to high requirement of real-time data transmission.

To better improve real-time multimedia QoS, you need to consider the following aspects when setting congestion control and traffic control in the TCP environment:

Network congestion events are quickly detected.
Precise control of coding rate, especially for video key frame coding, to avoid large network impact;
More active network detection, the full use of network bandwidth, can bring better audio and video experience;
Through Simulcast and SVC, in the case of insufficient bandwidth, as far as possible to ensure the real-time and fluency of audio and video;
Avoid sending invalid data to impact the network, such as FEC and NACK.

PART 5 summarizes

Congestion control should be a relatively complex part of TCP, through the introduction of TCP congestion control design ideas and some common congestion control algorithm design ideas, I hope you can have a better understanding of TCP congestion control.

In-depth understanding of TCP congestion control

PART 01 Introduction to TCP Congestion control

PART 02 Classification of TCP congestion control algorithms

PART 03 Common TCP congestion control algorithms

PART 04 Real-time Multimedia QoS and TCP congestion control

PART 5 summarizes

Related Posts

Thoughts of Software Testing Engineer in 10 Years — To a friend who is still confused: “Ten years of grinding a sword, the end of the article welfare”

C++ : compile source files

Calculation of vectors in the game (judge the overlap and intersection of line segments, combine line segments, and the relationship between points and lines)