This article covers quite a few knowledge points — each of them could be a standalone article — so feel free to jump around using the headings (the foreword can be skipped).
1. Introduction
Why call this a Fresh Start series? I considered calling it a retrospective, but on second thought it seemed better to start over. When I first entered the operations field, I crammed in new knowledge frantically, chasing only how to get things working quickly while ignoring the principles behind them, sometimes just copying and pasting. As a result, I never felt I had fully absorbed a technique well enough for it to settle as a leaf node on my knowledge tree (in plain terms: I couldn't hold my own when swapping war stories with others). Later, talking these things over with friends, I noticed another interesting problem: many of us never learned any of it systematically. Concepts flooded in all at once, creating a confusion that lingered for a long time, until we calmed down and untangled it. So I'm taking this opportunity to start the series afresh and re-study some of those fuzzy fundamentals (a bit like going back with better gear to clear the dungeons you couldn't beat before).
Let's move on to today's topic: layer four of the seven-layer model, the transport layer.
2. Learn some simple concepts
In the transport layer, we usually talk about two protocols, UDP and TCP. Before we talk about these two protocols, we need to clarify some prior knowledge.
- First, let's get the names of the data units straight:
  - at the application layer, the unit is called a message;
  - at the transport layer, a segment;
  - at the network layer, a datagram.
- The transport protocol must at least provide error verification to detect data loss or errors.
- Moving segments between the transport layer and host processes involves multiplexing (transport-layer multiplexing) and demultiplexing.
- Multiplexing: the sending host gathers data chunks from different sockets, wraps each with header information, and passes the resulting segments down to the network layer.
- Demultiplexing: the receiving host takes each segment arriving at the transport layer and, by examining the fields that identify the socket, delivers its data to the correct process.
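The two operations can be sketched in a few lines of Python. This is a toy model for intuition only — real kernels do none of this in Python, and the dictionaries standing in for sockets and port tables are invented for illustration:

```python
# Toy sketch of transport-layer multiplexing and demultiplexing.
def multiplex(sockets):
    """Collect data from several sockets and stamp each chunk
    with header info (source port, destination port)."""
    segments = []
    for (src_port, dst_port), data in sockets.items():
        segments.append({"src": src_port, "dst": dst_port, "data": data})
    return segments

def demultiplex(segments, port_table):
    """Deliver each arriving segment to the process bound to its
    destination port."""
    delivered = {}
    for seg in segments:
        proc = port_table[seg["dst"]]
        delivered.setdefault(proc, []).append(seg["data"])
    return delivered

sending = {(50001, 53): b"dns query", (50002, 80): b"http request"}
segs = multiplex(sending)
procs = demultiplex(segs, {53: "resolver", 80: "webserver"})
assert procs == {"resolver": [b"dns query"], "webserver": [b"http request"]}
```

The point of the sketch: header fields written at multiplexing time are exactly what demultiplexing reads to route data to the right process.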
Now, before we talk about UDP and TCP, one interesting thing to note is their full names: User Datagram Protocol (UDP) and Transmission Control Protocol (TCP). The word "Control" hints at TCP's reliability. But reliability alone does not make TCP better than UDP — technology is judged by scenario, and discussing a technology divorced from its scenario is playing dirty.
3. UDP is transmitted without connection
3.1 Introduction to UDP and its features
-
UDP provides only the most basic transport services — multiplexing/demultiplexing and a small amount of error detection — which means the upper-layer program learns only that data is missing or corrupt and therefore unusable.
-
Before a UDP segment is sent, there is no handshake between sender and receiver at the transport layer, which is why UDP is called connectionless.
-
Another obvious difference from TCP is how sockets send and receive: when sending, data goes straight to the destination host's port (socket); on the receiving side, segments from any source host all reach the upper application through the same socket, and the order of the data is not guaranteed. TCP, by contrast, sets up a separate socket for each sender — more on that later.
Instead of more theory, let's walk through a little Python code below — pay attention to the comments.
The server (udp_server.py):

```python
from socket import *

# Define the service port
server_port = 12000
# SOCK_DGRAM selects a UDP socket
server_socket = socket(AF_INET, SOCK_DGRAM)
# Bind the port; the system associates a socket with it
server_socket.bind(("", server_port))
while True:
    # Read up to 2048 bytes from the socket along with the client's
    # address information (source host and source port)
    message, client_address = server_socket.recvfrom(2048)
    # Quick processing of the message
    modified_message = message.upper()
    # Send the result back to the client through the same socket
    server_socket.sendto(modified_message, client_address)
```
The client (udp_client.py):

```python
from socket import *

server_name = "127.0.0.1"
server_port = 12000
# SOCK_DGRAM selects a UDP socket
client_socket = socket(AF_INET, SOCK_DGRAM)
message = "This is a test for udp."
# No connection to establish; send the datagram straight to the destination port
client_socket.sendto(message.encode(), (server_name, server_port))
received_message, _ = client_socket.recvfrom(2048)
client_socket.close()
```
3.3 UDP Packet Segment Structure
As the packet structure shows, UDP is not complicated. Besides the source and destination ports, the header carries a length and a checksum. The length tells the receiving program how long the application data in the current datagram is, so it can allocate a buffer of the right size to read it. The checksum is used for error detection only — it cannot repair errors, so the upper-layer application merely gets a warning.
The following figure shows the UDP packet structure:
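The checksum just mentioned is the standard Internet checksum — a 16-bit ones'-complement sum (RFC 1071). Here is a minimal sketch of how it is computed and how a receiver verifies it; the payload bytes are made up for illustration:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum used by UDP/TCP/IP (RFC 1071)."""
    if len(data) % 2:                 # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

payload = b"\x12\x34\x56\x78"
csum = internet_checksum(payload)
# Receiver-side check: summing the data words plus the checksum
# must give 0xFFFF, otherwise an error occurred in transit.
total = 0
for word in [0x1234, 0x5678, csum]:
    total += word
    total = (total & 0xFFFF) + (total >> 16)
assert total == 0xFFFF
```

Note that a passing checksum can only warn about corruption; there is no information in it that would let UDP repair the data.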
3.4 Advantages and disadvantages of UDP
3.4.1 Advantages
The advantages of UDP are also discussed in some of its application scenarios.
-
The application layer gets finer-grained control over what data is sent and when. Real-time communication and live video streaming, for example, can tolerate losing some data but demand that segments move fast. TCP is a poor fit here because its congestion control throttles the sending rate in the name of transmission quality. (That is also why some TCP-based applications open as many concurrent connections as possible to grab bandwidth.)
-
No connection establishment is needed. If DNS ran over TCP, for example, every lookup would pay for a three-way handshake, slowing resolution down considerably.
-
No connection state. Consider a live-streaming service with a huge number of active users: with TCP, the system must maintain per-connection state — send and receive buffers, congestion-control parameters, sequence and acknowledgment numbers — so the same software supports far fewer concurrent users over TCP than over UDP.
3.4.2 Disadvantages
UDP's disadvantages boil down to having essentially no control over its traffic:
- No congestion control: it does not back off fairly, which can lead to severe network congestion;
- No retransmission: lost packets stay lost, even when the loss is known;
- No flow control: datagrams can overflow buffers on network devices, causing packet loss.
3.5 Some extended knowledge
3.5.1 The KCP protocol
KCP is a fast and reliable protocol that can reduce the average latency by 30% to 40% and the maximum latency by up to three times at the cost of wasting 10% to 20% of the bandwidth compared to TCP.
-
This is just a brief introduction; you can think of KCP as UDP wrapped in a TCP-like layer — a faster, more flexibly configurable "TCP". Why? Because it adds many of TCP's control mechanisms on top of UDP (fewer than TCP itself), then tunes them: faster retransmission and a smaller RTO, trading extra bandwidth for a higher transmission rate. After reading the TCP sections, come back and compare KCP's controls to see its advantage on high-latency links, where TCP's congestion control shrinks the transmission window drastically and hurts throughput.
-
Source code address: github.com/skywind3000…
3.5.2 The QUIC protocol
Quick UDP Internet Connections (QUIC) is a transport protocol built on top of UDP.
- I haven't studied it in depth; here is a good introductory article: zhuanlan.zhihu.com/p/32553477
4. Connection-oriented TCP
4.1 Introduction
It is well known that TCP is connection-oriented and provides reliable transmission, flow control, and congestion control. If any of these features sounds unfamiliar, let's go through them today. If anything is unclear or wrong, please point it out in the comments. (Many terms will appear later — please be patient.)
The following sections introduce some events that TCP may encounter in daily life.
4.2 Connection-Oriented
We say TCP is connection-oriented, but what exactly does that mean? Before one application process can start sending data to another, the two processes must "shake hands": they exchange preliminary segments to establish the parameters of the coming data transfer. This step is necessary because we are building reliable transport on top of an unreliable layer-3 network. Let's pause the description here and jump straight into a little Python code to examine the connection process.
4.2.1 Python Code parsing
-
In TCP, the server listens on a port through a dedicated listening socket. When a client connects, it sends a request to that listening port and the server responds; this exchange is the three-way handshake. After the handshake, TCP creates a separate connection socket for the client for data transfer, and maintains the corresponding send and receive buffers, variables, and socket for the life of the connection.
Socket identity: UDP uses a 2-tuple (destination IP, destination port); TCP uses a 4-tuple (source IP, source port, destination IP, destination port).
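The 4-tuple is what lets one TCP server port serve many clients at once. A tiny sketch (the addresses are made up; dictionaries stand in for the kernel's socket tables):

```python
# Why the 4-tuple matters: TCP can tell two clients apart even when both
# connect to the same destination IP and port, while a UDP socket
# (keyed by destination only) funnels everything into one socket.
udp_key = lambda seg: (seg["dst_ip"], seg["dst_port"])
tcp_key = lambda seg: (seg["src_ip"], seg["src_port"],
                       seg["dst_ip"], seg["dst_port"])

a = {"src_ip": "1.1.1.1", "src_port": 50001,
     "dst_ip": "10.0.0.1", "dst_port": 12000}
b = {"src_ip": "2.2.2.2", "src_port": 50002,
     "dst_ip": "10.0.0.1", "dst_port": 12000}

assert udp_key(a) == udp_key(b)      # same UDP socket for both senders
assert tcp_key(a) != tcp_key(b)      # distinct TCP connection sockets
```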
-
Socket of the TCP process
- tcp_server.py

```python
from socket import *

# Define the port
server_port = 12000
# SOCK_STREAM selects a TCP socket
server_socket = socket(AF_INET, SOCK_STREAM)
# Allow the port to be reused immediately after release
# (otherwise the system holds it unavailable for about 2 minutes)
server_socket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
# Bind the port, producing the listening socket
server_socket.bind(("", server_port))
# Allow a backlog of 1 pending connection
server_socket.listen(1)
print("The server is ready to receive.")
# accept() creates a connection socket for the client
conn_socket, addr = server_socket.accept()
while True:
    # Read from the receive buffer
    sentence = conn_socket.recv(1024)
    m_sentence = sentence.upper()
    print(m_sentence)
    # Send the modified message back
    conn_socket.send(m_sentence)
    # Close the connection when the exit message arrives
    if m_sentence == b'EXIT':
        conn_socket.close()
        break
server_socket.close()
```
- tcp_client.py

```python
from socket import *

server_name = "127.0.0.1"
server_port = 12000
client_socket = socket(AF_INET, SOCK_STREAM)
# Connect to the server (this triggers the three-way handshake)
client_socket.connect((server_name, server_port))
while True:
    sentence = input(">")
    # Write data into the send buffer
    client_socket.send(sentence.encode())
    # Read the reply from the receive buffer
    modified_sentence = client_socket.recv(1024)
    print(f"From server: {modified_sentence.decode()}")
    # Disconnect from the server
    if sentence == "exit":
        break
client_socket.close()
```
- TCP process simple transport diagram
4.2.2 TCP Packet Structure
So let's take a look at what's in a TCP segment and how a segment comes about.
- When an upper-layer application hands TCP a large chunk of data, the data is cut into segments and placed in the TCP send buffer. How big is each segment? That is the Maximum Segment Size (MSS), which is ultimately limited by the largest frame the link layer can carry, the Maximum Transmission Unit (MTU). On Ethernet the MTU is 1500 bytes, so the segment payload is typically 1460 bytes, the remaining 40 bytes being the TCP/IP headers.
As shown in the following figure, the TCP packet segment structure contains many terms. Below, only the parts that need attention are described;
-
The source port and destination port need no explanation.
-
Sequence and acknowledgment numbers are used for reliable transmission; for now, think of them simply as establishing ordering — more on that later.
-
The receive window is used for flow control: this field tells the sender how many bytes the receiver is currently willing to accept.
-
The flag bits indicate a segment's functions:
- URG (urgent): when set to 1, the data indicated by the urgent pointer is delivered with priority, bypassing the normal queueing in the TCP buffer;
- SYN (synchronize): initiates a connection request;
- RST (reset): closes a connection abnormally, i.e. without the four-way teardown;
- PSH (push): asks the receiver to pass the data to the upper layer as quickly as possible — interactive terminals, for example, generate segments with this flag;
- ACK (acknowledge): indicates that the acknowledgment-number field is valid;
- FIN (finish): terminates the connection, initiating the four-way teardown;
- ECE (ECN echo): related to congestion control — set on a segment when a router along the path (with ECN support) has marked it;
- CWR (congestion window reduced): the sender signals that it has shrunk its send window, lowering its sending rate; more under congestion control later.
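The flags above live in a single byte of the TCP header, one bit each. A small decoder makes the layout concrete (bit order follows RFC 793, with ECE/CWR added by RFC 3168):

```python
# Decode the TCP flags byte. Bit positions, low bit first:
# FIN, SYN, RST, PSH, ACK, URG, ECE, CWR.
FLAG_BITS = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"]

def decode_flags(byte):
    return [name for i, name in enumerate(FLAG_BITS) if byte & (1 << i)]

assert decode_flags(0x02) == ["SYN"]          # first handshake segment
assert decode_flags(0x12) == ["SYN", "ACK"]   # the server's SYN-ACK
assert decode_flags(0x11) == ["FIN", "ACK"]   # a teardown segment
```

This is the same information a packet-capture tool like Wireshark shows in its "Flags" column.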
-
Below is a screenshot of the first segment of a four-way teardown; you can read the segment's purpose straight from its flags:
4.2.3 TCP Connection Management
Let's talk about the familiar yet somehow unfamiliar three-way handshake and four-way teardown.
4.2.3.1 The three-way handshake
-
Why a three-way handshake instead of two?
It is actually easy to understand: both sides must confirm the communication protocol and some transmission parameters. Request connection [client] -> allow connection [server] -> confirm connection [client].
-
Start by looking at the three-way handshake diagram (you've probably seen it many times — maybe this pass will be more fruitful).
-
Let me break it down as usual
Step 1: The client sends a special segment to the server with the SYN flag set to 1 and a randomly chosen initial sequence number, client_isn.
Step 2: On receiving the SYN segment, the server allocates buffers and variables for the connection and replies with a SYNACK segment: its acknowledgment number is client_isn + 1, and it carries the server's own initial sequence number, server_isn.
Step 3: On receiving the SYNACK, the client allocates its own buffers and variables and sends a final acknowledgment with acknowledgment number server_isn + 1. The SYN flag in this segment is 0, because the connection is now established; this ends the three-way handshake.
-
Decomposition alone doesn't make the process much clearer than usual, so let's capture packets and look at the real thing. (Client: 55816, server: 12000.)
-
In the first segment (client->server), the flags show SYN, ECN, CWR: SYN is set to 1 and ACK is 0, since there is nothing to acknowledge yet.
-
In the second segment (server->client), SYN, ECN, ACK appear in flags.
-
In the third segment (client->server), only ACK is shown, indicating that the connection is complete. Notice, though, that the window value the client sends has changed to 12759 — compare it with the first two segments. The option field of the handshake segments carries a Window Scale value, and the negotiated scale changes the effective send window. At this point the three-way handshake ends.
-
The fourth segment (server->client) is a TCP Window Update, in which the server explicitly confirms the window to the client. Absent congestion, data will flow at this window size from now on. (This screenshot is only for understanding — the window update is not part of the three-way handshake.)
-
-
Note: during packet capture on the loopback interface (e.g. lo0), the default MTU is 16384 bytes, so the MSS value shown is 16344 bytes. Why the 40-byte difference? If you're not sure, go back to the segment structure above.
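The sequence-number arithmetic of the handshake can be written out as a tiny sketch. This is a hypothetical walk-through, not real networking code — dictionaries stand in for segments:

```python
# Sequence/acknowledgment numbers across the three-way handshake.
import random

client_isn = random.randrange(2**32)   # client's random initial seq number
server_isn = random.randrange(2**32)   # server's random initial seq number

# Step 1: client -> server, SYN, seq = client_isn
syn = {"flags": {"SYN"}, "seq": client_isn}
# Step 2: server -> client, SYN+ACK, seq = server_isn, ack = client_isn + 1
synack = {"flags": {"SYN", "ACK"}, "seq": server_isn, "ack": syn["seq"] + 1}
# Step 3: client -> server, ACK, seq = client_isn + 1, ack = server_isn + 1
ack = {"flags": {"ACK"}, "seq": synack["ack"], "ack": synack["seq"] + 1}

assert synack["ack"] == client_isn + 1
assert ack["ack"] == server_isn + 1
assert "SYN" not in ack["flags"]   # SYN is 0 once the connection exists
```

(Real TCP sequence numbers wrap modulo 2^32; the sketch ignores that detail.)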
4.2.3.2 The four-way teardown
-
Why four steps to break up?
This is also easy to understand: ending data transfer means telling the other side to start reclaiming resources. As noted above, the connection-establishment phase allocates buffers, variables, and socket resources for the connection. The same logic applies on Linux: when a program receives an exit signal (kill -15), it is expected to run its shutdown procedure.
-
As usual, look at the four-way teardown diagram. Note the wait in TIME_WAIT before the connection finally closes — many articles online gloss over it, yet the length of this wait directly affects a machine's achievable concurrency.
Step 1: The client sends a segment with the FIN flag and enters FIN_WAIT_1, meaning "I want to close the connection."
Step 2: On receiving the FIN segment, the server enters CLOSE_WAIT and replies with an ACK agreeing to close. The client enters FIN_WAIT_2 as soon as it receives that ACK.
Step 3: The server sends its own FIN segment to the client and enters LAST_ACK, reclaiming resources along the way.
Step 4: On receiving the server's FIN, the client immediately replies with the final ACK and enters TIME_WAIT. Once the server receives this acknowledgment, its connection state becomes CLOSED. The client then waits twice the maximum segment lifetime (2MSL), which lets TCP resend the final ACK in case it was lost; on Linux this wait typically defaults to one or two minutes. After the wait, the client moves to CLOSED and reclaims its resources.
In practice, the first FIN is not necessarily sent by the client: whichever side sends the first FIN enters FIN_WAIT_1, and the other side enters CLOSE_WAIT.
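The active closer's side of these four steps can be modeled as a small state machine (states named as `netstat`/`ss` report them). A toy sketch:

```python
# State transitions of the side that initiates the close.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"):   "FIN_WAIT_1",
    ("FIN_WAIT_1",  "recv ACK"):   "FIN_WAIT_2",
    ("FIN_WAIT_2",  "recv FIN"):   "TIME_WAIT",
    ("TIME_WAIT",   "2MSL timer"): "CLOSED",
}

def run(events):
    state = "ESTABLISHED"
    for ev in events:
        state = TRANSITIONS[(state, ev)]
    return state

assert run(["send FIN", "recv ACK"]) == "FIN_WAIT_2"
assert run(["send FIN", "recv ACK", "recv FIN", "2MSL timer"]) == "CLOSED"
```

Note how CLOSED is reachable only through the 2MSL timer — that wait is exactly the TIME_WAIT accumulation discussed next.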
4.2.3.3 Talking about connection Status
TCP has many connection-management states, but the ones we usually inspect are TIME_WAIT and CLOSE_WAIT from the four-way teardown.
-
CLOSE_WAIT
This is the state of the side that has acknowledged the peer's close request. When we see a large number of CLOSE_WAIT connections on a server, it is usually because clients requested a close and exited immediately, while the server never proceeded with its own close. On Linux, keepalive probing starts after 7200 seconds of idleness by default, and the connection is then terminated after roughly another 11 minutes (9 probes at 75-second intervals). These system parameters can be tuned:
```shell
sysctl -a | grep keepalive
# net.ipv4.tcp_keepalive_intvl = 75
# net.ipv4.tcp_keepalive_probes = 9
# net.ipv4.tcp_keepalive_time = 7200

# Idle time before probing starts
echo 120 > /proc/sys/net/ipv4/tcp_keepalive_time
# Interval between probes
echo 2 > /proc/sys/net/ipv4/tcp_keepalive_intvl
# Number of probes
echo 1 > /proc/sys/net/ipv4/tcp_keepalive_probes
```
-
TIME_WAIT
This is the state in which the active closer waits out the 2MSL interval before cleaning up the connection. In high-concurrency scenarios that open and close many short-lived connections, large numbers of TIME_WAIT sockets accumulate. The following parameters can help:
```shell
# Open the configuration
vi /etc/sysctl.conf

# Allow TIME_WAIT sockets to be reused for new TCP connections
# (default 0, disabled)
net.ipv4.tcp_tw_reuse = 1
# Enable fast recycling of TIME_WAIT sockets (default 0, disabled).
# Caution: this option is unsafe behind NAT and was removed in Linux 4.12.
net.ipv4.tcp_tw_recycle = 1
# Shorten the orphaned FIN-WAIT-2 timeout
net.ipv4.tcp_fin_timeout = 60

# Apply the configuration
sysctl -p
```
4.3 Reliable Transmission
4.3.1 Some concepts and principles
How to establish a reliable transmission channel over an unreliable one? Let’s start with some concepts.
- RDT (Reliable Data Transfer protocol): the family of protocols for building reliable transmission; it defines how a reliable transfer channel is constructed.
- FSM (finite state machine): a concept defining how sender and receiver move between states and what actions events trigger — e.g. when to send an ACK, when to retransmit, when to deliver data upward.
- SEQ (sequence number) and ACK (acknowledgment): the ordered-acknowledgment mechanism everyone is familiar with.
- Checksum: verifies data integrity.
- Timer and Automatic Repeat reQuest (ARQ): the sender starts a countdown timer for each packet; if no ACK arrives before it expires, a timeout-retransmission event fires and the packet is resent.
- Duplicate packets: if network delay keeps an ACK from reaching the sender within the timeout, the packet is retransmitted and the receiver ends up with a redundant copy; the protocol detects duplicates by sequence number.
- Pipelining: lets the sender transmit multiple packets (limited by the window length) without waiting for an ACK before sending the next. (The alternative, stop-and-wait, sends one packet, waits for its ACK, then sends the next; TCP does not use it because it hurts performance.)
Pipelined protocols share these characteristics:
- Sequence numbers must increase and be unique;
- Both sender and receiver must buffer multiple packets: the sender buffers packets sent but not yet acknowledged; the receiver buffers packets correctly received but not yet delivered;
- Error recovery (handling lost, corrupted, and delayed packets) is done with one of two sliding window protocols. A window protocol sends every packet inside the window and waits for acknowledgments; once the earliest packets are confirmed, the window moves forward — hence "sliding window".
- GBN (Go-Back-N), where N is the window size: the sender transmits multiple packets at once. Suppose packets 1-5 are sent and ACKs arrive for all but packet 3; the sender waits until packet 3 times out, then retransmits packet 3 and everything after it, discarding any later packets that arrived in the meantime. This is called cumulative acknowledgment. Its drawback is a lot of wasted retransmission on high-latency networks.
-
GBN maintains only a send window; the receiver needs no window. (Image from the Internet.)
-
- SR (Selective Repeat) lets the sender retransmit only the packets it suspects were lost or corrupted at the receiver, avoiding unnecessary retransmission (on-demand retransmission). It uses windows to limit the outstanding unacknowledged packets in the pipeline and buffers packets even when they arrive out of order, acknowledging each one individually. The drawback is that sender and receiver each maintain their own window; once an ACK is lost, the two windows can slide out of step.
-
SR maintains a pair of windows; as the figure below shows, the protocol can stall when the windows become inconsistent. (Image from the Internet.)
- It's a bit dizzying to have so many key concepts appear at once. Note that these are RDT concepts; TCP neither copies them all wholesale nor stops there. TCP is one implementation of a reliable transport protocol, and it adds some new concepts of its own.
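The GBN-vs-SR difference in retransmission cost can be shown with a toy simulation. This is an illustration under simplified assumptions (single loss, unlimited window), not a real protocol implementation:

```python
# Retransmission cost when some packets in a batch are lost.
def gbn_retransmissions(n_packets, lost):
    """Go-Back-N: everything from the first lost packet onward is
    resent, because out-of-order arrivals are discarded
    (cumulative acknowledgment)."""
    first = min(lost)
    return list(range(first, n_packets))

def sr_retransmissions(n_packets, lost):
    """Selective Repeat: only the lost packets are resent;
    out-of-order arrivals are buffered and ACKed individually."""
    return sorted(lost)

# Packets 0..4 sent, packet 2 lost:
assert gbn_retransmissions(5, {2}) == [2, 3, 4]   # 3 packets resent
assert sr_retransmissions(5, {2}) == [2]          # 1 packet resent
```

The gap widens with the window size and the RTT, which is exactly why GBN "consumes a lot of useless retransmission in delayed networks".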
4.3.2 TCP Reliable Transmission
In 4.3.1 we introduced some of the concepts, most of which are referenced in TCP, which are explained as clearly and in detail as possible:
-
SEQ and ACK are used in packet transmission to construct ordered byte streams and checksum to ensure data integrity. (This should be easier to understand);
-
When packets are lost, TCP relies on the timeout-and-retransmit mechanism, using the timer and ARQ ideas just mentioned. It samples the round-trip time (RTT) of segments at intervals to estimate a timeout value, which must exceed the RTT. A segment is retransmitted after the timeout, and each successive retransmission of the same segment doubles the previous timeout: if the first timeout is 0.75 seconds, the next is 1.5 seconds. Besides the retransmission timeout (RTO), TCP adds fast retransmit.
- When the sender sees three duplicate ACKs, it retransmits immediately without waiting for the timeout. Why three? Because the receiver decides when to send ACKs, a network with delay or reordering naturally produces some duplicate ACKs; three is the threshold at which loss becomes likely. The generation of these ACKs follows the FSM mentioned above — different events in a given state trigger different actions. (For example, when a higher-numbered segment arrives before the expected one, the receiver immediately re-ACKs the segment it still wants.)
-
As mentioned above, TCP is pipelined, so it needs a window to send multiple segments in a batch. Of the GBN and SR approaches, TCP borrows from both without fully adopting either. Within one window it sends multiple segments at once; the receiver buffers every correctly received (even out-of-order) segment and, with SACK (Selective Acknowledgment), acknowledges them selectively rather than purely cumulatively. Confirmed segments are then skipped and only unconfirmed ones retransmitted. On balance, the mechanism leans toward SR's selective retransmission, which also means maintaining a pair of send and receive windows.
What we now know is that TCP's sliding window provides both reliability (every segment is acknowledged, with retransmission as the guarantee) and flow control (the window size throttles the sender). Flow control is the subject of the next section.
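The timeout estimation mentioned above is standardized in RFC 6298: the sender keeps a smoothed RTT (SRTT) and an RTT variance (RTTVAR) and derives RTO = SRTT + 4·RTTVAR. A sketch with made-up RTT samples:

```python
# RTO estimation per RFC 6298, plus exponential backoff on retransmit.
ALPHA, BETA = 1 / 8, 1 / 4

def update_rto(srtt, rttvar, sample):
    if srtt is None:                    # first RTT measurement
        srtt, rttvar = sample, sample / 2
    else:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - sample)
        srtt = (1 - ALPHA) * srtt + ALPHA * sample
    return srtt, rttvar, srtt + 4 * rttvar

srtt = rttvar = None
for rtt in [0.10, 0.12, 0.08]:          # hypothetical samples, in seconds
    srtt, rttvar, rto = update_rto(srtt, rttvar, rtt)
assert rto > srtt                       # RTO always exceeds the smoothed RTT

# Each retransmission of the same segment doubles the previous timeout:
backoffs = [0.75 * 2**i for i in range(3)]
assert backoffs == [0.75, 1.5, 3.0]
```

The variance term is what keeps the RTO safely above the RTT even when the network jitters.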
Q: How large can the window be? Before peeking at the answer, recall that window is a 16-bit field in the segment structure.
. . .
The standard TCP window is at most 2^16 - 1 = 65535 bytes.
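The Window Scale option seen earlier in the handshake capture exists precisely to lift this 65535-byte ceiling: RFC 7323 negotiates a shift count (at most 14) in the SYN segments, and the advertised value is multiplied accordingly. A quick calculation:

```python
# Effective receive window with the RFC 7323 window-scale option.
MAX_RAW_WINDOW = 2**16 - 1

def effective_window(raw_window, scale_shift):
    assert 0 <= scale_shift <= 14        # RFC 7323 caps the shift at 14
    return raw_window << scale_shift

assert MAX_RAW_WINDOW == 65535
assert effective_window(65535, 0) == 65535          # no scaling
assert effective_window(65535, 14) == 1073725440    # just under 1 GiB
```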
4.4 Flow Control
In TCP, the sliding window size is determined by rwnd (the receive window) and cwnd (the congestion window): whichever is smaller becomes the effective window. This section covers rwnd; cwnd waits for congestion control.
To eliminate the possibility of the sender overflowing the receiver's buffer, TCP matches the two sides' speeds: the sending rate tracks the rate at which the receiving application reads. Each end of the connection therefore maintains an rwnd; the receiver reports its value to the sender in the window field of every ACK, and the value changes dynamically with how fast the receiver drains its buffer.
Let’s look at the relationship between the receiver cache and the receive window;
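That relationship can be written down as simple bookkeeping. A toy model (the 4096-byte buffer and the byte counts are invented for illustration):

```python
# Receive-window bookkeeping: the receiver advertises the free space in
# its buffer; the sender keeps unacknowledged data below that value.
BUFFER = 4096

def advertised_rwnd(last_byte_received, last_byte_read):
    return BUFFER - (last_byte_received - last_byte_read)

# Application reads slowly: 3000 bytes arrived, only 1000 consumed.
rwnd = advertised_rwnd(3000, 1000)
assert rwnd == 2096                    # the window shrinks

def sender_can_send(last_byte_sent, last_byte_acked, rwnd):
    return (last_byte_sent - last_byte_acked) < rwnd

assert sender_can_send(1000, 0, rwnd) is True

# Once the application catches up, the full window is advertised again.
assert advertised_rwnd(3000, 3000) == 4096
```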
4.5 Congestion control
TCP congestion control is a topic you might never touch in daily work unless you do transport-layer development, but it is the soul of TCP. Imagine a transport channel with no congestion control, where every sender moves segments at its highest rate: network devices are quickly pushed past their throughput limits and drop packets; after the loss, senders retransmit at full window size, and the repeated flood finally brings the network down. Congestion control aims to run network devices as close to full load as possible (squeezing out their capacity) without tipping them into overload.
- As shown below (picture from network)
There are four algorithms for TCP congestion control. Slow start, congestion avoidance, fast retransmission, fast recovery. Fast retransmission has already been mentioned above so I won’t repeat it below.
4.5.1 Slow Start
A key concept is the congestion window (cwnd), a variable maintained by the sender whose value changes dynamically with network congestion and governs how fast the sender may transmit. (As noted above, the congestion window usually ends up being the effective window value.)
Suppose the sender's cwnd starts at 1 MSS. When the first segment is acknowledged, cwnd grows by 1 MSS (to 2 MSS), so two segments can be sent next; when both are acknowledged, cwnd grows by 2 MSS more. Growth continues doubling per round trip until cwnd reaches 16 in this example — the slow-start threshold, ssthresh, which is actually a dynamic value we will revisit under congestion avoidance.
- Start slowly, as shown in the following figure
4.5.2 Congestion Avoidance
After slow start comes congestion avoidance, where cwnd grows linearly: +1 MSS per round trip. For example, after 16 segments are sent and all 16 acknowledged, cwnd becomes 17 MSS.
Now suppose cwnd is 24 MSS and the sender receives three duplicate ACKs during the next transmission. As discussed, TCP fast-retransmits on three duplicate ACKs; ssthresh is set to half of cwnd (12 MSS), and cwnd is then reset — and the connection enters fast recovery or slow start, depending on the algorithm, as the next section shows.
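The slow-start and congestion-avoidance phases can be traced with a toy function. This is a simplified sketch — cwnd is counted in whole MSS units and one "ack" event stands for a full round trip of acknowledgments:

```python
# Toy congestion-window evolution: exponential growth below ssthresh
# (slow start), linear growth above it (congestion avoidance), and a
# Reno-style cut to half on three duplicate ACKs.
def next_cwnd(cwnd, ssthresh, event):
    if event == "ack":                   # one round trip of ACKs
        return (cwnd * 2, ssthresh) if cwnd < ssthresh else (cwnd + 1, ssthresh)
    if event == "3 dup acks":
        return (cwnd // 2, cwnd // 2)    # ssthresh = cwnd / 2, enter fast recovery
    if event == "timeout":
        return (1, cwnd // 2)            # back to slow start

cwnd, ssthresh = 1, 16
trace = [cwnd]
for _ in range(5):
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, "ack")
    trace.append(cwnd)
assert trace == [1, 2, 4, 8, 16, 17]     # doubling, then +1 past ssthresh

# Loss at cwnd = 24: halve, and remember the new threshold.
assert next_cwnd(24, 16, "3 dup acks") == (12, 12)
```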
4.5.3 Fast Recovery
Fast recovery is recommended rather than required by TCP, but it is built into modern implementations and still evolving. Let's look at the earliest congestion-control algorithms, Tahoe and Reno: the fast-recovery mechanism was introduced in Reno and carried into later algorithms.
Reference: Wikipedia
Both algorithms use RTO (retransmission timeout) and duplicate ACKs as signals of packet loss, but they react differently.
-
When three duplicate ACKs are seen:
- Tahoe: on the third duplicate ACK (i.e. the same ACK received for the fourth time), Tahoe fast-retransmits, sets the slow-start threshold to half the current congestion window, reduces the congestion window to 1 MSS, and re-enters slow start.
- Reno: on three duplicate ACKs, Reno fast-retransmits and only halves the congestion window, skipping slow start. The slow-start threshold is set to the new congestion-window value, and the connection enters fast recovery.
-
In the RTO (timeout retransmission) case:
- Both algorithms reduce the congestion window to 1 MSS and then enter the slow start phase.
Here’s a look at the illustration (from the web) :
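The Tahoe/Reno difference fits in a few lines. A toy sketch (cwnd in MSS units; real implementations add details like inflating cwnd during fast recovery):

```python
# How Tahoe and Reno react to the two loss signals.
def react(algorithm, cwnd, signal):
    ssthresh = max(cwnd // 2, 1)
    if signal == "timeout":                 # both: back to slow start
        return {"cwnd": 1, "ssthresh": ssthresh, "phase": "slow start"}
    if signal == "3 dup acks":
        if algorithm == "tahoe":            # Tahoe treats it like a timeout
            return {"cwnd": 1, "ssthresh": ssthresh, "phase": "slow start"}
        return {"cwnd": ssthresh, "ssthresh": ssthresh,
                "phase": "fast recovery"}   # Reno: halve and keep going

assert react("tahoe", 24, "3 dup acks")["cwnd"] == 1
assert react("reno", 24, "3 dup acks")["cwnd"] == 12
assert react("reno", 24, "timeout")["phase"] == "slow start"
```

Reno's gentler reaction to duplicate ACKs is the whole point of fast recovery: duplicate ACKs prove segments are still getting through, so a full restart is unnecessary.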
4.5.4 Congestion Notification (Extended)
ECN (Explicit Congestion Notification) allows end-to-end congestion notification without dropping packets. It is an optional feature, used only when both endpoints support it and the underlying network infrastructure cooperates. When ECN is successfully negotiated, an ECN-aware router can set a mark in the IP header instead of dropping the packet to signal impending congestion. The receiver echoes the mark back to the sender, which slows its transmission rate just as if a packet loss had been detected.
4.6 TCP Fairness
My personal understanding is that TCP's so-called fairness falls out of congestion control, since any packet loss sharply reduces a sender's rate. Because the congestion window follows "additive increase, multiplicative decrease" (AIMD), multiple TCP senders sharing a link repeatedly trigger congestion control through packet loss and eventually converge to a stable, roughly equal rate.
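The convergence claim can be checked with a toy AIMD simulation. All numbers here are arbitrary (a link of capacity 100, +1 per round, halving on loss) — the point is only the direction of the dynamics:

```python
# Two AIMD flows sharing one link: additive increase, multiplicative
# decrease pulls their rates toward an even split regardless of start.
def aimd(rate_a, rate_b, capacity, rounds):
    for _ in range(rounds):
        rate_a, rate_b = rate_a + 1, rate_b + 1      # additive increase
        if rate_a + rate_b > capacity:               # shared loss event
            rate_a, rate_b = rate_a / 2, rate_b / 2  # multiplicative decrease
    return rate_a, rate_b

a, b = aimd(90.0, 10.0, 100, rounds=200)
assert abs(a - b) < 5          # a wildly unfair start converges
assert a + b <= 100
```

Halving preserves each flow's share of the excess while additive increase hands out capacity equally, so the gap between the flows shrinks by half at every loss event.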
4.7 Some Calculations
-
Formula for estimating TCP throughput
- Formula: throughput (bits per second) = TCP window size (bits) x 0.75 / latency (seconds)
- Example: the common default window size on Windows systems is 64 KB, and the network latency from China to the US is about 150 ms.
- 64 KB = 65536 bytes
- 65536 x 8 = 524288 bits
- Throughput = 0.75 x 524288 / 0.15 = 2,621,440 bit/s = 327,680 bytes/s ≈ 0.31 MB/s
So even on a 10 Mbit line, a single TCP connection tops out at about 0.31 MB/s.
-
A formula for the optimal TCP window size
- Formula: bandwidth (bits per second) x round-trip delay (seconds) = TCP window size (bits); divide by 8 for bytes
- For the example of 10 Mbit/s bandwidth and 150 ms delay:
- 10 x 1024 x 1024 bit/s x 0.15 s = 1,572,864 bits / 8 = 196,608 bytes ≈ 192 KB
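The two calculations above, written as functions with explicit units (the 0.75 efficiency factor and the example numbers are taken from this section):

```python
# Throughput from window size, and window size from the
# bandwidth-delay product. Units are explicit to avoid
# the usual bits-vs-bytes confusion.
def tcp_throughput_bytes_per_s(window_bytes, rtt_s, efficiency=0.75):
    return window_bytes * efficiency / rtt_s

def optimal_window_bytes(bandwidth_bits_per_s, rtt_s):
    return bandwidth_bits_per_s * rtt_s / 8      # bandwidth-delay product

# 64 KB window, 150 ms RTT:
tput = tcp_throughput_bytes_per_s(65536, 0.15)
assert round(tput) == 327680                     # ~0.31 MB/s

# 10 Mbit/s link, 150 ms RTT:
bdp = optimal_window_bytes(10 * 1024 * 1024, 0.15)
assert bdp == 196608.0                           # ~192 KB
```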
5. Write at the end
If you've read this far — this article took real effort, so please give it a like. I'll keep updating as I fill in more knowledge points.
5.1 Reference Materials
- Computer Networking: A Top-Down Approach (7th edition)
- Wikipedia