preface

TCP is a must-ask topic in interviews. We have compiled 15 classic TCP interview questions; I hope they help you land your ideal offer.

  • WeChat official account: a boy picking up snails
  • GitHub repo — thanks for every star

1. Describe the TCP three-way handshake process

Both the client and server start in the CLOSED state. The server then listens on a port and enters the LISTEN state.

  • First handshake: the client sends a segment with SYN=1, seq=x and enters the SYN_SENT state.
  • Second handshake: the server replies with SYN=1, ACK=1, seq=y, ack=x+1 and enters the SYN_RCVD state.
  • Third handshake: the client sends ACK=1, ack=y+1 and enters the ESTABLISHED state. When the server receives this segment, it also enters the ESTABLISHED state. Data transfer can now begin.
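As a rough illustration (a toy lookup table, not part of any real TCP stack — the event names are invented labels), the state transitions above can be sketched as:

```python
# Minimal sketch of the three-way-handshake state transitions.
TRANSITIONS = {
    # (current state, event) -> next state
    ("CLOSED", "passive_open"): "LISTEN",                 # server starts listening
    ("CLOSED", "send_syn"): "SYN_SENT",                   # client: 1st handshake (SYN, seq=x)
    ("LISTEN", "recv_syn_send_synack"): "SYN_RCVD",       # server: 2nd handshake (SYN+ACK)
    ("SYN_SENT", "recv_synack_send_ack"): "ESTABLISHED",  # client: 3rd handshake (ACK)
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",              # server receives the final ACK
}

def step(state: str, event: str) -> str:
    """Advance one endpoint's state by one handshake event."""
    return TRANSITIONS[(state, event)]

# Walk the client side through the handshake:
client = "CLOSED"
client = step(client, "send_syn")              # SYN_SENT
client = step(client, "recv_synack_send_ack")  # ESTABLISHED
```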

2. Why does the TCP handshake take three steps instead of two? Why not four?

Why three handshakes? Think of two people getting together: what matters is that each knows the other's feelings — I love you, and I know that you love me. Let's simulate the handshake with that analogy.

Why can’t you shake hands twice?

With only two handshakes, the girl cannot know whether the boy ever received her "I love you too", and the relationship starts on shaky ground. In TCP terms: without the third ACK, the server cannot confirm that the client received its SYN+ACK, and a stale, delayed SYN could trick the server into opening a useless connection.

Why can’t you shake hands four times?

Because three is enough: three exchanges already let both sides confirm "you love me, and I love you" — that each side can both send and receive. A fourth adds nothing.

3. Describe the TCP four-wave process

  1. First wave: the client sends FIN=1, seq=u and enters the FIN_WAIT_1 state.
  2. Second wave: the server replies with ACK=1, seq=v, ack=u+1 and enters the CLOSE_WAIT state; on receiving this acknowledgement, the client enters the FIN_WAIT_2 state.
  3. Third wave: the server sends FIN=1, ACK=1, seq=w, ack=u+1, enters the LAST_ACK state, and waits for the final ACK from the client.
  4. Fourth wave: the client receives the server's close request, replies with ACK=1, seq=u+1, ack=w+1, and enters the TIME_WAIT state, where it waits a fixed time (2MSL, twice the Maximum Segment Lifetime) before moving to CLOSED. On receiving this acknowledgement, the server closes the connection and enters the CLOSED state.

4. Why does TCP wave four times?

Take an example!

Xiao Ming and Xiao Hong are on the phone. Near the end, Xiao Hong says, "I have nothing more to say." Xiao Ming replies, "I know." But Xiao Ming may still have things to say, and Xiao Hong cannot force him to end the call on her schedule. So Xiao Ming keeps talking for a while; finally he says, "I'm finished," and Xiao Hong replies, "I know." Only then is the call over.

5. Why wait 2MSL in the TIME_WAIT state?

2MSL means twice the Maximum Segment Lifetime:

  • One MSL ensures that the final ACK sent by the actively closing side in the four waves can reach the peer.
  • The other MSL ensures that, if that ACK is lost, the peer's retransmitted FIN can still arrive.

6. Differences between TCP and UDP

  1. TCP is connection-oriented (like making a phone call: you dial to establish a connection first); UDP is connectionless — no connection is needed before sending data.
  2. TCP provides reliable delivery: data sent over a TCP connection arrives without loss, duplication, or corruption, and in order. UDP is best-effort only; reliable delivery is not guaranteed.
  3. A TCP connection is point-to-point; UDP supports one-to-one, one-to-many, and many-to-many communication.
  4. TCP carries more overhead and has lower transmission efficiency; UDP is lightweight and efficient, making it suitable for high-speed, real-time, or broadcast communication.
  5. TCP suits web pages, email, etc.; UDP suits video, voice broadcast, etc.
  6. TCP is byte-stream oriented; UDP is datagram (packet) oriented.

7. Describe the fields in the TCP packet header and their functions

  • Two 16-bit port numbers: the source port identifies the sending application on the source host; the destination port tells the receiving host which upper-layer protocol or application the segment is for.
  • 32-bit sequence number: numbers each byte of the byte stream in one transmission direction over the life of the connection (from establishment to teardown).
  • 32-bit acknowledgement number: acknowledges the peer's segment; its value is the sequence number of the next byte expected (the last received sequence number plus one).
  • 4-bit header length: the number of 32-bit words (4 bytes each) in the TCP header. Four bits can express at most 15, so a TCP header is at most 60 bytes long.
  • 6 flag bits: URG (urgent pointer valid), ACK (acknowledgement number valid), PSH (push the data to the application promptly), RST (reset / re-establish the connection), SYN (connection setup), FIN (tell the peer this end is closing).
  • 16-bit window size: TCP's flow-control mechanism. The window here is the advertised receive window: it tells the peer how many more bytes the local TCP receive buffer can hold, so the peer can pace its sending.
  • 16-bit checksum: filled in by the sender; the receiver recomputes it to check whether the segment was corrupted in transit. Note that the checksum (a ones'-complement sum, not a CRC) covers both the TCP header and the data — an important part of TCP's reliability.
  • 16-bit urgent pointer: a positive offset. Added to the sequence number, it gives the sequence number of the byte following the last byte of urgent data; strictly, it is an offset from the current sequence number — an urgent offset. The urgent pointer is how a sender passes urgent data to the receiver.
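The field layout above (RFC 793's fixed 20-byte header) can be parsed with Python's `struct` module. The example segment below is hand-built for illustration:

```python
import struct

def parse_tcp_header(data: bytes) -> dict:
    """Parse the fixed 20-byte portion of a TCP header (RFC 793 layout)."""
    src, dst, seq, ack, off_flags, win, cksum, urg = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "data_offset": (off_flags >> 12) * 4,  # header length: 32-bit words -> bytes
        "flags": off_flags & 0x3F,             # URG/ACK/PSH/RST/SYN/FIN bits
        "window": win,
        "checksum": cksum,
        "urgent_ptr": urg,
    }

# Example: a hand-built SYN segment from port 12345 to port 80
hdr = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
info = parse_tcp_header(hdr)
```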

8. How does TCP ensure reliability

  • First, a TCP connection is established with a three-way handshake and torn down with a four-way wave, ensuring reliable setup and teardown.
  • Second, TCP is stateful: it keeps track of what has been sent, what has been received, and what has not, ensuring that data arrives in order and without error.
  • Third, TCP is controlled: it has mechanisms such as checksums, ACK acknowledgement, timeout retransmission (sender side), out-of-order retransmission handling (receiver side), discarding of duplicate data, flow control (sliding window), and congestion control.

9. TCP retransmission mechanism

Timeout retransmission

TCP implements retransmission to guarantee reliable transfer. The most basic mechanism is timeout retransmission: when a segment is sent, a timer starts; if no ACK arrives within a certain interval, the segment is retransmitted.

So what should that interval be? First, let's look at what RTT is.

RTT is the time from when a packet is sent to when its acknowledgement returns — the round-trip time. The retransmission timer's interval is called the Retransmission Timeout (RTO).

How long is the RTO set?

  • If the RTO is too small, data that was not actually lost gets retransmitted anyway, which adds load, worsens congestion, and causes even more timeouts.
  • If the RTO is too large, lost packets are retransmitted far too late ("the flowers have long since faded"), hurting throughput.

In general, the RTO should be slightly larger than the RTT. Some of you may ask: is there a formula for it? There is! The standard way to compute RTO is the Jacobson/Karels algorithm. Let's look at the formulas.

1. Calculate SRTT first (calculate smoothed RTT)

SRTT = (1 − α) × SRTT + α × RTT    // weighted moving average of the RTT

2. Calculate RTTVAR (round-trip time Variation)

RTTVAR = (1 − β) × RTTVAR + β × |RTT − SRTT|    // deviation between SRTT and the measured RTT

3. Final RTO

RTO = μ × SRTT + ∂ × RTTVAR = SRTT + 4 × RTTVAR

where α = 0.125, β = 0.25, μ = 1, and ∂ = 4 are parameters found empirically to work best.
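The three formulas can be combined into a small estimator. This is a sketch using the α, β, μ, ∂ values above; the 1-second RTO floor is an added assumption following RFC 6298's recommendation, and the RTTVAR initialization (half the first RTT) also comes from that RFC:

```python
class RtoEstimator:
    """Jacobson/Karels smoothed-RTT estimator, per the formulas above."""
    ALPHA, BETA = 0.125, 0.25

    def __init__(self, first_rtt: float):
        self.srtt = first_rtt
        self.rttvar = first_rtt / 2  # RFC 6298 initialization

    def update(self, rtt: float) -> float:
        """Fold a new RTT sample in (RTTVAR first, then SRTT) and return the RTO."""
        self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(rtt - self.srtt)
        self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.rto()

    def rto(self) -> float:
        # RTO = SRTT + 4 * RTTVAR, clamped to a 1-second floor
        return max(1.0, self.srtt + 4 * self.rttvar)
```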

However, timeout retransmission has these disadvantages:

  • When a segment is lost, it is only retransmitted after the full timeout elapses, increasing end-to-end delay.
  • While waiting for the timeout, later segments may already have reached the receiver but cannot be cumulatively acknowledged; the sender may assume they were lost too and retransmit them unnecessarily, wasting bandwidth and time.

Moreover, TCP doubles the timeout interval after each successive timeout, so timeout retransmission can mean a long wait. That is why TCP also provides a fast retransmission mechanism.

Fast retransmission

The fast retransmission mechanism is data-driven rather than timer-driven: it triggers retransmission based on feedback from the receiver.

Let’s take a look at the fast retransmission process:

Suppose the sender sends six segments, Seq=1 through Seq=6:

  • Seq=1 arrives first, so the receiver ACKs 2.
  • Seq=2 also arrives normally, so the receiver ACKs 3.
  • Seq=3 is lost somewhere in the network.
  • Seq=4 arrives, but because Seq=3 is missing, the receiver still ACKs 3.
  • Seq=5 and Seq=6 also arrive, but the ACK still says 3, because Seq=3 has not been received.
  • The sender receives three duplicate ACKs for 3 (four ACK=3 packets in total: one original and three duplicates). It now knows which segment was lost in transit and retransmits it before the timer expires.
  • The sender retransmits Seq=3. Since the receiver already buffered Seq=4, 5, and 6, it can now ACK 7.

However, fast retransmission still has a problem: the ACK only tells the sender the highest in-order segment received — it does not say exactly which later segments arrived. So how many segments should be retransmitted?

Just Seq=3? Or Seq=3, Seq=4, Seq=5, and Seq=6? The sender cannot tell, because it does not know which segments triggered the three consecutive ACK 3s.
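The duplicate-ACK trigger itself can be sketched as a simple counter (a toy model, not a real TCP implementation):

```python
def should_fast_retransmit(acks, threshold=3):
    """Return the ACK number to retransmit from once `threshold` duplicate
    ACKs (beyond the original) have been seen; otherwise return None."""
    dup_count = 0
    last_ack = None
    for ack in acks:
        if ack == last_ack:
            dup_count += 1           # another duplicate of the same ACK
            if dup_count >= threshold:
                return ack           # triple duplicate ACK: retransmit now
        else:
            last_ack, dup_count = ack, 0
    return None

# The scenario above: Seq=3 is lost, so every later segment is ACKed with 3.
acks = [2, 3, 3, 3, 3]   # the original ACK 3 plus three duplicates
lost = should_fast_retransmit(acks)
```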

Retransmission with selective acknowledgement (SACK)

To answer fast retransmission's "how many packets should be retransmitted?" question, TCP provides SACK (Selective Acknowledgment).

With SACK, on top of fast retransmission, the receiver reports the ranges of sequence numbers it has recently received, so the sender knows exactly which segments are missing and retransmits only those. SACK information is carried in the TCP header's options field.

For example, after the sender receives the same ACK=30 three times, fast retransmission is triggered; the SACK information shows that only the 30–39 range is missing, so only the segment covering bytes 30–39 is retransmitted.

D-SACK

D-sack, also known as Duplicate SACK, is an extension of SACK, which is mainly used to tell the sender which packets have been repeatedly received by themselves. The purpose of DSACK is to help the sender determine whether a packet out of order, ACK loss, packet duplication, or pseudo-retransmission has occurred. TCP can better do network flow control. Here’s a picture:

10. Talk about TCP sliding windows

Without a window, TCP would send one packet and wait for its acknowledgement before sending the next — clearly inefficient.

It's like a face-to-face conversation where you say one sentence, wait for my reply, then say the next. If I'm busy and can't reply right away, making you wait for my response before your next sentence is obviously impractical.

To solve this, TCP introduces the window, a buffer created by the operating system. The window size is the maximum amount of data that may remain in flight without waiting for an acknowledgement.

The TCP header's 16-bit window field (win) tells the peer how many bytes the local TCP receive buffer can still hold, so the peer can pace its sending — this is how the flow of data is controlled.

Generally, whenever the receiver acknowledges a packet, it also reports how much free space remains in its buffer. That free space is the receive window size — that is, win.

TCP sliding windows come in two kinds: the send window and the receive window. The send-side window covers four regions, as follows:

  • Sent and acknowledged (ACK received)
  • Sent but not yet acknowledged
  • Not yet sent, but within the window (may be sent)
  • Not yet sent and outside the window (may not be sent yet)

  • The dashed rectangle (in the original figure) is the send window.
  • SND.WND: the size of the send window; in the figure it spans 14 units.
  • SND.UNA: an absolute pointer to the sequence number of the first byte sent but not yet acknowledged.
  • SND.NXT: the next position to send — the sequence number of the first byte that has not been sent but may be.

The recipient’s sliding window consists of three parts, as follows:

  • Received and acknowledged
  • Not yet received, but within the window (may be received)
  • Not yet received and outside the window (may not be received yet)

  • The dashed rectangle (in the original figure) is the receive window.
  • RCV.WND: the size of the receive window.
  • RCV.NXT: the next position to receive — the sequence number of the first byte not yet received but receivable.
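The four send-side regions can be illustrated with plain byte ranges. The pointer names mirror SND.UNA / SND.NXT / SND.WND above; the concrete numbers are arbitrary examples:

```python
def window_regions(snd_una: int, snd_nxt: int, snd_wnd: int, total: int) -> dict:
    """Split byte sequence numbers 0..total-1 into the four send-side regions
    described above (a sketch using absolute byte numbers)."""
    return {
        "sent_and_acked": range(0, snd_una),               # region 1
        "sent_not_acked": range(snd_una, snd_nxt),         # region 2 (in flight)
        "can_send": range(snd_nxt, snd_una + snd_wnd),     # region 3 (inside window)
        "cannot_send": range(snd_una + snd_wnd, total),    # region 4 (outside window)
    }

# Bytes 0-3 acked, 4-8 in flight, 9-17 sendable, 18-29 outside a 14-byte window:
regions = window_regions(snd_una=4, snd_nxt=9, snd_wnd=14, total=30)
```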

11. Talk about TCP traffic control

TCP three-way handshake, the sender and receiver enter the ESTABLISHED state, and they can happily transfer data.

However, the sender cannot blindly flood the receiver with data: if the receiver cannot keep up, incoming data piles up in its buffer, and once the buffer is full, further packets must be dropped — wasting network resources.

TCP provides a mechanism for the sender to control the amount of data sent based on the actual receiving capability of the receiver. This mechanism is called flow control.

TCP uses a sliding window to control traffic. Let’s take a look at the flow control process briefly:

First, the two sides shake hands three times and initialize their respective window sizes, both of which are 400 bytes.

  1. The sender sends 200 bytes to the receiver; its SND.NXT moves 200 bytes to the right, meaning 200 bytes less of the window is currently available.
  2. The receiver buffers the 200 bytes: RCV.WND = 400 − 200 = 200 bytes, so it advertises win=200 in the header of its ACK.
  3. The sender sends another 200 bytes, which arrive and join the buffer queue. The receiver is under heavy load and can only process 100 bytes; the remaining 100 stay buffered. Now RCV.WND = 400 − 200 − 100 = 100 bytes, so it advertises win=100.
  4. The sender goes ahead and sends 100 more bytes, at which point the advertised receive window win drops to 0.
  5. The sender stops sending and starts a timer (a window probe), periodically asking the receiver for its window until win is greater than 0, then resumes sending.

12. TCP congestion control

Congestion control operates on the network: it prevents too much data from being injected so the network is not overloaded, aiming to use the bandwidth of the bottleneck link as fully as possible. How does it differ from flow control? Flow control operates on the receiver: it paces the sender according to the receiver's actual capacity, to prevent receiver-side packet loss.

We can think of a network link as a water pipe, and if we want to maximize the flow of data through the network, it’s to get the water pipe as full as possible.

The sender maintains a variable called cwnd (congestion window) that estimates how much data (water) the link (pipe) can carry in flight over a period of time. It reflects how congested the network is and changes dynamically. But to maximize throughput, how do we discover the pipe's capacity?

A simple method is to increase the amount of water transferred until the pipe is about to burst (which corresponds to packet loss on the network).

As long as there is no congestion in the network, the value of the congestion window can be increased to send more packets, but as long as there is congestion, the value of the congestion window should be decreased to reduce the number of packets injected into the network.

In fact, there are several commonly used congestion control algorithms

  • Slow start
  • Congestion avoidance
  • Congestion occurs
  • Fast recovery

Slow start algorithm

The slow start algorithm is just what it sounds like: start slowly. After the TCP connection is established, do not blast out data immediately; probe the network's congestion level first, growing the congestion window from small to large. While no loss occurs, each ACK received increases cwnd by 1 MSS, so the send window doubles every round trip — exponential growth. Growth continues until cwnd reaches the slow start threshold, at which point congestion avoidance takes over; loss triggers the congestion handling described below.

  • When the connection completes, cwnd is initialized to 1, meaning one MSS of data may be transmitted.
  • Each ACK received increases cwnd by 1.
  • So after each RTT, cwnd has doubled — exponential growth.

To keep cwnd from growing without bound and congesting the network, TCP keeps a slow start threshold (ssthresh) state variable. Once cwnd reaches this threshold, TCP eases off — as if partly closing the tap. That is, when cwnd > ssthresh, the congestion avoidance algorithm takes over.

Congestion avoidance algorithm

In general, the initial slow start threshold ssthresh is 65535 bytes. After cwnd reaches ssthresh, congestion avoidance works as follows:

  • CWND = CWND + 1/ CWND for each ACK received
  • For each RTT, CWND = CWND + 1

This is clearly linear growth, slowing the ramp-up to avoid congesting the network too quickly.

Congestion occurs

When the network becomes congested, packets are lost. Loss is detected in one of two ways:

  • RTO timeout retransmission
  • Fast retransmission (three duplicate ACKs)

If loss is detected by an RTO timeout, the response is drastic:

  • Slow start threshold ssthresh = cwnd / 2
  • cwnd is reset to 1
  • A new slow start begins

This is like decades of progress wiped out overnight. Fortunately there is a gentler path: fast retransmission. When the sender receives three consecutive duplicate ACKs, it retransmits immediately without waiting for the RTO to expire.

When fast retransmission triggers, ssthresh and cwnd change as follows:

  • Congestion window cwnd = cwnd / 2
  • Slow start threshold ssthresh = cwnd
  • Enter the fast recovery algorithm

Fast recovery

Fast retransmission and fast recovery are usually used together. Fast recovery reasons that if three duplicate ACKs still arrive, the network cannot be that bad, so there is no need to react as harshly as after an RTO timeout.

As mentioned above, before entering fast recovery, cwnd and ssthresh have already been updated:

  • cwnd = cwnd / 2
  • ssthresh = cwnd

Then the fast recovery algorithm proceeds:

  • cwnd = ssthresh + 3
  • Retransmit the missing segment (the one the duplicate ACKs point at)
  • For each further duplicate ACK received, cwnd = cwnd + 1
  • When an ACK for new data arrives, set cwnd = ssthresh; this ACK means the recovery process is over, and TCP re-enters congestion avoidance.
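The rules from slow start through fast recovery can be collected into one sketch. Here cwnd is counted in MSS units rather than bytes, and the initial ssthresh of 64 MSS is an arbitrary illustrative choice:

```python
class CongestionControl:
    """Sketch of the cwnd/ssthresh updates described in this section
    (units are MSS, not bytes; simplified, not a real TCP stack)."""

    def __init__(self):
        self.cwnd = 1          # slow start begins at 1 MSS
        self.ssthresh = 64     # illustrative initial slow start threshold

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1                # slow start: +1 per ACK (doubles per RTT)
        else:
            self.cwnd += 1 / self.cwnd    # congestion avoidance: ~+1 per RTT

    def on_timeout(self):
        """RTO fired: drastic response — halve the threshold, restart slow start."""
        self.ssthresh = self.cwnd // 2
        self.cwnd = 1

    def on_triple_dup_ack(self):
        """Fast retransmit detected loss: enter fast recovery."""
        self.cwnd = self.cwnd // 2
        self.ssthresh = self.cwnd
        self.cwnd = self.ssthresh + 3

cc = CongestionControl()
for _ in range(3):
    cc.on_ack()   # slow start: cwnd grows 1 -> 4
```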

13. Relationship between half-connection queues and SYN Flood attacks

Before TCP enters the three-way handshake, the server changes from the CLOSED state to the LISTEN state, and creates two internal queues: the half-connection queue (SYN queue) and the full connection queue (ACCEPT queue).

What is a half-connection queue (SYN queue)? What is an ACCEPT queue? Recall the TCP three-way handshake diagram:

  • During the three-way handshake, when the client's SYN arrives, the server replies with SYN+ACK, moves from LISTEN to SYN_RCVD, and places the pending connection in the half-connection (SYN) queue.
  • When the client's final ACK arrives and the server receives it, the handshake is complete. The connection is then moved to the ACCEPT queue — the full connection queue — where it waits until the application picks it up.

SYN Flood is a classic denial-of-service (DoS) attack: in a short period, the attacker floods the server with SYN packets from spoofed, non-existent IP addresses. The server replies SYN+ACK but never receives the final ACK, so half-open connections pile up until the half-connection queue is full and normal TCP requests can no longer be served.

The solutions include SYN cookie and SYN Proxy firewall.

  • SYN cookies: after receiving a SYN, the server computes a cookie from the packet's source address, port, and other parameters, and uses it as the sequence number of its own SYN+ACK. It allocates no resources yet. When the final ACK arrives, the server recomputes the cookie from the packet's source address and port and checks whether the acknowledgment number is correct; if so it establishes the connection, otherwise it drops the packet.

  • SYN proxy firewall: a firewall in front of the server answers each received SYN itself and maintains the half-connection. Only after the sender's ACK comes back does the firewall reconstruct the SYN, send it to the server, and establish the real TCP connection.
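A simplified SYN-cookie computation might look like the following. The hash construction and the SECRET value are illustrative assumptions — real SYN cookies also encode a timestamp and an MSS index in specific bit positions:

```python
import hashlib

SECRET = b"server-secret"  # hypothetical per-server secret key

def syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn):
    """Derive the SYN+ACK sequence number from connection parameters,
    so the server stores no per-connection state (simplified sketch)."""
    msg = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{client_isn}".encode()
    digest = hashlib.sha256(SECRET + msg).digest()
    return int.from_bytes(digest[:4], "big")  # 32-bit sequence number

def verify_ack(src_ip, src_port, dst_ip, dst_port, client_isn, ack_num):
    """The final ACK must acknowledge cookie+1; recompute and compare."""
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn) + 1) % 2**32
    return ack_num == expected

cookie = syn_cookie("10.0.0.1", 12345, "10.0.0.2", 80, 1000)
```

Because the cookie is recomputable from the packet itself, a flood of SYNs from spoofed addresses consumes no server-side queue entries.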

14. Nagle algorithm and delayed acknowledgment

Nagle algorithm

What do you think would be the problem, dear friends, if the sender was frantically sending tiny packets to the receiver, just one byte?

In TCP/IP, no matter how much data is sent, a protocol header is added before the data. When receiving the data, the peer party also sends an ACK for confirmation. To maximize network bandwidth, TCP always wants to send as much data as possible. Nagle’s algorithm is designed to send as large a chunk of data as possible, rather than saturate the network with many small chunks.

The basic rule of the Nagle algorithm: at any moment, at most one unacknowledged small segment may be outstanding. A "small segment" is a data block smaller than the MSS; "unacknowledged" means the block has been sent but no ACK confirming its receipt has come back.

Implementation rules of Nagle algorithm:

  • If the accumulated data reaches the MSS, send it;
  • If the segment contains a FIN, send it;
  • If the TCP_NODELAY option is set, send it;
  • If TCP_CORK is not set and all previously sent small packets (length < MSS) have been acknowledged, send it;
  • If none of the above holds but a timeout (typically 200 ms) elapses, send immediately.
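Applications that cannot tolerate Nagle's batching disable it with the TCP_NODELAY socket option mentioned above. For example, in Python:

```python
import socket

# Disable Nagle's algorithm on a TCP socket with TCP_NODELAY.
# Useful for latency-sensitive traffic made of small, frequent writes.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to verify it took effect:
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```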

Delayed acknowledgment

Suppose the receiver has just received one packet from the sender, and a second arrives a very short time later. Is it better for the receiver to acknowledge each one separately, or both together?

After receiving a data packet, if the receiver has nothing to send back to the peer, it can hold the ACK for a short period (40 ms by default on Linux). If data for the peer happens to be ready within that window, the ACK piggybacks on it rather than travelling alone. If nothing is sent within the period, the ACK goes out by itself when the timer fires, so the sender does not retransmit unnecessarily.

Some situations must not be delayed, however: when out-of-order segments are detected, when a segment larger than one frame arrives, or when the window size needs to be updated.

In general, the Nagle algorithm and delayed acknowledgment should not be combined: Nagle delays sending and delayed ACK delays acknowledging, and stacking the two delays can cause serious latency and performance problems.

15. TCP packet sticking and unpacking

TCP is byte-stream oriented: its data is an unbounded stream with no message boundaries. The TCP layer knows nothing about the meaning of the application's data; it packs the send buffer however is convenient. So a complete application-level message may be split by TCP into several segments to send (unpacking), and several small messages may be coalesced into one large segment (sticking). This is TCP packet sticking and unpacking.

Why does sticking and unpacking occur?

  • If the data to be sent is smaller than the TCP send buffer, TCP may coalesce data from several writes into one segment — packet sticking occurs.
  • If the receiving application does not read the receive buffer promptly, multiple arrivals accumulate there — sticking also occurs.
  • If the data to be sent is larger than the remaining space in the TCP send buffer, it is split — packet unpacking occurs.
  • If the data to be sent is larger than the MSS (maximum segment size), TCP splits it before transmission — that is, whenever segment length − TCP header length > MSS.

Solution:

  • Fixed length: the sender encapsulates every message at a fixed length.
  • Delimiters: append a special character sequence to the end of each message to mark the boundary.
  • Header + body: split each message into two parts — a header and a content body; the header has a fixed size and contains a field declaring the body's length.
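The header + body approach (the third solution above) can be sketched with a 4-byte length prefix. The frame format here is an illustrative choice, not a standard:

```python
import struct

def encode_frame(body: bytes) -> bytes:
    """Prefix the body with a 4-byte big-endian length header."""
    return struct.pack("!I", len(body)) + body

def decode_frames(stream: bytes):
    """Split a byte stream back into complete messages. Leftover bytes
    (a partial frame) are returned so the caller can prepend them to
    the next read — this is how sticking and unpacking are undone."""
    msgs, i = [], 0
    while i + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, i)
        if i + 4 + length > len(stream):
            break  # incomplete frame: wait for more data
        msgs.append(stream[i + 4 : i + 4 + length])
        i += 4 + length
    return msgs, stream[i:]

# Two messages "stuck together" in one TCP read are still split correctly:
data = encode_frame(b"hello") + encode_frame(b"world")
msgs, rest = decode_frames(data)
```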
