This is a summary of the points that have come up in interviews. When the interviewer keeps asking follow-up questions, gaps in preparation show quickly.
TCP and UDP are transport layer protocols.
1. The TCP protocol
Transmission Control Protocol (TCP) is a connection-oriented protocol.
- Before sending or receiving data, a reliable connection must be established with the other party.
Since the protocol is connection-oriented, everything below revolves around the connection, starting with how it is established and torn down.
1.1 Establishing a connection: the three-way handshake
The three-way handshake confirms, before any data is exchanged, that both the client and the server can send and receive.
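TCP header fields involved:
- SYN: the synchronize flag, set on the segments that open a connection.
- ACK: the acknowledgment flag and number. An acknowledgment number of x+1 means that everything up to x has been received and that the next data expected starts at x+1.
- seq: the sequence number of the segment.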
1. Detailed process of the three-way handshake
- Host A sends host B a segment with the SYN flag set.
- Host B replies to host A with a segment carrying both SYN and ACK.
- Host A sends an ACK to host B.
First handshake: the client sends a segment carrying SYN and seq, and the server receives it.
- The server can conclude that the client can send and the server can receive.
Second handshake: the server sends a segment carrying SYN, ACK, and seq, and the client receives it.
- The client can conclude that both sides can send and receive: its own SYN got through, and the server's reply came back.
Third handshake: the client sends a segment carrying ACK and seq, and the server receives it.
- The server can now also conclude that the client can receive and the server can send, because its SYN+ACK was acknowledged.
After these three handshakes, both the client and the server have confirmed that sending and receiving work in both directions, and normal communication can begin.
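The sequence and acknowledgment numbers exchanged in the three handshakes can be traced with a small sketch. The function name three_way_handshake and the initial sequence numbers 100 and 300 are hypothetical, chosen only for illustration; a real TCP stack picks its own initial sequence numbers.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples.

    client_isn and server_isn are illustrative initial sequence numbers.
    """
    # First handshake: client -> server, SYN with the client's initial seq.
    syn = ("client", "SYN", client_isn, None)
    # Second handshake: server -> client, SYN+ACK. The ack number is
    # client_isn + 1 because the SYN itself consumes one sequence number.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # Third handshake: client -> server, ACK of the server's SYN.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(100, 300):
    print(f"{sender:6} {flags:7} seq={seq} ack={ack}")
```

Each ack value is the other side's seq plus one, which is the "x+1 means everything up to x arrived" rule applied to the SYN segments.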
2. Would a two-way handshake work?
In a two-way handshake, the client would send a SYN connection request and the server would reply with an ACK, at which point the connection would count as established. Consider this sequence:
- The client sends a connection request, receives no acknowledgment because the request appears to be lost, and retransmits the request.
- The retransmitted request is acknowledged and a connection is established.
- After the data is transferred, the connection is released.
Suppose the first SYN was not actually lost but held up at some network node for a long time, and it reaches the server only after the connection has been released. The server mistakes it for a new connection request and sends an acknowledgment agreeing to establish a connection. With only two handshakes, the server now considers the connection established, while the client, which never asked for it, ignores the acknowledgment.
- The server keeps the connection open and waits for data that never comes, wasting resources.
3. What happens when the third handshake fails
On the server side:
- If the third-handshake ACK is lost in the network, the server's connection stays in the SYN_RECV state. Following TCP's timeout retransmission mechanism, the server resends the SYN+ACK after successively longer waits (3 seconds, then 6, then 12). If no ACK arrives from the client after the configured number of retransmissions, the server closes the connection after a further timeout.
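The 3, 6, 12 second pattern is a timeout that doubles on every retry. A minimal sketch of that schedule follows; the function name synack_retransmit_waits is hypothetical, and the 3-second starting value is simply the one quoted in the text (real stacks use implementation-specific values).

```python
def synack_retransmit_waits(first_wait, retries):
    """Seconds waited before each SYN+ACK retransmission, doubling each time."""
    return [first_wait * 2 ** attempt for attempt in range(retries)]


print(synack_retransmit_waits(3, 3))  # [3, 6, 12]
```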
On the client side:
- In C on Linux, the client normally connects with the connect() function, and connect() returns once the second step of the three-way handshake completes. In other words, as soon as the client receives the SYN+ACK, it considers the connection ESTABLISHED. If the third-handshake ACK is lost and the server has already given up on the connection by the time the client sends data, the server answers with an RST packet, and that is how the client learns about the failure.
SYN flood attack:
- If the third handshake never arrives, the server drops the half-open connection and reclaims its resources only after waiting for a period of time. This opens the door to denial-of-service (DoS) attacks: an attacker sends a large number of SYN packets and never answers with the third handshake. The server allocates resources for each of these SYNs without ever using them, which consumes large amounts of memory and exhausts the server's capacity for new connections. This is a SYN flood attack.
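The effect of a SYN flood can be illustrated with a toy model of the server's half-open queue. Everything here is an assumption made for illustration: the class HalfOpenQueue, its capacity and timeout parameters, and the addresses are hypothetical and do not correspond to any real kernel interface.

```python
from collections import OrderedDict


class HalfOpenQueue:
    """Toy model of a server's queue of half-open (SYN_RECV) connections."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity      # max half-open connections kept
        self.timeout = timeout        # seconds before an entry is dropped
        self.pending = OrderedDict()  # client -> time its SYN arrived

    def expire(self, now):
        # Reclaim entries whose third handshake never arrived in time.
        stale = [c for c, t in self.pending.items() if now - t >= self.timeout]
        for client in stale:
            del self.pending[client]

    def on_syn(self, client, now):
        """Return True if the SYN was accepted, False if the queue was full."""
        self.expire(now)
        if len(self.pending) >= self.capacity:
            return False
        self.pending[client] = now
        return True

    def on_ack(self, client):
        """Third handshake: the connection leaves the half-open queue."""
        return self.pending.pop(client, None) is not None


queue = HalfOpenQueue(capacity=3, timeout=30.0)
# The attacker sends SYNs and never completes the handshake.
for i in range(3):
    queue.on_syn(("10.0.0.%d" % i, 40000), now=0.0)
# A legitimate client is refused while the queue is full.
print(queue.on_syn(("192.0.2.7", 51000), now=1.0))   # False
# Once the stale entries time out, the client gets in again.
print(queue.on_syn(("192.0.2.7", 51000), now=31.0))  # True
```

The model shows why the waiting period matters: resources stay tied up for the whole timeout even though the attacker sent only one packet per entry.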
1.2 Closing a connection: the four-way wave
The four-way wave splits into two halves of two waves each; each half closes the connection in one direction.
Client side closes:
- The client sends a segment carrying FIN, and the server receives it.
- The server sends a segment carrying ACK and enters CLOSE_WAIT. When the client receives it, the client-to-server direction is closed.
Server side closes:
- The server sends a segment carrying FIN, and the client receives it.
- The client sends a segment carrying ACK and enters TIME_WAIT. When the server receives it, the server-to-client direction is closed.
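The four waves can also be read as state transitions on each side. The sketch below encodes them as lookup tables. The state names FIN_WAIT_1, FIN_WAIT_2, and LAST_ACK are not mentioned in the text; they are the standard names from the TCP specification. The event labels and the helper run are hypothetical.

```python
# (state, event) -> next state for the side that closes first (the client here).
ACTIVE_CLOSER = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}
# (state, event) -> next state for the side that closes second (the server here).
PASSIVE_CLOSER = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(transitions, events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = transitions[(state, event)]
        visited.append(state)
    return visited


print(run(ACTIVE_CLOSER,
          ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"]))
print(run(PASSIVE_CLOSER,
          ["recv FIN, send ACK", "send FIN", "recv ACK"]))
```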
1. The purpose of TIME_WAIT
TIME_WAIT is the state that the side actively closing the connection enters after sending the last of the four waves (the final ACK).
TIME_WAIT lasts for 2MSL.
- MSL is the Maximum Segment Lifetime, commonly taken as 30 s, 1 min, or 2 min. Implementations differ; with an MSL of 2 min, 2MSL is 4 min.
Q: Why does it last so long? A:
- To make sure the client's final ACK can reach the server. The final ACK may be lost, in which case the server times out and retransmits the FIN of the third wave, and the client resends the ACK of the fourth wave. Without the 2MSL wait, the client would close immediately after sending the final ACK and the retransmitted FIN would find no connection. (Strictly speaking, the client does receive it, treats it as an invalid segment, and answers with an RST to refuse communication, so both sides see an error rather than the FIN simply going unanswered.) The server then cannot reach the CLOSED state normally, which wastes server resources. When a host holds a large number of connections in TIME_WAIT, the strain on it is easy to imagine.
- To let every segment from this connection disappear from the network. After the fourth wave, 2MSL is long enough for that, so a later connection cannot receive segments from the old one, such as a stale connection request. Without TIME_WAIT, the connection would end the moment the last wave is sent. A SYN or other segment delayed in the network could then arrive after a new connection between the same endpoints has started, and the new connection would mistake it for its own traffic.
2. CLOSE_WAIT
CLOSE_WAIT appears on the side that closes passively.
- A connection stays in CLOSE_WAIT when the peer's FIN has been received and acknowledged but the local program has not yet closed its end, so its own FIN is never sent. In other words, the application has not noticed that the peer closed the connection, or it has simply forgotten to close it, and the application keeps holding the resource.
The quick fixes at this point are:
- Restart the running application, if the business situation allows.
- Fix the bug in the program as soon as possible, then test it and deploy it to the production server.
1.3 Flow control
Flow control: limits the sender's transmission rate so that the receiver can keep up, which avoids packet loss at the receiver.
- The fundamental purpose of flow control is to prevent packet loss; it is one of the mechanisms behind TCP's reliability.
Flow control is implemented with the sliding window protocol (continuous ARQ protocol). The sliding window protocol ensures that segments are received without errors and in order, and it also provides flow control.
- The main mechanism: each ACK returned by the receiver carries the size of its receive window, and the sender uses that size to limit how much data it sends.
ACK number: the number n of the next byte the receiver expects. It means the receiver has received every byte up to n-1. If the receiver gets byte n+1 without having received byte n, it does not send an ACK numbered n+2; the ACK number stays at n.
Window size: the receive window m currently advertised. From these two values in an ACK, the sender can work out how many more bytes it may send. If x is the number of the next byte the sender would transmit, so that x-n bytes are sent but not yet acknowledged, the number of bytes it may still send is y = m-(x-n). This is the basic principle by which the sliding window controls flow.
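The formula y = m-(x-n) is easy to check with numbers. The sketch below assumes x is the number of the next byte the sender would transmit, so that x-n bytes are in flight; the function and parameter names are hypothetical.

```python
def sendable_bytes(window, next_to_send, ack_number):
    """y = m - (x - n): bytes the sender may still transmit.

    window       -- m, the receive window advertised in the latest ACK
    next_to_send -- x, number of the next byte the sender would transmit
    ack_number   -- n, number of the next byte the receiver expects
    """
    in_flight = next_to_send - ack_number  # sent but not yet acknowledged
    return max(0, window - in_flight)


# Window of 400 bytes, bytes 201..300 already in flight: 300 bytes remain.
print(sendable_bytes(window=400, next_to_send=301, ack_number=201))  # 300
```

When the result reaches 0 the sender has to stop until a new ACK either advances n or advertises a larger m.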
Deadlock caused by flow control
Situation: When the sender receives a reply with a window of 0, the sender stops sending and waits for the receiver’s next reply. But if the reply with a non-zero window is lost during transmission, the sender keeps waiting while the receiver thinks the sender has received the reply and waits to receive new data, so the two parties wait for each other, resulting in a deadlock.
Solution: To avoid this deadlock, TCP uses a persist timer. The sender starts the timer whenever it receives a reply with a zero window. When the timer expires, the sender proactively sends a probe to ask for the receiver's window size. If the receiver still reports a zero window, the sender resets the timer and keeps waiting. If the window is non-zero, the earlier reply was lost; the sender updates its send window and resumes sending, and the deadlock is avoided.
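A toy simulation makes the difference visible. It assumes discrete time ticks, a single window update from the receiver, and probe replies that are never lost; the function resume_tick and all of its parameters are hypothetical simplifications.

```python
from typing import Optional


def resume_tick(opens_at: int, update_lost: bool,
                persist_interval: Optional[int],
                max_ticks: int = 100) -> Optional[int]:
    """Tick at which a sender blocked on a zero window learns it has reopened.

    opens_at         -- tick at which the receiver's window becomes non-zero
    update_lost      -- whether the receiver's single window update is lost
    persist_interval -- persist timer period in ticks, or None for no timer
    Returns None if the sender is still blocked after max_ticks (deadlock).
    """
    for tick in range(1, max_ticks + 1):
        # The unsolicited window update arrives only if it was not lost.
        if tick == opens_at and not update_lost:
            return tick
        # Persist timer fires: the probe's reply reports the current window,
        # which is non-zero once the window has opened.
        timer_fires = bool(persist_interval) and tick % persist_interval == 0
        if timer_fires and tick >= opens_at:
            return tick
    return None


print(resume_tick(opens_at=7, update_lost=True, persist_interval=None))  # None
print(resume_tick(opens_at=7, update_lost=True, persist_interval=5))     # 10
```

Without the timer the sender never resumes; with it, the probe at tick 5 still sees a zero window, and the probe at tick 10 discovers that the window has opened.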
1.4 Congestion control
Congestion control: prevents too much data from being injected into the network, so that the network is not overloaded.
The sender maintains a state variable called the congestion window. Its size depends on how congested the network is and changes dynamically. The sender sets its send window to the congestion window, but it must also respect the receiver's capacity, so the send window is the smaller of the congestion window and the receive window and may therefore be less than the congestion window.
Common methods are:
- Slow start and congestion avoidance.
- Fast retransmit and fast recovery.
1. Slow start and congestion avoidance
Slow start: the congestion window starts at 1 and doubles after every RTT until it reaches the threshold. Congestion avoidance: once the threshold is reached, the congestion avoidance algorithm takes over and the window grows by 1 per RTT.
- When network congestion occurs, the threshold is set to half of the current congestion window, and the window starts from 1 again.
Signs of congestion:
- Packet loss (router buffers overflow).
- Excessive packet delay (packets queue in router buffers).
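The slow start and congestion avoidance rules described in this subsection can be traced per RTT with a short sketch. It assumes the window is counted in whole segments, that doubling is capped at the threshold, and that on congestion the threshold becomes half of the current window (the common textbook rule). The function cwnd_trace and its parameters are hypothetical.

```python
def cwnd_trace(rtts, ssthresh, loss_at=()):
    """Congestion window (in segments) at the start of each RTT.

    rtts     -- number of round trips to simulate
    ssthresh -- initial slow start threshold, in segments
    loss_at  -- RTT indexes at which congestion is detected
    """
    cwnd = 1
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in loss_at:
            # Congestion: threshold drops to half the window, restart from 1.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double each RTT, without overshooting the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by one segment per RTT.
            cwnd += 1
    return trace


print(cwnd_trace(rtts=12, ssthresh=8, loss_at={7}))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6]
```

The trace shows exponential growth up to the threshold of 8, linear growth up to 12, and then a restart from 1 with the threshold lowered to 6.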
2. Fast retransmit and fast recovery
Fast retransmit: the receiver must send a duplicate acknowledgment immediately on receiving an out-of-order segment, rather than waiting until it has data of its own to piggyback the acknowledgment on. This lets the sender learn early that a segment did not arrive, which can improve network throughput by about 20%.
- Under the fast retransmit algorithm, as soon as the sender receives three duplicate acknowledgments in a row, it retransmits the missing segment immediately instead of waiting for the retransmission timer to expire.
Fast recovery: when the sender receives three duplicate acknowledgments in a row, it applies multiplicative decrease and sets the slow start threshold to half of the current congestion window (to head off network congestion). It does not then run slow start, however. Instead, it sets the congestion window to the halved value and continues with the congestion avoidance algorithm.
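The duplicate-ACK rule can be sketched as a counter on the sender. This is a simplified illustration that follows the description above: ACK numbers stand for segment numbers, the halving is applied to the current congestion window, and window growth between ACKs is not modeled. The function on_acks is hypothetical.

```python
def on_acks(acks, cwnd, ssthresh):
    """Process a stream of ACK numbers; return (retransmitted, cwnd, ssthresh).

    retransmitted lists the ACK numbers (the missing segments) that
    triggered a fast retransmit.
    """
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == 3:
                # Third duplicate ACK: resend the missing segment right away,
                # without waiting for the retransmission timer.
                retransmitted.append(ack)
                # Fast recovery: halve, then continue in congestion avoidance
                # instead of restarting slow start from 1.
                ssthresh = max(cwnd // 2, 2)
                cwnd = ssthresh
        else:
            last_ack, dup_count = ack, 0
    return retransmitted, cwnd, ssthresh


# Segment 3 is lost; segments 4, 5, and 6 each trigger a duplicate ACK for 3.
print(on_acks([2, 3, 3, 3, 3, 7], cwnd=16, ssthresh=32))  # ([3], 8, 8)
```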
2. The UDP protocol
UDP adds only multiplexing and demultiplexing and error detection on top of the IP datagram service.
- It is connectionless and does not provide reliable delivery.
- On sending, UDP only adds a header to the data handed down by the application layer and passes it to the network layer without any special processing.
- On receiving, UDP strips the header from the user datagram passed up by the network layer and delivers the data to the application layer intact.
3. Differences between TCP and UDP
1. Connection-based vs. connectionless
- TCP is a connection-oriented protocol.
- UDP is a connectionless protocol. UDP is better suited for multicast publishing of messages, transferring messages from a single point to multiple points.
2. Reliability
- TCP guarantees delivery. If a packet is lost in transit, it is retransmitted.
- UDP is unreliable and gives no delivery guarantee (hence the occasional lost packets in online games and video).
3. Ordering
- TCP delivers data in order, even if segments arrive at the receiver out of order.
- UDP gives no ordering guarantee.
4. Data boundaries
- TCP does not preserve data boundaries.
- TCP is a byte stream. The bytes of a message are held in the TCP buffer and may be combined or split before they are sent, to make better use of network bandwidth, so the receiver cannot tell from the stream where one message ended and the next began.
- UDP preserves data boundaries.
- In UDP, each packet is sent individually and arrives as a whole or not at all. Packets have clear boundaries, so each read operation on the receiving socket yields exactly one complete message.
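The difference in boundary handling can be illustrated without real sockets. The two classes below are toy models written for this note, not part of any library: StreamBuffer stands in for a TCP receive buffer and DatagramQueue for a UDP receive queue.

```python
from collections import deque


class StreamBuffer:
    """Toy model of a TCP receive buffer: one undivided run of bytes."""

    def __init__(self):
        self.data = bytearray()

    def deliver(self, payload):
        self.data += payload          # where one send ended is not recorded

    def read(self, max_bytes):
        chunk = bytes(self.data[:max_bytes])
        del self.data[:max_bytes]
        return chunk


class DatagramQueue:
    """Toy model of a UDP receive queue: one entry per datagram."""

    def __init__(self):
        self.queue = deque()

    def deliver(self, payload):
        self.queue.append(payload)    # each datagram stays a separate unit

    def read(self, max_bytes):
        # One read returns one datagram (truncated if the buffer is too small).
        return self.queue.popleft()[:max_bytes]


tcp, udp = StreamBuffer(), DatagramQueue()
for message in (b"hello", b"world"):
    tcp.deliver(message)
    udp.deliver(message)

print(tcp.read(1024))  # b'helloworld'
print(udp.read(1024))  # b'hello'
```

This is why protocols built on TCP need their own framing, such as a length prefix or a delimiter, while a UDP application can treat each read as one message.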
5. Speed
- TCP is slower.
- UDP is faster, which is why it is used in online video streaming, television broadcasting, and multiplayer online games.
6. Overhead
- TCP is heavyweight.
- UDP is lightweight.
- UDP carries none of the overhead of setting up a connection, guaranteeing delivery, or ordering messages.
- This is also reflected in the header sizes.
7. Header size
- TCP header:
- A TCP header is at least 20 bytes.
- It contains the source and destination ports, sequence number, ACK number, data offset, reserved bits, control bits, window, checksum, urgent pointer, options, and padding.
- UDP header:
- A UDP datagram header is 8 bytes.
- It contains only the source port, destination port, length, and checksum.
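The two header sizes can be verified by laying out the fields with Python's struct module. This is a sketch for checking the arithmetic, not a packet builder: the port numbers and payload are arbitrary, the TCP layout covers only the fixed part without options, and the checksum is left at 0 instead of being computed.

```python
import struct

# UDP header: source port, destination port, length, checksum (4 x 16 bits).
UDP_HEADER = struct.Struct("!HHHH")
# Fixed part of the TCP header, without options:
# source port, destination port, sequence number, ACK number,
# data offset + reserved + control bits (16 bits together),
# window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

print(UDP_HEADER.size, TCP_HEADER.size)  # 8 20

payload = b"ping"
# The UDP length field counts the header plus the payload.
header = UDP_HEADER.pack(5000, 53, UDP_HEADER.size + len(payload), 0)
src, dst, length, checksum = UDP_HEADER.unpack(header)
print(src, dst, length, checksum)  # 5000 53 12 0
```

The "!" prefix selects network byte order with no padding, so the computed sizes match the on-the-wire sizes.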
8. Congestion or flow control
- TCP has flow control.
- TCP needs three packets to set up a socket connection before any user data can be sent. TCP also handles reliability and congestion control.
- UDP has no flow control.
9. Applications
- Because TCP guarantees reliable, ordered delivery, it is best suited to applications that need high reliability and are less demanding about transfer time.
- UDP is better suited to applications that need fast, efficient transmission, such as games.
- UDP is stateless, which is useful when a server has to answer small requests from a large number of clients.
- In practice, TCP is used in finance (the FIX protocol, for example, is TCP-based), while UDP is heavily used in gaming and entertainment.
10. Upper-layer protocols
- TCP-based: Telnet, FTP, and SMTP.
- UDP-based: DHCP, DNS, SNMP, TFTP, and BOOTP.