1. What are the protocols and functions of each layer of a computer network?

Computer network architectures are usually described by one of three models: the OSI seven-layer model, the TCP/IP four-layer model, and the five-layer model.

  • OSI seven-layer model: comprehensive but complex. It came first as a theoretical model and has no practical deployment.
  • TCP/IP four-layer model: grew out of practical use. In essence only its top three layers are specified; the bottom layer has no concrete content, and the TCP/IP reference model does not really describe how that layer is implemented.
  • Five-layer model: appears only in computer networking courses. It is a compromise between the seven-layer and four-layer models that is concise and explains the concepts clearly.

Main functions of each layer of the seven-layer network architecture:

  • Application layer: Provides interaction services for applications. The Internet has many application-layer protocols, such as DNS (the Domain Name System), HTTP for the World Wide Web, and SMTP for e-mail.

  • Presentation layer: Handles data format conversion, such as encryption and decryption, translation between encodings, and compression and decompression.

  • Session layer: Establishes, maintains, and terminates communication sessions between two nodes in the network, for example when a server verifies a user login.

  • Transport layer: Provides a general-purpose data transfer service to processes on hosts. This layer has two main protocols:

    • TCP: provides a connection-oriented, reliable data transfer service.
    • UDP: provides a connectionless, best-effort data transfer service that does not guarantee reliable delivery.
  • Network layer: Selects suitable routes and switching nodes so that data is delivered in a timely manner. The IP protocol belongs to this layer.

  • Data link layer: Often called the link layer for short. It assembles the IP packets handed down from the network layer into frames and transmits the frames over the link between adjacent nodes.

  • Physical layer: Transmits bit streams transparently between adjacent nodes, hiding differences in transmission media and communication methods as far as possible.

2. What is the difference between TCP and UDP?

The comparison is as follows:

| Aspect | UDP | TCP |
| --- | --- | --- |
| Connection | Connectionless | Connection-oriented |
| Reliability | Unreliable; no flow control or congestion control | Reliable; uses flow control and congestion control |
| Ordering | Unordered | Ordered; segments may arrive out of order, and TCP reorders them |
| Transmission speed | Fast | Slow |
| Number of peers | One-to-one, one-to-many, many-to-one, and many-to-many | One-to-one only |
| Transfer unit | Message-oriented | Byte-stream-oriented |
| Header overhead | Small, only 8 bytes | 20 bytes minimum, 60 bytes maximum |
| Typical use | Real-time applications (IP telephony, video conferencing, live streaming, etc.) | Applications that need reliable delivery, such as file transfer |

Conclusion:

TCP is used when reliable transmission is necessary at the transport layer, and UDP is used for high-speed transmission and real-time communication. TCP and UDP should be used on demand according to the application purpose.

3. What are the application scenarios of UDP and TCP?

TCP is connection-oriented and ensures reliable data delivery, so it is often used for:

  • FTP file transfer
  • HTTP / HTTPS

UDP is connectionless and can send data at any time, and its processing is simple and efficient, so it is often used for:

  • Communication with a small total number of packets, such as DNS and SNMP
  • Video, audio, and other multimedia communication
  • Broadcast communication
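
To make the difference concrete, here is a minimal sketch using Python’s standard `socket` module on the loopback interface. The function names `udp_roundtrip` and `tcp_roundtrip` are made up for this illustration. The UDP version sends a datagram with no setup at all, while the TCP version has to listen, accept, and connect before any data moves.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback only; port 0 lets the OS pick a free port


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram to a local UDP socket; no connection is set up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((HOST, 0))
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data


def tcp_roundtrip(payload: bytes) -> bytes:
    """Connect first (the kernel performs the handshake), then echo bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((HOST, 0))
        listener.listen(1)

        def echo_once() -> None:
            conn, _addr = listener.accept()
            with conn:
                conn.sendall(conn.recv(2048))

        worker = threading.Thread(target=echo_once, daemon=True)
        worker.start()
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            client.sendall(payload)
            reply = client.recv(2048)
        worker.join(timeout=2.0)
        return reply
```

Both helpers return the payload they sent for a small message such as `b"ping"`. A real TCP reader would loop on `recv`, because a byte stream has no message boundaries.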

4. Describe the TCP three-way handshake in detail.

Image from: juejin.cn/post/684490…

Three-way handshake:

  • First handshake: The client requests a connection by sending a synchronization segment (SYN=1) to the server, choosing a random number seq = x as its initial sequence number. It then enters the SYN_SENT state and waits for the server’s acknowledgement.

  • Second handshake: After receiving the connection request and agreeing to establish the connection, the server sends a synchronization-acknowledgement segment (SYN=1, ACK=1) with acknowledgement number ack = x + 1, choosing a random number seq = y as its own initial sequence number. The server then enters the SYN_RECV state.

  • Third handshake: After receiving the server’s acknowledgement, the client sends an acknowledgement segment (ACK=1) with ack = y + 1 and seq = x + 1 to the server. The client and the server both enter the ESTABLISHED state, completing the three-way handshake.

Ideally, once a TCP connection is established, it is maintained until one of the two parties actively closes it.
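
The sequence and acknowledgement numbers above can be traced with a small simulation. This is only an illustrative sketch: the `Segment` class and the `three_way_handshake` function are hypothetical names, nothing is sent over a network, and the wrap-around at 2^32 is included because TCP sequence numbers are 32-bit.

```python
from dataclasses import dataclass

MOD = 2**32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    syn: bool = False
    ack_flag: bool = False
    seq: int = 0
    ack: int = 0


def three_way_handshake(x: int, y: int):
    """Return the three segments and the final (client, server) states."""
    client_state, server_state = "CLOSED", "LISTEN"

    # First handshake: client -> server, SYN with initial sequence number x.
    syn = Segment(syn=True, seq=x)
    client_state = "SYN_SENT"

    # Second handshake: server -> client, SYN+ACK with seq = y, ack = x + 1.
    syn_ack = Segment(syn=True, ack_flag=True, seq=y, ack=(syn.seq + 1) % MOD)
    server_state = "SYN_RECV"

    # Third handshake: client -> server, ACK with seq = x + 1, ack = y + 1.
    ack = Segment(ack_flag=True, seq=(x + 1) % MOD, ack=(syn_ack.seq + 1) % MOD)
    client_state = "ESTABLISHED"
    server_state = "ESTABLISHED"  # once the final ACK has been received

    return [syn, syn_ack, ack], (client_state, server_state)


segments, states = three_way_handshake(x=100, y=300)
```

With x = 100 and y = 300, the second segment acknowledges 101 and the third carries seq = 101 and ack = 301.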

5. Why three handshakes, not two?

There are three main reasons:

  1. It prevents an expired connection request from suddenly reaching the server and causing errors and wasted resources.

    Suppose two handshakes were enough to establish a connection. The client sends segment A to request a connection, but network problems keep A from reaching the server for a while. Since the server never receives the request, it returns no acknowledgement.

    After waiting without a reply, the client retransmits the request as segment B. B reaches the server, which returns an acknowledgement and enters the ESTABLISHED state. After receiving the acknowledgement, the client also enters the ESTABLISHED state. The two sides establish a connection, transmit data, and then disconnect as normal.

    Only then does the delayed segment A arrive. The server returns an acknowledgement and enters the ESTABLISHED state again. However, the client, already in the CLOSED state, will not accept the acknowledgement, let alone enter the ESTABLISHED state. As a result, the server waits unilaterally for a long time, wasting resources.

  2. Three handshakes are needed for both sides to confirm that their own and the peer’s sending and receiving work properly.

    First handshake: the client has only sent the request segment and can confirm nothing yet, while the server can confirm its own receiving capability and the client’s sending capability.

    Second handshake: the client can confirm that its own sending and receiving and the server’s sending and receiving all work.

    Third handshake: the server can confirm that its own sending and receiving and the client’s sending and receiving all work.

    Only after three handshakes can both parties confirm that sending and receiving work in both directions, so that they can communicate normally.

  3. Each side informs the other of its initial sequence number and confirms that the other has received it.

    One reason TCP achieves reliable transmission is that each TCP segment carries a sequence number field and an acknowledgement number field, through which each side knows which of its data the peer has acknowledged. The values of both fields increase from the initial sequence number. With only two handshakes, only the initiator’s initial sequence number would be confirmed, not the other side’s.

6. Why three handshakes, not four?

Three handshakes already let both parties confirm that sending and receiving work in both directions, know that the other side is ready, and confirm both initial sequence numbers, so a fourth handshake is unnecessary.

  • First handshake: The server confirms that its own receiving and the client’s sending work.
  • Second handshake: The client confirms that its own sending and receiving and the server’s receiving and sending work, and considers the connection established.
  • Third handshake: The server confirms that its own sending and the client’s receiving work. At this point both parties have established the connection and can communicate normally.

7. What is a SYN flood attack? How can it be prevented?

A SYN flood is a DoS attack that exploits a weakness in TCP connection setup: the attacker sends a large number of half-open connection requests, consuming the server’s CPU and memory.

Principle:

  • During the three-way handshake, the TCP connection between the server sending the [SYN/ACK] packet (the second packet) and receiving the client’s [ACK] packet (the third packet) is called a half-open connection. The server is in the SYN_RECV state (waiting for the client’s response). If it receives the client’s [ACK], the TCP connection succeeds. If not, it keeps retransmitting the [SYN/ACK] until it succeeds or times out.
  • In a SYN attack, the attacker forges a large number of nonexistent source IP addresses in a short period and continuously sends [SYN] packets to the server. The server replies with [SYN/ACK] packets and waits for the clients’ confirmation. Since the source addresses do not exist, the server retransmits until it times out.
  • These forged [SYN] packets occupy the half-open connection queue for a long time, crowding out legitimate SYN requests and causing the target system to slow down, the network to congest, or even the system to crash.

Detection: A large number of half-open connections on the server, especially from random source IP addresses, almost certainly indicates a SYN attack.

Defense:

  • Use firewalls and routers as filtering gateways.
  • Harden the TCP/IP stack, for example by increasing the maximum number of half-open connections and shortening the timeout.
  • SYN cookies. SYN cookies defend against SYN floods by modifying the server side of the three-way handshake: the server encodes the connection details into the initial sequence number of its SYN/ACK and allocates connection state only after a valid ACK comes back.
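
The idea behind SYN cookies can be sketched in a few lines. This is a deliberately simplified illustration, not how any real kernel implements it: `SECRET`, `SLOT_SECONDS`, `make_cookie`, `on_syn`, and `on_ack` are hypothetical names, and real implementations also pack details such as the MSS into the cookie. The point it shows is that the server stores nothing when a SYN arrives and can still validate the final ACK.

```python
import hashlib
import hmac

SECRET = b"hypothetical-server-secret"  # assumption: a random per-server key
SLOT_SECONDS = 64                       # assumption: how long a cookie stays valid


def make_cookie(src, dst, client_isn: int, slot: int) -> int:
    """Derive a 32-bit server ISN from the addresses, client ISN and time slot."""
    msg = f"{src}|{dst}|{client_isn}|{slot}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def on_syn(src, dst, client_isn: int, now: float) -> int:
    """Answer a SYN without keeping state: the cookie becomes the server ISN."""
    return make_cookie(src, dst, client_isn, int(now) // SLOT_SECONDS)


def on_ack(src, dst, client_isn: int, ack_number: int, now: float) -> bool:
    """Accept the final ACK only if it acknowledges a cookie we could have sent."""
    slot = int(now) // SLOT_SECONDS
    for candidate in (slot, slot - 1):  # also accept the previous time slot
        expected = (make_cookie(src, dst, client_isn, candidate) + 1) % 2**32
        if expected == ack_number:
            return True
    return False
```

A forged SYN from a nonexistent address costs the server only one hash computation, because no entry is added to the half-open connection queue.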

8. What happens if the last ACK of the three-way handshake is lost?

Server:

  • If the third ACK packet is lost in the network, the server’s TCP connection stays in the SYN_RECV state, and the server retransmits the SYN+ACK packet according to TCP’s timeout retransmission mechanism, waiting 3 seconds, then 6 seconds, then 12 seconds, and so on.
  • If the server still receives no ACK from the client after the configured number of retransmissions, it closes the connection automatically after a while.

Client:

The client considers the connection established as soon as it has sent the ACK. If it then sends data while the server is still in SYN_RECV, the data segment also carries the acknowledgement and completes the handshake. If the server has already dropped the half-open connection, it responds with an RST (reset) packet, and the client learns that the third handshake failed.
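
The 3, 6, 12 second pattern on the server side is a doubling backoff. As a small sketch (the function name and defaults are assumptions chosen to match the numbers in the text; actual values depend on the operating system and its settings):

```python
def synack_retransmit_waits(initial_timeout: float = 3.0, retries: int = 3):
    """Return the wait before each SYN+ACK retransmission and the total wait."""
    # Each wait is double the previous one: 3, 6, 12, ...
    waits = [initial_timeout * 2**attempt for attempt in range(retries)]
    return waits, sum(waits)
```

With the defaults, the server retransmits after waiting 3, 6, and 12 seconds, 21 seconds in total.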

9. Describe the TCP four-way wave (connection termination) in detail.

Image source: juejin.cn/post/684490…

  • First wave: The client sends a connection release segment (FIN=1, ACK=1) to the server, actively closing the connection, and waits for the server’s acknowledgement.

    • Sequence number seq = u, that is, the sequence number of the last byte the client has sent, plus 1.
    • Acknowledgement number ack = k, that is, the sequence number of the last byte the server has sent, plus 1.
  • Second wave: After receiving the connection release segment, the server sends an acknowledgement segment (ACK=1) with sequence number seq = k and acknowledgement number ack = u + 1.

    The TCP connection is now half-closed: the client-to-server direction has been released, but the server-to-client direction has not. The client has no more data to send, but the server may still send data to the client.

  • Third wave: The server sends a connection release segment (FIN=1, ACK=1) to the client, closing its side of the connection, and waits for the client’s acknowledgement.

    • Sequence number seq = w, that is, the sequence number of the last byte the server has sent, plus 1.
    • Acknowledgement number ack = u + 1, the same as in the second wave, because the client has sent no data in the meantime.
  • Fourth wave: After receiving the server’s connection release segment, the client sends an acknowledgement segment (ACK=1) with sequence number seq = u + 1 and acknowledgement number ack = w + 1.

    The client now enters the TIME_WAIT state. Note that the client’s TCP connection has not been released yet: the client enters the CLOSED state only after 2 x MSL (maximum segment lifetime). The server enters the CLOSED state as soon as it receives the client’s acknowledgement, so the server ends the TCP connection earlier than the client.
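
The four segments and the states each side passes through can be traced with a short simulation. This is an illustrative sketch only: `four_way_wave` is a hypothetical function, and the intermediate state names (FIN_WAIT_1, FIN_WAIT_2, CLOSE_WAIT, LAST_ACK) are the standard TCP states, which the text above does not name explicitly.

```python
def four_way_wave(u: int, k: int, w: int):
    """Trace (flags, seq, ack, client_state, server_state) as each segment is sent."""
    client, server = "ESTABLISHED", "ESTABLISHED"
    trace = []

    # First wave: client -> server, FIN with seq = u, ack = k.
    client = "FIN_WAIT_1"
    trace.append(("FIN,ACK", u, k, client, server))

    # Second wave: server -> client, ACK with seq = k, ack = u + 1.
    server = "CLOSE_WAIT"
    trace.append(("ACK", k, u + 1, client, server))
    client = "FIN_WAIT_2"  # half-closed: the server may still send data

    # Third wave: server -> client, FIN with seq = w, ack = u + 1.
    server = "LAST_ACK"
    trace.append(("FIN,ACK", w, u + 1, client, server))

    # Fourth wave: client -> server, ACK with seq = u + 1, ack = w + 1.
    client = "TIME_WAIT"
    trace.append(("ACK", u + 1, w + 1, client, server))
    server = "CLOSED"  # as soon as the final ACK arrives

    client = "CLOSED"  # only after waiting 2 x MSL
    return trace, (client, server)


trace, final_states = four_way_wave(u=100, k=300, w=500)
```

Here w is larger than k because the server sent 200 more bytes between the second and third waves.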

10. Why does connecting take a three-way handshake, while closing takes a four-way wave?

After receiving a FIN packet from the client, the server may still have some data to transmit. Therefore, the server does not close the connection immediately but replies with an ACK packet.

The server may continue to send data. After the data is sent, the server sends a FIN packet to the client to close the connection. The server’s ACK and FIN are typically sent separately, resulting in an extra wave, which requires four waves.

11. Why must the client wait 2MSL in the TIME_WAIT state?

There are two main reasons:

  1. To make sure the final ACK reaches the server so that the server can close the connection properly.

    The client’s ACK in the fourth wave may be lost before it reaches the server, in which case the server times out and retransmits its FIN/ACK segment. If the client had already closed the connection, it could not respond to this retransmission, so the server would never receive the ACK and could not close the connection normally.

    MSL is the maximum time a segment can survive in the network. By waiting 2MSL (up to 1 MSL for the ACK to time out, plus up to 1 MSL for the server’s retransmitted FIN to arrive), the client can receive any FIN/ACK segment retransmitted by the server. The client then retransmits the ACK and restarts the 2MSL timer, which ensures that the server can close properly.

    If the server’s retransmitted FIN does not reach the client within 2MSL, the server keeps timing out and retrying until it gives up and drops the connection.

  2. To prevent stale segments from an old connection from appearing in a later connection.

    TCP requires that the same sequence number not be reused within 2MSL. After sending the last ACK, the client waits 2MSL, which ensures that all segments generated during this connection have disappeared from the network. This prevents old segments from turning up in the next connection, and any that do arrive are not processed.

12. What if the client fails after the connection is established?

Or, what if a packet of the three-way handshake or four-way wave is lost, for example the FIN retransmitted by the server?

In short, TCP uses timers and timeout retries to try to get a confirmation, until it finally drops the connection automatically.

Specifically, TCP has a keepalive timer. The server resets this timer, usually set to 2 hours, every time it receives data from the client. If the server receives no data from the client for 2 hours, it starts probing: it sends a probe segment every 75 seconds. If the client still does not respond after 10 probes, the server considers the connection broken and closes it.
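
In application code, keepalive is switched on per socket. The sketch below uses Python’s standard `socket` module; `enable_keepalive` is a hypothetical helper, and its defaults simply mirror the numbers in the text (2 hours, 75 seconds, 10 probes). The three tuning options are platform-specific, so the helper applies each one only if the running Python exposes it.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 7200,
                     interval: int = 75, count: int = 10) -> None:
    """Turn on TCP keepalive and tune its timers where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # idle: seconds of silence before the first probe
    # interval: seconds between probes
    # count: unanswered probes before the connection is dropped
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)  # absent on some platforms
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Without the per-socket tuning, the operating system’s global keepalive defaults apply.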

13. What are the consequences of too many TIME_WAIT connections? How can they be handled?

On the server side, if a large number of client connections are closed in a short period, a large number of TIME_WAIT connections appear on the server and consume significant server resources. In this case, some clients may be unable to connect.

On the client side, too many TIME_WAIT connections occupy port resources. Since there are only 65536 ports, once they are used up new connections cannot be created.

Solutions:

  • The server can set the SO_REUSEADDR socket option, which tells the kernel to go ahead and reuse the port even if it is busy (in the TIME_WAIT state). This does not prevent TIME_WAIT itself, but it lets the server bind the port again.

  • Modify kernel parameters in the /etc/sysctl.conf file, namely net.ipv4.tcp_tw_reuse and tcp_timestamps:

    net.ipv4.tcp_tw_reuse = 1 enables reuse: TIME_WAIT sockets may be reused for new TCP connections. The default is 0 (disabled).

    net.ipv4.tcp_tw_recycle = 1 enables fast recycling of TIME_WAIT sockets in TCP connections. The default is 0 (disabled). Note that this option was removed in Linux 4.12.
  • Force-close the connection by sending an RST packet, so that it enters the CLOSED state directly instead of TIME_WAIT.
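
The first and last options are set per socket. The following is a minimal sketch with Python’s standard `socket` module; `make_listener` and `close_with_rst` are hypothetical helper names, and the `struct.pack("ii", ...)` layout assumes the POSIX `struct linger` with two C ints.

```python
import socket
import struct


def make_listener(port: int = 0) -> socket.socket:
    """Create a listening socket that may rebind a port still in TIME_WAIT."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))  # port 0: let the OS choose
    listener.listen()
    return listener


def close_with_rst(conn: socket.socket) -> None:
    """Abort a connection: linger enabled with timeout 0 makes close() send RST."""
    # struct linger { int l_onoff; int l_linger; } on POSIX systems
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()
```

Closing with RST discards any unsent data and skips the four-way wave, so it should be reserved for connections whose remaining data does not matter.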

14. Is TIME_WAIT a server state or a client state?

TIME_WAIT is the state entered by whichever side actively closes the connection. In general this is the client, since the server usually does not close the connection first.

TIME_WAIT lasts for 2MSL. With a large number of short-lived connections, there can be too many TIME_WAIT connections, consuming a lot of system resources. For servers, the problem can be reduced to some extent by enabling KeepAlive in HTTP (the browser reuses one TCP connection for multiple HTTP requests) and letting the browser close the connection.

15. How does TCP ensure reliability?

TCP achieves reliable transmission mainly through checksums, sequence numbers and acknowledgements, timeout retransmission, sliding windows, congestion control, and flow control.

  • Checksum: Using the checksum, the receiver can detect whether the data has been corrupted. If so, it discards the TCP segment, and the sender retransmits it.

  • Sequence numbers and acknowledgements:

    Sequence numbers are not only used for acknowledgements. They also let the receiver put received data in order and discard data with duplicate sequence numbers.

    During TCP transmission, every time the receiver receives data, it sends an acknowledgement to the sender: an ACK segment carrying the corresponding acknowledgement number, which tells the sender what data has been received and where the next data should start.

  • Sliding window: The sliding window both improves transmission efficiency and keeps the sender from sending so much data that the receiver cannot process it.

  • Timeout retransmission: The sender measures the time from sending a segment to receiving its acknowledgement. If the acknowledgement has not arrived when the timeout expires, the segment is considered lost and is retransmitted. The timeout value is calculated dynamically.

  • Congestion control: During data transmission, network conditions may cause congestion. A congestion control mechanism is introduced to preserve TCP’s reliability while improving performance.

  • Flow control: If host A keeps sending data to host B regardless of B’s capacity to receive, B’s receive buffer may fill up and be unable to accept more data. Many packets are then lost, triggering retransmission. If B’s receive buffer has not drained in the meantime, much time is wasted on retransmissions, lowering transmission efficiency. Flow control is therefore introduced: host B tells host A the size of its receive buffer, and thereby controls how much data A sends. Flow control relies on the window size field in the TCP header.
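
As an illustration of the first mechanism, the checksum arithmetic is a 16-bit one’s-complement sum. The sketch below shows only that core calculation; `internet_checksum` is a name chosen for this example, and a real TCP checksum additionally covers a pseudo-header with the IP addresses, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)  # 0x220d for this payload
# The receiver recomputes over payload plus checksum; 0 means "no error found".
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
```

If any bit of the payload is flipped in transit, the receiver’s recomputed value is almost always nonzero, and the segment is discarded.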

16. Tell me more about TCP sliding windows.

When the data to transmit is large, it has to be split into multiple packets. If TCP sent the next packet only after the previous one had been acknowledged, much time would be wasted waiting for acknowledgements.

To avoid this, TCP introduces the concept of a window. The window size is the maximum number of packets that can be sent without waiting for an acknowledgement.

Packets that have been sent and acknowledged lie to the left of the sliding window, and packets whose turn to be sent has not yet come lie to its right.

The sliding window itself has two parts: packets that have been sent but not yet acknowledged, and packets in the window waiting to be sent. As sent packets are acknowledged, the waiting packets in the window are sent in turn, and the whole window slides to the right, letting packets that were still waiting outside enter the window.

The sliding window therefore acts as a rate limiter: its current size determines the current TCP sending rate, and that size is the smaller of the congestion window and the flow control window.
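
The sliding behaviour can be traced with a toy simulation. This sketch makes simplifying assumptions that are not in the text: acknowledgements arrive one at a time and in order, nothing is lost, and the window size is fixed. `sliding_window_trace` is a hypothetical name.

```python
from collections import deque


def sliding_window_trace(num_packets: int, window_size: int):
    """Return the order of send and ack events for a fixed-size window."""
    in_flight = deque()  # sent but not yet acknowledged
    next_to_send = 0     # first packet still waiting to the right of the window
    events = []
    while next_to_send < num_packets or in_flight:
        # Send as long as the window has room for another unacknowledged packet.
        while next_to_send < num_packets and len(in_flight) < window_size:
            in_flight.append(next_to_send)
            events.append(("send", next_to_send))
            next_to_send += 1
        # The oldest packet is acknowledged, so the window slides right by one.
        events.append(("ack", in_flight.popleft()))
    return events


events = sliding_window_trace(num_packets=4, window_size=2)
```

With a window of 2, packets 0 and 1 go out back to back, and each acknowledgement then lets exactly one more packet enter the window.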

17. Tell me more about congestion control.

TCP uses four algorithms to achieve congestion control:

  • Slow start;

  • Congestion avoidance;

  • Fast retransmit;

  • Fast recovery.

The sender maintains a state variable called the congestion window, cwnd. When cwnd < ssthresh (the slow start threshold), the slow start algorithm is used; when cwnd > ssthresh, the congestion avoidance algorithm is used instead.

Slow start: Do not send a large amount of data at first. Start with a small congestion window and increase it gradually, doubling it every round trip.

Congestion avoidance: The congestion window grows slowly: every round-trip time (RTT), the sender’s congestion window cwnd increases by one instead of doubling, so it grows linearly.

Fast retransmit: Retransmitting lost segments early avoids unnecessary timeouts and improves network throughput. The receiver must send a duplicate acknowledgement immediately when it receives an out-of-order segment, rather than waiting to piggyback it on its own outgoing data. When the sender receives three duplicate acknowledgements in a row, it immediately retransmits the segment the receiver is missing, without waiting for the retransmission timer to expire.

Fast recovery: Used together with fast retransmit. When the sender receives three duplicate acknowledgements in a row, it performs “multiplicative decrease” and halves the ssthresh threshold (to guard against network congestion), but it does not then run slow start. If the network were congested, several duplicate acknowledgements would not have arrived, so receiving three of them indicates that the network is still in reasonable condition. Instead, the sender sets cwnd to the new ssthresh and continues with congestion avoidance.
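
How these rules shape the congestion window can be seen in a small simulation. This is a textbook-style sketch under stated assumptions: cwnd is counted in segments, it starts at 1, and the rounds in which three duplicate acknowledgements arrive are passed in by hand. `simulate_cwnd` and its parameters are hypothetical names.

```python
def simulate_cwnd(rounds: int, ssthresh: int, triple_dup_ack_at=()):
    """Return cwnd at the start of each round trip."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in triple_dup_ack_at:
            # Fast retransmit + fast recovery: halve the threshold,
            # then continue from it instead of restarting slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 per RTT
    return history


history = simulate_cwnd(rounds=8, ssthresh=8, triple_dup_ack_at={5})
# history == [1, 2, 4, 8, 9, 10, 5, 6]
```

The window doubles up to the threshold of 8, grows by one per round trip after that, and drops from 10 to 5 when the duplicate acknowledgements arrive in round 5.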

Shoulders of giants

segmentfault.com/a/119000002…

juejin.cn/post/684490…

www.nowcoder.com/discuss/568…

blog.csdn.net/yrx420909/a…

www.cnblogs.com/xiaolincodi…

imageslr.com/2020/07/07/…

cloud.tencent.com/develo