preface
Recently, I have been preparing for interviews and reviewing a lot of topics (programming languages, distributed systems, networking, algorithms, databases, message-oriented middleware, etc.), which I am sharing gradually
First article of the series: back-end interview summary – Web. Author: will play code
The markdown documentation is available here
tcp/ip
reliability
The checksum
The TCP header contains a two-byte checksum. If a packet with an incorrect checksum is received, TCP discards the packet and waits for retransmission
The serial number
The TCP header of each packet carries a 4-byte sequence number, which is used to solve out-of-order and duplication problems (packets are put back into the correct order by sequence number before being delivered to the application layer; packets with a duplicate sequence number are discarded)
Serial number winding
> Cause: the sequence number is 4 bytes, so after more than 2^32 bytes have been transmitted the sequence number wraps around, and a later packet's sequence number can become smaller than an earlier one's
"> < p style =" max-width: 100%; clear: bothCopy the code
The retransmission mechanism
- When will it be retransmitted?
After data packets are sent, a timer is started. If no ACK confirmation is received from the peer end within a certain period of time (timeout retransmission time), the device retransmits data, which is called timeout retransmission
> When the sender receives three or more duplicate ACK packets, it concludes that a packet was lost and retransmits it immediately. This is called fast retransmission
- The retransmission times
The maximum number of retransmissions is determined by /proc/sys/net/ipv4/tcp_retries2, but the actual number of retransmissions depends on the RTT size of the network
The retransmission interval adopts the exponential backoff algorithm, that is, the retry wait interval becomes longer and longer
TCP computes a timeout ceiling from tcp_retries2. If the interval between the first transmission of a packet and now exceeds this ceiling, the packet is dropped and never retransmitted again. So with a small RTT, the number of retransmissions is effectively determined by tcp_retries2: after retries2 retransmissions the ceiling is exceeded and the packet is dropped. With a large RTT, the ceiling can be reached before retries2 retransmissions have happened, and the packet is dropped earlier
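The exponential backoff schedule can be sketched as below, assuming an illustrative initial RTO of 0.2 s, the common default tcp_retries2 = 15, and Linux's 120 s cap on any single RTO:

```python
RTO_MAX = 120.0  # Linux caps each retransmission timeout at TCP_RTO_MAX (120 s)

def retransmission_schedule(initial_rto: float, retries: int) -> list:
    """Each retry waits twice as long as the previous one, up to RTO_MAX."""
    rto, waits = initial_rto, []
    for _ in range(retries):
        waits.append(rto)
        rto = min(rto * 2, RTO_MAX)
    return waits

schedule = retransmission_schedule(0.2, 15)
# 0.2, 0.4, 0.8, ... doubling until pinned at 120 s; total ≈ 13 minutes
print(round(sum(schedule), 1))
```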
- SACK
The receiver uses SACK to record the serial number range of the received packets, and the sender can know which packets need to be retransmitted
Flow control
- Why flow control
Packets arrive in the receive buffer at the receiving end, and the application reads data out of that buffer. If the application processes data too slowly, the receive buffer fills up. The sender has to know about this situation and wait until the receiver's buffer has free space again before sending more data
- All kinds of window
Receiving window
After receiving a packet, the receiver puts the remaining free space of its receive buffer into the ACK message; this value is the receive window
Send window/slide window
From the perspective of the sender, packets can be divided into four types according to the sending status and confirmation status:
- Data that has been sent and confirmed
- Data that has been sent but not yet confirmed
- Data that is not sent but can be received by the receiver (indicates that the data is already in the send buffer and the receiver has been told that there is room to receive the data)
- Unsent data that the receiver cannot accept (data beyond the receiver's remaining capacity)
Send window = Region 2 + Region 3
- Control process
During the three-way handshake, the two parties inform each other of the size of the receiving window (that is, the size of area 2+ area 3 is confirmed). During transmission, the sender slides the sending window according to the acknowledgement number in the received ACK message and the current receiving window, and adjusts the amount of data to be sent next time
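The sender-side bookkeeping above can be sketched as: the usable window (region 3) is the advertised receive window minus the data already in flight. The names below are illustrative, loosely following the classic SND.UNA/SND.NXT variables:

```python
def usable_window(snd_una: int, snd_nxt: int, rwnd: int) -> int:
    """Bytes the sender may still transmit (region 3 in the text).
    snd_una: oldest unacknowledged byte; snd_nxt: next byte to send;
    rwnd: receive window most recently advertised by the peer."""
    return snd_una + rwnd - snd_nxt

# 100 bytes acked, 150 sent so far, receiver advertised a 200-byte window:
print(usable_window(100, 150, 200))  # 150 bytes still usable
```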
Congestion control
- Why do we need congestion control
To avoid congesting the network, the sender must also limit how much it sends. If the congestion window is smaller than the receive window, the sender may transmit at most a congestion window's worth of bytes before waiting for acknowledgement; conversely, if the receive window is smaller, the receive window is the limit. In other words, the data in flight is bounded by min(congestion window, receive window)
- Different congestion scenarios
The receiver gets out-of-order packets, and the sender fast-retransmits the missing one
- Slow start: after the three-way handshake, the peer's receive window size is learned from the ACK, and each side initializes its congestion window (the default initial congestion window is 10 MSS). For each ACK received, cwnd += 1, so after one RTT cwnd has doubled
- Congestion avoidance: there is a slow start threshold (ssthresh); once cwnd > ssthresh, cwnd increases by only 1 per RTT
- Fast recovery: set ssthresh to half the current cwnd (ssthresh = cwnd/2), set cwnd to ssthresh, and let the congestion window grow linearly from there
The network is congested, and the sender receives no ACK
- Slow start
- Congestion avoidance
- Set ssthresh to half the current cwnd (ssthresh = cwnd/2), set cwnd = 1, and restart the slow start and congestion avoidance algorithms
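The per-RTT growth rules above (double below ssthresh, +1 above it) can be sketched as:

```python
def cwnd_after_rtts(cwnd: int, ssthresh: int, rtts: int) -> int:
    """Per RTT: cwnd doubles in slow start (below ssthresh),
    otherwise grows by 1 MSS in congestion avoidance."""
    for _ in range(rtts):
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return cwnd

# Starting at the default 10 MSS with ssthresh = 64:
# 10 -> 20 -> 40 -> 80 -> 81 -> 82
print(cwnd_after_rtts(10, 64, 5))  # 82
```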
Various concepts
MTU and MSS
The MTU is a link-layer concept: the size of packets on the wire is limited by the Ethernet frame size (the maximum frame is 1518 bytes and the minimum 64 bytes, leaving 1500 bytes of payload, the typical MTU)
Because of the MTU, the sender must limit packet size and split larger payloads into pieces. Among the TCP/IP layers, only the transport layer has a retransmission mechanism; if a piece is lost or corrupted in transit, TCP retransmission guarantees the receiver still gets the complete data, so segmentation should happen at the transport layer. In other words, the segment size the TCP layer sends is also bounded by the MTU. That bound is the Maximum Segment Size (MSS), the largest payload TCP can send at once: MSS = MTU - IP header - TCP header
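The MSS arithmetic, using typical IPv4 values (20-byte IP and TCP headers without options):

```python
MTU = 1500        # typical Ethernet payload size
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options

MSS = MTU - IP_HEADER - TCP_HEADER
print(MSS)  # 1460
```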
MSL (Maximum Segment Lifetime) and TTL
MSL is the maximum lifetime of TCP packets on the network. This value is closely related to the TTL field of the IP packet header. The TTL is the maximum number of routes that an IP packet can pass through. For each router, the TTL decreases by 1. When the TTL decreases to 0, the IP packet is discarded
The socket option
SO_LINGER (affecting the behavior of close calls)
- L_onoff =0 (Disable linger feature (default))
Close () returns immediately
If any data remains in the send buffer, the system first tries to send it to the peer, and then tears the connection down with the normal four-way wave
- L_onoff =1 (Enable linger feature)
l_linger=0
- Close () returns immediately
- Discard the data in the buffer and send the RST to the peer end, so the peer end may only receive part of the data
l_linger=xxx
- Close () does not return immediately and blocks for a maximum of XXX time
- The maximum waiting time is XXX. If the data transmission is not complete after the timeout, the system sends an RST to the peer end to disconnect and discards the data in the buffer
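A minimal sketch of setting SO_LINGER from Python; the `struct linger` layout (two C ints, `l_onoff` and `l_linger`) matches Linux:

```python
import socket
import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# l_onoff=1, l_linger=0: close() discards buffered data and sends an RST
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
onoff, linger = struct.unpack(
    "ii", s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
s.close()
print(onoff, linger)  # 1 0
```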
SO_REUSEADDR (Port reuse)
After the server disconnects, the connection is only released after 2MSL. During that period, restarting the service fails with "Address already in use". Enabling SO_REUSEADDR removes this restriction
SO_REUSEADDR applies to both FIN_WAIT2 and TIME_WAIT connections. It is important to note that since the FIN_WAIT2 state also allows port reuse, the restarted server may receive unexpected data
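A minimal sketch of SO_REUSEADDR in Python; binding to port 0 simply lets the kernel pick a free port for the demo:

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow bind() to succeed even if a previous socket on the same
# address is still in TIME_WAIT.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
reuse = srv.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
srv.close()
print(reuse != 0)  # True
```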
SO_REUSEPORT
role
- Allow multiple processes to listen on the same port, load balancing at the kernel level, and avoid stampede effects in multi-process applications (multiple processes wake up when a new connection request is made, but only one process ends up processing the request)
- Implementing rolling updates
The kernel stores sockets in the LISTEN state in a hash table with 32 buckets; ports with the same hash land in the same bucket, where the sockets for the different ports are kept in a linked list. When a request arrives, the kernel first locates the hash bucket, then walks the list scoring each socket, and picks the highest-scoring socket to handle it
(Linux kernel < 4.5) For sockets with SO_REUSEPORT enabled, the list walk finds several sockets tied for the highest score, and one of them is chosen at random
(Linux kernel >= 4.5) Because walking the list for every request is inefficient, SO_REUSEPORT groups were introduced: after the first match, a second hash locates the corresponding group, and one socket in the group is chosen to handle the request
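A minimal sketch of SO_REUSEPORT: two sockets bind and listen on the same port, which only succeeds because both set the option (Linux >= 3.9; the constant may be absent on other platforms):

```python
import socket

a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
a.bind(("127.0.0.1", 0))        # port 0: let the kernel pick a port
port = a.getsockname()[1]

b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
b.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
b.bind(("127.0.0.1", port))     # succeeds only because both set SO_REUSEPORT
a.listen()
b.listen()                      # two listeners; the kernel balances new connections
bport = b.getsockname()[1]
a.close()
b.close()
print(port, bport)
```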
Various features
Nagle algorithm (reduce frequent sending of small packets to the peer)
The principle of
- For the first time, the packet is immediately sent to the peer end without waiting
- Subsequent data may only be sent when certain conditions are met
- If there are sent-but-unacknowledged packets, TCP buffers the data to be sent, and transmits it once the buffered data reaches the MSS
- If all previously sent packets have been acknowledged, the data is sent immediately
Advantages and disadvantages
Advantages: In the case of high network latency, it can effectively avoid the transmission of a large number of packets on the network and improve bandwidth utilization (each packet has more valid data).
Disadvantages: because small packets may be held back and sent together, client data transmission is delayed, so Nagle's algorithm is unsuitable for interactive applications that need real-time echo (such as SSH)
It is enabled by default; it can be disabled per socket by setting TCP_NODELAY
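Disabling Nagle is a one-line setsockopt on the socket; a minimal sketch:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle: small writes go out immediately instead of being
# coalesced while earlier segments are unacknowledged.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
s.close()
print(nodelay != 0)  # True
```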
When will an ACK reply be received after receiving the packet?
Delayed acknowledgment
After receiving the data packet, the local end does not return an ACK immediately, but waits for a period of time for confirmation. If there is data to be sent to the peer end during this period, the local end sends an ACK along with the data. If there is no data to be sent to the peer end after a period of time, the local end also returns an ACK for confirmation
Scenarios where an ACK is sent immediately
- A packet larger than one frame is received and the window size needs to be adjusted
- The socket is in quickack mode (quick && not pingpong)
While an end has not sent any data, it treats the connection as not pingpong (non-interactive) by default; once it sends data, it treats the connection as interactive; and when an ACK has to be sent by the delayed-ACK timer (no data to piggyback on before the timeout), the connection is marked non-interactive again
- An out-of-order packet is received
- A second packet arrives while an ACK is still pending; an ACK is sent immediately
TCP header field parsing
Port number (source port and destination port)
The source and destination ports are two bytes each and identify the application programs. After receiving a packet, the host delivers it to the right application based on the destination port number
Reserved ports: 0 to 1023; root permission is needed to listen on these ports
Registered ports: 1024 to 49151; ordinary users can listen on these ports
Temporary (ephemeral) ports: when a client connects to a server, the system allocates a temporary port (the source port) for the connection. On Linux, the allocatable range is determined by /proc/sys/net/ipv4/ip_local_port_range. On servers that need a large number of outbound connections (such as web crawlers and forward proxies), ip_local_port_range can be raised to make more ports available
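A quick way to see ephemeral port allocation: binding to port 0 asks the kernel for a port, drawn on Linux from the ip_local_port_range pool:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))   # port 0: ask the kernel for an ephemeral port
port = s.getsockname()[1]  # on Linux, picked from ip_local_port_range
s.close()
print(port)
```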
The serial number
Each packet's TCP header carries a 4-byte sequence number, the number of the first byte in the segment. The sequence number in a SYN packet is called the initial sequence number (ISN) and is used to exchange each side's starting sequence number; the sequence numbers in all other packets are used to solve out-of-order and duplication problems
Confirmation no.
Four bytes. After receiving a packet, TCP replies with an acknowledgment number (ACK), whose value is the sequence number it expects to receive next. The acknowledgment number serves two purposes: 1. it tells the sender that every packet with a sequence number below the ACK has been received; 2. it tells the sender which sequence number to send next
TCP flags
- SYN (Synchronize): carries the initial sequence number, used to synchronize sequence numbers
- ACK (Acknowledge): acknowledges received data
- RST (Reset): forcibly tears the connection down, typically when the previously established connection is no longer usable, a packet is invalid, or the request cannot be served
- FIN (Finish): tells the peer "I have finished sending all my data and intend to disconnect; I will not send you any more data"
- PSH (Push): the received data should be delivered to the upper-layer application immediately rather than buffered
The window size
The TCP header has only 16 bits for the window size, so the largest window it can express is 65535 bytes, which is far too small for modern transfers. The "window scale" option was therefore introduced, with values 0 to 14, meaning the window is multiplied by 2^N. The actual window is therefore "window size" * 2^"window scale"
- optional
MSS: Indicates the maximum segment size that TCP allows to receive packets from the peer
SACK: Select the confirmation option
Window Scale: Window scaling option
Begin to connect
Three-way handshake
A way to reduce the performance cost of the three-way handshake
Reuse the same TCP connection to avoid repeated creation and destruction
TCP fast open (TFO, transfer data during handshake)
- The client sends a SYN packet with the Fast Open option in the header and an empty Cookie indicating that the client is requesting a Fast Open Cookie
- After the server receives the SYN packet, it generates a cookie value (a string)
- The server sends a SYN + ACK packet and sets the cookie value in the Options Fast Open option
- The client caches the server’s IP and received cookie values
After the first time, the client has cached the cookie value locally, followed by the handshake and data transfer process as follows:
1. The client sends a SYN packet containing data and a Fast Open Cookie cached locally. (Note that all SYN packets we introduced earlier cannot contain data.)
2. The server validates the received TFO cookie and the transmitted data. If they are valid, it returns a SYN + ACK packet and delivers the data to the application layer; if not, the data is discarded
3. Having received the data, the server may send response data to the client before the handshake completes
4. The client sends an ACK confirming the server's SYN and data (if any) from step 2
5. The remainder of the process is the same as a non-TFO connection
Client and server requirements
A Linux kernel that supports TFO (client-side support since 3.6, server-side since 3.7), plus net.ipv4.tcp_fastopen set in /etc/sysctl.conf,
where 1 enables the client side, 2 enables the server side, and 3 enables both
Semi-connection and full connection queues
Semi-connected queue
When a client sends a SYN to a server, the server replies with an ACK plus its own SYN, and the server-side TCP state changes from LISTEN to SYN_RCVD (SYN received); the connection information is placed in the half-connection queue. After sending SYN+ACK, the server starts a timer; if no ACK arrives before it expires, the SYN+ACK is retransmitted, with the retry count determined by the tcp_synack_retries parameter
When the half-connection queue is full, the server rejects new connection requests
Full connection queue
After the server has sent SYN+ACK and received the client's ACK, the connection is moved from the half-connection queue to the full connection queue, where it waits for the application to call accept(), which takes connections from the head of the queue
When the full connection queue is full, the server discards the ACK sent by the client. The server therefore still considers the handshake incomplete and retransmits SYN+ACK
SYN flood attack
When a client sends SYN packets with a large number of forged IP addresses, the server replies with ACK+SYN to an “unknown” IP address. These connections in SYN_RCVD fill the server’s half-connection queue, causing the server to fail to process other normal requests
Mitigation: reduce the number of SYN+ACK retries, so that these connections are flushed from the half-connection queue sooner
Use the tcp_syncookies mechanism
Principle: when the server receives a SYN packet, it does not immediately put the connection into the half-connection queue. Instead it computes a cookie value, uses it as the sequence number in the second step of the handshake, and replies with SYN+ACK. When the peer's ACK arrives, the server checks whether the acknowledged value is a valid cookie; if so, the three-way handshake has succeeded and the connection goes straight into the full connection queue
The default value of /proc/sys/net/ipv4/tcp_syncookies is 1, meaning syncookies are used once the half-connection queue is full
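A toy sketch of the syncookie idea (real kernels also encode MSS bits and a timestamp into the cookie; the hash and secret below are illustrative): the ISN is derived from the connection 4-tuple plus a secret, so the server keeps no per-connection state until the final ACK arrives:

```python
import hashlib

SECRET = b"server-secret"  # hypothetical per-boot secret

def syn_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Derive a 32-bit ISN from the 4-tuple and a secret."""
    h = hashlib.sha256(
        f"{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode() + SECRET)
    return int.from_bytes(h.digest()[:4], "big")

isn = syn_cookie("10.0.0.1", 12345, "10.0.0.2", 80)
# On the final ACK, the server recomputes the cookie and compares it with
# (ack - 1) instead of looking up a half-open connection entry.
print(isn == syn_cookie("10.0.0.1", 12345, "10.0.0.2", 80))  # True
```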
End connections
Difference between Close and shutdown
int close(int sockfd)
Close closes the data flow in both directions
In the read direction, the kernel sets the socket to unreadable, and any read operation returns an exception.
On the outgoing side, the kernel attempts to send the data in the send buffer to the peer end and then sends a FIN packet to end the connection. Writing data to the socket returns an exception during this process.
If the peer end still sends data, an RST packet is returned
⚠️ The socket keeps a reference count that is incremented for each process holding it. close() decrements the count and only actually closes the connection when the count reaches zero
int shutdown(int sockfd, int howto)
Shutdown is more elegant, allowing you to close only one direction of the connection
howto = 0 (SHUT_RD): closes the read direction of the connection; reads on the socket return EOF directly. The data in the receive buffer is discarded, and data that arrives afterwards is ACKed and then quietly dropped
howto = 1 (SHUT_WR): closes the write direction; the data in the send buffer is sent, followed by a FIN packet. Subsequent writes to the socket return an error
howto = 2 (SHUT_RDWR): both of the above; closes the connection in both directions
⚠️ shutdown does not check the socket's reference count; it always shuts the connection down
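The half-close behavior of shutdown can be demonstrated with a socketpair standing in for a TCP connection:

```python
import socket

a, b = socket.socketpair()    # connected pair standing in for client/server
a.shutdown(socket.SHUT_WR)    # howto=1: close only a's write direction
eof = b.recv(100)             # b sees EOF for a's direction...
b.sendall(b"still open")      # ...but can still send the other way
msg = a.recv(100)             # a can still read
a.close()
b.close()
print(eof, msg)
```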
The four-way wave
Why do I need to wait on TIME_WAIT
Prevent new connections (connections using the same quintuple) from receiving packets from old connections, causing data confusion
Ensure the final ACK reaches the passive closer so that it can complete its state transition (LAST_ACK -> CLOSED); if the ACK is lost, the retransmitted FIN still finds an endpoint to answer it
Why is the waiting time 2MSL
Ensure that the new connection will not receive the packet from the old connection (because the packet lives in the network at most 1MSL)
After the active closing party sends an ACK, the passive closing party must receive the packet within 1MSL under normal circumstances. Otherwise, the passive closing party will resend the FIN packet, and the active closing party will receive the packet within 1MSL, so the longest time to come and go is 2MSL
What’s the problem with too many TIME_WAIT connections
Scenario: The client disconnects and immediately reconnects to the server, causing a large number of TIME_WAIT states on the client
Impact: Insufficient temporary ports on the client (many ports are in TIME_WAIT)
Scenario: The server disconnects, then the client immediately reconnects, and so on, causing a large number of TIME_WAIT connections on the server
Impact:
- Server: Occupies the memory and CPU of the server
- Client: Insufficient temporary ports on the client (a large number of ports correspond to connected servers in TIME_WAIT)
TIME_WAIT scenarios and solutions
When a client disconnects, nginx also disconnects from the back-end service, producing a large number of TIME_WAIT sockets on the nginx machine
Adjust net.ipv4.ip_local_port_range to increase the number of temporary ports
Use connection pooling to connect to back-end services
Add more nginx machines
Configure more local IP addresses on the nginx machine
Enable the tcp_tw_reuse parameter on the nginx machine
The server disconnects actively, resulting in a large number of TIME_WAIT sockets on the server
Enable the tcp_tw_recycle parameter (with caution! Make sure no clients are behind NAT)
Related tuning parameters
net.ipv4.tcp_timestamps
It is a 10-byte TCP header option consisting of kind, length, a send timestamp (TSval), and an echo timestamp (TSecr)
Both ends must enable it for it to work; whether it is used is negotiated in the SYN packets of the three-way handshake
When the sender transmits data, it puts the send time into the TSval field
After receiving the data, the receiver copies the received TSval into the TSecr of its reply and puts its own timestamp into TSval
net.ipv4.tcp_tw_reuse
Requires net.ipv4.tcp_timestamps to be enabled
For clients that actively disconnect; enable it on the client machine
After a TIME_WAIT connection is reused, its timestamp baseline is refreshed, and packets carrying a timestamp older than the new connection's are discarded (this solves the problem of a new connection receiving the old connection's leftover data)
Process of reusing a TIME_WAIT connection (call the side in TIME_WAIT that initiates the new connection A, and its peer B)
Case 1: the old connection's final ACK is not lost but still in transit, so B has not received it yet and is in LAST_ACK
- 1. A sends a SYN packet and enters SYN_SENT
- 2. B receives the old connection's ACK and enters CLOSED
- 3. A receives no reply to its SYN and retransmits the handshake
Case 2: the old connection's final ACK is lost, so B has not received it and is in LAST_ACK
- 1. A sends a SYN packet and enters SYN_SENT
- 2. B, having received no ACK, retransmits its FIN packet
- 3. A replies with an RST packet when it receives the FIN
Case 3: B is already in the CLOSED state
- A normal three-way handshake takes place
net.ipv4.tcp_tw_recycle
Requires net.ipv4.tcp_timestamps to be enabled
For servers that actively disconnect; enable it on the server machine
Once enabled, TCP quickly reclaims connections in TIME_WAIT and records the timestamp of the last packet received; if a packet with an earlier timestamp later arrives on such a connection, TCP discards it directly
⚠️ If the server sits behind NAT or a load balancer, then from the server's point of view a single IP (e.g. the load-balancer proxy) establishes a large number of connections with it, and the proxy is likely to reuse a source port whose previous connection is still in TIME_WAIT on the server. If the new connection's timestamp is earlier than the one the server recorded (client clocks are not perfectly synchronized), connection establishment fails
http/https
HTTP version evolution
www.ruanyifeng.com/blog/2016/0…
Strong and negotiated caching
Strong cache
Without accessing the server, use the local disk/memory resource cache directly
Negotiate the cache
Access the server, which decides whether to use the cache on the browser
process
- On the first request for a resource, the server returns in the response headers: Expires (HTTP/1.0, an absolute point in time in GMT format), Cache-Control (HTTP/1.1, the cache policy), Last-Modified (HTTP/1.0, the resource's last modification time), and ETag (HTTP/1.1, a hash of the resource)
expires/cache-control
- Expires is a 1.0 specification; because it records an absolute time, it can lead to cache confusion when client and server clocks are out of sync
- Cache-control is a 1.1 specification
Max-age: the cache validity period, expressed as a relative time
Cache policy:
- No-cache Does not use local cache. A negotiated cache is required.
- No-store directly disallows the browser to cache data and asks the server for the full resource each time it requests a resource
- Public can be cached by all users, including end users and middleware proxy servers such as CDN
- Private can only be cached by the end user’s browser
Priority: Cache-Control > Expires
Last-Modified/Etag
- Last-Modified is a 1.0 specification that records when a resource was last modified; it has the following drawbacks
- Sometimes the modification time changes while the content itself does not, and in that case you don't want users to pull the resource again
- Last-Modified has one-second granularity, so if a resource changes more than once within a second the change cannot be detected
- Sometimes the last modification time of the resource is not available
- Etag is a 1.1 specification that records the hash value of a resource
Priority: ETag > Last-Modified
- On a later request for the resource, the browser first checks the local cache against the stored Cache-Control/Expires
- If the local cache is no longer fresh, the browser sends a request to the server with If-Modified-Since set to the stored Last-Modified and If-None-Match set to the stored ETag
- The server checks If-None-Match/If-Modified-Since and returns 304 if the resource has not changed; if it has changed, the resource data is returned
Priority: If-None-Match > If-Modified-Since
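A toy sketch of the server-side check, with If-None-Match taking priority over If-Modified-Since (the function and its parameters are illustrative, not a real framework API):

```python
def conditional_status(resource_etag, resource_mtime,
                       if_none_match=None, if_modified_since=None):
    """Return 304 if the client's cached copy is still valid, else 200.
    If-None-Match is checked first; If-Modified-Since only as a fallback."""
    if if_none_match is not None:
        return 304 if if_none_match == resource_etag else 200
    if if_modified_since is not None:
        return 304 if resource_mtime <= if_modified_since else 200
    return 200

# ETag matches, so 304 wins even though the mtime check alone would say 200:
s1 = conditional_status("v2", 100, if_none_match="v2", if_modified_since=50)
# ETag differs -> full resource (200), regardless of the time header:
s2 = conditional_status("v2", 100, if_none_match="v1", if_modified_since=100)
print(s1, s2)  # 304 200
```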
HTTPS connection establishment process
process
- The client obtains the server's certificate
- The client generates a random number, encrypts it with the public key from the certificate, and sends the result to the server
- The server decrypts it with its private key to recover the random number
- The client and server then use keys derived from the random numbers for symmetric encryption and decryption
Afterword
If you liked this article, feel free to follow the public account "will play code", which focuses on sharing practical technology in plain language