Preface

Recently I have been preparing for interviews and reviewing many topics (programming languages, distributed systems, networking, algorithms, databases, message-oriented middleware, etc.), which I will share gradually

First article: Back-end Interview Summary - Networking. Author: "will play code"

The Markdown version of this document is available here

TCP/IP

Reliability

Checksum

The TCP header contains a two-byte checksum. If a packet with an incorrect checksum is received, TCP discards the packet and waits for retransmission

Sequence number

The TCP header of each packet carries a 4-byte sequence number, which is used to solve out-of-order delivery and duplication: packets are ordered by sequence number before being delivered to the application layer, and packets with duplicate sequence numbers are discarded

Sequence number wraparound

Cause: the sequence number is only 4 bytes, so after more than 2^32 bytes have been transmitted it wraps around, and the sequence number of a later packet can be smaller than an earlier one.

Solution: the TCP timestamp option (PAWS) lets the receiver use timestamps to tell wrapped-around sequence numbers apart from old packets.

Retransmission mechanism

  • When is data retransmitted?

After a packet is sent, a timer is started. If no ACK arrives from the peer within a certain period (the retransmission timeout), the data is sent again - this is timeout retransmission

When the sender receives three or more duplicate ACKs, it assumes a packet was lost and retransmits immediately - this is fast retransmission
  • Retransmission count

The maximum number of retransmissions is set by /proc/sys/net/ipv4/tcp_retries2, but the actual number also depends on the RTT of the network

The retransmission interval uses an exponential backoff algorithm: each retry waits longer than the last

TCP computes a timeout ceiling from tcp_retries2. If the interval between a packet's first transmission and the present exceeds this ceiling, the packet is dropped and no longer retransmitted. So with a small RTT the retransmission count is effectively determined by tcp_retries2: after retries2 attempts the ceiling is exceeded and the packet is dropped. With a large RTT, the ceiling may be reached, and the packet dropped, before tcp_retries2 retransmissions have occurred

  • SACK

The receiver uses SACK to report the sequence-number ranges it has already received, so the sender knows exactly which segments need to be retransmitted
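As a toy illustration of this bookkeeping (the function and its arguments are invented for this sketch, not a real kernel API), a sender could compute the ranges to retransmit like this:

```python
def segments_to_retransmit(sent, acked_upto, sack_blocks):
    """Return the (start, end) byte ranges that are neither cumulatively
    ACKed nor covered by a SACK block (a simplified model)."""
    missing = []
    for start, end in sent:
        if end <= acked_upto:
            continue  # already covered by the cumulative ACK
        if any(s <= start and end <= e for s, e in sack_blocks):
            continue  # selectively acknowledged by the receiver
        missing.append((start, end))
    return missing

# Receiver got bytes 0-999 and 2000-2999; the middle segment was lost.
print(segments_to_retransmit(
    [(0, 1000), (1000, 2000), (2000, 3000)],
    acked_upto=1000,
    sack_blocks=[(2000, 3000)],
))  # [(1000, 2000)]
```

Without SACK, the sender would only know that everything from byte 1000 onward is unacknowledged and might retransmit the already-received segment as well.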

Flow control

  • Why flow control is needed

Packets arrive in the receiver's receive buffer, and the application reads data out of that buffer. If the application processes data slowly, the receive buffer can fill up. The sender has to learn about this and wait until the receive buffer has free space again before sending more data

  • Types of windows

Receive window

After receiving a packet, the receiver advertises the remaining free space in its receive buffer in the ACK message - this is the receive window

Send window / sliding window

From the perspective of the sender, packets can be divided into four types according to the sending status and confirmation status:

  1. Data that has been sent and confirmed
  2. Data that has been sent but not yet confirmed
  3. Data not yet sent that the receiver can accept (already in the send buffer, and the receiver has advertised room for it)
  4. Data not yet sent that the receiver cannot accept (it exceeds the receiver's advertised capacity)

Send window = Region 2 + Region 3

  • Control process

During the three-way handshake, the two sides tell each other the size of their receive windows (fixing the size of region 2 + region 3). During transmission, the sender slides the send window according to the acknowledgment number in each received ACK and the currently advertised receive window, adjusting how much data it sends next
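The relationship between the regions can be checked with simple arithmetic on the sender's byte-stream markers (snd_una and snd_nxt are the conventional names for the region 1/2 and 2/3 boundaries; the numbers here are made up):

```python
snd_una = 100   # first unacknowledged byte (boundary of regions 1 and 2)
snd_nxt = 300   # next byte to be sent      (boundary of regions 2 and 3)
rwnd    = 400   # receive window advertised by the peer

sent_unacked   = snd_nxt - snd_una           # region 2
can_still_send = snd_una + rwnd - snd_nxt    # region 3

send_window = sent_unacked + can_still_send
print(send_window)  # 400: regions 2 + 3 add up to the advertised window
```

When an ACK arrives, snd_una moves right and the whole window slides forward.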

Congestion control

  • Why congestion control is needed

To avoid congesting the network, the sender also limits how much it sends: the effective limit is the smaller of the congestion window and the receive window. If the congestion window is smaller than the receive window, the sender can have at most a congestion window's worth of bytes in flight before waiting for acknowledgment; conversely, if the receive window is smaller, the receive window is the limit

  • Different congestion scenarios

When the receiver gets out-of-order packets and the sender fast-retransmits (duplicate ACKs):

  1. Slow start: after the three-way handshake, the peer's receive window size is learned from the ACK, and each side initializes its congestion window (the default initial cwnd is 10 MSS). For every ACK received, cwnd increases by 1, so after one RTT cwnd has doubled
  2. Congestion avoidance: there is a slow-start threshold (ssthresh); once cwnd > ssthresh, cwnd increases by only 1 per RTT
  3. Fast recovery: set ssthresh to half the current cwnd (ssthresh = cwnd/2), set cwnd to ssthresh, and let the congestion window grow linearly

When the network is congested and the sender receives no ACK at all (timeout):

  1. Slow start
  2. Congestion avoidance
  3. On timeout: set ssthresh to half the current cwnd (ssthresh = cwnd/2), reset cwnd to 1, and restart the slow-start and congestion-avoidance algorithms

Various concepts

MTU and MSS

The MTU is a link-layer concept: the size of packets on the network is limited by the Ethernet frame size, which is at most 1518 bytes and at least 64 bytes

Because of the MTU, the sender must limit packet size, splitting what would be a larger packet into segments. In the TCP/IP stack only the transport layer has a retransmission mechanism, so if a segment is lost or corrupted in transit, TCP retransmission ensures the receiver still gets the complete data - therefore segmentation should be done at the transport layer. In other words, the size of segments sent by TCP is also constrained by the MTU; that limit is the Maximum Segment Size (MSS), the largest amount of data TCP can send in one segment: MSS = MTU - IP header - TCP header
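Plugging in typical Ethernet numbers (assuming no IP or TCP options):

```python
MTU        = 1500  # Ethernet payload limit (1518-byte frame minus 18 bytes of framing)
IP_HEADER  = 20    # IPv4 header without options
TCP_HEADER = 20    # TCP header without options

MSS = MTU - IP_HEADER - TCP_HEADER
print(MSS)  # 1460
```

1460 bytes is the MSS you will usually see advertised in the SYN of a connection over plain Ethernet.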

MSL (maximum segment lifetime) and TTL

MSL is the maximum lifetime of a TCP segment on the network. It is closely related to the TTL field of the IP header: the TTL is the maximum number of routers an IP packet may traverse, each router it passes decrements the TTL by 1, and when the TTL reaches 0 the IP packet is discarded

Socket options

SO_LINGER (affects the behavior of close())

  • l_onoff = 0 (linger disabled, the default)
  1. close() returns immediately

  2. If any data remains in the send buffer, the kernel tries to deliver it to the peer, then closes the connection with the normal four-way wave

  • l_onoff = 1 (linger enabled)

l_linger=0

  1. Close () returns immediately
  2. The data in the send buffer is discarded and an RST is sent to the peer, so the peer may receive only part of the data

l_linger=xxx

  1. close() does not return immediately; it blocks for at most l_linger seconds
  2. If the buffered data has not been fully delivered when the timeout expires, an RST is sent to the peer to break the connection and the remaining data is discarded
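A minimal sketch of setting these values from Python (the packed pair mirrors the C struct linger; l_onoff=1 with l_linger=0 gives the discard-and-RST behavior described above):

```python
import socket
import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# struct linger { int l_onoff; int l_linger; } -> two native ints
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))

onoff, linger = struct.unpack(
    "ii", s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
print(onoff, linger)  # 1 0
s.close()
```

With these values, close() on a connected socket would send an RST instead of the normal FIN sequence.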

SO_REUSEADDR (Port reuse)

After the server actively disconnects, the port is not released until 2*MSL has passed; starting the service again during this period fails with "Address already in use". Enabling SO_REUSEADDR lifts this restriction

SO_REUSEADDR applies to both FIN_WAIT2 and TIME_WAIT connections. It is important to note that since the FIN_WAIT2 state also allows port reuse, the restarted server may receive unexpected data
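For example, in Python (binding to port 0 here so the kernel picks a free port; the option must be set before bind):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Without this, restarting within 2*MSL of the old connection's TIME_WAIT
# fails with "Address already in use".
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))
srv.listen()

reuse = srv.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
print(reuse != 0)  # True
srv.close()
```

Most long-running servers set this option unconditionally so that restarts are not blocked by lingering TIME_WAIT sockets.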

SO_REUSEPORT

Purpose

  1. Allows multiple processes to listen on the same port, with load balancing done in the kernel; this avoids the thundering-herd effect in multi-process servers (where a new connection wakes every process but only one ends up handling it)
  2. Enables rolling updates

The kernel stores sockets in the LISTEN state in a hash table with 32 slots; ports with the same hash value land in the same slot, where the sockets for different ports are kept in a linked list. When a request arrives, the kernel first locates the hash slot, then traverses the linked list scoring each socket, and picks the socket with the highest score to handle the request

(Linux kernel < 4.5) For sockets with SO_REUSEPORT enabled, several sockets may tie for the highest score during the traversal, and one of them is chosen at random

(Linux kernel >= 4.5) Because traversing the linked list for every request is inefficient, SO_REUSEPORT groups were introduced: after a matching socket is found, a second hash locates the corresponding group, and one socket in the group is selected to handle the request
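A minimal demonstration that two sockets can bind the same port once both set SO_REUSEPORT (requires Linux >= 3.9, and both sockets must belong to the same user):

```python
import socket

a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
a.bind(("127.0.0.1", 0))
port = a.getsockname()[1]

b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
b.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
b.bind(("127.0.0.1", port))  # succeeds only because both set the option

a.listen()
b.listen()  # the kernel now load-balances new connections between a and b
same_port = a.getsockname()[1] == b.getsockname()[1]
print(same_port)  # True
a.close(); b.close()
```

In a real pre-fork server, each worker process would create its own listening socket this way instead of inheriting one from the parent.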

Various features

Nagle's algorithm (reduces frequent sending of small packets to the peer)

Principle

  1. The first packet is sent to the peer immediately, without waiting
  2. Subsequent data is sent only when one of the following conditions holds:
    1. If there are sent-but-unacknowledged packets, TCP accumulates outgoing data in the buffer and sends only once the accumulated data reaches the MSS
    2. If every previously sent packet has been acknowledged, the data is sent immediately
Advantages and disadvantages

Advantages: in high-latency networks it avoids flooding the network with many small packets and improves bandwidth utilization (each packet carries more useful data)

Disadvantages: because packets may be batched and sent together, client data is delayed; Nagle's algorithm is unsuitable for latency-sensitive interactive applications such as SSH

Nagle's algorithm is enabled by default; it can be disabled per socket by setting the TCP_NODELAY option
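For example, disabling Nagle on a single socket in Python:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# With TCP_NODELAY set, small writes go out immediately instead of being
# coalesced while earlier data is still unacknowledged.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay != 0)  # True
s.close()
```

Interactive protocols (SSH, game servers, RPC frameworks) typically set this on every connection.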

When is an ACK sent after a packet is received?

Delayed acknowledgment

After receiving a data packet, the local end does not return an ACK immediately; it waits a short while. If it has data to send to the peer during that time, the ACK is piggybacked on that data; if the timer expires with nothing to send, a bare ACK is returned

Scenarios where an ACK is sent immediately:

  1. A packet larger than one frame is received and the window size needs to be adjusted
  2. The connection is in quickack mode (quick && not pingpong)

(By default, while an end has sent no data the connection is treated as non-interactive (not pingpong); once it sends data, the connection is treated as interactive; and if an ACK is sent because the delayed-ACK timer fired with no outgoing data, the connection is marked non-interactive again.)

  3. An out-of-order packet is received
  4. A second packet arrives while an ACK is still pending: the ACK is sent immediately

TCP header field parsing

Port number (source port and destination port)

The source and destination ports are two bytes each and identify different applications; after receiving a packet, the host dispatches it to the right application based on the destination port

Reserved (well-known) ports: 0-1023; root privileges are required to listen on them

Registered ports: 1024-49151; ordinary users can listen on them

Ephemeral ports: when a client connects to a server, the system allocates an ephemeral port as the source port for the connection. On Linux the allocatable range is set by /proc/sys/net/ipv4/ip_local_port_range; on machines that need many outgoing connections (web crawlers, forward proxies), ip_local_port_range can be enlarged to make more ports available

Sequence number

The TCP header of each packet carries a 4-byte sequence number, which is the number of the first byte in the segment. The sequence number in a SYN packet is the initial sequence number (ISN), used by the two sides to exchange their starting numbers; sequence numbers in other packets resolve out-of-order delivery and duplication

Acknowledgment number

Four bytes. After receiving a packet, TCP replies with an acknowledgment number whose value is the sequence number it expects to receive next. The acknowledgment number serves two purposes: 1. it tells the sender that all packets with sequence numbers below it have been received; 2. it tells the sender which sequence number to send next

TCP flags

  • SYN (Synchronize) : indicates the initial sequence number used to Synchronize data packets

  • ACK (Acknowledge) : Acknowledge data packets

  • RST (Reset): used to forcibly tear down a connection, typically when the previously established connection is no longer valid or an invalid packet is received

  • FIN (Finish) : Notify the peer that I have finished sending all data and intend to disconnect. I will not send you any more data packets.

  • PSH (Push) : indicates that these data packets should be sent to the upper-layer application immediately after they are received and cannot be cached

  • Window size

The TCP header has only 16 bits for the window size, so the maximum window is 65535 bytes, which is far too small for many transfers. The "window scale" option therefore introduces a scale factor, with allowed values 0-14, meaning the window is multiplied by 2^scale. The actual window is therefore "window size" * 2^"window scale"
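The arithmetic, with an illustrative scale factor of 7 (scaling is a left shift by the scale value, i.e. multiplication by 2^scale):

```python
base_window = 65535   # maximum value of the 16-bit window field
scale       = 7       # window scale option, allowed range 0-14

effective = base_window << scale   # base_window * 2**scale
print(effective)  # 8388480, roughly an 8 MB window
```

With the maximum scale of 14, the window can reach about 1 GB, which matters for high-bandwidth, high-latency links.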

  • Options

MSS: the maximum segment size this end is willing to receive from the peer

SACK: the selective acknowledgment option

Window Scale: Window scaling option

Establishing a connection

Three-way handshake

Ways to reduce the performance cost of the three-way handshake

Reuse the same TCP connection to avoid repeated creation and destruction

TCP Fast Open (TFO, transferring data during the handshake)

  1. The client sends a SYN packet with the Fast Open option in the header and an empty Cookie indicating that the client is requesting a Fast Open Cookie
  2. After the server receives the SYN packet, it generates a cookie value (a string)
  3. The server sends a SYN + ACK packet and sets the cookie value in the Options Fast Open option
  4. The client caches the server’s IP and received cookie values

After the first connection, the client has the cookie cached locally, and subsequent handshakes and data transfer proceed as follows:

  1. The client sends a SYN packet carrying data and the locally cached Fast Open cookie. (Note that the ordinary SYN packets described earlier cannot carry data.)

  2. The server validates the received TFO cookie and the transmitted data. If valid, it returns SYN+ACK and delivers the data to the application layer; if invalid, the data is discarded

  3. Having received the data, the server can send response data to the client before the handshake completes

  4. The client sends an ACK confirming the server's SYN and its data (if any)

  5. The rest of the flow is the same as a non-TFO connection

Client and server requirements

Both ends need a Linux kernel version that supports TFO (client kernel >= 3.6). Add net.ipv4.tcp_fastopen = 3 to /etc/sysctl.conf (vim /etc/sysctl.conf):

1 enables the client side, 2 enables the server side, and 3 enables both

Half-connection and full-connection queues

Half-connection queue

When a client sends a SYN to a server, the server replies with SYN+ACK and its TCP state changes from LISTEN to SYN_RCVD (SYN received); the connection information is placed in the half-connection queue. After sending SYN+ACK, the server starts a timer; if no ACK arrives before it expires, the SYN+ACK is retransmitted, with the number of retries set by the tcp_synack_retries parameter

When the half-connection queue is full, the server rejects new connection requests

Full connection queue

After the server has sent SYN+ACK and receives the client's ACK, the connection moves from the half-connection queue to the full-connection queue, where it waits for the application to call accept(), which takes connections from the head of the queue

When the full-connection queue is full, the server discards the client's ACK. From the server's point of view the handshake never completed, so it retransmits SYN+ACK

SYN flood attack

An attacker sends SYN packets from a large number of forged IP addresses; the server replies SYN+ACK to these unreachable addresses, and the resulting SYN_RCVD connections fill the server's half-connection queue, preventing it from serving normal requests

Mitigation: reduce the number of SYN+ACK retries, so stale connections are flushed from the half-connection queue sooner

Use the tcp_syncookies mechanism

Principle: when the server receives a SYN, instead of immediately placing the connection in the half-connection queue, it computes a cookie and uses it as the sequence number of the SYN+ACK in the second step of the handshake. When the peer's final ACK arrives, the server checks that the acknowledged value is consistent with the cookie; if so, the three-way handshake succeeds and the connection goes straight to the full-connection queue

The default value of /proc/sys/net/ipv4/tcp_syncookies is 1, meaning syncookies kick in only when the half-connection queue is full

Closing a connection

Difference between close and shutdown

int close(int sockfd)

close() shuts down the data flow in both directions

In the read direction, the kernel sets the socket to unreadable, and any read operation returns an exception.

In the write direction, the kernel tries to deliver the data in the send buffer to the peer and then sends a FIN to end the connection; writes to the socket during this process return an error.

If the peer end still sends data, an RST packet is returned

⚠️ The socket maintains a reference count: each process holding it adds one. close() decrements the count and only actually closes the connection when the count reaches zero; otherwise it simply reduces the count by one

int shutdown(int sockfd, int howto)

Shutdown is more elegant, allowing you to close only one direction of the connection

howto = 0 closes the read direction: reads on the socket return EOF, data already in the receive buffer is discarded, and data that arrives afterwards is ACKed and then silently discarded.

howto = 1 closes the write direction: the data in the send buffer is delivered, then a FIN packet is sent; subsequent writes to the socket return an error

howto = 2 does both, closing the connection in both directions.

⚠️ shutdown does not check the socket's reference count; it shuts the connection down regardless
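The half-close behavior is easy to observe with a connected socket pair (an AF_UNIX pair here, standing in for a TCP connection; the shutdown semantics are analogous):

```python
import socket

a, b = socket.socketpair()     # a connected pair of stream sockets
a.sendall(b"bye")
a.shutdown(socket.SHUT_WR)     # howto=1: close the write side, keep reading

first = b.recv(16)
second = b.recv(16)
print(first)   # b'bye'  buffered data is still delivered
print(second)  # b''     then EOF: the peer closed its write direction
a.close(); b.close()
```

After the shutdown, a could still recv() anything b sends back, which is exactly the half-close that close() cannot express.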

The four-way wave

Why TIME_WAIT is needed

1. It prevents a new connection (one reusing the same 5-tuple) from receiving stray packets from the old connection, which would corrupt its data

2. It ensures the passive closer can finish closing normally (LAST_ACK -> CLOSE): if the final ACK is lost, the passive closer retransmits its FIN, and the active closer, still in TIME_WAIT, can acknowledge it again

Why the wait is 2 MSL

1. It guarantees that packets from the old connection have disappeared from the network, since a packet survives at most 1 MSL

2. After the active closer sends the final ACK, the passive closer should normally receive it within 1 MSL; if not, the passive closer retransmits its FIN, which takes at most another 1 MSL to arrive, so the worst-case round trip is 2 MSL

What problems do too many TIME_WAIT connections cause

Scenario: the client repeatedly disconnects and immediately reconnects to the server, leaving a large number of connections on the client in TIME_WAIT

Impact: the client runs short of ephemeral ports (many ports are stuck in TIME_WAIT)

Scenario: the server disconnects and the client immediately reconnects, over and over, leaving a large number of TIME_WAIT connections on the server

Impact:

  1. Server: the TIME_WAIT connections consume the server's memory and CPU
  2. Client: the client runs short of ephemeral ports (many of its ports are tied up by connections whose server side is in TIME_WAIT)

TIME_WAIT scenarios and solutions

When a client disconnects, nginx also closes its connection to the back-end service, leaving a large number of TIME_WAIT connections on the nginx machine. Solutions:

  1. Adjust net.ipv4.ip_local_port_range to enlarge the ephemeral port range
  2. Use a connection pool for the back-end services
  3. Add more nginx machines
  4. Configure more IP addresses on the nginx machine
  5. Enable the tcp_tw_reuse parameter on the nginx machine

The server actively disconnects, leaving a large number of TIME_WAIT connections on the server. Solution:

Enable the tcp_tw_recycle parameter (use with caution! make sure clients are not behind NAT)

Related tuning parameters

net.ipv4.tcp_timestamps

It is a 10-byte TCP header option consisting of kind, length, a send timestamp (TSval), and an echo timestamp (TSecr)

Both ends must enable it for it to take effect; whether it is used is negotiated in the SYN packets of the three-way handshake

  1. When sending data, the sender puts the current time in the TSval field

  2. When replying, the receiver copies the received timestamp into TSecr and puts its own time in TSval

net.ipv4.tcp_tw_reuse

Requires net.ipv4.tcp_timestamps to be enabled.

Intended for the case where the client actively disconnects; enable it on the client machine.

When a TIME_WAIT connection is reused, its timestamp is refreshed, and packets carrying timestamps older than the new connection's are discarded (this solves the problem of the new connection receiving the old connection's data).

Reusing a TIME_WAIT connection (call the end in TIME_WAIT that initiates the new connection A, and the peer B):

Case 1: the final ACK of the old connection was not lost but is still in transit, so B has not yet received it and is in LAST_ACK

  1. A sends a SYN packet and enters SYN_SENT
  2. B receives the old connection's ACK and enters CLOSE
  3. A receives no ACK for its SYN, so it retransmits and the handshake proceeds

Case 2: the final ACK of the old connection was lost, so B has not received it and is in LAST_ACK

  1. A sends a SYN packet and enters SYN_SENT
  2. B, still waiting for the ACK, retransmits its FIN
  3. A receives the FIN and replies with an RST

Case 3: B is already in CLOSE

A normal three-way handshake takes place

net.ipv4.tcp_tw_recycle

Requires net.ipv4.tcp_timestamps to be enabled.

Intended for the case where the server actively disconnects; enable it on the server.

Once enabled, TCP quickly reclaims TIME_WAIT connections and records the timestamp of the last packet received; any later packet on the connection carrying an earlier timestamp is discarded directly.

⚠️ If the server sits behind NAT or receives connections through a load balancer, then from the server's point of view a single IP (e.g. the load-balancer proxy) establishes a large number of connections with it, and the proxy may well reuse the 5-tuple of a socket the server still holds in TIME_WAIT. Since the clients' clocks are not perfectly synchronized, a new connection whose timestamp is earlier than the one the server recorded will fail to establish.

HTTP/HTTPS

How HTTP versions evolved

www.ruanyifeng.com/blog/2016/0…

Strong and negotiated caching

Strong cache

The browser uses the local disk/memory cache directly, without contacting the server

Negotiated cache

The browser contacts the server, which decides whether the browser may use its local cache

Process

  1. On the first request for a resource, the server returns in the response headers: Expires (HTTP/1.0, an absolute point in time in GMT format), Cache-Control (HTTP/1.1, the cache policy), Last-Modified (HTTP/1.0, the resource's last modification time), and ETag (HTTP/1.1, a hash of the resource)

Expires / Cache-Control

  1. Expires is an HTTP/1.0 header; because it records an absolute time, it leads to cache confusion when client and server clocks are out of sync
  2. Cache-Control is an HTTP/1.1 header

max-age: the cache validity period, recorded as a relative time

Cache policy:

  1. no-cache: do not use the local strong cache; revalidation with the server (negotiated cache) is required
  2. no-store: forbid the browser from caching at all; the full resource is requested from the server every time
  3. public: may be cached by anyone, including end users and intermediaries such as CDNs and proxy servers
  4. private: may be cached only by the end user's browser

Priority: Cache-Control > Expires

Last-Modified / ETag

  1. Last-Modified is an HTTP/1.0 header recording when a resource was last modified. Its drawbacks:
    1. Sometimes the modification time changes while the content itself does not, in which case you do not want users to re-download the resource
    2. Last-Modified has second granularity: a resource modified more than once within one second cannot be distinguished
    3. Sometimes the last modification time of a resource is simply unavailable
  2. ETag is an HTTP/1.1 header recording a hash of the resource

Priority: ETag > Last-Modified

  2. On later requests, the browser first checks the strong cache using Cache-Control/Expires
  3. If the strong cache is unusable, the browser sends a request to the server with If-Modified-Since set to the saved Last-Modified value and If-None-Match set to the saved ETag
  4. The server checks If-None-Match / If-Modified-Since and returns 304 if the resource has not been modified; otherwise it returns the resource data

Priority: If-None-Match > If-Modified-Since
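The server-side decision can be sketched like this (simplified: real servers also parse HTTP dates and handle weak ETags; the function name is invented for illustration):

```python
def revalidate(if_none_match, if_modified_since, etag, last_modified):
    """Return 304 when the client's cache is still valid, else 200.
    If-None-Match takes priority over If-Modified-Since."""
    if if_none_match is not None:
        return 304 if if_none_match == etag else 200
    if if_modified_since is not None:
        return 304 if if_modified_since >= last_modified else 200
    return 200

print(revalidate('"abc"', None, '"abc"', 0))  # 304: ETag matches
print(revalidate('"old"', 100, '"new"', 50))  # 200: ETag differs and wins over dates
print(revalidate(None, 100, '"x"', 50))       # 304: not modified since
```

The middle case shows why ETag has priority: the timestamps alone would have produced a 304 even though the content changed.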

HTTPS connection establishment process

process

  1. The client obtains the server's certificate
  2. The client generates a random number, encrypts it with the public key from the certificate, and sends the result to the server
  3. The server decrypts it with its private key to recover the random number
  4. Both sides use the random number for symmetric encryption and decryption

In closing

If you enjoyed this article, follow the public account "will play code", which focuses on sharing practical techniques in plain language