1. SO_REUSEADDR

If the server crashes or otherwise actively closes its connections, those connections stay in TIME_WAIT for two MSLs (maximum segment lifetimes) before they are finally released. If the service restarts during that window and tries to bind to the same port, most operating systems refuse the bind by default. Enabling the SO_REUSEADDR socket option removes this restriction.
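The restart scenario above can be sketched in Python as follows. This is a minimal illustration on the loopback interface, not a production server; the helper name `make_listener` and the backlog value are assumptions made for the example.

```python
import socket

def make_listener(port=0):
    """Create a loopback TCP listener with SO_REUSEADDR set (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option before bind(); it is checked when the address is bound.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(16)
    return sock

# First "run" of the server on a kernel-chosen port.
listener = make_listener()
port = listener.getsockname()[1]

client = socket.create_connection(("127.0.0.1", port), timeout=2)
conn, _ = listener.accept()
conn.close()        # the server closes first, so its side is the active closer
client.recv(1)      # returns b"" once the server's FIN arrives
client.close()
listener.close()

# "Restart": bind the same port again right away. The old connection may
# still be in TIME_WAIT; with SO_REUSEADDR set, the bind is still allowed.
restarted = make_listener(port)
rebound_port = restarted.getsockname()[1]
restarted.close()
```

Without the `setsockopt` call in `make_listener`, the second bind may fail with EADDRINUSE while the old connection is still in TIME_WAIT.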

2. SO_REUSEPORT

By default, a given IP address and port combination can be bound by only one socket. Linux kernel 3.9 introduced a new socket option, SO_REUSEPORT, also known as port sharding, which allows multiple sockets to listen on the same IP address and port combination.
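As a minimal sketch, assuming Linux 3.9 or later, two sockets can listen on the same address and port once both set SO_REUSEPORT before binding. The helper name `reuseport_listener` is hypothetical.

```python
import socket

def reuseport_listener(port=0):
    """Create a loopback TCP listener with SO_REUSEPORT set (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Every socket that wants to share the port must set this before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(64)
    return sock

first = reuseport_listener()              # kernel picks a free port
port = first.getsockname()[1]
second = reuseport_listener(port)         # second socket on the same IP and port

first_addr = first.getsockname()
second_addr = second.getsockname()

first.close()
second.close()
```

In a real server, each socket would typically belong to a different worker process or thread.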

2.1 Multi-process network model

  • A main process plus multiple worker child processes that all listen on the same port (subject to the thundering herd problem)

  • Multiple processes, each with its own listening socket, using SO_REUSEPORT

2.2 Thundering herd problem

Multiple processes or threads wait on the same socket at the same time. When a network event occurs, all of the waiters are woken up together, but only one of them can handle the event. The others find nothing to handle and go back to sleep, having wasted a wake-up and a context switch.

2.3 Thundering herd in the accept system call

Before Linux 2.6, accept() suffered from the thundering herd problem. Later kernels introduced the WQ_FLAG_EXCLUSIVE wait-queue flag, so that only one of the processes blocked in accept() is woken for each new connection.

2.4 Thundering herd in the epoll system call

When a new network event occurs, multiple processes blocked in epoll_wait on the same listening socket are still woken simultaneously. For epoll, in other words, the thundering herd problem remains.

2.5 How SO_REUSEPORT works

The kernel keeps 32 hash buckets for sockets in the LISTEN state. Listening port numbers are spread across these buckets by a hash function, and ports that hash to the same bucket are chained together in a list to resolve collisions.

  1. When a SYN packet arrives from a client, the kernel hashes the destination port number to find the corresponding collision chain.

  2. For sockets with the SO_REUSEPORT option set, a second hash locates the matching SO_REUSEPORT group, and one socket in the group is picked, effectively at random, to handle the connection.
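The two lookup steps can be modeled with a toy table in Python. This is only a simplified sketch of the idea, not the kernel's code: the class `ListenTable`, the use of `port % 32` as the bucket hash, and the caller-supplied `flow_hash` that stands in for the second hash are all assumptions made for illustration.

```python
LHASH_BUCKETS = 32  # bucket count given in the text


class ListenTable:
    """Toy model of a listening-socket hash table with chained buckets."""

    def __init__(self):
        # Each bucket is a chain (list) of (port, socket_name) entries.
        self.buckets = [[] for _ in range(LHASH_BUCKETS)]

    def listen(self, port, name):
        # Stand-in hash: the real kernel hash function is different.
        self.buckets[port % LHASH_BUCKETS].append((port, name))

    def lookup(self, dst_port, flow_hash):
        # Step 1: hash the destination port to find the collision chain.
        chain = self.buckets[dst_port % LHASH_BUCKETS]
        # Collect the sockets on that chain that share the port (the group).
        group = [name for port, name in chain if port == dst_port]
        if not group:
            return None
        # Step 2: a second hash picks one member of the group.
        return group[flow_hash % len(group)]


table = ListenTable()
table.listen(8080, "worker-0")
table.listen(8080, "worker-1")
table.listen(8112, "other")      # 8112 % 32 == 8080 % 32, so it shares the chain

pick_a = table.lookup(8080, flow_hash=0)
pick_b = table.lookup(8080, flow_hash=1)
pick_other = table.lookup(8112, flow_hash=7)
pick_none = table.lookup(9000, flow_hash=0)
```

Different values of `flow_hash` land on different members of the group, which is what spreads incoming connections across the sockets.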

2.6 What SO_REUSEPORT is used for

  • Load balancing: when connections arrive, the kernel automatically distributes them across the different sockets, and therefore across the threads or processes that own those sockets
  • Rolling upgrades:
  1. Start the new version v2 listening on the same port, so that it serves requests alongside the old version v1.
  2. Signal the v1 process to stop accepting new requests.
  3. After a period of time, once all in-flight v1 requests have been processed, the v1 process exits and v2 continues to serve.
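Both uses can be sketched in a single process on the loopback interface, assuming Linux 3.9 or later. The helpers `reuseport_listener`, `connect_n` and `drain` are hypothetical names for this example, and the two listeners stand in for the v1 and v2 processes.

```python
import select
import socket

def reuseport_listener(port=0):
    """Non-blocking loopback listener with SO_REUSEPORT (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(64)
    sock.setblocking(False)
    return sock

def connect_n(port, n):
    """Open n client connections to the shared port."""
    return [socket.create_connection(("127.0.0.1", port), timeout=2)
            for _ in range(n)]

def drain(listeners, expected, timeout=2.0):
    """Accept until `expected` connections were seen; return per-listener counts."""
    counts = [0] * len(listeners)
    while sum(counts) < expected:
        ready, _, _ = select.select(listeners, [], [], timeout)
        if not ready:
            break  # timed out waiting for more connections
        for sock in ready:
            try:
                conn, _ = sock.accept()
            except BlockingIOError:
                continue
            conn.close()
            counts[listeners.index(sock)] += 1
    return counts

# Step 1: v2 starts on the same port while v1 is still serving.
v1 = reuseport_listener()
port = v1.getsockname()[1]
v2 = reuseport_listener(port)

clients = connect_n(port, 20)
shared = drain([v1, v2], 20)     # the kernel decides how the 20 are split

# Steps 2 and 3: v1 stops accepting (its queue is already drained) and exits.
v1.close()
clients += connect_n(port, 5)
after = drain([v2], 5)           # every new connection now reaches v2

for c in clients:
    c.close()
v2.close()
```

The split in `shared` is chosen by the kernel and can vary from run to run; only the total is fixed. Draining v1's accept queue before closing it matters, since connections still queued on a closed listener are dropped.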

2.7 SO_REUSEPORT security

To prevent port hijacking, the kernel enforces two restrictions:

  • Only server processes with the same effective user ID can listen on the same IP:port
  • Subsequent processes can bind to the same port only if the first process set the SO_REUSEPORT option on its own socket.
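The second restriction is easy to observe. In this minimal sketch, assuming Linux, the first socket binds without SO_REUSEPORT, so a later socket cannot join the port even though it sets the option itself.

```python
import errno
import socket

# First socket: bound and listening WITHOUT SO_REUSEPORT.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(16)
port = first.getsockname()[1]

# Second socket: sets SO_REUSEPORT, but the first one never opted in.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
second.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
try:
    second.bind(("127.0.0.1", port))
    bind_errno = None
except OSError as exc:
    bind_errno = exc.errno       # the bind is refused

second.close()
first.close()
```

Here `bind_errno` ends up as `errno.EADDRINUSE`: a process cannot attach itself to a port whose owner did not enable sharing.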

3. SO_LINGER

SO_LINGER controls how close() shuts down a TCP connection. By default, close() returns immediately; if data remains in the socket send buffer, the system keeps trying to deliver it to the peer in the background.

The SO_LINGER option uses the following structure:

struct linger {
    int l_onoff;   /* 0 = option off, nonzero = option on */
    int l_linger;  /* linger timeout in seconds */
};
  1. If l_onoff is 0, the option is turned off, the value of l_linger is ignored, and close() uses the default behavior described above.
  2. If l_onoff is nonzero and l_linger is 0, close() closes the connection using method a below.
  3. If l_onoff is nonzero and l_linger is nonzero, close() closes the connection using method b below.
  • a. Close the connection immediately by sending an RST packet instead of the normal four-way FIN, ACK, FIN, ACK exchange. Any unsent data in the send buffer is discarded. The closing side skips TIME_WAIT and goes straight to CLOSED.

  • b. Close the connection with a timeout. If data remains in the socket send buffer, the process sleeps inside close() while the kernel tries to send as much of the data as possible until the timer expires.

    If all of the data is sent and acknowledged by the peer before the timeout, the kernel closes the connection with the normal four-way FIN, ACK, FIN, ACK exchange and close() returns success.

    If the data has not been sent and acknowledged when the timeout expires, the connection is closed using method a above and close() returns EWOULDBLOCK.
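Method a can be sketched in Python on the loopback interface. This is a minimal illustration, assuming Linux, where `struct linger` is two C ints; the variable names are made up for the example.

```python
import socket
import struct

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

client = socket.create_connection(("127.0.0.1", port), timeout=2)
server_side, _ = listener.accept()
server_side.settimeout(2)

# struct linger { int l_onoff; int l_linger; } with l_onoff=1, l_linger=0.
client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))

# Read the option back to confirm what the kernel stored.
raw = client.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize("ii"))
l_onoff, l_linger = struct.unpack("ii", raw)

client.close()   # sends RST instead of FIN because l_linger is 0

try:
    server_side.recv(1)
    peer_saw_reset = False       # a normal close would give b"" here
except ConnectionResetError:
    peer_saw_reset = True        # the peer observes the RST

server_side.close()
listener.close()
```

Packing a nonzero second value, for example `struct.pack("ii", 1, 5)`, selects method b with a five-second timeout instead.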

Reference

In-Depth Understanding of TCP: From Principle to Practice