Last year at TechDay, Wen Yang of Go Night Reading gave a talk on open source that inspired me. As it happens, a while back I found a rather obscure problem in Ktor, Kotlin's official library, and submitted a PR for it. After a long month of waiting, it was finally reviewed and merged into the master branch. It will be available in the next release.

I have written up the details here. The PR is at: github.com/ktorio/ktor…

This article covers:

  • What TCP self-connection is
  • Parity analysis of connect and bind(0) port number allocation in the Linux kernel
  • How to fix the TCP self-connection code

Background: what is this PR

getAvailablePort() finds an available even-numbered port; in this test run it returned 42064. Why the port number has to be even is covered in detail later.

fun testSelfConnect() {
    runBlocking {
        // Find a port that would be used as a local address.
        val port = getAvailablePort()

        val tcpSocketBuilder = aSocket(ActorSelectorManager(Dispatchers.IO)).tcp()
        // Try to connect to that address repeatedly.
        for (i in 0 until 100000) {
            try {
                val socket = tcpSocketBuilder.connect(InetSocketAddress("127.0.0.1", port))
                println("connect to self succeed: ${socket.localAddress} to ${socket.remoteAddress}")
                System.`in`.read()
                break
            } catch (ex: Exception) {
                // ignore
            }
        }
    }
}

Running the code above will result in a connection with the same source port number as the destination port number.

This is clearly abnormal: once it happens, a server program can no longer listen on port 42064. The underlying problem is TCP self-connection, and this PR is meant to fix it.

TCP self-connection

TCP self-connection is an interesting phenomenon; many people even consider it a Linux kernel bug. Let's first see what it looks like.

Create a new script, self_connect.sh, with the following contents:

while true
do
    telnet 127.0.0.1 50000
done

Before running the script, use netstat to make sure nothing is listening on port 50000. Then run the script; after a while, telnet connects successfully.

Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Run netstat to check the connection state on port 50000:

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:50000         127.0.0.1:50000         ESTABLISHED 24786/telnet

As you can see, the source IP and port are 127.0.0.1:50000, and the destination IP and port are also 127.0.0.1:50000.

Self-connection cause analysis

The following figure shows the packet capture result of successful self-connection.

In a self-connection, the sender and the receiver of every packet captured in Wireshark are the same endpoint, so the exchange can be viewed as six packets in total. The following figure shows the packet interaction.

Does this look familiar? The first four packets are exactly the TCP simultaneous open handshake.

When one side initiates a connection, the operating system automatically assigns it a temporary (ephemeral) source port. If the port that happens to be assigned is 50000, the sequence is as follows.

  • The first packet is a SYN sent from port 50000 to port 50000
  • The sender receives its own SYN, treats it as a simultaneous open, and replies with SYN+ACK
  • After sending the SYN+ACK, it receives that SYN+ACK itself, considers the handshake complete, and enters the ESTABLISHED state
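Waiting for the kernel to pick the matching port by chance is slow. The sequence above can be sketched deterministically, under the assumption of Linux loopback behavior, by pinning the source port with bind before calling connect on the same socket. The helper name force_self_connect is hypothetical and only used for illustration; other TCP stacks may refuse the connection, in which case the function returns -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Sketch (hypothetical helper): bind a socket to 127.0.0.1 on a
 * kernel-chosen port P, then connect the same socket to 127.0.0.1:P.
 * The SYN goes from P to P, which is the simultaneous-open case.
 * Returns P if the socket ends up connected to itself, -1 otherwise.
 */
int force_self_connect(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0); /* let the kernel choose the port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    socklen_t len = sizeof(addr);
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0 || /* learn P */
        connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    /* Confirm that the peer address really is our own local address. */
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof(peer);
    int port = -1;
    if (getpeername(fd, (struct sockaddr *) &peer, &peer_len) == 0 &&
        peer.sin_port == addr.sin_port &&
        peer.sin_addr.s_addr == addr.sin_addr.s_addr) {
        port = ntohs(addr.sin_port);
    }
    close(fd);
    return port;
}
```

Calling force_self_connect() from a small main and printing the result is enough to try it; no listener is involved at any point, which is what makes the ESTABLISHED state surprising.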

Hazards of self-connection

Imagine the following scenario:

  • Your business system B accesses a local service A, which listens on port 50000
  • System B is written to be reasonably robust, with logic to reconnect to service A after a disconnect
  • If service A goes down and stays down for a long time, system B keeps trying to reconnect
  • After retrying for a while, system B ends up connected to itself
  • When service A later tries to start and listen on port 50000, the address is already in use and the service fails to start

When a self-connection occurs, there are at least two obvious problems:

  • The self-connected process occupies the port, so the server process that actually needs to listen on it cannot do so
  • The self-connected process appears to have connected successfully, but the service is in fact unavailable and no real communication is possible

How to solve the self-connection problem

Self-connection is rare, but it causes logic errors when it does occur, so it should be avoided as far as possible. There are two common ways to deal with it.

  • Make sure the port the service listens on can never be one that is randomly assigned to a client
  • When a self-connection occurs, actively close the connection

For the first approach, the range from which client ports are randomly assigned is set by the /proc/sys/net/ipv4/ip_local_port_range file. On my CentOS 8 machine the range is 32768 to 60999. As long as the service listens on a port below 32768, a client's temporary port can never equal the service port. This is the recommended approach.
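As a rough illustration of that check, the sketch below parses the two numbers from the file and tests whether a listening port falls inside the range. The function names are my own and purely illustrative; the code assumes the file holds two whitespace-separated integers, as it does on the machine described above.

```c
#include <stdio.h>

/*
 * Parse "low high" as stored in ip_local_port_range.
 * Returns 0 on success, -1 if the text is not two valid port numbers.
 */
int parse_port_range(const char *text, int *low, int *high) {
    if (sscanf(text, "%d %d", low, high) != 2) {
        return -1;
    }
    if (*low < 1 || *high > 65535 || *low > *high) {
        return -1;
    }
    return 0;
}

/* Returns 1 if a client's temporary port could equal listen_port, else 0. */
int may_collide_with_ephemeral(int listen_port, int low, int high) {
    return listen_port >= low && listen_port <= high;
}

/* Read the range from a file such as /proc/sys/net/ipv4/ip_local_port_range. */
int read_port_range(const char *path, int *low, int *high) {
    char buf[64];
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    char *line = fgets(buf, sizeof(buf), f);
    fclose(f);
    return line != NULL ? parse_port_range(line, low, high) : -1;
}
```

With the range 32768 to 60999, may_collide_with_ephemeral reports a possible collision for port 50000 and none for port 8080.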

How to fix this problem

The fix itself is simple: after a connection is established, check whether it is a self-connection, and if it is, close it.
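At the socket level, that check could look roughly like the following. This is a sketch of the idea in C, not the Ktor change itself; it assumes IPv4 sockets, and the function names are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if both addresses have the same IPv4 address and port. */
int same_endpoint(const struct sockaddr_in *local,
                  const struct sockaddr_in *remote) {
    return local->sin_port == remote->sin_port &&
           local->sin_addr.s_addr == remote->sin_addr.s_addr;
}

/*
 * Returns 1 if fd is connected to itself, 0 if it is not,
 * and -1 if the local or peer address cannot be read.
 */
int is_self_connected(int fd) {
    struct sockaddr_in local;
    struct sockaddr_in remote;
    socklen_t local_len = sizeof(local);
    socklen_t remote_len = sizeof(remote);

    if (getsockname(fd, (struct sockaddr *) &local, &local_len) < 0) {
        return -1;
    }
    if (getpeername(fd, (struct sockaddr *) &remote, &remote_len) < 0) {
        return -1;
    }
    return same_endpoint(&local, &remote);
}

/* Closes fd and returns 1 when it is self-connected; otherwise returns 0. */
int close_if_self_connected(int fd) {
    if (is_self_connected(fd) == 1) {
        close(fd);
        return 1;
    }
    return 0;
}
```

A caller would run close_if_self_connected(fd) right after a successful connect and treat a return value of 1 as a failed connection attempt.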

The hard part was writing the test case. The test first needs an available port: if I simply picked a port number at random, it might already be in use by some program on the machine, and port conflicts would be likely. So initially I used bind(0) and let the kernel allocate a free port, as shown below.

private fun getAvailablePort(): Int {
    // Bind to port 0 so the kernel picks a free port, then release it.
    val port = ServerSocket().apply {
        bind(InetSocketAddress("127.0.0.1", 0))
        close()
    }.localPort
    return port
}

The trouble is that I could not reproduce the problem on Ubuntu 18.04 this way, whereas switching to the fixed port 50000 did reproduce it.

Why does the above test code have to connect to an even number of ports?

On older Linux kernels, the problem can also be reproduced when connecting to an odd-numbered port. Kernel 4.2 introduced a new feature (which some Linux distributions have backported), git.kernel.org/pub/scm/lin… , as shown in the figure below.

The /proc/sys/net/ipv4/ip_local_port_range file specifies low and high for temporary port numbers. By default, low is even, and on my computer low and high are 32768 and 60999, respectively.

To put it simply, the new kernel has made some changes to the port allocation policy:

  • bind(0) prefers a random port whose parity differs from low's, that is, an odd port. Only when the odd ports are used up does it try even ports
  • connect prefers a temporary port with the same parity as low, that is, an even port. Only when the even ports are used up does it try odd ports

To test this, write the following code.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

void print_local_port() {
    int sockfd;
    if (sockfd = socket(AF_INET, SOCK_STREAM, 0), -1 == sockfd) {
        perror("socket create error");
        return;
    }
    const struct sockaddr_in remote_addr = {
            .sin_family = AF_INET,
            .sin_port = htons(8080),
            .sin_addr.s_addr = htonl(INADDR_ANY)
    };
    // connect() makes the kernel assign a temporary local port
    if (connect(sockfd, (const struct sockaddr *) &remote_addr, sizeof(remote_addr)) < 0) {
        perror("connect error");
    }
    // Not const: getsockname() writes the local address into it
    struct sockaddr_in local_addr;
    socklen_t local_addr_len = sizeof(local_addr);
    if (getsockname(sockfd, (struct sockaddr *) &local_addr, &local_addr_len) < 0) {
        perror("getsockname error");
    }
    printf("local port: %d\n", ntohs(local_addr.sin_port));
    close(sockfd);
}

int main() {
    int i;
    for (i = 0; i < 10; i++) {
        print_local_port();
    }
    return 0;
}

Running the code shows that all 10 local port numbers assigned by connect are even.

$ ./a.out 
local port: 49238
local port: 49240
local port: 49242
local port: 49244
local port: 49246
local port: 49248
local port: 49250
local port: 49252
local port: 49254
local port: 49256

Similar code can be used to check that, when ports are plentiful, the kernel assigns random odd ports for bind(0).

// bind(0)
const struct sockaddr_in serv_addr = {
        .sin_family = AF_INET,
        .sin_port   = htons(0),
        .sin_addr.s_addr = htonl(INADDR_ANY)
};

if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
    perror("bind error");
}

// Get the local socket address
struct sockaddr_in local_addr; // not const: getsockname() writes into it
socklen_t local_addr_len = sizeof(local_addr);
if (getsockname(sockfd, (struct sockaddr *) &local_addr, &local_addr_len) < 0) {
    perror("getsockname error");
}

printf("bind local port: %d\n", ntohs(local_addr.sin_port));
close(sockfd);

In the new version of the kernel, you can see that bind(0) returns odd random port numbers.

$ ./a.out
bind local port: 37173
bind local port: 43605
bind local port: 51155
bind local port: 37209
bind local port: 45985
bind local port: 57833
bind local port: 39517
bind local port: 45873
bind local port: 42387
bind local port: 53887

So in the earlier getAvailablePort example, bind(0) returned a random odd port while connect's temporary ports were even. The self-connection could therefore never happen, and there was no way to test whether the change actually worked.

Source code analysis of bind(0) port parity

The bind system call eventually reaches the inet_csk_find_open_port function. The call stack is shown below.

Here’s a key line

offset |= 1U;

This line forces the randomly generated offset to be odd. The port is then computed as follows:

port = low + offset;

So low is always added to an odd number:

  • If low is an odd number, port is an even number
  • If low is an even number, port is an odd number

As a result, the generated port number always has the opposite parity to low, the lower bound of the port range. Since low is even by default, bind(0) generates a random odd port number.

Parity analysis of connect's temporary port numbers

The source code for assigning temporary port numbers to connect is implemented in __inet_hash_connect

As you can see, in contrast to bind(0), it forces offset to be even.

offset &= ~1U;

This even offset is then added to low, so the port has the same parity as low.
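The effect of the two bit operations can be checked with a small model. This is only a sketch of the arithmetic quoted above, not kernel code: the function names are hypothetical, and the real kernel functions also handle range wrapping and ports that are already in use.

```c
/* Model of the bind(0) path: force the offset to be odd, then add low. */
unsigned int bind_candidate_port(unsigned int low, unsigned int offset) {
    offset |= 1U;
    return low + offset;
}

/* Model of the connect path: force the offset to be even, then add low. */
unsigned int connect_candidate_port(unsigned int low, unsigned int offset) {
    offset &= ~1U;
    return low + offset;
}

/* Returns 1 if a and b are both even or both odd. */
int same_parity(unsigned int a, unsigned int b) {
    return ((a ^ b) & 1U) == 0;
}
```

With low = 32768, an offset of 4404 gives 37173 on the bind path, and an offset of 16471 gives 49238 on the connect path. For any offset the two paths produce ports of opposite parity, which is why a port obtained from bind(0) could never be matched by connect's temporary port in the test.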

Test code changes

So how should the test be changed? Here we simply loop until we get an available even-numbered port.

private fun getAvailablePort(): Int {
    while (true) {
        // Let the kernel pick a free port via bind(0), then release it
        val port = ServerSocket().apply {
            bind(InetSocketAddress("127.0.0.1", 0))
            close()
        }.localPort

        if (port % 2 == 0) {
            return port
        }

        try {
            // try bind the next even port
            ServerSocket().apply {
                bind(InetSocketAddress("127.0.0.1", port + 1))
                close()
            }
            return port + 1
        } catch (ex: Exception) {
            // ignore
        }
    }
}

Side note

This problem is not specific to Kotlin or Java. While reading Go's source code, I noticed that its early connect code had the same problem; someone later filed an issue and it was fixed. The code is shown below.

func (sd *sysDialer) doDialTCP(ctx context.Context, laddr, raddr *TCPAddr) (*TCPConn, error) {
    fd, err := internetSocket(ctx, sd.network, laddr, raddr, syscall.SOCK_STREAM, 0, "dial", sd.Dialer.Control)

    // TCP has a rarely used mechanism called a 'simultaneous connection' in
    // which Dial("tcp", addr1, addr2) run on the machine at addr1 can
    // connect to a simultaneous Dial("tcp", addr2, addr1) run on the machine
    // at addr2, without either machine executing Listen. If laddr == nil,
    // it means we want the kernel to pick an appropriate originating local
    // address. Some Linux kernels cycle blindly through a fixed range of
    // local ports, regardless of destination port. If a kernel happens to
    // pick local port 50001 as the source for a Dial("tcp", "", "localhost:50001"),
    // then the Dial will succeed, having simultaneously connected to itself.
    // This can only happen when we are letting the kernel pick a port (laddr == nil)
    // and when there is no listener for the destination address.
    // It's hard to argue this is anything other than a kernel bug. If we
    // see this happen, rather than expose the buggy effect to users, we
    // close the fd and try again. If it happens twice more, we relent and
    // use the result. See also:
    //    https://golang.org/issue/2690
    //    https://stackoverflow.com/questions/4949858/
    //
    // The opposite can also happen: if we ask the kernel to pick an appropriate
    // originating local address, sometimes it picks one that is already in use.
    // So if the error is EADDRNOTAVAIL, we have to try again too, just for
    // a different reason.
    //
    // The kernel socket code is no doubt enjoying watching us squirm.
    for i := 0; i < 2 && (laddr == nil || laddr.Port == 0) && (selfConnect(fd, err) || spuriousENOTAVAIL(err)); i++ {
        if err == nil {
            fd.Close()
        }
        fd, err = internetSocket(ctx, sd.network, laddr, raddr, syscall.SOCK_STREAM, 0, "dial", sd.Dialer.Control)
    }

    if err != nil {
        return nil, err
    }
    return newTCPConn(fd), nil
}

func selfConnect(fd *netFD, err error) bool {
    // If the connect failed, we clearly didn't connect to ourselves.
    if err != nil {
        return false
    }

    // The socket constructor can return an fd with raddr nil under certain
    // unknown conditions. The errors in the calls there to Getpeername
    // are discarded, but we can't catch the problem there because those
    // calls are sometimes legally erroneous with a "socket not connected".
    // Since this code (selfConnect) is already trying to work around
    // a problem, we make sure if this happens we recognize trouble and
    // ask the DialTCP routine to try again.
    // TODO: try to understand what's really going on.
    if fd.laddr == nil || fd.raddr == nil {
        return true
    }
    l := fd.laddr.(*TCPAddr)
    r := fd.raddr.(*TCPAddr)
    return l.Port == r.Port && l.IP.Equal(r.IP)
}

The comments explain in detail why the selfConnect check exists. It detects a self-connection by checking whether the source IP address equals the destination IP address and the source port number equals the destination port number.

A final thought

The change in a PR may be just a few lines of code, but the test code can be a real pain to write.