
The fault

After I moved my bullet-screen (danmaku) service to a different environment, the service stayed available, but the following exception showed up in the log roughly once per second:

java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:853)
    at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:641)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:240)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:115)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:514)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:471)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:385)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:351)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at io.netty.util.internal.chmv8.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1412)
    at io.netty.util.internal.chmv8.ForkJoinTask.doExec(ForkJoinTask.java:280)
    at io.netty.util.internal.chmv8.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:877)
    at io.netty.util.internal.chmv8.ForkJoinPool.scan(ForkJoinPool.java:1706)
    at io.netty.util.internal.chmv8.ForkJoinPool.runWorker(ForkJoinPool.java:1661)
    at io.netty.util.internal.chmv8.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:126)

A Netty issue (github.com/netty/netty…) explains the message: "Connection reset by peer is typically a result of your peer sending a TCP RST to you."

tcpdump

Since the problem is at the network level, tcpdump is the obvious tool to reach for. First, find the port the process is listening on:

$ sudo netstat -lnp | grep 8657
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 8657/java

Rather than listing every socket connection, I started from the port the process is listening on (443). ifconfig shows that the machine has two network adapters: eth0 with the internal IP address and eth1 with the public IP address.

How tcpdump prints TCP/IP packets

This section describes the format in which tcpdump prints TCP packets. I condensed it from the TCP section of the tcpdump man page (www.tcpdump.org/tcpdump_man…).

The general format of a TCP protocol line is:

src > dst: Flags [tcpflags], seq data-seqno, ack ackno, win window, urg urgent, options [opts], length len
  1. src and dst are the source and destination IP addresses and ports.
  2. tcpflags are some combination of S (SYN), F (FIN), P (PUSH), R (RST), U (URG), W (ECN CWR), E (ECN-Echo) or . (ACK), or none if no flags are set.
  3. data-seqno describes the portion of sequence space covered by the data in this packet.
  4. ackno is sequence number of the next data expected the other direction on this connection.
  5. window is the number of bytes of receive buffer space available the other direction on this connection.
  6. urg indicates there is urgent data in the packet.
  7. opts are TCP options (e.g., mss 1024).
  8. len is the length of payload data.

Captured data

I used tcpdump to watch the traffic on port 443. The output of my tcpdump differs slightly from the format described above. The machine has two network cards, with requests arriving on eth0 and replies leaving through eth1, so I first tried tcpdump -i any. In that capture the third packet of the handshake is displayed incorrectly. I do not know yet why this happens; if you do, please let me know.

$ sudo tcpdump -i any -nn port 443
16:18:41.553460 IP xxx.xxx.238.110.5745 > xxx.xxx.198.40.443: S 806033:806033(0) win 14600 <mss 1460,sackOK,timestamp 2230744217 0,nop,wscale 8>
16:18:41.553483 IP xxx.xxx.198.40.443 > xxx.xxx.238.110.5745: S 1720728675:1720728675(0) ack 806034 win 14480 <mss 1460,sackOK,timestamp 2425389802 2230744217,nop,wscale 7>
16:18:41.553647 IP xxx.xxx.238.110.5745 > xxx.xxx.198.40.443: . ack 1 win 58 <nop,nop,timestamp 2230744217 2425389802>
16:18:41.553677 IP xxx.xxx.238.110.5745 > xxx.xxx.198.40.443: R 1:1(0) ack 1 win 58 <nop,nop,timestamp 2230744217 2425389802>

However, if you want to export a *.pcap file and inspect it in Wireshark, you still need to capture the traffic of both network cards together:

$ sudo tcpdump -i any port 443 -c 3 -w log.pcap

Capturing the two network cards separately

$ sudo tcpdump -i eth0 -nn port 443
$ sudo tcpdump -i eth1 -nn port 443
16:51:16.073956 IP xxx.xxx.238.2.61835 > xxx.xxx.198.40.443: S 2659415794:2659415794(0) win 14600 <mss 1460,sackOK,timestamp 2328745293 0,nop,wscale 8>
16:51:16.073985 IP xxx.xxx.198.40.443 > xxx.xxx.238.2.61835: S 955422999:955422999(0) ack 2659415795 win 14480 <mss 1460,sackOK,timestamp 2427344323 2328745293,nop,wscale 7>
16:51:16.074147 IP xxx.xxx.238.2.61835 > xxx.xxx.198.40.443: . ack 955423000 win 58 <nop,nop,timestamp 2328745293 2427344323>
16:51:16.074192 IP xxx.xxx.238.2.61835 > xxx.xxx.198.40.443: R 0:0(0) ack 1 win 58 <nop,nop,timestamp 2328745293 2427344323>

Each line of the capture above follows this simplified layout:

time protocol src > dst: tcpflags data-seqno ack ackno win window <opts>
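
As a rough illustration of this layout, the following Java sketch parses one line of the tcpdump output seen in the captures above. The class name TcpdumpLine and its fields are made up for this example, and the regular expression only covers the fields that appear in these captures (no urg, no flag-less packets); it is not a general tcpdump parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed TCP line of the tcpdump output shown above (hypothetical helper). */
final class TcpdumpLine {
    // Layout: time IP src > dst: tcpflags [first:last(len)] [ack ackno] win window [<opts>]
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+) IP (\\S+) > (\\S+): ([SFPRUWE.]+)"
            + "(?: (\\d+):(\\d+)\\((\\d+)\\))?"   // data-seqno, absent on a bare ACK
            + "(?: ack (\\d+))?"                  // ackno, absent on the first SYN
            + " win (\\d+)"
            + "(?: <(.*)>)?$");                   // TCP options

    final String time, src, dst, flags, options;
    final long seq;    // first sequence number, or -1 if the line has none
    final long ack;    // acknowledgement number, or -1 if the line has none
    final int window;

    private TcpdumpLine(Matcher m) {
        time = m.group(1);
        src = m.group(2);
        dst = m.group(3);
        flags = m.group(4);
        seq = m.group(5) == null ? -1 : Long.parseLong(m.group(5));
        ack = m.group(8) == null ? -1 : Long.parseLong(m.group(8));
        window = Integer.parseInt(m.group(9));
        options = m.group(10) == null ? "" : m.group(10);
    }

    /** Parses a single line; throws if the line does not match the layout above. */
    static TcpdumpLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a TCP line: " + line);
        }
        return new TcpdumpLine(m);
    }

    /** True if the R (RST) flag is set. */
    boolean isRst() {
        return flags.contains("R");
    }
}
```

Feeding every captured line through TcpdumpLine.parse and keeping those where isRst() is true would single out the reset packets and their source addresses.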

Handshake logic

A connection is established by a three-way handshake:

  1. Caller sends SYN
  2. Recipient responds with SYN, ACK
  3. Caller sends ACK

Analyzing the data

  1. src sends a packet with tcpflags = S, that is, a SYN, carrying a randomly generated data-seqno = J (2659415794), to dst.
  2. dst replies with tcpflags = S and ackno = J + 1 (2659415795), together with its own randomly generated data-seqno = K (955422999).
  3. src checks that the received ackno equals J + 1 and then sends ackno = K + 1 (955423000). As noted in the format description, a tcpflags value of . means ACK.
  4. Why does src then send a packet with tcpflags = R to dst?
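
The arithmetic in steps 1 to 3 can be checked mechanically. The sketch below is only an illustration; the class and method names are invented here, and the numbers are the ones from the eth1 capture.

```java
/** Checks the sequence/acknowledgement arithmetic of a TCP three-way handshake. */
final class HandshakeCheck {
    // TCP sequence numbers are 32-bit values and wrap around
    private static final long SEQ_MODULUS = 1L << 32;

    /** The acknowledgement number that answers a SYN carrying the given sequence number. */
    static long next(long seq) {
        return (seq + 1) % SEQ_MODULUS;
    }

    /**
     * @param synSeq    data-seqno J of the initial SYN
     * @param synAckSeq data-seqno K of the SYN, ACK reply
     * @param synAckAck ackno carried by the SYN, ACK reply (expected J + 1)
     * @param finalAck  ackno carried by the final ACK (expected K + 1)
     */
    static boolean isValidHandshake(long synSeq, long synAckSeq, long synAckAck, long finalAck) {
        return synAckAck == next(synSeq) && finalAck == next(synAckSeq);
    }

    public static void main(String[] args) {
        long j = 2659415794L; // data-seqno of the SYN
        long k = 955422999L;  // data-seqno of the SYN, ACK
        System.out.println(isValidHandshake(j, k, 2659415795L, 955423000L)); // prints true
    }
}
```

So the handshake itself is well formed, which leaves the trailing RST as the only thing to explain.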

Why RST

Because the RST is periodic, my guess was that it comes from the LVS health check of my backend service. The health check establishes a connection and, once the handshake has succeeded, immediately sends an RST so that no session stays occupied. This turned out to be the company's standard setup, so the cause of the problem is located.
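
The "handshake, then RST" pattern can be reproduced locally. The sketch below is an assumption about how such a probe might behave, not the actual LVS implementation: it uses plain java.net sockets on the loopback interface, and the probe closes with SO_LINGER set to 0, which makes close() abort the connection with an RST instead of a normal FIN. The class and method names are invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Reproduces "handshake, then RST" on the loopback interface (illustrative only). */
final class RstProbeDemo {

    /** Returns the exception the server side gets from read(), or null if none was thrown. */
    static IOException probeAndRead() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // The constructor returns once the three-way handshake has completed
            Socket probe = new Socket(loopback, server.getLocalPort());
            try (Socket accepted = server.accept()) {
                // SO_LINGER with timeout 0 makes close() abort the connection with an RST
                probe.setSoLinger(true, 0);
                probe.close();
                accepted.setSoTimeout(5000); // do not block forever if no RST arrives
                try {
                    accepted.getInputStream().read();
                    return null; // read returned normally: no reset observed
                } catch (IOException e) {
                    return e; // the reset surfaces here as an IOException
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // setup failure, not the reset itself
        }
    }

    public static void main(String[] args) {
        System.out.println(probeAndRead());
    }
}
```

The server side did nothing wrong here; it simply tried to read from a connection that the peer had already aborted, which matches what the Netty event loop reports in the stack trace at the top.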

Since this is the company's standard setup, the change has to be made on our side: the service is customized so that an RST sent by the VIP is not processed.
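
The post does not show the actual change, so the following is only a sketch of one possible approach: a small filter that decides whether a "Connection reset" IOException came from a known health-check address. The class name, the address set, and the idea of matching on the exception message are all assumptions made for this example. In a Netty service such a check could be called from a handler's exceptionCaught(ChannelHandlerContext ctx, Throwable cause), passing ctx.channel().remoteAddress().

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Decides whether a connection reset came from a known health checker (hypothetical helper). */
final class HealthCheckResetFilter {
    // Assumption: the VIP / LVS director addresses are known and configured here
    private final Set<String> healthCheckerIps;

    HealthCheckResetFilter(Collection<String> healthCheckerIps) {
        this.healthCheckerIps = new HashSet<>(healthCheckerIps);
    }

    /** True if the exception is a connection reset caused by a configured health checker. */
    boolean shouldIgnore(Throwable cause, SocketAddress remote) {
        if (!(cause instanceof IOException)) {
            return false;
        }
        // Matching on the message is fragile but is the only signal a plain IOException carries
        String message = cause.getMessage();
        if (message == null || !message.contains("Connection reset")) {
            return false;
        }
        if (!(remote instanceof InetSocketAddress)) {
            return false;
        }
        InetAddress address = ((InetSocketAddress) remote).getAddress();
        return address != null && healthCheckerIps.contains(address.getHostAddress());
    }
}
```

Resets from any other peer would still be logged, so a real client that drops its connection does not go unnoticed.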

References

github.com/netty/netty…
tools.ietf.org/html/rfc793
my.oschina.net/costaxu/blo…