Flannel’s UDP mode was introduced in the previous article, but because it builds on the Linux virtual network device TUN, I will first explain the TUN device and then turn to the main topic of this article.

Flannel has three working modes: UDP, VXLAN, and host-gw. If DirectRouting is enabled, VXLAN and host-gw are combined: host-gw is used when the hosts are on the same network segment, and VXLAN is used when they are on different segments. In this article we will analyze flannel’s UDP mode. Although this mode is no longer used in production, understanding it helps us better understand the Linux virtual network device TUN, and it shows us a way to manipulate kernel protocol-stack data from user space.

When Flannel is running in UDP mode, each node has two daemons and a binary file:

1. kube-flannel, which runs as a Kubernetes DaemonSet and is responsible for interacting with etcd (or the API server), obtaining the latest node and subnet information, and synchronizing that information to the flanneld process on the same node through a Unix domain socket.

2. A daemon named flanneld, which listens on UDP port 8285 (the default port, which can be changed) and opens /dev/net/tun (see the sketch after this list). It also opens a Unix domain socket to receive instructions from kube-flannel and update its routing table.

3. The binary file /opt/cni/bin/flannel, a CNI plugin that in turn calls other CNI plugins (bridge and host-local, usually also located in the /opt/cni/bin directory) to complete the network configuration that connects the container to the host.
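To make the role of /dev/net/tun more concrete, here is a minimal Go sketch of how a user-space daemon can attach a TUN interface by opening /dev/net/tun and issuing the TUNSETIFF ioctl. This is not flanneld's actual code (its data path is written in C, as discussed at the end of this article); the interface name tun0 simply matches the example used later in this article.

```go
// tunopen.go - minimal sketch: open /dev/net/tun and attach a TUN interface.
package main

import (
	"fmt"
	"os"
	"unsafe"

	"golang.org/x/sys/unix"
)

func openTun(name string) (*os.File, error) {
	f, err := os.OpenFile("/dev/net/tun", os.O_RDWR, 0)
	if err != nil {
		return nil, err
	}

	// Build a struct ifreq: the interface name (16 bytes) followed by the flags.
	var ifr [unix.IFNAMSIZ + 64]byte
	copy(ifr[:unix.IFNAMSIZ-1], name)
	// IFF_TUN: layer-3 (IP) packets; IFF_NO_PI: no extra packet-info header.
	flags := uint16(unix.IFF_TUN | unix.IFF_NO_PI)
	*(*uint16)(unsafe.Pointer(&ifr[unix.IFNAMSIZ])) = flags

	if _, _, errno := unix.Syscall(unix.SYS_IOCTL, f.Fd(),
		uintptr(unix.TUNSETIFF), uintptr(unsafe.Pointer(&ifr[0]))); errno != 0 {
		f.Close()
		return nil, errno
	}
	return f, nil
}

func main() {
	f, err := openTun("tun0") // requires CAP_NET_ADMIN (e.g. run as root)
	if err != nil {
		fmt.Println("open tun:", err)
		return
	}
	defer f.Close()
	fmt.Println("tun0 attached; packets routed to it can now be read from this fd")
}
```

In the environment described below, flanneld would additionally assign an IP address (such as 10.244.1.1) to the interface and bring it up; those steps are omitted here.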

In addition, on each node:

1. A TUN device named flannel0 is created.

2. A direct route to the pod CIDR is created, so that all pod-to-pod traffic is routed to the flannel0 TUN device (a sketch of installing such a route programmatically follows this list).
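As an illustration of what that direct route looks like when installed programmatically, here is a small Go example using the github.com/vishvananda/netlink library. The device name tun0 and the 10.244.0.0/16 CIDR come from the example environment described below; this is a sketch rather than flannel's own code, and it must run with root privileges.

```go
// routeadd.go - minimal sketch (not flannel's own code) of adding the direct
// route "10.244.0.0/16 dev tun0" that sends all pod-to-pod traffic into the
// TUN device.
package main

import (
	"log"
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	// Find the TUN interface created earlier (tun0 in this article's example).
	link, err := netlink.LinkByName("tun0")
	if err != nil {
		log.Fatalf("lookup tun0: %v", err)
	}

	// The cluster-wide pod CIDR from the example environment.
	_, podCIDR, err := net.ParseCIDR("10.244.0.0/16")
	if err != nil {
		log.Fatal(err)
	}

	// Equivalent to: ip route add 10.244.0.0/16 dev tun0
	route := &netlink.Route{
		LinkIndex: link.Attrs().Index,
		Dst:       podCIDR,
	}
	if err := netlink.RouteAdd(route); err != nil {
		log.Fatalf("add route: %v", err)
	}
	log.Println("route 10.244.0.0/16 -> tun0 installed")
}
```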

Below is a step-by-step explanation of how the components above are linked together so that pods can communicate across hosts. The TUN device used in the example below is called tun0.

Environment preparation

First, let me introduce the prepared environment, as shown below.

Two hosts, Host1 and Host2:

Host1:

- eth0 IP address: 10.57.4.20.
- A Linux veth device, veth1, connected to pod1; no IP address is set on it.
- A Linux TUN device, tun0, with IP 10.244.1.1 (this address is usually not involved in pod-to-pod communication, but it is used for host-to-pod traffic).
- Three host routes: one is the default gateway; the other two are direct routes, one to pod1 and one to 10.244.0.0/16.
- pod1’s IP address is 10.244.1.3.

Host2:

- eth0 IP address: 10.57.4.21.
- A Linux veth device connected to pod2; no IP address is set on it.
- A Linux TUN device, tun0, with IP 10.244.2.1 (again, usually not involved in pod-to-pod communication, but used for host-to-pod traffic).
- Three host routes: one is the default gateway; the other two are direct routes, one to pod2 and one to 10.244.0.0/16.
- pod2’s IP address is 10.244.2.3.

Also on both hosts:

1. IP forwarding (net.ipv4.ip_forward=1) is enabled.

2. The kube-flannel and flanneld processes are both running. kube-flannel has connected to etcd (or the API server) normally, subscribed to the information about all nodes and their subnets, and pushed that information into flanneld’s routing table on the host.

3. The pod on each host is connected to the host through a veth pair. As mentioned in the previous article, the default gateway 169.254.2.2 is configured inside the pod, and the peer veth1 is enabled to answer ARP requests, so pod1 can already send traffic out of the container to veth1 on the host.

The following example illustrates the process of sending and receiving packets; how flannel works together with the CNI binary is not the focus of this article.

Sending process

Let’s assume that pod2 has a web service running. When we send an HTTP request from pod1 to pod2, the packet goes through the following steps:

1. The packet leaves pod1’s user process and enters pod1’s protocol stack. The stack finds that the destination is not on the same network segment, so it sets the next hop to the default gateway 169.254.2.2, fills veth1’s MAC address into the destination MAC field to complete the Ethernet header, and sends the packet out, so that it enters host1’s network protocol stack through veth1.

2. In host1’s network protocol stack the packet goes through the routing decision, which finds that the destination address 10.244.2.3 is not a local address.

3. Since IP forwarding is enabled, the host protocol stack searches the host routing table for a suitable route: packets destined for the 10.244.0.0/16 segment go through tun0, so the packet is forwarded out via tun0.

4. tun0 is a Linux TUN device: packets that the protocol stack sends into it are received by the user-space process holding the other end of tun0, in this case flanneld.

5. flanneld looks at the packet’s IP header, finds that the destination is 10.244.2.3, and looks up the next hop for this destination in its own routing table. The information received from kube-flannel (synchronized right after the kube-flannel and flanneld processes started) says that the subnet 10.244.2.0/24 is on host 10.57.4.21. So, after the necessary processing, flanneld treats the whole packet read from tun0 (including the IP and TCP headers) as plain data and sends it through its open UDP socket to port 8285 on 10.57.4.21 (see the sketch below).

At this point, the packet sent by pod1’s protocol stack has become the payload in the data area of another packet. The structure of the packet leaving host1’s eth0, from outside to inside, is: Ethernet header | outer IP header (10.57.4.20 → 10.57.4.21) | UDP header (destination port 8285) | inner IP header (10.244.1.3 → 10.244.2.3) | TCP header | HTTP data.
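The forwarding loop in steps 4 and 5 can be sketched in a few lines of Go. This is not flanneld’s real implementation (that is the C function tun_to_udp mentioned at the end of the article); the routes map, the lookupHost helper, and the function names here are illustrative assumptions that mirror the two-host example above.

```go
// Sketch of the sending-side loop: read IP packets from the TUN fd, find the
// destination host for the inner destination IP, and forward the whole packet
// as the payload of a UDP datagram to port 8285 on that host.
package udpmode

import (
	"log"
	"net"
	"os"
)

// lookupHost returns the node owning the pod subnet that contains dst,
// e.g. {"10.244.2.0/24": "10.57.4.21"} in this article's example.
func lookupHost(dst net.IP, routes map[string]string) string {
	for cidr, host := range routes {
		if _, subnet, err := net.ParseCIDR(cidr); err == nil && subnet.Contains(dst) {
			return host
		}
	}
	return ""
}

// tun is the fd obtained from /dev/net/tun; conn is an unconnected UDP socket
// bound to port 8285 (net.ListenUDP); routes is fed by kube-flannel.
func tunToUDP(tun *os.File, conn *net.UDPConn, routes map[string]string) {
	buf := make([]byte, 65536)
	for {
		n, err := tun.Read(buf) // one read returns one complete IP packet
		if err != nil {
			log.Fatal(err)
		}
		pkt := buf[:n]
		if len(pkt) < 20 || pkt[0]>>4 != 4 {
			continue // not an IPv4 packet
		}
		dst := net.IP(pkt[16:20]) // destination address field of the IPv4 header
		host := lookupHost(dst, routes)
		if host == "" {
			continue // no route learned from kube-flannel yet, drop the packet
		}
		// The inner packet (IP header + TCP header + payload) becomes the UDP payload.
		raddr := &net.UDPAddr{IP: net.ParseIP(host), Port: 8285}
		if _, err := conn.WriteToUDP(pkt, raddr); err != nil {
			log.Printf("send to %s: %v", host, err)
		}
	}
}
```

With routes set to {"10.244.2.0/24": "10.57.4.21"}, a packet from pod1 to 10.244.2.3 is wrapped in a UDP datagram addressed to 10.57.4.21:8285, which produces exactly the layered structure described above.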

Receiving process

The underlying host network delivers this UDP packet to host2’s eth0 NIC in the normal way. The receiving side is essentially the mirror image of the sending side:

1. host2’s protocol stack receives the UDP packet, sees that the destination port is 8285, and hands the payload to the flanneld process listening on that port.

2. flanneld writes the payload, which is the original IP packet sent by pod1, back into /dev/net/tun, so it re-enters host2’s kernel protocol stack through tun0.

3. The stack makes a routing decision: the destination 10.244.2.3 matches the direct route to pod2, so the packet is forwarded over the veth device into pod2, where the web service finally receives the HTTP request.

The return path of the reply from 10.244.2.3 back to pod1 is similar.
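The receiving-side loop is even simpler. Again, this only mirrors the real C function udp_to_tun mentioned below: every UDP payload arriving on port 8285 is written back into the TUN file descriptor, and the kernel takes over from there.

```go
package udpmode

import (
	"log"
	"net"
	"os"
)

// udpToTun is the mirror of tunToUDP above: each datagram received on the
// UDP socket (bound to port 8285) carries one complete IP packet from a
// remote pod. Writing it into the TUN fd injects it into the local kernel
// protocol stack, which routes it to the destination pod over its veth device.
func udpToTun(conn *net.UDPConn, tun *os.File) {
	buf := make([]byte, 65536)
	for {
		n, _, err := conn.ReadFromUDP(buf) // one datagram == one inner IP packet
		if err != nil {
			log.Fatal(err)
		}
		if _, err := tun.Write(buf[:n]); err != nil { // hand the packet to the kernel via tun0
			log.Printf("write to tun: %v", err)
		}
	}
}
```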

During pod-to-pod communication across hosts, flannel’s UDP mode makes each packet switch between kernel mode and user mode twice (kernel → flanneld → kernel), so its performance is much lower than that of VXLAN mode. Although VXLAN mode also encapsulates packets in UDP, the encapsulation and decapsulation are done entirely in kernel mode, which is why its performance is much higher than UDP mode’s.

In the flannel source code, flanneld’s data path is implemented directly in C; the key file is backend/udp/proxy_amd64.c, while kube-flannel is implemented in Go, with the key parts in backend/udp/cproxy_amd64.go. The functions tun_to_udp and udp_to_tun are the key elements in proxy_amd64.c.

It is said that when the author implemented UDP mode, the Linux kernel did not yet support VXLAN. However, from what I found online, the Linux kernel has supported VXLAN since 3.7, and VXLAN support was essentially complete by 3.12: kernelnewbies.org/Linux_3.7#V… Kernelnewbies.org/Linux_3.12#…

Kernel 3.7 was released at the end of 2012 and 3.12 in November 2013, while the first flannel code was committed in 2014. Besides, VXLAN is obviously easier to implement.
