Kubernetes is by far the most popular container orchestration tool, and networking is a crucial part of it. This article introduces the relevant networking principles and terminology, then surveys and compares the networking solutions used with Kubernetes. Kubernetes itself does not provide networking functionality; it only exposes a network interface that is implemented by plug-ins. To meet different network requirements and make it easy to configure a container's network when the container is created or destroyed, CNI (Container Network Interface) was created to define the interface between the runtime and the plug-in. In Kubernetes, CNI connects the kubelet and the network plug-in to configure the network for containers.

1 Background

A container network is the mechanism by which a container connects to other containers, the host, and external networks. The Kubernetes network model requires that each Pod have its own IP address and assumes that all Pods sit in a flat network space that is directly reachable. Users do not need to worry about establishing connections between Pods or mapping container ports to host ports. All nodes can communicate with all containers without NAT, and the address a container sees for itself is the same address that others see for it.

2 Technical Terms

IPAM: IP address management. IPAM is not unique to containers; traditional mechanisms such as DHCP are also forms of IPAM. In the container era there are two mainstream approaches: allocating IP address segments based on CIDR, or allocating an IP address to each individual container. Once a cluster of container hosts is formed, every container running on it must be assigned a globally unique IP address, which is where IPAM comes in.

Overlay: Builds an independent network on top of an existing layer 2 or layer 3 network. This network usually has its own independent IP address space and its own switching or routing implementation.

BGP: a routing protocol for exchanging routing information between autonomous systems on the Internet backbone; it governs how packets are routed between edge routers. BGP works out how to send packets from one network to another by taking into account available paths, routing rules, and specific network policies. BGP is sometimes used as the routing mechanism in CNI plug-ins instead of an encapsulated overlay network.

Encapsulation: the process of wrapping network packets in an additional layer that provides extra context and information. In overlay networks, encapsulation is used to translate from the virtual network to the underlying address space, so that a packet can be routed to a different location, unwrapped there, and continue on to its destination.

3 CNI

Container Network Interface (CNI) is a container network specification initiated by CoreOS and is the basis of the Kubernetes network plug-in model. The basic idea is that the container runtime creates a network namespace when creating a container, invokes the CNI plug-in to configure the network for that namespace, and then starts the processes in the container.

The CNI plug-in is responsible for configuring the network for containers. It must be implemented as an executable invoked by a container management system (rkt or Kubernetes) and exposes two basic interfaces to configure the network:

Configure network: AddNetwork(net NetworkConfig, rt RuntimeConf) (types.Result, error)

Clear the network: DelNetwork(net NetworkConfig, rt RuntimeConf) error
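
Expressed in Go, the contract sketched by these two calls looks roughly like the following. This is only a simplified illustration based on the description above; the real definitions live in the libcni package of github.com/containernetworking/cni, and the type and field names used here are placeholders rather than the actual library types.

```go
// Package cni: a minimal sketch of the runtime/plug-in contract described above.
package cni

// NetworkConfig describes the network a container should join
// (normally parsed from a JSON file under /etc/cni/net.d).
type NetworkConfig struct {
	Name string // network name
	Type string // plug-in binary to execute, e.g. "bridge" or "flannel"
}

// RuntimeConf carries the per-container runtime parameters.
type RuntimeConf struct {
	ContainerID string // ID of the container being configured
	NetNS       string // path to the container's network namespace
	IfName      string // interface name to create inside the namespace, e.g. "eth0"
}

// Result is what a plug-in reports back: the addresses and routes it set up.
type Result struct {
	IPs    []string // assigned IP addresses in CIDR form
	Routes []string // routes installed for the interface
}

// CNI is the contract between the runtime (kubelet) and the plug-in.
type CNI interface {
	// AddNetwork configures the network for a container; called on creation.
	AddNetwork(net NetworkConfig, rt RuntimeConf) (Result, error)
	// DelNetwork cleans the network up again; called on deletion.
	DelNetwork(net NetworkConfig, rt RuntimeConf) error
}
```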

In Kubernetes, the kubelet determines which network a container should join and which plug-in needs to be invoked. The plug-in adds an interface to the container's network namespace as one side of a veth pair, then makes changes on the host, such as attaching the other end of the veth to a bridge. After that, it assigns an IP address and sets up routes by calling a separate IPAM (IP Address Management) plug-in.
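
To make that sequence concrete, here is a minimal sketch of the host-side plumbing. It shells out to the ip command for readability instead of using netlink as real plug-ins do, it requires root, and the interface, namespace, and bridge names (veth1a2b3c, demo, cni0) as well as the hard-coded address are invented for the example.

```go
package main

import (
	"log"
	"os/exec"
)

// run executes one ip(8) command and stops the sketch on failure.
// Requires root; the named namespace and the bridge must already exist.
func run(args ...string) {
	if out, err := exec.Command("ip", args...).CombinedOutput(); err != nil {
		log.Fatalf("ip %v failed: %v: %s", args, err, out)
	}
}

func main() {
	// Hypothetical names; a real plug-in derives them from the container ID
	// and receives the namespace path from the runtime.
	const (
		netns    = "demo"       // named network namespace of the container
		hostVeth = "veth1a2b3c" // host side of the veth pair
		ctrVeth  = "veth-ctr"   // container side (a real plug-in renames it to eth0)
		bridge   = "cni0"       // host bridge the pod network hangs off
	)

	run("link", "add", hostVeth, "type", "veth", "peer", "name", ctrVeth) // 1. create the veth pair
	run("link", "set", ctrVeth, "netns", netns)                           // 2. move one end into the pod namespace
	run("link", "set", hostVeth, "master", bridge)                        // 3. plug the other end into the bridge
	run("link", "set", hostVeth, "up")
	// 4. Address and route assignment is normally delegated to an IPAM plug-in;
	//    a hard-coded address stands in for its result here.
	run("netns", "exec", netns, "ip", "addr", "add", "10.244.1.10/24", "dev", ctrVeth)
	run("netns", "exec", netns, "ip", "link", "set", ctrVeth, "up")
}
```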

4 IPAM

The CNI plug-in described above solves the problem of configuring the network inside a Pod, but there is another problem to solve: IP management. To decouple network configuration from IP management, CNI defines a second type of plug-in – the IP address management plug-in (IPAM plug-in).

Like CNI plug-ins, IPAM plug-ins are invoked by running an executable file. The IPAM plug-in is responsible for configuring and managing IP addresses for interfaces.

The CNI plug-in calls the IPAM plug-in at execution time. The IPAM plug-in determines the interface IP/subnet, gateway, and routing information, so that when the container starts its IP address is assigned and its network is configured; this information is returned to the CNI plug-in. The IPAM plug-in is called again when the container is deleted, to clean up these resources.

The IPAM plug-in can retrieve information from a protocol (such as DHCP), from data stored on the local file system, from the "ipam" section of the network configuration file, or from a combination of these.
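
As an illustration of the "data stored on the local file system" approach, the toy allocator below reserves each address as a file named after the IP, so the same address cannot be handed out twice. The directory layout, file contents, and function names are invented for this sketch and do not correspond to any particular IPAM plug-in.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// allocate hands out the next free IP in the given subnet and records the
// reservation as a file on local disk containing the container ID.
func allocate(dataDir, subnet, containerID string) (net.IP, error) {
	ip, ipnet, err := net.ParseCIDR(subnet)
	if err != nil {
		return nil, err
	}
	if err := os.MkdirAll(dataDir, 0o755); err != nil {
		return nil, err
	}
	// Walk the subnet from just above the network address and take the first
	// address that does not already have a reservation file.
	for ip := next(ip.Mask(ipnet.Mask)); ipnet.Contains(ip); ip = next(ip) {
		path := filepath.Join(dataDir, ip.String())
		f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
		if os.IsExist(err) {
			continue // already reserved
		}
		if err != nil {
			return nil, err
		}
		fmt.Fprintln(f, containerID) // remember who owns the address
		f.Close()
		return ip, nil
	}
	return nil, fmt.Errorf("subnet %s exhausted", subnet)
}

// next returns the IP immediately after ip.
func next(ip net.IP) net.IP {
	out := make(net.IP, len(ip))
	copy(out, ip)
	for i := len(out) - 1; i >= 0; i-- {
		out[i]++
		if out[i] != 0 {
			break
		}
	}
	return out
}

func main() {
	ip, err := allocate("/tmp/ipam-demo", "10.244.1.0/24", "container-123")
	if err != nil {
		panic(err)
	}
	fmt.Println("assigned", ip)
}
```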

5 Two Common K8S Network Solutions

Flannel

Flannel is a network solution designed by the CoreOS team. Using etcd as storage, it assigns each node's containers globally unique IP addresses, and containers communicate with one another over an overlay network.

Communication between Pods works as follows:

• Pod1 and Pod2 are not on the same host

After a packet leaves Pod1, it is forwarded by the host's docker0 virtual NIC to the flannel0 virtual NIC, with the flanneld service listening on the other end. Flannel uses the etcd service to maintain a routing table between nodes and to manage the pool of allocatable IP address segments, and it also watches the actual address of each Pod in etcd. The flanneld service on the source host encapsulates the original data in UDP and, according to its own routing table, delivers it to the flanneld service on the destination node. From there the data enters the flannel0 virtual NIC of the destination node, is forwarded to the docker0 virtual NIC of the destination host, and finally docker0 routes it to the destination container just as it would for a local container. (A minimal sketch of the UDP encapsulation step follows after this list.)

• Pod1 and Pod2 are on the same host

If Pod1 and Pod2 are on the same host, the docker0 bridge forwards the request directly to Pod2, bypassing Flannel.
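
Conceptually, the encapsulation step performed by flanneld's UDP backend amounts to the sketch below: the original pod-to-pod IP packet becomes the payload of a UDP datagram addressed to the flanneld on the destination node. The node address, port, and routing-table contents are placeholders; the real daemon reads packets from the flannel0 device and keeps its routing table in etcd.

```go
package main

import (
	"fmt"
	"net"
)

// forward wraps the inner IP packet in a UDP datagram addressed to the
// flanneld of the node that hosts the destination pod, which unwraps it
// and injects it into its own flannel0/docker0.
func forward(packet []byte, destNode string) error {
	conn, err := net.Dial("udp", destNode) // e.g. "192.168.0.101:8285"
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(packet) // the inner IP packet becomes the UDP payload
	return err
}

func main() {
	// Stand-in for a packet read from flannel0; real traffic is a full IP frame.
	innerPacket := []byte("ip packet from 10.244.1.10 to 10.244.2.20")

	// The routing table maintained in etcd maps destination pod subnets to nodes.
	routeTable := map[string]string{
		"10.244.2.0/24": "192.168.0.101:8285", // hypothetical destination node
	}

	if err := forward(innerPacket, routeTable["10.244.2.0/24"]); err != nil {
		fmt.Println("forward failed:", err)
	}
}
```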

Calico

Calico is a pure Layer 3 data center network solution that integrates seamlessly with IaaS cloud architectures such as OpenStack to provide controlled IP communication between VMs, containers, and bare-metal machines.

By scaling the principles of IP networking that work across the Internet down to the data center, Calico implements an efficient vRouter on each compute node for data forwarding using the Linux kernel. Each vRouter uses BGP to propagate the routing information of the workloads running on it to the entire Calico network. Small deployments can peer directly with one another, while large deployments can use designated BGP route reflectors. In this way, all workload traffic is interconnected through plain IP routing.

Calico node networking can directly use the network fabric of the data center (whether L2 or L3) without additional NAT, tunnels, or overlay networks.
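
A rough picture of what "plain IP routing without tunnels" means in practice: on the node hosting a workload, the vRouter boils down to a /32 host route pointing at the workload's veth, and every other node learns that prefix via BGP as an ordinary route through the hosting node. The addresses and interface name in the sketch below are invented for illustration.

```go
package main

import "fmt"

func main() {
	// Hypothetical workload address, veth name, and node IP for illustration.
	workloadIP := "10.244.1.10"
	localVeth := "cali1a2b3c4d5e6" // host end of the workload's veth on the hosting node
	hostingNode := "192.168.0.101" // data-center IP of the node running the workload

	// On the node that hosts the workload, forwarding reduces to a plain
	// /32 host route pointing at the workload's veth:
	fmt.Printf("hosting node:  ip route add %s/32 dev %s\n", workloadIP, localVeth)

	// Other nodes learn the same prefix over BGP as an ordinary route via the
	// hosting node, so traffic is forwarded as plain IP with no tunnel or NAT:
	fmt.Printf("peer nodes:    %s/32 via %s\n", workloadIP, hostingNode)
}
```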

Calico also provides rich and flexible network policy, using ACLs on each node to guarantee workload multi-tenant isolation, security groups, and other reachability restrictions.

6 K8S Network Practice in Cloud Wing

Container technology has been widely adopted thanks to its many advantages. Containers eliminate differences between deployment environments, guarantee a consistent environment throughout an application's life cycle, offer high resource utilization and isolation, and enable rapid deployment, improving production efficiency and saving costs for enterprises.

With the rapid growth of JD Cloud's business, deployments could no longer be confined to traditional forms such as physical machines and virtual machines, and Cloud Wing began exploring containers as early as 2017. We found that the ideas behind containers are very advanced, but the production environment contains a great deal of existing business that does not match them, and there are some conceptual differences. For example, the container community advocates one process per container (running only one application in a container).

As JD's enterprise cloud platform, Cloud Wing provides deployment, operations, monitoring, logging, and many other functions. Most of these functions require an agent deployed inside the instance that communicates with the corresponding service, and the agent usually identifies its instance by IP address. A very strong requirement when adopting containers internally is therefore a fixed IP, so that operations or development staff can conveniently log in to a container to troubleshoot problems. A large portion of the existing business architecture also relies on fixed IPs, and some internal basic systems filter by IP address; for example, back-ends behind access/LB must be registered by IP. When bringing containers in-house it is difficult to abandon the original technical systems, and compatibility with existing business has to be considered. We hope that adopting containers can lower the onboarding cost and stay close to traditional operation and maintenance habits. For ease of management, we put the container network and the internal network of the data center on one flat network.

We developed ipamD. The principle is that every time a Pod is created, it calls an IPAM client that requests an address from ipamD. ipamD is a resident process that maintains the address information of the corresponding groups under each application, which is how the fixed-IP requirement is met.
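
The idea can be pictured with the small sketch below: a resident allocator that pins an address to each (application, group) pair, so a rebuilt Pod in the same group gets its old IP back. The type and method names, the in-memory pool, and the missing persistence and RPC layers are all simplifications; this is not the actual ipamD implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// fixedIPAM is a toy stand-in for a resident ipamD-like process: it remembers
// which address was handed to each (application, group) pair.
type fixedIPAM struct {
	mu       sync.Mutex
	pool     []string          // free addresses, e.g. carved from the flat data-center network
	reserved map[string]string // "app/group" -> address already pinned to it
}

func newFixedIPAM(pool []string) *fixedIPAM {
	return &fixedIPAM{pool: pool, reserved: map[string]string{}}
}

// Request returns the group's existing address if there is one, otherwise it
// takes a free address from the pool and pins it to that group.
func (a *fixedIPAM) Request(app, group string) (string, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	key := app + "/" + group
	if ip, ok := a.reserved[key]; ok {
		return ip, nil // a rebuilt pod keeps its previous IP
	}
	if len(a.pool) == 0 {
		return "", fmt.Errorf("address pool exhausted")
	}
	ip := a.pool[0]
	a.pool = a.pool[1:]
	a.reserved[key] = ip
	return ip, nil
}

func main() {
	ipam := newFixedIPAM([]string{"10.0.8.10", "10.0.8.11"})
	first, _ := ipam.Request("order-service", "group-1")
	again, _ := ipam.Request("order-service", "group-1") // e.g. after the pod is rebuilt
	fmt.Println(first, again)                            // both print the same address
}
```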

In addition, in order to reduce costs, increase efficiency, and meet the specific needs of some business parties, we wanted containers to run on JD Cloud virtual machines to facilitate the management and control of the relevant businesses. We developed corresponding network plug-ins to support running containers on cloud hosts, so that Cloud Wing users can consume IaaS resources with no difference in experience.

The CNI plug-in on the cloud host achieves:

  1. All containers can communicate with each other without NAT
  2. All nodes can communicate with all containers without NAT, and vice versa

(from https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/cni-proposal.md)

Note: For details, see amazon-vpc-cni-k8s (github.com/aws/amazon-…).

With the help of Docker/K8S, Cloud Wing greatly improves the productivity of development, testing, and operations, and simplifies both manual and automated system management.
