In Kubernetes (K8s), our applications are scheduled onto nodes in the form of pods, which makes designing how the cluster handles networking between containers a real challenge. Today we will start our discussion of K8s networking with pod (application) communication.

The essay consists of the following:

  • K8s network model and implementation schemes
  • Communication between containers in a Pod
  • Pod-to-Pod communication
  • Pod-to-Service communication
  • External-to-Service communication

K8s network model and implementation schemes

Each Pod (the smallest schedulable unit) in a K8s cluster has its own IP address; this is the IP-per-Pod model.

In the IP-per-Pod model, each pod IP is unique within the cluster, so we don't need to explicitly create links between pods, and we don't need to deal with mappings between container ports and host ports. A pod can be treated like an independent virtual machine or physical host for the purposes of port allocation, naming, service discovery, load balancing, application configuration, and migration.

The figure below shows, at a glance, how two containers communicate with a client in a Docker network versus a K8s network.

K8s is a large distributed system. To keep its core functions simple and adapt to the network environments of different users, K8s integrates various network solutions through the Container Network Interface (CNI). These network solutions must satisfy the requirements of the K8s network model:

  • Pods on one node can communicate with pods on all other nodes without NAT
  • Agents on a node (e.g., system daemons, the kubelet) can communicate with all pods on that node

Note: for platforms that support pods running on the host network (e.g., Linux):

  • Pods running in a node's host network can communicate with pods on all nodes without NAT

Operating like this, isn't it a bit like Meituan? The delivery business (CNI) is outsourced to third-party companies (the implementation schemes). Meituan doesn't care what vehicle (network) a rider uses to deliver the meal; as long as the delivery meets the relevant rules, such as arriving on time with nothing spilled (the model requirements), it is a qualified delivery.

CNI does two things: it allocates network resources when a container is created and frees them when the container is deleted. Commonly used CNI implementations include Flannel, Calico, Weave, and the CNI plug-ins that cloud vendors build for their own networks, such as Huawei's CNI-Genie and Alibaba Cloud's Terway. How each implementation works is not the focus of this article; perhaps a separate article later.
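To make this concrete, each node carries a CNI configuration that the kubelet hands to the plug-in on container create and delete. A minimal sketch using the reference bridge plug-in (file name, subnet, and values are illustrative, not taken from a real cluster):

# cat /etc/cni/net.d/10-mynet.conf
{
  "cniVersion": "0.3.1",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.0.0/24"
  }
}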


Communication between containers in a Pod

Communication between containers in a Pod is simple. All containers in the same pod share storage and network, i.e., they use the same IP address and port space, so they can reach each other via localhost. This works through an intermediate "infra" container, which is created first in every pod; the other containers then join its network namespace.
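On a node running the Docker runtime you can see these infra (pause) containers directly; each pod has one, holding its network namespace (output abridged; names and image tag vary by cluster):

# docker ps | grep pause
0a1b2c3d4e5f   k8s.gcr.io/pause:3.2   "/pause"   ...   k8s_POD_pod-localhost-765b965cfc-8sh76_training_...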

Suppose we have a pod that contains a busybox container and an nginx container.
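A minimal manifest sketch for such a pod might look like this (the session below actually uses a Deployment, hence the hashed pod name; container names and images are assumed to match it):

apiVersion: v1
kind: Pod
metadata:
  name: pod-localhost
  namespace: training
spec:
  containers:
    - name: container-si1nrb   # the busybox client container
      image: busybox
      command: ["sleep", "3600"]
    - name: nginx              # serves port 80, reachable from busybox via localhost
      image: nginx:alpine
      ports:
        - containerPort: 80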

kubectl get pod -n training
NAME                             READY   STATUS    RESTARTS   AGE
pod-localhost-765b965cfc-8sh76   2/2     Running   0          2m56s

From the busybox container, telnet to port 80 of the nginx container.

kubectl exec -it  pod-localhost-765b965cfc-8sh76 -c container-si1nrb -n training -- /bin/sh

# telnet localhost 80
Connected to localhost

When a pod has multiple containers, use -c to specify which container to enter (see kubectl describe for the container names). Clearly, port 80 of the nginx container can be reached via localhost from within the same pod. This is also why closely related applications are often deployed in the same pod.


Pod-to-Pod communication

  1. Pods on the same host

We use a nodeSelector to schedule both pods onto the same node:

...
nodeSelector:
  kubernetes.io/hostname: node2
...

Each pod obtains its own IP address, and the two pods communicate normally over those IPs:

# kubectl get pod -o wide -n training
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE    NOMINATED NODE   READINESS GATES
pod-to-pod-64444686ff-w7c4g           1/1     Running   0          6m53s   100.82.98.206   node2   <none>           <none>
pod-to-pod-busybox-7b9db67bc6-tl27c   1/1     Running   ...        ...     ...             node2   <none>           <none>

# kubectl exec -it pod-to-pod-busybox-7b9db67bc6-tl27c -n training -- /bin/sh
/ # telnet 100.82.98.206 80
Connected to 100.82.98.206

Pod-to-pod traffic on the same host works much like the Docker bridge network we covered before: veth pairs connect each container's network namespace to the host's namespace through a Linux bridge. See the earlier article on the Docker container bridge network for details.
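You can glimpse this on a node with iproute2; each pod's veth peer shows up as a host-side interface (a sketch only; interface and bridge names depend on the CNI plug-in, e.g. cali* for Calico, veth* attached to cni0 for Flannel):

# ip link show type veth
7: veth1a2b3c4d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ... master cni0 state UP ...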

Let's reuse the earlier diagram, replacing the gray part with the CNI implementation used in K8s.

  2. Pods on different hosts

At this point, our pods are distributed as follows:

# kubectl get pod -o wide -n training
NAME                                        READY   STATUS    RESTARTS   AGE    IP              NODE    NOMINATED NODE   READINESS GATES
pod-to-pod-64444686ff-w7c4g                 1/1     Running   0          104m   100.82.98.206   node2   <none>           <none>
pod-to-pod-busybox-node2-6476f7b7f9-mqcw9   1/1     Running   ...        ...    ...             ...     <none>           <none>

# kubectl exec -it pod-to-pod-busybox-node2-6476f7b7f9-mqcw9 -n training -- /bin/sh
/ # telnet 100.82.98.206 80
Connected to 100.82.98.206

Pod communication across hosts depends on the CNI plug-in. Let's take Calico as an example for a brief look. As the Calico architecture diagram shows, each node still uses the container network model, and Calico uses the Linux kernel to implement an efficient virtual router (vRouter) on each node to forward data. Each vRouter broadcasts routing information to the rest of the network and installs the corresponding route forwarding rules. In addition, iptables supplies the network policies that implement K8s Network Policy, restricting network reachability between containers.

Put simply, a virtual router (Calico Node) runs on each host, so every host acts as a router, forming an interconnected network topology.

With Calico node-to-node networking, the data center's existing network structure (L2 or L3) can be used directly, with no extra NAT, tunnels, or overlay networks. Since there is no additional packet encapsulation and decapsulation, CPU cycles are saved and network efficiency improves.
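You can see the vRouter at work in the host routing table of a Calico node. A sketch with illustrative addresses: a local pod gets a /32 route pointing at its cali* veth device, while a pod CIDR block on another node (learned over BGP, hence proto bird) routes via that node:

# ip route
100.82.98.206 dev cali3f8a2b41c56 scope link
100.82.166.128/26 via 192.168.1.87 dev eth0 proto bird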


Pod-to-Service communication

We know that containers can be destroyed at any time in K8s, so pod IPs are clearly not durable: they disappear and reappear as applications scale up or down, as applications crash, as nodes restart, and so on. Services were designed to deal with this. A Service tracks the state of a set of pods, i.e., a set of pod IP addresses that change dynamically over time. The client only needs to know the Service, an immutable virtual IP.

Let's look at the typical use of a Service with pods. We create a Service with the label selector app: nginx, which will route to pods carrying the app=nginx label.
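A minimal sketch of such a Service manifest, assuming the nginx pods listen on container port 80 (the name, namespace, and port match the session below):

apiVersion: v1
kind: Service
metadata:
  name: train-service
  namespace: training
spec:
  selector:
    app: nginx       # routes to pods labeled app=nginx
  ports:
    - port: 8881     # the Service (ClusterIP) port
      targetPort: 80 # the container port on the nginx pods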

# kubectl get service -n training
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
train-service   ClusterIP   10.96.229.238   <none>        8881/TCP   10m

The Service is exposed on port 8881, so pods in the cluster can use it to reach the pods labeled app=nginx that are bound to the Service:

# kubectl run curl -it --rm --image=nginx:alpine -- /bin/sh
/ # curl 10.96.229.238:8881
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
...

In most cases, you won't know the Service IP when deployments are automated. Instead, you access the Service as ServiceName:Port, letting DNS resolve the name.
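For example, with the cluster's standard DNS naming, the Service above is reachable by name (the short form works from within the same namespace):

/ # curl train-service:8881
/ # curl train-service.training.svc.cluster.local:8881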

How does a service do service discovery?

Endpoints is a resource object in K8s. K8s tracks pod IPs through Endpoints, and a Service discovers its pods via the Endpoints object associated with it. The service discovery mechanism is roughly shown below and will be explored in more detail in a later article.
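You can inspect this directly; the ENDPOINTS column lists the pod IP:port pairs the Service currently routes to (illustrative output, not from the session above):

# kubectl get endpoints train-service -n training
NAME            ENDPOINTS          AGE
train-service   100.82.98.206:80   12m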


External-to-Service communication

In fact, so-called external communication is also just a form of Service.

There are several types of Service, each with different uses:

  • ClusterIP: for access within the cluster (pod to Service); the Service is reached via its cluster IP.
  • NodePort: for access from outside the cluster, through a port opened on every node.
  • LoadBalancer: also for access from outside the cluster; effectively an extension of NodePort, where a dedicated load balancer forwards requests to the nodes' NodePort and external clients only talk to the load balancer.
  • None: used for pod discovery. This type of Service is also called a Headless Service (a minimal sketch follows this list).
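A minimal sketch of a headless Service (the name is hypothetical); setting clusterIP: None means DNS returns the pod IPs directly instead of a virtual IP:

apiVersion: v1
kind: Service
metadata:
  name: train-headless
  namespace: training
spec:
  clusterIP: None   # headless: no virtual IP is allocated
  selector:
    app: nginx
  ports:
    - port: 80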

Let’s start with NodePort:

We specify type: NodePort in the Service to create a service that opens port 30678 on every node, so that we can hit any node at IP:30678 to reach our pods.
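A sketch of the NodePort variant of our Service, matching the ports in the output below (the nodePort must fall in the default 30000-32767 range):

apiVersion: v1
kind: Service
metadata:
  name: train-service
  namespace: training
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - port: 8881      # cluster-internal Service port
      targetPort: 80  # container port
      nodePort: 30678 # opened on every node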

# kubectl get service -n training
NAME            TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
train-service   NodePort   10.96.229.238   <none>        8881:30678/TCP   55m

# curl 192.168.1.86:30678
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
...

The LoadBalancer type, as its name suggests, is designed for load balancing. Its structure is shown below.

LoadBalancer itself is not a Kubernetes component. If you use a cloud vendor's container service, the vendor usually provides its own load balancing service, such as SLB for Alibaba Cloud ACK or ELB for Huawei Cloud. A Service works at layer 4 (TCP and UDP), while Ingress, another K8s resource object, works at layer 7 (HTTP and HTTPS) and can route with finer granularity, by domain name and path.
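As an illustration of that finer granularity, a minimal Ingress sketch routing one host and path to the Service above (the host name is hypothetical, and an Ingress controller must be installed in the cluster):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: train-ingress
  namespace: training
spec:
  rules:
    - host: train.example.com   # layer-7 routing by domain name
      http:
        paths:
          - path: /             # ...and by path
            pathType: Prefix
            backend:
              service:
                name: train-service
                port:
                  number: 8881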