This is the third article in the illustrated Kubernetes networking series. If you haven't already read the first and second articles, I encourage you to go back to them.

Cluster dynamics

Due to the dynamic, distributed nature of a Kubernetes cluster, Pods (and especially Pod IP addresses) are constantly changing. A rolling upgrade, a scaling event, or a Pod or node crash can make a Pod's IP address unreachable at any time.

A Kubernetes Service acts as a virtual load balancer in front of a set of Pods: it exposes a single virtual IP address that maps to the endpoint IP addresses of the Pods selected by a label selector. The Service IP address does not change even when the underlying Pod IP addresses do.
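To make this concrete, here is a minimal sketch of such a Service written with the Kubernetes Go API types and printed as a manifest. The name backend, the app=backend label and the port numbers are assumptions made up for this example.

```go
// A minimal sketch of a Service that fronts Pods carrying the label
// app=backend. Names, labels and ports are illustrative assumptions.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"sigs.k8s.io/yaml"
)

func main() {
	svc := corev1.Service{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Service"},
		ObjectMeta: metav1.ObjectMeta{Name: "backend", Namespace: "default"},
		Spec: corev1.ServiceSpec{
			// The label selector decides which Pod IPs become the endpoints.
			Selector: map[string]string{"app": "backend"},
			Ports: []corev1.ServicePort{{
				Port:       80,                   // the stable Service port
				TargetPort: intstr.FromInt(8080), // the port the Pods actually listen on
			}},
		},
	}

	// Print the manifest; the cluster assigns the virtual ClusterIP on creation.
	out, _ := yaml.Marshal(svc)
	fmt.Print(string(out))
}
```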

The virtual IP address is actually a set of iptables rules managed by kube-proxy, a Kubernetes component (recent versions of Kubernetes can use IPVS for the same purpose). In early versions, kube-proxy ran as a userspace network proxy, which caused a lot of switching back and forth between user space and kernel space and hurt performance. Now kube-proxy is an ordinary Kubernetes controller: it watches the API server for endpoint changes and updates the corresponding iptables rules.

When a packet's destination address is a Service IP, an iptables rule rewrites (DNATs) the destination from the Service IP to the endpoint IP of one of the backing Pods. It is these DNAT rules that spread requests for a Service IP across the back-end Pods.

After DNAT is applied to a request, the translation (protocol, source IP and port, destination IP and port) is recorded in the conntrack table, which tracks connection state. When the reply comes back, the kernel uses the stored conntrack entry to rewrite the reply's source address from the Pod IP back to the Service IP. The client therefore never sees which Pod actually handled the request, even after it completes successfully.
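The following toy sketch (plain Go, not real iptables or netfilter code) models the idea with made-up IP addresses: the destination of a client-to-Service packet is rewritten to a randomly picked Pod endpoint, the translation is remembered, and the reply's source is rewritten back to the Service IP.

```go
// A toy model of DNAT plus conntrack: packets aimed at the Service IP are
// rewritten to a randomly chosen Pod endpoint, the translation is remembered,
// and replies are rewritten back so the client only ever sees the Service IP.
package main

import (
	"fmt"
	"math/rand"
)

// fiveTuple is the key conntrack uses to remember a translation.
type fiveTuple struct {
	proto            string
	srcIP, dstIP     string
	srcPort, dstPort int
}

var (
	serviceIP = "10.96.0.10"                     // assumed ClusterIP
	endpoints = []string{"10.1.1.4", "10.1.2.7"} // assumed Pod IPs
	conntrack = map[fiveTuple]string{}           // original tuple -> chosen Pod IP
)

// dnat rewrites the destination of a client->Service packet to one Pod.
func dnat(pkt fiveTuple) fiveTuple {
	pod := endpoints[rand.Intn(len(endpoints))]
	conntrack[pkt] = pod
	pkt.dstIP = pod
	return pkt
}

// unDNAT rewrites the source of the Pod's reply back to the Service IP.
func unDNAT(orig, reply fiveTuple) fiveTuple {
	if _, ok := conntrack[orig]; ok {
		reply.srcIP = serviceIP
	}
	return reply
}

func main() {
	req := fiveTuple{"tcp", "10.1.3.2", serviceIP, 40001, 80}
	fwd := dnat(req)
	fmt.Println("forwarded to pod:", fwd.dstIP)

	reply := fiveTuple{"tcp", fwd.dstIP, req.srcIP, 80, req.srcPort}
	fmt.Println("reply seen by client from:", unDNAT(req, reply).srcIP)
}
```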

Thanks to this DNAT behaviour of Kubernetes Services, applications in different Pods can listen on the same port without conflicting, and service discovery becomes easy: we can simply use the Service's DNS name in our applications, or read the SERVICE_HOST and SERVICE_PORT environment variables that Kubernetes injects into Pods.

Note: the second approach saves a lot of unnecessary DNS lookups (but the environment variables are only set for Services that existed before the Pod started).
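A minimal sketch of both discovery paths, assuming a Service named backend in the default namespace: the cluster DNS resolves the Service name to its ClusterIP, while BACKEND_SERVICE_HOST and BACKEND_SERVICE_PORT are the variables Kubernetes injects for that Service.

```go
// Two ways for code running in a Pod to find the (assumed) "backend" Service.
package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// Option 1: DNS. The cluster DNS resolves the name to the ClusterIP.
	if addrs, err := net.LookupHost("backend.default.svc.cluster.local"); err == nil {
		fmt.Println("via DNS:", addrs)
	}

	// Option 2: environment variables, which skip the DNS lookup entirely.
	host := os.Getenv("BACKEND_SERVICE_HOST")
	port := os.Getenv("BACKEND_SERVICE_PORT")
	if host != "" {
		fmt.Println("via env:", net.JoinHostPort(host, port))
	}
}
```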

Outbound traffic

We've talked a lot about Kubernetes Services inside a cluster, but real production workloads also need to reach external APIs and websites.

Generally, nodes have both internal and external IP addresses. For internet access there is a 1:1 NAT mapping between a node's internal and external IP address, especially in cloud environments.

When a packet leaves a node for the external network, its source IP address is translated from the node's internal IP to the node's external IP. However, when a request originates from a Pod, the cloud provider's NAT knows nothing about Pod IP addresses, and it will drop any packet whose source IP is not the node's IP.

As you've probably guessed, iptables solves this problem too. Rules added by kube-proxy perform SNAT (source network address translation, also known as masquerading) on the packets: the kernel rewrites the source IP of outgoing packets from the Pod IP to the IP address of the node's outgoing network interface. The conntrack table records these translations, and when a reply arrives from the internet, the kernel uses the stored entry to reverse the translation and deliver the reply to the right Pod.
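Here is a toy model of that masquerade step, again with made-up addresses: the Pod's source IP is swapped for the node's IP on the way out, and the remembered mapping is used to hand the reply back to the right Pod.

```go
// A toy model (not real netfilter code) of SNAT/masquerade for egress traffic.
package main

import "fmt"

type packet struct {
	srcIP, dstIP string
	srcPort      int
}

var (
	nodeIP   = "203.0.113.5"    // assumed node (external) address
	egressCT = map[int]string{} // srcPort -> original Pod IP
)

// snat swaps the Pod source address for the node address before the packet leaves.
func snat(p packet) packet {
	egressCT[p.srcPort] = p.srcIP // remember who really sent it
	p.srcIP = nodeIP              // the internet only ever sees the node IP
	return p
}

// unSNAT finds the Pod a reply belongs to, reversing the translation.
func unSNAT(replyDstPort int) string {
	return egressCT[replyDstPort]
}

func main() {
	out := snat(packet{srcIP: "10.1.1.4", dstIP: "93.184.216.34", srcPort: 40001})
	fmt.Println("source on the wire:", out.srcIP)
	fmt.Println("reply delivered to pod:", unSNAT(40001))
}
```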

Inbound traffic

We have discussed how Pods communicate with each other and with the outside world. Next, let's look at how a Kubernetes cluster handles requests from external users. Broadly, there are the following approaches:

NodePort and cloud load balancer (L4: IP and port)

For example, set the Service type to NodePort, and Kubernetes assigns the Service a port from the NodePort range (30000-32767 by default). Every node then listens on that port, even if no Pod backing the Service is running on that node. Traffic arriving at the NodePort is forwarded to one of the Pods, again via iptables rules.

In an environment provided by a cloud service provider, set the Service type to LoadBalancer and Kubernetes will ask the provider for a corresponding load balancer (for example an ELB on AWS) in front of the nodes.
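A minimal sketch of such an externally exposed Service, using the Kubernetes Go API types; switching the type from NodePort to LoadBalancer is the only change needed to request a cloud load balancer. The name and port numbers are assumptions.

```go
// A minimal sketch of the backend Service exposed externally via a NodePort.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"sigs.k8s.io/yaml"
)

func main() {
	svc := corev1.Service{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Service"},
		ObjectMeta: metav1.ObjectMeta{Name: "backend-nodeport"},
		Spec: corev1.ServiceSpec{
			Type:     corev1.ServiceTypeNodePort, // or corev1.ServiceTypeLoadBalancer in the cloud
			Selector: map[string]string{"app": "backend"},
			Ports: []corev1.ServicePort{{
				Port:       80,
				TargetPort: intstr.FromInt(8080),
				NodePort:   30080, // reachable as <any-node-IP>:30080
			}},
		},
	}
	out, _ := yaml.Marshal(svc)
	fmt.Print(string(out))
}
```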

Ingress (L7: HTTP/TCP)

There are many open-source tools that implement Ingress, such as nginx, Traefik, HAProxy and more. Their job is to map HTTP host names and URL paths to back-end Services. An Ingress acts as a single entry point sitting in front of the load balancer and NodePort Services: incoming traffic is routed to the appropriate back-end Service according to the Ingress rules, instead of configuring a separate load balancer or NodePort Service for every application.
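Below is a minimal sketch of one such Ingress rule, mapping an assumed hostname and path to the backend Service from the earlier examples; it only takes effect if an Ingress controller (nginx, Traefik, HAProxy, ...) is running in the cluster.

```go
// A minimal sketch of an Ingress rule for an assumed host and backend Service.
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	pathType := networkingv1.PathTypePrefix
	ing := networkingv1.Ingress{
		TypeMeta:   metav1.TypeMeta{APIVersion: "networking.k8s.io/v1", Kind: "Ingress"},
		ObjectMeta: metav1.ObjectMeta{Name: "backend-ingress"},
		Spec: networkingv1.IngressSpec{
			Rules: []networkingv1.IngressRule{{
				Host: "api.example.com", // assumed hostname
				IngressRuleValue: networkingv1.IngressRuleValue{
					HTTP: &networkingv1.HTTPIngressRuleValue{
						Paths: []networkingv1.HTTPIngressPath{{
							Path:     "/",
							PathType: &pathType,
							// Requests matching host+path are sent to the backend Service.
							Backend: networkingv1.IngressBackend{
								Service: &networkingv1.IngressServiceBackend{
									Name: "backend",
									Port: networkingv1.ServiceBackendPort{Number: 80},
								},
							},
						}},
					},
				},
			}},
		},
	}
	out, _ := yaml.Marshal(ing)
	fmt.Print(string(out))
}
```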

Network Policy

Think of network policies as security groups or access control lists for Pods. NetworkPolicy rules allow or deny traffic to and from Pods. How they are enforced depends on the overlay network or CNI plug-in, but iptables rules are typically used.
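As a sketch, the following NetworkPolicy (with labels carried over from the earlier assumed examples) only allows Pods labelled app=frontend to reach the backend Pods and drops all other ingress to them.

```go
// A minimal sketch of a NetworkPolicy restricting ingress to the backend Pods.
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	np := networkingv1.NetworkPolicy{
		TypeMeta:   metav1.TypeMeta{APIVersion: "networking.k8s.io/v1", Kind: "NetworkPolicy"},
		ObjectMeta: metav1.ObjectMeta{Name: "backend-allow-frontend"},
		Spec: networkingv1.NetworkPolicySpec{
			// The policy applies to the backend Pods...
			PodSelector: metav1.LabelSelector{MatchLabels: map[string]string{"app": "backend"}},
			// ...and only traffic from frontend Pods is allowed in.
			Ingress: []networkingv1.NetworkPolicyIngressRule{{
				From: []networkingv1.NetworkPolicyPeer{{
					PodSelector: &metav1.LabelSelector{MatchLabels: map[string]string{"app": "frontend"}},
				}},
			}},
			PolicyTypes: []networkingv1.PolicyType{networkingv1.PolicyTypeIngress},
		},
	}
	out, _ := yaml.Marshal(np)
	fmt.Print(string(out))
}
```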

This is the end of the series. In the first two parts we studied the fundamentals of Kubernetes networking and how overlay networks work. Now we have seen how Services abstract a dynamic cluster and make service discovery possible, covered how outbound and inbound traffic flow, and briefly introduced how Network Policy helps harden the cluster.

Translation (slightly abridged) of the article “An illustrated guide to Kubernetes Networking [Part 3]”.