Cilium has been one of the hottest cloud native networking solutions of the past two years. It claims to be the first network plugin to implement all of kube-proxy's functionality with eBPF, so what makes it special? This article covers Cilium's evolution, its features, and concrete usage examples.

Background

With the growing adoption of cloud native technology, most major vendors have containerized at least part of their business on K8s, and the leading cloud vendors even more so.

As K8s has grown in popularity, clusters increasingly show two characteristics:

  1. The number of containers keeps growing; the official K8s scalability target is already 150,000 Pods in a single cluster
  2. Pod lifecycles keep getting shorter, down to minutes or even seconds in Serverless scenarios

As container density rises and Pod lifecycles shorten, the challenges for the native container network grow.

Current state of K8s Service load balancing

Before Cilium, Services were implemented by kube-proxy, which supports three modes: userspace, iptables, and IPVS.
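If you are not sure which mode an existing cluster uses, a quick check is sketched below. It assumes a kubeadm-style cluster where kube-proxy reads its configuration from the kube-proxy ConfigMap and its Pods carry the k8s-app=kube-proxy label:

# Which proxy mode is configured? An empty "mode:" falls back to iptables.
kubectl -n kube-system get configmap kube-proxy -o yaml | grep "mode:"
# kube-proxy also logs the chosen proxier ("Using iptables Proxier", etc.) at startup
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=200 | grep -i proxier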

Userspace

In this mode, kube-proxy acts as a reverse proxy: it listens on a random port, iptables rules redirect Service traffic to that proxy port, and kube-proxy then forwards it to a backend Pod. A Service request therefore travels from user space into kernel iptables and back out to user space, which is expensive and performs poorly.

Iptables

Existing problems:

  1. Poor scalability. Once the number of Services reaches the thousands, the performance of both the control plane and the data plane degrades dramatically. On the control plane, the iptables interface requires traversing and rewriting all rules every time a rule is added, so updates are O(n²). On the data plane, rules are organized as a linked list, so each lookup is O(n). A quick way to see the rule growth on a node is shown after this list.
  2. The LB scheduling algorithm only supports random selection of backends.
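As an illustration, the NAT rules kube-proxy programs for Services can be counted on a node; the KUBE-SVC chain naming is kube-proxy's own convention:

# Count kube-proxy's per-Service chains and the total NAT rules on this node
iptables-save -t nat | grep -c 'KUBE-SVC'
iptables-save -t nat | wc -l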

IPVS mode

IPVS was designed specifically for load balancing. It uses hash tables to manage Services, so adding, deleting, and looking up a Service is O(1). However, the IPVS kernel module has no SNAT capability, so it borrows SNAT from iptables.

After IPVS performs DNAT on a packet, the connection information is stored in nf_conntrack, and iptables performs SNAT based on that conntrack entry. Among the kube-proxy modes this is currently the best performing choice, but the complexity of nf_conntrack still introduces a significant performance penalty.
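For reference, on a node running kube-proxy in IPVS mode the IPVS state and the conntrack entries it relies on can be inspected directly (requires the ipvsadm and conntrack-tools packages):

ipvsadm -Ln            # list IPVS virtual services and their real-server backends
conntrack -L | head    # sample the nf_conntrack entries that SNAT relies on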

The development of Cilium

Cilium is an open source, eBPF-based networking implementation that provides network connectivity, service load balancing, security, and observability by dynamically injecting security, visibility, and network control logic into the Linux kernel. In simple terms, it can be understood as kube-proxy plus a CNI network implementation.

Cilium sits between the container orchestration system and the Linux kernel. Upwards, it configures networking and the corresponding security policies for containers via the orchestration platform; downwards, it controls container network forwarding and security policy enforcement by attaching eBPF programs in the Linux kernel.

A brief overview of Cilium’s development:

  1. In 2016, Thomas Graf founded Cilium; he is now the CTO of Isovalent, the commercial company behind Cilium
  2. Cilium was first released at DockerCon 2017
  3. Cilium 1.0 was released in 2018
  4. Cilium 1.6 was released in 2019, able to fully replace kube-proxy
  5. Google became deeply involved in Cilium development in 2019
  6. In 2021, Microsoft, Google, Facebook, Netflix, Isovalent, and others announced the eBPF Foundation (under the Linux Foundation)

Features

Looking at the official website, Cilium's features mainly cover three areas: networking, observability, and security:

  • Networking

    1. A highly scalable Kubernetes CNI plugin that supports large, highly dynamic K8s clusters. Multiple networking modes are supported:

      • Overlay mode: VXLAN and Geneve encapsulation
      • Underlay mode: packets are forwarded via the Linux host's routing table (direct routing)
    2. A kube-proxy replacement implementing Layer 4 load balancing. The LB is built on eBPF and stores its state in efficient, scalable hash tables. Cilium also optimizes north-south load balancing, supporting XDP and DSR (Direct Server Return: the LB only rewrites the destination MAC address when forwarding packets)

    3. Multi-cluster connectivity. Cilium Cluster Mesh supports load balancing, observability, and security management across multiple clusters

  • Observability

    1. Production-ready observability tooling that identifies connections by Pod and DNS identity
    2. Monitoring metrics at L3, L4, and L7, as well as metrics on NetworkPolicy behavior
    3. API-level observability (HTTP, HTTPS)
    4. Besides its own tooling, Hubble integrates with mainstream cloud native monitoring systems such as Prometheus and Grafana for extensible monitoring
  • Security

    1. Supports not only K8s NetworkPolicy, but also DNS-level, API-level, and cross-cluster network policies (a concrete example follows this list)
    2. Supports IP- and port-based security audit logs
    3. Transparent encryption of traffic
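To illustrate the policy capabilities above, here is a sketch of an L7-aware CiliumNetworkPolicy; the labels, names, and port are made up for the example. It only admits GET /healthz from Pods labeled app=frontend to Pods labeled app=backend, something a plain K8s NetworkPolicy cannot express:

kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-get-healthz   # hypothetical policy name
spec:
  endpointSelector:
    matchLabels:
      app: backend                  # hypothetical workload labels
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: GET
          path: /healthz
EOF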

In conclusion, Cilium offers not only the kube-proxy plus CNI network implementation, but also a rich set of observability and security features.

Installation and deployment

The Linux kernel must be 4.19 or higher.

You can install with Helm or with the cilium CLI; the cilium CLI is used here (Cilium version 1.10.3).
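If you prefer Helm, the equivalent install can be sketched roughly as follows. The value names (tunnel, kubeProxyReplacement, loadBalancer.*) follow the Cilium 1.10 Helm chart and may differ in other versions; <API_SERVER_IP> is a placeholder for your API server address:

helm repo add cilium https://helm.cilium.io/
# tunnel=vxlan|geneve selects the overlay mode; tunnel=disabled switches to direct routing.
# In direct-routing setups, DSR and XDP can additionally be enabled via
# --set loadBalancer.mode=dsr --set loadBalancer.acceleration=native (driver support needed).
helm install cilium cilium/cilium --version 1.10.3 --namespace kube-system \
  --set tunnel=vxlan \
  --set kubeProxyReplacement=strict \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=6443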

  • Download the cilium CLI
wget https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
  • Install Cilium

cilium install --kube-proxy-replacement=strict    # strict mode: Cilium fully replaces kube-proxy
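A quick check that the agent really took over Service handling in strict mode (this cilium status is the one inside the agent Pod, covered further below; ds/cilium resolves to one of the agent Pods):

kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement
# Expected to report something like: KubeProxyReplacement: Strict [...]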

  • Install the Hubble visualization components
cilium hubble enable --ui
  • After the pod is ready, check the following status:
~ # cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         OK
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 1, Ready: 1/1, Available: 1/1
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Deployment        hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Containers:       hubble-relay       Running: 1
                  cilium             Running: 1
                  cilium-operator    Running: 1
Image versions    cilium             quay.io/cilium/cilium:v1.10.3: 1
                  cilium-operator    quay.io/cilium/operator-generic:v1.10.3: 1
                  hubble-relay       quay.io/cilium/hubble-relay:v1.10.3: 1
  • The cilium CLI can also check cluster connectivity (optional), e.g. with cilium connectivity test

After Hubble is installed, change the hubble-ui Service type to NodePort, and you can then log in to the Hubble UI via NodeIP:NodePort to inspect the relevant information.
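For reference, changing the Service type and, alternatively, reading flows from the command line can be done as follows (assuming the default kube-system installation and a locally installed hubble CLI):

# Expose the Hubble UI via NodePort, then open NodeIP:NodePort in a browser
kubectl -n kube-system patch svc hubble-ui -p '{"spec":{"type":"NodePort"}}'
kubectl -n kube-system get svc hubble-ui
# Or inspect flows from the command line through hubble-relay
cilium hubble port-forward &
hubble observe --namespace default --last 20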

After Cilium is deployed, the following components are running: the Operator, Hubble (UI and relay), and the Cilium Agent (a DaemonSet, one Pod per node). The key component is the Cilium Agent.

As the core component of the whole architecture, the Cilium Agent runs as a privileged container on every host in the cluster via a DaemonSet. As a user-space daemon, the Cilium Agent interacts with the container runtime and the container orchestration system through plugins to configure networking and security for the local containers. It also exposes an open API for other components to call.

The Cilium Agent implements the network and security configuration with eBPF: it combines container identities with the associated policies to generate eBPF programs, compiles them into bytecode, and loads them into the Linux kernel.
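The state the agent has programmed into eBPF can be inspected from inside the agent Pod; for example, the eBPF load-balancing maps and the managed endpoints (ds/cilium resolves to one of the agent Pods):

kubectl -n kube-system exec ds/cilium -- cilium bpf lb list     # dump the eBPF service/LB maps
kubectl -n kube-system exec ds/cilium -- cilium endpoint list   # endpoints and their policy state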

Related commands

The Cilium Agent has some built-in debugging commands, described below. Note that the cilium command inside the agent is different from the cilium CLI introduced above, even though both are called cilium.

  • cilium status

It shows basic configuration information and the status of Cilium, for example:

[root@~]# kubectl exec -n kube-system cilium-s62h5 -- cilium status
Defaulted container cilium-agent out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
KVStore:                Ok   Disabled
Kubernetes:             Ok   1.21 (v1.21.2) [linux/amd64]
Kubernetes APIs:        ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:   Strict   [eth0 10.251.247.131 (Direct Routing)]
Cilium:                 Ok   1.10.3 (v1.10.3-4145278)
NodeMonitor:            Listening for events on 8 CPUs with 64x4096 of shared memory
Cilium health daemon:   Ok
IPAM:                   IPv4: 68/254 allocated from 10.0.0.0/24,
BandwidthManager:       Disabled
Host Routing:           Legacy
Masquerading:           BPF   [eth0]   10.0.0.0/24 [IPv4: Enabled, IPv6: Disabled]
Controller Status:      346/346 Healthy
Proxy Status:
Hubble:                 OK   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 257.25   Metrics: Disabled
Encryption:             Disabled
Cluster health:         1/1 reachable   (2021-08-11T09:33:3Z)
  • cilium service list

This command shows how Services are implemented by Cilium. You can filter by ClusterIP; Frontend is the ClusterIP and Backend is the Pod IP.

[root@~]# kubectl exec -it -n kube-system cilium-vsk8j -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
ID   Frontend              Service Type   Backend
1    10.111.192.31:80      ClusterIP      1 => 10.0.0.22:8888
2    10.101.111.124:8080   ClusterIP      1 => 10.0.0.81:8080
3    10.101.229.121:443    ClusterIP      1 => 10.0.0.24:8443
4    10.111.165.162:8080   ClusterIP      1 => 10.0.0.213:8080
5    10.96.43.22:4222      ClusterIP      1 => 10.0.0.210:4222
6    10.100.45.25:9180     ClusterIP      1 => 10.0.0.48:9180
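To map a specific K8s Service onto its eBPF entry, filter the listing by the Service's ClusterIP (10.111.192.31 below is simply the first entry shown above; cilium-vsk8j is the agent Pod from the same example):

kubectl -n kube-system exec cilium-vsk8j -- cilium service list | grep 10.111.192.31
kubectl get svc -A | grep 10.111.192.31    # cross-check against the K8s Service object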
  • cilium service get

Use cilium service get <ID> -o json to view the details of a single service:

[root@~]# kubectl exec -it -n kube-system cilium-vsk8j -- cilium service get 132 -o json
Defaulted container "cilium-agent" out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
{
  "spec": {
    "backend-addresses": [
      {
        "ip": "10.0.0.213",
        "nodeName": "n251-247-131",
        "port": 8080
      }
    ],
    "flags": {
      "name": "autoscaler",
      "namespace": "knative-serving",
      "trafficPolicy": "Cluster",
      "type": "ClusterIP"
    },
    "frontend-address": {
      "ip": "10.98.24.168",
      "port": 8080,
      "scope": "external"
    },
    "id": 132
  },
  "status": {
    "realized": {
      "backend-addresses": [
        {
          "ip": "10.0.0.213",
          "nodeName": "n251-247-131",
          "port": 8080
        }
      ],
      "flags": {
        "name": "autoscaler",
        "namespace": "knative-serving",
        "trafficPolicy": "Cluster",
        "type": "ClusterIP"
      },
      "frontend-address": {
        "ip": "10.98.24.168",
        "port": 8080,
        "scope": "external"
      },
      "id": 132
    }
  }
}

There are many other useful commands that space does not permit showing here; interested readers can explore them with cilium --help.