Overview

The Calico CNI plugin lives in the projectcalico/cni-plugin repository and compiles into two binaries: calico and calico-ipam. The calico binary creates the routes, virtual interfaces, veth pair and other network resources for the sandbox container, and writes the related data to the Calico datastore. The calico-ipam binary allocates an IP address for the current pod from the pod network segment of the current node. If the current node does not yet have a pod network segment, calico-ipam first allocates a pod CIDR for the node from the cluster CIDR and writes the related data to the Calico datastore. The cluster CIDR is user-defined and written to the Calico datastore in advance. The size of the blocks carved out of the cluster CIDR can be customized (newer versions of the calico/node container support this; older versions of Calico do not); see change-block-size.

Here we focus on how the calico binary works; how the calico-ipam binary assigns IP addresses is left for later.

Calico Plugin source code analysis

The Calico plugin follows the standard CNI interface and implements the ADD and DEL commands. Here we focus on how the ADD command is implemented. Calico first registers the ADD and DEL commands, at L614-L677:


func Main(version string) {
	// ...
	err := flagSet.Parse(os.Args[1:])
	// ...
	// Register the 'ADD' and 'DEL' commands
	skel.PluginMain(cmdAdd, nil, cmdDel,
		cniSpecVersion.PluginSupports("0.1.0", "0.2.0", "0.3.0", "0.3.1"),
		"Calico CNI plugin "+version)
}
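
To illustrate what registering the commands amounts to, here is a minimal standard-library sketch of a command dispatcher. It is not the skel package. Under the CNI specification, the runtime names the operation in the CNI_COMMAND environment variable and sends the network configuration on stdin; the `handler` type and `dispatch` function below are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
)

// handler receives the raw network configuration that the runtime sends on stdin.
type handler func(stdin []byte) error

// dispatch picks a handler based on the CNI_COMMAND value. It mimics, in a very
// reduced form, what a CNI skeleton does before it calls cmdAdd or cmdDel.
func dispatch(command string, stdin []byte, add, del handler) error {
	switch command {
	case "ADD":
		return add(stdin)
	case "DEL":
		return del(stdin)
	default:
		return fmt.Errorf("unsupported CNI_COMMAND %q", command)
	}
}

func main() {
	add := func(conf []byte) error {
		fmt.Printf("ADD called with %d bytes of configuration\n", len(conf))
		return nil
	}
	del := func(conf []byte) error {
		fmt.Printf("DEL called with %d bytes of configuration\n", len(conf))
		return nil
	}

	// A real plugin would read os.Getenv("CNI_COMMAND") and os.Stdin here;
	// literal values keep the sketch self-contained.
	if err := dispatch("ADD", []byte(`{"cniVersion": "0.3.1"}`), add, del); err != nil {
		fmt.Println("error:", err)
	}
}
```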


The ADD command does three main things:

  • Look up, in the Calico datastore, a WorkloadEndpoint object whose name matches the current pod. If none matches, a new WorkloadEndpoint object is created. This object stores the pod's IP address, the name of its interface in the host network namespace, and the name of its interface in the container network namespace. An example is shown below.
  • Create a veth pair, with one end in the host network namespace and the other in the container network namespace. The container-side interface is named, for example, eth0 and is assigned the IP address obtained by calling calico-ipam. The host-side interface is named "cali" + sha1(namespace.pod)[:11], and its MAC address is set to "ee:ee:ee:ee:ee:ee".
  • Create routes on the container side and the host side. On the container side, the default gateway is set to 169.254.1.1; this gateway address is hard-coded. On the host side, a route such as 10.217.120.85 dev calid0bda9976d5 scope link is added, where 10.217.120.85 is the pod IP address and calid0bda9976d5 is the host-side interface of the pod's veth pair.
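
The host-side interface name from the second step can be sketched as follows. This is a minimal illustration of the "cali" + sha1(namespace.pod)[:11] format, assuming the SHA-1 digest is hex-encoded before it is truncated; `hostVethName` is a hypothetical helper, not the function Calico uses.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// hostVethName derives the host-side interface name from the pod identity:
// "cali" followed by the first 11 hex characters of sha1("<namespace>.<pod>").
// Hex encoding of the digest is an assumption made for this sketch.
func hostVethName(namespace, pod string) string {
	sum := sha1.Sum([]byte(namespace + "." + pod))
	return "cali" + hex.EncodeToString(sum[:])[:11]
}

func main() {
	name := hostVethName("default", "nginx-demo-1-7f67f8bdd8-d5wsc")
	fmt.Println(name, len(name)) // the length is always 4 + 11 = 15
}
```

The fixed length of 15 characters matters because Linux interface names are limited to 15 characters, so hashing keeps the name valid regardless of how long the namespace and pod names are.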

Here is an example WorkloadEndpoint object. Each Kubernetes pod object corresponds to one WorkloadEndpoint object in Calico. You can run calicoctl get wep -o wide to view all WorkloadEndpoints. Remember to set the Calico datastore type to kubernetes, for example in ~/.zshrc:

# calico
export CALICO_DATASTORE_TYPE=kubernetes
export CALICO_KUBECONFIG=~/.kube/config

apiVersion: projectcalico.org/v3
kind: WorkloadEndpoint
metadata:
  creationTimestamp: "2021-01-09T08:38:56Z"
  generateName: nginx-demo-1-7f67f8bdd8-
  labels:
    app: nginx-demo-1
    pod-template-hash: 7f67f8bdd8
    projectcalico.org/namespace: default
    projectcalico.org/orchestrator: k8s
    projectcalico.org/serviceaccount: default
  name: minikube-k8s-nginx--demo--1--7f67f8bdd8--d5wsc-eth0
  namespace: default
  resourceVersion: "557760"
  uid: 85d1d33f-f55f-4f28-a89d-0a55394311db
spec:
  endpoint: eth0
  interfaceName: calife8e5922caa
  ipNetworks:
  - 10.217.120.84/32
  node: minikube
  orchestrator: k8s
  pod: nginx-demo-1-7f67f8bdd8-d5wsc
  profiles:
  - kns.default
  - ksa.default.default
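
The metadata.name of the object above follows a visible pattern: node, orchestrator, pod and endpoint joined with "-", with every "-" inside a segment doubled. The following is a minimal sketch of that pattern; `wepName` is a hypothetical helper, and escaping every segment (rather than only the pod name) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// wepName joins node, orchestrator, pod and endpoint with "-" after doubling
// every "-" inside each segment, so the separators stay unambiguous.
func wepName(node, orchestrator, pod, endpoint string) string {
	segments := []string{node, orchestrator, pod, endpoint}
	for i, s := range segments {
		segments[i] = strings.ReplaceAll(s, "-", "--")
	}
	return strings.Join(segments, "-")
}

func main() {
	fmt.Println(wepName("minikube", "k8s", "nginx-demo-1-7f67f8bdd8-d5wsc", "eth0"))
	// prints minikube-k8s-nginx--demo--1--7f67f8bdd8--d5wsc-eth0
}
```

Doubling the dashes is what allows a single name to be split back into its four parts, even though pod names themselves contain dashes.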


With these three steps in mind, take a look at the cmdAdd function:


func cmdAdd(args *skel.CmdArgs) (err error) {
    // ...
	// Load configuration data from args.StdinData
	// The contents of the file under '--cni-conf-dir' are passed in here, i.e. the CNI configuration parameters described in the first article;
	// the types.NetConf data structure corresponds to the data in the CNI configuration file
	conf := types.NetConf{}
	if err := json.Unmarshal(args.StdinData, &conf); err != nil {
		return fmt.Errorf("failed to load netconf: %v", err)
	}
	// The CNI configuration can direct the calico plugin logs to a file on the host:
	// "log_level": "debug", "log_file_path": "/var/log/calico/cni/cni.log",
	utils.ConfigureLogging(conf)
	
	// ...
	
	// The MTU (Maximum Transmission Unit) can be set in the CNI configuration file; if it is not set there, the value read from /var/lib/calico/mtu is used
	if mtu, err := utils.MTUFromFile("/var/lib/calico/mtu"); err != nil {
		return fmt.Errorf("failed to read MTU file: %s", err)
	} else if conf.MTU == 0 && mtu != 0 {
		conf.MTU = mtu
	}

	// Construct a WEPIdentifiers object and fill in its fields
	nodename := utils.DetermineNodename(conf)
	wepIDs, err := utils.GetIdentifiers(args, nodename)
	calicoClient, err := utils.CreateClient(conf)
	
	// Check whether the datastore is ready; the same data is shown by 'calicoctl get ClusterInformation default -o yaml'
	ci, err := calicoClient.ClusterInformation().Get(ctx, "default", options.GetOptions{})
	if !*ci.Spec.DatastoreReady {
		return
	}

	// List the WorkloadEndpoints whose names start with wepPrefix. One pod corresponds to one WorkloadEndpoint. If the datastore already holds a matching WorkloadEndpoint, it is used;
	// otherwise, a WorkloadEndpoint is written to the Calico datastore after the pod's network resources have been created
	wepPrefix, err := wepIDs.CalculateWorkloadEndpointName(true)
	endpoints, err := calicoClient.WorkloadEndpoints().List(ctx, options.ListOptions{Name: wepPrefix, Namespace: wepIDs.Namespace, Prefix: true})
	if err != nil {
		return
	}

	// For a newly created pod, a corresponding new WorkloadEndpoint object is eventually written to the Calico datastore
	var endpoint *api.WorkloadEndpoint

	// Since this is a new pod, there is no corresponding WorkloadEndpoint object in the datastore yet, so endpoints.Items is empty
	if len(endpoints.Items) > 0 {
		// ...
	}

	// Since the endpoint is nil, fill in the default WEPIdentifiers, where args.IfName is the container-side interface name passed by kubelet, usually eth0
	// The WEPName format is {node_name}-k8s-{strings.Replace(pod_name, "-", "--")}-{wepIDs.Endpoint}, as in the
	// minikube-k8s-nginx--demo--1--7f67f8bdd8--d5wsc-eth0 WorkloadEndpoint object shown above
	if endpoint == nil {
		wepIDs.Endpoint = args.IfName
		wepIDs.WEPName, err = wepIDs.CalculateWorkloadEndpointName(false)
	}

	// The orchestrator is k8s
	if wepIDs.Orchestrator == api.OrchestratorKubernetes {
		// k8s.CmdAddK8s does the three tasks described above
		if result, err = k8s.CmdAddK8s(ctx, args, conf, *wepIDs, calicoClient, endpoint); err != nil {
			return
		}
	} else {
		// ...
	}

	// Policy-type is k8s
	if conf.Policy.PolicyType == "" {
		// ...
	}

	// Print result to stdout, in the format defined by the requested cniVersion.
	err = cnitypes.PrintResult(result, conf.CNIVersion)
	return
}
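
The first two steps of cmdAdd, loading the configuration from stdin and falling back to the MTU file, can be tried out in isolation. The sketch below uses a hypothetical, heavily reduced `netConf` struct instead of types.NetConf. The `log_level` and `log_file_path` keys are the ones quoted in the comments above; the `mtu` key and the value 1440 are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// netConf is a hypothetical, heavily reduced stand-in for types.NetConf.
type netConf struct {
	CNIVersion  string `json:"cniVersion"`
	MTU         int    `json:"mtu"` // assumed key name
	LogLevel    string `json:"log_level"`
	LogFilePath string `json:"log_file_path"`
}

// loadConf parses the configuration bytes and applies the MTU fallback:
// the file value is used only when the configuration leaves MTU unset (0).
func loadConf(stdin []byte, fileMTU int) (netConf, error) {
	var conf netConf
	if err := json.Unmarshal(stdin, &conf); err != nil {
		return conf, fmt.Errorf("failed to load netconf: %v", err)
	}
	if conf.MTU == 0 && fileMTU != 0 {
		conf.MTU = fileMTU
	}
	return conf, nil
}

func main() {
	stdin := []byte(`{"cniVersion": "0.3.1", "log_level": "debug", "log_file_path": "/var/log/calico/cni/cni.log"}`)
	conf, err := loadConf(stdin, 1440) // 1440 is an arbitrary example value
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", conf)
}
```

An MTU given in the CNI configuration always wins; the file value only fills the gap when the configuration says nothing.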


The basic structure of cmdAdd() follows the standard CNI function structure, and it finally prints the result to stdout. Next, look at the main logic in k8s.CmdAddK8s():

// CmdAddK8s does three things:
// 1. Write a WorkloadEndpoint object corresponding to the pod to the Calico datastore
// 2. Create a veth pair: assign the IP address to the container-side end and a MAC address to the host-side end
// 3. Create routes: a default gateway route on the container side, and a route for the pod IP on the host side
func CmdAddK8s(ctx context.Context, args *skel.CmdArgs, conf types.NetConf, epIDs utils.WEPIdentifiers, calicoClient calicoclient.Interface, endpoint *api.WorkloadEndpoint) (*current.Result, error) {
	// ...
	// The dataplane is chosen according to the operating system; here it is a linuxDataplane object
	d, err := dataplane.GetDataplane(conf, logger)
	// Create the k8s client
	client, err := NewK8sClient(conf, logger)
	
	// Our ipam.type is calico-ipam, so the following branch is skipped
	if conf.IPAM.Type == "host-local" {
		// ...
	}

	// ...
	
	// This checks the cni.projectcalico.org/ipv4pools annotation on the pod and on its namespace
	// We have not set these annotations, so the pool selection below has no effect
	if conf.Policy.PolicyType == "k8s" {
		annotNS, err := getK8sNSInfo(client, epIDs.Namespace)
		labels, annot, ports, profiles, generateName, err = getK8sPodInfo(client, epIDs.Pod, epIDs.Namespace)
		// ...
		if conf.IPAM.Type == "calico-ipam" {
			var v4pools, v6pools string
			// Use the namespace annotation for IP pools as the default
			v4pools = annotNS["cni.projectcalico.org/ipv4pools"]
			v6pools = annotNS["cni.projectcalico.org/ipv6pools"]
			// Get the pod annotation for IP pools; if it exists, it overrides the namespace annotation
			v4poolpod := annot["cni.projectcalico.org/ipv4pools"]
			if len(v4poolpod) != 0 {
				v4pools = v4poolpod
			}
			// ...
		}
	}

	ipAddrsNoIpam := annot["cni.projectcalico.org/ipAddrsNoIpam"]
	ipAddrs := annot["cni.projectcalico.org/ipAddrs"]
	
	switch {
	// Call calico-ipam to assign an IP address
	case ipAddrs == "" && ipAddrsNoIpam == "":
		// Our pod sets neither the "cni.projectcalico.org/ipAddrsNoIpam" nor the "cni.projectcalico.org/ipAddrs" annotation,
		// so calico-ipam is called to get the pod IP
		// How calico-ipam assigns the pod IP is left for later
		result, err = utils.AddIPAM(conf, args, logger)
		// ...
	case ipAddrs != "" && ipAddrsNoIpam != "":
		// Can't have both ipAddrs and ipAddrsNoIpam annotations at the same time.
		e := fmt.Errorf("can't have both annotations: 'ipAddrs' and 'ipAddrsNoIpam' in use at the same time")
		logger.Error(e)
		return nil, e
	case ipAddrsNoIpam != "":
		// ...
	case ipAddrs != "":
		// ...
	}
	
	// Start creating the WorkloadEndpoint object and assign parameters
	endpoint.Name = epIDs.WEPName
	endpoint.Namespace = epIDs.Namespace
	endpoint.Labels = labels
	endpoint.GenerateName = generateName
	endpoint.Spec.Endpoint = epIDs.Endpoint
	endpoint.Spec.Node = epIDs.Node
	endpoint.Spec.Orchestrator = epIDs.Orchestrator
	endpoint.Spec.Pod = epIDs.Pod
	endpoint.Spec.Ports = ports
	endpoint.Spec.IPNetworks = []string{}
	if conf.Policy.PolicyType == "k8s" {
		endpoint.Spec.Profiles = profiles
	} else {
		endpoint.Spec.Profiles = []string{conf.Name}
	}

	// Write the IP address allocated by calico-ipam into endpoint.Spec.IPNetworks
	if err = utils.PopulateEndpointNets(endpoint, result); err != nil {
		// ...
	}

	// desiredVethName is the name of the host-side interface, in the format "cali" + sha1(namespace.pod)[:11]
	desiredVethName := k8sconversion.NewConverter().VethNameForWorkload(epIDs.Namespace, epIDs.Pod)
	
	// DoNetworking() is the important call: it creates the veth pair and the routes
	// Here it is the DoNetworking() method of the linuxDataplane object that is called
	hostVethName, contVethMac, err := d.DoNetworking(
		ctx, calicoClient, args, result, desiredVethName, routes, endpoint, annot)
    
	// ...
	mac, err := net.ParseMAC(contVethMac)
	endpoint.Spec.MAC = mac.String()
	endpoint.Spec.InterfaceName = hostVethName
	endpoint.Spec.ContainerID = epIDs.ContainerID

	// ...

	// Create or update the WorkloadEndpoint object. At this point, a WorkloadEndpoint object corresponding to the newly created pod is written to the Calico datastore
	if _, err := utils.CreateOrUpdate(ctx, calicoClient, endpoint); err != nil {
		// ...
	}

	// Add the interface created above to the CNI result.
	result.Interfaces = append(result.Interfaces, &current.Interface{
		Name: endpoint.Spec.InterfaceName},
	)

	return result, nil
}
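
The annotation handling in CmdAddK8s boils down to two small decisions: a pod annotation overrides the namespace annotation, and the two IP address annotations are mutually exclusive. The sketch below isolates just that logic under simplifying assumptions. The function names and the returned labels are hypothetical, and the annotation values are treated as opaque strings.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	annotV4Pools = "cni.projectcalico.org/ipv4pools"
	annotIPAddrs = "cni.projectcalico.org/ipAddrs"
	annotNoIPAM  = "cni.projectcalico.org/ipAddrsNoIpam"
)

// effectiveAnnotation returns the pod's value for key when it is set,
// and falls back to the namespace's value otherwise.
func effectiveAnnotation(nsAnnot, podAnnot map[string]string, key string) string {
	if v := podAnnot[key]; v != "" {
		return v
	}
	return nsAnnot[key]
}

// ipSource reports which branch of the switch in CmdAddK8s would run.
// The returned labels are illustrative only.
func ipSource(podAnnot map[string]string) (string, error) {
	ipAddrs, noIPAM := podAnnot[annotIPAddrs], podAnnot[annotNoIPAM]
	switch {
	case ipAddrs == "" && noIPAM == "":
		return "calico-ipam", nil
	case ipAddrs != "" && noIPAM != "":
		return "", errors.New("can't have both annotations: 'ipAddrs' and 'ipAddrsNoIpam' in use at the same time")
	case noIPAM != "":
		return "ipAddrsNoIpam annotation", nil
	default:
		return "ipAddrs annotation", nil
	}
}

func main() {
	ns := map[string]string{annotV4Pools: "namespace-pools"} // placeholder value
	pod := map[string]string{}                               // no annotations, as in the article

	fmt.Printf("pools: %q\n", effectiveAnnotation(ns, pod, annotV4Pools))
	src, err := ipSource(pod)
	fmt.Println(src, err)
}
```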


This code ends up creating a WorkloadEndpoint object. The key call is DoNetworking(), which creates the veth pair and the routes. Next, look at the DoNetworking() function of the linuxDataplane object. It mainly uses the github.com/vishvananda/netlink Go package to add, delete and configure interfaces and routes, which is equivalent to running commands such as ip link add/delete/set xxx. This package is very useful and is used by many major projects, such as Kubernetes. When learning Linux networking, you can use it to write small demos, which makes learning more efficient. Here is how Calico creates the veth pair and the routes using netlink:


func (d *linuxDataplane) DoNetworking(
	ctx context.Context,
	calicoClient calicoclient.Interface,
	args *skel.CmdArgs,
	result *current.Result,
	desiredVethName string,
	routes []*net.IPNet,
	endpoint *api.WorkloadEndpoint,
	annotations map[string]string,
) (hostVethName, contVethMAC string, err error) {
	// desiredVethName is the host-side interface name, in the format "cali" + sha1(namespace.pod)[:11]
	hostVethName = desiredVethName
	// The container-side interface name is usually eth0
	contVethName := args.IfName

	err = ns.WithNetNSPath(args.Netns, func(hostNS ns.NetNS) error {
		veth := &netlink.Veth{
			LinkAttrs: netlink.LinkAttrs{
				Name: contVethName,
				MTU:  d.mtu,
			},
			PeerName: hostVethName,
		}
		// Create the veth pair: eth0 on the container side and "cali" + sha1(namespace.pod)[:11] on the host side
		// This is equivalent to the command: ip link add xxx type veth peer name xxx
		if err := netlink.LinkAdd(veth); err != nil {
			// ...
		}
		hostVeth, err := netlink.LinkByName(hostVethName)
		if mac, err := net.ParseMAC("EE:EE:EE:EE:EE:EE"); err != nil {
			// ...
		} else {
			// Set the MAC address of the host-side interface to ee:ee:ee:ee:ee:ee
			if err = netlink.LinkSetHardwareAddr(hostVeth, mac); err != nil {
				d.logger.Warnf("failed to Set MAC of %q: %v. Using kernel generated MAC.", hostVethName, err)
			}
		}

		// ...
		hasIPv4 = true

		// ip link set up on the host side
		if err = netlink.LinkSetUp(hostVeth); err != nil {
			// ...
		}
		// ip link set up on the container side
		contVeth, err := netlink.LinkByName(contVethName)
		if err = netlink.LinkSetUp(contVeth); err != nil {
			// ...
		}
		// Fetch the MAC from the container Veth. This is needed by Calico.
		contVethMAC = contVeth.Attrs().HardwareAddr.String()
		if hasIPv4 {
			// Add the default gateway routes on the container side, such as:
			// default via 169.254.1.1 dev eth0
			// 169.254.1.1 dev eth0 scope link
			gw := net.IPv4(169, 254, 1, 1)
			gwNet := &net.IPNet{IP: gw, Mask: net.CIDRMask(32, 32)}
			err := netlink.RouteAdd(
				&netlink.Route{
					LinkIndex: contVeth.Attrs().Index,
					Scope:     netlink.SCOPE_LINK,
					Dst:       gwNet,
				},
			)
		}

		// Assign the pod IP address obtained from the calico-ipam plugin to the container-side interface
		for _, addr := range result.IPs {
			if err = netlink.AddrAdd(contVeth, &netlink.Addr{IPNet: &addr.Address}); err != nil {
				return fmt.Errorf("failed to add IP addr to %q: %v", contVeth, err)
			}
		}
		// ...
		// Move the host-side veth into the host network namespace
		if err = netlink.LinkSetNsFd(hostVeth, int(hostNS.Fd())); err != nil {
			return fmt.Errorf("failed to move veth to host netns: %v", err)
		}

		return nil
	})

	// Configure the sysctls of the host-side veth interface: enable forwarding and proxy ARP on it
	err = d.configureSysctls(hostVethName, hasIPv4, hasIPv6)

	// ip link set up on the host-side veth
	hostVeth, err := netlink.LinkByName(hostVethName)
	if err = netlink.LinkSetUp(hostVeth); err != nil {
		return "", "", fmt.Errorf("failed to set %q up: %v", hostVethName, err)
	}

	// Configure the route on the host side
	err = SetupRoutes(hostVeth, result)

	return hostVethName, contVethMAC, err
}

func SetupRoutes(hostVeth netlink.Link, result *current.Result) error {
	// Configure the host-side route: the destination is usually the pod IP, e.g. 10.217.120.85, and packets for it are sent to the calid0bda9976d5 interface. The route looks like:
	// 10.217.120.85 dev calid0bda9976d5 scope link
	for _, ipAddr := range result.IPs {
		route := netlink.Route{
			LinkIndex: hostVeth.Attrs().Index,
			Scope:     netlink.SCOPE_LINK,
			Dst:       &ipAddr.Address,
		}
		err := netlink.RouteAdd(&route)
        // ...
	}
	return nil
}

// The English comments below are left as they are in the source; they explain the settings in detail.

// configureSysctls configures necessary sysctls required for the host side of the veth pair for IPv4 and/or IPv6.
func (d *linuxDataplane) configureSysctls(hostVethName string, hasIPv4, hasIPv6 bool) error {
  var err error
  if hasIPv4 {
    // Normally, the kernel has a delay before responding to proxy ARP but we know
    // that's not needed in a Calico network so we disable it.
    if err = writeProcSys(fmt.Sprintf("/proc/sys/net/ipv4/neigh/%s/proxy_delay", hostVethName), "0"); err != nil {
        return fmt.Errorf("failed to set net.ipv4.neigh.%s.proxy_delay=0: %s", hostVethName, err)
    }
    
    // Enable proxy ARP, this makes the host respond to all ARP requests with its own
    // MAC. We install explicit routes into the containers network
    // namespace and we use a link-local address for the gateway. Turning on proxy ARP
    // means that we don't need to assign the link local address explicitly to each
    // host side of the veth, which is one fewer thing to maintain and one fewer
    // thing we may clash over.
    if err = writeProcSys(fmt.Sprintf("/proc/sys/net/ipv4/conf/%s/proxy_arp", hostVethName), "1"); err != nil {
        return fmt.Errorf("failed to set net.ipv4.conf.%s.proxy_arp=1: %s", hostVethName, err)
    }
    
    // Enable IP forwarding of packets coming _from_ this interface. For packets to
    // be forwarded in both directions we need this flag to be set on the fabric-facing
    // interface too (or for the global default to be set).
    if err = writeProcSys(fmt.Sprintf("/proc/sys/net/ipv4/conf/%s/forwarding", hostVethName), "1"); err != nil {
        return fmt.Errorf("failed to set net.ipv4.conf.%s.forwarding=1: %s", hostVethName, err)
    }
  }

  if hasIPv6 {
     // ...	
  }
  
  return nil
}
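
To see at a glance which kernel settings configureSysctls touches for IPv4, the following sketch lists the three /proc/sys entries and their values for a given host-side interface. It only builds and prints the paths, since writing them requires root and an existing interface. The `sysctl` type and `hostVethSysctlsV4` function are hypothetical names introduced for this illustration.

```go
package main

import "fmt"

// sysctl is one /proc/sys entry together with the value written to it.
type sysctl struct {
	Path  string
	Value string
}

// hostVethSysctlsV4 lists the IPv4 settings applied to the host-side veth,
// in the same order in which configureSysctls writes them.
func hostVethSysctlsV4(hostVethName string) []sysctl {
	return []sysctl{
		{Path: fmt.Sprintf("/proc/sys/net/ipv4/neigh/%s/proxy_delay", hostVethName), Value: "0"},
		{Path: fmt.Sprintf("/proc/sys/net/ipv4/conf/%s/proxy_arp", hostVethName), Value: "1"},
		{Path: fmt.Sprintf("/proc/sys/net/ipv4/conf/%s/forwarding", hostVethName), Value: "1"},
	}
}

func main() {
	for _, s := range hostVethSysctlsV4("calid0bda9976d5") {
		fmt.Printf("%s = %s\n", s.Path, s.Value)
	}
}
```

On a node running Calico, reading these files for one of the cali* interfaces is a quick way to check that proxy ARP and forwarding are really enabled.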

Conclusion

At this point, the calico binary has created the network resources for a sandbox container: a veth pair has been created, the host-side interface has been given its fixed MAC address, an IP address has been assigned on the container side, and the default gateway route has been configured on the container side. On the host side, a route has been added so that traffic destined for the sandbox container IP goes to the host-side veth interface, and proxy ARP and packet forwarding have been enabled on that interface. Finally, a WorkloadEndpoint object is generated from this network data and stored in the Calico datastore.
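
As a recap, the network resources summarized above can be written down as roughly equivalent ip commands. The sketch below only builds the command strings; it does not run them. The `podNetwork` type and its methods are hypothetical, the /32 prefix is taken from the ipNetworks field of the WorkloadEndpoint example, and the step that moves the host-side end into the host namespace is left as a comment because its exact command form depends on how the namespaces are referenced.

```go
package main

import "fmt"

// podNetwork holds the values that the plugin works with for one pod.
type podNetwork struct {
	ContIf string // container-side interface, usually eth0
	HostIf string // host-side interface, "cali" + sha1(namespace.pod)[:11]
	PodIP  string // address obtained from calico-ipam
}

// containerSide lists ip commands that roughly mirror the netlink calls made
// inside the container network namespace.
func (p podNetwork) containerSide() []string {
	return []string{
		fmt.Sprintf("ip link add %s type veth peer name %s", p.ContIf, p.HostIf),
		fmt.Sprintf("ip link set %s address ee:ee:ee:ee:ee:ee", p.HostIf),
		fmt.Sprintf("ip link set %s up", p.HostIf),
		fmt.Sprintf("ip link set %s up", p.ContIf),
		fmt.Sprintf("ip route add 169.254.1.1 dev %s scope link", p.ContIf),
		fmt.Sprintf("ip route add default via 169.254.1.1 dev %s", p.ContIf),
		fmt.Sprintf("ip addr add %s/32 dev %s", p.PodIP, p.ContIf),
		// The host-side end is then moved into the host network namespace.
	}
}

// hostSide lists the commands that mirror the calls made in the host
// network namespace once the host-side end has arrived there.
func (p podNetwork) hostSide() []string {
	return []string{
		fmt.Sprintf("ip link set %s up", p.HostIf),
		fmt.Sprintf("ip route add %s dev %s scope link", p.PodIP, p.HostIf),
	}
}

func main() {
	p := podNetwork{ContIf: "eth0", HostIf: "calid0bda9976d5", PodIP: "10.217.120.85"}
	for _, c := range append(p.containerSide(), p.hostSide()...) {
		fmt.Println(c)
	}
}
```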

However, one key piece of logic is still missing: how calico-ipam assigns IP addresses. I will study it and write it up later when I have time.