Even when your cluster is running smoothly, Kubernetes upgrades can be a difficult task. Since Kubernetes releases a new version roughly every three months, upgrades come around often; if you don't upgrade your Kubernetes cluster for a year, you will fall far behind. To address this operational pain point, Rancher created a new open source project, the System Upgrade Controller, to help make upgrades smooth.
The System Upgrade Controller introduces a new Kubernetes custom resource definition (CRD) called the Plan. The Plan is the main component that drives the upgrade process; see the architecture diagram in the Git repo for an overview.
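To give a sense of the resource before we install anything, here is a minimal Plan sketch; the field names come from the upgrade.cattle.io/v1 API used later in this article, while the name and version values are purely illustrative:

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: example-plan            # illustrative name
  namespace: system-upgrade
spec:
  concurrency: 1                # how many nodes to upgrade at a time
  version: v1.17.4+k3s1         # target version for matching nodes
  nodeSelector:                 # which nodes this Plan applies to
    matchExpressions:
    - {key: k3s-upgrade, operator: Exists}
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade  # container that performs the upgrade step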
Use the System Upgrade Controller to Automatically Upgrade K3s
There are two main requirements for upgrading the K3s Kubernetes cluster:

- CRD installation
- Create a Plan
First, let's examine the version the K3s cluster is currently running. If you need a cluster, run the following commands to quickly install one:
# For master install:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.16.3-k3s.2 sh -

# For joining nodes: K3S_TOKEN is created at /var/lib/rancher/k3s/server/node-token
# on the server. For adding nodes, K3S_URL and K3S_TOKEN need to be passed:
curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=XXX sh -

# The KUBECONFIG file is created at /etc/rancher/k3s/k3s.yaml
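To use kubectl against this cluster from the server node, point it at the generated kubeconfig (path taken from the install output above):

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml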
kubectl get nodes
NAME               STATUS   ROLES    AGE   VERSION
kube-node-c155     Ready    <none>   25h   v1.16.3-k3s.2
kube-node-2404     Ready    <none>   25h   v1.16.3-k3s.2
kube-master-303d   Ready    master   25h   v1.16.3-k3s.2
Now we deploy the controller and its supporting resources:
apiVersion: v1
kind: Namespace
metadata:
  name: system-upgrade
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: system-upgrade
  namespace: system-upgrade
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system-upgrade
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: system-upgrade
  namespace: system-upgrade
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: default-controller-env
  namespace: system-upgrade
data:
  SYSTEM_UPGRADE_CONTROLLER_DEBUG: "false"
  SYSTEM_UPGRADE_CONTROLLER_THREADS: "2"
  SYSTEM_UPGRADE_JOB_ACTIVE_DEADLINE_SECONDS: "900"
  SYSTEM_UPGRADE_JOB_BACKOFF_LIMIT: "99"
  SYSTEM_UPGRADE_JOB_IMAGE_PULL_POLICY: "Always"
  SYSTEM_UPGRADE_JOB_KUBECTL_IMAGE: "rancher/kubectl:v1.18.3"
  SYSTEM_UPGRADE_JOB_PRIVILEGED: "true"
  SYSTEM_UPGRADE_JOB_TTL_SECONDS_AFTER_FINISH: "900"
  SYSTEM_UPGRADE_PLAN_POLLING_INTERVAL: "15m"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: system-upgrade-controller
  namespace: system-upgrade
spec:
  selector:
    matchLabels:
      upgrade.cattle.io/controller: system-upgrade-controller
  template:
    metadata:
      labels:
        upgrade.cattle.io/controller: system-upgrade-controller # necessary to avoid drain
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - {key: "node-role.kubernetes.io/master", operator: In, values: ["true"]}
      serviceAccountName: system-upgrade
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: system-upgrade-controller
        image: rancher/system-upgrade-controller:v0.5.0
        imagePullPolicy: IfNotPresent
        envFrom:
        - configMapRef:
            name: default-controller-env
        env:
        - name: SYSTEM_UPGRADE_CONTROLLER_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['upgrade.cattle.io/controller']
        - name: SYSTEM_UPGRADE_CONTROLLER_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: etc-ssl
          mountPath: /etc/ssl
        - name: tmp
          mountPath: /tmp
      volumes:
      - name: etc-ssl
        hostPath:
          path: /etc/ssl
          type: Directory
      - name: tmp
        emptyDir: {}
Breaking down the YAML above, it will create the following components:

- The system-upgrade namespace
- The system-upgrade ServiceAccount
- The system-upgrade ClusterRoleBinding, bound to the cluster-admin ClusterRole
- A ConfigMap that sets environment variables in the controller container (tweakable after install, as shown below)
- The actual controller Deployment
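The ConfigMap values can be changed after installation. For example, here is a sketch of enabling debug logging, using the variable name from the ConfigMap above and standard kubectl patch/rollout commands:

# Enable debug logging in the controller
kubectl -n system-upgrade patch configmap default-controller-env \
  --type merge -p '{"data":{"SYSTEM_UPGRADE_CONTROLLER_DEBUG":"true"}}'
# Restart the controller so the pod picks up the new environment
kubectl -n system-upgrade rollout restart deployment/system-upgrade-controller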
Now let's apply the YAML:
# Get the latest release tag
curl -s "https://api.github.com/repos/rancher/system-upgrade-controller/releases/latest" | awk -F '"' '/tag_name/{print $4}'
v0.6.2

# Apply the controller manifest
kubectl apply -f https://raw.githubusercontent.com/rancher/system-upgrade-controller/v0.6.2/manifests/system-upgrade-controller.yaml
namespace/system-upgrade created
serviceaccount/system-upgrade created
clusterrolebinding.rbac.authorization.k8s.io/system-upgrade created
configmap/default-controller-env created
deployment.apps/system-upgrade-controller created

# Verify everything is running
kubectl get all -n system-upgrade
NAME                                             READY   STATUS    RESTARTS   AGE
pod/system-upgrade-controller-7fff98589f-blcxs   1/1     Running   0          5m26s

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/system-upgrade-controller   1/1     1            1           5m28s

NAME                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/system-upgrade-controller-7fff98589f   1         1         1       5m28s
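If anything looks off, two quick checks help: confirm the Plan CRD was registered, and tail the controller logs (standard kubectl commands; the CRD name follows from the upgrade.cattle.io API group):

# Confirm the Plan CRD is registered
kubectl get crd plans.upgrade.cattle.io
# Tail the controller logs
kubectl -n system-upgrade logs -f deployment/system-upgrade-controller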
Create a K3s upgrade Plan
Now it's time to create an upgrade Plan. We'll use the sample Plan from the examples folder of the Git repo.
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server
  namespace: system-upgrade
  labels:
    k3s-upgrade: server
spec:
  concurrency: 1
  version: v1.17.4+k3s1
  nodeSelector:
    matchExpressions:
    - {key: k3s-upgrade, operator: Exists}
    - {key: k3s-upgrade, operator: NotIn, values: ["disabled", "false"]}
    - {key: k3s.io/hostname, operator: Exists}
    - {key: k3os.io/mode, operator: DoesNotExist}
    - {key: node-role.kubernetes.io/master, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
#  drain:
#    force: true
  upgrade:
    image: rancher/k3s-upgrade
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-agent
  namespace: system-upgrade
  labels:
    k3s-upgrade: agent
spec:
  concurrency: 2
  version: v1.17.4+k3s1
  nodeSelector:
    matchExpressions:
    - {key: k3s-upgrade, operator: Exists}
    - {key: k3s-upgrade, operator: NotIn, values: ["disabled", "false"]}
    - {key: k3s.io/hostname, operator: Exists}
    - {key: k3os.io/mode, operator: DoesNotExist}
    - {key: node-role.kubernetes.io/master, operator: NotIn, values: ["true"]}
  serviceAccountName: system-upgrade
  prepare:
    # Since v0.5.0-m1 SUC will use the resolved version of the plan for the tag on the prepare container.
    # image: rancher/k3s-upgrade:v1.17.4-k3s1
    image: rancher/k3s-upgrade
    args: ["prepare", "k3s-server"]
  drain:
    force: true
  upgrade:
    image: rancher/k3s-upgrade
Unpacking the YAML above: each Plan uses nodeSelector match expressions to decide which nodes need to be upgraded. In this example we have two Plans, k3s-server and k3s-agent. Nodes labeled node-role.kubernetes.io/master=true are picked up by the k3s-server Plan, while the remaining worker nodes are handled by the k3s-agent Plan, so the labels must be set correctly. Next, we apply the Plans.
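Before applying, you can double-check which labels the nodes currently carry; the -L flag prints a column per label:

kubectl get nodes -L node-role.kubernetes.io/master -L k3s-upgrade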
# Set the node labels
kubectl label node kube-master-303d node-role.kubernetes.io/master=true

# Apply the plan manifest
kubectl apply -f https://raw.githubusercontent.com/rancher/system-upgrade-controller/master/examples/k3s-upgrade.yaml
plan.upgrade.cattle.io/k3s-server created
plan.upgrade.cattle.io/k3s-agent created

# We see that the jobs have started
kubectl get jobs -n system-upgrade
NAME                                                              COMPLETIONS   DURATION   AGE
apply-k3s-server-on-kube-master-303d-with-9efdeac5f6ede78-125aa   0/1           40s        40s
apply-k3s-agent-on-kube-node-2404-with-9efdeac5f6ede78917-07df3   0/1           39s        39s
apply-k3s-agent-on-kube-node-c155-with-9efdeac5f6ede78917-9a585   0/1           39s        39s

# Upgrade in progress, completed on the node-role.kubernetes.io/master=true node
kubectl get nodes
NAME               STATUS                     ROLES    AGE   VERSION
kube-node-2404     Ready,SchedulingDisabled   <none>   26h   v1.16.3-k3s.2
kube-node-c155     Ready,SchedulingDisabled   <none>   26h   v1.16.3-k3s.2
kube-master-303d   Ready                      master   26h   v1.17.4+k3s1

# In a few minutes all nodes get upgraded to the latest version as per the plan
kubectl get nodes
NAME               STATUS   ROLES    AGE   VERSION
kube-node-2404     Ready    <none>   26h   v1.17.4+k3s1
kube-node-c155     Ready    <none>   26h   v1.17.4+k3s1
kube-master-303d   Ready    master   26h   v1.17.4+k3s1
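If you want to watch a rollout as it happens, the standard kubectl watch and inspection commands work well (the plan name comes from the manifest above):

# Watch the upgrade jobs until they complete
kubectl -n system-upgrade get jobs -w
# Inspect the resolved status of a plan
kubectl -n system-upgrade get plan k3s-server -o yaml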
Our K3s Kubernetes upgrade is complete! Very easy and very smooth. The project can also update the underlying operating system and reboot nodes. Give it a try!
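One more option worth knowing: a Plan does not have to pin a version. The spec also supports a channel URL that the controller polls, so new K3s releases roll out automatically. Here is a sketch of the server Plan using the stable K3s release channel; the channel field and URL come from the K3s automated-upgrades docs, so verify them for your setup:

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server
  namespace: system-upgrade
spec:
  concurrency: 1
  # Poll this channel instead of pinning a version; it resolves to the latest stable K3s release
  channel: https://update.k3s.io/v1-release/channels/stable
  nodeSelector:
    matchExpressions:
    - {key: node-role.kubernetes.io/master, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
  upgrade:
    image: rancher/k3s-upgrade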
GitHub repo:
github.com/rancher/system-upgrade-controller