Background:

The online Kubernetes environment was built with kubeadm, probably on version 1.15 originally, and it has been running steadily for nearly two years. There was one major upgrade, from 1.15 to 1.16, plus several patch upgrades; the current version is 1.16.15. I once tried to upgrade to a higher version, but something went wrong while upgrading the master cluster. Fortunately it is a three-node master cluster, so I rolled back to 1.16 and stayed there; no further upgrades were attempted. Yesterday the cluster upgrade finally got under way……

The cluster configuration

Hostname        System    IP
k8s-vip         slb       10.0.0.37
k8s-master-01   centos7   10.0.0.41
k8s-master-02   centos7   10.0.0.34
k8s-master-03   centos7   10.0.0.26
k8s-node-01     centos7   10.0.0.36
k8s-node-02     centos7   10.0.0.83
k8s-node-03     centos7   10.0.0.40
k8s-node-04     centos7   10.0.0.49
k8s-node-05     centos7   10.0.0.45
k8s-node-06     centos7   10.0.0.18

The Kubernetes upgrade process

1. Refer to official documentation

With reference to: https://kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/



https://v1-17.docs.kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

The kubeadm-created Kubernetes cluster is upgraded from version 1.16.x to 1.17.x, and then from version 1.17.x to 1.17.y, where y > x.

2. Confirm the upgradable version and upgrade plan

yum list --showduplicates kubeadm --disableexcludes=kubernetes
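Before deciding on a target, it also helps to confirm what is currently installed and running. A quick check might look like this (standard kubeadm/kubectl commands, nothing specific to this cluster):

# confirm the currently installed tool versions and the versions the nodes report
kubeadm version -o short
kubelet --version
kubectl version --short
kubectl get nodes -o wide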



Since my kubeadm version is 1.16.15, I have to upgrade to 1.17.15 first and then from 1.17.15 to 1.17.17 (upgrading across minor versions in one jump is not allowed). The master plane has three nodes: k8s-master-01, k8s-master-02 and k8s-master-03. Personally, I don’t like touching the first node first, so I started directly from the third node (k8s-master-03)……

3. Upgrade the k8s-master-03 node control plane

yum install kubeadm-1.17.15-0 --disableexcludes=kubernetes

sudo kubeadm upgrade plan



Can it go straight to 1.17.17? Let’s give it a try:

kubeadm upgrade apply v1.17.17

So it can’t go to 1.17.17, but it can go to 1.17.16? And kubeadm has to be upgraded first. How can that be? The policy only allows 1.y to 1.y+1, and kubeadm will not apply a version newer than itself, so 1.17.15 it is:

kubeadm upgrade apply v1.17.15

yum install -y kubelet-1.17.15-0 kubectl-1.17.15-0 --disableexcludes=kubernetes

systemctl daemon-reload
sudo systemctl restart kubelet

Well, I still can’t figure out why this node ends up reporting v1.17.16-rc.0, but it doesn’t matter. That’s it for now!
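To double-check what this node actually reports, something like the following works (a sketch; kubeadm names its static pods <component>-<nodename>, so adjust the names to your own masters):

# the version the node advertises, and the image its API server static pod runs
kubectl get node k8s-master-03
kubectl -n kube-system get pod kube-apiserver-k8s-master-03 -o jsonpath='{.spec.containers[0].image}{"\n"}'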




4. Upgrade the other control-plane nodes (k8s-master-01, k8s-master-02)

Run the following commands on each of the other two master nodes:

yum install -y kubeadm-1.17.15-0 --disableexcludes=kubernetes
kubeadm upgrade node
yum install -y kubelet-1.17.15-0 kubectl-1.17.15-0 --disableexcludes=kubernetes
systemctl daemon-reload
sudo systemctl restart kubelet









Log in to any master node:

[root@k8s-master-03 ~]# kubectl get nodes
NAME             STATUS                     ROLES    AGE    VERSION
k8s-master-01    Ready                      master   297d   v1.17.15
k8s-master-02    Ready                      master   297d   v1.17.15
k8s-master-03    Ready                      master   297d   v1.17.16-rc.0
k8s-node-01      Ready                      node     549d   v1.16.15
k8s-node-02      Ready                      node     2d5h   v1.16.15
k8s-node-03      Ready                      node     549d   v1.16.15
k8s-node-04      Ready                      node     547d   v1.16.15
k8s-node-05      Ready                      node     547d   v1.16.15
k8s-node-06      Ready                      node     192d   v1.16.15
test-ubuntu-01   Ready,SchedulingDisabled   <none>   47h    v1.16.15
tm-node-002      Ready                      node     154d   v1.16.15
tm-node-003      Ready                      <none>   99d    v1.16.15

5. Continue to upgrade the minor version to 1.17.17

Similarly, repeat the steps above to upgrade the minor version to 1.17.17
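For reference, repeating the sequence with the 1.17.17 packages looks roughly like this (a sketch assembled from the commands above, not a verbatim transcript): run the apply on one control-plane node and `kubeadm upgrade node` on the other two.

# on the first control-plane node to be upgraded
yum install -y kubeadm-1.17.17-0 --disableexcludes=kubernetes
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.17.17
yum install -y kubelet-1.17.17-0 kubectl-1.17.17-0 --disableexcludes=kubernetes
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# on the remaining control-plane nodes
yum install -y kubeadm-1.17.17-0 --disableexcludes=kubernetes
sudo kubeadm upgrade node
yum install -y kubelet-1.17.17-0 kubectl-1.17.17-0 --disableexcludes=kubernetes
sudo systemctl daemon-reload
sudo systemctl restart kubelet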

kubectl get nodes -o wide



Note: the control-plane node upgrades above skipped the step of draining (vacating) the nodes.

6. Worker node upgrade

Note: The demonstration is performed on the k8s-node-03 node

yum install kubeadm-1.17.17 kubectl-1.17.17 kubelet-1.17.17 --disableexcludes=kubernetes



Mark the node unschedulable and drain it:

 kubectl drain k8s-node-03 --ignore-daemonsets

kubeadm upgrade node
sudo systemctl daemon-reload
sudo systemctl restart kubelet

kubectl uncordon k8s-node-03
kubectl get nodes -o wide

Note: the later screenshots are omitted, just look at the results…… I only upgraded a few nodes for now. The remaining nodes will be upgraded when there is time, and the process should be much the same; if anything abnormal comes up, I will sort it out and analyze it then.
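When the remaining workers get their turn, the per-node sequence is the same as above. A minimal sketch, using k8s-node-04 (one of the nodes from the table) as the example; the kubectl drain/uncordon steps run from a machine with admin access, the rest on the worker itself:

# from an admin machine: cordon and drain the worker
kubectl drain k8s-node-04 --ignore-daemonsets

# on the worker itself: upgrade the packages and the kubelet configuration
yum install kubeadm-1.17.17 kubectl-1.17.17 kubelet-1.17.17 --disableexcludes=kubernetes
sudo kubeadm upgrade node
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# back on the admin machine: put the node back into rotation
kubectl uncordon k8s-node-04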

7. Some minor problems occurred during upgrade and use

1. clusterrole

There were still some minor exceptions. For example, the clusterrole system:kube-controller-manager used by my controller-manager no longer had sufficient permissions.



I don’t know whether it was because only two of the master nodes had been upgraded at that point, with k8s-master-01 still pending. When the problem occurred, I upgraded the k8s-master-01 node, deleted the clusterrole system:kube-controller-manager, and applied the clusterrole taken from my Kubernetes 1.21 cluster:

# export the clusterrole from the 1.21 cluster into 1.yaml (run against that cluster)
kubectl get clusterrole system:kube-controller-manager -o yaml > 1.yaml

# back up the existing clusterrole on this cluster, then apply the 1.21 version
kubectl get clusterrole system:kube-controller-manager -o yaml > clusterrole.yaml
kubectl apply -f 1.yaml
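A quick way to confirm the fix took effect (a sketch; the pod name follows kubeadm’s <component>-<nodename> convention, so substitute your own master’s name):

# the clusterrole should now carry the newer rules, and the controller-manager
# should stop logging RBAC "forbidden" errors
kubectl get clusterrole system:kube-controller-manager
kubectl -n kube-system logs kube-controller-manager-k8s-master-01 | grep -i forbidden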



Anyway, that seems to have solved it…. I’ll take a closer look at the clusterrole when I get the chance; I have been a bit muddled lately……

2. Flannel abnormalities

Here is another problem, this time with Flannel:

My cluster has been upgraded step by step from 1.15 and the existing nodes never had problems, but on newly added worker nodes the pods scheduled there hit all kinds of weird probe issues: either the probes fail, or the pods keep restarting….. What is going on?



I suspected Flannel, so I went to the Flannel repository on GitHub to take a look:



The Flannel version in my cluster was still v0.11, so it was time to upgrade Flannel……

kubectl delete -f xxx.yaml (the old Flannel plugin configuration file)

Download the official kube-flannel.yaml file.

Modify the network configuration in the manifest so it matches the cluster’s pod CIDR.
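For example, checking and adjusting the Network CIDR in the net-conf.json section of the downloaded manifest could look like this (a sketch; 10.244.0.0/16 is flannel’s default and only an assumption here, so substitute your cluster’s actual pod CIDR):

# show the net-conf.json block, then point its Network field at the pod CIDR in use
grep -A 6 'net-conf.json' kube-flannel.yaml
sed -i 's#"Network": ".*"#"Network": "10.244.0.0/16"#' kube-flannel.yaml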



Note: of course, if the cluster is still on 1.16, the apiVersion of the RBAC objects in the manifest also needs to be modified.

kubectl apply -f kube-flannel.yaml

After this change, the earlier probe failures and restarts have basically stopped happening.

3. Error from Prometheus

The Kubernetes version needs to match the corresponding version of the Prometheus stack (kube-prometheus):



Well, mine is on the early 0.4 branch, which still works with Kubernetes 1.17. However, the alerts for the controller-manager and scheduler were abnormal…… With reference to https://duiniwukenaihe.github.io/2021/05/14/Kubernetes-1.20.5-upgrade1.21.0%E5%90%8E%E9%81%97%E7%97%87/ I modified the kube-controller-manager and kube-scheduler configuration files. Of course, if I upgrade to 1.18 or later it will be more annoying….. switch branches and redeploy the configuration, or what?
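For the record, the fix described in that post boils down to exposing the metrics endpoints of the two components. A sketch of what I understand the edit to be (the flag values are an assumption; check the actual manifests on your own masters before changing anything):

# in the kubeadm static pod manifests, let the metrics ports listen beyond loopback
sudo sed -i 's/--bind-address=127.0.0.1/--bind-address=0.0.0.0/' /etc/kubernetes/manifests/kube-controller-manager.yaml
sudo sed -i 's/--bind-address=127.0.0.1/--bind-address=0.0.0.0/' /etc/kubernetes/manifests/kube-scheduler.yaml
# the kubelet picks up static pod manifest changes and restarts both components automatically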

Postscript:

  1. Try to refer to official documentation
  2. Remember to upgrade your network components
  3. If APIs have changed, remember to update the versions or configuration files of the associated components