Sealos project address: github.com/fanux/sealos
This article shows you how to build a highly available Kubernetes cluster with a single command, without HAProxy, Keepalived, or Ansible. Kernel-level IPVS is used to load balance the apiservers, together with an apiserver health check. The architecture is illustrated in the local kernel load balancing diagram later in this article.
This project, sealos, aims to be a simple, clean, lightweight and stable Kubernetes installation tool that supports highly available installations. It is not actually hard to build something powerful; it is much harder to keep it simple, flexible and extensible, and those principles must be followed throughout the implementation. Here are the design principles of sealos:
1. Design principles
Sealos features and benefits:
- Supports offline installation; the tool is separated from the resource package (binaries, configuration files, images, YAML files, etc.), so different Kubernetes versions only require swapping in a different offline package
- Extended certificate validity
- Simple to use
- Supports custom configuration
- Kernel-level load balancing, which is extremely stable and, because it is simple, extremely easy to troubleshoot
Why not use Ansible?
Version 1.0 did use Ansible, but installing Ansible itself requires Python and a number of dependencies, and you do not want to have to ship Ansible inside a container just so users can run it. If you do not want to set up passwordless SSH, you need tools like sshpass to log in with a username and password. In short, it did not satisfy me; it was not as simple as I wanted.
So I decided on a single dependency-free binary. File distribution and remote command execution are implemented through SDK calls, so nothing else is required.
Why not Keepalived and HAProxy?
Running HAProxy as a static pod is not a big problem and is still easy to manage. Keepalived, however, is installed with yum or apt in most open-source Ansible scripts, which is hard to control and has the following disadvantages:
- Inconsistent package sources can lead to inconsistent versions, and different versions even have different configuration files. I once found that a configuration did not take effect and could not track down the cause.
- In some environments it cannot be installed directly because of OS problems.
- Many installation scripts I have seen on the Internet get the health check and weight adjustment wrong: they only check whether the HAProxy process exists, when they should be checking whether the apiserver's healthz endpoint is healthy. If the apiserver fails while the HAProxy process is still alive, the cluster still breaks; that is pseudo high availability (see the sketch after this list).
- Management is scattered: monitoring it, for example with Prometheus, has to be set up separately, unlike static pods which kubelet manages uniformly. It is not as clean and concise.
- We have also seen cases where Keepalived maxed out the CPU.
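To make the health-check point concrete, a correct check asks the apiserver itself rather than the proxy in front of it. A minimal sketch (the local address and the unauthenticated curl are assumptions for illustration; anonymous access to /healthz has to be allowed for the second command to return a result):

# Wrong: only proves the proxy process exists, says nothing about the apiserver behind it
pgrep haproxy

# Better: ask the apiserver whether it is actually healthy
curl -k https://127.0.0.1:6443/healthz   # expected output: ok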
To solve these problems I once ran Keepalived inside a container (the community-provided images are basically unusable). There were plenty of problems during that rework, but they were eventually solved.
All in all, it was exhausting, so I wondered whether I could drop HAProxy and Keepalived altogether and build a simpler, more reliable solution, and I found one…
Why not use Envoy or Nginx for the local load balancer?
We solve the high-availability problem with local load balancing.
Local load balancing: a load balancer runs on every node, with the three masters as its upstreams.
If I used a load balancer like Envoy, I would have to run a proxy process on every node, consuming extra resources, which I do not want. IPVS does also run one extra process, lvscare, but lvscare is only responsible for managing the IPVS rules; like kube-proxy, the real traffic is forwarded by the stable kernel, with no need to push packets up into user space.
There is also an architectural problem that makes Envoy and the like awkward: if the load balancer is not up at join time, the join gets stuck and kubelet never comes up. So the Envoy would have to be started first, which means it cannot be managed as a static pod, and then you are back to the same problems as the Keepalived host deployment above. Run as a static pod it becomes interdependent with the join, a logical deadlock, a chicken-and-egg problem, and nothing ever starts.
With IPVS, however, I can set up the IPVS rules before joining, then join, and then keep guarding the rules. Once an apiserver becomes unreachable, its corresponding IPVS rule is automatically removed on every node and added back when that master recovers. The rules look roughly like the sketch below.
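For intuition, the rules maintained on each node are equivalent to something like the following ipvsadm commands (the virtual IP 10.103.97.2 and the master addresses are the defaults/examples used later in this article, and the round-robin scheduler is an assumption; lvscare creates the rules programmatically via netlink rather than by shelling out to ipvsadm):

# One virtual server on the node-local virtual IP, with the three apiservers as real servers
ipvsadm -A -t 10.103.97.2:6443 -s rr
ipvsadm -a -t 10.103.97.2:6443 -r 192.168.0.2:6443 -m
ipvsadm -a -t 10.103.97.2:6443 -r 192.168.0.3:6443 -m
ipvsadm -a -t 10.103.97.2:6443 -r 192.168.0.4:6443 -m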
Why a custom kubeadm?
First, kubeadm hard-codes the certificate expiration time, so it has to be customized to 99 years. Most people could simply sign new certificates instead, but we would rather not depend on yet another tool, so we change the source code directly.
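To see the effect on an installed cluster, you can print the expiry date of the apiserver certificate (the standard kubeadm certificate path is assumed here):

# Print the notAfter date of the apiserver certificate
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate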
Second, local load balancing is most conveniently implemented by modifying the kubeadm code, because two things have to happen at join time: the IPVS rules must be created before the join, and the static pod must be created. Without customizing kubeadm, the join reports that the static pod directory already has content, and ignoring that error is not elegant. Besides, kubeadm already provides some good SDKs for implementing this.
With the core functionality integrated into kubeadm, sealos becomes a lightweight tool that just distributes files and runs the upper-level commands, and the customized kubeadm can also be used on its own when adding nodes.
2. Usage tutorial
Install dependencies
- Install and start Docker
- Download the Kubernetes offline installation package
- Download the latest sealos
- Supports Kubernetes 1.14.0+
Installation
You only need to run the following command for multi-master HA:
$ sealos init --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz
And then… there is no "and then". Yes, your high-availability cluster is already installed. Confused? It really is that simple and fast!
Single master, multiple nodes:
$ sealos init --master 192.168.0.2 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz
Using passwordless login or an SSH key pair:
# --pk points to your SSH private key file
$ sealos init --master 172.16.198.83 \
    --node 172.16.198.84 \
    --pkg-url https://sealyun.oss-cn-beijing.aliyuncs.com/free/kube1.15.0.tar.gz \
    --pk /root/kubernetes.pem \
    --version v1.15.0
Parameter description:
--master   list of master server addresses
--node     list of node server addresses
--user     SSH user name for the servers
--passwd   SSH password for the servers
--pkg-url  location of the offline package; it can be a local path or an HTTP URL (sealos will wget it onto the target machines)
--version  Kubernetes version
--pk       SSH private key path for passwordless login (default /root/.ssh/id_rsa)
Other parameters:
--kubeadm-config string   a custom kubeadm-config.yaml configuration file
--vip string              virtual IP (default "10.103.97.2"); changing it is not recommended when local load balancing is used
Check whether the installation is normal:
$ kubectl get node
NAME                      STATUS   ROLES    AGE     VERSION
izj6cdqfqw4o4o9tc0q44rz   Ready    master   2m25s   v1.14.1
izj6cdqfqw4o4o9tc0q44sz   Ready    master   119s    v1.14.1
izj6cdqfqw4o4o9tc0q44tz   Ready    master   63s     v1.14.1
izj6cdqfqw4o4o9tc0q44uz   Ready    <none>   38s     v1.14.1

$ kubectl get pod --all-namespaces
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5cbcccc885-9n2p8          1/1     Running   0          3m1s
kube-system   calico-node-656zn                                 1/1     Running   0          93s
kube-system   calico-node-bv5hn                                 1/1     Running   0          2m54s
kube-system   calico-node-f2vmd                                 1/1     Running   0          3m1s
kube-system   calico-node-tbd5l                                 1/1     Running   0          118s
kube-system   coredns-fb8b8dccf-8bnkv                           1/1     Running   0          3m1s
kube-system   coredns-fb8b8dccf-spq7r                           1/1     Running   0          3m1s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44rz                      1/1     Running   0          2m25s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44sz                      1/1     Running   0          2m53s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44tz                      1/1     Running   0          118s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44rz            1/1     Running   0          2m15s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44sz            1/1     Running   0          2m54s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44tz            1/1     Running   1          47s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44rz   1/1     Running   1          2m43s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44sz   1/1     Running   0          2m54s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44tz   1/1     Running   0          63s
kube-system   kube-proxy-b9b9z                                  1/1     Running   0          2m54s
kube-system   kube-proxy-nf66n                                  1/1     Running   0          3m1s
kube-system   kube-proxy-q2bqp                                  1/1     Running   0          118s
kube-system   kube-proxy-s5g2k                                  1/1     Running   0          93s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44rz            1/1     Running   1          2m43s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44sz            1/1     Running   0          2m54s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44tz            1/1     Running   0          61s
kube-system   kube-sealyun-lvscare-izj6cdqfqw4o4o9tc0q44uz      1/1     Running   0          86s
Add node
Get the join command on a master:
$ kubeadm token create --print-join-command
Then run the enhanced kubeadm on the node, adding the --master parameter to the join:
$ cd kube/shell && sh init.sh
$ echo "10.103.97.2 apiserver.cluster.local" >> /etc/hosts   # resolve the domain to the vip
$ kubeadm join 10.103.97.2:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --master 10.103.97.100:6443 \
    --master 10.103.97.101:6443 \
    --master 10.103.97.102:6443 \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866
You can also use the sealos join command:
$ sealos join --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --vip 10.103.97.2 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --pkg-url /root/kube1.15.0.tar.gz
Use custom kubeadm configuration files
Sometimes you may need to customize the kubeadm configuration file, such as adding the domain name sealyun.com to the certificate.
First you need to obtain the configuration file template:
$ sealos config -t kubeadm >> kubeadm-config.yaml.tmpl
Add sealyun.com to kubeadm-config.yaml.tmpl:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: {{.Version}}
controlPlaneEndpoint: "apiserver.cluster.local:6443"
networking:
  podSubnet: 100.64.0.0/10
apiServer:
  certSANs:
  - sealyun.com   # this is the newly added domain name
  - 127.0.0.1
  - apiserver.cluster.local
  {{range .Masters -}}
  - {{.}}
  {{end -}}
  - {{.VIP}}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  excludeCIDRs:
  - "{{.VIP}}/32"
Note: you do not need to modify the other parts; sealos fills in the template variables automatically.
Finally, use --kubeadm-config to point at the configuration file template at deployment time:
$ sealos init --kubeadm-config kubeadm-config.yaml.tmpl \
    --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz
Version update
This tutorial takes upgrading from 1.14 to 1.15 as an example; the principle is similar for other versions. Once you understand this, refer to the official documentation for the rest.
The upgrade process
- Upgrade kubeadm and import the images on all nodes
- Upgrade the control node
- Upgrade kubelet on the master nodes
- Upgrade the other master nodes
- Upgrade the worker nodes
- Verify the cluster status
Upgrade kubeadm
Copy the offline package to all nodes and run cd kube/shell && sh init.sh on each of them. This updates the kubeadm, kubectl and kubelet binaries and imports the newer version's images. A sketch of doing this by hand follows.
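A minimal sketch of the manual distribution (the host list, package path and the use of scp/ssh here are assumptions for illustration; sealos normally does the distribution for you):

# Copy the offline package to every node and run the init script
for host in 192.168.0.2 192.168.0.3 192.168.0.4 192.168.0.5; do
    scp /root/kube1.15.0.tar.gz root@$host:/root/
    ssh root@$host "tar zxvf /root/kube1.15.0.tar.gz && cd kube/shell && sh init.sh"
done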
Upgrade the control node
$ kubeadm upgrade plan
$ kubeadm upgrade apply v1.15.0
Restart kubelet:
$ systemctl restart kubelet
Upgrading kubelet is simply a matter of overwriting the binary under /usr/bin and restarting the kubelet service, since kubelet here is not installed through a package manager. The kubelet binary is in the conf/bin directory of the offline package; see the sketch below.
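For example (a sketch, assuming the offline package has been unpacked and you are inside its top-level directory):

# Stop kubelet, overwrite the old binary with the new one, then start it again
systemctl stop kubelet
cp conf/bin/kubelet /usr/bin/kubelet
systemctl start kubelet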
Upgrade the other master nodes
$ kubeadm upgrade apply
Upgrade the worker nodes
Drain the node (whether to drain it depends on your situation; if you prefer to just do it the rough way, skipping this is no big deal):
$ kubectl drain $NODE --ignore-daemonsets
Update kubelet configuration:
$ kubeadm upgrade node config --kubelet-version v1.15.0
Then upgrade kubelet: as on the masters, replace the binary and restart the kubelet service.
$ systemctl restart kubelet
Bring the node back (uncordon it):
$ kubectl uncordon $NODE
Verification
$ kubectl get nodes
If the version information is correct, the upgrade is basically successful.
What kubeadm upgrade apply does:
- Checks whether the cluster can be upgraded
- Applies the version upgrade policy (which versions can be upgraded to)
- Checks whether the required images are available
- Upgrades the control-plane components (apiserver, controller-manager, scheduler, and so on) and rolls them back if the upgrade fails
- Upgrades kube-dns and kube-proxy
- Creates new certificate files, backing up the old ones if they are more than 180 days old
Compiling from source
Since the netlink library is used, it is recommended to compile inside a container. It takes a single command:
$ docker run --rm -v $GOPATH/src/github.com/fanux/sealos:/go/src/github.com/fanux/sealos \
    -w /go/src/github.com/fanux/sealos -it golang:1.12.7 go build
If you are using Go modules, you need to compile with the vendor directory:
$ go build -mod vendor
Uninstall
$ sealos clean \
--master 192.168.0.2 \
--master 192.168.0.3 \
--master 192.168.0.4 \
--node 192.168.0.5 \
--user root \
--passwd your-server-password
3. Implementation principles of sealos
Execution flow
- Copy the offline installation package to the target machines (masters and nodes) via sftp or wget.
- Run kubeadm init on master0.
- Run kubeadm join on the other masters and bring up their control planes. This starts etcd on each of them, forms a cluster with master0's etcd, and starts the control-plane components (apiserver, controller-manager, and so on).
- Join the worker nodes, configuring the IPVS rules and /etc/hosts on them.
All access to the apiserver goes through the domain name. The nodes need to reach multiple masters through a virtual IP, so the apiserver address that each machine's kubelet and kube-proxy use is not the same, while kubeadm can only specify a single address in its configuration file. A domain name is therefore used and resolved to a different IP on each machine; when an IP address changes, only the resolution needs to be modified, as in the example below.
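For example (the addresses follow the manual deployment example later in this article; this is a sketch of the idea rather than commands sealos asks you to run by hand):

# On a master, apiserver.cluster.local resolves to that master's own address
echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts

# On a worker node, it resolves to the node-local virtual IP served by IPVS
echo "10.103.97.1 apiserver.cluster.local" >> /etc/hosts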
Local kernel load balancing
Local kernel-level load-balanced access to the masters is implemented on every node like this:
+----------+                       +---------------+   virtual server: 127.0.0.1:6443
| master0  |<----------------------|  ipvs nodes   |   real servers:
+----------+                      |+---------------+   10.103.97.200:6443
                                  |                    10.103.97.201:6443
+----------+                      |                    10.103.97.202:6443
| master1  |<---------------------+
+----------+                      |
                                  |
+----------+                      |
| master2  |<---------------------+
+----------+
A static pod running lvscare is installed on every node to guard the IPVS rules. If an apiserver becomes unavailable, its corresponding IPVS rule is automatically removed from all nodes and added back when the master recovers.
So three things are added to your node, and you can see them directly:
$ ls /etc/kubernetes/manifests   # the lvscare static pod has been added
$ ipvsadm -Ln                    # the IPVS rules it created
$ cat /etc/hosts                 # the virtual IP's domain name resolution has been added
Custom kubeadm
Sealos makes very few changes to kubeadm, mainly extending the certificate validity period and extending the join command. Below we focus on the changes to the join command.
First, join gains a --master parameter for specifying the list of master addresses:
flagSet.StringSliceVar(
	&locallb.LVScare.Masters, "master", []string{},
	"A list of ha masters, --master 192.168.0.2:6443 --master 192.168.0.2:6443",
)
This way the master address list can be retrieved and used for IPVS load balancing.
If the node is not a control-plane node and a master list was given (i.e., not a single-master setup), the local IPVS load balancer is created, pointing at the masters:
if data.cfg.ControlPlane == nil {
	fmt.Println("This is not a control plan")
	if len(locallb.LVScare.Masters) != 0 {
		locallb.CreateLocalLB(args[0])
	}
}
Then the lvscare static pod is written out to guard the IPVS rules:
if len(locallb.LVScare.Masters) != 0 {
	locallb.LVScareStaticPodToDisk("/etc/kubernetes/manifests")
}
**So even if you do not use sealos, you can deploy a cluster with just the custom kubeadm; it is only slightly more troublesome.** The installation steps are given below.
Kubeadm configuration file:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.0
controlPlaneEndpoint: "apiserver.cluster.local:6443"   # apiserver DNS name
apiServer:
  certSANs:
  - 127.0.0.1
  - apiserver.cluster.local
  - 172.20.241.205
  - 172.20.241.206
  - 172.20.241.207
  - 172.20.241.208
  - 10.103.97.1   # virtual ip
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  excludeCIDRs:
  - "10.103.97.1/32"   # note: without this, kube-proxy will clean up your IPVS rules
Execute the following commands on master0 (assuming master0's address is 10.103.97.100):
$ echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts   # resolve to master0's own address
$ kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs
$ mkdir ~/.kube && cp /etc/kubernetes/admin.conf ~/.kube/config
$ kubectl apply -f https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
Execute the following commands on master1 (assuming master1's address is 10.103.97.101):
$ echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts   # resolve to master0 first in order to join it
$ kubeadm join 10.103.97.100:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 \
    --experimental-control-plane \
    --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07
$ sed "s/10.103.97.100/10.103.97.101/g" -i /etc/hosts   # switch the resolution to this master's own address; otherwise everything keeps depending on master0, which is pseudo high availability
Execute the following commands on master2 (assuming master2's address is 10.103.97.102):
$ echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts
$ kubeadm join 10.103.97.100:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 \
    --experimental-control-plane \
    --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07
$ sed "s/10.103.97.100/10.103.97.102/g" -i /etc/hosts
When joining a node, add --master to specify the master address list:
$ echo "10.103.97.1 apiserver.cluster.local" >> /etc/hosts   # needs to resolve to the virtual IP
$ kubeadm join 10.103.97.1:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --master 10.103.97.100:6443 \
    --master 10.103.97.101:6443 \
    --master 10.103.97.102:6443 \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866
Offline package structure analysis
.
├── bin          # binaries for the specified version; only these three are needed, all other components run in containers
│   ├── kubeadm
│   ├── kubectl
│   └── kubelet
├── conf
│   ├── 10-kubeadm.conf   # not used by the new version; it is generated directly in the shell script so the cgroup driver can be detected
│   ├── dashboard
│   │   ├── dashboard-admin.yaml
│   │   └── kubernetes-dashboard.yaml
│   ├── heapster
│   │   ├── grafana.yaml
│   │   ├── heapster.yaml
│   │   ├── influxdb.yaml
│   │   └── rbac
│   │       └── heapster-rbac.yaml
│   ├── kubeadm.yaml      # kubeadm configuration file
│   ├── kubelet.service   # kubelet systemd configuration file
│   ├── net
│   │   └── calico.yaml
│   └── promethus
├── images       # all the image packages
└── shell
    ├── init.sh   # initialization script
    └── master.sh # script run on the masters
- init.sh: copies the binaries in the bin directory into $PATH, configures systemd, disables swap and the firewall, and imports the images the cluster needs (a rough sketch follows this list).
- master.sh: essentially just runs kubeadm init.
- The conf directory contains the kubeadm configuration file, the calico YAML file, and so on.
- Sealos calls the two scripts above, so most packages stay compatible; different versions can be kept compatible by fine-tuning the scripts.
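For reference, a minimal sketch of what init.sh roughly does (the exact commands, paths and file names here are assumptions for illustration, not the real script):

# Illustrative sketch of the init script's job
cp ../bin/kube* /usr/bin/                              # put kubeadm/kubectl/kubelet on the PATH
cp ../conf/kubelet.service /etc/systemd/system/        # install the kubelet systemd unit
systemctl daemon-reload && systemctl enable kubelet
swapoff -a                                             # disable swap
systemctl stop firewalld 2>/dev/null || true           # disable the firewall
docker load -i ../images/images.tar                    # import the offline images (tarball name assumed)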