This is the fourth day of my participation in the First Challenge 2022. For details, see: First Challenge 2022.

K8S series, learning K8S part seventeen: liveness probes and the replica mechanism (2)

This time we begin studying liveness probes and the replica controller in K8S.

How to keep pods healthy

We already know how to create, delete, and manage pods, but how do we keep them healthy?

We can use liveness probes and replicas.

Classification of probes

There are currently two kinds of probes:

  • Liveness probe

  • Readiness probe

This article covers the liveness probe.

Liveness probe

A liveness probe checks whether a container is still running. We can specify a liveness probe individually for each container in a pod. K8S executes the probe periodically, and if the probe fails, the container is restarted.

In K8S, there are three mechanisms for probing a container:

  • HTTP GET probe

The probe performs an HTTP GET request against the container's IP address, a specified port, and a path. If the response status code indicates success (2xx or 3xx), the probe succeeds; otherwise it fails, the container is terminated, and a new one is started in its place.

  • TCP socket probe

The probe attempts to establish a TCP connection with the specified port. If the connection is established, the probe succeeds; otherwise, it fails

  • Exec probe

The probe executes a command inside the container and checks the exit code. If the exit code is 0, the probe succeeds; otherwise it fails. A sketch of the TCP socket and exec variants is shown below.
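As a rough sketch (not from the original example; the port number and the file path here are assumptions), the TCP socket and exec variants can be declared under a container like this:

livenessProbe:
  tcpSocket:
    port: 8080          # assumed port; the probe succeeds if a TCP connection can be opened
---
livenessProbe:
  exec:
    command:            # the probe succeeds if the command exits with code 0
    - cat
    - /tmp/healthy      # assumed file path, for illustration only

Both are fragments of a container spec, placed where httpGet appears in the full example below.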

Liveness probe example

Let's create a pod and add a liveness probe to it.

kubia-liveness.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kubia-liveness
spec:
  containers:
  - image: luksa/kubia-unhealthy
    name: kubia
    livenessProbe:
      httpGet:
        path: /
        port: 8080


Using the previous kubia example, the image we pull this time is luksa/kubia-unhealthy. It differs from the earlier image in that it responds normally only to the first five requests; every request after that returns an error.

We can test the liveness probe like this.

Deploy the liveness probe example with the unhealthy kubia application.

Deploy the pod

kubectl create -f kubia-liveness.yaml

About 1-2 minutes after deployment, we can see that the pod we started is being restarted.

For example, within 11 minutes, kubia-liveness restarted five times.
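One simple way to observe this (assuming the pod name kubia-liveness) is to watch the RESTARTS column with the -w flag of kubectl get:

kubectl get po kubia-liveness -w
# the RESTARTS column increases each time the liveness probe fails enough times in a row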

View the logs of the crashed application

We usually use kubectl logs -f XXX to view logs, but now we want the logs of the previous, crashed container. We can do this:

kubectl logs kubia-liveness --previous

We can then see what the crashed application logged.

View pod details

kubectl describe po kubia-liveness

When viewing the pod details, we can see the following key information:

  • Exit Code

137 stands for 128 + x, where x is 9, the signal number of SIGKILL, so the container was forcibly terminated.

Sometimes it will be 143, in which case x is 15, the SIGTERM signal, a graceful termination request.

  • Liveness

    • Delay

    The amount of time to wait after the container starts before the first probe. If this value is 0, probing starts immediately after the container starts.

    • Timeout

    As can be seen from the figure above, the timeout is 1 second, so the container must respond within 1 s, otherwise the probe counts as failed.

    • Period

    The figure above shows a probe being performed every 10 seconds.

    • Failure count

    The container is restarted after three consecutive failed probes.

As shown in the figure above, we can also see that the pod's state is Unhealthy: the liveness probe failed because the container returned a 500 error instead of a successful response, so the container was restarted.

Configuring the parameters of the liveness probe

When configuring the parameters of the liveness probe, which correspond to the Liveness fields described above, we typically set a delay, because the application is often not ready immediately after the container starts.

Therefore, we need to set a delay, and this delay can also be thought of as the application's startup time.

We can add a setting so that the first probe is delayed until 20 s after the container starts:

apiVersion: v1
kind: Pod
metadata:
  name: kubia-liveness
spec:
  containers:
  - image: luksa/kubia-unhealthy
    name: kubia
    livenessProbe:
      httpGet:
        path: /
        port: 8080
      initialDelaySeconds: 20
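The other values shown in the Liveness line of kubectl describe can be set explicitly as well. A sketch, writing out the defaults from the earlier output alongside the 20 s delay (these are the standard probe fields):

livenessProbe:
  httpGet:
    path: /
    port: 8080
  initialDelaySeconds: 20   # delay: wait 20 s after the container starts before the first probe
  timeoutSeconds: 1         # timeout: the probe fails if there is no response within 1 s
  periodSeconds: 10         # period: probe every 10 s
  failureThreshold: 3       # failures: restart the container after 3 consecutive failures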

Precautions for liveness probes

When we create a liveness probe, we need to make sure it is an effective one:

  • Make sure the port and path the probe requests are configured correctly and actually valid
  • Make sure the probed path does not require authentication; otherwise the probe will keep failing and the container will be restarted over and over
  • The probe should only check the inside of your application and not be affected by external factors
  • The probe should not consume too many resources and should generally respond within 1 s (see the sketch below)
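As a sketch of the last two points, a probe that hits a dedicated, cheap health endpoint might look like this. The /healthz path is an assumption; the kubia image used in this article does not actually expose it:

livenessProbe:
  httpGet:
    path: /healthz    # assumed lightweight endpoint that checks only the app's internal state
    port: 8080
  timeoutSeconds: 1   # keep the probe cheap so it can answer well within 1 s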

A remaining problem

Using probes to keep pods healthy looks good: when a container in the pod misbehaves, the liveness probe lets the failing container be restarted right away.

However, if the node the pod is running on goes down, the liveness probe cannot help, because it is the kubelet on that node that runs the probes and restarts containers, and the node itself is now unavailable.

We can solve this with the replica mechanism.

ReplicationController, the replica controller

A ReplicationController is also a K8S resource. It ensures that the pods it manages are always running; if a pod disappears for any reason, the ReplicationController notices and creates a replacement pod.

Take RC as an example

RC is short for ReplicationController; its purpose is to create and manage multiple pod replicas.

For example, node1 has two pods on it, pod AA and pod BB:

  • Pod AA was created on its own and is not controlled by the RC
  • Pod BB is controlled by the RC

When node1 fails, pod AA is simply gone, and nothing takes care of it.

Pod BB is different: when node1 fails, the RC creates a new copy of pod BB on node2.

A small RC example

An RC is also a K8S resource, so we can create one with JSON or YAML. An RC has three important parts:

  • Label selector

Determines which pods fall within the RC's scope

  • Replica count

Specifies the number of pods that should be running

  • Pod template

Used to create new pod replicas

We can write it like this, creating an RC named kubia

kubia-rc.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  name: kubia
spec:
  replicas: 3
  selector:
    app: xmt-kubia
  template:
    metadata:
      labels:
        app: xmt-kubia
    spec:
      containers:
      - name: rc-kubia
        image: xiaomotong888/xmtkubia
        ports:
        - containerPort: 8080
  • This creates an RC named kubia, with 3 replicas and the selector app=xmt-kubia
  • In the RC's pod template, the image to pull is xiaomotong888/xmtkubia and the pod label is app=xmt-kubia

When creating an RC, we do not have to write the selector: the Kubernetes API checks that the YAML is valid and contains a template, and if it does, the selector defaults to the labels in the pod template.

But we must write the pod template, otherwise the Kubernetes API will report an error.
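For reference, a sketch of the same RC with the selector omitted; the API server then derives the selector from the labels in the pod template (app: xmt-kubia here):

apiVersion: v1
kind: ReplicationController
metadata:
  name: kubia
spec:
  replicas: 3
  # no selector here: it is inferred from the template labels below
  template:
    metadata:
      labels:
        app: xmt-kubia
    spec:
      containers:
      - name: rc-kubia
        image: xiaomotong888/xmtkubia
        ports:
        - containerPort: 8080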

Deploy the RC

kubectl create -f kubia-rc.yaml

View the RC and the pod labels

kubectl get rc

kubectl get pod --show-labels

We can see the RC resource we created: the desired number of pods is 3, the current number is 3, and all 3 are ready.

kubia-749wj

kubia-mpvkd

kubia-tw848

Plus, they are all labeled app=xmt-kubia, so everything is as expected.

Scaling the RC up and down

The RC controls the creation and removal of pods, roughly as follows.

When the RC starts, it searches the cluster for pods matching its label selector:

  • If the number of matching pods is less than the desired count configured in the RC, new pods are created
  • If the number of matching pods is greater than the desired count, the surplus pods are deleted

We try deleting kubia-749wj to verify that the RC automatically creates a new pod.

kubectl delete po kubia-749wj

Sure enough, kubia-749wj is terminated and the RC creates a new pod for us.

View details of the RC

kubectl describe rc kubia

We can see that since the RC was created, there are four records of it creating pods.

Modify the pod label

Let's try modifying the label of a pod to see whether it affects the RC.

Add a ver=dev label to the kubia-mpvkd pod and see whether the pod is deleted:

kubectl get pod --show-labels
kubectl label pod kubia-mpvkd ver=dev

As far as the RC is concerned, it only manages pods carrying the label it is configured with; it does not care about other labels.

Overwrite the app label

As mentioned earlier, if a pod already has a given label, it must be changed with --overwrite; otherwise kubectl reports an error. This is how k8s keeps you from accidentally overwriting an existing label.

Let's change the app label to app=anonymous:

kubectl label pod kubia-mpvkd app=anonymous --overwrite

The effect: the original kubia-mpvkd pod keeps running unaffected, but the RC detects that the number of pods labeled app=xmt-kubia is below its desired count, so it creates a new pod to take its place.

The simple process and effect are shown below:

Modifying the pod template

Modifying the template is as simple as editing the RC configuration:

kubectl edit rc kubia

After running the command, change the image in the pod template section of the RC configuration to the image you want to use. The simple flow of changing the pod template is as follows:

From the figure above we can see that modifying the pod template has no effect on the existing pods, because the number of replicas has not changed.

Only when we delete a pod does the RC detect that the actual number of pods is below the desired number, and it then creates a new pod using the template we just modified.
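For reference, the same template change can also be made non-interactively with kubectl patch. A minimal sketch, assuming a hypothetical new tag xiaomotong888/xmtkubia:v2:

kubectl patch rc kubia -p '{"spec":{"template":{"spec":{"containers":[{"name":"rc-kubia","image":"xiaomotong888/xmtkubia:v2"}]}}}}'
# as with kubectl edit, existing pods keep the old image until they are recreated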

Modify the number of replicas

We can try changing the replica count to 6. Same operation: edit the RC configuration.

kubectl edit rc kubia

As expected, the RC creates three more pods for us, making six in total.

Now change the number of replicas to 2.

As expected, the RC does delete four of the pods, leaving only two.
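Editing the RC is not the only way; kubectl scale does the same thing in a single command. For example:

kubectl scale rc kubia --replicas=2
# the RC then deletes surplus pods (or creates new ones) until exactly 2 replicas remain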

Delete the rc

If the RC is deleted, its pods are destroyed along with it.

However, we can also delete the RC without affecting its pods:

kubectl delete rc kubia --cascade=false

--cascade=false is deprecated, so use --cascade=orphan instead.

Deleting the RC this way works; let's look at the simple deletion process and its effect.
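We can confirm the result with the same commands used earlier:

kubectl get rc                    # the RC is gone
kubectl get pod --show-labels     # the pods are still running, now unmanaged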

That's it for today. If anything here is off, please point it out.

You are welcome to like, follow, and bookmark.

Friends, your support and encouragement are what keep me sharing and improving.

All right, that’s it for this time

Technology is open, and our mindset should be even more open. Embrace change, live in the sun, and strive to move forward.

I am Nezha. Feel free to leave a like, and see you next time ~