
Taints and tolerations work in pairs: a taint is set on a Node, and a toleration is set on a Pod.

Node – Taint

Setting a taint on a node prevents Pods that do not tolerate the taint from being scheduled onto that node; what happens to intolerant Pods depends on the taint's effect.

The command usage can be viewed with kubectl taint --help.

A node taint consists of a key, a value, and an effect, written as key=value:effect.
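
As a minimal sketch (using a hypothetical node named node1), a taint in this format is added with kubectl taint and removed by appending a trailing dash to the same specification:

$ kubectl taint nodes node1 disk=hdd:NoSchedule
node/node1 tainted

$ kubectl taint nodes node1 disk=hdd:NoSchedule-
node/node1 untainted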


Each effect leads to a different behavior (a quick way to check a node's taints follows this list):

  1. NoSchedule means that if a Pod does not tolerate the taint, it cannot be scheduled onto the node that carries it
  2. PreferNoSchedule is a soft version of NoSchedule: the scheduler tries to avoid the tainted node, but will still schedule onto it if no other node is available
  3. NoExecute differs from NoSchedule in that it also applies to Pods already running on the node. If a taint with the NoExecute effect is added to a node, Pods running there that do not tolerate it are evicted
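
Whichever effect is chosen, the taints currently set on a node can be checked with kubectl describe (a quick sketch with the hypothetical node1; output alignment may vary):

$ kubectl describe node node1 | grep Taints
Taints:             disk=hdd:NoSchedule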

The Taint structure is defined as follows:

type Taint struct {
    // Key is the taint key
    Key string `json:"key"`

    // Value is the taint value
    Value string `json:"value,omitempty"`

    // Effect is the behavior applied to Pods that do not tolerate this taint
    // The possible values are NoSchedule, PreferNoSchedule, and NoExecute
    Effect TaintEffect `json:"effect"`

    // TimeAdded is when the taint was added to the node; only used for NoExecute taints
    TimeAdded *metav1.Time `json:"timeAdded,omitempty"`
}

Pod – Toleration

A Pod declares which node taints it can tolerate by setting tolerations.

type Toleration struct {
    // Key is the taint key to be tolerated
    // An empty Key matches all keys; if Key is empty, Operator below must be Exists
    // That combination matches all keys and all values, i.e. tolerates everything
    Key string `json:"key,omitempty"`

    // Operator describes the relationship between Key and Value
    // The value can be Exists or Equal; the default is Equal
    // Exists acts as a wildcard: any taint with a matching Key is tolerated, whatever its value
    // Equal requires both the Key and the Value to match
    Operator TolerationOperator `json:"operator,omitempty"`

    // Value is the taint value to match
    // If Operator is Exists, Value should be empty; otherwise it must be the string to match
    Value string `json:"value,omitempty"`

    // Effect is the taint effect to match. An empty value matches all effects
    // The possible values are NoSchedule, PreferNoSchedule, and NoExecute
    Effect TaintEffect `json:"effect,omitempty"`

    // TolerationSeconds only applies when Effect is NoExecute. It is how long the Pod tolerates the taint
    // By default it is unset, which means the taint is tolerated forever
    // Once set, the Pod is evicted when the time is up even if the toleration still matches; zero and negative values mean evict immediately
    TolerationSeconds *int64 `json:"tolerationSeconds,omitempty"`
}

// TolerationOperator is defined as follows
type TolerationOperator string
const (
    TolerationOpExists TolerationOperator = "Exists"
    TolerationOpEqual  TolerationOperator = "Equal"
)

// TaintEffect is defined as follows
type TaintEffect string
const (
    // Do not allow new Pods to be scheduled onto the node unless they tolerate the taint.
    // This is enforced by the scheduler, so Pods submitted directly to the kubelet (bypassing the scheduler) can still start,
    // and Pods that are already running are allowed to continue running.
    TaintEffectNoSchedule TaintEffect = "NoSchedule"

    // Like TaintEffectNoSchedule, but the scheduler only tries to avoid scheduling new Pods onto the node, without forbidding it.
    // Enforced by the scheduler.
    TaintEffectPreferNoSchedule TaintEffect = "PreferNoSchedule"

    // Evict all Pods that do not tolerate the taint (even if they are already running).
    // Enforced by the node controller.
    TaintEffectNoExecute TaintEffect = "NoExecute"
)

With these structure definitions in hand, the basic configuration should be fairly clear.

For a Pod to tolerate a node's taint, the key and effect in the Pod's toleration must match the taint's settings, and one of the following must hold:

  1. The value of operator is Exists
  2. The value of operator is Equal and the values are equal

In addition, there are two special cases for key and effect (illustrated by the sketch after this list):

  1. An empty key combined with the Exists operator matches all keys and all values
  2. An empty effect matches all effects
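
For example, the following hypothetical Pod spec (not part of the original walkthrough) combines an empty key with the Exists operator and omits effect, so it tolerates every taint on any node:

apiVersion: v1
kind: Pod
metadata:
  name: tolerate-all
spec:
  containers:
  - name: nginx
    image: nginx
  tolerations:
  - operator: "Exists"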

Example walkthrough

  1. Add the taint disk=hdd:NoSchedule to node tx
  2. Try to deploy a Pod that does not tolerate the taint to the node
  3. Try to deploy a Pod that tolerates the taint to the node
  4. Add the taint disk=hdd:NoExecute to node tx
  5. Check the effect of NoExecute after the taint is added; the Pod should be evicted immediately
  6. Try to deploy a Pod that tolerates both taints to the node, with a 60-second NoExecute toleration
  7. Check the toleration effect; the Pod is evicted after 60 seconds

Step-1 Add the taint disk=hdd:NoSchedule to node tx

$ kubectl taint nodes tx disk=hdd:NoSchedule
node/tx tainted

Step-2 Try to deploy a Pod that does not tolerate the taint

We just added a taint to the node, so now try to create a Pod on it without any toleration, using the configuration file t-1.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: t-1
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: tx

This Pod stays in the Pending state. As kubectl describe pod shows, no node satisfied the scheduling criteria (see the check after the output below).

$ kubectl get pods
NAME   READY   STATUS    RESTARTS   AGE
t-1    0/1     Pending   0          15s
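
To see why, inspect the Pod's events. The exact FailedScheduling wording varies by Kubernetes version, so the output below is only indicative:

$ kubectl describe pod t-1
...
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/2 nodes are available: 1 node(s) had taint {disk: hdd}, that the pod didn't tolerate.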

Step-3 Try to deploy a Pod that tolerates the taint

Delete the Pod just created, then modify the configuration file so that the new Pod tolerates the taint:

apiVersion: v1
kind: Pod
metadata:
  name: t-1
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: tx
  tolerations:
  - key: "disk"
    operator: "Equal"
    value: "hdd"
    effect: "NoSchedule"
$ kubectl create -f t-1.yaml
pod/t-1 created

$ kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE
t-1    1/1     Running   0          11s

Step-4 Add the taint disk=hdd:NoExecute to node tx

Add the disk=hdd:NoExecute taint to the node:

$ kubectl taint nodes tx disk=hdd:NoExecute
node/tx tainted
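
The node now carries both taints, which can be confirmed with kubectl describe (column alignment may vary):

$ kubectl describe node tx | grep -A 1 Taints
Taints:             disk=hdd:NoExecute
                    disk=hdd:NoSchedule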

Step-5 Check the effect of NoExecute after the taint is added; the Pod should be evicted immediately

$ kubectl get pods
NAME   READY   STATUS        RESTARTS   AGE
t-1    1/1     Terminating   0          1m32s

Step-6 Try to deploy a Pod that tolerates both taints, with a 60-second NoExecute toleration

Now add tolerations to the Pod configuration file and try again (note that the tx node now carries two taints):

apiVersion: v1
kind: Pod
metadata:
  name: t-1
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: tx
  tolerations:
  - key: "disk"
    operator: "Equal"
    value: "hdd"
    effect: "NoExecute"
    tolerationSeconds: 60
  - key: "disk"
    operator: "Equal"
    value: "hdd"
    effect: "NoSchedule"
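
Recreate the Pod from the updated file; as in Step-3 it should be scheduled and start normally (the output below mirrors that step):

$ kubectl create -f t-1.yaml
pod/t-1 created

$ kubectl get pods
NAME   READY   STATUS    RESTARTS   AGE
t-1    1/1     Running   0          10s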

Step-7 Check the toleration effect; the Pod is evicted after 60 seconds

The Pod's full event list is shown below; at the 60-second mark, the Pod was evicted.

Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned default/t-1 to tx
  Normal  Pulling    60s        kubelet, tx        Pulling image "nginx"
  Normal  Pulled     51s        kubelet, tx        Successfully pulled image "nginx"
  Normal  Created    50s        kubelet, tx        Created container nginx
  Normal  Started    50s        kubelet, tx        Started container nginx
  Normal  Killing    2s         kubelet, tx        Stopping container nginx
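
A handy way to watch the eviction happen in real time, though not part of the original run, is kubectl's watch flag; after roughly 60 seconds the Pod flips from Running to Terminating:

$ kubectl get pods -w
NAME   READY   STATUS        RESTARTS   AGE
t-1    1/1     Running       0          45s
t-1    1/1     Terminating   0          61s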

Some problems encountered in the process

The whole process goes like this

  1. A taint was set on the node
  2. A Pod that tolerates disk=hdd:NoExecute was deployed to that node

However, the deployment failed with the following error:

Events:
  Type     Reason                  Age               From               Message
  ----     ------                  ----              ----               -------
  Normal   Scheduled               <unknown>         default-scheduler  Successfully assigned default/t-1 to tx
  Warning  FailedCreatePodSandBox  16s               kubelet, tx        Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "5a3a5babd3baeaa3c7fd7f55f807f2d26335d59c60bb0d6bd088d64ab091daa2" network for pod "tolerations-1": networkPlugin cni failed to set up pod "tolerations-1_default" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/5a3a5babd3baeaa3c7fd7f55f807f2d26335d59c60bb0d6bd088d64ab091daa2: dial tcp 127.0.0.1:6784: connect: connection refused, failed to clean up sandbox container "5a3a5babd3baeaa3c7fd7f55f807f2d26335d59c60bb0d6bd088d64ab091daa2" network for pod "tolerations-1": networkPlugin cni failed to teardown pod "tolerations-1_default" network: Delete http://127.0.0.1:6784/ip/5a3a5babd3baeaa3c7fd7f55f807f2d26335d59c60bb0d6bd088d64ab091daa2: dial tcp 127.0.0.1:6784: connect: connection refused]
  Normal   SandboxChanged          2s (x3 over 15s)  kubelet, tx        Pod sandbox changed, it will be killed and re-created.

We initially assumed that even though the master node had the Weave plug-in installed, the Weave service also needed to be deployed manually on the other nodes, and that is how the problem was worked around at first.

In fact, the Weave service does not need to be deployed manually on each node. The real cause was that once the NoExecute taint was set on the node, the Weave Pod running there was itself evicted, which broke the CNI network on that node.
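
A sketch of how to confirm and recover, assuming Weave runs as the usual weave-net DaemonSet in the kube-system namespace (names may differ in your cluster):

$ kubectl get pods -n kube-system -o wide | grep weave

$ kubectl taint nodes tx disk=hdd:NoExecute-
node/tx untainted

If the weave Pod for node tx is missing from the first listing, removing the NoExecute taint lets the DaemonSet controller recreate it, restoring networking on that node.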