• Background: why alert on events
  • Introduction to Alibaba's kube-eventer and the notification sinks it supports
    • Practice: connecting a Kubernetes 1.16.3 cluster to a DingTalk robot

Background

Monitoring is an important part of ensuring system stability. In the Kubernetes open source ecosystem, there is a rich variety of resource monitoring and component monitoring tools.

  • cAdvisor: built into the kubelet, it monitors container resource usage such as CPU and memory;
  • kube-state-metrics: listens to the API server and generates state metrics for resource objects, focusing on metadata such as Deployment and Pod replica status;
  • metrics-server: a cluster-wide resource data aggregator and the replacement for Heapster; the Kubernetes HPA component takes its data from it.

Together with node-exporter and other official and unofficial exporters, Prometheus can collect this data and store, alert on, and visualize it. But that is not enough.

Insufficient real-time resolution and accuracy: most resource monitoring collects data in push or pull mode, so data is sampled at fixed intervals. If a spike or anomaly occurs within an interval and the system has recovered by the time the next sample arrives, most collection systems simply swallow the anomaly. For spike scenarios, interval sampling automatically smooths the peaks, reducing accuracy.

Insufficient coverage of monitoring scenarios: some scenarios cannot be expressed in terms of resources. For example, the start and stop of a Pod cannot be measured simply by resource utilization, because when resource usage is zero we cannot distinguish the real cause of that state.

How does Kubernetes solve these two problems?

Event monitoring

In Kubernetes, there are two types of events. One is a Warning event, indicating that the state transition that generated the event was to an unexpected state. The other is a Normal event, indicating that the current state is the desired one.

Take the life cycle of a Pod as an example. When a Pod is created, it first enters the Pending state, waiting for its image to be pulled. When the image has been pulled and the Pod passes its health check, it enters the Running state and a Normal event is generated. If the Pod crashes at runtime due to OOM or some other reason and enters the Failed state, which is not expected, Kubernetes generates a Warning event. For this scenario, if we can alert on events, we can promptly detect problems that resource monitoring easily overlooks.
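Both event types can be inspected directly with kubectl (a quick sketch; run against any cluster):

```shell
# Stream cluster events as they occur; Normal and Warning types are both shown
kubectl get events --watch

# Show only the unexpected (Warning) transitions, e.g. OOM kills or failed probes
kubectl get events --field-selector type=Warning
```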

A standard Kubernetes event has several important properties that help diagnose problems and generate alerts.

  • Namespace: the namespace of the object that generated the event.
  • Kind: the type of object the event is bound to, such as Node, Pod, Namespace, Component, etc.
  • Timestamp: the time at which the event occurred.
  • Reason: the reason why the event happened.
  • Message: a description of the event.

```
[root@master work]# kubectl get event --all-namespaces
NAMESPACE   LAST SEEN   TYPE      REASON    OBJECT                              MESSAGE
default     14m         Normal    Created   pod/busybox2                        Created container busybox
default     14m         Normal    Started   pod/busybox2                        Started container busybox
default     19m         Normal    Pulling   pod/litemall-all-584bfdcd99-q6wd2   Pulling image "litemall-all:2019-12-18-13-13-26"
default     24m         Warning   Failed    pod/litemall-all-584bfdcd99-q6wd2   Error: ErrImagePull
default     14m         Normal    BackOff   pod/litemall-all-584bfdcd99-q6wd2   Back-off pulling image "litemall-all:2019-12-18-13-13-26"
default     4m47s       Warning   Failed    pod/litemall-all-584bfdcd99-q6wd2   Error: ImagePullBackOff
```

In kubectl for Kubernetes 1.9.2 the event output had a NAME column; in newer versions that column has disappeared, which makes it inconvenient to view more detailed event information. I do not know whether this is a Kubernetes bug.
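As a workaround, the full event objects can still be inspected; a couple of sketches using standard kubectl output options:

```shell
# Bring the event name back via custom columns
kubectl get events --all-namespaces \
  -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,REASON:.reason,MESSAGE:.message

# Or dump the events of a namespace in full as YAML for all details
kubectl get events -n default -o yaml
```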

Alibaba's kube-eventer

For the Kubernetes event monitoring scenario, the Kubernetes community provided a simple event offline (export) capability in Heapster, which was later archived when Heapster was deprecated. To make up for the resulting gap in event monitoring, Alibaba Cloud Container Service released and open-sourced the Kubernetes event export tool kube-eventer. It supports exporting Kubernetes events to DingTalk robots, the Alibaba Cloud SLS logging service, the open source Kafka message queue, the InfluxDB time-series database, and more.

Project address

Alibaba Cloud Container Service kube-eventer

The following notification sinks are supported:

| Sink name | Description |
| --- | --- |
| dingtalk | DingTalk robot |
| sls | Alibaba Cloud SLS log service |
| elasticsearch | Elasticsearch service |
| honeycomb | Honeycomb service |
| influxdb | InfluxDB database |
| kafka | Kafka message queue |
| mysql | MySQL database |
| wechat | WeChat |

Practice

Next, let's connect kube-eventer to DingTalk alerts in practice.

1. First, you need a DingTalk group in which you have management permissions. Click "Smart Group Assistant".
2. Add a robot.
3. Choose "Custom" (connect a custom service via webhook).
4. Fill in the robot name, security settings, etc. Three security settings are currently available:

(1) Custom keywords: up to 10 keywords can be set, and a message is delivered only if it contains at least one of them. For example, if the custom keyword cluster1 is added, a message sent by the robot must contain the word cluster1 to be delivered successfully.
(2) Signing: take timestamp + "\n" + secret as the string to sign, compute an HmacSHA256 signature over it, Base64-encode the result, and finally urlEncode the signature parameter to get the final signature (the UTF-8 character set is required).
(3) IP address (range): once set, only requests originating from the configured IP range are processed. Both individual IP addresses and IP address ranges are supported; IPv6 address whitelists are not supported.
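The signature in option (2) can be sketched in a few lines of Python; the secret value below is a hypothetical placeholder, not a real token:

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def dingtalk_sign(timestamp_ms: int, secret: str) -> str:
    """Compute the DingTalk robot signature: HmacSHA256 over
    "timestamp\\nsecret", Base64-encoded, then URL-encoded (UTF-8)."""
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return urllib.parse.quote_plus(base64.b64encode(digest))


# The signed webhook URL then carries &timestamp=...&sign=...
timestamp_ms = int(time.time() * 1000)
print(dingtalk_sign(timestamp_ms, "SEC-hypothetical-secret"))
```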

I used the custom keyword method above, with the keyword set to cluster1; you will need this keyword again when creating the YAML file.

For more detailed examples, refer to the DingTalk custom robot API documentation.

5. Copy the webhook URL.
6. Save the following YAML as a `kube-event.yaml` file. Change the `--sink` startup parameter to the webhook address you just copied, set `label` to the custom keyword `cluster1`, and set `level` to `Warning` so that only Warning-level events trigger alerts.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: kube-eventer
  name: kube-eventer
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-eventer
  template:
    metadata:
      labels:
        app: kube-eventer
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      serviceAccount: kube-eventer
      containers:
        - image: registry.aliyuncs.com/acs/kube-eventer-amd64:v1.1.0-63e7f98-aliyun
          name: kube-eventer
          command:
            - "/kube-eventer"
            - "--source=kubernetes:https://kubernetes.default"
            ## e.g. dingtalk sink demo
            - --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=81kanbcl18sjambp9cb31o0k1jalh189asnxmafbf70933cb42978abd19d8fff7&label=cluster1&level=Warning
          env:
            # If TZ is assigned, set the TZ value as the time zone
            - name: TZ
              value: Asia/Shanghai
          volumeMounts:
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
            - name: zoneinfo
              mountPath: /usr/share/zoneinfo
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 500m
              memory: 250Mi
      volumes:
        - name: localtime
          hostPath:
            path: /etc/localtime
        - name: zoneinfo
          hostPath:
            path: /usr/share/zoneinfo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-eventer
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-eventer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-eventer
subjects:
  - kind: ServiceAccount
    name: kube-eventer
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-eventer
  namespace: kube-system
```

7. `kubectl apply -f kube-event.yaml`
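To verify the deployment, a sketch (the deliberately unpullable image name is a hypothetical example used only to trigger a Warning event):

```shell
# Check that the kube-eventer pod is running and inspect its logs
kubectl -n kube-system get pods -l app=kube-eventer
kubectl -n kube-system logs -l app=kube-eventer --tail=20

# Trigger a Warning event on purpose with an image that cannot be pulled;
# the resulting ErrImagePull / ImagePullBackOff events should reach the DingTalk group
kubectl run event-test --image=does-not-exist:v0 --restart=Never
kubectl get events --field-selector type=Warning

# Clean up the test pod afterwards
kubectl delete pod event-test
```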

When a Pod in a Kubernetes cluster restarts because of OOM, image pull errors, or failed health checks, the cluster administrator is often unaware of it, because Kubernetes has a self-healing mechanism: when a Pod crashes, a new one is started. With event alerts, the cluster administrator can discover and fix service faults in a timely manner.

WeChat practice

Alibaba Cloud SLS practice

Elasticsearch practice

Honeycomb practice

InfluxDB practice

MySQL practice

Kafka practice

Reference

The Kubernetes event offline tool kube-eventer is open sourced

About the author

Author: Xiaowantang, a passionate and serious writer who currently maintains the original WeChat public account "My Xiaowantang", focusing on Go, Docker, Kubernetes, Java and other development and operations topics to strengthen your hard skills. Looking forward to your follow. Note: when reposting, be sure to credit the source (from the public account "My Xiaowantang", author: Xiaowantang).