This article belongs to the k8s monitoring series; the other articles are:
- K8s Monitoring (1) Install Prometheus
- K8s Monitoring (2) Monitor cluster components and pods
- K8s Monitoring (4) Monitor the host machine
This article was supposed to cover Grafana and Alertmanager, but they were so easy to deploy that there wasn’t much to write about. I recently studied the Prometheus Adapter and would like to share my findings; perhaps Grafana and Alertmanager will come in a later article.
Ok, let’s jump right into the main body.
The Kubernetes apiserver provides two APIs for monitoring-related operations:
- Resource Metrics API: designed to provide monitoring metrics for core Kubernetes components, for example kubectl top;
- Custom Metrics API: designed to provide metrics to the HPA controller.
The Kubernetes apiserver exposes Kubernetes functionality to other components through a REST API, but it only provides core functionality. What if the apiserver has to provide functionality that is useful but not core?
In that case, the Kubernetes apiserver provides the corresponding API, but it does not process requests that reach that API; instead it forwards them to an extension apiserver. Interestingly, this extension apiserver can be developed by anyone, as long as it follows the specification.
To use an extension apiserver, you only need to register it with the kube-aggregator (a function of the Kubernetes apiserver), and the aggregator will forward requests for that API to the extension apiserver. Of course, the interaction between the Kubernetes apiserver and the extension apiserver involves many details that I won’t cover here.
Examples of this are the Resource Metrics API and the Custom Metrics API: the Kubernetes apiserver defines them but does not provide the implementation.
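Registration with the kube-aggregator is done through an APIService object. As a rough sketch (the service name and namespace here are assumptions, not taken from the repo deployed later), registering the Custom Metrics API might look like this:

```yaml
# Hypothetical sketch of an APIService registration; the service
# name/namespace must match however the extension apiserver is deployed.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io    # API group being registered
  version: v1beta1                # API version
  service:                        # where the aggregator forwards requests
    name: prometheus-adapter
    namespace: monitoring
  insecureSkipTLSVerify: true     # don't verify the backend's certificate
  groupPriorityMinimum: 100
  versionPriority: 100
```

Once such an object exists, the aggregator proxies every request under /apis/custom.metrics.k8s.io/v1beta1/ to the named Service.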
The purpose of this article is to implement them through Prometheus Adapter.
API Group and API Version
Before we implement these two apis, let’s talk about API groups and API versions.
The so-called API group is what you see when you run kubectl api-versions. Each value is composed of an API group and an API version.
# kubectl api-versions
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
...
There are many of these; only five are listed here. Take the first, admissionregistration.k8s.io/v1beta1: admissionregistration.k8s.io is the API group, and v1beta1 is its version.
If the API group is empty, it is the core API, which provides the core Kubernetes resources. So how do you know which resources are provided by which API group?
Execute kubectl api-resources to find out:
# kubectl api-resources
NAME                SHORTNAMES   APIGROUP   NAMESPACED   KIND
bindings                                    true         Binding
componentstatuses   cs                      false        ComponentStatus
configmaps          cm                      true         ConfigMap
endpoints           ep                      true         Endpoints
events              ev                      true         Event
limitranges         limits                  true         LimitRange
namespaces          ns                      false        Namespace
nodes               no                      false        Node
...
It’s a long list, so only part of it is shown here. The output has five columns. The NAME column is the resource name, whose functionality is provided by the API group in the APIGROUP column. The SHORTNAMES column lists abbreviations for these resources, which are handy when using kubectl.
In all of the results listed above, APIGROUP is empty, indicating that these resources are provided by the core API. So when you see apiGroups with a value of “” in some Role or ClusterRole, you know it is granting access to resources provided by the core API.
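For example, a Role that grants read access to pods (a core-API resource) uses an empty string in apiGroups; the role name and namespace below are made up for illustration:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader        # hypothetical name
  namespace: default
rules:
- apiGroups: [""]         # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
```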
For the two APIs we will implement:
- the Resource Metrics API group is metrics.k8s.io, version v1beta1;
- the Custom Metrics API group is custom.metrics.k8s.io, version v1beta1.
Their API groups and API versions will be used later during registration.
custom metrics API
We’ll start with the Custom Metrics API, then the Resource Metrics API. Even if all you need is the Resource Metrics API, you should still read this section, because both APIs are implemented by the Prometheus Adapter.
The Custom Metrics API exists entirely for HPA v2, because HPA v1 can only use CPU as the metric for pod autoscaling, which clearly does not meet everyone’s needs.
There are several implementations of the Custom Metrics API; we chose the Prometheus Adapter because we already have Prometheus installed. With the Prometheus Adapter, any metric present in Prometheus can be used as a condition for HPA, which covers nearly every usage scenario.
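For context, a minimal HPA sketch using a custom metric might look like the following. The deployment name is hypothetical; fs_usage_bytes is simply the metric used in the test later in this article, and autoscaling/v2beta1 matches the Kubernetes 1.14 cluster used here:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                 # hypothetical deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods                 # a per-pod metric served by the adapter
    pods:
      metricName: fs_usage_bytes
      targetAverageValue: 100Mi
```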
In Kubernetes 1.14, the apiserver already has the aggregation layer enabled, so we only need to install the Prometheus Adapter.
We will deploy the Prometheus Adapter with a Deployment and, of course, bind various roles to the serviceAccount it runs as. Because it is itself an apiserver, it listens on a port to provide HTTPS service.
We also need to create a Service for it: when the Kubernetes apiserver forwards a request, it sends the request to this Service, which passes it on to the Prometheus Adapter behind it.
All the yml files for this article are stored on GitHub and are not listed in full here. The Prometheus Adapter files are in the adapter directory, and the Custom Metrics API uses the following files:
prometheus-adapter-apiServiceCustomMetrics.yml
prometheus-adapter-apiServiceMetrics.yml
prometheus-adapter-clusterRoleAggregatedMetricsReader.yml
prometheus-adapter-clusterRoleBindingDelegator.yml
prometheus-adapter-clusterRoleBinding.yml
prometheus-adapter-clusterRole.yml
prometheus-adapter-configMap.yml
prometheus-adapter-deployment.yml
prometheus-adapter-roleBindingAuthReader.yml
prometheus-adapter-serviceAccount.yml
prometheus-adapter-service.yml
Among them:
- apiServiceCustomMetrics: registers the API with group custom.metrics.k8s.io, version v1beta1;
- apiServiceMetrics: registers the API with group metrics.k8s.io, version v1beta1 (this one is for the Resource Metrics API);
- roleBindingAuthReader: allows the extension apiserver to read a configMap in order to verify the Kubernetes apiserver’s identity;
- clusterRoleAggregatedMetricsReader: grants all permissions on any resource under metrics.k8s.io and custom.metrics.k8s.io;
- clusterRoleBindingDelegator: binds the permission that lets the Prometheus Adapter submit SubjectAccessReview requests to the Kubernetes apiserver;
- clusterRole: lets the Prometheus Adapter access resources in the cluster;
- configMap: the Prometheus Adapter configuration file, which is one of the highlights;
- service: the Kubernetes apiserver sends requests to this Service, which forwards them to the Prometheus Adapter pod.
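As a rough sketch of the Service part (the port numbers and labels here are assumptions; check the actual file in the repo):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  selector:
    app: prometheus-adapter    # must match the deployment's pod labels
  ports:
  - port: 443                  # the aggregator always connects over HTTPS
    targetPort: 6443           # whatever port the adapter listens on
```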
Deployment and testing
Many people use the Kubernetes CA to sign a certificate and then use that certificate to serve HTTPS from the Prometheus Adapter. This is not necessary: when the Prometheus Adapter starts, if you do not provide it with a certificate, it generates a self-signed one to serve HTTPS.
Whether the certificate is trusted doesn’t matter, because the prometheus-adapter-apiServiceCustomMetrics.yml file contains the option insecureSkipTLSVerify: true.
Clone and deploy:
git clone https://github.com/maxadd/k8s-prometheus
kubectl apply -f k8s-prometheus/adapter
Test the deployment by accessing the API directly (you may need to wait a little while):
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | python -mjson.tool
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitoring/pods/*/fs_usage_bytes" | python -mjson.tool
The first command outputs a large number of metrics that can be used for HPA. If you get none, there is a deployment problem; the reasons are explained later.
The second command prints the fs_usage_bytes metric for all pods in the monitoring namespace. If you get nothing, there is also a deployment problem.
I made some changes to the metrics and labels in my earlier Prometheus installation, so if you use the official configuration directly, you will definitely run into problems. The configuration is covered below.
In addition, the -v flag in the Prometheus Adapter’s startup parameters controls the log verbosity; the higher the value, the more detailed the output. In my case, though, it didn’t seem to help much.
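For reference, the relevant startup flags look roughly like this deployment excerpt (the image tag and flag values are illustrative, not taken from the repo):

```yaml
containers:
- name: prometheus-adapter
  image: directxman12/k8s-prometheus-adapter:v0.5.0  # illustrative tag
  args:
  - --prometheus-url=http://prometheus:9090/   # where the adapter queries metrics
  - --config=/etc/adapter/config.yaml          # the adapter's rules file
  - --v=6                                      # higher values = more verbose logs
```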
The configuration file
The format of the Prometheus Adapter configuration file is shown below (part of it is truncated because it is too long). It is divided into two parts: the first is rules, for custom metrics; the other is resourceRules, for resource metrics. If you use the Prometheus Adapter only for HPA, resourceRules can be omitted, and vice versa.
Let’s start with the rules section, which contains many queries. The purpose of these queries is to expose as many metrics as possible so they can be used for HPA.
That is, with the Prometheus Adapter you can use any metric from Prometheus for HPA, but only if it can be obtained from a query (both the metric name and its corresponding value). In other words, if you only need one metric for HPA, you can write just one query instead of the many shown below.
rules:
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod!=""}'
  seriesFilters: []
  resources:
    overrides:
      namespace:
        resource: namespace
      pod:          # the official example uses pod_name; mine is pod because I renamed pod_name earlier
        resource: pods
  name:
    matches: ^container_(.*)_seconds_total$
    as: ""
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
---
resourceRules:
  cpu:
    containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
    nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
    resources:
      overrides:
        instance:
          resource: nodes
        namespace:
          resource: namespace
        pod:
          resource: pods
    containerLabel: container_name
Next, let’s explain its keywords:
- seriesQuery: a Prometheus query; all the metrics it returns are available for HPA;
- seriesFilters: filters out queried metrics that are not needed. Filters come in two forms: is: <regex> keeps only the metric names matched by the regular expression, while isNot: <regex> excludes them;
- resources: to query the metrics of a pod, its name and namespace must appear as labels on the metric. resources associates metric labels with Kubernetes resource types, most commonly pod and namespace. There are two ways to do this: overrides and template.
- overrides: associates a label in the metric with a Kubernetes resource. The resource name should be the plural form you would use with kubectl, e.g. pods. Because pod and namespace belong to the core API group, there is no need to specify a group. When you query a pod’s metric, the pod name and namespace are then automatically added as labels to the query conditions. For example, microservice: {group: "apps", resource: "deployment"} associates the microservice label in the metric with the deployment resource in the apps API group;
- template: uses a Go template. With template: "kube_<<.Group>>_<<.Resource>>", if <<.Group>> is apps and <<.Resource>> is deployment, then the kube_apps_deployment label in the metric is associated with the deployment resource;
- name: renames metrics. The reason for renaming is that some metrics only ever increase, such as those ending in _total, and using them directly for HPA is meaningless. Usually we compute their rate and use that as the value, so the name should no longer end in _total, hence the rename.
- matches: matches metric names using a regular expression;
- as: defaults to $1, i.e. the first capture group; an empty as means the default is used;
- metricsQuery: the actual Prometheus query. The earlier seriesQuery only discovers which metrics are available for HPA; when the value of a metric is needed, this query statement is used. As you can see, it uses rate and grouping to deal with the counter-only metrics mentioned above, and it uses templates:
- Series: the metric name;
- LabelMatchers: additional label matchers, currently only the pod and namespace labels, which is why the resources association was needed earlier;
- GroupBy: the pod-name label, also associated through resources.
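Putting the keywords together, here is a sketch of a single rule for a hypothetical counter metric http_requests_total, renamed and rated so it can serve as an HPA metric:

```yaml
rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'  # hypothetical metric
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod:       {resource: "pods"}
  name:
    matches: ^(.*)_total$
    as: "${1}_per_second"   # exposed as http_requests_per_second
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
```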
All the metrics you saw earlier at /apis/custom.metrics.k8s.io/v1beta1/ were discovered by the seriesQuery statements in these rules. Of course, their names may not exactly match those in Prometheus, because name can rename them.
Many metrics are unnecessary for HPA, such as Prometheus’s own metrics and those of the k8s components, but the Prometheus Adapter tries to expose them all so you can use whatever you want. That’s why it has so many seriesQuery entries.
Accessing /apis/custom.metrics.k8s.io/v1beta1/namespaces/monitoring/pods/*/fs_usage_bytes runs the metricsQuery to get the metric value for each pod.
Most deployment problems are caused by incorrect associations in the configuration file; only by understanding what the configuration means can you ensure the deployment works.
The remaining resourceRules section is for the Resource Metrics API. It has only cpu and memory entries, each with a node query and a pod (container) query. These queries are executed when you run kubectl top pods/nodes.
So much for the Custom Metrics API; HPA itself is not covered here.
resource metrics API
The Resource Metrics API is officially intended for monitoring the core k8s components, but it only provides CPU and memory metrics for pods and nodes.
Officially it can be used for the following:
- HPA: the CPU metric can be used for HPA. HPA v1 may depend on this, but that no longer matters;
- pod scheduling: officially an extension point, because current pod scheduling doesn’t take node usage into account;
- cluster federation: same resource-usage idea, but not currently used;
- dashboard: drawing graphs; I haven’t used the dashboard, so I don’t know whether it works;
- kubectl top: this is the most useful feature.
So in summary, the biggest benefit of the Resource Metrics API is that the kubectl top command actually works. Useful or not, one of our goals is to deploy an extension apiserver to implement it; the next step is choosing that extension apiserver.
Many people use the metrics-server to provide the Resource Metrics API and the Prometheus Adapter to provide the Custom Metrics API. However, the Prometheus Adapter supports both APIs, so we don’t need the metrics-server at all; a single Prometheus Adapter deployment is enough.
We’ve already deployed it; we just need to verify it.
# kubectl -n monitoring top pods
NAME CPU(cores) MEMORY(bytes)
alertmanager-c8d754fbc-2slzr 1m 15Mi
grafana-74bf6c49f6-lf7vw 7m 50Mi
kube-state-metrics-856448748c-xtdxl 0m 24Mi
prometheus-adapter-548c9b9c4c-mr9zq 0m 39Mi
kubectl top node does not work in my setup because I deleted all the metrics with id='/'. If you don’t want to restore those metrics, see my next article on collecting host metrics.
That’s all for this article. Thanks for reading!