Recently, I have been studying how to add appropriate metrics to applications in order to analyze usage and help with debugging. The overall idea is to use Prometheus to collect data and Grafana to display it. Along the way I found the Node Exporter project, which I realized could be used directly to monitor the Linux machines I use every day, and that is how this article came about.
The whole system uses three components:
- Node Exporter: an agent that runs on a host and collects operating system data
- Prometheus: an open source time series database that acts as the center for data storage and analysis
- Grafana: a data presentation and analysis interface that provides powerful dashboards and can read data from multiple sources, including Prometheus
NOTE: All services are started with Docker; Docker and Docker Compose need to be installed, and you should be familiar with their basic use.
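A quick way to confirm both tools are available:

```bash
docker --version
docker-compose --version
```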
Install and configure Prometheus
Prometheus was originally developed by SoundCloud as a time series database for metrics collection and storage. It is now a CNCF project and the metrics collection and monitoring tool officially recommended by Kubernetes.
First, we create a docker-compose.yml file that contains only one service, Prometheus:
```yaml
version: '2'
services:
  prometheus:
    image: prom/prometheus:v2.0.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
```
If you are familiar with Docker and Docker Compose, this file is easy to understand:
- image specifies the Prometheus container image to use, here v2.0.0
- volumes mounts the prometheus.yml file in the current directory into the container. This file will be modified repeatedly later, so a volume mount is more flexible than baking it into the image
- command specifies the command-line arguments Prometheus runs with. Here only the location of the configuration file is given; more options can be seen with --help
- ports maps the port the Prometheus service listens on to the host, so it can be reached directly through the host port

A quick way to sanity-check the finished file is shown below.
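docker-compose config parses the file and prints the resolved configuration, or an error if something is malformed:

```bash
docker-compose config
```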
Of particular importance is the prometheus.yml file, which reads as follows:
```yaml
global:
  scrape_interval: 5s
  external_labels:
    monitor: 'my-monitor'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
```
The configuration file is divided into two parts: global and scrape_configs. The former is global configuration that takes effect wherever a specific job does not override it. There are two items here. scrape_interval is how often the Prometheus server scrapes its targets: if it is too frequent, the load on Prometheus becomes high; if it is too long, important data points may be missed. Choose a value based on the importance of each job and the size of the cluster. external_labels attaches extra labels (here a monitor label identifying this server) to the data this Prometheus hands to external systems.
scrape_configs configures the scrape jobs and is therefore a list. Here we have only one job, which scrapes the metrics of Prometheus itself. targets is the HTTP address of the target to scrape; by default the /metrics path is used, so in this case http://localhost:9090/metrics. This is monitoring data provided by Prometheus itself, and you can view it directly in a browser.
Each data point has a name and a series of key-value pairs called labels, and when Prometheus scrapes the data it automatically adds instance (the host:port identifier of the target) and job (the job name) as labels to identify where the data came from.
The data itself carries no time information; Prometheus attaches a timestamp at scrape time. In addition, metrics are divided into four types on the client side: counter, gauge, histogram, and summary. See the official documentation on Prometheus metric types for more information.
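For a sense of what Prometheus actually scrapes, here is an illustrative sketch of a few lines of /metrics output (values are made up; the instance and job labels mentioned above are attached by Prometheus at scrape time, so they do not appear in the raw output):

```
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 12.47
```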
Run docker-compose up -d to start the service, then open http://localhost:9090 in a browser to reach the Prometheus web UI.
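If you want to check from the command line before opening the browser, something like this works (assuming the commands are run from the directory containing docker-compose.yml):

```bash
# List the running services defined in docker-compose.yml
docker-compose ps

# Fetch the first few lines of the metrics Prometheus exposes about itself
curl -s http://localhost:9090/metrics | head
```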
The working principle of the whole Prometheus system is as follows:
- The Prometheus server sits in the center and is responsible for collecting, storing, and querying time series data
- On the left are the data sources. Prometheus pulls data over HTTP from any compatible endpoint, and the sources fall into three categories:
  - Exporters written specifically for a component and exposed to Prometheus, often called jobs or exporters, such as Node Exporter and the exporters for HAProxy, Nginx, MySQL, and other services
  - The Pushgateway, which turns pushed data into something Prometheus can pull
  - Other Prometheus servers
- At the top is the automatic target discovery mechanism. In many production setups, manually configuring every metrics source is cumbersome, so Prometheus supports DNS, Kubernetes, Consul, and other service discovery mechanisms to dynamically obtain the targets to scrape
- On the lower right is the data output, generally used for UI presentation. You can use an open source solution such as Grafana, or query Prometheus's HTTP API from your own tooling
- Alerting works through alert rules you configure: once monitoring data matches a rule, Prometheus sends the alert to the Alertmanager, which notifies you by email or other channels
Use Node Exporter to collect monitoring data
While monitoring the Prometheus service itself is interesting, our goal is to monitor Linux hosts. Because Prometheus can only pull monitoring data from an HTTP endpoint, we need a tool that exposes the system data provided by Linux as an HTTP service. Fortunately, the official Prometheus community maintains many exporters, each responsible for exporting the monitoring data of one component or system in a format Prometheus understands. Node Exporter is the one that exports monitoring data for Unix/Linux systems.
Node Exporter reads Linux system data mostly from /proc. Modify docker-compose.yml and add the node-exporter service, which listens on port 9100 by default:
```yaml
version: '2'
services:
  prometheus:
    image: prom/prometheus:v2.0.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
  node-exporter:
    image: prom/node-exporter:v0.15.2
    ports:
      - '9100:9100'
```
For Prometheus to collect Node Exporter's data, we need to add a separate job to the configuration file:
```yaml
global:
  scrape_interval: 5s
  external_labels:
    monitor: 'my-monitor'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node resources'
    scrape_interval: 10s
    static_configs:
      - targets:
          - 'node-exporter:9100'
```
Because Docker Compose puts the services on a shared network where service names resolve to container IP addresses, node-exporter:9100 can be used directly as the target address.
Run docker-compose up -d again and you should see node-exporter appear in Prometheus's list of targets.
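As an additional quick check, the metrics that Node Exporter exposes can be fetched directly from the published port (the grep pattern assumes the node_load metric names used later in this article):

```bash
# The load-average metrics should appear in the output
curl -s http://localhost:9100/metrics | grep '^node_load'
```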
Install and configure Grafana
Prometheus provides a web interface for querying configuration information, checking whether a target is healthy, running queries against a metric, and drawing simple graphs. However, its graphing capabilities are limited, and when you have a lot of data to display at once you need a more powerful dashboard tool, such as Grafana, introduced here.
Grafana is a powerful dashboard tool with a polished interface, rich functionality, and flexible configuration.
Again, add a grafana service to docker-compose.yml:
```yaml
version: '2'
services:
  prometheus:
    image: prom/prometheus:v2.0.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
  node-exporter:
    image: prom/node-exporter:v0.15.2
    ports:
      - '9100:9100'
  grafana:
    image: grafana/grafana
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=pass
    depends_on:
      - prometheus
    ports:
      - '3000:3000'
volumes:
  grafana_data: {}
  prometheus_data: {}
```
Docker volumes are used to persist the data that Grafana and Prometheus generate while running, and the environment variable GF_SECURITY_ADMIN_PASSWORD=pass sets the admin password.
After running docker-compose up -d, Grafana is available at http://host-ip:3000. Open that address in a browser and the login page appears; log in with the user name admin and the password just configured.
Grafana itself is just a dashboard: it pulls data from one or more data sources (time series databases) and presents it, in our case from Prometheus. So before any real configuration we need to add a data source. Click the button in the upper left corner of Grafana to find the Data Sources page, or go directly to http://host-ip:3000/datasources. Choose Prometheus as the Type, and for the URL enter the Prometheus address that the Grafana service can reach (because both run through the same docker-compose file, the service name can be used directly, i.e. http://prometheus:9090). Fill in the Name field with a name that identifies this source.
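If you prefer scripting this step instead of clicking through the UI, Grafana also has an HTTP API for managing data sources; a minimal sketch, assuming the admin password set above and the service name from docker-compose.yml:

```bash
# Create the Prometheus data source through Grafana's API
curl -s -u admin:pass -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{"name": "prometheus", "type": "prometheus", "url": "http://prometheus:9090", "access": "proxy"}'
```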
Then create a dashboard (I'm using the name Test Dashboard for simplicity) and add a graph panel to it, which we will use to display the system load. Edit the panel's data, select Prometheus as the data source, and fill in the query. The system load metrics are simple: node_load1, node_load5, and node_load15 are the load averages over the last one, five, and fifteen minutes respectively. When you finish typing, click outside the input box and Grafana automatically updates the chart above:
Similarly, you can add other panels to display monitoring data for other aspects of the system, such as CPU, memory, I/O, and network. For more Grafana options, refer to the official documentation and choose the charts that best present the results you want.
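As a starting point, here are a couple of example queries for such panels. These are only sketches: the metric names match node-exporter v0.15.x as used in this article, and newer releases renamed them (for example node_cpu became node_cpu_seconds_total), so adjust accordingly:

```
# Approximate CPU usage in percent, per instance
100 - avg by (instance) (rate(node_cpu{mode="idle"}[5m])) * 100

# Available memory in bytes (later versions call this node_memory_MemAvailable_bytes)
node_memory_MemAvailable
```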
Building powerful charts by hand through the Grafana interface is flexible, but it is time-consuming (imagine having to redo it every time you set Grafana up), and since Node Exporter's monitoring data is so common, having everyone create the same dashboards manually would be duplicated effort. For that reason, Grafana supports importing and exporting dashboard configurations and provides an official community site for sharing them.
Each shared dashboard has a number, and dashboard number 22 is a set of charts designed specifically for Node Exporter. Click Import dashboard in Grafana, enter the number, and select the data source to get a fully configured set of charts:
If the imported dashboard doesn't quite fit your needs, you can edit it directly on the page and then export the JSON file for reuse.
Configure alerting
Through the Grafana charts we can see how the various system metrics change over time, which makes it easy to judge whether some resource is behaving abnormally. But we can't stare at dashboards all day; we also need to be notified immediately, by email or some other channel, when something goes wrong. That is the job of alerting.
Prometheus supports user-defined alert rules. If a rule is triggered while metrics are being processed, users are notified by email or other means.
For our single-node host we can define two simple alert rules: fire an alert when the host is down, or when the CPU load has been too high for a period of time. Create an alert.rules file to hold the rules:
```
➜ monitor git:(master) cat alert.rules
groups:
  - name: node-alert
    rules:
      - alert: service_down
        expr: up == 0
        for: 2m

      - alert: high_load
        expr: node_load1 > 0.5
        for: 5m
```
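The two rules above are deliberately minimal. Prometheus 2.x rules can also carry labels and annotations, which notification tooling can use; a sketch of the high_load rule with that extra metadata added:

```yaml
groups:
  - name: node-alert
    rules:
      - alert: high_load
        expr: node_load1 > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High load on {{ $labels.instance }}"
          description: "node_load1 has stayed above 0.5 for 5 minutes (current value: {{ $value }})"
```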
When starting the Prometheus service, mount the alert rules file into the container by adding another volume:
```
➜ monitor git:(master) cat docker-compose.yml
version: '2'
services:
  prometheus:
    image: prom/prometheus:v2.0.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules:/etc/prometheus/alert.rules
......
```
Then tell Prometheus to load these rules:
```
➜ monitor git:(master) cat prometheus.yml
......
rule_files:
  - 'alert.rules'
```
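After changing the compose file and the configuration, recreate the Prometheus container so it picks everything up; the official image also ships promtool, which can sanity-check the rule file (a sketch, assuming the service name and mount paths used above):

```bash
# Recreate the prometheus service with the new volume and configuration
docker-compose up -d prometheus

# Optional: validate the rule file inside the running container
docker-compose exec prometheus promtool check rules /etc/prometheus/alert.rules
```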
To check whether the alert rules take effect, you can stop the node-exporter service:
```bash
docker-compose stop node-exporter
```
After a couple of minutes the alert shows up on the Alerts page of Prometheus.
Similarly, you can run something CPU-hungry to trigger the system load alert, for example the following Docker container (a busy while loop easily drives the CPU load up):
```bash
docker run --rm -it busybox sh -c "while true; do :; done"
```
Prometheus also provides the Alertmanager, which takes the alerts fired by these rules and notifies users and administrators through a variety of channels.
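Setting that up is beyond the scope of this article, but the overall shape of the configuration is roughly as follows (a sketch, not a tested setup): run the official Alertmanager image next to the other services and point Prometheus at it.

```yaml
# docker-compose.yml (sketch): add an alertmanager service
  alertmanager:
    image: prom/alertmanager
    ports:
      - '9093:9093'
```

```yaml
# prometheus.yml (sketch): tell Prometheus where to send fired alerts
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```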
Conclusion
The full configuration files used in this article are available on GitHub as a reference.
It is important to note that this is only a local test environment and cannot be used directly in production. First, we did not configure secure access: all the services speak plain HTTP. Second, docker-compose runs everything on a single machine, so there is no distributed monitoring and no high availability.
If you want to use Prometheus and Grafana in production, please refer to the official documentation.
Resources
- Installation overview of node_exporter, prometheus and grafana
- Monitoring with Prometheus, Grafana & Docker Part 1
- How To Add a Prometheus Dashboard to Grafana
- Monitoring linux stats with Prometheus.io
- a monitoring solution for docker and containerized services
- How To Query Prometheus on Ubuntu 14.04