About the author

A technical expert at Walk-tech

Problems faced by traditional monitoring systems

What problems will traditional monitoring systems face?

Zabbix, for example

 

Getting started requires a lot of configuration, and as servers and services grow, traditional monitoring systems like Zabbix face many problems:

  1. Database performance bottleneck. Zabbix stores the collected performance metrics in a database; when the number of servers and services grows rapidly, database performance becomes the first bottleneck.

  2. Multiple deployments and high management cost. When database performance becomes a bottleneck, the first solution may be to deploy multiple Zabbix instances, but this brings high management and maintenance costs.

  3. Poor ease of use. Zabbix configuration and management are complex and difficult to master.

  4. Mail storms. Mail alerting rules are complex to configure, and a careless configuration may cause mail storms.

With the development of container technology, traditional monitoring systems face even more problems:

  1. How is the container monitored?

  2. How are microservices monitored?

  3. How to analyze and calculate cluster performance?

  4. How do I manage a large number of configuration scripts on the Agent?

We can see that traditional monitoring systems cannot meet the monitoring needs of today's IT environments.

Predecessor of Prometheus: Borgmon

In 2015, Google published a paper entitled “Large-scale cluster management at Google with Borg”.

The paper also describes the scale and challenges of Google's clusters:

  1. Tens of thousands of servers in a cluster

  2. Thousands of different applications

  3. Hundreds of thousands of jobs, dynamically increasing or decreasing

  4. Hundreds of clusters per data center

At this scale, Google's monitoring system also faces significant challenges, and the Borgmon monitoring system in Borg was designed to address them.

Borgmon introduction

So let's take a look at how Google builds a monitoring system for its large clusters.

Application instrumentation

First, every application running in the Borg cluster needs to expose a specific URL, http://<instance>:80/varz, through which all of the monitoring metrics exposed by the application can be obtained.
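
Such an endpoint simply returns metric names and values as plain text. The hypothetical example below uses the Prometheus text exposition format for illustration; the metric names are made up, and Borgmon's internal varz format is not identical.

    # Hypothetical /varz-style output (Prometheus text exposition format, for illustration only)
    http_requests_total{code="200", handler="/api"} 1027
    http_requests_total{code="500", handler="/api"} 3
    process_cpu_seconds_total 12.47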

Service discovery

However, there are a huge number of such applications, and they may be added or removed dynamically, so how does Borgmon find them? When applications in Borg start up, they automatically register with Borg's internal name service (BNS). Borgmon reads the application list from BNS to discover which application services need to be monitored, and once the list is retrieved, it pulls all of the monitoring variable values for those applications into the Borgmon system.
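
Prometheus offers analogous service-discovery mechanisms in its configuration. The sketch below is a hypothetical Prometheus analog of this pattern, discovering scrape targets from DNS SRV records instead of a static target list (BNS itself is Google-internal; the job name and record name are placeholders):

    scrape_configs:
      - job_name: 'webserver'                          # hypothetical job name
        dns_sd_configs:
          - names: ['_webserver._tcp.example.com']     # hypothetical SRV record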

Collecting and aggregating metrics

Once monitoring metrics are collected in Borgmon, they can be displayed or used for alerting. In addition, because a cluster is so large, a single Borgmon may not be able to meet the monitoring, collection, and presentation needs of the entire cluster. Therefore, a data center may deploy multiple Borgmon instances, divided into a data-collection layer and an aggregation layer: the data-collection layer has multiple Borgmon instances dedicated to collecting data from applications, and the aggregation-layer Borgmon pulls data from the collection-layer Borgmon instances.
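
Prometheus supports the same two-tier layout through federation: an upper-level server scrapes selected series from lower-level servers via their /federate endpoint. A minimal sketch of such an aggregation-layer configuration is shown below (the hostnames are placeholders):

    scrape_configs:
      - job_name: 'federate'
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job=~".+"}'        # pull every job's series from the collection layer
        static_configs:
          - targets: ['collector-prometheus-1:9090', 'collector-prometheus-2:9090']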

Metric data storage

After Borgmon collects performance data, it stores all of the data in an in-memory database, checkpoints it to disk at regular intervals, and periodically archives it to an external system (a TSDB). Typically, at least 12 hours of data is kept in the data-center and global Borgmon instances for rendering charts. Each data point takes up about 24 bytes of memory, so only about 17 GB of memory is needed to store 1 million time series, with one data point per minute per series over 12 hours.
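
As a quick check of that figure: 1,000,000 series × 720 points per series (12 hours × 60 points per hour) × 24 bytes per point ≈ 17.3 GB.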

Metrics

Metric queries

In Borgmon, metrics are queried using labels. By filtering on labels, you can query a specific metric of a single application, or aggregate information at a higher dimension.

For example, based on a set of label filters, we can find the http_requests metric for the application instance host0:80.

 

We can also query the http_requests metric for all web servers in the western United States.

 

This returns the http_requests metric for all matching instances; a PromQL sketch of both queries is shown below.
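
Borgmon's own query language is internal to Google, but the same label filters can be written in PromQL. In this hypothetical sketch, the job and zone label names and values are assumptions, not taken from the original examples:

    # http_requests for the single instance host0:80
    http_requests{instance="host0:80"}

    # http_requests for every webserver instance in the western United States
    http_requests{job="webserver", zone=~"us-west.*"}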

Rule computation

On top of data collection and storage, we can derive further data through rule computation.

For example, we may want to alert on a web server when its error rate exceeds a certain percentage, that is, when the proportion of requests whose return code is not 200 exceeds a certain value; a Prometheus-style sketch of such a rule is shown below.
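
Borgmon's rule language is likewise internal, so the sketch below expresses the idea as a Prometheus-style alerting expression; the metric name http_requests_total and the 5% threshold are assumptions for illustration:

    # Fire when more than 5% of requests over the last 5 minutes returned a non-200 code
    sum(rate(http_requests_total{code!="200"}[5m])) by (job)
      / sum(rate(http_requests_total[5m])) by (job)
      > 0.05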



Prometheus

Introduction

Borgmon is an internal system at Google, so how do you use something like it outside of Google? This is where Prometheus, the monitoring system this article is about, comes in. Prometheus is in effect the open-source counterpart of Borgmon, as described in Site Reliability Engineering, the book written by Google's SRE engineers. Prometheus has also become very popular in the open-source community: the Cloud Native Computing Foundation (CNCF), founded under the Linux Foundation with Google's backing, accepted Prometheus as its second hosted project (the first was Kubernetes, the open-source version of Borg).

Architecture

The overall architecture of Prometheus is similar to that of Borgmon, with the following components, some of which are optional:

  • The main Prometheus server, which scrapes and stores time series data

  • Client libraries for instrumenting application code

  • A push gateway for short-lived jobs

  • Special-purpose exporters (HAProxy, StatsD, Ganglia, etc.)

  • An Alertmanager for handling alerts

  • Various command-line tools

  • In addition, Grafana is an excellent choice for presenting a Prometheus dashboard

Database Monitoring

For database metric collection with Prometheus, let's take MySQL as an example. Because MySQL does not natively expose an interface for collecting performance metrics, we can run a mysql_exporter alongside the MySQL database to extract performance metrics and expose a collection endpoint to Prometheus; we can also start node_exporter to fetch host performance metrics.

Deploying the server

Deploying the server is very simple: because Prometheus is written entirely in Go, and Go programs are very easy to install, installing the server only requires downloading, decompressing, and running. As you can see, there are only a few programs on the server side, essentially just Prometheus, the main service, and Alertmanager, the alerting service.

 

The server configuration is also very simple. The common settings include the scrape interval and the specific scrape targets; to monitor the MySQL database, we just need to fill in the mysql_exporter address, as in the sketch below.
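
A minimal prometheus.yml along these lines might look like the following; the hostname is a placeholder, and 9104 and 9100 are the default ports of mysql_exporter (mysqld_exporter) and node_exporter respectively:

    global:
      scrape_interval: 15s                  # how often Prometheus pulls metrics

    scrape_configs:
      - job_name: 'mysql'
        static_configs:
          - targets: ['db-host:9104']       # mysql_exporter address
      - job_name: 'node'
        static_configs:
          - targets: ['db-host:9100']       # node_exporter address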

Deploying the exporter

For MySQL collection, you only need to configure the connection information and start mysql_exporter, for example as sketched below.
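
One way to supply the connection information is a my.cnf-style file, which recent mysqld_exporter releases read via the --config.my-cnf flag; the user, password, host, and port below are placeholders:

    [client]
    user     = exporter
    password = secret
    host     = localhost
    port     = 3306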



After the configuration is complete, you can run mysql_exporter to collect MySQL performance metrics.

 

The collected MySQL performance metrics can then be queried on the Prometheus server, for example with the queries below.
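
As a quick illustration, the following expressions can be entered in the Prometheus expression browser; mysql_up and the mysql_global_status_* series are metric names exposed by current mysqld_exporter releases, so treat them as assumptions if your exporter version differs:

    # Whether the exporter can reach MySQL (1 = up)
    mysql_up

    # Current value of the Threads_connected status variable
    mysql_global_status_threads_connected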



Based on these collected metrics and the rule expressions provided by Prometheus, we can implement some higher-level query requirements, such as this statement: increase(mysql_global_status_bytes_received{instance="$host"}[1h])

This query gives the number of bytes MySQL receives per hour; putting it into Grafana, we can display a very nice performance graph.

At present, open-source implementations of this MySQL monitoring scheme combining Prometheus and Grafana already exist, so we can easily build a monitoring system on top of Prometheus.

 

For alerting, we can also implement complex alert logic based on Prometheus's rich query expressions.

For example, to send an alert when the replication IO thread or the replication SQL thread has not been running for 2 minutes, we can use the following alert rule.

    
    # Alert: The replication IO or SQL threads are stopped.
    ALERT MySQLReplicationNotRunning
      IF mysql_slave_status_slave_io_running == 0 OR mysql_slave_status_slave_sql_running == 0
      FOR 2m
      LABELS {
        severity = "critical"
      }
      ANNOTATIONS {
        summary = "Slave replication is not running",
        description = "Slave replication (IO or SQL) has been down for more than 2 minutes.",
      }

Or, for example, to generate an alert when the MySQL replica's lag is greater than 30 seconds and is predicted to still be above 0 seconds 2 minutes from now, with the condition lasting for 1 minute, we can use the following rule.

    
    # Alert: The replication lag is non-zero and is predicted not to recover within
    #        2 minutes.  This allows for a small amount of replication lag.
    ALERT MySQLReplicationLag
      IF
          (mysql_slave_lag_seconds > 30)
        AND on (instance)
          (predict_linear(mysql_slave_lag_seconds[5m], 60*2) > 0)
      FOR 1m
      LABELS {
        severity = "critical"
      }
      ANNOTATIONS {
        summary = "MySQL slave replication is lagging",
        description = "The mysql slave replication has fallen behind and is not recovering",
      }

Of course, MySQL is not the only database with a monitoring implementation; there are many other open-source implementations in the industry, so monitoring for other databases can also be set up out of the box:

  • mysql_exporter 

https://github.com/prometheus/mysqld_exporter

  • redis_exporter 

https://github.com/oliver006/redis_exporter

  • postgres_exporter 

https://github.com/wrouesnel/postgres_exporter

  • mongodb_exporter 

https://github.com/percona/mongodb_exporter