Prepare the environment
Deploy the software and environment
- Redis, MySQL, Grafana, Ansible, exporters, Prometheus
Set the hostname
hostnamectl set-hostname prome-master01
Set the time zone
[root@prometheus_master01 ~]# timedatectl
      Local time: 2021-03-27 22:39:41 CST
  Universal time: 2021-03-27 14:39:41 UTC
        RTC time: 2021-03-27 14:39:41
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a
timedatectl set-timezone Asia/Shanghai
Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
setenforce 0
sed -i '/^SELINUX/s/enforcing/disabled/' /etc/selinux/config
getenforce
Disable sshd DNS reverse lookups
sed -i 's/^#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
systemctl restart sshd
Build phase
Configure a domestic (Aliyun) YUM mirror
mkdir /tmp/yum_repo_bk
/bin/mv -f /etc/yum.repos.d/* /tmp/yum_repo_bk
# base repo
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# epel repo
wget -O /etc/yum.repos.d/epel-7.repo https://mirrors.aliyun.com/repo/epel-7.repo
yum makecache
Install required tools
# rzsz
yum -y install lrzsz yum-utils
Prepare data directories
mkdir -pv /opt/tgzs
mkdir -pv /opt/app
Remove the size limits on the shell history file
cat <<EOF >> /etc/profile
export HISTFILESIZE=
export HISTSIZE=
EOF
source /etc/profile
Download Prometheus and related components
# Prometheus download address: https://github.com/prometheus/prometheus/releases/tag/v2.25.2
wget -O /opt/tgzs/prometheus-2.25.2.linux-amd64.tar.gz https://github.com/prometheus/prometheus/releases/download/v2.25.2/prometheus-2.25.2.linux-amd64.tar.gz
# node_exporter
wget -O /opt/tgzs/node_exporter-1.1.2.linux-amd64.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
# alertmanager
wget -O /opt/tgzs/alertmanager-0.21.0.linux-amd64.tar.gz https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz
# pushgateway
wget -O /opt/tgzs/pushgateway-1.4.0.linux-amd64.tar.gz https://github.com/prometheus/pushgateway/releases/download/v1.4.0/pushgateway-1.4.0.linux-amd64.tar.gz
# process-exporter
wget -O /opt/tgzs/process-exporter-0.7.5.linux-amd64.tar.gz https://github.com/ncabatoff/process-exporter/releases/download/v0.7.5/process-exporter-0.7.5.linux-amd64.tar.gz
# blackbox_exporter
wget -O /opt/tgzs/blackbox_exporter-0.18.0.linux-amd64.tar.gz https://github.com/prometheus/blackbox_exporter/releases/download/v0.18.0/blackbox_exporter-0.18.0.linux-amd64.tar.gz
# redis_exporter
wget -O /opt/tgzs/redis_exporter-v1.20.0.linux-amd64.tar.gz https://github.com/oliver006/redis_exporter/releases/download/v1.20.0/redis_exporter-v1.20.0.linux-amd64.tar.gz
# mysqld_exporter
wget -O /opt/tgzs/mysqld_exporter-0.12.1.linux-amd64.tar.gz https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
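Every GitHub release tarball above follows the same `<name>-<version>.linux-amd64.tar.gz` naming scheme, so the download commands can be generated instead of typed by hand. A small sketch (hypothetical helper, not part of the original tutorial; redis_exporter is excluded because it keeps the `v` prefix in its tarball name):

```python
# Sketch: generate the wget commands above from a component -> (repo, version) map.
RELEASES = {
    "prometheus": ("prometheus/prometheus", "2.25.2"),
    "node_exporter": ("prometheus/node_exporter", "1.1.2"),
    "alertmanager": ("prometheus/alertmanager", "0.21.0"),
    "pushgateway": ("prometheus/pushgateway", "1.4.0"),
}

def wget_command(name: str, repo: str, version: str, dest: str = "/opt/tgzs") -> str:
    # GitHub release URL pattern: .../releases/download/v<version>/<tarball>
    tarball = f"{name}-{version}.linux-amd64.tar.gz"
    url = f"https://github.com/{repo}/releases/download/v{version}/{tarball}"
    return f"wget -O {dest}/{tarball} {url}"

for name, (repo, version) in RELEASES.items():
    print(wget_command(name, repo, version))
```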
Install MySQL and configure it
# download the MySQL yum repo package
wget http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
# install the repo
yum localinstall mysql57-community-release-el7-8.noarch.rpm -y
# check that the mysql repo was installed successfully
yum repolist enabled | grep "mysql.*-community.*"
# install MySQL server
yum install mysql-community-server -y
# start mysqld
systemctl start mysqld
systemctl status mysqld
# MySQL 5.7 generates a default password for root in /var/log/mysqld.log.
# Find the default root password there, then log in to change it:
mysql -uroot -p
# The default password policy requires uppercase and lowercase letters, digits,
# and special characters, with a minimum length of 8 characters; otherwise
# "ERROR 1819 (HY000): Your password does not satisfy the current policy
# requirements" is raised.
# If you do not need the password policy, add the following to my.cnf to disable it:
echo -e "validate_password=off\ncharacter_set_server=utf8\ninit_connect='SET NAMES utf8'\nskip-name-resolve\n" >> /etc/my.cnf
systemctl restart mysqld
mysql -uroot -p
# change the root password and allow remote login
alter user 'root'@'localhost' identified by '123123';
grant all privileges on *.* to root@'%' identified by '123123' with grant option;
flush privileges;
Install Redis and configure it
yum -y install redis
Compile and install Redis-6.2.1
- Docs: redis.io/download
# install build dependencies
yum install -y gcc gcc-c++ tcl
wget -O /opt/tgzs/redis-6.2.1.tar.gz https://download.redis.io/releases/redis-6.2.1.tar.gz
cd /opt/tgzs/
# extract redis
tar xf redis-6.2.1.tar.gz
# enter the unpacked directory
cd redis-6.2.1
# Allocator: if the MALLOC environment variable is set, it is used when building redis.
# libc is not the default allocator; the default is jemalloc, because jemalloc has
# been shown to have fewer fragmentation problems than libc.
# But if you have no jemalloc and only libc, make will of course fail,
# so add this parameter:
make MALLOC=libc -j 20
# or simply:
make -j 20
# create the install directory
mkdir -p /usr/local/redis
make PREFIX=/usr/local/redis install
# add redis to PATH
echo 'export PATH=$PATH:/usr/local/redis/bin' >> /etc/profile
source /etc/profile
# copy a trimmed default configuration file to /etc
egrep -v "^$|#" redis.conf > redis_sample.conf
# change the listening address to 0.0.0.0 and run as a daemon
sed -i s/bind\ 127.0.0.1/bind\ 0.0.0.0/g redis_sample.conf
sed -i s/daemonize\ no/daemonize\ yes/g redis_sample.conf
/bin/cp -f redis_sample.conf /etc/redis_6379.conf
/bin/cp -f redis_sample.conf /etc/redis_6479.conf
# set the log files
sed -i s@logfile\ ""@logfile\ "/opt/logs/redis_6379.log"@g /etc/redis_6379.conf
sed -i s@logfile\ ""@logfile\ "/opt/logs/redis_6479.log"@g /etc/redis_6479.conf
# set the data directories
sed -i s@dir\ ./@dir\ /var/lib/redis_6379@g /etc/redis_6379.conf
sed -i s@dir\ ./@dir\ /var/lib/redis_6479@g /etc/redis_6479.conf
# set the port of the second instance
sed -i 's/port 6379/port 6479/g' /etc/redis_6479.conf
mkdir /var/lib/redis_6379
mkdir /var/lib/redis_6479
mkdir /opt/logs
cat <<EOF > /etc/systemd/system/redis_6379.service
[Unit]
Description=The redis-server Process Manager
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/usr/local/redis/bin/redis-server /etc/redis_6379.conf
#ExecStop=/usr/local/redis/bin/redis-shutdown

[Install]
WantedBy=multi-user.target
EOF
cat <<EOF > /etc/systemd/system/redis_6479.service
[Unit]
Description=The redis-server Process Manager
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/usr/local/redis/bin/redis-server /etc/redis_6479.conf
#ExecStop=/usr/local/redis/bin/redis-shutdown

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
# enable and start redis
systemctl enable redis_6379
systemctl enable redis_6479
systemctl start redis_6379
systemctl start redis_6479
systemctl status redis_6379
systemctl status redis_6479
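The two instances above differ only by port, so the per-port edits and systemd units follow a template. A minimal sketch (hypothetical helper mirroring the sed/cat steps above, not part of the original):

```python
# Sketch: render the per-port overrides and systemd unit for any list of ports.
UNIT_TEMPLATE = """[Unit]
Description=The redis-server Process Manager
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/usr/local/redis/bin/redis-server /etc/redis_{port}.conf

[Install]
WantedBy=multi-user.target
"""

def render_unit(port: int) -> str:
    # systemd unit text for /etc/systemd/system/redis_<port>.service
    return UNIT_TEMPLATE.format(port=port)

def render_overrides(port: int) -> dict:
    # the values the sed commands above patch into redis_<port>.conf
    return {
        "port": str(port),
        "logfile": f"/opt/logs/redis_{port}.log",
        "dir": f"/var/lib/redis_{port}",
    }

for port in (6379, 6479):
    print(f"--- redis_{port}.service ---")
    print(render_unit(port))
```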
Install Grafana and configure it
Install Grafana 7 via RPM
# download address: https://grafana.com/grafana/download
wget -O /opt/tgzs/grafana-7.5.1-1.x86_64.rpm https://dl.grafana.com/oss/release/grafana-7.5.1-1.x86_64.rpm
sudo yum install grafana-7.5.1-1.x86_64.rpm
Create the database in MySQL
CREATE DATABASE IF NOT EXISTS grafana DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Modify the configuration file and fill in the MySQL connection details
# /etc/grafana/grafana.ini
[database]
type = mysql       # default is sqlite3
name = grafana
user = root
password = 123123
Start the service
systemctl start grafana-server
systemctl enable grafana-server
systemctl status grafana-server
Check whether logs report errors
tail -f /var/log/grafana/grafana.log
Add a hosts entry on your laptop
# windows
C:\Windows\System32\drivers\etc\hosts
192.168.0.112 grafana.prome.me
Laptop Browser access
http://grafana.prome.me:3000/?orgId=1 (default username/password: admin/admin)
Known issue: panels cannot be edited in Google Chrome
- Issue github.com/grafana/gra…
Install Ansible and batch-install node_exporter
Write the node hostnames to /etc/hosts
Echo "192.168.0.112 prome-master01" >> /etc/hosts echo "192.168.0.113 prome-node01" >> /etc/hostsCopy the code
Generate an SSH key on the master and copy it to the nodes
ssh-keygen
ssh-copy-id prome-node01
ssh-copy-id prome-master01
Install Ansible on the master
# /etc/ansible/ansible.cfg
ssh_args = -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no
Playbook execution requires an inventory file
cat <<EOF > /opt/tgzs/host_file
prome-master01
prome-node01
EOF
# test that ansible can reach the nodes
ansible -i host_file all -m ping
Set the syslog and Logrotate services
ansible-playbook -i host_file init_syslog_logrotate.yaml
Deploy node_exporter with the Ansible service-deployment playbook
ansible-playbook -i host_file service_deploy.yaml -e "tgz=node_exporter-1.1.2.linux-amd64.tar.gz" -e "app=node_exporter"
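The same playbook invocation deploys any of the downloaded tarballs; only the `tgz` and `app` extra-vars change. A sketch (hypothetical helper, not from the original) that derives the command from a tarball name:

```python
import re

# Sketch: build the ansible-playbook command for any tarball in /opt/tgzs.
def deploy_command(tgz: str) -> str:
    # strip "-<version>.linux-amd64.tar.gz" (with an optional "v" prefix on the
    # version, as in redis_exporter-v1.20.0...) to recover the app name
    app = re.sub(r"-v?\d.*$", "", tgz)
    return (
        "ansible-playbook -i host_file service_deploy.yaml "
        f'-e "tgz={tgz}" -e "app={app}"'
    )

print(deploy_command("node_exporter-1.1.2.linux-amd64.tar.gz"))
```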
Check the node_exporter service status
ansible -i host_file all -m shell -a " ps -ef |grep node_exporter|grep -v grep "
Check /metrics on port 9100 in the browser
node01.prome.me:9100/metrics
master01.prome.me:9100/metrics
Install Prometheus and configure it
Deploy Prometheus using Ansible
Yaml -e "TGZ = Prometheus -2.25.2.linux-amd64.tar.gz" -e "app= Prometheus" ansible-playbook -I host_file service_deploy.yaml -e "TGZ = Prometheus -2.25.2.linux-amd64.tar.gz" -e "app= Prometheus"Copy the code
To view the page
http://master01.prome.me:9090/
Prometheus configuration file parsing
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  scrape_timeout: 10s
  # query log file
  query_log_file: /opt/logs/prometheus_query_log
  # global label group:
  # all data collected through this instance is tagged with these labels
  external_labels:
    account: 'huawei-main'
    region: 'beijng-01'

# Alertmanager information section
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "localhost:9093"

# rule files section (alerting and recording rules)
rule_files:
  - /etc/prometheus/rules/record.yml
  - /etc/prometheus/rules/alert.yml

# scrape configuration section
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']

# remote read section
remote_read:
  # prometheus
  - url: http://prometheus/v1/read
    read_recent: true
  # m3db
  - url: "http://m3coordinator-read:7201/api/v1/prom/remote/read"
    read_recent: true

# remote write section
remote_write:
  - url: "http://m3coordinator-write:7201/api/v1/prom/remote/write"
    queue_config:
      capacity: 10000
      max_samples_per_send: 60000
    write_relabel_configs:
    - source_labels: [__name__]
      separator: ;
      # drop metrics whose names match these prefixes
      regex: '(kubelet_|apiserver_|container_fs_).*'
      replacement: $1
      action: drop
- A Prometheus instance can therefore serve the following purposes
| Configuration sections used | Purpose |
|---|---|
| collection section | Collector; data is stored locally |
| collection section + remote write section | Collector + forwarder; data is stored locally and remotely |
| remote read section | Querier; queries remote storage data |
| collection section + remote read section | Collector + querier; queries local data and remote storage data |
| collection section + Alertmanager section + rule files section | Collector + alert trigger; queries local data, generates alerts, and sends them to Alertmanager |
| remote read section + Alertmanager section + rule files section | Remote alert trigger; queries remote data and sends alerts to Alertmanager |
| remote read section + remote write section + recording rules section | Pre-aggregator; writes the resulting metrics to remote storage |
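The combinations in the table can be checked mechanically. A small sketch (hypothetical helper, not from the original, using the real top-level config keys) that maps the sections present in a config to the roles above:

```python
# Sketch: derive what a Prometheus instance does from which top-level
# configuration sections it defines.
def roles(sections: set) -> list:
    out = []
    if "scrape_configs" in sections:
        out.append("collector")
    if "remote_write" in sections:
        out.append("forwarder")
    if "remote_read" in sections:
        out.append("remote querier")
    # alerting needs both the Alertmanager section and rule files
    if {"alerting", "rule_files"} <= sections:
        out.append("alert trigger")
    return out

print(roles({"scrape_configs", "remote_write"}))
print(roles({"remote_read", "alerting", "rule_files"}))
```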
Prepare the Prometheus configuration file and add the two node_exporter targets
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
alerting:
  alertmanagers:
  - scheme: http
    timeout: 10s
    api_version: v1
    static_configs:
    - targets: []
scrape_configs:
- job_name: prometheus
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - 192.168.26.112:9100
    - 192.168.26.113:9100
Hot-reload the configuration file
# start Prometheus with --web.enable-lifecycle, then reload the config with:
curl -X POST http://localhost:9090/-/reload
Check the targets Up status on the page
- Visit master01.prome.me:9090/targets
Fields on the targets page
- Job: the job group
- Endpoint: the instance address
- State: whether the last scrape succeeded
- Labels: the label set
- Last Scrape: time since the last scrape
- Scrape Duration: how long the last scrape took
- Error: the scrape error, if any
Obtain targets details from the API
- run
008_get_targets_from_prome.py
status: normal num: 1/2 endpoint: http://172.20.70.205:9100/metrics state: up
labels: {'instance': '192.168.26.112:9100', 'job': 'prometheus'}
lastScrape: 2021-03-29T18:20:04.304025213+08:00 lastScrapeDuration: 0.011969003 lastError:
status: normal num: 2/2 endpoint: http://172.20.70.215:9100/metrics state: up
labels: {'instance': '192.168.26.113:9100', 'job': 'prometheus'}
lastScrape: 2021-03-29T18:20:06.845862504+08:00 lastScrapeDuration: 0.012705335 lastError:
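The original 008_get_targets_from_prome.py is not shown; a minimal sketch of what such a script might do with the /api/v1/targets response (the sample below is hand-built to mirror the output above):

```python
import json

# sample shaped like the Prometheus /api/v1/targets response
sample = json.loads("""
{"status": "success", "data": {"activeTargets": [
  {"scrapeUrl": "http://192.168.26.112:9100/metrics", "health": "up",
   "labels": {"instance": "192.168.26.112:9100", "job": "prometheus"},
   "lastError": ""},
  {"scrapeUrl": "http://192.168.26.113:9100/metrics", "health": "up",
   "labels": {"instance": "192.168.26.113:9100", "job": "prometheus"},
   "lastError": ""}
]}}
""")

def summarize(resp: dict) -> dict:
    # count healthy targets and collect errors from the unhealthy ones
    targets = resp["data"]["activeTargets"]
    up = [t for t in targets if t["health"] == "up"]
    errors = [t["lastError"] for t in targets if t["health"] != "up"]
    return {"total": len(targets), "up": len(up), "down_errors": errors}

print(summarize(sample))
```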
- Add a deliberately wrong target and observe the result, for example
abc:9100
status: abnormal num: 1/3 endpoint: http://abc:9100/metrics state: down
labels: {'instance': 'abc:9100', 'job': 'prometheus'}
lastScrape: 2021-03-29T18:24:08.365229831+08:00 lastScrapeDuration: 0.487732313
lastError: Get "http://abc:9100/metrics": dial tcp: lookup abc on 114.114.114.114:53: no such host
status: normal num: 2/3 endpoint: http://192.168.26.112:9100/metrics state: up
labels: {'instance': '192.168.26.112:9100', 'job': 'prometheus'}
lastScrape: 2021-03-29T18:??:??.304044469+08:00 lastScrapeDuration: 0.012483866 lastError:
status: normal num: 3/3 endpoint: http://192.168.26.113:9100/metrics state: up
labels: {'instance': '192.168.26.113:9100', 'job': 'prometheus'}
lastScrape: 2021-03-29T18:??:??.845860017+08:00 lastScrapeDuration: 0.010381262 lastError:
- The up metric can be used to calculate the target scrape success rate
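One common PromQL expression for this is sum(up) / count(up) (or avg(up)); the same arithmetic over raw sample values, as a sketch:

```python
# Sketch: scrape success rate from the current values of the `up` series
# (1 = target scraped successfully, 0 = scrape failed).
def success_rate(up_samples: list) -> float:
    return sum(up_samples) / len(up_samples)

# e.g. 2 of 3 targets up, matching the abc:9100 example above
print(success_rate([0, 1, 1]))
```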
Scrape Prometheus's own metrics
- 192.168.26.112:9090
- 192.168.26.113:9090