This is day 7 of my participation in the November Gwen Challenge. For event details, see: The Last Gwen Challenge of 2021
Problem: clicking the Grafana alert card in DingTalk does not open the Grafana interface.
Fix: in Grafana's .ini configuration file, set:
root_url = 'xxxx'
Set root_url to the address at which your Grafana server is reached, then restart Grafana.
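As a quick sanity check for this setting, the sketch below parses an .ini fragment and confirms that root_url is an absolute http(s) URL. It assumes root_url sits in the [server] section, as in the stock Grafana configuration; the sample address and the helper names are hypothetical.

```python
import configparser
from urllib.parse import urlparse

# Hypothetical sample of the relevant part of the Grafana .ini file.
SAMPLE_INI = """
[server]
root_url = 'http://grafana.example.com:3000/'
"""


def read_root_url(ini_text: str) -> str:
    """Return root_url from the [server] section, without surrounding quotes."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("server", "root_url").strip("'\"")


def is_absolute_http_url(url: str) -> bool:
    """True if the value has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    url = read_root_url(SAMPLE_INI)
    print(url, is_absolute_http_url(url))
```

A placeholder such as 'xxxx' fails this check, which is a reminder to replace it with a real address before restarting.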
I. Installing and starting Grafana
I installed Grafana with Docker here.
Official documentation: grafana.com/docs/grafan…
1. Connecting data sources, configuring dashboards, and the meaning of each metric are not covered in detail here; please refer to this article:
www.jianshu.com/p/7e7e0d067… (Jianshu, by Kang teenager)
2. For a quick start, you can fork this repository: github.com/monitoringa…
It contains comprehensive, well-established Grafana dashboard templates for common data sources, which you can download and import.
3. Note that template-based dashboards can only be used for monitoring and display; setting up alerts requires custom queries.
II. Creating and configuring the DingTalk robot
DingTalk developer documentation: ding-doc.dingtalk.com/doc#/server…
1. Create a DingTalk group and a DingTalk robot
Create a custom robot
2. Get the webhook URL from the robot settings
Get the URL for webhook
3. Configure the security settings. This step is required; I chose whitelist mode and entered the Grafana server's address.
Security Settings – Whitelist
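Before wiring the robot into Grafana, it can help to send it a test message by hand. The following is a minimal sketch using only the Python standard library. WEBHOOK_URL is a placeholder for the URL copied in step 2, the function names are hypothetical, and the text message shape is taken from my reading of the DingTalk custom robot documentation, so check it against the current docs.

```python
import json
import urllib.request

# Placeholder: paste the webhook URL copied from the robot settings.
WEBHOOK_URL = "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN"


def build_text_request(webhook_url: str, content: str) -> urllib.request.Request:
    """Build a POST request carrying a plain text robot message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling send(build_text_request(WEBHOOK_URL, "Test message from the Grafana server")) should make the robot post in the group. With whitelist mode enabled, run it from the whitelisted Grafana server; a request from any other address is expected to be rejected.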
III. Setting up alerts in Grafana
1. In the Grafana console, open the “Alerting” module in the left sidebar and create an alert notification channel.
Disable Resolve Message: when this is enabled, no message is sent when the health check returns to [OK].
2. Create a dashboard and panel for testing, press “E” to enter edit mode, create a query, and select the data source, metric, instance ID, and data collection interval.
3. Create an alert rule
- Name: a user-defined alert name
- Evaluate every: how often the health check runs
- For: how long the condition must hold before the state changes from Pending to Alerting
- Send to: the notification channel that the alert triggers
- Message: the alert text
4. Set a low alert threshold for testing, then go back to DingTalk to check the robot's messages
// Remember to enable the Disable Resolve Message option so that the [OK] state does not send a message
IV. Other implementation details
1) In the AWS console, change the EC2 monitoring setting to enable “Detailed Monitoring”, which in practice changes the data collection frequency from every 5 minutes to every 1 minute
2) Following the same procedure as in the test, set up alerts for the commonly used servers; multiple queries can be placed in one panel
Monitored metric: CPU load
Health check: every minute, calculate the average CPU load over the previous 5 minutes; the alert condition is met if it exceeds 80
Alert rule: when the average is greater than 80, the state becomes “Pending”; if the Pending state lasts for 3 minutes, the alert fires
Miscellaneous: DingTalk group announcement, coordinating responders, promoting the test robot to regular use, changing the robot's profile picture
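To make the timing of this rule concrete, here is a simplified simulation of it. It is only a model of the behavior described above, not Grafana's implementation; the AlertRule and simulate names are hypothetical, and the first few minutes use a shorter window because fewer than 5 samples exist yet.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Sequence


@dataclass
class AlertRule:
    threshold: float = 80.0  # alert condition: average above this value
    window: int = 5          # minutes of data averaged at each evaluation
    for_minutes: int = 3     # how long Pending must last before Alerting


def simulate(samples: Sequence[float], rule: AlertRule) -> List[str]:
    """Evaluate once per minute; samples[i] is the CPU load at minute i."""
    states: List[str] = []
    pending_since: Optional[int] = None  # minute the condition first held
    for minute in range(len(samples)):
        recent = samples[max(0, minute - rule.window + 1): minute + 1]
        if mean(recent) <= rule.threshold:
            pending_since = None
            states.append("ok")
            continue
        if pending_since is None:
            pending_since = minute
        if minute - pending_since >= rule.for_minutes:
            states.append("alerting")
        else:
            states.append("pending")
    return states


if __name__ == "__main__":
    load = [50] * 5 + [100] * 8
    for minute, state in enumerate(simulate(load, AlertRule())):
        print(minute, load[minute], state)
```

With a load that jumps from 50 to 100, the 5-minute average only exceeds 80 on the fourth minute after the jump, and the notification follows 3 minutes later, so roughly 6 minutes pass between the spike and the DingTalk message.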
V. Refinements and extensions
Grafana's DingTalk robot integration only supports the link message type. In this article, the link is only used for a text preview. Here is a sample link message:
{"msgtype": "link", "link": {"text": "This new version to be released, founder XX called it mangrove. Until now, when faced with a major upgrade, product managers would pick a code name for the situation. This time, why mangrove? ", "title": "The train of The Times is moving forward ", "picUrl": "", "" messageUrl": "https://www.dingtalk.com/s?__biz=MzA4NjMwMTA2Ng==&mid=2650316842&idx=1&sn=60da3ea2b29f1dcc43a7c8e4a7c97a16&scene=2&srci d=09189AnRJEdIiWVaKltFzNTw&from=timeline&isappinstalled=0&key=&ascene=2&uin=&devicetype=android-23&version=26031933&nett ype=WIFI" } }Copy the code
You can modify the corresponding fields to extend what the DingTalk robot does, for example so that clicking the link goes directly to the service console or the monitoring dashboard.
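As an illustration of customizing those fields, the sketch below builds a link message with the same four fields as the sample. It is meant for messages you send to the robot yourself, since Grafana assembles its own payload; the function name and the dashboard URL are hypothetical.

```python
import json


def build_link_message(title: str, text: str, message_url: str,
                       pic_url: str = "") -> dict:
    """Return a link-type robot message with the fields shown in the sample."""
    return {
        "msgtype": "link",
        "link": {
            "text": text,
            "title": title,
            "picUrl": pic_url,
            "messageUrl": message_url,  # where a click on the card leads
        },
    }


if __name__ == "__main__":
    msg = build_link_message(
        title="CPU load alert",
        text="Average CPU load has been above 80 for 3 minutes.",
        # Hypothetical dashboard address; use your own root_url and path.
        message_url="http://grafana.example.com:3000/d/abc123/cpu",
    )
    print(json.dumps(msg, ensure_ascii=False, indent=2))
```

Pointing messageUrl at a dashboard or a service console is what turns the card from a text preview into a shortcut for the responder.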