1. Data-level monitoring
Application scenario: Crawlers that are expected to keep producing new data. Not applicable when no new data is expected.
Specific operations: Monitor the database and query the number of newly added records. If the number is below the threshold, raise an alarm.
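
The check described above could look roughly like the sketch below. It is only an illustration under assumptions: the table name `items`, the column `created_at` (stored as UTC ISO 8601 text), the one-hour window, and the threshold of 100 are all hypothetical, and SQLite stands in for whatever database the crawler actually writes to.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def count_new_rows(conn, since):
    """Count rows in the (hypothetical) items table created at or after `since`."""
    # Assumes created_at holds UTC ISO 8601 strings, which sort chronologically.
    row = conn.execute(
        "SELECT COUNT(*) FROM items WHERE created_at >= ?",
        (since.isoformat(),),
    ).fetchone()
    return row[0]


def should_alarm(conn, window=timedelta(hours=1), threshold=100, now=None):
    """Return True when fewer than `threshold` rows were added within `window`."""
    now = now or datetime.now(timezone.utc)
    return count_new_rows(conn, now - window) < threshold
```

A scheduled job (cron, for instance) could call `should_alarm` at a fixed interval and send a notification whenever it returns True.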
2. Log monitoring
Application scenario: Suitable for almost any crawler.
Specific operations: Monitor errors and the number of errors. A crawler turns unstructured data into structured data, so some errors are inevitable. We therefore still monitor errors, but the alarm needs a threshold to avoid a flood of alarms.
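
One way to apply such a threshold is a logging handler that counts errors in a sliding window and notifies at most once per cooldown period. This is a minimal sketch, not a definitive implementation: the class name `ErrorRateAlarm`, the default numbers, and the use of `print` as the notification channel are all assumptions.

```python
import logging
import time
from collections import deque


class ErrorRateAlarm(logging.Handler):
    """Call `notify` when more than `threshold` errors occur within `window` seconds."""

    def __init__(self, threshold=20, window=300, cooldown=600,
                 notify=print, clock=time.monotonic):
        super().__init__(level=logging.ERROR)
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown  # minimum seconds between two alarms
        self.notify = notify      # placeholder for a real alert channel
        self.clock = clock
        self._times = deque()
        self._last_alarm = None

    def emit(self, record):
        now = self.clock()
        self._times.append(now)
        # Drop errors that have fallen out of the sliding window.
        while self._times and now - self._times[0] > self.window:
            self._times.popleft()
        over = len(self._times) > self.threshold
        cooled = self._last_alarm is None or now - self._last_alarm >= self.cooldown
        if over and cooled:
            self._last_alarm = now
            self.notify(f"{len(self._times)} errors in the last {self.window}s")
```

Attaching it with `logging.getLogger().addHandler(ErrorRateAlarm())` would let occasional parse errors pass silently while a burst of errors triggers a single alarm.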
3. Several specific methods
- Scrapy middleware + signal
Hook the spider_closed signal and read the crawl stats, e.g. {'log_count/ERROR': 2}
- Sentry
Sentry can pinpoint the specific cause of some errors
- Counting log errors
When analyzing logs, suppress repeated alarms and only alert once the number of errors exceeds a threshold
- Grafana visualization
The actual alarm conditions are not easy to write
- Instrumentation points for data-anomaly alarms
For example, if you crawl a news site and the page layout changes so that the most important list page can no longer be parsed and the item count drops to 0, send an alarm.
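
The Scrapy signal approach and the zero-item check from the list above can be combined in a small extension. The sketch below is hedged accordingly: `log_count/ERROR` and `item_scraped_count` are stats keys Scrapy records itself, but the class name `StatsAlarm`, the setting names `ALARM_MAX_ERRORS` and `ALARM_MIN_ITEMS`, and the log-based notification are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def find_problems(stats, max_errors=10, min_items=1):
    """Return a list of human-readable problems found in a Scrapy stats dict."""
    problems = []
    errors = stats.get("log_count/ERROR", 0)
    items = stats.get("item_scraped_count", 0)
    if errors > max_errors:
        problems.append(f"{errors} errors logged (limit {max_errors})")
    if items < min_items:
        problems.append(f"only {items} items scraped (expected at least {min_items})")
    return problems


class StatsAlarm:
    """Scrapy extension that checks the crawl stats when the spider closes."""

    def __init__(self, stats, max_errors, min_items):
        self.stats = stats
        self.max_errors = max_errors
        self.min_items = min_items

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so find_problems() stays usable without Scrapy.
        from scrapy import signals

        ext = cls(
            crawler.stats,
            # Hypothetical setting names, not built-in Scrapy settings.
            crawler.settings.getint("ALARM_MAX_ERRORS", 10),
            crawler.settings.getint("ALARM_MIN_ITEMS", 1),
        )
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_closed(self, spider, reason):
        problems = find_problems(
            self.stats.get_stats(), self.max_errors, self.min_items
        )
        for problem in problems:
            # Placeholder: swap in a real notification channel here.
            logger.error("ALARM for %s: %s", spider.name, problem)
        return problems
```

To try it, the class would be registered under the `EXTENSIONS` setting of the project; a spider that finishes with 0 scraped items or too many logged errors then produces one alarm line per problem.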