I. Description

Internet companies typically have dedicated data teams responsible for the company's business metrics. To obtain these basic business indicators, the engineering team usually has to do some data-collection work; hence the birth of tracking points ("buried points").

There are many ways to implement tracking. This article focuses on log-based tracking: its implementation ideas and a worked example.

Log-based tracking records business/behavior data in the form of program log output

 

II. Overall architecture

Implementing service monitoring and behavior analysis with log-based tracking takes the following five steps

  1. Data generation (tracking)
  2. Data collection
  3. Data parsing (structuring)
  4. Data persistence (storage)
  5. Data usage (presentation/analysis)

 

III. Solution description

3.1. Data generation

Log data can be generated directly with a logging framework such as Logback. The output can be wrapped in a common utility method, an AOP aspect, or an annotation to produce the specified tracking logs.
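A minimal sketch of the utility-method approach, assuming SLF4J/Logback; the class name and the "TRACKING_LOG" logger name are illustrative, not from the source:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Writes tracking records to a dedicated logger so Logback can route them to their own file. */
public final class TrackingLogger {

    // A dedicated logger name (assumption) that a Logback <logger> rule can target
    private static final Logger LOG = LoggerFactory.getLogger("TRACKING_LOG");

    private TrackingLogger() {
    }

    /**
     * Emits one record as {source}|{object id}|{type}|{properties};
     * the leading {time} field is added by the Logback encoder pattern (see the config sketch below).
     */
    public static void log(String source, String objectId, String type, String properties) {
        LOG.info("{}|{}|{}|{}", source, objectId, type, properties);
    }
}
```

For example, `TrackingLogger.log("api-gateway", "1", "request-statistics", "ip=171.221.203.106&browser=CHROME&operatingSystem=WINDOWS_10")` would produce the sample line shown below.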

However, to make later parsing possible, the log data must first follow a specification:

  1. All tracking logs must agree on a unified format, for example: {time}|{source}|{object id}|{type}|{object properties (separated by &)}

A log generated in this format looks like: 2019-11-07 10:32:01|api-gateway|1|request-statistics|ip=171.221.203.106&browser=CHROME&operatingSystem=WINDOWS_10

  2. Tracking log files must not get mixed up with the logs generated by the rest of the system

Configure Logback so that tracking logs are written to their own output directory and files, separate from the application logs, as in the sketch below.
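A minimal logback.xml fragment for this separation; the file paths, encoder pattern, and the TRACKING_LOG logger name are assumptions carried over from the sketch above:

```xml
<!-- Rolls the tracking log daily, in a directory separate from the application log -->
<appender name="TRACKING_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>logs/tracking/tracking.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>logs/tracking/tracking.%d{yyyy-MM-dd}.log</fileNamePattern>
  </rollingPolicy>
  <encoder>
    <!-- the timestamp prefix completes the agreed {time}|... format -->
    <pattern>%d{yyyy-MM-dd HH:mm:ss}|%msg%n</pattern>
  </encoder>
</appender>

<!-- additivity="false" keeps tracking records out of the root (application) appenders -->
<logger name="TRACKING_LOG" level="INFO" additivity="false">
  <appender-ref ref="TRACKING_FILE"/>
</logger>
```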

 

[Figure: a tracking point in the gateway generates a log for each user request]

 

3.2. Data collection

There are many middleware options for collecting log data, such as Filebeat, Flume, Fluentd, and rsyslog. A collection agent needs to be deployed on each server.

One agent per server is enough: even if a server runs multiple microservices, their tracking logs can all be collected by the same agent.

PS: The message queue after log collection is not strictly necessary and can be removed, but it brings two advantages (see the sketch after this list):

  1. Peak shaving: it relieves the pressure on downstream log parsing
  2. Data sharing: besides feeding the log system, the same log data can be supplied to other consumers at the same time, such as stream computing
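A minimal filebeat.yml sketch for this setup; the log path, Kafka host, and topic name are assumptions:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /app/logs/tracking/*.log   # the dedicated tracking-log directory

# Ship to a message queue (optional, per the note above); Filebeat can also
# send straight to Logstash via output.logstash if the queue is removed.
output.kafka:
  hosts: ["kafka:9092"]
  topic: "tracking-log"
```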

 

3.3. Data parsing

Use Logstash's Grok expressions to parse and structure the log data, taking the sample log above as an example:

2019-11-07 10:32:01|api-gateway|1|request-statistics|ip=171.221.203.106&browser=CHROME&operatingSystem=WINDOWS_10

Structured log data is as follows:

```json
{
  "timestamp": "2019-11-07 10:32:01",
  "appName": "api-gateway",
  "resourceId": "1",
  "type": "request-statistics",
  "ip": "171.221.203.106",
  "browser": "CHROME",
  "operatingSystem": "WINDOWS_10"
}
```
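A minimal Logstash filter sketch that would yield this structure; the field names mirror the agreed format above and are assumptions:

```
filter {
  grok {
    # split the five pipe-separated sections
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp}\|%{DATA:appName}\|%{DATA:resourceId}\|%{DATA:type}\|%{GREEDYDATA:props}"
    }
  }
  kv {
    # expand "ip=...&browser=...&operatingSystem=..." into individual fields
    source      => "props"
    field_split => "&"
    value_split => "="
  }
}
```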

 

3.4. Data persistence

Use Logstash to write the parsed data into Elasticsearch; the index is created automatically and split by day
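A minimal Logstash output sketch for the daily split; the host and index name are assumptions:

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    # one index per day, e.g. tracking-log-2019.11.07
    index => "tracking-log-%{+YYYY.MM.dd}"
  }
}
```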

An index template can be used to specify properties such as each field's type and analyzer
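A minimal index-template sketch matching the daily indexes above; the template name and field mappings are assumptions based on the sample record:

```
PUT _index_template/tracking-log-template
{
  "index_patterns": ["tracking-log-*"],
  "template": {
    "mappings": {
      "properties": {
        "timestamp":       { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
        "appName":         { "type": "keyword" },
        "resourceId":      { "type": "keyword" },
        "type":            { "type": "keyword" },
        "ip":              { "type": "ip" },
        "browser":         { "type": "keyword" },
        "operatingSystem": { "type": "keyword" }
      }
    }
  }
}
```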

 

3.5. Data usage

After the log data is saved to Elasticsearch, monitoring data can be displayed and log data analyzed in real time through aggregation queries

[Figure: monitoring example]

The aggregation query logic can be found at gitee.com/zlt2000/mic…
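As an illustration only (not the logic at the link above), an aggregation query counting the last hour's gateway requests per browser might look like:

```
GET tracking-log-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term":  { "type": "request-statistics" } },
        { "range": { "timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "requests_by_browser": {
      "terms": { "field": "browser" }
    }
  }
}
```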

 

IV. Summary

Log-based tracking is only one of the available tracking techniques. Its advantages are that it is non-invasive to the system and flexible: log collection, parsing, and persistence can be freely combined from different middleware without modifying the source system's code, and it can easily feed other analysis platforms (such as a big data platform).

PS: For service monitoring, could you query the business database directly instead of using tracking logs? (This is not recommended.)

  1. Log-based tracking separates monitoring data from business data, so the monitoring platform neither affects nor adds pressure to the business database
  2. Log-based tracking makes it easy to implement real-time early warning on business data

For example, add stream-computing middleware after log collection; when the count or amount of coupon logs within a certain time window exceeds a certain threshold, an alarm is raised. A minimal sketch follows.
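A minimal Kafka Streams sketch of such an alert, assuming the collected logs land on a "tracking-log" topic and coupon records carry a "coupon-send" type; the topic name, type value, window size, and threshold are all illustrative:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.TimeWindows;

public class CouponAlertTopology {

    public static StreamsBuilder build() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("tracking-log", Consumed.with(Serdes.String(), Serdes.String()))
               // keep only coupon tracking records (type field per the agreed format)
               .filter((key, line) -> line.contains("|coupon-send|"))
               // count all matching records together, in one-minute windows
               .groupBy((key, line) -> "coupon-send", Grouped.with(Serdes.String(), Serdes.String()))
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
               .count()
               .toStream()
               // alert when the per-window count crosses the (illustrative) threshold
               .filter((windowedKey, count) -> count > 1000)
               .foreach((windowedKey, count) ->
                       System.out.println("ALERT: " + count + " coupon logs in window " + windowedKey.window()));
        return builder;
    }
}
```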

 

Recommended reading

  • Struggling with log troubleshooting? Distributed log tracing to the rescue
  • Zuul integrates Sentinel's latest gateway flow-control component
  • How can Spring Cloud developers resolve service conflicts and instance hopping?
  • How to do distributed transactions in Spring Cloud synchronous scenarios? Try Seata
  • How to do distributed transactions in Spring Cloud asynchronous scenarios? Try RocketMQ
  • How does Spring Cloud Gateway do dynamic routing? Integrating Nacos makes it simple
