Recently I have been working on service monitoring. The most popular monitoring system today is Prometheus, which is also the one go-zero integrates with by default. This article looks at how go-zero plugs into Prometheus and how developers can define their own monitoring metrics.
Enabling monitoring
Prometheus-based service metrics monitoring is built into the go-zero framework, but it is not enabled by default; developers need to turn it on in config.yaml:
```yaml
Prometheus:
  Host: 127.0.0.1
  Port: 9091
  Path: /metrics
```
If you run Prometheus locally, add a scrape config for the service to Prometheus's configuration file prometheus.yaml:
```yaml
- job_name: 'file_ds'
  static_configs:
    - targets: ['your-local-ip:9091']
      labels:
        job: activeuser
        app: activeuser-api
        env: dev
        instance: your-local-ip:service-port
```
Since Prometheus runs locally in Docker, with the yaml above placed in the docker-prometheus directory:
```shell
docker run \
  -p 9090:9090 \
  -v dockeryml/docker-prometheus:/etc/prometheus \
  prom/prometheus
```
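If you prefer docker-compose, an equivalent minimal sketch (assuming the same directory layout as above) would be:

```yaml
version: "3"
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./dockeryml/docker-prometheus:/etc/prometheus
```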
Open localhost:9090 in a browser to see the Prometheus UI. To see the service's own monitoring data, visit http://service-ip:9091/metrics. In that output you can see two kinds of series: bucket, and count/sum.
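As an illustration (the metric name and numbers here are made up, but the shape matches a Prometheus histogram exposition), the series look like:

```
http_server_requests_duration_ms_bucket{path="/api/users",le="50"} 4
http_server_requests_duration_ms_bucket{path="/api/users",le="100"} 5
http_server_requests_duration_ms_sum{path="/api/users"} 257
http_server_requests_duration_ms_count{path="/api/users"} 5
```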
So how does go-zero integrate these monitoring metrics? What do they monitor? And how do we define our own metrics? Here are the explanations:
zeromicro.github.io/go-zero/ser…
How to integrate
In the example above the protocol is HTTP, which means metric data is collected continuously while the server handles requests. This naturally points to middleware. The specific code: github.com/tal-tech/go…
```go
var (
	metricServerReqDur = metric.NewHistogramVec(&metric.HistogramVecOpts{
		...
		// monitored labels
		Labels: []string{"path"},
		// the histogram's bucket boundaries
		Buckets: []float64{5, 10, 25, 50, 100, 250, 500, 1000},
	})

	metricServerReqCodeTotal = metric.NewCounterVec(&metric.CounterVecOpts{
		...
		// monitored labels; recorded with Inc()
		Labels: []string{"path", "code"},
	})
)

func PromethousHandler(path string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// request start time
			startTime := timex.Now()
			cw := &security.WithCodeResponseWriter{Writer: w}
			defer func() {
				// on the way back: report request duration and status code
				metricServerReqDur.Observe(int64(timex.Since(startTime)/time.Millisecond), path)
				metricServerReqCodeTotal.Inc(path, strconv.Itoa(cw.Code))
			}()
			// hand off to subsequent middleware and business logic;
			// the deferred report runs once the whole request finishes
			// [🧅: onion model]
			next.ServeHTTP(cw, r)
		})
	}
}
```
It's actually quite simple:

- `HistogramVec` collects request durations:
  - The buckets store the time boundaries specified in the options; each request's duration is matched against them and counted into its bucket.
  - The end result is the latency distribution of each route, which gives developers an intuitive target for optimization.
- `CounterVec` counts by the specified labels, here `Labels: []string{"path", "code"}`:
  - The labels act like a tuple: go-zero takes `(path, code)` as a whole and records how many times each status code is returned on each route. If `4xx,5xx` show up too often, shouldn't you take a look at your service's health?
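To see the onion model concretely, here is a minimal sketch of wrapping a plain handler with the middleware above by hand (the handler and route are hypothetical; go-zero does this wiring automatically for every route):

```go
// a trivial business handler, for illustration only
func userHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("ok"))
}

func main() {
	// every request to /api/users now reports duration and status code
	wrapped := PromethousHandler("/api/users")(http.HandlerFunc(userHandler))
	http.Handle("/api/users", wrapped)
	http.ListenAndServe(":8888", nil)
}
```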
How to Customize
go-zero also provides a basic prometheus metric package, so developers can build their own Prometheus middleware.
Code: github.com/tal-tech/go…
| Name | Purpose | Collection function |
| --- | --- | --- |
| CounterVec | Simple counting, e.g. QPS statistics | CounterVec.Inc(): metric +1 |
| GaugeVec | Simple value recording, e.g. disk capacity, CPU/Mem usage (can go up or down) | GaugeVec.Inc()/GaugeVec.Add(): metric +1 / +N (N may be negative) |
| HistogramVec | Distribution of observed values, e.g. request latency, response size | HistogramVec.Observe(val, labels): record the current value and find the bucket it falls into |
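A hedged sketch of defining one of each type with go-zero's metric package (the Namespace/Name/Help fields are my assumption of what the elided Opts fields look like; check the linked code for the real structs):

```go
import "github.com/tal-tech/go-zero/core/metric"

var (
	// CounterVec: monotonic counting, e.g. QPS
	apiRequests = metric.NewCounterVec(&metric.CounterVecOpts{
		Namespace: "api",
		Name:      "requests_total",
		Help:      "total requests per path",
		Labels:    []string{"path"},
	})

	// GaugeVec: a value that can go up and down
	queueDepth = metric.NewGaugeVec(&metric.GaugeVecOpts{
		Namespace: "api",
		Name:      "queue_depth",
		Help:      "current queue depth",
		Labels:    []string{"queue"},
	})

	// HistogramVec: value distribution over buckets
	reqDur = metric.NewHistogramVec(&metric.HistogramVecOpts{
		Namespace: "api",
		Name:      "duration_ms",
		Help:      "request duration in milliseconds",
		Labels:    []string{"path"},
		Buckets:   []float64{5, 10, 25, 50, 100, 250, 500, 1000},
	})
)

func report() {
	apiRequests.Inc("/api/users")    // counter +1
	queueDepth.Add(3, "default")     // gauge +3 (could be negative)
	reqDur.Observe(37, "/api/users") // lands in the le=50 bucket
}
```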
Take `HistogramVec.Observe()` as an example. Each HistogramVec statistic above actually produces three kinds of series:

- `_count`: the number of observations
- `_sum`: the sum of all observed values
- `_bucket{le=a1}`: the number of observations falling in `[-inf, a1]`

So we can guess that three kinds of data are maintained during collection:
```go
// Almost all Prometheus statistics are done with atomic CAS,
// which performs better than taking a Mutex.
func (h *histogram) observe(v float64, bucket int) {
	n := atomic.AddUint64(&h.countAndHotIdx, 1)
	hotCounts := h.counts[n>>63]

	if bucket < len(h.upperBounds) {
		// the bucket that v falls into: +1
		atomic.AddUint64(&hotCounts.buckets[bucket], 1)
	}
	for {
		oldBits := atomic.LoadUint64(&hotCounts.sumBits)
		newBits := math.Float64bits(math.Float64frombits(oldBits) + v)
		// sum += v
		if atomic.CompareAndSwapUint64(&hotCounts.sumBits, oldBits, newBits) {
			break
		}
	}
	// count += 1
	atomic.AddUint64(&hotCounts.count, 1)
}
```
So when developers want to define their own monitoring metrics:

- When generating API code with `goctl`, specify the middleware to generate: zeromicro.github.io/go-zero/mid…
- Write the metric-collecting logic in the generated middleware file (see the sketch after this list)
- Alternatively, write the metric logic directly in your business logic; the metric usage is the same as above.
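Putting it together, a sketch of what the metric logic in a goctl-generated middleware might look like (the metric, middleware name, and header are all hypothetical):

```go
package middleware

import (
	"net/http"

	"github.com/tal-tech/go-zero/core/metric"
)

// hypothetical business metric: orders created, by channel
var metricOrderCreated = metric.NewCounterVec(&metric.CounterVecOpts{
	Namespace: "order",
	Name:      "created_total",
	Help:      "orders created per channel",
	Labels:    []string{"channel"},
})

type OrderMetricMiddleware struct{}

func NewOrderMetricMiddleware() *OrderMetricMiddleware {
	return &OrderMetricMiddleware{}
}

func (m *OrderMetricMiddleware) Handle(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// count the request before handing off to business logic
		metricOrderCreated.Inc(r.Header.Get("X-Channel"))
		next(w, r)
	}
}
```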
The above covers the HTTP side; the RPC side is similar, and you can find its design in the interceptor code.
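For a taste of the RPC side, a minimal sketch of a unary server interceptor doing the same kind of reporting (this is my own illustration, not go-zero's actual interceptor; rpcServerReqDur and rpcServerReqCodeTotal are assumed to be metric vectors defined like the HTTP ones above):

```go
import (
	"context"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/status"
)

func UnaryMetricInterceptor(ctx context.Context, req interface{},
	info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
	startTime := time.Now()
	// run the actual RPC method
	resp, err := handler(ctx, req)
	// report duration and grpc status code per method, mirroring the HTTP middleware
	rpcServerReqDur.Observe(time.Since(startTime).Milliseconds(), info.FullMethod)
	rpcServerReqCodeTotal.Inc(info.FullMethod, status.Code(err).String())
	return resp, err
}
```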
Conclusion
This article analyzed how go-zero collects service monitoring metrics. Beyond that, Prometheus can also monitor infrastructure by deploying the corresponding exporters.
The project address
github.com/tal-tech/go…
You are welcome to use go-zero, and star it to support us!