1. Introduction
We usually introduce message middleware to decouple services and to smooth traffic spikes (peak shaving and load leveling). But does the decoupling actually achieve the expected effect, and how do we measure it? To answer that, this article uses Prometheus + Grafana to record metrics and view the results in real time.
First, a look at the end result:
What does this monitoring give us?
1) Monitor MQ consumption and see the actual peak-shaving effect once the business is decoupled;
2) Provide data to support a sensible setting for the number of consumer threads;
3) Show the distribution of processing times, providing data to support optimizing the business code.
Note: please credit the source when reprinting!
2. Prometheus basics
Prometheus is a next-generation, cloud-native monitoring system. Combined with Grafana dashboards, it is well suited to monitoring databases and the running state of servers and services.
Metric types: Counter, Gauge, Summary, Histogram
Counter
Introduction:
A counter that only increases. Common use cases: machine uptime, total number of service requests.
Display format:
# HELP mq_message_total Current TPS.
# TYPE mq_message_total counter
mq_message_total{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_total{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 531.0
Gauge
Introduction:
A value that can increase or decrease. Common use cases: the number of currently running threads, the memory in use by a running service.
Display format:
# HELP mq_message_working_threads Number of worker threads executing
# TYPE mq_message_working_threads gauge
mq_message_working_threads{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 0.0
mq_message_working_threads{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 1.0
Summary
Introduction:
Records a running total of observed values (sum) and the number of observations (count). Common use cases: distribution of task processing times, average CPU usage.
Display format:
# HELP mq_message_deal_time Time spent to process each message
# TYPE mq_message_deal_time summary
mq_message_deal_time_count{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_deal_time_sum{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 7093.0
mq_message_deal_time_count{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 531.0
mq_message_deal_time_sum{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 13572.0
Histogram
Introduction:
Generally used to analyze how data is distributed; it records the total (sum), the count (count), and a count per bucket.
Display format:
# HELP mq_message_deal_time_histogram Processing duration grouping
# TYPE mq_message_deal_time_histogram histogram
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.005",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.01",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.025",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.05",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.075",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.1",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.25",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.5",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.75",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0",} 4.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="2.5",} 11.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="5.0",} 24.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="7.5",} 37.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="10.0",} 55.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="+Inf",} 274.0
mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_deal_time_histogram_sum{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 7093.0
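To make the four types concrete, here is a minimal, self-contained sketch using the io.prometheus.client simpleclient (the same Java client the plug-in in section 4 builds on). Everything in it is illustrative: the metric names and values are not part of this article's plug-in.
MetricTypesDemo.java
import io.prometheus.client.Counter;
import io.prometheus.client.Gauge;
import io.prometheus.client.Histogram;
import io.prometheus.client.Summary;

public class MetricTypesDemo {
    // Counter: only ever increases
    static final Counter REQUESTS = Counter.build()
            .name("demo_requests_total").help("Total requests.").register();
    // Gauge: can go up and down
    static final Gauge IN_FLIGHT = Gauge.build()
            .name("demo_in_flight").help("Requests currently being handled.").register();
    // Summary: tracks sum and count of observed values
    static final Summary LATENCY = Summary.build()
            .name("demo_latency_seconds").help("Request latency.").register();
    // Histogram: counts observations into predefined le buckets
    static final Histogram LATENCY_HISTOGRAM = Histogram.build()
            .name("demo_latency_seconds_histogram")
            .buckets(0.1, 0.5, 1, 2.5, 5, 10) // bucket boundaries must be chosen up front
            .help("Request latency distribution.").register();

    public static void main(String[] args) throws InterruptedException {
        IN_FLIGHT.inc();                          // one request in flight
        Summary.Timer timer = LATENCY.startTimer();
        Thread.sleep(42);                         // simulated work
        double seconds = timer.observeDuration(); // feeds _sum and _count
        LATENCY_HISTOGRAM.observe(seconds);       // feeds the matching le bucket
        REQUESTS.inc();
        IN_FLIGHT.dec();
    }
}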
3. Setting up Prometheus + Grafana
For installing and starting Prometheus + Grafana, see the separate setup guide linked in the original post.
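Once both services are up, Prometheus still needs a scrape job pointing at the application's metrics endpoint. Below is a minimal sketch of the relevant prometheus.yml fragment; the job name, target address, and interval are assumptions, not taken from the setup guide:
prometheus.yml
scrape_configs:
  - job_name: 'yiyi-example-prometheus'   # assumed job name
    metrics_path: '/actuator/prometheus'  # Spring Boot Actuator endpoint
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8080']       # assumed application address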
4. Writing the MQ consumption monitoring plug-in
Here we record MQ consumption performance metrics around a small example requirement.
Requirements:
- MQ consumer: second-level TPS statistics for consumed messages;
- MQ consumer: second-level count of active worker threads;
- MQ consumer: second-level average message processing time;
- MQ consumer: second-level distribution of message processing times.
4.1 Writing the Spring Boot plug-in
pom.xml
<!-- =========================== prometheus =========================== -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
<version>1.1.4</version>
</dependency>
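One more piece of configuration is typically required and is not shown above: Spring Boot Actuator only serves the Prometheus endpoint if it is included in the web exposure list. A sketch for application.properties:
application.properties
# Expose the Prometheus scrape endpoint (served at /actuator/prometheus)
management.endpoints.web.exposure.include=prometheus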
Core processing class
JobMetrics.java
@Component
public class JobMetrics {
@Value("${spring.application.name}")
private String application;
String[] labelNameArray = {"application", "method", "topic", "group"};
/**
 * Counts MQ message TPS.
 * Labels: application name, method, topic, group (Prometheus adds the instance IP at scrape time).
 * TPS query: increase(mq_message_total{topic="EAT_FINISH"}[1m]) / 60
 */
private final Counter tps = Counter.build()
.name("mq_message_total")
.labelNames(labelNameArray)
.help("The current TPS.")
.create();
/**
 * Average processing time of a single message:
 * sum(rate(mq_message_deal_time_sum{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
 * /
 * sum(rate(mq_message_deal_time_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
 * The per-message delay metric below follows the same principle.
 */
private final Summary dealTimeSummary = Summary.build()
.name("mq_message_deal_time")
.labelNames(labelNameArray)
.help("Processing time per message")
.create();
private final Summary delayTimeSummary = Summary.build()
.name("mq_message_delay_time")
.labelNames(labelNameArray)
.help("Per-message processing delay")
.create();
/** Number of worker threads currently executing */
private final Gauge workingThreads = Gauge.build()
.name("mq_message_working_threads")
.labelNames(labelNameArray)
.help("Number of working threads executing")
.create();
/**
 * Groups processing durations into buckets.
 *
 * To use this type you need to know the le bucket boundaries in advance.
 * The following PromQL renders each group as a percentage:
 *
 * ============ [left: <= 1.0] ================
 * sum(rate(mq_message_deal_time_histogram_bucket{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"
 * }[1m]))
 * /
 * sum(rate(mq_message_deal_time_histogram_count{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
 * }[1m]))
 *
 * ============ [middle: 1.0 < t <= 2.5] ================
 * (sum(rate(mq_message_deal_time_histogram_bucket{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="2.5"
 * }[1m]))
 * - sum(rate(mq_message_deal_time_histogram_bucket{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"
 * }[1m])))
 * /
 * sum(rate(mq_message_deal_time_histogram_count{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
 * }[1m]))
 *
 * ============ [right: > 10.0] ================
 * (sum(rate(mq_message_deal_time_histogram_count{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
 * }[1m]))
 * - sum(rate(mq_message_deal_time_histogram_bucket{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="10.0"
 * }[1m])))
 * /
 * sum(rate(mq_message_deal_time_histogram_count{
 *   application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
 * }[1m]))
 */
private final Histogram dealTimeHistogram = Histogram.build()
.name("mq_message_deal_time_histogram")
.labelNames(labelNameArray)
.help("Processing Duration Grouping")
.create();
/** Simulates message handling. In real use this sits in the MQ consumer; it can be wired in via AOP or integrated with the company's own consumer framework. */
void handleRequest(String method){
String[] labelArray = {application, method, "EAT_FINISH", "EAT_FINISH_GROUP"}; // label values; uses the injected application name
// One more worker thread is now busy
workingThreads.labels(labelArray).inc();
// Simulated message born time, 1-5 ms before now
long bornTime = System.currentTimeMillis() - (new Random().nextInt(5) + 1);
int dealTime = new Random().nextInt(50) + 1;
try {
TimeUnit.MILLISECONDS.sleep(dealTime);
} catch (InterruptedException e) {
// Restore the interrupt flag instead of swallowing the exception
Thread.currentThread().interrupt();
e.printStackTrace();
}
long delayTime = System.currentTimeMillis() - bornTime;
tps.labels(labelArray).inc();
dealTimeSummary.labels(labelArray).observe(dealTime);
delayTimeSummary.labels(labelArray).observe(delayTime);
workingThreads.labels(labelArray).dec();
dealTimeHistogram.labels(labelArray).observe(dealTime);
}
@Autowired
public JobMetrics(PrometheusMeterRegistry meterRegistry) {
CollectorRegistry prometheusRegistry = meterRegistry.getPrometheusRegistry();
prometheusRegistry.register(tps); // TPS
prometheusRegistry.register(dealTimeSummary); // Task execution time
prometheusRegistry.register(delayTimeSummary); // Duration of the task lifecycle
prometheusRegistry.register(workingThreads); // Number of worker threads
prometheusRegistry.register(dealTimeHistogram); // Processing-duration buckets
}
}
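Once the application is running, every metric registered above is served through the Actuator Prometheus endpoint (by default /actuator/prometheus in Spring Boot 2.x); that URL is what Prometheus scrapes.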
MyJob.java
Mock user request
/** Simulates two machines processing messages */
@Component
@EnableScheduling
public class MyJob {
@Autowired
private JobMetrics jobMetrics;
@Async("main")
@Scheduled(fixedDelay = 500)
public void tpsRequestHandle1() {
jobMetrics.handleRequest("save");
}
@Async("main")
@Scheduled(fixedDelay = 1000)
public void tpsRequestHandle2() {
jobMetrics.handleRequest("update");
}
}
4.2 Grafana panel configuration
Number of working threads
promQL
mq_message_working_threads{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}
Result:
TPS statistics for consumed messages
promQL
rate(mq_message_total{topic="EAT_FINISH"}[1m])
Result:
Average message processing time
promQL
sum(rate(mq_message_deal_time_sum{application="yiyi-example-prometheus", topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
/
sum(rate(mq_message_deal_time_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
Result:
Distribution of message processing times
promQL
============ [left: fraction of messages taking <= 1.0] ================
sum(rate(mq_message_deal_time_histogram_bucket{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"
}[1m]))
/
sum(rate(mq_message_deal_time_histogram_count{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
}[1m]))

============ [middle: 1.0 < t <= 2.5; configure one such query per interval] ================
(sum(rate(mq_message_deal_time_histogram_bucket{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="2.5"
}[1m]))
- sum(rate(mq_message_deal_time_histogram_bucket{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"
}[1m])))
/
sum(rate(mq_message_deal_time_histogram_count{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
}[1m]))

============ [right: fraction of messages taking > 10.0] ================
(sum(rate(mq_message_deal_time_histogram_count{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
}[1m]))
- sum(rate(mq_message_deal_time_histogram_bucket{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="10.0"
}[1m])))
/
sum(rate(mq_message_deal_time_histogram_count{
  application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
}[1m]))
Result:
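As a side note, not covered in the original post: the same histogram buckets can also drive a percentile panel, for example an approximate 95th-percentile processing time:
promQL
histogram_quantile(0.95,
  sum by (le) (rate(mq_message_deal_time_histogram_bucket{
    application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"
  }[1m])))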
5. References
- prometheus-book
- Source code: GitHub