1. Introduction

Message middleware is commonly used to decouple services and to smooth traffic spikes (peak shaving and valley filling). But does the decoupling actually deliver the expected effect, and how do we measure it? To answer that, this article uses Prometheus + Grafana to record metrics and view the results in real time.

First, a look at the end result:

What does this setup do?

1) Monitor MQ consumption and see the actual peak-shaving and valley-filling effect after business decoupling;

2) Provide data to support a sensible configuration of the number of consumer threads;

3) Provide the distribution of consumption durations as data support for optimizing business code.

Note: please credit the source when reposting!

2. Basics

Prometheus is a next-generation cloud-native monitoring system. Combined with Grafana dashboards, it works well for monitoring the running state of databases and servers.

Metric types: Counter, Gauge, Summary, Histogram

counter

Introduction:

A monotonically increasing counter. Common usage scenarios: machine uptime, total number of service requests.

Display format:

# HELP mq_message_total Current TPS.
# TYPE mq_message_total counter
mq_message_total{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_total{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 531.0
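For orientation, here is a minimal sketch (not from the original) of declaring and incrementing such a counter with the Prometheus Java simpleclient; the full version appears in JobMetrics.java below:

import io.prometheus.client.Counter;

// Declare once; register() attaches it to the default CollectorRegistry.
static final Counter MESSAGE_TOTAL = Counter.build()
        .name("mq_message_total")
        .help("Current TPS.")
        .labelNames("application", "method", "topic", "group")
        .register();

// Increment on every consumed message:
// MESSAGE_TOTAL.labels("yiyi-example-prometheus", "save", "EAT_FINISH", "EAT_FINISH_GROUP").inc();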

gauge

Introduction:

A counter that can go up or down. Common usage scenarios: number of threads currently running, memory used by a running service.

Display format:

# HELP mq_message_working_threads Number of worker threads executing
# TYPE mq_message_working_threads gauge
mq_message_working_threads{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 0.0
mq_message_working_threads{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 1.0
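A minimal sketch (names and labels are illustrative) of the inc()/dec() pairing that keeps a gauge tracking in-flight work:

import io.prometheus.client.Gauge;

static final Gauge WORKING_THREADS = Gauge.build()
        .name("mq_message_working_threads")
        .help("Number of worker threads executing")
        .labelNames("application", "method", "topic", "group")
        .register();

void consume(Runnable handler) {
    WORKING_THREADS.labels("app", "save", "EAT_FINISH", "EAT_FINISH_GROUP").inc(); // thread starts working
    try {
        handler.run();
    } finally {
        WORKING_THREADS.labels("app", "save", "EAT_FINISH", "EAT_FINISH_GROUP").dec(); // decremented even on failure
    }
}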

summary

Introduction:

Records a running total (sum) and a count of observations (count). Common usage scenarios: task-duration distributions, average CPU usage.

Display format:

# HELP mq_message_deal_time Time spent to process each message
# TYPE mq_message_deal_time summary
mq_message_deal_time_count{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_deal_time_sum{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 7093.0
mq_message_deal_time_count{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 531.0
mq_message_deal_time_sum{application="yiyi-example-prometheus",method="save",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 13572.0
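A minimal sketch (illustrative names): every observe() adds the observed value to _sum and 1 to _count, which is exactly what the average-duration query in section 4.2 divides:

import io.prometheus.client.Summary;

static final Summary DEAL_TIME = Summary.build()
        .name("mq_message_deal_time")
        .help("Time spent to process each message")
        .labelNames("application", "method", "topic", "group")
        .register();

// After handling one message that took elapsedMillis:
// DEAL_TIME.labels("app", "update", "EAT_FINISH", "EAT_FINISH_GROUP").observe(elapsedMillis);
// Average over 1m = rate(mq_message_deal_time_sum[1m]) / rate(mq_message_deal_time_count[1m])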

histogram

Introduction:

Generally used to analyze how data is distributed; records a total (sum), a count (count), and the distribution across buckets.

Display format:

# HELP mq_message_deal_time_histogram Processing duration grouping
# TYPE mq_message_deal_time_histogram histogram
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.005",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.01",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.025",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.05",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.075",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.1",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.25",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.5",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="0.75",} 0.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0",} 4.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="2.5",} 11.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="5.0",} 24.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="7.5",} 37.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="10.0",} 55.0
mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="+Inf",} 274.0
mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 274.0
mq_message_deal_time_histogram_sum{application="yiyi-example-prometheus",method="update",topic="EAT_FINISH",group="EAT_FINISH_GROUP",} 7093.0
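A minimal sketch (bucket values are assumptions): unlike a summary, a histogram's bucket boundaries (the le labels above) must be fixed at declaration time, so choose them to match your expected latency range:

import io.prometheus.client.Histogram;

static final Histogram DEAL_TIME_HIST = Histogram.build()
        .name("mq_message_deal_time_histogram")
        .help("Processing duration grouping")
        .labelNames("application", "method", "topic", "group")
        // Omitting buckets() gives the defaults (.005 .. 10) seen in the output above.
        .buckets(1.0, 2.5, 5.0, 7.5, 10.0) // le="1.0", "2.5", "5.0", "7.5", "10.0", plus "+Inf"
        .register();

// DEAL_TIME_HIST.labels("app", "update", "EAT_FINISH", "EAT_FINISH_GROUP").observe(elapsedMillis);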

3. Setting up Prometheus + Grafana

For starting up Prometheus + Grafana, see the setup guide linked in the original post.
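As a minimal sketch, assuming the Spring Boot app from section 4 runs on localhost:8080 and exposes metrics at /actuator/prometheus, the scrape job in prometheus.yml could look like:

scrape_configs:
  - job_name: 'yiyi-example-prometheus'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8080']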

4. Writing the MQ consumption monitoring plug-in

Here we record MQ consumption performance metrics against a small set of requirements.

Requirements (all at second-level granularity):

  • MQ consumer: TPS of consumed messages;
  • MQ consumer: number of active worker threads;
  • MQ consumer: average message processing time;
  • MQ consumer: distribution of message processing times.

4.1 Writing the Spring Boot plug-in

pom.xml

<!-- =========================== prometheus =========================== -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.1.4</version>
</dependency>
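Note that Spring Boot does not expose the actuator's Prometheus endpoint over HTTP by default; a minimal application.yml sketch using the standard actuator properties (values assumed):

spring:
  application:
    name: yiyi-example-prometheus

management:
  endpoints:
    web:
      exposure:
        include: prometheus   # serves /actuator/prometheus for Prometheus to scrape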

Core processing class

JobMetrics.java

import java.util.Random;
import java.util.concurrent.TimeUnit;

import io.micrometer.prometheus.PrometheusMeterRegistry;
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Counter;
import io.prometheus.client.Gauge;
import io.prometheus.client.Histogram;
import io.prometheus.client.Summary;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class JobMetrics {

    @Value("${spring.application.name}")
    private String application;

    private final String[] labelNameArray = {"application", "method", "topic", "group"};

    /**
     * Counts MQ message TPS.
     * Labels: (application name, method, topic, group).
     * TPS query: increase(mq_message_total{topic="EAT_FINISH"}[1m]) / 60
     */
    private final Counter tps = Counter.build()
            .name("mq_message_total")
            .labelNames(labelNameArray)
            .help("The current TPS.")
            .create();

    /**
     * Average processing time of a single message:
     * sum(rate(mq_message_deal_time_sum{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
     * /
     * sum(rate(mq_message_deal_time_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
     * Per-message delay (mq_message_delay_time): same idea.
     */
    private final Summary dealTimeSummary = Summary.build()
            .name("mq_message_deal_time")
            .labelNames(labelNameArray)
            .help("Processing time per message")
            .create();

    private final Summary delayTimeSummary = Summary.build()
            .name("mq_message_delay_time")
            .labelNames(labelNameArray)
            .help("Per-message processing delay")
            .create();

    /** Number of worker threads currently executing. */
    private final Gauge workingThreads = Gauge.build()
            .name("mq_message_working_threads")
            .labelNames(labelNameArray)
            .help("Number of working threads executing")
            .create();

    /**
     * Processing-duration grouping.
     * To use this type, the le bucket labels must be known in advance.
     * The share of each duration range is computed with the PromQL in section 4.2:
     * left   (duration <= 1.0):        rate of bucket{le="1.0"} / rate of _count
     * middle (duration in (1.0, 2.5]): (rate of bucket{le="2.5"} - rate of bucket{le="1.0"}) / rate of _count
     * right  (duration > 10.0):        (rate of _count - rate of bucket{le="10.0"}) / rate of _count
     */
    private final Histogram dealTimeHistogram = Histogram.build()
            .name("mq_message_deal_time_histogram")
            .labelNames(labelNameArray)
            .help("Processing duration grouping")
            .create();

    /**
     * Simulates message handling. In a real MQ consumer this would wrap the
     * actual handler, e.g. via AOP, adapted to your own framework.
     */
    void handleRequest(String method) {

        String[] labelArray = {application, method, "EAT_FINISH", "EAT_FINISH_GROUP"};

        // One more thread is working on a message
        workingThreads.labels(labelArray).inc();

        // Simulated "born" timestamp: the message was produced 1-5 ms ago
        long bornTime = System.currentTimeMillis() - (new Random().nextInt(5) + 1);

        // Simulated processing time of 1-50 ms
        int dealTime = new Random().nextInt(50) + 1;
        try {
            TimeUnit.MILLISECONDS.sleep(dealTime);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long delayTime = System.currentTimeMillis() - bornTime;

        tps.labels(labelArray).inc();
        dealTimeSummary.labels(labelArray).observe(dealTime);
        delayTimeSummary.labels(labelArray).observe(delayTime);
        workingThreads.labels(labelArray).dec();
        dealTimeHistogram.labels(labelArray).observe(dealTime);
    }

    @Autowired
    public JobMetrics(PrometheusMeterRegistry meterRegistry) {
        CollectorRegistry prometheusRegistry = meterRegistry.getPrometheusRegistry();
        prometheusRegistry.register(tps);               // TPS
        prometheusRegistry.register(dealTimeSummary);   // task execution time
        prometheusRegistry.register(delayTimeSummary);  // task lifecycle duration
        prometheusRegistry.register(workingThreads);    // number of worker threads
        prometheusRegistry.register(dealTimeHistogram); // processing duration grouping
    }
}

MyJob.java

Mock user request

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

/**
 * Simulates two machines processing messages.
 */
@Component
@EnableScheduling
public class MyJob {

    @Autowired
    private JobMetrics jobMetrics;

    @Async("main")
    @Scheduled(fixedDelay = 500)
    public void tpsRequestHandle1() {
        jobMetrics.handleRequest("save");
    }

    @Async("main")
    @Scheduled(fixedDelay = 1000)
    public void tpsRequestHandle2() {
        jobMetrics.handleRequest("update");
    }
}

4.2 Grafana display configuration

Number of working threads

promQL

mq_message_working_threads{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}

[Screenshot: resulting Grafana panel]

TPS statistics for consumed messages

promQL

rate(mq_message_total{topic="EAT_FINISH"}[1m])

[Screenshot: resulting Grafana panel]

Average message processing time

promQL

sum(rate(mq_message_deal_time_sum{application="yiyi-example-prometheus", topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
/
sum(rate(mq_message_deal_time_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))

[Screenshot: resulting Grafana panel]

Distribution of message processing time

promQL

============ [left: share with duration <= 1.0] ============
sum(rate(mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"}[1m]))
/
sum(rate(mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))

============ [middle: share with duration in (1.0, 2.5]; configure one such query per bucket range] ============
(
  sum(rate(mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="2.5"}[1m]))
  -
  sum(rate(mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="1.0"}[1m]))
)
/
sum(rate(mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))

============ [right: share with duration > 10.0] ============
(
  sum(rate(mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))
  -
  sum(rate(mq_message_deal_time_histogram_bucket{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP",le="10.0"}[1m]))
)
/
sum(rate(mq_message_deal_time_histogram_count{application="yiyi-example-prometheus",topic="EAT_FINISH",group="EAT_FINISH_GROUP"}[1m]))

[Screenshot: resulting Grafana panel]

References

  • prometheus-book
  • Source address – Github