There are three things in this world that people can’t take away: the first is the food in the stomach, the second is the dream in the heart, and the third is the book in the brain
Micrometer in practice: a metrics framework for JVM applications
The premise
Spring Boot Actuator gathers metrics, Prometheus collects and stores them, and Grafana displays them; together they monitor machine performance and business data in the production environment. This kind of instrumentation is commonly called "event tracking" (literally, "burying points"). The metrics API in Spring Boot is provided by spring-boot-actuator, which integrates Micrometer; the official website is micrometer.io. In practice, business developers tend to overuse Micrometer's Counter type and fall back to plain counting in every situation. This article looks at the purpose and typical use cases of the other meter types that Micrometer provides.
The metrics library provided by Micrometer
Meter is the set of interfaces used to collect measurements from an application. The word does not translate neatly, so this article simply uses Meter. Meters are created and held by a MeterRegistry, which can be thought of as both the factory and the cache for Meters; in general, every JVM application that uses Micrometer must create a concrete MeterRegistry implementation. The Meter types in Micrometer are Timer, Counter, Gauge, DistributionSummary, LongTaskTimer, FunctionCounter, FunctionTimer, and TimeGauge. The following sections describe how each type is used and where it fits. A Meter is uniquely identified by its name together with its tags (Tag here means the Tag interface provided by Micrometer). The advantage of this scheme is that the name identifies what is measured, while different tags split the data into dimensions for analysis.
MeterRegistry
MeterRegistry is an abstract class in Micrometer. Its main implementations include:
- SimpleMeterRegistry: holds the latest value of each Meter in the application's memory; the data is not published to any other system.
- CompositeMeterRegistry: aggregates multiple MeterRegistry instances, maintaining an internal list of them.
- Global MeterRegistry: the factory class io.micrometer.core.instrument.Metrics holds a static final CompositeMeterRegistry instance named globalRegistry.
Of course, users can also extend MeterRegistry to implement a custom registry. SimpleMeterRegistry is suitable for debugging and can be used as follows:
MeterRegistry registry = new SimpleMeterRegistry();
Counter counter = registry.counter("counter");
counter.increment();
When a CompositeMeterRegistry is first created, its internal list of registries is empty. Operations on a Meter created from it at that point have no effect:
CompositeMeterRegistry composite = new CompositeMeterRegistry();
Counter compositeCounter = composite.counter("counter");
compositeCounter.increment(); // <- a no-op: nothing is recorded, and no error is reported
SimpleMeterRegistry simple = new SimpleMeterRegistry();
composite.add(simple); // <- add the SimpleMeterRegistry to the CompositeMeterRegistry
compositeCounter.increment(); // <- now the count is recorded
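The no-op behaviour described above can be modelled with plain Java. The following is a minimal sketch, not Micrometer's implementation: the class names CompositeCounterSketch and SimpleCounter are hypothetical, and the sketch only assumes that a composite counter forwards each increment to whatever child counters exist at that moment.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a composite counter; NOT Micrometer's CompositeMeterRegistry.
final class CompositeCounterSketch {

    // A child counter that actually stores a value.
    static final class SimpleCounter {
        private double count;

        void increment() {
            count++;
        }

        double count() {
            return count;
        }
    }

    private final List<SimpleCounter> children = new ArrayList<>();

    void add(SimpleCounter child) {
        children.add(child);
    }

    // Fans the increment out to every child; with no children nothing is recorded.
    void increment() {
        for (SimpleCounter child : children) {
            child.increment();
        }
    }

    // Reads from the first child, or 0 when there is none.
    double count() {
        return children.isEmpty() ? 0.0 : children.get(0).count();
    }

    public static void main(String[] args) {
        CompositeCounterSketch composite = new CompositeCounterSketch();
        composite.increment();                 // lost: there is no child yet
        System.out.println(composite.count());
        composite.add(new SimpleCounter());
        composite.increment();                 // recorded by the child
        System.out.println(composite.count());
    }
}
```

The point of the sketch is that the first increment is silently dropped rather than buffered, which is why a concrete registry should be added before any Meter is used.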
The global MeterRegistry is the most convenient to use, because you only need the static methods of the Metrics factory class:
Metrics.addRegistry(new SimpleMeterRegistry());
Counter counter = Metrics.counter("counter", "tag-1", "tag-2");
counter.increment();
Tag and Meter naming
In Micrometer, Meter names are by convention lowercase words separated by a period (dot). Different monitoring systems, however, have their own naming conventions, and if the names do not match what a backend expects, metrics can break when you migrate or switch monitoring systems. Micrometer therefore translates its dot-separated names through the NamingConvention interface, which adapts names to each monitoring system and removes characters from names and tags that the target system does not allow. Developers can also supply their own NamingConvention to customize the translation: registry.config().namingConvention(myCustomNamingConvention);. Micrometer ships default translations for the mainstream monitoring and storage systems. Take the following Timer, for example:
MeterRegistry registry = ...
registry.timer("http.server.requests");
For each monitoring or storage system, the name is translated automatically as follows:
- Prometheus: http_server_requests_duration_seconds
- Atlas: httpServerRequests
- Graphite: http.server.requests
- InfluxDB: http_server_requests
NamingConvention already provides five default conversion rules: dot, snakeCase, camelCase, upperCamelCase, and slashes.
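To make the translation concrete, here is a minimal sketch of three such conversions written with the JDK only. It is a simplified model and not Micrometer's NamingConvention code: the class NamingSketch and its method names are hypothetical, and backend-specific details such as the _duration_seconds suffix that Prometheus adds to timers are deliberately left out.

```java
// Simplified model of name translation; NOT Micrometer's NamingConvention implementation.
final class NamingSketch {

    // "http.server.requests" -> "http_server_requests"
    static String snakeCase(String dotted) {
        return String.join("_", dotted.split("\\."));
    }

    // "http.server.requests" -> "httpServerRequests"
    static String camelCase(String dotted) {
        String[] words = dotted.split("\\.");
        StringBuilder sb = new StringBuilder(words[0]);
        for (int i = 1; i < words.length; i++) {
            if (words[i].isEmpty()) {
                continue; // skip empty segments such as the one in "a..b"
            }
            sb.append(Character.toUpperCase(words[i].charAt(0)))
              .append(words[i].substring(1));
        }
        return sb.toString();
    }

    // "http.server.requests" -> "http/server/requests"
    static String slashes(String dotted) {
        return String.join("/", dotted.split("\\."));
    }

    public static void main(String[] args) {
        String name = "http.server.requests";
        System.out.println(snakeCase(name));
        System.out.println(camelCase(name));
        System.out.println(slashes(name));
    }
}
```

Keeping the dot-separated form in application code and translating only at the registry boundary is what allows the same instrumentation to be pointed at a different backend later.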
Tags are another important feature of Micrometer. Strictly speaking, a metrics framework can only collect truly multi-dimensional data once it supports tags. Tag names should be meaningful, so that the dimension or type being measured can be inferred from the name. Suppose we need to count database calls and HTTP requests; the generally recommended approach is:
MeterRegistry registry = ...
registry.counter("database.calls", "db", "users");
registry.counter("http.requests", "uri", "/api/users");
With this scheme, selecting the counter named "database.calls" and then grouping by the "db" tag shows how much each database (for example "users") contributes to the total number of calls. A counter-example is the following:
MeterRegistry registry = ...
registry.counter("calls", "class", "database", "db", "users");
registry.counter("calls", "class", "http", "uri", "/api/users");
A counter named "calls" mixes database calls and HTTP requests, so its tags cannot be grouped consistently and the resulting time series is essentially meaningless. Common tags can also be defined: they are attached to every Meter created from the same MeterRegistry.
MeterRegistry registry = ...
registry.config().commonTags("stack", "prod", "region", "us-east-1");
// Equivalent to the line above
registry.config().commonTags(Arrays.asList(Tag.of("stack", "prod"), Tag.of("region", "us-east-1")));
With common tags like these, you can analyze the data in depth along dimensions of the runtime environment such as host, instance, region, and stack.
Two more things to note:
- A tag value must not be null.
- In Micrometer, tags must be supplied in pairs, so the number of tag arguments must be even. Each tag is a Key=Value pair, as the io.micrometer.core.instrument.Tag interface shows:
public interface Tag extends Comparable<Tag> {
String getKey();
String getValue();
static Tag of(String key, String value) {
return new ImmutableTag(key, value);
}
default int compareTo(Tag o) {
return this.getKey().compareTo(o.getKey());
}
}
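The two rules above (an even number of arguments, no null values) can be illustrated with a small JDK-only helper. This is a sketch under stated assumptions: TagPairsSketch, pairs, and rejects are hypothetical names, and the exception type is a choice made for the sketch rather than a statement about what Micrometer throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it models the key/value pairing rule only.
final class TagPairsSketch {

    // Turns ("k1", "v1", "k2", "v2") into ["k1=v1", "k2=v2"].
    static List<String> pairs(String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("tags must be given as key/value pairs");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            String key = keyValues[i];
            String value = keyValues[i + 1];
            if (key == null || value == null) {
                throw new IllegalArgumentException("tag key and value must not be null");
            }
            result.add(key + "=" + value);
        }
        return result;
    }

    // Returns true when the arguments violate one of the two rules.
    static boolean rejects(String... keyValues) {
        try {
            pairs(keyValues);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pairs("stack", "prod", "region", "us-east-1"));
        System.out.println(rejects("stack", "prod", "region")); // odd number of arguments
    }
}
```

Validating the pairs at the point where a Meter is created surfaces a forgotten value immediately instead of producing a misaligned tag set.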
MeterFilter can be used when you need to drop certain tags, or to allow or deny Meters by name. MeterFilter provides a series of static factory methods, and multiple filters can be stacked into a chain to build the final filtering strategy. For example:
MeterRegistry registry = ...
registry.config()
.meterFilter(MeterFilter.ignoreTags("http"))
.meterFilter(MeterFilter.denyNameStartsWith("jvm"));
This ignores the tag named "http" and denies Meters whose names start with "jvm". See the MeterFilter class for more options.
Combining Meter names and tags, with the name as the axis and the tags as the dimensions, enriches the measurement data and makes statistics and analysis easier.
Meters
The Meter types mentioned above are Timer, Counter, Gauge, DistributionSummary, LongTaskTimer, FunctionCounter, FunctionTimer, and TimeGauge. Each is described below, together with my understanding of where it fits in real (that is, production) scenarios.
Counter
Counter is the simplest Meter: a single-valued metric. The Counter interface lets you increment by a fixed amount, which must be positive. To be precise, a Counter is a single value that only ever increases. Here is a very simple example:
MeterRegistry meterRegistry = new SimpleMeterRegistry();
Counter counter = meterRegistry.counter("http.request", "createOrder", "/order/create");
counter.increment();
System.out.println(counter.measure()); // [Measurement{statistic='COUNT', value=1.0}]
Usage Scenarios:
A Counter records a running total, so it suits statistics that only grow, such as the number of orders placed, the number of payments, or the total number of HTTP requests. Tags distinguish the different scenarios: for orders, tags can mark the business channel or the date; for HTTP requests, tags can distinguish URLs. Taking order creation as an example:
// Entity
@Data
public class Order {
private String orderId;
private Integer amount;
private String channel;
private LocalDateTime createTime;
}
public class CounterMain {
private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd");
static {
Metrics.addRegistry(new SimpleMeterRegistry());
}
public static void main(String[] args) throws Exception {
Order order1 = new Order();
order1.setOrderId("ORDER_ID_1");
order1.setAmount(100);
order1.setChannel("CHANNEL_A");
order1.setCreateTime(LocalDateTime.now());
createOrder(order1);
Order order2 = new Order();
order2.setOrderId("ORDER_ID_2");
order2.setAmount(200);
order2.setChannel("CHANNEL_B");
order2.setCreateTime(LocalDateTime.now());
createOrder(order2);
Search.in(Metrics.globalRegistry).meters().forEach(each -> {
StringBuilder builder = new StringBuilder();
builder.append("name:")
.append(each.getId().getName())
.append(",tags:")
.append(each.getId().getTags())
.append(",type:").append(each.getId().getType())
.append(",value:").append(each.measure());
System.out.println(builder.toString());
});
}
private static void createOrder(Order order) {
// Ignore operations such as order entry
Metrics.counter("order.create", "channel", order.getChannel(),
"createTime", FORMATTER.format(order.getCreateTime())).increment();
}
}
Console output:
name:order.create,tags:[tag(channel=CHANNEL_A), tag(createTime=2018-11-10)],type:COUNTER,value:[Measurement{statistic='COUNT', value=1.0}]
name:order.create,tags:[tag(channel=CHANNEL_B), tag(createTime=2018-11-10)],type:COUNTER,value:[Measurement{statistic='COUNT', value=1.0}]
The example above builds Counter instances through the static methods of the global Metrics factory class. The io.micrometer.core.instrument.Counter interface also provides a nested builder class, Counter.Builder, which can be used as follows:
public class CounterBuilderMain {
public static void main(String[] args) throws Exception{
Counter counter = Counter.builder("name") // name
.baseUnit("unit") // base unit
.description("desc") // description
.tag("tagKey", "tagValue") // tag
.register(new SimpleMeterRegistry()); // bind to a MeterRegistry
counter.increment();
}
}
FunctionCounter
FunctionCounter is a specialization of Counter that derives the count from a ToDoubleFunction, the JDK 1.8 primitive specialization of Function, applied to a state object. FunctionCounter fits the same scenarios as Counter. Here is how it can be used:
public class FunctionCounterMain {
public static void main(String[] args) throws Exception {
MeterRegistry registry = new SimpleMeterRegistry();
AtomicInteger n = new AtomicInteger(0);
// The ToDoubleFunction anonymous implementation can be simplified to AtomicInteger::get using Lambda expressions
FunctionCounter.builder("functionCounter", n, new ToDoubleFunction<AtomicInteger>() {
@Override
public double applyAsDouble(AtomicInteger value) {
return value.get();
}
}).baseUnit("function")
.description("functionCounter")
.tag("createOrder", "CHANNEL-A")
.register(registry);
// Simulate three increments
n.incrementAndGet();
n.incrementAndGet();
n.incrementAndGet();
}
}
One clear benefit of FunctionCounter is that we do not need to hold on to the FunctionCounter instance at all: we simply operate on the AtomicInteger that was passed in when the FunctionCounter was built. This style of interface design appears in many frameworks.
Timer
Timer records the duration of short-lived events and can show how those durations are distributed as well as how often the events occur. Every Timer implementation records at least the number of events and their total time, from which time series are generated. The base unit of a Timer depends on the monitoring backend, but in practice we rarely need to care, because Micrometer picks the appropriate base unit when it stores the generated time series. The main methods of the Timer interface are as follows:
public interface Timer extends Meter {
...
void record(long var1, TimeUnit var3);
default void record(Duration duration) {
this.record(duration.toNanos(), TimeUnit.NANOSECONDS);
}
<T> T record(Supplier<T> var1);
<T> T recordCallable(Callable<T> var1) throws Exception;
void record(Runnable var1);
default Runnable wrap(Runnable f) {
return () -> {
this.record(f);
};
}
default <T> Callable<T> wrap(Callable<T> f) {
return () -> {
return this.recordCallable(f);
};
}
long count();
double totalTime(TimeUnit var1);
default double mean(TimeUnit unit) {
return this.count() == 0L ? 0.0D : this.totalTime(unit) / (double)this.count();
}
double max(TimeUnit var1);
...
}
In practice, the most common and convenient methods are those that accept functional interfaces:
Timer timer = ...
timer.record(() -> dontCareAboutReturnValue());
timer.recordCallable(() -> returnValue());
Runnable r = timer.wrap(() -> dontCareAboutReturnValue());
Callable c = timer.wrap(() -> returnValue());
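The statement that a Timer records at least a count and a total time, with the mean derived from the two, can be sketched with the JDK alone. The class TimerSketch below is a hypothetical, simplified model and not Micrometer's Timer: it omits max, histograms, and any clock abstraction.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model of a timer: it tracks only an event count and a total duration.
final class TimerSketch {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    // Records one event of the given duration.
    void record(long amount, TimeUnit unit) {
        count.incrementAndGet();
        totalNanos.addAndGet(unit.toNanos(amount));
    }

    // Runs the task and records how long it took, even if it throws.
    void record(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            record(System.nanoTime() - start, TimeUnit.NANOSECONDS);
        }
    }

    long count() {
        return count.get();
    }

    double totalTime(TimeUnit unit) {
        return totalNanos.get() / (double) unit.toNanos(1);
    }

    // Mean duration per event, or 0 when nothing has been recorded.
    double mean(TimeUnit unit) {
        return count() == 0 ? 0.0 : totalTime(unit) / count();
    }

    public static void main(String[] args) {
        TimerSketch timer = new TimerSketch();
        timer.record(100, TimeUnit.MILLISECONDS);
        timer.record(300, TimeUnit.MILLISECONDS);
        timer.record(() -> { /* timed work would go here */ });
        System.out.println(timer.count());
        System.out.println(timer.mean(TimeUnit.MILLISECONDS));
    }
}
```

Storing the total in nanoseconds and converting on read mirrors the idea that the caller chooses the unit only when the value is queried.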
Usage Scenarios:
Based on personal experience and practice:
- Record the execution time of a specific method for display.
- Record the execution time of certain tasks to work out the throughput of a data source, such as the consumption rate of a message queue.
Here is a practical example that records the execution time of a specific method, again using the order-creation method:
public class TimerMain {
private static final Random R = new Random();
static {
Metrics.addRegistry(new SimpleMeterRegistry());
}
public static void main(String[] args) throws Exception {
Order order1 = new Order();
order1.setOrderId("ORDER_ID_1");
order1.setAmount(100);
order1.setChannel("CHANNEL_A");
order1.setCreateTime(LocalDateTime.now());
Timer timer = Metrics.timer("timer", "createOrder", "cost");
timer.record(() -> createOrder(order1));
}
private static void createOrder(Order order) {
try {
TimeUnit.SECONDS.sleep(R.nextInt(5)); // Simulate a time-consuming method
} catch (InterruptedException e) {
// no-op
}
}
}
In a real production environment, Spring AOP can be used to move the timing logic into an aspect, which removes the repetitive boilerplate. The example above builds the Timer through Metrics, but it can also be built with the builder:
MeterRegistry registry = ...
Timer timer = Timer
.builder("my.timer")
.description("a description of what this timer does") // optional
.tags("region", "test") // optional
.register(registry);
A Timer can also be used through its nested class Timer.Sample, whose start and stop methods time the logic that runs between the two calls. For example:
Timer.Sample sample = Timer.start(registry);
// Do business logic here
Response response = ...
sample.stop(registry.timer("my.timer", "response", response.status()));
FunctionTimer
FunctionTimer is a specialization of Timer driven by two monotonically increasing functions (not strictly increasing; their values simply must never decrease over time): one that returns the count and one that returns the total time. Its builder takes the following arguments:
public interface FunctionTimer extends Meter {
static <T> Builder<T> builder(String name, T obj, ToLongFunction<T> countFunction,
ToDoubleFunction<T> totalTimeFunction, TimeUnit totalTimeFunctionUnit) {
return new Builder<>(name, obj, countFunction, totalTimeFunction, totalTimeFunctionUnit);
}
...
}
An example from the official documentation:
IMap<?, ?> cache = ...; // Assume a Hazelcast cache is used
registry.more().timer("cache.gets.latency", Tags.of("name", cache.getName()), cache,
c -> c.getLocalMapStats().getGetOperationCount(), // Number of Get operations over the cache's lifetime
c -> c.getLocalMapStats().getTotalGetLatency(), // Total latency of Get operations
TimeUnit.NANOSECONDS
);
As I understand it, the ToLongFunction supplies the number of events and the ToDoubleFunction supplies the total time, while totalTimeFunctionUnit states the unit of that total time. A simple usage is as follows:
public class FunctionTimerMain {
public static void main(String[] args) throws Exception {
// Placeholder object, needed only to satisfy the obj parameter
Object holder = new Object();
AtomicLong totalTimeNanos = new AtomicLong(0);
AtomicLong totalCount = new AtomicLong(0);
FunctionTimer.builder("functionTimer", holder, p -> totalCount.get(),
p -> totalTimeNanos.get(), TimeUnit.NANOSECONDS)
.register(new SimpleMeterRegistry());
totalTimeNanos.addAndGet(10000000);
totalCount.incrementAndGet();
}
}
LongTaskTimer
LongTaskTimer is also a specialization of Timer. It measures the duration of long-running tasks: until the task finishes, the monitored task counts as still running, and the total elapsed time is recorded when the task completes. It suits long-running events such as time-consuming scheduled jobs. In a Spring application, you can combine the @Scheduled and @Timed annotations, backed by Spring AOP, to record the total time of a scheduled task:
@Timed(value = "aws.scrape", longTask = true)
@Scheduled(fixedDelay = 360000)
void scrapeResources() {
// Do the relatively time-consuming business logic here
}
Of course, LongTaskTimer is just as easy to use outside Spring:
public class LongTaskTimerMain {
public static void main(String[] args) throws Exception{
MeterRegistry meterRegistry = new SimpleMeterRegistry();
LongTaskTimer longTaskTimer = meterRegistry.more().longTaskTimer("longTaskTimer");
longTaskTimer.record(() -> {
// Write the Task logic here
});
// Or via the global registry
Metrics.more().longTaskTimer("longTaskTimer").record(() -> {
// Write the Task logic here
});
}
}
Gauge
A Gauge is a handle for reading the current value of a metric; in other words, it is a single-valued Meter whose value can move up and down freely. Gauges typically measure values that fluctuate, such as current memory usage (supplied as the return value of the ToDoubleFunction argument), or "counts" that rise and fall, such as the number of messages in a queue. The typical use cases given in the official documentation are the size of a collection or map and the number of threads in the running state. A Gauge generally monitors something with a natural upper bound, whereas a Counter monitors something without one, so a value like the total number of HTTP requests belongs in a Counter rather than a Gauge. MeterRegistry provides convenience methods for building Gauges that observe values, functions, collections, and maps:
List<String> list = registry.gauge("listGauge", Collections.emptyList(), new ArrayList<>(), List::size);
List<String> list2 = registry.gaugeCollectionSize("listSize2", Tags.empty(), new ArrayList<>());
Map<String, Integer> map = registry.gaugeMapSize("mapGauge", Tags.empty(), new HashMap<>());
The three methods above register a Gauge through MeterRegistry and return the collection or map instance, whose size is then tracked as it changes. The main advantage is that we never need to touch the Gauge interface; we just use the collection or map as usual. Gauge also supports subclasses of java.lang.Number, such as AtomicInteger and AtomicLong from the java.util.concurrent.atomic package and Guava's AtomicDouble:
AtomicInteger n = registry.gauge("numberGauge", new AtomicInteger(0));
n.set(1);
n.set(2);
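The reason the Gauge instance never needs to be touched is that a gauge stores nothing itself and samples the observed object each time it is read. The JDK-only sketch below models that idea; GaugeSketch is a hypothetical class, and holding a strong reference to the observed object is an assumption of the sketch, not a claim about Micrometer's internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Simplified model of a gauge: it keeps no value of its own and samples the observed object on read.
final class GaugeSketch<T> {
    private final T observed;
    private final ToDoubleFunction<T> valueFunction;

    GaugeSketch(T observed, ToDoubleFunction<T> valueFunction) {
        this.observed = observed;
        this.valueFunction = valueFunction;
    }

    // Evaluates the function against the current state of the observed object.
    double value() {
        return valueFunction.applyAsDouble(observed);
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>();
        GaugeSketch<List<String>> size = new GaugeSketch<>(queue, List::size);
        queue.add("a");
        queue.add("b");
        System.out.println(size.value()); // sampled now: two elements
        queue.remove("a");
        System.out.println(size.value()); // the value can go down as well as up
    }
}
```

Unlike a Counter, nothing is accumulated between reads, so values that change and revert between two samples are never seen.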
Besides MeterRegistry, a Gauge can also be created with the fluent builder:
// We generally do not need to manipulate the Gauge instance
Gauge gauge = Gauge
.builder("gauge", myObj, myObj::gaugeValue)
.description("a description of what this gauge does") // optional
.tags("region", "test") // optional
.register(registry);
Usage Scenarios:
Based on personal experience and practice:
- Monitor fluctuating values with a natural (physical) upper bound, such as physical memory usage or the size of a collection or map.
- Monitor fluctuating values with a logical bound, such as backlogged messages or backlogged tasks (in a thread pool); this is essentially monitoring a collection or map.
As a more practical example, suppose we need to send an SMS or push notification to logged-in users. Messages are placed on a blocking queue, and a thread consumes them for further processing:
public class GaugeMain {
private static final MeterRegistry MR = new SimpleMeterRegistry();
private static final BlockingQueue<Message> QUEUE = new ArrayBlockingQueue<>(500);
private static BlockingQueue<Message> REAL_QUEUE;
static {
REAL_QUEUE = MR.gauge("messageGauge", QUEUE, Collection::size);
}
public static void main(String[] args) throws Exception {
consume();
Message message = new Message();
message.setUserId(1L);
message.setContent("content");
REAL_QUEUE.put(message);
}
private static void consume() throws Exception {
new Thread(() -> {
while (true) {
try {
Message message = REAL_QUEUE.take();
//handle message
System.out.println(message);
} catch (InterruptedException e) {
// no-op
}
}
}).start();
}
}
The example above is deliberately crude and only demonstrates the API; do not use it as-is in a production environment.
TimeGauge
TimeGauge is a specialization of Gauge. Compared with Gauge, its builder takes an additional TimeUnit argument that states the base time unit of the value returned by the ToDoubleFunction. Here is a simple example:
public class TimeGaugeMain {
private static final SimpleMeterRegistry R = new SimpleMeterRegistry();
public static void main(String[] args) throws Exception{
AtomicInteger count = new AtomicInteger();
TimeGauge.Builder<AtomicInteger> timeGauge = TimeGauge.builder("timeGauge", count,
TimeUnit.SECONDS, AtomicInteger::get);
timeGauge.register(R);
count.addAndGet(10086);
print();
count.set(1);
print();
}
private static void print() throws Exception {
Search.in(R).meters().forEach(each -> {
StringBuilder builder = new StringBuilder();
builder.append("name:")
.append(each.getId().getName())
.append(",tags:")
.append(each.getId().getTags())
.append(",type:").append(each.getId().getType())
.append(",value:").append(each.measure());
System.out.println(builder.toString());
});
}
}
// Output
name:timeGauge,tags:[],type:GAUGE,value:[Measurement{statistic='VALUE', value=10086.0}]
name:timeGauge,tags:[],type:GAUGE,value:[Measurement{statistic='VALUE', value=1.0}]
DistributionSummary
A Summary tracks the distribution of events; in Micrometer the corresponding class is DistributionSummary. It is used in much the same way as a Timer, but the values it records are not tied to a time unit. A common use case is measuring the payload size of requests hitting a server. A DistributionSummary can be created through MeterRegistry as follows:
DistributionSummary summary = registry.summary("response.size");
Or through the builder:
DistributionSummary summary = DistributionSummary
.builder("response.size")
.description("a description of what this summary does") // optional
.baseUnit("bytes") // optional
.tags("region", "test") // optional
.scale(100) // optional
.register(registry);
Many of the build parameters in DistributionSummary are related to scaling and histogram representation, as shown in the next section.
Usage Scenarios:
Based on personal experience and practice:
- Measure recorded values that are not tied to a time unit, such as server payload size or cache hit ratio.
A relatively concrete example:
public class DistributionSummaryMain {
private static final DistributionSummary DS = DistributionSummary.builder("cacheHitPercent")
.register(new SimpleMeterRegistry());
private static final LoadingCache<String, String> CACHE = CacheBuilder.newBuilder()
.maximumSize(1000)
.recordStats()
.expireAfterWrite(60, TimeUnit.SECONDS)
.build(new CacheLoader<String, String>() {
@Override
public String load(String s) throws Exception {
return selectFromDatabase();
}
});
public static void main(String[] args) throws Exception {
String key = "doge";
String value = CACHE.get(key);
record();
}
private static void record() throws Exception {
CacheStats stats = CACHE.stats();
BigDecimal hitCount = new BigDecimal(stats.hitCount());
BigDecimal requestCount = new BigDecimal(stats.requestCount());
DS.record(hitCount.divide(requestCount, 2, BigDecimal.ROUND_HALF_DOWN).doubleValue());
}
}
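What a summary keeps for samples such as hit ratios can be sketched without any library. SummarySketch below is a hypothetical, simplified model that tracks only count, total, and max of unit-less values; ignoring negative samples is a choice made for this sketch, and histograms and percentiles are left out entirely.

```java
// Simplified model of a distribution summary: count, total and max of unit-less samples.
final class SummarySketch {
    private long count;
    private double total;
    private double max;

    // Records one sample; negative samples are ignored in this sketch.
    synchronized void record(double amount) {
        if (amount < 0) {
            return;
        }
        count++;
        total += amount;
        max = Math.max(max, amount);
    }

    synchronized long count() {
        return count;
    }

    synchronized double totalAmount() {
        return total;
    }

    synchronized double max() {
        return max;
    }

    // Mean of the recorded samples, or 0 when nothing has been recorded.
    synchronized double mean() {
        return count == 0 ? 0.0 : total / count;
    }

    public static void main(String[] args) {
        SummarySketch hitRatio = new SummarySketch();
        hitRatio.record(0.5);
        hitRatio.record(0.75);
        hitRatio.record(1.0);
        System.out.println(hitRatio.count());
        System.out.println(hitRatio.mean());
        System.out.println(hitRatio.max());
    }
}
```

The structure is the same as a timer's, except that the samples carry no time unit, which is why payload sizes and ratios fit here rather than in a Timer.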
Histogram and percentile configuration
Histogram and percentile configuration applies to both DistributionSummary and Timer. It is relatively complex, so it will be added later after more thorough study.
Integration with Spring Boot, Prometheus, and Grafana
A JVM application that integrates Micrometer only collects metrics in memory through the Micrometer API. An additional storage system is therefore needed to persist the metrics, a monitoring system to collect and process the data, and a UI tool to display it (management generally only cares about good-looking charts and animations). The usual storage is a time-series database; mainstream options include InfluxDB and Datadog. The dominant monitoring system, responsible mainly for collecting and processing data, is Prometheus. For the UI, the most widely used tool at present is Grafana. Prometheus also has a built-in time-series database, so a reasonably complete monitoring setup needs only the target JVM application, Prometheus, and Grafana. Let's build such a system from scratch. The previous article was based on Windows, which is probably not close enough to a production environment, so this time we use CentOS 7.
Using Micrometer in Spring Boot
The spring-boot-starter-actuator dependency in Spring Boot already integrates Micrometer; the metrics endpoint relies on Micrometer for much of its functionality, and the prometheus endpoint is also enabled by default. In fact, spring-boot-actuator-autoconfigure integrates out-of-the-box support for many frameworks, and its prometheus package provides the Prometheus integration. With the actuator, a project can easily expose a prometheus endpoint and act as a client from which Prometheus collects data; the Prometheus server then scrapes that endpoint to collect the application's Micrometer metrics.
We first add spring-boot-starter-actuator and spring-boot-starter-web, and implement a Counter and a Timer as an example. The dependencies:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-dependencies</artifactId>
<version>2.1.0.RELEASE</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.16.22</version>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
<version>1.1.0</version>
</dependency>
</dependencies>
Then write an order-creation endpoint and a message-sending module to simulate sending a message to the user after an order is placed:
// Entities
@Data
public class Message {
private String orderId;
private Long userId;
private String content;
}
@Data
public class Order {
private String orderId;
private Long userId;
private Integer amount;
private String channel; // needed by order.getChannel() in OrderService
private LocalDateTime createTime;
}
// Controller and service classes
@RestController
public class OrderController {
@Autowired
private OrderService orderService;
@PostMapping(value = "/order")
public ResponseEntity<Boolean> createOrder(@RequestBody Order order){
return ResponseEntity.ok(orderService.createOrder(order));
}
}
@Slf4j
@Service
public class OrderService {
private static final Random R = new Random();
@Autowired
private MessageService messageService;
public Boolean createOrder(Order order) {
// simulate an order
try {
int ms = R.nextInt(50) + 50;
TimeUnit.MILLISECONDS.sleep(ms);
log.info("Save order simulation time {} ms...", ms);
} catch (Exception e) {
//no-op
}
// Record the total number of orders
Metrics.counter("order.count", "order.channel", order.getChannel()).increment();
// Send a message
Message message = new Message();
message.setContent("Simulated SMS...");
message.setOrderId(order.getOrderId());
message.setUserId(order.getUserId());
messageService.sendMessage(message);
return true;
}
}
@Slf4j
@Service
public class MessageService implements InitializingBean {
private static final BlockingQueue<Message> QUEUE = new ArrayBlockingQueue<>(500);
private static BlockingQueue<Message> REAL_QUEUE;
private static final Executor EXECUTOR = Executors.newSingleThreadExecutor();
private static final Random R = new Random();
static {
REAL_QUEUE = Metrics.gauge("message.gauge", Tags.of("message.gauge", "message.queue.size"), QUEUE, Collection::size);
}
public void sendMessage(Message message) {
try {
REAL_QUEUE.put(message);
} catch (InterruptedException e) {
// no-op
}
}
@Override
public void afterPropertiesSet() throws Exception {
EXECUTOR.execute(() -> {
while (true) {
try {
Message message = REAL_QUEUE.take();
log.info("Simulate sending SMS,orderId:{},userId:{}, content :{}, Time :{} ms", message.getOrderId(), message.getUserId(),
message.getContent(), R.nextInt(50));
} catch (Exception e) {
throw newIllegalStateException(e); }}}); }}/ / cut class
@Component
@Aspect
public class TimerAspect {

    @Around(value = "execution(* club.throwable.smp.service.*Service.*(..))")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Signature signature = joinPoint.getSignature();
        MethodSignature methodSignature = (MethodSignature) signature;
        Method method = methodSignature.getMethod();
        // One timer per method name, distinguished by the "method.name" tag
        Timer timer = Metrics.timer("method.cost.time", "method.name", method.getName());
        ThrowableHolder holder = new ThrowableHolder();
        Object result = timer.recordCallable(() -> {
            try {
                return joinPoint.proceed();
            } catch (Throwable e) {
                // proceed() may throw any Throwable, so capture it and rethrow after timing
                holder.throwable = e;
            }
            return null;
        });
        if (null != holder.throwable) {
            throw holder.throwable;
        }
        return result;
    }

    private class ThrowableHolder {
        Throwable throwable;
    }
}
The YAML configuration is as follows:
server:
  port: 9091
management:
  server:
    port: 10091
  endpoints:
    web:
      exposure:
        include: '*'
      base-path: /management
Note that since Spring Boot 2.x, the configuration for exposing web endpoints differs considerably from 1.x. An endpoint must be exposed as a web endpoint before it can be accessed. To enable or disable support for an endpoint, use:
management.endpoint.${endpointId}.enabled=true/false
You can check the actuator-api documentation for the features of all supported endpoints. This is the official documentation for version 2.1.0.RELEASE, and I don't know whether the link will break in the future. An endpoint that is only enabled, but not exposed as a web endpoint, cannot be accessed through http://{host}:{management.port}/{management.endpoints.web.base-path}/{endpointId}. The configuration to expose monitoring endpoints as web endpoints is:
management.endpoints.web.exposure.include=info,health
management.endpoints.web.exposure.exclude=prometheus
management.endpoints.web.exposure.exclude specifies the monitoring endpoints that are not exposed as web endpoints; separate multiple entries with commas. By default, management.endpoints.web.exposure.include contains only the info and health endpoints. To expose all endpoints directly, set management.endpoints.web.exposure.include=*; if you use YAML, remember to quote it as '*'. Exposing all web monitoring endpoints is dangerous. If you do this in production, make sure that http://{host}:{management.port} cannot be reached from the public network (that is, the port used by the monitoring endpoints should only be reachable from the intranet). This also makes it convenient for the Prometheus server to collect data through this port.
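The include/exclude rules and the endpoint URL pattern described above can be sketched in plain Java. This is only an illustrative model, not Spring Boot's own implementation: the class ExposureSketch and its methods are hypothetical names, and the sketch assumes that an exclude entry takes precedence over an include entry.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical model of the web exposure rules; not Spring Boot's own implementation. */
final class ExposureSketch {

    /** Splits a comma-separated property value such as "info,health" into endpoint ids. */
    static Set<String> parse(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    /** An endpoint is exposed if it is included (or "*" is) and it is not excluded. */
    static boolean isExposed(String endpointId, String include, String exclude) {
        Set<String> included = parse(include);
        Set<String> excluded = parse(exclude);
        if (excluded.contains("*") || excluded.contains(endpointId)) {
            return false; // assumption: exclude wins over include
        }
        return included.contains("*") || included.contains(endpointId);
    }

    /** Builds http://{host}:{management port}{base-path}/{endpointId}. */
    static String endpointUrl(String host, int managementPort, String basePath, String endpointId) {
        return "http://" + host + ":" + managementPort + basePath + "/" + endpointId;
    }

    public static void main(String[] args) {
        System.out.println(isExposed("prometheus", "*", ""));           // true
        System.out.println(isExposed("prometheus", "*", "prometheus")); // false
        System.out.println(isExposed("prometheus", "info,health", "")); // false
        // prints http://localhost:10091/management/prometheus
        System.out.println(endpointUrl("localhost", 10091, "/management", "prometheus"));
    }
}
```

With the YAML shown earlier (management port 10091, base path /management, include '*'), the prometheus endpoint would be reachable at the URL printed on the last line.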
Installation and configuration of Prometheus
The latest version of Prometheus at the time of writing is 2.5. Since I haven't explored Docker in depth, I download the release archive directly and unpack it:
wget https://github.com/prometheus/prometheus/releases/download/v2.5.0/prometheus-2.5.0.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
cd prometheus-*
Modify the scrape_configs section of the Prometheus configuration file prometheus.yml:
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # Configure the URL path to pull metrics from; here we select the application's prometheus endpoint
    metrics_path: /management/prometheus
    static_configs:
      # Configure the host and port
      - targets: ['localhost:10091']
With this configuration, metrics are pulled from localhost:10091/management/prometheus. Remember to start the application from the previous section on the virtual machine first. Then start Prometheus:
# The --storage.tsdb.path= parameter sets the path where data is stored; ./data is the default path
./prometheus --config.file=prometheus.yml --log.level=debug
Prometheus listens on port 9090 by default. After a successful startup, the following logs are generated:
At this point, visit http://${vm host}:9090/targets to see the jobs Prometheus is currently running.
Go to http://${vm host}:9090/graph to find the Meters we defined, as well as the JVM and Tomcat Meters that spring-boot-starter-actuator defines out of the box. Let's call the application's /order interface and then look at the order_count_total and method_cost_time_seconds_sum metrics defined earlier.
As you can see, the Meter information has been collected and displayed, but the display is neither detailed nor attractive enough, so we use Grafana's UI to dress it up.
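The series names seen in Prometheus differ from the Meter names used in the Java code: order.count shows up as order_count_total and method.cost.time as method_cost_time_seconds_sum (among others). The sketch below is a simplified, hypothetical model of that renaming, not Micrometer's actual naming code; it assumes only that dots become underscores, counters gain a _total suffix, and timers are exported in seconds as count, sum and max series.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified, hypothetical model of how Micrometer names appear in Prometheus. */
final class PrometheusNameSketch {

    /** "order.count" -> "order_count": dots are not allowed in Prometheus metric names. */
    static String snakeCase(String micrometerName) {
        return micrometerName.replace('.', '_');
    }

    /** A counter is exported with a "_total" suffix. */
    static String counterName(String micrometerName) {
        String name = snakeCase(micrometerName);
        return name.endsWith("_total") ? name : name + "_total";
    }

    /** A timer is exported in seconds as several series: count, sum and max. */
    static List<String> timerNames(String micrometerName) {
        String name = snakeCase(micrometerName);
        String base = name.endsWith("_seconds") ? name : name + "_seconds";
        return Arrays.asList(base + "_count", base + "_sum", base + "_max");
    }

    public static void main(String[] args) {
        // prints order_count_total
        System.out.println(counterName("order.count"));
        // prints [method_cost_time_seconds_count, method_cost_time_seconds_sum, method_cost_time_seconds_max]
        System.out.println(timerNames("method.cost.time"));
    }
}
```

Knowing this mapping helps when writing queries: the average method cost, for example, is the _sum series divided by the _count series.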
Installation and use of Grafana
The Grafana installation process is as follows:
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.3.4-1.x86_64.rpm
sudo yum localinstall grafana-5.3.4-1.x86_64.rpm
After the installation is complete, run service grafana-server start to start Grafana. The default port is 3000, so open http://${host}:3000 in a browser. The initial username and password are both admin, and the account has administrator privileges. You then need to add a data source on the Home panel that connects to the Prometheus server so that metrics can be pulled from it. The panel for adding a data source looks like this:
Simply point it at the port of the Prometheus server. You can then add whatever panels you like; for the order count metric, add a Graph panel:
When configuring the panel, specify the Title in the General tab:
More important is the Metrics configuration, which specifies the data source and the Prometheus query:
It is best to consult the official Prometheus documentation and learn a little of its query language, PromQL; a single panel can hold multiple PromQL queries. The two items above are the basic configuration. The other configuration items are mostly auxiliary, such as alerting and other chart display options. We will not expand on them here; their usage can be found on Grafana's official website. Then we call the order interface again, and after a while the chart data is updated and displayed automatically:
Next, add the Timer Meter used in the project to monitor method execution time. When finished, it looks roughly like this:
Custom metrics output, for example:
app_register 0.0
app_login 0.0
- Let's collect these statistics ourselves.
- In Java, Metrics.counter("app_login") can be used to record the data.
- Have Prometheus scrape the custom path /app/meter; the code is as follows:
@RequestMapping(value = "/meter", method = RequestMethod.GET)
@ResponseBody
public void getMeg(HttpServletResponse response) throws IOException {
    // Map each meter name to its counter value; the merge function keeps the first
    // value when several meters share a name but differ in tags
    Map<String, Double> map = Metrics.globalRegistry.getMeters().stream().collect(
            Collectors.toMap(meter -> meter.getId().getName(),
                    meter -> Metrics.counter(meter.getId().getName()).count(),
                    (first, second) -> first));
    StringBuilder html = new StringBuilder();
    // One "name value" line per meter
    map.forEach((key, value) -> html.append(key).append(" ").append(value).append("\n"));
    byte[] body = html.toString().getBytes("UTF-8");
    response.setContentType("text/plain; version=0.0.4; charset=utf-8");
    // Content-Length is the number of bytes in the body
    response.setContentLength(body.length);
    ServletOutputStream out = response.getOutputStream();
    out.write(body);
    out.flush();
}
- The reason for writing a custom endpoint is that Prometheus expects the response header Content-Type: text/plain; version=0.0.4; charset=utf-8, and Content-Length must be the length of the text. This was a bit of a pitfall, so I am recording it here.
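The two requirements above (the content type and a Content-Length that matches the text) can be checked without a servlet container. The following is a minimal stdlib-only sketch; MeterTextSketch, render and contentLength are hypothetical names used for illustration, and the sample values mirror the app_register and app_login output shown earlier. It measures the length in UTF-8 bytes, which is an assumption that matters only when the body contains non-ASCII characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper showing the response body and headers discussed above. */
final class MeterTextSketch {

    static final String CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8";

    /** Renders one "name value" line per entry, each terminated by a newline. */
    static String render(Map<String, Double> counts) {
        StringBuilder text = new StringBuilder();
        counts.forEach((name, value) -> text.append(name).append(' ').append(value).append('\n'));
        return text.toString();
    }

    /** Content-Length counts bytes, so measure the encoded body rather than String.length(). */
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new LinkedHashMap<>();
        counts.put("app_register", 0.0);
        counts.put("app_login", 0.0);
        String body = render(counts);
        System.out.print(body);                  // two lines: app_register 0.0 and app_login 0.0
        System.out.println(contentLength(body)); // prints 31
    }
}
```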
This write-up was hard-won; if it helped, give me a follow at github.com/yunlongn