1. Background

Kafka provides a very comprehensive set of metrics covering brokers, consumers, producers, Streams, and Connect. E-MapReduce collects Kafka broker metrics through Ganglia to monitor broker performance. However, a complete Kafka application involves two roles: the Kafka broker and the Kafka client. When read or write performance problems occur, it is often difficult to locate the cause from the broker side alone; the broker data needs to be analyzed together with the state of the client. Kafka client metrics are therefore a very important type of data. So how does E-MapReduce collect Kafka client metrics?

(You are welcome to follow the open source E-MapReduce project: github.com/aliyun/aliy…)

2. Implementation

2.1 How to Collect Metrics

Kafka provides the JmxReporter implementation by default, which is why Kafka metrics can be viewed with JMX tools. The metrics reporter is an extension point, so we can implement our own reporter (by implementing org.apache.kafka.common.metrics.MetricsReporter) and handle these metrics in a custom way.

2.2 How to Store Metrics

Once a custom reporter collects the Kafka metrics, we need to choose a storage system that holds them for later use and analysis. Since Kafka is itself a storage system, we can store the metrics in Kafka, which has several advantages:

  • No third-party storage system is required
  • Data is easily connected to other systems

Therefore, the complete client-side metrics collection scheme is shown below:

E-MapReduce provides an open source implementation, emr-kafka-client-metrics.

3. Testing

There is no need to compile it ourselves: E-MapReduce has already published the jar to Maven, so we can simply download the latest version.

3.1 Configuration

Configuration item                        Description
metric.reporters                          Use the EMR implementation: org.apache.kafka.clients.reporter.EMRClientMetricsReporter
emr.metrics.reporter.bootstrap.servers    bootstrap.servers of the Kafka cluster that stores the metrics
emr.metrics.reporter.zookeeper.connect    ZooKeeper address of the Kafka cluster that stores the metrics
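
For a client configured in code rather than through a properties file, the three items above could be set roughly as follows. This is only a sketch using java.util.Properties: the class name ReporterConfigExample and the helper reporterProperties are hypothetical, and the host names are placeholders modeled on a test cluster.

```java
import java.util.Properties;

// Sketch: client settings that enable the EMR client metrics reporter.
// ReporterConfigExample and reporterProperties are illustrative names, not part of the library.
final class ReporterConfigExample {

    // Builds the three reporter-related settings described in the table.
    // Both arguments refer to the Kafka cluster that stores the metrics.
    static Properties reporterProperties(String metricsBootstrap, String metricsZookeeper) {
        Properties props = new Properties();
        props.setProperty("metric.reporters",
                "org.apache.kafka.clients.reporter.EMRClientMetricsReporter");
        props.setProperty("emr.metrics.reporter.bootstrap.servers", metricsBootstrap);
        props.setProperty("emr.metrics.reporter.zookeeper.connect", metricsZookeeper);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder addresses; replace them with those of your own cluster.
        Properties props = reporterProperties("emr-worker-1:9092",
                "emr-header-1:2181/kafka-1.0.1");
        props.list(System.out);
    }
}
```

In a real application, the client's own settings (bootstrap.servers, serializers, and so on) would be added to the same Properties object before it is passed to the producer or consumer constructor.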

3.2 Loading Methods

There are two ways:

  • Put the emr-kafka-client-metrics jar on the machine, in a location that is on the classpath of the client application
  • Add emr-kafka-client-metrics as a dependency so that it is packaged directly into the client jar
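
Whichever method is used, it can help to confirm that the reporter class is actually visible to the client application before starting it. The following is a minimal sketch of such a check; the class ReporterClasspathCheck and the method isOnClasspath are hypothetical helpers, not part of emr-kafka-client-metrics.

```java
// Sketch: check whether a class can be located on the application's classpath.
// ReporterClasspathCheck is an illustrative helper, not part of the library.
final class ReporterClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, ReporterClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers the case where the class is found but one of the
            // types it depends on (for example the Kafka client library) is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String reporter = "org.apache.kafka.clients.reporter.EMRClientMetricsReporter";
        System.out.println(reporter + " loadable: " + isOnClasspath(reporter));
    }
}
```

Run with the same classpath as the client application; a result of false suggests that the jar is not in a location the application loads from.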

3.3 Usage Tutorial

This test will be demonstrated on an E-MapReduce Kafka cluster.

  • Download the latest version of the emr-kafka-client-metrics package
wget http://central.maven.org/maven2/com/aliyun/emr/emr-kafka-client-metrics/1.4.3/emr-kafka-client-metrics-1.4.3.jar
  • Put the emr-kafka-client-metrics package in the Kafka libs directory
cp emr-kafka-client-metrics-1.4.3.jar /usr/lib/kafka-current/libs/
  • Create a test Topic
kafka-topics.sh --zookeeper emr-header-1:2181/kafka-1.0.1 --replication-factor 2 --topic test-metrics --create
  • Write data to the test topic. Here the producer's configuration items are placed in the local file client.conf.
## client.conf:
metric.reporters=org.apache.kafka.clients.reporter.EMRClientMetricsReporter
emr.metrics.reporter.bootstrap.servers=emr-worker-1:9092
emr.metrics.reporter.zookeeper.connect=emr-header-1:2181/kafka-1.0.1
bootstrap.servers=emr-worker-1:9092
## command:
kafka-producer-perf-test.sh --topic test-metrics --throughput 1000 --num-records 100000 --record-size 1024 --producer.config client.conf
  • Check the client metrics produced by this run. Note that the default metrics topic is _emr-client-metrics
kafka-console-consumer.sh --topic _emr-client-metrics --bootstrap-server emr-worker-1:9092 --from-beginning

The message returned is as follows:

{prefix=kafka.producer, client.ip=192.168.xxx.xxx, client.process=25536@emr-header-1.cluster-xxxx, attribute=request-rate, value=894.4685104965012, timestamp=1533805225045, group=producer-metrics, tag.client-id=producer-1}
Field name        Description
client.ip         IP address of the host where the client runs
client.process    Process ID of the client program
attribute         Metric attribute name
value             Metric value
timestamp         Time at which the metric was collected
tag.xxx           Other tag information for the metric
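
For downstream analysis, a record like the one above can be split into its fields. The sketch below assumes every record has the {key=value, key=value, ...} shape of the sample message, that values never contain commas, and that timestamp is in epoch milliseconds; MetricRecordParser is a hypothetical helper, not part of emr-kafka-client-metrics.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: parse one metrics record of the form {key=value, key=value, ...}.
// Assumes values contain no commas; MetricRecordParser is an illustrative name.
final class MetricRecordParser {

    static Map<String, String> parse(String record) {
        String body = record.trim();
        // Strip the surrounding braces if present.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split(",\\s*")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed entries
            }
            // Split on the first '=' only, so values may themselves contain '='.
            fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "{prefix=kafka.producer, client.ip=192.168.xxx.xxx, "
                + "client.process=25536@emr-header-1.cluster-xxxx, attribute=request-rate, "
                + "value=894.4685104965012, timestamp=1533805225045, "
                + "group=producer-metrics, tag.client-id=producer-1}";
        Map<String, String> fields = parse(sample);
        double value = Double.parseDouble(fields.get("value"));
        // Assumption: the timestamp field is epoch milliseconds.
        Instant collectedAt = Instant.ofEpochMilli(Long.parseLong(fields.get("timestamp")));
        System.out.println(fields.get("attribute") + " = " + value + " at " + collectedAt);
    }
}
```

Keeping the fields in a LinkedHashMap preserves their original order, which makes it easy to print or forward the record unchanged.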

4. Limitations

  • Only Java client programs are supported
  • Only Kafka clients later than 0.10 are supported