1. Background
Kafka provides a comprehensive set of metrics covering brokers, consumers, producers, Streams, and Connect. E-MapReduce collects Kafka broker metrics through Ganglia to monitor broker performance. However, a complete Kafka application involves two roles: the Kafka broker and the Kafka client. When read or write performance problems occur, they are often hard to diagnose from the broker side alone, and the analysis also needs to take the state of the client into account. Client metrics are therefore a very important type of data. So how does E-MapReduce capture Kafka client metrics?
(You are welcome to follow the open source E-MapReduce project: github.com/aliyun/aliy…)
2. Implementation
2.1 How to Collect Metrics
Kafka provides the JmxReporter implementation by default, which means Kafka metrics can be viewed with JMX tools. Metrics reporting is also an extension point: we can write our own reporter (by implementing org.apache.kafka.common.metrics.MetricsReporter) and collect these metrics in whatever way we define.
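To make the callback pattern concrete, here is a minimal sketch of how a custom reporter can keep track of the metrics it is told about. So that it compiles without the kafka-clients JAR, it uses two stand-in types, SimpleMetric and SimpleReporter, which are assumptions made for illustration and not part of Kafka. The method names init, metricChange, metricRemoval, and close mirror the callbacks of org.apache.kafka.common.metrics.MetricsReporter, which additionally has a configure method that receives the client configuration. This is not the code of emr-kafka-client-metrics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Kafka's KafkaMetric, so the sketch compiles on its own.
// A real KafkaMetric computes its value lazily; here the value is a fixed field.
final class SimpleMetric {
    final String group;
    final String name;
    final double value;

    SimpleMetric(String group, String name, double value) {
        this.group = group;
        this.name = name;
        this.value = value;
    }

    // Identity of the metric inside the reporter's registry.
    String key() {
        return group + "/" + name;
    }
}

// Hypothetical stand-in mirroring the callback shape of MetricsReporter.
interface SimpleReporter {
    void init(List<SimpleMetric> metrics);      // metrics that already exist at startup
    void metricChange(SimpleMetric metric);     // a metric was added or updated
    void metricRemoval(SimpleMetric metric);    // a metric was removed
    void close();                               // the client is shutting down
}

// Keeps the currently registered metrics in memory and can render a snapshot.
final class InMemoryReporter implements SimpleReporter {
    private final Map<String, SimpleMetric> current = new ConcurrentHashMap<>();

    @Override
    public void init(List<SimpleMetric> metrics) {
        for (SimpleMetric m : metrics) {
            metricChange(m);
        }
    }

    @Override
    public void metricChange(SimpleMetric metric) {
        current.put(metric.key(), metric);
    }

    @Override
    public void metricRemoval(SimpleMetric metric) {
        current.remove(metric.key());
    }

    @Override
    public void close() {
        current.clear();
    }

    // Returns one "group/name=value" line per registered metric, sorted by text.
    List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (SimpleMetric m : current.values()) {
            lines.add(m.key() + "=" + m.value);
        }
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        InMemoryReporter reporter = new InMemoryReporter();
        List<SimpleMetric> initial = new ArrayList<>();
        initial.add(new SimpleMetric("producer-metrics", "request-rate", 894.47));
        reporter.init(initial);
        reporter.metricChange(new SimpleMetric("producer-metrics", "record-send-rate", 1000.0));
        for (String line : reporter.snapshot()) {
            System.out.println(line);
        }
        reporter.close();
    }
}
```

A real reporter would typically take such a snapshot on a timer and ship it to the chosen storage system instead of printing it.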
2.2 How to Store Metrics
Once Kafka metrics can be collected in a custom way, we need to choose a storage system to hold them for later use and analysis. Since Kafka is itself a storage system, we can store the metrics in Kafka, which has several advantages:
- No third-party storage system is required
- Data is easily connected to other systems
Therefore, the complete client-side metrics collection scheme is shown below:
E-MapReduce provides an open source implementation, emr-kafka-client-metrics, whose source code is publicly available.
3. Testing
There is no need to compile it yourself: E-MapReduce has already published the JAR to Maven, so you can simply download the latest version.
3.1 Configuration
Configuration item | Description |
---|---|
metric.reporters | Use the EMR implementation: org.apache.kafka.clients.reporter.EMRClientMetricsReporter |
emr.metrics.reporter.bootstrap.servers | bootstrap.servers of the Kafka cluster that stores the metrics |
emr.metrics.reporter.zookeeper.connect | ZooKeeper address of the Kafka cluster that stores the metrics |
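When the client is configured in code rather than through a properties file, the same three items can be set on a java.util.Properties object. The sketch below is only an illustration: the class name ClientMetricsConfig, the helper producerProps, and the addresses passed in main are assumptions, not part of emr-kafka-client-metrics.

```java
import java.util.Properties;

final class ClientMetricsConfig {
    // Reporter class name taken from the configuration table above.
    static final String REPORTER_CLASS =
            "org.apache.kafka.clients.reporter.EMRClientMetricsReporter";

    // Builds client properties with the EMR metrics reporter enabled.
    // brokers:        cluster the client reads from or writes to
    // metricsBrokers: cluster that stores the metrics
    // metricsZk:      ZooKeeper address of the cluster that stores the metrics
    static Properties producerProps(String brokers, String metricsBrokers, String metricsZk) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", brokers);
        props.setProperty("metric.reporters", REPORTER_CLASS);
        props.setProperty("emr.metrics.reporter.bootstrap.servers", metricsBrokers);
        props.setProperty("emr.metrics.reporter.zookeeper.connect", metricsZk);
        // A real KafkaProducer also needs key.serializer and value.serializer;
        // they are omitted because they are unrelated to metrics reporting.
        return props;
    }

    public static void main(String[] args) {
        // Placeholder addresses; replace them with those of your own clusters.
        Properties props = producerProps(
                "emr-worker-1:9092",
                "emr-worker-1:9092",
                "emr-header-1:2181/kafka-1.0.1");
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

The resulting Properties object can then be passed to the client constructor, provided the emr-kafka-client-metrics JAR is on the application's classpath.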
3.2 Loading Methods
There are two ways to load the reporter:
- Put the emr-kafka-client-metrics JAR on the machine, in a location that is on the client application's classpath
- Add emr-kafka-client-metrics as a dependency packaged directly into the client JAR
3.3 Usage Tutorial
This test will be demonstrated on an E-MapReduce Kafka cluster.
- Download the latest version of the emr-kafka-client-metrics package
wget http://central.maven.org/maven2/com/aliyun/emr/emr-kafka-client-metrics/1.4.3/emr-kafka-client-metrics-1.4.3.jar
- Put the emr-kafka-client-metrics package in kafka lib
cp emr-kafka-client-metrics-1.4.3.jar /usr/lib/kafka-current/libs/
- Create a test Topic
kafka-topics.sh --zookeeper emr-header-1:2181/kafka-1.0.1 --replication-factor 2 --topic test-metrics --create
- Write data to the test topic. Here the producer's configuration items are placed in a local file, client.conf.
## client.conf:
metric.reporters=org.apache.kafka.clients.reporter.EMRClientMetricsReporter
emr.metrics.reporter.bootstrap.servers=emr-worker-1:9092
emr.metrics.reporter.zookeeper.connect=emr-header-1:2181/kafka-1.0.1
bootstrap.servers=emr-worker-1:9092
## command:
kafka-producer-perf-test.sh --topic test-metrics --throughput 1000 --num-records 100000 --record-size 1024 --producer.config client.conf
- Check the metrics reported by the client for this run. Note that the default metrics topic is _emr-client-metrics
kafka-console-consumer.sh --topic _emr-client-metrics --bootstrap-server emr-worker-1:9092 --from-beginning
The message returned is as follows:
{prefix=kafka.producer, client.ip=192.168.xxx.xxx, client.process=25536@emr-header-1.cluster-xxxx, attribute=request-rate, value=894.4685104965012, timestamp=1533805225045, group=producer-metrics, tag.client-id=producer-1}
Field name | Description |
---|---|
client.ip | IP address of the host where the client runs |
client.process | Process ID of the client program |
attribute | Metric attribute name |
value | Metric value |
timestamp | Time at which the metric was collected |
tag.xxx | Additional tag information for the metric |
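For downstream analysis, a consumer of the metrics topic usually needs to split such a record into its fields. The following sketch parses the textual form shown above into a map. It assumes that a record is a brace-wrapped list of key=value pairs separated by commas, and that neither keys nor values contain a comma. The class MetricRecordParser is hypothetical and not part of emr-kafka-client-metrics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MetricRecordParser {
    // Parses "{k1=v1, k2=v2}" into an ordered map of field name to raw string value.
    // Assumption: keys and values never contain a comma.
    static Map<String, String> parse(String record) {
        String body = record.trim();
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments that are empty or have no key
            }
            fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "{prefix=kafka.producer, client.ip=192.168.xxx.xxx, "
                + "client.process=25536@emr-header-1.cluster-xxxx, attribute=request-rate, "
                + "value=894.4685104965012, timestamp=1533805225045, "
                + "group=producer-metrics, tag.client-id=producer-1}";
        Map<String, String> fields = parse(sample);
        double value = Double.parseDouble(fields.get("value"));
        long timestamp = Long.parseLong(fields.get("timestamp"));
        System.out.println(fields.get("attribute") + " = " + value + " at " + timestamp);
    }
}
```

Keeping the values as strings and converting only value and timestamp on demand avoids guessing the type of tag fields, whose names vary from metric to metric.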
4. Limitations
- Only Java client programs are supported
- Only Kafka clients later than 0.10 are supported