This blog series summarizes and shares examples drawn from real business environments and offers practical guidance on Spark business applications. Stay tuned for further installments. Copyright: this Spark business application series belongs to the author (Qin Kaixin).

  • Kafka in Business Environments: Kafka production environment planning
  • Kafka in Business Environments: Kafka producer and consumer throughput testing
  • Kafka in Business Environments: Kafka producer parameter settings and tuning suggestions
  • Kafka in Business Environments: essential Kafka cluster management commands, an operations handbook

Kafka data distribution test commands

2 Kafka startup commands

3 Kafka 0.10 command summary

3 Kafka operations and management command collection

  • To view the list of Kafka topics, use the --list argument

    bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --list
    __consumer_offsets
    lx_test_topic
    test
  • To view the details of a specific Kafka topic, use the --topic and --describe parameters

    bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --topic lx_test_topic --describe
    Topic:lx_test_topic    PartitionCount:1    ReplicationFactor:1    Configs:
        Topic: lx_test_topic    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
  • To view the list of consumer groups, use the --list argument

    The command differs for the new and old consumers: use --bootstrap-server for the new consumer and --zookeeper for the old one.

    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server 127.0.0.1:9292 --list
    lx_test

    bin/kafka-consumer-groups.sh --zookeeper 127.0.0.1:2181 --list
    console-consumer-86845
    console-consumer-11967
  • To view the details of a particular consumer group, use the --group and --describe parameters

    As with listing groups, specify --bootstrap-server for the new consumer or --zookeeper for the old consumer:

    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server 127.0.0.1:9292 --group lx_test --describe
    GROUP    TOPIC          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  OWNER
    lx_test  lx_test_topic  0          465             465             0    kafka-python-1.3.1_/127.0.0.1

    bin/kafka-consumer-groups.sh --zookeeper 127.0.0.1:2181 --group console-consumer-11967 --describe
    GROUP                   TOPIC          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG      OWNER
    Could not fetch offset from zookeeper for group console-consumer-11967 partition [lx_test_topic,0] due to missing offset data in zookeeper.
    console-consumer-11967  lx_test_topic  0          unknown         465             unknown  console-consumer-11967_aws-lx-1513787888172-d3a91f05-0
  • Management

    ## Create a topic (4 partitions, 2 replicas)
    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 4 --topic test
  • Query

    ## List topics
    bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --list
    ## List consumer groups (new consumer)
    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 --list
    ## Show a consumer group's consumption details (only for offsets stored in ZooKeeper)
    bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group test
    ## Show a consumer group's consumption details (new consumer)
    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 --describe --group test-consumer-group
  • Send and consume

    ## Producer
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
    ## Consumer
    bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
    ## New producer
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test --producer.config config/producer.properties
    ## New consumer (supported in 0.9+)
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --new-consumer --from-beginning --consumer.config config/consumer.properties
    ## More advanced usage
    bin/kafka-simple-consumer-shell.sh --broker-list localhost:9092 --topic test --partition 0 --offset 1234 --max-messages 10
  • Rebalance partition leaders (preferred replica election)

    bin/kafka-preferred-replica-election.sh --zookeeper zk_host:port/chroot
  • Increase the replication factor

    cat > increase-replication-factor.json <<EOF
    {"version":1,
     "partitions":[
       {"topic":"__consumer_offsets","partition":0,"replicas":[0,1]},
       {"topic":"__consumer_offsets","partition":1,"replicas":[0,1]},
       {"topic":"__consumer_offsets","partition":2,"replicas":[0,1]}
     ]}
    EOF

    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --verify
  • New kafka-consumer-groups.sh features in 2.0 (viewing group members and group state)

    ## Brief member information
    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --describe --group test-group --members
    CONSUMER-ID                                      HOST        CLIENT-ID   #PARTITIONS
    consumer-1-aa3f2e15-d577-4c51-acd5-aa0d488cc131  /127.0.0.1  consumer-1  8
    consumer-1-f3b11334-b9eb-4d4f-80d1-766446c77ee9  /127.0.0.1  consumer-1  8
    consumer-1-658e4d7b-a561-4430-bbdf-c3ab59a18f3a  /127.0.0.1  consumer-1  9

    ## Detailed member information
    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --describe --group test-group --members --verbose
    CONSUMER-ID                                      HOST        CLIENT-ID   #PARTITIONS  ASSIGNMENT
    consumer-1-aa3f2e15-d577-4c51-acd5-aa0d488cc131  /127.0.0.1  consumer-1  8            test-topic(9,10,11,12,13,14,15,16)
    consumer-1-f3b11334-b9eb-4d4f-80d1-766446c77ee9  /127.0.0.1  consumer-1  8            test-topic(17,18,19,20,21,22,23,24)
    consumer-1-658e4d7b-a561-4430-bbdf-c3ab59a18f3a  /127.0.0.1  consumer-1  9            test-topic(0,1,2,3,4,5,6,7,8)

    ## Group state
    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --describe --group test-group --state
    COORDINATOR (ID)    ASSIGNMENT-STRATEGY  STATE   #MEMBERS
    localhost:9092 (0)  range                Stable  3
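Two of the items above lend themselves to small shell helpers. The sketch below is not part of the Kafka distribution: `total_lag` and `gen_reassignment` are hypothetical names. `total_lag` sums the LAG column of `--describe` output, assuming the 0.10 column layout shown earlier (other versions may order the columns differently), and skips rows whose lag is `unknown`. `gen_reassignment` prints a reassignment JSON for every partition of a topic, assuming all partitions should get the same replica list, which is a simplification.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; neither ships with Kafka.

# Sum the LAG column (6th field) of `kafka-consumer-groups.sh --describe`
# output, assuming the layout:
#   GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
# The header, warning lines, and rows with an "unknown" lag are ignored
# because their 3rd or 6th field is not a plain number.
total_lag() {
  awk '$3 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/ { sum += $6 } END { print sum + 0 }'
}

# Print a reassignment JSON that gives partitions 0..N-1 of a topic the
# same replica list (for example "0,1" for brokers 0 and 1).
gen_reassignment() {
  local topic="$1" partitions="$2" replicas="$3"
  local p=0 sep=""
  printf '{"version":1,"partitions":['
  while [ "$p" -lt "$partitions" ]; do
    printf '%s{"topic":"%s","partition":%d,"replicas":[%s]}' \
      "$sep" "$topic" "$p" "$replicas"
    sep=","
    p=$((p + 1))
  done
  printf ']}\n'
}

# Demo on canned input, so nothing here needs a running broker.
printf '%s\n' \
  'GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER' \
  'lx_test lx_test_topic 0 465 465 0 kafka-python-1.3.1_/127.0.0.1' \
  | total_lag
gen_reassignment __consumer_offsets 3 0,1
```

Against a live cluster the helpers would be used along the lines of `bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server 127.0.0.1:9292 --group lx_test --describe | total_lag` and `gen_reassignment __consumer_offsets 50 0,1 > increase-replication-factor.json`. The count of 50 is only the default partition count for `__consumer_offsets` and should be confirmed with `--describe` first. Giving every partition the same replica list also makes the first broker the preferred leader everywhere, so a real plan would normally rotate the list.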

4 Kafka test command summary

Key parameters of the kafka-producer-perf-test command include:

  • --num-records: The total number of messages to send in the test. Run kafka-producer-perf-test for at least 5 minutes each time to get accurate results, so it is best to set this parameter high.
  • --record-size: The size of each simulated message, in bytes. This should match the average message size in your production environment so that the results reflect actual usage. If the average message size in your business is 1 KB, set this parameter to 1024.
  • --throughput: The upper limit on throughput. The same principle applies: match the TPS target of your real business scenario. If there is no TPS limit, or you have no clear expectation, set it to -1 to see the maximum TPS you can currently achieve.
  • compression.type: The message compression type, a producer property passed via --producer-props. Few production environments run without compression. In my experience, the combination of Kafka and LZ4 currently works best and is suitable for production. At the time of writing, the Kafka community is also considering support for ZStandard, Facebook's open source compression algorithm, which claims to beat all other compression algorithms so far. We will see.
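The parameters above can be tied together with a little arithmetic: the number of records needed is roughly the target rate multiplied by the run time. The following is a minimal sketch that only prints a command line for review and does not run it. The function names, the topic `test`, and the broker address `localhost:9092` are illustrative assumptions, not values taken from the original test.

```shell
#!/usr/bin/env bash
# Sketch: derive --num-records from a target rate and a minimum run time,
# then print (not run) a kafka-producer-perf-test.sh command line.
# Function names, topic, and broker address are illustrative assumptions.

# records needed = target messages per second * run time in seconds
num_records_for() {
  local tps="$1" seconds="$2"
  echo $(( tps * seconds ))
}

# Assemble the command as text so it can be reviewed before being run.
build_perf_cmd() {
  local topic="$1" tps="$2" seconds="$3" record_size="$4" compression="$5"
  local records
  records=$(num_records_for "$tps" "$seconds")
  echo "bin/kafka-producer-perf-test.sh --topic $topic" \
       "--num-records $records --record-size $record_size" \
       "--throughput $tps" \
       "--producer-props bootstrap.servers=localhost:9092 compression.type=$compression"
}

# 50,000 messages/s for 5 minutes (300 s), 1 KB messages, LZ4 compression
build_perf_cmd test 50000 300 1024 lz4
```

This sizing only works when a target rate is known. With `--throughput -1` there is no rate to multiply by, so `--num-records` would have to be estimated from the rate observed in an earlier run.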

5 Summary

This post covered the key operations and maintenance commands for Kafka cluster management. Some of the screenshots may come from open source projects on GitHub, and some are my own test cases. If anything here resembles someone's private content, please message me directly and I will revise it.

Qin Kaixin in Shenzhen