Theory
1. What is Kafka?
Kafka is a distributed publish-subscribe messaging system. Originally developed at LinkedIn and later made part of the Apache project, Kafka is a distributed, partitioned, replicated, and persistent log service for processing streaming data.
2. Why Kafka?
A distributed message service is typically used in three scenarios:
- Buffering and peak shaving
- When upstream traffic bursts, the downstream services may not be able to keep up, or there may not be enough downstream machines to handle the requests, which can bring the servers down. A message service such as Kafka can sit in the middle and buffer the requests, and the downstream services then consume them at their own pace.
- Decoupling
- System A sends data to three systems B, C, and D through dedicated interfaces. When system E is added, a new interface has to be written for it; when system B is removed, its interface has to be removed by hand. With a message service in between, A only sends messages to Kafka, and whichever system needs the data subscribes to it on its own.
- Asynchronous processing
- When system A has to write data to the databases of three services B, C, and D, it must call all three and wait for each to return before it can respond. Because the calls are synchronous, the user waits for the whole chain to finish, which is not a friendly experience. If A instead writes the data to the message service and returns immediately, leaving the follow-up work to the consumers, the response feels instantaneous to the user and the service is clearly more usable.
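The buffering and decoupling scenarios can be pictured with a toy append-only log. The sketch below is only an illustration in plain shell, not how Kafka is implemented: a temporary file stands in for a single-partition topic, and the `produce` and `consume` helpers, as well as the `order-N` messages, are hypothetical names made up for this example.

```shell
#!/usr/bin/env bash
# Toy model of a message log: producers append lines, each consumer tracks its own offset.

LOG_FILE="$(mktemp)"                 # stands in for a topic with one partition
trap 'rm -f "$LOG_FILE"' EXIT        # remove the temporary file when the script ends

produce() {                          # append one message to the end of the log
  printf '%s\n' "$1" >> "$LOG_FILE"
}

consume() {                          # usage: consume <offset> <max>
  # Print at most <max> messages, starting right after <offset>.
  tail -n "+$(( $1 + 1 ))" "$LOG_FILE" | head -n "$2"
}

# A burst from the upstream system: five messages arrive at once.
for i in 1 2 3 4 5; do
  produce "order-$i"
done

# System B drains the backlog two messages at a time, at its own pace.
offset_b=0
batch="$(consume "$offset_b" 2)"
offset_b=$(( offset_b + 2 ))

# System E subscribes later and reads from the beginning; the producer is unchanged.
all_for_e="$(consume 0 5)"

echo "B got:"; echo "$batch"
echo "E got:"; echo "$all_for_e"
```

The point of the sketch is that the producer never waits for, or even knows about, its consumers: the log absorbs the burst, and each consumer's offset decides what it reads next.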
3. Differences between Kafka and other MQ
Advantages:
- Scalable. A Kafka cluster can be expanded transparently by adding new servers to it.
- High performance. Kafka's performance far exceeds that of traditional MQs such as ActiveMQ and RabbitMQ, and Kafka supports batch operations.
- Fault tolerant. The data of each Kafka partition is replicated to several servers. When a broker fails, ZooKeeper notifies producers and consumers so that they switch to the other brokers.
Installation and deployment
Installation environment: Linux, CentOS 7.6
1. Download the package
Go to the directory on the server where you want to install Kafka; pick whatever you prefer. I use /usr/local/.
- Go to the specified directory
cd /usr/local/
- Download the package
wget https://repo.huaweicloud.com/apache/kafka/2.2.2/kafka_2.11-2.2.2.tgz
- Unpack it
tar -xvf kafka_2.11-2.2.2.tgz
- Rename it
mv kafka_2.11-2.2.2 kafka
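The download, unpack, and rename steps can be collected into one small script. This is a minimal sketch under the assumptions of this article (Scala 2.11 build of Kafka 2.2.2, the Huawei Cloud mirror, install directory /usr/local); the function names `kafka_archive`, `kafka_url`, and `install_kafka` are hypothetical. The script only prints the URL when run; `install_kafka` is defined but not invoked.

```shell
#!/usr/bin/env bash
set -euo pipefail

SCALA_VERSION="2.11"
KAFKA_VERSION="2.2.2"
MIRROR="https://repo.huaweicloud.com/apache/kafka"

# Name of the release archive, e.g. kafka_2.11-2.2.2.tgz
kafka_archive() {
  printf 'kafka_%s-%s.tgz' "$SCALA_VERSION" "$KAFKA_VERSION"
}

# Full download URL on the mirror
kafka_url() {
  printf '%s/%s/%s' "$MIRROR" "$KAFKA_VERSION" "$(kafka_archive)"
}

# Download, unpack, and rename into <dest>/kafka (defaults to /usr/local).
install_kafka() {
  local dest="${1:-/usr/local}"
  cd "$dest"
  wget "$(kafka_url)"
  tar -xvf "$(kafka_archive)"
  mv "kafka_${SCALA_VERSION}-${KAFKA_VERSION}" kafka
}

echo "Would download: $(kafka_url)"
```

To perform the installation, add a call such as `install_kafka /usr/local` at the end of the script. Keeping the versions in variables means an upgrade only touches two lines.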
2. Prepare for deployment
- Kafka is a distributed messaging system, so it needs a coordinator to manage the Kafka brokers. ZooKeeper is an efficient distributed coordination service, so you need to set up ZooKeeper before using Kafka.
- Modify the Kafka configuration file
- Go to the specified directory
cd /usr/local/kafka
- Modify the configuration
vi config/server.properties
The following settings need to be modified:
# Address and port the broker listens on
listeners=PLAINTEXT://:9092
# Change this to the server's IP address
advertised.listeners=PLAINTEXT://192.168.10.126:9092
# Log directory; can be changed or left at the default
log.dirs=/tmp/kafka-logs
# Number of partitions for new topics (the default is 1)
num.partitions=4
# ZooKeeper address: list all nodes for a cluster, or the single node for a single-node setup
zookeeper.connect=192.168.10.122:2181
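Before starting the broker it can help to confirm that the edited keys are really present in the file. The following is a minimal sketch: `get_prop` is a hypothetical helper, and the sample file is written to a temporary path purely for the demonstration. On a real server you would point it at /usr/local/kafka/config/server.properties instead.

```shell
#!/usr/bin/env bash
# get_prop FILE KEY: print the value of KEY from a Java-style properties file.
# Commented lines never match because the text before "=" must equal KEY exactly.
# If the key appears more than once, the last occurrence wins.
get_prop() {
  awk -F= -v key="$2" '
    $1 == key { sub(/^[^=]*=/, ""); value = $0 }
    END       { if (value != "") print value }
  ' "$1"
}

# Sample file with the settings from this article (temporary, for the demo only).
CONF="$(mktemp)"
trap 'rm -f "$CONF"' EXIT
cat > "$CONF" <<'EOF'
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://192.168.10.126:9092
log.dirs=/tmp/kafka-logs
num.partitions=4
# zookeeper.connect=commented-out-example
zookeeper.connect=192.168.10.122:2181
EOF

for key in listeners advertised.listeners log.dirs num.partitions zookeeper.connect; do
  value="$(get_prop "$CONF" "$key")"
  if [ -n "$value" ]; then
    echo "$key = $value"
  else
    echo "missing: $key" >&2
  fi
done
```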
- Start the Kafka service. It can be started in the foreground or in the background. For the first start, the foreground is recommended so that you can check in the logs whether startup succeeded.
Start in the foreground, specifying the configuration file:
./bin/kafka-server-start.sh ./config/server.properties
On success, the log output reports that the server has started.
Use Ctrl+C to stop the foreground service, then start it in the background with nohup:
nohup ./bin/kafka-server-start.sh ./config/server.properties &
Then check whether startup succeeded; on success, the Kafka process appears in the output of:
ps -ef | grep kafka
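Besides `ps`, a background start can also be checked from the log itself. The sketch below assumes that the broker writes a line containing `started (kafka.server.KafkaServer)` once it is up, which is what Kafka 2.x typically logs, and that the output of the nohup start lands in `nohup.out`. The helpers `broker_started` and `wait_for_broker` are hypothetical, and the sample log line is written to a temporary file only so the script can run without a broker.

```shell
#!/usr/bin/env bash
# broker_started LOGFILE: succeed if the log contains the broker's "started" line.
broker_started() {
  grep -qF 'started (kafka.server.KafkaServer)' "$1" 2>/dev/null
}

# wait_for_broker LOGFILE SECONDS: poll once per second until started or timed out.
wait_for_broker() {
  local log="$1" remaining="$2"
  while [ "$remaining" -gt 0 ]; do
    if broker_started "$log"; then
      return 0
    fi
    sleep 1
    remaining=$(( remaining - 1 ))
  done
  broker_started "$log"
}

# Demo input: an illustrative log line (assumed format), not captured from a real broker.
LOG="$(mktemp)"
trap 'rm -f "$LOG"' EXIT
echo '[2020-01-01 00:00:00,000] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)' > "$LOG"

if wait_for_broker "$LOG" 3; then
  echo "broker is up"
else
  echo "broker did not start in time" >&2
fi
```

On a real server you would call something like `wait_for_broker nohup.out 30` right after the nohup start.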
3. Common commands
- List all topics in the current service
# --zookeeper takes the ZooKeeper host name or IP address
./bin/kafka-topics.sh --zookeeper 192.168.10.122:2181 --list
- Create a topic named test with 3 partitions and a replication factor of 2
./bin/kafka-topics.sh --zookeeper 192.168.10.122:2181 --create --replication-factor 2 --partitions 3 --topic test
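One thing to watch when creating a topic: Kafka rejects a replication factor larger than the number of available brokers, so a replication factor of 2 needs at least two brokers. The sketch below only builds and prints the command after checking that rule; `create_topic_cmd` is a hypothetical helper, and the broker count is passed in by hand rather than discovered from the cluster.

```shell
#!/usr/bin/env bash
# create_topic_cmd ZK TOPIC PARTITIONS REPLICAS BROKERS
# Prints the kafka-topics.sh command, or fails when the replication factor
# is larger than the number of brokers.
create_topic_cmd() {
  local zk="$1" topic="$2" partitions="$3" replicas="$4" brokers="$5"
  if [ "$replicas" -gt "$brokers" ]; then
    echo "replication factor $replicas exceeds broker count $brokers" >&2
    return 1
  fi
  printf './bin/kafka-topics.sh --zookeeper %s --create --replication-factor %s --partitions %s --topic %s\n' \
    "$zk" "$replicas" "$partitions" "$topic"
}

# Two brokers: the command is printed.
create_topic_cmd 192.168.10.122:2181 test 3 2 2

# One broker: the request is refused before it ever reaches Kafka.
create_topic_cmd 192.168.10.122:2181 test 3 2 1 || echo "refused: not enough brokers"
```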
- Delete a topic
delete.topic.enable=true must be set in server.properties; otherwise the topic is only marked for deletion and is not actually deleted.
./bin/kafka-topics.sh --zookeeper 192.168.10.122:2181 \
--delete --topic test
- View topic details
./bin/kafka-topics.sh --zookeeper 192.168.10.122:2181 \
--describe --topic test
- Send messages
./bin/kafka-console-producer.sh \
--broker-list 192.168.10.122:9092 --topic test
# If Kafka runs as a cluster, list several broker addresses after --broker-list, separated by commas.
# After the command starts successfully, a ">" prompt appears and you can type data.
# To delete typed characters, hold Ctrl and press Backspace.
>hello world
- Consume messages
./bin/kafka-console-consumer.sh --bootstrap-server 192.168.10.122:9092 --from-beginning --topic test
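The console producer and consumer can also be used non-interactively, which is handy for a quick smoke test. The sketch below pipes one message into the producer and then reads one message back; `--max-messages` and `--timeout-ms` are standard options of kafka-console-consumer.sh, while the `DRY_RUN` switch and the variable names are assumptions of this example. With `DRY_RUN=1` (the default here) the commands are only printed, so the script runs without a broker.

```shell
#!/usr/bin/env bash
KAFKA_HOME="/usr/local/kafka"
BROKERS="192.168.10.122:9092"    # for a cluster, separate several addresses with commas
TOPIC="test"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print the commands, 0 = actually run them

# Commands kept as arrays so that every argument stays a separate word.
producer=("$KAFKA_HOME/bin/kafka-console-producer.sh"
          --broker-list "$BROKERS" --topic "$TOPIC")
consumer=("$KAFKA_HOME/bin/kafka-console-consumer.sh"
          --bootstrap-server "$BROKERS" --topic "$TOPIC"
          --from-beginning --max-messages 1 --timeout-ms 10000)

if [ "$DRY_RUN" = "1" ]; then
  echo "would run: printf 'hello world\\n' | ${producer[*]}"
  echo "would run: ${consumer[*]}"
else
  # Send one message, then read the first message of the topic and exit.
  printf '%s\n' "hello world" | "${producer[@]}"
  "${consumer[@]}"
fi
```

Run it with `DRY_RUN=0` on a machine that can reach the broker to perform the round trip for real.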
- Modify the number of partitions. Because Kafka's internal data handling is complex, only increasing the partition count is currently supported; reducing it is not.
./bin/kafka-topics.sh --zookeeper 192.168.10.122:2181 --alter --topic test --partitions 6
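Since the partition count can only grow, a small guard in front of the alter command avoids a failed request. This is a minimal sketch: `alter_partitions_cmd` is a hypothetical helper, and the current partition count is supplied by hand (in practice you would read it from the `--describe` output).

```shell
#!/usr/bin/env bash
# alter_partitions_cmd ZK TOPIC CURRENT TARGET
# Prints the kafka-topics.sh --alter command, or fails unless TARGET > CURRENT.
alter_partitions_cmd() {
  local zk="$1" topic="$2" current="$3" target="$4"
  if [ "$target" -le "$current" ]; then
    echo "cannot change $topic from $current to $target partitions: only increases are supported" >&2
    return 1
  fi
  printf './bin/kafka-topics.sh --zookeeper %s --alter --topic %s --partitions %s\n' \
    "$zk" "$topic" "$target"
}

# Growing from 3 to 6 partitions is allowed.
alter_partitions_cmd 192.168.10.122:2181 test 3 6

# Shrinking from 6 to 4 is refused.
alter_partitions_cmd 192.168.10.122:2181 test 6 4 || echo "refused: partitions cannot be reduced"
```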
Kafka in Java (see next installment)
Reference: CSDN blog, Blog.csdn.net/qq_26803795…