Kafka Cluster Deployment (Kafka + ZK mode)
Kafka + ZK mode
1. Plan the cluster
If you do not have physical machines, you can use three VMs. I will not cover installing the operating system here; there are plenty of tutorials online.
The author uses virtual machines; the configuration of the three hosts is as follows:
Hostname | IP address | broker.id | myid
---|---|---|---
kafka1 | 192.168.56.107 | 0 | 1
kafka2 | 192.168.56.108 | 1 | 2
kafka3 | 192.168.56.109 | 2 | 3
Note: the environment is CentOS 7 installed via VirtualBox + Vagrant.
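A minimal sketch for applying the plan above, assuming the hostnames and IPs in the table (run the hostnamectl line on each node with its own name):
# Set the hostname (run on each node: kafka1, kafka2, kafka3 respectively)
hostnamectl set-hostname kafka1
# Optional: let the nodes resolve each other by hostname (append on all three nodes)
cat >> /etc/hosts <<'EOF'
192.168.56.107 kafka1
192.168.56.108 kafka2
192.168.56.109 kafka3
EOF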
2. Download the installation package
Download the package from the official website:
wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
3. Set up the Kafka cluster
- Decompress the installation package to the target directory
tar -zxvf kafka_2.13-3.0.0.tgz -C /app
- Create a data folder
cd /app/kafka_2.13-3.0.0/ && mkdir /app/kafka_2.13-3.0.0/logs
- Modify the configuration file server.properties
vi config/server.properties
# advertised.listeners is the address exposed to clients for establishing connections
advertised.listeners=PLAINTEXT://192.168.56.107:9092
# The broker's globally unique ID in the cluster; it must not be repeated
broker.id=0
# Enable the topic deletion feature
delete.topic.enable=true
# false: topics are not created automatically, so producers can only send messages to topics that already exist
auto.create.topics.enable=false
# Number of threads processing network requests
num.network.threads=3
# Number of threads handling disk I/O
num.io.threads=8
# Send buffer size of the socket server
socket.send.buffer.bytes=102400
# Receive buffer size of the socket server
socket.receive.buffer.bytes=102400
# Maximum size of a socket request
socket.request.max.bytes=104857600
# Change the Kafka log directory (and the ZooKeeper data directory), because both default to /tmp, whose contents are lost on restart
log.dirs=/app/kafka_2.13-3.0.0/logs
# Default number of partitions per topic on this broker; usually set to match the number of brokers
num.partitions=3
# Number of threads per data directory used for log recovery and cleanup
num.recovery.threads.per.data.dir=1
# Maximum time, in hours, that a log segment file is retained
log.retention.hours=168
# Connect to the ZooKeeper cluster
zookeeper.connect=192.168.56.107:2181,192.168.56.108:2181,192.168.56.109:2181
Note:
- delete.topic.enable=true: if the server.properties loaded when Kafka starts does not set delete.topic.enable=true, deleting a topic is not a real deletion; the topic is only marked for deletion
- How Kafka completely removes topics and their data
- advertised.listeners
- broker.id must be different on every node in the cluster
- Configuring environment variables
Note: the other two servers are configured in the same way; broker.id must be unique in the cluster.
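Since only broker.id and advertised.listeners differ between nodes, here is a minimal sketch for the other two hosts, assuming the installation was copied to the same /app path (the sed commands are just one way to apply the changes):
# On kafka2 (192.168.56.108)
sed -i 's/^broker.id=0/broker.id=1/' /app/kafka_2.13-3.0.0/config/server.properties
sed -i 's#^advertised.listeners=.*#advertised.listeners=PLAINTEXT://192.168.56.108:9092#' /app/kafka_2.13-3.0.0/config/server.properties
# On kafka3 (192.168.56.109)
sed -i 's/^broker.id=0/broker.id=2/' /app/kafka_2.13-3.0.0/config/server.properties
sed -i 's#^advertised.listeners=.*#advertised.listeners=PLAINTEXT://192.168.56.109:9092#' /app/kafka_2.13-3.0.0/config/server.properties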
4. Configure the built-in ZK
- Create the data folders
# Directory for storing snapshot logs
mkdir -p /app/zookeeper/data
# Directory for storing transaction logs
mkdir -p /app/zookeeper/datalog
- Modify the zookeeper.properties configuration file
vi /app/kafka_2.13-3.0.0/config/zookeeper.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The directory where the snapshot is stored.
# Snapshot log storage path
dataDir=/app/zookeeper/data
# the port at which the clients will connect
# Port that clients use to connect to the ZooKeeper server; ZooKeeper listens on this port for client requests. Change it to another port if needed.
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
# Disable the adminserver by default to avoid port conflicts.
# Set the port to something non-conflicting if choosing to enable this
admin.enableServer=false
# admin.serverPort=8080
#################### Custom ####################
# tickTime is the basic time unit (in ms): the heartbeat interval between ZooKeeper servers, and between clients and servers; a heartbeat is sent every tickTime
tickTime=2000
# Maximum number of heartbeat intervals (tickTime) that the followers in the ZooKeeper cluster may take to connect and sync to the leader when a connection is first established.
# If the leader has not received a response after initLimit heartbeats, the connection attempt is considered failed.
# Total time here: 10 * 2000 ms = 20 seconds
initLimit=10
# Maximum length of time for a request/response exchange between the leader and a follower, measured in tickTime units; here 5 * 2000 ms = 10 seconds
syncLimit=5
# If dataLogDir is not specified, transaction logs are stored in the dataDir directory by default, which seriously affects ZooKeeper performance: when throughput is high, too many transaction logs and snapshot logs are written to the same disk
dataLogDir=/app/zookeeper/datalog
server.1=192.168.56.107:2888:3888
server.2=192.168.56.108:2888:3888
server.3=192.168.56.109:2888:3888
# In server.1, the "1" is the server id (it can be any number that identifies the server); the same id is written to the myid file under the snapshot directory
# 192.168.56.107 is the node's IP address in the cluster; the first port (default 2888) is used for communication between the leader and followers, and the second port (3888) for leader election
- Create the myid file in the path configured by dataDir
Server | Command to create myid
---|---
192.168.56.107 | echo "1" > /app/zookeeper/data/myid
192.168.56.108 | echo "2" > /app/zookeeper/data/myid
192.168.56.109 | echo "3" > /app/zookeeper/data/myid
Note:
- The myid file, referenced by the server.X entries, sits in the snapshot directory and identifies the server; it is an important identifier that the ZooKeeper cluster uses so the nodes can discover each other.
- The other two servers perform the same operation; the myid value must be unique within the cluster
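As a quick sanity check on each node (a sketch assuming the directories created above):
cat /app/zookeeper/data/myid    # should print 1, 2 or 3, matching the table
ls -ld /app/zookeeper/data /app/zookeeper/datalog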
5. Install JDK and configure environment variables
Note:
- Install the JDK on Linux
- ZooKeeper is written in Java, so a Java environment is required (Kafka also runs on the JVM)
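A minimal install sketch for CentOS 7, assuming OpenJDK 8 from the default yum repositories is acceptable (Kafka 3.0 requires Java 8 or newer):
sudo yum install -y java-1.8.0-openjdk-devel   # OpenJDK 8 (JDK, not just the JRE)
java -version                                  # verify the installation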
6. Configure the Kafka environment variable
Create the configuration file: sudo vi /etc/profile.d/kafka.sh
#KAFKA_HOME
export KAFKA_HOME=/app/kafka_2.13-3.0.0
export PATH=$PATH:$KAFKA_HOME/bin
Reload the configuration: source /etc/profile
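To confirm the variables took effect, a quick check (sketch):
echo $KAFKA_HOME               # should print /app/kafka_2.13-3.0.0
which kafka-server-start.sh    # should resolve to a script under $KAFKA_HOME/bin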
7. Start the cluster
- Start ZooKeeper
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
Alternatively, without the -daemon flag, run it in the background with nohup:
nohup bin/zookeeper-server-start.sh config/zookeeper.properties > log/zookeeper/zookeeper.log 2>&1 &
- Start Kafka
bin/kafka-server-start.sh -daemon config/server.properties
Alternatively:
nohup bin/kafka-server-start.sh config/server.properties > zklog/kafka.log 2>&1 &
Start kafka1, kafka2, and kafka3 nodes in sequence
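Once all three brokers are up, a quick verification sketch (assuming the IPs above; zookeeper-shell.sh and kafka-topics.sh ship with the distribution, and the topic name test-topic is only an example):
# List the broker ids registered in ZooKeeper; expect [0, 1, 2]
bin/zookeeper-shell.sh 192.168.56.107:2181 ls /brokers/ids
# Create a replicated test topic and inspect its partition/replica layout
bin/kafka-topics.sh --create --bootstrap-server 192.168.56.107:9092 --replication-factor 3 --partitions 3 --topic test-topic
bin/kafka-topics.sh --describe --bootstrap-server 192.168.56.107:9092 --topic test-topic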
8. Shut down the cluster
Stop Kafka on kafka1, kafka2, and kafka3 in sequence
bin/kafka-server-stop.sh
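After the brokers have stopped, the bundled ZooKeeper can be shut down as well (a sketch using the script shipped with the Kafka distribution):
bin/zookeeper-server-stop.sh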