Kafka Cluster Deployment (Kafka + ZK mode)

“This is the 15th day of my participation in the Gwen Challenge in November. Check out the details: The Last Gwen Challenge in 2021.”

Kafka + zk mode

1. Plan the cluster

If you do not have physical machines, you can use three VMs. I will not cover OS installation here; a quick search will turn up plenty of guides.

I used virtual machines; the three hosts are configured as follows:

Hostname   IP address       broker.id   myid
kafka1     192.168.56.107   0           1
kafka2     192.168.56.108   1           2
kafka3     192.168.56.109   2           3

Note: the environment is CentOS 7 provisioned with VirtualBox + Vagrant.
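So the nodes can reach each other by name, it can help to give every host the same hostname mappings. A small sketch of staging the entries from the plan table (the filename hosts.kafka is my own choice; appending to /etc/hosts needs root):

```shell
# Stage the hostname mappings from the plan table (IPs are this tutorial's examples)
cat > hosts.kafka <<'EOF'
192.168.56.107 kafka1
192.168.56.108 kafka2
192.168.56.109 kafka3
EOF
# On each node, append them with root privileges:
#   sudo sh -c 'cat hosts.kafka >> /etc/hosts'
```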

2. Download the installation package

Download from the official website:

wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
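Before unpacking, it is worth checking the download's integrity. A hedged sketch: it assumes a sibling `.sha512` file whose first field is the hex digest (Apache mirrors publish checksums, but the exact file format can vary, so adjust the parsing if needed):

```shell
# Compare a file's SHA-512 digest with the first field of its .sha512 sidecar file
verify_checksum() {
  file=$1
  expected=$(awk '{print $1; exit}' "${file}.sha512")
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ "$expected" = "$actual" ]
}
# Usage: verify_checksum kafka_2.13-3.0.0.tgz && echo "checksum OK"
```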

3. Set up the Kafka cluster

  1. Extract the package to the installation directory

tar -zxvf kafka_2.13-3.0.0.tgz -C /app

  2. Create a data directory

cd /app/kafka_2.13-3.0.0 && mkdir -p /app/kafka_2.13-3.0.0/logs

  3. Modify the configuration file server.properties

vi config/server.properties

# advertised.listeners is the address exposed to clients for establishing connections
advertised.listeners=PLAINTEXT://192.168.56.107:9092
# The broker's globally unique number within the cluster; it must not repeat
broker.id=0
# Enable topic deletion
delete.topic.enable=true
# false: topics are not auto-created, so a producer must send to an existing topic or get an error
auto.create.topics.enable=false
# Number of threads processing network requests
num.network.threads=3
# Number of threads handling disk I/O
num.io.threads=8
# Buffer size of the send socket
socket.send.buffer.bytes=102400
# Buffer size of the receive socket
socket.receive.buffer.bytes=102400
# Maximum size of a socket request
socket.request.max.bytes=104857600
# Move the Kafka log directory (and the ZooKeeper data directory) out of /tmp,
# because the contents of /tmp are lost on restart
log.dirs=/app/kafka_2.13-3.0.0/logs
# Default number of partitions per topic; generally kept consistent with the broker count
num.partitions=3
# Number of threads per data directory used for log recovery and cleanup
num.recovery.threads.per.data.dir=1
# Maximum time a segment file is retained
log.retention.hours=168
# Connection string for the ZooKeeper cluster
zookeeper.connect=192.168.56.107:2181,192.168.56.108:2181,192.168.56.109:2181

Note:

  • If delete.topic.enable=true is not present in the server.properties that Kafka loads at startup, deleting a topic is not a real deletion; the topic is only marked for deletion
  • How does Kafka completely remove topics and data
  • advertised.listeners
  • broker.id must be different on every broker in the cluster
  4. Configure environment variables

Note: repeat the same operations on the other two servers; broker.id must be unique within the cluster.
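Since only broker.id and advertised.listeners differ between the three servers, the per-node edits can be scripted. A sketch under my own conventions (gen_broker_config is a hypothetical helper, not a Kafka tool; the template contents below are a stand-in for the real server.properties):

```shell
# Rewrite broker.id and advertised.listeners in a copy of server.properties
gen_broker_config() {
  template=$1; id=$2; ip=$3; out=$4
  sed -e "s/^broker\.id=.*/broker.id=$id/" \
      -e "s#^advertised\.listeners=.*#advertised.listeners=PLAINTEXT://$ip:9092#" \
      "$template" > "$out"
}

# Example: derive kafka2's config (broker.id=1, IP .108) from kafka1's settings
printf 'broker.id=0\nadvertised.listeners=PLAINTEXT://192.168.56.107:9092\n' > server.template
gen_broker_config server.template 1 192.168.56.108 server-kafka2.properties
```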

4. Configure the built-in ZK

  1. Create the folders

Directory for storing snapshot logs: mkdir -p /app/zookeeper/data

Directory for storing transaction logs: mkdir -p /app/zookeeper/datalog

  2. Modify the zookeeper.properties configuration file

vi /app/kafka_2.13-3.0.0/config/zookeeper.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
# 
# http://www.apache.org/licenses/LICENSE-2.0
# 
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The directory where the snapshot is stored.
# Snapshot log storage path
dataDir=/app/zookeeper/data
# the port at which the clients will connect
# The port clients use to connect to ZooKeeper; the server listens here and accepts client requests
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
# Disable the adminserver by default to avoid port conflicts.
# Set the port to something non-conflicting if choosing to enable this
admin.enableServer=false
# admin.serverPort=8080

#################### Custom ####################
# tickTime is the heartbeat interval, in milliseconds, between ZooKeeper servers and between clients and servers; a heartbeat is sent every tickTime
tickTime=2000
# initLimit bounds how long followers may take to connect and sync with the leader
# when first joining the cluster, measured in tickTime heartbeats. If the leader has
# received no response after initLimit heartbeats, the connection has failed.
# Total time here: 10 * 2000 ms = 20 seconds
initLimit=10
# syncLimit bounds the request/response delay between leader and follower,
# measured in tickTime units. Total time here: 5 * 2000 ms = 10 seconds
syncLimit=5

# dataLogDir stores the transaction logs. If it is not specified, transaction logs
# go into the dataDir directory, which seriously hurts ZK performance: under high
# throughput, too many transaction logs and snapshot logs pile up on one disk
dataLogDir=/app/zookeeper/datalog
# In server.1, the 1 is the server id (any number that identifies the server); this
# id is written to the myid file under the snapshot directory.
# 192.168.56.107 is the node's IP in the cluster; the first port (default 2888) is
# for leader/follower communication, the second (default 3888) is for leader election
server.1=192.168.56.107:2888:3888
server.2=192.168.56.108:2888:3888
server.3=192.168.56.109:2888:3888
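The tickTime/initLimit/syncLimit settings imply concrete timeouts, and it is easy to recompute them mechanically. A small sketch (the example properties file is written locally just for illustration):

```shell
# Recompute the timeouts implied by the settings above
printf 'tickTime=2000\ninitLimit=10\nsyncLimit=5\n' > zk.example.properties
get() { grep "^$1=" zk.example.properties | cut -d= -f2; }
tick=$(get tickTime)
init_ms=$(( tick * $(get initLimit) ))   # 2000 * 10 = 20000 ms
sync_ms=$(( tick * $(get syncLimit) ))   # 2000 * 5  = 10000 ms
echo "init timeout: ${init_ms} ms, sync timeout: ${sync_ms} ms"
```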
  3. Create the myid file in the path configured by dataDir
Server           Command
192.168.56.107   echo "1" > /app/zookeeper/data/myid
192.168.56.108   echo "2" > /app/zookeeper/data/myid
192.168.56.109   echo "3" > /app/zookeeper/data/myid
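The per-server myid can also be derived from the node's IP address instead of typed by hand. A sketch where myid_for_ip is a hypothetical helper hard-coded to the plan table's example addresses; the demo writes under ./zookeeper-data rather than the real /app/zookeeper/data:

```shell
# Map an IP from the plan table to its myid
myid_for_ip() {
  case "$1" in
    192.168.56.107) echo 1 ;;
    192.168.56.108) echo 2 ;;
    192.168.56.109) echo 3 ;;
    *) echo "unknown ip: $1" >&2; return 1 ;;
  esac
}

# On a real node the IP could come from: hostname -I | awk '{print $1}'
ip=192.168.56.107
mkdir -p ./zookeeper-data
myid_for_ip "$ip" > ./zookeeper-data/myid
```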

Note:

  • The myid file, matched against the server.N entries, identifies each server; it lives in the snapshot directory and is how the members of the ZK cluster discover each other
  • Perform the same operation on the other two servers; the myid value must be unique within the cluster

5. Install JDK and configure environment variables

Note:

  • Installing the JDK for Linux
  • Zookeeper is written in Java so it needs a Java environment

6. Configure the Kafka environment variable

Create the configuration file: sudo vi /etc/profile.d/kafka.sh

#KAFKA_HOME
export KAFKA_HOME=/app/kafka_2.13-3.0.0
export PATH=$PATH:$KAFKA_HOME/bin

Reload the configuration: source /etc/profile
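A quick sanity check that the variables took effect; the fallback export mirrors kafka.sh and is only there so the sketch works even before the profile script exists:

```shell
# Load kafka.sh if present; otherwise fall back to the same exports
if [ -r /etc/profile.d/kafka.sh ]; then . /etc/profile.d/kafka.sh; fi
export KAFKA_HOME=${KAFKA_HOME:-/app/kafka_2.13-3.0.0}
case ":$PATH:" in
  *":$KAFKA_HOME/bin:"*) ;;                      # already on PATH
  *) export PATH="$PATH:$KAFKA_HOME/bin" ;;      # add it idempotently
esac
echo "KAFKA_HOME=$KAFKA_HOME"
```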

7. Start the cluster

  1. Start ZooKeeper

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties

or, with nohup and an explicit log file:

nohup bin/zookeeper-server-start.sh config/zookeeper.properties > log/zookeeper/zookeeper.log 2>&1 &
  2. Start Kafka

bin/kafka-server-start.sh -daemon config/server.properties

or, with nohup and an explicit log file:

nohup bin/kafka-server-start.sh config/server.properties > zklog/kafka.log 2>&1 &

Start kafka1, kafka2, and kafka3 nodes in sequence

8. Shut down the cluster

Disable Kafka on kafka1, kafka2, and kafka3 in sequence

bin/kafka-server-stop.sh