Recently, the Confluent community published an article about Zookeeper being dropped in the upcoming 2.8 release of Kafka, an important improvement for Kafka users. Previously, deploying Kafka required deploying Zookeeper first; from now on, Kafka can be deployed on its own.

1. Introduction to Kafka

Apache Kafka was originally developed by LinkedIn and later donated to the Apache Software Foundation.

Kafka is officially defined as a distributed event streaming platform and is widely used because of its high throughput, persistence, and horizontal scalability. Kafka fills the following roles:

- Message queue: Kafka offers the functions of a message queue, such as system decoupling, traffic peak shaving, buffering, and asynchronous communication.

- Distributed storage system: Kafka can persist messages and replicate them across multiple copies for failover, so it can also be used as a data storage system.

- Real-time data processing: Kafka provides data-processing components such as Kafka Streams and Kafka Connect, enabling real-time processing (a sketch follows this list).
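
To make the stream-processing role concrete, here is a minimal Kafka Streams sketch that counts words in real time. The broker address and the topic names text-input and counts-output are illustrative assumptions, not from the article:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class WordCountSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-sketch"); // also the consumer group id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> lines = builder.stream("text-input");
        lines.flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\s+"))) // split lines into words
             .groupBy((key, word) -> word)   // re-key by word
             .count()                        // continuously updated count per word
             .toStream()
             .to("counts-output", Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```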

Here is Kafka’s message model:

Here are some of the main concepts in Kafka:

- Producer and consumer: the two ends of the message queue. Producers push messages to the queue, and consumers pull messages from it.
- Consumer group: a collection of consumers that can concurrently consume messages from different partitions of the same topic.
- Broker: a server in a Kafka cluster.
- Topic: a logical classification of messages.
- Partition: a physical grouping within a topic; a topic can have multiple partitions. Messages in each partition are assigned an ordered ID called the offset, and within a consumer group each partition is consumed by exactly one consumer.
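
A minimal Java sketch tying these concepts together, assuming a local broker at localhost:9092 and hypothetical names demo-topic and demo-group:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProduceConsumeSketch {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            // The producer pushes a message to a topic; Kafka routes it to a partition by key.
            producer.send(new ProducerRecord<>("demo-topic", "key-1", "hello kafka"));
        }

        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "demo-group"); // consumers sharing this id form one consumer group
        c.put("auto.offset.reset", "earliest");
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(List.of("demo-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                // Every message carries its partition and its ordered offset within that partition.
                System.out.printf("partition=%d offset=%d value=%s%n", r.partition(), r.offset(), r.value());
            }
        }
    }
}
```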

2. Relationship between Kafka and Zookeeper

The Kafka architecture is shown below:



As you can see, Kafka works with Zookeeper. So how do they work together?

Take a look at the chart below:

2.1.1 Broker Registration

As you can see from the figure above, brokers are distributed and need a registry for unified management. Zookeeper holds the list of broker servers under a dedicated node, /brokers/ids.

A broker sends a registration request to Zookeeper at startup. Zookeeper creates a node for it under /brokers/ids, such as /brokers/ids/[0...N], and stores the broker's IP address and port there.

❝ This is an ephemeral node: it is deleted automatically if the broker goes down. ❞
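
You can observe this registration directly with the ZooKeeper Java client. A minimal sketch, assuming ZooKeeper is reachable at localhost:2181:

```java
import org.apache.zookeeper.ZooKeeper;

public class BrokerListSketch {
    public static void main(String[] args) throws Exception {
        // Connect to the ZooKeeper ensemble backing the Kafka cluster (address assumed).
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, event -> {});
        // Every live broker has an ephemeral child node under /brokers/ids.
        for (String id : zk.getChildren("/brokers/ids", false)) {
            byte[] data = zk.getData("/brokers/ids/" + id, false, null);
            // The node data is JSON containing the broker's host, port, and endpoints.
            System.out.println("broker " + id + " -> " + new String(data));
        }
        zk.close();
    }
}
```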

2.1.2 Topic Registration

Zookeeper also assigns a separate node to each topic; every topic is recorded in Zookeeper under /brokers/topics/[topic_name].

Messages for a topic are spread across multiple partitions, and the mapping between these partitions and the brokers also needs to be saved in Zookeeper.

Each partition is stored as multiple replicas. In the preceding figure, the red partition is the leader replica. When the broker hosting the leader replica fails, the partition must elect a new leader, and that election is carried out through Zookeeper.

Once the broker is started, it registers its broker ID with the partition list of the corresponding topic node.

For example, here is the state of partition 1 of topic xxx:

```
[root@master] get /brokers/topics/xxx/partitions/1/state
{"controller_epoch":15,"leader":11,"version":1,"leader_epoch":2,"isr":[11]}
```



When the broker exits, Zookeeper updates the partition list for its corresponding topic.



2.1.3 Consumer Registration

Consumer groups also register with Zookeeper, which allocates nodes for them to store related data. The node path is /consumers/{group_id}, and it has three child nodes (ids, owners, and offsets), as shown below:



This allows Zookeeper to record the relationship between partitions and consumers, as well as their offsets.
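
For the old Zookeeper-based consumer, the committed offset of a partition can be read straight from this node tree. A sketch, with demo-group, demo-topic, and partition 0 as placeholders:

```java
import org.apache.zookeeper.ZooKeeper;

public class ConsumerOffsetSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, event -> {});
        // Old consumers commit offsets to /consumers/{group_id}/offsets/{topic}/{partition}.
        byte[] data = zk.getData("/consumers/demo-group/offsets/demo-topic/0", false, null);
        System.out.println("committed offset: " + new String(data));
        zk.close();
    }
}
```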

2.2 Load Balancing

After brokers register with Zookeeper, producers learn of changes to the list of broker servers through the broker nodes, enabling dynamic load balancing.

Consumers in a consumer group can pull messages from specific partitions based on the topic node information, achieving load balancing on the consumption side.
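
This consumer-side balancing can be watched with a rebalance listener. A minimal sketch, again assuming demo-topic and demo-group; starting or stopping another consumer in the same group triggers a fresh assignment:

```java
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RebalanceSketch {
    public static void main(String[] args) {
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        c.put("group.id", "demo-group");
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c);
        consumer.subscribe(List.of("demo-topic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> parts) {
                // The topic's partitions are spread across the group's members.
                System.out.println("assigned: " + parts);
            }
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> parts) {
                System.out.println("revoked: " + parts);
            }
        });
        while (true) {
            consumer.poll(Duration.ofMillis(500)); // polling keeps the group membership alive
        }
    }
}
```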



In fact, Kafka stores a lot of metadata in Zookeeper, as shown below:



As brokers, topics, and partitions grow, the amount of data stored increases.



3. Introducing the Controller

As we saw in the last section, Kafka is so dependent on Zookeeper that Kafka cannot run without Zookeeper. So how does Kafka interact with Zookeeper?

The diagram below:



A single broker in the Kafka cluster is elected as the Controller to interact with Zookeeper. It is responsible for managing the state of all partitions and replicas in the Kafka cluster, and the other brokers listen for changes to the Controller node's data.

The Controller election relies on Zookeeper: every broker tries to create an ephemeral /controller node, and the broker that succeeds becomes the Controller (a sketch of the mechanism follows).
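
The election boils down to a race to create a single ephemeral node. A simplified sketch of the idea using the ZooKeeper client directly; this illustrates the mechanism and is not Kafka's actual implementation:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ControllerElectionSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, event -> {});
        int brokerId = 1; // this broker's id (placeholder)
        try {
            // Every broker attempts to create the same ephemeral node; exactly one succeeds.
            zk.create("/controller", ("{\"brokerid\":" + brokerId + "}").getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("broker " + brokerId + " is now the Controller");
        } catch (KeeperException.NodeExistsException e) {
            // Another broker won the race; watch /controller to re-elect if it disappears.
            System.out.println("another broker is already the Controller");
        }
        zk.close();
    }
}
```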

The specific responsibilities of the Controller are as follows:

- Monitor partition changes. ❝ For example, when the leader of a partition fails, the Controller elects a new leader for it. When the Controller detects a change in a partition's ISR set, it notifies all brokers to update their metadata. When a topic gains partitions, the Controller is responsible for reassigning them. ❞
- Listen for topic-related changes.
- Listen for broker-related changes.
- Manage cluster metadata.

The following diagram shows the interaction details between Controller, Zookeeper, and broker:



After the Controller election succeeds, the Controller initializes its ControllerContext by pulling the complete metadata from the Zookeeper cluster and caching it on the Controller node. When the cluster changes, for example when a topic gains a partition, the Controller must not only update its local cache but also synchronize the change to the other brokers.

The Controller receives Zookeeper watch events, timed-task events, and other events, buffers them in order in a LinkedBlockingQueue, and lets a single event-handling thread process them sequentially (a sketch of this pattern follows). Most of this processing involves interacting with Zookeeper, and the Controller must update its own metadata along the way.
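
A simplified sketch of that queue-plus-single-thread pattern; the class and method names here are illustrative, not Kafka's actual ControllerEventManager:

```java
import java.util.concurrent.LinkedBlockingQueue;

public class ControllerEventLoopSketch {
    interface ControllerEvent { void process(); }

    private final LinkedBlockingQueue<ControllerEvent> queue = new LinkedBlockingQueue<>();

    // Zookeeper watchers, timed tasks, etc. only enqueue events; they never touch state directly.
    public void enqueue(ControllerEvent event) throws InterruptedException {
        queue.put(event);
    }

    // A single thread drains the queue, so events are handled strictly in order
    // and the Controller's metadata cache is never mutated concurrently.
    public void start() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    queue.take().process();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "controller-event-thread");
        worker.start();
    }
}
```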

4. Problems with Zookeeper

Kafka is itself a distributed system, yet it requires another distributed system to manage it, which increases overall complexity.

4.1 Operation and Maintenance Complexity

Because Zookeeper is required, deploying Kafka means deploying two systems, and Kafka operations personnel must also be able to operate and maintain Zookeeper.

4.2 Controller Failover

Kafka relies on a single Controller node to interact with Zookeeper. If this Controller node fails, a new Controller must be elected from the remaining brokers. As shown below, Broker3 becomes the new Controller.

After the new Controller is elected, it pulls the full metadata from Zookeeper again to initialize itself, and all other brokers must be notified to update their ActiveControllerId. The old Controller must shut down its listeners, event-handling thread, and scheduled tasks. With a large number of partitions this process is time-consuming, and the Kafka cluster cannot work while it is under way.

4.3 Partition Bottleneck

As the number of partitions grows, the metadata stored in Zookeeper grows and the pressure on the Zookeeper cluster rises. Beyond a certain number of partitions, watch-notification latency increases, which affects Kafka's operation.

The number of partitions a single Kafka cluster can host is therefore a bottleneck, and that scale is exactly what some business scenarios demand.

5. Upgrade

The architecture diagram before and after the upgrade is compared as follows:

KIP-500 replaces the previous Controller with a Quorum Controller, in which every Controller node stores all of the metadata, with replica consistency guaranteed by the KRaft protocol. So even if the active Quorum Controller fails, a new Controller can take over very quickly, because the metadata is already local.
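
In KRaft mode the controller quorum is configured in Kafka's own server.properties rather than in Zookeeper. A sketch of the key 2.8 early-access settings, with node ids and hostnames as placeholders:

```properties
# Run this node as both broker and controller (combined mode).
process.roles=broker,controller
node.id=1
# Voters of the Raft metadata quorum, as id@host:port for each controller.
controller.quorum.voters=1@host1:9093,2@host2:9093,3@host3:9093
listeners=PLAINTEXT://host1:9092,CONTROLLER://host1:9093
controller.listener.names=CONTROLLER
```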

Officially, Kafka can easily support millions of partitions after the upgrade.

❝ The Kafka team named this Kafka Raft Metadata mode, KRaft for short: metadata is synchronized through the Raft protocol. ❞

The KIP-500 code that removes Zookeeper has been committed to the trunk branch and is included in the 2.8 release.

Kafka plans to support both the Zookeeper Controller and the Quorum Controller in version 3.0, so that users can test KRaft gradually. [5]

6. Summary

In the context of large clusters and cloud-native deployments, Zookeeper puts significant pressure on Kafka's operations and cluster performance. Getting rid of Zookeeper is an inevitable trend, and it also fits the architectural principle of simplicity.