Architecture example

  • Kafka Cluster: Consists of multiple servers. Each individual server in the cluster is called a broker.
  • Kafka Broker: A server that belongs to a Kafka cluster.
  • Kafka Producer: A client or service that publishes messages to the Kafka cluster.
  • Kafka Consumer: A client or service that reads (consumes) messages from the cluster.
  • Kafka Topic: The name of a category of messages. Data of one kind is stored under one topic, and consumers read data of that kind from the same topic.
    • Order system: Create a topic called Order.
    • User system: Create a topic called user.
    • Commodity system: Create a topic called Product.

Note: Kafka metadata is stored in ZooKeeper.

Architecture analysis

Internal details of the Kafka architecture:

Kafka persists messages. Consumers use a pull model to fetch data, and the consumption state and subscription relationships are maintained by the client. A message is not deleted as soon as it has been consumed; historical messages are retained. As a result, even when a topic has multiple subscribers, only one copy of each message is stored.
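To make the pull model concrete, here is a minimal in-memory sketch. It does not use any Kafka client library; `MessageLog` and `Subscriber` are hypothetical names invented for this illustration. It only shows that the log keeps a single copy of each message while each subscriber tracks its own position.

```python
class MessageLog:
    """Hypothetical stand-in for a broker-side log: one stored copy per message."""

    def __init__(self):
        self._messages = []

    def publish(self, message):
        self._messages.append(message)

    def pull(self, offset, max_records=10):
        # Reading never deletes anything; the log just returns a slice.
        return self._messages[offset:offset + max_records]

    def __len__(self):
        return len(self._messages)


class Subscriber:
    """Hypothetical client that keeps its own consumption state (the offset)."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_records=10):
        # The client asks for data (pull) and advances its own offset.
        records = self.log.pull(self.offset, max_records)
        self.offset += len(records)
        return records


log = MessageLog()
for msg in ["order-1", "order-2", "order-3"]:
    log.publish(msg)

billing = Subscriber(log)
audit = Subscriber(log)

billing_batch = billing.poll(max_records=2)  # billing reads two records
audit_batch = audit.poll()                   # audit independently reads all three

print(billing_batch, billing.offset)
print(audit_batch, audit.offset)
print(len(log))  # the log still holds every message
```

Because the offset lives in the subscriber, adding a second subscriber changes nothing in the log itself.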

  • Broker: A Kafka cluster contains one or more server instances, each of which is called a broker.
  • Topic: Every message published to a Kafka cluster belongs to a category called a topic.
  • Partition: A physical concept. Each topic contains one or more partitions. Each partition corresponds to a folder that stores the partition's data and index files.
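The partition-to-folder mapping can be sketched as a small path helper. The sketch assumes the usual on-disk layout, where a partition folder is named `<topic>-<partition>` and segment files are named after a zero-padded base offset. The log directory `/tmp/kafka-logs` and the helper names `partition_dir` and `segment_files` are assumptions made for this example; nothing here touches a real broker.

```python
from pathlib import PurePosixPath


def partition_dir(log_dir, topic, partition):
    """Folder that holds one partition, e.g. <log_dir>/order-0."""
    return PurePosixPath(log_dir) / f"{topic}-{partition}"


def segment_files(log_dir, topic, partition, base_offset=0):
    """Data and index file paths for the segment starting at base_offset."""
    folder = partition_dir(log_dir, topic, partition)
    name = f"{base_offset:020d}"  # base offset zero-padded to 20 digits
    return [folder / f"{name}.log", folder / f"{name}.index"]


# "/tmp/kafka-logs" is an assumed log directory, used only for illustration.
paths = segment_files("/tmp/kafka-logs", "order", 0)
for path in paths:
    print(path)
```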

Relationships explained

  • Topic & Partition
    • A topic is a named category of data: the place where records are published. Topics can be used to separate business systems.
    • Topics in Kafka are always multi-subscriber. A topic can have one or more consumers subscribing to its data.
    • A topic is a type of message, and every message must specify a topic.
    • The Kafka cluster maintains a partitioned log for each topic.
    • Each partition is an ordered, immutable sequence of records that is continuously appended to a structured commit log file.
    • Each record in a partition is assigned a sequential ID number, called the offset, that indicates its order. The offset uniquely identifies each record within the partition.
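The relationship between a topic, its partitions, and offsets can be sketched with a toy model. `Topic` and `Partition` are hypothetical classes written for this illustration, not part of any Kafka client. The point is that each partition is append-only and numbers its own records, so an offset is unique only within its partition.

```python
class Partition:
    """Hypothetical append-only log; the offset is the record's position."""

    def __init__(self):
        self._records = []

    def append(self, value):
        offset = len(self._records)  # next sequential ID in this partition
        self._records.append(value)
        return offset

    def read(self, offset):
        return self._records[offset]


class Topic:
    """Hypothetical topic that keeps one log per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def publish(self, partition, value):
        # The caller picks the partition here to keep the sketch simple.
        return self.partitions[partition].append(value)


orders = Topic("order", num_partitions=2)
first = orders.publish(0, "order created: A")
second = orders.publish(0, "order paid: A")
other = orders.publish(1, "order created: B")

# Offsets grow within a partition and restart from 0 in another partition.
print(first, second, other)
print(orders.partitions[0].read(1))
```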

The only metadata kept for each consumer is its offset, that is, its position in the log. The offset is controlled by the consumer. Normally a consumer advances its offset linearly as it reads records, but because the position is under the consumer's control, it can consume records in any order it likes. For example, a consumer can reset to an older offset to reprocess past data, or skip ahead to the most recent record and start consuming from "now".
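A small sketch of consumer-controlled offsets follows. The `Consumer` class and its `poll`, `seek`, and `seek_to_end` methods are hypothetical, written only to mirror the behavior described above: reading in order, rewinding to replay old data, and skipping ahead to read only new records.

```python
class Consumer:
    """Hypothetical consumer whose only state is its offset into a partition."""

    def __init__(self, records):
        self._records = records  # shared partition log, never modified here
        self.offset = 0

    def poll(self):
        # Return the next record and advance linearly, or None if caught up.
        if self.offset >= len(self._records):
            return None
        record = self._records[self.offset]
        self.offset += 1
        return record

    def seek(self, offset):
        # Jump to any valid position, for example to reprocess old data.
        if not 0 <= offset <= len(self._records):
            raise ValueError("offset out of range")
        self.offset = offset

    def seek_to_end(self):
        # Skip everything already in the log and wait for new records.
        self.offset = len(self._records)


partition = ["r0", "r1", "r2", "r3"]
consumer = Consumer(partition)

read_in_order = [consumer.poll(), consumer.poll()]  # normal linear reads
consumer.seek(0)
replayed = consumer.poll()                          # reprocess the first record

consumer.seek_to_end()                              # start from "now"
partition.append("r4")                              # a new record arrives
latest = consumer.poll()

print(read_in_order, replayed, latest)
```

Nothing in the partition changes when the consumer moves its offset, which is why rewinding or skipping has no effect on other readers.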

These details show that Kafka consumers are very cheap: they can be added and removed without much impact on the cluster or on other consumers.