Introduction to Kafka
1.1 Introduction to the Kafka model
Kafka is an open source, lightweight, distributed streaming platform that supports multiple partitions and multiple replicas and relies on ZooKeeper for coordination.
Compared with other messaging systems, it has the following advantages:
- It is a message engine that supports the publish-subscribe model.
- It provides fault tolerance when storing data streams.
- It processes data promptly as it arrives.
Kafka's producer-consumer model works as follows: producers create messages and send them to the message engine, and consumers retrieve messages from it.
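The producer-consumer flow can be illustrated with a small in-memory sketch. This is not the Kafka client API: `ToyBroker`, `publish`, and `poll` are hypothetical names chosen for illustration, and the sketch assumes a single process with no networking or persistence.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message engine (hypothetical, not the Kafka API)."""

    def __init__(self):
        # topic name -> append-only list of messages
        self._logs = defaultdict(list)
        # (consumer group, topic) -> index of the next message to read
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Producer side: append a message and return its offset in the log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, group, topic, max_messages=10):
        """Consumer side: return unread messages for this group and advance its offset."""
        start = self._offsets[(group, topic)]
        batch = self._logs[topic][start:start + max_messages]
        self._offsets[(group, topic)] = start + len(batch)
        return batch


broker = ToyBroker()
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")

first_batch = broker.poll("billing", "orders")   # the billing group reads both messages
second_batch = broker.poll("billing", "orders")  # nothing new for the billing group
audit_batch = broker.poll("audit", "orders")     # another group reads independently
```

Messages stay in the log after they are read, so each consumer group tracks its own position. This is what lets several subscribers consume the same topic independently.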
1.2 Kafka design concepts
- High throughput and low latency: Kafka writes messages to the operating system page cache rather than performing disk I/O directly for every message. Kafka can process hundreds of thousands of messages per second, with latency as low as a few milliseconds.
- High scalability: Each topic contains multiple partitions, and partitions within a topic can be distributed among different brokers.
- Load balancing: Kafka uses a leader election algorithm to choose a leader for each partition. Because leaders are spread across brokers, read and write traffic is shared among them instead of concentrating on one node.
- Failover: Kafka brokers register with ZooKeeper through sessions. When a broker's session ends, the cluster treats that broker as failed and elects new leaders for the partitions it was leading, so one node can fail without affecting the others.
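The way partitions, leaders, and failover fit together can be sketched as below. This is a simplified model, not Kafka's actual placement or election logic: the round-robin placement, the "first live replica wins" rule, and the names `assign_replicas` and `elect_leaders` are all assumptions made for illustration.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Round-robin replica placement; the first replica in each list is the preferred leader."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return assignment


def elect_leaders(assignment, live_brokers):
    """Pick the first live replica of each partition as leader (None if all replicas are down)."""
    leaders = {}
    for partition, replicas in assignment.items():
        leaders[partition] = next((b for b in replicas if b in live_brokers), None)
    return leaders


brokers = ["broker-0", "broker-1", "broker-2"]
assignment = assign_replicas(num_partitions=3, brokers=brokers, replication_factor=2)

# With every broker alive, each broker leads exactly one partition
before = elect_leaders(assignment, live_brokers=set(brokers))

# broker-0 fails: partition 0 moves to its surviving replica on broker-1
after = elect_leaders(assignment, live_brokers={"broker-1", "broker-2"})
```

With three brokers and three partitions, leadership starts out evenly spread. After `broker-0` fails, only the partition it led changes leader, and the other partitions are unaffected.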
1.3 Usage Scenarios of Kafka
- Behavior tracking: Kafka can be used to track user behavior. For example, when you shop on Jingdong (JD.com), your browsing history, search terms, and shopping preferences are sent to Kafka as messages.
- Message passing: A basic use of Kafka is to pass messages, acting either as a message bus or as a message broker.
- Log collection: An enterprise runs many applications, and each one generates logs. Kafka can serve as a log aggregation solution that makes the logs easier to manage.
- Stream processing: Kafka can process data streams itself through Kafka Streams, or hand the streams to other frameworks for computation.
- Traffic buffering: In Internet services, too many requests may arrive at once. Writing the requests to Kafka first, rather than sending them straight to the back-end application, helps avoid the service crashes that a sudden burst could cause.
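The last scenario, writing bursts of requests to Kafka instead of sending them straight to the back end, can be illustrated with a plain in-memory queue. This is a minimal sketch in which a `deque` stands in for a Kafka topic; the `drain` helper and the capacity of 4 requests per tick are hypothetical values chosen for the example.

```python
from collections import deque


def drain(buffer, capacity_per_tick):
    """Take at most capacity_per_tick requests from the front of the buffer."""
    batch = []
    while buffer and len(batch) < capacity_per_tick:
        batch.append(buffer.popleft())
    return batch


buffer = deque()

# A burst of 10 requests arrives at once and is written to the buffer,
# not sent directly to the back end
buffer.extend(f"request-{i}" for i in range(10))

# The back end pulls work at its own pace: at most 4 requests per tick
processed_per_tick = []
while buffer:
    processed_per_tick.append(len(drain(buffer, capacity_per_tick=4)))
```

The back end never sees more than its configured capacity in one tick. The burst is absorbed by the buffer and worked off over several ticks.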
1.4 Basic terms for Kafka
- Message: The data unit in Kafka is called a message, also known as a record. A message has three elements: the key, which is used for partitioning; the value, which is the message body holding the data; and the timestamp, which indicates when the message was sent.
- Topic: A category of messages is called a topic; each topic represents one class of messages. Topics are often used to separate different businesses, one topic per business.
- Partition: A topic can be divided into partitions, so Kafka has a topic-partition-message hierarchy. The purpose of partitioning is to improve Kafka's throughput.
- Replica: Replicas provide message reliability. The data in a partition is copied to replicas to prevent message loss. Kafka defines two types of replicas: the leader replica, which serves client requests, and the follower replica, which passively follows the leader and acts as a backup.
- Broker: A single Kafka server is called a broker.
- ISR: Kafka maintains a set of in-sync replicas (ISR) in ZooKeeper for each partition of a topic and updates this record as replicas join or leave the set. If the leader of a partition becomes unavailable, Kafka selects a replica from the ISR as the new leader. This gives a high tolerance for failures: if a partition has N+1 replicas, it can tolerate N server failures.
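The three elements of a message and the role of the key in partitioning can be sketched as follows. The `Message` and `Topic` classes and the `choose_partition` function are hypothetical names for this example. The sketch uses CRC32 as a stable hash for simplicity, which is an assumption and not necessarily the hash that Kafka's own partitioner uses.

```python
import time
import zlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    key: Optional[bytes]  # used to choose the partition
    value: bytes          # the message body
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


def choose_partition(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32 here, as an assumption)."""
    return zlib.crc32(key) % num_partitions


class Topic:
    """Topic -> partitions -> messages, mirroring the hierarchy described above."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, message):
        if message.key is None:
            raise ValueError("this sketch only handles keyed messages")
        partition = choose_partition(message.key, len(self.partitions))
        self.partitions[partition].append(message)
        return partition


topic = Topic("user-events", num_partitions=3)
p1 = topic.append(Message(key=b"user-42", value=b"viewed item"))
p2 = topic.append(Message(key=b"user-42", value=b"added to cart"))
```

Because the partition is derived only from the key, messages with the same key always land in the same partition and keep their relative order there.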
Kafka tutorial
- Introduction to Kafka
- Installing Kafka on Linux and on Windows
- Sending messages from producers and receiving them with consumers in Kafka