Kafka

Interview questions crawled from a website and summarized here; if you are interested, try answering them.

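  • If a client consuming a topic goes down, does Kafka remember its consumption offset? Only for offsets the client has committed (stored in the internal `__consumer_offsets` topic); progress that was consumed but not committed is known only to the client.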

  • Kafka’s high watermark (not answered)

  • 17. Why choose Kafka? How does Kafka differ from RabbitMQ? 19. How do Kafka partitions synchronize? 20. How does Kafka guarantee that messages are not lost? Explain partition, broker, consumer, consumer group, and topic. Why can Kafka sustain such a high QPS? (Shopee)

  • 2. How do you ensure that Kafka messages are processed in order? 3. How do you ensure the reliability of Kafka messages? Explain Kafka controller election and partition leader election.

  • 4. Kafka architecture; how do you use Kafka so that message order is preserved? (JD.com)

  • How Kafka ensures message ordering (per partition)

  • Why Kafka consumption is ordered (ByteDance)

  • 7. What are Kafka’s features and usage scenarios? What is a Kafka partition? Can one producer send messages to multiple partitions? 9. How are Kafka messages organized on disk? (ByteDance)

  • Describe Kafka topic partitions: how many producers can write to one partition (unlimited), and how many consumers can read from it (at most one per consumer group)? (Tencent PCG)

  • What if producers produce data faster than consumers can consume it? What mechanism does Kafka provide for this?

  • How Kafka guarantees that messages are consumed successfully (ByteDance)

  • How does Kafka achieve high throughput? (1) Sequential reads and writes: messages are appended to the end of a log file, so Kafka reads and writes sequentially and makes full use of the disk’s sequential I/O performance. (2) Zero copy: the copy into a user-space buffer is skipped; disk space is mapped directly to memory, so data is no longer copied into a user-mode buffer on its way out. (3) File segmentation: a topic is divided into partitions, and each partition is divided into segments, so the messages of one topic are stored across N segment files. (4) Data compression: Kafka supports compressing message sets; the producer can compress them with GZIP or Snappy.
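To make the append-only and segmentation points concrete, here is a toy in-memory model of a single partition. It is only a sketch under simplifying assumptions: `SegmentedLog`, `max_segment_records`, and the list-based segments are hypothetical names for illustration, and real Kafka rolls segments by size or time on disk rather than by record count in memory.

```python
import bisect


class SegmentedLog:
    """Toy model of one partition: append-only, split into fixed-size segments."""

    def __init__(self, max_segment_records=3):
        self.max_segment_records = max_segment_records
        self.base_offsets = [0]   # first offset held by each segment
        self.segments = [[]]      # each segment is a list of messages
        self.next_offset = 0

    def append(self, message):
        # Writes only ever go to the end of the active (last) segment.
        if len(self.segments[-1]) >= self.max_segment_records:
            # Active segment is full: roll over to a new one.
            self.base_offsets.append(self.next_offset)
            self.segments.append([])
        self.segments[-1].append(message)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset):
        if not 0 <= offset < self.next_offset:
            raise IndexError(f"offset {offset} out of range")
        # Binary-search the base offsets to find the segment holding this offset.
        idx = bisect.bisect_right(self.base_offsets, offset) - 1
        return self.segments[idx][offset - self.base_offsets[idx]]
```

Because each segment knows its base offset, a read needs only a binary search over segments followed by a direct index, and whole old segments can be dropped without touching the active one.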

  • 6. Why does Kafka perform so well? (Ele.me)

  • 4. How does Kafka prevent message loss? (YY)

  • 4. Your understanding of Kafka’s replica synchronization mechanism. 5. How are Kafka message writes acknowledged (acks)? (Tencent)

  • Kafka’s leader/follower election mechanism (ByteDance)

  • 19. The relationships among topic, partition, and broker in Kafka (NetEase)

  • What is Kafka’s ISR (in-sync replica) queue, and how is a leader elected from it?

  • Data backlogs and skew in Kafka

  • Why did you choose Kafka (Kafka’s features)? Deeper follow-up questions cover Kafka’s ISR mechanism and partition leader election.

  • How does Kafka classify data when some data needs to be deleted and some needs to be retained? What problems does Kafka have, and what goes wrong when there is too much data? (Dahua)

  • 3. Have you studied Kafka’s underlying structure? How do Redis and Kafka communicate with each other? (Meituan)

  • 5. How does Kafka guarantee high availability? 6. What are Kafka’s performance bottlenecks? 7. Do you know how Kafka differs from other MQs? Take ActiveMQ as an example. 8. Which parts of Kafka’s design improve its performance? (ByteDance)

  • How can Kafka be so efficient? How does Kafka ensure that messages are not repeated and not lost?

  • Kafka ISR and OSR. How many values can acks take? (Kuaishou)

  • How do replicas synchronize messages in Kafka? Do you know Kafka’s consumer groups and how they consume messages? How is a message located, and which replica is it consumed from? (JD.com)

  • 18. What do you know about Kafka itself? What is a topic, and what are its principles and structural design? Do consumers read from the leader partition, or can followers be read as well? Why can’t followers be read? In what situations are messages missed (consumers may lose messages)? 22. Do you know the ISR queue? 23. How does the leader decide that a follower has failed? 24. What is the usual timeout? 25. How did you learn Kafka? (From whom?)

  • 12. Does your system produce the Kafka data itself? Have you ever set Kafka offsets manually? (58.com)

  • Why Kafka’s throughput is high (Yuanfudao)

  • How does Kafka achieve high throughput? How does Kafka cope with a large number of producers when no more partitions can be added, and how does it keep data from piling up? 4. What does ZooKeeper do for Kafka? 4. How is Kafka’s high throughput implemented? (great)

  • If the number of concurrent requests to Kafka is too high for the broker to handle, how do you solve the problem? (Bilibili)

  • How does Kafka handle two clients consuming the same data? (Yidian Zixun)

  • If a Kafka topic is repartitioned while Flink is consuming from it, will Flink be affected?

  • 7. Reasons for Kafka’s high throughput (ByteDance)

  • 9. Scenario: Kafka goes down. How do I pick up where I left off?

  • 1. Kafka architecture. 2. How do Kafka’s leader and follower replicas synchronize? 3

  • When Spark Streaming connects directly to Kafka and the number of Kafka partitions increases at some point, how does Spark find out? Why does Kafka have consumer groups? What are the roles of the producer, broker, and consumer in Kafka?

  • 8. What is kafka’s consumer group? Why is there a consumer group?

  • 12. What happens if the consumer dies after consuming messages but before committing the offset? How do you find out which Kafka partitions Spark Streaming is currently consuming and which offsets it has reached?

  • Kafka’s replica mechanism

  • Components of Kafka. How do consumer groups consume data? What are Kafka’s features? High throughput and low latency. How are they achieved? The producer sends asynchronously, the broker writes to disk sequentially, and data is transferred with zero copy.

  • If the price of an item is changed several times, how does Kafka ensure the changes are processed in order?

  • Kafka features: high throughput and low latency. Why? Zero copy and sequential writes. (Does Kafka really use zero copy? Yes!) What else can be done to improve throughput? Asynchronous production. Then a coding exercise.

  • Basic understanding of Kafka; do you know its replica mechanism? (Hang Seng)

  • Kafka: consumer group vs. partition; consumer rebalance? ISR? How do you handle a message backlog? Add partitions and consumers, or poll in batches; disable transactions; have consumers hand received messages to worker threads through a blocking queue. session.timeout.ms (ByteDance)
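The worker-thread-plus-blocking-queue idea for draining a backlog can be sketched roughly as follows. This is a minimal Python illustration, not Kafka client code: `run_pipeline`, the `batches` argument (standing in for the results of successive polls), and the doubling step that plays the role of business logic are all hypothetical.

```python
import queue
import threading


def run_pipeline(batches, num_workers=4, capacity=8):
    """Hand polled records to worker threads through a bounded blocking queue."""
    work = queue.Queue(maxsize=capacity)  # put() blocks when workers fall behind
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            record = work.get()
            if record is None:        # sentinel: no more work for this thread
                break
            processed = record * 2    # placeholder for real processing
            with lock:
                results.append(processed)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The "poll loop": each batch stands in for one poll of the consumer.
    for batch in batches:
        for record in batch:
            work.put(record)

    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The bounded queue gives natural back-pressure on the polling thread. The trade-off is that records handled by different workers may complete out of order, so this pattern suits workloads where per-partition ordering is not required or is enforced elsewhere.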

  • 12. Is Kafka push-based or pull-based? How is a partition’s data stored on disk?

  • 5. Why Kafka? What is a broker in Kafka? How does Kafka implement partitioning?

  • How does Kafka prevent duplicate consumption and message loss?

  • 3. How do you ensure high availability of a message queue? How do you ensure that messages are not consumed more than once? 4. What are the advantages and disadvantages of Kafka, ActiveMQ, RabbitMQ, and RocketMQ? 5. If you were to write a message queue, how would you design its architecture?

  • How does Kafka guarantee high throughput

  • Kafka’s message recovery mechanism

  • How does Kafka guarantee message reliability? What’s the difference between push and pull?

  • How Kafka works, whether messages are pulled or pushed

  • Usage scenarios of Kafka as a message queue. How does Kafka ensure reliable message delivery?

  • 9. Why use Kafka, and how does Kafka implement high availability? 10. How does Kafka handle message loss? 11. How does Kafka guarantee idempotence? 12. How does Kafka ensure that a message is consumed by only one consumer? 13. What are the application scenarios of message queues?

  • 7. How do Kafka consumers avoid consuming duplicate data? Commit offsets after processing, and make the processing itself idempotent, because a commit can still fail. How do you usually do that? What is the difference between Redis and MySQL? If the record at some offset has been consumed but the manual offset commit then fails, how do you handle it? Roll back, using Kafka transactions. Describe Kafka transactions.

  • 5. Kafka’s fault-tolerance mechanism for message consumption (everyone)

  • 2. Why does Kafka need a cluster, and how do you estimate Kafka’s throughput?

  • Kafka’s replica mechanism and how it compares with other message queues. (new)

  • You used a message queue in your project. Why? (This is about the business scenario, and it also lets the interviewer gauge your project experience.)
    What are the advantages and disadvantages of using a message queue in your project? Without one, coupling makes extension too costly, synchronous calls take too long, and concurrent requests put too much pressure on the system.
    What pitfalls did you hit in the project, and how can they be avoided? An MQ is a third-party product, so introducing it makes the business logic depend on it: the MQ itself may go down, messages may be delivered more than once, messages may be lost, or messages may arrive out of order because of internal thread delays.
    Which message queue products have you worked with, and how do they compare? RocketMQ, RabbitMQ, ActiveMQ, Kafka... Their throughput should differ; I honestly don’t know the exact similarities and differences, because I have only used one.
    How do you make a message queue highly available? Talk about clustering. A message queue is a third-party product introduced to handle large volumes of requests, so a stand-alone deployment defeats the purpose.
    Repeated consumption: in Kafka every message has a sequence number, technically called the offset. Consumers read the queue in offset order and commit their consumption progress periodically. If a consumer goes down, it resumes from the committed offset after restarting. If it went down before committing the offsets of messages it had already consumed, those messages are consumed again after the restart.
    How do you make consumption idempotent? This is the pit dug by the previous question: since duplicate messages exist, they have to be dealt with. Check whether the received message has already been written to the database, and consume it only if it has not. This is equivalent to a query check before writing to the database.
    How can a message queue lose messages? The producer may lose them (MQ transactions can confirm whether the MQ received a message, but they block synchronously and perform poorly; an asynchronous callback mechanism is more efficient), the MQ may lose them when it goes down (enable persistence; both the queue and the sent messages must be marked persistent), and the consumer may lose them while consuming (use the queue’s ack mechanism).
    How do you keep messages in order? How do you handle message delay and expiry? What if the queue is full and millions of messages keep backing up? Scale out the hardware and add machines to consume messages.
    How would you design a message queue architecture? Think of a message queue as middleware that sits between client requests and the database: it accepts requests from thousands of clients and then writes them to the database. Implementing it requires scalability (a distributed architecture across physical machines) and safety (ordering and loss problems, hence numbering the messages in the queue).
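The replay-after-restart behaviour and the query-before-write check described above can be sketched in a few lines. This is an illustrative toy, not a Kafka client: `IdempotentConsumer`, the dict standing in for a database table, and the `(message ID, payload)` tuples are all assumptions made for the example.

```python
class IdempotentConsumer:
    """Toy consumer: at-least-once delivery plus a dedup check before writing."""

    def __init__(self):
        self.db = {}         # stands in for a table keyed by a unique message ID
        self.committed = 0   # next offset to read after a restart

    def handle(self, msg_id, payload):
        # Query before write: skip messages that were already stored.
        if msg_id in self.db:
            return False
        self.db[msg_id] = payload
        return True

    def consume(self, log, stop, commit=True):
        """Process log[committed:stop]; return how many records were newly applied."""
        applied = 0
        for offset in range(self.committed, stop):
            msg_id, payload = log[offset]
            if self.handle(msg_id, payload):
                applied += 1
        if commit:
            self.committed = stop
        return applied


log = [("order-1", 100), ("order-2", 250), ("order-3", 75)]
consumer = IdempotentConsumer()

# Two records are processed, then the consumer "crashes" before committing.
first_run = consumer.consume(log, stop=2, commit=False)

# After the restart it resumes from the last committed offset (still 0),
# so order-1 and order-2 are delivered again but are not applied twice.
second_run = consumer.consume(log, stop=3)
```

In a real system the dedup check and the write would need to be atomic, for example through a unique key constraint in the database, otherwise two concurrent consumers could both pass the check.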

  • 4. How do you solve message loss and duplication during consumption? (Ant Group)

  • 11. How does Kafka guarantee that messages are neither lost nor consumed twice? (PayPal)

  • How does RabbitMQ work? How do you ensure that messages are processed in order? Do you know Kafka? How does it differ from RabbitMQ? Why didn’t you use Kafka? What was your reasoning at the time? (Pinduoduo)

  • Which message-oriented middleware do you know? (If you introduced a message queue, you should have evaluated the options on the market: concurrency, maintenance status, special features such as Kafka’s log collection.) Which middleware is used in your production environment? Why introduce middleware at all? (Tie it to the project: decoupling, asynchronous processing, peak shaving.) What are the disadvantages of using a message queue? (Availability when the queue goes down, plus message loss, data consistency, duplicate data, and ordering: a whole new set of problems.) Since these problems exist, how do you avoid and solve them? (High availability: master/slave replication; duplicate consumption: a unique primary key in the database; data loss: distinguish loss at the producer, in the queue, and at the consumer.) (TCL)

  • How does Kafka guarantee that messages are neither lost nor consumed twice? (Meituan)

  • How do you make Kafka reliable, highly available, and idempotent?

  • How does Kafka work? How do I keep messages from getting lost? (Pinduoduo)

  • How does Kafka keep messages in order? (Baidu)

  • Kafka: how it ensures reliability, how its leader/follower mechanism works, what the ISR is, and consumer configuration. (ByteDance)

  • How does Kafka guarantee exactly-once delivery? 3. How does your company maintain offsets? Why not store them in MySQL? 4. Kafka guarantees only per-partition ordering; how do you guarantee global ordering? 5. Describe Kafka transactions. (360)

  • 3. How does message ordering work in Kafka? Messages within each partition are written in order. On the consuming side, each partition can be consumed by only one consumer in each group, so consumption within a partition is also ordered. Ordering across the whole topic is not guaranteed. To guarantee total order for a topic, set its partition count to 1. (WeBank)
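A common way to get the ordering that matters without dropping to a single partition is to give related messages the same key, so that they all land in the same partition. The sketch below illustrates the idea only: `partition_for` and `produce` are hypothetical helpers, and `zlib.crc32` is swapped in as a stable hash for the example (the Java client's default partitioner uses a murmur2 hash of the key).

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    # Stable hash of the key, so the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events, num_partitions):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


# Price changes for two items, interleaved in the order they were sent.
events = [("item-1", 10), ("item-2", 5), ("item-1", 12), ("item-1", 9)]
partitions = produce(events, num_partitions=3)

# All updates for item-1 sit in one partition, in their original order.
item1_partition = partitions[partition_for("item-1", 3)]
item1_prices = [value for key, value in item1_partition if key == "item-1"]
```

Order is preserved per key rather than across the topic, which is usually enough for cases such as successive price updates to one item.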


  • 10. How does Kafka handle message loss? 11. How does Kafka guarantee idempotence? 12. How does Kafka ensure that a message is consumed by only one consumer? 13. Application scenarios of message queues.