When using MQ middleware, mishandling message data can lead to message loss and repeated consumption. This article analyzes the scenarios in which these two problems occur in Kafka and discusses feasible solutions.

Message loss

Message loss can occur on both the producer side and the consumer side.

producer

The message sender can configure the request.required.acks property to control how reliably messages are sent:

0: the producer does not wait for any acknowledgment that the message was received

1: the producer waits until the Leader acknowledges receipt of the message

-1: the producer waits until both the Leader and the Followers acknowledge receipt of the message

If the value is set to 0, the producer sends messages without waiting for any acknowledgment. If the Kafka server fails to receive a message for any reason, that message is silently lost.

If the value is set to 1, the producer considers a send complete once the topic's leader has confirmed receipt. The followers, however, may not yet have replicated the message. If the leader then crashes, a follower that never received the message may be elected as the new leader, and the message is lost.

Setting the value to -1 ensures that the Kafka server has received and persisted the message, and the producer can retry if the send fails.
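For illustration, here is a minimal Java producer sketch using the strictest acknowledgment setting. The broker address, topic name, and retry count are assumptions made up for this example; note that in the current Java client the property is acks, where acks=all is equivalent to -1:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SafeProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // acks=all is equivalent to -1: leader and all in-sync replicas must acknowledge
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // retry transient send failures instead of dropping the message
        props.put(ProducerConfig.RETRIES_CONFIG, 3);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            // send failed even after retries; log or persist for redelivery
                            exception.printStackTrace();
                        }
                    });
        }
    }
}
```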

consumer

On the consumer side, a message is lost when an exception occurs during processing but the message's offset has already been committed: the message whose consumption failed is never delivered again. Offsets can be committed either manually or automatically, which can be configured via kafka.consumer.enable-auto-commit. Manual committing lets you decide flexibly whether to commit the offset of consumed data, which avoids message loss well. Automatic committing is the main cause of data loss, because the outcome of message processing has no effect on whether the offset is committed. Auto-commit is triggered by the following events:

Partition assignment via KafkaConsumer.assign()

The ConsumerCoordinator.poll() method (which calls maybeAutoCommitOffsetsAsync() during processing)

Before a consumer rebalance operation

When the ConsumerCoordinator is closed

If an exception occurs while a message is being consumed, ConsumerCoordinator.poll() runs before the next fetch and commits the offset of the current message, so the message is lost.

To preserve data integrity as much as possible, manual committing is preferred.
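The following is a sketch of manual committing with the Java client, assuming a local broker and a demo-topic/demo-group invented for this example. Auto-commit is disabled, and the offset is committed only after the batch has been processed without exceptions, so a message that fails is redelivered rather than lost:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // disable auto-commit so an offset is committed only after successful processing
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // hypothetical business handler; may throw
                }
                // commit only after the whole batch was processed without exceptions;
                // on failure the offset stays put and the batch is re-delivered
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```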

Repeated consumption

For example, suppose we commit offsets manually, and Kafka goes down or the network fails while we are processing a consumed message, say, writing it to the database.

Because the offset could not be committed, the consumer will consume the same message again after the service restarts or a rebalance occurs.

Handling repeated consumption is mainly a matter of consumer-side coding: we need to design the code for idempotency, so that consuming the same data any number of times produces the same result. One approach is to add a unique identifier to the message body; the consumer checks this identifier to determine whether the message has already been consumed, and if so, skips any further processing (a sketch follows below). This avoids repeated consumption as far as possible.
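A minimal sketch of such an idempotent consumer-side check, using an in-memory set purely for illustration; a production service would typically use a shared store such as Redis or a unique database constraint instead:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentHandler {
    // In-memory dedup store for illustration only; a real deployment would use
    // a shared store (e.g. Redis SETNX) or a unique database constraint
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    /** Returns true if the message was processed, false if it was a duplicate. */
    public boolean handle(String messageId, String payload) {
        // add() is atomic: it returns false when the id was already recorded,
        // so a redelivered message is skipped instead of being processed twice
        if (!processedIds.add(messageId)) {
            return false;
        }
        saveToDatabase(payload); // hypothetical business write
        return true;
    }

    private void saveToDatabase(String payload) {
        System.out.println("persisted: " + payload);
    }
}
```

The key point of the design is that the duplicate check and the "mark as consumed" step happen as one atomic operation, so the same result is produced no matter how many times the same message is delivered.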

Original link: blog.csdn.net/qq_38245668…

Reference: www.zhihu.com/question/35… zhuanlan.zhihu.com/p/54287819 zhuanlan.zhihu.com/p/136799968