Possible message loss?

  1. In the process of sending messages to the Broker, the Producer loses them due to network problems, or the Message reaches the Broker but has problems and is not saved

To solve this problem, Producer can start a transaction for MQ. If the process fails, it can be rolled back. However, there is a big problem

  1. The Broker receives a Message that is temporarily stored in memory and hangs before the Consumer can consume it

This can be resolved through persistent Settings:

(1) Set persistence when creating a Queue, so that the Broker persists the metadata of the Queue, but does not persist the messages in the Queue; (2) By setting the deliveryMode of Message to 2, the Message can be persisted to disk. In this way, only after Message is supported to disk can the Producer ACK be sent.

After these two steps, even if the Broker hangs and the Producer cannot receive an ACK, the Producer can resend the ack.

  1. The Message has not yet been processed. The Broker assumes that the Consumer has finished processing and will only send subsequent messages

At this point, autoACK is turned off, and after the message is processed, manual ack is performed

How to ensure reliable delivery at the production end?

  1. Ensure successful delivery of messages

  2. Ensure a successful reception to the MQ node

  3. The sender received an acknowledgement from the MQ node (Broker)

  4. Complete message compensation mechanism

The solution

The message fell library

  1. Dropping messages to the database, changing the state of the message, is very stressful for the database in a high concurrency environment, because the database needs to be written multiple times

Overall process:

  1. Both business data and messages are logged

  2. The production end sends a message to the Broker

  3. The Broker returns the Confirm response to the production end

  4. The message status is changed after confirmation is received

  5. Distributed scheduled tasks obtain message status

  6. If the message fails to be delivered, resend the message and record the resending times

  7. After 3 retransmissions, the state is modified and only manual intervention is possible

Delayed delivery of messages

Delayed delivery of messages, secondary validation, callback checks. Suitable for high concurrency environment, reduce the number of library write

Overall process:

  1. The upstream service first registers the business code and sends a message to the Broker

  2. Send the second delayed acknowledgement message

  3. Downstream services listen for messages for consumption

  4. Send a confirmation message. This is not a confirm mechanism, but a new message

  5. The confirm message is listened for by the callback service and then stored

  6. The callback service detects a delayed acknowledgement message and queries the database for its presence

  7. If the message is not found, the callback service sends a resend command through RPC to the upstream system

Compared with the first scheme, there is less message entry, and confirm mechanism is a core of reliable message delivery.

RabbitMQ can lead to non-idempotent situations

  1. Reliability message delivery mechanism: The consumer replies confirm. The network is interrupted and the producer does not receive an ACK. Scheduled task polling may resend the message, so that the consumer receives two messages

  2. Network jitter occurs during message transmission between the MQ Broker and the consumer. Procedure

  3. The consumer is faulty or abnormal

Kafka can be non-idempotent

When the offset is not committed on the Consumer side, the Consumer restarts, and the double consumption occurs

The solution

Unique ID+ fingerprint code

The overall implementation is relatively simple, need to write database, use the database primary key to duplicate, use ID for sub-database sub-table algorithm routing, from single database idempotent to multi-database idempotent

  1. In this case, the unique ID is usually the primary key of the business table, such as the item ID
  2. Fingerprint code: A fingerprint code is generated for each operation. You can use the timestamp + service id +… The purpose is to ensure that every operation is normal

Overall process:

  1. A unified ID generation service is required. To ensure reliability, the upstream service also needs to have a local ID to generate the service and send messages to the Broker
  2. The ID rule routing component is required to listen for the message. First, the message is stored in the database. If the message is successfully stored, it is proved that there is no duplicate and then sent to the downstream

Recommended reading

  1. MySQL > select * from pagesize; Tell me something about the way of thinking
  2. Redis, Kafka, and Pulsar message queues are compared
  3. After using IDEA for so long, you don’t even know that there is a function called auto complete!
  4. Why not use the BeanUtils property conversion tool