1. Why does message accumulation occur?

This usually happens because a Consumer problem is not detected in time, or because recovery from a failure takes a long time, so a large backlog of messages builds up in the MQ.

2. What are the consequences of message accumulation?

2.1 Messages Are Discarded

For example, RabbitMQ lets you set an expiration time (TTL). Messages that sit in the queue past their TTL are discarded and are gone for good.

2.2 The Disk Is Full

If the backlog grows too large, the broker may run out of disk space and can no longer accept new messages.

2.3 Massive Numbers of Messages to Process

Even if messages do not expire and disk space is sufficient, the Consumer still faces a nightmare: a flood of messages waiting to be consumed.

3. How to cope?

3.1 Discarded Messages

First, to prevent messages from expiring, do not set an expiration time.

If an expiration time was set and messages have already been lost, how can this be remedied?

Wait until late at night, when traffic is lowest, and run a temporary program to backfill the lost messages.

For example, if order messages were lost, find out which orders are missing and resend them to the queue.
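A minimal sketch of that backfill step, assuming the IDs of all created orders can be read from the business database and that the IDs already processed by the Consumer are recorded somewhere. The function names and the `send` callback are hypothetical and stand in for whatever storage and MQ client are actually in use.

```python
from typing import Callable, Iterable, List


def find_missing_orders(created_ids: Iterable[str],
                        processed_ids: Iterable[str]) -> List[str]:
    """Return IDs of orders that were created but never processed.

    The result keeps creation order and drops duplicate IDs.
    """
    processed = set(processed_ids)
    seen = set()
    missing = []
    for order_id in created_ids:
        if order_id not in processed and order_id not in seen:
            seen.add(order_id)
            missing.append(order_id)
    return missing


def republish(missing_ids: Iterable[str], send: Callable[[str], None]) -> int:
    """Resend each missing order ID via `send` (hypothetical MQ publish call).

    Returns the number of messages resent.
    """
    count = 0
    for order_id in missing_ids:
        send(order_id)
        count += 1
    return count
```

Because a resend can duplicate a message that was in fact processed, this approach works best when the Consumer handles each order idempotently.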

3.2 Insufficient Disks

The system is usually monitored. When disk usage reaches a threshold, an alarm is raised and immediate action is required.

For example, create a temporary message queue on another machine and write a temporary Consumer that acts as a relay: it removes messages from the backlogged queue and places them in the temporary queue.

This quickly clears the backlog and restores disk space to normal levels.
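A threshold check of this kind might look like the sketch below. The 80% default threshold and the function names are assumptions for illustration; a real deployment would normally rely on its existing monitoring system rather than a hand-written script.

```python
import shutil


def usage_ratio(used_bytes: int, total_bytes: int) -> float:
    """Return the fraction of disk space in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def should_alert(path: str, threshold: float = 0.8) -> bool:
    """Return True when the disk holding `path` is at or above `threshold`.

    The 0.8 default is an assumed value, not a recommendation from any broker.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage_ratio(usage.used, usage.total) >= threshold
```

Calling `should_alert` periodically on the directory where the broker stores its data would give an early warning before the disk fills up.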

3.3 Rapidly Processing Massive Numbers of Messages

After the Consumer has returned to normal, how should it deal with the mountain of messages?

Consuming at the normal rate would take far too long, and new messages keep arriving the whole time.

For example, suppose the backlog is 1 million items and there are 3 consumers, each of which can process 200 items per second, for a total of 600 items per second.

Clearing the backlog will take about 28 minutes (1,000,000 / 600 ≈ 1,667 seconds), and that assumes no new messages arrive.

How many new messages will accumulate during that time?

So processing at the normal rate is not good enough; consumption has to be sped up.
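The arithmetic can be checked with a small sketch. With the figures above (1,000,000 messages, 3 consumers at 200 items per second each) and no new traffic, the drain time comes to about 1,667 seconds. The `arrival_rate` parameter is an addition for illustration, and the value of 100 per second used in the usage lines is a made-up figure.

```python
import math


def drain_seconds(backlog: int, consume_rate: float,
                  arrival_rate: float = 0.0) -> float:
    """Seconds needed to clear `backlog` at `consume_rate` items per second
    while new items keep arriving at `arrival_rate` items per second.

    Returns infinity when consumption cannot outpace arrival.
    """
    net_rate = consume_rate - arrival_rate
    if net_rate <= 0:
        return math.inf
    return backlog / net_rate


# 3 consumers at 200 items/s each, no new traffic.
print(drain_seconds(1_000_000, 3 * 200))
# Same consumers, with a hypothetical 100 new items/s arriving.
print(drain_seconds(1_000_000, 3 * 200, arrival_rate=100))
```

The second call shows why ongoing traffic matters: only the difference between the consume rate and the arrival rate actually reduces the backlog.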

Take Kafka as an example. If the backlogged Topic has 3 partitions, at most 3 consumers in a consumer group can consume from it, so simply adding consumers does not help.

A temporary queue can still be used here.

Create a new Topic with 20 partitions.

The original Consumers no longer run the business logic; they only relay messages into the temporary Topic.

The 20 partitions can then be served by 20 consumers that run the original business logic.
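The relay step can be illustrated with an in-memory simulation. This is not Kafka client code: the `relay` function is hypothetical, plain Python lists stand in for partitions, and round-robin distribution is an assumed strategy.

```python
from typing import Dict, Iterable, List


def relay(messages: Iterable[str], num_partitions: int) -> Dict[int, List[str]]:
    """Forward messages unchanged, spreading them round-robin over partitions.

    No business logic runs here; the relay only moves messages.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    partitions: Dict[int, List[str]] = {p: [] for p in range(num_partitions)}
    for index, message in enumerate(messages):
        partitions[index % num_partitions].append(message)
    return partitions


# 100 backlogged messages spread over a 20-partition temporary topic.
temporary_topic = relay((f"m{i}" for i in range(100)), num_partitions=20)
print({p: len(msgs) for p, msgs in temporary_topic.items()})
```

Round-robin spreads the load evenly but does not keep messages with the same key together. If per-key ordering matters, choosing the partition from a hash of the key would be the more likely choice.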

Together these 20 consumers can process 4,000 items per second, so the backlog of 1 million items can be cleared in about four minutes (1,000,000 / 4,000 = 250 seconds).
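A rough sketch of why more partitions help, assuming every consumer processes at the same rate and that, within one consumer group, each partition is read by at most one consumer. The function name is hypothetical.

```python
def effective_throughput(partitions: int, consumers: int,
                         per_consumer_rate: float) -> float:
    """Items per second the group can process.

    Consumers beyond the partition count sit idle, so only
    min(partitions, consumers) of them contribute.
    """
    active_consumers = min(partitions, consumers)
    return active_consumers * per_consumer_rate


# 3 partitions: adding consumers beyond 3 does not raise throughput.
print(effective_throughput(partitions=3, consumers=20, per_consumer_rate=200))
# 20 partitions with 20 consumers.
rate = effective_throughput(partitions=20, consumers=20, per_consumer_rate=200)
print(rate, 1_000_000 / rate)  # throughput and seconds to clear 1 million items
```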

Not many new messages arrive in those few minutes, so the system quickly returns to normal, after which the original architecture can be restored.

In summary, a message backlog is troublesome, so it is best to prevent it in advance with good monitoring of hardware and message system health. If messages are lost, find the missing ones manually and resend them. When consumption cannot keep up, consider using a temporary queue as a relay to increase processing capacity.
