Concepts

In general, to work with a message queue you need to understand topics, producers, consumers, queues, and delivery semantics.

The painful part is that different message queues use different names for these terms, and the terms do not have very precise meanings. So Alibaba started a project, OpenMessaging, aiming to create the first international standard for distributed messaging. Not many people seem to have adopted it, but that doesn't stop us from using the specification to learn about message queues.

If you are interested, you can compare the definitions here: github.com/openmessagi…

The official RocketMQ documentation is pretty clear, so I won't repeat it here.

Core processes

Message writing and storage

Messages are stored on the broker in the commit log. They are written to memory first and then flushed to disk, where they are stored directly as files on the file system. To make disk writes more efficient, they are sequential, so the messages of all topics are written together into the same log, unlike Kafka, which uses the topic as its basic storage unit.

A single CommitLog file is 1 GB; once it is full, writing rolls over to a new file.
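As a rough illustration of the rolling files described above, the sketch below maps a global commit log offset to a file and a position inside it. The function name `locate` is hypothetical, and the naming scheme (each file named after its starting offset, zero-padded to 20 digits) is stated here as an assumption rather than taken from the text.

```python
from typing import Tuple

COMMIT_LOG_FILE_SIZE = 1024 * 1024 * 1024  # 1 GB per file, as described above


def locate(physical_offset: int, file_size: int = COMMIT_LOG_FILE_SIZE) -> Tuple[str, int]:
    """Map a global commit log offset to (file name, position inside that file)."""
    if physical_offset < 0:
        raise ValueError("offset must be non-negative")
    # Start offset of the file that contains this position.
    file_start = physical_offset - physical_offset % file_size
    # Assumption: the file is named after its starting offset, padded to 20 digits.
    return f"{file_start:020d}", physical_offset - file_start


print(locate(0))
print(locate(COMMIT_LOG_FILE_SIZE + 100))
```

Because every write is an append at the current end of the log, finding a message later only requires its global offset, which is what the consumer-side lookup relies on.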

Message reading

The consumer first reads the stored message's offset, size, and hash code (of the message tag) from the ConsumeQueue, and then uses the offset and size to read the actual message content from the CommitLog.
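The two-step read can be sketched as follows. This is a simplified model, not RocketMQ's actual code: it assumes each ConsumeQueue entry is a fixed 20 bytes (8-byte offset, 4-byte size, 8-byte hash code), and it treats the commit log as raw message bodies laid end to end, ignoring the per-message headers a real log would carry. `read_message` and the sample data are hypothetical.

```python
import struct

# Assumed entry layout: 8-byte commit log offset, 4-byte size, 8-byte tag hash code.
ENTRY = struct.Struct(">qiq")  # big-endian, no padding


def read_message(consume_queue: bytes, commit_log: bytes, index: int) -> bytes:
    """Look up the index-th entry in the consume queue, then read the body from the log."""
    offset, size, _tag_hash = ENTRY.unpack_from(consume_queue, index * ENTRY.size)
    return commit_log[offset:offset + size]


# Two messages appended back to back in the (simplified) commit log.
commit_log = b"hello" + b"rocketmq"
# Made-up tag hash codes 111 and 222; only offset and size matter for the read.
consume_queue = ENTRY.pack(0, 5, 111) + ENTRY.pack(5, 8, 222)

print(read_message(consume_queue, commit_log, 1))
```

Fixed-size entries are what make the lookup cheap: the n-th entry sits at byte `n * ENTRY.size`, so no scanning is needed.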

In addition, there is an index file that can quickly locate a message given its topic and key.

Points to note

Some things to be aware of when using message queues.

Message retention time

RocketMQ keeps messages for 72 hours by default, after which they are discarded whether or not they have been consumed. The retention time is set by the fileReservedTime parameter. Note that this setting is global and cannot be given different values for different topics, for the reason already mentioned: RocketMQ stores the messages of all topics together.

Message order

To consume messages in order, register a MessageListenerOrderly on the consumer.

Message loss

In theory, messages can be guaranteed not to be lost, at the cost of possibly receiving duplicate messages and some loss of write performance.

Whether the producer sends in synchronous or asynchronous mode, it must handle send failures.
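One common way to handle a failed send is to retry it a bounded number of times and surface the error if every attempt fails. The sketch below is generic and does not use the RocketMQ client API: `send_with_retry`, `send`, and `flaky_send` are hypothetical names, and `send` stands in for whatever call actually delivers the message.

```python
import time


def send_with_retry(send, message, retries=3, backoff_seconds=0.0):
    """Call send(message); on failure retry up to `retries` more times, then re-raise."""
    last_error = None
    for attempt in range(1 + retries):
        try:
            return send(message)
        except Exception as error:  # a real producer would catch narrower error types
            last_error = error
            time.sleep(backoff_seconds * (attempt + 1))  # simple linear backoff
    raise last_error


# Hypothetical sender that fails twice before succeeding.
attempts = []


def flaky_send(message):
    attempts.append(message)
    if len(attempts) < 3:
        raise ConnectionError("broker unreachable")
    return "SEND_OK"


result = send_with_retry(flaky_send, "order-created")
print(result, len(attempts))
```

A retry can deliver the same message twice if the first attempt actually reached the broker and only the response was lost, which is why the consumer side has to tolerate duplicates.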

Idempotent consumption is therefore a must. To ensure that the broker itself does not lose messages, enable synchronous flush (so that messages still in memory are not lost) and synchronous replication (to avoid a single point of failure). Both carry a performance penalty. The default value of flushDiskType is ASYNC_FLUSH, in which case the broker flushes to disk only after a certain amount of messages has accumulated.

After consuming a message, the consumer returns CONSUME_SUCCESS as its ACK. On both the producer side and the consumer side, a network problem can cause a message to be sent or consumed successfully while the ACK fails, so the message is delivered or consumed again. The delivery semantics are therefore generally at-least-once, and applications must keep consumption idempotent.
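A minimal sketch of idempotent consumption under at-least-once delivery: the consumer remembers which message keys it has already handled and skips repeats. The class and method names are hypothetical, and the in-memory set stands in for a durable store (for example a database table with a unique key) that a real service would need.

```python
class IdempotentConsumer:
    """Wrap a handler so that each message key is processed at most once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: replaced by durable storage in production

    def consume(self, message_key: str, body: str) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_key in self._seen:
            return False
        self._handler(body)
        # Record the key only after the handler succeeded, so failures are retried.
        self._seen.add(message_key)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)
first = consumer.consume("order-1", "charge 10")
second = consumer.consume("order-1", "charge 10")  # redelivery of the same message
print(first, second, processed)
```

The key has to identify the business operation (an order ID, for instance), since a redelivered message is not guaranteed to be distinguishable by anything else.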

Write throughput, consumption throughput, and consumption backlog

With asynchronous flush and asynchronous replication, the write rate can reach tens of thousands of messages per second on two machines with 4 cores and 8 GB of memory each, using 100-byte messages.

The broker side is usually not the bottleneck. However, a cluster is typically a shared service used by all business lines, so the total traffic is still very high, and you need monitoring and alerting so that you can scale out horizontally in time. If some extra latency is acceptable, the producer can send messages in batches, which is more efficient.

On the consumer side there is consumeThreadMax (20 by default). If the consumer is deployed on its own, this can be increased to improve the processing speed of a single machine.

When the processing speed of a single machine cannot be improved any further, you can scale the consumer cluster horizontally. However, this does not scale without limit: consumers beyond the number of queues in the subscribed topic are not assigned any queue and have no effect. The number of queues is set by defaultTopicQueueNums, which defaults to 4.
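The limit on horizontal scaling follows from how queues are shared out among consumers. The sketch below uses a simple even split as an illustration; it is an assumed strategy written for this example, not RocketMQ's actual allocation code, and `allocate` is a hypothetical name.

```python
from typing import Dict, List


def allocate(queues: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Split queues into contiguous, near-equal ranges, one range per consumer."""
    base, extra = divmod(len(queues), len(consumers))
    result: Dict[str, List[int]] = {}
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional queue each.
        count = base + (1 if i < extra else 0)
        result[consumer] = queues[start:start + count]
        start += count
    return result


queues = [0, 1, 2, 3]  # 4 queues, matching the default mentioned above
print(allocate(queues, ["c1", "c2", "c3"]))
print(allocate(queues, ["c1", "c2", "c3", "c4", "c5", "c6"]))
```

With six consumers and four queues, the last two consumers end up with empty lists, so to scale further the topic needs more queues, not just more consumers.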

Application scenarios

Why message queues are needed has been covered well elsewhere; the three classic scenarios are peak shaving, asynchronous processing, and service decoupling. Personally, I think the summary here is more comprehensive: github.com/openmessagi…

Let's focus on the RPC scenario. Note that this RPC is not an ordinary RPC call; it is a synchronous message, equivalent to two RPCs. The client sends a message to the server, and the server processes it and then sends a reply back to the client.

