1. Architectural pattern of RocketMQ
- NameServer: manages routing information for all Brokers in the cluster, so that the other MQ components can ask the NameServer which Brokers are in the cluster
- Broker cluster: Brokers are deployed across multiple machines in a master/slave architecture, keeping multiple copies of the data for storage and high availability. Brokers store and forward messages
- Producer: a system that sends messages to MQ
- Consumer: a system that gets messages from MQ
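A minimal sketch of the producer and consumer roles using the Apache RocketMQ Java client (the NameServer address, group names, and topic here are placeholders):

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.common.message.Message;

public class QuickStart {
    public static void main(String[] args) throws Exception {
        // Producer: asks NameServer for the topic's Brokers, then sends to one of them
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder NameServer address
        producer.start();
        producer.send(new Message("DemoTopic", "hello".getBytes()));
        producer.shutdown();

        // Consumer: also discovers Brokers via NameServer, then pulls messages from them
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876");
        consumer.subscribe("DemoTopic", "*");
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) -> {
            msgs.forEach(m -> System.out.println(new String(m.getBody())));
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });
        consumer.start();
    }
}
```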
2. Architectural principles of RocketMQ
(Architecture diagram omitted; original image from the Internet)
High availability
- RocketMQ natively supports high availability with a multi-master, multi-slave deployment architecture. RocketMQ has no master-election feature, so multiple master nodes are configured to keep the cluster highly available
- If the current architecture is one master with multiple slaves and the master goes down, producers can no longer send messages, but consumers can still consume. In a multi-master multi-slave architecture, when one master node fails, the remaining master nodes can still accept messages
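A sketch of how one master/slave pair in such a deployment is typically expressed in broker configuration; all names and addresses are placeholders, and each block would live in its own broker.conf on its own machine:

```properties
# Master of broker-a
brokerClusterName = DemoCluster
brokerName = broker-a
brokerId = 0                # 0 = master, > 0 = slave
brokerRole = ASYNC_MASTER   # or SYNC_MASTER for synchronous replication
namesrvAddr = 192.168.0.1:9876;192.168.0.2:9876

# Slave of broker-a (a second pair broker-b would follow the same pattern)
brokerClusterName = DemoCluster
brokerName = broker-a       # same brokerName ties the slave to its master
brokerId = 1
brokerRole = SLAVE
namesrvAddr = 192.168.0.1:9876;192.168.0.2:9876
```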
Read/write separation mechanism
- The master node handles write requests (the master can also serve reads), while slave nodes only serve read requests. A slave receives the data replicated from the master and keeps it consistent with the master
Message sending
- When there are multiple master nodes (each with slave nodes), a given message is sent to only one master. RocketMQ load-balances sends across the masters so that messages are spread evenly among them, as the sketch below shows
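A small sketch of that default round-robin behavior in the Java client; the topic name is a placeholder, and the printed broker/queue pair shows where each send landed:

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;

public class SendBalanceDemo {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();
        for (int i = 0; i < 8; i++) {
            SendResult r = producer.send(new Message("DemoTopic", ("msg-" + i).getBytes()));
            // By default the client round-robins over all write queues of the
            // topic, which are spread across the master Brokers
            System.out.println(r.getMessageQueue().getBrokerName()
                    + " / queue " + r.getMessageQueue().getQueueId());
        }
        producer.shutdown();
    }
}
```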
Message consumption
- In the figure above, the two master nodes are divided evenly between the two consumers, one consumer consuming one master. If there is only one consumer at that point, it consumes data from both master nodes
- Since each master can be configured with multiple slaves, if one master goes down, its messages can still be consumed by consumers from the slave nodes
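That even division is done by a pluggable queue-allocation strategy; AllocateMessageQueueAveragely is the Java client's default. A sketch (names are placeholders):

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.client.consumer.rebalance.AllocateMessageQueueAveragely;

public class AllocationDemo {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876");
        consumer.subscribe("DemoTopic", "*");
        // Default strategy: the topic's queues (across every master) are divided
        // evenly among the live consumers in the group, so two consumers get one
        // master each, while a lone consumer is assigned both masters' queues.
        consumer.setAllocateMessageQueueStrategy(new AllocateMessageQueueAveragely());
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) ->
                ConsumeConcurrentlyStatus.CONSUME_SUCCESS);
        consumer.start();
    }
}
```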
Read/write separation mechanism details
- When a message is written, it is always written to a Master Broker
- When a message is read, it may be fetched from either the Master Broker or a Slave Broker; the amount of messages backlogged on the master determines whether the consumer's next pull is directed to a Slave server
How messages are read:
- When the Broker receives a pull request from a consumer, it checks the amount of locally accumulated messages. If the backlog exceeds a certain fraction of physical memory (40% by default), it marks the next pull as one that should go to the Slave server, computes that slave's brokerId, and returns it in the response to the consumer
- When the consumer receives the pull response, it caches the recommended brokerId and uses it to decide which node to send the next pull request to
Summary: when is data pulled from a slave?
- The master is down
- The backlog at the time of the pull exceeds 40% of physical memory
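A hedged, simplified paraphrase of that broker-side check (not the exact RocketMQ source; in the real store module the threshold comes from the accessMessageInMemoryMaxRatio broker setting, 40 by default, and the backlog is measured in physical commit-log bytes):

```java
public class PullSuggestion {
    static final long MASTER_BROKER_ID = 0L;

    /**
     * Suggest which broker the consumer should use for its next pull.
     * All parameters are illustrative stand-ins for the broker's real state.
     */
    static long suggestNextBrokerId(long maxPhyOffset, long pulledPhyOffset,
                                    long totalPhysicalMemoryBytes,
                                    int inMemoryMaxRatio, long slaveBrokerId) {
        long backlogBytes = maxPhyOffset - pulledPhyOffset;                   // not yet pulled
        long memoryLimit = totalPhysicalMemoryBytes * inMemoryMaxRatio / 100; // 40% by default
        // If the backlog no longer fits in page cache, serving it from the master
        // would hit disk, so recommend the slave for the next pull.
        return backlogBytes > memoryLimit ? slaveBrokerId : MASTER_BROKER_ID;
    }
}
```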
3. RocketMQ high availability guarantee
1. If the entire NameServer cluster fails, can messages still be produced and consumed?
If the NameServer registry is down, producers keep sending using the routing information cached locally. But if a Producer restarts, it cannot reach the registry and can no longer send data
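Because routes are cached client-side, the outage mainly hurts clients that (re)start. This is also why clients are normally configured with the whole NameServer cluster; a sketch (addresses are placeholders):

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;

public class NamesrvClusterDemo {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        // Semicolon-separated NameServer list: the client picks a live one, so a
        // single NameServer failure does not stop route lookups. A fresh client
        // with no reachable NameServer at all cannot fetch any routes, and its
        // first send would fail.
        producer.setNamesrvAddr("192.168.0.1:9876;192.168.0.2:9876");
        producer.start();
        producer.shutdown();
    }
}
```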
2. How does NameServer sense if the Broker fails? (Heartbeat mechanism)
Each Broker sends a heartbeat to the NameServer periodically (every 30s)
The NameServer runs a task every 10s to check the last heartbeat of each Broker. If a Broker has not sent a heartbeat for more than 120s, it is considered dead
Slave Brokers also register with all NameServers and send a heartbeat to every NameServer every 30s
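A hedged, self-contained sketch of that bookkeeping (modeled on, but not copied from, the NameServer's route manager): heartbeats refresh a timestamp, and a 10-second task evicts brokers that have been silent for over 120 seconds:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BrokerLivenessTable {
    private static final long EXPIRE_MS = 120_000; // 120s without a heartbeat = dead
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scanner =
            Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // Like the NameServer, scan every 10s for brokers that stopped heartbeating
        scanner.scheduleAtFixedRate(this::scanNotActiveBroker, 10, 10, TimeUnit.SECONDS);
    }

    /** Called whenever a broker's 30s heartbeat arrives. */
    public void onHeartbeat(String brokerAddr) {
        lastHeartbeat.put(brokerAddr, System.currentTimeMillis());
    }

    private void scanNotActiveBroker() {
        long now = System.currentTimeMillis();
        lastHeartbeat.entrySet().removeIf(e -> {
            boolean dead = now - e.getValue() > EXPIRE_MS;
            if (dead) System.out.println("broker expired, dropping route: " + e.getKey());
            return dead;
        });
    }
}
```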
3. How does the system sense that the Broker is down?
Producers and consumers sense this by periodically pulling Broker routing information from the NameServer
However, because Broker heartbeats, the NameServer's scheduled check, and the clients' route pulls are all periodic rather than real-time, there is a window in which sends and consumption can fail
RocketMQ route discovery is non-real-time. When routing information for a topic changes, the NameServer does not notify clients; instead, each client periodically pulls the latest routes for the Topic. Pushing the cost of non-real-time discovery onto the client keeps the NameServer logic simple
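In the Java client that pull interval is inherited from ClientConfig (30 seconds by default) and can be tuned, which narrows, but never closes, the non-real-time window; a sketch:

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;

public class RoutePollingDemo {
    public static void main(String[] args) {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        // How often the client re-pulls topic routes from NameServer (default 30s)
        producer.setPollNameServerInterval(10_000);
        // How often the client heartbeats the Brokers it knows about (default 30s)
        producer.setHeartbeatBrokerInterval(10_000);
    }
}
```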
4. What if Broker and NameServer have network problems?
If a Broker is not actually down, but network problems between the Broker and the NameServer make the NameServer believe it is, the Producer receives new routing information that no longer contains the Broker. Even though the Producer could still reach that Broker over the network, it stops sending messages to it
5. How does the consumer end consume after the Master Broker goes down?
- If the Master Broker goes down, messages can no longer be written, meaning clients can no longer send messages
- The consumer then reads the message from the slave
Follow-up: What if the master restarts at this point?
- If the master recovers at this point, consumers pull data from the master again
Follow-up: when pulling from the master again, how does the master know the latest offset?
- During the outage the latest offset lives on the slave. When the master comes back up, a scheduled task on the slave asks the master for the consumer group's offsets; since the master's offsets are stale, the slave compares them with its own and simply discards the stale values
- How does the master update its offsets? The client keeps the latest offset in memory; when it pulls messages from the master with this offset, the master notices its own stored offset is stale and updates it to the client's value
- What if the client also fails? Then the in-memory offset is lost, and consumption has to resume from the stale offset, replaying some messages
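A hedged sketch of that catch-up idea (not the actual broker code): each pull carries the consumer's in-memory commit offset, and a restarted master adopts it whenever its own copy is older:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConsumerOffsetTable {
    // key: consumerGroup@topic@queueId -> committed consume offset
    private final Map<String, Long> offsetTable = new ConcurrentHashMap<>();

    /**
     * On each pull the consumer piggybacks the offset it holds in memory; a
     * master restarted with stale offsets catches up here.
     */
    public void commitOffset(String group, String topic, int queueId, long clientOffset) {
        String key = group + "@" + topic + "@" + queueId;
        offsetTable.merge(key, clientOffset, Math::max); // keep the newer of the two
    }
}
```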
Summary: when consumers learn of the master outage, they switch to the slave, but the slave cannot guarantee that 100% of the master's messages were synchronized, so a small number of messages are temporarily unavailable. They are not truly lost: the unsynchronized messages will be consumed once the master recovers
Rebalance when the master cannot recover for a long time.
When a master broker fails, the consumers assigned to its queues can no longer pull messages properly. This triggers a Rebalance: load balancing reallocates the queues, and those consumers are mapped to other queues
After the master broker restarts, the next round of load balancing restores the original queue mapping
The problem of repeated consumption: note that if a consumer restarts during the broker outage, LocalFileOffsetStore (an in-memory object) has no progress for the queues from before the failure, so the consumer can only resume from the progress it last committed to the broker. If the consumer does not restart during the outage, the progress saved in LocalFileOffsetStore or in the pullRequest can drive the pull tasks, and few messages are re-consumed
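Because both slave failover and rebalance can replay messages, consumers are expected to be idempotent. A minimal sketch that dedupes on a business key (the in-memory set is for illustration only; a real system would use a persistent store with expiry):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.common.message.MessageExt;

public class IdempotentListenerDemo {
    // In-memory dedup set for illustration; production code would use Redis/DB
    private static final Set<String> processed = ConcurrentHashMap.newKeySet();

    static final MessageListenerConcurrently LISTENER = (msgs, ctx) -> {
        for (MessageExt msg : msgs) {
            String bizKey = msg.getKeys(); // set per business record by the producer
            if (bizKey != null && !processed.add(bizKey)) {
                continue; // already handled once: skip the replayed copy
            }
            // ... real business handling goes here ...
        }
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    };
}
```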
6. What is the impact if Slave Broker fails?
Answer: A little, but not much
- Because message writes all go to the Master Broker, and message reads can also be served by the Master Broker.
- However, some reads may have been going to the Slave Broker. If the Slave Broker goes down, messages can still be written and pulled through the Master Broker, so overall operation is unaffected
- The downside is that all read and write pressure is then concentrated on the Master Broker
4. How does RocketMQ guarantee high throughput?
- Distributed storage of massive messages
    - Each Topic's data is stored across multiple Broker machines (distributed storage is the mechanism that makes massive message volumes manageable)
    - Together, the multiple Master Brokers in a cluster provide enough capacity to store massive numbers of messages
- High concurrency
- System traffic is spread across multiple machines deployed with RocketMQ
- High QPS in high concurrency scenarios can be spread across multiple brokers. This is because Topic data is spread across multiple brokers
- Scalability
    - To resist higher concurrency and store more data, add more Broker machines to the cluster; the cluster scales out linearly (see the sketch below)
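Scaling out mostly means giving a topic queues on more brokers; a hedged sketch using the admin client from rocketmq-tools (broker address, topic, and queue counts are placeholders):

```java
import org.apache.rocketmq.common.TopicConfig;
import org.apache.rocketmq.tools.admin.DefaultMQAdminExt;

public class ScaleTopicDemo {
    public static void main(String[] args) throws Exception {
        DefaultMQAdminExt admin = new DefaultMQAdminExt();
        admin.setNamesrvAddr("127.0.0.1:9876");
        admin.start();

        TopicConfig config = new TopicConfig("DemoTopic");
        config.setReadQueueNums(8);  // more queues -> more consumers in parallel
        config.setWriteQueueNums(8); // more queues -> sends spread more widely
        // Run this against each master Broker that should host the topic; adding a
        // newly joined Broker here is what spreads the topic's data further out.
        admin.createAndUpdateTopicConfig("192.168.0.10:10911", config);

        admin.shutdown();
    }
}
```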