Single machine up to 100,000 concurrent Ali development

Cluster deployment

Architecture diagram

The principle of

High availability

Master-slave architecture and multiple copy strategy

Borkers are primary and secondary, and master brokers are mainly responsible for writing

Master Borker receives messages and synchronizes them with slave Brokers to ensure that if one machine goes down, the other machine has data

The slave broker pulls data from the master broker at a fixed time. This is known as master-slave synchronization

Each broker startup is registered with all NameserVers.

If only individual Nameservers are registered, the broker information will be lost when a Nameserver goes down

There is a heartbeat mechanism between Nameserver and Borker that ensures that Nameserver will know when the broker is down

At regular intervals, the broker sends a heartbeat to the Nameserver. The Nameserver updates the latest heartbeat time

Nameserver scans the heartbeat of all brokers at regular intervals. If there is more than one heartbeat, the broker is considered down

The heartbeat drops as it travels

Source of information

Message retrieval can come from either the master or slave broker

When the broker returns a message to the requesting system, it suggests to the system that the next message request should be made to the Master or slave Broker.

Focus on

Rocketmq4.5 was not in full high availability mode. When the Master broker went down, it was not automatically switched to the Slave broker, requiring manual modification

After 4.5, a Dledger mechanism was used to support automatic switching to slaver Broker after a master broker failure

Dledger uses raft algorithm

Download and install

# Download RocketMQ HTTPS://github.com/apache/rocketmqDownload dledger HTTPS://github.com/openmessaging/openmessaging-storage-dledger # visual interface HTTPS://github.com/apache/rocketmq-externals 
Copy the code

Ensure that messages are not lost

Sending messages to MQ zero is lost

  1. Send message synchronously + repeat multiple retries

  2. Rocketmq transaction messaging mechanism, the overall effect is a bit better

Zero loss for MQ after receiving the message

Enable the synchronous disk flushing and master/slave architecture synchronization mechanism

After the data is written to disk and written to the disk of the slave Broker before it is returned to the producer, the message is written to MQ successfully

Zero loss after consumers receive the message

Rocketmq is naturally guaranteed because by default, RocketMQ returns a successful message to MQ after the message has been processed, not before the message processing logic has been executed

idempotence

Avoid the same

  1. The business method determines that when retrying, a message is sent ahead to MQ to see if the message has already been sent, if it has not been sent again, and if it has not been sent again

  2. Status query you write a message to MQ, write the message to Redis, write the ID and order status, when the interface is called repeatedly, go to Redis to query, according to the ID query status, success is not sent, failure is sent again

The Redis scheme is flawed and may lead to repeated consumption

If you send a message to MQ, you don’t have time to write redis, Redis is down, after the restart will send a message again, so there are two messages; So it is generally recommended to use a business approach for judgment

Duplicate messages

There is a special retry queue. After 16 retries, the queue enters the dead letter queue. The processing mode of the dead letter queue depends on service requirements

When the service fails to consume messages due to some reason, the reconsume_laster can be returned and the message is added to the delayed message consumerGroup for a maximum of 15 stepped-type retries. After 15 failures, the message is put into the dead letter queue and the consumer starts the thread for consumption

consumers

Consumers consume messages in two ways. One is push. The broker actively sends messages to consumers at random. One is pull, where the consumer pulls messages from the broker at random

push

In essence, consumers are constantly sending messages to the broker to pull data

After processing a batch of messages, the consumer sends a request to the broker for pull messages. It looks like the broker is pushing information to the consumer, but the consumer is constantly sending pull data to the broker

When a request is sent to the broker and no message is available to consume, the request thread is suspended, which is 15 seconds by default. Then a thread in the background checks for messages in the Broker and wakes up the request thread. The consumer picks up the message

pull

Ensure that messages are not lost

If the message is sent to the consumer, it may go down before the consumer actually consumes the message, the message is in the system cache, but the message returned to MQ is that the consumer successfully consumed the message

Synchronization mechanism

When the consumer has actually executed the processing logic of the message, the successful message is returned to MQ

An asynchronous mechanism cannot be used because it may result in the consumer giving MQ a success message before consuming the good news

If the consumer is down at this point, the message returned to MQ is false, with a store success message that did not actually succeed

producers

Synchronous message sending

Send a message to MQ and wait for MQ to return the result. If it does not return the result, it will get stuck

Sending messages asynchronously

Sends a message to MQ, does not wait for MQ to return the result, the CPU does something else, and when MQ returns the message, the code continues to execute

One-way message sending

Just send a message to MQ, whether OR not MQ returns a message

The message is sent successfully

  • The half message is not visible to consumers

    General after a message to the mq will be written to the corresponding topic/messageque consumerqueue, but after rocketmq recognition to the message for the half, the message will be written in the internal topic rocketmq, so consumers for half is not visible

For example, if you buy something, you have paid to the order system, send half once and find that MQ does not return the message, MQ hangs, so you carry out the fund rollback.

If the local transaction fails, the ordering system will send a ROLLBACK to MQ, indicating that I failed here and cannot accept your message

If the rollback and commit failed, since the messages in MQ are always in the half state and there is no response for a long time, you will know that there is a problem with MQ. At this time, you need to determine whether the state of the order is “completed”. If yes, commit the request again.

How to Rollback

Write the ROLLBACK record to op_+topic to mark a half message as rollback

Assuming that rollback or COMMIT has not been performed, MQ will call the interface up to 15 times to determine the status of the half message, and if the status of the half message is not known after 15 times, it will automatically mark the message as ROLLBACK

half

Before each message is sent, half is sent to MQ, and if MQ is working properly, an OK is returned to the producer and the producer can send the real message. If ok is not returned, there is a problem with MQ and the message is rolled back

There are actually three steps

  1. Producer sends half to MQ

  2. Mq returns information to the producer

  3. Producer proceeds to the next step

In fact, the above three steps may have problems, so how to ensure that there is no problem that, please continue to see

Do the following three responses to the above three steps

  1. If the producer fails to send the half message, a local thread is called to see if the half message returns within a specified time, or if it does not, it is rolled back

  2. If MQ fails to return a message to producer, MQ invokes a local thread to see if the half message has returned in a limited time, or if it has not, it rolls back

How do I ensure that the half message is sent successfully

Messages are written to RMQ_SYS_TRANS_HALF_TOPIC

skills

You can use Ali’s Cannal technology to synchronize mysql’s binlog

The data of a topic is placed on multiple Messagequeue to realize distributed storage

persistence

broker

After receiving a message, the broker writes all messages to disk in sequence. It is called a Commitlog. A parameter specifies the maximum capacity of a Commitlog

Sequential disk write + OS cache write + OS asynchronous flush

After receiving a message, the broker does not write the message directly to disk. Instead, it writes the message to the system cache, which then writes the message to disk periodically

Asynchronous flush may cause data loss. For example, after data is written to the system cache, the system suddenly breaks down. The producer thinks that the message has been written to the system, but it has not been written to the disk

Synchronous flush means that each time data must be written to disk, the message is sent

To optimize the

File preheating

The madvise system call loads data from disk space to memory as much as possible, reducing the number of times data is loaded from disk space to memory

mmap

The common procedure for storing data on disk is as follows

Two copies are required

Mmap only needs to be copied once

Once written to the virtual memory, it can be directly copied to the disk space. It is not necessary to copy the disk space to the thread private space.

messagequeue

A topic has multiple Messagequeeus, and a Messagequeeu has only one ConsumerQueue

Messagequeue does this on the store of the broker

Will be stored in the corresponding topic/messagequeeu0 / consumerqueue0

After each time the broker receives the message, messages will be written to disk in order, and at the same time the news will be the physical storage location records in the topic/messagequeeu0 / consumerqueue0, so convenient consumption news consumers over time, the message can know the location of the store

Dledger

what

This mechanism ensures that when the leader broker fails, it automatically switches to the slave broker.

why

Raft algorithm is used, basically all brokers vote for themselves, in the first round all brokers vote for themselves and then go to random sleep, broker1 goes to sleep for 2 seconds, Broker2 goes to sleep for 3 seconds, Broker3 goes to sleep for 4 seconds, broker1 definitely wakes up first from the data, He votes for himself and sends his vote to others. The remaining two find that others have already voted and follow him to vote, so the broker is elected and becomes the leader broker

Voting completed: the number of machines /2+1, it means the majority, that is, when there is a majority of people vote, do not need other people to express their opinion, directly the majority of people’s opinion as the final opinion

Broker voting mechanism

  1. If someone has already voted, they will respect their opinion and vote with them

  2. They vote for themselves

The order above is also the order of priority

Multi-copy synchronization

There are two stages

  1. Uncommitted phase

  2. Committed phase

Uncommitted phase

The leader broker receives the data, marks it in uncommitted state, and sends it to the Slave Broker’s Dledger Server through its own Dledger Server component

Committed phase

After receiving the message, the Slave Broker’s Dledger Server will reply an ACK to the Leader Broker’s Dledger Server. When the leader Broker receives more than half of the slave Brokers, The data is marked to the COMMITTED state

Then the slave Broker of the leader Broker sends the committed state to the Dledger Server of the Slave Broker

Network communication architecture model

Rocketmq network communication architecture model. First, the producer and the server establish a TCP long connection through the reactor main thread. The client and the server communicate through the SocketChannel, send messages through the Socketschannel, and monitor the message arrival of the SocketChannel through the REACTOR thread pool. The REACTOR thread pool is only responsible for pulling the message out, which requires encryption validation, encoding and decoding, and network connection management through the worker thread pool before the message can be processed. Send the message via the SendMessage thread pool. The REACTOR main thread is responsible for establishing long connections. The REACTOR main thread concurrently listens for message requests. The message is then processed by worker multithreading, and the read and write disks are processed by the business thread pool. The execution of each thread pool will not affect other thread pools to process requests in other links. The REACTOR thread pool uses AIO multiplexing

Message loss

Three of the following

  1. When a producer sends a message and the master broker does not receive the message due to network failure or failure, it can compensate for multiple failed messages by retry mechanism and memo mechanism

  2. When messages reach MQ, RocketMQ loses messages. When an asynchronous flush is used, the commit log of the message may not be flushed to disk in the Page cache. At this time, the physical broker is down and restarts, causing the Page Data in the cache is lost. If synchronous flush messages are selected, data may be lost even after being stored to disks. When a disk is faulty, you can back up disks for redundancy to ensure that as few messages are lost as possible

  3. The message is saved to MQ. When the consumer consumes the message without ack, MQ thinks that the message has been successfully consumed and jumps to the next offset. At this point, ack mechanism is used to ensure that the message is not lost