After the previous MySQL installment of this interview series got so much mileage — and the ZooKeeper and multithreading installments did well too — it's time to start the MQ topic. As one of the most commonly used pieces of middleware, the message queue is a must-ask in interviews, so let's go through the common MQ interview questions.
Why do you use MQ? What are the specific usage scenarios?
The purpose of MQ is simple: asynchronous processing, decoupling, and peak shaving. Take an e-commerce ordering scenario: the forward trading path may involve creating the order, deducting inventory, deducting the activity budget, deducting points, and so on. If each interface takes 100ms, the whole ordering link theoretically needs 400ms, which is obviously too long.
If all of these operations are handled synchronously, the call chain is too long and hurts interface performance, and the distributed transaction problem is hard to handle. Operations with weaker real-time consistency requirements, such as budget and points deductions, can instead be handled asynchronously through MQ. To deal with the inconsistency that asynchrony introduces, a scheduled job can retry failed calls until they succeed. In addition, most companies have a reconciliation platform: a problem like an order succeeding without the points being deducted can be found and fixed through reconciliation.
With MQ, the call chain becomes simpler, messages are sent asynchronously, and the overall system becomes more resilient.
Which MQ do you use? What was the selection based on?
We mainly investigated the mainstream MQs: Kafka, RabbitMQ, RocketMQ, and ActiveMQ. The selection was based mainly on the following points:
- Performance: our system is under high QPS pressure, so performance is the primary consideration.
- Development language: ours is Java, so a Java-based MQ makes secondary development more convenient.
- High availability: high-concurrency business scenarios demand it, so a distributed architecture is required.
- Comprehensive functionality: depending on the business scenario, sequential messages, transactional messages, and so on may be needed.
Based on these considerations, we chose RocketMQ.
| | Kafka | RocketMQ | RabbitMQ | ActiveMQ |
|---|---|---|---|---|
| Single-machine throughput | 100,000-level | 100,000-level | 10,000-level | 10,000-level |
| Development language | Scala | Java | Erlang | Java |
| High availability | Distributed architecture | Distributed architecture | Master-slave architecture | Master-slave architecture |
| Latency | ms level | ms level | µs level | ms level |
| Features | Core MQ features only | Full-featured: sequential messages, transactional messages, etc. | Strong concurrency, good performance, low latency | Mature community product, rich documentation |
You mentioned sending asynchronously, how to ensure the reliability of the message?
A message can be lost in three places: when the producer sends it, inside MQ itself, or on the consumer side.
Producer loss
The producer can lose a message when the send fails, an exception is thrown, and the message is never re-sent; or when the send call succeeds but the network drops during transmission, so MQ never receives the message and it is lost.
Since this rarely happens with synchronous sending, let's set that case aside and base the discussion on asynchronous sending.
Asynchronous sending comes in two flavors: with a callback and without. Without a callback, the producer fires and forgets, so any failure can silently lose the message. With asynchronous send + callback notification + a local message table, we can build a reliable solution. Take the ordering scenario as an example:
- When the order is placed, save the order data and an MQ message record in the same local transaction, with the message status set to "sending". If the local transaction fails, the order fails and everything is rolled back.
- If the order is placed successfully, return success to the client and send the MQ message asynchronously.
- The MQ callback reports the send result, and the message's send status in the database is updated accordingly.
- A JOB polls for messages that have stayed unsent beyond a configured interval and re-sends them.
- Messages that still fail after a configured number of retries raise an alarm on the monitoring platform for manual intervention.
In general, asynchronous send with a callback is good enough for most scenarios; the full local-message-table solution is reserved for scenarios that demand a complete guarantee that messages cannot be lost.
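The flow above can be sketched as a tiny state machine over the local message table. This is a simulation, not real client code: the class and method names are illustrative, and the in-memory map stands in for the database table and the MQ client callback.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the "async send + callback + local message table" pattern.
// The map stands in for a database table; the callback and the JOB are
// simulated — none of these names are a real RocketMQ API.
public class LocalMessageTable {
    enum Status { SENDING, SENT, FAILED }

    static final Map<String, Status> table = new HashMap<>();

    // Step 1: within the order transaction, record the message as SENDING.
    static void saveOnOrder(String msgId) {
        table.put(msgId, Status.SENDING);
    }

    // Step 3: the MQ send callback updates the status in the table.
    static void onCallback(String msgId, boolean ok) {
        table.put(msgId, ok ? Status.SENT : Status.FAILED);
    }

    // Step 4: the JOB counts messages that still need a re-send.
    static long resendPending() {
        return table.values().stream()
                .filter(s -> s != Status.SENT)
                .count();
    }

    public static void main(String[] args) {
        saveOnOrder("order-1");            // order placed, message pending
        saveOnOrder("order-2");
        onCallback("order-1", true);       // broker confirmed order-1
        System.out.println(resendPending()); // only order-2 still needs a retry
    }
}
```

The key design point is that the message record is written in the same local transaction as the order, so a message can never be "forgotten": either the order rolls back with it, or the JOB eventually re-sends it.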
MQ loss
Even if the producer guarantees the message reaches MQ, the message may at that point exist only in memory; if the broker goes down before it is flushed to disk or synchronized to the slave node, the message is lost.
Such as RocketMQ:
RocketMQ can be flushed synchronously or asynchronously. The default is asynchronous flushing, which may cause messages to be lost before they reach the hard disk. You can set it to synchronous flushing to ensure message reliability, so that messages can be recovered from disk even if MQ fails.
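In RocketMQ this is a broker-side setting. A minimal `broker.conf` fragment (SYNC_FLUSH trades throughput for durability):

```properties
# default is ASYNC_FLUSH; with SYNC_FLUSH the broker returns success
# only after the message has been persisted to disk
flushDiskType=SYNC_FLUSH
```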
Kafka, likewise, can be configured for reliability:

- `acks=all`: the send is acknowledged only after all in-sync replicas have received the message, so the message survives unless every node fails.
- `replication.factor`: set greater than 1, so each partition has at least 2 replicas.
- `min.insync.replicas`: set greater than 1, requiring the leader to see at least one follower still in sync.
- `retries`: set to a large value so the producer keeps retrying until the send succeeds.
Although these configurations improve MQ's own availability, they come with a performance cost; how to configure them is a tradeoff based on the business.
Consumer loss
The consumer side loses a message when the consumer receives it and then crashes before processing it: MQ believes the consumer has consumed the message and will not redeliver it, so the message is lost.
RocketMQ by default requires an ACK from the consumer, whereas with Kafka you must manually disable automatic offset commit and commit offsets yourself.
If the consumer does not return an ACK, the redelivery policy (interval and retry count) varies by MQ. Once the retry count is exceeded, the message enters a dead letter queue and requires manual processing. (Kafka has no built-in retry or dead-letter mechanism.)
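The retry-then-dead-letter behavior just described can be simulated in a few lines. The limit of 16 matches RocketMQ's default `maxReconsumeTimes`; the queue types here are stand-ins, not the real client API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Simulation of broker-side redelivery: a message that keeps failing is
// retried up to a maximum, then routed to a dead letter queue.
public class RetryToDlq {
    static final int MAX_RECONSUME_TIMES = 16; // RocketMQ's default

    static class Msg {
        final String id;
        int reconsumeTimes;
        Msg(String id) { this.id = id; }
    }

    static final Queue<Msg> retryQueue = new ArrayDeque<>();
    static final List<Msg> deadLetterQueue = new ArrayList<>();

    // One delivery attempt: an ACK finishes the message; a failure either
    // schedules a redelivery or, past the limit, dead-letters it.
    static void deliver(Msg m, boolean consumeOk) {
        if (consumeOk) return;                     // ACK received, done
        m.reconsumeTimes++;
        if (m.reconsumeTimes > MAX_RECONSUME_TIMES) {
            deadLetterQueue.add(m);                // manual intervention needed
        } else {
            retryQueue.add(m);                     // redeliver later
        }
    }

    public static void main(String[] args) {
        Msg m = new Msg("order-42");
        deliver(m, false);                         // first failure
        while (!retryQueue.isEmpty()) {
            deliver(retryQueue.poll(), false);     // keep failing
        }
        System.out.println(deadLetterQueue.size()); // the message ended in the DLQ
    }
}
```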
You mentioned consumption failures, but what about the message backlog caused by continuous consumption failure?
When consumers keep failing, we can approach the problem from the following angles:
- If consumption keeps failing, it must be caused by a program bug or similar problem. If the fix is easy, fix it first so the consumer can resume normal consumption.
- If it cannot be fixed quickly, write a temporary consumer that reads the backlog and forwards it to a new topic on separately provisioned MQ resources, with machine capacity sized to the current backlog.
- After the backlog on the new topic has been drained, fix the original consumer, then consume both the new and the original MQ data and restore normal consumption.
What if the message backlog reaches the disk limit and the message is deleted?
This is... well, if it's already deleted, what can you do? Calm down and think it through.
Recall that every message we send is first recorded in the database, and the forwarding program also persists what it forwards. We can compare those records to find the missing portion of the data and re-send it with a separate script. If the forwarding program did not persist its records, we have to diff against the consumer's records instead, which is a bit harder.
With that said, why don’t you talk about how RocketMQ works?
RocketMQ consists of a NameServer registry cluster, a Producer cluster, a Consumer cluster, and several Brokers (the RocketMQ processes). It works as follows:
- The Broker registers with every NameServer at startup, maintains long-lived connections, and sends a heartbeat every 30 seconds
- When sending a message, the Producer obtains the address of the Broker server from NameServer and selects a server to send the message based on the load balancing algorithm
- The Consumer likewise gets the Broker address from NameServer when consuming, and then actively pulls messages to consume
Why doesn’t RocketMQ use Zookeeper as its registry?
I think there are several reasons for not using ZooKeeper:
- According to CAP theory, at most two of the three properties can be satisfied at once, and ZooKeeper chooses CP, which means it cannot guarantee availability: during a ZooKeeper election the whole cluster is unavailable, and elections can take a long time. That is unacceptable for a registry; service discovery should be designed for availability.
- For performance reasons: NameServer's implementation is very lightweight and can be scaled horizontally by adding machines to absorb load. ZooKeeper's writes, by contrast, are not scalable; the only workaround is to partition data across multiple ZooKeeper clusters, which is operationally complex and still sacrifices the A in CAP, causing disconnects between services.
- ZooKeeper's ZAB protocol writes a transaction log on every node for each write request and periodically snapshots memory to disk to guarantee consistency and durability. For a simple service discovery scenario this is unnecessary; the implementation is too heavy, and the stored data would need heavy customization anyway.
- Message delivery should depend only weakly on the registry, and RocketMQ is designed around this idea: producers get Broker addresses from NameServer on the first send and cache them locally, so even if the entire NameServer cluster becomes unavailable, producers and consumers are largely unaffected in the short term.
How does the Broker store data?
RocketMQ's storage centers on three file types: CommitLog, ConsumeQueue, and IndexFile.
When a Broker receives a message, it appends it to the CommitLog; in a distributed deployment, each Broker stores a portion of each topic's data. For every MessageQueue of a topic, a ConsumeQueue file is generated that stores the message's physical offset in the CommitLog, while the IndexFile stores the mapping from message key to offset.
CommitLog files are stored in ${Rocket_Home}/store/commitlog. Each file is named after the offset at which it starts, and each file is 1 GB by default.
Since messages for the same topic are not stored contiguously in the CommitLog, it would be inefficient for consumers to scan the CommitLog directly. Instead, the ConsumeQueue stores each message's physical offset in the CommitLog. The consumer locates the offset via the ConsumeQueue, then finds the right CommitLog file quickly by simple arithmetic (offset divided and modulo'd by the file size).
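That locate-by-offset rule reduces to integer division and modulo. A small sketch, assuming the default 1 GB commitlog file size and the convention that each file is named after its starting offset, zero-padded to 20 digits:

```java
// Worked example: mapping a global commitlog offset to (file, position).
public class CommitLogLocate {
    static final long FILE_SIZE = 1024L * 1024 * 1024; // 1 GB per file (default)

    // The file containing this offset is named after the file's starting
    // offset, zero-padded to 20 digits.
    static String fileNameFor(long offset) {
        long fileStart = (offset / FILE_SIZE) * FILE_SIZE;
        return String.format("%020d", fileStart);
    }

    // Position of the message inside that file.
    static long positionInFile(long offset) {
        return offset % FILE_SIZE;
    }

    public static void main(String[] args) {
        long offset = FILE_SIZE + 4096; // 4 KB into the second file
        System.out.println(fileNameFor(offset));    // 00000000001073741824
        System.out.println(positionInFile(offset)); // 4096
    }
}
```

Naming files by starting offset is what makes this lookup O(1): no index over the files is needed, just arithmetic.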
How do you synchronize data between the Master and Slave?
Master-slave message synchronization is based on the Raft protocol:
- After the broker receives a message, it is marked as uncommitted
- The message is then sent to all slaves
- The slave returns an ACK response to the master after receiving the message
- After receiving more than half of the ACKS, the master marks the message as committed
- The master then broadcasts the commit to all slaves, which change the message state to committed
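The commit rule in the steps above is just a majority count. A minimal sketch — the types here are illustrative, not RocketMQ's actual replication code:

```java
// Raft-style commit decision: a message is committed once more than half
// of the cluster (master plus slaves) holds it.
public class MajorityCommit {
    enum State { UNCOMMITTED, COMMITTED }

    // acks counts replicas that have the message (the master counts as one);
    // clusterSize is master + slaves.
    static State stateFor(int acks, int clusterSize) {
        return acks > clusterSize / 2 ? State.COMMITTED : State.UNCOMMITTED;
    }

    public static void main(String[] args) {
        // 3-node cluster: master + 1 slave ack = 2 > 3/2 -> committed
        System.out.println(stateFor(2, 3)); // COMMITTED
        // only the master has it: not yet safe to commit
        System.out.println(stateFor(1, 3)); // UNCOMMITTED
    }
}
```

The majority requirement is why a 3-node cluster tolerates one node failure and a 5-node cluster tolerates two.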
Do you know why RocketMQ is fast?
This is due to sequential storage, Page Cache, and asynchronous flush.
- We write commitlogs sequentially, which provides much better performance than random writes
- Commitlog is not written directly to the disk, but to the PageCache of the operating system first
- Finally, the operating system asynchronously flushes the cached data to disk
What are transactional and semi-transactional messages? How do you do that?
Transaction messages are MQ's XA-like distributed transaction capability, used to achieve eventual consistency for distributed transactions.
A half (semi-transactional) message is one that MQ has received from the producer but cannot yet deliver, because the second acknowledgement has not arrived.
The implementation principle is as follows:
- The producer first sends a semi-transactional message to MQ
- MQ returns an ACK acknowledgement after receiving the message
- The producer starts performing a local transaction
- If the local transaction succeeds, the producer sends COMMIT to MQ; if it fails, it sends ROLLBACK
- If MQ does not receive a commit or ROLLBACK from the producer for a long time, MQ checks the producer
- The producer queries the final state of transaction execution
- Submit a second acknowledgement based on the query transaction status
Finally, if MQ receives COMMIT as the second acknowledgement, the message is delivered to consumers; if it receives ROLLBACK, the message is not delivered and is kept for 3 days before being deleted.
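The whole flow boils down to a small state machine on the broker side: a half message becomes deliverable only on COMMIT, whether that arrives as the second acknowledgement or as the answer to a check-back. A simulation, with enum names merely echoing RocketMQ's LocalTransactionState, not the real implementation:

```java
import java.util.Optional;

// Broker-side view of the half-message protocol: the message is invisible
// to consumers until a COMMIT resolves it; UNKNOWN triggers a check-back.
public class HalfMessageBroker {
    enum LocalState { COMMIT, ROLLBACK, UNKNOWN }

    // first       - the producer's second acknowledgement (may be UNKNOWN
    //               if the producer crashed before sending it)
    // checkBack   - the state reported when the broker checks back
    // Returns the message delivered to consumers, empty if not deliverable.
    static Optional<String> resolve(String halfMsg, LocalState first,
                                    LocalState checkBack) {
        LocalState state = (first == LocalState.UNKNOWN) ? checkBack : first;
        return state == LocalState.COMMIT ? Optional.of(halfMsg)
                                          : Optional.empty();
    }

    public static void main(String[] args) {
        // Producer died before confirming; the check-back finds COMMIT.
        System.out.println(resolve("order-created", LocalState.UNKNOWN,
                LocalState.COMMIT).isPresent());  // true
        // Local transaction rolled back: the half message is never delivered.
        System.out.println(resolve("order-created", LocalState.ROLLBACK,
                LocalState.COMMIT).isPresent());  // false
    }
}
```

The check-back is what closes the reliability gap: even if the producer crashes between sending the half message and sending the second acknowledgement, the broker eventually learns the local transaction's true outcome.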