Abstract
Message queues are among the most frequently mentioned components in technical solutions, and they play an important role in our programs. Asynchronous processing, decoupling, and peak shaving (buffering) are exactly why we choose them. This article discusses the nature of message queues, their usage scenarios, and the points to watch out for, and then introduces the mainstream message queues as I understand them.
What is a message queue?
Before using message queues, we first need to have a clear understanding of them. Usually in upstream and downstream communication, we need a carrier to transmit the information that one party is concerned about.
For example, after a user places an order in the mall, the product information is packaged into an object and sent to the logistics module. In this case, the order module is upstream, the logistics module is downstream, and the product information is the message. Once the corresponding modules have been abstracted out, messages act as the glue that connects the whole link.
The upstream and downstream modules mentioned above are assumed to be in the same process, so the message lives in memory. What if the upstream and downstream are in two processes, or deployed on different machines? How should messages be stored? And how should they flow?
This is where “queues” come in. A queue is a traditional first-in, first-out data structure, but by “queue” I mean something more: a message manager with both storage and routing features.
Messages can be held in memory temporarily or persisted to a file or database. However, messages are not stored forever. They follow consumption rules: once they have been distributed to the relevant modules according to those rules, they are discarded.
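To make the idea of a "message manager with storage and routing" concrete, here is a minimal in-memory sketch. The class name `TinyBroker`, the topic name `order.created`, and the message fields are all hypothetical, and a real message queue would add persistence, acknowledgements, and networking on top of this.

```python
from collections import defaultdict, deque


class TinyBroker:
    """Hypothetical in-memory message manager: stores messages and routes them by topic."""

    def __init__(self):
        self._queues = defaultdict(deque)      # topic -> FIFO of pending messages
        self._subscribers = defaultdict(list)  # topic -> consumer callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Storage: the message waits here until it is dispatched.
        self._queues[topic].append(message)

    def dispatch(self, topic):
        """Deliver pending messages in FIFO order, then discard them."""
        pending = self._queues[topic]
        delivered = 0
        while pending:
            message = pending.popleft()  # removed from storage once distributed
            for callback in self._subscribers[topic]:
                callback(message)
            delivered += 1
        return delivered


broker = TinyBroker()
received = []
broker.subscribe("order.created", received.append)
broker.publish("order.created", {"order_id": 1, "product": "book"})
broker.publish("order.created", {"order_id": 2, "product": "pen"})
delivered_count = broker.dispatch("order.created")
```

After `dispatch` runs, both messages have reached the subscriber in the order they were published, and the broker no longer holds them.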
The characteristics and use of message queues
In software design, we often require each module to follow the single responsibility principle, so that every module does only its own job. However, when services call one another, there are always some requirements that are not strongly related to the main flow, such as state maintenance and failure retry. This is where message queues excel.
In general, we call the functional modules that send data to a message queue “producers”, and the modules that ultimately use the messages “consumers”.
After the producer successfully delivers a message, it no longer cares where the message goes. The subsequent process is guaranteed by the message queue, and the producer keeps some metadata at most. To prevent data loss, message queues typically persist messages so that they can recover after a restart.
In this way, even if the producer fails, the consumer is not affected: as long as the message queue is functioning properly, it can continue to distribute the remaining messages.
From this, we can also see the decoupling effect of message queues: producers and consumers no longer depend on each other directly.
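The decoupling described above can be sketched with Python's standard `queue.Queue` standing in for the message queue. The producer and consumer functions, the simulated crash, and the `None` shutdown sentinel are assumptions made for this demo, not features of any particular message queue product.

```python
import queue
import threading

mq = queue.Queue()  # stands in for the message queue
processed = []


def producer():
    for order_id in range(1, 6):
        if order_id == 4:
            raise RuntimeError("producer crashed")  # simulated failure
        mq.put({"order_id": order_id})


def consumer():
    while True:
        message = mq.get()
        if message is None:  # sentinel: stop consuming
            break
        processed.append(message["order_id"])


worker = threading.Thread(target=consumer)
worker.start()

try:
    producer()
except RuntimeError:
    pass  # the producer is gone, but queued messages are still delivered

mq.put(None)  # shut the consumer down for this demo
worker.join()
```

The producer dies before sending the fourth message, yet the consumer still processes the three messages that were already in the queue. Neither side calls the other directly.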
Besides decoupling, message queues provide a buffer zone. Under high concurrency and heavy traffic, messages do not hit the consumer directly; they pass through the message queue first. As long as the message queue itself is kept under control, a traffic spike will not be the last straw that brings the consumer down.
Production-grade message queues generally take these abnormal situations into account. RabbitMQ, for example, provides flow control.
With message queue middleware in place, we should design the system for asynchronous processing wherever possible, for example the order completion notification and logistics shipping functions mentioned earlier.
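The buffering effect can be illustrated with a small deterministic simulation. The traffic figures and the consumer capacity below are made-up numbers chosen only to show a spike being absorbed by the backlog.

```python
from collections import deque

CONSUMER_CAPACITY = 10  # messages the consumer can handle per tick (assumed)
traffic = [5, 80, 3, 0, 0, 0, 0, 0, 0, 0]  # incoming messages per tick, with one spike

backlog = deque()
handled_per_tick = []
next_id = 0

for arrivals in traffic:
    # The spike lands in the queue, not on the consumer.
    for _ in range(arrivals):
        backlog.append(next_id)
        next_id += 1
    # The consumer drains the queue at its own pace.
    batch = min(CONSUMER_CAPACITY, len(backlog))
    for _ in range(batch):
        backlog.popleft()
    handled_per_tick.append(batch)
```

Even though 80 messages arrive in a single tick, the consumer never handles more than 10 per tick, and the backlog is fully drained by the end of the run.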
Notes on message queues
With message queues, the links between system services become more flexible and extensible. However, introducing a third-party component also means increased complexity.
Consistency, idempotency
Whether we like it or not, our system has been a distributed system ever since we introduced a message queue. For distributed systems, the most difficult problem is consistency. The CAP theorem and BASE theory are the starting points here; this article does not explain them in detail, so study them yourself if you are interested. What I want to stress is that a message queue cannot by itself guarantee message reliability and consistency. Producers and consumers have to work together to make the whole transmission link sound.
On the producer side, the local operation and the sending of the message should be treated as one transaction. If sending fails, the local operation should be rolled back, or the send should be retried on a schedule, so that the system reaches an eventually consistent state.
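A minimal sketch of the "roll back the local operation when sending fails" option, using an in-memory SQLite database. The `orders` table, the `place_order` function, and the `send` callback are hypothetical stand-ins for a real order service and a real queue client. This is a simplification: it does not cover the case where the send succeeds but the commit then fails, which is one reason scheduled retries are often used as well.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT)")


def place_order(order_id, product, send):
    """Commit the local write only if the message was handed to the queue."""
    try:
        with db:  # commits on success, rolls back if an exception escapes
            db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, product))
            send({"order_id": order_id, "product": product})
        return True
    except ConnectionError:
        return False


sent = []


def flaky_send(message):
    # Simulates a broker that cannot be reached.
    raise ConnectionError("broker unreachable")


ok = place_order(1, "book", sent.append)
failed = place_order(2, "pen", flaky_send)
rows = db.execute("SELECT id FROM orders ORDER BY id").fetchall()
```

Only the first order is stored, because the second insert is rolled back when its message cannot be sent.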
On the consumer side, processing should be idempotent. Because the network environment is complex, we cannot be sure whether a message was actually received in any given exchange, so message queues often retransmit messages after a failure or timeout.
Therefore, we cannot simply assume that each message arrives exactly once; that is unrealistic.
Typically, we add a unique business identifier to the message, plus a status field that is checked and updated within a transaction, to ensure the business operation is executed only once.
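The identifier-plus-status approach might look like the following sketch. The `shipments` table, the `PENDING` and `SHIPPED` status values, and the handler name are assumptions for illustration. The key point is the conditional update, which succeeds only the first time.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shipments (order_id TEXT PRIMARY KEY, status TEXT NOT NULL)")
db.execute("INSERT INTO shipments VALUES ('A1001', 'PENDING')")
db.commit()

shipped_count = 0


def handle_ship_message(message):
    """Ship an order at most once, however many times the message arrives."""
    global shipped_count
    with db:
        # Conditional update: only a PENDING row can move to SHIPPED.
        cursor = db.execute(
            "UPDATE shipments SET status = 'SHIPPED' "
            "WHERE order_id = ? AND status = 'PENDING'",
            (message["order_id"],),
        )
    if cursor.rowcount == 1:
        shipped_count += 1  # the real side effect would go here
        return True
    return False  # duplicate or unknown order: ignore


# The same message is delivered three times, as a retransmitting queue might do.
results = [handle_ship_message({"order_id": "A1001"}) for _ in range(3)]
```

Only the first delivery changes the status and triggers the side effect; the two retransmissions are recognized as duplicates and ignored.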
Ordering
Ordering is not difficult to achieve within a single process, as long as there is one global point of control. However, to avoid a single-machine bottleneck, message queues are usually deployed as a cluster. Guaranteeing order in a distributed setting is undoubtedly much harder.
Therefore, when a requirement calls for ordered messages, we should first revisit it from a business perspective and consider whether ordering is really necessary. Of course, popular message queue frameworks also provide ordering features. Kafka, for example, preserves order within a partition, so messages that must stay in order are sent to the same partition.
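The same-partition idea can be sketched as key-based partitioning: messages that share a key always land in the same partition, so their relative order is preserved there. The partition count, the keys, and the use of `zlib.crc32` as the hash are assumptions for illustration; this is not Kafka's actual partitioner.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count


def partition_for(key):
    # Stable hash so the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


partitions = defaultdict(list)
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "paid"),
    ("order-1", "shipped"),
]
for key, event in events:
    partitions[partition_for(key)].append((key, event))

# All events for one order sit in a single partition, in the order they were sent.
order_1_events = [e for k, e in partitions[partition_for("order-1")] if k == "order-1"]
```

Ordering is guaranteed only per key, not across the whole stream, which is usually what the business actually needs.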
Common message queues
RabbitMQ and Kafka, mentioned earlier, are among the mainstream message queues. ActiveMQ and Alibaba’s RocketMQ are also common.
RabbitMQ is developed in Erlang, implements the Advanced Message Queuing Protocol (AMQP), provides client libraries for many languages, and has a complete message routing mechanism.
Kafka offers high throughput and high-performance data processing, and is often used for log collection. For example, ELK log analysis setups often use Kafka as the message queue.
ActiveMQ is a message queue from Apache that implements the Java Message Service (JMS) API. It is a mature, fast open source messaging component.
RocketMQ is as high-performance as its name suggests. Because it was open-sourced by Alibaba, it also has an advantage in documentation, and it is closer to the way we Chinese programmers think about development.
Most of the message queue frameworks above offer high availability, high scalability, and high performance. My personal recommendation, of course, is the China-made RocketMQ.
Conclusion
Message queues are like the intermediaries in our lives: they come at a cost, but they do let us focus more on business development. Many mature open source projects are now available, and with the power of open source our programs will be more robust!