Overall documentation: Article directory Github: github.com/black-ant
A new chapter has been opened, and there will be 4-5 chapters on MQ message queues.
There is not much technical depth in this article, but rather an overview of MQ, and if you can clarify MQ, the purpose of this article will be fulfilled
A. The preface
There are many MQ frameworks and it is difficult to describe them all, but here are just a few:
- RabbitMQ :
- Kafka
- RocketMQ
- TubeMQ
- ZeroMQ
- ActiveMQ
2. Structure review
2.1 Prior knowledge: JMS
JMS is the basis for many MQ frameworks. JMS defines a set of interfaces and semantics that allow Java applications to communicate with other messaging implementations, specifying both the specification and the implementation structure.
JMS basis:
JMS has several important roles and purposes:
- Solve the problem of tight coupling of RPC system
- Makes it easy to develop business applications that asynchronously send and receive business data and events
- Can be easily and effectively supported by a wide variety of enterprise messaging products
- A relatively low level of abstraction that runs below the complementary layer
- Define a set of interfaces and semantics that allow Java applications to communicate with other messaging implementations
JMS membership composition:
- JMS Provider: a messaging system that implements the JMS specification.
- JMS Clients: A Java application that sends and receives messages
- Messages: Objects used to pass Messages between JMS clients
- Administered Objects: The pre-configured JMS objects created by the administrator to use the JMS client
Messaging model:
Two ways of messaging are defined in JMS
- Point-to-point (Queue destination)
In this model, messages are passed from a producer to a consumer. The message is delivered to the destination (a queue) and then to one of the consumers registered for that queue. Although any number of producers can send messages to the queue, each message is guaranteed to be delivered and consumed by a consumer. If no consumer is registered to use the message, the queue holds the message until the consumer registers to use it
- Publish/subscribe (subject destinations) :
In this model, messages are passed from producers to any number of consumers. The message is delivered to the topic destination, and then to all topics. In addition, any number of producers can send messages to topic destinations, and each message can be delivered to any number of subscribers. If no consumer is registered, the topic destination will contain no messages unless it provides a persistent subscription to the inactive consumer. A persistent subscription represents a consumer registered at the topic destination that can be inactive when a message is sent to the topic
JMS message composition:
The message consists of three parts: Header, Properties, and body
Header: Each message requires a header that contains information for routing and identifying the message. Some of these fields are set automatically by the JMS provider during the generation and delivery of the message, while others are set by the client based on the message
Properties (optional) : Provides values that the client can use to filter messages. They provide additional information about the data, such as the process that created the data, and when it was created. Attributes can be treated as extensions of headers and consist of attribute name/value pairs. Using properties, clients can fine-tune their selection of messages by specifying certain values as selection criteria
The body (also optional) contains the actual data to be exchanged. The JMS specification defines six types of messages that a JMS provider must support:
- “Message” : indicates that there is no Message body
- StreamMessage: A message whose body contains a Java primitive type stream. It is written and read sequentially
- MapMessage: The body of a message contains a set of name/value pairs. The order of items is not defined
- TextMessage: The body of the message contains a Java string
- ObjectMessage: The body of the message contains the serialized Java object
- BytesMessage: A message whose body contains an uninterpreted byte stream
2.2 the RabbitMQ
RabbitMq is based on the Advanced Message Queuing Protocol (AMQP). RabbitMq has the following features:
- Support for many messaging protocols, such as AMQP, MQTT, STOMP, and hence also known as hybrid brokers.
- Support for several publish-subscribe, point-to-point, request-reply messaging technologies.
- The smart Broker/Dumb consumer model is used and is committed to delivering messages consistently to consumers.
- Support for Java, Ruby, NET, PHP, and many other languages, and provides plug-ins that you can add to extend use cases and integration scenarios.
- Synchronous and asynchronous communication modes are provided.
RabbitMQ main concepts:
- Virtual host: A virtual host holds a set of message switches, queues, and bindings
- Message: A message is not named and consists of a header and a body
- Bind: The switch needs to be bound to queues
- Channel: an independent two-way data channel in a multiplexing connection.
- Switch: Exchange forwards messages, but it does not store them
ExchangeType
- Fanout: Routes all messages sent to the Exchange to all queues bound to it
- Direct: The Exchange routing rule of the direct type is also very simple. It routes messages to queues whose binding keys exactly match routing keys
- Topic: Routes messages to queues where the binding key matches the routing key. Wildcards can also be used, such as * to match any text in a particular location, and “. The routing key is divided into several parts, “#” matches all rules, etc
- Headers: Based on the header attribute in the message content
The RabbitMQ features:
RabbitMQ uses mechanisms to ensure Reliability, such as persistence, transport confirmation, and release confirmation.
Flexible Routing Is used to route messages through Exchange before they are queued. For typical routing, RabbitMQ already provides some built-in Exchange. For more complex routing capabilities, you can bind multiple Exchanges together or implement your own Exchange through a plug-in mechanism.
More than one RabbitMQ server can form a cluster to form a logical Broker.
Highly Available Queues can be mirrored on machines in a cluster, making Queues usable even if some nodes fail.
RabbitMQ supports a variety of message queuing protocols, such as STOMP and MQTT.
RabbitMQ supports almost all common languages, such as Java,.NET, Ruby, and so on.
RabbitMQ provides an easy-to-use user interface that enables users to monitor and manage many aspects of a message Broker.
If a message fails, RabbitMQ provides a message Tracing mechanism so that the user can find out what happened.
RabbitMQ provides many plug-ins to extend RabbitMQ in many ways, or you can write your own.
The RabbitMQ process:
2.3 Kafka
Apache Kafka is a distributed data store optimized for real-time extraction and processing of streaming data. Streaming data is data continuously generated by thousands of data sources, usually sending data records simultaneously. The streaming platform needs to process this continuous flow of data, step by step, in order.
Kafka offers its users three main features
- Publish and subscribe record flows
- Efficiently store record streams in the order in which they were generated
- Process the record stream in real time
Members of Kafak:
- Producer: Publishes messages to the partitions of a specific Topic according to the partition method.
- ** The Kafka cluster ** receives messages from Producer and persists them to hard disks
- Consumer: Pulls messages from the Kafka cluster and controls the offset from which messages are retrieved
- Topic: Different classifications of message sources that Kafka processes
- Partition: physical grouping of a Topic. A Topic can be divided into multiple partitions. Each partition is an ordered queue, and each message is assigned an ordered ID (offset).
- Replicas: indicates the set of replicas of partitions
- Leader: a replicas role in which producer and consumer interact only with the leader
- Follower: a role in replicas that replicates data from the leader. Followers are the leader’s candidate
- Message: A basic unit of communication in which each Producer can publish messages to a Topic
- Consumer Group: A message can be consumed by one Consumer in multiple groups
Kafka process:
Pub Sub process:
- Producers periodically send messages to topics.
- The Kafka agent stores all messages in the partition configured for that particular topic. It ensures that messages are shared equally between partitions. If the producer sends two messages and has two partitions, Kafka stores one message in the first partition and the second message in the second partition.
- Consumers subscribe to specific topics.
- Once a consumer subscribes to a topic, Kafka provides the consumer with the current offset of the topic and also stores the offset in the Zookeeper ensemble.
- Consumers will periodically request new Kafka messages (such as 100 Ms).
- Once Kafka receives messages from producers, it forwards them to consumers.
- The consumer will receive the message and process it.
- Once the message is processed, the consumer sends an acknowledgement to the Kafka agent.
- Once Kafka receives confirmation, it changes the offset to the new value and updates it in Zookeeper. Because offsets are maintained in Zookeeper, consumers can correctly read the next message, even during server violence.
- The process repeats until the consumer stops requesting it.
- Consumers can fall back/jump to the desired topic offset at any time and read all subsequent messages.
Subscription process:
- Producers send messages to a topic at regular intervals.
- Kafka stores all messages in a partition configured for that particular topic, similar to the previous scheme.
- A single consumer subscribes to a particular Topic, assuming that topic-01 is Group and ID is group-1.
- Kafka interacts with consumers in the same way as publish-subscribe messages, until the new consumer subscribes to the same Topic topic-01 with the same group ID
- Once a new consumer arrives, Kafka switches its operations to shared mode and shares data between the two consumers. This sharing continues until the number of users reaches the number of partitions configured for that particular topic.
- Once the number of consumers exceeds the number of partitions, new consumers will not receive any further messages until existing consumers unsubscribe from any of them. This happens because each consumer in Kafka will be assigned at least one partition, and once all partitions have been assigned to existing consumers, new consumers will have to wait.
Kafka architecture:
Kafka contrast the RabbitMQ
2.4 RocketMQ
Apache RocketMQ is a unified messaging engine and lightweight data processing platform
RocketMQ members:
- Producer: Is responsible for producing messages. Producers send messages to the message server that are generated by the business application system.
- Consumer: Responsible for consuming messages, the Consumer pulls information from the message server and enters it into the user application.
- A message Broker is a message storage center that receives messages from producers and stores them. Consumers get messages from the Broker.
- NameServer: Used to hold meta information about Broker topics and provide producers and consumers with information about brokers.
RocketMQ process
1. Start Namesrv. After Namesrv starts, it listens to the port and waits for brokers, producers, and consumers to connect, acting as a routing control center.
The Broker keeps a long connection with all NamesRVs and sends heartbeat packets periodically. The heartbeat package contains the current Broker information (IP+ port, etc.) and stores all Topic information. After registration, the Namesrv cluster has a mapping between topics and brokers.
When creating a Topic, you need to specify which brokers the Topic will be stored on. A Topic can also be created automatically when a message is sent.
4. When the Producer sends messages, it establishes a long connection with one of the Namesrv clusters and obtains the brokers of the currently sent Topic from Namesrv. Then it establishes a long connection with the corresponding brokers and sends messages directly to the brokers.
Consumers are similar to producers. Establish a long connection to one of the NamesRVs, obtain which brokers the current subscribed Topic exists on, and then establish a connection channel directly with the brokers to start consuming messages.
This picture comes from my own understanding, I am not sure if there is a problem….
Compare other message queues (PS: RocketMQ official)
2.5 ActiveMQ
Apache ActiveMQ is the most popular open source, multi-protocol, Java-based messaging server. It supports industry standard protocols,
Apache ActiveMQ is fast, supports many cross-language clients and protocols, has an easy-to-use enterprise integration pattern and many advanced features, while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 license
Apache ActiveMQ Proxy is an implementation of Java Messaging Service (JMS). JMS is a Java specification that allows applications to send data back and forth to each other in a simple and standard way.
ActiveMQ is a JMS provider. The JMS provider forms a software framework to facilitate the use of JMS concepts in applications. A node of ActiveMQ that allows clients to connect to it and use these messaging concepts is called “ActiveMQ Broker”
ActiveMQ features:
1Supports multiple cross-language protocols: Java, C, C++, C#, Ruby, Perl, Python, PHP2Supports enterprise integration solutions that can be integrated in JMS clients and Message Brokers3Supports many such as information groups, virtual destinations, and comprehensive destinations4Full JMS support1.1And J2EE1.4Supports transient, persistent, transactional, and XA messaging5Support for Spring6Support for common J2EE servers7Supports hot swap8High-performance logging9High-performance cluster support10Provide Web API11Allow web browsers to become part of the messaging structureCopy the code
ActiveMQ mode:
ActiveMQ is implemented based on JMS, so it supports JMS functionality:
P2P messaging domains use queues as destinations. Messages can be sent and received synchronously or asynchronously, and each message is sent to a Consumer only once.
The Pub/Sub (Publish/Subscribe, Publish/Subscribe) message domain uses topics as destinations, with publishers sending messages to topics and subscribers registering to receive messages from topics. Any message sent to topic is automatically delivered to all subscribers. The receiving mode (synchronous or asynchronous) is the same as that of the P2P domain.
Working mode of ActiveMQ
Once messages enter the system, they are arranged into two modes: queue and topic. Queues are FIFO (first-in, first-out) pipes for messages generated and consumed by agents and clients. Producers create messages and push them to these queues. The consumer application then polls and collects these messages, one at a time. Topics are subscription-based message broadcast channels. When a production application sends a message, multiple recipients that “subscribe” to the topic receive a broadcast of the message. This generation application is sometimes called a publisher in the context of topic messages,
ActiveMQ process:
Here is the logic involved in confirming consumption:
- (1) The customer receives the message;
- (2) Customer processing messages;
- (3) The message is confirmed;
There are four validation mechanisms:
Session. AUTO_ACKNOWLEDGE: customer (consumer) success from the receive () method returns, or from a MessageListener. The onMessage method returns successfully, the Session confirmation message automatically, and then automatically delete messages.
Session.CLIENT_ACKNOWLEDGE: The client acknowledges the message by explicitly calling the acknowledge method of the message. Call message.acknowledge() on the receiving end; Method, otherwise, the message will not be deleted.
Session. DUPS_OK_ACKNOWLEDGE : This is a “lazy” acknowledgement of the message, which may be repeated. On the second redelivery, JMSRedelivered in the header is set to true to indicate that the current message has been delivered once, and the client needs to control the reprocessing of the message.
Session.SESSION_TRANSACTED: The transaction was submitted and confirmed.
ActiveMQ versus Rabbit MQ
Based on version ———- > :
- ActiveMQ is an open source message broker based on the Java message service client,
- RabbitMQ is implemented on the Advanced Message Queueing protocol.
How it works ———- > : RabbitMQ works centrally, which makes it a unique approach. RabbitMQ is very portable and user friendly. Because large operations such as load balancing or persistent message queuing only run on a limited number of lines of code. But this approach is less scalable and slower because the delay is added from the size of the central node and message envelope. ActiveMQ is easier to implement and offers advanced features such as clustering, caching, logging, and message storage.
RabbitMQ is embedded in the application and acts as a halfway service. It distinguishes between supporting encryption, storing data on disk as pre-planned in the event of an outage, forming clusters, and repeating services for high reliability. It is deployed on the OTP platform to ensure maximum scalability and stability of queues as critical nodes of the entire system.
Implementation ———- > : ActiveMQ consists of a Java message service client, which can support multiple clients or servers. RabbitMQ is implemented to design the advanced message queuing protocol. It was extended to support different protocols, such as MQTT and STOMP.
Other advantages ———- >: RabbitQ supports multiple messaging protocols and provides acknowledgement and message queues. It can be enabled in various languages, such as Python, And Java. It also lets developers use apps like Chef, Docker, and Puppet. It provides high throughput and high availability by developing possible clusters. With support for pluggable authentication and authorization, it can easily handle public and private clouds. Http-api is a command-line tool with a user interface that helps manage and monitor RabbitMQ.
ActiveMQ has many advantages and can be applied according to needs with high efficiency. It supports c, C + +, and. NET and Python, which can be embedded in multi-platform applications through the advanced Message queue protocol. It enables the flexible exchange of messages between Web applications through the stream-oriented text messaging protocol STOMP. It also programs to manage iot devices.
2.6 ZeroMQ
This MQ is awesome, notice the big difference:
ZeroMQ basis
ZeroMQ (also known as ø MQ, 0MQ or ZMQ) is a high-performance asynchronous messaging library (not a framework) designed for distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, ZeroMQ systems can run without a dedicated message broker.
It is important to note that Zero MQ is not a messaging middleware in the normal sense. It is a transport layer, a library for implementing a messaging and communication system between applications and processes – fast and asynchronous – that is intended to be part of the standard network protocol stack
ZeroMQ supports common messaging patterns (PUB/Sub, Request/Reply, Client/Server, etc.) through various transports (TCP, in-process, interprocess, multicast, WebSocket, etc.), making inter-process messaging as easy as inter-thread messaging. This makes your code clear, modular, and extremely easy to extend.
Interestingly, Zero MQ offers different implementations for different languages; Java, for example, can find a JZMQ on top of Git
ZeroMQ logical architecture
- Sockets: Using Sockets, users interact with these Sockets in a manner similar to TCP Sockets, the difference being that each socket can handle multiple peer communications
- Worker Threads: Various objects reside in a Worker Thread and each object is held by a parent object (ownership is represented in the diagram by a simple full line). Many objects are held directly by sockets; However, there are instances where entities are controlled by objects owned by sockets
- Listener: TCP Listener entity listens for incoming TCP connections and generates engine/session objects for each new connection
- Session: This is the Session object that communicates with the ZeroMQ socket
- Engine: The Engine object communicates with the network
- Pipe: When a session and a socket exchange messages. Pipe objects have two message-passing directions that handle each direction of the message to be delivered in them. In effect, each pipe is a lockless queue used to pass messages quickly between threads
ZeroMQ from blow
TCP: ZeroMQ is message-based, using message patterns instead of byte streams. XMPP: ZeroMQ is simpler, faster, and lower-level. Jabber can be built on top of ø MQ. AMQP: ZeroMQ does the same job faster100Times, and no proxy is required. (The specification is more concise -- less documentation than the AMQP specification278IPC: ZeroMQ can communicate across hosts CORBA: ZeroMQ won't impose a terrifically complex message format on you. RPC: ZeroMQ is completely asynchronous and you can add/remove participants at any time. RFC1149ZeroMQ is much faster! 29West LBM: ZeroMQ is free software! IBM Low-latency: ZeroMQ is free software! Tibco: ZeroMQ is still free software!Copy the code
ZeroMQ message mode
Release subscription
Pull or Push
Asynchronous request response
ZeroMQ is different from RabbitMQ
ZeroMQ comparing Kafka
2.7 TubeMQ
TubeMQ is produced by Tencent and has been donated to Apache. TubeMQ is a trillion-level distributed messaging middleware, which focuses on data transmission and storage under massive data. It has unique advantages in performance, reliability and cost
TubeMQ member architecture
Portal: the Portal part responsible for external interaction and o&M operations, including API and Web. API interconnects with management systems outside the cluster, and Web encapsulates daily O&M functions based on API.
Master: Responsible for the Control part of cluster Control, which consists of one or more Master nodes. Master HA is completed through heartbeat keepalive and real-time hot standby switchover between Master nodes (this is why you need to fill in the address of all Master nodes corresponding to the cluster when using the Lib of TubeMQ). The Master manages the cluster status, resource scheduling, permission check, and metadata query.
Broker: Responsible for the Store part of the actual data Store, which is composed of independent Broker nodes. Each Broker node manages the collection of topics within the node, including adding, deleting, modifying, and searching topics. Message storage, consumption, aging, partition expansion, offset records of data consumption within a Topic, cluster external capabilities, including the number of topics, throughput, capacity, etc., are completed by horizontally expanding Broker nodes.
Client: The Client part responsible for data production and consumption, which is provided in the form of Lib. The most commonly used part is the consumer side. Compared with the previous, the consumer side now supports Push and Pull data pulling modes, and the data consumption behavior supports sequential and filtered consumption. For the Pull consumption mode, support business through the client to reset the precise offset to support extractly-once consumption, at the same time, the consumer launched a new cross-cluster switch without restart BidConsumer client;
Zookeeper: responsible for the ZK part of offset storage. The function of this part has been weakened to the persistent storage of offset only. In consideration of the following multi-node copy function, this module is temporarily retained.
TubeMQ characteristics
- The pure Java implementation language
- Introduce the Master coordination node: Unlike Kafka, which relies on Zookeeper for metadata management and HA guarantee, TubeMQ system uses self-managed metadata arbitration mechanism. Master node uses embedded database BDB to store metadata in the cluster, update metadata, and implement HA eagerly. Responsible for TubeMQ cluster operation control and configuration management operation, external interface; Through the Master node, Broker configuration Settings, changes, and queries in the TubeMQ cluster are fully automated, closed-loop management, reducing the complexity of system maintenance
- Server-side consumption load balancing: TubeMQ uses service-side load balancing instead of client-side operation to improve system management and control capabilities and simplify client-side implementation, making it easier to upgrade the balancing algorithm
- System row-level locking: Row-level locking is used for concurrent operations that have intermediate states in the read and write of Broker messages to avoid duplication
- Offset management adjustment: Offset is managed independently by each Broker, and ZK is only used for data persistence. (Initially, it was considered to remove ZK dependency completely, but it was temporarily retained in consideration of subsequent function expansion.)
- Message read mechanism improvements: TubeMQ adopts the message random read mode, and in order to reduce the message delay, it also increases the memory cache read and write. For machines with SSD devices, it adds the processing of message lag to SSD consumption, to solve the problems of decreased throughput, small SSD disk capacity and limited flush times when consumption lags seriously. Make it meet the needs of business rapid production and consumption
- Consumer behavior control: Real-time and dynamic control of system access consumer behavior through policies, including traffic limiting and consumption suspension of specific services when the system is under high load, and dynamic adjustment of data pulling frequency, etc.
- Hierarchical service management and control: According to different requirements of system operation and maintenance, business characteristics and machine load state, the system supports operation and maintenance to dynamically control the consumption behaviors of different consumers through policies, such as access to consumption, hierarchical guarantee of consumption delay, consumption flow limiting control, and data pull frequency control
- System security management and control: According to the data service needs of different services and the consideration of system operation and maintenance security, TubeMQ system adds TLS transport layer encryption pipeline, authentication and authorization of production and consumption services, and access token management for distributed access control to meet the requirements of business and system operation and maintenance in system security
- Resource utilization improvement: Compared with Kafka, TubeMQ adopts connection reuse mode to reduce connection resource consumption; The logical partition is constructed to reduce the system’s occupation of file handles. The server filtering mode is used to reduce the network bandwidth resource usage. By stripping the use of Zookeeper, you can reduce the strong dependence and bottleneck of Zookeeper
- Client improvements: For ease of business use, we have simplified the client logic to a minimum set of features, adopted a reception quality statistics algorithm based on response messages to automatically filter out bad Broker nodes, and made connection attempts based on first use to avoid blocking the delivery of large volumes of data
TubeMQ contrast
TODO
2.8 DDMQ
DDMQ is a message queue product built by Didi Chuxing Architecture Department based on Apache RocketMQ
DDMQ structure
Because it is a domestic development, the documentation is quite similar to RocketMQ structure, you can directly see the documentation @ www.oschina.net/p/ddmq
2.9 IBM MQ
Function:
You can use IBM®MQ to enable applications to communicate at different times and in many different computing environments.
IBM MQ can transport any type of data as a message, enabling businesses to build flexible, reusable architectures, such as service-oriented Architecture (SOA) environments. It works with a wide range of computing platforms, applications, Web services, and communication protocols for rich and secure messaging. IBM MQ provides a communications layer for visibility and control of the flow of messages and data within and outside the organization.
The characteristics of
- Versatile messaging integration from mainframes to mobile devices, providing a single, robust messaging backbone for dynamic heterogeneous environments.
- Messaging with rich security features that produce auditable results.
- High performance messaging to improve speed and reliability of delivering data.
- Administration features that simplify message management and reduce the time spent using complex tools.
- Open standard development tools that support scalability and business growth.
PS: IBM MQ is only mentioned in the project at this stage, but in fact his understanding is not very deep, here first leave a pit, detailed can see the IBM official website, they are very perfect
3. Summary
Each framework has its own trade-offs. ZeroMQ, for example, is fast, but gives up some integrity, requiring the right choice.
Attached is a pulled table, the content of the table is not guaranteed for the time being, there are plans for several documents behind, which will be changed during the pressure test
Thank you
@ juejin. Cn/user / 322782…
@ www.ibm.com/
@ rocketmq.apache.org/docs/motiva…
@ www.oschina.net/p/ddmq