Message middleware

Message middleware concepts

Message

A message is the data transferred between applications. It can be very simple, such as a plain text string or a JSON document, or more complex, such as a serialized embedded object.

Message Queue Middleware

Message queue middleware, often abbreviated MQ and also called a message queue or message middleware, uses an efficient and reliable messaging mechanism for platform-independent data exchange and integrates distributed systems on top of that data communication. By providing message passing and message queuing models, it extends interprocess communication to a distributed environment.

It generally has two delivery modes:

  • Point-to-point (P2P): based on queues. Message producers send messages to a queue and message consumers receive messages from it. The queue is what makes asynchronous transmission possible.
  • Publish/subscribe (Pub/Sub): defines how messages are published to and subscribed from a content node called a topic. The topic acts as an intermediary for delivery: publishers publish messages to the topic and subscribers receive messages from it. The topic keeps publishers and subscribers independent of each other, so they do not need to contact each other for messages to be delivered. Publish/subscribe is used for one-to-many broadcast of messages.
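
The two delivery modes can be contrasted with a small in-memory sketch. It is not tied to any real middleware; the class names `PointToPointQueue` and `Topic` are hypothetical and exist only for illustration.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message is handed to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Toy topic: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


# Point-to-point: the message is gone once one consumer has taken it.
p2p = PointToPointQueue()
p2p.send("order-1")
first = p2p.receive()
second = p2p.receive()

# Publish/subscribe: both subscribers see the same message.
inbox_a, inbox_b = [], []
topic = Topic()
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("price-update")
```

In this sketch the publisher never references a subscriber directly, which is the independence the topic is meant to provide.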

The role of messaging middleware

  • Decoupling: Message-oriented middleware inserts an implicit, data-based interface layer between the two sides of a process. Both sides implement this interface, so you can extend or modify either one independently as long as both adhere to the same interface constraints.
  • Redundancy (storage): Data processing sometimes fails. Message-oriented middleware persists data until it has been fully processed, which avoids the risk of losing it.
  • Scalability: Because message-oriented middleware decouples application processing, it is easy to raise the rate at which messages are enqueued and processed by adding more processing instances, with no code changes or parameter tuning.
  • Peak shaving: When traffic surges, message-oriented middleware lets key components absorb the sudden access pressure instead of collapsing under the overload.
  • Recoverability: The failure of one part of the system does not bring down the entire system.
  • Buffering: Writes to message-oriented middleware are handled as quickly as possible, and the middleware acts as a buffer layer that lets downstream tasks run at their most efficient pace. This buffer layer helps control and optimize the speed at which data flows through the system.
  • Asynchronous communication: Applications often do not want or need to process a message immediately. Message-oriented middleware provides an asynchronous mechanism: an application can put messages into the middleware and process them later, when needed.

RabbitMQ basic concepts

RabbitMQ is essentially a producer and consumer model that receives, stores, and forwards messages. Think of sending a package: you hand it to the post office, which stores it until a postman eventually delivers it to the recipient. RabbitMQ plays the roles of post office, mailbox, and postman together. In computing terms, the RabbitMQ model is closer to a switch (exchange) model.

Producers and consumers

Producer

The party that delivers messages. The producer creates a message and publishes it to RabbitMQ. A message generally has two parts: the message body and a label. The message body, also called the payload, is usually data with a business-logic structure. The label describes the message, for example the name of an exchange and a routing key. The producer hands the message to RabbitMQ, which then uses the label to send it to interested consumers.

Consumer

The party that receives messages. The consumer connects to the RabbitMQ server and subscribes to a queue. When a consumer consumes a message, it consumes only the message body (payload). The label is discarded during routing, so only the message body is stored in the queue. The consumer therefore does not know who produced the message, and it does not need to.

Broker: the service node of the message-oriented middleware

For RabbitMQ, a RabbitMQ Broker can simply be regarded as a RabbitMQ service node, or RabbitMQ service instance. In most cases a RabbitMQ Broker can also be thought of as a RabbitMQ server.

The running process of a message queue

The producer may first wrap (serialize) the business data, then encapsulates it as a message and sends it to the Broker (Basic.Publish in AMQP). The consumer subscribes and receives the message (Basic.Consume or Basic.Get in AMQP), unwraps it if necessary to recover the raw data, and then runs the business processing logic. The business processing logic does not have to run on the same thread as the logic that receives messages. A consumer process can use one thread to receive messages and store them in memory, for example in a BlockingQueue in Java, while another thread reads from memory and runs the business logic. This further decouples the application and improves its overall processing efficiency.
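
The two-thread hand-off described above can be sketched as follows. This is a minimal illustration that uses Python's `queue.Queue` in the role the text gives to BlockingQueue in Java. No broker is involved: the `incoming` list stands in for deliveries from RabbitMQ, and `receiver`, `worker`, and the upper-casing "business logic" are placeholders invented for the example.

```python
import queue
import threading

buffer = queue.Queue()   # in-memory hand-off between the two threads
processed = []           # results of the (placeholder) business logic
STOP = object()          # sentinel telling the worker to finish


def receiver(incoming):
    # Stands in for the thread that receives deliveries from the broker.
    for body in incoming:
        buffer.put(body)
    buffer.put(STOP)


def worker():
    # Runs the business logic on a separate thread.
    while True:
        body = buffer.get()      # blocks until a message is available
        if body is STOP:
            break
        processed.append(body.upper())


receive_thread = threading.Thread(target=receiver, args=(["a", "b", "c"],))
work_thread = threading.Thread(target=worker)
work_thread.start()
receive_thread.start()
receive_thread.join()
work_thread.join()
```

Because the receiving thread only enqueues, a slow business step delays the worker but not the receipt of messages.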

Queue

A queue is an internal RabbitMQ object used to store messages. In RabbitMQ, messages can only be stored in queues. This is the opposite of messaging middleware such as Kafka, which stores messages at the logical level of the topic; there, the corresponding queue logic is just an offset marker in the topic’s actual storage file. A RabbitMQ producer produces a message that is ultimately delivered to a queue, from which consumers retrieve and consume it. Multiple consumers can subscribe to the same queue. In that case the messages in the queue are shared out among the consumers in round-robin fashion, rather than every consumer receiving and processing all of the messages.
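
Round-robin sharing among the consumers of one queue can be pictured with a short sketch. It is a simplified model, not RabbitMQ code; the function name `dispatch_round_robin` and the consumer names are assumptions made for the example.

```python
from itertools import cycle


def dispatch_round_robin(messages, consumer_names):
    """Hand each message to the next consumer in turn."""
    received = {name: [] for name in consumer_names}
    turn = cycle(consumer_names)
    for message in messages:
        received[next(turn)].append(message)
    return received


shares = dispatch_round_robin(["m1", "m2", "m3", "m4", "m5"], ["c1", "c2"])
# c1 gets m1, m3, m5 and c2 gets m2, m4: no message is delivered twice.
```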

Exchanges, routing keys, bindings

Exchange

In RabbitMQ, a producer sends a message to an exchange (often drawn as a capital “X” in diagrams), and the exchange routes the message to one or more queues. If the message cannot be routed, it is either returned to the producer or discarded. For now, you can think of the exchange in RabbitMQ as a simple entity.

The common exchange types in RabbitMQ are fanout, direct, topic, and headers. Each type has a different routing policy.

  • Fanout: routes every message sent to the exchange to all queues bound to that exchange.
  • Direct: routes a message to the queues whose BindingKey exactly matches the message’s RoutingKey.
  • Topic: also routes by matching RoutingKey against BindingKey, but with pattern-matching rules. Keys are dot-separated words; in a BindingKey, “*” matches exactly one word and “#” matches zero or more words.
  • Headers: ignores the routing key and matches on the headers attribute of the message instead. A binding between a queue and the exchange specifies a set of key/value pairs. When a message is sent to the exchange, RabbitMQ takes the message’s headers (also key/value pairs) and compares them with the pairs specified on the binding. If they match, the message is routed to that queue; otherwise it is not. Headers exchanges perform poorly and are rarely used in practice.
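
The routing rules of the fanout, direct, and topic types can be modelled in a few lines. The sketch below is an in-memory approximation, not RabbitMQ’s implementation; `topic_matches` and `route` are hypothetical helper names, bindings are represented as (queue name, BindingKey) pairs, and the headers type is left out.

```python
def topic_matches(binding_key, routing_key):
    """Return True if a topic BindingKey pattern matches a RoutingKey.

    Keys are dot-separated words; "*" matches exactly one word and
    "#" matches zero or more words.
    """
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p, w):
        if p == len(pattern):
            return w == len(words)
        if pattern[p] == "#":
            # Let "#" absorb zero, one, or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


def route(exchange_type, bindings, routing_key):
    """Return the sorted queue names that would receive the message."""
    if exchange_type == "fanout":
        matched = [queue for queue, _ in bindings]
    elif exchange_type == "direct":
        matched = [queue for queue, key in bindings if key == routing_key]
    elif exchange_type == "topic":
        matched = [queue for queue, key in bindings
                   if topic_matches(key, routing_key)]
    else:
        raise ValueError(f"unsupported exchange type: {exchange_type}")
    return sorted(set(matched))


bindings = [("q1", "*.rabbit.*"), ("q2", "lazy.#"), ("q3", "quick.rabbit.fast")]
fanout_targets = route("fanout", bindings, "anything")
direct_targets = route("direct", bindings, "quick.rabbit.fast")
topic_targets = route("topic", bindings, "quick.rabbit.fast")
```

With the same bindings, the fanout exchange ignores the key entirely, the direct exchange needs a literal match, and the topic exchange treats the BindingKey as a pattern.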

RoutingKey

When a producer sends a message to an exchange, it usually specifies a RoutingKey, which sets the routing rule for that message. The RoutingKey only takes effect in combination with the exchange type and the BindingKey. Once the exchange type and BindingKey are fixed, the producer decides where a message goes by choosing the RoutingKey when it sends the message to the exchange.

Binding

RabbitMQ associates an exchange with a queue through a binding. A binding usually specifies a BindingKey, so that RabbitMQ knows how to route messages to the queue correctly.

Connection and Channel

Both producers and consumers need to establish a Connection to the RabbitMQ Broker. This Connection is a TCP connection. Once the TCP connection is established, the client can create AMQP Channels on it, each of which is assigned a unique ID. A Channel is a virtual connection built on top of a Connection, and every AMQP command that RabbitMQ processes is carried over a Channel.

Why introduce Channels when a Connection alone would do? Imagine an application with many threads that consume from or publish to RabbitMQ. One Connection per thread would mean many TCP connections. Setting up and tearing down TCP connections is expensive for the operating system, and at peak usage this becomes a performance bottleneck. RabbitMQ therefore multiplexes a TCP connection, in a manner similar to non-blocking I/O (NIO), which reduces overhead and simplifies management.

Each thread holds its own Channel, and the Channels share the Connection’s TCP connection. RabbitMQ keeps each thread’s Channel private, as if it were a separate connection. When the traffic on each Channel is modest, multiplexing a single Connection saves TCP connection resources without creating a performance bottleneck. When the traffic on the Channels themselves is very heavy, however, sharing one Connection among many Channels becomes the bottleneck and limits overall throughput. At that point you need to open several Connections and spread the Channels evenly across them.
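
Spreading Channels evenly across several Connections can be sketched with a simple placement function. This is a bookkeeping model only: it opens no sockets, and the `Connection` class and `spread_channels` function are hypothetical names, not part of any RabbitMQ client.

```python
class Connection:
    """Toy stand-in for one TCP connection that hosts numbered channels."""

    def __init__(self, name):
        self.name = name
        self.channels = {}      # channel ID -> owning thread name
        self._next_id = 1

    def open_channel(self, owner):
        channel_id = self._next_id   # IDs are unique within a connection
        self._next_id += 1
        self.channels[channel_id] = owner
        return channel_id


def spread_channels(owners, connections):
    """Give every owner a private channel, rotating over the connections."""
    placement = {}
    for index, owner in enumerate(owners):
        connection = connections[index % len(connections)]
        placement[owner] = (connection.name, connection.open_channel(owner))
    return placement


connections = [Connection("conn-A"), Connection("conn-B")]
placement = spread_channels(["t1", "t2", "t3", "t4"], connections)
```

Four threads end up with four private channels but only two TCP connections, two channels on each.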

RabbitMQ operation process

The process by which a producer sends a message

  1. The producer connects to the RabbitMQ Broker, establishes a Connection, and opens a Channel.
  2. The producer declares an exchange and sets its properties, such as the exchange type and whether it is durable.
  3. The producer declares a queue and sets its properties, such as whether it is exclusive, durable, and auto-deleted.
  4. The producer binds the exchange to the queue with a binding key.
  5. The producer sends a message to the RabbitMQ Broker; the message carries the routing key, the exchange name, and other information.
  6. The exchange looks for matching queues based on the routing key it receives.
  7. If a match is found, the message from the producer is stored in the matching queue.
  8. If no match is found, the message is discarded or returned to the producer, depending on the properties the producer set.
  9. Close the channel.
  10. Close the connection.
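
Steps 2 through 8 of the producer flow can be traced with a toy in-memory broker. This is a sketch under simplifying assumptions: the `Broker` class is hypothetical, it only implements direct-style exact matching, and its method names merely echo AMQP operations; it is not a client library and makes no network calls.

```python
class Broker:
    """Toy broker with direct-style (exact match) routing only."""

    def __init__(self):
        self.exchanges = {}   # exchange name -> list of (queue, binding key)
        self.queues = {}      # queue name -> list of message bodies
        self.returned = []    # unroutable messages handed back to the producer

    def exchange_declare(self, name):
        self.exchanges.setdefault(name, [])

    def queue_declare(self, name):
        self.queues.setdefault(name, [])

    def queue_bind(self, queue, exchange, binding_key):
        self.exchanges[exchange].append((queue, binding_key))

    def basic_publish(self, exchange, routing_key, body, mandatory=False):
        targets = [queue for queue, key in self.exchanges[exchange]
                   if key == routing_key]
        for queue in targets:
            self.queues[queue].append(body)      # step 7: store in the queue
        if not targets and mandatory:
            self.returned.append(body)           # step 8: return to producer
        return bool(targets)                     # False also covers "discarded"


broker = Broker()
broker.exchange_declare("orders")                         # step 2
broker.queue_declare("billing")                           # step 3
broker.queue_bind("billing", "orders", "order.created")   # step 4
routed = broker.basic_publish("orders", "order.created", "order-42")
came_back = broker.basic_publish("orders", "order.cancelled", "order-43",
                                 mandatory=True)
dropped = broker.basic_publish("orders", "order.unknown", "order-44")
```

The third publish shows the silent-discard case: without the mandatory flag the producer in this model gets no message back.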

The process by which a consumer receives a message

  1. The consumer connects to the RabbitMQ Broker, establishes a Connection, and opens a Channel.
  2. The consumer asks the RabbitMQ Broker to consume messages from the corresponding queue, possibly registering a callback function and doing other preparatory work.
  3. The RabbitMQ Broker responds and delivers messages from the corresponding queue, and the consumer receives them.
  4. The consumer acknowledges (acks) the messages it has received.
  5. RabbitMQ removes the acknowledged messages from the queue.
  6. Close the channel.
  7. Close the connection.

RabbitMQ data reliability

Confirmation and rejection at the consumer end

To ensure that messages reliably reach consumers from queues, RabbitMQ provides a message acknowledgement mechanism. When subscribing to a queue, a consumer can specify the autoAck parameter. When autoAck is false, RabbitMQ waits for an explicit acknowledgement from the consumer before removing the message from memory (or disk); in practice it first marks the message for deletion and deletes it later. When autoAck is true, RabbitMQ treats a message as acknowledged as soon as it is sent and deletes it from memory (or disk), whether or not the consumer actually processed it.

With message acknowledgement, as long as autoAck is set to false, the consumer has as much time as it needs to process a message. It does not need to worry about losing the message if the consumer process crashes during processing, because RabbitMQ keeps the message until the consumer explicitly calls Basic.Ack.

When autoAck is false, the messages in a queue fall into two groups on the RabbitMQ server: those waiting to be delivered to a consumer, and those already delivered to a consumer but not yet acknowledged. If RabbitMQ has not received an acknowledgement and the consumer that took the message disconnects, RabbitMQ requeues the message for delivery to the next consumer, which may be the same consumer as before.

RabbitMQ does not set an expiration time on unacknowledged messages. Its only criterion for redelivering a message is whether the consumer that received it has disconnected. This design allows a consumer to take a long time to process a message.
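
The split between ready and unacknowledged messages, and the requeue on disconnect, can be modelled as below. This is a simplified in-memory sketch that assumes autoAck is false; the `AckQueue` class and its method names are hypothetical and do not correspond to a real client API.

```python
from collections import deque


class AckQueue:
    """Toy queue that tracks ready and delivered-but-unacked messages."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}        # delivery tag -> (consumer, message)
        self._next_tag = 1

    def deliver(self, consumer):
        if not self.ready:
            return None
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = (consumer, message)   # held until acked
        return tag, message

    def ack(self, tag):
        del self.unacked[tag]    # only now is the message really gone

    def consumer_disconnected(self, consumer):
        # Requeue everything the consumer took but never acknowledged,
        # putting it back at the head in its original order.
        lost = [tag for tag, (owner, _) in self.unacked.items()
                if owner == consumer]
        for tag in sorted(lost, reverse=True):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)


q = AckQueue(["m1", "m2", "m3"])
tag1, _ = q.deliver("c1")
tag2, _ = q.deliver("c1")
q.ack(tag1)                      # m1 is processed and acknowledged
q.consumer_disconnected("c1")    # m2 was never acked, so it is requeued
redelivered = q.deliver("c2")
```

Note that the model never looks at a clock: the only trigger for redelivery is the disconnect, which matches the behaviour the text describes.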

Sender confirmation mechanism

The producer puts the channel into confirm mode. Once the channel is in confirm mode, every message published on it is assigned a unique ID, starting at 1. After the message has been delivered to all matching queues, RabbitMQ sends the producer an acknowledgement (Basic.Ack) that contains the message’s unique ID, so the producer knows the message has reached its destination. If the message and the queue are durable, the acknowledgement is sent after the message has been written to disk. The deliveryTag field of the acknowledgement carries the sequence number of the confirmed message. RabbitMQ can also set the multiple flag in the acknowledgement to indicate that all messages up to and including that sequence number have been handled.
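
On the producer side, handling confirms usually means keeping a table of messages that are still waiting for an acknowledgement. The sketch below shows one way to do that bookkeeping, including the multiple flag; `ConfirmTracker` and its method names are assumptions for illustration, and the acknowledgements are fed in by hand rather than received from a broker.

```python
class ConfirmTracker:
    """Tracks published messages until the broker confirms them."""

    def __init__(self):
        self._next_seq = 1       # sequence numbers start at 1
        self.pending = {}        # sequence number -> message body

    def on_publish(self, body):
        seq = self._next_seq
        self._next_seq += 1
        self.pending[seq] = body
        return seq

    def on_ack(self, delivery_tag, multiple=False):
        if multiple:
            # Everything up to and including delivery_tag is confirmed.
            confirmed = sorted(s for s in self.pending if s <= delivery_tag)
        else:
            confirmed = [delivery_tag] if delivery_tag in self.pending else []
        for seq in confirmed:
            del self.pending[seq]
        return confirmed


tracker = ConfirmTracker()
for body in ["a", "b", "c", "d"]:
    tracker.on_publish(body)

batch = tracker.on_ack(2, multiple=True)   # confirms messages 1 and 2
single = tracker.on_ack(4)                 # confirms message 4 only
```

Whatever remains in `pending` (here message 3) is a candidate for republishing if no confirmation arrives.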

Message transmission guarantee

Reliable message transmission is usually the first concern when a business system adopts message middleware. Message middleware generally offers three levels of delivery guarantee.

  • At most once: messages may be lost, but are never delivered more than once.
  • At least once: messages are never lost, but may be delivered more than once.
  • Exactly once: every message is delivered once and only once.

RabbitMQ supports “at most once” and “at least once”. Achieving “at least once” delivery requires attention to the following points:

  1. The producer needs to enable transactions or publisher confirms so that messages are reliably delivered to RabbitMQ.
  2. The producer needs to use either the mandatory parameter or an alternate (backup) exchange so that messages are routed from the exchange to a queue and stored there instead of being discarded.
  3. Both messages and queues need to be durable so that the RabbitMQ server does not lose messages when something goes wrong.
  4. Consumers need to set autoAck to false and acknowledge each message manually once it has been processed correctly, to avoid unnecessary message loss on the consumer side.

RabbitMQ message persistence

Persistence improves RabbitMQ’s reliability by protecting against data loss under abnormal conditions such as a restart, shutdown, or crash. RabbitMQ persistence has three parts: exchange persistence, queue persistence, and message persistence.

An exchange is made persistent by setting the durable parameter to true when the exchange is declared. If the exchange is not durable, its metadata is lost when the RabbitMQ service restarts. Messages are not lost, but they can no longer be sent to that exchange. For an exchange that will be used over a long period, making it durable is recommended.

A queue is made persistent by setting the durable parameter to true when the queue is declared. If the queue is not durable, its metadata is lost when the RabbitMQ service restarts, and so are the messages in it. Queue durability guarantees that the queue’s own metadata survives an exception, but it does not guarantee that the messages stored in the queue survive.

A message is made persistent by setting its delivery mode (the deliveryMode attribute in BasicProperties) to 2. For a message to survive a restart, both the queue and the message must be persistent.
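
How the queue and message settings combine can be summarized in a small model of what is left after a restart. This is an idealized sketch: the function `survive_restart` and the dictionary layout are invented for the example, and it assumes every persistent message has already reached the disk.

```python
PERSISTENT = 2   # deliveryMode value for a persistent message
TRANSIENT = 1    # deliveryMode value for a non-persistent message


def survive_restart(queues):
    """Return the message bodies each surviving queue still holds.

    queues maps a queue name to
    {"durable": bool, "messages": [(body, delivery_mode), ...]}.
    """
    survivors = {}
    for name, queue in queues.items():
        if not queue["durable"]:
            continue     # a non-durable queue disappears with its messages
        survivors[name] = [body for body, mode in queue["messages"]
                           if mode == PERSISTENT]
    return survivors


before = {
    "durable_q": {"durable": True,
                  "messages": [("a", PERSISTENT), ("b", TRANSIENT)]},
    "transient_q": {"durable": False,
                    "messages": [("c", PERSISTENT)]},
}
after = survive_restart(before)
```

Message "c" is marked persistent but is still lost, because its queue is not durable.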

Even with durable exchanges, durable queues, and persistent messages, data can still be lost. After a persistent message has been accepted by RabbitMQ, it still takes a short but non-negligible time before it is written to disk. RabbitMQ does not synchronously flush every message to disk, so a message may sit only in the operating system cache rather than on the physical disk. If the RabbitMQ node crashes or restarts during this window, the message is lost before it reaches the disk.

One way to narrow this window is RabbitMQ’s mirrored queue mechanism, which keeps replicas of a queue on other nodes. If the master node fails, RabbitMQ automatically switches to a slave node, which preserves availability unless the whole cluster fails. Mirrored queues do not completely rule out message loss, but they are more reliable than non-mirrored queues, and in production the queues of key services are usually mirrored. On the sending side you can also use transactions or publisher confirms to ensure that messages are correctly sent and stored in RabbitMQ, provided that the exchange can route the message to a queue when channel.basicPublish is called.

Reference

RabbitMQ Field Guide