Original by Taste of Little Sister (WeChat official account: XjjDog). Sharing is welcome; please credit the source.

In distributed caching, Redis has taken the top spot. Among message queues (MQ), however, a hundred flowers are still blooming, and no single product dominates.

A cache system basically solves one problem, data access, and its calls are synchronous. Message queues are different. They are used in a wide variety of scenarios with varying reliability requirements, and the path from production to consumption is asynchronous.

There are many design points in a messaging system. Today it is difficult for any single messaging system to satisfy all the design considerations discussed below. If one claims that it does, it is bragging.

So popular systems such as Kafka, RabbitMQ, and RocketMQ are often used side by side. If you are selecting one, the technical points below are the trade-offs to weigh. As the old saying goes, "when the hen crows at dawn, the household is ruined" (牝鸡司晨): a tool pushed into a role it was not built for causes trouble.

The main points

This article abstracts the features these MQs have in common: protocol, message type, consumption mode, accumulation capacity, high availability, high reliability, high performance, scalability, and ecosystem. If you want to dive deeper into MQ, there are several articles about Kafka.

360 degree test: Does Kafka lose data? Is its high availability sufficient? Further explore the use of Kafka business scenarios through testing

High availability

High availability (HA) means that when a single node in the cluster fails, the cluster fails over and keeps serving. The general approach to high availability is a replica mechanism.

By adding replicas, you spread the risk to your data across multiple machines. This requires choosing one of the replicas as the new master shard when the master shard fails. Many coordination tools can do this, such as ZooKeeper (ZK). Some MQs implement the election process themselves.

Some approaches are more wasteful. RocketMQ, for example, keeps a slave machine on standby to ensure high availability, and the slave takes over if something goes wrong.
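As a rough illustration of the replica idea, the sketch below promotes a surviving replica when the master shard fails. The `Replica` class, the `elect_new_master` function, and the rule of preferring the most up-to-date live replica are assumptions made for illustration; real systems delegate this decision to a coordinator such as ZK or to their own election protocol.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    node: str          # hypothetical node name
    last_offset: int   # highest log offset this replica has stored
    alive: bool = True


def elect_new_master(replicas: List[Replica]) -> Optional[Replica]:
    """Pick the live replica with the most data; None if none survive."""
    candidates = [r for r in replicas if r.alive]
    if not candidates:
        return None
    # Preferring the highest offset minimizes the data lost on failover.
    return max(candidates, key=lambda r: r.last_offset)


shard = [
    Replica("node-a", last_offset=120),   # current master
    Replica("node-b", last_offset=118),
    Replica("node-c", last_offset=120),
]
shard[0].alive = False                    # the master shard fails
new_master = elect_new_master(shard)      # node-c holds the most data
```

The hard part in practice is getting every node to agree on the same answer, which is why this step is usually handed to a coordination service.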

High reliability

The reliability and performance of a messaging system are at odds. In most MQs the reliability level is adjustable, but performance moves in the opposite direction. In terms of per-message guarantees, the levels typically progress as follows:

Single-node confirmation -> multi-node confirmation -> multi-node confirmation with synchronous flush -> synchronous flush on all nodes -> transactional messages, and so on.

The ack mechanism and the multi-replica mechanism ensure high reliability at the cluster level. For a single node, a power failure or host crash is the major challenge, and handling it requires a flush mechanism or some other persistence mechanism. Data integrity must also be verified on startup, which is why messaging systems like Kafka can take a long time to start when they hold a large amount of data.
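To make the flush and integrity-check ideas concrete, here is a minimal sketch of an append-only log. The record layout (length plus CRC32 header) and the function names `append_record` and `recover` are assumptions for illustration, not the on-disk format of any particular MQ.

```python
import os
import struct
import zlib

HEADER = struct.Struct(">II")  # payload length, CRC32 of payload


def append_record(path: str, payload: bytes) -> None:
    """Append one record and force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
        f.write(payload)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to write to the device


def recover(path: str) -> list:
    """Scan the log on startup, stopping at the first damaged record."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break          # clean end of file or torn header
            length, crc = HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length or zlib.crc32(payload) != crc:
                break          # torn or corrupted write: discard the tail
            records.append(payload)
    return records
```

Calling `os.fsync` on every record is the slow but safe end of the scale. Because `recover` has to read and checksum the log, startup time grows with the amount of stored data.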

Production side: Besides the loss of buffered data, the producer must also handle send errors, including timeouts and retries when communicating with the cluster.

Consuming end: The consumer uses a message acknowledgment mechanism to confirm that a message has been consumed correctly. Because many exceptions can occur along the way, most messaging systems guarantee at-least-once semantics: every message is consumed at least once.

The implication is that messages may be delivered more than once, so consumers need to be idempotent to ensure that repeated consumption does not cause business anomalies.

Errors can also occur on the consumer side. Some MQs automatically move a message into a separate queue (a dead-letter queue) after repeated consumption failures, while others require you to design your own topic for this.
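The timeout-and-retry handling on the production side might look like the sketch below. The `send_with_retry` helper, its parameters, and the exponential backoff schedule are hypothetical; the `send` callable stands in for whatever client call actually talks to the cluster.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[bytes], None], message: bytes,
                    retries: int = 3, base_delay: float = 0.01) -> int:
    """Call send(message); retry on timeout or connection errors.

    Returns the number of attempts used. Re-raises the last error once
    the retry budget is exhausted so the caller can decide what to do.
    """
    for attempt in range(1, retries + 2):
        try:
            send(message)
            return attempt
        except (TimeoutError, ConnectionError):
            if attempt > retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

Note that a timeout does not prove the message was not stored, so retrying can produce duplicates on the broker.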
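One common way to make a consumer idempotent is to remember which message IDs have already been applied. The sketch below assumes every message carries a unique ID; the `IdempotentConsumer` class and its in-memory set are illustrative only, since a real consumer would keep this record in durable storage.

```python
class IdempotentConsumer:
    """Apply each message at most once, keyed by a unique message ID."""

    def __init__(self):
        self.processed_ids = set()   # in production this must be durable
        self.balance = 0             # example business state

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if applied, False if it was a duplicate."""
        if message_id in self.processed_ids:
            return False
        self.balance += amount
        self.processed_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
consumer.handle("order-1", 100)
consumer.handle("order-1", 100)   # redelivery of the same message
consumer.handle("order-2", 50)
```

After the redelivery, the balance reflects each order only once.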
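The idea of setting a message aside after repeated consumption failures can be sketched as follows. The function name, the `max_attempts` parameter, and the requeue-at-the-back behavior are assumptions for illustration; each MQ has its own rules for this.

```python
from collections import deque


def consume_with_dead_letter(messages, handler, max_attempts=3):
    """Process messages; park those that keep failing in a separate list.

    Returns (processed, dead_letters). A failed message is requeued at the
    back until it has been tried max_attempts times.
    """
    queue = deque((m, 0) for m in messages)   # (message, attempts so far)
    processed, dead_letters = [], []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if attempts + 1 >= max_attempts:
                dead_letters.append(message)
            else:
                queue.append((message, attempts + 1))
    return processed, dead_letters
```

Parking the failing message keeps one bad record from blocking everything behind it, at the cost of needing a separate process to inspect the parked messages.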

High performance

For a data transmission channel, performance is a major consideration. The two most important metrics are message latency and message throughput.

The time between a message being sent by the producer and being processed by the consumer should not be too long. For MQs that consume in pull mode, latency is reduced by speeding up polling, and data transfer is accelerated with techniques such as zero-copy.

Message throughput is the result of jointly optimizing the producer, the MQ nodes, and the consumer. The main techniques at present are as follows:

Asynchronization: Messages are sent asynchronously. The sender does not wait synchronously for a result, which speeds up processing.

Batch: Data is sent in batches, which reduces the number of network transfers and makes compression more effective. Typically a buffer is kept in memory, and a transfer is performed when the buffer is full or a time window elapses. This can significantly increase throughput, but data can be lost if mishandled.

Sequential IO: XjjDog has mentioned in several articles that sequential disk access is very fast and in some cases can even outperform random memory access. This is one of the reasons message queues like Kafka are fast, but keep an eye on the number of topics (think about why).

There are other techniques as well, such as tuning operating system parameters and using sharding to increase parallelism.
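As a small illustration of zero-copy, Python's standard library exposes `socket.sendfile()`, which uses `os.sendfile()` where the platform supports it. The helper name `send_file_zero_copy` and the temporary file standing in for a log segment are assumptions for this demo; it is not how any specific MQ is implemented.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected socket; returns bytes sent.

    socket.sendfile() uses os.sendfile() where available, so the kernel
    moves the bytes without copying them into user space. On platforms
    without it, Python falls back to ordinary send() calls.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


# Demo over a local socket pair; the temp file stands in for a log segment.
with tempfile.NamedTemporaryFile(delete=False) as segment:
    segment.write(b"message-" * 64)

left, right = socket.socketpair()
sent = send_file_zero_copy(left, segment.name)
left.close()

received = b""
while chunk := right.recv(4096):
    received += chunk
right.close()
os.remove(segment.name)
```

The saving comes from skipping the read-into-buffer and write-from-buffer copies that a plain `read()` plus `send()` loop would perform.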
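The batching technique described above, flushing when the buffer is full or when a time window is reached, can be sketched like this. `BatchBuffer` and its parameters are hypothetical names. For simplicity the time window is only checked when a new message arrives, whereas a real client would typically also use a background timer.

```python
import time


class BatchBuffer:
    """Collect messages; flush when full or when the time window elapses."""

    def __init__(self, send_batch, max_size=100, max_wait=0.05,
                 clock=time.monotonic):
        self.send_batch = send_batch   # callable that ships a list
        self.max_size = max_size
        self.max_wait = max_wait       # seconds
        self.clock = clock
        self.items = []
        self.first_at = None           # arrival time of oldest buffered item

    def add(self, message):
        if not self.items:
            self.first_at = self.clock()
        self.items.append(message)
        if (len(self.items) >= self.max_size
                or self.clock() - self.first_at >= self.max_wait):
            self.flush()

    def flush(self):
        """Ship whatever is buffered. Call on shutdown to avoid losing data."""
        if self.items:
            batch, self.items = self.items, []
            self.send_batch(batch)
```

Anything still sitting in `items` when the process dies is gone, which is the data-loss risk that comes with batching.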

Message type

Some messages are point-to-point: each message is consumed only once. With the publish/subscribe (Pub/Sub) model, a message can be consumed by multiple consumers. Another type is broadcast, where the producer sends a message and all consumers receive it.

In addition to normal messages, there are some special-purpose messages. Ordered messages can be globally ordered or partition-ordered and are generally used in businesses with strict ordering requirements. Global ordering is a very costly operation, and it can often be avoided through careful service design.

Some MQs also support timed messages, which suit business systems well. Transactional messages are more performance-intensive and should be used sparingly.

There are also MQs that provide tagging and message filtering. For example, order information is sent to one topic, and each consumer subscribes only to the orders for the items it cares about. In some cases this kind of request isolation is very useful.
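The difference between the point-to-point and publish/subscribe models can be shown with two toy in-memory classes. Both class names and their methods are made up for this sketch and ignore persistence, acknowledgment, and concurrency.

```python
from collections import defaultdict, deque


class PointToPointQueue:
    """Each message goes to exactly one consumer."""

    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self):
        # Whichever consumer calls first gets the message; it is then gone.
        return self.messages.popleft() if self.messages else None


class PubSubTopic:
    """Each subscriber gets its own copy of every message."""

    def __init__(self):
        self.inboxes = defaultdict(list)

    def subscribe(self, name):
        return self.inboxes[name]   # creates the inbox if needed

    def publish(self, message):
        for inbox in self.inboxes.values():
            inbox.append(message)
```

In the queue, a second `receive()` after a single `send()` returns `None`; in the topic, every inbox that existed at publish time holds the message.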
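Partition ordering is commonly achieved by routing all messages that share a key to the same partition. The sketch below assumes a CRC32-based hash and hypothetical `partition_for` and `route` helpers; actual MQ clients use their own partitioners.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """events: iterable of (key, payload). Returns a list per partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions
```

Events for one order stay in their original sequence inside one partition, while different orders can be processed in parallel. That is usually all a business needs, which is how global ordering gets avoided.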
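Tag-based filtering for the order example might look like the following. The `Message` fields and the `filter_by_tags` helper are hypothetical; in a real MQ the filtering may happen on the broker rather than in the consumer.

```python
from dataclasses import dataclass


@dataclass
class Message:
    topic: str
    tag: str       # e.g. an item category; hypothetical field
    body: str


def filter_by_tags(messages, topic, wanted_tags):
    """Yield only messages on `topic` whose tag the consumer subscribed to."""
    wanted = set(wanted_tags)
    for m in messages:
        if m.topic == topic and m.tag in wanted:
            yield m
```

A consumer interested only in book orders would call `filter_by_tags(stream, "orders", ["books"])` and never see the rest of the topic.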

Consumption patterns

There are basically two consumption modes: push and pull. Pull mode is the most practical and popular because the consumer can adjust its own processing rate.

Push mode delivers messages in closer to real time, but it is hard for the sender to gauge the consumer's capacity, so the consumer is easily overwhelmed. There are also many challenges in handling pub/sub, failure retries, and so on.
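The way pull mode lets the consumer set its own pace can be sketched with a toy broker. `Broker`, `poll`, and `pull_loop` are illustrative names, and the `capacity` parameter is the assumption that the consumer knows how many messages it can handle per round.

```python
from collections import deque


class Broker:
    def __init__(self, messages):
        self.log = deque(messages)

    def poll(self, max_messages):
        """Return up to max_messages; the consumer chooses the amount."""
        batch = []
        while self.log and len(batch) < max_messages:
            batch.append(self.log.popleft())
        return batch


def pull_loop(broker, handle, capacity):
    """Pull until the broker is empty, never taking more than `capacity`."""
    batch_sizes = []
    while True:
        batch = broker.poll(max_messages=capacity)
        if not batch:
            break
        for message in batch:
            handle(message)
        batch_sizes.append(len(batch))
    return batch_sizes
```

A slow consumer simply polls less often or with a smaller `capacity`, and the backlog stays on the broker instead of flooding the consumer.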

Protocol

Everyone knows Java has the JMS specification, but systems like Kafka do not implement it. As a result, protocols such as AMQP and OpenWire each carry their own noticeable customizations.

The transport protocol is independent of functionality. For example, there are HTTP-based protocols, the Redis protocol, and even STOMP over WebSocket.

MQTT is an application protocol for IoT, and you will find plenty of message queues built on it.

Accumulation ability

With data volumes as large as they are today, an MQ's ability to accumulate messages is very important. An in-memory queue such as Redis, for example, can blow past its memory within minutes. Besides serving as a channel for message processing, an MQ can also be used as backup storage.

Accumulation capacity comes from mass storage, such as storing messages in a database (which merely shifts the problem elsewhere) or mounting very large disks. But don't celebrate too early. Large clusters typically take a long time to start up and load data, and rebalancing after a failure is slow as well.

Another aspect of accumulation capacity is cleaning up historical messages. There are generally two policies: a disk size cap and time-based expiration. You can configure them flexibly based on requirements.
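The two cleanup policies are read here as a cap on total disk usage and a limit on message age, which is an assumption about what the text intends. Under that reading, a retention pass over log segments could be sketched as below; `apply_retention` and its parameters are hypothetical.

```python
def apply_retention(segments, now, max_bytes=None, max_age=None):
    """Return the segments to keep, oldest first.

    segments  -- list of (created_at, size_bytes), oldest first
    max_age   -- drop segments older than this many seconds
    max_bytes -- then drop the oldest segments until the total fits
    """
    kept = list(segments)
    if max_age is not None:
        kept = [s for s in kept if now - s[0] <= max_age]
    if max_bytes is not None:
        total = sum(size for _, size in kept)
        while kept and total > max_bytes:
            total -= kept.pop(0)[1]
    return kept
```

Deleting whole segments from the old end keeps cleanup cheap, since nothing in the middle of the log has to be rewritten.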

Ecosystem

The ecosystem matters a great deal for open source software, and MQ is no exception. It shows in two ways: one is the range of development languages supported (producer and consumer client packages are required), and the other is integration with surrounding software such as Spring, Spark, Hadoop, and Flink, which reduces integration costs.

Apart from relatively new MQ systems, most do well here.

The role of messaging systems

Messaging systems play an increasingly important role in distributed system design. Their usage scenarios include but are not limited to:

Peak shaving: absorbing requests that exceed the processing capacity of the service system so that services keep running smoothly. This can save significant cost, for example in flash-sale events, where the system does not have to be provisioned for peak capacity.

Buffering: acting as a buffer layer between the service layer and the slow persistence layer. It resembles peak shaving but is mainly used for data flow within a service, such as sending text messages in bulk.

Decoupling: at the start of a project, specific requirements are often not yet settled. A message queue can act as an interface layer that decouples important business processes. Just follow the agreed conventions and program against the data to gain extensibility.

Redundancy: message data can be consumed in a one-to-many fashion by multiple unrelated businesses.

Robustness: a message queue can accumulate requests, so a consumer service can go down for a short time without affecting the main business.

End

Distributed MQs can now be broadly divided into two categories, based on message volume and purpose.

One category serves business systems and must be highly reliable. Messages such as orders and payments must not be lost, and a high SLA is required. Such systems also place many functional demands on the MQ, including message traceability.

The other category serves big-data systems and is typically characterized by very high throughput. In exceptional cases, losing a few messages does no harm.

A messaging system, however, may concern itself only with the MQ itself. Ensuring availability across the production side, the consumer side, and the MQ is a business trade-off.

Some time ago, for example, XjjDog open-sourced okMQ to address high availability in one particular scenario. See the article "Open-sourcing a Kafka enhancement: okMQ-1.0.0".

XjjDog is a WeChat public account that keeps programmers from going astray, focusing on infrastructure and Linux. With ten years of architecture experience and ten billion in daily traffic, it explores the world of high concurrency with you and offers a different taste. My personal WeChat is xjjdog0; feel free to add me for further discussion.