A comprehensive comparison of Kafka, RabbitMQ, ZeroMQ, RocketMQ, and ActiveMQ across 17 aspects

This article compares Kafka, RabbitMQ, ZeroMQ, RocketMQ, and ActiveMQ across 17 aspects of their use as message queues.

1. Documentation

Kafka: There are books dedicated to Kafka, plus some online material.

RabbitMQ: There are some good books and plenty of information online.

ZeroMQ: Scarce. There are no books dedicated to ZeroMQ, and most online material consists of code samples and brief introductions.

RocketMQ: Scarce. There are no books dedicated to RocketMQ, online material is uneven in quality, and the official documentation is concise but light on technical detail.

ActiveMQ: There are no books dedicated to ActiveMQ, but there is plenty of material online.

2. Development language

Kafka: Scala (and Java)

RabbitMQ: Erlang

ZeroMQ: C++ (the core library, libzmq, exposes a C API)

RocketMQ: Java

ActiveMQ: Java

3. Supported protocols

Kafka: A self-defined set… (based on TCP)

RabbitMQ: AMQP

ZeroMQ: TCP, UDP

RocketMQ: A self-defined set of…

ActiveMQ: OpenWire, STOMP, REST, XMPP, AMQP

4. Message storage

Kafka: Memory, disk, and database. Supports a very large message backlog.

The smallest unit of storage in Kafka is the partition. A topic contains multiple partitions. When Kafka creates a topic, its partitions are distributed across multiple servers, usually with one broker per server.

Partition leaders are spread evenly across servers, and so are partition replicas, which provides load balancing and high availability. When new brokers join the cluster, some replicas are moved to them.

Within a broker, Kafka places each new partition in whichever directory from the configured directory list currently holds the fewest partitions. By default, the partitioner uses a round-robin algorithm to spread messages evenly across the partitions of a topic. If a key is specified when a message is sent, the partition is chosen by taking the key's hash code modulo the number of partitions.
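The partition choice described above can be sketched in a few lines. This is a simplified illustration, not Kafka's client code: the class name `SketchPartitioner` is hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's Java client applies to the key bytes.

```python
import itertools
import zlib


class SketchPartitioner:
    """Round-robin for keyless messages, hash-modulo for keyed messages."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()  # drives the round-robin choice

    def partition(self, key=None):
        if key is None:
            # No key: rotate through the partitions in turn.
            return next(self._counter) % self.num_partitions
        # Key present: a stable hash modulo the partition count.
        # CRC32 is an assumption made for this sketch, not Kafka's hash.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
keyless = [partitioner.partition() for _ in range(6)]
keyed = [partitioner.partition("order-42") for _ in range(3)]
print(keyless)  # [0, 1, 2, 0, 1, 2]
print(keyed)    # the same partition three times
```

Because the keyed path depends only on the key and the partition count, every message with the same key lands in the same partition, which is what makes per-key ordering possible.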

RabbitMQ: Memory and disk. Supports only a small message backlog.

RabbitMQ messages can be persistent or non-persistent, and both kinds can be written to disk.

Persistent messages are written to disk as soon as they arrive at the queue. Where possible, a copy is also kept in memory to improve performance, and that copy is evicted when memory becomes tight. Non-persistent messages generally live only in memory and are paged out to disk when memory is tight.

A mirrored queue mechanism “copies” important queues to other brokers in the cluster so that their messages are not lost. If the master fails, the slave that has been a member the longest is promoted to master. All operations except publishing messages are sent to the master, which then broadcasts the result of each command to the slaves. RabbitMQ spreads the masters of different queues evenly across servers, and the slaves of a given queue evenly across servers, to provide load balancing and high availability.

ZeroMQ: Messages are held in memory or on disk at the sending end. Persistence is not supported.

RocketMQ: Disk. Supports a very large message backlog.

CommitLog files store the actual message data. Each CommitLog file is capped at 1 GB; when a file is full, a new one is created automatically.

A ConsumeQueue entry holds only the offset, size, and tag code, so ConsumeQueues are very small and are spread across multiple brokers. The ConsumeQueue acts as an index into the CommitLog: a consumer first reads the ConsumeQueue to find a message's offset in the CommitLog, then reads the message data from the CommitLog at that offset.

This layout keeps writes to the CommitLog fully sequential: a large volume of data is appended to the same CommitLog file, and a new file is started after each 1 GB. In addition, RocketMQ forces a flush from the page cache to disk only after 4 KB has accumulated, so write performance under high concurrency is excellent.
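The relationship between the CommitLog and a ConsumeQueue can be illustrated with an in-memory sketch. The class names mirror the RocketMQ concepts, but everything else is an assumption made for illustration: a single `bytearray` replaces the 1 GB files, and CRC32 stands in for the tag code.

```python
import zlib


class CommitLog:
    """Append-only byte log shared by every queue (one in-memory buffer here)."""

    def __init__(self):
        self._data = bytearray()

    def append(self, body: bytes) -> int:
        offset = len(self._data)  # physical offset where this message starts
        self._data.extend(body)   # sequential write at the tail
        return offset

    def read(self, offset: int, size: int) -> bytes:
        return bytes(self._data[offset:offset + size])


class ConsumeQueue:
    """Small per-queue index of (commit log offset, size, tag code) entries."""

    def __init__(self, commit_log: CommitLog):
        self._log = commit_log
        self._entries = []

    def put(self, body: bytes, tag: str) -> None:
        offset = self._log.append(body)
        self._entries.append((offset, len(body), zlib.crc32(tag.encode())))

    def get(self, index: int) -> bytes:
        # Step 1: look up the index entry. Step 2: read the body from the log.
        offset, size, _tag_code = self._entries[index]
        return self._log.read(offset, size)


log = CommitLog()
orders = ConsumeQueue(log)
payments = ConsumeQueue(log)
orders.put(b"order-1", tag="created")
payments.put(b"payment-1", tag="paid")
orders.put(b"order-2", tag="created")
print(orders.get(1))  # b'order-2'
```

Both queues append to the same log, so the disk sees one sequential write stream no matter how many queues exist, while each consumer only needs the small index for its own queue.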

ActiveMQ: Memory, disk, and database. Supports only a small message backlog.

5. Message transactions

Kafka: Supported

RabbitMQ: Supported. The client puts the channel into transaction mode; the transaction commits only once RabbitMQ has received the message, and it is rolled back if an exception is caught. Using transactions degrades performance.

ZeroMQ: Not supported

RocketMQ: Supported

ActiveMQ: Supported

6. Load balancing

Kafka: Supports load balancing.

1) A broker is usually a single server node. Kafka tries to place the partitions of a given topic on different brokers, and ZooKeeper stores metadata about brokers, topics, and partitions. Partition leaders handle produce requests from clients, and Kafka spreads the leaders across brokers to share the work.

Each broker caches this metadata, so a client can fetch it from any broker and cache it to learn where to send requests.

2) The consumers in a Kafka consumer group subscribe to the same topic, and each consumer is assigned as close to the same number of partitions as possible, splitting the load.

3) When a consumer joins or leaves the group, a rebalance is triggered and the partitions are reassigned so that the load is shared again.

Most of Kafka's load balancing is automatic. Partition placement is also handled by Kafka, which hides many details and avoids both tedious configuration and load problems caused by human oversight.

4) The sender chooses a partition from the topic and key. If the key is null, a round-robin algorithm spreads messages evenly across the partitions of the topic. If the key is not null, the partition is computed as the key's hash code modulo the number of partitions.

RabbitMQ: Poor load balancing support.

1) The exchange and the routing key determine which queue a message goes to. The exchange, routing key, and queue must be created manually.

When connecting to a broker, the RabbitMQ client needs to know which exchanges and queues exist on it.

Typically the client declares the target queue before sending. If the queue does not exist, it is created on that broker; if it does, nothing happens and the message is sent to it. If most of the busy queues happen to be created on the same broker, that broker becomes overloaded.

(Queues can be created in advance so that the sender need not declare them, but a message sent to a queue that does not exist may then fail to route. RabbitMQ's alternate exchange can store such unroutable messages in a dedicated queue for later handling.)

RabbitMQ clustering addresses this with mirrored queues in a master-slave arrangement: the masters are spread evenly across servers so that each server takes a share of the load. Slave nodes are only responsible for forwarding. If the master fails, the slave that has been a member the longest is selected as the new master.

When a new node joins a mirrored queue, existing messages are not synchronized to the new slave unless the synchronization command is invoked. The queue blocks while that command runs, so the command is not suitable for use in production.

2) When a RabbitMQ queue has more than one consumer, messages are dispatched to the consumers round-robin. Each message goes to exactly one consumer in the subscription list and is not duplicated.

This approach scales well and is designed for concurrent programs.

If a consumer's tasks are heavy, basicQos can cap the number of unacknowledged messages it may hold on a channel; once the cap is reached, RabbitMQ stops sending messages to that consumer until some are acknowledged.

3) A RabbitMQ client does not open TCP connections to every node in the cluster; it connects to just one of them.

A RabbitMQ cluster can, however, sit behind HAProxy or LVS, or the client can apply its own load-balancing algorithm, to spread connections across the nodes.

Client-side balancing algorithms:

1) Round-robin. Return the connection address of the next server in sequence.

2) Weighted round-robin. Give well-provisioned, lightly loaded machines a higher weight so that they handle more requests, and give poorly provisioned, heavily loaded machines a lower weight to reduce their load.

3) Random. Pick a server connection address at random.

4) Weighted random. Pick a connection address at random with probability proportional to its weight.

5) Source address hashing. Hash the client's address and take the result modulo the size of the server list.

6) Least connections. Dynamically pick the server that currently has the fewest connections.
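The six strategies above are small enough to sketch with the standard library. The server addresses, function names, and the use of CRC32 as the hash are all assumptions for illustration; a real client would plug the chosen address into its connection factory.

```python
import itertools
import random
import zlib

SERVERS = ["mq-node-1:5672", "mq-node-2:5672", "mq-node-3:5672"]  # hypothetical addresses


def round_robin(servers):
    """1) Round-robin: yield the servers in order, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """2) Weighted round-robin: repeat each server according to its weight."""
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def pick_random(servers):
    """3) Random: any server with equal probability."""
    return random.choice(servers)


def pick_weighted_random(weights):
    """4) Weighted random: probability proportional to weight."""
    servers = list(weights)
    return random.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


def pick_by_source_hash(servers, client_address):
    """5) Source address hashing: hash modulo the size of the server list."""
    return servers[zlib.crc32(client_address.encode()) % len(servers)]


def pick_least_connections(active_connections):
    """6) Least connections: the server with the fewest open connections."""
    return min(active_connections, key=active_connections.get)


rr = round_robin(SERVERS)
print([next(rr) for _ in range(4)])
wrr = weighted_round_robin({"mq-node-1:5672": 2, "mq-node-2:5672": 1})
print([next(wrr) for _ in range(3)])
print(pick_least_connections({"mq-node-1:5672": 8, "mq-node-2:5672": 3, "mq-node-3:5672": 5}))
```

Source address hashing is the only strategy here that sends the same client to the same node every time, which matters when a client should keep reusing one node; it also means the mapping changes whenever the server list changes size.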

ZeroMQ: Decentralized; load balancing is not supported. It is essentially just a multithreaded networking library.

RocketMQ: Supports load balancing.

A broker is usually a single server node. Brokers are divided into masters and slaves. A slave stores the same data as its master and synchronizes that data from it.

1) The NameServer keeps a heartbeat with every cluster member and stores the topic-to-broker routing information. The queues of a topic are distributed across different servers.

2) Messages are sent to the queues round-robin, so each queue receives roughly the same number of messages. The sender specifies a topic, tags, or keys, but not a particular queue (doing so would be pointless, because neither cluster consumption nor broadcast consumption depends on which queue a message lands in).

Tags are optional and resemble the labels Gmail attaches to each email; they are used for server-side filtering. Currently only one tag per message is supported, so a tag is comparable to the MessageType concept in Notify.

Keys are the business keys of a message. The server builds a hash index from the keys, after which the Console can look up messages by topic and keys.

3) RocketMQ's load balancing policy requires the number of consumers to be less than or equal to the number of queues. If there are more consumers than queues, the surplus consumers receive no messages. As in Kafka, RocketMQ tries to allocate the same number of queues to each consumer, splitting the load.

ActiveMQ: Supports load balancing, which can be implemented on top of ZooKeeper.

7. Cluster mode

Kafka: A natural ‘leader-slave’ stateless cluster in which every server is both a master and a slave.

Partition leaders are spread evenly across the Kafka servers, and so are partition replicas. Every Kafka server therefore hosts both partition leaders and partition replicas: it is a slave of some Kafka servers and the leader for others.

Kafka's cluster relies on ZooKeeper, which enables hot scaling: brokers, consumers, and partitions can all be added and removed dynamically without stopping the service, a major advantage over message queues that do not rely on a ZooKeeper cluster.

RabbitMQ: Supports simple clustering and a ‘copy’ mode; support for advanced cluster modes is poor.

Every RabbitMQ node, whether it is a single-node system or part of a cluster, is either a RAM node or a disk node, and a cluster must contain at least one disk node.

When you create a queue in a RabbitMQ cluster, the queue process and the complete queue information (metadata, state, contents) are created on a single node only, not on all nodes.

Mirrored queues avoid single points of failure and keep the service available, but mirroring must be configured manually for the important queues that need it.

ZeroMQ: Decentralized; no clustering.

RocketMQ: The ‘master-slave’ mode is commonly used. In the open-source version, a slave must be switched to master manually.

The Name Server is a nearly stateless node that can be clustered without any synchronization of information between nodes.

Broker deployment is relatively complex. Brokers are divided into masters and slaves: a master can have multiple slaves, but a slave has exactly one master. A master and its slaves are paired by sharing the same BrokerName while using different BrokerIds, where BrokerId 0 denotes the master and any non-zero value denotes a slave. Multiple masters can also be deployed. Each broker keeps long-lived connections to every node in the Name Server cluster and periodically registers its topic information with all of them.

The producer keeps a long-lived connection to one randomly selected node in the Name Server cluster and periodically fetches topic routing information from it. It also keeps a long-lived connection to each master that serves the topic and periodically sends heartbeats to that master. Producers are stateless and can be deployed as a cluster.

The consumer keeps a long-lived connection to one randomly selected node in the Name Server cluster and periodically fetches topic routing information from it. It also keeps long-lived connections to the masters and slaves that serve the topic and periodically sends heartbeats to them. A consumer can subscribe to messages from either a master or a slave; the subscription rules are determined by the broker configuration.

The client first finds the NameServer and then finds the broker through the NameServer.

A topic has multiple queues that are spread evenly across different broker servers. The RocketMQ queue is similar to the Kafka partition: the partitions of a Kafka topic are placed on different brokers as far as possible, and so are the partition replicas.

Slaves in a RocketMQ cluster pull backup data from their master, and the masters are distributed across different brokers.

ActiveMQ: Supports simple cluster modes such as ‘active-standby’; support for advanced cluster modes is poor.

8. Management interface

Kafka:

RabbitMQ:

ZeroMQ: None

RocketMQ: None

ActiveMQ:

9. Availability

Kafka: Very high (distributed)

RabbitMQ: High (master-slave)

ZeroMQ: High

RocketMQ: Very high (distributed)

ActiveMQ: High (master-slave)

10. Repeated messages

Kafka: Supports at least once, at most once, and exactly once

RabbitMQ: Supports at least once and at most once

ZeroMQ: Has only a retransmission mechanism. Because there is no persistence, retransmission cannot recover a lost message, so none of at least once, at most once, or exactly once is guaranteed.

RocketMQ: Supports at least once

ActiveMQ: Supports at least once
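The difference between at most once and at least once usually comes down to the order of two steps on the consumer: recording progress (committing the offset or sending the acknowledgment) and processing the message. The sketch below is a toy simulation, not the API of any of these brokers; `run_consumer` and its parameters are hypothetical. It injects one crash between the two steps and then restarts from the last committed position.

```python
def run_consumer(log, commit_first, crash_at):
    """Consume `log`, crashing once between the two steps at index `crash_at`.

    Returns the messages whose processing completed, in order.
    """
    committed = 0      # position the consumer restarts from after a crash
    processed = []
    crashed = False
    while committed < len(log):
        position = committed
        steps = ["commit", "process"] if commit_first else ["process", "commit"]
        for n, step in enumerate(steps):
            if step == "commit":
                committed = position + 1
            else:
                processed.append(log[position])
            if n == 0 and position == crash_at and not crashed:
                crashed = True
                break  # crash between the steps, then restart the loop
    return processed


log = ["a", "b", "c"]
print(run_consumer(log, commit_first=True, crash_at=1))   # ['a', 'c']
print(run_consumer(log, commit_first=False, crash_at=1))  # ['a', 'b', 'b', 'c']
```

Committing first loses `b` (at most once), while processing first handles `b` twice (at least once). Exactly once needs extra machinery such as idempotent processing or transactions, which this sketch does not attempt.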

11. Throughput (TPS)

Kafka: Highest. Kafka sends and consumes messages in batches: the sender merges small messages and sends them to the broker as a batch, and the consumer fetches a batch of messages at a time.

RabbitMQ: High

ZeroMQ: Very high

RocketMQ: High. The RocketMQ consumer can consume messages in batches, and the number of messages consumed at a time is configurable, but the sender does not send messages in batches.

ActiveMQ: Fairly high

12. Subscription model and message distribution

Kafka: A publish-subscribe model based on topics, with matching by topic.

“Send”

The topic and key determine which partition a message is sent to. If the key is null, a round-robin algorithm spreads messages evenly across the partitions of the topic. If the key is not null, the partition is computed as the key's hash code modulo the number of partitions.

“Receive”

1) Consumers send heartbeats to the group coordinator broker to maintain their membership in the group and their ownership of partitions. Once assigned, ownership does not change unless a rebalance occurs (for example, when a consumer joins or leaves the group). A consumer reads messages only from the partitions it owns.

2) Kafka limits the number of consumers in a group to at most the number of partitions. Each message is consumed by only one consumer in a given consumer group (it is not broadcast within the group).

The consumers in a Kafka consumer group subscribe to the same topic, and each consumer is assigned as close to the same number of partitions as possible. Different consumer groups subscribe to the same topic independently of each other, so the same message is processed once by each group.
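The even split of partitions within a group, and the effect of a rebalance, can be sketched as follows. This is a simplified round-robin style assignment written for illustration; `assign_partitions` is a hypothetical helper, and Kafka's own assignors differ in detail.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers so that the counts differ by at most one."""
    ordered = sorted(consumers)
    if not ordered:
        return {}
    assignment = {consumer: [] for consumer in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


before = assign_partitions(range(6), ["c1", "c2"])
after = assign_partitions(range(6), ["c1", "c2", "c3"])  # c3 joins: rebalance
idle = assign_partitions(range(2), ["c1", "c2", "c3"])   # more consumers than partitions
print(before)  # {'c1': [0, 2, 4], 'c2': [1, 3, 5]}
print(after)   # {'c1': [0, 3], 'c2': [1, 4], 'c3': [2, 5]}
print(idle)    # {'c1': [0], 'c2': [1], 'c3': []}
```

The last case shows why adding consumers beyond the partition count brings no benefit: `c3` owns no partition and therefore receives nothing.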

RabbitMQ: Provides four exchange types: direct, topic, headers, and fanout.

“Send”

The sender declares a queue, which is created if it does not already exist; the queue is the basic storage unit.

The exchange and the routing key decide which queue a message is stored in.

Direct > The message is sent to the queue whose binding key exactly matches the routing key.

Topic > The routing key is a string of words separated by “.”. The message is sent to the queues whose binding keys, which may contain the wildcards “*” and “#”, match the routing key.

Fanout > The message is sent to all queues bound to the exchange, regardless of key.

Headers > Routing is independent of the routing key. A message is sent to a queue if its headers attribute (a set of key-value pairs) exactly matches the key-value pairs given at binding time. This type performs poorly and is rarely used.
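The direct, topic, and fanout rules above can be expressed as a small routing function. This sketch is independent of any RabbitMQ client library; `topic_matches` and `route` are hypothetical names, the headers type is left out, and the wildcard semantics follow AMQP, where “*” stands for exactly one word and “#” for zero or more words.

```python
def topic_matches(binding_key, routing_key):
    """Topic match: '*' is exactly one word, '#' is zero or more words."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' absorbs nothing, or absorbs one word and stays active.
            return match(rest, words) or (bool(words) and match(pattern, words[1:]))
        if words and head in ("*", words[0]):
            return match(rest, words[1:])
        return False

    return match(binding_key.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the queues a message reaches; bindings are (queue, binding_key) pairs."""
    if exchange_type == "fanout":
        return [queue for queue, _ in bindings]
    if exchange_type == "direct":
        return [queue for queue, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [queue for queue, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")


bindings = [("audit", "#"), ("orders", "order.*"), ("eu", "*.*.eu")]
print(route("topic", bindings, "order.created"))   # ['audit', 'orders']
print(route("direct", bindings, "order.*"))        # ['orders']
print(route("fanout", bindings, "anything"))       # ['audit', 'orders', 'eu']
```

Note that a direct exchange treats “order.*” as a literal string, while a topic exchange interprets the same binding key as a pattern.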

“Receive”

RabbitMQ queues are the basic storage unit and are not further partitioned or sharded. The consumer specifies which already-created queue it wants to receive messages from.

When a RabbitMQ queue has more than one consumer, messages are dispatched to the consumers round-robin. Each message goes to exactly one consumer in the subscription list and is not duplicated.

This approach scales well and is designed for concurrent programs.

If a consumer's tasks are heavy, basicQos can cap the number of unacknowledged messages it may hold on a channel; once the cap is reached, RabbitMQ stops sending messages to that consumer until some are acknowledged.
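Round-robin dispatch combined with a basicQos-style cap can be modeled with a short simulation. `SketchQueue` is a hypothetical class written for this illustration and does not use or imitate the RabbitMQ client API; it only shows how a full consumer is skipped and how an acknowledgment frees a slot.

```python
from collections import deque


class SketchQueue:
    """Round-robin dispatch that skips consumers at their prefetch limit."""

    def __init__(self, prefetch):
        self.prefetch = prefetch
        self.ready = deque()   # messages waiting for a free consumer
        self.unacked = {}      # consumer name -> messages delivered but not acked
        self._order = deque()  # rotation order of the consumers

    def subscribe(self, consumer):
        self.unacked[consumer] = []
        self._order.append(consumer)

    def publish(self, message):
        self.ready.append(message)
        self._dispatch()

    def ack(self, consumer, message):
        self.unacked[consumer].remove(message)
        self._dispatch()  # the freed slot may let a waiting message through

    def _dispatch(self):
        while self.ready:
            for _ in range(len(self._order)):
                consumer = self._order[0]
                self._order.rotate(-1)  # next turn goes to the next consumer
                if len(self.unacked[consumer]) < self.prefetch:
                    self.unacked[consumer].append(self.ready.popleft())
                    break
            else:
                return  # every consumer is at its limit; messages wait


queue = SketchQueue(prefetch=1)
queue.subscribe("a")
queue.subscribe("b")
for message in ["m1", "m2", "m3"]:
    queue.publish(message)
print(queue.unacked, list(queue.ready))  # {'a': ['m1'], 'b': ['m2']} ['m3']
queue.ack("b", "m2")
print(queue.unacked, list(queue.ready))  # {'a': ['m1'], 'b': ['m3']} []
```

With a cap of one, the third message waits in the queue until a consumer acknowledges, and it then goes to whichever consumer has capacity rather than strictly to the next one in rotation.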

ZeroMQ: Point-to-point (P2P)

RocketMQ: A publish-subscribe model based on topic and message tag, with regular-expression matching on message type and attributes.

“Send”

Messages are sent to the queues round-robin, so each queue receives roughly the same number of messages. The sender specifies a topic, tags, or keys, but not a particular queue (doing so would be pointless, because neither cluster consumption nor broadcast consumption depends on which queue a message lands in).

Tags are optional and resemble the labels Gmail attaches to each email; they are used for server-side filtering. Currently only one tag per message is supported, so a tag is comparable to the MessageType concept in Notify.

Keys are the business keys of a message. The server builds a hash index from the keys, after which the Console can look up messages by topic and keys.

“Receive”

1) Broadcast consumption. Each message is consumed by every consumer: even when the consumers belong to the same ConsumerGroup, each of them consumes the message once.

2) Cluster consumption. The consumer instances in a ConsumerGroup share the messages between them. For example, if a topic has nine messages and the ConsumerGroup has three instances, each instance consumes only three of the messages. That is, each queue distributes messages to each consumer in turn.
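The nine-message example above can be reproduced with a small simulation of the two modes. The function `consume` and its parameters are hypothetical; the sketch assumes that the producer rotates messages across the topic's queues and that, in cluster mode, each queue is read by a single instance of the group.

```python
def consume(messages, num_queues, consumers, mode):
    """Return {consumer: [messages]} for 'broadcast' or 'cluster' consumption."""
    # Producer side: messages rotate across the topic's queues.
    queues = [messages[q::num_queues] for q in range(num_queues)]
    received = {consumer: [] for consumer in consumers}
    for q, queue in enumerate(queues):
        if mode == "broadcast":
            targets = consumers                        # every instance reads every queue
        elif mode == "cluster":
            targets = [consumers[q % len(consumers)]]  # one instance reads each queue
        else:
            raise ValueError(f"unknown mode: {mode}")
        for consumer in targets:
            received[consumer].extend(queue)
    return received


messages = [f"m{i}" for i in range(1, 10)]  # nine messages
group = ["c1", "c2", "c3"]
cluster = consume(messages, 3, group, "cluster")
broadcast = consume(messages, 3, group, "broadcast")
print({c: len(m) for c, m in cluster.items()})    # {'c1': 3, 'c2': 3, 'c3': 3}
print({c: len(m) for c, m in broadcast.items()})  # {'c1': 9, 'c2': 9, 'c3': 9}
print(cluster["c1"])                              # ['m1', 'm4', 'm7']
```

Running the cluster case with four consumers and three queues leaves the fourth consumer empty, which matches the rule that consumers beyond the queue count receive nothing.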

ActiveMQ: Point-to-point (P2P) and broadcast (publish-subscribe)

Point-to-point, with only one consumer per message;

Publish/subscribe, where there can be multiple consumers per message.

“Send”

Point-to-point mode: specify a queue that will be created or has already been created.

Publish/subscribe: specify a topic that will be created or has already been created.

“Receive”

Point-to-point mode: For queues that have been created, the consumer specifies which queue to receive messages from.

Publish/subscribe: For topics that have been created, the consumer specifies which topic to subscribe to.

13. Ordered messages

Kafka: Supported.

Setting the producer's max.in.flight.requests.per.connection to 1 ensures that messages are written to the server in the order in which they were sent, even when retries occur.

Kafka guarantees that messages within a partition are ordered, with two cases to consider:

1) If the key is null, messages are written round-robin to partitions on different hosts. Each partition is still ordered internally, but there is no ordering across partitions.

2) If the key is not null, messages with the same key are written to the same partition, and the messages in that partition are all in order.

RabbitMQ: Not supported

ZeroMQ: Not supported

RocketMQ: Supported

ActiveMQ: Not supported

14. Message acknowledgment

Kafka: Supported.

1) Sender acknowledgment mechanism

acks=0: the producer does not wait to learn whether the message was written to the partition.

acks=1: success is returned once the message has been written to the partition leader.

acks=all: success is returned once the message has been written to all in-sync replicas of the partition.

2) Receiver acknowledgment mechanism

Offsets are committed automatically or manually. Earlier versions of Kafka committed offsets to ZooKeeper, which put a lot of pressure on it. Newer versions commit offsets to the Kafka brokers, so offset handling no longer depends on the ZooKeeper cluster and cluster performance is more stable.

RabbitMQ: Supported.

1) Sender acknowledgment mechanism: success is returned after the message has been delivered to all matching queues. If the message and the queue are both persistent, success is returned after the message has been written to disk. Batch acknowledgment and asynchronous acknowledgment are supported.

2) Receiver acknowledgment mechanism: with autoAck set to false, the consumer must acknowledge explicitly; with autoAck set to true, messages are acknowledged automatically.

When autoAck is false, a RabbitMQ queue is divided into two parts: messages waiting to be delivered to consumers, and messages that have been delivered but not yet acknowledged.

If the consumer disconnects before acknowledging, RabbitMQ requeues the message and delivers it either to the original consumer or to the next one.

Unacknowledged messages have no timeout. If a message is not acknowledged and the connection stays open, RabbitMQ keeps waiting, which allows a message to be processed for a very long time.
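The two-part queue and the requeue on disconnect can be modeled in a few lines. `AckQueue` is a hypothetical class written for this illustration, not the RabbitMQ client API; the integer delivery tag plays the role of the tag a consumer passes back when it acknowledges.

```python
from collections import deque


class AckQueue:
    """Queue split into 'ready' messages and 'delivered but unacknowledged' ones."""

    def __init__(self):
        self.ready = deque()  # waiting to be delivered
        self.unacked = {}     # delivery tag -> (consumer, message)
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self, consumer):
        """Hand the next ready message to `consumer`; returns (tag, message)."""
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        del self.unacked[tag]  # only now is the message gone for good

    def disconnect(self, consumer):
        """Requeue everything the consumer held but never acknowledged."""
        held = [tag for tag, (owner, _) in self.unacked.items() if owner == consumer]
        for tag in sorted(held, reverse=True):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)  # back at the front, original order kept


queue = AckQueue()
queue.publish("m1")
queue.publish("m2")
tag1, _ = queue.deliver("a")
tag2, _ = queue.deliver("a")
queue.ack(tag1)        # m1 is done
queue.disconnect("a")  # m2 was never acknowledged
print(list(queue.ready))   # ['m2']
print(queue.deliver("b"))  # (3, 'm2')
```

Nothing in the model ever expires an unacknowledged entry, which mirrors the point above: as long as the consumer stays connected, the message simply waits.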

ZeroMQ: Supported.

RocketMQ: Supported.

ActiveMQ: Supported.

15. Message backtracking

Kafka: Supports backtracking to a specified offset within a partition.

RabbitMQ: Not supported

ZeroMQ: Not supported

RocketMQ: Supports backtracking to a point in time.

ActiveMQ: Not supported

16. Message retry

Kafka: Not supported natively, but it can be implemented.

Because Kafka supports backtracking to a specified partition offset, a message can be retried by reading again from that offset.
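Retry by backtracking works because reading a message does not remove it from the partition. The sketch below models that with a list and a movable read position; `PartitionReader` is a hypothetical class for illustration, although its `seek` method is named after the seek operation that Kafka's consumer API provides.

```python
class PartitionReader:
    """A consumer's view of one partition: an immutable log plus a read position."""

    def __init__(self, log):
        self.log = log   # messages stay in the log after they are read
        self.offset = 0  # next position to read

    def poll(self, max_records=1):
        batch = self.log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        if not 0 <= offset <= len(self.log):
            raise ValueError("offset out of range")
        self.offset = offset


reader = PartitionReader(["a", "b", "c", "d"])
print(reader.poll(3))  # ['a', 'b', 'c']
reader.seek(1)         # processing of 'b' failed: go back and retry from there
print(reader.poll(2))  # ['b', 'c']
```

Everything after the seek position is read again, so a retry built this way redelivers later messages too; the processing logic has to tolerate duplicates.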

RabbitMQ: Not supported natively, but it can be implemented with message acknowledgment.

Use the RabbitMQ receiver acknowledgment mechanism with autoAck set to false.

When autoAck is false, a RabbitMQ queue is divided into two parts: messages waiting to be delivered to consumers, and messages that have been delivered but not yet acknowledged. If the consumer disconnects before acknowledging, RabbitMQ requeues the message and delivers it either to the original consumer or to the next one.

ZeroMQ: Not supported

Rocketmq: Supported.

When message consumption fails, an immediate retry fails again in the vast majority of cases (about 99%), so RocketMQ's policy is to retry on a timer, with the same interval between attempts.

1) The send method on the sending end retries internally. The retry logic is as follows:

A) Retry at most 3 times;

B) If sending fails, switch to the next broker;

C) The total time spent in the method must not exceed the value of sendMsgTimeout. The default value is 10 seconds.

2) Receiving end.

When a consumer fails to consume a message, a retry mechanism makes the message available for consumption again. Consumption failures generally fall into two categories:

The message itself cannot be processed, for example because deserialization fails or the message data is invalid (e.g. a phone top-up where the mobile number in the message has been cancelled and cannot be recharged). This case is handled by a timed retry, such as retrying after 10 s.

A downstream service that the consumer depends on is unavailable, for example the database connection is down or an external system is unreachable over the network.

In that case, skipping the failed message does not help, because consuming other messages fails as well. The consumer can sleep for 30 s before consuming the next message, which reduces the retry pressure on the broker.
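The sender-side rules in point 1) above (a bounded number of attempts, a different broker after each failure, and an overall time budget) can be sketched as a plain function. This is not RocketMQ's client code: `send_with_retry`, the `send` callback, and the broker names are hypothetical, and reading “at most 3 times” as three attempts in total is an assumption.

```python
import time


def send_with_retry(message, brokers, send, max_attempts=3, timeout_s=10.0):
    """Try the brokers in turn until one accepts the message or the budget runs out."""
    deadline = time.monotonic() + timeout_s
    last_error = None
    for attempt in range(max_attempts):
        if time.monotonic() >= deadline:
            break  # total time budget (the sendMsgTimeout analogue) is exhausted
        broker = brokers[attempt % len(brokers)]  # each failure moves to the next broker
        try:
            return send(broker, message)
        except ConnectionError as error:
            last_error = error
    raise TimeoutError(f"send failed: {last_error}")


def flaky_send(broker, message):
    """Stand-in transport for the example: the first broker is down."""
    if broker == "broker-a":
        raise ConnectionError("broker-a is down")
    return f"{message} stored on {broker}"


print(send_with_retry("hello", ["broker-a", "broker-b"], flaky_send))
# hello stored on broker-b
```

The deadline is only checked between attempts here, so a single slow call could still overrun the budget; a real client would also pass the remaining time to each individual send.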

ActiveMQ: Not supported

17. Concurrency

Kafka: High

Each thread runs one consumer. Kafka limits the number of consumers to less than or equal to the number of partitions. To increase parallelism, either run multiple threads within a consumer or add consumer instances.

RabbitMQ: High

RabbitMQ is written in Erlang, which gives it strong concurrency performance.

Multithreading can be enabled in consumers. The most common approach is one consumer per channel, with each thread holding its own channel. The threads share the TCP connection of a single connection object, which reduces overhead.

This approach scales well and is designed for concurrent programs.

If a consumer's tasks are heavy, basicQos can cap the number of unacknowledged messages it may hold on a channel; once the cap is reached, RabbitMQ stops sending messages to that consumer until some are acknowledged.

ZeroMQ: High

RocketMQ: High

1) RocketMQ limits the number of consumers to less than or equal to the number of queues, but multithreading can be enabled within a consumer. This matches Kafka, and parallelism is improved in the same way.

Ways to change consumption parallelism:

A) Within the same ConsumerGroup, add consumer instances to increase parallelism. Instances beyond the number of subscribed queues have no effect.

B) Increase the number of consumption threads in a single consumer by changing the parameters consumeThreadMin and consumeThreadMax.

2) Multiple threads on the client can send requests over the same network connection at the same time; the connection is reused to reduce overhead.

ActiveMQ: High

A single ActiveMQ broker receives and consumes on the order of 10,000 messages per second (roughly 10,000 per second with persistence and more than 20,000 per second without). Deploying 10 ActiveMQ brokers in production can deliver more than 100,000 messages per second. The more ActiveMQ brokers are deployed, the lower the latency and the higher the system throughput.

Anyone who has read this far is a hero!

From: urlify.cn/vqqMRr
