Why message queues

Before explaining what a message queue is, let’s look at why one is needed and what problems it solves, so that we are not using it just for the sake of using it.

Message queues (MQ) are used in many scenarios. The most familiar are system decoupling, asynchronous processing, and traffic peak clipping. Others include delayed notification, distributed transactions, ordered messaging, and stream processing.

Decoupling

Message queues can decouple system applications. What is decoupling? Let’s look at a scenario.

Suppose system A needs to send data to three systems, B, C, and D, by calling their interfaces, as shown in the figure below.

In normal development, we call the interfaces provided by systems B, C, and D directly in system A’s code. Whenever an interface needs to be added or removed, the corresponding code in system A has to be added or removed too, and if such changes are frequent, the thought alone is enough to make your head explode. The services are also coupled together, so if one system fails, the whole business process is affected. Wrapping the call in a try/catch there is like planting a bomb that will go off at some point.

What happens when you use message queues?

When systems B, C, and D need the data provided by system A, they consume it directly from MQ. When a system is added or removed, system A’s code does not need to change: the system simply starts or stops consuming from MQ. Isn’t that much nicer?
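The decoupling idea can be sketched with a toy in-memory broker. This is only an illustration: `InMemoryBroker`, the `orders` topic, and the system names are hypothetical, and unlike a real MQ the handlers here run synchronously inside `publish`.

```python
from collections import defaultdict
from typing import Callable


class InMemoryBroker:
    """Toy stand-in for an MQ: each topic maps subscriber names to callbacks."""

    def __init__(self) -> None:
        self._subscribers: dict[str, dict[str, Callable[[dict], None]]] = defaultdict(dict)

    def subscribe(self, topic: str, name: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic][name] = handler

    def unsubscribe(self, topic: str, name: str) -> None:
        self._subscribers[topic].pop(name, None)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver to whoever is subscribed right now; the publisher never
        # needs to know who that is.
        for handler in list(self._subscribers[topic].values()):
            handler(message)


received = defaultdict(list)  # records what each downstream system saw


def make_handler(system_name: str) -> Callable[[dict], None]:
    def handle(message: dict) -> None:
        received[system_name].append(message)
    return handle


broker = InMemoryBroker()
for name in ("B", "C", "D"):
    broker.subscribe("orders", name, make_handler(name))


def system_a_create_order(order_id: int) -> None:
    # System A only knows the topic, not the list of consumers.
    broker.publish("orders", {"order_id": order_id})


system_a_create_order(1)

# D stops consuming and a new system E starts; system A's code is untouched.
broker.unsubscribe("orders", "D")
broker.subscribe("orders", "E", make_handler("E"))
system_a_create_order(2)
```

After the second call, B and C have seen both orders, D has seen only the first, and E only the second, while `system_a_create_order` was never modified.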

Asynchronous processing

To avoid hurting the user experience, the system needs to respond to the user as quickly as possible. If the user has to wait 1-2 seconds on every operation, you are testing the user’s patience. See the figure below.

Without MQ, system A has to wait until all the downstream systems have finished their processing before it can return a response to the user. With MQ, if sending a message to the queue takes 3 ms, the total time is only 2 + 3 = 5 ms. From the user’s point of view, the response comes back 5 ms after clicking the button, which feels remarkably fast.

Message queues are used to enable asynchronous processing of services. The benefits of this are:

  • Results are returned to the user faster;
  • Waiting is reduced and the steps naturally run concurrently, which improves overall system performance.
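The effect of asynchronous processing can be sketched with Python’s standard `queue` and `threading` modules standing in for the MQ. The names `handle_request` and `slow_downstream_work` and the 50 ms delay are assumptions made up for this illustration.

```python
import queue
import threading
import time

task_queue: "queue.Queue[dict]" = queue.Queue()
processed = []  # user ids whose downstream work has finished


def slow_downstream_work(message: dict) -> None:
    time.sleep(0.05)  # stand-in for the slow work done by systems B, C, D
    processed.append(message["user_id"])


def worker() -> None:
    # Background consumer: pulls messages and does the slow work.
    while True:
        message = task_queue.get()
        try:
            slow_downstream_work(message)
        finally:
            task_queue.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_request(user_id: int) -> str:
    # The request handler only enqueues a message and returns immediately.
    task_queue.put({"user_id": user_id})
    return "accepted"


start = time.perf_counter()
responses = [handle_request(uid) for uid in range(3)]
response_time = time.perf_counter() - start  # what the users wait for

task_queue.join()  # wait until the background work is actually done
total_time = time.perf_counter() - start
```

The users get their responses after `response_time`, while the slow work finishes in the background after `total_time`. The trade-off is that "accepted" no longer means "fully processed".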

Traffic peak clipping

Message queues already let the system process part of its work asynchronously, but one problem remains: how do we keep a flood of requests from overwhelming the system? Normally the number of concurrent requests per second is small and the system serves them without trouble. If the number of concurrent requests suddenly surges for a period of time beyond what the system can handle, the system crashes and users cannot use it.

How can this be avoided after the introduction of MQ?

When MQ is introduced, user requests are written to MQ first. Suppose system A can process at most 2k requests per second, and 6k requests per second arrive during a peak period. The 6k requests are written to MQ, and system A then pulls requests from MQ at its own pace, never more than the maximum it can handle per second. Even at peak, the system does not crash and the losses that a crash would cause are avoided.
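The arithmetic of peak clipping can be checked with a small simulation. It uses the 2k/6k figures from the example above; the 500-request seconds that follow the peak and the function name `simulate_peak` are assumptions added for illustration.

```python
def simulate_peak(arrivals_per_second: list[int], capacity_per_second: int):
    """Return (processed, backlog) per second for a consumer with fixed capacity."""
    backlog = 0
    processed_per_second = []
    backlog_per_second = []
    second = 0
    # Keep running after arrivals stop until the backlog is drained.
    while second < len(arrivals_per_second) or backlog > 0:
        if second < len(arrivals_per_second):
            backlog += arrivals_per_second[second]  # requests written to MQ
        handled = min(capacity_per_second, backlog)  # the consumer never exceeds its limit
        backlog -= handled
        processed_per_second.append(handled)
        backlog_per_second.append(backlog)
        second += 1
    return processed_per_second, backlog_per_second


# One peak second of 6k requests, then two quieter seconds of 500 each.
processed, backlog = simulate_peak([6000, 500, 500], capacity_per_second=2000)
print(processed)  # [2000, 2000, 2000, 1000]
print(backlog)    # [4000, 2500, 1000, 0]
```

System A never handles more than 2,000 requests in any second, and the backlog in the queue drains once traffic drops back below its capacity. The price is that requests arriving during the peak wait longer.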

What’s the problem?

Every coin has two sides. Introducing MQ solves many problems, so what problems does it create?

  • Reduced system availability

The more external dependencies a system introduces, the more likely it is to fail. Originally, system A simply called the interfaces of systems B, C, and D, and that worked fine. Once MQ is added, what happens if MQ goes down? The whole system goes down with it. So the message queue itself must be highly available.

  • Increased system complexity

Once MQ is added, how do you ensure that messages are not consumed more than once? How do you handle message loss? How do you guarantee message ordering?

  • Consistency problems

System A returns success as soon as its own processing is done, and everyone assumes the request succeeded. But what if, among systems B, C, and D, the database writes in B and D succeed while the write in C fails? The data is now inconsistent.

To sum up:

  • How to make the message queue highly available
  • How to make message consumption idempotent
  • How to deal with message loss
  • How to ensure that messages are delivered in order
  • How to resolve a message backlog
  • How to maintain data consistency
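To make one of these questions concrete, here is a minimal sketch of idempotent consumption. It assumes every message carries a unique `id` field (an assumption, not something every MQ gives you for free), and `IdempotentConsumer` is a hypothetical name. In a real system the set of seen ids would live in durable storage and be updated in the same transaction as the business write.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by its unique id."""

    def __init__(self) -> None:
        self._seen_ids: set[str] = set()
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = message["id"]
        if message_id in self._seen_ids:
            return False  # already applied: skip so the effect is not repeated
        self.balance += message["amount"]
        self._seen_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "amount": 100},
    {"id": "m2", "amount": 50},
    {"id": "m1", "amount": 100},  # the MQ redelivers m1, e.g. after a lost ack
]
results = [consumer.handle(message) for message in deliveries]
print(results)           # [True, True, False]
print(consumer.balance)  # 150
```

Even though `m1` is delivered twice, the balance is only increased once.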

How do I select a message queue

Every software system is unique, and no single method solves every problem. The same holds when choosing a message queue: there is no best message queue, only the most suitable one.

Different message queues have their own advantages and disadvantages in terms of functionality and features, but there should be a minimum standard when selecting a message queue to ensure its normal use in development.

First, the product must be open source. Otherwise, if we hit a bug that affects the business, we cannot fix it by modifying the code ourselves.

Second, the product should have been popular in recent years and have an active community. With a popular product, many of the bugs you might hit have already been found and fixed by others, and solutions to the problems you run into are easy to find online.

In addition, popular products will have better integration and compatibility with the surrounding ecosystem.

Finally, a message queue product needs at least the following features to qualify:

  • Reliable delivery: ensures that no messages are lost;
  • Clustering: supports clusters, so that the failure of a single node neither makes the service unavailable nor loses messages;
  • Performance: provides enough performance to meet the requirements of most scenarios.

At present, the mainstream message queue middleware includes Kafka, RocketMQ, RabbitMQ, and ActiveMQ.

RabbitMQ

RabbitMQ is written in Erlang, a relatively niche programming language that was originally designed for building reliable telecommunications systems. RabbitMQ is one of the few message queues that supports the AMQP protocol.

RabbitMQ is lightweight and fast, a message queue that works out of the box. In other words, it is a fairly lightweight message queue that is very easy to deploy and use.

One feature of RabbitMQ is its flexible routing configuration. Unlike other message queues, it has an Exchange module between Producer and Queue, which you can think of as a switch.

The Exchange module also acts much like a switch, distributing messages from producers to different queues based on configured routing rules. Routing rules are flexible, and you can even implement them yourself.
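The routing idea can be sketched with a toy exchange. This is not RabbitMQ’s client API: `ToyExchange`, the `exact` and `prefix` rule helpers, and the queue names are all hypothetical, and real exchange types have their own matching rules.

```python
from collections import defaultdict
from typing import Callable

Rule = Callable[[str], bool]  # decides whether a routing key matches a binding


class ToyExchange:
    """Sits between producers and queues and routes by configurable rules."""

    def __init__(self) -> None:
        self._bindings: list[tuple[Rule, str]] = []
        self.queues: dict[str, list[str]] = defaultdict(list)

    def bind(self, queue_name: str, rule: Rule) -> None:
        self._bindings.append((rule, queue_name))

    def publish(self, routing_key: str, body: str) -> list[str]:
        """Copy the message into every queue whose rule matches; return their names."""
        delivered = []
        for rule, queue_name in self._bindings:
            if rule(routing_key):
                self.queues[queue_name].append(body)
                delivered.append(queue_name)
        return delivered


def exact(key: str) -> Rule:
    return lambda routing_key: routing_key == key


def prefix(start: str) -> Rule:
    # A "custom" rule: any callable taking a routing key can be plugged in.
    return lambda routing_key: routing_key.startswith(start)


exchange = ToyExchange()
exchange.bind("billing", exact("order.paid"))
exchange.bind("audit", prefix("order."))
exchange.bind("shipping", exact("order.shipped"))

paid = exchange.publish("order.paid", "order 1 paid")
shipped = exchange.publish("order.shipped", "order 1 shipped")
unrouted = exchange.publish("user.created", "user 7")
print(paid)      # ['billing', 'audit']
print(shipped)   # ['audit', 'shipping']
print(unrouted)  # []
```

The producer only names a routing key. Which queues receive the message is decided entirely by the bindings configured on the exchange.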

RabbitMQ clients probably cover more programming languages than those of any other message queue. Even if your system is developed in a less popular language, you will probably still find a RabbitMQ client for it.

There are several problems with RabbitMQ.

The first problem is that RabbitMQ does not handle message backlogs well. It treats a message queue as a conduit, and a large backlog is regarded as an abnormal condition to be avoided. When a large number of messages pile up, RabbitMQ’s performance degrades dramatically.

The second problem is performance. Based on official test data and day-to-day experience, RabbitMQ can handle tens of thousands to hundreds of thousands of messages per second, depending on the hardware. This is more than enough for most applications, but if your performance requirements for the message queue are very high, RabbitMQ is not the right choice.

The final issue is RabbitMQ’s implementation language, Erlang. It is a very niche language, so if you want to extend RabbitMQ or do secondary development on it, think very carefully about long-term maintainability.

RocketMQ

RocketMQ is a message queue that Alibaba open-sourced in 2012. It was later donated to the Apache Software Foundation and graduated as an Apache top-level project in 2017. It offers good performance, stability, and reliability, has almost all the features a modern message queue should have, and is still evolving.

RocketMQ is optimized for the response latency of online services and responds within milliseconds in most cases. If your scenario is sensitive to response latency, RocketMQ is a good choice.

RocketMQ’s performance is about an order of magnitude higher than RabbitMQ’s: it can process hundreds of thousands of messages per second.

One disadvantage of RocketMQ is that, as a message queue developed in China, it is less popular internationally than its foreign counterparts, and its integration and compatibility with the surrounding ecosystem are somewhat weaker.

Kafka

Kafka was originally developed by LinkedIn and is now a top-level project of the Apache Software Foundation. It was originally designed to handle huge volumes of logs.

Early versions made many design sacrifices for the sake of maximum performance: message reliability was poor and messages could be lost, there was no clustering support, and features were limited. These trade-offs were acceptable for the specific scenario of processing massive logs, but Kafka at that time could hardly be called a proper message queue. Over the following years, Kafka developed into a very mature message queue that meets the needs of most scenarios in terms of data reliability, stability, and functionality.

Kafka arguably has the best compatibility with its surrounding ecosystem of any message queue, especially in big data and stream processing, where almost all related open source software supports Kafka first.

Kafka is written in Scala and Java. Its design makes heavy use of batching and asynchronous processing, which lets it achieve extremely high performance. Kafka’s performance, especially for asynchronous sending and receiving, is the best of the three, but it is within the same order of magnitude as RocketMQ, at hundreds of thousands of messages per second.
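The batching idea can be illustrated with a small sender that buffers messages and hands them to the network in groups. This is a sketch of the general technique only, not Kafka’s producer API: `BatchingSender`, the `transport` callback, and the batch size of 4 are made up for the example, and real clients typically also flush after a time limit.

```python
from typing import Callable


class BatchingSender:
    """Buffers messages and passes them to `transport` one batch at a time."""

    def __init__(self, transport: Callable[[list[str]], None], batch_size: int) -> None:
        self._transport = transport
        self._batch_size = batch_size
        self._buffer: list[str] = []

    def send(self, message: str) -> None:
        # Returns immediately; the network call happens only when a batch is full.
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._transport(list(self._buffer))  # one call carries many messages
            self._buffer.clear()


network_calls: list[list[str]] = []  # each entry stands for one network round trip
sender = BatchingSender(network_calls.append, batch_size=4)

for i in range(10):
    sender.send(f"log line {i}")
sender.flush()  # push out the final partial batch

print(len(network_calls))               # 3
print([len(b) for b in network_calls])  # [4, 4, 2]
```

Ten messages cost three round trips instead of ten. The trade-off is that a message may sit in the buffer for a while before it is actually sent.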

ActiveMQ

ActiveMQ is the oldest open source message queue. Ten years ago it was the only open source message queue to choose from. It is now in its twilight years and its community is no longer active. In function and performance it lags well behind modern message queues, and its use is mostly limited to compatibility with older systems that still depend on it.

Conclusion