Why use message queues?

What are the advantages and disadvantages of message queues?

What are the differences between Kafka, ActiveMQ, RabbitMQ, and RocketMQ?

What the interviewer is thinking

The interviewer wants to see:

First, do you know why your system uses a message queue? Many candidates say they use Redis and MQ in their projects but cannot explain why. Put bluntly, they used the technology for its own sake, or someone else designed the architecture and they never thought about it. People who never ask why a system is structured the way it is tend to be people who don’t think much, and interviewers usually take a dim view of them. The interviewer’s worry is that once you join the team you will only be able to grind out routine work and won’t think for yourself.

Second, since you use a message queue, do you know the advantages and disadvantages of doing so? If you never thought about it and just dropped an MQ into the system, what happens when something goes wrong: do you slip away and leave the company with the hole? A candidate who hasn’t considered the drawbacks and risks of introducing a technology looks like someone who digs pits for others to fall into. The interviewer fears that you will spend a year digging pits and then change jobs, leaving endless trouble behind for the company.

Third, since you used an MQ, and presumably one specific MQ, did you do any research at the time? Don’t just pick an MQ such as Kafka out of personal preference without even looking at the MQs popular in the industry and the strengths and weaknesses of each. No MQ is absolutely good or bad; what matters is choosing the one whose strengths fit the scenario and whose weaknesses can be avoided. If a candidate who gives no thought to technology selection joins the team and the lead asks him to design a system, he may pick technologies without weighing the options, and an ill-fitting choice is yet another pit.

Analysis of interview questions

Why use message queues

The point of this question is to ask what message queues are typically used for, what the specific scenario in your project is, and why you use a message queue in that scenario.

When the interviewer asks this, the expected answer describes your company’s business scenario, the technical challenge in that scenario, how troublesome it would be without MQ, and what benefits you gained once you used MQ.

Let’s start with the common usage scenarios. There are many, but three are core: decoupling, asynchronous processing, and peak clipping.

Decoupling

Consider this scenario. System A sends data to three systems, B, C, and D, through interface calls. What if system E also wants this data? What if system C no longer needs it? The owner of system A is close to a breakdown.

Here system A is tightly coupled to a messy collection of other systems. System A produces a critical piece of data that many systems need it to send them. System A also has to keep asking what to do if any of B, C, D, or E fails. Should it resend? Should it store the message? Enough to turn your hair white!

With MQ, system A produces a piece of data and sends it to the MQ, and whichever system needs the data consumes it from the MQ itself. A new system that needs the data can consume it directly from the MQ; a system that no longer needs it simply stops consuming the message. System A no longer has to think about whom to send data to, maintain that calling code, or worry about whether the other side’s call succeeded, failed, or timed out.

Summary: with MQ and its publish/subscribe (Pub/Sub) model, system A is completely decoupled from the other systems.
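
As a rough illustration of the Pub/Sub idea, here is a minimal in-memory sketch in Python. It is not a real MQ client: the `Broker` class, the `orders` topic, and the message shape are all made up for this example, and delivery happens synchronously inside `publish` just to keep it short. What it shows is that system A only ever calls `publish`, while E joining and C leaving are handled entirely on the subscriber side.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker (hypothetical, stands in for a real MQ)."""

    def __init__(self):
        # topic -> {subscriber name -> handler callable}
        self._subscribers = defaultdict(dict)

    def subscribe(self, topic, name, handler):
        self._subscribers[topic][name] = handler

    def unsubscribe(self, topic, name):
        self._subscribers[topic].pop(name, None)

    def publish(self, topic, message):
        # The producer does not know or care who is listening.
        for handler in list(self._subscribers[topic].values()):
            handler(message)


received = defaultdict(list)   # what each downstream system has consumed
broker = Broker()

# Systems B, C and D subscribe to the data that system A produces.
for name in ("B", "C", "D"):
    broker.subscribe("orders", name, received[name].append)

# Later: system E wants the data, system C no longer needs it.
broker.subscribe("orders", "E", received["E"].append)
broker.unsubscribe("orders", "C")

# System A's code is unchanged: it just publishes.
broker.publish("orders", {"order_id": 1})
```

After the last line, B, D, and E have each received the message and C has received nothing, without any change to the publishing code.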

Interview tip: consider whether the system you are responsible for has a similar scenario, where one system or module calls several others and the web of calls is complex and hard to maintain. If those calls do not actually need to be synchronous interface calls, they can be decoupled asynchronously with MQ. Think about whether your project could use MQ this way, and show it on your resume: decoupled with MQ.

Asynchronous

In another scenario, when system A receives a request it must write to its own local database and also have systems B, C, and D write to theirs. The local write takes 3 ms, and the writes in B, C, and D take 300 ms, 450 ms, and 200 ms respectively. The total latency of the request is 3 + 300 + 450 + 200 = 953 ms, close to 1 s, and the user feels that the site is very slow. A user who sends a request from the browser and waits 1 s for it will find that almost unacceptable.

Internet companies generally require that each request triggered directly by a user complete within 200 ms, which the user barely notices.

With MQ, system A instead sends three messages to the MQ in a row. If that takes 5 ms, the total time from receiving the request to returning the response is 3 + 5 = 8 ms. To the user it feels like clicking a button and getting the result back after 8 ms. Nice, this site is fast!
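
The shape of that flow can be sketched with Python’s standard `queue` and `threading` modules standing in for a real MQ. Everything here is an assumption made for illustration: the names `handle_request`, `consume_forever`, and `downstream`, and the use of in-memory lists instead of real databases. The point is only that the request handler returns as soon as the messages are enqueued, and the slow writes happen later in a consumer.

```python
import queue
import threading

mq = queue.Queue()                        # stand-in for the message queue
local_db = []                             # system A's own database
downstream = {"B": [], "C": [], "D": []}  # databases of systems B, C, D


def handle_request(payload):
    """System A's request handler: local write, enqueue, return."""
    local_db.append(payload)              # the ~3 ms local write
    for system in downstream:
        mq.put((system, payload))         # the ~5 ms spent enqueueing
    return "ok"                           # the user gets a response here


def consume_forever():
    """Consumer side: the slow 200-450 ms writes would happen here."""
    while True:
        system, payload = mq.get()
        downstream[system].append(payload)
        mq.task_done()


response = handle_request({"order_id": 42})
pending_at_response = mq.qsize()          # messages not yet processed

# Consumers run independently of the request path.
threading.Thread(target=consume_forever, daemon=True).start()
mq.join()                                 # wait until the queue is drained
```

When `handle_request` returns, all three messages are still waiting in the queue; the downstream writes complete only after the consumer thread has run. That gap is exactly the trade: the user sees a fast response before B, C, and D have actually written anything.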

Peak clipping

Every day from 0:00 to 12:00 system A is quiet, with about 50 concurrent requests per second. Then, from 12:00 to 13:00 every day, the load jumps to more than 5K requests per second. The system sits directly on MySQL, so the flood of requests goes straight into MySQL, which has to execute about 5K SQL statements per second.

MySQL here can handle about 2K requests per second; 5K per second will kill it, crashing the system so that users can no longer use it.

Once the peak is over, the afternoon is an off-peak period: maybe a million users are on the site at the same time, but the request rate may be only 50 per second, which puts almost no pressure on the system.

With MQ, the 5K requests per second are written to the MQ, and system A processes at most 2K per second because that is all MySQL can handle. System A pulls requests from the MQ at 2K per second, never more than it can handle, so it does not go down even during the peak. With 5K requests per second coming into the MQ and only 2K going out, the backlog grows by 3K messages per second, so hundreds of thousands or even millions of requests pile up in the MQ during the one-hour midday peak.

This short-lived backlog is acceptable: after the peak only 50 requests per second enter the MQ, while system A keeps processing at 2K per second. So once the peak is over, system A clears the backlog at close to 2K messages per second.
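
The backlog arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch under a worst-case assumption that is not stated in the text: the 5K rate is sustained for the full hour and the rates are constant. The function names are made up for this example.

```python
def backlog_after(inflow_per_s, outflow_per_s, seconds):
    """Messages queued after `seconds` of constant inflow and outflow."""
    return max(0, (inflow_per_s - outflow_per_s) * seconds)


def drain_seconds(backlog, inflow_per_s, outflow_per_s):
    """Time to clear `backlog` once outflow exceeds inflow."""
    net = outflow_per_s - inflow_per_s
    if net <= 0:
        raise ValueError("backlog never drains: inflow >= outflow")
    return backlog / net


# Peak hour: 5K in, 2K out, for 3600 seconds.
peak_backlog = backlog_after(5000, 2000, 3600)    # 10_800_000 messages

# After the peak: 50 in, 2K out.
drain = drain_seconds(peak_backlog, 50, 2000)     # about 5538 seconds
drain_minutes = drain / 60                        # about 92 minutes
```

Under these assumptions the queue holds about 10.8 million messages at 13:00 and is empty again roughly an hour and a half later. A real peak is rarely flat at its maximum, so the actual backlog and drain time would be smaller.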

What are the advantages and disadvantages of message queues

The advantages are the ones described above for specific scenarios: decoupling, asynchronous processing, and peak clipping.

The disadvantages are as follows:

Reduced system availability. The more external dependencies a system introduces, the more likely it is to fail. Originally system A just called the interfaces of B, C, and D, and as long as the four systems A, B, C, and D were healthy there was no problem. Now you insist on adding MQ, so what happens if the MQ goes down? If the MQ goes down the whole system goes down with it, and then you’re finished.

Increased system complexity. Once MQ is added, how do you guarantee that messages are not consumed twice? How do you handle message loss? How do you guarantee that messages are delivered in order? It’s a headache: lots of problems, lots of pain.

Consistency problems. System A finishes its own processing and returns success, so the user believes the request succeeded. But what if, of the three systems B, C, and D, B and D write to their databases successfully while C fails? Then the data is inconsistent.
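
One common way to approach the duplicate-consumption and partial-failure questions above is to redeliver a message until every consumer acknowledges it, and to make each consumer idempotent so that a redelivery does no harm. The sketch below is a simplified illustration, not how any particular MQ works: the `Consumer` class, the `deliver` function, and the `fail_first` switch that simulates C’s failed write are all hypothetical, and a real broker would redeliver only to the consumer that did not acknowledge.

```python
class Consumer:
    """Toy idempotent consumer that remembers processed message IDs."""

    def __init__(self, name, fail_first=False):
        self.name = name
        self.rows = []                 # stand-in for the system's database
        self.processed_ids = set()
        self._fail_next = fail_first   # simulate one failed write

    def handle(self, msg_id, payload):
        """Return True if the message is applied (now or previously)."""
        if msg_id in self.processed_ids:
            return True                # duplicate: ack without writing again
        if self._fail_next:
            self._fail_next = False
            return False               # simulated database write failure
        self.rows.append(payload)
        self.processed_ids.add(msg_id)
        return True


def deliver(consumers, msg_id, payload, max_attempts=3):
    """Redeliver to all consumers until every one of them acknowledges."""
    for attempt in range(1, max_attempts + 1):
        acks = [c.handle(msg_id, payload) for c in consumers]
        if all(acks):
            return attempt
    raise RuntimeError("message not acknowledged by all consumers")


b, c, d = Consumer("B"), Consumer("C", fail_first=True), Consumer("D")
attempts = deliver([b, c, d], msg_id="m-1", payload="order-1")
```

Here C fails on the first delivery, so the message is delivered a second time. B and D see it twice but write it only once, and C writes it on the retry, so all three end up with the same single row. The extra bookkeeping is a small taste of the complexity the text warns about.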

So a message queue is actually a very complex piece of architecture. Introducing it brings many benefits, but you also have to add all kinds of technical solutions and architecture to work around the drawbacks it brings, and by the time you have done that the system is an order of magnitude more complex, maybe 10 times. Yet when it matters, you still have to use it.

To sum up, after various comparisons, the following suggestions are made:

For ordinary business systems introducing MQ, everyone used ActiveMQ at first, but it is not used much any more. It has not been proven in large-scale, high-throughput scenarios and its community is not very active, so I’d skip it; I personally do not recommend it.

Later people moved to RabbitMQ. Because it is written in Erlang, many Java engineers find it hard to study and master in depth, so for a company it is practically a black box. On the other hand, it is open source, fairly stable, and has a very active community.

More and more companies now use RocketMQ, which is quite good; it comes from Alibaba. There is a risk that the community could suddenly fade away (RocketMQ has been donated to Apache, but its GitHub activity is not very high). If you are fully confident in your company’s technical strength, RocketMQ is recommended; otherwise go back to RabbitMQ, whose active open source community will not let you down.

So for small and medium-sized companies with average technical strength and no especially demanding technical challenges, RabbitMQ is a good choice; for large companies with strong infrastructure development capabilities, RocketMQ is a good choice.

For real-time computing, log collection, and similar big-data scenarios, Kafka is the industry standard and is absolutely fine to use. Its community is very active.