Almost all of the content comes from the distributed system course taught by Professor Li Longhai of Xidian University.

The blogger is merely transcribing the teacher's lectures.

Disadvantages of only supporting point-to-point communication

In a complex distributed system, having every node communicate with every other node point-to-point greatly increases the complexity and coupling of the system. Moreover, when communication between nodes is carried out directly over one-to-one sockets (with the connections hardcoded), scalability becomes very poor: every time a new node is added, multiple existing nodes must be changed. Likewise, the tight coupling leads to poor fault tolerance.

This problem can be solved by introducing mediation nodes.

Introducing mediation nodes

A mediation node can be added to the system in a buffer-like role, separating the producers and consumers of data. The features and advantages of this scheme are as follows:

  • Reduced coupling: Producers send data to the mediation node, and consumers get the data they need from the mediation node in some way (for example, by "subscribing" to the content they are interested in under the topic/subscription model);
  • Improved fault tolerance: The mediation node can cache data, so data is not immediately lost when some nodes fail or when producer and consumer speeds are briefly mismatched;
  • Improved scalability: Producers do not care about consumers, and consumers do not care about producers. Therefore adding or removing consumers (new consumer types, or more or fewer nodes of an existing type) does not affect the producers (as long as no data-format changes are required);

To prevent the mediation node from becoming a bottleneck for the system, it can actually be a cluster (while remaining logically a single point). Such a mediation node is called message-queue middleware, e.g., ActiveMQ.

Message-Oriented Middleware (MOM)

The middleware provides a distributed message-queue service that allows nodes to communicate flexibly through this mediation node using asynchronous communication. Asynchronous communication: the sender puts data on the message queue and, as soon as it has finished sending, can go do something else; it does not block waiting for the receiver to receive the message. Likewise, the receiver fetches data whenever it wants, rather than blocking and waiting for incoming data.
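The idea can be sketched with Python's standard `queue` module standing in for the mediation node (this is a simulation of the concept, not a real MOM API):

```python
import queue
import threading

# The shared queue plays the role of the mediation node.
mq = queue.Queue()

def producer():
    # The sender just drops data on the queue and returns immediately;
    # it never blocks waiting for the receiver.
    for i in range(3):
        mq.put(f"msg-{i}")

def consumer(received):
    # The receiver fetches data whenever it is ready to.
    for _ in range(3):
        received.append(mq.get())

received = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(received,))
t1.start(); t2.start()
t1.join(); t2.join()
print(received)  # all three messages arrive, in FIFO order
```

Neither side ever waits on the other directly; the queue absorbs any speed mismatch between them.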

Bus architecture for distributed systems

In this architecture, nodes are connected via a (virtual) bus, similar to the system bus inside a host. A node that sends a message simply "drops" data onto the bus (here, the role of the bus is played by a message queue), and neither party is aware of the other's existence (subscriptions determine who needs which messages).

In addition to its own server processes, such a MOM should also provide an SDK for its clients. A client connects to the middleware through the interface provided by the SDK (usually a socket is established underneath).

Once the connection to the middleware is established, the application sends and receives messages through the SDK. For example, a consumer only needs to configure its subscribed topics through the SDK interface; the SDK does the rest of the message-retrieval work and hands messages to the application when needed.

Two communication modes for MOM

Message queue pattern

In this pattern, the middleware creates a FIFO queue between producers and consumers (there may be several of each), and every message fetched by one consumer is deleted from the queue, meaning each message is received by exactly one consumer. This pattern can be used to implement load balancing.

Because multiple consumers request data from the same queue, the middleware must decide which consumer to send each message to according to some policy (usually random selection or round-robin).
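The "each message goes to exactly one consumer" property can be sketched with competing worker threads draining one shared queue (an illustrative simulation; real middleware applies its dispatch policy on the broker side):

```python
import queue
import threading

# One shared FIFO queue, several competing consumers.
mq = queue.Queue()
for i in range(6):
    mq.put(i)

consumed = {"a": [], "b": []}
lock = threading.Lock()

def worker(name):
    while True:
        try:
            msg = mq.get_nowait()  # non-blocking fetch
        except queue.Empty:
            return
        with lock:
            consumed[name].append(msg)

threads = [threading.Thread(target=worker, args=(n,)) for n in consumed]
for t in threads: t.start()
for t in threads: t.join()

# Every message was consumed exactly once, split between the workers.
assert sorted(consumed["a"] + consumed["b"]) == list(range(6))
```

How the six messages split between the two workers varies from run to run, but no message is ever delivered twice or lost, which is exactly what makes this pattern usable for load balancing.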

Load balancing

Load balancing distributes a large number of tasks evenly across multiple servers running the same business logic. When performance is insufficient, overall performance can be improved simply by adding more servers.

Advanced queue variants: queues with priorities (higher-priority messages are delivered first) and queues that support persistence (the middleware keeps a backup on disk, so messages can be recovered directly after a power failure).

Topic/subscription communication mode

In this pattern, different types of producers can produce different types of data, and a producer can choose to publish messages to a specific message topic. All consumers subscribed to the same topic receive each message published to that topic, so this pattern can flexibly implement broadcast, multicast, and many-to-many communication.

This pattern applies when several data consumers each run their own business logic, that is, when the same piece of data may be used by multiple consumers for different purposes. For example, on a shopping site, a user generates a new order; the order is put into the MQ and retrieved by multiple servers, each for its own business logic: sending the order to the warehouse, billing the purchase to the user's account, adding the purchase to analytics, and so on.
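A minimal sketch of topic/subscription delivery: each subscriber gets its own queue, and publishing copies the message into every queue subscribed to the topic (the `Broker` class and names here are illustrative, not a real broker API):

```python
import queue
from collections import defaultdict

class Broker:
    def __init__(self):
        self.topics = defaultdict(list)  # topic -> list of subscriber queues

    def subscribe(self, topic):
        q = queue.Queue()
        self.topics[topic].append(q)
        return q

    def publish(self, topic, msg):
        # Every subscriber of the topic gets its own copy of the message.
        for q in self.topics[topic]:
            q.put(msg)

broker = Broker()
shipping = broker.subscribe("new-order")   # warehouse service
billing = broker.subscribe("new-order")    # billing service

broker.publish("new-order", {"order_id": 42})

msg_a = shipping.get()
msg_b = billing.get()
print(msg_a, msg_b)  # both subscribers see the same order
```

Contrast this with the queue pattern above: there a message is consumed once; here it is fanned out to every subscriber.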

MOM Example: ActiveMQ

  • ActiveMQ is a JMS (Java Message Service)-compatible middleware produced by Apache, which provides client APIs for various programming languages.

  • The middleware and its clients communicate using the Advanced Message Queuing Protocol (AMQP); therefore, software that implements this protocol itself can also talk to the ActiveMQ middleware directly, without going through the SDK.

  • ActiveMQ offers three ways of receiving messages:

    • Blocking (synchronous) receive: if there is a message in the queue, the receiving function returns it (the return value is the fetched message); if the queue is empty, it blocks waiting for a message;

    • Polling receive: to avoid blocking, first check whether the queue is empty, and fetch a message only when the queue is non-empty;

    • Callback reception: Register a callback function

      Callback function: a function defined in one module and then registered with another module. When the other module wants to use it, it invokes the callback through the registered address; the module that defined the callback then receives values through the call's arguments.

      Callback functions can be used to implement asynchronous communication

      • The module where the callback function resides defines the callback function and registers it with the middleware.

      • When there is a message in the queue of the middleware, the middleware actively calls the callback function and passes in the value through the parameters of the callback function to achieve asynchronous communication
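The callback mechanism can be sketched like this: the application registers a listener function with a client object, and a background thread playing the role of the middleware delivery loop invokes it whenever a message arrives (the `Client` class is illustrative, not the ActiveMQ API):

```python
import queue
import threading

class Client:
    def __init__(self):
        self.mq = queue.Queue()
        self.listener = None

    def set_listener(self, fn):
        self.listener = fn
        # A background thread plays the role of the middleware delivery loop.
        threading.Thread(target=self._deliver, daemon=True).start()

    def _deliver(self):
        while True:
            msg = self.mq.get()
            self.listener(msg)    # middleware actively calls the callback

received = []
done = threading.Event()

def on_message(msg):              # defined by the application, called by the client
    received.append(msg)
    if len(received) == 2:
        done.set()

client = Client()
client.set_listener(on_message)
client.mq.put("hello")
client.mq.put("world")
done.wait(timeout=5)
print(received)
```

The application never polls or blocks on a receive call; control flows the other way, from the middleware into the application, which is what makes this style asynchronous.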

Java Message Service, JMS

Beyond the language syntax and the JDK, Java also defines many API standards (Java EE, for example, largely consists of such standards). Java likewise defines unified API standards for many external tools (such as JDBC and JMS).

JMS, despite its name, is actually a set of interface standards: a unified API through which Java applications access different MOM middleware. (This means that a MOM client written entirely against the JMS APIs can support multiple middleware products without code changes, making it easy to switch between them.)
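The decoupling benefit of such an interface standard can be sketched in the spirit of JMS (the classes below are hypothetical stand-ins, not the real JMS API): the application is written against an abstract interface, so implementations can be swapped without touching it.

```python
from abc import ABC, abstractmethod

class MessageProducer(ABC):
    """Stand-in for a standardized producer interface."""
    @abstractmethod
    def send(self, msg): ...

class VendorAQueue(MessageProducer):
    def __init__(self): self.log = []
    def send(self, msg): self.log.append(("A", msg))

class VendorBQueue(MessageProducer):
    def __init__(self): self.log = []
    def send(self, msg): self.log.append(("B", msg))

def application(producer: MessageProducer):
    # Application logic only ever sees the standard interface.
    producer.send("order-created")

a, b = VendorAQueue(), VendorBQueue()
application(a)   # the same application code works against either middleware
application(b)
print(a.log, b.log)
```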

Advantages of communication based on MOM

  • Enables asynchronous communication, reducing system response time and improving throughput
  • Decouples distributed nodes from one another
  • Ensures reliable delivery of messages, achieving eventual consistency
    • Most MOMs have a persistent cache, so a receiver that fails and restarts can still retrieve the messages it needs
  • Enables broadcast, multicast, and many-to-many communication
  • Enables traffic peak shaving and flow control
  • Supports both Push and Pull models

The idea of implementing extensible services

Several servers run the same logic and connect to the same message queue

The entry server receives external requests, and the exit server returns results for those requests. When the entry server receives a request, it puts the request into the message queue, and the message is dispatched to a particular business-processing server according to the load-balancing policy (alternatively, the business servers can listen on the message queue and actively fetch new messages from it). A business server completes the computation and puts the result into another queue that carries results. The exit server listens for the new message, retrieves it, and returns it to the external requester.
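The pipeline above can be sketched with two queues and a pool of identical workers (an illustrative simulation; `business_worker` just squares its input as a stand-in for real business logic):

```python
import queue
import threading

requests, results = queue.Queue(), queue.Queue()

def business_worker():
    while True:
        req = requests.get()
        if req is None:          # sentinel: shut this worker down
            return
        results.put(req * req)   # stand-in for real business logic

# Several servers run the same logic against the same request queue.
workers = [threading.Thread(target=business_worker) for _ in range(3)]
for w in workers: w.start()

for r in range(5):               # entry server enqueues external requests
    requests.put(r)
for _ in workers:                # one sentinel per worker
    requests.put(None)
for w in workers: w.join()

# Exit server side: collect the results.
collected = sorted(results.get() for _ in range(5))
print(collected)
```

Scaling the service then means nothing more than adding threads (servers) to the `workers` pool; neither the entry side nor the exit side changes.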

Such a distributed system can improve its overall concurrent processing capability by increasing the performance or number of business-processing servers. The message queue and the entry and exit servers are relatively unlikely to become bottlenecks (the logic of the entry and exit servers is extremely simple, perhaps as simple as extracting a packet from the network card, doing a quick check, and forwarding it to the message queue). If they do become bottlenecks, some middleware also supports running all three parts as clusters to further improve performance.

Traffic peak clipping

When there is a spike in the number of requests, the excess requests pile up in the MOM's queue, so the downstream business-processing servers do not crash from spawning too many response threads. When the server running the queue runs out of memory, the data in the queue can be spilled to disk to continue queuing. As a result, when the number of requests exceeds the system's processing capability, the response time of the whole system slows down, but the system does not collapse.

Further optimization of the system

Without improving raw system performance, the user experience can still be improved by giving the requester some intermediate feedback: after the entry server receives an external client's request, it immediately returns a "request received" acknowledgment to the client, and then starts running the (possibly long) business logic. After the computation completes, the system responds to the external client asynchronously (for example, by sending the client a notification, after which the client sends a query request to the exit server).

Disadvantages of message queues – no feedback

It is not easy for the business-processing server to give feedback about the process. For example, if the business-processing server fails, or finds that a request message is illegal, it cannot simply report the exception back to the external requester. If feedback is required, a new queue for exceptions must be added, which increases the complexity of the system.

Here is an example of detecting and handling business-processing-server failures:

Log method: the system records the mapping between each request and the node it was assigned to in a log, and a dedicated program scans the log. If the program finds that the response to a request assigned to some node has timed out, the system redispatches the request to another node.
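A minimal sketch of the log method (the names and the 5-second timeout are illustrative assumptions): record each assignment with a timestamp, mark completions, and let a scanner return the requests that should be redispatched.

```python
TIMEOUT = 5.0       # illustrative timeout, in seconds
log = {}            # request_id -> (node, assigned_at)
completed = set()

def assign(request_id, node, now):
    log[request_id] = (node, now)

def complete(request_id):
    completed.add(request_id)

def scan(now):
    """Return the requests whose responses have timed out."""
    return [rid for rid, (node, t) in log.items()
            if rid not in completed and now - t > TIMEOUT]

assign("req-1", "node-A", now=0.0)
assign("req-2", "node-B", now=0.0)
complete("req-1")
print(scan(now=10.0))  # only the unanswered, timed-out request remains
```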

Push model and Pull model

Everything discussed so far uses the Push model: producers (message sources) keep pushing messages into the queue, regardless of whether the consumers of the data can keep up.

The Pull model

The delivery of data is passive: consumers actively pull messages from the queue, and when the queue is empty, the queue goes to the data source and asks for new messages.

This model requires the data source to cache data. It suits data-analysis scenarios with large volumes of data to process.
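The Pull model can be sketched like this (the `Source` and `PullQueue` classes and the batch size of 2 are illustrative): the consumer pulls from the queue, and only when the queue is empty does the queue turn around and fetch a batch from the source, which therefore has to cache its data.

```python
import queue

class Source:
    def __init__(self, items):
        self.cache = list(items)     # data cached at the source

    def fetch(self, n):
        batch, self.cache = self.cache[:n], self.cache[n:]
        return batch

class PullQueue:
    def __init__(self, source):
        self.q = queue.Queue()
        self.source = source

    def pull(self):
        if self.q.empty():           # queue empty: go ask the source
            for item in self.source.fetch(2):
                self.q.put(item)
        return self.q.get()

pq = PullQueue(Source(["a", "b", "c"]))
pulled = [pq.pull() for _ in range(3)]
print(pulled)
```

Note how the flow of control is inverted relative to the Push model above: nothing moves until the consumer asks.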