Instant chat (IM) system needs to solve the problem of message reliability and message consistency.

Message reliability simply means that no message is lost. One party sends a message, and the message is successfully delivered to the other party and displayed correctly. Message consistency refers to the message consistency between the sender and the session. Messages must not be repeated or in disorder.

Message sending implementation process

The general realization process of message sending can be divided into two stages: the sender sends the message, the server receives the message, and returns the ACK message to the sender; The server pushes the message to the receiver. Whether the message is successfully sent depends on the first phase, that is, whether the server receives the message. The message status can be divided into three types: sending, sending successfully, and sending failed. Its nodes are:

1. Sending: The sender triggers the sending event and receives an ACK message from the server.

2. The message is sent successfully: The sender replies with an ACK message.

3. Send failure: If the number of resends exceeds a certain number, no ACK reply is received.

Message sending flowchart:

Message reliability

Retransmission mechanism

The method to ensure the successful sending of messages in the first phase is to set up a resending mechanism. The resending mechanism determines whether the message needs to be resent according to whether the ACK corresponding to the message is received within a certain period of time. If the period exceeds the preset period, the message is resend. If the number of resends exceeds the preset number, the system does not resend the message, determines that the message fails to be sent, and changes the message sending status.

Session Record Check

Message sending In the second phase, the server pushes the message to the receiver. If the connection is disconnected, the message will be lost. Therefore, to ensure the integrity of the message, you need to obtain the session record according to the timestamp of the last message (ACK) after the connection is established, and return all messages in a period of time.

Another guarantee is to add timed polling to check message integrity.

Establish connection flow chart:

Two questions

Message resending and session record checking need to consider two issues: whether the message will be sent repeatedly, and whether the message order will be scrambled. Two examples:

1. Resend the message. If the server does not receive the lost message before the lost message reaches the server, the sender resends the lost message. If a server receives a message and returns an ACK that is lost, sending the same message again may cause the message to duplicate.

2. Message sequence: If the sender sends three consecutive messages and the first and third messages are successfully received by the server, will the third message be recorded when the second message is lost? If the second message reaches the server at this point, is it before or after the third time (the server usually stamps the record)?

Message consistency

Use the UUID message to de-duplicate

For message retransmission, you can add the attribute UUID to each message as the unique identifier of the message. The UUID of the retransmission message remains unchanged, and the front-end system deduplicates the message based on the UUID.

Use vector clocks for message ordering

As for message ordering, the order of messages has an important influence on the expression of the sender in chat. Incomplete messages or reversed messages may cause semantic incoherence or even distortion. Therefore, the order of sending messages from the sender must be guaranteed, and the order of sending messages from both sides of the session must be based on the actual situation.

In general cognition, the status is the message that is being sent and should not be seen by the other party. Only the message that is successfully sent can be seen by the other party. However, in the implementation, the success of sending a message is judged by the server receiving the message and returning an ACK, rather than being received by the other party.

The question then arises, if a message is in the sending state and a message is received, is the message received before or after the message being sent?

This is a context, and the key question is based on which message the sender sent the message.

This paper provides an idea for reference to vector clock algorithm in distributed system. Firstly, the vector clock algorithm is briefly described:

Vector clock algorithms are used to generate partial order relationships and correct causality in distributed systems. A system contains N nodes, and the message body generated by each node contains the logical clock of the node. The vector clock of the whole system is composed of n-dimension logical clock, and is transmitted in the message body generated by each node.

The specific implementation of vector clock algorithm:

1. In the initial state, vector value is 0;

2. Each time a node processes a node event, the clock of the node increases by one.

3. Each time a node sends a message, it sends the system vector clock containing its own clock.

4. Each time a node receives a message, the system vector clock is updated. The clock of the node is incremented by one.

5. When the node receives multiple messages at the same time, check whether there is a partial order relationship between the vector clocks that receive the messages

1. If partial order relationship exists, merge vector clocks and take the vector clock with larger partial order;

2. If the partial order relationship does not exist, the data cannot be merged.

  • Partial ordering: If each dimension of vector A is greater than or equal to vector B, there is A partial ordering relationship between A and B, otherwise there is no partial ordering relationship.

For message ordering, we actually deal with the context of the message and determine the causal relationship between messages. Reference vector clock algorithm, assuming that there are N message session parties, the vector clock of the system is composed of N-dimensional clocks, and the vector clock is transmitted in the message body sent by all parties, and sorted according to the vector clock. The specific implementation is as follows:

1. The system vector clock is set to (0, 0… , N);

2. The node sends a message to update the system vector clock. The clock of the node is increased by one, while other nodes remain unchanged.

3. The node receives the message and updates the system vector clock. The clock of the node increases by one. Other nodes compare the locally retained vector clock value of each node with the vector clock value in the message, and take the maximum value.

4. Determine the message order according to the partial order relation of the system vector clock in the message body:

1. If the partial order relationship can be determined, display the partial order relationship from small to large;

2. If the partial order of multiple messages cannot be determined, the messages are displayed in the natural order (received sequence).

Vector clock can solve the problem of consistency most news in theory, but also need to consider the actual use in the implementation of experience, which need to be one of the most concerned about the question is: whether to force the sorting, or, if the actual display order and the partial order relation between the vector clock not consistent, whether to order between the mobile message.

For example, in a multi-party conversation, if one of the parties is too slow to receive or send messages. After the last message he sees, the other person has started a new topic, and his message about the previous topic is finally sent and received by the other person, there is a problem:

Is the message about the previous topic displayed last or moved to an earlier time? If it appears at the end, but the content is not relevant to the current topic, others may be confused; If you move a message to an earlier time, it may not be seen by others, or it may feel jarring to see an extra message ahead.

IM scenarios are many and complex, and more often than not you need to think from a product perspective. To solve the problem of whether messages need to be sorted, only a relatively general solution is proposed: it is recommended that the sorting is not mandatory in the session, and the sorting is carried out in the session history according to the partial ordering relationship of the vector clock.

summary

For the reliability and consistency of IM system messages, the message retransmission mechanism is used to ensure that the message is successfully received by the server, and the session record check is used to ensure the integrity of the received message, thus ensuring the reliability of the entire message sending process. Uuid message deduplicating and reference vector clock algorithm are used to sort messages, which provides a solution to ensure message consistency.