This article walks you through building a lightweight IM server from scratch. The overall design and architecture of the IM system were described in my previous post. If you haven't read it, see "The development of IM (instant messaging) server from scratch".
This article goes into more detail on the implementation. I will explain how to build a complete and reliable IM system from three aspects:
- Reliability
- Security
- Storage design
Reliability
What is reliability? For an IM system, reliability means at least three things: messages are not lost, messages are not duplicated, and messages do not arrive out of order. Only when all three hold can we say the chat experience is good.
No lost messages
Let's start with never losing a message.
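First, recall the server architecture from the previous article, in which clients connect to connector nodes and messages are relayed between connectors by a transfer node.
Let's start with a simple example. When Alice sends a message to Bob, it might travel along a path like this:
- client–>connector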
- connector–>transfer
- transfer–>connector
- connector–>client
Any hop on this path may fail. Although TCP is reliable, it only guarantees reliable delivery at the transport layer, not at the application layer.
For example, suppose the connector receives the message from the client in step 1 but fails to forward it to the transfer in step 2. Bob never receives the message, and Alice has no way of knowing that delivery failed.
If Bob is offline, the message link is:
- client–>connector
- connector–>transfer
- transfer–>mq
If, in step 3, the transfer receives the message from the connector but fails to write it to the offline message queue, delivery has also failed. To guarantee reliability at the application layer, we need an ACK mechanism that lets the sender confirm that the message has been received.
For the implementation, we imitate TCP and build an ACK mechanism at the application layer.
TCP acknowledges data in units of bytes, whereas we acknowledge in units of messages. Each time a sender sends a message, it must wait for an ACK from the other side. The ACK must carry the ID of the message it acknowledges so that the sender can match it to the message it sent.
Second, the sender needs to maintain a queue of messages waiting for an ACK. Each time a message is sent, it is put into the queue together with a timer.
In addition, a thread keeps polling the queue. If a message has timed out without receiving an ACK, the thread takes it out and resends it.
A message whose ACK has not arrived before the timeout can be handled in two ways:
- Like TCP, keep resending until an ACK is received.
- Set a maximum number of retries. If no ACK arrives after that many attempts, fall back to a failure mechanism to save resources. For example, if the connector has not received an ACK from the client for a long time, it can actively close the connection to the client and store the remaining unsent messages as offline messages. After being disconnected, the client can try to reconnect to the server.
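The ACK wait queue with a retry limit could be sketched as follows. This is only a minimal illustration, not the project's actual code: the class name AckWaitQueue, its methods, and the onGiveUp callback are assumptions made for the example, and the current time is passed in explicitly so that the polling logic can be exercised without a real timer thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Sender-side queue of message ids still waiting for an ACK (hypothetical names). */
class AckWaitQueue {
    private static final class Pending {
        long deadline;   // time at which the next resend is due
        int retries;     // resends performed so far
        Pending(long deadline) { this.deadline = deadline; }
    }

    private final Map<Long, Pending> waiting = new LinkedHashMap<>();
    private final long timeoutMillis;
    private final int maxRetries;
    private final Consumer<Long> onGiveUp;   // failure mechanism, e.g. close the connection

    AckWaitQueue(long timeoutMillis, int maxRetries, Consumer<Long> onGiveUp) {
        this.timeoutMillis = timeoutMillis;
        this.maxRetries = maxRetries;
        this.onGiveUp = onGiveUp;
    }

    /** Call right after sending a message: start waiting for its ACK. */
    synchronized void sent(long msgId, long now) {
        waiting.put(msgId, new Pending(now + timeoutMillis));
    }

    /** Call when an ACK arrives; returns false for an unknown or duplicate ACK. */
    synchronized boolean ack(long msgId) {
        return waiting.remove(msgId) != null;
    }

    /** Called periodically by the polling thread; returns the ids the caller must resend now. */
    synchronized List<Long> poll(long now) {
        List<Long> resend = new ArrayList<>();
        List<Long> givenUp = new ArrayList<>();
        Iterator<Map.Entry<Long, Pending>> it = waiting.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Pending> entry = it.next();
            Pending p = entry.getValue();
            if (now < p.deadline) {
                continue;                      // still within the timeout
            }
            if (p.retries >= maxRetries) {
                it.remove();                   // retry budget exhausted
                givenUp.add(entry.getKey());
            } else {
                p.retries++;
                p.deadline = now + timeoutMillis;
                resend.add(entry.getKey());
            }
        }
        givenUp.forEach(onGiveUp);             // run the failure mechanism after iterating
        return resend;
    }

    synchronized int size() {
        return waiting.size();
    }
}
```

In a real connector the polling thread would call poll(System.currentTimeMillis()) on a fixed schedule, resend the returned messages, and use the give-up callback to disconnect the client and store its unsent messages as offline messages.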
No duplicates, no disorder
Sometimes the ACK arrives late because of network conditions, and the sender resends the message. The receiver therefore needs a deduplication mechanism, which is built by giving every message a unique ID. This ID does not have to be globally unique. It only needs to be unique within a session, for example a conversation between two people or a group. If the network is disconnected and reconnected, the IDs of the session start from 0 again.
The receiver maintains the ID of the last message received in the current session, called lastId. Each time a new message arrives, its ID is compared with lastId to see whether it is consecutive. If it is not, the message is placed in a temporary queue for later processing.
For example:
- The lastId of the current session is 1 and the server receives msg(id=2). The message is consecutive, so the server processes it and sets lastId to 2.
- If the server instead receives msg(id=3), the message has arrived out of order. It is placed in the queue and processed only after lastId becomes 2, that is, after the server has received and processed msg(id=2).
Therefore, a message is new only if msgId > lastId && !queue.contains(msgId). Otherwise it is a duplicate. When a duplicate is received, the receiver can conclude that its earlier ACK did not reach the sender in time and send the ACK again.
The complete flow for handling a received message is shown in the following pseudocode:
    class ProcessMsgNode {
        // A message parked while earlier ids are still missing, plus its handler.
        private Message message;
        private Consumer<Message> consumer;
    }

    public CompletableFuture<Void> offer(Long id, Message message, Consumer<Message> consumer) {
        if (isRepeat(id)) {
            // Duplicate message: the earlier ACK was probably lost, so ACK again.
            sendAck(id);
            return CompletableFuture.completedFuture(null);
        }
        if (!isConsist(id)) {
            // Not consecutive: park the message until the gap is filled.
            notConsistMsgMap.put(id, new ProcessMsgNode(message, consumer));
            return CompletableFuture.completedFuture(null);
        }
        return process(id, message, consumer);
    }

    private CompletableFuture<Void> process(Long id, Message message, Consumer<Message> consumer) {
        return CompletableFuture
            .runAsync(() -> consumer.accept(message))
            .thenAccept(v -> sendAck(id))
            .thenAccept(v -> lastId.set(id))
            .thenComposeAsync(v -> {
                Long nextId = nextId(id);
                // remove() rather than get(), so a processed message does not stay in the map
                ProcessMsgNode node = notConsistMsgMap.remove(nextId);
                if (node != null) {
                    // The next message is already waiting in the queue.
                    return process(nextId, node.getMessage(), node.getConsumer());
                }
                // There is no next message in the queue.
                return CompletableFuture.<Void>completedFuture(null);
            })
            .exceptionally(e -> {
                logger.error("[process received msg] has error", e);
                return null;
            });
    }
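The pseudocode relies on helpers that are not shown (isRepeat, isConsist, nextId). For readers who want something they can run directly, here is a simplified synchronous sketch of the same receive-side rules. The class SessionReceiver and its fields are assumptions made for the example, messages are plain strings, and IDs are assumed to start at 1 with lastId initialised to 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Receiver-side state for one session: drops duplicates and restores order (hypothetical names). */
class SessionReceiver {
    private long lastId = 0;                                    // id of the last processed message
    private final Map<Long, String> pending = new HashMap<>();  // out-of-order messages
    final List<String> delivered = new ArrayList<>();           // stands in for the real consumer
    final List<Long> acks = new ArrayList<>();                  // stands in for sendAck(id)

    synchronized void offer(long id, String body) {
        if (id <= lastId || pending.containsKey(id)) {
            acks.add(id);            // duplicate: ACK again, as described in the text
            return;
        }
        if (id != lastId + 1) {
            pending.put(id, body);   // gap: park until the missing messages arrive
            return;
        }
        process(id, body);
        // Drain parked messages that have now become consecutive.
        String next;
        while ((next = pending.remove(lastId + 1)) != null) {
            process(lastId + 1, next);
        }
    }

    private void process(long id, String body) {
        delivered.add(body);
        acks.add(id);
        lastId = id;
    }
}
```

Offering ids 1, 3, 1, 2 in that order delivers the three distinct messages in id order, and the repeated id 1 only triggers a second ACK.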
Security
Both chat logs and offline messages are backed up on the server, so message security and the protection of users' privacy are also critical. Therefore, all messages must be encrypted. In the storage module, two basic tables maintain user information and the relation chain: the im_user table and the im_relation table.
The im_user table stores general user information, such as user names and passwords.
The im_relation table records friend relationships. Its structure is as follows:
    CREATE TABLE `im_relation` (
      `id` bigint(20) COMMENT 'id',
      `user_id1` varchar(100) COMMENT 'user id',
      `user_id2` varchar(100) COMMENT 'user id',
      `encrypt_key` char(33) COMMENT 'AES key',
      `gmt_create` timestamp DEFAULT CURRENT_TIMESTAMP,
      `gmt_update` timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`id`),
      UNIQUE KEY `USERID1_USERID2` (`user_id1`,`user_id2`)
    );
user_id1 and user_id2 are the user IDs of the two friends. To avoid duplicate rows, they are stored in the order user_id1 < user_id2, and a unique composite index is placed on the pair.
encrypt_key is a randomly generated key. When the client logs in, it fetches all of the user's relation records from the database and keeps them in memory for subsequent encryption and decryption.
When a client sends a message to a friend, it looks up the key for that relation in memory, encrypts the message with it, and sends it. Similarly, when a message is received, the corresponding key is looked up and used to decrypt it.
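The key lookup and encryption step could look roughly like the sketch below. The article only states that the key is an AES key, so several things here are assumptions made for illustration: the class RelationCrypto, the use of AES in GCM mode with a random 12-byte IV prepended to the ciphertext, a raw 16-byte key, and the "|" separator used to build the in-memory map key (which presumes user IDs do not contain that character).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Client-side cache of per-relation keys plus message encryption (hypothetical names). */
class RelationCrypto {
    private static final int IV_LEN = 12;     // bytes, a common IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private final Map<String, byte[]> keys = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Same normalisation as the table: the smaller user id always comes first. */
    static String relationId(String a, String b) {
        return a.compareTo(b) < 0 ? a + "|" + b : b + "|" + a;
    }

    /** Called after login for every relation record fetched from the server. */
    void putKey(String userA, String userB, byte[] aesKey) {
        keys.put(relationId(userA, userB), aesKey.clone());
    }

    byte[] encrypt(String from, String to, String text) {
        try {
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key(from, to), new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(text.getBytes(StandardCharsets.UTF_8));
            byte[] out = Arrays.copyOf(iv, IV_LEN + body.length);   // layout: IV, then ciphertext and tag
            System.arraycopy(body, 0, out, IV_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    String decrypt(String from, String to, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key(from, to),
                    new GCMParameterSpec(TAG_BITS, data, 0, IV_LEN));
            byte[] plain = cipher.doFinal(data, IV_LEN, data.length - IV_LEN);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    private SecretKeySpec key(String a, String b) {
        byte[] k = keys.get(relationId(a, b));
        if (k == null) {
            throw new IllegalStateException("no relation between " + a + " and " + b);
        }
        return new SecretKeySpec(k, "AES");
    }
}
```

Because both sides normalise the pair of user IDs in the same way, Alice and Bob find the same key no matter who is the sender.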
The complete client login process is as follows:
- The client calls the REST interface to log in.
- The client calls the REST interface to fetch all of the user's relation records.
- The client sends a greet message to the connector.
- The connector pulls offline messages and pushes them to the client.
- The connector updates the user session.
Why does the connector push offline messages before updating the session? Let's think about what would happen if the order were reversed:
- The user Alice logs in to the server.
- The connector updates the session.
- The connector pushes offline messages.
- At this point Bob sends a message to Alice.
If the offline messages are still being pushed when Bob sends a new message to Alice, the server finds Alice's session and pushes the new message immediately. The new message may then land in the middle of the offline messages, and Alice sees her messages out of order.
We must ensure that offline messages arrive before new messages.
So we push the offline messages first and update the session afterwards. While offline messages are being pushed, Alice's status is still "offline", so any new message from Bob is simply written to the im_offline table. Alice becomes "online" only after the data in the im_offline table has been read, which avoids the disorder.
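The ordering rule can be illustrated with a small in-memory sketch. Everything here is a stand-in rather than the real server code: ConnectorSketch, its maps, and the method names are assumptions, the offline queue replaces the im_offline table, and the methods are synchronized so that no message can slip in between draining the backlog and marking the user online (a simplification added for the example, not something the article describes).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** In-memory stand-in for the connector's session and offline handling (hypothetical names). */
class ConnectorSketch {
    private final Set<String> onlineUsers = new HashSet<>();             // stands in for user sessions
    private final Map<String, Queue<String>> offline = new HashMap<>();  // stands in for im_offline
    final Map<String, List<String>> pushed = new HashMap<>();            // what each client received

    /** Routes a new message: push if the recipient has a session, otherwise store it offline. */
    synchronized void deliver(String toUser, String msg) {
        if (onlineUsers.contains(toUser)) {
            push(toUser, msg);
        } else {
            offline.computeIfAbsent(toUser, u -> new ArrayDeque<>()).add(msg);
        }
    }

    /** Handles the greet message: drain offline messages first, mark the user online last. */
    synchronized void onGreet(String user) {
        Queue<String> backlog = offline.getOrDefault(user, new ArrayDeque<>());
        String msg;
        while ((msg = backlog.poll()) != null) {
            push(user, msg);
        }
        onlineUsers.add(user);   // only from now on do new messages bypass the offline store
    }

    private void push(String user, String msg) {
        pushed.computeIfAbsent(user, u -> new ArrayList<>()).add(msg);
    }
}
```

With this order, messages stored while Alice was offline always reach her before anything sent after her session is registered.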
Storage design
Storing offline messages
When the user is offline, the offline message must be stored on the server and pushed after the user goes online. After understanding the previous section, storing offline messages is easy. Add an offline message table im_offline. The table structure is as follows:
    CREATE TABLE `im_offline` (
      `id` int(11) COMMENT 'id',
      `msg_id` bigint(20) COMMENT 'message id',
      `msg_type` int(2) COMMENT 'message type',
      `content` varbinary(5000) COMMENT 'message content',
      `to_user_id` varchar(100) COMMENT 'recipient id',
      `has_read` tinyint(1) COMMENT 'whether read',
      `gmt_create` timestamp COMMENT 'create time',
      PRIMARY KEY (`id`)
    );
msg_type distinguishes message types (chat, ACK). The encrypted message content is stored as a byte array. When a user comes online, records are pulled with the condition to_user_id = the user's ID.
Prevent repeated push of offline messages
Let’s consider the case of multiple logins, where Alice has two devices logged in at the same time. In this case, we need some mechanism to ensure that offline messages are read only once.
A CAS (compare-and-set) mechanism is used to achieve this:
- First, fetch all records where has_read = false.
- For each message, check whether has_read is false and, if so, set it to true. This is an atomic operation:
update im_offline set has_read = true where id = ${msg_id} and has_read = false
- If the update succeeds, push the message. If it fails, another device has already claimed the message, so do not push it.
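The same claim-then-push idea can be shown in memory, without a database. In this sketch the class OfflineClaims and its methods are assumptions for illustration, and a ConcurrentHashMap replaces the has_read column. Its three-argument replace succeeds only when the current value matches the expected one, which mirrors the "and has_read = false" condition of the update statement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory analogue of the has_read column, keyed by offline message id (hypothetical names). */
class OfflineClaims {
    private final Map<Long, Boolean> hasRead = new ConcurrentHashMap<>();

    /** A new offline message starts out unread. */
    void store(long id) {
        hasRead.put(id, Boolean.FALSE);
    }

    /**
     * Mirrors "update im_offline set has_read = true where id = ? and has_read = false":
     * returns true only for the single caller that flips the flag.
     */
    boolean claim(long id) {
        return hasRead.replace(id, Boolean.FALSE, Boolean.TRUE);
    }

    /** One device's login: keep only the messages this device managed to claim. */
    List<Long> pullFor(List<Long> unreadIds) {
        List<Long> toPush = new ArrayList<>();
        for (long id : unreadIds) {
            if (claim(id)) {
                toPush.add(id);
            }
        }
        return toPush;
    }
}
```

If two devices both read the same unread list, the first one to claim a message pushes it and the second gets an empty result. Against a real database, the equivalent check would be whether the update statement reports exactly one affected row.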
I believe that by now readers can build a complete and usable IM server by themselves. Please leave any questions in the comments section.
GitHub link: github.com/yuanrw/IM. If you find it helpful, please give it a star!