At the beginning of September 2017, we implemented an initial, minimalist group chat messaging system. Its general structure is as follows:










Explanation of system terms:





  • 1) Client: the message publisher, i.e. the server-side caller of the group chat messaging system;
  • 2) Proxy: the system proxy and unified external interface; it collects messages from Clients and forwards them to Brokers;
  • 3) Broker: the system's message forwarding server; the Broker builds a RoomGatewayList (a list of Gateway {IP:Port} addresses) from Gateway Messages, and then forwards each message sent by the Proxy to all Gateways on which members of the Room are logged in;
  • 4) Router: the forwarding agent for user login and logout messages; it forwards the login and logout messages sent by Gateways to all Brokers;
  • 5) Gateway: the access layer of the system; it accepts connections from legitimate clients and forwards client login and logout messages to all Brokers through the Router;
  • 6) Room Message: a group chat message addressed to a Room;
  • 7) Gateway Message: a message generated when a Room member logs in to or out of a Gateway, containing the UIN, the RoomID and the Gateway address {IP:Port}.









This system has the following characteristics:





  • 1) The system only forwards chat messages within a Room: each message is forwarded to the relevant nodes immediately after it is received. No room chat messages are stored, and message loss and message duplication are not considered;
  • 2) The system topology is fixed: one Proxy, three Brokers and one Router;
  • 3) The Proxy receives Room Messages sent from the back end and dispatches each message to one Broker according to a load-balancing algorithm; that Broker then sends the message to all Gateways (access machines) related to the Room;
  • 4) The Router receives the Gateway login and logout messages of Room members and sends them to all Brokers;
  • 5) After receiving a Gateway Message forwarded by the Router, a Broker updates (adds to or deletes from) the set of Gateways associated with the Room;
  • 6) All communication links in the system use UDP.

















4. A further design focus: scalability


4.1 Basic ideas



A replica-based capacity expansion scheme: free capacity planning, no data migration, and no changes to the routing code.












First of all,






The second






The last


















After analysis, the corresponding architecture diagram is as follows:












4.2 The Client



The Client process is as follows:





  • 1) Load the Registry address from the configuration file;
  • 2) Obtain all Proxies from the Registry proxy registration path /pubsub/proxy, and build a ProxyArray sorted by increasing Proxy ID;
  • 3) Start a thread that watches the Registry path /pubsub/proxy in real time, so that dynamic changes in the Proxy set are detected and the ProxyArray is updated promptly;
  • 4) Start a thread that periodically polls the Registry path /pubsub/proxy for every Proxy instance, as a supplement to the watch policy, keeping the local ProxyArray consistent with the Proxies registered on the Registry; this thread also periodically sends heartbeats to each Proxy, collects the heartbeat responses asynchronously, and removes from the ProxyArray any Proxy whose heartbeats have timed out;
  • 5) When sending a message, use the snowflake algorithm to assign a MessageID to it, and then use a load-balancing algorithm to forward the message to one of the Proxies (a minimal sketch follows this list).
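A minimal Go sketch of step 5, given as an illustrative assumption rather than the project's actual code: the 41/10/12-bit snowflake layout and the round-robin policy are common defaults, and the proxy addresses are hypothetical.

```go
// Sketch: snowflake-style MessageID generation plus round-robin Proxy selection.
package main

import (
	"fmt"
	"sync"
	"time"
)

// SnowflakeGen packs a millisecond timestamp, a worker ID and a per-millisecond
// sequence into a 64-bit MessageID (41 + 10 + 12 bits, a common layout).
type SnowflakeGen struct {
	mu       sync.Mutex
	workerID int64
	lastMs   int64
	seq      int64
}

func (g *SnowflakeGen) Next() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	ms := time.Now().UnixMilli()
	if ms == g.lastMs {
		g.seq = (g.seq + 1) & 0xFFF // 12-bit sequence, wraps within the millisecond
	} else {
		g.lastMs, g.seq = ms, 0
	}
	return ms<<22 | (g.workerID&0x3FF)<<12 | g.seq
}

// pickProxy chooses a Proxy from the ProxyArray with a trivial round-robin policy.
func pickProxy(proxies []string, rr *uint64) string {
	idx := *rr % uint64(len(proxies))
	*rr = *rr + 1
	return proxies[idx]
}

func main() {
	gen := &SnowflakeGen{workerID: 1}
	proxies := []string{"10.0.0.1:7000", "10.0.0.2:7000"} // hypothetical addresses
	var rr uint64
	msgID := gen.Next()
	fmt.Printf("MessageID %d -> proxy %s\n", msgID, pickProxy(proxies, &rr))
}
```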


4.3 The Proxy



The detailed process of Proxy is as follows:





  • 1) Read the configuration file to obtain the Registry address;
  • 2) Register its own information under the Registry path /pubsub/proxy, and take the ReplicaID returned by the Registry as its own ID;
  • 3) Obtain every replica of every Broker Partition from the Registry path /pubsub/broker/partition(X);
  • 4) Obtain the current valid Broker Partition Number from the Registry path /pubsub/broker/partition_num;


  • 5) Start a thread that watches the Broker path /pubsub/broker on the Registry to obtain the following information in real time:
  •     – a change of the Broker Partition Number;
  •     – a new Broker Partition (capacity expansion has occurred);
  •     – a new Broker replica within a Partition (replica expansion has occurred within the Partition);
  •     – a replica within a Broker Partition going down;


  • 6) Periodically send heartbeats to each Broker Partition replica and asynchronously wait for the heartbeat responses returned by the Brokers to detect their liveness, so that no Room Message is forwarded to a replica that has timed out;
  • 7) Start a thread that periodically reads the value of every sub-node under the Broker path /pubsub/broker on the Registry, observing changes of the Broker Partition Number and of each Partition with a scheduled-polling strategy, as a supplement to the real-time watch strategy; it also periodically checks Brokers for heartbeat timeouts and removes them from the valid BrokerList;
  • 8) According to the rule [BrokerPartitionID = RoomID % BrokerPartitionNum, BrokerReplicaID = RoomID % BrokerPartitionReplicaNum], forward each Room Message to one replica of a Broker Partition (a minimal sketch follows this list); respond promptly to Client Heartbeat packets.
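A minimal Go sketch of the selection rule in step 8; the function name pickBroker and the example numbers are illustrative assumptions.

```go
// Sketch: mapping a RoomID to a Broker Partition and a replica inside it.
package main

import "fmt"

// pickBroker returns the indices of the target Broker Partition and of the
// replica inside that Partition for a given RoomID.
func pickBroker(roomID, partitionNum, replicaNum uint64) (partitionID, replicaID uint64) {
	partitionID = roomID % partitionNum // BrokerPartitionID = RoomID % BrokerPartitionNum
	replicaID = roomID % replicaNum     // BrokerReplicaID = RoomID % BrokerPartitionReplicaNum
	return
}

func main() {
	p, r := pickBroker(1001, 4, 3)
	fmt.Println("partition:", p, "replica:", r) // partition: 1 replica: 2
}
```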























4.4 Pipeline















In the final implementation, message handling is split into the three pipeline stages below:





  • 1) Start 1 message-receiving thread and N multi-writer/single-reader lock-free queues [called message protocol conversion queues]. The receiving thread runs an epoll loop to collect messages and writes each message to the corresponding protocol conversion queue using the hash rule [queue ID = UIN % N];
  • 2) Start N protocol-conversion threads and 3N single-writer/single-reader lock-free queues [called message sending queues]. Each protocol-conversion thread takes messages from its conversion queue, converts the protocol, and writes each message to a sending queue using the hash rule [queue ID = UIN % 3N];
  • 3) Start 3N message-sending threads, each holding a connection to its corresponding Broker. Each sending thread takes messages from one of the sending queues and sends them out (a minimal sketch follows this list).
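A minimal Go sketch of this pipeline, offered as an illustrative assumption rather than the project's actual code: buffered channels stand in for the lock-free queues, goroutines stand in for the threads, and the hash rules are the ones described above.

```go
// Sketch: the three-stage receive -> convert -> send pipeline.
package main

import (
	"fmt"
	"sync"
)

type RawMsg struct {
	UIN     uint64
	Payload string
}

func main() {
	const N = 2 // number of protocol-conversion workers
	convQ := make([]chan RawMsg, N)   // stage 1 -> stage 2, queue ID = UIN % N
	sendQ := make([]chan RawMsg, 3*N) // stage 2 -> stage 3, queue ID = UIN % 3N
	for i := range convQ {
		convQ[i] = make(chan RawMsg, 16)
	}
	for i := range sendQ {
		sendQ[i] = make(chan RawMsg, 16)
	}

	// Stage 2: N conversion workers hash each converted message onto a send queue.
	var convWG sync.WaitGroup
	for i := 0; i < N; i++ {
		convWG.Add(1)
		go func(q chan RawMsg) {
			defer convWG.Done()
			for m := range q {
				sendQ[m.UIN%(3*N)] <- m
			}
		}(convQ[i])
	}

	// Stage 3: 3N senders, one per Broker connection (printing stands in for the send).
	var sendWG sync.WaitGroup
	for i := 0; i < 3*N; i++ {
		sendWG.Add(1)
		go func(id int, q chan RawMsg) {
			defer sendWG.Done()
			for m := range q {
				fmt.Printf("sender %d -> broker: %+v\n", id, m)
			}
		}(i, sendQ[i])
	}

	// Stage 1: the receive loop (an epoll loop in the real system) hashes onto convQ.
	for _, m := range []RawMsg{{UIN: 7, Payload: "hello"}, {UIN: 8, Payload: "world"}} {
		convQ[m.UIN%N] <- m
	}
	for i := range convQ {
		close(convQ[i])
	}
	convWG.Wait()
	for i := range sendQ {
		close(sendQ[i])
	}
	sendWG.Wait()
}
```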









This article does not elaborate on the Pipeline pattern itself; you can refer to the following figure:






4.5 Large room message processing























4.6 The Broker



The Broker process is as follows:





  • 1) The Broker loads its configuration to get the ID of its own Partition (assume it is 3);
  • 2) Register itself under the Registry path /pubsub/broker/partition3, set its status to Init, and take the ID returned by the Registry as its own ID (replicaID);
  • 3) Receive the GatewayMessages forwarded by the Router and put them into the GatewayMessageQueue;
  • 4) Load from the Database the RoomGatewayList data for which its own Broker Partition is responsible;
  • 5) Process the GatewayMessages in the GatewayMessageQueue asynchronously, handling only the messages that satisfy the rule [PartitionID == RoomID % PartitionNum], and store the data in the local routing information cache;
  • 6) Change the status of its own node under the Registry path /pubsub/broker/partition3 to Running;
  • 7) Start a thread that watches the value of the Registry path /pubsub/broker/partition_num in real time;
  • 8) Start a thread that periodically queries the value of the Registry path /pubsub/broker/partition_num;
  • 9) When the value of the Registry path /pubsub/broker/partition_num changes, clean each entry in the local routing information cache according to the rule [PartitionID == RoomID % PartitionNum] (see the sketch after this list);
  • 10) On receiving a Room Message sent by the Proxy, look up in the routing information cache, by RoomID, all Gateways on which members of the Room are logged in, and forward the Message to those Gateways.
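A minimal Go sketch of the cache-cleaning rule in step 9; pruneRoutingCache and the sample cache contents are illustrative assumptions.

```go
// Sketch: pruning the RoomGatewayList cache when partition_num changes.
package main

import "fmt"

// pruneRoutingCache keeps only the rooms this Broker Partition is responsible
// for under the new partition count: PartitionID == RoomID % PartitionNum.
func pruneRoutingCache(cache map[uint64][]string, partitionID, partitionNum uint64) {
	for roomID := range cache {
		if roomID%partitionNum != partitionID {
			delete(cache, roomID)
		}
	}
}

func main() {
	cache := map[uint64][]string{ // RoomID -> Gateway {IP:Port} list
		10: {"192.0.2.1:9000"},
		11: {"192.0.2.2:9000"},
	}
	pruneRoutingCache(cache, 3, 4) // this Broker owns rooms with RoomID % 4 == 3
	fmt.Println(cache)             // only room 11 remains
}
```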





























4.7 The Router



Router details are as follows:





  • 1) The Router loads its configuration and obtains the Registry address;
  • 2) Register its own information under the Registry path /pubsub/router, and take the ReplicaID returned by the Registry as its own ID;
  • 3) Obtain every replica of every Broker Partition from the Registry path /pubsub/broker/partition(X);
  • 4) Obtain the current valid Broker Partition Number from the Registry path /pubsub/broker/partition_num;


  • 5) Start a thread that watches the Broker path /pubsub/broker on the Registry to obtain the following information in real time:
  •     – a change of the Broker Partition Number;
  •     – a new Broker Partition (capacity expansion has occurred);
  •     – a new Broker replica within a Partition (replica expansion has occurred within the Partition);
  •     – a replica within a Broker Partition going down;


  • 6) Periodically send heartbeats to each Broker Partition replica and asynchronously wait for the heartbeat responses returned by the Brokers to detect their liveness, so that no Gateway Message is forwarded to a replica that has timed out;
  • 7) Start a thread that periodically reads the value of every sub-node under the Broker path /pubsub/broker on the Registry, observing changes of the Broker Partition Number and of each Partition with a scheduled-polling strategy, as a supplement to the real-time watch strategy; it also periodically checks Brokers for heartbeat timeouts and removes them from the valid BrokerList;
  • 8) Load the full RoomGatewayList routing data from the Database into the local cache;
  • 9) Receive heartbeat messages from Gateways and return ACK packets promptly;
  • 10) On receiving a Gateway Message forwarded by a Gateway, forward it, according to the rule [BrokerPartitionID % BrokerPartitionNum == RoomID % BrokerPartitionNum], to all Broker replicas under the selected Broker Partition (see the sketch after this list), so that all replicas of that Partition hold the same RoomGatewayList routing data; then store the data carried in the Message in the local cache and, if it is not a duplicate, write it asynchronously to the Database.
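A minimal Go sketch of the fan-out in step 10; the replica addresses are hypothetical.

```go
// Sketch: the Router picks one Broker Partition by RoomID and sends the
// Gateway Message to every replica of that Partition.
package main

import "fmt"

// fanOut returns the replica addresses a Gateway Message for roomID should go to.
func fanOut(roomID uint64, partitions [][]string) []string {
	return partitions[roomID%uint64(len(partitions))] // all replicas of one Partition
}

func main() {
	partitions := [][]string{ // hypothetical Broker replica addresses per Partition
		{"10.0.1.1:8000", "10.0.1.2:8000"},
		{"10.0.2.1:8000", "10.0.2.2:8000"},
	}
	fmt.Println(fanOut(7, partitions)) // RoomID 7 -> all replicas of Partition 1
}
```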


4.8 The Gateway



The detailed Gateway process is as follows:





  • 1) Read the configuration file and load the Registry address;
  • 2) Obtain all Router replicas from the Registry path /pubsub/router/ and build the replica array RouterArray sorted by increasing replica ID;
  • 3) Start a thread that watches the Registry path /pubsub/router in real time, so that dynamic changes of the Router set are detected and the RouterArray is updated promptly;
  • 4) Start a thread that periodically polls the Registry path /pubsub/router for every Router instance, as a supplement to the watch policy, so that the local RouterArray is updated in time; this thread also periodically sends heartbeat packets to each Router, collects the heartbeat responses asynchronously, and removes from the RouterArray any Router whose heartbeats have timed out;
  • 5) When a member client of a Room connects to the current Gateway node, or when no member of the Room remains connected to it, send a Gateway Message to one Router according to the rule [RouterArrayIndex = RoomID % RouterNum];
  • 6) On receiving a Room Message forwarded by a Broker, de-duplicate it by MessageID: if it is not a duplicate, send it to all clients of the Room connected to the current Gateway, and cache the MessageID for later duplicate checks (a minimal sketch follows this list).
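A minimal Go sketch of the MessageID de-duplication in step 6, assuming a simple fixed-size FIFO set; the real system may use a different eviction policy.

```go
// Sketch: a bounded seen-set for Room Message IDs.
package main

import "fmt"

type DedupCache struct {
	seen  map[int64]struct{}
	order []int64
	limit int
}

func NewDedupCache(limit int) *DedupCache {
	return &DedupCache{seen: make(map[int64]struct{}), limit: limit}
}

// Check records msgID and reports whether it was already seen.
func (c *DedupCache) Check(msgID int64) (duplicate bool) {
	if _, ok := c.seen[msgID]; ok {
		return true
	}
	if len(c.order) == c.limit { // evict the oldest cached ID
		delete(c.seen, c.order[0])
		c.order = c.order[1:]
	}
	c.seen[msgID] = struct{}{}
	c.order = append(c.order, msgID)
	return false
}

func main() {
	c := NewDedupCache(1024)
	fmt.Println(c.Check(42), c.Check(42)) // false true: the second delivery is dropped
}
```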





5. The next problem to solve urgently: system stability





5.1. Message latency

















5.2. High availability























6. Further optimization: message reliability









Why does QQ use UDP instead of TCP?












The reliable message transmission process based on the current system is as follows:





  • 1) The Client assigns an ID to each command message using the snowflake algorithm, makes three copies of it, and immediately sends them to different Proxies (a minimal sketch follows this list);
  • 2) When a Proxy receives a command message, it sends it to a randomly chosen Broker;
  • 3) The Broker receives the message and passes it on to the Gateway;
  • 4) After receiving a command message, the Gateway performs duplicate detection by message ID: duplicates are discarded, otherwise the message is sent to the APP and its ID is cached.
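A minimal Go sketch of step 1: sending three copies of the same message over UDP to different Proxies. The proxy addresses and the message encoding are illustrative assumptions.

```go
// Sketch: redundancy over unreliable UDP; the Gateway later drops duplicates by ID.
package main

import (
	"fmt"
	"net"
)

// sendTriplicate sends the same datagram to up to three distinct Proxies.
func sendTriplicate(msg []byte, proxies []string) {
	for i := 0; i < 3 && i < len(proxies); i++ {
		conn, err := net.Dial("udp", proxies[i])
		if err != nil {
			fmt.Println("skip proxy:", err)
			continue
		}
		conn.Write(msg) // fire-and-forget; reliability comes from redundancy plus dedup
		conn.Close()
	}
}

func main() {
	// hypothetical proxy addresses and payload
	sendTriplicate([]byte("msgID=42|room=7|hello"),
		[]string{"10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"})
}
```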





7. The Router needs further enhancement


7.1 Overview










































7.2 The Gateway



The detailed Gateway process is as follows:





  • 1) Obtain every replica of every Router Partition from the Registry path /pubsub/router/partition(X);
  • 2) Obtain the current valid Router Partition Number from the Registry path /pubsub/router/partition_num;
  • 3) Start a thread that watches the Router path /pubsub/router on the Registry to obtain the following information in real time: a change of the Router Partition Number -> a new Router Partition (capacity expansion has occurred); a new replica within a Partition (replica expansion has occurred within the Partition); a replica within a Partition going down;
  • 4) Periodically send heartbeat packets to each Partition replica and asynchronously wait for the heartbeat responses returned by the Routers to detect their liveness, so that no Gateway Message is forwarded to a replica that has timed out;
  • 5) Start a thread that periodically reads the value of every sub-node under the Router path /pubsub/router on the Registry, observing changes of the Router Partition Number and of each Partition with a scheduled-polling strategy, as a supplement to the real-time watch strategy; it also periodically checks Routers for heartbeat timeouts and removes them from the valid Router list;
  • 6) Forward the Gateway Message to a replica of a Partition according to the rule described below.



The rules in step 6 determine the destination Partition and replica of the Gateway Message. The rules are as follows:


If a RouterPartitionID satisfies the condition (RoomID % RouterPartitionNumber == RouterPartitionID % RouterPartitionNumber), the message is forwarded to that Partition. The Router Partition is deliberately not selected by a direct hash (RouterPartitionID = RoomID % RouterPartitionNumber). The reason is that when the Router cluster is expanded by doubling, the value of the Registry path /pubsub/router/partition_num is only modified after all the new Partitions have started and their data is consistent; only with the congruence formula can every replica of a new Partition receive Gateway Messages during its startup phase. In other words, during expansion each Gateway Message is sent to two Router Partitions. Once the expansion finishes and the value of /pubsub/router/partition_num is updated, the new cluster enters a steady state, each Gateway Message is sent to only one fixed Partition, and the condition (RoomID % RouterPartitionNumber == RouterPartitionID % RouterPartitionNumber) becomes equivalent to (RouterPartitionID = RoomID % RouterPartitionNumber). The sketch below illustrates this behaviour.

If a replica within the chosen Router Partition satisfies the condition (replicaPartitionID = RoomID % RouterPartitionReplicaNumber), the message is forwarded to that replica. The ID a replica obtains when registering with the Registry is called its replicaID; all replicas of a Router Partition form a RouterPartitionReplicaArray sorted by increasing replicaID, and replicaPartitionID is the replica's index in this array.
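A minimal Go sketch of the congruence rule, assuming the Gateway already knows both the partition_num value published on the Registry and the set of Partitions it has seen there; the helper name is an illustrative assumption.

```go
// Sketch: during a doubling expansion every Gateway Message reaches both the
// old and the new Router Partition; in steady state the congruence picks one.
package main

import "fmt"

// targetPartitions returns every RouterPartitionID whose congruence class
// matches the RoomID under the currently published partition_num.
func targetPartitions(roomID, publishedNum, actualPartitions uint64) []uint64 {
	var out []uint64
	for pid := uint64(0); pid < actualPartitions; pid++ {
		if roomID%publishedNum == pid%publishedNum {
			out = append(out, pid)
		}
	}
	return out
}

func main() {
	// While expanding from 2 to 4 partitions, partition_num still reads 2,
	// so RoomID 5 is sent to partitions 1 and 3 (both congruent to 1 mod 2).
	fmt.Println(targetPartitions(5, 2, 4)) // [1 3]
	// After partition_num is bumped to 4, only partition 1 matches.
	fmt.Println(targetPartitions(5, 4, 4)) // [1]
}
```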






Gateway Message data consistency:


























// 63 ------------------------ 48 47 ---------------- 38 37 ------------------ 0
// | 16-bit Gateway Instance ID | 10-bit Reserved      | 38-bit Sequence Number |
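A minimal Go sketch of packing a GatewayMsgID with this layout; reading the low 38 bits as an incrementing sequence number follows the comment above, and the helper is an illustrative assumption.

```go
// Sketch: composing a 64-bit GatewayMsgID from instance ID and sequence.
package main

import "fmt"

func makeGatewayMsgID(instanceID uint16, seq uint64) uint64 {
	// bits 63-48: instance ID; bits 47-38: reserved (zero); bits 37-0: sequence
	return uint64(instanceID)<<48 | (seq & ((1 << 38) - 1))
}

func main() {
	id := makeGatewayMsgID(7, 123456)
	fmt.Printf("instance=%d seq=%d id=%#x\n", id>>48, id&((1<<38)-1), id)
}
```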


7.3 The Router









Router details are as follows:





  • 1) The Router loads its configuration and obtains the ID of its Partition (assume it is 3);
  • 2) Register itself under the Registry path /pubsub/router/partition3, set its status to Init, and take the ID returned by the Registry as its own ID (replicaID);
  • 3) After registration, Gateway Messages and HeartBeat messages sent by Gateways and Brokers start to arrive; cache them in the MessageQueue;
  • 4) Obtain all replicas of its own Partition from the Registry path /pubsub/router/partition3;
  • 5) Obtain the current valid Router Partition Number from the Registry path /pubsub/router/partition_num;
  • 6) Start a thread that watches the Registry path /pubsub/router to obtain the following information in real time: a change of the Router Partition Number -> new replicas within the Partition (replica expansion has occurred within the Partition); a replica within the Partition going down;
  • 7) Load data from the Database;
  • 8) Start a thread that asynchronously processes the Gateway Messages in the MessageQueue: forward each Gateway Message to the other peer replicas in the same Partition, and then forward it to the Brokers in the BrokerList according to the rule [RoomID % BrokerPartitionNumber == BrokerPartitionID % BrokerPartitionNumber]; this thread also processes heartbeat packets from Brokers, stores each Broker's information in a local BrokerList, and sends response packets back to the Brokers;
  • 9) Change the status of its own node under the Registry path /pubsub/router/partition3 to Running;
  • 10) Start a thread that periodically reads the values of every sub-path under the Registry path /pubsub/router, observing changes of each Router Partition with a scheduled-polling strategy, as a supplement to the real-time strategy; it also checks for timed-out Brokers and removes them from the BrokerList;
  • 11) When the RouterPartitionNum is doubled, the Router cleans its routing information cache according to the rule [RoomID % RouterPartitionNumber == RouterPartitionID % RouterPartitionNumber];
  • 12) The Router stores locally the largest GatewayMsgID seen from each Gateway. If a Gateway Message with a smaller GatewayMsgID is received, it can be discarded without processing; otherwise the stored GatewayMsgID is updated and the message is processed according to the logic above (a minimal sketch follows this list).
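A minimal Go sketch of the per-Gateway watermark in step 12; the type and function names are illustrative assumptions. Because GatewayMsgIDs from one Gateway increase monotonically, keeping only the largest ID seen lets the Router (and, in 7.4, the Broker) drop stale Gateway Messages cheaply.

```go
// Sketch: dropping Gateway Messages older than the latest seen per Gateway.
package main

import "fmt"

type GatewaySeqFilter struct {
	maxSeen map[uint16]uint64 // Gateway instance ID -> largest GatewayMsgID seen
}

// Accept reports whether a Gateway Message should be processed, and advances
// the per-Gateway watermark when it is.
func (f *GatewaySeqFilter) Accept(gatewayID uint16, msgID uint64) bool {
	if msgID < f.maxSeen[gatewayID] {
		return false // stale, discard without processing
	}
	f.maxSeen[gatewayID] = msgID
	return true
}

func main() {
	f := &GatewaySeqFilter{maxSeen: make(map[uint16]uint64)}
	fmt.Println(f.Accept(7, 100), f.Accept(7, 99), f.Accept(7, 101)) // true false true
}
```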

















7.4 The Broker



The Broker process is as follows:





  • 1) The Broker loads its configuration to get the ID of its own Partition (assume it is 3);
  • 2) Register itself under the Registry path /pubsub/broker/partition3, set its status to Init, and take the ID returned by the Registry as its own ID (replicaID);
  • 3) Obtain the current valid Router Partition Number from the Registry path /pubsub/router/partition_num;
  • 4) Obtain every replica of every Router Partition from the Registry path /pubsub/router/partition(X);
  • 5) Start a thread that watches the Registry path /pubsub/router to obtain the following information in real time: a change of the Router Partition Number -> a new Router Partition (capacity expansion has occurred); a new replica within a Partition (replica expansion has occurred within the Partition); a replica within a Partition going down;
  • 6) According to the rule [RouterPartitionID = BrokerPartitionID % RouterPartitionNum, RouterReplicaID = BrokerReplicaID % RouterPartitionReplicaNum], select one Router replica of the target Router Partition and send heartbeat messages to it, carrying BrokerPartitionNum, BrokerPartitionID, BrokerHostAddr and a Timestamp with second precision; wait asynchronously for the Router replicas to reply, and put all Gateway Messages forwarded by the Router into the GatewayMessageQueue;
  • 7) Load from the Database the data for which it is responsible according to the rule [BrokerPartitionID == RoomID % BrokerPartitionNum];
  • 8) Process the Gateway Messages in the GatewayMessageQueue asynchronously, keeping only the data that satisfies the rule [BrokerPartitionID % BrokerPartitionNum == RoomID % BrokerPartitionNum];
  • 9) Change the status of its own node under the Registry path /pubsub/broker/partition3 to Running;
  • 10) Start a thread that periodically reads the values of every sub-path under the Registry path /pubsub/router, observing changes of each Router Partition with a scheduled-polling strategy, as a supplement to the real-time strategy; it also periodically checks for timed-out Routers, replaces a timed-out Router with another Router replica of the same Partition, and keeps sending heartbeat packets periodically;
  • 11) When the value BrokerPartitionNum of the Registry path /pubsub/broker/partition_num changes, clean each entry in the local routing information cache according to the rule [PartitionID == RoomID % PartitionNum];
  • 12) On receiving a Room Message sent by the Proxy, look up in the routing information cache, by RoomID, all Gateways on which members of the Room are logged in, and forward the Message to those Gateways;
  • 13) The Broker stores locally the largest GatewayMsgID seen from each Gateway. If a Gateway Message with a smaller GatewayMsgID is received, it can be discarded; otherwise the stored GatewayMsgID is updated and the message is processed according to the logic above.





























8. Offline message processing


8.1 Overview









If the system considers offline messaging, consider the following factors:





  • 1) Message persistence: ensure that when users come back online they receive the messages sent while they were offline;
  • 2) Message ordering: offline and online messages are delivered within the same messaging system, and each message is assigned an ID to establish its order; the later the message, the larger its ID.









The new message architecture is shown below:










Explanation of system terms:





  • 1) Pi: the message ID storage module, which stores, for each user, the ordered, increasing set of IDs of messages not yet delivered;
  • 2) Xiu: the message storage KV module, which stores every user's messages and assigns an ID to each message, with the ID as the key and the message as the value;
  • 3) Gateway Message(HB): user login and logout messages, including APP Heartbeat messages.

















8.2 Xiu








8.2.1 Storing messages



The parameter list for storing Message requests is {SnowflakeID, UIN, Message}, and the process is as follows:





  • 1) Receive the message sent by the client and obtain the message receiver ID (UIN) and SnowflakeID assigned by the client to the message;
  • 2) Check whether the condition UIN % Xiu_Partition_Num == Xiu_Partition_ID % Xiu_Partition_Num holds (that is, whether the current Xiu is responsible for the recipient's messages); if not, return an error and exit;
  • 3) Check whether a message corresponding to this SnowflakeID has already been stored; if so, return its message ID and exit;
  • 4) Assign a MsgID to the message: each Xiu has its own unique Xiu_Partition_ID and a Partition_Msg_ID with an initial value of 0, and MsgID = 1B[Xiu_Partition_ID] + 1B[Message Type] + 6B[++Partition_Msg_ID]; the Partition_Msg_ID is incremented by one on every allocation (a minimal sketch follows this list);
  • 5) Store the message in a shared-memory-based Hashtable keyed by MsgID, together with the message's CRC32 checksum and its insert time, and record the MsgID in an LRU list. The LRU list itself is not kept in shared memory; when the process restarts, it can be rebuilt from the data in the Hashtable. When the Hashtable is full, messages are evicted from it according to the LRU list;
  • 6) Return the MsgID to the client;
  • 7) The MsgID is passed asynchronously to the message persistence thread, which reads the message from the Hashtable by MsgID, verifies its integrity against the CRC32 checksum and, if the message is complete, stores it in the local RocksDB.
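A minimal Go sketch of the MsgID layout in step 4; the MsgIDAlloc helper is an illustrative assumption.

```go
// Sketch: 1 byte Xiu_Partition_ID + 1 byte message type + 6 bytes of an
// incrementing per-partition counter, packed into a uint64.
package main

import (
	"fmt"
	"sync/atomic"
)

type MsgIDAlloc struct {
	partitionID uint8
	counter     uint64 // Partition_Msg_ID, incremented on every allocation
}

func (a *MsgIDAlloc) Next(msgType uint8) uint64 {
	seq := atomic.AddUint64(&a.counter, 1) & ((1 << 48) - 1) // low 6 bytes
	return uint64(a.partitionID)<<56 | uint64(msgType)<<48 | seq
}

func main() {
	alloc := &MsgIDAlloc{partitionID: 3}
	id := alloc.Next(1)
	fmt.Printf("partition=%d type=%d seq=%d\n", id>>56, (id>>48)&0xFF, id&((1<<48)-1))
}
```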


8.2.2 Reading Messages



The parameter list of the message request is {UIN, MsgIDList}, and its process is as follows:





  • 1) Take the requested MsgIDList and check, for each MsgID, whether the condition MsgID{Xiu_Partition_ID} == Xiu_Partition_ID holds; if not, return an error and exit;
  • 2) Get the message corresponding to each MsgID from the Hashtable;
  • 3) If a message is not found in the Hashtable, read it from RocksDB by its MsgID;
  • 4) After reading, all the obtained messages are returned to the client.


8.2.3 Synchronizing primary and secondary Data





















The data synchronization process is as follows:





  • 1) Followers periodically send heartbeat messages to the leader; each heartbeat carries the ID of the latest message stored locally (follower_latest_msg_id);
  • 2) The leader runs a data synchronization thread that handles the followers' heartbeats. It looks up, in the LRU list, the IDs of the N messages following follower_latest_msg_id; if they are found, it reads those messages and synchronizes them to the follower; if not, the message gap between the follower and the leader is too large;
  • 3) The follower receives the latest batch of messages from the leader and stores them;
  • 4) If the follower receives a "gap too large" response from the leader, it requests a full synchronization of the leader's persisted RocksDB data and, after that finishes, starts the incremental synchronization process with the leader again.














8.2.4 Cluster Expansion



Xiu cluster expansion uses the doubling method. After expansion, the workflow of a node in a newly added partition is as follows:





  • 1) Set the state of its node under the Registry path /pubsub/xiu/partition_id to Running, registering its external service address at the same time;
  • 2) A separate tool is then run: once all replicas of all newly started partitions are in the Running state after the horizontal expansion, it changes the value of the Registry path /pubsub/xiu/partition_num to the number of partitions after expansion, e.g. from 2 to 4 as in the earlier example.





8.3 Pi








8.3.1 Storing message IDs



The parameter list of a store-MsgID request is {UIN, MsgID}; the Pi workflow is as follows:





  • 1) Check whether the condition UIN % Pi_Partition_Num == Pi_Partition_ID % Pi_Partition_Num holds; if not, return an error and exit;
  • 2) Insert the MsgID into the UIN's MsgIDList, keeping all MsgIDs in the list unique and in increasing order (a minimal sketch follows this list), write the request content to the local log, and return a success response to the requester.
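A minimal Go sketch of the ordered, duplicate-free insertion in step 2; insertMsgID is an illustrative assumption.

```go
// Sketch: inserting a MsgID into a sorted, duplicate-free MsgIDList.
package main

import (
	"fmt"
	"sort"
)

// insertMsgID returns the list with id inserted in order; duplicates are ignored.
func insertMsgID(list []uint64, id uint64) []uint64 {
	i := sort.Search(len(list), func(j int) bool { return list[j] >= id })
	if i < len(list) && list[i] == id {
		return list // already present
	}
	list = append(list, 0)
	copy(list[i+1:], list[i:])
	list[i] = id
	return list
}

func main() {
	l := []uint64{5, 9}
	l = insertMsgID(l, 7)
	l = insertMsgID(l, 7) // duplicate, no effect
	fmt.Println(l)        // [5 7 9]
}
```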








8.3.2 Reading the Message ID List









The parameter list of a read request is {UIN, StartMsgID, MsgIDNum, ExpireFlag}, and the process is as follows:





  • 1) Check whether the condition UIN % Pi_Partition_Num == Pi_Partition_ID % Pi_Partition_Num holds; if not, return an error and exit;
  • 2) Obtain the MsgIDs in the range [StartMsgID, StartMsgID + MsgIDNum] and return the result to the client (a minimal sketch follows this list);
  • 3) If ExpireFlag is set, delete all MsgIDs in the range [0, StartMsgID] from the MsgIDList and write the request content to the local log.
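A minimal Go sketch of steps 2 and 3, assuming MsgIDNum is a count of IDs to return starting from StartMsgID (the request fields follow the requests shown in section 8.5); the helper is an illustrative assumption.

```go
// Sketch: reading a batch of MsgIDs and optionally expiring everything up to StartMsgID.
package main

import "fmt"

func readMsgIDs(list []uint64, startMsgID, msgIDNum uint64, expire bool) (result, remaining []uint64) {
	for _, id := range list {
		if id >= startMsgID && uint64(len(result)) < msgIDNum {
			result = append(result, id)
		}
		if !expire || id > startMsgID { // expire drops every ID in [0, StartMsgID]
			remaining = append(remaining, id)
		}
	}
	return
}

func main() {
	list := []uint64{3, 5, 8, 13}
	got, left := readMsgIDs(list, 5, 2, true) // read 2 IDs from 5, expire [0, 5]
	fmt.Println(got, left)                    // [5 8] [8 13]
}
```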


8.3.3 Synchronizing primary and secondary Data





















The data synchronization process is as follows:





  • 1) The follower periodically sends heartbeat messages to the leader, carrying the latest local LogID;
  • 2) The leader runs a data synchronization thread that handles the followers' heartbeats and ships the subsequent logs based on the LogID reported by each follower;
  • 3) The follower obtains the latest batch of logs from the leader, stores them and then replays them.








8.3.4 Expanding a Cluster



Pi cluster expansion also uses the doubling method. The startup workflow of a new node is as follows:





  • 1) Register with the Registry and obtain the PartitionNumber from the Registry path /pubsub/xiu/partition_num;
  • 2) If its own PartitionID satisfies the condition PartitionID >= PartitionNumber, the current Partition is a newly added one created by the expansion, and it updates its status in the Registry to Start;
  • 3) Read the leaders of all Partitions under the Registry path /pubsub/xiu and, according to the condition [its own PartitionID % PartitionNumber == old PartitionID % PartitionNumber], find the leader of the corresponding old Partition, referred to below as parent_leader;
  • 4) Cache the user requests forwarded by the Proxy;
  • 5) Obtain the log from parent_leader;
  • 6) Synchronize the in-memory data from parent_leader;
  • 7) Replay the parent_leader log;
  • 8) Update its status in the Registry to Running;
  • 9) Replay the cached user requests;
  • 10) When the value PartitionNumber of the Registry path /pubsub/xiu/partition_num no longer satisfies the condition PartitionID >= PartitionNumber, the expansion is complete; from then on, responses are returned to the user when user requests are processed.











8.4. Data sending process









The detailed process is as follows:





  • 1) The Client assigns a SnowflakeID to the message using the snowflake algorithm, and sends the message to a Proxy according to the rule [ProxyID = UIN % ProxyNum];
  • 2) Proxy receives the message and forwards it to Xiu;
  • 3) After receiving the response returned by Xiu, Proxy forwards the response to Client;
  • 4) If the Proxy receives the response returned by the Xiu with MsgID, it initiates the Pi write process and synchronizes the MsgID to Pi.
  • 5) If the Proxy receives the response returned by Xiu with an MsgID, it sends a Notify to the Broker to inform it of the latest MsgID of a UIN.


8.5 Data forwarding process









Under the PiXiu architecture, Brokers receive the following types of messages:





  • 1) User login message;
  • 2) User heartbeat message;
  • 3) User logout message;
  • 4) Notify message;
  • 5) Ack message.









User login message flow is as follows:





  • 1) Check the current status of the user. If it is OffLine, set its status value to OnLine.
  • 2) Check whether the queue for sending messages is empty. If not, exit;
  • 3) Send the request {UIN: UIN, StartMsgID: 0, MsgIDNum: N, ExpireFlag: false} to the Pi module to obtain N message IDs, set the user status to GettingMsgIDList, and wait for a response;
  • 4) According to the message ID list returned by Pi, send the read-message request {UIN: UIN, MsgIDList: MsgIDList} to Xiu, set the user status to GettingMsgList, and wait for a response;
  • 5) After Xiu returns the message list, set the user status to SendingMsg and forward the messages to the Gateway (the user delivery states are sketched after this list).
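A minimal Go sketch of the per-user delivery states that the Broker moves through in the login, Notify and Ack flows; the enum is an illustrative assumption, covering only the states named in the text.

```go
// Sketch: user delivery states used by the Broker's message-pulling flows.
package main

import "fmt"

type UserState int

const (
	OffLine UserState = iota
	OnLine
	GettingMsgIDList // waiting for Pi to return the undelivered message IDs
	GettingMsgList   // waiting for Xiu to return the message bodies
	SendingMsg       // forwarding the fetched messages to the Gateway
)

func (s UserState) String() string {
	return [...]string{"OffLine", "OnLine", "GettingMsgIDList", "GettingMsgList", "SendingMsg"}[s]
}

func main() {
	s := OffLine
	for _, next := range []UserState{OnLine, GettingMsgIDList, GettingMsgList, SendingMsg} {
		fmt.Printf("%v -> %v\n", s, next)
		s = next
	}
}
```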









A Gateway user logout message is generated in three scenarios:





  • 1) Users voluntarily quit;
  • 2) User heartbeat times out.
  • 3) A network error occurs when forwarding messages to users.



The process of user logout messages is as follows:





  • 1) Check the user status; if it is already OffLine, exit;
  • 2) Otherwise set the user status to OffLine, take the ID of the last message in the message list (LastMsgID), send the MsgID request {UIN: UIN, StartMsgID: LastMsgID, MsgIDNum: 0, ExpireFlag: true} to Pi, and exit after Pi returns a response.



The process for processing Notify messages sent by the Proxy is as follows:





  • 1) If the user is in OffLine state, exit.
  • 2) Update the user's latest message ID (LatestMsgID); if the message sending queue is not empty, exit;
  • 3) Send the request {UIN: UIN, StartMsgID: 0, MsgIDNum: N, ExpireFlag: false} to the Pi module to obtain N message IDs, set the user status to GettingMsgIDList, and wait for a response;
  • 4) According to the message ID list returned by Pi, send the read-message request {UIN: UIN, MsgIDList: MsgIDList} to Xiu, set the user status to GettingMsgList, and wait for a response;
  • 5) After Xiu returns the message list, set the user status to SendingMsg and forward the messages to the Gateway.









The Ack message processing flow is as follows:





  • 1) If the user is in the OffLine state, exit;
  • 2) Update LatestAckMsgID;
  • 3) If the message sending queue is not empty, send the next message and exit;
  • 4) If LatestAckMsgID >= LatestMsgID, exit;
  • 5) Send the request {UIN: UIN, StartMsgID: 0, MsgIDNum: N, ExpireFlag: false} to the Pi module to obtain N message IDs, set the user status to GettingMsgIDList, and wait for a response;
  • 6) According to the message ID list returned by Pi, send the read-message request {UIN: UIN, MsgIDList: MsgIDList} to Xiu, set the user status to GettingMsgList, and wait for a response;
  • 7) After Xiu returns the message list, set the user status to SendingMsg and forward the messages to the Gateway.





9. Summary of this article



This group chat messaging system still has the following task list to be improved:





  • 1) Messages are transmitted over UDP links, which is unreliable [resolved as of 2018/01/29];
  • 2) The current load-balancing algorithm is a minimalist RoundRobin; a weighted algorithm based on success rate and latency could be added;
  • 3) This function can be implemented based on the message ID;
  • 4) There is no heartbeat scheme between modules, so the stability of the whole system depends on the Registry [resolved as of 2018/01/17];
  • 5) Offline message processing [resolved as of 2018/03/03];
  • 6) Message prioritization.









Reference documents:
A replica-based capacity expansion scheme: free capacity planning, no data migration, and no changes to the routing code