Preface
In the Internet age, messaging services have become must-have products. Apps such as WeChat, DingTalk, and QQ have IM as their core function. Another scenario is user chat interaction in live broadcast rooms, delivered in the form of a familiar IM message stream; common message flows in a live room include bullet comments (barrage), interactive chat, signaling, doodles, and message push. Messaging services are especially important in the online education industry — interactive answering, doodles, on-stage performances, and likes during live teaching — where messages must be highly reliable and latency-sensitive.
Message service
When people think of messaging services, WeChat and similar IM products come to mind, so live chat messages are often likened to group chat, and 1v1 messages to single chat.
Most people, when they hear about messaging services, assume they are simple and that there are only two kinds of message: single chat and group chat. A single chat message is forwarded to one user; a group chat message is broadcast to all the users in the chat room.
What are some of the challenges that messaging services face?
Business scenario analysis
In view of the difficulties above, let's work through how to build a live broadcast message system.
Problems faced
The main scenario of the initial business is group chat messages, plus a small share of single chat messages. Since this is an education scenario, the business divides chat rooms by class; assume each chat room holds 500 people.
Problem 1: User maintenance
User maintenance for group chat in a live scenario differs greatly from ordinary group chat such as WeChat. WeChat group membership is relatively fixed: users enter and leave a group at low frequency, and the user set is stable. In a live broadcast room, users enter and leave very frequently, and the room itself is time-limited. The actual peak QPS of entering and leaving a live room does not exceed 10,000, so Redis can handle both storage of the chatroom user list and cleanup of expired entries.
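The Redis-backed approach can be sketched in a few lines. The class below is an in-memory stand-in for illustration only — in production the same behavior maps to one Redis set per room (SADD/SREM) plus an EXPIRE on the room key; class names, the TTL value, and the lazy-cleanup rule are assumptions:

```python
import time

class RoomMembership:
    """In-memory sketch of the chatroom membership pattern.
    In production this maps to a Redis set per room (SADD/SREM)
    with an EXPIRE on the key, so finished classes clean themselves up."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.rooms = {}  # room_id -> (set of user_ids, expire_at)

    def enter(self, room_id: str, user_id: str, now: float = None) -> None:
        now = time.time() if now is None else now
        users, _ = self.rooms.get(room_id, (set(), 0.0))
        users.add(user_id)
        # every touch refreshes the room's expiry, like re-issuing EXPIRE
        self.rooms[room_id] = (users, now + self.ttl)

    def leave(self, room_id: str, user_id: str) -> None:
        if room_id in self.rooms:
            self.rooms[room_id][0].discard(user_id)

    def members(self, room_id: str, now: float = None) -> set:
        now = time.time() if now is None else now
        entry = self.rooms.get(room_id)
        if entry is None or entry[1] <= now:
            self.rooms.pop(room_id, None)  # lazy cleanup of expired rooms
            return set()
        return set(entry[0])
```

Refreshing the expiry on every enter means an active room never expires, while an abandoned room is dropped after the TTL.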
Problem 2: Message forwarding
When all users in a 500-person chat room send messages at the same time, the message forwarding QPS is 500 × 500 = 250,000. From the perspective of the live broadcast client:
- Real-time performance: if the message service applies peak shaving, the backlog of peak messages increases message delay, and some signaling messages are time-sensitive, so too large a delay hurts the user experience and real-time interaction.
- User experience: a screen generally displays no more than 10-20 chat and signaling messages from all users. If more than 20 messages arrive per second, the screen scrolls non-stop, and the flood of messages also puts sustained high load on the client.
So we assign different priorities to messages: high-priority messages are forwarded first and never discarded, while low-priority messages are forwarded according to a throttling policy.
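A minimal sketch of this priority policy — the queue capacity and the drop rule are illustrative assumptions, not the production values:

```python
from collections import deque

HIGH, LOW = 0, 1

class PriorityForwarder:
    """High-priority messages (e.g. signaling) are always queued and
    drained first; low-priority messages (e.g. chat in a busy room)
    are dropped once the low-priority backlog exceeds a threshold."""

    def __init__(self, low_capacity: int = 100):
        self.high = deque()
        self.low = deque()
        self.low_capacity = low_capacity
        self.dropped = 0

    def submit(self, priority: int, msg: str) -> bool:
        if priority == HIGH:
            self.high.append(msg)      # never discarded
            return True
        if len(self.low) >= self.low_capacity:
            self.dropped += 1          # shed load under pressure
            return False
        self.low.append(msg)
        return True

    def next_message(self):
        # high-priority queue is always drained first
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```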
Problem 3: Historical messages
The business needs to generate playback videos and retrieve historical signaling and interactive chat messages, which requires writing historical messages quickly so as not to compromise the timeliness of message forwarding.
Message persistence generally uses either write diffusion or read diffusion. We use read diffusion, which reduces storage space and also shortens message retention time. Given the low priority of playback, we chose Pika as the storage component: Pika exposes a Redis-like interface, which lowers learning and development costs, and because writes are append-only, its write performance is comparable to Redis.
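Read diffusion can be illustrated with a toy store: each message is written exactly once under its chat room key, and every reader fetches the same list — rather than each user holding a private copy, as in write diffusion. In production the list would live in Pika (e.g. via RPUSH/LRANGE); the class below is only an in-memory sketch of the idea:

```python
class ReadDiffusionStore:
    """Read diffusion: one append-only message list per chat room,
    shared by all readers of that room."""

    def __init__(self):
        self.lists = {}  # room_id -> list of messages (append-only)

    def append(self, room_id: str, message: str) -> None:
        # a single write per message, regardless of reader count
        self.lists.setdefault(room_id, []).append(message)

    def history(self, room_id: str, start: int = 0, count: int = -1):
        # all readers (playback included) scan the same list
        msgs = self.lists.get(room_id, [])
        return msgs[start:] if count < 0 else msgs[start:start + count]
```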
Problem 4: Message order
Signaling messages must be ordered: the order in which users in the same chat room receive messages must match the order in which the same sender sent them.
Message ordering can be ensured with queues such as Kafka, but Kafka introduces some latency. To reduce latency, we instead use a consistent hash strategy for message forwarding, discussed in more detail later.
Design goals
Build a stable and efficient messaging server:
- Provide highly reliable, highly stable, high-performance long-connection services;
- Support millions of concurrent online long connections;
- Support rapid multi-cluster deployment and hot capacity expansion;
Service architecture
The server architecture went through several phases, from rapid business implementation in the early days to growing business volume later on.
The 1.0 architecture
Service overview

| Service | Description |
| --- | --- |
| DispatchServer | Scheduling service. Provides HTTP/HTTPS interfaces; the client calls it to obtain the address of the Access service for its TCP persistent connection |
| AccessServer | Access service. Uses Linux epoll in asynchronous, non-blocking mode for high-performance handling of client connections and all kinds of message forwarding |
| AuthServer | Authentication service. Verifies user names and passwords |
| MessageServer | Message service. Message forwarding, routing information maintenance, chat room information maintenance, message persistence |
| MsgFilterServer | Filtering service. Filters out sensitive words, low-priority messages, and blacklisted users |
| HttpPushServer | Push service. Provides an API for the business side; callers can push messages, including single chat, group chat, and other configurations |
| StorageServer | Storage service. Stores user events and writes them to Kafka |
| StatServer | Statistics service. Collects load information for each Access service, including user counts and machine load |
Service design
Taking everything into account, the services are divided into two parts: access services and business logic services. The core services are AccessServer and MessageServer; their interaction is roughly as follows:
From the figure above we can see that not all users of a chat room are on the same access service, so when a message is forwarded in a 500-person chat room, the server-side QPS is 500 × 500 = 250,000, and the delivery fans out across the access services involved.
AccessServer
The access service maintains TCP persistent connections with clients. The primary goal of AccessServer is to handle network I/O in an asynchronous, non-blocking manner for high concurrency. It also maintains the chatroom-to-user mapping in memory, caches chatroom-related information, and parses the protocol packets exchanged with clients. AccessServer processes two main categories of messages:
1. Upstream messages: when the client sends a message, the request parameters are parsed and the message is delivered to a MessageServer; the MessageServer obtains routing information and delivers the message to the corresponding AccessServer for processing.
2. Downstream messages from the MessageServer: for a single chat message, the target user's TCP connection is looked up directly and the packet is sent; for a group chat message, the chat room's user list is traversed to obtain the corresponding TCP connections, and the packet is sent to each.
Maintaining the mapping between chat rooms and users on the AccessServer reduces the interaction pressure between MessageServer and AccessServer.
MessageServer
The message service is responsible for interacting with Redis and Pika: it persists messages to Pika; it updates the chatroom user list, chatroom routes (which AccessServers a room's users are distributed across), and user routes (which AccessServer a user is on) in Redis, querying Redis when necessary; and it handles the forwarding logic for user login/logout, entering and leaving chat rooms, single chat messages, group chat messages, and doodle messages.
How does MessageServer ensure the order of messages?
First, AccessServer delivers messages from the same chat room to the same MessageServer according to a consistent hash policy.
MessageServer then uses a hash strategy to forward messages from the same chat room to the same thread for processing. Let's look at the service's threading model:
Network I/O threads and business logic threads are isolated, so that blocking business logic cannot stall the network threads and back up TCP. The network threads use epoll to send and receive data for high concurrency, while the business threads focus on business logic. Different thread pools can be configured for different operations — login/logout, entering and leaving chat rooms, message sending — and messages of different priorities can be assigned to different thread groups.
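The ordering guarantee described above boils down to: hash the chat room id and always land on the same worker queue. A minimal sketch — the worker count and hash choice are illustrative:

```python
import hashlib
from collections import deque

def stable_hash(key: str) -> int:
    # md5 is stable across processes, unlike Python's built-in hash()
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class OrderedDispatcher:
    """Deliver all messages of the same chat room to the same worker
    queue, which preserves per-room message order without a global lock."""

    def __init__(self, num_workers: int):
        self.queues = [deque() for _ in range(num_workers)]

    def dispatch(self, room_id: str, message: str) -> int:
        # same room_id -> same index, every time
        idx = stable_hash(room_id) % len(self.queues)
        self.queues[idx].append(message)
        return idx
```

The same scheme applies at both hops: AccessServer → MessageServer (consistent hash over servers) and MessageServer → worker thread (hash over threads).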
Cache optimization
We know that more threads do not necessarily mean better performance, because thread context switching carries a significant overhead, and thread pools are needed to preserve message ordering. If you rely entirely on Redis, you face the following problems:
- When a new user enters a chat room, the user list must be fetched from Redis, and user login/logout must update the chatroom user list, routes, and other information in Redis; the more users a chatroom has, the greater the pressure on the system. In our business scenario a teacher joins hundreds of chat rooms, which would delay the start of class.
- Every message sent requires querying the chat room's routing information from Redis. Assuming network I/O plus thread scheduling cost 0.5 ms per query, a single thread can issue at most about 2,000 such queries per second. Doodle-type messages alone arrive at 15 to 20 per second, so once this load persists for a while it easily causes queue blocking, task timeouts, and an avalanche.
To solve these problems, we implemented a second-level (local) cache in both AccessServer and MessageServer, along with a cache eviction strategy to prevent memory overload:
MessageServer caches the chatroom user list and chatroom routes, and periodically synchronizes with Redis for cache consistency. When a user enters or leaves a chat room, the change is broadcast to the corresponding AccessServers. AccessServer also caches the chat room user list, which reduces the RPC pressure between AccessServer and MessageServer.
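A minimal sketch of the second-level cache idea: lookups hit a local dict first and fall back to Redis (modeled here as a plain dict) only on a miss, with a crude size bound for eviction and a periodic consistency pass. The eviction rule and names are illustrative assumptions, not the production design:

```python
class LocalRouteCache:
    """Second-level cache sketch: local reads avoid a network round
    trip; sync() models the periodic reconciliation with Redis."""

    def __init__(self, backend: dict, max_entries: int = 10000):
        self.backend = backend   # stands in for Redis
        self.cache = {}          # room_id -> route
        self.max_entries = max_entries
        self.backend_reads = 0

    def get_route(self, room_id: str):
        if room_id in self.cache:
            return self.cache[room_id]  # local hit, no RPC
        self.backend_reads += 1
        route = self.backend.get(room_id)
        if route is not None:
            if len(self.cache) >= self.max_entries:
                # evict the oldest inserted entry to cap memory
                self.cache.pop(next(iter(self.cache)))
            self.cache[room_id] = route
        return route

    def sync(self):
        # periodic consistency pass: refresh cached entries from backend
        for room_id in list(self.cache):
            route = self.backend.get(room_id)
            if route is None:
                del self.cache[room_id]
            else:
                self.cache[room_id] = route
```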
Cluster management
The services described in the previous section form a cluster; different clusters currently do not communicate with each other. Cluster management exists for business isolation: different business parties have different performance requirements for the message service, and isolation minimizes the impact of one overloaded business on the normal use of the others. It also lets us build clusters with different carrying capacities for different service loads, improving resource utilization.
Multi-cluster management relies on the DispatchServer scheduling service. A client needs the IP address and port to establish a TCP persistent connection with the server; before connecting, it requests the scheduling service over HTTP, and the scheduling service assigns an access point IP and port based on the configured policy. As shown below:
The 2.0 architecture
As business volume grew and business scenarios diversified, the disadvantages of the 1.0 architecture were gradually exposed:
- Messages with different frequencies compete for resources and interfere with each other — for example, doodle messages versus signaling messages.
- Horizontally scaling MessageServer cannot fundamentally solve the interference between message types, and it lowers resource utilization.
Service split
To solve the above problems, we split MessageServer into three services:
MessageServer: handles chatroom logic, including entering and leaving chat rooms, group chat messages, chat room cache management, and cache synchronization broadcast.
BinMsgServer: handles doodle message logic.
PeerMsgServer: handles single chat logic, including user online/offline and single chat message forwarding.
After the service split, the state synchronization and invocation relationships between services also change:
After splitting, each service can be scaled out according to its own message concurrency, improving machine resource utilization. AccessServer and HttpPushServer deliver different message types to the different services for processing.
Cache updates
After the architecture upgrade, the cache policy was adjusted accordingly: MessageServer needs to synchronize routing information to BinMsgServer, and BinMsgServer must maintain cache consistency. The new cache policy:
To reduce unnecessary RPC calls, some avoidance strategies are adopted during cache synchronization:
First, instead of synchronizing the cache to BinMsgServer every time a user enters or leaves a chat room, MessageServer synchronizes only when the chat room's route status changes.
Second, MessageServer does not synchronize a chat room's routing information to all BinMsgServers. As described earlier, AccessServer uses a consistent hash policy to deliver messages of the same chat room to the same server; MessageServer therefore uses the same consistent hash policy to synchronize chatroom routes only to the corresponding BinMsgServer.
Together these two steps significantly reduce the RPC calls between MessageServer and BinMsgServer, freeing resources for message forwarding.
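The shared consistent hash can be sketched as a small ring with virtual nodes. Because the message delivery path and the route-sync path construct the ring from the same server list, they necessarily agree on which BinMsgServer owns a given chat room. The replica count and hash function are illustrative choices:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # a stable hash so every process maps keys identically
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent hash ring with virtual nodes: each server is
    placed on the ring `replicas` times, and a key is owned by the
    first node clockwise from the key's hash."""

    def __init__(self, nodes, replicas: int = 100):
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(replicas)
        )
        self.keys = [h for h, _ in self.ring]

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self.keys, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Virtual nodes spread load more evenly, and when a server is added or removed only the keys adjacent to its ring positions move.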
The 3.0 architecture
A messaging service cannot be limited to the live chat scenario; it needs to support more business types, such as IM, push, and transparent passthrough. Bolting these functions onto the current services would not only have a large impact on them but also increase maintenance costs later on.
Different businesses have different requirements. Live chat and IM are similar but differ in message requirements: IM has higher requirements for message persistence and consistency, but lower requirements for latency. Push and live chat differ in connection timing: live chat only needs a connection when the user enters the live broadcast room, while push requires a connection as soon as the app starts.
From the server's point of view, building a separate set of access services for each business wastes resources and increases maintenance costs.
From the client's point of view, establishing a separate TCP persistent connection per business adds client overhead, and on mobile devices increases power consumption.
In order to cope with the rapid change of business, the 3.0 architecture came into being:
The 3.0 architecture requires a coordinated SDK upgrade.
TcpProxyServer
3.0 adds the TcpProxyServer service. It can be understood as a layer-7 proxy, except the protocol is not HTTP but a custom one.
To carry multiple services over one TCP connection, we abstract the concept of a Session; currently supported sessions include Chat (live chat), IM, Push, and transparent passthrough.
To support new services quickly, TcpProxyServer is designed with dynamically configurable forwarding routes: onboarding a new service requires no development, only a configuration file change.
In 2.0 the client established its TCP connection directly with AccessServer; in 3.0 it connects to TcpProxyServer, which forwards requests to AccessServer over RPC.
TcpProxyServer can dynamically configure the policy that controls how requests are delivered to back-end services, including round-robin, hash, and consistent hash policies.
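A sketch of configuration-driven forwarding — the table contents and backend names are made up for illustration; only the shape (session type → backend pool → policy) follows the design described above:

```python
import itertools

# Illustrative forwarding table: adding a new session type is a config
# change, not a code change. Session names follow the ones mentioned
# above (Chat, IM, Push); backend names are hypothetical.
FORWARD_CONFIG = {
    "chat": {"backends": ["access-1", "access-2"], "policy": "round_robin"},
    "im":   {"backends": ["im-1", "im-2"],         "policy": "hash"},
    "push": {"backends": ["push-1"],               "policy": "round_robin"},
}

class SessionRouter:
    """Sketch of TcpProxyServer's configurable forwarding: the session
    type selects a backend pool, the configured policy selects the
    concrete backend."""

    def __init__(self, config: dict):
        self.config = config
        self.counters = {s: itertools.count() for s in config}

    def route(self, session: str, key: str) -> str:
        entry = self.config[session]
        backends = entry["backends"]
        if entry["policy"] == "round_robin":
            return backends[next(self.counters[session]) % len(backends)]
        # "hash": pin the same key (e.g. a user id) to the same backend;
        # byte-sum is a stable toy hash, stands in for a real hash function
        return backends[sum(key.encode()) % len(backends)]
```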
With client iteration and upgrades, users on SDK version V3 now account for about 70%.
Future plans
Connection migration
If an AccessServer is overloaded or restarts abnormally, users must re-log-in and re-enter their chat rooms, and a large number of enter/leave messages are broadcast, causing unnecessary work for both server and client. We are currently developing state migration: AccessServer will save each user's state and synchronize it in real time so it can be recovered. When accessServer-1 is overloaded or restarted, its user state is restored on accessServer-2 and message forwarding proceeds normally; the client notices nothing, which improves the user experience.
QUIC
The message system now performs well in stability and scalability, but our student users are spread across the country, with widely varying network conditions. Constrained by the TCP protocol stack and the operating system, it is difficult to further improve message real-time performance over TCP under weak-network conditions.
TCP suffers from head-of-line blocking, which causes message delays in weak-network, high-packet-loss environments. QUIC, built on UDP, avoids this problem.