In the era of Internet+, message volume has grown dramatically and message forms have diversified, posing great challenges to instant messaging (IM) cloud service platforms. What architecture and features lie behind a highly concurrent IM system?
The following content is compiled from material shared by the chief architect of NetEase Yunxin.
Related Reading Recommendations
IM Push Guarantee and Network Optimization (1): How to implement background keep-alive without affecting the user experience
IM Push Guarantee and Network Optimization (2): How to combine persistent connections with push
IM Push Guarantee and Network Optimization (3): How to optimize large data transfers in weak-network environments
Key points of this article:
- Analysis of the overall architecture of NetEase Yunxin
- Client connections and access-point management in the cloud messaging service
- Service-orientation and high availability
Analysis of the NetEase Yunxin IM Cloud layered architecture
- Client SDK layer: covers Android, iOS, Windows desktop, Web, embedded devices, and other platforms. The SDK layer uses two network protocols: TCP (layer 4) and Socket.IO (layer 7); the latter provides persistent-connection capability for the Web SDK. Besides the SDKs integrated into apps, an HTTP-based API is provided for third-party servers to call. Finally, the A/V SDK is a real-time audio/video SDK based on UDP, used for voice and video calls over the network.
- Gateway layer: provides direct client access and maintains persistent connections with clients. The Web SDK connects directly to the WebLink service, a persistent-connection service based on the Socket.IO protocol, while the Android/iOS/PC SDKs connect directly to the TCP-based Link service. A key function of the Link and WebLink services is managing the persistent connections of all clients. On the HTTP side, the gateway provides the API service and the LBS service: the LBS service helps the client SDK select the most appropriate gateway access point and optimizes network efficiency, while the API service directly serves business requests from third-party servers.
- HA layer: above the gateway access layer, which provides direct client connections, sits the HA layer, which decouples the link layer from the service layer and provides high availability and easy scaling. Concretely, while Link and WebLink maintain client connections, Yunxin provides a protocol-routing service to distribute business requests: according to predefined rules, the routing layer forwards client requests to the corresponding business nodes. When the business cluster scales out, the routing service immediately discovers the newly available nodes; when a service node is found to be abnormal, the routing layer marks it, isolates it offline, and replaces it.
- Service node cluster: behind the HA layer are the concrete business clusters, called App services, which handle specific client requests and connect directly to back-end DB, cache, and other basic services. The nodes are lightweight and stateless. In actual deployment, Yunxin spreads the cluster across network environments; for example, two sets of service nodes are deployed in dual data centers in the same city. The front-end routing layer distributes business requests; in normal operation the two sets act as hot standbys for each other and share online traffic evenly. When a single network environment or its infrastructure fails, the routing service detects the fault immediately, marks the compute nodes in that environment offline, and forwards all traffic to the healthy cluster, improving overall availability. Together with the monitoring platform and other ops tools, the real-time processing capacity and utilization of service nodes are monitored dynamically: when load reaches a preset watermark, an alarm fires immediately, and ops staff can quickly and conveniently scale out the service cluster through the automated deployment platform.
- Business layer: contains core functions such as one-to-one messages, group messages, chat rooms, and notifications; user-profile hosting and relationship management; API-oriented services such as SMS, callbacks, and dedicated-line conferencing; and related capabilities such as real-time audio/video and live streaming.
The more important features on the far right are listed separately from the services: third-party data synchronization with developer applications, customizable content auditing, super-group services, login/logout events, message roaming and cloud message history, push services, and more.
NetEase Yunxin IM Cloud deployment topology
The simplified deployment topology below gives a preliminary view of Yunxin's overall technical system. On the right is the client, which obtains the gateway access-point list from the LBS service, establishes a persistent connection with servers such as Link and WebLink, and performs RPC operations over it. All client requests are forwarded through the routing layer to the back-end App layer, which processes synchronous requests and delivers the results in real time. Some asynchronous tasks are dispatched through queue services, such as fan-out of messages to large groups, the push service, cloud message-history storage, and the third-party data CC (copy) synchronization service. The API path at the bottom is similar: the API directly serves invocation requests from third-party servers, backed by a variety of independent services such as callbacks and SMS. All API back-end business requests are logged; the logs are gathered by the log-collection platform into the big-data platform, where they are stored in HDFS as the data source for statistical analysis on one hand, and imported into data warehouses such as HBase for log retrieval and secondary analysis on the other.
Connection-layer optimization practices for a highly concurrent IM system
What about the most important service in instant messaging, connection management? A stable connection between client and server is the prerequisite for fast message delivery; it can be seen as the cornerstone of the cloud messaging service's stability. The core issues the gateway access layer must address are stability, security, and speed.
How to ensure stability?
The NetEase Yunxin SDK uses a persistent-connection mechanism, detecting disconnects via heartbeats and reconnecting automatically. Yunxin has also done a lot of optimization for weak-network environments such as mobile networks. Mobile and PC clients connect to the server over TCP, while the Web client uses the Socket.IO protocol, achieving persistent connections while handling browser compatibility issues.
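The heartbeat-based disconnect detection and automatic reconnection described above can be sketched as follows. This is a minimal illustration: the interval, timeout, and backoff values are assumptions for the example, not Yunxin's actual parameters.

```python
import time


class HeartbeatMonitor:
    """Declares the link dead when a heartbeat goes unanswered too long."""

    def __init__(self, interval=30.0, timeout=10.0):
        self.interval = interval          # seconds between heartbeat pings
        self.timeout = timeout            # max wait for the server's pong
        self.last_pong = time.monotonic()

    def on_pong(self):
        # Called whenever the server answers a heartbeat ping.
        self.last_pong = time.monotonic()

    def is_alive(self, now=None):
        # Broken once a ping has gone unanswered longer than interval + timeout.
        now = time.monotonic() if now is None else now
        return (now - self.last_pong) <= self.interval + self.timeout


def reconnect_delays(max_attempts=5, base=1.0, cap=30.0):
    """Exponential-backoff schedule for automatic reconnection attempts."""
    return [min(cap, base * (2 ** i)) for i in range(max_attempts)]
```

On a real client the monitor would run alongside the socket loop, triggering the backoff schedule whenever `is_alive()` turns false.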
How to achieve security?
Yunxin requires that all data transmitted over public networks be encrypted. Establishing a connection between the SDK and the server involves a key-negotiation process: the client first generates a one-time symmetric key and sends it to the server under asymmetric encryption; after decrypting it, the server keeps the key in the persistent connection's session state. Subsequent data traffic is encrypted with this key using a stream cipher, which effectively prevents man-in-the-middle and packet-replay attacks.
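The negotiation pattern above can be illustrated with a toy sketch. The text does not specify Yunxin's actual cipher suite, so the keystream below is a deliberately simplified stand-in (SHA-256 in counter mode), and the asymmetric leg of the exchange is elided:

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 counter mode) -- illustrative, not Yunxin's cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def stream_encrypt(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice decrypts.
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


# Negotiation sketch:
# 1. Client generates a one-time session key.
session_key = os.urandom(32)
# 2. In the real protocol this key travels to the server under asymmetric
#    encryption (e.g. the server's public key) -- elided here.
# 3. Both sides then encrypt traffic with the shared session key.
packet = stream_encrypt(session_key, b"first authentication packet")
assert stream_encrypt(session_key, packet) == b"first authentication packet"
```

The key property shown is symmetry: the same stream operation encrypts and decrypts, which is why the server only needs the one-time key stored in the session.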
How to ensure speed?
The first factor is gateway access-point selection: the LBS service helps the client find the most suitable access point, for example by using information such as the client's IP to estimate the physically nearest node. Second, once a connection is established, the persistent-connection mechanism greatly improves both uplink and downlink message speed, and during transmission Yunxin compresses packets to reduce network overhead and speed up sending and receiving. For mobile scenarios such as frequent foreground/background switching and re-login, the SDK provides automatic login and reconnection, so the message channel is already established while the UI is still starting up. In the access-gateway selection strategy, connections are attempted in parallel to speed up connection establishment.
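The parallel selection strategy mentioned above amounts to racing several candidate access points and keeping the first that connects. A minimal sketch, where `probe` is a hypothetical stand-in for the SDK's real connection attempt:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def connect_fastest(addresses, probe):
    """Try all candidate gateways in parallel; return the first that connects.

    `probe(addr)` should return addr on success and raise on failure.
    """
    with ThreadPoolExecutor(max_workers=len(addresses)) as pool:
        futures = [pool.submit(probe, addr) for addr in addresses]
        for fut in as_completed(futures):
            try:
                return fut.result()   # first successful connection wins
            except Exception:
                continue              # this candidate failed; keep waiting
    raise ConnectionError("all access points failed")
```

A production client would also cancel the losing attempts and close their sockets; that cleanup is omitted here for brevity.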
The following describes the process of establishing a persistent connection between the client and the server.
The SDK's first step is to request the LBS service for the list of gateway addresses it can access. The LBS service assigns addresses to the client according to various policy conditions, commonly including:
- AppKey: requests from a specific application can be directed to a specific set of access points; used for dedicated-deployment solutions.
- Client IP: assigns the geographically nearest access gateway based on the client's location; typically used for overseas nodes.
- SDK version number: points clients in a specific version range to a specific gateway; intended for compatibility during upgrades between old and new versions, though there is currently no actual use case.
- A specific environment identifier, such as an intelligent customer-service environment: points a particular type of app to a specific gateway, used to isolate larger environments.
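The allocation policies above can be modeled as an ordered rule table where the first matching rule decides the gateway list. The field names, rule contents, and domain names below are illustrative assumptions, not Yunxin's actual configuration:

```python
def assign_gateways(request, rules, default):
    """Return the gateway list of the first rule whose conditions all match.

    `request` is a dict such as {"appkey": ..., "ip_region": ..., "env": ...};
    each rule is a pair ({"field": expected_value, ...}, [gateway, ...]).
    """
    for conditions, gateways in rules:
        if all(request.get(field) == want for field, want in conditions.items()):
            return gateways
    return default


# Hypothetical rule table mirroring the four policy kinds described above.
rules = [
    ({"appkey": "dedicated-app"}, ["link-dedicated-1.example.com"]),  # dedicated deployment
    ({"ip_region": "overseas"},   ["link-overseas-1.example.com"]),   # nearest overseas node
    ({"sdk_version": "legacy"},   ["link-legacy-1.example.com"]),     # old-version compatibility
    ({"env": "smart-cs"},         ["link-smartcs-1.example.com"]),    # isolated environment
]
```

First-match semantics keep the policy predictable: a dedicated-deployment AppKey wins even if the client is also overseas.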
After obtaining the access-gateway addresses from the LBS service, the client tries to establish a connection using the addresses in the list. If it followed that order strictly, connection establishment would be slow, so in practice the SDK first connects using the locally cached address list returned by the previous LBS request, and caches the freshly returned list for next time. If every address in the list fails after repeated attempts, the default Link address is used; if the default address also fails, a 415 or 408 network error code is reported.
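The fallback order just described (cached list first, then the default address, then an error code) might be sketched like this. The 408 code comes from the text; the default address and function shape are assumptions:

```python
DEFAULT_LINK = "link-default.example.com"   # hypothetical default address


def pick_connection(cached_list, try_connect):
    """Walk the cached LBS address list, then fall back to the default link.

    `try_connect(addr)` returns True on success. Returns (address, None) on
    success, or (None, 408) when even the default address fails.
    """
    for addr in cached_list:
        if try_connect(addr):
            return addr, None
    if try_connect(DEFAULT_LINK):
        return DEFAULT_LINK, None
    return None, 408   # network error surfaced to the caller
```

In the real SDK each address would itself be retried a few times before moving on; that inner loop is folded into `try_connect` here.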
Once a target address is obtained, the client attempts to establish a TCP persistent connection. After the connection is up, it negotiates the encryption key with the server and sends the first authentication packet. Only after authentication succeeds is the persistent connection considered secure and valid; the client can then issue RPC requests, and the server can push message notifications over it. If key negotiation or authentication fails, the connection is treated as invalid and the server forcibly disconnects it.
Finally, let's talk about acceleration nodes. To connect quickly, the gateway access point nearest the client is allocated preferentially; an acceleration node is a special type of node deployed closer to the user.
The rationale for acceleration nodes is that the lines operators provide to individual users, whether mobile or wired, are always of lower quality than the networks between IDC centers. Replacing the critical path of the user's end-to-end link with inter-IDC lines helps improve connection stability and speed.
Suppose a customer in the United States accesses a gateway access point in Hangzhou over a mobile network. Because the client is on a mobile network, a direct connection to the Hangzhou server traverses very long links and unpredictable intermediate nodes (in China's case, possibly a firewall), so most direct connections may fail to establish, or disconnect frequently once established.
Yunxin therefore provides multiple layers of acceleration nodes: with them, the unpredictable parts of the user's overall link are replaced with higher-quality lines, and the user's direct connection to a nearby acceleration node is usually much better.
The following describes how different delivery modes affect message-delivery efficiency.
Question 1: How to multiply the concurrency of message delivery?
The upper part of the figure shows the point-to-point Link mode. After sender A sends a message, it is submitted to the App for processing; the App finds that receiver B's Link server is Link Y, so it sends a downlink notification packet to Link Y, which looks up the persistent connection for user B and delivers the notification to the client. In this mode all access points are equal for all users, who may connect to any server, so every message must be routed in the business layer to the Link server where the target receiver resides. For a group message, the business App must query the Link locations of all group members, a time-consuming operation whose cost keeps growing with the number of receivers. For a chat room, where membership is very large, this mode quickly hits a performance bottleneck and message-delivery latency becomes severe.
For the broadcast Link mode, Yunxin first follows a principle when allocating access points: members of the same chat room are assigned to the same group of access points as far as possible. Each Link maintains the set of persistent connections for all members of each room; instead of tracking a mapping from individual users to Links, the App maintains the set of Links assigned to each room. After any member sends a chat-room broadcast message, the message goes uplink through its Link to the App; the App only needs to look up the list of Link addresses assigned to that room and send one broadcast message to each Link, which then broadcasts the message locally to its connections. This is more than an order of magnitude more efficient than the point-to-point mode.
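The difference between the two modes boils down to what the App must look up per message: one Link per receiver in point-to-point mode, versus one fixed Link set per room in broadcast mode. A minimal sketch under assumed data structures:

```python
def fanout_point_to_point(user_links, members, message):
    """P2P mode: look up each member's Link; one downlink packet per member."""
    packets = []
    for user in members:
        link = user_links[user]           # per-user lookup; cost grows with members
        packets.append((link, user, message))
    return packets


def fanout_broadcast(room_links, room, message):
    """Broadcast mode: one packet per Link assigned to the room; each Link
    then fans the message out locally to its own connections."""
    return [(link, message) for link in room_links[room]]


# Illustrative state: three users spread over two Links, one room on both.
user_links = {"u1": "linkX", "u2": "linkY", "u3": "linkY"}
room_links = {"room1": ["linkX", "linkY"]}
```

With a 10,000-member room on a handful of Links, the App's work drops from 10,000 lookups and sends to a few, which is where the order-of-magnitude gain comes from.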
Question 2: How to solve the performance bottleneck of a single node?
Having covered the difference between point-to-point and broadcast Links, let's look at the evolution and optimization of Yunxin's Socket.IO-based WebLink proxy solution.
Before that, two key points about WebLink: first, WebLink is based on the Socket.IO protocol; second, to ensure the reliability of the data channel, Yunxin uses HTTPS to encrypt it.
Figure 1 shows the earliest solution: the back-end WebLink nodes provide connections and SSL encryption, with multiple nodes fronted by LVS. This scheme exposes only one domain name; there are actually many internal nodes, and scaling is transparent to the outside. Web clients simply connect to the single domain name, which for a single product is the most convenient approach and lets the client skip the address-allocation step. The disadvantages also center on the single entry point: if it suffers a DDoS attack, the only mitigation is rebinding the domain name, which takes time to take effect and adds operational cost. Moreover, for a multi-tenant service like Yunxin, a single entry loses flexibility: all customers connect through the same entrance, so dedicated-service and business isolation cannot be achieved, nor can the acceleration-node scheme.
Hence the second solution, which borrows the LBS allocation approach from the Link service: SSL encryption is implemented on the WebLink nodes, each of which is assigned an independent domain name, and before connecting the client asks the LBS service to allocate a suitable access point. The advantage is greater flexibility: the cluster can be expanded at any time, access-point addresses can be adjusted dynamically per application, and acceleration nodes become possible. The problem is that each node is a single point and must do SSL encoding itself; since Java SSL has a high CPU overhead, a single node's service capacity suffers under sudden traffic bursts.
Hence the third solution: Nginx serves as a layer-7 proxy on the front end, with SSL and domain-name binding configured on Nginx, while the back end shares a common set of WebLink nodes. With Nginx, port allocation is more systematic, improving operational convenience. In the end, Yunxin arrived at the combination currently in use: the front end uses the LBS service to assign access points to the SDK for flexibility, while the back end uses multiple Nginx clusters as proxy clusters, improving the capacity of each cluster group.
Service-orientation and high-availability practices for the instant messaging platform
Having covered the techniques Yunxin uses in the client access layer and access-point management to build a stable, reliable message channel for the IM service, let's turn to the work on service-orientation and high availability in the business layer.
The gateway access layer is responsible for maintaining and managing persistent connections with clients. Its nodes are stateless peers that only forward requests between client and server, with forwarding efficiency optimized; the real business-processing logic is implemented in the business layer.
The business layer handles a large volume of requests and interacts with DB, cache, queue, third-party interfaces, and other components; its stability, availability, and scalability directly affect the quality of the whole cloud service. To make the business layer more elastic, a routing layer is introduced between the gateway access layer and the business layer for decoupling. When a service node comes online, it registers itself with the service center; the routing nodes relay request packets from the gateway layer and select matching service nodes to distribute requests to. This three-tier architecture makes the overall system more resilient.
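The register-then-route flow can be sketched as a tiny in-memory service center. This is a toy model: real deployments use a dedicated registry (a ZooKeeper/etcd-style system), which the text does not specify.

```python
import itertools


class ServiceCenter:
    """Minimal registry: nodes register on startup; routing picks among healthy ones."""

    def __init__(self):
        self.nodes = {}     # node_id -> {"healthy": bool}
        self._rr = None     # lazily built round-robin iterator

    def register(self, node_id):
        # A service node registers itself when it comes online.
        self.nodes[node_id] = {"healthy": True}
        self._rr = None     # membership changed; rebuild the rotation

    def mark_down(self, node_id):
        # The routing layer isolates an abnormal node (taken offline for replacement).
        self.nodes[node_id]["healthy"] = False
        self._rr = None

    def route(self):
        # Distribute the next request to a healthy node, round-robin.
        healthy = [n for n, meta in self.nodes.items() if meta["healthy"]]
        if not healthy:
            raise RuntimeError("no healthy service nodes")
        if self._rr is None:
            self._rr = itertools.cycle(healthy)
        return next(self._rr)
```

The same `mark_down` path is what enables the failover, grayscale, and dedicated-cluster behaviors described below: routing decisions are just membership and selection rules applied per request.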
To improve availability, Yunxin distributes service nodes across different network environments, which all serve traffic simultaneously in normal operation. Once the network or infrastructure of one environment fails, the faulty cluster can be quickly taken offline through the routing layer.
Grayscale upgrades are flexibly supported: Yunxin can upgrade a subset of service nodes and, through routing-layer configuration, direct specified user traffic to the newly upgraded nodes.
Dedicated services are flexibly supported: for customers with strong demands for dedicated resources, Yunxin can use the routing layer to direct all of their applications' traffic into independent clusters.