This talk looks at current terminal encryption scenarios and compares three approaches: conference data encryption based on WebRTC, conferences in SFU mode, and conference data encryption in WebRTC SFU mode. It covers what they have in common and where they differ.
The content was presented by He Xiaomin, audio and video architect at 263, during the second half of the video conferencing roundtable.
PPT data link: https://pan.baidu.com/s/19qf9…
Extract code: T02M
Telecommuting tools are now widely used across industries, and many companies have announced telecommuting as a regular mode of collaboration. As a result, telecommuting products such as video conferencing have risen to the level of corporate strategy, and companies are paying more attention to the security of video conferencing software and platforms.
The popularity of video conferencing means that platforms must support higher concurrency, be easier to use, and allow access from mobile smart terminals. Enterprise self-built video conferencing platforms can no longer meet these demands, so enterprises are gradually becoming more accepting of public cloud products. Hard questions about the security of video conferencing products have followed.
Worldwide, on the one hand, network attacks are more widespread, targeted and destructive than before. On the other hand, the international political environment and the rise of anti-globalization have brought a crisis of trust, and sectors such as government, enterprise, the military and the police have raised their information security requirements for video conferencing.
It is a great honor to represent 263 at this event. Today I will share some of the end-to-end encryption strategies for video conferencing that 263 currently has under development and in practice.
So let’s start with 263. 263 has been engaged in enterprise Internet communications for more than 20 years, providing enterprise customers with products and services such as video conferencing, enterprise live streaming, enterprise email, cloud storage and teleconferencing. In 2015, 263 acquired Zhanshi Interactive, the largest interactive live broadcasting platform in China at that time, and built out a full product line in the field of enterprise Internet communication.
In recent years, 263 has gradually moved from being a traditional tools vendor to providing one-stop digital marketing services. By making its cloud video technology available to industry customers through cloud video conferencing and cloud live streaming, it has greatly improved its overall video service capability. This meets enterprise customers' needs for remote video meetings, live-streamed marketing, remote training and other video collaboration, and helps enterprises create new digital marketing value.
Video conferencing security background.
When it comes to video conferencing security, one factor that cannot be avoided is last year’s epidemic. Since the epidemic in 2020, the telecommuting market has grown rapidly, especially in sub-scenarios such as video conferencing, online education, online training and online summits, and terminals are no longer limited to computers and video conferencing hardware. For example, in the Internet of Things, sensors, unmanned aerial vehicles and other terminals also need video communication capabilities. In the medical field, the transmission of nuclear magnetic resonance images and infrared scans, and in government emergency command, remote tracking and monitoring, all require video communication on a variety of terminals. Different types of terminals have different data security requirements, which is a great challenge for the security strategy of a video communication platform.
In addition, the video conferencing industry as a whole is trending toward standardized protocols, compatibility, and integrated capabilities. Video conferencing scenarios are becoming more and more complex, and the cost of self-built deployments and private protocols keeps rising, so more and more enterprises are choosing public cloud solutions. On the public cloud, the probability of a network attack is relatively higher, and attacks are becoming stronger, more targeted, more destructive and wider in impact. As a result, more and more enterprises attach importance to the security and encryption strategy of their video conferencing platform. Next, I will share some encryption methods used in traditional video communication and some end-to-end encryption methods for video conferencing currently being developed by 263.
End-to-end encryption.
There are three kinds of traditional encryption technology: symmetric encryption, asymmetric encryption and TLS encryption.
In symmetric encryption, both sides use the same key: one side encrypts with it and the other side decrypts with it. The shared key has to be distributed to both parties in advance through a private channel, similar to the codebook in a spy movie. Symmetric encryption has the advantage of being fast and efficient, but it also has the obvious disadvantage that once the key is exposed, the data can be eavesdropped.
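As a minimal sketch of the shared-key idea (not 263's implementation), the toy cipher below XORs a message with an HMAC-SHA256 keystream. The function name `keystream_xor`, the key and nonce sizes, and the sample message are assumptions made for illustration. The construction has no integrity protection, and a real system would use a vetted cipher such as AES-GCM.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 keystream.

    XOR is its own inverse, so the same call encrypts and decrypts.
    Illustration only: it provides no integrity protection.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


# Assumption: both sides received this key in advance over a private channel.
shared_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)  # fresh per message, may travel in the clear

message = b"meeting starts at 10:00"
ciphertext = keystream_xor(shared_key, nonce, message)

# The receiver recovers the message with the same key.
recovered = keystream_xor(shared_key, nonce, ciphertext)

# A party with a different key gets unrelated bytes.
wrong = keystream_xor(secrets.token_bytes(32), nonce, ciphertext)
```

Anyone who obtains `shared_key` can run the same call on any captured ciphertext, which is the exposure risk described above.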
As encryption technology developed, asymmetric encryption grew out of symmetric encryption. Here the encryption and decryption keys are different: generally, the public key encrypts and the private key decrypts. Even if the public key is leaked, the data is not compromised. The communicating parties only need to send their public keys to each other in advance. The sender encrypts with the recipient's public key, and the recipient decrypts with its own private key. This offers a certain security improvement over symmetric encryption.
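The public-key and private-key split can be illustrated with textbook RSA on deliberately tiny primes. This is only a sketch: the primes, the exponent and the helper names `encrypt` and `decrypt` are chosen for illustration, the numbers are trivially breakable, and real systems use large keys with padding.

```python
# Textbook RSA with tiny primes: illustration only, trivially breakable.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (n, e)          # can be sent to the other party openly
private_key = (n, d)         # never leaves its owner


def encrypt(message: int, key: tuple) -> int:
    """Encrypt an integer smaller than the modulus with the public key."""
    modulus, exponent = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Decrypt with the private key."""
    modulus, exponent = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```

Knowing `public_key` alone is not enough to call `decrypt` successfully, which is why the public key can be exchanged in advance over an open channel.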
The third type is TLS encryption. Building on asymmetric encryption, TLS solves the problem of identity authentication. A party first applies to a certificate authority for a certificate, which certifies its public key, and sends that certificate to the other party. The other party checks with the third party whether the certificate is valid. Once verification passes, the two parties begin encrypted communication based on public-key encryption and private-key decryption (in practice, the handshake is used to agree on session keys that then protect the traffic).
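For a concrete view of the certificate check, the sketch below uses Python's standard `ssl` module. The helper name `peer_certificate_subject` and the choice of TLS 1.2 as a minimum are assumptions for illustration. The helper opens a network connection, so it is only defined here and not called.

```python
import socket
import ssl

# A default client context loads the system's trusted certificate authorities,
# requires a valid certificate from the peer, and checks the host name.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse older versions


def peer_certificate_subject(host: str, port: int = 443) -> tuple:
    """Connect, let the TLS handshake verify the certificate, return its subject.

    Raises ssl.SSLCertVerificationError if the certificate is not trusted
    or does not match the host name.
    """
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

If verification fails, the handshake aborts before any application data is sent, so authentication always comes before encrypted communication.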
Conference data encryption based on WebRTC.
Next, I will focus on WebRTC-based data encryption. To better protect customers' information security, 263's cloud video service uses self-developed WebRTC technology, with a focus on the security of video conference data.
Although WebRTC itself is designed for end-to-end communication, in most WebRTC-based conferences the data flows between the client and the platform's WebRTC server. The process is standard: SDP session negotiation; ICE hole punching to solve NAT and network connectivity problems; DTLS over UDP to exchange keys between the two sides; and SRTP for data transmission. The sender encrypts the data with the negotiated key and sends it over the SRTP channel, and the other side decrypts it on receipt.
SRTP is used for media data. DataChannel, which is encrypted at the SCTP layer, can be used for signaling transmission, but major platforms rarely use it that way. WebRTC does not standardize how session signaling is exchanged during session establishment. It is generally implemented over WebSocket for browser compatibility, but you can also implement your own. As shown in the figure above, signaling uses TLS over TCP, and data uses SCTP/SRTP encryption over UDP (newer versions of WebRTC have also added an implementation based on UDP/HTTP3/QUIC).
Key exchange is the core of WebRTC encryption, so here is an illustration. Suppose two terminals are in a conference on a WebRTC-based platform that forwards media data. A session is established first, and one or two pairs of keys are generated during establishment. The uplink stream and the downlink stream are two independent peer streams, so sending and receiving each have their own pair of keys. Assume that terminal A generates (A, A), the server generates (S1, S1), and terminal B does the same thing. After that, both sides create their own media streams. Next comes data forwarding. Terminal A encrypts the data with the key it negotiated with the server and sends it to the server. The server decrypts it with the key agreed between them and then processes it (including packet assembly, reordering, packet-loss recovery and NACK). During distribution it may also perform SFU stream selection, that is, Simulcast. It then encrypts the data for each terminal with the key negotiated with that terminal and sends it out. Each terminal decrypts the data when it receives it, and the same applies in the other direction.
This keeps the channel between each terminal and the server encrypted throughout the conference: the server decrypts, reassembles, re-encrypts and sends the data it receives. The drawback is that this series of processing steps consumes a relatively large amount of memory and CPU on the server.
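The hop-by-hop behaviour described above can be sketched as follows. This is a toy model and not 263's server code: the class `Leg`, the function `forward`, and the HMAC-based `keystream_xor` cipher standing in for DTLS-SRTP are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher standing in for SRTP (XOR with an HMAC-SHA256 keystream)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (offset // 32).to_bytes(8, "big"),
                       hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class Leg:
    """One encrypted channel between a terminal and the server."""

    def __init__(self) -> None:
        # Stands in for the key negotiated on this leg during session setup.
        self.key = secrets.token_bytes(32)

    def protect(self, packet: bytes) -> bytes:
        nonce = secrets.token_bytes(12)
        return nonce + keystream_xor(self.key, nonce, packet)

    def unprotect(self, blob: bytes) -> bytes:
        return keystream_xor(self.key, blob[:12], blob[12:])


seen_by_server = []  # records what the server can read


def forward(uplink: Leg, downlinks: dict, blob: bytes) -> dict:
    """Server side: decrypt the uplink, process, re-encrypt per receiver."""
    packet = uplink.unprotect(blob)   # plaintext is visible to the server here
    seen_by_server.append(packet)     # assembly, reordering, NACK would go here
    return {name: leg.protect(packet) for name, leg in downlinks.items()}


leg_a, leg_b, leg_c = Leg(), Leg(), Leg()
frame = b"frame 1 from terminal A"
out = forward(leg_a, {"B": leg_b, "C": leg_c}, leg_a.protect(frame))
frame_at_b = leg_b.unprotect(out["B"])
```

Every packet is decrypted once and encrypted once per receiver, which is where the server's CPU and memory cost comes from. The server also holds the plaintext, which motivates the end-to-end schemes discussed later.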
Currently, this encryption technology is in use on the 263 platform. Based on its self-developed WebRTC technology, 263 not only delivers stable, smooth and ultra-low-latency video communication in its cloud video conferencing and cloud live streaming products, but also uses this encryption technology to safeguard data for government, finance, medical and other industries, as well as for enterprise customers with high data security requirements.
Private one-on-one conversations in videoconferencing.
For 1-to-1 video conferencing, 263 has designed an encryption method for private conversations to ensure the security of the conference. Suppose there is a standard conference server, terminal A and terminal B have joined the meeting normally, and they want to hold a private conversation between themselves. The conversation uses a third-party platform, which builds its own encrypted signaling extension on top of the platform's SDK extension protocol. The process is as follows: when terminal A wants to initiate a session, it sends its public key to the other party as the key; terminal B accepts the session, generates its reply and sends its own public key back. One-to-one encrypted data transmission can then take place.
End-to-end encryption in SFU mode.
In SFU mode, that is, distribution mode, the conference server has to distribute each terminal's data to multiple participants. The whole path therefore cannot be encrypted separately for every 1-to-1 data stream. Only the transmission channel of the uplink stream or the downlink stream can be encrypted. We now require that, throughout distribution, the conference server cannot decrypt any terminal's signaling or data, so that nothing can be monitored. To achieve trusted calls on an untrusted platform, the following process is required. First, each terminal generates a pair of “signaling keys” and broadcasts its public key into the venue when it joins. Subsequent signaling that requires encryption is encrypted only with the other party’s public key. In this way, all terminals can send signaling through the server, but the server cannot decrypt it.
Then comes the process of encrypting the data. Data encryption differs from signaling encryption. Everyone’s data must be sent encrypted so that the server cannot decrypt it, yet every terminal must still be able to receive it, and this requires an authorization process.
For example, client A uses the “A1” data key. After login, A encrypts its data with its own A1 public key and sends it to the server. The server does not know the private key for A’s data, so it cannot decrypt it. When another client needs to receive A’s data, it first sends a request signaling message, which is encrypted with the signaling public key “A” and forwarded to terminal A. After receiving it, A authenticates the requester to make sure it is a legitimate terminal (the application can build its own authentication platform and include the certificate in the request). Once authentication passes, A sends the corresponding reply, which includes the data private key “A1”.
Note that the roles are reversed here: the data key that would normally be public is kept secret by A, as if it were a private key, and the key that would normally be private is handed out to authorized terminals, as if it were a public key. In this way, terminal B can receive the data stream from the server, and because B has been authorized to use A’s private key, it can decrypt A’s encrypted data. This completes the process of forwarding encrypted data through the server.
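The authorization flow above can be sketched as follows. Several simplifications are assumptions of this sketch rather than part of the talk: the data key “A1” is modelled as a single symmetric key, the certificate check is a lookup in a hypothetical `TRUSTED_CERTIFICATES` set, and B's signaling key pair is textbook RSA built from two small Mersenne primes. All helper names are invented for illustration and none of this is secure as written. Only A's reply is shown encrypted; B's request would be protected the same way with A's signaling public key.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher for the media payload (illustration only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (offset // 32).to_bytes(8, "big"),
                       hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def make_toy_rsa(p: int, q: int, e: int = 65537):
    """Textbook RSA key pair from two primes. Insecure, illustration only."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)


def wrap(secret: bytes, public: tuple) -> int:
    """Encrypt a short secret with the receiver's signaling public key."""
    n, e = public
    return pow(int.from_bytes(secret, "big"), e, n)


def unwrap(wrapped: int, private: tuple, length: int) -> bytes:
    n, d = private
    return pow(wrapped, d, n).to_bytes(length, "big")


# Terminal B's signaling key pair. The public half is broadcast when B joins.
b_public, b_private = make_toy_rsa(2**61 - 1, 2**89 - 1)

# Terminal A publishes media encrypted under its data key "A1".
data_key_a1 = secrets.token_bytes(16)
nonce = secrets.token_bytes(12)
frame = b"video frame from A"
published = nonce + keystream_xor(data_key_a1, nonce, frame)  # all the server sees

TRUSTED_CERTIFICATES = {"cert-of-B"}  # hypothetical stand-in for a certificate platform


def handle_key_request(certificate: str, requester_public: tuple):
    """Runs on terminal A: authenticate, then release A1 wrapped for the requester."""
    if certificate not in TRUSTED_CERTIFICATES:
        return None
    return wrap(data_key_a1, requester_public)


reply = handle_key_request("cert-of-B", b_public)  # relayed by the server, opaque to it
key_at_b = unwrap(reply, b_private, len(data_key_a1))
frame_at_b = keystream_xor(key_at_b, published[:12], published[12:])
```

The server relays both `published` and `reply` but holds neither `data_key_a1` nor `b_private`, so it cannot read the frame. A requester whose certificate is not trusted receives no key at all.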
The certificate flow for end-to-end encryption in SFU mode works as follows. There are two client terminals, A and B, both running a third-party application built on the conference platform SDK. Client A first publishes an encrypted video stream to the conference management server, and the server broadcasts it to all terminals. Before viewing, terminal B obtains a certificate from the certificate platform set up by the customer. It then subscribes to the video stream on the conference management server and provides the certificate. After receiving the request, the server forwards it, with B's certificate, to terminal A, and terminal A verifies the certificate against the certificate server. Once the certificate passes, terminal A forwards its data key to B through the conference management server, and B receives it. Data can then flow between A and B.
End-to-end encryption for WebRTC SFU mode.
In WebRTC SFU mode, if you want to use both the standard protocol of WebRTC and encrypt between terminals on top of it, the following process is required.
The first part of the procedure is the same as before and uses an encrypted signaling channel. Once a standard WebRTC session is established, suppose B requests A’s data. B then also requests the private key for that data. After that, during data stream distribution, an API-layer callback is made to the application before WebRTC’s SRTP encryption is applied. The application adds its own layer of encryption, and SRTP encryption is then applied on top.
In addition, the WebRTC server only needs to decrypt the channel layer. It does not need to decrypt complete video frames or reassemble frames. It only needs to reorder the packets and then distribute them to everyone over the WebRTC connections. After receiving the data, B therefore has to decrypt not only SRTP but also the API-layer encryption applied by the customer. This embeds end-to-end encryption into WebRTC.
The encrypted signaling channel in WebRTC SFU mode is the same as in a standard conference. The private key request needs its own process, established independently of WebRTC. Data stream distribution makes the three changes just described: one more layer of encryption on the sender, one less step of frame assembly on the server, and one more layer of decryption on the receiver. With this method, trusted encrypted data communication can be achieved through protocol design.
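The three changes can be sketched with two nested layers. This is a toy model under stated assumptions: `seal` and `unseal` are invented helpers built on an HMAC-based XOR cipher, `e2e_key` stands for the data key shared between authorized terminals, and `hop_a` and `hop_b` stand in for the SRTP keys each terminal negotiated with the server.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher used for both layers (illustration only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (offset // 32).to_bytes(8, "big"),
                       hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal(key: bytes, payload: bytes) -> bytes:
    nonce = secrets.token_bytes(12)
    return nonce + keystream_xor(key, nonce, payload)


def unseal(key: bytes, blob: bytes) -> bytes:
    return keystream_xor(key, blob[:12], blob[12:])


e2e_key = secrets.token_bytes(32)  # held by A and authorized B, never by the server
hop_a = secrets.token_bytes(32)    # stands in for the A <-> server SRTP key
hop_b = secrets.token_bytes(32)    # stands in for the B <-> server SRTP key

frame = b"encoded video frame"

# Sender: the application-layer callback encrypts first, then the channel layer.
inner = seal(e2e_key, frame)
uplink = seal(hop_a, inner)

# Server: removes only the channel layer, reorders, and re-protects for B.
at_server = unseal(hop_a, uplink)  # still ciphertext, no frame assembly possible
downlink = seal(hop_b, at_server)

# Receiver: removes the channel layer, then the application layer.
received = unseal(e2e_key, unseal(hop_b, downlink))
```

The server handles `at_server` as an opaque payload, so it can route and reorder packets without ever holding the frame in the clear.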
The problem.
There are two problems here. One is authentication, and the other is protocol compatibility.
Authentication. We have made the server secure and the transmission channel secure, but whether the terminal itself is secure is not clearly defined. The certificate process just mentioned is not standardized. We only provide an open API or something similar, and users have the flexibility to implement it themselves. In general, it is difficult to standardize one-to-many authentication. Besides using a certificate platform, authentication can also be implemented at the hardware level, for example with dongles.
Protocol compatibility. On an encrypted WebRTC system, other solutions are required if standard WebRTC browser clients are to be supported. When the browser receives data, it will not perform the API-layer decryption, and the data it sends will not be encrypted with the “A1” key. We have two solutions. First, the system supports both encrypted and unencrypted meetings, and encrypted meetings do not support browsers for now. Second, we provide a server-side SDK so that users can build an SDK-based management platform for authorization. This allows full decryption and re-encryption for browser access before the server distributes the data. For this kind of distribution, we suggest that the third party build the relay server in its own trusted machine room.
That’s all for today. Thank you.