Author: Liu Bo
As live streaming has matured, real-time interactivity has become increasingly important. Building on WebRTC and on years of development experience, we developed the UPRTC system to address network latency, high concurrency, and limited decoding capability on client devices.
WebRTC: past and present
What is WebRTC
In May 2010, Google paid $68.2 million to acquire GIPS (Global IP Solutions), which owned codec, echo cancellation, and related technologies. Google then open-sourced the GIPS technology and worked with the IETF and W3C on industry standards, which led to today's WebRTC project.
WebRTC stands for Web Real-Time Communication. It is not a single protocol but a collection of protocol standards covering media, encryption, and the transport layer, together with a set of JavaScript APIs. These easy-to-use APIs give browsers peer-to-peer (P2P) audio, video, and data sharing capabilities without any plug-ins.
WebRTC is also not an isolated protocol: its signaling is flexible, so it can interoperate readily with existing SIP and telephone network systems.
Advantages of WebRTC
When the UPRTC project was started, WebRTC was chosen for three main reasons:
1. WebRTC is an open source, royalty-free project, which greatly reduces development time and cost;
2. WebRTC is led by Google, and its audio and video technology is advanced;
3. Browsers such as Safari, along with other client platforms, are steadily strengthening their support for WebRTC.
Core components of WebRTC
- Audio and video engines: Opus, VP8/VP9, H.264
- Transport layer protocol: The underlying transport protocol is UDP
- Media protocol: SRTP/SRTCP
- Data protocol: DTLS/SCTP
- P2P NAT traversal: STUN/TURN/ICE/Trickle ICE
- Signaling and SDP negotiation: HTTP/WebSocket/SIP, Offer/Answer model
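To make the Offer/Answer model a little more concrete: each side describes its media capabilities in an SDP document, and the codecs it can use appear as `a=rtpmap` attributes. The sketch below parses a hand-written, abbreviated SDP fragment and lists the codecs it offers. The fragment is illustrative only (it is not captured from a real session), and the name `list_codecs` is hypothetical.

```python
import re

# Abbreviated, hand-written SDP fragment for illustration only;
# a real browser offer contains many more lines.
SAMPLE_OFFER = """\
v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 96 98
a=rtpmap:96 VP8/90000
a=rtpmap:98 VP9/90000
"""

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")


def list_codecs(sdp):
    """Return (payload_type, codec_name, clock_rate) for each rtpmap line."""
    codecs = []
    for line in sdp.splitlines():
        match = RTPMAP.match(line.strip())
        if match:
            codecs.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return codecs


for payload_type, name, clock_rate in list_codecs(SAMPLE_OFFER):
    print(payload_type, name, clock_rate)
```

In an actual negotiation the answering side would reply with the subset of these codecs it also supports; that step is omitted here.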
Figure 1 is a simplified diagram of the internal structure of WebRTC, with the hardware at the bottom and the audio and video capture modules above it.
The middle part is the audio and video engine. The audio engine is responsible for audio capture and transmission, and provides noise reduction, echo cancellation, and related functions. The video engine handles network jitter and optimizes encoding and decoding for transmission over the Internet.
On top of the audio and video engine is a set of C++ APIs, and on top of the C++ APIs is the JavaScript API exposed to the browser.
△ Figure 1: Internal structure of WebRTC
Figure 2 shows the protocol stack involved in WebRTC. The core WebRTC protocols, shown on the right side of the figure, are built on UDP.
ICE, STUN, and TURN handle NAT traversal: they discover and bind the externally mapped (public) address, and they also provide the keep-alive mechanism.
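As a small illustration of how a client asks for its externally mapped address, the sketch below builds the 20-byte header of a STUN Binding Request as defined in RFC 5389: a message type, a body length, a fixed magic cookie, and a random 96-bit transaction ID. It only constructs the packet; sending it over UDP and parsing the server's response are left out. The function name `build_binding_request` is hypothetical.

```python
import os
import struct
from typing import Optional

STUN_BINDING_REQUEST = 0x0001   # message type for a Binding Request
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Build a STUN Binding Request header with no attributes (20 bytes)."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # 96-bit random transaction ID
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # Header layout, in network byte order: type (16 bits), body length
    # (16 bits), magic cookie (32 bits), then the transaction ID.
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id


print(build_binding_request().hex())
```

A STUN server that receives this request replies with the address and port it saw the packet arrive from, which is the mapped address that ICE then offers to the remote peer as a candidate.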
DTLS encrypts the transmitted content and can be regarded as TLS for UDP. This layer is mandatory because WebRTC places a strong emphasis on security.
SRTP and SRTCP are the protocols for media data encapsulation and transmission control, respectively.
SCTP (Stream Control Transmission Protocol) is a transport protocol that provides TCP-like features such as reliable, ordered delivery. SCTP can run over UDP; in WebRTC it runs on top of DTLS.
RTCPeerConnection is used to establish and maintain end-to-end connections and provide efficient audio and video streaming.
RTCDataChannel is used to support end-to-end transmission of arbitrary binary data.
△ Figure 2: WebRTC stack
UPRTC based on WebRTC
To make WebRTC better suited to real-time interactive live streaming, Upyun modified and optimized WebRTC and built Upyun UPRTC on top of it. UPRTC supports multiple application scenarios, including one-to-one, one-to-many, and many-to-many.
- Traditional WebRTC applications run in P2P mode; UPRTC uses a server relay mode.
Drawing on its extensive CDN resources, Upyun deployed WebRTC to edge nodes across the country as a fully distributed system. UTUN acceleration shifts the forwarding and concurrency load onto the servers, so that UPRTC clients can sustain more concurrent streams.
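A rough calculation shows why moving forwarding to the server matters for many-to-many rooms. The sketch below compares how many outgoing streams each client must send in a P2P full mesh with a server relay model. It is a simplified model under stated assumptions (every participant publishes one stream, and the server forwards streams rather than mixing them), not a description of UPRTC's internals; all function names are hypothetical.

```python
def mesh_uplinks_per_client(participants: int) -> int:
    """In a P2P full mesh, each client sends its stream to every other client."""
    return max(participants - 1, 0)


def relay_uplinks_per_client(participants: int) -> int:
    """With a relay server, each client uploads once and the server fans out."""
    return 1 if participants > 1 else 0


def mesh_connections(participants: int) -> int:
    """Total peer connections in a full mesh: n * (n - 1) / 2."""
    return participants * (participants - 1) // 2


# Columns: participants, mesh uplinks per client, relay uplinks per client,
# total mesh connections.
for n in (2, 4, 8):
    print(n, mesh_uplinks_per_client(n), relay_uplinks_per_client(n), mesh_connections(n))
```

With the relay model the client's upload cost stays constant as the room grows, while the fan-out work lands on the edge servers.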
- Adaptive network congestion control is added, giving strong resilience on weak networks.
For example, when a mobile device switches from Wi-Fi to 4G, the cloud server detects the bandwidth change, measures packet loss and delay, and adjusts the bit rate dynamically so that calls remain usable even on a weak network.
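The source does not describe the adaptation algorithm itself, so the following is only a minimal sketch of one common approach: a loss-based rule modeled on the one in the Google Congestion Control proposal for WebRTC. The thresholds (10% and 2% loss), the 5% increase step, and the minimum and maximum bit rates are assumptions chosen for illustration, and `adjust_bitrate` is a hypothetical name. Delay measurements, which the text also mentions, are not modeled here.

```python
def adjust_bitrate(current_bps: float, loss_fraction: float,
                   min_bps: float = 100_000, max_bps: float = 2_000_000) -> float:
    """Return the next target bit rate given the reported packet loss."""
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must be between 0 and 1")
    if loss_fraction > 0.10:
        # Heavy loss: back off in proportion to the loss.
        target = current_bps * (1.0 - 0.5 * loss_fraction)
    elif loss_fraction < 0.02:
        # Negligible loss: probe upward by 5%.
        target = current_bps * 1.05
    else:
        # Moderate loss: hold the current rate.
        target = current_bps
    # Clamp to the assumed operating range.
    return min(max(target, min_bps), max_bps)


# Simulated loss reports, e.g. around a Wi-Fi to 4G handover (made-up values).
rate = 1_500_000.0
for loss in (0.0, 0.25, 0.15, 0.05, 0.01):
    rate = adjust_bitrate(rate, loss)
    print(f"loss={loss:.2f} -> target={rate:.0f} bps")
```

The asymmetry is deliberate: the rate drops quickly under heavy loss and recovers in small steps, which avoids oscillating around the available bandwidth.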
- The underlying open source components are optimized and reworked to support high-concurrency business scenarios.
A flexible and efficient service signaling scheme is designed, with authentication for sensitive signaling.
Figure 3 shows the UPRTC technology composition:
- Media channel, data channel, signaling channel;
- Data encryption module;
- Bit rate adaptive control module;
- Upyun acceleration network;
- P2P hole punching service;
- Room application service;
- Robots (automatic management functions and interactive functions).
△ Figure 3: UPRTC technology composition
Conclusion
Although the WebRTC source code is relatively mature, the following problems still need to be solved in practice:
1. Excessive CPU consumption during audio processing;
2. Audio and video synchronization bugs;
3. H.264 support in the Android WebRTC source code is incomplete; by default only Qualcomm chips are supported;
4. An adaptive bit rate algorithm needs to be added to the server architecture to keep the total bit rate dynamically within 2 Mbit/s.
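For the last item, one simple way to keep a total within a fixed budget is to scale all streams down proportionally whenever their sum exceeds it. The sketch below assumes that "2M" means 2 Mbit/s and that proportional scaling is an acceptable policy; the source specifies neither, and `fit_to_budget` is a hypothetical name.

```python
def fit_to_budget(requested_bps, budget_bps=2_000_000):
    """Scale stream bit rates down proportionally so their sum fits the budget."""
    total = sum(requested_bps)
    if total <= budget_bps:
        return list(requested_bps)  # already within budget, leave untouched
    # Integer arithmetic rounds each share down, so the scaled total
    # can never exceed the budget.
    return [rate * budget_bps // total for rate in requested_bps]


streams = [1_500_000, 1_000_000, 500_000]  # made-up per-stream requests
print(fit_to_budget(streams))
```

A production system would more likely weight streams by priority (for example, protecting audio before video) rather than scaling everything equally.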
Recommended reference documents:
W3C API documentation: https://github.com/w3c
IETF protocol related documents: https://datatracker.ietf.org
Related reading:
Real-time audio and video interactive series (part 2) : Practical analysis based on WebRTC technology
WebSocket+MSE — HTML5 live streaming technology analysis
From Html5 live to interactive live, see the selection of live protocols