# WebRTC series
- WebRTC source research (1): WebRTC architecture
- WebRTC source research (2): WebRTC source directory structure
- WebRTC source research (3): WebRTC operation mechanism
- WebRTC source research (4): Web server working principles and common protocol basics
- WebRTC source research (5): Node.js build environment
WebRTC source research (1): WebRTC architecture
Recently I have been focusing on the audio and video field, studying many related books and videos; I am still learning. Much of the content in these blog posts comes from teacher Li Chao's video course on Mooc. If you want to learn audio and video, I strongly recommend buying his course: the teaching is good and the price is reasonable. Click here to buy teacher Li Chao's videos.
WebRTC is a treasure box for audio and video communication, providing mature solutions for audio and video processing and instant messaging. Crucially, the solution is open source, so you can dig into the source code, learn the algorithms behind hard problems, and apply them to your own projects. WebRTC is an excellent, cross-platform multimedia framework.
Please refer to:
- WebRTC standard official document
- WebRTC source download
- WebRTC protocol introduction
1. Introduction to WebRTC
WebRTC, short for Web Real-Time Communication, is an API that enables Web browsers to conduct real-time voice or video conversations. It was open-sourced on June 1, 2011 and entered the W3C recommendation track with support from Google, Mozilla, and Opera.
WebRTC addresses the Web's inability to capture audio and video and provides peer-to-peer (that is, browser-to-browser) video interaction.
WebRTC brings together advanced real-time communication technologies, including advanced audio and video codecs (Opus and VP8/VP9), mandatory encryption protocols (SRTP and DTLS), and NAT traversal techniques (ICE and STUN).
- WebRTC implements web-based video conferencing under the WHATWG umbrella, aiming to provide Real-Time Communications (RTC) capabilities in the browser with simple JavaScript.
- The ultimate goal of the WebRTC (Web Real-Time Communication) project is to let Web developers build real-time communication in browsers (Chrome, Firefox, ...) without attending to the digital signal processing of multimedia: a simple JavaScript program is enough. The W3C and other bodies are standardizing the JavaScript APIs as WebRTC 1.0, currently in Draft state. WebRTC also aims to build a robust real-time communication platform across browsers, creating a healthy ecosystem for developers and browser vendors. Google is likewise committed to making WebRTC part of the HTML5 standard, which shows its far-reaching plans.
- WebRTC provides the core technologies of video conferencing, including audio and video capture, encoding and decoding, network transmission, and display, and it is cross-platform: Windows, Linux, Mac, and Android.
2. Capabilities of WebRTC
According to its original definition, WebRTC was designed as a peer-to-peer (P2P) technology.
Since its inception, WebRTC has made it much easier for Web developers to build real-time communication applications with simple JavaScript APIs. To be clear, though, WebRTC is a technology, not a complete application or service. We can think of WebRTC as a treasure chest of the audio and video field: not an innovative technology in itself, but an integration, with a fair amount of optimization, of earlier audio and video technologies. It contains many audio and video processing algorithms that can be pulled out and used in different situations.
At present, WebRTC has great potential in videoconferencing and live broadcasting.
Although WebRTC was originally conceived as a pure P2P technology, many everyday business applications require centralized media capabilities, using a P2S (peer-to-server) architecture to improve reliability, efficiency, or scalability. The trade-off between P2P and P2S architectures matters when building WebRTC applications.
So as not to restrict the development of server-side technology, WebRTC sets no uniform standard for signaling servers or SDP exchange. Each developer can choose a signaling server to suit their own situation.
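Because signaling is left open, even a minimal relay suffices. The sketch below shows an in-memory broker; the class name and the `{type, from, to, payload}` message shape are our own assumptions, not anything WebRTC standardizes.

```javascript
// Minimal in-memory signaling broker sketch. WebRTC does not standardize
// signaling, so this message shape and class are our own illustration.
class SignalingBroker {
  constructor() {
    this.peers = new Map(); // peerId -> callback receiving messages
  }
  register(peerId, onMessage) {
    this.peers.set(peerId, onMessage);
  }
  // Relay a message (offer, answer, or ICE candidate) to its target peer.
  send(message) {
    const target = this.peers.get(message.to);
    if (!target) throw new Error(`unknown peer: ${message.to}`);
    target(message);
  }
}
```

In a real deployment the same role is usually played by a WebSocket or HTTP server; the broker only forwards opaque payloads and never interprets the SDP.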
WebRTC is an open source library for audio and video processing plus instant messaging, an excellent cross-platform multimedia framework. In the audio and video field there are two flagship open source libraries: one is FFmpeg, the other is WebRTC, and they have different priorities and strengths. FFmpeg focuses on multimedia editing, audio and video codecs, and other media-file processing. WebRTC's advantage lies in transmitting audio and video over real networks: its algorithms for network jitter, packet loss, and network evaluation keep transmission stable, and it also handles problems that commonly accompany transmission, such as echo cancellation and noise reduction.
In general, WebRTC can do the following:
- Real-time audio and video interaction
- Games, instant messaging, file transfers, etc
- It is a treasure chest: transmission plus audio and video processing (echo cancellation, noise reduction)
2.1 Seizing the 5G era with WebRTC
2.1.1 Browser Support Status
By now almost all major browsers support WebRTC; it is supported and adopted by every major browser except Internet Explorer, including Google Chrome, Apple Safari, Mozilla Firefox, Microsoft Edge, QQ Browser, and 360 Browser.
2.1.2 Adoption by major vendors
- A wave of new conference providers built on WebRTC, such as Agora (operating transnationally, including India), ZEGO, and RongCloud, are breaking into the Internet market and dealing a heavy blow to traditional audio and video providers.
- WebRTC is reliable and easy to use (on the Web side, Agora calls the W3C-standard WebRTC API).
- WebRTC popularizes the meeting experience through the Web browser, supporting click-to-join and eliminating the hassle of installing extra software.
2.1.3 WebRTC Application Cases
2.1.3.1 Education Industry solutions
2.1.3.2 Interactive e-commerce solution
2.1.3.3 Enterprise Video Collaboration /OA Office Solution
3. The WebRTC framework
Let's take a look at the WebRTC architecture diagram that circulates widely on the Internet; it comes from the official WebRTC website. Because access to Google is restricted in some regions, the official site may not be reachable directly; you can refer to the WebRTC Chinese site instead:
As the figure shows, the whole light green part is the WebRTC core layer, which encapsulates the interfaces exposed to the Web side as the Web API layer. The purple part is the application layer, which uses the APIs provided by the core layer; you can extend APIs at the application layer by calling the core-layer interfaces.
The WebRTC core layer is divided into four layers:
- WebRTC C++ API (PeerConnection): this layer has relatively few APIs; the most important one implements the P2P connection. PeerConnection contains many interfaces, such as transmission quality, quality reports, and statistics, and all kinds of streams are encapsulated in the PeerConnection module. Besides audio and video capture and transmission, it also covers non-audio-video data transmission.
- Session Management/Abstract signaling (Session): a session layer used to manage audio and video data transmission and the related logic.
- The third layer is the core, containing three modules: the audio engine, the video engine, and transport.
- The lowest layer is the hardware adaptation layer: audio capture and rendering, video capture, and network I/O. Note that these three bottom modules are drawn with dotted lines, indicating they can be implemented and overridden by the user, which greatly increases WebRTC's flexibility and lays the foundation for cross-platform support.
The three core modules of WebRTC are the Voice Engine, the Video Engine, and Transport. The Voice Engine deals only with audio-related technology and the Video Engine only with video-related technology; audio and video are independent of each other. Note that one important piece of technology, audio-video synchronization, is not included in these engines.
The Voice Engine consists of three modules:
- iSAC/iLBC Codec: the codec module. iSAC and iLBC were developed by GIPS; there are also the familiar audio codecs G.711, G.726, AAC, and Opus. Currently Opus is the most widely used; AAC was also very popular a few years ago, and you can add your own AAC module to WebRTC. WebRTC uses Opus, which will be explained in detail later.
- NetEQ for voice: NetEQ is essentially an audio buffer used for network adaptation, for example to prevent audio jitter, and it involves many related algorithms.
- Echo Canceler/Noise Reduction: echo cancellation is a headache for many companies, but WebRTC provides a very mature and stable echo cancellation algorithm, well worth studying and applying to your own products. In fact, many algorithm engineers do take open source code like WebRTC and tune a few parameters to improve audio quality.
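At its core, what a jitter buffer like NetEQ does is absorb out-of-order packet arrival and release packets in sequence order. The sketch below is a drastic simplification under our own names; real NetEQ additionally does time-stretching, packet-loss concealment, and adaptive delay control.

```javascript
// Very simplified jitter-buffer sketch: reorder packets by sequence number.
// Class and method names are hypothetical, not WebRTC's actual API.
class JitterBuffer {
  constructor() {
    this.packets = new Map(); // seq -> payload
    this.nextSeq = null;      // next sequence number expected by playback
  }
  insert(seq, payload) {
    if (this.nextSeq === null) this.nextSeq = seq;
    this.packets.set(seq, payload);
  }
  // Pop the next in-order packet, or null if it has not arrived yet
  // (real NetEQ would conceal the loss instead of returning silence).
  pop() {
    if (!this.packets.has(this.nextSeq)) return null;
    const payload = this.packets.get(this.nextSeq);
    this.packets.delete(this.nextSeq);
    this.nextSeq += 1;
    return payload;
  }
}
```

Even this toy version shows why a buffer trades latency for smoothness: a packet that arrives out of order waits until its predecessors have been played out.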
The Video Engine consists of three modules:
- VP8 Codec: the video codec. VP8 and VP9 were developed by Google; WebRTC first supported Google's own VP8, and later also supported H.264 and H.265, including OpenH264. x264 is not supported in official WebRTC; if you want it, you must add your own module.
- Video jitter buffer: like the audio buffer, it is used to counter video jitter.
- Image enhancements: image processing, such as enhancement. This part of WebRTC is relatively thin, but it leaves the corresponding interfaces open: if you want beauty filters, textures, or other filter processing, you can hook into the reserved interfaces, and likewise for things like face detection.
Transport: the transport module uses UDP at the bottom, because audio and video transmission demands timeliness and can tolerate losing some frames. WebRTC takes full advantage of UDP (which imposes no congestion control of its own) and applies various mature algorithms on top to guarantee high-quality audio and video transmission and to adapt automatically to bitrate changes. All audio and video data is sent and received through the transport layer. As the figure shows, the WebRTC architecture is layered very clearly.
- SRTP: video is generally transmitted over RTP, but because browsers have high security requirements, the streams are encrypted with SRTP. The control protocol RTCP sends send/receive reports to the other side so that it can perform flow control.
- Multiplexing: multiple streams are multiplexed over the same channel.
- P2P (STUN + TURN + ICE): the P2P-related technology, the most central part of WebRTC, just as companies built around P2P treat it as their core technology. This part of the WebRTC source code is very mature, and it will be introduced in detail later.
The transport layer covers line probing, network packet loss, jitter, flow control, NAT traversal, and other demanding technologies, all with very mature implementations. Learning this part well is extremely useful.
The WebRTC transport layer also estimates your network bandwidth, so besides stable audio and video transmission it can carry other non-audio-video data: files, text, and any other binary data can be transmitted.
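To make the transport layer's basic unit concrete, here is a sketch of parsing the 12-byte fixed RTP header defined in RFC 3550; the helper name is our own.

```javascript
// Parse the 12-byte fixed RTP header (RFC 3550). Illustrative helper only.
function parseRtpHeader(buf) {
  if (buf.length < 12) throw new Error('RTP header is at least 12 bytes');
  return {
    version:     buf[0] >> 6,          // always 2 for RTP
    padding:     (buf[0] >> 5) & 1,
    extension:   (buf[0] >> 4) & 1,
    csrcCount:   buf[0] & 0x0f,
    marker:      buf[1] >> 7,
    payloadType: buf[1] & 0x7f,        // e.g. a dynamic PT for Opus or VP8
    sequence:    (buf[2] << 8) | buf[3],
    timestamp:   ((buf[4] << 24) | (buf[5] << 16) | (buf[6] << 8) | buf[7]) >>> 0,
    ssrc:        ((buf[8] << 24) | (buf[9] << 16) | (buf[10] << 8) | buf[11]) >>> 0,
  };
}
```

The sequence number and timestamp fields are exactly what the jitter buffer and flow control described above operate on.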
Note that there is no video rendering in the WebRTC core layer, video rendering needs to be done by the application layer.
With the explanation above, we now have a general understanding of WebRTC's capabilities. Next we will break down the functional items of each module and explain the related technology and knowledge, and finally deepen our understanding by analyzing the source code, so that we can write our own code and apply it in real projects.
3.1 Your Web App
Web developers can build real-time audio and video communication applications on top of the Web API exposed by browsers that integrate WebRTC.
3.2 Web API
The WebRTC standard API (JavaScript) for third-party developers makes it easy to create Web applications such as web video chat. These APIs fall into three groups: the Network Stream API, the RTCPeerConnection API, and the Peer-to-peer Data API. See the official WebRTC standard document.
- Network Stream API
  - MediaStream: represents a media data stream.
  - MediaStreamTrack: represents a media source in the browser.
- RTCPeerConnection API
  - RTCPeerConnection: allows two browsers to communicate directly with each other.
  - RTCIceCandidate: represents an ICE protocol candidate.
  - RTCIceServer: represents an ICE server.
- Peer-to-peer Data API
  - DataChannel: an interface representing a bidirectional data channel between two peers.
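To follow the offer/answer flow these APIs implement without a browser, here is a tiny stand-in that models only the JSEP signaling states. FakePeerConnection is our own simplification for illustration, not the real RTCPeerConnection object.

```javascript
// Mock of the JSEP signaling-state machine behind RTCPeerConnection.
// All names and the fake SDP strings are our own; only the state
// transitions mirror the real API.
class FakePeerConnection {
  constructor() { this.signalingState = 'stable'; }
  createOffer()  { return { type: 'offer',  sdp: 'v=0 ...' }; }
  createAnswer() { return { type: 'answer', sdp: 'v=0 ...' }; }
  setLocalDescription(desc) {
    // Applying an answer (local or remote) returns the state to 'stable'.
    this.signalingState = desc.type === 'offer' ? 'have-local-offer' : 'stable';
  }
  setRemoteDescription(desc) {
    this.signalingState = desc.type === 'offer' ? 'have-remote-offer' : 'stable';
  }
}

// Offer/answer exchange between caller and callee, as in the real Web API:
const caller = new FakePeerConnection();
const callee = new FakePeerConnection();
const offer = caller.createOffer();
caller.setLocalDescription(offer);   // caller: have-local-offer
callee.setRemoteDescription(offer);  // callee: have-remote-offer
const answer = callee.createAnswer();
callee.setLocalDescription(answer);  // callee: back to stable
caller.setRemoteDescription(answer); // caller: back to stable
```

In a browser the same sequence runs against real RTCPeerConnection objects, with the offer and answer carried between the peers by your signaling server.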
3.3 WebRTC Native C++ API
The Native C++ API layer makes it easy for browser vendors to implement the WebRTC standard Web API and abstracts away the digital signal processing; browser developers use it to implement the JavaScript APIs.
3.4 Transport/Session
The transport/session layer is developed on top of libjingle (a session negotiation and NAT traversal component library): the session components reuse part of the libjingle library without using the XMPP/Jingle protocol itself.
- RTP Stack: the Real-time Transport Protocol stack.
- P2P (ICE + STUN + TURN): used to implement point-to-point transmission and establish call connections across different types of networks.
- Session Management: an abstract session layer providing session establishment and management functions; the protocol at this layer is left for application developers to implement as they see fit.
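To make the STUN part of the ICE stack concrete, the sketch below builds the 20-byte STUN Binding Request header from RFC 5389, the message ICE sends to discover a peer's public address. The function name is our own.

```javascript
// Build a 20-byte STUN Binding Request header (RFC 5389), no attributes.
// Illustrative helper; a real ICE agent adds attributes and integrity checks.
function buildStunBindingRequest(transactionId /* Uint8Array of 12 bytes */) {
  const msg = new Uint8Array(20);
  msg[0] = 0x00; msg[1] = 0x01;         // message type: Binding Request
  msg[2] = 0x00; msg[3] = 0x00;         // message length: 0 (no attributes)
  msg.set([0x21, 0x12, 0xa4, 0x42], 4); // fixed STUN magic cookie
  msg.set(transactionId, 8);            // 96-bit transaction ID
  return msg;
}
```

The server's Binding Response carries back the source address it saw, which is how a peer behind NAT learns its public-facing candidate.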
3.5 VoiceEngine
The audio engine is a framework containing a whole chain of audio processing, covering the entire pipeline from sound capture to the network transmission end. VoiceEngine is one of WebRTC's most valuable technologies; it was open-sourced after Google acquired GIPS, whose VoIP technology led the industry.
- iSAC: Internet Speech Audio Codec, a wideband and ultra-wideband audio codec for VoIP and audio streaming, and the default codec of the WebRTC audio engine. Sampling frequencies: 16 kHz, 24 kHz, 32 kHz (default 16 kHz); adaptive rate from 10 kbit/s to 52 kbit/s; adaptive packet size 30~60 ms; algorithm delay: frame + 3 ms.
- iLBC: Internet Low Bitrate Codec, a narrowband voice codec for VoIP audio streaming. 20 ms frames are coded at 15.2 kbit/s and 30 ms frames at 13.33 kbit/s. The standard is defined in IETF RFC 3951 and RFC 3952.
- NetEQ algorithm: an adaptive jitter control algorithm combined with a speech packet loss concealment algorithm. It adapts to changing network conditions quickly and at high resolution, ensuring pleasant sound quality with minimal buffering delay. It is GIPS's world-leading technology for countering the impact of network jitter and packet loss on voice quality. PS: NetEQ is also a very valuable technology in WebRTC, with an obvious effect on VoIP quality; it works best integrated with modules such as AEC, NR, and AGC.
- Acoustic Echo Canceler (AEC): a software-based signal processing component that removes, in real time, the echo picked up by the microphone.
- Noise Reduction (NR): also a software-based signal processing component, used to eliminate certain types of background noise associated with VoIP (hiss, fan noise, and so on).
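The iLBC payload sizes follow directly from bitrate times frame duration, which is worth checking once. The helper name below is our own.

```javascript
// Payload bytes per codec frame = bitrate x frame duration / 8.
// Illustrative arithmetic for the iLBC figures quoted above.
function ilbcFrameBytes(bitrateKbps, frameMs) {
  const bits = bitrateKbps * 1000 * (frameMs / 1000);
  return Math.round(bits / 8);
}
// 20 ms frames at 15.2 kbit/s give 38-byte payloads;
// 30 ms frames at 13.33 kbit/s give 50-byte payloads.
```

The same arithmetic applies to any constant-bitrate codec when you budget packet sizes against your network MTU.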
The WebRTC audio part covers devices, codecs (iLBC/iSAC/G.722/PCM16/RED/AVT, NetEQ), encryption, sound files, voice processing, sound output, volume control, audio-video synchronization, and network transmission and flow control (RTP/RTCP).
- Audio device (audio_device): source code in the webrtc\modules\audio_device\main directory, containing the interface and per-platform code. On Windows, WebRTC uses Windows Core Audio and Windows Wave to manage audio devices, and also provides a mixing manager. Through the audio device you can implement sound output, volume control, and other functions.
- Audio codec (audio_coding): source code in the webrtc\modules\audio_coding directory. WebRTC uses iLBC/iSAC/G.722/PCM16/RED/AVT codec technology. WebRTC also offers NetEQ, a jitter buffer and packet loss concealment module that improves sound quality and minimizes latency. Another core feature is mixing for voice conferencing.
- Voice encryption (voice_engine_encryption): as with video, WebRTC provides voice encryption.
- Sound files: this feature lets you use local Pcm and Wav files as audio sources; WebRTC can also record audio to local files.
- Sound processing (audio_processing): source code in the webrtc\modules\audio_processing directory. It processes audio data, including echo cancellation (AEC), AECM (AEC Mobile), automatic gain control (AGC), noise suppression (NS), and voice activity detection (VAD), to improve sound quality.
- Network transmission and flow control: as with video, WebRTC uses mature RTP/RTCP technology.
3.5.1 Main data structures shared by the audio and video engines
Type | Header file | Description | Path |
---|---|---|---|
Structures | common_types.h | List common constructs for VoiceEngine & VideoEngine | |
Enumerators | common_types.h | Lists enumerators common to VoiceEngine and VideoEngine | |
Classes | common_types.h | List common VoiceEngine and VideoEngine classes | |
class VoiceEngine | voe_base.h | How to use factory methods in the VoiceEngine class to allocate and release resources for VoiceEngine. It also lists the apis required to enable file tracing and/or tracing as callback messages | |
class VideoEngine | vie_base.h | How to allocate and release resources for VideoEngine using factory methods in the VideoEngine class. It also lists the apis required to enable file tracing and/or tracing as callback messages |
3.5.2 Audio engine (VoiceEngine) module APIs
The following table lists the sub-APIs currently available in VoiceEngine:
Sub-API | Header file | Description | Path |
---|---|---|---|
VoEAudioProcessing | voe_audio_processing.h | Added support for noise suppression (NS), automatic gain control (AGC) and echo control (EC). Also includes the receiver VAD. | |
VoEBase | voe_base.h | To enable full-duplex VoIP, use G.711. Note: This API must always be created | |
VoECallReport | voe_call_report.h | Added support for call reporting, which includes heartbeat detection counts, RTT metrics, and Echo metrics. | |
VoECodec | voe_codec.h | Added non-default codecs (e.g. ILBC, iSAC, G.722, etc.), voice activity detection (VAD) support. | |
VoEDTMF | voe_dtmf.h | Added phone event transmission, DTMF audio generation and phone event detection. (Phone events include DTMF.) | |
VoEEncryption | voe_encryption.h | Added external encryption/decryption support extension. | |
VoEErrors | voe_errors.h | Sound engine error code | |
VoEExternalMedia | voe_external_media.h | Add support for external media processing and allow the utilization of external audio resources. | |
VoEFile | voe_file.h | Add file playback, file recording, and file conversion functions. | |
VoEHardware | voe_hardware.h | Added audio device processing, CPU load monitoring and device information functions. | |
VoENetEqStats | voe_neteq_stats.h | Add buffer statistics. | |
VoENetwork | voe_network.h | Added external transport, port and address filtering, window QoS support and packet timeout notification. | |
VoERTP_RTCP | voe_rtp_rtcp.h | Added support for RTCP sender reporting, SSRC processing, RTP/RTCP statistics, Forward error correction (FEC), RTCP applications, RTP capture and RTP stay alive. | |
VoEVideoSync | voe_video_sync.h | Added RTP header modification support, playback delay tuning and monitoring. | |
VoEVolumeControl | voe_volume_control.h | Add speaker volume control, microphone volume control, mute support and other stereo scaling methods. |
3.6 VideoEngine
VideoEngine is WebRTC's video processing engine: an overall framework for a series of video processing steps, a complete solution covering the whole pipeline from camera capture, through network transmission of the video, to display.
- VP8 video codec: the default codec of the WebRTC video engine. VP8 suits real-time communication scenarios because it was designed primarily for low-latency encoding and decoding. PS: the VPx codecs were open-sourced after Google acquired On2; VPx is now part of the WebM project, one of the HTML5 efforts Google promotes.
- Video jitter buffer: reduces the adverse effects of video jitter and video packet loss.
- Image enhancements: processes images captured by the webcam, including brightness detection, color enhancement, and noise reduction, to improve video quality.
The video part of WebRTC includes collection, codec (I420/VP8), encryption, media files, image processing, display, network transmission and flow control (RTP/RTCP) and other functions.
- Video capture (video_capture): source code in the webrtc\modules\video_capture\main directory, including the interface and per-platform code. On Windows, WebRTC uses DirectShow to enumerate video devices and capture video data, which means most capture devices are supported; capture cards that need their own drivers, such as Hikvision HD capture cards, are not handled. Video capture supports multiple media types, such as I420, YUY2, RGB, and UYVY, and can control frame size and frame rate.
- Video codec (video_coding): source code in the webrtc\modules\video_coding directory. WebRTC uses I420/VP8 codec technology. VP8 is an open source codec from Google's acquisition of On2 and is also used in the WebM project. VP8 delivers higher-quality video from less data, making it particularly suitable for scenarios like video conferencing.
- Video encryption (video_engine_encryption): part of WebRTC's video_engine, equivalent to an application-layer video function. It secures the data for both sides of a point-to-point video call and prevents video data from leaking on the Web. Video data is encrypted and decrypted at both sender and receiver; the key is negotiated by the two sides, at some cost to video processing performance. You can also skip video encryption, which is better for performance. The encrypted data may be the raw stream or the encoded stream; presumably it is the encoded stream, so that the encryption cost is smaller.
- Video media file (media_file): source code in webrtc\modules\media_file. Its function is to use a local file as a video source, somewhat like a virtual camera; the supported format is AVI. WebRTC can also record audio and video to local files, a quite practical feature.
- Video image processing (video_processing): source code in the webrtc\modules\video_processing directory. It processes each video frame, including brightness detection, color enhancement, and noise reduction, to improve video quality.
- Video display (video_render): source code in the webrtc\modules\video_render directory. On Windows, WebRTC uses Direct3D9 and DirectDraw to display video; on Mac, Metal is used to render video; on iOS and Android, OpenGL ES is used.
- Network transmission and flow control: for networked video, data transmission and control is the core value. WebRTC uses mature RTP/RTCP technology.
3.6.1 VideoEngine module APIs
Sub-API | Header file | Description | Path |
---|---|---|---|
ViEBase | vie_base.h | Basic functionality for creating VideoEngine instances, channels, and VoiceEngine interaction. Note: this API must always be created. | |
ViECapture | vie_capture.h | Added support for capture device allocation and capture device functionality. | |
ViECodec | vie_codec.h | Added non-default codecs, codec Settings and packet loss features. | |
ViEEncryption | vie_encryption.h | Added external encryption/decryption support. | |
ViEErrors | vie_errors.h | Error code for video engine | |
ViEExternalCodec | vie_external_codec.h | Added support for using external codecs. | |
ViEFile | vie_file.h | Added support for file recording, file playback, background images and snapshots | |
ViEImageProcess | vie_image_process.h | Add effect filters, zoom out, noise reduction and color enhancement. | |
ViENetwork | vie_network.h | Added send and receive functions, external transfers, port and address filtering, window QoS support, packet timeout notification and changing network Settings. | |
ViERender | vie_render.h | Added rendering function. | |
ViERTP_RTCP | vie_rtp_rtcp.h | Handles RTP/RTCP statistics, NACK/FEC, keepalive functionality and keyframe request methods. | |
4. WebRTC source code structure
As part of Chromium, the WebRTC source code is updated very quickly, thanks to Google's push for audio and video communication. The structure of the WebRTC source tree is also changing fast, mainly in the following respects:
- Comparing source versions M50 and M66, the build tools changed: the old version used GYP to generate ninja files, while the new version uses GN.
- Changes have been made to the directory structure, and some features supported in the older version have been removed in the new version.
- With the introduction of the C++11, C++14, and C++17 standards, newer WebRTC versions gradually adopt new-standard syntax, making the code more concise and easier to maintain.
In fact, the WebRTC source differs on each platform, but the overall protocol framework is the same; the implementations on Android, iOS, and JS provide the same protocol features, with code written using each platform's own characteristics.
The following picture, by Zhihu expert Chen Zixing, summarizes the overall structure:
4.1 Directory Structure
If organized in accordance with the usual hierarchical thinking, from bottom to top, it is roughly divided into the following levels:
- OS cross-platform adaptation, hardware device access, third-party library wrapper layer: including the network layer, cross-platform encapsulation of operating system APIs, audio and video device encapsulation, audio and video codecs, third-party DTLS implementation, etc.
- Network transport layer: including candidate collection, STUN/TURN protocol implementation, DTLS, RTP network connection establishment, SCTP connection establishment, and so on.
- Channel layer: contains the transport channel (BaseChannel layer) and the MediaChannel (MediaChannel layer). BaseChannel interconnects with the PeerConnection and Transport layers. The MediaChannel implementation is actually inside the audio and video engine and is the bridge between the BaseChannel and the engine.
- RTP_RTCP: mainly flow control
- Audio Engine and Video Engine: the audio and video engine layer, handling audio and video processing.
- Audio and video codecs, which are WebRTC’s own abstraction, rely on third-party libraries for the real codecs.
- PeerConnection and MediaStream are mainly implementations of JSEP protocol.
- PeerConnectionInterface is an abstract interface class.
For example, the M66 code is organized into the following directories:
Directory name | Contents | Notes | Path |
---|---|---|---|
api | Provides external interface, audio and video engine layer and Module direct interface. | — | — |
audio | Part of the audio stream abstraction, part of the engine logic. | — | — |
base | I have not studied this part yet. It belongs to the Chromium project, and WebRTC does not seem to use it much. | — | — |
build | Compile the script. It is important to note that different platforms get different toolsets when their code is downloaded. | — | — |
build_overrides | Compilation tools. | — | — |
buildtools | Build toolchain. | — | — |
call | It is mainly the interface abstraction of media stream. Provides a bridge between the media engine and the CODEC layer. The media streams in question are RTP streams. The PC layer also abstracts the media stream, either before encoding or after decoding. | — | — |
common_audio | Audio algorithm implementation, such as FFT. | — | — |
common_video | Video algorithm implementation, such as H264 protocol format. | — | — |
data | The test data | — | — |
examples | Examples of WebRTC use: peerconnection_client, peerconnection_server, stun, and turn examples are provided. | — | — |
help | Help information. | — | — |
infra | Tool. | — | — |
logging | WebRTC log library. | — | — |
media | Media engine layer, including audio and video engine implementation. | — | — |
modules | WebRTC abstracts some logical independent modules to facilitate extended maintenance. | — | — |
ortc | Media description protocol, similar to SDP. | — | — |
out | Build output directory, which is the demo directory in WebRTC's official compilation guide. | — | — |
p2p | Candidate collection is mainly implemented, NAT traversal. | — | — |
pc | Implement the JSEP protocol. | — | — |
resources | The test data | — | — |
rtc_base | Including Socket, thread, lock and other OS basic functions. | — | — |
rtc_tools | Network monitoring tools, audio and video analysis tools. Many tools are scripted implementations. | — | — |
sdk | Mainly mobile side specific implementation. | — | — |
stats | WebRTC statistics module implementation. | — | — |
style-guide | Coding style guide. | — | — |
system_wrappers | Encapsulation of OS-related functions, such as CPU and clock. | — | — |
test | Unit-test code implemented with gmock. | — | — |
testing | Gmock, gtest and other source code, belonging to the whole Chromium project. | — | — |
third_party | Third-party library dependencies. For example, boringSSL, abseil-cpp, libvpx, etc | — | — |
tools | Common toolset that the entire Chromium project relies on. | — | — |
tools_webrtc | The toolset used by WebRTC. For example, code checks the use of ValGrind. | — | — |
video | An abstract interface to the video RTP stream that is part of the video engine. | — | — |
4.2 Core Modules
2 PeerConnection:
The main implementation logic of PeerConnection is in the PC directory of WebRTC source code.
Everything from PeerConnectionFactory and PeerConnection provides PeerConnectionFactoryInterface and PeerConnectionInterface two interface classes. The Factory class, as its name implies, creates peerConnections, and we’ll only discuss peerConnections.
You may already be familiar with WebRTC’s JavaScript interface. For example, RTCPeerConnection, setLocalDescription, setRemoteDescription, createOffer, createAnswer, and so on. The Native implementation of JavaScript interface is completed in PeerConnection, which also has a corresponding set of interfaces. The JavaScript interface implementation specification is JSEP. It can be said that the model of this set of specifications have been implemented.
The communication protocol between WebRTC endpoints is ICE, and the session description format is SDP. PeerConnection implements the SessionDescription logic.
PeerConnection abstracts the RtpTransceiver, RtpSender, and RtpReceiver models, which correspond to the media sections described in the SDP.
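The mapping described above can be illustrated with a small sketch: each `m=` section in a session description corresponds to one RtpTransceiver (an RtpSender/RtpReceiver pair). The SDP string below is a simplified, hypothetical offer used only for illustration:

```python
# A hypothetical, simplified SDP with one audio and one video section.
sdp = """v=0
o=- 46117317 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
a=sendrecv
m=video 9 UDP/TLS/RTP/SAVPF 96
a=rtpmap:96 VP8/90000
a=sendrecv
"""

def media_sections(sdp_text):
    """Split an SDP blob into its m= sections. WebRTC models one
    RtpTransceiver (an RtpSender/RtpReceiver pair) per m= section."""
    sections = []
    for line in sdp_text.splitlines():
        if line.startswith("m="):
            kind = line.split("=", 1)[1].split()[0]  # "audio" or "video"
            sections.append({"kind": kind, "attributes": []})
        elif sections and line.startswith("a="):
            sections[-1]["attributes"].append(line[2:])
    return sections

for s in media_sections(sdp):
    print(s["kind"], s["attributes"][-1])  # audio sendrecv / video sendrecv
```

In PeerConnection, setRemoteDescription performs essentially this kind of parsing (via a full SDP parser) and then creates or matches transceivers for each media section.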
4.2.2 Modules
WebRTC abstracts logically independent, cohesive, and reusable functionality into modules. Modules live in the modules directory of the WebRTC source tree and mainly cover audio and video devices, codecs, flow control, and so on.
4.2.3 Network transmission module: libjingle
WebRTC reuses some components of libjingle, mainly the network and transport components.
5 WebRTC learning materials
5.1 books
- WebRTC Authoritative Guide: suitable for beginners; it gives a quick grounding in WebRTC theory.
The book is written by Alan B. Johnston and Daniel C. Burnett. The third edition shows how to implement a data channel that sends real-time text directly between browsers. It also covers a complete description of the browser media negotiation process (SDP session descriptions for Firefox and Chrome), how to use Wireshark to monitor WebRTC, and example captures. The TURN server, which supports NAT and firewall traversal, is a new addition in the third edition.
- Learning WebRTC (Chinese edition) is a good primer for front-end developers; for transport-layer development you will need further references.
The author is Dan Ristic. It reads more like a simple tutorial that walks you step by step through building a simple application, and it also includes examples of file sharing.
- Getting Started with WebRTC
- Development of Live Broadcasting System
- WebRTC Cookbook
- Real-Time Communication with WebRTC: Peer-to-Peer in the Browser
- Handbook of SDP for Multimedia Session Negotiations: SIP and WebRTC IP Telephony
- WebRTC Blueprints (English edition)
5.2 Recommended Blogs
- Cloud game open source project based on WebRTC
- WebRTC STUN/TURN server deployment
- Introduction to WebRTC
5.3 Demo Example code
Here are a few from the WebRTC developer community:
- STUN Server and Client: this project implements a simple STUN server and client on Windows, Linux, and Solaris. The STUN protocol (Simple Traversal of UDP through NATs) is described in IETF RFC 3489, available at http://www.ietf.org/rfc/rfc3489.txt
- Mac side implementation of a Framework Demo:
- WebRtcRoom Server: developed with Node.js; the signaling server uses Socket.IO, and Android, iOS, HTML, and server implementations are all provided.
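As a rough illustration of what a STUN client exchanges with a server, the sketch below builds and parses the 20-byte Binding Request header defined in RFC 3489 (2-byte message type, 2-byte attribute length, 16-byte transaction ID). A real client would send these bytes over UDP to the server, typically on port 3478, and read the mapped address from the response; that network round trip is omitted here.

```python
import os
import struct

BINDING_REQUEST = 0x0001  # STUN Binding Request message type (RFC 3489)

def build_binding_request() -> bytes:
    """Build a minimal STUN Binding Request header (RFC 3489).

    Layout: 2-byte message type, 2-byte message length (covers the
    attributes only, 0 here since we attach none), 16-byte transaction ID.
    """
    transaction_id = os.urandom(16)  # random ID to match request/response
    return struct.pack("!HH", BINDING_REQUEST, 0) + transaction_id

def parse_header(packet: bytes):
    """Return (message_type, attribute_length, transaction_id)."""
    msg_type, length = struct.unpack("!HH", packet[:4])
    return msg_type, length, packet[4:20]

request = build_binding_request()
msg_type, length, tid = parse_header(request)
print(len(request), hex(msg_type), length)  # 20 0x1 0
```

Note that the current STUN specification (RFC 5389 and successors) replaces the first 4 bytes of the transaction ID with a fixed magic cookie; the layout above follows the older RFC 3489 cited by the project.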
Several WebRTC demos available for cross-platform development:
- Flutter + WebRTC Demo: a WebRTC plugin for Flutter that supports iOS and Android. It has a high star count.
- React Native + WebRTC Demo: the React Native WebRTC module supports iOS and Android platforms, with video/audio/data channels. It has a solid star count of 2.8k.
- Xamarin + WebRTC: a native iOS video-calling app based on WebRTC and developed with Xamarin. The author is part of the Microsoft Xamarin team. It does not have many stars at the moment.
- Webrtc-uwp peercc-sample: this sample implements a WebRTC Universal Windows Platform (UWP) app containing the PeerConnection example from webrtc.org, used to establish audio/video calls between two peers. It includes a precompiled signaling server and references the WebRTC UWP NuGet package, which makes it the fastest way to get WebRTC up and running on UWP. The sample also includes Unity support and features developed for HoloLens and mixed reality. For the same sample referencing the complete WebRTC UWP source code rather than NuGet packages, see https://github.com/webrtc-uwp/PeerCC. It is based on the original PeerConnection example from https://webrtc.org.
- RTCStartupDemo: a set of super-simple signaling server and matching client demo code based entirely on the official WebRTC API (covering Web/Android/iOS/Windows). This is a startup demo for WebRTC beginners, including a simple signaling server based on Socket.IO and clients for the Web/Android/iOS/Windows platforms.
- Meething – ML-Camera: uses machine learning to generate avatars for Web-based video calls.
- ShareDrop: an HTML5 version of AirDrop based on WebRTC that allows users to transfer files peer-to-peer. The project uses WebRTC for peer-to-peer file transfer and signaling, together with Firebase.
- ShareDrop is a Web application inspired by Apple’s AirDrop service. It allows you to transfer files directly between devices without having to upload them to any server first. It uses WebRTC for secure peer-to-peer file transfers and Firebase for presence management and WebRTC signaling.
- ShareDrop allows you to send files to other devices on the same local network (i.e. devices with the same public IP address) without any configuration — just open https://www.sharedrop.io on all devices and they’ll see each other. It also allows you to send files between networks — just click the + button in the top right corner of the page to create a room with a unique URL and share that URL with others you want to send files to. Once they open the page in their device’s browser, you’ll see each other’s profile pictures.
- The main difference between ShareDrop and AirDrop is that ShareDrop requires an Internet connection to discover other devices, while AirDrop does not — it creates a special wireless network between these devices. ShareDrop, on the other hand, allows you to share files between mobile (Android) and desktop devices as well as across the web.
- Agora Web Demo:
- Agora Web TypeScript SDK
- Agora Online education scenario Demo on the Web
- 1-to-1 Video call Demo on the Web
- Web – based Multi-party Video call Demo
- Web side integrates Agora video SDK, video and audio self-collection
- Screen sharing Demo is implemented on the Web
- The Web client sends message Demo based on Agora REAL-TIME message SDK
- The Web side quickly integrates Agora video SDK to realize 17-person video live broadcast
- On the Web end, the Demo is used to remotely control the desktop by the browser
Reference: WebRTC audio and video development summary – architecture analysis