With the rapid development of the Internet and the coming 5G era, WebRTC, the front end's sharp tool for interactive live streaming and real-time audio and video, is an area that front-end developers cannot afford to miss. If that sounds interesting, read on.

What is WebRTC?

WebRTC stands for Web Real-Time Communications (RTC).

Around 2011, Google acquired GIPS, a company that had developed many components for RTC, such as codecs and echo cancellation technology. Google open-sourced the technology developed by GIPS in the hope of making it an industry standard.

The acquisition cost a lot of money, and open-sourcing it meant giving the technology away, but building an open-source audio and video ecosystem was clearly of greater value to Google. "Browser + WebRTC" is one of Google's answers: the vision is to enable fast audio and video communication directly between browsers.

Today, WebRTC is a free, open project that gives web browsers real-time communication capability through a simple JavaScript API.


The WebRTC architecture

Discussions of WebRTC generally start from its layered architecture diagram. From top to bottom, the architecture consists of:

Web API layer: provides developers with the standard JavaScript APIs through which front-end applications use WebRTC capabilities.

C++ API layer: for browser developers, enabling browser makers to easily implement the Web API proposal.

Audio Engine: a complete framework for audio processing, covering the whole chain from sound-card capture to network transmission.

  1. iSAC/iLBC/Opus codecs.
  2. NetEQ speech signal processing.
  3. Echo cancellation and noise reduction.

Video Engine: a complete framework for video processing, covering the whole chain from camera capture, through network transmission, to video display.

  1. VP8 codec.
  2. Jitter buffer: dynamic jitter buffering.
  3. Image enhancements.

Transport: the transport/session layer, covering session negotiation and NAT traversal components.

  1. RTP real-time transport protocol.
  2. P2P transmission via STUN + TURN + ICE for NAT traversal.

Hardware modules: audio and video hardware capture and network I/O.


WebRTC's important classes and APIs

Network Stream API

1. MediaStream and MediaStreamTrack

These classes are not entirely within WebRTC's scope, but both local media streams and remote streams played through the video tag involve them. A media stream (MS) consists of two parts: MediaStreamTrack and MediaStream.

  • MediaStreamTrack represents a single type of data stream. It can be an audio or video track.

  • MediaStream is a complete audio and video stream. It can contain zero or more MediaStreamTracks. Its main job is to keep the media tracks it contains synchronized during playback.
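To make the track/stream relationship concrete, here is a small sketch. The track objects are hypothetical mocks standing in for real MediaStreamTracks; in a browser they would come from stream.getTracks().

```javascript
// A MediaStream bundles zero or more MediaStreamTracks, and playback keeps
// them in sync. This helper just groups tracks by their kind.
function summarizeTracks(tracks) {
  return tracks.reduce(function (acc, t) {
    acc[t.kind] = (acc[t.kind] || 0) + 1;
    return acc;
  }, {});
}

// In the browser this would be: summarizeTracks(stream.getTracks())
var mockTracks = [{ kind: "audio" }, { kind: "video" }, { kind: "video" }];
console.log(summarizeTracks(mockTracks)); // { audio: 1, video: 2 }
```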

2. Constraints (media constraints)

Another important concept related to MediaStream is Constraints. It specifies whether the captured data meets your needs, and it is set through a parameter object.

// basic
const constraint1 = {
    "audio": true,  // Whether to capture audio
    "video": true   // Whether to capture video
}

// detailed
const constraint2 = {
    "audio": {
        "sampleSize": 8,
        "echoCancellation": true // Echo cancellation
    },
    "video": {  // Video-related settings
        "width": {
            "min": 381,  // Minimum width of the video
            "max": 640
        },
        "height": {
            "min": 200,  // Minimum height
            "max": 480
        },
        "frameRate": {
            "min": 10,   // Minimum frame rate
            "max": 28
        }
    }
}

3. Obtain the local audio and video of the device

Local media streams are obtained with navigator.getUserMedia(), which provides access to the user's camera and microphone media streams.

var video = document.querySelector('video');
navigator.getUserMedia({
    audio: true,
    video: true
}, function (stream) {
    // Got the local media stream
    video.src = window.URL.createObjectURL(stream);
}, function (error) {
    console.log(error);
});

The demo above obtains a stream through getUserMedia. The browser pops up a prompt asking the user for permission; only after permission is granted is the stream handed to the video tag for playback.

The first argument to getUserMedia is the constraints object, and the second is a callback that receives the video stream. You can also use the promise-based form:

navigator.mediaDevices.getUserMedia(constraints).
then(successCallback).catch(errorCallback);
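A fuller sketch of the promise form: note that modern browsers attach the stream to a video element via srcObject rather than createObjectURL. The mediaDevices object is injected as a parameter here (and mocked below) purely so the flow can be traced outside a browser; in a real page you would pass navigator.mediaDevices.

```javascript
// Promise-based capture sketch. On success the stream is attached to the
// video element; on failure the error is logged and null is returned.
function captureTo(videoEl, mediaDevices, constraints) {
  return mediaDevices.getUserMedia(constraints)
    .then(function (stream) {
      videoEl.srcObject = stream; // hand the stream to the <video> element
      return stream;
    })
    .catch(function (error) {
      console.log(error);
      return null;
    });
}

// Hypothetical mock standing in for the browser API, just to trace the flow:
var fakeDevices = {
  getUserMedia: function () { return Promise.resolve({ id: "stream1" }); }
};
var fakeVideo = {};
captureTo(fakeVideo, fakeDevices, { audio: true, video: true })
  .then(function (stream) {
    console.log(fakeVideo.srcObject === stream); // true
  });
```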

RTCPeerConnection

RTCPeerConnection is the connection channel for an RTC connection between peers, used to transmit audio and video data without a server in between. (In real production live-streaming systems, servers are still needed.)

RTCPeerConnection is an advanced, powerful, WebSocket-like connection channel for transmitting audio and video data, except that it connects browser to browser.

It is advanced and powerful because, as the core API of WebRTC's web layer, it frees you from worrying about transmission delay and jitter, audio/video codecs, audio-picture synchronization, and similar problems. Using PeerConnection directly gives you these low-level capabilities that the browser encapsulates.

var pc = new RTCPeerConnection({
    "iceServers": [
        { "url": "stun:stun.l.google.com:19302" }, // Google's public test STUN server
        { "url": "turn:[email protected]", "credential": "pass" } // If you have a TURN server, configure it here
    ]
});
pc.setRemoteDescription(remote.offer);
pc.addIceCandidate(remote.candidate);
pc.addStream(local.stream);
pc.createAnswer(function (answer) {
    // Generate an SDP answer describing this end of the connection and send it to the peer
    pc.setLocalDescription(answer);
    signalingChannel.send(answer.sdp);
});
pc.onicecandidate = function (evt) {
    // Send local ICE candidates to the peer as they are gathered
    if (evt.candidate) {
        signalingChannel.send(evt.candidate);
    }
}
pc.onaddstream = function (evt) {
    // A remote stream arrived; play it
    var remote_video = document.getElementById('remote_video');
    remote_video.src = window.URL.createObjectURL(evt.stream);
}

So what is the ICE server configuration? What is signalingChannel? What are answer and offer? What is a candidate?

We can create an RTCPeerConnection with new RTCPeerConnection(). The code above only demonstrates the RTCPeerConnection API and its setup; it will not run as-is.

To complete an RTCPeerConnection, you need to configure an ICE server (STUN or TURN) and exchange information before connecting; for that, you need a signaling server. What is exchanged is mainly SDP (Session Description Protocol) data and ICE candidates, described in the following sections.
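The order of that pre-connection exchange can be sketched with plain mock objects in place of RTCPeerConnection (whose real createOffer/createAnswer are asynchronous and produce SDP descriptions). The mocks are hypothetical; only the message order is the point: the caller offers, the callee answers, and each side sets the other's description as its remote description.

```javascript
// Sketch of the offer/answer negotiation order over a signaling channel
// (here just an array collecting the messages in transit).
function negotiate(caller, callee, signaling) {
  var offer = caller.createOffer();      // caller describes itself (SDP)
  signaling.push(offer);                 // offer travels via signaling
  callee.setRemoteDescription(offer);
  var answer = callee.createAnswer();    // callee describes itself (SDP)
  signaling.push(answer);                // answer travels back via signaling
  caller.setRemoteDescription(answer);
}

// Hypothetical mock peers standing in for RTCPeerConnection:
function mockPeer(name) {
  return {
    remote: null,
    createOffer: function () { return { type: "offer", from: name }; },
    createAnswer: function () { return { type: "answer", from: name }; },
    setRemoteDescription: function (desc) { this.remote = desc; }
  };
}

var log = [];
negotiate(mockPeer("A"), mockPeer("B"), log);
console.log(log.map(function (m) { return m.type; })); // [ 'offer', 'answer' ]
```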

Peer-to-peer Data API

RTCDataChannel establishes point-to-point communication between browsers. Common communication methods include WebSocket, Ajax, and so on. WebSocket is bidirectional, but both WebSocket and Ajax are client-to-server communication: you must have a server to communicate at all.

Because RTCDataChannel rides on an RTCPeerConnection, it provides point-to-point communication without going through a server, avoiding server middleware altogether.

var pc = new RTCPeerConnection();
var dc = pc.createDataChannel("my channel");

dc.onmessage = function (event) {
  console.log("received: " + event.data);
};

dc.onopen = function () {
  console.log("datachannel open");
};

dc.onclose = function () {
  console.log("datachannel close");
};

Signaling Channel

We said that WebRTC's RTCPeerConnection can communicate between browsers without a server.

But there is a problem: before establishing a PeerConnection, how do two browsers even know the other exists? Furthermore, how do they learn each other's network location (IP/port, etc.)? Which codecs the other supports? Even when a media stream starts and when it ends?

Therefore, before establishing a WebRTC RTCPeerConnection, another channel must be established to deliver these negotiation messages. The messages that need to be negotiated up front are called signaling, and this channel is called the signaling channel.

Signaling exchanged between two client browsers has the following functions:

  • Negotiate media capabilities and settings (exchanging information in SDP objects: media types, codecs, bandwidth, and other metadata)
  • Identify and verify the identity of session participants
  • Control the media session: indicate progress, change the session, terminate the session, and so on

It mainly involves the offer/answer SDP exchange and the exchange of ICE candidates. This process is called WebRTC negotiation.

Note that the WebRTC standard itself does not specify how signaling is exchanged; you implement the signaling service however suits your situation.

In web browsers, a WebSocket channel is generally used as the signaling channel; for example, a signaling service can be built on socket.io. There are also many mature, stable open-source signaling solutions to choose from.
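To show the relay role signaling plays, here is a minimal in-memory stand-in for a signaling service: clients join, and a message sent by one is delivered to the others. A real implementation would sit on top of WebSocket or socket.io; the class and method names here are hypothetical.

```javascript
// Minimal in-memory signaling hub: relays each message to every other peer.
function SignalingHub() {
  this.peers = [];
}
SignalingHub.prototype.join = function (onmessage) {
  var self = this;
  var peer = { onmessage: onmessage };
  this.peers.push(peer);
  return {
    send: function (msg) {
      self.peers.forEach(function (p) {
        if (p !== peer) p.onmessage(msg); // deliver to everyone but the sender
      });
    }
  };
};

var hub = new SignalingHub();
var received = [];
var alice = hub.join(function (msg) { received.push("alice got " + msg.type); });
var bob = hub.join(function (msg) { received.push("bob got " + msg.type); });
alice.send({ type: "offer" });
bob.send({ type: "answer" });
console.log(received); // [ 'bob got offer', 'alice got answer' ]
```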

The key to establishing a WebRTC connection: the ICE connection

After the SDP (offer/answer) is exchanged and set, WebRTC begins the actual connection to transmit audio and video data. Establishing the connection is complicated because WebRTC needs both efficient transmission and stable connectivity.

The network locations of browser clients are often complex. They may be on the same Intranet segment, or in two entirely different places with complex NAT gateways in between. So a mechanism is needed to find the path with the best transmission quality, and WebRTC has that capability.

Let’s start with three simple concepts.

  • ICE candidate: contains the protocol, IP address and port, candidate type, and other information used for remote communication.
  • STUN/TURN: STUN enables direct P2P connections; TURN provides relayed connections. Both have standard protocols.
  • NAT traversal: NAT is network address translation. Clients often cannot be assigned a public IP address and can only reach the external network through a mapping from an internal IP/port to a public one. NAT traversal is the process by which clients behind NAT gateways discover each other and establish connections.

The general principle and steps of an ICE connection are:

  1. Start the ICE candidate gathering task.
  2. Collect host-type candidates locally (Intranet IP/ports).
  3. Collect srflx-type candidates (the IP/port that NAT maps to the external network) through the STUN server.
  4. Collect relay-type candidates (the relay server's IP and port) through the TURN server.
  5. Attempt NAT traversal, trying connections in priority order: host type, then srflx type, then relay type.
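Each gathered candidate carries its type in the "typ" field of its candidate line. This hypothetical helper pulls the type out so you can see which gathering step (host / srflx / relay) produced a candidate; in the browser it would be applied to evt.candidate.candidate inside the onicecandidate handler.

```javascript
// Extract the candidate type ("host", "srflx", "relay", ...) from an ICE
// candidate line. Returns "unknown" if no "typ" field is present.
function candidateType(candidateLine) {
  var match = /\btyp\s+(\S+)/.exec(candidateLine);
  return match ? match[1] : "unknown";
}

// In the browser:
// pc.onicecandidate = function (evt) {
//   if (evt.candidate) console.log(candidateType(evt.candidate.candidate));
// };

console.log(candidateType("candidate:1 1 udp 2122260223 192.168.1.2 54321 typ host generation 0"));
// "host" — the browser's own Intranet address
console.log(candidateType("candidate:2 1 udp 1686052607 203.0.113.5 61000 typ srflx raddr 192.168.1.2"));
// "srflx" — the address NAT maps to the external network, learned via STUN
```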

With this, WebRTC can find the connection with the best transmission quality. Of course, it is not quite that simple; the whole process involves many more low-level details.

Demo code: the steps for using WebRTC

Combining everything above — the WebRTC APIs, signaling service, SDP negotiation, and ICE connection — let's illustrate the steps of using WebRTC with a piece of code.

var signalingChannel = new SignalingChannel();
var pc = null;
var ice = {
    "iceServers": [
        { "url": "stun:stun.l.google.com:19302" }, // Google's public test STUN server
        { "url": "turn:[email protected]", "credential": "pass" } // If you have a TURN server, configure it here
    ]
};
signalingChannel.onmessage = function (msg) {
    if (msg.offer) { // Listen for and process the remote offer delivered through the signaling channel
        pc = new RTCPeerConnection(ice);
        pc.onicecandidate = function (evt) { // Send local ICE candidates to the peer as they are gathered
            if (evt.candidate) {
                signalingChannel.send(evt.candidate);
            }
        };
        pc.onaddstream = function (evt) { // A remote stream arrived; play it
            var remote_video = document.getElementById('remote_video');
            remote_video.src = window.URL.createObjectURL(evt.stream);
        };
        pc.setRemoteDescription(msg.offer);
        navigator.getUserMedia({ "audio": true, "video": true }, gotStream, logError);
    } else if (msg.candidate) { // Register the remote ICE candidate to start connection checking
        pc.addIceCandidate(msg.candidate);
    }
};
function gotStream(stream) {
    pc.addStream(stream);
    var local_video = document.getElementById('local_video');
    local_video.src = window.URL.createObjectURL(stream);
    pc.createAnswer(function (answer) { // Generate an SDP answer describing this end and send it to the peer
        pc.setLocalDescription(answer);
        signalingChannel.send(answer.sdp);
    });
}
function logError() { ... }

The current state of WebRTC

Standards

In the beginning, each browser vendor implemented its own set of APIs, such as webkitRTCPeerConnection and mozRTCPeerConnection. Differences like these are, of course, unbearable for front-end developers.

Adapter.js was created to bridge this gap and help us write WebRTC code that follows the specification. Check out github.com/webrtcHacks…

Another key point about standards: the W3C's WebRTC 1.0 Candidate Recommendation released in 2018 (www.w3.org/TR/webrtc) made WebRTC a major technical driver of the commercial explosion in video communication. As a result, most communication vendors today base their web-browser solutions on WebRTC.

Compatibility

The evolution of the standard inevitably drives compatibility and support forward. Around 2017, I did pre-research for an H5 online claw machine and found that many browsers, especially on mobile and iOS, did not support WebRTC at all, so I had to abandon the WebRTC approach.

Today, browser support looks very good: apart from IE, which still does not support it, PC browsers basically all do. On mobile, iOS supports it from iOS 11 onward.

Here's the key: don't just look at the Can I Use tables; check whether each custom mobile browser actually supports it. I don't have extensive compatibility test data here.
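Rather than trusting compatibility tables alone, you can probe the APIs at runtime. In this sketch the global scope is passed in as a parameter (a hypothetical design, so the logic can be traced outside a browser); in a page you would call supportsWebRTC(window).

```javascript
// Runtime feature detection: WebRTC needs both a peer connection constructor
// and the media capture API to be usable.
function supportsWebRTC(scope) {
  var nav = scope.navigator || {};
  return Boolean(
    scope.RTCPeerConnection &&
    nav.mediaDevices &&
    nav.mediaDevices.getUserMedia
  );
}

console.log(supportsWebRTC({})); // false: no WebRTC APIs in an empty scope
```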

But one conclusion holds: WebRTC is available in the latest iOS and Android browsers and in WeChat.

WebRTC learning guide

As a general learning path, start with the WebRTC core APIs and implement demos such as local audio/video capture and display. Next, try building a simple signaling service and achieving simple communication between browsers on an Intranet. Then dig into connection traversal, transmission principles, and the related protocols, and finally explore WebRTC's internal audio and video knowledge.

This has been an accessible, comprehensive introduction to the basics of WebRTC for the web front end.

References

Webrtc.org/architectur…
Developer.mozilla.org/zh-CN/docs/…