On the front end, WebRTC is a relatively niche technology, but for online education it is central. There are many articles about WebRTC on the Internet; this one tries to walk through the WebRTC workflow so that readers come away with a complete picture of the technology.

WebRTC (Web Real-Time Communications) is an audio and video technology open-sourced by Google and promoted into the W3C standards. It aims to achieve real-time audio and video communication between browsers in a peer-to-peer manner, without relying on intermediary servers.

The biggest difference from the classic B/S architecture of the Web world is that WebRTC clients communicate with each other directly rather than through a server, which saves server resources and improves communication efficiency. To achieve this, a typical WebRTC session consists of four steps: find the peer, negotiate, establish a connection, and communicate. Each step is described below.

Step 1: Find the peer

Although the communication itself does not go through a server, each party must learn of the other's existence before communication can begin. This is where a signaling server is required.

Signaling server

A signaling server is the so-called "middle man" that helps the two parties establish a connection. There is no standard for it, meaning developers can implement it with any technology they like, such as WebSocket or AJAX.

The two ends that initiate WebRTC communication are called peers, and a successfully established connection is called a PeerConnection. One WebRTC session can contain multiple PeerConnections.

const pc2 = new RTCPeerConnection({ ... });

In the peer-discovery phase, the signaling server generally identifies and verifies the participants. The browser connects to the signaling server and sends the information needed for the session, such as the room number and account information; the signaling server then finds a peer available for communication and attempts to start the session.

In fact, the signaling server plays a very important role throughout the WebRTC communication process. Beyond the duties above, the SDP exchange and the ICE connection, both covered later, also depend on signaling.

Step 2: Negotiate

The negotiation process mainly refers to SDP exchange.

SDP protocol

The Session Description Protocol (SDP) is a general-purpose protocol not limited to WebRTC. It describes multimedia sessions, covering session announcement, session invitation, and session initialization.

In WebRTC, SDP is mainly used to describe:

  • The media capabilities supported by the device, including codecs
  • ICE candidate addresses
  • The streaming media transport protocol

The SDP protocol is text-based and has a very simple format. It consists of multiple lines. Each line has the following format:

<type>=<value>

Here, type is the attribute name and value is the attribute value, whose specific format depends on the type. The following is a typical SDP example:

v=0
o=alice 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000

Among them:

  1. v= indicates the protocol version number
  2. o= indicates the session originator, including the username, sessionId, etc.
  3. s= indicates the session name; it is a unique field
  4. c= indicates connection information, including the network type, address type, and address
  5. t= indicates the session time, i.e. the start and end times; 0 0 means a persistent session
  6. m= describes a media stream, including the media type, port, transport protocol, and media format
  7. a= represents an additional attribute, used here to extend the media protocol
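To make the type=value line format concrete, here is a toy parser (parseSdp is a hypothetical helper, not a browser API) that splits an SDP string into typed lines:

```javascript
// Toy SDP parser: each non-empty line has the form "<type>=<value>".
// Repeated types (m=, a=, ...) are kept in order of appearance.
function parseSdp(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.includes('='))
    .map((line) => {
      const idx = line.indexOf('=');
      return { type: line.slice(0, idx), value: line.slice(idx + 1) };
    });
}

const sdp = [
  'v=0',
  'o=alice 2890844526 2890844526 IN IP4 host.anywhere.com',
  'm=audio 49170 RTP/AVP 0',
  'a=rtpmap:0 PCMU/8000',
].join('\n');

const lines = parseSdp(sdp);
console.log(lines[0]); // { type: 'v', value: '0' }
console.log(lines.filter((l) => l.type === 'm').length); // 1
```

A real SDP implementation also validates line order and media sections, but the line grammar really is this simple.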

Plan B VS Unified Plan

During the development of WebRTC, the semantics of SDP changed several times; Plan B and Unified Plan are the two most common today. Both can represent multiple media streams in one PeerConnection, and they differ as follows:

  • Plan B: all video streams share one m= line and all audio streams share another; individual streams are distinguished by ssrc
  • Unified Plan: each media stream gets its own m= line

WebRTC 1.0 uses Unified Plan, which mainstream browsers support and enable by default. In Chrome, you can check which semantics are in use via the following API:

// Chrome; pc is an RTCPeerConnection instance
pc.getConfiguration().sdpSemantics; // 'unified-plan' or 'plan-b'

The negotiation process

The negotiation process is not complicated: the caller creates an offer (createOffer) and sends it to the callee; the callee responds with an answer (createAnswer):

// Sender; sendOffer / onReceiveAnswer are pseudo methods
const pc1 = new RTCPeerConnection();
const offer = await pc1.createOffer();
await pc1.setLocalDescription(offer);
sendOffer(offer);

onReceiveAnswer(async (answer) => {
  await pc1.setRemoteDescription(answer);
});

// Receiver; sendAnswer / onReceiveOffer are pseudo methods
const pc2 = new RTCPeerConnection();
onReceiveOffer(async (offer) => {
  await pc2.setRemoteDescription(offer);
  const answer = await pc2.createAnswer();
  await pc2.setLocalDescription(answer);
  sendAnswer(answer);
});

It is worth noting that the SDP exchange may take place multiple times as the relevant information of both parties changes during the communication process.

Step 3: Establish a connection

The modern Internet environment is very complex, and our devices are often hidden behind layers of gateways. Therefore, to establish a direct connection, we also need to learn the usable connection addresses of both parties. This process is called NAT traversal; since it is mainly accomplished with the help of ICE servers, it is also known as ICE hole punching.

ICE

An ICE (Interactive Connectivity Establishment) server is a third-party server independent of the communicating parties. It helps each device discover addresses that the peer can connect to; this discovery is performed by a STUN (Session Traversal Utilities for NAT) server. Each available address is called an ICE candidate, and the browser selects the most suitable one to use. The candidate types and their priorities are as follows:

  1. Host candidate: an intranet address obtained from the device's network interface; it has the highest priority
  2. Server reflexive candidate: the device's public IP address as obtained via the ICE (STUN) server. The process of obtaining it is complicated; roughly, the browser sends several probe requests to the server and derives its own public address from the responses
  3. Relay candidate: a last-ditch option provided by a relay server, used when neither of the first two works; it has the lowest priority
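The candidate type is visible in the candidate string itself, after the typ token ("host", "srflx" for server reflexive, "relay"). A small sketch with hand-written sample strings (not captured from a real session):

```javascript
// Extract the candidate type from an ICE candidate line:
// "host", "srflx" (server reflexive), or "relay".
function candidateType(candidate) {
  const match = candidate.match(/ typ (\S+)/);
  return match ? match[1] : null;
}

// Illustrative sample candidate strings.
const host = 'candidate:1 1 udp 2122260223 192.168.1.2 56143 typ host';
const srflx =
  'candidate:2 1 udp 1686052607 203.0.113.1 56143 typ srflx raddr 192.168.1.2 rport 56143';

console.log(candidateType(host));  // 'host'
console.log(candidateType(srflx)); // 'srflx'
```

In the browser, the same string is available as `event.candidate.candidate` in the icecandidate event, which the demo at the end of this article logs.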

When creating a PeerConnection, you can specify the ICE server addresses. Each time WebRTC finds an available candidate, an icecandidate event fires, and the addIceCandidate method can be called to add the candidate to the session:

const pc = new RTCPeerConnection({
  // Configure the ICE servers
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    { urls: 'turn:[email protected]', credential: 'pass' }
  ]
});
pc.addEventListener('icecandidate', (e) => {
  // In practice the candidate is sent to the peer via signaling,
  // and the peer calls addIceCandidate with it
  pc.addIceCandidate(e.candidate);
});

The ICE connection established through the candidate can be roughly divided into the following two situations:

  1. A direct P2P connection, which is the case for candidate types 1 and 2 above;
  2. A connection relayed through a TURN (Traversal Using Relays around NAT) server, which is the third case above.
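The priority ordering above is not arbitrary: RFC 8445 (section 5.1.2.1) defines a formula for candidate priority. A sketch in plain JavaScript, where the type-preference values are the RFC's recommendations and the localPreference / componentId defaults are illustrative:

```javascript
// ICE candidate priority, per RFC 8445 section 5.1.2.1:
//   priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
// Recommended type preferences: host 126, srflx 100, relay 0.
const TYPE_PREFERENCE = { host: 126, srflx: 100, relay: 0 };

function icePriority(type, localPreference = 65535, componentId = 1) {
  return (
    2 ** 24 * TYPE_PREFERENCE[type] +
    2 ** 8 * localPreference +
    (256 - componentId)
  );
}

console.log(icePriority('host') > icePriority('srflx'));  // true
console.log(icePriority('srflx') > icePriority('relay')); // true
```

Because the type preference is weighted by 2^24, a host candidate always outranks a reflexive one, which always outranks a relay, regardless of the other terms.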

Similarly, due to network changes and other reasons, ICE hole punching may also be repeated several times during communication.

Step 4: Communicate

WebRTC chose UDP as the underlying transport protocol. Why not choose TCP, which is more reliable? There are three main reasons:

  1. UDP is connectionless, consumes fewer resources, and is fast
  2. Losing a small amount of audio/video data in transit does not matter much
  3. TCP's timeout and retransmission mechanism can introduce significant delays

On top of UDP, WebRTC uses the re-encapsulated RTP and RTCP protocols:

  • Real-time Transport Protocol (RTP): used to transmit real-time data such as audio and video
  • RTP Control Protocol (RTCP): used to monitor the quality of the data transmission and give feedback to the sender

In actual communication, the two protocols send and receive data at the same time.
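To give a feel for what RTP actually puts on the wire, here is a sketch that parses the 12-byte fixed RTP header defined in RFC 3550; the packet bytes are hand-constructed for illustration, not captured traffic:

```javascript
// Parse the 12-byte fixed RTP header (RFC 3550):
// byte 0: version (2 bits) + padding/extension/CSRC count,
// byte 1: marker bit + payload type, then sequence number,
// timestamp, and SSRC in network byte order.
function parseRtpHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    version: bytes[0] >> 6,
    marker: (bytes[1] >> 7) & 1,
    payloadType: bytes[1] & 0x7f,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
  };
}

// Version 2, payload type 0 (PCMU), sequence 1, timestamp 160, SSRC 0x1234.
const packet = new Uint8Array([
  0x80, 0x00, 0x00, 0x01,
  0x00, 0x00, 0x00, 0xa0,
  0x00, 0x00, 0x12, 0x34,
]);

const header = parseRtpHeader(packet);
console.log(header.version);        // 2
console.log(header.payloadType);    // 0
console.log(header.sequenceNumber); // 1
```

The sequence number and timestamp are what let the receiver reorder packets and reconstruct timing over the unreliable UDP transport described above.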

Key APIs

Here is a demo showing which APIs front-end WebRTC uses:

HTML

<!DOCTYPE html>
<html>
<head>

    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, user-scalable=yes, initial-scale=1, maximum-scale=1">
    <meta name="mobile-web-app-capable" content="yes">
    <meta id="theme-color" name="theme-color" content="#ffffff">
    <base target="_blank">
    <title>WebRTC</title>
    <link rel="stylesheet" href="main.css"/>
</head>

<body>
<div id="container">
    <video id="localVideo" playsinline autoplay muted></video>
    <video id="remoteVideo" playsinline autoplay></video>

    <div class="box">
        <button id="startButton">Start</button>
        <button id="callButton">Call</button>
    </div>
</div>

<script src="https://webrtc.github.io/adapter/adapter-latest.js"></script>
<script src="main.js" async></script>
</body>
</html>

JS

'use strict';

const startButton = document.getElementById('startButton');
const callButton = document.getElementById('callButton');
callButton.disabled = true;
startButton.addEventListener('click', start);
callButton.addEventListener('click', call);

const localVideo = document.getElementById('localVideo');
const remoteVideo = document.getElementById('remoteVideo');

let localStream;
let pc1;
let pc2;
const offerOptions = {
  offerToReceiveAudio: 1,
  offerToReceiveVideo: 1
};

async function start() {
  /**
   * Get the local media stream
   */
  startButton.disabled = true;
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
  localVideo.srcObject = stream;
  localStream = stream;
  callButton.disabled = false;
}

function gotRemoteStream(e) {
  if (remoteVideo.srcObject !== e.streams[0]) {
    remoteVideo.srcObject = e.streams[0];
    console.log('pc2 received remote stream');
    setTimeout(() => {
      pc1.getStats(null).then(stats => console.log(stats));
    }, 2000);
  }
}

function getName(pc) {
  return (pc === pc1) ? 'pc1' : 'pc2';
}

function getOtherPc(pc) {
  return (pc === pc1) ? pc2 : pc1;
}

async function call() {
  callButton.disabled = true;
  /**
   * Create the calling connection
   */
  pc1 = new RTCPeerConnection({
    sdpSemantics: 'unified-plan', // Specify Unified Plan
    // Configure the ICE servers
    iceServers: [
      { urls: 'stun:stun.l.google.com:19302' },
      { urls: 'turn:[email protected]', credential: 'pass' }
    ]
  });
  pc1.addEventListener('icecandidate', e => onIceCandidate(pc1, e)); // Listen for ICE candidate events

  /**
   * Create the answering connection
   */
  pc2 = new RTCPeerConnection();

  pc2.addEventListener('icecandidate', e => onIceCandidate(pc2, e));
  pc2.addEventListener('track', gotRemoteStream);

  /**
   * Add the local media stream
   */
  localStream.getTracks().forEach(track => pc1.addTrack(track, localStream));

  /**
   * pc1 createOffer
   */
  const offer = await pc1.createOffer(offerOptions); // Create the offer
  await onCreateOfferSuccess(offer);
}

async function onCreateOfferSuccess(desc) {
  /**
   * pc1 sets the local SDP
   */
  await pc1.setLocalDescription(desc);

  /******* pc2 acts as the other party, simulating receipt of an offer *******/

  /**
   * pc2 sets the remote SDP
   */
  await pc2.setRemoteDescription(desc);

  /**
   * pc2 createAnswer
   */
  const answer = await pc2.createAnswer(); // Create the answer
  await onCreateAnswerSuccess(answer);
}

async function onCreateAnswerSuccess(desc) {
  /**
   * pc2 sets the local SDP
   */
  await pc2.setLocalDescription(desc);

  /**
   * pc1 sets the remote SDP
   */
  await pc1.setRemoteDescription(desc);
}

async function onIceCandidate(pc, event) {
  try {
    await (getOtherPc(pc).addIceCandidate(event.candidate)); // Set the ICE candidate
    onAddIceCandidateSuccess(pc);
  } catch (e) {
    onAddIceCandidateError(pc, e);
  }
  console.log(`${getName(pc)} ICE candidate:\n${event.candidate ? event.candidate.candidate : '(null)'}`);
}

function onAddIceCandidateSuccess(pc) {
  console.log(`${getName(pc)} addIceCandidate success`);
}

function onAddIceCandidateError(pc, error) {
  console.log(`${getName(pc)} failed to add ICE Candidate: ${error.toString()}`);
}

In closing

As an overview, this article has introduced WebRTC at a relatively shallow level; many details and underlying principles could not be covered in depth for reasons of space. I have only been working with this technology for a few months, so if you spot any errors, please let me know.