This is the second day of my participation in the November Update Challenge, the final writing challenge of 2021.

Overview

WebRTC (Web Real-Time Communications) is a real-time communication technology that lets web applications and sites establish peer-to-peer connections between browsers, without an intermediary, to transmit video streams, audio streams, or any other data. WebRTC comprises a set of standards that make peer-to-peer data sharing and teleconferencing possible without installing any plug-ins or third-party software.

This section describes the protocols that WebRTC builds on.

ICE

Interactive Connectivity Establishment (ICE) is a framework that allows your web browser to connect with peers. There are many reasons why a direct connection from peer A to peer B will not work. ICE needs to bypass firewalls that would prevent connections from being opened, give your device a unique address if, as in most cases, it does not have a public IP address, and relay data through a server if your router does not allow you to connect directly with peers. ICE uses STUN and/or TURN servers for this purpose, as described below.

STUN

Session Traversal Utilities for NAT (STUN) is a protocol for discovering your public address and identifying any restrictions on your router that would prevent direct connections with peers.

The client sends a request to a STUN server on the Internet, which replies with the client's public address and whether the client is reachable behind the router's NAT.

NAT

Network Address Translation (NAT) is used to give your device a public IP address. The router has a public IP address, and each device connected to the router has a private IP address. Each request is translated from the device's private IP address to the router's public IP address with a unique port. This way, each device does not need its own public IP address but can still be found on the Internet.
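
The translation described above can be pictured as a lookup table kept by the router. The following is a toy sketch, not a real network stack: createNat, the addresses, and the starting port are all assumptions made for illustration.

```javascript
// Toy model of a NAT table: only the bookkeeping, no real networking.
function createNat(publicIp, firstPort) {
  var outbound = new Map(); // "privateIp:privatePort" -> public port
  var inbound = new Map();  // public port -> "privateIp:privatePort"
  var nextPort = firstPort;

  return {
    // Outgoing request: replace the private source address with the
    // router's public IP and a port that is unique to this device/port pair.
    translateOutbound: function (privateIp, privatePort) {
      var key = privateIp + ':' + privatePort;
      if (!outbound.has(key)) {
        outbound.set(key, nextPort);
        inbound.set(nextPort, key);
        nextPort += 1;
      }
      return publicIp + ':' + outbound.get(key);
    },
    // Incoming packet: find the device a public port maps to, or null.
    translateInbound: function (publicPort) {
      return inbound.has(publicPort) ? inbound.get(publicPort) : null;
    }
  };
}

var nat = createNat('203.0.113.7', 40000);
console.log(nat.translateOutbound('192.168.1.10', 5000));
console.log(nat.translateOutbound('192.168.1.11', 5000));
console.log(nat.translateInbound(40001));
```

Two devices using the same private port end up on different public ports, which is what lets the router route replies back to the right device.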

Some routers restrict who can connect to devices on the network. This can mean that even with the public IP address found by the STUN server, nobody can create a connection. In this case, we need to use TURN.

TURN

Some routers that use NAT employ a restriction called “symmetric NAT”. This means that the router only accepts connections from peers that you have previously connected to.

Traversal Using Relays around NAT (TURN) is intended to bypass the symmetric NAT restriction by opening a connection to a TURN server and relaying all information through that server. You create a connection to the TURN server and tell all peers to send their packets to the server, which then forwards them to you. This obviously comes with some overhead, so it is only used if there is no other option.
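
In the browser, STUN and TURN servers are passed to RTCPeerConnection through the iceServers field of its configuration object. A minimal sketch of building that object follows; buildIceConfiguration is a hypothetical helper, and the server URLs and credentials are placeholders rather than real services.

```javascript
// Build the configuration object expected by RTCPeerConnection.
// The helper name and all server details are illustrative assumptions.
function buildIceConfiguration(stunUrl, turn) {
  var iceServers = [{ urls: stunUrl }];
  if (turn) {
    // TURN servers relay traffic, so they normally require credentials.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential
    });
  }
  return { iceServers: iceServers };
}

var configuration = buildIceConfiguration('stun:stun.example.org:3478', {
  url: 'turn:turn.example.org:3478',
  username: 'demo-user',
  credential: 'demo-secret'
});
console.log(JSON.stringify(configuration, null, 2));

// In a browser, the object would be used like this:
//   var pc = new RTCPeerConnection(configuration);
```

ICE tries the direct and STUN-derived candidates first and falls back to the TURN relay only when nothing else connects.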

SDP

Session Description Protocol (SDP) is a standard for describing the multimedia content of a connection, such as resolution, formats, codecs, and encryption, so that two peers can understand each other once data is being transferred. It is essentially metadata that describes the content rather than the media content itself.

Thus, technically speaking, SDP is not really a protocol, but rather a data format used to describe shared media connections between devices.

Documenting SDP is well beyond the scope of this document; however, there are a few things worth noting here.
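
One point that helps when reading a session description: SDP is plain text made of lines of the form type=value, where the type is a single character, and each m= line starts a new media section. The sketch below splits a description along those lines; parseSdp and the sample text are illustrative assumptions, not a complete SDP parser.

```javascript
// Split an SDP text into session-level lines and media sections.
// Only a sketch: it does not validate or interpret the values.
function parseSdp(text) {
  var result = { session: [], media: [] };
  var current = null; // media section being filled, if any

  text.split(/\r?\n/).forEach(function (line) {
    if (line.length < 2 || line.charAt(1) !== '=') {
      return; // skip blank or malformed lines
    }
    var entry = { type: line.charAt(0), value: line.slice(2) };
    if (entry.type === 'm') {
      // m=<media> <port> <proto> <fmt> ...
      var parts = entry.value.split(' ');
      current = {
        kind: parts[0],
        port: Number(parts[1]),
        protocol: parts[2],
        formats: parts.slice(3),
        lines: []
      };
      result.media.push(current);
    } else if (current) {
      current.lines.push(entry);
    } else {
      result.session.push(entry);
    }
  });
  return result;
}

// Shortened, made-up description for illustration.
var sampleSdp = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000'
].join('\r\n');

var parsed = parseSdp(sampleSdp);
console.log(parsed.media.map(function (m) { return m.kind; }));
```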

WebRTC API

WebRTC allows browsers to do three main things.

  • Capture audio and video
  • Communicate audio and video
  • Communicate arbitrary data

WebRTC is divided into three APIs, one for each of the three functions above.

  • MediaStream (also known as getUserMedia)
  • RTCPeerConnection
  • RTCDataChannel

getUserMedia

Overview

The navigator.getUserMedia method is currently used mainly to capture audio (through a microphone) and video (through a camera) in the browser, but in the future it could be used to capture arbitrary data streams, such as disks and sensors.

The following code checks to see if the browser supports the getUserMedia method.

navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia ||
                         navigator.msGetUserMedia;

if (navigator.getUserMedia) {
  // supported
} else {
  // not supported
}

Chrome 21, Opera 18, and Firefox 17 support this method. IE does not currently support it; msGetUserMedia appears in the code above only to ensure future compatibility.
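
Current browsers have deprecated the callback-based navigator.getUserMedia in favor of the promise-based navigator.mediaDevices.getUserMedia. A minimal sketch of a wrapper that prefers the modern method and falls back to the prefixed ones follows; getMedia is a hypothetical helper, and the navigator object is passed in as a parameter so that the function can also be exercised outside a browser.

```javascript
// Return a promise for a media stream, using whichever API is available.
// `nav` is expected to be the browser's navigator object.
function getMedia(nav, constraints) {
  // Modern, promise-based API.
  if (nav.mediaDevices && nav.mediaDevices.getUserMedia) {
    return nav.mediaDevices.getUserMedia(constraints);
  }

  // Legacy, callback-based API (possibly vendor-prefixed).
  var legacy = nav.getUserMedia ||
               nav.webkitGetUserMedia ||
               nav.mozGetUserMedia ||
               nav.msGetUserMedia;
  if (!legacy) {
    return Promise.reject(new Error('getUserMedia is not supported'));
  }
  return new Promise(function (resolve, reject) {
    legacy.call(nav, constraints, resolve, reject);
  });
}

// In a browser, usage would look like this:
//   getMedia(navigator, { video: true, audio: true })
//     .then(onSuccess)
//     .catch(onError);
```

Wrapping the legacy call in a promise keeps the calling code identical regardless of which API the browser provides.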

The getUserMedia method takes three arguments.

navigator.getUserMedia({
    video: true, 
    audio: true
}, onSuccess, onError);

The first argument to getUserMedia is an object that indicates which multimedia devices to capture; the code above requests the camera and the microphone. onSuccess is a callback function that is called when the multimedia device is acquired successfully; onError is a callback function that is called when acquiring the device fails.

Here’s an example.


var constraints = {video: true};

function onSuccess(stream) {
  var video = document.querySelector("video");
  video.src = window.URL.createObjectURL(stream);
}

function onError(error) {
  console.log("navigator.getUserMedia error: ", error);
}

navigator.getUserMedia(constraints, onSuccess, onError);

If the web page calls getUserMedia, the browser asks the user whether they allow the page to use the microphone or camera. If the user agrees, the onSuccess callback is called; if the user refuses, the onError callback is called.

The onSuccess callback receives a stream object as its argument. The stream.getAudioTracks and stream.getVideoTracks methods return arrays of the audio tracks and video tracks contained in the stream, respectively. The number of microphones and cameras in use determines the number of audio and video tracks. For example, if only one camera is used to capture video and no audio is captured, the number of video tracks is 1 and the number of audio tracks is 0. Each track has a kind attribute (video or audio) and a label attribute (such as FaceTime HD Camera (built-in)).
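
As a small sketch of how these methods might be used inside onSuccess, the helper below counts the tracks and lists their kind and label. describeTracks is a hypothetical name, and fakeStream is a stand-in object used here only so that the snippet runs without a camera.

```javascript
// Summarize the tracks of a stream using getAudioTracks/getVideoTracks.
function describeTracks(stream) {
  function summarize(track) {
    return track.kind + ': ' + track.label;
  }
  var audioTracks = stream.getAudioTracks();
  var videoTracks = stream.getVideoTracks();
  return {
    audioCount: audioTracks.length,
    videoCount: videoTracks.length,
    labels: videoTracks.concat(audioTracks).map(summarize)
  };
}

// Stand-in for a real MediaStream (illustration only):
// one camera, no microphone.
var fakeStream = {
  getAudioTracks: function () { return []; },
  getVideoTracks: function () {
    return [{ kind: 'video', label: 'FaceTime HD Camera (built-in)' }];
  }
};

console.log(describeTracks(fakeStream));
```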

The onError callback receives an error object as its argument. The code attribute of the error object takes one of the following values, indicating the type of error.

  • PERMISSION_DENIED: The user refused to grant access.
  • NOT_SUPPORTED_ERROR: The browser does not support the requested hardware device.
  • MANDATORY_UNSATISFIED_ERROR: No hardware device of the specified kind was found.
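
Current browsers identify the failure through the name property of the error rather than the code constants listed above. A minimal sketch of an error handler written against those names follows; explainMediaError and its message texts are assumptions for illustration, and only a few of the possible names are covered.

```javascript
// Translate the name of a getUserMedia error into a readable message.
function explainMediaError(error) {
  switch (error && error.name) {
    case 'NotAllowedError':
      return 'Access to the device was refused.';
    case 'NotFoundError':
      return 'No device of the requested kind was found.';
    case 'NotReadableError':
      return 'The device exists but could not be read.';
    case 'OverconstrainedError':
      return 'No device satisfies the requested constraints.';
    default:
      return 'Unexpected getUserMedia error: ' + (error && error.name);
  }
}

function onError(error) {
  console.log(explainMediaError(error));
}

// Simulated failure, as if the user had denied the permission prompt.
onError({ name: 'NotAllowedError' });
```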

Example: Get the camera

This example uses the getUserMedia method to display the image captured by the camera on the web page.

First, you need to place a video element on the web page. The image is displayed in this element.

<video id="webcam"></video>

Then, get the element in code.

function onSuccess(stream) {
    var video = document.getElementById('webcam');
}

Next, bind the src attribute of this element to the stream, and the image captured by the camera is displayed.

function onSuccess(stream) {
  var video = document.getElementById('webcam');
  if (window.URL) {
    video.src = window.URL.createObjectURL(stream);
  } else {
    video.src = stream;
  }
  video.autoplay = true;
  // or video.play();
}

if (navigator.getUserMedia) {
  navigator.getUserMedia({video: true}, onSuccess);
} else {
  document.getElementById('webcam').src = 'somevideo.mp4';
}

In Chrome and Opera, the URL.createObjectURL method converts the MediaStream into a Blob URL, which can be used as the value of the video element's src attribute. In Firefox, the media stream can be used directly as the value of the src property. Chrome and Opera also allow the audio captured by getUserMedia to be used directly as the source of an audio or video element, so if audio is captured as well, the code above plays the video with sound. Note that current browsers no longer accept a MediaStream in URL.createObjectURL; there, assign the stream to the element's srcObject property instead (video.srcObject = stream).

One of the main uses of camera capture is to let users take a picture of themselves. The Canvas API has a ctx.drawImage(video, 0, 0) method that draws a frame of the video onto a canvas element. This makes it very easy to take snapshots.

<video autoplay></video>
<img src="">
<canvas style="display:none;"></canvas>

<script>
  var video = document.querySelector('video');
  var canvas = document.querySelector('canvas');
  var ctx = canvas.getContext('2d');
  var localMediaStream = null;

  function snapshot() {
    if (localMediaStream) {
      ctx.drawImage(video, 0, 0);
      // "image/webp" works in Chrome;
      // other browsers fall back to image/png
      document.querySelector('img').src = canvas.toDataURL('image/webp');
    }
  }

  // called when the camera cannot be acquired
  function errorCallback(error) {
    console.log('navigator.getUserMedia error: ', error);
  }

  video.addEventListener('click', snapshot, false);

  navigator.getUserMedia({video: true}, function(stream) {
    video.src = window.URL.createObjectURL(stream);
    localMediaStream = stream;
  }, errorCallback);
</script>

Example: Capture microphone sound

To capture sound through a browser, you need the Web Audio API.


window.AudioContext = window.AudioContext ||
                      window.webkitAudioContext;

var context = new AudioContext();

function onSuccess(stream) {
	var audioInput = context.createMediaStreamSource(stream);
	audioInput.connect(context.destination);
}

navigator.getUserMedia({audio:true}, onSuccess);

Capture constraints

Besides specifying what to capture, the first argument to the getUserMedia method can also specify constraints, such as limiting recording to HD (or VGA) video.


var hdConstraints = {
  video: {
    mandatory: {
      minWidth: 1280,
      minHeight: 720
    }
  }
};

navigator.getUserMedia(hdConstraints, onSuccess, onError);

var vgaConstraints = {
  video: {
    mandatory: {
      maxWidth: 640,
      maxHeight: 360
    }
  }
};

navigator.getUserMedia(vgaConstraints, onSuccess, onError);
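
The mandatory/minWidth form above is the legacy constraint syntax. Current browsers expect width and height entries that carry min, max, ideal or exact values. As a rough sketch, the hypothetical helper below converts the four legacy keys used above into the current form; it deliberately ignores every other legacy key.

```javascript
// Map each legacy key to [property, bound] in the current syntax.
var LEGACY_KEYS = {
  minWidth: ['width', 'min'],
  maxWidth: ['width', 'max'],
  minHeight: ['height', 'min'],
  maxHeight: ['height', 'max']
};

// Convert {video: {mandatory: {...}}} into {video: {width: {...}, height: {...}}}.
function modernizeVideoConstraints(legacy) {
  var mandatory = (legacy.video && legacy.video.mandatory) || {};
  var video = {};
  Object.keys(mandatory).forEach(function (key) {
    var target = LEGACY_KEYS[key];
    if (!target) {
      return; // keys this sketch does not know are skipped
    }
    video[target[0]] = video[target[0]] || {};
    video[target[0]][target[1]] = mandatory[key];
  });
  return { video: video };
}

var legacyHd = { video: { mandatory: { minWidth: 1280, minHeight: 720 } } };
console.log(JSON.stringify(modernizeVideoConstraints(legacyHd)));
```

The result can be passed as the constraints argument of navigator.mediaDevices.getUserMedia.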

MediaStreamTrack.getSources()

If the machine has multiple cameras or microphones, use the MediaStreamTrack.getSources method to specify which camera and microphone to use.

MediaStreamTrack.getSources(function(sourceInfos) {
  var audioSource = null;
  var videoSource = null;

  for (var i = 0; i != sourceInfos.length; ++i) {
    var sourceInfo = sourceInfos[i];
    if (sourceInfo.kind === 'audio') {
      console.log(sourceInfo.id, sourceInfo.label || 'microphone');
      audioSource = sourceInfo.id;
    } else if (sourceInfo.kind === 'video') {
      console.log(sourceInfo.id, sourceInfo.label || 'camera');
      videoSource = sourceInfo.id;
    } else {
      console.log('Some other kind of source: ', sourceInfo);
    }
  }

  sourceSelected(audioSource, videoSource);
});

function sourceSelected(audioSource, videoSource) {
  var constraints = {
    audio: {
      optional: [{sourceId: audioSource}]
    },
    video: {
      optional: [{sourceId: videoSource}]
    }
  };

  navigator.getUserMedia(constraints, onSuccess, onError);
}

In the code above, the callback passed to MediaStreamTrack.getSources receives the list of the machine's cameras and microphones; the code then selects the last camera and the last microphone in that list.
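
MediaStreamTrack.getSources has been removed from current browsers; the replacement is navigator.mediaDevices.enumerateDevices(), which returns a promise for a list of devices whose kind is audioinput, videoinput or audiooutput. The sketch below repeats the "pick the last camera and microphone" logic for that list; constraintsForLastDevices is a hypothetical helper and the device list is made up for illustration.

```javascript
// Build getUserMedia constraints that select the last microphone and
// the last camera found in a device list.
function constraintsForLastDevices(devices) {
  var audioId = null;
  var videoId = null;

  devices.forEach(function (device) {
    if (device.kind === 'audioinput') {
      audioId = device.deviceId;
    } else if (device.kind === 'videoinput') {
      videoId = device.deviceId;
    }
    // other kinds (such as audiooutput) are not capture sources
  });

  return {
    audio: audioId ? { deviceId: { exact: audioId } } : false,
    video: videoId ? { deviceId: { exact: videoId } } : false
  };
}

// Made-up device list standing in for the result of enumerateDevices().
var exampleDevices = [
  { kind: 'audioinput', deviceId: 'mic-1', label: 'Built-in microphone' },
  { kind: 'videoinput', deviceId: 'cam-1', label: 'Built-in camera' },
  { kind: 'videoinput', deviceId: 'cam-2', label: 'USB camera' }
];
console.log(JSON.stringify(constraintsForLastDevices(exampleDevices)));

// In a browser:
//   navigator.mediaDevices.enumerateDevices()
//     .then(constraintsForLastDevices)
//     .then(function (c) { return navigator.mediaDevices.getUserMedia(c); })
//     .then(onSuccess)
//     .catch(onError);
```

In this sketch, a kind with no matching device is set to false, so that kind is simply not requested.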

RTCPeerConnection, RTCDataChannel

RTCPeerConnection

RTCPeerConnection is used to establish “peer to peer” communication between browsers, that is, the microphone or camera data obtained by the browser is transmitted to another browser. This involves a lot of complex work, such as signal processing, multimedia encoding/decoding, point-to-point communication, data security, bandwidth management, and so on.

Audio/video transfer between different clients does not need to go through the server. However, the connection between the two clients needs to be established through the server. The server transmits two main types of data.

  • Metadata of communication content: commands to open/close sessions, metadata of media files (encoding format, media type and bandwidth), etc.
  • Network communication metadata: IP addresses, NAT (network address translation), firewalls, etc.

The WebRTC protocol does not specify how to communicate with the server, so various methods can be used, such as WebSocket. Through the server, the two clients exchange metadata based on the Session Description Protocol (SDP).

Here is an example.

var signalingChannel = createSignalingChannel();
var pc;
var configuration = ... ;

// run start(true) to initiate a call
function start(isCaller) {
  pc = new RTCPeerConnection(configuration);

  // send any ice candidates to the other peer
  pc.onicecandidate = function (evt) {
    signalingChannel.send(JSON.stringify({ "candidate": evt.candidate }));
  };

  // once remote stream arrives, show it in the remote video element
  pc.onaddstream = function (evt) {
    remoteView.src = URL.createObjectURL(evt.stream);
  };

  // get the local stream, show it in the local video element and send it
  navigator.getUserMedia({ "audio": true, "video": true }, function (stream) {
    selfView.src = URL.createObjectURL(stream);
    pc.addStream(stream);

    if (isCaller)
      pc.createOffer(gotDescription);
    else
      pc.createAnswer(pc.remoteDescription, gotDescription);

    function gotDescription(desc) {
      pc.setLocalDescription(desc);
      signalingChannel.send(JSON.stringify({ "sdp": desc }));
    }
  });
}

signalingChannel.onmessage = function (evt) {
  if (!pc)
    start(false);

  var signal = JSON.parse(evt.data);
  if (signal.sdp)
    pc.setRemoteDescription(new RTCSessionDescription(signal.sdp));
  else
    pc.addIceCandidate(new RTCIceCandidate(signal.candidate));
};
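
The example above leaves createSignalingChannel abstract, because any transport that delivers text messages to the other peer will do. To make the message flow concrete, here is a sketch that replaces the server with an in-memory pair of endpoints; createSignalingPair and classifySignal are hypothetical helpers, and the message payloads are shortened placeholders.

```javascript
// In-memory stand-in for a signaling server: whatever one endpoint sends,
// the other endpoint receives as a message event.
function createSignalingPair() {
  function makeEndpoint() {
    return {
      onmessage: null,
      remote: null,
      send: function (text) {
        if (this.remote.onmessage) {
          this.remote.onmessage({ data: text });
        }
      }
    };
  }
  var caller = makeEndpoint();
  var callee = makeEndpoint();
  caller.remote = callee;
  callee.remote = caller;
  return { caller: caller, callee: callee };
}

// Decide what a received signaling message contains.
function classifySignal(text) {
  var signal = JSON.parse(text);
  if (signal.sdp) return 'description';
  if (signal.candidate) return 'candidate';
  return 'ignored'; // for example {"candidate": null} when gathering ends
}

var pair = createSignalingPair();
var received = [];
pair.callee.onmessage = function (evt) {
  received.push(classifySignal(evt.data));
};

pair.caller.send(JSON.stringify({ sdp: { type: 'offer', sdp: 'v=0' } }));
pair.caller.send(JSON.stringify({ candidate: { candidate: 'candidate:1 1 udp 1 192.0.2.1 9 typ host' } }));
pair.caller.send(JSON.stringify({ candidate: null }));
console.log(received);
```

In a real application the two endpoints live in different browsers and a server (for instance over WebSocket) forwards the text between them; the session description and candidate messages themselves stay the same.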

In older browsers, RTCPeerConnection carries a vendor prefix: in Chrome it is webkitRTCPeerConnection, and in Firefox it is mozRTCPeerConnection. Google maintains a library called adapter.js that abstracts away the differences between browsers.

RTCDataChannel

RTCDataChannel is used to transmit arbitrary data from point to point. It has the same API as WebSockets.

Here is an example.


var pc = new webkitRTCPeerConnection(servers,
  {optional: [{RtpDataChannels: true}]});

pc.ondatachannel = function(event) {
  receiveChannel = event.channel;
  receiveChannel.onmessage = function(event){
    document.querySelector("div#receive").innerHTML = event.data;
  };
};

sendChannel = pc.createDataChannel("sendDataChannel", {reliable: false});

document.querySelector("button#send").onclick = function (){
  var data = document.querySelector("textarea#send").value;
  sendChannel.send(data);
};

Chrome 25, Opera 18 and Firefox 22 support RTCDataChannel.

External libraries

Because these two APIs are complex, they are usually used through external libraries. Video chat libraries currently include SimpleWebRTC, easyRTC, and webRTC.io; peer-to-peer communication libraries include PeerJS and Sharefest.

Here is an example of SimpleWebRTC.


var webrtc = new WebRTC({
  localVideoEl: 'localVideo',
  remoteVideosEl: 'remoteVideos',
  autoRequestMedia: true
});

webrtc.on('readyToCall', function () {
    webrtc.joinRoom('My room name');
});

Here is an example of PeerJS.

// Receiving peer
var peer = new Peer('someid', {key: 'apikey'});
peer.on('connection', function(conn) {
  conn.on('data', function(data){
    // Will print 'hi!'
    console.log(data);
  });
});

// Connecting peer
var peer = new Peer('anotherid', {key: 'apikey'});
var conn = peer.connect('someid');
conn.on('open', function(){
  conn.send('hi!');
});