Web Real-Time Communication (WebRTC) consists of a set of standards, protocols, and JavaScript APIs that enable end-to-end audio, video, and data sharing between browsers.

WebRTC makes real-time communication a standard feature of any web application: no third-party plug-ins or proprietary software, just simple JavaScript APIs.

WebRTC has three core building blocks; once you understand these three, you understand how WebRTC works underneath. They are:

  • MediaStream: capturing audio and video streams
  • RTCPeerConnection: audio and video communication between peers
  • RTCDataChannel: arbitrary application data communication between peers

MediaStream

As mentioned above, MediaStream is primarily used to capture audio and video streams. The JavaScript to do this is fairly simple:

'use strict';

navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia;

var constraints = {         // audio and video constraints
  audio: true,              // request an audio track
  video: {                  // request a video track
    mandatory: {            // mandatory constraints on the video track
      width: { min: 320 },
      height: { min: 180 }
    },
    optional: [             // optional constraints
      { frameRate: 30 }
    ]
  }
};

var video = document.querySelector('video');

function successCallback(stream) {
  if (window.URL) {
    video.src = window.URL.createObjectURL(stream);
  } else {
    video.src = stream;
  }
}

function errorCallback(error) {
  console.log('navigator.getUserMedia error: ', error);
}

navigator.getUserMedia(constraints, successCallback, errorCallback);

In JavaScript, we capture audio and video with the getUserMedia function, which takes three parameters: the audio/video constraints, a success callback, and an error callback.
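Note that the callback-based navigator.getUserMedia above is the legacy API. Current browsers expose the same capability through the promise-based navigator.mediaDevices.getUserMedia and the srcObject property; a minimal sketch:

// A minimal sketch using the modern, promise-based capture API.
// Same idea as the constraints above, but without vendor prefixes.
var constraints = { audio: true, video: { width: { min: 320 }, height: { min: 180 } } };

navigator.mediaDevices.getUserMedia(constraints)
  .then(function (stream) {
    var video = document.querySelector('video');
    video.srcObject = stream;   // preferred over createObjectURL(stream) in current browsers
  })
  .catch(function (error) {
    console.log('getUserMedia error: ', error);
  });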

Under the hood, the browser's audio and video engines process the captured raw streams, not only improving picture and sound quality but also keeping the audio and video synchronized.

Because the captured audio and video are intended for transmission, the sender also adjusts its output bit rate to adapt to changing bandwidth and latency between the clients.

The receiver, for its part, must decode the audio and video streams in real time while accommodating network jitter and latency. How this works is shown in the figure below:



If you set both audio and video to true in the constraints, the resulting stream contains both an audio track and a video track, and the tracks are synchronized in time.

The output of a stream can be sent to one or more destinations: a local audio or video element, JavaScript post-processing, or a remote peer, as shown below:
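A minimal sketch of these three kinds of destinations (the Web Audio usage and the RTCPeerConnection named pc are illustrative assumptions, not part of the original example):

// Sketch: route one captured MediaStream to three kinds of destinations.
navigator.mediaDevices.getUserMedia({ audio: true, video: true }).then(function (stream) {
  // 1. A local <video> element
  document.querySelector('video').srcObject = stream;

  // 2. JavaScript post-processing, e.g. feeding the audio into the Web Audio API
  var audioCtx = new AudioContext();
  audioCtx.createMediaStreamSource(stream).connect(audioCtx.destination);

  // 3. A remote peer: add each track to an existing RTCPeerConnection (pc)
  stream.getTracks().forEach(function (track) {
    pc.addTrack(track, stream);
  });
});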

RTCPeerConnection

After capturing the audio and video streams, the next step is to send them to the other side. Unlike the client-server model, however, this is peer-to-peer transport, so NAT traversal has to be solved at the protocol level, otherwise no transmission is possible at all.

In addition, since WebRTC is designed first and foremost for real-time communication, where low latency and timeliness matter more than perfect reliability, it uses UDP as its transport-layer protocol.

Before we go further, let's ask: can we simply capture the audio and video and start firing off UDP packets?

It's not that simple, of course. Besides the NAT traversal issue discussed above, you also need to negotiate the parameters of each stream, encrypt the user data, and implement congestion and flow control.

Let’s look at a diagram of WebRTC’s layered protocol:

ICE, STUN, and TURN are required to establish and maintain an end-to-end connection over UDP; SDP is a data format used to negotiate the parameters of the end-to-end connection; DTLS is used to secure the transmitted data; SCTP and SRTP are application-layer protocols that run on top of UDP to provide multiplexing of the different streams, congestion and flow control, and partially reliable delivery, among other services.

ICE (Interactive Connectivity Establishment): with multiple layers of firewalls and NAT devices between the two ends, we need a mechanism to gather candidate IP addresses and ports on the public routes between them, and that is exactly what ICE does (a configuration sketch follows the list below).

  • The ICE agent queries the operating system for local IP addresses
  • If a STUN server is configured, the ICE agent queries it for the public IP address and port of the local peer
  • If a TURN server is configured, ICE also treats the TURN server as a candidate; if the direct end-to-end connection fails, the data is relayed through this intermediary
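A small sketch of how this surfaces in the API; the TURN address and credentials below are placeholders, not real servers:

// Sketch: configure STUN/TURN servers and watch the ICE agent emit candidates.
var pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },                                        // public STUN server
    { urls: 'turn:turn.example.com:3478', username: 'user', credential: 'pass' }     // placeholder TURN relay
  ]
});

pc.onicecandidate = function (evt) {
  if (evt.candidate) {
    // Each gathered candidate (host, server-reflexive via STUN, or relay via TURN)
    // should be sent to the peer over the signaling channel.
    console.log('ICE candidate:', evt.candidate.candidate);
  } else {
    console.log('candidate gathering complete');
  }
};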

WebRTC uses the Session Description Protocol (SDP) to describe the parameters of an end-to-end connection.

SDP does not carry any media itself; it only describes the "session profile" as a list of connection properties: the types of media to be exchanged (audio, video, and application data), the network transports, the codecs used and their settings, bandwidth, and other metadata.
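You rarely write SDP by hand: RTCPeerConnection generates it for you via createOffer. A minimal sketch that simply prints the generated offer so you can see the raw SDP text (lines such as v=0, m=..., and a=rtpmap:... describing the transports and codecs):

// Sketch: generate an SDP offer and inspect its contents.
var pc = new RTCPeerConnection();
var dc = pc.createDataChannel('probe');   // ensures there is at least one media section to describe

pc.createOffer().then(function (offer) {
  console.log(offer.type);   // "offer"
  console.log(offer.sdp);    // the raw SDP text that will be negotiated with the peer
  return pc.setLocalDescription(offer);
});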

DTLS extends TLS by adding an explicit fragment offset and sequence number to every handshake record, which satisfies the requirement for in-order delivery and lets a large record be fragmented across several packets and reassembled at the other end. DTLS handshake records are transmitted strictly in the order defined by the TLS protocol; a record arriving out of sequence is treated as an error. Finally, DTLS also has to deal with packet loss: both ends run timers, and if no reply arrives within the expected interval, the handshake record is retransmitted.

To complete the handshake, each end generates a self-signed certificate and then follows the regular TLS handshake protocol. Such certificates cannot be used to verify identity, however, because there is no chain of trust to validate them. If identity verification is required, the application must take care of it itself, in one of two ways (a short certificate-generation sketch follows the list):

  • The application can verify users through its own login system
  • Each end can also specify an "identity provider" when generating the SDP offer/answer; after receiving the SDP, the peer contacts that identity provider to verify the certificate it received
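As for the self-signed certificates themselves, browsers generate them automatically, but the API also lets you create one explicitly and hand it to the connection; a minimal sketch:

// Sketch: explicitly generate the self-signed certificate used for the DTLS handshake.
RTCPeerConnection.generateCertificate({ name: 'ECDSA', namedCurve: 'P-256' })
  .then(function (cert) {
    var pc = new RTCPeerConnection({ certificates: [cert] });
    // The certificate's fingerprint ends up in the SDP (a=fingerprint:...),
    // which is what the peer verifies during the DTLS handshake.
    console.log('certificate expires:', new Date(cert.expires));
  });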

SRTP defines a standard packet format for delivering audio and video over IP networks. SRTP itself provides no guarantees of timeliness, reliability, or data recovery for the transmitted media; it is only responsible for wrapping the digitized audio samples and video frames with some metadata that helps the receiver process the streams.

SCTP is a transport-layer protocol that, like TCP and UDP, normally runs directly over IP. In WebRTC, however, SCTP runs over a secure DTLS channel, which in turn runs on top of UDP. It is the protocol the DataChannel API uses to transfer arbitrary application data end to end.
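The partially reliable delivery that SCTP provides is exposed through the DataChannel options; a small sketch, assuming an existing RTCPeerConnection named pc:

// Sketch: three delivery modes on top of SCTP.
var reliable  = pc.createDataChannel('reliable');                                      // ordered and reliable (TCP-like)
var lossy     = pc.createDataChannel('lossy', { ordered: false, maxRetransmits: 0 });  // unordered, no retransmits (UDP-like)
var timeBound = pc.createDataChannel('timed', { maxPacketLifeTime: 500 });             // retransmit for at most 500 ms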

The RTCPeerConnection interface is responsible for maintaining the complete lifecycle of each end-to-end connection:

  • RTCPeerConnection manages the complete ICE workflow across NATs
  • RTCPeerConnection sends automatic (STUN) keepalives
  • RTCPeerConnection tracks local streams
  • RTCPeerConnection tracks remote streams
  • RTCPeerConnection triggers automatic stream renegotiation as required
  • RTCPeerConnection provides the necessary APIs to generate connection offers, accept answers, query the current state of the connection, and more



Let’s take a look at the sample code:

var signalingChannel = new SignalingChannel();
var pc = null;
var ice = {
  "iceServers": [
    { "url": "stun:stun.l.google.com:19302" },                      // Google's public test STUN server
    { "url": "turn:[email protected]", "credential": "pass" }    // configure your TURN server here, if you have one
  ]
};

signalingChannel.onmessage = function (msg) {
  if (msg.offer) {
    // listen for and handle the remote offer delivered via the signaling channel
    pc = new RTCPeerConnection(ice);
    pc.setRemoteDescription(msg.offer);
    navigator.getUserMedia({ "audio": true, "video": true }, gotStream, logError);
  } else if (msg.candidate) {
    // register the remote ICE candidate to begin connectivity checks
    pc.addIceCandidate(msg.candidate);
  }
}

function gotStream(evt) {
  pc.addStream(evt.stream);

  var local_video = document.getElementById('local_video');
  local_video.src = window.URL.createObjectURL(evt.stream);

  pc.createAnswer(function (answer) {
    // generate the SDP answer describing the connection and send it to the peer
    pc.setLocalDescription(answer);
    signalingChannel.send(answer.sdp);
  });
}

pc.onicecandidate = function (evt) {
  if (evt.candidate) {
    signalingChannel.send(evt.candidate);
  }
}

pc.onaddstream = function (evt) {
  var remote_video = document.getElementById('remote_video');
  remote_video.src = window.URL.createObjectURL(evt.stream);
}

function logError() { ... }

DataChannel

DataChannel supports exchanging arbitrary application data end to end, much like WebSocket but peer-to-peer. Once the RTCPeerConnection is established, either end can open one or more channels to exchange text or binary data.

The following is a sample demo:

var ice = {
  'iceServers': [
    { 'url': 'stun:stun.l.google.com:19302' }   // Google's public test STUN server
    // { 'url': 'turn:[email protected]', 'credential': 'pass' }
  ]
};

// var signalingChannel = new SignalingChannel();
var pc = new RTCPeerConnection(ice);

navigator.getUserMedia({ 'audio': true }, gotStream, logError);

function gotStream(stream) {
  pc.addStream(stream);
  pc.createOffer().then(function (offer) {
    pc.setLocalDescription(offer);
  });
}

pc.onicecandidate = function (evt) {
  // console.log(evt);
  if (evt.target.iceGatheringState == 'complete') {
    pc.createOffer().then(function (offer) {
      // console.log(offer.sdp);
      // signalingChannel.send(sdp);
    });
  }
}

function handleChannel(chan) {
  console.log(chan);
  chan.onerror = function (err) {};
  chan.onclose = function () {};
  chan.onopen = function (evt) {
    console.log('established');
    chan.send('DataChannel connection established.');
  };
  chan.onmessage = function (msg) {
    // do something
  };
}

// Initialize a new DataChannel
var dc = pc.createDataChannel('namedChannel', { reliable: false });
handleChannel(dc);

// handle a channel announced by the remote peer (the event carries the channel)
pc.ondatachannel = function (evt) { handleChannel(evt.channel); };

function logError() {
  console.log('error');
}

Reproduced in: segmentfault.com/a/119000001…
