What is WebRTC
WebRTC is short for Web Real-Time Communication. It is a very powerful tool for real-time audio and video chat, as well as sending messages and files.
Although "Web" makes up half of the name, WebRTC is in fact an open-source project that can also run on servers, Android, and iOS. Those platforms can do even more, because they can modify the WebRTC source itself, while the front end can only use the standard RTC APIs the browser provides. Still, the standard APIs cover most scenarios.
Why the front-end perspective? Because the browser wraps everything in a simple interface, RTCPeerConnection, hiding the low-level details, which makes it much friendlier to front-end developers.
Calling up the local camera
In the browser, the navigator.mediaDevices interface exposes methods for accessing the local camera. It is also very simple to use:
```javascript
async function getLocalStream() {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  // <video autoplay></video>
  video.srcObject = stream;
}
```
- For security reasons, the camera is only available on localhost or on pages served over HTTPS
- The first call prompts the user for permission; if allowed, the prompt won't appear again. If denied, subsequent calls will fail
This only covers basic navigator.mediaDevices.getUserMedia usage; for more detailed options and examples, see the article "mediaDevices opens the journey of your local video".
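Beyond plain booleans, the constraints passed to getUserMedia can request a resolution and which camera to use. A small sketch of building such a constraints object (the helper name `buildConstraints` is my own, not part of any API):

```javascript
// Build a MediaStreamConstraints object; the helper name is illustrative.
// `ideal` asks the browser for a preference rather than a hard requirement.
function buildConstraints({ width = 1280, height = 720, useFrontCamera = true } = {}) {
  return {
    audio: true,
    video: {
      width: { ideal: width },
      height: { ideal: height },
      facingMode: useFrontCamera ? 'user' : 'environment',
    },
  };
}

// Usage: navigator.mediaDevices.getUserMedia(buildConstraints({ width: 640 }))
```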
How do I make audio and video calls
The principle of a video call is simple: both parties capture their local video stream and send it to the other side. This is where the RTCPeerConnection object is needed. If you haven't heard of this API, that's normal. When I first ran the example I found on MDN, I was full of question marks, and it didn't even run.
How to make a p2p connection
Opening the WebRTC introduction on MDN, the first article is an architecture overview full of English abbreviations, genuinely enough to scare a reader away:
- ICE (Interactive Connectivity Establishment)
- STUN (Session Traversal Utilities for NAT)
- NAT (Network Address Translation)
- TURN (Traversal Using Relays around NAT)
- SDP (Session Description Protocol)
With an uneasy heart, I opened the second chapter, on signaling and video calling, hoping I could understand it. Fortunately the words are mostly human; with a primary-school reading level you can follow along, but there are many new concepts, and it takes patience to read through several times.
To sum up what I learned:
- WebRTC can achieve such low latency because it uses a direct p2p connection: the media stream goes straight to the other side without passing through a server. But users generally sit behind layers of routers, so you can't connect to them directly the way you would to a server. You first need to discover your own public address and send it to the other party so they can find you. This is handled by WebRTC's ICE module, which builds on STUN, TURN, and NAT-traversal techniques. For our purposes, it's enough to know that ICE gathers the local public addresses and establishes the direct connection to the other party's public address.
- How do you get your public address to the other party in the first place? A server has to forward the message. This is generally done over WebSocket, and the server is called the signaling server.
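The signaling server itself does almost nothing: it just relays whatever one peer sends to the other. The forwarding logic can be sketched as a pure function (the names here are mine, not from the demo), with the actual WebSocket wiring left out:

```javascript
// Relay a raw signaling message to every connected client except the sender.
// `clients` is any iterable of objects with a send() method.
function relay(clients, sender, message) {
  for (const client of clients) {
    if (client !== sender) client.send(message);
  }
}
```

With a library such as ws, you would call something like `relay(wss.clients, ws, data)` inside the server's message handler.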
How do I send and receive media streams
With the p2p connection established, the local media stream could happily be sent, but one thing must be confirmed first: the local codec capabilities. Each party sends the other a piece of text describing what it is capable of; this text follows the Session Description Protocol (SDP). The exchange also relies on the signaling server.
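An SDP is just structured text: each `m=` line declares a media section (audio, video, ...) with the payload types the sender supports. A quick sketch of pulling those out (a hypothetical helper for illustration, not part of WebRTC's API):

```javascript
// Extract the media kinds declared in an SDP string ('audio', 'video', ...).
function mediaKinds(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter(line => line.startsWith('m='))
    .map(line => line.slice(2).split(' ')[0]);
}

// Minimal made-up SDP fragment for demonstration:
const sampleSdp = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'm=video 9 UDP/TLS/RTP/SAVPF 96 97',
].join('\r\n');
// mediaKinds(sampleSdp) → ['audio', 'video']
```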
After negotiation, you can add the local stream to the peerConnection object:
```javascript
stream.getTracks().forEach(track => peerConnection.addTrack(track, stream));
```
Sorting out the process with a webrtc-p2p demo
Since the code on MDN didn't work, I found the "50 lines of code for a video call (WebRTC + WebSocket)" project on GitHub as a demo to walk through the process.
Run the project with node index.js. Both the caller and the callee use a single page, p2p.html, distinguished by a query parameter.
Signaling connection
The signaling server helps the two parties exchange messages:
```javascript
const socket = new WebSocket('ws://localhost:8080');
socket.onopen = () => {
  message.log('Signaling channel created successfully!');
  target === 'offer' && (button.style.display = 'block');
};
socket.onerror = () => message.error('Signaling channel creation failed!');
socket.onmessage = e => {};
```
Instantiating PeerConnection
As you may know, this API needs vendor prefixes for compatibility. After obtaining the right constructor, instantiate it. It also accepts an options parameter:
```javascript
const PeerConnection = window.RTCPeerConnection || window.mozRTCPeerConnection || window.webkitRTCPeerConnection;
!PeerConnection && message.error('Browser does not support WebRTC!');
const peer = new PeerConnection();
```
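The options parameter is where ICE servers are configured; without at least a STUN server, candidates beyond the local network can't be gathered. A typical configuration sketch (Google's public STUN server shown as an example; a TURN entry would additionally carry username/credential fields):

```javascript
// RTCConfiguration with a public STUN server for gathering public candidates.
const rtcConfig = {
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
  ],
};

// const peer = new PeerConnection(rtcConfig);
```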
P2p connection
The onicecandidate event fires as the locally available public addresses are gathered; each candidate is sent to the peer through the signaling server, and the peer performs the same operation:
```javascript
peer.onicecandidate = e => {
  if (e.candidate) {
    message.log('Collect and send candidates');
    socket.send(JSON.stringify({
      type: `${target}_ice`,
      iceCandidate: e.candidate
    }));
  } else {
    message.log('Candidate collection complete!');
  }
};
```
After receiving the other side's ice message, call peer.addIceCandidate(iceCandidate) to complete the pairing.
Capture video
Use the navigator.mediaDevices.getUserMedia API described earlier to obtain the stream, then add its tracks to the peer instance. Because audio and video tracks are transmitted separately, they have to be added to the peer one track at a time:
```javascript
stream.getTracks().forEach(track => {
  peer.addTrack(track, stream);
});
```
Media capability negotiation
Once the audio and video tracks have been added, you can generate the SDP using the createOffer API:
```javascript
message.log('Create local SDP');
const offer = await peer.createOffer();
await peer.setLocalDescription(offer);
message.log('Transfer initiator local SDP');
socket.send(JSON.stringify(offer));
```
The offer SDP travels to the peer through signaling. The peer stores it with setRemoteDescription, then uses createAnswer to generate a reply SDP and sends it back. When the local side receives the answer through signaling, it calls setRemoteDescription with it, completing the negotiation.
The other side publishes its own stream through the same process; once both are done, audio and video streams can start flowing.
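Since ICE candidates and SDP messages all arrive over the same socket, the demo's onmessage handler essentially boils down to dispatching on the message type. A sketch of that routing as a pure function (the return values name the PeerConnection calls to make; the function itself is illustrative, not from the demo):

```javascript
// Decide which PeerConnection operation an incoming signaling message needs.
// `role` is 'offer' for the caller or 'answer' for the callee.
function routeSignal(msg, role) {
  if (msg.type === 'offer' && role === 'answer') return 'setRemoteDescription+createAnswer';
  if (msg.type === 'answer' && role === 'offer') return 'setRemoteDescription';
  if (msg.type.endsWith('_ice')) return 'addIceCandidate';
  return 'ignore';
}
```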
Receive each other’s audio and video streams
Finally, receive the stream object and hand it to a video tag to play:
```javascript
peer.ontrack = e => {
  if (e && e.streams) {
    message.log('Receiving audio/video stream data from the other party...');
    remoteVideo.srcObject = e.streams[0];
  }
};
```
Conclusion
- The core methods of PeerConnection are basically the ones used above: simple and easy to use. But under the hood WebRTC does a great deal; someone is carrying the load for you.
- There are many concepts that need repeated reading, and each one is a large body of knowledge. At the start, it's enough to understand what each module does; the main thing is to understand the flow.
Notes
- Real-time: when we watch a live stream on a web page there is usually at least a 2-3 s delay. Especially during the World Cup, the big guy next door is already cheering while you still have no idea what happened, because the live-streaming pipeline is long and the delay is unavoidable. Chat scenarios demand far lower latency; WebRTC can reach 200 ms or less, hence the name real-time communication.
- The reason the earlier code doesn't produce a call by itself is that a server is needed to exchange certain information, which the sample code lacks. To try it out, see the official example, which is implemented in a single page, leaving out the server. If you want a NodeJS relay, look at the "50 lines of code for a video call (WebRTC + WebSocket)" project; it's very basic.
- Encoding and decoding: raw captured video is very, very large and cannot be transmitted over the network as-is. Compressing it is called encoding; the reverse is decoding. digital_video_introduction offers a visual introduction to this.
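To get a feel for why encoding is mandatory, the arithmetic is enough: uncompressed 1080p at 30 fps, with 3 bytes per pixel, works out to roughly 178 MiB every second:

```javascript
// Raw (uncompressed) video data rate: width × height × bytes/pixel × fps.
const bytesPerSecond = 1920 * 1080 * 3 * 30; // 186,624,000 bytes
const megabytesPerSecond = bytesPerSecond / (1024 * 1024);
console.log(megabytesPerSecond.toFixed(1)); // prints "178.0"
```

A codec such as H.264 or VP8 brings this down to a few megabits per second, which is why every real-time video pipeline encodes before sending.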
END
Welcome to add me on WeChat (18618261774) to learn together.