After writing two articles about live video, some readers asked me why I had nothing about WebRTC. In fact, I said from the beginning that my requirement was live video on the mobile web, and mobile browsers do not yet support WebRTC well, so I simply ignored it. But since the titles said "HTML5" rather than "mobile web", I am filling in this piece for the sake of rigor.

Web Real-Time Communication (WebRTC) consists of a set of standards, protocols, and JavaScript APIs for end-to-end audio, video, and data sharing. Unlike other browser communication mechanisms, WebRTC transmits data over UDP, while the familiar XMLHttpRequest and WebSocket are both based on TCP.

As far as the browser market is concerned, only Google and Mozilla are serious about WebRTC, with Chrome 23 and Firefox 22 supporting it from early on. Microsoft has proposed its own standard, which may converge with WebRTC later, but for now IE will not support the current WebRTC standard; Apple apparently has no plans to add WebRTC support to Safari at all, presumably because FaceTime already handles multimedia communication between Apple devices. As for Opera, its new engine is essentially the same as Chrome's, so it can be set aside.

Getting to Know WebRTC

WebRTC involves a lot of complex technology, but the good news is that browsers have abstracted most of it into three APIs:

  • MediaStream: acquiring audio and video streams;
  • RTCPeerConnection: communicating audio and video data;
  • RTCDataChannel: communicating arbitrary application data;

MediaStream corresponds to the navigator.getUserMedia() method in JavaScript, which is responsible for retrieving audio and video streams from the underlying platform. After the WebRTC audio/video engine automatically optimizes, encodes, and decodes them, the streams can be used directly or sent to various destinations. There is a demo that uses getUserMedia to get the video stream and convert each frame to ASCII characters. In short, the MediaStream API is designed to be simple and easy to use.
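
To illustrate, here is a minimal sketch (the vendor-prefix shim and the page's <video> element are my own assumptions, not part of the original demo) that grabs the camera and renders it into the page:

// Minimal sketch: capture the camera and render it into a <video> element.
// Vendor prefixes are shimmed for browsers of this era.
navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia;

navigator.getUserMedia({ audio: false, video: true }, function(stream) {
    var video = document.querySelector('video');
    video.src = window.URL.createObjectURL(stream);
    video.play();
}, function(err) {
    console.error('getUserMedia failed:', err);
});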

RTCPeerConnection is used to establish and maintain end-to-end connections and to provide efficient audio and video streaming. Of all the APIs WebRTC provides, this is the most complex:

First of all, establishing an end-to-end connection inevitably means solving the NAT traversal problem, and RTCPeerConnection introduces the Interactive Connectivity Establishment (ICE) framework for this purpose. ICE tries to establish the most efficient channel between endpoints: direct connection first, STUN negotiation second, and TURN relaying only as a last resort.

STUN (Session Traversal Utilities for NAT) solves three problems: 1) learning the external IP address and port; 2) establishing routing entries in the NAT that bind the external port, so that inbound packets arriving at the external IP address and port can find the application instead of being discarded; 3) defining a simple keep-alive mechanism to ensure that NAT routing entries are not deleted due to timeout. A STUN server must be reachable on the public Internet; you can run your own or use a public one provided by a third party, such as Google's stun:stun.l.google.com:19302.

TURN (Traversal Using Relays around NAT) relies on a relay device on the public network to pass data between the two ends. To put it simply, the two ends connect indirectly by forwarding messages through a TURN service that both can reach. TURN also tries to use TCP instead of UDP, which increases reliability but raises bandwidth costs significantly. According to Google's statistics, TURN relaying is required in about 8% of cases for UDP-based services.
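
To make this concrete, here is a sketch of how STUN and TURN servers are handed to an RTCPeerConnection; the TURN URL and credentials are placeholders of my own, not real services:

// ICE configuration sketch: a public STUN server plus a placeholder TURN
// relay, which is only used when direct and STUN-assisted paths both fail.
var pc = new RTCPeerConnection({
    iceServers: [
        { url: 'stun:stun.l.google.com:19302' },
        { url: 'turn:turn.example.com:3478', username: 'user', credential: 'pass' }
    ]
});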

Secondly, establishing an end-to-end channel still requires exchanging and negotiating some information with the help of a server, a process called signaling. WebRTC does not mandate any particular protocol for signaling, leaving the choice up to the application. We can use different transports (XMLHttpRequest, WebSocket) and existing signaling protocols such as SIP, Jingle, or ISUP to establish the channel.

Generally, WebRTC applications prefer WebSocket for the signaling step, degrading to HTTP where necessary. For one thing, any browser that supports WebRTC also supports WebSocket; for another, WebSocket is better suited to real-time exchange. It is important to note that WebSocket is only used to help establish the end-to-end connection. Once the connection is up, the server no longer relays data between the two ends (except in TURN relay mode).
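
Here is a rough sketch of one side of offer/answer signaling over WebSocket; the endpoint URL and the JSON message format are assumptions of mine, since WebRTC leaves both entirely up to the application:

// Signaling sketch: relay SDP descriptions and ICE candidates through a
// WebSocket server at a placeholder address.
var signal = new WebSocket('wss://signal.example.com');
var pc = new RTCPeerConnection({
    iceServers: [{ url: 'stun:stun.l.google.com:19302' }]
});

// Send each locally gathered ICE candidate to the other end.
pc.onicecandidate = function(evt) {
    if (evt.candidate) {
        signal.send(JSON.stringify({ type: 'candidate', candidate: evt.candidate }));
    }
};

// Answer incoming offers and register remote candidates.
signal.onmessage = function(evt) {
    var msg = JSON.parse(evt.data);
    if (msg.type === 'offer') {
        pc.setRemoteDescription(new RTCSessionDescription(msg));
        pc.createAnswer(function(answer) {
            pc.setLocalDescription(answer);
            signal.send(JSON.stringify({ type: answer.type, sdp: answer.sdp }));
        }, function(err) {
            console.error('createAnswer failed:', err);
        });
    } else if (msg.type === 'candidate') {
        pc.addIceCandidate(new RTCIceCandidate(msg.candidate));
    }
};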

RTCDataChannel supports end-to-end exchange of arbitrary application data. After an RTCPeerConnection is established, in addition to transmitting audio and video streams, one or more channels can be opened for transferring any text or binary content; this is the RTCDataChannel. The DataChannel API is very similar to WebSocket in that it can be used to transfer data from end to end, but in essence they are different:

First, the two WebRTC endpoints are peers, so a DataChannel can be initiated by either side, unlike a WebSocket connection, which can only be initiated by the client. Second, TLS, the session-layer protocol of WebSocket, is optional, whereas WebRTC's session-layer protocol, DTLS, is mandatory, which means data transmitted through WebRTC is always encrypted. Moreover, WebSocket runs on top of TCP, so every message is naturally ordered and reliable, while a DataChannel can specify, through SCTP's delivery options, whether messages are ordered or unordered, reliable or partially reliable, and, if partially reliable, whether retransmission is bounded by a timeout or by a retry count.
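
As a small sketch of those delivery options (assuming an already-established RTCPeerConnection named pc; the channel label is arbitrary):

// Open an unordered, partially reliable channel. maxRetransmits bounds
// retries by count; maxPacketLifeTime (not used here) bounds them by time.
var channel = pc.createDataChannel('demo', {
    ordered: false,
    maxRetransmits: 3
});

channel.onopen = function() {
    channel.send('hello over SCTP / DTLS / UDP');
};

channel.onmessage = function(evt) {
    console.log('received:', evt.data);
};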

Currently DataChannel runs on the following protocols:

  • Stream Control Transmission Protocol (SCTP), which provides some features similar to TCP.
  • Datagram Transport Layer Security (DTLS), the UDP version of TLS.
  • User Datagram Protocol (UDP) is the basis of WebRTC.

I'm not going to give a complete from-scratch overview of how to use WebRTC; there are plenty of such articles online. Here are a few recommendations; the last several are in Chinese, from the same author, and written in a relatively approachable style:

Chapter 3, "Building Blocks of UDP," and Chapter 18, "WebRTC," of High Performance Browser Networking describe UDP NAT traversal and WebRTC in detail.

One-to-Many Live Streaming

As mentioned earlier, WebRTC is designed to solve the problem of end-to-end real-time communication, which makes it ideal for Internet telephony scenarios that require two-way video calls. Most WebRTC demos on the web accordingly have two videos on the page, one for the localStream and one for the remoteStream. So can WebRTC be used for one-way, one-to-many live streaming? Of course it can, and it seems very simple:

  • First, there must be a page that calls getUserMedia to capture audio and video; I call this the source service.
  • When a live page is opened, it establishes a PeerConnection to the source service and notifies it through a DataChannel.
  • After receiving the notification, the source service attaches the live stream to the corresponding PeerConnection through its addStream method.
  • The live page listens for the PeerConnection's onaddstream event and hands the received stream to a <video> element for playback.

For convenience, I used the open-source project PeerJS to verify the above process. PeerJS wraps the WebRTC API, making it easier to use, and also provides a signaling service to assist in establishing connections, which you can use by registering an API key on its website. You can also run your own service with PeerJS Server: install it with npm install peer, then run the following command to start the service:

peerjs --port 9000 --key peerjs

With the Peer Server running, include the PeerJS client script on the page and start experimenting. First, the source service:


// The source service registers with the Peer Server under the fixed ID 'Server'.
var peer = new Peer('Server', {
    host: 'qgy18.qgy18.com', 
    port: 9003, 
    path: '/',
    config: {
        'iceServers': [
              { url: 'stun:stun.l.google.com:19302' }
        ]
    }
});

// Capture the local camera once and keep the stream around so it can be
// shared with every viewer that connects later.
navigator.getUserMedia({ audio: false, video: true }, function(stream) {
    window.stream = stream;
}, function() {  });

// Each viewer opens a DataChannel and sends its own ID; the source then
// calls that ID back, attaching the captured stream.
peer.on('connection', function(conn) {
    conn.on('data', function(clientId){
        var call = peer.call(clientId, window.stream);

        call.on('close', function() {  });
    });
});

Then the live broadcast page:


// Generate a reasonably unique ID for this viewer.
var clientId = (+new Date).toString(36) + '_' + (Math.random().toString()).split('.')[1];

var peer = new Peer(clientId, {
    host: 'qgy18.qgy18.com', 
    port: 9003, 
    path: '/',
    config: {
        'iceServers': [
              { url: 'stun:stun.l.google.com:19302' }
        ]
    }
});

// Open a DataChannel to the source service and send our ID so it can call us back.
var conn = peer.connect('Server');

conn.on('open', function() {
    conn.send(clientId);
});

// When the source calls back, answer without sending a stream of our own,
// then play the remote stream in the page's <video> element.
peer.on('call', function(call) {
    call.answer();
    call.on('stream', function(remoteStream) {
        var video = document.getElementById('video');
        video.src = window.URL.createObjectURL(remoteStream);
    });

    call.on('close', function() {  });
});

The live page establishes an end-to-end connection to the source service by specifying its ID, then tells the source service its own ID through a DataChannel. After receiving the message, the source service actively sends the live stream, and the live page answers and plays it. The whole process is as simple as that, and here's a "full Demo".

After watching the demo above, you may think that live streaming with WebRTC is this easy: just find a computer with a camera and open a browser to provide the broadcast, so why bother with HLS, RTMP, and the rest?

In fact, reality is not so kind. The demo is fine as a toy, but real-world use runs into big problems!

First of all, although in this WebRTC live scheme the server only acts as a bridge and the actual data is transmitted directly between the ends, as mentioned above there are still about 8% of cases in which a direct connection cannot be established. To ensure high availability, you have to consider deploying a complex and expensive TURN relay service.

Second, Chrome limits the number of peers each tab can connect to, with a maximum of 256. In practice, at around 10 connections on my latest Retina MacBook Pro, Chrome became incredibly sluggish: the fans whirred, 6 GB of RAM was consumed, the CPU ran at full load, network traffic was overwhelming, and the live stream became extremely erratic.

Therefore, practical deployments generally still need the support of a Media Server, turning the "end-to-end" topology into "end to Media Server to many ends". A Media Server can have much better performance and bandwidth, and can implement the WebRTC protocols itself, making it possible to support many more users.

I found a WebRTC gateway called Janus, an open-source project written in C. The Janus core itself is deliberately minimal; it provides a plugin mechanism to support different business logic, and with the official plugins it can serve as an efficient Media Server.

The official demo provided by Janus is here, and I also tried to deploy one on my VPS. Janus has a Streaming plugin that accepts audio and video streams pushed by GStreamer and then forwards them to all users via PeerConnection. Since GStreamer can read the camera directly, there is no need to go through WebRTC's MediaStream to capture the video, so the architecture becomes a traditional server-side one. The whole process is complicated and tortuous, so I won't write it up here; if you are interested, you can discuss it with me separately.
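
For a flavor of the push side, a GStreamer pipeline along these lines (a hypothetical example of mine; the capture device, codec, and RTP port all have to match the mountpoint configured for the Streaming plugin) sends VP8 video from a local camera to Janus:

# Hypothetical pipeline: encode the local camera as VP8 and push it as RTP
# to the port where the Janus Streaming plugin mountpoint is listening.
gst-launch-1.0 v4l2src ! videoconvert ! vp8enc ! rtpvp8pay ! udpsink host=127.0.0.1 port=5004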

Link to this article: imququ.com/post/html5-…

–EOF–

Published at 2015-05-01 00:22:07, tagged "HTML5, Video", last modified at 2015-05-02 17:28:00.

Articles on this site are licensed under Creative Commons Attribution 4.0 International, and the "Qiwu Weekly" WeChat official account is authorized to mark articles from this site as "original" when using them. More info »
