WebRTC is a toolbox: how it compares with traditional video conferencing on security, compatibility, and enhancing the Web.

By Eric Rescorla

Original link: blog.mozilla.org/blog/2021/0…

The wide availability of high-quality video conferencing is one of the real successes of the Internet. Of course, the concept of video conferencing has been around for a long time (witness Heywood Floyd’s Bell videophone call to his family in 2001: A Space Odyssey), but until recently it required specialized equipment, or at least downloading specialized software. WebRTC removes that requirement. Simply put, it is video conferencing (VC) in a Web browser, no download required: you just visit a website and make a call. WebRTC versions are available for most major VC services, including Google Meet, Cisco WebEx, and Microsoft Teams, as well as a host of smaller companies.

It’s a toolbox, not a phone

WebRTC is not a complete video conferencing system; it’s a set of tools built into the browser that solve many of the hard problems of building a VC system so you don’t have to. These tools include:

  • Capturing audio and video from the computer’s microphone and camera. This includes what’s known as acoustic echo cancellation, which (hopefully) removes echoes even when people aren’t wearing headphones.
  • Letting two endpoints negotiate their capabilities (for example, “I want to send and receive 1080p video with the AV1 codec”) and agree on a common set of parameters.
  • Establishing a secure connection between you and the other people on the call, including getting data through any NATs or firewalls on the network.
  • Compressing audio and video for transmission to the other side and reassembling them on receipt. This also means coping with lost data, in which case you want to avoid a frozen picture or audio glitches.

This functionality is exposed through what’s called an application programming interface (API): a programmer gives the browser a set of commands to set up a video call. As a result, you can write a very basic VC system in very few lines of code. Building a production system takes considerably more work, but with WebRTC, the browser does most of the work of building the client for you.
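To give a feel for how few calls the caller’s side needs, here is a minimal TypeScript sketch of the steps listed above: capture, hand the tracks to a peer connection, and produce an offer. The method names (`getUserMedia`, `addTrack`, `createOffer`, `setLocalDescription`) are the ones the browser API uses; the `...Like` interfaces, `startCall`, and the `sendToPeer` callback are assumptions made for this illustration so the snippet can also run outside a browser. It is not a complete client: the answering side, ICE candidate exchange, and error handling are left out.

```typescript
// Minimal structural types covering only the calls used below. They mirror
// method names on the browser's navigator.mediaDevices and RTCPeerConnection;
// a real page would normally use the browser's own types instead.
interface TrackLike { kind: string; }
interface StreamLike { getTracks(): TrackLike[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
}
interface DescriptionLike { type?: string; sdp?: string; }
interface PeerConnectionLike {
  addTrack(track: TrackLike, stream: StreamLike): unknown;
  createOffer(): Promise<DescriptionLike>;
  setLocalDescription(description: DescriptionLike): Promise<void>;
}

// Hypothetical callback: delivering the offer to the other side is the
// application's job (for example over a WebSocket); WebRTC does not define it.
type SendToPeer = (description: DescriptionLike) => void;

async function startCall(
  devices: MediaDevicesLike,
  pc: PeerConnectionLike,
  sendToPeer: SendToPeer,
): Promise<StreamLike> {
  // 1. Capture: ask the browser for the microphone and camera.
  const stream = await devices.getUserMedia({ audio: true, video: true });
  // 2. Hand every captured track to the peer connection for sending.
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
  // 3. Negotiate: describe what we can send and receive, then share it.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer(offer);
  return stream;
}
```

In a browser, the same function would be called roughly as `startCall(navigator.mediaDevices, new RTCPeerConnection(), send)`, with `send` standing in for whatever signaling channel the application provides.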

Standardized

Importantly, these features are fully standardized: the API itself is published by the World Wide Web Consortium (W3C), and the network protocols (encryption, compression, NAT traversal, etc.) are standardized by the Internet Engineering Task Force (IETF). The result is a host of specifications, including the API specification, protocols for negotiating what media to send or receive, and mechanisms for sending data peer-to-peer. All in all, this represents a lot of work done by many people over a decade, resulting in hundreds of pages of specifications.

As a result, you can build a VC system that works for everyone, right in their browser, without installing any software.

Ironically, the actual publication of the standards is a bit of an anticlimax: every major browser has shipped WebRTC for years, and as I mentioned above, there are plenty of WebRTC VC systems. This is a good thing: widespread deployment is the only way to gain confidence that the technology actually works as intended and that the documents are clear enough to implement from. The standards reflect the collective judgment of the technical community that we have a system that works properly and that we are not going to change the basics. It also means that it is time for VC vendors who implemented non-standard mechanisms to update to what the standards require.

Why should you care?

At this point you may be thinking, “OK, you all did a lot of work, but why does it matter? Can’t I just download Zoom?” WebRTC matters for several important reasons.

Security

Perhaps the most important reason is security. Because WebRTC runs entirely in the browser, you don’t have to worry about security flaws in the software that VC providers want you to download. For example, Zoom has had a number of high-profile security holes in the last year, such as allowing websites to add you to calls without permission, or so-called remote code execution flaws that allow attackers to run their code on your computer. In contrast, because WebRTC does not require a download, you are not exposed to whatever vulnerabilities the vendor’s client may have. Of course, browsers don’t have a perfect security record, but every major browser has invested heavily in security technologies such as sandboxing. Plus, you’re already running a browser, and every extra application you run increases your security risk. For this reason, Kaspersky recommends running the Zoom Web client, even though the experience is much worse than with the application.

The second security advantage of WebRTC-based conferencing is that the browser controls access to the camera and microphone. This means you can easily block sites from using them, and you can tell when they are in use. For example, Firefox prompts you before letting a site use the camera or microphone, and then shows an indicator in the URL bar whenever they are live.
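From the site’s point of view, a blocked request simply shows up as a rejected `getUserMedia()` promise. The sketch below maps the error names that the capture specification defines for the most common cases to messages for the user. The function name `describeMediaError` and the message wording are assumptions made for this illustration.

```typescript
// Turns a getUserMedia() rejection into a user-facing message, based on the
// error's `name`. The message texts are illustrative only.
function describeMediaError(error: unknown): string {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  switch (name) {
    case "NotAllowedError":
      // The user declined the prompt, or the browser or a policy blocked the site.
      return "Camera or microphone access was blocked.";
    case "NotFoundError":
      // No device matching the request is attached.
      return "No camera or microphone was found.";
    case "NotReadableError":
      // The device exists but could not be opened, for example because it is busy.
      return "The camera or microphone could not be started.";
    default:
      return "Could not access the camera or microphone.";
  }
}
```

A page would typically pass the value caught from `navigator.mediaDevices.getUserMedia(...)` to this function and display the result.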

WebRTC media is always encrypted in transit, without the VC system having to do anything extra, so you mostly don’t have to ask the vendor whether its encryption is any good. This is one of the parts of WebRTC that Mozilla was most involved in, and it is in line with Principle 4 of the Mozilla Manifesto (the security and privacy of individuals on the Internet are fundamental and must not be treated as optional). Even more exciting, we are starting to see end-to-end encrypted conferencing for WebRTC built on MLS and SFrame. This will help close one major remaining security gap: preventing the service itself from listening in on your calls. It’s good to see progress on that front.

Good compatibility

Because WebRTC-based video calling applications work in standard Web browsers, they significantly improve compatibility. For users, this means they can join a call without having to install anything, which makes life a lot easier. I’ve been on plenty of conference calls where someone couldn’t join because they hadn’t downloaded the right software, usually because their company used a different VC system. That happens much less now that a browser is all you need. It can be an even bigger problem in enterprises that restrict software installation.

For those who want to launch a new VC service, WebRTC means there is no need to write new client software and get people to download it. This makes it easier to enter the market without worrying that users are locked into one VC system and unable to use yours.

This doesn’t mean you can’t build your own client. Many popular systems, such as WebEx and Meet, have downloadable endpoints (or, in WebEx’s case, hardware you can buy). But it means you don’t have to, and if you do it right, browser users will be able to talk to your custom endpoint, which gives casual users an easy way to try out your service without much commitment.

Enhancing the Web

Because WebRTC is part of the Web rather than a separate application, it can be used not only for conferencing applications but also to enhance the Web itself. Want to add an audio stream to your game? Share your screen in a webinar? Upload video from your camera? No problem, just use WebRTC.

One of the exciting things about WebRTC is that many Web applications besides video calling can use it. Perhaps the most interesting feature is WebRTC “Data Channels”, which allow a pair of clients to establish a connection between them that they can use to exchange data directly. There are many interesting applications for this, including games, file transfer, and even BitTorrent in the browser. It’s still early days, but I think we’ll see a lot more of Data Channels in the future.
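As a rough illustration of what using a data channel looks like, the sketch below wires up a tiny text chat. The property and method names (`readyState`, `send`, `onopen`, `onmessage`) mirror those of the browser’s `RTCDataChannel`; the `DataChannelLike` interface and the `attachTextChat` helper are assumptions made for this example so that it can run outside a browser.

```typescript
// Minimal structural type for the parts of a data channel used here.
interface DataChannelLike {
  readyState: string;
  send(data: string): void;
  onopen: ((event: unknown) => void) | null;
  onmessage: ((event: { data: unknown }) => void) | null;
}

// Wires up a text chat on a channel and returns a function for sending text.
// Messages sent before the channel opens are queued and flushed on open.
function attachTextChat(
  channel: DataChannelLike,
  onText: (text: string) => void,
): (text: string) => void {
  const pending: string[] = [];
  channel.onopen = () => {
    // splice(0) empties the queue and returns everything that was in it.
    for (const text of pending.splice(0)) channel.send(text);
  };
  channel.onmessage = (event) => {
    // Data channels can also carry binary data; this sketch ignores it.
    if (typeof event.data === "string") onText(event.data);
  };
  return (text: string) => {
    if (channel.readyState === "open") channel.send(text);
    else pending.push(text);
  };
}
```

In a browser, one side would obtain the channel from `createDataChannel("chat")` on its `RTCPeerConnection` (the label is arbitrary), and the other side would receive it through the connection’s `datachannel` event.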

Bigger picture

WebRTC itself is a big step forward for the Internet: if you had told people 20 years ago that they would be making video calls from their browser, they would have laughed at you. I have to admit that I was skeptical at first, yet now I do it almost every day at work. More importantly, it’s a great example of the power of the Web to change people’s lives for the better, and of what we can do when we work together.

1. Technical note: perhaps the biggest issue for Firefox users is that people have implemented a Chrome-specific mechanism for handling multiple media streams, called “Plan B.” The IETF eventually adopted something called “Unified Plan,” and Chrome supports it (as does Google Meet), but there are still some services, such as Slack and Facebook Video Calling, that only use Plan B. This means that they don’t work with Firefox, which implements Unified Plan.

2. The Zoom Web client is an interesting example because it uses only part of WebRTC. Unlike (say) Google Meet, Zoom Web uses WebRTC to capture audio and video and to transmit media over the network, but uses WebAssembly to do all the audio and video processing locally. This demonstrates the power of WebAssembly, but if you compare Zoom Web head-to-head with other clients such as Meet or Jitsi, you can see the advantages of using the WebRTC APIs built into the browser.

3. Google has open-sourced their WebRTC stack, which makes it easier to write your own downloadable client, including one that will interoperate with browsers.
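The Plan B and Unified Plan difference from note 1 shows up in the session description (SDP) that the two sides exchange: Unified Plan gives each track its own “m=” media section, while Plan B folds all tracks of one kind into a single section. The sketch below counts media sections by kind. The function names are assumptions made for this illustration, and the check is only a heuristic, not a way to prove which plan a peer speaks.

```typescript
// Counts the "m=" (media) sections of each kind in an SDP blob.
function countMediaSections(sdp: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of sdp.split(/\r?\n/)) {
    // A media line looks like "m=video 9 UDP/TLS/RTP/SAVPF 96".
    const kind = /^m=(\w+) /.exec(line)?.[1];
    if (kind !== undefined) counts[kind] = (counts[kind] ?? 0) + 1;
  }
  return counts;
}

// Heuristic only: several sections of one kind point to Unified Plan, since
// Plan B would fold those tracks into a single section. One section per kind
// is ambiguous, so `false` here does not prove Plan B.
function hasUnifiedPlanShape(sdp: string): boolean {
  const counts = countMediaSections(sdp);
  return Object.keys(counts).some((kind) => (counts[kind] ?? 0) > 1);
}
```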