Abstract

In the Internet era of 2021, live streaming programs are emerging one after another, and the browser is one of the most accessible channels, gathering large numbers of users to watch live broadcasts. When many users watch the same live content at once, the server load grows with the audience, leading to slow startup, stalling, and other degradations of the user experience. The high cost of server bandwidth cannot be ignored either.

Is there a solution that can effectively reduce server load and bandwidth while ensuring user experience and quality of service? That’s where Web P2P technology comes in.

I. The history of Web P2P

In 2010, Adobe introduced the Real-Time Media Flow Protocol (RTMFP) in Flash Player 10.0, making P2P between Flash Player instances possible. Over the next few years the protocol spread across the web, but by the end of 2020 no browser supported Flash at all. So how do we do P2P on the Web now?

WebRTC, born in 2011, has become an HTML5 standard with the backing of Google, Microsoft, Apple, and Mozilla, and is supported by Chrome, Edge, Safari, and Firefox. We can therefore use WebRTC to achieve P2P.

HTML5 P2P based on WebRTC is referred to as H5 P2P for short below.

II. Introduction to WebRTC

The overall architecture of WebRTC is shown below. Browser developers usually need to focus on implementing the WebRTC C++ API; our goal is to implement P2P in the browser, so we only need the Web API.

H5 P2P as a whole is built on RTCPeerConnection and RTCDataChannel; it realizes P2P only at the data level and does not involve audio/video communication.
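As a minimal sketch (all names and the STUN address are illustrative, not the project's actual code), a data-only connection looks like this:

```typescript
// Minimal sketch of a data-only WebRTC connection. No audio/video tracks
// are added; only an RTCDataChannel carries media segments as bytes.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.example.com:3478" }], // hypothetical STUN address
});

// Ordered, reliable channel; a real implementation may tune these options.
const channel = pc.createDataChannel("p2p-data", { ordered: true });
channel.binaryType = "arraybuffer";

channel.onopen = () => console.log("data channel open");
channel.onmessage = (event: MessageEvent) =>
  handleChunk(event.data as ArrayBuffer);

function handleChunk(buf: ArrayBuffer): void {
  // Placeholder: hand the received piece to the storage module.
  console.log(`received ${buf.byteLength} bytes`);
}
```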

III. Overall structure

  1. H5 P2P client

The client mainly consists of a communication module, a transport module, a node management module, a scheduling module, a storage module, and a statistics module:

Communication module: mainly responsible for signaling communication with the server (Index, CPS, Tracker in the figure above)

Transport module: contains the HTTPClient, and the RTCClient (upload & download) implemented on top of RTCDataChannel

Node management module: handles active and passive P2P connection, disconnection, eviction, and synchronization of resource information between users

Scheduling module: decides between CDN and P2P according to buffer level, piece availability, and node quality and scale

Storage module: responsible for storing, ordering, and evicting media data

Statistics module: implements the event tracking (instrumentation) of system operation

  2. H5 P2P server

The server side contains the Index, CPS, Tracker, MQTT, and Stun services:

Index: the entry service that returns the addresses of the other services.

CPS: the resource property service, which converts a live stream URL into a resource ID (RID).

Tracker: A resource tracking service that stores index relationships between users and resources.

MQTT: the signaling service, which forwards the Offer/Answer/Candidate SDP messages used to establish P2P connections between users.

Stun: deployed with coturn; lets user nodes discover their public UDP addresses so peer connections can be established between them.
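For illustration only, here is a hedged sketch of forwarding Offer/Answer/Candidate messages through the MQTT signaling channel; the broker URL, topic layout, peer IDs, and message shape are all assumptions, not the actual protocol (the `mqtt` npm package is used):

```typescript
import mqtt from "mqtt"; // broker URL below is hypothetical

const myPeerId = "peer-123"; // in the real system this comes from the Tracker (assumed)
const pc = new RTCPeerConnection(); // the connection being negotiated

const signaling = mqtt.connect("wss://mqtt.example.com/signal");
signaling.subscribe(`signal/${myPeerId}`); // messages addressed to this peer

signaling.on("message", async (_topic, payload) => {
  const msg = JSON.parse(payload.toString());
  if (msg.type === "offer") {
    // Answer the remote peer's offer and send the answer back.
    await pc.setRemoteDescription(msg.sdp);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    signaling.publish(
      `signal/${msg.from}`,
      JSON.stringify({ type: "answer", from: myPeerId, sdp: pc.localDescription })
    );
  } else if (msg.type === "candidate") {
    await pc.addIceCandidate(msg.candidate);
  }
});

// Our own ICE candidates travel the same way.
pc.onicecandidate = (e) => {
  if (e.candidate) {
    signaling.publish(
      "signal/peer-456", // the remote peer's topic (hypothetical)
      JSON.stringify({ type: "candidate", from: myPeerId, candidate: e.candidate })
    );
  }
};
```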

IV. Process flow

The following figure briefly shows each stage of the overall flow:

Step 1: When the web page opens, first request the addresses of the other services from the Index service and save them in memory.

Step 2: On receiving a live streaming request, the client asks CPS to convert the URL into a resource ID (RID). Streams with the same content and clarity share the same RID, which means their data can be shared between viewers.

Step 3: The client first starts the HTTPClient to download data, guaranteeing instant startup. It then does two things at once:

A. The client sends an Address request to the Tracker service to obtain the addresses of other users watching the same broadcast; connections to them will be established later to share data.

B. The client sends a Publish request to the Tracker service to register its own information, so that other users can find it and share data.

Step 4: Establish connections with the users obtained from the Tracker to exchange information and share data with each other.
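A hedged sketch of the whole startup flow follows; every URL, endpoint, and payload shape below is an assumption for illustration, not the real protocol:

```typescript
// Placeholders standing in for the client's other modules.
declare function startHttpDownload(url: string): void; // HTTPClient module
declare function connectToPeer(peerId: string): void;  // RTCClient module
declare const myPeerId: string;                        // this client's ID

async function startLiveP2P(liveUrl: string): Promise<void> {
  // Step 1: fetch the addresses of the other services from Index.
  const services = await (await fetch("https://index.example.com/services")).json();

  // Step 2: ask CPS to map the live stream URL to a resource ID (RID).
  const { rid } = await (
    await fetch(`${services.cps}/resolve?url=${encodeURIComponent(liveUrl)}`)
  ).json();

  // Step 3: start the CDN (HTTP) download immediately for instant startup...
  startHttpDownload(liveUrl);

  // 3a: Address - fetch peers that are watching the same RID.
  const peers: { peerId: string }[] = await (
    await fetch(`${services.tracker}/address?rid=${rid}`)
  ).json();

  // 3b: Publish - register ourselves so other peers can find us.
  await fetch(`${services.tracker}/publish`, {
    method: "POST",
    body: JSON.stringify({ rid, peerId: myPeerId }),
  });

  // Step 4: establish P2P connections with the returned peers.
  peers.forEach((p) => connectToPeer(p.peerId));
}
```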

V. Connection establishment

Suppose data is to be transferred between Peer1 and Peer2; a connection must be established before transmission. In WebRTC, the ICE framework is used to establish peer connections between browsers, mainly involving STUN and TURN. Each is briefly introduced below:

  1. ICE

ICE (Interactive Connectivity Establishment) is a P2P connection framework that unifies several NAT traversal techniques (STUN, TURN). Following the offer/answer/candidate model, peers exchange multiple candidate addresses and then run connectivity checks, traversing the firewalls between user networks. ICE mainly builds on the STUN and TURN protocols.

  2. STUN

STUN (Session Traversal Utilities for NAT; originally Simple Traversal of UDP Through NAT) lets a client behind a NAT discover its public IP address and port and query the type of its NAT device. With this information, UDP communication can be established between two devices that are both behind NATs. For details of the STUN flow, see RFC 5389.

  3. TURN

TURN (Traversal Using Relays around NAT) is, in short, relaying. Not every pair of devices can establish a connection via STUN, so when STUN fails, TURN is used to relay the data instead.

Our goal is to save bandwidth, and relaying through TURN would not reduce the server's bandwidth consumption. Moreover, unlike a traditional 1-to-1 video call, our transfer model is many-to-many, so not every individual connection has to succeed. For these reasons, the H5 P2P project uses only STUN to establish connections.
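Accordingly, the ICE configuration contains only a STUN entry, and a failed connection is simply dropped rather than relayed. A brief sketch (the server address is a placeholder):

```typescript
// STUN-only ICE configuration - note there is no "turn:" entry.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.example.com:3478" }],
});

pc.oniceconnectionstatechange = () => {
  // Without TURN, some NAT pairs simply cannot connect. Unlike a 1-to-1
  // call, we can afford to drop this pair and try other peers instead.
  if (pc.iceConnectionState === "failed") {
    pc.close();
  }
};
```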

VI. Data download

  1. Resource coding

The Web P2P live broadcast scheme can support any number of live channels at the same time, and the independence of each channel's stream must be guaranteed. We therefore provide a unified resource coding method that uniquely identifies each live stream. The StreamId of the live stream is hashed with MD5 to produce a fixed-length resource ID (RID). This ensures that streams with the same content but different clarity get different RIDs, and that the RID stays the same when an interruption forces a switch of source address.
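A minimal sketch of this mapping (`computeRid` is our name, not the project's; Node's `crypto` module is used for illustration since the browser's SubtleCrypto offers no MD5):

```typescript
import { createHash } from "crypto"; // Node-style API; a browser build
// would need an MD5 library instead.

// Derive a fixed-length resource ID (RID) from the stream's StreamId.
function computeRid(streamId: string): string {
  return createHash("md5").update(streamId).digest("hex"); // 32 hex characters
}

// Streams with the same content but different clarity have different
// StreamIds, hence different RIDs; a source-address switch after an
// interruption keeps the StreamId, so the RID stays the same.
```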

  2. Download scheduling

Throughout a live broadcast, the playable cache is very small, usually no more than 10 seconds. So how do we decide, in real time, whether to use CDN or P2P according to the current network state and environment? Our best practices are as follows:

A. To protect the viewing experience, when the playable data in the buffer falls below a certain threshold (for example, 5 seconds), the client switches to CDN download.

B. If a future piece of data is needed but none of the surrounding peers have it, then even with plenty of buffered data left to play, strategy A alone is not enough: the client should still switch to CDN promptly so the data gets distributed as soon as possible. When this happens, the node and the nodes connected to it usually sit at the top of the live data distribution tree. To prevent the situation of "one person downloading while many people watch", and following the idea that downloading sooner means uploading sooner, not only must the node itself download from CDN promptly, but a certain proportion (say 15%) of the surrounding users must do so as well, improving the overall distribution efficiency, as sketched below.
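A hedged sketch of this decision; the thresholds, field names, and `SchedulerState` shape are illustrative, not the production values:

```typescript
interface SchedulerState {
  bufferSeconds: number;      // playable data ahead of the play point
  neighborsHaveNext: boolean; // does any connected peer own the next piece?
  cdnPeerRatio: number;       // fraction of neighbors currently on CDN
}

const MIN_BUFFER_SECONDS = 5;    // rule A: protect the viewing experience
const MIN_CDN_PEER_RATIO = 0.15; // rule B: keep ~15% of a neighborhood on CDN

function chooseSource(s: SchedulerState): "CDN" | "P2P" {
  // Rule A: buffer is running low - fall back to CDN immediately.
  if (s.bufferSeconds < MIN_BUFFER_SECONDS) return "CDN";
  // Rule B: we sit at the top of the distribution tree (no neighbor has
  // the next piece) and too few neighbors are feeding from CDN.
  if (!s.neighborsHaveNext && s.cdnPeerRatio < MIN_CDN_PEER_RATIO) return "CDN";
  return "P2P";
}
```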

  3. P2P task allocation

A. Task ID

Each data block carries a natural sequence number (sn). Each block is split into fixed-length slices (16 KB by default), indexed by chunk_index. The tuple (rid, sn, chunk_index) therefore uniquely identifies one slice.
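As a small sketch (the type and helper names are ours, not the project's), the identifier can be modeled as:

```typescript
// A slice is uniquely addressed by (rid, sn, chunkIndex).
const SLICE_SIZE = 16 * 1024; // the 16 KB split described above

interface SliceId {
  rid: string;        // resource ID of the live stream
  sn: number;         // sequence number of the data block
  chunkIndex: number; // index of the 16 KB slice inside the block
}

// A string key is convenient for maps and for naming requests on the wire.
const sliceKey = (s: SliceId): string => `${s.rid}/${s.sn}/${s.chunkIndex}`;
```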

B. Assign requests

The allocation process resembles dealing cards in poker. The only hard requirement is that the peer actually owns the data block; beyond that, the better a node's quality, the more of the important data close to the play point it is assigned. The following diagram briefly shows 2 data blocks (9 data slices) being allocated to 3 nodes of equal quality.
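A hedged sketch of the dealing loop, reusing `SliceId` and `sliceKey` from the sketch above; the plain round-robin policy is our simplification for peers of equal quality:

```typescript
interface Peer {
  id: string;
  has(key: string): boolean; // does this peer own the slice?
}

function dealSlices(slices: SliceId[], peers: Peer[]): Map<string, SliceId[]> {
  const plan = new Map<string, SliceId[]>(
    peers.map((p): [string, SliceId[]] => [p.id, []])
  );
  let cursor = 0; // rotates like the dealer's hand
  for (const slice of slices) { // assumed sorted: closest to the play point first
    for (let tried = 0; tried < peers.length; tried++) {
      const peer = peers[(cursor + tried) % peers.length];
      if (peer.has(sliceKey(slice))) {
        plan.get(peer.id)!.push(slice);
        cursor = (cursor + tried + 1) % peers.length;
        break;
      }
      // If nobody owns the slice, it falls back to CDN (rule B above).
    }
  }
  return plan;
}
```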

  4. Memory control

In Web P2P, all media data is cached in memory, but not all of it is kept. In live streaming, every user's viewing point is very close together, so the cache always holds a fixed-size window of data before and after the play point; this is the portion whose sharing utilization is highest.

As shown above, the colored blocks are the data currently held in the memory cache: the red block is the current play point, yellow blocks are content already viewed, and green blocks are content not yet played. When the play point moves from 100 to 101, block 97 is evicted, while block 103 is downloaded as the newest data and cached in memory.
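A hedged sketch of this sliding window; the window sizes are illustrative, chosen so the 100-to-101 example above works out:

```typescript
// Keep a fixed number of blocks behind and ahead of the play point,
// evicting everything else.
class SegmentCache {
  private readonly store = new Map<number, ArrayBuffer>(); // sn -> block data
  constructor(private readonly behind = 3, private readonly ahead = 2) {}

  put(sn: number, data: ArrayBuffer): void {
    this.store.set(sn, data);
  }

  // Called whenever the play point advances. With behind = 3 and
  // ahead = 2, moving from sn 100 to 101 evicts block 97 while the
  // downloader admits block 103, matching the figure above.
  onPlayPoint(playSn: number): void {
    for (const sn of this.store.keys()) {
      if (sn < playSn - this.behind || sn > playSn + this.ahead) {
        this.store.delete(sn); // outside the shared window
      }
    }
  }
}
```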

VII. Engineering practice

  1. Google FlatBuffers

FlatBuffers is used both for communication with the server and for P2P transport between peers. With its cross-platform support, memory efficiency, and flexible schema evolution, FlatBuffers is adopted uniformly across our platform and greatly reduces communication costs.

  2. Multi-browser compatibility

Not only is there a huge variety of browsers supporting WebRTC in the wild, but different versions of the same browser can behave quite differently. To address this pain point, we adopted webrtc-adapter (adapter.js).
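A minimal usage sketch: simply importing the package applies the shims, and `adapter.browserDetails` can be logged for diagnostics:

```typescript
import adapter from "webrtc-adapter"; // shims vendor prefixes and API differences

// After the import, the standard WebRTC API can be used uniformly;
// adapter.browserDetails reports what was detected.
console.log(adapter.browserDetails.browser, adapter.browserDetails.version);
```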

  3. Memory pool

Although JavaScript runtimes such as Node.js come with their own garbage collection and memory management strategies, GC and memory allocation still carry costs, and the Web side is a single-threaded environment, so every performance gain reduces lost time and leaves more headroom for high-throughput network traffic and business logic. We therefore adopted a memory pool strategy for allocation and achieved essentially 100% memory reuse.
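A hedged sketch of such a pool; the block size and preallocation count are illustrative:

```typescript
// Fixed-size ArrayBuffers are reused instead of re-allocated for every
// network read, cutting GC pauses.
class BufferPool {
  private readonly free: ArrayBuffer[] = [];
  constructor(private readonly blockSize = 16 * 1024, prealloc = 256) {
    for (let i = 0; i < prealloc; i++) this.free.push(new ArrayBuffer(blockSize));
  }

  acquire(): ArrayBuffer {
    // Reuse a freed block when possible; fall back to a fresh allocation.
    return this.free.pop() ?? new ArrayBuffer(this.blockSize);
  }

  release(buf: ArrayBuffer): void {
    if (buf.byteLength === this.blockSize) this.free.push(buf); // return to pool
  }
}
```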

VIII. Technical results

The figure below shows the time-sliced bandwidth statistics of a large live broadcast, with real data from 19:25 to 21:00. As viewers flow in and out, Web P2P delivers considerable bandwidth savings throughout. Our overall assessment is that, without affecting the user experience, Web P2P can both effectively reduce the load on the server and cut a considerable proportion of the bandwidth cost.

This technology has now been deployed at scale across various live broadcasts and strikes a good balance between service quality and bandwidth cost. We have also found room for improvement in super-large-scale broadcasts, mainly the following:

(1) The performance of the coturn and MQTT services is limited, which can constrain live broadcast quality.

(2) Connecting to edge networks to build a more diversified network would further ease CDN load and reduce CDN bandwidth cost.