Introduction to WebRTC
WebRTC is a real-time communication solution launched by Google.
It covers audio and video capture, encoding and decoding, data transmission, and audio and video rendering, among other functions.
Despite its name, WebRTC supports audio and video communication not only between Web browsers but also on Android and iOS.
The underlying technology
- VideoEngine
  - VP8 codec
  - Jitter buffer: dynamic jitter buffering
  - Image enhancements
- VoiceEngine
  - iSAC/iLBC/Opus codecs
  - NetEQ speech signal processing
  - Echo cancellation and noise reduction
- Session Management
- iSAC audio compression
- VP8 video codec (from Google's own WebM project)
- APIs (Native C++ API, Web API)
Although the underlying implementation of WebRTC is extremely complex, the API for developers is very concise and can be divided into three main aspects:
- Network Stream API
  - MediaStream: a media data stream
  - MediaStreamTrack: a media source
- RTCPeerConnection
  - RTCPeerConnection: allows two browsers to communicate with each other directly
  - RTCIceCandidate: a candidate for the ICE protocol
  - RTCIceServer
- DataChannel
Network Stream API
There are two main APIs here: MediaStream and MediaStreamTrack.
A MediaStreamTrack represents a single type of media track (a VideoTrack or an AudioTrack), which gives us the possibility of mixing different tracks to achieve various effects.
A MediaStream is a complete audio and video stream that can contain multiple MediaStreamTrack objects; its main role is to keep the media tracks it contains playing together in sync, which is what we usually call audio and video synchronization.
For example:
A LocalMediaStream represents a media stream from a local capture device such as a webcam or microphone.
To create and use a local stream, a Web application must request access from the user through the getUserMedia() function.
Once the application is finished with the stream, it can revoke its access by calling the stop() function.
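A minimal sketch of this lifecycle, assuming a page with a single <video> element for local preview:

// Request a local audio and video stream from the user's devices
navigator.mediaDevices
  .getUserMedia({ audio: true, video: true })
  .then(stream => {
    // A MediaStream is a collection of MediaStreamTrack objects
    stream.getTracks().forEach(track => {
      console.log(track.kind, track.label); // "audio" or "video", plus the device label
    });

    // Attach the stream to the <video> element for local preview
    const video = document.querySelector('video');
    video.srcObject = stream;
    video.play();

    // When finished, stop every track to release the camera and microphone
    // (the per-track equivalent of the old LocalMediaStream.stop())
    // stream.getTracks().forEach(track => track.stop());
  })
  .catch(error => console.error('getUserMedia error:', error));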
RTCPeerConnection
Above we managed to obtain a MediaStream object, but it was still limited to local viewing.
How do two peers exchange streaming media with each other (that is, make audio and video calls)?
The answer is that we have to set up a peer-to-peer connection, which is exactly what RTCPeerConnection does.
Before we do that, we need to understand one concept: the signaling server.
For two devices on the public network to find out who the other is, an intermediary is needed to negotiate and exchange their information.
That is what a signaling server does: it acts as the go-between.
Once a peer connection is established, the MediaStream can be sent directly to the remote browser.
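A minimal sketch of the caller side, assuming localStream is the stream obtained from getUserMedia earlier; sendToSignalingServer() and the remoteVideo element are placeholders for whatever signaling transport and page markup you actually use:

const pc = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
});

// Send our local tracks to the remote peer
localStream.getTracks().forEach(track => pc.addTrack(track, localStream));

// Hand ICE candidates to the other side via the signaling server
pc.onicecandidate = event => {
  if (event.candidate) sendToSignalingServer({ candidate: event.candidate });
};

// Render the remote peer's stream when it arrives
pc.ontrack = event => {
  document.querySelector('#remoteVideo').srcObject = event.streams[0];
};

// Caller: create an offer and send it through the signaling server
async function call() {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToSignalingServer({ sdp: pc.localDescription });
}

// When the remote answer comes back via signaling:
// await pc.setRemoteDescription(answer);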
DataChannel
Each underlying SCTP stream actually represents a one-way logical channel that provides the notion of in-order delivery.
Messages can be sent ordered or unordered; delivery order is preserved only for ordered messages sent on the same stream.
However, the DataChannel API has been designed to be bidirectional, meaning that each DataChannel is made up of a pair of incoming and outgoing SCTP streams.
The DataChannel setup (that is, creating the SCTP association) is performed when createDataChannel() is first called on an instantiated PeerConnection object.
Each subsequent call to createDataChannel() only creates a new DataChannel within the existing SCTP association.
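A minimal sketch, assuming pc is the RTCPeerConnection from the previous section:

// Create a channel; ordered: true (the default) keeps in-order delivery,
// set it to false to allow out-of-order delivery on this channel
const channel = pc.createDataChannel('chat', { ordered: true });

channel.onopen = () => channel.send('hello over the data channel');
channel.onmessage = event => console.log('received:', event.data);

// The remote peer receives the channel through the datachannel event
pc.ondatachannel = event => {
  const remoteChannel = event.channel;
  remoteChannel.onmessage = e => console.log('remote received:', e.data);
};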
Data processing and transmission process
WebRTC provides two threads externally: Signal and Worker. The former is responsible for signaling data processing and transmission, while the latter is responsible for media data processing and transmission.
WebRTC has a series of threads working together to complete the data flow pipeline.
Take the video data processing flow as an example.
The Capture thread collects raw data from the camera and passes it to the Worker thread.
The Worker thread acts as a porter, forwarding the data to the Encoder thread without any special processing.
The Encoder thread calls a specific encoder (such as VP8 or H.264) to encode the raw data, and the encoded output is then packetized into RTP packets.
The RTP packets are handed to the Pacer thread for smooth, paced sending; the Pacer thread pushes them to the Network thread, which finally sends them out over the Internet.
Principle of audio and video recording
Audio and video playback principle
Get the video stream from the camera
The MediaStream interface is used to represent media data streams. (Streams can be input or output, local or remote)
A single MediaStream can contain zero or more tracks. (Each track has a corresponding MediaStreamTrack object)
A MediaStreamTrack represents content that contains one or more channels, where the channels have a defined, known relationship to each other.
All tracks in MediaStream are synchronized at render time.
For example, a MediaStream may consist of a single video track and two different audio tracks (left and right channels).
We are used to simply passing {video: true, audio: true} and controlling the display of the video window with CSS styles.
But the API actually accepts a constraints object that can set the video resolution and aspect ratio, the camera facing mode (front or rear), audio and video frame rates, and so on.
navigator.mediaDevices
.getUserMedia({
audio: true,
video: {
width: 1280,
height: 720
}
})
.then(stream => {
console.log(stream);
});
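The constraints object supports far more than width and height. A sketch using the standard facingMode and frameRate constraints (the values are only for illustration):

navigator.mediaDevices
  .getUserMedia({
    audio: true,
    video: {
      width: { ideal: 1280 },
      height: { ideal: 720 },
      frameRate: { ideal: 30, max: 60 }, // preferred and maximum frame rate
      facingMode: 'user' // 'user' = front camera, 'environment' = rear camera
    }
  })
  .then(stream => console.log(stream))
  .catch(error => console.error(error));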
If you want to implement screen recording (screen sharing), you only need to change the media source parameter, for example from the camera to the screen:
navigator.mediaDevices
.getUserMedia({
video: {
mediaSource: 'screen'
}
})
.then(stream => {
console.log(stream);
});
This is currently only supported by Firefox (Chrome and Edge use a different approach, see below).
A dialog will pop up asking which application window to record.
For a more detailed use of Constraints, see this blog post: getUserMedia() Video Constraints
Ok~ This part is very simple, here is a simple Demo:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Document</title>
  </head>
  <body>
    <h1><code>getUserMedia()</code> very simple demo</h1>
    <video></video>
    <script>
      navigator.getUserMedia =
        navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia;

      const localVideo = document.querySelector('video');

      // MediaStreamConstraints specifies which tracks (audio, video, or both) are requested
      const constraints = { audio: false, video: true };

      function successCallback(stream) {
        localVideo.srcObject = stream;
        localVideo.play();
      }

      function errorCallback(error) {
        console.error('navigator.getUserMedia error: ', error);
      }

      if (navigator.mediaDevices.getUserMedia) {
        navigator.mediaDevices
          .getUserMedia(constraints)
          .then(successCallback)
          .catch(errorCallback);
      } else {
        navigator.getUserMedia(constraints, successCallback, errorCallback);
      }
    </script>
  </body>
</html>
Above we learned about the screen sharing API, which feels just like the "screen sharing" feature we commonly use.
So can we implement screen recording with it?
"What we learn from books is always shallow; true understanding comes from doing." Something that looks very simple can turn out to be surprisingly tricky until you actually implement it.
First draw three buttons:
<button @click="start" :disabled="disabled. Start "> </button> <button @click="stop" </button> < button@click ="download" :disabled="disabled. Download "> Download file </button>Copy the code
Add a simple style:
button {
  margin: 0 1em 1em 0;
  padding: 0.5em 1.2em 0.6em 1.2em;
  border: none;
  border-radius: 4px;
  background-color: #d84a38;
  font-family: 'Roboto', sans-serif;
  font-size: 0.8em;
  color: white;
  cursor: pointer;
}
button:hover {
  background-color: #c03434;
}
button[disabled] {
  background-color: #c03434;
  pointer-events: none;
}
Initialize data:
data() {
  return {
    // media stream being recorded
    stream: null,
    // recorded data chunks
    chunks: [],
    // object URL of the recording result
    recording: null,
    // button disabled states
    disabled: { start: false, stop: true, download: true }
  };
},
Required methods:
methods: {
  // Get screen sharing permission
  openScreenCapture() { ... },
  // Start screen sharing recording
  async start() { ... },
  // Stop screen sharing recording
  stop() { ... },
  // Download the recorded video
  download() { ... }
}
OK, let's go into each method and see what it needs to do.
The first thing we need is screen sharing permission.
Because each browser implements this differently, some compatibility handling is required here.
// Get screen sharing permission
openScreenCapture() {
  if (navigator.getDisplayMedia) {
    return navigator.getDisplayMedia({ video: true });
  } else if (navigator.mediaDevices.getDisplayMedia) {
    return navigator.mediaDevices.getDisplayMedia({ video: true });
  } else {
    return navigator.mediaDevices.getUserMedia({
      video: { mediaSource: 'screen' }
    });
  }
},
After the "Start recording" button is clicked, first update the disabled state of the three buttons.
If there is a previous recording that has not been cleaned up, release it with revokeObjectURL().
After obtaining screen sharing permission, instantiate a MediaRecorder object to record and store the stream.
Listen for the dataavailable event and push the data into the chunks array as it becomes available.
// Start screen sharing recording
async start() {
  this.disabled.start = true;
  this.disabled.stop = false;
  this.disabled.download = true;

  if (this.recording) {
    window.URL.revokeObjectURL(this.recording);
  }

  // Get screen sharing permission
  this.stream = await this.$options.methods.openScreenCapture();

  this.mediaRecorder = new MediaRecorder(this.stream, { mimeType: 'video/webm' });

  // Listen for available data
  this.mediaRecorder.addEventListener('dataavailable', event => {
    if (event.data && event.data.size > 0) {
      this.chunks.push(event.data);
    }
  });

  // Start recording, emitting dataavailable events every 10 ms
  this.mediaRecorder.start(10);
},
After the "Stop recording" button is clicked, the recorded chunks need to be saved to an in-memory object URL for the subsequent download.
// Stop screen sharing recording
stop() {
  this.disabled.start = true;
  this.disabled.stop = true;
  this.disabled.download = false;

  // Stop recording
  this.mediaRecorder.stop();
  // Release the MediaRecorder
  this.mediaRecorder = null;
  // Stop all tracks of the stream
  this.stream.getTracks().forEach(track => track.stop());
  // Release the stream obtained from getDisplayMedia or getUserMedia
  this.stream = null;

  // Get an in-memory object URL for the recorded file
  this.recording = window.URL.createObjectURL(
    new Blob(this.chunks, { type: 'video/webm' })
  );
},
When the "Download file" button is clicked, the href of the hidden download link is updated and its click event is triggered programmatically so the browser prompts the user to download the file.
// Download the recorded video
download() {
  this.disabled.start = false;
  this.disabled.stop = true;
  this.disabled.download = true;

  const downloadLink = document.querySelector('a#download');
  downloadLink.href = this.recording;
  // The download attribute specifies the file name to use
  downloadLink.download = 'screen-records.webm';
  downloadLink.click();
}
With the above operations done, we find that the screen share did export a video.
But…
The video has no sound: the volume button shows a "mute" icon and the volume cannot be adjusted.
getDisplayMedia only captures a video track by default, not an audio track.
In any case, none of the browsers I tried captured an audio track, and I was not able to verify the approach some blogs suggest for enabling audio:
navigator.mediaDevices.getDisplayMedia({
video: true,
audio: true
})
However, from the earlier sections we know that a MediaStream is composed of multiple MediaStreamTracks.
So it should be possible to add an audio track to the video stream obtained from getDisplayMedia to build a MediaStream with both video and audio.
Nothing else needs to change; we only modify the MediaStream just before handing it to the MediaRecorder.
// ...
// Get microphone permission (this returns a MediaStream, not a single track)
const audioStream = await navigator.mediaDevices.getUserMedia({ audio: true });

// Get screen sharing permission
this.stream = await this.$options.methods.openScreenCapture();

// Add the microphone's audio track to the screen-capture MediaStream
this.stream.addTrack(audioStream.getAudioTracks()[0]);

this.mediaRecorder = new MediaRecorder(this.stream, { mimeType: 'video/webm' });
// ...
Now it becomes a real screen recording with sound.