The background of this article: some time ago my company asked me to study WebRTC in order to build some features on top of it. After searching Google and reading other people's examples, I put together this article, though various things delayed it until now (if there are any mistakes, please point them out and I will fix them promptly).

The contents of this article are as follows:

  • WebRTC basics
  • WebRTC terminology

Contains examples:

  1. WebRTC local one-to-one video call example

  2. WebRTC online one-to-one video call example

  3. WebRTC example based on PeerJS

  4. Recording example based on WebRTC + MediaRecorder

  5. Recording example based on WebRTC + WebAudio

  6. Recording compression example based on WebRTC + WebAudio

You can learn:

  1. End-to-end real-time chat based on WebRTC

  2. End-to-end real-time video chat based on WebRTC

  3. Recording function based on WebRTC

Introduction

What is WebRTC?

WebRTC (Web Real-Time Communications) is a real-time communication technology that allows web applications or sites to establish peer-to-peer connections between browsers without resorting to intermediaries, enabling the transmission of video streams, audio streams, or any other data. WebRTC includes standards that make it possible to create peer-to-peer data sharing and teleconferencing without installing any plug-ins or third-party software.

WebRTC is a technology set, just like WebComponent, with an API consisting of three parts:

  • getUserMedia: obtains local media resources (camera, microphone)
  • RTCPeerConnection: establishes a local proxy for peer-to-peer media streaming
  • RTCDataChannel: a point-to-point data channel

The advantages of WebRTC

  • Platform and device independence. Developers can build WebRTC-based applications for any WebRTC-enabled browser without worrying about compatibility at the device or operating-system level. WebRTC also provides a standard API (from the W3C) and standard protocol support (from the IETF) to avoid platform compatibility issues.
  • Secure voice and video. WebRTC encrypts voice and video with SRTP, so that, for example, voice and video transmitted over an insecure Wi-Fi network cannot be eavesdropped on.
  • Advanced voice and video processing. WebRTC supports up-to-date codecs: Opus for audio and VP8 for video. Built-in codecs eliminate the security risks of third-party downloads and can adapt to the network environment for better voice or video quality.
  • Reliable transport. WebRTC provides reliable transport sessions, including stable transmission in NAT environments.
  • Multimedia stream processing. WebRTC can converge multiple media types and sources, and provides extensions to RTP and SDP.
  • Adaptation to different network conditions. Because WebRTC runs on the web platform, it is very sensitive to network conditions and bandwidth. It detects and adapts to the network environment and bandwidth requirements to avoid congestion; this is guaranteed through RTCP and SAVPF.
  • Good interoperability with VoIP voice and video. WebRTC can interoperate with other media stacks, including SIP, Jingle, and XMPP. If you need to interconnect with other traditional protocols, a WebRTC gateway can provide smooth compatibility with them.

What can WebRTC do

  • Online chat room
  • Online video chat
  • Remote real-time monitoring
  • Screen sharing
  • Point-to-point transfer of large files
  • Real-time games
  • Live streaming
  • …
  • And any other scenario that needs real-time communication

The development history

The first voice communication protocol

WebRTC can be traced back to the earliest voice communication protocol (and the earliest Internet streaming protocol), the Network Voice Protocol (NVP), first implemented in 1973 by Danny Cohen, a network researcher at the Information Sciences Institute of the University of Southern California. The original research goal of this technology was to demonstrate digitizing high-quality, low-bandwidth, encrypted voice to meet the general military need for global encrypted voice communications.

The protocol consists of two parts: a control protocol and a data transport protocol. The control protocol covers relatively basic telephony functions such as caller identification, ringing, vocoder negotiation, and call termination. Data messages carry the vocoded voice; for the vocoder format, a "frame" is defined as a packet containing the digital voice samples for one transmission interval.

Real time transport protocol

The Real-Time Transport Protocol (RTP) used in WebRTC was published by the Multimedia Transport Working Group of IETF in RFC 1889 in 1996.

The RTP protocol specifies the standard packet format for delivering audio and video over the Internet. It was originally designed as a multicast protocol, but has since been used in many unicast applications. RTP is commonly used in streaming media systems (in conjunction with RTSP), video conferencing, and Push to Talk systems (in conjunction with H.323 or SIP), making it the technical foundation of the IP telephony industry. The RTP protocol is used together with the RTP control protocol RTCP and is built on top of UDP.

The RTP standard defines two sub-protocols:

  • RTP (the data transfer protocol): for real-time data transmission; it provides timestamps (for synchronization), sequence numbers (for packet-loss and reordering detection), and the payload format (specifying the encoding format of the data)
  • RTCP (the control protocol): used for QoS feedback and media-stream synchronization; its bandwidth share is only about 5% of RTP's

WebRTC development history

  • Global IP Solutions (GIPS) was founded in 1990 in Stockholm, Sweden, to develop VoIP and provide an extremely high-quality voice engine. Skype, QQ, WebEx, Vidyo, and other well-known applications used its audio processing engine, which included patented echo cancellation algorithms, low-latency algorithms for network jitter and packet loss, and advanced audio codecs. Google also licensed GIPS technology for Google Talk.
  • In May 2010, Google acquired the GIPS engine from VoIP software developer Global IP Solutions for $68.2 million and renamed the technology "WebRTC." WebRTC uses the GIPS engine to achieve web-based video conferencing, supporting codecs such as G.722, PCM, iLBC, and iSAC, while using Google's own VP8 video codec; it also supports RTP/SRTP transmission.
  • Google integrated the software into its Chrome browser in January 2012. Meanwhile, the FreeSWITCH project announced support for the iSAC audio codec.

Related terms

NAT

NAT is short for Network Address Translation, a mechanism that lets devices on a private network share a public IP address.

A typical network environment sits behind a NAT device (such as a router). The NAT device rewrites the source and destination IP addresses as IP packets pass through it. Home routers use Network Address Port Translation (NAPT), which rewrites not only IP addresses but also TCP and UDP ports, so that devices on the intranet share the same external IP address. Our devices are often behind a NAT device: on a university campus network, the IP assigned to your machine is actually an intranet IP, which means you are behind the campus NAT; if you then connect through a router in the dormitory, your packets pass through yet another layer of NAT.

One consequence of the NAT mechanism is that all requests sent from the Internet toward the intranet are blocked by the NAT, so a device behind a NAT device cannot be reached directly from the Internet. This gives rise to the problem known as NAT traversal (intranet penetration).

Hole punching: user A proactively sends data to user B. Although A's data does not reach B, B can now send data back to A through the hole A opened in its own NAT. This process is called hole punching.

To solve the NAT traversal problem, user A and user B each punch a hole in their own NAT device so that they can communicate normally.

There are four types of NAT networks:

  • Full cone NAT
  • Restricted cone NAT
  • Port-restricted cone NAT
  • Symmetric NAT

SDP

SDP stands for Session Description Protocol. SDP describes the parameters of a multimedia connection, such as resolution, formats, codecs, and encryption algorithms; it describes the media, not the media stream itself.

SDP is not really a protocol, but a data format. The data structure consists of one or more lines of UTF-8 text, each starting with a one-character type, followed by an equals sign ("="), followed by structured text containing a value or description whose format depends on the type. A line of text that begins with a given letter is often called a "letter line." For example, lines that provide the media description have type "m", so these lines are called "m-lines".
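For illustration, the first few lines of an SDP body might look like this (the values here are placeholders; note the "m" line describing an Opus audio stream):

v=0
o=- 4611731400430051336 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=rtpmap:111 opus/48000/2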

ICE

ICE, which stands for Interactive Connectivity Establishment, is a protocol framework that allows your browser to establish connections with peer browsers.

In practice, there are many reasons why a direct end-to-end connection may fail. In those cases you need a way around firewalls, and a uniquely visible address for the device.

ICE integrates various NAT traversal techniques, such as STUN and TURN.
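As a sketch of how this looks in code, STUN and TURN servers are handed to the browser through the iceServers option when constructing an RTCPeerConnection; the URLs and credentials below are placeholders:

// A minimal sketch; server URLs and credentials are placeholders
const peer = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.example.com:3478' },
    {
      urls: 'turn:turn.example.com:3478',
      username: 'user',      // placeholder credential
      credential: 'password' // placeholder credential
    }
  ]
})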

STUN

STUN is short for Session Traversal Utilities for NAT. It is a network protocol that allows a client behind a NAT to discover its own public address and determine whether the NAT prevents a direct connection to a peer.

In an actual WebRTC application, clients cannot reach each other through the signaling server alone; we need to obtain the public addresses of the local and remote ends through a NAT session traversal service. The general process is as follows (think of it as the two parties in a matchmaking process exchanging WeChat contacts):

TURN

TURN, short for Traversal Using Relays around NAT, is a data transfer protocol (an extension of STUN, RFC 5389) that allows TCP or UDP connections to traverse NATs or firewalls.

TURN is a relay service that forwards all data between the two ends, as shown in the figure. This kind of service is obviously expensive (in the matchmaking analogy: the two parties not only have no WeChat, they are also very introverted, so the matchmaker has to relay every single message and gets rather tired), so WebRTC applies it only when a connection cannot be established between the two ends and the STUN service cannot be used:

Signaling

Signaling is the process of sending control information between two devices to determine communication protocols, channels, media codecs and formats, as well as data transmission methods, and any desired routing information.

Signaling is not defined in the specification, so we need to choose our own message protocol (SIP/XMPP), two-way channel (WebSocket/XMLHttpRequest), and persistent server API (such as a data channel). The reason for this design is that the specification cannot predict all possible WebRTC use cases; it is better to let developers choose the appropriate networking technologies and messaging protocols.

There are three basic types of information exchanged during signaling:

  • Control messages: used to set up, open, and close the communication channel, and to handle errors.

  • Connection information: the IP addresses and ports the devices need in order to talk to each other.

  • Media capability negotiation: which codecs and media data formats the interacting parties can understand.

The whole signaling process is roughly as follows:

  1. Each peer creates an RTCPeerConnection object representing its end of the WebRTC session.
  2. Each peer registers a handler for the icecandidate event, which sends the candidates to the other peer through the signaling channel as they are gathered.
  3. Each peer registers a handler for the track event, which fires when the remote peer adds a track to its stream. The handler should attach the track to a sink, for example a <video> element.
  4. The caller creates and shares with the receiver a unique identifier or token so that the call between them can be identified by the code on the signaling server. The exact content and form of this identifier is up to you.
  5. Each peer connects to an agreed signaling server, such as a WebSocket server, with which they both know how to exchange messages.
  6. Each peer tells the signaling server that it wants to join the same WebRTC session (identified by the token from step 4).
  7. The peers exchange descriptions, candidate addresses, and so on, as sketched below.
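A minimal caller-side sketch of this flow, assuming a WebSocket signaling server whose message format (the type/sdp/candidate fields) is invented here for illustration:

const remoteVideo = document.querySelector('video') as HTMLVideoElement
const socket = new WebSocket('wss://signaling.example.com') // hypothetical signaling server
const peer = new RTCPeerConnection()

// Step 2: forward local ICE candidates through the signaling channel
peer.onicecandidate = (e) => {
  if (e.candidate) socket.send(JSON.stringify({ type: 'candidate', candidate: e.candidate }))
}

// Step 3: attach remote tracks to a <video> element
peer.ontrack = (e) => { remoteVideo.srcObject = e.streams[0] }

// Step 7 (caller side): create and send the offer
async function call () {
  const offer = await peer.createOffer()
  await peer.setLocalDescription(offer)
  socket.send(JSON.stringify({ type: 'offer', sdp: offer.sdp }))
}

// Step 7 (both sides): apply the remote description and candidates as they arrive
socket.onmessage = async (msg) => {
  const data = JSON.parse(msg.data)
  if (data.type === 'answer') {
    await peer.setRemoteDescription({ type: 'answer', sdp: data.sdp })
  } else if (data.type === 'candidate') {
    await peer.addIceCandidate(data.candidate)
  }
}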

Signaling server

A signaling server exchanges data between two devices with as little exposure as possible. It does not need to process the data; it only forwards it to the corresponding device. In essence, it is a relay service.

You can think of the signaling service as a modern matchmaker: both parties state their requirements, the matchmaker finds a suitable candidate, and with both parties' consent they exchange contact information. Whether they finally meet through the matchmaker or chat online, the outcome no longer needs to go through the matchmaker, and the matchmaker keeps both parties' information private.

Related technical points

Media streaming and media channels

MediaStreamTrack: a single audio or video track, usually contained in a media stream; with multiple tracks, audio and video can be kept in sync.

MediaStream: a stream of media content containing multiple audio and video tracks.

WebRTC obtains a media stream through navigator.mediaDevices.getUserMedia(); the stream is usually played back with a <video> or <audio> element:

const { localVideo, peer } = this

this.mediaStream = await navigator.mediaDevices.getUserMedia({
  audio: true,
  video: true
})
localVideo.srcObject = this.mediaStream
localVideo.onloadedmetadata = async () => {
  await localVideo.play()
}

The list of media input and output devices is available through navigator.mediaDevices.enumerateDevices():

const devices: Array<MediaDeviceInfo> = await navigator.mediaDevices.enumerateDevices()
if (!devices.find((item: MediaDeviceInfo) => item.kind === 'audioinput')) {
  alert('Device does not support audio')
}
if (!devices.find((item: MediaDeviceInfo) => item.kind === 'videoinput')) {
  alert('Device does not support video')
}

A media stream can be closed by stopping each of its tracks, obtained via getTracks():

const tracks = this.mediaStream?.getTracks() ?? []
for (const track of tracks) {
  track.stop()
}

WebRTC local agent

RTCPeerConnection is used to create a connection proxy from the local device to a remote WebRTC peer. It provides methods to create, maintain, monitor, and close the connection. RTCPeerConnection is also the basis of the data channel, which can be created only once the local proxy is connected to the remote end.

The RTCPeerConnection point-to-point connection is the most difficult part of WebRTC. Although WebRTC simplifies a lot of the work for us, beginners can still run into many pitfalls, such as NAT traversal.

The general process is as follows (for ease of understanding, I borrowed a sequence diagram from elsewhere 😋):

You can see an example here

Data channel

WebRTC uses the RTCDataChannel interface to create a two-way data channel between the two agents. Over this connection we can transfer strings, ArrayBuffer, ArrayBufferView, Blob, and so on. It is commonly used together with RTCPeerConnection (the channel is created through RTCPeerConnection's createDataChannel()).

Only the caller needs to create the data channel; the peer only needs to listen for the data channel connection.

Create a data channel:

const dc = this.peer.createDataChannel(this.$socket.id)

Sending data:

dc.send(message)

Listening for data sent by the peer:

dc.onmessage = (e: MessageEvent) => {
  this.messages.push({ sender: 'remote', message: e.data })
}

The peer listens for the data channel connection:

this.peer.ondatachannel = (e: RTCDataChannelEvent) => {
  const dc = this.dataChannel = e.channel
  dc.onmessage = (e: MessageEvent) => {
    this.messages.push({ sender: 'remote', message: e.data })
  }
}

For specific examples, click here

Browser Support

According to the support table on Can I Use, modern browsers support WebRTC well. As for IE (abandoned by its parent Microsoft in favor of the slimmer Edge), the consensus on all fronts is:

IE is dead, Edge is standing.

Small native WebRTC examples

This is important

To obtain media devices, the page must be served over HTTPS (localhost is treated as secure); otherwise the devices cannot be accessed.

Local video call simulation

Here's an online example of a simulated local video call (your device must support video, of course):

The code can be viewed here

The general process is shown below (except that the signaling server is omitted and the descriptions are set directly on both local peers):

Example of a simulated local end-to-end video call + data channel

Similar to local video, but with the addition of a data channel connection, see here for an example

Used in the project

If you have worked through the previous native examples, you will have found that although native WebRTC implements a lot of complex machinery for us, there are still many steps to establish a connection, and it is hard to debug when any step fails. In short, native WebRTC has a lot of pitfalls.

Therefore, in a real project we generally use a third-party wrapper library; here we choose PeerJS. PeerJS helps with many of these problems, such as browser compatibility, the connection process, and error localization. PeerJS also provides a default broker service (though generally you would host your own; you can set up your own broker service with peerjs-server).

Preparation

Peer service

Although WebRTC is end-to-end communication, some signaling and candidate information needs to be exchanged through the server.

We have three options:

  • Use PeerJS's built-in broker service
  • Set up a broker service with peerjs-server
  • Have the back end implement the broker service

I chose the second option for the demonstration. In this example the service is started directly:

Install the package globally:

npm install peer -g

Then start the service directly (if HTTPS is not required):

peerjs --port 9000 --key peerjs --path /appName

If HTTPS is required, the --sslkey and --sslcert parameters are needed; here the Nginx certificate is reused:

peerjs --port 9000 --key peerjs --path /appName --sslkey sslKeyPath --sslcert sslCertPath

For applications that need more freedom, or to combine with an existing Node.js service, the peer middleware can be used, for example:

// With Express:
const express = require('express')
const { ExpressPeerServer } = require('peer')

const app = express()
app.get('/', (req, res, next) => res.send('Hello world!'))

const server = app.listen(9000)
const peerServer = ExpressPeerServer(server, { path: '/myapp' })
app.use('/peerjs', peerServer)

// Or with a plain Node http server:
const http = require('http')
const httpServer = http.createServer(app)
const peerServer2 = ExpressPeerServer(httpServer, { debug: true, path: '/myapp' })
app.use('/peerjs', peerServer2)
httpServer.listen(9000)

To verify that the signaling service is up, visit http://127.0.0.1:9000/myapp; if it returns JSON in the following format, the service started successfully:

{
    "name": "PeerJS Server",
    "description": "A server side element to broker connections between PeerJS clients.",
    "website": "http://peerjs.com/"
}

PeerServer's event listeners let you observe device connections and disconnections:

peerServer.on('connection', (client) => { ... })
peerServer.on('disconnect', (client) => { ... })

Note: if the domain name bound to the SSL certificate does not resolve to the public IP address of the server hosting the peer service, you must first open the test connection in a browser before the client can connect, and this workaround expires after a while (see the pitfalls section below).

STUN and TURN services

When the devices are on different NAT networks, the STUN service is used for NAT traversal; when the STUN service is unavailable, the TURN service relays the data.

Setting up a STUN service

STUN services can be set up in many ways. On Windows you can set one up quickly using STUNTMAN, which also provides a client-side test program; for a tutorial, see the link here. Note that if you are testing in the cloud rather than locally, STUN needs a server with two public IP addresses.

Setting up a TURN service

There are many ways to set up TURN, and many people use coturn, but on a Windows server that takes a lot of setup work, which is not great for learning and shipping examples quickly. So we use node-turn to build the service quickly with Node. For details, please refer to the official site, or check out the examples.

Example

An example of video chat using PeerJS, with source code
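For orientation, a minimal PeerJS video-call client might look like the following sketch (the host, port, and path are placeholders that must match your own peer service; error handling is omitted):

import Peer from 'peerjs'

const remoteVideo = document.querySelector('video') as HTMLVideoElement

// Connect to the self-hosted broker service (parameters must match the peerjs server above)
const peer = new Peer({ host: 'your.domain.com', port: 9000, path: '/appName', secure: true })

// Caller side: dial the remote peer's id with the local stream
async function callRemote (remoteId: string) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  const call = peer.call(remoteId, stream)
  call.on('stream', (remoteStream) => { remoteVideo.srcObject = remoteStream })
}

// Callee side: answer an incoming call with the local stream
peer.on('call', async (call) => {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  call.answer(stream)
  call.on('stream', (remoteStream) => { remoteVideo.srcObject = remoteStream })
})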

Pitfalls to watch out for

WebRTC requires HTTPS to be enabled

HTTP is considered insecure, and recent versions of WebRTC require HTTPS. If HTTPS is not enabled, the following error is thrown when requesting the user's camera or microphone permission:

Uncaught (in promise) TypeError: Cannot read property 'getUserMedia' of undefined

The STUN service only works on a server with two public IP addresses

During testing, the STUN service built locally worked normally, but the one built on a cloud server did not. After a long time digging through documentation, it turned out that it needs two public IP addresses to work. So if you really need to use WebRTC in your project, the TURN service is recommended.

The signaling service's SSL certificate domain must resolve to the server's IP address

After setting up the peer service, you usually test whether the service is available and then use it in the client.

If the signaling service is set up locally, or the domain name on the SSL certificate does not resolve to the server's public IP address, the peer service cannot be accessed, and other devices report the following error during testing:

ERROR PeerJS:  Error: Lost connection to server.

At this point, visiting the test connection URL directly makes the service reachable again. The cause is that the browser considers the link insecure and blocks access to it (opening the page directly will first block the link and ask whether you want to proceed).

Solution: Resolve the domain name to the public IP address of the server.

Implementing recording with WebRTC

There are two ways to achieve the recording function:

  • WebRTC+MediaRecorder+WebAudio
  • WebRTC+WebAudio

Related knowledge points

MediaRecorder+MediaStream records audio

This approach is easier because it records audio directly with the interface provided by the browser. However, it is only compatible with some browsers:

A MediaStream can be obtained through the navigator.mediaDevices.getUserMedia() interface; see the earlier audio/video call examples for details.

Events and functions used:

  • The MediaRecorder() constructor takes a MediaStream and records it.

  • The MediaRecorder.ondataavailable event handler handles the recorder's dataavailable event; the event callback provides a Blob of recorded data. dataavailable fires, returning an object containing the Blob, in the following situations:

  • The media stream ends

  • The recorder is actively stopped

  • MediaRecorder.requestData() is called

  • If a timeslice was set when recording started, every time that interval elapses

  • The MediaRecorder.onstop event handler fires when the recorder finishes recording

  • MediaRecorder.start(): starts recording; it can take a timeslice in milliseconds to split the recording into separate chunks

  • MediaRecorder.stop(): stops recording

Recording process:

  1. Obtain the media stream through navigator.mediaDevices.getUserMedia()

  2. Create a MediaRecorder instance from the media stream, specifying the bit rate and recording type

  3. Attach the relevant events, such as dataavailable and stop

  4. Call MediaRecorder.start() at the appropriate time to start recording

  5. Call MediaRecorder.stop() at the appropriate time to end recording

Example:

// Get the microphone media stream
const stream = this.mediaStream = await navigator.mediaDevices.getUserMedia({
  audio: { sampleRate: 48000, channelCount: 2 },
  video: false
})
// eslint-disable-next-line @typescript-eslint/ban-ts-comment
// @ts-ignore
// Create the recorder
const recorder = this.recorder = new MediaRecorder(stream, {
  audioBitsPerSecond: 256000,
  mimeType: 'audio/webm'
})
// Collect the recorded data as it becomes available
recorder.ondataavailable = async (e: { data: Blob }) => {
  console.log(e.data)
  this.audioData.push(e.data)
}
// When recording stops, assemble the collected chunks into a file
recorder.onstop = () => {
  const time = (new Date()).toISOString().replace('T', ' ')
  this.currentFile = new File(this.audioData, `${time}.mp3`, { type: 'audio/mpeg' })
  this.audioData = []
}
// Slice the recorded data every 10 milliseconds
recorder.start(10)
// When we need to stop recording
this.recorder.stop()

More documentation is available here

MediaStream+WebAudio records audio

AudioContext

An AudioContext is the WebAudio class used to process audio. It represents an audio processing graph built from linked AudioNodes. Any audio operation must first create an AudioContext instance, since all audio operations happen within the context.

const ac = new AudioContext()

AudioNode

An AudioNode is a generic audio-processing module (its source can be an audio or video element, a MediaStream, and so on) with inputs and outputs. Here we create an input source node from the MediaStream and connect it into the AudioContext graph:

const ac = new AudioContext()
const mediaNode = this.mediaNode = ac.createMediaStreamSource(stream)
mediaNode.connect(jsNode) // jsNode is the ScriptProcessorNode created below

ScriptProcessorNode

ScriptProcessorNode is used to generate, process, and analyze audio, and inherits from AudioNode. It is connected to two buffers: one holds the input audio data, the other holds the processed output audio data. Its onaudioprocess event handler (implementing the AudioProcessingEvent interface) fires whenever audio flows into the input buffer.

We use this feature to listen for incoming audio and then save the data:

const creator = ac.createScriptProcessor.bind(ac)
const jsNode = this.jsNode = creator(16384, 2, 2)
jsNode.connect(ac.destination)
jsNode.onaudioprocess = (e: AudioProcessingEvent) => {
  const audioBuffer = e.inputBuffer
  const leftData = audioBuffer.getChannelData(0)
  const rightData = audioBuffer.getChannelData(1)
  this.volume = rightData[rightData.length - 1] + 1
  // Pitfall: without a copy, every chunk would reference the same underlying buffer
  this.leftAudioData.push(leftData.slice(0))
  this.rightAudioData.push(rightData.slice(0))
}

The specific process

  1. Obtain a media stream through navigator.mediaDevices.getUserMedia
  2. Create an AudioContext
  3. Create an AudioNode from the media stream and connect it into the AudioContext
  4. Create a ScriptProcessorNode, add the data-inflow event, and connect it to the AudioNode
  5. Add a stop-recording button and an event handler to process the data

Example

Recording:

// Get the microphone media stream
const stream = this.mediaStream = await navigator.mediaDevices.getUserMedia({
  audio: { sampleRate: 48000, channelCount: 2 },
  video: false
})
// Save the recording via WebAudio
const ac = new AudioContext()
// Create an audio node from the media stream
const mediaNode = this.mediaNode = ac.createMediaStreamSource(stream)
// Bind createScriptProcessor to the AudioContext, then create the processing node
const creator = ac.createScriptProcessor.bind(ac)
const jsNode = this.jsNode = creator(16384, 2, 2)
// Connect the processing node to the AudioContext destination
jsNode.connect(ac.destination)
// Connect the media node to the processing node
mediaNode.connect(jsNode)
// Add the audio inflow event
jsNode.onaudioprocess = (e: AudioProcessingEvent) => {
  const audioBuffer = e.inputBuffer
  const leftData = audioBuffer.getChannelData(0)
  const rightData = audioBuffer.getChannelData(1)
  this.volume = rightData[rightData.length - 1] + 1
  // Pitfall: without a copy, every chunk would reference the same underlying buffer
  this.leftAudioData.push(leftData.slice(0))
  this.rightAudioData.push(rightData.slice(0))
}

Stop recording:

const { mediaStream } = this
// Get all media tracks and stop them
const tracks = mediaStream.getTracks()
clearInterval(timer)
tracks.forEach(track => { track.stop() })
// Stop the recording nodes
this.jsNode.disconnect()
this.mediaNode.disconnect()
// Merge the buffered data; mergeArray, mergeChannels and createAudioFile are helpers
// from the sample project (mergeChannels is an illustrative name for the helper that
// interleaves the two channels)
const left = mergeArray(this.leftAudioData)
const right = mergeArray(this.rightAudioData)
const audioData = mergeChannels(left, right)
this.currentFile = createAudioFile(audioData)
this.leftAudioData = []
this.rightAudioData = []

Playing the recording

The result is an ArrayBuffer, which is decoded using the AudioContext; an AudioNode can then play the audio:

function playAudio (file: File, errMsg: string): Promise<void> {
  if (!file) { return Promise.reject(errMsg) }
  // Create a file reader
  const fileReader = new FileReader()
  // Play the audio through an AudioContext once the file is read
  fileReader.onload = async () => {
    // Create an AudioContext
    const audioContext = new AudioContext()
    // Create an AudioNode through the AudioContext
    const audioNode = audioContext.createBufferSource()
    // Let the AudioContext decode the binary data
    audioNode.buffer = await audioContext.decodeAudioData(fileReader.result as ArrayBuffer)
    audioNode.connect(audioContext.destination)
    audioNode.start(0)
  }
  // Read the selected file
  fileReader.readAsArrayBuffer(file)
  return Promise.resolve()
}

Download to local

Now that we have data of type File, we can use the URL.createObjectURL() method to create a URL pointing at the data, bind it to a dynamically created anchor element, and trigger its click event:

function downloadToLocal (file: File, errMsg: string): Promise<void> {
  if (!file) { return Promise.reject(Error(errMsg)) }
  const time = (new Date()).toISOString().replace('T', ' ')
  const a = document.createElement('a')
  a.href = URL.createObjectURL(file)
  a.setAttribute('download', `${time}.mp3`)
  a.click()
  return Promise.resolve()
}

Recording file compression

Files recorded with MediaRecorder are already optimized, so they do not need compression; we only need to compress audio recorded with WebAudio.

Looking at the WAV file header, the operations we can perform are as follows (a sketch of the corresponding file layout follows the list):

  1. Convert two channels to mono
  2. Lower the sample size (bits per sample)
  3. Lower the sample rate
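The createAudioFile helper in the stop-recording example is not shown in this article; as a rough sketch of what such a helper can do (assuming 16-bit PCM — encodeWav and its exact layout below are illustrative, not the author's actual code), a WAV file is a 44-byte header followed by the sample data:

function encodeWav (samples: Float32Array, channels: number, sampleRate: number): Blob {
  const bytesPerSample = 2 // 16-bit PCM
  const buffer = new ArrayBuffer(44 + samples.length * bytesPerSample)
  const view = new DataView(buffer)
  const writeString = (offset: number, s: string) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i))
  }
  writeString(0, 'RIFF')
  view.setUint32(4, 36 + samples.length * bytesPerSample, true)    // RIFF chunk size
  writeString(8, 'WAVE')
  writeString(12, 'fmt ')
  view.setUint32(16, 16, true)                                     // fmt chunk size
  view.setUint16(20, 1, true)                                      // format: PCM
  view.setUint16(22, channels, true)                               // channel count (1 after mono conversion)
  view.setUint32(24, sampleRate, true)                             // sample rate
  view.setUint32(28, sampleRate * channels * bytesPerSample, true) // byte rate
  view.setUint16(32, channels * bytesPerSample, true)              // block align
  view.setUint16(34, 8 * bytesPerSample, true)                     // bits per sample
  writeString(36, 'data')
  view.setUint32(40, samples.length * bytesPerSample, true)        // data chunk size
  let offset = 44
  for (const sample of samples) {
    const s = Math.max(-1, Math.min(1, sample))                    // clamp to [-1, 1]
    view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7fff, true)
    offset += bytesPerSample
  }
  return new Blob([view], { type: 'audio/wav' })
}

The three compression steps below simply change the channels, bits-per-sample, and sampleRate fields written here, together with the sample data itself.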

Converting two channels to mono

The easiest way to change the number of channels is as follows (see the sketch after this list):

  1. Specify mono input and output when creating the ScriptProcessorNode (the actual volume is halved)
  2. Store only the first channel's audio data when it flows into the buffer
  3. Set the channel count to 1 when creating the audio file
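A minimal sketch of steps 1 and 2, assuming ac is the AudioContext from the earlier examples:

const monoData: Array<Float32Array> = []
const jsNode = ac.createScriptProcessor(16384, 1, 1) // mono in, mono out
jsNode.onaudioprocess = (e: AudioProcessingEvent) => {
  monoData.push(e.inputBuffer.getChannelData(0).slice(0)) // only channel 0 is kept
}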

Lowering the sample size

Changing the sample size only requires writing 8-bit samples instead of 16-bit ones when generating the audio file.
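For example, a Float32 sample in [-1, 1] can be mapped to an unsigned 8-bit value like this (a sketch; the bits-per-sample field in the file header must then be set to 8 to match):

// Map a Float32 sample in [-1, 1] to an unsigned 8-bit value (0..255)
function to8Bit (sample: number): number {
  const s = Math.max(-1, Math.min(1, sample)) // clamp out-of-range samples
  return Math.round((s + 1) / 2 * 255)
}
// Samples are then written with view.setUint8(offset, to8Bit(sample)) instead of setInt16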

Lowering the sample rate

The sample rate determines the amount of data per second. Lowering it effectively reduces file size, at the cost of less crisp audio. The rate can be reduced by discarding collected samples proportionally, skipping through the data (ratio = original sample rate / reduced sample rate; since we discard data from the ArrayBuffer by skipping proportionally, the two rates must be integer multiples of each other).

At the same time, the sample rate written into the generated audio file's header must match.
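A sketch of the proportional skipping described above (the two rates must divide evenly):

function downsample (data: Float32Array, originalRate: number, targetRate: number): Float32Array {
  const ratio = originalRate / targetRate // must be an integer, e.g. 48000 / 16000 = 3
  const result = new Float32Array(Math.floor(data.length / ratio))
  for (let i = 0; i < result.length; i++) {
    result[i] = data[i * ratio] // keep every ratio-th sample, discard the rest
  }
  return result
}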

Example

WebRTC+WebAudio recording compression example, and code

Examples

WebRTC+MediaRecorder+WebAudio recording examples, and code

WebRTC+WebAudio recording examples, and code

WebRTC+WebAudio recording compression example, and code

Finally

I will update it if I have the opportunity and time:

  • WebRTC multi-party video calls

  • WebRTC multi-party real-time chat

  • WebRTC large file transfers

  • …

If this article happens to be helpful to you, please give it a like 😊