• WebRTC processing before encoding

  • The data collection

  • Audio Unit Processing Graph

  • AEC and NS algorithms

  • AEC and AGC implementation

In WebRTC, there are two major aspects that need special attention before audio data is sent to the encoder: data collection and audio processing.

iOS provides several audio capture and playback interfaces: the file-based AVAudioRecorder, AVAudioPlayer, and AVPlayer, and the Audio Unit Processing Graph, which operates on raw PCM audio data.

AVAudioRecorder records sound directly to a local file and supports MP3 and other formats. AVAudioPlayer plays back local recording files in MP3 and other formats. AVPlayer is similar to AVAudioPlayer, but it can also play network files given a URL. The Audio Unit Processing Graph provides real-time processing of raw PCM audio data. To build something like WeChat voice messages, where each voice message is a recording file, AVAudioRecorder, AVAudioPlayer, and AVPlayer are a great fit; for real-time calling features such as video conferencing, the Audio Unit Processing Graph is the right choice.

WebRTC uses the Audio Unit Processing Graph, and it uses the same AudioUnit for both capture and playback, enabling hardware echo suppression as described in Apple's documentation.

Beyond audio capture and playback, the Audio Unit Processing Graph provides many other audio processing capabilities, such as reverb, mixing, and format conversion; these require combining one or more AudioUnits into an AUGraph.

The data collection

Data collection is primarily handled by the Audio Device module and is platform- and configuration-dependent.

  1. Mac computers use the CoreAudio API and generally run with the default built-in sound card at fs=48kHz, stereo.
  2. On Windows, WebRTC uses WASAPI. Depending on the sound card, there is more choice in sampling rate and other parameters. For example, on some machines the built-in AEC, once enabled, outputs fs=16kHz, mono; with the sound card's Audio Enhancement turned off, the output is fs=48kHz, stereo.
  3. Android generally uses the Java-layer AudioRecord framework.
  4. iOS generally uses the AudioUnit framework.
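For reference, here is a hedged sketch of how an audio device module is typically created in WebRTC so that the per-platform backend listed above (CoreAudio, WASAPI, AudioRecord, or AudioUnit) is picked automatically. The exact factory signature varies across WebRTC versions, and a TaskQueueFactory (for example from webrtc::CreateDefaultTaskQueueFactory()) is assumed to be supplied by the caller.

```cpp
#include "api/task_queue/task_queue_factory.h"
#include "modules/audio_device/include/audio_device.h"

// Sketch only: Create()'s exact signature has changed across WebRTC releases.
rtc::scoped_refptr<webrtc::AudioDeviceModule> CreatePlatformAdm(
    webrtc::TaskQueueFactory* task_queue_factory) {
  // kPlatformDefaultAudio maps to CoreAudio, WASAPI, AudioRecord, or AudioUnit.
  auto adm = webrtc::AudioDeviceModule::Create(
      webrtc::AudioDeviceModule::kPlatformDefaultAudio, task_queue_factory);
  adm->Init();
  adm->InitRecording();
  adm->StartRecording();
  return adm;
}
```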

In addition, data collection also involves peripherals such as USB headsets, 3.5mm wired headsets, and Bluetooth headsets. These devices affect the subsequent audio processing on the audio link: they add capture delay, headsets with speech enhancement modify the audio spectrum, and some headset peripherals, when used improperly, leave the audio link with no sound at all.

Audio Processing 

Audio Processing mainly includes AEC, AGC, NS and so on:

  • AEC: Acoustic Echo Cancellation.
  • AGC: Automatic Gain Control, used to adjust the volume of the input signal.
  • NS: Noise Suppression.

The output data from the audio device is processed in turn by the AEC, NS, AGC, and other audio processing modules.
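As an illustration, the sketch below shows how these stages can be enabled and driven through WebRTC's AudioProcessing module. The config field names and buffer handling follow recent releases and may differ in other versions; the function names are illustrative.

```cpp
#include <cstdint>

#include "modules/audio_processing/include/audio_processing.h"

// Sketch only: config field names follow recent WebRTC releases.
rtc::scoped_refptr<webrtc::AudioProcessing> CreateApm() {
  auto apm = webrtc::AudioProcessingBuilder().Create();
  webrtc::AudioProcessing::Config config;
  config.echo_canceller.enabled = true;     // AEC
  config.noise_suppression.enabled = true;  // NS
  config.gain_controller1.enabled = true;   // AGC
  apm->ApplyConfig(config);
  return apm;
}

// Called once per 10 ms frame (480 samples at 48 kHz, mono).
void ProcessOneFrame(webrtc::AudioProcessing* apm,
                     int16_t* playout_10ms, int16_t* capture_10ms) {
  webrtc::StreamConfig mono_48k(48000, 1);
  // Far-end (playout) audio goes in first so the AEC knows what echo to subtract.
  apm->ProcessReverseStream(playout_10ms, mono_48k, mono_48k, playout_10ms);
  apm->set_stream_delay_ms(50);  // estimated render-to-capture delay
  apm->ProcessStream(capture_10ms, mono_48k, mono_48k, capture_10ms);
}
```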

AEC and AGC

AEC:

  1. BuiltInAEC: on Windows and Android, the system's built-in AEC is generally enabled by default.
  2. AECM: the echo cancellation algorithm for mobile, suitable for Android and iOS.
  3. AEC: the echo cancellation algorithm for Windows/Mac desktop. It can also be used on mobile, and in some cases its echo-leak performance is better than AECM; however, the latest WebRTC has removed this old AEC code.
  4. AEC3: Google's overhaul of the old AEC algorithm, which by now has completely replaced it.
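In recent WebRTC versions, the choice between the full AEC3 and the mobile AECM is made through the same AudioProcessing config. A hedged fragment (field names may differ by version):

```cpp
// Sketch: pick the echo canceller variant on an existing AudioProcessing instance.
void SelectEchoCanceller(webrtc::AudioProcessing* apm, bool use_aecm) {
  webrtc::AudioProcessing::Config config = apm->GetConfig();
  config.echo_canceller.enabled = true;
  // mobile_mode = true selects AECM; false selects the full AEC3.
  config.echo_canceller.mobile_mode = use_aecm;
  apm->ApplyConfig(config);
}
```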

AGC:

  1. Legacy AGC
  2. AGC2
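Likewise, the two AGC variants map to two sections of the AudioProcessing config. A hedged fragment (field and enum names follow recent releases):

```cpp
// Sketch: choose between the legacy AGC and AGC2.
void SelectAgc(webrtc::AudioProcessing* apm, bool use_agc2) {
  webrtc::AudioProcessing::Config config = apm->GetConfig();
  config.gain_controller1.enabled = !use_agc2;  // legacy AGC
  config.gain_controller1.mode =
      webrtc::AudioProcessing::Config::GainController1::kAdaptiveDigital;
  config.gain_controller2.enabled = use_agc2;   // AGC2
  apm->ApplyConfig(config);
}
```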

Audio Unit Processing Graph

Create AudioUnit

The componentType is set to kAudioUnitType_Output; this type can be used for capture as well as playback.

The componentSubType is set to kAudioUnitSubType_VoiceProcessingIO (VPIO for short), because this subtype provides built-in echo cancellation (AEC) and automatic gain control (AGC).
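A minimal sketch of creating the VPIO unit with the C AudioUnit API (variable and function names are illustrative, not WebRTC's exact code):

```cpp
#include <AudioToolbox/AudioToolbox.h>

// Sketch: create a Voice-Processing I/O (VPIO) audio unit.
AudioUnit CreateVpioUnit() {
  AudioComponentDescription desc = {};
  desc.componentType = kAudioUnitType_Output;
  desc.componentSubType = kAudioUnitSubType_VoiceProcessingIO;  // built-in AEC/AGC
  desc.componentManufacturer = kAudioUnitManufacturer_Apple;

  AudioComponent component = AudioComponentFindNext(nullptr, &desc);
  AudioUnit vpio_unit = nullptr;
  AudioComponentInstanceNew(component, &vpio_unit);
  return vpio_unit;
}
```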

Set the callbacks for capture and playback data:

  • Use the AudioUnitSetProperty function to configure the AudioUnit: set the data format and register the data callbacks.

  • An AudioUnit of the VPIO type can capture and play audio simultaneously, but it has two buses: bus 1 (kInputBus) for audio capture and bus 0 (kOutputBus) for audio playback.
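Continuing the sketch above, the two buses are enabled and the callbacks are registered with AudioUnitSetProperty; the callback names are placeholders.

```cpp
// Placeholder callbacks with the AURenderCallback signature: a real capture callback
// pulls the recorded data with AudioUnitRender, and a real render callback fills ioData.
static OSStatus OnDeliverRecordedData(void*, AudioUnitRenderActionFlags*,
                                      const AudioTimeStamp*, UInt32, UInt32,
                                      AudioBufferList*) { return noErr; }
static OSStatus OnGetPlayoutData(void*, AudioUnitRenderActionFlags*,
                                 const AudioTimeStamp*, UInt32, UInt32,
                                 AudioBufferList*) { return noErr; }

// Sketch: enable both buses and register the callbacks.
void ConfigureVpioCallbacks(AudioUnit vpio_unit) {
  // Capture on bus 1 (input scope), playback on bus 0 (output scope).
  UInt32 enable = 1;
  AudioUnitSetProperty(vpio_unit, kAudioOutputUnitProperty_EnableIO,
                       kAudioUnitScope_Input, 1, &enable, sizeof(enable));
  AudioUnitSetProperty(vpio_unit, kAudioOutputUnitProperty_EnableIO,
                       kAudioUnitScope_Output, 0, &enable, sizeof(enable));

  AURenderCallbackStruct capture_cb = {OnDeliverRecordedData, nullptr};
  AudioUnitSetProperty(vpio_unit, kAudioOutputUnitProperty_SetInputCallback,
                       kAudioUnitScope_Global, 1, &capture_cb, sizeof(capture_cb));

  AURenderCallbackStruct render_cb = {OnGetPlayoutData, nullptr};
  AudioUnitSetProperty(vpio_unit, kAudioUnitProperty_SetRenderCallback,
                       kAudioUnitScope_Input, 0, &render_cb, sizeof(render_cb));
}
```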

Initialize AudioUnit

Initialize the VPIO:
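A minimal sketch of the initialization step itself; in practice the stream format (the fields described in the next section) is applied before this call, and the retry loop is only an assumption about reasonable error handling.

```cpp
// Sketch: initialize the configured VPIO unit, retrying because the call can
// fail transiently (real code would typically wait briefly between attempts).
bool InitializeVpioUnit(AudioUnit vpio_unit) {
  for (int attempt = 0; attempt < 5; ++attempt) {
    if (AudioUnitInitialize(vpio_unit) == noErr) return true;
  }
  return false;
}
```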

Initialize the AGC

sample_rate: WebRTC's own code determines the sampling rate; it is set to 48000 Hz if the device is multi-core (i.e., not an iPhone 4S), and to 16000 Hz otherwise.

mFormatID: the audio format constant, set to kAudioFormatLinearPCM, which indicates raw PCM data.
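Putting those two fields into an AudioStreamBasicDescription might look like the sketch below; the remaining fields describe 16-bit mono interleaved PCM and are assumptions, not WebRTC's exact values.

```cpp
// Sketch: describe 16-bit mono linear PCM at the chosen sample rate and apply it to
// the capture side (output scope of bus 1) and the playback side (input scope of bus 0).
void SetStreamFormat(AudioUnit vpio_unit, double sample_rate) {
  AudioStreamBasicDescription format = {};
  format.mSampleRate = sample_rate;          // e.g. 48000 or 16000, as described above
  format.mFormatID = kAudioFormatLinearPCM;  // raw PCM
  format.mFormatFlags =
      kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
  format.mChannelsPerFrame = 1;  // mono
  format.mBitsPerChannel = 16;
  format.mBytesPerFrame = 2;
  format.mFramesPerPacket = 1;
  format.mBytesPerPacket = 2;

  AudioUnitSetProperty(vpio_unit, kAudioUnitProperty_StreamFormat,
                       kAudioUnitScope_Output, 1, &format, sizeof(format));
  AudioUnitSetProperty(vpio_unit, kAudioUnitProperty_StreamFormat,
                       kAudioUnitScope_Input, 0, &format, sizeof(format));
}
```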

Start the AudioUnit

Stop AudioUnit
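Starting and stopping then come down to a few calls on the same unit (a sketch with error handling omitted; function names are illustrative):

```cpp
// Sketch: start capturing/rendering, and later stop and tear the unit down.
void StartVpioUnit(AudioUnit vpio_unit) {
  AudioOutputUnitStart(vpio_unit);
}

void StopAndDisposeVpioUnit(AudioUnit vpio_unit) {
  AudioOutputUnitStop(vpio_unit);
  AudioUnitUninitialize(vpio_unit);
  AudioComponentInstanceDispose(vpio_unit);
}
```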

There is too much code to include it all here…