Overview
In the previous article, we reviewed the basic principles of variable-speed audio playback on Android and introduced two algorithm implementations commonly used on the platform: Sonic and SoundTouch.
The preliminary conclusion was that when a user enables variable-speed playback, we should switch between implementations according to the specific scenario to give the best user experience.
For example, SoundTouch is preferred for regular music, especially tracks with backing instruments and percussion, while Sonic is a better choice for purer vocal audio (crosstalk, storytelling, a cappella singing, etc.).
This article takes Google's open-source ExoPlayer as an example and analyzes, at the source level, how the player's built-in Sonic implements variable-speed playback. After that, we integrate SoundTouch so that different application scenarios can use different speed-change implementations.
Sonic source code analysis
1. Introduction to the AudioProcessor
ExoPlayer integrates Sonic by default for changing audio speed and pitch. It uses a Java version of Sonic, which makes it a good place to start learning.
To customize your own audio effects, such as speed change, voice changing, or background sound, all you need to do is implement the AudioProcessor interface provided by ExoPlayer.
This kind of design is common: OkHttp interceptors and the View event dispatch and interception mechanism follow the same idea.
// An audio processor interface that takes audio data as input and transforms it, potentially modifying its channel count, encoding, or sample rate.
public interface AudioProcessor {
  // Configures the processor to process input audio in the specified format.
  // 1. After calling this method, call isActive() to determine whether the processor is active. If it is active, the return value is the configured output audio format;
  // 2. After calling this method, flush() must be called to apply the new configuration;
  // 3. Before the new configuration is applied, it is still safe to queue input and read output in the old input/output format;
  // 4. When the configuration changes, call queueEndOfStream().
  AudioFormat configure(AudioFormat inputAudioFormat) throws UnhandledAudioFormatException;
  // Returns whether the processor is active and will process input buffers.
  boolean isActive();
  // Queues the audio data in the input buffer for processing.
  void queueInput(ByteBuffer buffer);
  // Marks the input stream as ended.
  // After this call, getOutput() returns the remaining output data; it may take several calls to read all of it. Once all remaining output has been read, isEnded() returns true.
  void queueEndOfStream();
  // Returns a buffer containing processed output data between its position and limit.
  ByteBuffer getOutput();
  // Returns whether the processor will produce no more output until flush() is called and new input is queued.
  boolean isEnded();
  // Clears all buffered data and pending output.
  // If the processor is still active afterwards, it is ready for a new input stream using the latest configured format.
  void flush();
  // Resets the processor and releases all resources.
  void reset();
}
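The call order implied by the interface (queue input, signal end of stream, then drain output until isEnded() returns true) can be sketched without ExoPlayer. The sketch below is only an illustration under stated assumptions: `MiniProcessor` is a hypothetical, simplified stand-in for AudioProcessor (no configure, flush, or reset), and `HalfGainProcessor` is an invented effect that halves the amplitude of 16-bit PCM samples.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical, simplified stand-in for ExoPlayer's AudioProcessor (not the real interface). */
interface MiniProcessor {
  void queueInput(ByteBuffer input);
  void queueEndOfStream();
  ByteBuffer getOutput();
  boolean isEnded();
}

/** Invented effect for illustration: halves the amplitude of 16-bit PCM samples. */
final class HalfGainProcessor implements MiniProcessor {
  private static final ByteBuffer EMPTY = ByteBuffer.allocate(0);
  private ByteBuffer output = EMPTY;
  private boolean inputEnded;

  @Override
  public void queueInput(ByteBuffer input) {
    ByteBuffer out = ByteBuffer.allocate(input.remaining()).order(input.order());
    while (input.remaining() >= 2) {
      out.putShort((short) (input.getShort() / 2)); // reading consumes the input
    }
    out.flip();
    output = out;
  }

  @Override
  public void queueEndOfStream() {
    inputEnded = true;
  }

  @Override
  public ByteBuffer getOutput() {
    // Hand the pending output to the caller exactly once
    ByteBuffer result = output;
    output = EMPTY;
    return result;
  }

  @Override
  public boolean isEnded() {
    return inputEnded && !output.hasRemaining();
  }
}

final class ProcessorDemo {
  /** Feeds all samples through the processor and collects the processed samples. */
  static short[] run(MiniProcessor processor, short[] samples) {
    ByteBuffer input = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.nativeOrder());
    for (short s : samples) {
      input.putShort(s);
    }
    input.flip();
    short[] result = new short[samples.length];
    int written = 0;
    processor.queueInput(input);
    processor.queueEndOfStream();
    while (!processor.isEnded()) {
      ByteBuffer out = processor.getOutput();
      while (out.remaining() >= 2) {
        result[written++] = out.getShort();
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(run(new HalfGainProcessor(), new short[] {100, -200, 300})));
  }
}
```

A real processor would also need configure(), flush(), and reset(), and would usually reuse its output buffer rather than allocate one per call.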
2. SonicAudioProcessor process analysis
In ExoPlayer, when the user changes the playback speed, the setting is applied in DefaultAudioSink:
// DefaultAudioSink.java
public final class DefaultAudioSink implements AudioSink {
  public static class DefaultAudioProcessorChain implements AudioProcessorChain {
    // ...
    private final SonicAudioProcessor sonicAudioProcessor;
    // When player.setSpeed() is called, the speed setting is applied through sonicAudioProcessor.setSpeed()
    @Override
    public PlaybackParameters applyPlaybackParameters(PlaybackParameters playbackParameters) {
      silenceSkippingAudioProcessor.setEnabled(playbackParameters.skipSilence);
      return new PlaybackParameters(
          sonicAudioProcessor.setSpeed(playbackParameters.speed),
          sonicAudioProcessor.setPitch(playbackParameters.pitch),
          sonicAudioProcessor.setVolume(playbackParameters.volume),
          playbackParameters.skipSilence);
    }
  }
}
Understanding the complete workflow of SonicAudioProcessor helps in understanding Sonic itself, which contains the logic that actually handles speed and pitch changes. Let's focus on the core flow here:
// SonicAudioProcessor.java
public final class SonicAudioProcessor implements AudioProcessor {
  private Sonic sonic;
  private ByteBuffer buffer;
  private ShortBuffer shortBuffer;
  private ByteBuffer outputBuffer;
  // 1. Set the playback speed and mark Sonic for recreation, so that it is rebuilt before new audio data is processed
  public float setSpeed(float speed) {
    if (this.speed != speed) {
      this.speed = speed;
      pendingSonicRecreation = true;
    }
    return speed;
  }
  // 2. Clear the buffered data, then rebuild or reset Sonic according to the new configuration
  @Override
  public void flush() {
    if (isActive()) {
      if (pendingSonicRecreation) {
        // 2.1 Rebuild Sonic with the new pitch and speed
        sonic = new Sonic(speed, ...);
      } else if (sonic != null) {
        // 2.2 Reset Sonic
        sonic.flush();
      }
    }
    // ...
  }
  // 3. SonicAudioProcessor only processes data (is active) when at least one of speed, pitch, or volume differs from 1, or the sample rate changes
  public boolean isActive() {
    return pendingOutputAudioFormat.sampleRate != Format.NO_VALUE
        && (Math.abs(speed - 1f) >= 0.01f
            || Math.abs(pitch - 1f) >= 0.01f
            || Math.abs(volume - 1f) >= 0.01f
            || pendingOutputAudioFormat.sampleRate != pendingInputAudioFormat.sampleRate);
  }
  // 4. [Important] Process the audio in the input buffer, for example by changing its speed
  @Override
  public void queueInput(ByteBuffer inputBuffer) {
    // ... detailed processing
  }
  // 5. Return the output buffer, which contains the speed-changed audio data
  @Override
  public ByteBuffer getOutput() {
    ByteBuffer outputBuffer = this.outputBuffer;
    this.outputBuffer = EMPTY_BUFFER;
    return outputBuffer;
  }
  // 6. Mark the end of this input stream
  @Override
  public void queueEndOfStream() {
    if (sonic != null) {
      // The audio input has ended: force Sonic to process the data remaining in its buffer and make it available as output
      // After this method runs, Sonic's input buffer is empty
      sonic.queueEndOfStream();
    }
    inputEnded = true;
  }
  // 7. Returns true once this AudioProcessor has finished processing
  @Override
  public boolean isEnded() {
    return inputEnded && (sonic == null || sonic.getOutputSize() == 0);
  }
}
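The interplay between setSpeed(), isActive(), and flush() is a deferred-configuration pattern: a setter only records the new value and raises a flag, and the expensive rebuild happens on the next flush(). The following is a minimal sketch of that pattern, not ExoPlayer code; `DeferredSpeed` and `engineGeneration` are hypothetical names, and an integer counter stands in for creating a new Sonic instance.

```java
/** Hypothetical holder that mimics the deferred-configuration pattern described above. */
final class DeferredSpeed {
  private static final float CLOSE_THRESHOLD = 0.01f;
  private float speed = 1f;
  private boolean pendingRecreation;
  private int engineGeneration; // counts rebuilds; stands in for "new Sonic(...)"

  /** Records the new speed and marks the engine for recreation; nothing is rebuilt yet. */
  float setSpeed(float newSpeed) {
    if (speed != newSpeed) {
      speed = newSpeed;
      pendingRecreation = true;
    }
    return newSpeed;
  }

  /** Speeds within 0.01 of 1.0 are treated as "no change", so the processor stays inactive. */
  boolean isActive() {
    return Math.abs(speed - 1f) >= CLOSE_THRESHOLD;
  }

  /** Applies the pending configuration, rebuilding the engine only when it is actually needed. */
  void flush() {
    if (isActive() && pendingRecreation) {
      engineGeneration++;
    }
    pendingRecreation = false;
  }

  int engineGeneration() {
    return engineGeneration;
  }

  public static void main(String[] args) {
    DeferredSpeed holder = new DeferredSpeed();
    holder.setSpeed(1.25f);
    holder.setSpeed(1.5f);
    System.out.println("before flush: " + holder.engineGeneration());
    holder.flush();
    System.out.println("after flush: " + holder.engineGeneration());
  }
}
```

Calling the setter several times between flushes therefore costs only one rebuild.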
Across the whole processing pipeline, the core logic is in the queueInput() method, which hands the input PCM data to Sonic for speed and pitch changes:
// SonicAudioProcessor.java
@Override
public void queueInput(ByteBuffer inputBuffer) {
  if (inputBuffer.hasRemaining()) {
    ShortBuffer shortBuffer = inputBuffer.asShortBuffer();
    int inputSize = inputBuffer.remaining();
    inputBytes += inputSize;
    // 1. Feed the unprocessed PCM data into Sonic
    sonic.queueInput(shortBuffer);
    inputBuffer.position(inputBuffer.position() + inputSize);
  }
  // 2. Get the size of the output available in Sonic's buffer.
  // Prepare an empty shortBuffer to receive the PCM output after the speed change
  int outputSize = sonic.getOutputSize();
  if (outputSize > 0) {
    if (buffer.capacity() < outputSize) {
      buffer = ByteBuffer.allocateDirect(outputSize).order(ByteOrder.nativeOrder());
      shortBuffer = buffer.asShortBuffer();
    } else {
      buffer.clear();
      shortBuffer.clear();
    }
    // 3. Write the speed-changed data into shortBuffer. The outputBuffer is later returned by getOutput()
    sonic.getOutput(shortBuffer);
    outputBytes += outputSize;
    buffer.limit(outputSize);
    outputBuffer = buffer;
  }
}
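One detail in step 1 is easy to miss: a ShortBuffer obtained from asShortBuffer() is a view with its own position, so reading samples through it does not consume the underlying ByteBuffer. That is why the code advances inputBuffer's position by hand. The sketch below demonstrates this with the standard java.nio classes only; `BufferViewDemo` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;
import java.util.Arrays;

final class BufferViewDemo {
  /** Builds a native-order byte buffer holding the given 16-bit samples, ready for reading. */
  static ByteBuffer pcmBuffer(short... samples) {
    ByteBuffer buffer = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.nativeOrder());
    for (short s : samples) {
      buffer.putShort(s);
    }
    buffer.flip();
    return buffer;
  }

  /** Reads all remaining samples through a ShortBuffer view, then marks the bytes as consumed. */
  static short[] consumeSamples(ByteBuffer input) {
    ShortBuffer view = input.asShortBuffer(); // shares content, but has its own position and limit
    int inputSize = input.remaining(); // size in bytes
    short[] samples = new short[view.remaining()]; // size in 16-bit samples
    view.get(samples);
    // Reading through the view left input's position untouched, so advance it manually
    input.position(input.position() + inputSize);
    return samples;
  }

  public static void main(String[] args) {
    ByteBuffer input = pcmBuffer((short) 1, (short) -2, (short) 3);
    System.out.println(Arrays.toString(consumeSamples(input)));
    System.out.println("bytes left: " + input.remaining());
  }
}
```

The same distinction explains the size bookkeeping in queueInput(): the ByteBuffer counts bytes while the ShortBuffer view counts 16-bit samples.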
3. Sonic analysis
From the analysis above, Sonic exposes several important methods:
final class Sonic {
  // 1. Adds the remaining data in the input buffer to Sonic's own queue
  public void queueInput(ShortBuffer buffer);
  // 2. Writes the available output into the ShortBuffer, starting at its position
  public void getOutput(ShortBuffer buffer);
  // 3. Returns the size of the available output
  public int getOutputSize();
  // 4. Forces output of all data already queued; this adds no extra output delay, but may cause distortion
  public void queueEndOfStream();
  // 5. Clears the state in preparation for receiving a new input stream
  public void flush();
}
As the key API suggests, Sonic also maintains internal data buffers to keep processing continuous and avoid audio distortion, and it provides queueEndOfStream and flush methods that are invoked by the corresponding methods of SonicAudioProcessor.
This article focuses only on the core speed-change flow, which starts in Sonic's queueInput method:
// Sonic.java
public void queueInput(ShortBuffer buffer) {
  // 1. Calculate the number of frames and bytes corresponding to the data in the buffer
  int framesToWrite = buffer.remaining() / channelCount;
  int bytesToWrite = framesToWrite * channelCount * 2;
  // 2. Ensure that Sonic's own inputBuffer has enough space for the incoming data
  inputBuffer = ensureSpaceForAdditionalFrames(inputBuffer, inputFrameCount, framesToWrite);
  buffer.get(inputBuffer, inputFrameCount * channelCount, bytesToWrite / 2);
  inputFrameCount += framesToWrite;
  // 3. Process the data
  processStreamInput();
}
private void processStreamInput() {
  // ...
  // 4. s is computed in the elided code above; check whether it differs from 1
  if (s > 1.00001 || s < 0.99999) {
    // 4.1 The speed has changed: run the speed-change algorithm
    changeSpeed(s);
  } else {
    // 4.2 No speed change: copy the input to the output buffer unmodified
    copyToOutput(inputBuffer, 0, inputFrameCount);
    inputFrameCount = 0;
  }
  // ...
}
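The frame and byte arithmetic in step 1 can be checked with a small worked example. A frame is one sample per channel, and each sample is a 16-bit short, hence the factor of 2. This is only a sketch: `FrameMath` and its method names are hypothetical, and 16-bit PCM is assumed as in the snippet above.

```java
final class FrameMath {
  static final int BYTES_PER_SAMPLE = 2; // 16-bit PCM

  /** Number of complete frames contained in sampleCount interleaved samples. */
  static int framesIn(int sampleCount, int channelCount) {
    return sampleCount / channelCount;
  }

  /** Number of bytes occupied by the given number of frames. */
  static int bytesFor(int frames, int channelCount) {
    return frames * channelCount * BYTES_PER_SAMPLE;
  }

  public static void main(String[] args) {
    int channelCount = 2; // stereo
    int sampleCount = 4410; // shorts remaining in the ShortBuffer
    int frames = framesIn(sampleCount, channelCount);
    int bytes = bytesFor(frames, channelCount);
    System.out.println(frames + " frames, " + bytes + " bytes, " + (bytes / 2) + " shorts to copy");
  }
}
```

With 4410 interleaved stereo samples this gives 2205 frames and 8820 bytes, and `bytesToWrite / 2` recovers the 4410 shorts that buffer.get() copies into Sonic's own inputBuffer.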
As you can see from the comments, the changeSpeed() method is the core variable speed algorithm:
private void changeSpeed(float speed) {
  // ...
  // 1. Loop until the input data has been processed
  do {
    if (remainingInputToCopyFrameCount > 0) {
      positionFrames += copyInputToOutput(positionFrames);
    } else {
      // 2. Core method: find the pitch period
      int period = findPitchPeriod(inputBuffer, positionFrames);
      // 3. Core methods: synthesize the signal to achieve the speed change
      if (speed > 1.0) {
        positionFrames += period + skipPitchPeriod(inputBuffer, positionFrames, speed, period);
      } else {
        positionFrames += insertPitchPeriod(inputBuffer, positionFrames, speed, period);
      }
    }
  } while (positionFrames + maxRequiredFrameCount <= frameCount);
  // ...
}
// 4. skipPitchPeriod() delegates its internal calculation to overlapAdd(); the details are omitted here
private int skipPitchPeriod(short[] samples, int position, float speed, int period) {
  // ...
  overlapAdd(...);
  return newFrameCount;
}
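To make the two core steps more concrete, here is a heavily simplified mono sketch of the idea, not Sonic's actual code. It assumes the pitch period is estimated with an average magnitude difference function (AMDF) over a given lag range, and it covers only the 2x case, where two consecutive periods are cross-faded into one. `PitchSkipSketch`, `skipOnePeriod`, and the parameters `minPeriod` and `maxPeriod` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

final class PitchSkipSketch {
  /**
   * Returns the lag in [minPeriod, maxPeriod] whose average absolute difference between
   * the signal and its shifted copy is smallest. Needs position + 2 * maxPeriod samples.
   */
  static int findPitchPeriod(short[] samples, int position, int minPeriod, int maxPeriod) {
    int bestPeriod = 0;
    long bestDiff = 0;
    for (int period = minPeriod; period <= maxPeriod; period++) {
      long diff = 0;
      for (int i = 0; i < period; i++) {
        diff += Math.abs(samples[position + i] - samples[position + period + i]);
      }
      // Compare diff / period with bestDiff / bestPeriod without dividing
      if (bestPeriod == 0 || diff * bestPeriod < bestDiff * period) {
        bestDiff = diff;
        bestPeriod = period;
      }
    }
    return bestPeriod;
  }

  /**
   * Merges two consecutive periods into one: the first period fades out linearly while
   * the second fades in. The caller then advances its read position by 2 * period.
   */
  static short[] skipOnePeriod(short[] samples, int position, int period) {
    short[] out = new short[period];
    for (int t = 0; t < period; t++) {
      int fadingOut = samples[position + t];
      int fadingIn = samples[position + period + t];
      out[t] = (short) ((fadingOut * (period - t) + fadingIn * t) / period);
    }
    return out;
  }

  /** Test signal: a sawtooth that repeats exactly every `period` samples. */
  static short[] sawtooth(int length, int period) {
    short[] samples = new short[length];
    for (int i = 0; i < length; i++) {
      samples[i] = (short) ((i % period) * 100);
    }
    return samples;
  }

  public static void main(String[] args) {
    short[] input = sawtooth(400, 50);
    int period = findPitchPeriod(input, 0, 20, 120);
    short[] merged = skipOnePeriod(input, 0, period);
    System.out.println("period = " + period + ", output frames = " + merged.length);
    System.out.println(Arrays.toString(Arrays.copyOf(merged, 5)));
  }
}
```

Because whole pitch periods are removed and the seam is cross-faded, the output is shorter while the waveform within each period, and therefore the perceived pitch, stays the same.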
Integrating SoundTouch
1. Compiling the .so library
Unlike Sonic, which has both Java and native implementations, SoundTouch is implemented only in C++, so it has to be compiled with CMake into .so files for Android.
For this I referred to the SoundTouchDemo code. Because that repository is fairly old, I updated it slightly; interested readers can refer to the SoundTouch-Android repository:
Github.com/qingmei2/so…
2. Define the SoundTouch class
Setting aside the specific signal-processing algorithms, Sonic and SoundTouch are structurally and conceptually alike: both maintain internal buffers, continuously accept input, and return output:
public class SoundTouch {
  static {
    System.loadLibrary("soundtouch");
  }
  // 1. Initialize SoundTouch with the audio track, channel count, sample rate, bytes per sample, tempo, pitch, and other parameters
  private static synchronized native final void setup(int track, int channels, int samplingRate, int bytesPerSample, float tempo, float pitchSemi);
  // 2. Input PCM data; compare Sonic#queueInput
  private static synchronized native final void putBytes(int track, byte[] input, int length);
  // 3. Get the output and its size; compare Sonic#getOutput and Sonic#getOutputSize
  private static synchronized native final int getBytes(int track, byte[] output, int toGet);
  // 4. Set the playback speed
  private static synchronized native final void setTempoChange(int track, float tempoChange);
  // 5. Compare Sonic#flush
  private static synchronized native final void finish(int track, int bufSize);
}
Only the core logic is shown here; other parts, such as pitch change, are omitted. See the full code for details.
3. Define the SoundTouchAudioProcessor class
As with SonicAudioProcessor, the core of SoundTouchAudioProcessor is the queueInput method:
final class SoundTouchAudioProcessor {
  public void queueInput(ByteBuffer inputBuffer) {
    SoundTouch soundTouch = (SoundTouch) Assertions.checkNotNull(this.soundTouch);
    if (inputBuffer.hasRemaining()) {
      int inputSize = inputBuffer.remaining();
      this.inputBytes += (long) inputSize;
      byte[] input = new byte[inputSize];
      // 1. Unlike with Sonic, copy the remaining data of the input buffer into a new byte[]
      for (int i = 0; i < inputSize; ++i) {
        input[i] = inputBuffer.get(inputBuffer.position() + i);
      }
      // 2. Hand the input to SoundTouch
      soundTouch.putBytes(input);
      inputBuffer.position(inputBuffer.position() + inputSize);
    }
    byte[] output = new byte[4096];
    // 3. Get the output. SoundTouch#getBytes also returns the size of the data, which is more convenient than with Sonic
    int outputSize = soundTouch.getBytes(output);
    if (outputSize > 0) {
      if (this.buffer.capacity() < outputSize) {
        this.buffer = ByteBuffer.allocateDirect(outputSize).order(ByteOrder.nativeOrder());
        this.shortBuffer = this.buffer.asShortBuffer();
      } else {
        this.buffer.clear();
        this.shortBuffer.clear();
      }
      // 4. Finally, just as with Sonic, pass the output on through outputBuffer
      this.buffer.put(output, 0, outputSize);
      this.outputBytes += (long) outputSize;
      this.buffer.limit(outputSize);
      this.buffer.position(0);
      this.outputBuffer = this.buffer;
    }
  }
}
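Two details of this method can be sketched in isolation. First, the byte-by-byte copy loop can be written as a single bulk get(), which also advances the buffer position. Second, a single read into a 4096-byte array can never return more than 4096 bytes, so if the engine has more output pending it takes repeated reads to drain it. The sketch below illustrates both with plain Java; `ByteSource` is a hypothetical stand-in for the output side of the SoundTouch wrapper, and whether the real library keeps unread data for later calls is an assumption here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for the output side of the SoundTouch wrapper. */
interface ByteSource {
  /** Copies up to output.length bytes into output and returns the number copied (0 when empty). */
  int getBytes(byte[] output);
}

/** Test double that serves a fixed array in chunks. */
final class ArraySource implements ByteSource {
  private final byte[] data;
  private int position;

  ArraySource(byte[] data) {
    this.data = data;
  }

  @Override
  public int getBytes(byte[] output) {
    int count = Math.min(output.length, data.length - position);
    System.arraycopy(data, position, output, 0, count);
    position += count;
    return count;
  }
}

final class DrainSketch {
  /** Copies the remaining bytes of input into a new array and marks them as consumed. */
  static byte[] takeRemaining(ByteBuffer input) {
    byte[] bytes = new byte[input.remaining()];
    input.get(bytes); // bulk read; advances position to limit
    return bytes;
  }

  /** Reads chunk by chunk until the source reports that no data is left. */
  static byte[] drain(ByteSource source, int chunkSize) {
    ByteArrayOutputStream collected = new ByteArrayOutputStream();
    byte[] chunk = new byte[chunkSize];
    int read;
    while ((read = source.getBytes(chunk)) > 0) {
      collected.write(chunk, 0, read);
    }
    return collected.toByteArray();
  }

  public static void main(String[] args) {
    byte[] drained = drain(new ArraySource(new byte[10000]), 4096);
    System.out.println("drained " + drained.length + " bytes in chunks of 4096");
  }
}
```

Whether draining in a loop is worthwhile in practice depends on how often queueInput() is called and how much output the engine accumulates between calls.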
In the end, we implemented SoundTouchAudioProcessor and can switch dynamically between SonicAudioProcessor and SoundTouchAudioProcessor for variable-speed processing according to business needs.
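The selection rule itself can be as small as a lookup. The sketch below encodes the rule of thumb stated at the start of the article (SoundTouch for regular music, Sonic for purer vocal audio); the `ContentType` and `Engine` enums are hypothetical, and how the content type is determined and how the chosen processor is wired into the player are left out.

```java
final class SpeedEngineSelector {
  /** Hypothetical content categories supplied by the business layer. */
  enum ContentType { MUSIC, SPEECH }

  /** The two variable-speed implementations discussed in this article. */
  enum Engine { SONIC, SOUND_TOUCH }

  /** SoundTouch for regular music, Sonic for purer vocal audio such as storytelling. */
  static Engine choose(ContentType type) {
    return type == ContentType.MUSIC ? Engine.SOUND_TOUCH : Engine.SONIC;
  }

  public static void main(String[] args) {
    for (ContentType type : ContentType.values()) {
      System.out.println(type + " -> " + choose(type));
    }
  }
}
```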
References
Part of this article is excerpted from the following materials, listed here for readers who want to learn more.
- A Review of Time-Scale Modification of Music Signals @Jonathan Driedger @Meinard Müller
- TSM Time domain pressure expansion (variable speed modulation) algorithm summary @DBinary
- Exoplayer 07 AudioProcessor @ underwater nightcrawler
- Audio variable speed modulation principle and Soundtouch code analysis @floer Rivor
- Audio variable speed modulation – Sonic source code analysis @floer Rivor
- google-ExoPlayer @GitHub
- bilibili-ijkplayer @GitHub
- bilibili-soundtouch @GitHub
- waywardgeek-sonic @GitHub