PCM audio files are not playable in most general-purpose players; the raw data has to be encoded first. In this chapter we learn how to encode audio

1. MediaCodec

Android provides a component called MediaCodec that encodes raw data into a specified format and decodes encoded data back into its original format

It can also take a Surface as input or output for encoding and decoding, but for audio it mostly works directly with buffers of data

1.1 Brief Introduction

Let’s start with an image from Google

In the picture we can see that:

  • There are two sides: the client, i.e. the caller, and the service side, i.e. the codec
  • The caller first requests an empty buffer from the codec; after getting this empty buffer, it fills it with data and queues it back to the codec
  • The codec takes the caller's data and processes it accordingly
  • The caller then obtains an output buffer from the codec, which holds the processed data; after using it, the caller empties the buffer and returns it to the codec

After passing data to MediaCodec, the caller does not need to care how MediaCodec handles it, only whether the resulting data is valid

Here’s how to use MediaCodec

1.2 Usage Mode

There are two ways to use MediaCodec:

  • synchronous
  • asynchronous

The specific process is as follows (from Google's official documentation):

Synchronous

MediaCodec codec = MediaCodec.createByCodecName(name);
codec.configure(format, …);
MediaFormat outputFormat = codec.getOutputFormat(); // option B
codec.start();
for (;;) {
  int inputBufferId = codec.dequeueInputBuffer(timeoutUs);
  if (inputBufferId >= 0) {
    ByteBuffer inputBuffer = codec.getInputBuffer(inputBufferId);
    // fill inputBuffer with valid data
    …
    codec.queueInputBuffer(inputBufferId, …);
  }
  int outputBufferId = codec.dequeueOutputBuffer(…);
  if (outputBufferId >= 0) {
    ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufferId);
    MediaFormat bufferFormat = codec.getOutputFormat(outputBufferId); // option A
    // bufferFormat is identical to outputFormat
    // outputBuffer is ready to be processed or rendered.
    …
    codec.releaseOutputBuffer(outputBufferId, …);
  } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    // Subsequent data will conform to new format.
    // Can ignore if using getOutputFormat(outputBufferId)
    outputFormat = codec.getOutputFormat(); // option B
  }
}
codec.stop();
codec.release();

As you can see, the synchronous mode follows these steps:

  1. Create an instance

  2. Configure the MediaCodec

  3. Start the MediaCodec

  4. Enter an infinite loop

    • Get an input ByteBuffer, then call queueInputBuffer to hand it to the codec
    • Get an output ByteBuffer, take the encoded/decoded data for business processing, then call releaseOutputBuffer to release the ByteBuffer
  5. Stop the codec and free resources

Inside the infinite loop we can use a flag to control when to exit the loop, that is, when to stop encoding or decoding
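A minimal, non-Android sketch of that flag-controlled exit (the codec work is stubbed out with a counter; StoppableLoop and its members are illustrative names, not part of the article's code):

```java
class StoppableLoop extends Thread {
    // volatile so a stop requested from another thread is seen by the loop
    private volatile boolean stopRequested = false;
    private int loops = 0;

    public void requestStop() {
        stopRequested = true;
    }

    public int getLoops() {
        return loops;
    }

    @Override
    public void run() {
        for (;;) {
            if (stopRequested) {
                break; // in the real code: stop the codec and release resources here
            }
            loops++;          // stands in for one queue/dequeue round trip
            if (loops >= 3) { // stands in for an external stop signal
                requestStop();
            }
        }
    }
}
```

In real use the flag would be flipped from another thread (e.g. when the user taps "stop") while the loop runs on the worker thread.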

Asynchronous

MediaCodec codec = MediaCodec.createByCodecName(name);
MediaFormat mOutputFormat; // member variable
codec.setCallback(new MediaCodec.Callback() {
  @Override
  void onInputBufferAvailable(MediaCodec mc, int inputBufferId) {
    ByteBuffer inputBuffer = codec.getInputBuffer(inputBufferId);
    // fill inputBuffer with valid data
    …
    codec.queueInputBuffer(inputBufferId, …);
  }

  @Override
  void onOutputBufferAvailable(MediaCodec mc, int outputBufferId, …) {
    ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufferId);
    MediaFormat bufferFormat = codec.getOutputFormat(outputBufferId); // option A
    // bufferFormat is equivalent to mOutputFormat
    // outputBuffer is ready to be processed or rendered.
    …
    codec.releaseOutputBuffer(outputBufferId, …);
  }

  @Override
  void onOutputFormatChanged(MediaCodec mc, MediaFormat format) {
    // Subsequent data will conform to new format.
    // Can ignore if using getOutputFormat(outputBufferId)
    mOutputFormat = format; // option B
  }

  @Override
  void onError(…) {
    …
  }
});
codec.configure(format, …);
mOutputFormat = codec.getOutputFormat(); // option B
codec.start();
// wait for processing to complete
codec.stop();
codec.release();

As you can see, the asynchronous mode differs from the synchronous one:

  • In asynchronous mode, a callback is set when configuring the MediaCodec
  • The key callback methods are onInputBufferAvailable and onOutputBufferAvailable
  • onInputBufferAvailable holds the logic for feeding data in
  • onOutputBufferAvailable holds the logic for processing the encoded/decoded data
  • Finally, resources are released in the same way

With MediaCodec in mind, let's move on to encoding PCM

2. Audio Encoding

The previous chapter introduced how to collect PCM data with AudioRecord, and this chapter uses it as well. The difference is that after the PCM data is collected, it is fed into an encoder; the end result is an AAC file that can be played on ordinary devices
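To see why the encoding step pays off, a quick back-of-the-envelope calculation (assuming typical but hypothetical values of 44.1 kHz, mono, 16-bit PCM and a 96 kbps AAC target; the helper below is illustrative, not from the article):

```java
class BitrateDemo {
    // Raw PCM costs sampleRate * channels * bitsPerSample bits per second of audio
    static long pcmBitsPerSecond(int sampleRate, int channels, int bitsPerSample) {
        return (long) sampleRate * channels * bitsPerSample;
    }
}
```

At 44100 × 1 × 16 = 705600 bps, raw PCM is over seven times larger than a 96 kbps AAC stream of the same audio.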

Reviewing the earlier AudioRecord PCM steps:

  1. Start a child thread
  2. Build the instance
  3. Start recording
  4. Loop: read data from AudioRecord and write it to a file
  5. Stop recording and release resources

To encode, the data written to the file must be the encoded data, so the procedure becomes:

  1. Start a child thread
  2. Build the instances
  3. Start recording and encoding
  4. Loop: read data from AudioRecord and feed it to the encoder
  5. Drain the encoded data from the encoder and write it to a file
  6. Stop recording and encoding, release resources

Since AudioRecord already reads data in an infinite loop, the synchronous MediaCodec mode is used here

Below, we go through the steps above one by one

2.1 Starting a Child Thread

As before, the encoding work needs to run on a child thread

private static class EncodeThread extends Thread {
    public EncodeThread() {
    }
}

2.2 Setting Parameters

As with AudioRecord, a few parameters need to be configured

private static class EncodeThread extends Thread {
    private static final long TIMEOUT_MS = 2000L;
    private AudioRecord audioRecord;
    private MediaCodec mediaCodec;
    /** File output */
    private FileOutputStream fos;
    private final String path;
    /** Sound source (usually the microphone) */
    private final int audioSource;
    /** Sample rate */
    private final int sampleRateInHz;
    /** Channel settings */
    private final int channelConfig;
    /** Encoding format */
    private final int audioFormat;
    /** Bit rate */
    private final int bitRate;
    /** Maximum input size */
    private final int maxInputSize;
    /** Encoding type */
    private final String mine;
    /** Number of channels */
    private int channelCount;
    /** Audio buffer size */
    private int bufferSizeInByte;
    /** Whether to stop encoding */
    private boolean isStopEncode = false;

    /** Constructor (passing in the necessary arguments) */
    public EncodeThread(String path,
                        int audioSource,
                        int sampleRateInHz,
                        int channelConfig,
                        int audioFormat,
                        int bitRate,
                        int maxInputSize,
                        String mime
    ) {
        this.path = path;
        this.audioSource = audioSource;
        this.sampleRateInHz = sampleRateInHz;
        this.channelConfig = channelConfig;
        this.audioFormat = audioFormat;
        this.bitRate = bitRate;
        this.maxInputSize = maxInputSize;
        this.mine = mime;
    }
}

Note the newly added bitRate, maxInputSize, and mime parameters; mime specifies the encoding type and is used to build the encoder

2.3 Initialization

Override Thread's run method

# run()

@Override
public void run() {
    super.run();
    initIo();
    initAudioRecord();
    initMediaCodec();
    encode();
}

It calls four methods in turn:

  • initIo()

    Initialize the file output stream

  • initAudioRecord()

    Initialize the recording component

  • initMediaCodec()

    Initializing the encoder

  • encode()

    And actually start coding

# initIo()

private void initIo() {
    if (TextUtils.isEmpty(path)) {
        return;
    }
    File file = new File(path);
    if (file.exists()) {
        file.delete();
    }
    try {
        fos = new FileOutputStream(path);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        fos = null;
    }
}

# initAudioRecord()

private void initAudioRecord() {
    bufferSizeInByte = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat);
    audioRecord = new AudioRecord(audioSource, sampleRateInHz, channelConfig, audioFormat, bufferSizeInByte);
}

# initMediaCodec()

private void initMediaCodec() {
    channelCount = 1;
    if (channelConfig == AudioFormat.CHANNEL_IN_MONO) {
        channelCount = 1;
    } else if (channelConfig == AudioFormat.CHANNEL_IN_STEREO) {
        channelCount = 2;
    }
    MediaFormat format = MediaFormat.createAudioFormat(
            mine, sampleRateInHz, channelCount);
    format.setInteger(MediaFormat.KEY_BIT_RATE, bitRate);
    format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
    format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, maxInputSize);
    try {
        mediaCodec = MediaCodec.createEncoderByType(mine);
        mediaCodec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    } catch (IOException e) {
        e.printStackTrace();
        mediaCodec = null;
    }
}

When building the MediaCodec instance, we need to create a MediaFormat. The bit rate must be set on it, otherwise the encoder reports an error. Another thing to note: AudioRecord is initialized with a channel-config constant, but MediaFormat is created with a channel count, so the constant has to be converted to the corresponding number of channels
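That conversion can be sketched and tested off-device. The literal values below mirror android.media.AudioFormat.CHANNEL_IN_MONO (0x10) and CHANNEL_IN_STEREO (0xC); the helper itself is illustrative, not part of the article's code:

```java
class ChannelCountDemo {
    static final int CHANNEL_IN_MONO = 0x10;  // android.media.AudioFormat.CHANNEL_IN_MONO
    static final int CHANNEL_IN_STEREO = 0xC; // android.media.AudioFormat.CHANNEL_IN_STEREO

    // Map an AudioRecord channel-config constant to the channel count MediaFormat expects
    static int toChannelCount(int channelConfig) {
        if (channelConfig == CHANNEL_IN_STEREO) {
            return 2;
        }
        return 1; // mono by default, as in initMediaCodec()
    }
}
```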

2.4 Encoding

Now we get to the actual encoding

# encode()

private void encode() {
    if (audioRecord == null || fos == null || mediaCodec == null) {
        return;
    }
    MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
    audioRecord.startRecording();
    mediaCodec.start();
    for (;;) {
        if (isStopEncode) {
            release();
            break;
        }
        int inputBufferId = mediaCodec.dequeueInputBuffer(TIMEOUT_MS);
        if (inputBufferId >= 0) {
            ByteBuffer inputBuffer = mediaCodec.getInputBuffer(inputBufferId);
            int readSize = -1;
            if (inputBuffer != null) {
                readSize = audioRecord.read(inputBuffer, bufferSizeInByte);
            }
            if (readSize <= 0) {
                mediaCodec.queueInputBuffer(
                        inputBufferId,
                        0, 0, 0,
                        MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                isStopEncode = true;
            } else {
                mediaCodec.queueInputBuffer(
                        inputBufferId,
                        0,
                        readSize,
                        System.nanoTime() / 1000,
                        0);
            }
        }
        int outputBufferId = mediaCodec.dequeueOutputBuffer(info, TIMEOUT_MS);
        if (outputBufferId >= 0) {
            ByteBuffer outputBuffer = mediaCodec.getOutputBuffer(outputBufferId);
            int size = info.size;
            if (outputBuffer != null && size > 0) {
                byte[] data = new byte[size + 7];
                addADTSHeader(data, size + 7);
                outputBuffer.get(data, 7, size);
                outputBuffer.clear();
                try {
                    fos.write(data);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // release the buffer whenever it was dequeued, not only when data was written
            mediaCodec.releaseOutputBuffer(outputBufferId, false);
        }
    }
}

Notice that the logic here is more complex than when AudioRecord simply collected PCM, but the structure of the code is still clear

Previously, the PCM collected by AudioRecord was written directly to the file; here we feed it into the encoder instead, and write the data out only after obtaining it from the output buffer

Note that before writing to the file, the data is processed by addADTSHeader, which adds a 7-byte ADTS header in front of the encoded data

Why do we do that?

Without this header information, the file will not play in a player, and the data produced by the encoder does not carry it, so we need to add it ourselves

# addADTSHeader()

/**
 * Add an AAC (ADTS) frame header
 *
 * @param packet    packet
 * @param packetLen packetLen
 */
private void addADTSHeader(byte[] packet, int packetLen) {
    int profile = 2; // AAC LC
    int freqIdx = 4; // 44.1kHz
    packet[0] = (byte) 0xFF;
    packet[1] = (byte) 0xF9;
    packet[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (channelCount >> 2));
    packet[3] = (byte) (((channelCount & 3) << 6) + (packetLen >> 11));
    packet[4] = (byte) ((packetLen & 0x7FF) >> 3);
    packet[5] = (byte) (((packetLen & 7) << 5) + 0x1F);
    packet[6] = (byte) 0xFC;
}
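As a sanity check, here is an off-device sketch that recomputes the seven header bytes with the same bit arithmetic. The inputs are hypothetical: profile 2 (AAC LC), freqIdx 4 (44.1 kHz), one channel, and a 107-byte packet, i.e. a 100-byte AAC frame plus the 7-byte header:

```java
class AdtsHeaderDemo {
    // Mirrors addADTSHeader() above, parameterized so it runs outside Android
    static byte[] header(int profile, int freqIdx, int channelCount, int packetLen) {
        byte[] p = new byte[7];
        p[0] = (byte) 0xFF; // syncword
        p[1] = (byte) 0xF9; // rest of syncword + MPEG version/layer/protection bits
        p[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (channelCount >> 2));
        p[3] = (byte) (((channelCount & 3) << 6) + (packetLen >> 11));
        p[4] = (byte) ((packetLen & 0x7FF) >> 3);
        p[5] = (byte) (((packetLen & 7) << 5) + 0x1F);
        p[6] = (byte) 0xFC;
        return p;
    }
}
```

For those inputs the header works out to FF F9 50 40 0D 7F FC, with the frame length spread across bytes 3 to 5.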

Note that some fields in the header are fixed while others are variable. The variable ones are what we care about: the profile (encoding format), the sampling-rate index, the number of channels, and the frame length
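The sampling-rate index in particular is not a free choice: it indexes the fixed table defined by the ADTS/AAC specification, which is why freqIdx is 4 for 44.1 kHz above. A hypothetical helper covering the standard entries:

```java
class AdtsFreqIndex {
    // Sampling-frequency index table from the ADTS/AAC specification
    static int freqIndex(int sampleRate) {
        switch (sampleRate) {
            case 96000: return 0;
            case 88200: return 1;
            case 64000: return 2;
            case 48000: return 3;
            case 44100: return 4;
            case 32000: return 5;
            case 24000: return 6;
            case 22050: return 7;
            case 16000: return 8;
            case 12000: return 9;
            case 11025: return 10;
            case 8000:  return 11;
            default:    return 4; // fall back to 44.1 kHz
        }
    }
}
```

If the recording sample rate were made configurable, freqIdx in addADTSHeader should be derived this way instead of hard-coded.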

3. Source Code

PcmEncode.java

PcmActivity.java