PS: First published on my WeChat public account: Gongxingzhi

To understand the relevant knowledge of audio and video, you can read the same series of articles:

  • Basic knowledge of audio and video development
  • Audio frame, video frame and its synchronization
  • Record mp4 from Camera2 and MediaCodec

MediaCodec is Android's codec component, providing access to the low-level codecs. It is commonly used together with MediaExtractor, MediaSync, MediaMuxer, MediaCrypto, MediaDrm, Image, Surface and AudioTrack. MediaCodec is practically the standard for hardware decoding in Android players, but whether it actually uses a software or a hardware codec depends on how MediaCodec is configured. MediaCodec is introduced below from the following aspects, with the main content as follows:

  1. The type of MediaCodec processing
  2. MediaCodec codec process
  3. MediaCodec life cycle
  4. The creation of MediaCodec
  5. Initialization of MediaCodec
  6. MediaCodec data processing
  7. Adaptive playback support
  8. Exception handling for MediaCodec

The type of MediaCodec processing

MediaCodec supports three data types: compressed data, raw audio data and raw video data. All three can be processed through a ByteBuffer. For raw video data, a Surface can be used to improve codec performance; in Surface mode the raw video bytes cannot be accessed directly, but each raw video frame can still be obtained through ImageReader, and the corresponding YUV data and other information can be read from an Image.

Compressed buffers: Input buffers (for a decoder) and output buffers (for an encoder) contain compressed data of the type given by MediaFormat's KEY_MIME. For video this is usually a single compressed video frame; for audio it is usually a single encoded audio segment, typically a few milliseconds long, depending on the format.

Raw audio buffers: Raw audio buffers contain whole frames of PCM audio data, i.e. one sample for each channel in channel order. Each PCM audio sample is a 16-bit signed integer or a float, in native byte order. To request raw audio buffers with floating-point PCM encoding, set the following:

mediaFormat.setInteger(MediaFormat.KEY_PCM_ENCODING, AudioFormat.ENCODING_PCM_FLOAT);

Whether a MediaFormat uses floating-point PCM can be checked as follows:

 static boolean isPcmFloat(MediaFormat format) {
  return format.getInteger(MediaFormat.KEY_PCM_ENCODING, AudioFormat.ENCODING_PCM_16BIT)
      == AudioFormat.ENCODING_PCM_FLOAT;
 }

To extract one channel from a buffer containing 16-bit signed integer audio data, use the following code:

// Assumes the buffer PCM encoding is 16 bit.
short[] getSamplesForChannel(MediaCodec codec, int bufferId, int channelIx) {
  ByteBuffer outputBuffer = codec.getOutputBuffer(bufferId);
  MediaFormat format = codec.getOutputFormat(bufferId);
  ShortBuffer samples = outputBuffer.order(ByteOrder.nativeOrder()).asShortBuffer();
  int numChannels = format.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
  if (channelIx < 0 || channelIx >= numChannels) {
    return null;
  }
  short[] res = new short[samples.remaining() / numChannels];
  for (int i = 0; i < res.length; ++i) {
    res[i] = samples.get(i * numChannels + channelIx);
  }
  return res;
}
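The interleaving logic above can be exercised without MediaCodec. Below is a minimal, standalone sketch of the same per-channel extraction on a plain ShortBuffer; the class name and the sample values are hypothetical:

```java
import java.nio.ShortBuffer;

public class ChannelDemo {
    // Extract one channel from interleaved PCM samples (same indexing as above)
    static short[] samplesForChannel(ShortBuffer samples, int numChannels, int channelIx) {
        if (channelIx < 0 || channelIx >= numChannels) {
            return null;
        }
        short[] res = new short[samples.remaining() / numChannels];
        for (int i = 0; i < res.length; ++i) {
            res[i] = samples.get(i * numChannels + channelIx);
        }
        return res;
    }

    public static void main(String[] args) {
        // Two interleaved stereo frames: L0 R0 L1 R1
        ShortBuffer interleaved = ShortBuffer.wrap(new short[]{10, 20, 11, 21});
        short[] left = samplesForChannel(interleaved, 2, 0);
        System.out.println(left[0] + "," + left[1]); // prints "10,11"
    }
}
```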

Raw video buffers: In ByteBuffer mode, video buffers are laid out according to their MediaFormat KEY_COLOR_FORMAT value. The color formats supported by a device can be obtained from MediaCodecInfo.CodecCapabilities. A video codec may support three kinds of color formats:

  • Native raw video format: marked by the CodecCapabilities COLOR_FormatSurface constant; it can be used with an input or output Surface.

  • Flexible YUV buffers: such as the color format corresponding to the CodecCapabilities COLOR_FormatYUV420Flexible constant; these can be used with an input/output Surface, and also in ByteBuffer mode via getInputImage/getOutputImage.

  • Other specific formats: usually supported only in ByteBuffer mode. Some color formats are vendor-specific; the others are defined in CodecCapabilities.

Since Android 5.1, all video codecs support flexible YUV 4:2:0 buffers. The MediaFormat#KEY_WIDTH and MediaFormat#KEY_HEIGHT keys specify the size of the video frame; in most cases the video occupies only a portion of the frame.

The crop rectangle of the raw output image is described by the crop-* keys in the output format; if these keys are absent, the video occupies the entire frame. The crop rectangle applies before any MediaFormat#KEY_ROTATION rotation is taken into account. The size of the video frame can be calculated as follows:

MediaFormat format = decoder.getOutputFormat(...);
int width = format.getInteger(MediaFormat.KEY_WIDTH);
if (format.containsKey("crop-left") && format.containsKey("crop-right")) {
    width = format.getInteger("crop-right") + 1 - format.getInteger("crop-left");
}
int height = format.getInteger(MediaFormat.KEY_HEIGHT);
if (format.containsKey("crop-top") && format.containsKey("crop-bottom")) {
    height = format.getInteger("crop-bottom") + 1 - format.getInteger("crop-top");
}
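The `+ 1 -` arithmetic above is because crop-right and crop-bottom are inclusive coordinates. Isolated as a plain helper (a hypothetical class with no Android dependency):

```java
public class CropSize {
    // Inclusive crop bounds: length = max + 1 - min
    static int cropLength(int cropMin, int cropMax) {
        return cropMax + 1 - cropMin;
    }

    public static void main(String[] args) {
        // e.g. a 1080-line video stored in a 1088-line frame: crop-top=0, crop-bottom=1079
        System.out.println(cropLength(0, 1079)); // prints 1080
    }
}
```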

MediaCodec codec process

MediaCodec first hands out an empty input buffer, which the client fills with the data to be encoded or decoded and sends back to MediaCodec for processing. MediaCodec releases the input buffer once the data is processed and produces an output buffer containing the encoded or decoded result, which the client releases after use.

The corresponding APIs for each phase are as follows:

// Get the index of the available input buffer
public int dequeueInputBuffer (long timeoutUs)
// Get the input buffer
public ByteBuffer getInputBuffer(int index)
// Submit the filled input buffer to the codec queue
public final void queueInputBuffer(int index, int offset, int size, long presentationTimeUs, int flags)
// Get the index of an output buffer that has been successfully encoded or decoded
public final int dequeueOutputBuffer(BufferInfo info, long timeoutUs)
// Get the output buffer
public ByteBuffer getOutputBuffer(int index)
// Release the output buffer
public final void releaseOutputBuffer(int index, boolean render) 

MediaCodec life cycle

MediaCodec has three states: Stopped, Executing and Released. The Executing state has three sub-states: Flushed, Running and End-of-Stream; the Stopped state also has three sub-states: Uninitialized, Configured and Error. The MediaCodec life cycle is illustrated by two diagrams:

Life cycle in synchronous mode / Life cycle in asynchronous mode

Transitions between the three states are triggered by start, stop, reset, release, etc. The life cycle may differ slightly depending on how MediaCodec processes data: in asynchronous mode, for example, start enters the Running sub-state directly, and from the Flushed state start must be called again to return to Running. The key APIs for each sub-state transition are as follows:

  • Stopped state
// Create MediaCodec to enter the Uninitialized child state
public static MediaCodec createByCodecName (String name)
public static MediaCodec createEncoderByType (String type)
public static MediaCodec createDecoderByType (String type)
// Configure MediaCodec to enter the Configured sub-state; crypto and descrambler are explained later
public void configure(MediaFormat format, Surface surface, MediaCrypto crypto, int flags)
public void configure(MediaFormat format, @Nullable Surface surface,int flags, MediaDescrambler descrambler)
// Error
// An Error occurred during the codec process
  • Executing state
// Immediately after start, the codec enters the Flushed sub-state
public final void start()
// Enter the Running state when the first input buffer is queued
public int dequeueInputBuffer (long timeoutUs)
// When the input buffer is queued with the end-of-stream flag, the codec converts to end-of-stream substate
// MediaCodec will not accept other input buffers at this point, but will generate output buffers
public void queueInputBuffer (int index, int offset, int size, long presentationTimeUs, int flags)
  • Released state
// Release MediaCodec after codec is finished
public void release()
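Putting the transitions together, here is a minimal synchronous-mode skeleton. This is a sketch only: `format` and `surface` are placeholders, the MIME type is illustrative, and error handling is omitted:

```java
// Uninitialized
MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
// Configured
codec.configure(format, surface, null, 0);
// Executing: Flushed, then Running once the first input buffer is dequeued
codec.start();
// ... dequeue/queue input buffers, dequeue/release output buffers ...
// Back to Uninitialized
codec.stop();
// Released: the codec can no longer be used
codec.release();
```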

The creation of MediaCodec

As mentioned earlier, a newly created MediaCodec is in the Uninitialized sub-state. It is created as follows:

// Create MediaCodec
public static MediaCodec createByCodecName (String name)
public static MediaCodec createEncoderByType (String type)
public static MediaCodec createDecoderByType (String type)

When using createByCodecName, MediaCodecList can be used to enumerate the supported codecs. Here is how to obtain an encoder for a specified MIME type:

/** Query the encoder for the specified MIME type */
fun selectCodec(mimeType: String): MediaCodecInfo? {
    val mediaCodecList = MediaCodecList(MediaCodecList.REGULAR_CODECS)
    val codeInfos = mediaCodecList.codecInfos
    for (codeInfo in codeInfos) {
        if (!codeInfo.isEncoder) continue
        val types = codeInfo.supportedTypes
        for (type in types) {
            if (type.equals(mimeType, true)) {
                return codeInfo
            }
        }
    }
    return null
}

Of course, MediaCodecList also provides the corresponding method to obtain the codec, as follows:

// Get the encoder of the specified format
public String findEncoderForFormat (MediaFormat format)
// Get the decoder for the specified format
public String findDecoderForFormat (MediaFormat format)

The MediaFormat passed to these methods must not contain a frame-rate setting. If the frame rate has been set, clear it before the call and restore it afterwards.
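A sketch of that workaround, assuming the format came from a MediaExtractor; `extractor`, `trackIndex` and `mediaCodecList` are placeholder variables:

```java
// Save and clear the frame rate before the lookup (setting a null string removes the key)
MediaFormat format = extractor.getTrackFormat(trackIndex);
int frameRate = format.containsKey(MediaFormat.KEY_FRAME_RATE)
        ? format.getInteger(MediaFormat.KEY_FRAME_RATE) : -1;
format.setString(MediaFormat.KEY_FRAME_RATE, null);
String decoderName = mediaCodecList.findDecoderForFormat(format);
// Restore the frame rate afterwards
if (frameRate != -1) {
    format.setInteger(MediaFormat.KEY_FRAME_RATE, frameRate);
}
```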

MediaCodecList can list all codecs supported by the current device; when creating a MediaCodec, select a codec that supports the target format. Each codec is wrapped in a MediaCodecInfo object, from which you can query the codec's characteristics, such as whether it is hardware-accelerated or software-only:

// Whether the codec is implemented purely in software
public boolean isSoftwareOnly()
// Whether the codec is provided by the Android platform (false) or the vendor (true)
public boolean isVendor()
// Whether the codec is hardware accelerated
public boolean isHardwareAccelerated()
// Whether this is an encoder or a decoder
public boolean isEncoder()
// Get the MIME types supported by this codec
public String[] getSupportedTypes()
// ...

The difference between software and hardware codecs is something every audio/video developer must understand. Using MediaCodec does not automatically mean hardware decoding; whether a software or hardware codec is used depends on the codec selected. Generally, codecs provided by chip vendors are hardware codecs, such as Qualcomm's (QCOM), while codecs provided by the system, such as those containing the word "android", are software codecs. Here are some of the codecs on my own phone (Mi 10 Pro):

// Hard codec
OMX.qcom.video.encoder.heic
OMX.qcom.video.decoder.avc
OMX.qcom.video.decoder.avc.secure
OMX.qcom.video.decoder.mpeg2
OMX.google.gsm.decoder
OMX.qti.video.decoder.h263sw
c2.qti.avc.decoder
...
// Software codecs
c2.android.aac.decoder
c2.android.aac.decoder
c2.android.aac.encoder
c2.android.aac.encoder
c2.android.amrnb.decoder
c2.android.amrnb.decoder
...
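A list like the one above can be produced by enumerating MediaCodecList; note that isSoftwareOnly and isHardwareAccelerated require API 29+. The log tag is illustrative:

```java
MediaCodecList list = new MediaCodecList(MediaCodecList.REGULAR_CODECS);
for (MediaCodecInfo info : list.getCodecInfos()) {
    // Print each codec name with its software/hardware characteristics
    Log.d("Codecs", info.getName()
            + " encoder=" + info.isEncoder()
            + " softwareOnly=" + info.isSoftwareOnly()          // API 29+
            + " hwAccelerated=" + info.isHardwareAccelerated()); // API 29+
}
```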

MediaCodec initialization

After MediaCodec is created it is in the Uninitialized sub-state; calling configure with a MediaFormat moves it to the Configured sub-state. If data is to be processed asynchronously, MediaCodec.Callback must be set before configure. The key APIs are as follows:

// 1. MediaFormat
// Create MediaFormat
public static final MediaFormat createVideoFormat(String mime, int width, int height)
// Enable or disable a feature; see MediaCodecInfo.CodecCapabilities for details
public void setFeatureEnabled(@NonNull String feature, boolean enabled)
// Set parameters
public final void setInteger(String name, int value)

// 2. setCallback
// To process data asynchronously, set MediaCodec.Callback before configure
public void setCallback (MediaCodec.Callback cb)
public void setCallback (MediaCodec.Callback cb, Handler handler)

// 3. Configure
public void configure(MediaFormat format, Surface surface, MediaCrypto crypto, int flags)
public void configure(MediaFormat format, @Nullable Surface surface, int flags, MediaDescrambler descrambler)

The flags parameter specifies whether the codec is configured as an encoder or a decoder. crypto and descrambler are both related to decryption: for example, some paid videos require a specific key before they can be decoded, so the content is only decrypted after the user logs in and is verified; otherwise, videos that require payment could be downloaded and spread freely. For details, see digital rights management (DRM) in audio and video.

In addition, certain formats, such as AAC audio and MPEG-4, H.264 and H.265 video, contain codec-specific initialization data. When decoding these formats, this data must be submitted to MediaCodec after start and before any frame data, using the BUFFER_FLAG_CODEC_CONFIG flag in a call to queueInputBuffer. It can also be supplied via MediaFormat as a ByteBuffer:

// The same applies to csd-0, csd-1 and csd-2
val bytes = byteArrayOf(0x00.toByte(), 0x01.toByte())
mediaFormat.setByteBuffer("csd-0", ByteBuffer.wrap(bytes))

csd-0 and csd-1 can be obtained from the MediaFormat returned by MediaExtractor#getTrackFormat. Codec-specific data set in the MediaFormat is submitted to MediaCodec automatically on start, so there is no need to submit it directly. If flush is called before any output buffer or output format change has been returned, the submitted codec-specific data is lost and must be resubmitted via queueInputBuffer with the BUFFER_FLAG_CODEC_CONFIG flag.

Android uses the following codec-specific data buffers, which must also be set on the track format to configure MediaMuxer tracks properly. Each parameter set and each codec-specific data section marked with (*) must begin with the start code "\x00\x00\x00\x01".

An encoder also outputs an output buffer flagged with BUFFER_FLAG_CODEC_CONFIG carrying this codec-specific data; it is configuration data, not media data.
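Since each (*) section must begin with that start code, a quick sanity check on a csd buffer can be written as a plain helper. The class name and the sample bytes below are illustrative (a start code followed by an H.264 SPS NAL header byte), not real stream data:

```java
public class StartCode {
    // Check for the Annex B start code \x00\x00\x00\x01
    static boolean hasStartCode(byte[] csd) {
        return csd != null && csd.length >= 4
                && csd[0] == 0 && csd[1] == 0 && csd[2] == 0 && csd[3] == 1;
    }

    public static void main(String[] args) {
        byte[] csd0 = {0x00, 0x00, 0x00, 0x01, 0x67}; // start code + NAL header byte
        System.out.println(hasStartCode(csd0)); // prints true
    }
}
```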

MediaCodec data processing mode

Each created codec maintains a set of input and output buffers. Data can be processed synchronously or asynchronously, and the available APIs depend on the API level: since API 21 (Android 5.0), the ByteBuffer-based APIs are recommended; before that, only ByteBuffer arrays could be used.

Data processing in MediaCodec is the process of obtaining input and output buffers, submitting data to the codec, and releasing the output buffers. Synchronous and asynchronous modes differ in how the input and output buffers are obtained; the key APIs are as follows:

// Get input buffer (sync)
public int dequeueInputBuffer (long timeoutUs)
public ByteBuffer getInputBuffer (int index)
// Get the output buffer (sync)
public int dequeueOutputBuffer (MediaCodec.BufferInfo info, long timeoutUs)
public ByteBuffer getOutputBuffer (int index)
// The index of the input and output buffers is retrieved from the Callback of mediacodec.callback, and the corresponding input and output buffers are retrieved (asynchronously)
public void setCallback (MediaCodec.Callback cb)
public void setCallback (MediaCodec.Callback cb, Handler handler)
// Submit data
public void queueInputBuffer (int index, int offset, int size, long presentationTimeUs, int flags)
public void queueSecureInputBuffer (int index, int offset, MediaCodec.CryptoInfo info, long presentationTimeUs, int flags)
// Release the output buffer
public void releaseOutputBuffer (int index, boolean render)
public void releaseOutputBuffer (int index, long renderTimestampNs)

The following is a brief introduction to the ByteBuffer APIs available since Android 5.0.

ByteBuffer arrays have been deprecated since Android 5.0. The official documentation notes that the ByteBuffer APIs bring certain improvements, so when the device allows it, the ByteBuffer APIs should be used, and asynchronous mode is recommended for data processing. Synchronous and asynchronous processing look like this:

  • Synchronous processing mode
MediaCodec codec = MediaCodec.createByCodecName(name);
codec.configure(format, ...);
MediaFormat outputFormat = codec.getOutputFormat(); // option B
codec.start();
for (;;) {
  int inputBufferId = codec.dequeueInputBuffer(timeoutUs);
  if (inputBufferId >= 0) {
    ByteBuffer inputBuffer = codec.getInputBuffer(inputBufferId);
    // Fill the input buffer with valid data
    ...
    codec.queueInputBuffer(inputBufferId, ...);
  }
  int outputBufferId = codec.dequeueOutputBuffer(...);
  if (outputBufferId >= 0) {
    ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufferId);
    MediaFormat bufferFormat = codec.getOutputFormat(outputBufferId); // option A
    // bufferFormat is identical to outputFormat
    // The output buffer is ready to be processed or rendered
    ...
    codec.releaseOutputBuffer(outputBufferId, ...);
  } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    // The output format has changed; subsequent data follows the new format.
    // Use getOutputFormat() here to get the new format.
    // If getOutputFormat(outputBufferId) is used per buffer, there is no need to listen for this change.
    outputFormat = codec.getOutputFormat(); // option B
  }
}
codec.stop();
codec.release();

For details, see the examples in the previous article: recording mp4 from Camera2 and MediaCodec.

  • Asynchronous processing mode
MediaCodec codec = MediaCodec.createByCodecName(name);
MediaFormat mOutputFormat; // member variable
codec.setCallback(new MediaCodec.Callback() {
  @Override
  void onInputBufferAvailable(MediaCodec mc, int inputBufferId) {
    ByteBuffer inputBuffer = codec.getInputBuffer(inputBufferId);
    // Fill the input buffer with valid data
    ...
    codec.queueInputBuffer(inputBufferId, ...);
  }

  @Override
  void onOutputBufferAvailable(MediaCodec mc, int outputBufferId, ...) {
    ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufferId);
    MediaFormat bufferFormat = codec.getOutputFormat(outputBufferId); // option A
    // bufferFormat is equivalent to mOutputFormat
    // The output buffer is ready to be processed or rendered
    ...
    codec.releaseOutputBuffer(outputBufferId, ...);
  }

  @Override
  void onOutputFormatChanged(MediaCodec mc, MediaFormat format) {
    // Subsequent data will conform to the new format.
    // Can be ignored if getOutputFormat(outputBufferId) is used per buffer.
    mOutputFormat = format; // option B
  }

  @Override
  void onError(...) {
    ...
  }
});
codec.configure(format, ...);
mOutputFormat = codec.getOutputFormat(); // option B
codec.start();
// wait for processing to complete
codec.stop();
codec.release();

The end of the stream must be marked when the input data ends. This can be done by setting the BUFFER_FLAG_END_OF_STREAM flag on the queueInputBuffer call for the last valid input buffer, or by submitting an empty input buffer carrying the end-of-stream flag after the last valid one. No further input buffers may be submitted after that unless the codec is flushed, stopped or restarted. Output buffers continue to be returned until the codec signals the end of the output stream by setting the same end-of-stream flag in the BufferInfo returned by dequeueOutputBuffer or passed to Callback#onOutputBufferAvailable.
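A sketch of the second option, submitting an empty end-of-stream buffer and then draining the output (synchronous mode; `codec` and `timeoutUs` are assumed from the surrounding loop):

```java
int inputBufferId = codec.dequeueInputBuffer(timeoutUs);
if (inputBufferId >= 0) {
    // Empty buffer: offset 0, size 0 -- only the EOS flag matters
    codec.queueInputBuffer(inputBufferId, 0, 0, 0,
            MediaCodec.BUFFER_FLAG_END_OF_STREAM);
}
// Keep draining output until the EOS flag comes back in the BufferInfo
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outputBufferId = codec.dequeueOutputBuffer(info, timeoutUs);
boolean endOfStream = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0;
```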

If an input Surface is used as the codec input, there are no accessible input buffers: data is submitted to the codec directly from the Surface, essentially omitting the input-buffer step. The input Surface is created with createInputSurface. Calling signalEndOfInputStream signals end-of-stream, after which the Surface immediately stops submitting data to the codec. The key APIs are as follows:

// Create an input Surface, after configure and before start
public Surface createInputSurface()
// Set an input Surface
public void setInputSurface(Surface surface)
// Signal the end of the input stream
public void signalEndOfInputStream()

Similarly, when an output Surface is used, it replaces the output-buffer access functions. Use setOutputSurface to set a Surface as the codec output, and choose per buffer whether to render it on the output Surface. The key APIs are as follows:

// Set the output Surface
public void setOutputSurface (Surface surface)
// false: do not render the buffer; true: render the buffer using the default timestamp
public void releaseOutputBuffer (int index, boolean render)
// Render the buffer with the specified timestamp
public void releaseOutputBuffer (int index, long renderTimestampNs)

Adaptive playback support

When MediaCodec is used as a video decoder, you can check whether the decoder supports adaptive playback, i.e. whether the decoder supports seamless resolution changes:

// Check whether a feature is supported; CodecCapabilities#FEATURE_AdaptivePlayback corresponds to adaptive playback
public boolean isFeatureSupported (String name)
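A sketch of the check against a created codec; the MIME type is illustrative:

```java
MediaCodecInfo info = codec.getCodecInfo();
MediaCodecInfo.CodecCapabilities caps = info.getCapabilitiesForType("video/avc");
// True if the decoder can switch resolutions seamlessly
boolean adaptive = caps.isFeatureSupported(
        MediaCodecInfo.CodecCapabilities.FEATURE_AdaptivePlayback);
```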

Adaptive playback is only active when the decoder is configured to decode onto a Surface. When decoding video, only a key frame (I-frame) can be decoded independently after start or flush is called; all other frames depend on preceding frames. The key-frame marker differs from format to format.

Different decoders have different support capabilities for adaptive playback, and the post-processing of SEEK operation is also different. This part of content will be left to the subsequent concrete practice and then sorted out.

Exception handling for MediaCodec

CodecException is usually caused by an internal codec error, such as corrupt media content, hardware failure or resource exhaustion. The following methods indicate how the exception can be handled:

// true: the codec can be recovered by calling stop, configure and start
public boolean isRecoverable()
// true: the problem is transient; the encode or decode operation may succeed if retried later
public boolean isTransient()

If isRecoverable and isTransient both return false, you need to reset MediaCodec, or release it and recreate it, before continuing. The two methods never both return true.
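These branches can be sketched as follows (illustrative only; `format` and `surface` are assumed to be the values used in the original configure call):

```java
try {
    codec.start();
    // ... normal codec work ...
} catch (MediaCodec.CodecException e) {
    if (e.isRecoverable()) {
        // stop/configure/start can recover the codec
        codec.stop();
        codec.configure(format, surface, null, 0);
        codec.start();
    } else if (e.isTransient()) {
        // Temporary resource problem: retry the operation later
    } else {
        // Fatal: reset (or release and recreate) the codec
        codec.reset();
    }
}
```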

Feel free to follow my personal WeChat public account to exchange and learn together.