MediaCodec hardware decoding

First consider MediaCodec hardware decoding. Google's documentation covers it in detail; it offers an asynchronous mode and a synchronous mode. As for the decoded output: if you are decoding to a file, you can extract the outputBuffer and write it out; for on-screen display, it is recommended to pass a Surface when configuring MediaCodec:

mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
decoder.configure(mediaFormat, surface, null, 0);
decoder.releaseOutputBuffer(outputBufferIndex, true);

This binds the Surface to the codec: decoded buffers are rendered onto the Surface directly at the native layer, with no array copy in the application layer, which is the most efficient path. In this mode the buffer obtained from the outputBuffer will be null, and it is decoder.releaseOutputBuffer(outputBufferIndex, true) that actually triggers rendering.
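The output side of that synchronous loop, with the Surface bound, can be sketched with the NDK's AMediaCodec API (the C equivalent of the Java calls above; input feeding and error handling are omitted, and this is an illustrative sketch rather than the post's exact code):

```c
#include <media/NdkMediaCodec.h>

// Drain whatever output is ready and render it straight to the bound
// Surface. Passing render=true to releaseOutputBuffer is what actually
// displays the frame; the app never touches the pixel data.
void drain_output(AMediaCodec *codec) {
    AMediaCodecBufferInfo info;
    for (;;) {
        ssize_t idx = AMediaCodec_dequeueOutputBuffer(codec, &info, 0);
        if (idx >= 0) {
            AMediaCodec_releaseOutputBuffer(codec, (size_t)idx, true);
        } else if (idx == AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED) {
            AMediaFormat *fmt = AMediaCodec_getOutputFormat(codec);
            AMediaFormat_delete(fmt);  // format change noted, keep draining
        } else {
            break;  // AMEDIACODEC_INFO_TRY_AGAIN_LATER: nothing ready yet
        }
    }
}
```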

However, on some phones certain H.264 streams dropped frames during hardware decoding. Debugging showed that for a number of frames, decoder.dequeueOutputBuffer(bufferInfo, 0) returned -2, which is INFO_OUTPUT_FORMAT_CHANGED, and those frames failed to decode. So the question arose: in mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface), could changing COLOR_FormatSurface to COLOR_QCOM_FormatYUV420SemiPlanar, then rendering the decoded YUV onto the Surface via OpenGL ES 2.0 or EGL, avoid the problem? After some attempts, no matter which KEY_COLOR_FORMAT was used, on those camera/phone combinations the format change still caused frame loss. For a consistent experience across devices, the decision was made to switch to software decoding.

FFmpeg software decoding

With FFmpeg software decoding we do not have to split the stream into frames in business code: as long as the stream is fed to FFmpeg, its av_read_frame function handles framing, and our main job is queue management. Most FFmpeg examples on the web read the stream from a file, where passing the file path straight to FFmpeg is easiest. To use a buffer input such as a camera stream, however, avio_alloc_context and av_probe_input_buffer are needed.
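The buffer-input wiring hinges on a read callback that serves bytes from memory instead of a file. Below is a minimal sketch: MemStream and read_packet are our own names, and the FFmpeg wiring shown in the trailing comment uses the real avio_alloc_context / av_probe_input_buffer API.

```c
#include <stdint.h>
#include <string.h>

// In-memory stream fed by the camera callback.
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
} MemStream;

// The callback handed to avio_alloc_context(): copy up to buf_size
// bytes into FFmpeg's buffer and report how many were written.
int read_packet(void *opaque, uint8_t *buf, int buf_size) {
    MemStream *s = (MemStream *)opaque;
    size_t left = s->size - s->pos;
    if (left == 0)
        return -1;  /* in real code return AVERROR_EOF here */
    int n = buf_size < (int)left ? buf_size : (int)left;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Wiring into FFmpeg (sketch, not compiled here):
 *   unsigned char *avio_buf = av_malloc(4096);
 *   AVIOContext *avio = avio_alloc_context(avio_buf, 4096, 0, &stream,
 *                                          read_packet, NULL, NULL);
 *   fmt_ctx->pb = avio;
 *   av_probe_input_buffer(avio, &input_fmt, "", NULL, 0, 0);
 *   avformat_open_input(&fmt_ctx, "", input_fmt, NULL);
 */
```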

For display, the Surface can be passed down to the FFmpeg layer and converted into an ANativeWindow. The decoded YUV is converted to RGB with sws_scale, then copied into the ANativeWindow row by row for display.
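What sws_scale plus the row copy amounts to can be modeled in plain C: a BT.601 per-pixel YUV-to-RGB conversion (the fixed-point constants here are our own approximation, not FFmpeg's internals) and a stride-aware row copy, needed because the locked ANativeWindow buffer's stride usually exceeds the frame width.

```c
#include <stdint.h>
#include <string.h>

uint8_t clamp_u8(int v) {
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

// BT.601 YUV -> RGB for one pixel: the per-pixel work that sws_scale
// (or libyuv) performs in bulk. 16.16 fixed-point coefficients.
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v,
                uint8_t *r, uint8_t *g, uint8_t *b) {
    int d = u - 128, e = v - 128;
    *r = clamp_u8(y + (91881 * e >> 16));                      // +1.402 V'
    *g = clamp_u8(y - (22554 * d >> 16) - (46802 * e >> 16));  // -0.344 U' -0.714 V'
    *b = clamp_u8(y + (116130 * d >> 16));                     // +1.772 U'
}

// The window buffer's stride can be wider than the frame, so RGB rows
// are copied one at a time at the destination's stride.
void copy_rows(uint8_t *dst, int dst_stride_bytes,
               const uint8_t *src, int src_stride_bytes,
               int row_bytes, int height) {
    for (int i = 0; i < height; i++)
        memcpy(dst + i * dst_stride_bytes,
               src + i * src_stride_bytes, row_bytes);
}
```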

However, an unwelcome soft-decoding performance problem appeared. When the camera bitrate reaches 10 Mbps, decoding becomes the bottleneck: each frame takes nearly 30 ms to decode, and with rendering time added the output frame rate falls below the camera's native 30 fps, causing picture delay and a choppy frame rate. The decoded-buffer queue soon fills up; once full, old data is discarded and mosaic artifacts appear in the picture. Testing on a Samsung S8 with a 30 fps, 10 Mbps camera: decoding only, no rendering, 34-38 fps; decoding plus rendering on the same thread, 22 fps, with CPU usage steady at about 125%.

Clearly, decoding alone can sustain the frame rate, but decoding plus rendering cannot.

FFmpeg software decoding, with asynchronous decoding and rendering

Try separating the decoding thread from the rendering thread to squeeze more out of the CPU. Note that avcodec_send_packet and avcodec_receive_frame must be used in pairs: after a send, receive must be called immediately, and the next packet can be sent only once receive no longer returns 0.
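That pairing rule looks roughly like this in code (a sketch; enqueue_for_render is a placeholder for handing the frame to the render thread):

```c
// Send one packet, then drain every decoded frame before the next send.
// avcodec_receive_frame() returning AVERROR(EAGAIN) means the decoder
// wants more input; only then is it safe to send again.
int decode_packet(AVCodecContext *dec_ctx, const AVPacket *pkt, AVFrame *frame) {
    int ret = avcodec_send_packet(dec_ctx, pkt);
    if (ret < 0)
        return ret;
    for (;;) {
        ret = avcodec_receive_frame(dec_ctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return 0;               // drained: safe to send the next packet
        if (ret < 0)
            return ret;             // real decoding error
        enqueue_for_render(frame);  // placeholder: queue for the render thread
    }
}
```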

One might assume FFmpeg maintains an internal frame queue, so that send and receive could run on two different threads. This is not true: FFmpeg keeps an internal array of decoded frames, and a new packet can be sent only once that array has been emptied.

Therefore, send and receive are executed synchronously on the same thread, and each received AVFrame is pushed onto a queue. Another thread takes frames from the queue, runs sws_scale, and draws the RGB onto the NativeWindow. With decoding and rendering split this way, CPU usage rose to 180% and performance improved to 29 fps. But the CPU soon throttles from heat, decoding time grows, and the frame rate drops to 20. This still does not meet the requirements.

FFmpeg hardware decoding

So the next attempt was FFmpeg hardware decoding. In the spirit of "you never know until you try", I decided to test it, even though FFmpeg's hardware path on Android is itself implemented on top of MediaCodec, and MediaCodec had already shown the format-changed frame loss with some cameras on some phones.

The build requires the following configure options:

--enable-jni --enable-mediacodec --enable-decoder=h264_mediacodec --enable-hwaccel=h264_mediacodec --target-os=android

Note that without --target-os=android, configure fails with a "jni not found" error. ijkplayer's build script, for example, defaults to --target-os=linux in $FF_CFG_FLAGS and must be changed to --target-os=android. In code, the decoder must then be selected by name: AVCodec *pCodec = avcodec_find_decoder_by_name("h264_mediacodec"); instead of pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
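Since avcodec_find_decoder_by_name returns NULL when the named decoder was not compiled in, a software fallback can be sketched as follows (the fallback itself is our addition, not part of the original setup):

```c
// Prefer the MediaCodec-backed decoder; fall back to the software
// H.264 decoder when the hardware one is unavailable.
const AVCodec *codec = avcodec_find_decoder_by_name("h264_mediacodec");
if (!codec)
    codec = avcodec_find_decoder(AV_CODEC_ID_H264);
```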

Hardware decoding is indeed fast: per-frame decode time drops to about 1 ms. But the picture stutters and CPU usage is still very high. The logs show that the bottleneck has moved to sws_scale: converting the hardware-decoded YUV is very slow, about 68 ms per frame, and the render queue soon fills up. One option is to replace sws_scale with libyuv, which is NEON-accelerated, or to render the YUV to the Surface with OpenGL ES hardware acceleration. Oddly, on software-decoded YUV sws_scale is efficient, needing only about 12 ms per frame; I don't know why.

FFmpeg multithreaded software decoding

Next, try multithreaded software decoding, continuing to squeeze the CPU dry to remove the decoding bottleneck. The key settings are:

pCodecCtx->thread_count = 8;
pCodecCtx->thread_type = FF_THREAD_SLICE;

These two lines had been commented out, which silently disabled multithreading; with them enabled, multithreaded decoding takes effect and performance soars. The frame rate now depends directly on how fast data is fed in:

Feeding data faster consumes more CPU, and speeding up the feed pushes the frame rate straight up.

Conclusion

Weighing cross-device consistency against performance, the final solution is FFmpeg multithreaded software decoding, with sws_scale converting YUV to RGB for display on the ANativeWindow.

Remaining optimization: replace sws_scale with libyuv to further reduce power consumption.