Overview
This article takes FFmpeg video decoding as its theme. It first introduces the main process and basic principles of FFmpeg video decoding; it then describes simple applications built on top of basic FFmpeg video decoding, including how to play video in a given time sequence and how to add seek logic during playback; in addition, it highlights details that are easily overlooked when decoding video; finally, it briefly describes how to encapsulate a basic video decoding capability into a VideoDecoder.
Preface
FFmpeg
FFmpeg is a set of open-source programs that can be used to record and convert digital audio and video, and to turn them into streams. It provides libraries for processing and manipulating multimedia data, including libavcodec and libavformat, the leading audio/video codec and container-format libraries.
Six common FFmpeg function modules
- libavformat: a library for muxing and demuxing multimedia container formats such as MP4 and FLV, and network protocols such as RTMP and RTSP;
- libavcodec: the core audio/video encoding and decoding library;
- libavfilter: filter library for audio, video, and subtitles;
- libswscale: image format conversion and scaling library;
- libswresample: audio resampling library;
- libavutil: utility library.
Basic introduction to video decoding
- Demultiplexing (Demux): demultiplexing is also called de-encapsulation. There is a concept called the encapsulation (container) format, which refers to a format that combines audio and video; common ones are MP4, FLV, MKV, and so on. Generally speaking, encapsulation is the product of combining the audio stream, video stream, subtitle stream, and other attachments into one package according to certain rules. De-encapsulation plays the opposite role, disassembling a streaming-media file into audio data and video data. The data split out at this point is still compressed and encoded; a common compressed video data format is H264.
- Decode: in simple terms, decompressing the compressed, encoded data into raw video pixel data. The commonly used format for raw video pixel data is YUV.
- Color Space Convert: images are usually displayed through the RGB model, but using the YUV model saves bandwidth when transmitting image data. Therefore, when displaying an image, the YUV pixel data has to be converted into the RGB pixel format before rendering (a small illustrative conversion sketch follows this list).
- Render: send the data of each video frame that has been decoded and color-space converted to the graphics card to be drawn on the screen.
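To make the Color Space Convert step above concrete, here is a minimal sketch of converting a single full-range YUV pixel to RGB with the common BT.601 coefficients. It is only an illustration of what the conversion does; in the rest of this article the conversion is performed by libswscale's sws_scale.

#include <cstdint>
#include <algorithm>

// Illustrative only: convert one full-range YUV pixel to RGB using BT.601 coefficients.
// In the real decoding path, sws_scale does this work (plus chroma up-sampling) for whole frames.
static void YuvToRgb(uint8_t y, uint8_t u, uint8_t v,
                     uint8_t& r, uint8_t& g, uint8_t& b) {
    auto clamp = [](float x) { return (uint8_t) std::min(255.0f, std::max(0.0f, x)); };
    r = clamp(y + 1.402f * (v - 128));
    g = clamp(y - 0.344f * (u - 128) - 0.714f * (v - 128));
    b = clamp(y + 1.772f * (u - 128));
}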
First, preparation before introducing FFmpeg
1.1 Compiling the FFmpeg .so libraries
- Download and unpack the source library from FFmpeg’s official website.
- Download the NDK library and decompress it.
- In the decompressed FFmpeg source directory, edit the configure file and change the highlighted parameters to the following. The main purpose is to generate .so files whose names use the name-version form that Android can use.
# ……
# build settings
SHFLAGS='-shared -Wl,-soname,$$(@F)'
LIBPREF="lib"
LIBSUF=".a"
FULLNAME='$(NAME)$(BUILDSUF)'
LIBNAME='$(LIBPREF)$(FULLNAME)$(LIBSUF)'
SLIBPREF="lib"
SLIBSUF=".so"
SLIBNAME='$(SLIBPREF)$(FULLNAME)$(SLIBSUF)'
SLIBNAME_WITH_VERSION='$(SLIBNAME).$(LIBVERSION)'

# Changed configuration
SLIBNAME_WITH_MAJOR='$(SLIBNAME)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)'
LIB_INSTALL_EXTRA_CMD='$$(RANLIB)"$(LIBDIR)/$(LIBNAME)"'
SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
SLIB_INSTALL_LINKS='$(SLIBNAME)'
# ……
- Create a new script file `build_android_arm_v8a.sh` in the FFmpeg source directory, configure the NDK path in the file, and fill in the rest of the content as follows;
# Clear the previous build
make clean
# Configure your NDK path here first
export NDK=/Users/bytedance/Library/Android/sdk/ndk/21.4.7075529
TOOLCHAIN=$NDK/toolchains/llvm/prebuilt/darwin-x86_64

function build_android
{
./configure \
--prefix=$PREFIX \
--disable-postproc \
--disable-debug \
--disable-doc \
--enable-ffmpeg \
--disable-doc \
--disable-symver \
--disable-static \
--enable-shared \
--cross-prefix=$CROSS_PREFIX \
--target-os=android \
--arch=$ARCH \
--cpu=$CPU \
--cc=$CC \
--cxx=$CXX \
--enable-cross-compile \
--sysroot=$SYSROOT \
--extra-cflags="-Os -fpic $OPTIMIZE_CFLAGS" \
--extra-ldflags="$ADDI_LDFLAGS"

make clean
make -j16
make install

echo "============================ build android arm64-v8a success =========================="
}

# arm64-v8a
ARCH=arm64
CPU=armv8-a
API=21
CC=$TOOLCHAIN/bin/aarch64-linux-android$API-clang
CXX=$TOOLCHAIN/bin/aarch64-linux-android$API-clang++
SYSROOT=$NDK/toolchains/llvm/prebuilt/darwin-x86_64/sysroot
CROSS_PREFIX=$TOOLCHAIN/bin/aarch64-linux-android-
PREFIX=$(pwd)/android/$CPU
OPTIMIZE_CFLAGS="-march=$CPU"

echo $CC

build_android
- Set permissions for all files in the NDK folder: `chmod 777 -R NDK`;
- Execute the script `./build_android_arm_v8a.sh` in the terminal to start compiling FFmpeg. The compiled files will be in the `FFmpeg/android` directory, where several .so files appear;
- To compile for arm-v7a, simply copy the script above, rename it `build_android_arm_v7a.sh`, and modify its content as follows.
#armv7-a
ARCH=arm
CPU=armv7-a
API=21
CC=$TOOLCHAIN/bin/armv7a-linux-androideabi$API-clang
CXX=$TOOLCHAIN/bin/armv7a-linux-androideabi$API-clang++
SYSROOT=$NDK/toolchains/llvm/prebuilt/darwin-x86_64/sysroot
CROSS_PREFIX=$TOOLCHAIN/bin/arm-linux-androideabi-
PREFIX=$(pwd)/android/$CPU
OPTIMIZE_CFLAGS="-mfloat-abi=softfp -mfpu=vfp -marm -march=$CPU "
1.2 Introducing FFmpeg's .so libraries in Android
- Prepare the NDK environment, the CMake build tool, and LLDB (the C/C++ debugging tool);
- When creating a C++ module, the following important files are generated: `CMakeLists.txt`, `native-lib.cpp`, and `MainActivity`;
- Under the `app/src/main/` directory, create a directory named `jniLibs`; this is the default directory where Android Studio looks for .so dynamic libraries. Then create an `arm64-v8a` directory inside `jniLibs` and copy the compiled .so files into it. Next, copy the .h headers generated at compile time (the interfaces exposed by FFmpeg) into the `include` directory under `cpp`. The .so library directory and .h header directory above will be explicitly declared and linked in `CMakeLists.txt`;
- At the top of `MainActivity`, load the library compiled from the C/C++ code: `native-lib`. Because `native-lib` is added in `CMakeLists.txt` to the library named "ffmpeg", "ffmpeg" is what is passed to `System.loadLibrary()`;
class MainActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        // Example of a call to a native method
        sample_text.text = stringFromJNI()
    }

    // Declare an external method corresponding to the C/C++ layer implementation
    external fun stringFromJNI(): String

    companion object {
        // Load the library compiled from the C/C++ code in init{}
        init {
            System.loadLibrary("ffmpeg")
        }
    }
}
`native-lib.cpp` is the C++ interface file in which the external methods declared in the Java layer are implemented;
#include <jni.h>
#include <string>
extern "C" JNIEXPORT jstring JNICALL
Java_com_bytedance_example_MainActivity_stringFromJNI(
JNIEnv *env,
jobject /* this */) {
std::string hello = "Hello from C++";
return env->NewStringUTF(hello.c_str());
}
`CMakeLists.txt` is a build script that configures the build information for compiling the `native-lib` .so library;
# For more information about using CMake with Android Studio, read the
# documentation: https://d.android.com/studio/projects/add-native-code.html

# Sets the minimum version of CMake required to build the native library.
cmake_minimum_required(VERSION 3.10.2)

# Declares the project.
project("ffmpeg")

# Creates and names a library, sets it as either STATIC
# or SHARED, and provides the relative paths to its source code.
# You can define multiple libraries, and CMake builds them for you.
# Gradle automatically packages shared libraries with your APK.

set(ffmpeg_lib_dir ${CMAKE_SOURCE_DIR}/../${ANDROID_ABI})
set(ffmpeg_head_dir ${CMAKE_SOURCE_DIR}/ffmpeg)

add_library( # Sets the name of the library.
        ffmmpeg

        # Sets the library as a shared library.
        SHARED

        # Provides a relative path to your source file(s).
        native-lib.cpp)

# Searches for a specified prebuilt library and stores the path as a
# variable. Because CMake includes system libraries in the search path by
# default, you only need to specify the name of the public NDK library
# you want to add. CMake verifies that the library exists before
# completing its build.
add_library( avutil
        SHARED
        IMPORTED)
set_target_properties( avutil
        PROPERTIES
        IMPORTED_LOCATION
        ${ffmpeg_lib_dir}/libavutil.so)

add_library( swresample
        SHARED
        IMPORTED)
set_target_properties( swresample
        PROPERTIES
        IMPORTED_LOCATION
        ${ffmpeg_lib_dir}/libswresample.so)

add_library( avcodec
        SHARED
        IMPORTED)
set_target_properties( avcodec
        PROPERTIES
        IMPORTED_LOCATION
        ${ffmpeg_lib_dir}/libavcodec.so)

find_library( # Sets the name of the path variable.
        log-lib

        # Specifies the name of the NDK library that
        # you want CMake to locate.
        log)

# Specifies libraries CMake should link to your target library. You
# can link multiple libraries, such as libraries you define in this
# build script, prebuilt third-party libraries, or system libraries.
target_link_libraries( # Specifies the target library.
        audioffmmpeg

        # Links the FFmpeg .so libraries added above to the target library native-lib
        avutil
        swresample
        avcodec

        -landroid

        # Links the target library to the log library
        # included in the NDK.
        ${log-lib})
- This brings FFmpeg into the Android project.
Second, the principles and details of FFmpeg video decoding
2.1 Main Process
2.2 Basic Principles
2.2.1 Common FFmpeg interfaces
// 1. Allocate an AVFormatContext
avformat_alloc_context();
// 2. Open the input file
avformat_open_input(AVFormatContext **ps, const char *url, const AVInputFormat *fmt, AVDictionary **options);
// 3. Read the stream information of the media file
avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);
// 4. Allocate a codec context
avcodec_alloc_context3(const AVCodec *codec);
// 5. Fill the codec context based on the codec parameters associated with the stream
avcodec_parameters_to_context(AVCodecContext *codec, const AVCodecParameters *par);
// 6. Find the registered decoder
avcodec_find_decoder(enum AVCodecID id);
// 7. Open the codec
avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
// 8. Read one frame of compressed data
av_read_frame(AVFormatContext *s, AVPacket *pkt);
// 9. Send the packet with compressed data to the decoder
avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
// 10. Receive the decoded frame data from the decoder
avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);
2.2.2 Overall idea of video decoding
- First, `libavformat` has to be registered, along with all codecs, muxers/demuxers, protocols, and so on. It is the first function called in any FFmpeg-based application; only after calling it can FFmpeg's features be used properly. In the latest versions of FFmpeg, this line of code can be omitted;
av_register_all();
- Open the video file and extract the data flow information in the file;
auto av_format_context = avformat_alloc_context();
avformat_open_input(&av_format_context, path_.c_str(), nullptr, nullptr);
avformat_find_stream_info(av_format_context, nullptr);
- Next, obtain the index of the video media stream in order to find the video media stream in the file;
int video_stream_index = -1;
for (int i = 0; i < av_format_context->nb_streams; i++) {
    // Match the index of the video media stream
    if (av_format_context->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
        video_stream_index = i;
        LOGD(TAG, "find video stream index = %d", video_stream_index);
        break;
    }
}
- Get the video media stream, find the registered decoder, obtain the decoder context, fill the decoder context with the stream's codec parameters, and open the decoder;
// Get the video media stream
auto stream = av_format_context->streams[video_stream_index];
// Find the registered decoder
auto codec = avcodec_find_decoder(stream->codecpar->codec_id);
// Get the decoder context
AVCodecContext* codec_ctx = avcodec_alloc_context3(codec);
// Fill the decoder context with the codec parameters of the stream
auto ret = avcodec_parameters_to_context(codec_ctx, stream->codecpar);
if (ret >= 0) {
    // Open the decoder
    avcodec_open2(codec_ctx, codec, nullptr);
    // ……
}
- Calculate the memory size required by the buffer from the specified pixel format, image width, and image height, and allocate the buffer. Since the picture is drawn on the upper layer, `ANativeWindow` is needed, and `ANativeWindow_setBuffersGeometry` is used to set the properties of this drawing window;
video_width_ = codec_ctx->width;
video_height_ = codec_ctx->height;

int buffer_size = av_image_get_buffer_size(AV_PIX_FMT_RGBA,
                                           video_width_, video_height_, 1);
// Allocate memory for the output buffer
out_buffer_ = (uint8_t*) av_malloc(buffer_size * sizeof(uint8_t));

// Setting the width and height limits the number of pixels in the buffer,
// not the physical display size of the screen.
// If the buffer does not match the display size, the picture may be stretched or compressed.
int result = ANativeWindow_setBuffersGeometry(native_window_, video_width_, video_height_,
                                              WINDOW_FORMAT_RGBA_8888);
- Allocate an `AVFrame` in the RGBA pixel format to store the frame data after conversion to RGBA, and set up the `rgba_frame` buffer so that it is associated with `out_buffer_`;
auto rgba_frame = av_frame_alloc();
av_image_fill_arrays(rgba_frame->data, rgba_frame->linesize,
out_buffer_,
AV_PIX_FMT_RGBA,
video_width_, video_height_, 1);
- Obtain a `SwsContext`, which is used when calling `sws_scale()` for image format conversion and image scaling. When converting YUV420P to RGBA, `sws_scale` may fail to convert the format and be unable to return the correct height value; the reason is related to the `flags` passed when calling `sws_getContext`, which need to be switched from `SWS_BICUBIC` to `SWS_FULL_CHR_H_INT | SWS_ACCURATE_RND`;
struct SwsContext* data_convert_context = sws_getContext(
video_width_, video_height_, codec_ctx->pix_fmt,
video_width_, video_height_, AV_PIX_FMT_RGBA,
SWS_BICUBIC, nullptr, nullptr, nullptr);
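If the conversion from YUV420P to RGBA fails or returns a wrong height with SWS_BICUBIC, the fix described above only changes the flags; a sketch of the alternative call, with all other parameters unchanged:

struct SwsContext* data_convert_context = sws_getContext(
        video_width_, video_height_, codec_ctx->pix_fmt,
        video_width_, video_height_, AV_PIX_FMT_RGBA,
        SWS_FULL_CHR_H_INT | SWS_ACCURATE_RND, nullptr, nullptr, nullptr);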
- Allocate an `AVFrame` to hold the raw (decoded) frame data, and allocate an `AVPacket` to hold the video data before decoding;
auto frame = av_frame_alloc();
auto packet = av_packet_alloc();
- Read a frame of compressed data from the video stream, and then start decoding;
ret = av_read_frame(av_format_context, packet);
if (packet->size) {
Decode(codec_ctx, packet, frame, stream, lock, data_convert_context, rgba_frame);
}
- In the `Decode()` function, the `packet` containing the compressed data is passed to the decoder as input;
/* send the packet with the compressed data to the decoder */
ret = avcodec_send_packet(codec_ctx, pkt);
- The decoder returns the decoded frame data into the specified `frame`. The `pts` of the decoded `frame` can then be converted into a timestamp, and the frames are drawn onto the playback picture one by one in display order along the timeline;
while (ret >= 0 && !is_stop_) {
    // Return the decoded data into frame
    ret = avcodec_receive_frame(codec_ctx, frame);
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
        return;
    } else if (ret < 0) {
        return;
    }
    // Convert the pts of the currently decoded frame into a millisecond timestamp
    auto decode_time_ms = frame->pts * 1000 / stream->time_base.den;
    if (decode_time_ms >= time_ms_) {
        last_decode_time_ms_ = decode_time_ms;
        is_seeking_ = false;
        // ……
        // Convert the picture data format
        // ……
        // Draw the converted data onto the screen
    }
    av_packet_unref(pkt);
}
- Before drawing the picture, the image data format needs to be converted, using the `SwsContext` obtained earlier;
// Convert the image data format
int result = sws_scale(
        sws_context,
        (const uint8_t* const*) frame->data, frame->linesize,
        0, video_height_,
        rgba_frame->data, rgba_frame->linesize);

if (result <= 0) {
    LOGE(TAG, "Player Error : data convert fail");
    return;
}
- I’m using it because I’m drawing it on top
ANativeWindow
和ANativeWindow_Buffer
. Before drawing the screen, you need to use the next drawing of the lock windowsurface
To draw, then write the frame data to be displayed to a buffer, and finally unlock the window’s drawingsurface
To publish the buffer data to the screen display;
// Lock the window's next drawing surface before drawing
result = ANativeWindow_lock(native_window_, &window_buffer_, nullptr);
if (result < 0) {
    LOGE(TAG, "Player Error : Can not lock native window");
} else {
    // Draw the image onto the window
    // Note: copy line by line, using the window buffer's stride and the frame's linesize
    auto bits = (uint8_t*) window_buffer_.bits;
    for (int h = 0; h < video_height_; h++) {
        memcpy(bits + h * window_buffer_.stride * 4,
               out_buffer_ + h * rgba_frame->linesize[0],
               rgba_frame->linesize[0]);
    }
    ANativeWindow_unlockAndPost(native_window_);
}
- So that’s the main decoding process. In addition, since C++ needs to release resources and memory space by itself, it needs to call the freed interface to release resources after decoding to avoid memory leaks.
sws_freeContext(data_convert_context);
av_free(out_buffer_);
av_frame_free(&rgba_frame);
av_frame_free(&frame);
av_packet_free(&packet);
avcodec_close(codec_ctx);
avcodec_free_context(&codec_ctx);
avformat_close_input(&av_format_context);
avformat_free_context(av_format_context);
ANativeWindow_release(native_window_);
2.3 Simple Applications
To better understand the video decoding process, a video decoder, VideoDecoder, is encapsulated here. Initially, the decoder provides the following functions:
VideoDecoder(const char* path, std::function<void(long timestamp)> on_decode_frame);
void Prepare(ANativeWindow* window);
bool DecodeFrame(long time_ms);
void Release();
In this video decoder, passing in a specified timestamp returns the decoded frame data. The most important function is `DecodeFrame(long time_ms)`: the caller passes in the timestamp of the desired frame, and the decoder decodes the corresponding frame data. In addition, a synchronization lock can be added to separate the decoding thread from the calling thread, as the sketch below illustrates.
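For reference, here is a minimal sketch of how an upper layer might drive this decoder from two threads. It assumes the VideoDecoder from the appendix; the function name PlaySketch, the 40 ms frame interval, and the 10-second duration are illustrative assumptions, not part of the original code.

#include <chrono>
#include <thread>
#include "VideoDecoder.h"

void PlaySketch(ANativeWindow* window, const char* path) {
    auto* decoder = new codec::VideoDecoder(path, [](long timestamp) {
        // Called back after each frame has been decoded and drawn
    });

    // Prepare() contains the decode-and-draw loop, so it runs on its own thread
    std::thread([decoder, window]() { decoder->Prepare(window); }).detach();

    // The calling thread feeds timestamps; DecodeFrame() wakes the decode loop
    for (long t = 0; t < 10 * 1000; t += 40) {   // e.g. 25 fps -> one frame every 40 ms
        decoder->DecodeFrame(t);
        std::this_thread::sleep_for(std::chrono::milliseconds(40));
    }
    decoder->Release();   // stops the loop; Prepare() then releases the remaining resources
}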
2.3.1 Adding a synchronization lock for video playback
If we only need to decode the video, there is no need for a synchronous wait.
However, if the video is to be played, a lock must be used to wait synchronously after each frame is decoded and drawn. This is because decoding and drawing are separated during playback, and decoding and drawing must proceed in the order and at the pace of the playback timeline.
condition_.wait(lock);
The synchronization lock is woken up when the upper layer calls the DecodeFrame function with the timestamp to decode, allowing the decode-and-draw loop to continue.
bool VideoDecoder::DecodeFrame(long time_ms) {
    // ……
    time_ms_ = time_ms;
    condition_.notify_all();
    return true;
}
2.3.2 Adding seek_frame during playback
During normal playback, the video is decoded and played frame by frame. However, when the progress bar is dragged to a specified seek point, decoding frame by frame from the beginning all the way to the seek point is not very efficient. In that case, the timestamp of the seek point has to be checked against certain rules, and when the conditions are met, we seek directly to the specified timestamp.
av_seek_frame in FFmpeg
`av_seek_frame` can locate either keyframes or non-keyframes, depending on the chosen `flag` value. Because video decoding depends on keyframes, we generally need to locate a keyframe;
int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp,
int flags);
The `flag` in `av_seek_frame` specifies the positional relationship between the sought I-frame and the passed-in timestamp. When seeking to a given timestamp, that timestamp may not fall exactly on an I-frame, but since decoding depends on I-frames, an I-frame near the timestamp has to be found first; the `flag` indicates whether to seek to the I-frame before or after the current timestamp. `flag` has four options:
Flag option | Description |
---|---|
AVSEEK_FLAG_BACKWARD | Seek to the nearest keyframe before the requested timestamp. Seek is usually given in ms; if the specified ms timestamp does not fall on a keyframe (very likely), the seek automatically backtracks to the nearest keyframe. Although this flag is not very precise, it handles the blocky-artifact (mosaic) problem well, because seeking BACKWARD locates a keyframe. |
AVSEEK_FLAG_BYTE | Seek to a position in the file in bytes. Exactly the same as AVSEEK_FLAG_FRAME, except that the search algorithm is different. |
AVSEEK_FLAG_ANY | Seek to any frame, not necessarily a keyframe, so blocky artifacts (mosaic) may appear, but the progress follows the drag gesture exactly. |
AVSEEK_FLAG_FRAME | Seek to the frame number corresponding to the timestamp; it can be understood as finding the nearest keyframe in the forward direction, opposite to BACKWARD. |
`flag` may contain more than one of the above values at the same time, for example `AVSEEK_FLAG_BACKWARD | AVSEEK_FLAG_BYTE`. `FRAME` and `BACKWARD` calculate the seek target position from the interval between frames and are suitable for fast forward and rewind; `BYTE` is suitable for large jumps.
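Note that the code in this article converts milliseconds into stream units with `time_ms_ * stream->time_base.den / 1000`, which implicitly assumes `time_base.num == 1`. A more general sketch (a hypothetical helper, not from the original code) uses FFmpeg's av_rescale_q:

extern "C" {
#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>   // av_rescale_q
}

// Convert a millisecond timestamp into the stream's time_base units before seeking.
static int64_t MsToStreamTs(long time_ms, const AVStream* stream) {
    AVRational ms_base = {1, 1000};   // timestamps expressed in milliseconds
    return av_rescale_q(time_ms, ms_base, stream->time_base);
}

// Usage when seeking backward to the nearest keyframe:
// av_seek_frame(av_format_context, video_stream_index,
//               MsToStreamTs(time_ms_, stream), AVSEEK_FLAG_BACKWARD);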
Seek scenarios
- If the timestamp passed in for decoding moves forward and is a certain distance beyond the timestamp of the previous frame, a seek is needed. The "certain distance" here is estimated through repeated experiments; it is not always the 1000 ms used in the code below.
- If the timestamp moves backward, i.e. it is smaller than the last decoded timestamp, but the distance from the last decoded timestamp is large (for example, more than 50 ms), seek to the previous keyframe;
- A bool variable `is_seeking_` is used to prevent other operations from interfering with the seek; its purpose is to ensure that only one seek operation is in progress at a time.
if (!is_seeking_ && (time_ms_ > last_decode_time_ms_ + 1000 ||
                     time_ms_ < last_decode_time_ms_ - 50)) {
    is_seeking_ = true;
    // av_seek_frame expects the timestamp of the target frame in time_base units,
    // so time_ms_ is converted back using the inverse of the calculation used when decoding
    LOGD(TAG, "seek frame time_ms_ = %ld, last_decode_time_ms_ = %ld", time_ms_, last_decode_time_ms_);
    av_seek_frame(av_format_context, video_stream_index,
                  time_ms_ * stream->time_base.den / 1000, AVSEEK_FLAG_BACKWARD);
}
Insert seek’s logic
Insert the logic of seek before the av_read_frame function (which returns a frame of video media). If the condition of SEEK is met, use av_seek_frame to reach the specified I frame. Av_read_frame is then decoded to the destination timestamp.
int ret = av_read_frame(av_format_context, packet);
2.4 Details in the decoding process
2.4.1 Conditions for seek in DecodeFrame
When using the av_seek_frame function, you need to specify the correct flag and agree on the conditions that trigger a seek operation; otherwise the video may show blocky artifacts (mosaic).
if (!is_seeking_ && (time_ms_ > last_decode_time_ms_ + 1000 ||
                     time_ms_ < last_decode_time_ms_ - 50)) {
    is_seeking_ = true;
    av_seek_frame(..., ..., ..., AVSEEK_FLAG_BACKWARD);
}
2.4.2 Reducing the number of decode operations
During video decoding, under certain conditions it is not necessary to decode the frame data for the passed-in timestamp. For example:
- If the timestamp passed in is the same as the last decoded timestamp or the timestamp currently being decoded, no decoding is required;
- If the timestamp passed in is not larger than the last decoded timestamp and the distance between them is small (for example, within 50 ms), no decoding is required.
bool VideoDecoder::DecodeFrame(long time_ms) {
LOGD(TAG, "DecodeFrame time_ms = %ld", time_ms);
if (last_decode_time_ms_ == time_ms || time_ms_ == time_ms) {
LOGD(TAG, "DecodeFrame last_decode_time_ms_ == time_ms");
return false;
}
if (time_ms <= last_decode_time_ms_ &&
time_ms + 50 >= last_decode_time_ms_) {
return false;
}
time_ms_ = time_ms;
condition_.notify_all();
return true;
}
With these conditions as constraints, some unnecessary decoding operations are avoided.
2.4.3 Using PTS of AVFrame
`AVPacket` stores the data before decoding (encoded data: H264/AAC, etc.); it holds the data after demuxing but before decoding, which is still compressed. `AVFrame` stores the decoded data (pixel data: YUV/RGB/PCM, etc.). The `pts` of `AVPacket` and the `pts` of `AVFrame` have different meanings: the former indicates when the decompressed packet should be presented, and the latter indicates when the frame data should be displayed.
// pts of AVPacket
/**
 * Presentation timestamp in AVStream->time_base units; the time at which
 * the decompressed packet will be presented to the user.
 * Can be AV_NOPTS_VALUE if it is not stored in the file.
 * pts MUST be larger or equal to dts as presentation cannot happen before
 * decompression, unless one wants to view hex dumps. Some formats misuse
 * the terms dts and pts/cts to mean something different. Such timestamps
 * must be converted to true pts/dts before they are stored in AVPacket.
 */
int64_t pts;

// pts of AVFrame
/**
 * Presentation timestamp in time_base units (time when frame should be shown to user).
 */
int64_t pts;
- Whether the currently decoded frame data is drawn to the screen depends on comparing the timestamp passed in for decoding with the timestamp of the frame returned by the decoder. The `pts` of `AVPacket` cannot be used here, since it is most likely not a monotonically increasing timestamp;
- The precondition for drawing is that the specified decoding timestamp is not greater than the timestamp converted from the `pts` of the currently decoded frame.
auto decode_time_ms = frame->pts * 1000 / stream->time_base.den;
LOGD(TAG, "decode_time_ms = %ld", decode_time_ms);
if (decode_time_ms >= time_ms_) {
    last_decode_time_ms_ = decode_time_ms;
    is_seeking_ = false;
    // Draw the picture
    // ……
}
2.4.4 The video has no more data when decoding the last frame
av_read_frame(av_format_context, packet) reads one frame of the video media stream into AVPacket. If the returned int value is 0, it is Success; if it is less than 0, it is Error or EOF.
Therefore, if a value less than 0 is returned during playback, call the avcodec_flush_buffers function to reset the decoder state and flush the contents of its buffers, then seek to the currently passed-in timestamp, invoke the decode-complete callback, and let the synchronization lock wait.
// Read one frame of compressed video data
ret = av_read_frame(av_format_context, packet);
if (ret < 0) {
    avcodec_flush_buffers(codec_ctx);
    av_seek_frame(av_format_context, video_stream_index,
                  time_ms_ * stream->time_base.den / 1000, AVSEEK_FLAG_BACKWARD);
    LOGD(TAG, "ret < 0, condition_.wait(lock)");
    // Guard against the case where the video has no more data when decoding the last frame
    on_decode_frame_(last_decode_time_ms_);
    condition_.wait(lock);
}
2.5 Wrapping the decoder VideoDecoder for the upper layer
To wrap a VideoDecoder for the upper layer, you only need to expose the C++ VideoDecoder interfaces in native-lib.cpp; the upper layer then calls the C++ interfaces through JNI.
For example, when the upper layer wants to decode at a specified timestamp, write a decodeFrame method that passes the timestamp down to the C++ layer's nativeDecodeFrame for decoding; the implementation of nativeDecodeFrame is written in native-lib.cpp.
// FFmpegVideoDecoder.kt
class FFmpegVideoDecoder(
    path: String,
    val onDecodeFrame: (timestamp: Long, texture: SurfaceTexture, needRender: Boolean) -> Unit
) {

    fun decodeFrame(timeMS: Long, sync: Boolean = false) {
        // Pass the specified timestamp to the C++ layer for decoding
        if (nativeDecodeFrame(decoderPtr, timeMS) && sync) {
            // ……
        } else {
            // ……
        }
    }

    private external fun nativeDecodeFrame(decoder: Long, timeMS: Long): Boolean

    companion object {
        const val TAG = "FFmpegVideoDecoder"

        init {
            System.loadLibrary("ffmmpeg")
        }
    }
}
Then, in native-lib.cpp, call the DecodeFrame interface of the C++ layer VideoDecoder, thus establishing the connection between the upper layer and the underlying C++ layer through JNI.
// native-lib.cpp
extern "C"
JNIEXPORT jboolean JNICALL
Java_com_example_decoder_video_FFmpegVideoDecoder_nativeDecodeFrame(JNIEnv* env,
jobject thiz,
jlong decoder,
jlong time_ms) {
auto videoDecoder = (codec::VideoDecoder*)decoder;
return videoDecoder->DecodeFrame(time_ms);
}
Third, takeaways
Technical experience
- Compiling FFmpeg and combining it with Android to implement video decoding and playback is very convenient.
- Because the specific decoding process is implemented in the C++ layer, there is a learning curve; it helps to have some C++ background.
Fourth, the appendix
C++ encapsulated VideoDecoder
VideoDecoder.h
#include <jni.h>
#include <mutex>
#include <condition_variable>
#include <functional>
#include <android/native_window.h>
#include <android/native_window_jni.h>
#include <time.h>

extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswresample/swresample.h>
#include <libswscale/swscale.h>
}

#include <string>

/*
 * VideoDecoder can be used to decode the video media stream in an audio/video file (e.g. .mp4).
 * After the Java layer passes in the file path, it can pass in specified timestamps periodically at a
 * certain fps for decoding (frame extraction); this is done by the DecodeFrame function provided in C++.
 * At the end of each decode, the timestamp of the decoded frame is called back to the upper-layer
 * decoder for use by other operations.
 */

namespace codec {

class VideoDecoder {

private:
    std::string path_;

    long time_ms_ = -1;
    long last_decode_time_ms_ = -1;

    bool is_seeking_ = false;

    ANativeWindow* native_window_ = nullptr;
    ANativeWindow_Buffer window_buffer_{};

    // Video width and height
    int video_width_ = 0;
    int video_height_ = 0;

    uint8_t* out_buffer_ = nullptr;

    // on_decode_frame_ calls the timestamp of the extracted frame back to the upper-layer decoder
    // for use by other operations.
    std::function<void(long timestamp)> on_decode_frame_ = nullptr;

    bool is_stop_ = false;

    // Used together with the lock std::unique_lock<std::mutex> for synchronization in the decode loop
    std::mutex work_queue_mtx;

    // The condition variable that actually performs the synchronous wait and wake-up
    std::condition_variable condition_;

    // Decode one specific frame
    void Decode(AVCodecContext* codec_ctx, AVPacket* pkt, AVFrame* frame, AVStream* stream,
                std::unique_lock<std::mutex>& lock, SwsContext* sws_context, AVFrame* rgba_frame);

public:
    // To create a new decoder, pass in the media file path and a decode callback on_decode_frame.
    VideoDecoder(const char* path, std::function<void(long timestamp)> on_decode_frame);

    // Preparation work before decoding (see the implementation below)
    void Prepare(ANativeWindow* window);

    // Pass in the specified timestamp and decode the corresponding frame
    bool DecodeFrame(long time_ms);

    // Release the decoder resources
    void Release();

    // Get the current time in milliseconds
    static int64_t GetCurrentMilliTime(void);
};

}
VideoDecoder.cpp
#include "VideoDecoder.h"
#include "../log/Logger.h"
#include <thread>
#include <utility>
extern "C" {
#include <libavutil/imgutils.h>
}
#define TAG "VideoDecoder"
namespace codec {
VideoDecoder::VideoDecoder(const char* path, std::function<void(long timestamp)> on_decode_frame)
: on_decode_frame_(std::move(on_decode_frame)) {
path_ = std::string(path);
}
void VideoDecoder::Decode(AVCodecContext* codec_ctx, AVPacket* pkt, AVFrame* frame, AVStream* stream,
std::unique_lock<std::mutex>& lock, SwsContext* sws_context,
AVFrame* rgba_frame) {
int ret;
/* send the packet with the compressed data to the decoder */
ret = avcodec_send_packet(codec_ctx, pkt);
if (ret == AVERROR(EAGAIN)) {
LOGE(TAG,
"Decode: Receive_frame and send_packet both returned EAGAIN, which is an API violation.");
} else if (ret < 0) {
return;
}
// read all the output frames (infile general there may be any number of them
while (ret >= 0 && !is_stop_) {
// For frame, avcodec_receive_frame internally unreferences it first on every call
ret = avcodec_receive_frame(codec_ctx, frame);
if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
return;
} else if (ret < 0) {
return;
}
int64_t startTime = GetCurrentMilliTime();
LOGD(TAG, "decodeStartTime: %ld", startTime);
// Convert the pts of the currently decoded frame into a millisecond timestamp
auto decode_time_ms = frame->pts * 1000 / stream->time_base.den;
LOGD(TAG, "decode_time_ms = %ld", decode_time_ms);
if (decode_time_ms >= time_ms_) {
LOGD(TAG, "decode decode_time_ms = %ld, time_ms_ = %ld", decode_time_ms, time_ms_);
last_decode_time_ms_ = decode_time_ms;
is_seeking_ = false;
// Convert the image data format
int result = sws_scale(
sws_context,
(const uint8_t* const*) frame->data, frame->linesize,
0, video_height_,
rgba_frame->data, rgba_frame->linesize);
if (result <= 0) {
LOGE(TAG, "Player Error : data convert fail");
return;
}
// Play (draw to the window)
result = ANativeWindow_lock(native_window_, &window_buffer_, nullptr);
if (result < 0) {
LOGE(TAG, "Player Error : Can not lock native window");
} else {
// Draw the image onto the window surface
auto bits = (uint8_t*) window_buffer_.bits;
for (int h = 0; h < video_height_; h++) {
memcpy(bits + h * window_buffer_.stride * 4,
out_buffer_ + h * rgba_frame->linesize[0],
rgba_frame->linesize[0]);
}
ANativeWindow_unlockAndPost(native_window_);
}
on_decode_frame_(decode_time_ms);
int64_t endTime = GetCurrentMilliTime();
LOGD(TAG, "decodeEndTime - decodeStartTime: %ld", endTime - startTime);
LOGD(TAG, "finish decode frame");
condition_.wait(lock);
}
// Its main purpose is to clear all the data in the AVPacket, re-initialize it, and set data and size to 0 for the next call.
// Release the packet reference
av_packet_unref(pkt);
}
}
void VideoDecoder::Prepare(ANativeWindow* window) {
native_window_ = window;
av_register_all();
auto av_format_context = avformat_alloc_context();
avformat_open_input(&av_format_context, path_.c_str(), nullptr, nullptr);
avformat_find_stream_info(av_format_context, nullptr);
int video_stream_index = -1;
for (int i = 0; i < av_format_context->nb_streams; i++) {
// Find the index of the video media stream
if (av_format_context->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
video_stream_index = i;
LOGD(TAG, "find video stream index = %d", video_stream_index);
break;
}
}
// run once
do {
if (video_stream_index == -1) {
codec::LOGE(TAG, "Player Error : Can not find video stream");
break;
}
std::unique_lock<std::mutex> lock(work_queue_mtx);
// Get the video media stream
auto stream = av_format_context->streams[video_stream_index];
// Find the registered decoder
auto codec = avcodec_find_decoder(stream->codecpar->codec_id);
// Get the decoder context
AVCodecContext* codec_ctx = avcodec_alloc_context3(codec);
auto ret = avcodec_parameters_to_context(codec_ctx, stream->codecpar);
if (ret >= 0) {
// Open the decoder
avcodec_open2(codec_ctx, codec, nullptr);
// The width and height are only available after the decoder has been opened
video_width_ = codec_ctx->width;
video_height_ = codec_ctx->height;
AVFrame* rgba_frame = av_frame_alloc();
int buffer_size = av_image_get_buffer_size(AV_PIX_FMT_RGBA, video_width_, video_height_,
1);
// Allocate memory for the output buffer
out_buffer_ = (uint8_t*) av_malloc(buffer_size * sizeof(uint8_t));
av_image_fill_arrays(rgba_frame->data, rgba_frame->linesize, out_buffer_,
AV_PIX_FMT_RGBA,
video_width_, video_height_, 1);
// Setting the width and height limits the number of pixels in the buffer, not the physical display size of the screen.
// If the buffer does not match the physical display size, the displayed image may be stretched or compressed
int result = ANativeWindow_setBuffersGeometry(native_window_, video_width_,
video_height_, WINDOW_FORMAT_RGBA_8888);
if (result < 0) {
LOGE(TAG, "Player Error : Can not set native window buffer");
avcodec_close(codec_ctx);
avcodec_free_context(&codec_ctx);
av_free(out_buffer_);
break;
}
auto frame = av_frame_alloc();
auto packet = av_packet_alloc();
struct SwsContext* data_convert_context = sws_getContext(
video_width_, video_height_, codec_ctx->pix_fmt,
video_width_, video_height_, AV_PIX_FMT_RGBA,
SWS_BICUBIC, nullptr, nullptr, nullptr);
while (!is_stop_) {
LOGD(TAG, "front seek time_ms_ = %ld, last_decode_time_ms_ = %ld", time_ms_,
last_decode_time_ms_);
if (!is_seeking_ && (time_ms_ > last_decode_time_ms_ + 1000 ||
time_ms_ < last_decode_time_ms_ - 50)) {
is_seeking_ = true;
LOGD(TAG, "seek frame time_ms_ = %ld, last_decode_time_ms_ = %ld", time_ms_,
last_decode_time_ms_);
// What is passed in must be the timestamp of the target frame in time_base units, so the original time_ms_ is converted back using the inverse of the calculation above
av_seek_frame(av_format_context, video_stream_index,
time_ms_ * stream->time_base.den / 1000, AVSEEK_FLAG_BACKWARD);
}
// Read one complete frame of video; this is a frame of compressed data, which can then be decoded
ret = av_read_frame(av_format_context, packet);
if (ret < 0) {
avcodec_flush_buffers(codec_ctx);
av_seek_frame(av_format_context, video_stream_index,
time_ms_ * stream->time_base.den / 1000, AVSEEK_FLAG_BACKWARD);
LOGD(TAG, "ret < 0, condition_.wait(lock)");
// Guard against the video having no more data when decoding the last frame
on_decode_frame_(last_decode_time_ms_);
condition_.wait(lock);
}
if (packet->size) {
Decode(codec_ctx, packet, frame, stream, lock, data_convert_context,
rgba_frame);
}
}
// Release resources
sws_freeContext(data_convert_context);
av_free(out_buffer_);
av_frame_free(&rgba_frame);
av_frame_free(&frame);
av_packet_free(&packet);
}
avcodec_close(codec_ctx);
avcodec_free_context(&codec_ctx);
} while (false);
avformat_close_input(&av_format_context);
avformat_free_context(av_format_context);
ANativeWindow_release(native_window_);
delete this;
}
bool VideoDecoder::DecodeFrame(long time_ms) {
LOGD(TAG, "DecodeFrame time_ms = %ld", time_ms);
if (last_decode_time_ms_ == time_ms || time_ms_ == time_ms) {
LOGD(TAG, "DecodeFrame last_decode_time_ms_ == time_ms");
return false;
}
if (last_decode_time_ms_ >= time_ms && last_decode_time_ms_ <= time_ms + 50) {
return false;
}
time_ms_ = time_ms;
condition_.notify_all();
return true;
}
void VideoDecoder::Release() {
is_stop_ = true;
condition_.notify_all();
}
/**
* Get the current time in milliseconds
*/
int64_t VideoDecoder::GetCurrentMilliTime(void) {
struct timeval tv{};
gettimeofday(&tv, nullptr);
return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}
}
Join us
As bytedance’s image team, we are currently developing a number of products, including Photoediting, CapCut, Light Color, Wake image and Faceu. Our business covers diversified image creation scenes. By June 2021, photoediting, Light Color Camera and CapCut have repeatedly topped the list of free apps in APP Store at home and abroad. And continue to maintain high growth. Join us to create the world’s most popular image creation products.
Experienced-hire application link: job.toutiao.com/s/NFYMcaq
Enrollment code: 5A38FTT
Campus-recruit application link: jobs.bytedance.com/campus/posi…
Recruiting session – ByteDance Interactive Entertainment R&D Image team: bytedance.feishu.cn/docx/doxcnM…