[Statement]
First of all, this series of articles is based on my own understanding and practice; there may be mistakes, and corrections are welcome.
Secondly, this is an introductory series that covers only just enough knowledge; there are plenty of blog posts online for more in-depth material. Finally, while writing these articles I refer to articles shared by others and list them at the end, with thanks to those authors for sharing.
Writing is not easy; please credit the source when reposting!
Tutorial code: [Portal]
Contents
I. Android audio and video hard decoding:
- 1. Basic knowledge of audio and video
- 2. Audio and video hard decoding process: encapsulating a basic decoding framework
- 3. Audio and video playback: audio and video synchronization
- 4. Audio and video demuxing and remuxing: generating an MP4
II. Using OpenGL to render video frames
- 1. A first look at OpenGL ES
- 2. Using OpenGL to render video images
- 3. OpenGL rendering of multiple videos and picture-in-picture
- 4. A deeper look at OpenGL's EGL
- 5. OpenGL FBO data buffers
- 6. Android audio and video hard encoding: generating an MP4
III. Android FFmpeg audio and video decoding
- 1. Compiling the FFmpeg .so libraries
- 2. Introducing FFmpeg into Android
- 3. Android FFmpeg video decoding and playback
- 4. Android FFmpeg + OpenSL ES audio decoding and playback
- 5. Android FFmpeg + OpenGL ES video playback
- 6. Android FFmpeg simple MP4 synthesis: video demuxing and remuxing
- 7. Android FFmpeg video encoding
What you can learn from this article: the audio and video decoding process based on FFmpeg 4.x, focusing on how to implement video playback.
Preface
Hi ~ It has been a long wait!
This article is quite long, because many readers may not be very familiar with JNI and C/C++, so the FFmpeg code is explained in more detail. It demonstrates the complete FFmpeg decoding and rendering flow and encapsulates the decoding process.
To make it easier to explain and to read, the code is presented in chunks, which means entire classes are not posted directly.
However, each piece of code is marked at the top with the file and class it belongs to. To see the full code, check out the GitHub repository.
This article requires basic knowledge of C/C++. If you are not familiar with C/C++, please check out my other article: "Getting Started with Android NDK: Basic knowledge of C++".
Please read patiently; I believe that after reading it you will have a fairly good understanding of FFmpeg decoding.
I. A brief introduction to the FFmpeg libraries
In the last article, the FFmpeg-related libraries were introduced into the Android project, including the following:
Library | Description |
---|---|
avcodec | Core audio and video encoding/decoding library |
avformat | Muxing and demuxing of audio and video container formats |
avutil | Core utility library |
swscale | Image format conversion and scaling |
swresample | Audio resampling |
avfilter | Audio and video filters, such as video watermarking and audio pitch shifting |
avdevice | I/O device library, providing input and output of device data |
Relying on these libraries, FFmpeg provides powerful audio and video encoding, decoding, editing, conversion, and capture capabilities.
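To quickly verify that these libraries are linked correctly, the ffmpegInfo() JNI method used later in this article (see the FFmpegActivity code) typically just returns version and configuration strings. Below is a minimal sketch of what such a method might look like; the real ffmpegInfo() in the project may differ:
// native-lib.cpp (sketch only; not the project's exact implementation)
#include <jni.h>
#include <string>
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/avutil.h>
}

extern "C" JNIEXPORT jstring JNICALL
Java_com_cxp_learningvideo_FFmpegActivity_ffmpegInfo(JNIEnv *env, jobject /* this */) {
    // av_version_info() and avcodec_configuration() both return plain C strings
    std::string info = std::string("FFmpeg version: ") + av_version_info()
                       + "\navcodec configuration:\n" + avcodec_configuration();
    return env->NewStringUTF(info.c_str());
}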
II. Introduction to the FFmpeg decoding process
In the previous series of articles, Android's native hard-decoding capability was used to decode and play video.
To sum up, the process is as follows:
- Initialize the decoder
- Read the encoded data from the MP4 file and feed it to the decoder
- Get the decoded frame data
- Render the frame onto the screen
FFmpeg decoding follows the same process; the difference is that FFmpeg uses the CPU's computing power to decode (soft decoding).
1. FFmpeg initialization
Compared with Android's native decoding, the FFmpeg initialization process is relatively tedious, but the flow is fixed; once encapsulated, it can be reused directly.
First, take a look at the initialization flowchart.
Initialization essentially sets up a group of parameters according to the format of the file to be decoded, held in three structures: AVFormatContext (format_ctx), AVCodecContext (codec_ctx), and AVCodec (codec).
Structures: FFmpeg is written in C, which, as we know, is a procedural language; unlike C++, there are no classes to encapsulate internal data. But C provides structs, which can be used to group data and achieve a class-like effect.
- AVFormatContext: belongs to the avformat library and stores the context of the stream data; it is mainly used for muxing and demuxing audio and video.
- AVCodecContext: belongs to the avcodec library and stores the codec parameter context; it is mainly used for encoding and decoding audio and video data.
- AVCodec: belongs to the avcodec library; it is the audio/video codec itself, the actual executor of encoding and decoding.
2. FFmpeg decoding loop
Again, a flowchart illustrates the concrete decoding process:
After FFmpeg is initialized, it is time to decode the data frames.
As can be seen from the figure above, FFmpeg first reads the compressed data into an AVPacket (packet), and then decodes it into a frame of renderable data, an AVFrame (frame).
AVPacket and AVFrame are likewise two structs that encapsulate the corresponding data.
3. Encapsulating the decoding classes
With this understanding of the decoding process, we can write code following the flowcharts above.
Based on past experience, since FFmpeg initialization and decoding are tedious, repetitive work, we should encapsulate them for better reuse and extension.
III. Encapsulating the decoding flow
1. Define the decoding states: decode_state.h
In src/main/cpp/media/decoder, right-click New -> C++ Header File and enter decode_state.
//decode_state.h
#ifndef LEARNVIDEO_DECODESTATE_H
#define LEARNVIDEO_DECODESTATE_H
enum DecodeState {
STOP,
PREPARE,
START,
DECODING,
PAUSE,
FINISH
};
#endif //LEARNVIDEO_DECODESTATE_H
This is an enum that defines the decoder's states.
2. Define the basic decoder functions: i_decoder.h
In src/main/cpp/media/decoder, right-click New -> C++ Header File and enter i_decoder.
// i_decoder.h
#ifndef LEARNVIDEO_I_DECODER_H
#define LEARNVIDEO_I_DECODER_H
class IDecoder {
public:
    virtual void GoOn() = 0;
    virtual void Pause() = 0;
    virtual void Stop() = 0;
    virtual bool IsRunning() = 0;
    virtual long GetDuration() = 0;
    virtual long GetCurPos() = 0;
};
#endif //LEARNVIDEO_I_DECODER_H
This is a pure virtual class, similar to a Java interface (see "Getting Started with Android NDK: Basic knowledge of C++"), which defines the basic methods a decoder should have.
3. Define a decoder base class: base_decoder
In src/main/cpp/media/decoder, right-click New -> C++ Class and enter base_decoder. This class encapsulates the most fundamental parts of the decoding process.
Two files are generated: base_decoder.h and base_decoder.cpp.
- Define the header file: base_decoder.h
//base_decoder.h
#ifndef LEARNVIDEO_BASEDECODER_H
#define LEARNVIDEO_BASEDECODER_H

#include <jni.h>
#include <string>
#include <thread>
#include "../../utils/logger.h"
#include "i_decoder.h"
#include "decode_state.h"

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/frame.h>
#include <libavutil/time.h>
};

class BaseDecoder: public IDecoder {
private:
    const char *TAG = "BaseDecoder";

    //------------------ Decoding related -----------------------
    // Decoding information (demuxing) context
    AVFormatContext *m_format_ctx = NULL;
    // Decoder
    AVCodec *m_codec = NULL;
    // Decoder context
    AVCodecContext *m_codec_ctx = NULL;
    // Packet to be decoded
    AVPacket *m_packet = NULL;
    // Decoded frame data
    AVFrame *m_frame = NULL;
    // Current playback time
    int64_t m_cur_t_s = 0;
    // Total duration
    long m_duration = 0;
    // Start time
    int64_t m_started_t = -1;
    // Decoding state
    DecodeState m_state = STOP;
    // Stream index
    int m_stream_index = -1;

    // Others omitted
    // ...
};
Note: when including the headers of the FFmpeg libraries, be careful to wrap the #include directives in extern "C" {}. Because FFmpeg is written in C, when its headers are included from C++ files they must be given C linkage, otherwise C++ name mangling will cause link errors.
In the header file, we first declare the variables needed in the .cpp file, the most important being the decoding structures mentioned in the previous section.
- Define methods associated with initialization and decoding loops:
//base_decoder.h
class BaseDecoder: public IDecoder {
private:
    const char *TAG = "BaseDecoder";

    //------------------ Decoding related -----------------------
    // Omitted...

    //------------------ Private methods ------------------------
    /**
     * Initialize the FFmpeg parameters
     * @param env JVM environment
     */
    void InitFFMpegDecoder(JNIEnv *env);

    /**
     * Allocate the buffers needed during decoding
     */
    void AllocFrameBuffer();

    /**
     * Decoding loop
     */
    void LoopDecode();

    /**
     * Obtain the current frame timestamp
     */
    void ObtainTimeStamp();

    /**
     * Finish decoding and release resources
     * @param env JVM environment
     */
    void DoneDecode(JNIEnv *env);

    /**
     * Time synchronization
     */
    void SyncRender();

    // Others omitted
    // ...
};
- Since the decoding base class inherits from i_decoder, it also needs to implement the generic methods declared there.
//base_decoder.h
class BaseDecoder: public IDecoder {
    // Others omitted
    // ...

public:
    //-------- Constructor and destructor -------------
    BaseDecoder(JNIEnv *env, jstring path);
    virtual ~BaseDecoder();

    //-------- Implement the base class methods --------
    void GoOn() override;
    void Pause() override;
    void Stop() override;
    bool IsRunning() override;
    long GetDuration() override;
    long GetCurPos() override;
};
- Define the decoding thread
As we know, decoding is a very time-consuming operation; just as with native hard decoding, a dedicated thread is needed to carry the decoding task. So first define the thread-related variables and methods in the header file.
//base_decoder.h
class BaseDecoder: public IDecoder {
private:
    // Others omitted
    // ...

    //------------------ Thread related -------------------------
    // The JavaVM, used to attach the decoding thread to the JVM
    JavaVM *m_jvm_for_thread = NULL;
    // Global reference to the original jstring path; otherwise it cannot be used in the thread
    jobject m_path_ref = NULL;
    // The converted path
    const char *m_path = NULL;
    // Thread wait/lock variables
    pthread_mutex_t m_mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t m_cond = PTHREAD_COND_INITIALIZER;

    /**
     * Create a new decoding thread
     */
    void CreateDecodeThread();

    /**
     * Static decoding method, called back on the decoding thread
     * @param that the current decoder
     */
    static void Decode(std::shared_ptr<BaseDecoder> that);

protected:
    /**
     * Enter the waiting state
     */
    void Wait(long second = 0);

    /**
     * Resume decoding
     */
    void SendSignal();
};
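The implementations of Wait and SendSignal are not listed in this article. Below is a minimal sketch of what they might look like in base_decoder.cpp, assuming they simply use the mutex and condition variable declared above; the actual implementation is in the project source:
// base_decoder.cpp (sketch; requires <pthread.h> and <time.h>)
void BaseDecoder::Wait(long second) {
    pthread_mutex_lock(&m_mutex);
    if (second > 0) {
        // Wait at most `second` seconds
        timespec outtime{};
        outtime.tv_sec = time(NULL) + second;
        outtime.tv_nsec = 0;
        pthread_cond_timedwait(&m_cond, &m_mutex, &outtime);
    } else {
        // Wait until SendSignal() is called
        pthread_cond_wait(&m_cond, &m_mutex);
    }
    pthread_mutex_unlock(&m_mutex);
}

void BaseDecoder::SendSignal() {
    pthread_mutex_lock(&m_mutex);
    pthread_cond_signal(&m_cond);
    pthread_mutex_unlock(&m_mutex);
}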
- Define the virtual functions that subclasses need to implement
//base_decoder.h
class BaseDecoder: public IDecoder {
protected:
    /**
     * Prepare callback for subclasses
     * @note called on the decoding thread
     * @param env the JVM environment bound to the decoding thread
     */
    virtual void Prepare(JNIEnv *env) = 0;

    /**
     * Render callback for subclasses
     * @note called on the decoding thread
     * @param frame video: one frame of YUV data; audio: one frame of PCM data
     */
    virtual void Render(AVFrame *frame) = 0;

    /**
     * Resource release callback for subclasses
     */
    virtual void Release() = 0;
};
With that, the basic structure of the decoding class is defined:
- the FFmpeg decoding-related structure members
- the basic decoder methods
- the decoding thread
- the methods that subclasses need to implement
4. Implement the basic decoder
In base_decoder.cpp, implement the methods declared in the header file.
- Initialization and creation of the decoding thread
// base_decoder.cpp
#include "base_decoder.h"
#include "../../utils/timer.c"

BaseDecoder::BaseDecoder(JNIEnv *env, jstring path) {
    Init(env, path);
    CreateDecodeThread();
}

BaseDecoder::~BaseDecoder() {
    if (m_format_ctx != NULL) delete m_format_ctx;
    if (m_codec_ctx != NULL) delete m_codec_ctx;
    if (m_frame != NULL) delete m_frame;
    if (m_packet != NULL) delete m_packet;
}

void BaseDecoder::Init(JNIEnv *env, jstring path) {
    m_path_ref = env->NewGlobalRef(path);
    m_path = env->GetStringUTFChars(path, NULL);
    // Get the JavaVM in preparation for creating the thread
    env->GetJavaVM(&m_jvm_for_thread);
}

void BaseDecoder::CreateDecodeThread() {
    // When the thread exits, the pointer to this class is released automatically
    std::shared_ptr<BaseDecoder> that(this);
    std::thread t(Decode, that);
    t.detach();
}
The constructor is simple: it receives the JNI environment variable and the path of the file to be decoded.
In the Init method, because jstring is not a standard C++ type, the jstring path has to be converted to a char* before it can be used.
Note: JNIEnv and threads are one-to-one; that is, in Android the JNI environment is bound to its thread. Each thread has its own JNIEnv, and they cannot be accessed from other threads. So if you want to use a JNIEnv in a new thread, you need to create a new JNIEnv for that thread.
At the end of Init, env->GetJavaVM(&m_jvm_for_thread) obtains the JavaVM instance and saves it to m_jvm_for_thread; through it, a new JNIEnv can later be obtained for the decoding thread.
Creating a thread in C++ is very simple; a thread can be started with just two lines:
std::thread t(static_method, args_for_static_method);
t.detach();
That is, the thread takes a static method as its entry point; when started, it calls back into that method, and arguments can be passed to it.
In addition, the first line in the CreateDecodeThread method creates a smart pointer.
As we know, objects created with new in C++ have to be deleted manually, otherwise there will be a memory leak. Smart pointers help us manage that memory.
The object a smart pointer manages is destroyed automatically when the reference count drops to zero; in other words, there is no need to delete it manually.
std::shared_ptr<BaseDecoder> that(this);
This wraps the decoder in a smart pointer named that, so that when the decoder is used externally there is no need to free the memory manually: when the decoding thread exits and the last reference is released, the object is destroyed automatically and the destructor is called.
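Here is a small standalone example (a hypothetical Worker class, not part of the project) illustrating why the detached thread keeps the object alive through its own copy of the shared_ptr:
// Minimal illustration of the shared_ptr + detached-thread lifetime pattern
#include <chrono>
#include <cstdio>
#include <memory>
#include <thread>

class Worker {
public:
    ~Worker() { printf("Worker destroyed\n"); }
    void Run() { printf("working on a background thread\n"); }
    // The static entry point receives its own copy of the shared_ptr,
    // so the reference count stays above zero until the thread returns.
    static void Entry(std::shared_ptr<Worker> that) { that->Run(); }
};

int main() {
    std::shared_ptr<Worker> worker(new Worker());
    std::thread t(Worker::Entry, worker); // the thread now holds a second reference
    t.detach();
    worker.reset(); // main drops its reference; the Worker lives until Entry() returns
    std::this_thread::sleep_for(std::chrono::seconds(1)); // keep the process alive for the demo
    return 0;
}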
- Encapsulate the decoding process
// base_decoder.cpp
void BaseDecoder::Decode(std::shared_ptr<BaseDecoder> that) {
    JNIEnv *env;
    // Attach the thread to the JVM to obtain a JNIEnv for this thread
    if (that->m_jvm_for_thread->AttachCurrentThread(&env, NULL) != JNI_OK) {
        LOG_ERROR(that->TAG, that->LogSpec(), "Fail to Init decode thread");
        return;
    }
    // Initialize the decoder
    that->InitFFMpegDecoder(env);
    // Allocate the decoding buffers
    that->AllocFrameBuffer();
    // Call the subclass's Prepare method
    that->Prepare(env);
    // Enter the decoding loop
    that->LoopDecode();
    // Finish decoding and release resources
    that->DoneDecode(env);
    // Detach the thread from the JVM
    that->m_jvm_for_thread->DetachCurrentThread();
}
In the base_decoder.h header, Decode is declared as a static member method.
A JNIEnv is first created for the decoding thread; if that fails, decoding simply exits.
The Decode method then calls the corresponding methods step by step; it is very simple, just follow the comments.
Next, look at the details of the step-by-step call.
- Initialize the decoder
void BaseDecoder::InitFFMpegDecoder(JNIEnv *env) {
    // 1. Initialize the context
    m_format_ctx = avformat_alloc_context();

    // 2. Open the file
    if (avformat_open_input(&m_format_ctx, m_path, NULL, NULL) != 0) {
        LOG_ERROR(TAG, LogSpec(), "Fail to open file [%s]", m_path);
        DoneDecode(env);
        return;
    }

    // 3. Get the audio/video stream information
    if (avformat_find_stream_info(m_format_ctx, NULL) < 0) {
        LOG_ERROR(TAG, LogSpec(), "Fail to find stream info");
        DoneDecode(env);
        return;
    }

    // 4. Find the codec
    // 4.1 Get the index of the video stream
    int vIdx = -1; // Stores the index of the video stream
    for (int i = 0; i < m_format_ctx->nb_streams; ++i) {
        if (m_format_ctx->streams[i]->codecpar->codec_type == GetMediaType()) {
            vIdx = i;
            break;
        }
    }
    if (vIdx == -1) {
        LOG_ERROR(TAG, LogSpec(), "Fail to find stream index")
        DoneDecode(env);
        return;
    }
    m_stream_index = vIdx;

    // 4.2 Get the decoder parameters
    AVCodecParameters *codecPar = m_format_ctx->streams[vIdx]->codecpar;

    // 4.3 Get the decoder
    m_codec = avcodec_find_decoder(codecPar->codec_id);

    // 4.4 Get the decoder context
    m_codec_ctx = avcodec_alloc_context3(m_codec);
    if (avcodec_parameters_to_context(m_codec_ctx, codecPar) != 0) {
        LOG_ERROR(TAG, LogSpec(), "Fail to obtain av codec context");
        DoneDecode(env);
        return;
    }

    // 5. Open the decoder
    if (avcodec_open2(m_codec_ctx, m_codec, NULL) < 0) {
        LOG_ERROR(TAG, LogSpec(), "Fail to open av codec");
        DoneDecode(env);
        return;
    }

    m_duration = (long) ((float) m_format_ctx->duration / AV_TIME_BASE * 1000);

    LOG_INFO(TAG, LogSpec(), "Decoder init success")
}
It may look complicated, but in fact the routine is always the same. It may feel uncomfortable at first, mainly because these are procedural-style calls, which differ from the object-oriented languages we are used to.
Here’s an example:
In the above code, the file is opened like this:
avformat_open_input(&m_format_ctx, m_path, NULL, NULL);
In an object-oriented style, the code would normally look like this:
// Note: The following is pseudocode for illustration only
m_format_ctx.avformat_open_input(m_path);
So how should we think about these procedural calls in C?
We know that m_format_ctx is a struct that encapsulates specific data, so avformat_open_input is a function that operates on that struct; different function calls are simply operations on different data inside the struct.
For the details of each step, refer to the comments in the code above; it is really just the implementation of the steps in the initialization flowchart from the previous section.
There are two things to note:
- FFmpeg functions of the alloc type usually only create the corresponding struct; its concrete parameters and data buffers can be used only after a second initialization call.
For example, m_format_ctx and m_codec_ctx:
// Create
m_format_ctx = avformat_alloc_context();
// Initialize
avformat_open_input(&m_format_ctx, m_path, NULL, NULL);
//---------------------------------------------------------
// Create
m_codec_ctx = avcodec_alloc_context3(m_codec);
// Initialize
avcodec_parameters_to_context(m_codec_ctx, codecPar);
- About step 4 in the code comments
We know that audio and video data are usually carried in different streams (tracks), so to get the correct audio or video data, we first need to find the index of the corresponding stream.
The media type of the stream is obtained through the virtual function GetMediaType(), which is implemented by each subclass:
- Video: AVMediaType = AVMEDIA_TYPE_VIDEO
- Audio: AVMediaType = AVMEDIA_TYPE_AUDIO
- Create the structures for the data to be decoded and the decoded data
// base_decoder.cpp
void BaseDecoder::AllocFrameBuffer() {
    // Initialize the structures holding the pre-decode and decoded data
    // 1) Initialize AVPacket, which stores the data before decoding
    m_packet = av_packet_alloc();
    // 2) Initialize AVFrame, which stores the decoded data
    m_frame = av_frame_alloc();
}
Very simply, two methods are used to allocate memory for later decoding.
- The decoding loop
// base_decoder.cpp
void BaseDecoder::LoopDecode() {
    if (STOP == m_state) { // If the state has already been changed externally, keep the external configuration
        m_state = START;
    }

    LOG_INFO(TAG, LogSpec(), "Start loop decode")

    while (1) {
        if (m_state != DECODING && m_state != START && m_state != STOP) {
            Wait();
            // After resuming, reset the synchronization start time to remove the time spent waiting
            m_started_t = GetCurMsTime() - m_cur_t_s;
        }

        if (m_state == STOP) {
            break;
        }

        if (-1 == m_started_t) {
            m_started_t = GetCurMsTime();
        }

        if (DecodeOneFrame() != NULL) {
            SyncRender();
            Render(m_frame);

            if (m_state == START) {
                m_state = PAUSE;
            }
        } else {
            LOG_INFO(TAG, LogSpec(), "m_state = %d", m_state)
            if (ForSynthesizer()) {
                m_state = STOP;
            } else {
                m_state = FINISH;
            }
        }
    }
}
As you can see, the while loop also integrates part of the time-synchronization code. The synchronization logic was explained in detail in the earlier hard-decoding articles; for details, refer to "Audio and video playback: audio and video synchronization".
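For completeness, here is a rough sketch of what the two synchronization helpers ObtainTimeStamp() and SyncRender() might look like, following the approach described in the earlier hard-decoding article; the exact implementation is in the project source:
// base_decoder.cpp (sketch, not the exact project code)
void BaseDecoder::ObtainTimeStamp() {
    if (m_frame->pkt_dts != AV_NOPTS_VALUE) {
        m_cur_t_s = m_frame->pkt_dts;
    } else if (m_frame->pts != AV_NOPTS_VALUE) {
        m_cur_t_s = m_frame->pts;
    } else {
        m_cur_t_s = 0;
    }
    // Convert from the stream time base to milliseconds
    m_cur_t_s = (int64_t) (m_cur_t_s * av_q2d(m_format_ctx->streams[m_stream_index]->time_base) * 1000);
}

void BaseDecoder::SyncRender() {
    // If the frame's timestamp is ahead of the elapsed play time, sleep for the difference
    int64_t elapsed = GetCurMsTime() - m_started_t;
    if (m_cur_t_s > elapsed) {
        av_usleep((unsigned int) ((m_cur_t_s - elapsed) * 1000));
    }
}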
Without repeating the details here, let's look only at the most important part: DecodeOneFrame().
- Decode one frame of data
Before looking at the code, let's go over the three FFmpeg functions involved in decoding:
- av_read_frame(m_format_ctx, m_packet):
Reads one frame of demuxed (still encoded) data from m_format_ctx and stores it in m_packet;
- avcodec_send_packet(m_codec_ctx, m_packet):
Sends m_packet to the decoder; the decoded data is held inside m_codec_ctx;
- avcodec_receive_frame(m_codec_ctx, m_frame):
Receives one frame of decoded data and stores it in m_frame.
// base_decoder.cpp
AVFrame* BaseDecoder::DecodeOneFrame() {
    int ret = av_read_frame(m_format_ctx, m_packet);
    while (ret == 0) {
        if (m_packet->stream_index == m_stream_index) {
            switch (avcodec_send_packet(m_codec_ctx, m_packet)) {
                case AVERROR_EOF: {
                    av_packet_unref(m_packet);
                    LOG_ERROR(TAG, LogSpec(), "Decode error: %s", av_err2str(AVERROR_EOF));
                    return NULL; // Decoding finished
                }
                case AVERROR(EAGAIN):
                    LOG_ERROR(TAG, LogSpec(), "Decode error: %s", av_err2str(AVERROR(EAGAIN)));
                    break;
                case AVERROR(EINVAL):
                    LOG_ERROR(TAG, LogSpec(), "Decode error: %s", av_err2str(AVERROR(EINVAL)));
                    break;
                case AVERROR(ENOMEM):
                    LOG_ERROR(TAG, LogSpec(), "Decode error: %s", av_err2str(AVERROR(ENOMEM)));
                    break;
                default:
                    break;
            }

            int result = avcodec_receive_frame(m_codec_ctx, m_frame);
            if (result == 0) {
                ObtainTimeStamp();
                av_packet_unref(m_packet);
                return m_frame;
            } else {
                LOG_INFO(TAG, LogSpec(), "Receive frame error result: %s", av_err2str(AVERROR(result)))
            }
        }
        // Release the packet
        av_packet_unref(m_packet);
        ret = av_read_frame(m_format_ctx, m_packet);
    }
    av_packet_unref(m_packet);
    LOGI(TAG, "ret = %d", ret)
    return NULL;
}
Once you know the decoding process, the rest is really just handling the exceptional cases, for example:
- when the decoder needs more input, the next packet is sent and the frame is fetched again;
- when decoding fails, the next frame is read and decoding continues;
- when decoding is complete, NULL is returned.
Finally, one very important point: after a frame has been decoded, av_packet_unref(m_packet) must be called to free the packet's memory, otherwise it will leak.
- Decoding finished: release resources
Once decoding is finished, all FFmpeg-related resources need to be released and the decoder closed.
It is also important to release the file path converted from the jstring at initialization, and to delete the global reference.
// base_decoder.cpp
void BaseDecoder::DoneDecode(JNIEnv *env) {
    LOG_INFO(TAG, LogSpec(), "Decode done and decoder release")

    if (m_packet != NULL) {
        av_packet_free(&m_packet);
    }
    if (m_frame != NULL) {
        av_frame_free(&m_frame);
    }

    // Close the decoder
    if (m_codec_ctx != NULL) {
        avcodec_close(m_codec_ctx);
        avcodec_free_context(&m_codec_ctx);
    }

    // Close the input stream
    if (m_format_ctx != NULL) {
        avformat_close_input(&m_format_ctx);
        avformat_free_context(m_format_ctx);
    }

    // Release the converted path parameters
    if (m_path_ref != NULL && m_path != NULL) {
        env->ReleaseStringUTFChars((jstring) m_path_ref, m_path);
        env->DeleteGlobalRef(m_path_ref);
    }

    // Tell the subclass to release its resources
    Release();
}
With the above, the basic decoder is fully encapsulated; a subclass only has to inherit it and implement the specified virtual functions to decode a video.
IV. Video playback
Video decoder
There are two important points to note here:
1. Video data transcoding
As we know, the decoded video data is in YUV format, while the screen needs RGBA to display it, so the video decoder needs an extra layer of data conversion.
Here the SwsContext tool in FFmpeg is used, and the conversion function is sws_scale; both are part of the swscale library.
sws_scale can both convert the pixel format and scale the width and height of the picture.
2. Declare the renderer
After conversion, the video frame data is RGBA and can be rendered to the phone screen. There are two ways to do this:
- first, render the data directly through the native window, in which case the picture cannot be further edited;
- second, render through OpenGL ES, which allows the picture to be edited.
This article uses the former; OpenGL ES rendering will be explained separately in a later article.
Create a new directory src/main/cpp/media/decoder/video and a new video decoder class v_decoder.
First look at the header file v_decoder.h:
// v_decoder.h
#ifndef LEARNVIDEO_V_DECODER_H
#define LEARNVIDEO_V_DECODER_H

#include "../base_decoder.h"
#include "../../render/video/video_render.h"
#include <jni.h>
#include <android/native_window_jni.h>
#include <android/native_window.h>

extern "C" {
#include <libavutil/imgutils.h>
#include <libswscale/swscale.h>
};

class VideoDecoder : public BaseDecoder {
private:
    const char *TAG = "VideoDecoder";

    // Target format of the video data
    const AVPixelFormat DST_FORMAT = AV_PIX_FMT_RGBA;

    // Holds the RGBA data converted from YUV
    AVFrame *m_rgb_frame = NULL;
    uint8_t *m_buf_for_rgb_frame = NULL;

    // Video format converter
    SwsContext *m_sws_ctx = NULL;

    // Video renderer
    VideoRender *m_video_render = NULL;

    // Target display width
    int m_dst_w;
    // Target display height
    int m_dst_h;

    /**
     * Initialize the renderer
     */
    void InitRender(JNIEnv *env);

    /**
     * Initialize the display buffer
     */
    void InitBuffer();

    /**
     * Initialize the video data converter
     */
    void InitSws();

public:
    VideoDecoder(JNIEnv *env, jstring path, bool for_synthesizer = false);
    ~VideoDecoder();

    void SetRender(VideoRender *render);

protected:
    AVMediaType GetMediaType() override {
        return AVMEDIA_TYPE_VIDEO;
    }

    /**
     * Whether loop decoding is needed
     */
    bool NeedLoopDecode() override;

    /**
     * Prepare the decoding environment
     * @note called on the decoding thread
     * @param env the JNI environment bound to the decoding thread
     */
    void Prepare(JNIEnv *env) override;

    /**
     * Render
     * @note called on the decoding thread
     * @param frame one frame of decoded YUV data, converted to RGBA here
     */
    void Render(AVFrame *frame) override;

    /**
     * Release callback
     */
    void Release() override;

    const char *const LogSpec() override {
        return "VIDEO";
    };
};

#endif //LEARNVIDEO_V_DECODER_H
Next, look at the v_decoder.cpp implementation, starting with the initialization code:
// v_decoder.cpp
VideoDecoder::VideoDecoder(JNIEnv *env, jstring path, bool for_synthesizer)
        : BaseDecoder(env, path, for_synthesizer) {
}

void VideoDecoder::Prepare(JNIEnv *env) {
    InitRender(env);
    InitBuffer();
    InitSws();
}
The constructor simply passes the related arguments on to the parent class base_decoder.
Next is the Prepare method, which the base class calls after the decoder has been initialized and which every subclass of base_decoder must implement.
// base_decoder.cpp
void BaseDecoder::Decode(std::shared_ptr<BaseDecoder> that) {
    // Irrelevant code omitted...

    that->InitFFMpegDecoder(env);
    that->AllocFrameBuffer();

    // The subclass initialization method is called here
    that->Prepare(env);

    that->LoopDecode();
    that->DoneDecode(env);

    // Irrelevant code omitted...
}
In Prepare, the InitRender call is skipped for now and covered in more detail later.
Let's look at the initialization related to the data format conversion.
- Initialize the data buffer:
// v_decoder.cpp
void VideoDecoder::InitBuffer() {
    m_rgb_frame = av_frame_alloc();
    // Get the buffer size
    int numBytes = av_image_get_buffer_size(DST_FORMAT, m_dst_w, m_dst_h, 1);
    // Allocate the memory
    m_buf_for_rgb_frame = (uint8_t *) av_malloc(numBytes * sizeof(uint8_t));
    // Attach the memory to m_rgb_frame, splitting it into planes and storing their addresses
    av_image_fill_arrays(m_rgb_frame->data, m_rgb_frame->linesize,
                         m_buf_for_rgb_frame, DST_FORMAT, m_dst_w, m_dst_h, 1);
}
First, an AVFrame is created with av_frame_alloc. Note that this call does not allocate the data buffer itself.
The required buffer size is then calculated with av_image_get_buffer_size, where:
- DST_FORMAT: the target pixel format, AV_PIX_FMT_RGBA
- m_dst_w: the width of the target picture
- m_dst_h: the height of the target picture (i.e. the actual size when displayed on screen, calculated later from the window size in the renderer)
Then a block of memory is actually allocated via av_malloc;
Finally, av_image_fill_arrays attaches this block of memory to the AVFrame, and the memory allocation is complete.
- Data conversion tool initialization
// v_decoder.cpp
void VideoDecoder::InitSws() {
    // Initialize the format conversion tool
    m_sws_ctx = sws_getContext(width(), height(), video_pixel_format(),
                               m_dst_w, m_dst_h, DST_FORMAT,
                               SWS_FAST_BILINEAR, NULL, NULL, NULL);
}
This is very simple: just pass in the width, height, and pixel format of the source picture and of the target picture.
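The helpers width(), height(), video_pixel_format() (and the time_base() used further below) are not listed in this article's excerpts; presumably they are small accessors in BaseDecoder that expose the codec and format context fields, roughly like this (see the project source for the real definitions):
// base_decoder.h (sketch of the assumed accessors, inside class BaseDecoder)
protected:
    int width() {
        return m_codec_ctx->width;
    }
    int height() {
        return m_codec_ctx->height;
    }
    AVPixelFormat video_pixel_format() {
        return m_codec_ctx->pix_fmt;
    }
    AVRational time_base() {
        return m_format_ctx->streams[m_stream_index]->time_base;
    }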
- Release the related resources
After decoding is finished, the parent class calls the subclass's Release method so the subclass can release its own resources.
// v_decoder.cpp
void VideoDecoder::Release() {
    LOGE(TAG, "[VIDEO] release")
    if (m_rgb_frame != NULL) {
        av_frame_free(&m_rgb_frame);
        m_rgb_frame = NULL;
    }
    if (m_buf_for_rgb_frame != NULL) {
        free(m_buf_for_rgb_frame);
        m_buf_for_rgb_frame = NULL;
    }
    if (m_sws_ctx != NULL) {
        sws_freeContext(m_sws_ctx);
        m_sws_ctx = NULL;
    }
    if (m_video_render != NULL) {
        m_video_render->ReleaseRender();
        m_video_render = NULL;
    }
}
Initialization and resource release are done; what remains is the renderer configuration.
The renderer
As mentioned above, there are generally two ways to render the picture, so we first define a renderer interface to make later extension easier.
Define the video renderer
Create a new directory src/main/cpp/media/render/video and create a header file video_render.h.
// video_render.h
#ifndef LEARNVIDEO_VIDEORENDER_H
#define LEARNVIDEO_VIDEORENDER_H

#include <stdint.h>
#include <jni.h>
#include "../../one_frame.h"

class VideoRender {
public:
    virtual void InitRender(JNIEnv *env, int video_width, int video_height, int *dst_size) = 0;
    virtual void Render(OneFrame *one_frame) = 0;
    virtual void ReleaseRender() = 0;
};

#endif //LEARNVIDEO_VIDEORENDER_H
This class is also pure virtual, similar to a Java interface.
It specifies only a few methods: initialization, rendering, and releasing resources.
Implement the native window renderer
Create a new directory src/main/cpp/media/render/video/native_render and create a new class native_render.
The native_render.h header file:
// native_render.h
#ifndef LEARNVIDEO_NATIVE_RENDER_H
#define LEARNVIDEO_NATIVE_RENDER_H

#include <android/native_window.h>
#include <android/native_window_jni.h>
#include <jni.h>
#include "../video_render.h"
#include "../../../../utils/logger.h"

extern "C" {
#include <libavutil/mem.h>
};

class NativeRender: public VideoRender {
private:
    const char *TAG = "NativeRender";

    // Global reference to the Surface; required, otherwise it cannot be used in the thread
    jobject m_surface_ref = NULL;

    // Buffer for the data output to the screen
    ANativeWindow_Buffer m_out_buffer;

    // Native window
    ANativeWindow *m_native_window = NULL;

    // Target display width
    int m_dst_w;
    // Target display height
    int m_dst_h;

public:
    NativeRender(JNIEnv *env, jobject surface);
    ~NativeRender();

    void InitRender(JNIEnv *env, int video_width, int video_height, int *dst_size) override;
    void Render(OneFrame *one_frame) override;
    void ReleaseRender() override;
};

#endif //LEARNVIDEO_NATIVE_RENDER_H
As you can see, the renderer holds a Surface reference; the Surface is something we are very familiar with, having used it for rendering in the previous series of articles.
In addition, there is a native window, ANativeWindow. Once the Surface is bound to the ANativeWindow, rendering to the Surface can be done through the native window.
Now look at the renderer implementation, native_render.cpp.
- Initialization
// native_render.cpp
NativeRender::NativeRender(JNIEnv *env, jobject surface) {
    m_surface_ref = env->NewGlobalRef(surface);
}

NativeRender::~NativeRender() {}

void NativeRender::InitRender(JNIEnv *env, int video_width, int video_height, int *dst_size) {
    // Initialize the window
    m_native_window = ANativeWindow_fromSurface(env, m_surface_ref);

    // Width and height of the drawable area
    int windowWidth = ANativeWindow_getWidth(m_native_window);
    int windowHeight = ANativeWindow_getHeight(m_native_window);

    // Calculate the target width and height of the video
    m_dst_w = windowWidth;
    m_dst_h = m_dst_w * video_height / video_width;
    if (m_dst_h > windowHeight) {
        m_dst_h = windowHeight;
        m_dst_w = windowHeight * video_width / video_height;
    }
    LOGE(TAG, "windowW: %d, windowH: %d, dstVideoW: %d, dstVideoH: %d",
         windowWidth, windowHeight, m_dst_w, m_dst_h)

    // Set the size (in pixels) of the window buffer
    ANativeWindow_setBuffersGeometry(m_native_window, windowWidth,
                                     windowHeight, WINDOW_FORMAT_RGBA_8888);

    // Return the target width and height to the caller (the decoder)
    dst_size[0] = m_dst_w;
    dst_size[1] = m_dst_h;
}
Focus on the InitRender method:
Bind the Surface to the native window with ANativeWindow_fromSurface;
Get the width and height of the Surface display area with ANativeWindow_getWidth and ANativeWindow_getHeight;
Then, based on the original video width and height (video_width, video_height) and the display area's width and height, scale the picture to compute the final display size, and pass it back to the decoder;
Once the video decoder v_decoder has the target width and height, it can size its data conversion buffer accordingly;
Finally, set the native window buffer size with ANativeWindow_setBuffersGeometry, completing the initialization.
- Rendering
There are two important native methods here:
ANativeWindow_lock locks the window and retrieves the output buffer m_out_buffer;
ANativeWindow_unlockAndPost unlocks the window and posts the buffered data to the screen.
// native_render.cpp
void NativeRender::Render(OneFrame *one_frame) {
// Lock the window
ANativeWindow_lock(m_native_window, &m_out_buffer, NULL);
uint8_t *dst = (uint8_t *) m_out_buffer.bits;
// Get the stride: the number of pixels one row of the buffer can hold * 4 (RGBA takes 4 bytes per pixel)
int dstStride = m_out_buffer.stride * 4;
int srcStride = one_frame->line_size;
// Since the stride for window is different from the stride for frame, it needs to be copied line by line
for (int h = 0; h < m_dst_h; h++) {
memcpy(dst + h * dstStride, one_frame->data + h * srcStride, srcStride);
}
// Release window
ANativeWindow_unlockAndPost(m_native_window);
}
The rendering process looks complicated mainly because of the "stride" concept, which is the width in bytes of one row of a frame's data.
For example, the data format here is RGBA; if one row holds 8 pixels, the stride is 8 * 4 = 32.
Why is this row-by-row copy needed? Because the stride of the native window may differ from the stride of the video frame data; if the video data were copied across in one block, the rows would be misaligned and the picture would come out garbled.
Therefore, the data must be copied row by row (memcpy), using the native window's dstStride and the video frame's srcStride.
Calling the renderer
Finally, the renderer is called in the video decoder v_decoder:
// v_decoder.cpp
void VideoDecoder::SetRender(VideoRender *render) {
    this->m_video_render = render;
}

void VideoDecoder::InitRender(JNIEnv *env) {
    if (m_video_render != NULL) {
        int dst_size[2] = {-1, -1};
        m_video_render->InitRender(env, width(), height(), dst_size);
        m_dst_w = dst_size[0];
        m_dst_h = dst_size[1];
        if (m_dst_w == -1) {
            m_dst_w = width();
        }
        if (m_dst_h == -1) {
            m_dst_h = height();
        }
        LOGI(TAG, "dst %d, %d", m_dst_w, m_dst_h)
    } else {
        LOGE(TAG, "Init render error, you should call SetRender first!")
    }
}

void VideoDecoder::Render(AVFrame *frame) {
    // Convert the YUV frame to RGBA
    sws_scale(m_sws_ctx, frame->data, frame->linesize, 0,
              height(), m_rgb_frame->data, m_rgb_frame->linesize);
    // Wrap the RGBA data in a OneFrame and hand it to the renderer
    OneFrame *one_frame = new OneFrame(m_rgb_frame->data[0], m_rgb_frame->linesize[0],
                                       frame->pts, time_base(), NULL, false);
    m_video_render->Render(one_frame);
}
First, the renderer is set into the video decoder;
second, the renderer's InitRender method is called to initialize it and obtain the target display width and height;
finally, the renderer's Render method is called to render.
OneFrame is a custom class used to wrap everything related to one frame of data. For details, see the project source.
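For readers who do not want to jump to the source right away, here is a simplified sketch of what OneFrame roughly contains, inferred from how it is constructed and used above; the real class in the project may carry more fields and recycling logic:
// one_frame.h (simplified sketch, not the project's exact definition)
#ifndef LEARNVIDEO_ONE_FRAME_SKETCH_H
#define LEARNVIDEO_ONE_FRAME_SKETCH_H

#include <cstdint>
extern "C" {
#include <libavutil/rational.h>
};

class OneFrame {
public:
    uint8_t *data;        // Pointer to the frame data (RGBA here, PCM for audio)
    int line_size;        // Bytes per row (video) or data size (audio)
    int64_t pts;          // Presentation timestamp of this frame
    AVRational time_base; // Time base used to interpret the pts

    // ext_data and auto_recycle mirror the extra arguments seen in the call above;
    // they are placeholders here and unused in this sketch
    OneFrame(uint8_t *data, int line_size, int64_t pts, AVRational time_base,
             uint8_t *ext_data = nullptr, bool auto_recycle = true)
            : data(data), line_size(line_size), pts(pts), time_base(time_base) {}
};

#endif //LEARNVIDEO_ONE_FRAME_SKETCH_H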
Writing the player
So far we have completed:
- the encapsulation of the basic decoder -> the implementation of the video decoder;
- the definition of the renderer -> the implementation of the native window renderer.
Finally, we just need to combine them to play the video.
Create a new Player class in the src/main/cpp/media directory, as follows:
// player.h
#ifndef LEARNINGVIDEO_PLAYER_H
#define LEARNINGVIDEO_PLAYER_H

#include "decoder/video/v_decoder.h"

class Player {
private:
    VideoDecoder *m_v_decoder;
    VideoRender *m_v_render;

public:
    Player(JNIEnv *jniEnv, jstring path, jobject surface);
    ~Player();

    void play();
    void pause();
};

#endif //LEARNINGVIDEO_PLAYER_H
The player holds a video decoder and a video renderer, and exposes play and pause methods.
// player.cpp
#include "player.h"
#include "render/video/native_render/native_render.h"

Player::Player(JNIEnv *jniEnv, jstring path, jobject surface) {
    m_v_decoder = new VideoDecoder(jniEnv, path);
    m_v_render = new NativeRender(jniEnv, surface);
    m_v_decoder->SetRender(m_v_render);
}

Player::~Player() {
    // There is no need to delete the member pointers here:
    // the decoding thread in BaseDecoder already wraps the decoder in a smart pointer,
    // so it is released automatically.
}

void Player::play() {
    if (m_v_decoder != NULL) {
        m_v_decoder->GoOn();
    }
}

void Player::pause() {
    if (m_v_decoder != NULL) {
        m_v_decoder->Pause();
    }
}
The code is simple: associate the decoder with the renderer.
Add the source code to the compilation
Although the functional modules above have been written, the build system does not add them to the compilation automatically. For the C++ code to be compiled, it has to be configured manually in CMakeLists.txt, which sits in the same directory as the default native-lib.cpp, as listed below.
# CMakeLists.txt

# Irrelevant configuration omitted
# ......

add_library( # Sets the name of the library.
        native-lib

        # Sets the library as a shared library.
        SHARED

        # Provides a relative path to your source file(s).
        native-lib.cpp

        # Tools
        ${CMAKE_SOURCE_DIR}/utils/logger.h
        ${CMAKE_SOURCE_DIR}/utils/timer.c

        # Player
        ${CMAKE_SOURCE_DIR}/media/player.cpp

        # Decoder
        ${CMAKE_SOURCE_DIR}/media/one_frame.h
        ${CMAKE_SOURCE_DIR}/media/decoder/i_decoder.h
        ${CMAKE_SOURCE_DIR}/media/decoder/decode_state.h
        ${CMAKE_SOURCE_DIR}/media/decoder/base_decoder.cpp
        ${CMAKE_SOURCE_DIR}/media/decoder/video/v_decoder.cpp

        # Renderer
        ${CMAKE_SOURCE_DIR}/media/render/video/video_render.h
        ${CMAKE_SOURCE_DIR}/media/render/video/native_render/native_render.cpp
        )

# Irrelevant configuration omitted
# ......
If a class has only a .h header, list only the .h file; if it has both a header and a .cpp implementation, list only the .cpp file.
Note that every newly created class needs to be added to CMakeLists.txt, otherwise the code may fail to find the related headers and will not compile.
Write the JNI interface
The next step is to expose the player to the Java layer through the JNI interface file native-lib.cpp.
Before writing the JNI interface, first write the corresponding calls in FFmpegActivity:
// FFmpegActivity.kt
class FFmpegActivity: AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_ffmpeg_info)
        tv.text = ffmpegInfo()
        initSfv()
    }

    private fun initSfv() {
        sfv.holder.addCallback(object: SurfaceHolder.Callback {
            override fun surfaceChanged(holder: SurfaceHolder, format: Int, width: Int, height: Int) {}
            override fun surfaceDestroyed(holder: SurfaceHolder) {}
            override fun surfaceCreated(holder: SurfaceHolder) {
                if (player == null) {
                    player = createPlayer(path, holder.surface)
                    play(player!!)
                }
            }
        })
    }

    //------------ JNI related interface methods ----------------------
    private external fun ffmpegInfo(): String
    private external fun createPlayer(path: String, surface: Surface): Int
    private external fun play(player: Int)
    private external fun pause(player: Int)

    companion object {
        init {
            System.loadLibrary("native-lib")
        }
    }
}
The interface is simple:
- createPlayer(path: String, surface: Surface): Int: creates the player and returns the address of the player object
- play(player: Int): starts playback
- pause(player: Int): pauses playback
The player is created once the SurfaceView is ready, in surfaceCreated.
The page layout XML looks like this:
<android.support.constraint.ConstraintLayout
xmlns:android="http://schemas.android.com/apk/res/android"
android:layout_width="match_parent" android:layout_height="match_parent">
<ScrollView
android:layout_width="match_parent"
android:layout_height="match_parent">
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:orientation="vertical">
<SurfaceView android:id="@+id/sfv"
android:layout_width="match_parent"
android:layout_height="200dp" />
<TextView android:id="@+id/tv"
android:layout_width="match_parent"
android:layout_height="match_parent"/>
</LinearLayout>
</ScrollView>
</android.support.constraint.ConstraintLayout>
Next, write the corresponding JNI implementations for the three interfaces above.
// native-lib.cpp
#include <jni.h>
#include <string>
#include <unistd.h>
#include "media/player.h"

extern "C" {

JNIEXPORT jint JNICALL
Java_com_cxp_learningvideo_FFmpegActivity_createPlayer(JNIEnv *env,
                                                       jobject /* this */,
                                                       jstring path,
                                                       jobject surface) {
    Player *player = new Player(env, path, surface);
    return (jint) player;
}

JNIEXPORT void JNICALL
Java_com_cxp_learningvideo_FFmpegActivity_play(JNIEnv *env,
                                               jobject /* this */,
                                               jint player) {
    Player *p = (Player *) player;
    p->play();
}

JNIEXPORT void JNICALL
Java_com_cxp_learningvideo_FFmpegActivity_pause(JNIEnv *env,
                                                jobject /* this */,
                                                jint player) {
    Player *p = (Player *) player;
    p->pause();
}

}
Very simple, and I believe everyone can follow it: a player object is created and its pointer is returned to the Java layer to be saved; later, play and pause pass this player pointer back down to the JNI layer, which performs the actual operations.
V. Summary
That is a lot of code, but if you have read the earlier native hard-decoding series, it is actually quite easy to understand.
Finally, a quick summary:
- Initialization: initialize the decoder through the interfaces provided by FFmpeg
  - the input stream context AVFormatContext
  - the decoder context AVCodecContext
  - the decoder AVCodec
  - the data buffers AVPacket (holds the data to be decoded) and AVFrame (holds the decoded data)
- Decoding: decode through the decoding interfaces provided by FFmpeg
  - av_read_frame reads the data to be decoded into the AVPacket
  - avcodec_send_packet sends the AVPacket to the decoder
  - avcodec_receive_frame reads the decoded data into the AVFrame
- Transcoding and scaling: convert YUV to RGBA through the conversion interfaces provided by FFmpeg
  - sws_getContext initializes the conversion tool SwsContext
  - sws_scale performs the data conversion
- Rendering: render the video data to the screen through the interfaces provided by Android
  - ANativeWindow_fromSurface binds the Surface to the native window
  - ANativeWindow_getWidth / ANativeWindow_getHeight get the Surface width and height
  - ANativeWindow_setBuffersGeometry sets the screen buffer size
  - ANativeWindow_lock locks the window and gets the display buffer
  - copy the data into the buffer row by row (memcpy), according to the stride
  - ANativeWindow_unlockAndPost unlocks the window and displays the frame