Introduction
As of this post, the cross-platform player will not be based on Qt. Why? Qt is not our main area of study, so instead we will learn how to build a true cross-platform player SDK on top of basic libraries such as FFmpeg.
The plan is to redevelop the core of the player based on ffplay. Why a redevelopment? Because ffplay is very difficult to extend: essentially all of its logic lives in a single C file. To solve this extensibility problem, I came up with a design for secondary development on top of ffplay (ijkplayer, covered later for mobile, takes the same approach). Since this is secondary development, the result will be more capable than the original, so I am planning the following features:
Android
- cpu: ARMv7a, ARM64v8a
- API: Similar to MediaPlayer
- Video rendering: OpenGL ES
- Audio rendering: OpenSL ES
- Hardware decoding: NDK MediaCodec (Android 5.0+)
- Variable-speed playback
- Clip a segment and save it as a GIF
iOS
- cpu: armv7, arm64
- API: Similar to MediaPlayer.framework
- Video rendering: OpenGL ES
- Audio output: AudioQueue, AudioUnit
- Hardware Decoder: VideoToolbox (iOS 8+)
- Variable-speed playback
- Clip a segment and save it as a GIF
PC
- SDL
At present, the PC-side code is almost finished; you can already see the project structure and the effect.
The code will be open-sourced later, once it is more complete ~
The articles that follow are roughly split into a first part analyzing the source code and a second part putting the approach into practice.
Below we mainly analyze the main framework of ffplay.
What is ffplay
ffplay is FFmpeg's own cross-platform player, written in C. When you compile FFmpeg with the --enable-ffplay option, an ffplay executable is generated in output/bin/, and you can play a media file with ffplay xxx.mp4. It is essentially a player implemented with FFmpeg + SDL. In fact, the famous ijkplayer is a secondary development based on ffplay.c, so mastering the principles of ffplay is very helpful for developing our own player.
ffplay framework analysis
At present, ffplay is mainly built from five threads: the main thread (UI), read_thread, audio_thread, video_thread, and the SDL audio-rendering thread (sdl_audio_callback). The following figure shows the overall flow:
stream_open: player initialization
stream_open does four things internally (a trimmed sketch follows this list):
- frame_queue_init: initialize the queues of decoded audio/video/subtitle data, each with a fixed maximum size
- packet_queue_init: initialize the queues of demuxed audio/video/subtitle packets waiting to be decoded
- init_clock: initialize the audio/video/external clocks
- SDL_CreateThread: use SDL's thread wrapper to create the thread (read_thread) that reads the demuxed media data
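A trimmed sketch of stream_open following these four steps (error handling and many fields are omitted; constants such as VIDEO_PICTURE_QUEUE_SIZE come from ffplay.c):

```c
static VideoState *stream_open(const char *filename, AVInputFormat *iformat)
{
    VideoState *is = av_mallocz(sizeof(VideoState));
    if (!is)
        return NULL;
    is->filename = av_strdup(filename);
    is->iformat  = iformat;

    /* 1. Frame queues for decoded data, with fixed maximum sizes */
    if (frame_queue_init(&is->pictq, &is->videoq, VIDEO_PICTURE_QUEUE_SIZE, 1) < 0 ||
        frame_queue_init(&is->subpq, &is->subtitleq, SUBPICTURE_QUEUE_SIZE, 0) < 0 ||
        frame_queue_init(&is->sampq, &is->audioq, SAMPLE_QUEUE_SIZE, 1) < 0)
        goto fail;

    /* 2. Packet queues for demuxed data waiting to be decoded */
    if (packet_queue_init(&is->videoq) < 0 ||
        packet_queue_init(&is->audioq) < 0 ||
        packet_queue_init(&is->subtitleq) < 0)
        goto fail;

    /* 3. Audio / video / external clocks */
    init_clock(&is->vidclk, &is->videoq.serial);
    init_clock(&is->audclk, &is->audioq.serial);
    init_clock(&is->extclk, &is->extclk.serial);

    /* 4. Start the demuxing thread */
    is->continue_read_thread = SDL_CreateCond();
    is->read_tid = SDL_CreateThread(read_thread, "read_thread", is);
    if (!is->read_tid)
        goto fail;
    return is;

fail:
    stream_close(is);
    return NULL;
}
```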
main: the main thread
- Parse the command-line parameters and call stream_open to initialize everything the player needs, which in turn starts the read_thread thread
- Render the video in the main thread via SDL
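Schematically, the main thread looks like this (a simplified sketch rather than ffplay's literal main(): option parsing and cleanup are omitted, and parse_options stands in for ffplay's cmdutils-based parsing):

```c
int main(int argc, char **argv)
{
    VideoState *is;

    /* parse command-line options into globals (input_filename, etc.) --
       parse_options here is a stand-in for ffplay's cmdutils helpers */
    parse_options(argc, argv);

    /* SDL provides the video/audio output used by the player */
    if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER))
        exit(1);

    /* initialize queues/clocks and spawn read_thread */
    is = stream_open(input_filename, file_iformat);
    if (!is)
        exit(1);

    /* event loop: handle key/mouse events and refresh the video via SDL */
    event_loop(is);
    return 0;
}
```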
read_thread: the demuxing thread
- Call avformat_open_input to open the media file and find the corresponding audio/video/subtitle stream information
- Open the matching decoder for each audio/video/subtitle stream and start a thread for each to decode
- Loop on av_read_frame to read the next packet to be decoded, then call packet_queue_put to save the AVPacket into the corresponding queue
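A condensed sketch of read_thread (seek, pause, attached-picture, and buffering logic are omitted; st_index would be filled via av_find_best_stream):

```c
static int read_thread(void *arg)
{
    VideoState *is = arg;
    AVPacket pkt1, *pkt = &pkt1;
    int st_index[AVMEDIA_TYPE_NB] = { 0 };  /* filled via av_find_best_stream (elided) */

    /* open the input and read the stream information */
    if (avformat_open_input(&is->ic, is->filename, is->iformat, NULL) < 0)
        return -1;
    avformat_find_stream_info(is->ic, NULL);

    /* open a decoder per stream; each call also spawns the decoder thread */
    stream_component_open(is, st_index[AVMEDIA_TYPE_AUDIO]);
    stream_component_open(is, st_index[AVMEDIA_TYPE_VIDEO]);
    stream_component_open(is, st_index[AVMEDIA_TYPE_SUBTITLE]);

    for (;;) {
        if (is->abort_request)
            break;
        if (av_read_frame(is->ic, pkt) < 0)
            break;                              /* EOF or read error */
        /* route the packet into the queue of its stream */
        if (pkt->stream_index == is->audio_stream)
            packet_queue_put(&is->audioq, pkt);
        else if (pkt->stream_index == is->video_stream)
            packet_queue_put(&is->videoq, pkt);
        else if (pkt->stream_index == is->subtitle_stream)
            packet_queue_put(&is->subtitleq, pkt);
        else
            av_packet_unref(pkt);               /* not a stream we play */
    }
    return 0;
}
```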
audio_decoder / video_decoder / subtitle_decoder: the decoder threads
- Read the corresponding AVPacket from the PacketQueue, feed it to the matching decoder, and store the decoded AVFrame in the FrameQueue
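The shape of a decoder thread, using video as the example (condensed from ffplay's video_thread; the audio and subtitle threads are analogous):

```c
static int video_thread(void *arg)
{
    VideoState *is = arg;
    AVFrame *frame = av_frame_alloc();
    AVRational tb = is->video_st->time_base;
    AVRational frame_rate = av_guess_frame_rate(is->ic, is->video_st, NULL);

    for (;;) {
        /* pulls packets from is->videoq internally and decodes one frame */
        int ret = decoder_decode_frame(&is->viddec, frame, NULL);
        if (ret < 0)
            break;                              /* playback aborted */
        if (!ret)
            continue;                           /* no frame produced this round */

        double duration = (frame_rate.num && frame_rate.den ?
                           av_q2d((AVRational){frame_rate.den, frame_rate.num}) : 0);
        double pts = (frame->pts == AV_NOPTS_VALUE) ? NAN : frame->pts * av_q2d(tb);

        /* hand the frame to the FrameQueue for the main thread to render */
        queue_picture(is, frame, pts, duration, frame->pkt_pos, is->viddec.pkt_serial);
        av_frame_unref(frame);
    }
    av_frame_free(&frame);
    return 0;
}
```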
audioqueue_thread -> sdl_audio_callback: the audio render thread
- Copy PCM data from the decoded AVFrames into the SDL audio buffer for playback
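A condensed sdl_audio_callback: SDL calls it whenever its audio device needs len more bytes, and the callback copies PCM out of the current decoded frame, pulling and resampling a new frame via audio_decode_frame when the current one is exhausted (simplified from ffplay.c):

```c
static void sdl_audio_callback(void *opaque, Uint8 *stream, int len)
{
    VideoState *is = opaque;

    while (len > 0) {
        if (is->audio_buf_index >= (int)is->audio_buf_size) {
            /* current frame fully consumed: decode/resample the next one */
            int audio_size = audio_decode_frame(is);
            if (audio_size < 0) {
                /* on error, output silence */
                is->audio_buf = NULL;
                is->audio_buf_size = SDL_AUDIO_MIN_BUFFER_SIZE;
            } else {
                is->audio_buf_size = audio_size;
            }
            is->audio_buf_index = 0;
        }
        int len1 = is->audio_buf_size - is->audio_buf_index;
        if (len1 > len)
            len1 = len;
        if (is->audio_buf)
            memcpy(stream, is->audio_buf + is->audio_buf_index, len1);
        else
            memset(stream, 0, len1);            /* silence on decode failure */
        len -= len1;
        stream += len1;
        is->audio_buf_index += len1;
    }
    /* remember how much of the frame is still unplayed, for clock updates */
    is->audio_write_buf_size = is->audio_buf_size - is->audio_buf_index;
}
```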
Internal core functions
Audio and video synchronization mechanism (a sketch follows this list)
- Synchronize to the audio clock (the default)
- Synchronize to the video clock
- Use the external clock as the master clock
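Selecting the master clock is only a small amount of code. The following pair is condensed from ffplay.c; every stream then adjusts its playback against get_master_clock():

```c
static int get_master_sync_type(VideoState *is) {
    if (is->av_sync_type == AV_SYNC_VIDEO_MASTER) {
        if (is->video_st)
            return AV_SYNC_VIDEO_MASTER;
        else
            return AV_SYNC_AUDIO_MASTER;   /* fall back if there is no video */
    } else if (is->av_sync_type == AV_SYNC_AUDIO_MASTER) {
        if (is->audio_st)
            return AV_SYNC_AUDIO_MASTER;
        else
            return AV_SYNC_EXTERNAL_CLOCK; /* fall back if there is no audio */
    } else {
        return AV_SYNC_EXTERNAL_CLOCK;
    }
}

/* the value every other stream synchronizes against */
static double get_master_clock(VideoState *is)
{
    switch (get_master_sync_type(is)) {
    case AV_SYNC_VIDEO_MASTER: return get_clock(&is->vidclk);
    case AV_SYNC_AUDIO_MASTER: return get_clock(&is->audclk);
    default:                   return get_clock(&is->extclk);
    }
}
```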
Audio processing
- Volume control
- Mute control
- PCM normalization
Video processing
- YUV -> RGB conversion
- Scaling
Playback control
- Play, pause, stop, fast forward, rewind, frame-by-frame playback, mute
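As a taste of how lightweight these controls are, here is stream_toggle_pause, essentially verbatim from ffplay.c with added comments: pausing is just a clock adjustment plus flipping a flag on the player and all three clocks.

```c
static void stream_toggle_pause(VideoState *is)
{
    if (is->paused) {
        /* compensate for the time spent paused so the video clock stays valid */
        is->frame_timer += av_gettime_relative() / 1000000.0 - is->vidclk.last_updated;
        if (is->read_pause_return != AVERROR(ENOSYS))
            is->vidclk.paused = 0;
        set_clock(&is->vidclk, get_clock(&is->vidclk), is->vidclk.serial);
    }
    set_clock(&is->extclk, get_clock(&is->extclk), is->extclk.serial);
    /* flip the pause flag on the player and on all three clocks */
    is->paused = is->audclk.paused = is->vidclk.paused = is->extclk.paused = !is->paused;
}
```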
Data structure analysis
VideoState: the player's overall management structure
VideoState encapsulates the global state of the player. Its fields are as follows:
typedef struct VideoState {
SDL_Thread *read_tid; // Read the unmarshaled thread handle
AVInputFormat *iformat; // The input format, if one was specified; points to the demuxer
int abort_request; // If =1, request to quit playing
int force_refresh; // if =1, you need to refresh the screen immediately
int paused; // Pause when =1, play when =0
int last_paused; // Pause/Play status temporarily
int queue_attachments_req; // Request the cover of mp3, AAC and other audio files
int seek_req; // Identify a seek request
int seek_flags; // seek flags, such as AVSEEK_FLAG_BYTE
int64_t seek_pos; // Request the target position of seek (current position + increment)
int64_t seek_rel; // This time seek position increment
int read_pause_return; // Return value of av_read_pause(), used when pausing network streams
AVFormatContext *ic; // Unpack to get format context
int realtime; // =1 indicates the real-time stream
Clock audclk; // Audio clock
Clock vidclk; // Video clock
Clock extclk; // External clock
FrameQueue pictq; // Decoded video Frame queue
FrameQueue subpq; // Decoded subtitle Frame queue
FrameQueue sampq; // Decoded audio Frame queue
Decoder auddec; // Audio decoder
Decoder viddec; // Video decoder
Decoder subdec; // Subtitle decoder
int audio_stream; // Audio stream index
int av_sync_type; // Audio and video synchronization type. By default, video is synchronized to audio
double audio_clock; // PTS of current audio frame + Duration of current frame
int audio_clock_serial; // Play sequence, seek can change this value
// The following four parameters are not used in audio master synchronization mode
double audio_diff_cum; // used for AV difference average computation
double audio_diff_avg_coef;
double audio_diff_threshold;
int audio_diff_avg_count;
// end
AVStream *audio_st; // Audio stream information
PacketQueue audioq; // Audio packet queue
int audio_hw_buf_size; // The size of the SDL audio buffer in bytes
// Points to the frame of audio data to be played next; the region it points to will be
// copied into the SDL audio buffer. If the data was resampled it points to audio_buf1,
// otherwise it points to the data inside the decoded frame.
uint8_t *audio_buf;
uint8_t *audio_buf1; // Points to the resampled data
unsigned int audio_buf_size; // The size of a frame of audio data (pointed by audio_buf) to be played
unsigned int audio_buf1_size; // The actual size of the audio buffer audio_buf1
int audio_buf_index; // Index into audio_buf of the next byte to be copied into the SDL audio buffer
// The amount of data in the current audio frame that has not yet been copied into the SDL audio buffer:
// audio_buf_size = audio_buf_index + audio_write_buf_size
int audio_write_buf_size;
int audio_volume; // Volume
int muted; // =1 muted, =0 not muted
struct AudioParams audio_src; // Audio frame parameters
#if CONFIG_AVFILTER
struct AudioParams audio_filter_src;
#endif
struct AudioParams audio_tgt; // Audio parameters supported by SDL, resampling: audio_src->audio_tgt
struct SwrContext *swr_ctx; // Audio resampling context
int frame_drops_early; // Discard video packet count
int frame_drops_late; // Discard the video frame count
enum ShowMode {
SHOW_MODE_NONE = -1, // No display
SHOW_MODE_VIDEO = 0, // Display the video
SHOW_MODE_WAVES, // Display the audio waveform
SHOW_MODE_RDFT, // Display the frequency spectrum (RDFT)
SHOW_MODE_NB
} show_mode;
// Audio waveform display is used
int16_t sample_array[SAMPLE_ARRAY_SIZE]; // Sample array
int sample_array_index; // Sampling index
int last_i_start; // Start index for the waveform display
RDFTContext *rdft; // RDFT context for the spectrum display
int rdft_bits; // log2 of the RDFT size
FFTSample *rdft_data; // RDFT sample buffer
int xpos;
double last_vis_time;
SDL_Texture *vis_texture; // Audio visualization texture
SDL_Texture *sub_texture; // Subtitle texture
SDL_Texture *vid_texture; // Video texture
int subtitle_stream; // Subtitle stream index
AVStream *subtitle_st; // Subtitle stream information
PacketQueue subtitleq; // Subtitle packet queue
double frame_timer; // Record the time when the last frame is played
double frame_last_returned_time; // Last return time
double frame_last_filter_delay; // Last filter delay
int video_stream; // Video stream index
AVStream *video_st; // Video stream information
PacketQueue videoq; // Video queue
double max_frame_duration; // Maximum frame duration; a PTS jump above this is treated as a timestamp discontinuity
struct SwsContext *img_convert_ctx; // Change the video size format
struct SwsContext *sub_convert_ctx; // Subtitle size format change
int eof; // Whether reading is complete
char *filename; // File name
int width, height, xleft, ytop; // Width, height, x start, y start
int step; // =1 step playback mode, =0 other modes
#if CONFIG_AVFILTER
int vfilter_idx;
AVFilterContext *in_video_filter; // the first filter in the video chain
AVFilterContext *out_video_filter; // the last filter in the video chain
AVFilterContext *in_audio_filter; // the first filter in the audio chain
AVFilterContext *out_audio_filter; // the last filter in the audio chain
AVFilterGraph *agraph; // audio filter graph
#endif
// Keep the stream index of the most recent audio, video, and subtitle streams
int last_video_stream, last_audio_stream, last_subtitle_stream;
SDL_cond *continue_read_thread; // Condition variable used to wake the read thread when it sleeps after the queues are full
} VideoState;
Clock: the clock structure
Clock mainly encapsulates the timestamps of the audio/video/external clocks. Its structure is as follows:
// The system clock is defined by av_gettime_relative()
typedef struct Clock {
double pts; // Clock base: the display timestamp of the frame currently being played; after playing, that frame becomes the previous frame
// The difference between the clock's pts and the system time at the moment it was last updated
double pts_drift; // clock base minus time at which we updated the clock
double last_updated; // System time when this clock (e.g. the video clock) was last updated
double speed; // Clock speed control, used to control playback speed
// A play sequence is one run of playback; each seek operation starts a new sequence
int serial; // The clock is based on packets carrying this serial
int paused; // =1 indicates the paused state
int *queue_serial; /* Pointer to the serial of the current packet queue, used for obsolete clock detection */
} Clock;
The main API calls are as follows:
// Initialize the clock; queue_serial points to the play serial of the corresponding packet queue
void init_clock(Clock *c, int *queue_serial);
// Set the clock to the given timestamp
void set_clock(struct Clock *c, double pts, int serial);
// Get the current value of the clock
double get_clock(struct Clock *c);
// Get the master timestamp based on the synchronization type
int get_master_sync_type(struct VideoState *is);
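For reference, the core of the clock is only a few lines. The following set_clock/get_clock bodies are taken from ffplay.c, lightly commented: a clock stores pts_drift = pts - system time, so its current value can be extrapolated without updating it on every tick.

```c
static void set_clock_at(Clock *c, double pts, int serial, double time)
{
    c->pts = pts;
    c->last_updated = time;
    c->pts_drift = c->pts - time;  /* pts relative to the system clock */
    c->serial = serial;
}

static void set_clock(Clock *c, double pts, int serial)
{
    double time = av_gettime_relative() / 1000000.0;
    set_clock_at(c, pts, serial, time);
}

static double get_clock(Clock *c)
{
    if (*c->queue_serial != c->serial)
        return NAN;                /* stale clock after a seek */
    if (c->paused) {
        return c->pts;
    } else {
        double time = av_gettime_relative() / 1000000.0;
        /* extrapolate from the last update, honoring the speed factor */
        return c->pts_drift + time - (time - c->last_updated) * (1.0 - c->speed);
    }
}
```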
PacketQueue: the AVPacket queue
The queue is designed with the following ideas:
- Thread-safe, with mutual exclusion, wait, and wake-up
- Statistics on the number of cached packets
- Statistics on the total byte size of cached packets
- Cumulative playable duration of the cached packets
- Basic put/get/release operations
PacketQueue is mainly used to store and manage the demuxed (not yet decoded) audio/video/subtitle data. Its structure is as follows:
typedef struct MyAVPacketList {
AVPacket pkt; // The demuxed packet data
struct MyAVPacketList *next; // Next node
int serial; // Play the sequence
} MyAVPacketList;
typedef struct PacketQueue {
MyAVPacketList *first_pkt, *last_pkt; // Head and tail pointers of the queue
int nb_packets; // Number of packets, i.e. number of queue elements
int size; // Total data size of all elements in the queue
int64_t duration; // The duration of data playback for all elements of the queue
int abort_request; // The user exits the request flag
int serial; // Play serial; same meaning as MyAVPacketList.serial, but updated at a slightly different point
SDL_mutex *mutex; // Mutex protecting the queue
SDL_cond *cond; // Condition variable used for wait/wake-up
AVPacket *flush_pkt; // Flush packet, enqueued to mark the start of a new play sequence (e.g. after a seek)
} PacketQueue;
Main API calls:
/** * Initializes the values of each field * @param q * @return */
int packet_queue_init(struct PacketQueue *q);
/** * Destroy the queue and free its memory * @param q */
void packet_queue_destroy(struct PacketQueue *q);
/** * start queue * @param q */
void packet_queue_start(struct PacketQueue *q);
/** * Abort the queue * @param q */
void packet_queue_abort(struct PacketQueue *q);
/** * Fetch a node from the queue * @param q * @param pkt * @param block whether to block when the queue is empty * @param serial * @return */
int packet_queue_get(struct PacketQueue *q, AVPacket *pkt, int block, int *serial);
/** * Put a node into the queue * @param q * @param pkt * @return */
int packet_queue_put(struct PacketQueue *q, AVPacket *pkt);
/** * Put an empty (null) packet, used to flush the decoder * @param q * @param stream_index * @return */
int packet_queue_put_nullpacket(struct PacketQueue *q, int stream_index);
/** * Clear existing nodes * @param q */
void packet_queue_flush(struct PacketQueue *q);
/** * Put a node into the queue without locking (internal; the caller must hold the queue lock) * @param q * @param pkt * @return */
int packet_queue_put_private(struct PacketQueue *q, AVPacket *pkt);
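To see the design ideas above in action, here is the consumer side, packet_queue_get, condensed from ffplay.c: it demonstrates the mutex/condition pattern, the statistics bookkeeping, and the blocking/non-blocking modes.

```c
static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block, int *serial)
{
    MyAVPacketList *pkt1;
    int ret;

    SDL_LockMutex(q->mutex);
    for (;;) {
        if (q->abort_request) { ret = -1; break; }

        pkt1 = q->first_pkt;
        if (pkt1) {
            /* unlink the head node and update the statistics */
            q->first_pkt = pkt1->next;
            if (!q->first_pkt)
                q->last_pkt = NULL;
            q->nb_packets--;
            q->size -= pkt1->pkt.size + sizeof(*pkt1);
            q->duration -= pkt1->pkt.duration;
            *pkt = pkt1->pkt;
            if (serial)
                *serial = pkt1->serial;
            av_free(pkt1);
            ret = 1;
            break;
        } else if (!block) {
            ret = 0;      /* non-blocking mode: give up immediately */
            break;
        } else {
            /* queue empty: wait for packet_queue_put to signal */
            SDL_CondWait(q->cond, q->mutex);
        }
    }
    SDL_UnlockMutex(q->mutex);
    return ret;
}
```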
FrameQueue: the AVFrame queue
Design idea:
- Thread-safe, supporting mutual exclusion, wait, wake up
- Controls the size of the cache queue
FrameQueue is mainly used to store the raw audio/video/subtitle data after decoding:
// Cache decoded audio, video, and subtitle data
typedef struct Frame {
AVFrame *frame; // Point to a data frame
AVSubtitle sub; // For subtitles
int serial; // Frame sequence, the serial will change when seeking
double pts; // Timestamp, in seconds
double duration; // The frame duration, in seconds
int64_t pos; // The byte position of the frame in the input file
int width; // Image width
int height; // Image height
int format; // For video, the image format (enum AVPixelFormat);
// for audio, the sample format (enum AVSampleFormat)
AVRational sar; // Sample aspect ratio of the image; 0/1 if unknown or unspecified
int uploaded; // This is used to record whether the frame has been displayed.
int flip_v; // if =1, flip vertically; if = 0, play normally
} Frame;
/* This is a circular queue: rindex is the read (front) position and windex is the write (back) position. */
typedef struct FrameQueue {
Frame queue[FRAME_QUEUE_SIZE]; // Queue storage; FRAME_QUEUE_SIZE is the maximum size. A large value consumes a lot of memory, so set it carefully
int rindex; // Read index. This frame is read and played when it is due; after playing, it becomes the previous frame
int windex; // Write index
int size; // Current number of frames
int max_size; // Maximum number of frames that can be stored
int keep_last; // =1 keep the last played frame in the queue; it is only released when the queue is destroyed
int rindex_shown; // =1 if the frame at rindex has already been shown; initialized to 0
SDL_mutex *mutex; // Mutex
SDL_cond *cond; // Condition variable
PacketQueue *pktq; // Packet buffer queue
} FrameQueue;
The main operation apis are as follows:
/** * Initialize a FrameQueue * @param f the queue of decoded frames * @param pktq the corresponding queue of encoded packets * @param max_size maximum number of cached frames * @param keep_last keep the last played frame * @return */
int frame_queue_init(struct FrameQueue *f,struct PacketQueue *pktq, int max_size, int keep_last);
/** * Destroy the queue, releasing all frames in it * @param f * @return */
int frame_queue_destory(struct FrameQueue *f);
/** * release a reference to the cached frame * @param vp */
void frame_queue_unref_item(struct Frame *vp);
/** * Wake up a thread waiting on the queue * @param f */
void frame_queue_signal(struct FrameQueue *f);
/** * Peek the current readable Frame; call frame_queue_nb_remaining first to ensure a Frame is readable * @param f * @return */
struct Frame *frame_queue_peek(struct FrameQueue *f);
/** * Peek the next readable Frame * @param f * @return never NULL, no matter when it is called */
struct Frame *frame_queue_peek_next(struct FrameQueue *f);
/** * Peek the last shown Frame: if rindex_shown=0 this is the same as frame_queue_peek; if rindex_shown=1 it returns the Frame at rindex * @param f * @return */
struct Frame *frame_queue_peek_last(struct FrameQueue *f);
/** * get a writable Frame, which can be done in blocking or non-blocking mode * @param f * @return */
struct Frame *frame_queue_peek_writable(struct FrameQueue *f);
/** * get a readable Frame, which can be done in blocking or non-blocking mode * @param f * @return */
struct Frame *frame_queue_peek_readable(struct FrameQueue *f);
/** * Publish a written Frame: increase the frame count by 1 and advance the write index * @param f */
void frame_queue_push(struct FrameQueue *f);
/** * Advance the read index after playback; when keep_last=1 and rindex_shown=0, rindex is not advanced and the current Frame is not released * @param f */
void frame_queue_next(struct FrameQueue *f);
/** * Return the number of undisplayed frames remaining in the queue * @param f * @return */
int frame_queue_nb_remaining(struct FrameQueue *f);
/** * Return the position in the input file of the last shown frame * @param f * @return */
int64_t frame_queue_last_pos(struct FrameQueue *f);
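The producer half shows the circular-buffer mechanics most clearly. The following two functions are condensed from ffplay.c: a decoder thread calls frame_queue_peek_writable to wait for a free slot, fills the returned Frame, then calls frame_queue_push to publish it.

```c
static Frame *frame_queue_peek_writable(FrameQueue *f)
{
    SDL_LockMutex(f->mutex);
    while (f->size >= f->max_size && !f->pktq->abort_request)
        SDL_CondWait(f->cond, f->mutex);  /* queue full: wait for a reader */
    SDL_UnlockMutex(f->mutex);

    if (f->pktq->abort_request)
        return NULL;                      /* playback aborted */
    return &f->queue[f->windex];
}

static void frame_queue_push(FrameQueue *f)
{
    if (++f->windex == f->max_size)
        f->windex = 0;                    /* wrap around: circular buffer */
    SDL_LockMutex(f->mutex);
    f->size++;
    SDL_CondSignal(f->cond);              /* wake a blocked reader */
    SDL_UnlockMutex(f->mutex);
}
```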
Decoder: the decoder structure
Decoder mainly encapsulates the decoding of audio/video/subtitles. Its structure is as follows:
/** * Decoder encapsulation */
typedef struct Decoder {
AVPacket pkt; // The packet currently being decoded (also parks a pending packet when the codec's input is full)
PacketQueue *queue; // Packet queue
AVCodecContext *avctx; // Decoder context
int pkt_serial; // Packet play serial
int finished; // =0, the decoder is working; non-zero, the decoder is idle/drained
int packet_pending; // =1, a packet is parked in pkt and will be re-sent to the decoder; =0, no pending packet
SDL_cond *empty_queue_cond; // Signaled when the packet queue runs empty, so read_thread reads more data
int64_t start_pts; // Start time of stream when initialized
AVRational start_pts_tb; // Time_base of stream when initialized
int64_t next_pts; // PTS following the last decoded frame; used when a decoded frame carries no valid PTS
AVRational next_pts_tb; // Unit of next_pts
SDL_Thread *decoder_tid; // Thread handle
} Decoder;
The main API calls are as follows:
/** * decoder initialization * @param is * @return */
int ff_decoder_init(struct VideoState *is);
/** * Destroy the decoder * @param d */
void ff_decoder_destroy(struct Decoder *d);
/** * decoder parameter initialization * @param d * @param avctx * @param queue * @param empty_queue_cond */
void ff_decoder_parameters_init(struct Decoder *d, AVCodecContext *avctx, struct PacketQueue *queue,
struct SDL_cond *empty_queue_cond);
/** * open decoder component * @brief stream_component_open * @param is * @param stream_index Stream index * @return return 0 if OK */
int ff_stream_component_open(struct VideoState *is, int stream_index);
/** * Create a decoding thread; audio/video/subtitle each get their own thread */
int ff_decoder_start(struct Decoder *d, int (*fn)(void *), const char *thread_name, void *arg);
/** * stop decoding * @param d * @param fq */
void ff_decoder_abort(struct Decoder *d, struct FrameQueue *fq);
/** * Decode one audio/video frame or subtitle * @param d * @param frame * @param sub * @return */
int ff_decoder_decode_frame(struct Decoder *d, AVFrame *frame, AVSubtitle *sub, struct FFOptions *ops);
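The heart of the Decoder is the decode loop. The sketch below is heavily condensed from ffplay's decoder_decode_frame (the subtitle path, serial checks, and flush handling are omitted, and the signature is simplified relative to ff_decoder_decode_frame above); it shows the receive-then-send pattern and how packet_pending parks a packet when the codec's input is full.

```c
static int decoder_decode_frame(Decoder *d, AVFrame *frame)
{
    for (;;) {
        AVPacket pkt;
        int ret;

        /* 1. drain any frame the codec already has ready */
        ret = avcodec_receive_frame(d->avctx, frame);
        if (ret >= 0)
            return 1;                          /* got a frame */
        if (ret == AVERROR_EOF) {
            avcodec_flush_buffers(d->avctx);
            return 0;                          /* decoder fully drained */
        }

        /* 2. get the next packet: a pending one first, else from the queue */
        if (d->packet_pending) {
            av_packet_move_ref(&pkt, &d->pkt);
            d->packet_pending = 0;
        } else if (packet_queue_get(d->queue, &pkt, 1, &d->pkt_serial) < 0) {
            return -1;                         /* playback aborted */
        }

        /* 3. feed it to the codec; if the codec is full, park the packet */
        if (avcodec_send_packet(d->avctx, &pkt) == AVERROR(EAGAIN)) {
            d->packet_pending = 1;
            av_packet_move_ref(&d->pkt, &pkt);
        } else {
            av_packet_unref(&pkt);
        }
    }
}
```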
Conclusion
This article introduced the main components of ffplay; the implementation principles will be examined in the subsequent articles. Stay tuned!