Introduction

This article mainly introduces how ffplay implements demuxing, decoding, and audio and video synchronization. The next article will take the capabilities ffplay already provides and separate the basic ones into independent modules, to make subsequent extension easier.

Demuxing

Demuxing, as the name implies, unpacks the container format and extracts the corresponding compressed audio and video data. Let’s first look at the main architecture for demuxing.

Based on the ffplay architecture diagram drawn by **“Thor”** below, I extracted the part related to demuxing:

“av_register_all”

As shown in the following code, av_register_all is no longer required in newer FFmpeg versions.

/* register all codecs, demux and protocols */
#if CONFIG_AVDEVICE
    avdevice_register_all();
#endif

“avformat_open_input”

int avformat_open_input(AVFormatContext **ps, const char *url, ff_const59 AVInputFormat *fmt, AVDictionary **options);

avformat_open_input opens an input stream and reads the container header. The stream must eventually be closed with avformat_close_input().
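
A minimal sketch of this open/close pairing (error handling trimmed; the file name "input.mp4" is only a placeholder for illustration):

#include <libavformat/avformat.h>
#include <libavutil/log.h>

static int open_input_demo(void)
{
    AVFormatContext *fmt_ctx = NULL;
    // Open the input and read the container header
    int ret = avformat_open_input(&fmt_ctx, "input.mp4", NULL, NULL);
    if (ret < 0) {
        av_log(NULL, AV_LOG_ERROR, "avformat_open_input failed: %d\n", ret);
        return ret;
    }
    // ... demux here ...
    avformat_close_input(&fmt_ctx); // frees the context and sets fmt_ctx back to NULL
    return 0;
}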

“avformat_find_stream_info”

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

The avformat_find_stream_info function reads a few packets to detect the basic information of the audio and video streams.
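
Continuing the sketch above (fmt_ctx is the context opened earlier), a hedged example of probing the stream info and locating the audio/video stream indexes:

// Read a few packets so that the stream parameters get filled in
if (avformat_find_stream_info(fmt_ctx, NULL) < 0)
    return -1;

// Pick the "best" video and audio streams (returns a negative error code if absent)
int video_index = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
int audio_index = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);

// Print the detected container and stream details to stderr
av_dump_format(fmt_ctx, 0, "input.mp4", 0);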

“av_read_frame”

int av_read_frame(AVFormatContext *s, AVPacket *pkt);

The av_read_frame function reads the demuxed audio, video, and subtitle packets. There is a point worth noting in the ffplay source code:

As the comments in the source point out, ffplay initially sets each stream’s discard field so that **“all frames read”** are dropped (AVDISCARD_ALL). Because I did not pay attention to this when decoupling the ffplay code, I failed to read any data while testing the demuxing.

In ffplay, av_read_frame is called in an endless loop inside read_thread.
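
A simplified sketch of such a read loop (not ffplay’s actual read_thread; packet_queue_put, videoq and audioq are placeholder names for whatever queue the player uses, and video_index/audio_index come from the earlier sketch). As noted above, the stream’s discard field must not be left at AVDISCARD_ALL, otherwise no packets come back for it:

AVPacket pkt;
for (;;) {
    int ret = av_read_frame(fmt_ctx, &pkt);
    if (ret < 0) {
        if (ret == AVERROR_EOF)
            break;                         // reached end of file
        continue;                          // other errors: retry or bail out as appropriate
    }
    if (pkt.stream_index == video_index)
        packet_queue_put(&videoq, &pkt);   // hand the compressed packet to the video queue
    else if (pkt.stream_index == audio_index)
        packet_queue_put(&audioq, &pkt);   // hand the compressed packet to the audio queue
    else
        av_packet_unref(&pkt);             // a stream we are not interested in
}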

decoding

The decoders are opened and the decoder threads are started in stream_component_open():

static int stream_component_open(VideoState *is, int stream_index) {
    ...
    /* Find the decoder for the stream */
    codec = avcodec_find_decoder(avctx->codec_id);
    ...
    if ((ret = avcodec_open2(avctx, codec, &opts)) < 0) {
        goto fail;
    }
    ic->streams[stream_index]->discard = AVDISCARD_DEFAULT;
    switch (avctx->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        ...
        // The audio device parameters are saved in audio_tgt
        if ((ret = audio_open(is, channel_layout, nb_channels, sample_rate, &is->audio_tgt)) < 0)
            goto fail;
        ...
        decoder_init(&is->auddec, avctx, &is->audioq, is->continue_read_thread);
        ...
        if ((ret = decoder_start(&is->auddec, audio_thread, "audio_decoder", is)) < 0)
            goto out;
        SDL_PauseAudioDevice(audio_dev, 0);
        break;
    case AVMEDIA_TYPE_VIDEO:
        ...
        decoder_init(&is->viddec, avctx, &is->videoq, is->continue_read_thread);
        if ((ret = decoder_start(&is->viddec, video_thread, "video_decoder", is)) < 0)
            goto out;
        is->queue_attachments_req = 1; // request the attached cover art of mp3, AAC and other audio files
        break;
    case AVMEDIA_TYPE_SUBTITLE:
        // similar logic to the video case
        ...
        decoder_init(&is->subdec, avctx, &is->subtitleq, is->continue_read_thread);
        if ((ret = decoder_start(&is->subdec, subtitle_thread, "subtitle_decoder", is)) < 0)
            goto out;
        break;
    default:
        break;
    }
}

From the brief code above, it can be seen that audio, video, and subtitles are decoded in audio_thread, video_thread, and subtitle_thread respectively.

The core API of decoding is as follows:

Receive the decoded raw AVFrame data:

avcodec_receive_frame(d->avctx, frame);

Send compressed data to the decoder’s buffer queue:

avcodec_send_packet(d->avctx, &pkt);

From then on, working together with the packet queue, the decode functions are called whenever there is data in the queue, and each successfully decoded frame is stored in the corresponding frame queue.
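
A minimal single-threaded sketch combining the two calls above (no queues or threads; get_packet() and push_frame() are placeholder names for reading from the packet queue and writing to the frame queue, and codec_ctx is an already opened AVCodecContext):

AVFrame *frame = av_frame_alloc();
AVPacket *pkt;
while ((pkt = get_packet()) != NULL) {
    // Feed one compressed packet to the decoder
    if (avcodec_send_packet(codec_ctx, pkt) < 0)
        break;                              // stop on error (a real player also handles EAGAIN/EOF)
    // Drain every frame the decoder can produce from the data sent so far
    while (avcodec_receive_frame(codec_ctx, frame) == 0) {
        push_frame(frame);                  // e.g. store it in the corresponding frame queue
    }
    av_packet_unref(pkt);
}
av_frame_free(&frame);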

Audio and video synchronization

What is audio and video synchronization? Its purpose is to keep the audio playback time and the video playback time on the same timeline. That raises the question of how to stay on one timeline: you need a reference clock, and in ffplay the default is to synchronize video to audio. There are three possible reference choices:

First, video is based on audio (recommended, most commonly used):

  • The video playback time follows the audio playback time. If the video runs ahead of the audio, the current frame is simply held longer (the frame is repeated); if the video falls behind the audio, frames need to be dropped.

Second, audio is synchronized to the video PTS.

Third, an external clock is used as the reference (see the enum below).
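
These three choices correspond directly to the master-clock enum in ffplay.c:

enum {
    AV_SYNC_AUDIO_MASTER, /* default choice */
    AV_SYNC_VIDEO_MASTER,
    AV_SYNC_EXTERNAL_CLOCK, /* synchronize to an external clock */
};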

When synchronizing video to audio, the first step is to store the audio playback PTS at the moment the audio is about to be played. In the ffplay source this is done mainly in sdl_audio_callback.

“Audio Playback PTS Settings”

static void sdl_audio_callback(...) {
    // 1. Record the time at which this audio callback runs
    long audio_callback_time = av_gettime_relative();
    // PCM size: 1024 samples * 2 channels * 2 bytes per sample * 2 buffered frames = 8192 bytes
    int sdl_audio_buf_size = 8192;
    double audio_pts = 0;
    int ret = sdl_audio_buf_size;
    while (ret > 0) {
        // 2. Get one decoded PCM frame
        uint8_t *data_ = NULL;
        int size = get_audio_frame(&data_);
        // 3. Copy the PCM data into the SDL buffer
        copy_to_sdl(data_);
        ret -= size;
        audio_pts += (double)size / audio_sample_rate;
    }
    // 4. Set the audio clock; subtract the audio data written to the buffer but not yet played,
    //    to calculate more accurately how much audio has actually been played
    set_audio_clock(audio_pts - (bytes_written_to_buffer / (sample_rate * 2 * 2)), audio_callback_time);
}

This is roughly how ffplay sets up the audio clock; I have simplified it into pseudocode so that it is a little easier to follow.
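
For reference, a very reduced sketch of what such a clock can look like (ffplay’s actual Clock struct has more fields; the idea is that set_clock_at records the pts together with the system time at which it was valid, and get_clock extrapolates by the time elapsed since then):

#include <libavutil/time.h>

typedef struct Clock {
    double pts;           // clock value (in seconds) at the moment it was set
    double last_updated;  // system time (in seconds) when it was set
} Clock;

static void set_clock_at(Clock *c, double pts, double time) {
    c->pts = pts;
    c->last_updated = time;
}

static double get_clock(Clock *c) {
    double time = av_gettime_relative() / 1000000.0;
    return c->pts + (time - c->last_updated);  // extrapolate the clock to "now"
}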

“Video Playback Synchronization Settings”

Again presented as pseudocode:

static void video_refresh() {
    // Get the current time in seconds
    double time = av_gettime_relative() / 1000000.0;
    // Nominal duration of the last frame = current video frame PTS - previous frame PTS
    double last_duration = cur_frame_pts - last_frame_pts;
    double delay = last_duration;
    // Difference between the current video frame PTS and the audio playback PTS
    double diff = cur_frame_pts - get_audio_clock_pts();
    // Sync threshold, clamped to the range [0.04, 0.1] seconds
    double sync_threshold = FFMAX(0.04, FFMIN(0.1, delay));
    if (diff <= -sync_threshold) {
        // Video is behind the audio: shorten the delay
        delay = FFMAX(0, delay + diff);
    } else if (diff >= sync_threshold && delay > AV_SYNC_FRAMEDUP_THRESHOLD) {
        // Video is ahead and the frame duration is already long: add the whole difference
        delay = delay + diff;
    } else if (diff >= sync_threshold) {
        // Video is ahead of the audio: double the delay
        delay = 2 * delay;
    }
    frame_timer += delay;
    if (frame_timer <= 0) { ... }
    // If playback has fallen behind by more than the next frame's duration, drop the frame
    double next_duration = get_next_frame().pts - get_last_frame().pts;
    if (time > frame_timer + next_duration) {
        drop_frame();
        return;
    }
    // Display the video frame, then wait for the computed delay
    display();
    sleep(delay);
}

The comments in the pseudocode above should make it fairly clear; you can also go straight to the corresponding part of the ffplay source.

SDL preview

I will not analyze the SDL preview part here, because the code is essentially the same as in ffplay.

conclusion

This has been a brief analysis of how ffplay works. Personally, I think the hardest part of ffplay to understand is the audio and video synchronization; the rest is relatively easy to follow. I recommend reading the ffplay source code to understand ffplay more deeply.

reference

  • ffplay 4.2.1