Yesterday (a Saturday), people in the group were deep in a technical discussion!!

Let me quietly make a joke: these folks are really something, still grinding on technology over the big weekend. Wouldn't games or movies be more fun?

It started with a discussion of how to implement 100x playback speed. As someone directly involved, the group owner couldn't conveniently reveal the details, but other experts offered their ideas.

The discussion focused on how to drop frames.

For example, if the frame about to be displayed at normal speed carries the content at 1s, then at 100x speed the next frame should carry the content at 100s.

The logic then breaks down into the following cases:

  1. If the next playback time falls beyond the current GOP, seek to the keyframe closest in time to the target PTS. For example, when jumping from 1s to 100s, the seek may land on a keyframe at 98s.
  2. If the next playback time falls within the same GOP, keep decoding; whenever a decoded frame's PTS shows that it does not need to be displayed, discard it right away and decode the next one, until you are close to the target time.
  3. If the position reached after the seek is still short of the target time, repeat step 2 and keep decoding as needed (a sketch of this loop follows below).

That's the summary from the group expert. Quick, grab a little notebook and write it down.
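To make the loop concrete, here is a minimal sketch, assuming an already-opened AVFormatContext (ic) and decoder AVCodecContext (avctx), a target_pts expressed in the video stream's time base, and a hypothetical display() placeholder; error handling is omitted:

// Seek to the nearest keyframe at or before the target, then decode
// and discard frames until the target timestamp is reached.
// pkt and frame are assumed allocated via av_packet_alloc()/av_frame_alloc().
av_seek_frame(ic, video_index, target_pts, AVSEEK_FLAG_BACKWARD);
avcodec_flush_buffers(avctx); // reset decoder state after the seek

int done = 0;
while (!done && av_read_frame(ic, pkt) >= 0) {
    if (pkt->stream_index == video_index &&
        avcodec_send_packet(avctx, pkt) >= 0) {
        while (avcodec_receive_frame(avctx, frame) >= 0) {
            if (frame->pts < target_pts) { // before the target: drop it
                av_frame_unref(frame);
                continue;
            }
            display(frame); // hypothetical: hand the frame to the renderer
            done = 1;
            break;
        }
    }
    av_packet_unref(pkt);
}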

At this point, other experts in the group offered a different take on dropping frames around decoding:

This part focuses on how to drop non-reference frames.

When we get an AVPacket via av_read_frame, we can check its nal_ref_idc value to decide whether to discard it.

If the value is 0, no other frame needs to reference this one, so it can be discarded without ever being sent to the decoder, rather than decoding it first and dropping the frame afterwards.

Not sure what nal_ref_idc means? Let's go over the concept of the NALU in an H.264 bitstream.

An H.264 bitstream is transmitted as a sequence of NALUs, each consisting of a one-byte NAL Header followed by the RBSP payload.

The one-byte NAL Header is laid out as follows:

forbidden_zero_bit (1 bit) + nal_ref_idc (2 bits) + nal_unit_type (5 bits)

Different nal_unit_type values indicate different frame types. All of this information can be obtained by parsing the AVPacket; a detailed walkthrough of the parsing will follow in a later article on the Audio & Video Development public account.
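As a teaser, here is a minimal sketch of reading nal_ref_idc out of an AVPacket. It assumes the stream is in Annex B format (start-code delimited); MP4/AVCC streams are length-prefixed and would first need the h264_mp4toannexb bitstream filter. The function name first_nal_ref_idc is my own:

#include <libavcodec/avcodec.h>

// Scan the packet for the first 00 00 01 start code and read the
// one-byte NAL Header that follows it.
static int first_nal_ref_idc(const AVPacket *pkt)
{
    const uint8_t *p   = pkt->data;
    const uint8_t *end = pkt->data + pkt->size;

    while (p + 3 < end) {
        if (p[0] == 0x00 && p[1] == 0x00 && p[2] == 0x01) {
            // forbidden_zero_bit(1) | nal_ref_idc(2) | nal_unit_type(5)
            return (p[3] >> 5) & 0x03;
        }
        p++;
    }
    return -1; // no NAL unit found
}

If first_nal_ref_idc(pkt) returns 0, the packet can be dropped without sending it to the decoder.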

Since nothing depends on them, these non-reference frames can be dropped completely before decoding, so you can operate boldly with confidence.

And dropping non-reference frames has been validated in a product serving hundreds of millions of users; to this I can personally testify!!

Dropping frames in FFmpeg

The frame-dropping logic above comes from the H.264 specification, so does the FFmpeg source code do anything with it?

Of course it does!!

If you read the ffplay source carefully, you will find the following code:

/* this thread gets the stream from the disk or the network */
static int read_thread(void *arg)
{
    // Omit some code
    for (i = 0; i < ic->nb_streams; i++) {
        AVStream *st = ic->streams[i];
        enum AVMediaType type = st->codecpar->codec_type;
        // AVDISCARD_ALL discards all frames
        st->discard = AVDISCARD_ALL;
        if (type >= 0 && wanted_stream_spec[type] && st_index[type] == -1)
            if (avformat_match_stream_specifier(ic, st, wanted_stream_spec[type]) > 0)
                st_index[type] = i;
    }

    // Omit some code
    if (!video_disable)
        st_index[AVMEDIA_TYPE_VIDEO] =
            av_find_best_stream(ic, AVMEDIA_TYPE_VIDEO,
                                st_index[AVMEDIA_TYPE_VIDEO], -1, NULL, 0);
    if (!audio_disable)
        st_index[AVMEDIA_TYPE_AUDIO] =
            av_find_best_stream(ic, AVMEDIA_TYPE_AUDIO,
                                st_index[AVMEDIA_TYPE_AUDIO],
                                st_index[AVMEDIA_TYPE_VIDEO],
                                NULL, 0);

    /* open the streams */
    if (st_index[AVMEDIA_TYPE_AUDIO] >= 0) {
        // Start the decoding thread
        stream_component_open(is, st_index[AVMEDIA_TYPE_AUDIO]);
    }
    ret = -1;
    if (st_index[AVMEDIA_TYPE_VIDEO] >= 0) {
        // Start the decoding thread
        ret = stream_component_open(is, st_index[AVMEDIA_TYPE_VIDEO]);
    }
    // omit the code
}

The read_thread method runs on its own thread. It first demuxes the input, then starts the decoding threads, and then calls av_read_frame repeatedly to read AVPackets and queue them for the decoding threads to consume.

Before av_find_best_stream, each stream's discard is set to AVDISCARD_ALL to filter out that AVStream's data; stream_component_open is then called for the selected streams.


/* open a given stream. Return 0 if OK */
static int stream_component_open(VideoState *is, int stream_index)
{
    // Omit some code
    // AVDISCARD_DEFAULT: the default mode, filters out invalid data
    ic->streams[stream_index]->discard = AVDISCARD_DEFAULT;
    switch (avctx->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        // Omit some code
        if ((ret = decoder_start(&is->auddec, audio_thread, "audio_decoder", is)) < 0)
            goto out;
        break;
    case AVMEDIA_TYPE_VIDEO:
        // Omit some code
        if ((ret = decoder_start(&is->viddec, video_thread, "video_decoder", is)) < 0)
            goto out;
        break;
    }
    // Omit some code
}

The stream_component_open method then resets discard to AVDISCARD_DEFAULT, which filters out only invalid data.

Dig into this discard field and you'll find something interesting!

FFmpeg defines all the discard values, and how to use them, as follows:

/**
 * @ingroup lavc_decoding
 */
enum AVDiscard {
    /* We leave some space between them for extensions (drop some
     * keyframes for intra-only or drop just some bidir frames). */
    // Do not discard any data
    AVDISCARD_NONE     = -16, ///< discard nothing
    // Discard useless data, such as zero-size packets
    AVDISCARD_DEFAULT  =   0, ///< discard useless packets like 0 size packets in avi
    // Discard all non-reference frames
    AVDISCARD_NONREF   =   8, ///< discard all non reference
    // Discard all bidirectional frames
    AVDISCARD_BIDIR    =  16, ///< discard all bidirectional frames
    // Discard all non-intra frames
    AVDISCARD_NONINTRA =  24, ///< discard all non intra frames
    // Discard all frames except keyframes
    AVDISCARD_NONKEY   =  32, ///< discard all frames except keyframes
    // Discard all frames
    AVDISCARD_ALL      =  48, ///< discard all
};

So the discard field is a perfectly good way to control which frames get dropped on the way to the decoder.
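For example, here is a minimal sketch (setup assumed already done, variable names my own) that asks the demuxer to skip non-reference frames of the selected video stream:

// Assumes ic is an AVFormatContext that has been opened and had
// its stream info read.
int video_index = av_find_best_stream(ic, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
if (video_index >= 0) {
    // Packets of non-reference frames on this stream may now be
    // discarded before av_read_frame ever returns them (support
    // depends on the demuxer/codec).
    ic->streams[video_index]->discard = AVDISCARD_NONREF;
}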

In addition, the source comment on avcodec_send_packet points out that you can use the AVCodecContext.skip_frame field to decide which frames to discard.

/**
 * Internally, this call will copy relevant AVCodecContext fields, which can
 * influence decoding per-packet, and apply them when the packet is actually
 * decoded. (For example AVCodecContext.skip_frame, which might direct the
 * decoder to drop the frame contained by the packet sent with this function.)
 */
int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);

That is just an excerpt of the comment, and it is written very clearly.

So the same goal can be achieved directly through AVCodecContext.skip_frame, without parsing the nal_ref_idc value out of each AVPacket yourself.

Personally tested and effective: filter out non-keyframes, and everything the decoder outputs is a keyframe.
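Here is a minimal sketch of that setup, assuming avctx is the decoder's AVCodecContext being configured before avcodec_open2:

// Ask the decoder itself to skip frames. AVDISCARD_NONKEY keeps only
// keyframes, matching the test above; AVDISCARD_NONREF would drop
// only non-reference frames.
avctx->skip_frame = AVDISCARD_NONKEY;
// ... avcodec_open2(avctx, codec, NULL);
// ... avcodec_send_packet(avctx, pkt);
// ... avcodec_receive_frame(avctx, frame); // only keyframes come out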

That's all for this bit of sharing on frame dropping. For technical exchange, feel free to follow the Audio & Video Development public account.