ffprobe of FFmpeg



ffprobe is one of the three command-line tools that ship with FFmpeg. It is used to inspect audio and video files: the container format, the audio/video stream information, packet information, and so on.

ffprobe's source code is ffprobe.c. If, during development, you need the same information that ffprobe displays, you can find the corresponding fields by reading that source. This article covers the format, stream, packet, and frame information, with a description of each field and the FFmpeg field it maps to.

View the encapsulation format of audio and video files

ffprobe -show_format inputFile

The following information is displayed:

[FORMAT]
// File name
filename=VID_20190811_113717.mp4
// Number of streams in the container, AVFormatContext->nb_streams
nb_streams=2
// AVFormatContext->nb_programs
nb_programs=0
// Container format, AVFormatContext->iformat->name
format_name=mov,mp4,m4a,3gp,3g2,mj2
// AVFormatContext->iformat->long_name
format_long_name=QuickTime / MOV
// AVFormatContext->start_time, in AV_TIME_BASE_Q units, converted to seconds
start_time=0.000000
// AVFormatContext->duration, in AV_TIME_BASE_Q units, converted to seconds
duration=10.508000
// avio_size(AVFormatContext->pb)
size=27263322
// AVFormatContext->bit_rate
bit_rate=20756240
// AVFormatContext->probe_score
probe_score=100
[/FORMAT]

View the stream information of audio and video files

ffprobe -show_streams inputFile

The following information is displayed:

[STREAM]
// Index information for the current stream, corresponding to AVStream->index
index=0
// AVCodecDescriptor *cd = avcodec_descriptor_get(AVStream->codecpar->codec_id)
// Codec name, i.e. cd->name
codec_name=h264
// cd->long_name
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
// Encoding profile: Baseline, Main, High, etc. Baseline has no B-frames; Main and above may contain B-frames
// AVStream->codecpar->codec_id, AVStream->codecpar->profile
profile=High
// Stream type, av_get_media_type_string(AVStream->codecpar->codec_type)
codec_type=video
// AVStream->codec->time_base
codec_time_base=14777/877500
// Obtained via the macro av_fourcc2str(AVStream->codecpar->codec_tag)
codec_tag_string=avc1
// AVStream->codecpar->codec_tag
codec_tag=0x31637661
// AVStream->codecpar->width
width=1920
// AVStream->codecpar->height
height=1080
// The coded frame width may differ from width above, AVStream->codec->coded_width, e.g. when the decoded frame is cropped before output or low-resolution decoding is enabled
coded_width=1920
// The coded frame height may differ from height above, AVStream->codec->coded_height, e.g. when the decoded frame is cropped before output or low-resolution decoding is enabled
coded_height=1088
// AVStream->codecpar->video_delay
has_b_frames=0
// SAR, the aspect ratio of a single pixel
// FFmpeg provides several SAR fields: AVStream->sample_aspect_ratio, AVStream->codecpar->sample_aspect_ratio, and AVFrame->sample_aspect_ratio
// The final SAR is obtained with av_guess_sample_aspect_ratio
sample_aspect_ratio=1:1
// DAR, the real display aspect ratio of the image; the video must be scaled to this ratio when rendered
// DAR = PAR * SAR, reduced with av_reduce
display_aspect_ratio=16:9
// av_get_pix_fmt_name(AVStream->codecpar->format)
pix_fmt=yuvj420p
// AVStream->codecpar->level
level=40
// Additional color range, av_color_range_name(AVStream->codecpar->color_range); AVCOL_RANGE_MPEG for TV, AVCOL_RANGE_JPEG for PC
color_range=pc
// YUV color space type, av_color_space_name(AVStream->codecpar->color_space)
color_space=bt470bg
// av_color_transfer_name(AVStream->codecpar->color_trc)
color_transfer=smpte170m
// av_color_primaries_name(AVStream->codecpar->color_primaries)
color_primaries=bt470bg
// Chroma sample location, av_chroma_location_name(AVStream->codecpar->chroma_location)
chroma_location=left
// Field order of interlaced video, AVStream->codecpar->field_order
field_order=unknown
// av_timecode_make_mpeg_tc_string, from AVStream->codec->timecode_frame_start
timecode=N/A
// AVStream->codec->refs
refs=1
is_avc=true
// Indicates the length of the NALU in bytes
nal_length_size=4
id=N/A
// The base frame rate of the current stream. This value is only a guess, corresponding to AVStream->r_frame_rate
r_frame_rate=30/1
// Average frame rate, corresponding to AVStream->avg_frame_rate
avg_frame_rate=438750/14777
// AVStream->time_base
time_base=1/90000
// Stream start time, based on time_base, i.e. AVStream->start_time
start_pts=0
// Start time after conversion (start_pts * time_base), in seconds
start_time=0.000000
// Stream duration, based on time_base, i.e. AVStream->duration
duration_ts=945728
// The duration after the conversion (duration_ts * time_base), in seconds
duration=10.508089
// AVStream->codecpar->bit_rate
bit_rate=19983544
// AVStream->codec->rc_max_rate
max_bit_rate=N/A
// Bits per raw sample/pixel, AVStream->codec->bits_per_raw_sample
bits_per_raw_sample=8
// The number of frames in the video stream, AVStream->nb_frames
nb_frames=312
nb_read_frames=N/A
nb_read_packets=N/A
// The TAG entries below come from AVStream->metadata
// Rotation angle (counterclockwise)
TAG:rotate=90
// Creation time
TAG:creation_time=2019-08-11T03:37:28.000000Z
// Language
TAG:language=eng
TAG:handler_name=VideoHandle
// SIDE_DATA comes from AVStream->side_data
[SIDE_DATA]
// side_data type; Display Matrix is a 3x3 matrix that must be applied to decoded frames for correct display
side_data_type=Display Matrix
displaymatrix=
00000000:            0       65536           0
00000001:       -65536           0           0
00000002:            0           0  1073741824
// Rotate 90 degrees clockwise to restore the video
rotation=-90
[/SIDE_DATA]
[/STREAM]
[STREAM]
// Index information for the current stream, corresponding to AVStream->index
index=1
// AVCodecDescriptor *cd = avcodec_descriptor_get(AVStream->codecpar->codec_id)
// Codec name, i.e. cd->name
codec_name=aac
// cd->long_name
codec_long_name=AAC (Advanced Audio Coding)
// AVStream->codecpar->codec_id, AVStream->codecpar->profile
profile=LC
// Stream type, av_get_media_type_string(AVStream->codecpar->codec_type)
codec_type=audio
// AVStream->codec->time_base
codec_time_base=1/48000
// Obtained via the macro av_fourcc2str(AVStream->codecpar->codec_tag)
codec_tag_string=mp4a
// AVStream->codecpar->codec_tag
codec_tag=0x6134706d
// Sample format, av_get_sample_fmt_name(AVStream->codecpar->format)
sample_fmt=fltp
// Sample rate, AVStream->codecpar->sample_rate
sample_rate=48000
// AVStream->codecpar->channels
channels=2
// Channel layout, obtained via av_bprint_channel_layout; stereo corresponds to channels=2
channel_layout=stereo
// av_get_bits_per_sample(par->codec_id)
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
// AVStream->time_base
time_base=1/48000
// Stream start time, based on time_base, i.e. AVStream->start_time
start_pts=0
// Start time after conversion (start_pts * time_base), in seconds
start_time=0.000000
// Stream duration, based on time_base, i.e. AVStream->duration
duration_ts=502776
// The duration after the conversion (duration_ts * time_base), in seconds
duration=10.474500
// AVStream->codecpar->bit_rate
bit_rate=156002
// AVStream->codec->rc_max_rate
max_bit_rate=156000
// Bits per raw sample/pixel, AVStream->codec->bits_per_raw_sample
bits_per_raw_sample=N/A
// The number of frames in the audio stream, AVStream->nb_frames
nb_frames=491
nb_read_frames=N/A
nb_read_packets=N/A
TAG:creation_time=2019-08-11T03:37:28.000000Z
TAG:language=eng
TAG:handler_name=SoundHandle
[/STREAM]
  • SAR (Sample Aspect Ratio): the aspect ratio of a single pixel, i.e. the ratio of each pixel's width to its height. Pixels are therefore not necessarily square.
  • PAR (Pixel Aspect Ratio): the ratio of horizontal to vertical sample points in the image, i.e. the ratio of pixel counts.
  • DAR (Display Aspect Ratio): the aspect ratio at which the image is finally displayed. The player must preserve the DAR when rendering video frames.

Relationship between them: PAR * SAR = DAR

Here is an example: suppose each square represents one pixel, and the image is 5 pixels wide and 4 pixels high, so PAR = 5:4. Suppose the image is displayed at a width of 160 and a height of 120, i.e. DAR = 4:3. Then SAR = DAR/PAR = 16:15, which means each pixel is not a square but a rectangle.

FFmpeg provides several SAR fields:

  • AVStream->sample_aspect_ratio
  • AVStream->codecpar->sample_aspect_ratio
  • AVFrame->sample_aspect_ratio

The final SAR is obtained via av_guess_sample_aspect_ratio. There is no ready-made DAR field to read; following the ffprobe code, DAR is computed with av_reduce, as shown below:

AVRational sar, dar;
// par holds the stream's codec parameters
AVCodecParameters *par = stream->codecpar;
// Guess the final SAR for this stream
sar = av_guess_sample_aspect_ratio(fmt_ctx, stream, NULL);
// Compute DAR from PAR (width:height) and SAR
av_reduce(&dar.num, &dar.den, par->width * sar.num, par->height * sar.den, 1024 * 1024);

View data packets of audio and video files

// -select_streams selects the audio (a) or video (v) stream
ffprobe -show_packets [-select_streams a|v] inputFile

Let's first look at the first and second packets of the video stream:

[PACKET]
// Packet type, av_get_media_type_string(AVStream->codecpar->codec_type)
codec_type=video
// Index of the stream this packet belongs to, AVStream->index
stream_index=0
// Presentation timestamp, AVPacket->pts, in AVStream->time_base units
pts=0
// Converted to seconds
pts_time=0.000000
// Decoding timestamp, AVPacket->dts, in AVStream->time_base units
dts=0
// Converted to seconds
dts_time=0.000000
// Duration of this packet (the next packet's pts minus this one's), AVPacket->duration, in AVStream->time_base units
duration=12972
// Converted to seconds
duration_time=0.144133
// AVPacket->convergence_duration, also in AVStream->time_base units
convergence_duration=N/A
// Converted to seconds
convergence_duration_time=N/A
// Packet size in bytes, AVPacket->size
size=187872
// AVPacket->pos
pos=830842
// K marks a keyframe
flags=K_
[/PACKET]
[PACKET]
codec_type=video
stream_index=0
pts=12972
// 12972/90000
pts_time=0.144133
dts=12972
dts_time=0.144133
duration=2999
duration_time=0.033322
convergence_duration=N/A
convergence_duration_time=N/A
size=31200
// Previous packet's pos + size
pos=1018714
flags=__
[/PACKET]

Then look at the first and second packets of the audio stream:

[PACKET]
// Audio packet
codec_type=audio
// Index of the stream this packet belongs to, AVStream->index
stream_index=1
// Presentation timestamp, AVPacket->pts, in AVStream->time_base units
pts=0
pts_time=0.000000
// Decoding timestamp, AVPacket->dts, in AVStream->time_base units
dts=0
dts_time=0.000000
// Duration of this packet (the next packet's pts minus this one's), AVPacket->duration, in AVStream->time_base units
duration=1024
// 1024/48000
duration_time=0.021333
convergence_duration=N/A
convergence_duration_time=N/A
size=416
pos=810458
flags=K_
[/PACKET]
[PACKET]
// Audio packet
codec_type=audio
stream_index=1
pts=1024 
// 1024/48000
pts_time=0.021333
dts=1024
dts_time=0.021333
duration=1024
duration_time=0.021333
convergence_duration=N/A
convergence_duration_time=N/A
size=416
// Previous packet's pos + size
pos=810874
flags=K_
[/PACKET]

View the decoded frame information of audio and video files

// -select_streams selects the audio (a) or video (v) stream
ffprobe -show_frames [-select_streams a|v] inputFile

Take a look at frames 1 and 2 of the video stream first:

[FRAME]
// Frame type, av_get_media_type_string(AVStream->codecpar->codec_type)
media_type=video
// Index of the stream this frame belongs to, AVStream->index
stream_index=0
// 1: keyframe, 0: not a keyframe, AVFrame->key_frame
key_frame=1
// Presentation timestamp, copied from the corresponding AVPacket, AVFrame->pkt_pts, in AVStream->time_base units
pkt_pts=0
// Converted to seconds
pkt_pts_time=0.000000
// Decoding timestamp, copied from the corresponding AVPacket, AVFrame->pkt_dts, in AVStream->time_base units
pkt_dts=0
// Converted to seconds
pkt_dts_time=0.000000
// Frame timestamp, usually the same as pts, AVFrame->best_effort_timestamp, in AVStream->time_base units
best_effort_timestamp=0
// Converted to seconds
best_effort_timestamp_time=0.000000
// Duration of the corresponding AVPacket, AVFrame->pkt_duration, in AVStream->time_base units
pkt_duration=12972
// Converted to seconds
pkt_duration_time=0.144133
// Reordered pos from the last AVPacket that entered the decoder, AVFrame->pkt_pos
pkt_pos=830842
// AVFrame->pkt_size
pkt_size=187872
// Frame width before rotation, AVFrame->width
width=1920
// Frame height before rotation, AVFrame->height
height=1080
// av_get_pix_fmt_name(AVFrame->format)
pix_fmt=yuvj420p
// SAR, the aspect ratio of a single pixel
// FFmpeg provides several SAR fields: AVStream->sample_aspect_ratio, AVStream->codecpar->sample_aspect_ratio, and AVFrame->sample_aspect_ratio
// The final SAR is obtained with av_guess_sample_aspect_ratio
sample_aspect_ratio=1:1
// Picture type of the video frame, here an I frame, av_get_picture_type_char(frame->pict_type)
pict_type=I
// Picture number in bitstream order, AVFrame->coded_picture_number
coded_picture_number=0
// Picture number in display order, AVFrame->display_picture_number
display_picture_number=0
// Whether the frame content is interlaced, AVFrame->interlaced_frame
interlaced_frame=0
// If the content is interlaced, whether the top field is displayed first, AVFrame->top_field_first
top_field_first=0
// When decoding, indicates how long the frame must be delayed: extra_delay = repeat_pict / (2 * fps), AVFrame->repeat_pict
repeat_pict=0
// Additional color range, av_color_range_name(AVFrame->color_range); AVCOL_RANGE_MPEG for TV, AVCOL_RANGE_JPEG for PC
color_range=pc
// YUV color space type, av_color_space_name(AVFrame->colorspace)
color_space=bt470bg
// av_color_primaries_name(AVFrame->color_primaries)
color_primaries=bt470bg
// av_color_transfer_name(AVFrame->color_trc)
color_transfer=smpte170m
// Chroma sample location, av_chroma_location_name(AVFrame->chroma_location)
chroma_location=left
[/FRAME]
[FRAME]
media_type=video
stream_index=0
// Non-keyframe
key_frame=0
pkt_pts=12972
// 12972/90000
pkt_pts_time=0.144133
pkt_dts=12972
pkt_dts_time=0.144133
best_effort_timestamp=12972
best_effort_timestamp_time=0.144133
pkt_duration=2999
pkt_duration_time=0.033322
pkt_pos=1018714
pkt_size=31200
width=1920
height=1080
pix_fmt=yuvj420p
sample_aspect_ratio=1:1
// Picture type of the video frame, here a P frame, av_get_picture_type_char(frame->pict_type)
pict_type=P
coded_picture_number=1
display_picture_number=0
interlaced_frame=0
top_field_first=0
repeat_pict=0
color_range=pc
color_space=bt470bg
color_primaries=bt470bg
color_transfer=smpte170m
chroma_location=left
[/FRAME]

Then look at frames 1 and 2 of the audio stream:

[FRAME]
// Frame type, av_get_media_type_string(AVStream->codecpar->codec_type)
media_type=audio
// Index of the stream this frame belongs to, AVStream->index
stream_index=1
// Whether this is a keyframe
key_frame=1
// Presentation timestamp, copied from the corresponding AVPacket, AVFrame->pkt_pts, in AVStream->time_base units
pkt_pts=0
// Converted to seconds
pkt_pts_time=0.000000
// Decoding timestamp, copied from the corresponding AVPacket, AVFrame->pkt_dts, in AVStream->time_base units
pkt_dts=0
// Converted to seconds
pkt_dts_time=0.000000
// Frame timestamp, usually the same as pts, AVFrame->best_effort_timestamp, in AVStream->time_base units
best_effort_timestamp=0
// Converted to seconds
best_effort_timestamp_time=0.000000
// Duration of the corresponding AVPacket, AVFrame->pkt_duration, in AVStream->time_base units
pkt_duration=1024
// Converted to seconds
pkt_duration_time=0.021333
// Reordered pos from the last AVPacket that entered the decoder, AVFrame->pkt_pos
pkt_pos=810458
// AVFrame->pkt_size
pkt_size=416
// av_get_sample_fmt_name(AVFrame->format)
sample_fmt=fltp
// Number of samples in this audio frame, AVFrame->nb_samples
nb_samples=1024
// AVFrame->channels
channels=2
// Channel layout, obtained via av_bprint_channel_layout, corresponds to channels
channel_layout=stereo
[/FRAME]
[FRAME]
media_type=audio
stream_index=1
key_frame=1
pkt_pts=1024
pkt_pts_time=0.021333
pkt_dts=1024
pkt_dts_time=0.021333
best_effort_timestamp=1024
best_effort_timestamp_time=0.021333
pkt_duration=1024
pkt_duration_time=0.021333
pkt_pos=810874
pkt_size=416
sample_fmt=fltp
nb_samples=1024
channels=2
channel_layout=stereo
[/FRAME]

More audio and video information will be added later.

References

  • FFmpeg gets the correct aspect ratio of the video