Summary of knowledge in this chapter:

  • Video Playback Principle
  • Video file encapsulation format
  • Introduction to audio and video coding

I. Principle of video player

The video files we play are usually wrapped in a wrapper format. What does the wrapper format do? Generally, there is not only video in a video file, but also audio, and the purpose of the package format is to package the video and audio together. So we need to decapsulate the format first to see which video streams and which audio streams, audio streams and video streams are still compressed data, can not be directly used for display, this needs to decode. Here is a flowchart for playing a video file.

According to the process in the above flowchart, we will understand the principle and implementation of each step step by step from top to bottom. Let’s take a look at the video encapsulation format.

Second, video file encapsulation format

Encapsulation format (also called container), is already encoded and compressed good video and audio tracks in accordance with a certain format into a file, that is, just a shell, or you can put it as a video and audio track folder can also be. To put it more generally, the video track is the rice, and the audio track is the dish, and the package format is a bowl, or a pot, the container used to hold the food. Here are some common video file suffix types and their corresponding encapsulation formats.

Video file Format Video encapsulation format
.avi AVI (Audio Video Interleaved)
, WMV, asf WMV (Windows Media Video)
.mpg、.mpeg、.vob、.dat、.3gp、.mp4 MPEG (Moving Picture Experts Group)
.mkv Matroska
Rm, RMVB Real Video
.mov QuickTime File Format
.flv Flash Video

Here are a few video encapsulation formats:

  • 1. AVI format, the corresponding file format is.avi, full name of Audio Video Interleaved, was launched by Microsoft in 1992. The advantages of this video format are good image quality and lossless AVI can save the alpha channel. The disadvantage is that the volume is too large, and the compression standard is not unified, there are many high and low version compatibility problems.

  • 2, DV-AVI Format, the corresponding file Format is.avi, the English full name of Digital Video Format, is a home Digital Video Format jointly proposed by SONY, Panasonic, JVC and many other manufacturers. Common digital cameras use this format to record video data. It can transmit video data to the computer through the IEEE 1394 port of the computer, and it can also record the edited video data back to the digital camera.

  • 3. WMV format, the corresponding file formats are. WMV and. With the same video quality, WMV files can be downloaded and played at the same time, so they are suitable for playing and transferring over the Internet.

  • 4, MPEG format, the corresponding file format is.mpg,.mPEG,.mPE,.dat,.vob,.asF,.3GP,.MP4 and so on, the English full name Moving Picture Experts Group, is the video format developed by the Moving Picture expert Group, The group was formed in 1988 to develop video and audio standards. Its members are technical experts in video, audio and systems. The MPEG format currently has three compression standards: **MPEG-1, MPEG-2, **, and MPEG-4. Mpeg-4 is one of the most popular video packaging formats in use today. It is specifically designed to play high-quality video for streaming media in order to get the best image quality with the least amount of data.

  • 5, Matroska format, corresponding file format is.mkv, Matroska is a new video packaging format, it can be a variety of different encoding video and more than 16 different formats of audio and subtitles in different languages into a Matroska Media file.

  • 6, Real Video format, the corresponding file format is. Rm,. RMVB, is Real Networks developed audio and Video compression specifications called Real Media. Users can use RealPlayer to set different compression rates according to different network transmission rates, so that video data can be transmitted and played in real time over low-rate networks.

  • 7, QuickTime File Format, the corresponding File Format is.mov, is a video Format developed by Apple, the default player is Apple QuickTime. This package format has the characteristics of high compression ratio and perfect video clarity, and can save the alpha channel.

  • 8. Flash Video format, the corresponding file format is.flv, which is a network Video encapsulation format extended by Adobe Flash. This format is adopted by many video sites.

Introduction to audio and video coding

1. Video coding mode

  • The role of video coding: the video pixel data (RGB, YUV, etc.) is compressed into video bit stream, so as to reduce the amount of video data.
The name of the Launch institutions Launch time Current Usage
HEVC (h. 265) MPEG/ITU-T 2013 In the research and development
H.264 MPEG/ITU-T 2003 All fields
MPEG4 MPEG 2001 The tepid
MPEG2 MPEG 1994 Digital TV
VP9 Google 2013 In the research and development
VP8 Google 2008 Don’t spread
VC-1 Microsoft Inc. 2006 Microsoft platform

(1) H.26X series

H.26x is led by the International Telecommunication Telecommunication Standardization Organization (ITU-T) and includes H.261, H.262, H.263, H.264 and H.265.

  • H.261, mainly for older video conferencing and video telephony systems. It was the first digital video compression standard to be used. In essence, all subsequent standard video codecs are based on it.

  • H.262, equivalent to MPEG-2 Part II, is used in DVD, SVCD, and most digital video broadcasting systems and wired distribution systems.

  • H.263, mainly used for video conferencing, video calling and network video related products. In terms of the compression of progressive scanning video sources, H.263 is a great improvement over its previous video coding standards. Especially at the low bit rate end, it can greatly save bit rate on the premise of ensuring certain quality.

  • H.264, equivalent to MPEG-4 Part X, also known as Advanced Video Coding (AVC), is a Video compression standard, a widely used format for recording, compression, and distribution of high-precision Video. The standard introduces a series of new technologies that can greatly improve the compression performance and can greatly exceed previous standards at both high and low rate ends.

  • H.265, known as High Efficiency Video Coding (HEVC), is a Video compression standard and is the successor to H.264. HEVC is believed not only to improve image quality, but also to achieve two times the compression rate of H.264 (equivalent to a 50% reduction in bit rate under the same picture quality), can support 4K resolution and even ultra high picture quality TV, the highest resolution can reach 8192×4320 (8K resolution), which is the current development trend.

(2) MPEG series

The MPEG series was developed by the Moving Image Expert Group (MPEG) under the International Standards Organization (ISO).

  • Mpeg-1 Part 2, mainly used on VCDS, some online videos also use this format. The quality of the codec is roughly comparable to that of the original VHS tape.
  • Mpeg-2 Part ii, equivalent to H.262, is used in DVD, SVCD, and most digital video broadcasting systems and wired distribution systems.
  • Mpeg-4 Part 2 can be used for network transmission, broadcast, and media storage. Compared with mPEG-2 part 2 and h.263 of the first edition, its compression performance is improved.
  • Mpeg-4 Part 10, equivalent to H.264, is the standard that the two coding organizations collaborated to produce.

2. Audio coding mode

The role of audio coding: audio sampling data (PCM, etc.) compressed into audio code stream, thus reducing the amount of audio data. Common audio coding methods are as follows:

The name of the Launch institutions Launch time Current Usage
AAC MPEG 1997 Various fields (new)
MP3 MPEG 1993 Various fields (old)
WMV Microsoft Inc. 1999 Microsoft platform
AC-3 Dolby Inc. 1992 The movie

(1) the MP3

MP3, mPEG-1 or MPEG-2 Audio Layer III, is a once-popular digital Audio encoding and lossy compression format designed to dramatically reduce the amount of Audio data. It was invented and standardized in 1991 by a group of engineers at the Fraunhofer-Gesellschaft, a research organization based in Erlangen, Germany. The popularity of MP3 has had a great impact on the music industry.

(2) the AAC

AAC, English full name Advanced Audio Coding, is developed by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony and other companies jointly, launched in 1997 based on MPEG-2 Audio Coding technology. In 2000, after the advent of the MPEG-4 standard, AAC reintegrated its features, adding SBR technology and PS technology, in order to distinguish the traditional MPEG-2 AAC, also known as MPEG-4 AAC. AAC has a higher compression ratio than MP3, and the same size of the audio file, AAC sound quality is higher.

(3) the WMA

WMA, English full name Windows Media Audio, a digital Audio compression format developed by Microsoft, including lossless and lossless compression formats.

4. Video pixel data

1. Introduction to video pixel data

  • Video Pixel data: Saves the pixel value of each pixel on the screen.
  • Format: Common pixel formats are RGB24, RGB32, YUV420P, YUV422P, YUV444P and so on. YUV format pixel data is generally used in compression coding, and the most common format is YUV420P.
  • Features:Video pixel data is huge, one1Hour movieRGB24The data volume of the format is:3600 * 25 * 1920 * 1080 * 3 = 559.872GBytePS: Here the jiading frequency is25Hz, sampling accuracy8bit)

2. Color model

(1) RGB color coding

In our development scenario, the RGB model is used most. R, G and B respectively represent red, green and blue. These three colors are called primary colors, and any color can be produced by adding them in different proportions.

In AN RGB image, each pixel has three primary colors, red, green and blue. Each primary color occupies 8 bits, or one byte, so a pixel occupies 24 bits, or three bytes. So a 1280 * 720 image takes up 1280 * 720 * 3/1024/1024 = 2.63 MB of storage space. Is there a more efficient color model that can represent the color with fewer bits? That’s YUV color coding.

(2) YUV (YCbCr) color coding

Related experiments show that the human eye is sensitive to brightness but not to chroma. Therefore, the brightness information and chroma information can be separated, and the chroma information using a more “malicious” compression scheme, so as to improve the compression efficiency.

YUV color coding uses lightness Y and chromophores UV to specify the color of the pixel. The “Y” represents Luminance or Luma, which is the gray scale value. The “U” and “V” are Chrominance, or Chroma, which describes the tone and saturation of the image.

Similar to the RGB representation of an image, each pixel contains Y, U, and V components. But its Y and UV components are separable, and no UV component can show the full image, but in black and white.

  • YCbCrThe color space isYUVThe international standardized variant of digital TELEVISION and image compression, such as JPEG.YCbCrIs actuallyYUVA scaled and offset copy. Among themYYUVIn theYThe same meaning,Cb.CrThe same refers to color, but different in the way of expression. inYUVIn the family,YCbCrIs the most widely used member of a computer system in a wide range of applications,JPEG,MPEGUse this format. What people say in generalYUVMostly refers toYCbCr. **Cb: **RGBThe blue part of the input signal isRGBThe difference between signal brightness values. **Cr: ** reflectRGBThe input signal in red isRGBThe difference between signal brightness values.
  • RGB is converted to Ycbcr formula
Y = 0.257*R+0.564*G+0.098*B+16
Cb = 0.148*R0.291*G+0.439*B+128
Cr = 0.439*R0.368*G0.071*B+128
Copy the code
  • Ycbcr is converted to the RGB formula
R = 1.164*(Y- 16) +1.596*(Cr- 128.)
G = 1.164*(Y- 16)0.392*(Cb- 128.)0.813*(Cr- 128.)
B = 1.164*(Y- 16) +2.017*(Cb- 128.)
Copy the code

(3) YUV sampling format

To save bandwidth, most YUV formats use less than 24 bits per pixel on average. The main subsample formats are YCBCR4:2:0, YCBCR4:2:2, YCBCR4:1:1, and YCBCR4:4.

  • 4:2:0 4:2:0 is the most widely used sampling format. 4:2:0 means 2:1 horizontal sampling and 2:1 vertical sampling. It’s a half smaller than RGB. Let’s take a closer look at this sampling format using the 4:2:0 example.

    You can see it in the picture aboveYUV4:2:0I’m going to store the whole imageYInformation, then storedUInformation, finally storedVInformation. But the ratio is different, as you can see, two rows per rowY, half a line is storedUAnd half a lineV.

  • 4:4:4 4:4:4 means complete sampling. It’s the same size as RGB.

  • 4:2:2 means 2:1 horizontal sampling and vertical full sampling. It’s a third smaller than RGB.

The above summary refers to and extracts the following articles. Thank you very much for sharing with the following authors! :

1. Lei Xiaohua’s video class “Making a video Player based on FFmpeg+SDL – Section 1 – Outline and Basic Knowledge of Video and Audio” (PS: tribute to Mr. Lei Xiaohua, the god of audio and video, thank you for your selfless sharing achievements for us before your death)

2. Basic Principles of H264 by audio and video live technical experts

3. RGB and YCbCr by TIM Duncan

【H.264/AVC video codec Technology detailed explanation 】 23, interframe prediction coding (1) : The basic principle of interframe prediction coding

Reprint please note the original source, shall not be used for commercial dissemination – fan how much