
For more blog posts, see the full catalog of the audio and video systems learning series.

Previous posts in this series:
YUV color coding
Analysis of video coding principles: from the film of Sun Yizhen (1)
Analysis of video coding principles: from the film of Sun Yizhen (2)

The previous few blog posts have mostly laid the groundwork in C, C++ and the NDK. Starting today we move into the theory of audio and video, beginning with the most basic concepts of video. Only with a clear grasp of these basics can we learn the later content well.

We know that a video is made up of a sequence of still images (frames), so let's start with a single frame.

Pixel

Everyone is probably familiar with pixels: phone screen specs are usually quoted in pixels. The pixel is the basic unit of an image, and pixels together make up the image. You can think of a pixel as a single point in the image, like Super Mario on screen:

When there are enough pixels, the lines in the image look smooth enough.

On a typical screen, each pixel is made up of red, green and blue light:

Then the entire screen is made up of a dense mass of pixels:

Resolution

An image needs a parameter to say how many pixels it contains, and that is resolution. For example, an image with a resolution of 1920×1080 is 1920 pixels wide and 1080 pixels high, i.e. 1920 × 1080 = 2,073,600 pixels (about 2 megapixels) in total.

Intuitively, the higher the resolution, the sharper the image. That is not quite precise: for an original image it is true that higher resolution means a sharper picture. But if the resolution is increased by enlarging the image, the result is actually blurrier, because the extra pixels are filled in by interpolation:

For example, here the original image is on the left and the enlarged image on the right. The right image has a higher resolution, yet it is blurrier:
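To make the interpolation point concrete, here is a minimal C++ sketch of the simplest scaling strategy, nearest-neighbor, on a grayscale buffer (the function name and layout are illustrative, not from any real library). Nearest-neighbor only repeats existing pixels, which looks blocky; the bilinear or bicubic filters real scalers use instead blend neighboring pixels, and that blending is exactly the softness we perceive as blur:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Upscale a row-major grayscale buffer by mapping every destination
// pixel back to the nearest source pixel. No new detail is created;
// existing pixels are simply repeated.
std::vector<uint8_t> upscaleNearest(const std::vector<uint8_t>& src,
                                    int srcW, int srcH, int dstW, int dstH) {
    std::vector<uint8_t> dst(static_cast<size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            int sx = x * srcW / dstW;  // nearest source column
            int sy = y * srcH / dstH;  // nearest source row
            dst[static_cast<size_t>(y) * dstW + x] =
                src[static_cast<size_t>(sy) * srcW + sx];
        }
    }
    return dst;
}

int main() {
    // Upscale a 2x2 checkerboard to 4x4: each source pixel becomes a 2x2 block.
    std::vector<uint8_t> src = {0, 255, 255, 0};
    auto dst = upscaleNearest(src, 2, 2, 4, 4);
    for (int y = 0; y < 4; ++y) {
        for (int x = 0; x < 4; ++x) std::printf("%4d", dst[y * 4 + x]);
        std::printf("\n");
    }
    return 0;
}
```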

Bit depth

As mentioned earlier, on a typical screen each pixel is made up of red, green and blue light:

That is, the three typical color components are the red, green and blue channels: R, G and B.

If we want a pixel on the screen to display the color we want, we must send the corresponding data to the screen. That data can be regarded as a protocol between us and the screen, and the R, G, B mode is one such protocol. Usually R, G and B each occupy 8 bits, i.e. one byte. Eight bits can represent 256 values per channel, so three channels give 256³ = 16,777,216 colors, roughly 16.77 million. We call such an image an 8-bit image, and this 8 bits is its bit depth.
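As a small illustration of the 8-bit RGB layout described above, here is a minimal C++ sketch; the Rgb888 struct name is illustrative, not a real API:

```cpp
#include <cstdint>
#include <cstdio>

// One pixel in the common 8-bit-per-channel RGB layout.
struct Rgb888 {
    uint8_t r;  // red channel, 0..255
    uint8_t g;  // green channel, 0..255
    uint8_t b;  // blue channel, 0..255
};

int main() {
    // Each 8-bit channel has 2^8 = 256 levels, so three channels
    // give 256^3 = 16,777,216 representable colors.
    const long long levels = 1 << 8;
    const long long colors = levels * levels * levels;
    std::printf("colors per pixel: %lld\n", colors);          // 16777216
    std::printf("bytes per pixel:  %zu\n", sizeof(Rgb888));   // typically 3
    return 0;
}
```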

The larger the bit depth, the more colors we can represent, and of course the more storage the image takes and the more traffic it costs to move over the network.

Frame rate

As we know, a video is composed of many apparently continuous frames. Thanks to the persistence of vision of the human eye, once the number of frames per second reaches a certain value, the eye can no longer tell the frames apart and they look like a continuous picture. The human visual system can process roughly 10 to 12 images per second and perceive them individually; at higher rates it perceives motion.

Frame rate is the frequency (rate) at which consecutive images, called frames, appear on the display, i.e. the number of images shown per second.

Higher frame rates result in smoother, more realistic motion. Generally 30 fps is acceptable; raising it to 60 fps noticeably improves the sense of interaction and realism, but beyond about 75 fps the improvement in fluency is hard to perceive.

In addition, a high frame rate means many images must be processed per second, which demands high device performance; it also means more traffic and higher bandwidth requirements. Therefore different frame rates are chosen for different scenarios. In real-time communication, for example, the frame rate is lowered to save bandwidth, usually to around 15 fps. An ordinary video with modest smoothness requirements runs at around 30 fps, while videos with high smoothness requirements run at around 60 fps. The per-frame time budget each of these rates implies is sketched below.
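A minimal C++ sketch of that time budget, using the frame rates mentioned above; each frame must be captured, encoded, transmitted and rendered within this window:

```cpp
#include <cstdio>

int main() {
    // 15 fps for real-time communication, 30 fps for ordinary video,
    // 60 fps for high-smoothness content.
    const double rates[] = {15.0, 30.0, 60.0};
    for (double fps : rates) {
        // Time available per frame before the next one is due.
        std::printf("%5.1f fps -> %.1f ms per frame\n", fps, 1000.0 / fps);
    }
    return 0;
}
```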

Bit rate

For a video file, size can be described directly by file size; but for a live video stream there is no fixed file, so we describe it by bit rate.

Bit rate is the amount of video data per unit time, usually one second, and the unit is Kb/s or Mb/s. For raw 8-bit RGB video, the bit rate is calculated as follows:

Bit rate (bit/s) = resolution (width × height) × 3 (bytes per pixel) × 8 (bits per byte) × frame rate
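Plugging in numbers makes it obvious why raw video is always compressed before it is streamed. A minimal C++ sketch, assuming an illustrative 1920×1080, 30 fps configuration (the values are examples, not from the article):

```cpp
#include <cstdio>

int main() {
    const long long width = 1920, height = 1080;
    const long long bytesPerPixel = 3;  // 8-bit R, G and B
    const long long fps = 30;

    // width x height x 3 bytes x 8 bits x frame rate
    long long bitsPerSecond = width * height * bytesPerPixel * 8 * fps;
    std::printf("raw bit rate: %lld bit/s (~%.2f Gb/s)\n",
                bitsPerSecond, bitsPerSecond / 1e9);  // ~1.49 Gb/s
    return 0;
}
```

Roughly 1.49 Gb/s for raw 1080p at 30 fps, which is far beyond what ordinary networks can carry; compression is what makes streaming such video practical.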

Generally speaking, for the same video under the same compression algorithm, the lower the bit rate, the higher the compression ratio and the more distorted the video. Across different compression algorithms, however, you cannot simply say that a lower bit rate means more distortion.

Conclusion

Today we covered the most basic concepts of video: pixel, resolution, bit depth, frame rate and bit rate. With these basics in hand, we can start learning about color coding: YUV color coding, in the basics of audio and video development.

Original content is not easy to produce. If you found this article helpful, don't forget to like and follow; it is a real affirmation for the author ~