Why video coding?

As is well known, a video is a sequence of consecutive images, and each image is called a frame. Because of persistence of vision, a sequence of frames played back at a sufficient rate appears to the human eye as continuous motion. Video is typically captured at 25 or 30 frames per second. Once digitized, the raw video signal is very large, and typical networks and storage devices cannot practically carry or store it uncompressed. Because consecutive frames are highly similar, and neighboring pixels within a frame are similar too, video can be encoded to remove this temporal and spatial redundancy, which makes it practical to store and transmit. Mainstream video coding standards include H.264/AVC, H.265/HEVC, VP8, VP9, and VVC.
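To get a feel for how large uncompressed video is, the arithmetic can be sketched as below. The 1920x1080 resolution and the 12 bits per pixel (8-bit YUV 4:2:0 sampling) are assumptions chosen for illustration, not figures from this article.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=12):
    """Bits per second of uncompressed video.

    bits_per_pixel=12 assumes 8-bit YUV 4:2:0 sampling
    (an assumption made for this example).
    """
    return width * height * bits_per_pixel * fps


bps = raw_bitrate_bps(1920, 1080, 30)
print(f"{bps / 1e6:.1f} Mbit/s")              # 746.5 Mbit/s
print(f"{bps / 8 * 3600 / 1e9:.1f} GB/hour")  # 335.9 GB/hour
```

Under these assumptions, a single uncompressed stream needs roughly 750 Mbit/s, far more than a typical consumer connection provides.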

What is scalable video coding?

SVC (Scalable Video Coding) is an extension of traditional video coding. An SVC stream is layered and can be scaled in frame rate, resolution, and quality. The encoder outputs multiple layers in a single pass, a base layer plus one or more enhancement layers, so the same stream can serve different terminals and network conditions. The base layer uses little bandwidth and guarantees basic video quality; adding enhancement layers to it yields a higher frame rate, resolution, or quality. Let’s take a closer look at the features of SVC encoding:

Scalability

SVC scalability comes in three forms: temporal, spatial, and quality. Temporal scalability decomposes the video stream into layers representing different frame rates, spatial scalability decomposes it into layers representing different resolutions, and quality scalability decomposes it into layers representing different levels of fidelity of the pixel values.

  • Temporal scalability

The frames of the video sequence are divided into multiple non-overlapping layers. The base layer frames are encoded as ordinary video and provide a base layer bitstream at the basic frame rate. Enhancement layer frames are encoded using inter-frame prediction from base layer data. Adding an enhancement layer gives a higher frame rate and smoother video.
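One common way to split frames into temporal layers is a dyadic hierarchy, in which each added layer doubles the frame rate. The sketch below assumes that layout with three layers; the function names are hypothetical and it only assigns and filters frame indices, it does not encode anything.

```python
def temporal_layer(frame_index, num_layers=3):
    """Return the temporal layer id of a frame in a dyadic hierarchy.

    Layer 0 is the base layer; each higher layer doubles the frame rate.
    """
    period = 1 << (num_layers - 1)  # distance between two base layer frames
    pos = frame_index % period
    if pos == 0:
        return 0
    layer = num_layers - 1
    # Each factor of two in the position moves the frame one layer down.
    while pos % 2 == 0:
        pos //= 2
        layer -= 1
    return layer


def extract(frame_indices, max_layer, num_layers=3):
    """Keep only the frames that belong to layers 0..max_layer."""
    return [i for i in frame_indices
            if temporal_layer(i, num_layers) <= max_layer]


frames = list(range(8))
print([temporal_layer(i) for i in frames])  # [0, 2, 1, 2, 0, 2, 1, 2]
print(extract(frames, 0))                   # [0, 4]
print(extract(frames, 1))                   # [0, 2, 4, 6]
```

With a 30 frames per second source, the three extraction levels in this layout would correspond to 7.5, 15, and 30 frames per second.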

  • Spatial scalability

Several versions of each frame are generated at different spatial resolutions. The base layer bitstream encodes the low-resolution images, and the high-resolution images are obtained by adding the enhancement layer bitstream on top of it.
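The base-plus-enhancement idea can be illustrated with a toy two-layer scheme: the base layer is a half-resolution image, and the enhancement layer is the residual between the original and the upsampled base. This is only a sketch under simplifying assumptions (grayscale images as nested lists, even dimensions, no transform or entropy coding); a real SVC encoder uses inter-layer prediction inside a full codec, and all function names here are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (integer division)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img):
    """Double the resolution by repeating each pixel (nearest neighbor)."""
    h, w = len(img), len(img[0])
    return [[img[y // 2][x // 2] for x in range(2 * w)] for y in range(2 * h)]


def encode(img):
    """Return (base layer, enhancement layer residual)."""
    base = downsample(img)
    pred = upsample(base)
    enh = [[o - p for o, p in zip(row_o, row_p)]
           for row_o, row_p in zip(img, pred)]
    return base, enh


def decode(base, enh=None):
    """Decode the base layer alone, or base plus enhancement."""
    if enh is None:
        return base
    pred = upsample(base)
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(pred, enh)]


frame = [[10, 12, 50, 52],
         [11, 13, 51, 53],
         [90, 92, 20, 22],
         [91, 93, 21, 23]]
base, enh = encode(frame)
print(decode(base))                  # low-resolution picture, 2x2
print(decode(base, enh) == frame)    # True: full resolution recovered
```

A receiver that only gets the base layer still has a usable picture at half resolution, and one that also gets the residual recovers the full-resolution frame.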

  • Quality scalability

The encoder stores the result at several quality levels, so that the decoder can reconstruct the picture at whatever quality is required. The base layer bitstream encodes a lower-quality image, which is cheaper to decode, and the enhancement layer brings the image up to a higher quality.
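Quality layering can be sketched with two quantization steps: the base layer quantizes each sample coarsely, and the enhancement layer refines the remaining error with a finer step. The step sizes (16 and 4), the sample values, and the function names are assumptions for illustration; real codecs quantize transform coefficients rather than raw samples.

```python
def encode_quality_layers(samples, coarse_step=16, fine_step=4):
    """Split non-negative samples into a coarse base layer and a refinement layer."""
    base = [s // coarse_step for s in samples]
    recon = [b * coarse_step for b in base]
    # The enhancement layer encodes what the base layer left out.
    enh = [(s - r) // fine_step for s, r in zip(samples, recon)]
    return base, enh


def decode_quality_layers(base, enh=None, coarse_step=16, fine_step=4):
    """Reconstruct from the base layer alone, or from base plus enhancement."""
    recon = [b * coarse_step for b in base]
    if enh is not None:
        recon = [r + e * fine_step for r, e in zip(recon, enh)]
    return recon


samples = [3, 50, 127, 200, 255]
base, enh = encode_quality_layers(samples)
low = decode_quality_layers(base)
high = decode_quality_layers(base, enh)
print(max(abs(s - r) for s, r in zip(samples, low)))   # error below coarse_step
print(max(abs(s - r) for s, r in zip(samples, high)))  # error below fine_step
```

Decoding the base layer alone bounds the error by the coarse step, while adding the enhancement layer tightens it to the fine step.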

Comparison of SVC and H.264/AVC

Taking the most common codec, H.264, as an example, the fundamental difference between SVC and AVC is that SVC adapts its bit rate within a single stream, whereas AVC needs a separately encoded stream for each resolution or bit rate. The specific differences are as follows:

Advantages of SVC

  • Encode once, decode many times.

Without repeated encoding or transcoding, each decoder can choose which layers of the stream to decode, depending on network conditions and device capabilities. For example, suppose there are three participants in a meeting. Client A has good bandwidth, so the server sends it the base layer and one enhancement layer. Client B has very low bandwidth and requests only the base layer, but still sees smooth video. Client C has very good bandwidth and can request the base layer and two enhancement layers, which gives very good video quality.
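The meeting example can be sketched as a simple layer selection rule on the server side. The per-layer bit rates, the client bandwidths, and the function name below are hypothetical values chosen so that the outcome matches the example; a real server would also react to packet loss and device capability.

```python
# Hypothetical bit rates in kbit/s: base layer, enhancement 1, enhancement 2.
LAYER_KBPS = [300, 500, 1200]


def select_layers(available_kbps, layer_kbps=LAYER_KBPS):
    """Return how many layers (base layer first) fit into the available bandwidth.

    The base layer is always sent, even if the bandwidth is below its bit rate,
    because nothing can be decoded without it.
    """
    total = 0
    count = 0
    for rate in layer_kbps:
        if total + rate > available_kbps:
            break
        total += rate
        count += 1
    return max(count, 1)


clients = {"A": 1000, "B": 350, "C": 3000}  # hypothetical bandwidths in kbit/s
for name, kbps in clients.items():
    print(name, select_layers(kbps))
# A 2  (base + one enhancement layer)
# B 1  (base layer only)
# C 3  (base + two enhancement layers)
```

The stream is encoded once, and the only per-client work is deciding which layers to forward.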

  • Resistance to packet loss and high fault tolerance.

SVC helps optimize network transmission and makes video more tolerant of packet loss. The base layer can be decoded on its own, so errors or losses in the enhancement layers do not affect the smoothness of the video. Loss and error protection of different strengths can be applied to the base layer and the enhancement layers. Even counting the forward error correction overhead, the overall SVC bit rate can be lower, because only the base layer needs strong protection.

  • Good compatibility: the base layer bitstream can be decoded by H.264 decoders that do not support SVC.
  • More economical

SVC does not need the dedicated lines that traditional video conferencing requires. It can meet the requirements of commercial video applications over the Internet and mobile Internet, and it can greatly reduce the cost of network bandwidth and hardware, making professional video applications available to more people in their work and daily life.