Video quality assessment (VQA) has long been an active research field, for two reasons. First, there is no unified and accurate evaluation standard. Second, video quality is affected by many factors, including subjective ones that are difficult to evaluate objectively and quantitatively. Years of research have produced many video quality assessment methods; this article briefly classifies and introduces them.
Classification of objective quality assessment methods
Video quality assessment methods fall into two groups: subjective tests and objective tests. A subjective test has human viewers watch and score the video. It is the best way to capture the audience's perception of video quality, and its scores are what objective methods ultimately try to predict. However, subjective testing is extremely labor-intensive and time-consuming, so it cannot be applied directly in industry.
Objective evaluation methods, as recommended by the International Telecommunication Union (ITU), can be divided into five categories according to the type of input data: media-layer model, parametric packet-layer model, parametric planning model, bitstream-layer model, and hybrid model. Among them, the media-layer model analyzes the media signal directly to produce an evaluation result, while the other methods estimate quality from external variables such as coding parameters or network channel state.
Media-layer methods can be further divided into three types according to whether the original video data before coding is needed as input: Full Reference (FR), Reduced Reference (RR) and No Reference (NR). Full reference uses the complete original video signal as comparison data, reduced reference uses a set of features extracted from the original video, and no reference uses only the data actually received by the user. The accuracy and applicability of these three methods vary greatly.
Figure 1: differences in video quality assessment of FR, RR and NR
Full reference video quality assessment
Of the three methods, full reference quality assessment is the most accurate, because it has the complete original data to compare against. However, the need for the original data severely limits its practical use, so it is generally found only in non-real-time evaluation systems, for example when tuning coding parameters or comparing the performance of different encoders during development.
Early full reference methods generally use pixel differences directly as the basis of measurement, as in mean square error (MSE) and peak signal-to-noise ratio (PSNR). These methods are simple to compute and reflect the degree of image distortion to a certain extent, so they are still used in many applications today.
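The two pixel-difference measures mentioned above can be sketched as follows. This is a minimal illustration, assuming frames are NumPy arrays of equal shape with 8-bit samples (peak value 255); the function names `mse` and `psnr` are chosen here for illustration and do not come from any particular library.

```python
import numpy as np


def mse(reference: np.ndarray, distorted: np.ndarray) -> float:
    """Mean square error between two frames of identical shape."""
    if reference.shape != distorted.shape:
        raise ValueError("frames must have the same shape")
    # Convert to float first so that uint8 subtraction cannot wrap around.
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    return float(np.mean(diff ** 2))


def psnr(reference: np.ndarray, distorted: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(reference, distorted)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))
```

For a video, a per-frame value is usually computed and then pooled over the sequence; how to pool (for example, averaging the MSE before converting to dB) is a design choice left open here.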
However, humans do not judge video quality by individual pixel differences alone. Leaving aside the large amount of motion information in video, even for static images the same pixel difference affects perceived quality differently depending on where and how it is distributed. To evaluate video quality better, researchers have proposed many new methods based on the characteristics of human vision, for example VSSIM, which is based on structural similarity, and VQM, which is based on combined statistics of several impairment factors. Their results are closer to the subjective impression of human viewers. Here is a diagram borrowed from K. Seshadrinathan and A. C. Bovik's "Motion Tuned Spatio-Temporal Quality Assessment of Natural Videos" that compares PSNR, VSSIM and VQM. In each of the three graphs, the horizontal axis is the objective test score and the vertical axis is the subjective test score. The PSNR results differ greatly from the subjective scores, the accuracy of VSSIM varies with the type of video, and VQM gives the best results.
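To make the idea of structural similarity concrete, here is a minimal sketch of a single-window SSIM computed over a whole frame. It is a simplification: the published SSIM index is computed over local windows, and VSSIM adds further weighting across regions and frames, none of which is reproduced here. The function name `global_ssim` is hypothetical; the constants use the commonly cited values K1 = 0.01 and K2 = 0.03.

```python
import numpy as np


def global_ssim(reference: np.ndarray, distorted: np.ndarray,
                max_value: float = 255.0) -> float:
    """SSIM formula applied once to the whole frame (no local windows)."""
    x = reference.astype(np.float64)
    y = distorted.astype(np.float64)
    if x.shape != y.shape:
        raise ValueError("frames must have the same shape")
    # Stabilizing constants from the SSIM definition.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return float(luminance * structure)
```

Even this simplified form shows why pixel error alone is not enough: adding a constant offset of 10 to every pixel and adding a ±10 checkerboard pattern both give an MSE of 100, yet the offset leaves the structure term untouched while the checkerboard lowers it sharply.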
Figure 2: comparison of objective and subjective evaluation scores of PSNR, VSSIM and VQM
Later, researchers introduced perceptual models based on the human visual system (HVS) to further improve the accuracy of video quality assessment. The most representative one is MOVIE (MOtion-based Video Integrity Evaluation). This method estimates the motion of objects in the video, combines distortion information from the temporal and spatial domains, and produces a distortion score that tracks subjective judgments closely. MOVIE is one of the best-performing video quality assessment methods, but its computational complexity is much higher than that of the algorithms mentioned above. In the chart below, the horizontal axis is the objective MOVIE score on the test sequences provided by the Video Quality Experts Group (VQEG) database, and the vertical axis is the subjective test score.
Figure 3: comparison between objective and subjective scores of MOVIE
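Scatter plots like the ones above are commonly summarized with a correlation coefficient between objective and subjective scores. The following is a minimal sketch of two such measures; the function names are hypothetical, and the rank computation assumes there are no tied scores.

```python
import numpy as np


def pearson(objective, subjective) -> float:
    """Linear correlation between objective and subjective scores."""
    return float(np.corrcoef(objective, subjective)[0, 1])


def spearman(objective, subjective) -> float:
    """Rank-order correlation; assumes no tied values in either list."""
    def ranks(values):
        # argsort of argsort gives each element's 0-based rank.
        return np.argsort(np.argsort(np.asarray(values)))
    return pearson(ranks(objective), ranks(subjective))
```

A rank correlation close to 1 (or close to -1 for metrics where a lower score means better quality) indicates that the metric orders videos the same way viewers do, even if the relationship is not linear.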
Reduced reference video quality assessment
Full reference video quality assessment requires the complete raw video signal, i.e., uncompressed pixel data. Data of this magnitude cannot be transmitted in real time, which makes real-time remote monitoring of video quality impossible. Reduced reference methods were proposed to solve this problem. They extract a set of features from the original video signal and use only those features to evaluate the received video. Common features include DCT coefficients and motion vectors. As a compromise between full reference and no reference, this approach solves the remote transmission problem at the cost of reduced accuracy. Most existing reduced reference methods only reach the accuracy level of PSNR.
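The feature-extraction idea can be sketched as follows. This is an illustrative toy, not a published reduced reference metric: it assumes the feature set is the mean absolute DCT coefficient at each of the 64 frequency positions over all 8x8 blocks of a grayscale frame, and it compares two feature sets with a plain mean absolute difference. All names (`dct_matrix`, `extract_features`, `feature_distance`) are hypothetical.

```python
import numpy as np

BLOCK = 8  # block size assumed for this sketch


def dct_matrix(n: int = BLOCK) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def extract_features(frame: np.ndarray) -> np.ndarray:
    """Return 64 values: mean |DCT coefficient| per frequency position."""
    f = frame.astype(np.float64)
    # Crop to a whole number of blocks.
    h = f.shape[0] - f.shape[0] % BLOCK
    w = f.shape[1] - f.shape[1] % BLOCK
    blocks = f[:h, :w].reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK)
    blocks = blocks.swapaxes(1, 2)    # shape: (rows, cols, 8, 8)
    d = dct_matrix()
    coeffs = d @ blocks @ d.T         # 2-D DCT of every block
    return np.abs(coeffs).mean(axis=(0, 1)).ravel()


def feature_distance(ref_features: np.ndarray,
                     dis_features: np.ndarray) -> float:
    """Mean absolute difference between two feature vectors."""
    return float(np.mean(np.abs(ref_features - dis_features)))
```

In this sketch the sender would transmit only the 64 feature values per frame alongside the video, and the receiver would run `extract_features` on the decoded frame and compare. That is far less data than the raw pixels, which is the point of the reduced reference approach.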
No reference video quality assessment
No reference video quality assessment does not require the pre-distortion data at all. It needs only the video that the audience actually receives and produces a general quality score from it. Although this kind of method is difficult to implement, once available it can be applied flexibly in every video-related field, which makes it the ideal form of video quality assessment. So far, however, there is no mature no reference scheme. On the one hand, its accuracy still lags behind that of the reference-based methods; on the other hand, it depends heavily on video content, so its generality cannot be guaranteed.
Nevertheless, no reference assessment remains a focus of video quality research. The progress and popularity of machine learning in recent years also offer a new direction for evaluating video quality without a reference. There have been some attempts to apply machine learning to no reference quality assessment, but their effectiveness remains to be proven. We believe that continued exploration by researchers will lead to a mature scheme in the future.
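As a small illustration of what a no reference measurement looks like, the sketch below estimates blocking artifacts from the received frame alone. It is a toy indicator under stated assumptions (grayscale frame, 8-pixel coding blocks aligned to the frame origin, horizontal direction only) and not one of the mature schemes that the text says are still missing. The function name `blockiness` is hypothetical.

```python
import numpy as np


def blockiness(frame: np.ndarray, block: int = 8) -> float:
    """Ratio of pixel jumps across block edges to jumps inside blocks.

    Values near 1 suggest no visible block structure; larger values
    suggest stronger blocking artifacts.
    """
    f = frame.astype(np.float64)
    if f.shape[1] <= block:
        raise ValueError("frame must be wider than one block")
    # diff[:, j] is the jump between columns j and j + 1.
    diff = np.abs(np.diff(f, axis=1))
    cols = np.arange(diff.shape[1])
    at_edge = (cols % block) == (block - 1)
    edge_jump = diff[:, at_edge].mean()
    inner_jump = diff[:, ~at_edge].mean()
    # The small constant avoids division by zero on perfectly flat blocks.
    return float(edge_jump / (inner_jump + 1e-8))
```

The sketch also shows the content dependence mentioned above: a frame whose real edges happen to line up with the block grid would score as blocky even if it were undistorted.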
Conclusion
Video quality assessment is a broad topic, and this article has only outlined the types of objective assessment methods and the scenarios they suit. In practice, the method still has to be chosen according to the actual situation, for example whether video quality must be compared across different frame rates or resolutions, or whether the impact of network jitter must be considered. The following classification chart summarizes the methods discussed:
Figure 4: rough classification of video quality assessment methods
For more articles on instant messaging and audio and video technology, you can visit the NetEase Yunxin blog.