How should testers measure the quality of live-streaming products? Which aspects of performance do users care about most? What standards apply to audio and video quality testing? Once the features meet users' needs, the key to future competition is improving the quality metrics of the live-streaming software: running dedicated tests on fluency, sharpness, sound quality, stability, and traffic consumption to raise the quality of audio and video calls.

Fundamentals of audio and video

1. Capture

Audio and video are captured by sensors in hardware such as cameras and microphones, then transmitted and transformed until they become digital signals the computer can process. For both two-person and group video, capture and playback are handled by the FFmpeg plug-in.

2. Pre-processing

Audio pre-processing includes automatic gain control (AGC), noise suppression (ANS), acoustic echo cancellation (AEC), voice activity detection (VAD), and so on. Video pre-processing includes denoising, scaling, and so on.
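To make the VAD step concrete, here is a minimal sketch of an energy-threshold voice activity detector. This is an illustrative stand-in, not the detector a real engine (e.g. WebRTC) uses; the function name, frame length, and threshold are assumptions chosen for the example.

```python
import numpy as np

def energy_vad(samples, frame_len=160, threshold=1e-3):
    """Classify each frame of a mono float signal as active (True) or
    silent (False) using short-term energy -- a toy stand-in for a real VAD."""
    n_frames = len(samples) // frame_len
    flags = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = np.mean(frame ** 2)   # average power of the frame
        flags.append(energy > threshold)
    return flags

# 1 s of silence followed by 1 s of a 440 Hz tone, sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
signal = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t)])
flags = energy_vad(signal)
print(sum(flags), len(flags))  # half of the 200 frames are flagged as active
```

A production VAD would add spectral features and hangover smoothing, but the idea is the same: frames below an energy floor need not be encoded or transmitted.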

3. Codecs

A signal or data stream requires codec operations: encoding it (usually for transmission, storage, or encryption) or decoding it back from the encoded stream. There are many video codecs, such as VP8, VP9, MPEG, H.264, and so on. Audio codecs fall into two categories: speech codecs (SILK, Speex, iSAC, etc.) and general audio codecs (CELT, AAC, etc.).

4. Network transmission

During network transmission, UDP or TCP is chosen according to the network environment. For real-time audio and video calls, UDP is preferred because of its low overhead and low latency. Transmission-loss handling is also needed, including packet-size control, forward error correction (FEC), packet-loss retransmission, and jitter, delay, and out-of-order handling.
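The jitter and out-of-order handling mentioned above is usually done with a jitter buffer on the receiver. The following is a minimal sketch of the idea, with an invented `JitterBuffer` class and a fixed buffering depth as assumptions; a real buffer would be time-based and adaptive.

```python
import heapq

class JitterBuffer:
    """Minimal jitter buffer: holds packets until `depth` of them are buffered,
    then releases them in sequence-number order, masking network reordering."""
    def __init__(self, depth=3):
        self.depth = depth
        self.heap = []          # min-heap ordered by sequence number

    def push(self, seq, payload):
        heapq.heappush(self.heap, (seq, payload))

    def pop(self):
        # Only release once enough packets are buffered to absorb jitter.
        if len(self.heap) >= self.depth:
            return heapq.heappop(self.heap)
        return None

buf = JitterBuffer(depth=3)
for seq in [2, 1, 3, 5, 4]:        # packets arrive out of order
    buf.push(seq, f"frame-{seq}")
out = []
while (pkt := buf.pop()) is not None:
    out.append(pkt[0])
print(out)  # sequence numbers are released in order: [1, 2, 3]
```

The trade-off is visible even in the toy: a deeper buffer tolerates more reordering and jitter but adds latency, which is why real-time calls keep the depth small and adaptive.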

5. Post-processing

After the data reaches the receiver over the network and is decoded, it enters post-processing: audio may need resampling or mixing, while video may need deblocking filtering, temporal denoising, and so on.

6. Playback/rendering

After post-processing, converting the digital signal back into sound and pictures is the playback/rendering step. Common audio playback APIs on Windows are DirectSound, WaveOut, and Core Audio.

Video quality standards

The following describes video quality standards and test methods.

1. Room-entry speed

Normal network: entering a room takes less than 1 second (iOS and Android). Weak network: there is no room-entry speed standard for weak networks. For Android, a low-end model (such as the Mi Note) is recommended; for iOS, an iPhone 6s.

Test method

Scenario coverage: all entrances should be covered, such as in-app, QQ, Qzone, WeChat, Moments, and Sina Weibo.

1. Start a millisecond stopwatch on one phone, then open the product under test on another phone and enter the anchor's room;

2. When the first frame of the anchor's room appears, pause the stopwatch and record the time;

3. Repeat the steps above 20 times and take the average as the final result.
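The repeated stopwatch readings can be reduced to a pass/fail summary with a few lines of script. This is a hypothetical helper (the function name and sample readings are invented for illustration), checking the average against the 1-second standard.

```python
def summarize_entry_times(times_ms, threshold_ms=1000):
    """Average repeated room-entry measurements and check them against the
    1-second standard. `times_ms` are stopwatch readings in milliseconds."""
    avg = sum(times_ms) / len(times_ms)
    return {"average_ms": round(avg, 1), "passes": avg < threshold_ms}

# Hypothetical readings from 5 of the 20 repetitions
result = summarize_entry_times([820, 910, 760, 1050, 880])
print(result)  # {'average_ms': 884.0, 'passes': True}
```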

Competitor data

Platform | Product      | Room-entry time (ms)
Android  | Competitor A |
Android  | Competitor B |
Android  | Competitor C |
Android  | Competitor D |
iOS      | Competitor A |
iOS      | Competitor B |
iOS      | Competitor C |
iOS      | Competitor D |


2. Sharpness

Normal network: sharpness is no worse than the previous version. Weak network: at a 10% network packet-loss rate, sharpness is no worse than on a normal network. Tool: Imatest. Environment setup:

1. Place the camera 0.75 m from the target chart and keep the light sources at 45° to the chart, so that no shadow falls on the chart surface;

2. When using fluorescent lamps (D65/CWF/SP35), preheat the light source for at least 15 minutes;

3. Measure the illuminance and color temperature at 9 points on the chart surface to ensure even lighting, and adjust the position of the phone under test so that its shot is centered.

1. Take photos with each competing product and import them to a PC. In Imatest, click SFR: New File to compute sharpness;

2. Select the image to be processed and add it. Select the 13 distributed regions of interest on the image for processing, and click "Yes, Continue" when done;

3. Click [OK] and [Yes]; the resulting MTF50P value is the sharpness of the image.

Influencing factors

Resolution and bit rate strongly affect sharpness: the higher the bit rate and the resolution, the sharper the video. Do not judge sharpness by resolution alone.

Competitor data

Anchor platform | Product      | Sharpness (MTF50P)
iOS             | Competitor A |
iOS             | Competitor B |
iOS             | Competitor C |
Android         | Competitor A |
Android         | Competitor B |
Android         | Competitor C |


3. Frame rate

Normal network: because of the physiology of human vision, a frame rate above 16 fps is perceived as continuous, so the frame rate should be no lower than 16 fps. The exact frame rate can be set as needed, taking competing products into account. Below 5 fps the eye clearly perceives the picture as discontinuous and the stream feels stuttery. Weak network: at a 10% network packet-loss rate, the frame rate does not drop significantly compared with a normal network.

Test method

Equipment: 2 computers + 1 camera + 2 phones. One computer plays the source video, one computer records, one phone acts as the anchor, one phone acts as the audience, and the camera captures the audience-side picture. Video source: a specific demo.avi. Steps:

1. Computer 1 plays demo.avi in a loop; computer 2 has the camera attached and the recording software open;

2. Phone A starts a live broadcast and points its camera at computer 1; phone B joins as the audience, and the camera on computer 2 is pointed at phone B;

3. Click Capture → Capture Video, set the Capture Folder, then capture about 10-20 s of video; the captured video is in MPG format;

4. Convert the MPG file to YUV format: edit mpeg2dec.cmd, change the file name to that of the captured video, save, and run mpeg2dec.exe.

5. Open YUVviewerPlus.exe and set the recording resolution (the default recording resolution is 720×480);

6. Click "Next" to count frames. At 30 frames per second, the number of picture changes within 30 frames is the frame rate (counting about 3 s is better). It is recommended to average counts from the beginning, middle, and end of the recorded video.
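The manual frame count in step 6 can be automated once the capture is in raw frames: the picture changes between two consecutive 30 fps frames exactly when their buffers differ. This is a simplified sketch (the function and the toy frame data are assumptions; a real script would read fixed-size I420 frames from the YUV file).

```python
def count_unique_frames(frames):
    """Given a list of raw frame byte-buffers sampled at 30 fps, count how many
    distinct pictures appear -- the effective frame rate over a 1 s window.
    A stand-in for the manual count done in YUVviewerPlus."""
    changes = 0
    for prev, cur in zip(frames, frames[1:]):
        if prev != cur:
            changes += 1
    return changes + 1   # the first frame counts as one picture

# Toy example: 30 captured frames where the picture changes every 2nd frame,
# i.e. a 15 fps stream recorded at 30 fps
frames = [bytes([i // 2]) for i in range(30)]
print(count_unique_frames(frames))  # 15 distinct pictures -> ~15 fps
```

As the document recommends, running this over windows at the beginning, middle, and end of the recording and averaging gives a more stable figure.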

Influencing factors

Under a normal, unimpaired network, the frame rate is mainly affected by the video bit rate: the higher the bit rate, the higher the frame rate and resolution the encoder can sustain.

Competitor data

Product      | Anchor platform | Frame rate (fps)
Competitor A | iOS             |
Competitor A | Android         |
Competitor B | iOS             |
Competitor B | Android         |
Competitor C | iOS             |
Competitor C | Android         |


4. Stutter count

Standard

Normal network: Weak network:

Test method

Globe (iOS) or an automated test tool (Android)


5. Video quality stability

Under various network-impairment scenarios, no corrupted frames, black screens, or automatic interruptions occur within 3 hours of live broadcast.

Test method

1. Run the network-impairment automation test and record the stream with audio/video recording software;

2. Check whether the recorded video shows corrupted frames, black screens, or abnormal interruptions.

Audio quality standards

The following describes audio quality standards and their test methods.

1. Sampling rate

Normal network: audio sampling rate greater than 16 kHz. Weak network: audio sampling rate greater than 16 kHz. The test must cover both the live-broadcast and co-streaming (Lian Mai) scenarios.

Test method

Equipment: two phones, a device that can play samples, and a voice recorder.

1. One phone enters the anchor environment; the other acts as the audience;

2. Play a voice (music) sample at the anchor side with the playback device;

3. Record the audience-side output with the voice recorder;

4. Check the spectrum in Adobe Audition: the highest frequency component is about 7 kHz, so the sampling rate should be 16 kHz.
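The spectrum check in step 4 can also be scripted: find the highest frequency with significant energy, and by the Nyquist criterion the capture sample rate must be at least twice that. This is a sketch under assumptions (the function name, the -60 dB significance floor, and the synthetic tone are invented for illustration).

```python
import numpy as np

def highest_frequency(samples, sample_rate, floor_db=-60):
    """Estimate the highest significant frequency in a recording via FFT,
    mirroring the Adobe Audition spectrum check: content up to ~7 kHz
    implies the capture sample rate is at least 16 kHz (Nyquist)."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    significant = spectrum > spectrum.max() * 10 ** (floor_db / 20)
    return freqs[significant].max()

# Synthetic check: a 7 kHz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 7000 * t)
print(round(highest_frequency(tone, sr)))  # 7000
```

On a real capture, run this on the recorded WAV: if the estimate sits near 7 kHz, the pipeline is carrying wideband (16 kHz) audio; a cutoff near 3.4 kHz would indicate narrowband capture.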

2. Objective sound-quality score

Normal network: in a normal-network broadcast, the voice quality score is >= 4.0. Weak network: in a weak-network broadcast, the voice quality score is >= 3.5.

Test method

Live-broadcast mode: since live latency exceeds 2 seconds, record over an audio cable and cut the recording, then score it with a SPIRENT device. Equipment: two audio cables, one PC, two phones.

1. Connect the anchor phone's microphone to the PC speaker, and the audience phone's speaker to the PC microphone.

2. On the PC, play a 48 kHz voice sample for 10 seconds.

3. Record in Adobe Audition for about 2 minutes;

4. Cut the recording into segments (each voice segment is 10 s, keeping about 3 s of leading blank).

5. Upload the cut audio files to the SPIRENT device and compute the average POLQA score.

Co-streaming (Lian Mai) mode: the delay is less than 1 s, so the SPIRENT device can measure sound quality directly.

1. Connect the anchor and audience phones in co-streaming mode;

2. Connect the SPIRENT device and test sound quality; the two-way test takes about 8 minutes;

3. Take the average sound-quality score.
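The segment-cutting step before POLQA scoring is mechanical and can be scripted. This sketch slices a recording into fixed 3 s blank + 10 s voice windows; the function name and the placeholder recording are assumptions, and a real script would operate on WAV data rather than a plain list.

```python
def cut_segments(samples, sample_rate, seg_s=10, lead_s=3):
    """Cut a long recording into scoring segments: each keeps `lead_s` seconds
    of leading blank plus `seg_s` seconds of voice, matching the manual
    Audition editing step before POLQA scoring."""
    step = (lead_s + seg_s) * sample_rate
    return [samples[i:i + step] for i in range(0, len(samples) - step + 1, step)]

# A 2-minute mono recording at 16 kHz (placeholder zeros)
sr = 16000
recording = [0] * (120 * sr)
segments = cut_segments(recording, sr)
print(len(segments), len(segments[0]) / sr)  # 9 segments of 13.0 s each
```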

3. Sound and picture synchronization

In both normal and weak networks, no audio/video desynchronization occurs (the probability of out-of-sync is 0).

Test method

While watching the live broadcast, judge subjectively whether the anchor's mouth movements in the video match the voice.

4. Co-streaming (Lian Mai): noise suppression

In anchor-audience co-streaming mode, the anchor → audience noise-suppression effect is no worse than the previous version.

Test method

Equipment: one audio cable, one device for playing voice samples, and one PC.

1. Connect the anchor and audience phones in co-streaming mode;

2. Put the anchor's phone in an anechoic room in a fixed position, then play the noise sample in the anechoic room with the sample-playback device;

3. Connect the audience phone's speaker to the PC microphone;

4. Record in Adobe Audition and save the file;

5. Record the previous version in the same way (keeping the test environment identical);

6. Compare the old and new versions: select the same speech segment and noise segment and compute the SNR.

5. Co-streaming (Lian Mai): echo cancellation

Standard: in anchor-audience co-streaming mode, the speaking party hears little echo and communication is not affected.

Test method

Single talk: the audience turns on the speakerphone while the anchor speaks, and the anchor listens subjectively for their own echo; then swap roles, with the audience speaking and listening for echo. Double talk: both sides turn on speakerphones and speak at the same time, listening subjectively for echo or choppy, cut-off audio.