**4. Statistical Analysis and Presentation of the Results**

After performing the subjective tests, we processed all collected results statistically: for each test sequence, codec, and resolution, the Mean Opinion Score (MOS) and the 95% Confidence Interval (CI) were calculated in accordance with Reference [64] and plotted in graphs, as shown below. The presentation of the results is divided into five parts. In the first part, the results obtained from the two laboratories, i.e., UNIZA and VŠB, are cross-compared using the Pearson correlation coefficient (PCC) and the Root Mean Square Error (RMSE). In the second part, the impact of bitrate on the perceived video quality is plotted as a function of scene content. The third part deals with the Analysis of Variance (ANOVA), which was applied to the acquired data. In the fourth part, the impact of bitrate on the perceived video quality is presented in terms of the codec and resolution used. Finally, in the fifth part, the minimum bitrate thresholds at which a video sequence should be encoded to reach a certain quality are determined.
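The MOS and its confidence interval for one test condition can be sketched as follows. This is a minimal illustration, not the exact procedure of Reference [64]: the multiplier 1.96 (a normal approximation of the 95% interval) is an assumption, and the ratings are hypothetical values on the five-point ACR scale.

```python
import math

def mos_with_ci(scores, z=1.96):
    """Return (MOS, CI half-width) for the ratings of one test condition.

    z=1.96 is an assumed normal-approximation multiplier for a 95% interval;
    a Student's t quantile may be preferable for small observer panels.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two ratings")
    mean = sum(scores) / n
    # Sample variance with n - 1 in the denominator.
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean, half_width

# Hypothetical ratings from ten observers for one sequence/codec/resolution/bitrate.
ratings = [4, 5, 3, 4, 4, 5, 3, 4, 4, 4]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

The interval is then reported as MOS minus and plus the half-width, which is what the error bars in the graphs represent.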

#### *4.1. Correlation between the Results from Individual Laboratories*

To compare the MOS values obtained from the two laboratories, i.e., UNIZA and VŠB, and to quantify their agreement, the Pearson correlation coefficient (PCC) and the Root Mean Square Error (RMSE) were calculated. All computations were done for both codecs and both resolutions, as well as for all test sequences. The results are plotted in Figures 5 and 6 and listed in Table 5.
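As an illustration of the two agreement measures, the following sketch computes the PCC and the RMSE between two vectors of MOS values. The function names and the MOS values are hypothetical and are not taken from the study's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical MOS values of the same five test conditions in two laboratories.
mos_lab_a = [1.8, 2.9, 3.6, 4.1, 4.4]
mos_lab_b = [2.0, 3.0, 3.5, 4.0, 4.5]

print(f"PCC  = {pearson(mos_lab_a, mos_lab_b):.3f}")
print(f"RMSE = {rmse(mos_lab_a, mos_lab_b):.3f}")
```

A PCC close to 1 together with a small RMSE indicates that the laboratories agree both in ranking and in absolute score level; the PCC alone would not reveal a constant offset between them.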

**Figure 5.** Comparison of Mean Opinion Score (MOS) values obtained from different laboratories. Each spot represents MOS values for corresponding codec, resolution, and test sequence.

**Figure 6.** Comparison of MOS values obtained from different laboratories. Each spot represents averaged MOS values from particular test sequences for corresponding codec and resolution.


**Table 5.** Correlation of MOS score between the laboratories.

As we can see from Figures 5 and 6, as well as from Table 5, there is a high correlation between the results from both laboratories. The lowest correlation was reached by the combination of Full HD resolution and the H.264 codec. This is most likely due to the different displays used in the assessments: a Full HD display was used at the UNIZA laboratory, while an Ultra HD display was used at the VŠB laboratory. Conversely, the highest correlation was achieved by the video sequences encoded with H.264 at Ultra HD resolution.

#### *4.2. Impact of Bitrate on Video Quality Depending on Scene Content*

Figure 7 shows the impact of the bitrate on the perceived video quality (defined by the MOS with associated CI). The figure contains eight graphs, one for each combination of codec, resolution, and laboratory where the evaluation was conducted. Sequences with different scene contents are color-coded in the graphs; each curve represents the MOS values for a given test sequence. Figure 8 shows the average MOS values obtained from the UNIZA and VŠB laboratories.

It is apparent from the graphs that the sequences with the lowest SI and TI values, such as "Bund Nightscape" and "Construction Field", reached the best MOS values. Conversely, the observers rated the sequences situated in the middle of the SI-TI diagram, such as "Marathon" or "Runners", as having the worst quality. The "Campfire Party" and "Fountains" sequences are interesting cases. "Campfire Party" contains a lot of movement (high TI values) but not many details (low SI values) and reached a low MOS value, while "Fountains" lies near "Bund Nightscape" and "Construction Field" in the diagram, meaning it has both low TI and low SI values, yet it also scored low on the MOS scale. A special case is the "Wood" sequence, which is situated in the upper right corner of the SI-TI diagram. Nevertheless, its quality was perceived as similar to that of the "Fountains" and "Runners" sequences. All these differences are more pronounced:


Based on these results, we can state that the compression efficiency and the related video quality depend on the content of the sequences. However, describing a sequence only by its spatial and temporal information is not sufficient, and this should be the subject of further research. We suggest that other parameters be used to describe the scene, for instance, the luminance, the contrast, or the colors occurring in the scene. In addition, psychological factors should be considered. Based on the results, we can also state that the temporal information has a greater impact on the perceived quality than the number of objects defined by the spatial information.

**Figure 7.** Bitrate impact on the perceived video quality (defined by the MOS score with associated Confidence Interval (CI)) depending on codec and resolution for both laboratories independently. Each curve represents MOS values for each type of used test sequence.

**Figure 8.** Bitrate impact on the perceived video quality (defined by the MOS score with associated CI) depending on codec and resolution for both laboratories jointly. Each curve represents averaged MOS values from both laboratories for each type of used test sequence.

#### *4.3. Analysis of Variance*

To verify the conclusions suggested by the graphical representation of the subjective evaluation results, ANOVA was applied to the data [65]. A three-way ANOVA was used to compare the significance and influence of the individual sequence parameters on the resulting perceived video quality. The interaction between three independent variables, namely bitrate (X1), content (scene type) (X2), and either resolution (X3 in Table 6) or compression standard (X3 in Table 7), was examined, with video quality considered the dependent variable. Tables 6 and 7 depict the three-way ANOVA matrices. The *F*-value, also called the F-ratio, is calculated as the variance between the group means divided by the mean of the within-group variances (Mean Squared Error). A greater *F*-value indicates more significant variation. In ANOVA, the *p*-value is also determined, i.e., the probability of obtaining a result at least as extreme as the observed one if the source of variation had no real effect. A source of variation is regarded as insignificant when its *p*-value is higher than a given alpha level, commonly set to 0.05.
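The F-ratio described above can be illustrated on the simplest case. The sketch below computes the F-value of a one-way ANOVA, i.e., for a single factor only; this is a deliberate simplification of the three-way analysis used in the study, and the ratings are hypothetical. The corresponding *p*-value would be obtained from the F-distribution with (k - 1, N - k) degrees of freedom, which the sketch does not compute.

```python
def one_way_f(groups):
    """F-ratio of a one-way ANOVA: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual ratings around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical ratings grouped by three bitrate levels (low, medium, high).
ratings_by_bitrate = [
    [2, 3, 2, 3],
    [3, 4, 3, 4],
    [4, 5, 4, 5],
]
f_value = one_way_f(ratings_by_bitrate)
print(f"F = {f_value:.2f}")
```

In this toy example the group means differ by a full MOS point while the spread inside each group is small, so the F-value is large, which is the pattern the text attributes to the bitrate factor.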

Based on the analysis of the tables, the following conclusions can be drawn. Table 6 indicates that for H.265 encoded sequences, the effect of resolution can be ignored, since this variable was found to be statistically insignificant. In contrast, this is not the case for the H.264 codec, where resolution is the second most important parameter determining the subjectively perceived quality. For both codecs, a change in bitrate produces the largest change in the subjective MOS. According to Table 7, the impact of the compression format on the perceived quality is statistically insignificant for Full HD video sequences. However, that is not the case for the Ultra HD resolution, where the deployed codec is the second most influential variable. As in Table 6, the bitrate has the greatest effect on the subjective video quality assessment results. All remaining ANOVA test results in both tables can be regarded as statistically significant based on their *p*-values.


**Table 6.** Three-way Analysis of Variance (ANOVA) using video codec as a criterion.

**Table 7.** Three-way ANOVA using video resolution as a criterion.


#### *4.4. Impact of Bitrate on Video Quality Depending on Codec and Resolution*

Figure 9 shows the impact of the bitrate on the perceived video quality (defined by the MOS with associated CI), plotted separately for each video sequence. The figure contains eight graphs, one per test sequence, which show the impact of the codec and resolution used on the perceived quality of that sequence; each curve represents the MOS values averaged over both laboratories for a given codec and resolution.

**Figure 9.** Bitrate impact on the perceived video quality (defined by the MOS score with associated CI) depending on used test sequence. Each curve represents averaged MOS values from both laboratories for corresponding codec and resolution.

In Figure 10, the averaged MOS value from both laboratories from all used test sequences for each codec and resolution is plotted.

We can draw several conclusions from Figures 9 and 10. Firstly, it is apparent that the H.265 compression standard yields better quality than the H.264 codec, which is generally known and was expected. What is interesting and important, however, is that the efficiency difference between the two codecs is negligible for the Full HD video sequences. Therefore, using the H.265 compression standard at this resolution is unnecessary, as the observers will not see any notable differences. The use of the H.265 codec is relevant only for videos at the Ultra HD resolution, particularly at low bitrates, because the quality of H.264 encoded video sequences increases with rising bitrate up to the point where it reaches or even surpasses the perceived quality of H.265 sequences. Secondly, the compression efficiency of the H.265 compression standard at the Ultra HD resolution reaches the compression efficiency of both codecs at the Full HD resolution.

The conclusions drawn from the Analysis of Variance (ANOVA) and from the graphical representation of the subjective quality evaluation results coincide. These findings could be beneficial for visual media content providers and broadcasting companies, as they indicate how to adjust video compression parameters to improve the perceived quality. The fastest growth of perceived video quality is apparently due to an increase in bitrate. Specifically, the quality increases most rapidly until the bitrate reaches approximately 5 Mbps. The analyses also revealed which combination of resolution and compression format should be used so that viewers perceive the resulting quality of the visual content as being as good as possible.

**Figure 10.** Bitrate impact on the perceived video quality (defined by the MOS score with associated CI). Each curve represents the MOS values averaged over both laboratories and all test sequences for the corresponding codec and resolution.

#### *4.5. Minimum Bitrate Thresholds Suggestions*

Finally, Figure 10 shows the minimum bitrate thresholds at which the video sequences should be encoded to achieve good (4) or fair (3) quality. These quality thresholds are based on the MOS scale of the ACR method used and are important for setting the bitrate of each codec to maintain a certain quality. Table 8 lists these minimum bitrates.


**Table 8.** Minimum bitrate thresholds to achieve good (4) and fair (3) video quality.

From Table 8, it follows that to achieve good quality (value 4 on the MOS scale), the video sequence must be encoded at a minimum of 7.50 Mbps by both codecs for Full HD resolution, and at 11.55 Mbps by the H.264 codec and 9.00 Mbps by the H.265 codec for Ultra HD resolution. To reach fair quality (value 3 on the MOS scale), the minimum bitrate thresholds are 2.80 Mbps for the H.264 codec and 2.60 Mbps for the H.265 codec at Full HD resolution, and 4.50 Mbps for the H.264 codec and 2.80 Mbps for the H.265 codec at Ultra HD resolution.
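One plausible way to read such thresholds off an averaged MOS curve is linear interpolation between the measured bitrates. The text does not state how the thresholds in Table 8 were obtained, so the sketch below is an assumption rather than the authors' procedure; the function name, bitrates, and MOS values are hypothetical.

```python
def min_bitrate_for(target, bitrates, mos):
    """Smallest bitrate at which a piecewise-linear MOS curve reaches `target`.

    Returns None if the curve never reaches the target within the measured range.
    """
    points = sorted(zip(bitrates, mos))
    if points[0][1] >= target:
        # The target is already met at the lowest measured bitrate.
        return points[0][0]
    for (b0, m0), (b1, m1) in zip(points, points[1:]):
        if m0 < target <= m1:
            # Linear interpolation inside the segment that crosses the target.
            return b0 + (target - m0) * (b1 - b0) / (m1 - m0)
    return None

# Hypothetical averaged MOS curve for one codec/resolution pair (bitrates in Mbps).
bitrates_mbps = [1, 3, 5, 10, 15]
avg_mos = [1.5, 3.1, 3.6, 4.2, 4.5]

fair = min_bitrate_for(3, bitrates_mbps, avg_mos)
good = min_bitrate_for(4, bitrates_mbps, avg_mos)
print(f"fair (MOS 3): {fair:.2f} Mbps, good (MOS 4): {good:.2f} Mbps")
```

Because only five bitrates were measured per condition, any threshold obtained this way depends on the interpolation between neighbouring points and should be treated as approximate.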

#### **5. Conclusions**

This paper dealt with the impact of content on the perceived video quality evaluated using the subjective Absolute Category Rating (ACR) method. Eight types of video sequences with various scene content were evaluated. Two widely used video compression standards, H.264/AVC and H.265/HEVC, were tested in combination with the Full HD and Ultra HD resolutions. In the coding process, we selected five different bitrates based on our previous research, which showed that the efficiency of codecs grows nonlinearly with increasing bitrate. The number of bitrates was a compromise between the complexity and the time requirements of subjective testing. In total, we created an annotated database containing 160 different video sequences coded at constant bitrates with the GOP set to half of the framerate value, which is typical for video intended for transfer over a noisy communication channel. The perceived quality of the sequences was evaluated using the subjective ACR method. The assessment was conducted in two laboratories: one situated at the University of Zilina and the other at the VŠB – Technical University in Ostrava. First, we calculated the correlation of the MOS values between the two laboratories using the Pearson correlation coefficient (PCC) and the Root Mean Square Error (RMSE). The correlation proved to be high. After that, we described the impact of the bitrate on video quality depending on scene content defined by Spatial Information (SI) and Temporal Information (TI). The results showed that even though the sequences with low SI and TI values reach better MOS than the sequences with higher SI and TI values, these two parameters are not sufficient for scene description, and this domain should be the subject of further research. Subsequently, we described the impact of bitrate on video quality depending on codec and resolution.
Based on the results, we concluded that employing the H.265 codec for the compression of Full HD sequences is unnecessary, as the observers did not perceive any significant differences. Furthermore, we stated that the compression efficiency of the H.265 codec at the Ultra HD resolution reaches the compression efficiency of both codecs at the Full HD resolution. We also applied ANOVA to verify the conclusions suggested by the graphical representation of the subjective evaluation results. Finally, we determined the minimum bitrate thresholds at which the video sequences at both resolutions retain good and fair subjectively perceived quality.

**Author Contributions:** Conceptualization, A.H. and M.U.; methodology, J.B. and L.S.; validation, J.B., L.S. and M.U.; formal analysis, A.H. and J.B.; investigation, J.B. and L.S.; resources, M.U.; data curation, J.B. and L.S.; writing – original draft preparation, A.H. and M.U.; writing – review and editing, A.H., L.S. and M.U.; visualization, L.S. and M.U.; project administration, M.U. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Slovak Research and Development Agency under the project PP-COVID-20-0100: DOLORES.AI: The pandemic guard system.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript, or in the decision to publish the result.

### **References**


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Sensors* Editorial Office E-mail: sensors@mdpi.com www.mdpi.com/journal/sensors

ISBN 978-3-0365-2463-4