**(e) Marathon (f) Runners (g) Tall Buildings (h) Wood**

**Figure 1.** Printscreens of used test sequences. Reprinted with permission from [60], Copyright 2021, Uhrina.







#### *2.2. Dataset Preparation*

In our research, we decided to explore the quality of 8-bit video sequences at two commonly used resolutions, i.e., Full HD (FHD) and Ultra HD (UHD), with the typical chroma-subsampled YUV 4:2:0 format. Because the original sequences were uncompressed Ultra HD in the YUV 4:4:4 color format with 10-bit depth, they had to be converted to the appropriate formats. Therefore, all test sequences were first chroma subsampled from YUV 4:4:4 to YUV 4:2:0, and the bit depth was reduced from 10 to 8 bits per channel. Since we wanted to assess Full HD in addition to Ultra HD, the resolution also had to be altered, and the same conversion steps were repeated at Full HD resolution. All conversion steps were performed with the FFmpeg tool [61]. Correspondingly, two uncompressed test sequences were generated for each type of content (Figure 3), which adds up to 16 videos. We call them the source video sequences (SRCs) for the rest of this paper.
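The conversion chain described above can be sketched with FFmpeg invocations along the following lines. This is our illustration, not the exact commands used in the paper: the file names, the 3840x2160 source resolution, the 30 fps framerate, and the `yuv444p10le` input pixel format are assumptions for the sake of a concrete example.

```shell
# UHD SRC: chroma subsampling 4:4:4 -> 4:2:0 and bit-depth reduction 10 -> 8 bit.
# Raw .yuv input carries no header, so size, framerate, and pixel format must be given.
ffmpeg -f rawvideo -video_size 3840x2160 -framerate 30 -pixel_format yuv444p10le \
  -i Wood_3840x2160_30fps_444_10bit.yuv \
  -pix_fmt yuv420p Wood_3840x2160_30fps_420_8bit_YUV.yuv

# FHD SRC: the same conversion plus downscaling to 1920x1080.
ffmpeg -f rawvideo -video_size 3840x2160 -framerate 30 -pixel_format yuv444p10le \
  -i Wood_3840x2160_30fps_444_10bit.yuv \
  -vf scale=1920:1080 -pix_fmt yuv420p Wood_1920x1080_30fps_420_8bit_YUV.yuv
```

Running these two commands per content type over all eight contents yields the 16 SRCs.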

**Figure 3.** Process of preparing the test sequences: chroma subsampling, bit-depth reduction, and resolution change.

#### *2.3. Coding Process*

All these test video sequences (SRCs) were afterwards encoded with both compression standards under evaluation, i.e., H.264/AVC and H.265/HEVC. As the quality-restriction parameter, we decided to use a constant bitrate. We selected five target bitrates: 1, 3, 5, 10 and 15 Mbps, based on our previous research [62], which showed that codec efficiency grows nonlinearly with increasing bitrate. We limited the number of bitrates to five as a compromise between the complexity and time requirements of subjective testing and the precision of the measurements. For the purposes of our research, we used a Group of Pictures (GOP) length typical for video intended for transfer over a noisy communication channel. This GOP length is derived from the framerate of the video sequences and is commonly set to half of the framerate value. Accordingly, given that the test video sequences had a framerate of 30 fps (frames per second), we chose a GOP length of 15 frames, i.e., M = 3, N = 15. The first number, labeled M, expresses the distance between two anchor frames (I or P), and the second number, denoted N, stands for the distance between two key frames (I). For the coding process, we once again used the FFmpeg tool, which contains the x264 and x265 libraries for the H.264/AVC and H.265/HEVC codecs, respectively [61], creating a total of 160 video sequences for the subjective quality assessment. We refer to them as PVSs (Processed Video Sequences) for the rest of this paper. An example FFmpeg command for encoding the Wood test sequence to the H.264 format at a 1 Mbps bitrate is:

*ffmpeg -i Wood\_1920x1080\_30fps\_420\_8bit\_YUV.yuv -vcodec libx264 -x264-params keyint=15:min-keyint=15:bframes=3:b-adapt=1:bitrate=1000:vbv-maxrate=1000:vbv-bufsize=1000 Wood\_1920x1080\_30fps\_420\_8bit\_H264\_01M.mp4*
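Scaled across the whole grid, each SRC produces 2 codecs × 5 bitrates = 10 PVSs. The following sketch (our illustration, not the authors' script) prints the ten encode commands for a single SRC, mirroring the parameter string from the example command above; the x264/x265 `bitrate` and VBV parameters are given in kbps.

```shell
# Print the ffmpeg commands for one SRC across both codecs and all five bitrates.
SRC="Wood_1920x1080_30fps_420_8bit_YUV.yuv"
BASE="${SRC%_YUV.yuv}"            # -> Wood_1920x1080_30fps_420_8bit
N=0                               # number of commands generated
for CODEC in libx264 libx265; do
  [ "$CODEC" = "libx264" ] && TAG=H264 || TAG=H265
  [ "$CODEC" = "libx264" ] && OPT=-x264-params || OPT=-x265-params
  for BR in 1 3 5 10 15; do
    KBPS=$((BR * 1000))           # target bitrate in kbps
    PARAMS="keyint=15:min-keyint=15:bframes=3:b-adapt=1"
    PARAMS="$PARAMS:bitrate=$KBPS:vbv-maxrate=$KBPS:vbv-bufsize=$KBPS"
    OUT="${BASE}_${TAG}_$(printf '%02d' "$BR")M.mp4"
    echo "ffmpeg -i $SRC -vcodec $CODEC $OPT $PARAMS $OUT"
    N=$((N + 1))
  done
done
echo "$N commands"
```

Looping the same grid over all 16 SRCs gives the 160 PVSs used in the assessment.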

#### **3. Subjective Quality Assessment**

During the subjective testing, all created PVSs were shown to people of different ages and genders, who evaluated their quality. We decided to use the Absolute Category Rating (ACR) method [58,63], which belongs to the Single Stimulus (SS) category of subjective video quality assessment techniques. In this method, the degraded sequences are presented to the observers one at a time, and the observers are asked to rate the quality of each on a five-level grading scale, where 1 indicates bad quality and 5 stands for excellent quality. The measurement was conducted separately in two laboratories: one situated at the University of Zilina (UNIZA), and the second at the VŠB – Technical University of Ostrava. Depending on the resolution of the test sequences, the videos were presented on three types of displays (Table 3) under normal indoor illumination conditions.
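Ratings collected on the five-level ACR scale are conventionally averaged per PVS into a mean opinion score (MOS). A minimal sketch with purely illustrative ratings (not data from this study):

```shell
# Hypothetical ACR ratings (1 = bad ... 5 = excellent) for one PVS from five observers.
RATINGS="4 5 3 4 5"
# The MOS is the arithmetic mean of the individual ratings.
MOS=$(printf '%s\n' $RATINGS | awk '{ s += $1; n++ } END { printf "%.2f", s / n }')
echo "MOS = $MOS"   # -> MOS = 4.20
```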

**Table 3.** Types of used displays.


Thirty participants, mostly students, were involved in the testing in each laboratory. All of them were naive observers, which means they had no expertise in the image artefacts that may be introduced by the system under test. Naturally, they were thoroughly acquainted with the method of assessment, types of impairment, grading scale, sequencing, and timing, as required by Reference [58]. The statistical distribution of the men and women who took part in the tests, as well as the average age of all observers, is shown in Table 4. The course of the entire subjective assessment process is represented in Figure 4.

**Table 4.** Statistical characteristic of the observers.


**Figure 4.** Complete process of coding and assessing the video quality.
