Impact of Video Motion Content on HEVC Coding Efficiency

Salih, Khalid A. M.; Ali, Ismail Amin; Mstafa, Ramadhan J.

doi:10.3390/computers13080204


by Khalid A. M. Salih ¹,*, Ismail Amin Ali ² and Ramadhan J. Mstafa ¹

¹ Department of Computer Science, College of Science, University of Zakho, Zakho 42002, Kurdistan Region, Iraq

² Department of Electrical and Computer Engineering, College of Engineering, University of Duhok, Duhok 42001, Kurdistan Region, Iraq

* Author to whom correspondence should be addressed.

Computers 2024, 13(8), 204; https://doi.org/10.3390/computers13080204

Submission received: 16 July 2024 / Revised: 13 August 2024 / Accepted: 16 August 2024 / Published: 18 August 2024


Abstract

Digital video coding aims to reduce the bitrate and keep the integrity of visual presentation. High-Efficiency Video Coding (HEVC) can effectively compress video content to be suitable for delivery over various networks and platforms. Finding the optimal coding configuration is challenging as the compression performance highly depends on the complexity of the encoded video sequence. This paper evaluates the effects of motion content on coding performance and suggests an adaptive encoding scheme based on the motion content of encoded video. To evaluate the effects of motion content on the compression performance of HEVC, we tested three coding configurations with different Group of Pictures (GOP) structures and intra refresh mechanisms. Namely, open GOP IPPP, open GOP Periodic-I, and closed GOP periodic-IDR coding structures were tested using several test sequences with a range of resolutions and motion activity. All sequences were first tested to check their motion activity. The rate–distortion curves were produced for all the test sequences and coding configurations. Our results show that the performance of IPPP coding configuration is significantly better (up to 4 dB) than periodic-I and periodic-IDR configurations for sequences with low motion activity. For test sequences with intermediate motion activity, IPPP configuration can still achieve a reasonable quality improvement over periodic-I and periodic-IDR configurations. However, for test sequences with high motion activity, IPPP configuration has a very small performance advantage over periodic-I and periodic-IDR configurations. Our results indicate the importance of selecting the appropriate coding structure according to the motion activity of the video being encoded.

Keywords:

HEVC; video; motion vectors; periodic-I; periodic-IDR

1. Introduction

Over the past few years, numerous technological breakthroughs have increased the creation and consumption of audiovisual multimedia material. Consumers are heavily exposed to video content through a multitude of social networking platforms, media-sharing Internet sites, and mobile phone applications. According to the most recent report by Cisco [1], the popularity of and demand for video applications have increased notably. The video streaming market is expected to reach around 1.6 billion users by 2027, and the user penetration rate is anticipated to increase from 18.3% in 2024 to 20.7% by 2027 [2]. It is also anticipated that two-thirds (66%) of TV sets connected to the Internet will have ultrahigh-definition (UltraHD) resolution, compared to a mere 33% in 2018. The term “UltraHD” describes a resolution of 3840 × 2160 pixels, which is also commonly referred to as 4K. The bitrate of a 4K encoded video typically ranges between 15 and 18 Mb/s, which is more than twice the high-definition (HD) video bitrate and nine times the standard-definition (SD) video bitrate [3]. According to Cisco Visual Networking Index forecasts, worldwide IP traffic is expected to grow at a Compound Annual Growth Rate (CAGR) of 23% between 2021 and 2026, reaching 2.3 zettabytes annually by 2026. Video traffic is expected to remain dominant, constituting 87% of global IP traffic by 2026 [4]. Storing and delivering this immense quantity of data poses significant challenges and requires efficient compression methods [5]. As smartphones and social networks have become more popular over the past few years, many streaming services (Netflix, Disney Plus, YouTube TV, Hulu, Apple TV Plus) have become available that can stream 4K video online. This surge in video consumption necessitates efficient coding methods, especially for UltraHD resolutions such as 4K (3840 × 2160 pixels) and 8K (7680 × 4320 pixels).

The fundamental aim of most digital video coding standards is to reduce the bitrate of video while maintaining the integrity of visual presentation. This means minimizing the bitrate needed to represent video content at a given level of video quality, or maximizing the video quality achievable within a given bitrate. As a successor to H.264, the High-Efficiency Video Coding (HEVC) standard (H.265/MPEG-H) was released in 2013 [6]. HEVC development prioritized two main concerns: higher video resolutions and the utilization of parallel processing architectures. However, the adoption of HEVC has been gradual, mainly due to its higher processing power and other hardware requirements. The HEVC/H.265 video compression standard can effectively compress video content of various resolutions, including 8K.

The HEVC standard was jointly created by the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). The main objective of HEVC standardization was to enable a substantial improvement in compression performance over existing standards. HEVC can achieve a bitrate reduction of around 50% compared to H.264 while maintaining an equivalent level of perceptual video quality [4,7]. Additionally, there is evidence to support the superiority of HEVC over VP9 in several aspects [8]. HEVC achieves this compression performance through a range of technical capabilities, such as support for high-resolution video, improved color representation, and a more flexible block partitioning mechanism [9].

HEVC outperforms H.264 due to several key factors. HEVC achieves increased compression efficiency by employing advanced encoding techniques and introducing new coding tools such as larger block sizes [10], improved intra prediction [11], improved motion estimation [12], and compensation methods [13], leading to superior inter-frame prediction [14] and motion representation [15].

HEVC has better support for high-resolution video, including 4K and 8K resolutions. Additionally, HEVC supports a wider range of bit depths, enabling more accurate color representation and improved visual quality. Finally, HEVC is designed for a wide range of applications and use cases, making it suitable for delivering high-quality video over various networks and platforms [13,16,17,18,19,20]. However, it is important to note that HEVC’s performance gains come with increased computational complexity, which can impact hardware requirements [21].

The latest iteration of HEVC software is represented by the HEVC HM-18.0 reference software [22].

The main contributions of this paper are listed as follows:

We select a suitable encoding configuration for low-activity video sequences to improve coding performance. For such sequences, our results show that using the IPPP configuration can improve coding performance significantly, by up to 4 dB.
We investigate the impact of motion content on the coding efficiency of HEVC. Our results show that for highly active sequences, IPPP has a negligible performance advantage over periodic-I and periodic-IDR. For these sequences, our results suggest using periodic-I or periodic-IDR rather than IPPP, which gains the benefits of I-frames (limiting error propagation and offering random access) without losing significant coding performance.
We investigate the impact of coding structure on decoding complexity. Our results show that IPPP has a slightly lower decoding complexity than periodic-I and periodic-IDR.
We propose an adaptive scheme that adjusts the GOP structure and intra coding technique based on the motion content of the encoded video.

The rest of the paper is arranged as follows. Section 2 describes the structure of the HEVC codec. Section 3 reviews the related work. Section 4 discusses the evaluation methodology and configurations, with an explanation of each phase. Section 5 presents the performance results for the test sequences and evaluates them in terms of bitrate efficiency and video quality. Section 6 discusses the results in a broader context. Conclusions and suggestions for future work are provided in Section 7.

2. HEVC Codec

Overall, the HEVC structure (shown in Figure 1) provides a high degree of flexibility and adaptability, allowing it to optimize coding performance for a wide range of applications and content types.

HEVC divides video frames into a hierarchy of Coding Units (CU), as shown in Figure 2 [23]. The hierarchical structure organizes video frames into progressively smaller and more localized units for compression purposes. A Coding Tree Unit (CTU) comprises a square picture area containing N × N samples of the luma component and the associated chroma samples. The encoder can select the CTU size based on its architectural features and the requirements of the application environment, which may include memory constraints and limits on latency. The bitstream signals the value of N, which can be 64, 32, or 16.

Furthermore, each CTU is partitioned into Coding Tree Blocks (CTBs), which can be further partitioned into multiple coding blocks (CBs), as shown in Figure 3. The chosen CB sizes may differ depending on the complexity of the content being encoded. The smallest CB is 4 × 4 samples, and the largest is 64 × 64 samples.

The HEVC standard also includes new coding tools that contribute to its improved coding efficiency. These include a more flexible prediction structure, a more efficient intra prediction scheme, a more powerful transform and quantization process, and a more sophisticated entropy coding scheme.

There are 35 intra prediction modes integrated within the codec. HEVC uses two types of transform coding: the Discrete Cosine Transform of type II (DCT-II) and the Discrete Sine Transform of type VII (DST-VII). The transform block sizes include 8 × 8, 16 × 16, and 32 × 32 [24]. The encoder design features two in-loop filters, each addressing a different kind of coding artifact. The first is the deblocking filter, whose primary purpose is to reduce compression-induced blocking artifacts. The second is the Sample Adaptive Offset (SAO) filter, which is used to reduce artifacts caused by the transform and quantization processes [25]. The HEVC standard uses Context Adaptive Binary Arithmetic Coding (CABAC) as its only entropy coding technique. CABAC can greatly improve compression efficiency through an arithmetic coding approach. Nonetheless, CABAC implementation is intricate and has its drawbacks, including reduced processing speed and increased hardware costs [26].

3. Related Work

Video compression and video quality assessment play pivotal roles in enabling efficient storage and transmission of multimedia content. As the demand for multimedia services, especially video, has grown, these areas have become increasingly important. Among video compression standards, HEVC has become a cornerstone, providing better compression efficiency than its predecessors.

Xu et al. [27] performed a thorough evaluation of the H.265/HEVC compression standard, examining the impact of bitrate and Group of Pictures (GOP) pattern on video quality. The research aimed to provide guidance on bitrate and GOP pattern selection by examining the relationship between video quality and bitrate across different GOP patterns. Video quality was evaluated using objective metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Video Quality Metric (VQM). The results demonstrated that, for the same GOP pattern, increasing the bitrate led to better video quality. Additionally, enlarging the GOP size while keeping the number of B-frames the same resulted in increased video quality.

Mackin et al. [28] examined how changes in frame rate affect the compression of HEVC video. The study demonstrated that higher frame rates, specifically those over 60 fps, can improve perceptual quality, particularly at higher bitrates. The researchers introduced a new approach to measure the degree of content dependency by classifying video sequences into distinct groups according to their motion characteristics. Their research uncovered a nearly linear relationship between the average bitrate and the ideal frame rate, suggesting that the ideal frame rate varies with bitrate. The study emphasizes the significance of taking video content into account when choosing the most suitable frame rates for various video sequences.

Valizadeh et al. [29] introduced a new approach to improve the effectiveness of video coding by using perceptual coding approaches that consider the characteristics of the Human Visual System (HVS). The proposed method included the video quality parameter PSNR-HVS in the rate–distortion optimization process in order to achieve greater compression efficiency compared to the HEVC standard system. Their proposed methodology demonstrated a reduction in bitrate of up to 4.56% for the evaluated video sequences.

The study conducted by Ruiz Atencia et al. [30] examined the influence of different HEVC coding configuration parameters on the perceptual quality of reconstructed videos. The study employed diverse metrics, including traditional image quality assessments and Netflix’s Video Multi-Method Assessment Fusion (VMAF), moving beyond conventional PSNR metrics. The methodology involved encoding video sequences under different configurations. The paper underscored the importance of considering perceptual quality metrics and provided valuable insights for configuring video encoders to optimize perceptual rate–distortion performance.

Kobayashi et al. [31] proposed a hybrid architecture that integrates hardware for efficient multi-channel HEVC encoding with software for packaging. This system allows for adaptive-bitrate/multi-channel encoding and low latency, and it supports multiple HTTP streaming protocols as well as 4K video at 60 frames per second. To address content-aware bitrate control, the authors proposed a technique that modifies the target bitrate according to the complexity of the video scene. This technique ensures that the bitrate ladder relationship is maintained and employs QP control to encode at a lower bitrate without causing noticeable degradation in image quality. Experimental results demonstrated a significant reduction in encoding bitrates.

Hamdoun et al. [32] examined the efficacy of integrating error-protection methods for HEVC video transmission via satellite channels. These methods included systematic network coding and physical-layer turbo coding, emphasizing the advantages of network coding in terms of error protection and resilience performance gains. The study specifically examined the network coding attributes in cases where no packets are lost in order to identify the precise qualities that are relevant to the GOP in streaming multimedia. Additionally, the paper utilized the IPPP encoding structure in the video encoding process, employing it to encode video sequences using the HEVC standard. Their results showed considerable video quality improvement compared to only UDP flow error-protection methods.

Joy and Kounte [33] proposed a novel approach to enhance HEVC compression efficiency using deep learning technology. The proposed deep-depth decision algorithm employed a content-based deep learning approach to training separate chroma and luma components. The algorithm predicted the depth of CTU and converted it into a simplified vector with 16 elements, leading to a reduction in encoding time and an improvement in encoding bitrate.

Pan et al. [34] suggested an algorithm that leverages the features of video content to enhance bit allocation in HEVC. Their algorithm established a correlation between motion activity, texture complexity, and bit allocation, resulting in enhanced rate–distortion performance and coding efficiency.

These research works used different techniques to improve compression efficiency and enhance the perceived video quality. However, none of them considered the effects of motion content and intra coding techniques on coding performance.

4. Evaluation Methodology and Configurations

4.1. Proposed Evaluation Framework

Figure 4 illustrates the proposed framework, which offers a systematic method for comprehending and evaluating the influence of H.265/HEVC encoded video motion content. This proposed approach not only improves theoretical comprehension but also offers practical instructions for assessing the HEVC codec by implementing the suggestions outlined in the study.

4.2. Quality Evaluation Metrics

Quality is an essential factor in assessing the performance and efficiency of a video coding system. To evaluate the quality of an image precisely, it is important to have a reference that represents the true quality of the image [35]. Various metrics are commonly employed to assess image and video quality, including Mean Squared Error (MSE), Universal Image Quality Index (UIQI), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Feature Similarity Index (FSIM). In this paper, we use the widely adopted PSNR objective metric, which is derived from the MSE. Both are calculated by comparing the original video with the decompressed received version.

The MSE is a comprehensive measure that quantifies the average value of the squared errors, with lower values indicating better performance. MSE allows us to estimate both the estimator’s bias and variance. If an estimator is unbiased, its MSE is equal to its variance [35].

The PSNR is a metric that measures the ratio between the highest possible signal strength and the power of the unwanted noise that impacts the quality of its depiction. PSNR is a commonly used metric for assessing the quality of reconstructed images in lossy image compression codecs. The signal is defined as the unmodified data, whereas the noise refers to the errors introduced during compression or distortion. The PSNR provides an estimated measure of how well a reconstruction compares to the original in terms of human perception, specifically in relation to compression codecs [36].

For a video sequence with N frames, each of D_x × D_y pixels, let I(n, x, y) denote the luminance value of the pixel at coordinates (x, y) in frame n. MSE is the mean squared difference between the luminance values of the frames in the original video sequence I and the processed sequence \hat{I}. The MSE for a single video frame n is:

MSE_{n} = \frac{1}{D_{x} D_{y}} \sum_{x = 1}^{D_{x}} \sum_{y = 1}^{D_{y}} {[I(n, x, y) - \hat{I}(n, x, y)]}^{2}

(1)

For an N-frame video, MSE is averaged over frames:

\overline{MSE} = \frac{1}{N} \sum_{n = 0}^{N - 1} MSE_{n}

(2)

The PSNR in decibels (dB) is generally defined as:

PSNR = 10 \log_{10} \frac{P^{2}}{MSE}

(3)

where P is the peak luminance of a pixel (P = 2^d − 1, where d is the bit depth of the pixel). The average quality of a video sequence with N frames, measured in decibels, is:

\overline{PSNR} = \frac{1}{N} \sum_{n = 0}^{N - 1} PSNR_{n}

(4)
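As an illustration of Equations (1), (3), and (4), the following minimal Python sketch computes the per-frame MSE, the per-frame PSNR, and the sequence-average PSNR. It assumes each frame is given as a two-dimensional list of luma samples; the function names and this input layout are illustrative choices, not part of the HM software.

```python
import math


def frame_mse(ref, rec):
    """Eq. (1): mean squared luma difference for one frame.

    ref and rec are 2-D lists (rows of luma samples) of equal size.
    """
    total = 0
    count = 0
    for ref_row, rec_row in zip(ref, rec):
        for a, b in zip(ref_row, rec_row):
            total += (a - b) ** 2
            count += 1
    return total / count


def frame_psnr(ref, rec, bit_depth=8):
    """Eq. (3): PSNR in dB with peak value P = 2^d - 1."""
    mse = frame_mse(ref, rec)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    peak = 2 ** bit_depth - 1
    return 10.0 * math.log10(peak ** 2 / mse)


def sequence_psnr(ref_frames, rec_frames, bit_depth=8):
    """Eq. (4): average of the per-frame PSNR values."""
    values = [frame_psnr(r, d, bit_depth) for r, d in zip(ref_frames, rec_frames)]
    return sum(values) / len(values)
```

Note that Equation (4) averages the per-frame PSNR values rather than computing the PSNR of the averaged MSE in Equation (2); the two generally give different numbers.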

4.3. Video Datasets and Configurations

We selected a set of eight video sequences with a range of resolutions and motion activity commonly encountered in multimedia applications. Figure 5 shows a snapshot of these sequences. The dataset is composed of Full HD (1920 × 1080) and High-Definition (1280 × 720) test video sequences. These natural sequences were captured at 120, 60, 50, or 25 frames per second (fps) and stored online in raw 8-bit 4:2:0 YUV format. The dataset is characterized by spatial and temporal perceptual information, coding complexity, and rate–distortion behavior [37]. One dataset is published online [38] under a non-commercial Creative Commons BY-NC license. The other dataset is also publicly accessible, with appropriate copyright information included [39].

Specifically, 200 frames from the test sequences YachtRide, HoneyBee, Crowd_run, Ducks_take_off, Sunflower, FourPeople, Mobcal, and Shields were encoded. The test sequences are listed in Table 1.

The experiments were run on a Lenovo IdeaPad Gaming 3-15IHU6 laptop with an 11th Generation Intel® Core™ i7-1137H processor, running the Ubuntu 22.04.3 LTS operating system.

The sequences were encoded using HEVC HM-18.0 codec [22] with different GOP structures and different intra refresh mechanisms. The Quantization Parameter (QP) ranged from 22 to 42.

To study the performance of the HEVC codec across different setups, we performed three distinct tests for each video sequence. For the first test (IPPP), an IPPP coding structure with an open GOP of size 8 is used. The codec configuration parameters are shown in Table 2. The IntraPeriod is set to −1, so only the first frame is intra coded and the GOP structure is repeated for the whole video sequence. DecodingRefreshType is set to 0, indicating that no frames are explicitly designated as refresh points for the decoder. The GOP Size parameter is set to 8, indicating that each GOP structure consists of eight frames.

For the second test (Periodic-I), an open GOP structure of size 8 is used with a periodic I-frame. The IntraPeriod is set to 32, indicating the insertion of an I-frame every 32 frames. DecodingRefreshType is set to 0, and the GOP Size parameter is set to 8.

For the last test (Periodic-IDR), a closed GOP structure of size 8 is used. The IntraPeriod parameter is set to 32 and DecodingRefreshType is set to 2, indicating the insertion of an IDR frame every 32 frames. These three experiments allow us to evaluate the codec’s performance under different GOP structures and decoding refresh mechanisms.

Figure 6 shows the coding structure for the three configurations used in this paper.
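The three configurations described in Section 4.3 differ only in IntraPeriod and DecodingRefreshType. The following Python sketch records the parameter values stated in the text and derives, for a given frame index, which kind of picture each configuration would place there. It is a simplified model for illustration only: the names CONFIGS, picture_type, and intra_count are hypothetical and not part of HM, frame indices are assumed to coincide with coding order, and labeling frame 0 as an IDR picture is an assumption.

```python
# Parameter values as stated in the text for the three tests.
CONFIGS = {
    "IPPP":         {"IntraPeriod": -1, "DecodingRefreshType": 0, "GOPSize": 8},
    "Periodic-I":   {"IntraPeriod": 32, "DecodingRefreshType": 0, "GOPSize": 8},
    "Periodic-IDR": {"IntraPeriod": 32, "DecodingRefreshType": 2, "GOPSize": 8},
}


def picture_type(config_name, frame_index):
    """Return "IDR", "I", or "P" for a frame under the simplified model."""
    cfg = CONFIGS[config_name]
    if frame_index == 0:
        return "IDR"  # assumption: the stream starts with an IDR picture
    period = cfg["IntraPeriod"]
    if period > 0 and frame_index % period == 0:
        # DecodingRefreshType 2 marks the periodic intra picture as IDR;
        # type 0 leaves it as a plain I-frame.
        return "IDR" if cfg["DecodingRefreshType"] == 2 else "I"
    return "P"


def intra_count(config_name, num_frames=200):
    """Number of intra-coded pictures among the first num_frames frames."""
    return sum(picture_type(config_name, i) != "P" for i in range(num_frames))
```

Under this model, a 200-frame sequence contains a single intra picture with IPPP and seven with either periodic configuration (frames 0, 32, 64, 96, 128, 160, and 192), which is the source of the extra bits discussed in the results.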

5. Results

Firstly, we analyzed the tested sequences to check their motion activity. We used average motion vectors per pixel (MVpp) as an indicator of motion activity.

We compressed the test sequences using the configurations in Table 2. We then analyzed the compression performance, taking into consideration the motion activity of the tested sequences.

5.1. Motion Activity

Figure 7 shows the average MVpp for the tested video sequences when using IPPP coding configuration. As can be seen, Crowd_run and Ducks_take_off are the most active sequences in terms of this indicator. On the other hand, Sunflower and HoneyBee show the least motion activity.

Figure 8 and Figure 9 show a predicted frame of the Crowd_run and HoneyBee test sequences, respectively, with motion vectors shown as white and red lines. As is clear from these figures, the predicted frame of Crowd_run includes more motion vectors (with many large ones) than that of the HoneyBee sequence.
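The text does not give a formula for MVpp, so the following Python sketch shows only one plausible way such an indicator could be computed for a single predicted frame: each block's motion vector magnitude is weighted by the block area, and the sum is divided by the number of pixels in the frame. The function name, the input layout, and the area weighting are all assumptions made for illustration and may differ from the measure used to produce Figure 7.

```python
import math


def average_mv_per_pixel(blocks, width, height):
    """Area-weighted mean motion vector magnitude per pixel (assumed definition).

    blocks: iterable of (block_w, block_h, mv_x, mv_y) tuples for one frame,
            with motion vectors expressed in pixels.
    width, height: frame dimensions in pixels.
    """
    weighted = 0.0
    for block_w, block_h, mv_x, mv_y in blocks:
        # Every pixel in the block shares the block's motion vector.
        weighted += block_w * block_h * math.hypot(mv_x, mv_y)
    return weighted / (width * height)
```

Averaging this value over all predicted frames of a sequence would give a single activity score per sequence, which is the kind of quantity compared across sequences in this section.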

5.2. Rate–Distortion Performance

Figure 10, Figure 11 and Figure 12 show the rate–distortion curves for the less active video sequences HoneyBee, Sunflower, and FourPeople, respectively.

For the HoneyBee test sequence at a bitrate of 1 Mbps, the IPPP configuration achieves about 4 dB higher PSNR than both the periodic-I and periodic-IDR configurations. Periodic-I has a slight coding advantage over the periodic-IDR configuration, as shown in Figure 10.

For the other two sequences in this low-activity group (Sunflower and FourPeople) at a bitrate of 1 Mbps, the IPPP configuration gives a quality improvement of about 1 dB and 1.5 dB, respectively, over the periodic-I and periodic-IDR configurations.

Figure 13 and Figure 14 show the rate–distortion curves for the Mobcal and Shields test sequences that have intermediate motion activity.

The results show that the IPPP coding configuration can still achieve a reasonable quality improvement over the periodic-I and periodic-IDR configurations. At a bitrate of 2 Mbps, IPPP achieves about 1.5 dB better quality than periodic-I and periodic-IDR for Mobcal. For Shields at the same bitrate, the IPPP configuration achieves about 1 dB better quality than the other two coding configurations.

For all sequences with low and intermediate motion activity, periodic-I shows a slight coding improvement over the periodic-IDR.

Figure 15, Figure 16 and Figure 17 show the rate–distortion curves for the more active sequences YachtRide, Ducks_take_off, and Crowd_run, respectively. The results show that the IPPP configuration has a very small performance advantage over the periodic-I and periodic-IDR configurations. Additionally, the periodic-I and periodic-IDR configurations show negligible coding differences for these sequences.
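Comparisons such as "about 4 dB at 1 Mbps" require reading each rate–distortion curve at a common bitrate, which usually falls between measured points. The Python sketch below shows one common way to do this, by interpolating PSNR linearly in the logarithm of the bitrate between the two neighboring measurements. The function names are hypothetical, the interpolation method is an assumption rather than the procedure used in this paper, and the bitrates of the measured points are assumed to be positive and distinct.

```python
import math


def psnr_at_bitrate(rd_points, target_kbps):
    """Interpolate PSNR (dB) at target_kbps from (bitrate_kbps, psnr_db) points.

    Interpolation is linear in log(bitrate) between the two neighboring points.
    """
    points = sorted(rd_points)
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if r0 <= target_kbps <= r1:
            t = (math.log(target_kbps) - math.log(r0)) / (math.log(r1) - math.log(r0))
            return q0 + t * (q1 - q0)
    raise ValueError("target bitrate lies outside the measured range")


def psnr_gain(rd_a, rd_b, target_kbps):
    """PSNR advantage (dB) of curve A over curve B at the same bitrate."""
    return psnr_at_bitrate(rd_a, target_kbps) - psnr_at_bitrate(rd_b, target_kbps)
```

Calling psnr_gain with the measured IPPP and periodic-I points of a sequence and a target of 1000 kbps would return the quality gap at 1 Mbps, provided both curves cover that bitrate.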

5.3. Encoding and Decoding Times

To assess the encoding and decoding complexity of the different coding configurations, we encoded the Sunflower, HoneyBee, and FourPeople test sequences using the IPPP, periodic-I, and periodic-IDR coding configurations. The sequences were encoded using QPs of 30, 32, and 27, respectively, to achieve an average PSNR of about 40 dB. Encoding times are shown in Figure 18, and decoding times are shown in Figure 19. Encoding times are almost unchanged across coding configurations, except for the Sunflower sequence, which takes longer to encode with the IPPP configuration than with the periodic-I and periodic-IDR configurations. Figure 19 shows that the decoding time for the IPPP coding configuration is lower than that of the periodic-I and periodic-IDR configurations. The reason is that, for the same video quality, the periodic-I and periodic-IDR configurations use a higher bitrate than IPPP (Figure 10, Figure 11 and Figure 12). A higher bitrate requires more processing, which increases decoding time.

6. Discussion

Choosing the optimal coding configuration for encoding a video sequence is challenging as the motion activity of the coded video has a significant impact on the HEVC coding performance. We analyzed the coding performance of a range of video sequences with different levels of motion activity using different coding configurations.

6.1. Low Motion Activity

For low motion activity test sequences (HoneyBee, Sunflower, and FourPeople), the IPPP configuration can achieve considerably better coding than both periodic-I and periodic-IDR configurations. Periodic-I has a slight coding advantage over periodic-IDR configuration.

6.2. Intermediate Motion Activity

Rate–distortion curves for the test sequences with intermediate motion activity (Mobcal, Shields) show that the IPPP coding configuration can achieve a reasonable quality improvement over periodic-I and periodic-IDR configurations. Additionally, periodic-I shows a slight coding improvement over the periodic-IDR.

6.3. High Motion Activity

The rate–distortion curves for the more active sequences (YachtRide, Ducks_take_off, and Crowd_run) show that the IPPP configuration has a very small performance advantage over the periodic-I and periodic-IDR configurations. Additionally, the periodic-I and periodic-IDR configurations show negligible coding differences for these sequences.

For the same quality video, IPPP uses fewer bits than periodic-I and periodic-IDR configurations. Therefore, the decoding time for the IPPP coding configurations is slightly less than that of the periodic-I and periodic-IDR configurations.

Generally, IPPP coding configuration can achieve a better coding performance by heavily relying on inter-frame prediction. Additionally, IPPP tends to have reduced decoding complexity compared to periodic-I and periodic-IDR structures. However, I-frames can minimize error propagation in error-prone environments and improve the random-access capability of encoded video.

However, for sequences with high motion activity, our results show that the coding advantage of the IPPP over periodic-I and periodic-IDR is very small. Therefore, for such sequences, we recommend including I-frames to obtain the advantages of these frames while not losing any significant coding performance.

Therefore, we propose an enhancement to the HEVC codec so that it can dynamically select the encoding configuration based on the motion content of the video being encoded. This adaptive scheme would offer better coding performance when the video has low motion content and automatically add I-frames when motion activity increases.
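The selection step of such an adaptive scheme could be sketched as follows in Python. This is only an illustration of the idea, not an implementation of the proposed codec: the function name is hypothetical, the motion activity score is assumed to come from a measure such as MVpp, and the threshold is a free parameter because the paper does not specify one. The parameter values returned are those used for the IPPP and Periodic-I tests in Section 4.3.

```python
def select_coding_structure(motion_activity, high_motion_threshold):
    """Pick encoder parameters from a motion activity score.

    motion_activity: activity score of the content (for example, average MVpp).
    high_motion_threshold: hypothetical cut-off above which I-frames are added.
    """
    if motion_activity >= high_motion_threshold:
        # High motion: IPPP gains little, so add periodic I-frames for
        # error resilience and random access.
        return {"name": "Periodic-I", "IntraPeriod": 32,
                "DecodingRefreshType": 0, "GOPSize": 8}
    # Low or intermediate motion: IPPP gives the best coding performance.
    return {"name": "IPPP", "IntraPeriod": -1,
            "DecodingRefreshType": 0, "GOPSize": 8}
```

A practical scheme would also have to decide how often to re-evaluate the activity score and whether to prefer Periodic-IDR when closed GOPs are required; both choices are left open here.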

7. Conclusions

IPPP typically achieves a lower bitrate by heavily relying on inter-frame prediction, leveraging previously encoded frames to predict the current frame. Additionally, IPPP tends to have reduced decoding complexity compared to periodic-I and periodic-IDR structures. On the other hand, intra coded frames minimize error propagation from inter-frame prediction and improve the random-access capability of encoded video. However, more frequent I-frames also elevate the bitrate, potentially reducing overall compression efficiency. Additionally, increased decoding complexity, particularly in real-time applications or resource-constrained devices, accompanies frequent I-frames. Hence, it is essential to carefully assess and evaluate the coding configuration in order to select the most appropriate configuration for a specific case and achieve the desired coding performance.

Our results for sequences with low and intermediate motion content show that the IPPP configuration consistently has lower bitrates than the Periodic-I and Periodic-IDR configurations. This indicates that IPPP compresses efficiently while maintaining visual quality. In contrast, the Periodic-I and Periodic-IDR configurations incurred additional bits, leading to higher bitrates at a given quality or lower quality at a given bitrate. Additionally, Periodic-I shows a slight coding improvement over Periodic-IDR for these sequences.

The results for tested sequences with high motion content indicate that the IPPP configuration achieved slightly lower bitrates than Periodic-I and Periodic-IDR. Taking into consideration the advantages of including I-frames in error-prone environments and the random access they offer, it may be preferable to use the Periodic-I and Periodic-IDR coding configurations in such scenarios.

Our results show how complicated the trade-offs are between bitrate, visual quality, and encoding methods. These findings emphasize the importance of choosing a suitable encoding configuration according to the motion activity of the encoded video sequence. If the priority is to achieve lower bitrates with acceptable PSNR, configuration with the IPPP coding structure is preferred. However, for videos with high-motion content, it may be preferable to use the Periodic-I and Periodic-IDR coding configurations because of the advantages these configurations can offer in error-prone environments.

Future work will investigate the effects of losses when these videos are sent over IP networks and compare the outcome with the present results. Another important direction for future work is building the proposed codec, which would dynamically select the encoding configuration based on the motion content of the encoded sequence.

Author Contributions

Conceptualization, K.A.M.S. and I.A.A.; data curation, K.A.M.S. and R.J.M.; methodology, K.A.M.S. and I.A.A.; software, K.A.M.S. and I.A.A.; supervision, I.A.A. and R.J.M.; validation, K.A.M.S., I.A.A. and R.J.M.; visualization, K.A.M.S. and I.A.A.; writing—original draft preparation, K.A.M.S.; writing—review and editing, K.A.M.S., I.A.A. and R.J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cisco. Cisco Annual Internet Report (2018–2023) White Paper. 9 March 2020. Available online: https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html (accessed on 30 July 2024).
2. Statista. Video Streaming (SVoD)—Worldwide. Available online: https://www.statista.com/outlook/dmo/digital-media/video-on-demand/worldwide (accessed on 29 January 2024).
3. Grois, D.; Giladi, A.; Choi, K.; Park, M.W.; Piao, Y.; Park, M.; Choi, K.P. Performance comparison of emerging EVC and VVC video coding standards with HEVC and AV1. SMPTE Motion Imaging J. 2021, 130, 1–12.
4. Ramasamy, V.; Pop, M.-D. The Future Network 2030: A Simplified Intelligent Transportation System. In Intelligent Technologies for Sensors: Applications, Design, and Optimization for a Smart World; Apple Academic Press: Palm Bay, FL, USA, 2023; p. 315.
5. Habchi, Y.; Aimer, A.F.; Baili, J.; Inc, M.; Menni, Y.; Lorenzini, G. Improving medical video coding using multi scale quincunx lattice: From low bitrate to high quality. Trait. Du Signal 2022, 39, 1191–1202.
6. Minopoulos, G.; Memos, V.A.; Psannis, K.E.; Ishibashi, Y. Comparison of video codecs performance for real-time transmission. In Proceedings of the 2020 2nd International Conference on Computer Communication and the Internet (ICCCI), Nagoya, Japan, 26–29 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 110–114.
7. John, S.K.; Kunal, G.; Utkarsh, R. Video compression techniques: A review. Int. J. Pure Appl. Math. 2018, 119, 3709–3724.
8. Rahim, T.; Usman, M.A.; Shin, S.Y. Comparing H.265/HEVC and VP9: Impact of high frame rates on the perceptual quality of compressed videos. arXiv 2020, arXiv:2006.02671.
9. Rao, K.R.; Hwang, J.J.; Kim, D. High Efficiency Video Coding and Other Emerging Standards; River Publishers: New York, NY, USA, 2022.
10. Pourazad, M.T.; Doutre, C.; Azimi, M.; Nasiopoulos, P. HEVC: The new gold standard for video compression: How does HEVC compare with H.264/AVC? IEEE Consum. Electron. Mag. 2012, 1, 36–46.
11. Patel, D.; Lad, T.; Shah, D. Review on intra-prediction in high efficiency video coding (HEVC) standard. Int. J. Comput. Appl. 2015, 975, 12.
12. Pastuszak, G.; Trochimiuk, M. Algorithm and architecture design of the motion estimation for the H.265/HEVC 4K-UHD encoder. J. Real-Time Image Process. 2016, 12, 517–529.
13. Jayaratne, M.; Gunawardhana, L.; Samarathunga, U. Comparison of H.264 and H.265. Eng. Technol. Q. Rev. 2022, 5, 17–24.
14. Wang, T.; Wei, G.; Li, H.; Bui, T.; Zeng, Q.; Wang, R. A Method to Reduce the Intra-Frame Prediction Complexity of HEVC Based on D-CNN. Electronics 2023, 12, 2091.
15. Liu, H.; Lu, M.; Ma, Z.; Wang, F.; Xie, Z.; Cao, X.; Wang, Y. Neural video coding using multiscale motion compensation and spatiotemporal context model. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 3182–3196.
16. Mansri, I.; Doghmane, N.; Kouadria, N.; Harize, S.; Bekhouch, A. Comparative evaluation of VVC, HEVC, H.264, AV1, and VP9 encoders for low-delay video applications. In Proceedings of the 2020 Fourth International Conference on Multimedia Computing, Networking and Applications (MCNA), Valencia, Spain, 19–22 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 38–43.
17. Subbarayappa, S.; Rao, K. Video quality evaluation and testing verification of H.264, HEVC, VVC and EVC video compression standards. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1045, 012028.
18. Benjak, J.; Hofman, D.; Knezović, J.; Žagar, M. Performance Comparison of H.264 and H.265 Encoders in a 4K FPV Drone Piloting System. Appl. Sci. 2022, 12, 6386.
19. Monteiro, E.; Grellert, M.; Bampi, S.; Zatt, B. Rate-distortion and energy performance of HEVC and H.264/AVC encoders: A comparative analysis. In Proceedings of the 2015 IEEE International Symposium on Circuits and Systems (ISCAS), Lisbon, Portugal, 24–27 May 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1278–1281.
20. Banitalebi-Dehkordi, A.; Azimi, M.; Pourazad, M.T.; Nasiopoulos, P. Compression of high dynamic range video using the HEVC and H.264/AVC standards. In Proceedings of the 10th International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness, Rhodes, Greece, 18–20 August 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 8–12.
21. Wang, Z.; Li, F. Convolutional neural network based low complexity HEVC intra encoder. Multimed. Tools Appl. 2021, 80, 2441–2460.
22. Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut, HHI. High Efficiency Video Coding (HEVC). Available online: https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h265-hevc/hevc-overview.html (accessed on 10 December 2023).
23. Sullivan, G.J.; Ohm, J.-R.; Han, W.-J.; Wiegand, T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
24. Park, W.; Lee, B.; Kim, M. Fast computation of integer DCT-V, DCT-VIII, and DST-VII for video coding. IEEE Trans. Image Process. 2019, 28, 5839–5851.
25. Singhadia, A.; Minhazuddin, M.; Mamillapalli, M.; Chakrabarti, I. A fast integrated deblocking filter and sample-adaptive-offset parameter estimation architecture for HEVC. Microprocess. Microsyst. 2021, 85, 104317.
26. Mrudula, S.; Murthy, K.; Prasad, M. Optimized Context-Adaptive Binary Arithmetic Coder in Video Compression Standard Without Probability Estimation. Math. Model. Eng. Probl. 2022, 9, 458–462.
27. Xu, J.; Zhou, B.; Zhang, C.; Ke, N.; Jin, W.; Hao, S. The impact of bitrate and GOP pattern on the video quality of H.265/HEVC compression standard. In Proceedings of the 2018 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Qingdao, China, 14–16 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–5.
28. Mackin, A.; Zhang, F.; Papadopoulos, M.A.; Bull, D. Investigating the impact of high frame rates on video compression. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 295–299.
29. Valizadeh, S.; Nasiopoulos, P.; Ward, R. Improving compression efficiency of HEVC using perceptual coding. Multimed. Tools Appl. 2021, 80, 10235–10254.
30. Atencia, J.R.; Granado, O.L.; Malumbres, M.P.; Martínez-Rach, M.O.; Van Wallendael, G. Analysis of the perceptual quality performance of different HEVC coding tools. IEEE Access 2021, 9, 37510–37522.
31. Kobayashi, D.; Nakamura, K.; Kitahara, M.; Osawa, T.; Omori, Y.; Onishi, T.; Iwasaki, H. A Low-Latency 4K HEVC Multi-Channel Encoding System with Content-Aware Bitrate Control for Live Streaming. IEICE Trans. Inf. Syst. 2023, 106, 46–57.
32. Hamdoun, H.; Nazir, S.; Alzubi, J.A.; Laskot, P.; Alzubi, O.A. Performance benefits of network coding for HEVC video communications in satellite networks. Iran. J. Electr. Electron. Eng. (IJEEE) 2021, 17, 1–10.
33. Joy, H.K.; Kounte, M.R. Decision Algorithm for Intra Prediction in High-Efficiency Video Coding (HEVC). J. Southwest Jiaotong Univ. 2022, 57, 180–193.
34. Pan, Z.; Yi, X.; Zhang, Y.; Yuan, H.; Wang, F.L.; Kwong, S. Frame-level Bit Allocation Optimization Based on Video Content Characteristics for HEVC. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2020, 16, 1–20.
35. Sara, U.; Akter, M.; Uddin, M.S. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18.
36. Winkler, S. Digital Video Quality: Vision Models and Metrics; John Wiley & Sons: Hoboken, NJ, USA, 2005.
37. Mercat, A.; Viitanen, M.; Vanne, J. UVG Dataset: 50/120fps 4K Sequences for Video Codec Analysis and Development. In Proceedings of the 11th ACM Multimedia Systems Conference, Istanbul, Turkey, 8–11 June 2020; pp. 297–302.
38. Ultra Video Group. Ultra Video Group—Home Page. Available online: https://ultravideo.fi (accessed on 30 July 2023).
39. Xiph.org Video Test Media [Derf’s Collection]. HD Content and Above. Available online: https://media.xiph.org/video/derf/ (accessed on 15 July 2023).

Figure 1. HEVC video coding encoder [23].

Figure 2. HEVC coding units.

Figure 3. HEVC coding blocks.

Figure 4. Proposed framework.

Figure 5. Snapshots of test sequences used [38,39].

Figure 6. Configurations tested.

Figure 7. Average MVpp of tested video sequences.

Figure 8. Frame 90 of Crowd_run, showing motion vectors as white lines.

Figure 9. Frame 90 of HoneyBee, showing motion vectors as short red lines.

Figure 10. Rate–distortion curve for HoneyBee test sequence.

Figure 11. Rate–distortion curve for Sunflower test sequence.

Figure 12. Rate–distortion curve for FourPeople test sequence.

Figure 13. Rate–distortion curve for Mobcal test sequence.

Figure 14. Rate–distortion curve for Shields test sequence.

Figure 15. Rate–distortion curve for YachtRide test sequence.

Figure 16. Rate–distortion curve for Ducks_take_off test sequence.

Figure 17. Rate–distortion curve for Crowd_run test sequence.

Figure 18. Encoding time.

Figure 19. Decoding time.

Table 1. Test sequences used.

Test Sequence	Resolution	No. of Frames Encoded	fps
HoneyBee	1920 × 1080	200	120
Sunflower	1920 × 1080	200	25
FourPeople	1280 × 720	200	60
Mobcal	1280 × 720	200	50
Shields	1280 × 720	200	50
YachtRide	1920 × 1080	200	120
Ducks_take_off	1920 × 1080	200	50
Crowd_run	1920 × 1080	200	50

Table 2. Encoding parameters and configurations.

Encoder Parameter	IPPP	Periodic-I	Periodic-IDR
IntraPeriod	−1	32	32
DecodingRefreshType	0	0	2
GOP Size	8	8	8
QP	22–42	22–42	22–42
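As a rough illustration, the three configurations in Table 2 could be captured as data and rendered into encoder settings. The sketch below rests on assumptions: the key spelling `GOPSize` (the table writes "GOP Size") and the `Key : value` text layout are modeled on HM-style configuration files and are not confirmed by the table itself, and the helper name `render_config` is hypothetical. The table gives only a QP range, so the sketch accepts any integer QP within 22–42 rather than assuming specific values.

```python
# Parameter values copied from Table 2. The key "GOPSize" is an assumed
# spelling of the table's "GOP Size" row.
TABLE2 = {
    "IPPP":         {"IntraPeriod": -1, "DecodingRefreshType": 0, "GOPSize": 8},
    "Periodic-I":   {"IntraPeriod": 32, "DecodingRefreshType": 0, "GOPSize": 8},
    "Periodic-IDR": {"IntraPeriod": 32, "DecodingRefreshType": 2, "GOPSize": 8},
}

# QP range stated in Table 2 (inclusive on both ends).
QP_MIN, QP_MAX = 22, 42


def render_config(name: str, qp: int) -> str:
    """Render one Table 2 configuration as 'Key : value' lines.

    The output layout is an assumption for illustration; adapt it to
    whatever format the encoder in use actually expects.
    """
    if name not in TABLE2:
        raise KeyError(f"unknown configuration: {name}")
    if not QP_MIN <= qp <= QP_MAX:
        raise ValueError(f"QP must lie in [{QP_MIN}, {QP_MAX}]")
    params = dict(TABLE2[name], QP=qp)
    return "\n".join(f"{key:<20}: {value}" for key, value in params.items())


if __name__ == "__main__":
    print(render_config("Periodic-IDR", 32))
```

Keeping the three structures in one table-like dictionary makes the only differences explicit: IPPP disables periodic intra frames with `IntraPeriod` set to −1, while Periodic-I and Periodic-IDR share an intra period of 32 and differ only in `DecodingRefreshType`.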

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

