In recent years, video coding technology has advanced rapidly, and video resolutions have increased significantly, bringing increasingly detailed image textures. Whereas HEVC supports only the QT partition structure, VVC introduces the QTMT partition structure. The diversity of QTMT partition structures, however, causes a substantial increase in coding complexity, which poses a major barrier to the adoption of VVC in practical applications. Efforts therefore focus on reducing coding complexity without compromising coding efficiency. This section analyzes the partition structure characteristics of QTMT using relevant experimental data and proposes an efficient framework for fast intra-frame CU partitioning in VVC. This section consists of four parts.
Section 3.1 discusses decision analysis,
Section 3.2 discusses texture feature analysis and selection,
Section 3.3 discusses model construction and training, and finally
Section 3.4 gives the overall algorithm framework.
3.1. Decision Analysis
Table 1 provides the distribution of CU sizes in VTM-10.0 [25].
Table 1 clearly indicates that no splitting (NS) is the preferred option for the majority of CU sizes. TT_H and TT_V partitioning account for only a small proportion, whereas horizontal and vertical binary partitioning (BT_H and BT_V) together account for a larger proportion.
It is well known that early termination of the partitioning of some CUs can effectively reduce coding complexity. Therefore, this section describes the classifiers and features used in the proposed framework for fast intra-frame CU partitioning in VVC. The algorithm uses decision trees (DT) as classifiers to predict the CU partition, which significantly reduces encoding complexity. Next, we review several classic VVC fast CU partition decision frameworks and analyze the advantages and disadvantages of each. Finally, we propose a cascaded partition decision framework that effectively balances prediction accuracy and practicality.
In study [30], a classical early-termination framework for fast CU decisions in VVC is proposed. The framework first performs intra-frame prediction of the CU at the current depth and then uses coding information such as the RD cost and coding mode to decide whether to terminate the division process. If the division is terminated, all division patterns are skipped; if it continues, all candidate division modes are traversed in turn. The framework treats the division judgement as a binary classification problem and obtains sufficient classification information through intra-frame prediction to improve the results. Although the algorithm does not affect the encoding result, its ability to reduce complexity is very limited. First, intra-frame prediction is required at every decision layer, yet for intermediate layers that require further splitting, this prediction is redundant. Second, intermediate nodes that continue to divide must still evaluate all remaining division patterns, even though the encoder will ultimately choose only one. Thus, redundancy persists in the process.
The framework proposed in study [31] builds on the work of [30], introducing a partition-mode selection framework based on multi-class classifiers. Before performing intra-frame prediction, the framework determines whether the CU needs to be further divided into sub-CUs. If no partitioning is required, the current CU is judged to already be at its optimal size, all partitioning is terminated, and intra-frame mode prediction proceeds for the current CU. When partitioning is required, multi-class classifiers select one of the candidate partition modes, bypassing the remaining partition modes and the intra-frame mode prediction of the current CU layer. This fast algorithm significantly reduces redundancy in the CU selection process, but at the cost of a large loss in coding performance. The main cause is the expanded set of partition modes in QTMT, which greatly complicates the decision process: selecting a single mode often yields relatively low prediction accuracy and, ultimately, higher coding losses.
To strike a balance between decision accuracy and complexity reduction, a parallel decision framework was introduced in [32]. This framework constructs parallel decision models for both QT division and MT division. A division-termination judgement is first made for both QT and MT; if termination is confirmed, division stops and the intra-frame mode prediction of the current CU is performed. When the MT division process is not terminated, a multi-class classifier selects the type of MT division. In this framework, QT is treated as a division method distinct from MT and is determined independently. The parallel structure also helps reduce the risk of prediction errors. However, it still falls short in terms of complexity reduction, leaving room for further improvement.
We next describe our proposed fast CU partition decision framework. After extracting features, the framework first judges whether the current CU needs to be divided. If not, the CU immediately terminates division and bypasses all partition types. If division is needed, the framework further determines whether QT division is required: in the case of a QT division, all MT divisions are skipped; otherwise, QT partitioning is skipped and MT partitioning is performed, as sketched below. In summary, this framework decomposes the multi-class problem in QTMT so that each model can focus on evaluating specific modes, improving prediction accuracy. With appropriate classifiers, the VVC fast CU early-termination framework achieves a greater complexity reduction while keeping the loss in coding performance small.
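As a rough illustration, the following Python sketch captures the cascade described above; the function name, classifier objects, and feature vector are hypothetical placeholders standing in for the trained decision trees of Section 3.3, not the actual VTM integration.

```python
def decide_partition(features, ns_clf, qt_clf, mt_clf):
    """Cascaded decision sketch: NS first, then QT vs. MT, then the MT mode.
    Each classifier is a trained sklearn-style decision tree (Section 3.3)."""
    if ns_clf.predict([features])[0] == 0:   # stage 1: 0 -> terminate, no split
        return "NS"                          # all partition types are bypassed
    if qt_clf.predict([features])[0] == 1:   # stage 2: is QT division required?
        return "QT"                          # all MT divisions are skipped
    return mt_clf.predict([features])[0]     # stage 3: BT_H / BT_V / TT_H / TT_V
```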
3.2. Analysis and Selection of Texture Features
The algorithm in this paper uses a decision framework based on decision tree classifiers, decomposing the QTMT decision process into multiple binary classification problems to build the partition decision model. During encoding, the classifier must determine the CU partition type accurately, and the classifiers are central to reducing the complexity of the partitioning decisions, so choosing an appropriate classifier is crucial. After careful analysis, a decision tree was chosen as the classifier for this algorithm. Decision trees as classifiers have the following advantages:
Can handle non-linear classification problems well;
Have a built-in mechanism for handling missing values and outliers;
Provide feature importance ranking;
Can handle large datasets efficiently.
Our data come from a wide range of video sequences. Some features can be extracted directly during the encoding process, whereas others are computed by code integrated into the VTM encoder. The features fall into three main categories: local texture information, global texture information, and encoding information.
The global texture is characterized by the variance (var) of the current CU and by the horizontal gradient (Gx) and vertical gradient (Gy) computed with the Sobel operator. In addition, the ratio of Gx to Gy (ratioGxGy) and the normalized gradient obtained by dividing the gradients by the block area (normGradient) are also considered.
The decision on MT partitioning is influenced by local texture features, which underscores the importance of describing both the overall and the local texture within the CU. We introduce two local texture features: the absolute difference in variance between the upper and lower halves of the CU, denoted “diffVarHor”, and the absolute difference in variance between the left and right halves, denoted “diffVarVer”.
As the evaluation of NS takes precedence over that of QT and MT partitioning, we prioritize encoding attributes derived from the current CU to determine the appropriate partition type. Since the partition types are evaluated sequentially, each subsequent evaluation can reuse information computed by the previous one. The encoding information consists of the RD cost (currCost) and the distortion (currDistortion).
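To make the feature definitions concrete, here is a minimal numpy/scipy sketch for one luma block; the exact gradient normalization used in the paper's modified VTM encoder is an assumption, and only the texture features are computed here (currCost and currDistortion are read from the encoder).

```python
import numpy as np
from scipy.ndimage import sobel

def texture_features(cu: np.ndarray) -> dict:
    """Global and local texture features for one CU (float luma samples)."""
    h, w = cu.shape
    gx = float(np.abs(sobel(cu, axis=1)).sum())    # horizontal Sobel gradient (Gx)
    gy = float(np.abs(sobel(cu, axis=0)).sum())    # vertical Sobel gradient (Gy)
    return {
        "var": float(cu.var()),                    # global variance of the CU
        "Gx": gx,
        "Gy": gy,
        "ratioGxGy": gx / (gy + 1e-6),             # guard against division by zero
        "normGradient": (gx + gy) / (h * w),       # gradient normalized by block area
        # Local features: variance differences between the two halves of the CU.
        "diffVarHor": abs(float(cu[: h // 2].var()) - float(cu[h // 2 :].var())),
        "diffVarVer": abs(float(cu[:, : w // 2].var()) - float(cu[:, w // 2 :].var())),
    }
```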
Figure 2 illustrates the importance of each classifier feature. It shows that the features related to RD cost (currCost and currDistortion) are the most important, followed by the texture information.
3.3. Model Building and Training
To improve accuracy and to address over- and under-fitting, several hyperparameters of the classifier were selected for optimization. Hyperparameter optimization can improve classifier performance, but manual tuning is a tedious process. Therefore, an automatic parameter tuning tool was utilized to simplify the process and provide approximate results at an early stage; such tools determine appropriate parameter intervals for effective optimization. Optuna automatically searches for the most balanced combination of parameters within a predefined parameter grid, so the optimal parameter settings can be determined more easily. A minimal search sketch is given after the parameter list below.
The key parameters for classifier optimization are as follows:
max_depth: This hyperparameter determines the maximum depth of the decision tree. By setting a suitable max_depth value, the complexity of the tree can be controlled and over-fitting can be prevented;
min_samples_split: This hyperparameter sets the minimum number of samples required to split an internal node. Increasing min_samples_split regularizes the tree and prevents overfitting by ensuring that a node is split only when enough samples are present;
min_samples_leaf: The minimum number of samples required at a leaf node. Adjusting min_samples_leaf affects the generalization ability of the tree;
max_features: The maximum number of features to consider when searching for the best split at an internal node. Limiting the number of features reduces the risk of overfitting;
criterion: This hyperparameter determines the metric (e.g., Gini impurity) used to split the nodes of the decision tree.
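The following is a minimal Optuna sketch over the hyperparameters listed above; the search ranges, trial count, and the synthetic stand-in data are illustrative assumptions, not the paper's actual grid.

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the balanced CU feature dataset.
X, y = make_classification(n_samples=2000, n_features=9, random_state=0)

def objective(trial):
    # Search ranges are illustrative placeholders.
    params = {
        "max_depth": trial.suggest_int("max_depth", 3, 20),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 50),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 20),
        "max_features": trial.suggest_categorical("max_features", [None, "sqrt", "log2"]),
        "criterion": trial.suggest_categorical("criterion", ["gini", "entropy"]),
    }
    clf = DecisionTreeClassifier(**params, random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()  # maximize CV accuracy

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
print(study.best_params)
```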
In video coding, the partitioning of CUs is treated as several binary classification problems. To solve them, we adopt the CART algorithm as the basis of the decision tree classifier, which enables us to build a model that predicts the outcome of CU partitioning. At each node, CART selects the feature and threshold that yield the minimum Gini coefficient, thereby constructing a binary tree. Assuming that there are $K$ categories in total and that $p_k$ denotes the probability that a sample point belongs to the $k$-th category, the Gini index of the probability distribution is defined by the following formula:

$$\mathrm{Gini}(p)=\sum_{k=1}^{K}p_k\left(1-p_k\right)=1-\sum_{k=1}^{K}p_k^{2} \tag{1}$$

Here, $p_k$ represents the probability that the selected sample belongs to category $k$, and $1-p_k$ is the probability that the sample is misclassified.
We select the attribute with the smallest Gini coefficient from the candidate attribute set as the optimal partition attribute. When the dataset $D$ is divided by a specific value $a$ of feature $A$ into the two subsets $D_1$ and $D_2$, the Gini coefficient of the set $D$ under the condition of feature $A$ is defined as follows:

$$\mathrm{Gini}(D,A)=\frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1)+\frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2) \tag{2}$$
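To illustrate how Formulas (1) and (2) drive the split selection, the following Python sketch computes both quantities with numpy; it is a minimal illustration, not the encoder implementation.

```python
import numpy as np

def gini(labels: np.ndarray) -> float:
    """Formula (1): Gini(D) = 1 - sum_k p_k^2 over the class proportions p_k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - float(np.sum(p ** 2))

def gini_split(labels: np.ndarray, feature: np.ndarray, a: float) -> float:
    """Formula (2): weighted Gini of the split of D into D1/D2 by the test A = a."""
    mask = feature <= a                    # D1: samples answering "yes" to the test
    d1, d2 = labels[mask], labels[~mask]   # D2: the remaining samples
    n = labels.size
    return d1.size / n * gini(d1) + d2.size / n * gini(d2)
```

CART evaluates gini_split for every candidate feature/threshold pair and keeps the minimum.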
Next, we describe the process of classifier offline training, which involves the following steps:
Step 1: Select 40 frames from each video sequence and encode them with the All Intra configuration. The first 20 frames of each sequence form the training set and the last 20 frames form the test set.
Step 2: For the current node, denote the training dataset as D and calculate its Gini index. For each feature A and each of its possible values a, divide D into two subsets, D1 and D2, according to whether the test A = a at the sample point yields a “yes” or “no” response. Then, use Formula (2) to calculate the Gini coefficient when A = a.
Step 3: Among all possible features A and all their potential split points a, find the feature with the smallest Gini coefficient and its corresponding optimal split point. Using this optimal feature and split point, generate two child nodes from the current node and distribute the training dataset to the child nodes accordingly.
Step 4: Apply Steps 2 and 3 recursively to the two child nodes until a stopping condition is met, and repeat the procedure until the training of N decision trees is completed.
Step 5: Use the trained classifier to classify the current sample; each tree evaluates the sample individually, and the final classification result is determined by majority vote among the trees. This process yields the optimal partitioning mode for the CU.
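Below is a condensed sketch of Steps 4 and 5, assuming sklearn-style CART trees; the tree count and the bootstrap resampling scheme are illustrative assumptions, since the exact ensembling details are not fixed here.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_trees(X, y, n_trees=10, **tree_params):
    """Step 4 (sketch): train N CART trees; bootstrap resampling is an assumption."""
    rng = np.random.default_rng(0)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))  # resampled training set
        trees.append(DecisionTreeClassifier(**tree_params).fit(X[idx], y[idx]))
    return trees

def classify(trees, x):
    """Step 5 (sketch): each tree judges individually; majority vote decides."""
    votes = [t.predict([x])[0] for t in trees]
    return max(set(votes), key=votes.count)
```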
Table 2 lists the training parameters of the decision tree classifier. The training set comes from video sequences in the JVET official standard test sequence set, including “ParkScene”, “RaceHorses”, “BQMall”, and “Johnny”.
After training the classifier, the next step is to evaluate its accuracy. We adopt the AUC-ROC curve as the visual performance metric because it provides a comprehensive measure of classification performance across different threshold settings and quantifies how effectively the model distinguishes between classes. A minimal evaluation sketch is given below.
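For concreteness, this is a minimal scikit-learn sketch of the AUC-ROC evaluation; the synthetic data stands in for the CU feature set of the test frames, and the classifier settings are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic stand-in for the CU features/labels of the held-out test frames.
X, y = make_classification(n_samples=2000, n_features=9, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = DecisionTreeClassifier(max_depth=10, random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]   # probability of the positive ("split") class
fpr, tpr, _ = roc_curve(y_te, scores)    # ROC operating points per threshold
print("AUC =", roc_auc_score(y_te, scores))
```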
Figure 3 demonstrates the strong performance of our NS and QT classifiers on the test set. The AUC-ROC curves further confirm the accuracy of the predictions, showing strong agreement between the test and predicted values. These results indicate that our classifiers excel at predicting the CU partition types.
Figure 4 depicts the training and implementation process of the decision tree classifier within the VTM encoder. The VTM encoder was modified to collect statistics relevant to CU partitioning decisions, generating a dataset specific to each partition type; the dataset comprises encoder properties, video encoding sequences, and the features relevant to the partitioning decision. In the preprocessing step, the dataset is first balanced and key features are then selected; these selected features serve as the input for classifier training. Model training consists of hyperparameter optimization followed by the separate training of each classifier. Finally, the modified VTM encoder, which uses the decision tree classifiers to determine the QTMT partition instead of performing a full RDO search, is evaluated in terms of encoding time savings and coding efficiency.