Intra-Mode Decision Based on Lagrange Optimization Regarding Chroma Coding

Li, Wei; Fan, Caixia

doi:10.3390/app14156480

by Wei Li * and Caixia Fan

Department of Information Science, Xi’an University of Technology, Xi’an 710048, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6480; https://doi.org/10.3390/app14156480

Submission received: 18 June 2024 / Revised: 22 July 2024 / Accepted: 23 July 2024 / Published: 25 July 2024

Abstract

Versatile Video Coding (VVC), the latest video coding standard, retains the hybrid coding architecture to further improve compression performance, and its intra-mode decision module selects the mode that best balances bitrate against coding distortion. For chroma intra modes, VVC includes a cross-component linear model (CCLM) that exploits the correlation between the luma and chroma components. This can implicitly propagate distortion from luma blocks to subsequent chroma prediction blocks during coding and thereby affect the result of the Lagrange optimization. This paper presents an improved intra-mode decision for chroma components in VVC based on a modified Lagrange multiplier. The characteristics of chroma intra prediction are examined and the intra-mode decision process is analyzed; the coding distortion dependency between luma and chroma is then described and incorporated into a Lagrange optimization framework to determine the optimal mode. By using dependent rate-distortion optimization, the proposed method achieves an average bitrate saving of 1.23% over the original scheme in the All-Intra configuration.

Keywords:

Lagrange optimization; chroma coding; intra-mode decision; versatile video coding

1. Introduction

With the rapid development of multimedia technologies and communication networks, video services have expanded tremendously, from mobile HD video to ultra-HD video, with explosive growth in applications for entertainment, industry, medicine, sport, agriculture, and transportation [1]. While these services benefit people’s lives, storing and transmitting such huge amounts of video data raises serious problems. Lossy video coding is therefore used to reduce storage and transmission costs, at the expense of the quality of the reconstructed images [2]. Consequently, many researchers have investigated video compression schemes, and a series of standards have been established, such as Advanced Video Coding (AVC) [3] and High-Efficiency Video Coding (HEVC) [4]. These standards employ hybrid coding structures that include the basic codec modules of intra/inter prediction, transform, quantization, entropy coding, rate control, and rate-distortion optimization (RDO).

Currently, the Joint Video Experts Team (JVET), founded by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), is responsible for the new-generation video coding standard, versatile video coding (VVC) [5]. Novel compression technologies have been added to VVC with the aim of reducing the average bitrate by 50% at the same subjective video quality compared to HEVC. To adapt to natural video characteristics, a quadtree with a nested multi-type tree (QT-MTT) coding structure is used to partition coding tree units (CTUs) more flexibly, and extended directional intra prediction modes are incorporated to accurately capture arbitrary edge directions [6]. In addition, multiple transform selection (MTS) is applied before the subsequent Context-based Adaptive Binary Arithmetic Coding (CABAC) stage to achieve a substantial performance improvement [7].

The hybrid encoders mentioned above employ prediction and residual coding to generate a video bitstream by exploiting the redundancy with previously coded blocks [8]. The prediction module covers intra and inter prediction: intra prediction exploits spatially neighboring blocks to remove spatial correlation, whereas inter prediction reduces the redundancy between frames as much as possible. Because local scenes change frequently within a video sequence, intra prediction contributes significantly to coded video quality and efficiency by shrinking the residual energy and the resulting bitstream size. Many studies focus on new or improved prediction schemes for higher compression, and the number of directional prediction modes has grown steadily to lower the residual power, from the 9 modes in AVC and the 35 modes in HEVC to the 67 modes in VVC, by fully utilizing contextual information [9].

The rich set of prediction patterns in intra coding can potentially achieve a relatively lower bitrate when representing blocks; however, it may simultaneously produce a burden on optimal selection. Usually, an encoder should compress a video by exhaustively trying all intra modes; however, the complexity and computation load increase drastically during operation. To address this issue, the RDO mode decision scheme is employed to determine the optimal mode to meet the best rate and distortion trade-off. Although research on intra-mode decisions using RDO has resulted in considerably positive benefits, most schemes are designed on the premise of ignoring the intra coding influence between the luma and chroma. Actually, prediction-based reconstructed luma samples could cause potential intra-distortion propagation to the chroma prediction block, which could impact the cost of rate and distortion (RD) in the final mode decision. This paper presents an improved intra-mode decision for chroma intra coding, where the distortion relevance between the luma and chroma is analyzed and incorporated into RDO. The experimental results show that the improved scheme can produce additional coding performance in VVC intra-mode decisions.

The main contributions of this work are as follows: an in-depth exploration and analysis of chroma intra prediction and the RDO-based mode decision; a study of intra-coding distortion propagation from luma to chroma; a scheme that builds RD optimization on a distortion model in the mode decision; an improved Lagrange multiplier that better balances bitrate and coding loss; and a comparison and verification of experimental results on different videos. The rest of this paper is organized as follows: related works are described in Section 2, and an overview of chroma prediction and the RD mode decision is given in Section 3. The coding distortion feature is then analyzed, and the improved intra-mode decision for chroma is presented in Section 4. The experimental results are reported in Section 5, and the conclusions are drawn in Section 6.

2. Related Works

Considering the non-uniformity between the luma and chroma, the coding structure and the intra mode prediction of the chroma component are independent of those of the luma component in VVC [10]. In terms of chroma intra mode prediction, Kim et al. [11] developed a cross-component linear prediction mode (CCLM) to remove visual redundancies, where the intra-chroma CU utilizes cross-component correlations with the luma-reconstructed CU to predict pixels on the basis of a linear model (LM). Zhang et al. [12] observed that the LM parameters used to predict the original chroma are error-sensitive, and they proposed three novel LM-like modes to identify some potentially problematic conditions in parameter estimation. Li et al. [13] also explained that it is difficult to derive optimal LM parameters, especially when there are insufficient training samples for small prediction blocks, and proposed a multi-mode linear model based on up-sampling chroma samples. Choi et al. [14] proposed using the intra prediction mode of the luma block covering the center position as the chroma candidate mode. The aforementioned research on intra mode prediction primarily focuses on the use of local reference blocks. However, blocks with similar textures may reside in more distant regions and can also serve as viable references in the prediction process.

The block matching approach employed in an intra block copy (IBC) method [15] was purposely designed for natural video data, where a coding block is predicted by a reconstructed block residing within the same frame. Here, the utilization of IBC achieves a good balance between high efficiency and low complexity. Furthermore, a combined intra-mode approach [16] was developed by utilizing both local and non-local correlations, which integrates block matching and angular prediction to effectively handle complex video content. The non-local blocks could be accurately predicted, thereby removing the spatial redundancy. Inspired by the decline in the accuracy of intra modes as the reference distance expanded, the position-dependent intra prediction combination (PDPC) [17] was introduced with more available reference samples, where the prediction pixels are determined based on the positions of samples and the intra mode. While some researchers held that the intra mode prediction block could be obtained using the neighboring samples, it was acknowledged that the reference line might exhibit significant discontinuity compared to the original samples. Therefore, multiple reference lines [18] were supported to improve the accuracy of prediction. All of the approaches above are beneficial to coding efficiency for intra coding, and some of them have been adopted into the VVC codec [19].

Regarding intra-mode decision, the technique of Lagrange optimization played a critical role in chroma coding, as it aimed to obtain the best prediction mode by calculating the minimum of the Lagrange cost, thereby achieving a balance between bitrate and distortion (RD). Several researchers devoted themselves to estimating the Lagrange multipliers, λ, in order to discover improvements in coding efficiency. Initially, some heuristic schemes were proposed to determine the λ using empirical expressions or iterative processes [20,21], but these methods were generally unsuitable for practical applications due to their high complexity and lack of theoretical foundation. Thus, the study of accurate RD models became appealing, aiming to solve the Lagrange multiplier based on the source probability distribution through analytical means. The popular logarithmic relationship between bitrate and distortion was presented based on the high-rate approximation curve for entropy-constrained scalar quantization. Consequently, the corresponding Lagrange multiplier could be accurately calculated in terms of the quantization parameter [22]. In [23], an adaptive Lagrange multiplier was estimated in the ρ-domain, which defines ρ as the percentage of zero coefficients among quantized transformed residuals. Furthermore, in [24], a Laplace distribution of transformed residuals was constructed to capture the RD statistical properties, enabling the derivation of the Lagrange multiplier based on RDO.

All of the above provide significant improvements in RDO performance, yet the Lagrange multiplier may not be sufficient to fully capture the coding properties. Liu et al. [25] built an adaptive parameter model for λ that considered the residual correlation across successive frames. Zhang et al. [26] conducted a comprehensive analysis and showed that the original Lagrange multiplier did not perform well for inter blocks, prompting them to propose and evaluate a novel approach for determining the λ. However, the methods mentioned above are designed under the premise of the identical characteristics of the luma and chroma, i.e., the source signal of the chroma is treated in the same manner as that of the luma. In fact, this is not suitable for accurately describing the RD characteristics of chroma intra mode.

3. Chroma Intra-Mode Decision in VVC

3.1. Chroma Intra Prediction

In the new generation of video coding standard VVC, the intra prediction technique is still retained to remove the redundancy of adjacent pixels in the spatial domain, where chroma signals exhibit the visual correlation among the neighboring chroma units. However, the inter-correlation between the luma and chroma is also considered and applied in chroma intra prediction. Figure 1 depicts the components of the luma (Y) and chroma (Cb and Cr) in one frame, which, to some extent, indicates the component relevancy in texture. Here, the test sequence of ParkScene, provided by JVET, has a resolution of 1280 × 720.

With the 4:2:0 YCbCr format, eight chroma modes are used for intra prediction: VER, HOR, PLANAR, DC, DM, LM, LM_T, and LM_L. Their serial numbers and corresponding operations are listed in Figure 2. Here, the luma and chroma blocks are shown as blue and yellow rectangles, respectively. The reconstructed blocks in gray act as references for intra prediction, and the pixels marked with circles are located in the coding blocks. The serial numbers are defined in [7] to distinguish the intra modes described below. The angular modes VER and HOR use the nearest reference line to project each estimated pixel along the vertical and horizontal directions, respectively. For the DC mode, the predictor in the chroma block is the average of the available neighboring reconstructed chroma pixels. Bi-linear interpolation is employed in PLANAR mode. In DM mode, the chroma block uses the same intra prediction mode as the corresponding luma block. In VVC, the CCLM comprises the LM, LM_T, and LM_L modes, which operate on different luma reference blocks.
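The three simplest predictors described above can be sketched in a few lines of Python. This is a minimal illustration, not the VTM implementation: the function names are hypothetical, the DC value is assumed to be a rounded integer mean of all top and left neighbours, and reference-sample filtering and the special handling of non-square blocks are ignored.

```python
def predict_ver(top, height):
    """VER: copy the reference row above the block into every row."""
    return [list(top) for _ in range(height)]


def predict_hor(left, width):
    """HOR: copy each left reference sample across its row."""
    return [[sample] * width for sample in left]


def predict_dc(top, left):
    """DC: fill the block with the rounded mean of the neighbouring samples."""
    neighbours = list(top) + list(left)
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * len(top) for _ in left]


# Hypothetical reconstructed chroma neighbours of a 4x4 block.
top = [100, 102, 104, 106]
left = [98, 99, 101, 102]

ver_block = predict_ver(top, height=4)
hor_block = predict_hor(left, width=4)
dc_block = predict_dc(top, left)
```

In this toy case every sample of `dc_block` equals 102, the rounded mean of the eight neighbours.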

The LM, LM_T, and LM_L modes use the reconstructed luma samples to predict the chroma block with a linear model. The reconstructed luma CUs of 2N × 2N are first down-sampled to the same resolution as the chroma blocks of N × N. Then, the intra chroma prediction is performed as follows:

y_{p} (i, j) = α x_{r} (i, j) + β,

(1)

where y_p is the predicted pixel of the chroma block, and x_r denotes the reconstructed value in the collocated down-sampled luma block. The parameters of α and β can be derived by the following:

α = \frac{C_{max} - C_{min}}{L_{max} - L_{min}},

(2)

β = C_{min} - α L_{min}.

(3)

Here, L_max and L_min represent the maximum and minimum down-sampled neighboring luma pixels, respectively. C_max and C_min are collocated chroma pixels. The process of LM is provided in Figure 3, and the parameters of α and β can be obtained by linear regression using the blue and yellow samples. Then, the luma reference blocks derived from down-sampling are integrated into (1) for the chroma prediction block. In the same way, the prediction of LM_T and LM_L is similar to that of LM, with potential differences in the selection of luma reference blocks.
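The min/max parameter derivation and the prediction in (1) can be illustrated with a short Python sketch. It is a simplified floating-point version under stated assumptions: the function names are hypothetical, the offset is chosen so that the fitted line passes through the point (L_min, C_min), the neighbouring luma samples are assumed to be already down-sampled, and the integer arithmetic and sample selection rules of a real encoder are not modelled.

```python
def cclm_parameters(neigh_luma, neigh_chroma):
    """Derive (alpha, beta) from paired neighbouring samples via the min/max rule."""
    # Positions of the smallest and largest down-sampled luma neighbour.
    i_min = min(range(len(neigh_luma)), key=lambda i: neigh_luma[i])
    i_max = max(range(len(neigh_luma)), key=lambda i: neigh_luma[i])
    l_min, l_max = neigh_luma[i_min], neigh_luma[i_max]
    # Chroma samples collocated with those two luma samples.
    c_min, c_max = neigh_chroma[i_min], neigh_chroma[i_max]
    if l_max == l_min:
        # Flat luma neighbourhood: fall back to a constant predictor (assumption).
        return 0.0, float(c_min)
    alpha = (c_max - c_min) / (l_max - l_min)
    beta = c_min - alpha * l_min  # line through (l_min, c_min) and (l_max, c_max)
    return alpha, beta


def cclm_predict(luma_ds, alpha, beta, bit_depth=8):
    """Apply y_p(i, j) = alpha * x_r(i, j) + beta and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [[min(max_val, max(0, round(alpha * x + beta))) for x in row]
            for row in luma_ds]


# Hypothetical neighbouring samples and a 2x2 down-sampled luma block.
neigh_luma = [60, 100, 80, 120]
neigh_chroma = [90, 110, 100, 120]
alpha, beta = cclm_parameters(neigh_luma, neigh_chroma)
pred = cclm_predict([[60, 80], [100, 120]], alpha, beta)
```

With these toy values the fit gives alpha = 0.5 and beta = 60, so the luma samples 60, 80, 100, and 120 map to chroma predictions 90, 100, 110, and 120.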

3.2. Chroma Mode Decision

RDO is utilized to select the optimal mode for chroma intra prediction in order to achieve high coding efficiency, which can be represented by the following [27]:

\min \{D\} \quad \text{subject to} \quad R \leq R_{c},

(4)

where R and D represent the rate and distortion, respectively, for a given chroma block to be encoded, and R_c is the available bitrate target. This constrained RDO problem is converted into an unconstrained form using the Lagrange optimization scheme as follows:

\min \{J\}, \quad J = D + λ \cdot R,

(5)

where J indicates the Lagrange cost and λ is the Lagrange multiplier. There are eight chroma prediction options for one block, where the i-th candidate mode corresponds to one operating point with distortion D_i and bitrate R_i. The intra-mode decision searches for the optimal point for the chroma block by minimizing the Lagrange cost J.
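The selection rule in (5) amounts to an arg-min over the candidate operating points. The following Python sketch shows this with made-up (D, R) pairs; the function names and all numbers are illustrative assumptions, not measurements from the paper.

```python
def lagrange_cost(distortion, rate, lam):
    """J = D + lambda * R for one operating point."""
    return distortion + lam * rate


def best_mode(candidates, lam):
    """Return the mode whose (D, R) pair minimises the Lagrange cost."""
    return min(candidates, key=lambda mode: lagrange_cost(*candidates[mode], lam))


# Hypothetical operating points: mode -> (distortion, rate in bits).
candidates = {"DC": (400.0, 10), "DM": (300.0, 18), "LM": (250.0, 30)}

low_lambda = best_mode(candidates, lam=1.0)    # distortion dominates
mid_lambda = best_mode(candidates, lam=5.0)
high_lambda = best_mode(candidates, lam=20.0)  # rate dominates
```

A small λ favours the low-distortion, high-rate point (LM here), whereas a large λ favours the cheapest mode in bits (DC here), which is why the value of the multiplier directly shapes the mode decision.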

In order to decrease the chroma intra prediction complexity in VVC, the RDO execution flow is split into two stages, a rough decision and a fine decision, as shown in Figure 4. The rough decision evaluates five modes with lower complexity using the Lagrange cost J_fir:

J_{fir} = SATD + λ_{fir} \cdot R_{fir},

(6)

where SATD is the sum of absolute Hadamard-transformed differences, and R_fir is the number of bits required to code the prediction mode. The three modes with the lowest J_fir are added to the candidate set, from which the optimal prediction mode is selected as follows:

J_{sec} = SSE + λ_{sec} \cdot R_{sec}.

(7)

J_sec represents the Lagrange cost, and SSE stands for the sum of squared error between the original and reconstructed block. R_sec denotes the total bits spent to encode the predicted residual. Here, the relationship between λ_fir and λ_sec can be expressed as follows:

λ_{fir} = \sqrt{λ_{sec}}.

(8)
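The two-stage flow of (6)–(8) can be sketched as follows. This is a toy model under explicit assumptions: the SATD, SSE, and bit counts are invented numbers passed in as dictionaries, the function names are hypothetical, and the set of modes that skip the rough stage (`bypass`) is an assumption made for illustration rather than something stated in the text.

```python
import math


def rough_decision(rough_stats, lambda_fir, keep=3):
    """Rank modes by J_fir = SATD + lambda_fir * R_fir and keep the best few."""
    costs = {mode: satd + lambda_fir * bits
             for mode, (satd, bits) in rough_stats.items()}
    return sorted(costs, key=costs.get)[:keep]


def fine_decision(candidates, fine_stats, lambda_sec):
    """Pick the candidate with the lowest J_sec = SSE + lambda_sec * R_sec."""
    costs = {mode: fine_stats[mode][0] + lambda_sec * fine_stats[mode][1]
             for mode in candidates}
    return min(costs, key=costs.get)


lambda_sec = 64.0
lambda_fir = math.sqrt(lambda_sec)  # relation (8)

# Hypothetical (SATD, mode bits) for the five modes of the rough stage.
rough_stats = {"VER": (520, 4), "HOR": (480, 4), "DC": (450, 3),
               "LM_T": (400, 5), "LM_L": (610, 5)}
# Hypothetical (SSE, total bits) for every mode that may reach the fine stage.
fine_stats = {"PLANAR": (9000, 40), "DM": (8200, 46), "LM": (7000, 52),
              "LM_T": (7400, 50), "DC": (9500, 38), "HOR": (9800, 41)}
bypass = ["PLANAR", "DM", "LM"]  # assumed to go straight to the fine stage

shortlist = rough_decision(rough_stats, lambda_fir)
best = fine_decision(bypass + shortlist, fine_stats, lambda_sec)
```

Here the rough stage keeps LM_T, DC, and HOR, and the fine stage then selects LM. The cheap SATD-based pass only prunes the list; the final choice always rests on the SSE-based cost.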

4. Proposed Lagrange Optimization Scheme

Generally, the reference pixels located to the left of and above the chroma block are often smooth, so they are easily acquired and directly applicable in intra modes such as VER, HOR, PLANAR, DC, and DM. However, for LM, LM_T, and LM_L, the referenced reconstructed blocks in the luma component must first be down-sampled for intra prediction, which may introduce pixel discontinuity. These discontinuous reference pixels are then used for chroma prediction, so the inherent distortion of the luma component may propagate into subsequent chroma blocks during the intra-mode decision. Figure 5 illustrates the proportion of optimal modes under the All-Intra configuration of VVC, where the quantization parameter (QP) is used to adjust the bitrate and the four common test values QP ∈ {22, 27, 32, 37} are chosen for encoding. It can be observed that LM, LM_T, and LM_L make up more than 40 percent of all selected intra prediction modes. Figure 6 shows the optimal modes of the coding blocks for the BasketballDrive sequence, where the mode numbers are represented by corresponding colors. LM, LM_T, and LM_L again account for a significant portion of the frames under different QPs, indicating that the CCLM plays a vital role in intra coding. Consequently, the distortion propagation from luma to chroma is non-negligible in the intra-mode decision, where the RD cost may exhibit a discrepancy. To improve compression efficiency, this distortion propagation should be taken into account to lessen its negative impact on RDO.

4.1. Distortion Analysis

Owing to the assistance of luma coded blocks, chroma intra prediction utilizes the reference pixels for further compression in the modes of LM, LM_T, and LM_L. This can result in distortion drift from the luma to chroma, and the propagation distortion chain can grow as subsequent intra coding progresses. Here, the pixel distortion d located at (i,j) can be expressed as (9) based on the principle of the chroma intra prediction.

d (i, j) = b (i, j) - b^{'} (i, j) .

(9)

Here, the variable b is the original pixel value in current chroma block B_n, and b′ denotes the reconstructed pixel value in B′_n based on the reconstructed reference pixel. This can be rewritten as (10) by using the reconstructed pixel value b_o, which is based on the original reference pixel.

d (i, j) = [b (i, j) - b_{o} (i, j)] + [b_{o} (i, j) - b^{'} (i, j)] .

(10)

Here, the distortion d is divided into two parts: the difference between the original pixel in the current block and the pixel reconstructed by intra prediction from the original reference pixel b_r, and the difference between the pixel reconstructed from the original reference pixel b_r and the pixel reconstructed from the reconstructed reference pixel b_r′. The former represents the pixel distortion in the current chroma block, and the latter reflects the luma distortion propagated through the reference block. Generally, intra prediction coding can be described using the Gauss–Markov model for pixels, as shown in [28], so that the following holds:

b_{o} (i, j) \approx θ b_{r} (i, j) + η,

(11)

b^{'} (i, j) \approx θ b_{r}^{'} (i, j) + η,

(12)

where θ and η are model parameters. Substituting (11) and (12) into (10) yields the following:

d(i, j) = [b(i, j) - b_{o}(i, j)] + [θ b_{r}(i, j) + η - θ b_{r}^{'}(i, j) - η] \approx d_{c}(i, j) + θ d_{r}(i, j),

(13)

where d_c indicates the distortion between the original chroma pixel and the reconstructed chroma pixel based on the original luma reference pixel, and d_r represents the distortion between the original luma reference pixel and the reconstructed luma reference pixel. Thus, for the k^th chroma block, the total distortion D_k can be obtained as follows:

D_{k} \approx D_{k c} + θ D_{k r} .

(14)

From the analysis above, the total coding distortion for LM, LM_T, and LM_L depends on both the distortion D_kc of the current chroma block and the distortion D_kr of the reference luma block, which characterizes the distortion propagation from luma to chroma.
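The per-pixel decomposition in (9)–(13) can be checked numerically. The sketch below assumes the linear model of (11) and (12) holds exactly, in which case the split d = d_c + θ·d_r is an identity; the function name and all sample values are hypothetical.

```python
def split_distortion(b, b_ref, b_ref_rec, theta, eta):
    """Return (d, d_c, d_r) for one pixel under the linear model of (11)-(12)."""
    b_o = theta * b_ref + eta        # reconstruction from the original reference
    b_rec = theta * b_ref_rec + eta  # reconstruction from the reconstructed reference
    d = b - b_rec                    # total distortion, as in (9)
    d_c = b - b_o                    # distortion of the current chroma pixel
    d_r = b_ref - b_ref_rec          # distortion of the luma reference pixel
    return d, d_c, d_r


# Hypothetical values: original chroma pixel 120, luma reference 100 coded as 96.
theta, eta = 0.5, 68.0
d, d_c, d_r = split_distortion(b=120.0, b_ref=100.0, b_ref_rec=96.0,
                               theta=theta, eta=eta)
propagated = theta * d_r  # share of the distortion inherited from luma
```

In this example the total distortion of 4 splits into 2 from the current block and 2 inherited from the luma reference, so half of the chroma error is caused by luma coding alone.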

4.2. RDO-Based Distortion Feature

By combining the RDO theoretical approach, the Lagrange optimization within one chroma frame can be expressed as follows:

J = \sum_{k} D_{k} + λ_{fir} \sum_{k} R_{k}.

(15)

Assuming that the k-th block is associated with luma pixels, its coding distortion can propagate through intra prediction to the chroma blocks below and to the right, which ties the distortions D_{k,blow} and D_{k,right} of those blocks to R_k. Consequently, the derivative of the RD cost with respect to R_k is as follows:

\frac{\partial J}{\partial R_{k}} = \frac{\partial D_{k}}{\partial R_{k}} + \frac{\partial D_{k,blow}}{\partial R_{k}} + \frac{\partial D_{k,right}}{\partial R_{k}} + λ_{fir}
= \frac{\partial D_{k}}{\partial R_{k}} + \frac{\partial (D_{kc,blow} + θ_{blow} D_{k})}{\partial R_{k}} + \frac{\partial (D_{kc,right} + θ_{right} D_{k})}{\partial R_{k}} + λ_{fir}
= (1 + θ_{blow} + θ_{right}) \frac{\partial D_{k}}{\partial R_{k}} + λ_{fir}
= δ \frac{\partial D_{k}}{\partial R_{k}} + λ_{fir}.

(16)

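Setting this derivative to zero gives ∂D_k/∂R_k = −λ_fir/δ, so the multiplier is effectively scaled by 1/δ, and the Lagrange optimization can be further transformed as follows: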

J = \sum_{k} D_{k} + \frac{λ_{fir}}{δ} R_{k} = \sum_{k} D_{k} + λ_{fir}^{'} R_{k},

(17)

where the parameter δ is estimated by linear regression using coded chroma blocks, and the modified Lagrange multiplier λ′_fir for the modes of LM, LM_T, and LM_L is then derived using (17). Following that, the Lagrange multiplier λ′_sec, which is paired with the SSE distortion measure, is updated as follows:

λ_{sec}^{'} = (λ_{fir}^{'})^{2} = \left(\frac{λ_{fir}}{δ}\right)^{2}.

(18)
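As a worked illustration of (8) and (16)–(18), take the purely hypothetical values λ_sec = 64, θ_blow = 0.125, and θ_right = 0.125. These numbers are assumptions chosen for easy arithmetic, not values reported in this paper.

```latex
\begin{aligned}
\lambda_{fir} &= \sqrt{\lambda_{sec}} = \sqrt{64} = 8, \\
\delta &= 1 + \theta_{blow} + \theta_{right} = 1 + 0.125 + 0.125 = 1.25, \\
\lambda'_{fir} &= \frac{\lambda_{fir}}{\delta} = \frac{8}{1.25} = 6.4, \\
\lambda'_{sec} &= (\lambda'_{fir})^{2} = 6.4^{2} = 40.96 .
\end{aligned}
```

Because δ > 1 whenever the propagation coefficients are positive, the modified multipliers are smaller than the original ones, so the rate term carries less weight for the CCLM modes.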

In the intra-mode decision, the rough decision uses the Lagrange multiplier λ′_fir for the modes of LM_T and LM_L, whereas the other modes use λ_fir. Likewise, λ′_sec and λ_sec are both incorporated into the fine decision process to determine the optimal chroma mode, where λ′_sec is employed for the modes of LM, LM_T, and LM_L. Figure 7 illustrates the modified intra-mode decision using Lagrange optimization.
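The mode-dependent choice of multiplier can be sketched as a small dispatch function. This is an illustrative sketch only: the function names, the stage labels, and the numeric (SSE, bits) pairs are assumptions, and δ is passed in as a given value rather than estimated by regression.

```python
import math

CCLM_MODES = {"LM", "LM_T", "LM_L"}


def multiplier_for(mode, stage, lambda_sec, delta):
    """Return the Lagrange multiplier for `mode` in the 'rough' or 'fine' stage."""
    # CCLM modes use the multiplier scaled by 1/delta; other modes keep the original.
    scale = delta if mode in CCLM_MODES else 1.0
    lambda_fir = math.sqrt(lambda_sec) / scale
    if stage == "rough":
        return lambda_fir        # paired with SATD
    if stage == "fine":
        return lambda_fir ** 2   # paired with SSE
    raise ValueError(f"unknown stage: {stage!r}")


def choose_fine_mode(fine_stats, lambda_sec, delta):
    """Fine decision in which every mode is costed with its own multiplier."""
    costs = {mode: sse + multiplier_for(mode, "fine", lambda_sec, delta) * bits
             for mode, (sse, bits) in fine_stats.items()}
    return min(costs, key=costs.get)


# Hypothetical (SSE, bits) pairs for two competing modes.
fine_stats = {"DM": (8000.0, 40), "LM": (7600.0, 50)}
original = choose_fine_mode(fine_stats, lambda_sec=64.0, delta=1.0)
modified = choose_fine_mode(fine_stats, lambda_sec=64.0, delta=1.25)
```

With δ = 1 (no propagation modelled) DM wins, while δ = 1.25 lowers the rate penalty of LM enough for it to be selected. The example is constructed to show how the modified multiplier can flip a decision, not to suggest that it always does.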

5. Experimental Results

To evaluate the coding efficiency of the proposed method, the improved Lagrange optimization scheme is incorporated into the VVC reference software VTM 22.1 [29] and compared with the original version. Two aspects are verified: the coding efficiency of the proposed RDO method relative to the original RDO method, and the computational complexity of the proposed method relative to the original one. The experiments are conducted under the common test conditions specified in [30]. The video sequences, which belong to Classes B, C, D, and E, have resolutions of 1920 × 1080, 832 × 480, 416 × 240, and 1280 × 720, respectively, as listed in Table 1. Each video sequence contains 300 coding frames, and the quantization parameter takes the values 22, 27, 32, and 37. The test platform runs Windows 10 on an Intel Core i7-8700 CPU.

5.1. Coding Efficiency

The proposed chroma intra coding is compared with the original scheme using the Bjøntegaard delta bitrate (∆bitrate), computed from the RD curves with YCbCr PSNR [31]. A negative ∆bitrate value indicates a reduction in the coding bitrate compared to the anchor scheme. The coding efficiency results are summarized in Table 2, showing an average bitrate reduction of approximately 1.23% in the All-Intra configuration. Correspondingly, a bitrate saving of 1.03% is observed in the Random-Access configuration, as shown in Table 3, where the intra period is 32. These data show that the modified chroma RDO in the intra-mode decision improves coding efficiency. Furthermore, the RD curves in Figure 8 demonstrate that the proposed scheme achieves better RD performance than conventional VVC. In conclusion, the proposed algorithm enhances chroma coding efficiency.

5.2. Coding Complexity

In order to verify the effect on coding complexity, the proportion of time increment is calculated as follows:

∆T = \frac{T_{p} - T_{o}}{T_{o}},

(19)

where T_p and T_o represent the total coding time of the proposed algorithm and the original algorithm, respectively. The results of the complexity comparison are given in Table 4, where the average values of ∆T are approximately 0.030 for the All-Intra configuration and 0.014 for the Random-Access configuration. It can be observed that the computational complexity has a slight increase compared to the original algorithm. Despite the fact that the chroma RDO process in the LM, LM_T, and LM_L modes consumes additional coding time, the increment in complexity is not excessive. Based on the above evaluation, it can be verified that the proposed method provides improved coding performance with an acceptable computation complexity.
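The complexity measure in (19) is a relative time difference, which a few lines of Python make concrete. The function name and the timings below are hypothetical and are not the measurements behind Table 4.

```python
def time_increment(t_proposed, t_original):
    """Relative increase in coding time, (T_p - T_o) / T_o."""
    if t_original <= 0:
        raise ValueError("original coding time must be positive")
    return (t_proposed - t_original) / t_original


# Hypothetical total encoding times in seconds.
delta_t = time_increment(t_proposed=1030.0, t_original=1000.0)
```

A run that takes 1030 s instead of 1000 s gives ∆T = 0.03, i.e., a 3% increase in coding time.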

6. Conclusions

The chroma intra-mode decision is interrelated with the luma component, which significantly impacts both objective and subjective coding efficiency. Since chroma prediction may utilize reconstructed luma reference blocks, there is a dependency between the components, and intra prediction coding may propagate distortion from luma to subsequent chroma blocks. Therefore, an improved mode decision based on Lagrange optimization for chroma is proposed to address this RD characteristic, where the new Lagrange multiplier is employed in the LM, LM_T, and LM_L modes. The experiments demonstrate that the proposed method improves the bitrate performance of chroma intra coding by an average of 1.23% under the All-Intra condition compared to the original VVC scheme.

Author Contributions

Conceptualization, W.L. and C.F.; methodology, W.L.; investigation, W.L.; writing—original draft preparation, W.L.; writing—review and editing, C.F.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Basic Research Program of Shaanxi (Program No. 2024JC-YBQN-0727 and Program No. 2023-ZDLGY-50) and the Research Program of School-enterprise cooperation (No. 441223064).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Sugawara, M.; Masaoka, K. UHDTV image format for better visual experience. Proc. IEEE 2013, 101, 8–17. [Google Scholar] [CrossRef]
Marzuki, I.; Sim, D. Overview of potential technologies for future video coding standard (FVC) in JEM software: Status and review. IEIE Trans. Smart Process. Comput. 2018, 7, 22–35. [Google Scholar] [CrossRef]
ITU-T Recommendation H.264 and ISO/IEC 14496–10; Advanced Video Coding (AVC). ITU-T: Paris, France, 3–12 May 2003.
ITU-T Recommendation H.265 and ISO/IEC 23008–2; High Efficiency Video Coding (HEVC). ITU-T: Paris, France, 21–28 April 2013.
ITU-T Recommendation H.266 and ISO/IEC 23090–3; Versatile Video Coding (VVC). ITU-T: Paris, France, 22–30 June 2020.
Zhao, H.; Zhao, S.; Shang, X.; Wang, G. A Fast Algorithm for VVC Intra Coding Based on the Most Probable Partition Pattern List. Appl. Sci. 2023, 13, 10381. [Google Scholar] [CrossRef]
Chen, J.; Ye, Y.; Kim, S. Algorithm description for versatile video coding and test model 12 (VTM 12). In Proceedings of the 21st Meeting of ITU-T/ISO/IEC Joint Video Experts Team (JVET), JVET-U2002, Geneva, Switzerland, 6–15 January 2021. [Google Scholar]
Chen, G.; Lin, M. Sample-Based Gradient Edge and Angular Prediction for VVC Lossless Intra-Coding. Appl. Sci. 2024, 14, 1653. [Google Scholar] [CrossRef]
Viitanen, M.; Sainio, J. From HEVC to VVC: The first development steps of practical intra video encoder. IEEE Trans. Consum. Electron. 2022, 68, 139–148. [Google Scholar] [CrossRef]
Kim, J.K.; Oh, K.J.; Kim, J.W.; Kim, D.W.; Seo, Y.H. Intra Prediction-Based Hologram Phase Component Coding Using Modified Phase Unwrapping. Appl. Sci. 2021, 11, 2194. [Google Scholar] [CrossRef]
Kim, J.; Park, S.W.; Park, J.Y.; Jeon, B.M. Intra chroma prediction using inter channel correlation. In Proceedings of the Meeting report of the second meeting of the Joint Collaborative Team on Video Coding (JCT-VC) JCTVC-B021, Geneva, Switzerland, 21–28 July 2010. [Google Scholar]
Zhang, X.; Gisquet, C.; Francois, E.; Zou, F.; Au, O.C. Chroma Intra Prediction Based on Inter-Channel Correlation for HEVC. IEEE Trans. Image Process. 2014, 23, 274–286. [Google Scholar] [CrossRef]
Li, S.; Zhou, Q.; Chen, Z.; Liu, Y.; Ling, N. A Linear Model for YUV 4:2:0 Chroma Intra Prediction. In Proceedings of the 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Hokkaido, Japan, 26 May 2019. [Google Scholar]
Choi, N.; Park, M.; Choi, K. CE3-related: Chroma DM modification. In Proceedings of the Meeting Report of the 12th Meeting of the Joint Video Experts Team (JVET) JVET-L0053, Macao, China, 3–12 October2018. [Google Scholar]
Zuo, X.; Wang, L.; Chen, F.; Song, X. Intra block copy for intra-frame coding. In Proceedings of the Meeting Report of the 10th Meeting of the Joint Video Experts Team (JVET) JVET-J0042, San Diego, CA, USA, 10–20 April 2018. [Google Scholar]
Zhang, T.; Fan, X.; Zhao, D.; Xiong, R.; Gao, W. Hybrid intraprediction based on local and nonlocal correlations. IEEE Trans. Multimed. 2018, 20, 1622–1635. [Google Scholar] [CrossRef]
Said, A.; Zhao, X.; Karczewicz, M.; Chen, J.; Zou, F. Position dependent prediction combination for intra-frame video coding. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 534–538. [Google Scholar]
Chang, Y.J.; Jhu, H.J.; Jiang, H.Y.; Zhao, L. Multiple Reference Line Coding for Most Probable Modes in Intra Prediction. In Proceedings of the Data Compression Conference (DCC), Snowbird, UT, USA, 26–29 March 2016; 2019; p. 559. [Google Scholar]
Zhao, L.; Zhao, X.; Liu, S.; Li, X.; Lainema, J.; Rath, G.; Urban, F.; Racape, F. Wide Angular Intra Prediction for Versatile Video Coding. In Proceedings of the 2019 Data Compression Conference (DCC), Snowbird, UT, USA, 26–29 March 2019.
Yang, Y.; Hemami, S. Generalized rate-distortion optimization for motion-compensated video coders. IEEE Trans. Circuits Syst. Video Technol. 2000, 10, 942–955.
Yang, E.; Yu, X. Rate distortion optimization for H.264 interframe coding: A general framework and algorithms. IEEE Trans. Image Process. 2007, 16, 1774–1784.
Wiegand, T.; Girod, B. Lagrange multiplier selection in hybrid video coder control. In Proceedings of the 2001 International Conference on Image Processing (Cat. No.01CH37205), Thessaloniki, Greece, 7–10 October 2001; pp. 542–545.
Chen, L.; Garbacea, I. Adaptive lambda estimation in Lagrangian rate-distortion optimization for video coding. In Proceedings of the SPIE Visual Communications and Image Processing, San Jose, CA, USA, 15 January 2006.
Li, X.; Oertel, N.; Hutter, A.; Kaup, A. Laplace Distribution Based Lagrangian Rate Distortion Optimization for Hybrid Video Coding. IEEE Trans. Circuits Syst. Video Technol. 2009, 19, 193–205.
Liu, Z.; Wang, D.; Zhou, J.; Ikenaga, T. Lagrange multiplier optimization using correlations in residues. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 25–30 March 2012; pp. 1185–1188.
Zhang, F.; Bull, D. Rate-distortion optimization using adaptive Lagrange multipliers. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 3121–3131.
Sullivan, G.J.; Wiegand, T. Rate-distortion optimization for video compression. IEEE Signal Process. Mag. 1998, 15, 74–90.
Tu, Y.K.; Yang, J.F.; Sun, M.T. Rate-distortion modeling for efficient H.264/AVC encoding. IEEE Trans. Circuits Syst. Video Technol. 2007, 17, 530–543.
VTM Reference Software. Available online: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM (accessed on 26 October 2023).
Bossen, F. JVET Common Test Conditions and Software Reference Configurations for SDR Video. In Proceedings of the Meeting Report of the 15th Meeting of the Joint Video Experts Team (JVET) JVET-N1010, Gothenburg, Sweden, 3–12 July 2019.
Bjontegaard, G. Calculation of Average PSNR Difference Between RD-curves. In Proceedings of the ITU-T Q.6/SG16 VCEG 13th Meeting, Doc. VCEG-M33, Austin, TX, USA, 2–4 April 2001.

Figure 1. Luma (Y) and chroma (Cb and Cr) components of the ParkScene sequence.

Figure 2. Chroma prediction modes for intra coding.

Figure 3. The framework of LM in intra prediction.

Figure 4. The framework of RDO in chroma prediction.

Figure 5. Ratio of each optimal mode in BasketballDrive.

Figure 6. Distribution of each optimal mode in BasketballDrive.

Figure 7. The framework related to the modified scheme.

Figure 8. RD curves comparing the proposed and original schemes for the RaceHorses (832 × 480) sequence.

Table 1. The test sequences used for verifying the proposed method.

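Class/Resolution/Bit-Depth	Frame Rate (fps)	Sequence
Class B/1920 × 1080/8	50	Cactus
Class B/1920 × 1080/8	50	BasketballDrive
Class B/1920 × 1080/8	60	BQTerrace
Class C/832 × 480/8	30	RaceHorses
Class C/832 × 480/8	60	BQMall
Class C/832 × 480/8	50	PartyScene
Class C/832 × 480/8	50	BasketballDrill
Class D/416 × 240/8	30	RaceHorses
Class D/416 × 240/8	60	BQSquare
Class D/416 × 240/8	50	BlowingBubbles
Class D/416 × 240/8	50	BasketballPass
Class E/1280 × 720/8	60	FourPeople
Class E/1280 × 720/8	60	Johnny
Class E/1280 × 720/8	60	KristenAndSara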

Table 2. The RD performance of the proposed algorithm compared to the original algorithm in the All-Intra configuration.

Sequences	∆Bitrate (%)
Class B	−1.23
Class C	−1.24
Class D	−1.22
Class E	−1.26
Average	−1.23

Table 3. The RD performance of the proposed algorithm compared to the original algorithm in the Random-Access configuration.

Sequences	∆Bitrate (%)
Class B	−1.03
Class C	−1.04
Class D	−1.03
Class E	−1.05
Average	−1.03

Table 4. The encoding complexity of the proposed algorithm compared to the original algorithm.

Sequences	ΔT (All-Intra)	ΔT (Random-Access)
Class B	0.033	0.017
Class C	0.028	0.014
Class D	0.031	0.016
Class E	0.029	0.012
Average	0.030	0.014

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
