1. Introduction
In recent years, three-dimensional (3D) video has attracted considerable attention from research institutes and industry owing to the development of multimedia and stereoscopic display technologies [1]. A new 3D representation, multiview video plus depth (MVD), has been developed by MPEG and has become the most popular format for 3D applications [2,3,4]. In the MVD format, only a few captured texture videos and their associated depth maps are used to describe the 3D scene. After the texture and depth data are received, arbitrary virtual views can be synthesized by depth image-based rendering (DIBR) [5]. To reduce the transmission cost, both the texture video and the depth map of the MVD data are coded and transmitted. Since the depth map is not displayed at the receiver side and only represents the geometry of the 3D scene, the quality of depth map coding needs to be measured by the quality of the rendered images. Therefore, depth coding for the new compression standard has become an active research topic.
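As a minimal sketch of the per-pixel step behind DIBR, the following converts a depth-map sample into a horizontal disparity. It assumes the commonly used 8-bit inverse-depth quantization and a 1D-parallel camera arrangement, neither of which is specified in this paper; the names `z_near`, `z_far`, `focal_length`, and `baseline` are illustrative parameters, not values from the test conditions.

```python
def depth_value_to_disparity(v, z_near, z_far, focal_length, baseline):
    """Convert an 8-bit depth-map sample v (0..255) to a disparity in pixels.

    Assumption: v quantizes inverse depth linearly between 1/z_far (v = 0)
    and 1/z_near (v = 255), and the cameras are parallel and rectified.
    """
    if not 0 <= v <= 255:
        raise ValueError("depth sample must lie in [0, 255]")
    if not 0 < z_near < z_far:
        raise ValueError("expected 0 < z_near < z_far")
    # Inverse of the real-world depth Z represented by the sample.
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    # For parallel cameras, disparity = focal_length * baseline / Z.
    return focal_length * baseline * inv_z
```

Under these assumptions, a coding error in the depth sample translates into a shifted pixel position in the synthesized view, which is why depth coding quality is judged on the rendered images rather than on the depth map itself.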
Based on this consideration, the Joint Collaborative Team on 3D Video (JCT-3V) was formed, and 3D high efficiency video coding (3D-HEVC) was developed as an extension of the new compression standard HEVC [6,7,8]. In 3D-HEVC, several tools, such as depth modeling modes (DMMs) and segment-wise DC coding (SDC), are designed to better code the edges in a depth map [9]. To reach the highest compression efficiency, the depth encoder examines all possible modes to find the one with the lowest rate-distortion (RD) cost, which involves extremely high complexity. Thus, it is highly desirable to develop a low-complexity depth coding scheme that reduces the encoding time of the 3D-HEVC encoder with no loss of compression performance.
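The exhaustive search described above can be sketched as follows, assuming the usual Lagrangian cost J = D + λR. The function names, the `evaluate` callback, and the toy numbers in the usage note are hypothetical and do not come from the HTM software.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian RD cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_mode(candidates, evaluate, lam):
    """Try every candidate mode and return (mode, cost) with the lowest RD cost.

    `evaluate(mode)` is a hypothetical callback returning (distortion, rate_bits);
    in a real encoder this is the expensive step, since it means actually
    coding the block with that mode.
    """
    best = None
    for mode in candidates:
        distortion, rate_bits = evaluate(mode)
        cost = rd_cost(distortion, rate_bits, lam)
        if best is None or cost < best[1]:
            best = (mode, cost)
    if best is None:
        raise ValueError("no candidate modes supplied")
    return best
```

For example, with a toy table `{"SKIP": (120.0, 2), "Planar": (80.0, 20), "DMM1": (40.0, 60)}` and `lam = 2.0`, the search evaluates all three entries before returning `"Planar"`. A fast scheme aims to reach the same decision while calling `evaluate` on fewer candidates.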
A number of methods [10,11,12,13,14,15,16,17] have been reported to speed up depth coding. A fast decision method is presented in Yoon and Ho [10] to reduce the complexity of H.264/AVC. A fast depth coding method is proposed in Lee et al. [11] to reduce the encoding time of multiview video coding (MVC). A fast approach that depends on the correlation between depth and texture is utilized in Shen et al. [12] to speed up the coding computation. A fast mode decision is employed in Yoon and Ho [13] by using an adaptive edge classification that determines depth discontinuity regions. A fast motion estimation strategy is employed in Cernigliaro et al. [14] by jointly using the texture image information. A fast method is designed in Pan et al. [15] to reduce the complexity of motion estimation (ME) by utilizing the coding correlations between the texture video and the depth map. A fast mode decision method is introduced in Shen and Zhang [16] to minimize depth coding complexity. A joint MVD compression method has been investigated in our previous work [17] to reduce the encoding time of depth compression. All of these methods are efficient in reducing depth map coding complexity with acceptable video quality for the previous coding standards. However, these fast methods are not well suited to the HEVC-based 3D video encoder, which adopts a new quadtree-structured coding unit that further increases the complexity of 3D-HEVC encoders.
Recently, complexity reduction methods for depth compression have also been proposed for 3D-HEVC. They can be divided into two categories: fast inter prediction and fast intra mode decision. Category 1: fast inter prediction algorithms proposed in the literature [18,19,20,21,22,23,24]. An early termination method is presented in Tohidypour et al. [18] to reduce the complexity of the encoder based on the correlations between the dependent and base views. A fast decision scheme is proposed in Shen et al. [19] to save depth compression time by exploiting the inter-component, inter-level, and inter-view correlations. However, the property of intra mode prediction in depth map coding is not exploited in this method, and the inter-view coding mode correlation is not suitable for depth map coding. The reason is that most depth maps are obtained by depth estimation algorithms, so there is very little correlation between the depth sequences of different viewpoints. A low-complexity method based on motion information is developed in Zhang et al. [20] to accelerate the mode decision in depth coding. An early SKIP method is investigated in Conceição et al. [21] based on an adaptive threshold to reduce the complexity of inter frames. An early determination scheme is designed in Chen et al. [22], which achieves an average complexity reduction of 22% with no coding loss. A fast decision is developed in Lei et al. [23] to reduce the candidate modes of depth coding, where the inter-view and grayscale similarity correlations are jointly used. A fast determination algorithm [24] is utilized to reduce the depth encoding time. A fast mode selection is introduced in our previous work [25] to speed up depth compression by utilizing edge information.
Category 2: fast intra prediction algorithms [26,27,28,29,30,31,32,33,34]. A fast partitioning method is employed in Zhang et al. [26] to accelerate the encoding process. A fast framework based on edge information is developed in Sanchez et al. [27] to accelerate the depth mode prediction. Fast schemes are incorporated in Park [28,29] to balance the time reduction against the RD performance of 3D-HEVC. A fast depth coding method [30] is proposed to reduce the complexity of the RD-cost calculation. A fast depth mode selection is presented in He et al. [31] to reduce the intra candidates of the current coding unit. An intra decision scheme is proposed in Zhang et al. [32], using the squared Euclidean distance of variances to speed up depth compression. Early termination and fast intra mode selection methods are exploited in our previous works [33,34] to reduce the runtime of the depth intra mode decision process; average complexity reductions of 40% and 30%, respectively, are achieved for intra-only coding. The aforementioned algorithms can effectively accelerate intra and inter mode decisions with acceptable video quality degradation. However, useful depth map compression information, including the inter mode, intra mode, and RD cost correlations in 3D-HEVC encoders, is not fully exploited in these algorithms. There is still room for further reducing the computational complexity of depth coding.
In this paper, we present a novel 3D-HEVC scheme to reduce the mode prediction complexity of depth coding. The basic idea is to use the spatial–temporal and texture video correlations to analyze the characteristics of the current treeblock and to skip unnecessary modes. The scheme consists of two methods: early SKIP mode decision and adaptive intra prediction selection.
The rest of the paper is organized as follows. Section 2 presents the mode statistical analysis of depth map coding. Section 3 explains the proposed fast scheme in detail. The experimental results and conclusion are provided in Section 4 and Section 5, respectively.
4. Experimental Results
To evaluate the proposed complexity reduction scheme, the experiments are implemented on the 3D-HEVC reference software (HTM 16.1 [35]). Eight MVD sequences recommended by the JCT-3V common test conditions (CTC) [36] are used for the simulation. Among these test sequences, “Shark”, “Undo_Dancer”, and “GT_Fly” are synthetic videos with high-precision depth maps, and “Kendo”, “Balloons”, “Newspaper”, “Poznan_Hall2”, and “Poznan_Street” are natural videos with estimated depth maps. The experimental conditions are as follows: three views are coded in the order center view, left view, right view; the QP pairs are set to (25, 34), (30, 39), (35, 42), and (40, 45); 100 frames are coded for each sequence; DMMs are enabled; and view synthesis optimization (VSO) is enabled. The “VSRS-1D-Fast” software is used for view synthesis [38]. We use the Bjontegaard Delta Bitrate (BDBR) [39] to evaluate the compression performance of our scheme. The complexity is measured by the reduction in total encoding time, “TS”, which is defined as follows:

$$TS = \frac{T_{\mathrm{anchor}} - T_{\mathrm{proposed}}}{T_{\mathrm{anchor}}} \times 100\%$$

where $T_{\mathrm{proposed}}$ and $T_{\mathrm{anchor}}$ represent the run times of the proposed encoder scheme and the anchor method for the same test sequence, respectively. The experiments were run on a Microsoft Windows 7 SP1 (64-bit) workstation with two CPUs of
[email protected] GHz.
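The time-saving measure described above can be sketched as a small helper. The function name is illustrative, and the sign convention (positive values mean the proposed encoder is faster) is an assumption chosen to match the way savings are reported in this section.

```python
def time_saving_percent(t_anchor, t_proposed):
    """Relative reduction in encoding time of the proposed encoder, in percent.

    t_anchor and t_proposed are run times (in any common unit) measured on
    the same test sequence. Positive results mean the proposed encoder is faster.
    """
    if t_anchor <= 0:
        raise ValueError("anchor run time must be positive")
    return (t_anchor - t_proposed) / t_anchor * 100.0
```

For instance, an anchor run of 200 s against a proposed run of 100 s gives a saving of 50%.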
Table 8 gives the experimental results of the proposed scheme compared with the HTM 16.1 encoder. BDBR_S and BDBR_T represent the BD-rates calculated from the synthesized view PSNR and the texture video PSNR, respectively. It can be observed from Table 8 that the proposed scheme significantly reduces the run time for both the all-intra and random access configurations. The proposed scheme saves 49.1% and 63.0% of the encoding time on average for the “Random Access” and “All-Intra” configurations, respectively, with a minimum of 45.6% for “Shark” (“Random Access” case) and a maximum of 68.7% for “Poznan_Hall2” (“All-Intra” case). The encoding time saving for “Poznan_Hall2” is larger than that for the other test videos because this sequence contains more stationary areas. The computation reduction of the proposed scheme is particularly high because many unnecessary modes are skipped. Meanwhile, the proposed scheme slightly reduces the quality of the rendered video because the depth map incurs a small loss. The average bitrate increase in BDBR_S is less than 0.29% and 0.24% for the “Random Access” and “All-Intra” configurations, respectively. Additionally, the proposed scheme improves the bitrate performance in BDBR_T because the bitrate of the depth map is reduced, leading to BDBR decreases of 0.12% (“Random Access” case) and 0.07% (“All-Intra” case) compared with the original HTM 16.1 method. Thus, the proposed scheme effectively reduces the run time of depth coding with only a small loss of coding efficiency.
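For readers unfamiliar with the metric, the following is a minimal sketch of a Bjontegaard delta bitrate computation. It assumes exactly four RD points per curve (one per QP pair) with distinct PSNR values, so that the cubic fit of log-rate over PSNR is the interpolating polynomial and Simpson's rule integrates it exactly. The function names are illustrative; this is not the evaluation script used for Table 8.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial interpolating the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(psnr, rate, lo, hi):
    """Average of the cubic log10(rate)-over-PSNR curve on [lo, hi]."""
    logs = [math.log10(r) for r in rate]
    f = lambda x: _lagrange(psnr, logs, x)
    # Simpson's rule is exact for polynomials up to degree three.
    integral = (hi - lo) / 6.0 * (f(lo) + 4.0 * f((lo + hi) / 2.0) + f(hi))
    return integral / (hi - lo)


def bd_rate_percent(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (test vs. anchor) at equal PSNR, in percent.

    Negative values mean the tested encoder needs less bitrate than the anchor.
    """
    if not (len(rate_anchor) == len(psnr_anchor) == 4
            and len(rate_test) == len(psnr_test) == 4):
        raise ValueError("this sketch expects exactly four RD points per curve")
    # Compare the curves only over the PSNR range covered by both.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    if hi <= lo:
        raise ValueError("PSNR ranges do not overlap")
    diff = (_mean_log_rate(psnr_test, rate_test, lo, hi)
            - _mean_log_rate(psnr_anchor, rate_anchor, lo, hi))
    return (10.0 ** diff - 1.0) * 100.0
```

Under these assumptions, BDBR_S would be obtained by passing the synthesized view PSNR values, and BDBR_T by passing the texture video PSNR values, for the anchor and the tested encoder.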
Figure 4 and Figure 5 show the RD performance and depth map runtime saving curves of the proposed scheme compared with the original encoder on two test sequences, “Kendo” (a natural sequence with 1024 × 768 resolution) and “Undo_Dancer” (a synthetic sequence with 1920 × 1088 resolution), under the “Random Access” and “All-Intra” configurations. It can be observed from Figure 4 and Figure 5 that the proposed scheme achieves a consistent depth map encoding time saving over a large QP range with nearly the same rendered view PSNR. Furthermore, as the encoding bit rate decreases, the time saving of the proposed scheme increases. This is because, as the QP increases, the probability of checking SKIP under the early SKIP mode decision algorithm and the probability of testing the Planar/DC modes under the adaptive intra prediction selection algorithm both increase.
In addition to the original HTM, the proposed scheme is also compared with several state-of-the-art fast methods under the “Random Access” and “All-Intra” conditions. These methods include FMDRA [19] (optimized only for depth map inter coding), CRSED [27], EBIMS [29], LBPMS [30], PDIMS [32], and EETDM [33], which are recently developed fast and efficient depth map coding algorithms for 3D-HEVC. All algorithms were implemented on HTM 16.1 and tested on the same computer for an objective comparison. It can be seen from Figure 6 and Figure 7 that all of these previous works achieve good encoding time savings. Compared with them, the proposed scheme achieves a larger computation reduction: approximately 8.5%–41.0% of the depth runtime can be further saved under the “Random Access” and “All-Intra” conditions. On the other hand, compared with the CRSED and EBIMS methods, the coding efficiency loss of the proposed scheme is negligible, with a BDBR_S increase of less than 0.2%. Furthermore, the proposed scheme achieves better RD performance than the FMDRA, LBPMS, PDIMS, and EETDM methods. These results show that the proposed method outperforms recent fast algorithms for 3D-HEVC in terms of depth coding runtime saving, with better or similar RD performance.