1. Introduction
Digital television systems have gained widespread popularity due to their superior image and sound quality compared to analog television systems [1,2,3]. When developing receivers for digital television (DTV) systems like Integrated Services Digital Broadcasting (ISDB), Digital Video Broadcasting (DVB), or Advanced Television Systems Committee (ATSC) [4,5,6], it is crucial to have a controlled environment for studying and testing the receivers. The testing process should be straightforward and cost-effective, considering that it may need to be performed multiple times by numerous engineers [7,8]. The authors of [9] suggest using Digital Video Broadcasting-Terrestrial (DVB-T) systems as position sensors in regions where Global Navigation Satellite Systems (GNSS) experience reduced performance.
In high-definition television (HDTV) broadcasting systems, the vertical size of the picture in the video stream specifies the height of each encoded picture. It represents the displayable part of the luminance component of the frame in lines. To avoid start code emulation, the vertical size value must not be zero and should be a multiple of 16, 32, or 64 [10]. The standardization of HDTV has taken several decades, resulting in various resolutions and standards beyond the common 1920 × 1080. Interestingly, even for 1920 × 1080, Moving Picture Experts Group 2 (MPEG-2) encoders are used. In practice, the full HDTV video source is padded with an additional eight lines at the bottom of the display, which are filled either with a black or gray bar in real-time hardware encoders or with a repetition of the last line of content in software encoders like the MPEG Software Simulation Group (MSSG) encoder. These extra eight lines constitute the DTV Essential Hidden Area (DEHA), which plays an essential role in DTV encoding, streaming, and decoding but has remained unutilized until now.
The authors in [11] introduced a real-time HDTV video decoding scheme for DVB or ATSC set-top boxes, emphasizing an efficient decoder design with reduced memory access contention. The simulation results demonstrate the decoder’s ability to handle MPEG-2 MP@HL HDTV videos in real time at a clock rate of 81 MHz and a bit rate of 18–22 Mbps. The authors in [12] introduced a low-cost H.264/AVC video decoder for HDTV, achieving real-time HD1080 video decoding at 120 MHz with low power consumption. A real-time digital HDTV video decoding architecture with a dual decoding data path and efficient anchor picture storage is presented in [13]. It achieves the real-time decoding of MPEG-2 MP@HL HDTV at 1920 × 1080 pixels/frame and 30 frames/s, with a low clock rate of 81 MHz and a bit rate of 18 to 20 Mbps.
This paper introduces a modified general HDTV decoder designed specifically for utilizing the DEHA. Firstly, the DEHA is thoroughly analyzed in conventional terrestrial and internet HDTV broadcasting systems using the implemented decoder. Secondly, a gray block image is used for transferring program-related metadata to evaluate the effectiveness of DEHA utilization. The experimental verification of program-related metadata, including technical information, specific camera details, and post-production technologies, is achieved using an overlaid Quick Response (QR) code. Furthermore, the impact of DEHA utilization on video compression processes is examined to assess video quality.
The utilization of DEHA offers seamless transfer capabilities and proves advantageous for various purposes related to DTV services, benefiting from its inherent synchronization advantages within video content. Moreover, the proposed system adheres to well-established standards such as ATSC, ISDB, and DVB, ensuring compatibility and interoperability.
In summary, this research aims to explore the untapped potential of DEHA utilization in DTV systems. By introducing a modified HDTV decoder and conducting comprehensive experiments, we aim to demonstrate the significance of the DEHA in terrestrial and internet HDTV broadcasting. The findings presented in this paper provide insights into the effective utilization of the DEHA, paving the way for advancements and improvements in the field of DTV.
2. The Conventional DEHA Analysis
2.1. DTV Essential Hidden Area (DEHA)
Digital Television (DTV) broadcasting systems, such as ISDB, ATSC, and DVB, utilize the H.262 and H.264 video codecs for delivering High Definition (HD) content [14]. The study in [15] investigates the advancement of high-capacity transmission technologies for the future digital terrestrial television broadcasting system, where the authors achieved successful experiments involving 8K transmission and mobile reception via advanced modulation and Multiple-Input Multiple-Output (MIMO) technologies. That study presents an overview of the advanced Integrated Services Digital Broadcasting-Terrestrial (ISDB-T) transmission system and its performance, as part of a program supported by the Ministry of Internal Affairs and Communications. The authors in [16] focused on the Versatile Video Coding (VVC) standard, which offers significantly improved performance compared to High Efficiency Video Coding (HEVC). Full HDTV videos have 1080 scan lines and are 1920 pixels wide. However, due to the MPEG full HD video format’s requirement that the number of coded pixels be divisible by 16, 1080-line videos are encoded with 1920 × 1088 pixel frames, with the last eight lines discarded before display. These hidden lines form the DTV Essential Hidden Area (DEHA) and are crucial for accurate decoding, the synchronization of related services, and maintaining video quality. By intelligently utilizing the DEHA, broadcasters can optimize video compression, enhance system performance, and enable adaptive streaming capabilities.
Table 1 showcases the dimensions of the DEHA for different video resolutions, highlighting its expansion as video quality increases.
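The relationship between the active picture height and the number of hidden lines can be illustrated with a short calculation. The following is a minimal sketch, not the authors' tool: the function names are hypothetical, and the coding block sizes paired with the 4K and 8K heights (64 lines) are assumptions chosen to reproduce the 16- and 32-line cases mentioned in the text.

```python
def padded_height(active_lines: int, block_size: int) -> int:
    """Round the active height up to the next multiple of the coding block size."""
    return -(-active_lines // block_size) * block_size  # ceiling division


def deha_lines(active_lines: int, block_size: int) -> int:
    """Number of hidden padding lines added below the active picture."""
    return padded_height(active_lines, block_size) - active_lines


# (active lines, assumed coding block size); the UHD block sizes are assumptions.
for lines, block in [(1080, 16), (2160, 64), (4320, 64)]:
    print(lines, block, padded_height(lines, block), deha_lines(lines, block))
```

Under these assumptions, a 1080-line picture coded with 16-line macroblocks is padded to 1088 lines, which gives the eight hidden lines discussed above.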
The DTV Essential Hidden Area (DEHA) plays a critical role in the encoding and decoding process of video content in 4K and 8K UHD standards, such as H.265. As demonstrated in Table 1 and Figure 1, the DEHA consists of an additional 8, 16, or 32 lines, which are created by padding the bottom of the active video source. This padding is necessary for the optimal functioning of the encoder, stream, and decoder. In modern MPEG-2/4/H reference encoders, the DEHA is padded with the last line signal of the input video, improving video quality compared to earlier hardware encoders that used white, gray, or black bars for padding.
Figure 1a illustrates the current practice of padding the DEHA with the last line signal, while Figure 1b,c depict the previous approach of using bars for padding. Though there may be subtle differences in video quality between DEHA modes, these differences are considered negligible. The MPEG-2/4/H reference software recommends using the last line signal of the input video for the conventional reference mode DEHA. Despite its significance in the encoding/decoding process, the DEHA is presently viewed as redundant in terms of channel utilization and storage efficiency. Hence, the focus of this research is to explore methods of effectively utilizing the DEHA, an area that has not been extensively studied thus far.
In real-world scenarios, the full HDTV video source employs black or white bars or repetition of the last line of the content to pad the extra eight lines at the bottom of the display in real-time hardware encoders.
Figure 2 presents an analysis of the conventional DEHA using the proposed decoder for terrestrial and internet HDTV broadcasting systems. In both cases, the DEHA is formed by duplicating the last line of the content. The authors in [17] proposed a flexible fast-adaptive Successive Cancellation List (SCL) decoder with polar nodes for digital video broadcasting to improve latency and throughput.
The investigation and development of methodologies to enhance the utilization of the DEHA are central objectives of this research, as its potential remains untapped in current practices. By optimizing the DEHA’s usage, significant advancements can be made in channel efficiency and storage utilization, ultimately improving the overall performance and quality of DTV systems.
Because the frames are interlaced, terrestrial HD content encoded with MPEG-2 exhibits horizontal stripes at the bottom in Figure 2a. However, there is an error in the bottom field. In the case of progressive internet HD content from YouTube, an H.264 encoder distorts the image by vertically bending a diagonal floral leaf because the last line is copied into the DEHA, as shown in Figure 2b. It is crucial to handle the pixel data within the DEHA with care during the transformation from image space to the frequency domain to avoid potential degradation of the original video quality. Although the compressed data of the DEHA in Figure 2 forms only a small portion of the conventional video stream, altering the pixel data within the DEHA without careful consideration can have adverse effects on overall video quality. Further research is needed to gain a deeper understanding of the relationship between the DEHA and encoding algorithms, enabling the development of optimization techniques that enhance video quality while effectively utilizing the DEHA. Exploring the trade-offs between video quality, channel utilization, and storage efficiency will contribute to advancements in digital television systems.
2.2. DEHA on MPEG Constant Bitrate (CBR) Encoding
The MPEG encoding technique is a form of lossy compression that operates on macroblocks, with the last line of macroblocks (LML) consisting of active video and DEHA [18]. In Figure 3, we illustrate the encoding process of the second macroblock within the LML utilizing DEHA. Within this process, the Y21 and Y22 blocks represent the active video region, while the Y23 and Y24 blocks represent the DEHA. The U2 and V2 blocks are divided into the active video and DEHA portions, vertically separated into top and bottom sections.
Following the transformation of pixel data into frequency components, all coefficients are quantized using the macroblock quantization factor (mQuant) and are subsequently encoded using a variable-length code. Assuming an even distribution of generated bytes within the video stream, the average number of generated bytes per video frame ($B_{frame}$) can be calculated using the equation below.
$$B_{frame} = \frac{\text{bit rate}}{8 \times \text{frame rate}} \tag{1}$$
In a 17.5 Mbps, 25 frames/s video stream, the average number of generated bytes per video frame is 87,500 (= 17.5 × 10^6/25/8) bytes/frame. A video frame is composed of 68 macroblock lines. Thus, the generated bytes of the LML ($B_{LML}$) are as follows:
$$B_{LML} = \frac{B_{frame}}{68} \approx 1287 \text{ bytes/frame} \tag{2}$$
The DEHA is half of the LML. Therefore, the generated bytes of the DEHA ($B_{DEHA}$) are as follows:
$$B_{DEHA} = \frac{B_{LML}}{2} \approx 643 \text{ bytes/frame} \tag{3}$$
There are about 643 bytes per frame for the DEHA in a 17.5 Mbps, 25 frames/s MPEG CBR video stream. Consequently, although this is only 0.735% of the stream, about 16.084 Kbytes/s (128.672 Kbps = $B_{DEHA}$ × frame rate × 8) is redundant in storage or in the DTV channel in 17.5 Mbps, 25 frames/s MPEG CBR encoding.
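The arithmetic in this subsection can be checked with a few lines of code. This is a minimal sketch under the section's own assumption that bytes are spread evenly over frames and macroblock lines; the constant and variable names are illustrative only.

```python
BITRATE_BPS = 17.5e6   # 17.5 Mbps constant-bit-rate stream
FRAME_RATE = 25        # frames per second
MACROBLOCK_LINES = 68  # 1088 coded lines / 16 lines per macroblock

# Average bytes per frame: bits per second -> bits per frame -> bytes per frame.
b_frame = BITRATE_BPS / FRAME_RATE / 8          # 87500.0

# Even split over the macroblock lines; the last one is the LML.
b_lml = b_frame / MACROBLOCK_LINES              # about 1286.76

# The DEHA occupies the lower half of the LML.
b_deha = b_lml / 2                              # about 643.38

deha_share = b_deha / b_frame                   # 1/136, about 0.735 %
deha_bytes_per_second = b_deha * FRAME_RATE     # about 16084.6
deha_kbps = deha_bytes_per_second * 8 / 1000    # about 128.68

print(f"{b_frame:.0f} B/frame, LML {b_lml:.2f} B, DEHA {b_deha:.2f} B")
print(f"share {deha_share:.3%}, {deha_bytes_per_second:.1f} B/s, {deha_kbps:.2f} kbps")
```

The unrounded result is about 128.68 kbps; the figure of 128.672 Kbps in the text follows from rounding the byte rate to 16.084 Kbytes/s before multiplying by 8.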
In order to analyze the average generated bytes in the last line of macroblocks (LML) with DEHA, we present an example using the CrowdRun sequence from the International Telecommunication Union (ITU)’s video quality test sequences [2].
Figure 4 illustrates the comparison of average generated bytes for different picture encoding types. It is observed that while the generated bytes of the Yn3 and Yn4 blocks in the bar mode are lower than those in the reference mode, the generated bytes of the Yn1 and Yn2 blocks in the bar mode DEHA are higher [1]. Similarly, the generated bytes of the Un and Vn blocks in the bar mode DEHA are also higher compared to the reference mode. Despite the difference in mQuant values, the generated bytes of the LML exhibit minimal variation between DEHA modes, as shown in Figure 4a,b. In 17.5 Mbps, 25 frames per second MPEG CBR encoding, all DEHA modes generate approximately 816 bytes per frame. This corresponds to half the generated bytes in the LML, as depicted in Figure 4a. However, the DEHA is currently redundant in terms of channel utilization and storage efficiency, amounting to 20,400 bytes/s (20.4 Kbytes/s). The measured value exceeds the theoretical value derived from Equation (1); this discrepancy is attributed to the pixel complexity in the LML.
In Figure 4b, the maximum difference in bytes per frame between the reference mode and the black bar mode in I-picture is 294.88 bytes, representing a ratio of only 0.146% (=294.88/201,600.2). Despite the variations in generated bytes between DEHA modes, the average generated bytes per frame remain consistently close to 87,500 bytes, as shown in Figure 4b.
Finally, the correlation between the generated bytes in bar mode DEHA and reference mode DEHA is depicted in Figure 4c. The correlation coefficient for generated bytes per frame exceeds 0.99997, which is nearly 1. A correlation this close to 1 implies that the bytes generated per frame in the bar mode track those in the reference mode almost exactly. Thus, the generation of bytes is not significantly affected by the DEHA modes in MPEG CBR encoding.
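For readers who wish to reproduce this kind of comparison, the Pearson correlation coefficient of two per-frame byte series can be computed as sketched below. The byte counts are hypothetical placeholders, not the measurements behind Figure 4c, and the function name is illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical generated bytes per frame (one I-picture followed by P/B pictures).
reference_mode = [201600, 95000, 41000, 43000, 96000, 40500]
bar_mode = [201305, 95020, 41010, 42990, 96015, 40498]

r = pearson(reference_mode, bar_mode)
print(f"correlation = {r:.6f}")
```

A coefficient near 1 shows that the two series rise and fall together frame by frame; it does not by itself show that the byte counts are equal, so it is best read alongside the absolute differences reported for Figure 4b.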
2.3. Motion Effect and Video Quality with DEHA
Motion estimation (ME) and motion compensation (MC) play a crucial role in MPEG video compression algorithms. These techniques are essential for achieving efficient video coding. In the context of the last line of macroblocks (LML), the horizontal motion vectors in the bar mode exhibit similarities to those in the reference mode. However, a notable distinction arises in the vertical motion vectors of the bar mode, which are predominantly restricted to 0. This restriction is due to the vertical boundary of the bar mode within the LML, as depicted in Figure 5a. Understanding the characteristics and limitations of motion vectors in different modes is vital for accurately representing and compensating motion in MPEG video compression.
In accurate motion estimation, the mean squared error (MSE) after motion estimation (ME) is either 0 or a very small value. The MSE values after ME in the last line of macroblocks (LML), which reflect the restriction of vertical motion vectors in the bar mode DEHA, are depicted in Figure 5b. It can be observed that the histograms of the black bar mode and the white bar mode are highly similar. There are minor differences between DEHA modes at lower MSE values. Notably, the bar mode exhibits more perfectly matching blocks (i.e., MSE is 0 and marked as ‘O’ on the y-axis) compared to the reference mode. Despite the slight variations in MSE after ME between DEHA modes, the video quality of the active region in the LML is higher for the bar mode due to the utilization of smaller mQuant values, as illustrated in Figure 6a. These findings highlight the impact of DEHA modes on video quality in the LML.
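The notion of MSE after motion estimation can be made concrete with a small block-matching example. The sketch below uses an exhaustive (full) search on a synthetic frame; the search strategy, the function names, and the test data are all assumptions for illustration, since the encoder's actual search method is not described here.

```python
def block_mse(cur, ref, bx, by, dx, dy, size):
    """MSE between the current block at (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            diff = cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx]
            total += diff * diff
    return total / (size * size)


def full_search(cur, ref, bx, by, size, radius):
    """Return (mse, dx, dy) of the best match within +/- radius pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            mse = block_mse(cur, ref, bx, by, dx, dy, size)
            if best is None or mse < best[0]:
                best = (mse, dx, dy)
    return best


# Synthetic 12 x 12 reference frame and a current frame whose content moved right by one pixel.
ref = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
cur = [[ref[y][max(x - 1, 0)] for x in range(12)] for y in range(12)]

print(full_search(cur, ref, bx=4, by=4, size=4, radius=2))
```

Here the motion vector points from the current block to its match in the reference frame, so content that moved one pixel to the right is found at a horizontal displacement of -1 with an MSE of 0, which corresponds to a perfectly matching block in the sense used above.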
The video quality assessment of the CrowdRun sequence was conducted to enable a precise comparison of quality within the 1920 × 1080 active regions, measured using the Peak Signal-to-Noise Ratio (PSNR) of the Y component. Figure 6b illustrates the results, with frames marked along the dotted diagonal line representing the corresponding quality levels. In all conditions tested, the average quality achieved using the proposed method was either equal to or slightly better than that of the original MPEG-2 and MPEG-4 AVC video.
As a result, it can be concluded that the utilization of bar modes in DEHA does not significantly impact image quality or bit generation in MPEG CBR video compression. Furthermore, since the LML is located far from the center and falls within the transient effect area [1], the video quality of the active region in the LML is not prominently visible to the viewer. This realization forms the foundation of the main idea behind the utilization of DEHA, specifically via the combination of black-and-white bar DEHA modes.
3. The Proposed DEHA Decoder for DEHA Utilization
Traditional real-time hardware encoders have typically applied padding to the DEHA using the last line signal from the input data or inserting black bars. The DEHA, in this context, is composed of a combination of representative block images. In the case of DTV broadcasting, for instance, the DEHA consists of 240 block images. Although the data within the DEHA does not serve any essential purpose, the compression process can potentially compromise the quality of the active video as the quantity of DEHA data increases. Therefore, it is crucial to minimize the amount of compressed data within the DEHA to preserve the quality of the active video. Ensuring that the DEHA data are optimized can help mitigate any negative impact on the overall video quality during the compression process.
3.1. Video Encoder and Decoder for DEHA Utilization
To validate the proposed DEHA decoder, the insertion and restoration of DEHA were implemented, as depicted in Figure 7.
Existing video encoders typically include a DEHA insertion process. The proposed method involves simply replacing the conventional DEHA insertion process with a DEHA insertion process using the proposed method. During the insertion of the proposed DEHA, various components such as Service Data, Metadata, Technical Metadata, etc., can be utilized.
3.1.1. Proposed Decoder
Similar to a conventional HDTV receiver, the proposed HDTV receiver comprises several essential components, including a tuner, a transport stream de-multiplexer, a video decoder, and a panel driver. These components work together to facilitate the utilization of DEHA. To incorporate DEHA functionality into the existing HDTV broadcasting systems, a DEHA overlay block is inserted, providing a visual representation of how DEHA is utilized. DEHA images or block data are overlaid onto the decoding image using the Picture-in-Picture (PIP) mode.
This overlay is achieved by employing 240 DEHA blocks positioned at the bottom of the encoded stream.
Figure 8 illustrates the composition of the proposed HDTV receiver and the integration of DEHA for enhanced broadcasting capabilities.
3.1.2. Proposed Block Decoder
MPEG video compression is based on lossy compression. A block image is degraded via ME and mQuant in MPEG CBR encoding, as shown in Figure 9.
In MPEG CBR video compression, the block average $D_{avg}(n)$ is used to reconstruct the metadata from the degraded block image embedded in the DEHA.
$$D_{avg}(n) = \frac{1}{N} \sum_{i} \sum_{j} B_n(i, j)$$
Here, $n$ denotes the n-th block in the DEHA, $B_n(i, j)$ is a pixel value restored by the receiver, and $N$ is the number of pixels in the block. From the reconstructed $D_{avg}(n)$, the metadata bit value $D_{th}(n)$ is restored using 128 as the threshold.
$$D_{th}(n) = \begin{cases} 1, & D_{avg}(n) \geq 128 \\ 0, & \text{otherwise} \end{cases}$$
This $D_{th}(n)$ is the (n mod 8)-th bit of the ⌊n/8⌋-th metadata byte. The metadata reconstructed using the block mean and the threshold value can restore the original data values despite the degradation of the block image due to MPEG CBR lossy compression.
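The block-average and threshold steps can be sketched in code as follows. This is an illustrative model rather than the implemented decoder: the 8 × 8 block size (240 blocks across 1920 pixels), the mapping of a 1 bit to a white block, the least-significant-bit-first order, the bounded random error standing in for MPEG degradation, and the example payload are all assumptions.

```python
import random

BLOCK = 8         # assumed block edge in pixels (240 blocks x 8 = 1920)
THRESHOLD = 128   # decision boundary on the block average
NUM_BLOCKS = 240  # blocks available in the HDTV DEHA


def bytes_to_blocks(data):
    """One flat block per bit: 255 for a 1 bit, 0 for a 0 bit (assumed polarity, LSB first)."""
    blocks = []
    for byte in data:
        for k in range(8):
            level = 255 if (byte >> k) & 1 else 0
            blocks.append([[level] * BLOCK for _ in range(BLOCK)])
    return blocks


def degrade(blocks, max_error, seed=0):
    """Stand-in for lossy coding: add bounded random error and clip to 0..255."""
    rng = random.Random(seed)
    return [
        [[min(255, max(0, p + rng.randint(-max_error, max_error))) for p in row]
         for row in block]
        for block in blocks
    ]


def block_average(block):
    """Mean of all pixel values in one block."""
    return sum(sum(row) for row in block) / (len(block) * len(block[0]))


def blocks_to_bytes(blocks):
    """Threshold each block average and place bit n at position n mod 8 of byte n // 8."""
    out = bytearray(len(blocks) // 8)
    for n, block in enumerate(blocks):
        bit = 1 if block_average(block) >= THRESHOLD else 0
        out[n // 8] |= bit << (n % 8)
    return bytes(out)


payload = b"ISO800 F2.8 1/50s 25p".ljust(NUM_BLOCKS // 8, b"\0")  # hypothetical metadata
received = degrade(bytes_to_blocks(payload), max_error=60)
recovered = blocks_to_bytes(received)
print(recovered == payload)
```

Because each block carries a single bit at the extremes of the pixel range, the block average stays on the correct side of 128 as long as the per-pixel error is well below half the range, which is the intuition behind the robustness described above.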
3.2. Metadata Transfer for DEHA Utilization
Program-related technical metadata plays a crucial role in ensuring the compatibility and quality of television programs. These metadata are specifically tailored to suit the requirements of different cameras, post-production technologies, and editorial processes. By including both editorial and technical metadata, a consistent set of information is created to facilitate the processing, review, scheduling, and distribution of programs within the television industry. However, despite the significance of technical metadata for broadcast producers, a significant portion of these metadata is often missing in the DTV stream used for broadcasting. To address this limitation, a careful selection of relevant metadata has been made for utilization within the DEHA framework. The specific technical metadata chosen for DEHA utilization can be found in Table 2, ensuring their inclusion and utilization for enhanced broadcasting capabilities.
3.3. QR Code for DEHA Utilization
In order to mitigate transmission errors in metadata, QR codes are employed, leveraging their inherent error correction capabilities. With the error correction function, QR codes enable the restoration of data even if certain parts of the code are damaged during transmission. When utilizing level L (4), data recovery can be achieved up to approximately 7% within a code word unit. A 21 × 21 QR code is employed for DEHA utilization, excluding the positioning, spacing, and timing marks regions. This QR code configuration provides a total of 240 blocks available for encoding DEHA data. Figure 11 showcases an example of data designed for DEHA utilization, demonstrating the integration of QR codes to enhance the reliability and accuracy of metadata transmission.
Finally, the proposed technical metadata will be reconstructed without error in the decoder using block image, MPEG CBR video encoding, block average, threshold value, and QR code.
4. Experimental Results
To analyze the DEHA in conventional HDTV broadcasting, the proposed DEHA decoder is implemented on a general DTV decoder by adding a DEHA decoder that uses the block average and by modifying the overlay block to display the data and the check pattern image of the DEHA, as shown in Figure 12.
The proposed stream was created by modifying the DEHA generation part in the MSSG software. The proposed stream was configured similarly to the actual broadcasting system using a well-known universal DTV stream generator, channel coder, and Vestigial Sideband (VSB) modulator. The proposed decoder sets the display-size register of the existing DTV decoder to a value large enough to include the DEHA so that its pixels can be output, and the DEHA is overlaid on the active region using a field-programmable gate array (FPGA). The DEHA overlay mode can be set to DEHA Text, DEHA Image, QR code, or Magnified DEHA, as shown in Figure 10.
There are 240 blocks in the DEHA for an HDTV broadcasting system. The values of only the first 20 blocks are overlaid as three-digit decimals at the bottom of the decoded image. In the DEHA analysis using the proposed decoder, a general video encoder copied the last line as the DEHA, as shown in Figure 1, and almost all real-time video encoders use a black bar as the DEHA, as shown in Figure 12.
Experiments for DEHA utilization (Figure 13) were conducted on a variety of sequences, as shown in Table 3.
All sequences from SVT were selected from VQEG’s original footage and standard HDTV test materials recommended in ITU-R BT.1210 [19,20].
Table 3. Test sequences for DEHA decoder [21].
| Sequence Name | Duration [Frames] | Coding Difficulty |
|---|---|---|
| Crowd Run | 250 | Difficult |
| Park Joy | 250 | Difficult |
| Ducks Takeoff | 250 | Difficult |
| Into Tree | 250 | Easy |
| Old Town Cross | 250 | Easy |
Program-related technical metadata are prescribed for DEHA utilization in the previous section (Table 2).
The sequences have been designed to represent a demanding, but not unduly so, multi-genre TV program, hence it can be suggested that this set of sequences can act as a reference for the peak bit rate (video elementary stream) that is needed for secondary distribution (also known as transmission) of most TV programs to reach ‘good reception condition’ (also known as broadcast quality).
The above list is ordered according to the coding difficulty, with the most difficult sequence at the top of the list. If the whole program is encoded at a bit rate where a sequence classified as ‘difficult’ does not suffer a quality drop of more than 30% in a DSCQS test, it can be assumed that ‘good reception condition’ (also known as ‘Broadcast Quality’) according to ITU-R BT.1122-1 can be obtained for the whole program. The same applies to sequences classified as ‘easy’ (12% quality drop allowed).
4.1. Video Quality in MPEG CBR
The whole program is encoded at bit rates of 13.0 Mbps and 17.5 Mbps using an MPEG-2 video encoder. In practice, terrestrial HDTV broadcasters use 13.0 Mbps for multi-mode service and 17.5 Mbps for single-program service [22].
The proposed DEHA decoder decodes the DEHA image and overlays it on the active video as a checkerboard pattern or QR code, as shown in Figure 14.
The 16 × 15 checkerboard pattern is useful for analyzing the DEHA image in conventional HDTV broadcasting, as shown in Figure 13 and Figure 14a. The three-digit values help confirm that the proposed decoder decodes the DEHA correctly, and the DEHA text and QR code make the technical metadata readable.
The video quality scores of those sequences were collected for a PSNR comparison within the active 1920 × 1080 regions, excluding the DEHA. The Peak Signal-to-Noise Ratio (PSNR), which quantifies the difference between the original and the processed pictures, is a commonly used objective measure of the quality of a video segment before and after compression. The result of the video quality evaluation is shown in Table 4.
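A PSNR measurement restricted to the active region can be sketched as follows. This is a minimal illustration, assuming 8-bit luminance frames stored as lists of rows with the DEHA lines at the bottom; the function name and the tiny example frames are hypothetical.

```python
from math import log10


def psnr_active(original, decoded, active_lines, peak=255):
    """PSNR in dB over the first `active_lines` rows; rows below them (the DEHA) are ignored."""
    squared_error = 0
    count = 0
    for y in range(active_lines):
        for a, b in zip(original[y], decoded[y]):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return float("inf")  # identical active regions
    return 10 * log10(peak * peak * count / squared_error)


# Tiny stand-in frames: three active lines followed by one hidden line.
original = [[100, 100], [100, 100], [100, 100], [0, 0]]
decoded = [[101, 100], [100, 99], [100, 100], [255, 255]]

print(f"{psnr_active(original, decoded, active_lines=3):.2f} dB")
```

In this toy example, the hidden line differs completely between the two frames, yet it has no influence on the score because only the active lines enter the sum.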
Under all conditions, the average quality of the proposed method was either negligibly lower than or slightly better than that of the original MPEG-2 video compression. This was due to the difference in the creation of the DEHA between the last line copy in the original MPEG video compression and the block image in the proposed method.
The average quality of the technical metadata using the check pattern block exceeds 50 dB, which is sufficient for recovery without error. Based on the results of the video quality evaluation, transferring metadata as block images through the DEHA does not affect quality during video compression.
To further analyze the results presented in Table 4 and examine maximum and minimum values, trends, etc., we have included the active region PSNR of the Old Town Cross video sequence in Figure 15.
The difference in image quality per frame is within ±0.0011 dB, and the proposed method shows no change in image quality compared to the existing method.
4.2. Motion Effect in MPEG CBR
The whole program is encoded at bit rates of 13.0 Mbps and 17.5 Mbps using an MPEG-2 video encoder. In practice, terrestrial HDTV broadcasters use 13.0 Mbps for multi-mode service and 17.5 Mbps for single-program service [22].
4.3. Advantages and Limitations of Proposed Method
The advantages and limitations of DEHA utilization are summarized in Table 5.
5. Conclusions
The embedded DTV Essential Hidden Area (DEHA) holds significant potential for terrestrial, satellite, and internet HDTV and UHD broadcasting, although it occupies only a small part of the video bandwidth in the decoder. The DEHA is an essential yet redundant element of MPEG-2: it must be present in the system, but it may go unused and be discarded.
Our proposed DEHA decoder has been successfully implemented to analyze and utilize this previously overlooked feature. By displaying the DEHA as a black bar or the last line copy in conventional broadcasting, we have provided a means to investigate and understand its role in HDTV processes. Furthermore, our experiments have demonstrated the practicality of embedded DEHA for program-related metadata services, enabling the transfer of technical information and details about cameras and post-production technologies. The integration of checked pattern block images has proven effective in facilitating this transfer.
This technique can be utilized for copyright and security purposes to differentiate between the original video and copies, and it can also enhance video quality by incorporating features like subtitles. With its versatility and compatibility with established standards like ATSC, ISDB, and DVB, the embedded DEHA utilization offers a valuable tool for enhancing DTV services and ensuring synchronized video content.