Article

Influential Metrics Estimation and Dynamic Frequency Selection Based on Two-Dimensional Mapping for JPEG-Reversible Data Hiding

1
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2
National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin Engineering University, Harbin 150001, China
3
Smart Campus Research Centre, Information Construction and Management Office, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
*
Author to whom correspondence should be addressed.
Entropy 2024, 26(4), 301; https://doi.org/10.3390/e26040301
Submission received: 25 January 2024 / Revised: 15 March 2024 / Accepted: 26 March 2024 / Published: 29 March 2024
(This article belongs to the Special Issue Information Theory and Coding for Image/Video Processing)

Abstract

JPEG Reversible Data Hiding (RDH) is a method designed to extract hidden data from a marked image and perfectly restore the image to its original JPEG form. However, while existing RDH methods adaptively manage the visual distortion caused by embedded data, they often neglect the concurrent increase in file size. To rectify this oversight, we design a new JPEG RDH scheme that accounts for all influential metrics during the embedding phase, together with a dynamic frequency selection strategy whose frequency order remains recoverable after data embedding. The process begins with a pre-processing phase of blocks and the subsequent selection of frequencies. Using a two-dimensional (2D) mapping strategy, we then compute the visual distortion and file size increment (FSI) for each image block by examining non-zero alternating current (AC) coefficient pairs (NZACPs) and their corresponding run lengths. Finally, we select appropriate block groups based on the influential metrics of each block group and proceed with data embedding by 2D histogram shifting (HS). Extensive experiments demonstrate that our method consistently outperforms existing techniques, achieving a higher peak signal-to-noise ratio (PSNR) and a smaller FSI.
Keywords:
RDH; JPEG; 2D mapping; HS

1. Introduction

Reversible Data Hiding (RDH) involves embedding a message into a cover image with the objective of minimizing visual distortion and enabling the lossless recovery of the original image from the marked image. This technique is extensively utilized in domains such as medical imaging, military imaging, and legal forensics, where the integrity of the original digital material is paramount.
Nowadays, three major approaches concerning RDH in JPEG images have been receiving increasing attention: (1) quantization-table-based RDH [1,2], (2) Huffman-codes-based RDH [3,4,5,6], and (3) Discrete Cosine Transform (DCT)-coefficient-based RDH [7,8,9,10,11,12,13,14,15,16].
The first approach, which embeds secret data by altering the quantization table, was introduced in [1] and subsequently refined in [2]. Although this strategy creates space for embedding and ensures a specified level of visual quality, it results in a substantial file size increment (FSI).
The second approach embeds secret data by modifying or mapping the variable-length codes (VLCs) in the JPEG bitstream, which preserves the visual quality of the image during data embedding, albeit with a relatively limited embedding capacity (EC). In [3], a direct mapping approach was proposed, targeting the transposition of a used VLC to an unused VLC based on differing embedding bits. Utilizing the one-to-one relationship between the run/size value (RSV) and VLC, Du et al. [4] proposed an HS-based strategy that sorts RSVs according to their frequency of occurrence. While this approach can enhance the EC, it may also induce a notable increase in file size. In [5], a simulated embedding model was developed by ordering RSV occurrences to find a more effective mapping relationship for data embedding. In [6], a new VLC encoding mapping method was introduced, which utilizes a genetic algorithm to solve the mapping method and customize the VLC encoding accordingly. This method performs well in terms of file size increment compared to previous methods.
The third approach accomplishes the embedding of secret data by modifying the quantized DCT coefficients, which include alternating current (AC) and direct current (DC). Huang et al. [7] initially introduced an HS-based RDH scheme, wherein the AC coefficients possessing values of ± 1 were employed to embed secret data. To maintain the reversibility of the process and make room for the data, the other AC coefficients, excluding zero-valued ones, are shifted. Hou et al. [8] formulated a new analog embedding-centric distortion function and a block selection mechanism aimed at minimizing visual distortion. The above two methods were optimized in [9], wherein a multi-faceted optimization approach was recommended, striving to pinpoint optimal decision variables among payload, FSI, and visual distortion. Xiao et al. [10] optimized the embedding distortion (ED) by choosing different embedding positions at different frequencies. Li et al. [11] proposed a novel 2D mapping strategy that categorizes the non-zero AC coefficient pairs into four distinct groups, enhancing the visual quality of the image. In [12,13,14,15,16], methods based on 2D histogram mapping were progressively developed and augmented through the optimization of distortion functions, the systematization of mapping methods, and the tailoring of adaptive mapping techniques.
In the preceding RDH methodologies employing a 2D mapping strategy for the DCT coefficients, two principal issues were discerned. The initial concern pertains to the lack of subsequent block selection from the FSI standpoint during the embedding phase; the second issue is the unavailability of the frequency ordering post data embedding, necessitating the use of an additional 63 bits to document the selected frequencies. Confronted with the challenges previously described, we were inspired to develop a new JPEG-based RDH scheme aimed at elevating the overall performance of the system. The contributions of this paper are delineated below.
  • We implement a dynamic frequency selection method based on a recoverable frequency order to ascertain the most suitable frequencies to embed data;
  • We document the run length of NZACPs during their construction, facilitating the prospective estimation of file size increment and visual distortion;
  • We group image blocks based on their metrics and adaptively prioritize the appropriate block groups for data embedding.
The remainder of this paper is structured as follows: Section 2 discusses the necessary preliminary knowledge. Section 3 details our proposed method. Experimental results are presented in Section 4, and Section 5 provides the concluding remarks.

2. Preliminary Knowledge

2.1. Overview of JPEG Compression

JPEG, widely utilized for the lossy compression of digital images, especially those from digital photography, follows a sequential process for grayscale images. This process involves three consecutive steps: Discrete Cosine Transform (DCT), quantization, and VLC assignment.
We can use the formulas below to realize the DCT transformation and its inverse procedure.
$$F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \tag{1}$$
$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \tag{2}$$
$$C(u) = \begin{cases} \frac{1}{\sqrt{2}}, & \text{if } u = 0 \\ 1, & \text{otherwise} \end{cases} \tag{3}$$
where $f(x,y)$ represents the original image, which we will also denote as $I(x,y)$, and $F(u,v)$ stands for the unquantized DCT coefficients.
To diminish the quantity of information, each DCT coefficient value is divided by a quantization step and the quotient is rounded to the nearest integer. The magnitude of the quantization step dictates the compression ratio, with larger steps yielding more substantial compression.
$$C(u,v) = \operatorname{round}\left(\frac{F(u,v)}{Q(u,v)}\right) \tag{4}$$
where $C(u,v)$ and $Q(u,v)$ represent the quantized DCT coefficient and the quantization value at position $(u,v)$, respectively. Note that the value of $Q(u,v)$ is related to the quality factor (QF); the larger the QF, the smaller the $Q(u,v)$ value and the higher the visual quality of the JPEG image.
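As a minimal sketch of Equations (1), (3), and (4), the following pure-Python code transforms and quantizes a single 8 × 8 block. The function names, the flat test block, and the flat quantization table are illustrative assumptions; the level shift and the QF-dependent tables of a real JPEG codec are omitted.

```python
import math

def c(u):
    # Normalisation factor C(u) from Equation (3).
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct_8x8(block):
    # Forward 8x8 DCT of Equation (1); block is a list of 8 rows of 8 values.
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def quantize(coeffs, qtable):
    # Equation (4): divide by the quantization step and round.
    # Python's round() resolves exact .5 ties to the even integer.
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(8)]
            for u in range(8)]

flat = [[100] * 8 for _ in range(8)]   # a perfectly smooth block (assumed input)
q = [[16] * 8 for _ in range(8)]       # hypothetical flat quantization table
quantized = quantize(dct_8x8(flat), q)
```

For this smooth block only the DC coefficient survives quantization, which illustrates why smooth blocks contain many zero-valued AC coefficients.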
Finally, entropy coding, which is applied to generate the JPEG bitstream, has several stages. It is implemented in 8 × 8 blocks, and here, we provide an encoding example for a single block:
  • Step 1: Scan the block in a zigzag pattern to obtain a coefficient sequence.
  • Step 2: Convert the coefficient sequence to an RSV sequence. An RSV is constructed for each non-zero coefficient in the sequence and consists of three elements: the number of zero coefficients between the previous non-zero coefficient and the current one, the length of the binary representation of the coefficient, and the binary representation of the coefficient itself. The first element must not exceed 15; if there are 16 consecutive zeros, construct an RSV with the values $(15, 0, 0)$. For a negative coefficient, the third element is obtained by inverting every bit of the binary representation of its absolute value.
  • Step 3: Merge the first two elements in RSV into a single byte, with the top 4 bits and the bottom 4 bits representing the two elements, respectively. Due to the limitations on element values during the RSV construction process, there will be no overflow or other issues during merging.
  • Step 4: The Huffman table in the JPEG Header contains a value table and a bit table. The value table corresponds one-to-one with the merged byte, and the bit table determines the length of the encoding corresponding to that byte. Based on these two tables, the byte can be converted into a Huffman code (also known as VLC in JPEG encoding).
  • Step 5: Merge the obtained Huffman code with the third element of the RSV; at this point, the RSV is converted into a binary sequence. After converting all RSVs within the block, convert the entire binary sequence to hexadecimal.
The above is the entropy encoding process for a block. The complete process involves traversing and encoding all 8 × 8 blocks.
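Steps 2 and 3 can be sketched as follows. This is an illustrative sketch rather than a full encoder: the function names are hypothetical, the input is assumed to be the 63 AC coefficients of one block already in zigzag order, and the end-of-block marker $(0, 0)$ of standard JPEG, which the steps above do not mention, is appended when the block ends in zeros. Amplitude bits are kept as strings for readability.

```python
def to_rsv(ac):
    # ac: the AC coefficients of one block in zigzag order.
    # Returns a list of (run, size, amplitude_bits) triples.
    rsvs, run = [], 0
    last_nz = max((i for i, c in enumerate(ac) if c != 0), default=-1)
    for c in ac[:last_nz + 1]:
        if c == 0:
            run += 1
            if run == 16:
                rsvs.append((15, 0, ""))   # sixteen consecutive zeros
                run = 0
            continue
        size = abs(c).bit_length()
        if c > 0:
            bits = format(c, "b")
        else:
            # Negative value: invert every bit of |c|.
            bits = format(abs(c) ^ ((1 << size) - 1), "0{}b".format(size))
        rsvs.append((run, size, bits))
        run = 0
    if last_nz < len(ac) - 1:
        rsvs.append((0, 0, ""))            # end of block: only zeros remain
    return rsvs

def merge_byte(run, size):
    # Step 3: run length in the top 4 bits, size in the bottom 4 bits.
    return (run << 4) | size
```

Because the run length is capped at 15 and the size fits in 4 bits, `merge_byte` always returns a value that fits in one byte.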

2.2. Overview of HS-Based RDH

The 1D HS-based RDH for JPEG images was first proposed by Huang et al. [7]. In this scheme, the quantized DCT coefficients are first divided into $8 \times 8$ blocks, each with 1 DC coefficient and 63 AC coefficients. The whole image is then scanned in row order to obtain the sequence of blocks $\{B_1, B_2, \ldots, B_K\}$. Each block is converted into a coefficient sequence in a zigzag pattern, with the 63 AC coefficients designated as frequency bands $F_i \in \{F_1, \ldots, F_{63}\}$.
This method classifies the DCT coefficients into three kinds. Zero-valued coefficients remain constant during both embedding and extraction. Coefficients with an absolute value of 1 are insertable coefficients, which carry the secret data. Coefficients with an absolute value greater than 1 are shiftable coefficients, which are shifted to make room for the embedded data and to ensure the reversibility of the method.
Based upon the conditions above, the data embedding process is mathematically articulated as follows:
$$C'_k(i) = \begin{cases} C_k(i) + \operatorname{sign}(C_k(i)) \cdot b, & \text{if } |C_k(i)| = 1 \\ C_k(i) + \operatorname{sign}(C_k(i)), & \text{if } |C_k(i)| > 1 \end{cases} \tag{5}$$
where $C_k(i)$ stands for the DCT coefficient value of the $i$th frequency band in the $k$th block, $C'_k(i)$ represents $C_k(i)$ with the secret bit embedded, and $b \in \{0, 1\}$ represents the secret bit. $\operatorname{sign}(\cdot)$ is the sign function:
$$\operatorname{sign}(x) = \begin{cases} -1, & \text{if } x < 0 \\ 0, & \text{if } x = 0 \\ 1, & \text{if } x > 0 \end{cases} \tag{6}$$
And the extraction process is articulated as follows:
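$$b = \begin{cases} 0, & \text{if } |C'_k(i)| = 1 \\ 1, & \text{if } |C'_k(i)| = 2 \end{cases} \tag{7}$$
$$C_k(i) = \begin{cases} \operatorname{sign}(C'_k(i)), & \text{if } 1 \le |C'_k(i)| \le 2 \\ C'_k(i) - \operatorname{sign}(C'_k(i)), & \text{if } |C'_k(i)| \ge 3 \end{cases} \tag{8}$$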
Through Equations (7) and (8), the DCT coefficients with secret data can be recovered to their original values and the secret data can be extracted without error.
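The embedding and extraction rules of this subsection can be sketched on a plain list of AC coefficients. The function names are illustrative, and padding with 0 bits once the payload runs out is an assumption of this sketch; a practical scheme would instead stop at the payload length.

```python
def sign(x):
    # Sign function of Equation (6).
    return (x > 0) - (x < 0)

def hs_embed(coeffs, bits):
    # Embedding rule of Equation (5).
    bits = iter(bits)
    marked = []
    for c in coeffs:
        if abs(c) == 1:
            # Insertable coefficient: carries one bit (0 if payload is exhausted).
            marked.append(c + sign(c) * next(bits, 0))
        elif abs(c) > 1:
            # Shiftable coefficient: moved outward by one to make room.
            marked.append(c + sign(c))
        else:
            marked.append(0)               # zeros are never modified
    return marked

def hs_extract(marked):
    # Extraction and recovery rules of Equations (7) and (8).
    bits, restored = [], []
    for m in marked:
        if m == 0:
            restored.append(0)
        elif abs(m) <= 2:
            bits.append(abs(m) - 1)
            restored.append(sign(m))
        else:
            restored.append(m - sign(m))
    return restored, bits
```

For example, embedding the bits `[1, 0, 1]` into `[3, -1, 0, 1, -2, 1]` gives `[4, -2, 0, 1, -3, 2]`, and `hs_extract` returns both the original list and the three bits.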

2.3. File Size Increment Table

The table, designated as $hcit$ in [17], records the change in the entropy code's length caused by data embedding. It is built from the $BITS$ and $HUFFVAL$ lists of the AC Huffman table, which are readily extracted from the DHT segment within the JPEG Header. Consequently, the increase in file size of a marked image caused by increasing the magnitude of a non-zero AC coefficient in the $i$th frequency of the $k$th block can be represented as $S_k(i)$ and calculated by
$$S_k(i) = \begin{cases} hcit[r_k(i), c_k(i)] + 1, & \text{if } |C_k(i)| = 2^z - 1 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$
where $r_k(i)$ and $c_k(i)$ represent the run length of the coefficient (the number of zero coefficients in the interval between two non-zero coefficients) and the binary length of the coefficient, respectively.
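The following sketch shows one way such a table could be derived. It assumes, as an interpretation not spelled out above, that $hcit[r, c]$ is the difference between the VLC length of the RSV $(r, c+1)$ and that of $(r, c)$, and that $z$ equals the binary length of the coefficient. The tiny `BITS` and `HUFFVAL` lists are hypothetical and far smaller than a real AC Huffman table.

```python
def code_lengths(bits, huffval):
    # bits[l-1] is the number of codes of length l (l = 1..16);
    # huffval lists the symbols in order of increasing code length.
    lengths, pos = {}, 0
    for length, count in enumerate(bits, start=1):
        for symbol in huffval[pos:pos + count]:
            lengths[symbol] = length
        pos += count
    return lengths

def build_hcit(lengths):
    # Assumed definition: hcit[(r, c)] = len(VLC(r, c+1)) - len(VLC(r, c)).
    hcit = {}
    for symbol, length in lengths.items():
        run, size = symbol >> 4, symbol & 0x0F
        bigger = (run << 4) | (size + 1)
        if size >= 1 and bigger in lengths:
            hcit[(run, size)] = lengths[bigger] - length
    return hcit

def fsi(hcit, run, coeff):
    # Equation (9): the file grows only when |coeff| = 2^z - 1, i.e. when
    # adding 1 to the magnitude lengthens its binary representation.
    size = abs(coeff).bit_length()
    if abs(coeff) == (1 << size) - 1:
        return hcit[(run, size)] + 1
    return 0

# Hypothetical toy table: two 2-bit codes, one 3-bit code, one 4-bit code.
BITS = [0, 2, 1, 1] + [0] * 12
HUFFVAL = [0x01, 0x02, 0x03, 0x11]
table = build_hcit(code_lengths(BITS, HUFFVAL))
```

With this toy table, shifting a coefficient of value 2 costs nothing, whereas shifting −3 (run length 0) costs one extra VLC bit plus one extra amplitude bit.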

2.4. The Laplacian Cumulative Distribution Function

He [18] proposed a scheme to approximate the distribution of all AC coefficients across frequencies using the Laplacian cumulative distribution function (CDF). This function, denoted by F i ( x ) , derives from the foundational work of [19,20], and it facilitates the estimation of the AC coefficient distribution within the ith frequency. The CDF is delineated as follows:
$$F_i(x) = \frac{1}{2} + \frac{1}{2} \cdot \operatorname{sign}(x) \cdot \left(1 - e^{-\lambda_i |x|}\right) \tag{10}$$
where $\lambda_i$ $(> 0)$ is the scale parameter.
Considering that AC coefficients with zero value are unaffected post-embedding, their ratio among K blocks can be represented as follows:
$$P_Z(i) = \frac{\sum_{k=1}^{K} [C_k(i) == 0]}{K} \tag{11}$$
The ratio can also be expressed through the CDF:
$$P_Z(i) = F_i(0.5) - F_i(-0.5) \tag{12}$$
Therefore, the scale parameter can be solved by combining Equations (11) and (12):
$$\lambda_i = -2 \cdot \ln\left(1 - \frac{1}{K} \sum_{k=1}^{K} [C_k(i) == 0]\right) \tag{13}$$
Once we obtain the scale parameter, the ratio of the insertable coefficients $P_E(i)$ and that of the shiftable coefficients $P_S(i)$ can be solved by
$$\begin{aligned} P_E(i) &= F_i(-0.5) - F_i(-1.5) + F_i(1.5) - F_i(0.5) = e^{-\frac{1}{2}\lambda_i} - e^{-\frac{3}{2}\lambda_i} \\ P_S(i) &= 1 - P_E(i) - P_Z(i) \end{aligned} \tag{14}$$
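The estimation above depends only on the proportion of zero-valued coefficients in a band, which the following sketch makes explicit. The function name and the guard for an all-zero band are illustrative assumptions.

```python
import math

def laplacian_ratios(column):
    # column: the coefficients C_k(i) of one frequency band over all K blocks.
    p_z = sum(1 for c in column if c == 0) / len(column)   # Equation (11)
    if p_z == 1.0:
        # Assumed handling: an all-zero band has no usable coefficients.
        return math.inf, 1.0, 0.0, 0.0
    lam = -2.0 * math.log(1.0 - p_z)                       # Equation (13)
    p_e = math.exp(-0.5 * lam) - math.exp(-1.5 * lam)      # Equation (14)
    p_s = 1.0 - p_e - p_z
    return lam, p_z, p_e, p_s

# A band in which half of the coefficients are zero.
lam, p_z, p_e, p_s = laplacian_ratios([0, 0, 1, -2])
```

Since $e^{-\lambda_i / 2} = 1 - P_Z(i)$ by construction, the estimate reduces to $P_E(i) = (1 - P_Z(i)) - (1 - P_Z(i))^3$; for $P_Z(i) = 0.5$ this gives $P_E(i) = 0.375$ and $P_S(i) = 0.125$.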

2.5. Distortion Calculation

We denote the visual distortion caused by secret data embedding by ED, which can be expressed as
$$\mathrm{ED} = \sum_{i=1}^{M} \sum_{j=1}^{N} \left(I'(i,j) - I(i,j)\right)^2 \tag{15}$$
where $I'$ denotes the marked image. Based on Parseval’s Theorem, it can be concluded that
$$\sum_{i=1}^{M} \sum_{j=1}^{N} \left(I'(i,j) - I(i,j)\right)^2 = \sum_{i=1}^{M} \sum_{j=1}^{N} \left(F'(i,j) - F(i,j)\right)^2 \tag{16}$$
Since $F(i,j) = C(i,j) \cdot Q(i,j)$, we can deduce that
$$\mathrm{ED} = \sum_{k=1}^{K} \sum_{i=1}^{63} Q^2(i) \cdot \left(C'_k(i) - C_k(i)\right)^2 \tag{17}$$
where $Q(i)$ represents the quantization value of the $i$th frequency band in the quantization table. Because the quantization table is $8 \times 8$, the summation in Equation (17) runs over the 63 AC coefficients of each block rather than over the entire image (DC coefficients do not carry secret data).
During data embedding, any non-zero AC coefficient used for embedding or shifting changes by at most 1, so for the modified coefficients the above equation simplifies to
$$\mathrm{ED} = \sum_{k=1}^{K} \sum_{i=1}^{63} Q^2(i) \tag{18}$$
In conclusion, Equation (18) gives a way to compute the simulated visual distortion: each modified coefficient contributes the square of the quantization value at its position.

3. Proposed Method

The framework of the proposed method and its constituent processes are delineated in this section. The methodology primarily comprises two elements: dynamic frequency selection and block-based influential metrics analysis followed by grouping. Initially, we iteratively select frequencies with the help of payload and offset. NZACPs are then constructed within each block based on the frequencies identified in the initial phase. Subsequently, we estimate the ED and FSI resulting from the alteration of these coefficients. Blocks are grouped and sorted based on these two metrics, with suitable clusters of blocks being earmarked for data embedding. The comprehensive method is depicted in Figure 1.

3.1. Theoretical Foundation

The goal of the method is to achieve a higher PSNR and a lower FSI. With these objectives in mind, we first want the distortion caused by modifying the coefficients within a block to be as small as possible, which leads to the block ordering of Section 3.2. After obtaining the block order, we observe that the distortion caused by modifying coefficients also differs across frequencies within each block, which motivates Section 3.3. After confirming the suitable frequencies, we follow the principles in Section 3.4 to construct a 2D mapping on the blocks based on these frequencies. At the same time, we notice that, when processing according to block order, some blocks cause relatively little distortion when modified but result in a greater FSI. In response, we introduce an influential model in Section 3.5 and Section 3.6, which facilitates the identification of block groups that permit data embedding with minimal distortion and FSI.

3.2. Block Pre-Processing

Taking a JPEG image of size $M \times N$ as an example, we divide the image into $K$ blocks, where $K = \frac{M \times N}{8 \times 8}$. The prevalence of zero coefficients in a block is indicative of its smoothness, and modifying the coefficients within a smooth block causes less distortion. Therefore, we first record the smoothness of the $K$ blocks and use it as one of the metrics.
$$MT1_k = \sum_{i=1}^{63} [C_k(i) == 0] \tag{19}$$
where $[\cdot]$ stands for the Iverson bracket, which assigns a value based on the truth of the proposition $H$ it encloses. If the proposition $H$ is true, then $[H] = 1$; conversely, if $H$ is false, then $[H] = 0$. Here, the bracket notation tallies the zero-valued AC coefficients within a block.
Subsequently, inspired by [16], we consider the non-zero AC coefficients within each block and suppose that each of them is increased or decreased by one. In this way, Equation (18) gives the numerical distortion caused when all the non-zero coefficients within the block are modified by one. Since we want to give higher embedding priority to blocks with less distortion, we take the reciprocal of this numerical distortion as another metric.
$$MT2_k = \sum_{i=1}^{63} [C_k(i) \neq 0] \cdot Q^2(i) \tag{20}$$
Based on the above, we introduce the metric $MT$, defined as the sum of $MT1$ and $1/MT2$ from Equations (19) and (20). Blocks with higher $MT$ values are smoother and incur less distortion when their coefficients are modified. Therefore, the set of blocks $\{B_1, \ldots, B_k, \ldots, B_K\}$ can be arranged in descending order of $MT$ to yield $\{\tilde{B}_1, \ldots, \tilde{B}_k, \ldots, \tilde{B}_K\}$.
$$MT_k = MT1_k + \frac{1}{MT2_k} \tag{21}$$
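A minimal sketch of the block metric and the resulting ordering is given below. The function names are illustrative, and treating $1/MT2_k$ as 0 for a block with no non-zero AC coefficient is an assumption of this sketch, since the text does not cover that case.

```python
def block_metric(ac, q):
    # ac: the 63 quantized AC coefficients of one block (zigzag order);
    # q:  the 63 matching quantization values.
    mt1 = sum(1 for c in ac if c == 0)                      # Equation (19)
    mt2 = sum(qi * qi for c, qi in zip(ac, q) if c != 0)    # Equation (20)
    # Assumption: an all-zero block carries no data, so 1/MT2 is taken as 0.
    return mt1 + (1.0 / mt2 if mt2 else 0.0)                # Equation (21)

def sort_blocks(blocks, q):
    # Returns block indices in descending order of MT.
    return sorted(range(len(blocks)),
                  key=lambda k: block_metric(blocks[k], q),
                  reverse=True)
```

Because embedding never turns a zero coefficient into a non-zero one or vice versa, both sums are identical before and after embedding, which is what allows the decoder to rebuild the same order.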

3.3. Dynamic Frequency Selection

After the completion of block pre-processing, a selection of frequencies becomes imperative. There are two main reasons for doing so. First, DCT coefficients located in the high-frequency domain correspond to larger quantization values, which can lead to a higher ED. Second, most of the coefficients at a given frequency may be unusable for data embedding; for example, coefficients with an absolute value greater than 1 in the HS-based method can only be shifted to make room for the data and cannot carry it.
Inspired by [8,18], we sort frequencies based on the unit distortion of different frequencies, and the metric $UD(i)$ is calculated as follows, in conjunction with Equation (18):
$$UD(i) = \left(\frac{1}{2} + \frac{P_S(i)}{P_E(i)}\right) \cdot Q^2(i) \tag{22}$$
where $P_E(i)$ and $P_S(i)$ denote the ratio of insertable and shiftable coefficients in the $i$th band of the $K$ blocks, respectively. The detailed calculation process concerning these two metrics is shown in Section 2.4. Note that the metrics used to establish the frequency order are the same before and after data embedding; that is, the frequency order can be reconstructed after data embedding without requiring additional data.
Upon sorting the frequencies, a dynamic selection is made using an offset, which operates under the principle that selecting a greater number of frequencies correlates with the use of fewer blocks, and vice versa. It is observed that, for certain images, a selection of less distorted frequencies and a higher number of blocks yields improved performance. Conversely, other images benefit from selecting more frequencies and fewer blocks. Consequently, our selection strategy is guided by the order of frequencies, with the requirement that the total EC must reach at least the combined value of the payload $P$ and the offset $O$.
$$\sum_{i=1}^{r} \mathrm{EC}(i) \geq P + O \tag{23}$$
$$\mathrm{EC}(i) = \sum_{k=1}^{K} [\,|C_k(i)| == 1\,] \tag{24}$$
From Equations (23) and (24), we can obtain the optimal frequency set $F^* = \{F_1^*, \ldots, F_r^*\}$.
This part represents one cycle of frequency selection. After simulating data embedding, we can compute the total distortion $T$, which is numerically equal to ED from Equation (15). In the actual embedding process, we loop over the offset set to obtain the minimum total distortion $T^*$. The detailed process is shown in Section 3.7.
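The selection cycle and the loop over offsets can be sketched as follows. All names are illustrative, and `simulate` is a hypothetical callback standing in for the simulated embedding that returns the total distortion $T$ for a given frequency set.

```python
import math

def unit_distortion(p_e, p_s, q):
    # Equation (22); a band without insertable coefficients is ranked last.
    if p_e == 0:
        return math.inf
    return (0.5 + p_s / p_e) * q * q

def select_frequencies(ud, ec, payload, offset):
    # Take bands in ascending order of unit distortion until their combined
    # capacity reaches payload + offset, as in Equation (23).
    order = sorted(range(len(ud)), key=lambda i: ud[i])
    chosen, total = [], 0
    for i in order:
        chosen.append(i)
        total += ec[i]
        if total >= payload + offset:
            return chosen
    return None                      # not enough capacity for this offset

def best_offset(ud, ec, payload, offsets, simulate):
    # Try every offset and keep the frequency set with the smallest T.
    best = (math.inf, None, None)
    for o in offsets:
        bands = select_frequencies(ud, ec, payload, o)
        if bands is None:
            continue
        t = simulate(bands)
        if t < best[0]:
            best = (t, o, bands)
    return best
```

A larger offset admits more frequency bands, so the later block selection needs fewer blocks; the loop simply keeps whichever trade-off the simulation scores best.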

3.4. Two-Dimensional Mapping Generation

The 2D mapping uses the same strategy as Li et al. [11], classifying the NZACPs within each block into four types: A, B, C, and D. Modifying the coefficients of the NZACPs according to this mapping in order to embed data constitutes 2D HS. The mapping strategy is shown in Figure 2a, with the first quadrant serving as an example.
The acquired coefficient pairs are those that ultimately complete the embedding of the secret data, each according to its specific type, whilst ensuring the image’s reversibility. The comprehensive data embedding procedure is delineated as follows:
  • Type A: Pairs in the set $\{(x = \pm 1, y = \pm 1)\}$. If the next secret bit is 0, the pair stays unchanged. If it is 1, one further bit is read: the pair is shifted by 1 along the x-axis if that bit is 0, and by 1 along the y-axis if it is 1.
  • Type B: Pairs in $\{(x = \pm 1, y \neq \pm 1)\}$ or $\{(x \neq \pm 1, y = \pm 1)\}$. If the secret bit is 0, the pair is shifted by 1 along the x-axis or the y-axis; if it is 1, the pair is shifted by 1 along the diagonal.
  • Type C: Pairs in $\{(x = \pm 2, y = \pm 2)\}$. If the secret bit is 0, the pair stays unchanged, as for Type A; if it is 1, the pair is shifted by 1 along the diagonal, as for Type B.
  • Type D: All remaining coefficient pairs are classified as Type D and are solely shifted diagonally. This shifting is primarily utilized to maintain reversibility and does not embed secret data.
In the process of constructing NZACPs, we document not only the type of each NZACP, but also the run length and the corresponding frequency associated with both coefficients within each pair. This step helps us later to compute the ED and FSI for each block. The restoration process can be easily accomplished through a few steps. From Figure 2b, we can easily identify their types, extract corresponding data, and follow the principles to shift them back to their original positions.
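The four types can be sketched as a pair of functions. Since Figure 2 is not reproduced here, the exact shift directions are an assumption inferred from the descriptions above: every shift moves away from zero within the pair's own quadrant, and a Type B pair receiving a 0 bit moves along the axis of its larger coefficient. Under that assumption the marked positions of the four types do not overlap, which is what makes the mapping invertible.

```python
def sign(v):
    return (v > 0) - (v < 0)

def embed_pair(x, y, bits):
    # bits: an iterator over secret bits; 0 is used once it is exhausted.
    ax, ay = abs(x), abs(y)
    if ax == 1 and ay == 1:                      # Type A
        if next(bits, 0) == 1:
            if next(bits, 0) == 0:
                ax += 1
            else:
                ay += 1
    elif ax == 1 or ay == 1:                     # Type B
        b = next(bits, 0)
        if ax == 1:
            ay += 1
            ax += b                              # b = 1 gives a diagonal move
        else:
            ax += 1
            ay += b
    elif ax == 2 and ay == 2:                    # Type C
        b = next(bits, 0)
        ax, ay = ax + b, ay + b
    else:                                        # Type D: shift only
        ax, ay = ax + 1, ay + 1
    return sign(x) * ax, sign(y) * ay

def extract_pair(mx, my):
    # Returns (original x, original y, extracted bits) for a marked pair.
    ax, ay = abs(mx), abs(my)
    if (ax, ay) == (1, 1):
        bits, ox, oy = [0], 1, 1
    elif (ax, ay) == (2, 1):
        bits, ox, oy = [1, 0], 1, 1
    elif (ax, ay) == (1, 2):
        bits, ox, oy = [1, 1], 1, 1
    elif (ax, ay) == (2, 2):
        bits, ox, oy = [0], 2, 2
    elif (ax, ay) == (3, 3):
        bits, ox, oy = [1], 2, 2
    elif ax <= 2 and ay >= 3:                    # came from Type B (1, y)
        bits, ox, oy = [ax - 1], 1, ay - 1
    elif ay <= 2 and ax >= 3:                    # came from Type B (x, 1)
        bits, ox, oy = [ay - 1], ax - 1, 1
    else:                                        # came from Type D
        bits, ox, oy = [], ax - 1, ay - 1
    return sign(mx) * ox, sign(my) * oy, bits
```

For instance, embedding the bits 1, 1 into the Type A pair $(1, -1)$ yields $(1, -2)$, from which `extract_pair` recovers both the pair and the two bits.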

3.5. Influential Model Construction

Utilizing $\{\tilde{B}_1, \ldots, \tilde{B}_k, \ldots, \tilde{B}_K\}$, $F^*$, and the mapping strategy described in Section 3.4, we can construct an influential model for every block. This model facilitates the computation of the ED and FSI caused by the available coefficient pairs of each block after embedding or shifting. Moreover, it records the EC of each block. An example of the construction and computational procedures is provided, with details presented in Figure 3.
Figure 3 illustrates a random $8 \times 8$ block for computation. Initial processing involves a zigzag traversal of this block to extract the non-zero coefficients while recording their run lengths. Subsequently, we pair adjacent non-zero coefficients. Take the third coefficient pair as an example. This coefficient pair corresponds to Type B, i.e., it shifts one unit horizontally or vertically, or one unit diagonally, depending on the secret data to be embedded. From the trajectory of the movement, we can deduce its EC, and combining this with Equations (9) and (18), we can compute the ED and FSI. Thus, for each block $\tilde{B}_k$, we are capable of calculating its respective $\mathrm{EC}_k$, $\mathrm{ED}_k$, and $\mathrm{FSI}_k$.

3.6. Adaptive Block Grouping

Guided by the model mentioned in Section 3.5, each block $\tilde{B}_k$ is characterized by the metrics $\mathrm{ED}_k$, $\mathrm{FSI}_k$, and $\mathrm{EC}_k$. For illustrative purposes, consider a $512 \times 512$ image, which yields 4096 blocks; these are then segregated into 64 groups, denoted as $group_m$, $m = 1, \ldots, 64$, based on the aforementioned metrics. This categorization results in the parameters $group\_\mathrm{ED}_m$, $group\_\mathrm{FSI}_m$, and $group\_\mathrm{EC}_m$. Subsequently, these 64 groups are ordered in an ascending sequence according to the influence factor $infl$, computed as
$$infl_m = \operatorname{normalize}(group\_\mathrm{ED}_m) + \operatorname{normalize}(group\_\mathrm{FSI}_m) \tag{25}$$
The selection process iterates over the groups from the beginning and continues until the cumulative embedding capacity satisfies $\sum_{m=1}^{M} group\_\mathrm{EC}_m \geq P$. Ultimately, this yields $M$ block groups ready for data embedding.
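The group ordering and selection can be sketched as below. The text does not specify the normalization, so min-max scaling to $[0, 1]$ is assumed here; the function names are illustrative, and the per-group totals are taken as already computed.

```python
def normalize(values):
    # Assumed min-max normalization to the range [0, 1].
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def select_groups(group_ed, group_fsi, group_ec, payload):
    # Influence factor of Equation (25), one value per group.
    infl = [e + f for e, f in zip(normalize(group_ed), normalize(group_fsi))]
    order = sorted(range(len(infl)), key=lambda m: infl[m])
    chosen, capacity = [], 0
    for m in order:                  # least influential groups first
        chosen.append(m)
        capacity += group_ec[m]
        if capacity >= payload:
            return chosen
    return None                      # payload exceeds the total capacity
```

Normalizing both terms before adding them keeps either metric from dominating merely because of its numeric scale.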

3.7. Data Embedding and Extracting

The data embedding procedure is delineated with precision as follows:
  • Step 1: Decode the original JPEG bitstream to obtain the quantized DCT coefficient matrix and the quantization table. Initialize the total distortion $T^*$ to positive infinity. Arrange all DCT blocks by their $MT$ in descending order.
  • Step 2: Compute the unit distortion $UD$ for all frequency bands using Equation (22) to set their initial priorities.
  • Step 3: Select the frequency set $F^* = \{F_1^*, \ldots, F_r^*\}$ based on the payload $P$ and offset $O$. Apply the 2D mapping strategy to construct NZACPs on the sorted $K$ blocks and $F^*$, and compute the $\mathrm{ED}_k$, $\mathrm{FSI}_k$, and $\mathrm{EC}_k$ of each block.
  • Step 4: Group the blocks by $\mathrm{ED}_k$, $\mathrm{FSI}_k$, and $\mathrm{EC}_k$. Sort and select $M$ block groups for data embedding.
  • Step 5: Simulate the embedding of secret data and record the total distortion $T$. If $T < T^*$, then set $T^* = T$ and keep a record of the auxiliary data for this case. If all values of $O$ have been traversed, go to Step 6; otherwise, go back to Step 3.
  • Step 6: Sequentially embed secret data in the optimal frequency band set $F^*$ and the $M$ selected block groups. Then, encode the DCT coefficients with secret data as the marked image.
Notably, it is necessary to record the length of the optimal frequency band set $L^*$, the chosen $M$ block groups, and the payload $P$, which occupy 6 bits, 64 bits, and $\log_2 P$ bits, respectively. These elements are sequentially embedded into the reserved space within the JPEG Header as auxiliary data.
Data extraction and image restoration proceed as follows:
  • Step 1: Extract the auxiliary data $L^*$, $M$, and $P$ from the reserved space within the JPEG Header.
  • Step 2: Recover the optimal frequency band set $F^*$ with Equation (22) and $L^*$. The equation restores the order of frequencies, and $F^*$ consists of the first $L^*$ frequencies in that order.
  • Step 3: Rearrange the block order with $MT$. As $MT$ remains unchanged after data embedding, the block order can be directly restored.
  • Step 4: Reconstruct the NZACPs by utilizing $F^*$ in conjunction with the $M$ block groups. Then, from Figure 2b, we can identify the type of each NZACP carrying secret data, as well as the shifting direction needed for image recovery and for extraction of the corresponding data.
  • Step 5: Sequentially extract the secret data and recover the DCT coefficients via an inverse 2D mapping shift. Then, encode the restored DCT coefficients to obtain the original image.

4. Experimental Results

This section begins by detailing the experimental settings and the selection of image datasets. Subsequent comparisons are drawn between our methodology and both one-dimensional as well as two-dimensional schemes. The concluding portion assesses our method’s comprehensive performance relative to several contemporary techniques, with the findings systematically presented.

4.1. Experimental Setup

We conducted a series of experiments to showcase the advantages of our scheme on the USC-SIPI image set, the CVG-UGR image database, and the BOSSbase v1.01 image set. The first two sets include multiple commonly used test images such as Lena, Baboon, and Goldhill. The last set consists of numerous natural images. All experimental images were resized to $512 \times 512$ and re-compressed into JPEG format with different quality factors.
The overall performance is evaluated using two primary metrics: visual quality, assessed by PSNR, and FSI. These metrics provide a comprehensive evaluation of the experimental results. The FSI is determined by comparing the storage space occupied by a JPEG image before and after the data embedding process. Concurrently, the PSNR metric is utilized to appraise the visual quality of the image.
$$\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{MAX_I^2}{\mathrm{MSE}}\right) \ \mathrm{dB} \tag{26}$$
where $MAX_I$ represents the maximum achievable pixel value in the image, while the mean square error (MSE) of the spatial image is computed using the equation below:
$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left\| I'(i,j) - I(i,j) \right\|^2 \tag{27}$$
where $I'$ and $I$ are the marked image and original image, respectively, and $M$, $N$ denote the shape of the JPEG image.
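The two formulas translate directly into a short function. This is a sketch assuming 8-bit grayscale images stored as nested lists, so that $MAX_I = 255$ by default; the function name and the handling of identical images are illustrative choices.

```python
import math

def psnr(original, marked, max_i=255):
    # original, marked: images of equal shape as lists of pixel rows.
    squared_error, count = 0, 0
    for row_o, row_m in zip(original, marked):
        for a, b in zip(row_o, row_m):
            squared_error += (b - a) ** 2
            count += 1
    mse = squared_error / count                  # Equation (27)
    if mse == 0:
        return math.inf                          # identical images
    return 10 * math.log10(max_i ** 2 / mse)     # Equation (26), in dB

# Every pixel off by one gives MSE = 1, i.e. roughly 48.13 dB.
value = psnr([[0, 0], [0, 0]], [[1, 1], [1, 1]])
```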

4.2. Assessment of Visual Quality and File Size Increment

A series of experiments was performed to assess the visual quality and file size increment in this section. The proposed scheme preserves high visual imperceptibility after data embedding for both textured and smooth images. Furthermore, the scheme was benchmarked against a variety of JPEG RDH schemes, which fall into two types: schemes based on 1D HS and those based on 2D HS.

4.2.1. Evaluating against One-Dimensional Methodologies

We initially compared our scheme with four JPEG RDH schemes based on 1D HS: Huang [7], Hou [8], Yin [9], and He [17]. Five typical test images from the USC-SIPI image set were employed to evaluate PSNR and FSI performance. The resulting experimental data are shown in Table 1. Note that the bold data in Table 1 represent the best performance in their respective experimental settings; this rule also applies to Table 2 and Table 3.
From Table 1, the superiority of our method over the 1D approaches is evident, achieving a higher PSNR and a smaller FSI in most scenarios. For instance, for the Lena image at a capacity of 5000 bits, our approach achieves a PSNR that is 0.11% higher than the nearest competitor, while simultaneously reducing the FSI by 7.19% compared to the lowest FSI reported by other methods. This marked improvement stems from our strategic block selection, guided by $\mathrm{EC}_k$, $\mathrm{ED}_k$, and $\mathrm{FSI}_k$, which leads to block groups that achieve a lower FSI and a higher PSNR. Additionally, our dynamic frequency selection method pinpoints a frequency set with minimized unit embedding distortion.

4.2.2. Evaluating against Two-Dimensional Methodologies

Furthermore, to demonstrate the superiority of our scheme, we compared it with counterparts based on 2D HS, namely Li-N [11] and Li-F [12]. These comparisons were conducted with quality factors (QFs) of 70, 80, and 90 on five classical images from the USC-SIPI image set and the CVG-UGR image database. The results are shown in Table 2 and Table 3.
From Table 2 and Table 3, it can be observed that our method also outperforms the existing 2D methods in both metrics. For FSI, our method achieves the lowest value in nearly all settings. At a QF of 80 with a payload of 6000 bits, our FSI values are 6376 bits for Lena, 5912 bits for Peppers, 5528 bits for Tiffany, 7368 bits for Goldhill, and 5640 bits for Splash, all lower than those of the other two methods. For PSNR, our method also shows a clear advantage. At a QF of 70 with a payload of 6000 bits, our PSNR values are 47.281 dB for Lena, 48.030 dB for Peppers, 47.897 dB for Tiffany, 46.082 dB for Goldhill, and 48.298 dB for Splash, surpassing the corresponding figures of the other two methods.
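To make the size of these margins concrete, the following sketch computes, for each image, how far our result lies from the better of the two 2D competitors in the two settings quoted above. The values are copied from Table 2 and Table 3; the dictionary layout and variable names are illustrative assumptions.

```python
# Each tuple is (Li-N [11], Li-F [12], Our) at a payload of 6000 bits.
fsi_qf80 = {  # FSI in bits, QF = 80 (Table 3)
    "Lena": (6792, 6384, 6376), "Peppers": (6712, 6144, 5912),
    "Tiffany": (6184, 5712, 5528), "Goldhill": (7808, 8056, 7368),
    "Splash": (6232, 5984, 5640),
}
psnr_qf70 = {  # PSNR in dB, QF = 70 (Table 2)
    "Lena": (46.864, 46.894, 47.281), "Peppers": (47.282, 47.563, 48.030),
    "Tiffany": (47.347, 47.569, 47.897), "Goldhill": (45.461, 45.737, 46.082),
    "Splash": (47.803, 47.819, 48.298),
}

# Bits saved relative to the better (smaller) competing FSI.
fsi_saved = {img: min(li_n, li_f) - ours
             for img, (li_n, li_f, ours) in fsi_qf80.items()}
# dB gained relative to the better (larger) competing PSNR.
psnr_gain = {img: round(ours - max(li_n, li_f), 3)
             for img, (li_n, li_f, ours) in psnr_qf70.items()}

for img in fsi_qf80:
    print(f"{img}: {fsi_saved[img]} bits saved, +{psnr_gain[img]:.3f} dB")
```

Every margin is positive in these two settings, ranging from 8 bits (Lena) to 440 bits (Goldhill) for FSI and from 0.328 dB (Tiffany) to 0.479 dB (Splash) for PSNR.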

4.3. Evaluating against the State of the Art

To provide additional insights, we extended our experiments to the BOSSbase v1.01 database, thus confirming the wide applicability and robustness of our proposed scheme. A set of 50 images was arbitrarily chosen from the database for testing. All experimental images were converted to JPEG format using three QFs (70, 80, 90), and eight different payloads were used to evaluate the corresponding performance. Comparative experiments were conducted against four established schemes, Huang [7], He [17], Li-N [11], and Li-F [12], focusing on PSNR and FSI. The results of these experiments are shown in Figure 4 and Figure 5.
From Figure 4 and Figure 5, it can be seen that, across all the experimental settings we established, our method consistently outperforms the others in terms of PSNR and FSI.
In addition, we compared the running times of the different methods. We recorded the running times of the experiments reported in Figure 4 and Figure 5 and computed the average running time of each method. The results are shown in Table 4.
Combined with the discussion above, Table 4 shows that the Huang [7] method, the earliest of those compared, is the fastest. It directly selects the embedding locations for secret data based on extensive preliminary findings; however, it lags in terms of PSNR and FSI. The He [17] method performs well in terms of PSNR, but its FSI falls short of that of the methods utilizing 2D mapping, and its average running time is the highest among the compared methods. The Li-N [11] method is marginally faster than ours, yet it underperforms in terms of PSNR and FSI. The Li-F [12] method improves on the Li-N [11] method in terms of PSNR and FSI, but it is roughly four times slower than ours, and its performance still lags behind that of our method.
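The relative speeds discussed above follow directly from the average running times in Table 4. The sketch below divides each method's time by ours at every QF; the data layout and function name are assumptions made for illustration.

```python
# Average running times in seconds, copied from Table 4.
times = {
    70: {"Huang [7]": 0.04, "He [17]": 28.42, "Li-N [11]": 3.02,
         "Li-F [12]": 12.61, "Our": 3.19},
    80: {"Huang [7]": 0.05, "He [17]": 27.20, "Li-N [11]": 3.06,
         "Li-F [12]": 14.21, "Our": 3.42},
    90: {"Huang [7]": 0.05, "He [17]": 23.46, "Li-N [11]": 3.04,
         "Li-F [12]": 16.29, "Our": 3.93},
}

def ratio_to_ours(times_for_qf):
    """Running time of each competing method divided by ours (>1 means slower)."""
    ours = times_for_qf["Our"]
    return {name: t / ours for name, t in times_for_qf.items() if name != "Our"}

for qf, row in times.items():
    ratios = ratio_to_ours(row)
    print(qf, {name: round(r, 2) for name, r in ratios.items()})
```

Under these figures, Li-F [12] takes between about 3.9 and 4.2 times as long as our method, while Li-N [11] takes between about 0.77 and 0.95 times as long.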

5. Conclusions

Confronting the issue of balancing visual quality with file size increment in JPEG RDH, this paper unveils a new scheme that utilizes 2D mapping. Our scheme treats blocks as discrete units, and block groups are chosen based on the assessment of their influential metrics. It also incorporates the Laplacian cumulative distribution function into the unit distortion computation for frequency selection. The experimental results clearly indicate that our proposed scheme surpasses multiple contemporary JPEG RDH methods in visual quality and file size increment.

Author Contributions

Conceptualization, H.W. and C.L.; Data curation, C.L.; Methodology, H.W. and C.L.; Software, C.L.; Supervision, H.W.; Visualization, H.W.; Writing—original draft, C.L.; Writing—review and editing, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (61872190), the Jiangsu Planned Projects for Postdoctoral Research Funds (2020Z058), and the Stable Supporting Fund of National Key Laboratory of Autonomous Marine Vehicle Technology (2024-HYHXQ-WDZC06).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fridrich, J.; Goljan, M.; Du, R. Lossless data embedding for all image formats. In Proceedings of the Security and Watermarking of Multimedia Contents IV, San Jose, CA, USA, 19–25 January 2002; Volume 4675, pp. 572–583.
  2. Wang, K.; Lu, Z.M.; Hu, Y.J. A high capacity lossless data hiding scheme for JPEG images. J. Syst. Softw. 2013, 86, 1965–1975.
  3. Qian, Z.; Zhang, X. Lossless data hiding in JPEG bitstream. J. Syst. Softw. 2012, 85, 309–313.
  4. Du, Y.; Yin, Z.; Zhang, X. Improved lossless data hiding for JPEG images based on histogram modification. Comput. Mater. Contin. 2018, 55, 495–507.
  5. Du, Y.; Yin, Z.; Zhang, X. High Capacity Lossless Data Hiding in JPEG Bitstream Based on General VLC Mapping. IEEE Trans. Dependable Secur. Comput. 2020, 19, 1420–1433.
  6. Du, Y.; Yin, Z. New framework for code-mapping-based reversible data hiding in JPEG images. Inf. Sci. 2022, 609, 319–338.
  7. Huang, F.; Qu, X.; Kim, H.J.; Huang, J. Reversible Data Hiding in JPEG Images. IEEE Trans. Circuits Syst. Video Technol. 2016, 26, 1610–1621.
  8. Hou, D.; Wang, H.; Zhang, W.; Yu, N. Reversible data hiding in JPEG image based on DCT frequency and block selection. Signal Process. 2018, 148, 41–47.
  9. Yin, Z.; Ji, Y.; Luo, B. Reversible Data Hiding in JPEG Images with Multi-Objective Optimization. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 2343–2352.
  10. Xiao, M.; Li, X.; Ma, B.; Zhang, X.; Zhao, Y. Efficient Reversible Data Hiding for JPEG Images with Multiple Histograms Modification. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 2535–2546.
  11. Li, N.; Huang, F. Reversible data hiding for JPEG images based on pairwise nonzero AC coefficient expansion. Signal Process. 2020, 171, 107476.
  12. Li, F.; Zhang, L.; Qin, C.; Wu, K. Reversible data hiding for JPEG images with minimum additive distortion. Inf. Sci. 2022, 595, 142–158.
  13. Weng, S.; Zhou, Y.; Zhang, T. Adaptive reversible data hiding for JPEG images with multiple two-dimensional histograms. J. Vis. Commun. Image Represent. 2022, 85, 103487.
  14. Weng, S.; Zhou, Y.; Zhang, T.; Xiao, M.; Zhao, Y. General Framework to Reversible Data Hiding for JPEG Images with Multiple Two-Dimensional Histograms. IEEE Trans. Multimed. 2022, 25, 5747–5762.
  15. Li, F.; Qi, Z.; Zhang, X.; Qin, C. Progressive Histogram Modification for JPEG Reversible Data Hiding. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 1241–1254.
  16. Weng, S.; Zhou, Y.; Zhang, T.; Xiao, M.; Zhao, Y. Reversible Data Hiding for JPEG Images with Adaptive Multiple Two-Dimensional Histogram and Mapping Generation. IEEE Trans. Multimed. 2023, 25, 8738–8752.
  17. He, J.; Chen, J.; Tang, S. Reversible Data Hiding in JPEG Images Based on Negative Influence Models. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2121–2133.
  18. He, J.; Pan, X.; Wu, H.t.; Tang, S. Improved block ordering and frequency selection for reversible data hiding in JPEG images. Signal Process. 2020, 175, 107647.
  19. Lam, E.; Goodman, J. A mathematical analysis of the DCT coefficient distributions for images. IEEE Trans. Image Process. 2000, 9, 1661–1666.
  20. Smoot, S.R.; Rowe, L.A. DCT coefficient distributions. In Proceedings of the Human Vision and Electronic Imaging, San Jose, CA, USA, 28 January 1996; Volume 2657, pp. 403–411.
Figure 1. Illustration of the proposed scheme, using the gray Lena JPEG image as an example.
Figure 2. 2D mapping strategy and type identification after data embedding.
Figure 3. Example of influential model with computational procedures.
Figure 4. FSI comparison for three quality factors (QF = 70, 80, 90) [7,11,12,17].
Figure 5. PSNR comparison for three quality factors (QF = 70, 80, 90) [7,11,12,17].
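Table 1. One-Dimensional JPEG RDH Schemes Comparison under QF = 90 (payloads of 5000 and 10,000 bits; PSNR in dB, FSI in bits).
| Image | Metric | Huang [7], 5000 | Huang [7], 10,000 | Hou [8], 5000 | Hou [8], 10,000 | Yin [9], 5000 | Yin [9], 10,000 | He [17], 5000 | He [17], 10,000 | Our, 5000 | Our, 10,000 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Lena | PSNR | 54.393 | 50.989 | 54.945 | 51.365 | 55.561 | 51.693 | 55.418 | 51.913 | 55.623 | 52.053 |
| Lena | FSI | 7624 | 14,328 | 6560 | 13,464 | 5672 | 11,624 | 6392 | 12,080 | 5264 | 9992 |
| Baboon | PSNR | 49.236 | 45.323 | 49.636 | 45.330 | 49.963 | 45.664 | 50.493 | 46.225 | 50.836 | 46.550 |
| Baboon | FSI | 8360 | 17,008 | 7536 | 17,000 | 7712 | 16,008 | 7544 | 16,736 | 7328 | 14,416 |
| Tiffany | PSNR | 52.560 | 49.034 | 53.335 | 49.578 | 53.883 | 50.002 | 54.170 | 50.810 | 54.348 | 51.005 |
| Tiffany | FSI | 7696 | 14,968 | 6328 | 14,032 | 6160 | 12,856 | 6560 | 11,688 | 5608 | 10,632 |
| Peppers | PSNR | 52.975 | 49.169 | 54.047 | 50.117 | 54.608 | 50.472 | 54.859 | 51.393 | 55.092 | 51.502 |
| Peppers | FSI | 8048 | 14,848 | 6872 | 14,376 | 6384 | 12,824 | 6016 | 12,072 | 5408 | 10,688 |
| Couple | PSNR | 51.623 | 47.772 | 52.702 | 48.274 | 53.240 | 48.818 | 53.806 | 49.835 | 53.871 | 49.700 |
| Couple | FSI | 7328 | 15,208 | 6640 | 14,728 | 6408 | 13,112 | 6408 | 13,520 | 5928 | 12,736 |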
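Table 2. Two-Dimensional JPEG RDH Schemes PSNR Comparison under Different QFs (PSNR in dB; payloads of 6000, 9000, and 12,000 bits).
| Image | Scheme | QF = 70, 6000 | QF = 70, 9000 | QF = 70, 12,000 | QF = 80, 6000 | QF = 80, 9000 | QF = 80, 12,000 | QF = 90, 6000 | QF = 90, 9000 | QF = 90, 12,000 |
|---|---|---|---|---|---|---|---|---|---|---|
| Lena | Li-N [11] | 46.864 | 44.178 | 41.684 | 50.205 | 47.717 | 45.594 | 54.503 | 52.381 | 50.806 |
| Lena | Li-F [12] | 46.894 | 44.213 | 41.635 | 50.457 | 47.745 | 45.637 | 54.571 | 52.479 | 50.797 |
| Lena | Our | 47.281 | 44.311 | 41.743 | 50.633 | 47.977 | 45.758 | 54.806 | 52.646 | 50.933 |
| Peppers | Li-N [11] | 47.282 | 44.773 | 42.626 | 50.094 | 47.788 | 46.031 | 53.978 | 51.801 | 50.151 |
| Peppers | Li-F [12] | 47.563 | 44.943 | 42.650 | 50.642 | 48.208 | 46.037 | 53.893 | 51.667 | 50.097 |
| Peppers | Our | 48.030 | 45.262 | 42.832 | 50.683 | 48.354 | 46.428 | 54.219 | 52.100 | 50.446 |
| Tiffany | Li-N [11] | 47.347 | 44.700 | 42.582 | 49.842 | 47.605 | 45.795 | 53.163 | 51.200 | 49.703 |
| Tiffany | Li-F [12] | 47.569 | 44.888 | 42.592 | 50.180 | 47.693 | 45.743 | 53.111 | 51.165 | 49.582 |
| Tiffany | Our | 47.897 | 45.005 | 42.770 | 50.393 | 48.027 | 46.046 | 53.452 | 51.525 | 49.985 |
| Goldhill | Li-N [11] | 45.461 | 43.318 | 41.722 | 47.697 | 45.549 | 43.985 | 51.746 | 49.400 | 47.622 |
| Goldhill | Li-F [12] | 45.737 | 43.579 | 41.790 | 47.916 | 45.725 | 44.032 | 51.733 | 49.382 | 47.571 |
| Goldhill | Our | 46.082 | 43.842 | 42.065 | 48.270 | 46.034 | 44.426 | 52.143 | 49.623 | 47.852 |
| Splash | Li-N [11] | 47.803 | 45.591 | 43.418 | 50.198 | 48.262 | 46.607 | 53.511 | 51.670 | 50.272 |
| Splash | Li-F [12] | 47.819 | 45.691 | 43.571 | 50.428 | 48.365 | 46.835 | 53.647 | 51.669 | 50.231 |
| Splash | Our | 48.298 | 45.692 | 43.673 | 50.690 | 48.611 | 46.842 | 54.061 | 52.043 | 50.622 |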
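Table 3. Two-Dimensional JPEG RDH Schemes FSI Comparison under Different QFs (FSI in bits; payloads of 6000, 9000, and 12,000 bits).
| Image | Scheme | QF = 70, 6000 | QF = 70, 9000 | QF = 70, 12,000 | QF = 80, 6000 | QF = 80, 9000 | QF = 80, 12,000 | QF = 90, 6000 | QF = 90, 9000 | QF = 90, 12,000 |
|---|---|---|---|---|---|---|---|---|---|---|
| Lena | Li-N [11] | 6536 | 10,304 | 14,552 | 6792 | 10,016 | 13,600 | 6688 | 9512 | 13,064 |
| Lena | Li-F [12] | 7056 | 10,512 | 14,824 | 6384 | 10,256 | 13,800 | 6536 | 9640 | 12,984 |
| Lena | Our | 6488 | 10,096 | 14,528 | 6376 | 9560 | 13,360 | 6232 | 8936 | 12,824 |
| Peppers | Li-N [11] | 6632 | 9864 | 13,312 | 6712 | 9880 | 12,880 | 6816 | 9944 | 13,152 |
| Peppers | Li-F [12] | 6808 | 10,368 | 13,576 | 6144 | 8864 | 13,400 | 6784 | 9560 | 12,792 |
| Peppers | Our | 6088 | 9160 | 13,240 | 5912 | 8816 | 12,224 | 6440 | 9328 | 12,224 |
| Tiffany | Li-N [11] | 6208 | 9624 | 13,464 | 6184 | 9200 | 12,848 | 7240 | 10,112 | 13,752 |
| Tiffany | Li-F [12] | 6696 | 9944 | 13,896 | 5712 | 9456 | 13,192 | 6952 | 9984 | 13,200 |
| Tiffany | Our | 5976 | 9552 | 13,288 | 5528 | 8760 | 11,888 | 6552 | 9776 | 13,176 |
| Goldhill | Li-N [11] | 7512 | 10,664 | 14,520 | 7808 | 11,544 | 15,192 | 7744 | 12,168 | 16,472 |
| Goldhill | Li-F [12] | 7312 | 10,728 | 14,312 | 8056 | 11,496 | 15,256 | 7976 | 12,352 | 16,424 |
| Goldhill | Our | 6664 | 9952 | 13,752 | 7368 | 10,856 | 14,480 | 7728 | 12,024 | 16,384 |
| Splash | Li-N [11] | 5520 | 8480 | 12,120 | 6232 | 8776 | 11,648 | 7264 | 10,600 | 13,832 |
| Splash | Li-F [12] | 6056 | 8904 | 12,488 | 5984 | 9440 | 12,264 | 6728 | 10,528 | 14,280 |
| Splash | Our | 5520 | 8552 | 12,072 | 5640 | 8648 | 11,456 | 6688 | 9912 | 12,968 |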
Table 4. Average running time comparison in BOSSbase v1.01 database under different QFs (average running time in seconds).
| QF | Huang [7] | He [17] | Li-N [11] | Li-F [12] | Our |
|---|---|---|---|---|---|
| 70 | 0.04 | 28.42 | 3.02 | 12.61 | 3.19 |
| 80 | 0.05 | 27.20 | 3.06 | 14.21 | 3.42 |
| 90 | 0.05 | 23.46 | 3.04 | 16.29 | 3.93 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, H.; Lu, C. Influential Metrics Estimation and Dynamic Frequency Selection Based on Two-Dimensional Mapping for JPEG-Reversible Data Hiding. Entropy 2024, 26, 301. https://doi.org/10.3390/e26040301
