1. Introduction
Cryptography uses mathematical theory to prevent unauthorized access to secret data. However, the secret data can still be threatened by malicious attackers, since the meaningless and unintelligible form produced by encryption may attract their attention. Data hiding provides a satisfactory solution by making the very existence of the original messages imperceptible to the public. Data hiding [1] is the science of concealed communication, which involves hiding secret data in meaningful cover objects with a slight and imperceptible distortion. It can be used in various applications, such as copyright protection (robust watermarking), secret communication (steganography), and image authentication (fragile watermarking), each of which has different requirements. For the safety of secret communications, it is important to conceal the existence of the secret data in order to avoid attracting an attacker’s attention: people should not be able to easily distinguish the meaningful cover object from the corresponding hiding result, i.e., the stego object. To meet the safety requirements, the hiding distortion should be minimized, and the hiding capacity should suffice for embedding the secret data. Intuitively, the goal of data hiding is to design schemes that have high hiding capacity but introduce low distortion. Hence, a trade-off must be made between hiding capacity and hiding distortion.
In data hiding, we assume that the cover image is an 8-bit grayscale digital image whose pixel values lie in the range [0, 255]. The most common data-hiding approach operates on the least significant bit (LSB) of pixel values. The LSB of pixel value $i$ can be computed as $i \bmod 2$, and the LSB embedding operation flips the LSB of the pixel value when it differs from the secret bit. Low hiding distortion means that there is a low probability of detecting the existence of a secret message. Mielikainen [2] proposed a modification to LSB matching that uses a binary function of two cover pixels to carry the desired value, which allows hiding the same payload as LSB matching but with fewer changes to the cover image. Also, van Dijk et al. [3] developed another type of ±1 steganography for improving embedding efficiency, in which two-dimensional codes were proposed to embed a 5-ary message symbol in a group of two pixels by modifying, at most, one pixel in the group by one. In addition, this can be generalized to $n$-dimensional codes [4,5] that allow $\log_2(2n + 1)$ secret bits to be embedded in $n$ cover pixels by modifying, at most, one pixel in the group by one. Fridrich et al. [6] introduced a wet paper coding mechanism in which the sender embeds messages into content without sharing selection rules with the recipient. In addition, wet paper codes have been combined with most steganographic schemes to improve their embedding efficiencies [7].
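The LSB operations described above can be illustrated with a minimal Python sketch. It shows plain LSB replacement only, not the schemes of [2] to [7]; the function names and the sample pixel values are hypothetical and chosen purely for illustration.

```python
import math

def lsb(pixel: int) -> int:
    """Least significant bit of a pixel value, i.e., pixel mod 2."""
    return pixel % 2

def embed_lsb(pixel: int, bit: int) -> int:
    """LSB replacement: flip the LSB only when it differs from the secret bit."""
    return pixel if lsb(pixel) == bit else pixel ^ 1

def bits_per_group(n: int) -> float:
    """Payload of the n-dimensional +/-1 codes: log2(2n + 1) bits per n pixels."""
    return math.log2(2 * n + 1)

# Hypothetical cover pixels and secret bits (illustration only).
cover = [100, 101, 102, 103]
secret = [1, 1, 0, 0]

stego = [embed_lsb(p, b) for p, b in zip(cover, secret)]
recovered = [lsb(p) for p in stego]

print(stego)              # each pixel changes by at most 1
print(recovered)          # equals the secret bits
print(bits_per_group(2))  # payload of the two-pixel, 5-ary code
```

Each stego pixel differs from its cover pixel by at most one gray level, which is why LSB-based schemes have low distortion but also a low payload per pixel.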
However, the LSB-based data-hiding schemes have a low hiding capacity. To improve the hiding capacity, Lin et al. [8] proposed a high-payload, reversible data-hiding scheme that is based on the absolute moment block truncation coding (AMBTC) compression domain. They presented four disjoint sets for embeddable blocks to embed data using different combinations of the mean value and the standard deviation. Malik et al. [9] modified the AMBTC compression technique for hiding secret data by first applying the original AMBTC technique and then identifying the smooth and complex blocks using a threshold value. They converted the one-bit plane into a two-bit plane and replaced all bits of the bit plane with the secret bits to obtain better image quality and high capacity. However, this approach permanently destroys the original AMBTC code and requires overhead information. Chen et al. [10] proposed an image authentication scheme for AMBTC compressed images using turtle shell-based data hiding.
In 2019, Yu et al. [11] proposed a hybrid data-hiding method for AMBTC compressed images, which combines a turtle-shell reference matrix and (7, 4) Hamming code to enhance the hiding capacity for compressed codes. Lin et al. [12] provided a reversible data-hiding method that uses adaptive block truncation coding based on an edge-based quantization approach. They utilized a Canny edge detector to obtain edge blocks and non-edge blocks, and they applied zero-point fixed histogram shifting to embed the secret information into the compressed code. In 2020, Yu et al. [13] proposed an adaptive image steganography method combining matrix coding. They constructed a reference data set by classifying all possible 7-bit binary number combinations and adaptively embedded 3-bit data by choosing a suitable alternative from the reference data. Their approach provided better results than the other existing matrix coding-based data-hiding schemes.
To minimize the risk of hiding data, we aim to decrease the hiding distortion and increase the hiding capacity. In this paper, we propose a high-payload data-hiding procedure based on AMBTC to improve the embedding efficiency further. Our proposed scheme embeds secret data by modifying the quantization levels according to a pre-defined lookup table. Moreover, in previous AMBTC-based schemes, hiding the data may cause the high quantization level to become lower than or equal to the low quantization level. We propose an adaptive embedding strategy to solve this issue and achieve high hiding capacity. We demonstrate that the proposed scheme has better hiding performance in terms of image quality and hiding capacity.
To make this paper self-contained, in Section 2, we review the absolute moment block truncation coding (AMBTC) algorithm for data hiding. High-payload data hiding based on AMBTC is described in Section 3. The experimental results and their analyses are presented in Section 4, where the performance is also compared with that of existing data-hiding schemes. Section 5 concludes the paper and outlines future research directions.
2. AMBTC
The absolute moment block truncation coding (AMBTC) proposed by Lema and Mitchell [14] is a type of lossy image compression technique. It is a variation of block truncation coding (BTC), but it is simpler in practical implementation. AMBTC preserves the first absolute moment and uses a two-level, non-parametric minimum mean square error quantizer in which the threshold is fixed to the sample mean. Our proposed data-hiding scheme exploits the lower mean square error yielded by AMBTC to minimize the impact of hiding secret data.
In AMBTC, an image is divided into non-overlapping blocks of $n \times n$ pixels, and $x_i$ is the gray level of a pixel in the block, where $1 \le i \le n^2$. Each block is quantized so that the resulting block has the same sample mean and the same sample first absolute central moment as the original block. For each block, the sample mean, $\bar{x}$, is the decision threshold of the quantizer, which is calculated as
$$\bar{x} = \frac{1}{n^2} \sum_{i=1}^{n^2} x_i. \tag{1}$$
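In AMBTC encoding, each block is encoded as ($L$, $H$, $BM$), where ($L$, $H$) is a two-level MMSE (minimum mean square error) quantizer, and $BM$ is a bitmap that records the thresholding result. Writing $b_i$ for the $i$-th bit of $BM$, the bitmap is given by
$$b_i = \begin{cases} 1, & \text{if } x_i \ge \bar{x}, \\ 0, & \text{otherwise}, \end{cases} \tag{2}$$
where $x_i$ is encoded as 1 when $x_i$ is greater than or equal to the threshold and 0 otherwise. The two-level MMSE quantizers are computed as shown below:
$$L = \frac{1}{n^2 - q} \sum_{x_i < \bar{x}} x_i, \qquad H = \frac{1}{q} \sum_{x_i \ge \bar{x}} x_i, \tag{3}$$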
where $q$ is the number of pixels at or above the threshold. Note that $L$ and $H$ are the estimated conditional means given that $x_i$ is less than $\bar{x}$, or greater than or equal to $\bar{x}$, respectively.
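In AMBTC decoding, each pixel $\hat{x}_i$ of each block is decoded according to the two-level MMSE quantizer and $BM$, where $b_i$ denotes the $i$-th bit of $BM$:
$$\hat{x}_i = \begin{cases} L, & \text{if } b_i = 0, \\ H, & \text{if } b_i = 1. \end{cases} \tag{4}$$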
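The encoding and decoding rules of this section can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, rounding the quantization levels to the nearest integer is an assumption, and the sample block is invented so that it yields the levels (90, 150).

```python
def ambtc_encode(block):
    """Encode a flat list of n*n gray levels as (L, H, bitmap)."""
    mean = sum(block) / len(block)                  # decision threshold
    bitmap = [1 if x >= mean else 0 for x in block]
    high = [x for x, b in zip(block, bitmap) if b == 1]
    low = [x for x, b in zip(block, bitmap) if b == 0]
    # Assumption: levels are rounded to integers with Python's round().
    H = round(sum(high) / len(high))                # high is never empty
    # A uniform block has no pixel below the mean; assume L = H in that case.
    L = round(sum(low) / len(low)) if low else H
    return L, H, bitmap

def ambtc_decode(L, H, bitmap):
    """Rebuild the block: bit 1 maps to H, bit 0 maps to L."""
    return [H if b == 1 else L for b in bitmap]

# Hypothetical 4 x 4 block in raster-scan order (illustration only).
block = [88, 92, 148, 152,
         90, 90, 150, 150,
         89, 91, 149, 151,
         90, 90, 150, 150]

L, H, bitmap = ambtc_encode(block)
decoded = ambtc_decode(L, H, bitmap)
print(L, H)
print(decoded)
```

For this block, the decoded pixels sum to the same total as the original pixels, which reflects the mean-preserving property of the quantizer.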
3. Our Proposed DH Method Based on HP-AMBTC
To improve the hiding capacity and decrease the hiding distortion, the two-level MMSE quantizer used by AMBTC is used in our proposed data hiding to minimize the hiding impact. In addition, we designed a lookup table for adaptively hiding secret information in images. The details of the proposed scheme are described as follows.
3.1. Data Embedding
Assume that the cover image is an 8-bit grayscale digital image whose pixel values lie in the range [0, 255]. The cover image is divided into non-overlapping blocks of 4 × 4 pixels, and let {x1, x2, …, x16} be the pixels in a block read in a raster scan, where xi ∈ [0, 255].
- Step 1.
For each block, we use AMBTC to encode it to generate the AMBTC compression code (L, H, BM).
- Step 2.
Reconstruct the block using Equation (4) to obtain the reconstructed pixels {$\hat{x}_1$, $\hat{x}_2$, …, $\hat{x}_{16}$} read in a raster scan.
- Step 3.
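For each reconstructed block, we mark $\hat{x}_1$ and $\hat{x}_{16}$ as the non-embeddable pixels and, where necessary, modify them so that one of them holds $L$ and the other holds $H$; these two pixels later serve as the indicators of the quantization levels during data extraction.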
- Step 4.
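Embed secret data into the embeddable pixels {$\hat{x}_2$, $\hat{x}_3$, …, $\hat{x}_{15}$} as
$$y_i = \hat{x}_i + mv, \quad 2 \le i \le 15,$$
where $y_i$ is the stego pixel, and $mv$ is the modified value, which is defined in the following lookup tables: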
Case 1:
Secret Data | H/L mv |
00 | −1 |
01 | 0 |
10 | +1 |
11 | +2 |

Case 2:
Secret Data | L mv | H mv |
0 | 0 | 0 |
1 | −1 | +1 |

Case 3:
Secret Data | L mv | H mv |
00 | −1 | −1 |
01 | 0 | 0 |
10 | −2 | +1 |
11 | −3 | +2 |

Case 4:
Secret Data | L mv | H mv |
1111 | −3 | +3 |
0000 | −2 | −2 |
00 | −1 | −1 |
01 | 0 | 0 |
10 | +1 | +1 |
11 | +2 | +2 |

Case 5:
Secret Data | H/L mv |
1111 | +3 |
0000 | −2 |
00 | −1 |
01 | 0 |
10 | +1 |
11 | +2 |
Note that the lookup table, which is our embedding and extraction rule, should be shared in advance between the two parties over a secure channel. Figure 1 gives an example in which 34 secret bits are concealed. After decoding the AMBTC code (90, 150, $BM$) to obtain a reconstructed image block, we define $\hat{x}_1$ and $\hat{x}_{16}$ as the non-embeddable pixels. Because |$H$ − $L$| = 60 > 5, we choose the Case 5 lookup table and, according to the secret data, adjust each embeddable pixel by the corresponding modified value, $mv$. Note that the stego image is also an 8-bit grayscale digital image rather than a compressed image or a decompressed image. From Figure 1, we can see that the stego image block is very similar to the cover image block. After hiding 34 bits of secret data, the hiding distortion is very low since we used a two-level MMSE quantizer to minimize the impact of hiding.
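Step 4 can be sketched for a Case 5 block as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the rule of preferring a 4-bit codeword (1111 or 0000) whenever the next four secret bits match it is an assumed parsing strategy, the block and the secret bit string are invented, and overflow or underflow outside [0, 255] is not handled.

```python
# Case 5 lookup table (|H - L| > 5): secret bits -> modified value mv.
CASE5_MV = {"1111": 3, "0000": -2, "00": -1, "01": 0, "10": 1, "11": 2}

def embed_case5(recon, bits):
    """Embed a bit string into pixels 2..15 of a 16-pixel reconstructed block.

    recon: AMBTC-decoded pixels in raster order; recon[0] and recon[15] are
           the non-embeddable indicator pixels and are left untouched.
    bits:  string of '0'/'1' characters, assumed long enough for the block.
    Returns (stego_block, number_of_bits_consumed).
    """
    stego = list(recon)
    pos = 0
    for i in range(1, 15):
        chunk = bits[pos:pos + 4]
        if chunk in ("1111", "0000"):   # assumed greedy choice of 4-bit codes
            code = chunk
        else:
            code = bits[pos:pos + 2]
            if len(code) < 2:
                raise ValueError("not enough secret bits for this block")
        stego[i] = recon[i] + CASE5_MV[code]   # y_i = x^_i + mv
        pos += len(code)
    return stego, pos

# Hypothetical reconstructed block with L = 90 and H = 150.
recon = [90, 90, 150, 150] * 4
secret = "1111000110000011100100111110011100"   # hypothetical 34-bit payload

stego, used = embed_case5(recon, secret)
print(stego)
print(used)
```

With this payload, three pixels carry a 4-bit codeword and eleven carry a 2-bit codeword, giving 34 bits in 14 pixels, the same count as in the example of Figure 1.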
3.2. Data Extraction
This process extracts the secret data from the 8-bit grayscale stego image. Assume that the grayscale stego image is divided into non-overlapping blocks of 4 × 4 pixels, and let {y1, y2, …, y16} be the pixels in a stego image block and read in raster scan where yi ∈ [0, 255].
- Step 1.
For each block, we mark y1 and y16 as the non-embeddable pixels. If y1 ≥ y16, then y1 is taken as H and y16 as L. Otherwise, y1 is taken as L and y16 as H.
- Step 2.
For each embeddable pixel in {y2, y3, …, y15}, we extract the secret data according to the absolute difference between H and L and the lookup tables defined below:
Case 1:
Stego Value | Secret Data |
(H − 1)/(L − 1) | 00 |
H/L | 01 |
(H + 1)/(L + 1) | 10 |
(H + 2)/(L + 2) | 11 |

Case 2:
Stego Value | Secret Data |
H/L | 0 |
(H + 1)/(L − 1) | 1 |

Case 3:
Stego Value | Secret Data |
(H − 1)/(L − 1) | 00 |
H/L | 01 |
(H + 1)/(L − 2) | 10 |
(H + 2)/(L − 3) | 11 |

Case 4:
Stego Value | Secret Data |
(H + 3)/(L − 3) | 1111 |
(H − 2)/(L − 2) | 0000 |
(H − 1)/(L − 1) | 00 |
H/L | 01 |
(H + 1)/(L + 1) | 10 |
(H + 2)/(L + 2) | 11 |

Case 5:
Stego Value | Secret Data |
(H + 3)/(L + 3) | 1111 |
(H − 2)/(L − 2) | 0000 |
(H − 1)/(L − 1) | 00 |
H/L | 01 |
(H + 1)/(L + 1) | 10 |
(H + 2)/(L + 2) | 11 |
Figure 2 illustrates the data extraction example. We define y1 and y16 as the non-embeddable pixels, and we regard y1 as L and y16 as H because y1 < y16. Since |H − L| = 60 > 5, we choose the Case 5 lookup table to extract the secret data. Finally, 34 bits of secret data can be extracted after scanning 14 pixels. Moreover, the AMBTC decoded image block can be almost restored from the corresponding L and H values, except for the first and the last pixels.
We say that the AMBTC decoded image block can be almost restored because the first and the sixteenth pixels may have to be modified so that they can serve as the indicators for data extraction. If these two pixels in a block are the same as the original ones after the data hiding, the original AMBTC-encoded image block can be completely restored. Otherwise, only fourteen pixels in a block can be restored to the original AMBTC compression codes after data extraction.
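The extraction steps can be sketched for a Case 5 block in the same way. This is a minimal illustration under stated assumptions: the function name and the stego block are hypothetical, and only Case 5 (|H − L| > 5) is covered. The sketch relies on the fact that, when H − L > 5, the stego ranges [L − 2, L + 3] and [H − 2, H + 3] do not overlap, so each stego pixel identifies both its quantization level and its codeword.

```python
# Case 5 lookup table (|H - L| > 5): offset from the level -> secret bits.
CASE5_BITS = {3: "1111", -2: "0000", -1: "00", 0: "01", 1: "10", 2: "11"}

def extract_case5(stego):
    """Extract the secret bits from a 16-pixel stego block (raster order).

    Returns (bit_string, restored_block), where pixels 2..15 of the restored
    block are reset to their quantization level L or H.
    """
    y1, y16 = stego[0], stego[15]
    H, L = max(y1, y16), min(y1, y16)      # Step 1: indicator pixels
    if H - L <= 5:
        raise ValueError("this sketch only covers Case 5 (|H - L| > 5)")
    bits = []
    restored = list(stego)
    for i in range(1, 15):                 # Step 2: embeddable pixels
        y = stego[i]
        # Offsets -2..+3 from L mark a low pixel; otherwise it is a high pixel.
        level = L if (y - L) in CASE5_BITS else H
        bits.append(CASE5_BITS[y - level])
        restored[i] = level
    return "".join(bits), restored

# Hypothetical stego block whose indicator pixels give L = 90 and H = 150.
stego = [90, 93, 149, 150, 91, 88, 152, 151,
         90, 89, 153, 151, 90, 92, 149, 150]

secret, restored = extract_case5(stego)
print(secret, len(secret))
print(restored)
```

The restored block contains only the two quantization levels, which corresponds to recovering the AMBTC decoded block for the fourteen embeddable pixels.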