1. Introduction
Reversible data hiding (RDH) is a technology that embeds secret data into cover media (such as text, images, and video) imperceptibly and reversibly: the embedded data can be extracted and the cover media can be completely recovered. Because digital images are widely used in various fields, a series of image-based RDH methods have been proposed [1]. These methods can be classified into two categories: RDH in plaintext images [2,3,4,5,6,7,8,9] and RDH in encrypted images (RDHEI) [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33].
RDHEI methods embed secret data into an encrypted image reversibly, without decrypting it or knowing the image content. RDHEI is useful for applications in which the data managers have no access to the image contents for privacy or other reasons. For example, in cloud storage, the original images are encrypted to protect privacy before they are stored on the cloud servers. In RDHEI, there are three parties: the content owner, the data hider, and the receiver. The content owner encrypts the original image and transfers it to the data hider. The data hider embeds secret data into the encrypted image. After data embedding, the receiver can extract the secret data or retrieve the original image from the encrypted image containing the secret data. According to how the data hiding room is vacated, the existing RDHEI methods can be classified into vacating room after encryption (VRAE) methods [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27] and reserving room before encryption (RRBE) methods [28,29,30,31,32,33].
In the VRAE methods, the content owner encrypts the original image with no preprocessing, and the data hider must vacate room in the encrypted image for data hiding. In Zhang’s method [10], the encrypted image is divided into non-overlapping blocks, and one secret bit is embedded into each block by flipping the three least-significant bits (LSBs) of half the pixels in the block. To extract the secret data and recover the flipped block, a smoothness estimator is used to identify the flipped pixels in each block after image decryption. The methods in [11,12,13] improve Zhang’s method in terms of reversibility and visual quality. In Hong et al.’s method [11] and Liao et al.’s method [12], improved smoothness estimators and side matching strategies are used to reduce errors in data extraction and image recovery. In Qin et al.’s method [13], an elaborate pixel selection scheme is used to reduce the distortion of the directly decrypted image, and a more precise smoothness estimator is used to reduce errors in data extraction and image recovery. In Wu et al.’s method [14], pixels are pseudo-randomly selected from the encrypted image and divided into same-size groups, and one secret bit is embedded into each group by flipping the i-th LSBs of all pixels in the group. After image decryption, multiple designed content-aware pixel value predictors are used to detect the bit flipping, so that the embedded bits can be extracted and the image can be recovered. Dragoi et al.’s method [15] extends Wu et al.’s method: more pixels in the encrypted image can be selected for data hiding, which improves the embedding rate, and Bose–Chaudhuri–Hocquenghem (BCH) coding is used to reduce errors in data extraction and image recovery. In Zhou et al.’s method [16], the encrypted image is divided into non-overlapping blocks, and a public key modulation scheme is used to embed a certain number of bits into each block. To recover the image and extract the embedded bits, a classifier based on the support vector machine (SVM) is used to determine the embedded bits. In [17], a designed sparse matrix is used to compress the LSB planes of the encrypted image so that the spared bits in the LSB planes can accommodate the secret data. The receiver extracts the embedded data directly from the LSB planes, and uses a specific pixel predictor to recover the compressed LSBs. The methods in [18,19] improve the method in [17]. In Qian et al.’s method [18], the pixels of the encrypted image are divided into three subsets. In each subset, the LSBs of the pixels are compressed by a different sparse matrix and recovered by a different pixel predictor. In Qin et al.’s method [19], a specific image encryption scheme is used so that the encrypted image can be decomposed into smooth regions and complex regions. Only the LSBs of the smooth regions are compressed for data hiding. In [20,21], low-density parity-check (LDPC) codes are used to compress specific bit planes for data hiding. In Zhang et al.’s method [20], the bits of the fourth LSB plane of the encrypted image are compressed. In Qian et al.’s method [21], three quarters of the most significant bits (MSBs) are compressed. To recover the image, the log-likelihood ratio (LLR) and an iterative belief propagation algorithm (BPA) are used to decode the compressed bits.
Some VRAE methods use special encryption schemes to partially retain the spatial correlation in encrypted images, and the data hider uses variants of traditional RDH methods (such as histogram shifting, prediction error expansion, and pixel value ordering) to embed secret data. In Huang et al.’s method [22], various histogram shifting-based RDH methods are carried out in the encrypted domain. In Xiao et al.’s method [23], a variant pixel value ordering scheme is used to embed secret bits into non-overlapping blocks of the encrypted image. In Yi et al.’s method [24], a designed pixel value predictor is used to generate prediction errors, and a variant prediction error expansion scheme is used for data embedding. In Qin et al.’s method [25], the MSBs of each block of the encrypted image are compressed by a sparse matrix compression scheme. In Li et al.’s method [26], based on a different histogram generated from the encrypted image, a variant histogram shifting scheme is used to embed secret data. In Ge et al.’s method [27], a multi-level histogram shifting scheme is used to embed secret data into each block of the encrypted image.
In the RRBE methods, the content owner preprocesses the original image before encryption to vacate room for accommodating secret data. The vacated room is retained after image encryption, and the data hider can embed secret data into the room directly. In Ma et al.’s method [28], the original image is divided into smooth regions and complex regions. To reserve the LSBs of the encrypted image for data hiding, the LSBs of the complex regions are embedded into the smooth regions using a traditional RDH method, such as difference expansion. Image encryption is performed after embedding the LSBs. At the data hider’s side, the secret data can be placed directly into the original positions of the embedded LSBs. In Zhang et al.’s method [29], before image encryption, a pixel estimator is used to predict the original values of a portion of the pixels, and these pixels are then substituted with their prediction errors. After image encryption, based on the histogram of the prediction errors, the data hider uses a variant histogram shifting scheme to embed secret data. In Cao et al.’s method [30], the original image is divided into patches, and each patch is encoded using fewer bits by sparse coding with an over-complete dictionary. The spared bits in each patch can be used for data embedding after image encryption. In Yi et al.’s method [31], each MSB plane of the original image is divided into non-overlapping blocks and compressed by a designed sparse matrix coding scheme. Then, the LSBs of the image are embedded into the compressed MSB planes to vacate room for data hiding. In Chen et al.’s method [32], a bit plane rearrangement strategy is used to rearrange the MSBs of the original image, and the rearranged MSBs are then compressed by an extended run-length coding scheme to vacate room for data hiding. In Qiu et al.’s method [33], the LSBs of the original image are removed for data hiding by using a reversible integer transformation scheme.
Benefitting from the use of the original image’s spatial correlation, the RRBE methods can achieve a much higher capacity than the VRAE methods. However, the RRBE methods require the content owner to handle extra image processing work. In some cases, the content owner may not be able to process images. Therefore, the VRAE methods are more feasible for different applications.
In this paper, we propose a novel VRAE RDHEI method which is based on the compression of pixel differences. In the proposed method, at the content owner’s side a specific image encryption scheme is used to partially retain the spatial correlation in non-overlapping blocks of the encrypted image. At the data hider’s side, for each block the four pixels are divided into two parts—one mark pixel and three replaceable pixels—and three pixel differences between the three replaceable pixels and the mark pixel are collected. Due to the spatial correlation, the pixel differences are highly likely to be concentrated and can be compressed efficiently. By replacing the information of the replaceable pixels with their compressed pixel differences, the data hiding room is vacated in the encrypted image without losing any information. At the receiver’s side, the receiver can extract the secret data with no error and recover the image to its original version.
The rest of the paper is organized as follows. In Section 2, we describe the proposed method in detail. In Section 3, the experimental results and comparisons are provided. The conclusions are given in Section 4.
2. The Proposed Method
In this section, we introduce the details of the proposed method. Figure 1 illustrates the framework of the proposed method. At the content owner’s side, the content owner encrypts the original image with no preprocessing using two encryption keys, and sends the encrypted image to the data hider. The data hider vacates room in the encrypted image, then uses a data hiding key to embed secret data into the room. The receiver extracts the embedded data from the encrypted image using the data hiding key, and recovers the original image or generates the marked decrypted image containing the secret data using the encryption keys.
2.1. Image Encryption
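For an 8-bit standard gray image sized $M \times N$, we denote the $k$-th bit of the pixel $p_{i,j}$ as $p_{i,j}^{k}$ ($1 \le i \le M$, $1 \le j \le N$). The bit $p_{i,j}^{k}$ is defined as follows:
$$p_{i,j}^{k} = \left\lfloor \frac{p_{i,j}}{2^{k-1}} \right\rfloor \bmod 2, \quad k = 1, 2, \ldots, 8. \tag{1}$$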
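The bit decomposition can be illustrated with a minimal Python sketch. The function names are hypothetical, and the indexing convention ($k = 1$ for the LSB, $k = 8$ for the MSB) is an assumption made for this illustration.

```python
def pixel_bit(p, k):
    """Return the k-th bit of an 8-bit pixel, with k = 1 for the LSB (assumed convention)."""
    if not 0 <= p <= 255 or not 1 <= k <= 8:
        raise ValueError("expected 0 <= p <= 255 and 1 <= k <= 8")
    # Shifting right by k - 1 is the same as floor(p / 2**(k - 1)).
    return (p >> (k - 1)) & 1


def pixel_from_bits(bits):
    """Rebuild a pixel from its bits, listed from k = 1 (LSB) to k = 8 (MSB)."""
    return sum(b << i for i, b in enumerate(bits))
```

Decomposing a pixel into its eight bits and recombining them returns the original value, which is the property the bit-level encryption relies on.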
To partially retain the spatial correlation in the encrypted image for data hiding, a specific image encryption scheme is used to encrypt the original image at the level of the block. The detailed steps of the image encryption scheme are as follows:
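Step 1: Divide the original image into non-overlapping $2 \times 2$ blocks $B_1, B_2, \ldots, B_T$, where $T$ is the number of blocks.
Step 2: For each block $B_n$, an 8-bit pseudo-random bit sequence $r_n$ is generated using the block encryption key $K_e$. Denoting the four pixels in $B_n$ as $p_{n,1}$, $p_{n,2}$, $p_{n,3}$, and $p_{n,4}$, each pixel $p_{n,t}$ is encrypted into $e_{n,t}$ by $r_n$ as follows:
$$e_{n,t}^{k} = p_{n,t}^{k} \oplus r_{n}^{k}, \quad k = 1, 2, \ldots, 8, \tag{2}$$
$$e_{n,t} = \sum_{k=1}^{8} e_{n,t}^{k} \times 2^{k-1}, \tag{3}$$
where $p_{n,t}^{k}$ is the $k$-th bit of $p_{n,t}$, and $r_{n}^{k}$ is the $k$-th bit of $r_n$.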
If the image height or width is odd, the last row or column is encrypted at the level of the pixel after all the blocks are encrypted. For each pixel that does not belong to any block, an 8-bit pseudo-random bit sequence is generated by the block encryption key to encrypt the pixel. The encryption process of each pixel is the same as that shown in Equations (2) and (3).
Step 3: After all the blocks have been encrypted, the content owner uses the block permutation key to pseudo-randomly permute all blocks inside the encrypted image, which enhances the encryption strength.
After image encryption, the encrypted image is sent to the data hider for embedding secret data. Because the four pixels in the block are encrypted by the same bit sequence, the encrypted pixels are still highly likely to be similar. Therefore, the spatial correlation of the original image can be partially retained in each block. That allows the data hider to vacate room in each block for accommodating secret data.
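The block-level encryption can be sketched as follows, under stated assumptions: the image is a list of rows with even height and width, bitwise XOR with one pseudo-random byte per block stands in for the per-bit encryption, and Python's `random.Random` seeded with the key stands in for the pseudo-random generator (it is not cryptographically secure). All function and parameter names are hypothetical.

```python
import random


def to_blocks(image):
    """Split an even-sized image (list of rows) into 2x2 blocks in raster order."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("this sketch handles even dimensions only")
    return [[image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1]]
            for i in range(0, rows, 2) for j in range(0, cols, 2)]


def from_blocks(blocks, rows, cols):
    """Write 2x2 blocks back into an image in raster order."""
    image = [[0] * cols for _ in range(rows)]
    positions = [(i, j) for i in range(0, rows, 2) for j in range(0, cols, 2)]
    for (i, j), (a, b, c, d) in zip(positions, blocks):
        image[i][j], image[i][j + 1] = a, b
        image[i + 1][j], image[i + 1][j + 1] = c, d
    return image


def encrypt(image, enc_key, perm_key):
    """XOR each block with one key-derived byte, then permute the blocks."""
    rng = random.Random(enc_key)
    encrypted = []
    for block in to_blocks(image):
        r = rng.randrange(256)  # one 8-bit sequence shared by the four pixels
        encrypted.append([p ^ r for p in block])
    order = list(range(len(encrypted)))
    random.Random(perm_key).shuffle(order)
    permuted = [encrypted[k] for k in order]
    return from_blocks(permuted, len(image), len(image[0]))


def decrypt(image, enc_key, perm_key):
    """Undo the block permutation, then undo the per-block XOR."""
    permuted = to_blocks(image)
    order = list(range(len(permuted)))
    random.Random(perm_key).shuffle(order)
    encrypted = [None] * len(permuted)
    for slot, k in enumerate(order):
        encrypted[k] = permuted[slot]
    rng = random.Random(enc_key)
    blocks = []
    for block in encrypted:
        r = rng.randrange(256)
        blocks.append([e ^ r for e in block])
    return from_blocks(blocks, len(image), len(image[0]))
```

Because the four pixels of a block share the same byte, the XOR of any two pixels in a block is unchanged by encryption, which is one way to see that intra-block structure survives.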
2.2. Data Embedding
When the data hider receives the encrypted image, the data hider first divides it to retrieve the $2 \times 2$ blocks $B_n$. For each block $B_n$, the four pixels $e_{n,1}$, $e_{n,2}$, $e_{n,3}$, and $e_{n,4}$ are divided into one mark pixel and three replaceable pixels. As shown in Figure 2, to simplify the statement, $e_{n,1}$ is assigned to the mark pixel, and the other three pixels are assigned to the replaceable pixels. Then, the three pixel differences $d_{n,1}$, $d_{n,2}$, and $d_{n,3}$ of block $B_n$ are calculated as:
$$d_{n,t} = e_{n,t+1} - e_{n,1}, \quad t = 1, 2, 3. \tag{4}$$
The range of each pixel difference is $[-255, 255]$. However, since the spatial correlation is retained in each block, there is a high probability that the four pixels inside the block are close to each other. Thus, each pixel difference is highly likely to be close to 0.
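As a small illustration of the difference calculation, the sketch below treats the first pixel of a block as the mark pixel and takes each difference as replaceable pixel minus mark pixel. The sign convention and the function names are assumptions made for this example.

```python
def block_differences(block):
    """Return the three differences between the replaceable pixels and the mark pixel."""
    mark, *replaceable = block  # first pixel is the mark pixel (assumed layout)
    return [p - mark for p in replaceable]


def restore_block(mark, diffs):
    """Rebuild the four pixels of a block from the mark pixel and its differences."""
    return [mark] + [mark + d for d in diffs]
```

For a smooth block such as `[120, 122, 119, 121]` the differences are `[2, -1, 1]`, while the extreme blocks `[0, 255, 255, 255]` and `[255, 0, 0, 0]` reach the bounds 255 and -255.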
After all the pixel differences are collected from all the blocks, their values are highly concentrated. Figure 3 shows the histogram of the pixel differences of Lena (in Figure 5). As shown in the figure, most of the pixel differences are concentrated in a small range around 0. Therefore, these pixel differences can be compressed efficiently by Huffman coding. According to the mark pixel and the pixel differences, the three replaceable pixels in each block can be recovered completely. Therefore, the data hider can replace the replaceable pixels with the compressed pixel differences to vacate room in the encrypted image for accommodating secret data.
The procedure of room vacating and data embedding is as follows:
Step 1: For each block, calculate the three pixel differences according to Equation (4).
Step 2: Over all the pixel differences, calculate the distribution of the difference values, and use Huffman coding to encode each pixel difference into a codeword.
Step 3: Concatenate all the encoded pixel differences into one bit sequence, which is the bitstream of the compressed pixel differences.
Step 4: Beginning from the highest MSB plane, embed the bitstream of the compressed pixel differences, the Huffman codebook, and their length information into the MSBs of all replaceable pixels by bit substitution. After this auxiliary information has been embedded, the remaining bits of all the replaceable pixels can be used as the data hiding room.
Step 5: To embed secret data, beginning from the LSB plane, the data hider uses the data hiding key to pseudo-randomly select bits from the data hiding room and substitutes them with the secret data.
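The compression step can be sketched with a textbook Huffman construction. This is an illustrative sketch rather than the exact coder used here: the function names are hypothetical, and the room estimate ignores the overhead of the codebook and the length information.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codebook(symbols):
    """Build a prefix-free codebook {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dictionaries
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def vacate_room(diffs):
    """Return (codebook, bitstream, spare bits), ignoring codebook and length overhead."""
    codebook = huffman_codebook(diffs)
    bitstream = "".join(codebook[d] for d in diffs)
    # Each difference stands for one 8-bit replaceable pixel.
    spare_bits = 8 * len(diffs) - len(bitstream)
    return codebook, bitstream, spare_bits
```

For the eight differences `[0, 0, 0, 0, 1, 1, -1, 2]`, the codeword lengths are 1, 2, 3, and 3 bits, so the bitstream takes 14 bits and 50 of the 64 bits in the replaceable pixels are spared before overhead.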
Figure 4 shows an example of room vacating and data embedding. First, the encrypted image is divided into four $2 \times 2$ blocks. Then, three pixel differences in each block are calculated, and all the pixel differences are encoded by Huffman coding. All the encoded pixel differences compose the bitstream of the compressed pixel differences. Finally, the length information, the Huffman codebook, and the bitstream are embedded into the MSBs of the replaceable pixels, and the secret data are then embedded into the remaining bits of the replaceable pixels using the data hiding key.
2.3. Data Extraction and Image Recovery
When the receiver gets the marked encrypted image containing secret data from the data hider, the receiver can obtain different data from the image using different keys.
Data extraction: When the receiver has the data hiding key, the receiver can extract the secret data directly from the data hiding room. Beginning from the LSB planes of the replaceable pixels, the receiver uses the data hiding key to retrieve the embedded secret bits from the data hiding room, so that the secret data are extracted.
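Key-driven embedding and extraction can be sketched as follows. The data hiding room is modeled as a flat list of bits, `random.Random` seeded with the data hiding key stands in for the pseudo-random selection (not cryptographically secure), and the receiver is assumed to know the number of embedded bits. These modeling choices and all names are assumptions for illustration only.

```python
import random


def select_positions(room_size, n_bits, hiding_key):
    """Pseudo-randomly pick n_bits distinct positions in the room from the key."""
    return random.Random(hiding_key).sample(range(room_size), n_bits)


def embed(room, secret_bits, hiding_key):
    """Substitute the selected room bits with the secret bits."""
    marked = list(room)
    positions = select_positions(len(room), len(secret_bits), hiding_key)
    for pos, bit in zip(positions, secret_bits):
        marked[pos] = bit
    return marked


def extract(room, n_bits, hiding_key):
    """Read the secret bits back from the same key-selected positions."""
    return [room[pos] for pos in select_positions(len(room), n_bits, hiding_key)]
```

Because the same key reproduces the same positions, extraction needs only the marked room and the key; neither the encryption keys nor the image content is involved.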
Image recovery: When the receiver has the block encryption key and the block permutation key, the receiver can retrieve the original image from the marked encrypted image or generate the marked decrypted image, which still contains secret data and is highly similar to the original image. The procedure of image recovery is as follows:
Step 1: Extract the length information from the MSBs of the replaceable pixels. Then, extract the Huffman codebook and the bitstream of the compressed pixel differences according to the length information.
Step 2: According to the Huffman codebook, decompose the bitstream into the encoded pixel differences, and decode them into the original pixel differences.
Step 3: For each block $B_n$ containing the mark pixel $e_{n,1}$, retrieve the original replaceable pixels $e_{n,2}$, $e_{n,3}$, and $e_{n,4}$ as follows:
$$e_{n,t+1} = e_{n,1} + d_{n,t}, \quad t = 1, 2, 3, \tag{5}$$
where $d_{n,1}$, $d_{n,2}$, and $d_{n,3}$ are the decoded pixel differences of block $B_n$.
Step 4: After retrieving the original values of all the replaceable pixels, all the replaceable pixels in the marked encrypted image are substituted with their original values to recover the original image.
To generate the marked decrypted image, for each replaceable pixel, the MSBs that were substituted with the auxiliary information are recovered according to the original value, and the LSBs that belong to the data hiding room stay the same.
Step 5: Use the block permutation key and the block encryption key to decrypt the processed encrypted image into the original image or the marked decrypted image, which is highly similar to the original image.
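Steps 2 and 3 of the recovery procedure can be sketched as follows, assuming a prefix-free codebook given as a dictionary, three differences per block stored in block order, and differences defined as replaceable pixel minus mark pixel. The names and the small codebook in the usage example are hypothetical.

```python
def decode_differences(bitstream, codebook):
    """Decode a prefix-free bitstream back into the list of pixel differences."""
    inverse = {code: value for value, code in codebook.items()}
    diffs, current = [], ""
    for bit in bitstream:
        current += bit
        if current in inverse:  # a complete codeword has been read
            diffs.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bitstream ends inside a codeword")
    return diffs


def recover_blocks(mark_pixels, diffs):
    """Rebuild each encrypted 2x2 block from its mark pixel and three differences."""
    if len(diffs) != 3 * len(mark_pixels):
        raise ValueError("expected three differences per block")
    return [[m] + [m + d for d in diffs[3 * n:3 * n + 3]]
            for n, m in enumerate(mark_pixels)]
```

With the codebook `{0: "0", 1: "10", -1: "110", 2: "111"}`, the bitstream `"111110100010"` decodes to `[2, -1, 1, 0, 0, 1]`, and the mark pixels `[120, 50]` then yield the blocks `[120, 122, 119, 121]` and `[50, 50, 50, 51]`. Decryption with the two encryption keys would follow as the final step.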