1. Introduction
Data hiding (DH) and digital watermarking (DW) are two important topics in cyberspace security research and have developed rapidly in recent years, with new techniques and methods emerging continually. For example, Abdallah et al. [1,2] proposed new algorithms for video watermarking, and Yu et al. [3] surveyed current robust video watermarking algorithms for copyright protection. However, most steganography algorithms permanently distort the original carrier after embedding the data, which is unacceptable in applications that require data authentication, such as privacy protection in the cloud environment and military combat map transmission [4]. Reversible data hiding (RDH) [5] addresses both data hiding and distortion-free recovery of the original carrier. Current RDH algorithms fall mainly into three classes: lossless compression (LC) [6,7], difference expansion (DE) [8,9] and histogram shifting (HS) [10,11]. With the spread of cloud services, users usually encrypt uploaded data to protect their privacy, so that the original data become unintelligible ciphertext. Reversible data hiding in encrypted images (RDH-EI) arose in response and has become a current research hotspot. RDH-EI requires that the carrier used for embedding be encrypted and that the carrier can still be decrypted without error after the embedded data are extracted. RDH-EI combines signal processing with data hiding in the encrypted domain, and it serves the dual purpose of privacy protection and secret information transmission.
RDH-EI has two main applications in medical health. The first is the integrity protection of medical images. Medical images are mostly digital images, which are easy to store but also easy to copy, edit, or maliciously alter. A mature technique for protecting image integrity is the digital signature, whose main principle is to generate a non-repudiable hash value that is appended to the original image and transmitted to the receiver. In practice, however, confidential communications are exposed to malicious attack and analysis. A hash value attached directly to the image is easily destroyed or replaced, which makes image authentication harder. In such cases, reversible data hiding can complement digital signatures by embedding integrity information such as hash values into the medical image itself. The receiver can then both recover the medical image reversibly and correctly extract the authentication information, which protects the integrity of the medical image. The second is encrypted image management in telemedicine, where image-oriented services are expected to be widely used. For the purposes of confidentiality and privacy protection, medical images sometimes need to be encrypted beforehand and sent to third parties such as the cloud for management. Sensitive information such as the medical record is embedded in the encrypted image by means of RDH-EI, so that encrypted image management in telemedicine can be realized.
Zhang [12] first proposed RDH-EI. The method encrypts the image with a stream cipher and then divides the encrypted image into non-overlapping blocks, each of which is split into two groups. Flipping the three least significant bits (LSBs) of every pixel in the corresponding group embeds one bit of information per block. The receiver extracts the additional data by means of a fluctuation function. However, when the block is small, the error rate becomes high. Hong et al. improved Zhang’s method by using a fluctuation function with side matching in [13] and unbalanced bit flipping in [14]. In [15], Zhang proposed a separable RDH-EI algorithm in which additional data can be extracted in both the encrypted domain and the plaintext domain. All of the above RDH-EI algorithms vacate room after encryption (VRAE) for data hiding, which results in low embedding capacity and a high error rate during data extraction. Ma et al. [16] proposed an RDH-EI algorithm that vacates room before encryption (VRBE). Zhang et al. [17] improved reversibility, embedding capacity, and image quality by exploiting prediction errors in data embedding.
The RDH-EI algorithms mentioned above encrypt images with symmetric cryptography because it has lower computational complexity and faster encryption and decryption. However, symmetric cryptography requires each sender-receiver pair to share a distinct key, and in current multi-party cloud computing services the number of keys becomes so large that key management is difficult. For example, in a communication network with $n$ users, each user must hold $n-1$ keys to communicate with the remaining users, and the total number of keys in the system is up to $n(n-1)/2$. Such a large number of keys creates risks at every stage of storing, transferring, using and destroying them. In addition, both parties must use the same key for encryption and decryption, so the key must be transmitted over a secure channel before communication. If the key is stolen during delivery, or either party leaks it, the encrypted data are no longer secure. Because of these shortcomings, it is natural to introduce public key cryptography into RDH-EI algorithms. Public key cryptography greatly reduces the number of keys, and no key needs to be transferred in advance. More importantly, with public key cryptography, we can perform homomorphic addition and homomorphic multiplication in the encrypted domain, which makes the embedding process of additional data more secure.
Differing from RDH-EI, Chen et al. [18] first proposed encrypted image-based reversible data hiding with public key cryptography (EIRDH-P). In EIRDH-P, public key cryptography overcomes the shortcoming that symmetric encryption requires a secure channel to pass the key in advance. The algorithm embeds one bit of information into a pair of adjacent encrypted pixels. Using the homomorphic property of the Paillier cryptosystem, the receiver obtains the secret information by comparing all the decrypted pixel pairs. The disadvantage of Chen’s method is an inherent overflow problem. Subsequently, Shiu et al. [19] and Wu et al. [20] improved Chen’s method by solving the overflow problem. These algorithms are inseparable, which limits their application scenarios, whereas separable algorithms apply more widely: depending on its privileges, the receiver may only extract the additional data, only decrypt the image, or both decrypt the image and extract the additional data.
In 2016, Zhang et al. [21] first proposed a separable EIRDH-P algorithm, which combines wet paper code (WPC) [22] with Paillier homomorphic encryption. Zhang’s method vacates room by histogram shrinking and uses WPC to embed additional data. Xiang et al. [23] proposed an RDH-EI algorithm based on mirroring ciphertext groups (MCGs) by using the homomorphic and probabilistic properties of the Paillier cryptosystem. In summary, using the homomorphic properties of public key cryptography for RDH-EI is a current research hotspot, which we call reversible data hiding in homomorphic encrypted image (RDH-HEI).
This paper proposes an RDH-HEI scheme based on the elliptic curve ElGamal (EC-EG) cryptosystem. First, the cover image is segmented. Each square grid pixel group randomly selected by the image owner has one reference pixel and eight target pixels. The $n$ LSBs of the reference pixel and all bits of the target pixels are self-embedded into other parts of the image by prediction error expansion (PEE). To avoid overflow when embedding data, the $n$ LSBs of the reference pixel are reset to zero before encryption. Then, the pixel values of the image are encoded onto points of the elliptic curve and encrypted. The encrypted reference pixel replaces the encrypted target pixels surrounding it, thereby constructing the mirror central ciphertext (MCC). In a set of MCC, the data hider embeds the encrypted additional data into the $n$ LSBs of the target pixels by homomorphic addition in ciphertexts, while the reference pixel remains unchanged. The receiver can extract the additional data separately from cover-image recovery, extract the additional data without any errors, and recover the cover image completely.
The rest of this paper is organized as follows. Section 2 briefly describes the homomorphic EC-EG cryptosystem and its properties, and reviews prediction error expansion for reversible data hiding in the plain domain. The proposed RDH-HEI scheme is elaborated in Section 3. Simulation experiments and theoretical analysis are presented in Section 4. Finally, the conclusion is drawn in Section 5. It is important to note that a preliminary version [24] of this paper appeared in the proceedings of the “11th Intelligent Networking and Collaborative Systems (INCoS-2019)”.
2. Related Knowledge
2.1. Elliptic Curve
The graph of an elliptic curve is not an ellipse but a cubic curve in the plane. The name comes from the fact that such curves were first encountered in the study of the arc length of an ellipse.
Definition 1. An elliptic curve is a plane curve determined by the cubic Weierstrass equation
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6.$$
The points on the curve satisfy the equation. If the coefficients lie in a finite field $F_p$, the number of points on the curve is finite; all the points on the curve plus an artificial point at infinity $O$ constitute the set $E(F_p)$, and the number of points in $E(F_p)$ is recorded as $\#E(F_p)$.
When constructing a cryptosystem, the following elliptic curve is mainly used:
$$y^2 \equiv x^3 + ax + b \pmod{p}, \qquad 4a^3 + 27b^2 \not\equiv 0 \pmod{p}.$$
Figure 1 shows the distribution of points on the elliptic curve when
.
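Theorem 1. The set of points $E(F_p)$ on the elliptic curve forms an Abelian group under the addition defined below, where $O$ is the point at infinity and all arithmetic is carried out in $F_p$:
- 1. $O + O = O$.
- 2. $P + O = P$ for every $P \in E(F_p)$.
- 3. For every $P = (x, y) \in E(F_p)$, $(x, y) + (x, -y) = O$, that is, $-P = (x, -y)$.
- 4. (addition rule) Let $P = (x_1, y_1)$ and $Q = (x_2, y_2)$. If $x_1 \neq x_2$, then $P + Q = (x_3, y_3)$, where $\lambda = \dfrac{y_2 - y_1}{x_2 - x_1}$, $x_3 = \lambda^2 - x_1 - x_2$ and $y_3 = \lambda(x_1 - x_3) - y_1$.
- 5. (multiple point rule) Let $P = (x_1, y_1)$. If $y_1 \neq 0$, then $2P = (x_3, y_3)$, where $\lambda = \dfrac{3x_1^2 + a}{2y_1}$, $x_3 = \lambda^2 - 2x_1$ and $y_3 = \lambda(x_1 - x_3) - y_1$.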
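The addition and multiple point rules of Theorem 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the short form $y^2 = x^3 + ax + b$ over a prime field, and the toy parameters ($p = 17$, $a = 2$, $b = 2$), the sample point $(5, 1)$ and the names `on_curve` and `ec_add` are hypothetical choices for the example, not values taken from this paper.

```python
from typing import Optional, Tuple

# None stands for the point at infinity O.
Point = Optional[Tuple[int, int]]

# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative assumption only).
P_MOD, A, B = 17, 2, 2


def on_curve(pt: Point) -> bool:
    """Check that a point satisfies the curve equation (O always does)."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def ec_add(p1: Point, p2: Point) -> Point:
    """Add two points using the identity, inverse, addition and doubling rules."""
    if p1 is None:                      # O + Q = Q
        return p2
    if p2 is None:                      # P + O = P
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                     # P + (-P) = O, also covers y = 0 doubling
    if p1 == p2:                        # multiple point (doubling) rule
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:                               # addition rule, x1 != x2
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)
```

The modular inverse uses the three-argument form of `pow` with exponent `-1`, which requires Python 3.8 or later. A cryptographic implementation would use a standardized curve and constant-time arithmetic instead of this toy setting.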
2.2. Elliptic Curve Public Key Cryptography
The elliptic curve discrete logarithm problem (ECDLP) is the problem of determining $x$ from given points $P$ and $Q$: for $P, Q \in E(F_p)$, find the integer $x$ such that $Q = xP$ holds. The ECDLP is considered harder to solve than the discrete logarithm problem over a finite field of comparable size, so it offers higher security.
Definition 2. Suppose that $E$ is an elliptic curve and $G$ is a point on $E$. If there exists a smallest positive integer $n$ such that $nG = O$, then $n$ is said to be the order of the point $G$, where $O$ is the point at infinity.
Implementation of elliptic curve ElGamal (EC-EG):
Key generation: Select the base field $F_p$ and the elliptic curve $E(F_p)$, and encode the plaintext information $m$ to a point $P_m$ on the curve. Select a generator $G$ (base point) of $E(F_p)$ as the public parameter. The user selects a private key $k$ with $1 \le k \le n-1$, where $n$ is the order of $G$, and the public key is $K = kG$.
In cryptography, an elliptic curve over $F_p$ is commonly described by six parameters, $T = (p, a, b, G, n, h)$: $p$, $a$ and $b$ determine the elliptic curve, $G$ is the base point, $n$ is the order of $G$, and $h$ is the integer part of the number of all points on the elliptic curve divided by $n$. The choice of these parameters directly affects the security of encryption. The parameter values are generally required to meet the following conditions:
the larger $p$ is, the more secure the system but the slower the computation; a $p$ of about 200 bits meets general security requirements;
$p \neq n \times h$;
$p^t \not\equiv 1 \pmod{n}$ for $1 \le t < 20$;
$4a^3 + 27b^2 \not\equiv 0 \pmod{p}$;
$n$ is a prime number; and
$h \le 4$.
Encryption: Bob selects a random integer $r$ and calculates the ciphertext as
$$C = (C_1, C_2) = (rG,\; P_m + rK). \tag{6}$$
Decryption: Alice calculates $C_2 - kC_1 = P_m + rkG - krG = P_m$ to complete the decryption.
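Homomorphic addition: After encoding the plaintexts $m_1, m_2$ to the two points $P_{m_1}, P_{m_2}$ on $E(F_p)$, the ciphertexts $C_{m_1} = (r_1G,\; P_{m_1} + r_1K)$ and $C_{m_2} = (r_2G,\; P_{m_2} + r_2K)$ are calculated according to Equation (6) using random integers $r_1, r_2$, where $G$ is the base point and $K$ is the public key. According to $K = kG$ in Key generation, with $k$ the private key, the homomorphic addition can be expressed as
$$C_{m_1} + C_{m_2} = \big((r_1 + r_2)G,\; (P_{m_1} + P_{m_2}) + (r_1 + r_2)K\big). \tag{7}$$
Decrypting the right-hand side gives $(P_{m_1} + P_{m_2}) + (r_1 + r_2)kG - k(r_1 + r_2)G = P_{m_1} + P_{m_2}$. That is, the sum of the two ciphertexts decrypts to the sum of the two plaintext points.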
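The key generation, encryption, decryption and homomorphic addition steps above can be checked on a toy example. The sketch below is a minimal illustration, not the implementation used in this paper: the curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with base point $(5, 1)$ of order 19, the function names, and the fixed random integers passed by the caller are all assumptions made for the example. The plaintext $m$ is encoded as the point $mG$, which is one simple additive encoding assumed here for illustration; the encoding actually used by the scheme is described later in the paper.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]   # None stands for the point at infinity O
Cipher = Tuple[Point, Point]

# Toy parameters (illustrative assumptions, far too small for real security).
P_MOD, A = 17, 2
G: Point = (5, 1)   # base point on y^2 = x^3 + 2x + 2 over F_17
N = 19              # order of G


def ec_add(p1: Point, p2: Point) -> Point:
    """Point addition, including the identity, inverse and doubling cases."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)


def ec_neg(pt: Point) -> Point:
    """Inverse of a point: -(x, y) = (x, -y)."""
    return None if pt is None else (pt[0], (-pt[1]) % P_MOD)


def ec_mul(k: int, pt: Point) -> Point:
    """Scalar multiplication k * pt by double-and-add."""
    result: Point = None
    addend = pt
    while k > 0:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


def encrypt(pm: Point, pub: Point, r: int) -> Cipher:
    """Ciphertext (rG, Pm + rK) for message point pm and public key K = pub."""
    return (ec_mul(r, G), ec_add(pm, ec_mul(r, pub)))


def decrypt(c: Cipher, priv: int) -> Point:
    """Recover Pm = C2 - k * C1 with private key k = priv."""
    c1, c2 = c
    return ec_add(c2, ec_neg(ec_mul(priv, c1)))


def hom_add(ca: Cipher, cb: Cipher) -> Cipher:
    """Component-wise sum of two ciphertexts."""
    return (ec_add(ca[0], cb[0]), ec_add(ca[1], cb[1]))
```

For example, with private key 7, encrypting the points $3G$ and $5G$ and adding the two ciphertexts component-wise yields a ciphertext that decrypts to $8G$. In practice the integer `r` must be drawn fresh and at random for every encryption.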
2.3. Rhombus Pattern Based PEE
Rhombus prediction error expansion [25] divides the image pixels into overlapping pixel groups. The pixels of a group are further divided into a “×” set and a “•” set: pixels of the “×” set are used for data embedding, and pixels of the “•” set are used for prediction.
Figure 2 illustrates that a pixel $u_{i,j}$ in the “×” set is predicted by its four neighboring pixels $v_{i,j-1}$, $v_{i+1,j}$, $v_{i,j+1}$ and $v_{i-1,j}$ in the “•” set. The embedding, extraction and recovery procedures are given as follows.
2.3.1. The Embedding Procedure
The predicted value $u'_{i,j}$ is the average of the four adjacent pixel values of $u_{i,j}$ and is calculated by Equation (8).
$$u'_{i,j} = \left\lfloor \frac{v_{i,j-1} + v_{i+1,j} + v_{i,j+1} + v_{i-1,j}}{4} \right\rfloor \tag{8}$$
The prediction error $e_{i,j}$ is the difference between the original pixel $u_{i,j}$ and the predicted value $u'_{i,j}$ and is calculated by Equation (9).
$$e_{i,j} = u_{i,j} - u'_{i,j} \tag{9}$$
The difference expansion method [26] is employed to expand the prediction error $e_{i,j}$ for data embedding. Equation (10) is used for PEE.
$$E_{i,j} = 2e_{i,j} + b \tag{10}$$
where $E_{i,j}$ is the modified prediction error after data embedding and $b$ is one bit of the message to be embedded. Thus, each group can embed one bit of information. The modified pixel value $U_{i,j}$ after data embedding is calculated by Equation (11).
$$U_{i,j} = u'_{i,j} + E_{i,j} \tag{11}$$
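The embedding procedure can be traced on a single pixel. The following is a minimal sketch that assumes integer pixel values and a floor-rounded average of the four neighbors; the function names `predict` and `pee_embed` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def predict(neighbors: Sequence[int]) -> int:
    """Predicted value: floor of the average of the four neighboring pixels."""
    assert len(neighbors) == 4
    return sum(neighbors) // 4


def pee_embed(pixel: int, neighbors: Sequence[int], bit: int) -> int:
    """Embed one bit into a pixel by prediction error expansion."""
    pred = predict(neighbors)          # predicted value
    error = pixel - pred               # prediction error
    expanded = 2 * error + bit         # expanded error carrying the bit
    return pred + expanded             # modified pixel value
```

For a pixel of value 102 with neighbors 100, 101, 103 and 99, the predicted value is 100 and the prediction error is 2. Embedding the bit 1 expands the error to 5, so the modified pixel value is 105.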
2.3.2. Overflow or Underflow Processing
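The PEE data embedding may cause overflow or underflow. A location map (LM) can be used to avoid embedding into pixels that cause overflow or underflow. Substituting the prediction error $e_{i,j}$ from Equation (9) and the modified prediction error $E_{i,j}$ from Equation (10) into Equation (11), we get
$$U_{i,j} = u'_{i,j} + 2(u_{i,j} - u'_{i,j}) + b = 2u_{i,j} - u'_{i,j} + b, \tag{12}$$
where $u_{i,j}$ is the original pixel, $u'_{i,j}$ is its predicted value, $U_{i,j}$ is the modified pixel and $b$ is the embedded bit. According to Equation (12), it is clear that pixels that do not cause overflow or underflow satisfy the condition in Equation (13), written here for 8-bit pixel values.
$$0 \le 2u_{i,j} - u'_{i,j} + b \le 255 \tag{13}$$
The LM that records the positions of invalid groups can be created by pre-processing the original image. Each group of the original image is checked against Equation (13), and the row and column indices of the invalid groups are recorded as binary data. The LM is compressed and then self-embedded into the image.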
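The pre-processing check can be sketched as follows. This is an illustration under several assumptions that are not taken from the paper: pixel values are 8-bit (range 0 to 255), the embedding pixels are the interior positions where the row and column indices sum to an even number, a position is treated as invalid if either bit value would leave the valid range, and the result is returned as a plain list of coordinates rather than the compressed binary record described above. The names `is_embeddable` and `find_invalid_positions` are hypothetical.

```python
from typing import List, Tuple


def is_embeddable(pixel: int, pred: int) -> bool:
    """True if embedding either bit (0 or 1) keeps the result within [0, 255]."""
    base = 2 * pixel - pred            # modified pixel value when the bit is 0
    return base >= 0 and base + 1 <= 255


def find_invalid_positions(img: List[List[int]]) -> List[Tuple[int, int]]:
    """List the (row, column) positions that would overflow or underflow."""
    height, width = len(img), len(img[0])
    invalid = []
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            if (i + j) % 2 != 0:       # skip the pixels used only for prediction
                continue
            pred = (img[i - 1][j] + img[i + 1][j]
                    + img[i][j - 1] + img[i][j + 1]) // 4
            if not is_embeddable(img[i][j], pred):
                invalid.append((i, j))
    return invalid
```

A bright pixel surrounded by darker neighbors, such as a value of 255 with four neighbors of 100, fails the check because the expanded value would exceed 255, so its position is recorded and skipped during embedding.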
2.3.3. Extraction and Recovery Procedures
The extraction procedure is the inverse of the embedding procedure. Since the “•” pixels do not change, the receiver has the same predicted pixel value $u'_{i,j}$ of “×” pixels as the image owner. From the modified pixel value $U_{i,j}$, the modified prediction error $E_{i,j}$, which carries the embedded additional data, is calculated by Equation (14).
$$E_{i,j} = U_{i,j} - u'_{i,j} \tag{14}$$
Then, the embedded information is calculated by Equation (15).
$$b = E_{i,j} \bmod 2 \tag{15}$$
Equation (16) is used for calculating the original prediction error $e_{i,j}$.
$$e_{i,j} = \left\lfloor \frac{E_{i,j}}{2} \right\rfloor \tag{16}$$
Finally, the original pixel value $u_{i,j}$ is recovered as
$$u_{i,j} = u'_{i,j} + e_{i,j}. \tag{17}$$
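Extraction and recovery can be sketched for a single pixel in the same way. The sketch assumes that the embedding side expanded the prediction error as twice the error plus the bit and used a floor-rounded average of the four neighbors; the function name `pee_extract` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def pee_extract(marked: int, neighbors: Sequence[int]) -> Tuple[int, int]:
    """Return (embedded bit, original pixel) from a marked pixel."""
    pred = sum(neighbors) // 4         # same prediction as on the embedding side
    expanded = marked - pred           # modified prediction error
    bit = expanded % 2                 # Python's % is non-negative for modulus 2
    error = expanded // 2              # floor division restores the original error
    return bit, pred + error
```

With neighbors 100, 101, 103 and 99, a marked value of 105 gives a modified prediction error of 5, so the extracted bit is 1 and the recovered pixel is 102. Floor division and a non-negative remainder also handle negative prediction errors: a marked value of 97 yields the bit 1 and the original pixel 98.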