1. Introduction
The modification of digital multimedia content, especially images, has become easier, and thus image copyright protection has attracted more attention. Image watermarking technology aims to provide a reliable way to alleviate this intellectual property management problem. A robust watermarking method can protect the copyright of an image, and has two basic characteristics, namely, robustness and fidelity. Since these two characteristics are contradictory, a good robust watermarking method must balance robustness against fidelity.
Robust watermarking technology is divided into spatial-domain and frequency-domain methods. Compared with spatial-domain methods, frequency-domain watermarking methods can obtain much more robustness without a great amount of image distortion. Therefore, the present study focuses on image watermarking schemes in the frequency domain.
Many frequency-domain techniques have been presented for robust watermarking, such as the discrete wavelet transform [
1,
2], discrete Fourier transform [
3], discrete cosine transform [
4,
5], quaternion discrete Fourier transform [
6,
7,
8], and quaternion Hadamard transform [
9].
Guan et al. [
10] proposed a watermarking method that embedded a watermark into the two-level DCT coefficients using a specified technology. Li et al. [
11] developed a robust watermarking scheme based on the wavelet domain. Both methods rely on a single transform and therefore do not exploit inherent correlations in the frequency domain. Hybrid transform watermarking schemes can achieve better robustness and fidelity, and many image-watermarking techniques combining several transforms have been proposed [
12,
13,
14]. In [
12], a method combining SVD and the integer wavelet transform was presented to embed a watermark. Rastegar et al. [
13] proposed a mixed watermarking method based on SVD and FRAT. Lai and Tsai [
14] suggested a new image-watermarking method that blended a discrete wavelet transform and singular value decomposition. The method embedded a watermark in the singular values of the host image’s DWT sub-bands.
From the above discussion, most image watermarking methods embed a watermark in a gray image or a single channel. With the wide application of color images, watermarking schemes for color images have also been proposed [
2,
6,
8,
15,
16,
17,
18,
19,
20,
21]. Chou and Liu [
2] proposed a new color-image watermarking algorithm based on wavelet transform and significant difference, and embedded the maximum watermark information under imperceptible distortion. Chen et al. [
6] modulated at least one component of QDFT coefficients, and propagated the watermark to two or three RGB color channels. They used the characteristics of QDFT to avoid watermark energy loss. A color image-watermarking algorithm combining QDFT, LS-SVM, and pseudo-Zernike moments was proposed by Wang et al. [
8]. In [
8], quaternion Fourier transform allows watermark information energy to be propagated to all channels simultaneously to improve the robustness. Ma et al. [
15] developed a local quaternion Fourier transform for the color image-watermarking method. The method used properties of the quaternion Fourier transform to improve watermark invisibility, and considered an invariant feature transform to resist the geometric attacks of the image. Kais Rouis et al. [
16] proposed a method for image tampering detection, that has an underlying hashing process based on estimation of image gradient, and the performance of the method was compared to the use of QDFT method. Yang et al. [
17] introduced a robust digital watermarking algorithm for geometric correction using quaternion Exponential moments. Li et al. [
18] developed a color image-watermarking method based on QDFT and quaternion QR. The host image was decomposed by QDFT and quaternion QR, and a high-entropy block of the scalar part of the quaternion QR matrix was selected to embed the watermark.
Over the last decade, various image-watermarking schemes based on tensor decomposition have been proposed [
19,
20,
21]. Tensor decomposition can maintain the internal structure of the digital image and avoids the loss of important image information. Xu et al. [
19] proposed a new blind watermarking scheme for color images based on the tensor domain. The scheme effectively considers the overall characteristics of color images, and propagates the watermark information to the three channels of the color image through tensor decomposition. Feng et al. [
20] used Tucker decomposition to decompose the luminance component, and then used adaptive dot matrix quantization index modulation to embed the watermark in the tensor domain. Fang et al. [
21] proposed a watermarking scheme based on Tucker decomposition; this method transformed the multi-spectral image and embedded the watermark into the elements of the last frontal slice of the core tensor.
Among the above methods, some embed a watermark in a single transform domain [
6,
8,
15,
17,
19,
20]. Besides, in [
2], the method did not effectively account for the correlation among frequency components. In [
18], the scheme chose a high-entropy block to embed the watermark; such a block is unstable, which makes the watermark more vulnerable to attack. In short, none of these methods takes full advantage of the three-dimensional imaginary components of QDFT, and the above methods suffer from watermark energy loss [
6].
Based on [
18,
19], the present paper proposes a hybrid transform color image watermarking scheme based on QDFT and tensor decomposition. The scheme considers the overall color image channels to improve attack resistance and decentralizes the distribution of the watermark further, and then enhances robustness. Furthermore, an appropriate strength is used to embed the watermark that satisfies the two conflicting factors, robustness and fidelity. The main contributions of the paper are as follows:
This paper blends QDFT with tensor decomposition (TD) and implements overall processing for a color image to embed a watermark.
The scheme proposed in this paper synchronously spreads the watermark to three RGB channels and enhances robustness performance.
This paper analyses the correlation of the three imaginary components of QDFT, and uses these components to construct a tensor.
The rest of this paper is organized as follows. The relevant techniques are described in
Section 2. The embedding and extraction processes of watermarking are provided in
Section 3. The experimental part is provided in
Section 4. Finally, the paper is summarized in
Section 5.
2. Relevant Techniques
In this section, tensor decomposition, quaternion discrete Fourier transform, pseudo-Zernike moments, and multiple output LS-SVR are introduced.
2.1. Tensor Decomposition (TD)
Due to application requirements of high-order data, tensor decomposition (TD) is used as a tool to analyse high-order data. TD is a high-order extension of matrix decomposition in multi-linear algebra, and is an efficient technique used in many fields [
22,
23]. CANDECOMP/PARAFAC (CP) and Tucker decomposition are two particular ways to implement tensor decomposition; the well-known Tucker decomposition is commonly selected to implement TD.
Tucker decomposition can be considered a higher-order extension of the matrix singular value decomposition (SVD). The Tucker decomposition was introduced by Tucker [
24] and has been successfully applied to data dimensional reduction, feature extraction, tensor subspace learning, face image recognition [
25], data compression, image quality evaluation [
26], noise reduction [
27], and data analysis [
28]. In the present paper, Tucker decomposition is used to construct a watermark embedding domain.
When a third-order tensor T is decomposed by Tucker decomposition, three orthogonal factor matrices and a core tensor K are obtained
[
24].
Figure 1 shows Tucker decomposition of a third-order tensor
T.
Each element in the core tensor
K represents the degree of interaction between different slices. The Tucker decomposition [
22] is defined in Equation (
1).
For each element of the original tensor
T, the Tucker decomposition [
22] is expressed in Equation (
2).
where
P,
Q, and
R correspond to the number of column vectors of the factor matrices
,
, and
, respectively.
P,
Q, and
R are generally less than or equal to
M,
N, and
O, respectively. The symbol ‘∘’ represents the outer product between two vectors, and the symbol ‘[[ ]]’ is a concise representation of Tucker decomposition given in [
22]. The core tensor
K has the same dimension as tensor
T, and it is expressed in Equation (
3).
K has full orthogonality, that is, any two slices of the core tensor
K are orthogonal to each other, and the inner product between the two slices is zero.
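The decomposition and the orthogonality property above can be illustrated with a short NumPy sketch. It assumes the higher-order SVD (HOSVD) as the way of computing a full-rank Tucker decomposition, which the text does not specify; the function names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are hypothetical.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_product(t, mat, mode):
    """Multiply tensor `t` by matrix `mat` along axis `mode`."""
    moved = np.moveaxis(t, mode, 0)
    out = np.tensordot(mat, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(t):
    """Full-rank Tucker decomposition via HOSVD (an assumed algorithm choice).

    Returns the core tensor and one orthogonal factor matrix per mode.
    """
    factors = [np.linalg.svd(unfold(t, m), full_matrices=False)[0]
               for m in range(t.ndim)]
    core = t
    for m, u in enumerate(factors):
        core = mode_product(core, u.T, m)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the tensor from the core tensor and the factor matrices."""
    t = core
    for m, u in enumerate(factors):
        t = mode_product(t, u, m)
    return t

rng = np.random.default_rng(0)
T = rng.random((8, 8, 3))            # e.g. an 8 x 8 block with three slices
K, U = hosvd(T)
gram = unfold(K, 2) @ unfold(K, 2).T  # inner products between frontal slices
print(np.allclose(reconstruct(K, U), T))
print(np.allclose(gram - np.diag(np.diag(gram)), 0.0))
```

With this construction the off-diagonal entries of `gram` vanish up to rounding, which is the slice orthogonality described above.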
2.2. Quaternion Discrete Fourier Transform (QDFT)
Quaternions were introduced by Hamilton [
29] as a generalization of complex numbers. A quaternion [
30] is a kind of hyper-complex number with one real part and three imaginary parts, and is defined as follows:
where
,
,
, and
are real numbers,
i,
j, and
k are imaginary operators with the following properties:
where the ‘·’ is the cross product,
,
,
,
,
,
.
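As a small illustration of the multiplication rules of the imaginary operators, the sketch below implements the standard Hamilton product for quaternions stored as (real, i, j, k) arrays. The storage layout and the name `qmul` are assumptions made for this example.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k part
    ])

I = np.array([0.0, 1.0, 0.0, 0.0])
J = np.array([0.0, 0.0, 1.0, 0.0])
K = np.array([0.0, 0.0, 0.0, 1.0])

print(qmul(I, I))   # i*i is the real number -1
print(qmul(I, J))   # i*j equals k
print(qmul(J, I))   # j*i equals -k, so the product is not commutative
```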
Sangwine [
30] was the first to demonstrate formulations of quaternion discrete Fourier transform (QDFT). Because quaternion multiplication is not commutative, the QDFT is divided into three types, namely, left-way transform
, right-way transform
[
8], and hybrid transform
[
30]. The form of the left-way transform
is as follows:
where
is a color image of size
represented in the quaternion form as Equation (
8). The inverse
(
) [
8] is defined by,
In these definitions, the quaternion operator was generalized, and is any unit of pure quaternion, where . The operators i, j, and k are special cases of ; in this paper,
Color image pixels have three components, R, G, and B. Thus, they can be represented in quaternion form using a pure quaternion. For example, a pixel at coordinates
in a color-image can be represented as follows:
where
is the red component, and
and
are the green and blue components of a color image, respectively.
Using the Equations (6) and (8), we can obtain
is a real component,
,
, and
are the three imaginary components in Equation (
9).
The inverse
QDFT can be represented as follows:
where
is the real inverse quaternion discrete Fourier transform of array
P, and
is the
IQDFT.
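The left-way transform and its inverse can be sketched with ordinary FFTs. The sketch assumes a transform axis (i + j + k)/√3 and a 1/√(MN) normalization on both the forward and inverse transforms; neither choice is confirmed by the text here, and the names `qdft_left` and `iqdft_left` are hypothetical. For a pure-quaternion RGB input it returns the real component A and the three imaginary components, stacked so that the last axis holds (C, D, E).

```python
import numpy as np

MU = np.ones(3) / np.sqrt(3.0)   # assumed unit pure quaternion axis (i + j + k)/sqrt(3)

def qdft_left(rgb, mu=MU):
    """Left-way QDFT of the pure-quaternion image R*i + G*j + B*k.

    Returns A with shape (H, W) and the imaginary parts with shape (H, W, 3).
    """
    h, w, _ = rgb.shape
    spec = np.fft.fft2(rgb.astype(float), axes=(0, 1))
    cos_sum = spec.real            # sum of f * cos(theta) per channel
    sin_sum = -spec.imag           # sum of f * sin(theta) per channel
    scale = 1.0 / np.sqrt(h * w)
    mu_b = np.broadcast_to(mu, sin_sum.shape)
    a = scale * (sin_sum @ mu)                          # real part: mu . S
    v = scale * (cos_sum - np.cross(mu_b, sin_sum))     # vector part: C - mu x S
    return a, v

def iqdft_left(a, v, mu=MU):
    """Inverse of qdft_left; returns (real part, vector part)."""
    h, w = a.shape
    n = h * w
    ia = np.fft.ifft2(a) * n                   # sums of a * exp(+i*theta)
    iv = np.fft.ifft2(v, axes=(0, 1)) * n      # same, per imaginary component
    scale = 1.0 / np.sqrt(n)
    mu_b = np.broadcast_to(mu, v.shape)
    real = scale * (ia.real - iv.imag @ mu)
    vec = scale * (iv.real + ia.imag[..., None] * mu + np.cross(mu_b, iv.imag))
    return real, vec

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3)).astype(float)
A, CDE = qdft_left(img)
real, rgb_back = iqdft_left(A, CDE)
print(np.allclose(rgb_back, img), np.abs(real).max())
```

Under these assumptions the inverse returns the original channels in its vector part, and its real part is zero up to rounding for a pure-quaternion input.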
2.3. Pseudo-Zernike Moment
Pseudo-Zernike moments [
31] are effective orthogonal rotation-invariant moments and robust image feature descriptors. The moments have several characteristics: (1) Low redundancy of information expression. Since the basis of the pseudo-Zernike moments consists of orthogonal polynomials, the extracted features have small correlation and redundancy. (2) Effectiveness of information expression. The set of pseudo-Zernike moments provides a compact, fixed-length, and computationally efficient representation of the image content, and only a small fixed number of moments need to be stored to characterize the image content effectively. (3) Multilevel representation of information. Pseudo-Zernike moments effectively represent the contour of an image: the low-order and middle-order moments describe the overall shape of an image, while the high-order moments describe its details. The pseudo-Zernike moments [
32] of order
n with repetition
m for a 2-d continuous function
are expressed as follows:
where
is a complex conjugate of
and
n is a non-negative integer, and
m is a positive or negative integer such that
. The variables
x and
y are such that
,
,
. Pseudo-Zernike polynomials [
32]
of order
n with repetitions
m are expressed as follows:
where
. The pseudo-Zernike radial polynomial [
32]
is defined as follows:
When
is an image of size
, the pseudo-Zernike moments [
33] are defined as follows:
where
is the number of pixels in an image that are mapped into the unit circle.
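The discrete moment can be sketched as follows. The sketch uses the standard pseudo-Zernike radial polynomial and assumes that pixel centres are mapped symmetrically into the square [-1, 1] × [-1, 1], that pixels outside the unit circle are discarded, and that the sum is normalized by (n + 1) divided by the number of pixels inside the circle. The names `radial_poly` and `pseudo_zernike_moment` are hypothetical.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Standard pseudo-Zernike radial polynomial of order n, repetition m."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range(n - m + 1):
        coef = ((-1) ** s * factorial(2 * n + 1 - s)
                / (factorial(s) * factorial(n + m + 1 - s) * factorial(n - m - s)))
        out += coef * rho ** (n - s)
    return out

def pseudo_zernike_moment(img, n, m):
    """Discrete pseudo-Zernike moment of a grayscale image (assumed mapping)."""
    h, w = img.shape
    y = (2 * np.arange(h) + 1 - h) / h        # pixel centres mapped to (-1, 1)
    x = (2 * np.arange(w) + 1 - w) / w
    xx, yy = np.meshgrid(x, y)
    rho = np.hypot(xx, yy)
    theta = np.arctan2(yy, xx)
    inside = rho <= 1.0                        # keep pixels inside the unit circle
    basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    lam = inside.sum()                         # number of pixels in the unit circle
    return (n + 1) / lam * np.sum(img[inside] * basis_conj[inside])

rng = np.random.default_rng(0)
img = rng.random((16, 16))
for n, m in [(0, 0), (2, 2), (4, 4)]:
    original = abs(pseudo_zernike_moment(img, n, m))
    rotated = abs(pseudo_zernike_moment(np.rot90(img), n, m))
    print(n, m, np.isclose(original, rotated))
```

The loop checks that the moment magnitude is unchanged by a 90 degree rotation, which maps the assumed pixel grid onto itself.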
Figure 2 shows the information expression of pseudo-Zernike moments for an image. It can be seen from the figure that the low-order moments of pseudo-Zernike moments can be used to construct the contour of the image.
Considering global geometric distortions, we select six low-order pseudo-Zernike moments , including , , , , , and to reflect the global information of a digital image. The pseudo-Zernike moments are calculated as parameters to correct the geometric attack in the process of watermark extraction.
2.4. Multiple Output LS-SVR
Xu et al. [
34] proposed the MLS–SVR network. Multiple output regression aims to learn the mapping from a multiple input feature space to a multiple output space. Although the standard formula of least squares support vector regression (LS-SVR) has potential practicality, it cannot handle multiple output situations. Multiple independent LS-SVRs are usually trained, thereby ignoring the potential (potentially nonlinear) cross-correlation between different outputs. To solve this problem, Xu et al. [
34] used the multi-task learning method to propose a new machine learning network. The multiple outputs function
is
where
is the sample,
is Lagrange multiplier,
is the kernel function,
b is parameter of the model, and
b∈
,
m is the number of output parameters,
l is the number of
b,
is positive real regularized parameter,
,
is the replicate matrix function (repmat), B = repmat (A,n) returns an array containing n copies of A in the row and column dimensions. The size of B is size(A)*n when A is a matrix.
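For readers working outside MATLAB, the repmat behaviour described above can be reproduced with NumPy tiling. This is only an illustrative equivalent; the wrapper name `repmat` is chosen here to mirror the text.

```python
import numpy as np

def repmat(a, n):
    """Tile matrix `a` n times along both the row and column dimensions."""
    return np.tile(np.atleast_2d(a), (n, n))

A = np.array([[1, 2],
              [3, 4]])
B = repmat(A, 2)
print(B.shape)   # each dimension of A is multiplied by n
print(B)
```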
In this paper, the above-mentioned machine learning model is used for geometric correction. The inputs of this model are six low-order pseudo-Zernike moment features [
31], and the outputs of this model are parameters of geometric distortion.
3. Watermarking in Tensor Domain
To enhance the robustness of the color image watermarking scheme, this paper blends QDFT and TD to embed a watermark. QDFT considers the correlation among color image channels. Tensor decomposition fully utilizes the correlation among frequency components and scatters the watermark further across them, which improves the robustness of the watermarking scheme. Because the scheme utilizes the overall characteristics of the three RGB channels, it provides better embedding performance than embedding in a single channel or in each channel separately, and is therefore more appropriate for color image watermarking.
QDFT can process the three channels of the color image as a whole instead of processing them individually, thus avoiding unnecessary distortion and utilizing the inherent correlations among the three channels of the color image. The three imaginary components C, D, and E also have a strong correlation. Hence, three components can be used to construct a tensor T. Figure 9 shows three imaginary components C, D, and E.
Tucker decomposition can maintain the internal structural relationship of an image. The core tensor obtained by Tucker decomposition represents the main properties of each slice of the original tensor and reflects the correlation among the slices. The core tensor
K is a compressed version of the original tensor
T.
Figure 3 shows the Tucker decomposition flowchart.
We can use the method in the article [
19] to embed the watermark in the core tensor
K. The maximum value of the core tensor is located in the upper-left corner, in the
position, as shown in
Figure 3. The position is robust when the image has experienced various attacks. Therefore, we modify the
coefficient to embed the watermark. Then, we show the three slices of the core tensor
K, which is shown in
Figure 4. The brighter part in
Figure 4 corresponds to a larger value of magnitude. It can be clearly seen that
is larger than the values at the other positions.
The rest of this section covers three topics: correlation analysis among the three imaginary components of QDFT, the watermark embedding procedure, and the watermark extraction procedure.
3.1. Correlation Analysis among Components of QDFT
A color image is decomposed by QDFT to obtain four frequency components, including a real component A and three imaginary components C, D, and E. The three imaginary frequency components have a strong correlation. This part examines the correlation among them.
Based on the analysis of its theory, the relationship of the three-dimensional frequency components is proved. Most images have close correlation among the three channels in the RGB color space. The color channels are derived from the same physical model, which determines that images not only have similarity among adjacent pixels, but also have close correlation among the color channels of each pixel [
35,
36]. When one channel of the color image is replaced by another, for example using red, green, and green in place of red, green, and blue, the reconstructed image remains clear and shows no blur distortion. This shows that color images have similarities among adjacent pixels and that the three channels of each pixel are closely correlated. Furthermore, the differences between pairs of color channels are almost the same or very close; the results are shown in
Figure 5.
C,
D, and
E all contain the red, green, and blue channels, combined with different coefficients. We substitute Equation (
8) into Equation (
6) as follows:
where
= a,
, and
i,
j, and
k are all orthogonal to each other.
On the other hand, the correlation of the three-dimensional imaginary components is proved by data distribution characteristics. We randomly select image block
of size 16 × 16 in Lena.
Table 1 shows the statistical characteristics of the RGB color space and QDFT frequency space for
. Then, QDFT transformation operates on the image block
. The distribution of
C,
D, and
E are similar; the results are shown in
Figure 6. The
is the
r column value of C component,
is
r column value of D component, and
is
r column value of E component, where
. It can be found from
Table 1 that the maximum value of C is 57,738; as shown in
Figure 6a, the maximum value of the first column is also 57,738. D and E can be analysed similarly from
Table 1. Furthermore, the results indicate that the correlation among C, D, and E does not change with the size of the image.
From the above analysis, the three imaginary components C, D, and E have a strong correlation. Therefore, we can construct a tensor using C, D, and E.
3.2. Procedures of Watermark Embedding
This part mainly introduces the specific process of embedding.
Figure 7 shows a flowchart of watermark embedding. The embedding process of watermark information is as follows.
Obtain a color-image with dimensions of , and divide the into non-overlapping blocks of size . The numbers of the blocks are .
Construct a pure quaternion Fourier
using RGB channels of the color image block size of
, and perform QDFT on the each block to obtain
,
,
, and
by Equation (
6).
Use the three Fourier frequency components , , and of each block to construct a third-order tensor T.
Operate Tucker decomposition on each tensor T to obtain core tensor K, and the numbers of K are .
Perform logistic mapping on all the core tensor K blocks, a bit of the watermark is embedded in of each core tensor, and the odd–even quantization embedding technique is defined as follows:
if
,
else
,
where
Q is the quantization step, that is, the watermark embedding strength,
is the rounding operation, and
is the modulo operation.
The value of is made up of positive and negative numbers. If the traditional odd–even quantization watermarking rule is used, the error rate is relatively high. When , the traditional rule is , an error occurs when extracting the watermark. For example, , , , , and . When extracting the watermark, , , and . This result is inconsistent with when embedding. Hence, the paper replaces 0.5 with 0.6 to avoid this error.
Perform inverse logistic mapping on all the core tensor
blocks with the watermark, and then obtain tensor
using Equation (
1).
Obtain the three imaginary components
,
, and
from
using frontal slice way in
Figure 7, and then construct
.
Perform inverse QDFT transformation on
by using Equation (
7) to obtain
. Finally, construct a watermarked color image using
,
, and
, that is, the three RGB channels with the watermark.
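The block handling in the first and last steps above can be sketched as follows. The block size used in the demonstration is arbitrary because the value is not given here, and the helper names `split_blocks` and `merge_blocks` are hypothetical.

```python
import numpy as np

def split_blocks(img, b):
    """Split an (H, W, C) image into non-overlapping b x b blocks.

    Returns an array of shape (H//b * W//b, b, b, C), in row-major block order.
    """
    h, w, c = img.shape
    if h % b or w % b:
        raise ValueError("image size must be a multiple of the block size")
    return (img.reshape(h // b, b, w // b, b, c)
               .swapaxes(1, 2)
               .reshape(-1, b, b, c))

def merge_blocks(blocks, h, w):
    """Inverse of split_blocks: reassemble the blocks into an (H, W, C) image."""
    b, c = blocks.shape[1], blocks.shape[3]
    return (blocks.reshape(h // b, w // b, b, b, c)
                  .swapaxes(1, 2)
                  .reshape(h, w, c))

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))
blocks = split_blocks(image, 8)          # block size 8 is an arbitrary choice
print(blocks.shape)
print(np.array_equal(merge_blocks(blocks, 32, 32), image))
```

Each block would then be transformed, assembled into a tensor, and modified independently before the watermarked blocks are merged back.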
3.3. Procedures of Watermark Extraction
This part mainly introduces the specific procedures of watermark extraction, as shown in
Figure 8. The extracting process of the watermark is as follows. The watermarked image is geometrically rectified before the watermark is extracted. The technique of geometric correction can improve the watermark correct extraction rate, as shown in Table 8.
Obtain the Zernike moment of the watermarked image with size , the six-order features of Zernike moments as the input of trained machine learning network MLS–SVR to correct geometric distortion, and the corrected watermarked image is obtained.
Divide the corrected watermarking image into blocks, with a block size of , the numbers of the blocks are .
Construct a pure quaternion using the three RGB channels of the color image block. We can obtain a real component and three imaginary components , and of each color block by QDFT.
Construct a third-order tensor with dimensions of using , , and of each color block.
Operate Tucker decomposition on and then the core tensor is obtained.
Perform logistic mapping for all core tensor blocks, and then, the odd–even quantization technique is used to extract a bit watermark in position of each , the specific extraction rules are as follows:
=
,
,
where ‘
’ denotes the absolute value function (abs).
Obtain complete watermark through the odd–even quantization rule.
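The idea of odd–even quantization can be illustrated with a generic sketch. It is not the exact rule used in this paper, whose equations (including the rounding operation and the 0.6 offset) are not reproduced here: the sketch uses a floor-based variant that places the coefficient at the centre of a quantization cell whose index parity equals the watermark bit. The names `embed_bit` and `extract_bit` are hypothetical, and the step value is taken from the embedding strength reported in Section 4.3.

```python
import math

def embed_bit(coeff, bit, step):
    """Move `coeff` to the centre of a cell whose index parity equals `bit`."""
    q = math.floor(coeff / step)
    if q % 2 != bit:
        q += 1
    return (q + 0.5) * step

def extract_bit(coeff, step):
    """Recover the bit as the parity of the quantization cell index."""
    return math.floor(coeff / step) % 2

step = 1160.0
for coeff in (57738.0, -5000.3):          # positive and negative coefficients
    for bit in (0, 1):
        marked = embed_bit(coeff, bit, step)
        attacked = marked + 0.4 * step    # perturbation smaller than step / 2
        print(coeff, bit, extract_bit(marked, step), extract_bit(attacked, step))
```

Because floor division and the modulo operator in Python handle negative values consistently, this variant behaves the same for positive and negative coefficients, and the bit survives any perturbation smaller than half the step.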
4. Experimental Results and Discussions
This paper uses the peak signal to noise ratio
[
37], normalized correlation coefficient
[
9], and bit error rate
[
19] to evaluate the visibility and robustness of the watermarking scheme.
is used to describe the fidelity performance, and
is used to describe the watermarking robustness.
[
19] is the mean square error of the data, which is expressed below:
The
is defined as follows:
where
is the host image,
is the watermarked image. In addition, the bit error rate (BER) and normalized correlation (NC) are used to evaluate the watermark’s robustness. BER and NC are defined as follows:
where
is the original watermark,
is the extracted watermark.
is a watermark of size
.
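The three metrics can be sketched in NumPy as follows. The sketch assumes 8-bit images (peak value 255), an MSE averaged over all pixels and channels, and the common cosine-style form of NC; the exact definitions used in this paper may differ in detail. The function names are hypothetical.

```python
import numpy as np

def psnr(host, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB, with MSE taken over all samples."""
    mse = np.mean((host.astype(float) - marked.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ber(w, w_ext):
    """Bit error rate: fraction of watermark bits that differ."""
    return float(np.mean(np.asarray(w) != np.asarray(w_ext)))

def nc(w, w_ext):
    """Normalized correlation between original and extracted watermark."""
    a = np.asarray(w, dtype=float).ravel()
    b = np.asarray(w_ext, dtype=float).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

host = np.full((4, 4, 3), 100, dtype=np.uint8)
marked = host + 1                       # every sample differs by one level
w = np.array([1, 0, 1, 1])
w_ext = np.array([1, 1, 1, 0])
print(psnr(host, marked), ber(w, w_ext), nc(w, w))
```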
This section illustrates the performance of the scheme through a series of experiments, and only representative experimental results are given herein. The five parts include the QDFT transform and inverse QDFT transform, the geometric expression of pseudo-Zernike, optimal watermark strength, comparing the scheme with the existing schemes, and the forecasting performance of the MLS–SVR network.
4.1. QDFT Analysis
A color image can be transformed into four real-valued components
A,
C,
D, and
E using a QDFT real transform.
Figure 9 shows the
color image and its red, green, and blue channels. The results of the
color image transformed by the quaternion discrete Fourier transform are shown in
Figure 10.
After inverse QDFT transform using Equation (
7),
is negligible and can be approximately regarded as 0; this result also conforms to Equation (
8). When the input is a pure quaternion, the result of
can also be approximately regarded as a pure quaternion. When reconstructing the image,
,
, and
are used as the red
, green
, and blue
channels, respectively. The experiment verifies that the difference between the reconstructed image and original image is approximately
. The difference is very small, so the image can be almost completely restored. The reconstructed red channel
, green channel
, and blue channel
are shown in
Figure 11.
4.2. Geometric Characteristics of Pseudo-Zernike Moments
An image
I with size
is selected to obtain the Zernike moment feature [
31]. The two parameters
n and
m of pseudo-Zernike moments are the order and repetition of the orthogonal polynomial. The values of (
n,
m) are (0, 0), (2, 2), (4, 4), (8, 8), (9, 9), and (11, 11). Three kinds of attacks are performed on the image
I, including translation, scaling, and rotation. Specifically, five pseudo-Zernike moment features of images are shown in
Figure 12, including the original image
, the image shifted twenty pixels to the left
, the image magnified two times
, the image rotated thirty degrees counter-clockwise
, and the image rotated thirty degrees clockwise
.
When the image is subjected to different geometric attacks, the differences in pseudo-Zernike moments are relatively obvious. Hence, pseudo-Zernike moments can effectively represent the global geometric features of the image.
4.3. Choose Watermark Embedding Strength
To balance robustness and fidelity, this part discusses the embedding strength Q. We set the watermark embedding strength .
Figure 13 shows that, as the value of
Q increases, PSNR decreases and NC increases, indicating that the robustness of the watermark improves whereas the image quality deteriorates. When the value of
Q reaches 410, NC is close to 1, and the watermark can be completely extracted when no attack is applied. To balance robustness and fidelity, we set
Q = 1160, which gives PSNR = 40.413.
Figure 14 shows the PSNR of the eight watermarked images, consisting of “Lena”, “Castle”, “Baboon”, “Barbara”, “Boats”, “Fruit”, “Airplane”, “Houses”, and a watermark
.
It can be seen from
Figure 14 that
is larger than 40, which indicates that our scheme has good fidelity.
4.4. Comparison with Existing Schemes
To further describe the performance of the proposed color image watermarking scheme, we compare the proposed scheme with existing schemes [
2,
6,
8,
18,
19]. The results are shown in
Table 2,
Table 3 and
Table 4. Considering that the QDFT and TD hybrid transform allow the watermark energy to propagate synchronously in the three color image channels, when a channel is replaced by another channel of a color image, the watermark can still be extracted. Hence, we test the effect of re-composition for RGB channels, which is regarded as a special attack in this paper. The specific experimental results are shown in
Table 5. In addition, this part conducts attack experiments covering noise, filtering, geometric, compression, and blurring attacks. The proposed scheme is robust against noise, filtering, compression, blurring, and geometric attacks, and effectively resists color channel exchange attacks.
4.5. Forecasting Performance of MLS–SVR
To train the MLS–SVR model, we use the six-order features of pseudo-Zernike moments as the input parameters [
38,
39], and the scaling, rotation, and translation parameters of the image subjected to geometric attack as output parameters. This experiment includes 114 training and 30 test samples. The training prediction errors of scaling, rotation, and translation are 0.0069, 0.0052, and 0.0066, respectively.
Table 6 shows pseudo-Zernike moments of five random images from training samples. The forecasting results of MLS–SVR are shown in
Table 7.
The experimental results show that the prediction accuracy of the MLS–SVR network remains relatively high. The corrected watermark image can improve the accuracy of watermark extraction. When the watermarked image is subjected to rotation, translation, and scaling attacks with correction, the watermark extraction bit error rate is shown in
Table 8. It can be seen from
Table 8 that
is very small, which indicates that the watermark can be almost completely extracted after correction.