Robust Reversible Watermarking Scheme in Video Compression Domain Based on Multi-Layer Embedding

Meng, Yifei; Niu, Ke; Zhang, Yingnan; Liang, Yucheng; Hu, Fangmeng

doi:10.3390/electronics13183734

by Yifei Meng ^1,2, Ke Niu ^1,2,*, Yingnan Zhang ^1,2, Yucheng Liang ^1,2 and Fangmeng Hu ^1,2

¹ Key Laboratory of Network and Information Security, The Chinese People Armed Police Force (PAP), Xi’an 710086, China

² College of Cryptography Engineering, Engineering University of PAP, Xi’an 710086, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(18), 3734; https://doi.org/10.3390/electronics13183734

Submission received: 30 July 2024 / Revised: 13 September 2024 / Accepted: 18 September 2024 / Published: 20 September 2024

(This article belongs to the Special Issue Advances in Algorithm Optimization and Computational Intelligence)

Abstract

Most existing research on video watermarking schemes focuses on improving the robustness of the watermark. However, in application scenarios such as judicial forensics and telemedicine, the distortion that watermark embedding causes to the original video is unacceptable. To solve this problem, this paper proposes a robust reversible watermarking (RRW) scheme based on multi-layer embedding in the video compression domain. Firstly, the watermarking data are divided into several sub-secrets by using Shamir’s (t, n)-threshold secret sharing. After that, the chroma sub-blocks with more complex texture information are selected in the I-frame of each group of pictures (GOP), and the sub-secret is embedded in that frame by modifying the discrete cosine transform (DCT) coefficients within the sub-blocks. Finally, the auxiliary information required to recover the coefficients is embedded into the motion vectors of the P-frames of each GOP by a reversible steganography algorithm. In the absence of an attack, the receiver can recover the DCT coefficients by extracting the auxiliary information from the motion vectors, ultimately recovering the original video correctly. The watermark remains robust even under malicious attacks such as recompression and requantization. The experimental results demonstrate that the proposed scheme is reversible and achieves high visual quality. Moreover, the scheme surpasses comparable methods in the robustness tests.

Keywords:

video watermarking; robust reversible watermarking; H.264/AVC; DCT coefficient; motion vector

1. Introduction

The rapid proliferation of video applications has significantly increased the reliance on video as a primary mode of interpersonal information exchange. Video files, which seamlessly integrate the strengths of both images and audio, possess the capability to convey substantial amounts of information within a constrained timeframe, making them an ideal medium for information transmission. However, concerns over copyright infringement and information security in videos have gained prominence in recent times. Video watermarking technology addresses these issues by embedding watermark information into videos [1,2,3,4,5,6,7,8], thereby providing both copyright protection and integrity authentication. Consequently, this technology has emerged as a crucial solution to contemporary video information security challenges.

To mitigate the risk of watermark loss during transmission due to malicious or unintentional attacks, researchers have focused on enhancing the robustness of video watermarking techniques [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]. These techniques are typically categorized into two groups, depending on where the watermark is embedded: frame-content-based schemes [9,10,11,12,13,14,15] and compressed bitstream-based approaches [16,17,18,19,20,21,22,23,24]. Frame-content-based schemes treat video sequences as continuous image sets and adapt existing image watermarking algorithms to video frames by modifying pixel values or modulating transform domain coefficients. Dhall et al. [15] proposed a multilayered watermarking mechanism that compresses and quantum encrypts electronic patient records, and then embeds them into medical images after a lifting wavelet transform (LWT). The various planes of the watermarked medical images are scrambled and compressed, and then embedded into different bands of the reference image after LWT. This algorithm achieves a high level of security. However, these schemes often struggle to balance robustness and complexity.

To better suit real-time applications while maintaining robustness, recent advancements have included video watermarking schemes integrated with the video’s coding process, which modify coding process artifacts to embed watermark information, offering improved real-time performance and broader applicability. Chan et al. [24] proposed four hypothetical probability estimation methods designed to enhance entropy coding through Context-based Adaptive Binary Arithmetic Coding (CABAC). These methods have been integrated into the ongoing development of the next-generation video coding standard, known as Versatile Video Coding (VVC). By leveraging improved probability estimation, this approach aims to boost compression efficiency. The adaptability of the various parameters allows the estimator to accommodate environments with differing adaptation rates, thereby enhancing its versatility. Yang et al. [16] proposed a video watermarking scheme based on discrete cosine transform (DCT) coefficients that embeds watermark information by modifying the highest frequency DCT coefficients. The watermark, encoded with a BCH code to reduce the bit error rate (BER), is embedded into the U-channel of video frames using parity quantization. Synchronization codes are employed to locate the watermark after cropping. While this scheme achieves superior imperceptibility, its effectiveness against recompression attacks varies due to the limited error correction capability of BCH codes. Fan et al. [17] introduced a video watermarking algorithm based on H.264/AVC that selects suitable chroma sub-blocks based on non-zero quantization coefficients and energy factors, employing optimized modulation to embed watermarks into quantized DCT coefficients. It demonstrates good invisibility and compression resistance; however, the irreversible watermark embedding algorithm may cause permanent alterations to the original video.

For copyright protection and recovery of the original image, robust reversible watermarking (RRW) has garnered significant attention from researchers. The currently widely used reversible robust watermarking schemes can be categorized into two types: Generalized Histogram Shifting (GHS) and Multilayer Watermarking (MLW). However, the GHS scheme has limited applicability due to its requirement for additional channels to transmit auxiliary information. As shown in Figure 1, Coltuc initially proposed the MLW framework in the literature [6], which employs two distinct phases for watermark embedding: robust embedding and reversible embedding. The distortion introduced during robust embedding forms the basis for the information embedded during reversible embedding, enabling the extraction of the robust watermark and recovery of the original image even if the reversible watermark is compromised post-attack.

However, Coltuc’s approach utilizes the same embedding domain for both robust and reversible embedding phases, resulting in robust watermarking being affected by both embedding noise and attack noise. Furthermore, reversible watermarking necessitates embedding a large amount of information, and embedding it into the same embedding domain as robust watermarking leads to significant distortion of the embedding vector. Wang Xiang et al. [25] proposed a robust reversible watermarking scheme that relies on independent embedding domains. This scheme involves generating two separate embedding domains through an independent embedding domain transformation process. The robust watermark and the reversible watermark are then embedded into these two domains, respectively. Experimental results indicate that this method prevents mutual interference between the two watermarks, enhances the robustness of watermark embedding, and enables the restoration of the original carrier image in the absence of any attacks.

In this paper, we enhance the MLW framework and apply it to robust reversible video watermarking. Initially, the watermarking data are divided into several sub-secrets using Shamir’s (t, n)-threshold secret sharing. These sub-secrets are then embedded by modifying the quantized DCT (QDCT) coefficients of the I-frame chroma sub-blocks of each group of pictures (GOP) in the video frame sequence. Subsequently, the auxiliary information necessary for recovering the coefficients is embedded as a reversible watermark in the motion vectors of the P-frames that follow each I-frame. By embedding the robust and reversible watermarks into different domains of the compressed video, the noise that reversible embedding imposes on the robust watermark is mitigated, resulting in improved visual quality of the watermarked video. Experimental results demonstrate that the scheme surpasses similar schemes in terms of robustness and invisibility.

2. Preliminary Concepts

2.1. H.264/AVC Video Compression Standard

The H.264/AVC video coding standard, developed jointly by the Joint Video Team (JVT) of the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG), is currently one of the most prevalent video coding standards. H.264 employs predictive coding, leveraging either intra-frame or inter-frame prediction for the current coded frame, and organizes the sequence of video frames into GOPs. The initial frame of each GOP is designated as an I-frame, which contains the complete picture information and uses only intra-frame prediction, so it can be decoded on its own. Conversely, P-frames, which constitute the remainder of the GOP, use previously coded video frames as references for inter-frame prediction. The prediction residuals and the motion vectors are subsequently encoded into the video stream. Figure 2 illustrates the coding flow of a video frame using the H.264 standard.

2.2. Reference Frame Conversion Technique

One of the features of the H.264/AVC video codec standard is the multiple reference frame technique. When a video frame is decoded, it is stored in the decoded frame buffer, and some of the decoded frames are saved in the reference frame list. These decoded frames are used as reference frames for inter-frame prediction when subsequent frames are encoded. When motion vectors are used as carriers for reversible steganography, modifying them can cause errors to accumulate in subsequent frames. Take steganography in P-frame motion vectors as an example: the encoding of a P-frame depends on the decoded previous frame, so if the motion vectors of the previous frame have been modified, the resulting error is further amplified when the current frame is encoded. To mitigate this issue, and leveraging the fact that H.264/AVC can flexibly select a reference frame, one frame at a fixed interval in the frame sequence is designated as a specific frame. As depicted in Figure 3, each specific frame is encoded with reference only to the previously encoded specific frame, and no secret information is embedded in it, so its motion vectors remain unchanged. Meanwhile, the ordinary frames between adjacent specific frames use the preceding specific frame as their reference frame. This confines the accumulation of motion vector distortion caused by embedding to the short span between two specific frames, ultimately reducing the impact of distortion accumulation.
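The reference assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reference_frames`, the parameter `interval`, and the convention that frame 0 is the first specific frame are hypothetical and do not come from the scheme or from any encoder API.

```python
from typing import Dict, Optional


def reference_frames(num_frames: int, interval: int) -> Dict[int, Optional[int]]:
    """Map each frame index to the index of the frame it references.

    Assumption: frame 0 is the first specific frame and every
    `interval`-th frame after it is also a specific frame.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    refs: Dict[int, Optional[int]] = {}
    for idx in range(num_frames):
        if idx == 0:
            refs[idx] = None  # first frame has no reference
        elif idx % interval == 0:
            refs[idx] = idx - interval  # specific frame -> previous specific frame
        else:
            refs[idx] = (idx // interval) * interval  # ordinary frame -> preceding specific frame
    return refs


def carries_payload(idx: int, interval: int) -> bool:
    """Only ordinary frames carry embedded bits; specific frames stay untouched."""
    return idx % interval != 0


refs = reference_frames(9, 4)
```

With an interval of 4, frames 1 to 3 reference frame 0 and frames 5 to 7 reference frame 4, so a modified motion vector in frame 2 never feeds into the prediction of frame 5.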

2.3. QP Value under Recompression Attack

Video watermarks are frequently embedded into video streams and stored and transmitted alongside video files. In practical applications, video files often need to be distributed and shared across various users and network platforms. Due to limited storage resources on local and network servers, as well as bandwidth constraints on the transmission network, video files often require recompression to meet storage and transmission requirements in different scenarios. To achieve a smaller file size, as shown in Figure 4a, users frequently increase the quantization parameter (QP) during video compression. However, altering the QP directly affects parameters such as transform coefficients and prediction modes during video coding, ultimately affecting the video watermark. As depicted in Figure 4b, changes in the QP value result in corresponding changes in the DCT coefficients. Therefore, resistance to recompression and QP-change attacks is a crucial indicator of watermark robustness.
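To make the effect of a QP change on quantized coefficients concrete, the following sketch requantizes a few coefficient levels. It relies on the well-known H.264 property that the quantization step doubles for every increase of 6 in QP. It is a simplification under stated assumptions: plain scalar quantization with round-half-away-from-zero is used, and the integer-transform scaling and dead-zone rounding offsets of a real encoder are ignored. The function names are hypothetical.

```python
import math

# Quantization step sizes for QP 0..5 in H.264; the step doubles every 6 QP.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp: int) -> float:
    """Quantization step size for a QP in the H.264 range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


def quantize(coeff: float, qp: int) -> int:
    """Simplified scalar quantizer (assumption: no dead zone, round half away from zero)."""
    level = math.floor(abs(coeff) / qstep(qp) + 0.5)
    return level if coeff >= 0 else -level


def dequantize(level: int, qp: int) -> float:
    return level * qstep(qp)


def requantize(level: int, qp_old: int, qp_new: int) -> int:
    """Level that results when a coefficient coded at qp_old is recompressed at qp_new."""
    return quantize(dequantize(level, qp_old), qp_new)


levels_qp28 = [5, 3, 1, -1]
levels_qp40 = [requantize(level, 28, 40) for level in levels_qp28]
```

In this simplified model, raising QP from 28 to 40 quadruples the step size, so small non-zero levels collapse to zero. This is why a watermark carried by small coefficient changes is vulnerable to requantization.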

3. Proposed Scheme

In the traditional RRW framework, the robust watermark and the reversible watermark are embedded in the same embedding domain. As a result, embedding the reversible watermark inevitably affects the robustness of the scheme. Additionally, embedding excessive information in the same domain significantly diminishes the visual quality of the watermarked video, thereby reducing the covertness of the watermarking scheme. Liang et al. [18] introduced a robust and reversible video watermarking technique that relies on multi-domain embedding. This approach treats the high-frequency and low-frequency coefficients produced during U-channel encoding of the embedded frame as distinct embedding domains for the robust and reversible watermarks, respectively. Embedding the two types of watermark into separate domains minimizes interference between them, and the method is more robust than similar algorithms. Similarly, Chen et al. [23] created distinct embedding domains by embedding the two types of watermark into I-frames and P-frames separately, which also improves watermark robustness. These findings indicate that embedding the reversible watermark into a different embedding domain than the robust watermark avoids the interference that reversible embedding imposes on the robust watermark, thereby improving the robustness of the watermarking algorithm. In addition, more effective embedding domain construction methods can further enhance the robustness of watermarks.

To address this issue, as illustrated in Figure 5, the proposed scheme embeds the robust and reversible watermarks into distinct compression domains of the video. Specifically, the robust watermark is embedded in the QDCT coefficients of the chroma macro-blocks generated during the intra-frame prediction of the I-frame in the H.264 coding process. The auxiliary information required for macro-block recovery is losslessly compressed into a reversible watermark and embedded into the motion vectors of the P-frames that follow the I-frame by a reversible steganography algorithm. This method enables the robust and reversible watermarks to be embedded into different compression domains of the video, ultimately enhancing the robustness of the watermark and improving the visual quality of the watermarked video.

3.1. Watermark Preprocessing

The (t, n)-threshold secret sharing scheme allows a sender to decompose a secret message $S$ into n secret shares for distribution. The receiver can recover the secret message from any t or more of these shares. Shamir's secret sharing scheme is based on Lagrange interpolation. For the secret $S$, take $t - 1$ random numbers $a_{1}, a_{2}, \dots, a_{t-1}$ and construct a polynomial:

$$F(x) = \left(S + a_{1} x + a_{2} x^{2} + \dots + a_{t-1} x^{t-1}\right) \bmod p .$$

(1)

Here $p$ is a prime number greater than $S$. After that, take any n distinct non-zero numbers $x_{1}, x_{2}, \dots, x_{n}$ and substitute them into the polynomial $F(x)$ to get n secret shares $(x_{1}, F(x_{1})), (x_{2}, F(x_{2})), \dots, (x_{n}, F(x_{n}))$. In this paper, we use the Shamir secret sharing scheme to partition the watermark into several secret shares. These shares are then embedded into different video frames to enhance the robustness of the watermarking scheme.
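The splitting step of Equation (1) and the matching recovery by Lagrange interpolation at $x = 0$ can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the function names, the choice $x_i = i$, and the small prime in the usage lines are assumptions made for the example.

```python
import secrets
from typing import List, Tuple

Share = Tuple[int, int]


def split_secret(secret: int, t: int, n: int, p: int) -> List[Share]:
    """Create n shares of `secret`; any t of them recover it (p must be prime)."""
    if not 0 <= secret < p:
        raise ValueError("secret must lie in [0, p)")
    if not 1 <= t <= n < p:
        raise ValueError("need 1 <= t <= n < p")
    # F(x) = S + a_1 x + ... + a_{t-1} x^{t-1} mod p, with random a_i
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):  # assumption: evaluation points x_i = 1..n
        y = 0
        for a in reversed(coeffs):  # Horner evaluation of F(x) mod p
            y = (y * x + a) % p
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share], p: int) -> int:
    """Lagrange interpolation at x = 0 over the field of integers mod p."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % p
                den = (den * (xi - xj)) % p
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret


P = 257  # small prime chosen only for the example
shares = split_secret(123, t=3, n=5, p=P)
recovered = recover_secret([shares[0], shares[2], shares[4]], P)
```

Because any 3 of the 5 shares suffice here, the watermark survives even if the frames carrying the other 2 shares are damaged. The modular inverse via `pow(den, -1, p)` requires Python 3.8 or later.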

3.2. Embedding Location Selection

The rich bitstream information of H.264 provides researchers with a range of candidate locations for watermark embedding. Among these, the frequency domain coefficients generated during the coding process are more robust than other elements and are widely used for robust watermark embedding. H.264 uses the coded block pattern (CBP) parameter to determine whether and how the current residual block needs to be decoded. However, if the video suffers a recompression attack or the embedded watermark changes the DCT coefficients, the CBP value of the macro-block is likely to change, which can prevent the watermarked block from being decoded correctly at the decoding side. In [7], videos were recompressed and the chrominance and luminance components were compared. It was found that, compared to luminance sub-blocks, chrominance sub-blocks have a lower probability of CBP mutation and a more stable prediction mode. This means that the DCT coefficients of chroma sub-blocks are less affected when the video is subjected to a compression attack, making them a more desirable location for robust watermark embedding. To improve robustness, watermarks should be embedded in areas of the video frame with more complex textures, as these areas are assigned more encoding parameters than flat areas. Areas with complex textures therefore offer more redundant space for embedding and are less sensitive to compression operations, allowing watermarks to be embedded more stably [7]. Additionally, embedding information in texture-complex regions helps locate the embedded information more accurately during watermark extraction: because a recompression attack does not alter the visual content of each region in the video frame, the receiver can still accurately select the regions with more complex textures [22].

Meanwhile, according to the characteristics of the human visual system (HVS), modifying complex texture areas in the image is less noticeable compared to modifying flat background areas [20]. So, embedding watermarks in areas with complex textures can not only enhance the robustness of the watermark and help the recipient find the watermark position more accurately, but also reduce the impact of watermark embedding on video quality.

The number of non-zero coefficients (NNZ) serves as a common measure of the spatial complexity of a macro-block. It counts the total number of non-zero coefficients present within a macro-block. A higher NNZ value indicates a texture-complex region within a video frame. The energy factor ($E_f$) is another metric that reflects the information level of a macro-block. It is defined as follows:

$$E_f(i) = \sum_{j=1}^{4} \sum_{k=1}^{4} |c(j, k)| .$$

(2)

where $j, k$ denote the coordinates within the $i$-th 4 × 4 macro-block and $c(j, k)$ is the coefficient at that position.

The literature [17] experimentally demonstrates that macro-blocks selected based on both NNZ and $E_f$ exhibit greater robustness against requantization attacks, and introduces the texture coefficient $C(i) = NNZ(i) + E_f(i)$ as a metric. In this paper, we extract 4 × 4 chroma macro-blocks from the I-frame of the video frame sequence, calculate $C(i)$ for each macro-block, and select those with the highest $C(i)$ values as the locations in which to embed the watermark information. Compared to previous video codecs, H.264 offers more fine-grained macro-block division and sub-pixel motion vector estimation and compensation, generating abundant motion vector information and providing numerous embedding locations for large-capacity reversible information hiding. In this study, we select the motion vectors of several P-frames following each I-frame as carriers, and use lossless compression to turn the distortion introduced during the robust embedding stage into a reversible watermark. This watermark is then embedded into the motion vectors through reversible steganography.
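The block-selection step can be illustrated with a short sketch that computes NNZ, $E_f$, and $C(i)$ for 4 × 4 coefficient blocks and ranks them. The blocks are represented as plain nested lists of quantized coefficients. The helper names and the `count` parameter are hypothetical, and extracting the blocks from an H.264 bitstream is outside the scope of this example.

```python
from typing import List

Block = List[List[int]]  # a 4 x 4 matrix of quantized DCT coefficients


def nnz(block: Block) -> int:
    """Number of non-zero coefficients in the block."""
    return sum(1 for row in block for c in row if c != 0)


def energy_factor(block: Block) -> int:
    """E_f: sum of absolute coefficient values, as in Equation (2)."""
    return sum(abs(c) for row in block for c in row)


def texture_coefficient(block: Block) -> int:
    """C(i) = NNZ(i) + E_f(i)."""
    return nnz(block) + energy_factor(block)


def select_blocks(blocks: List[Block], count: int) -> List[int]:
    """Indices of the `count` blocks with the highest texture coefficient."""
    ranked = sorted(range(len(blocks)),
                    key=lambda i: texture_coefficient(blocks[i]),
                    reverse=True)
    return ranked[:count]


flat = [[3, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
textured = [[5, -2, 1, 0], [3, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0]]
chosen = select_blocks([flat, textured], count=1)
```

Here the flat block scores $C = 1 + 3 = 4$ and the textured block scores $C = 6 + 13 = 19$, so the textured block is selected.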

3.3. Watermark Embedding

3.3.1. Robust Watermark Embedding

The 4 × 4 chroma macro-block generates a frequency domain coefficient matrix after DCT transformation and quantization, which contains two parts: the DC coefficient at the upper left position and the remaining AC coefficients. Among them, the DC coefficient carries a large amount of original image information. The AC coefficients, as shown in Figure 6, can be divided into low-frequency, medium-frequency, and high-frequency coefficients according to frequency from low to high. The low-frequency coefficients carry large-scale information, which corresponds to the flat regions of the image. Modifying them would affect the visual quality of the video, so these coefficients are not changed. The high-frequency coefficients in the lower right corner correspond to the texture details of the image. In order not to affect the efficiency of the subsequent entropy coding stage, these coefficients are also kept unchanged. Therefore, the scheme embeds the robust watermark $W_{ro}$ by modifying the mid-frequency coefficients of the chroma macro-block in the I-frame.

Compared to blocks with low NNZ, blocks with more non-zero DCT coefficients exhibit higher robustness. However, excessive modification of zero coefficients can compromise video coding efficiency and increase the bitstream size. The texture chroma blocks identified in Section 3.2 are therefore further differentiated so that a distinct embedding method is applied to each class. The coefficient blocks with NNZ > 5 are considered the most suitable embedding locations. They are grouped into robust blocks $B_{ro}$, and a strong embedding approach is used for them: the whole segment of mid-frequency coefficients is modified to embed 3 bits of watermark information in each robust block $B_{ro}$. The coefficient blocks with NNZ < 5 are slightly less robust, but they are still situated in the texture-rich region of the video, which makes watermark embedding less detectable. These coefficient blocks are grouped into texture blocks $B_{tex}$, and two non-zero coefficients are selected in descending order of frequency to form a coefficient pair. Each texture block $B_{tex}$ embeds 1 bit of watermark information by modifying its coefficient pair.

Furthermore, to preserve zero coefficients and thus maintain compression efficiency, embedding in a texture block $B_{tex}$ uses only two non-zero mid-frequency coefficients $b_{1}$ and $b_{2}$, chosen in descending order of frequency. The difference of the coefficient pair, $d = b_{1} - b_{2}$, is then calculated, and 1 bit of information is embedded into the pair according to the following modulation:

$$b_{1}' = b_{1} - \left\lfloor \frac{d}{2} \right\rfloor - \eta (1 - 2 w_{k}) .$$

(3)

$$b_{2}' = b_{2} + \left\lfloor \frac{d}{2} \right\rfloor + \eta (1 - 2 w_{k}) .$$

(4)

where $\eta$ is the embedding strength coefficient and $w_{k}$ is the watermark bit to be embedded.
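A small sketch helps to see why Equations (3) and (4) are both detectable and reversible. After embedding, $b_1' - b_2' = (d - 2\lfloor d/2 \rfloor) - 2\eta(1 - 2w_k)$, which is positive for $w_k = 1$ and negative for $w_k = 0$ whenever $\eta \geq 1$. The sign-based extraction rule below is derived from these two equations and is an assumption of this example rather than a rule quoted from the scheme. The function names are hypothetical, and the case where a modified coefficient becomes zero is not handled.

```python
from typing import Tuple


def embed_pair(b1: int, b2: int, bit: int, eta: int) -> Tuple[int, int, int]:
    """Apply Equations (3) and (4); returns (b1', b2', d) with d kept as auxiliary data."""
    d = b1 - b2
    # Python's // is floor division, matching the floor of d/2 for negative d too.
    shift = d // 2 + eta * (1 - 2 * bit)
    return b1 - shift, b2 + shift, d


def extract_bit(b1_marked: int, b2_marked: int) -> int:
    """Assumed rule: the sign of b1' - b2' carries the bit (valid for eta >= 1)."""
    return 1 if b1_marked - b2_marked > 0 else 0


def restore_pair(b1_marked: int, b2_marked: int, bit: int, eta: int, d: int) -> Tuple[int, int]:
    """Invert the embedding using the recorded auxiliary information (eta, d)."""
    shift = d // 2 + eta * (1 - 2 * bit)
    return b1_marked + shift, b2_marked - shift


marked1, marked2, d = embed_pair(7, 2, bit=0, eta=2)
bit = extract_bit(marked1, marked2)
original = restore_pair(marked1, marked2, bit, eta=2, d=d)
```

For the pair (7, 2) with $\eta = 2$ and bit 0, the marked pair is (3, 6). The negative difference yields bit 0, and the stored $d = 5$ and $\eta$ restore (7, 2) exactly.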

To enhance the robustness of watermark embedding in the robust block $B_{ro}$, its coefficients can be segmented and each segment used as a separate carrier, as suggested by [26]. The mid-frequency coefficients $c^{0}, c^{1}, c^{2}, \dots, c^{n-1}$ in $B_{ro}$ are divided into three segments, denoted as $S_{k} \in \{S_{0}, S_{1}, S_{2}\}$. Subsequently, 1 bit of watermark information is embedded into each segment by modifying the coefficients within that segment as a whole. The specific embedding method is presented in Equation (5), where $w_{k} \in \{0, 1\}$ represents the watermark bit to be embedded into $S_{k}$, and $c_{w}^{i}$ denotes the coefficient after embedding the watermark.

$$c_{w}^{i} = \begin{cases} c^{i} + \delta_{k}, & \text{if } w_{k} = 1 \text{ and } c^{i} \in S_{k} \\ c^{i} - \delta_{k}, & \text{if } w_{k} = 0 \text{ and } c^{i} \in S_{k} \end{cases} .$$

(5)

The modification applied to the coefficient segment $S_{k}$, denoted as $\delta_{k}$, is determined by the formula $\delta_{k} = \left\lfloor \frac{T - |\sum c^{i}|}{m} \right\rfloor$, where $T$ is a pre-set threshold. For coefficient segments $S_{k}$ that are smaller than $T$, both the coefficients and $\delta_{k}$ are set to 0.
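Equation (5) and its inversion with the recorded $\delta_k$ can be sketched as follows. The sketch makes two loudly stated assumptions: $m$ is passed in explicitly (the usage lines take it to be the number of coefficients in the segment, which the text does not state), and the special handling of segments relative to $T$ is omitted. The function names are hypothetical.

```python
from typing import List


def segment_delta(segment: List[int], threshold: int, m: int) -> int:
    """delta_k = floor((T - |sum of c^i|) / m); the meaning of m is an assumption here."""
    return (threshold - abs(sum(segment))) // m


def embed_segment(segment: List[int], bit: int, delta: int) -> List[int]:
    """Equation (5): shift every coefficient by +delta for bit 1, -delta for bit 0."""
    step = delta if bit == 1 else -delta
    return [c + step for c in segment]


def restore_segment(marked: List[int], bit: int, delta: int) -> List[int]:
    """Undo Equation (5) given the embedded bit and the recorded delta."""
    step = delta if bit == 1 else -delta
    return [c - step for c in marked]


segment = [3, -1, 2]
delta = segment_delta(segment, threshold=10, m=len(segment))
marked = embed_segment(segment, bit=1, delta=delta)
restored = restore_segment(marked, bit=1, delta=delta)
```

With $T = 10$ the segment sum is 4, so $\delta_k = \lfloor 6/3 \rfloor = 2$ and the marked segment is [5, 1, 4]. Keeping $\delta_k$ as auxiliary information is what makes the step exactly invertible.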

To eliminate the distortion caused during robust watermark embedding and recover the carrier in subsequent stages, the embedding strength $\eta$ of each coefficient pair, the original difference $d$ of each coefficient pair, and the modification $\delta_{k}$ corresponding to each coefficient segment $S_{k}$ are recorded as auxiliary information $M$. This auxiliary information $M$ is then converted into a binary bit string using a lossless compression algorithm and embedded as a reversible watermark $W_{re}$ in the motion vectors of the subsequent P-frames.

Algorithm 1 lists the detailed steps of $W_{ro}$ embedding:

Algorithm 1: Robust embedding algorithm

Input: Robust watermark $W_{ro}$, original reference frame $I$

Output: Embedded reference frame $I'$, auxiliary information $M$

Begin:

$F(x) = (S + a_{1} x + a_{2} x^{2} + \dots + a_{t-1} x^{t-1}) \bmod p$ ;
{sub-secret $S_{1}$, sub-secret $S_{2}$, …, sub-secret $S_{n}$} = Watermark $W_{ro}$ //Split the watermark into sub-secrets
Extract all chroma sub-blocks $B$ of $I$
$C(i) = NNZ(i) + E_f(i)$ , $E_f(i) = \sum_{j=1}^{4} \sum_{k=1}^{4} |c(j, k)|$ //Filtering blocks suitable for embedding
$B_{ro} = \{B_{i} \mid B_{i} \in B \text{ and } NNZ(B_{i}) > 5\}$
$B_{tex} = \{B_{i} \mid B_{i} \in B \text{ and } NNZ(B_{i}) < 5\}$ //Classify the sub-blocks $B$ to be embedded according to NNZ
$\{b_{1}, b_{2}, \dots, b_{n}\}$ = sorted ( $B_{tex}$ , descending);
$d = b_{1} - b_{2}$ //Generate difference $d$
$b_{1}' = b_{1} - \lfloor d/2 \rfloor - \eta (1 - 2 w_{k})$ ;
$b_{2}' = b_{2} + \lfloor d/2 \rfloor + \eta (1 - 2 w_{k})$ //Embedding method for block $B_{tex}$
$\{S_{0}, S_{1}, S_{2}\} = \{c^{0}, c^{1}, c^{2}, \dots, c^{n-1}\}$ = Midfrequency ( $B_{ro}$ ) //Segment the mid-frequency coefficients within a block
$\delta_{k} = \lfloor (T - |\sum c^{i}|) / m \rfloor$ ;
$c_{w}^{i} = c^{i} + \delta_{k}$ , if $w_{k} = 1$ and $c^{i} \in S_{k}$ ;
$c_{w}^{i} = c^{i} - \delta_{k}$ , if $w_{k} = 0$ and $c^{i} \in S_{k}$ //Embedding method for block $B_{ro}$
$M = \{\eta, d, \delta_{k}\}$ //Generate auxiliary information

The modified sub-blocks are passed back to the encoder to complete the encoding, generating $I'$.

3.3.2. Reversible Watermark Embedding

Two-dimensional histogram shifting algorithms are widely used in the field of reversible steganography and have demonstrated high embedding capacity. This class of algorithms uses information from the video stream itself, such as pixel values, prediction errors, and motion vectors, to construct two-dimensional value pairs. Information is embedded by establishing a mapping between the bits to be embedded and the modification of the value pairs at different locations. The method in [27] adapts to the characteristics of different carriers and enhances the embedding capacity by placing the centroid of the algorithm at the location that contains the highest number of vectors. However, because the number of P-frames significantly exceeds the number of I-frames, the 2D histogram algorithm for motion vectors is refined here to improve visual quality at the cost of an appropriately reduced embedding capacity.

After lossless compression of the auxiliary information $M$ into the reversible watermark $W_{re}$, it is embedded into the motion vectors through the following reversible steganography algorithm. Firstly, the motion vectors contained in the video frame stream are extracted. For each motion vector, its horizontal and vertical components are used as coordinates to construct a two-dimensional value pair, transforming the motion vector information of the frame into a two-dimensional histogram. The motion vectors of the reference frame are then statistically analyzed to derive the histogram centroid $(k, j)$. Due to the coherence between the reference frame and subsequent frames, the centroid of the reference frame can be used as the centroid of the subsequent embedded frames during embedding. Meanwhile, the literature [28] demonstrates that embedding watermarks at positions with larger motion amplitudes in video frames can reduce the impact of watermark embedding on the visual quality of the video. To minimize the visual impact of the watermark embedding, the embedding location is fixed in the region with large motion amplitude by retaining only the centroids with $k, j \geq 2$, enhancing the covertness of the watermark. Subsequently, the value pairs around the centroid are divided into 13 non-overlapping sets $S = \{S_{1}, S_{2}, \dots, S_{13}\}$, and the corresponding shift operation is performed on each set to embed the information. The division of value pairs is shown in Equation (6).

$$S = \begin{cases} S_{1} = \{(B_{1}, B_{2}) \mid B_{1} = k, B_{2} = j\} \\ S_{2} = \{(B_{1}, B_{2}) \mid B_{1} = k, B_{2} > j\} \\ S_{3} = \{(B_{1}, B_{2}) \mid B_{1} > k, B_{2} = j + 1\} \\ S_{4} = \{(B_{1}, B_{2}) \mid B_{1} > k, B_{2} = j\} \\ S_{5} = \{(B_{1}, B_{2}) \mid B_{1} = k + 1, B_{2} < j\} \\ S_{6} = \{(B_{1}, B_{2}) \mid B_{1} = k, B_{2} < j\} \\ S_{7} = \{(B_{1}, B_{2}) \mid B_{1} < k, B_{2} = j - 1\} \\ S_{8} = \{(B_{1}, B_{2}) \mid B_{1} < k, B_{2} = j\} \\ S_{9} = \{(B_{1}, B_{2}) \mid B_{1} = k - 1, B_{2} > j\} \\ S_{10} = \{(B_{1}, B_{2}) \mid B_{1} > k, B_{2} > j + 1\} \\ S_{11} = \{(B_{1}, B_{2}) \mid B_{1} > k + 1, B_{2} < j\} \\ S_{12} = \{(B_{1}, B_{2}) \mid B_{1} < k, B_{2} < j - 1\} \\ S_{13} = \{(B_{1}, B_{2}) \mid B_{1} < k - 1, B_{2} > j\} \end{cases} .$$

(6)
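The partition in Equation (6) can be checked with a short classifier that maps a value pair to its set index. The sketch assumes that $S_{10}$ covers $B_2 > j + 1$, which is what makes the 13 sets non-overlapping and lets them cover the whole plane. The function name `classify` is hypothetical.

```python
def classify(b1: int, b2: int, k: int, j: int) -> int:
    """Return the index (1..13) of the set in Equation (6) containing (b1, b2)."""
    if b1 == k:
        if b2 == j:
            return 1  # the centroid itself
        return 2 if b2 > j else 6
    if b2 == j:
        return 4 if b1 > k else 8
    if b1 > k and b2 > j:
        return 3 if b2 == j + 1 else 10  # assumption: S10 is B2 > j + 1
    if b1 > k and b2 < j:
        return 5 if b1 == k + 1 else 11
    if b1 < k and b2 < j:
        return 7 if b2 == j - 1 else 12
    # remaining case: b1 < k and b2 > j
    return 9 if b1 == k - 1 else 13


# Every pair on a grid around the centroid (2, 2) falls into exactly one set.
labels = {classify(b1, b2, 2, 2) for b1 in range(-3, 8) for b2 in range(-3, 8)}
```

Because the classifier returns exactly one index for every input and all 13 indices occur on the grid, the sets are disjoint and jointly exhaustive under the stated assumption.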

When $(B_{1}, B_{2}) \in S_{1}$, the mapping relationship between the modification of the value pair and the embedded information is as follows. Specifically, $(B_{1}', B_{2}')$ represents the vector components after embedding, while $m_{t} m_{t+1} m_{t+2}$ denotes the bit information to be embedded.

$$(B_{1}', B_{2}') = \begin{cases} (B_{1}, B_{2}), & m_{t} m_{t+1} = 00 \\ (B_{1}, B_{2} + 1), & m_{t} m_{t+1} = 01 \\ (B_{1} + 1, B_{2}), & m_{t} m_{t+1} = 10 \\ (B_{1}, B_{2} - 1), & m_{t} m_{t+1} = 11 \end{cases} .$$

(7)
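The centroid mapping of Equation (7) amounts to a lookup table from two bits to a shift of the value pair. The sketch below also includes the reverse lookup, which is an assumption of this example: it presumes that the positions next to the centroid have been vacated by the shifts applied to the other sets, so a pair found there can only have come from the centroid. All names are hypothetical.

```python
from typing import Tuple

Pair = Tuple[int, int]

# Equation (7): two payload bits select the shift applied to a centroid pair.
CENTROID_SHIFTS = {
    "00": (0, 0),
    "01": (0, 1),
    "10": (1, 0),
    "11": (0, -1),
}


def embed_at_centroid(pair: Pair, bits: str) -> Pair:
    """Shift a value pair lying on the centroid according to two payload bits."""
    db1, db2 = CENTROID_SHIFTS[bits]
    return pair[0] + db1, pair[1] + db2


def extract_from_centroid(marked: Pair, centroid: Pair) -> str:
    """Assumed inverse: read the bits back from the offset to the centroid."""
    offset = (marked[0] - centroid[0], marked[1] - centroid[1])
    for bits, shift in CENTROID_SHIFTS.items():
        if shift == offset:
            return bits
    raise ValueError("pair does not originate from the centroid")


centroid = (3, 2)
marked = embed_at_centroid(centroid, "10")
bits = extract_from_centroid(marked, centroid)
```

Once the bits are read, the original motion vector is recovered by moving the pair back onto the centroid, which is what makes this layer reversible.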

When

(B_{1}, B_{2}) \in S_{2}