Article

Steganalysis of Neural Networks Based on Symmetric Histogram Distribution

School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(5), 1079; https://doi.org/10.3390/sym15051079
Submission received: 11 April 2023 / Revised: 10 May 2023 / Accepted: 12 May 2023 / Published: 13 May 2023

Abstract

Deep neural networks have achieved remarkable success in various fields of artificial intelligence. However, these models, which contain large numbers of parameters, are widely distributed and disseminated by researchers, engineers, and even unauthorized users. Beyond performing intelligent tasks, typically overparameterized deep neural networks have become new digital covers for data hiding, which may pose significant security challenges to AI systems. To address this issue, this paper proposes a symmetric steganalysis scheme specifically designed for neural networks trained for image classification tasks. The proposed method focuses on detecting the presence of additional data without access to the internal structure or parameters of the host network. It employs a well-designed method based on histogram distribution to find the optimal decision threshold, with a symmetric decision rule in which the original networks and stego networks undergo two highly symmetrical flows to generate classification labels; the method is shown to be practical and effective. SVM and ensemble classifiers were chosen as the binary classifiers for their applicability to the feature vectors output by neural networks across different datasets and network structures. This scheme is the first of its kind: it performs steganalysis on neural networks based on the distribution of network outputs, rather than on conventional digital media such as images, audio, and video. Overall, the proposed scheme offers a promising approach to enhancing the security of deep neural networks against data hiding attacks.

1. Introduction

Data hiding is a technique that aims to embed secret information imperceptibly, enabling the transmission of additional data without causing serious distortion to the cover media. While data hiding methods such as steganography are effective for covert communication and provide a reliable guarantee for information security, data hiding has also been used for illegal and malicious purposes, such as illicit personal gain or terrorist attacks, which poses a serious threat to social security. To prevent and address the potential security problems caused by data hiding, techniques for detecting hidden secret information, such as steganalysis, have been developed. Steganography and steganalysis have developed in tandem, each promoting and opposing the other [1].
Multimedia steganography uses images, audio, and video as carriers of secret information, among which image steganography is the most widely studied and used. For image steganography, the original image before embedding the secret information is referred to as the cover image, and the image after embedding is referred to as the stego image. In the early stages of image steganography, methods such as LSB [2] and LSBM [3] were applied to uncompressed bitmap images, where secret bit sequences were embedded into the cover image by replacing or modifying the least significant bits of pixels. However, such methods unavoidably alter the statistical characteristics of images. To enhance imperceptibility, content-adaptive steganography takes the attributes of the image into account and selectively embeds the secret information in areas of the cover image with complex textures or rich edges, making the presence of secret information difficult to detect. Common content-adaptive steganography algorithms include HUGO (highly undetectable stego) [4], WOW (wavelet obtained weights) [5], UNIWARD (universal wavelet relative distortion) [6], and HILL (high-pass, low-pass, and low-pass) [7]. Generally, these methods are combined with STC (syndrome-trellis codes) coding [8] to minimize embedding distortion under a given capacity and per-pixel cost; the difference between them lies in their distortion functions. The design of the distortion function determines the performance of the employed steganography method. For example, ASDL-GAN, an automatic steganographic distortion learning framework, was proposed in [9] to obtain a stego image by automatically learning the embedding change probability of every pixel using a steganographic generative subnetwork. A customized TES activation function then converts the learned embedding change probabilities into embedding distortions for minimal-distortion embedding. Recently, the authors of [10] combined image steganography with deep learning to encode a large amount of useful information in the cover image by feeding the secret message and the cover image into an encoder network simultaneously, which extracts features and outputs an encoded image (the stego image). The encoded image is then decoded to extract the secret message through a decoder network. In their experiments, the encoder and decoder networks, together with a noise layer and an adversary network, were jointly trained to verify the robustness of the scheme. In [11], a secret RGB image was hidden in a cover RGB image of the same size using image compression.
In other words, the ratio between the secret message and the cover message is 1:1, with the embedding capacity reaching 24 bpp (bits per pixel). Additionally, researchers proposed an asymmetric reversible data hiding scheme for compressed images in which the recipient can recover the original cover image while extracting the secret information [12]. Using an adaptive bitrate strategy, the scheme greatly increases the embedding capacity of the secret data.
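For intuition about the earliest of the methods above, LSB replacement can be written in a few lines of NumPy. The sketch below is a generic illustration only, not the implementation of any cited scheme:

```python
import numpy as np

def lsb_embed(cover: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Replace the least significant bits of the first len(bits) pixels."""
    flat = cover.flatten()                                # flatten() returns a copy
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | bits   # clear the LSB, then set it
    return flat.reshape(cover.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the hidden bits back from the least significant bits."""
    return stego.flatten()[:n_bits] & 1

cover = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # toy 8-bit "image"
secret = np.random.randint(0, 2, size=8, dtype=np.uint8)
stego = lsb_embed(cover, secret)
assert np.array_equal(lsb_extract(stego, 8), secret)
```

Because only the lowest bit plane changes, the visual distortion is negligible, yet, as noted above, the statistical characteristics of the image are still measurably altered.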
All the image steganography methods mentioned above rely on the high information redundancy of the cover image. Moreover, images are one of the most commonly used digital media and have wide accessibility. The development of image steganography is essentially an ongoing process of exploring and utilizing image characteristics. Digital images provide an ideal cover for hiding secret information, as the hidden information is difficult to detect.
Similarly, neural networks have gradually become an emerging digital cover, joining the ranks of images, audio, and video. Recent advancements in artificial intelligence technology, particularly in deep neural networks, have led to significant breakthroughs in various application fields, such as computer vision, speech recognition, and natural language processing. Deep neural network models, including AlexNet [13], ResNets (residual networks) [14], CapsNets (capsule networks) [15], and GANs (generative adversarial networks) [16], have been designed and trained to perform a diverse range of intelligent tasks, as illustrated in Figure 1. These models can be easily shared, accessed remotely as API interfaces, and widely distributed for scientific research and industrial development needs. It is predicted that with the rapid progress of deep learning, neural networks will become a common commodity on model-sharing platforms and online app stores in the future.
Most importantly, as illustrated in Figure 2, neural networks possess deep internal structures and large numbers of model parameters, making them candidate cover media for data hiding tasks. In [17], Uchida et al. first proposed a framework to embed secret data in model parameters using a fixed embedding transfer matrix. Since the embedding process is completed while training the host network for its original task, the performance of the model is not compromised significantly. However, this work mainly focused on embedding a digital watermark to protect the intellectual property of models, and embedding capacity was not the primary indicator. Recently, Wang et al. [18] proposed the first work focused on embedding additional data in the parameters of neural networks for multiple receivers, with a total embedding capacity of up to 6000 bits. Here too, the secret bits were embedded into the model parameters during the training process of the neural network rather than by directly modifying the trained parameters, which causes little damage to the original task. Since deep neural networks are usually overparameterized, the model's strong learning ability can be exploited to secretly embed additional data into its parameters. In effect, data hiding in neural networks exploits the redundancy in the parameters of the trained model to carry additional data. The drops in original-task accuracy reported in [17,18] are 1.47% and 0.22%, respectively, which indicates that the fidelity of the stego networks is satisfactory. To this end, embedding additional data in model parameters has become the mainstream method of data hiding in neural networks.
However, although methods for data hiding in neural networks have been proposed one after another, the corresponding steganalysis methods for neural networks are still missing. In terms of high embedding capacity and imperceptibility, data hiding in neural networks could pose greater information security risks than image steganography and presents new challenges for model security in deep learning. Moreover, this situation is expected to worsen with the continued popularization and application of neural networks and the advancement of data hiding algorithms. Therefore, the importance and necessity of a steganalysis solution for neural networks become increasingly apparent.
In this paper, we propose a new symmetric steganalysis scheme to detect hidden data in the parameters of neural networks trained for image classification tasks. Unlike existing steganalysis methods designed for multimedia, such as images or video, our proposed scheme focuses on deep image classification networks, determining whether they contain hidden information using a well-designed symmetric method. For the dataset, we trained multiple types of neural networks on multiple image datasets, both for their original task (the plain networks) and with additional data embedded (the stego networks). Since the secret data are embedded during the training process of the host network, the reduction in the classification accuracy of the host network is trivial. For steganalytic features, we treated the neural networks as black boxes, without access to the internal structure or parameters of the host network. The output prediction vectors are collected as unprocessed feature vectors for their respective models, both plain networks and stego networks. It is worth noting that the location where the secret information is embedded is unknown to the analyst: it may be the first layer, a hidden layer, or any other layer. Such a blind detection method conforms to real application scenarios, so the practicability of the scheme is guaranteed. For classification, we designed a symmetric method based on histogram distribution, in which plain networks and stego networks are treated symmetrically to determine the optimal decision thresholds, using SVM and ensemble classifiers as steganalysis detectors. Experimental results verify the effectiveness of our proposed steganalysis scheme.
This paper makes the following contributions:
  • This paper focuses on a new form of steganalysis to detect the presence of hidden data in deep neural networks trained for image classification tasks. By extending steganalysis from multimedia content to deep neural network models, we can protect neural networks from being exploited to transmit secret data;
  • This paper proposes a steganalysis scheme using a well-designed symmetric method based on histogram distribution to determine the optimal classification thresholds. Since the neural network can be treated as a black box, the practicality of the proposed scheme is satisfactory;
  • This paper performs comprehensive experiments on a large, diverse dataset of neural networks in a progressive manner. Experimental results verify the effectiveness of our proposed steganalysis scheme.
The rest of this paper is organized as follows. We introduce related work in Section 2. Our steganalysis scheme is described in Section 3. Experimental results and analysis are provided in Section 4. Section 5 concludes the paper.

2. Related Work

In this section, we introduce some related work, including data hiding methods of neural networks and steganalysis methods of digital images.

2.1. Data Hiding in Neural Networks

Data hiding in neural networks can be classified into two categories, model watermarking and steganography, both of which involve modifying the elements of the neural network.
Model watermarking is a technique that involves hiding specific information, called a watermark, within a neural network model. Building well-designed deep neural network models requires a significant amount of effort, including the expertise of the designer and the use of large amounts of training data and computing resources. Therefore, protecting neural networks from being maliciously tampered with, distributed, and abused is of utmost importance. Model watermarking was proposed to embed watermarks into deep neural networks to protect the intellectual property of neural networks [17]. Watermarks were embedded into the model parameters of DCNNs using a parameter regularizer. To improve embedding and extraction performance, researchers designed an error backpropagation-based watermarking technique [19] for model authentication using selective weights for watermarking. In addition to model parameters, some researchers used specially crafted training samples and changed labels to train the target network [20] to carry watermark information. For watermark verification, specially crafted samples were input as triggers to activate pre-specified predictions at inference. In [21], the authors proposed a watermarking technique that deliberately output specific (incorrect) labels for certain inputs, using random training instances and random labels. In [22], a novel watermarking scheme was proposed that enabled adversarial examples to convey the desired information to the host neural network using a decision frontier stitching algorithm. Additionally, a digital watermarking framework was proposed for deep neural networks by embedding watermarks into the output images [23]. The well-trained host network can complete its original task while embedding watermarks. The authors of [18] designed a data hiding scheme in deep neural networks for multiple receivers. Large amounts of additional data were embedded into model weights by matrix coding [24] in the training process to minimize the impact on the original task, as shown in Figure 3. The scheme’s robustness and universality were verified.
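For intuition, the parameter-regularizer style of embedding used in [17] can be sketched as follows. This is a simplified illustration under our reading of the approach, with all tensor names and sizes hypothetical:

```python
import torch
import torch.nn.functional as F

def watermark_regularizer(w, X, b):
    """BCE between the projected host weights and the secret/watermark bits.

    w: flattened host-layer weights, shape (d,)
    X: fixed random embedding (transfer) matrix, shape (T, d)
    b: the T bits to embed, values in {0, 1}
    """
    return F.binary_cross_entropy(torch.sigmoid(X @ w), b)

d, T = 256, 32
w = torch.randn(d, requires_grad=True)      # stands in for real model weights
X = torch.randn(T, d)
b = torch.randint(0, 2, (T,)).float()
loss_embed = watermark_regularizer(w, X, b) # added to the task loss during training
# extraction after training: bits_hat = (X @ w > 0).int()
```

Minimizing the task loss plus this regularizer drives the projections X @ w to the correct sign pattern, so the bits can later be read back without access to the training data.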
This paper focuses on the steganalysis of neural networks trained for image classification tasks. The steganalysis is able to detect additional data hidden in a neural network. Details are described in Section 3.

2.2. Steganalysis of Digital Images

The high correlation between pixels and the textural complexity in the spatial domain [25], together with the frequency spectrum characteristics in the transform domain [26], provide large redundancy in digital images that can be exploited for data hiding. Image steganography leverages this redundancy to create a subliminal channel [27], unknown to third parties, for transmitting secret information. To prevent malicious or illegal use, the steganalysis of digital images has been developed to detect the existence of hidden secret information in the cover image and eliminate the subliminal channel.
In [28], a steganalysis model based on statistical analysis was proposed for LSB steganographic images. By applying a histogram characteristic function (HCF) [29], researchers solved the problem of detecting least significant bit matching (LSBM) steganography in the spatial domain [30]. An adaptive steganalytic scheme was designed for the WOW [5] method by narrowing down the possibly modified regions, which improved detection performance [31]. Beyond spatial-domain methods, a steganalysis scheme applied the Markov process to model different JPEG 2-D arrays, which were used to enhance the changes caused by JPEG steganography [32]; meanwhile, the dimensionality of the transition probability matrices was greatly reduced by means of threshold truncation. Additionally, the authors of [33] classified single- and double-compressed JPEG images into one of several preselected existing steganographic algorithms using a multi-classifier based on supervised training, which is of significance for the further extraction of the secret message from stego images. However, steganalysis methods that target a particular steganographic algorithm fail to cope with content-adaptive methods. Moreover, low-dimensional features are insufficient for capturing the changes caused by secret information in images with complex textures. To address these issues, in [34], a rich model composed of diverse high-dimensional submodels was assembled to extract a variety of statistical characteristics as steganalytic features, with ensemble classifiers used for classification. The residual images output by high-pass filters were further used to calculate co-occurrence matrices to enhance the features, and detection performance was greatly improved. In the transform domain, the authors of [35] proposed a high-dimensional JPEG steganalysis scheme using high-order cascade filters to capture embedding traces and employing ensemble classifiers as the steganalyzer. As shown in Figure 4, traditional steganalysis methods for digital images generally comprise feature extraction, feature enhancement, and classifier fitting.
However, the above-mentioned methods rely heavily on manually designed features, which require the prior knowledge and expertise of researchers. To avoid this, methods based on deep learning have gradually been proposed to learn features automatically. In [36], a well-designed convolutional neural network (CNN) architecture was proposed with knowledge of steganalysis taken into account. The proposed CNN architecture consisted of three parts, a fixed HPF (high-pass filter) layer, a cascade of convolution layers, and a fully connected (FC) module, corresponding to the three steganalysis stages of feature extraction, feature enhancement, and binary classification. Similarly, Yedroudj-Net [37] utilized all 30 high-pass filters from SRM [32] as a preprocessing layer to extract features for a CNN-based spatial image steganalysis model. However, because the filters in the preprocessing layer are manually designed, the weight parameters of this layer do not update during training, so the steganalysis model is not a purely learned architecture. To overcome this limitation, SRNet [38] proposed an end-to-end deep-learning-based steganalysis model using a deep residual architecture, achieving state-of-the-art detection accuracy for both spatial-domain and JPEG steganography. Further improvements were made in [39], which used 3 × 3 kernels for preprocessing, depth-wise separable convolutions, and spatial pyramid pooling (SPP) [40] to achieve higher detection rates.
In the following section, we propose a steganalysis scheme for neural networks trained for image classification tasks to detect the presence of embedded information in the model weights. This scheme is inspired by the paradigm of steganalysis of digital images.

3. Proposed Scheme

In this section, a novel symmetric steganalysis scheme is proposed to detect the presence of hidden secret information in deep classification neural networks. We first introduce the general framework of our proposed scheme. Then, we present the main content of the steganalysis scheme, including the image dataset settings, feature extraction and preprocessing, the classifier fitting, and a symmetric method to determine the optimal classification thresholds. Details are described below.

3.1. General Framework

This paper aims to detect the presence of hidden information in the model parameters of trained neural networks. Similar to the steganalysis of digital images, the objective of the steganalysis of neural networks is to build a binary classifier that can determine whether the neural network being investigated is a stego network. We denote the general function of the steganalysis detector as
$\{ p^{(1)}, p^{(2)}, \ldots, p^{(N_{trn})} \} = f_{detector}(D_{train})$ (1)
where $f_{detector}$ is the binary classifier fitted on the training set $D_{train} = \{ x_{trn}^{(m)}, y_{trn}^{(m)} \}_{m=1}^{N_{trn}}$ and $N_{trn}$ is the number of training samples. Given a $d_{out}$-dimensional feature vector $[x^{(1)}, x^{(2)}, \ldots, x^{(d_{out})}] \in [0, 1]^{d_{out}}$ from $D_{train}$, for example $(x_{trn}^{(m)}, y_{trn}^{(m)})$, the output prediction label $p^{(m)}$ is expected to be the same as the ground-truth label $y_{trn}^{(m)}$. The training of the classifier aims to predict the whole $\{ p^{(1)}, p^{(2)}, \ldots, p^{(N_{trn})} \} \in \{0, 1\}^{N_{trn}}$ as accurately as possible by minimizing the difference between the predicted labels and the corresponding ground-truth labels, where 0 stands for a cover network and 1 for a stego network.
In this scheme, we utilize two supervised learning models for steganalysis: the support vector machine (SVM) [41] and the ensemble classifier [42]. Both classifiers are commonly used in steganalysis for their satisfactory detection performance. The SVM model is defined as follows:
$f_{SVM}: \omega^{T} x + b = 0$ (2)
where $\omega = (\omega_1; \omega_2; \ldots; \omega_d)$ is the normal vector of the objective hyperplane, which aims to separate positive and negative patterns, and $b$ is the bias term that determines the offset of the hyperplane from the origin. To find the optimal separating hyperplane, the total loss $L$ of the soft-margin SVM classifier in Equation (3) is defined as the weighted sum of two parts, shown in Equations (4) and (5). $L_{\Omega}$ represents the loss introduced by the model structure and describes the properties of the support vector machine model; to some extent, $L_{\Omega}$ acts as a regularizer that helps reduce overfitting while the training errors are minimized, by introducing a priori knowledge. $L_{\ell}$ is the empirical error for supervised misclassification and adopts a surrogate loss function $\ell(\cdot)$ to fit the binary classifier. $C$ is used to adjust the weights of the two parts; the value of the hyperparameter $C$ should be tuned during experiments to avoid overfitting or underfitting, as discussed in Section 4.2.
$L = L_{\Omega} + C \cdot L_{\ell}$ (3)
$L_{\Omega} = \frac{1}{2} \| \omega \|^{2}$ (4)
$L_{\ell} = \sum_{i=1}^{m} \ell(f_{SVM}(x_i), y_i)$ (5)
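As a numerical illustration of Equations (3)-(5) with the hinge surrogate adopted later in Section 3.2.2, the following sketch evaluates the soft-margin objective directly; labels are mapped to ±1 for the margin computation, and the data are random stand-ins:

```python
import numpy as np

def svm_total_loss(w, b, X, y_pm, C):
    """Eq. (3): L = L_Omega + C * L_ell, with hinge loss as the surrogate.

    y_pm holds labels in {-1, +1}; the decision function is f_SVM(x) = w^T x + b.
    """
    margins = y_pm * (X @ w + b)
    L_omega = 0.5 * np.dot(w, w)                      # structural loss, Eq. (4)
    L_ell = np.sum(np.maximum(0.0, 1.0 - margins))    # hinge surrogate, Eq. (5)
    return L_omega + C * L_ell

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y_pm = rng.choice([-1, 1], size=100)
print(svm_total_loss(np.zeros(10), 0.0, X, y_pm, C=10))  # = C * 100 at w = 0, b = 0
```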
In addition to the SVM model, we also employed ensemble classifiers in our proposed scheme, which are known for their low computational complexity and ability to efficiently handle large training sets. Following the discussion of alternatives in [42], we implemented the ensemble classifier as a random forest consisting of $N$ base learners (individual learners). Each base learner is trained independently on a specific $d_{sub}$-dimensional subspace, sampled uniformly at random from the feature space. The construction of the ensemble classifier is shown in Figure 5. We denote the $N$ base learners as $B_n$, $n = 1, \ldots, N$, and use Fisher linear discriminants (FLDs) as binary base learners for their simplicity and low training complexity. In the steganalysis literature, samples for ensemble classification are combined in pairs, i.e., each cover sample is bound to its corresponding stego sample; such pairs from the training set are denoted as $\{ x^{(m)}, \bar{x}^{(m)} \}$, $m = 1, \ldots, N_{trn}$. After each base learner casts a single vote, the ensemble classifier, denoted as $B^{(N)}(x) \in \{0, 1\}$ (cover = 0, stego = 1), forms the final decision by aggregating all $N$ votes with a majority voting strategy. With bagging (bootstrap aggregating) [43], only about 63% of the unique samples in the training set are used by each base learner, and the remaining 37% of unseen data can be exploited as a validation set for the "out-of-bag" (OOB) estimate, calculated as:
$E_{OOB}^{(N)} = \frac{1}{2 N_{trn}} \sum_{m=1}^{N_{trn}} \left( B^{(N)}(x^{(m)}) + 1 - B^{(N)}(\bar{x}^{(m)}) \right)$ (6)
Since the OOB estimate is an unbiased estimate of the testing error, monitoring the decreasing trend of the OOB error in Equation (6) traces the training process, which is helpful for evaluating detection performance and determining parameters.
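For concreteness, a compact sketch of such a random-subspace FLD ensemble, in the spirit of [42], is given below; for brevity it fixes $d_{sub}$ and $N$ rather than tuning them via the OOB estimate, and all identifiers are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_fld(X, y):
    """Fisher linear discriminant for labels {0, 1}: projection w and threshold t."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    t = 0.5 * (m0 + m1) @ w            # midpoint of the projected class means
    return w, t

def train_ensemble(X, y, n_learners=51, d_sub=4):
    learners = []
    for _ in range(n_learners):
        feats = rng.choice(X.shape[1], size=d_sub, replace=False)  # random subspace
        boot = rng.integers(0, len(X), size=len(X))                # bootstrap sample
        w, t = fit_fld(X[boot][:, feats], y[boot])
        learners.append((feats, w, t))
    return learners

def predict(learners, X):
    votes = sum((X[:, f] @ w > t).astype(int) for f, w, t in learners)
    return (votes >= len(learners) / 2).astype(int)    # majority vote, cf. Eq. (7)

X = rng.random((2000, 10))            # stand-in feature vectors
y = np.repeat([0, 1], 1000)           # balanced cover/stego labels
learners = train_ensemble(X, y)
print((predict(learners, X) == y).mean())
```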

3.2. Steganalysis Scheme

3.2.1. Datasets and Feature Extraction

In this section, we introduce the dataset settings and elaborate on feature extraction. For the steganalysis of digital images, public datasets, such as [44,45], and widely used evaluation metrics already exist. Given that the steganalysis of neural networks is a new problem, we built a new, diverse dataset consisting of cover networks and stego networks for our scheme. The cover networks include three representative neural networks for image classification, namely AlexNet, ResNets, and CapsNets, trained on three commonly used public image datasets: MNIST [46], CIFAR-10 [47], and ImageNet [13]. Stego networks were created with the same settings as the cover networks, except that secret data were embedded during training using the data hiding method of [18]. All stego networks were trained until the extraction error rate dropped to zero to ensure complete embedding of the secret data, with the number of embedded bits ranging from 0 to the maximum capacity. More details of dataset preprocessing and network structures are given in Section 4.3. Although our stego networks are created during training for the original task, modifications to the selected connection elements are inevitable, especially when large amounts of secret data are embedded. To show the difference, we compared the parameter distributions of the cover and stego CapsNets on the MNIST dataset, as shown in Figure 6, with 10,000 embedded bits and a batch size of 50.
The results indicate that the embedding operation alters the underlying distribution of the parameters to some extent. In this case, it would seemingly make sense to distinguish between cover and stego networks given the location (the connected elements) where the secret data are embedded; however, this is impractical. In contrast, our proposed method is a blind detection method that does not require access to the internal elements of the host network. Given a target network to be analyzed, the analyst can directly query it as a black box. Simply by running the host network on its original task, we obtain its outputs (prediction vectors) as the unprocessed feature vectors for our steganalysis scheme. The cover and stego features are extracted in this way from the cover and stego networks, respectively. For image steganalysis, analysts usually spend considerable time and effort modeling the embedding traces as the first step of their methods. In contrast, the cover medium of our scheme is a deep neural network, a dynamic and interactive model rather than a static digital image, and we exploit this property to extract feature vectors through simple queries. Currently, most methods for data hiding in neural networks also take advantage of this property by embedding the secret data during training for the original task, instead of modifying the static trained network, which enhances the imperceptibility and security of the hiding scheme. Therefore, it makes sense for our steganalysis scheme to use the function and output of the original task to identify unknown networks. The blind detection of steganographic information and the absence of complex feature extraction steps significantly improve the practicality of the proposed scheme.
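To make the querying step concrete, a minimal PyTorch sketch of this black-box feature extraction is given below; the model and data loader are placeholders, and the softmax is our assumption about how the prediction vectors are normalized:

```python
import torch

@torch.no_grad()
def extract_features(model: torch.nn.Module, loader) -> torch.Tensor:
    """Query the target network as a black box: run the fixed, ordered test set
    through the model and collect its output prediction vectors as raw features."""
    model.eval()
    feats = []
    for images, _ in loader:                     # the image labels are not needed
        feats.append(torch.softmax(model(images), dim=1))
    return torch.cat(feats)                      # shape: (N_tst, d_out)

# hypothetical usage: feats = extract_features(target_net, fixed_test_loader)
```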

3.2.2. Feature Preprocessing and Fitting

Given the prediction vectors output by the networks, labels were assigned to create supervised data: cover features were labeled 0 and stego features were labeled 1. Prior to training the model, all features were pre-scaled to be as training-friendly as possible. The normalized features were fed into the SVM model, which is sensitive to Euclidean distance. We used min–max normalization to limit the values of the $d_{out}$-dimensional features ($d_{out}$ denotes the dimension of the output prediction vectors, namely the number of classes of the original classification task) to $[0, 1]$. The SVM model was trained by mapping the processed feature vectors to a higher-dimensional implicit space with a given kernel function. For fitting the SVM classifier, we compared the detection performance of several kernels, including the linear, Gaussian, Sigmoid, and polynomial kernel functions. The surrogate loss function $\ell(\cdot)$ in Equation (5) was specified as the hinge loss. The SVM classifier was first used for the simplest case of a fixed number of embedded bits to verify the feasibility of our proposed scheme. In our implementation, the training time complexity of the SVM classifier fell between $O(N_{trn}^2 \times d_{out})$ and $O(N_{trn}^3 \times d_{out})$, depending on the size of the dataset and the value of the hyperparameter. Additionally, the occupied memory scales quadratically with the number of training samples $N_{trn}$. As the training set grows, the SVM classifier dramatically increases the complexity of the system and becomes inapplicable. Therefore, we employed ensemble classifiers, which can efficiently cope with large training sets, for the complex cases. To prepare the training data, we standardized (Z-score normalization) the cover and stego features to eliminate outliers in the feature vectors. Note that, for ensemble training, the training set consisted only of output vectors collected from the three types of networks on the MNIST (grayscale image) and CIFAR-10 (RGB image) datasets. Outputs from the ImageNet dataset (RGB images) were excluded from the training set but included in the testing set to evaluate the generalization ability of our proposed scheme. As mentioned in Section 3.1, the ensemble classifiers were implemented as random forests of $N$ independent base learners. By training each base learner $B_n$ on a different random subset of features, we obtained the eigenvector and threshold, which were adjusted to meet the desired performance criterion. In our implementation, we set the decision threshold of the ensemble classifier to $N/2$, while the optimal values of the feature subspace dimensionality $d_{sub}$ and the ensemble scale $N$ were automatically determined during the training process. Finally, the trained ensemble classifiers provide the final decision, which is compared to the decision threshold $N/2$ to obtain the classification label, as shown in Equation (7).
$B^{(N)}(x_{tst}) = \begin{cases} 1, & \text{when } \sum_{n=1}^{N} B_n(x_{sub}^{(n)}) \geq N/2 \\ 0, & \text{when } \sum_{n=1}^{N} B_n(x_{sub}^{(n)}) < N/2 \end{cases}$ (7)
where $x_{tst}$ is a testing sample and $x_{sub}^{(n)}$ is the $n$th sub-feature. In each epoch, the total detection error $P_E$ on the training set is minimized so that a trade-off between the accuracy and diversity of the base learners is achieved.
$P_E = \min_{P_{FA}} \frac{1}{2} \left( P_{FA} + P_{MD}(P_{FA}) \right)$ (8)
where $P_{FA}$ and $P_{MD}$ are the probabilities of false alarm and missed detection, respectively. By averaging multiple measurements, we calculate the final result with accidental errors eliminated.
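Putting the preprocessing, the classifier fitting, and the error metric of Equation (8) together, a minimal end-to-end sketch might look as follows; it runs on random stand-in features, and scikit-learn's SVC with the RBF kernel stands in for our Gaussian-kernel SVM:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((2000, 10))              # stand-in prediction vectors, d_out = 10
y = np.repeat([0, 1], 1000)             # 0 = cover, 1 = stego

X_scaled = MinMaxScaler().fit_transform(X)        # min-max scaling to [0, 1]
svm = SVC(kernel="rbf", C=10).fit(X_scaled, y)    # Gaussian kernel, hinge loss

def pe_from_scores(scores_cover, scores_stego):
    """Total detection error of Eq. (8): min over thresholds of 0.5*(P_FA + P_MD)."""
    thresholds = np.unique(np.concatenate([scores_cover, scores_stego]))
    errors = [0.5 * (np.mean(scores_cover >= t)     # false alarm: cover -> stego
                     + np.mean(scores_stego < t))   # missed detection: stego -> cover
              for t in thresholds]
    return min(errors)

scores = svm.decision_function(X_scaled)
p_e = pe_from_scores(scores[y == 0], scores[y == 1])
```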

3.2.3. Determining the Optimal Threshold

After the classifier was trained, we used it to predict the labels of the features and, on that basis, designed a symmetric method based on histogram distribution to determine the optimal classification thresholds. Specifically, we denote the number of networks to be tested as $C_{net}$; the testing set contains $N_{tst}$ images, and the number of classes is $d_{out}$, as defined in Section 3.1. We would like to clarify that the term "testing set" in this context refers to the set of data used in determining the optimal threshold for our proposed scheme. In machine learning, a testing set is typically used to evaluate the performance of a trained model on previously unseen data; in our case, however, the testing set is used solely to determine the optimal threshold. Specifically, we fed the data from this set into the trained classifier, which produced a 10,000-dimensional output consisting of labels with values of zero or one, the size of the data matrix changing from $\{ C_{net}, N_{tst}, d_{out} \}$ to $\{ C_{net}, N_{tst} \}$. Since the trained classifier has learned patterns of the embedding traces, the presence of steganographic information causes a difference in the 0/1 ratio of the 10,000-dimensional output prediction labels of a stego network compared to that of a cover network. These 10,000-dimensional label vectors were then evenly segmented, with the length of each segment $Seg_i$, $i = 1, 2, \ldots, \lfloor N_{tst} d_{out} / C_{net} \rfloor$, set to $\lfloor C_{net} / d_{out} \rfloor$, and the number of label "1"s in each segment was counted, as shown in Figure 7. For each segment, we symmetrically obtained $C_{net}$ counts from the cover networks and $C_{net}$ counts from the stego networks. Histograms of these two sets of counts were then generated, along with their respective probability density curves. Each segment's optimal threshold $\theta_i$, $i = 1, 2, \ldots, \lfloor N_{tst} d_{out} / C_{net} \rfloor$, was determined by finding the unique valley between the two peaks (the intersection point) of the density curves. On the whole, the cover networks and stego networks undergo two highly symmetrical flows to generate their prediction labels, from which the histograms and density curves are plotted.
All remaining segment thresholds were determined in the same symmetrical manner, with scattered data points whose differences were not primarily caused by steganographic information removed. Finally, all the thresholds $\theta_i$ were arranged in order to form the final optimal threshold vector $\Theta = [\theta_1, \theta_2, \ldots, \theta_{\lfloor N_{tst} d_{out} / C_{net} \rfloor}]$, which serves as the decision basis for the proposed steganalysis scheme. Therefore, when a target network, which may be a cover network or a stego network, is captured, the trained SVM classifier or ensemble classifiers can be used to predict labels for its feature vectors. The 10,000-dimensional label vector is then segmented in the same way to count the number of label "1"s in each segment. These counts are compared to the thresholds at the same indices in the optimal threshold vector $\Theta$; if the compliance degree exceeds 50% (akin to majority voting), the network is classified as a stego network.
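One way to implement the per-segment threshold search is to fit a kernel density estimate to each of the two sets of counts and locate the intersection between their modes. The sketch below is our illustrative variant; the Gaussian KDE is an assumption, as the scheme does not prescribe a specific density estimator:

```python
import numpy as np
from scipy.stats import gaussian_kde

def segment_threshold(counts_cover, counts_stego):
    """Locate the valley between the two density peaks: the abscissa where the
    cover and stego density curves intersect between their modes."""
    kde_c, kde_s = gaussian_kde(counts_cover), gaussian_kde(counts_stego)
    xs = np.linspace(min(counts_cover.min(), counts_stego.min()),
                     max(counts_cover.max(), counts_stego.max()), 2000)
    dc, ds = kde_c(xs), kde_s(xs)
    lo, hi = sorted((xs[dc.argmax()], xs[ds.argmax()]))   # the two modes
    mask = (xs >= lo) & (xs <= hi)
    cross = np.where(np.diff(np.sign(dc[mask] - ds[mask])))[0]
    return xs[mask][cross[0]] if cross.size else 0.5 * (lo + hi)

# decision for a captured network, given its per-segment counts and the vector
# of thresholds theta: classify as stego when the compliance degree exceeds 50%
# is_stego = np.mean(counts > theta) > 0.5
```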
To conclude this section, we note that the output vector from the trained network is class-biased due to the network’s original task. This bias acts as channel interference in information communication, making it necessary to keep testing samples for different networks under the same prior condition arranged in the same order. In other words, the test sets of the original network and the stego network are arranged in a highly symmetric manner, so that the bias caused by the original classification task is offset and the differences are mainly caused by steganographic information.

4. Experimental Results

To verify the effectiveness of our method, a group of experiments was conducted, as detailed in this section. We first set up the experimental environment and elaborate on the settings of the image datasets and network structures. Then, we analyze the detection performance under different hyperparameters and determine their optimal values. Finally, we build a diverse dataset of neural networks and present steganalysis results for neural networks trained for image classification tasks in multiple cases, using the SVM classifier and the ensemble classifiers.

4.1. Experiment Setup

To improve the diversity of networks, we trained different network structures, including AlexNet, CapsNets, and ResNets, on commonly used image classification datasets, including the MNIST, CIFAR-10, and ImageNet datasets. The dataset and network settings were as follows:
MNIST dataset: The official version consists of 60,000 grayscale images for training and validation and 10,000 testing samples divided into 10 categories.
CIFAR-10 dataset: A normalized version of CIFAR-10, which contains 60,000 32 × 32 color images divided into 10 categories, with 6000 images in each category. The dataset was split into 50,000 training images and 10,000 test images.
ImageNet dataset: A tailored version of ImageNet consisting of 10 selected categories of 1600 color training images and 1000 color testing images each, all of which were resized to 64 × 64 and normalized.
AlexNet: A simplified version of AlexNet; the architecture for the MNIST dataset is shown in Figure 8.
CapsNets: The fundamental architecture of CapsNets for the MNIST dataset is shown in Figure 9, including two convolutional layers (Conv1 and PrimaryCaps) and one fully connected layer (DigitCaps). More details can be found in Reference [15].
ResNets: A modified version of ResNet18 with the number of neurons in the last fully connected layer changed to 10 nodes.
All of the settings described above were thoroughly verified and determined by comparing several metrics, including the accuracy of the original task, extraction error, and training time cost. It is important to note that the settings for the stego networks were identical to those of their corresponding cover (original) networks, except for the embedded secret data, which ensures that any perturbations in the stego network were solely the result of steganographic information. The feature set was partitioned into a training set and a testing set using stratified sampling to ensure a 1:1 ratio of cover and stego samples in both sets without introducing any additional biases. To add randomness to the model fitting process, the training set was shuffled, while the testing set remained in a fixed order. Note that this operation is not necessary for practical use by analysts but was performed in this study to highlight the perturbation caused by steganographic information and to ensure consistency with the subsequent threshold determination process, as described in Section 3.2.3.
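The stratified 1:1 split described above can be reproduced with standard tooling; the following sketch uses scikit-learn on stand-in data (preserving the fixed order of the testing half, as described above, would additionally require tracking sample indices):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(4000, 10)      # stand-in feature vectors (d_out = 10)
y = np.repeat([0, 1], 2000)       # balanced cover (0) / stego (1) labels

# stratify=y preserves the 1:1 cover/stego ratio in both partitions
X_trn, X_tst, y_trn, y_tst = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
```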
All the neural network experiments in this paper were implemented in TensorFlow or PyTorch and run with Python 3.8 on a Windows 10 system with an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory. The Adam optimizer [48] was used for optimization.

4.2. Parameter Determination

In our experiments, the parameter C in Equation (3), which weights the empirical loss term of the SVM classifier, played a critical role in balancing the model and the training data to prevent overfitting or underfitting. The kernel function is another essential component of the SVM, as it greatly impacts the learning ability and generalization performance of the classifier. To determine the optimal kernel function and value of C, we conducted a set of experiments on CapsNets trained on the MNIST dataset as an example, with embedding capacities of 100, 600, and 6000 bits (the maximum embedding capacity), two routing iterations, and a batch size of 50. The surrogate loss function was specified as the hinge loss. Table 1 shows the test accuracies and training time costs of the cover and stego neural networks. Each value in the table was calculated as the average of 10 trained networks, amounting to 100,000 data samples, on which the SVM classifier was trained to provide stable predictions. It can be seen that the test accuracies of the stego networks are roughly equivalent to those of the original networks, even with up to 6000 embedded bits.
We experimented with values of the penalty term C ranging from $10^{-3}$ to $10^{3}$. Four kernel functions, including the Gaussian (RBF) kernel, the polynomial kernel, the linear kernel, and the Sigmoid kernel, were combined with these values to perform cross-validation. The detection accuracies of the SVM classifiers are presented in Figure 10. To ensure reliability, all figures were obtained from the average testing results of 300,000 samples. The results show that, whether the embedding amount is 100 bits, 600 bits, or up to 6000 bits, the SVM classifier tends to achieve acceptable detection accuracy across changes in the penalty term C, except with the Sigmoid kernel. Moreover, we found that the SVM model with the Gaussian kernel achieves superior detection performance compared to the other kernel functions. The detection accuracy curves of the Gaussian kernel have a convex shape, and the training and prediction time costs are relatively small at C = 1.0 or C = 10, as shown in Table 2. A large C value can lead to overfitting of the training samples. On balance, the Gaussian kernel was chosen as the kernel of the SVM model, and the value of C was set to 10 to achieve a satisfactory trade-off between detection accuracy and generalization performance.
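This kernel/C search can be reproduced with a standard grid search; the sketch below uses scikit-learn on stand-in data, and the averaging protocol of our actual experiments is not shown:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_trn = np.random.rand(1000, 10)            # stand-in feature vectors
y_trn = np.random.randint(0, 2, 1000)       # 0 = cover, 1 = stego

param_grid = {
    "C": [10.0**k for k in range(-3, 4)],             # 1e-3 ... 1e3
    "kernel": ["rbf", "poly", "linear", "sigmoid"],   # RBF = Gaussian kernel
}
search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1).fit(X_trn, y_trn)
print(search.best_params_)   # e.g., {'C': 10.0, 'kernel': 'rbf'} in our setting
```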
In a similar way, the kernel functions and values of C for the SVM classifiers of other kinds of networks were determined by means of cross-validation. The Gaussian kernel with C = 10 was selected as the kernel function for CapsNets on the CIFAR-10 dataset, and the Gaussian kernel with C = 0.1 was chosen for CapsNets on the ImageNet dataset. The Sigmoid kernel with C = 100 was used as the kernel function for AlexNet on the MNIST dataset, while the Gaussian kernel with C = 10 was used for AlexNet on the CIFAR-10 dataset and ImageNet dataset. For ResNets on the MNIST dataset, the Poly kernel with C = 10 was selected, and the Gaussian kernel with C = 100 and C = 1 was used for ResNets on the CIFAR-10 dataset and ImageNet dataset, respectively.
As for the ensemble classifier, the decision threshold was set to $N/2$, and $P_E$ was selected as the metric to evaluate the steganalyzer. The optimal values of the number of base learners $N$ and the dimensionality of the feature subspace $d_{sub}$ were automatically determined during ensemble training using an iterative algorithm [40]. For the sake of data consistency, the size of the testing set $N_{tst}$, namely the number of output vectors (features), was set to 10,000 in order to use all the testing samples of the image datasets, and the dimensionality of the feature vectors $d_{out}$, namely the number of output categories, was set to 10.

4.3. Dataset of Neural Networks

For the steganalysis of neural networks, there is no readily available dataset. As described in this section, we therefore built a diverse dataset of neural networks for steganalysis by training different types of neural networks on multiple image classification datasets as described above. Specifically, for the CapsNets trained on the MNIST, CIFAR-10, and ImageNet datasets, we trained the original CapsNets as cover networks and stego CapsNets with embedding capacities of 600, 1200, 1800, 2400, and 3000 bits, respectively. For the AlexNet trained on the MNIST, CIFAR-10, and ImageNet datasets, we trained the original AlexNet as the cover network and stego AlexNet with embedding capacities of 100, 200, 300, 400, and 500 bits, respectively. For the ResNets trained on the MNIST, CIFAR-10, and ImageNet datasets, we trained the original ResNets as cover networks and stego ResNets with embedding capacities of 150, 300, 450, 600, and 750 bits, respectively. During the training of these networks, we found that the image classification dataset to some extent determines the test accuracy of the original task, while the network structure determines the payload of secret information that can be hidden. In other words, neural networks tend to perform well on simple, easily classified datasets, while particular network structures and mechanisms provide large redundancy for additional data hiding. For example, CapsNets have large redundancy in their weight parameters due to the unique routing-by-agreement mechanism, which makes them well suited to additional data hiding. In our data hiding experiments, the maximum payloads hidden in the weight parameters of CapsNets, AlexNet, and ResNets were set to 3000 bits, 500 bits, and 750 bits, respectively, to ensure a consistent number of embedded bits for the same type of network trained on different datasets, enabling comparison under equal priors.
Furthermore, it is important to note that the main objective of training the stego networks is to ensure the successful embedding of all secret bits, rather than mainly focusing on training a better network with high test accuracy. Therefore, when training a large number of networks in practice, we considered multiple metrics such as detection accuracy, extraction error, and training time cost to determine the training details, including the number of training epochs, batch size, image preprocessing, and routing iterations. In total, there were 54 kinds of trained networks in the dataset, and the details of some networks are listed in Table 3. It is worth reiterating that, apart from the embedded secret information, the stego networks were kept consistent with the original networks as much as possible. Additionally, the stego networks were made as consistent as possible with each other to reduce inter-class differences.

4.4. Fixed Embedding Capacity

In this subsection, a complete experimental flow of steganalysis using the SVM classifier with the predetermined kernel function and C is presented. First, we investigated the cases where the embedding capacity is fixed. Given the SVM with determined hyperparameters, the classifier was trained on a training set randomly selected from the dataset of neural networks. The training set consists of 20,000 feature samples, composed of preprocessed output vectors of the cover networks and stego networks. The testing set contains data that were not included in the training set. Since the time consumption and memory requirements scale quadratically with the number of input samples, the testing results were averaged over 100,000 data samples in 10 independent testing batches. The detection accuracies of the SVM classifier for the case of a fixed number of embedded bits are listed in Table 4. From the table, it can be seen that most feature vectors from the 54 kinds of neural networks can be classified by a well-targeted SVM classifier, except for a few special cases. Among them, the experiments for AlexNet on the CIFAR-10 dataset achieve a satisfactory detection accuracy of 90%, and even above 93% for AlexNet on the ImageNet dataset. However, the detection accuracies of CapsNets-CIFAR-10-1800, CapsNets-CIFAR-10-2400, and CapsNets-CIFAR-10-3000 are not high enough, owing to compromises made in the accuracy of the network during training, which result in high randomness in the output vectors. Additionally, for ResNets on the CIFAR-10 dataset, the secret information is drowned out and tolerated as noise among the numerous parameters of the model when the number of embedded bits is comparatively small.
We set $C_{net} = 1000$ to determine the optimal thresholds. Without loss of generality, the testing set consists of 1000 cover CapsNets and 1000 stego CapsNets with 600 bits of additional data embedded, trained on the MNIST dataset as an example. Features were extracted from the networks and then fed into the well-fitted SVM classifier to obtain classification labels, the size of the data matrix changing from $\{ 1000, 10{,}000, 10 \}$ to $\{ 1000, 10{,}000 \}$. Each label vector represents a parent neural network. These 10,000-dimensional label vectors were evenly segmented into 100 segments, each of length 100, preserving their original order. The number of label "1"s in each segment was counted separately for the cover networks and stego networks; specifically, 1000 counts were calculated from the cover networks and 1000 counts from the stego networks for each segment index. The histograms and probability density curves of these counts were plotted, as shown in Figure 11.
The optimal threshold for each segment was determined by the abscissa of the intersection point of the two probability density curves, marked by a green triangle in Figure 11. The resulting optimal threshold vector $\Theta = [46.78, 43.41, \ldots, 30.76]$ was formed by concatenating these per-segment thresholds. To test the proposed steganalysis scheme, we trained a new set of 1000 previously unseen networks consisting of 500 cover networks and 500 stego networks. The number of label "1"s in each segment of these new networks was counted and compared to the optimal threshold at the same index in $\Theta$, using a majority voting method with a threshold of 50. Finally, 992 of the 1000 networks were correctly classified, yielding a classification accuracy of 99.20% for the fixed embedded bits case.

4.5. Changing Embedding Capacities and Image Datasets

Furthermore, we extended our investigation to evaluate the performance of our scheme when the number of embedded bits, and even the image datasets, change. For the case of changing embedding capacities, we trained independent SVM classifiers for networks with a certain number of embedded bits and used them to predict networks with a different number of embedded bits. The training set contains 20,000 feature samples, and the testing results were averaged over 100,000 data samples from 10 networks. This provides a preliminary understanding of the ability of each independent SVM classifier to detect networks with varying embedding capacities. The detection accuracies for CapsNets on the MNIST dataset are given in Table 5.
Each row in the table represents an independent SVM classifier fitted by the row header. Overall, the classifiers are still able to perform well when facing changes in the number of embedded bits. However, as the number of embedded bits increases, it becomes more challenging for the classifier to distinguish stego features from cover features. Additionally, it is important to note that the classifier does not necessarily perform best when facing features with the same number of embedded bits as the data in the training set. Based on this prior knowledge, we trained a unified SVM classifier using a training set that contains a total of 100,000 feature samples from multiple networks, including CapsNets-MNIST-600, CapsNets-MNIST-1200, CapsNets-MNIST-1800, CapsNets-MNIST-2400, CapsNets-MNIST-3000, and plain-CapsNets-MNIST, arranged in alternating order. The testing results were averaged over 100,000 data points. Detection accuracies and time costs for model fitting and predicting are shown in Table 6.
Compared with independent classifiers, the unified SVM classifier that includes more kinds of networks in the training data achieves much better detection performance overall. In particular, for networks with 3000 bits embedded, the unified SVM classifier achieves an accuracy of 82.74%, which represents a remarkable improvement of up to 10.87%. Moreover, the accuracy improvements for 2400, 1800, and 1200 embedded bits are 9.58%, 6.69%, and 0.06%, respectively. These improvements reflect the fact that mixed feature samples interact with each other, comprehensively affecting the decision boundaries of the SVM classifier. However, the complexity of training the unified model increases quite rapidly, to 116.0371 s, compared to 4.4740 s for the case of a fixed number of embedded bits. Furthermore, the time used for model predicting also increases significantly. With the well-fitted unified SVM classifier, we randomly sampled a testing set consisting of 1000 cover CapsNets and 1000 stego CapsNets, with the embedding capacities of 600, 1200, 1800, 2400, and 3000 bits on the MNIST dataset, and calculated the optimal thresholds step by step. The histograms and probability density curves are plotted in Figure 12.
For the case of a changing number of embedded bits, the final optimal threshold vector $\Theta = [61.64, 55.63, \ldots, 45.64]$ was formed. In the classification results, 983 out of the 1000 testing networks were correctly classified, resulting in a classification accuracy of 98.30% for the proposed steganalysis scheme when the number of embedded bits changes. Furthermore, we removed the limitation of a single dataset to conduct experiments on cases with changing embedding capacities and image datasets. First, we trained independent SVM classifiers for CapsNets on the CIFAR-10 and ImageNet datasets, respectively. The training set contains 20,000 feature samples, and the testing results were averaged over 100,000 data samples from 10 networks that were not included in the training set. The detection accuracies for feature samples from CapsNets on the CIFAR-10 and ImageNet datasets are listed in Table 7 and Table 8, respectively.
As shown in Table 7 and Table 8, the detection accuracies of the independent SVM classifiers with a changing number of embedded bits for CapsNets on the CIFAR-10 dataset are unsatisfactory. A significant proportion of the accuracy figures, especially those above the leading diagonal in Table 7, are close to or even below 50%, meaning the independent classifiers perform no better than chance on the CIFAR-10 dataset. Additionally, the SVM classifier fitted by CapsNets-ImageNet-3000 also provides unsatisfactory detection results. Inspired by the improved accuracy of the unified model on the MNIST dataset, we trained the SVM classifier using an extensive training set for the case of changing embedding capacities and image datasets. However, the time complexity of SVM training and testing cannot be ignored. Therefore, we included only some networks in the extensive training set, based on their ability to classify feature samples with different embedding capacities, as shown in Table 5, Table 7 and Table 8. For CapsNets on the MNIST dataset, CapsNets-MNIST-1800, CapsNets-MNIST-1200, and CapsNets-MNIST-3000 were chosen in order of priority. For CapsNets on the CIFAR-10 dataset, CapsNets-CIFAR-10-2400, CapsNets-CIFAR-10-3000, and CapsNets-CIFAR-10-1200 were chosen in order of priority. For CapsNets on the ImageNet dataset, CapsNets-ImageNet-1200, CapsNets-ImageNet-2400, and CapsNets-ImageNet-1800 were chosen in order of priority. These networks were randomly selected from the dataset of neural networks and arranged in alternating order. We performed experiments to train the SVM classifier progressively by gradually increasing the size of the training set. First, we trained the SVM classifier using a training set consisting of 60,000 feature samples, including 30,000 cover features from plain-CapsNets-MNIST, plain-CapsNets-CIFAR-10, and plain-CapsNets-ImageNet, and 30,000 stego features from CapsNets-MNIST-1800, CapsNets-CIFAR-10-2400, and CapsNets-ImageNet-1200. The testing results were averaged over 100,000 data samples from 10 unseen networks. The detection accuracies achieved by the trained SVM classifier are presented in Table 9.
The time required for model fitting and prediction is 106.6453 s and 47.5058 s, respectively. Next, we trained the SVM classifier on training data consisting of 120,000 feature samples, with additional stego networks, including CapsNets-MNIST-1200, CapsNets-CIFAR-10-3000, and CapsNets-ImageNet-2400, included in the training set. The detection accuracies achieved by the trained SVM classifier are listed in Table 10.
For the larger training set, the time required for model fitting and prediction increased to 615.2571 s and 122.1739 s, respectively. Finally, we trained the SVM classifier using all chosen stego networks, adding CapsNets-MNIST-3000, CapsNets-CIFAR-10-1200, and CapsNets-ImageNet-1800 to the training data. The training set contains 180,000 feature samples in total, and the testing results were averaged over 100,000 feature samples from 10 networks. The detection accuracies and accuracy improvements achieved by this classifier are listed in Table 11. The time spent on model fitting and prediction was 1526.6490 s and 174.7290 s, respectively.
The accuracy improvements in Table 11 show that the trained SVM classifier outperforms the independent classifiers, especially for CapsNets trained on the CIFAR-10 dataset. Considering the complexity of changing embedding capacities and image datasets, this result is satisfactory. As the training set grows, the detection performance of the trained SVM classifier improves, but the time complexity of model training and testing also increases. High time complexity is prohibitive here, because our method requires predictions for a large number of networks, so there is a trade-off between detection performance and time cost. Weighing these factors, we chose the SVM classifier trained on 120,000 samples for label prediction. Following the same steps to calculate the optimal thresholds, the new testing set comprises the three kinds of cover networks and the fifteen kinds of stego networks listed in Table 10. The histograms and probability density curves for the case of changing embedding capacities and image datasets are plotted in Figure 13.
As a result, the final optimal threshold vector Θ = [49.70, 48.47, …, 45.17] was formed. In the classification results, 967 of the 1000 testing networks were correctly classified, giving a detection accuracy of 96.70% for our proposed steganalysis scheme in cases where both the number of embedded bits and the image dataset change.
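To make the symmetric threshold search concrete, the sketch below shows one way the per-segment decision thresholds could be derived from the histogram distributions. Assuming the per-segment counts of label "1" for cover and stego networks are approximately normal, each threshold is taken where the two fitted density curves intersect between the class means. The simulated counts, function names, and the majority-vote decision rule are illustrative assumptions rather than our exact procedure.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

rng = np.random.default_rng(0)
# Simulated per-segment counts of label "1" (placeholders for the counts
# produced by the trained classifier): cover networks tend to receive
# fewer "1" labels per segment than stego networks.
cover_seg = rng.normal(45.0, 3.0, size=(1000, 100))
stego_seg = rng.normal(55.0, 3.0, size=(1000, 100))

def optimal_threshold(cover_counts, stego_counts):
    """Fit a normal density to each histogram of label-'1' counts and
    return the intersection point between the two class means."""
    mu_c, sd_c = norm.fit(cover_counts)
    mu_s, sd_s = norm.fit(stego_counts)
    diff = lambda t: norm.pdf(t, mu_c, sd_c) - norm.pdf(t, mu_s, sd_s)
    return brentq(diff, min(mu_c, mu_s), max(mu_c, mu_s))

# One threshold per segment yields the optimal threshold vector Theta.
theta = np.array([optimal_threshold(cover_seg[:, k], stego_seg[:, k])
                  for k in range(cover_seg.shape[1])])

# A plausible symmetric decision rule: a network is declared stego when
# its per-segment counts exceed the thresholds in a majority of segments.
def is_stego(seg_counts, theta):
    return np.mean(seg_counts > theta) > 0.5
```

Because cover and stego networks pass through the same classification flow, the rule treats the two classes symmetrically: the same threshold vector separates them regardless of which side a network falls on.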

4.6. Steganalysis by Ensemble Classifiers

In addition to the SVM classifier, we also consider a more complex case using ensemble classifiers. We first attempted to train the SVM classifier on a larger training set of 600,000 cover and stego feature vectors, with a Gaussian kernel and C set to 0.01, 1, and 100, respectively; the accuracies of these classifiers on a validation set were 58.72%, 63.15%, and 65.37%. However, training took over 14 h, which is impractical for large-scale label prediction. We therefore turned to ensemble classifiers for their computational efficiency on large training sets. Ensemble results for the independent and unified models are similar to those reported above and are omitted here; we present the case of changing embedding capacities, image datasets, and network structures. By monitoring the "out-of-bag" estimate, the optimal dimensionality of the feature subspace d_sub and the ensemble scale N were determined automatically during each training run. The ensemble classifier was evaluated by the detection error P_E, averaged over 10 independently trained ensemble classifiers. The cover feature set for training consists of 180,000 vectors from CapsNets, AlexNet, and ResNets on the MNIST and CIFAR-10 datasets, while the stego feature set consists of 180,000 vectors output by CapsNets, AlexNet, and ResNets on the MNIST and CIFAR-10 datasets with a changing number of embedded bits. To avoid overfitting, we arranged the cover and stego features alternately. The detection accuracies of the trained ensemble classifier for the case of a fixed number of embedded bits and for the case of changing embedding capacities, image datasets, and network structures are listed in Table 12 and Table 13, respectively.
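For reference, the construction of such an ensemble can be sketched along the lines of Kodovský et al. [42]: each base learner is a Fisher linear discriminant trained on a random d_sub-dimensional feature subspace and a bootstrap sample, the final label is a majority vote, and the out-of-bag samples provide the error estimate used to tune d_sub and N. The code below is a simplified, self-contained illustration with synthetic data and arbitrary parameter values, not our exact implementation.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def train_ensemble(X, y, d_sub, N):
    """Train N FLD base learners on random feature subspaces and
    bootstrap samples; return the learners and the OOB error estimate."""
    n, d = X.shape
    learners = []
    oob_sum = np.zeros(n)   # accumulated OOB votes for class "stego"
    oob_cnt = np.zeros(n)   # number of times each sample was out-of-bag
    for _ in range(N):
        dims = rng.choice(d, size=d_sub, replace=False)  # random subspace
        boot = rng.choice(n, size=n, replace=True)       # bootstrap sample
        oob = np.setdiff1d(np.arange(n), boot)           # out-of-bag samples
        fld = LinearDiscriminantAnalysis().fit(X[boot][:, dims], y[boot])
        learners.append((dims, fld))
        oob_sum[oob] += fld.predict(X[oob][:, dims])
        oob_cnt[oob] += 1
    seen = oob_cnt > 0
    oob_pred = (oob_sum[seen] / oob_cnt[seen]) > 0.5     # OOB majority vote
    oob_error = np.mean(oob_pred != y[seen])             # estimate of P_E
    return learners, oob_error

def predict_ensemble(learners, X):
    # Final label is a majority vote over all base learners.
    votes = np.mean([fld.predict(X[:, dims]) for dims, fld in learners], axis=0)
    return (votes > 0.5).astype(int)

# Demo on synthetic two-class features (illustrative only).
X = np.vstack([rng.normal(0.0, 1.0, (500, 20)), rng.normal(0.5, 1.0, (500, 20))])
y = np.r_[np.zeros(500, dtype=int), np.ones(500, dtype=int)]
learners, err = train_ensemble(X, y, d_sub=8, N=51)
print(f"OOB error estimate: {err:.3f}")
```

In practice, d_sub and N would be increased until the out-of-bag error stops improving, mirroring the automatic parameter search described above.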
Comparing Table 12 and Table 13, the high accuracies in Table 12 suggest that the ensemble classifiers detect the embedding changes effectively in most cases. However, once the training set becomes sufficiently complex, containing cover and stego features from different networks trained on different image datasets with different embedding capacities, the detection accuracies drop significantly and some cases become unclassifiable. One likely reason is that the differences between image datasets and network structures outweigh the differences caused by the embedding changes, particularly when the embedding amount is small. In addition, because each network is trained from different initialization values, inter-class differences are further increased to some extent. Improving the universality of the proposed scheme across different network structures is left for further study. Given 1000 cover networks and 1000 stego networks, the classification labels were generated to determine the optimal thresholds, resulting in a total accuracy of 60.76% for feature vectors. The histograms and probability density curves of the numbers of label "1" are shown in Figure 14.
Similarly, we obtained the final optimal threshold vector Θ = [43.18, 41.74, …, 39.17]. For testing, another 1000 networks, consisting of 500 cover networks and 500 stego networks, were randomly sampled from the dataset of networks, which includes networks trained on the ImageNet dataset. In the final classification results, 705 of the 1000 testing networks were correctly classified, achieving a detection accuracy of 70.50% for the proposed steganalysis scheme of neural networks.
In summary, we conducted the experiments progressively, reporting the detection accuracies of feature vectors as well as the classification results of neural networks, from simple cases to complex ones. For the case of a fixed number of embedded bits, we achieved a satisfactory classification accuracy of 99.20%. For a changing number of embedded bits, we first measured the detection accuracies of the independent SVM classifiers to establish the capabilities of each. Building on this, we trained the unified SVM classifier on a more extensive training set, yielding an accuracy improvement of up to 10.87% and a classification accuracy of 98.30% for the case of a changing number of embedded bits. For the case where both the number of embedded bits and the image dataset change, we trained multiple SVM classifiers on datasets of different sizes and selected the best one by weighing detection performance against time complexity, finally achieving a classification accuracy of 96.70%. However, as the training set grows, the computational efficiency of the SVM classifier degrades greatly, making it unsuitable for large-scale label prediction. We therefore turned to the ensemble classifier for the most complex case, with changing embedding capacities, image datasets, and network structures; using the same symmetric method to determine the optimal thresholds, we achieved a classification accuracy of 70.50% for the proposed steganalysis scheme of neural networks.

5. Conclusions

In this paper, we present a new steganalysis scheme for deep neural networks trained for image classification tasks. The proposed method detects the presence of secret data in widely distributed deep image classification networks, which can prevent neural networks from being exploited to transmit secret data. We built a diverse network dataset consisting of neural networks with different structures, trained on different image datasets and embedded with different numbers of secret bits. Feature vectors for steganalysis are extracted directly from the outputs of the neural networks, so no access to the internal elements of the host network is required, which makes the method more practical than white-box methods in many scenarios. A well-designed histogram distribution method is proposed to find the optimal decision thresholds in a highly symmetrical manner, which is robust to interference from the original classification task. By progressively relaxing the conditional restrictions on stego networks, experimental results using SVM and ensemble classifiers show that the presence of additional data can be detected in most cases.
For future work, we propose developing a deep learning-based steganalysis scheme for neural networks that can automatically learn patterns caused by additional data to form a more universal framework for various deep networks. Additionally, exploring the exact location of additional data in model parameters can be another avenue of research.

Author Contributions

Conceptualization, Z.W. and X.T.; methodology, X.T.; software, X.T.; validation, X.T.; formal analysis, X.T.; investigation, X.T.; resources, Z.W. and X.Z.; data curation, Z.W.; writing—original draft preparation, X.T.; writing—review and editing, Z.W.; visualization, X.T.; supervision, X.Z.; project administration, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of China under grants 62002214 and U1936214, and by the Chenguang Program of the Shanghai Education Development Foundation and Shanghai Municipal Education Commission under grant 22CG49.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shehab, D.A.; Alhaddad, M.J. Comprehensive Survey of Multimedia Steganalysis: Techniques, Evaluations, and Trends in Future Research. Symmetry 2022, 14, 117. [Google Scholar] [CrossRef]
  2. Chan, C.K.; Cheng, L.M. Hiding data in images by simple LSB substitution. Pattern Recognit. 2004, 37, 469–474. [Google Scholar] [CrossRef]
  3. Mielikainen, J. LSB matching revisited. IEEE Signal Process. Lett. 2006, 13, 285–287. [Google Scholar] [CrossRef]
  4. Pevný, T.; Filler, T.; Bas, P. Using high-dimensional image models to perform highly undetectable steganography. In Proceedings of the International Workshop on Information Hiding; Springer: Berlin/Heidelberg, Germany, 2010; pp. 161–177. [Google Scholar]
  5. Holub, V.; Fridrich, J. Designing steganographic distortion using directional filters. In Proceedings of the 2012 IEEE International Workshop on Information Forensics and Security (WIFS), Costa Adeje, Spain, 2–5 December 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 234–239. [Google Scholar]
  6. Holub, V.; Fridrich, J. Digital image steganography using universal distortion. In Proceedings of the 1st ACM Workshop on Information Hiding and Multimedia Security, Montpellier, France, 17–19 June 2013; pp. 59–68. [Google Scholar]
  7. Li, B.; Tan, S.; Wang, M.; Huang, J. Investigation on cost assignment in spatial image steganography. IEEE Trans. Inf. Forensics Secur. 2014, 9, 1264–1277. [Google Scholar] [CrossRef]
  8. Filler, T.; Judas, J.; Fridrich, J. Minimizing additive distortion in steganography using syndrome-trellis codes. IEEE Trans. Inf. Forensics Secur. 2011, 6, 920–935. [Google Scholar] [CrossRef]
  9. Tang, W.; Tan, S.; Li, B.; Huang, J. Automatic steganographic distortion learning using a generative adversarial network. IEEE Signal Process. Lett. 2017, 24, 1547–1551. [Google Scholar] [CrossRef]
  10. Zhu, J.; Kaplan, R.; Johnson, J.; Li, F.-F. Hidden: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
  11. Lin, J.; Chang, C.-C.; Horng, J.-H. Asymmetric Data Hiding for Compressed Images with High Payload and Reversibility. Symmetry 2021, 13, 2355. [Google Scholar] [CrossRef]
  12. Baluja, S. Hiding images within images. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 1685–1697. [Google Scholar] [CrossRef] [PubMed]
  13. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet classification with deep convolutional neural networks. In Proceedings of the Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105. [Google Scholar]
  14. Wu, X.; Chen, Q.; You, J.; Xiao, Y. Unconstrained offline handwritten word recognition by position embedding integrated ResNets model. IEEE Signal Process. Lett. 2019, 26, 597–601. [Google Scholar] [CrossRef]
  15. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 3856–3866. [Google Scholar]
  16. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative adversarial networks: An overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  17. Uchida, Y.; Nagai, Y.; Sakazawa, S.; Satoh, S. Embedding watermarks into deep neural networks. In Proceedings of the ACM on International Conference Multimedia Retrieval, Bucharest, Romania, 6–9 June 2017; pp. 269–277. [Google Scholar]
  18. Wang, Z.; Feng, G.; Wu, H.; Zhang, X. Data Hiding in Neural Networks for Multiple Receivers. IEEE Comput. Intell. Mag. 2021, 16, 70–84. [Google Scholar] [CrossRef]
  19. Wang, J.; Wu, H.; Zhang, X.; Yao, Y. Watermarking in deep neural networks via error back-propagation. Electron. Imag. 2020, 2020, 1–22. [Google Scholar] [CrossRef]
  20. Zhang, J.; Gu, Z.; Jang, J.; Wu, H. Protecting intellectual property of deep neural networks with watermarking. In Proceedings of the 2018 Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, 4–8 June 2018; pp. 159–172. [Google Scholar]
  21. Adi, Y.; Baum, C.; Cisse, M.; Pinkas, B.; Keshet, J. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018; pp. 1615–1631. [Google Scholar]
  22. Merrer, E.L.; Perez, P.; Trédan, G. Adversarial frontier stitching for remote neural network watermarking. Neural Comput. Appl. 2020, 32, 9233–9244. [Google Scholar] [CrossRef]
  23. Wu, H.; Liu, G.; Yao, Y.; Zhang, X. Watermarking neural networks with watermarked images. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 2591–2601. [Google Scholar] [CrossRef]
  24. Fridrich, J.; Soukal, D. Matrix embedding for large payloads. IEEE Trans. Inf. Forensics Secur. 2006, 1, 390–395. [Google Scholar] [CrossRef]
  25. Wang, Z.; Feng, G.; Zhang, X. Repeatable Data Hiding: Towards the Reusability of Digital Images. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 135–146. [Google Scholar] [CrossRef]
  26. Tao, J.; Li, S.; Zhang, X.; Wang, Z. Towards Robust Image Steganography. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 594–600. [Google Scholar] [CrossRef]
  27. Simmons, G.J. The prisoners’ problem and the subliminal channel. In Advance in Cryptology; Springer: New York, NY, USA, 1984; pp. 51–67. [Google Scholar]
  28. Chandramouli, R.; Memon, N. Analysis of LSB based image steganography techniques. In Proceedings of the 2001 International Conference on Image Processing (Cat. No.01CH37205), Thessaloniki, Greece, 7–10 October 2001; Volume 3, pp. 1019–1022. [Google Scholar] [CrossRef]
  29. Fridrich, J.; Goljan, M.; Soukal, D. Higher-order statistical steganalysis of palette images. In Proceedings of the SPIE Security and Watermarking of Multimedia Contents V, Santa Clara, CA, USA, 20 January 2003; Volume 5020, pp. 131–142. [Google Scholar]
  30. Ker, A.D. Steganalysis of LSB matching in grayscale images. IEEE Signal Process. Lett. 2005, 12, 441–444. [Google Scholar] [CrossRef]
  31. Tang, W.; Li, H.; Luo, W.; Huang, J. Adaptive Steganalysis against WOW Embedding Algorithm; ACM: New York, NY, USA, 2014. [Google Scholar]
  32. Shi, Y.Q.; Chen, C.; Chen, W. A Markov Process Based Approach to Effective Attacking JPEG Steganography. In Proceedings of the Information Hiding, 8th International Workshop, IH 2006, Alexandria, VA, USA, 10–12 July 2006. [Google Scholar]
  33. Pevny, T.; Fridrich, J. Multiclass Detector of Current Steganographic Methods for JPEG Format. IEEE Trans. Inf. Forensics Secur. 2008, 3, 635–650. [Google Scholar] [CrossRef]
  34. Fridrich, J.; Kodovsky, J. Rich Models for Steganalysis of Digital Images. IEEE Trans. Inf. Forensics Secur. 2012, 7, 868–882. [Google Scholar] [CrossRef]
  35. Feng, G.; Zhang, X.; Ren, Y.; Qian, Z.; Li, S. Diversity-Based Cascade Filters for JPEG Steganalysis. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 376–386. [Google Scholar] [CrossRef]
  36. Xu, G.; Wu, H.-Z.; Shi, Y.-Q. Structural Design of Convolutional Neural Networks for Steganalysis. IEEE Signal Process. Lett. 2016, 23, 708–712. [Google Scholar] [CrossRef]
  37. Yedroudj, M.; Comby, F.; Chaumont, M. Yedroudj-Net: An efficient CNN for spatial steganalysis. arXiv 2018, arXiv:1803.00407. [Google Scholar]
  38. Boroumand, M.; Chen, M.; Fridrich, J. Deep Residual Network for Steganalysis of Digital Images. IEEE Trans. Inf. Forensics Secur. 2019, 14, 1181–1193. [Google Scholar] [CrossRef]
  39. Zhang, R.; Zhu, F.; Liu, J.; Liu, G. Depth-Wise Separable Convolutions and Multi-Level Pooling for an Efficient Spatial CNN-Based Steganalysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 1138–1150. [Google Scholar] [CrossRef]
  40. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef] [PubMed]
  41. Suykens, J.A.K.; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300. [Google Scholar] [CrossRef]
  42. Kodovsky, J.; Fridrich, J.; Holub, V. Ensemble Classifiers for Steganalysis of Digital Media. IEEE Trans. Inf. Forensics Secur. 2012, 7, 432–444. [Google Scholar] [CrossRef]
  43. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
  44. Bas, P.; Filler, T.; Pevný, T. “Break our steganographic system”: The ins and outs of organizing BOSS. In Proceedings of the International Workshop on Information Hiding; Springer: Berlin/Heidelberg, Germany, 2011; pp. 59–70. [Google Scholar]
  45. Bas, P.; Furon, T. BOWS-2 Contest (Break Our Watermarking System). In Proceedings of the European Network of Excellence ECRYPT, Virtual, 17 July 2007–17 April 2008. [Google Scholar]
  46. LeCun, Y.; Cortes, C.; Burges, C.J. The MNIST Database of Handwritten Digits. 1998. Available online: http://yann.lecun.com/exdb/mnist/ (accessed on 11 May 2023).
  47. Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. Technical Report, CIFAR. 2009. Available online: https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf (accessed on 11 May 2023).
  48. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
Figure 1. Various types of neural networks for different intelligent tasks.
Figure 2. Neural networks have a deep network structure and a large number of parameters, for example, ResNet-34 and ResNet-50.
Figure 3. Additional data are embedded into the neural network during the process of training.
Figure 4. Flow of traditional image steganalysis methods.
Figure 5. Construction of the proposed ensemble classifier.
Figure 6. Parameter distributions of cover and stego CapsNets: (a) cover c_{ij}; (b) cover s_j; (c) cover u_i; (d) cover û_{j|i}; (e) cover v_j; (f) cover w_{ij}; (g) stego c_{ij}; (h) stego s_j; (i) stego u_i; (j) stego û_{j|i}; (k) stego v_j; (l) stego w_{ij}.
Figure 7. Label segmentation and determining the optimal classification threshold in a highly symmetrical manner.
Figure 8. Architecture of AlexNet for the MNIST dataset.
Figure 9. Architecture of CapsNets for the MNIST dataset.
Figure 10. Detection accuracies of SVM with different values of C and kernel functions: (a) 100 embedded bits; (b) 600 embedded bits; (c) 6000 embedded bits.
Figure 11. Histograms and fit density curves of segments for the case of fixed embedded bits: (a) Seg1; (b) Seg2; … (c) Seg100.
Figure 12. Histograms and fit density curves of segments for the case of changing number of embedded bits: (a) Seg1; (b) Seg2; … (c) Seg100.
Figure 13. Histograms and fit density curves of segments for the case of changing embedding capacities and image datasets: (a) Seg1; (b) Seg2; … (c) Seg100.
Figure 14. Histograms and fit density curves of segments for the case of changing embedding capacities, image datasets, and network structures: (a) Seg1; (b) Seg2; … (c) Seg100.
Table 1. Test accuracies of plain networks and CapsNets on MNIST with embedding capacities of 100, 600, and 6000, and training time costs.

| Networks | Test Accuracy (%) | Training Time Cost (s) |
|---|---|---|
| plain-CapsNets-MNIST | 98.58 | 27.3654 |
| CapsNets-MNIST-100 | 98.25 | 29.7114 |
| CapsNets-MNIST-600 | 98.13 | 33.0454 |
| CapsNets-MNIST-6000 | 98.38 | 67.8747 |
Table 2. Time cost for SVM fitting and predicting with different parameters.

Time for Model Fitting (s)

| Kernel | C = 0.001 | 0.01 | 0.1 | 1.0 | 10 | 100 | 1000 |
|---|---|---|---|---|---|---|---|
| Linear kernel | 9.6312 | 9.5247 | 7.7596 | 6.8128 | 7.7702 | 13.1937 | 50.2388 |
| Poly kernel | 10.1259 | 7.4181 | 4.7071 | 4.0531 | 6.1947 | 23.6238 | 224.9755 |
| Gaussian (RBF) kernel | 15.0295 | 14.0100 | 8.8310 | 5.4564 | 4.4979 | 6.6971 | 22.2775 |
| Sigmoid kernel | 15.6790 | 16.3076 | 14.9520 | 14.1250 | 14.0520 | 13.9231 | 13.8807 |

Time for Model Predicting (s)

| Kernel | C = 0.001 | 0.01 | 0.1 | 1.0 | 10 | 100 | 1000 |
|---|---|---|---|---|---|---|---|
| Linear kernel | 7.4459 | 6.4646 | 4.7664 | 4.0588 | 3.9461 | 3.9244 | 3.9447 |
| Poly kernel | 7.1663 | 4.7006 | 3.0290 | 2.2568 | 1.9474 | 1.8184 | 1.9649 |
| Gaussian (RBF) kernel | 35.8621 | 31.7887 | 18.6460 | 11.6413 | 8.0697 | 26.5473 | 45.7309 |
| Sigmoid kernel | 12.7003 | 11.1387 | 8.6828 | 8.4397 | 8.5023 | 8.3859 | 8.3899 |
Table 3. Test accuracies of different kinds of neural networks and training time cost.

| Networks | Test Accuracy (%) | Training Time Cost (s) | Extraction Error |
|---|---|---|---|
| plain-CapsNets-MNIST | 98.82 | 26.7196 | / |
| CapsNets-MNIST-3000 | 98.66 | 47.7244 | 0.0000 |
| plain-CapsNets-ImageNet | 60.42 | 57.3742 | / |
| CapsNets-ImageNet-3000 | 59.61 | 63.5359 | 0.0000 |
| plain-ResNets-MNIST | 96.97 | 52.6523 | / |
| ResNets-MNIST-750 | 96.77 | 64.0487 | 0.0000 |
| plain-ResNets-CIFAR-10 | 58.26 | 49.3478 | / |
| ResNets-CIFAR-10-750 | 57.83 | 61.2284 | 0.0000 |
| plain-AlexNet-MNIST | 96.55 | 18.6910 | / |
| AlexNet-MNIST-500 | 95.76 | 24.0367 | 0.0000 |
| plain-AlexNet-CIFAR-10 | 65.31 | 65.9686 | / |
| AlexNet-CIFAR-10-500 | 63.94 | 70.3321 | 0.0000 |
Table 4. Detection accuracies (%) by the predetermined SVM classifiers with the fixed embedding capacity.

| Networks on Datasets (Embedded Bits) | 600 | 1200 | 1800 | 2400 | 3000 |
|---|---|---|---|---|---|
| CapsNets + MNIST | 91.16 | 90.14 | 83.47 | 80.83 | 78.83 |
| CapsNets + CIFAR-10 | 68.05 | 60.38 | 58.77 | 57.14 | 55.75 |
| CapsNets + ImageNet | 76.74 | 71.65 | 64.81 | 65.48 | 59.12 |
| (Embedded Bits) | 100 | 200 | 300 | 400 | 500 |
| AlexNet + MNIST | 67.96 | 68.73 | 68.92 | 66.84 | 69.33 |
| AlexNet + CIFAR-10 | 90.83 | 90.87 | 91.00 | 91.96 | 90.92 |
| AlexNet + ImageNet | 93.89 | 94.66 | 95.03 | 94.70 | 94.63 |
| (Embedded Bits) | 150 | 300 | 450 | 600 | 750 |
| ResNets + MNIST | 75.04 | 76.35 | 78.13 | 80.91 | 82.69 |
| ResNets + CIFAR-10 | 57.12 | 67.11 | 72.61 | 69.32 | 70.92 |
| ResNets + ImageNet | 71.35 | 73.52 | 76.65 | 77.02 | 78.66 |
Table 5. Detection accuracies of independent SVM classifiers with changing number of embedded bits.

| | CapsNets-MNIST-600 | CapsNets-MNIST-1200 | CapsNets-MNIST-1800 | CapsNets-MNIST-2400 | CapsNets-MNIST-3000 |
|---|---|---|---|---|---|
| CapsNets-MNIST-600 | 92.03% | 90.14% | 79.24% | 77.19% | 71.87% |
| CapsNets-MNIST-1200 | 91.91% | 90.61% | 81.49% | 80.79% | 75.52% |
| CapsNets-MNIST-1800 | 88.51% | 88.80% | 83.95% | 84.02% | 79.06% |
| CapsNets-MNIST-2400 | 83.81% | 83.22% | 80.20% | 81.07% | 77.77% |
| CapsNets-MNIST-3000 | 85.49% | 83.75% | 79.15% | 80.66% | 78.83% |
Table 6. Detection accuracies and time costs of model fitting and predicting by the unified SVM classifiers with changing number of embedded bits.

| Case | Fitting Time Cost (s) | Embedded Bits | Testing Accuracy (%) | Predicting Time Cost (s) |
|---|---|---|---|---|
| fixed embedding capacity | 4.4740 | / | / | 8.4185 |
| changing embedding capacities | 116.0371 | 600 | 90.94 | 39.1952 |
| | | 1200 | 90.67 | 38.6721 |
| | | 1800 | 85.84 | 39.1012 |
| | | 2400 | 86.77 | 39.2150 |
| | | 3000 | 82.74 | 39.0934 |
Table 7. Detection accuracies (%) of independent SVM classifiers with a changing number of embedded bits for CapsNets on the CIFAR-10 dataset.

| | CapsNets-CIFAR-10-600 | CapsNets-CIFAR-10-1200 | CapsNets-CIFAR-10-1800 | CapsNets-CIFAR-10-2400 | CapsNets-CIFAR-10-3000 |
|---|---|---|---|---|---|
| CapsNets-CIFAR-10-600 | 68.05 | 53.63 | 51.56 | 49.84 | 51.06 |
| CapsNets-CIFAR-10-1200 | 61.66 | 60.24 | 52.73 | 52.38 | 54.73 |
| CapsNets-CIFAR-10-1800 | 50.44 | 44.85 | 46.16 | 43.90 | 44.03 |
| CapsNets-CIFAR-10-2400 | 63.61 | 61.64 | 58.05 | 58.77 | 57.69 |
| CapsNets-CIFAR-10-3000 | 64.37 | 56.24 | 52.73 | 55.21 | 55.75 |
Table 8. Detection accuracies (%) of independent SVM classifiers with a changing number of embedded bits for CapsNets on the ImageNet dataset.

| | CapsNets-ImageNet-600 | CapsNets-ImageNet-1200 | CapsNets-ImageNet-1800 | CapsNets-ImageNet-2400 | CapsNets-ImageNet-3000 |
|---|---|---|---|---|---|
| CapsNets-ImageNet-600 | 76.74 | 71.62 | 63.01 | 62.24 | 55.39 |
| CapsNets-ImageNet-1200 | 68.82 | 71.65 | 66.67 | 66.59 | 60.95 |
| CapsNets-ImageNet-1800 | 66.70 | 69.51 | 64.81 | 61.24 | 57.74 |
| CapsNets-ImageNet-2400 | 73.05 | 70.65 | 65.33 | 65.48 | 56.27 |
| CapsNets-ImageNet-3000 | 52.06 | 55.25 | 53.17 | 52.42 | 59.12 |
Table 9. Detection accuracies (%) of changing embedding capacities and image datasets using the SVM classifier fitted by 60,000 data.

| Networks on Datasets | 600 Bits | 1200 Bits | 1800 Bits | 2400 Bits | 3000 Bits |
|---|---|---|---|---|---|
| CapsNets + MNIST | 90.78 (6.97%) | 91.46 (8.24%) | 85.93 (6.78%) | 85.80 (8.61%) | 79.86 (7.99%) |
| CapsNets + CIFAR-10 | 69.49 (19.05%) | 68.61 (23.76%) | 63.86 (17.7%) | 63.86 (19.96%) | 64.19 (20.16%) |
| CapsNets + ImageNet | 60.98 (8.92%) | 65.49 (10.24%) | 60.74 (7.57%) | 59.34 (6.92%) | 52.41 (−2.6%) |

Note that values in parentheses indicate the improvement in accuracy compared to the worst independent classifiers.
Table 10. Detection accuracies (%) of changing embedding capacities and image datasets using the SVM classifier fitted by 120,000 data.

| Networks on Datasets | 600 Bits | 1200 Bits | 1800 Bits | 2400 Bits | 3000 Bits |
|---|---|---|---|---|---|
| CapsNets + MNIST | 93.43 (9.62%) | 93.23 (10.01%) | 86.58 (7.43%) | 86.46 (9.27%) | 80.67 (8.8%) |
| CapsNets + CIFAR-10 | 72.16 (21.72%) | 67.38 (22.53%) | 60.27 (14.11%) | 62.61 (18.71%) | 62.31 (18.28%) |
| CapsNets + ImageNet | 66.96 (14.9%) | 69.71 (14.46%) | 65.45 (12.28%) | 64.56 (12.14%) | 56.51 (1.5%) |
Table 11. Detection accuracies (%) of changing embedding capacities and image datasets using the SVM classifier fitted by 180,000 data.

| Networks on Datasets | 600 Bits | 1200 Bits | 1800 Bits | 2400 Bits | 3000 Bits |
|---|---|---|---|---|---|
| CapsNets + MNIST | 94.22 (10.41%) | 93.37 (10.15%) | 88.06 (8.91%) | 88.65 (11.46%) | 84.38 (12.51%) |
| CapsNets + CIFAR-10 | 73.67 (23.23%) | 71.16 (26.31%) | 64.09 (17.93%) | 66.13 (22.23%) | 64.48 (20.45%) |
| CapsNets + ImageNet | 67.84 (15.78%) | 71.99 (16.74%) | 67.44 (14.27%) | 65.85 (13.43%) | 56.55 (1.54%) |
Table 12. Detection accuracies (%) by ensemble classifiers for the case of a fixed number of embedded bits.

| Networks on Datasets (Embedded Bits) | 600 | 1200 | 1800 | 2400 | 3000 |
|---|---|---|---|---|---|
| CapsNets + MNIST | 86.69 | 85.29 | 79.42 | 80.35 | 74.26 |
| CapsNets + CIFAR-10 | 58.80 | 59.89 | 52.61 | 59.26 | 49.58 |
| CapsNets + ImageNet | 65.24 | 62.95 | 58.14 | 55.19 | 54.77 |
| (Embedded Bits) | 100 | 200 | 300 | 400 | 500 |
| AlexNet + MNIST | 70.23 | 69.61 | 70.14 | 66.94 | 72.62 |
| AlexNet + CIFAR-10 | 66.34 | 66.65 | 72.78 | 74.88 | 70.81 |
| AlexNet + ImageNet | 92.67 | 92.76 | 92.74 | 92.05 | 92.15 |
| (Embedded Bits) | 150 | 300 | 450 | 600 | 750 |
| ResNets + MNIST | 79.40 | 76.31 | 78.54 | 78.37 | 78.79 |
| ResNets + CIFAR-10 | 50.60 | 57.02 | 58.31 | 56.18 | 56.98 |
| ResNets + ImageNet | 52.49 | 53.78 | 54.75 | 56.21 | 56.07 |
Table 13. Detection accuracies (%) by ensemble classifiers for the case of changing embedding capacities, image datasets, and network structures.

| Networks on Datasets (Embedded Bits) | 600 | 1200 | 1800 | 2400 | 3000 |
|---|---|---|---|---|---|
| CapsNets + MNIST | 69.16 | 70.98 | 67.80 | 67.28 | 61.52 |
| CapsNets + CIFAR-10 | 52.12 | 52.48 | 51.99 | 52.33 | 49.19 |
| CapsNets + ImageNet * | 53.89 | 53.73 | 52.27 | 51.98 | 50.46 |
| (Embedded Bits) | 100 | 200 | 300 | 400 | 500 |
| AlexNet + MNIST | 48.55 | 49.35 | 48.91 | 49.32 | 47.73 |
| AlexNet + CIFAR-10 | 72.53 | 70.73 | 73.36 | 73.82 | 71.47 |
| AlexNet + ImageNet * | 57.56 | 56.13 | 61.03 | 58.65 | 58.51 |
| (Embedded Bits) | 150 | 300 | 450 | 600 | 750 |
| ResNets + MNIST | 56.71 | 58.18 | 56.73 | 58.07 | 57.19 |
| ResNets + CIFAR-10 | 51.95 | 56.93 | 57.86 | 57.55 | 57.96 |
| ResNets + ImageNet * | 47.21 | 48.19 | 48.68 | 47.72 | 45.26 |

* Note that the testing set for evaluating the generalization performance of the fitted model includes all the feature vectors from CapsNets, AlexNet, and ResNets on the ImageNet dataset, which are not included in the training set.