1. Introduction
The importance of image security has grown due to its wide range of applications. Traditional cryptosystems often require significant time to encrypt large color images. As a result, cryptosystems with faster speeds and lower costs are being researched and developed. Recently, chaos-based image cryptosystems have garnered increasing attention for their lightweight nature and high efficiency [1, 2]. The initial value sensitivity, ergodicity, and periodic point density of chaotic systems ensure that they are locally random but globally bounded, making their output difficult to predict and meeting the confusion and diffusion requirements of a good cryptosystem. Much research focuses on chaotic systems, attractors, and chaotic sequences to achieve high security for image encryption [3, 4, 5]. Chaos-based image encryption algorithms can be categorized as either 1D chaos-based or higher-dimensional chaos-based. For instance, 1D chaotic systems with two seed maps are proposed in [6] to create a novel image encryption scheme, transforming the original image into different encrypted images with the same key. Following a similar approach, two different 1D chaotic maps are used in [7] to output sequences for encryption. A cosine-transform-based chaotic system is presented in [8] to scramble and diffuse the image. Although 1D chaotic systems are efficient, they have a small key space and lack complexity, making their orbits predictable, which can lead to security vulnerabilities [9, 10, 11]. High-dimensional chaotic systems, with more complex behavior and larger key spaces, provide greater security. In [12], a color image is converted into a two-dimensional matrix, which is scrambled using a combined DNA coding operation with a three-dimensional chaotic system and Fisher–Yates scrambling. Mansouri and Wang [13] improved the Arnold system by combining it with a shuffle operation to scramble and diffuse the image. The Lorenz system, a classical chaotic system, has been modified and widely used in chaos-based image encryption and communications [14, 15, 16]. In [17], the nonlinear term of the general Lorenz system was replaced by the sum of an exponential function and the square of a single variable. This new Lorenz system is then used to generate keys for scrambling image pixels, effectively resisting chosen plaintext attacks. In [18], a coupled chaotic system with complex dynamic behavior is proposed for image encryption, achieving higher security and speed.
DNA coding has been introduced in chaos-based color image encryption [19, 20, 21, 22] to enhance security. These schemes divide the image into three channels and transform them into matrices using DNA coding, then use the chaotic system-generated keys to diffuse the image. Ravichandran et al. proposed a two-level image encryption scheme based on the chaotic map and deoxyribonucleic acid (DNA) [23, 24]. In [25], a 4D cat map and elliptic curve ElGamal are used to encrypt color images, resulting in high resistance to known attacks. Numerous other chaos-based color image encryption schemes have also been proposed [26, 27], all demonstrating high security and robustness. However, chaos-based encryption schemes often require multiple pixel scans and sorting operations, leading to high computational complexity [28, 29, 30]. Furthermore, the binarization of chaotic systems introduces degradation, reducing security.
Deep learning is being used for image encryption due to its nonlinear structure, though it is still in its early stages. Ding and Zheng et al. proposed image encryption schemes based on GAN, cycle-GAN, and their variants [31, 32]. In GAN-based encryption, a set of encrypted images is used as hidden factors to train the network, which transforms plain images into cipher images. In [32], cycle-GAN is used to disguise plain images with cover images. In [33], plain images are diffused before being passed through the encryption model, where GAN is used as the encryption component. Wang et al. generated cipher images directly, without training a neural network, using scrambled DCT coefficients instead. Another method uses deep learning to generate secret keys [34, 35], achieving a high key space. However, convolution operations in deep learning without normalization cause pixel values to exceed 255, and normalization results in float values that cannot be displayed as images. Consequently, deep learning is typically used to generate control parameters for image encryption. In [36], the facial image of a person is employed to extract features using a convolutional neural network (CNN). These features are then used to control the sine logistic modulation map, which generates chaotic matrices for the encryption of CT images. Zhou et al. proposed an image encryption scheme based on a conditional generative adversarial network (CGAN) [37]. The primary image is encoded into two noise-like images, which are then used to generate a speckle pattern and trained with the primary image by the CGAN. Upon receiving the ciphertext of the two noise-like images, they are first decrypted and recombined into the speckle pattern. This speckle pattern is then input into the CGAN to output the corresponding original image. Panwar et al. summarized the latest deep learning-based image encryption methods, analyzing their advantages and possible vulnerabilities to attacks [38]. These studies show that deep learning-based image encryption schemes may be vulnerable to attacks common to deep learning models [38, 39], such as hidden factor leakage and network architecture exposure. Additionally, since the secret keys do not correlate with the plaintext image, they may be compromised by chosen plaintext attacks [40, 41].
From the above, we can conclude that chaos-based image encryption schemes, when combined with DNA coding or other nonlinear components, enhance security at the cost of increased complexity. To the best of our knowledge, no existing image encryption scheme employs the VGG network. Deep learning-based image encryption schemes, such as CNN-based encryption and CGAN-based encryption, may be vulnerable to attacks common to deep learning models. To address these security issues, we designed a lightweight VGG (LVGG) neural network based on VGG-16 [42], which offers high efficiency. We then proposed an LVGG-based image encryption scheme that combines the nonlinearity of deep learning models with the randomness of chaotic systems. The LVGG has fewer layers than the classical VGG. Since the VGG network achieves the same receptive field with smaller convolutional kernels and uses fewer parameters than other CNNs, the proposed LVGG can achieve high efficiency in image encryption.
Our contributions are as follows: (i) We propose an LVGG-based key seed generator that takes a plain image as input, where the LVGG with only 7 layers improves the efficiency of key seed generation. We design a novel 4D chaotic system with complex dynamic behavior, based on the Lorenz system, to generate the key seed. This key seed is used as part of the initial values for generating the secret key, correlating the plain image with the encryption process and enhancing resistance to chosen plaintext attacks. (ii) We design a dynamic substitution box (S-box) to scramble the pixels of the image, improving the encryption’s resistance to statistical attacks. (iii) A dynamic SC layer, along with a convolutional addition and modular operation, is dynamically generated for image encryption. The convolutional addition is applied to the image using a convolution kernel of size 1 × 2, followed by modulo 256 calculations, achieving high efficiency in confusion. Finally, the security and robustness of the proposed scheme are analyzed through simulation.
The remainder of this article is organized as follows. In Section 2, we introduce the VGG network and the Lorenz chaotic system. In Section 3, we present the design of the LVGG-based image encryption scheme. This includes the LVGG network, a novel 4D Lorenz-based chaotic system, and the LVGG-chaos-based pseudorandom generator. The dynamic S-box and SC layer, constructed by the secret key, are used to scramble and diffuse the image. Finally, a convolution kernel is designed for further encryption. In Section 4, we discuss the simulation and security analysis of the proposed scheme. Conclusions are drawn in Section 5.
3. The Proposed Image Encryption Scheme LVGG-IE
In this section, we design an LVGG neural network to generate a plaintext image-correlated secret key seed, which has higher efficiency than the classical VGG network. This secret key seed is then used as part of the initial values of the proposed chaotic system, with the other two initial values chosen randomly. The pseudorandom sequence generated by the chaotic system serves as the secret key for constructing the substitution box and for image encryption. Additionally, we design an SC layer using the secret key to further confuse the image. Finally, a convolutional layer with a kernel size of 1 × 2 is applied to the image’s pixel matrix. The details are as follows.
3.1. The LVGG-Based Key Seed Generator
To enhance image encryption against chosen plaintext attacks, the plaintext should be fed into the encryption process. We designed an LVGG-based key seed generator that takes a plain image as input, while the LVGG improves the efficiency of key seed generation. The neural network structure of the proposed LVGG is shown below.
In
Figure 1, the proposed LVGG neural network contains 7 layers and an input of the
image. The parameter
denotes the batch size, and the output is a vector with size two. The LVGG can classify the image with lower resource consumption and high speed. We trained the network model using a training set of
images and a test set of
. Both the training set and the test set contain 50% human images and 50% non-human images. All images in these sets were normalized and then resized to
. The number of epochs was set to 10, and the batch size was set to 32. The learning rate was optimized using the RMSprop algorithm, with a value of
. The network was trained for 10 epochs, achieving an accuracy of 0.788. The proposed LVGG is compared with the VGG 16 and VGG 19 of [
42] in
Table 2.
From
Table 2, we can see that the proposed LVGG requires only 20 s to train the model, which is one-fifteenth of the time needed by VGG 16 and VGG 19. This makes LVGG more efficient for image encryption, especially when the neural network needs to be retrained.
Since the LVGG network uses the softmax function for classification, any image can be input into the LVGG neural network to obtain a vector with two floating point values, and , where and are used as two key seeds. When , it outputs classification 1; otherwise, it outputs classification 0. Thus, the probability that is 78.8% if the input image belongs to a specified type. Although an attacker could correctly guess the type of input image with a probability of 78.8%, they cannot obtain the specific values of and . In other words, the proposed LVGG network can not only utilize certain features of the input image but also resist attacks targeting the neural network. However, the key seed needs to be transmitted to the receiver via a secure channel or public cryptosystem for each image, which increases the complexity of key management.
3.2. Pseudorandom Generator Based on LVGG and Chaos
To mitigate the degradation caused by the binarization of the chaotic system, we constructed a four-dimensional (4D) system by adding a new controller to the Lorenz system, which exhibits more complex dynamic behavior. The new 4D system is shown in Formula (2):
When
,
,
and
, the system becomes chaotic. Its phase diagram is shown in
Figure 2:
The Lyapunov exponents are
,
,
, and
for
. The Lyapunov dimension, which represents the complexity of the attractor, can be calculated using Formula (3).
where
is the
-th Lyapunov exponent, and
is the largest index that makes
. The Lyapunov dimension of the proposed scheme is 2.1229. From the above, it can be seen that the proposed 4D chaotic system has a larger chaotic range and more complex dynamic behavior than the Lorenz system, which effectively reduces the degradation of the quantization of the chaotic sequence.
Additionally, we designed a pseudorandom generator based on the LVGG and chaos, ensuring that the secret key is correlated with the plain image. In the pseudorandom sequence generation process, we use self-perturbation to minimize degeneration. The details are shown in
Figure 3.
The pseudorandom sequence generation process is as follows:
Step 1: Input the initial value (, , , ) into the chaotic system, where the output of LVGG and are added to and , respectively, while and are chosen randomly. The outputs of the chaotic system are denoted by , , , and . Discard the values of the first 200 iterations in each 10,000 iterations.
Step 2: Obtain the fractional part of the values , , , and by . Here, denotes the largest integer less than or equal to .
Step 3: Multiply
,
,
, and
by
and apply modulo 256 using Formula (4).
The binary sequence can be obtained by cascading , , , and , denoted by .
Step 4: If the iteration is a multiple of 10,000, compute the fractional part of
, denoted as
. Then, use it to disturb the chaotic system according to Formula (5). Otherwise, continue the iteration.
Since the length of the key generated in each iteration is 32 bits, the length of each of , , , and is 8 bits. The key generation process will stop after iterations when , where denotes the required key length. It will stop after iterations when , where is the remainder when is divided by , with and .
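The quantization in Steps 1 to 3 can be sketched as below. Several parts are assumptions made only for illustration: the classical 3D Lorenz system integrated with an explicit Euler step stands in for the proposed 4D system (so each iteration yields 24 bits rather than 32), the multiplier `scale` is a hypothetical value, and the self-perturbation of Step 4 is omitted.

```python
import math


def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit Euler step of the classical Lorenz system (stand-in only)."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dx * dt, y + dy * dt, z + dz * dt)


def keystream(initial_state, n_bytes, discard=200, scale=10 ** 10):
    """Quantize each chaotic output to a byte: floor(frac(v) * scale) mod 256."""
    state = tuple(initial_state)
    for _ in range(discard):              # Step 1: drop the transient iterations
        state = lorenz_step(state)
    out = []
    while len(out) < n_bytes:
        state = lorenz_step(state)
        for v in state:
            frac = v - math.floor(v)      # Step 2: fractional part
            out.append(int(frac * scale) % 256)   # Step 3: scale, then mod 256
    return bytes(out[:n_bytes])           # bytes cascaded into the key sequence


key_a = keystream((0.1, 1.0, 1.05), 64)
key_b = keystream((0.1 + 1e-6, 1.0, 1.05), 64)   # slightly perturbed seed
```

In the proposed scheme, two of the initial values would additionally be offset by the LVGG outputs, which is what ties the key sequence to the plain image.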
3.3. Design of the Dynamic S-Box
To achieve high efficiency, we design a dynamic substitution box (S-box). A sequence
with length 256 is selected from the secret key. This sequence is sorted in ascending order to obtain a new sequence
. The index of the element
in
is reshaped into a
matrix, which serves as the dynamic S-box. An example is shown in
Figure 4.
To transform the image pixels into cipher image pixels using the S-box, each pixel value of the plain image is divided into the left 4 bits and the right 4 bits. The decimal values derived from these bits represent the row and column numbers of the S-box, respectively.
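A minimal sketch of the dynamic S-box construction and lookup follows. It assumes that the "index" is the position of each sorted element in the original sequence (an argsort, with ties broken by position), and it uses a seeded generator as a stand-in for 256 bytes of the secret key. All function names are hypothetical.

```python
import random


def build_sbox(seq):
    """Indices that sort `seq` in ascending order, reshaped to a 16 x 16 matrix."""
    assert len(seq) == 256
    order = sorted(range(256), key=lambda i: seq[i])   # stable, so a permutation
    return [order[r * 16:(r + 1) * 16] for r in range(16)]


def substitute(pixel, sbox):
    """Left 4 bits select the row, right 4 bits select the column."""
    return sbox[pixel >> 4][pixel & 0x0F]


def invert(value, sbox):
    """Search the S-box for `value` and rebuild the pixel from its row and column."""
    for r in range(16):
        for c in range(16):
            if sbox[r][c] == value:
                return (r << 4) | c
    raise ValueError("value not found in S-box")


rng = random.Random(2024)                 # stand-in for key material
key_seq = [rng.randrange(256) for _ in range(256)]
sbox = build_sbox(key_seq)
```

Because the sorted indices form a permutation of 0 to 255 even when the key bytes repeat, the substitution is always invertible.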
3.4. LVGG-IE-Based Image Encryption Scheme
In this section, we present the proposed image encryption scheme, LVGG-IE, which consists of substitution via the S-box, permutation using the SC layer, convolutional addition, and modular operations. The encryption model is shown in
Figure 5.
In this encryption model, the sub-keys are separated from the secret key by
. For an image of size
, the lengths of these sub-keys are
, 256,
,
, 2,
,
,
, and 2, respectively. These sub-keys are generated by the pseudorandom generator designed in
Section 3.2. Consequently, the secret key for the proposed image encryption scheme consists of the key seed and the other four initial inputs for the proposed chaotic system. Thus, the secret key comprises 6 real numbers, requiring only 384 bits. The details of the image encryption are as follows:
Step 1: First, permute the pixels of the plain image using
. Then, construct the dynamic S-box as described in
Figure 4 using
. Divide each pixel of the image into left 4 bits and right 4 bits, where the decimal values derived from these bits represent the row and column numbers of the S-box, respectively. Next, substitute all the pixels of the image with the corresponding values from the S-box.
Step 2: Perform modular addition for each bit-shifted pixel of the resulting image and the key matrix generated by
. Generate the SC layer by
, and permute each column of the image using the SC layer, as shown in
Figure 6.
Step 3: Generate the convolution kernel
by Formula (6):
Then, transform the
image matrix
into one-dimension vectors
in column order. Perform the convolutional addition operation using the convolution kernel
. The pixel value is obtained by applying modulo 256. The details of the convolutional addition are shown in
Figure 7.
The different dotted boxes in
Figure 7 denote the different convolutional units. The cipher pixel can be calculated by Formula (7):
Step 4: Transform the one-dimension vectors
into the
image matrix
. Apply modular addition to each row of
using Formula (8):
Step 5: Apply modular addition to each column of using Formula (9).
The process is shown in
Figure 8, in which the different dotted boxes denote the different modular addition units:
Step 6: Generate the SC layer by
and permute each row of the image by the SC layer. The operation is similar to that in
Figure 6, except that the input is now each row of the image.
Step 7: Generate the convolution kernel
by
and encrypt the pixels of the image by convolutional addition and modulo operation, as shown in
Figure 7.
The encryption process is detailed in Algorithm 1.
Algorithm 1. The image encryption algorithm |
Input: ,,,, Output: |
1: | //Obtain the index of the sorted sequence of |
2: | //Obtain the index of the sorted sequence of |
3: | //Generate the S-box by the index value |
4: | //Reshape the plain image to one dimension vector |
5: | for
do |
6: | //Shuffle the plain image by the index |
7: | //Substitute the value by S-box |
8: | end |
9: | |
10: | |
11: | do //Permute the image by SC layer |
12: | , |
13: | |
14: | end |
15: | //Transform the image matrix to one-dimension vector |
16: | + 2()*,256)//Encrypt image by convolutional addition and modulo |
17: | for do |
18: | + 2()*,256) |
19: | end |
20: | //Transform the one-dimension vector to image matrix |
21: | //modular addition on row |
22: | for do |
23: | + ,256) |
24: | end |
25: | //modular addition on column |
26: | for do |
27: | + ,256) |
28: | end |
29: | do //Permute the image by SC layer |
30: | , |
31: | |
32: | end |
33: | //Transform the image matrix to one-dimension vector |
34: | + 2()*,256)//Encrypt image by convolutional addition and modular |
35: | for do |
36: | + 2()*,256) |
37: | end |
38: | //Transform the one-dimension vector to image matrix |
39: | return |
The decryption is the inverse of the encryption. The differences are as follows:
First, the inverse of convolutional addition and modular operations simply requires changing the addition operation to subtraction. The inverse of the SC layer follows the same steps as the decryption process shown in
Figure 5. Second, the bit shift XOR operation is modified to
. The modular addition of rows or columns is changed to modular subtraction
. Third, the inverse of the S-box is performed by searching for the value of each pixel in the cipher image within the S-box, obtaining its row and column values, and converting them into two 4-bit sequences. These are combined and then converted to a decimal value, which is the decrypted value from the inverse S-box.
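The exact kernel and cipher-pixel expressions are those of Formulas (6) and (7). The sketch below instead assumes one simple chained form of a 1 × 2 convolutional addition, in which a kernel `(w0, w1)` is applied to the pair (previous cipher pixel, key byte) and the result is added to the plain pixel modulo 256. The form, the `iv` start value, and all names are assumptions, chosen only to show why inversion needs nothing more than replacing addition by subtraction.

```python
def conv_add_encrypt(pixels, key, kernel, iv=0):
    """Assumed form: c[i] = (p[i] + w0 * c[i-1] + w1 * k[i]) mod 256."""
    w0, w1 = kernel
    cipher = []
    prev = iv
    for p, k in zip(pixels, key):
        c = (p + w0 * prev + w1 * k) % 256
        cipher.append(c)
        prev = c                      # chaining spreads a change to later pixels
    return cipher


def conv_add_decrypt(cipher, key, kernel, iv=0):
    """Inverse of the assumed form: the additions become subtractions."""
    w0, w1 = kernel
    plain = []
    prev = iv
    for c, k in zip(cipher, key):
        plain.append((c - w0 * prev - w1 * k) % 256)
        prev = c
    return plain


plain = [12, 200, 7, 255, 0, 91]
key = [33, 5, 250, 17, 64, 128]       # illustrative key bytes
cipher = conv_add_encrypt(plain, key, kernel=(3, 5))
```

With an odd `w0`, a change in the first plain pixel alters every subsequent cipher pixel in this assumed form, which is the diffusion effect the convolutional addition is meant to provide.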
4. Simulation and Security Analysis
The security of the proposed color image encryption scheme is analyzed in terms of key randomness, key sensitivity, the histogram of the cipher image, the correlation coefficient of adjacent pixels, and information entropy. The scheme’s ability to resist differential attacks, data loss attacks, and noise attacks is also simulated. Since the image “Lena” is not recommended by many journals, we replaced it with the image “Peppers,” which shares similar feature space characteristics [
45].
4.1. Randomness of the Key
The randomness of the key generated by the LVGG and chaos-based pseudorandom sequence generator in
Section 3.2 is tested with the NIST SP 800-22 statistical test suite. The initial value for the image “Peppers” is
. The result is shown in
Table 3.
From
Table 3, we can see that the key generated by the proposed pseudorandom sequence generator passes all tests, with 10 items passing at a proportion of 100%. This indicates good randomness for encryption. When the same method is applied to the Lorenz system, however, the pseudorandom sequence based on Lorenz fails the SP 800-22 tests. A comparison of the two sequences is shown in Table 3, where most p-values for the Lorenz sequence are less than 0.05, which does not meet the SP 800-22 test requirements. Therefore, the Lorenz-based pseudorandom sequence is insecure for image encryption.
4.2. Security and Efficiency Analysis of the Dynamic S-Box and SC Layer
To validate the security of the dynamic S-box, we test its nonlinearity and strict avalanche criterion (SAC) in this section. We also analyze the security of the dynamic SC layer.
4.2.1. Nonlinearity Test
We generate 10,000 S-boxes using the method described in
Section 3.3 and calculate their nonlinearity using Formula (10):
Here, the cyclic spectrum of function
is denoted as
, where
is the dot product of
and
. The nonlinearity values are shown in
Figure 9.
In
Figure 9, the number of S-boxes with a nonlinearity greater than 110 exceeds half of the total. Since the nonlinearity of the S-box in AES is 110, the proposed method for generating dynamic S-boxes is secure.
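The nonlinearity measure can be sketched with a fast Walsh-Hadamard transform, using the standard definition NL(f) = 2^(n-1) - max|W_f| / 2 for n = 8. It is an assumption here that the S-box nonlinearity is taken as the minimum over all 255 nonzero combinations of output bits; some works report only the eight coordinate functions. The function names are hypothetical.

```python
import random


def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of (-1)^f(x) for a Boolean truth table."""
    w = [1 - 2 * b for b in truth_table]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for j in range(start, start + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b
        h *= 2
    return w


def sbox_nonlinearity(sbox):
    """Minimum nonlinearity over all nonzero output masks of an 8 x 8 S-box."""
    best = None
    for mask in range(1, 256):
        # Component function: parity of the masked output bits.
        truth = [bin(sbox[x] & mask).count("1") & 1 for x in range(256)]
        nl = 128 - max(abs(v) for v in walsh_spectrum(truth)) // 2
        best = nl if best is None else min(best, nl)
    return best


identity = list(range(256))           # a purely linear S-box
shuffled = identity[:]
random.Random(7).shuffle(shuffled)    # stand-in for a key-derived S-box
```

The identity mapping gives a nonlinearity of 0, which serves as a sanity check before evaluating key-derived S-boxes.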
4.2.2. SAC Test
The SAC of the S-box is another significant index for evaluating its security. It is defined such that the S-box is considered secure if its output flips with a probability of 50% when any single input bit is changed. We altered each bit of the input and calculated the flipping probability of each output for 10,000 different S-boxes. The results are shown in
Table 4.
In
Table 4, we observe that the flipping probability of each S-box output bit is approximately 0.5 for different inputs. The average deviation from 0.5 is 0.00048, meeting the theoretical SAC deviation requirements.
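The SAC measurement described above can be sketched as follows: for every input, each input bit is flipped in turn and the flips of each output bit are counted. The function names are hypothetical, and the S-box is assumed to be given as a flat list of 256 values.

```python
def sac_matrix(sbox):
    """Entry [i][j]: probability that output bit j flips when input bit i flips."""
    n = 8
    counts = [[0] * n for _ in range(n)]
    for x in range(256):
        for i in range(n):
            diff = sbox[x] ^ sbox[x ^ (1 << i)]
            for j in range(n):
                counts[i][j] += (diff >> j) & 1
    return [[c / 256 for c in row] for row in counts]


def mean_sac_deviation(sbox):
    """Average absolute deviation of the SAC matrix entries from the ideal 0.5."""
    matrix = sac_matrix(sbox)
    return sum(abs(p - 0.5) for row in matrix for p in row) / 64


identity_sac = sac_matrix(list(range(256)))
```

For the identity mapping, each input bit flips exactly one output bit, so every entry is 0 or 1 and the mean deviation is 0.5, the worst case.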
4.2.3. Security of the Dynamic S-Box and Dynamic SC Layer
The static S-box can be brute-forced using chosen plaintext attacks, where the adversary constructs specific plaintext and obtains the substituted ciphertext through the S-box. In contrast, the dynamic S-box offers higher security since it changes with each encryption. Furthermore, the dynamic S-box proposed in this scheme exhibits high nonlinearity, with the average nonlinearity approaching that of the AES algorithm. It also shows a small deviation from the ideal SAC value. Therefore, the proposed generation method for the S-box ensures high security.
The dynamic SC layer is designed as a shift operation for image encryption in this work. A static SC layer may lead to a statistical attack, in which an adversary could generate different shift results for the plaintext image and deduce the entire SC layer. Based on this guessed SC layer, the adversary could reconstruct the plaintext after the SC shift, rendering subsequent operations ineffective. For example, an attacker might design a plaintext image where the first pixel encrypted by the SC layer is 0, so that the output of the subsequent convolutional addition and modular operation equals the key used in that step. By using a dynamic SC layer, the shift changes with each encryption, making it impossible to guess the SC layer and thereby enhancing security.
4.2.4. Efficiency of Dynamic Generation for S-Box and SC Layer
The dynamic generation of the S-box takes minimal time compared to the entire image encryption scheme. We tested the implementation time on an 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz platform. The results indicate that dynamically generating the S-box requires only 0.0002 s. The dynamic generation of the SC layer requires storage for only or bits, as it is merely a rule for bit shifting. Thus, the dynamic method achieves high efficiency.
4.3. Encryption Results
In this section, the correctness of the proposed color image encryption scheme is validated. The images “Peppers” (
), Lake (
), and Female (
) are encrypted and decrypted as shown in
Figure 10.
The results in
Figure 10 show that the cipher images differ from the plaintext and resemble noise. The decrypted images are indistinguishable from the plaintext images, confirming that the proposed scheme can correctly encrypt and decrypt color images.
4.4. Key Space and Key Sensitivity
The strength of the image encryption algorithm relies heavily on the robustness of the key. An image encryption algorithm with a key space larger than
is capable of resisting brute-force attacks. The proposed scheme uses the plaintext to generate the key seed and obtain the key sequence through the proposed 4D chaotic system. The key seed is composed of two floating point numbers with a bit length of 64, giving it a key space of
. The control variables in
Figure 2 are
,
,
, and
when
. The key space of the chaotic map is approximately
when the key precision is set to 14. Therefore, the total key space is about
. The key space of the proposed scheme is compared with others in
Table 5:
In
Table 5, the key space of the proposed scheme is larger than that of [
20,
22], and only slightly smaller than that of [
23,
25]. This key space is sufficient to resist brute-force attacks.
Chaotic systems are sensitive to initial inputs, but quantization may reduce this sensitivity. To evaluate the proposed pseudorandom sequence generator’s capacity, we tested the key sensitivity in the encryption and decryption of color images. We generated a key
using the initial value
. Then, four different keys
,
,
and
are generated by changing
, and
by
, respectively. The sensitivity of these keys to tiny changes is tested. For clarity, the changes are presented in decimal form in
Figure 11.
Figure 11 demonstrates that a tiny change of 0.00000000000001 in the initial value results in a complete alteration of the key sequence. This confirms that the key is highly sensitive to changes in the initial value. Additionally, we decrypted the cipher image of “Peppers” (
), which had been encrypted using
. The result is shown in
Figure 12.
The rate of change between the decrypted images and the plaintext image is shown in
Table 6.
From the data, it is evident that the difference between the decrypted image using a slightly altered key and the plaintext image exceeds 99%. This confirms that the proposed scheme exhibits high key sensitivity.
4.5. Histogram Analysis
The histogram describes the distribution of pixel values in an image. A uniform pixel value distribution indicates stronger resistance to statistical attacks. The histograms of the plaintext image “Peppers” (
) and its encryption version are shown in
Figure 13.
The uniformity of the histogram can be estimated by the variance, which is calculated by Formula (11).
Here,
and
represent the number of pixels in the histogram
with gray values
and
, respectively. The parameter
refers to the gray level. Since the histogram variances reported in [21] are the smallest, we compare the variances of the histograms of different images for different schemes in
Table 7.
Figure 13 shows that the histogram of the cipher image exhibits a uniform distribution, contrasting with the plaintext image’s histogram. From
Table 7, we observe that the variances of the histograms of the cipher images encrypted using Algorithm 1 are smaller than those in [
21], indicating that the proposed scheme can resist statistical attacks.
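The histogram variance can be sketched as below. It assumes that Formula (11) takes the pairwise form common in the literature, the average over all pairs of gray levels of half the squared difference of their counts, which is consistent with the symbol definitions given above. The function names are hypothetical.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray value."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def histogram_variance(counts):
    """(1 / n^2) * sum over i, j of (z_i - z_j)^2 / 2, with n gray levels."""
    n = len(counts)
    total = 0.0
    for zi in counts:
        for zj in counts:
            total += 0.5 * (zi - zj) ** 2
    return total / (n * n)


flat = histogram_variance(histogram(list(range(256)) * 4))   # perfectly uniform
```

This pairwise form equals the population variance of the histogram counts, so a perfectly flat histogram gives 0 and smaller values indicate a more uniform cipher image.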
4.6. Encryption Quality Analysis
In this section, the accuracy of the proposed color image encryption scheme is validated. The quality of encryption is analyzed by examining the closeness of the obtained image to an ideally encrypted image [
46]. An ideal encrypted image has a uniform pixel distribution across all intensity levels, which can be assessed using metrics such as deviation from ideality, maximum deviation, and irregular deviation.
4.6.1. Deviation from Ideality
The histogram of an encrypted image generated by a robust encryption scheme should be uniformly distributed. The histogram of the ideal encrypted image
can be measured by Formula (12), where a small deviation indicates high security:
where
is the number of rows, and
is the number of columns in the image. The deviation from the ideal encrypted image can be calculated using Formula (13):
where
is the histogram of the encrypted image.
4.6.2. Maximum Deviation
The maximum deviation (MD) evaluates the difference between the histograms of the cipher and plain images, as calculated by Formula (14). A larger MD represents higher security:
where
is the total number of pixel values (for an 8-bit image,
), and
is the difference between the
-th histogram of the original and encrypted images.
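The two histogram-based metrics defined so far can be sketched as follows. The expressions are assumed to follow the forms usually given in the encryption-quality literature: a flat ideal histogram with M × N / 256 pixels per level, and a maximum deviation that weights the two end bins by one half. The function names are hypothetical.

```python
def deviation_from_ideality(cipher_hist, rows, cols):
    """Sum of |H(C) - H_ideal| over all levels, normalized by the pixel count."""
    ideal = rows * cols / 256          # flat histogram of an ideal cipher image
    return sum(abs(h - ideal) for h in cipher_hist) / (rows * cols)


def maximum_deviation(plain_hist, cipher_hist):
    """(d_0 + d_255) / 2 + sum of d_1 .. d_254, with d_i = |plain_i - cipher_i|."""
    d = [abs(p - c) for p, c in zip(plain_hist, cipher_hist)]
    return (d[0] + d[-1]) / 2 + sum(d[1:-1])


# Toy 16 x 16 example: an all-black plain image and a perfectly flat cipher image.
plain_hist = [256] + [0] * 255
cipher_hist = [1] * 256
di = deviation_from_ideality(cipher_hist, 16, 16)
md = maximum_deviation(plain_hist, cipher_hist)
```

In this toy case the deviation from ideality is 0 and the maximum deviation is large, matching the interpretation that a small deviation from ideality and a large MD indicate good encryption quality.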
4.6.3. Irregular Deviation
Since maximum deviation alone may yield inaccurate results in some cases, it cannot be solely relied upon to assess encryption quality. Irregular deviation (ID) measures how close the statistical distribution deviation between the plain and cipher images is to a uniform distribution, as calculated by Formula (15):
where
is the difference between the histogram values of the plain and cipher images, and
is the mean of the histogram values. A higher ID indicates a more uniform pixel distribution.
We calculate the deviation from ideality, maximum deviation, and irregular deviation for the proposed scheme. Additionally, we compare the encryption quality of the proposed scheme using the key seed generated by the LVGG model with that using a random key seed. The results are shown in
Table 8.
In
Table 8, “Peppers” represents the cipher image encrypted with a key generated from a random key seed, while “Peppers*” represents the cipher image encrypted with a key generated using the key seed from the LVGG network. The results show that the deviation from ideality of the proposed scheme decreases to 0.55, indicating high security. The MD and ID values of the proposed scheme are sufficiently large. Furthermore, the encryption quality of the cipher image using the key seed generated by the LVGG network is better than that using the random key seed. Thus, the proposed image encryption scheme demonstrates high encryption quality.
4.7. Correlation Coefficient Between Adjacent Pixels Analysis
A good image encryption scheme should yield a low correlation coefficient between adjacent pixels in any direction, making it more resistant to statistical attacks. The correlation coefficient between adjacent pixels is calculated using Formula (16).
where
represents the number of chosen adjacent pixels in any direction of the image. In this test, 5000 adjacent pixels were selected to calculate the correlation coefficient between adjacent pixels of both the plaintext and cipher images for “Peppers” (
). The result is shown in
Figure 14.
Furthermore, we compare the correlation coefficients between adjacent pixels of the image Peppers (
) with the cipher image Lena (
) of other schemes. The lowest correlation coefficient of the three components is chosen to compare with the other schemes. The result is shown in
Table 10.
The results show that the correlation coefficients of the encrypted images using our scheme are the lowest, with the minimum coefficient being about one-tenth of that of other schemes. The correlation coefficients of images encrypted using Lorenz-based keys are the highest, indicating that the Lorenz system is inadequate for image encryption due to its poor resistance to differential analysis.
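The adjacent-pixel correlation test can be sketched as below, assuming Formula (16) is the usual Pearson correlation coefficient. For determinism, this sketch uses every adjacent pair in the chosen direction rather than 5000 randomly selected pairs. The function name and the toy images are hypothetical.

```python
import math


def adjacent_correlation(image, direction="horizontal"):
    """Pearson correlation between each pixel and its neighbor in one direction."""
    dr, dc = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}[direction]
    rows, cols = len(image), len(image[0])
    xs, ys = [], []
    for r in range(rows - dr):
        for c in range(cols - dc):
            xs.append(image[r][c])
            ys.append(image[r + dr][c + dc])
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(vx * vy)


ramp = [[c for c in range(8)] for _ in range(8)]                   # smooth image
checker = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]
```

A smooth ramp gives a coefficient of 1 and a checkerboard gives -1, while a well-encrypted image should give values close to 0 in every direction.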
4.8. Differential Attack
The number of pixels change rate (NPCR) and unified average changing intensity (UACI) are commonly used to evaluate the ability of an image encryption scheme to withstand differential attacks. If NPCR is close to 100% and UACI is near 33%, the encryption scheme is considered secure against such attacks. Let
and
be two plaintext images that differ by only one pixel. There exist that
and
.
and
are generated by encrypting
and
with the same key.
and
are any pixel of
and
correspondingly, we set
if
. Then, the NPCR and UACI can be calculated by Formula (17):
where
and
are the numbers of rows and columns in the image matrix. Recently, the theoretical marginal values for NPCR and UACI have been defined in Formula (18).
where
represents the largest pixel value supported by the ciphertext image format, and
is the inverse cumulative density function of the standard normal distribution
. The
is the significance level. NPCR results for different images are shown in
Table 11.
The UACI results for different images are shown in
Table 12.
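The two metrics can be sketched in a few lines of pure Python. This is a minimal illustration that assumes Formula (17) follows the standard definitions (NPCR as the percentage of differing pixels, UACI as the mean absolute difference normalized by the largest pixel value). The name `npcr_uaci`, the `max_value` parameter, and the nested-list image representation are assumptions made for the example.

```python
def npcr_uaci(c1, c2, max_value=255):
    """Return (NPCR, UACI) in percent for two equally sized cipher images.

    c1 and c2 are 2-D lists of pixel values from one color channel.
    """
    rows, cols = len(c1), len(c1[0])
    total = rows * cols
    changed = 0    # number of positions where the two ciphertexts differ
    abs_diff = 0   # accumulated absolute intensity difference
    for row1, row2 in zip(c1, c2):
        for a, b in zip(row1, row2):
            if a != b:
                changed += 1
            abs_diff += abs(a - b)
    npcr = 100.0 * changed / total
    uaci = 100.0 * abs_diff / (max_value * total)
    return npcr, uaci
```

For two independent, uniformly random 8-bit images, the expected values are roughly 99.61% (NPCR) and 33.46% (UACI), which is why results near these figures are taken as evidence of good diffusion.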
The comparison of NPCR and UACI values for “Peppers” (
) and “Lena” (
) is presented in
Table 13.
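To make the critical values concrete, the sketch below evaluates them under the assumption that Formula (18) follows the commonly used NPCR/UACI randomness-test formulation, with the normal quantile taken from the upper tail. The function names and the default significance level of 0.05 are illustrative assumptions.

```python
import math
from statistics import NormalDist


def npcr_critical(rows, cols, alpha=0.05, max_value=255):
    """One-sided NPCR critical value (as a fraction); a scheme passes if NPCR exceeds it."""
    f = max_value
    z = NormalDist().inv_cdf(1.0 - alpha)  # upper-tail standard normal quantile
    return (f - z * math.sqrt(f / (rows * cols))) / (f + 1)


def uaci_critical(rows, cols, alpha=0.05, max_value=255):
    """Two-sided UACI critical interval (as fractions); a scheme passes if UACI lies inside."""
    f = max_value
    n = rows * cols
    mean = (f + 2) / (3 * f + 3)
    var = (f + 2) * (f * f + 2 * f + 3) / (18 * (f + 1) ** 2 * n * f)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width
```

Under these assumptions, a 256 × 256 8-bit image at a significance level of 0.05 gives an NPCR critical value of about 99.5693% and a UACI interval of about 33.2824% to 33.6447%.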
The results show that NPCR and UACI values for the proposed scheme fall within the critical value ranges. Additionally, the UACI of the proposed scheme is closer to the critical values compared to other schemes, except for [
25]. Thus, the proposed scheme effectively resists differential attacks. In contrast, the Lorenz-based encryption scheme fails the NPCR test, and its UACI only reaches the
critical value, indicating its vulnerability to differential attacks.
Additionally, we use the LVGG network to generate the key seed from the plaintext image, which enhances resistance against chosen-plaintext attacks. Attackers could potentially construct specific images, such as an image with only one pixel set to “1” and all other pixels set to “0,” denoted as . By changing the position of the “1” within the pixel grid or its location in the image, attackers could create multiple plaintext images . After encrypting all these plaintext images to obtain ciphertext images , attackers could build a map containing , and . They could then attempt to guess the key by performing subtractions on pairs of and analyzing the differences between them. For instance, when in Algorithm 1 for and , the subtraction can cancel out the convolutional addition and modulo operations from lines 34 to 37 in Algorithm 1. Additionally, could be compromised by collecting more than pairs of tuples , since each contains only one “1” within the pixels. The same approach could potentially be applied to obtain , and other sub-keys might also be vulnerable to brute-force attacks using this method. In contrast, when the key is generated from a key seed produced by the LVGG network, each tuple pair corresponds to a different . These vulnerabilities are mitigated because the attacker cannot establish a direct mapping between the tuple pairs and the key.
4.9. Information Entropy Analysis
Information entropy reflects the degree of randomness of an image and is defined by Formula (19).
where
is the image information, and
is the probability of the gray value
.
is the number of the gray values in image
. The theoretical entropy of a random image with 256 gray levels is 8. The information entropy of the plain images and the cipher images generated by the proposed scheme is shown in
Table 14.
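The entropy computation can be sketched as follows, assuming Formula (19) is the standard Shannon entropy over the gray-level histogram. The function name `image_entropy` and the nested-list image representation are illustrative choices.

```python
import math
from collections import Counter


def image_entropy(img):
    """Shannon entropy (in bits) of the gray-level distribution of one channel."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    # Gray levels that never occur contribute nothing to the sum.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

An image in which all 256 gray levels occur equally often reaches the maximum of 8 bits, while a constant image has an entropy of 0.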
The information entropy of the cipher image “Peppers” (
) generated by our scheme is compared with that of the cipher image “Lena” (
) from other schemes in
Table 15.
Table 14 and
Table 15 show that the information entropy of all cipher images generated by our scheme is very close to 8 and is higher than in other schemes, indicating that the proposed scheme is more resistant to statistical attacks. The entropy of the cipher image generated by the Lorenz system is lower, indicating weaker resistance to statistical attacks.
4.10. Robustness Analysis
Since encrypted images may suffer from noise interference or cropping attacks during transmission, robustness against such attacks is essential for an efficient image encryption scheme. We tested the robustness of the image “Peppers” against Gaussian noise (GN), Salt & Pepper noise (SPN), and Speckle noise (SN), as shown in
Figure 15.
To evaluate robustness against cropping attacks, different areas of the cipher image of “Peppers” were cropped and then decrypted, as shown in
Figure 16.
Figure 16 shows that even cropped cipher images can be partly decrypted. Although some pixels are lost, the image content remains recognizable. Additionally, we used the peak signal-to-noise ratio (PSNR) to measure resilience to noise. A higher PSNR indicates better resilience; the PSNR is defined in Formula (20).
where
and
represent the image size,
is the original image, and the noisy cipher image of
is decrypted to
. The PSNR for the noisy cipher images is shown in
Table 16.
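For reference, the MSE and PSNR between an original image and a decrypted image can be computed as in the sketch below. It assumes Formula (20) is the standard PSNR definition for 8-bit images with a peak value of 255. The function names and the `max_value` parameter are illustrative.

```python
import math


def mse(original, decrypted):
    """Mean squared error between two equally sized 2-D lists of pixel values."""
    rows, cols = len(original), len(original[0])
    total = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, decrypted)
        for a, b in zip(row_a, row_b)
    )
    return total / (rows * cols)


def psnr(original, decrypted, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, decrypted)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

A higher PSNR means the decrypted image is closer to the original. The same two functions also apply when comparing a plaintext image with its cipher image, where a low PSNR is the desired outcome.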
The PSNR values in
Table 16 are all above 17 dB for all noise attacks. Even when 25% of the cipher image information is lost, the recovered image remains recognizable, with a PSNR value of around 11 dB. Therefore, the proposed scheme is robust against noise and cropping attacks.
4.11. Visual Quality Analysis
Visual quality can be evaluated using the mean squared error (MSE) and PSNR. The MSE and PSNR values between the plaintext image and the ciphertext image for the proposed scheme are shown in
Table 17.
Table 17 shows that the MSE between the plaintext image and the ciphertext image for the proposed scheme is high, and the PSNR is below 10 dB. The cipher image therefore bears little resemblance to the plaintext image, which supports the security of the proposed scheme.
4.12. Performance Analysis
To validate the efficiency of the proposed scheme, we tested the implementation time and compared it with other schemes, as shown in
Table 18.
Table 18 shows that the encryption speed of the proposed image encryption scheme is higher than that of AES and other chaos-based image encryption schemes, making it suitable for image encryption.
Overall, these results indicate that the proposed scheme is efficient and resists statistical, differential, noise, and cropping attacks, making it more secure than the compared schemes.