Article

On the Impact of Discrete Atomic Compression on Image Classification by Convolutional Neural Networks

1 Department of Information-Communication Technologies, National Aerospace University “KhAI”, 61070 Kharkiv, Ukraine
2 Department of Higher Mathematics and System Analysis, National Aerospace University “KhAI”, 61070 Kharkiv, Ukraine
* Author to whom correspondence should be addressed.
Computation 2024, 12(9), 176; https://doi.org/10.3390/computation12090176
Submission received: 31 July 2024 / Revised: 29 August 2024 / Accepted: 30 August 2024 / Published: 1 September 2024

Abstract: Digital images play a particular role in a wide range of systems. Image processing, storing and transferring via networks require a lot of memory, time and traffic. In addition, appropriate protection is required in the case of confidential data. Discrete atomic compression (DAC) is an approach that provides image compression and encryption simultaneously. It has two processing modes: lossless and lossy. The latter ensures a higher compression ratio at the cost of inevitable quality loss that may affect decompressed image analysis, in particular, classification. In this paper, we explore the impact of distortions produced by DAC on the performance of several state-of-the-art classifiers based on convolutional neural networks (CNNs). The classic, block-splitting and chroma-subsampling modes of DAC are considered. It is shown that each of them has only a small effect on the MobileNetV2, VGG16, VGG19, ResNet50, NASNetMobile and NASNetLarge models. This research shows that, using the DAC approach, memory expenses can be reduced without significant degradation of the performance of the aforementioned CNN-based classifiers.

1. Introduction

Data are an essential part of many processes, including manufacturing [1,2], social communication [3,4], economics and finance [5,6], medicine [7] and healthcare [8], media and education [9,10], sport [11] and entertainment [12], etc. Analyzing the current trends, one may emphasize the following features:
  • an explosive increase in data volumes [13];
  • high data protection requirements [14,15];
  • widespread use of machine learning [16,17];
  • fast growth in edge computing [18,19].
Storing, processing and transferring big data via networks requires substantial memory, time, computational and traffic resources. In order to reduce these expenses, numerous data compression techniques and algorithms are applied [20]. However, in most cases, such algorithms do not provide data protection, which makes applying additional encryption methods a must. Various approaches to solving this task have been developed [21]; the problem is that their use increases computational expenses.
Digital images occupy a particular place in big data. They are widely used in remote sensing [22,23], environmental monitoring [24,25], agriculture development [26,27], etc. Modern communication via popular messengers and social networks is impossible without digital photos [28].
Due to the wide availability of smartphones and various gadgets, the total number of photos is huge [29]. Moreover, modern sensors provide images of very high resolution with millions of pixels [30], which makes their processing and analysis a challenging task, especially when computations are performed on edge devices or by autonomous robots [31,32]. Hence, the large size of image data samples increases the load on communication networks. In addition, this feature is the reason for high encryption costs.
A natural solution to the high resource expenses problem could be the application of image compression algorithms that have built-in data encryption features. Discrete atomic compression (DAC) is such an algorithm [33,34].
The DAC algorithm is based on the use of a special class of atomic functions introduced by V.A. Rvachev [35]. It has low time and spatial complexity [36,37]. For this reason, DAC can be considered a lightweight self-encrypting image compression technique. Similar to many other coders, DAC has two compression modes: lossless and lossy. The former ensures exact reconstruction of compressed data, whilst the latter provides a significantly higher compression ratio at the expense of a certain quality loss [38]. However, due to the designed distortion-control mechanism, one can achieve visually lossless compression [33] or, at least, acceptable losses. Then, the following question naturally arises: what is the impact of the quality loss introduced by DAC on the efficiency of subsequent image analysis and, in particular, classification?
Image classification is a particular task in computer vision [39]. It processes image data and assigns a class label or category to a whole image or its parts. There are many methods that provide solutions to this task, and the use of convolutional neural networks (CNNs) has recently demonstrated a very high efficiency [40].
CNNs extract the image features that enable classification. When lossy compression is applied, distortions are introduced, and some particular features might be lost or considerably deteriorated. The aim of this paper is to study the impact of quality loss produced by the DAC algorithm on the performance of MobileNetV2 [41], VGG16 and VGG19 [42], ResNet50 [43], NASNetMobile and NASNetLarge [44]. These networks are modern, state-of-the-art image classifiers. They have common features in combination with a set of major differences, which we briefly discuss below.
There are several versions of the DAC algorithm, each with certain peculiarities. In this paper, we consider three preprocessing modes of DAC, namely classic, block-splitting and chroma-subsampling [37], and explore their impact on the efficiency of the classifiers mentioned above, which constitutes the novelty of this research. We show that usually only minor negative effects on classification performance are produced, which means that lossy compression by the DAC algorithm preserves the principal features of a processed image. This is the main contribution.
The paper is organized as follows. First, the DAC algorithm is presented, and its particular properties are discussed. Second, a test image dataset is compressed, and the impact of quality loss on classification accuracy is investigated. Next, the obtained results are discussed, and, finally, the conclusions follow.

2. Discrete Atomic Compression

The DAC algorithm is based on the classic lossy image compression approach, involving discrete data transform, quantization and encoding. Figure 1 shows its main stages [33] under the assumption that it is applied to conventional RGB color images or other types of three-channel data represented as RGB images.
The input for DAC is a 24-bit full color image given by a matrix of red (R), green (G) and blue (B) color intensities. The output is a byte array.
In the first step, preprocessing is applied. It involves the YCrCb color space transform, which computes three matrices: one luma (Y) and two chroma (Cr, Cb) components [45]. As noted above, there are three modes:
  • “classic”—the obtained matrices Y, Cr and Cb are moved to the next stage without any changes;
  • “block-splitting”—the components Y, Cr and Cb are split into blocks of the same size (the common size is 512 × 512);
  • “chroma-subsampling”—the components Cr and Cb are processed using the 4:2:0 subsampling scheme [45].
Further, the obtained matrices are processed independently.
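To make the preprocessing stage concrete, the following minimal Python sketch implements the three modes under common assumptions: the standard full-range BT.601 YCrCb transform (as used in JPEG) and 2 × 2 averaging for 4:2:0 subsampling; the exact conversion and subsampling variants used inside DAC may differ.

```python
import numpy as np

def rgb_to_ycrcb(rgb):
    """Split an H x W x 3 uint8 RGB image into Y, Cr and Cb component matrices."""
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    return y, cr, cb

def subsample_420(c):
    """'chroma-subsampling' mode: average each 2 x 2 block of a chroma plane."""
    h, w = (c.shape[0] // 2) * 2, (c.shape[1] // 2) * 2
    c = c[:h, :w]
    return (c[0::2, 0::2] + c[0::2, 1::2] + c[1::2, 0::2] + c[1::2, 1::2]) / 4.0

def split_into_blocks(c, size=512):
    """'block-splitting' mode: cut a component into size x size tiles."""
    return [c[i:i + size, j:j + size]
            for i in range(0, c.shape[0], size)
            for j in range(0, c.shape[1], size)]
```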
Next, a discrete atomic transform (DAT) is applied. DAT is a discrete wavelet transform based on the atomic functions ups(x) introduced by V.A. Rvachev [35]. The output of this step is a set of matrices of DAT coefficients; their number is equal to the channel count of the source image. Each of these matrices consists of blocks that contain DAT coefficients corresponding to a specific frequency band. A sample structure is shown in Figure 2.
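Since the filters derived from ups(x) are beyond the scope of a short sketch, the block layout of Figure 2 can be illustrated with any one-level 2D wavelet split; the Haar transform below is used purely as a stand-in for DAT.

```python
import numpy as np

def one_level_split(x):
    """One decomposition level of a 2D wavelet transform (Haar stand-in for DAT).

    Returns a single matrix packed as [[LL, HL], [LH, HH]], i.e., the
    frequency-band block structure sketched in Figure 2. Input sides must be even.
    """
    lo = (x[0::2, :] + x[1::2, :]) / 2.0  # row-wise low-pass
    hi = (x[0::2, :] - x[1::2, :]) / 2.0  # row-wise high-pass
    ll, hl = (lo[:, 0::2] + lo[:, 1::2]) / 2.0, (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    lh, hh = (hi[:, 0::2] + hi[:, 1::2]) / 2.0, (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return np.block([[ll, hl], [lh, hh]])
```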
Further, DAT coefficients are quantized. The following formula is applied:
v = Round(w/q). (1)
In (1), w and v are, respectively, the source and quantized DAT coefficients, and q is a quantization coefficient; rounding is performed to the nearest integer.
It is the quantization step that produces distortions. In DAC, the choice of quantization coefficients is specified by a parameter called the upper bound of maximum absolute deviation (UBMAD), which is used in the quality-loss control mechanism [33]. By varying UBMAD, one may obtain results with the desired distortion level measured by maximum absolute deviation (MAD), root mean squared error (RMSE) and/or peak signal-to-noise ratio (PSNR). A larger UBMAD leads to larger MAD and RMSE and smaller PSNR.
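A minimal sketch of step (1) and its inverse is given below; how DAC maps UBMAD to the per-band quantization coefficients q is internal to the algorithm and is not reproduced here.

```python
import numpy as np

def quantize(w, q):
    """Equation (1): v = Round(w / q), rounding to the nearest integer."""
    return np.rint(w / q).astype(np.int64)

def dequantize(v, q):
    """Approximate reconstruction; the error per coefficient is at most q / 2."""
    return v.astype(np.float64) * q

# For example, w = 17 and q = 5 give v = 3 and a reconstructed value of 15,
# i.e., an absolute error of 2 <= q / 2; a larger q coarsens the grid and
# increases the distortion, which is what UBMAD ultimately controls.
```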
Finally, the quantized DAT coefficients are compressed in a lossless manner by a combination of Golomb codes and binary arithmetic coding [20]. This step produces a byte array with the compressed image.
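For illustration, here is a sketch of Rice coding (the Golomb family with divisor 2^k) applied to a zigzag-mapped signed coefficient; the actual DAC entropy coder, including the binary arithmetic coding stage, is not reproduced here.

```python
def zigzag(v):
    """Map a signed quantized coefficient to a non-negative integer."""
    return 2 * v if v >= 0 else -2 * v - 1

def rice_encode(n, k):
    """Golomb-Rice code: unary quotient, '0' separator, k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")

# Example: rice_encode(zigzag(-3), k=2) -> '1001'
```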
To provide lossless compression by DAC, it has been proposed in [34] to apply one extra step that complements the previously obtained byte array with additional information ensuring image reconstruction without distortions. So, DAC has two modes: lossy and lossless. The lossless mode has a slightly higher complexity and provides a lower compression ratio (CR) than the lossy one. In this research, we concentrate on lossy image compression by the aforementioned DAC algorithms.
Besides, in DAC, image encryption is provided by varying the structure of the DAT [38]. There are more than 10^110 different structures of this discrete transform, and the exact structure must be known for correct reconstruction of a compressed image. This means that data protection is a built-in feature of DAC.

3. Classifying Compressed Images

Lossy compression by DAC produces distortions, and some particular image features might be lost. A reasonable expectation is that this might decrease the classification accuracy when analyzing decompressed images. In this section, we explore this impact.
We start with a brief description of the selected neural networks. Then, we perform the following study. A large set of digital images from different content classes is taken; this is done to check the generality of the observed tendencies and to obtain statistics. Next, each image from this dataset is compressed by DAC with various preprocessing modes and quality-loss settings. After this, the obtained data are decompressed. Further, the image classifiers are applied to each decompressed image and the corresponding source image, and the classification results are compared. Finally, aggregation is performed.

3.1. Models

In this research, we apply the MobileNetV2, VGG16, VGG19, ResNet50, NASNetMobile and NASNetLarge models with pre-trained weights. Their parameters have been fitted using the ImageNet database [46]; this image collection contains more than 1.2 million training samples from one thousand classes. We use the TensorFlow Keras tools that provide out-of-the-box application of many state-of-the-art neural networks, including the selected ones [47].
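Loading all six classifiers via Keras Applications looks as follows (a sketch; it assumes TensorFlow is installed, and the weights are downloaded on first use):

```python
from tensorflow.keras import applications

# All six models come with ImageNet-pretrained weights out of the box.
models = {
    "MobileNetV2": applications.MobileNetV2(weights="imagenet"),
    "VGG16": applications.VGG16(weights="imagenet"),
    "VGG19": applications.VGG19(weights="imagenet"),
    "ResNet50": applications.ResNet50(weights="imagenet"),
    "NASNetMobile": applications.NASNetMobile(weights="imagenet"),
    "NASNetLarge": applications.NASNetLarge(weights="imagenet"),
}
```

Note that the default input resolution differs between models (224 × 224 for most of them and 331 × 331 for NASNetLarge), so each image must be resized and preprocessed with the model-specific preprocess_input function.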
Each of the selected models is a deep CNN. They have different architectures, in particular regarding the numbers of layers and parameters. However, there is a set of common features. We consider them in more detail below.
VGG16 and VGG19 are neural networks constructed from so-called conv-blocks followed by fully connected layers. Each of these blocks consists of several sequential convolutional layers followed by a pooling layer. The VGG16 classifier has 16 layers and 138.4 million parameters. The VGG19 classifier, which is a bigger version of VGG16, has 19 layers and 143.7 million parameters.
Next, the ResNet50 model belongs to the residual networks. They are built using residual blocks that consist of convolutional layers and residual connections, which ensure training of very deep networks. In the current research, we use the model with 107 layers and 25.6 million parameters.
Further, MobileNetV2 represents the MobileNet family of networks designed for on-device computer vision. MobileNetV2 consists of 55 layers and has 4.3 million parameters. Its building blocks combine expansion, depthwise and projection operators with residual connections. Such structures, which are called bottlenecks, significantly reduce the number of operations, which is of particular importance when deploying a model in edge computing systems.
Finally, NASNetMobile and NASNetLarge are CNNs obtained using neural architecture search (NAS). These models consist of 389 and 533 layers, respectively. The NASNetMobile has 5.3 million parameters, and NASNetLarge has 88.9 million parameters.
So, in this research, we explore the impact of lossy compression by the DAC algorithm on image classifiers that are neural networks with substantially different properties. Their numbers of layers and parameters span wide ranges, which explains the motivation behind the choice of models. We stress that we use models with pre-trained parameters fitted on the training set of the ImageNet database.

3.2. Test Data

In the current research, a set of 131 images from the ImageNet database [46] is used. Samples are taken from 10 classes (see some examples in Figure 3). The source images are in the JPEG format. The total size is 11.3 MB (JPEG) and 94 MB (raw). The number of image pixels varies from 43,200 to 1,707,200.
In this research, we use open-source digital images with military content. This choice is due to the necessity of developing and applying autonomous systems for the detection and identification of damaged or destroyed military equipment [31]. The use of unmanned aerial vehicles and computer vision (CV) methods for explosive ordnance search is promising. Training highly efficient CV models requires huge amounts of data and computational resources. Therefore, it can be very useful to apply already trained CNNs in combination with transfer learning and fine-tuning. As stated above, lossy compression introduces distortions that might impair feature extraction, so investigating this impact is of particular importance, especially when processing military content images.

3.3. Compression

Each sample from the selected image dataset is compressed using DAC. The “classic”, “block-splitting” and “chroma-subsampling” preprocessing modes are applied. The following values for UBMAD, which specify quality loss, are used: 36, 63, 95 and 155.
Source and reconstructed images are available at the link: https://drive.google.com/drive/folders/1u9m3amxV-7kKmxMfyhlIXWIeGzw_iqSJ?usp=sharing (accessed on 27 June 2024). This Google Drive folder contains four sub-folders: “0. CLASSIC”, “1. BLOCK”, “2. CHROMA” and “RESULTS” that contain, respectively, the results obtained using “classic”, “block-splitting”, and “chroma-subsampling” preprocessing modes, as well as data regarding their further analysis.
Additionally, the distortions produced are evaluated using the MAD, RMSE and PSNR metrics (quality-loss indicators). In addition, compression efficiency measured by the compression ratio (CR) is analyzed. Also, the total time of compression and decompression of the whole test dataset is computed. We note that image processing by DAC was performed on an AMD Ryzen 5 5600H CPU.
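The quality-loss indicators are straightforward to compute; a minimal sketch, assuming the source and decompressed images are 8-bit arrays of identical shape:

```python
import numpy as np

def mad(a, b):
    """Maximum absolute deviation between two uint8 images."""
    return int(np.max(np.abs(a.astype(np.int32) - b.astype(np.int32))))

def rmse(a, b):
    """Root mean squared error."""
    d = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt(np.mean(d * d)))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB (undefined for identical images)."""
    return float(20.0 * np.log10(peak / rmse(a, b)))
```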
Table 1, Table 2 and Table 3 and Figure 4, Figure 5, Figure 6 and Figure 7 show the dependence of averaged lossy compression performance indicators and total time expenses on the UBMAD parameter. In addition, Figure 8, Figure 9 and Figure 10 allow for comparison of the time and memory expenses.
Analyzing the obtained results, we see that for any UBMAD:
  • the “chroma-subsampling” mode produces greater distortions than “classic” and “block-splitting”, but a higher compression ratio is provided;
  • the “classic” and “block-splitting” modes introduce nearly the same quality loss measured by MAD, RMSE and PSNR;
  • the “classic” mode compresses slightly better than the “block-splitting” one;
  • “block-splitting” has the best time performance; decompression is performed slightly faster than compression;
  • for the considered range of UBMAD, we basically deal with visually lossless compression: the average PSNR exceeds 35 dB, although the introduced distortions can be noticed by visual inspection for some particular images at UBMAD = 155.
Further, we analyze each pair of source and decompressed images with MobileNetV2, VGG16, VGG19, ResNet50, NASNetMobile and NASNetLarge, and explore the impact of the distortions produced on their performance.

3.4. Classification

The following investigation procedure is used for each selected model. First, each source image is classified, and the computed class label is compared with the true one. Then, we apply classification to each decompressed image corresponding to a correctly classified source sample. The results obtained are stored in CSV files available in the Google Drive folder linked above. Due to page limitations, only the aggregated results are presented. In Table 4, Table 5 and Table 6, the percentage of correctly classified decompressed images is given for each preprocessing mode of DAC and each value of the UBMAD parameter specifying quality loss.
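A sketch of the per-image comparison step for one model is given below (MobileNetV2 shown; each model requires its own preprocess_input and input size, e.g., 331 × 331 for NASNetLarge; the file paths are illustrative):

```python
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, decode_predictions, preprocess_input)
from tensorflow.keras.utils import img_to_array, load_img

model = MobileNetV2(weights="imagenet")

def top1_label(path, size=(224, 224)):
    """Return the top-1 ImageNet class name predicted for an image file."""
    x = img_to_array(load_img(path, target_size=size))
    pred = model.predict(preprocess_input(x[np.newaxis]), verbose=0)
    return decode_predictions(pred, top=1)[0][0][1]

# Compare the labels assigned to a source image and its decompressed version.
unchanged = (top1_label("source/n02687172_1299.JPEG")
             == top1_label("decompressed/n02687172_1299.png"))
```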
Figure 11, Figure 12 and Figure 13 visualize data presented in Table 4, Table 5 and Table 6.
Further, we analyze the obtained classification results.

3.5. Analysis

Analyzing the data in Table 4, Table 5 and Table 6 and Figure 11, Figure 12 and Figure 13, we see that, in all cases, the percentage of correctly classified images is less than 100%, which, in general, indicates the expected negative impact of the distortions produced by the DAC algorithm. Meanwhile, the percentage is always higher than 94%, which means that the effect of quality loss can be considered insignificant. Moreover, the differences between the computed classification performance indicators are minor. These results might be explained by the models' depth and number of parameters. Indeed, the data in Table 7 show that the considered networks are deep.
Combining the classification results with the architectural features given in Table 7 implies that, in most cases, deeper networks provide slightly better performance. At the same time, the number of parameters does not have such an impact. Indeed, the VGG16 model has 32 times more parameters than MobileNetV2; however, these models have nearly the same percentage of correctly classified decompressed images. In addition, we see that, when applying the “chroma-subsampling” mode, which produces the highest distortions (see Figure 4, Figure 5 and Figure 6), MobileNetV2 and VGG16 exhibit the highest robustness to the introduced distortions.
Finally, model performance may not behave monotonically with respect to quality loss. In the case of the VGG19 network, an increase in distortions even has a positive impact. This feature is also observed in several other particular cases.

4. Discussion

The results obtained in the previous section imply that the distortions produced by the DAC algorithm with various quality-loss settings and different preprocessing modes have minor negative effects on the classification performance of MobileNetV2, VGG16, VGG19, ResNet50, NASNetMobile and NASNetLarge. In our opinion, this is due to a combination of several factors.
First, the core of DAC is the DAT based on atomic functions that have good constructive properties in terms of approximation theory [35,36]. Indeed, the corresponding functional spaces are asymptotically extremal for the approximation of wide classes of smooth functions. For this reason, data, in particular digital images, are well represented by DAT coefficients. Hence, high-frequency coefficients can be quantized with large quantization coefficients without significant impact on the particular features of the processed image.
Second, the neural networks considered are based on convolutions, which are robust to various distortions. In combination with the previous factor, this preserves the high efficiency of the explored models, at least for the considered range of UBMAD variation. This is of particular importance, since many other models use the considered CNNs as image feature extractors. So, the following statement seems to be correct: lossy image compression by DAC has a minor negative impact on any model constructed on the basis of the selected models, in particular by applying transfer learning and fine-tuning techniques.
Next, comparing the applied preprocessing modes, one may conclude that the “block-splitting” mode has the best time performance. The main reason is that this mode possesses memory-localization features due to the use of small data buffers employed for compression and decompression [37]. Such algorithms ensure very high performance due to the efficient use of memory caches [48]. Also, if we compare the memory expenses required for storing source and compressed data (see Figure 14a), we see that they can be significantly reduced with minor impact on the further classification accuracy.
The “block-splitting” mode of DAC can be recommended for systems with low computational capabilities, for instance, edge computing [19].
Further, for each UBMAD, the best compression is provided by the “chroma-subsampling” mode of the DAC algorithm. However, the quality loss measured by the MAD, RMSE and PSNR indicators is larger than for the other modes. Comparing the source and decompressed samples, one can see that the distortions are hard for the human eye to notice (Figure 15). In other words, visually lossless compression is obtained when UBMAD is not greater than 155. Taking into account the memory expense reduction (Figure 14b) in combination with the minor impact on the classifiers' performance, this preprocessing mode of DAC can be recommended for compressing digital photos and other types of still images.
We stress that the source images are given in the JPEG format. Figure 14 shows memory expenses required for storing raw data reconstructed from JPEG files. For this reason, Figure 14 should not be considered as a comparison of DAC with JPEG in terms of lossy image compression. An appropriate exploration has been carried out in our previous research [49]. In particular, it has been shown that DAC provides better compression than JPEG with the same loss of quality measured by PSNR.
Finally, the “classic” mode of DAC does not demonstrate superiority over the other modes. Indeed, in terms of image compression metrics, this mode is similar to “block-splitting”, but it is slower and has higher spatial complexity. Nevertheless, when using the “classic” mode, the matrices of DAT coefficients contain a representation of the whole image, not its separate blocks. This feature might be of particular importance in constructing DAT-based machine learning methods.

5. Conclusions

In this research, it has been shown that the distortions produced by the DAC algorithm with different preprocessing modes and quality-loss settings (UBMAD ≤ 155) have no significant impact on the performance of the MobileNetV2, VGG16, VGG19, ResNet50, NASNetMobile and NASNetLarge classifiers. Hence, these deep convolutional neural networks are robust to lossy (at least, visually lossless) compression by DAC. So, this algorithm can be recommended for reducing memory expenses when decompressed images are subsequently classified. It has been shown that considerable memory savings can be obtained using the “classic”, “block-splitting” and “chroma-subsampling” preprocessing modes of DAC.
The “block-splitting” mode demonstrated the best time performance, which, in combination with its low spatial complexity, makes it preferable for application in edge computing. Also, taking into account the data protection features of DAC, we state that this algorithm can be recommended for use in imaging systems where protection is important, in particular in systems installed on unmanned aerial vehicles [31,50].
It has been demonstrated that DAC with “chroma-subsampling” ensures the highest compression. In addition, despite higher values of the quality-loss metrics, this mode can be considered near lossless due to the absence of visible distortions.
Finally, it follows that DAT coefficients contain the image features required for highly effective classification by deep convolutional neural networks. So, the image representation used by DAC can be positioned as machine-learning oriented.

Author Contributions

Conceptualization, V.M. and V.L.; methodology, V.L.; software, V.M.; validation, V.L. and I.B.; formal analysis, V.M. and V.L.; investigation, V.M., V.L. and I.B.; resources, V.L.; data curation, V.L.; writing—original draft preparation, V.M.; writing—review and editing, V.L. and I.B.; visualization, V.M.; supervision, V.L.; project administration, V.M.; funding acquisition, I.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by the National Research Foundation of Ukraine (https://nrfu.org.ua/en/) within the project no. 2023.04/0039 “Geospatial monitoring system for the war impact on the agriculture of Ukraine based on satellite data” (2024–2025).

Data Availability Statement

The original data presented in the study are openly available in the ImageNet database at https://www.image-net.org (accessed on 27 June 2024). Processing results, including decompressed images, tables with computed lossy compression metrics, and Jupyter notebooks providing their classification and exploration, are openly available at the link to the Google Drive folder https://drive.google.com/drive/folders/1u9m3amxV-7kKmxMfyhlIXWIeGzw_iqSJ?usp=drive_link (accessed on 27 June 2024).

Acknowledgments

The authors are personally grateful to V.O. Rvachov for his attention and support of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mourtzis, D.; Angelopoulos, J.; Panopoulos, N. A Literature Review of the Challenges and Opportunities of the Transition from Industry 4.0 to Society 5.0. Energies 2022, 15, 6276. [Google Scholar] [CrossRef]
  2. Karnik, N.; Bora, U.; Bhadri, K.; Kadambi, P.; Dhatrak, P. A comprehensive study on current and future trends towards the characteristics and enablers of industry 4.0. J. Ind. Inf. Integr. 2022, 27, 100294. [Google Scholar]
  3. Kreijns, K.; Xu, K.; Weidlich, J. Social Presence: Conceptualization and Measurement. Educ. Psychol. Rev. 2022, 34, 139–170. [Google Scholar] [CrossRef] [PubMed]
  4. Bataeva, E.V. An ethno-methodological analysis of on-line communications. A crisis experiment in chats. Sotsiologicheskie Issled. 2011, 12, 88–97. [Google Scholar]
  5. Limna, P.; Kraiwanit, T.; Siripipatthanakul, S. The Growing Trend of Digital Economy: A Review Article. Int. J. Comput. Sci. Res. 2022, 7, 1351–1361. [Google Scholar] [CrossRef]
  6. Cao, L. AI in Finance: Challenges, Techniques, and Opportunities. ACM Comput. Surv. 2022, 55, 1–38. [Google Scholar]
  7. Wu, W.-T.; Li, Y.-J.; Feng, A.-Z.; Li, L.; Huang, T.; Xu, A.-D.; Lyu, J. Data mining in clinical big data: The frequently used databases, steps, and methodological models. Mil. Med. Res. 2021, 8, 44. [Google Scholar] [CrossRef]
  8. Agrawal, R.; Prabakaran, S. Big data in digital healthcare: Lessons learnt and recommendations for general practice. Heredity 2020, 124, 525–534. [Google Scholar] [CrossRef] [PubMed]
  9. Abkenar, S.B.; Kashani, M.H.; Mahdipour, E.; Jameii, S.M. Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telemat. Inform. 2021, 57, 101517. [Google Scholar] [CrossRef]
  10. Baig, M.I.; Shuib, L.; Yadegaridehkordi, E. Big data in education: A state of the art, limitations, and future research directions. Int. J. Educ. Technol. High Educ. 2020, 17, 44. [Google Scholar] [CrossRef]
  11. Naik, B.T.; Hashmi, M.F.; Bokde, N.D. A Comprehensive Review of Computer Vision in Sports: Open Issues, Future Trends and Research Directions. Appl. Sci. 2022, 12, 4429. [Google Scholar] [CrossRef]
  12. Nauman, A.; Qadri, Y.A.; Amjad, M.; Zikria, Y.B.; Afzal, M.K.; Kim, S.W. Multimedia Internet of Things: A Comprehensive Survey. IEEE Access 2020, 8, 8202–8250. [Google Scholar] [CrossRef]
  13. Cisco Annual Internet Report (2018–2023) White Paper. Available online: https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html (accessed on 29 June 2024).
  14. General Data Protection Regulation GDPR. Available online: https://gdpr-info.eu/ (accessed on 29 June 2024).
  15. California Consumer Privacy Act of 2018. Available online: https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?division=3.&part=4.&lawCode=CIV&title=1.81.5 (accessed on 29 June 2024).
  16. Cioffi, R.; Travaglioni, M.; Piscitelli, G.; Petrillo, A.; De Felice, F. Artificial Intelligence and Machine Learning Applications in Smart Production: Progress, Trends, and Directions. Sustainability 2020, 12, 492. [Google Scholar] [CrossRef]
  17. Araújo, S.O.; Peres, R.S.; Ramalho, J.C.; Lidon, F.; Barata, J. Machine Learning Applications in Agriculture: Current Trends, Challenges, and Future Perspectives. Agronomy 2023, 13, 2976. [Google Scholar] [CrossRef]
  18. The State of the Edge Report 2023. Available online: https://stateoftheedge.com/reports/state-of-the-edge-report-2023/ (accessed on 29 June 2024).
  19. Hamdan, S.; Ayyash, M.; Almajali, S. Edge-Computing Architectures for Internet of Things Applications: A Survey. Sensors 2020, 20, 6441. [Google Scholar] [CrossRef] [PubMed]
  20. Sayood, K. Introduction to Data Compression, 5th ed.; Morgan Kaufman: Cambridge, MA, USA, 2017. [Google Scholar]
  21. Ahmed, S.T.; Hammood, D.A.; Chisab, R.F.; Al-Naji, A.; Chahl, J. Medical Image Encryption: A Comprehensive Review. Computers 2023, 12, 160. [Google Scholar] [CrossRef]
  22. Pillai, D.K. New Computational Models for Image Remote Sensing and Big Data. In Big Data Analytics for Satellite Image Processing and Remote Sensing; Swarnalatha, P., Sevugan, P., Eds.; IGI Global: Hershey, PA, USA, 2018; pp. 1–21. [Google Scholar]
  23. Joshi, N.; Baumann, M.; Ehammer, A.; Fensholt, R.; Grogan, K.; Hostert, P.; Jepsen, M.R.; Kuemmerle, T.; Meyfroidt, P.; Mitchard, E.T.A.; et al. A Review of the Application of Optical and Radar Remote Sensing Data Fusion to Land Use Mapping and Monitoring. Remote Sens. 2016, 8, 70. [Google Scholar] [CrossRef]
  24. Song, W.; Song, W.; Gu, H.; Li, F. Progress in the Remote Sensing Monitoring of the Ecological Environment in Mining Areas. Int. J. Environ. Res. Public Health 2020, 17, 1846. [Google Scholar] [CrossRef]
  25. Lechner, A.M.; Foody, G.M.; Boyd, D.S. Applications in remote sensing to forest ecology and management. One Earth 2020, 2, 405–412. [Google Scholar] [CrossRef]
  26. Kussul, N.; Lavreniuk, M.; Skakun, S.; Shelestov, A. Deep learning classification of land cover and crop types using remote sensing data. IEEE Geosci. Remote Sens. Lett. 2017, 14, 778–782. [Google Scholar] [CrossRef]
  27. Sishodia, R.P.; Ray, R.L.; Singh, S.K. Applications of Remote Sensing in Precision Agriculture: A Review. Remote Sens. 2020, 12, 3136. [Google Scholar] [CrossRef]
  28. Kross, E.; Verduyn, P.; Sheppes, G.; Costello, G.K.; Jonides, J.; Ybarra, O. Social Media and Well-Being: Pitfalls, Progress, and Next Steps. Trends Cogn. Sci. 2021, 25, 55–66. [Google Scholar] [CrossRef]
  29. People Will Take 1.2 Trillion Digital Photos This Year—Thanks to Smartphones. Available online: https://www.businessinsider.com/12-trillion-photos-to-be-taken-in-2017-thanks-to-smartphones-chart-2017-8 (accessed on 29 June 2024).
  30. Nwokeji, C.E.; Sheikh-Akbari, A.; Gorbenko, A.; Mporas, I. Source Camera Identification Techniques: A Survey. J. Imaging 2024, 10, 31. [Google Scholar] [CrossRef] [PubMed]
  31. Fedorenko, G.; Fesenko, H.; Kharchenko, V.; Kliushnikov, I.; Tolkunov, I. Robotic-biological systems for detection and identification of explosive ordnance: Concept, general structure, and models. Radioelectron. Comput. Syst. 2023, 106, 143–159. [Google Scholar]
  32. Jin, H.; Jin, X.; Zhou, Y.; Guo, P.; Ren, J.; Yao, J.; Zhang, S. A survey of energy efficient methods for UAV communication. Veh. Commun. 2023, 41, 100594. [Google Scholar] [CrossRef]
  33. Makarichev, V.; Vasilyeva, I.; Lukin, V.; Vozel, B.; Shelestov, A.; Kussul, N. Discrete Atomic Transform-Based Lossy Compression of Three-Channel Remote Sensing Images with Quality Control. Remote Sens. 2022, 14, 125. [Google Scholar] [CrossRef]
  34. Makarichev, V.; Lukin, V.; Illiashenko, O.; Kharchenko, V. Digital Image Representation by Atomic Functions: The Compression and Protection of Data for Edge Computing in IoT Systems. Sensors 2022, 22, 3751. [Google Scholar] [CrossRef] [PubMed]
  35. Rvachev, V.A. Compactly supported solutions of functional-differential equations and their applications. Russ. Math. Surv. 1990, 45, 87–120. [Google Scholar] [CrossRef]
  36. Makarichev, V.; Lukin, V.; Brysina, I. On the Applications of the Special Class of Atomic Functions: Practical Aspects and Perspectives. In Integrated Computer Technologies in Mechanical Engineering; Nechyporuk, M., Pavlikov, V., Kritskiy, D., Eds.; Springer: Cham, Switzerland, 2021; Volume 188, pp. 42–54. [Google Scholar]
  37. Makarichev, V.O.; Lukin, V.V.; Brysina, I.V.; Vozel, B. Spatial Complexity Reduction in Remote Sensing Image Compression by Atomic Functions. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6517305. [Google Scholar] [CrossRef]
  38. Makarichev, V.O.; Lukin, V.V.; Kharchenko, V.S. Image Compression and Protection Systems Based on Atomic Functions. Int. J. Comput. 2023, 22, 283–291. [Google Scholar] [CrossRef]
  39. Szeliski, R. Computer Vision: Algorithms and Applications, 2nd ed.; Springer: Cham, Switzerland, 2022. [Google Scholar]
  40. Park, J.; Jung, Y. A review and comparison of convolution neural network models under a unified framework. Commun. Stat. Appl. Methods 2022, 29, 161–176. [Google Scholar] [CrossRef]
  41. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  42. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  44. Zoph, B.; Vasudevan, V.; Shlens, J.; Le, Q. Learning Transferable Architectures for Scalable Image Recognition. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8697–8710. [Google Scholar]
  45. Gonzalez, R.C.; Woods, R.E. Digital Image Processing, 4th ed.; Pearson: New York, NY, USA, 2018. [Google Scholar]
  46. ImageNet. Available online: https://www.image-net.org/ (accessed on 29 June 2024).
  47. Keras Applications. Available online: https://keras.io/api/applications/ (accessed on 29 June 2024).
  48. Bryant, R.; O’Hallaron, D. Computer Systems: A Programmer’s Perspective, 3rd ed.; Pearson: London, UK, 2015. [Google Scholar]
  49. Makarichev, V.O.; Lukin, V.V.; Brysina, I.V.; Vozel, B.; Chehdi, K. Atomic wavelets in lossy and near-lossless image compression. In Proceedings of the Image and Signal Processing for Remote Sensing XXVI, SPIE Remote Sensing, Edinburgh, UK, 21–25 September 2020; Volume 11533, p. 1153313. [Google Scholar]
  50. Michailidis, E.T.; Maliatsos, K.; Skoutas, D.N.; Vouyioukas, D.; Skianis, C. Secure UAV-Aided Mobile Edge Computing for IoT: A Review. IEEE Access 2022, 10, 86353–86383. [Google Scholar] [CrossRef]
Figure 1. Discrete atomic transform.
Figure 2. Block structure of DAT coefficients matrix.
Figure 3. Test image samples from the ImageNet database: (a) n02687172_1299; (b) n02704792_12522; (c) n04389033_11909; (d) n04552348_11494.
Figure 4. Compression by DAC: the MAD indicator.
Figure 5. Compression by DAC: the RMSE indicator.
Figure 6. Compression by DAC: the PSNR indicator.
Figure 7. Compression by DAC: compression ratio.
Figure 8. Compression by DAC: memory expenses, MB.
Figure 9. Processing by DAC: time of compression, s.
Figure 10. Processing by DAC: time of decompression, s.
Figure 11. Percentage of the correctly classified decompressed images: the “classic” preprocessing mode of DAC.
Figure 12. Percentage of the correctly classified decompressed images: the “block-splitting” preprocessing mode of DAC.
Figure 13. Percentage of the correctly classified decompressed images: the “chroma-subsampling” preprocessing mode of DAC.
Figure 14. The comparison of memory expenses required for storing source and compressed data: (a) the “block-splitting” mode; (b) the “chroma-subsampling” mode; (c) the “classic” mode.
Figure 15. Processing the test image “n04552348_10588” by DAC with “chroma-subsampling” and UBMAD = 155: (a) source image; (b) decompressed image (MAD = 54, RMSE = 3.57 and PSNR = 37.32 dB).
Table 1. Results of the test image compression using DAC: the “classic” preprocessing mode.

UBMAD | MAD   | RMSE | PSNR, dB | CR    | Total Compression Time, s | Total Decompression Time, s
36    | 7.31  | 1.18 | 46.72    | 4.30  | 9.23 | 8.01
63    | 12.02 | 1.79 | 43.15    | 7.59  | 7.73 | 6.73
95    | 17.18 | 2.44 | 40.51    | 10.04 | 7.24 | 6.32
155   | 25.95 | 3.39 | 37.75    | 14.80 | 6.72 | 5.89
Table 2. Results of the test image compression using DAC: the “block-splitting” preprocessing mode.

UBMAD | MAD   | RMSE | PSNR, dB | CR    | Total Compression Time, s | Total Decompression Time, s
36    | 7.40  | 1.18 | 46.71    | 4.05  | 7.90 | 5.09
63    | 12.18 | 1.79 | 43.15    | 7.11  | 6.35 | 3.72
95    | 17.49 | 2.44 | 40.51    | 9.41  | 5.85 | 3.28
155   | 26.42 | 3.39 | 37.75    | 13.85 | 5.29 | 2.80
Table 3. Results of the test image compression using DAC: the “chroma-subsampling” mode.

UBMAD | MAD   | RMSE | PSNR, dB | CR    | Total Compression Time, s | Total Decompression Time, s
36    | 44.52 | 2.23 | 42.17    | 5.60  | 7.76 | 6.99
63    | 45.90 | 2.70 | 40.19    | 9.30  | 6.79 | 6.10
95    | 47.49 | 3.23 | 38.42    | 12    | 6.42 | 5.76
155   | 52.48 | 4.07 | 36.29    | 17.04 | 6.06 | 5.44
Table 4. Percentage of the correctly classified decompressed images: the “classic” mode of DAC.

Model        | UBMAD = 36 | UBMAD = 63 | UBMAD = 95 | UBMAD = 155
MobileNetV2  | 96.58      | 96.58      | 97.43      | 96.58
VGG16        | 96.58      | 96.58      | 97.43      | 96.58
VGG19        | 96.61      | 96.61      | 97.45      | 97.45
ResNet50     | 97.47      | 97.47      | 96.63      | 96.63
NASNetMobile | 97.54      | 97.54      | 97.54      | 95.90
NASNetLarge  | 98.42      | 98.42      | 97.63      | 97.63
Table 5. Percentage of the correctly classified decompressed images: the “block-splitting” mode.

Model        | UBMAD = 36 | UBMAD = 63 | UBMAD = 95 | UBMAD = 155
MobileNetV2  | 96.58      | 96.58      | 97.43      | 96.58
VGG16        | 96.58      | 97.43      | 97.43      | 96.58
VGG19        | 96.61      | 96.61      | 97.45      | 97.45
ResNet50     | 97.47      | 97.47      | 96.63      | 96.63
NASNetMobile | 97.54      | 97.54      | 97.54      | 96.72
NASNetLarge  | 98.42      | 98.42      | 97.63      | 98.42
Table 6. Percentage of the correctly classified decompressed images: the “chroma-subsampling” mode.

Model        | UBMAD = 36 | UBMAD = 63 | UBMAD = 95 | UBMAD = 155
MobileNetV2  | 97.43      | 97.43      | 97.43      | 97.43
VGG16        | 97.43      | 97.43      | 97.43      | 97.43
VGG19        | 96.61      | 94.91      | 96.61      | 97.45
ResNet50     | 97.47      | 97.47      | 96.63      | 97.47
NASNetMobile | 97.54      | 97.54      | 97.54      | 95.90
NASNetLarge  | 97.63      | 97.63      | 96.06      | 96.85
Table 7. Models’ characteristics [47].

Model        | Depth | Number of Parameters, Millions
MobileNetV2  | 55    | 4.3
VGG16        | 16    | 138.4
VGG19        | 19    | 143.7
ResNet50     | 107   | 25.6
NASNetMobile | 389   | 5.3
NASNetLarge  | 533   | 88.9