1. Introduction
Quantum computing (QC) has been one of the most active research topics in recent years, and the field is highly interesting in terms of the possibilities that new advancements could bring to various fields of study, such as artificial intelligence (AI), physics, and chemistry. One field that stands to benefit greatly is AI, and Quantum Machine Learning (QML) in particular. In the context of AI, quantum computers can be used to increase the efficiency of machine learning (ML) operations. This takes the form of QML, where a full neural network operates on quantum hardware [1,2], and of hybrid quantum–classical neural networks (HNNs), where part of the ML model is powered by classical hardware and another part is executed on quantum hardware. Since quantum computers are able to perform certain computations significantly faster and, in addition, are more power-efficient than classical ones, their usage for ML promises to be very fruitful in the future.
Modern ML solutions in general, and ML applications for image classification tasks in particular, face two major problems: high power consumption and slow neural network runtime. This can be attributed to the fact that recent advancements in the AI field keep producing more complex state-of-the-art model architectures, which require more processing power. The problem of power efficiency will become more acute in the future, given the current trend of AI applications becoming ever more widespread across miscellaneous fields of modern life. Furthermore, since AI usage in time-critical applications is growing, the computation speedup that QC can provide should help improve the quality of service of various critical systems that rely on AI solutions. One example of such an application is satellite imagery analysis, such as detecting areas affected by natural or human-made disasters. Such applications rely heavily on image segmentation and image classification performed by AI algorithms, which can benefit greatly from faster and more performant ML models. We consider QML and HNNs highly promising because the significantly higher power efficiency and major computation speedup that QC can provide could bring new and more sustainable AI solutions in the future.
In our previous work, we studied the latter approach for solving various multi-class image classification problems [3]. We researched an NN hybridization technique where a quantum device was used as one of the hidden layers of the NN. Our research indicated the feasibility of using HNNs for solving different classical and more practical problems [4], but the experiments on HNNs showed significantly lower accuracy compared to their classical counterparts. Nevertheless, trading lower accuracy for faster processing can be acceptable in a number of situations.
In this paper, we research an alternative to the NN hybridization technique of our previous work: the resulting HNN is built using a quantum device as the first, quanvolutional layer of the NN. This hybridization method makes it possible to utilize various complex and well-established ML techniques, such as transfer learning [5], and state-of-the-art NN architectures for building HNNs. Having a quanvolutional layer as the first layer of the HNN allows us to optimize the model training process by dividing the quantum and classical parts of the HNN into two separate steps. This is highly beneficial in terms of quantum computer resource utilization because it allows us to perform the quantum computations over the whole training dataset only once and then reuse the quantum pre-processed data for training the classical part of the network. In the current state of quantum computers, this makes the model training process much cheaper because quantum hardware usage is optimized.
The remainder of this paper has the following structure: Section 2, Background and Related Work, provides a short summary of recent research in the domain of hybrid quantum–classical NNs; Section 3, Materials and Methods, describes the tooling, datasets, model architectures, and approaches used for processing the datasets with various quantum-device-based transformations, as well as the structure of the experiments for the chosen architectures; Section 4, Results and Discussion, summarizes the results obtained during the experiments; and Section 5, Conclusions and Future Work, reviews the obtained results together with an analysis of the methods used and proposes future directions of research.
2. Background and Related Work
QML in general, and HNNs in particular, are promising and relevant fields of study. Since QC has not yet become a widely adopted technology, applications of QC in the machine learning field remain largely theoretical and within the research domain rather than practical. However, a number of recent studies indicate the feasibility of HNN applications for solving both theoretical and practical problems in various domains.
Medical applications of AI are among the hottest and most important research topics because they strive to improve patients’ well-being by helping doctors analyze examination results. A large number of recent studies have been conducted in this area, such as the research by Mesut Toğaçar et al. [6] on retinal disease detection based on OCT images. It is also one of the domains where HNNs proved to be a viable alternative to classical models, specifically for illness detection based on patient examination results. One recent successful example in this area is the research by Ahalya, R.K. et al. [7] on rheumatoid arthritis detection based on thermal images of patients’ hands. Other studies in the medical domain, by Ullah et al. [8] on the ischemic cardiopathy classification problem and by Yumin Dong et al. [9] on brain tumor classification based on MRI imagery, indicated that HNNs can even reach higher accuracy than their classical counterparts, which proved the feasibility of applying HNNs to medicine-related image classification problems.
Other research fields where quantum–classical HNNs shine are chemistry and biology. Research performed by Gregory W. Kyro et al. [10] indicated that HNNs can reach state-of-the-art accuracy on protein–ligand binding affinity prediction problems.
QML for image processing [11] has seen significant advancements in recent years, with researchers exploring the potential of QC to enhance traditional deep learning (DL) techniques, including quantum feature extraction [12], quantum image processing [11], quantum-inspired convolutional neural networks [13], and hybrid quantum–classical architectures [14,15].
Quantum feature extraction (QFE) refers to the process of extracting relevant features or characteristics from quantum datasets using QC techniques. QFE involves leveraging the principles of quantum mechanics to analyze and identify patterns or properties in quantum data, which can then be used for various ML tasks such as classification, clustering, or dimensionality reduction. QFE aims to exploit the unique capabilities of QC to enhance the efficiency and effectiveness of feature extraction algorithms, particularly in scenarios where classical methods may be limited in handling large or complex datasets [12].
Quantum image processing (QIP) focuses on harnessing QC technologies to encode, manipulate, and reconstruct quantum images across various formats and applications [11]. This inspires the development of encoding schemes that leverage the inherent quantum mechanical properties of potential QC hardware.
Quantum-inspired convolutional neural networks (QICNNs) leverage complex-valued representations and operations, where the input real space is initially transformed into the complex space. In this complex space, parameters are manipulated using operations inspired by quantum computing principles. Recent research on QICNNs highlights the advantages of employing complex parameters in DL, citing benefits across computational, biological, and signal processing domains [13]. Complex numbers offer enhanced representational capacity, facilitating potentially easier optimization, improved generalization capabilities, and accelerated learning.
Hybrid quantum–classical architectures (HQCAs) [14,15] have become very popular for effectively extracting high-level features from imagery data for classification purposes. For example, in the hybrid quantum–classical CNN (QC-CNN), the quantum part is a parameterized circuit that extracts essential features from images, and the classical part conducts the classification accordingly. In addition, the QC-CNN model exploits the amplitude encoding technique for image classification tasks, which requires relatively fewer qubits than computation-based encoding [14]. The shallow hybrid quantum–classical convolutional neural network (SHQCNN) architecture is based on an augmented variational quantum framework and exhibits efficacy in image classification tasks [15]. Employing a kernel encoding strategy in the input layer, SHQCNN facilitates an enhanced discriminative capability of data representations. Within the hidden layer, a tailored variational quantum circuit architecture is deployed to mitigate network depth and intricacy.
Recently, quantum convolutional, or quanvolutional, NNs have attracted substantial research interest. Many recent studies have examined this topic for solving image classification problems of various levels of complexity, such as the classification of images of handwritten characters [16,17] or flower classification [18]. Some of the research addressed inherently more complex problems, like the classification of ImageNet images with applied noise by Debanjan Konar et al. [19], which demonstrated that HNNs can surpass classical models such as ResNet-18 in accuracy on a complex image classification problem.
From these recent studies, it can be concluded that the topic of quantum–classical HNNs is of high research interest. Recent works indicate that HNNs can be successfully and effectively applied to problems from widely different study domains, and that some problems can be solved with HNNs at higher accuracy than with the reference classical models.
3. Materials and Methods
3.1. Development Tools
In this research, we used the PyTorch library [20] for building and manipulating complex NNs (in Section 4.4) along with the TensorFlow framework [21] for experiments on the simple HNNs (in Section 4.2 and Section 4.3). All our experiments were conducted in a Jupyter Notebook environment. The PennyLane library [22] was used for orchestrating the quantum pre-processing of images. For conducting experiments without access to a physical quantum computer, we used quantum simulator software to simulate QC on a classical computer.
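As a minimal illustration of the simulator setup, the sketch below creates a PennyLane simulator device; the 4-qubit count corresponds to a 2 × 2 quanvolutional kernel and is an example, not our exact experimental configuration:

```python
import pennylane as qml

# PennyLane's built-in state-vector simulator stands in for real quantum
# hardware; a 2 x 2 quanvolutional kernel needs one qubit per pixel (4 wires).
dev = qml.device("default.qubit", wires=4)
```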
All source code for the experiments is available on GitHub [23].
3.2. Datasets
Taking into account the diversity of datasets and the applicability of HNNs in real-world scenarios, a standard dataset (CIFAR100 [24]) and a specific dataset (“Satellite Images of Hurricane Damage” (HD) [25]) were selected for researching the feasibility of HNN applications for solving image classification problems.
CIFAR100 is one of the more classical and widely used datasets. It contains 60,000 32 × 32 RGB images of 100 different classes grouped into 20 categories, or superclasses. Images are equally distributed across all classes, so each class has 500 training images and 100 testing images. Each superclass contains an equal number of classes, giving 2500 training images and 500 testing images per superclass.
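For reference, a minimal sketch of loading the dataset with torchvision (the root path is an example):

```python
import torchvision

# Downloads CIFAR100 to ./data on first use: 50,000 training and 10,000 test images.
train_set = torchvision.datasets.CIFAR100(root="./data", train=True, download=True)
test_set = torchvision.datasets.CIFAR100(root="./data", train=False, download=True)
```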
Figure 1 shows a sample of the CIFAR100 dataset.
As an example of a real-world scenario of HNN application, we chose the HD dataset because we believe that one of the prominent fields where QC can be used is satellite imagery analysis. This is especially relevant in the context of how problems such as damage assessment after natural or human-made catastrophes are currently handled. One of the methods used for such tasks is the windshield survey, which relies on volunteers and emergency response crews driving around the affected area and visually inspecting it. This is both a labor-intensive and dangerous activity, and it can be impossible in some cases (for example, in various war-related scenarios). Instead of the aforementioned method, or in addition to it, damage assessment problems can be solved using manual or automatic analysis of satellite imagery of the affected area. Our assumption that QC can be utilized for this problem is based on the fact that satellites produce enormous amounts of data, and the task of analyzing such massive datasets can benefit greatly from the computation speedup that QC can provide.
The HD dataset contains 23,000 256 × 256 RGB pictures of damaged and undamaged buildings. It is composed of images taken from a satellite over the Greater Houston area after Hurricane Harvey affected the area in 2017. In this research, we tried to eliminate the potential bias of the HD dataset; thus, we used a subset that contains an equal number of images of every class: 5000 images of damaged and 5000 of undamaged buildings.
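A minimal sketch of drawing such a balanced subset (the helper name and path lists are hypothetical, for illustration only):

```python
import random

def balanced_subset(damaged_paths, undamaged_paths, per_class=5000, seed=0):
    # Sample an equal number of images per class to avoid class imbalance.
    rng = random.Random(seed)
    return (rng.sample(damaged_paths, per_class),
            rng.sample(undamaged_paths, per_class))
```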
Figure 2 shows a sample of the dataset with examples of both damaged and undamaged buildings.
Both datasets used in this research are therefore unbiased, with images equally distributed among classes. Since this research focuses on the image classification problem in general, we eliminated the potential bias in the HD dataset because it could affect the measured performance of HNNs.
3.3. Model Hybridization Technique
In our previous research, we evaluated a hybridization technique where the quantum device is embedded in the neural network in place of one of the layers of artificial neurons in the middle of the network. Such an approach has several major drawbacks; the main limitation is that quantum hardware is needed for the full training process, which is quite expensive on a physical quantum computer in the current environment of limited QC supply, or very time-consuming if a quantum simulator is used for model training.
In this paper, we describe a hybridization technique based on the following intuition: if we can use a quantum device as the first layer of an HNN, then, for training, we can process every data point in the training dataset with the quantum device only once, resulting in more efficient utilization of quantum computer resources during the training process. This means that we can highly optimize the model training process. We can also apply the transfer learning technique to the HNN and embed the quantum device as the first layer of state-of-the-art models, thus boosting the efficiency of the HNN and reusing proven, highly efficient architectures of classical neural networks for building HNNs. Another major benefit of the described hybridization technique is that the same outputs produced by the quantum layer can be reused across many different HNN architectures: these outputs are essentially equivalent to a pre-processing and augmentation of the original dataset and can be treated as a new dataset.
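A minimal sketch of this “pre-process once, reuse” idea (the helper names and cache path are hypothetical; quanv_filter stands for the quantum transformation described in Section 3.5):

```python
import numpy as np

def preprocess_dataset(images, quanv_filter, cache_path="quanv_cache.npy"):
    # Reuse the results of an earlier quantum run if they were already cached.
    try:
        return np.load(cache_path)
    except FileNotFoundError:
        pass
    # Otherwise, run the quantum transformation exactly once per data point
    # and persist the outputs so any classical architecture can reuse them.
    processed = np.stack([quanv_filter(img) for img in images])
    np.save(cache_path, processed)
    return processed
```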
3.4. Architecture Selection
To compare the effect of hybridization on well-known neural network architectures, ResNet, EfficientNet, and VGG-16 were selected. We chose these architectures based on the idea of selecting two state-of-the-art models for every dataset used in this research and comparing their performance with HNNs derived from the same state-of-the-art models.
CIFAR100 is a classical dataset that is widely used in research, and a large number of models have been evaluated on it. The current state-of-the-art model for the image classification problem on the CIFAR100 dataset is EfficientNet, which showed a classification accuracy of 91.7% in its founding paper [26]. EfficientNet is based on a compound coefficient technique for scaling up models and on AutoML. Another model that is widely used in industry and showed decent classification accuracy on the CIFAR100 dataset is ResNet, with an accuracy of 77.3% [27]. ResNet is a deep CNN that stacks residual blocks on top of each other in order to boost accuracy compared to the regular CNN architecture.
The HD dataset is much less used in research than CIFAR100, and a much smaller number of models have been evaluated on it. The best accuracy on the HD dataset was shown by EfficientNet, with an impressive classification accuracy of 99% [28]. Another highly efficient model on the HD dataset is VGG-16 [29], which showed a classification accuracy of 97% [30]. VGG-16 is a deep CNN that has 16 layers of trainable parameters and a pyramid-like structure.
In conclusion, in this research, we used EfficientNet, ResNet, and VGG-16 in our experiments to obtain metrics of the efficiency of the hybridization technique. We also leveraged the transfer learning technique and, for all experiments, used the aforementioned models with weights pre-trained on the ImageNet dataset [31].
3.5. Data Pre-Processing Process
Both the CIFAR100 and HD datasets were pre-processed using several different quantum devices that performed different operations on the input images. This pre-processing is effectively the first quantum convolutional, or quanvolutional, layer of an HNN. The process has three hyper-parameters: the size of the convolutional kernel, the stride of the convolutional kernel, and the type of operations run in the quantum device. In this research, we used 2 × 2 and 3 × 3 convolutional kernels, stride values of 1 and 2, and X, XY, and XYZ qubit rotations as operations in a random quantum circuit.
Figure 3 shows a diagram of an example random quantum circuit with 9 qubits.
The image processing algorithm we used was described by Henderson et al. [32]. For every convolution window, we used the value of every pixel in the window as an input to a quantum circuit initialized with random weights and containing a corresponding number of qubits: for a 2 × 2 kernel, we used a circuit containing 4 qubits, and for a 3 × 3 kernel, a circuit with 9 qubits. We then applied rotations to the qubits and measured the outputs of the quantum device. The number of outputs of a quantum circuit equals the number of qubits in it, which corresponds to the number of pixels in the convolutional kernel. Each output of the quantum device is treated as a separate layer of the output convolution, so for a 2 × 2 kernel, we received 4 convolutional layers as output. Each output layer also has 3 channels, corresponding to the RGB channels of the image.
We chose a relatively small convolutional kernel size for multiple reasons. The first is that the resolution of the original CIFAR100 images is only 32 × 32 pixels. The second is that a 5 × 5 kernel would require a quantum circuit with 25 qubits, which is impractical in today’s environment. This research focuses on the feasibility of applying HNNs with a quanvolutional layer and omits a study of the influence of hyper-parameters on HNN applications.
3.6. Hybrid Neural Networks
In this research, we took a naive approach to building HNNs from classical NNs: we used the output of the quantum device as the input to the classical part of the HNN, where the classical part is essentially an unmodified model.
This approach poses a problem: the number of dimensions of the outputs from the quantum device is n times higher than the number of inputs the classical part of the model takes, where n equals the number of qubits in the quantum device. We resolved this issue by treating every output of the quanvolutional layer as a separate data point in the dataset during training and validation of the model. Essentially, the quanvolutional operation on the input data increases the size of the dataset for the classical part 4 times when a 2 × 2 quanvolutional kernel is used and 9 times when a 3 × 3 quanvolutional kernel is used, as the sketch below illustrates.
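A sketch of the reshaping this implies before training the classical part (tensor names and shapes are illustrative):

```python
import torch

def expand_quanv_outputs(x, labels):
    # x: (batch, n_qubits, C, H, W) - quanvolution outputs for a batch
    # labels: (batch,) - original labels
    batch, n_qubits, C, H, W = x.shape
    # Treat every qubit output as its own training example...
    x_flat = x.reshape(batch * n_qubits, C, H, W)
    # ...and replicate each label once per qubit output, in matching order.
    labels_flat = labels.repeat_interleave(n_qubits)
    return x_flat, labels_flat
```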
Another problem is the size mismatch between the output of the quanvolutional layer and the input of the classical model. This issue appears even if the size of the original image exactly matches the expected input size of the classical model used in the HNN, because the quanvolutional operation, exactly like a classical convolutional operation, reduces the size of the input data. The dimensions of the output of the quanvolutional layer can be computed using the formula

W′ = (W − K + 2P)/S + 1,

where W is the original size of the image, W′ is the size of the output image, K is the size of the quanvolution kernel, P is the size of the padding, and S is the stride. For example, a 32 × 32 CIFAR100 image processed with a 2 × 2 kernel, stride 1, and no padding yields a 31 × 31 output. In our experiments, the output of the quanvolutional layer was always much smaller than the expected input size of the classical part of the HNN. To overcome this problem, images produced by the quanvolutional layer needed to be upscaled to match the expected size. For scaling images, we used a bi-linear interpolation algorithm, which we chose over alternatives because it provides a reasonable trade-off between the quality of the results and processing speed, which is satisfactory for our case. It is worth noting that applying interpolation after the quanvolutional operation might be sub-optimal because it can affect the quality of the image produced by the quantum layer of the HNN. The effect of interpolation at different stages of the HNN on the performance of the network is a subject for future research.
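A sketch of the upscaling step using PyTorch’s built-in bi-linear interpolation (the 224 × 224 target is a typical input size for ImageNet-pre-trained models and is an assumption here, not necessarily the size used in every experiment):

```python
import torch.nn.functional as F

# x: (batch, channels, W', W') - output of the quanvolutional layer
x_up = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
```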
In our experiments, we studied HNN metrics such as classification accuracy on the validation data subset and model loss during training. We compared these metrics between HNNs and the corresponding classical reference models and drew our conclusions based on them.
For computing the test accuracy of HNNs, we used the Majority Voting (MV) technique. The idea behind this is simple: when used on a real problem, an HNN receives a single input image and should produce a single output prediction for it. The fact that the quanvolutional layer increases the number of channels of the input can be utilized to predict the output in a consensus-like manner by running MV on the results of the classical part of the model. The final prediction is thus the most common class reported by the classical part of the HNN across the outputs of the quanvolutional layer.
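A minimal sketch of MV over the classical part’s per-channel predictions (names and shapes are illustrative):

```python
import torch

def majority_vote(logits_per_channel):
    # logits_per_channel: (n_qubits, n_classes) - classical-part outputs for
    # each quantum-augmented version of a single input image.
    preds = logits_per_channel.argmax(dim=1)        # predicted class per version
    values, counts = preds.unique(return_counts=True)
    return values[counts.argmax()].item()           # most common class wins
```

A tie-breaking policy (for example, flagging a tie as “damaged” in the HD scenario, as discussed in Section 5) can be layered on top of this basic vote.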
Figure 4 shows a high-level architecture diagram of the HNNs used in this research. This diagram presents the general concepts and illustrates the flow of data through the network. The first operation performed on an original image is the quanvolution operation executed by a quantum circuit; the detailed algorithm is described in Section 3.5, Data Pre-Processing Process. The quanvolutional layer produces multiple quantum-augmented versions of the original image; the number of output images equals the number of qubits in the quantum circuit used for quanvolution. Every quantum-augmented image is then processed by the classical part of the network, which is an unmodified classical model. The last step of the process is the application of the Majority Voting algorithm to derive a single final prediction from the classical model predictions for all quantum-augmented versions of the original image.
5. Conclusions and Future Work
The data pre-processing phase involved the augmentation of input datasets, generating multiple versions of images based on different hyper-parameter combinations. The samples of the pre-processed CIFAR100 and HD datasets showcased recognizable patterns from the original images, underscoring the effectiveness of the augmentation process. Notably, each image produced by the quanvolutional layer exhibited distinct characteristics, highlighting the diversity introduced by quantum operations.
The exploration of simple HNN configurations, including variations with quantum pre-processed inputs and classical NN layers, provided valuable comparisons with conventional CNN models. The analysis of the dependence on training dataset size revealed that the performance of HNNs, particularly the “Quantum + CNN” configuration, improves with larger training datasets and more complex network architectures. This sensitivity to training data parameters and model complexity underscores the importance of meticulous model design and dataset curation in maximizing HNN’s performance.
Moreover, the results on the influence of the convolutional filter count and the number of CNN layers on HNN’s performance indicated that, while increasing model complexity can enhance HNN’s performance up to a certain threshold, excessive complexity may lead to diminishing returns or overfitting. Notably, the addition of a second CNN layer yielded significant performance improvements for conventional CNN models, highlighting the nuanced interplay between model architecture and performance.
Furthermore, investigations into HNN’s performance under coarse labeling schemes reinforced the observed trends, suggesting the potential universality of the identified effects across different classification problem complexities.
Experiments conducted on a simple HNN configuration suggested that a quanvolutional operation creates a more complex version of input data and, thus, requires a more complex classical part of the HNN in order to reach its maximum accuracy. This applies to both the structure of the classical part (the CNN showed significantly higher accuracy than the dense NN) and its complexity (the number of filters in the CNN layer). Furthermore, experiments indicated that the quanvolutional HNN configuration is more sensitive to training dataset size than its classical counterpart.
Overall, the findings underscore the promise of HNNs in image classification tasks, offering insights into their sensitivity to dataset size, model complexity, and architectural configurations. By elucidating these dependencies, this research contributes to a deeper understanding of HNNs and their role in advancing explainable artificial intelligence methodologies. Moving forward, continued exploration and refinement of HNN architectures and training methodologies hold the key to unlocking their full potential in real-world applications.
Experiments conducted in our research on more complex configurations of HNNs indicated the feasibility of quanvolutional HNN applications for solving real-world problems, using image classification problems as an example. The best results on multi-class classification on CIFAR100 were shown by the classical reference EfficientNet model; all HNNs showed 5–10% lower accuracy compared to the reference models. Experiments on the HD dataset demonstrated a different picture: ResNet- and EfficientNet-based HNNs outperformed their classical reference models in accuracy by a small margin (0.3–2.2%). This can be explained by the fact that a quanvolution operation greatly increases the training dataset size, which appeared to be quite useful in situations where the original dataset had a sub-optimal size.
However, HNN applications also have some major limitations. The biggest one is that the supply of quantum computing resources is quite limited at the moment. This drawback is partially addressed by the proposed HNN architecture, where the quantum device serves as the first layer of the NN, because it allows us to perform the quantum transformation of the training data only once. Even so, quantum computing resource supply remains the main bottleneck of HNNs in terms of scalability. Another potential limitation is the need for data transfer between the quantum and classical parts of the neural network. Since the quantum and classical parts are executed on separate hardware, data transfer over a network is required, which can be quite slow if the physical distance between the quantum and classical hardware is significant. However, we believe that the computation speedup provided by QC utilization will outweigh the time cost of data transfer over the network.
The MV technique was used for final result prediction, and it represents a real-world usage scenario because the quanvolutional layer of the HNN increases the dataset size n times, where n is the number of qubits in the quantum circuit. When the model is used on a new data point, we expect to receive a single class prediction for the input data; thus, the classical part of the model acts on every layer produced by the quantum device, and the voting algorithm then selects the most common class reported by the classical part of the HNN as the final decision. This technique proved to be highly effective for HNNs on more complex problems, as was shown on the CIFAR100 dataset. For less complex image classification problems, for example, binary classification on the HD dataset, where the HNN showed a high level of precision (98.81%), MV proved to be less effective. However, it also provides configuration flexibility for situations where the model is unable to reliably classify an image; for example, if MV ends in a tie when determining whether a building was damaged, we can automatically flag the building as damaged in order to reduce the number of false-negative outcomes, which are more unfavorable than false positives in this case.
It should be noted that the HNNs were compiled using quite a simplistic technique, where the outputs of the quanvolutional layer were propagated directly to the classical part of the model, which was based on an unmodified state-of-the-art classical NN model. In our future research, we plan to work on replacing convolutional layers with quanvolutional layers and to research the feasibility of such an approach in order to push HNN research further towards practical usage of HNNs for solving problems in real-world scenarios.
Another thing worth noting is that all experiments were conducted using quantum simulation software in order to be able to conduct experiments without access to a physical quantum computer. As a result, we were unable to assess important metrics of HNNs such as the time metrics of model training and evaluation. Measuring time metrics is especially important because one of the main advantages of QC is significant computation speedup. We plan to address this issue in our future research and evaluate the performance of HNNs on an actual quantum computer.
In our future research, we plan to focus on several important aspects of the proposed HNN architecture. One such aspect is the effect of image interpolation on the quality of the results produced by HNNs and the determination of the optimal stage of the pipeline at which interpolation should be applied. Another important aspect is research on more complex HNN architectures; for example, instead of using an unmodified state-of-the-art classical model as the classical part of the HNN, it is worth researching approaches and algorithms that simplify the classical part by replacing some parts of the classical model with a quantum device without sacrificing network accuracy by a significant margin. Finally, we plan to research the impact of the quantum device architecture on the performance of the proposed HNN architecture. We believe that this vector of future research will enhance our understanding of the effect of different architectures and hyper-parameters on the performance of HNNs and will help move towards more practical applications of HNNs for solving real-world problems.
In conclusion, the augmentation of input datasets, particularly the utilization of quantum pre-processed inputs, showcases the potential for improving data efficiency and diversity, which can lead to more sustainable machine learning solutions by reducing the need for extensive data collection. The exploration of simple HNN configurations highlights the importance of meticulous model design and dataset curation in maximizing performance; this understanding can help in developing leaner and more resource-efficient models, contributing to sustainability efforts in machine learning. The analysis of model complexity and its impact on performance underscores the need for balanced architectures to avoid overfitting and unnecessary computational burden; this optimization can lead to more sustainable model deployment by conserving computational resources. The observed trends across different classification problem complexities suggest the potential universality of the identified effects, providing insights into the scalability and applicability of HNNs across various domains. Looking ahead, several promising directions could continue this research and improve these results. For advanced architecture exploration, further research into replacing convolutional layers with quanvolutional layers can enhance the capabilities of HNNs, pushing the boundaries of quantum–classical hybrid architectures and their practical applicability in real-world scenarios. As for quantum computer assessment, conducting experiments on actual quantum computers will provide crucial insights into the performance metrics of HNNs, such as training and evaluation time, enabling a more comprehensive understanding of their potential advantages over classical approaches. Furthermore, the refinement of training methodologies for HNNs, including leveraging quantum simulation software and exploring novel optimization techniques, can improve model performance and scalability, facilitating their adoption in sustainable machine learning applications. Finally, continuing research along these directions can lead to the development of more efficient, robust, and scalable machine learning solutions, contributing to sustainability goals by reducing computational costs, optimizing resource utilization, and enhancing model interpretability and explainability.