1. Introduction
Quantum computing (QC) has been one of the most active research topics in recent years, and the field is highly interesting in terms of the possibilities that new advancements could bring to various fields of study, such as artificial intelligence (AI), physics, and chemistry. One field that stands to benefit greatly is AI, and Quantum Machine Learning (QML) in particular. In the context of AI, quantum computers can be used to increase the efficiency of machine learning (ML) operations. This takes the form of QML, where a full neural network operates on quantum hardware [1,2], and of hybrid quantum–classical neural networks (HNNs), where part of the ML model is powered by classical hardware and another part is executed on quantum hardware. Since quantum computers are able to perform certain computations significantly faster and, in addition, are more power-efficient than classical ones, their usage for ML promises to be very fruitful in the future.
Modern ML solutions in general, and ML applications for image classification tasks in particular, face two major problems: high power consumption and slow neural network runtime. This can be attributed to the fact that recent advancements in the AI field keep producing more complex state-of-the-art model architectures, which require more processing power. The problem of power efficiency will become more acute in the future, given the current trend of AI applications becoming ever more widespread across miscellaneous fields of modern life. Furthermore, since AI usage in time-critical applications is growing, the computation speedup that QC can provide should help improve the quality of service of various critical systems that rely on AI solutions. One example of such an application is satellite imagery analysis, such as detecting areas affected by natural or human-made disasters. Such applications rely heavily on image segmentation and image classification performed by AI algorithms, which can benefit greatly from faster and more performant ML models. We consider QML and HNNs highly promising because the significantly higher power efficiency and major computation speedup that QC can provide could bring new and more sustainable AI solutions in the future.
In our previous work, we studied the latter approach for solving various multi-class image classification problems [3]. We researched an NN hybridization technique where a quantum device was used as one of the hidden layers of the NN. Our research indicated the feasibility of using HNNs for solving different classical and more practical problems [4], but the experiments on HNNs showed significantly lower accuracy compared to their classical counterparts. Nevertheless, trading lower accuracy for faster processing can be acceptable in a number of situations.
In this paper, we research an alternative to the NN hybridization technique of our previous work: the resulting HNN is built using a quantum device as the first, quanvolutional layer of the NN. This hybridization method makes it possible to utilize various complex and well-established ML techniques, such as transfer learning [5], and state-of-the-art NN architectures for building HNNs. Having a quanvolutional layer as the first layer of the HNN allows us to optimize the model training process by dividing the quantum and classical parts of the HNN into two separate steps. This is highly beneficial in terms of quantum computer resource utilization because it allows us to perform the quantum computations over the whole training dataset only once and then reuse the quantum pre-processed data for training the classical part of the network. In the current state of quantum computers, this makes the model training process much cheaper because quantum hardware usage is optimized.
The remainder of this paper has the following structure: Section 2, Background and Related Work, provides a short summary of recent research in the domain of hybrid quantum–classical NNs; Section 3, Materials and Methods, describes the tooling, datasets, model architectures, and approaches used for processing the datasets with various quantum-device-based transformations, as well as the structure of the experiments for the chosen architectures; Section 4, Results and Discussion, summarizes the results obtained during the experiments; and Section 5, Conclusions and Future Work, reviews the obtained results together with an analysis of the methods used and proposes future directions of research.
2. Background and Related Work
QML in general, and HNNs in particular, are promising and relevant fields of study. Since QC has not yet become a widely adopted technology, applications of QC in the machine learning field remain largely theoretical and within the research domain rather than practical. However, a number of recent studies indicate the feasibility of HNN applications for solving both theoretical and practical problems in various domains.
Medical applications of AI are among the hottest and most important research topics because they strive to improve patients’ well-being by helping doctors analyze examination results. A large number of recent studies have been conducted in this area, such as the research by Mesut Toğaçar et al. [6] on retinal disease detection based on OCT images. It is also one of the domains where HNNs proved to be a viable alternative to classical models, specifically for illness detection based on patient examination results. One recent successful example in this area is the research by Ahalya, R.K. et al. [7] on rheumatoid arthritis detection based on thermal images of patients’ hands. Other studies in the medical domain, by Ullah et al. [8] on the ischemic cardiopathy classification problem and by Yumin Dong et al. [9] on brain tumor classification based on MRI imagery, indicated that HNNs can even reach higher accuracy than their classical counterparts, which proved the feasibility of applying HNNs to medicine-related image classification problems.
Other research fields where quantum–classical HNNs shine are chemistry and biology. Research performed by Gregory W. Kyro et al. [10] indicated that HNNs can reach state-of-the-art accuracy on protein–ligand binding affinity prediction problems.
QML for image processing [11] has seen significant advancements in recent years, with researchers exploring the potential of QC to enhance traditional deep learning (DL) techniques, including quantum feature extraction [12], quantum image processing [11], quantum-inspired convolutional neural networks [13], and hybrid quantum–classical architectures [14,15].
Quantum feature extraction (QFE) refers to the process of extracting relevant features or characteristics from quantum datasets using QC techniques. QFE involves leveraging the principles of quantum mechanics to analyze and identify patterns or properties in quantum data, which can then be used for various ML tasks such as classification, clustering, or dimensionality reduction. QFE aims to exploit the unique capabilities of QC to enhance the efficiency and effectiveness of feature extraction algorithms, particularly in scenarios where classical methods may be limited in handling large or complex datasets [12].
Quantum image processing (QIP) focuses on harnessing QC technologies to encode, manipulate, and reconstruct quantum images across various formats and applications [11]. This inspires the development of encoding schemes that leverage the inherent quantum mechanical properties of potential QC hardware.
Quantum-inspired convolutional neural networks (QICNNs) leverage complex-valued representations and operations, where the input real space is initially transformed into the complex space. In this complex space, parameters are manipulated using operations inspired by quantum computing principles. Recent research on QICNNs highlights the advantages of employing complex parameters in DL, citing benefits across computational, biological, and signal processing domains [13]. Complex numbers offer enhanced representational capacity, facilitating potentially easier optimization, improved generalization capabilities, and accelerated learning.
Hybrid quantum–classical architectures (HQCAs) [14,15] have become very popular for effectively extracting high-level features from imagery data for classification purposes. For example, in the hybrid quantum–classical CNN (QC-CNN), the quantum part is a parameterized circuit that extracts essential features from images, and the classical part conducts the classification accordingly. In addition, the QC-CNN model exploits the amplitude encoding technique for image classification tasks, which requires relatively fewer qubits than computation-based encoding [14]. The shallow hybrid quantum–classical convolutional neural network (SHQCNN) architecture is based on an augmented variational quantum framework and exhibits efficacy in image classification tasks [15]. Employing a kernel encoding strategy in the input layer, SHQCNN facilitates an enhanced discriminative capability of data representations. Within the hidden layer, a tailored variational quantum circuit architecture is deployed to mitigate network depth and intricacy.
Recently, quantum convolutional, or quanvolutional, NNs have attracted substantial research interest. Many recent studies have examined this topic for solving image classification problems of various levels of complexity, such as the classification of images of handwritten characters [16,17] or flower classification [18]. Some of the research addressed inherently more complex problems, like the classification of ImageNet images with applied noise by Debanjan Konar et al. [19], which demonstrated that HNNs can surpass classical models such as ResNet-18 in accuracy on a complex image classification problem.
From these recent studies, it can be concluded that the topic of quantum–classical HNNs is of high research interest. Recent works indicate that HNNs can be successfully and effectively applied to problems from widely different study domains, and that some problems can be solved with HNNs at higher accuracy than with the reference classical models.
3. Materials and Methods
3.1. Development Tools
In this research, we used the PyTorch library [20] for building and manipulating complex NNs (in Section 4.4) along with the TensorFlow framework [21] for experiments on the simple HNNs (in Section 4.2 and Section 4.3). All our experiments were conducted in a Jupyter Notebook environment. The PennyLane library [22] was used for orchestrating the quantum pre-processing of images. For conducting experiments without access to a physical quantum computer, we used quantum simulator software to simulate QC on a classical computer.
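As a minimal illustration of the simulator setup, the sketch below creates a PennyLane simulator device; the 4-qubit count corresponds to a 2 × 2 quanvolutional kernel and is an example, not our exact experimental configuration:

```python
import pennylane as qml

# PennyLane's built-in state-vector simulator stands in for real quantum
# hardware; a 2 x 2 quanvolutional kernel needs one qubit per pixel (4 wires).
dev = qml.device("default.qubit", wires=4)
```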
All source code for the experiments is available on GitHub [23].
3.2. Datasets
Taking into account the diversity of datasets and the applicability of HNNs in real-world scenarios, a standard dataset (CIFAR100 [24]) and a specific dataset (“Satellite Images of Hurricane Damage” (HD) [25]) were selected for researching the feasibility of HNN applications for solving image classification problems.
CIFAR100 is one of the more classical and widely used datasets. It contains 60,000 32 × 32 RGB images of 100 different classes grouped into 20 categories, or superclasses. Images are equally distributed across all classes, so each class has 500 training images and 100 testing images. Each superclass contains an equal number of classes, giving 2500 training images and 500 testing images per superclass.
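For reference, a minimal sketch of loading the dataset with torchvision (the root path is an example):

```python
import torchvision

# Downloads CIFAR100 to ./data on first use: 50,000 training and 10,000 test images.
train_set = torchvision.datasets.CIFAR100(root="./data", train=True, download=True)
test_set = torchvision.datasets.CIFAR100(root="./data", train=False, download=True)
```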
Figure 1 shows a sample of the CIFAR100 dataset.
As an example of a real-world scenario of HNN application, we chose the HD dataset because we believe that one of the prominent fields where QC can be used is satellite imagery analysis. This is especially relevant in the context of how problems such as damage assessment after natural or human-made catastrophes are currently handled. One of the methods used for such tasks is the windshield survey, which relies on volunteers and emergency response crews driving around the affected area and visually inspecting it. This is both a labor-intensive and dangerous activity, and it can be impossible in some cases (for example, in various war-related scenarios). Instead of the aforementioned method, or in addition to it, damage assessment problems can be solved using manual or automatic analysis of satellite imagery of the affected area. Our assumption that QC can be utilized for this problem is based on the fact that satellites produce enormous amounts of data, and the task of analyzing such massive datasets can benefit greatly from the computation speedup that QC can provide.
The HD dataset contains 23,000 256 × 256 RGB pictures of damaged and undamaged buildings. It is composed of images taken from a satellite over the Greater Houston area after Hurricane Harvey affected the area in 2017. In this research, we tried to eliminate the potential bias of the HD dataset; thus, we used a subset that contains an equal number of images of every class: 5000 images of damaged and 5000 of undamaged buildings.
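A minimal sketch of drawing such a balanced subset (the helper name and path lists are hypothetical, for illustration only):

```python
import random

def balanced_subset(damaged_paths, undamaged_paths, per_class=5000, seed=0):
    # Sample an equal number of images per class to avoid class imbalance.
    rng = random.Random(seed)
    return (rng.sample(damaged_paths, per_class),
            rng.sample(undamaged_paths, per_class))
```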
Figure 2 shows a sample of the dataset with examples of both damaged and undamaged buildings.
Both datasets used in this research are therefore unbiased, with images equally distributed among classes. Since this research focuses on the image classification problem in general, we eliminated the potential bias in the HD dataset because it could affect the measured performance of HNNs.
3.3. Model Hybridization Technique
In our previous research, we evaluated a hybridization technique where the quantum device is embedded in the neural network in place of one of the layers of artificial neurons in the middle of the network. Such an approach has several major drawbacks; the main limitation is that quantum hardware is needed for the full training process, which is quite expensive on a physical quantum computer in the current environment of limited QC supply, or very time-consuming if a quantum simulator is used for model training.
In this paper, we describe a hybridization technique based on the following intuition: if we can use a quantum device as the first layer of an HNN, then, for training, we can process every data point in the training dataset with the quantum device only once, resulting in more efficient utilization of quantum computer resources during the training process. This means that we can highly optimize the model training process. We can also apply the transfer learning technique to the HNN and embed the quantum device as the first layer of state-of-the-art models, thus boosting the efficiency of the HNN and reusing proven, highly efficient architectures of classical neural networks for building HNNs. Another major benefit of the described hybridization technique is that the same outputs produced by the quantum layer can be reused across many different HNN architectures: these outputs are essentially equivalent to a pre-processing and augmentation of the original dataset and can be treated as a new dataset.
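A minimal sketch of this “pre-process once, reuse” idea (the helper names and cache path are hypothetical; quanv_filter stands for the quantum transformation described in Section 3.5):

```python
import numpy as np

def preprocess_dataset(images, quanv_filter, cache_path="quanv_cache.npy"):
    # Reuse the results of an earlier quantum run if they were already cached.
    try:
        return np.load(cache_path)
    except FileNotFoundError:
        pass
    # Otherwise, run the quantum transformation exactly once per data point
    # and persist the outputs so any classical architecture can reuse them.
    processed = np.stack([quanv_filter(img) for img in images])
    np.save(cache_path, processed)
    return processed
```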
3.4. Architecture Selection
To compare the effect of hybridization on well-known neural network architectures, ResNet, EfficientNet, and VGG-16 were selected. We chose these architectures based on the idea of selecting two state-of-the-art models for every dataset used in this research and comparing their performance with HNNs derived from the same state-of-the-art models.
CIFAR100 is a classical dataset that is widely used in research, and a large number of models have been evaluated on it. The current state-of-the-art model for the image classification problem on the CIFAR100 dataset is EfficientNet, which showed a classification accuracy of 91.7% in its founding paper [26]. EfficientNet is based on a compound coefficient technique for scaling up models and on AutoML. Another model that is widely used in industry and showed decent classification accuracy on the CIFAR100 dataset is ResNet, with an accuracy of 77.3% [27]. ResNet is a deep CNN that stacks residual blocks on top of each other in order to boost accuracy compared to the regular CNN architecture.
The HD dataset is much less used in research than CIFAR100, and a much smaller number of models have been evaluated on it. The best accuracy on the HD dataset was shown by EfficientNet, with an impressive classification accuracy of 99% [28]. Another highly efficient model on the HD dataset is VGG-16 [29], which showed a classification accuracy of 97% [30]. VGG-16 is a deep CNN that has 16 layers of trainable parameters and a pyramid-like structure.
In conclusion, in this research, we used EfficientNet, ResNet, and VGG-16 in our experiments to obtain metrics of the efficiency of the hybridization technique. We also leveraged the transfer learning technique and, for all experiments, used the aforementioned models with weights pre-trained on the ImageNet dataset [31].
3.5. Data Pre-Processing Process
Both the CIFAR100 and HD datasets were pre-processed using several different quantum devices that performed different operations on the input images. This pre-processing is effectively the first quantum convolutional, or quanvolutional, layer of an HNN. The process has three hyper-parameters: the size of the convolutional kernel, the stride of the convolutional kernel, and the type of operations run in the quantum device. In this research, we used 2 × 2 and 3 × 3 convolutional kernels, stride values of 1 and 2, and X, XY, and XYZ qubit rotations as operations in a random quantum circuit.
Figure 3 shows a diagram of an example random quantum circuit with 9 qubits.
The image processing algorithm we used was described by Henderson et al. [32]. For every convolution window, we used the value of every pixel in the window as an input to a quantum circuit initialized with random weights and containing a corresponding number of qubits: for a 2 × 2 kernel, we used a circuit containing 4 qubits, and for a 3 × 3 kernel, a circuit with 9 qubits. We then applied rotations to the qubits and measured the outputs of the quantum device. The number of outputs of a quantum circuit equals the number of qubits in it, which corresponds to the number of pixels in the convolutional kernel. Each output of the quantum device is treated as a separate layer of the output convolution, so for a 2 × 2 kernel, we received 4 convolutional layers as output. Each output layer also has 3 channels, corresponding to the RGB channels of the image.
We chose a relatively small convolutional kernel size for multiple reasons. The first is that the resolution of the original CIFAR100 images is only 32 × 32 pixels. The second is that a 5 × 5 kernel would require a quantum circuit with 25 qubits, which is impractical in today’s environment. This research focuses on the feasibility of applying HNNs with a quanvolutional layer and omits a study of the influence of hyper-parameters on HNN applications.
3.6. Hybrid Neural Networks
In this research, we took a naive approach to building HNNs from classical NNs: we used the output of the quantum device as the input to the classical part of the HNN, where the classical part is essentially an unmodified model.
This approach poses a problem: the number of dimensions of the outputs from the quantum device is n times higher than the number of inputs the classical part of the model takes, where n equals the number of qubits in the quantum device. We resolved this issue by treating every output of the quanvolutional layer as a separate data point in the dataset during training and validation of the model. Essentially, the quanvolutional operation on the input data increases the size of the dataset for the classical part 4 times when a 2 × 2 quanvolutional kernel is used and 9 times when a 3 × 3 quanvolutional kernel is used, as the sketch below illustrates.
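A sketch of the reshaping this implies before training the classical part (tensor names and shapes are illustrative):

```python
import torch

def expand_quanv_outputs(x, labels):
    # x: (batch, n_qubits, C, H, W) - quanvolution outputs for a batch
    # labels: (batch,) - original labels
    batch, n_qubits, C, H, W = x.shape
    # Treat every qubit output as its own training example...
    x_flat = x.reshape(batch * n_qubits, C, H, W)
    # ...and replicate each label once per qubit output, in matching order.
    labels_flat = labels.repeat_interleave(n_qubits)
    return x_flat, labels_flat
```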
Another problem is the size mismatch between the output of the quanvolutional layer and the input of the classical model. This issue appears even if the size of the original image exactly matches the expected input size of the classical model used in the HNN, because the quanvolutional operation, exactly like a classical convolutional operation, reduces the size of the input data. The dimensions of the output of the quanvolutional layer can be computed using the formula

W′ = (W − K + 2P)/S + 1,

where W is the original size of the image, W′ is the size of the output image, K is the size of the quanvolution kernel, P is the size of the padding, and S is the stride. For example, a 32 × 32 CIFAR100 image processed with a 2 × 2 kernel, stride 1, and no padding yields a 31 × 31 output. In our experiments, the output of the quanvolutional layer was always much smaller than the expected input size of the classical part of the HNN. To overcome this problem, images produced by the quanvolutional layer needed to be upscaled to match the expected size. For scaling images, we used a bi-linear interpolation algorithm, which we chose over alternatives because it provides a reasonable trade-off between the quality of the results and processing speed, which is satisfactory for our case. It is worth noting that applying interpolation after the quanvolutional operation might be sub-optimal because it can affect the quality of the image produced by the quantum layer of the HNN. The effect of interpolation at different stages of the HNN on the performance of the network is a subject for future research.
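A sketch of the upscaling step using PyTorch’s built-in bi-linear interpolation (the 224 × 224 target is a typical input size for ImageNet-pre-trained models and is an assumption here, not necessarily the size used in every experiment):

```python
import torch.nn.functional as F

# x: (batch, channels, W', W') - output of the quanvolutional layer
x_up = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
```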
In our experiments, we studied HNN metrics such as classification accuracy on the validation data subset and model loss during training. We compared these metrics between HNNs and the corresponding classical reference models and drew our conclusions based on them.
For computing the test accuracy of HNNs, we used the Majority Voting (MV) technique. The idea behind this is simple: when used on a real problem, an HNN receives a single input image and should produce a single output prediction for it. The fact that the quanvolutional layer increases the number of channels of the input can be utilized to predict the output in a consensus-like manner by running MV on the results of the classical part of the model. The final prediction is thus the most common class reported by the classical part of the HNN across the outputs of the quanvolutional layer.
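A minimal sketch of MV over the classical part’s per-channel predictions (names and shapes are illustrative):

```python
import torch

def majority_vote(logits_per_channel):
    # logits_per_channel: (n_qubits, n_classes) - classical-part outputs for
    # each quantum-augmented version of a single input image.
    preds = logits_per_channel.argmax(dim=1)        # predicted class per version
    values, counts = preds.unique(return_counts=True)
    return values[counts.argmax()].item()           # most common class wins
```

A tie-breaking policy (for example, flagging a tie as “damaged” in the HD scenario, as discussed in Section 5) can be layered on top of this basic vote.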
Figure 4 shows a high-level architecture diagram of the HNNs used in this research. This diagram presents the general concepts and illustrates the flow of data through the network. The first operation performed on an original image is the quanvolution operation executed by a quantum circuit; the detailed algorithm is described in Section 3.5, Data Pre-Processing Process. The quanvolutional layer produces multiple quantum-augmented versions of the original image; the number of output images equals the number of qubits in the quantum circuit used for quanvolution. Every quantum-augmented image is then processed by the classical part of the network, which is an unmodified classical model. The last step of the process is the application of the Majority Voting algorithm to derive a single final prediction from the classical model predictions for all quantum-augmented versions of the original image.
5. Conclusions and Future Work
The data pre-processing phase involved the augmentation of input datasets, generating multiple versions of images based on different hyper-parameter combinations. The samples of the pre-processed CIFAR100 and HD datasets showcased recognizable patterns from the original images, underscoring the effectiveness of the augmentation process. Notably, each image produced by the quanvolutional layer exhibited distinct characteristics, highlighting the diversity introduced by quantum operations.
The exploration of simple HNN configurations, including variations with quantum pre-processed inputs and classical NN layers, provided valuable comparisons with conventional CNN models. The analysis of the dependence on training dataset size revealed that the performance of HNNs, particularly the “Quantum + CNN” configuration, improves with larger training datasets and more complex network architectures. This sensitivity to training data parameters and model complexity underscores the importance of meticulous model design and dataset curation in maximizing HNN’s performance.
Moreover, the results on the influence of the convolutional filter count and the number of CNN layers on HNN’s performance indicated that, while increasing model complexity can enhance HNN’s performance up to a certain threshold, excessive complexity may lead to diminishing returns or overfitting. Notably, the addition of a second CNN layer yielded significant performance improvements for conventional CNN models, highlighting the nuanced interplay between model architecture and performance.
Furthermore, investigations into HNN’s performance under coarse labeling schemes reinforced the observed trends, suggesting the potential universality of the identified effects across different classification problem complexities.
Experiments conducted on a simple HNN configuration suggested that a quanvolutional operation creates a more complex version of input data and, thus, requires a more complex classical part of the HNN in order to reach its maximum accuracy. This applies to both the structure of the classical part (the CNN showed significantly higher accuracy than the dense NN) and its complexity (the number of filters in the CNN layer). Furthermore, experiments indicated that the quanvolutional HNN configuration is more sensitive to training dataset size than its classical counterpart.
Overall, the findings underscore the promise of HNNs in image classification tasks, offering insights into their sensitivity to dataset size, model complexity, and architectural configurations. By elucidating these dependencies, this research contributes to a deeper understanding of HNNs and their role in advancing explainable artificial intelligence methodologies. Moving forward, continued exploration and refinement of HNN architectures and training methodologies hold the key to unlocking their full potential in real-world applications.
Experiments conducted in our research on more complex configurations of HNNs indicated the feasibility of quanvolutional HNN applications for solving real-world problems, using image classification problems as an example. The best results on multi-class classification on CIFAR100 were shown by the classical reference EfficientNet model; all HNNs showed 5–10% lower accuracy compared to the reference models. Experiments on the HD dataset demonstrated a different picture: ResNet- and EfficientNet-based HNNs outperformed their classical reference models in accuracy by a small margin (0.3–2.2%). This can be explained by the fact that a quanvolution operation greatly increases the training dataset size, which appeared to be quite useful in situations where the original dataset had a sub-optimal size.
However, HNN applications also have some major limitations. The biggest one is that the supply of quantum computing resources is quite limited at the moment. This drawback is partially addressed by the proposed HNN architecture, where the quantum device serves as the first layer of the NN, because it allows us to perform the quantum transformation of the training data only once. Even so, quantum computing resource supply remains the main bottleneck of HNNs in terms of scalability. Another potential limitation is the need for data transfer between the quantum and classical parts of the neural network. Since the quantum and classical parts are executed on separate hardware, data transfer over a network is required, which can be quite slow if the physical distance between the quantum and classical hardware is significant. However, we believe that the computation speedup provided by QC utilization will outweigh the time cost of data transfer over the network.
The MV technique was used for final result prediction, and it represents a real-world usage scenario because the quanvolutional layer of the HNN increases the dataset size n times, where n is the number of qubits in the quantum circuit. When the model is used on a new data point, we expect to receive a single class prediction for the input data; thus, the classical part of the model acts on every layer produced by the quantum device, and the voting algorithm then selects the most common class reported by the classical part of the HNN as the final decision. This technique proved to be highly effective for HNNs on more complex problems, as was shown on the CIFAR100 dataset. For less complex image classification problems, for example, binary classification on the HD dataset, where the HNN showed a high level of precision (98.81%), MV proved to be less effective. However, it also provides configuration flexibility for situations where the model is unable to reliably classify an image; for example, if MV ends in a tie when determining whether a building was damaged, we can automatically flag the building as damaged in order to reduce the number of false-negative outcomes, which are more unfavorable than false positives in this case.
It should be noted that the HNNs were compiled using quite a simplistic technique, where the outputs of the quanvolutional layer were propagated directly to the classical part of the model, which was based on an unmodified state-of-the-art classical NN model. In our future research, we plan to work on replacing convolutional layers with quanvolutional layers and to research the feasibility of such an approach in order to push HNN research further towards practical usage of HNNs for solving problems in real-world scenarios.
Another thing worth noting is that all experiments were conducted using quantum simulation software in order to be able to conduct experiments without access to a physical quantum computer. As a result, we were unable to assess important metrics of HNNs such as the time metrics of model training and evaluation. Measuring time metrics is especially important because one of the main advantages of QC is significant computation speedup. We plan to address this issue in our future research and evaluate the performance of HNNs on an actual quantum computer.
In our future research, we plan to focus on several important aspects of the proposed HNN architecture. One such aspect is the effect of image interpolation on the quality of the results produced by HNNs and the determination of the optimal stage of the pipeline at which interpolation should be applied. Another important aspect is research on more complex HNN architectures; for example, instead of using an unmodified state-of-the-art classical model as the classical part of the HNN, it is worth researching approaches and algorithms that simplify the classical part by replacing some parts of the classical model with a quantum device without sacrificing network accuracy by a significant margin. Finally, we plan to research the impact of the quantum device architecture on the performance of the proposed HNN architecture. We believe that this vector of future research will enhance our understanding of the effect of different architectures and hyper-parameters on the performance of HNNs and will help move towards more practical applications of HNNs for solving real-world problems.
In conclusion, the augmentation of input datasets, particularly the utilization of quantum pre-processed inputs, showcases the potential for improving data efficiency and diversity, which can lead to more sustainable machine learning solutions by reducing the need for extensive data collection. The exploration of simple HNN configurations highlights the importance of meticulous model design and dataset curation in maximizing performance; this understanding can help in developing leaner and more resource-efficient models, contributing to sustainability efforts in machine learning. The analysis of model complexity and its impact on performance underscores the need for balanced architectures to avoid overfitting and unnecessary computational burden; this optimization can lead to more sustainable model deployment by conserving computational resources. The observed trends across different classification problem complexities suggest the potential universality of the identified effects, providing insights into the scalability and applicability of HNNs across various domains. Looking ahead, several promising directions could continue this research and improve these results. For advanced architecture exploration, further research into replacing convolutional layers with quanvolutional layers can enhance the capabilities of HNNs, pushing the boundaries of quantum–classical hybrid architectures and their practical applicability in real-world scenarios. As for quantum computer assessment, conducting experiments on actual quantum computers will provide crucial insights into the performance metrics of HNNs, such as training and evaluation time, enabling a more comprehensive understanding of their potential advantages over classical approaches. Furthermore, the refinement of training methodologies for HNNs, including leveraging quantum simulation software and exploring novel optimization techniques, can improve model performance and scalability, facilitating their adoption in sustainable machine learning applications. Finally, continuing research along these directions can lead to the development of more efficient, robust, and scalable machine learning solutions, contributing to sustainability goals by reducing computational costs, optimizing resource utilization, and enhancing model interpretability and explainability.