1. Introduction
The term “quantum” in physics refers to the smallest discrete unit of a physical quantity. Quantum particles exhibit wave-particle duality and are described by quantum theory, which, among other things, gives the probability of finding a quantum particle at a given point in space [
1]. In recent decades, Quantum Machine Learning (QML) has developed within computer science as an extension of Machine Learning (ML), in which data are processed and analyzed using various decision-making models. With data volume increasing by around 20% per year, managing it properly has become necessary [
2].
The idea of optimizing a limited multivariable function without programming was proposed by A. Samuel in 1959 [
3]. Since then, this concept has been the basis of machine learning, where algorithms are implemented to match the input and output points in order to create decision functions. In QML, the most common supervised algorithm is the Quantum Support Vector Machine (QSVM) [
4], which finds an optimal separating boundary in a higher-dimensional vector space to classify labeled data. QSVM creates patterns of unlabeled data and reduces them for easier analysis [
5].
Machine learning techniques find patterns in data, while quantum systems produce atypical patterns that classical systems are thought unable to produce efficiently; QML investigates how to create and implement quantum software for faster machine learning [
2]. In the 20th century, computers were built to analyze mathematical models using several techniques and methods. Artificial neural networks were implemented in the 1950s [
6], and from 1960 to 1990, deep learning was proposed (Hopfield networks, Boltzmann machines) using the backpropagation method [
7]. Recently, significant concerns have been raised about the security and robustness of machine learning models in critical applications [
8,
9,
10].
Quantum computers are expected to excel at certain hard algebraic problems, such as the factorization of large integers and the computation of discrete logarithms [
11]. These capabilities stem from the unique axioms and characteristics of quantum mechanics, such as quantum bits (qubits), interference, superposition, and entanglement. Quantum computers process information using qubits, enabling breakthroughs in various scientific domains [
12,
13], including cryptography, big data analysis, machine learning, optimization, the Internet of Things (IoT), and Blockchain.
The simultaneous storage of several qubit states forms the basis of quantum parallelism. Combined with the entanglement and interference of quantum states, this parallelism accelerates quantum computation. Quantum computation has evolved rapidly since Feynman first proposed it as an effective means of simulating complex quantum systems [
14,
15]. The first QML application was presented in 2014, where the quantum version of the SVM algorithm, QSVM, classified large amounts of data [
4]. Another learning algorithm, based on superposition and quantum operators, is the Superposition-based Architecture Learning algorithm (SAL) [
16]. Search algorithms such as QKNN (Quantum K-Nearest Neighbors) [
17] and decision tree classification [
18] have also been proposed.
Beyond purely classical machine learning (CML) and quantum machine learning (QML), researchers have also explored hybrid approaches, which combine quantum and classical algorithms to improve performance. One notable study presented a classifier that encodes data in N dimensions with the help of the training algorithm [
19]. In another study, the authors introduced the Hybrid Quantum Feature Selection Algorithm (HQFSA), which is based on subroutines for the selection of features [
20]. Another approach is adiabatic QML, which is suitable for certain classes of optimization problems [
21,
22,
23].
Quantum technologies also include quantum cryptography, with the most famous being Quantum Key Distribution (QKD) (BB84 protocol) [
24,
25]. With the appearance of quantum computers, researchers started to focus on post-quantum cryptography (PQC), the design and development of cryptographic algorithms that remain secure against attacks by quantum computers [
26,
27]. With Shor’s algorithm [
28] and a sufficiently powerful quantum computer, widely used public-key algorithms such as RSA and ECC could be broken, since their security rests on the hardness of integer factorization and of the elliptic-curve discrete logarithm problem. Likewise, the algorithms implemented in quantum machine learning are not entirely quantum but are classical methods adapted with modifications.
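To illustrate why efficient period finding threatens factoring-based schemes such as RSA, the following minimal sketch runs the classical reduction used in Shor’s algorithm on a toy modulus. The order-finding step is done here by brute force, which is the part a quantum computer would accelerate; the function names and the choice N = 15, a = 7 are illustrative assumptions and are not taken from the cited works.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1 (brute force; the step Shor's algorithm speeds up)."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_from_order(a: int, n: int):
    """Classical post-processing: turn the order of a modulo n into two factors of n.

    Returns None when this choice of a does not work (odd order or trivial square root).
    """
    r = multiplicative_order(a, n)
    if r % 2 == 1:
        return None
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None
    p, q = gcd(half - 1, n), gcd(half + 1, n)
    return tuple(sorted((p, q)))


factors = factor_from_order(7, 15)
print(factors)  # (3, 5)
```

Here 7 has order 4 modulo 15, so 7^2 mod 15 = 4 yields gcd(3, 15) = 3 and gcd(5, 15) = 5. For cryptographic moduli the brute-force loop is infeasible, which is why the security of RSA depends on no efficient order-finding method being available.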
The purpose of this paper is two-fold. Firstly, it offers an overview of Quantum Machine Learning and its connection to classical Machine Learning. Secondly, it presents a series of experiments conducted on three datasets, including one tested for the first time, to compare the classical and quantum versions of the SVM ML model. The experimental part is conducted to highlight QML and CML connections from a more technical perspective and verify their differences and limitations through experimental results.
The remainder of this paper is structured as follows: The introduction section establishes the research scope, performs a literature analysis, and outlines the main contributions of this study.
Section 2 describes the basic concepts of Quantum Machine Learning.
Section 3 presents quantum algorithms and their applications, focusing on the Support Vector Machine and Quantum Support Vector Machine.
Section 4 introduces quantum learning methods, and
Section 5 provides a critical analysis of their advantages and limitations.
Section 6 presents the conducted experiments, describes the applied methods, and evaluates the results. Finally,
Section 7 concludes the study and outlines future directions.
1.1. Motivation and Contribution
Quantum machine learning has become an increasingly popular research topic in the computer science field. While the roots of this field date back to earlier days, recent research has focused on various techniques and methods of quantum machine learning, including supervised and unsupervised algorithms. These studies aim to combine and compare classical and quantum machine learning algorithms through experiments on different datasets with complex features.
One notable theoretical study by Ciliberto et al. [
29] examines the computational costs and the need for data transformation in quantum machine learning. Other research has investigated quantum algorithms, such as Quantum SVM, which offers quantum speed-up in machine learning applications [
30]. Schuld et al. [
31] explain quantum machine learning algorithms for handling big data, while Havenstein et al. [
32] compare the performance of machine learning algorithms executed on traditional and quantum computers. The latter concludes that quantum multi-class SVM classifiers offer advantages for future quantum computers with large numbers of qubits.
In data classification, Support Vector Machines (SVM) and Quantum SVM [
2] are the most common methods used in classical and quantum machine learning, respectively. Many classical and quantum SVM algorithms have been developed to solve classification problems, as shown in various benchmark studies. For instance, [
33] implements a quantum support vector machine (QSVM) algorithm with the MNIST dataset of handwritten digits and compares classical and quantum SVM algorithms in terms of execution time and accuracy. Saini et al. [
34] implement a QSVM-based classification model on a breast cancer dataset and compare QSVM accuracy to SVM. Havlicek et al. [
35] experiment with quantum algorithms on noisy quantum computers, while Tang [
36] designs a QML-based recommendation algorithm that can achieve exponential improvement.
Other researchers propose new algorithms and methods to improve the accuracy and efficiency of classical and quantum machine learning. For example, Shan et al. [
37] designed a QSVM algorithm and proposed a quantum kernel estimation method with measurement error mitigation using the breast cancer dataset. Kumar et al. [
38] implement classification models using QSVM and classical SVM on quantum and classical computational backends on three constructed datasets. In [
39], the authors examine the execution speed and accuracy of quantum support vector machines compared to classical SVM by proper quantum feature mapping selection using IBMQ quantum computer. Batra et al. [
40] experiment with a large dataset, such as the drug dataset, which is hard for classical computations, to compare quantum and classical computations using the SVM algorithm. Finally, Suzuki et al. [
41] propose a method to analyze the feature map for the kernel-based quantum classifier with two qubits using the SVM algorithm, while Bay et al. [
42] propose a quadratic kernel-free least squares support vector machine (QLSSVM) to solve binary classification problems using the heart disease dataset.
The aforementioned works have produced interesting results. Some of these studies focus on the comparison between classical and quantum machine learning algorithms only theoretically, while others include experiments implementing classical and quantum classifiers such as SVM and QSVM to solve binary classification problems. However, all researchers evaluate and focus on the computational cost and accuracy, and they conclude that machine learning can harness the advantages offered by a quantum algorithm.
Motivated by the growing interest in quantum machine learning and its potential to revolutionize the field of machine learning, this paper aims to provide an overview of the current state of QML from a historical, theoretical, and technical perspective. Specifically, we focus on the implementation and comparison of classical machine learning algorithms, particularly Support Vector Machines, with their quantum counterparts, known as Quantum Support Vector Machines (QSVM). By comparing classical and quantum machine learning models, we aim to evaluate the current state of the QML field while highlighting the corresponding challenges and limitations.
To achieve this goal, we address several critical research questions, including:
- RQ1:
To what extent is classical computing combined with quantum computing?
- RQ2:
Does quantum machine learning provide speed increases compared to classical machine learning?
- RQ3:
Does combining classical and quantum approaches lead to increased accuracy, or are there cases where classical machine learning performs better?
- RQ4:
What are the advantages and limitations of quantum machine learning?
By providing answers to these questions, we aim to provide a better understanding of the current state of the QML field and the potential benefits and limitations of using quantum approaches in machine learning.
1.2. Literature Analysis
This paper presents a systematic literature review that explores the current state of Quantum Machine Learning (QML) from historical, theoretical, and technical perspectives. To ensure a thorough analysis, specific criteria were applied based on relevant terms found in the titles, abstracts, and keywords of publications. The literature search was conducted in March 2023 and began with an initial search on Scopus using the following query: TITLE-ABS-KEY (quantum AND machine AND learning) OR TITLE-ABS-KEY (quantum AND svm) OR TITLE-ABS-KEY (quantum AND classifiers) (6171 papers). We limited the results to journal articles, conference papers, and book chapters written in English (5079 papers).
Scopus indexes the most important digital libraries (Elsevier, Springer, IEEE), provides refined search, and facilitates the export of results; we focused on the years 2000–2023.
We conducted more detailed searches on Scopus specific to our paper, such as QSVM and SVM classifiers, evaluated them, and gathered any relevant works referenced in our paper. This process yielded around 84 papers that are included in this systematic review.
Quantum computing is a rapidly advancing field of research. Scientists are exploring ways to solve complex problems quickly, efficiently, and accurately using quantum systems. Many different quantum algorithms have been developed to address a wide range of computational challenges. Some key milestones in the history of quantum computing include Feynman’s observation in the early 1980s that classical computers cannot simulate quantum phenomena [
43], Deutsch’s demonstration in 1985 of the universality of quantum computing [
44], Shor’s 1994 development of an algorithm for prime factorization that is exponentially faster than classical algorithms [
28], and Grover’s 1996 algorithm for searching an unstructured database [
45]. One of the major challenges of quantum computing is achieving faster computation speeds, and significant research efforts have been dedicated to this goal. In 2019, Google announced Quantum Supremacy, a significant milestone that demonstrated the superior computational power of quantum systems compared to classical computers [
46].
The number of publications on quantum computing has increased significantly since 2011, as illustrated in
Figure 1, which depicts the articles published in the Scopus database until early 2022. The first decade, from 2000 to 2010, exhibited a comparatively low number of publications. From 2011 onwards, the interest of researchers noticeably increased, reaching its peak in 2022, which recorded the highest number of publications to date, although there was a slight decrease in publications in 2016.
The ten countries with the most published articles on QML are shown below, with the United States and China together accounting for almost 50% (
Figure 2), and finally, all countries that published articles from 2000 to 2023 (
Figure 3).
Figure 2 and
Figure 3 demonstrate that the United States and China are at the forefront of QML contributions. This finding can be explained by the presence of prominent quantum hardware providers in these countries.
Quantum machine learning has a long and fascinating history that runs almost parallel to classical machine learning. Its foundations were laid with the emergence of quantum theory around 1900, long before the advent of quantum computers, and since then the field has grown steadily, with several landmark contributions that led to the development of quantum computing.
In 1981, physicist Richard Feynman gave a lecture on the possible benefits of computation with quantum systems and the simulation of the physical properties of matter [
15,
43]. Four years later, physicist David Deutsch published the idea of a universal quantum computer. In 1994, mathematician Peter Shor developed an algorithm for finding the prime factors of large numbers, which demonstrated the immense potential of quantum computers [
11,
28]. In 1996, mathematician Lov Grover introduced an algorithm that optimizes search in an unstructured database [
45]. In 1998 [
46], Jones, Mosca, and Hansen of Oxford University executed Grover’s algorithm on a 2-qubit quantum computer.
In 2001, IBM and Stanford University implemented Shor’s algorithm on a 7-qubit processor and published the results [
47]. In 2012, physicist John Preskill coined the term “quantum supremacy” to describe the point at which controlled quantum systems perform tasks beyond the reach of classical ones [
48]. In 2019, Google announced the achievement of quantum supremacy, a milestone in which a quantum computer can perform a task that is beyond the capability of classical computers [
49].
Today, QML has evolved into a new concept known as quantum brain-inspired machine learning, which aims to develop artificial algorithms that use interactions and dynamics of a quantum system as a direct resource of learning, mimicking the computation of the brain (
Figure 4).
2. Quantum Machine Learning—The Basic Concept
Quantum Machine Learning, classical Machine Learning, and Quantum Computing are interconnected, with the shared goal of developing more accurate and reliable models. Learning algorithms are designed to identify patterns in data for making predictions and decisions. The power of quantum computing lies in its qubits, whose states cannot be copied; consequently, quantum circuits have no fan-out or feedback loops [
50]. Quantum computation is represented by quantum registers, gates, and circuits that consist of qubits and denote the chronological order and mode of action of the gates and registers [
51]. A quantum register comprises a set of unordered qubits that simultaneously store all their states, with the elements numbered clockwise [
52]. The SWAP gate is central to designing networks for the quantum computation of qubits, which carries out multiple qubit gates to create a base. The multi-qubit and CNOT gates are typical features of quantum computation [
53], with the SWAP gate being crucial in the network design of Shor’s algorithm [
54,
55]. Recently, it has been suggested that generalizing quantum computation to higher-dimensional systems may offer advantages [
56]. At the heart of Machine Learning is the extraction of information from data distributions without being explicitly programmed. Therefore, harnessing quantum phenomena is necessary [
2], which is accomplished by developing quantum algorithms that implement classical algorithms using a quantum computer. In this way, data can be classified and analyzed by supervised and unsupervised learning methods using Quantum Neural Networks (QNNs) [
57,
58]. A Variational Quantum Circuit (VQC) is a quantum gate circuit with free parameters that can be tuned to approximate, optimize, and classify in various numerical tasks. The VQC-based algorithm is known as the Variational Quantum Algorithm (VQA), a hybrid quantum-classical algorithm in which parameter optimization typically occurs on classical computers [
59]. The VQA approximates the target function using learnable parameters with quantum characteristics, such as reversible linear gate operations and multi-layer structures that use entangling layers. VQC has been used to replace existing Convolutional Neural Networks (CNNs) [
60,
61], with QNNs being defined as a subset of VQA, and a general expression of the QNN quantum circuit is presented in
Figure 5 [
62,
63].
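The hybrid loop described above can be sketched with a one-qubit toy circuit simulated in NumPy. This is a minimal illustration under stated assumptions, not the circuit of Figure 5: the single RY rotation, the cost function (the expectation value of Z), the learning rate, and all function names are hypothetical choices made for this example. The "quantum" part evaluates the circuit, and the classical part updates the parameter using the parameter-shift rule.

```python
import numpy as np


def ry_state(theta: float) -> np.ndarray:
    """Amplitudes of the single-qubit state RY(theta)|0>."""
    return np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])


def expectation_z(theta: float) -> float:
    """Expectation value <Z> on RY(theta)|0>; analytically equal to cos(theta)."""
    amp = ry_state(theta)
    return float(amp[0] ** 2 - amp[1] ** 2)


def parameter_shift_grad(theta: float) -> float:
    """Gradient of <Z> with respect to theta from two shifted circuit evaluations."""
    return 0.5 * (expectation_z(theta + np.pi / 2) - expectation_z(theta - np.pi / 2))


def train(theta: float = 0.3, lr: float = 0.4, steps: int = 100) -> float:
    """Classical gradient-descent loop that minimizes the circuit output <Z>."""
    for _ in range(steps):
        theta -= lr * parameter_shift_grad(theta)
    return theta


theta_opt = train()
print(round(theta_opt, 3), round(expectation_z(theta_opt), 3))
```

The optimizer drives the rotation angle toward π, where the qubit is in state |1> and the cost reaches its minimum of −1. On real hardware each call to `expectation_z` would be replaced by repeated circuit executions and measurement averaging.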
While VQA continues to be a prominent approach for designing QNNs, it also inherits some of its drawbacks. For instance, the QNN framework currently faces the issue of the barren plateau, but specific solutions to this problem have yet to be proposed. Additionally, the measurement efficiency of quantum circuits has not been thoroughly investigated, which remains a challenge for QNN designers.
3. Quantum Machine Learning—Algorithms and Applications
Quantum machine learning algorithms are applied in the domains of Supervised Learning, Unsupervised Learning, and Reinforcement Learning (RL). In Supervised Learning, patterns are learned by observing labeled training data, whereas in Unsupervised Learning, the structure is recognized from a set of unlabeled data. In RL, the algorithm learns from direct interaction with the environment [
64]. Additionally, a technique called deep-supervised learning trains QNNs to recognize patterns and images. It is a feed-forward network that employs circuits with qubits (analogous to neurons) and rotation gates (analogous to weights). On the other hand, Classical deep learning (CDL) uses complex multi-level neural networks, and a deep learning algorithm constructs multiple levels of abstraction from large datasets. Boltzmann machines (BM) are a well-known class of deep-learning networks in which the states of the graph nodes and their connections follow the Gibbs distribution [
65] (
Figure 6). Training aims to maximize the likelihood of the training data under the model distribution using gradient-based optimization, which ensures consistency between the model and the training data [
66].
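As a concrete illustration of the Gibbs distribution and the training objective mentioned above, the following sketch enumerates all states of a very small Boltzmann machine and evaluates the average log-likelihood of a toy training set. The energy convention, the three-unit weights and biases, and the training states are hypothetical values chosen only for this example; realistic machines are too large to enumerate, which is why sampling is used in practice.

```python
import itertools
import math


def energy(state, weights, biases):
    """E(s) = -sum_{i<j} w_ij s_i s_j - sum_i b_i s_i for binary units s_i in {0, 1}."""
    n = len(state)
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    bias_term = sum(b * s for b, s in zip(biases, state))
    return -pair_term - bias_term


def gibbs_distribution(weights, biases):
    """Exact Gibbs distribution p(s) = exp(-E(s)) / Z obtained by enumerating all states."""
    n = len(biases)
    states = list(itertools.product([0, 1], repeat=n))
    unnormalized = [math.exp(-energy(s, weights, biases)) for s in states]
    partition = sum(unnormalized)
    return {s: u / partition for s, u in zip(states, unnormalized)}


def log_likelihood(data, weights, biases):
    """Average log-probability of the training states under the model."""
    dist = gibbs_distribution(weights, biases)
    return sum(math.log(dist[tuple(s)]) for s in data) / len(data)


# Hypothetical 3-unit machine and toy training set (illustrative values only).
W = [[0.0, 1.5, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 0.0, 0.0]]
b = [0.2, -0.1, 0.3]
data = [(1, 1, 0), (1, 1, 0), (0, 0, 1)]

p = gibbs_distribution(W, b)
print(max(p, key=p.get), round(log_likelihood(data, W, b), 3))
```

Training would adjust `W` and `b` to increase `log_likelihood`; the gradient of this objective involves expectations under the Gibbs distribution, which is the quantity the quantum sampling approaches aim to estimate.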
Recently, Wiebe et al. [
67] have developed two quantum algorithms, namely BM and QDL, which efficiently calculate the distribution of data. In BM, the state is initially approximated using the classical mean-field method before being fed into the quantum computer, where sampling is applied to calculate the required gradient. The second algorithm performs a similar process but requires access to the training data in superposition (via QRAM) and provides more accurate solutions but not acceleration. This procedure is equivalent to a feature map that assigns data to vectors in Hilbert space [
68,
69]. The inner products of such quantum states encode data and create kernels [
35].
Fuzzy cognitive maps (FCMs) have been introduced as a quantum-inspired machine learning model belonging to the category of expert systems. Quantum Fuzzy Cognitive Maps (QFCM) were initially introduced in 2009, presenting a quantum-based approach to cognitive maps [
2,
70]. In this framework, each concept is represented by a single qubit, and the concept value is computed through qubit superposition. In 2015, the QFCM model was further developed as an ensemble classifier [
71], which outperformed other models such as AdaBoost and Neural Networks, demonstrating increased robustness against noise. Additionally, in the same year, a variant of FCM called
bipolar quantum FCMs was proposed [
72]. In 2018, the authors of bipolar quantum FCMs explored their application to a quantum cryptography problem [
73], where bipolar quantum FCMs performed well in comparison with other methods. Although QFCMs may not strictly fall within the domain of QML, most implementations are inspired by quantum principles, even though explicit proposals for execution on quantum computers are not prevalent.
A well-defined example of QML is the QSVM algorithm, which utilizes a quantum processor to estimate the kernel directly in the quantum space. This method involves a training phase where the kernel is computed, and support vectors are determined. Subsequently, the unlabeled data is classified based on the solution obtained during the training phase [
74]. The algorithm is capable of performing binary or multi-class classification based on the data classes and can even be utilized for data clustering and exploration.
Machine learning classifiers are versatile tools with applications in diverse fields, such as computer vision, medical imaging, drug discovery, handwriting recognition, geostatistics, and more. Quantum computers hold the potential to help overcome the challenges of support vector machine (SVM) and kernel learning, as previous surveys have reported that quantum computation can, under certain conditions, exponentially accelerate SVM training. Quantum SVM and kernels can efficiently explore high-dimensional spaces, creating maps and decision boundaries (
Figure 7a) for specialized datasets in line with their design objective; this task is difficult for classical kernel functions to match [
75]. This mapping of classical data to the Hilbert space is illustrated using the Bloch sphere, as shown in
Figure 7b.
SVM Kernels and Quantum SVM
The SVM algorithm can be implemented in two ways: using a kernel or using a quantum processor (QSVM). In cases where the data set is non-linear and cannot be handled by a linear classifier, the distance between each point and the center is calculated to create a new feature, which enables classification in a higher-dimensional space (see
Figure 8).
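The idea of adding a distance-to-center feature can be sketched with two concentric rings, which no straight line separates in 2D. The data generator, the ring radii, and the threshold below are hypothetical choices for illustration only; the point is that after lifting each point to 3D with its squared distance from the center, a flat plane separates the classes.

```python
import math
import random


def make_rings(n_per_class=50, seed=0):
    """Two concentric rings in 2D: class 0 near radius 1, class 1 near radius 3."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, radius in ((0, 1.0), (1, 3.0)):
        for _ in range(n_per_class):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.uniform(-0.3, 0.3)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels


def lift(point, center=(0.0, 0.0)):
    """Append the squared distance to the center as a third feature."""
    x, y = point
    return (x, y, (x - center[0]) ** 2 + (y - center[1]) ** 2)


points, labels = make_rings()
lifted = [lift(p) for p in points]

# In the lifted space the plane z = 4 (radius 2) separates the two classes.
predictions = [1 if z > 4.0 else 0 for _, _, z in lifted]
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(accuracy)  # 1.0
```

A kernel SVM performs this kind of lifting implicitly: the kernel function returns inner products in the higher-dimensional space without constructing the new coordinates explicitly.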
The kernel function maps an input feature space into a new, possibly higher-dimensional space where the training dataset can be better separated. For SVMs, the first step involves preparing the training dataset and mapping features into the range [−1, 1], followed by kernel optimization to minimize the cost function. The final step is to test the model. QSVMs follow a similar process, but instead of using a classical computer to evaluate the kernel function, qubits are used to encode the feature space, and the quantum computer performs the kernel evaluation. The attribute mapping is done by encoding data onto the quantum state of each qubit, allowing for the efficient calculation of the kernel matrix.
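The classical steps listed above (prepare the data, scale features to [−1, 1], fit the kernel SVM, test) can be sketched with scikit-learn. This is a minimal example under assumptions, not the configuration used in this study: the built-in breast cancer data, the 80/20 split, the RBF kernel, and the default hyperparameters are all illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Step 1: prepare the data (scikit-learn's built-in data, used here only as an example).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 2: map every feature into [-1, 1]; step 3: fit a kernel SVM.
# The scaler is fitted on the training split only, so no test information leaks in.
model = make_pipeline(
    MinMaxScaler(feature_range=(-1, 1)),
    SVC(kernel="rbf"),  # RBF kernel and default hyperparameters are assumptions
)
model.fit(X_train, y_train)

# Step 4: test the model on held-out data.
test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the preprocessing identical between training and testing, which matters when the same preprocessing is later reused for a quantum kernel.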
Figure 9 illustrates how nonlinear features are mapped to higher dimensions so that new differences between them can be observed. In CML, these features appear clustered together; however, once they are moved into 3D space, they are actually spread apart.
Mapping is achieved with a single gate, and this leads to the creation of a quantum circuit. The kernel is computed by selecting a reference point x and encoding each remaining data point z relative to it using quantum states. The encoding circuit for z is then applied in reverse, and the probability of returning to the zero state, which depends on the distance between the x and z states, gives the kernel value. The weights of the QSVM model are optimized by minimizing a cost function using classical optimization techniques, similar to the classical SVM algorithm [
76].
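The kernel evaluation described above can be imitated with a small statevector simulation in NumPy. This is a sketch under assumptions: the encoding (one RY rotation per feature, one qubit per feature, no entangling gates) and the sample points are hypothetical and much simpler than the feature maps used in practice. The kernel value is the probability of measuring the all-zero state after encoding x and then un-encoding z, which equals the squared overlap of the two encoded states.

```python
import numpy as np


def feature_state(x: np.ndarray) -> np.ndarray:
    """Statevector of the product state built from RY(x_i)|0> on one qubit per feature."""
    state = np.array([1.0])
    for angle in x:
        qubit = np.array([np.cos(angle / 2.0), np.sin(angle / 2.0)])
        state = np.kron(state, qubit)
    return state


def kernel_value(x: np.ndarray, z: np.ndarray) -> float:
    """Probability of the all-zero outcome after U(x) then U(z) reversed: |<phi(z)|phi(x)>|^2."""
    overlap = np.dot(feature_state(z), feature_state(x))  # amplitudes are real here
    return float(overlap ** 2)


def kernel_matrix(data: np.ndarray) -> np.ndarray:
    """Gram matrix of pairwise kernel values for all rows of data."""
    n = len(data)
    gram = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            gram[i, j] = kernel_value(data[i], data[j])
    return gram


# Three hypothetical data points with two features each, already scaled to angles.
data = np.array([[0.1, 0.5], [0.2, 0.4], [2.5, -1.0]])
K = kernel_matrix(data)
print(np.round(K, 3))
```

Identical points give a kernel value of 1, and the value decreases as the encoded states move apart. A Gram matrix of this kind can then be handed to a classical SVM solver that accepts a precomputed kernel, which mirrors the division of labor between the quantum and classical parts of QSVM.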
The second approach involves utilizing the Variational Quantum Eigensolver (VQE), a hybrid quantum-classical computational method designed to estimate the eigenvalues of a Hamiltonian. However, there are two main limitations associated with this approach. Firstly, the complexity of the quantum circuits used in VQE can be challenging. Secondly, the classical optimization problem that relies on the variational ansatz [
77] introduces additional complexity. The Quantum Variational Circuit (QVC), which is applied in this method, applies a weighted rotation on the Bloch sphere L times, where L is a hyperparameter of the variational circuit. As the vectors are already encoded as linear angles in the sphere, this technique provides a detailed description and the ability to search for optimal weights θ. The results are output as a distribution of 0s and 1s, mapped to +1 and −1 [
78].
6. A Preliminary Experimental Study
As described in the sections above, quantum machine learning has achieved significant breakthroughs over its classical counterpart, despite the fact that quantum hardware is still in its early stages of development. This success has raised questions about the concept of quantum supremacy and whether quantum computers hold an advantage over classical computers for machine learning tasks. However, one major limitation is the number of physical qubits available, which makes it challenging to execute quantum machine learning models for large datasets. This creates a paradox where quantum machine learning can outperform classical models in some cases but cannot handle large amounts of data that would demonstrate clear quantum supremacy over classical computers. As a result, a common solution is to simulate quantum machine learning models on classical hardware, where they still demonstrate their superiority over classical models.
Our experiments utilized Qiskit [
81] Aqua, a library specifically designed for building and developing quantum algorithms. This versatile framework also provides pre-implemented quantum machine learning algorithms. Our study employed two learning algorithms: the classical Support Vector Machine (SVM) and the quantum kernel-based method for SVM-based supervised classification, the Quantum Support Vector Machine (QSVM). Both were applied to binary classification problems using three benchmark datasets, and both were run on a classical computer with a regular CPU, the QSVM through a quantum simulator. Because the quantum machine learning algorithm is only simulated in this study, the comparison is not perfectly fair, which is a limitation.
To measure the performance of the two models in the classification problems, three well-known metrics were applied: ROC_AUC, F1-score, and accuracy. ROC_AUC stands for the Area Under the Receiver Operating Characteristic Curve and is a statistical measure for classification problems. In general, it shows how well a classifier distinguishes the different classes in a dataset. The two terms that form this performance curve are the True Positive Rate (TPR), or recall, and the False Positive Rate (FPR), or fall-out, whose mathematical formulas are:
$$\mathrm{TPR} = \frac{TP}{TP + FN} \quad (1)$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \quad (2)$$
In Equation (1), the term TP stands for True Positive and indicates that the model correctly predicted the positive class. The term FN stands for False Negative and indicates that the model incorrectly predicted the negative class. In Equation (2), the term FP stands for False Positive and indicates that the model incorrectly predicted the positive class. The term TN stands for True Negative and indicates that the model correctly predicted the negative class. Based on these two rates, the ROC curve is drawn and the area under it (ROC_AUC) is calculated. The F1-score is also a performance metric; it shows how well a model balances precision and recall when distinguishing the positive class from the negative one. Its mathematical formula is:
$$F_1 = \frac{2\,TP}{2\,TP + FP + FN} \quad (3)$$
In Equation (3), the terms are the same as in TPR and FPR, and the result is the harmonic mean of precision and sensitivity. Like the previous metrics, accuracy is calculated from the previously mentioned terms. Its mathematical formula is:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (4)$$
Equation (4) is the ratio of the number of correct predictions to the total number of predictions. These metrics cover the overall performance of a classification model, whether it is quantum or classical, since only the model predictions are considered.
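The quantities in Equations (1) to (4) can be computed directly from the predicted and true labels. The following minimal sketch uses the standard definitions; the function names and the ten toy labels are hypothetical and do not come from the experiments reported here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for a binary problem."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fn, fp, tn


def safe_div(numerator, denominator):
    """Avoid division by zero when a class is absent."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred, positive=1):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "tpr": safe_div(tp, tp + fn),                       # Equation (1)
        "fpr": safe_div(fp, fp + tn),                       # Equation (2)
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),           # Equation (3)
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),   # Equation (4)
    }


# Toy labels: 4 positives and 6 negatives, with one error of each kind.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For these labels TP = 3, FN = 1, FP = 1, and TN = 5, so TPR = 0.75, FPR = 1/6, F1 = 0.75, and accuracy = 0.8. ROC_AUC additionally requires continuous scores rather than hard labels, since the curve is traced by varying the decision threshold.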
6.1. Datasets Acquisition
In this study, we utilized three datasets from the UCI Machine Learning Repository. The first dataset is the Breast Cancer Wisconsin (Diagnosis) dataset, which contains 699 observations with ten features used to predict the diagnosis of a breast cancer cell nucleus as either benign (represented by the default diagnosis value of 0) or malignant (represented by a diagnosis value of 1). This dataset is available from scikit-learn [
82].
The second dataset is the UCI ML Ionosphere dataset, collected by a radar system in Goose Bay, Labrador, that consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. This dataset contains 351 observations and 34 features that are used to predict whether radar returns show evidence of some type of structure in the ionosphere (represented as “good” returns) or not (represented as “bad” returns) [
83].
The third dataset, Spam Base, contains 4601 observations and 57 features related to the concept of spam, such as advertisements for products/websites, make-money-fast schemes, chain letters, etc. The dataset includes spam and non-spam emails from fieldwork and personal emails [
84]. All three datasets are suitable for quantum machine learning algorithms and were used for supervised binary classification problems.
6.2. Experimental Results and Discussion
To provide a more technical review, we conducted an experimental study to compare the performance of QSVM and classical SVM on three datasets, two of which were used for the first time in this study. The experiments were conducted in several phases, including data preprocessing, quantum and classical model training, and prediction using a 10-fold cross-validation approach. In k-fold cross-validation, the learning set is randomly partitioned, without replacement, into k non-overlapping subsets (“folds”). The model is trained on k − 1 of these subsets and then applied to the remaining subset, denoted as the validation set, on which the performance is measured. This procedure is repeated until each of the k subsets has served as the validation set. In 10-fold cross-validation, the first fold serves as the validation set in the first iteration, and the remaining nine folds serve as the training set. In the second iteration, the second fold is the validation set, the remaining folds are the training set, and so on. The cross-validated accuracy, for example, is the average of all ten accuracies achieved on the validation sets [
85,
86].
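The cross-validation procedure described above can be sketched in a few lines of standard-library Python. The helper names, the toy labels, and the majority-class "model" are hypothetical stand-ins; in the actual experiments the callback would train SVM or QSVM on the training folds and return a score on the validation fold.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Randomly partition range(n_samples) into k non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Call fit_and_score(train_idx, val_idx) once per fold and average the scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, val_idx))
    return sum(scores) / k, scores


# Toy stand-in for "train a model and return its validation accuracy":
# a majority-class predictor on hypothetical labels (60 negatives, 40 positives).
labels = [0] * 60 + [1] * 40


def majority_baseline(train_idx, val_idx):
    ones = sum(labels[j] for j in train_idx)
    majority = 1 if 2 * ones > len(train_idx) else 0
    return sum(labels[j] == majority for j in val_idx) / len(val_idx)


mean_acc, fold_accs = cross_validate(len(labels), majority_baseline)
print(len(fold_accs), round(mean_acc, 2))  # 10 0.6
```

Because every sample appears in exactly one validation fold, the averaged score uses all of the data for evaluation while never scoring a model on samples it was trained on.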
Figure 10 shows a diagram of the experimental setup:
Based on
Figure 10, the data preprocessing block differs between quantum and classical approaches. On the quantum side, each feature of the dataset is represented by a qubit to create the quantum state of a data point and the values are normalized and scaled to the range (−1, 1) on the Bloch sphere. Then, PCA is applied to reduce the dimensionality of the data features to meet the qubit simulation requirements. Next, a hybrid process occurs, where each fold is created on a classical computer, and then quantum training and prediction are performed. Finally, the measured labels are used to calculate the performance metrics. On the classical side, the same preprocessing steps are followed to ensure a fair comparison. As mentioned above, three datasets are used: Breast Cancer, Ionosphere, and Spam Base.
Table 1 below presents some characteristics of these datasets.
The datasets presented above have different characteristics that affect the performance of quantum and classical machine learning models. The Breast Cancer dataset is well-constructed, with a balanced number of features and samples for the two classes. It is a well-known dataset in the quantum machine learning community, and most models, both quantum and classical, can perform well without further processing.
The Ionosphere dataset is a two-class problem with fewer samples and more features than the Breast Cancer dataset. Furthermore, the dataset only contains numerical values, not categorical ones like the Breast Cancer dataset. These factors make Ionosphere a more challenging dataset than Breast Cancer.
The Spam Base dataset is newly applied in quantum machine learning, and it is the most challenging dataset in this experimental setup due to its large number of samples and features. This dataset was selected to examine whether QSVM can outperform classical SVM when a quantum simulator is used as the execution environment.
A fundamental difference between QSVM and classical SVM is the type of kernel used to represent the data when searching for the hyperplane that separates it into different classes. In the quantum version, kernels can be constructed using Pauli operations along the X, Y, and Z axes and their combinations, creating higher-dimensional quantum kernels in a multidimensional Hilbert space. The other parameters of QSVM and SVM are the same and are kept at their default values in the experimental comparison. It is worth noting that, even with dimensionality reduction and simulation on a classical computer, quantum kernels perform as well as or better than classical kernels in most cases. However, although QSVM performs better than classical SVM in many cases, a clear advantage of quantum over classical has not yet been established, especially in the absence of real quantum hardware due to limited access.
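To make the notion of a quantum kernel concrete, the sketch below simulates a deliberately simplified feature map in plain Python: each feature x_i is encoded on its own qubit as RY(x_i)|0⟩, with no entangling gates, and the kernel is the squared overlap of two encoded states. This is an assumption chosen for readability and not the Pauli feature map used in the experiments; the function names are likewise hypothetical. For this product encoding the kernel has the closed form k(x, y) = ∏ cos²((x_i − y_i)/2), which the code can be checked against.

```python
import math


def encode(x):
    """Statevector of the product state RY(x_1)|0> (x) ... (x) RY(x_n)|0>."""
    state = [1.0]
    for angle in x:
        # RY(angle)|0> = cos(angle/2)|0> + sin(angle/2)|1>
        qubit = (math.cos(angle / 2), math.sin(angle / 2))
        # Kronecker product of the state built so far with the new qubit.
        state = [a * b for a in state for b in qubit]
    return state


def quantum_kernel(x, y):
    """Fidelity kernel k(x, y) = |<phi(x)|phi(y)>|^2 (amplitudes are real here)."""
    overlap = sum(a * b for a, b in zip(encode(x), encode(y)))
    return overlap ** 2


def gram_matrix(X):
    """Kernel matrix that a kernel-based SVM solver would consume."""
    return [[quantum_kernel(a, b) for b in X] for a in X]


x, y = [0.3, -0.7], [1.1, 0.2]
closed_form = math.cos((0.3 - 1.1) / 2) ** 2 * math.cos((-0.7 - 0.2) / 2) ** 2
assert abs(quantum_kernel(x, y) - closed_form) < 1e-12
```

The statevector has 2^n amplitudes for n features, which is one reason the feature dimension has to be reduced before simulation. A kernel matrix computed this way can be passed to a classical SVM solver that accepts precomputed kernels, which is the hybrid structure described above.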
It is also worth noting that, even in simulation, the execution of the quantum circuits is faster than the corresponding classical operations; the time-consuming part of the process is the encoding of classical data into quantum states and its subsequent decoding.
Table 2 illustrates the performance of the QSVM and SVM for the case of the first dataset, whereas Figure 11 and Figure 12 illustrate the same results graphically.
Regarding the first dataset, the quantum advantage is not readily apparent, largely due to how the data is distributed. Both SVM and QSVM can approximate the hyperplane with relative ease. In fact, classical SVM outperforms QSVM with a 2% higher accuracy in both the training and testing samples for each of the folds, which encompass 614 training and 69 testing samples.
Table 3 illustrates the performance of the QSVM and SVM for the case of the second dataset, whereas Figure 13 and Figure 14 illustrate the same results graphically.
Regarding the second dataset, it is noteworthy that the Ionosphere dataset comprises a relatively small number of samples, and in this scenario QSVM outperforms SVM in 6 out of 10 folds. This suggests that QSVM can learn effectively even with a limited amount of data. Furthermore, it should be noted that QSVM has the added ability to differentiate between classes based on numerical values, which can be advantageous in certain applications.
Table 4 illustrates the performance of the QSVM and SVM for the case of the third dataset, whereas Figure 15 and Figure 16 illustrate the same results graphically.
On this dataset, QSVM performs better than SVM in 9 out of 10 folds, although the margin is small. These findings suggest that QSVM has the potential to handle complex datasets such as Spam Base effectively and can therefore be considered a promising solution in this domain. Based on the overall results, QSVM performs better on two of the three datasets, without a large difference from classical SVM.
Regarding efficiency and speed, QSVM is computationally more demanding and slower than classical SVM. QSVM requires at least 25 GB of memory and over 3 h of training time for each dataset, while classical SVM operates with less than 1 GB of memory and completes training within 1 h for each dataset.
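One plausible contributor to the cost of a simulated kernel method is the number of pairwise kernel entries that must be estimated. The sketch below counts them for a single fold under the assumption, not stated in the text, that each kernel entry is obtained from one simulated circuit for a pair of samples and that the symmetry of the training matrix is exploited. The fold sizes are the ones quoted above for the first dataset (614 training and 69 testing samples); the function name is hypothetical.

```python
def kernel_evaluations(n_train, n_test):
    """Count pairwise kernel entries for the training and test kernel matrices."""
    # The training matrix is symmetric with a unit diagonal, so only the
    # strictly upper triangle needs to be estimated.
    train_pairs = n_train * (n_train - 1) // 2
    # Every test point is compared with every training point.
    test_pairs = n_test * n_train
    return train_pairs, test_pairs


train_pairs, test_pairs = kernel_evaluations(614, 69)
print(train_pairs, test_pairs)  # 188191 42366
```

Under these assumptions a single fold already involves roughly 230,000 simulated circuits, and the count grows quadratically with the number of training samples, whereas a classical kernel entry is a closed-form arithmetic expression.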
From an efficiency perspective, QSVM falls behind SVM, especially when executed on a classical machine, where extensive computational resources are required and not all quantum computing characteristics can be fully harnessed. However, from a performance standpoint, QSVM demonstrates potential, as it outperforms SVM on two of the three datasets. It is worth noting that in this study the QSVM algorithm is compared with its classical counterpart, although other quantum-inspired classical SVM approaches that have been proposed recently [87] can also be considered. In the latter cases, improved accuracy in an accelerated way seems feasible without implementing the SVM algorithm in a fully quantum form. However, when quantum computers are established as alternative computing devices, any classical ML model is expected to be outperformed by its QML counterpart.
In conclusion, quantum models exhibit promising potential for achieving superior performance compared to classical models, but there are numerous challenges and limitations that need to be addressed and resolved.
7. Conclusions and Future Works
Quantum computing is still in its early stages, and building a functional and efficient computer with enough qubits will take years. While quantum machine learning (QML) methodologies produce spectacular results, the quantum hardware currently available is not sufficient. Researchers need access to quantum computers with more physical resources to expand the scope and power of QML. As quantum hardware and computing continue to evolve, QML is likely to become a leading approach in applications such as unsupervised learning and generative models, where it is expected to outperform classical versions.
Many classical machine learning (ML) methods and techniques can be transformed into QML schemes to increase the domain’s capabilities, such as expert systems. In this paper, we experimentally compared classical and quantum machine learning methods on three datasets (Breast Cancer, Ionosphere, and Spam Base), implementing the SVM and QSVM algorithms and evaluating their prediction accuracy. The Spam Base dataset was used for the first time and was the most challenging of the experiments, and on it QSVM outperformed classical SVM. It is important to note that, even though the quantum execution environment was simulated by a classical computer, QSVM achieved higher scores on the Spam Base dataset, where each fold contained 1527 training samples and 712 testing samples.
Another important observation is that the higher the complexity of the dataset, the wider the performance gap between quantum and classical models. This is because quantum machine learning operations can generalize much better without losing performance, in contrast to classical models, which often overfit when simple models are applied to complex datasets. Some open issues remain to be addressed in future work. For instance, the quantum simulator was executed on a local classical machine without high computing power, while execution on a real quantum computer could provide more realistic outcomes. The QML domain should also target the design of new quantum learning models that observe patterns under quantum mechanics schemes rather than classical statistical theory. This will provide an opportunity to explore new model architectures that might overcome the limitations of classical machine learning.