1. Introduction
Starting with the Deutsch–Jozsa algorithm and Shor’s discrete logarithm algorithm [1,2], the potential of quantum computing algorithms has extended beyond merely simulating quantum systems. The potential speedup of quantum algorithms over their classical counterparts has attracted tremendous attention, including for a fundamental task in science and engineering: solving linear systems. Harrow, Hassidim, and Lloyd (HHL) first developed a quantum linear solver with an exponential speedup in the problem dimension [3]. Building on the exponential speedup of quantum linear system algorithms (QLSAs), many works have explored theoretical quantum advantages in various applications, including portfolio optimization [4], machine learning [5,6], differential equation solving [7], linear optimization [8,9,10,11], and semi-definite optimization [12,13].
However, the HHL algorithm proposed in [3] has a quadratic dependency on the matrix condition number and the matrix sparsity, which is worse than classical linear solvers such as factorization methods and the conjugate gradient method. Here, the condition number is the product of the norm of the coefficient matrix and the norm of its inverse. Several works have been proposed to reduce the dependency on the condition number of the coefficient matrix and on the accuracy of the solution state [14,15,16,17,18,19,20,21,22]. Specifically, based on adiabatic theorems, the state of the art has a linear or quasi-linear dependency on the condition number and a logarithmic dependency on the inverse of the solution accuracy [18,19,20,22].
The HHL algorithm has been demonstrated in experiments to solve linear algebra problems. The largest linear systems demonstrated on real gate-based quantum machines are up to
systems with variants of the HHL algorithm [
23,
24,
25] and an
system with the linear solver based on adiabatic quantum computing [26]. However, testing QLSAs on real quantum devices to demonstrate a quantum advantage still faces multiple obstacles, such as the large number of required quantum gates and the high noise level of current quantum devices [27].
With the current development of quantum hardware and the exploration of quantum error correction (QEC) codes, a large-scale fault-tolerant quantum computer is expected to be demonstrated in the foreseeable future [28,29,30,31,32,33,34]. QEC codes, such as surface codes, are expected to detect and correct Pauli errors, as well as any linear combinations of them, provided the errors occur with a probability below a certain threshold [35]. Although the gap between algorithm requirements and hardware specifications is shrinking, it still exists, which necessitates an analysis of the resource costs involved [36]. Resource estimations have been performed for chemistry [37], Grover’s algorithm on the Advanced Encryption Standard [38], Shor’s discrete logarithm algorithm for the RSA cryptosystem [39], and the computation of elliptic curve discrete logarithms [40]. However, although such estimates are essential for understanding the disparity between hardware capabilities and practical applications, there is limited work on non-asymptotic resource estimation for QLSAs [41].
In this paper, we focus on resource estimation and experiments with the HHL algorithm on several applications selected from domain science, such as power grids and climate projection. Unlike previous works on asymptotic and non-asymptotic resource analysis [3,14,15,16,17,18,19,20,21,22,41], we investigate the factors affecting the final accuracy, resource cost, and fault-tolerant hardware requirements. Our experiments show that the HHL algorithm remains effective in scientific applications even with low-accuracy quantum phase estimation (QPE). Working with the Microsoft Azure Quantum resource estimator [42,43], we characterize the exponential dependency of quantum resources on the number of clock qubits in HHL circuits and demonstrate a possible method to reduce the demand for physical qubits in fault-tolerant quantum computing.
This paper is organized as follows: Section 2 introduces the idea of quantum linear system solvers, with implementation-related details. Section 3 presents the simulator, NWQSim [44], and the resource estimation tool. Next, we explore the factors of interest in evaluating numerical experiments in Section 4 and perform those experiments in Section 5. Finally, we discuss the limitations in Section 6 and summarize the implications of our work for domain science applications in Section 7.
3. Simulator and Resource Estimation Tool
The simulations in our experiments are carried out by the statevector simulator SV-Sim [57] in the Northwest Quantum Circuit Simulation Environment (NWQSim, V2.5.0) [44]. As shown in [57], compared to the simulators in Aer from Qiskit [58] and qsim from Cirq [59], NWQSim provides specialized optimizations, such as gate fusion, for a wide range of supported basis gates and CPU and GPU architectures. In
Table 2 and later in
Section 5, we demonstrate that the gate fusion strategy in NWQSim can reduce about
of gates in the circuits without sacrificing error rates. In addition, NWQSim utilizes a communication model called “PGAS-based SHMEM” that significantly reduces communication latency for intra-node CPUs/GPUs and inter-node CPU/GPU clusters. As a result, SV-Sim outperforms other simulators in deep-circuit simulation [57].
Figure 2 shows the running time of HHL circuits with 11 to 17 qubits on SV-Sim on four different GPUs.
The resource estimator in [42,43] from Microsoft Azure Quantum establishes a systematic framework to assess and model the resources necessary for implementing quantum algorithms in a user-specified fault-tolerant scenario. This tool enables detailed estimation of various computational resources, such as the number of physical qubits, the runtime, and other QEC-related properties required to achieve a quantum advantage for certain applications. Specifically, the tool accepts a wide range of qubit and QEC code specifications, together with an error budget that allows different error rates, to simulate a described fault-tolerant environment.
The tool is compatible with circuits generated from a high-level quantum computing language or package, including Qiskit and Q#. A given circuit is first compiled into Quantum Intermediate Representation through a unified processing program, so that the estimator can examine the code and record qubit allocations, qubit releases, gate operations, and measurement operations. Logical-level resources are then estimated and used to compute the required physical-level resources. The tool returns a thorough report on the resources demanded to perform the given algorithm on fault-tolerant quantum computers, including explanations and the related mathematical equations for those estimates. A selected list of estimates is described in
Section 4, and their values in conducted experiments are displayed in
Section 5.
4. Factors of Interest
Because we focus on linear systems from scientific applications rather than random systems for benchmarking, we have less control over the specific values of matrix properties such as condition numbers. Our interest is mainly in the number of clock qubits
in the HHL circuit, which controls the accuracy of estimated eigenvalues. The error in eigenvalue estimation affects the solution of the linear system through Equation (
2). From [
35], to obtain an eigenvalue with
accuracy with at least
success probability using QPE, we need
In the Qiskit-based HHL implementation that we used [
55], it is suggested that
where
if the coefficient matrix has a negative eigenvalue; otherwise, it is 0. In this paper, we will adjust
to illustrate the influence of the QPE resources on the HHL circuit’s total cost and the algorithm’s accuracy in domain applications.
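The two clock-qubit rules discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `qpe_clock_qubits` uses the standard textbook QPE bound t = n + ⌈log2(2 + 1/(2ε))⌉, and `hhl_default_clock_qubits` assumes that Equation (3) has the form max(n_b + 1, ⌈log2(κ + 1)⌉) plus one extra qubit for negative eigenvalues, as recalled from the legacy Qiskit HHL implementation. The function names and the condition number κ = 10 are hypothetical.

```python
import math


def qpe_clock_qubits(n_bits: int, eps: float) -> int:
    """Textbook QPE bound: qubits needed for n_bits of accuracy
    with success probability at least 1 - eps."""
    return n_bits + math.ceil(math.log2(2 + 1 / (2 * eps)))


def hhl_default_clock_qubits(n_b: int, kappa: float, has_negative_eigenvalue: bool) -> int:
    """Assumed form of the default rule: n_b is the number of qubits
    encoding the right-hand side, kappa the condition number."""
    neg_vals = 1 if has_negative_eigenvalue else 0
    return max(n_b + 1, math.ceil(math.log2(kappa + 1))) + neg_vals


# Hypothetical inputs: 4 data qubits, kappa = 10, matrix with negative eigenvalues.
print(qpe_clock_qubits(4, 0.1))
print(hhl_default_clock_qubits(4, 10.0, True))
```

Under these assumptions, a well-conditioned system encoded on 4 qubits with negative eigenvalues gets 6 clock qubits by default, which is the setting that the experiments later vary.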
When discussing resource estimation under a fault-tolerant setting, our primary concerns are the estimated runtime, the number of physical qubits, and extra resources required from the QEC code. We adopt a distance-7 surface code that encodes 98 physical qubits into a single logical qubit. The theoretical logical qubit error rate is
, and the error correction threshold is 0.01. The Azure Quantum resource estimator provides several qubit parameter sets to simulate different qubit properties. The preset qubit settings we used in this paper are
and
from [
42], where the former is close to the specifications of superconducting transmon qubits or spin qubits, and the latter is more relevant for trapped-ion qubits [
42]. A list of detailed configurations of qubit parameter set
and
is in
Table 3. We enforce 2-D nearest-neighbor connectivity of the qubits to simulate the connectivity constraint on real quantum computers. We therefore also report how some factors change from before this constraint is enforced (“pre-layout”) to after it is enforced (“after layout”).
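As a rough check of the surface code numbers quoted above, the following sketch evaluates two formulas that are assumed here to match the default surface code model documented for the Azure Quantum resource estimator: 2d² physical qubits per logical qubit, and a logical error rate of a·(p/p*)^((d+1)/2). The prefactor a = 0.03 and the physical error rate p = 10⁻³ are assumptions for illustration; only the distance 7, the 98 physical qubits, and the threshold 0.01 come from the text.

```python
def physical_qubits_per_logical(distance: int) -> int:
    """Assumed surface code footprint: 2 * d^2 physical qubits."""
    return 2 * distance * distance


def logical_error_rate(p_phys: float, distance: int,
                       prefactor: float = 0.03, threshold: float = 0.01) -> float:
    """Assumed model: prefactor * (p / threshold)^((d + 1) / 2)."""
    return prefactor * (p_phys / threshold) ** ((distance + 1) / 2)


d = 7
print(physical_qubits_per_logical(d))   # footprint of one logical qubit
print(logical_error_rate(1e-3, d))      # p_phys = 1e-3 is an assumed value
```

The first function reproduces the 98 physical qubits per logical qubit stated for the distance-7 code; the second shows how quickly the logical error rate falls as the distance grows when the physical error rate is below threshold.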
Another important tunable parameter is the overall error allowed for the algorithm, namely the error budget. Its value is divided equally into three parts:
Logical error probability: the probability of at least one logical error;
T-distillation error probability: the probability of at least one faulty T-distillation;
Rotation synthesis error probability: the probability of at least one failed rotation synthesis.
There are also specific breakdowns in the resource required by QEC that are of interest [
42,
43]. We list them in
Table 4.
5. Scientific Applications and Evaluation
This section examines the utilization of the HHL algorithm in the fields of power grids and climate projection. We evaluate the performance of HHL in terms of solution accuracy, resource cost, and influence on convergence speed for applicable problems.
In addition to the hardware specifications in
Section 3, all resource estimator jobs are run on the Azure Quantum cloud server. Due to the usage limits of the cloud service, we cannot examine some of the deepest circuits in this section with the resource estimator. All evaluated circuits are transpiled with respect to a given basis gate set from the estimator using the transpiler in Qiskit, with the optimization level set to 2. The Qiskit version is 0.46, the Azure Quantum version is 0.30.0, and the MATPOWER version is 7.1.
5.1. Power Flow Problem in Power Grid
The use of quantum algorithms has drawn much attention in recent research on power system applications, especially in areas where quantum linear system solvers can be deployed, including power flow, contingency analysis, state estimation, and transient simulation [60,61,62,63,64,65]. The specific problem type we illustrate in this section is an alternating current power flow problem.
The power flow equations are essential to analyzing the steady-state behavior of power systems by describing the relationship between bus voltages (magnitude and phase angles), currents, and power injections in a power system. The basic power flow equations are as follows:
where
: real power injection at bus k.
: reactive power injection at bus k.
: voltage magnitude at bus k.
: phase angle difference between bus k and bus j.
: admittance between bus k and bus j.
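For reference, one standard textbook form of the AC power flow equations that is consistent with the quantities listed above is sketched below. The notation is an assumption, since the original symbols are not reproduced in this text: $P_k$ and $Q_k$ are the real and reactive power injections, $|V_k|$ the voltage magnitude, $\theta_{kj}$ the phase angle difference, and the admittance is split as $Y_{kj} = G_{kj} + \mathrm{i}B_{kj}$. The authors' exact form may differ.

```latex
\begin{align}
P_k &= \sum_{j} |V_k|\,|V_j| \left( G_{kj}\cos\theta_{kj} + B_{kj}\sin\theta_{kj} \right), \\
Q_k &= \sum_{j} |V_k|\,|V_j| \left( G_{kj}\sin\theta_{kj} - B_{kj}\cos\theta_{kj} \right),
\end{align}
```

Here the sums run over all buses $j$ connected to bus $k$, including $j = k$.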
For a power flow problem with
B buses and
G generators, there are
unknowns representing voltage magnitudes,
, and phase angles,
, for load buses and voltage phase angles for generator buses. Given the admittance matrix of the system, which represents the nodal admittance of the buses, we can use the Newton–Raphson (N-R) method to solve the power flow equations iteratively: after an initial guess for the voltages at all buses, in each N-R iteration, we solve
where
and
are computed using the admittance matrix, nodal power balance equation, and mismatch equations with the data from the last iteration or initial guess. Then,
and
are updated by
and
, respectively. The algorithm is considered converged when
and
are smaller than a convergence tolerance.
It is worth noting that while HHL can solve Equation (
4) for the normalized solution state
with limited accuracy, the un-normalized vector could have a smaller norm than the accuracy of HHL. Thus, the final accuracy of voltage magnitude and phase angles is much higher than the accuracy used in HHL. This situation is similar to iterative refinement in semi-definite optimization in [
13].
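The effect described above can be illustrated with a small classical stand-in. The sketch below is not the MATPOWER or HHL workflow: it runs Newton–Raphson on a hypothetical two-variable nonlinear system, and `low_precision_solve` mimics a solver that returns only a coarsely quantized normalized direction (rounded to a few bits) while the norm of the update is assumed to be recovered exactly. All names and the test problem are assumptions.

```python
import numpy as np


def residual(x):
    # Toy nonlinear system standing in for the power flow mismatch equations.
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0])


def jacobian(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]])


def low_precision_solve(J, r, bits):
    # Solve exactly, then keep only `bits` fractional bits of the
    # normalized direction; the norm is assumed to be known exactly.
    dx = np.linalg.solve(J, r)
    norm = np.linalg.norm(dx)
    if norm == 0.0:
        return dx
    scale = 2.0 ** bits
    direction = np.round(dx / norm * scale) / scale
    return norm * direction


def newton_raphson(x0, bits=None, tol=1e-8, max_iter=100):
    x = np.array(x0, dtype=float)
    for iteration in range(max_iter):
        r = residual(x)
        if np.max(np.abs(r)) < tol:
            return x, iteration
        J = jacobian(x)
        dx = np.linalg.solve(J, r) if bits is None else low_precision_solve(J, r, bits)
        x = x - dx
    return x, max_iter


x_exact, it_exact = newton_raphson([2.0, 1.0])
x_low, it_low = newton_raphson([2.0, 1.0], bits=3)
print(it_exact, it_low, np.max(np.abs(x_exact - x_low)))
```

Because the update shrinks as the iterate approaches the solution, a fixed relative error in the direction translates into an ever smaller absolute error, so the low-precision variant can still reach the same tolerance, typically at the cost of extra iterations.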
5.1.1. Settings of the Numerical Experiments
The test case is the four-bus, two-generator problem in ([66], p. 377), coded in a MATLAB package called MATPOWER [67]. Based on the framework built in [68], we incorporate HHL circuits and quantum simulators into the solving process in MATPOWER. The linear systems of interest are all
systems and are not Hermitian. The actual input system is therefore first expanded to
so that the size of the right-hand-side (RHS) vector is a power of 2, and then it is enlarged to
following Equation (
1). So, we eventually use 4 qubits to encode the vector
. This process is illustrated in
Figure 3a.
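The two expansion steps can be sketched with NumPy as follows. This is a minimal illustration that assumes Equation (1) is the usual Hermitian embedding [[0, A], [A†, 0]] and that padding is done with an identity block; the 5 × 5 random system is a hypothetical example rather than the matrix from the test case, and the helper names are not from the paper.

```python
import numpy as np


def pad_to_power_of_two(A, b):
    """Embed A in the top-left corner of an identity matrix whose
    size is the next power of two, and zero-pad b accordingly."""
    n = A.shape[0]
    m = 1 << (n - 1).bit_length()
    A_pad = np.eye(m, dtype=complex)
    A_pad[:n, :n] = A
    b_pad = np.zeros(m, dtype=complex)
    b_pad[:n] = b
    return A_pad, b_pad


def hermitian_embedding(A, b):
    """Assumed form of the embedding: H = [[0, A], [A^dagger, 0]],
    so that H @ [0; x] = [b; 0] whenever A @ x = b."""
    m = A.shape[0]
    zero = np.zeros((m, m), dtype=complex)
    H = np.block([[zero, A], [A.conj().T, zero]])
    rhs = np.concatenate([b, np.zeros(m, dtype=complex)])
    return H, rhs


rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)) + 5.0 * np.eye(5)   # hypothetical non-Hermitian system
b = rng.normal(size=5)

A_pad, b_pad = pad_to_power_of_two(A, b)
H, rhs = hermitian_embedding(A_pad, b_pad)
y = np.linalg.solve(H, rhs)
x = np.linalg.solve(A, b)
n_qubits = int(np.log2(H.shape[0]))
print(H.shape, n_qubits)
```

The solution of the original system sits in the second half of the enlarged solution vector, here `y[8:13]`, and the remaining entries are zero.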
The default value of
set by [
35] using Equation (
3) is 6. To demonstrate how the accuracy of eigenvalues affects an iterative algorithm, we select
from 4 to 7. With 4 qubits encoding the vector, 4 to 7 clock qubits in QPE, and one ancillary qubit required by the HHL algorithm, the number of qubits in each HHL circuit ranges from 9 to 12. The N-R method converges when
However, because the linear system formed in an N-R iteration depends on the solution from the previous N-R iteration, the linear systems at Iteration
j with different
will differ. Our comparison focuses on the convergence speed and the final solution at the convergence instead of errors at each iteration across different
.
5.1.2. Performance Evaluations
The sparsity of all tested coefficient matrices is
after the expansion, with condition numbers in the range of
. The minimums of the magnitude of eigenvalues are in the range of
, and the maximums are
.
Figure 3b,c illustrate the use of a less precise linear solver within an iterative method such as the N-R method. Although the N-R method with an HHL subroutine converges more slowly than with a classical linear solver in MATLAB, all methods converge under the same criterion and obtain similar solutions. Our experiments thus exhibit a trade-off between convergence speed and the cost of solving each linear system.
On the other hand, if we compare the values of normalized error
, when
, using more clock qubits indeed leads to lower error from the HHL algorithm itself. However, increasing
does not imply less error on the solution vectors,
, nor faster convergence by looking at the values of
and
in
Figure 3. The HHL algorithm with
gives the fastest convergence, which is smaller than the default value, 6, from Equation (
3).
5.1.3. Gate Counts and Depths of HHL Circuits
Because the circuits from later iterations have similar resource demands, we only look at the circuits in the first iteration. The depths and gate counts of HHL circuits are the same across N-R iterations when
is fixed. While HHL with
gives similar convergence speed and accuracy, the required resources to run the circuits exponentially increase as
increases based on
Table 5. On the other hand, although the gate fusion employed in NWQSim does not mitigate these exponential trends, it yields a consistent proportional reduction across the various HHL circuits: a
reduction in gate counts on all tested circuits regardless of the value of
.
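To make the idea of gate fusion concrete, the sketch below merges runs of adjacent single-qubit gates on the same qubit into one matrix. It is a deliberately simplified illustration of the general technique, not NWQSim's actual fusion strategy; the circuit representation as a list of `(matrix, qubit)` pairs and the function name are assumptions.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
X = np.array([[0, 1], [1, 0]], dtype=complex)


def fuse_single_qubit_runs(circuit):
    """Merge neighbouring list entries that act on the same qubit.
    circuit is a list of (2x2 matrix, qubit index) in time order."""
    fused = []
    for matrix, qubit in circuit:
        if fused and fused[-1][1] == qubit:
            previous, _ = fused[-1]
            # The later gate multiplies on the left of the earlier one.
            fused[-1] = (matrix @ previous, qubit)
        else:
            fused.append((matrix, qubit))
    return fused


circuit = [(H, 0), (T, 0), (X, 1), (H, 1), (T, 0)]
fused = fuse_single_qubit_runs(circuit)
print(len(circuit), "->", len(fused))
```

The fused circuit applies the same unitary with fewer matrix applications to the statevector, which is where a simulator saves time; production simulators also fuse across multi-qubit gates and reorder commuting operations.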
5.1.4. Resource Estimation in a Fault-Tolerant Scenario
Encoded by the surface code described in
Section 4 along with a nearest-neighbor connectivity constraint, we estimate the runtime of HHL circuits by the Azure Quantum resource estimator and summarize the data in
Figure 4. A strong and consistent linear correlation between the number of clock qubits in QPE,
, and the runtime in log base 10 is displayed across qubit parameter sets and error budgets. Every extra clock qubit brings
times longer runtime when the error budget is
and
times longer when the error budget is
. This multiplier shows an increasing trend when the error budget decreases. Similar correlations are also demonstrated in
Figure 5a,b when we further investigate how
affects the number of logical cycles for the circuit and the number of
T states. Generally, the exponential dependencies of runtime, number of logical cycles, and number of
T states on
match the relationship between the number of gates in HHL circuits and
. Note that the slopes of the fitted line in
Figure 5a,b are not sensitive to error budgets, different from the behavior in
Figure 4. Error budgets affect the constant multiplier of the growth of logical cycles and the number of
T states more.
Table 6 summarizes the other factors of our interest. Those factors have the same values in
and
settings. Note that there is a dramatic fall in the number of physical qubits when the error budget is 0.01 and
rises from 4 to 5. Combined with
Figure 5c,d, this reduction comes from a large drop in the number of physical qubits spent on
T factories, a dominant demand on physical qubits instead of the quantum algorithm itself. The circuit requires 15
T factories when the error budget is
and
, but this number is reduced to 12 when
. Recall the definition of the number of
T factories in
Table 4. Based on the fitted coefficients in
Figure 4 and
Figure 5, we can see that while the increase in
from 4 to 5 leads to
times more
T states, the runtime becomes
times larger. Since
T factory duration and
T states per factory are kept constant, the faster-growing runtime reduces the number of
T factories required, thus decreasing the overall number of physical qubits required. This phenomenon does not occur when the error budget is
because the growth of runtime and
T-state count are at the same speed.
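The mechanism can be made explicit with a short calculation. The sketch assumes that the number of T factories follows the formula documented for the Azure Quantum resource estimator, ⌈(T states × T factory duration) / (T states per factory run × runtime)⌉. All numerical inputs below are hypothetical values chosen only to show the effect and are not taken from the experiments.

```python
import math


def t_factory_count(t_states, factory_duration, states_per_run, runtime):
    """Assumed definition: factories needed to produce all T states
    within the algorithm runtime (durations in the same time unit)."""
    return math.ceil(t_states * factory_duration / (states_per_run * runtime))


# Hypothetical baseline circuit.
before = t_factory_count(t_states=1000, factory_duration=50, states_per_run=1, runtime=3400)

# Hypothetical larger circuit: T states double, runtime grows by 2.5x.
after = t_factory_count(t_states=2000, factory_duration=50, states_per_run=1, runtime=8500)

print(before, after)
```

Whenever the runtime multiplier exceeds the T-state multiplier, each factory has more time to produce states, so fewer factories are needed even though the circuit is larger.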
5.2. Heat Transfer Problem in Climate Projection
Linear solvers are deeply embedded in solving linear and non-linear differential equations through numerical methods such as Carleman linearization and the finite difference method [69]. Such methods discretize the domain of the problem into grids, and the dimension of the resulting linear system scales with the size of the discretization. The number of grid points scales polynomially with the system size, while the demand for solving such differential equations (DEs) is ubiquitous in science and engineering. Due to the exponential speedup in the problem dimension, the combination of quantum linear solvers and these numerical methods has become an attractive direction [69,70,71,72]. For example, accurate climate projection, one of the most scientifically challenging and socially urgent problems, suffers from the curse of dimensionality and could be revolutionized by quantum computing. In this section, we explore the application of a quantum linear system solver to the heat transfer equation, which is important for atmospheric processes related to climate projection.
5.2.1. Settings of the Numerical Experiments
In this section, we examine the two-dimensional (2-D) heat diffusion equation in [
73]
where
T represents the temperature at a given 2-D point and time,
D is the heat transfer coefficient, and
F is the forcing term consisting of arbitrary boundary and initial conditions. Equation (
5) is a linear partial differential equation. We discretize Equation (
5) in space and time into a system of ordinary differential equations using the finite difference method,
where
A is the resulting coefficient matrix. For square lattices with a lateral size of three grid points and five grid points, the resulting dimension of
A is
and
, respectively. Such configurations require 4 qubits and 5 qubits to represent the RHS vectors (
F term in Equation (
6)) in both linear systems, respectively. Let
be the coefficient matrix generated from
l number of grid points; the entry values are
where
p and
q denote the index of the entries of
A, and
r is
in the three-point case and
in the five-point case.
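How a square lattice leads to these matrix dimensions and qubit counts can be sketched as follows. The construction is one common finite-difference choice, an identity plus a small multiple of the 2-D five-point stencil with fixed boundaries, and is not necessarily the entry formula used in the paper; the value r = 10⁻³ and the function names are assumptions.

```python
import numpy as np


def heat_matrix(l: int, r: float) -> np.ndarray:
    """I + r * K on an l-by-l lattice, where K is the negative
    2-D discrete Laplacian built from the 1-D second difference."""
    second_diff = 2.0 * np.eye(l) - np.eye(l, k=1) - np.eye(l, k=-1)
    identity = np.eye(l)
    K = np.kron(second_diff, identity) + np.kron(identity, second_diff)
    return np.eye(l * l) + r * K


def data_qubits(dimension: int) -> int:
    """Qubits needed to hold a vector after padding to a power of two."""
    return (dimension - 1).bit_length()


A3 = heat_matrix(3, 1e-3)   # r = 1e-3 is a hypothetical value
A5 = heat_matrix(5, 1e-3)
print(A3.shape, data_qubits(A3.shape[0]))
print(A5.shape, data_qubits(A5.shape[0]))
print(np.linalg.cond(A3), np.linalg.cond(A5))
```

With a small r, the matrix is a mild perturbation of the identity, so it is symmetric with eigenvalues close to 1 and a condition number close to 1.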
5.2.2. Performance and Resource Evaluations
The coefficient matrices are Hermitian by design, so we only need to expand the dimension to the nearest power of 2, i.e., 16 and 32. After dimension expansion, the coefficient matrices have sparsity and , respectively. Both matrices have condition number 1, and all of their eigenvalues are around 1.
When
, the HHL circuits in
Table 5 and
Table 7 have almost the same depths and gate counts. However, if we compare across different
in
Table 7, significant increases appear in depths and all gate counts. This situation reflects one of Aaronson’s concerns in [
74] about the efficiency and the cost of data reading in quantum linear solvers. Furthermore, similar to the scenario in
Section 5.1, the increase of
, despite being very costly, has a limited contribution towards reducing errors, as shown in
Figure 6.
5.2.3. Resource Estimation in a Fault-Tolerant Scenario
Most of the observations from
Figure 7 and
Figure 8 and
Table 8 for both problem sizes are analogous to the findings in
Section 5.1.4, including the numerical values of the fitted-line coefficients related to runtime, logical cycles, and the number of
T states. The main effect of the deeper data-loading module in the five-point problem is a parallel shift toward longer runtime, more logical cycles, more
T states, and stricter requirements on the logical qubit error rate and
T state error rate. More data-loading qubits do not affect the growth speed of the logical cycle and the number of
T states. Due to the limitation of computational time in Azure Quantum cloud service, we cannot collect more data points to understand this correlation better. However, from a theoretical perspective, this is expected because the QPE costs of HHL circuits are the same with the same
in the power flow and heat transfer problems.
6. Discussion
This paper evaluates and analyzes the performance and resources required for the HHL algorithm in various scientific and engineering problems. There are still multiple points we need to address in future work. The foremost limitation in this work is the data-loading module in the HHL circuit generation. While the data-loading algorithm in [
55] can encode an arbitrary vector into a quantum circuit, the circuit depth of this module is exponential in the number of qubits. Thus, this first part of the circuit severely undermines the potential quantum speedup of HHL. We mitigate this drawback by comparing the outcomes from problems of different sizes to isolate the influence of the data-loading module. An important future direction is incorporating an efficient data-loading scheme into our analysis framework, such as the block-encoding in [
8]. A different data-loading method could have a different accuracy, so it is necessary to investigate how the data-loading accuracy and the condition number of the coefficient matrix collectively affect the solution accuracy. This points to the second limitation of this study: our tested coefficient matrices are all well-conditioned. Because our experiments do not use randomly generated test cases, we have less control over matrix properties such as condition number and sparsity. A potential source of ill-conditioned test cases is methods that naturally produce ill-conditioned matrices, such as the Newton systems arising in the interior-point method for optimization problems [
13]. Thus, to solve those systems, iterative refinement with the HHL algorithm [
9] and a variant of the HHL algorithm in [
15], accompanied by the sparse approximate inverse preconditioner, are part of our planned future work. Because of the single-job running time limit on the Azure Quantum cloud server, we cannot process large HHL circuits, with the number of gates being the main constraint. This restricts the number of clock qubits in the QPE and the number of data points in each plot in
Section 5. This is why we only discuss the correlations whose coefficients of determination are almost 1. In future studies, we will decompose the whole HHL circuit into separate modules and evaluate the resource cost of each module individually.
Some additional research can be conducted to further enhance our understanding of the application of quantum algorithms in scientific problems. An important direction is understanding the impact of various noise models on the HHL algorithm. We plan to conduct those experiments with the high-precision noise simulator in [75]. We can also incorporate quantum algorithms that address similar scientific applications into our resource analysis framework, such as the quantum differential equation solvers in [76,77] and the quantum optimizers in [78,79].
7. Conclusions
In this paper, we investigate the practical applications and scalability of the HHL algorithm for solving linear systems associated with scientific problems such as power grid and heat transfer problems. Using the NWQSim package on high-performance computing platforms, we highlight the benefits of using low-accuracy QPE in HHL for both iterative and non-iterative methods in practice: low-accuracy QPE can exponentially reduce the gate counts and circuit depth of an HHL circuit while keeping the same solution accuracy in iterative methods like the Newton–Raphson method and maintaining a similar level of accuracy in a non-iterative method like the finite difference method.
Furthermore, with the Azure Quantum resource estimator, we evaluate the resource requirements of HHL circuits in our experiments under two settings that simulate superconducting and trapped-ion qubits, and we study the correlations between QEC-related criteria and the input HHL circuits. The runtime, number of logical cycles, and number of T states depend exponentially on the number of clock qubits in QPE. However, this relation is not necessarily inherited by the number of physical qubits demanded. In our experiments, we find that even as the number of clock qubits increases and the error budget decreases, the T factory demand can decrease. More specifically, if the runtime grows faster than the required number of T states, the circuit needs fewer T factories and thus fewer physical qubits to build them. Since the growth of runtime is sensitive to the error budget, it is possible to reduce the physical qubit requirement if a low error budget is achievable on early fault-tolerant quantum devices.
Our study provides pivotal insights into the operational requirements of quantum linear system algorithms, paving the way for further empirical studies. We propose future research on the applications of quantum linear system solvers and iterative refinement on high-fidelity quantum computers for small-scale experiments. For large-scale experiments, we suggest using noise-modeled simulators on high-performance platforms. In the context of QEC and early fault-tolerant quantum computing, we believe it is crucial to focus on controlling the resource cost of
T factories by considering the runtime and error budget. These research directions hold promise for bridging the gap between theoretical potential and practical usability in quantum computing. All the code in this research will be hosted in a public repository (
https://github.com/pnnl/nwqlib, accessed on 4 August 2025).