1. Introduction
Quantum computers are computational devices that operate using phenomena described by quantum mechanics. They can therefore carry out operations that are not available to classical computers. The ability of a quantum computer to solve a specific task faster than any classical computer is usually referred to as quantum advantage. Although quantum algorithms that provide exponential speedup over classical ones are known, they are hard to implement in practice. Examples of such algorithms include Shor’s algorithm for factoring integers [
1], which runs in polynomial time, whereas all known classical algorithms require super-polynomial time. Modern quantum computers are far from experimentally demonstrating quantum advantage on basic problems like integer factorization.
Boson sampling [
2] is a problem that was proposed as a good candidate for demonstrating quantum advantage, since sampling from its output distribution is believed to be classically hard. A boson sampler is a linear-optical device consisting of non-classical sources of indistinguishable photons, a multichannel interferometer that mixes the photons from different sources, and photon detectors at the output channels of the interferometer. In the original proposal, the indistinguishable photons were prepared in Fock states. The problem is then to calculate the photon statistics after the interferometer, given an input state and the interferometer matrix. The relevant parameters are the number of modes
N and the total number of photons injected in the interferometer
M. Experimentally, it corresponds to performing multiple measurements of the photon counts at the outputs of such a device [
3].
Due to the technological complexity of generating Fock states, several variants of the original boson sampling problem have been proposed. They aim at improving the photon generation efficiency and increasing the scale of implementations. One such example is scattershot boson sampling, which uses many parametric down-conversion (PDC) sources to improve the single-photon generation rate. It has been implemented experimentally using a 13-mode integrated photonic chip and six PDC photon sources [
4].
Another variant is Gaussian boson sampling (GBS) [
5,
6], in which Gaussian states are injected into the interferometer instead of Fock states. Gaussian input states can be generated using PDC sources, which allows the non-classical input states to be prepared deterministically. In this variant, the relative phases of the input photons can affect the sampling distribution. Experiments have been carried out with
N = 12 [
7],
N = 100 [
8] and
[
9,
10], with up to 255 photons registered in one event. The latter implementations used PPKTP crystals as PDC sources and employed an active phase-locking mechanism to ensure a coherent superposition.
Any experimental setup, of course, differs from the idealized model considered in theoretical treatments. Bosonic samplers suffer from two fundamental types of imperfections. First, the parameters of a real device, such as the reflection coefficients of the beam splitters and the phase rotations, are never known exactly. A small change in the interferometer parameters can affect the sampling statistics drastically, so that modeling the ideal device is no longer meaningful. The second type of imperfection is photon loss. Losses occur because of imperfections in photon preparation, absorption inside the interferometer, and imperfect detectors and coupling.
There are different ways of modeling losses: for example, by introducing extra beam splitters [
11] or replacing the interferometer matrix by a combination of lossless linear optics transformations and the diagonal matrix that contains transmission coefficients [
12]. In the algorithm described in this paper, we assume that losses occur at the inputs of the interferometer; the exact loss model is described below.
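The decomposition described in [12] can be illustrated numerically: any subunitary transfer matrix factors, via a singular value decomposition, into two lossless (unitary) transformations surrounding a diagonal matrix of transmission coefficients. A minimal numpy sketch under this interpretation (the 4-mode matrix and the uniform 0.8 amplitude transmission are illustrative choices, not values from any experiment):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random lossless interferometer (a Haar-like unitary via QR).
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
q, _ = np.linalg.qr(a)

# A lossy transfer matrix: uniform 80% amplitude transmission.
m = 0.8 * q

# SVD: m = v @ diag(sigma) @ wh, with v and wh unitary (lossless optics)
# and sigma holding the per-channel transmission coefficients.
v, sigma, wh = np.linalg.svd(m)

assert np.allclose(v @ np.diag(sigma) @ wh, m)   # exact factorization
assert np.all(sigma <= 1 + 1e-12)                # physical transmissions
assert np.allclose(v.conj().T @ v, np.eye(4))    # v is unitary
```

For a uniformly lossy matrix, all singular values coincide with the overall transmission, which makes the factorization easy to verify.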
Imperfections make intermediate-scale systems, in general, easier to emulate with classical computers [
13]. It was shown [
14] that as the losses in a system increase, the complexity of the task decreases. When the number of photons
that arrive at the outputs is less than
√M, the boson sampling problem can be solved efficiently on classical computers. On the other hand, if the losses are low, the problem remains hard for classical computers [
15].
In this paper, we propose a classical algorithm for calculating the probabilities of output states in a GBS problem. The algorithm uses a Taylor series expansion, and its convergence rate depends on the parameters of the problem, namely the amount of loss in the system and the squeezing parameter of the input states. The higher the losses in the system, the fewer orders of the series are needed to approximate the probability of observing a given output state.
The work by Oh et al. [
16] used the following approach to simulating GBS: the covariance matrix of the output Gaussian state was decomposed into “quantum” and “classical” parts, where the “quantum” part was simulated using matrix product states and the “classical” part was simulated by random displacements. Thus, when the photon loss rate is high, the computational complexity of this algorithm is reduced.
The algorithm that we propose in this paper uses some similar ideas: namely, the zeroth order of the Taylor series may be considered the “classical” part that is computed quite easily, while the remaining terms are the “quantum” part that is more computationally complex. The contribution of this “quantum” part is smaller when the losses in the system are high; thus, our algorithm also has optimal conditions that depend on the magnitude of losses. We also analyze some recent GBS implementations to compare the conditions in those experiments with the optimal conditions for our algorithm.
2. Problem Specification
Let us first consider a lossless linear-optics interferometer with a transmission matrix
U:
where creation operators acting on the
i-th input and output modes are denoted
and
. Suppose the input modes are injected with single-mode squeezed states:
where we omit the state’s normalizing constant
.
The goal is to calculate the probability of detecting
photons in the first output mode,
photons in the second output mode and so on. This probability can be calculated in the following way:
where
is the density matrix of the output state.
Modeling Losses
In real-life bosonic samplers, there will always be losses. Here, we will model them by substituting
where
acts on a mode that we cannot observe, and
,
. Now, the goal is to compute the same probability (
) but taking losses into account. The input state will now be
and we now take partial trace over all loss modes when calculating the density matrix:
3. Algorithm Derivation
Let us consider a single mode:
3.1. Calculating Partial Trace
We start by applying the Hubbard–Stratonovich transformation [
17,
18]
to both exponents in the density matrix operator. This gives us the following:
Let us redefine
,
for convenience:
We can now calculate the partial trace over loss modes:
The following expression can be simplified:
The density matrix now can be written in the following way:
3.2. Switching between Probability Density Functions
We can view this integral as taking an expected value over a
-dimensional normal distribution.
and
then become normally distributed random variables with a mean vector equal to zero. Their covariance matrix has the following form:
Then, we can write
where
denotes averaging over the
-dimensional normal distribution
.
The expression is troublesome to calculate, since there are two different variables and . We want to arrive somehow at an expression with only one such variable, i.e., , which we will denote .
We now will choose normally distributed random variables
such that
and the distributions over
and
have the same moments:
We have some freedom in choosing these variables; we will set
so that
and
. Then, the covariance matrix
of
will be determined by one parameter
:
Note that
. We will later find an optimal way to choose
h. The density matrix in terms of the new variables
is
Since
and
, the distribution
can be split into a combination of distributions over
and over
. The covariance matrix
of
and
is
and the distribution
can be written as
3.3. Taylor Series Expansion
We now consider the Taylor series of the expression
, leaving only
in the exponent:
Each term in the expression will be proportional to
and since
and
, the integral over
can be written as a product of integrals over
and
:
The moments
can be calculated analytically using Wick’s probability theorem.
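Wick’s probability theorem expresses any higher moment of a zero-mean Gaussian vector as a sum over perfect pairings of products of covariances. A small illustrative implementation (recursive and exponential in the moment order, which is acceptable for the low orders used here):

```python
import numpy as np

def gaussian_moment(cov, idx):
    """E[x_{i1} x_{i2} ...] for a zero-mean Gaussian x with covariance cov,
    computed via Wick's theorem as a sum over perfect pairings."""
    if len(idx) == 0:
        return 1.0
    if len(idx) % 2 == 1:
        return 0.0  # odd moments of a zero-mean Gaussian vanish
    first, rest = idx[0], list(idx[1:])
    total = 0.0
    for k in range(len(rest)):
        # pair `first` with rest[k], recurse on the remaining indices
        total += cov[first][rest[k]] * gaussian_moment(cov, rest[:k] + rest[k + 1:])
    return total

cov = np.array([[2.0, 0.5], [0.5, 1.0]])
# E[x0^4] = 3 * Var(x0)^2 = 3 * 4 = 12
assert abs(gaussian_moment(cov, [0, 0, 0, 0]) - 12.0) < 1e-12
# E[x0^2 x1^2] = Var0 * Var1 + 2 * Cov01^2 = 2 + 0.5 = 2.5
assert abs(gaussian_moment(cov, [0, 0, 1, 1]) - 2.5) < 1e-12
```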
3.4. Choosing
The idea consists of minimizing the “perturbation parameter” so that each subsequent order of the Taylor series expansion has less impact on the expression. Since higher orders of the expansion contain higher powers of and higher moments , and these moments can be calculated via second moments and , the role of the “perturbation parameter” is played by .
Let us consider the conditions that must be satisfied by
h. Firstly,
h must satisfy
, because
and
. Secondly, since
is a covariance matrix, its eigenvalues must be non-negative. The eigenvalues of
are
,
and
. Thus,
h needs to satisfy
The minimum of is realized when .
3.5. Multimode Case
Let us apply the steps described above to the case of
N modes. We start with an input state
We construct a density matrix and take the partial trace over loss modes:
We apply the Hubbard–Stratonovich transformation
times, resulting in an integral over
:
where the integral for each variable
and
is calculated over
.
Again, we redefine
,
:
We compute partial trace over loss modes:
This expression can now be considered as taking an expected value over a
-dimensional normal distribution where all variable pairs
are independent. Every variable pair
has covariance matrix
, and we can write this expression in the following way:
For each variable pair
we now choose
in a way that is described above. Then,
We now consider the Taylor series expansion (up to the second order) of the expression in the square brackets, which we will denote
:
The creation operators
that act on the input modes can be written in terms of the operators
that act on the output modes:
We can expand the brackets in the expression for
, leaving the terms up to the second order:
When we take the product of these two expressions, most of the resulting terms will have zero expected value because of the properties of the normal distribution. Then
The integrals over
,
result in specific moments of the distribution, and the integral over
can be calculated using Monte Carlo methods. The final expression is
where by Wick’s probability theorem
.
3.6. Calculating Traces
In order to calculate
, we need to be able to calculate expressions
,
,
,
, etc., for different
. The first one can be calculated fairly easily:
Now, suppose we need to calculate
. First, we note that
where by, e.g.,
we mean
.
4. Algorithm Overview
The goal of the algorithm is to calculate the probability of a state , given , , c, s and U. We assume that the Taylor series expansion is calculated up to the desired order before computation starts. The integrals over and should also be computed in advance (this can be done analytically via Wick’s probability theorem).
We start by calculating the two-variable covariance matrix using and s. We then select in the way specified above so that it minimizes the series expansion parameter . In order to compute the integrals over , we sample for each i from a normal distribution .
We then compute , which by linearity amounts to computing traces of the form described above; for each sample, only a polynomial number of operations is needed.
Finally, we take an average over our samples and multiply by the necessary constant .
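Structurally, the sampling stage is a plain Monte Carlo estimate of an expectation over a multivariate normal distribution. A generic sketch of this structure follows; the integrand is a hypothetical stand-in for the truncated-series term, not the paper’s actual expression:

```python
import numpy as np

def mc_expectation(integrand, cov, n_samples, seed=0):
    """Estimate E[f(z)] for z ~ N(0, cov) by averaging over samples."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(cov)  # maps N(0, I) draws to N(0, cov)
    dim = cov.shape[0]
    total = 0.0
    for _ in range(n_samples):
        z = chol @ rng.standard_normal(dim)
        total += integrand(z)
    return total / n_samples

# Sanity check: E[z0^2] should approach Var(z0) = 4.
cov = np.array([[4.0, 0.0], [0.0, 1.0]])
est = mc_expectation(lambda z: z[0] ** 2, cov, n_samples=200_000)
assert abs(est - 4.0) < 0.1
```

In the actual algorithm, the integrand would evaluate the truncated Taylor series for a sampled vector, and the average is then multiplied by the overall normalizing constant.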
5. Taylor Series Convergence for Actual Experimental Conditions
We have discussed above that the role of the “perturbation parameter” in the series expansion is played by , which we can choose to be equal to . This parameter depends on the experimental conditions (i.e., the squeezing parameter of the input state and the loss level ). The smaller this parameter is, the faster the series converges. Thus, the best conditions for this algorithm are achieved when the loss level is high and the squeezing parameter is low. Let us consider actual experimental implementations of the Gaussian boson sampling problem and estimate how small this parameter is under those conditions.
Let us consider the relation between and the average number of photons per state . If the squeezing parameter is , then , while .
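For reference, for a single-mode squeezed vacuum state with squeezing parameter r, the standard relation for the mean photon number is ⟨n⟩ = sinh²(r). Assuming this relation holds for the input states, the squeezing required for a given mean photon number per mode can be inverted directly:

```python
import numpy as np

def mean_photons(r):
    # Mean photon number of a single-mode squeezed vacuum with squeezing r.
    return np.sinh(r) ** 2

def squeezing_for(n_mean):
    # Inverse of the relation above: r = arcsinh(sqrt(<n>)).
    return np.arcsinh(np.sqrt(n_mean))

# e.g. a per-mode photon number of 0.86, roughly the level in the
# experiments discussed below.
r = squeezing_for(0.86)
assert abs(mean_photons(r) - 0.86) < 1e-12
```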
In a paper by Zhong et al. [
8], 25 PPKTP crystals were used to produce 25 two-mode squeezed states, which is equivalent to 50 single-mode squeezed states. The average number of photons registered by the detectors was 43. Thus, the average amount of photons per mode
is around
;
,
. The average collection efficiency is said to be
. Then,
.
In another paper by Zhong et al. [
9], the average number of photons produced was increased to 70 at maximum pump intensity. This corresponds to
. The overall transmission rate in the experiment is said in the paper to be
and
for different settings, so we take
. This yields
.
In the most recent experiment by Deng et al. [
10], the average number of photons was increased even further, measuring states with
,
and
photons on average with different pump intensities while still producing 25 two-mode squeezed states. The efficiency of the setup is said to be
, yielding
,
and
, respectively.
To estimate the expected accuracy of the algorithm, we can assume that the numerical values of each order are approximately equal, meaning that we can write
where
denotes the sum of all the terms of the
k-th order, and
is assumed. The expression then becomes a geometric progression with common ratio
. Then, on average, the 0-th order contributes to the probability a part equal to
, while the second order contributes
, the fourth contributes
, etc.
Calculating up to the second order then discards a total contribution of , which is approximately 0.18² = 3.2%, 0.14² = 1.96%, 0.11² = 1.21% and 0.12² = 1.44% for the conditions analyzed above. When the calculation is performed up to the fourth order, the lost contribution is approximately 0.18³ ≈ 0.58%, 0.14³ ≈ 0.27%, 0.11³ ≈ 0.13% and 0.12³ ≈ 0.17%.
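Under the geometric-progression assumption, these discarded contributions follow directly from the estimated common ratios; a quick check, assuming ratios of 0.18, 0.14, 0.11 and 0.12 for the four experimental settings:

```python
# Fraction of the probability discarded when truncating a geometric
# series with common ratio eps after the second / fourth order.
for eps in [0.18, 0.14, 0.11, 0.12]:
    after_second = eps ** 2
    after_fourth = eps ** 3
    print(f"eps={eps}: ~{after_second:.2%} discarded / ~{after_fourth:.2%} discarded")

assert abs(0.18 ** 2 - 0.0324) < 1e-12   # ~3.2%
assert abs(0.18 ** 3 - 0.005832) < 1e-12 # ~0.58%
```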
The conclusion that we draw is that even in large GBS experiments which are said to demonstrate quantum advantage, the conditions are such that is fairly small, and calculating up to the fourth order is enough for the lost contribution to be below 1%.
6. Implementation Details
6.1. Contraction Precomputation
We can rewrite it as
where
is a contraction of
U with itself. It depends only on
U and can be calculated before sampling
, which reduces the number of operations required to calculate each probability sample.
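As an illustration of this kind of precomputation (the specific contraction below is a hypothetical example, not the paper’s exact tensor), a contraction of U with itself can be evaluated once with numpy’s einsum and then reused for every sample:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
u = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Precompute once, before any sampling:
# c[i, j] = sum_k u[k, i] * conj(u[k, j])   (illustrative contraction)
c = np.einsum("ki,kj->ij", u, u.conj())

# Equivalent naive triple loop, for verification only.
c_loop = np.zeros((n, n), dtype=complex)
for i in range(n):
    for j in range(n):
        for k in range(n):
            c_loop[i, j] += u[k, i] * np.conj(u[k, j])

assert np.allclose(c, c_loop)
```

The einsum version is evaluated once per problem instance, so its cost is amortized over all probability samples.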
6.2. Factorial Fractions Precomputation
In calculating traces of the form described above, we need to calculate factorial fractions of the form , where . Since the target state is fixed, .
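Factorial fractions n!/m! with m ≤ n can be precomputed without ever forming large factorials, by multiplying only the integers in (m, n] and storing the results; a small sketch (the table bound n_max is an illustrative choice):

```python
import math

def factorial_ratio(n, m):
    """n! / m! for m <= n, without computing either factorial in full."""
    result = 1
    for k in range(m + 1, n + 1):
        result *= k
    return result

# Precompute a lookup table for all 0 <= m <= n <= n_max, since the
# target state is fixed before sampling begins.
n_max = 20
table = {(n, m): factorial_ratio(n, m)
         for n in range(n_max + 1) for m in range(n + 1)}

assert table[(5, 2)] == 60                  # 5!/2! = 120/2
assert table[(20, 0)] == math.factorial(20)
```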
6.3. Reusing
While calculating each trace, we can calculate
only once for each
sample and then reuse it, thus using fewer operations to calculate each trace. Let us denote
;
. Then,
and
7. Complexity Analysis
7.1. Precomputation
In this section, we will analyze the computational complexity of precomputation. By precomputation we mean the calculations that need to be carried out only once before
sampling and before calculating probability samples for each
. The multiplicative constant
can be calculated with
multiplication operations. For each term in the resulting sum, we will define its order to be the number of variables
and
, or, equivalently, the power of the loss parameter
c. Thus, the term
will be of the second order. Then, each term of the order
K will have a contraction of the form
where some of the
can be conjugated. This leaves at most
different ways to conjugate the factors. Each contraction has
K free indices, and calculating the sum requires
additions and
multiplications. The total number of additions is
and the number of multiplications is
, where
K is the maximum order we choose to calculate.
Calculating all for requires only around multiplications, since , , , …, .
7.2. Probability Sample Computation
Here, we will analyze the computational complexity of calculating a single probability sample given . We will assume that the terms are calculated up to some order K.
Calculating the trace
requires one multiplication of an
matrix by a
N-dimensional vector,
N exponentiation operations and
multiplication operations. This calculation needs to be performed only once for each
. Calculating any other trace of the form
requires
exponentiation operations and
multiplication operations (since factorial fractions are precomputed).
The number of terms for a given order K is times the number of different non-zero K-th order moments . The exact number is hard to calculate, but the total number of moments (including those that are zero) is . Thus, the maximum number of terms required to compute is .
Since the number of operations required to calculate each term is , the total computational complexity of calculating a probability sample for a given is .
8. Results
Below are the results of probability calculation for for different output states. The calculated probabilities are compared to exact solutions. The parameters are , . The number of samples is 4096.
These results show that for calculating a single output state probability accurately, the number of samples needs to be on the order of
. Below are the results of using fewer samples per state, but instead of comparing individual probabilities, we look at the cosine similarity between the exact and approximated probability distributions over all two-photon states,
Figure 1,
Figure 2 and
Figure 3.
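The cosine similarity used in these comparisons is the standard one, treating the two probability distributions as vectors:

```python
import numpy as np

def cosine_similarity(p, q):
    """Cosine of the angle between two probability vectors p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

# Illustrative vectors (not data from the paper's figures).
exact = np.array([0.50, 0.30, 0.20])
approx = np.array([0.48, 0.33, 0.19])
assert abs(cosine_similarity(exact, exact) - 1.0) < 1e-12
assert 0.99 < cosine_similarity(exact, approx) <= 1.0 + 1e-12
```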
The above graph suggests that the number of samples per state needed to approximate the distribution does not depend much on
N. It is computationally hard to check this when comparing to the exact solution, but if we assume that the cosine similarity converges to a value close to 1, we can estimate how quickly it converges. Below, we look at the cosine similarity between a distribution calculated using
H samples per state and a distribution calculated with
samples per state for different
H, where we choose
. This allows us to estimate how much the distribution changes with
new samples: if the cosine similarity is close to 1, then new samples do not alter the distribution significantly.
Figure 4 suggests more strongly that the number of samples per state required for accurate approximation is not really influenced by
N. This can be explained by the fact that the number of
-photon states increases with
N. If the number of samples per state is constant, then the total number of samples increases with
N.
Below are benchmark results that show the average precomputation time, which depends only on
N, and the time per sample, which depends on
N and the amount of photons
M in the target state,
Figure 5 and
Figure 6.
These results show that even GBS instances with many modes can be simulated on an average laptop using this algorithm.
A direct comparison of the performance of our method with other published results is problematic, since our algorithm calculates the probability of output states rather than directly sampling states from some approximate distribution. The closest algorithm is described in [
16]. However, the performance comparison is still problematic because the error bars of the two methods cannot be compared directly. Nevertheless, judging by Figure 9 from [
16], our algorithm is more memory-efficient, since it uses about 0.1 GB of memory to run in conditions where the transmission rate is 0.5 and the number of modes is 40, whereas the algorithm [
16] uses 1 TB, and the memory requirement grows exponentially with the number of modes.
9. Conclusions
In this paper, we have presented a new algorithm for the approximate calculation of the probability of observing a given output state in a Gaussian boson sampling instance. We have discussed various implementation details that help to reduce the number of operations needed to calculate each probability sample. We have also analyzed the total computational complexity, both of the calculations that need to be carried out once for each specific problem and of computing each probability sample.
This algorithm relies on the Taylor series expansion where the “perturbation” parameter is dependent on the problem conditions. The algorithm consists of calculating the terms of this Taylor series up to some finite order. For a fixed maximum order, the computational complexity of the algorithm is polynomial in N.
We have demonstrated that increasing the maximum order does increase the accuracy of the answer. We have also measured the precomputation and sampling time on a regular CPU, showing that even large instances of Gaussian boson sampling can be solved in reasonable time.
We have considered recent GBS experiments and estimated the parameters of the problem under those conditions. We conclude that the contribution of the terms discarded when the calculation is completed up to the second order is a few percent at most, and if the calculation is completed up to the fourth order, this number drops below 1%.