1. Introduction
The security of Elliptic Curve Cryptosystems is based on the difficulty of solving the discrete logarithm problem in an elliptic curve group. It seems more difficult to deal with the problem for solving discrete logarithm in
than in
. The key agreement represents the protocol in which two or more parties together generate a secret key using a public channel [
1,
2,
3]. For instance, better security can be achieved in Diffie–Hellman Key exchange by choosing a suitable elliptic curve in
than in
when
p has 512 bits. The efficiency of the optimization for elliptic curve cryptosystems relies on the speed of the operations in the elliptic curve, whose core operation is the point addition. The efficient algorithms for elliptic curve cryptography are classified into high-level algorithms and low-level algorithms, i.e., group operations of elliptic curves and arithmetic operations in the fundamental finite field. Obviously, both of the above two-level operations should be optimized in order to realize the elliptic curve cryptosystem effectively.
With the intrinsic advantages in executing certain matrix multiplication operations, quantum algorithms are proposed to enhance data analysis techniques under some circumstances [
4]. The first paper to discuss in detail how to use a quantum algorithm to solve the elliptic curve discrete logarithm problem is by Proos and Zalka [5]. Building on this study, in 2017 Rötteler, Naehrig, Svore, and Lauter presented concrete quantum resource estimates and an explicit quantum circuit for the point additions used to solve the discrete logarithm problem in elliptic curves over prime fields $\mathbb{F}_p$
[
6].
While there is some common ground between the prime-field case and the characteristic-two case, there are also important differences. Elliptic curves over finite fields $\mathbb{F}_{2^n}$ play a prominent role in modern cryptography. Published quantum algorithms dealing with such curves build on a short Weierstrass form in combination with affine or projective coordinates. Amento, Rötteler, and Steinwandt use projective coordinates to avoid divisions [7]. They need only 13 multiplications per step, which would result in
as the leading term in their Toffoli gate count if the multiplications were implemented using the space-efficient quantum Karatsuba multiplication [
8]. Amento et al. show in their paper [7] that the choice of how to represent the elements of $\mathbb{F}_{2^n}$ can have a significant impact on the resource requirements for quantum arithmetic. In particular, they show how Gaussian normal basis representations and “ghost-bit basis” representations can be used to implement inverters with a quantum circuit of depth $O(n \log n)$. This is the first construction to compute inverses in $\mathbb{F}_{2^n}$ with subquadratic depth reported in the literature.
The quantum circuit of computing inverse in
in [
7] is based on the Itoh–Tsujii algorithm [
9] which exploits the property that, in a normal basis representation, squaring corresponds to a permutation of the coefficients. Because the map
is a bijection in
, it corresponds to an
n by
n nonsingular matrix, and all the elements in the matrix belong to
. Then, using an LUP-decomposition of this matrix, the needed exponentiation can be realized with
CNOT gates in depth
. For
they define
. Then the goal is to find
from
. For this they exploit that
for all
. Thus, in a polynomial basis representation, one evaluation of
can be realized in depth
using
Toffolis and
CNOT gates.
However, this use of projective coordinates has two disadvantages. First, they use many ancillary qubits and separate input and output qubits, leading to
qubits in one point-addition step even with space-efficient quantum Karatsuba multiplications. Second, projective coordinates have a much larger space disadvantage not pointed out in Ref. [
7]. Furthermore, Ref. [
7] does not specify the entirety of Shor’s algorithm, leaving open how exactly the presented results would be combined.
Building on the Karatsuba multiplier, the multiplication algorithm presented by Ref. [
8] can be realized using
Toffoli gates and
qubits, which has been exploited by Ref. [
10]. However, the method of Ref. [8] has a disadvantage: it requires a large number of CNOT gates, namely
.
The number of qubits and the connectivity between qubits in practical quantum devices are limited by the noisy environment. However, the resource costs have not been discussed in Refs. [
5,
6,
7,
8,
9,
10] when the quantum bit connectivity is limited. We discuss the quantum circuit optimization for solving discrete logarithm of elliptic curve in
, obeying the nearest-neighbor constraint. It has been shown that when operating a CNOT gate between two qubits, the number and the depth of CNOT gates needed are determined by the distance between the two qubits. Therefore, the number and the depth of CNOT gates needed in the elementary operations (such as additions, binary shifts, multiplications, and squarings) for point additions are dominated by the arrangement of qubits. In this paper, we treat division by a field element as multiplication by the inverse of that element, and the inversion step is based on Fermat’s little theorem (i.e., it uses the Itoh–Tsujii algorithm to compute the inverse). With the help of the Steiner tree problem reduction in Refs. [
11,
12], we optimize the number of CNOT gates included in the point addition on binary elliptic curves under a constrained connectivity. The optimized size of the CNOTs is
, where
is the minimum degree of the connected graph. Based on this, of the two division algorithms, the FLT-based algorithm keeps a similar number of Toffoli gates and qubits while mitigating the disadvantage reported in Ref. [10], where it requires roughly twice as many CNOT gates as the GCD-based algorithm.
2. Materials and Methods
Each addition in $\mathbb{F}_2$ takes one CNOT gate. The addition of two polynomials of degree at most $n-1$ takes $n$ CNOT gates with depth 1. Considering the connectivity of qubits [13], four CNOT gates are needed to perform a CNOT gate between the first qubit and the third qubit, as shown in Figure 1. Eight CNOT gates are needed to perform a CNOT gate between the first qubit and the fourth qubit, as shown in Figure 2. Therefore, $4(n-2)$ CNOT gates are needed to perform a CNOT gate between the first qubit and the $n$-th qubit.
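As a plain-Python sanity check of these counts, the sketch below builds one possible nearest-neighbor decomposition of a CNOT between qubit 0 and qubit d on a line and verifies it by tracking classical bits over $\mathbb{F}_2$. The function names and the particular gate ordering are illustrative assumptions and are not taken from Ref. [13]. The construction uses $4(d-1)$ gates for $d \ge 2$, which agrees with the four and eight gates quoted above for $d = 2$ and $d = 3$.

```python
from itertools import product


def nn_cnot_sequence(d):
    """Nearest-neighbor CNOTs (control i, target i+1) that together act as
    a single CNOT with control 0 and target d on a line of d+1 qubits."""
    if d < 1:
        raise ValueError("distance must be at least 1")
    # First pass: leaves qubit i holding x_i XOR x_0 for every i in 1..d.
    first = list(range(d - 1, 0, -1)) + [0] + list(range(1, d))
    # Second pass: restores the intermediate qubits 1..d-1.
    second = []
    if d >= 2:
        second = list(range(d - 2, 0, -1)) + [0] + list(range(1, d - 1))
    return [(i, i + 1) for i in first + second]


def apply_cnots(bits, gates):
    """Apply CNOT gates to a classical bit vector (target ^= control)."""
    bits = list(bits)
    for control, target in gates:
        bits[target] ^= bits[control]
    return bits


def is_long_range_cnot(d):
    """Check the sequence against CNOT(0 -> d) on every basis state."""
    gates = nn_cnot_sequence(d)
    for bits in product([0, 1], repeat=d + 1):
        expected = list(bits)
        expected[d] ^= expected[0]
        if apply_cnots(bits, gates) != expected:
            return False
    return True


print(len(nn_cnot_sequence(2)), len(nn_cnot_sequence(3)))  # 4 8
```

Since a CNOT circuit is a linear map over $\mathbb{F}_2$, checking all classical basis states is enough to confirm that the gate sequence implements the intended operation.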
Let the connectivity of qubits corresponding to the coefficients of
be:
Then the number of and the depth of CNOT gates needed in the addition of
and
are still
n and 1, respectively. When these qubits are arranged in the following order
the number of and the depth of CNOT gates needed in the addition of
and
are
and
, respectively.
For polynomials in $\mathbb{F}_2[x]$, multiplication by $x$ is a shift of the coefficient vector. This requires no quantum computation, since it can be done with a series of swaps. In a finite field
we want to multiply a polynomial
of degree at most
by
x then by a modular reduction by a fixed irreducible weight-
degree-
n polynomial
. In general, we let
be 3 or 5. As
is irreducible, it always has coefficient 1 for
, so after a reduction by
that qubit will be 1 and if no reduction takes place that qubit will be 0, which means the modular shift algorithm is always reversible. Considering the connectivity of qubits, when the Hamming weight of
is
and
(
), we let the connectivity of qubits corresponding to the coefficients of
be:
Then the number of and the depth of CNOT gates needed in multiplying
by
x then by a modular reduction by
are still
n and 1, respectively. When these qubits are arranged in the following order
the number of and the depth of CNOT gates needed in multiplying
by
x then by a modular reduction by
are
and
, respectively.
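The reversibility of the modular shift can be illustrated classically. The sketch below stores a polynomial as a Python integer (bit i is the coefficient of $x^i$) and uses the trinomial $x^4 + x + 1$ purely as an assumed example modulus; the names `mul_x_mod` and `div_x_mod` are illustrative. The inverse relies on the observation made above: the constant coefficient after the step records whether a reduction took place.

```python
def mul_x_mod(f, m, n):
    """Multiply f by x and reduce modulo m, where deg(m) = n and deg(f) < n."""
    f <<= 1                 # the shift: every coefficient moves up one place
    if (f >> n) & 1:        # a term x^n appeared, so one reduction is needed
        f ^= m
    return f


def div_x_mod(g, m, n):
    """Undo mul_x_mod. Assumes m has constant coefficient 1."""
    if g & 1:               # constant term 1 means a reduction happened
        g ^= m              # undo the reduction (restores the x^n term)
    return g >> 1           # undo the shift


M = 0b10011                 # x^4 + x + 1 (assumed example)
N = 4
print(bin(mul_x_mod(0b1000, M, N)))  # 0b11, i.e. x * x^3 = x + 1
```

Each reduction flips only the coefficients where the modulus is nonzero, which is why a low-weight modulus (a trinomial or pentanomial) keeps the number of CNOT gates per shift small.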
When the Hamming weight of
is
and
(
), let the connectivity of qubits corresponding to the coefficients of
be:
or
or
or
or
or
Then the number of and the depth of CNOT gates needed in multiplying by x then by a modular reduction by are 4 and 3, respectively. When these qubits are arranged in the following order , the number of and the depth of CNOT gates needed in multiplying by x then by a modular reduction by are at most and , respectively. The number of and the depth of CNOT gates are at least and , respectively.
For multiplication, if we use a space-efficient Karatsuba algorithm by Van Hoof, we will need CNOT gates, Toffoli gates, and total qubits: qubits for the input , and n separate qubits for the output . In a multiplication, most CNOT gates are needed in the processes of multiplying by or where k has values and each process need CNOT gates. In the quantum algorithm for the division we have to use up to multiplications, so (i.e., CNOT gates will be needed in the quantum algorithm for a division. If we take the constrained connectivity into consideration, at most (i.e., CNOT gates will be needed.
If the irreducible polynomial is fixed to a trinomial (weight 3) or a pentanomial (weight 5), each multiplying by or will need about CNOT gates. Then we use up to multiplications in the quantum algorithm for the division. Therefore, only about CNOT gates are needed in the quantum algorithm for a division. When the constrained connectivity is taken into consideration, at most CNOT gates will be needed.
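For orientation, the Karatsuba recursion itself can be sketched classically for polynomials over $\mathbb{F}_2$. This is the textbook recursion on Python integers (bit i is the coefficient of $x^i$), not Van Hoof's space-efficient quantum circuit; the function names and the base-case threshold are assumptions made for this illustration. Modular reduction is left out.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two polynomials over F_2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def karatsuba_gf2(a, b, n):
    """Product of two polynomials with fewer than n coefficients each."""
    if n <= 4:                       # small inputs: fall back to schoolbook
        return clmul(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h        # a = a0 + a1 * x^h
    b0, b1 = b & mask, b >> h        # b = b0 + b1 * x^h
    low = karatsuba_gf2(a0, b0, h)
    high = karatsuba_gf2(a1, b1, n - h)
    mid = karatsuba_gf2(a0 ^ a1, b0 ^ b1, n - h)
    # Over F_2 subtraction is XOR, so the cross term is mid + low + high.
    return low ^ ((mid ^ low ^ high) << h) ^ (high << (2 * h))
```

Only three half-size products are needed per level instead of four. The additions that combine them are XORs, which correspond to CNOT gates in a quantum implementation.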
Take for example the irreducible polynomial
, based on which the finite field
can be constructed. The quantum circuit of the space-efficient Karatsuba algorithm by Van Hoof is shown in Figure 3:
The simulation is run on the IBM T-like graph (T65). The topological structure of IBM T65 is depicted below:
To optimize the number and the depth of CNOT gates while preserving a similar number of Toffoli gates and qubits, we adopt the implementation of a Toffoli gate shown in Figure 4, which was proposed in Ref. [14]. If we take the constrained connectivity into consideration, 812 CNOT gates will be needed in the quantum circuit for the space-efficient Karatsuba algorithm by Van Hoof.
Because the map is a bijection in , we can think of squaring in as a circuit that replaces the input with the result. To square and replace the input, we make use of the fact that squaring is a linear map and we can write that map as an n by n matrix. Using an LUP-decomposition, we get a lower triangular, upper triangular, and permutation matrix, which can be translated into a circuit consisting of at most CNOT gates and a number of swaps. In the quantum algorithm for the division we have to use up to squarings, so CNOT gates will be needed in the quantum algorithm for a division. If we take the constrained connectivity into consideration, at most CNOT gates will be needed.
If the irreducible polynomial is fixed to a trinomial (weight 3) or a pentanomial (weight 5), each squaring will need about CNOT gates. Then we use up to squarings in the quantum algorithm for the division. Therefore, only about CNOT gates are needed in the quantum algorithm for a division. When the constrained connectivity is taken into consideration, at most CNOT gates will be needed.
Take for example the irreducible polynomial , based on which the finite field can be constructed. The quantum circuit of the squaring for a polynomial in needs 5 CNOT gates. If we take the constrained connectivity into consideration, 8 CNOT gates will be needed.
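The view of squaring as an invertible linear map over $\mathbb{F}_2$ can be made concrete with a short sketch. It assumes a polynomial basis, stores each matrix column as a Python integer, and uses $x^4 + x + 1$ only as an assumed example modulus; the function names are illustrative. Column j of the matrix is $x^{2j}$ reduced modulo the field polynomial.

```python
def squaring_matrix(m, n):
    """Columns of the squaring map in F_2[x]/(m), deg(m) = n.
    Column j is x^(2j) mod m, stored as an integer bit vector."""
    cols = []
    for j in range(n):
        c = 1 << (2 * j)
        for d in range(2 * j, n - 1, -1):   # clear degrees 2j down to n
            if (c >> d) & 1:
                c ^= m << (d - n)
        cols.append(c)
    return cols


def apply_square(f, cols):
    """Square f by XOR-ing the columns selected by its coefficients."""
    r = 0
    for j, col in enumerate(cols):
        if (f >> j) & 1:
            r ^= col
    return r


cols = squaring_matrix(0b10011, 4)   # x^4 + x + 1 (assumed example)
print(cols)                          # [1, 4, 3, 12]
print(apply_square(0b0011, cols))    # 5, i.e. (x + 1)^2 = x^2 + 1
```

Because the map is a bijection, the matrix is nonsingular, so it can be factored and translated into a sequence of CNOT gates and swaps acting in place on the input register.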
3. Results
Fermat’s little theorem can be extended for binary finite fields to where n is the degree of . With the help of squarings, this can be calculated in n multiplications and squarings: . Itoh and Tsujii give an improvement to this straightforward method to reduce the cost to below log multiplications and squarings. The Itoh–Tsujii algorithm works as follows:
- (1)
Write as with and . Note that t is the Hamming weight of in binary and log and log;
- (2)
Calculate with multiplications, and save the intermediate results , ;
- (3)
Calculate using multiplications;
- (4)
Square the result to get
. In total,
multiplications are needed for the inversion
. The quantum circuit of computing
is shown in
Figure 5.
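The steps above can be sketched classically as follows. This is the standard textbook formulation of the Itoh–Tsujii inversion, written under assumptions: polynomials are Python integers in a polynomial basis, and the symbols `f` (the element), `m` (the field polynomial), and `k1` (the index of the top bit of $n-1$) are names chosen for this illustration. The function also counts the field multiplications, which for this formulation equals `k1 + t - 1`, with `t` the Hamming weight of $n-1$.

```python
def gf_mul(a, b, m, n):
    """Multiply a and b in F_2[x]/(m), deg(m) = n (integers as bit vectors)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= m
    return r


def itoh_tsujii_inverse(f, m, n):
    """Return (f^(-1), number of multiplications) in F_2[x]/(m), n >= 2."""
    if f == 0:
        raise ZeroDivisionError("0 has no inverse")

    def square_times(a, count):
        for _ in range(count):
            a = gf_mul(a, a, m, n)   # squarings are not counted as multiplications
        return a

    e = n - 1
    k1 = e.bit_length() - 1
    mults = 0
    # Step (2): b[k] = f^(2^(2^k) - 1) for k = 0..k1, one multiplication each.
    b = [f]
    for k in range(k1):
        b.append(gf_mul(square_times(b[k], 1 << k), b[k], m, n))
        mults += 1
    # Step (3): combine the saved values along the binary expansion of n - 1.
    r = b[k1]
    for k in range(k1 - 1, -1, -1):
        if (e >> k) & 1:
            r = gf_mul(square_times(r, 1 << k), b[k], m, n)
            mults += 1
    # Step (4): one final squaring gives f^(2^n - 2) = f^(-1).
    return gf_mul(r, r, m, n), mults


inv, mults = itoh_tsujii_inverse(0b0010, 0b10011, 4)
print(gf_mul(0b0010, inv, 0b10011, 4), mults)  # 1 2
```

The combination step relies on the identity $(f^{2^a-1})^{2^b} \cdot f^{2^b-1} = f^{2^{a+b}-1}$, which is why each multiplication is preceded by a run of squarings.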
Therefore, Toffoli gates and ancillary qubits are needed for the division in the quantum case. The total number of logical qubits required for the division is .
The classic algorithm for the inversion uses squarings and the quantum algorithm for the division has to use up to .
The quantum circuits for the squarings, and for the multiplications by or inside the multiplications, contain only CNOT gates; these CNOT circuits account for a large number of CNOT gates.
For a graph
with
n vertices, without loss of generality, we assume that the degree of vertices are denoted as
. A theorem has been given by Bujiao Wu et al. in [
12], which optimizes the size of CNOTs.
Given a set of terminals and a connectivity graph, the algorithm performs breadth-first search outwards from each of the terminals. When the paths collide, the nodes along that path consolidate into a single node and all the edges adjacent to the consolidated nodes are placed adjacent to this new node. The process is restarted with this node as a new terminal. From many trials, it seems that this approximation is sufficient to see a large reduction in the CNOT count of the output circuit. The choice of Steiner tree approximation algorithm for this purpose depends on the user’s efficiency and performance requirements.
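To give a feel for how a Steiner tree approximation works on a connectivity graph, here is a minimal sketch of a related but simpler heuristic: it grows a tree by repeatedly running a breadth-first search from the current tree until it reaches the nearest unconnected terminal, then adds that path. This is not the merge-on-collision procedure described above, and the function name and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def steiner_tree_approx(adj, terminals):
    """Approximate Steiner tree on an unweighted graph.
    adj maps each vertex to a list of neighbors; returns a set of edges,
    each edge a frozenset of its two endpoints."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:]) - tree_nodes
    while remaining:
        # Multi-source BFS starting from every vertex already in the tree.
        parent = {v: None for v in tree_nodes}
        queue = deque(tree_nodes)
        hit = None
        while queue and hit is None:
            u = queue.popleft()
            for w in adj[u]:
                if w not in parent:
                    parent[w] = u
                    if w in remaining:
                        hit = w
                        break
                    queue.append(w)
        if hit is None:
            raise ValueError("terminals are not connected in the graph")
        # Walk back from the terminal to the tree and add the path.
        v = hit
        while parent[v] is not None:
            tree_edges.add(frozenset((v, parent[v])))
            tree_nodes.add(v)
            v = parent[v]
        remaining -= tree_nodes
    return tree_edges


line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(len(steiner_tree_approx(line, [0, 4])))  # 4
```

On a line the tree must include every vertex between the two terminals, so its size grows with their distance; on better connected graphs the tree can be much smaller.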
It follows that the optimized size in Theorem 1 is asymptotically tight for a nearly regular graph.
Theorem 1. Given a connected graph G(V,E) with , then there is a polynomial-time algorithm to construct an equivalent
size CNOT circuit for any n-qubit CNOT circuit on the topological graph G, and some invertible matrices require a CNOT circuit of size at least
. The proof of Theorem 1 can be found in [
12]. Let
for any given CNOT circuits with
n qubits under a constrained connectivity, in which
is the minimum degree of the connected graph. Then it can be easily shown that the sum of degrees for any
k vertices is greater than
n. Therefore, we obtain CNOT circuits that have
CNOT gates. Due to the lower bound of the size of CNOT gates being
for any CNOT circuits on a connected graph [
15], the bound
is tight for a regular graph. Let
, then the size and the depth of CNOT gates needed in the quantum algorithm for the division will be cut in half.