1. Introduction
Asymmetric cryptosystems are traditionally built on a mathematical function which is hard to invert unless a special parameter is known. Typically, such a function is known as a mathematical trapdoor, and the parameter acts as the private key of the asymmetric cryptosystem. Decoding a random linear block code was first proven to be equivalent to solving an instance of the three-dimensional matching problem by Berlekamp et al. in 1978 [
1]. By contrast, efficient decoding algorithms for well-structured codes have long been available. Therefore, McEliece proposed to disguise an efficiently decodable code as a random-looking one and to employ the knowledge of the efficiently decodable representation as the private key of an asymmetric cryptosystem. In this way, a legitimate user of the cryptosystem is able to employ an efficient decoder for the chosen hidden code, while an attacker is forced to resort to decoding techniques for a generic linear code.
Since the original proposal, a significant number of variants of the McEliece cryptosystem have been proposed, swapping the original decodable code choice (Goppa codes [2]) with other efficiently decodable codes with the intent of enhancing computational performance or reducing the key size. The first attempt in this direction was the Niederreiter cryptosystem [
3] using generalized Reed–Solomon (GRS) codes. While the original proposal by Niederreiter was broken by Sidel’nikov and Shestakov in [
4], replacing the hidden code with a Goppa code in Niederreiter’s proposal yields a cryptosystem which is currently unbroken. More recently, other families of structured codes have been considered in this framework, such as Quasi Cyclic (QC) codes [
5], Low Density Parity Check (LDPC) codes [
6], Quasi Dyadic (QD) codes [
7], Quasi Cyclic Low Density Parity Check (QC-LDPC) codes [
8] and Quasi Cyclic Moderate Density Parity Check (QC-MDPC) codes [
9].
The significant push in the development of code-based cryptosystems has been accompanied by a comparably sized research effort in their cryptanalysis. In particular, the best attack technique that does not rely on the underlying hidden code structure, and thus is applicable to all the variants, is known as Information Set Decoding (ISD). In a nutshell, ISD attempts to find enough error-free positions in a codeword to be able to decode it regardless of the errors affecting it. Such a technique was first proposed by Prange [10] as a more efficient alternative for decoding a general linear block code than a straightforward guess of the error-affected positions. Since then, a significant number of improvements to Prange's original technique have been proposed [
11,
12,
13,
14,
15,
16], effectively providing significant polynomial speedups on the exponential-time decoding task. In addition to the aforementioned works, whose focus is to propose an operational description of a general decoding technique based on information set decoding, the works by the authors of [
17,
18,
19] provide a more general view on generic decoding techniques, including split syndrome decoding and supercode decoding, and report proven bounds on the complexities of the said approaches. Finally, we report the work of Bassalygo et al. [20] as the first to formally tackle the complexity of decoding linear codes. For a more comprehensive survey of hard problems in coding theory, we refer the interested reader to [
21,
22,
23].
The common practice in the literature concerning ISD improvements is to evaluate the code parameters for the worst-case scenario of the ISD, effectively binding together the code rate and the number of corrected errors to the code length. Subsequently, the works analyze the asymptotic speedup as a function of the code length alone. While this approach is effective in showing an improvement in the running time of the ISD in principle, the practical relevance of the improvement when considering parameter sizes useful in cryptography may be less significant.
We note that, in addition to being the most efficient strategy to perform general random linear code decoding, ISD techniques can also be employed to recover the structure of the efficiently decodable code from its obfuscated version for the LDPC, QC-LDPC and QC-MDPC code families.
Recently, the National Institute of Standards and Technology (NIST) has started a selection process to standardize asymmetric cryptosystems resistant to attacks carried out with quantum computers. Since decoding a random code is widely believed to require an amount of time exponential in the number of errors, even in the presence of quantum computers, code-based cryptosystems are prominent candidates in the NIST selection process [24]. Hence, accurate and shared expressions for the work factor of attacks targeting such schemes in the finite-length regime, both in the classical and in the quantum computing setting, are important to define a common basis for their security assessment. A work sharing our intent is [
25], where a non-asymptotic analysis of some ISD techniques is performed. However, a comprehensive source of this type is not available in the literature, to the best of our knowledge.
1.1. Contributions
In this work, we provide a survey of the existing ISD algorithms, with explicit finite regime expressions for their spatial and temporal complexities. We also detail which free parameters have to be optimized for each of the ISD algorithms, and provide a software tool implementing the said optimization procedure on a given set of code parameters in [
26].
1.2. Paper Organization
This work is organized as follows.
Section 2 states the required notation and recollects the code-based cryptography background required to understand ISD algorithms.
Section 3 surveys the existing ISD algorithms, providing complexity estimates for both the space they require and their execution time.
Section 4 contains a critical discussion of the results obtained in the finite regime in comparison to the ones available via asymptotic estimates, while
Section 5 summarizes our conclusions.
2. Background on Computationally Intractable Coding Theory Problems
In this section, we introduce the notation and background on error correcting codes, and state the hard problems we focus on, for which the best known solvers are ISD algorithms.
In the following, we consider the case of binary linear block codes, denoting by C a code of length n, dimension k and minimum distance d. This code is thus a subspace of {0, 1}^n containing 2^k distinct vectors, and can be represented by a k × n binary matrix G known as the code generator matrix. It is commonplace to indicate with r = n − k the amount of redundant bits in an element of the code, i.e., a codeword. The minimum distance of a linear block code corresponds to the minimum weight of its codewords, apart from the null one (which, clearly, has null weight). An alternative representation is the one provided by the so-called parity-check matrix H, which is obtained through the algebraic constraint GH^T = 0. The parity-check matrix is thus an r × n binary matrix for which it is easy to show that the product Hc^T of a codeword c by H is a null r-bit-long vector. The quantity s = Hx^T obtained multiplying a generic n-bit vector x by H is called the syndrome of x through H. Note that, if the binary vector x is not a codeword, the syndrome of x through H is not null; in other words, we can write x = c + e, with c being a codeword and e a non-null vector, and thus have s = Hx^T = He^T ≠ 0.
Decoding x through C consists in finding the codeword c whose distance from x is minimum. The procedure of finding a length-n binary vector e with Hamming weight smaller than or equal to some integer t, such that He^T = s, given a non-null syndrome s and a parity-check matrix H, is known as syndrome decoding. The name stems from the fact that decoding a given x becomes possible by computing the syndrome s of x, solving the syndrome decoding problem, and adding the obtained vector e to x; this corresponds to finding the codeword which is at minimum distance from x.
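To fix ideas, the following minimal Python sketch (our own illustration, using numpy over GF(2) and a toy Hamming code parity-check matrix chosen only for the example) shows the syndrome relation and how recovering e yields the transmitted codeword.

```python
import numpy as np

# Toy [7,4] Hamming code parity-check matrix (r = 3, n = 7), used purely as an example.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

def syndrome(H, x):
    """Syndrome of the length-n row vector x through H, computed over GF(2)."""
    return (H @ x) % 2

c = np.array([1, 1, 0, 0, 1, 1, 0], dtype=np.uint8)   # a codeword: H c^T = 0
assert not syndrome(H, c).any()

e = np.array([0, 0, 0, 0, 1, 0, 0], dtype=np.uint8)   # weight-1 error vector
x = (c + e) % 2                                       # received vector
s = syndrome(H, x)                                    # equals H e^T, since H c^T = 0
assert np.array_equal(s, syndrome(H, e))
assert np.array_equal((x + e) % 2, c)                 # adding the recovered e gives back c
```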
We can now recall the two decisional problems in coding theory which were proven to be NP-Complete by Berlekamp et al. in [
1], and from which the main building blocks of the cryptographic trapdoors of code-based cryptosystems are derived.
Statement 1 (Coset weights problem). Given a random r × n binary matrix H, an r-bit vector s and a positive integer t, determine whether an n-bit vector e with wt(e) ≤ t, where wt(·) is the Hamming weight function, such that He^T = s exists.
The Coset weights problem is also known as the Decisional Syndrome Decoding Problem (DSDP).
Statement 2 (Subspace weights problem). Given a random r × n binary matrix H and a positive integer w, determine whether an n-bit vector c with wt(c) = w, where wt(·) is the Hamming weight function, such that Hc^T = 0 exists.
The Subspace weights problem is also known as the Decisional Codeword Finding Problem (DCFP).
The typical hard problems which are employed to build code-based cryptographic trapdoors are the search variants of the aforementioned two problems. Specifically, the Syndrome Decoding Problem (SDP) asks to
find an error vector of weight ≤
t, while the Codeword Finding Problem (CFP) asks to
find a codeword with a given weight
w. A well-known result in computational complexity theory (e.g., [27] Chap. 2.5) states that any decision problem belonging to the NP-Complete class has a search-to-decision reduction. In other words, it is possible to solve an instance of the search problem with a polynomial amount of calls to an oracle for the corresponding decision problem. This, in turn, implies that the difficulty of the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) is the same as that of solving their decisional variants, the Decisional Syndrome Decoding Problem (DSDP) and the Decisional Codeword Finding Problem (DCFP), i.e., they are as hard as an NP-Complete problem. We note that, in the light of such a reduction, and despite being an abuse of notation, it is commonplace to state that the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) are NP-Complete, although only decisional problems belong to the NP-Complete class.
2.1. Applications to Cryptography
The class of NP-Complete problems is of particular interest for the design of cryptosystems, as it is widely believed that problems contained in such a class cannot be solved in polynomial time by a quantum computer. Indeed, the best known approaches to solve both the Codeword Finding Problem (CFP) and the Syndrome Decoding Problem (SDP) have a computational complexity which is exponential in the weight of the codeword or error vector to be found. A notable example of a code-based cryptosystem relying on the hardness of the Syndrome Decoding Problem (SDP) is the one proposed by Niederreiter in [
3].
The Niederreiter cryptosystem generates a public–private key-pair selecting as the private key an instance of a code from a family for which efficient decoding algorithms are available. The code is then represented by its parity-check matrix H, which is multiplied by a rank-r random square binary matrix S, obtaining the public key of the cryptosystem H' = SH. The assumption made by Niederreiter is that the multiplication by the random, full-rank matrix S makes H' essentially indistinguishable from a random parity-check matrix. While the original choice to employ a Reed–Solomon code as the private code was found to falsify this assumption and lead to a practical attack, other code families have proven to be good candidates (e.g., Goppa codes, Low/Medium Density Parity Check codes [28,29,30,31]). A message is encrypted in the Niederreiter cryptosystem encoding it as a fixed-weight error vector e and computing its syndrome through H', i.e., s = H'e^T, which acts as the ciphertext. The owner of the private key (S, H) is able to decipher the ciphertext by first obtaining S^{-1}s = He^T and subsequently performing the syndrome decoding of S^{-1}s employing H.
It is easy to note that, under the assumption that H' is indistinguishable from a random parity-check matrix, an attacker willing to perform a Message Recovery Attack (MRA) must solve an instance of the Syndrome Decoding Problem (SDP). We note that, as proven by Niederreiter [3], the Syndrome Decoding Problem (SDP) is computationally equivalent to the problem of correcting a bounded amount of errors affecting a codeword, when given a random generator matrix G of the code. Such a problem goes by the name of Decoding Problem, and is the mainstay of the original cryptosystem proposal by McEliece [32]. In such a scheme, the ciphertext thus corresponds to the sum between a codeword of the public code, obtained as mG, with m being a length-k vector, and a vector e of weight t. The message can either be encoded into e or into m; in the latter case, the Message Recovery Attack (MRA) is performed by searching for the error vector e and, subsequently, by adding it to the intercepted ciphertext. We point out that this search can be automatically turned into the formulation of the Syndrome Decoding Problem (SDP), by first computing a valid parity-check matrix H from G and then by trying to solve the Syndrome Decoding Problem (SDP) on the syndrome of the intercepted ciphertext through H.
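As an illustration of this reduction, a minimal Python sketch follows (assuming, for brevity, a systematic generator matrix G = [I_k | A]; the function names are ours and not taken from any reference implementation).

```python
import numpy as np

def parity_check_from_generator(G):
    """For a systematic generator matrix G = [I_k | A], a valid parity-check
    matrix is H = [A^T | I_r], since G H^T = A + A = 0 over GF(2)."""
    k, n = G.shape
    r = n - k
    A = G[:, k:]                                   # k x r block
    return np.concatenate([A.T % 2, np.eye(r, dtype=np.uint8)], axis=1)

def mceliece_ciphertext_to_sdp(G, x):
    """Turn a McEliece MRA on the ciphertext x = mG + e into an SDP instance:
    the syndrome of x through H equals the syndrome of e, because H G^T = 0."""
    H = parity_check_from_generator(G)
    return H, (H @ x) % 2                          # feed (H, s, t) to any ISD solver
```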
One of the most prominent cases where the Codeword Finding Problem (CFP) appears in code-based cryptosystems is represented by a Key Recovery Attack (KRA) against Niederreiter cryptosystems where the private parity-check matrix contains rows with a known low weight w. Indeed, in such a case, considering the public matrix H' as the generator matrix of the dual code, solving the Codeword Finding Problem (CFP) for such a code reveals the low-weight rows of the private matrix H. We note that such a Key Recovery Attack (KRA) is in the same computational complexity class as the Syndrome Decoding Problem (SDP), assuming that the obfuscation of H makes it indistinguishable from a random matrix.
Two notable cases where solving the Codeword Finding Problem (CFP) is the currently best known method to perform a Key Recovery Attack (KRA) are the LEDAcrypt [
33] and BIKE [
34] proposals to the mentioned NIST standardization effort for post-quantum cryptosystems. Since such a Codeword Finding Problem (CFP) can also be seen as the problem of finding a binary vector c with weight w such that Hc^T = 0, the problem is also known as the Homogeneous Syndrome Decoding Problem, as it implies the solution of a simultaneous set of linear equations similar to the Syndrome Decoding Problem (SDP), save for the syndrome being set to zero.
2.2. Strategies to Perform MRA
As described in the previous section, the security of code-based cryptosystems relies on the hardness of solving Syndrome Decoding Problem (SDP) or Codeword Finding Problem (CFP) instances. In this section, we analyze the case of the Syndrome Decoding Problem (SDP) and show that the optimal strategy to perform a Message Recovery Attack (MRA) depends on the relation between the actual parameters of the instance under analysis. In particular, in the cases where
t is above the Gilbert–Varshamov (GV) distance [
35], the Generalized Birthday Algorithm (GBA) is the best currently known algorithm for solving the Syndrome Decoding Problem (SDP) [
36,
37]. However, for the cases we consider in this paper, practical values of
t are significantly smaller than the GV distance; in such cases, the best known methods to solve Syndrome Decoding Problem (SDP) go by the name of Information Set Decoding (ISD) algorithms. Such algorithms are aimed at lessening the computational effort required in the guesswork of an exhaustive search for the unknown error vector
e of weight
t, given a syndrome and a parity check matrix. We point out that it is also possible to adapt all Information Set Decoding (ISD) algorithms, save for the first one proposed by Prange [
10], to solve the Codeword Finding Problem (CFP), as a consequence of the structural similarity of the two problems.
All Information Set Decoding (ISD) algorithms share a common structure where an attempt at retrieving the error vector corresponding to a given syndrome is repeated, for a number of times whose average value depends on the success probability of the single attempt itself. The complexity of all Information Set Decoding (ISD) variants can be expressed as the product between the complexity of each attempt, which we denote as c_iter, and the average number of required attempts. In particular, such a value can be obtained as the reciprocal of the success probability of each attempt, which we denote as Pr_succ; thus, when considering a code with length n, redundancy r and Hamming weight of the sought error bounded to t, we generically denote the time complexity of obtaining one solution of the Syndrome Decoding Problem (SDP) employing the Information Set Decoding (ISD) variant at hand as C_ISD(n, r, t) = c_iter / Pr_succ.
As we show in the following, the work factor of a Message Recovery Attack (MRA) performed through Information Set Decoding (ISD) may actually depend on the system parameters; to this end, we first exploit the following well-known result. Let C be a linear binary code with length n, dimension k and minimum distance d, and let H be a parity-check matrix for C. Let s be a length-r binary vector and t be an integer ≤ n; then, if t ≤ ⌊(d − 1)/2⌋, there is at most one vector e of weight t such that He^T = s.
Thus, when H is the parity-check matrix of a code with minimum distance d ≥ 2t + 1, then solving the Syndrome Decoding Problem (SDP) guarantees that the found error vector corresponds to the one that was used to encrypt the message. In this case, the attack work factor corresponds to C_ISD(n, r, t).
However, when t > ⌊(d − 1)/2⌋, the time complexity of a Message Recovery Attack (MRA) needs to be derived through a different approach. Indeed, in such a case, the adversary has no guarantee that the output of Information Set Decoding (ISD) corresponds to the error vector that was actually used in the encryption phase. Thus, the work factor of a Message Recovery Attack (MRA) cannot be simply taken as the time complexity of the chosen Information Set Decoding (ISD) algorithm. Let s be the syndrome corresponding to the intercepted ciphertext and e be the searched error vector, i.e., He^T = s. We define N_e as the number of weight-t vectors whose syndrome through H equals s.
Clearly, N_e corresponds to the number of valid outputs that Information Set Decoding (ISD) can produce when applied on the syndrome s corresponding to e. In such a case, the probability that an Information Set Decoding (ISD) iteration will not find any valid error vector can be estimated as (1 − Pr_succ)^{N_e}; thus, one attempt of Information Set Decoding (ISD) will succeed with probability 1 − (1 − Pr_succ)^{N_e}. In particular, the algorithm will randomly return a vector among the set of the admissible ones: thus, the probability that the obtained vector corresponds to e is 1/N_e.
To obtain a closed-form expression for the attack work factor, we can consider the average value of N_e, which we obtain by averaging over all the possible vectors e of weight t and length n, and denote it with N. The attack work factor can then be computed starting from the success probability of a single iteration, corrected to account for the presence of N admissible solutions and for the probability 1/N that the returned solution is the desired one.
We point out that, for the cases we analyze in this paper, we have $\binom{n}{t} \ll 2^{r}$, so that N ≈ 1. Thus, from now on, we assume N = 1, i.e., that the time complexity of performing a Message Recovery Attack (MRA) is equal to that of running an Information Set Decoding (ISD) algorithm in the case in which a unique solution exists.
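This assumption can be checked numerically with the standard counting estimate N ≈ 1 + (C(n, t) − 1)/2^r; the sketch below uses parameters in the range of Goppa-code-based proposals, purely as an illustration.

```python
from math import comb, log2

def expected_solutions(n, r, t):
    """Expected number of weight-t vectors sharing a given syndrome, assuming
    the 2^r syndromes are hit uniformly: 1 (the planted one) plus C(n,t)/2^r."""
    return 1 + (comb(n, t) - 1) / 2**r

# Illustrative parameters (n, k, t): t is far below the GV distance,
# so the planted error vector is essentially the unique solution.
n, k, t = 3488, 2720, 64
print(log2(comb(n, t)), n - k)           # ~460, well below r = 768
print(expected_solutions(n, n - k, t))   # ~1.0
```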
3. A Finite Regime Analysis of Information Set Decoding Techniques
In the following, we report an analysis of the best known variants of Information Set Decodings (ISDs) and their execution on a classical computer, namely the ones proposed by Prange [
10], Lee and Brickell [
11], Leon [
12], Stern [
13], Finiasz and Sendrier [
14], May, Meurer and Thomae [
15], and Becker, Joux, May and Meurer [
16]. For the sake of clarity, we describe the Information Set Decoding (ISD) algorithms in their syndrome decoding formulation, highlighting for the first variant amenable to dual use, i.e., Lee and Brickell's, how to adapt the technique to the Codeword Finding Problem (CFP). For all these algorithms, we provide finite-regime time complexities and space complexities, with the aim of analyzing the actual computational effort and memory resources needed to solve both the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) on instances with cryptographically sized parameters. We also report lower bounds on the complexities of the execution of Prange's, Lee and Brickell's and Stern's variants of the Information Set Decoding (ISD) on a quantum computer, allowing an evaluation of the corresponding computational efforts.
We provide the exact formulas for the time complexity of ISD variants as a function of the code length n, the code dimension k and the number of errors t. We note that the ISD algorithms having the best asymptotic time complexity are also characterized by an exponential space complexity, which may significantly hinder their efficiency or make their implementation impractical. In particular, we also analyze the computational cost of such algorithms with a logarithmic memory access cost criterion. Indeed, the logarithmic access cost criterion is the one which better fits scenarios where the spatial complexity of an algorithm is more than polynomial in its input size, therefore resulting in a non-negligible cost for the memory accesses.
In the reported formulas, we employ the O(·)-notation simply to remove the need to specify the computing architecture- or implementation-dependent constants.
3.1. Prange
Prange’s algorithm [10] is the first known variant of ISD, based on the idea of guessing a set I of k error-free positions in the error vector e to be found in the Syndrome Decoding Problem (SDP). For this purpose, the columns of H are permuted so that those indexed by I are packed to the left. This operation is equivalent to the multiplication of H by an appropriately sized permutation matrix P. The column-reordered matrix Ĥ = HP is hence obtained, which can be put in Reduced Row Echelon Form (RREF), with the identity matrix placed to the right, i.e., [V | I_r]. If turning Ĥ in Reduced Row Echelon Form (RREF) is not possible, as the rightmost r × r submatrix is not full-rank, a different permutation is picked. The same transformation U required to bring Ĥ in Reduced Row Echelon Form (RREF) is then applied to the single-bit rows of the column syndrome vector s, obtaining s̄ = Us. If the weight of the permuted error vector ê obtained as ê = (0 ⋯ 0 | s̄^T), where 0 ⋯ 0 is the all-zero vector of length k, matches the expected error weight t, then the algorithm succeeds and the non-permuted error vector e = êP^T is returned. A pseudo-code description of Prange's ISD algorithm is provided in Algorithm 1.
Algorithm 1: Syndrome decoding formulation of Prange’s ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; s̄: an r-bit long binary column vector; V: an r × k binary matrix.
1  repeat
2      repeat
3          P ← RandomPermutationGen(n)
4          Ĥ ← HP                               // the corresponding error vector is ê = eP
5          ⟨[V | W], s̄⟩ ← RedRowEchelonForm(Ĥ, s)   // W should be the r × r identity
6      until W = I_r
7      ê ← (0 ⋯ 0 | s̄^T)
8      e ← êP^T
9  until weight(e) = t
10 return e
Proposition 1 (Computational complexity of Algorithm 1).
Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, the complexity of finding the row error vector e with length n and weight t such that He^T = s with Algorithm 1 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1–9 and the computational requirements c_iter of executing the loop body. In particular, the time complexity is O(c_iter / Pr_succ), with Pr_succ = $\binom{r}{t} / \binom{n}{t}$. The spatial complexity is O(rn).
Proof. The loop body of Algorithm 1 is dominated by the cost of finding an information set and validating it through checking that the matrix W is indeed an identity, i.e., that the corresponding r × r submatrix of Ĥ has full rank.
Note that, in an r × r binary matrix, the first row has a probability of 2^{−r} of being linearly dependent from itself (i.e., zero); the second row has a probability of 2^{−(r−1)} of being linearly dependent (i.e., zero or equal to the first). With an inductive argument, we obtain that the r-th row has a probability of 2^{−1} of being linearly dependent from the previous ones. We thus have that the probability of having all the rows independent from one another is $\prod_{i=1}^{r}(1 - 2^{-i})$.
We thus have that the column permutation (Line 4), which can be performed in O(n) operations by keeping only the permuted column positions, and the Reduced Row Echelon Form (RREF) transformation (Line 5), with cost C_RREF, have to be repeated on average $\left(\prod_{i=1}^{r}(1-2^{-i})\right)^{-1}$ times, yielding the first addend of the computational cost c_iter. The cost C_RREF is derived considering the Reduced Row Echelon Form (RREF) as an iterative algorithm performing as many iterations as the rank of the identity matrix in the result (i.e., r in this case). Each iteration proceeds to find a pivot, swaps it with the current row in Ĥ, and adds the pivot row to all the remaining rows which have a one in the current column, for an overall cost of O(nr^2) binary operations. The second addend of the cost is constituted by the computational complexity of computing e = êP^T, which is O(n). The total cost of computing an iteration, c_iter, is the sum of these two addends.
Pr_succ is obtained as the number of permuted error vectors with the error-affected positions fitting the hypotheses made by the algorithm, divided by the number of all the possible error vectors. This fact holds for all ISD algorithms. In the case of Prange's ISD, the permuted error vectors admissible by the hypotheses are $\binom{r}{t}$, as all the error-affected positions should be within the last r bits of the permuted error vector, while the number of possible error vectors is $\binom{n}{t}$. □
For the sake of clarity, from now on, we denote as ISextract the procedure computing [V | I_r], s̄ and P, performed on Lines 2–6 of Algorithm 1, with computational time complexity c_IS (as in Equation (6)) and space complexity O(rn).
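As a concrete companion to Algorithm 1 and to the cost analysis above, the following Python sketch (our own, unoptimized rendition using numpy; helper names are not taken from any reference implementation) reproduces the permutation, RREF and weight-check steps.

```python
import numpy as np

def rref_right_gf2(M, r, n):
    """Row-reduce the r x (n+1) augmented matrix [H_perm | s] over GF(2) so that
    the rightmost r columns of H_perm become the identity. Returns (M, False)
    if the selected r x r block is singular."""
    M = M.copy()
    for i in range(r):
        col = n - r + i
        pivot = next((j for j in range(i, r) if M[j, col]), None)
        if pivot is None:
            return M, False
        M[[i, pivot]] = M[[pivot, i]]
        for j in range(r):
            if j != i and M[j, col]:
                M[j] ^= M[i]
    return M, True

def prange_isd(H, s, t, seed=0):
    """Prange's ISD: retry random column permutations until the transformed
    syndrome s_bar itself has weight t, i.e., all t errors fall outside the
    guessed information set (the k leftmost permuted positions)."""
    rng = np.random.default_rng(seed)
    r, n = H.shape
    k = n - r
    while True:
        perm = rng.permutation(n)
        M = np.concatenate([H[:, perm], s.reshape(r, 1)], axis=1).astype(np.uint8)
        M, ok = rref_right_gf2(M, r, n)
        if not ok:                      # the chosen r columns were not full rank
            continue
        s_bar = M[:, n]
        if s_bar.sum() == t:            # permuted error vector is (0 ... 0 | s_bar)
            e = np.zeros(n, dtype=np.uint8)
            e[perm[k:]] = s_bar         # undo the permutation
            return e
```

On cryptographically sized parameters the outer loop is, of course, not expected to terminate in practice; the sketch only mirrors the structure whose cost is analyzed above.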
3.2. Lee–Brickell
The ISD algorithm introduced by Lee and Brickell in [11] starts with the same initial operations as in Prange's, i.e., the computation of the Reduced Row Echelon Form (RREF) of Ĥ = HP and the derivation of the corresponding syndrome s̄ = Us. However, Lee and Brickell improved Prange's original idea by allowing p positions in the k selected in the error vector to be error-affected. These p remaining error positions are guessed. To verify the guess, Lee and Brickell exploit the identity [V | I_r]ê^T = s̄, where ê is split in two parts, ê = (ê_1 | ê_2), with ê_1 being k bits long and with weight p, and ê_2 being r bits long and with weight t − p. The identity is rewritten as ê_2^T = s̄ + Vê_1^T, from which follows the fact that s̄ + Vê_1^T must have weight t − p. Indeed, this condition is employed by the algorithm to check if the guess of the p positions is correct. The procedure is summarized in Algorithm 2.
Algorithm 2: Syndrome decoding formulation of Lee and Brickell’s ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ê: the error vector permuted by P; p: the weight of the first k bits of ê, 0 ≤ p ≤ t (p = 2 is proven optimal in [11]); s̄: an r-bit long binary column vector, equal to the syndrome of ê through [V | I_r]; V: an r × k binary matrix, whose j-th column is denoted by V_j.
1  repeat
2      ⟨[V | I_r], s̄, P⟩ ← ISextract(H, s)
3      for i ← 0 to $\binom{k}{p}$ − 1 do
4          I ← IntegerToCombination(i)          // I is a set of p distinct integers in {0, …, k − 1}
5          if weight(s̄ + Σ_{j∈I} V_j) = t − p then
6              ê ← (0 ⋯ 0 | (s̄ + Σ_{j∈I} V_j)^T)
7              foreach j ∈ I do
8                  ê_j ← 1
9              break
10 until weight(ê) = t
11 return e = êP^T
Proposition 2 (Computational complexity of Algorithm 2). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 2 requires an additional parameter p, 0 ≤ p ≤ t.
The time complexity of Algorithm 2 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1–10 and the computational requirements c_iter of executing the loop body. In particular, the time complexity is O(c_iter / Pr_succ), with Pr_succ = $\binom{k}{p}\binom{r}{t-p} / \binom{n}{t}$, where c_iter includes c_IS, as in Equation (6), and the cost C_dec of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the $\binom{k}{p}$ ones. The spatial complexity is O(rn). Proof. The probability of success of Lee and Brickell's ISD is obtained following the same line of reasoning employed for Prange's, thus dividing the number of admissible permuted error vectors, $\binom{k}{p}\binom{r}{t-p}$, by the number of the possible error vectors, $\binom{n}{t}$.
The cost of an iteration of Lee and Brickell's algorithm can be obtained as the cost of adding together p bit vectors of length r and the syndrome, i.e., O(pr) (Lines 5–6), multiplied by the number of such additions, i.e., $\binom{k}{p}$, as they constitute the body of the loop at Lines 4–9. Note that, in a practical implementation where the value of p is fixed, it is possible to avoid C_dec altogether, specializing the algorithm with a p-deep loop nest to enumerate all the weight-p, length-k binary vectors. □
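For illustration, one Lee and Brickell attempt on an already-extracted information set can be sketched as follows (a simplified rendition of the loop of Algorithm 2, with itertools.combinations standing in for the IntegerToCombination enumeration; V and s̄ are uint8 numpy arrays as produced, e.g., by the RREF sketch of Section 3.1).

```python
from itertools import combinations
import numpy as np

def lee_brickell_iteration(V, s_bar, t, p):
    """One Lee-Brickell attempt: V is the r x k left block of the row-reduced
    parity-check matrix, s_bar the transformed syndrome. Returns the permuted
    error vector (p ones in the information set, t-p in the residual) or None."""
    r, k = V.shape
    for idx in combinations(range(k), p):        # guess p error positions in the set
        residual = s_bar.copy()
        for j in idx:
            residual ^= V[:, j]                  # s_bar + sum of the p chosen columns
        if residual.sum() == t - p:              # remaining t-p errors sit on the right
            e_perm = np.zeros(k + r, dtype=np.uint8)
            for j in idx:
                e_perm[j] = 1
            e_perm[k:] = residual
            return e_perm
    return None
```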
3.3. Adapting Lee and Brickell to Solve CFP
The structure of Lee and Brickell's Information Set Decoding (ISD) allows employing substantially the same algorithm to solve the Codeword Finding Problem (CFP), given a parity-check matrix H as the representation of the code where a weight-w codeword c should be found. The line of reasoning to employ Lee and Brickell's Information Set Decoding (ISD) to solve the Codeword Finding Problem (CFP) is to note that, by definition, for any codeword c of the code represented by H we have that Hc^T = 0, i.e., a codeword multiplied by the parity-check matrix yields a null syndrome. As a consequence, writing the permuted codeword as ĉ = (ĉ_1 | ĉ_2), we have that [V | I_r]ĉ^T = 0. This implies that ĉ_2^T = Vĉ_1^T, which can be exploited as an alternative stopping condition to the one of Algorithm 2, yielding in turn Algorithm 3. The only remaining difference between the Syndrome Decoding Problem (SDP)-solving Lee and Brickell's ISD and the Codeword Finding Problem (CFP)-solving one is represented by the ISextract primitive, which no longer needs to compute a transformed syndrome s̄, as it is null. We thus have a small reduction in c_IS, which loses the additive term related to the transformation of the syndrome. We note that such a reduction is expected to have little impact in practice, as the dominant portion of the ISextract function is represented by the Reduced Row Echelon Form (RREF) computation. This in turn implies that solving the Syndrome Decoding Problem (SDP) on a code has practically the same complexity as finding a codeword of the same weight in the same code. Therefore, finding low-weight codewords in the code defined by a Niederreiter cryptosystem public key has an effort comparable to the one of performing syndrome decoding assuming an error with the same weight as the codeword to be found. Two families of codes which may be vulnerable to such an attack, unless the parameters are designed taking into account a Codeword Finding Problem (CFP) ISD, are the Low Density Parity Check (LDPC) and Moderate Density Parity Check (MDPC) codes. Indeed, such code families can be represented with a parity-check matrix with low-weight rows, and such a low-weight representation can be relied upon to perform efficient decoding, leading to an effective cryptosystem break. Indeed, we now show that, if a code can be represented by a low-weight parity-check matrix, the code will contain low-weight codewords. Without loss of generality, consider the private parity-check matrix as split in three portions, [A | B | C], with B a non-singular square block. We derive the corresponding generator matrix G and consider its bottom r rows. The product of such bottom rows by a suitable invertible matrix yields rows which are all valid codewords, as they are the result of a linear combination of rows of the generator matrix G. Moreover, given that the private parity-check matrix has low row and column weight by construction, the aforementioned codewords also have a low weight. This fact may thus allow an attacker to perform a Key Recovery Attack (KRA) retrieving the low-weight codewords and rebuilding the private parity-check matrix.
A different attack strategy for the same code families is to try and find codewords in the dual code with respect to the one represented by the parity-check matrix H. Such a code, by definition, sees H as a valid generator matrix, and thus makes it possible to directly reconstruct H by solving r instances of the Codeword Finding Problem (CFP) to obtain its r low-weight rows. Solving the Codeword Finding Problem (CFP) on the dual code implies that Algorithm 3 is called considering the aforementioned G matrix as a parity-check matrix. Thus, solving the Codeword Finding Problem (CFP) on the dual code has the complexity of a Codeword Finding Problem (CFP) instance on a code with length n, redundancy k, and codeword weight equal to the weight of the sought codeword of the dual code. Whether this strategy or the one of solving the Codeword Finding Problem (CFP) on the primal code is more advantageous depends on the code rate and on the values of the two codeword weights.
Algorithm 3: Codeword finding formulation of Lee and Brickell’s ISD.
Input: H: an r × n binary parity-check matrix; w: the weight of the codeword to be found.
Output: c: an n-bit codeword with weight(c) = w.
Data: P: an n × n permutation matrix; ĉ: the codeword permuted by P; p: the weight of the first k bits of ĉ, 0 ≤ p ≤ w; V: an r × k binary matrix, whose j-th column is denoted by V_j.
1  repeat
2      ⟨[V | I_r], P⟩ ← ISextract(H)
3      for i ← 0 to $\binom{k}{p}$ − 1 do
4          I ← IntegerToCombination(i)          // I is a set of p distinct integers in {0, …, k − 1}
5          if weight(Σ_{j∈I} V_j) = w − p then
6              ĉ ← (0 ⋯ 0 | (Σ_{j∈I} V_j)^T)
7              foreach j ∈ I do
8                  ĉ_j ← 1
9              break
10 until weight(ĉ) = w
11 return c = ĉP^T
3.4. Leon
The algorithm proposed by Leon in [12], reported in Algorithm 4, improves Lee and Brickell's Information Set Decoding (ISD) assuming that the contribution to the value of the first ℓ bits of the syndrome s̄ comes only from columns in V, i.e., there is a run of zeroes of length ℓ leading the final r bits of the permuted error vector ê = (ê_1 | ê_2), where ê_1 is k bits long and ê_2 is r bits long. We thus have that the expected situation after the permutation and RREF computation is that the identity [V | I_r]ê^T = s̄ holds with ê_2 assumed to have a run of ℓ zeroes in its first bits. Such an assumption will clearly reduce the success rate of an iteration, as not all the randomly chosen permutations will select columns having this property. However, making such an assumption allows performing a preliminary check of the value of the sum of the ℓ topmost bits only of each selected column of V. Indeed, such a sum should match the value of the corresponding ℓ topmost bits of s̄, because the ℓ leading null bits in ê_2 in turn nullify the contribution of the columns in the topmost ℓ rows of the identity matrix. Such a check (Line 5 in Algorithm 4) allows discarding a selection of the p columns from the ones of V earlier, saving addition instructions with respect to a full column check. The length ℓ of the run of zeroes should be picked so that the reduction in success probability is compensated by the gain in the speed of a single iteration.
Algorithm 4: Syndrome decoding formulation of Leon’s ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ê: the error vector permuted by P; p: the weight of the first k bits of ê, 0 ≤ p ≤ t; ℓ: length of the run of zeroes at the beginning of the last r bits of ê, 0 ≤ ℓ ≤ r; s̄: an r-bit long binary column vector, equal to the syndrome of ê through [V | I_r], whose first ℓ bits are denoted by s̄_up; V: an r × k binary matrix, whose j-th column is denoted by V_j, with V_up,j its first ℓ bits.
1  repeat
2      ⟨[V | I_r], s̄, P⟩ ← ISextract(H, s)
3      for i ← 0 to $\binom{k}{p}$ − 1 do
4          I ← IntegerToCombination(i)          // I is a set of p distinct integers in {0, …, k − 1}
5          if weight(s̄_up + Σ_{j∈I} V_up,j) = 0 then
6              if weight(s̄ + Σ_{j∈I} V_j) = t − p then
7                  ê ← (0 ⋯ 0 | (s̄ + Σ_{j∈I} V_j)^T)
8                  foreach j ∈ I do
9                      ê_j ← 1
10                 break
11 until weight(ê) = t
12 return e = êP^T
Proposition 3 (Computational complexity of Algorithm 4)
. Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 4 requires two additional parameters p, 0 ≤ p ≤ t, and ℓ, 0 ≤ ℓ ≤ r. The time complexity of Algorithm 4 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1–11 and the computational requirements c_iter of executing the loop body. In particular, the time complexity is O(c_iter / Pr_succ), with Pr_succ = $\binom{k}{p}\binom{r-\ell}{t-p} / \binom{n}{t}$, where c_iter includes c_IS, as in Equation (6), and the cost C_dec of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the $\binom{k}{p}$ ones. Note that, if the value of p is fixed, it is possible to avoid C_dec, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is O(rn). Proof. The success probability of an iteration of Leon's algorithm follows the same line of reasoning of Prange's and Lee and Brickell's, dividing the number of admissible permuted error vectors, $\binom{k}{p}\binom{r-\ell}{t-p}$, by the total one, $\binom{n}{t}$. The complexity of a single iteration is obtained considering that the loop at Lines 4–10 will perform $\binom{k}{p}$ iterations, where p vectors of length ℓ are added together, and, if the result is zero, a further addition of p bit vectors, each one of length r − ℓ, has to be performed. This further addition takes place with a probability of 2^{−ℓ}, as there are 2^ℓ possible values for the ℓ-bit sum and only the one matching the corresponding syndrome bits leads to the full check, under the assumption that the sums of ℓ-bit vectors are independent and uniformly distributed over all the ℓ-bit strings. □
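Leon's early rejection can be isolated in a small helper, sketched below under the same conventions as the previous snippets (the slicing of the first ℓ rows reflects the layout assumed above).

```python
import numpy as np

def leon_check(V, s_bar, idx, ell, t, p):
    """Leon's early rejection for a candidate set idx of p columns of V:
    first verify that the chosen columns cancel the leading ell bits of s_bar
    (cheap, ell-bit additions); only then compute the full r-bit residual and
    check that its weight is t - p. Returns the residual or None."""
    partial = s_bar[:ell].copy()
    for j in idx:
        partial ^= V[:ell, j]
    if partial.any():                  # leading ell bits do not cancel: reject early
        return None
    residual = s_bar.copy()
    for j in idx:
        residual ^= V[:, j]
    return residual if residual.sum() == t - p else None
```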
3.5. Stern
Stern’s algorithm, introduced in [13], improves Leon's (Algorithm 4) by employing a meet-in-the-middle strategy to find which set of p columns of V, restricted to their first ℓ bits, adds up to the first ℓ bits of the syndrome s̄. For the sake of clarity, consider V as split row-wise into V_up, made of its first ℓ rows, and V_down, made of the remaining r − ℓ rows, and the transformed syndrome as correspondingly split, s̄ = (s̄_up | s̄_down). Stern's strategy splits the p-sized set I of indexes of the columns of V whose ℓ-bit portions should add up to s̄_up into two p/2-sized ones, I_1 and I_2 (I = I_1 ∪ I_2). Stern's strategy mandates that all columns indexed by I_1 should be within the leftmost ⌊k/2⌋ ones of V, while the ones indexed by I_2 should be within the rightmost ⌈k/2⌉ ones. It then exploits the equality Σ_{j∈I_1} V_up,j = s̄_up + Σ_{j∈I_2} V_up,j to precompute the value of the left-hand side for all possible $\binom{\lfloor k/2 \rfloor}{p/2}$ choices of I_1, and store them into a lookup table L, together with the corresponding choice of I_1. The algorithm then enumerates all possible p/2-sized sets of indexes I_2, computing for each one s̄_up + Σ_{j∈I_2} V_up,j, and checking if the result is present in L. If this is the case, the algorithm has found a candidate pair (I_1, I_2) for which the equality holds, and thus proceeds to check whether weight(s̄ + Σ_{j∈I_1∪I_2} V_j) = t − p. This strategy reduces the cost of computing an iteration quadratically, at the price of increasing the number of iterations with respect to Lee and Brickell's approach, and of requiring a significant amount of space to store the lookup table L, which contains $\binom{\lfloor k/2 \rfloor}{p/2}$ elements. We note that Stern's variant of the ISD is the first one to exhibit non-polynomial memory requirements, due to the size of the list L, which should be memorized and looked up. Stern's algorithm is summarized in Algorithm 5.
Algorithm 5: Syndrome decoding formulation of Stern’s ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ê: the error vector permuted by P; p: the weight of the first k bits of ê, 0 ≤ p ≤ t; ℓ: length of the run of zeroes at the beginning of the last r bits of ê, 0 ≤ ℓ ≤ r; s̄: an r-bit long binary column vector, equal to the syndrome of ê through [V | I_r], split as (s̄_up | s̄_down); V: an r × k binary matrix, whose j-th column is denoted by V_j, with V_up,j its first ℓ bits; L: list of pairs (I_1, v), with I_1 a set of p/2 integer indexes between 0 and ⌊k/2⌋ − 1, and v an ℓ-bit binary column vector.
1  repeat
2      ⟨[V | I_r], s̄, P⟩ ← ISextract(H, s)
3      L ← ∅
4      for i ← 0 to $\binom{\lfloor k/2 \rfloor}{p/2}$ − 1 do
5          I_1 ← IntegerToCombination(i)        // I_1 is a set of p/2 distinct integers in {0, …, ⌊k/2⌋ − 1}
6          L ← L ∪ {(I_1, Σ_{j∈I_1} V_up,j)}
7      for i ← 0 to $\binom{\lceil k/2 \rceil}{p/2}$ − 1 do
8          I_2 ← IntegerToCombination(i)        // I_2 is a set of p/2 distinct integers in {⌊k/2⌋, …, k − 1}
9          if (I_1, s̄_up + Σ_{j∈I_2} V_up,j) ∈ L for some I_1 then
10             if weight(s̄ + Σ_{j∈I_1∪I_2} V_j) = t − p then
11                 ê ← (0 ⋯ 0 | (s̄ + Σ_{j∈I_1∪I_2} V_j)^T)
12                 foreach j ∈ I_1 do
13                     ê_j ← 1
14                 foreach j ∈ I_2 do
15                     ê_j ← 1
16                 break
17 until weight(ê) = t
18 return e = êP^T
Proposition 4 (Computational complexity of Algorithm 5). As for Algorithm 4, given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 5 requires two additional parameters p, 0 ≤ p ≤ t, and ℓ, 0 ≤ ℓ ≤ r.
The time complexity of Algorithm 5 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1–17 and the computational requirements c_iter of executing the loop body. In particular, the time complexity is O(c_iter / Pr_succ), with Pr_succ = $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}\binom{\lceil k/2 \rceil}{\lceil p/2 \rceil}\binom{r-\ell}{t-p} / \binom{n}{t}$, where c_iter includes c_IS, as in Equation (6), and the cost C_dec of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the possible ones. Note that, if the value of p is fixed, it is possible to avoid C_dec, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is $O\!\left(rn + \binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}\left(\frac{p}{2}\left\lceil\log_2 \frac{k}{2}\right\rceil + \ell\right)\right)$. Proof. The success probability of an iteration of Stern's algorithm follows the same line of reasoning of the previous ones, dividing the number of admissible permuted error vectors, $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}\binom{\lceil k/2 \rceil}{\lceil p/2 \rceil}\binom{r-\ell}{t-p}$, by the total one, $\binom{n}{t}$. The complexity of a single iteration is obtained considering that the loop at Lines 4–6 will compute $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}$ sums of p/2 vectors, each one ℓ bits in length. The loop at Lines 7–16 performs $\binom{\lceil k/2 \rceil}{\lceil p/2 \rceil}$ iterations, where p/2 vectors of length ℓ are added together, and the result is looked up in the table L. If the result is found, a further addition of bit vectors, each one r − ℓ bits long, has to be performed. This further addition takes place with a probability of $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor} / 2^{\ell}$, as all the possible values for the computed ℓ-bit sum are 2^ℓ, and only $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}$ of them are present in L.
The spatial complexity of Stern's algorithm is the result of adding together the space required for the operations on the H matrix (i.e., O(rn)) and the amount of space required by the list L, which is $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}$ elements long. Each element of the list takes $\frac{p}{2}\left\lceil\log_2 \frac{k}{2}\right\rceil$ bits to store the set of indexes, and ℓ bits to store the partial sum, yielding a total spatial cost for the list of $\binom{\lfloor k/2 \rfloor}{\lfloor p/2 \rfloor}\left(\frac{p}{2}\left\lceil\log_2 \frac{k}{2}\right\rceil + \ell\right)$ bits. □
The aforementioned temporal complexity is obtained assuming a constant memory access cost which, given the exponential amount of memory required, is likely to ignore a non-negligible amount of time spent performing memory accesses. Indeed, it is possible to take such a time into account by employing a logarithmic memory access cost model. Recalling that the address decoding logic for an n-element digital memory of any kind cannot have a circuit depth smaller than log_2(n), we consider that the operations involved in the computation of an iteration will require such an access, in turn obtaining a cost per iteration equal to c_iter multiplied by the binary logarithm of the size of the required memory.
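To make the meet-in-the-middle step concrete, the following Python sketch renders one Stern attempt with a dictionary playing the role of the lookup table L (our own simplified rendition, assuming p even; not an optimized implementation).

```python
from itertools import combinations
import numpy as np

def stern_iteration(V, s_bar, t, p, ell):
    """One Stern attempt on an already-extracted information set (V: r x k uint8
    matrix, s_bar: length-r uint8 syndrome, p assumed even). The p guessed error
    positions are split as p/2 per half of the information set; candidate halves
    are first matched on the leading ell bits via a lookup table, as in Algorithm 5."""
    r, k = V.shape
    half, p2 = k // 2, p // 2
    table = {}
    for idx1 in combinations(range(half), p2):                  # left-half partial sums
        key = tuple(np.bitwise_xor.reduce(V[:ell, list(idx1)], axis=1))
        table.setdefault(key, []).append(idx1)
    for idx2 in combinations(range(half, k), p2):               # right-half partial sums
        key = tuple(s_bar[:ell] ^ np.bitwise_xor.reduce(V[:ell, list(idx2)], axis=1))
        for idx1 in table.get(key, []):                         # ell-bit collision found
            cols = list(idx1) + list(idx2)
            residual = s_bar ^ np.bitwise_xor.reduce(V[:, cols], axis=1)
            if residual.sum() == t - p:                         # full weight check
                e_perm = np.zeros(k + r, dtype=np.uint8)
                e_perm[cols] = 1
                e_perm[k:] = residual
                return e_perm
    return None
```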
3.6. Finiasz–Sendrier
Finiasz and Sendrier in [14] proposed two improvements on Stern's Information Set Decoding (ISD) algorithm, obtaining Algorithm 6. The first improvement is represented by removing the requirement for the presence of a run of ℓ zeroes in the permuted error vector ê and allowing the p error bits to be guessed to be present also in that region of ê. Such an approach raises the success probability of an iteration. Following the fact that the p positions which should be guessed are picked among the first k + ℓ ones of the error vector, Finiasz and Sendrier computed only a partial Reduced Row Echelon Form (RREF) transformation, obtaining a smaller, (r − ℓ) × (r − ℓ), identity matrix in the lower rightmost portion of the transformed matrix, and leaving a zero submatrix on top of the identity. As a consequence, the cost of computing such a Reduced Row Echelon Form (RREF) is reduced accordingly.
Algorithm 6: Syndrome decoding formulation of Finiasz–Sendrier ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ê: the error vector permuted by P; p: the weight of the first k + ℓ bits of ê, 0 ≤ p ≤ t; ℓ: a free parameter, 0 ≤ ℓ ≤ r; s̄: an r-bit long binary column vector, equal to the syndrome of ê through the partially row-reduced matrix, split as (s̄_up | s̄_down) with s̄_up ℓ bits long; V: an r × (k + ℓ) binary matrix, whose j-th column is denoted by V_j, with V_up,j its first ℓ bits and V_down,j the remaining r − ℓ; L: list of pairs (I_1, v), with I_1 a set of p/2 integer indexes between 0 and ⌊(k + ℓ)/2⌋ − 1, and v an ℓ-bit binary column vector.
1  repeat
2      ⟨V, s̄, P⟩ ← ISextract(H, s)              // partial RREF, with an (r − ℓ) × (r − ℓ) identity in the bottom-right corner
3      L ← ∅
4      for i ← 0 to $\binom{\lfloor (k+\ell)/2 \rfloor}{p/2}$ − 1 do
5          I_1 ← IntegerToCombination(i)        // I_1 is a set of p/2 distinct integers in {0, …, ⌊(k + ℓ)/2⌋ − 1}
6          L ← L ∪ {(I_1, Σ_{j∈I_1} V_up,j)}
7      for i ← 0 to $\binom{\lceil (k+\ell)/2 \rceil}{p/2}$ − 1 do
8          I_2 ← IntegerToCombination(i)        // I_2 is a set of p/2 distinct integers in {⌊(k + ℓ)/2⌋, …, k + ℓ − 1}
9          if (I_1, s̄_up + Σ_{j∈I_2} V_up,j) ∈ L for some I_1 then
10             if weight(s̄_down + Σ_{j∈I_1∪I_2} V_down,j) = t − p then
11                 ê ← (0 ⋯ 0 | (s̄_down + Σ_{j∈I_1∪I_2} V_down,j)^T)
12                 foreach j ∈ I_1 do
13                     ê_j ← 1
14                 foreach j ∈ I_2 do
15                     ê_j ← 1
16                 break
17 until weight(ê) = t
18 return e = êP^T
Considering that the invertibility condition is required only for an (r − ℓ) × (r − ℓ) submatrix, we have that the expected number of repetitions of the partial RREF within ISextract is reduced accordingly for a use of the ISD to solve the Syndrome Decoding Problem (SDP), while the term accounting for the computation of the transformed syndrome is not present in case the method is employed to solve the Codeword Finding Problem (CFP).
Proposition 5 (Computational complexity of Algorithm 6). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 6 also requires two additional parameters p, 0 ≤ p ≤ t, and ℓ, 0 ≤ ℓ ≤ r.
The time complexity of Algorithm 6 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1–17 and the computational requirements c_iter of executing the loop body. In particular, the time complexity is O(c_iter / Pr_succ), with Pr_succ = $\binom{\lfloor (k+\ell)/2 \rfloor}{\lfloor p/2 \rfloor}\binom{\lceil (k+\ell)/2 \rceil}{\lceil p/2 \rceil}\binom{r-\ell}{t-p} / \binom{n}{t}$, where C_dec is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the possible ones. Note that, if the value of p is fixed, it is possible to avoid C_dec, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is $O\!\left(rn + \binom{\lfloor (k+\ell)/2 \rfloor}{\lfloor p/2 \rfloor}\left(\frac{p}{2}\left\lceil\log_2 \frac{k+\ell}{2}\right\rceil + \ell\right)\right)$. With a line of reasoning analogous to Stern's ISD, we consider the complexity of Finiasz and Sendrier's ISD with a logarithmic memory access cost, multiplying the cost of the iteration by the binary logarithm of the size of the required memory.
3.7. May–Meurer–Thomae
The variant of ISD proposed by May, Meurer and Thomae in [15] improves Finiasz and Sendrier's variant by introducing two tweaks, resulting in Algorithm 7. The first tweak changes the way in which the p error positions in the permuted error vector ê are chosen. Instead of splitting them equally, with p/2 in the leftmost ⌊(k + ℓ)/2⌋ columns and p/2 in the subsequent ⌈(k + ℓ)/2⌉ ones, the selection is made picking two disjoint sets of indexes over the whole set of the k + ℓ leftmost positions. Such an approach increases the number of possible permutations which respect this constraint.
The second tweak considers the partially row-reduced matrix as logically split into two submatrices, dividing it row-wise into two parts, the first one with ℓ rows and the second one with r − ℓ rows. After performing the same partial RREF as is done in Finiasz and Sendrier's ISD, we obtain the two corresponding submatrices of V and the matching split of the transformed syndrome s̄.
Such a subdivision is employed to further enhance the efficiency of the checks on the columns of V with respect to the idea, introduced by Stern, of matching bit strings against a precomputed set. To this end, the two disjoint index sets are in turn obtained as the disjoint union of a pair of subsets with cardinality p/4. For the sake of simplicity, the disjoint union is realized picking the elements of the first subset among the leftmost ⌊(k + ℓ)/2⌋ positions and the ones of the second subset among the remaining ones.
The May–Meurer–Thomae (MMT) algorithm thus proceeds to exploit a time-to-memory trade-off deriving from the same equation employed by Stern, applying the precomputation strategy twice. The test equality on the syndrome is split into partial equalities on disjoint groups of its bits, and May, Meurer and Thomae exploited the strategy described by Stern to derive candidate values for the elements of the two index sets: the partial equalities are rewritten so as to build two lists of candidates such that, pairing any element of the first list with any element of the second, the sum of the corresponding columns matches the selected bits of the syndrome. We note that, through a straightforward implementation optimization, matching the one employed in Stern's algorithm, only the first list needs to be materialized (appearing as L in Algorithm 7).
The second observation made in the MMT algorithm relates to the possibility of reducing the size of the lists involved in Stern's algorithm to compute the value of the sought error vector via a meet-in-the-middle strategy. The observation relies on the fact that it is possible to obtain the first k + ℓ bits of the permuted error vector ê as the sum of two (k + ℓ)-bit-long vectors with weight p/2 each. Stern's algorithm limits the positions of the ones to be in the first half of the k + ℓ bits for the former vector and in the second half for the latter, yielding a single valid pair for a given ê. By contrast, May–Meurer–Thomae does not constrain the two sets of positions to be picked from different halves of that region of the error vector, but instead only constrains the choice to non-overlapping positions. In such a fashion, considering the correct guess of p positions, we have that they can be split into the two sets in $\binom{p}{p/2}$ possible valid ways, in turn increasing the likelihood of a correct guess. If this choice is made, it is possible to reduce the size of the lists employed to compute the meet-in-the-middle approach by a factor $\binom{p}{p/2}$, while retaining, on average, at least one solution. To this end, the authors suggested picking the length of the partial match as $\log_2\binom{p}{p/2}$, in a way to reduce the size of the lists by the proper factor.
Proposition 6 (Computational complexity of Algorithm 7). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 7 requires three additional parameters p, ℓ_1 and ℓ_2, such that 0 ≤ p ≤ t and 0 ≤ ℓ_2 ≤ ℓ_1 ≤ r.
The time complexity of May–Meurer–Thomae is obtained, as for the previous variants, as the ratio between the cost of a single iteration and its success probability; the spatial complexity is dominated by the storage of the intermediate lists, as detailed in the proof below.
Algorithm 7: Syndrome decoding formulation of May–Meurer–Thomae ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ê: the error vector permuted by P; p: the weight of the first k + ℓ bits of ê, 0 ≤ p ≤ t; ℓ_1, ℓ_2: two parameters with 0 ≤ ℓ_2 ≤ ℓ_1; s̄: an r-bit long binary column vector, equal to the syndrome of ê through the partially row-reduced matrix; V: an r × (k + ℓ) binary matrix; L: list of pairs (I, v), with I a set of integer indexes between 0 and k + ℓ − 1 and v an ℓ-bit binary column vector, whose length is kept bounded; L_1: list of pairs of p/4-sized index sets together with their ℓ_1- and ℓ_2-bit partial syndromes.
1       repeat
2           ⟨V, s̄, P⟩ ← ISextract(H, s)          // partial RREF, as in Algorithm 6
3–16        build the intermediate list L: enumerate the p/4-sized index sets drawn from the two halves of {0, …, k + ℓ − 1}, compute their ℓ_2-bit partial syndromes, and keep in L only the p/2-sized unions whose partial syndromes match the corresponding bits of s̄ (the base list L_1 is no longer needed afterwards)
17–27       merge: enumerate the remaining p/4-sized index sets, compute their partial syndromes, look them up in L, and keep the p-sized candidates matching s̄ on all of its first ℓ bits
28          if the weight of the residual of s̄ on the last r − ℓ bits, after adding the p selected columns of V, equals t − p then
29–31           set ê accordingly (ones in the p selected positions, the residual in the last r − ℓ positions)
32          break
33      until weight(ê) = t
34      return e = êP^T
Proof. The computational complexity is derived considering the number of iterations of the loops in the algorithm, taking into account the probability of the checks being taken. The spatial complexity is obtained as the sum of the size of the matrix H, the size requirements of the base list (as its counterpart has the same size and can reuse its space) and the expected size of L, considering how many pairs may survive the check performed when building it. □
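The list-size reduction enabled by the relaxed splitting can be quantified with a few lines of Python (the value p = 8 below is arbitrary and only illustrative).

```python
from math import comb, log2

def mmt_representations(p):
    """Number of ways a fixed set of p error positions can be split into two
    disjoint subsets of p/2 positions each (the MMT representation count)."""
    return comb(p, p // 2)

p = 8
reps = mmt_representations(p)
print(reps, log2(reps))   # 70 representations -> the lists can be shrunk by a
                          # factor close to 2^6.13 while keeping one solution
```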
3.8. Becker–Joux–May–Meurer
The Becker–Joux–May–Meurer (BJMM) algorithm introduced in [16] improves the MMT algorithm in two ways: the former is a recursive application of the list-building strategy, and the latter is a change to the list element generation employed. We discuss the latter first, postponing, for the sake of clarity, its recursive application. We then describe the adaptations needed to adopt the recursive splitting strategy without issues.
The BJMM algorithm considers that it is possible to represent a vector e of weight p and length k + ℓ as the sum of two vectors of weight p/2 + ε, with the same length, under the assumption that the extra ones cancel out during the addition. We recall that the MMT approach demands that both addends have weight strictly equal to p/2. The BJMM algorithm thus raises the number of valid pairs employed to represent e by a factor $\binom{k+\ell-p}{\varepsilon}$. Such an improvement is employed to further reduce the size of the lists of values which need to be computed to find e with a meet-in-the-middle approach checking the condition that the selected columns add up to the syndrome. Indeed, since many pairs which respect such a condition exist, exhaustively searching only a fraction of them, equal to the reciprocal of the number of valid representations, will yield (on average) a solution, assuming that the solution pairs are uniformly distributed over all the ones.
Willing to employ a strategy to enumerate only a fraction of the pairs, while doing useful computation instead of a simple sub-sampling of the space of the pairs, the BJMM algorithm opts for performing a partial check of the equation, on a smaller number of bits than ℓ and discarding the pairs which do not pass the check.
Let us denote with ℓ_a the number of bits involved in the partial check at a given level of the construction, with V_(ℓ_a) the first ℓ_a rows of V, and with s̄_(ℓ_a) the first ℓ_a bits of the syndrome s̄. The BJMM algorithm thus employs the test V_(ℓ_a)(e_1 + e_2)^T = s̄_(ℓ_a) to obtain a twofold objective: discard a fraction of the pairs, and select the portion to be kept among the pairs which at least have the first ℓ_a bits of the sum of the columns of V matching the value of the corresponding syndrome bits. Such an approach has the advantage over a random sampling that the pairs which are selected have already been checked for compliance on a part of the equation, i.e., it performs a random subsampling while doing useful computation. The BJMM paper suggests that the value of ℓ_a should be close to the binary logarithm of the number of valid representations: under the assumption that the ℓ_a-bit sums being performed are made of random values, and that each sum should match a given ℓ_a-bit value, only a fraction equal to 2^{−ℓ_a}, i.e., roughly one representation, survives. Note that, regardless of the choice of ℓ_a, a selection of the correct positions will always survive the preliminary checks on ℓ_a bits, while wrong solutions are filtered out on the basis that they will not match on the first ℓ_a bits. In the complete algorithm, the aforementioned line of reasoning is applied twice, as the list-building strategy is recursively applied, leading to two different reduction factors depending on the recursion depth itself.
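Under the standard counting (our assumption here), the number of representations of a weight-p vector of length k + ℓ as the sum of two weight-(p/2 + ε) vectors, and hence the suggested partial-check length, can be computed as follows; the numerical values are purely illustrative.

```python
from math import comb, log2

def bjmm_representations(p, eps, length):
    """Number of decompositions of a weight-p vector of the given length as the
    sum of two weight-(p/2 + eps) vectors whose extra eps ones overlap and cancel."""
    return comb(p, p // 2) * comb(length - p, eps)

# Purely illustrative values for one splitting level.
p, eps, k_plus_ell = 8, 2, 700
print(log2(bjmm_representations(p, eps, k_plus_ell)))
# ~24: the partial-check length is then chosen close to this value, so that
# roughly one representation of the correct error survives the filtering.
```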
We now come to the second improvement of the BJMM algorithm, the recursive application of the meet-in-the-middle strategy employed by all the ISDs since Stern's. Stern's meet-in-the-middle approach starts from rewriting the syndrome equation as an equality between the contribution of the columns selected by e_1 and the syndrome plus the contribution of the columns selected by e_2. The original BJMM proceeds to build two lists of pairs. The first list contains, for all possible e_1, the pairs made of the corresponding partial sum and e_1 itself. The second list contains, for all possible e_2, the pairs made of the syndrome plus the corresponding partial sum and e_2 itself. The BJMM algorithm sorts the two lists lexicographically on the first elements of the pairs (i.e., the partial sums) and then finds the matching pairs in a time essentially linear in the length of the lists.
We note that a more efficient (memory-saving) way of performing the same computation involves inserting the elements of the first list into an (open hash) hashmap which employs the partial sum as the key for the value e_1. Subsequently, computing on the fly the syndrome plus the partial sum for each e_2, and looking it up in the hashmap, yields all the matching pairs. Letting N be the number of possible pairs in each list and M the number of matching pairs, the original strategy requires O(N log N + M) vector-sized operations, while the modified one requires O(N + M).
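A minimal, generic sketch of the hashmap-based merge just described follows; key1 and key2 stand for whatever functions compute the ℓ-bit partial sums of the two candidate families (returned as hashable values), and are our own naming.

```python
def merge_by_partial_sum(list1, list2, key1, key2):
    """Memory-saving merge of two candidate lists: only the first list is stored,
    in a hash map keyed by its partial sum; the second list is generated on the
    fly and looked up, yielding every matching pair. With N candidates per list
    and M matches, this takes about N insertions, N lookups and M emissions."""
    table = {}
    for x in list1:
        table.setdefault(key1(x), []).append(x)
    for y in list2:                      # computed on the fly, never stored
        for x in table.get(key2(y), []):
            yield x, y
```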
The BJMM algorithm employs the meet-in-the-middle strategy to generate the values for the candidate vectors e more than once. In particular, a candidate for e, with weight p_0 = p and length k + ℓ, is generated from two vectors of weight p_1 = p_0/2 + ε_1 and the same length. These vectors are in turn generated by pairs of vectors which all have weight p_2 = p_1/2 + ε_2. Finally, the latter are generated by pairs of vectors which have length k + ℓ and weight p_3 = p_2/2, i.e., no extra ones which will cancel out are allowed when generating the vectors at the deepest level.
In adopting this approach, two issues must be coped with: no overlapping positions for the ones should be present within any pair, and the values against which all the partial sums are matched should be unrelated, so that the sampling of pairs during the merge action of two lists is indeed picking items independently from another list merger on the same level. This allows a list merger at a lower level to consider the elements coming from above as picked at random. The first issue is solved picking the positions of the ones for a pair from disjoint sets. Since the independence among the inputs of two level-3 list mergers should still hold, a pair of disjoint sets is generated for each level-3 list pair. In other words, for a given pair, the positions of the ones of the first element belong to a set which has null intersection with the set of positions from which the ones of the second element are picked. This constraint may cause a given target permuted error not to be representable as a combination of the aforementioned pairs: indeed, the aforementioned strategy implies that the number of pairs which can be generated is smaller than the number of vectors which should be represented.
The second issue is solved by slightly modifying the matching equations, adding to each of them a fixed offset chosen so that the offsets cancel out across the levels; in this way, the checks at each level also act in such a fashion that the total sum over the selected columns already matches the syndrome, while retaining the desired randomness.
Proposition 7 (Computational complexity of Algorithm 8). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 8 requires three additional parameters p, ℓ_1 and ℓ_2, such that 0 ≤ p ≤ t and 0 ≤ ℓ_2 ≤ ℓ_1 ≤ ℓ.
The time complexity of the algorithm is, as usual, obtained as the ratio between the cost of an iteration and the probability of an iteration to succeed, where the latter is equal to the one of May–Meurer–Thomae, multiplied by a factor which quantifies the fact that picking the two disjoint sets of positions from which the subsets of error positions are selected may result in a set which does not contain enough positions; such a factor is accounted for in all the splits performed by the BJMM. The cost of an iteration of the loop at Lines 1–27 of the BJMM is obtained as the sum of four contributions: the first is the cost of the loop at Lines 4–16, the second is the cost of the loop at Lines 17–23, the third is the cost of the loop at Lines 24–27 save for the portion related to the branch at Line 25 being taken, and the last is the cost of computing the body of the taken branch, multiplied by the probability of such a branch being taken.
The BJMM variant of the ISD shares with the Stern, Finiasz–Sendrier, and May–Meurer–Thomae ISDs the fact that its exponential memory requirements should be accounted for with a logarithmic memory access cost, instead of a constant one. We do so following the same method employed for the aforementioned variants, i.e., augmenting the cost of an iteration accordingly.
3.9. Speedups in ISD Algorithms Due to Quasi-Cyclic Codes
A common choice to reduce the size of the keys in a McEliece or Niederreiter cryptosystem is to employ a so-called quasi-cyclic code. Such a code is characterized by a parity-check matrix composed of circulant blocks, i.e., square matrices in which every row is obtained as a cyclic shift of the first one.
It is possible to exploit such a structure to provide a polynomial speedup factor in the solution of both the Codeword Finding Problem (CFP) and the Syndrome Decoding Problem (SDP). The speedup in the solution of the Codeword Finding Problem (CFP) can be derived in a straightforward fashion observing that, for the CFP against either the primal or the dual code, given any codeword to be found in a quasi cyclic code with p-sized circulant blocks, further codewords can be derived simply as block-wise circulant shifts of it. As a consequence, for a given codeword with weight w sought by the algorithm, it is guaranteed that at least p codewords of that weight are present in the code. Thus, in this case, the success probability of each iteration can be obtained from the one of finding a single sought codeword; when the latter is small, the probability of success approximately grows by a factor p, in turn speeding up any ISD by the same factor.
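The block-wise circulant shift underlying this argument can be sketched as follows (an illustrative helper of ours, operating on a bit vector stored one bit per byte): rotating every length-p block of a codeword by the same amount yields another codeword of the quasi-cyclic code, with the same weight.

    // Illustrative block-wise circulant shift for a quasi-cyclic code with p-sized blocks.
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    std::vector<uint8_t> blockwise_shift(const std::vector<uint8_t>& v,
                                         std::size_t p, std::size_t amount)
    {
        std::vector<uint8_t> out(v.size());
        for (std::size_t block = 0; block < v.size() / p; ++block)
            for (std::size_t i = 0; i < p; ++i)
                // Rotate position i of each block to position (i + amount) mod p.
                out[block * p + (i + amount) % p] = v[block * p + i];
        return out;
    }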
An analogous line of reasoning leads to exploiting the Decoding One Out of Many (DOOM) algorithm proposed by Sendrier in [38] to speed up the solution of the Syndrome Decoding Problem (SDP). DOOM relies on the fact that a set of syndromes S, obtained through the same parity-check matrix, is provided to the attacker, who attempts to decode at least one of them. In the case of a quasi cyclic code, cyclically shifting the syndrome yields a different, valid syndrome, together with a predictable cyclic shift of the corresponding (unknown) error vector. It is therefore possible for an attacker to derive p different syndromes starting from a single one and, if any one of them is successfully decoded, no matter which one, he will be able to reconstruct the sought error vector. Essentially, DOOM performs multiple ISD instances, taking care of duplicating only the checks which involve the syndrome, thus pooling the considerable amount of effort required by the rest of the iteration. The overall speedup achieved by DOOM for a quasi cyclic code with block size p is √p.
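Reusing the blockwise_shift helper sketched above (again an illustration of ours), the p syndromes fed to a DOOM-style attack can be derived from a single one as follows; decoding any of them recovers the sought error vector up to a known block-wise rotation.

    // Derive the p block-wise rotated syndromes used by a DOOM-style attack (sketch;
    // relies on the blockwise_shift helper defined in the previous sketch).
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    std::vector<std::vector<uint8_t>> doom_syndromes(const std::vector<uint8_t>& s,
                                                     std::size_t p)
    {
        std::vector<std::vector<uint8_t>> out;
        out.reserve(p);
        for (std::size_t shift = 0; shift < p; ++shift)
            out.push_back(blockwise_shift(s, p, shift));  // shift = 0 is the original syndrome
        return out;
    }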
Algorithm 8: Syndrome decoding formulation of Becker–Joux–May–Meurer ISD.
Input: s: an r-bit long syndrome (column vector)
       H: an r × n binary parity-check matrix
       t: the weight of the error vector to be recovered
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight t
Data: P: an n × n permutation matrix
      …: the error vector permuted by P
      p: the weight of the first k bits of the permuted error vector; ℓ: a free parameter
      …: an r-bit long binary column vector, equal to the syndrome of e through …
      V: an r × (k + ℓ) binary matrix
      …: lists of sets of indexes at layer a (root layer: a = 0, leaf layer: a = 3);
      the elements of the sets are integers in {0, …, k + ℓ − 1}; p0 = p, ε3 = 0
      ℓ1, ℓ2: parameters respecting 0 ≤ ℓ2 ≤ ℓ1 ≤ ℓ, stated as optimized for …
 1  repeat
 2      ISextract(…)
 3      for … to 3 do
 4      …
 5      RandomExtract(…)      // populates … with values in …
 6      …
 7      for … to … do
 8      EnumerateComb(…)      // picks the j-th combination of … items of …
 9      …
10      for … to 1 do
11      RandBitString(…)
12      …
13      foreach … to … do
14      EnumerateComb(…)
15      …
16      if … then
17      …
18      if … then
19      break
20      foreach … do
21      if … then
22      if weight(…) … then
23      …
24      foreach … do
25      …
26      break
27  until weight(…) …
28  return e
3.10. Speedups from Quantum Computing
While there is no known polynomial time algorithm running on a quantum computer able to solve either the Syndrome Decoding Problem (SDP) or the Codeword Finding Problem (CFP), it is still possible to achieve a significant speedup in the attacks by exploiting Grover's zero-finding algorithm. Grover's algorithm [39] finds a zero of an n-input Boolean function with a computational cost of approximately 2^(n/2) function computations, instead of the approximately 2^n required with a classical computer. The first instance of a proposed exploitation of Grover's algorithm to speed up ISDs was made by Bernstein in [40], observing that one iteration of Prange's algorithm can be rewritten as a Boolean function having a zero iff the iteration is successful in finding a valid error vector. The essence of the observation is that the Reduced Row Echelon Form (RREF) computation and the weight check on the resulting syndrome can be expressed as Boolean functions, and it is straightforward to extend them so that a single-bit output indicating the success of the iteration is added. Such an approach allows reducing the number of iterations to be performed to the square root of the one of the classical algorithm, since each iteration of Prange's algorithm is essentially an (exhaustive) attempt at finding a zero of the aforementioned Boolean function. We therefore rephrase the computational complexity of Prange's algorithm on a quantum computer. For the sake of simplicity in the analysis, we forgo the overhead of implementing the Boolean function as a reversible circuit, obtaining a conservative estimate of the actual complexity.
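The effect on the iteration count can be summarized with the following back-of-the-envelope helpers (an illustrative sketch of ours, mirroring the simplification of ignoring the reversible-circuit overhead): a classical run needs about 1/Pr_succ iterations, while its Grover-accelerated counterpart needs about the square root of that number of evaluations of the iteration-as-Boolean-function.

    // Back-of-the-envelope Grover speedup estimate for an iteration-based ISD (sketch).
    #include <cmath>

    double classical_iterations(double pr_succ) { return 1.0 / pr_succ; }
    double quantum_iterations(double pr_succ)   { return std::sqrt(1.0 / pr_succ); }

    // Total time estimates, given the cost c_iter of a single iteration.
    double classical_cost(double c_iter, double pr_succ) { return c_iter * classical_iterations(pr_succ); }
    double quantum_cost(double c_iter, double pr_succ)   { return c_iter * quantum_iterations(pr_succ); }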
Proposition 8 (Quantum computational complexity of Algorithm 1). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, the complexity of finding the row error vector e with length n and weight t such that He^T = s with Algorithm 1 running on a quantum computer can be computed starting from the probability of success of a single iteration of the loop at Lines 1–7 and the computational requirements of executing the loop body. In particular, the time complexity is obtained multiplying the cost of the loop body by the square root of the number of iterations required classically, while the spatial complexity matches the one of the classical algorithm, having forgone the reversible-circuit overhead.
Following an argument similar to the one for Prange, we note that there is essentially no difficulty in interpreting Lee and Brickell's variant of the ISD as computing a slightly more complex Boolean function at each iteration, allowing us to reformulate its complexity (Proposition 2) for the quantum case as follows.
Proposition 9 (Quantum computational complexity of Algorithm 2). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 2 requires an additional parameter.
The time complexity of Algorithm 2 running on a quantum computer can be computed starting from the probability of success of a single iteration of the loop at Lines 1–14 and the computational requirements of executing the loop body. In particular, the number of iterations is reduced to the square root of the classical one, while the cost of an iteration includes the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the possible ones. The spatial complexity matches the one of the classical algorithm. Similarly, we also reformulate Leon's ISD (Algorithm 4) as follows.
Proposition 10 (Quantum computational complexity of Algorithm 4). Given H, an r × n binary parity-check matrix, s, an r-bit long syndrome (column vector) obtained through H, and two additional parameters, the complexity of finding the row error vector e with length n and weight t such that He^T = s with Algorithm 4 running on a quantum computer can be computed starting from the probability of success of a single iteration of the loop at Lines 1–10 and the computational requirements of executing the loop body. In particular, the number of iterations is reduced to the square root of the classical one, while the cost of an iteration includes the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the possible ones. Note that, if the value of p is fixed, it is possible to avoid this decoding cost, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity matches the one of the classical algorithm.
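Since the cost of combinadics decoding recurs in the complexity statements above, the following sketch (our own illustration, using 64-bit arithmetic and therefore suitable only for small parameters) shows how an integer index can be unranked into the corresponding weight-k combination of n positions.

    // Illustrative combinadics decoding (unranking): map 0 <= idx < C(n, k) to the
    // corresponding k-element combination of {0, ..., n-1}. 64-bit arithmetic only.
    #include <cstdint>
    #include <vector>

    static uint64_t binom(uint64_t n, uint64_t k) {
        if (k > n) return 0;
        uint64_t r = 1;
        for (uint64_t i = 1; i <= k; ++i) r = r * (n - k + i) / i;  // exact at every step
        return r;
    }

    std::vector<uint64_t> combinadics_decode(uint64_t idx, uint64_t n, uint64_t k) {
        std::vector<uint64_t> positions;
        uint64_t c = n;
        for (uint64_t i = k; i > 0; --i) {
            // Greedy step: largest c with C(c, i) <= remaining index.
            while (c > 0 && binom(c, i) > idx) --c;
            positions.push_back(c);
            idx -= binom(c, i);
        }
        return positions;  // strictly decreasing positions of the selected items
    }

When p is fixed and small, the same combinations can instead be generated with a p-deep loop nest, avoiding this decoding cost, as noted in the propositions.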
Finally, we tackle the reformulation of Stern’s ISD for the quantum case.
Proposition 11 (Quantum computational complexity of Algorithm 5). As for Algorithm 4, given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that He^T = s with Algorithm 5 requires two additional parameters.
The time complexity of Algorithm 5 running on a quantum computer can be computed starting from the probability of success of a single iteration of the loop at Lines 1–14 and the computational requirements of executing the loop body. In particular, the number of iterations is reduced to the square root of the classical one, while the cost of an iteration includes the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the possible ones. Note that, if the value of p is fixed, it is possible to avoid this decoding cost, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity matches the one of the classical algorithm. Since advanced ISD algorithms reduce the overall complexity by reducing the number of iterations at the cost of an increased complexity per iteration, the speedup due to Grover's algorithm is less evident for modern variants than for classical forms of ISD. Indeed, in all cases where the trade-off on a classical computer reduces the number of iterations by a given factor, at the cost of raising the cost of a single iteration by the same factor, the trade-off turns out to be disadvantageous on a quantum computer: the reduction in the number of iterations is cut down to its square root by Grover, while the single-iteration slowdown stays the same.
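As a rough illustration of this point (with an illustrative trade-off factor f which is not part of the original analysis), write N for the baseline number of iterations and c for the baseline cost of one iteration; then
\[
T_{\mathrm{classical}} = \frac{N}{f}\,(f\,c) = N\,c, \qquad
T_{\mathrm{quantum}} = \sqrt{\frac{N}{f}}\,(f\,c) = \sqrt{N}\,c\,\sqrt{f},
\]
so a trade-off which leaves the classical cost unchanged inflates the Grover-accelerated cost by a factor \(\sqrt{f}\).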
This penalty was already observed in [41], where it was found that the quantum variant of Stern's algorithm does not achieve smaller work factors than the quantum variant of Lee and Brickell's algorithm. Indeed, in the same work, it is noted that the quantum variant of Stern's algorithm achieves a smaller complexity than the straightforward quantized version of the BJMM ISD. Finally, we note that the authors of [42] reported a re-elaboration of the MMT and BJMM ISDs for quantum computers which succeeds in effectively lowering their asymptotic complexities. We do not, however, take them into account in this work, as no finite regime complexity formulas are provided in [42] for the proposed algorithms. We report, in Table 1, a summary of all the computational complexities of the examined Information Set Decoding (ISD) algorithms, together with the parameters which should be optimized to achieve the best running time for each of them.
4. Quantitative Assessment of ISD Complexities
In this section, we analyze the computational complexities of solving the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) for sets of code parameters relevant to post quantum cryptosystems. To this end, we select the proposals admitted to the second round of the NIST post quantum cryptography standardization effort [24] which rely on a Niederreiter cryptosystem or on its McEliece variant.
In particular, we consider Classic McEliece [43] and NTS-KEM [44], which employ Goppa codes, and BIKE [34] and LEDAcrypt [31], which employ quasi cyclic codes, so as to assess our finite domain approach on both quasi-cyclic and non-quasi-cyclic codes. The parameters of the reported cryptosystems are designed to match the computational effort required to break them to the one required to break AES-128 (Category 1), AES-192 (Category 3), and AES-256 (Category 5). We report the code length n, code dimension k, number of errors to be corrected t, and size of the circulant block p for all the aforementioned candidates in Table 2. Furthermore, for the cryptosystems relying on low- or moderate-density parity-check codes, we also report the weight of the codeword to be found, w, in the case of a Codeword Finding Problem (CFP).
We implemented all the complexity formulas from Section 3 employing Victor Shoup's NTL library, representing the intermediate values either as arbitrary precision integers or as NTL::RR selectable precision floating point numbers. We chose to employ a 128-bit mantissa and the default 64-bit exponent for the selectable precision.
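For reference, a minimal sketch of this setup (ours, assuming a working NTL installation; the computed quantity is merely a placeholder) looks as follows.

    // Minimal NTL setup sketch: arbitrary-precision integers plus a 128-bit-mantissa RR.
    #include <iostream>
    #include <NTL/RR.h>
    #include <NTL/ZZ.h>

    int main() {
        NTL::RR::SetPrecision(128);                    // 128-bit mantissa, as described above
        NTL::ZZ exact = NTL::conv<NTL::ZZ>(1);
        for (long i = 1; i <= 64; ++i) exact *= i;     // e.g., 64! computed exactly
        NTL::RR approx = NTL::conv<NTL::RR>(exact);    // switch to floating point afterwards
        std::cout << approx << "\n";
        return 0;
    }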
To minimize the error in computing a large number of binomial coefficients, while retaining acceptable performance, we precompute the exact values of the binomials for all argument pairs up to a first threshold. Furthermore, to limit the error of Stirling's approximation where it matters most, we also precompute the exact values of a further range of binomials, and employ the exact value whenever the arguments fall within the precomputed ranges. To provide a fast approximate computation for all the binomials which do not fall in the aforementioned intervals, we compute them employing the logarithm of Stirling's series approximated to the fourth term.
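One possible rendition of this strategy in the log domain is sketched below (an illustration of ours; the thresholds for the exact-value fallback are omitted, and the four correction terms shown are the ones following the leading term of Stirling's series).

    // Log-domain binomial via Stirling's series (illustrative; the exact-value
    // fallback for small arguments described in the text is omitted here).
    #include <cmath>
    #include <cstdint>

    static double log_factorial(double n) {
        if (n < 2.0) return 0.0;
        const double pi  = 3.14159265358979323846;
        const double inv = 1.0 / n;
        return n * std::log(n) - n + 0.5 * std::log(2.0 * pi * n)
             + inv / 12.0
             - (inv * inv * inv) / 360.0
             + (inv * inv * inv * inv * inv) / 1260.0
             - (inv * inv * inv * inv * inv * inv * inv) / 1680.0;
    }

    // Base-2 logarithm of C(n, k), the quantity of interest for complexity estimates.
    double log2_binomial(uint64_t n, uint64_t k) {
        if (k > n) return -INFINITY;
        return (log_factorial((double)n) - log_factorial((double)k)
                - log_factorial((double)(n - k))) / std::log(2.0);
    }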
We explored the parameter space of each algorithm, taking into account the fact that the different parameters drive different trade-off points in each algorithm. To this end, we explored an appropriately sized region of the parameter space, which we report in Table 3. To determine the explored region, we started from a reasonable choice and enlarged the region until the value of the parameters minimizing the attack complexity was no longer on the region edge for any of the involved parameters. For the choice of the additional parameter of the MMT ISD variant and of the two additional parameters of the BJMM variant, we employed the choices advised in the respective works.
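The region-enlargement heuristic can be sketched, for a single parameter, as follows (an illustrative sketch of ours; cost() stands for one of the complexity formulas of Section 3, and the actual exploration is multi-dimensional, over the ranges of Table 3).

    // Grid-search a single parameter, widening the interval while the minimizer
    // sits on one of its edges (illustrative one-dimensional sketch).
    #include <cstdint>
    #include <functional>

    uint64_t minimize_on_grid(const std::function<double(uint64_t)>& cost,
                              uint64_t lo, uint64_t hi)
    {
        for (;;) {
            uint64_t best = lo;
            double best_cost = cost(lo);
            for (uint64_t v = lo + 1; v <= hi; ++v) {
                const double c = cost(v);
                if (c < best_cost) { best = v; best_cost = c; }
            }
            const bool on_lower_edge = (best == lo && lo > 0);  // lo == 0 cannot be enlarged further
            const bool on_upper_edge = (best == hi);
            if (!on_lower_edge && !on_upper_edge) return best;  // interior minimum: done
            if (on_lower_edge) lo /= 2;                         // widen downwards
            if (on_upper_edge) hi *= 2;                         // widen upwards
        }
    }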
We took into account the advantage provided by a quasi cyclic code in the solution complexity of both the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP), reducing the complexity by a factor √p, the square root of the cyclic block size, for the Syndrome Decoding Problem (SDP), and by a factor p for the Codeword Finding Problem (CFP), in accordance with the point raised in Section 3.9.
Table 4 reports the computational costs of solving both the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) by means of the described variants of the Information Set Decoding (ISD). In addition to the computational complexities obtained, the value of a simple asymptotic cost criterion for the ISDs, described in [34], is reported; such a criterion provides the asymptotic complexity of an ISD when it is employed to solve a Syndrome Decoding Problem (SDP). A noteworthy point to observe is that, considering the finite regime value of the complexities, the May–Meurer–Thomae algorithm attains a lower time complexity than the Becker–Joux–May–Meurer algorithm in most cases. Indeed, while the Becker–Joux–May–Meurer Information Set Decoding (ISD) variant has a lower asymptotic cost when considering a worst-case scenario for the solution of the Syndrome Decoding Problem (SDP), i.e., a code rate close to 0.5 and a large enough value of n, a finite regime estimate of its cost shows that employing the May–Meurer–Thomae approach should result in a faster computation. Concerning the space complexities of the approaches with exponential (in the code length n) space requirements, we report the obtained values in Table 5. We note that the Information Set Decoding (ISD) variants proposed by Stern and by Finiasz and Sendrier have an overall lower memory consumption than their more advanced counterparts. In particular, the space complexities of the aforementioned variants start as low as 16 GiB for Category 1 parameters, and are thus amenable to an implementation which keeps the entire lists in main memory on a modern desktop. By contrast, the May–Meurer–Thomae and Becker–Joux–May–Meurer Information Set Decoding (ISD) variants require a significantly higher amount of memory, with the latter being less demanding than the former in all cases but the one of the LEDAcrypt parameterization for extremely long term keys (LEDAcrypt-XT). In all cases, the space complexities of May–Meurer–Thomae and Becker–Joux–May–Meurer far exceed the capacity of current main memories, pointing strongly to the need for a significant amount of mass storage to implement a practical attack. Such a requirement is even more stringent in the case of higher security levels, where the memory requirements grow further for most parameter choices.