1. Introduction
The famous Rivest–Shamir–Adleman (RSA) public-key cryptosystem is the de facto standard used in global technologies as a powerful encryption and decryption mechanism. It was introduced in 1978 by Rivest, Shamir, and Adleman as the first working public-key encryption scheme [1]. It has been included in many cryptographic standards and libraries due to its practicality and simplicity. Its key generation algorithm computes two distinct
n-bit primes, p and q, called RSA primes. These primes are the first two private keys of RSA, which form $N = pq$, called the RSA modulus. The modulus N is the first RSA public key and each N has its corresponding $\phi(N)$ that is derived from Euler’s phi function. Specifically, $\phi(N) = (p-1)(q-1)$; this value is also kept as the RSA private key. The second public key, e (also called the public exponent), is chosen such that $\gcd(e, \phi(N)) = 1$. Each e has its corresponding private exponent, d, where $ed \equiv 1 \pmod{\phi(N)}$. Thus, the RSA public keys are given by the pair $(N, e)$, while the RSA private keys are represented by the tuple $(p, q, \phi(N), d)$.
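The key generation steps described above can be sketched with toy numbers. This is only an illustrative sketch: the function name `toy_rsa_keygen`, the tiny primes, and the choice of exponent are assumptions made for the example, and the function performs no primality testing or padding.

```python
from math import gcd


def toy_rsa_keygen(p, q, e=65537):
    """Build a toy RSA key pair from two given primes (hypothetical helper).

    No primality check is performed; p and q are trusted to be prime.
    """
    if p == q:
        raise ValueError("p and q must be distinct")
    N = p * q                      # RSA modulus
    phi = (p - 1) * (q - 1)        # Euler's phi function of N
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(N)")
    d = pow(e, -1, phi)            # private exponent, needs Python 3.8+
    return (N, e), (p, q, phi, d)


# Toy parameters, far too small for real use.
public_key, private_key = toy_rsa_keygen(61, 53, e=17)
N, e = public_key
d = private_key[3]
ciphertext = pow(42, e, N)         # textbook encryption of the message 42
recovered = pow(ciphertext, d, N)  # textbook decryption
```

The returned tuples mirror the public pair and the private tuple described in the text; real implementations additionally use large random primes and a padding scheme.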
Since its introduction, RSA has remained in use for forty years owing to its resistance to various attacks [2]. The security of the RSA cryptosystem relies on the hardness of solving the following problems: firstly, the integer factorization problem (IFP), entrenched in the modulus $N = pq$; secondly, the key equation $ed \equiv 1 \pmod{\phi(N)}$; and, finally, the eth root problem in the encryption function. Continual cryptanalysis of, or ‘attacks’ on, these three problems is crucial to maintaining the security of RSA at the highest level [3]. This crucial need for information security has led to the rise of various cryptographic algorithms that implement security in different dimensions and for numerous purposes [4].
Before RSA was introduced, prior results had shown that N is vulnerable to being factored in polynomial time by Pollard's $p-1$ algorithm when $p-1$ and $q-1$ have only small factors [5]. Pollard’s $p-1$ algorithm is exceptionally efficient whenever all prime factors of $p-1$ and $q-1$ are small [6]. In addition, a technique known as the estimated prime factor (EPF) was improved by Tahir et al. [7] to factor N generated from balanced or unbalanced primes p and q. Furthermore, Pollard [8] showed that N with a small size is easily factored since the complexity of the factorization algorithm depends on the size of N. Subsequently, research undertaken by many others [9,10,11] improved on this complexity using the number field sieve method, which has dominated efforts to factor the RSA modulus ever since. In 2021, N with 829 bits was successfully factored using this method [12]. Later simulations demonstrated that the 2048-bit RSA modulus can only be factored by a quantum computer with 13,436 qubits within 177 days [13].
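To make the smoothness condition concrete, the following is a minimal sketch of the textbook first stage of Pollard's $p-1$ method. The function name, the base 2, and the toy modulus are assumptions chosen for illustration, not details taken from [5] or [6].

```python
from math import gcd


def pollard_p_minus_1(N, bound):
    """Textbook first stage of Pollard's p-1 method (illustrative sketch).

    Computes 2^(bound!) mod N step by step and returns a nontrivial
    factor of N as soon as gcd(a - 1, N) reveals one, else None.
    """
    a = 2
    for j in range(2, bound + 1):
        a = pow(a, j, N)           # a is now 2^(j!) mod N
        g = gcd(a - 1, N)
        if 1 < g < N:
            return g
    return None


# Toy modulus 1403 = 23 * 61: 61 - 1 = 60 = 2^2 * 3 * 5 has only small
# prime factors, whereas 23 - 1 = 22 = 2 * 11 needs a bound of at least 11.
factor = pollard_p_minus_1(1403, 10)
```

With a bound of 10, the exponent 5! = 120 is already a multiple of 60, so the factor 61 is exposed, while the factor 23 stays hidden because 11 does not divide the exponent.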
The development of quantum computers with effective factoring implementation is unlikely to be realized for many years. Thus, it can be assumed that RSA can still be used securely. However, in this investigation, we show that certain unexplored structures of p and q cause N to be factored in polynomial time without the aid of quantum computation. Specifically, if p and q are near-square primes, then N will be vulnerable. The general notion of a near-square prime is given in the following definition.
Definition 1. Let a be any integer and m be a power of 2. If is a prime number where is a countable integer (for example, ), then we define p as a near-square prime.
Prior to this work, factoring of near-square primes was only discussed using a theoretical sieve approach [14] and never in cryptographic settings. However, our previous investigations [15,16] showed that such primes can become vulnerable points in the RSA cryptosystem. Furthermore, the abundance of such primes due to the common size used for RSA primes in standard cryptographic libraries highlights the importance of defining near-square primes with a description that fits RSA in practice. We define below the particular notion of a near-square prime used in all of our attacks.
Definition 2. Let a be any integer and m be a power of 2. If is a prime number where is a ‘sufficiently small’ integer, then we define p as an -near-square prime.
In practical circumstances, the term ‘sufficiently small’ used in Definition 2 refers to integers small enough that an exhaustive search over them is, or will foreseeably become, computationally feasible. For guidance, readers are advised to refer to the latest standard key sizes published by the National Institute of Standards and Technology (NIST) [17].
1.1. Motivation for This Paper
Since RSA is still a leading public-key cryptosystem used in digital applications, its security level must be maintained at the highest level. However, it has been found that there are weak primes that are unknowingly used as RSA primes. These weak primes are in the form of near-square primes as defined in Definition 1.
The authors of [
15] showed that the number of near-square primes falling under Definition 2 is asymptotic to
Based on Equation (
1),
s is the smallest squared number with
n-bit size, and
is the higher value of two near-square primes,
and
. In terms of RSA-2048, then there are approximately
near-square primes of 1024 bits [15]. Given this vast number of near-square primes, this paper emphasizes the importance of not selecting near-square primes as RSA primes in current implementations of the RSA key generation algorithm, since they may be in use unknowingly in digital applications that rely on RSA today. No current cryptographic standard imposes any condition that prevents near-square primes from being chosen as RSA primes. From the results provided in this paper, we hope that this practice may be amended in the near future to maintain the security of RSA.
1.2. Contribution of This Paper
The results presented in this paper represent a continuation of previous research in [
15,
16] which exposed the vulnerabilities of using
as the RSA modulus. The main aim of this paper is to cryptanalyze (or attack) three other distinct forms of the RSA modulus with near-square prime factors. Specifically, in the first attack, the RSA primes are set to be in the form of
. In the second attack, the RSA primes have the form of
, while in the third attack, the prime factors are considered to be in the form
. As a result, we show that near-square primes should not be used as RSA primes since they enable N to be factored using the quadratic root method, which can feasibly be computed by any adversary.
A summary of the structures of near-square primes forming N, as covered in our previous work [16] and in this paper, is shown in Figure 1.
1.3. Organization of the Paper
The paper is organized as follows: In
Section 2, we discuss some previous related studies that show how the structures and conditions imposed on the RSA primes can lead to a total break.
Section 3 highlights and compiles three new attacks to factor the RSA modulus
N. We show that there are some types of RSA primes that can feasibly lead to a total break of RSA. Additionally, we provide three algorithms to perform the newly proposed attacks. In
Section 4, we propose a countermeasure against all the proposed attacks. Our proposed countermeasure is straightforward and can be easily implemented in RSA key generation standard practices. In
Section 5, we provide a comparative analysis of attacks that focus on the structure of RSA primes in order to factor
N. Finally, we conclude the paper and provide suggestions for future work in
Section 6.
2. Related Work
In this section, we review some of the past attacks against RSA that exploit the structures of the primes as the source of the vulnerabilities so that N can be factored in polynomial time.
One of the earliest such papers was presented even before RSA was established. The authors of [
5] showed that a composite number can be factored easily if the value preceding one of its prime factors comprises negligibly small primes e.g.,
. This work showed that there exists a condition on a prime that causes the composite number it formed to be easily factored. The algorithm from this condition is called a specific factorization algorithm, i.e., an algorithm that can factor a composite number with the specific condition.
Apart from the specific-purpose factorization algorithm, there are also algorithms designed for any composite number without specific structures. This kind of algorithm is called a general-purpose factorization algorithm. Notably, the running time of a general-purpose factorization algorithm depends solely on the size of the composite number N. Among the popular factoring algorithms belonging to this category are the quadratic sieve (QS) and the general number field sieve (GNFS). In practice, the QS algorithm is simpler than the GNFS algorithm and is the fastest for integers below 100 decimal digits, but no better than the GNFS algorithm for integers with 110–120 digits [4]. The QS algorithm was first introduced by [9] and is regarded as the fastest factoring algorithm for 50–100 digit integers. Lenstra et al. [11] then introduced a more general approach called the number field sieve factorization algorithm; this algorithm has since been used to factor RSA numbers of up to 829 bits [12]. However, since its complexity is sub-exponential, the size of the integers remains a significant hurdle to breaking a 2048-bit RSA modulus efficiently.
In De Weger’s result [
18], it was then shown that the prime difference of
p and
q in RSA can influence the result shown previously by [
19]. Specifically, if
then the adversary only requires
to factor
N using a lattice-based attack. This work also showed the relation between the prime difference and the early work on small decryption exponents introduced by [
20].
A further assumption commonly applied when attacking RSA is that the adversary is able to know certain bits of RSA private keys beforehand. For example, Ernst et al. [
21] showed that, by knowing certain bits of the RSA private key exponent,
d, the adversary can factor
N in polynomial time. Later, Sarkar and Maitra [
22] extended this attack by combining this assumption with an additional assumption that, if certain bits of
p and/or
q are also known, then the result by [
21] can be extended to a more generalized form. However, the attack depends solely on the capabilities of the adversary to collect the secret bits, either from the side-channel method or through faulty coding from implementation.
In 2019, Abd Ghafar et al. [16] studied the impact of using near-square RSA primes, which yield a factorization of the RSA modulus N. Note that the objective of the work in [
16] was to factor an RSA modulus with near-square prime factors, i.e.,
, such that
and
. This shows the importance of work exploring extended conditions of the near-square primes. Application of this research shows that an adversary can also conduct a partial key exposure attack—similar to assumptions made by [
21,
22]—on the LSBs of primes that satisfy both given conditions [
23].
Useful Lemmas
Next, we present some previous findings from [16] that are used in our results. In Lemma 1, the aim is to show the integer and decimal forms of the equality of
.
Lemma 1 ([
16])
. Suppose are positive integers and is a power of 2. If , then . Proof. Refer to Lemma 3.1 of [
16]. □
Subsequently, the lower bound and upper bound of can be determined as shown in Lemma 2.
Lemma 2 ([
16])
. Suppose are positive integers and is an even integer satisfying . Let where . If and , then Proof. Refer to Lemma 3.2 of [
16]. □
Then, the following Theorem 1 to find the factorization of the RSA modulus is proposed upon determining the lower bound and upper bound of .
Theorem 1 ([
16])
. Suppose are positive integers and is an even integer with . Suppose is a valid RSA modulus. Let and where . If is sufficiently small, then the factorization of N can be performed in polynomial time. Proof. Refer to Theorem 3.1 of [
16]. □
3. Attacks on Near-Square RSA Primes
This section presents our newly proposed attacks to factor the RSA modulus N. Following the direction of our previous investigations, we propose new results regarding the near-square RSA primes which yield the factorization of N in polynomial time. In the following subsections, we describe three new attacks to factor the RSA modulus N with distinct structures of near-square prime factors. Specifically, the attacks are structured as follows:
Attack I: When the prime factors have the form
Attack II: When the prime factors have the form
Attack III: When the prime factors have the form
3.1. Attack I:
The objective of Attack I is to factor an RSA modulus with near-square prime factors, i.e., where and . First, we need to show the equality of to its integer and decimal forms as follows.
Lemma 3. Suppose are positive integers where is a power of 2. If , then .
Proof. Suppose
is an integer where
a is a positive integer. Then
Since
, then
. □
Based on the result obtained in Lemma 3, we proceed to determine the upper and lower bounds of in the next Lemma 4.
Lemma 4. Suppose are positive integers and is a power of 2 such that . Let where . If and , then
Proof. We need to satisfy the following statement to prove the lower bound:
Observe that
Since
will always be a positive integer, this implies
Then
Thus,
or can be written as
.
Now, the task is to prove the upper bound. Observe that
and
. Then, based on Lemma 3,
If
and
, then
If
, then (
2) becomes
or can be written as
.
Thus, the bounds are written as . □
Next, we propose the following theorem to show that the modulus can be factored in polynomial time upon obtaining the upper and lower bounds of in Lemma 4.
Theorem 2. Suppose are positive integers and is a power of 2 with . Let be a valid RSA modulus. Let and where . If is sufficiently small, then N can be factored in polynomial time.
Proof. Observe that from Lemma 4, we have
Thus, (
4) can also be rewritten as
Assume that
and
are known since
is sufficiently small. Then, the difference between the lower and upper bounds of (
5) is given by
which shows the maximum number of integers to find
. If
is sufficiently small, then we can find
in polynomial time.
Note that
can be found by computing
. Then, we can observe that
From
and
, then
Here, the value of
can be computed without modular reduction. Considering the values of
and
are already known,
p and
q can be obtained by finding the solutions of the following quadratic equation:
We have determined that
and
. Since
and
are known, we can obtain
Thus, the modulus
N can be factored by computing
This terminates the proof. □
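The quadratic-root step used in the proof can be illustrated with a small sketch. It assumes, purely for illustration, that both primes lie slightly below perfect squares, $p = A - r_a$ and $q = B - r_b$; the function name, the search limits, and the toy numbers are hypothetical and are not the authors' Algorithm 1. Writing $c = \sqrt{AB}$, the identity $N = AB - r_b A - r_a B + r_a r_b$ gives the sum $r_b A + r_a B = c^2 + r_a r_b - N$, while the product of the two terms is $r_a r_b c^2$, so both terms are the roots of a quadratic with known coefficients.

```python
from math import isqrt


def factor_near_square(N, max_r=16, max_i=16):
    """Search for N = (A - ra) * (B - rb) where A * B is a perfect square
    (for example, A and B both squares) and ra, rb are small.

    Returns the two factors as a sorted tuple, or None if nothing is found.
    """
    base = isqrt(N)
    for i in range(max_i + 1):
        c = base + i                        # candidate for sqrt(A * B)
        for ra in range(1, max_r + 1):
            for rb in range(1, max_r + 1):
                s = c * c + ra * rb - N     # rb*A + ra*B if the guess is right
                disc = s * s - 4 * ra * rb * c * c
                if s <= 0 or disc < 0:
                    continue
                d = isqrt(disc)
                if d * d != disc:
                    continue
                # Roots of t^2 - s*t + ra*rb*c^2 = 0 are rb*A and ra*B.
                for t in ((s + d) // 2, (s - d) // 2):
                    for r_div, r_sub in ((rb, ra), (ra, rb)):
                        if t % r_div == 0:
                            cand = t // r_div - r_sub
                            # Only accept a candidate that truly divides N.
                            if 1 < cand < N and N % cand == 0:
                                return tuple(sorted((cand, N // cand)))
    return None


# Toy example: 4093 = 64^2 - 3 and 6553 = 81^2 - 8.
factors = factor_near_square(4093 * 6553)
```

Because every candidate is checked by trial division against N before it is returned, the sketch never reports an incorrect factorization; the cost is governed by the search limits, which play the role of the ‘sufficiently small’ quantities in the theorem.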
Algorithm 1 demonstrates the factorization of N via Theorem 2.
Algorithm 1 Factoring N via Theorem 2.
3.1.1. The Complexity of Attack I
Observe that the most expensive operation in Algorithm 1 is the modular reduction of calculating
. From [
24], we know that the classical modular reduction of modulo
works at
. Since
is the potential value of
, the maximum integers to find it are less than
, as shown in Equation (
6). Based on this computation, we have the complexity of Attack I presented in Algorithm 1 to be
. As we assume
to be sufficiently small, the attack can also feasibly be computed.
Algorithm 2 Factoring N via Theorem 3.
3.2. Attack II:
The objective of Attack II is to factor an RSA modulus with near-square prime factors, i.e., . First, we introduce Lemma 5 that will aid our attack later. It will be used not only in the second attack, but also in the following Attack III.
Lemma 5. Suppose are positive integers and is a power of 2 such that . If and , then Proof. If
then
Since
is negligible in (
7) because
. This shows that
when
. Now, if
, then
Since
and
are negligible in (
8) because
and
, then
This shows that
when
. This completes the proof. □
Based on the result obtained in Lemma 5, we continue to determine the upper bound and lower bound of as shown in Lemma 6.
Lemma 6. Suppose are positive integers and is a power of 2 such that . Let . If and , then Proof. By using the result in Lemma 5, we have
Since
is always a positive number, it follows that
or can be written as
Then,
Thus, the upper bound can be rewritten as
Now, we want to prove the lower bound. Based on Lemmas 1 and 3, observe that
and
. Then,
From (
3), we know that
If
, then (
10) will become
Therefore, the bounds are written as
This completes the proof. □
Now, we propose the following theorem to show that the modulus can be factored in polynomial time upon obtaining the bounds of in Lemma 6.
Theorem 3. Suppose are positive integers and is a power of 2 satisfying . Let be a valid RSA modulus. Let and where . If is sufficiently small, then N can be factored in polynomial time.
Proof. As observed from Lemma 6, we have
Thus,
Assume that
. Then, the difference between the upper bound and lower bound of (
12) is given by
which represents the maximum number of integers to find
.
Since
is sufficiently small then
and
can be found in polynomial time. Subsequently, as
is sufficiently small, then
can be obtained in polynomial time.
Note that
can be found by computing
. Then, we can see that
Notice that
and
, hence, it yields
Accordingly, we can compute
without modular reduction. Considering the values of
and
are already known,
p and
q can be obtained by finding the solutions of the following quadratic equation:
We find that
and
. Since
and
are known, we can obtain
Thus, the modulus
N can be factored by calculating
This completes the proof. □
Algorithm 2 shows the factorization of N via Theorem 3.
3.2.1. The Complexity of Attack II
Observe that the most expensive operation in Algorithm 2 is the modular reduction of calculating
. Using the same reference as in Attack I, we know that the classical modular reduction of modulo
works at
. Since
is the potential value of
, the maximum integer to find it is
, as shown in Equation (
13). Based on this computation, we have the complexity of Attack II presented in Algorithm 2 to be
. As we assume
to be sufficiently small, the attack can also feasibly be computed.
3.3. Attack III:
The aim of Attack III presented in this section is to factor an RSA modulus with near-square prime factors, i.e., .
According to the result obtained in Lemma 5, we proceed to determine the lower and upper bounds of for the case when the prime factors are in the forms and , respectively.
Lemma 7. Suppose are positive integers and is a power of 2 such that . Let . If and , then Proof. We refer to (
9) to prove the lower bound. It states that
This inequality is true regardless of the structure of
N since it discusses results obtained from Lemma 5 and
in general.
Then, observe
Thus, the lower bound is written as
Now, we want to prove the upper bound. Based on Lemmas 1 and 3, observe that
and
. Then,
Then (
14) will become
since
. Therefore, the bounds are written as
This terminates the proof. □
Subsequently, we propose Theorem 4 to show that the modulus can be factored in polynomial time upon obtaining the upper and lower bounds of in Lemma 7.
Theorem 4. Suppose are positive integers and is a power of 2 satisfying . Let be a valid RSA modulus. Let and where . If is sufficiently small, then N can be factored in polynomial time.
Proof. As observed from Lemma 7, we have
Thus,
Assume that
. Then, the difference between the lower and upper bounds of (
16) is given by
which represents the maximum number of integers required to find
.
Since is sufficiently small then and can be found in polynomial time. Subsequently, as is sufficiently small, then we can find in polynomial time.
As previously mentioned,
can be found by calculating
. Then, we can see that
Observe that from
and
, then we can have
Thus, the value of
can be computed without modular reduction. Considering the values of
and
are already known,
p and
q can be obtained by finding the solutions of the following quadratic equation:
We find that
and
. Since
and
are known, we can obtain
Thus, the modulus
N can be factored by calculating
This completes the proof. □
Algorithm 3 shows the factorization of N via Theorem 4.
Algorithm 3 Factoring N via Theorem 4.
3.3.1. The Complexity of Attack III
Observe that the most expensive operation in Algorithm 3 is the modular reduction of calculating
. Using the same reference as in Attack I, we know that the classical modular reduction of modulo
works at
. Since
is the potential value of
, the maximum integer to find it is
, as shown in Equation (
17). Based on this computation, we have the complexity of Attack III presented in Algorithm 3 to be
. As we assume
to be sufficiently small, the attack can also feasibly be computed.
4. Countermeasure of the Attacks
From Equations (
4), (
11), and (
15), we observe that all attacks discussed previously have a sufficiently small set of integers to find the actual value of
. Since
p and
q discussed in the attacks are near-square primes, we can find the nearest squared integer of both by computing
and
where
and
are fundamentally sufficiently small integers, and depend on the types of attacks presented previously. This implies that the owner of the private keys can check the distance between
N and
by computing
If D is sufficiently small, we know that an adversary can find the values of in polynomial time, as shown in the previous sections. Thus, p and q must not be used as the private keys and another set of RSA primes must be generated. This countermeasure is efficient since it only requires minimal computations; hence, it can easily be adopted in future implementations of RSA.
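A key generator could apply a check in this spirit before accepting a prime pair. The sketch below is an assumption-laden illustration rather than the exact test defined above: it measures how far each prime is from its nearest perfect square, which covers any prime of the form $a^m \pm r$ with m a power of 2 and small r, since $a^m$ is then a perfect square. The function names and the threshold value are hypothetical.

```python
from math import isqrt


def distance_to_nearest_square(x):
    """Return the distance from x to the closest perfect square."""
    r = isqrt(x)
    return min(x - r * r, (r + 1) * (r + 1) - x)


def is_near_square(p, threshold):
    """Flag p when it lies within `threshold` of a perfect square.

    The threshold is an assumed tuning parameter standing in for the
    'sufficiently small' bound; it is not a value given in the text.
    """
    return distance_to_nearest_square(p) <= threshold


# Toy check on small primes: 4093 = 64^2 - 3 and 6553 = 81^2 - 8 are
# flagged, whereas 8009 sits 88 away from the nearest square.
weak_pair = is_near_square(4093, 16) and is_near_square(6553, 16)
safe_prime = not is_near_square(8009, 16)
```

A generator using such a check would discard the flagged pair and draw fresh primes; rejecting a pair when either prime is flagged would be a stricter policy.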
5. Comparative Analysis
This section provides a comparative analysis between attacks that focus on the structure of RSA primes in order to factor
N. For this comparison, we choose five types of attack, as discussed in
Section 2: (a) specific-purpose factorization algorithm; (b) general-purpose factorization algorithm; (c) small prime difference; (d) partial key exposure; and (e) near-square primes, as shown in this research.
In most discussions of implementing RSA correctly (e.g., [
25]), there are preventive measures to avoid all of these attacks (except for (e)). Hence, we believe the analyses of these attacks (except for (e)) have contributed to maintaining the security of RSA at its highest level, which is our aim in this research. We compare the advantages and disadvantages of these attacks with results presented in this research, as shown in
Table 1.
From the comparison in
Table 1, we can see that our attack is efficient since the complexities shown in Section 3.1.1, Section 3.2.1, and Section 3.3.1 are all polynomial time. However, p and q must have specific forms for the attack to apply, although the number of primes of these forms is large, as shown in Equation (1).
6. Conclusions and Future Work
We have successfully shown that the RSA modulus with near-square prime factors would render the factorization of N in polynomial time. Specifically, we showed that such primes can become vulnerable points in the RSA cryptosystem. This poses a danger to the existing RSA implementation since there are potentially significant numbers of the RSA modulus that unknowingly employ the structure used in digital applications today. An RSA modulus with near-square prime factors, i.e., and can be factored using the quadratic root method to solve for the prime factors of N. In all of our attacks, it is necessary to examine the distance between N and which is sufficiently small, i.e., , in order to find the factorization of the RSA modulus N to be feasible in polynomial time. This poses a danger to the digital applications using RSA today since many implementations ignore this value, and, unknowingly, in some RSA key generation processes, the values are sufficiently small. To avoid this catastrophe for many digital users, we have proposed a countermeasure that can avoid the attacks which fits RSA in practice.
For future work, we suggest that further analysis be carried out to find the conditions that allow factorization of the RSA modulus when only one of the RSA primes is a near-square prime. If such conditions exist, then we believe many current RSA keys are weak since there is a high probability of generating such an RSA modulus. This belief is based on current implementations of cryptographic libraries, which are lenient toward near-square primes being chosen as RSA primes. Thus, a mitigation plan is required to prevent the keys from being exploited by real-world adversaries.