Multiscale Sieve for Smart Prime Generation and Application in Info-Security, IoT and Blockchain

Iovane, Gerardo; Benedetto, Elmo; Gallo, Carmine

doi:10.3390/app14198983

by Gerardo Iovane *,†, Elmo Benedetto † and Carmine Gallo †

Department of Computer Science, University of Salerno, Via Giovanni Paolo II, 84084 Fisciano, SA, Italy

* Author to whom correspondence should be addressed.

† All authors contributed equally to this work.

Appl. Sci. 2024, 14(19), 8983; https://doi.org/10.3390/app14198983 (registering DOI)

Submission received: 11 September 2024 / Revised: 21 September 2024 / Accepted: 24 September 2024 / Published: 5 October 2024

(This article belongs to the Special Issue Application of IoT and Cybersecurity Technologies)


Abstract

The huge computational cost required to test whether a number is prime and the inefficiency of the known sieving algorithms for extremely large inputs have posed significant challenges in computational number theory. Traditional deterministic prime generation methods struggle to maintain performance when the input sizes increase exponentially. In this work, we show that, through multiscale distribution and deterministic prime number generation, it is possible to create a multiscale sieve with drastically better performance than the deterministic algorithms known to date, providing a more efficient solution for large-scale prime number generation, demonstrated by several benchmarks that highlight the potential of our approach. Consequently, we can gain some advantages in cryptography and in info-security, such as in IoT and blockchain environments.

Keywords: prime number; sieve algorithms; multiscale distribution; cryptography; IoT; blockchain

1. Introduction

The generation and distribution of prime numbers, and the search for a deterministic method to discover new primes, have been important issues in mathematics over the last two centuries (see, for example, [1,2,3,4,5,6,7]).

Sieve algorithms, therefore, play a fundamental role in number theory, particularly in the identification of prime numbers. These algorithms are used in numerous application contexts, including cryptography, data analysis, and scientific computing. Numerous sieving algorithms are known, starting with the oldest sieves that laid the foundations, such as the Sieve of Eratosthenes, the Sieve of Sundaram, the linear sieve, and many other well-known methods that iteratively eliminate composite numbers (see [8,9]). While some of these algorithms are efficient for small to medium-sized numbers, they become inefficient when dealing with very large numbers without further optimization. Modern sieving algorithms, which handle somewhat larger numbers more efficiently, build on these foundations. For instance, the Sieve of Atkin improves upon the Sieve of Eratosthenes by using a more complex mathematical approach to reduce the number of unnecessary operations [10], while the Segmented Sieve of Eratosthenes with Wheel Factorization optimizes the memory usage and reduces the computational load by segmenting the range of numbers and incorporating wheel factorization to skip multiples of small primes [11]. Other deterministic methods also exist, such as the AKS primality test [12].

In contrast to deterministic sieving algorithms, probabilistic algorithms do not guarantee that a number is prime but provide a probability of correctness; examples include the Miller–Rabin and Solovay–Strassen tests and the Baillie–PSW test, which relies on a Lucas test (for details, see [13,14,15,16]). These probabilistic methods, while faster, have several disadvantages, including the possibility of false positives, which makes them unsuitable for applications requiring absolute certainty, and their complex and computationally intensive implementation.

In Table 1, we report some of the more relevant sieves together with their time and space complexity.

In this paper, we show that, through our work in [17,18,19,20], where we performed a multiscale analysis showing how the primes in one fixed scale become the seeds that generate new primes in the next scale, we can create a deterministic multiscale sieve with drastically better performance than the deterministic algorithms known so far, providing a more efficient solution for the generation of primes and even new primes at a large scale. In addition, as we will see, our approach is very useful for distributed and decentralized infrastructures. In [21], we find an interesting example of a new security approach in the context of IoT, while, in [22], another interesting application to blockchain is presented. Therefore, the present work could be the basis for the proof of work in the blockchain context. Furthermore, considering that a multiscale approach like the one proposed lends itself very well to simple but highly parallel computation, the proposed system is also very attractive in the IoT context, in relation to aspects of both info-security and gamification.

This article is organized as follows. In Section 2, we describe the forms that prime numbers can assume and how the automation of this process is the basis of the sieve, while, in Section 3, we present an overview of the key elements and fundamental logic on which the algorithm is based. In Section 4, we analyze the sets of prime candidates generated at each level and explain the important optimization of their generation through the reduction of the search space. We next describe the steps that follow the generation, for the search of prime numbers. We then perform an in-depth comparative analysis of the results in Section 5, based on several parameters, and, in Section 6, we examine the possible future uses of the algorithm through the use of the GMP library and environments with greater computational resources. Furthermore, we illustrate some possible examples in terms of implementing the algorithm in the blockchain and IoT contexts; finally, we arrive at an important conclusion about the efficiency of the algorithm in Section 7.

Relevant Works and Starting Points

A perfect number is commonly defined as an integer that equals the sum of its proper positive divisors, i.e., the sum of the divisors excluding the number itself. Equivalently, a perfect number is one that is half the sum of all of its positive divisors, i.e., $σ(n) = 2n$, where $σ(n)$ represents the sum of the divisors of n. It is worth highlighting that the first perfect number is 6, since its proper positive divisors are 1, 2, and 3, and their sum is $1 + 2 + 3 = 6$.

Through the work in [17], in [18], we proved that the generation of prime numbers can be seen as a deterministic process. In fact, we found the minimal representation, i.e., the most compact expression, to denote the set of prime numbers in $N$ in terms of the first perfect number.

Let us introduce the sets:

S_{1} = \{s_{1} = 6 k - 5 : k \in N\};

(1)

S_{2} = \{s_{2} = 6 k - 4 : k \in N\};

(2)

S_{3} = \{s_{3} = 6 k - 3 : k \in N\};

(3)

S_{4} = \{s_{4} = 6 k - 2 : k \in N\};

(4)

X = \{x_{k} = 6 k - 1 : k \in N\};

(5)

S_{6} = \{s_{6} = 6 k : k \in N\};

(6)

Y = \{y_{k} = 6 k + 1 : k \in N\}.

(7)

Therefore, we can represent the set of natural numbers $N$ as follows:

N = \{1\} \cup S_{2} \cup S_{3} \cup S_{4} \cup X \cup S_{6} \cup Y

(8)

In fact, $S_{1}$ contains the same elements as Y, except for the first element, which is 1. Moreover, the sets following Y share the same elements as the previous sets from $S_{1}$ to Y (excluding the elements omitted due to the shift mod 6), since their elements are obtained by simply shifting those of the previous sets.

We prove that the set of the prime candidates $\tilde{P}$ can be written as

\tilde{P} = \{2\} \cup \{3\} \cup X \cup Y

(9)

Unfortunately, the sets X and Y also include composite numbers, such as positive integers that are multiples of 5 and others. Consequently, we have introduced selection rules to exclude composite numbers and obtain the complete set of prime numbers.

Let us introduce the following two subsets $X^{(-)} \subset X$ and $Y^{(-)} \subset Y$:

X^{(-)} = \{x_{k_{ij}} \in X : k_{ij} \in R_{ij}^{(1)}, \forall i, j \in N\},

Y^{(-)} = \{y_{k_{ij}} \in Y : k_{ij} \in R_{ij}^{(2)} \text{ or } k_{ij} \in R_{ij}^{(3)}, \forall i, j \in N\},

where

R_{ij}^{(1)} = \{k \in N : k = 6 i j - i + j, \forall i, j \in N\},

R_{ij}^{(2)} = \{k \in N : k = 6 i j + i + j, \forall i, j \in N\},

R_{ij}^{(3)} = \{k \in N : k = 6 i j - i - j, \forall i, j \in N\}.

These index sets correspond to the factorizations $6(6ij - i + j) - 1 = (6i + 1)(6j - 1)$, $6(6ij + i + j) + 1 = (6i + 1)(6j + 1)$, and $6(6ij - i - j) + 1 = (6i - 1)(6j - 1)$.

The natural numbers $x_{k}$ and $y_{k}$ are composite if and only if $x_{k} \in X^{(-)}$ and $y_{k} \in Y^{(-)}$, respectively.

Thus, we obtain the minimal explicit representation for the set of primes:

P = \{2\} \cup \{3\} \cup X^{'} \cup Y^{'}

(10)

where

X^{'} = \{x_{k} \in N : x_{k} = 6 k - 1 \text{ and } k \neq 6 i j - i + j, \forall k, i, j \in N\},

Y^{'} = \{y_{k} \in N : y_{k} = 6 k + 1 \text{ and } k \neq 6 i j + i + j \text{ and } k \neq 6 i j - i - j, \forall k, i, j \in N\}.
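The representation in (10) can be checked numerically. The following Python sketch is an illustration written for this purpose, not the authors' implementation; the function name `primes_by_selection_rules` is hypothetical. It builds the three index sets up to a bound and keeps only the values $6k - 1$ and $6k + 1$ whose index k is not excluded.

```python
def primes_by_selection_rules(n):
    """Primes <= n from P = {2, 3} U X' U Y' (illustrative sketch only)."""
    k_max = (n + 1) // 6                      # largest k with 6k - 1 <= n
    r1, r2, r3 = set(), set(), set()
    for i in range(1, k_max + 1):
        for j in range(1, k_max + 1):
            k3 = 6 * i * j - i - j            # smallest of the three indices
            if k3 > k_max:
                break                         # larger j only increases all three
            r3.add(k3)                        # 6k + 1 = (6i - 1)(6j - 1)
            r1.add(6 * i * j - i + j)         # 6k - 1 = (6i + 1)(6j - 1)
            r2.add(6 * i * j + i + j)         # 6k + 1 = (6i + 1)(6j + 1)
    primes = [p for p in (2, 3) if p <= n]
    for k in range(1, k_max + 1):
        if k not in r1:
            primes.append(6 * k - 1)
        if 6 * k + 1 <= n and k not in r2 and k not in r3:
            primes.append(6 * k + 1)
    return primes


print(primes_by_selection_rules(30))
```

For n = 30, the only excluded index is k = 4 (from i = j = 1 in the third rule), which removes 25 from the candidates.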

In [19], we introduced a novel method to estimate the number of prime candidates less than n, utilizing a multiscale analysis or, alternatively, a tree-based approach.

In [20], a multiscale approach is introduced to progressively improve the computing performance, starting with the sets $6k \mp 1$. It is important to note that the choice of the number 6, or the class of remainders mod 6, is related to the fact that $6 = 2 \cdot 3$, the product of the first two prime numbers. This approach can be iterated by considering $30 = 2 \cdot 3 \cdot 5$, $210 = 2 \cdot 3 \cdot 5 \cdot 7$, and so on. In other words, we can subdivide $N$ into sets such as $30k - α$, $210k - α$, and so on. Among these sets of candidate primes, such as the set $6k - α$ with $α = \mp 1$, some contain predominantly prime numbers. From an initial analysis, it is evident that the set of prime candidates at the first level, i.e., $6k \mp 1$, does not eliminate multiples of 5, while the prime candidates at the second level, i.e., $30k - α$ (where $α$ is a prime found by using $6k \mp 1$ for k so that the result is less than 30), do not eliminate multiples of 7. In other words, while multiples of 5 are mixed with other composites in the set $6k \mp 1$, at the second level, they are confined to two specific sets: $30k - 5$ and $30k - 25$. Therefore, these multiples can be removed by eliminating entire classes. By performing this process iteratively, we obtain a multiscale algorithm capable of generating a pure set of primes without any selection rules or with increasingly selective rules.
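The confinement of the multiples of 5 to two classes can be observed directly. The short Python sketch below is illustrative only (the helper name `classes_mod_30` and the bound 300 are arbitrary choices). It groups the first-level candidates $6k \mp 1$ by their remainder mod 30 and reports which classes contain multiples of 5.

```python
from collections import defaultdict


def classes_mod_30(limit):
    """Group the candidates 6k - 1 and 6k + 1 (k = 1 .. limit // 6) by residue mod 30."""
    groups = defaultdict(list)
    for k in range(1, limit // 6 + 1):
        for c in (6 * k - 1, 6 * k + 1):
            groups[c % 30].append(c)
    return groups


groups = classes_mod_30(300)
classes_with_multiples_of_5 = sorted(
    r for r, members in groups.items() if any(c % 5 == 0 for c in members)
)
print(sorted(groups))                 # the ten classes reached by 6k -/+ 1
print(classes_with_multiples_of_5)    # the two classes to discard
```

The remainders 5 and 25 correspond to the sets $30k - 25$ and $30k - 5$, respectively; every element of those two classes is a multiple of 5, so both can be dropped as a whole.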

2. Core Concepts of Multiscale Sieve

Briefly, in [20], we showed how each prime number can be written as the difference between the product of the first m primes and a prime obtained in the previous level of gridding:

p_{i j} = (\prod_{j = 1}^{m} p_{j}) k - p_{i},

(11)

where

$k \in N$ is a count index;
$p_{i}$ is a prime number obtained at the level $m - 1$ ;
and j is the level of decomposition or the multiresolution step.

Therefore, through this approach, it is possible to automate the process of generating prime numbers, which underlies the sieve algorithm. Given a number $n \in N$ as input, the sieve will generate, through the multiscale approach described above, all prime numbers up to and including n. If the last number generated is n, then the number given as input is a prime number; otherwise, it is a composite number, and the algorithm will return the prime number closest to n. In this way, the algorithm acts as both a primality test and a prime generator.

Memory Management

The first step of the algorithm is the allocation of the memory required to store only candidate prime numbers that are useful for multiscale generation, up to the nearest prime number to the n given as input. The set of candidate prime numbers generated at each level is called a multiscale prime candidate set (see Section 4).

After taking the input n in the correct formatting, the necessary amount of memory to be allocated is calculated, which is equal to the number of candidate prime numbers that make up the multiscale set. This can be calculated exactly through Formula (23) explained below, which uses prime numbers. The calculation is carried out at each level and reallocations are made accordingly, so that no more memory is allocated than is truly needed, thus avoiding effects on the performance.

3. Overview of the Algorithm

After allocating the necessary memory, we consider the key elements and steps of the algorithm. Let us start with Formula (11).

$(\prod_{j = 1}^{m} p_{j})$

(12)

is the product of the first m primes, which changes at each level and serves as the basis for the generation of the multiscale prime candidate set. At each level, the addition of a new prime number to the product expands the set of candidates and, at the same time, automatically eliminates numbers that are multiples of this new prime, as we can see in Table 2. The process takes place at several levels, starting from an initial base that considers the first two prime numbers $p_{1} = 2$ and $p_{2} = 3$, with a base product of 6 that eliminates all multiples of 2 and 3. Subsequently, at each level, a new prime number is introduced, which is multiplied by the product of the previous level, allowing us to systematically exclude multiples of this number, thus narrowing the field of possible numbers and facilitating the search for actual primes. For example, at the second level, with the addition of $p_{3} = 5$, the new product becomes $6 \cdot 5 = 30$, thus allowing the elimination of the multiples of 5.

Definition 1.

1. We call $p_{m + 1}$ the prime $p_{j}$ with $j = m + 1$, that is, the prime number (already obtained at the previous levels) that serves as the new factor of the base $\prod p_{j}$ at the next level.
2. At each level, k takes on all values in the range of 1 to $k_{\max}$, excluding the lower or upper limit. The latter ($k_{\max}$) varies from level to level because it assumes the value of $p_{m + 1}$.
3. From the product of $\prod p_{j}$ and k, we subtract (or, to it, we add) each prime number $p_{i}$ belonging to the multiscale prime candidate sets of the previous levels, starting from the prime number $p_{m + 1}$.

With the steps described, we have all of the elements available for the generation of multiscale prime candidate sets.

Table 2 illustrates how, at each level, the number of candidate classes obtained, $Λ$, increases thanks to the multiscale approach, but some of these classes, $Ψ$, are eliminated because they contain numbers that are multiples of the newly introduced prime numbers. For example, at the second level, with the product $2 \cdot 3 \cdot 5 = 30$, there are 10 total classes, but two classes are eliminated because they include numbers that are multiples of 5, leaving eight remaining classes, $Γ$.

Here, we give a pseudocode example of the generation Algorithm 1 and a numerical Example 1 to understand the procedure.

Example 1.

Figure 1 represents the multiscale prime candidate set generated up to the third level ($m = 3$), the numbers eliminated to reduce the research space (see Section 4.1), and the values assumed by the variables.

Algorithm 1 Pseudocode of the Generation Algorithm

Require: n: An integer representing the limit for prime calculations
while sets[pi] < n do
    pj ← pj × sets[pm++]
    k_max ← sets[pm]
    last_pi ← pi − 1    ▹ save the upper limit of the range of the current set

    for k = 1 to k_max do
        sets[pi++] ← (pj × k) − sets[0]    ▹ calculate primes on the constant sets ±1
        sets[pi++] ← (pj × k) + sets[0]

        for j = pm to last_pi do
            sets[pi++] ← (pj × k) + sets[j]
            j ← j + 1
        end for
        k ← k + 1
    end for
end while
Ensure: A sets array containing all multiscale prime candidate sets up to n
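To make the level-by-level generation concrete, here is a minimal Python sketch of the same idea. It is not a transcription of Algorithm 1 or of the authors' C code: the function name `multiscale_candidates` is hypothetical, the candidates are kept as residue classes in a list, and each class is lifted with the addition form (base · k + r, with k starting at 0) instead of the subtraction in Formula (11). The two forms produce the same classes, because r is coprime to the base exactly when base − r is.

```python
def multiscale_candidates(n):
    """Return the sorted prime candidates <= n, for n >= 3 (illustrative sketch)."""
    base, residues, base_primes = 2, [1], [2]
    while base <= n:
        # Next prime p_{m+1}: the smallest residue above 1 (or 3 when base == 2).
        p = residues[1] if len(residues) > 1 else base + 1
        # Expansion: copy every class into each of the p blocks of width `base`.
        lifted = [base * k + r for k in range(p) for r in residues]
        # Reduction: drop multiples of p, and anything beyond the limit n.
        residues = [c for c in lifted if c % p != 0 and c <= n]
        base *= p
        base_primes.append(p)
    # Candidates are the base primes plus every surviving residue above 1.
    return base_primes + [r for r in residues if r > 1]


print(multiscale_candidates(100))
```

For n = 100 the loop stops at base 210, and every surviving candidate is already prime because the smallest composite coprime to 210 is 121. For larger n, composites such as 121 or 169 remain in the list and must be removed by the primality test of Section 4.2.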

4. Multiscale Prime Candidate Sets

For the algorithm to work properly and to employ further optimization, it is essential to analyze and understand how to handle sets. From [20], we can understand the relations between $Λ$, $Ψ$, and $Γ$ given in terms of others and in terms of themselves at different scales.

Starting from

Λ_{i} = p_{i} Γ_{i - 1}, \forall i \in N

(13)

and imposing a substitution through the following expression

Γ_{i} = p_{i} Γ_{i - 1} - Γ_{i - 1} = (p_{i} - 1) Γ_{i - 1}, \forall i \in N

(14)

we can generalize the relation between $\prod p_{i}$ and the classes $Λ$.

Now, let us show the evaluation step by step.

First, substituting the expression of $Γ_{i - 1}$ given by (14) into $Λ_{i}$,

Λ_{i} = p_{i} Γ_{i - 1}

(15)

= p_{i} ((p_{i - 1} - 1) Γ_{i - 2})

(16)

= p_{i} (p_{i - 1} - 1) Γ_{i - 2}

(17)

Next, continue substituting iteratively:

Λ_{i} = p_{i} (p_{i - 1} - 1) ((p_{i - 2} - 1) Γ_{i - 3})

(18)

= p_{i} (p_{i - 1} - 1) (p_{i - 2} - 1) Γ_{i - 3}

(19)

Continuing down to $Γ_{0}$, the generalized form is

Λ_{i} = p_{i} (p_{i - 1} - 1) (p_{i - 2} - 1) \dots (p_{1} - 1) Γ_{0}

(20)

Assuming $Γ_{0} = 1$ for simplicity, we have

Λ_{i} = p_{i} (p_{i - 1} - 1) (p_{i - 2} - 1) \dots (p_{1} - 1)

(21)

Rewriting this in terms of the m level,

Λ_{m} = p_{m} (p_{m - 1} - 1) (p_{m - 2} - 1) \dots (p_{1} - 1)

(22)

Then, we obtain the following result.

Theorem 1.

$Λ_{m}$ indicates the multiscale prime candidate sets generated at each m level before the reduction process:

Λ_{m} = (\prod_{i = 1}^{m - 1} (p_{i} - 1)) p_{m}

(23)

Proof.

The thesis is directly obtained by applying the induction principle, as the property holds for a base case and is preserved for subsequent steps. □

The analysis in Figure 2 reveals that $Λ_{m}$ increases almost linearly with $\prod p_{i}$ on a log-log scale, confirming the exponential growth model. The high coefficient of determination ($R^{2} = 0.99991$) indicates an excellent fit for the model.

4.1. Reducing the Research Space

To reduce the research space at each level during generation, we can mark or eliminate from the sets the prime candidates that are multiples of $p_{m + 1}$, starting from the prime candidate following $p_{m + 1}$. It is important to remember that, when $\prod p_{i} \in \{2, 6\}$, the only two values that make up the sets are $\{+1, -1\}$, which also belong to every set.

Starting from

Γ_{i} = p_{i} Γ_{i - 1} - Γ_{i - 1} = (p_{i} - 1) Γ_{i - 1}, \forall i \in N

(24)

we can generalize the relation between $\prod p_{i}$ and the classes $Γ$.

Now, let us show the evaluation step by step.

Substituting iteratively:

Γ_{i} = (p_{i} - 1) ((p_{i - 1} - 1) Γ_{i - 2})

(25)

= (p_{i} - 1) (p_{i - 1} - 1) Γ_{i - 2}

(26)

Continuing down to $Γ_{0}$, the generalized form is

Γ_{i} = (p_{i} - 1) (p_{i - 1} - 1) \dots (p_{1} - 1) Γ_{0}

(27)

Assuming $Γ_{0} = 1$ for simplicity, we have

Γ_{i} = (p_{i} - 1) (p_{i - 1} - 1) \dots (p_{1} - 1)

(28)

Rewriting this in terms of the m level,

Γ_{m} = (p_{m} - 1) (p_{m - 1} - 1) \dots (p_{1} - 1)

(29)

Then, we obtain the following result.

Theorem 2.

$Γ_{m}$ indicates the multiscale prime candidate sets remaining at each m level after the reduction process:

Γ_{m} = \prod_{i = 1}^{m} (p_{i} - 1)

(30)

Proof.

The thesis is directly obtained by applying the induction principle, as the property holds for a base case and is preserved for subsequent steps. □
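As a worked instance of Formulas (23) and (30), consider the level $m = 4$, where $p_{4} = 7$ and the base is $2 \cdot 3 \cdot 5 \cdot 7 = 210$. The relation $Ψ_{m} = Λ_{m} - Γ_{m}$ used in the last line is an assumption inferred from the description of Table 2 (eliminated classes are those generated but not retained), not a formula stated explicitly in the text.

```latex
\begin{aligned}
\Gamma_{3} &= (2 - 1)(3 - 1)(5 - 1) = 8, \\
\Lambda_{4} &= p_{4}\,\Gamma_{3} = 7 \cdot 8 = 56, \\
\Gamma_{4} &= (p_{4} - 1)\,\Gamma_{3} = 6 \cdot 8 = 48, \\
\Psi_{4} &= \Lambda_{4} - \Gamma_{4} = 8 .
\end{aligned}
```

The same computation at $m = 3$ gives $Λ_{3} = 10$, $Γ_{3} = 8$, and two eliminated classes, which agrees with the example quoted for the base 30.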

The analysis in Figure 3 reveals that $Γ_{m}$ increases almost linearly with $\prod p_{i}$ on a log-log scale, confirming the exponential growth model, but with a lower rate of growth compared to the initial classes $Λ_{m}$, due to the reduction process. The high coefficient of determination ($R^{2} = 0.9997$) indicates an excellent fit for the model.

Furthermore, we visualize in Figure 4 the correlation between the classes $Λ_{m}$ generated at each level and those eliminated, $Ψ_{m}$, by the reduction process.

We can deduce that, in absolute terms, the classes $Λ_{m}$ grow faster than the eliminated classes $Ψ_{m}$ as $\prod p_{i}$ increases at each level. The eliminated classes $Ψ_{m}$ also grow exponentially, but at a lower growth rate. This leads to an overall efficient reduction in the research space as the levels increase. Thus, we can deduce that, instead of considering every odd number, i.e., the partition $\frac{N}{2}$, the domain is divided into partitions expressed in the form $\frac{Γ_{m}}{\prod p_{i}}$. Each partition, such as $\frac{2}{6} N$, $\frac{8}{30} N$, $\frac{48}{210} N$, and so on, selects a specific portion of the search space, thus reducing the number of elements to be examined. This approach allows large partitions of numbers to be excluded directly, substantially reducing the number of iterations required and, consequently, the computational time. In addition, at each level, all candidates divisible by a prime factor of $\prod p_{i}$ are excluded from generation. The result is the realization of a numerical accelerator for the generation of prime numbers.
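The retained fraction of the search space at each level can be tabulated with exact arithmetic. The sketch below is illustrative (the function name `surviving_fraction` is hypothetical) and evaluates $\frac{Γ_{m}}{\prod p_{i}}$ for the first few bases.

```python
from fractions import Fraction
from math import prod


def surviving_fraction(primes):
    """Gamma_m divided by the product of the given base primes, as an exact fraction."""
    return Fraction(prod(p - 1 for p in primes), prod(primes))


base_primes = [2, 3, 5, 7, 11]
for m in range(1, len(base_primes) + 1):
    level = base_primes[:m]
    fraction = surviving_fraction(level)
    print(prod(level), fraction, round(float(fraction), 4))
```

The fractions for the bases 6, 30, and 210 reduce to 1/3, 4/15, and 8/35, i.e., the values 2/6, 8/30, and 48/210 quoted above. Each additional prime p multiplies the retained fraction by (p − 1)/p.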

4.2. Prime Number Finding

The generation process stops when a prime candidate greater than the given input n is produced. Then, through an iterative loop, starting from the last generated prime candidate, we verify primality through a simple and efficient test. Since we have already eliminated all multiples of the factors of $\prod p_{i}$ during generation, and thanks to the process of reducing the research space by marking or eliminating non-primes, we have all of the primes needed to perform the primality test. For these reasons, we start the primality test with the prime $p_{m + 1}$ and iterate a few instructions for as long as the prime $p_{i}$ does not exceed the square root of the prime candidate.

At each iteration, we check whether the prime candidate is a multiple of $p_{i}$:

if it is, we move to the next prime candidate and restart the primality test;
otherwise, we continue iterating with the next $p_{i}$.

The algorithm terminates when the closest prime number to the given input is found. The number of iterations before finding the closest prime number to the input among the prime candidates generated is very low.
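The test described in this section can be sketched as plain trial division restricted to the primes from $p_{m + 1}$ onward. The Python code below is an illustration under stated assumptions, not the authors' C implementation: the function names are hypothetical, "closest prime" is interpreted as the largest prime not exceeding n, and the candidate list for the base 210 is rebuilt with `gcd` only to keep the example self-contained.

```python
from math import gcd


def passes_trial_division(candidate, primes):
    """True if no prime p in `primes` with p * p <= candidate divides the candidate."""
    for p in primes:
        if p * p > candidate:
            break
        if candidate % p == 0:
            return False
    return True


def largest_prime_not_exceeding(n, candidates, primes):
    """Scan the candidates from the last one downward and return the first prime <= n."""
    for c in reversed(candidates):
        if c <= n and passes_trial_division(c, primes):
            return c
    return None


# Candidates below 200 at base 210: nothing here is divisible by 2, 3, 5 or 7.
candidates = [c for c in range(11, 200) if gcd(c, 210) == 1]
# Trial division starts at p_{m+1} = 11; 11 and 13 suffice because 17 * 17 > 200.
primes_from_p_m_plus_1 = [11, 13]

print(largest_prime_not_exceeding(190, candidates, primes_from_p_m_plus_1))
```

With n = 190, the scan first meets 187 = 11 · 17, rejects it on the first division, and then accepts 181 after two divisions.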

Example 2.

For values $n \leq 10^{10}$, the average number of iterations is 3.

5. Benchmark Setup

The multiscale sieve algorithm was implemented in C and run on an M2 chip with a 64-bit ARM architecture and 16 GB RAM on a single thread at 3.5 GHz. The algorithm was compiled in an optimized way; through the C library <time.h>, and with the function clock, we measured the total time that the CPU required in executing the different inputs. Meanwhile, through the C library <sys/resource.h> and with the function getrusage, we measured the maximum size of the resident set, which represented the maximum amount of physical memory used to execute the algorithm.
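The measurements above were taken in C with `clock` and `getrusage`. As a rough analogue, and only as an assumption about how such a harness might look, the following Python sketch records CPU time with `time.process_time` and the peak resident set size with `resource.getrusage`. The helper name `measure` is hypothetical, and the `resource` module is available on Unix-like systems only.

```python
import resource
import time


def measure(func, *args):
    """Run func(*args) and return (result, CPU seconds, peak resident set size)."""
    start = time.process_time()               # CPU time of this process, not wall time
    result = func(*args)
    cpu_seconds = time.process_time() - start
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, cpu_seconds, peak_rss


result, cpu_seconds, peak_rss = measure(sum, range(1_000_000))
print(result, cpu_seconds, peak_rss)
```

Note that the peak resident set size covers the whole process lifetime, so separate runs are needed to compare the memory footprint of different inputs.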

In order to perform a comparative analysis, we used ancient and modern optimized versions of the sieve algorithms known to date, implemented in C. All sieve tests were executed under the same conditions as above.

Graphs and Benchmark Analysis

As can be seen from the following Table 3, less optimized algorithms or partial versions, such as the Segmented Sieve of Eratosthenes, Wheel Factorization, the Sieve of Sorenson, and many others, have been excluded, but optimized versions consisting of the union of several sieves have been included in order to make the comparison more competitive. All values are expressed in seconds (s).

Each algorithm shown in the tables, as mentioned above, is an optimized version; it is clearly possible in some cases to perform further optimizations, which is due to the fact that there is no standard implementation of these algorithms.

From an initial, purely computational analysis, we can see that the multiscale sieve is the most efficient for both small to medium and large inputs. Next, we graphically visualize in Figure 5 the growth in the execution time of the different sieving algorithms for a more complete analysis.

It can be seen immediately that the execution time of the multiscale sieve grows exponentially, as it does for the other algorithms, while remaining lower on every input; more importantly, its growth rate is significantly lower for very large inputs. This suggests that, compared to the other algorithms, the multiscale sieve becomes increasingly advantageous as the inputs grow.

Furthermore, we can see in Table 4 how much faster the multiscale sieve is at a percentage level compared to each algorithm and input size.

It is also crucial to analyze the memory used by the algorithm during execution, which can impact the efficiency. The following Table 5 shows all maximum values of the resident memory used for each process execution, which includes all physical memory pages that the process currently has in RAM. This includes the stack memory, heap memory, BSS, data segments and code segments. All values are expressed in megabytes (MB).

The Segmented Sieve of Eratosthenes is known for its distinct memory efficiency, along with other similar variants, such as the bitwise sieve. In fact, even in this comparison, it proves to be one of the most memory-efficient at every input size.

Next, we graphically visualize in Figure 6 the growth in the memory usage of the different sieving algorithms for a more complete analysis.

We can state that, apart from the segmented sieve, which is always the most efficient in terms of memory consumption, the multiscale sieve ranks, on average, close to the modern sieving algorithms such as Atkin or the linear sieve, with almost similar consumption.

6. Applications: GMP for Big Numbers, IoT and Blockchain

A version of the algorithm has been implemented with the GMP library, showing efficiency very similar to that analyzed above; it allows us to work with high precision on numbers that cannot be represented in the range $[0, 2^{64} - 1]$. This implementation exploits the advanced optimizations and capabilities of GMP to rigorously handle numbers with thousands or even millions of digits, which is essential for applications requiring extremely high precision, such as cryptography, e.g., RSA, XTR and elliptic curve cryptosystems (ECC); numerical analysis; and scientific computing.

For instance, RSA is a widely used public key encryption system that underpins much of the secure communication on the internet today. The security of RSA depends on the mathematical challenge of factoring large composite numbers, specifically those that are the product of two large prime numbers chosen randomly and independently of each other, denoted as $N = p \cdot q$. The difficulty in breaking RSA lies in the fact that determining p and q from N is computationally demanding and time-consuming with the current algorithms and hardware. The multiscale sieve might offer a more efficient way to approach the factorization problem; it could significantly reduce the time needed to decompose N into its prime factors. Unlike traditional algorithms, such as the quadratic sieve, Pollard's rho [23] or the general number field sieve [24], which operate on fixed scales, the multiscale sieve might employ a hierarchical or layered approach to examine N and might uncover structural weaknesses or patterns that make factoring easier. The implications of this are profound. Firstly, if the sieve demonstrates superior performance in factorization, it could challenge the current assumptions about the security of RSA. This could lead to the rethinking of what constitutes an adequate key length in RSA encryption, as, to date, keys are typically 2048 bits or longer to ensure security. If factorization becomes easier, much longer keys might be necessary, increasing the computational load for encryption and decryption processes. Secondly, it could reveal vulnerabilities in the ways in which the prime numbers p and q are generated. If certain properties of these prime numbers make them more susceptible to factoring when analyzed with a multiscale approach, it could reveal flaws in the random number generation processes used in cryptographic applications. In a broader sense, the application of the multiscale sieve could have far-reaching consequences beyond RSA alone; many cryptographic protocols are based on the assumption that certain mathematical problems, such as factoring, are computationally infeasible at large scales. Furthermore, the algorithm can be run for such applications on massive arrays of graphic processing units (GPUs) or even simple supercomputers (SSCs).

It would be interesting to integrate our algorithm into a system such as MuReQua [25], thus involving more users in the search for new prime numbers, with a coin reward system. Integration takes place through the use of an oracle, which functions as a tool for the handling of information and the storage of the prime numbers found. It is also responsible for checking the correctness of the proposed prime numbers and sorting them by creating a structured and easily accessible record. This sorting function is particularly important to ensure the optimization of future searches within the network, facilitating the retrieval and analysis of data by other users or applications integrated in the system. Due to the quantum and multiscale nature of the MuReQua blockchain, the validation and storage process takes place with high efficiency, ensuring data security and integrity. The implementation of this feature not only encourages user participation through gamification, but also contributes to the strengthening of the network itself.

With the following Figure 7, we show how our system can be designed. The two main components are the oracle and a function that allows us to calculate and distribute a coin reward system through the MuReQua system. The oracle performs fundamental work, as described above; in particular, it systematically provides ranges of multiscale prime candidate sets to the various users, which are useful for the search and generation of new primes. The distribution and reward system is explained later.

The coin system is based on the asymptotic distribution of prime numbers described by the following equation:

C(n) = π(n) \approx \frac{n}{\log n}

(31)

where $C(n)$ represents the total number of coins issued as n changes.

After this, the coins are distributed and separated into four parts as follows:

$\frac{C (n)}{2}$ is the number of coins distributed among users who find prime numbers;
$\frac{C (n)}{4}$ is the number of coins distributed among users who find composite numbers;
$\frac{C (n)}{8}$ is the number of coins that servers earn by eliminating intrinsically composite numbers through the implementation of the algorithm;
$\frac{C (n)}{8}$ is the number of coins that the oracle earns by storing and sorting all of the prime numbers found by users.
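The split above can be sketched numerically. The following Python code is a minimal illustration, not part of MuReQua: the function name `coin_split` and the dictionary keys are hypothetical, the issuance uses the approximation $n / \log n$ from Equation (31) with the natural logarithm, and the shares are kept as floating-point values because the text does not specify a rounding rule.

```python
from math import log


def coin_split(n):
    """Split C(n) ~ n / log(n) into the four shares 1/2, 1/4, 1/8 and 1/8 (n > 1)."""
    total = n / log(n)
    return {
        "prime_finders": total / 2,       # users who find prime numbers
        "composite_finders": total / 4,   # users who find composite numbers
        "servers": total / 8,             # servers eliminating composite numbers
        "oracle": total / 8,              # oracle storing and sorting the primes
    }


shares = coin_split(30030)                # n taken as the base 2 * 3 * 5 * 7 * 11 * 13
print(shares)
```

The four shares always add up to the full issuance, since 1/2 + 1/4 + 1/8 + 1/8 = 1.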

Table 6 shows the number of coins issued and distributed at each level. In addition, we graphically visualize in Figure 8 the distribution of the coins and their growth.

From the graph comparing the growth and distribution of the coins with respect to $\prod p_{i}$, we observe that the coins generated at each level follow a sublinear growth pattern, as indicated by the function $\frac{n}{\log n}$. In particular, the curve gradually moves away from the linear growth line, demonstrating that the system distributes coins at a much more moderate rate than linear or exponential growth. This behavior reflects the controlled nature of the issuing process.
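The sublinear behavior can be checked numerically: the ratio between $\frac{n}{\log n}$ and linear growth $n$ simplifies to $\frac{1}{\log n}$, which shrinks as $n$ grows. The short sketch below evaluates this ratio at the products of primes listed in Table 6, assuming the natural logarithm; the names `PRIMORIALS` and `growth_ratio` are introduced here only for illustration.

```python
import math

# Products of the first primes, as listed in the first column of Table 6.
PRIMORIALS = [6, 30, 210, 2310, 30030, 510510, 9699690]


def growth_ratio(n: int) -> float:
    """Ratio of n / ln(n) to linear growth n, which simplifies to 1 / ln(n)."""
    return (n / math.log(n)) / n


ratios = [growth_ratio(n) for n in PRIMORIALS]

for n, r in zip(PRIMORIALS, ratios):
    print(f"{n:>9}  {r:.4f}")
```

The printed ratios decrease from one level to the next, which is the numerical counterpart of the curve moving away from the linear growth line.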

The results of this section can also be applied to IoT for gaming. With an infrastructure based on low-cost devices, such as a Raspberry Pi, an Arduino or similar, we can realize different reward mechanisms for users who are involved in gaming or who offer services to a specific community, such as an energy community, a water community, consultancy, the promotion of third-party products or services, and so on. In an IoT-based infrastructure, devices can be considered as agents working collaboratively, implementing the concept of swarm intelligence. Each device or node in the network can contribute to the decentralized generation of prime numbers, making local decisions based on real-time data and interactions with neighboring devices, without a centralized control system. In a gaming community built on IoT principles, these devices could autonomously optimize not only the distribution of resources, such as energy consumption or the processing of complex calculations, but also the generation of prime numbers through distributed algorithms. The same devices could act as the basic nodes for computational quantum key distribution, as in MuReQua, to increase the info-security of the infrastructure. Data sharing and collaboration between nodes would accelerate the generation of prime numbers by distributing the computational load efficiently. Finally, this approach favors the creation of a self-organizing and resilient infrastructure, capable of dynamically adapting to the network's needs and distributing the benefits fairly among its members.
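To make the idea of distributing the computational load more concrete, the sketch below splits a search interval into chunks and lets each "node" sieve its own chunk independently. Note the assumptions: the per-node work uses an ordinary segmented sieve of Eratosthenes as a stand-in, not the multiscale sieve itself; the nodes are simulated by a sequential loop rather than real devices; and all function names are hypothetical.

```python
import math


def base_primes(limit: int) -> list[int]:
    """Primes up to `limit` (inclusive) via a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def node_sieve(lo: int, hi: int, small: list[int]) -> list[int]:
    """Work done by one node: primes p with lo <= p < hi."""
    lo = max(lo, 2)
    if hi <= lo:
        return []
    flags = bytearray([1]) * (hi - lo)
    for p in small:
        if p * p >= hi:
            break
        # first multiple of p inside the chunk, never p itself
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi, p):
            flags[m - lo] = 0
    return [lo + i for i, f in enumerate(flags) if f]


def swarm_primes(limit: int, nodes: int) -> list[int]:
    """Split [0, limit) into `nodes` chunks and merge the per-node results."""
    small = base_primes(math.isqrt(limit))  # shared once with every node
    step = -(-limit // nodes)  # ceiling division
    found: list[int] = []
    for k in range(nodes):  # each iteration stands in for one device
        found.extend(node_sieve(k * step, min((k + 1) * step, limit), small))
    return found
```

Each chunk only needs the small list of base primes, so a node can work without exchanging data with its neighbors until the results are merged; on real hardware the loop in `swarm_primes` would be replaced by whatever messaging layer the network provides.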

7. Conclusions

In conclusion, the benchmarks indicate that the multiscale sieve is significantly more efficient than the modern sieving algorithms known to date, especially with large inputs: at an input size of $10^{10}$, it ran about 4.7 times faster than the fastest competitor in Table 3, the segmented sieve of Eratosthenes with wheel factorization. On the other hand, without further optimization, its memory consumption is quite similar to that of modern algorithms (Table 5), excluding versions specifically optimized for low memory utilization (segmented and bitwise sieves).

Furthermore, the implementation of the multiscale sieve through the GMP library represents an important step forward in the field of large number factorization. The efficiency demonstrated by this approach could have a significant impact on cryptographic security: any reduction in factorization times could undermine the robustness of current systems and reveal vulnerabilities in the prime number generation methods used for key construction, raising questions about the security of many cryptographic protocols. Another important area for the integration of this algorithm is decentralized platforms such as MuReQua, where blockchain-based validation systems allow us not only to harness the computational power of a large network of users but also to incentivize participation through reward mechanisms. This approach enhances the efficiency and security of the system and creates opportunities for use in broader contexts, such as IoT with low-cost devices. Ultimately, the adoption of these technologies could transform the cryptography and digital computing landscape, with profound implications for digital security and decentralized collaboration.

Although the results obtained are interesting with respect to the state of the art, they also point to several optimizations that can be implemented in future work:

Cache optimization: aligning the data in memory to make the best use of cache lines.
Multithreading: distributing and parallelizing the workload equally over several threads.
Bitmasking: using a bitmask or similar data structures to mark indexes more efficiently during the research space reduction process.
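As an illustration of the bitmasking idea, the sketch below marks composites with one bit per odd number instead of one byte or word per integer. It is applied here to a plain sieve of Eratosthenes, which is an assumption made to keep the example self-contained; how the same marking would be wired into the multiscale sieve is left open, and the name `bit_sieve` is hypothetical.

```python
def bit_sieve(limit: int) -> list[int]:
    """Primes below `limit`, storing one bit per odd number."""
    if limit <= 2:
        return []
    half = limit // 2  # index i represents the odd number 2*i + 1
    bits = bytearray((half + 7) // 8)  # bit = 0 means "not marked composite"
    i = 1
    while (2 * i + 1) ** 2 < limit:
        if not bits[i >> 3] & (1 << (i & 7)):
            p = 2 * i + 1
            # a step of p in index space is a step of 2p among odd numbers
            for j in range((p * p) // 2, half, p):
                bits[j >> 3] |= 1 << (j & 7)
        i += 1
    return [2] + [
        2 * k + 1
        for k in range(1, half)
        if not bits[k >> 3] & (1 << (k & 7))
    ]
```

Compared with a byte-per-number array, this layout uses roughly one sixteenth of the memory (half the numbers, one eighth of the storage each), at the cost of a shift and a mask on every access.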

Author Contributions

Conceptualization, G.I., E.B. and C.G.; Writing—review & editing, G.I., E.B. and C.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bombieri, E. Problems of the Millennium: The Riemann Hypothesis; Clay Mathematics Institute: Cambridge, MA, USA, 2000.
2. Granville, A. Harald Cramér and the distribution of prime numbers. Presented at the Cramér Symposium, Stockholm, Sweden, 24 September 1993.
3. Du Sautoy, M. The Music of the Primes; RCS Libri: Milano, Italy, 2003.
4. Connes, A. Trace formula in non-commutative geometry and the zeros of the Riemann zeta function. Sel. Math. 1999, 5, 29–106.
5. Montgomery, H.L. Distribution of the zeros of the Riemann zeta function. In Proceedings of the International Congress of Mathematicians, Vancouver, BC, Canada, 21–29 August 1974; Volume I, pp. 379–381.
6. Odlyzko, A.M. Supercomputers and the Riemann Zeta Function, Supercomputing 89: Supercomputing Structures and Computations. In Proceedings of the 4th International Conference on Supercomputing; Kartashev, L.P., Kartashev, S.I., Eds.; International Supercomputing Institute: St. Petersburg, FL, USA, 1989; pp. 348–352.
7. Rudnick, Z.; Sarnak, P. Zeros of principal L-functions and random matrix theory. Duke Math. J. 1996, 82, 269–322.
8. Abd-Elnaby, M.; El-Baz, D. Development of the Sieve of Eratosthenes and Proof of the Sieve of Sundaram. arXiv 2021, arXiv:2102.06653.
9. Das, A.; Madhavan, C.E.V. Performance Comparison of Linear Sieve and Cubic Sieve Algorithms for Discrete Logarithms over Prime Fields. In Algorithms and Computation, ISAAC 1999; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1999; Volume 1741, pp. 30–39.
10. Atkin, A.O.L.; Bernstein, D.J. Prime sieves using binary quadratic forms. Math. Comput. 2004, 73, 1023–1030.
11. Hartman, S.; Sorenson, J.P. Reducing the Space Used by the Sieve of Eratosthenes When Factoring. arXiv 2024, arXiv:2406.09150.
12. Agrawal, M.; Kayal, N.; Saxena, N. PRIMES is in P. Ann. Math. 2004, 160, 781–793.
13. Rabin, M.O. Probabilistic algorithm for testing primality. J. Number Theory 1980, 12, 128–138.
14. Solovay, R.; Strassen, V. A fast Monte-Carlo test for primality. SIAM J. Comput. 1977, 6, 84–86.
15. Baillie, R.; Wagstaff, S.S. Lucas pseudoprimes. Math. Comput. 1980, 35, 1391–1417.
16. Lucas, E. Théorie des Fonctions Numériques Simplement Périodiques. Am. J. Math. 1878, 1, 184–196.
17. Iovane, G. The distribution of prime numbers: The solution comes from dynamical processes and genetic algorithms. Chaos Solitons Fractals 2008, 37, 23–42.
18. Iovane, G. The set of prime numbers: Symmetries and supersymmetries of selection rules and asymptotic behaviours. Chaos Solitons Fractals 2008, 37, 950–961.
19. Iovane, G. The set of prime numbers: Multiscale analysis and numeric accelerators. Chaos Solitons Fractals 2009, 41, 1953–1965.
20. Iovane, G. The set of prime numbers: Multifractals and multiscale analysis. Chaos Solitons Fractals 2009, 42, 1945–1958.
21. Tawalbeh, L.; Muheidat, F.; Tawalbeh, M.; Quwaider, M. IoT Privacy and Security: Challenges and Solutions. Appl. Sci. 2020, 10, 4102.
22. Dhar, S.; Khare, A.; Dwivedi, A.D.; Singh, R. Securing IoT devices: A novel approach using blockchain and quantum cryptography. Internet Things 2024, 25, 101019.
23. Li, Z.; Gasarch, W. An Empirical Comparison of the Quadratic Sieve Factoring Algorithm and the Pollard Rho Factoring Algorithm. arXiv 2021, arXiv:2111.02967.
24. Wang, Q.; Fan, X.; Zang, H.; Wang, Y. The Space Complexity Analysis in the General Number Field Sieve Integer Factorization. Theor. Comput. Sci. 2016, 630, 76–94.
25. Iovane, G. MuReQua Chain: Multiscale Relativistic Quantum Blockchain. IEEE Access 2021, 9, 39827–39838.

Figure 1. Multiscale prime candidate set generated up to the third level, where, in dark grey, we have marked the eliminated composite numbers, while, in light grey, we have marked the composite numbers not eliminated at the current level as they are useful for the generation of subsequent sets.

Figure 2. Relationship between $Λ_{m}$ and the product of primes $\prod p_{i}$.

Figure 3. Relationship between $Γ_{m}$ and the product of primes $\prod p_{i}$.

Figure 4. Relationship between $Λ_{m}$ and $Ψ_{m}$ at each level.

Figure 5. Growth in execution time in relation to input size.

Figure 6. Growth in memory usage in relation to input size.

Figure 7. System architecture with MuReQua integration.

Figure 8. Growth and distribution of coins at each level.

Table 1. Overview of some of the deterministic sieving algorithms.

Name of Sieve	Time Complexity	Space Complexity
Sieve of Eratosthenes	$O (n log log n)$	$O (n)$
Linear Sieve	$O (n)$	$O (n)$
Sieve of Sundaram	$O (n log n)$	$O (n)$
Sieve of Atkin	$O (n / log log n)$	$O (\sqrt{n})$
Sieve of Sorenson	$O (n)$	$O (n / log n)$
Segmented Sieve of Eratosthenes with Wheel Factorization	$O (n / log log n)$	$O (\sqrt{n})$

Table 2. Overview of the reduction in prime candidate classes.

$\prod p_{i}$	# Classes $Λ$	Deleted Classes $Ψ$	Remaining Classes $Γ$	Eliminated Multiple of Ratio
$2 \cdot 3$	2	0	2	3
$2 \cdot 3 \cdot 5$	10	2	8	5
$2 \cdot \dots \cdot 7$	56	8	48	7
$2 \cdot \dots \cdot 11$	528	48	480	11
$2 \cdot \dots \cdot 13$	6240	480	5760	13
$2 \cdot \dots \cdot 17$	97,920	5760	92,160	17
$2 \cdot \dots \cdot 19$	1,751,040	92,160	1,658,880	19
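The rows of Table 2 follow a simple pattern that can be reproduced with a short script. The recurrence used below (each level generates $Γ_{m-1} \cdot p_{m}$ classes, deletes $Γ_{m-1}$ of them and keeps the rest) is inferred here from the numbers in the table, not quoted from the text, so it should be read as an assumption; the function name `candidate_classes` is likewise hypothetical.

```python
def candidate_classes(primes):
    """Rows (product, Lambda, Psi, Gamma), seeded from the first row of Table 2."""
    product, gamma = 6, 2  # first row: 2*3 with Lambda = 2, Psi = 0, Gamma = 2
    rows = [(product, 2, 0, gamma)]
    for p in primes:
        lam = gamma * p    # classes generated at the new level
        psi = gamma        # classes deleted at this level
        gamma = lam - psi  # classes remaining for the next level
        product *= p
        rows.append((product, lam, psi, gamma))
    return rows


rows = candidate_classes([5, 7, 11, 13, 17, 19])

for product, lam, psi, gamma in rows:
    print(f"{product:>9}  {lam:>9}  {psi:>6}  {gamma:>9}")
```

Under this reading, the fraction of classes removed at each level is $1/p_{m}$, so the relative gain of each additional level shrinks as the primes grow.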

Table 3. Execution time benchmark comparison—ancient and modern sieves.

Size of Input	Sieve of Sundaram	Sieve of Atkin—Bernstein	Linear Sieve	Segmented Sieve of Eratosthenes with Wheel Factorization	Multiscale Sieve	Primes Generated
$10^{6}$	0.00327	0.0057	0.0082	0.0057	0.0014	78,498
$10^{7}$	0.0327	0.0440	0.0398	0.0385	0.0150	664,579
$10^{8}$	0.5432	0.4608	0.2319	0.1971	0.0504	5,761,455
$10^{9}$	8.7509	5.2928	2.0829	1.6061	0.2758	50,847,535
$10^{10}$	116.5851	53.6592	22.0850	15.4788	3.2695	450,052,510

Table 4. Comparison of execution times in percentages.

Size of Input	Sieve of Sundaram	Sieve of Atkin-Bernstein	Linear Sieve	Segmented Sieve of Eratosthenes with Wheel Factorization
$10^{6}$	234.24%	408.09%	584.60%	409.67%
$10^{7}$	218.60%	293.64%	266.09%	257.21%
$10^{8}$	1077.77%	914.33%	460.12%	391.04%
$10^{9}$	3172.77%	1918.99%	755.19%	582.30%
$10^{10}$	3565.80%	1641.19%	675.48%	473.43%

Table 5. Memory usage benchmark comparison—ancient and modern sieves.

Size of Input	Sieve of Sundaram	Sieve of Atkin-Bernstein	Linear Sieve	Segmented Sieve of Eratosthenes with Wheel Factorization	Multiscale Sieve	Primes Generated
$10^{6}$	1.66	2.19	2.80	1.30	3.38	78,498
$10^{7}$	6.13	10.75	15.83	1.38	16.00	664,579
$10^{8}$	48.89	96.58	140.55	1.34	143.67	5,761,455
$10^{9}$	478.05	954.89	1342.83	1.47	1250.33	50,847,535
$10^{10}$	4769.58	9537.47	10,285.86	1.84	7934.70	450,052,510

Table 6. Distribution of coins issued at each level.

$\prod p_{i}$	Coins Emitted	User Earnings—Prime Found	User Earnings—Composite Eliminated	Oracle Cost—Storage and Sorting	Server Earnings—Composite Eliminated
6	12	6	3	1.5	1.5
30	70	35	18	8.8	8.8
210	596	298	149	74.5	74.5
2310	6020	3010	1505	752.5	752.5
30,030	82,731	41,366	20,683	10,341.4	10,341.4
510,510	1,298,860	649,430	324,715	162,357.5	162,357.5
9,699,690	25,343,400	12,667,150	6,333,575	3,166,787.5	3,166,787.5

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
